snor.re

# Constrained Shortest Path is NP-complete

Published 2020-09-05

A friend recently mentioned a graph/flow network problem consisting of finding the shortest path between two vertices such that some secondary cost of the path doesn’t exceed some set threshold.

As an example, we can think of a transportation network, where each edge has both a cost and a transportation time associated with it. The problem then consists of finding a route that can move goods as cheaply as possible within a given maximum transportation time, if one exists. Or, vice versa, finding a fastest route that doesn’t exceed some transportation budget, if one exists.

We can formalize this as the following optimization problem:

Optimization problem [Constrained Shortest Path].

Input. A directed multigraph $G = (V,E)$, two vertices $s,t \in V$, two edge weight functions $w_1,w_2 : E \to \mathbb{Z}_{\ge 0}$, such that $w_1(e)$ and $w_2(e)$ are not both zero for each $e \in E$, and a maximum weight $W_2 \in \mathbb{Z}_{\ge 0}$.

Output. A simple path $\pi$ from $s$ to $t$ minimizing $w_1(\pi)$ subject to $w_2(\pi) \le W_2$, or, if no such path exists, $\bot$.

(In the above, $w_1(\pi)$ denotes the total $w_1$-weight of the path $\pi$, that is, $w_1(\pi) = \sum_{e \in \pi} w_1(e)$, and similarly for $w_2$.)

My friend was in particular interested in whether the problem was solvable in polynomial time, so, having recently taken a course in NP-completeness, I decided to try to answer this question.

## Decision problem

While the optimization problem is the problem we actually want to reason about, to use the concept of NP-completeness, we must first find a corresponding decision problem. A decision problem is a problem that, given some input, has the answer either YES or NO.

To be more precise, a corresponding decision problem is one where, given an algorithm $A$ that answers the decision problem, we can construct an algorithm for answering the optimization problem, and vice versa, such that the constructed algorithm invokes $A$ only a polynomial number of times (as a function of the input size) and only performs additional computations with polynomial time complexity.

We focus here on polynomial time complexity because it means that a polynomial-time $A$ leads to a polynomial-time algorithm for the optimization problem—and, conversely, that if no polynomial-time $A$ exists, there can exist no polynomial-time algorithm for the optimization problem.

Surprisingly, it is possible to formulate the decision problem in a way that makes the weights $w_1$ and $w_2$ symmetrical. Using this, we can group them as pairs, so that $w(e) = (w_1(e),w_2(e))$ for each $e \in E$. This lets us simplify our notation a bit, by saying $(w_1,w_2) \le (W_1,W_2)$ if and only if $w_1 \le W_1$ and $w_2 \le W_2$.

A somewhat natural corresponding decision problem is then given by:

Decision problem [Constrained Shortest Path].

Input. A directed multigraph $G = (V,E)$, two vertices $s,t \in V$, a pair-valued edge weight function $w : E \to \mathbb{Z}_{\ge 0}^2$ where $w(e) \ne (0,0)$ for each $e \in E$, and a maximum weight pair $W \in \mathbb{Z}_{\ge 0}^2$.

Output. YES if there exists a simple path $\pi$ from $s$ to $t$ such that $w(\pi) \le W$, otherwise NO.

(Here, we let $w(\pi) = (w_1(\pi), w_2(\pi))$.)

Importantly, this problem is actually in NP. A bit informally, this means that there exists a polynomial-time non-deterministic algorithm that always answers NO for NO-instances, and is capable of answering YES for every YES-instance.

To see this, observe that we can non-deterministically “guess” a simple path from $s$ to $t$ (in polynomial time), and then determine whether it satisfies the constraint $w(\pi) \le W$ (in polynomial time) by simply calculating $w(\pi)$.
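To make this concrete, here is a sketch of such a certificate checker in Python. The representation (edges as `(u, v, w1, w2)` tuples, the certificate as a list of edge indices) is my own choice for illustration:

```python
def verify(edges, s, t, W, certificate):
    """Polynomial-time check of a certificate: a list of indices into
    `edges` that supposedly forms a simple path from s to t of weight
    componentwise at most W = (W1, W2)."""
    path = [edges[i] for i in certificate]
    # The edges must chain together, starting at s and ending at t.
    if not path or path[0][0] != s or path[-1][1] != t:
        return False
    if any(path[k][1] != path[k + 1][0] for k in range(len(path) - 1)):
        return False
    # The path must be simple: no vertex may repeat.
    vertices = [path[0][0]] + [e[1] for e in path]
    if len(set(vertices)) != len(vertices):
        return False
    # Finally, the weight constraint w(pi) <= W, componentwise.
    w1 = sum(e[2] for e in path)
    w2 = sum(e[3] for e in path)
    return w1 <= W[0] and w2 <= W[1]
```

Every check runs in time linear in the certificate length, so the whole verification is clearly polynomial.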

We will now show that the optimization problem corresponds to the decision problem in this sense.

### Correspondence to the optimization problem

First we need to prove that we can use a solution to the optimization problem to solve the decision problem. To do this, consider an instance of the decision problem. We can then construct an instance of the optimization problem using the same multigraph $G=(V,E)$, the same vertices $s,t$, the edge weight functions $w_1,w_2$ given by the components of $w$, and the second component $W_2$ of $W$ as the $w_2$-constraint.

A solution to the optimization problem will then provide us with either a path $\pi$ minimizing $w_1$ subject to $w_2(\pi) \le W_2$, or $\bot$ if no such path exists. In the first case, the decision problem must have answer YES if $W_1 \ge w_1(\pi)$, as $\pi$ then is an example of a satisfying path, and NO if $W_1 < w_1(\pi)$, as $\pi$ minimizes $w_1$. In the second case, there exists no path $\pi$ satisfying $w_2(\pi) \le W_2$ at all, so clearly the answer to the decision problem is NO. This shows that we can turn an answer to the optimization problem into an answer to the decision problem.[^1]

To see that the reverse is also the case, that is, that a solution to the decision problem helps us solve the optimization problem, let $A$ be an algorithm solving the decision problem. Then we can solve the optimization problem with the following algorithm:

Algorithm [Optimization problem from decision problem].

Input. A directed multigraph $G = (V,E)$, two vertices $s,t \in V$, two edge weight functions $w_1,w_2 : E \to \mathbb{Z}_{\ge 0}$, such that $w_1(e)$ and $w_2(e)$ are not both zero for each $e \in E$, and a maximum weight $W_2 \in \mathbb{Z}_{\ge 0}$.

Output. A simple path $\pi$ from $s$ to $t$ minimizing $w_1(\pi)$ subject to $w_2(\pi) \le W_2$, or, if no such path exists, $\bot$.

Steps.

1. Let $M = \sum_{e \in E} w_1(e)$. If $A$ answers NO when given $W = (M,W_2)$, then answer $\bot$ and stop.
2. Perform a binary search on the interval $[0,M]$ to find the smallest $W_1 \in [0,M]$ such that $A$ answers YES when given $W = (W_1,W_2)$.
3. For each edge $e \in E$, do the following:
   1. Remove $e$ from $E$.
   2. If $A$ now answers NO when given $W = (W_1,W_2)$, put $e$ back in $E$.
4. Return the path consisting of exactly the remaining edges in $E$.
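As a sanity check, the steps above can be sketched in Python. The edge representation and the oracle signature `A(edges, s, t, (W1, W2))` are assumptions for illustration; the brute-force oracle at the end exists only to make the sketch runnable and takes exponential time:

```python
def solve_via_oracle(edges, s, t, W2, A):
    """Steps 1-4 above, given a decision oracle A(edges, s, t, (W1, W2)).

    edges: list of (u, v, w1, w2) tuples (directed multigraph).
    Returns the edge list of an optimal path, or None (standing in for ⊥)."""
    M = sum(w1 for _, _, w1, _ in edges)
    if not A(edges, s, t, (M, W2)):            # step 1: any feasible path?
        return None
    lo, hi = 0, M                              # step 2: binary search for W1
    while lo < hi:
        mid = (lo + hi) // 2
        if A(edges, s, t, (mid, W2)):
            hi = mid
        else:
            lo = mid + 1
    W1 = lo
    remaining, i = list(edges), 0              # step 3: prune edges
    while i < len(remaining):
        trial = remaining[:i] + remaining[i + 1:]
        if A(trial, s, t, (W1, W2)):
            remaining = trial                  # some solution survives; drop it
        else:
            i += 1                             # edge is necessary; keep it
    nxt = {e[0]: e for e in remaining}         # step 4: order the path from s
    path, cur = [], s
    while cur != t:
        path.append(nxt[cur])
        cur = nxt[cur][1]
    return path


def brute_force_A(edges, s, t, W):
    """Exponential-time stand-in oracle: DFS over all simple paths."""
    W1, W2 = W

    def dfs(u, seen, a, b):
        if a > W1 or b > W2:
            return False
        if u == t:
            return True
        return any(dfs(v, seen | {v}, a + w1, b + w2)
                   for x, v, w1, w2 in edges
                   if x == u and v not in seen)

    return dfs(s, {s}, 0, 0)
```

Replacing `brute_force_A` with a hypothetical polynomial-time oracle would make the whole procedure polynomial, which is exactly the point of the correspondence.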

We now need to show two things: That the algorithm correctly solves the optimization problem, and that it runs in polynomial time provided that $A$ runs in polynomial time.

Correctness. Since every simple path contains each edge at most once, we must have $w_1(\pi) \le M$ for all simple paths $\pi$. It is therefore correct to return $\bot$ if $A$ answers NO in step 1 when given $W = (M,W_2)$.

Assume now that the algorithm reaches step 2. Then there must exist at least one simple path $\pi$ with $w_2(\pi) \le W_2$. Let $W_1 \in [0,M]$ denote the smallest $w_1$-weight of such a path, so that no simple path $\pi$ satisfies both $w_1(\pi) < W_1$ and $w_2(\pi) \le W_2$. When given $W = (W_1',W_2)$, $A$ must then answer YES if $W_1' \ge W_1$ and NO if $W_1' < W_1$. Because of this monotonicity, the binary search in step 2 must successfully find $W_1$, and every solution $\pi$ to the optimization problem we are trying to solve must have $w_1(\pi) = W_1$.

For the loop in step 3, $A$ answers YES at the start and end of each iteration, so the remaining edges always contain at least one solution. An edge is removed precisely when some solution survives without it, and an edge is kept precisely when it is part of every solution among the then-remaining edges; since the edge set only shrinks afterwards, a kept edge stays part of every remaining solution. Hence, after step 3, a solution still exists and it uses every remaining edge, so the remaining edges form exactly that one solution, and step 4 is correct.

Analysis. The only parts of the algorithm that could be problematic in terms of time complexity are the binary search in step 2 and the loop in step 3. However, since the binary search takes at most $O(\log M)$ iterations, it invokes $A$ a number of times polynomial in the input size.[^2] The loop in step 3 invokes $A$ exactly once for each edge in $E$, so it requires $O(|E|)$ calls to $A$, which is also polynomial in the input size. We can thus conclude that the algorithm invokes $A$ a polynomial number of times and otherwise takes polynomial time.

In summary, we can conclude that there exists a (deterministic) polynomial-time algorithm for solving the optimization problem if and only if the given decision problem is in P.

## NP-completeness

Having found a corresponding decision problem, the next step is to either show that it is in P or that it is NP-complete. If the problem is in P, then it can be solved by a deterministic polynomial-time algorithm. If, on the other hand, it is NP-complete, then it is at least as difficult to solve as every other problem in NP, and cannot be solved by a polynomial-time algorithm unless P = NP.

Failing to easily come up with a proof either way, I decided to try to look up whether the problem was well-known, hoping to find an existing proof. Doing this, I found a reference to a variant of the same problem in the book Computers and Intractability: A Guide to the Theory of NP-Completeness by Michael R. Garey and David S. Johnson:

[ND30] SHORTEST WEIGHT-CONSTRAINED PATH

INSTANCE: Graph $G = (V,E)$, length $l(e) \in Z^+$, and weight $w(e) \in Z^+$ for each $e \in E$, specified vertices $s,t \in V$, positive integers $K,W$.

QUESTION: Is there a simple path in $G$ from $s$ to $t$ with total weight $W$ or less and total length $K$ or less?

Reference: [Megiddo, 1977]. Transformation from PARTITION.

Comment: Also NP-complete for directed graphs. Both problems are solvable in polynomial time if all weights are equal or all lengths are equal.

It states that the problem can be proven NP-complete by transformation from the Partition problem. Unfortunately, the reference to the proof is given as “private communication”, so no actual proof is included.

It does, however, provide a hint for the kind of problem we can use. Instead of the Partition problem, I will here provide a proof using the similar Subset Sum problem.

It should be noted that the formulation of the Shortest Weight-Constrained Path problem differs from that of the Constrained Shortest Path problem given earlier. However, the proof presented here can, as we will see at the end, be adapted to work for the Shortest Weight-Constrained Path problem as well.

In any case, here we go.

### Transformation from Subset Sum

Subset Sum is a known NP-complete problem, so if we can show that Constrained Shortest Path is at least as difficult to solve as Subset Sum, it too must be NP-complete. To show this in practice, we must show that if we can solve Constrained Shortest Path instances in polynomial time, then we can also solve Subset Sum instances in polynomial time. Somewhat counterintuitively, this means we must be able to transform the problem instances in the opposite direction, i.e., that we must be able to turn each instance of Subset Sum into an instance of Constrained Shortest Path. Additionally, this transformation must preserve the answer of the instance and must itself have polynomial time complexity.

The formulation of Subset Sum we will use here is given by:

Decision problem [Subset Sum].

Input. Positive integers $s_1,\dotsc,s_n$ and integer $B$.

Output. YES if there exists $A \subseteq \{1,\dotsc,n\}$ such that $\sum_{i \in A} s_i = B$, otherwise NO.
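For intuition, a brute-force decision procedure for Subset Sum simply tries every subset. This takes exponential time; it illustrates the problem, it does not solve it efficiently:

```python
from itertools import combinations

def subset_sum(s, B):
    """Brute-force Subset Sum: try every subset of the n values.

    There are 2^n subsets, so this runs in exponential time."""
    return any(sum(c) == B
               for r in range(len(s) + 1)
               for c in combinations(s, r))
```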

For this pair of problems, there exists a relatively clean transformation:

Transformation.

Input. An instance of Subset Sum, that is, positive integers $s_1,\dotsc,s_n$ and integer $B$.

Output. An instance of Constrained Shortest Path, that is, a directed multigraph $G = (V,E)$, two vertices $s,t \in V$, a pair-valued edge weight function $w : E \to \mathbb{Z}_{\ge 0}^2$ where $w(e) \ne (0,0)$ for each $e \in E$, and a maximum weight pair $W \in \mathbb{Z}_{\ge 0}^2$.

Steps.

1. Create $n+1$ vertices, $v_0$ through $v_n$, and let $V = \{v_0,\dotsc,v_n\}$. For each $i=1,\dotsc,n$, create two parallel edges $e_i$ and $e_i'$ from $v_{i-1}$ to $v_i$, and let $E = \{e_i,e_i' \mid i=1,\dotsc,n\}$.
2. Let $s=v_0$ and $t=v_n$.
3. Let $w(e_i) = (s_i,0)$ and $w(e_i') = (0,s_i)$ for each $i=1,\dotsc,n$.
4. Let $W = (B,\: S - B)$, where $S = \sum_{i=1}^n s_i$.

The transformation has polynomial time complexity. (We may assume $0 \le B \le S$, where $S = \sum_{i=1}^n s_i$, since the Subset Sum instance is otherwise trivially a NO-instance, and $W$ would not be well-defined.) Labeling each edge $e$ with $w(e)$, the constructed multigraph is a chain $v_0, v_1, \dotsc, v_n$ in which each consecutive pair $v_{i-1}, v_i$ is joined by the two parallel edges $e_i$ and $e_i'$, labeled $(s_i,0)$ and $(0,s_i)$ respectively.
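The steps above can be sketched directly in code, reusing the `(u, v, w1, w2)` edge representation (an assumption for illustration):

```python
def subset_sum_to_csp(s_values, B):
    """Transform a Subset Sum instance into a Constrained Shortest Path
    decision instance, following steps 1-4 above.

    Vertices are 0..n; returns (edges, s, t, W) with edges as a list
    of (u, v, w1, w2) tuples for the directed multigraph."""
    edges = []
    for i, si in enumerate(s_values, start=1):
        edges.append((i - 1, i, si, 0))   # e_i  with w(e_i)  = (s_i, 0)
        edges.append((i - 1, i, 0, si))   # e_i' with w(e_i') = (0, s_i)
    S = sum(s_values)
    return edges, 0, len(s_values), (B, S - B)
```

The construction clearly runs in time linear in $n$ (plus the cost of writing down the weights).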

This transformation has several important properties. First note that for each $i=1,\dotsc,n$ we have $w_1(e_i) + w_2(e_i) = w_1(e_i') + w_2(e_i') = s_i$. Hence, since a path $\pi$ from $s$ to $t$ must include either $e_i$ or $e_i'$, but not both, for each $i=1,\dotsc,n$, we must have $w_1(\pi) + w_2(\pi) = s_1 + s_2 + \dotsb + s_n = S$.

Now, assume the answer to the Subset Sum instance is YES, that is, that there exists $A \subseteq \{1,\dotsc,n\}$ such that $\sum_{i \in A} s_i = B$. Consider the path $\pi$ constructed by following $e_i$ if $i \in A$ and $e_i'$ if $i \not\in A$ for each $i=1,\dotsc,n$. Clearly $w_1(\pi) = \sum_{i \in A} s_i = B$ and $w_2(\pi) = S - w_1(\pi) = S - B$, so $\pi$ satisfies $w(\pi) \le W$. The answer to the constructed Constrained Shortest Path instance is thus also YES.

Assume now instead that the answer to the constructed Constrained Shortest Path instance is YES, that is, that there exists a path $\pi$ from $s$ to $t$ such that $w_1(\pi) \le B$ and $w_2(\pi) \le S - B$. Then

$w_1(\pi) = S - w_2(\pi) \ge S - (S - B) = B,$

which, combined with $w_1(\pi) \le B$, gives $w_1(\pi) = B$. Now let $A$ be defined by $i \in A$ if and only if $\pi$ includes $e_i$. Then $\sum_{i \in A} s_i = w_1(\pi) = B$, so the answer to the Subset Sum instance is YES.

We have in other words shown that the Subset Sum instance is a YES-instance if and only if the constructed Constrained Shortest Path instance is a YES-instance. Hence, if there exists a polynomial-time algorithm for solving instances of Constrained Shortest Path, then we can combine that algorithm with the above transformation to get a polynomial-time algorithm for solving Subset Sum.

In conclusion, since Subset Sum is NP-complete and Constrained Shortest Path is in NP, Constrained Shortest Path is also NP-complete.

Note that since the construction happens to be acyclic, the problem is still NP-complete if we restrict it to acyclic directed multigraphs.

### Adapting the proof to Shortest Weight-Constrained Path

The formulation of Shortest Weight-Constrained Path from Computers and Intractability differs from our Constrained Shortest Path formulation in two aspects:

1. It requires $G=(V,E)$ to be an undirected graph rather than a directed multigraph.
2. It requires both $w_1(e)$ and $w_2(e)$ to be positive for each $e \in E$.

To handle the first difference, we can make the edges undirected and subdivide each edge with a dummy vertex, ensuring there is at most one edge between any two vertices. To satisfy the second difference, we can add $(1,1)$ to the weight of every edge and, since every $s$–$t$ path in the new graph consists of exactly $2n$ edges, increase $W$ by $(2n,2n)$ in response.

Concretely, for each $i=1,\dotsc,n$, the pair of parallel edges is replaced by two length-two undirected paths through fresh dummy vertices: for instance, $e_i$ becomes $v_{i-1}$–$u_i$–$v_i$ with edge weights $(s_i+1,1)$ and $(1,1)$, and $e_i'$ becomes $v_{i-1}$–$u_i'$–$v_i$ with edge weights $(1,s_i+1)$ and $(1,1)$.

We then set $W = (B + 2n,\: S - B + 2n)$.
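One way to realize the adapted transformation in code. The exact split of each original weight across the two half-edges is an arbitrary choice, as any split with the correct totals works, so treat the concrete numbers as an illustration:

```python
def subset_sum_to_swcp(s_values, B):
    """Adapted transformation targeting the undirected Shortest
    Weight-Constrained Path formulation: each parallel edge pair becomes
    two length-two paths through fresh dummy vertices, and every edge
    weight is shifted by (1, 1) so that all weights are positive."""
    n = len(s_values)
    edges = []  # undirected edges (u, v, w1, w2), every weight >= 1
    for i, si in enumerate(s_values, start=1):
        u, v = f"v{i-1}", f"v{i}"
        edges.append((u, f"a{i}", si + 1, 1))  # e_i split in two, +(1,1) each
        edges.append((f"a{i}", v, 1, 1))
        edges.append((u, f"b{i}", 1, si + 1))  # e_i' split in two, +(1,1) each
        edges.append((f"b{i}", v, 1, 1))
    S = sum(s_values)
    return edges, "v0", f"v{n}", (B + 2 * n, S - B + 2 * n)
```

Every $s$–$t$ path in this graph consists of exactly $2n$ edges, so each two-edge segment contributes its old weight plus $(2,2)$, which is why $W$ grows by $(2n,2n)$.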

## References

1. M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman & Co, 1990.

[^1]: For completeness’ sake, it is also important that the transformation takes at most polynomial time.

[^2]: The problem encoding scheme must encode the weights, so the input size is $\Omega(\log M)$.