Posted 2005-03-27 05:00:13 by
Jim Crawford
I've been sending out resumes to various companies and have been
getting some tests back. I took one recently that had one quite interesting
problem: to write a function that takes an unparenthesized arithmetic
expression, and generate the results of all possible parenthesizations. e.g.:
input of “2 + 4 - 6 * 8” would generate output of “3 unique
{ -42, -14, 0 }”, these being the possible parenthesizations:
((2 + 4) - 6) * 8 = 0
(2 + 4) - (6 * 8) = -42
(2 + (4 - 6) * 8) = -14
2 + ((4 - 6) * 8) = -14
2 + (4 - (6 * 8)) = -42
The test was in C, but I couldn't come up with an algorithm that mapped well to
C. I ended up sending off a rather inelegant mess, but later decided to
convert the code to Python, because I decided the algorithm I came up with was
pleasant and deserved a pleasant implementation.
The algorithm works first by parsing the expression into a list of nodes:
class Node:
def __init__(this, place, number, operator):
this.place = place
this.number = number
this.operator = operator
def __repr__(this):
return 'Node(%d, %d, "%s")' % (this.place, this.number, this.operator)
place is the numerical label we use to refer to each node. We'll
be removing nodes from the list later, and it's important to keep a constant
naming scheme. number is the numerical operand associated with
the node. operator is the operator, if any, that was to the right
of the number. I build the node list nodes from the expression
string expr like so:
nodes = []
place = 0
number = 0
for token in re.split("([\+\-\*\/])", expr.replace(" ", "")):
if "+-*/".find(token) < 0: number = int(token)
else:
nodes.append(Node(place, number, token))
place += 1
nodes.append(Node(place, number, ""))
The result, for the input of 2 + 4 - 6 * 8, is
[Node(0, 2, "+"), Node(1, 4, "-"), Node(2, 6, "*"), Node(3, 8, "")].
Given this list, we can simulate a parenthesization set by evaluating the
operators in a certain order, as specified in the list called
permutation:
def evaluate(nodes, permutation):
for i in permutation:
for j in xrange(0, len(nodes)):
if nodes[j].place == i: break
result = 0
left = nodes[j ].number
right = nodes[j+1].number
op = nodes[j ].operator
if op == "+": result = left+right
if op == "-": result = left-right
if op == "*": result = left*right
if op == "/": result = left/right
nodes.remove(nodes[j])
nodes[j].number = result
return nodes[0].number
For instance, a nodes value of
[Node(0, 2, "+"), Node(1, 4, "-"), Node(2, 6, "*"), Node(3, 8, "")]
and a permutation of [0, 2, 1] corresponds to
the expression (2+4) - (6*8). To evaluate this we first perform
the operation of the node with the place of 0, leaving us
with:
[Node(1, 6, "-"), Node(2, 6, "*"), Node(3, 8, "")].
Then we would calculate the node with a place of 2, giving:
[Node(1, 6, "-"), Node(3, 48, "")].
Finally we would calculate the node with a place of 1, giving:
[Node(3, -42, "")].
In the single remaining node, we have our result of -42. The main chunk of
code left to describe is the one to build the list of permutations. The
problem here is essentially to come up with every possible ordering of whole
numbers below the number of operators in the expression. For the previous
expression, with 3 operators, we'd want the following set of permutations:
[[0, 1, 2], [0, 2, 1], [1, 0, 2], [1, 2, 0], [2, 0, 1], [2, 1, 0]] Not necessarily in that order. This is the code I came up with to generate such a list:
def permute(size):
permutation = [0] * size
permutations = []
def subPermute(n):
for i in xrange(0, size):
if permutation[:n].count(i): continue
permutation[n] = i
if n < size-1: subPermute(n+1)
else: permutations.append(permutation[:])
subPermute(0)
return permutations
This sets up the local lists permutation, which contains the
permutation currently being built, and permutations, which is the
list of permutations which have already been built. Then it defines a local
subroutine, subPermute, that recursively calculates all the
permutations.
subPermute takes as the parameter n the place in the
permutation array that it's working on, and it iterates through
all possible entries. If any given possible value for nth list
element hasn't been used in the list already, it places the value on in the
nth list element and either recurses with n+1, or, if
that was the final digit, places a copy of the permutation list on
the permutations list.
The rest is just putting together the pieces:
results = {}
if len(nodes) == 1: results[nodes[0].number] = 0
else:
for permutation in permute(len(nodes)-1):
results[evaluate(copy.deepcopy(nodes), permutation)] = 0
print "%d unique: { %s }" % \
(len(results), ", ".join([str(i) for i in results.keys()]))
I have a special case for expressions consisting of only a single number,
because permute(0) doesn't return anything helpful. A
deepcopy is necessary because evaluate mangles the
incoming nodes, and Python passes everything by reference; if we
just copied the list, with the listname[:] idiom, the list would
still store references to the same Nodes. I store each result as
the key to a hash, associated with a dummy value.
Phew. I had to resist the urge to turn this into a mini Python tutorial. I
even simplified the code a bit to make it more readable --
permute() used to be a generator, for instance. Debugging a
recursive generator was quite an experience, but that's another story for
another day!
(The final program is here, by request. If only I'd used Knuth's literate programming tools!)
|