REAL SUBJECT: Analysis of alternate algorithms.

Peter & Jach and anyone interested,

As Peter said in his altered subject line, Jack changed directions from 
tweaking an algorithm to trying something quite different.

Reminder of the problem. 

Horizontal View:
SEND + MORE = MONEY

Vertical View:
  SEND
+MORE
...........
MONEY

Hard to be precise as I am sticking to plain text in my message. The three 
words above are meant to be right adjusted.

When solving it horizontally, Jach and I used variants of a brute force 
approach. Substitute all possible combinations. He did it in-line and used 
eval. I did it by making lists of items on both sides and summing the int() of 
each component and comparing. We tried both our algorithms and his was slower 
and he profiled that the cause was that eval() was much more expensive as were 
his use of regular expression functions. For comparison, mine used int() and 
string manipulation functions and sets and dictionaries.

But the real puzzle is meant to be solved in a more vertical way by humans 
using logic. I won't do that here but note we have 4 equations going down but 8 
unknowns. And the equations are not unique.

The rightmost column (I will call it the first as our arithmetic proceeds from 
right to left) is meant to represent ONES and provides the equation:

(D+E== Y) or (D+E == Y + 10)

Worse, for the next column, you either get a "1" carried from the previous 
addition or not and either pass a "1" along to the next column or not. 4 
Possibilities.

(N+R==E) or (N+R+1==E) or (N+R==E+10) or (N+R+1==E+10)

Getting a headache yet?

For a human, they need a way to come up with additional info in terms of 
constraints.

There is a guiding inequality that looks like this:

S is not the same as any of the others. Anytime you solve for another, the list 
of possible values for S shrinks.
Ditto for each other variable.
Or, since anything plus 0 is itself, then D and E adding up to Y (with no 
possible carry) cannot be 0.

But getting a computer to do this might be a bit harder than blunt-force 
searches. So let's see why Jach's new algorithm was faster.

The code I am analyzing can be viewed in the archives and will not be entered 
again:

https://mail.python.org/pipermail/python-list/2018-December/738454.html

So what did Jach try in his newer version? It is what I call a vertical 
approach but one a computer can do easier than a human can or would. I see it 
is a very specific algorithm that hard codes in these variables as global 
entities that are altered by a set of nested functions. S, E, N, D, M, O, R, Y. 
There are approaches that might be better such as passing a dictionary 
partially filled out from one function to the next as the only one that prints 
the solution is the final function call.

So his is not a general solution.

What interests me as a probable reason this is faster is the number of 
situations it tries. The earlier versions asked itertools.permutations() to 
provide all unique combinations of ten tokens in eight positions. So there were 
10 choices for the first and 9 for the second and so on adding up to 10!/2! or 
1,814,400  different combinations tried. That approaches 2 million.

Jack broke the problem down into evaluating the units column with a loop like 
this:

itertools.permutations(range(10), 3)

That is 720 possibilities. He then doubled that to 1,440 to consider a carry. 
Only if the selected values for the three variables in contention (plus a 
carry) does he go on to call to evaluate the tens column.

It then shrinks a bit more as he no longer gets the permutations of all 10 
digits. He subtracts the three values that might work for the units, so he is 
asking for permutations of 7 digits, two at a time. That is 42, doubled again 
to 84 for carry purposes. And this function is not called 1,440 times, but 
quite a bit fewer. 

So, similarly, of those 84 loops for tens, he only sometimes calls to evaluate 
hundreds. As mentioned, the set of 10 digits shrinks some more and this 
continues upward to functions that evaluate hundreds and thousands  and finally 
the one evaluating ten thousands pretty much prints out an answer it finds. 

So overall iterations can be shown to drop. We could add code to measure how 
many times each function is called and come up with an exact value for this 
built-in problem. I did and the functions were called this many times:

>>> counting
{'unit': 1, 'ten': 72, 'hundred': 290, 'thou': 183, '10thou': 196}
>>> sum(counting.values())
742

But I also see the permutation function was called 542 times. Most of those 
runs though were fairly trivial as the combined number of items issued back 
were only 7,390 as compared to nearly two million in the earlier version. 
Overall, this is much faster.

Range() is probably a highly efficient built-in written in C, but it can 
totally be eliminated in this code as it is a constant. You can write 
{1,2,2,3,4,5,6,7,8,9} or [0,1] instead of calculating ranges.

Naturally, this code does not scale up to finding the two solutions for:

HAWAII + IDAHO +IOWA + OHIO = STATES

The horizontal versions solved that easily.

The carry issues get more complex here even if a general solution is written. 
One approach might be to make a matrix as wide as the widest term and place all 
entries in it as characters, right justified. You can then work on any number 
of columns by extracting columns backwards from the right and applying whatever 
logic, perhaps recursively. The possible carry amounts need to be estimated as 
no more than about the number of items being added minus one.

But, I am ready to do other things so I leave it as an exercise for the reader 😉

-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon....@python.org> On 
Behalf Of Peter Otten
Sent: Friday, December 14, 2018 3:21 AM
To: python-list@python.org
Subject: Smarter algo, was Re: 03 digression by brute force

jf...@ms4.hinet.net wrote:

> Just for fun:-) On my ooold PC, it takes 0.047 seconds to run the 
> following algorithm on the problem 'SNED + MORE == MONEY".

> def tenThousand(u, Cin):  # Cin == M
>     global n
>     if Cin == M:
>         print(S, E, N, D, '+', M, O, R, E, '==', M, O, N, E, Y)
>         n += 1

And it probably took under two minutes with the brute force approach.
So if you were only interested in the result the code below were efficient only 
if it took at most two more minutes to write it;)

But seriously, the next step is to generalize this to work with arbitrary 
additions...
More fun!


-- 
https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list

Reply via email to