Personally, I am okay with Python implementing its built-in sum in C. That is their implementation detail. It does not impose anything on the user beyond knowing normal Python. It isn't cheating or unfair. They are under no obligation to handicap themselves so that we can be more comparable.

Are we going to put such requirements on C, C++, Julia, Crystal, Nim?

I expect every language to put forth its best. I would like the same for Pharo. And let the numbers fall where they may.

Jimmie



On 1/11/22 03:07, Andrei Chis wrote:
Hi Jimmie,

I was scanning through this thread and saw that the Python code uses the
sum function. If I remember correctly, in Python the built-in sum
function is implemented directly in C [1] (unless Python is compiled
with SLOW_SUM set to true). In that case, on large arrays the function
can easily be several times faster than iterating over the individual
objects the way the Pharo code does. The benchmark then seems to
compare summing numbers in C with summing numbers in Pharo. It would be
interesting to modify the Python code to use a loop similar to the
Pharo one for doing the sum.

Cheers,
Andrei

[1] https://github.com/python/cpython/blob/135cabd328504e1648d17242b42b675cdbd0193b/Python/bltinmodule.c#L2461

On Mon, Jan 10, 2022 at 9:06 PM Jimmie Houchin <jlhouc...@gmail.com> wrote:
Some experiments and discoveries.

I am running my full language test every time. It is the only way I can compare 
results. It is also what fully stresses the language.

The reason I wrote the test as I did is that I wanted to know a couple of things.
Is the language sufficiently performant at basic maths? I am not doing any
high-level PolyMath kind of math, just simple things like moving averages over
portions of arrays.
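
As an illustration of the kind of arithmetic involved (a sketch only, not my actual
benchmark code, and the numbers are made up), a simple moving average over a slice
of an Array looks something like this:

"Illustrative sketch only: a 20-element simple moving average over an Array of
Floats, using #copyFrom:to: and the stock #sum. The data is made up."
| prices window sma |
prices := (1 to: 1000) collect: [ :i | i asFloat ].
window := 20.
sma := (window to: prices size) collect: [ :i |
	(prices copyFrom: i - window + 1 to: i) sum / window ].
sma last. "the most recent 20-element average"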

The other is the efficiency of array iteration and access. This is why #sum is the
best test of this attribute: #sum iterates over and accesses every element of the
array. It will reveal if there are any problems.

The default test: Julia 1m 15s, Python 24.5 minutes, Pharo 2 hours 4 minutes.

When I comment out the #sum and #average calls, Pharo completes the test in 3.5 
seconds. So almost all the time is spent in those two calls.
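
A quick way to see that cost in isolation (just a spot check, not the full test, and
the array size here is arbitrary) is to time the pieces with #timeToRun:

"Spot check: time #sum versus a bare loop over the same data.
timeToRun answers how long the block took."
| data |
data := (1 to: 1000000) collect: [ :i | i asFloat ].
Transcript show: 'sum: ' , [ data sum ] timeToRun asString; cr.
Transcript show: 'bare loop: ' , [ 1 to: data size do: [ :i | ] ] timeToRun asString; cr.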

So most of this conversation has focused on why #sum is as slow as it is or how 
to improve the performance of #sum with other implementations.



So I decided to break down #sum and try some things.

Starting with the initial implementation and SequenceableCollection's default
#sum, the time was 02:04:03.
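
For reference, if I remember correctly the stock version bottoms out in #inject:into:,
roughly like the sketch below. This is from memory and not the actual source, so the
real code in the image may differ, but it shows where the per-element block overhead
comes from.

"Rough, from-memory approximation of the kind of code the default #sum runs;
check the implementors of #sum in the image for the real code.
Every element goes through a block activation via #inject:into:."
sum
	^ self inject: 0 into: [ :accum :each | accum + each ]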


"This implementation does no work. Only iterates through the array.
It completed in 00:10:08"
sum
     | sum |
      sum := 1.
     1 to: self size do: [ :each | ].
     ^ sum


"This implementation does no work, but adds to iteration, accessing the value 
of the array.
It completed in 00:32:32.
Quite a bit of time for simply iterating and accessing."
sum
     | sum |
     sum := 1.
     1 to: self size do: [ :each | self at: each ].
     ^ sum


"This implementation I had in my initial email as an experiment and also 
several other did the same in theirs.
A naive simple implementation.
It completed in 01:00:53.  Half the time of the original."
sum
    | sum |
     sum := 0.
     1 to: self size do: [ :each |
         sum := sum + (self at: each) ].
     ^ sum



"This implementation I also had in my initial email as an experiment I had done.
It completed in 00:50:18.
It reduces the iterations and increases the accesses per iteration.
It is the fastest implementation so far."
sum
     | sum |
     sum := 0.
     1 to: ((self size quo: 10) * 10) by: 10 do: [ :i |
         sum := sum + (self at: i) + (self at: (i + 1)) + (self at: (i + 2)) + 
(self at: (i + 3)) + (self at: (i + 4))              + (self at: (i + 5)) + 
(self at: (i + 6)) + (self at: (i + 7)) + (self at: (i + 8)) + (self at: (i + 
9))].

     ((self size quo: 10) * 10 + 1) to: self size do: [ :i |
         sum := sum + (self at: i)].
       ^ sum
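
For anyone who wants to reproduce the ordering, a rough driver like the one below
works. The selectors #sumNaive and #sumUnrolled are placeholders for wherever you
install the experimental versions above; they are not existing methods.

"Placeholder driver: #sumNaive and #sumUnrolled are made-up selectors standing in
for the experimental versions above. Times each variant on the same Array of Floats."
| data |
data := (1 to: 10000000) collect: [ :i | i asFloat ].
{ 'default #sum' -> [ data sum ].
  'naive loop' -> [ data sumNaive ].
  'unrolled x10' -> [ data sumUnrolled ] } do: [ :pair |
	Transcript show: pair key , ': ' , pair value timeToRun asString; cr ]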

Summary

For whatever reason, iterating over and accessing an Array is expensive. That
alone took longer than it took Python to complete the entire test.

I had allowed this knowledge of how much slower Pharo was to stop me from using
Pharo. It encouraged me to explore other options.

I have the option to use any language I want. I like Pharo. I do not like 
Python at all. Julia is unexciting to me. I don't like their anti-OO approach.

At one point I had a fairly complete Pharo implementation, which is where I got 
frustrated with backtesting taking days.

That implementation is gone. I had not switched to Iceberg. I had a problem 
with my hard drive. So I am starting over.

I am not a computer scientist, language expert, VM expert, or anyone with the
skills to dig in and optimize Arrays. So I will end my tilting at windmills
here.

I value all the other things that Pharo brings, which I miss when I am using
Julia or Python or Crystal, etc. Those languages do not have the vision to do
what Pharo (or any Smalltalk) does.

Pharo may not optimize my app as much as x, y, or z. But Pharo optimized me.

That said, I have made the decision to go all in with Pharo and set aside all else.
In that regard, I put my money behind my decision and joined the Pharo
Association last week.

Thanks for all of your help in exploring the problem.


Jimmie Houchin
