Unfortunately, I was not able to run the benchmark at all on the
latest Spur image+VM.

As soon as I try writing anything in the Workspace, I get
PrimitiveFailed: primitive #basicNew: in Array class failed. Since the
stack trace mentions font rendering, I thought it had something to do
with the fonts not being loaded, and tried

FreeTypeFontProvider current updateFromSystem

(at least, I was able to paste it) but it seems to have nothing to do
with it. For the record, I am on Ubuntu 14.04 running GNOME Shell.

Looking forward to the next stable release!

thank you again
Andrea


2015-02-17 11:54 GMT+01:00 Andrea Ferretti <ferrettiand...@gmail.com>:
> Thank you for the quick response! I will try what I get from the 4.0
> VM, and of course publish the updated result once Pharo4 is out.
>
> Of course, you can add the benchmark and tweak it for your needs.
>
> Thank you for all the good work you are doing! Reaching a speed near
> PyPy's would be a real game changer!
>
> 2015-02-17 11:24 GMT+01:00 Sven Van Caekenberghe <s...@stfx.eu>:
>>
>>> On 17 Feb 2015, at 11:06, Clément Bera <bera.clem...@gmail.com> wrote:
>>>
>>> Hello Andrea,
>>>
>>> The way you wrote your algorithm is nice, but it makes extensive use of
>>> closures and iterates a lot over collections.
>>
>> I was about to say the same thing.
>>
>>> Those are two aspects where Pharo's performance has issues. Eliot
>>> Miranda and I are working especially on these two cases to improve
>>> Pharo's performance. If you don't mind, I will add your algorithm to the
>>> benchmarks we use, because it makes extensive use of the cases we are
>>> trying to optimize, so its results on the bleeding-edge VM are very
>>> encouraging.
>>>
>>>
>>> About your implementation, someone familiar with Pharo would replace
>>> #timesRepeat: with #to:do: in the two places you use it.
>>>
>>> For example:
>>> run: points times: times
>>>     1 to: times do: [ :i | self run: points ].
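>>>
>>> compared with the original version using #timesRepeat: (a sketch of what
>>> I assume your method looks like; the actual code may differ slightly):
>>>
>>> run: points times: times
>>>     times timesRepeat: [ self run: points ].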
>>>
>>> I don't believe it makes the code much harder to read, and depending on
>>> the number of iterations it may show real improvements, because #to:do:
>>> is optimized at compile time. I tried it and got a 15% reduction in
>>> overall run time, though only on the bleeding-edge VM.
>>
>> That is a lot of difference for such a small change.
>>
>>> Another thing is that #groupedBy: is almost never used in the system,
>>> and it's really *not* optimized at all. Maybe another collection
>>> protocol is better and no less readable; I don't know.
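>>>
>>> For instance, grouping with an explicit Dictionary could look roughly
>>> like this (just a sketch; #closestCentroidTo: is a made-up selector
>>> standing in for whatever key you group by):
>>>
>>> groups := Dictionary new.
>>> points do: [ :point |
>>>     (groups
>>>         at: (self closestCentroidTo: point)
>>>         ifAbsentPut: [ OrderedCollection new ]) add: point ].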
>>>
>>>
>>> Now about solutions:
>>>
>>> Firstly, the VM is getting faster.
>>>         The Pharo 4 VM, to be released in July 2015, should be at least
>>> 2x faster than the current one. I tried it on your benchmark and got
>>> 5352.7 instead of 22629.1 on my machine, which is over a 4x performance
>>> boost and puts Pharo between the Factor and Clojure results.
>>
>> Super. Thank you, Esteban and of course Eliot, for such great work;
>> eventually we'll all be better off thanks to these improvements.
>>
>>> An alpha release is available here: 
>>> https://ci.inria.fr/pharo/view/4.0-VM-Spur/ . You need to use 
>>> PharoVM-spur32 as a VM and Pharo-spur32 as an image (Yes, the image changed 
>>> too). You should be able to load your code, try your benchmark and have a 
>>> similar result.
>>
>> I did a quick test (first time I tried Spur) and code loading was 
>> spectacularly fast. But the ride is still rough ;-)
>>
>>>         In addition, for Pharo 5 we're working on making the VM much
>>> faster again on benchmarks like yours. We hope to have an alpha release
>>> this summer, but we can't say for sure that it will be ready. For this
>>> second step, I'm at a point where I can barely run a benchmark without a
>>> crash, so I can't tell you right now the exact performance to expect,
>>> but barring a miracle it should land somewhere between PyPy and Scala
>>> performance (it will reach full performance once it matures, not at the
>>> first release anyway). I don't think we'll reach the performance of
>>> languages such as Nim or Rust any time soon. They're very different
>>> from Pharo: direct compilation to machine code, many low-level
>>> types, ... I'm not even sure a Java implementation could compete with
>>> them.
>>>
>>> Secondly, you can use bindings to native code instead. I showed here how
>>> to write the code in C and bind it with a simple callout, which may be
>>> what you need for your benchmark:
>>> https://clementbera.wordpress.com/2013/06/19/optimizing-pharo-to-c-speed-with-nativeboost-ffi/
>>>  . However, this way of calling C does not work on the latest VM. There
>>> are three existing frameworks for calling C from Pharo, each with pros
>>> and cons; we're trying to unify them, but it's taking time. I believe
>>> that by the July release of Pharo 4 there will be an official
>>> recommended way of calling C, and that's the one you should use.
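>>>
>>> For reference, a NativeBoost callout used to look roughly like this
>>> (from memory; the exact pragma, function, and module names here are
>>> illustrative assumptions, and as said this no longer works on the
>>> latest VM):
>>>
>>> sum: a with: b
>>>     "Call a C function int sum(int, int) from a hypothetical libsum"
>>>     <primitive: #primitiveNativeCall module: #NativeBoostPlugin>
>>>     ^ self nbCall: #( int sum (int a, int b) ) module: 'libsum'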
>>>
>>>
>>> I hope this is a satisfying answer :-). I'm glad some people are deeply
>>> interested in Pharo performance.
>>>
>>> Best,
>>>
>>> Clement
>>>
>>>
>>>
>>> 2015-02-17 9:03 GMT+01:00 Andrea Ferretti <ferrettiand...@gmail.com>:
>>> Hi, a while ago I was evaluating Pharo as a platform for interactive
>>> data exploration, mining and visualization.
>>>
>>> I was fairly impressed by the tools offered by the Pharo distribution,
>>> but I had a general feeling that the platform was a little slow, so I
>>> decided to set up a small benchmark, given by an implementation of
>>> K-means.
>>>
>>> The original intention was to compare Pharo to Python (a language that
>>> is often used in this niche) and Scala (the language that we use in
>>> production), but since then I have implemented a few other languages
>>> as well. You can find the benchmark here
>>>
>>> https://github.com/andreaferretti/kmeans
>>>
>>> Unfortunately, it turns out that Pharo is indeed the slowest among the
>>> implementations that I have tried. Since I am not an expert on Pharo
>>> or Smalltalk in general, I am asking advice here to find out if maybe
>>> I am doing something stupid.
>>>
>>> To be clear: the aim is *not* to have an optimized version of K-means.
>>> There are various ways to improve the algorithm that I am using, but I
>>> am trying to get a feeling for the performance of an algorithm that a
>>> casual user could implement without much thought while exploring some
>>> data. So I am not looking for:
>>>
>>> - better algorithms
>>> - clever optimizations, such as, say, invoking native code
>>>
>>> I am asking here because there is the real possibility that I am just
>>> messing something up, and the same naive algorithm, written by someone
>>> more competent, would show real improvements.
>>>
>>> Please, let me know if you find anything
>>> Best,
>>> Andrea
>>>
>>>
>>
>>
