On 29 Jul 2013, at 23:20, Chris <cpmbai...@btinternet.com> wrote:

> On 29/07/2013 21:33, Sven Van Caekenberghe wrote:
>> Chris,
>> 
>> On 29 Jul 2013, at 20:52, Chris <cpmbai...@btinternet.com> wrote:
>> 
>>> I've been getting a little concerned with certain aspects of performance 
>>> recently. Just a couple of examples off the top of my head were trying to 
>>> do a printString on 200000 floats which takes over 3 seconds. If I do the 
>>> same in Python it is only 0.25 seconds. Similarly reading 65000 points from 
>>> a database with the PostgresV2 driver was about 800 ms and only 40 ms with 
>>> psycopg. I'd have to try it again but am pretty sure going native was 
>>> faster than OpenDBX as well. I appreciate Pharo is never going to be able 
>>> to compete with the static-typed heavyweight languages but would hope we 
>>> can get performance at least comparable to other dynamic languages :) Is it 
>>> just that some method implementations are in need of some TLC; more things 
>>> moved on top of C libraries and primitives and so forth rather than 
>>> anything with the VM itself?
>>> 
>>> Cheers
>>> Chris
>> You could try:
>> 
>> | floats |
>> floats := (1 to: 200000) collect: #asFloat.
>> [ FloatPrintPolicy   
>>     value: InexactFloatPrintPolicy new
>>     during: [
>>         String new: 1500000 streamContents: [ :stream |
>>             floats do: [ :each | each printOn: stream ] ] ] ] timeToRun
>> 
>> => 796 ms
>> 
>> I haven't looked at the Postgresql driver in detail, but I would guess that 
>> PostgresV2 reads from a String somewhere, which is slow unless care is 
>> taken, while psycopg probably does a binary read. That last thing can be 
>> done in Pharo as well, but not if the driver is text based somehow.
>> 
>> You are right: Pharo should definitely be in the same range as other 
>> dynamically typed languages. But dynamic languages are dangerous: users 
>> become lazy / ignorant about performance.
>> 
>> Thanks for pushing this.
>> 
>> Sven
>> 
>> --
>> Sven Van Caekenberghe
>> http://stfx.eu
>> Smalltalk is the Red Pill
>> 
>> 
>> 
> Thanks for the float tip :) I think I need to investigate the Postgres thing 
> a bit further. I thought it was a fairly native driver but the double array 
> type may well be using a string

Here is some code showing that it is possible to read a lot of data quickly, 
although it probably does not match your example directly (I did not have 
enough information).

| points bytes string |
points := (1 to: 65000) collect: [ :each | each asPoint ].
bytes := ByteArray streamContents: [ :stream |
        points do: [ :each |
                stream nextInt32Put: each x; nextInt32Put: each y ] ].
string := String streamContents: [ :stream |
        points do: [ :each |
                stream print: each x; space; print: each y; space ] ].
{ 
[ Array new: 65000 streamContents: [ :out | | in |
        in := bytes readStream.
        [ in atEnd ] whileFalse: [ 
                out nextPut: (in nextInt32 @ in nextInt32) ] ] ] timeToRun.
[ Array new: 65000 streamContents: [ :out | | in |
        in := string readStream.
        [ in atEnd ] whileFalse: [ | x y |
                x := Integer readFrom: in.
                in peekFor: $ .
                y := Integer readFrom: in.
                in peekFor: $ .
                out nextPut: x @ y ] ] ] timeToRun
} 

=> #(51 65)

Both are within the same order of magnitude as Python. With Floats, it will be 
somewhat slower, especially the textual part.
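
For the textual Float case, a minimal sketch along the same lines could look 
like the following. It assumes that Number class>>readFrom: accepts a read 
stream positioned at the start of a number; the variable names are only 
illustrative, and it has not been timed here, so no result is given.

| floats string |
"65000 non-trivial Float values, printed as space-separated text"
floats := (1 to: 65000) collect: [ :each | each / 3.0 ].
string := String streamContents: [ :stream |
        floats do: [ :each | stream print: each; space ] ].
"Time parsing the text back into Floats"
[ Array new: 65000 streamContents: [ :out | | in |
        in := string readStream.
        [ in atEnd ] whileFalse: [
                out nextPut: (Number readFrom: in).
                in peekFor: Character space ] ] ] timeToRun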

The explanation for the slowdown must be in the PgV2 driver.

Sven

