Re: [Pharo-users] Pharo performance
I think my package is based on the one from the DBX suite. It has a few edits, some of which I'm confident should go in, and others just for my needs which probably aren't fit to go in.

Disappointingly, I got the psycopg test wrong and it is only taking 30 ms :( I guess this is the difference when something is implemented on top of C and libpq (I presume)

From: Mariano Martinez Peck marianop...@gmail.com
To: Any question about pharo is welcome pharo-users@lists.pharo.org
Sent: Wednesday, 31 July 2013, 22:40
Subject: Re: [Pharo-users] Pharo performance

On Wed, Jul 31, 2013 at 6:32 PM, Chris cpmbai...@btinternet.com wrote:

It definitely seems to be because most things in the V2 driver are coming in as strings and then being converted. Running the same query on the three systems gives the following for a 65000 double array:

PGV2 770 ms
PGV3 200 ms
Psycopg2 130 ms

Now I just need to find out how to get PGV3 as the backend for Glorp!

Please please please use the Glorp version you get from the DBX suite. You need to create a subclass of DatabaseDriver. You can take a look at NativePostgresDriver and create a similar one for V3. I would also like to have Glorp + PostgresV3 running. Also... we should somehow use NativePostgresPoolingDriver and NativePostgresConnectionPool for V3... we need some refactoring here... We should join forces!!

Cheers
Chris

Hi Yanni,

On 30 Jul 2013, at 05:17, Yanni Chiu ya...@rogers.com wrote:

On 29/07/13 7:08 PM, Sven Van Caekenberghe wrote:

The explanation for the slowdown must be in the PgV2 driver.

The PgV2 protocol is described at:
http://www.postgresql.org/docs/7.1/static/protocol-message-formats.html

Have a glance at the AsciiRow and BinaryRow message formats. The driver reads the data off the socket, parsing the data as described by the message format. With the V2 protocol design, you have to read the result row one field at a time. IIUC, in the newer V3 protocol, the AsciiRow/BinaryRow message is replaced by a DataRow message. The DataRow message has the message size included, which could allow the driver to read the entire set of fields for one data row using a single socket read (or a few buffer-sized reads). I recall seeing an experimental V3 protocol implementation a few years back - sorry, no links handy. It would be nice to see some benchmarks. Hope that helps.

Thanks for the response. I believe the V3 project is here:
http://www.squeaksource.com/PostgresV3.html

Now, I probably spoke too fast and should have taken Mariano's advice to never speculate and first measure. Here is my quick test that, for me, shows that PostgresV2 seems more than fast enough (something that I had experienced before without doing benchmarks).

[ self execute: 'select longitude,latitude from log_data limit 1;' ] timeToRun. = 76 ms
[ self execute: 'select longitude,latitude from log_data limit 10;' ] timeToRun. = 765 ms

This is querying for 2 floats from a huge table, over the network. Pretty fast ;-)

So, back to Chris: what exactly are you doing that is (so) slow ?

Anyway, thanks Yanni for all your work on the existing driver !

Sven

--
Sven Van Caekenberghe
Proudly supporting Pharo
http://pharo.org
http://association.pharo.org
http://consortium.pharo.org

--
Mariano
http://marianopeck.wordpress.com
Re: [Pharo-users] Pharo performance
On Thu, Aug 1, 2013 at 4:06 AM, CHRIS BAILEY cpmbai...@btinternet.com wrote:

I think my package is based on the one from the DBX suite. It has a few edits, some of which I'm confident should go in, and others just for my needs which probably aren't fit to go in.

Ok, please let me know how that goes.

Disappointingly, I got the psycopg test wrong and it is only taking 30 ms :( I guess this is the difference when something is implemented on top of C and libpq (I presume)

Mmm, maybe. For that you could compare OpenDBXDriver with Postgres, but you will need the OpenDBX lib and FFI.

From: Mariano Martinez Peck marianop...@gmail.com
To: Any question about pharo is welcome pharo-users@lists.pharo.org
Sent: Wednesday, 31 July 2013, 22:40
Subject: Re: [Pharo-users] Pharo performance

On Wed, Jul 31, 2013 at 6:32 PM, Chris cpmbai...@btinternet.com wrote:

It definitely seems to be because most things in the V2 driver are coming in as strings and then being converted. Running the same query on the three systems gives the following for a 65000 double array:

PGV2 770 ms
PGV3 200 ms
Psycopg2 130 ms

Now I just need to find out how to get PGV3 as the backend for Glorp!

Please please please use the Glorp version you get from the DBX suite. You need to create a subclass of DatabaseDriver. You can take a look at NativePostgresDriver and create a similar one for V3. I would also like to have Glorp + PostgresV3 running. Also... we should somehow use NativePostgresPoolingDriver and NativePostgresConnectionPool for V3... we need some refactoring here... We should join forces!!

Cheers
Chris

Hi Yanni,

On 30 Jul 2013, at 05:17, Yanni Chiu ya...@rogers.com wrote:

On 29/07/13 7:08 PM, Sven Van Caekenberghe wrote:

The explanation for the slowdown must be in the PgV2 driver.

The PgV2 protocol is described at:
http://www.postgresql.org/docs/7.1/static/protocol-message-formats.html

Have a glance at the AsciiRow and BinaryRow message formats. The driver reads the data off the socket, parsing the data as described by the message format. With the V2 protocol design, you have to read the result row one field at a time. IIUC, in the newer V3 protocol, the AsciiRow/BinaryRow message is replaced by a DataRow message. The DataRow message has the message size included, which could allow the driver to read the entire set of fields for one data row using a single socket read (or a few buffer-sized reads). I recall seeing an experimental V3 protocol implementation a few years back - sorry, no links handy. It would be nice to see some benchmarks. Hope that helps.

Thanks for the response. I believe the V3 project is here:
http://www.squeaksource.com/PostgresV3.html

Now, I probably spoke too fast and should have taken Mariano's advice to never speculate and first measure. Here is my quick test that, for me, shows that PostgresV2 seems more than fast enough (something that I had experienced before without doing benchmarks).

[ self execute: 'select longitude,latitude from log_data limit 1;' ] timeToRun. = 76 ms
[ self execute: 'select longitude,latitude from log_data limit 10;' ] timeToRun. = 765 ms

This is querying for 2 floats from a huge table, over the network. Pretty fast ;-)

So, back to Chris: what exactly are you doing that is (so) slow ?

Anyway, thanks Yanni for all your work on the existing driver !

Sven

--
Sven Van Caekenberghe
Proudly supporting Pharo
http://pharo.org
http://association.pharo.org
http://consortium.pharo.org

--
Mariano
http://marianopeck.wordpress.com
Re: [Pharo-users] Pharo performance
Ok, so we will create a private business mailing list.

Stef

On Jul 30, 2013, at 11:41 PM, p...@highoctane.be wrote:

Yes, as discussed, I am pushing Pharo, and discussing business in the open just doesn't work for me.
Re: [Pharo-users] Pharo performance
On Wed, Jul 31, 2013 at 6:32 PM, Chris cpmbai...@btinternet.com wrote:

It definitely seems to be because most things in the V2 driver are coming in as strings and then being converted. Running the same query on the three systems gives the following for a 65000 double array:

PGV2 770 ms
PGV3 200 ms
Psycopg2 130 ms

Now I just need to find out how to get PGV3 as the backend for Glorp!

Please please please use the Glorp version you get from the DBX suite. You need to create a subclass of DatabaseDriver. You can take a look at NativePostgresDriver and create a similar one for V3. I would also like to have Glorp + PostgresV3 running. Also... we should somehow use NativePostgresPoolingDriver and NativePostgresConnectionPool for V3... we need some refactoring here... We should join forces!!

Cheers
Chris

Hi Yanni,

On 30 Jul 2013, at 05:17, Yanni Chiu ya...@rogers.com wrote:

On 29/07/13 7:08 PM, Sven Van Caekenberghe wrote:

The explanation for the slowdown must be in the PgV2 driver.

The PgV2 protocol is described at:
http://www.postgresql.org/docs/7.1/static/protocol-message-formats.html

Have a glance at the AsciiRow and BinaryRow message formats. The driver reads the data off the socket, parsing the data as described by the message format. With the V2 protocol design, you have to read the result row one field at a time. IIUC, in the newer V3 protocol, the AsciiRow/BinaryRow message is replaced by a DataRow message. The DataRow message has the message size included, which could allow the driver to read the entire set of fields for one data row using a single socket read (or a few buffer-sized reads). I recall seeing an experimental V3 protocol implementation a few years back - sorry, no links handy. It would be nice to see some benchmarks. Hope that helps.

Thanks for the response. I believe the V3 project is here:
http://www.squeaksource.com/PostgresV3.html

Now, I probably spoke too fast and should have taken Mariano's advice to never speculate and first measure. Here is my quick test that, for me, shows that PostgresV2 seems more than fast enough (something that I had experienced before without doing benchmarks).

[ self execute: 'select longitude,latitude from log_data limit 1;' ] timeToRun. = 76 ms
[ self execute: 'select longitude,latitude from log_data limit 10;' ] timeToRun. = 765 ms

This is querying for 2 floats from a huge table, over the network. Pretty fast ;-)

So, back to Chris: what exactly are you doing that is (so) slow ?

Anyway, thanks Yanni for all your work on the existing driver !

Sven

--
Sven Van Caekenberghe
Proudly supporting Pharo
http://pharo.org
http://association.pharo.org
http://consortium.pharo.org

--
Mariano
http://marianopeck.wordpress.com
Re: [Pharo-users] Pharo performance
BTW, is there any special consortium list where we can discuss business out of the public eye?

---
Philippe Back
Dramatic Performance Improvements
Mob: +32 (0) 478 650 140 | Fax: +32 (0) 70 408 027
Mail: p...@highoctane.be | Web: http://philippeback.eu
Blog: http://philippeback.be | Twitter: @philippeback
Youtube: http://www.youtube.com/user/philippeback/videos
High Octane SPRL
rue cour Boisacq 101 | 1301 Bierges | Belgium
Featured on the Software Process and Measurement Cast http://spamcast.libsyn.com
Sparx Systems Enterprise Architect and Ability Engineering EADocX Value Added Reseller

On Tue, Jul 30, 2013 at 11:02 PM, Stéphane Ducasse stephane.duca...@inria.fr wrote:

You could try:

| floats |
floats := (1 to: 20) collect: #asFloat.
[ FloatPrintPolicy
    value: InexactFloatPrintPolicy new
    during: [ String new: 150 streamContents: [ :stream |
        floats do: [ :each | each printOn: stream ] ] ] ] timeToRun = 796 ms

I haven't looked at the Postgresql driver in detail, but I would guess that PostgresV2 reads from a String somewhere, which is slow unless care is taken, while psycopg probably does a binary read. That last thing can be done in Pharo as well, but not if the driver is text based somehow.

You are right: Pharo should definitely be in the same range as other dynamically typed languages.

But dynamic languages are dangerous: users become lazy / ignorant about performance. Thanks for pushing this.

+1 we need more people to profile and look at these aspects. I can tell you that we are all working like crazy to improve the system, and I hope that one of these days we will have real regression tests. But well, you know, it takes a lot of time.

Stef
Re: [Pharo-users] Pharo performance
Yes, normally there is one :) Now I will check who is in. What we should pay attention to is that we are not really in favor of private discussions (even if I understand that for business it is important). If you think that this would be good, we can create a private business list.

Stef

BTW, is there any special consortium list where we can discuss business out of the public eye?

---
Philippe Back
Dramatic Performance Improvements
Mob: +32 (0) 478 650 140 | Fax: +32 (0) 70 408 027
Mail: p...@highoctane.be | Web: http://philippeback.eu
Blog: http://philippeback.be | Twitter: @philippeback
Youtube: http://www.youtube.com/user/philippeback/videos
High Octane SPRL
rue cour Boisacq 101 | 1301 Bierges | Belgium
Featured on the Software Process and Measurement Cast http://spamcast.libsyn.com
Sparx Systems Enterprise Architect and Ability Engineering EADocX Value Added Reseller

On Tue, Jul 30, 2013 at 11:02 PM, Stéphane Ducasse stephane.duca...@inria.fr wrote:

You could try:

| floats |
floats := (1 to: 20) collect: #asFloat.
[ FloatPrintPolicy
    value: InexactFloatPrintPolicy new
    during: [ String new: 150 streamContents: [ :stream |
        floats do: [ :each | each printOn: stream ] ] ] ] timeToRun = 796 ms

I haven't looked at the Postgresql driver in detail, but I would guess that PostgresV2 reads from a String somewhere, which is slow unless care is taken, while psycopg probably does a binary read. That last thing can be done in Pharo as well, but not if the driver is text based somehow.

You are right: Pharo should definitely be in the same range as other dynamically typed languages.

But dynamic languages are dangerous: users become lazy / ignorant about performance. Thanks for pushing this.

+1 we need more people to profile and look at these aspects. I can tell you that we are all working like crazy to improve the system, and I hope that one of these days we will have real regression tests. But well, you know, it takes a lot of time.

Stef
Re: [Pharo-users] Pharo performance
Yes, as discussed, I am pushing Pharo and discussing business in the open just doesn't work for me.
Re: [Pharo-users] Pharo performance
On 29/07/2013 21:33, Sven Van Caekenberghe wrote:

Chris,

On 29 Jul 2013, at 20:52, Chris cpmbai...@btinternet.com wrote:

I've been getting a little concerned with certain aspects of performance recently. Just a couple of examples off the top of my head were trying to do a printString on 20 floats, which takes over 3 seconds. If I do the same in Python it is only 0.25 seconds. Similarly, reading 65000 points from a database with the PostgresV2 driver was about 800 ms, and only 40 ms with psycopg. I'd have to try it again, but am pretty sure going native was faster than OpenDBX as well. I appreciate Pharo is never going to be able to compete with the statically typed heavyweight languages, but would hope we can get performance at least comparable to other dynamic languages :) Is it just that some method implementations are in need of some TLC; more things moved on top of C libraries and primitives and so forth, rather than anything with the VM itself?

Cheers
Chris

You could try:

| floats |
floats := (1 to: 20) collect: #asFloat.
[ FloatPrintPolicy
    value: InexactFloatPrintPolicy new
    during: [ String new: 150 streamContents: [ :stream |
        floats do: [ :each | each printOn: stream ] ] ] ] timeToRun = 796 ms

I haven't looked at the Postgresql driver in detail, but I would guess that PostgresV2 reads from a String somewhere, which is slow unless care is taken, while psycopg probably does a binary read. That last thing can be done in Pharo as well, but not if the driver is text based somehow.

You are right: Pharo should definitely be in the same range as other dynamically typed languages. But dynamic languages are dangerous: users become lazy / ignorant about performance. Thanks for pushing this.

Sven

--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill

Thanks for the float tip :) I think I need to investigate the Postgres thing a bit further. I thought it was a fairly native driver, but the double array type may well be using a string.
Re: [Pharo-users] Pharo performance
On 29 Jul 2013, at 23:20, Chris cpmbai...@btinternet.com wrote:

On 29/07/2013 21:33, Sven Van Caekenberghe wrote:

Chris,

On 29 Jul 2013, at 20:52, Chris cpmbai...@btinternet.com wrote:

I've been getting a little concerned with certain aspects of performance recently. Just a couple of examples off the top of my head were trying to do a printString on 20 floats, which takes over 3 seconds. If I do the same in Python it is only 0.25 seconds. Similarly, reading 65000 points from a database with the PostgresV2 driver was about 800 ms, and only 40 ms with psycopg. I'd have to try it again, but am pretty sure going native was faster than OpenDBX as well. I appreciate Pharo is never going to be able to compete with the statically typed heavyweight languages, but would hope we can get performance at least comparable to other dynamic languages :) Is it just that some method implementations are in need of some TLC; more things moved on top of C libraries and primitives and so forth, rather than anything with the VM itself?

Cheers
Chris

You could try:

| floats |
floats := (1 to: 20) collect: #asFloat.
[ FloatPrintPolicy
    value: InexactFloatPrintPolicy new
    during: [ String new: 150 streamContents: [ :stream |
        floats do: [ :each | each printOn: stream ] ] ] ] timeToRun = 796 ms

I haven't looked at the Postgresql driver in detail, but I would guess that PostgresV2 reads from a String somewhere, which is slow unless care is taken, while psycopg probably does a binary read. That last thing can be done in Pharo as well, but not if the driver is text based somehow.

You are right: Pharo should definitely be in the same range as other dynamically typed languages. But dynamic languages are dangerous: users become lazy / ignorant about performance. Thanks for pushing this.

Sven

--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill

Thanks for the float tip :) I think I need to investigate the Postgres thing a bit further. I thought it was a fairly native driver, but the double array type may well be using a string.

Here is some code that basically shows that it is possible to read a lot of data quickly - although it probably does not match your example directly (I had not enough information).

| points bytes string |
points := (1 to: 65000) collect: [ :each | each asPoint ].
bytes := ByteArray streamContents: [ :stream |
    points do: [ :each | stream nextInt32Put: each x; nextInt32Put: each y ] ].
string := String streamContents: [ :stream |
    points do: [ :each | stream print: each x; space; print: each y; space ] ].
{
    [ Array new: 65000 streamContents: [ :out | | in |
        in := bytes readStream.
        [ in atEnd ] whileFalse: [
            out nextPut: (in nextInt32 @ in nextInt32) ] ] ] timeToRun.
    [ Array new: 65000 streamContents: [ :out | | in |
        in := string readStream.
        [ in atEnd ] whileFalse: [ | x y |
            x := Integer readFrom: in. in peekFor: $ .
            y := Integer readFrom: in. in peekFor: $ .
            out nextPut: x @ y ] ] ] timeToRun
} = #(51 65)

Both are in the order of magnitude of Python. With Floats, it will be somewhat slower, especially the textual part. The explanation for the slowdown must be in the PgV2 driver.

Sven
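Sven's binary-versus-textual experiment can be mirrored in Python for comparison. The following is an illustrative sketch only (the variable names, encodings and sizes are my own, not from the thread): it encodes the same 65000 integer points once as packed 32-bit integers and once as space-separated text, then times decoding each form.

```python
import struct
import time
from io import BytesIO

points = [(i, i) for i in range(1, 65001)]

# Binary encoding: two big-endian 32-bit integers per point.
binary = b"".join(struct.pack(">ii", x, y) for x, y in points)
# Textual encoding: space-separated decimal integers.
text = " ".join(f"{x} {y}" for x, y in points)

# Decode the binary form, 8 bytes (one point) at a time.
start = time.perf_counter()
stream = BytesIO(binary)
decoded_binary = []
while True:
    chunk = stream.read(8)
    if not chunk:
        break
    decoded_binary.append(struct.unpack(">ii", chunk))
binary_time = time.perf_counter() - start

# Decode the textual form by splitting and parsing integers.
start = time.perf_counter()
tokens = iter(text.split())
decoded_text = [(int(x), int(y)) for x, y in zip(tokens, tokens)]
text_time = time.perf_counter() - start

print(f"binary: {binary_time * 1000:.1f} ms, text: {text_time * 1000:.1f} ms")
```

As in Sven's Smalltalk version, both decoders produce the same points; the interesting part is only the relative cost of parsing text against unpacking fixed-width binary fields.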
Re: [Pharo-users] Pharo performance
I forward Nico's answer because he is not on the mailing list.

On Mon, Jul 29, 2013 at 6:12 PM, Nicolas Cellier nicolas.cellier.aka.n...@gmail.com wrote:

First thing, printing a Float correctly is a difficult task. Let's recall the requirements for a decimal printString:

1) any two different Floats shall have different printString representations
2) the printString should be re-interpreted as the same Float
3) the printString should be the shortest decimal that would be re-interpreted as the same Float

Requirement 2) is stronger than 1), and 3) is stronger than 2). For a REPL, we at least want 2).

If we restrict ourselves to requirement 2), then we can simply use printf with 17 decimal digits to accelerate printString. But then we must expect 0.1 to be printed '0.10000000000000001' or something like that, an awful thing for a REPL.

Requirement 3) is stronger than 2), and it nicely gives 0.1 printString -> '0.1'. This is what Python, Java, Scheme and maybe some other languages do.

Squeak/Pharo has implemented 3) for a long time by using the algorithm of Scheme, absPrintExactlyOn:base:. It is a relatively easy, naive implementation using LargeInteger arithmetic. But it did not use it by default, which made Squeak/Pharo not even respect requirement 1). So all I did was to make this algorithm the default choice (and maybe correct some edge cases, I can't remember).

But LargeInteger arithmetic is relatively slow (compared to SmallInteger / Float arithmetic). So I have done plenty of things to try and (marginally) accelerate printString. But almost all of it should already be integrated in Pharo. My last experiments consisted in rewriting some LargeInteger primitives to use 32-bit digits instead of 8-bit digits. This work is mostly ready. I personally work exclusively with such a modified VM. Only ImageSegment support on the BigEndian VM is lacking, but this should hardly be a problem for Pharo. It has not been integrated, but again, the acceleration of Float printString is marginal.

Maybe it could be accelerated with a primitive, but this will be a tough task.

In the presence of an identified bottleneck, the questions are:

a) can we use a binary format?
b) can we use another exact representation (like a hexadecimal printString of the float bits for example, or the C printf %a format...)?
c) can we use a degraded solution 2) (printf with 17 decimals)?
d) can we afford an inexact representation not satisfying 1)-2)-3)?

To me, it depends mostly on usage. a) and b) are not for a human end-user, for example, but is a human going to grok Giga-Floats? a) and b) might be solutions for serialization, but it mostly depends on the other end. It must be an application-specific decision. Maybe support for solutions b) and c) could help.

Nicolas

2013/7/29 Mariano Martinez Peck marianop...@gmail.com

I remember Nicolas Cellier did something about improving the performance of Float printing. Not sure what the state is now.

On Mon, Jul 29, 2013 at 4:50 PM, Chris cpmbai...@btinternet.com wrote:

I often use the profiler :) The float printString is all fairly evenly split, which is why I mention whether another implementation may be required. I'll always raise anything much more obvious that I see!

Hi Chris,

My recommendation in this case is always: do not spend a single second trying to figure out what the bottleneck is by yourself. The first thing ever to do is to run a profiler. You can do that very easily in Pharo:

TimeProfiler spyOn: [
    Transcript show: 'do something for real here to benchmark'.
    Object allSubclasses. ]

Replace the closure with what you want to benchmark. Then, share with us if you have interesting findings.

Cheers,

On Mon, Jul 29, 2013 at 3:52 PM, Chris cpmbai...@btinternet.com wrote:

I've been getting a little concerned with certain aspects of performance recently. Just a couple of examples off the top of my head were trying to do a printString on 20 floats, which takes over 3 seconds. If I do the same in Python it is only 0.25 seconds. Similarly, reading 65000 points from a database with the PostgresV2 driver was about 800 ms, and only 40 ms with psycopg. I'd have to try it again, but am pretty sure going native was faster than OpenDBX as well. I appreciate Pharo is never going to be able to compete with the statically typed heavyweight languages, but would hope we can get performance at least comparable to other dynamic languages :) Is it just that some method implementations are in need of some TLC; more things moved on top of C libraries and primitives and so forth, rather than anything with the VM itself?

Cheers
Chris

--
Mariano
http://marianopeck.wordpress.com
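Nicolas's three requirements can be checked concretely in Python, whose float repr has produced the shortest round-tripping decimal (his requirement 3) since Python 3.1. The snippet below is an illustrative check, not from the thread, contrasting that with a fixed 17-significant-digit rendering, which satisfies only requirement 2:

```python
# Requirement 2: the printed form must read back as the same Float.
x = 0.1
assert float(repr(x)) == x  # repr round-trips exactly

# Requirement 3: repr picks the shortest decimal that round-trips.
shortest = repr(0.1)        # '0.1'

# A fixed 17-significant-digit rendering also round-trips
# (requirement 2), but is ugly for a REPL: '0.10000000000000001'.
fixed = "%.17g" % 0.1
assert float(fixed) == 0.1

print(shortest, fixed)
```

This is exactly the trade-off Nicolas describes: printf with 17 digits is fast and exact enough to re-read, but only a shortest-decimal algorithm gives the REPL-friendly '0.1'.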
Re: [Pharo-users] Pharo performance
On 29/07/13 7:08 PM, Sven Van Caekenberghe wrote:

The explanation for the slowdown must be in the PgV2 driver.

The PgV2 protocol is described at:
http://www.postgresql.org/docs/7.1/static/protocol-message-formats.html

Have a glance at the AsciiRow and BinaryRow message formats. The driver reads the data off the socket, parsing the data as described by the message format. With the V2 protocol design, you have to read the result row one field at a time. IIUC, in the newer V3 protocol, the AsciiRow/BinaryRow message is replaced by a DataRow message. The DataRow message has the message size included, which could allow the driver to read the entire set of fields for one data row using a single socket read (or a few buffer-sized reads). I recall seeing an experimental V3 protocol implementation a few years back - sorry, no links handy. It would be nice to see some benchmarks.

Hope that helps.
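As a rough illustration of why the size-prefixed DataRow message helps, here is a hedged Python sketch. The message layout is only loosely modelled on the PostgreSQL V3 wire format, and the helper names are hypothetical (not from any real driver): because the total length comes first, the reader can pull the whole row off the socket in one bulk read and then slice fields out of memory, instead of issuing a socket read per field as in V2.

```python
import struct
from io import BytesIO


def encode_data_row(fields):
    """Encode a DataRow-like message: a 16-bit field count, then each
    field as a 32-bit length prefix plus its bytes, the whole body
    preceded by a 32-bit total length (which includes itself, as in
    the real V3 protocol)."""
    body = struct.pack(">h", len(fields))
    for f in fields:
        body += struct.pack(">i", len(f)) + f
    return struct.pack(">i", len(body) + 4) + body


def read_data_row(sock):
    """Read one whole row with two reads: the length, then the body.
    Field parsing then happens against an in-memory buffer."""
    (length,) = struct.unpack(">i", sock.read(4))
    body = BytesIO(sock.read(length - 4))  # single bulk read
    (count,) = struct.unpack(">h", body.read(2))
    fields = []
    for _ in range(count):
        (flen,) = struct.unpack(">i", body.read(4))
        fields.append(body.read(flen))
    return fields


# Simulate a socket with BytesIO.
row = [b"51.5074", b"-0.1278"]
wire = BytesIO(encode_data_row(row))
assert read_data_row(wire) == row
```

With the V2 AsciiRow/BinaryRow formats there is no up-front total length, so a driver cannot do the single `sock.read(length - 4)` above; it has to alternate between reading each field length and each field body directly from the socket.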
Re: [Pharo-users] Pharo performance
Btw, one of the reasons why I would like to (some day) rewrite the DBXTalk driver, getting rid of OpenDBX, is because it always answers strings (char*) [1], which then need to be parsed too... slowing the process down a lot. But well, since it is not an easy task, it will be delayed indefinitely, I think (until someone really needs it) :)

cheers,
Esteban

[1] See: http://www.linuxnetworks.de/doc/index.php/OpenDBX/C_API/odbx_field_value

On Jul 30, 2013, at 5:17 AM, Yanni Chiu ya...@rogers.com wrote:

On 29/07/13 7:08 PM, Sven Van Caekenberghe wrote:

The explanation for the slowdown must be in the PgV2 driver.

The PgV2 protocol is described at:
http://www.postgresql.org/docs/7.1/static/protocol-message-formats.html

Have a glance at the AsciiRow and BinaryRow message formats. The driver reads the data off the socket, parsing the data as described by the message format. With the V2 protocol design, you have to read the result row one field at a time. IIUC, in the newer V3 protocol, the AsciiRow/BinaryRow message is replaced by a DataRow message. The DataRow message has the message size included, which could allow the driver to read the entire set of fields for one data row using a single socket read (or a few buffer-sized reads). I recall seeing an experimental V3 protocol implementation a few years back - sorry, no links handy. It would be nice to see some benchmarks.

Hope that helps.