Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
On Tue, Jun 1, 2010 at 8:59 AM, Matthew Wakeling wrote: > On Tue, 1 Jun 2010, Stephen Frost wrote: >> >> * Matthew Wakeling (matt...@flymine.org) wrote: >>> >>> The major case I found when writing pl/pgsql was when trying to build >>> arrays row by row. AFAIK when I tried it, adding a row to an array caused >>> the whole array to be copied, which put a bit of a damper on performance. >> >> Using the built-ins now available in 8.4 (array_agg), that copying >> doesn't happen any more. > > Thanks. I had wondered if that had been improved. even better is array(query) -- which has been around for a while. not too many people know about it because it's syntactically weird but it's the preferred way to build arrays when you don't need true aggregation (group by and such). generally speaking, concatenation of any kind in loops should be avoided in pl/pgsql. in fact, whenever writing pl/pgsql, it's all to easy to over-use the loop construct...every time you're looping it's always good to ask yourself: 'can this be done in a query?'. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
On Tue, 1 Jun 2010, Stephen Frost wrote: * Matthew Wakeling (matt...@flymine.org) wrote: The major case I found when writing pl/pgsql was when trying to build arrays row by row. AFAIK when I tried it, adding a row to an array caused the whole array to be copied, which put a bit of a damper on performance. Using the built-ins now available in 8.4 (array_agg), that copying doesn't happen any more. Thanks. I had wondered if that had been improved. Matthew -- Our riverbanks and seashores have a beauty all can share, provided there's at least one boot, three treadless tyres, a half-eaten pork pie, some oil drums, an old felt hat, a lorry-load of tar blocks, and a broken bedstead there. -- Flanders and Swann -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
* Matthew Wakeling (matt...@flymine.org) wrote: > The major case I found when writing pl/pgsql was when trying to build > arrays row by row. AFAIK when I tried it, adding a row to an array caused > the whole array to be copied, which put a bit of a damper on performance. Using the built-ins now available in 8.4 (array_agg), that copying doesn't happen any more. Thanks, Stephen signature.asc Description: Digital signature
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
On Fri, 28 May 2010, Merlin Moncure wrote: At best, if you are a ninja with the marginally documented backend api, you will create code that goes about as fast as your pl/pgsql function for 10 times the amount of input work, unless there are heavy amounts of 'other than sql' code in your function. The reason to write C in the backend is: *) Interface w/3rd party libraries w/C linkage *) Do things that are illegal in regular SQL (write files, etc) *) Make custom types The major case I found when writing pl/pgsql was when trying to build arrays row by row. AFAIK when I tried it, adding a row to an array caused the whole array to be copied, which put a bit of a damper on performance. Matthew -- "The problem with defending the purity of the English language is that English is about as pure as a cribhouse whore. We don't just borrow words; on occasion, English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary." - James Nicoll -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
On Wed, May 26, 2010 at 12:41 PM, Eliot Gable wrote: > Ah, that clears things up. Yes, the connections are more or less persistent. > I have a connection manager which doles connections out to the worker > threads and reclaims them when the workers are done with them. It > dynamically adds new connections based on load. Each worker obtains a > connection from the connection manager, performs a transaction which > involves executing the function and pulling back the results from the > cursors, then releases the connection back to the connection manager for > other workers to use. So, this means that even when written in C, the SQL > queries will be planned and cached on each connection after the first > execution. So, I guess the question just becomes whether using SPI in C has > any extra overhead verses using PL/PGSQL which might make it slower for > performing queries. Since PostgreSQL is written in C, I assume there is no > such additional overhead. I assume that the PL/PGSQL implementation at its > heart also uses SPI to perform those executions. Is that a fair statement? At best, if you are a ninja with the marginally documented backend api, you will create code that goes about as fast as your pl/pgsql function for 10 times the amount of input work, unless there are heavy amounts of 'other than sql' code in your function. The reason to write C in the backend is: *) Interface w/3rd party libraries w/C linkage *) Do things that are illegal in regular SQL (write files, etc) *) Make custom types Things like that. If your pl/pgsql function is running slow, it's probably better to look at what's going on there. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
On 5/26/10 9:47 AM, Stephen Frost wrote: * Eliot Gable (egable+pgsql-performa...@gmail.com) wrote: Since PostgreSQL is written in C, I assume there is no such additional overhead. I assume that the PL/PGSQL implementation at its heart also uses SPI to perform those executions. Is that a fair statement? Right, but I also wouldn't expect a huge improvment either, unless you're calling these queries a ton, or the queries that you're calling from the pl/pgsql are pretty short-lived. Don't get me wrong, C is going to be faster, but it depends on exactly what's going on as to if it's going to be an overall improvment of, say, 10%, or a 10-fold improvment. :) Or a 0.1% improvement, which is more likely. Or that the PL/PGSQL version is even faster than the C version, because if you do any string regexp in your function, Perl has extremely efficient algorithms, probably better than you have time to write in C. We use Perl extensively and have never had any complaints. The database activity completely dominates all queries, and the performance of Perl has never even been noticable. We use a C functions for a few things, and it is a big nuisance. Every time you upgrade Postgres or your OS, there's a chance the recompile will fail because of changed header files. Any bugs in your code crash Postgres itself. We avoid C as much as possible (and I love C, been doing it since 1984). Craig -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
* Eliot Gable (egable+pgsql-performa...@gmail.com) wrote: > Since PostgreSQL is written in C, I assume there is no > such additional overhead. I assume that the PL/PGSQL implementation at its > heart also uses SPI to perform those executions. Is that a fair statement? Right, but I also wouldn't expect a huge improvment either, unless you're calling these queries a ton, or the queries that you're calling from the pl/pgsql are pretty short-lived. Don't get me wrong, C is going to be faster, but it depends on exactly what's going on as to if it's going to be an overall improvment of, say, 10%, or a 10-fold improvment. :) Thanks, Stephen signature.asc Description: Digital signature
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
Ah, that clears things up. Yes, the connections are more or less persistent. I have a connection manager which doles connections out to the worker threads and reclaims them when the workers are done with them. It dynamically adds new connections based on load. Each worker obtains a connection from the connection manager, performs a transaction which involves executing the function and pulling back the results from the cursors, then releases the connection back to the connection manager for other workers to use. So, this means that even when written in C, the SQL queries will be planned and cached on each connection after the first execution. So, I guess the question just becomes whether using SPI in C has any extra overhead verses using PL/PGSQL which might make it slower for performing queries. Since PostgreSQL is written in C, I assume there is no such additional overhead. I assume that the PL/PGSQL implementation at its heart also uses SPI to perform those executions. Is that a fair statement? On Wed, May 26, 2010 at 12:32 PM, Stephen Frost wrote: > * Eliot Gable > (egable+pgsql-performa...@gmail.com) > wrote: > > Thanks for the quick follow-up. So, you are saying that if I can do SPI > in > > _PG_init, then I could prepare all my queries there and they would be > > prepared once for the entire function when it is loaded? That would > > certainly achieve what I want. Does anybody know whether I can do SPI in > > _PG_init? > > Unless you're using EXECUTE in your pl/pgsql, the queries in your > pl/pgsql function are already getting prepared on the first call of the > function for a given backend connection.. If you're using EXECUTE in > pl/gpsql then your problem might be planning time. Moving that to C > isn't going to change things as much as you might hope if you still have > to plan the query every time you call it.. > > > The function gets called a lot, but not in the same transaction. It is > only > > called once per transaction. > > That's not really relevant.. Is it called alot from the same > backend/database connection? If so, and if you're using regular SELECT > statements and the like (not EXECUTE), then they're getting prepared the > first time they're used and that is kept across transactions. > >Thanks, > >Stephen > > -BEGIN PGP SIGNATURE- > Version: GnuPG v1.4.9 (GNU/Linux) > > iEYEARECAAYFAkv9TTMACgkQrzgMPqB3kijiNQCfY/wTud+VZ4Z53Lw8cNY/N9ZD > 0R4AnA4diz1aptFGYXh3j8N9/k96C7/S > =6oz+ > -END PGP SIGNATURE- > > -- Eliot Gable "We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower "I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower "Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
* Eliot Gable (egable+pgsql-performa...@gmail.com) wrote: > Thanks for the quick follow-up. So, you are saying that if I can do SPI in > _PG_init, then I could prepare all my queries there and they would be > prepared once for the entire function when it is loaded? That would > certainly achieve what I want. Does anybody know whether I can do SPI in > _PG_init? Unless you're using EXECUTE in your pl/pgsql, the queries in your pl/pgsql function are already getting prepared on the first call of the function for a given backend connection.. If you're using EXECUTE in pl/gpsql then your problem might be planning time. Moving that to C isn't going to change things as much as you might hope if you still have to plan the query every time you call it.. > The function gets called a lot, but not in the same transaction. It is only > called once per transaction. That's not really relevant.. Is it called alot from the same backend/database connection? If so, and if you're using regular SELECT statements and the like (not EXECUTE), then they're getting prepared the first time they're used and that is kept across transactions. Thanks, Stephen signature.asc Description: Digital signature
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
Thanks for the quick follow-up. So, you are saying that if I can do SPI in _PG_init, then I could prepare all my queries there and they would be prepared once for the entire function when it is loaded? That would certainly achieve what I want. Does anybody know whether I can do SPI in _PG_init? The function gets called a lot, but not in the same transaction. It is only called once per transaction. On Wed, May 26, 2010 at 12:18 PM, Stephen Frost wrote: > * Eliot Gable > (egable+pgsql-performa...@gmail.com) > wrote: > > Would a query such as this obtain any performance improvement by being > > re-written using C? > > I wouldn't expect the queries called by the pl/pgsql function to be much > faster if called through SPI from C instead. I think the question you > need to answer is- how long does the pl/pgsql code take vs. the overall > time the function takes as a whole? You could then consider that your > 'max benefit' (or pretty close to it) which could be gained by rewriting > it in C. > > > Are there specific cases where writing a function in C would be highly > > desirable verses using PL/PGSQL (aside from simply gaining access to > > functionality not present in PL/PGSQL)? > > Cases where a function is called over and over again, or there are loops > which go through tons of data, or there's alot of data processing to be > done. > > > Are there specific cases where writing a function in C would be slower > than > > writing the equivalent in PL/PGSQL? > > Probably not- provided the C code is written correctly. You can > certainly screw that up (eg: not preparing a query in C and having PG > replan it every time would probably chew up any advantage C has over > pl/pgsql, in a simple function). > > > Basically, I am looking for some guidelines based primarily on > performance > > of when I should use C to write a function verses using PL/PGSQL. > > Realize that C functions have alot of other issues associated with them- > typically they're much larger foot-guns, for one, for another, C is an > untrusted language because it can do all kinds of bad things. So you > have to be a superuser to create them. > > > Can anybody quantify any of the performance differences between doing a > > particular task in C verses doing the same thing in PL/PGSQL? For > example, > > performing a SELECT query or executing a certain number of lines of > control > > logic (primarily IF/THEN, but an occasional loop included)? How about > > assignments or basic math like > > addition/subtraction/multiplication/division? > > Actually performing a SELECT through SPI vs. calling it from pl/pgsql > probably won't result in that much difference, presuming most of the > time there is in the actual query itself. Assignments, basic math, > control logic, etc, will all be faster in C. You need to figure out if > that work is taking enough time to justify the switch though. > > > When executing SQL queries inside a C-based function, is there any way to > > have all of the SQL queries pre-planned through the compilation process, > > definition of the function, and loading of the .so file similar to > PL/PGSQL? > > You might be able to do that when the module is loaded, but I'm not 100% > sure.. Depends on if you can start using SPI in _PG_init.. I think > there was some discussion about that recently but I'm not sure what the > answer was. > > > Would I get better performance writing each SQL query as a stored > procedure > > and then call these stored procedures from within a C-based function > which > > does the logging, math, control logic, and builds the result sets and > > cursors? > > Uhh, I'd guess 'no' to that one. > >Thanks, > >Stephen > > -BEGIN PGP SIGNATURE- > Version: GnuPG v1.4.9 (GNU/Linux) > > iEYEARECAAYFAkv9Sd8ACgkQrzgMPqB3kihj/gCdEIA8DhnvZX4Hz3tof6yzLscS > Lf8An2Xp8R/KXnkmp8uWg+84Cz7Pp7R3 > =AX4g > -END PGP SIGNATURE- > > -- Eliot Gable "We do not inherit the Earth from our ancestors: we borrow it from our children." ~David Brower "I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime." ~David Brower "Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero
Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL
* Eliot Gable (egable+pgsql-performa...@gmail.com) wrote: > Would a query such as this obtain any performance improvement by being > re-written using C? I wouldn't expect the queries called by the pl/pgsql function to be much faster if called through SPI from C instead. I think the question you need to answer is- how long does the pl/pgsql code take vs. the overall time the function takes as a whole? You could then consider that your 'max benefit' (or pretty close to it) which could be gained by rewriting it in C. > Are there specific cases where writing a function in C would be highly > desirable verses using PL/PGSQL (aside from simply gaining access to > functionality not present in PL/PGSQL)? Cases where a function is called over and over again, or there are loops which go through tons of data, or there's alot of data processing to be done. > Are there specific cases where writing a function in C would be slower than > writing the equivalent in PL/PGSQL? Probably not- provided the C code is written correctly. You can certainly screw that up (eg: not preparing a query in C and having PG replan it every time would probably chew up any advantage C has over pl/pgsql, in a simple function). > Basically, I am looking for some guidelines based primarily on performance > of when I should use C to write a function verses using PL/PGSQL. Realize that C functions have alot of other issues associated with them- typically they're much larger foot-guns, for one, for another, C is an untrusted language because it can do all kinds of bad things. So you have to be a superuser to create them. > Can anybody quantify any of the performance differences between doing a > particular task in C verses doing the same thing in PL/PGSQL? For example, > performing a SELECT query or executing a certain number of lines of control > logic (primarily IF/THEN, but an occasional loop included)? How about > assignments or basic math like > addition/subtraction/multiplication/division? Actually performing a SELECT through SPI vs. calling it from pl/pgsql probably won't result in that much difference, presuming most of the time there is in the actual query itself. Assignments, basic math, control logic, etc, will all be faster in C. You need to figure out if that work is taking enough time to justify the switch though. > When executing SQL queries inside a C-based function, is there any way to > have all of the SQL queries pre-planned through the compilation process, > definition of the function, and loading of the .so file similar to PL/PGSQL? You might be able to do that when the module is loaded, but I'm not 100% sure.. Depends on if you can start using SPI in _PG_init.. I think there was some discussion about that recently but I'm not sure what the answer was. > Would I get better performance writing each SQL query as a stored procedure > and then call these stored procedures from within a C-based function which > does the logging, math, control logic, and builds the result sets and > cursors? Uhh, I'd guess 'no' to that one. Thanks, Stephen signature.asc Description: Digital signature