Re: [PERFORM] Setting nice values
Tobias Brox wrote: [Madison Kelly - Thu at 10:25:07AM -0500] Will the priority of the script pass down to the pgsql queries it calls? I figured (likely incorrectly) that because the queries were executed by the psql server the queries ran with the server's priority. I think you are right, and in any case, I don't think the niceness value won't help much if the bottleneck is iowait. In our application, I've made a special function for doing low-priority transactions which I believe is quite smart - though maybe not always. Before introducing this logic, we observed we had a tipping point, too many queries, and the database wouldn't swallow them fast enough, and the database server just jammed up, trying to work at too many queries at once, yielding the results far too slow. In the config file, I now have those two flags set: stats_start_collector = on stats_command_string = on This will unfortunately cause some CPU-load, but the benefit is great - one can actually check what the server is working with at any time: select * from pg_stat_activity with those, it is possible to check a special view pg_stat_activity - it will contain all the queries the database is working on right now. My idea is to peek into this table - if there is no active queries, the database is idle, and it's safe to start our low-priority transaction. If this view is full of stuff, one should certainly not run any low-priority transactions, rather sleep a bit and try again later. select count(*) from pg_stat_activity where not current_query like 'IDLE%' and query_start+?now() The algorithm takes four parameters, the time value to put in above, the maximum number of queries allowed to run, the sleep time between each attempt, and the amount of attempts to try before giving up. So here are the cons and drawbacks: con: Given small queries and small transactions, one can tune this in such a way that the low priority queries (almost) never causes significant delay for the higher priority queries. con: can be used to block users of an interactive query application to cause disturbances on the production database. con: can be used for pausing low-priority batch jobs to execute only when the server is idle. drawback: unsuitable for long-running queries and transactions drawback: with fixed values in the parameters above, one risks that the queries never gets run if the server is sufficiently stressed. drawback: the stats collection requires some CPU drawback: the select * from pg_stats_activity query requires some CPU drawback: the pg_stats_activity-view is constant within the transaction, so one has to roll back if there is activity (this is however not a really bad thing, because one certainly shouldn't live an idle transaction around if the database is stressed). I can see how this would be very useful (and may make use of it later!). For the current job at hand though, at full tilt it can take a few hours to run, which puts it into your drawback section. The server in question is also almost under load of some sort, too. A great tip and one I am sure to make use of later, thanks! Madi ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Setting nice values
Scott Marlowe wrote: nope, the priorities don't pass down. you connect via a client lib to the server, which spawns a backend process that does the work for you. The backend process inherits its priority from the postmaster that spawns it, and they all run at the same priority. Shoot, but figured. :) Thanks for the tip, too, it's something I will try. Sometimes it's the simple solutions that work best. :) Welcome to the world of pgsql, btw... Heh, if only I was new to pgsql I wouldn't feel silly for asking so many questions :P. In the same right though, I enjoy PgSQL/Linux/FOSS in general *because* there seems to never be a shortage of things to learn. Thanks! Madi ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Setting nice values
Andreas Kostyrka wrote: Am Donnerstag, den 02.11.2006, 09:41 -0600 schrieb Scott Marlowe: Sometimes it's the simple solutions that work best. :) Welcome to the world of pgsql, btw... OTOH, there are also non-simple solutions to this, which might make sense anyway: Install slony, and run your queries against a readonly replica of your data. Bingo! This seems like exactly what we can/should do, and it will likely help with other jobs we run, too. I feel a little silly for not having thought of this myself... Guess I was too focused on niceness :). Thanks! Madi ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Setting nice values
[Madison Kelly - Mon at 08:10:12AM -0500] to run, which puts it into your drawback section. The server in question is also almost under load of some sort, too. A great tip and one I am sure to make use of later, thanks! I must have been sleepy, listing up cons vs drawbacks ;-) Anyway, the central question is not the size of the job, but the size of the transactions within the job - if the job consists of many transactions, my test can be run before every transaction. Having transactions lasting for hours is a very bad thing to do, anyway. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Setting nice values
Tobias Brox wrote: [Madison Kelly - Mon at 08:10:12AM -0500] to run, which puts it into your drawback section. The server in question is also almost under load of some sort, too. A great tip and one I am sure to make use of later, thanks! I must have been sleepy, listing up cons vs drawbacks ;-) :) I noticed but figured what you meant (I certainly do similar flubs!). Anyway, the central question is not the size of the job, but the size of the transactions within the job - if the job consists of many transactions, my test can be run before every transaction. Having transactions lasting for hours is a very bad thing to do, anyway. Ah, sorry, long single queries is what you meant. I have inherited this code so I am not sure how long a given query takes, though they do use a lot of joins and such, so I suspect it isn't quick; indexes aside. When I get some time (and get the backup server running) I plan to play with this. Currently the DB is on a production server so I am hesitant to poke around just now. Once I get the backup server though, I will play with your suggestions. I am quite curious to see how it will work out. Thanks again! Madi ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Setting nice values
[Madison Kelly - Mon at 08:48:19AM -0500] Ah, sorry, long single queries is what you meant. No - long running single transactions :-) If it's only read-only queries, one will probably benefit by having one transaction for every query. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Setting nice values
Tobias Brox wrote: [Madison Kelly - Mon at 08:48:19AM -0500] Ah, sorry, long single queries is what you meant. No - long running single transactions :-) If it's only read-only queries, one will probably benefit by having one transaction for every query. In this case, what happens is one kinda ugly big transaction is read into a hash, and then looped through (usually ~10,000 rows). On each loop another, slightly less ugly query is performed based on the first query's values now in the hash (these queries being where throttling might help). Then after the second query is parsed a PDF file is created (also a big source of slowness). It isn't entirely read-only though because as the PDFs are created a flag is updated in the given record's row. So yeah, need to experiment some. :) Madi ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Setting nice values
On Nov 2, 2006, at 9:14 AM, Madison Kelly wrote: I've got a script (perl, in case it matters) that I need to run once a month to prepare statements. This script queries and updates the database a *lot*. I am not concerned with the performance of the SQL calls so much as I am about the impact it has on the server's load. Is there a way to limit queries speed (ie: set a low 'nice' value on a query)? This might be an odd question, or I could be asking the question the wrong way, but hopefully you the idea. :) The BizGres folks have been working on resource queuing, which will eventually do what you want. Take a look at the BizGres mailing list archives for more info. -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Setting nice values
Am Donnerstag, den 02.11.2006, 09:41 -0600 schrieb Scott Marlowe: Sometimes it's the simple solutions that work best. :) Welcome to the world of pgsql, btw... OTOH, there are also non-simple solutions to this, which might make sense anyway: Install slony, and run your queries against a readonly replica of your data. Andreas signature.asc Description: Dies ist ein digital signierter Nachrichtenteil
[PERFORM] Setting nice values
Hi all, I've got a script (perl, in case it matters) that I need to run once a month to prepare statements. This script queries and updates the database a *lot*. I am not concerned with the performance of the SQL calls so much as I am about the impact it has on the server's load. Is there a way to limit queries speed (ie: set a low 'nice' value on a query)? This might be an odd question, or I could be asking the question the wrong way, but hopefully you the idea. :) Thanks! Madi ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Setting nice values
On Thu, 2006-11-02 at 09:14, Madison Kelly wrote: Hi all, I've got a script (perl, in case it matters) that I need to run once a month to prepare statements. This script queries and updates the database a *lot*. I am not concerned with the performance of the SQL calls so much as I am about the impact it has on the server's load. Is there a way to limit queries speed (ie: set a low 'nice' value on a query)? This might be an odd question, or I could be asking the question the wrong way, but hopefully you the idea. :) While you can safely set the priority lower on the calling perl script, setting db backend priorities lower can result in problems caused by priority inversion Look up that phrase on the pgsql admin, perform, general, or hackers lists for an explanation, or go here: http://en.wikipedia.org/wiki/Priority_inversion I have a simple script that grabs raw data from an oracle db and shoves it into a postgresql database for reporting purposes. Every 100 rows I put into postgresql, I usleep 10 or so and the load caused by that script on both systems is minimal. You might try something like that. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Setting nice values
Scott Marlowe wrote: On Thu, 2006-11-02 at 09:14, Madison Kelly wrote: Hi all, I've got a script (perl, in case it matters) that I need to run once a month to prepare statements. This script queries and updates the database a *lot*. I am not concerned with the performance of the SQL calls so much as I am about the impact it has on the server's load. Is there a way to limit queries speed (ie: set a low 'nice' value on a query)? This might be an odd question, or I could be asking the question the wrong way, but hopefully you the idea. :) While you can safely set the priority lower on the calling perl script, setting db backend priorities lower can result in problems caused by priority inversion Look up that phrase on the pgsql admin, perform, general, or hackers lists for an explanation, or go here: http://en.wikipedia.org/wiki/Priority_inversion I have a simple script that grabs raw data from an oracle db and shoves it into a postgresql database for reporting purposes. Every 100 rows I put into postgresql, I usleep 10 or so and the load caused by that script on both systems is minimal. You might try something like that. Will the priority of the script pass down to the pgsql queries it calls? I figured (likely incorrectly) that because the queries were executed by the psql server the queries ran with the server's priority. If this isn't the case, then perfect. :) Thanks for the tip, too, it's something I will try. Madi ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Setting nice values
On Thu, 2006-11-02 at 09:25, Madison Kelly wrote: Scott Marlowe wrote: On Thu, 2006-11-02 at 09:14, Madison Kelly wrote: Hi all, I've got a script (perl, in case it matters) that I need to run once a month to prepare statements. This script queries and updates the database a *lot*. I am not concerned with the performance of the SQL calls so much as I am about the impact it has on the server's load. Is there a way to limit queries speed (ie: set a low 'nice' value on a query)? This might be an odd question, or I could be asking the question the wrong way, but hopefully you the idea. :) While you can safely set the priority lower on the calling perl script, setting db backend priorities lower can result in problems caused by priority inversion Look up that phrase on the pgsql admin, perform, general, or hackers lists for an explanation, or go here: http://en.wikipedia.org/wiki/Priority_inversion I have a simple script that grabs raw data from an oracle db and shoves it into a postgresql database for reporting purposes. Every 100 rows I put into postgresql, I usleep 10 or so and the load caused by that script on both systems is minimal. You might try something like that. Will the priority of the script pass down to the pgsql queries it calls? I figured (likely incorrectly) that because the queries were executed by the psql server the queries ran with the server's priority. If this isn't the case, then perfect. :) nope, the priorities don't pass down. you connect via a client lib to the server, which spawns a backend process that does the work for you. The backend process inherits its priority from the postmaster that spawns it, and they all run at the same priority. Thanks for the tip, too, it's something I will try. Sometimes it's the simple solutions that work best. :) Welcome to the world of pgsql, btw... ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Setting nice values
[Madison Kelly - Thu at 10:25:07AM -0500] Will the priority of the script pass down to the pgsql queries it calls? I figured (likely incorrectly) that because the queries were executed by the psql server the queries ran with the server's priority. I think you are right, and in any case, I don't think the niceness value won't help much if the bottleneck is iowait. In our application, I've made a special function for doing low-priority transactions which I believe is quite smart - though maybe not always. Before introducing this logic, we observed we had a tipping point, too many queries, and the database wouldn't swallow them fast enough, and the database server just jammed up, trying to work at too many queries at once, yielding the results far too slow. In the config file, I now have those two flags set: stats_start_collector = on stats_command_string = on This will unfortunately cause some CPU-load, but the benefit is great - one can actually check what the server is working with at any time: select * from pg_stat_activity with those, it is possible to check a special view pg_stat_activity - it will contain all the queries the database is working on right now. My idea is to peek into this table - if there is no active queries, the database is idle, and it's safe to start our low-priority transaction. If this view is full of stuff, one should certainly not run any low-priority transactions, rather sleep a bit and try again later. select count(*) from pg_stat_activity where not current_query like 'IDLE%' and query_start+?now() The algorithm takes four parameters, the time value to put in above, the maximum number of queries allowed to run, the sleep time between each attempt, and the amount of attempts to try before giving up. So here are the cons and drawbacks: con: Given small queries and small transactions, one can tune this in such a way that the low priority queries (almost) never causes significant delay for the higher priority queries. con: can be used to block users of an interactive query application to cause disturbances on the production database. con: can be used for pausing low-priority batch jobs to execute only when the server is idle. drawback: unsuitable for long-running queries and transactions drawback: with fixed values in the parameters above, one risks that the queries never gets run if the server is sufficiently stressed. drawback: the stats collection requires some CPU drawback: the select * from pg_stats_activity query requires some CPU drawback: the pg_stats_activity-view is constant within the transaction, so one has to roll back if there is activity (this is however not a really bad thing, because one certainly shouldn't live an idle transaction around if the database is stressed). ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster