[SQL] update on a large table
I need to do an UPDATE on a large (100 million record) table. Is there any way to speed up the process (like turning off the transaction log)? So far Postgres has been handling the large database exceptionally well (large \copy imports and WHERE clauses with multiple params), but it is killing me on UPDATEs. It takes about 4 hours to run an UPDATE (the WHERE clause is against an INDEX) but about 50 seconds for a similar SELECT.

Thank you,
-Aaron Held
Re: [SQL] update on a large table
Thanks Doug,

I was doing an UPDATE on 100 million rows, and I was updating an indexed column; it was also the column I was basing my search on:

    UPDATE "Calls" SET "GroupCode"='100 my street'
    WHERE "GroupCode"='' AND "Site"='05';

"GroupCode" was indexed. I dropped the index and the query ran in under one hour. Now I have been running VACUUM ANALYZE for three days.

Thanks,
-Aaron Held
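For reference, the drop-the-index approach described above looks roughly like the sketch below. The index name "Calls_GroupCode_idx" is an assumption for illustration; the actual index name is not given in the thread.

    -- Drop the index on the column being updated so Postgres does not
    -- have to maintain index entries for every modified row.
    DROP INDEX "Calls_GroupCode_idx";

    -- Run the bulk UPDATE; only the table heap is written now.
    UPDATE "Calls"
       SET "GroupCode" = '100 my street'
     WHERE "GroupCode" = '' AND "Site" = '05';

    -- Recreate the index once the data is in its final state.
    CREATE INDEX "Calls_GroupCode_idx" ON "Calls" ("GroupCode");

    -- Reclaim dead row versions and refresh planner statistics.
    VACUUM ANALYZE "Calls";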
[SQL] bulk imports with sequence
I am importing a large number of records monthly using the \copy (from text file) command. I would like to use a sequence as a unique row ID for display in my app. Is there any way to let PostgreSQL generate the sequence itself? Currently the only way I can make it work is to grab the next sequence value and insert my own numbers into the file.

Thank you,
-Aaron Held
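One common way to handle this (a sketch, not what the poster ended up using) is to give the table a SERIAL column and then name only the data columns in the \copy, so the column default fills in the ID. Note that a column list on COPY/\copy requires a PostgreSQL release that supports it; on older servers, loading into a staging table and then INSERT ... SELECT with nextval() achieves the same thing. The table, column, and file names below are made up for illustration.

    -- "id" gets its value from an implicit sequence via SERIAL.
    CREATE TABLE import_calls (
        id       SERIAL PRIMARY KEY,
        caller   TEXT,
        callee   TEXT,
        duration INTERVAL
    );

    -- List only the columns present in the text file; the id column
    -- is omitted, so its sequence-backed DEFAULT is applied per row.
    \copy import_calls (caller, callee, duration) from 'calls_2002_09.txt'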
[SQL] Performance w/ multiple WHERE clauses
I am running into a serious performance issue with some basic queries.

If I run something like

    select * from "Calls" WHERE ( ("CallType" = 'LONG DIST' ))

the search takes about 15 seconds. If I run

    select * from "Calls" WHERE ( ( "DateOfCall"='06/19/02') )
    AND ( ( "CallType" = 'LONG DIST' ))

[DateOfCall is a DateTime field] it takes 15 minutes! Both columns are indexed individually, and EXPLAIN reports an Index Scan for both queries. Any pointers on where to look next?

Running pg 7.2 on RH.

Thanks,
-Aaron Held
Re: [SQL] Performance w/ multiple WHERE clauses
Thanks,

Changing '06/19/02' to '06/19/02'::date gave me a subjective 50% speedup.

I ran a bunch of queries with EXPLAIN and noticed that some combinations did not use the indexes and went right to a seq scan. All of the WHERE clause args are indexed.

    # SET enable_seqscan TO FALSE;

forced the use of an index and sped things up greatly.

I am not sure why it made the switch. The load on the server seems to affect the performance, but I am seeing it more on the production server with 100 million rows as opposed to the development server with only about 6 million. I need to buy more drives and develop on a larger data set.

Thanks for the help,
-Aaron Held

Chris Ruprecht wrote:
> Aaron,
>
> On Wed September 18 2002 17:17, Aaron Held wrote:
>
>> I am running into a serious performance issue with some basic queries.
>>
>> If I run something like
>>
>>   select * from "Calls" WHERE ( ("CallType" = 'LONG DIST' ))
>>
>> The search takes about 15 seconds
>>
>> if I run
>>   select * from "Calls" WHERE ( ( "DateOfCall"='06/19/02') )
>>   AND ( ( "CallType" = 'LONG DIST' ))
>>   [DateOfCall is a DateTime field]
>
> try ... "DateOfCall" = '2002-06-19'::date ...
>
> Best regards,
> Chris
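A minimal sketch of the two diagnostics mentioned above: comparing the plan with an explicit date cast, and temporarily disabling sequential scans for the session to see whether the planner's choice is the problem. Table and column names are taken from the queries earlier in the thread; timings will of course differ.

    -- Check the plan with the explicit cast in place.
    EXPLAIN SELECT * FROM "Calls"
     WHERE "DateOfCall" = '2002-06-19'::date
       AND "CallType" = 'LONG DIST';

    -- Force index usage for this session only, then re-check the plan.
    SET enable_seqscan TO FALSE;
    EXPLAIN SELECT * FROM "Calls"
     WHERE "DateOfCall" = '2002-06-19'::date
       AND "CallType" = 'LONG DIST';
    SET enable_seqscan TO TRUE;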
[SQL] Monitoring a Query
Is there any way to monitor a long running query?

I have stats turned on and I can see my queries, but is there any better measure of the progress?

Thanks,
-Aaron Held

    select current_query from pg_stat_activity;

    current_query
    ----------------------------------------------------------------------
     <IDLE> in transaction
     FETCH ALL FROM PgSQL_470AEE94
     <IDLE> in transaction
     select * from "Calls" WHERE "DurationOfCall" = 2.5 AND "DateOfCall" =
       '7/01/02' AND ("GroupCode" = 'MIAMI' OR "GroupCode" = 'Salt Lake');
Re: [SQL] [GENERAL] Monitoring a Query
There are some good views and functions you can use to get at the SQL query being executed. Try turning on the stats collector and running

    select * from pg_stat_activity;

(See http://www.postgresql.org/idocs/index.php?monitoring-stats.html )

You can also see the proc ID. From Python I can use this info to get a lot of details about the running query, CPU and memory use. But I can't tell how far along it actually is.

-Aaron

Wes Sheldahl wrote:
> I just downloaded and installed pgmonitor on my dev. machine after seeing
> your post, and it looks nifty. Only problem is I really want to avoid
> running X on the database server to conserve the RAM it uses, and this
> appears to require X. Any terminal applications to monitor database
> activity, perhaps loosely analogous to mtop for MySQL?
> (http://mtop.sf.net/)
>
> Wes Sheldahl
>
> Bruce Momjian wrote:
>> There is pgmonitor:
>>   http://gborg.postgresql.org/project/pgmonitor
>>
>> Aaron Held wrote:
>>> Is there any way to monitor a long running query?
>>>
>>> I have stats turned on and I can see my queries, but is there any
>>> better measure of the progress?
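A sketch of the setup described above. The exact configuration and column names vary by PostgreSQL release (older versions use stats_command_string and procpid; newer ones use track_activities and pid), so treat the names below as assumptions for a 7.2-era server.

    -- postgresql.conf (restart or reload after changing):
    --   stats_start_collector = true
    --   stats_command_string  = true

    -- Show each backend's process ID alongside the query it is running,
    -- so CPU and memory usage can be looked up at the OS level.
    SELECT procpid, usename, current_query
      FROM pg_stat_activity;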
Re: [SQL] [GENERAL] Monitoring a Query
Bruce Momjian wrote:
> Neil Conway wrote:
>> Bruce Momjian writes:
>>> Aaron Held wrote:
>>>> Is there any way to monitor a long running query?
>>>
>>> Oh, sorry, you want to know how far the query has progressed. Gee, I
>>> don't think there is any easy way to do that.
>>
>> Would it be a good idea to add the time that the current query began
>> execution at to pg_stat_activity?
>
> What do people think about this? It seems like a good idea to me.

My application marks the start time of each query, and I have found it very useful. The users like to see how long each query took, and the admin can take a quick look and see how many queries are running and how long each has been active for. Good for debugging and billing.

-Aaron Held
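A minimal sketch of the kind of application-side bookkeeping described above. The table and column names are invented for illustration; this is not part of pg_stat_activity.

    -- One row per query issued by the application.
    CREATE TABLE query_log (
        id          SERIAL PRIMARY KEY,
        username    TEXT,
        query_text  TEXT,
        started_at  TIMESTAMP DEFAULT now(),
        finished_at TIMESTAMP            -- filled in when the query returns
    );

    -- Queries still running (no finish time yet) and how long each has
    -- been active.
    SELECT username, query_text, now() - started_at AS running_for
      FROM query_log
     WHERE finished_at IS NULL;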
Re: [SQL] Performance w/ multiple WHERE clauses
I am running pg 7.2, the PG reference build.

Thanks for the ANALYZE tip, it led me to an answer. This database gets a monthly update and is read-only until the next update. I ANALYZE once after each update. Since the data does not change, I should not need to ANALYZE again afterwards.

I mentioned this to the dbadmin that manages the data and found out one of the other users UPDATED some of the columns the morning that I was seeing this behavior. I'll re-ANALYZE and see what happens.

Thanks,
-Aaron Held

Josh Berkus wrote:
> Aaron,
>
>>   # SET enable_seqscan to FALSE;
>> forced the use of an Index and sped things up greatly.
>>
>> I am not sure why it made the switch. The load on the server seems to
>> affect the performance, but I am seeing it more on the production server
>> with 100 million rows as opposed to the development server with only
>> about 6 million. I need to buy more drives and develop on a larger data
>> set.
>
> What version are you using?
>
> I'd have 3 suggestions:
> 1) ANALYZE, ANALYZE, ANALYZE. Then check if the row estimates made by
>    EXPLAIN seem accurate.
> 2) Modify your postgresql.conf file to raise the cost of seq_scans for
>    planner estimates.
> 3) Test this all again when 7.3 comes out, as planner estimates improve
>    all the time.
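A short sketch of the re-analyze step and of checking estimate accuracy as suggested above (EXPLAIN ANALYZE, which reports actual row counts alongside the planner's estimates, is available in 7.2 and later):

    -- Refresh planner statistics after the monthly load, and again
    -- after any out-of-band UPDATEs to the table.
    ANALYZE "Calls";

    -- Compare the planner's estimated row counts against reality.
    EXPLAIN ANALYZE
    SELECT * FROM "Calls"
     WHERE "DateOfCall" = '2002-06-19'::date
       AND "CallType" = 'LONG DIST';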
Re: [SQL] [GENERAL] Monitoring a Query
It looks like that just timestamps things in its connection pool, which is what I do now. What I would like is to know about queries that have not finished yet.

-Aaron

Roberto Mello wrote:
> On Sun, Sep 22, 2002 at 09:51:55PM -0400, Bruce Momjian wrote:
>
>>> Would it be a good idea to add the time that the current query began
>>> execution at to pg_stat_activity?
>>
>> What do people think about this? It seems like a good idea to me.
>
> OpenACS has a package called "Developer Support" that shows you (among
> other things) how long a query took to be executed. Very good for finding
> out slow-running queries that need to be optimized.
>
> -Roberto