[SQL] update on a large table

2002-08-10 Thread Aaron Held

I need to do an UPDATE on a large (100 million record) table.  Is there
any way to speed up the process (like turning off the transaction log)?

So far postgres has been handling the large database exceptionally well
(large \copy imports and WHERE clauses w/ multiple params) but it is
killing me on UPDATES.  It takes about 4 hours to run an UPDATE (the WHERE
clause is against an INDEX) but about 50 sec for a similar SELECT.

Thank You,
-Aaron Held






Re: [SQL] update on a large table

2002-08-13 Thread Aaron Held

Thanks Doug,

I was doing an UPDATE on 100 million rows, and I was updating an indexed column; it
was also the column that I was basing my search on.

UPDATE "Calls" SET "GroupCode"='100 my street' WHERE "GroupCode"='' AND "Site"='05'

GroupCode was indexed.  I dropped the index and the query ran in under one hour.

Now I have been running VACUUM ANALYZE for three days...
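
Roughly, the whole sequence looked like this (the index name and definition here are
placeholders, not the real ones):

   DROP INDEX "Calls_GroupCode_idx";    -- placeholder name for the real index
   UPDATE "Calls" SET "GroupCode"='100 my street'
       WHERE "GroupCode"='' AND "Site"='05';
   CREATE INDEX "Calls_GroupCode_idx" ON "Calls" ("GroupCode");
   VACUUM ANALYZE "Calls";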

Thanks,
-Aaron Held





[SQL] bulk imports with sequence

2002-08-20 Thread Aaron Held

I am importing a large number of records monthly using the \copy (from a text
file) command.

I would like to use a sequence as a unique row ID for display in my app.

Is there any way to let PostgreSQL generate the sequence values itself?  Currently
the only way I can make it work is to grab the next sequence value and insert my own
numbers into the file.

Thank You,
-Aaron Held
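
One direction worth sketching here (the table, column, and sequence names below are
made up purely for illustration): give the id column a DEFAULT taken from a sequence
and load through a staging table, so the numbers never have to be written into the
text file.

   CREATE SEQUENCE calls_id_seq;

   -- target table: id is filled in from the sequence automatically
   CREATE TABLE "CallsWithId" (
       id    integer DEFAULT nextval('calls_id_seq') NOT NULL,
       colA  text,
       colB  text
   );

   -- staging table matching the layout of the monthly text file
   CREATE TABLE calls_staging (colA text, colB text);
   -- \copy calls_staging FROM 'monthly.txt'

   -- id is omitted from the column list, so the DEFAULT supplies it
   INSERT INTO "CallsWithId" (colA, colB)
       SELECT colA, colB FROM calls_staging;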





[SQL] Performance w/ multiple WHERE clauses

2002-09-18 Thread Aaron Held

I am running into a serious performance issue with some basic queries.

If I run something like

   select * from "Calls" WHERE
( ("CallType" = 'LONG DIST' ))

The search takes about 15 seconds

if I run
select * from "Calls" WHERE
( ( "DateOfCall"='06/19/02') )
AND ( ( "CallType" = 'LONG DIST' ))
   [DateOfCall is a DateTime field]

it takes 15 minutes!

Both columns are indexed individually, and EXPLAIN reports an Index Scan for
both queries.

Any pointers where to look next?

Running PostgreSQL 7.2 on Red Hat Linux.

Thanks,
-Aaron Held





Re: [SQL] Performance w/ multiple WHERE clauses

2002-09-20 Thread Aaron Held


Thanks,

Changing '06/19/02' to '06/19/02'::date gave me a subjective 50% speedup.
I ran a bunch of queries with EXPLAIN and noticed that some
combinations did not use the indexes and went right to a seq scan.  All of
the WHERE clause args are indexed.

# SET enable_seqscan to FALSE ;
forced the use of an Index and sped things up greatly.

I am not sure why it made the switch.  The load on the server seems to 
affect the performance, but I am seeing it more on the production server 
with 100 million rows as opposed to the development server with only 
about 6 million.  I need to buy more drives and develop on a larger data 
set.
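
For reference, the comparisons described above look roughly like this (the literal
values are just the examples from earlier in this thread):

   EXPLAIN SELECT * FROM "Calls"
       WHERE "DateOfCall" = '2002-06-19'::date
         AND "CallType" = 'LONG DIST';

   SET enable_seqscan TO false;   -- steer the planner away from seq scans
   EXPLAIN SELECT * FROM "Calls"
       WHERE "DateOfCall" = '2002-06-19'::date
         AND "CallType" = 'LONG DIST';
   SET enable_seqscan TO true;    -- restore the default afterwards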

Thanks for the help,
-Aaron Held

Chris Ruprecht wrote:
> Aaron,
> 
> On Wed September 18 2002 17:17, Aaron Held wrote:
> 
>>I am running into a serious performance issue with some basic queries.
>>
>>If I run something like
>>
>>   select * from "Calls" WHERE
>>( ("CallType" = 'LONG DIST' ))
>>
>>The search takes about 15 seconds
>>
>>if I run
>>select * from "Calls" WHERE
>>( ( "DateOfCall"='06/19/02') )
>>AND ( ( "CallType" = 'LONG DIST' ))
>>   [DateOfCall is a DateTime field]
> 
> 
> try ... "DateOfCall" = '2002-06-19'::date ...
> 
> Best regards,
> Chris






[SQL] Monitoring a Query

2002-09-20 Thread Aaron Held

Is there any way to monitor a long running query?

I have stats turned on and I can see my queries, but is there any better 
measure of the progress?

Thanks,
-Aaron Held

select current_query from pg_stat_activity;
                              current_query
-------------------------------------------------------------------------
 <IDLE> in transaction
 FETCH ALL FROM PgSQL_470AEE94
 <IDLE> in transaction
 select * from "Calls" WHERE "DurationOfCall" = 2.5 AND "DateOfCall" =
 '7/01/02' AND ("GroupCode" = 'MIAMI' OR "GroupCode" = 'Salt Lake');








Re: [SQL] [GENERAL] Monitoring a Query

2002-09-20 Thread Aaron Held

There are some good views and functions you can use to get at the SQL
query being executed.

Try turning on the stats collector and running:

   select * from pg_stat_activity;

(See http://www.postgresql.org/idocs/index.php?monitoring-stats.html )
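
If they are not already on, the postgresql.conf settings involved look roughly like
this (7.2-era names; worth double-checking against the docs for your version):

   # postgresql.conf
   stats_start_collector = true    # start the statistics collector process
   stats_command_string  = true    # needed for current_query in pg_stat_activity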

You can also see the procID.  From Python I can use this info to get a lot of
details about the running query, such as CPU and memory use.

But I can't tell how far along it actually is.

-Aaron

[EMAIL PROTECTED] wrote:
> I just downloaded and installed pgmonitor on my dev. machine after seeing
> your post, and it looks nifty. Only problem is I really want to avoid
> running X on the database server to conserve the RAM it uses, and this
> appears to require X. Any terminal applications to monitor database
> activity, perhaps loosely analogous to mtop for MySQL?
> (http://mtop.sf.net/)
> 
> Wes Sheldahl
> 
> 
> 
> 
> Bruce Momjian <[EMAIL PROTECTED]>@postgresql.org on 09/20/2002 12:18:06 PM
> 
> Sent by: [EMAIL PROTECTED]
> 
> To: Aaron Held <[EMAIL PROTECTED]>
> cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: Re: [GENERAL] Monitoring a Query
> 
> 
> 
> There is pgmonitor:
> 
>  http://gborg.postgresql.org/project/pgmonitor
> 
> ---
> 
> Aaron Held wrote:
> 
>>Is there any way to monitor a long running query?
>>
>>I have stats turned on and I can see my queries, but is there any better
>>measure of the progress?
>>
>>Thanks,
>>-Aaron Held
>>
>>select current_query from pg_stat_activity;
>>                              current_query
>>-------------------------------------------------------------------------
>> <IDLE> in transaction
>> FETCH ALL FROM PgSQL_470AEE94
>> <IDLE> in transaction
>> select * from "Calls" WHERE "DurationOfCall" = 2.5 AND "DateOfCall" =
>> '7/01/02' AND ("GroupCode" = 'MIAMI' OR "GroupCode" = 'Salt Lake');
>>
>>
>>
>>
>>
>>
> 
> 
> --
>   Bruce Momjian                     |  http://candle.pha.pa.us
>   [EMAIL PROTECTED]                 |  (610) 359-1001
>   +  If your life is a hard drive,  |  13 Roberts Road
>   +  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073
> 
> 
> 
> 






Re: [SQL] [GENERAL] Monitoring a Query

2002-09-23 Thread Aaron Held

Bruce Momjian wrote:
> Neil Conway wrote:
> 
>>Bruce Momjian <[EMAIL PROTECTED]> writes:
>>
>>>Aaron Held wrote:
>>>
>>>>Is there any way to monitor a long running query?
>>>
>>>Oh, sorry, you want to know how far the query has progressed.  Gee, I
>>>don't think there is any easy way to do that.
>>
>>Would it be a good idea to add the time that the current query began
>>execution at to pg_stat_activity?
> 
> 
> What do people think about this?  It seems like a good idea to me.
> 

My application marks the start time of each query and I have found it 
very useful.  The users like to see how long each query took, and the 
admin can take a quick look and see how many queries are running and how 
long each has been active for.  Good for debugging and billing.

-Aaron Held





Re: [SQL] Performance w/ multiple WHERE clauses

2002-09-23 Thread Aaron Held

I am running pg 7.2, the PG reference build.

Thanks for the ANALYZE tip; it led me to an answer.

This database gets a monthly update and is read-only until the next
update.  I ANALYZE once after each update.  Since the data does not
change, I should not need to ANALYZE again afterwards.

I mentioned this to the dbadmin who manages the data and found out that one
of the other users had UPDATEd some of the columns the morning that I was
seeing this behavior.

I'll re-ANALYZE and see what happens.
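
In other words, nothing more than something like the following, then checking whether
the row estimates in the EXPLAIN output look sane (the literal values are just the
examples from this thread):

   VACUUM ANALYZE "Calls";

   EXPLAIN SELECT * FROM "Calls"
       WHERE "DateOfCall" = '2002-06-19'::date
         AND "CallType" = 'LONG DIST';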

Thanks,
-Aaron Held


Josh Berkus wrote:
> Aaron,
> 
> 
>>  # SET enable_seqscan to FALSE ;
>>  forced the use of an Index and sped things up greatly.
>>
>>I am not sure why it made the switch.  The load on the server seems to 
>>affect the performance, but I am seeing it more on the production server 
>>with 100 million rows as opposed to the development server with only 
>>about 6 million.  I need to buy more drives and develop on a larger data 
>>set.
> 
> 
> What version are you using?
> 
> I'd have 3 suggestions:
> 1) ANALYZE, ANALYZE, ANALYZE.  Then check if the row estimates made by EXPLAIN 
> seem accurate.
> 2) Modify your postgresql.conf file to raise the cost of seq scans in the
> planner's estimates.
> 3) Test this all again when 7.3 comes out, as planner estimation improves all
> the time.
> 
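
Regarding suggestion 2), there is no literal "cost of seq scans" setting; in practice
the planner is nudged toward index scans by adjusting the cost parameters in
postgresql.conf, for example (illustrative values only, to be tuned per machine):

   # postgresql.conf -- illustrative values, not recommendations
   random_page_cost = 2            # default is 4; lower makes index scans look cheaper
   effective_cache_size = 20000    # in 8 kB disk pages; reflect the available OS cache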






Re: [SQL] [GENERAL] Monitoring a Query

2002-09-23 Thread Aaron Held

It looks like that just timestamps things in its connection pool, which
is what I do now.

What I would like is to know about queries that have not finished yet.

-Aaron

Roberto Mello wrote:
> On Sun, Sep 22, 2002 at 09:51:55PM -0400, Bruce Momjian wrote:
> 
>>>Would it be a good idea to add the time that the current query began
>>>execution at to pg_stat_activity?
>>
>>What do people think about this?  It seems like a good idea to me.
> 
> 
> OpenACS has a package called "Developer Support" that shows you (among
> other things) how long a query took to execute.  Very good for finding
> slow-running queries that need to be optimized.
> 
> -Roberto
> 


