Re: [PERFORM] How long should it take to insert 200,000 records?

2007-02-14 Thread Lou O'Quin
Unless you specify otherwise, every insert carries its own transaction
begin/commit.  That's a lot of overhead for a single insert, no?  Why
not use a single transaction for, say, each 1000 inserts?  That would
strike a nice balance of safety with efficiency.

pseudo code for the insert:

Begin Transaction;
FOR i IN 1..200000 LOOP
  INSERT INTO viewfoo (x) VALUES (i);
  IF i % 1000 = 0 THEN
    Commit Transaction;
    Begin Transaction;
  END IF;
END LOOP;
Commit Transaction;


This approach should speed things up dramatically.
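One caveat (a note added here, not from the thread): in PostgreSQL 8.x a pl/pgsql function always runs inside the caller's transaction and cannot COMMIT mid-loop, so batching like the above has to be driven from the client side. A minimal sketch in psql, assuming the poster's viewfoo view:

```sql
-- Hypothetical client-side batching; viewfoo/x are the names used above.
BEGIN;
INSERT INTO viewfoo (x) SELECT g FROM generate_series(1, 1000) g;
COMMIT;
BEGIN;
INSERT INTO viewfoo (x) SELECT g FROM generate_series(1001, 2000) g;
COMMIT;
-- ... and so on, 1000 rows per transaction.
```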


 
>>> "Karen Hill" <[EMAIL PROTECTED]> 2/6/2007 2:39 PM >>>
On Feb 5, 9:33 pm, [EMAIL PROTECTED] (Tom Lane) wrote:
> "Karen Hill" <[EMAIL PROTECTED]> writes:
> > I have a pl/pgsql function that is inserting 200,000 records for
> > testing purposes.  What is the expected time frame for this operation
> > on a pc with 1/2 a gig of ram and a 7200 RPM disk?
>
> I think you have omitted a bunch of relevant facts.  Bare INSERT is
> reasonably quick:
>
> regression=# create table foo (f1 int);
> CREATE TABLE
> regression=# \timing
> Timing is on.
> regression=# insert into foo select x from generate_series(1,200000) x;
> INSERT 0 200000
> Time: 5158.564 ms
> regression=#
>
> (this on a not-very-fast machine) but if you weigh it down with a ton
> of index updates, foreign key checks, etc, it could get slow ...
> also you haven't mentioned what else that plpgsql function is doing.
>
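Tom's point about index and foreign-key overhead suggests the usual bulk-load pattern (a sketch under assumed names, not from the thread): drop secondary indexes before the load and rebuild them once afterwards.

```sql
-- Hypothetical; foo_f1_idx is an assumed index name on foo(f1).
DROP INDEX foo_f1_idx;                 -- avoid per-row index maintenance
INSERT INTO foo SELECT x FROM generate_series(1, 200000) x;
CREATE INDEX foo_f1_idx ON foo (f1);   -- one bulk index build at the end
```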

The postgres version is 8.2.1 on Windows.  The pl/pgsql function is
inserting into an updatable view (basically two tables).

CREATE TABLE foo1
(


) ;

CREATE TABLE foo2
(

);

CREATE VIEW viewfoo AS
(

);
CREATE RULE ruleFoo AS ON INSERT TO viewfoo DO INSTEAD
(

);

CREATE OR REPLACE FUNCTION functionFoo() RETURNS VOID AS $$
BEGIN
FOR i IN 1..200000 LOOP
    INSERT INTO viewfoo (x) VALUES (i);
END LOOP;
END;
$$ LANGUAGE plpgsql;
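To put a number on the function call by itself (a sketch; functionFoo is the name defined above), psql's timing display can be used:

```sql
\timing
SELECT functionFoo();  -- psql reports the elapsed time after the call
```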



---(end of broadcast)---
TIP 6: explain analyze is your friend




Re: [PERFORM] Query performance

2005-03-11 Thread Lou O'Quin


I'll post there concerning how they determine the query execution time vs. data retrieval time.
 
I did think about the processor/memory when choosing the machines - all three of the processors are similar.  All are Pentium P4s with 512 MB memory.
the server is Win2K, P4, 2.3 GHz
the local network client is WinXP Pro, P4, 2.2 GHz
the remote network client is WinXP Pro, P4, 1.9 GHz
 
Lou
>>> Tom Lane <[EMAIL PROTECTED]> 3/11/2005 1:21 PM >>>
"Lou O'Quin" <[EMAIL PROTECTED]> writes:
> Hi Tom.  I referenced the status line of pgAdmin.  Per the pgAdmin help
> file:
>
> "The status line will show how long the last query took to complete. If a
> dataset was returned, not only the elapsed time for server execution is
> displayed, but also the time to retrieve the data from the server to the
> Data Output page."

Well, you should probably ask the pgadmin boys exactly what they are
measuring.  In any case, the Postgres server overlaps query execution
with result sending, so I don't think it's possible to get a pure
measurement of just one of those costs --- certainly not by looking at
it only from the client end.

BTW, one factor to consider is that if the test client machines weren't
all the same speed, that would have some impact on their ability to
absorb 15K records ...

            regards, tom lane


Re: [PERFORM] Query performance

2005-03-11 Thread Lou O';Quin


Hi Tom.  I referenced the status line of pgAdmin.  Per the pgAdmin help file:
 
"The status line will show how long the last query took to complete. If a dataset was returned, not only the elapsed time for server execution is displayed, but also the time to retrieve the data from the server to the Data Output page."
 
Lou

>>> Tom Lane <[EMAIL PROTECTED]> 3/11/2005 12:10 PM >>>
"Lou O'Quin" <[EMAIL PROTECTED]> writes:
> it appears to actually be hypersensitive to the transport delay. The
> ratios of time for the data transport (assuming 1 for the local server)
> are:
> 1 : 2.43 : 7.71
> whereas the query execution time ratios are:
> 1 : 2.08 : 25.5  (!!!)

How do you know that's what the data transport time is --- ie, how can
you measure that separately from the total query time?

            regards, tom lane


[PERFORM] Query performance

2005-03-11 Thread Lou O'Quin

As a test, I ran a query in the pgAdmin query tool, which returns about 15K records from a PostgreSQL v8.0.1 table on my Win2K server.

I ran the same query from the local server, from another PC on the same 100 Mbit local network, and from a PC on a different network, over the internet.  The times for the query to run and the data to return for each of the three locations are shown here:

Local server:  571 + 521 ms
Local network: 1187 + 1266 ms
Internet:      14579 + 4016 ms

My question is this: why does the execution time for the query increase so much?  Since the query should be running on the server, its time should be somewhat independent of the network transport delay (unlike the data transport time).  However, it appears to actually be hypersensitive to the transport delay.  The ratios of time for the data transport (assuming 1 for the local server) are:

1 : 2.43 : 7.71

whereas the query execution time ratios are:

1 : 2.08 : 25.5  (!!!)

Obviously, the transport times will be greater.  But why does the execution time bloat so?
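The quoted ratios follow directly from the timings above (a quick check computed here, not part of the original post; first figure is query time, second is data transport):

```sql
SELECT round(1187.0 / 571, 2)  AS query_local_net,  -- 2.08
       round(14579.0 / 571, 2) AS query_internet,   -- 25.53
       round(1266.0 / 521, 2)  AS xfer_local_net,   -- 2.43
       round(4016.0 / 521, 2)  AS xfer_internet;    -- 7.71
```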