Re: [PERFORM] How long should it take to insert 200,000 records?
Unless you specify otherwise, every insert carries its own transaction begin/commit. That's a lot of overhead for a single insert, no? Why not use a single transaction for, say, each 1000 inserts? That would strike a nice balance of safety and efficiency.

Pseudocode for the insert:

Begin Transaction;
FOR i in 1..200000 LOOP
    INSERT INTO viewfoo (x) VALUES (i);
    IF i % 1000 = 0 THEN
        Commit Transaction;
        Begin Transaction;
    END IF;
END LOOP;
Commit Transaction;
End

This approach should speed things up dramatically.

>>> "Karen Hill" <[EMAIL PROTECTED]> 2/6/2007 2:39 PM >>>
On Feb 5, 9:33 pm, [EMAIL PROTECTED] (Tom Lane) wrote:
> "Karen Hill" <[EMAIL PROTECTED]> writes:
> > I have a pl/pgsql function that is inserting 200,000 records for
> > testing purposes. What is the expected time frame for this operation
> > on a pc with 1/2 a gig of ram and a 7200 RPM disk?
>
> I think you have omitted a bunch of relevant facts. Bare INSERT is
> reasonably quick:
>
> regression=# create table foo (f1 int);
> CREATE TABLE
> regression=# \timing
> Timing is on.
> regression=# insert into foo select x from generate_series(1,200000) x;
> INSERT 0 200000
> Time: 5158.564 ms
> regression=#
>
> (this on a not-very-fast machine) but if you weigh it down with a ton
> of index updates, foreign key checks, etc, it could get slow ...
> also you haven't mentioned what else that plpgsql function is doing.

The postgres version is 8.2.1 on Windows. The pl/pgsql function is inserting to an updatable view (basically two tables).

CREATE TABLE foo1 ( );

CREATE TABLE foo2 ( );

CREATE VIEW viewfoo AS ( );

CREATE RULE ruleFoo AS ON INSERT TO viewfoo DO INSTEAD ( );

CREATE OR REPLACE FUNCTION functionFoo() RETURNS VOID AS $$
BEGIN
    FOR i IN 1..200000 LOOP
        INSERT INTO viewfoo (x) VALUES (x);
    END LOOP;
END;
$$ LANGUAGE plpgsql;

---(end of broadcast)---
TIP 6: explain analyze is your friend
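The effect of batching commits is easy to demonstrate outside PostgreSQL, too. Below is an illustrative sketch using Python's built-in sqlite3 module (not the poster's setup; the table name, row count, and batch size here are made up for the demo): committing once per batch instead of once per row removes most of the per-transaction overhead, and with an on-disk database the difference is dramatic because each commit forces a sync.

```python
import sqlite3
import time

def timed_insert(conn, n, batch_size):
    """Insert n rows, committing every batch_size rows."""
    cur = conn.cursor()
    start = time.perf_counter()
    for i in range(1, n + 1):
        cur.execute("INSERT INTO foo (x) VALUES (?)", (i,))
        if i % batch_size == 0:
            conn.commit()
    conn.commit()  # commit any trailing partial batch
    return time.perf_counter() - start

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (x INTEGER)")

per_row = timed_insert(conn, 20000, batch_size=1)     # one commit per insert
batched = timed_insert(conn, 20000, batch_size=1000)  # one commit per 1000 rows

print(f"per-row commits: {per_row:.3f}s, batched: {batched:.3f}s")
```

The same shape applies to the PL/pgSQL loop above: the fewer the commits, the less begin/commit bookkeeping (and, on real disks, fewer forced flushes) per inserted row.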
Re: [PERFORM] Query performance
I'll post there concerning how they determine the query execution time vs. data retrieval time.

I did think about the processor/memory when choosing the machines -- all three of the processors are similar. All are Pentium P4s with 512 MB memory.

The server is Win2K, P4, 2.3 GHz.
The local network client is WinXP Pro, P4, 2.2 GHz.
The remote network client is WinXP Pro, P4, 1.9 GHz.

Lou

>>> Tom Lane <[EMAIL PROTECTED]> 3/11/2005 1:21 PM >>>
"Lou O'Quin" <[EMAIL PROTECTED]> writes:
> Hi Tom. I referenced the status line of pgAdmin. Per the pgAdmin help
> file:
>
> "The status line will show how long the last query took to complete. If a
> dataset was returned, not only the elapsed time for server execution is
> displayed, but also the time to retrieve the data from the server to the
> Data Output page."

Well, you should probably ask the pgadmin boys exactly what they are
measuring. In any case, the Postgres server overlaps query execution
with result sending, so I don't think it's possible to get a pure
measurement of just one of those costs --- certainly not by looking at
it only from the client end.

BTW, one factor to consider is that if the test client machines weren't
all the same speed, that would have some impact on their ability to
absorb 15K records ...

                        regards, tom lane
Re: [PERFORM] Query performance
Hi Tom. I referenced the status line of pgAdmin. Per the pgAdmin help file:

"The status line will show how long the last query took to complete. If a dataset was returned, not only the elapsed time for server execution is displayed, but also the time to retrieve the data from the server to the Data Output page."

Lou

>>> Tom Lane <[EMAIL PROTECTED]> 3/11/2005 12:10 PM >>>
"Lou O'Quin" <[EMAIL PROTECTED]> writes:
> it appears to actually be hypersensitive to the transport delay. The
> ratios of time for the data transport (assuming 1 for the local server)
> are:
> 1 : 2.43 : 7.71
> whereas the query execution time ratios are:
> 1 : 2.08 : 25.5 (!!!)

How do you know that's what the data transport time is --- ie, how can
you measure that separately from the total query time?

                        regards, tom lane
[PERFORM] Query performance
As a test, I ran a query in the pgAdmin query tool, which returns about 15K records from a PostgreSQL v8.0.1 table on my Win2K server.

I ran the same query from the local server, from another PC on the same 100 Mbit local network, and from a PC on a different network, over the internet. The times for the query to run and the data to return for each of the three locations are shown here:

Local server:  571 + 521 ms
Local network: 1187 + 1266 ms
Internet:      14579 + 4016 ms

My question is this: why does the execution time for the query increase so much? Since the query should be running on the server, its time should be somewhat independent of the network transport delay (unlike the data transport time). However, it appears to actually be hypersensitive to the transport delay. The ratios of time for the data transport (assuming 1 for the local server) are:

1 : 2.43 : 7.71

whereas the query execution time ratios are:

1 : 2.08 : 25.5 (!!!)

Obviously, the transport times will be greater. But why does the execution time bloat so?
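For reference, the ratios quoted above are just each location's reported time divided by the local-server baseline. A quick arithmetic check (illustrative only; the timings are the ones reported in this post):

```python
# Execution-time and data-transport figures from the three test locations (ms).
exec_ms = {"local server": 571, "local network": 1187, "internet": 14579}
xfer_ms = {"local server": 521, "local network": 1266, "internet": 4016}

def ratios(times, baseline="local server"):
    """Express each timing as a multiple of the baseline location's timing."""
    base = times[baseline]
    return {loc: round(t / base, 2) for loc, t in times.items()}

print(ratios(exec_ms))  # execution ratios: 1 : 2.08 : 25.53
print(ratios(xfer_ms))  # transport ratios: 1 : 2.43 : 7.71
```

Note these ratios assume pgAdmin's status line cleanly separates server execution from data retrieval, which is exactly the assumption questioned later in the thread.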