Re: [GENERAL] Slowdown problem when writing 1.7million records
This is a follow-up to a problem report Stephen Livesey made back in
February, to the effect that successive insertions got slower and slower.
At the time we speculated that btree indexes had a problem with becoming
out-of-balance when fed steadily increasing data values.

I have now tried to reproduce that effect --- and failed, in both current
sources and 7.0.2. I did

    create table foo (f1 serial primary key);

and then ran a process that just did

    insert into foo default values;

over and over (each insert as a separate transaction). This will result
in inserting a continually increasing sequence of key values into the
pkey index. I ran this out to about 3.4 million records (twice the number
of records Stephen used) on both 7.0.2 and current sources. I do not see
any real slowdown in the insertion rate, and certainly not the drastic
slowdown that Stephen observed. He said:

> I am now writing 1.7million records to this file. The first 100,000
> records took 15mins. The next 100,000 records took 30mins. The last
> 100,000 records took 4hours. In total, it took 43 hours to write
> 1.7million records.

It took me about 140 minutes to write 3.4 million records, on a not
particularly fast machine; the insertion rate held pretty steady at
around 400 records/sec (fsync off). So I no longer think that the
problem was with the btree index. Other possibilities are:

* If Stephen had fsync on, and his kernel was using a particularly
  stupid fsync algorithm, the time might all have been going into kernel
  searches of its buffer lists --- the files being fsync'd would have
  been growing and growing, and some fsync routines take time
  proportional to the number of buffered blocks.

* If Stephen had any relevant foreign-key checks (though he said not)
  and was inserting all the records in one transaction, then the known
  O(N^2) behavior in 7.0.*'s handling of deferred triggers could have
  been the culprit. If so, this is fixed in 7.1.

* The problem might have been on the client side. I'm not sure whether
  JDBC on Windows might suffer from memory leaks or anything like that.

Anyway: Stephen, if you haven't lost interest, we need to take another
look at your problem and figure out where the issue really is.

Bruce, please remove the TODO item

* Be smarter about insertion of already-ordered data into btree index

It seems to have been a false alarm.

			regards, tom lane
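[For concreteness, here is a minimal JDBC sketch of the kind of test
harness described above -- not Tom's actual harness. Each INSERT runs as
its own transaction (autocommit on), and the cumulative insertion rate
is reported every 100,000 rows. The connection URL, user, and password
are placeholders, not taken from the original post.]

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class InsertRateTest {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details, not from the thread.
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/test", "user", "password");
            conn.setAutoCommit(true);   // one transaction per INSERT
            Statement stmt = conn.createStatement();
            stmt.executeUpdate("create table foo (f1 serial primary key)");

            long start = System.currentTimeMillis();
            for (int i = 1; i <= 3400000; i++) {
                stmt.executeUpdate("insert into foo default values");
                if (i % 100000 == 0) {
                    double secs = (System.currentTimeMillis() - start) / 1000.0;
                    System.out.printf("%d rows, %.0f rows/sec%n", i, i / secs);
                }
            }
            stmt.close();
            conn.close();
        }
    }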
RE: [GENERAL] Slowdown problem when writing 1.7million records
> No, it's not. Do you have any triggers or rules on this table that you
> haven't shown us? How about other tables referencing this one as
> foreign keys? (Probably not, if you're running an identical test on
> MySQL, but I just want to be sure that I'm not missing something.)

I have no triggers or rules. I have only created 1 table and there are
no foreign keys.

> How exactly are you writing the records?

First I read the data from a 'Powerflex' file and hold this in a record
set:

    pfxstmt = pfxconn.createStatement();
    pfxrs = pfxstmt.executeQuery("SELECT * from expafh");

I then perform a loop which writes the data to my 'Postgresql' file as
follows:

    stmt = conn.createStatement();
    while (pfxrs.next()) {
        cmd = "INSERT INTO expafh VALUES ";
        cmd = cmd + "('" + pfxrs.getString(2) + "',"
                  + pfxrs.getString(3) + ",'"
                  + pfxrs.getString(4) + "','"
                  + pfxrs.getString(5) + "')";
        stmt.executeUpdate(cmd);
    }

> I have a suspicion that the slowdown must be on the client side
> (perhaps some inefficiency in the JDBC code?) but that's only a guess
> at this point.

I have used identical code for all of my testing, the only changes being
which drivers I use to access the data.

Thanks
Stephen Livesey
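[The loop above builds a fresh SQL string and runs one autocommitted
statement per row. A hedged sketch of the usual JDBC speedup -- a
PreparedStatement with explicit transactions, committing every 1000
rows -- is shown below. It reuses the conn and pfxrs variables from the
code above; using setString for every column is an assumption, since
the table's column types are not shown in the thread.]

    PreparedStatement ps = conn.prepareStatement(
            "INSERT INTO expafh VALUES (?, ?, ?, ?)");
    conn.setAutoCommit(false);          // group many rows per transaction
    int count = 0;
    while (pfxrs.next()) {
        ps.setString(1, pfxrs.getString(2));
        ps.setString(2, pfxrs.getString(3));  // unquoted in the original,
                                              // so may need setInt instead
        ps.setString(3, pfxrs.getString(4));
        ps.setString(4, pfxrs.getString(5));
        ps.executeUpdate();
        if (++count % 1000 == 0)
            conn.commit();              // flush a batch of 1000 rows
    }
    conn.commit();                      // commit the final partial batch
    ps.close();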
RE: [GENERAL] Slowdown problem when writing 1.7million records
I just joined this list, so pardon if this has been suggested.

Have you tried 'COPY expafh FROM stdin', rather than inserting each
record? I'm managing a 2.5 million record import, creating a btree index
on two columns, and then vacuuming the db in 36 minutes (on an Ultra 5 -
similar to an AMD K6-2 500). The data is being read from a 600Mb file.
I'm also using 7.1beta5 with the -F flag on the backend (prevents the db
server from flushing after each transaction - can be dangerous, but the
server is faster).

I've attached a Perl script I use - the key being the putline command.
Note that when using COPY, default values, sequences, etc. are not used.
If you have a SERIAL field, you have to put in the incrementing values
yourself, and then use 'setval' to get things correct again.

I apologize for the Perl script - it's not commented. If you have
trouble understanding it, let me know and I'll spruce it up.

Robert Creager
Senior Software Engineer
Client Server Library
303.673.2365 V 303.661.5379 F 888.912.4458 P
StorageTek
INFORMATION made POWERFUL

-----Original Message-----
From: Stephen Livesey [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 28, 2001 2:20 AM
To: Tom Lane
Cc: [EMAIL PROTECTED]
Subject: RE: [GENERAL] Slowdown problem when writing 1.7million records

[quoted text snipped -- identical to Stephen's message above]

tassivSeed.pl
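[The attached Perl script is not reproduced in the archive, so as a
stand-in, here is a hedged JDBC sketch of the same COPY-plus-setval
approach. CopyManager ships with pgJDBC releases much later than the
drivers contemporary with this thread, and the copy_demo table, its
columns, and the copy_demo_id_seq sequence are hypothetical names
introduced for illustration.]

    import java.io.StringReader;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;
    import org.postgresql.PGConnection;
    import org.postgresql.copy.CopyManager;

    public class CopyLoad {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details, not from the thread.
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/test", "user", "password");
            Statement stmt = conn.createStatement();
            stmt.executeUpdate(
                    "CREATE TABLE copy_demo (id serial primary key, val text)");

            // COPY bypasses defaults and sequences, so the SERIAL values
            // are supplied explicitly in the tab-separated input.
            StringBuilder rows = new StringBuilder();
            for (int id = 1; id <= 100000; id++)
                rows.append(id).append('\t').append("row ").append(id).append('\n');

            CopyManager cm = ((PGConnection) conn).getCopyAPI();
            cm.copyIn("COPY copy_demo FROM STDIN",
                      new StringReader(rows.toString()));

            // Resynchronize the sequence with the explicitly loaded keys.
            stmt.execute(
                "SELECT setval('copy_demo_id_seq', (SELECT max(id) FROM copy_demo))");
            stmt.close();
            conn.close();
        }
    }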