Re: [HACKERS] WAL logging problem in 9.4.3?
Hi,

This thread seemed to trail off without a resolution. Was anything done? (See more below.)

On 07/09/15 19:06, Tom Lane wrote:
> Andres Freund writes:
>> On 2015-07-06 11:49:54 -0400, Tom Lane wrote:
>>> Rather than reverting cab9a0656c36739f, which would re-introduce a
>>> different performance problem, perhaps we could have COPY create a new
>>> relfilenode when it does this. That should be safe if the table was
>>> previously empty.
>>
>> I'm not convinced that cab9a0656c36739f needs to survive in that form.
>> To me, only allowing one COPY to benefit from the wal_level = minimal
>> optimization has a significantly higher cost than cab9a0656c36739f.
>
> What evidence have you got to base that value judgement on?
> cab9a0656c36739f was based on an actual user complaint, so we have good
> evidence that there are people out there who care about the cost of
> truncating a table many times in one transaction.

I'm the complainer mentioned in the cab9a0656c36739f commit message. :)

FWIW, we use a temp table to split a join across 4 largish tables (10^8 rows or more each) and 2 small tables (10^6 rows each). We write the results of joining the 2 largest tables into the temp table, and then join that to the other 4. This gave significant performance benefits because the planner would know the exact row count of the 2-way join heading into the 4-way join. After commit cab9a0656c36739f, we got another noticeable performance improvement (I did timings before and after, but I can't seem to put my hands on the numbers right now).

We do millions of these queries every day in batches. Each batch reuses a single temp table (truncating it before each pair of joins) so as to reduce the churn in the system catalogs. In case it matters, the temp table is created with ON COMMIT DROP. This was (and still is) done on 9.2.x.

HTH.

-- todd cook
-- tc...@blackducksoftware.com

> On the other hand, I know of no evidence that anyone's depending on
> multiple sequential COPYs, nor intermixed COPY and INSERT, to be fast.
> The original argument for having this COPY optimization at all was to
> make restoring pg_dump scripts in a single transaction fast; and that
> use-case doesn't care about anything but a single COPY into a virgin
> table. I think you're worrying about exactly the wrong case.
>
>> My tentative guess is that the best course is to
>> a) Make heap_truncate_one_rel() create a new relfilenode. That fixes
>>    the truncation replay issue.
>> b) Force new pages to be used when using the heap_sync mode in COPY.
>>    That avoids the INIT danger you found. It seems rather reasonable
>>    to avoid using pages that have already been the target of WAL
>>    logging here in general.
>
> And what reason is there to think that this would fix all the problems?
> We know of those two, but we've not exactly looked hard for other
> cases. Again, the only known field usage for the COPY optimization is
> the pg_dump scenario; were that not so, we'd have noticed the problem
> long since. So I don't have any faith that this is a well-tested area.
>
> regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
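[Editor's note: for readers following along, the pg_dump restore scenario Tom describes looks roughly like the sketch below; the table and file names are invented for illustration.]

```sql
-- With wal_level = minimal, a COPY into a table created earlier in the
-- same transaction can skip WAL entirely: at commit, the new relfilenode
-- is fsync'ed to disk instead of being replayed from the log.
BEGIN;
CREATE TABLE orders (id integer, total numeric);   -- hypothetical table
COPY orders FROM '/tmp/orders.dat';                -- bulk load, no WAL
COMMIT;                                            -- heap fsync'ed here
```

This is why the optimization only needs to cover a single COPY into a virgin table: that is the exact shape of a single-transaction pg_dump restore.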
Re: [HACKERS] "stored procedures" - use cases?
Peter Eisentraut wrote:
> Another point, as there appear to be diverging camps about
> supertransactional stored procedures vs. autonomous transactions, what
> would be the actual use cases of any of these features?

Looping over hundreds of identical schemas, executing DDL statements on each. We can't do this in a single transaction because it consumes all of shared memory with locks.

-- todd
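[Editor's note: a sketch of the kind of loop described, with invented schema and column names. A DO block like this still runs in a single transaction, accumulating a lock on every table touched, which is exactly why procedures with in-procedure transaction control were wanted.]

```sql
-- Hypothetical: add a column to the same table in every tenant schema.
-- In one transaction, the accumulated locks can exhaust the lock table
-- (sized by max_locks_per_transaction); the use case needs a commit
-- between iterations, which plain functions and DO blocks cannot do.
DO $$
DECLARE
    s text;
BEGIN
    FOR s IN SELECT nspname FROM pg_namespace WHERE nspname LIKE 'tenant_%'
    LOOP
        EXECUTE format('ALTER TABLE %I.events ADD COLUMN flagged boolean', s);
    END LOOP;
END
$$;
```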
Re: [HACKERS] slow commits with heavy temp table usage in 8.4.0
Tom Lane wrote:
> If you roll back a truncate, do you get the expected state?

I did a number of variations on the test below, with and without "on commit drop", and similar tests where the "create table" is done before the "begin". After the checkpoint, the number of files in the database directory always returned to the value before the "begin" (210 in this case). Everything behaved as expected.

test=# begin;
BEGIN
test=# create temp table t1 (x integer) ;
CREATE TABLE
test=# insert into t1 select s from generate_series(1,1000) s ;
INSERT 0 1000
test=# select count(*) from t1 ;
 1000
test=# savepoint s1;
SAVEPOINT
test=# truncate t1;
TRUNCATE TABLE
test=# select count(*) from t1 ;
 0
test=# insert into t1 select s from generate_series(1,1001) s ;
INSERT 0 1001
test=# select count(*) from t1 ;
 1001
test=# rollback to savepoint s1 ;
ROLLBACK
test=# select count(*) from t1 ;
 1000
test=# commit ;
COMMIT
test=# select count(*) from t1 ;
 1000
test=# checkpoint;
CHECKPOINT

> How about after a database crash?

Repeating the same test as above, after the second insert, I did "killall -9 postgres". Restarting generated the expected messages in the log:

2009-08-07 13:09:56 EDT LOG: database system was interrupted; last known up at 2009-08-07 13:06:01 EDT
2009-08-07 13:09:56 EDT LOG: database system was not properly shut down; automatic recovery in progress
2009-08-07 13:09:56 EDT LOG: redo starts at 0/1F8D6D0
2009-08-07 13:09:56 EDT LOG: invalid magic number in log file 0, segment 1, offset 16392192
2009-08-07 13:09:56 EDT LOG: redo done at 0/1F9F3B8
2009-08-07 13:09:56 EDT LOG: autovacuum launcher started
2009-08-07 13:09:56 EDT LOG: database system is ready to accept connections

However, the DB directory now has 214 files (up from 210); I have no idea whether this means anything or not. Repeating the previous tests gives expected results.
-- todd
Re: [HACKERS] slow commits with heavy temp table usage in 8.4.0
Alex Hunsaker wrote:
>> FYI, on my 8.2.13 system, the test created 30001 files which were all
>> deleted during the commit. On my 8.4.0 system, the test created 60001
>> files, of which 3 were deleted at commit and 30001 disappeared later
>> (presumably during a checkpoint?).
>
> Smells like fsm?

Yes, that was it. 3 of the filenames ended with "_fsm".

> With double the number of files maybe something simple like turning on
> dir_index if you are ext3 will help?

Thanks for the tip. Doing "tune2fs -O +dir_index" didn't seem to make a difference, which is kinda expected for an existing directory. When I get a chance, I'll try to recreate the filesystem.

-- todd
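[Editor's note: on ext3, enabling dir_index only affects directories created afterward; existing directories can be re-indexed offline with e2fsck. A sketch, with a hypothetical device name; the filesystem must be unmounted first.]

```shell
# Enable hashed b-tree directory indexes on an ext3 filesystem
tune2fs -O dir_index /dev/sdb1    # /dev/sdb1 is a placeholder device

# Rebuild indexes for directories that already exist
# (-f: force check, -D: optimize directories; run on an unmounted fs)
e2fsck -fD /dev/sdb1
```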
Re: [HACKERS] slow commits with heavy temp table usage in 8.4.0
Tom Lane wrote:
> Actually, this is easier than I thought, because there is already
> bookkeeping being done that (in effect) tracks whether a table has
> already been truncated in the current transaction. So we can rely on
> that, and with only a very few lines of code added, ensure that a
> situation like this does only one full-scale transaction-safe
> truncation per transaction. The attached prototype patch does this and
> seems to fix the speed problem nicely. It's not tremendously well
> tested, but perhaps you'd like to test? Should work in 8.4.

I downloaded the 8.4 source, built it unmodified, created a new cluster, and ran the test in an empty DB there. Function execution took about 230 seconds, and commit took about 6 seconds. With the patch applied, the test only took 35 seconds, and the commit was practically instant (30ms). I monitored the database directory, and the test execution only created 2 files (down from 6).

Thanks for the patch; it looks great. :) Is there any chance that it will be backpatched to 8.4?

-- todd
Re: [HACKERS] slow commits with heavy temp table usage in 8.4.0
Tom Lane wrote:
> The attached prototype patch does this and seems to fix the speed
> problem nicely. It's not tremendously well tested, but perhaps you'd
> like to test? Should work in 8.4.

I'll give it a try and report back (though probably not until tomorrow).

-- todd
Re: [HACKERS] slow commits with heavy temp table usage in 8.4.0
Tom Lane wrote:
> I took a look through the CVS history and verified that there were no
> post-8.4 commits that looked like they'd affect performance in this
> area. So I think it's got to be a platform difference not a PG version
> difference. In particular I think we are probably looking at a
> filesystem issue: how fast can you delete [...] 3 files.

I'm still on Fedora 7, so maybe this will be motivation to upgrade.

FYI, on my 8.2.13 system, the test created 30001 files which were all deleted during the commit. On my 8.4.0 system, the test created 60001 files, of which 3 were deleted at commit and 30001 disappeared later (presumably during a checkpoint?).

> But I'm not sure if the use-case is popular enough to deserve such a
> hack.

FWIW, the full app was looping over a set of datasets. On each iteration, it computed some intermediate results into a temp table, generated several reports from those intermediate results, and finally truncated the table for the next iteration.

-- todd
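[Editor's note: the reporting pattern described might be sketched as below; the table, column, and function names are invented.]

```sql
CREATE TEMP TABLE intermediate (dataset_id integer, metric numeric);

-- For each dataset: fill the temp table, run several reports against it,
-- then truncate it for the next iteration instead of dropping/recreating
-- it (reusing one temp table reduces system-catalog churn).
INSERT INTO intermediate
    SELECT dataset_id, compute_metric(payload)   -- hypothetical function
    FROM raw_data
    WHERE dataset_id = 42;

SELECT avg(metric) FROM intermediate;                   -- report 1
SELECT count(*) FROM intermediate WHERE metric > 1.0;   -- report 2

TRUNCATE intermediate;   -- ready for the next dataset
```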
Re: [HACKERS] slow commits with heavy temp table usage in 8.4.0
Tom Lane wrote:
> So what I'm seeing is entirely explained by the buildup of dead
> versions of the temp table's pg_class row --- the index_getnext time is
> spent scanning over dead HOT-chain members. It might be possible to
> avoid that by special-casing temp tables in TRUNCATE to recycle the
> existing file instead of assigning a new one. However, there is no
> reason to think that 8.3 would be any better than 8.4 on that score.
> Also, I'm not seeing the very long CPU-bound commit phase that Todd is
> seeing.

The commit looks CPU-bound when I let the residual I/O from the function execution die out before I issue the commit.

> I'm actually testing CVS HEAD, not 8.4.0, but I don't recall that we've
> done anything in the past month that would be likely to affect this ...
>
> regards, tom lane

Tom's theory may explain the different commit results I get when testing on two different databases (times in milliseconds):

  db          truncations    function      commit
  ----------  -----------  ----------  ----------
  test                  1   29603.624    6054.889
  test                  1   34740.167   14551.177
  test                  1   30608.260   11144.503
  test                  1   32239.049    9846.676
  test                  3  227115.850   50206.947
  test                  3  201859.698   46083.222
  test                  3  231926.642   46681.009
  test                  3  235665.970   47113.137
  production            1   32982.069   17654.772
  production            1   33297.524   17396.792
  production            1   35503.185   18343.045
  production            1   34251.753   18284.725
  production            3  200899.786   75480.448
  production            3  206793.209   73316.405
  production            3  260491.759   72570.297
  production            3  191363.168   66659.129

The "test" DB is nearly empty with 251 entries in pg_class, whereas "production" has real data with 9981 entries in pg_class.

-- todd
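[Editor's note: one way to observe the dead-row buildup Tom describes is the pgstattuple contrib module; a sketch, assuming the module is installed.]

```sql
-- In 9.1+: CREATE EXTENSION pgstattuple;
-- In 8.4, load contrib/pgstattuple.sql into the database instead.

-- dead_tuple_count climbs with every TRUNCATE of the temp table, since
-- each truncation leaves behind a dead version of its pg_class row that
-- catalog scans must still step over until VACUUM reclaims it.
SELECT tuple_count, dead_tuple_count, dead_tuple_percent
FROM pgstattuple('pg_class');
```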
[HACKERS] slow commits with heavy temp table usage in 8.4.0
Hi,

I've noticed that on 8.4.0, commits can take a long time when a temp table is repeatedly filled and truncated within a loop. A very contrived example is

begin;
create or replace function commit_test_with_truncations() returns void
language 'plpgsql' as $_func_$
declare
    i integer;
begin
    create temp table t1 (x integer) on commit drop ;
    for i in 1 .. 22000 loop
        insert into t1 select s from generate_series(1,1000) s ;
        truncate t1 ;
    end loop;
end;
$_func_$;
select commit_test_with_truncations() ;
commit ;

On my laptop (Core2 Duo with 3.5GB and a disk dedicated to PG), the function call takes about 124 seconds, and the commit takes about 43 seconds. The function execution generates a lot of I/O activity, but the commit is entirely CPU bound. By contrast, the same test on an 8.2.13 system (2 older Xeons and 8GB) had times of 495 and 19 seconds. In this case, both the function execution and the commit were entirely CPU bound.

The overall process in 8.4 is much faster than 8.2.13, but the commit time is somewhat surprising to me. Is that to be expected?

8.4 version(): PostgreSQL 8.4.0 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-27), 64-bit
8.2.13 version(): PostgreSQL 8.2.13 on x86_64-suse-linux-gnu, compiled by GCC gcc (GCC) 3.3.5 20050117 (prerelease) (SUSE Linux)

-- todd
Re: [HACKERS] is 8.4 array_agg() supposed to work with array values?
Peter Eisentraut wrote:
> On Monday 19 January 2009 23:22:21 Todd A. Cook wrote:
>> The docs at
>> http://developer.postgresql.org/pgdocs/postgres/functions-aggregate.html
>> don't prohibit using array values with array_agg(), so I assumed that
>> it would work.
>>
>> test=> select array_agg(v.a) from (values (array[1,2]), (array[3,4])) as v(a) ;
>> ERROR: could not find array type for data type integer[]
>
> Yeah ... This is one of the weirdnesses of the PostgreSQL array
> implementation. integer[] and integer[][] etc. are actually the same
> type, just using a different number of dimensions internally. This
> would work much better if integer[][] were "array of integer[]", in the
> same way as integer[] is "array of integer", in the way C deals with
> arrays. This is also a main reason why composite types and arrays don't
> mix orthogonally; there is no way to represent that in the system
> catalogs.
>
> To get back to your question, as far as array_agg() itself is
> concerned, it would probably work, but the rest of the system doesn't
> deal with it very well. You will probably be able to find a number of
> other places that break when trying to derive the array type from
> something that is already an array.

Thanks for the clarification.

-- todd
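[Editor's note: a workaround available at the time was a custom aggregate over array_cat, which concatenates the input arrays rather than nesting them; a sketch. (Later releases, 9.5 onward, extended array_agg() itself to accept array inputs, producing a multidimensional result.)]

```sql
-- Aggregate that concatenates array inputs into one flat array.
CREATE AGGREGATE array_accum (anyarray) (
    sfunc    = array_cat,
    stype    = anyarray,
    initcond = '{}'
);

-- select array_accum(v.a) from (values (array[1,2]), (array[3,4])) as v(a);
-- yields {1,2,3,4}: the inputs are concatenated, not nested
```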
[HACKERS] is 8.4 array_agg() supposed to work with array values?
Hi,

The docs at http://developer.postgresql.org/pgdocs/postgres/functions-aggregate.html don't prohibit using array values with array_agg(), so I assumed that it would work. However, with CVS HEAD from Friday afternoon, I get

test=> select version() ;
 PostgreSQL 8.4devel on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070502 (Red Hat 4.1.2-12), 64-bit
(1 row)

test=> select v.a, pg_typeof(v.a) from (values (array[1,2]), (array[3,4])) as v(a) ;
   a   | pg_typeof
-------+-----------
 {1,2} | integer[]
 {3,4} | integer[]
(2 rows)

test=> select array_agg(v.a) from (values (array[1,2]), (array[3,4])) as v(a) ;
ERROR: could not find array type for data type integer[]

If this is expected behavior, the docs should mention the restriction.

-- todd
Re: [HACKERS] More thoughts about planner's cost estimates
Josh Berkus wrote:
> Greg, Tom,
>
>> But for most users analyze doesn't really have to run as often as
>> vacuum. One sequential scan per night doesn't seem like that big a
>> deal to me.
>
> Clearly you don't have any 0.5 TB databases.

Perhaps something like "ANALYZE FULL"? Then only those who need the more precise statistics would pay the cost for a full scan.

-- todd

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org