Re: [HACKERS] Pull up aggregate subquery
On Tue, May 24, 2011 at 3:47 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
> That's true. But if the planning cost is the only issue, why not add a new GUC so the user can choose whether they prefer it or not? Of course, if we have some method to predict which way to go before proving both ways, that's great. Do you have some blue picture of it?
>
> I like your simple patch and it looks like it fixes your concern.

Your problem statement ignores the fact that most people would not write the original query like this

  select m_id, sum_len from size_m m inner join (select m_id, sum(length(val)) as sum_len from size_l group by m_id) l on m.id = l.m_id where val = '1';

they would write it like this

  select m_id, sum(length(val)) from size_m m join size_l l on m.id = l.m_id where val = '1' group by m_id;

which gives a far worse plan, and one that is not solved by your patch. Your way of writing the SQL is one of the hand-optimized rewrites that an SQL expert would try. We shouldn't be optimizing only for hand-altered code, since it can always be further tweaked by hand. We should be optimizing the original, simple queries (as well as other forms of expressing the same thing).

This highlights that we do not have the infrastructure to push aggregates up or down, and that the lack of a known primary key for the output of each plan node prevents us from developing a general transformation infrastructure to solve the general case. That particular piece of infrastructure is also an essential step towards materialized views, which would be pretty useless without the capability to transform aggregates up and down the join tree.

In terms of costing, I think it is likely that we can apply simple heuristics. We already assume that applying quals down to the lowest level possible makes sense. I would guess that anything that reduces the number of rows should be pushed down as far as possible.
I'm sure there are cases where that isn't true, but let's not be stopped from solving simple general cases by the theoretical existence of complex cases.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Fri, Jun 3, 2011 at 2:17 PM, Robert Haas robertmh...@gmail.com wrote:
> I've now spent enough time working on this issue to be convinced that the approach has merit, if we can work out the kinks.

Yes, the approach has merits and I'm sure we can work out the kinks.

> As you can see, this works out to a bit more than a 4% improvement on this two-core box. I also got access (thanks to Nate Boley) to a 24-core box and ran the same test with scale factor 100 and shared_buffers=8GB. Here are the results of alternating runs without and with the patch on that machine:
>
> tps = 36291.996228 (including connections establishing)
> tps = 129242.054578 (including connections establishing)
> tps = 36704.393055 (including connections establishing)
> tps = 128998.648106 (including connections establishing)
> tps = 36531.208898 (including connections establishing)
> tps = 131341.367344 (including connections establishing)
>
> That's an improvement of roughly 3.5x. According to the vmstat output, when running without the patch, the CPU state was about 40% idle. With the patch, it dropped down to around 6%.

Congratulations. I believe that is realistic based upon my investigations.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Cube Index Size
2011/6/2 Teodor Sigaev teo...@sigaev.ru:
> Can we figure out some information about the index, i.e. what is the height of the index tree, and how many values are placed in one leaf node and in one non-leaf node?

http://www.sigaev.ru/cvsweb/cvsweb.cgi/gevel/

> For improving space utilization: when a node is split, we have to assign entries to two groups. Once one group reaches some threshold (m), insert the remaining entries into the other group. Can you suggest some way to choose 'm' (because cube values are stored as NDBOX, which has variable length), or provide some guidance with the code?
>
> Thanks

-- Teodor Sigaev E-mail: teo...@sigaev.ru WWW: http://www.sigaev.ru/
Re: [HACKERS] storing TZ along timestamps
On Thu, Jun 2, 2011 at 8:58 PM, Jim Nasby j...@nasby.net wrote:
> I'm torn between whether the type should store the original time or the original time converted to GMT.

This is the wrong way to think about it. We *never* store time converted to GMT. When we want to represent a point in time we represent it as seconds since the epoch. This is a quantity that is the same everywhere in the world (modulo relativistic effects...); i.e., it's a concept which is completely independent of timezones.

The question here is how to represent more complex concepts than simply points in time. I think the two concepts under discussion are a) a composite type representing a point in time and a timezone it should be interpreted in for operations and display, and b) the original input provided, which is a text string with the constraint that it's a valid input which can be interpreted as a point in time.

-- greg
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On Sat, Jun 4, 2011 at 2:59 PM, Simon Riggs si...@2ndquadrant.com wrote:
>> As you can see, this works out to a bit more than a 4% improvement on this two-core box. I also got access (thanks to Nate Boley) to a 24-core box and ran the same test with scale factor 100 and shared_buffers=8GB. Here are the results of alternating runs without and with the patch on that machine:
>>
>> tps = 36291.996228 (including connections establishing)
>> tps = 129242.054578 (including connections establishing)
>> tps = 36704.393055 (including connections establishing)
>> tps = 128998.648106 (including connections establishing)
>> tps = 36531.208898 (including connections establishing)
>> tps = 131341.367344 (including connections establishing)
>>
>> That's an improvement of roughly 3.5x. According to the vmstat output, when running without the patch, the CPU state was about 40% idle. With the patch, it dropped down to around 6%.
>
> Congratulations. I believe that is realistic based upon my investigations.

Tom, you should look at this. It's good.

The approach looks sound to me. It's a fairly isolated patch and we should be considering this for inclusion in 9.1, not wait another year.

I will happily add that it's a completely different approach from the one I'd been working on, and, even more happily, that it is so different from the Oracle approach that we are definitely unencumbered by patent issues here.

Well done Robert, Noah.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] WIP: Fast GiST index build
On 03.06.2011 14:02, Alexander Korotkov wrote:
> Hackers,
>
> A WIP patch for fast GiST index build is attached. The code is dirty and comments are lacking, but it works. Now it is ready for first benchmarks, which should prove the efficiency of the selected technique. It's time to compare fast GiST index build with repeated-insert build on large enough datasets (datasets which don't fit in cache). The aims of testing are:
> 1) Measure acceleration of index build.
> 2) Measure change in index quality.
> I'm going to do the first testing using synthetic datasets. Everybody who has interesting real-life datasets for testing is welcome.

I did some quick performance testing of this. I installed postgis 1.5, and loaded an extract of the OpenStreetMap data covering Finland. The biggest gist index in that data set is the idx_nodes_geom index on the nodes table. I have maintenance_work_mem and shared_buffers both set to 512 MB, and this laptop has 4GB of RAM.

Without the patch, reindexing the index takes about 170 seconds and the index size is 321 MB. With the patch, it takes about 150 seconds, and the resulting index size is 319 MB.

The nodes table is 618MB in size, so it fits in RAM. I presume the gain would be bigger if it didn't, as the random I/O to update the index starts to hurt more. But this shows that even when it does fit, this patch helps a little bit, and the resulting index size is comparable.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] BLOB support
Tom Lane t...@sss.pgh.pa.us wrote, Friday 03 of June 2011, 16:44:13:
> Alvaro Herrera alvhe...@commandprompt.com writes:
>> Excerpts from Radosław Smogura's message of jue jun 02 15:26:29 -0400 2011:
>>> So do I understand correctly: should we think about creating a bettered TOAST to support values longer than a 30-bit length?
>> I like this much more,

Good :-)

>> (BTW while it'd be good to have longer-than-30-bit length words for varlena, I'm not sure we have room for that.)
> You wouldn't want to push such values around as whole values anyway. Possibly what would work here is a variant form of TOAST pointer for which we'd simply throw an error if you tried to fetch the entire value at once.
>
> regards, tom lane

I mentioned the JDBC calls:

1. b = conn.createBlob(); ps = conn.prepareStatement("INSERT INTO t blob = ?"); ps.setBlob(1, b); ps.executeQuery();
2. Statements could be as well SELECT blob_index_of(?, ?); where the 1st ? is a blob, and the 2nd one is some text/sequence.

These calls must be supported to make BLOBs as similar as possible to other types. This actually disallows putting the LOB in TOAST, as there is no relation (2.), or the relation is not known (1.), while reading the LOB - in any case you can't skip bytes from the protocol stream, so the possible solutions are:

1. Create a temporary LOB in a file or in memory, depending on its size.
2. Use the existing Large Object interface.
3. The client will not send the LOB, just its faked ID, and during the call the server will ask the client to serialize this LOB by the faked id.
4. Any other propositions?

I vote for 2.

For the pg_type/pg_class changes, I am thinking about removing haslobs and putting this in attstorage or somewhere else, for example ('l' may stand for lobs, or a -3 length), but currently TOAST composites don't support recursion, and those types are toasted as a whole. I may add recursion for those types and special handling for LOBs. In any case, handling this will be better in the toasting code, not in nodeModifyTable. Any ideas about this?
Regards, Radek
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
On 04.06.2011 18:01, Simon Riggs wrote:
> It's a fairly isolated patch and we should be considering this for inclusion in 9.1, not wait another year.

-1

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] ts_count
One of our PostgreSQL Experts Inc customers wanted a function to count all the occurrences of terms in a tsquery in a tsvector. This has been written as a loadable module function, and initial testing shows it is working well. With the client's permission we are releasing the code - it's available at https://github.com/pgexperts/ts_count. The actual new code involved here is tiny; some of the code is copied from tsrank.c and much of the rest is boilerplate.

A snippet from the regression test:

  select ts_count(to_tsvector('managing managers manage peons managerially'),
                  to_tsquery('managers | peon'));
   ts_count
  ----------
          4

We'd like to add something like this for 9.2, so I'd like to get the API agreed and then I'll prepare a patch and submit it for the next CF.

Comments?

cheers

andrew
Re: [HACKERS] Error in PQsetvalue
On Fri, Jun 3, 2011 at 10:36 PM, Andrew Chernow a...@esilo.com wrote:
> On 6/3/2011 10:26 PM, Andrew Chernow wrote:
>>> I disagree -- I think the fix is a one-liner. Line 446:
>>>
>>>   if (tup_num == res->ntups && !res->tuples[tup_num])
>>>
>>> should just become
>>>
>>>   if (tup_num == res->ntups)
>>>
>>> also the memset of the tuple slots when the slot array is expanded can be removed. (in addition, the tuple array expansion should really be abstracted, but that isn't strictly necessary here).
>>
>> All true. This is a cleaner fix to something that was in fact broken ;)
>
> Attached is a patch that fixes the OP's issue. PQsetvalue now uses pqAddTuple to grow the tuple table, and the remnants of an older idea that caused the bug have been removed.
>
> Sorry, I attached the wrong patch. Here is the correct one.

This looks good. Pavel, want to test it?

merlin
Re: [HACKERS] creating CHECK constraints as NOT VALID
On Thu, Jun 2, 2011 at 5:48 PM, Alvaro Herrera alvhe...@commandprompt.com wrote:
> Excerpts from Alvaro Herrera's message of mié jun 01 20:56:12 -0400 2011:
>> Excerpts from Thom Brown's message of mié jun 01 19:48:44 -0400 2011:
>>> Is this expected? [ pg_dump fails to preserve not-valid status of constraints ]
>> Certainly not.
>>> Shouldn't the constraint be dumped as not valid too??
>> Sure, I'll implement that tomorrow.
>
> Actually, it turns out that NOT VALID foreign keys were already buggy here, and fixing them automatically fixes this case as well, because the fix involves touching pg_get_constraintdef to dump the flag. This also gets it into psql's \d. Patch attached. (Maybe the changes in psql's describe.c should be reverted, not sure.)

Thanks. As soon as Thom said that, I thought "ahh... didn't do that."

Patch looks fine. Will you commit this patch to 9.1 now, or would you like me to?

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] SIREAD lock versus ACCESS EXCLUSIVE lock
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
> On 03.06.2011 21:04, Kevin Grittner wrote:
>> Also, if anyone has comments or hints about the placement of those calls, I'd be happy to receive them.
> heap_drop_with_catalog() schedules the relation for deletion at the end of transaction, but it's still possible that the transaction aborts and the heap doesn't get dropped after all. If you put the DropAllPredicateLocksFromTable() call there, and the transaction later aborts, you've lost all the locks already.

But on third thought: is that wrong? Surely locks taken by an aborted transaction can be discarded.

regards, tom lane
Re: [HACKERS] SIREAD lock versus ACCESS EXCLUSIVE lock
On 04.06.2011 19:19, Tom Lane wrote:
> Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
>> On 03.06.2011 21:04, Kevin Grittner wrote:
>>> Also, if anyone has comments or hints about the placement of those calls, I'd be happy to receive them.
>> heap_drop_with_catalog() schedules the relation for deletion at the end of transaction, but it's still possible that the transaction aborts and the heap doesn't get dropped after all. If you put the DropAllPredicateLocksFromTable() call there, and the transaction later aborts, you've lost all the locks already.
>
> But on third thought: is that wrong? Surely locks taken by an aborted transaction can be discarded.

These are predicate locks - there can be locks on the table belonging to transactions that have already committed.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] plperl fails with perl 5.14.0
It appears from the buildfarm configure logs that member anchovy was updated yesterday from perl 5.12.3 to 5.14.0. It doesn't like plperl.c anymore:

  ccache cc -flto -march=amdfam10 -fstack-protector -O3 -pipe -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wformat-security -fno-strict-aliasing -fwrapv -g -fpic -I. -I. -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/lib/perl5/core_perl/CORE -c -o plperl.o plperl.c
  plperl.c: In function 'plperl_trusted_init':
  plperl.c:929:36: error: lvalue required as left operand of assignment
  make[3]: *** [plperl.o] Error 1

What it's unhappy about is evidently this:

  GvCV(sv) = NULL;    /* prevent call via GV */

I think we are going to have to find a different way of doing that...

regards, tom lane
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon Riggs wrote:
>> we should be considering this for inclusion in 9.1, not wait another year.
> -1

I'm really happy that we're addressing the problems with scaling to a large number of cores, and this patch sounds great. But adding a new feature at this point in the release cycle would be horrible. Frankly, from the tone of Robert's post, it probably wouldn't be appropriate to include it in a release even if it had shown up in this condition at the start of the last CF for that release.

The nice thing about annual releases is there's never one too far away -- unless, of course, we hold a release up to squeeze in just one more feature.

-Kevin
Re: [HACKERS] Pull up aggregate subquery
2011/6/4 Simon Riggs si...@2ndquadrant.com:
> I like your simple patch and it looks like it fixes your concern.

Thanks for your interest. I forgot to mention it, but this type of query is quite common for one-to-many entities and likely to be generated by simple OR mappers.

> Your problem statement ignores the fact that most people would not write the original query like this
>
>   select m_id, sum_len from size_m m inner join (select m_id, sum(length(val)) as sum_len from size_l group by m_id) l on m.id = l.m_id where val = '1';
>
> they would write it like this
>
>   select m_id, sum(length(val)) from size_m m join size_l l on m.id = l.m_id where val = '1' group by m_id;

Hm, I didn't notice this hand transformation. I will check what happens. But my example is simplified, and it is likely that some other tables join (not uniquely) to size_m.

> In terms of costing, I think it is likely that we can apply simple heuristics. We already assume that applying quals down to the lowest level possible makes sense. I would guess that anything that reduces the number of rows should be pushed down as far as possible. I'm sure there are cases where that isn't true, but let's not be stopped from solving simple general cases by the theoretical existence of complex cases

Agreed. After more thought, the push-down-qual approach would be better than pushing down/pulling up aggregates. The only concern was the multiple-aggregate-call case, in situations where more rows than one are qualified in size_m. But it is clear that each aggregate call scans only qualified size_l rows if we can push the parameter qual down into the subquery. The nestloop-with-parameter-push-down-to-aggregate-subquery approach is more general because it doesn't depend on the primary key issue: you can push it down whenever the total execution cost is smaller. So, I'm now walking through the planner code, and I have finally found the clue to start.
First, the current problem with the parameterized nestloop idea in the general case is that while a nested index scan pushes a parameter to the other join relation, the more general approach needs to do so with a subquery. A nested index scan Path is generated in match_unsorted_outer(), which is much deeper inside make_rel_from_joinlist(), which runs after set_base_rel_pathlist(), which contains set_subquery_pathlist(), which calls subquery_planner(). This means that if you want to add code to generate a general nestloop with parameter push-down during the join search process, it is too late to push the parameter down into the subquery, because the subquery was already planned by that time. So we need to add a new operation before subquery_planner(). That is hard, because no join-relevant information is ready at that stage. But I bet some simple cases like my aggregate-join example can make a parameter from the join qual. (I haven't yet written any code nor proven my theory.) In this case we need to plan the subquery twice: once pure, and once parameterized.

Other than the subquery case, LATERAL will be fine with something close to the nested-index-scan approach, since the joined relation is a FunctionScan, which is not planned like a subquery. The s JOIN (l1 LEFT JOIN l2) case is unclear - it may need either the subquery approach or the index-scan approach, or maybe a third way, because l1 LEFT JOIN l2 is inside the deconstructed jointree, which is neither a planned subquery nor a plain RelOptInfo like a base relation / function scan.

Regards,

-- Hitoshi Harada
Re: [HACKERS] Postmaster holding unlinked files for pg_largeobject table
Alvaro Herrera alvhe...@commandprompt.com writes:
> What surprises me is that the open references remain after a database drop. Surely this means that no backends keep open file descriptors to any table in that database, because there are no connections.

bgwriter ...

regards, tom lane
Re: [HACKERS] reducing the overhead of frequent table locks - now, with WIP patch
Simon Riggs si...@2ndquadrant.com writes:
> The approach looks sound to me. It's a fairly isolated patch and we should be considering this for inclusion in 9.1, not wait another year.

That suggestion is completely insane. The patch is only WIP and full of bugs, even according to its author. Even if it were solid, it is way too late to be pushing such stuff into 9.1. We're trying to ship a release, not find ways to cause it to slip more.

regards, tom lane
Re: [HACKERS] SIREAD lock versus ACCESS EXCLUSIVE lock
Heikki Linnakangas wrote:
> On 04.06.2011 19:19, Tom Lane wrote:
>> Heikki Linnakangas writes:
>>> On 03.06.2011 21:04, Kevin Grittner wrote:
>>>> Also, if anyone has comments or hints about the placement of those calls, I'd be happy to receive them.
>>> heap_drop_with_catalog() schedules the relation for deletion at the end of transaction, but it's still possible that the transaction aborts and the heap doesn't get dropped after all. If you put the DropAllPredicateLocksFromTable() call there, and the transaction later aborts, you've lost all the locks already.
>> But on third thought: is that wrong? Surely locks taken by an aborted transaction can be discarded.
> These are predicate locks - there can be locks on the table belonging to transactions that have already committed.

It took me a while to think this through, but consider this: transaction T1, which reads the table to create the SIREAD lock, overlaps transaction T2, with which it might interact to create a dangerous structure, and T2 has not yet accessed the table in any way. After T1 commits, a third transaction T3 could try to drop the table but roll back, and T2 could still proceed to do a write which conflicts with the predicate lock. That is certainly a low-frequency combination of events, but one which can't be ignored if we want correct behavior (i.e., no false negatives).

-Kevin
Re: [HACKERS] plperl fails with perl 5.14.0
There was some discussion on this about 6 weeks ago, at which time 5.14 wasn't released, so I didn't want to apply the patch then. I'll look at applying the fix now.

cheers

andrew

On 06/04/2011 12:30 PM, Tom Lane wrote:
> It appears from the buildfarm configure logs that member anchovy was updated yesterday from perl 5.12.3 to 5.14.0. It doesn't like plperl.c anymore:
>
>   ccache cc -flto -march=amdfam10 -fstack-protector -O3 -pipe -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wformat-security -fno-strict-aliasing -fwrapv -g -fpic -I. -I. -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/lib/perl5/core_perl/CORE -c -o plperl.o plperl.c
>   plperl.c: In function 'plperl_trusted_init':
>   plperl.c:929:36: error: lvalue required as left operand of assignment
>   make[3]: *** [plperl.o] Error 1
>
> What it's unhappy about is evidently this:
>
>   GvCV(sv) = NULL;    /* prevent call via GV */
>
> I think we are going to have to find a different way of doing that...
>
> regards, tom lane
Re: [HACKERS] plperl fails with perl 5.14.0
On 06/04/2011 01:05 PM, Andrew Dunstan wrote:
> There was some discussion on this about 6 weeks ago, at which time 5.14 wasn't released, so I didn't want to apply the patch then. I'll look at applying the fix now.

(and apologies for top-replying ;-) )

cheers

andrew
Re: [HACKERS] plperl fails with perl 5.14.0
Andrew Dunstan and...@dunslane.net writes:
> There was some discussion on this about 6 weeks ago, at which time 5.14 wasn't released.

Oh, right, I thought the issue seemed familiar: http://archives.postgresql.org/message-id/BANLkTi=qrfqppfqto2r5amb_cfsyhdv...@mail.gmail.com

It looks like we were waiting to see if they'd put GvCV_set() into ppport.h. If they didn't, Alex's fix looks sane to me.

regards, tom lane
Re: [HACKERS] creating CHECK constraints as NOT VALID
Excerpts from Simon Riggs's message of sáb jun 04 09:11:52 -0400 2011:
> On Thu, Jun 2, 2011 at 5:48 PM, Alvaro Herrera alvhe...@commandprompt.com wrote:
>> Actually, it turns out that NOT VALID foreign keys were already buggy here, and fixing them automatically fixes this case as well, because the fix involves touching pg_get_constraintdef to dump the flag. This also gets it into psql's \d. Patch attached. (Maybe the changes in psql's describe.c should be reverted, not sure.)
>
> Thanks. As soon as Thom said that, I thought "ahh... didn't do that."
>
> Patch looks fine. Will you commit this patch to 9.1 now, or would you like me to?

Thanks for the review. I already committed it on 9.1: http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=048417511aef8d5fb2d541b17b73afc730935cd5

I'd still like your opinion on the psql bits. Should they be reverted? I haven't verified what the output currently looks like.

-- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] ts_count
Well, there are several functions available around tsearch2, so I suggest somebody collect all of them and create one extension - ts_addon. For example, these are the ones I remember:

1. tsvector2array
2. noccurences(tsvector, tsquery) - like your ts_count
3. nmatches(tsvector, tsquery) - # of matched lexemes in the query

Of course, we need to think about better names for the functions, since ts_count is a bit ambiguous.

Oleg

On Sat, 4 Jun 2011, Andrew Dunstan wrote:
> One of our PostgreSQL Experts Inc customers wanted a function to count all the occurrences of terms in a tsquery in a tsvector. This has been written as a loadable module function, and initial testing shows it is working well. With the client's permission we are releasing the code - it's available at https://github.com/pgexperts/ts_count. The actual new code involved here is tiny, some of the code is copied from tsrank.c and much of the rest is boilerplate. A snippet from the regression test:
>
>   select ts_count(to_tsvector('managing managers manage peons managerially'),
>                   to_tsquery('managers | peon'));
>    ts_count
>   ----------
>           4
>
> We'd like to add something like this for 9.2, so I'd like to get the API agreed and then I'll prepare a patch and submit it for the next CF. Comments?
>
> cheers
>
> andrew

Regards, Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] Assert failure when rechecking an exclusion constraint
Noah Misch n...@leadboat.com writes:
> Commit d2f60a3ab055fb61c8e1056a7c5652f1dec85e00 added an assert to indexam.c's RELATION_CHECKS to block use of an index while it's being rebuilt. This assert trips while rechecking an exclusion constraint after an ALTER TABLE rebuild:
>
>   CREATE TABLE t ( c int, EXCLUDE (c WITH =) );
>   INSERT INTO t VALUES (1), (2);
>   ALTER TABLE t ALTER c TYPE smallint USING 1;

Mph ... obviously not tested enough ...

> I could not come up with an actual wrong behavior arising from this usage, so I'll tentatively call it a false positive. reindex_index() could instead unconditionally clear indexInfo->ii_Exclusion* before calling index_build(), then pop pendingReindexedIndexes and call IndexCheckExclusion() itself. Popping pendingReindexedIndexes as we go can make the success of a reindex_relation() dependent on the order in which we choose to rebuild indexes, though. Another option is to just remove the assert as not worth preserving.

Removing the assert would be a seriously bad idea IMO. I think we could just allow index_build to call ResetReindexProcessing() midstream (i.e., before it calls IndexCheckExclusion).

This does raise the question of whether the existing call to IndexCheckExclusion is badly placed and we should move it to after the index is fully rebuilt. That would allow us to avoid doing ResetReindexProcessing until the index is clearly safe to use.

So in short, I'm thinking: move lines 1760-1772 (in HEAD) of index.c to the end of index_build(), then insert a ResetReindexProcessing() call in front of them; or maybe only do ResetReindexProcessing there if we actually do call IndexCheckExclusion.

regards, tom lane
Re: [HACKERS] ts_count
On 06/04/2011 04:51 PM, Oleg Bartunov wrote:
> Well, there are several functions available around tsearch2, so I suggest
> somebody collect all of them and create one extension - ts_addon. For
> example, these are what I remember:
> 1. tsvector2array
> 2. noccurences(tsvector, tsquery) - like your ts_count
> 3. nmatches(tsvector, tsquery) - # of matched lexemes in query
> Of course, we need to think about better names for the functions, since
> ts_count is a bit ambiguous.

Getting agreed names was one reason for posting.

I don't know why these need to be an extension. I think they are of sufficiently general interest (and sufficiently lightweight) that we could just build them in.

cheers

andrew
Re: [HACKERS] Assert failure when rechecking an exclusion constraint
On Sat, Jun 04, 2011 at 05:49:31PM -0400, Tom Lane wrote:
> Noah Misch <n...@leadboat.com> writes:
>> I could not come up with an actual wrong behavior arising from this usage,
>> so I'll tentatively call it a false positive. reindex_index() could instead
>> unconditionally clear indexInfo->ii_Exclusion* before calling
>> index_build(), then pop pendingReindexedIndexes and call
>> IndexCheckExclusion() itself. Popping pendingReindexedIndexes as we go can
>> make the success of a reindex_relation() dependent on the order in which we
>> choose to rebuild indexes, though. Another option is to just remove the
>> assert as not worth preserving.
>
> Removing the assert would be a seriously bad idea IMO. I think we could just
> allow index_build to call ResetReindexProcessing() midstream (ie, before it
> calls IndexCheckExclusion). This does raise the question of whether the
> existing call to IndexCheckExclusion is badly placed and we should move it
> to after the index is fully rebuilt. That would allow us to avoid doing
> ResetReindexProcessing until the index is clearly safe to use. So in short,
> I'm thinking move lines 1760-1772 (in HEAD) of index.c to the end of
> index_build(), then insert a ResetReindexProcessing() call in front of them;
> or maybe only do ResetReindexProcessing there if we actually do call
> IndexCheckExclusion.

Sounds reasonable. Need to remove the index from pendingReindexedIndexes, not just call ResetReindexProcessing(). Also, wouldn't that specific construction make the catalog updates fail due to running in the table owner's security context? But certainly something along those lines will do.
Re: [HACKERS] ts_count
Excerpts from Andrew Dunstan's message of Sat Jun 04 08:47:02 -0400 2011:

> A snippet from the regression test:
>
> select ts_count(to_tsvector('managing managers manage peons managerially'),
>                 to_tsquery('managers | peon'));
>  ts_count
> ----------
>         4

Err, shouldn't this return 5?

--
Álvaro Herrera <alvhe...@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] ts_count
On 06/04/2011 08:59 PM, Alvaro Herrera wrote:
> Excerpts from Andrew Dunstan's message of Sat Jun 04 08:47:02 -0400 2011:
>> A snippet from the regression test:
>>
>> select ts_count(to_tsvector('managing managers manage peons managerially'),
>>                 to_tsquery('managers | peon'));
>>  ts_count
>> ----------
>>         4
>
> Err, shouldn't this return 5?

No. 'managerially' doesn't get the same stemming.

cheers

andrew
[HACKERS] heap vacuum cleanup locks
We've occasionally seen problems with VACUUM getting stuck for failure to acquire a cleanup lock due to, for example, a cursor holding a pin on the buffer page. In the worst case, this can cause an undetected deadlock, if the backend holding the buffer pin blocks trying to acquire a heavyweight lock that is in turn blocked by VACUUM.

A while back, someone (Greg Stark? me?) floated the idea of not waiting for the cleanup lock: if we can't get it immediately, or within some short period of time, we just skip the page and continue on.

Today I had what might be a better idea: don't try to acquire a cleanup lock at all. Instead, acquire an exclusive lock. After having done so, observe the pin count. If there are no other buffer pins, our exclusive lock is actually a cleanup lock, and we proceed as now. If other buffer pins do exist, then we can't defragment the page, but that doesn't mean no useful work can be done: we can still mark used line pointers dead, or dead line pointers unused. The defragmentation can be done either by the next VACUUM or by a HOT cleanup. We can even arrange - using existing mechanism - to leave behind a hint that the page is a good candidate for a HOT cleanup, by setting pd_prune_xid to, say, FrozenXID.

Like the idea of skipping pages on which we can't acquire a cleanup lock altogether, this should prevent VACUUM from getting stuck trying to lock a heap page. While buffer pins can be held for extended periods of time, I don't think there is any operation that holds a buffer content lock more than very briefly. Furthermore, unlike the idea of skipping the page altogether, we could use this approach even during an anti-wraparound vacuum.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
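[Editor's note: the decision procedure Robert describes can be sketched roughly as below. This is an illustrative Python model, not PostgreSQL code; the function and action names are invented for the sketch, and FROZEN_XID is just a stand-in for the pd_prune_xid hint value.]

```python
FROZEN_XID = 2  # stand-in for the FrozenXID hint suggested above

def vacuum_heap_page(other_pins):
    """Model of the proposal: take the exclusive content lock, then look
    at the pin count instead of waiting (possibly forever) for a cleanup
    lock. Returns the list of actions taken (names are hypothetical)."""
    actions = ["acquire_exclusive_lock"]
    if other_pins == 0:
        # No one else holds a pin: our exclusive lock is effectively a
        # cleanup lock, so do full processing as VACUUM does today.
        actions += ["mark_line_pointers", "defragment"]
    else:
        # Other pins exist: still mark used line pointers dead and dead
        # ones unused, but leave defragmentation to a later VACUUM or
        # HOT cleanup, hinted via pd_prune_xid.
        actions += ["mark_line_pointers", f"set_pd_prune_xid({FROZEN_XID})"]
    return actions
```

Either way VACUUM makes forward progress without blocking on the pin holder, which is the property that avoids the undetected-deadlock scenario described above.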