Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On 01.10.2015 00:27, Tom Lane wrote:
> Josh Berkus writes:
>> On 09/30/2015 03:10 PM, Tom Lane wrote:
>>> I'd be feeling a lot more positive about this whole thread if any people had stepped up and said "yes, *I* will put in a lot of grunt-work to make something happen here". The lack of any volunteers suggests strongly that this thread is a waste of time, just as the several similar ones before it have been.
>> Hmmm? Frost volunteered to stand up debbugs.
> So he did, and did anyone volunteer to put data into it, or to do ongoing curation of said data?

Yes, I did.

Greetings,
Torsten

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Etsuro Fujita
> Sent: Thursday, October 01, 2015 5:50 PM
> To: Kaigai Kouhei (海外 浩平); Robert Haas
> Cc: PostgreSQL-development; 花田茂
> Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
>
> On 2015/10/01 11:15, Kouhei Kaigai wrote:
>>> From: pgsql-hackers-ow...@postgresql.org
>>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
>>> On Mon, Sep 28, 2015 at 11:15 PM, Etsuro Fujita wrote:
>>>> I thought the same thing [1]. While I thought it was relatively easy to make changes to RefetchForeignRow that way for the foreign table case (scanrelid > 0), I was not sure how hard it would be to do so for the foreign join case (scanrelid == 0). So, I proposed to leave those changes for 9.6. I'll have a rethink on this issue along the lines of that approach.
>>>
>>> So, if we wanted to fix this in a way that preserves the spirit of what's there now, it seems to me that we'd want the FDW to return something that's like a whole row reference, but represents the output of the foreign join rather than some underlying base table. And then get the EPQ machinery to have the evaluation of the ForeignScan for the join, when it happens in an EPQ context, return that tuple. But I don't really have a good idea how to do that.
>>
>> Alternative built-in join execution? Once it is executed under the EPQ context, the built-in join node fetches a tuple from each of the inner and outer sides; each is eventually fetched from the EPQ slot, then the alternative join produces a result tuple. In case the FDW is not designed to handle the join by itself, it is a reasonable fallback, I think.
>>
>> I expect the FDW driver needs to handle EPQ recheck in the cases below:
>> * ForeignScan on a base relation that uses late row locking.
>> * ForeignScan on a join relation, even with early locking.
>
> I also think that approach would be one choice. But one thing I'm concerned about is plan creation for that by the FDW author; that would make life hard for the FDW author.

(That was proposed by me ...) I don't follow that standpoint, but it's not valuable to repeat the same discussion.

> So, I'd like to investigate another approach that preserves the applicability of late row locking to the join pushdown case as well as the spirit of what's there now. The basic idea is (1) add a new callback routine RefetchForeignJoinRow that refetches one foreign-join tuple from the foreign server, after locking remote tuples for the component foreign tables if required, and (2) call that routine in ExecScanFetch if the target scan is for a foreign join and the component foreign tables require to be locked lately; else just return the foreign-join tuple stored in the parent's state tree, which is the tuple mentioned by Robert, for preserving the spirit of what's there now. I think that ExecLockRows and EvalPlanQualFetchRowMarks should probably be modified so as to skip foreign tables involved in a foreign join.

As long as the FDW author can choose their best way to produce a joined tuple, it may be worth investigating. My comments are:
* ForeignRecheck is the best location to call RefetchForeignJoinRow when scanrelid == 0, not ExecScanFetch. Why do you try to add a special case for FDW in the common routine?
* It is the FDW's choice where the remote join tuple is kept, even though most FDWs will keep it in the private field of ForeignScanState.
Apart from the FDW requirement, custom scan/join needs recheckMtd to be called when scanrelid == 0 to avoid an assertion failure. I hope FDW has a symmetric structure; however, that is not a mandatory requirement for me.
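The fallback discussed here is essentially the standard EvalPlanQual recheck applied to a join: when a concurrent update is detected, re-evaluate the join qual against the latest (locked) versions of the component rows rather than the snapshot versions. A toy sketch of that recheck logic, in Python for brevity; every name here (epq_recheck_join, the row dicts) is illustrative and not a PostgreSQL API:

```python
def epq_recheck_join(outer_latest, inner_latest, join_qual):
    """Toy EvalPlanQual-style recheck for a join.

    outer_latest/inner_latest are the most recent (locked) versions of
    the component rows, as would be fetched into the EPQ slots;
    join_qual is the pushed-down join condition. Returns the joined
    tuple if the qual still holds, else None (the new row versions no
    longer qualify). All names are illustrative, not PostgreSQL APIs.
    """
    if outer_latest is None or inner_latest is None:
        return None                      # a component row vanished
    if not join_qual(outer_latest, inner_latest):
        return None                      # qual no longer satisfied
    return {**outer_latest, **inner_latest}

# A concurrent update changed inner.c1 from 10 to 11: the recheck now
# rejects the pair that the original snapshot join produced.
qual = lambda o, i: o["c1"] == i["c1"]
print(epq_recheck_join({"c1": 10, "a": 1}, {"c1": 10, "b": 2}, qual))
# → {'c1': 10, 'a': 1, 'b': 2}
print(epq_recheck_join({"c1": 10, "a": 1}, {"c1": 11, "b": 2}, qual))
# → None
```

The point of KaiGai's alternative-join fallback is that this recheck can be done by a local join node fed from the EPQ slots, so an FDW that cannot re-execute the join remotely still behaves correctly.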
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei
Re: [HACKERS] Parallel Seq Scan
On Wed, Sep 30, 2015 at 7:05 AM, Robert Haas wrote:
> On Thu, Sep 24, 2015 at 2:31 PM, Amit Kapila wrote:
>> [ parallel_seqscan_partialseqscan_v18.patch ]
>
> I spent a bit of time reviewing the heapam.c changes in this patch this evening, and I think that your attempt to add support for synchronized scans has some problems.

Thanks for the review, and I agree with all the suggestions provided by you. Fixed all of them in the attached patch (parallel_seqscan_heapscan_v1.patch).

I have rebased the partial seq scan patch (as attached with this mail) to test the synchronized scan and parallel heap scan patch. Also, I have added logging (parallel_seqscan_heapscan_test_v1.patch) to see the start positions during synchronized parallel heap scans. I have done various tests with parallel scans and found that it works fine with sync scans as well as without sync scans.

Basic test to verify the patch:

CREATE TABLE t1(c1, c2) AS SELECT g, repeat('x', 5) FROM generate_series(1, 1000) g;
CREATE TABLE t2(c1, c2) AS SELECT g, repeat('x', 5) FROM generate_series(1, 100) g;
set parallel_tuple_cost=0.001;
set max_parallel_degree=2;
set parallel_setup_cost=0;
SELECT count(*) FROM t1 JOIN t2 ON t1.c1 = t2.c1 AND t1.c1 BETWEEN 100 AND 200;

Run the above SELECT query from multiple clients and notice the start scan positions and the results of the query. It returns the expected results (a count of 101 rows).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

parallel_seqscan_heapscan_v1.patch Description: Binary data
parallel_seqscan_heapscan_test_v1.patch Description: Binary data
parallel_seqscan_partialseqscan_v19.patch Description: Binary data
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On 2015-09-29 15:49:28 -0400, Stephen Frost wrote:
> From Andres' reply, it looks like this is about the EXCLUDED pseudo relation which comes from the INSERT'd values themselves

Right.

> in which case, I tend to agree with his assessment that it doesn't make sense for those to be subject to RLS policies, given that it's all user-provided data, as long as the USING check is done on the row found to be conflicting and the CHECK constraints are dealt with correctly for any row added, which I believe is what we had agreed was the correct way to handle this case in prior discussions.

Yes, that's what I think as well. At this point we'll already have executed the insert RLS stuff on the EXCLUDED tuple:

    /*
     * Check any RLS INSERT WITH CHECK policies
     *
     * ExecWithCheckOptions() will skip any WCOs which are not of the kind
     * we are looking for at this point.
     */
    if (resultRelInfo->ri_WithCheckOptions != NIL)
        ExecWithCheckOptions(WCO_RLS_INSERT_CHECK, resultRelInfo, slot, estate);

and before executing the actual projection we also checked the existing tuple:

    ExecWithCheckOptions(WCO_RLS_CONFLICT_CHECK, resultRelInfo,
                         mtstate->mt_existing, mtstate->ps.state);

After the update triggers have, if applicable, run, we run the normal checks there as well, because it's just ExecUpdate():

    if (resultRelInfo->ri_WithCheckOptions != NIL)
        ExecWithCheckOptions(WCO_RLS_UPDATE_CHECK, resultRelInfo, slot, estate);

so I do indeed think that there's no point in layering RLS above EXCLUDED.

Greetings,

Andres Freund
Re: [HACKERS] Parallel Seq Scan
On Thu, Oct 1, 2015 at 2:35 AM, Kouhei Kaigai wrote:
> The Gather node was overlooked by readfuncs.c, even though it shall not be serialized actually.
> Also, it used an incompatible WRITE_xxx_FIELD() macro in outfuncs.c.
>
> The attached patch fixes both inconsistencies.

Thanks. You missed READ_DONE(), but fortunately my compiler noticed that oversight. Committed with that fix.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [DOCS] max_worker_processes on the standby
On Thu, Oct 1, 2015 at 7:48 AM, Alvaro Herrera wrote:
> oonish...@nttdata.co.jp wrote:
>
>> The below error messages were shown in the standby server log:
>>
>> FATAL: could not access status of transaction 9009
>> DETAIL: Could not read from file "pg_commit_ts/" at offset 90112: Success.
>> CONTEXT: xlog redo Transaction/COMMIT: 2015-09-30 15:52:41.924141+09
>> LOG: startup process (PID 23199) exited with exit code 1
>> LOG: terminating any other active server processes
>>
>> Before this FATAL, there were some INFO but annoying messages:
>>
>> LOG: file "pg_commit_ts/" doesn't exist, reading as zeroes
>
> Here's a patch.

I've not read the patch yet, but the patched server with track_commit_timestamp enabled caused the following PANIC error when I ran pgbench.

PANIC: could not access status of transaction 2457
DETAIL: Could not read from file "pg_commit_ts/" at offset 24576: Success.
STATEMENT: END;

The procedure to reproduce the PANIC error is:

1. Enable track_commit_timestamp
2. Start up the server
3. Run pgbench -i -s10
4. Run pgbench -j 4 -c 4 -T 30

Regards,

--
Fujii Masao
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On 2015/10/01 19:02, Kyotaro HORIGUCHI wrote:
> At Thu, 1 Oct 2015 17:50:25 +0900, Etsuro Fujita wrote in <560cf3d1.9060...@lab.ntt.co.jp>:
>>> From: pgsql-hackers-ow...@postgresql.org
>>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
>>> So, if we wanted to fix this in a way that preserves the spirit of what's there now, it seems to me that we'd want the FDW to return something that's like a whole row reference, but represents the output of the foreign join rather than some underlying base table. And then get the EPQ machinery to have the evaluation of the ForeignScan for the join, when it happens in an EPQ context, return that tuple. But I don't really have a good idea how to do that.
>>
>> So, I'd like to investigate another approach that preserves the applicability of late row locking to the join pushdown case as well as the spirit of what's there now. The basic idea is (1) add a new callback routine RefetchForeignJoinRow that refetches one foreign-join tuple from the foreign server, after locking remote tuples for the component foreign tables if required,
>
> It would be the case that at least one of the component relations of a foreign join is other than ROW_MARK_COPY, which is not possible so far on postgres_fdw.

Yes. To be exact, it's possible for the component relations to have rowmark methods other than ROW_MARK_COPY using GetForeignRowMarkType, in principle, but the server crashes ...

> For the case that some of the component relations are other than ROW_MARK_COPY, we might have to call RefetchForeignRow for such relations and construct the joined row involving the ROW_MARK_COPY relations.

You are saying that we should construct the joined row using an alternative local join execution plan?

> Indeed we could consider some logic for that case, but it is obvious that the case we should focus on now is a "foreign join" scan whose underlying foreign scans are all ROW_MARK_COPY, I think. A "foreign join" scan with ROW_MARK_COPY looks promising (to me), and in the future it would be able to coexist with the refetch mechanism, maybe the one in your mind, from this point of view... Maybe :p

I agree that the approach "foreign-join scan with ROW_MARK_COPY" would be promising.

Best regards,
Etsuro Fujita
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On 2015-09-29 11:52:14 -0700, Peter Geoghegan wrote:
> On Tue, Sep 29, 2015 at 8:24 AM, Andres Freund wrote:
>> So, took a bit longer than "tomorrow". I fought for a long while with a mysterious issue, which turned out to be a separate bug: the excluded relation was affected by row level security policies, which doesn't make sense.
>
> Why? You certainly thought that it made sense for conventional column permissions due to potential problems with before row insert triggers.

I don't see how those compare:

> I specifically remember discussing this with you off list (on IM, roughly a couple of weeks prior to initial commit). I recommended that we err towards a more restrictive behavior in the absence of any strong principle pushing us one way or the other. You seemed to agree.

I don't think this really is comparable. Comparing this with a plain INSERT or UPDATE, this would be akin to running RLS on the RETURNING tuple - which we currently don't. I think this was just a bug.

> I suppose that we have a tight enough grip on the targetlist that it's hard to imagine anything else being introduced there spuriously. I had thought that the pull-up did allow useful additional defense/sanitization, but that may not be an excellent argument. The only remaining argument is that my approach is closer to RETURNING, but that doesn't seem like an excellent argument.

I indeed don't think this is comparable to RETURNING - the pullup there is into an actual querytree above unrelated relations.

Greetings,

Andres Freund
Re: [HACKERS] pageinspect patch, for showing tuple data
In a message of 30 September 2015 13:49:00, Michael Paquier wrote:

>> - errmsg("input page too small (%d bytes)", raw_page_size)));
>> + errmsg("input page too small (%d bytes)", raw_page_size)));
> Please be careful of spurious diffs. Those just make the life of committers more difficult than it already is.

Oh, that's not a diff. That's me fixing the indentation of code I did not write :-)

> + General idea about output columns: lp_* attributes are about where the tuple is located inside the page; t_xmin, t_xmax, t_field3, t_ctid are about visibility of this tuple inside or outside of the transaction; t_infomask2, t_infomask have some information about properties of attributes stored in tuple data. t_hoff says where tuple data begins and t_bits says which attributes are NULL and which are not. Please notice that t_bits here is not the actual value that is stored in tuple data, but its text representation with '0' and '1' characters.
> I would remove that as well. htup_details.h contains all this information.

Made it even shorter. Just links and comments about the representation of t_bits.

> +test=# select * from heap_page_item_attrs(get_raw_page('pg_range', 0), 'pg_range'::regclass);
> +[long tuple data]
> This example is too large in characters per line; I think that you should cut a major part of the fields, why not just keep lp and t_attrs for example.

Did it.

> + rel_oid (oid): OID of the relation of the tuple we want to split
> + tuple_data (bytea): tuple raw data to split
> In the description of tuple_data_split, I am not sure it is worth listing all the arguments of the function like that. IMO, we should just say: those are the fields returned by "heap_page_items". tuple_data should as well be renamed to t_data.

Did it.

> + tuple attributes instead of one peace of raw tuple data. All other return
> This is not that "peaceful" to me. It should be "piece" :)

Oops ;-)

> + values[13] = PointerGetDatum(tuple_data_bytea);
> + nulls[13] = false;
> There is no point to set nulls[13] here.

Oh, you are right!

> +test=# select tuple_data_split('pg_range'::regclass, '\x400f1700ba074a0f520f'::bytea, 2304, 6, null);
> + tuple_data_split
> +---
> + {"\\x400f","\\x1700","\\x","\\xba07","\\x4a0f","\\x520f"}
> +(1 row)
> This would be more demonstrative if combined with heap_page_items, like that for example:
> SELECT tuple_data_split('pg_class'::regclass, t_data, t_infomask, t_infomask2, t_bits) FROM heap_page_items(get_raw_page('pg_class', 0));
> And actually this query crashes.

Oh... It crashed because I did not check that t_data can actually be NULL. There also was a bug in the original pageinspect, which did not get t_bits right when there was an OID in the tuple. I've fixed it too.

Here is the next patch in attachment.

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@
 OBJS = rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 	brinfuncs.o ginfuncs.o $(WIN32RES)
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..4fd3087 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,14 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len,
+					   TupleDesc tuple_desc, uint16 t_infomask,
+					   uint16 t_infomask2, bits8 *t_bits,
+					   bool do_detoast);
 
 /*
  * bits_to_text
@@ -122,8 +129,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 	HeapTuple	resultTuple;
 	Datum		result;
 	ItemId		id;
-	Datum		values[13];
-	bool		nulls[13];
+	Datum		values[14];
+	bool		nulls[14];
 	uint16		lp_offset;
 	uint16		lp_flags;
 	uint16		lp_len;
@@ -154,7 +161,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		lp_offset + lp_len <= raw_page_size)
 	{
[patch truncated in archive]
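For readers unfamiliar with what tuple_data_split has to do, here is a deliberately simplified model in Python: split one chunk of raw tuple data into per-attribute byte strings, consulting the t_bits null bitmap. It handles only fixed-length attributes; alignment padding, varlena headers, and detoasting (which the real C function must deal with) are all omitted, and the function name is borrowed for illustration only:

```python
def split_tuple_data(tuple_data: bytes, attr_lens, t_bits):
    """Toy split of raw tuple data into per-attribute byte strings.

    attr_lens: fixed byte length of each attribute (no varlena and no
    alignment padding; a deliberate simplification of the real code).
    t_bits: the textual null bitmap as shown by heap_page_items, '1'
    meaning the attribute is present and '0' meaning NULL; None when
    the tuple has no NULLs.
    """
    attrs, off = [], 0
    for i, alen in enumerate(attr_lens):
        if t_bits is not None and t_bits[i] == '0':
            attrs.append(None)           # NULL: consumes no data bytes
            continue
        attrs.append(tuple_data[off:off + alen])
        off += alen
    return attrs

# Three int2 attributes, the middle one NULL.
print(split_tuple_data(b'\x40\x0f\x17\x00', [2, 2, 2], '101'))
# → [b'@\x0f', None, b'\x17\x00']
```

The crash discussed above corresponds to the case where tuple_data itself is NULL, which this toy (like the first version of the patch) does not guard against.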
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
Robert Haas writes:
> - Bug numbers are sometimes preserved in commit messages, but they never make it into release notes. This actually seems like something we could improve pretty easily and without a lot of extra work (and also without a bug tracker). If every committer makes a practice of putting the bug number into the commit message, and the people who write the release notes then transcribe the information there, I bet that would be pretty useful to a whole lotta people.

The main reason bug numbers don't go into release notes is that only a fraction of our bugs actually have bug numbers. Very many problem reports show up via ordinary email traffic. If we had a mail-aware tracker and there were curators making sure that every problem-reporting thread got into the tracker, then it might become reasonable to cite bug numbers in the release notes.

Personally, I do make a practice of citing bug numbers in commit messages, but if you go through my commits, you'll soon agree that the coverage is too spotty to be useful in release notes. So I have not bothered to pester other committers to do likewise. Also, I suspect it will not be uncommon for tracker entries to appear only after the related commits, particularly for security bugs; so expecting the commit messages to be the links may be impractical anyway.

Playing devil's advocate ... would this really do much other than bloat the release notes? The entire assumption of this thread is that people don't, or don't want to, use the release notes to find out what got fixed; they'd rather search a tracker.

regards, tom lane
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On 10/01/2015 10:35 AM, Robert Haas wrote:
> On Wed, Sep 30, 2015 at 10:44 AM, Merlin Moncure wrote:
>> I'm not trolling in any way. I'm just challenging you to back up your blanket assertions with evidence. For example, your assertion that mailing lists are insufficient is simply stated and expected to be taken on faith: *How* is it insufficient and *what* do things look like in the new world? Be specific: glossing over these details doesn't really accomplish anything and avoids the careful examination that may suggest small tweaks to the current processes that could get similar results with a lot less effort. In this entire massive thread, so far only Josh has come up with what I'd consider to be actionable problem cases.
>
> I think that the mailing list is pretty much just as good as a bug tracker would be for finding the discussion about some particular bug. I mean, our web site has all the mails from the email thread, and that's where the discussion is, and if that discussion were in a bug tracker it wouldn't have any more information than what is on the email thread. The email thread also usually contains a message indicating whether a fix was committed.
>
> Where the mailing list is less adequate is:
>
> - If you want to see a list of all the bugs by status, you have to review every thread individually. It would be useful to have a way to filter out the bug reports that turn out not to be really bugs vs. the ones that are real bugs which have been fixed vs. the ones that are real bugs that have not been fixed. Associating status with each bug number would make this easier.
>
> - Bug numbers are sometimes preserved in commit messages, but they never make it into release notes. This actually seems like something we could improve pretty easily and without a lot of extra work (and also without a bug tracker). If every committer makes a practice of putting the bug number into the commit message, and the people who write the release notes then transcribe the information there, I bet that would be pretty useful to a whole lotta people.

A lot of errors get fixed without a bug ever being raised. If we want a tracker to represent some sort of historical record, all commits, or all non-feature commits if we don't want to track features, should be against tracker items. (In my former life I once had to send out a memo to developers that said "If you're not working on items in the tracker you're not doing your job.")

cheers

andrew
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On 2015-10-01 16:48:32 +0200, Magnus Hagander wrote:
> That would require people to actually use the bug form to submit the initial thread as well of course - which most developers don't do themselves today. But there is in itself nothing that prevents them from doing that, of course - other than a Small Amount Of Extra Work.

It'd be cool if there were a newbug@ or similar mail address that automatically also posted to -bugs or so.

> I think when a patch is directly related to a specific bug as reported through the webform, don't most committers already refer to that bug number? Maybe not every time, but at least most of the time? It's the many discussions that don't actually have a bug number and yet result in a patch that don't?

I think it's mentioned somewhere in the commit message most of the time - but not in an easy to locate way. If we'd agree on putting something like:

    Bug: #XXX
    Affected-Versions: 9.5-
    Fixed-Versions: 9.3-

in commit messages, that'd be a fair bit easier to get into the release notes..

Greetings,

Andres Freund
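If a fixed trailer format like the one above were adopted, tooling (release-note scripts, a tracker gateway) could pull the metadata out mechanically. A minimal sketch in Python, assuming exactly the field names proposed in the mail above; they are not an established convention, and a real implementation would want a stricter grammar (trailers only at the end of the message, etc.):

```python
import re

# Field names follow the proposal in the mail; illustrative only.
TRAILER_RE = re.compile(
    r'^(Bug|Affected-Versions|Fixed-Versions):\s*(.+?)\s*$',
    re.MULTILINE)

def parse_trailers(commit_message: str) -> dict:
    """Collect 'Key: value' trailer lines from a commit message."""
    return dict(TRAILER_RE.findall(commit_message))

msg = """Fix bogus widget frobbing.

Bug: #13456
Affected-Versions: 9.5-
Fixed-Versions: 9.3-
"""
print(parse_trailers(msg))
# → {'Bug': '#13456', 'Affected-Versions': '9.5-', 'Fixed-Versions': '9.3-'}
```

The example commit message and bug number are made up; the point is only that a rigid key/value shape makes the release-note transcription step scriptable.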
Re: [HACKERS] Parallel Seq Scan
On Thu, Oct 1, 2015 at 7:52 AM, Amit Kapila wrote:
> On Wed, Sep 30, 2015 at 7:05 AM, Robert Haas wrote:
>> On Thu, Sep 24, 2015 at 2:31 PM, Amit Kapila wrote:
>>> [ parallel_seqscan_partialseqscan_v18.patch ]
>>
>> I spent a bit of time reviewing the heapam.c changes in this patch this evening, and I think that your attempt to add support for synchronized scans has some problems.
>
> Thanks for the review and I agree with all the suggestions provided by you. Fixed all of them in attached patch (parallel_seqscan_heapscan_v1.patch).

Thanks.

Does heap_parallelscan_nextpage really need a pscan_finished output parameter, or can it just return InvalidBlockNumber to indicate end of scan? I think the latter can be done and would be cleaner.

There doesn't seem to be anything that ensures that everybody who's scanning the relation agrees on whether we're doing a synchronized scan. I think that heap_parallelscan_initialize() should take an additional Boolean argument, allow_sync. It should decide whether to actually perform a syncscan using the logic from initscan(), and then it should store a phs_syncscan flag into the ParallelHeapScanDesc. heap_beginscan_internal should set rs_syncscan based on phs_syncscan, regardless of anything else.

I think heap_parallel_rescan() is an unworkable API. When rescanning a synchronized scan, the existing logic keeps the original start-block so that the scan's results are reproducible, but no longer reports the scan position since we're presumably out of step with other backends. This isn't going to work at all with this API. I don't think you can swap out the ParallelHeapScanDesc for another one once you've installed it; the point of a rescan is that you are letting the HeapScanDesc (or ParallelHeapScanDesc) hold onto some state from the first time, and that doesn't work at all here. So, I think this function should just go away, and callers should be able to just use heap_rescan().
Now this presents a bit of a problem for PartialSeqScan, because, on a rescan, nodeGather.c completely destroys the DSM and creates a new one. I think we're going to need to change that. I think we can adapt the parallel context machinery so that after WaitForParallelWorkersToFinish(), you can again call LaunchParallelWorkers(). (That might already work, but I wouldn't be surprised if it doesn't quite work.) This would make rescans somewhat more efficient because we wouldn't have to destroy and re-create the DSM each time. It means that the DSM would have to stick around until we're totally done with the query, rather than going away when ExecGather() returns the last tuple, but that doesn't sound too bad. We can still clean up the workers when we've returned all the tuples, which I think is the most important thing.

This is obviously going to present some design complications for the as-yet-uncommitted code to push down PARAM_EXEC parameters, because if the new value takes more bytes to store than the old value, there won't be room to update the existing DSM in place. There are several possible solutions to that problem, but the one that appeals to me most right now is: just don't generate plans that would require that feature. It doesn't seem particularly appealing to me to put a Gather node on the inner side of a NestLoop -- at least not until we can execute that without restarting workers, which we're certainly some distance from today. And maybe not even then.

For initPlans, the existing code might be adequate, because I think we never re-evaluate those. And for subPlans, we can potentially handle cases where each worker can evaluate the subPlan separately below the Gather; we just can't handle cases where the subPlan attaches above the Gather and is used below it. Or, we can get around these limitations by redesigning the PARAM_EXEC pushdown mechanism in some way. But even if we don't, it's not crippling.
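Two of the API points in this review — returning a sentinel instead of a pscan_finished out-parameter, and agreeing once at initialization on a synchronized start block — can be sketched with a toy shared scan state. Python stands in for the shared-memory C structures, and the real code would additionally need atomic/locked access to the counter; all names here only mirror, and are not, the actual ParallelHeapScanDesc fields:

```python
INVALID_BLOCK = -1   # stand-in for InvalidBlockNumber

class ToyParallelScan:
    """Toy shared scan state handing out heap blocks to workers.

    start_block plays the role of the synchronized-scan start position
    chosen once at initialization; wrapping past the end of the
    relation and stopping after nblocks pages mimics how a
    synchronized scan covers every block exactly once.
    """
    def __init__(self, nblocks, start_block=0):
        self.nblocks = nblocks
        self.start_block = start_block
        self.allocated = 0   # blocks handed out so far

    def nextpage(self):
        """Return the next block number, or INVALID_BLOCK at end of
        scan: no separate 'finished' output parameter needed."""
        if self.allocated >= self.nblocks:
            return INVALID_BLOCK
        page = (self.start_block + self.allocated) % self.nblocks
        self.allocated += 1
        return page

scan = ToyParallelScan(nblocks=4, start_block=2)
pages = []
while (p := scan.nextpage()) != INVALID_BLOCK:
    pages.append(p)
print(pages)   # → [2, 3, 0, 1]: wraps around from the sync position
```

Keeping the start block inside the shared state is also what makes the rescan argument work: every participant, on every (re)scan, derives its pages from the same stored position instead of renegotiating it.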
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Freeze avoidance of very large table.
On Fri, Sep 18, 2015 at 8:14 PM, Masahiko Sawada wrote:
> On Fri, Sep 18, 2015 at 6:13 PM, Fujii Masao wrote:
>> On Fri, Sep 4, 2015 at 2:55 PM, Masahiko Sawada wrote:
>>> On Fri, Sep 4, 2015 at 10:35 AM, Fujii Masao wrote:
>>>> On Fri, Sep 4, 2015 at 2:23 AM, Masahiko Sawada wrote:
>>>>> On Thu, Aug 27, 2015 at 1:54 AM, Masahiko Sawada wrote:
>>>>>> On Thu, Aug 20, 2015 at 11:46 PM, Alvaro Herrera wrote:
>>>>>>> Jim Nasby wrote:
>>>>>>>> I think things like pageinspect are very different; I really can't see any use for those beyond debugging (and debugging by an expert at that).
>>>>>>> I don't think that necessarily means it must continue to be in contrib. Quite the contrary, I think it is a tool critical enough that it should not be relegated to be a second-class citizen as it is now (let's face it, being in contrib *is* second-class citizenship).
>>>>>> Attached patch is latest patch.
>>>>> The previous patch lacks some files for regression test. Attached fixed v12 patch.
>>>> The patch could be applied cleanly. "make check" could pass successfully. But "make check-world -j 2" failed.
>>> Thank you for looking at this patch. Could you tell me which test failed? make check-world -j 2 or more is done successfully in my environment.
>> I tried to do the test again, but initdb failed with the following error.
>>
>> creating template1 database in data/base/1 ... FATAL: invalid input syntax for type oid: "f"
>>
>> This error didn't happen when I tested before. So a commit which was applied recently might interfere with the patch.
>
> Thank you for testing!
> Attached fixed version patch.

Thanks for updating the patch! Here are comments.

+#include "access/visibilitymap.h"

visibilitymap.h doesn't need to be included in cluster.c.

- errmsg("table row type and query-specified row type do not match"),
+ errmsg("table row type and query-specified row type do not match"),

This change doesn't seem to be necessary.

+#define Anum_pg_class_relallfrozen 12

Why is pg_class.relallfrozen necessary? ISTM that there is no user of it now.

lazy_scan_heap() calls PageClearAllVisible() when the page containing dead tuples is marked as all-visible. Shouldn't PageClearAllFrozen() be called at the same time?

- "vm",	/* VISIBILITYMAP_FORKNUM */
+ "vfm",	/* VISIBILITYMAP_FORKNUM */

I wonder how much it's worth renaming only the file extension while there are many places where "visibility map" and "vm" are used, for example, log messages, function names, variables, etc.

Regards,

--
Fujii Masao
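The question about PageClearAllVisible()/PageClearAllFrozen() comes down to an invariant: a page may only be marked all-frozen if it is also all-visible, so clearing the all-visible bit has to clear the all-frozen bit too. A toy model of the two flags, with Python standing in for the real page-header bit operations (the class and method names are illustrative, not PostgreSQL's):

```python
ALL_VISIBLE = 0x1
ALL_FROZEN  = 0x2

class ToyPage:
    """Minimal model of a heap page's visibility flags."""
    def __init__(self):
        self.flags = 0

    def set_all_visible(self):
        self.flags |= ALL_VISIBLE

    def set_all_frozen(self):
        # Invariant: all-frozen implies all-visible.
        assert self.flags & ALL_VISIBLE, "all-frozen requires all-visible"
        self.flags |= ALL_FROZEN

    def clear_all_visible(self):
        # Clearing all-visible must also clear all-frozen; otherwise a
        # page that just acquired dead tuples would still claim to be
        # fully frozen, which is the hazard the review comment raises.
        self.flags &= ~(ALL_VISIBLE | ALL_FROZEN)

page = ToyPage()
page.set_all_visible()
page.set_all_frozen()
page.clear_all_visible()     # e.g. a dead tuple appeared on the page
print(page.flags)            # → 0: both bits gone together
```

If clear_all_visible only masked off ALL_VISIBLE, the page would be left in the inconsistent all-frozen-but-not-all-visible state, which is exactly why the review asks whether lazy_scan_heap() should clear both.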
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On Thu, Oct 1, 2015 at 4:35 PM, Robert Haaswrote: > On Wed, Sep 30, 2015 at 10:44 AM, Merlin Moncure > wrote: > > I'm not trolling in any way. I'm just challenging you to back up your > > blanket assertions with evidence. For example, you're assertion that > > mailing lists are insufficient is simply stated and expected to be > > taken on faith: *How* is it insufficient and *what* do things like in > > the new world? Be specific: glossing over these details doesn't > > really accomplish anything and avoids the careful examination that may > > suggest small tweaks to the current processes that could get similar > > results with a lot less effort. In this entire massive thread, so far > > only Josh has come up with what I'd consider to be actionable problem > > cases. > > I think that the mailing list is pretty much just as good as a bug > tracker would be for finding the discussion about some particular bug. > I mean, our web site has all the mails from the email thread, and > that's where the discussion is, and if that discussion were in a bug > tracker it wouldn't have any more information than what is on the > email thread. The email thread also usually contains a message > indicating whether a fix was committed. > > Where the mailing list is less adequate is: > > - If you want to see a list of all the bugs by status, you have to > review every thread individually. It would be useful to have a way to > filter out the bug reports that turn out not to be really bugs vs. the > ones that are real bugs which have been fixed vs. the ones that are > real bugs that have not been fixed. Associating status with each bug > number would make this easier. > I think that's a pretty good summary. A bug tracker can be used to add metadata about a bug, which can be very useful. Such as which versions are affected, and when it was fixed (or if it wasn't), which platforms it breaks, etc. But using the bugtracker for the discussion itself is usually not a win. 
In fact, I'd say in most cases it's counterproductive, because it forces a single tool upon everybody, instead of email, which allows each person to pick their own favourite tool. Using a bug tracker where all discussion happens in email removes that downside, and moves it back to the state where it doesn't help but also doesn't hinder the communication.

> - Bug numbers are sometimes preserved in commit messages, but they > never make it into release notes. This actually seems like something > we could improve pretty easily and without a lot of extra work (and > also without a bug tracker). If every committer makes a practice of > putting the bug number into the commit message, and the people who > write the release notes then transcribe the information there, I bet > that would be pretty useful to a whole lotta people.

That would require people to actually use the bug form to submit the initial thread as well of course - which most developers don't do themselves today. But there is in itself nothing that prevents them from doing that, of course - other than a Small Amount Of Extra Work.

I think when a patch is directly related to a specific bug as reported through the webform, don't most committers already refer to that bug number? Maybe not every time, but at least most of the time? It's the many discussions that don't actually have a bug number, yet result in a patch, where that doesn't happen.

-- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
Andres Freund writes:
> On 2015-10-01 11:07:12 -0400, Tom Lane wrote: >> As one of the people who do most of the gruntwork for release notes, >> I can tell you that that sort of fixed-format annotation is useless >> and usually annoying. I can see what branches you fixed the bug in >> anyway, from git_changelog's output.
> I know that I very frequently wish that information were in the commit > messages in an easily discernible way.

I'm inclined to think that commit messages are not really the right medium for that at all. Commit messages ought to be primarily text for consumption by humans. If we had a tracker, I think that it would be the place for fixed-format metadata, such as "fixed in version(s) x,y,z" and "fixed by commit(s) aaa,bbb,ccc". Expecting the tracker to link to the commit rather than vice versa would also solve a bunch of other problems like assigned-after-the-fact bug numbers. Not to mention plain old mistakes: once committed, a log message is immutable, but a tracker could be updated as needed.

If this process actually works, I could see the tracker becoming the source material for generating release notes, at least for bug-fix notes. We do it off the commit log now because that's all we've got, but the log isn't really ideal source material.

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump LOCK TABLE ONLY question
(sorry I lost the original thread somehow)

tgl wrote:
> Filip wrote: > > I'm running pg_dump constrained to one schema. It appears that pg_dump runs > > "LOCK TABLE %s IN ACCESS SHARE MODE" for each table. > > Naturally it makes sense, but... > > This schema has a table that serves as parent for thousands of child > > tables (via INHERITS). > > So effectively, to dump this schema, I have to LOCK all these tables, > > not only the parent.
> They'd all end up locked anyway, wouldn't they?

I would like to dump the whole schema in ONLY mode, including table data for only that schema, excluding data for child tables in other schemas. Why would they be locked then? Which part of pg_dump requires locking the child tables? Per the docs, "COPY only deals with the specific table named; it does not copy data to or from child tables." I just want to understand why there is LOCK TABLE and not LOCK TABLE ONLY.
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On Wed, Sep 30, 2015 at 10:44 AM, Merlin Moncure wrote:
> I'm not trolling in any way. I'm just challenging you to back up your > blanket assertions with evidence. For example, your assertion that > mailing lists are insufficient is simply stated and expected to be > taken on faith: *How* is it insufficient and *what* do things look like in > the new world? Be specific: glossing over these details doesn't > really accomplish anything and avoids the careful examination that may > suggest small tweaks to the current processes that could get similar > results with a lot less effort. In this entire massive thread, so far > only Josh has come up with what I'd consider to be actionable problem > cases.

I think that the mailing list is pretty much just as good as a bug tracker would be for finding the discussion about some particular bug. I mean, our web site has all the mails from the email thread, and that's where the discussion is, and if that discussion were in a bug tracker it wouldn't have any more information than what is on the email thread. The email thread also usually contains a message indicating whether a fix was committed.

Where the mailing list is less adequate is:

- If you want to see a list of all the bugs by status, you have to review every thread individually. It would be useful to have a way to filter out the bug reports that turn out not to be really bugs vs. the ones that are real bugs which have been fixed vs. the ones that are real bugs that have not been fixed. Associating status with each bug number would make this easier.

- Bug numbers are sometimes preserved in commit messages, but they never make it into release notes. This actually seems like something we could improve pretty easily and without a lot of extra work (and also without a bug tracker).
If every committer makes a practice of putting the bug number into the commit message, and the people who write the release notes then transcribe the information there, I bet that would be pretty useful to a whole lotta people.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Typo in /src/backend/optimizer/README
Etsuro Fujita writes:
> The following is a remark added to /src/backend/optimizer/README by > commit 8703059c6b55c427100e00a09f66534b6ccbfaa1, and IIUC, I think "LHS" > in the last sentence "We prevent that by forcing the min LHS for the > upper join to include B." should be "RHS".

Mmm, yeah, that's a typo. Will fix, thanks.

regards, tom lane
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Josh Berkus writes: > > On 09/30/2015 03:10 PM, Tom Lane wrote: > >> I'd be feeling a lot more positive about this whole thread if any people > >> had stepped up and said "yes, *I* will put in a lot of grunt-work to make > >> something happen here". The lack of any volunteers suggests strongly > >> that this thread is a waste of time, just as the several similar ones > >> before it have been.
>
> > Hmmm? Frost volunteered to stand up debbugs.
>
> So he did, and did anyone volunteer to put data into it, or to do ongoing > curation of said data? If we simply connect it up to the mailing lists, > and then stand back and wait for magic to happen, we will not ever have > anything that's any more useful than the existing mailing list archives.

I'm still planning to stand up debbugs, but as I said earlier in the thread, it's lower priority than the current work around getting beta out the door (and, generally, getting 9.5 into good shape). I've been following the thread and don't see any reason to stray from that plan. Once it's up, I'll provide information about how it gets populated, how to interact with it, etc. Based on the responses on this thread, it looks like we've got quite a few people who are willing to help manage it once it's up and running. I'll also be involved (either directly or by bringing in other resources) in the ongoing curation and management.

Thanks! Stephen
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On 2015-10-01 11:07:12 -0400, Tom Lane wrote:
> Andres Freund writes: > > On 2015-10-01 16:48:32 +0200, Magnus Hagander wrote: > >> That would require people to actually use the bug form to submit the > >> initial thread as well of course - which most developers don't do > >> themselves today. But there is in itself nothing that prevents them from > >> doing that, of course - other than a Small Amount Of Extra Work.
>
> > It'd be cool if there were a newbug@ or similar mail address that > > automatically also posted to -bugs or so.
>
> I believe that's spelled pgsql-b...@postgresql.org.

The point is that newbug would automatically assign a bug id, without going through the form.

> > I think it's mentioned somewhere in the commit message most of the time > > - but not in an easy to locate way. If we'd agree on putting something like: > > Bug: #XXX > > Affected-Versions: 9.5- > > Fixed-Versions: 9.3- > > in commit messages that'd be a fair bit easier to get into the release > notes..
>
> As one of the people who do most of the gruntwork for release notes, > I can tell you that that sort of fixed-format annotation is useless > and usually annoying. I can see what branches you fixed the bug in > anyway, from git_changelog's output.

I know that I very frequently wish that information were in the commit messages in an easily discernible way.

> Actually useful information of that sort would be commentary along the > lines of "The bug exists back to 8.4, but I only fixed it in 9.2 and > up because ."

That should definitely be there as well, agreed.

Greetings, Andres Freund
Re: [HACKERS] Freeze avoidance of very large table.
On Thu, Oct 1, 2015 at 9:44 AM, Fujii Masao wrote:
> I wonder how much it's worth renaming only the file extension while > there are many places where "visibility map" and "vm" are used, > for example, log messages, function names, variables, etc.

I'd be inclined to keep calling it the visibility map (vm) even if it also contains freeze information.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
Andres Freund writes:
> On 2015-10-01 16:48:32 +0200, Magnus Hagander wrote: >> That would require people to actually use the bug form to submit the >> initial thread as well of course - which most developers don't do >> themselves today. But there is in itself nothing that prevents them from >> doing that, of course - other than a Small Amount Of Extra Work.
> It'd be cool if there were a newbug@ or similar mail address that > automatically also posted to -bugs or so.

I believe that's spelled pgsql-b...@postgresql.org.

> I think it's mentioned somewhere in the commit message most of the time > - but not in an easy to locate way. If we'd agree on putting something like: > Bug: #XXX > Affected-Versions: 9.5- > Fixed-Versions: 9.3- > in commit messages that'd be a fair bit easier to get into the release notes..

As one of the people who do most of the gruntwork for release notes, I can tell you that that sort of fixed-format annotation is useless and usually annoying. I can see what branches you fixed the bug in anyway, from git_changelog's output. Actually useful information of that sort would be commentary along the lines of "The bug exists back to 8.4, but I only fixed it in 9.2 and up because ." Without the , you're just adding bloat to what's already a pretty large file.

regards, tom lane
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On 10/01/2015 07:48 AM, Magnus Hagander wrote:
> But using the bugtracker for the discussion itself is usually not a win.

I know you said usually, but:

> In fact, I'd say in most cases it's counterproductive because it forces a single tool upon everybody, instead of email which allows each person to pick their own favourite tool. Using a bugtracker where all discussion happens in email removes that downside, and moves it back to the state where it doesn't help but also doesn't hinder the communication.

This doesn't seem correct: roundup, debbugs, redmine, and RT all support email as the primary form of communication.

> That would require people to actually use the bug form to submit the initial thread as well of course - which most developers don't do themselves today. But there is in itself nothing that prevents them from doing that, of course - other than a Small Amount Of Extra Work.

It requires using a tracker somewhere within the email chain, which would automatically assign an issue number to the email.

Sincerely, JD

-- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you.
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On Thu, Oct 1, 2015 at 10:18 AM, Tom Lane wrote:
> Andres Freund writes: >> On 2015-10-01 11:07:12 -0400, Tom Lane wrote: >>> As one of the people who do most of the gruntwork for release notes, >>> I can tell you that that sort of fixed-format annotation is useless >>> and usually annoying. I can see what branches you fixed the bug in >>> anyway, from git_changelog's output.
>
>> I know that I very frequently wish that information were in the commit >> messages in an easily discernible way.
>
> I'm inclined to think that commit messages are not really the right medium > for that at all. Commit messages ought to be primarily text for > consumption by humans. If we had a tracker, I think that it would be the > place for fixed-format metadata, such as "fixed in version(s) x,y,z" and > "fixed by commit(s) aaa,bbb,ccc". Expecting the tracker to link to the > commit rather than vice versa would also solve a bunch of other problems > like assigned-after-the-fact bug numbers. Not to mention plain old > mistakes: once committed, a log message is immutable, but a tracker could > be updated as needed.
>
> If this process actually works, I could see the tracker becoming the > source material for generating release notes, at least for bug-fix > notes. We do it off the commit log now because that's all we've got, > but the log isn't really ideal source material.

At least some of that information could be generated by crawling and parsing commit emails, I think. The missing link, from what I can tell, is reliably tying the bug resolution to the relevant commit. Maybe we could teach the tracker to watch for "fixed by commit ABCDEF" in the bug list (and maybe -hackers, too), identifying the commit as a bug fix and associating it with the release branches. That gets crawled and rendered to a database for collection and reporting.

merlin
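[Editor's sketch: the crawler Merlin describes could start as a simple regex pass over list or commit-log text. This is a hypothetical illustration, not an existing tool; the "fixed by commit <hash>" phrasing and the bug-number format are assumptions.]

```python
import re

# Assumed conventions: a "#NNNNN" bug number as assigned by the report form,
# and a "fixed by commit <hash>" phrase in follow-up mail or commit messages.
BUG_RE = re.compile(r"#(\d{4,6})\b")
FIX_RE = re.compile(r"fixed by commit ([0-9a-f]{7,40})", re.IGNORECASE)

def extract_links(message_body):
    """Return (bug_numbers, commit_hashes) mentioned in one email body."""
    bugs = BUG_RE.findall(message_body)
    commits = FIX_RE.findall(message_body)
    return bugs, commits

mail = "Bug #13440 appears to be fixed by commit 8703059c6b."
bugs, commits = extract_links(mail)
# bugs -> ['13440'], commits -> ['8703059c6b']
```

A crawler would run this over archived messages and store the (bug, commit) pairs in a database for the status reporting Merlin mentions.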
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On 10/01/2015 08:18 AM, Tom Lane wrote:
> I'm inclined to think that commit messages are not really the right medium for that at all. Commit messages ought to be primarily text for consumption by humans. If we had a tracker, I think that it would be the place for fixed-format metadata, such as "fixed in version(s) x,y,z" and "fixed by commit(s) aaa,bbb,ccc". Expecting the tracker to link to the commit rather than vice versa would also solve a bunch of other problems like assigned-after-the-fact bug numbers. Not to mention plain old mistakes: once committed, a log message is immutable, but a tracker could be updated as needed.

What I imagined was this: TGL commits foo, and part of the commit message says: Status: Committed. Then a commit hook is fired from git to the tracker from a fixed address. That message would say:

Git commit $hash
Status: Committed

Which would not only link to the specific commit but also automatically close the ticket with a status of Committed. Does that make sense for -hackers? It seems like it would take a load off, but I am not the one in it every day.

> If this process actually works, I could see the tracker becoming the source material for generating release notes, at least for bug-fix notes. We do it off the commit log now because that's all we've got, but the log isn't really ideal source material.

Yep.

-- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development. Announcing "I'm offended" is basically telling the world you can't control your own emotions, so everyone else should do it for you.
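[Editor's sketch: the hook message JD imagines could be generated roughly like this. It is hypothetical: the "Bug: #NNN" trailer convention and the tracker mail format are assumptions, not an existing PostgreSQL mechanism; a real version would run from a git post-receive hook and mail the result to the tracker's control address.]

```python
import re

def tracker_update(commit_hash, commit_message):
    """Build the tracker notification for one commit: link the commit
    and mark the referenced ticket Committed."""
    # Assumed convention: commits fixing a tracked bug carry a
    # "Bug: #NNN" trailer in the commit message.
    m = re.search(r"^Bug:\s*#(\d+)", commit_message, re.MULTILINE)
    if m is None:
        return None  # commit doesn't reference a tracked bug; send nothing
    return (f"Git commit {commit_hash}\n"
            f"Bug: #{m.group(1)}\n"
            f"Status: Committed\n")

msg = "Fix typo in optimizer README.\n\nBug: #13647\n"
update = tracker_update("8703059c6b55c427", msg)
```

The tracker side would parse this fixed-format mail and close the ticket, which keeps the human-readable commit message free of metadata, as Tom prefers.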
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On 10/01/2015 07:55 AM, Tom Lane wrote:
> Playing devil's advocate ... would this really do much other than bloat > the release notes? The entire assumption of this thread is that people > don't, or don't want to, use the release notes to find out what got fixed; > they'd rather search a tracker.

It's not a question of "rather", it's a question of how searchable the release notes are, which is "not really at all". Yes, you can scan the release notes for the latest update, but consider users who have an issue and are running 9.2.7. Reasonably enough, they want to know that their issue is fixed in 9.2.13 (or in 9.4, if it turns out to be a feature, not a bug) before they ask their boss for a downtime. Figuring that out now is really hard.

I tried to tackle this three or four years ago, by writing a tool which would slurp the release notes and put them into a full-text search database. This turned out to be very hard to do; our formatting for the release notes makes it very difficult for an automated import program to interpret (SGML doesn't help), especially on point releases to old versions. It also turned out that the resulting database was useful mostly to me, because you had to figure out what terms to search on based on the bug report in front of you. As a result, it never went online.

So today, the only time the release notes are useful for "is this issue fixed or not" is when a release note message mentions the specific error message the user is getting, which is a minority of the time. So in addition to what Haas mentions, I think we want to be able to link the release notes to the original issues in our hypothetical bug tracker.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
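[Editor's sketch: the kind of release-note search tool Josh describes, in miniature. The note texts below are invented examples; a real tool would first have to parse the SGML release notes, which Josh notes is the hard part.]

```python
from collections import defaultdict

# Toy corpus: (fixing version, note text) pairs, standing in for parsed notes.
NOTES = [
    ("9.2.13", "Fix planner crash with nested LATERAL references"),
    ("9.2.12", "Avoid deadlock between readers and vacuum on GIN indexes"),
    ("9.4.4",  "Fix pg_dump to handle inherited tables in other schemas"),
]

def build_index(notes):
    """Naive inverted index: lowercase token -> set of fixing versions."""
    index = defaultdict(set)
    for version, text in notes:
        for token in text.lower().split():
            index[token].add(version)
    return index

def fixed_in(index, *terms):
    """Versions whose note text mentions every search term."""
    hits = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*hits) if hits else set()

index = build_index(NOTES)
fixed_in(index, "planner", "crash")   # -> {"9.2.13"}
```

This also shows Josh's second problem: the search only works if the user guesses the wording the note happens to use, which is why linking notes to original bug reports would help.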
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On 10/01/2015 05:10 PM, Andres Freund wrote:
> On 2015-10-01 11:07:12 -0400, Tom Lane wrote: >> Andres Freund writes: >>> On 2015-10-01 16:48:32 +0200, Magnus Hagander wrote: That would require people to actually use the bug form to submit the initial thread as well of course - which most developers don't do themselves today. But there is in itself nothing that prevents them from doing that, of course - other than a Small Amount Of Extra Work.
>>
>>> It'd be cool if there were a newbug@ or similar mail address that >>> automatically also posted to -bugs or so.
>>
>> I believe that's spelled pgsql-b...@postgresql.org.
>
> The point is that newbug would automatically assign a bug id, without > going through the form.

If we only want that, we can trivially implement it on the mailserver side by asking the backend database sequence for a bug id and rewriting the subject. But given that debbugs is on the radar, I'm not sure we need it.

Stefan
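[Editor's sketch: the mailserver-side subject rewrite Stefan mentions. Hypothetical illustration only: the tag format is an assumption, and the in-process counter stands in for the backend database sequence he refers to.]

```python
import itertools

# Stand-in for the backend database sequence that would hand out bug ids.
_bug_seq = itertools.count(13650)

def assign_bug_id(subject):
    """Rewrite an incoming mail subject with a freshly assigned bug number,
    the way a mailserver hook (e.g. a milter) might.  Subjects that already
    carry a tag are left alone so replies keep their number."""
    if subject.startswith("[BUG #"):
        return subject
    bug_id = next(_bug_seq)
    return f"[BUG #{bug_id}] {subject}"

assign_bug_id("pg_dump locks child tables")
# -> "[BUG #13650] pg_dump locks child tables"
```

Threading then works for free: every reply quotes the tagged subject, so the tracker can file follow-ups under the same id.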
Re: [HACKERS] [DOCS] max_worker_processes on the standby
Fujii Masao wrote:
> I've not read the patch yet, but the patched server with track_commit_timestamp > enabled caused the following PANIC error when I ran pgbench.

Ah, that was a stupid typo: I used || instead of &&. Fixed that. I also changed DeactivateCommitTs() so that it removes all slru segments instead of leaving the last one around (which is what SimpleLruTruncate was doing). This was noticeable when you ran a server with the feature enabled (which created some files), then disabled it (which removed all but the last) and ran for some more xacts; then enabled it again (which created a new file, far ahead of the previously existing one). Since the feature has enough protections that it doesn't have a problem with no files at all being present, this works correctly. Note there is no WAL-logging of this operation: it's not necessary AFAICS, because the same deactivation routine would be called again in recovery and in XLOG_PARAMETER_CHANGE, so it should be safe.

And pushed. Thanks!

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On Thu, Oct 1, 2015 at 12:09 PM, Josh Berkus wrote:
> On 10/01/2015 07:55 AM, Tom Lane wrote: >> Playing devil's advocate ... would this really do much other than bloat >> the release notes? The entire assumption of this thread is that people >> don't, or don't want to, use the release notes to find out what got fixed; >> they'd rather search a tracker.
>
> It's not a question of "rather", it's a question of how searchable the > release notes are, which is "not really at all". Yes, you can scan the > release notes for the latest update, but consider users who have an > issue and are running 9.2.7. Reasonably enough, they want to know that > their issue is fixed in 9.2.13 (or in 9.4 if it turns out to be a > feature, not a bug) before they ask their boss for a downtime. Figuring > that out now is really hard.

Yeah -- so maybe it's the wrong path. The bugs/commits lists are very parseable for important elements and should be able to be slurped into a database for tracking and further insertion of metadata. A 'commit tracker', if you will; it would organize commits and relevant bug reports (so long as they could be linked by certain conventions). It's a read-only system except for whatever human inputs you'd want to arrange for other processes (such as generating release notes, which might require cleaned-up language).

merlin
Re: [HACKERS] Potential GIN vacuum bug
On Tue, Sep 1, 2015 at 8:05 AM, Robert Haas wrote:
> On Sun, Aug 30, 2015 at 6:57 PM, Tom Lane wrote:
> >> But we would still have to deal with the fact that an unconditional acquire attempt by the backends will cause a vacuum to cancel itself, which is undesirable.
> >
> > Good point.
> >
> >> If we define a new namespace for this lock (like the relation extension lock has its own namespace) then perhaps the cancellation code could be made to not cancel on that condition. But that too seems like a lot of work to backpatch.
> >
> > We could possibly teach the autocancel logic to distinguish this lock type from others without using a new namespace. That seems a bit klugy, but maybe better than adding a new namespace. (For example, there are probably only a couple of modes in which we take page-level locks at present. Choosing a currently unused, but self-exclusive, mode for taking such a lock might serve.)
>
> That seems like a pretty big kludge to me; it will work until it doesn't.
>
> Adding a new lock type (similar to "relation extension") would address a lot of my concern with the heavyweight lock approach. It strikes me that trying to grab a lock on the index in what's basically a pretty low-level operation here could have a variety of unintended consequences. The autovacuum thing is one; but consider also the risk of new deadlock scenarios resulting from a lock upgrade. Those deadlocks would likely be, to use Peter Geoghegan's term, unprincipled. The locking regime around indexes is already pretty complicated, and I'm skeptical about the idea that we can complicate it further without any blowback.

Is the locking around indexes summarized someplace? About the best thing I could come up with was to do a "git grep LockRelat" and then look for lines where the first argument had a name that seemed likely to refer to an index.

Cheers, Jeff
Re: [HACKERS] Freeze avoidance of very large table.
On 10/01/2015 07:43 AM, Robert Haas wrote:
> On Thu, Oct 1, 2015 at 9:44 AM, Fujii Masao wrote: >> I wonder how much it's worth renaming only the file extension while >> there are many places where "visibility map" and "vm" are used, >> for example, log messages, function names, variables, etc.
>
> I'd be inclined to keep calling it the visibility map (vm) even if it > also contains freeze information.

-1 to rename. Visibility Map is a perfectly good name.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On Thu, Oct 1, 2015 at 3:42 AM, Andres Freund wrote:
> Yes, that's what I think as well. At this point we'll already have executed insert RLS stuff on the EXCLUDED tuple:
>
>     /*
>      * Check any RLS INSERT WITH CHECK policies
>      *
>      * ExecWithCheckOptions() will skip any WCOs which are not of the kind
>      * we are looking for at this point.
>      */
>     if (resultRelInfo->ri_WithCheckOptions != NIL)
>         ExecWithCheckOptions(WCO_RLS_INSERT_CHECK,
>                              resultRelInfo, slot, estate);
>
> and before executing the actual projection we also checked the existing tuple:
>
>     ExecWithCheckOptions(WCO_RLS_CONFLICT_CHECK, resultRelInfo,
>                          mtstate->mt_existing,
>                          mtstate->ps.state);
>
> after the update triggers have, if applicable, run, we run the normal checks there as well because it's just ExecUpdate():
>
>     if (resultRelInfo->ri_WithCheckOptions != NIL)
>         ExecWithCheckOptions(WCO_RLS_UPDATE_CHECK,
>                              resultRelInfo, slot, estate);
>
> so I do indeed think that there's no point in layering RLS above EXCLUDED.

I see your point, I think. It might be a problem if we weren't already making the statement error out, but we are.

However, we're checking the excluded tuple (the might-be-inserted, might-be-excluded tuple that reflects before row insert trigger effects) with WCO_RLS_INSERT_CHECK, not WCO_RLS_UPDATE_CHECK. The WCO_RLS_UPDATE_CHECK applies to the tuple to be appended to the relation (the tuple that an UPDATE makes supersede some existing tuple, a new row version). We all seem to be in agreement that excluded.* ought to be subject to column-level privilege enforcement, mostly due to possible leaks with before row insert triggers (these could be SoD; a malicious UPSERT could be written a certain way). None of the checks in the code above are the exact RLS equivalent of the principle we have for column privileges, AFAICT, because update-applicable policies (everything but insert-applicable policies, actually) are not checked against the excluded tuple.
Shouldn't select-applicable policies also be applied to the excluded tuples, just as with UPDATE ... FROM "join from" tables, which excluded is kinda similar to? I'm not trying to be pedantic; I just don't grok the underlying principles here. Couldn't a malicious WHERE clause leak the excluded.* tuple contents (and cause the UPDATE to not proceed) before the WCO_RLS_CONFLICT_CHECK call site was reached, while also preventing it from ever actually being reached (with a malicious function that returns false after stashing excluded.* elsewhere)? You can put volatile functions in UPDATE WHERE clauses, even if it is generally a bad idea. Perhaps I'm simply not following you here, though.

I think that this is one challenge with having per-command policies in a system that checks permissions dynamically (not during parse analysis). Note that I'm not defending the status quo of the master branch -- I'm just a little uneasy about what the ideal, least surprising behavior is here.

-- Peter Geoghegan
Re: [HACKERS] pg_dump LOCK TABLE ONLY question
On Thu, Oct 1, 2015 at 10:43 PM, Filip Rembiałkowski wrote:
> I just want to understand why there is LOCK TABLE and not LOCK TABLE ONLY.

It seems to me that you'd still want to use LOCK TABLE, particularly if the dump is only done on a subset of tables, using --table for example.

-- Michael
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On Thu, Oct 1, 2015 at 3:53 AM, Andres Freund wrote:
>> I specifically remember discussing this with you off list (on IM, >> roughly a couple of weeks prior to initial commit). I recommended that >> we err towards a more restrictive behavior in the absence of any >> strong principle pushing us one way or the other. You seemed to agree.
>
> I don't think this really is comparable. Comparing this with a plain > INSERT or UPDATE, this would be akin to running RLS on the RETURNING > tuple - which we currently don't.
>
> I think this was just a bug.

Maybe that's the problem here; I still thought that we were planning on changing RLS in this regard, but it actually seems we changed course, looking at the 9.5 open items list. I would say that that's a clear divergence between RLS and column privileges. That might be fine, but it doesn't match my prior understanding of RLS (or, more accurately, of how it was likely to change pre-release). If that's the design that we want for RLS across the board, then I'm happy to defer to that decision.

-- Peter Geoghegan
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On Thu, Oct 1, 2015 at 4:17 PM, Andres Freund wrote:
> On 2015-10-01 16:13:12 -0700, Peter Geoghegan wrote: >> However, we're checking the excluded tuple (the might-be-inserted, >> might-be-excluded tuple that reflects before row insert trigger >> effects) with WCO_RLS_INSERT_CHECK, not WCO_RLS_UPDATE_CHECK. The >> WCO_RLS_UPDATE_CHECK applies to the tuple to be appended to the >> relation (the tuple that an UPDATE makes supersede some existing >> tuple, a new row version).
>
> You can already see the effects of an INSERT modified by before triggers > via RETURNING. No?

I'm not saying that I agree with the decision to not do anything special with RLS + RETURNING in general. I'm also not going to say that I disagree with it. As I said, I missed that decision until just now. I agree that it's obviously true that what you propose is consistent with what is now considered ideal behavior for RLS (that's what I get from the wiki page comments on RLS + RETURNING).

FWIW, I think that this technically wasn't a bug, because it was based on a deliberate design decision that I thought (not without justification) was consistent with what we wanted for RLS across the board. But, again, I'm happy to go along with what you say in light of this new information.

-- Peter Geoghegan
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On Thu, Oct 1, 2015 at 4:26 PM, Peter Geoghegan wrote: >> You can already see the effects of an INSERT modified by before triggers >> via RETURNING. No? > > I'm not saying that I agree with the decision to not do anything > special with RLS + RETURNING in general. I'm also not going to say > that I disagree with it. As I said, I missed that decision until just > now. I agree that it's obviously true that what you propose is > consistent with what is now considered ideal behavior for RLS (that's > what I get from the wiki page comments on RLS + RETURNING). I see now that commit 4f3b2a8883 changed things for UPDATE and DELETE statements, but not INSERT statements. I guess my unease is because that isn't entirely consistent with INSERT + RETURNING and the GRANT system. Logically, the only possible justification for our long-standing INSERT and RETURNING behavior with GRANT (the fact that it requires SELECT privilege for rows returned, just like UPDATE and DELETE) is that before row insert triggers could do something secret (e.g. they could be security definer). It doesn't seem to be too much of a stretch to suppose the same should apply with RLS. -- Peter Geoghegan
Re: [HACKERS] Idea for improving buildfarm robustness
I wrote: > It strikes me that a different approach that might be of value would > be to re-read postmaster.pid and make sure that (a) it's still there > and (b) it still contains the current postmaster's PID. This would > be morally equivalent to what Jim suggests above, and it would dodge > the checkpointer-destroys-the-evidence problem, and it would have the > additional advantage that we'd notice when a brain-dead DBA decides > to manually remove postmaster.pid so he can start a new postmaster. > (It's probably far too late to avoid data corruption at that point, > but better late than never.) > This is still not bulletproof against all overwrite-with-a-backup > scenarios, but it seems like a noticeable improvement over what we > discussed yesterday. Here's a rewritten patch that looks at postmaster.pid instead of pg_control. It should be effectively the same as the prior patch in terms of response to directory-removal cases, and it should also catch many overwrite cases. regards, tom lane diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index baa43b2..498ebfa 100644 *** a/src/backend/postmaster/postmaster.c --- b/src/backend/postmaster/postmaster.c *** ServerLoop(void) *** 1602,1610 fd_set readmask; int nSockets; time_t now, last_touch_time; ! last_touch_time = time(NULL); nSockets = initMasks(); --- 1602,1611 fd_set readmask; int nSockets; time_t now, + last_lockfile_recheck_time, last_touch_time; ! last_lockfile_recheck_time = last_touch_time = time(NULL); nSockets = initMasks(); *** ServerLoop(void) *** 1754,1772 if (StartWorkerNeeded || HaveCrashedWorker) maybe_start_bgworker(); - /* - * Touch Unix socket and lock files every 58 minutes, to ensure that - * they are not removed by overzealous /tmp-cleaning tasks. We assume - * no one runs cleaners with cutoff times of less than an hour ... 
- */ - now = time(NULL); - if (now - last_touch_time >= 58 * SECS_PER_MINUTE) - { - TouchSocketFiles(); - TouchSocketLockFiles(); - last_touch_time = now; - } - #ifdef HAVE_PTHREAD_IS_THREADED_NP /* --- 1755,1760 *** ServerLoop(void) *** 1793,1798 --- 1781,1828 /* reset flag so we don't SIGKILL again */ AbortStartTime = 0; } + + /* + * Lastly, check to see if it's time to do some things that we don't + * want to do every single time through the loop, because they're a + * bit expensive. Note that there's up to a minute of slop in when + * these tasks will be performed, since DetermineSleepTime() will let + * us sleep at most that long. + */ + now = time(NULL); + + /* + * Once a minute, verify that postmaster.pid hasn't been removed or + * overwritten. If it has, we commit hara-kiri. This avoids having + * postmasters and child processes hanging around after their database + * is gone, and maybe causing problems if a new database cluster is + * created in the same place. It also provides some protection + * against a DBA foolishly removing postmaster.pid and manually + * starting a new postmaster. Data corruption is likely to ensue from + * that anyway, but we can minimize the damage by aborting ASAP. + */ + if (now - last_lockfile_recheck_time >= 1 * SECS_PER_MINUTE) + { + if (!RecheckDataDirLockFile()) + { + ereport(LOG, + (errmsg("performing immediate shutdown because data directory lock file is invalid"))); + kill(MyProcPid, SIGQUIT); + } + last_lockfile_recheck_time = now; + } + + /* + * Touch Unix socket and lock files every 58 minutes, to ensure that + * they are not removed by overzealous /tmp-cleaning tasks. We assume + * no one runs cleaners with cutoff times of less than an hour ... 
+ */ + if (now - last_touch_time >= 58 * SECS_PER_MINUTE) + { + TouchSocketFiles(); + TouchSocketLockFiles(); + last_touch_time = now; + } } } diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c index f0099d3..b6270e1 100644 *** a/src/backend/utils/init/miscinit.c --- b/src/backend/utils/init/miscinit.c *** AddToDataDirLockFile(int target_line, co *** 1202,1207 --- 1202,1277 } + /* + * Recheck that the data directory lock file still exists with expected + * content. Return TRUE if the lock file appears OK, FALSE if it isn't. + * + * We call this periodically in the postmaster. The idea is that if the + * lock file has been removed or replaced by another postmaster, we should + * do a panic database shutdown. Therefore, we should return TRUE if there + * is any doubt: we do not want to cause a panic shutdown unnecessarily. + * Transient failures like EINTR or ENFILE should not cause us to fail. + * (If there really is something wrong, we'll detect it on a future
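The once-a-minute lock-file recheck added by the patch can be sketched, outside PostgreSQL, as a minimal stand-alone routine. The only assumption carried over is the file format (the owning PID on the first line, as in postmaster.pid); all function names here are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for creating the lock file: the real
 * postmaster.pid has several lines, but only the first (the PID)
 * matters for this sketch. */
bool write_pid_lockfile(const char *path, long pid)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return false;
    fprintf(f, "%ld\n", pid);
    fclose(f);
    return true;
}

/* Return true if the lock file still exists and its first line still
 * holds expected_pid.  A production version would be careful to treat
 * transient failures (EINTR, ENFILE, ...) as "still OK", so that a
 * panic shutdown is never triggered unnecessarily; this sketch keeps
 * the error handling simple. */
bool recheck_pid_lockfile(const char *path, long expected_pid)
{
    char buf[64];
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return false;     /* removed: a new cluster may own the directory */
    if (fgets(buf, sizeof(buf), f) == NULL)
    {
        fclose(f);
        return false;     /* empty/unreadable: treat as overwritten */
    }
    fclose(f);
    return strtol(buf, NULL, 10) == expected_pid;
}
```

In the patch, a false result from the real RecheckDataDirLockFile() leads to an immediate SIGQUIT-style shutdown, on the theory that aborting ASAP minimizes damage once the data directory has been pulled out from under the server.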
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On 2015-10-01 16:13:12 -0700, Peter Geoghegan wrote: > However, we're checking the excluded tuple (the might-be-inserted, > might-be-excluded tuple that reflects before row insert trigger > effects) with WCO_RLS_INSERT_CHECK, not WCO_RLS_UPDATE_CHECK. The > WCO_RLS_UPDATE_CHECK applies to the tuple to be appended to the > relation (the tuple that an UPDATE makes supersede some existing > tuple, a new row version). You can already see the effects of an INSERT modified by before triggers via RETURNING. No? Greetings, Andres Freund
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On 2015-10-01 16:26:07 -0700, Peter Geoghegan wrote: > FWIW, I think that this technically wasn't a bug Meh. In which scenario would a policy applied to EXCLUDED actually do anything reasonable?
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On Thu, Oct 1, 2015 at 4:49 PM, Andres Freund wrote: > On 2015-10-01 16:26:07 -0700, Peter Geoghegan wrote: >> FWIW, I think that this technically wasn't a bug > > Meh. In which scenario would a policy applied to EXCLUDED actually > do anything reasonable? I agree that it's very unlikely to matter. Consistency is something that is generally valued, though. I'm not going to object if you want to continue with committing something that changes excluded + RLS. I was just explaining my view of the matter. -- Peter Geoghegan
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
Hello, I have caught up on this thread, maybe. > > So, if we wanted to fix this in a way that preserves the spirit of > > what's there now, it seems to me that we'd want the FDW to return > > something that's like a whole row reference, but represents the output > > of the foreign join rather than some underlying base table. And then > > get the EPQ machinery to have the evaluation of the ForeignScan for > > the join, when it happens in an EPQ context, to return that tuple. > > But I don't really have a good idea how to do that. > > > > More thought seems needed here... > > > Alternative built-in join execution? > Once it is executed under the EPQ context, the built-in join node fetches > a tuple from both the inner and outer side. Each is eventually > fetched from the EPQ slot, then the alternative join produces a result > tuple. It seems quite similar to what Fujita-san is trying now by somehow *replacing* the "foreign join" scan node with an alternative local join plan during EPQ. I think what Robert suggests is a "foreign join" scan that behaves completely as an ordinary scan node in the executor. The current framework of foreign join pushdown seems a bit tricky because it incompletely emulates a local join on foreign scans. The mixture seems to be the root cause of this problem. 1. Somehow run local joins on the current EPQ tuples given by "foreign join" scans. 1.1 Somehow detect a running EPQ and switch the plan to run in ExecScanFetch or somewhere else. 1.2 Replace the "foreign join scan" node with the alternative local join node on ExecInit. (I don't like this.) 1.3 An in-core alternative local join executor for join pushdown? 2. Make the "foreign join" scan plan node completely compliant with the current executor semantics of an ordinary scan node. In other words, the node has a corresponding RTE_RELATION RTE, is marked with ROW_MARK_COPY on locking, and returns a slot with a tlist that contains the join result columns and the whole-row var on them. 
Then, EvalPlanQualFetchRowMarks gets the whole-row var and sets it into epqTuple for the corresponding *relid*. I prefer 2, but have no good idea how to do that now, either. > In case when FDW is not designed to handle join by itself, it is > a reasonable fallback I think. > > I expect FDW driver needs to handle EPQ recheck in the case below: > * ForeignScan on base relation and it uses late row locking. I think this is indisputable. > * ForeignScan on join relation, even if early locking. This could be unnecessary if the "foreign join" scan node can have its own rowmark of ROW_MARK_COPY. regards, At Thu, 1 Oct 2015 02:15:29 +, Kouhei Kaigai wrote in <9a28c8860f777e439aa12e8aea7694f80114d...@bpxm15gp.gisp.nec.co.jp> > > -Original Message- > > From: pgsql-hackers-ow...@postgresql.org > > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas > > Sent: Wednesday, September 30, 2015 6:55 AM > > To: Etsuro Fujita > > Cc: Kaigai Kouhei(海外 浩平); PostgreSQL-development; 花田茂 > > Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual > > > > On Mon, Sep 28, 2015 at 11:15 PM, Etsuro Fujita > > wrote: > > > I thought the same thing [1]. While I thought it was relatively easy to > > > make changes to RefetchForeignRow that way for the foreign table case > > > (scanrelid>0), I was not sure how hard it would be to do so for the > > > foreign > > > join case (scanrelid==0). So, I proposed to leave that changes for 9.6. > > > I'll have a rethink on this issue along the lines of that approach. > > > > Well, I spent some more time looking at this today, and testing it out > > using a fixed-up version of your foreign_join_v16 patch, and I decided > > that RefetchForeignRow is basically a red herring. That's only used > > for FDWs that do late row locking, but postgres_fdw (and probably many > > others) do early row locking, in which case RefetchForeignRow never > > gets called. 
Instead, the row is treated as a "non-locked source row" > > by ExecLockRows (even though it is in fact locked) and is re-fetched > > by EvalPlanQualFetchRowMarks. We should probably update the comment > > about non-locked source rows to mention the case of FDWs that do early > > row locking. > > > Indeed, select_rowmark_type() says ROW_MARK_COPY if GetForeignRowMarkType > callback is not defined. > > > Anyway, everything appears to work OK up to this point: we correctly > > retrieve the saved whole-rows from the foreign side and call > > EvalPlanQualSetTuple on each one, setting es_epqTuple[rti - 1] and > > es_epqTupleSet[rti - 1]. So far, so good. Now we call > > EvalPlanQualNext, and that's where we get into trouble. We've got the > > already-locked tuples from the foreign side and those tuples CANNOT > > have gone away or been modified because we have already locked them. > > So, all the foreign join needs to do is return the same tuple that it > > returned before: the EPQ recheck was triggered by some
Re: [HACKERS] Parallel Seq Scan
Hi Robert, the Gather node was overlooked by readfuncs.c, even though it should not actually be serialized. Also, it used an incompatible WRITE_xxx_FIELD() macro in outfuncs.c. The attached patch fixes both inconsistencies. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei > -Original Message- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas > Sent: Wednesday, September 30, 2015 2:19 AM > To: Amit Kapila > Cc: Kaigai Kouhei(海外 浩平); Haribabu Kommi; Gavin Flower; Jeff Davis; Andres > Freund; Amit Langote; Amit Langote; Fabrízio Mello; Thom Brown; Stephen Frost; > pgsql-hackers > Subject: Re: [HACKERS] Parallel Seq Scan > > On Tue, Sep 29, 2015 at 12:39 AM, Amit Kapila wrote: > > Attached patch is a rebased patch based on latest commit (d1b7c1ff) > > for Gather node. > > > > - I have to reorganize the defines in execParallel.h and .c. To keep > > ParallelExecutorInfo, in GatherState node, we need to include execParallel.h > > in execnodes.h which was creating compilation issues as execParallel.h > > also includes execnodes.h, so for now I have defined ParallelExecutorInfo > > in execnodes.h and instrumentation related structures in instrument.h. > > - Renamed parallel_seqscan_degree to degree_of_parallelism > > - Rename Funnel to Gather > > - Removed PARAM_EXEC parameter handling code, I think we can do this > > separately. > > > > I have to work more on partial seq scan patch for rebasing it and handling > > review comments for the same, so for now I am sending the first part of > > patch (which included Gather node functionality and some general support > > for parallel-query execution). > > Thanks for the fast rebase. > > This patch needs a bunch of cleanup: > > - The formatting for the GatherState node's comment block is unlike > that of surrounding comment blocks. It lacks the --- dividers, > and the indentation is not the same. 
Also, it refers to > ParallelExecutorInfo by the type name, but the other members by > structure member name. The convention is to refer to them by > structure member name, so please do that. > > - The naming of fs_workersReady is inconsistent with the other > structure members. The other members use all lower-case names, > separating words with dashes, but this one uses a capital letter. The > other members also don't prefix the names with anything, but this uses > a "fs_" prefix which I assume is left over from when this was called > FunnelState. Finally, this doesn't actually tell you when workers are > ready, just whether they were launched. I suggest we rename this to > "any_worker_launched". > > - Instead of moving the declaration of ParallelExecutorInfo, please > just refer to it as "struct ParallelExecutorInfo" in execnodes.h. > That way, you're not sucking these includes into all kinds of places > they don't really need to be. > > - Let's not create a new PARALLEL_QUERY category of GUC. Instead, > let's the GUC for the number of workers with under resource usage -> > asynchronous behavior. > > - I don't believe that shm_toc *toc has any business being part of a > generic PlanState node. At most, it should be part of an individual > type of PlanState, like a GatherState or PartialSeqScanState. But > really, I don't see why we need it there at all. It should, I think, > only be needed during startup to dig out the information we need. So > we should just dig that stuff out and keep pointers to whatever we > actually need - in this case the ParallelExecutorInfo, I think - in > the particular type of PlanState node that's at issue - here > GatherState. After that we don't need a pointer to the toc any more. > > - I'd like to do some renaming of the new GUCs. I suggest we rename > cpu_tuple_comm_cost to parallel_tuple_cost and degree_of_parallelism > to max_parallel_degree. > > - I think that a Gather node should inherit from Plan, not Scan. 
A > Gather node really shouldn't have a scanrelid. Now, admittedly, if > the only thing under the Gather is a Partial Seq Scan, it wouldn't be > totally bonkers to think of the Gather as scanning the same relation > that the Partial Seq Scan is scanning. But in any more complex case, > like where it's scanning a join, you're out of luck. You'll have to > set scanrelid == 0, I suppose, but then, for example, ExecScanReScan > is not going to work. In fact, as far as I can see, the only way > nodeGather.c is actually using any of the generic scan stuff is by > calling ExecInitScanTupleSlot, which is all of one line of code. > ExecEndGather fetches node->ss.ss_currentRelation but then does > nothing with it. So I think this is just a holdover from early > version of this patch where what's now Gather and PartialSeqScan were > a single node, and I think we should rip it out. > > - On a related note, the assertions in cost_gather() are both bogus > and should be
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On 2015-10-01 16:55:23 -0700, Peter Geoghegan wrote: > On Thu, Oct 1, 2015 at 4:49 PM, Andres Freund wrote: > > On 2015-10-01 16:26:07 -0700, Peter Geoghegan wrote: > >> FWIW, I think that this technically wasn't a bug > > > > Meh. In which scenario would a policy applied to EXCLUDED actually > > do anything reasonable? > > I agree that it's very unlikely to matter. Consistency is something > that is generally valued, though. I don't think you get my gist. I can't see how the current code can do anything sensible at all. What do you think is going to be the effect of an excluded row that doesn't meet security quals? Even if it worked in the sense that the correct data were accessed and everything - which I doubt is completely the case as things stand, given there's no actual scan node and stuff - you'd still have EXCLUDED.* being used in the projection for the new version of the tuple. As far as I can see the only correct thing you could do in that situation is error out. Greetings, Andres Freund
Re: [HACKERS] [DOCS] max_worker_processes on the standby
On Fri, Oct 2, 2015 at 3:12 AM, Alvaro Herrera wrote: > Fujii Masao wrote: > >> I've not read the patch yet, but the patched server with >> track_commit_timestamp >> enabled caused the following PANIC error when I ran pgbench. > > Ah, that was a stupid typo: I used || instead of &&. Fixed that. > > I also changed DeactivateCommitTs() so that it removes all slru segments > instead of leaving the last one around (which is what SimpleLruTruncate > was doing). This was noticeable when you ran a server with the feature > enabled (which created some files), then disabled it (which removed all > but the last) and ran for some more xacts; then enabled it again (which > created a new file, far ahead from the previously existing one). Since > the feature has enough protections that it doesn't have a problem with > no files at all being present, this works correctly. Note no > wal-logging of this operation: it's not necessary AFAICS because the > same deactivation routine would be called again in recovery and in > XLOG_PARAMETER_CHANGE, so it should be safe. What happens if pg_xact_commit_timestamp() is called in the standby after track_commit_timestamp is disabled in the master, DeactivateCommitTs() is called and all commit_ts files are removed in the standby? I tried that case and got the following assertion failure. TRAP: FailedAssertion("!(((oldestCommitTs) != ((TransactionId) 0)) == ((newestCommitTs) != ((TransactionId) 0)))", File: "commit_ts.c", Line: 307) LOG: server process (PID 11160) was terminated by signal 6: Aborted DETAIL: Failed process was running: select num, pg_xact_commit_timestamp(num::text::xid) from generate_series(1750, 1800) num The steps to reproduce the problem are: 1. Set up replication with track_commit_timestamp enabled. 2. Run several write transactions. 3. Disable track_commit_timestamp only in the master and wait for the XLOG_PARAMETER_CHANGE record to be replayed in the standby. 4. Run pg_xact_commit_timestamp() in the standby. 
Regards, -- Fujii Masao
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> -Original Message- > From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro HORIGUCHI > Sent: Friday, October 02, 2015 9:50 AM > To: Kaigai Kouhei(海外 浩平) > Cc: fujita.ets...@lab.ntt.co.jp; robertmh...@gmail.com; > pgsql-hackers@postgresql.org; shigeru.han...@gmail.com > Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual > > Hello, I had more consideration on this. > > > As long as the FDW author can choose their best way to produce a joined > > tuple, it may be worth investigating. > > > > My comments are: > > * ForeignRecheck is the best location to call RefetchForeignJoinRow > > when scanrelid==0, not ExecScanFetch. Why do you try to add a special > > case for FDW in the common routine? > > * It is the FDW's choice where the remote join tuple is kept, even though > > most FDWs will keep it on the private field of ForeignScanState. > > I think that scanrelid == 0 means that the node in focus is not a > scan node in current executor > semantics. EvalPlanQualFetchRowMarks fetches the possibly > modified row, then EvalPlanQualNext does the recheck for the new > row. Those are the roles of each function. > > By this criterion, recheck routines are not the place for > refetching; EvalPlanQualFetchRowMarks is. > I never said the FDW should refetch tuples in the recheck routine. All I suggest is that the projection to generate a joined tuple, and the recheck according to the pushed-down qualifier, are the role of the FDW driver, because it knows the best strategy to do the job. > Again, the problem here is that the "foreign join" scan node is > actually a scan node but it doesn't provide all the materials which > the executor expects for a scan node. So the way to fix this while > preserving the semantics comes down to two choices. > > 1. make the "foreign join" scan node behave as a complete scan >node. That is, EvalPlanQualFetchRowMarks can retrieve the >modified row version anyhow according to the type of row mark. > > 2. 
make the "foreign join" node actually a join node >which has subnodes, so that the "foreign join" node can reconstruct >the result row using the results of the subnodes on EPQ. >(ExecForeignJoinNode would cease to call subnodes if it is > actually a scan node) > > "3". Any other means to break the current semantics of joins and > scans in the executor, as you recommend. Some more adjustment > would be needed to go this way. > > > I don't know how the current design of FDW has been built, > especially about the join pushdown feature, so I may be missing > something, but I think as above for this issue. > It looks to me like all of them make the problem more complicated. I have never heard why the "foreign-join" scan node would find it difficult to construct a joined tuple using the EPQ slots that are already loaded. Regardless of early or late locking, the EPQ slots of the base relations are already filled up, aren't they? The whole mission of the "foreign-join" scan node is to return a joined tuple as if it had been produced by local join logic. A local join consumes two tuples and then generates one tuple. The "foreign-join" scan node can perform equivalently, even if it is under the EPQ recheck context. So, the job of the FDW driver is... Step-1) Fetch tuples from the EPQ slots of the base foreign relations to be joined. Please note that this is just a pointer reference. Step-2) Try to join these two (or more) tuples according to the join condition (which only the FDW knows, because it is kept in private). Step-3) If the result is valid, the FDW driver makes a projection from these tuples, then returns it. If you are concerned about re-invention of the code for each FDW, core can provide a utility routine to cover 95% of the FDW structure. I want to keep EvalPlanQualFetchRowMarks on a per-base-relation basis. It is a bad choice to consider joins at this point. > > Apart from the FDW requirement, custom-scan/join needs recheckMtd to be > > called when scanrelid==0 to avoid an assertion failure. I hope FDW has > > a symmetric structure; however, that is not a mandatory requirement for me. 
> > It wouldn't be needed if EvalPlanQualFetchRowMarks works as > expected. Is this wrong? > Yes, it does not work. The expected behavior of EvalPlanQualFetchRowMarks is to load the tuple to be rechecked onto the EPQ slot, using heap_fetch or a copied image. It works on a per-base-relation basis. Who can provide a projection to generate the joined tuple? It is a job of the individual plan-state-node to be walked on during EvalPlanQualNext(). Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei > regards, > > > At Thu, 1 Oct 2015 13:17:34 +, Kouhei Kaigai wrote in <9a28c8860f777e439aa12e8aea7694f80114d...@bpxm15gp.gisp.nec.co.jp> > > > > In case when FDW is not designed to handle join by itself, it is > > > > a reasonable fallback I think. > > > > > > > > I expect FDW driver needs to handle EPQ recheck in the case below: > > > > * ForeignScan on base relation and it uses late row locking. > > > > *
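As a toy illustration of the three steps described above — fetch the already-loaded EPQ tuples, test the pushed-down join qual, project a joined result — the FDW driver's work could be sketched like this. None of these types or names are PostgreSQL APIs; they are hypothetical stand-ins for the per-base-relation EPQ slot contents:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for tuples already sitting in the EPQ slots of
 * the two base foreign relations being joined. */
typedef struct { int id; int payload; } OuterTup;
typedef struct { int id; int payload; } InnerTup;
typedef struct { int outer_payload; int inner_payload; } JoinedTup;

/* Step-1: take the (already fetched) outer and inner EPQ tuples.
 * Step-2: evaluate the pushed-down join condition, which only the FDW
 *         driver knows (here: matching ids).
 * Step-3: if it still holds, project a joined result tuple.
 * Returns false when no joined tuple can be produced. */
bool rejoin_epq_tuples(const OuterTup *outer, const InnerTup *inner,
                       JoinedTup *result)
{
    if (outer == NULL || inner == NULL)
        return false;                         /* an underlying EPQ slot was empty */
    if (outer->id != inner->id)
        return false;                         /* join qual no longer holds */
    result->outer_payload = outer->payload;   /* projection over both inputs */
    result->inner_payload = inner->payload;
    return true;
}
```

The point of the sketch is only that Step-1 is a pointer reference, not a refetch; the actual recheck and projection live entirely inside the driver.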
Re: [HACKERS] row_security GUC, BYPASSRLS
On Mon, Sep 28, 2015 at 05:13:56PM -0400, Stephen Frost wrote: > * Noah Misch (n...@leadboat.com) wrote: > > In schema reviews, I will raise a red flag for use of this feature; the best > > designs will instead use additional roles. I forecast that PostgreSQL would > > fare better with no owner-constrained-by-RLS capability. Even so, others > > want > > it, and FORCE ROW SECURITY would deliver it with an acceptable risk profile. > > I've attached a patch to implement it. It's not fully polished but it's > sufficient for comment, I believe. Additional comments, documentation > and regression tests are to be added, if we have agreement on the > grammar and implementation approach. This patch has FORCE ROW LEVEL SECURITY take precedence over row_security=off, which thwarts pg_dump use of row_security=off to ensure dump completeness. Should this be a table-level flag, or should it be a policy-level flag? A policy-level flag is more powerful. If nobody really anticipates using that power, this table-level flag works for me. > > SECURITY_ROW_LEVEL_DISABLED could have been okay. I removed an incomplete > > implementation (e.g. didn't affect CASCADE constraints). Writing a full one > > would be a mammoth job, and for what? Setting the wrong SELECT policies can > > disrupt _any_ application logic; no foreign keys or FORCE ROW SECURITY need > > be > > involved. Protecting just foreign keys brings some value, but it will not > > materially reduce the vigilance demanded of RLS policy authors and > > reviewers. > > I have a hard time with this. We're not talking about the application > logic in this case, we're talking about the guarantees which the > database is making to the user, be it an application or an individual. If disabling policies has an effect, table owners must be feeding conflicting requirements into the system. Violating policies during referential integrity queries is not, in general, less serious than violating referential integrity itself. 
Rules and triggers pose the same threat, and we let those break referential integrity. I think the ideal in all of these cases is rather to detect the conflict and raise an error. SECURITY_ROW_LEVEL_DISABLED and SECURITY_NOFORCE_RLS send policies in a third direction, neither the beaten path of rules/triggers nor the ideal. > I've included a patch (the second in the set attached) which adds a > SECURITY_NOFORCE_RLS bit which has a much tighter scope- it only applies > after the regular owner check is done. This reduces the risk of it > being set mistakenly dramatically, I believe. Yes, that's far safer than SECURITY_ROW_LEVEL_DISABLED. I assume the final design will let table owners completely bypass FORCE ROW LEVEL SECURITY under "SET row_security = off". If so, SECURITY_NOFORCE_RLS poses negligible risk. Even if not, SECURITY_NOFORCE_RLS poses low risk. > @@ -3667,6 +3673,12 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, > Relation rel, > case AT_DisableRowSecurity: > ATExecDisableRowSecurity(rel); > break; > + case AT_ForceRowSecurity: > + ATExecForceRowSecurity(rel); > + break; > + case AT_NoForceRowSecurity: > + ATExecNoForceRowSecurity(rel); > + break; Functions differing only in s/ = true/ = false/? ATExecEnableDisableTrigger() is a better model for this. nm
Re: [HACKERS] ON CONFLICT issues around whole row vars,
On Thu, Oct 1, 2015 at 5:12 PM, Andres Freund wrote: > I can't see how the current code can do anything sensible at all. What > do you think is going to be the effect of an excluded row that doesn't > meet security quals? Even if it worked in the sense that the correct > data were accessed and everything - which I doubt is completely the case > as things stand, given there's no actual scan node and stuff - you'd still > have EXCLUDED.* being used in the projection for the new version of the > tuple. > > As far as I can see the only correct thing you could do in that > situation is error out. I agree. I wasn't defending the current code (although that might have been made unclear by the "technically wasn't a bug" remark). Note that I'm not telling you what I think needs to happen. I'm just explaining my understanding of what has happened. -- Peter Geoghegan
Re: [HACKERS] Obsolete use of volatile in walsender.c, walreceiver.c, walreceiverfuncs.c?
On Tue, Sep 22, 2015 at 7:25 AM, Thomas Munro wrote: > On Tue, Sep 22, 2015 at 8:19 AM, Alvaro Herrera > wrote: >> Thomas Munro wrote: >> >>> In walsender.c, walreceiver.c, walreceiverfuncs.c there are several >>> places where volatile qualifiers are used apparently only to prevent >>> reordering around spinlock operations. >> >> In replication/slot.c there are a number of places (12, I think) that >> introduce a block specifically to contain a volatile cast on a variable >> for spinlock-protected access. We could remove the whole thing and save >> at least 3 lines and one indentation level for each of them. > > Right, see attached. It seems to me that we could as well simplify checkpoint.c and logical.c. In those files volatile casts are used as well to protect against reordering around spinlock operations. See for example 0002 on top of 0001, which is Thomas' patch. -- Michael From 929db46e249ac3f6e587b38aff8ba4468e7d776d Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Fri, 2 Oct 2015 10:57:15 +0900 Subject: [PATCH 1/2] Remove obsolete use of volatile in WAL-related files Patch by Thomas Munro. --- src/backend/replication/slot.c | 100 ++--- src/backend/replication/walreceiver.c | 16 ++--- src/backend/replication/walreceiverfuncs.c | 22 ++- src/backend/replication/walsender.c| 39 --- 4 files changed, 59 insertions(+), 118 deletions(-) diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index c66619c..5b18cb6 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -288,15 +288,11 @@ ReplicationSlotCreate(const char *name, bool db_specific, slot->in_use = true; /* We can now mark the slot active, and that makes it our slot. 
*/ - { - volatile ReplicationSlot *vslot = slot; - - SpinLockAcquire(&slot->mutex); - Assert(vslot->active_pid == 0); - vslot->active_pid = MyProcPid; - SpinLockRelease(&slot->mutex); - MyReplicationSlot = slot; - } + SpinLockAcquire(&slot->mutex); + Assert(slot->active_pid == 0); + slot->active_pid = MyProcPid; + SpinLockRelease(&slot->mutex); + MyReplicationSlot = slot; LWLockRelease(ReplicationSlotControlLock); @@ -329,12 +325,10 @@ ReplicationSlotAcquire(const char *name) if (s->in_use && strcmp(name, NameStr(s->data.name)) == 0) { - volatile ReplicationSlot *vslot = s; - SpinLockAcquire(&s->mutex); - active_pid = vslot->active_pid; + active_pid = s->active_pid; if (active_pid == 0) -vslot->active_pid = MyProcPid; +s->active_pid = MyProcPid; SpinLockRelease(&s->mutex); slot = s; break; @@ -380,10 +374,8 @@ ReplicationSlotRelease(void) else { /* Mark slot inactive. We're not freeing it, just disconnecting. */ - volatile ReplicationSlot *vslot = slot; - SpinLockAcquire(&slot->mutex); - vslot->active_pid = 0; + slot->active_pid = 0; SpinLockRelease(&slot->mutex); } @@ -459,11 +451,10 @@ ReplicationSlotDropAcquired(void) } else { - volatile ReplicationSlot *vslot = slot; bool fail_softly = slot->data.persistency == RS_EPHEMERAL; SpinLockAcquire(&slot->mutex); - vslot->active_pid = 0; + slot->active_pid = 0; SpinLockRelease(&slot->mutex); ereport(fail_softly ? 
WARNING : ERROR, @@ -533,16 +524,13 @@ ReplicationSlotSave(void) void ReplicationSlotMarkDirty(void) { + ReplicationSlot *slot = MyReplicationSlot; Assert(MyReplicationSlot != NULL); - { - volatile ReplicationSlot *vslot = MyReplicationSlot; - - SpinLockAcquire(&vslot->mutex); - MyReplicationSlot->just_dirtied = true; - MyReplicationSlot->dirty = true; - SpinLockRelease(&vslot->mutex); - } + SpinLockAcquire(&slot->mutex); + MyReplicationSlot->just_dirtied = true; + MyReplicationSlot->dirty = true; + SpinLockRelease(&slot->mutex); } /* @@ -557,13 +545,9 @@ ReplicationSlotPersist(void) Assert(slot != NULL); Assert(slot->data.persistency != RS_PERSISTENT); - { - volatile ReplicationSlot *vslot = slot; - - SpinLockAcquire(&slot->mutex); - vslot->data.persistency = RS_PERSISTENT; - SpinLockRelease(&slot->mutex); - } + SpinLockAcquire(&slot->mutex); + slot->data.persistency = RS_PERSISTENT; + SpinLockRelease(&slot->mutex); ReplicationSlotMarkDirty(); ReplicationSlotSave(); @@ -593,14 +577,10 @@ ReplicationSlotsComputeRequiredXmin(bool already_locked) if (!s->in_use) continue; - { - volatile ReplicationSlot *vslot = s; - - SpinLockAcquire(&s->mutex); - effective_xmin = vslot->effective_xmin; - effective_catalog_xmin = vslot->effective_catalog_xmin; - SpinLockRelease(&s->mutex); - } + SpinLockAcquire(&s->mutex); + effective_xmin = s->effective_xmin; + effective_catalog_xmin = s->effective_catalog_xmin; + SpinLockRelease(&s->mutex); /* check the data xmin */ if (TransactionIdIsValid(effective_xmin) && @@ -641,13 +621,9 @@ ReplicationSlotsComputeRequiredLSN(void) if (!s->in_use) continue; - { - volatile ReplicationSlot *vslot = s; - - SpinLockAcquire(&s->mutex); -
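For context, the reason these per-call-site volatile casts can go away is that the spinlock acquire/release operations now double as compiler barriers, so accesses between them cannot be reordered out of the critical section. A minimal stand-alone sketch of that pattern, using a C11 atomic_flag as a hypothetical stand-in for PostgreSQL's s_lock-based spinlock (Spinlock, Slot, spin_acquire, spin_release, and slot_mark_active are invented names for illustration, not the real API):

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a spinlock whose acquire/release pair also
 * acts as a compiler barrier, like SpinLockAcquire/SpinLockRelease. */
typedef struct
{
    atomic_flag lock;
} Spinlock;

static void
spin_acquire(Spinlock *s)
{
    /* memory_order_acquire forbids hoisting later accesses above here */
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;                       /* spin until the flag was clear */
}

static void
spin_release(Spinlock *s)
{
    /* memory_order_release keeps earlier accesses before this point */
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Miniature analogue of ReplicationSlot */
typedef struct
{
    Spinlock    mutex;
    int         active_pid;     /* protected by mutex */
} Slot;

/* Post-patch coding style: a plain (non-volatile) pointer between the
 * lock calls is safe because the lock operations bound the accesses. */
static void
slot_mark_active(Slot *slot, int pid)
{
    spin_acquire(&slot->mutex);
    slot->active_pid = pid;
    spin_release(&slot->mutex);
}
```

Under the pre-patch discipline, slot_mark_active would have needed a `volatile Slot *` alias so the compiler could not cache or reorder `slot->active_pid` across the lock calls; with acquire/release semantics on the lock itself, the plain pointer suffices.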
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
Hello, I had more consideration on this. > As long as FDW author can choose their best way to produce a joined > tuple, it may be worth to investigate. > > My comments are: > * ForeignRecheck is the best location to call RefetchForeignJoinRow > when scanrelid==0, not ExecScanFetch. Why do you try to add a special > case for FDW in the common routine? > * It is FDW's choice where the remote join tuple is kept, even though > most of FDW will keep it on the private field of ForeignScanState. I think that scanrelid == 0 means that the node in focus is not a scan node in current executor semantics. EvalPlanQualFetchRowMarks fetches the possibly modified row, then EvalPlanQualNext rechecks the new row. Those are the roles of each function. By these criteria, recheck routines are not the place for refetching; EvalPlanQualFetchRowMarks is. Again, the problem here is that the "foreign join" scan node is formally a scan node but doesn't provide all the materials the executor expects from a scan node. So the ways to fix this while preserving the semantics come down to two choices. 1. Make the "foreign join" scan node behave as a complete scan node. That is, EvalPlanQualFetchRowMarks can retrieve the modified row version anyhow according to the type of row mark. 2. Make the "foreign join" node actually a join node, which has subnodes, so that the "foreign join" node can reconstruct the result row using the results of its subnodes on EPQ. (ExecForeignJoinNode would cease to call subnodes if it is actually a scan node.) "3". Any other means that break the current semantics of joins and scans in the executor, as you recommend. Some more adjustment would be needed to go this way. I don't know how the current design of FDW was built, especially the join pushdown feature, so I may be missing something, but the above is how I see this issue. > Apart from FDW requirement, custom-scan/join needs recheckMtd is > called when scanrelid==0 to avoid assertion fail. 
I hope FDW has > symmetric structure, however, not a mandatory requirement for me. It wouldn't be needed if EvalPlanQualFetchRowMarks works as expected. Is this wrong? regards, At Thu, 1 Oct 2015 13:17:34 +, Kouhei Kaigai wrote in <9a28c8860f777e439aa12e8aea7694f80114d...@bpxm15gp.gisp.nec.co.jp> > > > In case when FDW is not designed to handle join by itself, it is > > > a reasonable fallback I think. > > > > > > I expect FDW driver needs to handle EPQ recheck in the case below: > > > * ForeignScan on base relation and it uses late row locking. > > > * ForeignScan on join relation, even if early locking. > > > > I also think the approach would be one choice. But one thing I'm > > concerned about is plan creation for that by the FDW author; that would > > make life hard for the FDW author. (That was proposed by me ...) > > > I don't follow the standpoint, but it is not valuable to repeat the same discussion. > > > So, I'd like to investigate another approach that preserves the > > applicability of late row locking to the join pushdown case as well as > > the spirit of what's there now. The basic idea is (1) add a new > > callback routine RefetchForeignJoinRow that refetches one foreign-join > > tuple from the foreign server, after locking remote tuples for the > > component foreign tables if required, and (2) call that routine in > > ExecScanFetch if the target scan is for a foreign join and the component > > foreign tables need to be locked late, else just return the > > foreign-join tuple stored in the parent's state tree, which is the tuple > > mentioned by Robert, for preserving the spirit of what's there now. I > > think that ExecLockRows and EvalPlanQualFetchRowMarks should probably be > > modified so as to skip foreign tables involved in a foreign join. > > > As long as FDW author can choose their best way to produce a joined > tuple, it may be worth to investigate. 
> > My comments are: > * ForeignRecheck is the best location to call RefetchForeignJoinRow > when scanrelid==0, not ExecScanFetch. Why you try to add special > case for FDW in the common routine. > * It is FDW's choice where the remote join tuple is kept, even though > most of FDW will keep it on the private field of ForeignScanState. > > Apart from FDW requirement, custom-scan/join needs recheckMtd is > called when scanrelid==0 to avoid assertion fail. I hope FDW has > symmetric structure, however, not a mandatory requirement for me. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
> On 09/30/2015 03:27 PM, Tom Lane wrote: >> Josh Berkus writes: >>> On 09/30/2015 03:10 PM, Tom Lane wrote: I'd be feeling a lot more positive about this whole thread if any people had stepped up and said "yes, *I* will put in a lot of grunt-work to make something happen here". The lack of any volunteers suggests strongly that this thread is a waste of time, just as the several similar ones before it have been. >> >>> Hmmm? Frost volunteered to stand up debbugs. >> >> So he did, and did anyone volunteer to put data into it, or to do ongoing >> curation of said data? If we simply connect it up to the mailing lists, >> and then stand back and wait for magic to happen, we will not ever have >> anything that's any more useful than the existing mailing list archives. On 10/01/2015 01:49 AM, Alvaro Herrera wrote: > Josh Berkus wrote: > >> Well, it's hard for anyone to volunteer when we don't know what the >> actual volunteer tasks are. I certainly intend to do *something* to >> support the bug tracker system, but I don't know yet what that something is. > > I volunteer to do something too, as long as I don't have to click on > endless web forms. > I volunteer to help populate the new tracker from whatever form the data currently exists in, and will also put into action any reasonable scraping/augmentation ideas put forth to make it a more productive tool. Amir
[HACKERS] Typo in /src/backend/optimizer/README
The following is a remark added to /src/backend/optimizer/README by commit 8703059c6b55c427100e00a09f66534b6ccbfaa1, and IIUC, I think "LHS" in the last sentence "We prevent that by forcing the min LHS for the upper join to include B." should be "RHS". The use of minimum Relid sets has some pitfalls; consider a query like A leftjoin (B leftjoin (C innerjoin D) on (Pbcd)) on Pa where Pa doesn't mention B/C/D at all. In this case a naive computation would give the upper leftjoin's min LHS as {A} and min RHS as {C,D} (since we know that the innerjoin can't associate out of the leftjoin's RHS, and enforce that by including its relids in the leftjoin's min RHS). And the lower leftjoin has min LHS of {B} and min RHS of {C,D}. Given such information, join_is_legal would think it's okay to associate the upper join into the lower join's RHS, transforming the query to B leftjoin (A leftjoin (C innerjoin D) on Pa) on (Pbcd) which yields totally wrong answers. We prevent that by forcing the min LHS for the upper join to include B. Best regards, Etsuro Fujita
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
Hello, At Thu, 1 Oct 2015 17:50:25 +0900, Etsuro Fujita wrote in <560cf3d1.9060...@lab.ntt.co.jp> > On 2015/10/01 11:15, Kouhei Kaigai wrote: > >> From: pgsql-hackers-ow...@postgresql.org > >> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas > >> On Mon, Sep 28, 2015 at 11:15 PM, Etsuro Fujita > >> wrote: > >> So, if we wanted to fix this in a way that preserves the spirit of > >> what's there now, it seems to me that we'd want the FDW to return > >> something that's like a whole row reference, but represents the output > >> of the foreign join rather than some underlying base table. And then > >> get the EPQ machinery to have the evaluation of the ForeignScan for > >> the join, when it happens in an EPQ context, to return that tuple. > >> But I don't really have a good idea how to do that. > > > Alternative built-in join execution? > > Once it is executed under the EPQ context, built-in join node fetches > > a tuple from both of inner and outer side for each. It is eventually > > fetched from the EPQ slot, then the alternative join produces a result > > tuple. > > In case when FDW is not designed to handle join by itself, it is > > a reasonable fallback I think. > > > > I expect FDW driver needs to handle EPQ recheck in the case below: > > * ForeignScan on base relation and it uses late row locking. > > * ForeignScan on join relation, even if early locking. > > I also think the approach would be one choice. But one thing I'm > concerned about is plan creation for that by the FDW author; that > would make life hard for the FDW author. (That was proposed by me > ...) > > So, I'd like to investigate another approach that preserves the > applicability of late row locking to the join pushdown case as well as > the spirit of what's there now. 
The basic idea is (1) add a new > callback routine RefetchForeignJoinRow that refetches one foreign-join > tuple from the foreign server, after locking remote tuples for the > component foreign tables if required, It would be the case that at least one of the component relations of a foreign join is other than ROW_MARK_COPY, which is not possible so far on postgres_fdw. For the case that some of the component relations are other than ROW_MARK_COPY, we should perhaps call RefetchForeignRow for such relations and construct the joined row involving the ROW_MARK_COPY relations. Though we could consider some logic for that case, it is obvious that the case we should focus on now is a "foreign join" scan whose underlying foreign scans are all ROW_MARK_COPY, I think. A "foreign join" scan with ROW_MARK_COPY looks promising (to me), and from this point of view it could in future coexist with the refetch mechanism you may have in mind... Maybe :p > and (2) call that routine in > ExecScanFetch if the target scan is for a foreign join and the > component foreign tables need to be locked late, else just return > the foreign-join tuple stored in the parent's state tree, which is the > tuple mentioned by Robert, for preserving the spirit of what's there > now. I think that ExecLockRows and EvalPlanQualFetchRowMarks should > probably be modified so as to skip foreign tables involved in a > foreign join. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
Hello, At Fri, 2 Oct 2015 03:10:01 +, Kouhei Kaigai wrote in <9a28c8860f777e439aa12e8aea7694f80114d...@bpxm15gp.gisp.nec.co.jp> > > > As long as FDW author can choose their best way to produce a joined > > > tuple, it may be worth to investigate. > > > > > > My comments are: > > > * ForeignRecheck is the best location to call RefetchForeignJoinRow > > > when scanrelid==0, not ExecScanFetch. Why you try to add special > > > case for FDW in the common routine. > > > * It is FDW's choice where the remote join tuple is kept, even though > > > most of FDW will keep it on the private field of ForeignScanState. > > > > I think that scanrelid == 0 means that the node in focus is not a > > scan node in current executor > > semantics. EvalPlanQualFetchRowMarks fetches the possibly > > modified row then EvalPlanQualNext does recheck for the new > > row. Those are the roles of each function. > > > > In this criteria, recheck routines are not the place for > > refetching. EvalPlanQualFetchRowMarks is that. > > > I never say FDW should refetch tuples on the recheck routine. > All I suggest is, projection to generate a joined tuple and > recheck according to the qualifier pushed down are role of > FDW driver, because it knows the best strategy to do the job. I have no objection that rechecking is FDW's job. I think you are thinking that all ROW_MARK_COPY base rows are held in ss_ScanTupleSlot so simply calling recheckMtd on the slot gives enough data to the function. (EPQState would also be needed to retrieve, though..) Right? All the underlying foreign tables should be marked as ROW_MARK_COPY to call recheckMtd safely. And somehow it is required to know which column stores which base tuple. > It looks to me that all of them make the problem more complicated. > I never heard why the "foreign-join" scan node has difficulty constructing > a joined tuple using the EPQ slots that are already loaded. 
> > Regardless of the early or late locking, EPQ slots of the base relations > are already filled up, aren't they? recheckMtd needs to take EState as a parameter? > The whole mission of the "foreign-join" scan node is to return a joined > tuple as if it were executed by local join logic. > A local join consumes two tuples then generates one tuple. > The "foreign-join" scan node can perform equivalently, even if it > is under EPQ recheck context. > > So, the job of the FDW driver is... > Step-1) Fetch tuples from the EPQ slots of the base foreign relations > to be joined. Please note that it is just a pointer reference. > Step-2) Try to join these two (or more) tuples according to the > join condition (only FDW knows because it is kept in private) > Step-3) If the result is valid, the FDW driver makes a projection from these > tuples, then returns it. > > If you are concerned about re-inventing the code for each FDW, core > can provide a utility routine to cover 95% of the FDW structure. > > I want to keep EvalPlanQualFetchRowMarks on a per-base-relation basis. > It is a bad choice to consider join at this point. > > > Apart from FDW requirement, custom-scan/join needs recheckMtd is > > called when scanrelid==0 to avoid assertion fail. I hope FDW has > > symmetric structure, however, not a mandatory requirement for me. > > > > It wouldn't be needed if EvalPlanQualFetchRowMarks works as > > expected. Is this wrong? > > > Yes, it does not work. > The expected behavior of EvalPlanQualFetchRowMarks is to load the tuple > to be rechecked onto the EPQ slot, using heap_fetch or a copied image. > It is on a per-base-relation basis. Hmm. What I said by "works as expected" is that the function stores the tuple for the "foreign join" scan node. If it doesn't, you're right. > Who can provide a projection to generate the joined tuple? > It is the job of each individual plan-state node to be walked over during > EvalPlanQualNext(). 
EvalPlanQualNext simply rechecks tuples stored in epqTuples, which are designed to be provided by EvalPlanQualFetchRowMarks. I think that premise shouldn't be broken for convenience... regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
Hello, At Fri, 2 Oct 2015 12:51:42 +0900, Etsuro Fujita wrote in <560dff4e.2000...@lab.ntt.co.jp> > On 2015/10/02 9:50, Kyotaro HORIGUCHI wrote: Most of the citations are of Kaigai-san's remarks :) > >> As long as FDW author can choose their best way to produce a joined > >> tuple, it may be worth to investigate. > >> > >> My comments are: > >> * ForeignRecheck is the best location to call RefetchForeignJoinRow > >>when scanrelid==0, not ExecScanFetch. Why you try to add special > >>case for FDW in the common routine. > > In my understanding, the job that ExecScanRecheckMtd should do is to > check if the test tuple *already stored* in the plan node's scan slot > meets the access-method conditions, in general. So, ISTM that it'd be > somewhat odd to replace RefetchForeignJoinRow within ForeignRecheck, > to store the remote join tuple in the slot. Also, RefetchForeignRow > is called from the common routines > ExecLockRows/EvalPlanQualFetchRowMarks ... Agreed, except for the necessity of RefetchForeignJoinRow. > >> * It is FDW's choice where the remote join tuple is kept, even though > >>most of FDW will keep it on the private field of ForeignScanState. > > I see. > > To make it possible that the FDW doesn't have to do anything for cases > where the FDW doesn't do any late row locking, however, I think it'd > be more promising to use the remote join tuple stored in the scan slot > of the corresponding ForeignScanState node in the parent's planstate > tree. I haven't had a good idea for that yet, though. One coarse idea is adding root->rowMarks for the "foreign join" paths (then removing the rowMarks for the underlying scans later if the foreign join wins). By somehow propagating it to epqstate->arowMarks, EvalPlanQualFetchRowMarks will store the needed tuples into epqTuples. This is what Kaigai-san criticized as 'making things too complex' :) But I'm reluctant to break the assumption of ExecScanFetch. 
> > EvalPlanQualFetchRowMarks fetches the possibly > > modified row then EvalPlanQualNext does recheck for the new > > row. > > Really? EvalPlanQualFetchRowMarks fetches the tuples for any > non-locked relations, so I think that that function should fetch the > same version previously obtained for each such relation successfully. Sorry, please ignore "possibly modified". regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> -Original Message- > From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp] > Sent: Friday, October 02, 2015 1:28 PM > To: Kaigai Kouhei(海外 浩平) > Cc: fujita.ets...@lab.ntt.co.jp; robertmh...@gmail.com; > pgsql-hackers@postgresql.org; shigeru.han...@gmail.com > Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual > > Hello, > > At Fri, 2 Oct 2015 03:10:01 +, Kouhei Kaigai wrote > in <9a28c8860f777e439aa12e8aea7694f80114d...@bpxm15gp.gisp.nec.co.jp> > > > > As long as FDW author can choose their best way to produce a joined > > > > tuple, it may be worth to investigate. > > > > > > > > My comments are: > > > > * ForeignRecheck is the best location to call RefetchForeignJoinRow > > > > when scanrelid==0, not ExecScanFetch. Why you try to add special > > > > case for FDW in the common routine. > > > > * It is FDW's choice where the remote join tuple is kept, even though > > > > most of FDW will keep it on the private field of ForeignScanState. > > > > > > I think that scanrelid == 0 means that the node in focus is not a > > > scan node in current executor > > > semantics. EvalPlanQualFetchRowMarks fetches the possibly > > > modified row then EvalPlanQualNext does recheck for the new > > > row. Those are the roles of each function. > > > > > > In this criteria, recheck routines are not the place for > > > refetching. EvalPlanQualFetchRowMarks is that. > > > > > I never say FDW should refetch tuples on the recheck routine. > > All I suggest is, projection to generate a joined tuple and > > recheck according to the qualifier pushed down are role of > > FDW driver, because it knows the best strategy to do the job. > > I have no objection that rechecking is FDW's job. > > I think you are thinking that all ROW_MARK_COPY base rows are > held in ss_ScanTupleSlot so simply calling recheckMtd on the slot > gives enough data to the function. (EPQState would also be needed > to retrieve, though..) Right? > Not ss_ScanTupleSlot. 
It is initialized according to fdw_scan_tlist in case of scanrelid==0, regardless of the base foreign relation's definition. My expectation is, the FDW callback constructs tts_values/tts_isnull of ss_ScanTupleSlot according to the preloaded tuples in the EPQ slots and the underlying projection. Only the FDW driver knows the best way to construct this result tuple. You can pull out the EState reference from the PlanState portion of the ForeignScanState, so nothing needs to be changed. > All the underlying foreign tables should be marked as > ROW_MARK_COPY to call recheckMtd safely. And somehow it is required > to know which column stores which base tuple. > Even with ROW_MARK_REFERENCE by late locking, the tuple to be rechecked is already loaded into estate->es_epqTuple[], isn't it? The recheck routine does not need to care about the row-mark policy. > > It looks to me that all of them make the problem more complicated. > > I never heard why the "foreign-join" scan node has difficulty constructing > > a joined tuple using the EPQ slots that are already loaded. > > > > Regardless of the early or late locking, EPQ slots of the base relations > > are already filled up, aren't they? > > recheckMtd needs to take EState as a parameter? > No. > > The whole mission of the "foreign-join" scan node is to return a joined > > tuple as if it were executed by local join logic. > > A local join consumes two tuples then generates one tuple. > > The "foreign-join" scan node can perform equivalently, even if it > > is under EPQ recheck context. > > > > So, the job of the FDW driver is... > > Step-1) Fetch tuples from the EPQ slots of the base foreign relations > > to be joined. Please note that it is just a pointer reference. > > Step-2) Try to join these two (or more) tuples according to the > > join condition (only FDW knows because it is kept in private) > > Step-3) If the result is valid, the FDW driver makes a projection from these > > tuples, then returns it. 
> > > > If you are concerned about re-inventing the code for each FDW, core > > can provide a utility routine to cover 95% of the FDW structure. > > > > I want to keep EvalPlanQualFetchRowMarks on a per-base-relation basis. > > It is a bad choice to consider join at this point. > > > > > > Apart from FDW requirement, custom-scan/join needs recheckMtd is > > > > called when scanrelid==0 to avoid assertion fail. I hope FDW has > > > > symmetric structure, however, not a mandatory requirement for me. > > > > > > It wouldn't be needed if EvalPlanQualFetchRowMarks works as > > > expected. Is this wrong? > > > > > Yes, it does not work. > > The expected behavior of EvalPlanQualFetchRowMarks is to load the tuple > > to be rechecked onto the EPQ slot, using heap_fetch or a copied image. > > It is on a per-base-relation basis. > > Hmm. What I said by "works as expected" is that the function > stores the tuple for the "foreign join" scan node. If it doesn't, > you're right. > Which of the EPQ slots will save the joined tuple? scanrelid is zero, and we have no identifier of join
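The Step-1 through Step-3 recheck flow debated above can be sketched in miniature. This is a toy model under loud assumptions: the structs below stand in for the base-relation tuples already loaded into estate->es_epqTuple[], and OuterTuple, InnerTuple, join_qual_matches, and recheck_foreign_join are invented illustrative names, not the executor or FDW API (real code would work with TupleTableSlots and an fdw_scan_tlist projection):

```c
#include <stdbool.h>

/* Toy stand-ins for the base-relation rows held in the EPQ slots. */
typedef struct { int id; const char *name; } OuterTuple;
typedef struct { int outer_id; int payload; } InnerTuple;
typedef struct { int id; const char *name; int payload; } JoinedTuple;

/* Step-2: the join condition that was pushed down to the remote side,
 * re-evaluated locally during EPQ recheck. */
static bool
join_qual_matches(const OuterTuple *o, const InnerTuple *i)
{
    return o->id == i->outer_id;
}

/* Step-1 + Step-3: consume the EPQ tuples (pointer references only),
 * recheck the qual, and project the joined result row. */
static bool
recheck_foreign_join(const OuterTuple *o, const InnerTuple *i,
                     JoinedTuple *result)
{
    if (!join_qual_matches(o, i))
        return false;           /* EPQ recheck fails; the row is dropped */
    result->id = o->id;
    result->name = o->name;
    result->payload = i->payload;
    return true;
}
```

The point of contention in the thread is not this projection itself but where it runs: inside the recheck callback (KaiGai's position) or via the row-mark refetch machinery (Horiguchi's and Fujita's position).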
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On 2015/10/02 9:50, Kyotaro HORIGUCHI wrote: As long as FDW author can choose their best way to produce a joined tuple, it may be worth to investigate. My comments are: * ForeignRecheck is the best location to call RefetchForeignJoinRow when scanrelid==0, not ExecScanFetch. Why you try to add special case for FDW in the common routine. In my understanding, the job that ExecScanRecheckMtd should do is to check if the test tuple *already stored* in the plan node's scan slot meets the access-method conditions, in general. So, ISTM that it'd be somewhat odd to replace RefetchForeignJoinRow within ForeignRecheck, to store the remote join tuple in the slot. Also, RefetchForeignRow is called from the common routines ExecLockRows/EvalPlanQualFetchRowMarks ... * It is FDW's choice where the remote join tuple is kept, even though most of FDW will keep it on the private field of ForeignScanState. I see. To make it possible that the FDW doesn't have to do anything for cases where the FDW doesn't do any late row locking, however, I think it'd be more promising to use the remote join tuple stored in the scan slot of the corresponding ForeignScanState node in the parent's planstate tree. I haven't had a good idea for that yet, though. EvalPlanQualFetchRowMarks fetches the possibly modified row then EvalPlanQualNext does recheck for the new row. Really? EvalPlanQualFetchRowMarks fetches the tuples for any non-locked relations, so I think that that function should fetch the same version previously obtained for each such relation successfully. Best regards, Etsuro Fujita
Re: [HACKERS] Typo in /src/backend/optimizer/README
On 2015/10/01 22:31, Tom Lane wrote: > Etsuro Fujita writes: >> The following is a remark added to /src/backend/optimizer/README by >> commit 8703059c6b55c427100e00a09f66534b6ccbfaa1, and IIUC, I think "LHS" >> in the last sentence "We prevent that by forcing the min LHS for the >> upper join to include B." should be "RHS". > Mmm, yeah, that's a typo. Will fix, thanks. Thanks! Best regards, Etsuro Fujita
Re: [HACKERS] pageinspect patch, for showing tuple data
On Thu, Oct 1, 2015 at 8:13 PM, Nikolay Shaplov wrote: > In a message of September 30, 2015 13:49:00, Michael Paquier wrote: >> >> - errmsg("input page too small (%d bytes)", >> raw_page_size))); >> +errmsg("input page too small (%d >> bytes)", raw_page_size))); >> Please be careful of spurious diffs. Those just make the life of committers >> more difficult than it already is. > > Oh, that's not a diff. That's me fixing the indentation of code I did not write :-) pgindent would have taken care of that if needed. There is no need to add noise in the code for this patch. >> + >> + General idea about output columns: lp_* >> attributes >> + are about where tuple is located inside the page; >> + t_xmin, t_xmax, >> + t_field3, t_ctid are about >> + visibility of this tuple inside or outside of the transaction; >> + t_infomask2, t_infomask has >> some >> + information about properties of attributes stored in tuple data. >> + t_hoff says where tuple data begins and >> + t_bits says which attributes are NULL and which >> are >> + not. Please notice that t_bits here is not an actual value that is >> stored >> + in tuple data, but its text representation with '0' and '1' >> characters. >> + >> I would remove that as well. htup_details.h contains all this information. > Made it even shorter. Just links and comments about representation of > t_bits. I would completely remove this part. > There also was a bug in the original pageinspect that did not get t_bits right > when there was an OID in the tuple. I've fixed it too. Aha. Good catch! By looking at HeapTupleHeaderGetOid, if the tuple has an OID it will be stored at the end of HeapTupleHeader and t_hoff includes it, so bits_len is definitely larger by 4 bytes should an OID be present. 
if (tuphdr->t_infomask & HEAP_HASNULL) { - bits_len = tuphdr->t_hoff - - offsetof(HeapTupleHeaderData, t_bits); + int bits_len = + ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 + 1)*8; values[11] = CStringGetTextDatum( - bits_to_text(tuphdr->t_bits, bits_len * 8)); + bits_to_text(tuphdr->t_bits, bits_len)); } And here is what you are referring to in your patch. I think that we should instead check for HEAP_HASOID and reduce bits_len by the size of Oid should one be present. As this is a bug that applies to all the existing versions of Postgres it would be good to extract it as a separate patch and then apply your own patch on top of it instead of including it in your feature. Attached is a patch, this should be applied down to 9.0 I guess. Could you rebase your patch on top of it? > Here is next patch in attachment. Cool, thanks. -test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0)); + +test=# select * from heap_page_items(get_raw_page('pg_range', 0)); This example in the docs is far too long in character length... We should get that reduced. + Please notice that t_bits in tuple header structure + is a binary bitmap, but heap_page_items returns + t_bits as human readable text representation of + original t_bits bitmap. This had better remove the mention of "please notice". Still, as this is already described in htup_details.h there is no real point in describing it again here: that's a bitmap of NULLs. -- Michael From 3a21fd817748b8001e91159c3f2a557088b4fa26 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Fri, 2 Oct 2015 12:58:13 +0900 Subject: [PATCH] Fix overestimated size of t_bits in pageinspect In a tuple, an object-id field is added at the end of HeapTupleHeader and the size of the header is updated to include it. However heap_page_items in pageinspect ignored that fact and overestimated the size of the bitmap of NULLs by 4 bytes should an OID be defined in this tuple. 
Per report from Nikolay Shaplov, for a problem found while working on a new feature for pageinspect, and patch by Michael Paquier. --- contrib/pageinspect/heapfuncs.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c index 8d1666c..031d455 100644 --- a/contrib/pageinspect/heapfuncs.c +++ b/contrib/pageinspect/heapfuncs.c @@ -154,7 +154,6 @@ heap_page_items(PG_FUNCTION_ARGS) lp_offset + lp_len <= raw_page_size) { HeapTupleHeader tuphdr; - int bits_len; /* Extract information from the tuple header */ @@ -180,9 +179,15 @@ heap_page_items(PG_FUNCTION_ARGS) { if (tuphdr->t_infomask & HEAP_HASNULL) { + int bits_len; + bits_len = tuphdr->t_hoff - offsetof(HeapTupleHeaderData, t_bits); + /* ignore OID field of tuple if present
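The arithmetic behind this fix can be shown with a toy header. MiniTupleHeader, BITS_OFFSET, and null_bitmap_len below are simplified stand-ins for HeapTupleHeaderData, offsetof(HeapTupleHeaderData, t_bits), and the pageinspect code; only the HEAP_HASNULL/HEAP_HASOID logic mirrors the actual patch:

```c
#include <stdint.h>

/* Simplified stand-ins for htup_details.h; the HEAP_HASNULL and
 * HEAP_HASOID bit values match PostgreSQL, the rest is a toy layout. */
typedef uint32_t Oid;

#define HEAP_HASNULL  0x0001    /* has null attribute(s) */
#define HEAP_HASOID   0x0008    /* has an object-id field */

typedef struct
{
    uint16_t    t_infomask; /* various flag bits */
    uint8_t     t_hoff;     /* offset to user data; includes OID if any */
} MiniTupleHeader;

/* stand-in for offsetof(HeapTupleHeaderData, t_bits) */
#define BITS_OFFSET 3

/* Length in bytes of the NULL bitmap, mirroring the fix: when the tuple
 * carries an OID, the OID sits at the end of the header and t_hoff
 * covers it, so it must be subtracted out of the bitmap length. */
static int
null_bitmap_len(const MiniTupleHeader *h)
{
    int         len;

    if (!(h->t_infomask & HEAP_HASNULL))
        return 0;
    len = h->t_hoff - BITS_OFFSET;
    if (h->t_infomask & HEAP_HASOID)
        len -= (int) sizeof(Oid);   /* ignore the OID field */
    return len;
}
```

With an 8-byte bitmap region, a tuple with an OID has t_hoff = BITS_OFFSET + 8 + sizeof(Oid); the pre-fix code would have reported 12 bitmap bytes instead of 8.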
Re: [HACKERS] No Issue Tracker - Say it Ain't So!
On Thu, Oct 1, 2015 at 9:50 AM, Torsten Zuehlsdorff <mailingli...@toco-domains.de> wrote:
> On 01.10.2015 00:27, Tom Lane wrote:
>> Josh Berkus writes:
>>> On 09/30/2015 03:10 PM, Tom Lane wrote:
>>>> I'd be feeling a lot more positive about this whole thread if any people
>>>> had stepped up and said "yes, *I* will put in a lot of grunt-work to
>>>> make something happen here".
>>>
>>> Hmmm? Frost volunteered to stand up debbugs.
>>
>> So he did, and did anyone volunteer to put data into it, or to do ongoing
>> curation of said data?
>
> Yes, i did.

I am also volunteering for this kind of grunt-work.

Greetings,
Félix
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On 2015/10/01 11:15, Kouhei Kaigai wrote:
>> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
>> On Mon, Sep 28, 2015 at 11:15 PM, Etsuro Fujita wrote:
>>> I thought the same thing [1]. While I thought it was relatively easy to
>>> make changes to RefetchForeignRow that way for the foreign table case
>>> (scanrelid>0), I was not sure how hard it would be to do so for the
>>> foreign join case (scanrelid==0). So, I proposed to leave those changes
>>> for 9.6. I'll have a rethink on this issue along the lines of that
>>> approach.
>>
>> So, if we wanted to fix this in a way that preserves the spirit of
>> what's there now, it seems to me that we'd want the FDW to return
>> something that's like a whole-row reference, but represents the output
>> of the foreign join rather than some underlying base table. And then
>> get the EPQ machinery to have the evaluation of the ForeignScan for
>> the join, when it happens in an EPQ context, return that tuple.
>> But I don't really have a good idea how to do that.
>
> Alternative built-in join execution?
> Once it is executed under the EPQ context, the built-in join node fetches
> a tuple from both the inner and outer side. Each is eventually fetched
> from the EPQ slot, then the alternative join produces a result tuple.
> In case the FDW is not designed to handle the join by itself, it is a
> reasonable fallback, I think.
>
> I expect the FDW driver needs to handle EPQ recheck in the cases below:
> * ForeignScan on a base relation that uses late row locking.
> * ForeignScan on a join relation, even with early locking.

I also think that approach would be one choice. But one thing I'm concerned about is plan creation for that by the FDW author; that would make life hard for the FDW author. (That was proposed by me ...) So, I'd like to investigate another approach that preserves the applicability of late row locking to the join pushdown case, as well as the spirit of what's there now.
The basic idea is (1) to add a new callback routine, RefetchForeignJoinRow, that refetches one foreign-join tuple from the foreign server, after locking the remote tuples for the component foreign tables if required, and (2) to call that routine in ExecScanFetch if the target scan is for a foreign join and the component foreign tables require late locking; otherwise, just return the foreign-join tuple stored in the parent's state tree, which is the tuple mentioned by Robert, preserving the spirit of what's there now. I think that ExecLockRows and EvalPlanQualFetchRowMarks should probably be modified so as to skip foreign tables involved in a foreign join.

Best regards,
Etsuro Fujita

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
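To make the proposed control flow concrete, here is a toy C sketch. RefetchForeignJoinRow does not exist in PostgreSQL; every type and field below (Tuple, ForeignScanState's needs_late_lock and epq_tuple, and so on) is a simplified stand-in invented for illustration, not the executor's real structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the real executor types. */
typedef struct Tuple
{
	int			id;
} Tuple;

typedef struct ForeignScanState
{
	int			scanrelid;		/* 0 means the scan is a pushed-down join */
	bool		needs_late_lock;	/* does any component table lock late? */
	Tuple	   *epq_tuple;		/* join tuple stored in the state tree */

	/* proposed callback: refetch (and lock) one foreign-join tuple */
	Tuple	   *(*RefetchForeignJoinRow) (struct ForeignScanState *node);
} ForeignScanState;

/* Sketch of the EPQ-time branch proposed for ExecScanFetch */
static Tuple *
epq_fetch(ForeignScanState *node)
{
	if (node->scanrelid == 0 && node->needs_late_lock)
		return node->RefetchForeignJoinRow(node);	/* late row locking */

	/* early locking: reuse the join tuple already copied into the state */
	return node->epq_tuple;
}

/* A dummy driver callback standing in for a remote refetch-and-lock */
static Tuple refetched = {42};

static Tuple *
demo_refetch(ForeignScanState *node)
{
	(void) node;
	return &refetched;
}
```

Under this sketch, a foreign-join scan (scanrelid == 0) whose component tables lock late goes through the new callback, while early-locking joins keep returning the tuple already stored in the state tree.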
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On 2015/10/01 15:38, Kyotaro HORIGUCHI wrote:
>> I expect the FDW driver needs to handle EPQ recheck in the case below:
>> * ForeignScan on a base relation that uses late row locking.
> I think this is indisputable.

I think so. But I think this case would probably be handled by the existing RefetchForeignRow routine, as I said upthread.

>> * ForeignScan on a join relation, even with early locking.
> This could be unnecessary if the "foreign join" scan node can
> have its own rowmark of ROW_MARK_COPY.

That's an idea, but I'd vote for preserving the applicability of late row locking to the foreign join case, allowing component foreign tables involved in a foreign join to have rowmark methods other than ROW_MARK_COPY.

Best regards,
Etsuro Fujita