Re: [HACKERS] SERIALIZABLE with parallel query
On Mon, Mar 4, 2019 at 10:17 AM Thomas Munro wrote: > On Thu, Oct 11, 2018 at 10:15 AM Kevin Grittner wrote: > > It applies and builds clean, it passed make world with cassert and TAP > > tests, and I can't see any remaining flaws. This is true both of just > > the 0001 v16 patch and that with 0002 v16 applied on top of it. > > Thanks. I'd like to commit this soon. I did a round of testing under load and some printf-debugging to convince myself that the SXACT_FLAG_RO_SAFE handling really is exercised by serializable-parallel-2.spec and behaving as expected, along with some more testing by hand, and pushed this. To generate load I used a knock-off of sibench[1], run as eg ./petit-sibench --rows 1 --threads 8 --ssi, against a server running with -c min_parallel_table_scan_size=128kB -c parallel_setup_cost=0 -c max_worker_processes=16 -c max_parallel_workers=16. [1] https://github.com/macdice/petit-sibench/blob/master/petit-sibench.c -- Thomas Munro https://enterprisedb.com
Re: [HACKERS] SERIALIZABLE with parallel query
On Mon, Oct 8, 2018 at 9:40 PM Thomas Munro wrote: > Rebased. It applies and builds clean, it passed make world with cassert and TAP tests, and I can't see any remaining flaws. This is true both of just the 0001 v16 patch and that with 0002 v16 applied on top of it. It would be great if someone with a big test machine could stress test and benchmark this versus current production versions. -- Kevin Grittner VMware vCenter Server https://www.vmware.com/
Re: [HACKERS] SERIALIZABLE with parallel query
On Tue, Oct 2, 2018 at 4:53 PM Thomas Munro wrote: > Thanks for the review! And sorry for my delayed response. Here is a > rebased patch, with changes as requested. Rebased. -- Thomas Munro http://www.enterprisedb.com 0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v16.patch Description: Binary data 0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v16.patch Description: Binary data
Re: [HACKERS] SERIALIZABLE with parallel query
On Thu, Sep 20, 2018 at 9:50 AM Kevin Grittner wrote: > On Tue, Jul 10, 2018 at 8:58 PM Masahiko Sawada wrote: > > I looked at this patches. The latest patch can build without any > > errors and warnings and pass all regression tests. I don't see > > critical bugs but there are random comments. Thanks for the review! And sorry for my delayed response. Here is a rebased patch, with changes as requested. I have replies also for Kevin, see further down. > > + /* > > +* If the leader in a parallel query earler stashed a > > partially > > +* released SERIALIZABLEXACT for final clean-up at end > > of transaction > > +* (because workers might still have been accessing > > it), then it's > > +* time to restore it. > > +*/ > > > > There is a typo. > > s/earler/earlier/ Fixed. > > > > Should we add test to check if write-skew[1] anomaly doesn't happen > > even in parallel mode? I suppose we could find another one of the existing specs that shows write-skew and adapt it to run a read-only part of the transaction in a parallel worker, but what would it prove that the proposed new test doesn't prove already? > > > > - * We aren't acquiring lightweight locks for the predicate lock or lock > > + * We acquire an LWLock in the case of parallel mode, because worker > > + * backends have access to the leader's SERIALIZABLEXACT. Otherwise, > > + * we aren't acquiring lightweight locks for the predicate lock or lock > > > > There are LWLock and lightweight lock. Maybe it's better to unify the > > spelling. Done. > > > > @@ -2231,9 +2234,12 @@ PrepareTransaction(void) > > /* > > * Mark serializable transaction as complete for predicate locking > > * purposes. This should be done as late as we can put it and > > still allow > > -* errors to be raised for failure patterns found at commit. > > +* errors to be raised for failure patterns found at commit. > > This is not > > +* appropriate for parallel workers however, because we aren't > > committing > > +* the leader's transaction and its serializable state will live on. > > */ > > - PreCommit_CheckForSerializationFailure(); > > + if (!IsParallelWorker()) > > + PreCommit_CheckForSerializationFailure(); > > > > This code assumes that parallel workers could prepare transaction. Is > > that expected behavior of parallel query? There is an assertion > > !IsInParallelMode() at the beginning of that function though. You are right. I made a change exactly like this in CommitTransaction(), where it is necessary, but then somehow I managed to apply that hunk to the identical code in PrepareTransaction() also, where it is not necessary. Fixed. > > > > +/* > > + * If the transaction is committing, but it has been partially released > > + * already, then treat this as a roll back. It was marked as rolled > > back. > > + */ > > +if (isCommit && SxactIsPartiallyReleased(MySerializableXact)) > > +isCommit = false; > > + > > > > Isn't it better to add an assertion to check if > > MySerializableXact->flags has SXACT_FLAG_ROLLED_BACK flag for safety? That can't hurt. Added. It's don't really the fact that the flag contradicts reality here... but it was already established that the read-only safe optimisation calls ReleasePredicateLocks(false) which behaves like a rollback and marks the SERIALIZABLEXACT that way. I don't have a better idea right now. On Thu, Sep 20, 2018 at 9:50 AM Kevin Grittner wrote: > After reviewing the thread and the current two patches, I agree with > Masahiko Sawada plus preferring one adjustment to the coding: I would > prefer to break out the majority of the ReleasePredicateLocks function > to a static ReleasePredicateLocksMain (or similar) function and > eliminating the goto. Hi Kevin, Thanks for the review. How about moving that bit of local-cleanup code from the end of the function into a new static function ReleasePredicateLocksLocal(), so that we can call it and then return, instead of the evil "goto"? Done that way in the attached. > The optimization in patch 0002 is important. Previous benchmarks > showed a fairly straightforward pgbench test scaled as well as > REPEATABLE READ when it was present, but performance fell off up to > 20% as the scale increased without it. +1 > I will spend a few more days in testing and review, but figured I > should pass along "first impressions" now. Thanks! -- Thomas Munro http://www.enterprisedb.com 0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v15.patch Description: Binary data 0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v15.patch Description: Binary data
Re: [HACKERS] SERIALIZABLE with parallel query
On Wed, Sep 19, 2018 at 04:50:40PM -0500, Kevin Grittner wrote: > I will spend a few more days in testing and review, but figured I > should pass along "first impressions" now. Kevin, it seems that this patch is pending on your input. I have moved this patch to next CF for now. -- Michael signature.asc Description: PGP signature
Re: [HACKERS] SERIALIZABLE with parallel query
After reviewing the thread and the current two patches, I agree with Masahiko Sawada plus preferring one adjustment to the coding: I would prefer to break out the majority of the ReleasePredicateLocks function to a static ReleasePredicateLocksMain (or similar) function and eliminating the goto. The optimization in patch 0002 is important. Previous benchmarks showed a fairly straightforward pgbench test scaled as well as REPEATABLE READ when it was present, but performance fell off up to 20% as the scale increased without it. I will spend a few more days in testing and review, but figured I should pass along "first impressions" now. On Tue, Jul 10, 2018 at 8:58 PM Masahiko Sawada wrote: > > On Mon, Jul 2, 2018 at 3:12 PM, Masahiko Sawada wrote: > > On Fri, Jun 29, 2018 at 7:28 PM, Thomas Munro > > wrote: > >> On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada > >> wrote: > >>> I'd like to test and review this patches but they seem to conflict > >>> with current HEAD. Could you please rebase them? > >> > >> Hi Sawada-san, > >> > >> Thanks! Rebased and attached. The only changes are: the LWLock > >> tranche is now shown as "serializable_xact" instead of "sxact" (hmm, > >> LWLock tranches have lower_case_names_with_underscores, but individual > >> LWLocks have CamelCaseName...), and ShareSerializableXact() no longer > >> does Assert(!IsParallelWorker()). These changes are based on the last > >> feedback from Robert. > >> > > > > Thank you! Will look at patches. > > > > I looked at this patches. The latest patch can build without any > errors and warnings and pass all regression tests. I don't see > critical bugs but there are random comments. > > + /* > +* If the leader in a parallel query earler stashed a > partially > +* released SERIALIZABLEXACT for final clean-up at end > of transaction > +* (because workers might still have been accessing > it), then it's > +* time to restore it. > +*/ > > There is a typo. > s/earler/earlier/ > > > Should we add test to check if write-skew[1] anomaly doesn't happen > even in parallel mode? > > > - * We aren't acquiring lightweight locks for the predicate lock or lock > + * We acquire an LWLock in the case of parallel mode, because worker > + * backends have access to the leader's SERIALIZABLEXACT. Otherwise, > + * we aren't acquiring lightweight locks for the predicate lock or lock > > There are LWLock and lightweight lock. Maybe it's better to unify the > spelling. > > > @@ -2231,9 +2234,12 @@ PrepareTransaction(void) > /* > * Mark serializable transaction as complete for predicate locking > * purposes. This should be done as late as we can put it and > still allow > -* errors to be raised for failure patterns found at commit. > +* errors to be raised for failure patterns found at commit. > This is not > +* appropriate for parallel workers however, because we aren't > committing > +* the leader's transaction and its serializable state will live on. > */ > - PreCommit_CheckForSerializationFailure(); > + if (!IsParallelWorker()) > + PreCommit_CheckForSerializationFailure(); > > This code assumes that parallel workers could prepare transaction. Is > that expected behavior of parallel query? There is an assertion > !IsInParallelMode() at the beginning of that function though. > > > +/* > + * If the transaction is committing, but it has been partially released > + * already, then treat this as a roll back. It was marked as rolled > back. > + */ > +if (isCommit && SxactIsPartiallyReleased(MySerializableXact)) > +isCommit = false; > + > > Isn't it better to add an assertion to check if > MySerializableXact->flags has SXACT_FLAG_ROLLED_BACK flag for safety? > > [1] https://en.wikipedia.org/wiki/Snapshot_isolation#Definition > > Regards, > > -- > Masahiko Sawada > NIPPON TELEGRAPH AND TELEPHONE CORPORATION > NTT Open Source Software Center -- Kevin Grittner VMware vCenter Server https://www.vmware.com/
Re: [HACKERS] SERIALIZABLE with parallel query
On Mon, Jul 2, 2018 at 3:12 PM, Masahiko Sawada wrote: > On Fri, Jun 29, 2018 at 7:28 PM, Thomas Munro > wrote: >> On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada >> wrote: >>> I'd like to test and review this patches but they seem to conflict >>> with current HEAD. Could you please rebase them? >> >> Hi Sawada-san, >> >> Thanks! Rebased and attached. The only changes are: the LWLock >> tranche is now shown as "serializable_xact" instead of "sxact" (hmm, >> LWLock tranches have lower_case_names_with_underscores, but individual >> LWLocks have CamelCaseName...), and ShareSerializableXact() no longer >> does Assert(!IsParallelWorker()). These changes are based on the last >> feedback from Robert. >> > > Thank you! Will look at patches. > I looked at this patches. The latest patch can build without any errors and warnings and pass all regression tests. I don't see critical bugs but there are random comments. + /* +* If the leader in a parallel query earler stashed a partially +* released SERIALIZABLEXACT for final clean-up at end of transaction +* (because workers might still have been accessing it), then it's +* time to restore it. +*/ There is a typo. s/earler/earlier/ Should we add test to check if write-skew[1] anomaly doesn't happen even in parallel mode? - * We aren't acquiring lightweight locks for the predicate lock or lock + * We acquire an LWLock in the case of parallel mode, because worker + * backends have access to the leader's SERIALIZABLEXACT. Otherwise, + * we aren't acquiring lightweight locks for the predicate lock or lock There are LWLock and lightweight lock. Maybe it's better to unify the spelling. @@ -2231,9 +2234,12 @@ PrepareTransaction(void) /* * Mark serializable transaction as complete for predicate locking * purposes. This should be done as late as we can put it and still allow -* errors to be raised for failure patterns found at commit. +* errors to be raised for failure patterns found at commit. This is not +* appropriate for parallel workers however, because we aren't committing +* the leader's transaction and its serializable state will live on. */ - PreCommit_CheckForSerializationFailure(); + if (!IsParallelWorker()) + PreCommit_CheckForSerializationFailure(); This code assumes that parallel workers could prepare transaction. Is that expected behavior of parallel query? There is an assertion !IsInParallelMode() at the beginning of that function though. +/* + * If the transaction is committing, but it has been partially released + * already, then treat this as a roll back. It was marked as rolled back. + */ +if (isCommit && SxactIsPartiallyReleased(MySerializableXact)) +isCommit = false; + Isn't it better to add an assertion to check if MySerializableXact->flags has SXACT_FLAG_ROLLED_BACK flag for safety? [1] https://en.wikipedia.org/wiki/Snapshot_isolation#Definition Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] SERIALIZABLE with parallel query
On Fri, Jun 29, 2018 at 7:28 PM, Thomas Munro wrote: > On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada > wrote: >> I'd like to test and review this patches but they seem to conflict >> with current HEAD. Could you please rebase them? > > Hi Sawada-san, > > Thanks! Rebased and attached. The only changes are: the LWLock > tranche is now shown as "serializable_xact" instead of "sxact" (hmm, > LWLock tranches have lower_case_names_with_underscores, but individual > LWLocks have CamelCaseName...), and ShareSerializableXact() no longer > does Assert(!IsParallelWorker()). These changes are based on the last > feedback from Robert. > Thank you! Will look at patches. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] SERIALIZABLE with parallel query
On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada wrote: > I'd like to test and review this patches but they seem to conflict > with current HEAD. Could you please rebase them? Hi Sawada-san, Thanks! Rebased and attached. The only changes are: the LWLock tranche is now shown as "serializable_xact" instead of "sxact" (hmm, LWLock tranches have lower_case_names_with_underscores, but individual LWLocks have CamelCaseName...), and ShareSerializableXact() no longer does Assert(!IsParallelWorker()). These changes are based on the last feedback from Robert. -- Thomas Munro http://www.enterprisedb.com 0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v14.patch Description: Binary data 0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v14.patch Description: Binary data
Re: [HACKERS] SERIALIZABLE with parallel query
On Fri, Mar 30, 2018 at 2:56 PM, Thomas Munro wrote: > On Thu, Mar 8, 2018 at 10:28 AM, Robert Haas wrote: >> +SerializableXactHandle >> +ShareSerializableXact(void) >> +{ >> +Assert(!IsParallelWorker()); >> + >> +return MySerializableXact; >> +} >> >> Uh, how's that OK? There's no rule that you can't create a >> ParallelContext in a worker. Parallel query currently doesn't, so it >> probably won't happen, but burying an assertion to that effect in the >> predicate locking code doesn't seem nice. > > Hmm. I suppose you could have a PARALLEL SAFE function that itself > launches parallel workers explicitly (not via parallel query), and > they should inherit the same SERIALIZABLEXACT from their parent and > that should all just work. > >> Is "sxact" really the best (i.e. clearest) name we can come up with >> for the lock tranche? > > Yeah, needs a better name. > > I have some lingering uncertainty about this patch and we're out of > time, so I moved it to PG12 CF1. Thanks Haribabu, Robert, Amit for > the reviews and comments so far. > I'd like to test and review this patches but they seem to conflict with current HEAD. Could you please rebase them? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] SERIALIZABLE with parallel query
On Thu, Mar 8, 2018 at 10:28 AM, Robert Haaswrote: > +SerializableXactHandle > +ShareSerializableXact(void) > +{ > +Assert(!IsParallelWorker()); > + > +return MySerializableXact; > +} > > Uh, how's that OK? There's no rule that you can't create a > ParallelContext in a worker. Parallel query currently doesn't, so it > probably won't happen, but burying an assertion to that effect in the > predicate locking code doesn't seem nice. Hmm. I suppose you could have a PARALLEL SAFE function that itself launches parallel workers explicitly (not via parallel query), and they should inherit the same SERIALIZABLEXACT from their parent and that should all just work. > Is "sxact" really the best (i.e. clearest) name we can come up with > for the lock tranche? Yeah, needs a better name. I have some lingering uncertainty about this patch and we're out of time, so I moved it to PG12 CF1. Thanks Haribabu, Robert, Amit for the reviews and comments so far. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] SERIALIZABLE with parallel query
On Mon, Feb 26, 2018 at 6:37 PM, Thomas Munrowrote: > I've now broken it into two patches. Rebased. -- Thomas Munro http://www.enterprisedb.com 0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v13.patch Description: Binary data 0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v13.patch Description: Binary data
Re: [HACKERS] SERIALIZABLE with parallel query
On Sat, Feb 24, 2018 at 12:04 AM, Thomas Munrowrote: > I'm testing another version that is a lot simpler: like v10, it relies > on the knowledge that the leader's transaction will always end after > the workers have finished, but it handles the RO_SAFE optimisation by > keeping the SERIALIZABLEXACT alive but freeing its locks etc. More > soon. I've now broken it into two patches. Patch 0001 is like my original patch with some minor improvements, except that it now disables the RO_SAFE optimisation completely in parallel mode. In other words, it's the stupidest fix possible to the problem you flagged up. I think the main questions to answer about the 0001 patch are whether this new locking protocol is sufficient, whether anything bad could happen as a result of lock escalation/transfer, and whether the underlying assumption about the SERIALIZABLEXACT's lifetime holds true (that the leader will never call ReleasePredicateLocks() while a worker is still running). There are a couple of easy incremental improvements that could be made on top of that patch, but I didn't make them because I'm trying to be conservative in the hope of landing at least the basic feature in PostgreSQL 11. Namely: 1. We could still return false if we see SXACT_FLAG_RO_SAFE in SerializationNeededForRead() (we just couldn't call ReleasePredicateLocks()). 2. We could set MySerializableXact to InvalidSerializableXact in worker backends so at least they'd benefit from the optimisation (we just couldn't do that in the leader or it'd leak resources). Patch 0002 aims a bit higher than those ideas. I wanted to make sure that the leader wouldn't arbitrarily miss out on the optimisation, and I also suspect that the optimisation might be contagious in the sense that actually releasing sooner might cause the RO_SAFE flag to be set on *other* transactions sooner. Patch 0002 works like this: The first backend to observe the RO_SAFE flag 'partially releases' the SERIALIZABLEXACT, so that the SERIALIZABLEXACT itself remains valid. (The concept of 'partial release' already existed, but I'm using it in a new way.) All backends clear their MySerializableXact variable so that they drop to faster SI in their own time. The leader keeps a copy of it in SavedSerializableXact, so that it can fully release it at the end of the transaction when we know that no other backend has a reference to it. These patches survive hammering with a simple test that generates a mixture of read only and read write parallel queries that hit the interesting case (attached; this test helped me understand that the refcount scheme I considered was going to be hard). I haven't personally tried to measure the value of the optimisation (though I'm pretty sure it exists, based on the VLDB paper and the knowledge that REPEATABLE READ (what the optimisation effectively gives you) just has to be faster than SERIALIZABLE 'cause I've see all that code you get to not run!). I'd like to propose the 0001 patch for now, but keep the 0002 patch back for a bit as it's very new and needs more feedback, if possible from Kevin and others involved in the SSI project. Of course their input on the 0001 patch is also super welcome. -- Thomas Munro http://www.enterprisedb.com import psycopg2 import threading DSN = "dbname=postgres" def run2(i): conn = psycopg2.connect(DSN) cursor = conn.cursor() cursor.execute("""set parallel_setup_cost=0""") cursor.execute("""set parallel_tuple_cost=0""") cursor.execute("""set min_parallel_table_scan_size=0""") cursor.execute("""set max_parallel_workers_per_gather=4""") cursor.execute("""set default_transaction_isolation=serializable""") if i % 2 == 0: cursor.execute("""set default_transaction_read_only=on""") cursor.execute("""select count(*) from foo""") cursor.execute("""select count(*) from foo""") conn.commit() for i in range(16): conn = psycopg2.connect(DSN) cursor = conn.cursor() cursor.execute("""drop table if exists foo""") cursor.execute("""create table foo as select generate_series(1, 10)::int as a""") cursor.execute("""alter table foo set (parallel_workers = 4)""") conn.commit() conn.close() threading.Thread(target=run2, args=[i]).start() 0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v12.patch Description: Binary data 0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v12.patch Description: Binary data
Re: [HACKERS] SERIALIZABLE with parallel query
On Fri, Feb 23, 2018 at 6:05 AM, Robert Haaswrote: > On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro > wrote:> >> The best solution I have come up with so far is to add a reference >> count to SERIALIZABLEXACT. I toyed with putting the refcount into the >> DSM instead, but then I ran into problems making that work when you >> have a query with multiple Gather nodes. Since the refcount is in >> SERIALIZABLEXACT I also had to add a generation counter so that I >> could detect the case where you try to attach too late (the leader has >> already errored out, the refcount has reached 0 and the >> SERIALIZABLEXACT object has been recycled). > > I don't know whether that's safe or not. It certainly sounds like > it's solving one category of problem, but is that the only issue? If > some backends haven't noticed that we're safe, they might keep > acquiring SIREAD locks or doing other manipulations of shared state, > which maybe could cause confusion. I haven't looked into this deeply > enough to understand whether there's actually a possibility of trouble > there, but I can't rule it out off-hand. After some testing, I think the refcount approach could be made to work, but it seems quite complicated and there are some weird edge cases that showed up that started to make it look like more trouble than it was worth. One downside of refcounts is that you never get to free the SERIALIZABLEXACT until the end of the transaction with parallel_leader_participation = off. I'm testing another version that is a lot simpler: like v10, it relies on the knowledge that the leader's transaction will always end after the workers have finished, but it handles the RO_SAFE optimisation by keeping the SERIALIZABLEXACT alive but freeing its locks etc. More soon.
Re: [HACKERS] SERIALIZABLE with parallel query
On Fri, Feb 23, 2018 at 7:56 PM, Amit Kapilawrote: > On Fri, Feb 23, 2018 at 8:48 AM, Thomas Munro > wrote: >> On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapila wrote: >>> By the way, in which case leader can exit early? As of now, we do >>> wait for workers to end both before the query is finished or in error >>> cases. >> >> create table foo as select generate_series(1, 10)::int a; >> alter table foo set (parallel_workers = 2); >> set parallel_setup_cost = 0; >> set parallel_tuple_cost = 0; >> select count(a / 0) from foo; >> >> That reliably gives me: >> ERROR: division by zero [from leader] >> ERROR: could not map dynamic shared memory segment [from workers] >> >> I thought this was coming from resource manager cleanup, but you're >> right: that happens after we wait for all workers to finish. Perhaps >> this is a race within DestroyParallelContext() itself: when it is >> called by AtEOXact_Parallel() during an abort, it asks the postmaster >> to SIGTERM the workers, then it immediately detaches from the DSM >> segment, and then it waits for the worker to start up. >> > > I guess you mean to say worker waits to shutdown/exit. Why would it > wait for startup at that stage? Right, I meant to say shutdown/exit. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] SERIALIZABLE with parallel query
On Fri, Feb 23, 2018 at 8:48 AM, Thomas Munrowrote: > On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapila wrote: >> On Thu, Feb 22, 2018 at 10:35 PM, Robert Haas wrote: >>> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro PS I noticed that for BecomeLockGroupMember() we say "If we can't join the lock group, the leader has gone away, so just exit quietly" but for various other similar things we spew errors (most commonly seen one being "ERROR: could not map dynamic shared memory segment"). Intentional? >>> >>> I suppose I thought that if we failed to map the dynamic shared memory >>> segment, it might be down to any one of several causes; whereas if we >>> fail to join the lock group, it must be because the leader has already >>> exited. There might be a flaw in that thinking, though. >>> >> >> By the way, in which case leader can exit early? As of now, we do >> wait for workers to end both before the query is finished or in error >> cases. > > create table foo as select generate_series(1, 10)::int a; > alter table foo set (parallel_workers = 2); > set parallel_setup_cost = 0; > set parallel_tuple_cost = 0; > select count(a / 0) from foo; > > That reliably gives me: > ERROR: division by zero [from leader] > ERROR: could not map dynamic shared memory segment [from workers] > > I thought this was coming from resource manager cleanup, but you're > right: that happens after we wait for all workers to finish. Perhaps > this is a race within DestroyParallelContext() itself: when it is > called by AtEOXact_Parallel() during an abort, it asks the postmaster > to SIGTERM the workers, then it immediately detaches from the DSM > segment, and then it waits for the worker to start up. > I guess you mean to say worker waits to shutdown/exit. Why would it wait for startup at that stage? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] SERIALIZABLE with parallel query
On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapilawrote: > On Thu, Feb 22, 2018 at 10:35 PM, Robert Haas wrote: >> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro >>> PS I noticed that for BecomeLockGroupMember() we say "If we can't >>> join the lock group, the leader has gone away, so just exit quietly" >>> but for various other similar things we spew errors (most commonly >>> seen one being "ERROR: could not map dynamic shared memory segment"). >>> Intentional? >> >> I suppose I thought that if we failed to map the dynamic shared memory >> segment, it might be down to any one of several causes; whereas if we >> fail to join the lock group, it must be because the leader has already >> exited. There might be a flaw in that thinking, though. >> > > By the way, in which case leader can exit early? As of now, we do > wait for workers to end both before the query is finished or in error > cases. create table foo as select generate_series(1, 10)::int a; alter table foo set (parallel_workers = 2); set parallel_setup_cost = 0; set parallel_tuple_cost = 0; select count(a / 0) from foo; That reliably gives me: ERROR: division by zero [from leader] ERROR: could not map dynamic shared memory segment [from workers] I thought this was coming from resource manager cleanup, but you're right: that happens after we wait for all workers to finish. Perhaps this is a race within DestroyParallelContext() itself: when it is called by AtEOXact_Parallel() during an abort, it asks the postmaster to SIGTERM the workers, then it immediately detaches from the DSM segment, and then it waits for the worker to start up. The workers unblock signals before the they try to attach to the DSM segment, but they don't CHECK_FOR_INTERRUPTS before they try to attach (and even if they did it wouldn't solve nothing). I don't like the error much, though at least the root cause error is logged first. I don't immediately see how BecomeLockGroupMember() could have the same kind of problem though, for the reason you said: the leader waits for the workers to finish, so I'm not sure in which circumstances it would cease to be the lock group leader while the workers are still running. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] SERIALIZABLE with parallel query
On Thu, Feb 22, 2018 at 10:35 PM, Robert Haaswrote: > On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro >> PS I noticed that for BecomeLockGroupMember() we say "If we can't >> join the lock group, the leader has gone away, so just exit quietly" >> but for various other similar things we spew errors (most commonly >> seen one being "ERROR: could not map dynamic shared memory segment"). >> Intentional? > > I suppose I thought that if we failed to map the dynamic shared memory > segment, it might be down to any one of several causes; whereas if we > fail to join the lock group, it must be because the leader has already > exited. There might be a flaw in that thinking, though. > By the way, in which case leader can exit early? As of now, we do wait for workers to end both before the query is finished or in error cases. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] SERIALIZABLE with parallel query
On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munrowrote:> > The best solution I have come up with so far is to add a reference > count to SERIALIZABLEXACT. I toyed with putting the refcount into the > DSM instead, but then I ran into problems making that work when you > have a query with multiple Gather nodes. Since the refcount is in > SERIALIZABLEXACT I also had to add a generation counter so that I > could detect the case where you try to attach too late (the leader has > already errored out, the refcount has reached 0 and the > SERIALIZABLEXACT object has been recycled). I don't know whether that's safe or not. It certainly sounds like it's solving one category of problem, but is that the only issue? If some backends haven't noticed that we're safe, they might keep acquiring SIREAD locks or doing other manipulations of shared state, which maybe could cause confusion. I haven't looked into this deeply enough to understand whether there's actually a possibility of trouble there, but I can't rule it out off-hand. One approach is to just disable this optimization for parallel query. Being able to use SERIALIZABLE with parallel query is better than not being able to do it, even if some optimizations are not applied in that case. Of course making the optimizations work is better, but we've got to be sure we're doing it right. > PS I noticed that for BecomeLockGroupMember() we say "If we can't > join the lock group, the leader has gone away, so just exit quietly" > but for various other similar things we spew errors (most commonly > seen one being "ERROR: could not map dynamic shared memory segment"). > Intentional? I suppose I thought that if we failed to map the dynamic shared memory segment, it might be down to any one of several causes; whereas if we fail to join the lock group, it must be because the leader has already exited. There might be a flaw in that thinking, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] SERIALIZABLE with parallel query
On Fri, Jan 26, 2018 at 4:24 AM, Robert Haaswrote: > I took a look at this today and thought it might be OK to commit, Thank you for looking at this! > modulo a few minor issues: (1) you didn't document the new tranche and Fixed. > (2) I prefer to avoid if (blah) { Assert(thing) } in favor of > Assert(!blah || thing). Done. > But then I got a little bit concerned about ReleasePredicateLocks(). > Suppose that in the middle of a read-only transaction, > SXACT_FLAG_RO_SAFE becomes true. The next call to > SerializationNeededForRead in each process will call > ReleasePredicateLocks. In the workers, this won't do anything, so > we'll just keep coming back there. But in the leader, we'll go ahead > do all that stuff, including MySerializableXact = > InvalidSerializableXact. But in the workers, it's still set. Maybe > that's OK, but I'm not sure that it's OK... Ouch. Yeah. It's not OK. If the leader gives up its SERIALIZABLEXACT object early due to that safe-read-only optimisation, the workers are left with a dangling pointer to a SERIALIZABLEXACT object that has been pushed onto FinishedSerializableTransactions. >From there it will move to PredXact->availableTransactions and might be handed out to some other transaction, so it is not safe to retain a pointer to that object. The best solution I have come up with so far is to add a reference count to SERIALIZABLEXACT. I toyed with putting the refcount into the DSM instead, but then I ran into problems making that work when you have a query with multiple Gather nodes. Since the refcount is in SERIALIZABLEXACT I also had to add a generation counter so that I could detect the case where you try to attach too late (the leader has already errored out, the refcount has reached 0 and the SERIALIZABLEXACT object has been recycled). The attached is a draft patch only, needing some testing and polish. Brickbats, better ideas? FWIW I also considered a couple of other ideas: (1) Keeping the object alive on the FinishedSerializableTransactions list until the leader's transaction is finished seems broken because we need to be able to spill that list to the SLRU at any time, and if we somehow made them sticky we could run out of memory. (2) Anything involving the leader having sole control of the object lifetime seems problematic... well, it might work if you disabled the SXACT_FLAG_RO_SAFE optimisation so that ReleasePredicateLocks() always happens after all workers have finished, but that seems like cheating. PS I noticed that for BecomeLockGroupMember() we say "If we can't join the lock group, the leader has gone away, so just exit quietly" but for various other similar things we spew errors (most commonly seen one being "ERROR: could not map dynamic shared memory segment"). Intentional? -- Thomas Munro http://www.enterprisedb.com ssi-parallel-v11.patch Description: Binary data
Re: [HACKERS] SERIALIZABLE with parallel query
On Wed, Jan 24, 2018 at 7:39 PM, Thomas Munrowrote: > This started crashing some time yesterday with an assertion failure in > the isolation tests after commit 2badb5af landed. Reordering of code > in parallel.c confused patch's fuzz heuristics leading > SetSerializableXact() to be called too soon. Here's a fix for that. I took a look at this today and thought it might be OK to commit, modulo a few minor issues: (1) you didn't document the new tranche and (2) I prefer to avoid if (blah) { Assert(thing) } in favor of Assert(!blah || thing). But then I got a little bit concerned about ReleasePredicateLocks(). Suppose that in the middle of a read-only transaction, SXACT_FLAG_RO_SAFE becomes true. The next call to SerializationNeededForRead in each process will call ReleasePredicateLocks. In the workers, this won't do anything, so we'll just keep coming back there. But in the leader, we'll go ahead do all that stuff, including MySerializableXact = InvalidSerializableXact. But in the workers, it's still set. Maybe that's OK, but I'm not sure that it's OK... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] SERIALIZABLE with parallel query
On Wed, Dec 13, 2017 at 5:30 PM, Thomas Munrowrote: > On Wed, Dec 13, 2017 at 2:09 PM, Haribabu Kommi > wrote: >> Thanks for explaining the problem in generating an isolation test to >> test the serialize parallel query. >> >> Committer can decide whether existing test is fine to part of the test suite >> or remove it, other than that everything is fine. so I am moving the patch >> into "ready for committer" state. > > Thank you! I will try to find a good benchmark that will really > exercise parallel query + SSI. This started crashing some time yesterday with an assertion failure in the isolation tests after commit 2badb5af landed. Reordering of code in parallel.c confused patch's fuzz heuristics leading SetSerializableXact() to be called too soon. Here's a fix for that. -- Thomas Munro http://www.enterprisedb.com ssi-parallel-v10.patch Description: Binary data
Re: [HACKERS] SERIALIZABLE with parallel query
On Wed, Dec 13, 2017 at 2:09 PM, Haribabu Kommiwrote: > Thanks for explaining the problem in generating an isolation test to > test the serialize parallel query. > > Committer can decide whether existing test is fine to part of the test suite > or remove it, other than that everything is fine. so I am moving the patch > into "ready for committer" state. Thank you! I will try to find a good benchmark that will really exercise parallel query + SSI. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] SERIALIZABLE with parallel query
On Fri, Dec 8, 2017 at 11:33 AM, Thomas Munrowrote: > On Thu, Nov 30, 2017 at 2:44 PM, Thomas Munro > wrote: > > On Thu, Nov 30, 2017 at 2:32 PM, Michael Paquier > > wrote: > >> Could this question be answered? The patch still applies so I am > >> moving it to next CF. > > Rebased, 'cause it broke. Thanks for explaining the problem in generating an isolation test to test the serialize parallel query. Committer can decide whether existing test is fine to part of the test suite or remove it, other than that everything is fine. so I am moving the patch into "ready for committer" state. Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] SERIALIZABLE with parallel query
On Thu, Nov 30, 2017 at 2:44 PM, Thomas Munrowrote: > On Thu, Nov 30, 2017 at 2:32 PM, Michael Paquier > wrote: >> Could this question be answered? The patch still applies so I am >> moving it to next CF. Rebased, 'cause it broke. -- Thomas Munro http://www.enterprisedb.com ssi-parallel-v9.patch Description: Binary data
Re: [HACKERS] SERIALIZABLE with parallel query
On Tue, Sep 26, 2017 at 4:41 PM, Haribabu Kommiwrote: > > > On Mon, Sep 25, 2017 at 6:57 PM, Thomas Munro < > thomas.mu...@enterprisedb.com> wrote: > >> On Mon, Sep 25, 2017 at 8:37 PM, Haribabu Kommi >> wrote: >> > After I tune the GUC to go with sequence scan, still I am not getting >> the >> > error >> > in the session-2 for update operation like it used to generate an error >> for >> > parallel >> > sequential scan, and also it even takes some many commands until unless >> the >> > S1 >> > commits. >> >> Hmm. Then this requires more explanation because I don't expect a >> difference. I did some digging and realised that the error detail >> message "Reason code: Canceled on identification as a pivot, during >> write." was reached in a code path that requires >> SxactIsPrepared(writer) and also MySerializableXact == writer, which >> means that the process believes it is committing. Clearly something >> is wrong. After some more digging I realised that >> ParallelWorkerMain() calls EndParallelWorkerTransaction() which calls >> CommitTransaction() which calls >> PreCommit_CheckForSerializationFailure(). Since the worker is >> connected to the leader's SERIALIZABLEXACT, that finishes up being >> marked as preparing to commit (not true!), and then the leader get >> confused during that write, causing a serialization failure to be >> raised sooner (though I can't explain why it should be raised then >> anyway, but that's another topic). Oops. I think the fix here is >> just not to do that in a worker (the worker's CommitTransaction() >> doesn't really mean what it says). >> >> Here's a version with a change that makes that conditional. This way >> your test case behaves the same as non-parallel mode. >> > > The patch looks good, and I don't have any comments for the code. > The test that is going to add by the patch is not generating a true > parallelism scenario, I feel it is better to change the test that can > generate a parallel sequence/index/bitmap scan. > The latest patch is good. It lacks a test that verifies the serialize support with actual parallel workers, so in case if it broken, it is difficult to know. Regards, Hari Babu Fujitsu Australia