Re: [HACKERS] SERIALIZABLE with parallel query

2019-03-14 Thread Thomas Munro
On Mon, Mar 4, 2019 at 10:17 AM Thomas Munro  wrote:
> On Thu, Oct 11, 2018 at 10:15 AM Kevin Grittner  wrote:
> > It applies and builds clean, it passed make world with cassert and TAP
> > tests, and I can't see any remaining flaws.  This is true both of just
> > the 0001 v16 patch and that with 0002 v16 applied on top of it.
>
> Thanks.  I'd like to commit this soon.

I did a round of testing under load and some printf-debugging to
convince myself that the SXACT_FLAG_RO_SAFE handling really is
exercised by serializable-parallel-2.spec and behaving as expected,
along with some more testing by hand, and pushed this.

To generate load I used a knock-off of sibench[1], run as e.g.
./petit-sibench --rows 1 --threads 8 --ssi, against a server
running with -c min_parallel_table_scan_size=128kB -c
parallel_setup_cost=0 -c max_worker_processes=16 -c
max_parallel_workers=16.

[1] https://github.com/macdice/petit-sibench/blob/master/petit-sibench.c
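
For a quicker smoke test without the benchmark tool, something like the
following (table name illustrative; the table needs to be larger than
min_parallel_table_scan_size) should show a Gather node in the plan,
confirming that a SERIALIZABLE transaction is using parallel workers:

begin isolation level serializable;
set local parallel_setup_cost = 0;
set local parallel_tuple_cost = 0;
set local min_parallel_table_scan_size = '128kB';
explain (costs off) select count(*) from some_table;  -- expect a Gather node
commit;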

-- 
Thomas Munro
https://enterprisedb.com



Re: [HACKERS] SERIALIZABLE with parallel query

2018-10-10 Thread Kevin Grittner
On Mon, Oct 8, 2018 at 9:40 PM Thomas Munro
 wrote:
> Rebased.

It applies and builds clean, it passed make world with cassert and TAP
tests, and I can't see any remaining flaws.  This is true both of just
the 0001 v16 patch and that with 0002 v16 applied on top of it.

It would be great if someone with a big test machine could stress test
and benchmark this versus current production versions.

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: [HACKERS] SERIALIZABLE with parallel query

2018-10-08 Thread Thomas Munro
On Tue, Oct 2, 2018 at 4:53 PM Thomas Munro
 wrote:
> Thanks for the review!  And sorry for my delayed response.  Here is a
> rebased patch, with changes as requested.

Rebased.

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v16.patch
Description: Binary data


0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v16.patch
Description: Binary data


Re: [HACKERS] SERIALIZABLE with parallel query

2018-10-01 Thread Thomas Munro
On Thu, Sep 20, 2018 at 9:50 AM Kevin Grittner  wrote:
> On Tue, Jul 10, 2018 at 8:58 PM Masahiko Sawada  wrote:
> > I looked at these patches. The latest patch builds without any
> > errors or warnings and passes all regression tests. I don't see any
> > critical bugs, but I have some random comments.

Thanks for the review!  And sorry for my delayed response.  Here is a
rebased patch, with changes as requested.  I also have replies for
Kevin; see further down.

> > +   /*
> > +* If the leader in a parallel query earler stashed a 
> > partially
> > +* released SERIALIZABLEXACT for final clean-up at end
> > of transaction
> > +* (because workers might still have been accessing
> > it), then it's
> > +* time to restore it.
> > +*/
> >
> > There is a typo.
> > s/earler/earlier/

Fixed.

> > 
> > Should we add a test to check that the write-skew[1] anomaly doesn't
> > happen even in parallel mode?

I suppose we could find another one of the existing specs that shows
write-skew and adapt it to run a read-only part of the transaction in
a parallel worker, but what would it prove that the proposed new test
doesn't prove already?
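
For reference, the write-skew shape in question can be checked by hand with
something like this (schema illustrative); under SERIALIZABLE exactly one of
the two transactions should fail with a serialization error, whether or not
the reads happen to use parallel workers:

create table doctors (id int primary key, on_call bool);
insert into doctors values (1, true), (2, true);
-- session 1: begin isolation level serializable;
-- session 1: select count(*) from doctors where on_call;
-- session 2: begin isolation level serializable;
-- session 2: select count(*) from doctors where on_call;
-- session 1: update doctors set on_call = false where id = 1; commit;
-- session 2: update doctors set on_call = false where id = 2; commit;
-- one of the two updates/commits fails with a serialization error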

> > 
> > - * We aren't acquiring lightweight locks for the predicate lock or lock
> > + * We acquire an LWLock in the case of parallel mode, because worker
> > + * backends have access to the leader's SERIALIZABLEXACT.  Otherwise,
> > + * we aren't acquiring lightweight locks for the predicate lock or lock
> >
> > There are both "LWLock" and "lightweight lock". Maybe it's better to
> > unify the spelling.

Done.

> > 
> > @@ -2231,9 +2234,12 @@ PrepareTransaction(void)
> > /*
> >  * Mark serializable transaction as complete for predicate locking
> >  * purposes.  This should be done as late as we can put it and
> > still allow
> > -* errors to be raised for failure patterns found at commit.
> > +* errors to be raised for failure patterns found at commit.
> > This is not
> > +* appropriate for parallel workers however, because we aren't
> > committing
> > +* the leader's transaction and its serializable state will live on.
> >  */
> > -   PreCommit_CheckForSerializationFailure();
> > +   if (!IsParallelWorker())
> > +   PreCommit_CheckForSerializationFailure();
> >
> > This code assumes that parallel workers could prepare a transaction.
> > Is that expected behavior for parallel query? There is an assertion
> > !IsInParallelMode() at the beginning of that function, though.

You are right.  I made a change exactly like this in
CommitTransaction(), where it is necessary, but then somehow I managed
to apply that hunk to the identical code in PrepareTransaction() also,
where it is not necessary.  Fixed.

> > 
> > +/*
> > + * If the transaction is committing, but it has been partially released
> > + * already, then treat this as a roll back.  It was marked as rolled 
> > back.
> > + */
> > +if (isCommit && SxactIsPartiallyReleased(MySerializableXact))
> > +isCommit = false;
> > +
> >
> > Isn't it better to add an assertion to check that
> > MySerializableXact->flags has the SXACT_FLAG_ROLLED_BACK flag, for safety?

That can't hurt.  Added.

I don't really like the fact that the flag contradicts reality here...
but it was already established that the read-only safe optimisation
calls ReleasePredicateLocks(false) which behaves like a rollback and
marks the SERIALIZABLEXACT that way.  I don't have a better idea right
now.

On Thu, Sep 20, 2018 at 9:50 AM Kevin Grittner  wrote:
> After reviewing the thread and the current two patches, I agree with
> Masahiko Sawada, with one preferred adjustment to the coding: I would
> prefer to break out the majority of the ReleasePredicateLocks function
> into a static ReleasePredicateLocksMain (or similar) function and
> eliminate the goto.

Hi Kevin,

Thanks for the review.

How about moving that bit of local-cleanup code from the end of the
function into a new static function ReleasePredicateLocksLocal(), so
that we can call it and then return, instead of the evil "goto"?  Done
that way in the attached.

> The optimization in patch 0002 is important.  Previous benchmarks
> showed a fairly straightforward pgbench test scaled as well as
> REPEATABLE READ when it was present, but performance fell off up to
> 20% as the scale increased without it.

+1

> I will spend a few more days in testing and review, but figured I
> should pass along "first impressions" now.

Thanks!

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v15.patch
Description: Binary data


0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v15.patch
Description: Binary data


Re: [HACKERS] SERIALIZABLE with parallel query

2018-10-01 Thread Michael Paquier
On Wed, Sep 19, 2018 at 04:50:40PM -0500, Kevin Grittner wrote:
> I will spend a few more days in testing and review, but figured I
> should pass along "first impressions" now.

Kevin, it seems that this patch is pending on your input.  I have moved
this patch to next CF for now.
--
Michael




Re: [HACKERS] SERIALIZABLE with parallel query

2018-09-19 Thread Kevin Grittner
After reviewing the thread and the current two patches, I agree with
Masahiko Sawada, with one preferred adjustment to the coding: I would
prefer to break out the majority of the ReleasePredicateLocks function
into a static ReleasePredicateLocksMain (or similar) function and
eliminate the goto.

The optimization in patch 0002 is important.  Previous benchmarks
showed a fairly straightforward pgbench test scaled as well as
REPEATABLE READ when it was present, but performance fell off up to
20% as the scale increased without it.
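
(The exact benchmark isn't shown here, but one way to reproduce that kind of
comparison is a simple read-only transaction script for pgbench, run at
increasing scale factors with default_transaction_isolation set first to
'repeatable read' and then to 'serializable', with and without the 0002
patch applied; the script and names below are illustrative.)

-- ro.sql, run with something like: pgbench -n -c 16 -j 16 -T 60 -f ro.sql
begin read only;
select count(*) from pgbench_accounts;
commit;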

I will spend a few more days in testing and review, but figured I
should pass along "first impressions" now.

On Tue, Jul 10, 2018 at 8:58 PM Masahiko Sawada  wrote:
>
> On Mon, Jul 2, 2018 at 3:12 PM, Masahiko Sawada  wrote:
> > On Fri, Jun 29, 2018 at 7:28 PM, Thomas Munro
> >  wrote:
> >> On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada  
> >> wrote:
> >>> I'd like to test and review these patches but they seem to conflict
> >>> with current HEAD. Could you please rebase them?
> >>
> >> Hi Sawada-san,
> >>
> >> Thanks!  Rebased and attached.  The only changes are: the LWLock
> >> tranche is now shown as "serializable_xact" instead of "sxact" (hmm,
> >> LWLock tranches have lower_case_names_with_underscores, but individual
> >> LWLocks have CamelCaseName...), and ShareSerializableXact() no longer
> >> does Assert(!IsParallelWorker()).  These changes are based on the last
> >> feedback from Robert.
> >>
> >
> > Thank you! Will look at patches.
> >
>
> I looked at these patches. The latest patch builds without any
> errors or warnings and passes all regression tests. I don't see any
> critical bugs, but I have some random comments.
>
> +   /*
> +* If the leader in a parallel query earler stashed a 
> partially
> +* released SERIALIZABLEXACT for final clean-up at end
> of transaction
> +* (because workers might still have been accessing
> it), then it's
> +* time to restore it.
> +*/
>
> There is a typo.
> s/earler/earlier/
>
> 
> Should we add a test to check that the write-skew[1] anomaly doesn't
> happen even in parallel mode?
>
> 
> - * We aren't acquiring lightweight locks for the predicate lock or lock
> + * We acquire an LWLock in the case of parallel mode, because worker
> + * backends have access to the leader's SERIALIZABLEXACT.  Otherwise,
> + * we aren't acquiring lightweight locks for the predicate lock or lock
>
> There are both "LWLock" and "lightweight lock". Maybe it's better to
> unify the spelling.
>
> 
> @@ -2231,9 +2234,12 @@ PrepareTransaction(void)
> /*
>  * Mark serializable transaction as complete for predicate locking
>  * purposes.  This should be done as late as we can put it and
> still allow
> -* errors to be raised for failure patterns found at commit.
> +* errors to be raised for failure patterns found at commit.
> This is not
> +* appropriate for parallel workers however, because we aren't
> committing
> +* the leader's transaction and its serializable state will live on.
>  */
> -   PreCommit_CheckForSerializationFailure();
> +   if (!IsParallelWorker())
> +   PreCommit_CheckForSerializationFailure();
>
> This code assumes that parallel workers could prepare a transaction.
> Is that expected behavior for parallel query? There is an assertion
> !IsInParallelMode() at the beginning of that function, though.
>
> 
> +/*
> + * If the transaction is committing, but it has been partially released
> + * already, then treat this as a roll back.  It was marked as rolled 
> back.
> + */
> +if (isCommit && SxactIsPartiallyReleased(MySerializableXact))
> +isCommit = false;
> +
>
> Isn't it better to add an assertion to check that
> MySerializableXact->flags has the SXACT_FLAG_ROLLED_BACK flag, for safety?
>
> [1] https://en.wikipedia.org/wiki/Snapshot_isolation#Definition
>
> Regards,
>
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center



-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/



Re: [HACKERS] SERIALIZABLE with parallel query

2018-07-10 Thread Masahiko Sawada
On Mon, Jul 2, 2018 at 3:12 PM, Masahiko Sawada  wrote:
> On Fri, Jun 29, 2018 at 7:28 PM, Thomas Munro
>  wrote:
>> On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada  
>> wrote:
>>> I'd like to test and review these patches but they seem to conflict
>>> with current HEAD. Could you please rebase them?
>>
>> Hi Sawada-san,
>>
>> Thanks!  Rebased and attached.  The only changes are: the LWLock
>> tranche is now shown as "serializable_xact" instead of "sxact" (hmm,
>> LWLock tranches have lower_case_names_with_underscores, but individual
>> LWLocks have CamelCaseName...), and ShareSerializableXact() no longer
>> does Assert(!IsParallelWorker()).  These changes are based on the last
>> feedback from Robert.
>>
>
> Thank you! Will look at patches.
>

I looked at these patches. The latest patch builds without any
errors or warnings and passes all regression tests. I don't see any
critical bugs, but I have some random comments.

+   /*
+* If the leader in a parallel query earler stashed a partially
+* released SERIALIZABLEXACT for final clean-up at end
of transaction
+* (because workers might still have been accessing
it), then it's
+* time to restore it.
+*/

There is a typo.
s/earler/earlier/


Should we add a test to check that the write-skew[1] anomaly doesn't
happen even in parallel mode?


- * We aren't acquiring lightweight locks for the predicate lock or lock
+ * We acquire an LWLock in the case of parallel mode, because worker
+ * backends have access to the leader's SERIALIZABLEXACT.  Otherwise,
+ * we aren't acquiring lightweight locks for the predicate lock or lock

There are both "LWLock" and "lightweight lock". Maybe it's better to unify the spelling.


@@ -2231,9 +2234,12 @@ PrepareTransaction(void)
/*
 * Mark serializable transaction as complete for predicate locking
 * purposes.  This should be done as late as we can put it and
still allow
-* errors to be raised for failure patterns found at commit.
+* errors to be raised for failure patterns found at commit.
This is not
+* appropriate for parallel workers however, because we aren't
committing
+* the leader's transaction and its serializable state will live on.
 */
-   PreCommit_CheckForSerializationFailure();
+   if (!IsParallelWorker())
+   PreCommit_CheckForSerializationFailure();

This code assumes that parallel workers could prepare a transaction.
Is that expected behavior for parallel query? There is an assertion
!IsInParallelMode() at the beginning of that function, though.


+/*
+ * If the transaction is committing, but it has been partially released
+ * already, then treat this as a roll back.  It was marked as rolled back.
+ */
+if (isCommit && SxactIsPartiallyReleased(MySerializableXact))
+isCommit = false;
+

Isn't it better to add an assertion to check that
MySerializableXact->flags has the SXACT_FLAG_ROLLED_BACK flag, for safety?

[1] https://en.wikipedia.org/wiki/Snapshot_isolation#Definition

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] SERIALIZABLE with parallel query

2018-07-02 Thread Masahiko Sawada
On Fri, Jun 29, 2018 at 7:28 PM, Thomas Munro
 wrote:
> On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada  
> wrote:
>> I'd like to test and review these patches but they seem to conflict
>> with current HEAD. Could you please rebase them?
>
> Hi Sawada-san,
>
> Thanks!  Rebased and attached.  The only changes are: the LWLock
> tranche is now shown as "serializable_xact" instead of "sxact" (hmm,
> LWLock tranches have lower_case_names_with_underscores, but individual
> LWLocks have CamelCaseName...), and ShareSerializableXact() no longer
> does Assert(!IsParallelWorker()).  These changes are based on the last
> feedback from Robert.
>

Thank you! Will look at patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] SERIALIZABLE with parallel query

2018-06-29 Thread Thomas Munro
On Thu, Jun 28, 2018 at 7:55 PM, Masahiko Sawada  wrote:
> I'd like to test and review these patches but they seem to conflict
> with current HEAD. Could you please rebase them?

Hi Sawada-san,

Thanks!  Rebased and attached.  The only changes are: the LWLock
tranche is now shown as "serializable_xact" instead of "sxact" (hmm,
LWLock tranches have lower_case_names_with_underscores, but individual
LWLocks have CamelCaseName...), and ShareSerializableXact() no longer
does Assert(!IsParallelWorker()).  These changes are based on the last
feedback from Robert.

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v14.patch
Description: Binary data


0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v14.patch
Description: Binary data


Re: [HACKERS] SERIALIZABLE with parallel query

2018-06-28 Thread Masahiko Sawada
On Fri, Mar 30, 2018 at 2:56 PM, Thomas Munro
 wrote:
> On Thu, Mar 8, 2018 at 10:28 AM, Robert Haas  wrote:
>> +SerializableXactHandle
>> +ShareSerializableXact(void)
>> +{
>> +Assert(!IsParallelWorker());
>> +
>> +return MySerializableXact;
>> +}
>>
>> Uh, how's that OK?  There's no rule that you can't create a
>> ParallelContext in a worker.  Parallel query currently doesn't, so it
>> probably won't happen, but burying an assertion to that effect in the
>> predicate locking code doesn't seem nice.
>
> Hmm.  I suppose you could have a PARALLEL SAFE function that itself
> launches parallel workers explicitly (not via parallel query), and
> they should inherit the same SERIALIZABLEXACT from their parent and
> that should all just work.
>
>> Is "sxact" really the best (i.e. clearest) name we can come up with
>> for the lock tranche?
>
> Yeah, needs a better name.
>
> I have some lingering uncertainty about this patch and we're out of
> time, so I moved it to PG12 CF1.  Thanks Haribabu, Robert, Amit for
> the reviews and comments so far.
>

I'd like to test and review these patches but they seem to conflict
with current HEAD. Could you please rebase them?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] SERIALIZABLE with parallel query

2018-03-29 Thread Thomas Munro
On Thu, Mar 8, 2018 at 10:28 AM, Robert Haas  wrote:
> +SerializableXactHandle
> +ShareSerializableXact(void)
> +{
> +Assert(!IsParallelWorker());
> +
> +return MySerializableXact;
> +}
>
> Uh, how's that OK?  There's no rule that you can't create a
> ParallelContext in a worker.  Parallel query currently doesn't, so it
> probably won't happen, but burying an assertion to that effect in the
> predicate locking code doesn't seem nice.

Hmm.  I suppose you could have a PARALLEL SAFE function that itself
launches parallel workers explicitly (not via parallel query), and
they should inherit the same SERIALIZABLEXACT from their parent and
that should all just work.

> Is "sxact" really the best (i.e. clearest) name we can come up with
> for the lock tranche?

Yeah, needs a better name.

I have some lingering uncertainty about this patch and we're out of
time, so I moved it to PG12 CF1.  Thanks Haribabu, Robert, Amit for
the reviews and comments so far.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-28 Thread Thomas Munro
On Mon, Feb 26, 2018 at 6:37 PM, Thomas Munro
 wrote:
> I've now broken it into two patches.

Rebased.

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v13.patch
Description: Binary data


0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v13.patch
Description: Binary data


Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-25 Thread Thomas Munro
On Sat, Feb 24, 2018 at 12:04 AM, Thomas Munro
 wrote:
> I'm testing another version that is a lot simpler: like v10, it relies
> on the knowledge that the leader's transaction will always end after
> the workers have finished, but it handles the RO_SAFE optimisation by
> keeping the SERIALIZABLEXACT alive but freeing its locks etc.  More
> soon.

I've now broken it into two patches.

Patch 0001 is like my original patch with some minor improvements,
except that it now disables the RO_SAFE optimisation completely in
parallel mode.  In other words, it's the stupidest fix possible to the
problem you flagged up.  I think the main questions to answer about
the 0001 patch are whether this new locking protocol is sufficient,
whether anything bad could happen as a result of lock
escalation/transfer, and whether the underlying assumption about the
SERIALIZABLEXACT's lifetime holds true (that the leader will never
call ReleasePredicateLocks() while a worker is still running).

There are a couple of easy incremental improvements that could be made
on top of that patch, but I didn't make them because I'm trying to be
conservative in the hope of landing at least the basic feature in
PostgreSQL 11.  Namely:

1.  We could still return false if we see SXACT_FLAG_RO_SAFE in
SerializationNeededForRead() (we just couldn't call
ReleasePredicateLocks()).

2.  We could set MySerializableXact to InvalidSerializableXact in
worker backends so at least they'd benefit from the optimisation (we
just couldn't do that in the leader or it'd leak resources).

Patch 0002 aims a bit higher than those ideas.  I wanted to make sure
that the leader wouldn't arbitrarily miss out on the optimisation, and
I also suspect that the optimisation might be contagious in the sense
that actually releasing sooner might cause the RO_SAFE flag to be set
on *other* transactions sooner.  Patch 0002 works like this:

The first backend to observe the RO_SAFE flag 'partially releases' the
SERIALIZABLEXACT, so that the SERIALIZABLEXACT itself remains valid.
(The concept of 'partial release' already existed, but I'm using it in
a new way.)  All backends clear their MySerializableXact variable so
that they drop to faster SI in their own time.  The leader keeps a
copy of it in SavedSerializableXact, so that it can fully release it
at the end of the transaction when we know that no other backend has a
reference to it.

These patches survive hammering with a simple test that generates a
mixture of read only and read write parallel queries that hit the
interesting case (attached; this test helped me understand that the
refcount scheme I considered was going to be hard).  I haven't
personally tried to measure the value of the optimisation (though I'm
pretty sure it exists, based on the VLDB paper and the knowledge that
REPEATABLE READ (what the optimisation effectively gives you) just has
to be faster than SERIALIZABLE 'cause I've seen all that code you get
to not run!).  I'd like to propose the 0001 patch for now, but keep
the 0002 patch back for a bit as it's very new and needs more
feedback, if possible from Kevin and others involved in the SSI
project.  Of course their input on the 0001 patch is also super
welcome.

-- 
Thomas Munro
http://www.enterprisedb.com
import psycopg2
import threading

DSN = "dbname=postgres"

def run2(i):
  conn = psycopg2.connect(DSN)
  cursor = conn.cursor()
  cursor.execute("""set parallel_setup_cost=0""")
  cursor.execute("""set parallel_tuple_cost=0""")
  cursor.execute("""set min_parallel_table_scan_size=0""")
  cursor.execute("""set max_parallel_workers_per_gather=4""")
  cursor.execute("""set default_transaction_isolation=serializable""")
  if i % 2 == 0:
    cursor.execute("""set default_transaction_read_only=on""")
  cursor.execute("""select count(*) from foo""")
  cursor.execute("""select count(*) from foo""")
  conn.commit()

for i in range(16):
  conn = psycopg2.connect(DSN)
  cursor = conn.cursor()
  cursor.execute("""drop table if exists foo""")
  cursor.execute("""create table foo as select generate_series(1, 10)::int as a""")
  cursor.execute("""alter table foo set (parallel_workers = 4)""")
  conn.commit()
  conn.close()
  threading.Thread(target=run2, args=[i]).start()


0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v12.patch
Description: Binary data


0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v12.patch
Description: Binary data


Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-23 Thread Thomas Munro
On Fri, Feb 23, 2018 at 6:05 AM, Robert Haas  wrote:
> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
>  wrote:
>> The best solution I have come up with so far is to add a reference
>> count to SERIALIZABLEXACT.  I toyed with putting the refcount into the
>> DSM instead, but then I ran into problems making that work when you
>> have a query with multiple Gather nodes.  Since the refcount is in
>> SERIALIZABLEXACT I also had to add a generation counter so that I
>> could detect the case where you try to attach too late (the leader has
>> already errored out, the refcount has reached 0 and the
>> SERIALIZABLEXACT object has been recycled).
>
> I don't know whether that's safe or not.  It certainly sounds like
> it's solving one category of problem, but is that the only issue?  If
> some backends haven't noticed that we're safe, they might keep
> acquiring SIREAD locks or doing other manipulations of shared state,
> which maybe could cause confusion.  I haven't looked into this deeply
> enough to understand whether there's actually a possibility of trouble
> there, but I can't rule it out off-hand.

After some testing, I think the refcount approach could be made to
work, but it seems quite complicated and there are some weird edge
cases that showed up that started to make it look like more trouble
than it was worth.  One downside of refcounts is that you never get to
free the SERIALIZABLEXACT until the end of the transaction with
parallel_leader_participation = off.

I'm testing another version that is a lot simpler: like v10, it relies
on the knowledge that the leader's transaction will always end after
the workers have finished, but it handles the RO_SAFE optimisation by
keeping the SERIALIZABLEXACT alive but freeing its locks etc.  More
soon.



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Thomas Munro
On Fri, Feb 23, 2018 at 7:56 PM, Amit Kapila  wrote:
> On Fri, Feb 23, 2018 at 8:48 AM, Thomas Munro
>  wrote:
>> On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapila  wrote:
>>> By the way, in which case can the leader exit early?  As of now, we do
>>> wait for workers to end, both when the query is finished and in error
>>> cases.
>>
>> create table foo as select generate_series(1, 10)::int a;
>> alter table foo set (parallel_workers = 2);
>> set parallel_setup_cost = 0;
>> set parallel_tuple_cost = 0;
>> select count(a / 0) from foo;
>>
>> That reliably gives me:
>> ERROR:  division by zero [from leader]
>> ERROR:  could not map dynamic shared memory segment [from workers]
>>
>> I thought this was coming from resource manager cleanup, but you're
>> right: that happens after we wait for all workers to finish.  Perhaps
>> this is a race within DestroyParallelContext() itself: when it is
>> called by AtEOXact_Parallel() during an abort, it asks the postmaster
>> to SIGTERM the workers, then it immediately detaches from the DSM
>> segment, and then it waits for the worker to start up.
>>
>
> I guess you mean to say worker waits to shutdown/exit.  Why would it
> wait for startup at that stage?

Right, I meant to say shutdown/exit.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Amit Kapila
On Fri, Feb 23, 2018 at 8:48 AM, Thomas Munro
 wrote:
> On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapila  wrote:
>> On Thu, Feb 22, 2018 at 10:35 PM, Robert Haas  wrote:
>>> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
>>>> PS  I noticed that for BecomeLockGroupMember() we say "If we can't
>>>> join the lock group, the leader has gone away, so just exit quietly"
>>>> but for various other similar things we spew errors (most commonly
>>>> seen one being "ERROR:  could not map dynamic shared memory segment").
>>>> Intentional?
>>>
>>> I suppose I thought that if we failed to map the dynamic shared memory
>>> segment, it might be down to any one of several causes; whereas if we
>>> fail to join the lock group, it must be because the leader has already
>>> exited.  There might be a flaw in that thinking, though.
>>>
>>
>> By the way, in which case can the leader exit early?  As of now, we do
>> wait for workers to end, both when the query is finished and in error
>> cases.
>
> create table foo as select generate_series(1, 10)::int a;
> alter table foo set (parallel_workers = 2);
> set parallel_setup_cost = 0;
> set parallel_tuple_cost = 0;
> select count(a / 0) from foo;
>
> That reliably gives me:
> ERROR:  division by zero [from leader]
> ERROR:  could not map dynamic shared memory segment [from workers]
>
> I thought this was coming from resource manager cleanup, but you're
> right: that happens after we wait for all workers to finish.  Perhaps
> this is a race within DestroyParallelContext() itself: when it is
> called by AtEOXact_Parallel() during an abort, it asks the postmaster
> to SIGTERM the workers, then it immediately detaches from the DSM
> segment, and then it waits for the worker to start up.
>

I guess you mean to say worker waits to shutdown/exit.  Why would it
wait for startup at that stage?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Thomas Munro
On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapila  wrote:
> On Thu, Feb 22, 2018 at 10:35 PM, Robert Haas  wrote:
>> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
>>> PS  I noticed that for BecomeLockGroupMember() we say "If we can't
>>> join the lock group, the leader has gone away, so just exit quietly"
>>> but for various other similar things we spew errors (most commonly
>>> seen one being "ERROR:  could not map dynamic shared memory segment").
>>> Intentional?
>>
>> I suppose I thought that if we failed to map the dynamic shared memory
>> segment, it might be down to any one of several causes; whereas if we
>> fail to join the lock group, it must be because the leader has already
>> exited.  There might be a flaw in that thinking, though.
>>
>
> By the way, in which case can the leader exit early?  As of now, we do
> wait for workers to end, both when the query is finished and in error
> cases.

create table foo as select generate_series(1, 10)::int a;
alter table foo set (parallel_workers = 2);
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
select count(a / 0) from foo;

That reliably gives me:
ERROR:  division by zero [from leader]
ERROR:  could not map dynamic shared memory segment [from workers]

I thought this was coming from resource manager cleanup, but you're
right: that happens after we wait for all workers to finish.  Perhaps
this is a race within DestroyParallelContext() itself: when it is
called by AtEOXact_Parallel() during an abort, it asks the postmaster
to SIGTERM the workers, then it immediately detaches from the DSM
segment, and then it waits for the worker to start up.  The workers
unblock signals before they try to attach to the DSM segment, but
they don't CHECK_FOR_INTERRUPTS before they try to attach (and even if
they did it wouldn't solve anything).

I don't like the error much, though at least the root cause error is
logged first.

I don't immediately see how BecomeLockGroupMember() could have the
same kind of problem though, for the reason you said: the leader waits
for the workers to finish, so I'm not sure in which circumstances it
would cease to be the lock group leader while the workers are still
running.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Amit Kapila
On Thu, Feb 22, 2018 at 10:35 PM, Robert Haas  wrote:
> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
>> PS  I noticed that for BecomeLockGroupMember() we say "If we can't
>> join the lock group, the leader has gone away, so just exit quietly"
>> but for various other similar things we spew errors (most commonly
>> seen one being "ERROR:  could not map dynamic shared memory segment").
>> Intentional?
>
> I suppose I thought that if we failed to map the dynamic shared memory
> segment, it might be down to any one of several causes; whereas if we
> fail to join the lock group, it must be because the leader has already
> exited.  There might be a flaw in that thinking, though.
>

By the way, in which case can the leader exit early?  As of now, we do
wait for workers to end, both when the query is finished and in error
cases.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Robert Haas
On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
 wrote:
> The best solution I have come up with so far is to add a reference
> count to SERIALIZABLEXACT.  I toyed with putting the refcount into the
> DSM instead, but then I ran into problems making that work when you
> have a query with multiple Gather nodes.  Since the refcount is in
> SERIALIZABLEXACT I also had to add a generation counter so that I
> could detect the case where you try to attach too late (the leader has
> already errored out, the refcount has reached 0 and the
> SERIALIZABLEXACT object has been recycled).

I don't know whether that's safe or not.  It certainly sounds like
it's solving one category of problem, but is that the only issue?  If
some backends haven't noticed that we're safe, they might keep
acquiring SIREAD locks or doing other manipulations of shared state,
which maybe could cause confusion.  I haven't looked into this deeply
enough to understand whether there's actually a possibility of trouble
there, but I can't rule it out off-hand.

One approach is to just disable this optimization for parallel query.
Being able to use SERIALIZABLE with parallel query is better than not
being able to do it, even if some optimizations are not applied in
that case.  Of course making the optimizations work is better, but
we've got to be sure we're doing it right.

> PS  I noticed that for BecomeLockGroupMember() we say "If we can't
> join the lock group, the leader has gone away, so just exit quietly"
> but for various other similar things we spew errors (most commonly
> seen one being "ERROR:  could not map dynamic shared memory segment").
> Intentional?

I suppose I thought that if we failed to map the dynamic shared memory
segment, it might be down to any one of several causes; whereas if we
fail to join the lock group, it must be because the leader has already
exited.  There might be a flaw in that thinking, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Thomas Munro
On Fri, Jan 26, 2018 at 4:24 AM, Robert Haas  wrote:
> I took a look at this today and thought it might be OK to commit,

Thank you for looking at this!

> modulo a few minor issues: (1) you didn't document the new tranche and

Fixed.

> (2) I prefer to avoid if (blah) { Assert(thing) } in favor of
> Assert(!blah || thing).

Done.

> But then I got a little bit concerned about ReleasePredicateLocks().
> Suppose that in the middle of a read-only transaction,
> SXACT_FLAG_RO_SAFE becomes true. The next call to
> SerializationNeededForRead in each process will call
> ReleasePredicateLocks.  In the workers, this won't do anything, so
> we'll just keep coming back there.  But in the leader, we'll go ahead
> do all that stuff, including MySerializableXact =
> InvalidSerializableXact.  But in the workers, it's still set.  Maybe
> that's OK, but I'm not sure that it's OK...

Ouch.  Yeah.  It's not OK.  If the leader gives up its
SERIALIZABLEXACT object early due to that safe-read-only optimisation,
the workers are left with a dangling pointer to a SERIALIZABLEXACT
object that has been pushed onto FinishedSerializableTransactions.
From there it will move to PredXact->availableTransactions and might
be handed out to some other transaction, so it is not safe to retain a
pointer to that object.

The best solution I have come up with so far is to add a reference
count to SERIALIZABLEXACT.  I toyed with putting the refcount into the
DSM instead, but then I ran into problems making that work when you
have a query with multiple Gather nodes.  Since the refcount is in
SERIALIZABLEXACT I also had to add a generation counter so that I
could detect the case where you try to attach too late (the leader has
already errored out, the refcount has reached 0 and the
SERIALIZABLEXACT object has been recycled).

The attached is a draft patch only, needing some testing and polish.
Brickbats, better ideas?

FWIW I also considered a couple of other ideas:  (1) Keeping the
object alive on the FinishedSerializableTransactions list until the
leader's transaction is finished seems broken because we need to be
able to spill that list to the SLRU at any time, and if we somehow
made them sticky we could run out of memory.  (2) Anything involving
the leader having sole control of the object lifetime seems
problematic... well, it might work if you disabled the
SXACT_FLAG_RO_SAFE optimisation so that ReleasePredicateLocks() always
happens after all workers have finished, but that seems like cheating.

PS  I noticed that for BecomeLockGroupMember() we say "If we can't
join the lock group, the leader has gone away, so just exit quietly"
but for various other similar things we spew errors (most commonly
seen one being "ERROR:  could not map dynamic shared memory segment").
Intentional?

-- 
Thomas Munro
http://www.enterprisedb.com


ssi-parallel-v11.patch
Description: Binary data


Re: [HACKERS] SERIALIZABLE with parallel query

2018-01-25 Thread Robert Haas
On Wed, Jan 24, 2018 at 7:39 PM, Thomas Munro
 wrote:
> This started crashing some time yesterday with an assertion failure in
> the isolation tests after commit 2badb5af landed.  Reordering of code
> in parallel.c confused the patch program's fuzz heuristics, leading to
> SetSerializableXact() being called too soon.  Here's a fix for that.

I took a look at this today and thought it might be OK to commit,
modulo a few minor issues: (1) you didn't document the new tranche and
(2) I prefer to avoid if (blah) { Assert(thing) } in favor of
Assert(!blah || thing).

But then I got a little bit concerned about ReleasePredicateLocks().
Suppose that in the middle of a read-only transaction,
SXACT_FLAG_RO_SAFE becomes true. The next call to
SerializationNeededForRead in each process will call
ReleasePredicateLocks.  In the workers, this won't do anything, so
we'll just keep coming back there.  But in the leader, we'll go ahead
do all that stuff, including MySerializableXact =
InvalidSerializableXact.  But in the workers, it's still set.  Maybe
that's OK, but I'm not sure that it's OK...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SERIALIZABLE with parallel query

2018-01-24 Thread Thomas Munro
On Wed, Dec 13, 2017 at 5:30 PM, Thomas Munro
 wrote:
> On Wed, Dec 13, 2017 at 2:09 PM, Haribabu Kommi
>  wrote:
>> Thanks for explaining the problem with generating an isolation test to
>> test serializable parallel query.
>>
>> The committer can decide whether the existing test is fine to be part of
>> the test suite or should be removed; other than that, everything is fine,
>> so I am moving the patch into the "ready for committer" state.
>
> Thank you!  I will try to find a good benchmark that will really
> exercise parallel query + SSI.

This started crashing some time yesterday with an assertion failure in
the isolation tests after commit 2badb5af landed.  Reordering of code
in parallel.c confused the patch program's fuzz heuristics, leading to
SetSerializableXact() being called too soon.  Here's a fix for that.

-- 
Thomas Munro
http://www.enterprisedb.com


ssi-parallel-v10.patch
Description: Binary data


Re: [HACKERS] SERIALIZABLE with parallel query

2017-12-12 Thread Thomas Munro
On Wed, Dec 13, 2017 at 2:09 PM, Haribabu Kommi
 wrote:
> Thanks for explaining the problem with generating an isolation test to
> test serializable parallel query.
>
> The committer can decide whether the existing test is fine to be part of
> the test suite or should be removed; other than that, everything is fine,
> so I am moving the patch into the "ready for committer" state.

Thank you!  I will try to find a good benchmark that will really
exercise parallel query + SSI.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] SERIALIZABLE with parallel query

2017-12-12 Thread Haribabu Kommi
On Fri, Dec 8, 2017 at 11:33 AM, Thomas Munro  wrote:

> On Thu, Nov 30, 2017 at 2:44 PM, Thomas Munro
>  wrote:
> > On Thu, Nov 30, 2017 at 2:32 PM, Michael Paquier
> >  wrote:
> >> Could this question be answered? The patch still applies so I am
> >> moving it to next CF.
>
> Rebased, 'cause it broke.


Thanks for explaining the problem with generating an isolation test to
test serializable parallel query.

The committer can decide whether the existing test is fine to be part of
the test suite or should be removed; other than that, everything is fine,
so I am moving the patch into the "ready for committer" state.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] SERIALIZABLE with parallel query

2017-12-07 Thread Thomas Munro
On Thu, Nov 30, 2017 at 2:44 PM, Thomas Munro
 wrote:
> On Thu, Nov 30, 2017 at 2:32 PM, Michael Paquier
>  wrote:
>> Could this question be answered? The patch still applies so I am
>> moving it to next CF.

Rebased, 'cause it broke.

-- 
Thomas Munro
http://www.enterprisedb.com


ssi-parallel-v9.patch
Description: Binary data


Re: [HACKERS] SERIALIZABLE with parallel query

2017-11-23 Thread Haribabu Kommi
On Tue, Sep 26, 2017 at 4:41 PM, Haribabu Kommi 
wrote:

>
>
> On Mon, Sep 25, 2017 at 6:57 PM, Thomas Munro <
> thomas.mu...@enterprisedb.com> wrote:
>
>> On Mon, Sep 25, 2017 at 8:37 PM, Haribabu Kommi
>>  wrote:
>> > After I tuned the GUC to get a sequential scan, I still do not get the
>> > error in session 2 for the update operation, the way it used to generate
>> > an error for the parallel sequential scan, and it even takes many
>> > commands until/unless S1 commits.
>>
>> Hmm.  Then this requires more explanation because I don't expect a
>> difference.  I did some digging and realised that the error detail
>> message "Reason code: Canceled on identification as a pivot, during
>> write." was reached in a code path that requires
>> SxactIsPrepared(writer) and also MySerializableXact == writer, which
>> means that the process believes it is committing.  Clearly something
>> is wrong.  After some more digging I realised that
>> ParallelWorkerMain() calls EndParallelWorkerTransaction() which calls
>> CommitTransaction() which calls
>> PreCommit_CheckForSerializationFailure().  Since the worker is
>> connected to the leader's SERIALIZABLEXACT, that finishes up being
>> marked as preparing to commit (not true!), and then the leader gets
>> confused during that write, causing a serialization failure to be
>> raised sooner (though I can't explain why it should be raised then
>> anyway, but that's another topic).  Oops.  I think the fix here is
>> just not to do that in a worker (the worker's CommitTransaction()
>> doesn't really mean what it says).
>>
>> Here's a version with a change that makes that conditional.  This way
>> your test case behaves the same as non-parallel mode.
>>
>
> The patch looks good, and I don't have any comments on the code.
> The test that is going to be added by the patch does not generate a true
> parallelism scenario; I feel it is better to change it to a test that can
> generate a parallel sequential/index/bitmap scan.
>


The latest patch is good. It lacks a test that verifies the serializable
support with actual parallel workers, so if it is broken, it will be
difficult to know.

Regards,
Hari Babu
Fujitsu Australia