Re: Allow parallel plan for referential integrity checks?

2024-02-01 Thread vignesh C
On Sun, 21 Jan 2024 at 07:56, vignesh C  wrote:
>
> On Fri, 18 Aug 2023 at 16:29, Juan José Santamaría Flecha
>  wrote:
> >
> >
> > On Thu, Aug 17, 2023 at 3:51 PM Frédéric Yhuel  
> > wrote:
> >>
> >> On 8/17/23 14:00, Frédéric Yhuel wrote:
> >> > On 8/17/23 09:32, Frédéric Yhuel wrote:
> >> >> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
> >> >>> Recently I restored a database from a directory format backup and
> >> >>> having this feature would have been quite useful
> >> >>
> >> >> Thanks for resuming work on this patch. I forgot to mention this in my
> >> >> original email, but the motivation was also to speed up the restore
> >> >> process. Parallelizing the FK checks could make a huge difference in
> >> >> certain cases. We should probably provide such a test case (with perf
> >> >> numbers), and maybe this is it what Robert asked for.
> >> >
> >> > I have attached two scripts which demonstrate the following problems:
> >
> >
> > Thanks for the scripts, but I think Robert's concerns come from the safety, 
> > and not the performance, of the parallel operation.
> >
> > Proving its vulnerability could be easy with a counter example, but 
> > assuring its safety is trickier. What test would suffice to do that?
>
> I'm seeing that there has been no activity in this thread for more
> than 5 months, I'm planning to close this in the current commitfest
> unless someone is planning to take it forward. It can be opened again
> when there is more interest.

Since the author or no one else showed interest in taking it forward
and the patch had no activity for more than 5 months, I have changed
the status to RWF. Feel free to add a new CF entry when someone is
planning to resume work more actively.

Regards,
Vignesh




Re: Allow parallel plan for referential integrity checks?

2024-01-20 Thread vignesh C
On Fri, 18 Aug 2023 at 16:29, Juan José Santamaría Flecha
 wrote:
>
>
> On Thu, Aug 17, 2023 at 3:51 PM Frédéric Yhuel  
> wrote:
>>
>> On 8/17/23 14:00, Frédéric Yhuel wrote:
>> > On 8/17/23 09:32, Frédéric Yhuel wrote:
>> >> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
>> >>> Recently I restored a database from a directory format backup and
>> >>> having this feature would have been quite useful
>> >>
>> >> Thanks for resuming work on this patch. I forgot to mention this in my
>> >> original email, but the motivation was also to speed up the restore
>> >> process. Parallelizing the FK checks could make a huge difference in
>> >> certain cases. We should probably provide such a test case (with perf
>> >> numbers), and maybe this is it what Robert asked for.
>> >
>> > I have attached two scripts which demonstrate the following problems:
>
>
> Thanks for the scripts, but I think Robert's concerns come from the safety, 
> and not the performance, of the parallel operation.
>
> Proving its vulnerability could be easy with a counter example, but assuring 
> its safety is trickier. What test would suffice to do that?

I'm seeing that there has been no activity in this thread for more
than 5 months, I'm planning to close this in the current commitfest
unless someone is planning to take it forward. It can be opened again
when there is more interest.

Regards,
Vignesh




Re: Allow parallel plan for referential integrity checks?

2023-08-18 Thread Juan José Santamaría Flecha
On Thu, Aug 17, 2023 at 3:51 PM Frédéric Yhuel 
wrote:

> On 8/17/23 14:00, Frédéric Yhuel wrote:
> > On 8/17/23 09:32, Frédéric Yhuel wrote:
> >> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
> >>> Recently I restored a database from a directory format backup and
> >>> having this feature would have been quite useful
> >>
> >> Thanks for resuming work on this patch. I forgot to mention this in my
> >> original email, but the motivation was also to speed up the restore
> >> process. Parallelizing the FK checks could make a huge difference in
> >> certain cases. We should probably provide such a test case (with perf
> >> numbers), and maybe this is it what Robert asked for.
> >
> > I have attached two scripts which demonstrate the following problems:
>

Thanks for the scripts, but I think Robert's concerns come from the safety,
and not the performance, of the parallel operation.

Proving its vulnerability could be easy with a counter example, but
assuring its safety is trickier. What test would suffice to do that?

Regards,

Juan José Santamaría Flecha


Re: Allow parallel plan for referential integrity checks?

2023-08-17 Thread Frédéric Yhuel




On 8/17/23 14:00, Frédéric Yhuel wrote:



On 8/17/23 09:32, Frédéric Yhuel wrote:



On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
Recently I restored a database from a directory format backup and 
having this feature would have been quite useful


Hi,

Thanks for resuming work on this patch. I forgot to mention this in my 
original email, but the motivation was also to speed up the restore 
process. Parallelizing the FK checks could make a huge difference in 
certain cases. We should probably provide such a test case (with perf 
numbers), and maybe this is it what Robert asked for.


I have attached two scripts which demonstrate the following problems:



Let me add the plans for more clarity:

1a. if the tables aren't analyzed nor vacuumed before the post-data 
step, then they are index-only scanned, with a lot of heap fetches 
(depending on their size, the planner sometimes chooses a seq scan 
instead).




https://explain.dalibo.com/plan/7491ga22c5293683#raw

1b. if the tables have been analyzed but not vacuumed before the 
post-data-step, then they are scanned sequentially. Usually better, but 
still not so good without a parallel plan.




https://explain.dalibo.com/plan/e92c5161g880bdhf#raw

2. if the visibility maps have been created, then the tables are 
index-only scanned without heap fetches, but this can still be slower 
than a parallel seq scan.




https://explain.dalibo.com/plan/de22bdb4ggc3dffg#raw

And at last, with the patch applied : 
https://explain.dalibo.com/plan/54612a09ffh2565a#raw





Re: Allow parallel plan for referential integrity checks?

2023-08-17 Thread Frédéric Yhuel



On 8/17/23 09:32, Frédéric Yhuel wrote:



On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
Recently I restored a database from a directory format backup and 
having this feature would have been quite useful


Hi,

Thanks for resuming work on this patch. I forgot to mention this in my 
original email, but the motivation was also to speed up the restore 
process. Parallelizing the FK checks could make a huge difference in 
certain cases. We should probably provide such a test case (with perf 
numbers), and maybe this is it what Robert asked for.


I have attached two scripts which demonstrate the following problems:

1a. if the tables aren't analyzed nor vacuumed before the post-data 
step, then they are index-only scanned, with a lot of heap fetches 
(depending on their size, the planner sometimes chooses a seq scan instead).


1b. if the tables have been analyzed but not vacuumed before the 
post-data-step, then they are scanned sequentially. Usually better, but 
still not so good without a parallel plan.


2. if the visibility maps have been created, then the tables are 
index-only scanned without heap fetches, but this can still be slower 
than a parallel seq scan.


So it would be nice if pg_restore could vacuum analyze the tables before 
the post-data step. I believe it would be faster in most cases.


And it would be nice to allow a parallel plan for RI checks.

Best regards,
Frédéric

create_and_fill_tables.sql
Description: application/sql


post_data.sql
Description: application/sql


Re: Allow parallel plan for referential integrity checks?

2023-08-17 Thread Frédéric Yhuel




On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
Recently I restored a database from a directory format backup and having 
this feature would have been quite useful


Hi,

Thanks for resuming work on this patch. I forgot to mention this in my 
original email, but the motivation was also to speed up the restore 
process. Parallelizing the FK checks could make a huge difference in 
certain cases. We should probably provide such a test case (with perf 
numbers), and maybe this is it what Robert asked for.





Re: Allow parallel plan for referential integrity checks?

2023-08-16 Thread Juan José Santamaría Flecha
On Thu, Aug 10, 2023 at 5:06 PM Juan José Santamaría Flecha <
juanjo.santama...@gmail.com> wrote:

> On Tue, Jul 4, 2023 at 9:45 AM Daniel Gustafsson  wrote:
>
>> As there is no new patch submitted I will go ahead and do that, please
>> feel
>> free to resubmit when there is renewed interest in working on this.
>>
>
> Recently I restored a database from a directory format backup and having
> this feature would have been quite useful. So, I would like to resume the
> work on this patch, unless OP or someone else is already on it.
>

Hearing no one advising me otherwise I'll pick up the work on the patch.
First, I'll try to address all previous comments.

On Mon, Feb 14, 2022 at 3:34 PM Robert Haas  wrote:

> On Mon, Feb 7, 2022 at 5:26 AM Frédéric Yhuel 
> wrote:
> > I noticed that referential integrity checks aren't currently
> > parallelized. Is it on purpose?
>
> It's not 100% clear to me that it is safe. But on the other hand, it's
> also not 100% clear to me that it is unsafe.
>
> Generally, I only added CURSOR_OPT_PARALLEL_OK in places where I was
> confident that nothing bad would happen, and this wasn't one of those
> places. It's something of a nested-query environment -- your criterion
> #6. How do we know that the surrounding query isn't already parallel?
> Perhaps because it must be DML, but I think it must be more important
> to support parallel DML than to support this.
>

Currently I don't see a scenario where the validation might run inside
another parallel query, but I could be missing something. Suggestions on
some edge cases to test are more than welcome.

We already have a case where parallel operations can occur when adding a
table constraint, and that's the index creation for a primary key. Take for
example:

postgres=# SET max_parallel_maintenance_workers TO 4;
postgres=# SET parallel_setup_cost TO 0;
postgres=# SET parallel_tuple_cost TO 0;
postgres=# SET parallel_leader_participation TO 0;
postgres=# SET min_parallel_table_scan_size TO 0;
postgres=# SET client_min_messages TO debug1;
postgres=# CREATE TABLE parallel_pk_table (a int) WITH (autovacuum_enabled
= off);
postgres=# ALTER TABLE parallel_pk_table ADD PRIMARY KEY (a);
DEBUG:  ALTER TABLE / ADD PRIMARY KEY will create implicit index
"parallel_pk_table_pkey" for table "parallel_pk_table"
DEBUG:  building index "parallel_pk_table_pkey" on table
"parallel_pk_table" with request for 1 parallel workers
...
DEBUG:  index "parallel_pk_table_pkey" can safely use deduplication
DEBUG:  verifying table "parallel_pk_table"

This should be better documented. Maybe "parallel index build" [1] could
have its own section, so it can be cited from other sections?


> I'm not sure what the right thing to do here is, but I think it would
> be good if your patch included a test case.
>

I have included a basic test for parallel ADD PRIMARY KEY and ADD FOREIGN
KEY.

On Sat, Mar 19, 2022 at 1:57 AM Imseih (AWS), Sami 
wrote:

> It would be valuable to add logging to ensure that the ActiveSnapshot and
> TransactionSnapshot
> is the same for the leader and the workers. This logging could be tested
> in the TAP test.
>

That machinery is used in any parallel query, I'm not sure this patch
should be responsible for carrying that requirement.

Also, inside RI_Initial_Check you may want to set max_parallel_workers to
> max_parallel_maintenance_workers.
>
> Currently the work_mem is set to maintenance_work_mem. This will also
> require
> a doc change to call out.
>

Done.

PFA the patch.

[1] https://www.postgresql.org/docs/current/sql-createindex.html

Regards,

Juan José Santamaría Flecha


v2-0001-Allow-parallel-plan-for-referential-integrity-checks.patch
Description: Binary data


Re: Allow parallel plan for referential integrity checks?

2023-08-10 Thread Juan José Santamaría Flecha
On Tue, Jul 4, 2023 at 9:45 AM Daniel Gustafsson  wrote:

>
> As there is no new patch submitted I will go ahead and do that, please feel
> free to resubmit when there is renewed interest in working on this.
>
>
Recently I restored a database from a directory format backup and having
this feature would have been quite useful. So, I would like to resume the
work on this patch, unless OP or someone else is already on it.

Regards,

Juan José Santamaría Flecha


Re: Allow parallel plan for referential integrity checks?

2023-07-04 Thread Daniel Gustafsson
> On 20 Mar 2023, at 16:48, Frédéric Yhuel  wrote:
> On 3/20/23 15:58, Gregory Stark (as CFM) wrote:

>> Should we move it to next release at this
>> point? Even if you get time to work on it this commitfest do you think
>> it's likely to be committable in the next few weeks?
> 
> It is very unlikely. Maybe it's better to remove it from CF and put it back 
> later if the test case I will provide does a better job at convincing the 
> Postgres folks that RI checks should be parallelized.

As there is no new patch submitted I will go ahead and do that, please feel
free to resubmit when there is renewed interest in working on this.

--
Daniel Gustafsson





Re: Allow parallel plan for referential integrity checks?

2023-03-20 Thread Frédéric Yhuel




On 3/20/23 15:58, Gregory Stark (as CFM) wrote:

On Mon, 12 Dec 2022 at 11:37, Frédéric Yhuel  wrote:


I've planned to work on it full time on week 10 (6-10 March), if you
agree to bear with me. The idea would be to bootstrap my brain on it,
and then continue to work on it from time to time.


Any updates on this patch? 


I had the opportunity to discuss it with Melanie Plageman when she came 
to Paris last month and teach us (Dalibo's folk) about the patch review 
process.


She advised me to provide a good test case first, and to explain better 
how it would be useful, which I'm going to do.



Should we move it to next release at this
point? Even if you get time to work on it this commitfest do you think
it's likely to be committable in the next few weeks?



It is very unlikely. Maybe it's better to remove it from CF and put it 
back later if the test case I will provide does a better job at 
convincing the Postgres folks that RI checks should be parallelized.





Re: Allow parallel plan for referential integrity checks?

2023-03-20 Thread Gregory Stark (as CFM)
On Mon, 12 Dec 2022 at 11:37, Frédéric Yhuel  wrote:
>
> I've planned to work on it full time on week 10 (6-10 March), if you
> agree to bear with me. The idea would be to bootstrap my brain on it,
> and then continue to work on it from time to time.

Any updates on this patch? Should we move it to next release at this
point? Even if you get time to work on it this commitfest do you think
it's likely to be committable in the next few weeks?

-- 
Gregory Stark
As Commitfest Manager




Re: Allow parallel plan for referential integrity checks?

2023-01-16 Thread vignesh C
On Mon, 12 Dec 2022 at 22:06, Frédéric Yhuel  wrote:
>
>
>
> On 12/11/22 06:29, Ian Lawrence Barwick wrote:
> > 2022年7月26日(火) 20:58 Frédéric Yhuel :
> >>
> >>
> >>
> >> On 4/14/22 14:25, Frédéric Yhuel wrote:
> >>>
> >>>
> >>> On 3/19/22 01:57, Imseih (AWS), Sami wrote:
>  I looked at your patch and it's a good idea to make foreign key
>  validation
>  use parallel query on large relations.
> 
>  It would be valuable to add logging to ensure that the ActiveSnapshot
>  and TransactionSnapshot
>  is the same for the leader and the workers. This logging could be
>  tested in the TAP test.
> 
>  Also, inside RI_Initial_Check you may want to set max_parallel_workers to
>  max_parallel_maintenance_workers.
> 
>  Currently the work_mem is set to maintenance_work_mem. This will also
>  require
>  a doc change to call out.
> 
>  /*
> * Temporarily increase work_mem so that the check query can be
>  executed
> * more efficiently.  It seems okay to do this because the query
>  is simple
> * enough to not use a multiple of work_mem, and one typically
>  would not
> * have many large foreign-key validations happening
>  concurrently.  So
> * this seems to meet the criteria for being considered a
>  "maintenance"
> * operation, and accordingly we use maintenance_work_mem.
>  However, we
> 
> >>>
> >>> Hello Sami,
> >>>
> >>> Thank you for your review!
> >>>
> >>> I will try to do as you say, but it will take time, since my daily job
> >>> as database consultant takes most of my time and energy.
> >>>
> >>
> >> Hi,
> >>
> >> As suggested by Jacob, here is a quick message to say that I didn't find
> >> enough time to work further on this patch, but I didn't completely
> >> forget it either. I moved it to the next commitfest. Hopefully I will
> >> find enough time and motivation in the coming months :-)
> >
> > Hi Frédéric
> >
> > This patch has been carried forward for a couple more commitfests since
> > your message; do you think you'll be able to work on it further during this
> > release cycle?
> >
>
> Hi Ian,
>
> I've planned to work on it full time on week 10 (6-10 March), if you
> agree to bear with me. The idea would be to bootstrap my brain on it,
> and then continue to work on it from time to time.

I have moved this to Mar commitfest as the patch update is expected to
happen during March commitfest.

Regards,
Vignesh




Re: Allow parallel plan for referential integrity checks?

2022-12-12 Thread Frédéric Yhuel




On 12/11/22 06:29, Ian Lawrence Barwick wrote:

2022年7月26日(火) 20:58 Frédéric Yhuel :




On 4/14/22 14:25, Frédéric Yhuel wrote:



On 3/19/22 01:57, Imseih (AWS), Sami wrote:

I looked at your patch and it's a good idea to make foreign key
validation
use parallel query on large relations.

It would be valuable to add logging to ensure that the ActiveSnapshot
and TransactionSnapshot
is the same for the leader and the workers. This logging could be
tested in the TAP test.

Also, inside RI_Initial_Check you may want to set max_parallel_workers to
max_parallel_maintenance_workers.

Currently the work_mem is set to maintenance_work_mem. This will also
require
a doc change to call out.

/*
   * Temporarily increase work_mem so that the check query can be
executed
   * more efficiently.  It seems okay to do this because the query
is simple
   * enough to not use a multiple of work_mem, and one typically
would not
   * have many large foreign-key validations happening
concurrently.  So
   * this seems to meet the criteria for being considered a
"maintenance"
   * operation, and accordingly we use maintenance_work_mem.
However, we



Hello Sami,

Thank you for your review!

I will try to do as you say, but it will take time, since my daily job
as database consultant takes most of my time and energy.



Hi,

As suggested by Jacob, here is a quick message to say that I didn't find
enough time to work further on this patch, but I didn't completely
forget it either. I moved it to the next commitfest. Hopefully I will
find enough time and motivation in the coming months :-)


Hi Frédéric

This patch has been carried forward for a couple more commitfests since
your message; do you think you'll be able to work on it further during this
release cycle?



Hi Ian,

I've planned to work on it full time on week 10 (6-10 March), if you 
agree to bear with me. The idea would be to bootstrap my brain on it, 
and then continue to work on it from time to time.


Best regards,
Frédéric




Re: Allow parallel plan for referential integrity checks?

2022-12-10 Thread Ian Lawrence Barwick
2022年7月26日(火) 20:58 Frédéric Yhuel :
>
>
>
> On 4/14/22 14:25, Frédéric Yhuel wrote:
> >
> >
> > On 3/19/22 01:57, Imseih (AWS), Sami wrote:
> >> I looked at your patch and it's a good idea to make foreign key
> >> validation
> >> use parallel query on large relations.
> >>
> >> It would be valuable to add logging to ensure that the ActiveSnapshot
> >> and TransactionSnapshot
> >> is the same for the leader and the workers. This logging could be
> >> tested in the TAP test.
> >>
> >> Also, inside RI_Initial_Check you may want to set max_parallel_workers to
> >> max_parallel_maintenance_workers.
> >>
> >> Currently the work_mem is set to maintenance_work_mem. This will also
> >> require
> >> a doc change to call out.
> >>
> >> /*
> >>   * Temporarily increase work_mem so that the check query can be
> >> executed
> >>   * more efficiently.  It seems okay to do this because the query
> >> is simple
> >>   * enough to not use a multiple of work_mem, and one typically
> >> would not
> >>   * have many large foreign-key validations happening
> >> concurrently.  So
> >>   * this seems to meet the criteria for being considered a
> >> "maintenance"
> >>   * operation, and accordingly we use maintenance_work_mem.
> >> However, we
> >>
> >
> > Hello Sami,
> >
> > Thank you for your review!
> >
> > I will try to do as you say, but it will take time, since my daily job
> > as database consultant takes most of my time and energy.
> >
>
> Hi,
>
> As suggested by Jacob, here is a quick message to say that I didn't find
> enough time to work further on this patch, but I didn't completely
> forget it either. I moved it to the next commitfest. Hopefully I will
> find enough time and motivation in the coming months :-)

Hi Frédéric

This patch has been carried forward for a couple more commitfests since
your message; do you think you'll be able to work on it further during this
release cycle?

Thanks

Ian Barwick




Re: Allow parallel plan for referential integrity checks?

2022-07-26 Thread Frédéric Yhuel




On 4/14/22 14:25, Frédéric Yhuel wrote:



On 3/19/22 01:57, Imseih (AWS), Sami wrote:
I looked at your patch and it's a good idea to make foreign key 
validation

use parallel query on large relations.

It would be valuable to add logging to ensure that the ActiveSnapshot 
and TransactionSnapshot
is the same for the leader and the workers. This logging could be 
tested in the TAP test.


Also, inside RI_Initial_Check you may want to set max_parallel_workers to
max_parallel_maintenance_workers.

Currently the work_mem is set to maintenance_work_mem. This will also 
require

a doc change to call out.

/*
  * Temporarily increase work_mem so that the check query can be 
executed
  * more efficiently.  It seems okay to do this because the query 
is simple
  * enough to not use a multiple of work_mem, and one typically 
would not
  * have many large foreign-key validations happening 
concurrently.  So
  * this seems to meet the criteria for being considered a 
"maintenance"
  * operation, and accordingly we use maintenance_work_mem.  
However, we




Hello Sami,

Thank you for your review!

I will try to do as you say, but it will take time, since my daily job 
as database consultant takes most of my time and energy.




Hi,

As suggested by Jacob, here is a quick message to say that I didn't find 
enough time to work further on this patch, but I didn't completely 
forget it either. I moved it to the next commitfest. Hopefully I will 
find enough time and motivation in the coming months :-)


Best regards,
Frédéric




Re: Allow parallel plan for referential integrity checks?

2022-04-14 Thread Frédéric Yhuel




On 3/19/22 01:57, Imseih (AWS), Sami wrote:

I looked at your patch and it's a good idea to make foreign key validation
use parallel query on large relations.

It would be valuable to add logging to ensure that the ActiveSnapshot and 
TransactionSnapshot
is the same for the leader and the workers. This logging could be tested in the 
TAP test.

Also, inside RI_Initial_Check you may want to set max_parallel_workers to
max_parallel_maintenance_workers.

Currently the work_mem is set to maintenance_work_mem. This will also require
a doc change to call out.

/*
  * Temporarily increase work_mem so that the check query can be executed
  * more efficiently.  It seems okay to do this because the query is simple
  * enough to not use a multiple of work_mem, and one typically would not
  * have many large foreign-key validations happening concurrently.  So
  * this seems to meet the criteria for being considered a "maintenance"
  * operation, and accordingly we use maintenance_work_mem.  However, we



Hello Sami,

Thank you for your review!

I will try to do as you say, but it will take time, since my daily job 
as database consultant takes most of my time and energy.


Best regards,
Frédéric




Re: Allow parallel plan for referential integrity checks?

2022-03-18 Thread Imseih (AWS), Sami
I looked at your patch and it's a good idea to make foreign key validation
use parallel query on large relations.

It would be valuable to add logging to ensure that the ActiveSnapshot and 
TransactionSnapshot 
is the same for the leader and the workers. This logging could be tested in the 
TAP test.

Also, inside RI_Initial_Check you may want to set max_parallel_workers to 
max_parallel_maintenance_workers.

Currently the work_mem is set to maintenance_work_mem. This will also require
a doc change to call out.

/*
 * Temporarily increase work_mem so that the check query can be executed
 * more efficiently.  It seems okay to do this because the query is simple
 * enough to not use a multiple of work_mem, and one typically would not
 * have many large foreign-key validations happening concurrently.  So
 * this seems to meet the criteria for being considered a "maintenance"
 * operation, and accordingly we use maintenance_work_mem.  However, we







Re: Allow parallel plan for referential integrity checks?

2022-03-03 Thread Frédéric Yhuel

Hello, sorry for the late reply.

On 2/14/22 15:33, Robert Haas wrote:

On Mon, Feb 7, 2022 at 5:26 AM Frédéric Yhuel  wrote:

I noticed that referential integrity checks aren't currently
parallelized. Is it on purpose?


It's not 100% clear to me that it is safe. But on the other hand, it's
also not 100% clear to me that it is unsafe.

Generally, I only added CURSOR_OPT_PARALLEL_OK in places where I was
confident that nothing bad would happen, and this wasn't one of those
places. It's something of a nested-query environment -- your criterion
#6. How do we know that the surrounding query isn't already parallel?
Perhaps because it must be DML,


Did you mean DDL?


but I think it must be more important
to support parallel DML than to support this.




I'm not sure what the right thing to do here is, but I think it would
be good if your patch included a test case.



Would that be a regression test? (in src/test/regress ?)

If yes, what should I check? Is it good enough to load auto_explain and 
check that the query triggered by a foreign key addition is parallelized?


Best regards,
Frédéric




Re: Allow parallel plan for referential integrity checks?

2022-02-14 Thread Robert Haas
On Mon, Feb 7, 2022 at 5:26 AM Frédéric Yhuel  wrote:
> I noticed that referential integrity checks aren't currently
> parallelized. Is it on purpose?

It's not 100% clear to me that it is safe. But on the other hand, it's
also not 100% clear to me that it is unsafe.

Generally, I only added CURSOR_OPT_PARALLEL_OK in places where I was
confident that nothing bad would happen, and this wasn't one of those
places. It's something of a nested-query environment -- your criterion
#6. How do we know that the surrounding query isn't already parallel?
Perhaps because it must be DML, but I think it must be more important
to support parallel DML than to support this.

I'm not sure what the right thing to do here is, but I think it would
be good if your patch included a test case.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Allow parallel plan for referential integrity checks?

2022-02-14 Thread Frédéric Yhuel




On 2/11/22 00:16, Andreas Karlsson wrote:

On 2/7/22 11:26, Frédéric Yhuel wrote:
Attached is a (naive) patch that aims to fix the case of a FK 
addition, but the handling of the flag CURSOR_OPT_PARALLEL_OK, 
generally speaking, looks rather hackish.


Thanks, for the patch. You can add it to the current open commitfest 
(https://commitfest.postgresql.org/37/).




OK, I just did. Please let me know if I did something wrong.

Best regards,
Frédéric




Re: Allow parallel plan for referential integrity checks?

2022-02-10 Thread Andreas Karlsson

On 2/7/22 11:26, Frédéric Yhuel wrote:
Attached is a (naive) patch that aims to fix the case of a FK addition, 
but the handling of the flag CURSOR_OPT_PARALLEL_OK, generally speaking, 
looks rather hackish.


Thanks, for the patch. You can add it to the current open commitfest 
(https://commitfest.postgresql.org/37/).


Best regards,
Andreas




Allow parallel plan for referential integrity checks?

2022-02-07 Thread Frédéric Yhuel

Hello,

I noticed that referential integrity checks aren't currently 
parallelized. Is it on purpose?


From the documentation [1], the planner will not generate a parallel 
plan for a given query if any of the following are true:


1) The system is running in single-user mode.
2) max_parallel_workers_per_gather = 0.
3) The query writes any data or locks any database rows.
4) The query might be suspended during execution (cursor or  PL/pgSQL loop).
5) The query uses any function marked PARALLEL UNSAFE.
6) The query is running inside of another query that is already parallel.

From my understanding of the code, it seems to me that the fourth 
condition is not always accurately detected. In particular, the query 
triggered by a foreign key addition (a Left Excluding JOIN between the 
two tables) isn't parallelized, because the flag CURSOR_OPT_PARALLEL_OK 
hasn't been set at some point. I couldn't find any reason why not, but 
maybe I missed something.


Attached is a (naive) patch that aims to fix the case of a FK addition, 
but the handling of the flag CURSOR_OPT_PARALLEL_OK, generally speaking, 
looks rather hackish.


Best regards,
Frédéric

[1] 
https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.htmlFrom 1353a4b7e7d08b13cbaa85a5f9ae26819b5cf2c8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= 
Date: Sun, 6 Feb 2022 13:31:55 +0100
Subject: [PATCH] Allow parallel plan for referential integrity checks

---
 src/backend/utils/adt/ri_triggers.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 96269fc2ad..84c16b6962 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -1317,6 +1317,7 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	char		workmembuf[32];
 	int			spi_result;
 	SPIPlanPtr	qplan;
+	SPIPrepareOptions options;
 
 	riinfo = ri_FetchConstraintInfo(trigger, fk_rel, false);
 
@@ -1483,10 +1484,12 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	 * Generate the plan.  We don't need to cache it, and there are no
 	 * arguments to the plan.
 	 */
-	qplan = SPI_prepare(querybuf.data, 0, NULL);
+	memset(, 0, sizeof(options));
+	options.cursorOptions = CURSOR_OPT_PARALLEL_OK;
+	qplan = SPI_prepare_extended(querybuf.data, );
 
 	if (qplan == NULL)
-		elog(ERROR, "SPI_prepare returned %s for %s",
+		elog(ERROR, "SPI_prepare_extended returned %s for %s",
 			 SPI_result_code_string(SPI_result), querybuf.data);
 
 	/*
-- 
2.30.2