Re: Allowing parallel-safe initplans

2024-10-02 Thread Frédéric Yhuel
On 12/04/2023 at 20:06, Robert Haas wrote: There's only one existing test case that visibly changes plan with these changes. The new plan is clearly saner-looking than before, and testing with some data loaded into the table confirms that it is faster. I'm not sure if it's worth devising m

Re: pgstattuple: fix free space calculation

2024-09-09 Thread Frédéric Yhuel
On 9/7/24 22:45, Tom Lane wrote: I wrote: Now alternatively you could argue that a "new" page isn't usable free space yet and so we should count it as zero, just as we don't count dead tuples as usable free space. You need VACUUM to turn either of those things into real free space. But that'

Re: pgstattuple: fix free space calculation

2024-09-09 Thread Frédéric Yhuel
Hi Tom, thanks for your review. On 9/7/24 22:10, Tom Lane wrote: I looked at this patch. I agree with making the change. However, I don't agree with the CF entry's marking of "target version: stable" (i.e., requesting back-patch). I think this falls somewhere in the gray area between a bug fi

Re: pgstattuple: fix free space calculation

2024-08-29 Thread Frédéric Yhuel
On 8/23/24 12:51, Frédéric Yhuel wrote: On 8/23/24 12:02, Rafia Sabih wrote: On the other hand, this got me thinking about the purpose of this space information. If we want to understand that there's still some space for the tuples in a page, then using PageGetExactFreeSpace is not

Re: pgstattuple: fix free space calculation

2024-08-23 Thread Frédéric Yhuel
On 8/23/24 12:02, Rafia Sabih wrote: On the other hand, this got me thinking about the purpose of this space information. If we want to understand that there's still some space for the tuples in a page, then using PageGetExactFreeSpace is not doing justice in the case of a heap page, because we wi

Re: pgstattuple: fix free space calculation

2024-08-23 Thread Frédéric Yhuel
On 8/22/24 21:56, Rafia Sabih wrote: I agree with the approach here. A minor comment here is to change the comments in the code referring to PageGetHeapFreeSpace. Thank you Rafia. Here is a v2 patch. I've also added this to the commit message: Also, PageGetHeapFreeSpace() will return zero

pgstattuple: fix free space calculation

2024-08-22 Thread Frédéric Yhuel
Hello, I think that pgstattuple should use PageGetExactFreeSpace() instead of PageGetHeapFreeSpace() or PageGetFreeSpace(). The latter two compute the free space minus the space of a line pointer. They are used like this in the rest of the code (heapam.c): pagefree = PageGetHeapFreeSpace(pag
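A minimal SQL sketch of where the figure in question surfaces (assuming the pgstattuple extension is installed; the table and data are invented for illustration): the free_space column sums the per-page free space over the relation, and the proposal changes which server-side function computes that per-page value.

    CREATE EXTENSION IF NOT EXISTS pgstattuple;

    CREATE TABLE demo (id int, val text);
    INSERT INTO demo SELECT g, repeat('x', 100) FROM generate_series(1, 1000) g;

    -- free_space / free_percent are what the proposed switch to
    -- PageGetExactFreeSpace() would affect
    SELECT free_space, free_percent FROM pgstattuple('demo');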

Re: New GUC autovacuum_max_threshold ?

2024-08-12 Thread Frédéric Yhuel
… (instead of 51M currently) for a 256M tuples table; * 3M (instead of 5M currently) for a 25.6M tuples table. The other advantage is that you don't need another GUC. On Tue, Jun 18, 2024 at 12:36:42PM +0200, Frédéric Yhuel wrote: By the way, I wonder if there were any off-list discussions after Ro

Re: New GUC autovacuum_max_threshold ?

2024-06-18 Thread Frédéric Yhuel
On 18/06/2024 at 05:06, Nathan Bossart wrote: I didn't see a commitfest entry for this, so I created one to make sure we don't lose track of this: https://commitfest.postgresql.org/48/5046/ OK thanks! By the way, I wonder if there were any off-list discussions after Robert's co

Re: New GUC autovacuum_max_threshold ?

2024-05-13 Thread Frédéric Yhuel
On 09/05/2024 at 16:58, Robert Haas wrote: As I see it, a lot of the lack of agreement up until now is people just not understanding the math. Since I think I've got the right idea about the math, I attribute this to other people being confused about what is going to happen and would tend to

Re: New GUC autovacuum_max_threshold ?

2024-05-01 Thread Frédéric Yhuel
On 01/05/2024 at 20:50, Robert Haas wrote: Possibly what we need here is something other than a cap, where, say, we vacuum a 10GB table twice as often as now, a 100GB table four times as often, and a 1TB table eight times as often. Or whatever the right answer is. IMO, it would make more s

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Frédéric Yhuel
On 25/04/2024 at 22:21, Robert Haas wrote: The analyze case, I feel, is really murky. autovacuum_analyze_scale_factor stands for the proposition that as the table becomes larger, analyze doesn't need to be done as often. If what you're concerned about is the frequency estimates, that's true:

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Frédéric Yhuel
On 26/04/2024 at 04:24, Laurenz Albe wrote: On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote: I believe that the underlying problem here can be summarized in this way: just because I'm OK with 2MB of bloat in my 10MB table doesn't mean that I'm OK with 2TB of bloat in my 10TB table. One

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Frédéric Yhuel
On 25/04/2024 at 18:51, Melanie Plageman wrote: I'm not too sure I understand. What are the reasons it might be skipped? I can think of a concurrent index creation on the same table, or anything holding a SHARE UPDATE EXCLUSIVE lock or above. Is this the sort of thing you are talking about?

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Frédéric Yhuel
On 25/04/2024 at 21:21, Nathan Bossart wrote: On Thu, Apr 25, 2024 at 02:33:05PM -0400, Robert Haas wrote: What does surprise me is that Frédéric suggests a default value of 500,000. If half a million tuples (proposed default) is 20% of your table (default value of autovacuum_vacuum_scale_f

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Frédéric Yhuel
Hi Nathan, thanks for your review. On 24/04/2024 at 21:57, Nathan Bossart wrote: Yeah, I'm having trouble following the proposed mechanics for this new GUC, and it's difficult to understand how users would choose a value. If we just want to cap the number of tuples required before autovacuum

Re: New GUC autovacuum_max_threshold ?

2024-04-24 Thread Frédéric Yhuel
On 24/04/2024 at 21:10, Melanie Plageman wrote: On Wed, Apr 24, 2024 at 8:08 AM Frédéric Yhuel wrote: Hello, I would like to suggest a new parameter, autovacuum_max_threshold, which would set an upper limit on the number of tuples to delete/update/insert prior to vacuum/analyze. Hi

New GUC autovacuum_max_threshold ?

2024-04-24 Thread Frédéric Yhuel
Hello, I would like to suggest a new parameter, autovacuum_max_threshold, which would set an upper limit on the number of tuples to delete/update/insert prior to vacuum/analyze. A good default might be 500,000. The idea would be to replace the following calculation: vacthresh = (float4) vac
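A rough illustration of the arithmetic behind the proposal (not taken from the thread), using the default autovacuum_vacuum_threshold = 50, the default autovacuum_vacuum_scale_factor = 0.2, and the 500,000 cap discussed later in the thread:

    -- current trigger point vs. a hypothetical 500,000-tuple cap
    SELECT reltuples::bigint                      AS table_rows,
           50 + 0.2 * reltuples                   AS current_vacthresh,
           LEAST(50 + 0.2 * reltuples, 500000)    AS capped_vacthresh
    FROM (VALUES (1e6), (1e8), (1e9)) AS t(reltuples);
    -- a billion-row table currently needs ~200 million dead tuples
    -- before autovacuum kicks in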

Re: Set log_lock_waits=on by default

2023-12-21 Thread Frédéric Yhuel
On 21/12/2023 at 14:29, Laurenz Albe wrote: Here is a patch to implement this. Being stuck behind a lock for more than a second is almost always a problem, so it is reasonable to turn this on by default. I think it's a really good idea. At Dalibo, we advise our customers to switch it on.
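For context, this is how the setting is enabled on a running instance today (a plain sketch, nothing patch-specific); log_lock_waits logs any lock wait that lasts longer than deadlock_timeout:

    ALTER SYSTEM SET log_lock_waits = on;
    SELECT pg_reload_conf();
    SHOW deadlock_timeout;   -- 1s by default, hence "more than a second" above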

Re: Out of memory error handling in frontend code

2023-10-06 Thread Frédéric Yhuel
Hi Daniel, Thank you for your answer. On 9/28/23 14:02, Daniel Gustafsson wrote: On 28 Sep 2023, at 10:14, Frédéric Yhuel wrote: After some time, we understood that the 20 million large objects were responsible for the huge memory usage (more than 10 GB) by pg_dump. This sounds like

Out of memory error handling in frontend code

2023-09-28 Thread Frédéric Yhuel
Hello, One of our customers recently complained that his pg_dump stopped abruptly with the message "out of memory". After some time, we understood that the 20 million large objects were responsible for the huge memory usage (more than 10 GB) by pg_dump. I think a more useful error messag
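A simple way (illustrative, not from the thread) to check up front how many large objects a database holds before dumping it:

    -- one row per large object in the current database
    SELECT count(*) AS large_object_count FROM pg_largeobject_metadata;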

Re: Allow parallel plan for referential integrity checks?

2023-08-17 Thread Frédéric Yhuel
On 8/17/23 14:00, Frédéric Yhuel wrote: On 8/17/23 09:32, Frédéric Yhuel wrote: On 8/10/23 17:06, Juan José Santamaría Flecha wrote: Recently I restored a database from a directory format backup and having this feature would have been quite useful Hi, Thanks for resuming work on this

Re: Allow parallel plan for referential integrity checks?

2023-08-17 Thread Frédéric Yhuel
On 8/17/23 09:32, Frédéric Yhuel wrote: On 8/10/23 17:06, Juan José Santamaría Flecha wrote: Recently I restored a database from a directory format backup and having this feature would have been quite useful Hi, Thanks for resuming work on this patch. I forgot to mention this in my

Re: Allow parallel plan for referential integrity checks?

2023-08-17 Thread Frédéric Yhuel
On 8/10/23 17:06, Juan José Santamaría Flecha wrote: Recently I restored a database from a directory format backup and having this feature would have been quite useful Hi, Thanks for resuming work on this patch. I forgot to mention this in my original email, but the motivation was also to

Re: Allow parallel plan for referential integrity checks?

2023-03-20 Thread Frédéric Yhuel
On 3/20/23 15:58, Gregory Stark (as CFM) wrote: On Mon, 12 Dec 2022 at 11:37, Frédéric Yhuel wrote: I've planned to work on it full time during week 10 (6-10 March), if you agree to bear with me. The idea would be to bootstrap my brain on it, and then continue to work on it from time to

Re: Allow parallel plan for referential integrity checks?

2022-12-12 Thread Frédéric Yhuel
On 12/11/22 06:29, Ian Lawrence Barwick wrote: On Tue, Jul 26, 2022 at 20:58, Frédéric Yhuel wrote: On 4/14/22 14:25, Frédéric Yhuel wrote: On 3/19/22 01:57, Imseih (AWS), Sami wrote: I looked at your patch and it's a good idea to make foreign key validation use parallel query on large relations

Re: [PATCH] minor optimization for ineq_histogram_selectivity()

2022-11-24 Thread Frédéric Yhuel
On 11/23/22 16:59, Tom Lane wrote: Frédéric Yhuel writes: On 10/24/22 17:26, Frédéric Yhuel wrote: When studying the weird planner issue reported here [1], I came up with the attached patch. It reduces the probability of calling get_actual_variable_range(). This

Re: [PATCH] minor optimization for ineq_histogram_selectivity()

2022-11-23 Thread Frédéric Yhuel
On 10/24/22 17:26, Frédéric Yhuel wrote: Hello, When studying the weird planner issue reported here [1], I came up with the attached patch. It reduces the probability of calling get_actual_variable_range(). This isn't very useful anymore thanks to this patch: https://git.postgresq

Re: [PATCH] minor optimization for ineq_histogram_selectivity()

2022-10-31 Thread Frédéric Yhuel
On 10/24/22 17:26, Frédéric Yhuel wrote: Hello, When studying the weird planner issue reported here [1], I came up with the attached patch. It reduces the probability of calling get_actual_variable_range(). The patch applies to the master branch. How to test : CREATE TABLE foo (a

Re: Transparent column encryption

2022-10-28 Thread Frédéric Yhuel
Hi, Here are a few more things I noticed: If a CEK is encrypted with cmk1 and cmk2, but cmk1 isn't found on the client, the following error is printed twice for the very first SELECT statement: could not open file "/path/to/cmk1.pem": No such file or directory ...and nothing is returned.

[PATCH] minor optimization for ineq_histogram_selectivity()

2022-10-24 Thread Frédéric Yhuel
Hello, When studying the weird planner issue reported here [1], I came up with the attached patch. It reduces the probability of calling get_actual_variable_range(). The patch applies to the master branch. How to test: CREATE TABLE foo (a bigint, b TEXT) WITH (autovacuum_enabled = off); IN
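An illustrative setup (not necessarily the author's exact test case) that makes the planner call get_actual_variable_range(): a range predicate near the column maximum, with many freshly deleted tuples at the top of the index and autovacuum disabled:

    CREATE TABLE foo (a bigint, b text) WITH (autovacuum_enabled = off);
    INSERT INTO foo SELECT g, md5(g::text) FROM generate_series(1, 1000000) g;
    CREATE INDEX ON foo (a);
    ANALYZE foo;
    DELETE FROM foo WHERE a > 900000;   -- dead index entries remain until VACUUM
    -- planning probes the index for the actual maximum of "a";
    -- the Planning Time line shows the cost of that probe
    EXPLAIN (SUMMARY) SELECT * FROM foo WHERE a > 950000;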

Re: [PATCH] minor bug fix for pg_dump --clean

2022-10-24 Thread Frédéric Yhuel
On 10/24/22 03:01, Tom Lane wrote: Frédéric Yhuel writes: When using pg_dump (or pg_restore) with option "--clean", there is some SQL code to drop every object at the beginning. Yup ... The DROP statement for a view involving circular dependencies is: CREATE OR REPLA

[PATCH] minor bug fix for pg_dump --clean

2022-09-01 Thread Frédéric Yhuel
Hello, When using pg_dump (or pg_restore) with option "--clean", there is some SQL code to drop every object at the beginning. The DROP statement for a view involving circular dependencies is: CREATE OR REPLACE VIEW [...] (see the commit message of d8c05aff for a much better explanation) If t

Re: Allow parallel plan for referential integrity checks?

2022-07-26 Thread Frédéric Yhuel
On 4/14/22 14:25, Frédéric Yhuel wrote: On 3/19/22 01:57, Imseih (AWS), Sami wrote: I looked at your patch and it's a good idea to make foreign key validation use parallel query on large relations. It would be valuable to add logging to ensure that the ActiveSnapshot

Re: Allow parallel plan for referential integrity checks?

2022-04-14 Thread Frédéric Yhuel
On 3/19/22 01:57, Imseih (AWS), Sami wrote: I looked at your patch and it's a good idea to make foreign key validation use parallel query on large relations. It would be valuable to add logging to ensure that the ActiveSnapshot and TransactionSnapshot are the same for the leader and the worke

Re: REINDEX blocks virtually any queries but some prepared queries.

2022-04-11 Thread Frédéric Yhuel
On 4/11/22 02:57, Michael Paquier wrote: On Fri, Apr 08, 2022 at 04:23:48PM +0200, Frédéric Yhuel wrote: Thank you Michael. And done as of 8ac700a. -- Thank you Michael! For reference purposes, we can see in the code of get_relation_info(), in plancat.c, that indeed every index of the

Re: REINDEX blocks virtually any queries but some prepared queries.

2022-04-08 Thread Frédéric Yhuel
On 4/8/22 02:22, Michael Paquier wrote: On Thu, Apr 07, 2022 at 05:29:36PM +0200, Guillaume Lelarge wrote: On Thu, Apr 7, 2022 at 15:44, Frédéric Yhuel wrote: On 4/7/22 14:40, Justin Pryzby wrote: Thank you Justin! I applied your fixes in the v2 patch (attached). v2 patch sounds

Re: REINDEX blocks virtually any queries but some prepared queries.

2022-04-07 Thread Frédéric Yhuel
On 4/7/22 14:40, Justin Pryzby wrote: On Thu, Apr 07, 2022 at 01:37:57PM +0200, Frédéric Yhuel wrote: Maybe something along this line? (patch attached) Some language fixes. Thank you Justin! I applied your fixes in the v2 patch (attached). I didn't verify the behavior, but +1 to doc

Re: REINDEX blocks virtually any queries but some prepared queries.

2022-04-07 Thread Frédéric Yhuel
On 4/6/22 17:03, Peter Geoghegan wrote: On Wed, Apr 6, 2022 at 7:49 AM Frédéric Yhuel wrote: From the documentation (https://www.postgresql.org/docs/current/sql-reindex.html#id-1.9.3.162.7), it sounds like REINDEX won't block read queries that don't need the index. But it seem

REINDEX blocks virtually any queries but some prepared queries.

2022-04-06 Thread Frédéric Yhuel
Hello, From the documentation (https://www.postgresql.org/docs/current/sql-reindex.html#id-1.9.3.162.7), it sounds like REINDEX won't block read queries that don't need the index. But it seems like the planner wants to take an ACCESS SHARE lock on every index, regardless of the query, and s
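A two-session sketch (object names invented) of the behaviour described above: REINDEX holds an ACCESS EXCLUSIVE lock on the index, and a plain SELECT blocks because planning acquires an ACCESS SHARE lock on every index of the table, including ones the final plan would not use:

    CREATE TABLE t (a int, b int);
    CREATE INDEX t_a_idx ON t (a);
    CREATE INDEX t_b_idx ON t (b);

    -- session 1
    BEGIN;
    REINDEX INDEX t_b_idx;        -- ACCESS EXCLUSIVE on t_b_idx, held until COMMIT

    -- session 2
    SELECT * FROM t WHERE a = 1;  -- blocks: planning wants ACCESS SHARE on t_b_idx too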

Re: Allow parallel plan for referential integrity checks?

2022-03-03 Thread Frédéric Yhuel
Hello, sorry for the late reply. On 2/14/22 15:33, Robert Haas wrote: On Mon, Feb 7, 2022 at 5:26 AM Frédéric Yhuel wrote: I noticed that referential integrity checks aren't currently parallelized. Is it on purpose? It's not 100% clear to me that it is safe. But on the other hand,

Re: Allow parallel plan for referential integrity checks?

2022-02-14 Thread Frédéric Yhuel
On 2/11/22 00:16, Andreas Karlsson wrote: On 2/7/22 11:26, Frédéric Yhuel wrote: Attached is a (naive) patch that aims to fix the case of a FK addition, but the handling of the flag CURSOR_OPT_PARALLEL_OK, generally speaking, looks rather hackish. Thanks for the patch. You can add it to

Should pg_restore vacuum the tables before the post-data stage?

2022-02-08 Thread Frédéric Yhuel
Hello, I was wondering if pg_restore should call VACUUM ANALYZE for all tables, after the "COPY" stage, and before the "post-data" stage. Indeed, without such a VACUUM, the visibility map isn't available. Depending on the size of the tables and on the configuration, a foreign key constraint

Allow parallel plan for referential integrity checks?

2022-02-07 Thread Frédéric Yhuel
Hello, I noticed that referential integrity checks aren't currently parallelized. Is it on purpose? From the documentation [1], the planner will not generate a parallel plan for a given query if any of the following are true: 1) The system is running in single-user mode. 2) max_parallel_wor
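For reference, the validation step of ALTER TABLE ... ADD FOREIGN KEY internally runs a query roughly of the shape below (simplified; table and column names invented), and the proposal is to let the planner consider a parallel plan for it on large tables:

    SELECT fk.customer_id
    FROM ONLY orders fk
    LEFT OUTER JOIN ONLY customers pk ON pk.id = fk.customer_id
    WHERE pk.id IS NULL AND fk.customer_id IS NOT NULL;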