Add LZ4 compression in pg_dump
Hi,

Please find attached a patchset which adds LZ4 compression in pg_dump.

The first commit does the heavy lifting required for additional compression methods. It expands testing coverage for the already supported gzip compression. Commit bf9aa490db introduced cfp in compress_io.{c,h} with the intent of unifying compression-related code and allowing for the introduction of additional archive formats. However, pg_backup_archiver.c was not using that API. This commit teaches pg_backup_archiver.c about cfp and uses it throughout.

Furthermore, compression used to be chosen based on the value of the level passed as an argument during the invocation of pg_dump, or on some hardcoded defaults. This does not scale to more than one compression method. Now the method used for compression can be explicitly requested during command invocation, or set by hardcoded defaults. It is then stored in the relevant structs and passed to the relevant functions, alongside the compression level, which has lost its special meaning. The method for compression is not yet stored in the actual archive; that is done in the next commit, which introduces a new method. The enum previously named CompressionAlgorithm is renamed to CompressionMethod so that it better matches similar variables found throughout the code base.

In a fashion similar to pg_basebackup, the compression method is passed using the already existing -Z/--compress parameter of pg_dump. The legacy format and behaviour are maintained. Additionally, the user can explicitly pass a requested method, optionally followed by the level after a colon, e.g. --compress=gzip:6

The second commit adds LZ4 compression in pg_dump and pg_restore. Within compress_io.{c,h} there are two distinct APIs exposed: a streaming API and a file API. The first is aimed at inlined use cases, so simple lz4.h calls can be used directly.
The second one generates output, or parses input, that can be read or generated via the lz4 utility. In the latter case, the API uses an opaque wrapper around a file stream, acquired via fopen() or gzopen() respectively. It then provides wrappers around fread(), fwrite(), fgets(), fgetc(), feof(), and fclose(), or their gz equivalents. However, the LZ4F API does not provide this functionality, so it has been implemented locally. In order to maintain API compatibility, a new structure, LZ4File, is introduced. It is responsible for keeping state and any as-yet unused generated content. The latter is required when the generated decompressed output exceeds the caller's buffer capacity.

Custom compressed archives now need to store the compression method in their header. This requires a bump in the version number. The level of compression is still stored in the dump, though admittedly it is of no apparent use.

The series is authored by me. Rachel Heaton helped out with the expansion of the testing coverage, testing on different platforms and providing debug information on those, as well as native-speaker wording.

Cheers,
//Georgios

From 8dfe12b08378571add6e1d178bed97a61e988593 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos
Date: Mon, 22 Nov 2021 09:58:49 +
Subject: [PATCH v1 1/2] Prepare pg_dump for additional compression methods
---
 doc/src/sgml/ref/pg_dump.sgml | 30 +-
 src/bin/pg_dump/Makefile | 2 +
 src/bin/pg_dump/compress_io.c
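The "method[:level]" syntax described above, plus the legacy bare-integer form, can be sketched as a small standalone parser. This is a hypothetical illustration with made-up names, not the patch's actual option-handling code:

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: split a --compress argument of the form
 * "method[:level]" (e.g. "gzip:6", "lz4"), or the legacy bare integer,
 * into a method name and a level.  A level of -1 means "use the method's
 * default".  Returns 0 on success, -1 on a parse failure.
 */
static int
parse_compress_spec(const char *spec, char *method, size_t methodsz,
                    int *level)
{
    const char *colon = strchr(spec, ':');

    *level = -1;

    if (colon == NULL)
    {
        /* Legacy form: a bare integer selects gzip (0 means none). */
        if (isdigit((unsigned char) spec[0]))
        {
            *level = atoi(spec);
            snprintf(method, methodsz, "%s", *level == 0 ? "none" : "gzip");
            return 0;
        }
        /* Bare method name, default level. */
        snprintf(method, methodsz, "%s", spec);
        return 0;
    }

    /* "method:level" form. */
    if ((size_t) (colon - spec) >= methodsz)
        return -1;
    memcpy(method, spec, colon - spec);
    method[colon - spec] = '\0';
    if (sscanf(colon + 1, "%d", level) != 1)
        return -1;
    return 0;
}
```

The point of keeping the legacy branch separate is that existing invocations such as -Z5 keep meaning "gzip at level 5" unchanged.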
Re: psql: \dl+ to list large objects privileges
The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Hi,

There is something I forgot to mention in my previous review. Typically, when describing a thing in psql, it is the column "Description" that belongs in the verbose version. For example, listing indexes produces:

                List of relations
 Schema | Name | Type | Owner | Table

and the verbose version:

                List of relations
 Schema | Name | Type | Owner | Table | Persistence | Access method | Size | Description

Since '\dl' already contains a description, it might be worthwhile to consider adding the column `Access privileges` without introducing a verbose version. Thoughts?

Cheers,
//Georgios
Re: psql: \dl+ to list large objects privileges
The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Hi,

Thank you for the patch. I personally think it is a useful addition, and thus it gets my vote. However, I also think that the proposed code will need some changes.

On a high level, I recommend the addition of tests. There are similar tests present in:
./src/test/regress/sql/psql.sql

I will also inquire as to the need for renaming the function `do_lo_list` to `listLargeObjects` and moving it from large_obj.c to describe.c. In itself it is not necessarily a blocking point, though it will require some strong arguments for doing so.

Applying the patch generates several whitespace warnings. It would be helpful if those warnings were removed.

The patch contains:

            case 'l':
-               success = do_lo_list();
+               success = listLargeObjects(show_verbose);

It might be of some interest to consider checking the value of the next character in the command, and emitting an error if it is not valid. Such a pattern can be found in the same switch block, for example:

            switch (cmd[2])
            {
                case '\0':
                case '+':
                    success = ...
                    break;
                default:
                    status = PSQL_CMD_UNKNOWN;
                    break;
            }

The patch contains:

        else if (strcmp(cmd + 3, "list") == 0)
-           success = do_lo_list();
+           success = listLargeObjects(false);
+
+       else if (strcmp(cmd + 3, "list+") == 0)
+           success = listLargeObjects(true);

In a fashion similar to `exec_command_list`, it might be interesting to consider expressing the above as:

        show_verbose = strchr(cmd, '+') ? true : false;
        ...
        else if (strcmp(cmd + 3, "list") == 0)
            success = do_lo_list(show_verbose);

Thoughts?

Cheers,
//Georgios
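The suffix-handling pattern suggested in the review above can be shown as a self-contained sketch. The function name and the dispatch shape are illustrative only; psql's real command dispatch in command.c is more involved:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Illustrative sketch of the review's suggestion, not psql's actual code:
 * accept "lo_list" or "lo_list+", derive the verbose flag from the
 * trailing '+', and reject any other suffix as an unknown command.
 */
static bool
lo_list_command_ok(const char *cmd, bool *show_verbose)
{
    const char *suffix;

    if (strncmp(cmd, "lo_list", strlen("lo_list")) != 0)
        return false;

    suffix = cmd + strlen("lo_list");
    *show_verbose = (strchr(cmd, '+') != NULL);

    switch (suffix[0])
    {
        case '\0':              /* \lo_list  */
        case '+':               /* \lo_list+ */
            return true;
        default:
            return false;       /* e.g. \lo_listx is unknown */
    }
}
```

This collapses the two strcmp branches into one, while still rejecting garbage suffixes the way the existing switch-on-`cmd[2]` blocks do.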
Introduce pg_receivewal gzip compression tests
Hi,

As suggested on a different thread [1], pg_receivewal can increase its test coverage. There exists a non-trivial amount of code that handles gzip compression. The current patch introduces tests that cover the creation of gzip-compressed WAL files and the handling of gzip partial segments. Finally, the integrity of the compressed files is verified. I hope you find this useful.

Cheers,
//Georgios

[1] https://www.postgresql.org/message-id/flat/ZCm1J5vfyQ2E6dYvXz8si39HQ2gwxSZ3IpYaVgYa3lUwY88SLapx9EEnOf5uEwrddhx2twG7zYKjVeuP5MwZXCNPybtsGouDsAD1o2L_I5E%3D%40pm.me

v1-0001-Introduce-pg_receivewal-gzip-compression-tests.patch
Description: Binary data
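A minimal form of the integrity check mentioned above can be sketched in C. This is a hypothetical helper, not part of the patch (which implements its checks as Perl TAP tests): every valid gzip stream starts with the two magic bytes defined by RFC 1952.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical smoke test, not the patch's code: a gzip member begins
 * with the RFC 1952 magic bytes 0x1f 0x8b.  A buffer failing this check
 * certainly is not a gzip-compressed WAL segment.
 */
static bool
looks_like_gzip(const unsigned char *buf, size_t len)
{
    return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}
```

A full integrity test would of course also decompress the stream and verify the CRC32 trailer; the magic-byte check only catches files that were never gzip output at all.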
Re: Allow batched insert during cross-partition updates
‐‐‐ Original Message ‐‐‐
On Wednesday, March 10, 2021 1:23 PM, Georgios wrote:

> Hi,
>
> ‐‐‐ Original Message ‐‐‐
> On Thursday, February 18, 2021 10:54 AM, Amit Langote amitlangot...@gmail.com wrote:
>
> > On Thu, Feb 18, 2021 at 6:52 PM Amit Langote amitlangot...@gmail.com wrote:
> >
> > > Based on the discussion at:
> > > https://www.postgresql.org/message-id/6929d485-2d2a-da46-3681-4a400a3d794f%40enterprisedb.com
> > > I'm posting the patch for $subject here in this new thread and I'll
> > > add it to the next CF per Tomas' advice.
> >
> > Done: https://commitfest.postgresql.org/32/2992/
>
> apparently I did not receive the review comment I sent via the commitfest app.
> Apologies for the chatter. Find the message-id here:
> https://www.postgresql.org/message-id/161530518971.29967.9368488207318158252.pgcf%40coridan.postgresql.org

I continued looking a bit at the patch, yet I am either failing to see the fix or I am looking at the wrong thing. Please find attached a small repro of what my expectations were. As you can see in the repro, I would expect

    UPDATE local_root_remote_partitions SET a = 2;

to move the tuples to remote_partition_2 in the same transaction. However, this is not the case, with or without the patch. Is my expectation of this patch wrong?

Cheers,
//Georgios

> > Amit Langote
> > EDB: http://www.enterprisedb.com

repro.sql
Description: application/sql
Re: Allow batched insert during cross-partition updates
Hi,

‐‐‐ Original Message ‐‐‐
On Thursday, February 18, 2021 10:54 AM, Amit Langote wrote:

> On Thu, Feb 18, 2021 at 6:52 PM Amit Langote amitlangot...@gmail.com wrote:
>
> > Based on the discussion at:
> > https://www.postgresql.org/message-id/6929d485-2d2a-da46-3681-4a400a3d794f%40enterprisedb.com
> > I'm posting the patch for $subject here in this new thread and I'll
> > add it to the next CF per Tomas' advice.
>
> Done: https://commitfest.postgresql.org/32/2992/

apparently I did not receive the review comment I sent via the commitfest app. Apologies for the chatter. Find the message-id here:

> Amit Langote
> EDB: http://www.enterprisedb.com
Re: Allow batched insert during cross-partition updates
Hi,

Thanks for the patch. I had a first look and played around with the code. The code seems clean, complete, and does what it says on the tin. I will need a bit more time to acclimatise to all the use cases for a more thorough review.

A small question though: why expose PartitionTupleRouting rather than add a couple of functions to get the necessary info? If I have read the code correctly, the only members that actually need to be exposed are num_partitions and partitions. Not a critique, I am just curious.

Cheers,
//Georgios
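The accessor-function alternative raised in the question above looks roughly like this. This is a generic sketch with made-up names, not the executor's actual PartitionTupleRouting definition; in real code the struct body would stay private to one translation unit:

```c
#include <stddef.h>

/*
 * Generic sketch of the suggestion, with made-up names: export only the
 * two members callers need, via small accessor functions, instead of
 * exposing the whole struct definition in a header.
 */
typedef struct TupleRoutingState
{
    int     num_partitions;     /* number of partitions initialized */
    void  **partitions;         /* per-partition routing info */
} TupleRoutingState;

static int
routing_num_partitions(const TupleRoutingState *trs)
{
    return trs->num_partitions;
}

static void *
routing_partition(const TupleRoutingState *trs, int i)
{
    /* bounds-checked access; NULL for an out-of-range index */
    return (i >= 0 && i < trs->num_partitions) ? trs->partitions[i] : NULL;
}
```

The trade-off is the usual one: accessors keep the struct layout free to change, at the cost of a little boilerplate for each exposed member.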
Re: [PATCH] postgres-fdw: column option to override foreign types
The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Hi,

Thanks for the patch. I am afraid I will have to :-1: this patch. Of course it is possible that I am wrong, so please correct me if you, or any other reviewers, think so.

The problem that is intended to be solved, upon closer inspection, seems to be a conscious design decision rather than a problem. Even if I am wrong there, I am not certain that the proposed patch covers all the bases with respect to collations, built-in types, shippability etc. for simple expressions, or covers more complicated expressions at all.

As for the scenario which prompted the patch, you wrote, quote:

    The real scenario that prompted it is a tickets table with status, priority, category, etc. enums. We don't have plans to modify them any time soon, but if we do it's got to be coordinated and deployed across two databases, all so we can use what might as well be a text column in a single WHERE clause. Since foreign tables can be defined over subsets of columns, reordered, and names changed, a little opt-in flexibility with types too doesn't seem misplaced.

end quote.

I will add that creating a view on the remote server with the type flexibility that you wish, and then creating foreign tables against the view, might address your problem.

Respectfully,
//Georgios
Re: Shared memory size computation oversight?
‐‐‐ Original Message ‐‐‐
On Thursday, March 4, 2021 9:21 AM, Michael Paquier wrote:

> On Thu, Mar 04, 2021 at 03:23:33PM +0800, Julien Rouhaud wrote:
>
> > I was also considering adding new (add|mull)_*_size functions to avoid having
> > too messy code. I'm not terribly happy with xxx_shm_size(), as not all call to
> > those functions would require an alignment. Maybe (add|mull)shmemalign_size?
> > But before modifying dozens of calls, should we really fix those or only
> > increase a bit the "slop factor", or a mix of it?
> > For instance, I can see that BackendStatusShmemSize() never had
> > any padding consideration, while others do.
> > Maybe only fixing contribs, some macro like PredXactListDataSize that already
> > do a MAXALIGN, SimpleLruShmemSize and hash_estimate_size() would be a short
> > patch and should significantly improve the estimation.
>
> The lack of complaints in this area looks to me like a sign that we
> may not really need to backpatch something, so I would not be against
> a precise chirurgy, with a separate set of {add,mul}_size() routines
> that are used where adapted, so as it is easy to track down which size
> estimations expect an extra padding. I would be curious to hear more
> thoughts from others here.
>
> Saying that, calling a new routine something like add_shmem_align_size
> makes it clear what's its purpose.

My limited opinion on the matter, after spending some time yesterday going through the related code, is that with the current API it is easy to miss something and underestimate, or just be sloppy. If only from the readability point of view, adding the proposed add_shmem_align_size would be beneficial. I hold no opinion on backpatching.

//Georgios

> --
> Michael
Re: Shared memory size computation oversight?
Hi,

‐‐‐ Original Message ‐‐‐
On Saturday, February 27, 2021 9:08 AM, Julien Rouhaud wrote:

> Hi,
>
> While investigating on some strange "out of shared memory" error reported on
> the french BBS [1], I noticed that 09adc9a8c09 (adding Robert in Cc) changed
> ShmemAlloc alignment to CACHELINEALIGN but didn't update any related code that
> depended on it.

Nice catch.

> Most of the core code isn't impacted as it doesn't have to reserve any shared
> memory, but AFAICT pg_prewarm and pg_stat_statements can now slightly
> underestimate the amount of shared memory they'll use, and similarly for any
> third party extension that still rely on MAXALIGN alignment. As a consequence
> those extensions can consume a few hundred bytes more than they reserved, which
> probably will be borrowed from the lock hashtables' reserved size that isn't
> alloced immediately. This can later lead to ShmemAlloc failing when trying to
> acquire a lock while the hash table is almost full but should still have enough
> room to store it, which could explain the error reported.
>
> PFA a patch that fixes pg_prewarm and pg_stat_statements explicit alignment to
> CACHELINEALIGN, and also update the alignment in hash_estimate_size() to what I
> think ShmemInitHash will actually consume.

Please excuse me, as I speak mostly from the side of ignorance. I am slightly curious to understand something in your patch, if you can be kind enough to explain it to me.

The commit 09adc9a8c09 you pointed to, as far as I can see, changed the total size of the shared memory, not individual bits. It did explain that the users of that space had properly padded their data, but even assuming that they did not do that as asked, the result would (or rather should) be cache misses, not running out of reserved memory.

My limited understanding is also based on a comment in CreateSharedMemoryAndSemaphores():

    /*
     * Size of the Postgres shared-memory block is estimated via
     * moderately-accurate estimates for the big hogs, plus 100K for the
     * stuff that's too small to bother with estimating.
     *
     * We take some care during this phase to ensure that the total size
     * request doesn't overflow size_t.  If this gets through, we don't
     * need to be so careful during the actual allocation phase.
     */
    size = 100000;

Of course, my argument falls flat if the size estimates of each of the elements are rather underestimated. Indeed this is the argument in the present patch, expressed in code most prominently in hash_estimate_size, here:

        size = add_size(size,
-                       mul_size(nElementAllocs,
-                                mul_size(elementAllocCnt, elementSize)));
+                       CACHELINEALIGN(Nmul_size(nElementAllocs,
+                                mul_size(elementAllocCnt, elementSize))));

(small note: Nmul_size is a typo of mul_size, but that is minor; by amending it the code compiles).

To conclude, the running out of shared memory seems to me to be fixed rather vaguely with this patch. I am not claiming that increasing the memory used by the elements is not needed; I am claiming that I cannot clearly see how that specific increase is needed.

Thank you for your patience,
//Georgios

--
https://www.vmware.com

> [1] https://forums.postgresql.fr/viewtopic.php?pid=32138#p32138 sorry, it's all
> in French.
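The underestimation mechanism discussed in this thread can be illustrated with simplified stand-ins for the alignment macros. The real TYPEALIGN/MAXALIGN/CACHELINEALIGN live in src/include/c.h; 8 and 128 below are merely typical values of MAXIMUM_ALIGNOF and the cache line size, not guaranteed ones:

```c
#include <stddef.h>

/* Simplified stand-ins for the macros in src/include/c.h. */
#define TYPEALIGN(ALIGNVAL, LEN) \
    (((size_t) (LEN) + ((ALIGNVAL) - 1)) & ~((size_t) ((ALIGNVAL) - 1)))
#define MAXALIGN(LEN)        TYPEALIGN(8, LEN)
#define CACHELINEALIGN(LEN)  TYPEALIGN(128, LEN)

/*
 * If an extension estimates n chunks with MAXALIGN, but ShmemAlloc pads
 * each allocation to CACHELINEALIGN, each chunk silently consumes the
 * difference below out of the 100K slop.
 */
static size_t
estimate_shortfall(size_t chunksz, int nchunks)
{
    return (size_t) nchunks * (CACHELINEALIGN(chunksz) - MAXALIGN(chunksz));
}
```

For example, three 100-byte chunks estimate to 3 * 104 = 312 bytes under MAXALIGN but actually consume 3 * 128 = 384 bytes, a 72-byte shortfall per extension that grows with the number of allocations.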
Re: GROUP BY DISTINCT
‐‐‐ Original Message ‐‐‐
On Tuesday, March 2, 2021 5:51 PM, Vik Fearing wrote:

> On 3/2/21 4:06 PM, Georgios Kokolatos wrote:
>
> > As a minor gripe, I would note the addition of list_int_cmp. The block
> >
> > +   /* Sort each groupset individually */
> > +   foreach(cell, result)
> > +       list_sort(lfirst(cell), list_int_cmp);
> >
> > can follow suit from the rest of the code, and define a static
> > cmp_list_int_asc(), as indeed the same patch does for
> > cmp_list_len_contents_asc. This is indeed a point on which I will not
> > hold a too strong opinion.
>
> I did it this way because list_int_cmp is a general purpose function for
> int lists that can be reused elsewhere in the future. Whereas
> cmp_list_len_contents_asc is very specific to this case so I kept it local.

Of course. I got the intention and I have noted my opinion.

> I'm happy to change it around if that's what consensus wants.

As before, I will not hold a too strong opinion.

> > Overall :+1: from me.
>
> Thanks for looking at it!
>
> > I will be bumping to 'Ready for Committer' unless objections.
>
> In that case, here is another patch that fixes a typo in the docs
> mentioned privately to me by Erik. The typo (and a gratuitous rebase)
> is the only change to what you just reviewed.

Thank you. The typo was indistinguishable to me too. My :+1: stands for version 3 of the patch. Updating the status in the commitfest to reflect that.

//Georgios

--
https://www.vmware.com

> Vik Fearing
Re: GROUP BY DISTINCT
The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Hi,

This is a useful feature, thank you for implementing it. I gather that it follows the standard; if so, then there are definitely no objections from me.

The patch in version 2 applies cleanly and passes all the tests. It contains documentation which seems correct to a non-native speaker.

As a minor gripe, I would note the addition of list_int_cmp. The block

+   /* Sort each groupset individually */
+   foreach(cell, result)
+       list_sort(lfirst(cell), list_int_cmp);

can follow suit from the rest of the code, and define a static cmp_list_int_asc(), as indeed the same patch does for cmp_list_len_contents_asc. This is indeed a point on which I will not hold a too strong opinion.

Overall :+1: from me. I will be bumping to 'Ready for Committer' unless there are objections.
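For readers unfamiliar with the helper under discussion, an ascending int comparator of the kind list_int_cmp provides looks like this when written against plain arrays. This is a standalone sketch analogous in spirit only; pg_list.h's ListCell machinery is omitted:

```c
#include <stdlib.h>

/*
 * Standalone sketch of an ascending int comparator, usable with qsort()
 * on plain int arrays.  The subtraction-free form avoids the signed
 * overflow that a naive "return a - b" can hit for extreme values.
 */
static int
cmp_int_asc(const void *a, const void *b)
{
    int av = *(const int *) a;
    int bv = *(const int *) b;

    return (av > bv) - (av < bv);
}
```

The naming question in the review is exactly about such a function: whether it should carry a general name (reusable anywhere) or a local, case-specific one.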
Re: Error on failed COMMIT
Hi, this patch fails on the cfbot yet it has received an update during the current CF. I will move it to the next CF and mark it there as Waiting on Author. Cheers, Georgios The new status of this patch is: Needs review
Re: Display individual query in pg_stat_activity
This patch has been failing in the cfbot for quite some time now. I shall close it as Returned with Feedback and not move it to the next CF. Please feel free to register an updated version afresh for the next CF. Cheers, //Georgios
Re: Supporting = operator in gin/gist_trgm_ops
‐‐‐ Original Message ‐‐‐
On Friday, November 13, 2020 10:50 AM, Julien Rouhaud wrote:

> On Wed, Nov 11, 2020 at 8:34 PM Georgios Kokolatos
> gkokola...@protonmail.com wrote:
>
> > The following review has been posted through the commitfest application:
> > make installcheck-world: tested, passed
> > Implements feature: tested, passed
> > Spec compliant: not tested
> > Documentation: not tested
> >
> > Hi,
> >
> > this patch implements a useful and missing feature. Thank you.
> > It includes documentation, which to a non-native speaker such as myself
> > seems appropriate.
> > It includes comprehensive tests that cover the implemented cases.
> >
> > In the thread Alexander has pointed out, quote:
> > "It would be more efficient to generate trigrams for equal operator
> > using generate_trgm() instead of generate_wildcard_trgm()"
> > I will echo the sentiment, though from a slightly different and possibly not
> > as important point of view. The method used to extract trigrams from the query
> > should match the method used to extract trigrams from the values when they
> > get added to the index. This is gin_extract_value_trgm() and is indeed using
> > generate_trgm().
> >
> > I have no opinion over Alexander's second comment regarding costing.
> > I change the status to 'Waiting on Author', but please feel free to override
> > my opinion if you feel I am wrong and reset it to 'Needs review'.
>
> Thanks for the reminder Georgios! Thanks a lot Alexander for the review!
>
> Indeed, I should have used generate_trgm() rather than
> generate_wildcard_trgm(). IIUC, the rest of the code should still be
> doing the same as [I]LikeStrategyNumber. I attach a v3 with that
> modification.

Great, thanks! v3 looks good.
> For the costing, I tried this naive dataset:
>
> CREATE TABLE t1 AS select md5(random()::text) AS val from
> generate_series(1, 10);
> CREATE INDEX t1_btree ON t1 (val);
> CREATE INDEX t1_gist ON t1 USING gist (val gist_trgm_ops);
>
> Costs are like this (all default configuration, using any random existing entry):
>
> EXPLAIN ANALYZE SELECT * FROM t1 where val = '8dcf324ce38428e4d27a363953ac1c51';
>                                  QUERY PLAN
> ---------------------------------------------------------------------
>  Index Only Scan using t1_btree on t1  (cost=0.42..4.44 rows=1
>    width=33) (actual time=0.192..0.194 rows=1 loops=1)
>    Index Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text)
>    Heap Fetches: 0
>  Planning Time: 0.133 ms
>  Execution Time: 0.222 ms
> (5 rows)
>
> EXPLAIN ANALYZE SELECT * FROM t1 where val = '8dcf324ce38428e4d27a363953ac1c51';
>                                  QUERY PLAN
> ---------------------------------------------------------------------
>  Index Scan using t1_gist on t1  (cost=0.28..8.30 rows=1 width=33)
>    (actual time=0.542..2.359 rows=1 loops=1)
>    Index Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text)
>  Planning Time: 0.189 ms
>  Execution Time: 2.382 ms
> (4 rows)
>
> EXPLAIN ANALYZE SELECT * FROM t1 where val = '8dcf324ce38428e4d27a363953ac1c51';
>                                  QUERY PLAN
> ---------------------------------------------------------------------
>  Bitmap Heap Scan on t1  (cost=400.01..404.02 rows=1 width=33) (actual
>    time=2.486..2.488 rows=1 loops=1)
>    Recheck Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text)
>    Heap Blocks: exact=1
>    ->  Bitmap Index Scan on t1_gin  (cost=0.00..400.01 rows=1 width=0)
>        (actual time=2.474..2.474 rows=1 loops=1)
>          Index Cond: (val = '8dcf324ce38428e4d27a363953ac1c51'::text)
>  Planning Time: 0.206 ms
>  Execution Time: 2.611 ms
>
> So assuming that this dataset is representative enough, costing indeed
> prefers a btree index over a gist/gin index, which should avoid
> regression with this change.

It sounds reasonable, although I would leave it to a bit more cost-savvy people to argue the point.
> Gin is however quite off, likely because it's a bitmap index scan
> rather than an index scan, so gist is preferred in this scenario.
> That's not ideal, but I'm not sure that there are many people having
> both gin_trgm_ops and gist_trgm_ops.

Same as above.

In short, I think v3 of the patch looks good enough to change to 'RFC' status. Given the possible costing concerns, I will refrain from changing the status just yet, to give the opportunity to more reviewers to chime in. If in the next few days there are no more reviews, I will update the status to 'RFC' to move the patch forward. Thoughts?

Cheers,
//Georgios
Re: Add session statistics to pg_stat_database
‐‐‐ Original Message ‐‐‐
On Thursday, November 12, 2020 9:31 AM, Laurenz Albe wrote:

> I wrote:
>
> > On Tue, 2020-11-10 at 15:03 +0000, Georgios Kokolatos wrote:
> >
> > > I noticed that the cfbot fails for this patch.
> > > For this, I am setting the status to: 'Waiting on Author'.
> >
> > Thanks for noticing, it was only the documentation build.
> > Version 5 attached, status changed back to "waiting for review".
>
> The patch is still failing, so I looked again:
>
> [documentation build log: xmllint fails to load the DocBook DTD from
> http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd, reporting
> "Unknown IO error" and validity errors in postgres.sgml]
>
> I have the impression that this is not the fault of my patch, something
> seems to be wrong with the cfbot.
>
> I see that other patches are failing with the same error.

You are indeed correct. Unfortunately, the cfbot is a bit unstable due to some issues related to the documentation build. I alerted a contributor and he was quick to try to address the issue in pgsql-www [1]. Thank you very much for looking, and apologies for the chatter.

> Yours,
> Laurenz Albe

[1] https://www.postgresql.org/message-id/E2EE6B76-2D96-408A-B961-CAE47D1A86F0%40yesql.se
Re: Supporting = operator in gin/gist_trgm_ops
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

Hi,

This patch implements a useful and missing feature. Thank you.
It includes documentation, which to a non-native speaker such as myself seems appropriate.
It includes comprehensive tests that cover the implemented cases.

In the thread, Alexander has pointed out, quote:
"It would be more efficient to generate trigrams for equal operator using generate_trgm() instead of generate_wildcard_trgm()"

I will echo the sentiment, though from a slightly different and possibly not as important point of view. The method used to extract trigrams from the query should match the method used to extract trigrams from the values when they get added to the index. That is gin_extract_value_trgm(), and it is indeed using generate_trgm().

I have no opinion on Alexander's second comment regarding costing.

I change the status to 'Waiting on Author', but please feel free to override my opinion if you feel I am wrong and reset it to 'Needs review'.

Cheers,
//Georgios

The new status of this patch is: Waiting on Author
Re: SQL-standard function body
Hi, I noticed that this patch fails on the cfbot. For this, I changed the status to: 'Waiting on Author'. Cheers, //Georgios The new status of this patch is: Waiting on Author
Re: Error on failed COMMIT
Hi, I noticed that this patch fails on the cfbot. For this, I changed the status to: 'Waiting on Author'. Cheers, //Georgios The new status of this patch is: Waiting on Author
Re: WIP: System Versioned Temporal Table
Hi, just a quick comment that this patch fails on the cfbot. Cheers, //Georgios
Re: Allow an alias to be attached directly to a JOIN ... USING
Hi, I noticed that this patch fails on the cfbot. For this, I changed the status to: 'Waiting on Author'. Cheers, //Georgios The new status of this patch is: Waiting on Author
Re: [PATCH] Add extra statistics to explain for Nested Loop
Hi, I noticed that this patch fails on the cfbot. For this, I changed the status to: 'Waiting on Author'. Cheers, //Georgios The new status of this patch is: Waiting on Author
Re: Get memory contexts of an arbitrary backend process
Hi, I noticed that this patch fails on the cfbot. For this, I changed the status to: 'Waiting on Author'. Cheers, //Georgios The new status of this patch is: Waiting on Author
Re: Display individual query in pg_stat_activity
Hi, I noticed that this patch is failing on the cfbot. For this, I changed the status to: 'Waiting on Author' Cheers, Georgios The new status of this patch is: Waiting on Author
Re: Add session statistics to pg_stat_database
Hi, I noticed that the cfbot fails for this patch. For this, I am setting the status to: 'Waiting on Author'. Cheers, //Georgios The new status of this patch is: Waiting on Author
Re: Strange behavior with polygon and NaN
Hi,

Apologies for the very, very late reply to your fixes. You have answered/addressed all my questions and concerns. The added documentation reads well, at least to a non-native English speaker. The patch still applies and, as far as I can see, the tests are passing. It gets my :+1: and I am changing the status to "Ready for Committer". For what little it is worth, I learned a lot from this patch, thank you.

Cheers,
Georgios

The new status of this patch is: Ready for Committer
Re: Error on failed COMMIT
Hi, thank you for your contribution. I did notice that the cfbot [1] is failing for this patch. Please try to address the issue for the upcoming commitfest. Cheers, //Georgios [1] http://cfbot.cputube.org/dave-cramer.html
Re: pg_upgrade analyze script
Hi, thank you for your contribution. I did notice that the cfbot [1] is failing for this patch. Please try to address the issues for the upcoming commitfest. Cheers, //Georgios [1] http://cfbot.cputube.org/magnus-hagander.html
Re: Extending range type operators to cope with elements
Hi, thank you for your contribution. I did notice that the cfbot [1] is failing for this patch. Please try to address the issues if you can for the upcoming commitfest. Cheers, //Georgios [1] http://cfbot.cputube.org/esteban-zimanyi.html
Re: Consistent error reporting for encryption/decryption in pgcrypto
Hi, thank you for your contribution. I did notice that the cfbot [1] is not failing for this patch. Cheers, //Georgios [1] http://cfbot.cputube.org/daniel-gustafsson.html
Re: shared-memory based stats collector
Hi, I noticed that according to the cfbot this patch no longer applies. As it is registered in the upcoming commitfest, it would be appreciated if you could rebase it. Cheers, //Georgios
Re: New default role- 'pg_read_all_data'
Hi, this patch is in "Ready for committer" state and the Cfbot is happy. Is there any committer that is available for taking a look at it? Cheers, //Georgios - CFM 2020-11
Re: v13: show extended stats target in \d
‐‐‐ Original Message ‐‐‐ On Thursday, 10 September 2020 00:36, Justin Pryzby wrote: > On Tue, Sep 01, 2020 at 12:35:19PM +0000, Georgios Kokolatos wrote: > > > I will humbly disagree with the current review. I shall refrain from > > changing the status though because I do not have very strong feeling about > > it. > > Sorry but this was in my spam, and didn't see until now. No worries at all. Thank you for replying. > > > However the patch contains: > > > > - " 'm' = > > any(stxkind) AS mcv_enabled\\n" > > > > > > > > - " 'm' = > > any(stxkind) AS mcv_enabled,\\n" > > > > > > - " %s" > > "FROM > > pg_catalog.pg_statistic_ext stat " > > "WHERE stxrelid > > = '%s'\\n" > > "ORDER BY 1;", > > > > > > - (pset.sversion > > >= 13 ? "stxstattarget\\n" : "-1\\n"), > > oid); > > > > > > > > This seems to be breaking a bit the pattern in describeOneTableDetails(). > > First, it is customary to write the whole query for the version in its own > > block. I do find this pattern rather helpful and clean. So in my humble > > opinion, the ternary expression should be replaced with a distinct if block > > that would implement stxstattarget. Second, setting the value to -1 as an > > indication of absence breaks the pattern a bit further. More on that bellow. > > Hm, I did like this using the "hastriggers" code as a template. But you're > right that everywhere else does it differently. Thank you for taking my input. > > > - if (strcmp(PQgetvalue(result, i, > > 8), "-1") != 0) > > > > > > - appendPQExpBuffer(, " > > STATISTICS %s", > > > > > > - > > PQgetvalue(result, i, 8)); > > > > > > - > > > > In the same function, one will find a bit of a different pattern for > > printing the statistics, e.g. > > if (strcmp(PQgetvalue(result, i, 7), "t") == 0) > > I will again hold the opinion that if the query gets crafted a bit > > differently, one can inspect if the table has `stxstattarget` and then, go > > ahead and print the value. 
> > Something in the lines of: > > if (strcmp(PQgetvalue(result, i, 8), "t") == 0) > > appendPQExpBuffer(, " STATISTICS %s", PQgetvalue(result, i, 9)); > > I think what you've written wouldn't give the desired behavior, since it would > show the stats target even when it's set to the default. I don't see the point > of having additional, separate, version-specific boolean columns for 1) column > exists; 2) column is not default, in addition to 3) column value. But I added > comment about what Peter and I agree is desirable, anyway. Fair enough. As I said above, I do not have a very strong feeling, so it gets my +1 if it is worth anything. > > > Finally, the patch might be more complete if a note is added in the > > documentation. > > Have you considered adding something in doc/src/sgml/ref/psql-ref.sgml? If > > no, will you consider it? If yes, why did you discard it? > > I don't think the details of psql output are currently documented. This shows > nothing about column statistics target, nor about MV stats at all. > https://www.postgresql.org/docs/13/app-psql.html Yeah, I have noticed that. Hence my question. If anything maybe the documentation can be expanded to cover these cases in a different patch. Thank you for your answer. > > As for the discussion about a separator, I think maybe a comma is enough. I > doubt anyone is going to think that you can get a valid command by prefixing > this by "CREATE STATISTICS". Actually, it didn't even occur to me it was valid > command without the stats target - after all, that's not true of indexes. > > - "public"."ab1_a_b_stats" (ndistinct, dependencies, mcv) ON a, b FROM ab1, > STATISTICS 0 > > This revision only shows the stats target in verbose mode (slash dee > plus). > > -- > Justin >
Re: v13: show extended stats target in \d
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested Hi, I will humbly disagree with the current review. I shall refrain from changing the status though, because I do not have a very strong feeling about it. I am in agreement that it is a helpful feature and, as implemented, the result seems to be the desired one. However the patch contains: - " 'm' = any(stxkind) AS mcv_enabled\n" + " 'm' = any(stxkind) AS mcv_enabled,\n" + " %s" "FROM pg_catalog.pg_statistic_ext stat " "WHERE stxrelid = '%s'\n" "ORDER BY 1;", + (pset.sversion >= 13 ? "stxstattarget\n" : "-1\n"), oid); This seems to break the pattern in describeOneTableDetails() a bit. First, it is customary to write the whole query for the version in its own block. I do find this pattern rather helpful and clean. So in my humble opinion, the ternary expression should be replaced with a distinct if block that would implement stxstattarget. Second, setting the value to -1 as an indication of absence breaks the pattern a bit further. More on that below. + if (strcmp(PQgetvalue(result, i, 8), "-1") != 0) + appendPQExpBuffer(, " STATISTICS %s", + PQgetvalue(result, i, 8)); + In the same function, one will find a bit of a different pattern for printing the statistics, e.g. if (strcmp(PQgetvalue(result, i, 7), "t") == 0) I will again hold the opinion that if the query gets crafted a bit differently, one can inspect whether the table has `stxstattarget` and then go ahead and print the value. Something along the lines of: if (strcmp(PQgetvalue(result, i, 8), "t") == 0) appendPQExpBuffer(, " STATISTICS %s", PQgetvalue(result, i, 9)); Finally, the patch might be more complete if a note is added in the documentation. Have you considered adding something in doc/src/sgml/ref/psql-ref.sgml? If no, will you consider it? If yes, why did you discard it? Regards, Georgios
Re: New default role- 'pg_read_all_data'
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: not tested Spec compliant: not tested Documentation: not tested Version 2 of the patch implements a useful feature. Based on the mailing list discussion, it is also a feature that the community desires. The code seems to be correct and it follows the style. The patch comes complete with tests and documentation. As a non-native English speaker, I did not notice any syntactical or grammatical errors in the documentation, though that may not mean a lot. As far as I am concerned, this version of the patch is ready for a committer. Please feel free to contest my review if you think I am wrong. The new status of this patch is: Ready for Committer
Re: Include access method in listTables output
‐‐‐ Original Message ‐‐‐ On Tuesday, 1 September 2020 07:41, Michael Paquier wrote: > On Thu, Aug 20, 2020 at 08:16:19AM +0000, Georgios wrote: > > > Please find version 7 attached which hopefully addresses the error along > > with a proper > > expansion of the test coverage and removal of recently introduced > > whitespace warnings. > > +CREATE ROLE conditional_tableam_display_role; > As a convention, regression tests need to have roles prefixed with > "regress_" or this would cause some buildfarm members to turn red. > Please see -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS (you could use > that in your environment for example). I was wondering about the name. I hoped that it would either come up during review, or that it would be fine. Thank you for the detailed explanation. > > So, as of the tests.. The role gets added to make sure that when > using \d+ on the full schema as well as the various \d*+ variants we > have a consistent owner. The addition of the relation size for the > sequence and the btree index in the output generated is a problem > though, because that's not really portable when compiling with other > page sizes. It is true that there are other tests failing in this > case, but I think that we should try to limit that if we can. In > short, I agree that having some tests is better than nothing, but I > would suggest to reduce their scope, as per the attached. I could not agree more. I have only succumbed to the pressure of reviewing. > > Adding \dE as there are no foreign tables does not make much sense, > and also I wondered why \dt+ was not added. > > Does the attached look correct to you? You have my :+1: > > --- > > Michael
Re: [PATCH] - Provide robust alternatives for replace_string
‐‐‐ Original Message ‐‐‐ On Wednesday, 19 August 2020 11:07, Georgios wrote: > > > ‐‐‐ Original Message ‐‐‐ > On Friday, 7 August 2020 09:02, Asim Praveen pa...@vmware.com wrote: > > > > On 05-Aug-2020, at 7:01 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: > > > On 2020-Aug-05, Asim Praveen wrote: > > > > > > > Please find attached a StringInfo based solution to this problem. It > > > > uses fgetln instead of fgets such that a line is read in full, without > > > > ever splitting it. > > > > > > never heard of fgetln, my system doesn't have a manpage for it, and we > > > don't use it anywhere AFAICS. Are you planning to add something to > > > src/common for it? > > > > Indeed! I noticed fgetln on the man page of fgets and used it without > > checking. And this happened on a MacOS system. > > Please find a revised version that uses fgetc instead. > > Although not an issue in the current branch, fgetc might become a bit slow > in large files. Please find v3 which simply continues reading the line if > fgets fills the buffer and there is still data to read. > > Also this version, implements Alvaro's suggestion to break API compatibility. > > To that extent, ecpg regress has been slightly modified to use the new version > of replace_string() where needed, or remove it all together where possible. I noticed that the cfbot [1] was unhappy with the raw use of __attribute__ on windows builds. In retrospect it is rather obvious it would complain. Please find v4 attached. //Georgios > > //Georgios > > > Asim [1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.105985 v4-0001-Use-a-stringInfo-instead-of-a-char-for-replace_st.patch Description: Binary data
Re: New default role- 'pg_read_all_data'
Thank you for the patch. My high-level review comment: The patch seems to implement a useful and requested feature. The patch applies cleanly and passes the basic regress tests. Also the commitfest bot is happy. A first pass over the code has not revealed anything worth commenting on. Please allow me a second, more thorough pass. The commitfest has hardly started after all. Also allow me a series of genuine questions: What would the behaviour be with REVOKE? In a sequence similar to: GRANT ALL ON ... REVOKE pg_read_all_data FROM ... What privileges would the user be left with? Would it be possible to end up with the same privileges using only a GRANT command? Does the above scenario even make sense? Regards,
Re: Include access method in listTables output
‐‐‐ Original Message ‐‐‐ On Thursday, 20 August 2020 07:31, Justin Pryzby wrote: > On Mon, Aug 17, 2020 at 11:10:05PM +0530, vignesh C wrote: > > > On Sat, Aug 1, 2020 at 8:12 AM vignesh C vignes...@gmail.com wrote: > > > > > > > +-- access method column should not be displayed for sequences > > > > > +\ds+ > > > > > > > > > > -List of relations > > > > > > > > > > > > > > > - Schema | Name | Type | Owner | Persistence | Size | Description > > > > > ++--+--+---+-+--+- > > > > > +(0 rows) > > > > > We can include one test for view. > > > > > > > > > I felt adding one test for view is good and added it. > > Attached a new patch for the same. > > I felt patch is in shape for committer to have a look at this. > > The patch tester shows it's failing xmllint ; could you send a fixed patch ? > > /usr/bin/xmllint --path . --noout --valid postgres.sgml > ref/psql-ref.sgml:1189: parser error : Opening and ending tag mismatch: link > line 1187 and para > > http://cfbot.cputube.org/georgios-kokolatos.html Thank you for pointing it out! Please find version 7 attached which hopefully addresses the error along with a proper expansion of the test coverage and removal of recently introduced whitespace warnings. > > -- > > Justin 0001-Include-access-method-in-listTables-output-v7.patch Description: Binary data
Re: [PATCH] - Provide robust alternatives for replace_string
‐‐‐ Original Message ‐‐‐ On Friday, 7 August 2020 09:02, Asim Praveen wrote: > > On 05-Aug-2020, at 7:01 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: > > On 2020-Aug-05, Asim Praveen wrote: > > > > > Please find attached a StringInfo based solution to this problem. It > > > uses fgetln instead of fgets such that a line is read in full, without > > > ever splitting it. > > > > never heard of fgetln, my system doesn't have a manpage for it, and we > > don't use it anywhere AFAICS. Are you planning to add something to > > src/common for it? > > Indeed! I noticed fgetln on the man page of fgets and used it without > checking. And this happened on a MacOS system. > > Please find a revised version that uses fgetc instead. Although not an issue in the current branch, fgetc might become a bit slow on large files. Please find v3, which simply continues reading the line if fgets fills the buffer and there is still data to read. Also, this version implements Alvaro's suggestion to break API compatibility. To that extent, the ecpg regress suite has been slightly modified to use the new version of replace_string() where needed, or to remove it altogether where possible. //Georgios > > Asim v3-0001-Use-a-stringInfo-instead-of-a-char-for-replace_st.patch Description: Binary data
[PATCH] - Provide robust alternatives for replace_string
Hi, In our testing framework, backed by pg_regress, there exists the ability to use special strings that can be replaced by environment-based ones. Such an example is '@testtablespace@'. The function used for this replacement is replace_string, which replaces these occurrences in the original line, in place. It is documented that the original line buffer should be large enough to accommodate the replacements. However, it is rather possible and easy for subtle errors to occur, especially if there are multiple occurrences to be replaced in long enough lines. Please find two distinct versions of a possible solution. One, which is preferred, is using StringInfo, though it requires stringinfo.h to be included in pg_regress.c. The other patch is more basic and avoids including stringinfo.h. As a reminder, stringinfo became available in the frontend in commit 26aaf97b683d. Because the original replace_string() is exposed to other users, it is currently left intact. Also, if required, an error can be raised in the original function in cases where the string is not long enough to accommodate the replacements. Worthwhile to mention that currently there are no such issues present in the test suites. It should not hurt to do a bit better though. //Asim and Georgios 0001-Use-stringInfo-instead-of-char-in-replace_string.patch Description: Binary data 0001-Heap-allocated-string-version-of-replace_string.patch Description: Binary data
Re: Include access method in listTables output
‐‐‐ Original Message ‐‐‐ On Saturday, July 25, 2020 2:41 AM, vignesh C wrote: > On Thu, Jul 16, 2020 at 7:35 PM Georgios gkokola...@protonmail.com wrote: > > > ‐‐‐ Original Message ‐‐‐ > > On Saturday, July 11, 2020 3:16 PM, vignesh C vignes...@gmail.com wrote: > > > > > On Mon, Jul 6, 2020 at 1:24 PM Georgios gkokola...@protonmail.com wrote: > > > > > > > ‐‐‐ Original Message ‐‐‐ > > > > On Monday, July 6, 2020 3:12 AM, Michael Paquier mich...@paquier.xyz > > > > wrote: > > > > > > > > > On Sun, Jul 05, 2020 at 07:13:10AM +0530, vignesh C wrote: > > > > > > > > > > > I'm not sure if we should include showViews, I had seen that the > > > > > > access method was not getting selected for view. > > > > > > > > > > +1. These have no physical storage, so you are looking here for > > > > > relkinds that satisfy RELKIND_HAS_STORAGE(). > > > > > > > > Thank you for the review. > > > > Find attached v4 of the patch. > > > > > > Thanks for fixing the comments. > > > Patch applies cleanly, make check & make check-world passes. > > > I was not sure if any documentation change is required or not for this > > > in doc/src/sgml/ref/psql-ref.sgml. Thoughts? > > > > Thank you for the comment. I added a line in docs. Admittedly, might need > > tweaking. > > Please find version 5 of the patch attached. > > Most changes looks fine, minor comments: Thank you for your comments. Please find the individual answers inline for completeness. I'm having trouble understanding where you are going with these reviews; can you fully describe what it is that you wish to be done?
> > \pset tuples_only false > -- check conditional tableam display > --- Create a heap2 table am handler with heapam handler > +\pset expanded off > +CREATE SCHEMA conditional_tableam_display; > +CREATE ROLE conditional_tableam_display_role; > +ALTER SCHEMA conditional_tableam_display OWNER TO > conditional_tableam_display_role; > +SET search_path TO conditional_tableam_display; > CREATE ACCESS METHOD heap_psql TYPE TABLE HANDLER heap_tableam_handler; > > This comment might have been removed unintentionally, do you want to > add it back? It was intentional, as the heap2 table am handler is no longer getting created. With the code additions the comment did seem out of place and was thus removed. > > +-- access method column should not be displayed for sequences > +\ds+ > > - List of relations > > > - Schema | Name | Type | Owner | Persistence | Size | Description > ++--+--+---+-+--+- > +(0 rows) > > We can include one test for view. The list of cases in the test, for both including and excluding storage, is by no means exhaustive. I thought that this was acceptable. Adding a test for the view will still not make the test exhaustive. Are you sure you suggest only that the view be included, or would you suggest an exhaustive list of tests? > > - if (pset.sversion >= 12 && !pset.hide_tableam && > > - (showTables || showMatViews || showIndexes)) > > - appendPQExpBuffer(, > > - ",\n am.amname as \"%s\"", > > - gettext_noop("Access Method")); > > - /* > - As of PostgreSQL 9.0, use pg_table_size() to show a more accurate > - size of a table, including FSM, VM and TOAST tables. 
> @@ -3772,6 +3778,12 @@ listTables(const char *tabtypes, const char > *pattern, bool verbose, bool showSys > appendPQExpBufferStr(, > "\nFROM pg_catalog.pg_class c" > "\n LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace"); > > - > - if (pset.sversion >= 12 && !pset.hide_tableam && > > - (showTables || showMatViews || showIndexes)) > > If conditions in both the places are the same, do you want to make it a > macro? No, I would rather not, if that is all right. I think that a macro would not add to the readability; rather, it would detract from it. Those are two simple conditionals used in the same function, very close to each other. > > Regards, > Vignesh > EnterpriseDB: http://www.enterprisedb.com >
Re: Using Valgrind to detect faulty buffer accesses (no pin or buffer content lock held)
‐‐‐ Original Message ‐‐‐ On Tuesday, July 21, 2020 11:52 PM, Peter Geoghegan wrote: > On Mon, Jul 6, 2020 at 1:35 AM Georgios Kokolatos > gkokola...@protonmail.com wrote: > > > As a general overview, the series of patches in the mail thread do match > > their description. The addition of the stricter, explicit use of > > instrumentation does improve the design as the distinction of the use cases > > requiring a pin or a lock is made more clear. The added commentary is > > descriptive and appears grammatically correct, at least to a non native > > speaker. > > I didn't see this review until now because it ended up in gmail's spam > folder. :-( > > Thanks for taking a look at it! No worries at all. It happens and it was beneficial for me to read the patch. //Georgios > > -- > > Peter Geoghegan
Re: [PATCH] Finally split StdRdOptions into HeapOptions and ToastOptions
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested Hi, thank you for the patch. It applies cleanly, compiles and passes check, check-world. I feel that, as per the discussion, this is a step in the right direction, yet it does not go far enough. From experience, I can confirm that dealing with reloptions in a new table AM is somewhat of a pain. Ultimately, reloptions should be handled by the table AM specific code. The current patch does not address the issue. Yet it does make the issue easier to address by clearing up the current state. If you allow me, I have a couple of comments. - saveFreeSpace = RelationGetTargetPageFreeSpace(relation, - HEAP_DEFAULT_FILLFACTOR); + if (IsToastRelation(relation)) + saveFreeSpace = ToastGetTargetPageFreeSpace(); + else + saveFreeSpace = HeapGetTargetPageFreeSpace(relation); For balance, it does make some sense for ToastGetTargetPageFreeSpace() to take the relation as an argument, similarly to HeapGetTargetPageFreeSpace(). Also, this pattern is repeated in four places; maybe the branch can be moved into a macro or static inline function instead? - /* Retrieve the parallel_workers reloption, or -1 if not set. */ - rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1); + /* +* Retrieve the parallel_workers for heap and mat.view relations. +* Use -1 if not set, or if we are dealing with other relation kinds +*/ + if (relation->rd_rel->relkind == RELKIND_RELATION || + relation->rd_rel->relkind == RELKIND_MATVIEW) + rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1); + else + rel->rel_parallel_workers = -1; If the comment above is agreed upon, then it makes sense to apply the same here. The expression in the branch is already asserted for in the macro, so why not switch there and remove the responsibility from the caller? Any thoughts on the above? 
Cheers, Georgios The new status of this patch is: Waiting on Author
Re: Include access method in listTables output
‐‐‐ Original Message ‐‐‐ On Saturday, July 11, 2020 3:16 PM, vignesh C wrote: > On Mon, Jul 6, 2020 at 1:24 PM Georgios gkokola...@protonmail.com wrote: > > > ‐‐‐ Original Message ‐‐‐ > > On Monday, July 6, 2020 3:12 AM, Michael Paquier mich...@paquier.xyz wrote: > > > > > On Sun, Jul 05, 2020 at 07:13:10AM +0530, vignesh C wrote: > > > > > > > I'm not sure if we should include showViews, I had seen that the > > > > access method was not getting selected for view. > > > > > > +1. These have no physical storage, so you are looking here for > > > relkinds that satisfy RELKIND_HAS_STORAGE(). > > > > Thank you for the review. > > Find attached v4 of the patch. > > Thanks for fixing the comments. > Patch applies cleanly, make check & make check-world passes. > I was not sure if any documentation change is required or not for this > in doc/src/sgml/ref/psql-ref.sgml. Thoughts? > Thank you for the comment. I added a line in the docs. Admittedly, it might need tweaking. Please find version 5 of the patch attached. > Regards, > Vignesh > EnterpriseDB: http://www.enterprisedb.com 0001-Include-AM-in-listTables-output-v5.patch Description: Binary data
Re: Using Valgrind to detect faulty buffer accesses (no pin or buffer content lock held)
As a general overview, the series of patches in the mail thread do match their description. The addition of the stricter, explicit use of instrumentation does improve the design, as the distinction between the use cases requiring a pin or a lock is made more clear. The added commentary is descriptive and appears grammatically correct, at least to a non-native speaker. Unfortunately though, the two bug fixes do not seem to apply. Also, there is a small issue regarding the process, not the content, of the patches. In the CF app there is a latest attachment (v3-0002-Add-nbtree-Valgrind-buffer-lock-checks.patch) which does not appear in the mail thread. Before changing the status, I will kindly ask for the complete latest series, posted in the mail thread and applying cleanly. The new status of this patch is: Waiting on Author
Re: Include access method in listTables output
‐‐‐ Original Message ‐‐‐ On Monday, July 6, 2020 3:12 AM, Michael Paquier wrote: > On Sun, Jul 05, 2020 at 07:13:10AM +0530, vignesh C wrote: > > > I'm not sure if we should include showViews, I had seen that the > > access method was not getting selected for view. > > +1. These have no physical storage, so you are looking here for > relkinds that satisfy RELKIND_HAS_STORAGE(). Thank you for the review. Find attached v4 of the patch. > > --- > > Michael 0001-Include-AM-in-listTables-output-v4.patch Description: Binary data
Re: Include access method in listTables output
‐‐‐ Original Message ‐‐‐ On Saturday, June 20, 2020 3:15 PM, vignesh C wrote: > On Tue, Jun 16, 2020 at 6:13 PM Georgios gkokola...@protonmail.com wrote: > > > > Few comments: > > > > > > - if (pset.sversion >= 12) > > > > > > - appendPQExpBufferStr(, > > > > > > - "\n LEFT JOIN pg_catalog.pg_am am ON am.oid = c.relam"); > > > Should we include pset.hide_tableam check along with the version > > > check? > > > > > > > I opted against it, since it seems more intuitive to have a single > > switch and placed on the display part. A similar pattern can be found > > in describeOneTableDetails(). I do not hold a strong opinion and will > > gladly amend if insisted upon. > > I felt we could add that check as we might be including > pg_catalog.pg_am in cases even though we really don't need it. As promised, I gladly amend as per your request. Also included is a fix for a minor oversight in the tests; they should now be stable. Finally, in this version I extended the logic a bit to include the access method column only if the relations displayed can have one, which for example excludes sequences. > > Regards, > Vignesh > EnterpriseDB: http://www.enterprisedb.com 0001-Include-AM-in-listTables-output-v3.patch Description: Binary data
Use TableAm API in pg_table_size
Hi, I noticed that there are several file layout assumptions in dbsize.c which might not hold true for non-heap relations attached via the TableAm API. It seems logical that, in order to retrieve the disk size of a relation, the existing size method should be used instead. A small patch is included to demonstrate what such an implementation can look like. Also, the existing method for heap, table_block_relation_size, should be able to address the valid cases where a fork number does not exist. If this is considered valid, then the same can be applied for indexes too. The more generic calculate_relation_size can be adapted to call into the TableAm for those kinds of relations where it makes sense. If agreed, a more complete patch can be provided. Cheers, //Georgios 0001-Use-tableam-for-pg_table_size.patch Description: Binary data
Re: Include access method in listTables output
‐‐‐ Original Message ‐‐‐ On Monday, June 15, 2020 3:20 AM, vignesh C wrote: > On Tue, Jun 9, 2020 at 6:45 PM Georgios gkokola...@protonmail.com wrote: > > > > > > > Please add it to the commitfest at https://commitfest.postgresql.org/28/ > > > > Thank you very much for your time. Added to the commitfest as suggested. > > Patch applies cleanly, make check & make check-world passes. Thank you for your time and comments! Please find v2 of the patch attached. > > Few comments: > > - if (pset.sversion >= 12) > > - appendPQExpBufferStr(, > > - "\n LEFT JOIN pg_catalog.pg_am am ON am.oid = c.relam"); > > Should we include pset.hide_tableam check along with the version check? I opted against it, since it seems more intuitive to have a single switch, placed on the display part. A similar pattern can be found in describeOneTableDetails(). I do not hold a strong opinion and will gladly amend if insisted upon. > > - if (pset.sversion >= 12 && !pset.hide_tableam) > > - { > > - appendPQExpBuffer(, > > - ",\n am.amname as \"%s\"", > > - gettext_noop("Access Method")); > > - } > > Braces can be removed & the second line can be combined with earlier. Agreed on the braces. Disagree on the second line, as this style is in line with the rest of the code. For consistency, buf, the format string and gettext_noop() are each found on their own line within this file. > > We could include the test in psql file by configuring HIDE_TABLEAM. > Agreed. > Regards, > Vignesh > EnterpriseDB: http://www.enterprisedb.com > 0001-Include-AM-in-listTables-output-v2.patch Description: Binary data
Re: Relation wide 'LIKE' clause
‐‐‐ Original Message ‐‐‐ On Wednesday, June 10, 2020 12:08 PM, Peter Eisentraut wrote: > On 2020-06-10 11:42, Georgios wrote: > > > Postgres create table statement supports `LIKE source_table [like_option... > > ]` > > to specify `a table from which the new table automatically copies all > > column names, their data types, and their not-null constraints.` according > > to > > documentation [1]. > > I am wondering if a similar clause would make sense to copy relation wide > > settings. For example consider a relation created like this: > > `CREATE TABLE source_table ([column, ...]) USING customam WITH > > (storage_parameter1 = value1, ... )` > > We already have LIKE INCLUDING STORAGE. Maybe that should just be > updated to work like what you are asking for. This is correct. However I do see some limitations there. Consider the valid scenario: `CREATE TABLE target (LIKE source_am1 INCLUDING ALL, LIKE source_am2 INCLUDING ALL)` Which source relation should be used for the access method and storage parameters? Also I _think_ that the current `LIKE` clause specifically targets column definitions in the SQL standard. I am a bit hesitant on the last part, yet this is my current understanding. Please, let me know what you think. > > -- > > Peter Eisentraut http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Relation wide 'LIKE' clause
Hi, Postgres' create table statement supports `LIKE source_table [like_option... ]` to specify `a table from which the new table automatically copies all column names, their data types, and their not-null constraints.` according to the documentation [1]. I am wondering if a similar clause would make sense to copy relation-wide settings. For example, consider a relation created like this: `CREATE TABLE source_table ([column, ...]) USING customam WITH (storage_parameter1 = value1, ... )` Maybe a statement similar to: `CREATE TABLE target LIKE source_table` which should be equivalent to: `CREATE TABLE target (LIKE source_table INCLUDING ALL) USING customam WITH (storage_parameter1 = value1, ...)` can be useful as a syntactic shortcut. Maybe the usefulness of such a shortcut becomes a bit more apparent if one considers that custom access methods can offer a diversity of storage parameters that interact both at relation and column level, especially when the source relation is column-oriented. If the possibility for such a statement is not discarded, a patch can be readily provided. Cheers, //Georgios [1] https://www.postgresql.org/docs/13/sql-createtable.html
Re: Include access method in listTables output
‐‐‐ Original Message ‐‐‐ On Tuesday, June 9, 2020 1:34 PM, David Rowley wrote: > On Tue, 9 Jun 2020 at 23:03, Georgios gkokola...@protonmail.com wrote: > > > A small patch is attached [1] to see if you think it makes sense. I have > > not included any > > differences in the tests output yet, as the idea might get discarded. > > However if the patch is > > found useful. I shall ament the test results as needed. > > It seems like a fair thing to need/want. We do already show this in > \d+ tablename, so it seems pretty fair to want it in the \d+ output > too > > Please add it to the commitfest at https://commitfest.postgresql.org/28/ Thank you very much for your time. Added to the commitfest as suggested. > > David
Include access method in listTables output
Hi, Postgres 12 introduced the TableAm API. Although, as far as I can see, currently only heap is included as an access method, it is fair to imagine that users will start adding their own methods and that more methods will be included in Postgres core. With that in mind, it might be desirable for a user to see the access method when describing in verbose mode, e.g. `\d+`. A small patch is attached [1] to see if you think it makes sense. I have not included any differences in the tests output yet, as the idea might get discarded. However, if the patch is found useful, I shall amend the test results as needed. Cheers, //Georgios 0001-Include-AM-in-listTables-output.patch Description: Binary data
Re: Refactor compile-time assertion checks for C/C++
Thank you for updating the status of the issue. I have to admit that I completely missed the CF bot. Lesson learned. For whatever it is worth, my previous comment that the patch improves readability also applies to the updated version of the patch. The CF bot seems happy now, which means that your assessment of the error and the fix was correct. So :+1: from me.
Re: ALTER TEXT SEARCH DICTIONARY tab completion
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested It looks good and does what it says on the tin. One minor nitpick I feel I should add is that for completeness and balance the equivalent `CREATE TEXT SEARCH DICTIONARY` should get the same treatment. Maybe something along the lines of:

```
- else if (Matches("CREATE", "TEXT", "SEARCH", "CONFIGURATION", MatchAny))
+ else if (Matches("CREATE", "TEXT", "SEARCH", "DICTIONARY|CONFIGURATION", MatchAny))
```
Re: Refactor compile-time assertion checks for C/C++
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation:not tested In my humble opinion the patch improves readability, hence gets my +1. No further suggestions. Passing on to a committer to judge further. The new status of this patch is: Ready for Committer
Re: Duplicate Workers entries in some EXPLAIN plans
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested > Ah, I see. I think I got that from ExplainPrintSettings. I've > corrected my usage--thanks for pointing it out. I appreciate the > effort to maintain a consistent style. Thanks, I am just following the reviewing guide, to be honest. > Also, reviewing my code again, I noticed that when I moved the general > worker output earlier, I missed part of the merge conflict: Right. I thought that was intentional. > which ignores the es->hide_workers flag (it did not fail the tests, > but the intent is pretty clear). I've corrected this in the current > patch. Noted and appreciated. > I also noticed that we can now handle worker buffer output more > consistently across TEXT and structured formats, so I made that small > change too: Looks good. > I think the "producing plan output for a worker" process is easier to > reason about now, and while it changes TEXT format worker output > order, the other changes in this patch are more drastic so this > probably does not matter. > > I've also addressed the other feedback above, and reworded a couple of > comments slightly. Thanks for the above. Looks clean, does what it says on the tin and solves a problem that needs solving. All relevant installcheck-world tests pass. As far as I am concerned, I think it is ready to be sent to a committer. Updating the status accordingly. The new status of this patch is: Ready for Committer
Re: Duplicate Workers entries in some EXPLAIN plans
>> + int num_workers = planstate->worker_instrument->num_workers; >> + int n; >> + worker_strs = (StringInfo *) palloc0(num_workers * >> sizeof(StringInfo)); >> + for (n = 0; n < num_workers; n++) { >> >> I think C99 would be better here. Also no parenthesis needed. > > > Pardon my C illiteracy, but what am I doing that's not valid C99 here? My bad, I should have been clearer. I meant that it is preferable to use the C99 standard, which calls for declaring variables in the scope where you need them. In this case, 'n' is needed only in the for loop, so something like for (int n = 0; n < num_workers; n++) is preferable. To be clear, your code was perfectly valid. It was only the style I was referring to. >> + for (n = 0; n < w->num_workers; ++n) >> >> I think C99 would be better here. > > > And here (if it's not the same problem)? Exactly the same as above. >> int indent; /* current indentation level */ >> List *grouping_stack; /* format-specific grouping state */ >> + bool print_workers; /* whether current node has worker metadata >> */ >> >> Hmm.. commit introduced `hide_workers` in the struct. Having >> both >> names in the struct so far apart even seems a bit confusing and sloppy. Do >> you >> think it would be possible to combine or rename? > > > I noticed that. I was thinking about combining them, but > "hide_workers" seems to be about "pretend there is no worker output > even if there is" and "print_workers" is "keep track of whether or not > there is worker output to print". Maybe I'll rename to > "has_worker_output"? The rename sounds a bit better in my humble opinion. Thanks.
Re: Duplicate Workers entries in some EXPLAIN plans
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation:not tested > TEXT format was tricky due to its inconsistencies, but I think I have > something working reasonably well. I added a simple test for TEXT > format output as well, using a similar approach as the JSON format Great! > test, and liberally regexp_replacing away any volatile output. I > suppose in theory we could do this for YAML, too, but I think it's > gross enough not to be worth it, especially given the high similarity > of all the structured outputs. Agreed, what is in the patch suffices. Overall great work, a couple of minor nitpicks if you allow me. + /* Prepare per-worker output */ + if (es->analyze && planstate->worker_instrument) { Style, parenthesis on its own line. + int num_workers = planstate->worker_instrument->num_workers; + int n; + worker_strs = (StringInfo *) palloc0(num_workers * sizeof(StringInfo)); + for (n = 0; n < num_workers; n++) { I think C99 would be better here. Also no parenthesis needed. + worker_strs[n] = makeStringInfo(); + } + } @@ -1357,6 +1369,58 @@ ExplainNode(PlanState *planstate, List *ancestors, ExplainPropertyBool("Parallel Aware", plan->parallel_aware, es); } + /* Prepare worker general execution details */ + if (es->analyze && es->verbose && planstate->worker_instrument) + { + WorkerInstrumentation *w = planstate->worker_instrument; + int n; + + for (n = 0; n < w->num_workers; ++n) I think C99 would be better here. 
+ { + Instrumentation *instrument = &w->instrument[n]; + double nloops = instrument->nloops; - appendStringInfoSpaces(es->str, es->indent * 2); - if (n > 0 || !es->hide_workers) - appendStringInfo(es->str, "Worker %d: ", n); + if (indent) + { + appendStringInfoSpaces(es->str, es->indent * 2); + } Style: No parenthesis needed - if (opened_group) - ExplainCloseGroup("Workers", "Workers", false, es); + /* Show worker detail */ + if (planstate->worker_instrument) { + ExplainFlushWorkers(worker_strs, planstate->worker_instrument->num_workers, es); } Style: No parenthesis needed +* just indent once, to add worker info on the next worker line. +*/ + if (es->str == es->root_str) + { + es->indent += es->format == EXPLAIN_FORMAT_TEXT ? 1 : 2; + } + Style: No parenthesis needed + ExplainCloseGroup("Workers", "Workers", false, es); + // do we have any other cleanup to do? This comment does not really explain anything. Either remove or rephrase. Also, C style comments. + es->print_workers = false; +} int indent; /* current indentation level */ List *grouping_stack; /* format-specific grouping state */ + bool print_workers; /* whether current node has worker metadata */ Hmm.. commit introduced `hide_workers` in the struct. Having both names in the struct so far apart even seems a bit confusing and sloppy. Do you think it would be possible to combine or rename? +extern void ExplainOpenWorker(StringInfo worker_str, ExplainState *es); +extern void ExplainCloseWorker(ExplainState *es); +extern void ExplainFlushWorkers(StringInfo *worker_strs, int num_workers, ExplainState *es); No need to expose those, is there? I feel these should be static. Awaiting an answer or the resolution of these comments to change the status. //Georgios
Re: Duplicate Workers entries in some EXPLAIN plans
> Sounds good, I'll try that format. Any idea how to test YAML? With the > JSON format, I was able to rely on Postgres' own JSON-manipulating > functions to strip or canonicalize fields that can vary across > executions--I can't really do that with YAML. Yes, this approach was clear in the patch and works great with JSON. Also, you are correct, this cannot be done with YAML. I spent a bit of time looking around and I could not find any tests at all on the YAML format. > Or should I run EXPLAIN > with COSTS OFF, TIMING OFF, SUMMARY OFF and assume that for simple > queries the BUFFERS output (and other fields I can't turn off like > Sort Space Used) *is* going to be stable? I have to admit that with the current diff tool used in pg_regress, this is not possible. I am pretty certain that it *is not* going to be stable. Not for long anyway. I withdraw my suggestion for YAML and am currently awaiting the TEXT format only.
Re: Duplicate Workers entries in some EXPLAIN plans
The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: not tested Spec compliant: not tested Documentation: not tested The current version of the patch (v2) applies cleanly and does what it says on the box. > Any thoughts on what we should do with text mode output (which is untouched right now)? The output Andres proposed above makes sense to me, but I'd like to get more input. Nothing to add beyond what is stated in the thread. > Sort Method: external merge Disk: 4920kB > Buffers: shared hit=682 read=10188, temp read=1415 written=2101 > Worker 0: actual time=130.058..130.324 rows=1324 loops=1 >Sort Method: external merge Disk: 5880kB >Buffers: shared hit=337 read=3489, temp read=505 written=739 > Worker 1: actual time=130.273..130.512 rows=1297 loops=1 >Buffers: shared hit=345 read=3507, temp read=505 written=744 >Sort Method: external merge Disk: 5920kB This proposal seems like a fitting approach. Awaiting v3, which will include the text version. May I suggest a YAML-format test case? Just to make certain that no regressions occur in the future. Thanks,
Re: Duplicate Workers entries in some EXPLAIN plans
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested This is a high-level review only. However, seeing that there is a conflict and the patch does not merge cleanly after commit , I return it to Author. To be fair, the resolution seems quite straightforward and I took the liberty of applying the necessary changes to include the new element of ExplainState introduced in the above commit, namely hide_workers. However, since the author might have a different idea on how to incorporate this change, I leave it up to him. Another very high-level comment is the introduction of a new test file, namely explain. Seeing `explain.sql` in the test suite, personally and very much as a matter of opinion, I would have expected the whole spectrum of the explain properties to be tested. However, only a small fraction of the functionality is tested. Since this is a bit more of a personal opinion, I don't expect any changes unless the author happens to agree. Other than these minor nitpicks, the code seems clear, concise and does what it says on the box. It follows the comments in the discussion thread(s) and solves a real issue. Please have a look at how commit affects this patch and rebase. The new status of this patch is: Waiting on Author
Re: CPU costs of random_zipfian in pgbench
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation:not tested Version 3 of the patch looks ready for committer. Thank you for taking the time to code! The new status of this patch is: Ready for Committer
Re: CPU costs of random_zipfian in pgbench
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested For whatever it is worth, the patch looks good to me. A minor nitpick would be to use a verb in the part: `cost when the parameter in (0, 1)` maybe: `cost when the parameter's value is in (0, 1)` or similar. Apart from that, I would suggest that the patch be moved to the Waiting on Author state.
Re: Adding a TAP test checking data consistency on standby with minRecoveryPoint
Hi, you are right in saying that my comment didn't offer much of a constructive explanation. Apologies for that. To the issue at hand. Tests were run in the same manner as in all other cases and the test in question was the only one to fail in the whole tree. By looking a bit closer at the error, the culprit seems to be the missing installation of `pageinspect`, as was correctly pointed out. I did notice before sending the first message in the thread that `contrib/pageinspect` was added to the `EXTRA_INSTALL` variable and I *assumed* that it would be properly installed to the system. However, that was not the case. Without being a recursive-Makefile guru, I noticed one has to explicitly $(call recurse,checkprep, $(recurse_alldirs_targets)) in `src/test/Makefile` in order for the call to `checkprep` to be made when running make check or installcheck or any other of the variants. Having said all the above, and after spending more time reviewing the whole patch, I do think it is worthwhile committing, as long as it is not tripping any users up with unexpected test errors due to incomplete installations. I hope this comment helps a bit more.
Re: Adding a TAP test checking data consistency on standby with minRecoveryPoint
Hi, I applied the patch on current master and ran the tests, but I am afraid that the newly introduced test failed on installcheck-world:

```
t/016_min_consistency.pl . # Looks like your test exited with 29 before it could output anything.
t/016_min_consistency.pl . Dubious, test returned 29 (wstat 7424, 0x1d00)
Failed 2/2 subtests

Test Summary Report
-------------------
t/016_min_consistency.pl (Wstat: 7424 Tests: 0 Failed: 0)
  Non-zero exit status: 29
  Parse errors: Bad plan. You planned 2 tests but ran 0.
Files=16, Tests=143, 65 wallclock secs ( 0.04 usr 0.04 sys + 6.53 cusr 5.08 csys = 11.69 CPU)
Result: FAIL
```

To be honest, I have not checked closely on the failure; still, it is the only test failing, which by itself should be worth mentioning.
Re: Tighten error control for OpenTransientFile/CloseTransientFile
The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: not tested Spec compliant: not tested Documentation:not tested The second version of this patch seems to be in order and ready for committer. Thank you for taking the time to code! The new status of this patch is: Ready for Committer
Re: Tighten error control for OpenTransientFile/CloseTransientFile
Overall the patch looks good and, according to the previous discussion, fulfils its purpose. It might be worthwhile to also check for errors on close in SaveSlotToPath():

```
pgstat_report_wait_end();
CloseTransientFile(fd);

/* rename to permanent file, fsync file and directory */
if (rename(tmppath, path) != 0)
```