Re: Assert in pageinspect with NULL pages
On Wed, Mar 16, 2022 at 05:43:48PM +0900, Michael Paquier wrote: > So, I have looked at this second part of the thread, and concluded > that we should not fail for empty pages. First, we fetch pages from > the buffer pool in normal mode, where empty pages are valid. There is > also a second point in favor of doing so: code paths dedicated to hash > indexes already do that, marking such pages as simply "unused". The > proposal from Julien upthread sounds cleaner to me though in the long > run, as NULL gives the user the possibility to do a full-table scan > with simple clauses to filter out anything found as NULL. It has been a couple of weeks, but I have been able to come back to this set of issues for all-zero pages, double-checked the whole and applied a set of fixes down to 10. So we should be completely done here. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
Hi, On Mon, Mar 28, 2022 at 08:29:29PM +0300, Maxim Orlov wrote: > I've suddenly found that the test in this patch is based on a fact that > heap pages don't have PageSpecial or it is of different size with btree > pages Special area: > > CREATE TABLE test1 (a int8, b int4range); > > SELECT bt_page_items(get_raw_page('test1', 0)); > > > In the current state is is so, but it is not guaranteed. For example, in > the proposed 64xid patch [1] we introduce PageSpecial into heap pages and > its size on occasion equals to the size of btree page special so check from > above is false. Even if we pass heap page into pageinspect pretending it is > btree page. So the test will fail. > > > Generally it seems not only a wrong test but the consequence of a bigger > scale fact that we can not always distinguish different AM's pages. So I'd > propose at least that we don't come to a conclusion that a page is valid > based on PageSpecial size is right. We don't assume that the page is valid, or of the correct AM, just based on the special area size. We do reject it if the size is invalid, but we also have other tests based on what's actually possible to test (checking flag sanity, magic values and so on). Note that Peter G. suggested to add more checks for btree based on palloc_btree_page(), did you check if those were enough to fix this test with the 64bits xid patchset applied?
Re: Assert in pageinspect with NULL pages
I've suddenly found that the test in this patch is based on a fact that heap pages don't have PageSpecial or it is of different size with btree pages Special area: CREATE TABLE test1 (a int8, b int4range); SELECT bt_page_items(get_raw_page('test1', 0)); In the current state is is so, but it is not guaranteed. For example, in the proposed 64xid patch [1] we introduce PageSpecial into heap pages and its size on occasion equals to the size of btree page special so check from above is false. Even if we pass heap page into pageinspect pretending it is btree page. So the test will fail. Generally it seems not only a wrong test but the consequence of a bigger scale fact that we can not always distinguish different AM's pages. So I'd propose at least that we don't come to a conclusion that a page is valid based on PageSpecial size is right. [1] https://www.postgresql.org/message-id/flat/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com -- Best regards, Maxim Orlov.
Re: Assert in pageinspect with NULL pages
On Sun, Mar 27, 2022 at 02:36:54PM -0700, Peter Geoghegan wrote: > I think that it might actually be fundamentally impossible to > guarantee that that'll be safe, because we have no idea what the > output function might be doing. It's arbitrary user-defined code that > could easily be implemented in C. Combined with an arbitrary page > image. My guess that you'd bring in a lot more safety if we completely cut the capacity to fetch and pass down raw pages across the SQL calls because you can add checks for the wanted AM, replacing all that with a (regclass,blkno) pair. I am however ready to buy that this may not be worth rewriting 10~15 years after the fact. > I think that most of the functions can approach being perfectly > robust, with a little work. In practical terms they can almost > certainly be made so robust that no real user of bt_page_items() will > ever crash the server. Somebody that goes out of their way to do that > *might* find a way (even with the easier cases), but that doesn't > particularly concern me. That does not concern me either, and my own take is to take as realistic things that can be fetched from a get_raw_page() call rather than a bytea specifically crafted. Now, I have found myself in cases where it was useful to see the contents of a page, as long as you can go through the initial checks, particularly in cases where corruptions involved unaligned contents. Trigerring an error on a sanity check for a specific block may be fine, now I have also found myself doing some full scans to see in one shot the extent of a damaged relation file using the functions of pageinspect. Fun times. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
On Sun, Mar 27, 2022 at 2:02 PM Robert Haas wrote: > On Sun, Mar 27, 2022 at 4:26 PM Peter Geoghegan wrote: > > We're not dealing > > with adversarial page images here. > > I think it's bad that we have to make that assumption, considering > that there's nothing whatever to keep users from supplying arbitrary > page images to pageinspect. Maybe it isn't strictly necessary for bt_page_items(), but making that level of guarantee is really hard, and not particularly useful. And that's the easy case for pageinspect: gist_page_items() takes a raw bytea, and puts it through the underlying types output functions. I think that it might actually be fundamentally impossible to guarantee that that'll be safe, because we have no idea what the output function might be doing. It's arbitrary user-defined code that could easily be implemented in C. Combined with an arbitrary page image. > But I also agree that if we're unable or > unwilling to make things perfect, it's still good to make them better. I think that most of the functions can approach being perfectly robust, with a little work. In practical terms they can almost certainly be made so robust that no real user of bt_page_items() will ever crash the server. Somebody that goes out of their way to do that *might* find a way (even with the easier cases), but that doesn't particularly concern me. -- Peter Geoghegan
Re: Assert in pageinspect with NULL pages
On Sun, Mar 27, 2022 at 4:26 PM Peter Geoghegan wrote: > We're not dealing > with adversarial page images here. I think it's bad that we have to make that assumption, considering that there's nothing whatever to keep users from supplying arbitrary page images to pageinspect. But I also agree that if we're unable or unwilling to make things perfect, it's still good to make them better. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Assert in pageinspect with NULL pages
On Sun, Mar 27, 2022 at 7:24 AM Robert Haas wrote: > It's my impression that there are many ways of crashing the system > using pageinspect right now. We aren't very careful about making sure > that our code that reads from disk doesn't crash if the data on disk > is corrupted, and all of that systemic weakness is inherited by > pageinspect. It's significantly worse than the core code, I'd say (before we even consider the fact that you can supply your own pages). At least with index AMs, where functions like _bt_checkpage() and gistcheckpage() provide the most basic sanity checks for any page that is read from disk. > Fixing this seems like a lot of work and problematic for various > reasons. And I do agree that if we can allow people to look at what's > going on with a corrupt page, that's useful. On the other hand, if by > "looking" they crash the system, that sucks a lot. So overall I think > we need a lot more sanity checks here. I don't think it's particularly hard to do a little more. That's all it would take to prevent practically all crashes. We're not dealing with adversarial page images here. Sure, it would be overkill to fully adapt something like palloc_btree_page() for pageinspect, for the reason Michael gave. But there are a couple of checks that it does that would avoid practically all real world hard crashes, without any plausible downside: * The basic _bt_checkpage() call that every nbtree page uses in the core code (as well as in amcheck). * The maxoffset > MaxIndexTuplesPerPage check. You could even go a bit further, and care about individual index tuples. My commit 93ee38eade added some basic sanity checks for index tuples to bt_page_items(). That could go a bit further as well. In particular, the sanity checks from amcheck's PageGetItemIdCareful() function could be carried over. That would make it impossible for bt_page_items() to read past the end of the page when given a page that has an index tuple whose item pointer offset has some wildly unreasonable value. I'm not volunteering. Just saying that this is quite possible. -- Peter Geoghegan
Re: Assert in pageinspect with NULL pages
On Sun, 27 Mar 2022 at 16:24, Robert Haas wrote: > > On Fri, Mar 25, 2022 at 12:57 AM Michael Paquier wrote: > > On Thu, Mar 24, 2022 at 08:54:14PM -0700, Peter Geoghegan wrote: > > > amcheck's palloc_btree_page() function validates that an 8KiB page is > > > in fact an nbtree page, in a maximally paranoid way. Might be an > > > example worth following here. > > > > Sure, we could do that. However, I don't think that going down to > > that is something we have any need for in pageinspect, as the module > > is also useful to look at the contents of the full page, even if > > slightly corrupted, and too many checks would prevent a lookup at the > > internal contents of a page. > > It's my impression that there are many ways of crashing the system > using pageinspect right now. We aren't very careful about making sure > that our code that reads from disk doesn't crash if the data on disk > is corrupted, and all of that systemic weakness is inherited by > pageinspect. But, with pageinspect, you can supply your own pages, not > just use the ones that actually exist on disk. > > Fixing this seems like a lot of work and problematic for various > reasons. And I do agree that if we can allow people to look at what's > going on with a corrupt page, that's useful. On the other hand, if by > "looking" they crash the system, that sucks a lot. So overall I think > we need a lot more sanity checks here. I noticed this thread due to recent commit 291e517a, and noticed that it has some overlap with one of my patches [0], in which I fix the corresponding issue in core postgres by assuming (and in assert-enabled builds, verifying) the size and location of the special area. As such, you might be interested in that patch. Note that currently in core postgres a corrupted special area pointer will likely target neighbouring blocks in the buffer pool; resulting in spreading corruption when the special area is updated. This spreading corruption should be limited to only the corrupted block with my patch. Kind regards, Matthias van de Meent [0] https://commitfest.postgresql.org/37/3543/ https://www.postgresql.org/message-id/caeze2wje3+tgo9fs9+izmu+z6mmzko54w1zt98wkqbeuhbh...@mail.gmail.com
Re: Assert in pageinspect with NULL pages
On Fri, Mar 25, 2022 at 12:57 AM Michael Paquier wrote: > On Thu, Mar 24, 2022 at 08:54:14PM -0700, Peter Geoghegan wrote: > > amcheck's palloc_btree_page() function validates that an 8KiB page is > > in fact an nbtree page, in a maximally paranoid way. Might be an > > example worth following here. > > Sure, we could do that. However, I don't think that going down to > that is something we have any need for in pageinspect, as the module > is also useful to look at the contents of the full page, even if > slightly corrupted, and too many checks would prevent a lookup at the > internal contents of a page. It's my impression that there are many ways of crashing the system using pageinspect right now. We aren't very careful about making sure that our code that reads from disk doesn't crash if the data on disk is corrupted, and all of that systemic weakness is inherited by pageinspect. But, with pageinspect, you can supply your own pages, not just use the ones that actually exist on disk. Fixing this seems like a lot of work and problematic for various reasons. And I do agree that if we can allow people to look at what's going on with a corrupt page, that's useful. On the other hand, if by "looking" they crash the system, that sucks a lot. So overall I think we need a lot more sanity checks here. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Assert in pageinspect with NULL pages
On Fri, Mar 25, 2022 at 12:27:24PM +0900, Michael Paquier wrote: > Makes sense. Fine by me to stick an extra "btree" here. I have been able to look at that again today (earlier than expected), and except for one incorrect thing in the GIN tests where we were not testing the correct pattern, the rest was correct. So applied and backpatched. The buildfarm is not complaining based on the first reports. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
On Thu, Mar 24, 2022 at 08:54:14PM -0700, Peter Geoghegan wrote: > amcheck's palloc_btree_page() function validates that an 8KiB page is > in fact an nbtree page, in a maximally paranoid way. Might be an > example worth following here. Sure, we could do that. However, I don't think that going down to that is something we have any need for in pageinspect, as the module is also useful to look at the contents of the full page, even if slightly corrupted, and too many checks would prevent a lookup at the internal contents of a page. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
On Wed, Mar 23, 2022 at 1:16 AM Michael Paquier wrote: > As far as I can see, this is > also possible in bt_page_items_bytea(), gist_page_opaque_info(), > gin_metapage_info() and gin_page_opaque_info(). All those code paths > should be protected with some checks on PageGetSpecialSize(), I > guess, before attempting to read the special area of the page. Hash > indexes are protected by checking the expected size of the special > area, and one code path of btree relies on the opened relation to be a > btree index. amcheck's palloc_btree_page() function validates that an 8KiB page is in fact an nbtree page, in a maximally paranoid way. Might be an example worth following here. -- Peter Geoghegan
Re: Assert in pageinspect with NULL pages
On Fri, Mar 25, 2022 at 11:03:47AM +0800, Julien Rouhaud wrote: > I'm happy with all the changes, except: > > + if (P_ISLEAF(opaque) && opaque->btpo_level != 0) > + ereport(ERROR, > + > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > + errmsg("block is not a valid leaf > page"))); > > All other messages specify which kind of page it's about, so I think it would > be better to specify "btree" leaf page here, especially since some other AMs > also have leaf pages. Makes sense. Fine by me to stick an extra "btree" here. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
On Fri, Mar 25, 2022 at 11:44:26AM +0900, Michael Paquier wrote: > I have reviewed what you have sent, bumping on a couple of issues: Thanks! I'm happy with all the changes, except: + if (P_ISLEAF(opaque) && opaque->btpo_level != 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), +errmsg("block is not a valid leaf page"))); All other messages specify which kind of page it's about, so I think it would be better to specify "btree" leaf page here, especially since some other AMs also have leaf pages. > - The tests of btree and BRIN failed with 32-bit builds, because > MAXALIGN returns shorter special area sizes in those cases. This can > be fixed by abusing of \set VERBOSITY to mask the error details. We > already do that in some of the tests to make them portable. Yeah, that's the other stability problem I was worried about. I should have tried to compile with -m32. > > I'm a bit worried about the btree tests stability. I avoid emitting the > > level > > found to help with that, but it still depends on what other AM will put in > > their special page. > > Well, the limit of the pageinspect model comes from the fact that it > is possible to pass down any bytea and all those code paths would > happily process the blobs as long as they are 8kB. Pages can be > crafted as well to bypass some of the checks. This is superuser-only, > so people have to be careful, but preventing out-of-bound reads is a > different class of problem, as long as these come from valid pages. Agreed. Also pageinspect can be handy when debugging corruption, so I think it shouldn't try too hard to discard buggy pages.
Re: Assert in pageinspect with NULL pages
On Thu, Mar 24, 2022 at 04:27:40PM +0800, Julien Rouhaud wrote: > I worked on a patch to fix the problem. The functions you mention were indeed > missing some check, but after digging a bit more I found that other problem > existed. For instance, feeding a btree page to a gist_page_items_bytea() > (both > pages have the same special size) can be quite problematic too, as you can end > up accessing bogus data when retrieving the items. I also added additional > sanity checks with what is available (gist_page_id for gist, zero-level for > btree leaf pages and so on), and tried to cover everything with tests. Thanks for the patch! I have reviewed what you have sent, bumping on a couple of issues: - The tests of btree and BRIN failed with 32-bit builds, because MAXALIGN returns shorter special area sizes in those cases. This can be fixed by abusing of \set VERBOSITY to mask the error details. We already do that in some of the tests to make them portable. - Some of the tests were not necessary, overlapping with stuff that already existed. - Some checks missed a MAXALIGN(). - Did some tweaks with the error messages. - Some error messages used an incorrect term to define the index AM type, aka s/gist/GiST/ or s/brin/BRIN/. - errdetail() requires a sentence, finishing with a dot (some places of hashfuncs.c missed that. - Not your fault, but hash indexes would complain about corrupted indexes which could be misleading for the user if passing down a correct page from an incorrect index AM. - While on it, I have made the error messages more generic in the places where I could do so. What I have finished with seems to have a good balance. > I'm a bit worried about the btree tests stability. I avoid emitting the level > found to help with that, but it still depends on what other AM will put in > their special page. Well, the limit of the pageinspect model comes from the fact that it is possible to pass down any bytea and all those code paths would happily process the blobs as long as they are 8kB. Pages can be crafted as well to bypass some of the checks. This is superuser-only, so people have to be careful, but preventing out-of-bound reads is a different class of problem, as long as these come from valid pages. With all that in place, I get the attached. It is Friday, so I am not going to take my bets on the buildfarm today with a rather-risky patch. Monday/Tuesday will be fine. -- Michael From f0f9a9cc10be30e10728bd99bc7425bf7f9d3a9e Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Fri, 25 Mar 2022 11:42:28 +0900 Subject: [PATCH v2] pageinspect: add more sanity checks when accessing pages. As seen in bug #16527, feeding an incorrect page can lead to ouf-of-bound reads when trying to access the page special area. To fix it, check that the special area has the expected size before trying to access in in the functions that weren't already doing so. While at it, add other checks when possible to validate that the given page is from the expected format to avoid other possible invalid access. Author: Julien Rouhaud Reported-By: Alexander Lakhin Discussion: https://postgr.es/m/16527-ef7606186f0610a1%40postgresql.org Discussion: https://postgr.es/m/150535d6-65ac-16ee-b28a-851965fc485b%40gmail.com --- contrib/pageinspect/brinfuncs.c| 18 + contrib/pageinspect/btreefuncs.c | 14 +++ contrib/pageinspect/expected/brin.out | 10 contrib/pageinspect/expected/btree.out | 22 +--- contrib/pageinspect/expected/gin.out | 12 +++-- contrib/pageinspect/expected/gist.out | 12 +++-- contrib/pageinspect/expected/hash.out | 14 +-- contrib/pageinspect/ginfuncs.c | 22 +--- contrib/pageinspect/gistfuncs.c| 35 ++ contrib/pageinspect/hashfuncs.c| 11 +--- contrib/pageinspect/sql/brin.sql | 8 ++ contrib/pageinspect/sql/btree.sql | 18 ++--- contrib/pageinspect/sql/gin.sql| 9 +-- contrib/pageinspect/sql/gist.sql | 9 +-- contrib/pageinspect/sql/hash.sql | 10 ++-- 15 files changed, 197 insertions(+), 27 deletions(-) diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c index bf12901ac3..0387b2847a 100644 --- a/contrib/pageinspect/brinfuncs.c +++ b/contrib/pageinspect/brinfuncs.c @@ -58,6 +58,15 @@ brin_page_type(PG_FUNCTION_ARGS) page = get_page_from_raw(raw_page); + /* verify the special space has the expected size */ + if (PageGetSpecialSize(page) != MAXALIGN(sizeof(BrinSpecialSpace))) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("input page is not a valid %s page", "BRIN"), + errdetail("Expected special size %d, got %d.", + (int) MAXALIGN(sizeof(BrinSpecialSpace)), + (int) PageGetSpecialSize(page; + switch (BrinPageType(page)) { case BRIN_PAGETYPE_META: @@ -86,6 +95,15 @@ verify_brin_page(bytea *raw_page, uint16
Re: Assert in pageinspect with NULL pages
Hi, On Wed, Mar 23, 2022 at 05:16:41PM +0900, Michael Paquier wrote: > On Fri, Mar 18, 2022 at 06:43:39AM +0300, Alexander Lakhin wrote: > > Hello Michael, > > No, just x86_64, Ubuntu 20.04, gcc 11, valgrind 3.15. I just put that query > > in page.sql and see the server abort. > > Bah, of course, I have missed the valgrind part of the problem. > > The problem is that we attempt to verify a heap page here but its > pd_special is BLCKSZ. This causes BrinPageType() to grab back a > position of the area at the exact end of the page, via > PageGetSpecialPointer(), hence the failure in reading two bytes > outside the page. The result here is that the set of defenses in > verify_brin_page() is poor: we should at least make sure that the > special area is available for a read. As far as I can see, this is > also possible in bt_page_items_bytea(), gist_page_opaque_info(), > gin_metapage_info() and gin_page_opaque_info(). All those code paths > should be protected with some checks on PageGetSpecialSize(), I > guess, before attempting to read the special area of the page. Hash > indexes are protected by checking the expected size of the special > area, and one code path of btree relies on the opened relation to be a > btree index. I worked on a patch to fix the problem. The functions you mention were indeed missing some check, but after digging a bit more I found that other problem existed. For instance, feeding a btree page to a gist_page_items_bytea() (both pages have the same special size) can be quite problematic too, as you can end up accessing bogus data when retrieving the items. I also added additional sanity checks with what is available (gist_page_id for gist, zero-level for btree leaf pages and so on), and tried to cover everything with tests. I'm a bit worried about the btree tests stability. I avoid emitting the level found to help with that, but it still depends on what other AM will put in their special page. From fa2841b83a57daf8de95e8b154afc2bf5dd60dda Mon Sep 17 00:00:00 2001 From: Julien Rouhaud Date: Thu, 24 Mar 2022 15:51:21 +0800 Subject: [PATCH v1] pageinspect: add more sanity checks when accessing pages. As seen in bug #16527, feeding an incorrect page can lead to ouf-of-bound reads when trying to access the page special area. To fix it, check that the special area has the expected size before trying to access in in the functions that weren't already doing so. While at it, add other checks when possible to validate that the given page is from the expected format to avoid other possible invalid access. Author: Julien Rouhaud Reported-By: Alexander Lakhin Discussion: https://postgr.es/m/16527-ef7606186f0610a1%40postgresql.org Discussion: https://postgr.es/m/150535d6-65ac-16ee-b28a-851965fc485b%40gmail.com --- contrib/pageinspect/brinfuncs.c| 18 + contrib/pageinspect/btreefuncs.c | 16 contrib/pageinspect/expected/brin.out | 12 + contrib/pageinspect/expected/btree.out | 23 +++-- contrib/pageinspect/expected/gin.out | 9 +++ contrib/pageinspect/expected/gist.out | 11 contrib/pageinspect/expected/hash.out | 14 +++ contrib/pageinspect/ginfuncs.c | 16 contrib/pageinspect/gistfuncs.c| 35 ++ contrib/pageinspect/sql/brin.sql | 5 contrib/pageinspect/sql/btree.sql | 13 -- contrib/pageinspect/sql/gin.sql| 3 +++ contrib/pageinspect/sql/gist.sql | 4 +++ contrib/pageinspect/sql/hash.sql | 11 14 files changed, 186 insertions(+), 4 deletions(-) diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c index bf12901ac3..d4de80eee5 100644 --- a/contrib/pageinspect/brinfuncs.c +++ b/contrib/pageinspect/brinfuncs.c @@ -58,6 +58,15 @@ brin_page_type(PG_FUNCTION_ARGS) page = get_page_from_raw(raw_page); + /* verify the special space has the expected size */ + if (PageGetSpecialSize(page) != MAXALIGN(sizeof(BrinSpecialSpace))) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), +errmsg("input page is not a valid brin page"), +errdetail("Special size %d, expected %d", + (int) PageGetSpecialSize(page), + (int) MAXALIGN(sizeof(BrinSpecialSpace); + switch (BrinPageType(page)) { case BRIN_PAGETYPE_META: @@ -86,6 +95,15 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype) { Pagepage = get_page_from_raw(raw_page); + /* verify the special space has the expected size */ + if (PageGetSpecialSize(page) != sizeof(BrinSpecialSpace)) + ereport(ERROR, +
Re: Assert in pageinspect with NULL pages
On Fri, Mar 18, 2022 at 06:43:39AM +0300, Alexander Lakhin wrote: > Hello Michael, > No, just x86_64, Ubuntu 20.04, gcc 11, valgrind 3.15. I just put that query > in page.sql and see the server abort. Bah, of course, I have missed the valgrind part of the problem. The problem is that we attempt to verify a heap page here but its pd_special is BLCKSZ. This causes BrinPageType() to grab back a position of the area at the exact end of the page, via PageGetSpecialPointer(), hence the failure in reading two bytes outside the page. The result here is that the set of defenses in verify_brin_page() is poor: we should at least make sure that the special area is available for a read. As far as I can see, this is also possible in bt_page_items_bytea(), gist_page_opaque_info(), gin_metapage_info() and gin_page_opaque_info(). All those code paths should be protected with some checks on PageGetSpecialSize(), I guess, before attempting to read the special area of the page. Hash indexes are protected by checking the expected size of the special area, and one code path of btree relies on the opened relation to be a btree index. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
Hello Michael, 18.03.2022 05:06, Michael Paquier wrote: Okay. I'll try to look more closely at that, then. It feels like we are just missing a MAXALIGN() in those code paths. Are you using any specific CFLAGS or something environment-sensitive (macos, 32b, etc.)? No, just x86_64, Ubuntu 20.04, gcc 11, valgrind 3.15. I just put that query in page.sql and see the server abort. Best regards, Alexander
Re: Assert in pageinspect with NULL pages
On Wed, Mar 16, 2022 at 08:35:59PM +0300, Alexander Lakhin wrote: > Yes, I've reproduced that issue on master (46d9bfb0). Okay. I'll try to look more closely at that, then. It feels like we are just missing a MAXALIGN() in those code paths. Are you using any specific CFLAGS or something environment-sensitive (macos, 32b, etc.)? > pageinspect-zeros-v2.patch doesn't help too. Well, the scope is different, so that's not a surprise. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
Hello Michael, 16.03.2022 10:39, Michael Paquier wrote: On Mon, Feb 21, 2022 at 10:00:00AM +0300, Alexander Lakhin wrote: Could you please confirm before committing the patchset that it fixes the bug #16527 [1]? Or maybe I could check it? (Original patch proposed by Daria doesn't cover that case, but if the patch going to be improved, probably it's worth fixing that bug too.) [1] https://www.postgresql.org/message-id/flat/16527-ef7606186f0610a1%40postgresql.org FWIW, I think that this one has been fixed by 076f4d9, where we make sure that the page is correctly aligned. Are you still seeing this problem? Yes, I've reproduced that issue on master (46d9bfb0). pageinspect-zeros-v2.patch doesn't help too. Best regards. Alexander
Re: Assert in pageinspect with NULL pages
On Wed, Feb 23, 2022 at 12:09:02PM +0500, Daria Lepikhova wrote: > And one more addition. In the previous version of the patch, I forgot to add > tests for the gist index, but the described problem is also relevant for it. So, I have looked at this second part of the thread, and concluded that we should not fail for empty pages. First, we fetch pages from the buffer pool in normal mode, where empty pages are valid. There is also a second point in favor of doing so: code paths dedicated to hash indexes already do that, marking such pages as simply "unused". The proposal from Julien upthread sounds cleaner to me though in the long run, as NULL gives the user the possibility to do a full-table scan with simple clauses to filter out anything found as NULL. Painting more PageIsNew() across the place requires a bit more work than a simple ereport(ERROR) in get_page_from_raw(), of course, but the gain is the portability of the functions. (One can have a lot of fun playing with random inputs and breaking most code paths, but that's not new.) -- Michael diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c index bf12901ac3..87e75f663c 100644 --- a/contrib/pageinspect/brinfuncs.c +++ b/contrib/pageinspect/brinfuncs.c @@ -58,6 +58,9 @@ brin_page_type(PG_FUNCTION_ARGS) page = get_page_from_raw(raw_page); + if (PageIsNew(page)) + PG_RETURN_NULL(); + switch (BrinPageType(page)) { case BRIN_PAGETYPE_META: @@ -86,6 +89,9 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype) { Page page = get_page_from_raw(raw_page); + if (PageIsNew(page)) + return page; + /* verify the special space says this page is what we want */ if (BrinPageType(page) != type) ereport(ERROR, @@ -138,6 +144,13 @@ brin_page_items(PG_FUNCTION_ARGS) /* minimally verify the page we got */ page = verify_brin_page(raw_page, BRIN_PAGETYPE_REGULAR, "regular"); + if (PageIsNew(page)) + { + brin_free_desc(bdesc); + index_close(indexRel, AccessShareLock); + PG_RETURN_NULL(); + } + /* * Initialize output functions for all indexed datatypes; simplifies * calling them later. @@ -313,6 +326,9 @@ brin_metapage_info(PG_FUNCTION_ARGS) page = verify_brin_page(raw_page, BRIN_PAGETYPE_META, "metapage"); + if (PageIsNew(page)) + PG_RETURN_NULL(); + /* Build a tuple descriptor for our result type */ if (get_call_result_type(fcinfo, NULL, ) != TYPEFUNC_COMPOSITE) elog(ERROR, "return type must be a row type"); @@ -364,6 +380,12 @@ brin_revmap_data(PG_FUNCTION_ARGS) /* minimally verify the page we got */ page = verify_brin_page(raw_page, BRIN_PAGETYPE_REVMAP, "revmap"); + if (PageIsNew(page)) + { + MemoryContextSwitchTo(mctx); + PG_RETURN_NULL(); + } + state = palloc(sizeof(*state)); state->tids = ((RevmapContents *) PageGetContents(page))->rm_tids; state->idx = 0; diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c index d9628dd664..dde640fd33 100644 --- a/contrib/pageinspect/btreefuncs.c +++ b/contrib/pageinspect/btreefuncs.c @@ -611,6 +611,12 @@ bt_page_items_bytea(PG_FUNCTION_ARGS) uargs->page = get_page_from_raw(raw_page); + if (PageIsNew(uargs->page)) + { + MemoryContextSwitchTo(mctx); + PG_RETURN_NULL(); + } + uargs->offset = FirstOffsetNumber; opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page); diff --git a/contrib/pageinspect/expected/brin.out b/contrib/pageinspect/expected/brin.out index 10cd36c177..15786a0c0d 100644 --- a/contrib/pageinspect/expected/brin.out +++ b/contrib/pageinspect/expected/brin.out @@ -52,4 +52,29 @@ SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx') CREATE INDEX test1_a_btree ON test1 (a); SELECT brin_page_items(get_raw_page('test1_a_btree', 0), 'test1_a_btree'); ERROR: "test1_a_btree" is not a BRIN index +-- Test for empty pages with raw input. +SHOW block_size \gset +SELECT brin_page_type(decode(repeat('00', :block_size), 'hex')); + brin_page_type + + +(1 row) + +SELECT brin_page_items(decode(repeat('00', :block_size), 'hex'), 'test1_a_idx'); + brin_page_items +- +(0 rows) + +SELECT brin_metapage_info(decode(repeat('00', :block_size), 'hex')); + brin_metapage_info + + +(1 row) + +SELECT brin_revmap_data(decode(repeat('00', :block_size), 'hex')); + brin_revmap_data +-- + +(1 row) + DROP TABLE test1; diff --git a/contrib/pageinspect/expected/btree.out b/contrib/pageinspect/expected/btree.out index 80b3dfe861..8815b56b15 100644 --- a/contrib/pageinspect/expected/btree.out +++ b/contrib/pageinspect/expected/btree.out @@ -85,4 +85,10 @@ ERROR: "test1_a_hash" is not a btree index SELECT bt_page_items('aaa'::bytea); ERROR: invalid page size \set VERBOSITY default +-- Test for empty pages with raw input. +SHOW block_size \gset +SELECT bt_page_items(decode(repeat('00', :block_size), 'hex')); +-[ RECORD 1 ]-+- +bt_page_items | + DROP TABLE test1; diff
Re: Assert in pageinspect with NULL pages
On Mon, Feb 21, 2022 at 10:00:00AM +0300, Alexander Lakhin wrote: > Could you please confirm before committing the patchset that it fixes > the bug #16527 [1]? Or maybe I could check it? > (Original patch proposed by Daria doesn't cover that case, but if the > patch going to be improved, probably it's worth fixing that bug too.) > > [1] > https://www.postgresql.org/message-id/flat/16527-ef7606186f0610a1%40postgresql.org FWIW, I think that this one has been fixed by 076f4d9, where we make sure that the page is correctly aligned. Are you still seeing this problem? -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
On Tue, Mar 15, 2022 at 06:56:46AM -0500, Justin Pryzby wrote: > On Tue, Mar 15, 2022 at 06:32:44PM +0900, Michael Paquier wrote: >> +-- Suppress the DETAIL message, to allow the tests to work across various >> +-- default page sizes. > > I think you mean "various non-default page sizes" or "various page sizes". Thanks. Indeed, this sounded a bit weird. I have been able to look at all that this morning and backpatched this first part. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
On Tue, Mar 15, 2022 at 06:32:44PM +0900, Michael Paquier wrote: > + if (!IS_BRIN(indexRel)) > + ereport(ERROR, > + (errcode(ERRCODE_WRONG_OBJECT_TYPE), > + errmsg("\"%s\" is not a %s index", > + > RelationGetRelationName(indexRel), "BRIN"))); If it were me, I'd write this without the extra parens around (errcode()). > +-- Suppress the DETAIL message, to allow the tests to work across various > +-- default page sizes. I think you mean "various non-default page sizes" or "various page sizes". -- Justin
Re: Assert in pageinspect with NULL pages
On Thu, Feb 17, 2022 at 09:00:20PM -0600, Justin Pryzby wrote: > BRIN can also crash if passed a non-brin index. > > I've been sitting on this one for awhile. Feel free to include it in your > patchset. So, I have begun a close lookup of this thread, and the problem you are reporting here is worth fixing on its own, before we do something for new pages as reported originally. There is more that caught my attention than the brin AM not being check properly, because a couple of code paths are fuzzy with the checks on the page sizes. My impression of the whole thing is that we'd better generalize the use of get_page_from_raw(), improving the code on many sides when the user can provide a raw page in input (also I'd like to think that it is a better practice anyway as any of those functions may finish to look 8-byte fields, but the current coding would silently break in alignment-picky environments): - Some code paths (hash, btree) would trigger elog(ERROR) but we cannot use that for errors that can be triggered by the user. - bt_page_items_bytea(), page_header() and fsm_page_contents() were fuzzy on the page size check, so it was rather easy to generate garbage with random data. - page_checksum_internal() used a PageHeader, where I would have expected a Page. - More consistency in the error strings, meaning less contents to translate in the long-term. This first batch leads me to the attached, with tests to stress all that for all the functions taking raw pages in input. -- Michael From 588ffddf2bd2c0d1e6168a2e7093c2488caec94b Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Tue, 15 Mar 2022 17:59:04 +0900 Subject: [PATCH] Fixes for pageinspect with page sizes --- contrib/pageinspect/brinfuncs.c| 36 ++ contrib/pageinspect/btreefuncs.c | 28 ++-- contrib/pageinspect/expected/brin.out | 4 +++ contrib/pageinspect/expected/btree.out | 15 +++ contrib/pageinspect/expected/gin.out | 11 contrib/pageinspect/expected/gist.out | 15 +++ contrib/pageinspect/expected/hash.out | 17 contrib/pageinspect/expected/page.out | 11 contrib/pageinspect/fsmfuncs.c | 4 ++- contrib/pageinspect/gistfuncs.c| 9 +++ contrib/pageinspect/hashfuncs.c| 6 +++-- contrib/pageinspect/rawpage.c | 29 +++-- contrib/pageinspect/sql/brin.sql | 4 +++ contrib/pageinspect/sql/btree.sql | 13 ++ contrib/pageinspect/sql/gin.sql| 9 +++ contrib/pageinspect/sql/gist.sql | 13 ++ contrib/pageinspect/sql/hash.sql | 13 ++ contrib/pageinspect/sql/page.sql | 9 +++ 18 files changed, 179 insertions(+), 67 deletions(-) diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c index b7c8365218..bd0ea8b18c 100644 --- a/contrib/pageinspect/brinfuncs.c +++ b/contrib/pageinspect/brinfuncs.c @@ -16,6 +16,7 @@ #include "access/brin_tuple.h" #include "access/htup_details.h" #include "catalog/index.h" +#include "catalog/pg_am_d.h" #include "catalog/pg_type.h" #include "funcapi.h" #include "lib/stringinfo.h" @@ -31,6 +32,8 @@ PG_FUNCTION_INFO_V1(brin_page_items); PG_FUNCTION_INFO_V1(brin_metapage_info); PG_FUNCTION_INFO_V1(brin_revmap_data); +#define IS_BRIN(r) ((r)->rd_rel->relam == BRIN_AM_OID) + typedef struct brin_column_state { int nstored; @@ -45,8 +48,7 @@ Datum brin_page_type(PG_FUNCTION_ARGS) { bytea *raw_page = PG_GETARG_BYTEA_P(0); - Page page = VARDATA(raw_page); - int raw_page_size; + Page page; char *type; if (!superuser()) @@ -54,14 +56,7 @@ brin_page_type(PG_FUNCTION_ARGS) (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), errmsg("must be superuser to use raw page functions"))); - raw_page_size = VARSIZE(raw_page) - VARHDRSZ; - - if (raw_page_size != BLCKSZ) - ereport(ERROR, -(errcode(ERRCODE_INVALID_PARAMETER_VALUE), - errmsg("input page too small"), - errdetail("Expected size %d, got %d", - BLCKSZ, raw_page_size))); + page = get_page_from_raw(raw_page); switch (BrinPageType(page)) { @@ -89,19 +84,7 @@ brin_page_type(PG_FUNCTION_ARGS) static Page verify_brin_page(bytea *raw_page, uint16 type, const char *strtype) { - Page page; - int raw_page_size; - - raw_page_size = VARSIZE(raw_page) - VARHDRSZ; - - if (raw_page_size != BLCKSZ) - ereport(ERROR, -(errcode(ERRCODE_INVALID_PARAMETER_VALUE), - errmsg("input page too small"), - errdetail("Expected size %d, got %d", - BLCKSZ, raw_page_size))); - - page = VARDATA(raw_page); + Page page = get_page_from_raw(raw_page); /* verify the special space says this page is what we want */ if (BrinPageType(page) != type) @@ -143,6 +126,13 @@ brin_page_items(PG_FUNCTION_ARGS) SetSingleFuncCall(fcinfo, 0); indexRel = index_open(indexRelid, AccessShareLock); + + if (!IS_BRIN(indexRel)) + ereport(ERROR, +(errcode(ERRCODE_WRONG_OBJECT_TYPE),
Re: Assert in pageinspect with NULL pages
18.02.2022 08:00, Justin Pryzby пишет: BRIN can also crash if passed a non-brin index. I've been sitting on this one for awhile. Feel free to include it in your patchset. commit 08010a6037fc4e24a9ba05e5386e766f4310d35e Author: Justin Pryzby Date: Tue Jan 19 00:25:15 2021 -0600 pageinspect: brin_page_items(): check that given relation is not only an index, but a brin one diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c index f1e64a39ef2..3de6dd943ef 100644 --- a/contrib/pageinspect/brinfuncs.c +++ b/contrib/pageinspect/brinfuncs.c @@ -16,6 +16,7 @@ #include "access/brin_tuple.h" #include "access/htup_details.h" #include "catalog/index.h" +#include "catalog/pg_am.h" #include "catalog/pg_type.h" #include "funcapi.h" #include "lib/stringinfo.h" @@ -59,7 +60,7 @@ brin_page_type(PG_FUNCTION_ARGS) if (raw_page_size != BLCKSZ) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), -errmsg("input page too small"), +errmsg("input page wrong size"), errdetail("Expected size %d, got %d", BLCKSZ, raw_page_size))); @@ -97,7 +98,7 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype) if (raw_page_size != BLCKSZ) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), -errmsg("input page too small"), +errmsg("input page wrong size"), errdetail("Expected size %d, got %d", BLCKSZ, raw_page_size))); @@ -169,7 +170,14 @@ brin_page_items(PG_FUNCTION_ARGS) MemoryContextSwitchTo(oldcontext); indexRel = index_open(indexRelid, AccessShareLock); - bdesc = brin_build_desc(indexRel); + + /* Must be a BRIN index */ + if (indexRel->rd_rel->relkind != RELKIND_INDEX || + indexRel->rd_rel->relam != BRIN_AM_OID) + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), +errmsg("\"%s\" is not a BRIN index", +RelationGetRelationName(indexRel; /* minimally verify the page we got */ page = verify_brin_page(raw_page, BRIN_PAGETYPE_REGULAR, "regular"); @@ -178,6 +186,7 @@ brin_page_items(PG_FUNCTION_ARGS) * Initialize output functions for all indexed datatypes; simplifies * calling them later. */ + bdesc = brin_build_desc(indexRel); columns = palloc(sizeof(brin_column_state *) * RelationGetDescr(indexRel)->natts); for (attno = 1; attno <= bdesc->bd_tupdesc->natts; attno++) { Thanks! This is a very valuable note. I think I will definitely add it to the next version of the patch, as soon as I deal with the error exception. -- Daria Lepikhova Postgres Professional
Re: Assert in pageinspect with NULL pages
First of all, thanks for the review and remarks! 18.02.2022 08:02, Michael Paquier пишет: On Thu, Feb 17, 2022 at 05:40:41PM +0800, Julien Rouhaud wrote: About the patch, it's incorrectly using a hardcoded 8192 block-size rather than using the computed :block_size (or computing one when it's not already the case). Well, the tests of pageinspect fail would already fail when using a different page size than 8k, like the checksum ones :) Anywa, I agree with your point that if this is not a reason to not make the tests more portable if we can do easily. Fixed. I'm also wondering if it wouldn't be better to return NULL rather than throwing an error for uninitialized pages. Agreed that this is a sensible choice. NULL would be helpful for the case where one compiles all the checksums of a relation with a full scan based on the relation size, for example. All these behaviors ought to be documented properly, as well. For SRFs, this should just return no rows rather than generating an error. So, I tried to implement this remark. However, further getting rid of throwing an error and replacing it with a NULL return led to reproduce the problem that Alexander Lakhin mentioned And here I need little more time to figure it out. And one more addition. In the previous version of the patch, I forgot to add tests for the gist index, but the described problem is also relevant for it. -- Daria Lepikhova Postgres Professional From 7dbfa7f01bf485d2812b24af012721b4a1e1e785 Mon Sep 17 00:00:00 2001 From: "d.lepikhova" Date: Tue, 22 Feb 2022 16:25:31 +0500 Subject: [PATCH] Add checks for null pages into pageinspect --- contrib/pageinspect/brinfuncs.c | 10 ++ contrib/pageinspect/btreefuncs.c | 10 ++ contrib/pageinspect/expected/brin.out | 13 + contrib/pageinspect/expected/btree.out| 5 + contrib/pageinspect/expected/checksum.out | 7 +++ contrib/pageinspect/expected/gin.out | 7 +++ contrib/pageinspect/expected/gist.out | 5 + contrib/pageinspect/expected/hash.out | 9 + contrib/pageinspect/rawpage.c | 10 ++ contrib/pageinspect/sql/brin.sql | 6 ++ contrib/pageinspect/sql/btree.sql | 3 +++ contrib/pageinspect/sql/checksum.sql | 3 +++ contrib/pageinspect/sql/gin.sql | 5 + contrib/pageinspect/sql/gist.sql | 5 + contrib/pageinspect/sql/hash.sql | 6 ++ 15 files changed, 104 insertions(+) diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c index 50892b5cc20..a2fee13e2d1 100644 --- a/contrib/pageinspect/brinfuncs.c +++ b/contrib/pageinspect/brinfuncs.c @@ -63,6 +63,10 @@ brin_page_type(PG_FUNCTION_ARGS) errdetail("Expected size %d, got %d", BLCKSZ, raw_page_size))); + /* check that the page is initialized before accessing any fields */ + if (PageIsNew(page)) + PG_RETURN_NULL(); + switch (BrinPageType(page)) { case BRIN_PAGETYPE_META: @@ -103,6 +107,12 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype) page = VARDATA(raw_page); + /* check that the page is initialized before accessing any fields */ + if (PageIsNew(page)) + ereport(ERROR, +(errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("input page is empty"))); + /* verify the special space says this page is what we want */ if (BrinPageType(page) != type) ereport(ERROR, diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c index 03debe336ba..5ea7af5b998 100644 --- a/contrib/pageinspect/btreefuncs.c +++ b/contrib/pageinspect/btreefuncs.c @@ -519,6 +519,12 @@ bt_page_items_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version) UnlockReleaseBuffer(buffer); relation_close(rel, AccessShareLock); + /* check that the page is initialized before accessing any fields */ + if (PageIsNew(uargs->page)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("input page is empty"))); + uargs->offset = FirstOffsetNumber; opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page); @@ -615,6 +621,10 @@ bt_page_items_bytea(PG_FUNCTION_ARGS) uargs->page = VARDATA(raw_page); + /* check that the page is initialized before accessing any fields */ + if (PageIsNew(uargs->page)) + PG_RETURN_NULL(); + uargs->offset = FirstOffsetNumber; opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page); diff --git a/contrib/pageinspect/expected/brin.out b/contrib/pageinspect/expected/brin.out index 71eb190380c..2839ee9435b 100644 --- a/contrib/pageinspect/expected/brin.out +++ b/contrib/pageinspect/expected/brin.out @@ -48,4 +48,17 @@ SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx') 1 | 0 | 1 | f| f| f | {1 .. 1} (1 row) +-- Check that all functions fail gracefully with an empty page imput +select brin_page_type(repeat(E'\\000',
Re: Assert in pageinspect with NULL pages
On Mon, Feb 21, 2022 at 10:00:00AM +0300, Alexander Lakhin wrote: > Could you please confirm before committing the patchset that it fixes > the bug #16527 [1]? Or maybe I could check it? > (Original patch proposed by Daria doesn't cover that case, but if the > patch going to be improved, probably it's worth fixing that bug too.) I have not directly checked, but that looks like the same issue to me. By the way, I was waiting for an updated patch set, based on the review provided upthread: https://www.postgresql.org/message-id/Yg8MPrV49/9us...@paquier.xyz And it seems better to group everything in a single commit as the same areas are touched. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
Hello Michael, 18.02.2022 06:07, Michael Paquier wrpte: > On Thu, Feb 17, 2022 at 09:00:20PM -0600, Justin Pryzby wrote: >> BRIN can also crash if passed a non-brin index. >> >> I've been sitting on this one for awhile. Feel free to include it in your >> patchset. > Ugh. Thanks! I am keeping a note about this one. Could you please confirm before committing the patchset that it fixes the bug #16527 [1]? Or maybe I could check it? (Original patch proposed by Daria doesn't cover that case, but if the patch going to be improved, probably it's worth fixing that bug too.) [1] https://www.postgresql.org/message-id/flat/16527-ef7606186f0610a1%40postgresql.org Best regards, Alexander
Re: Assert in pageinspect with NULL pages
On Thu, Feb 17, 2022 at 09:00:20PM -0600, Justin Pryzby wrote: > BRIN can also crash if passed a non-brin index. > > I've been sitting on this one for awhile. Feel free to include it in your > patchset. Ugh. Thanks! I am keeping a note about this one. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
On Thu, Feb 17, 2022 at 05:40:41PM +0800, Julien Rouhaud wrote: > About the patch, it's incorrectly using a hardcoded 8192 block-size rather > than > using the computed :block_size (or computing one when it's not already the > case). Well, the tests of pageinspect fail would already fail when using a different page size than 8k, like the checksum ones :) Anywa, I agree with your point that if this is not a reason to not make the tests more portable if we can do easily. > I'm also wondering if it wouldn't be better to return NULL rather than > throwing > an error for uninitialized pages. Agreed that this is a sensible choice. NULL would be helpful for the case where one compiles all the checksums of a relation with a full scan based on the relation size, for example. All these behaviors ought to be documented properly, as well. For SRFs, this should just return no rows rather than generating an error. > While at it, it could also be worthwhile to add tests for invalid blkno and > block size in page_checksum(). If we are on that, we could also have tests for the various "input page too small" errors. Now, these are just improvements, so I'd rather treat this part separately of the issue reported, and add the extra tests only on HEAD. I can't help but notice that we should have a similar protection on PageIsNew() in heap_page_items()? Now the code happens to work for an empty page thanks to PageGetMaxOffsetNumber(), but this extra protection would make the code more solid in the long-term IMO for the full-zero case? It is worth noting that, in pageinspect, two of the heap functions are the only ones to not be strict.. Something similar could be said about page_header() in rawpage.c, though it is not wrong to return only zeros for a page full of zeros. Shouldn't we similarly care about bt_metap() and bt_page_stats_internal(), both callers of ReadBuffer(). The root point is that the RBM_NORMAL mode of ReadBufferExtended() considers a page full of zeros as valid, as per its top comments. For the same reason, get_raw_page_internal() would work just fine and return a page full of zeros. Also, instead of an error in get_page_from_raw(), we should make sure that all its callers are able to map with the case of a new page, where we return 8kB worth of zeros from this inner routine. -- Michael signature.asc Description: PGP signature
Re: Assert in pageinspect with NULL pages
BRIN can also crash if passed a non-brin index. I've been sitting on this one for awhile. Feel free to include it in your patchset. commit 08010a6037fc4e24a9ba05e5386e766f4310d35e Author: Justin Pryzby Date: Tue Jan 19 00:25:15 2021 -0600 pageinspect: brin_page_items(): check that given relation is not only an index, but a brin one diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c index f1e64a39ef2..3de6dd943ef 100644 --- a/contrib/pageinspect/brinfuncs.c +++ b/contrib/pageinspect/brinfuncs.c @@ -16,6 +16,7 @@ #include "access/brin_tuple.h" #include "access/htup_details.h" #include "catalog/index.h" +#include "catalog/pg_am.h" #include "catalog/pg_type.h" #include "funcapi.h" #include "lib/stringinfo.h" @@ -59,7 +60,7 @@ brin_page_type(PG_FUNCTION_ARGS) if (raw_page_size != BLCKSZ) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), -errmsg("input page too small"), +errmsg("input page wrong size"), errdetail("Expected size %d, got %d", BLCKSZ, raw_page_size))); @@ -97,7 +98,7 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype) if (raw_page_size != BLCKSZ) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), -errmsg("input page too small"), +errmsg("input page wrong size"), errdetail("Expected size %d, got %d", BLCKSZ, raw_page_size))); @@ -169,7 +170,14 @@ brin_page_items(PG_FUNCTION_ARGS) MemoryContextSwitchTo(oldcontext); indexRel = index_open(indexRelid, AccessShareLock); - bdesc = brin_build_desc(indexRel); + + /* Must be a BRIN index */ + if (indexRel->rd_rel->relkind != RELKIND_INDEX || + indexRel->rd_rel->relam != BRIN_AM_OID) + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), +errmsg("\"%s\" is not a BRIN index", +RelationGetRelationName(indexRel; /* minimally verify the page we got */ page = verify_brin_page(raw_page, BRIN_PAGETYPE_REGULAR, "regular"); @@ -178,6 +186,7 @@ brin_page_items(PG_FUNCTION_ARGS) * Initialize output functions for all indexed datatypes; simplifies * calling them later. */ + bdesc = brin_build_desc(indexRel); columns = palloc(sizeof(brin_column_state *) * RelationGetDescr(indexRel)->natts); for (attno = 1; attno <= bdesc->bd_tupdesc->natts; attno++) {
Re: Assert in pageinspect with NULL pages
On Thu, Feb 17, 2022 at 05:57:49PM +0900, Michael Paquier wrote: > On Thu, Feb 17, 2022 at 01:46:40PM +0500, Daria Lepikhova wrote: > > INSERT INTO test1(y) SELECT 0 FROM generate_series(1,1E6) AS x; > > SELECT page_checksum(repeat(E'\\000', 8192)::bytea, 1); > > Indeed. Good catch, and that seems pretty old at quick glance for the > checksum part. I'll try to look at all that tomorrow. Indeed, the problem in page_checksum() has been there since the beginning as far as I can see. About the patch, it's incorrectly using a hardcoded 8192 block-size rather than using the computed :block_size (or computing one when it's not already the case). I'm also wondering if it wouldn't be better to return NULL rather than throwing an error for uninitialized pages. While at it, it could also be worthwhile to add tests for invalid blkno and block size in page_checksum().
Re: Assert in pageinspect with NULL pages
On Thu, Feb 17, 2022 at 01:46:40PM +0500, Daria Lepikhova wrote: > INSERT INTO test1(y) SELECT 0 FROM generate_series(1,1E6) AS x; > SELECT page_checksum(repeat(E'\\000', 8192)::bytea, 1); Indeed. Good catch, and that seems pretty old at quick glance for the checksum part. I'll try to look at all that tomorrow. -- Michael signature.asc Description: PGP signature