Re: [HACKERS] pageinspect and hash indexes
On Fri, Mar 24, 2017 at 3:44 PM, Ashutosh Sharma wrote: > Agreed. Moreover, previous approach might even change the current > behaviour of functions like hash_page_items() and hash_page_stats() > basically when we pass UNUSED pages to these functions. Attached is > the newer version of patch. This patch doesn't completely solve the problem: pgbench -i -s 40 alter table pgbench_accounts drop constraint pgbench_accounts_pkey; create index on pgbench_accounts using hash (aid); insert into pgbench_accounts select * from pgbench_accounts; select hash_page_type, min(n), max(n), count(*) from (select n, hash_page_type(get_raw_page('pgbench_accounts_aid_idx'::text, n)) from generate_series(0, (pg_relation_size('pgbench_accounts_aid_idx')/8192)::integer - 1) n) x group by 1; ERROR: index table contains zero page This happens because of the sparse allocation forced by _hash_alloc_buckets. Perhaps the real fix is to have that function initialize all of the pages instead of just the last one, but unless and until we decide to do that, we've got to cope with zero pages in the index. Actually, even if we did fix that I think we'd still need to do this, because the way relation extension works in general is that we first write a page of zeroes and then go back and fill in the data; an intervening crash can leave behind the intermediate state. A second problem is that the one page initialized by _hash_alloc_buckets needs to end up with a valid special space; otherwise, you get the same error Jeff complained about originally when you try to use hash_page_type() on it. I fixed those issues and committed this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
Thanks for reviewing my patch. I have removed the extra white space. Attached are both the patches. >>> >>> Sorry, I have mistakenly attached wrong patch. Here are the correct >>> set of patches. >> >> The latest version of patches looks fine to me. > > I don't really like 0002. What about this, instead? > > --- a/contrib/pageinspect/hashfuncs.c > +++ b/contrib/pageinspect/hashfuncs.c > @@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags) > /* Check that page type is sane. */ > pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE; > if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE && > -pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE) > +pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE && > +pagetype != LH_UNUSED_PAGE) > ereport(ERROR, > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > errmsg("invalid hash page type %08x", pagetype))); > > The advantage of that is (1) it won't get confused if in the future we > have an unused page that has some flag bit not in LH_PAGE_TYPE set, > and (2) if in the future we want to add any other checks to this > function which should apply to unused pages also, they won't get > bypassed by an early return statement. Agreed. Moreover, previous approach might even change the current behaviour of functions like hash_page_items() and hash_page_stats() basically when we pass UNUSED pages to these functions. Attached is the newer version of patch. From 5c54fa58895cd279ff44424a3eb1feafdaca69d5 Mon Sep 17 00:00:00 2001 From: ashu Date: Sat, 25 Mar 2017 01:02:35 +0530 Subject: [PATCH] Allow pageinspect to handle UNUSED hash pages v3 --- contrib/pageinspect/hashfuncs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c index 812a03f..2632287 100644 --- a/contrib/pageinspect/hashfuncs.c +++ b/contrib/pageinspect/hashfuncs.c @@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags) /* Check that page type is sane. */ pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE; if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE && - pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE) + pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE && + pagetype != LH_UNUSED_PAGE) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("invalid hash page type %08x", pagetype))); -- 1.8.3.1 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Fri, Mar 24, 2017 at 3:54 AM, Amit Kapila wrote: > On Fri, Mar 24, 2017 at 9:50 AM, Ashutosh Sharma > wrote: >> On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma >> wrote: >>> >>> Thanks for reviewing my patch. I have removed the extra white space. >>> Attached are both the patches. >> >> Sorry, I have mistakenly attached wrong patch. Here are the correct >> set of patches. > > The latest version of patches looks fine to me. I don't really like 0002. What about this, instead? --- a/contrib/pageinspect/hashfuncs.c +++ b/contrib/pageinspect/hashfuncs.c @@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags) /* Check that page type is sane. */ pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE; if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE && -pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE) +pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE && +pagetype != LH_UNUSED_PAGE) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("invalid hash page type %08x", pagetype))); The advantage of that is (1) it won't get confused if in the future we have an unused page that has some flag bit not in LH_PAGE_TYPE set, and (2) if in the future we want to add any other checks to this function which should apply to unused pages also, they won't get bypassed by an early return statement. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Fri, Mar 24, 2017 at 9:50 AM, Ashutosh Sharma wrote: > On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma > wrote: >> >> Thanks for reviewing my patch. I have removed the extra white space. >> Attached are both the patches. > > Sorry, I have mistakenly attached wrong patch. Here are the correct > set of patches. > The latest version of patches looks fine to me. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma wrote: > On Fri, Mar 24, 2017 at 9:21 AM, Amit Kapila wrote: >> On Thu, Mar 23, 2017 at 6:46 PM, Ashutosh Sharma >> wrote: Oh, okay, but my main objection was that we should not check hash page type (hasho_flag) without ensuring whether it is a hash page. Can you try to adjust the above code so that this check can be moved after hasho_page_id check? >>> >>> Yes, I got your point. I have done that but then i had to remove the >>> check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will >>> only be true for one page in entire hash index table and can be >>> ignored. If you wish, I could mention about it in the documentation. >>> >> >> Yeah, I think it is worth adding a Note in docs, but we can do that >> separately if required. >> > To avoid > this, at the start of verify_hash_page function itself if we recognise > page as UNUSED page we return immediately. > >> >> 2. >> + /* Check if it is an empty hash page. */ >> + if (PageIsEmpty(page)) >> + ereport(ERROR, >> + (errcode(ERRCODE_INDEX_CORRUPTED), >> + errmsg("index table contains empty page"))); >> >> >> Do we want to give a separate message for EMPTY and NEW pages? Isn't >> it better that the same error message can be given for both of them as >> from user perspective there is not much difference between both the >> messages? > > I think we should show separate message because they are two different > type of pages. In the sense like, one is initialised whereas other is > completely zero. > I understand your point, but not sure if it makes any difference to user. >>> >>> okay, I have now anyways removed the check for PageIsEmpty. Please >>> refer to the attached '0002 >>> allow_pageinspect_handle_UNUSED_hash_pages.patch' >>> >> >> + >> if (pageopaque->hasho_page_id != HASHO_PAGE_ID) >> ereport(ERROR, >> >> spurious white space. >> >>> >>> Also, I have attached >>> '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that >>> handles your comment mentioned in [1]. >>> >> >> In general, we have to initialize prevblock with max_bucket, but here >> it is okay, because we anyway initialize it after this page is >> allocated as overflow page. >> >> Your patches look good to me except for small white space change. > > Thanks for reviewing my patch. I have removed the extra white space. > Attached are both the patches. Sorry, I have mistakenly attached wrong patch. Here are the correct set of patches. -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com From 86eb711ea80b495c9d2f5228558682d0e95c41eb Mon Sep 17 00:00:00 2001 From: ashu Date: Thu, 23 Mar 2017 16:47:17 +0530 Subject: [PATCH] Mark freed overflow page as UNUSED pagetype v2 --- src/backend/access/hash/hash_xlog.c | 9 + src/backend/access/hash/hashovfl.c | 13 - 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c index d9ac42c..2ccaf46 100644 --- a/src/backend/access/hash/hash_xlog.c +++ b/src/backend/access/hash/hash_xlog.c @@ -697,11 +697,20 @@ hash_xlog_squeeze_page(XLogReaderState *record) if (XLogReadBufferForRedo(record, 2, &ovflbuf) == BLK_NEEDS_REDO) { Page ovflpage; + HashPageOpaque ovflopaque; ovflpage = BufferGetPage(ovflbuf); _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf)); + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage); + + ovflopaque->hasho_prevblkno = InvalidBlockNumber; + ovflopaque->hasho_nextblkno = InvalidBlockNumber; + ovflopaque->hasho_bucket = -1; + ovflopaque->hasho_flag = LH_UNUSED_PAGE; + ovflopaque->hasho_page_id = HASHO_PAGE_ID; + PageSetLSN(ovflpage, lsn); MarkBufferDirty(ovflbuf); } diff --git a/src/backend/access/hash/hashovfl.c b/src/backend/access/hash/hashovfl.c index a3cae21..d3eaee0 100644 --- a/src/backend/access/hash/hashovfl.c +++ b/src/backend/access/hash/hashovfl.c @@ -591,9 +591,20 @@ _hash_freeovflpage(Relation rel, Buffer bucketbuf, Buffer ovflbuf, /* * Initialize the freed overflow page. Just zeroing the page won't work, * because WAL replay routines expect pages to be initialized. See - * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. + * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. Also, mark + * the page as UNUSED type and retain it's page id. This allows the tools + * like pageinspect to consider it as a hash page. */ _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf)); + + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage); + + ovflopaque->hasho_prevblkno = InvalidBlockNumber; + ovflopaque->hasho_nextblkno = InvalidBlockNumber; + ovflopaque->hasho_bucket = -1; + ovflopaque->hasho_flag = LH_UNUSED_PAGE; + ovflopaque->hasho_page_id = HASHO_PAGE_ID; + MarkBufferDirty(ovflbuf); if
Re: [HACKERS] pageinspect and hash indexes
On Fri, Mar 24, 2017 at 9:21 AM, Amit Kapila wrote: > On Thu, Mar 23, 2017 at 6:46 PM, Ashutosh Sharma > wrote: >>> >>> Oh, okay, but my main objection was that we should not check hash page >>> type (hasho_flag) without ensuring whether it is a hash page. Can you >>> try to adjust the above code so that this check can be moved after >>> hasho_page_id check? >> >> Yes, I got your point. I have done that but then i had to remove the >> check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will >> only be true for one page in entire hash index table and can be >> ignored. If you wish, I could mention about it in the documentation. >> > > Yeah, I think it is worth adding a Note in docs, but we can do that > separately if required. > >>> To avoid this, at the start of verify_hash_page function itself if we recognise page as UNUSED page we return immediately. > > 2. > + /* Check if it is an empty hash page. */ > + if (PageIsEmpty(page)) > + ereport(ERROR, > + (errcode(ERRCODE_INDEX_CORRUPTED), > + errmsg("index table contains empty page"))); > > > Do we want to give a separate message for EMPTY and NEW pages? Isn't > it better that the same error message can be given for both of them as > from user perspective there is not much difference between both the > messages? I think we should show separate message because they are two different type of pages. In the sense like, one is initialised whereas other is completely zero. >>> >>> I understand your point, but not sure if it makes any difference to user. >>> >> >> okay, I have now anyways removed the check for PageIsEmpty. Please >> refer to the attached '0002 >> allow_pageinspect_handle_UNUSED_hash_pages.patch' >> > > + > if (pageopaque->hasho_page_id != HASHO_PAGE_ID) > ereport(ERROR, > > spurious white space. > >> >> Also, I have attached >> '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that >> handles your comment mentioned in [1]. >> > > In general, we have to initialize prevblock with max_bucket, but here > it is okay, because we anyway initialize it after this page is > allocated as overflow page. > > Your patches look good to me except for small white space change. Thanks for reviewing my patch. I have removed the extra white space. Attached are both the patches. -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com From 86eb711ea80b495c9d2f5228558682d0e95c41eb Mon Sep 17 00:00:00 2001 From: ashu Date: Thu, 23 Mar 2017 16:47:17 +0530 Subject: [PATCH] Mark freed overflow page as UNUSED pagetype v2 --- src/backend/access/hash/hash_xlog.c | 9 + src/backend/access/hash/hashovfl.c | 13 - 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c index d9ac42c..2ccaf46 100644 --- a/src/backend/access/hash/hash_xlog.c +++ b/src/backend/access/hash/hash_xlog.c @@ -697,11 +697,20 @@ hash_xlog_squeeze_page(XLogReaderState *record) if (XLogReadBufferForRedo(record, 2, &ovflbuf) == BLK_NEEDS_REDO) { Page ovflpage; + HashPageOpaque ovflopaque; ovflpage = BufferGetPage(ovflbuf); _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf)); + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage); + + ovflopaque->hasho_prevblkno = InvalidBlockNumber; + ovflopaque->hasho_nextblkno = InvalidBlockNumber; + ovflopaque->hasho_bucket = -1; + ovflopaque->hasho_flag = LH_UNUSED_PAGE; + ovflopaque->hasho_page_id = HASHO_PAGE_ID; + PageSetLSN(ovflpage, lsn); MarkBufferDirty(ovflbuf); } diff --git a/src/backend/access/hash/hashovfl.c b/src/backend/access/hash/hashovfl.c index a3cae21..d3eaee0 100644 --- a/src/backend/access/hash/hashovfl.c +++ b/src/backend/access/hash/hashovfl.c @@ -591,9 +591,20 @@ _hash_freeovflpage(Relation rel, Buffer bucketbuf, Buffer ovflbuf, /* * Initialize the freed overflow page. Just zeroing the page won't work, * because WAL replay routines expect pages to be initialized. See - * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. + * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. Also, mark + * the page as UNUSED type and retain it's page id. This allows the tools + * like pageinspect to consider it as a hash page. */ _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf)); + + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage); + + ovflopaque->hasho_prevblkno = InvalidBlockNumber; + ovflopaque->hasho_nextblkno = InvalidBlockNumber; + ovflopaque->hasho_bucket = -1; + ovflopaque->hasho_flag = LH_UNUSED_PAGE; + ovflopaque->hasho_page_id = HASHO_PAGE_ID; + MarkBufferDirty(ovflbuf); if (BufferIsValid(prevbuf)) -- 1.8.3.1 diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c index 812a03f..6ba8847 100644 --- a/contrib/pageinspect/hashfuncs.c +++ b/contrib/pageinspect/has
Re: [HACKERS] pageinspect and hash indexes
On Thu, Mar 23, 2017 at 6:46 PM, Ashutosh Sharma wrote: >> >> Oh, okay, but my main objection was that we should not check hash page >> type (hasho_flag) without ensuring whether it is a hash page. Can you >> try to adjust the above code so that this check can be moved after >> hasho_page_id check? > > Yes, I got your point. I have done that but then i had to remove the > check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will > only be true for one page in entire hash index table and can be > ignored. If you wish, I could mention about it in the documentation. > Yeah, I think it is worth adding a Note in docs, but we can do that separately if required. >> >>> To avoid >>> this, at the start of verify_hash_page function itself if we recognise >>> page as UNUSED page we return immediately. >>> 2. + /* Check if it is an empty hash page. */ + if (PageIsEmpty(page)) + ereport(ERROR, + (errcode(ERRCODE_INDEX_CORRUPTED), + errmsg("index table contains empty page"))); Do we want to give a separate message for EMPTY and NEW pages? Isn't it better that the same error message can be given for both of them as from user perspective there is not much difference between both the messages? >>> >>> I think we should show separate message because they are two different >>> type of pages. In the sense like, one is initialised whereas other is >>> completely zero. >>> >> >> I understand your point, but not sure if it makes any difference to user. >> > > okay, I have now anyways removed the check for PageIsEmpty. Please > refer to the attached '0002 > allow_pageinspect_handle_UNUSED_hash_pages.patch' > + if (pageopaque->hasho_page_id != HASHO_PAGE_ID) ereport(ERROR, spurious white space. > > Also, I have attached > '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that > handles your comment mentioned in [1]. > In general, we have to initialize prevblock with max_bucket, but here it is okay, because we anyway initialize it after this page is allocated as overflow page. Your patches look good to me except for small white space change. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
Hi Amit, >>> + >>> + /* Check if it is an unused hash page. */ >>> + if (pageopaque->hasho_flag == LH_UNUSED_PAGE) >>> + return page; >>> >>> >>> I don't see we need to have a separate check for unused page, why >>> can't it be checked with other page types in below check. >>> >>> if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE && >>> pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE) >> >> That is because UNUSED page is as good as an empty page except that >> empty page doesn't have any pagetype. If we add condition for checking >> UNUSED page in above if check it will never show unused page as an >> unsed page rather it will always show it as an empty page. >> > > Oh, okay, but my main objection was that we should not check hash page > type (hasho_flag) without ensuring whether it is a hash page. Can you > try to adjust the above code so that this check can be moved after > hasho_page_id check? Yes, I got your point. I have done that but then i had to remove the check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will only be true for one page in entire hash index table and can be ignored. If you wish, I could mention about it in the documentation. > >> To avoid >> this, at the start of verify_hash_page function itself if we recognise >> page as UNUSED page we return immediately. >> >>> >>> 2. >>> + /* Check if it is an empty hash page. */ >>> + if (PageIsEmpty(page)) >>> + ereport(ERROR, >>> + (errcode(ERRCODE_INDEX_CORRUPTED), >>> + errmsg("index table contains empty page"))); >>> >>> >>> Do we want to give a separate message for EMPTY and NEW pages? Isn't >>> it better that the same error message can be given for both of them as >>> from user perspective there is not much difference between both the >>> messages? >> >> I think we should show separate message because they are two different >> type of pages. In the sense like, one is initialised whereas other is >> completely zero. >> > > I understand your point, but not sure if it makes any difference to user. > okay, I have now anyways removed the check for PageIsEmpty. Please refer to the attached '0002 allow_pageinspect_handle_UNUSED_hash_pages.patch' Also, I have attached '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that handles your comment mentioned in [1]. [1] - https://www.postgresql.org/message-id/CAA4eK1%2BVE_TDRLWpyeOf%2B7%2B6if68kgPNwO4guKo060rm_t3O5w%40mail.gmail.com -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com From 86eb711ea80b495c9d2f5228558682d0e95c41eb Mon Sep 17 00:00:00 2001 From: ashu Date: Thu, 23 Mar 2017 16:47:17 +0530 Subject: [PATCH] Mark freed overflow page as UNUSED pagetype v2 --- src/backend/access/hash/hash_xlog.c | 9 + src/backend/access/hash/hashovfl.c | 13 - 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c index d9ac42c..2ccaf46 100644 --- a/src/backend/access/hash/hash_xlog.c +++ b/src/backend/access/hash/hash_xlog.c @@ -697,11 +697,20 @@ hash_xlog_squeeze_page(XLogReaderState *record) if (XLogReadBufferForRedo(record, 2, &ovflbuf) == BLK_NEEDS_REDO) { Page ovflpage; + HashPageOpaque ovflopaque; ovflpage = BufferGetPage(ovflbuf); _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf)); + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage); + + ovflopaque->hasho_prevblkno = InvalidBlockNumber; + ovflopaque->hasho_nextblkno = InvalidBlockNumber; + ovflopaque->hasho_bucket = -1; + ovflopaque->hasho_flag = LH_UNUSED_PAGE; + ovflopaque->hasho_page_id = HASHO_PAGE_ID; + PageSetLSN(ovflpage, lsn); MarkBufferDirty(ovflbuf); } diff --git a/src/backend/access/hash/hashovfl.c b/src/backend/access/hash/hashovfl.c index a3cae21..d3eaee0 100644 --- a/src/backend/access/hash/hashovfl.c +++ b/src/backend/access/hash/hashovfl.c @@ -591,9 +591,20 @@ _hash_freeovflpage(Relation rel, Buffer bucketbuf, Buffer ovflbuf, /* * Initialize the freed overflow page. Just zeroing the page won't work, * because WAL replay routines expect pages to be initialized. See - * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. + * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. Also, mark + * the page as UNUSED type and retain it's page id. This allows the tools + * like pageinspect to consider it as a hash page. */ _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf)); + + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage); + + ovflopaque->hasho_prevblkno = InvalidBlockNumber; + ovflopaque->hasho_nextblkno = InvalidBlockNumber; + ovflopaque->hasho_bucket = -1; + ovflopaque->hasho_flag = LH_UNUSED_PAGE; + ovflopaque->hasho_page_id = HASHO_PAGE_ID; + MarkBufferDirty(ovflbuf); if (BufferIsValid(prevbuf)) -- 1.8.3.1 diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c index 812a03f..6ba8847 100644 --- a/contrib/pageinspect/hashfuncs.c +
Re: [HACKERS] pageinspect and hash indexes
On Thu, Mar 23, 2017 at 10:02 AM, Ashutosh Sharma wrote: >>> >> >> 1. >> @@ -70,6 +70,17 @@ verify_hash_page(bytea *raw_page, int flags) >> pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); >> + >> + /* Check if it is an unused hash page. */ >> + if (pageopaque->hasho_flag == LH_UNUSED_PAGE) >> + return page; >> >> >> I don't see we need to have a separate check for unused page, why >> can't it be checked with other page types in below check. >> >> if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE && >> pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE) > > That is because UNUSED page is as good as an empty page except that > empty page doesn't have any pagetype. If we add condition for checking > UNUSED page in above if check it will never show unused page as an > unsed page rather it will always show it as an empty page. > Oh, okay, but my main objection was that we should not check hash page type (hasho_flag) without ensuring whether it is a hash page. Can you try to adjust the above code so that this check can be moved after hasho_page_id check? > To avoid > this, at the start of verify_hash_page function itself if we recognise > page as UNUSED page we return immediately. > >> >> 2. >> + /* Check if it is an empty hash page. */ >> + if (PageIsEmpty(page)) >> + ereport(ERROR, >> + (errcode(ERRCODE_INDEX_CORRUPTED), >> + errmsg("index table contains empty page"))); >> >> >> Do we want to give a separate message for EMPTY and NEW pages? Isn't >> it better that the same error message can be given for both of them as >> from user perspective there is not much difference between both the >> messages? > > I think we should show separate message because they are two different > type of pages. In the sense like, one is initialised whereas other is > completely zero. > I understand your point, but not sure if it makes any difference to user. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
>> I think it is not just happening for freed overflow but also for newly >> allocated bucket page. It would be good if we could mark freed >> overflow page as UNUSED page rather than just initialising it's header >> portion and leaving the page type in special area as it is. Attached >> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that >> marks a freed overflow page as an unused page. >> > > _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf)); > + > + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage); > + > + ovflopaque->hasho_prevblkno = InvalidBlockNumber; > + ovflopaque->hasho_nextblkno = InvalidBlockNumber; > + ovflopaque->hasho_bucket = -1; > + ovflopaque->hasho_flag = LH_UNUSED_PAGE; > + ovflopaque->hasho_page_id = HASHO_PAGE_ID; > + > > You need similar initialization in replay routine as well. I will do that and share the patch shortly. Thanks. > >> Also, I have now changed pageinspect module to handle unused and empty >> hash index page. Attached is the patch >> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for >> the same. >> > > 1. > @@ -70,6 +70,17 @@ verify_hash_page(bytea *raw_page, int flags) > pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); > + > + /* Check if it is an unused hash page. */ > + if (pageopaque->hasho_flag == LH_UNUSED_PAGE) > + return page; > > > I don't see we need to have a separate check for unused page, why > can't it be checked with other page types in below check. > > if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE && > pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE) That is because UNUSED page is as good as an empty page except that empty page doesn't have any pagetype. If we add condition for checking UNUSED page in above if check it will never show unused page as an unsed page rather it will always show it as an empty page. To avoid this, at the start of verify_hash_page function itself if we recognise page as UNUSED page we return immediately. > > 2. > + /* Check if it is an empty hash page. */ > + if (PageIsEmpty(page)) > + ereport(ERROR, > + (errcode(ERRCODE_INDEX_CORRUPTED), > + errmsg("index table contains empty page"))); > > > Do we want to give a separate message for EMPTY and NEW pages? Isn't > it better that the same error message can be given for both of them as > from user perspective there is not much difference between both the > messages? I think we should show separate message because they are two different type of pages. In the sense like, one is initialised whereas other is completely zero. -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Mon, Mar 20, 2017 at 6:53 PM, Ashutosh Sharma wrote: > > I think it is not just happening for freed overflow but also for newly > allocated bucket page. It would be good if we could mark freed > overflow page as UNUSED page rather than just initialising it's header > portion and leaving the page type in special area as it is. Attached > is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that > marks a freed overflow page as an unused page. > _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf)); + + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage); + + ovflopaque->hasho_prevblkno = InvalidBlockNumber; + ovflopaque->hasho_nextblkno = InvalidBlockNumber; + ovflopaque->hasho_bucket = -1; + ovflopaque->hasho_flag = LH_UNUSED_PAGE; + ovflopaque->hasho_page_id = HASHO_PAGE_ID; + You need similar initialization in replay routine as well. > Also, I have now changed pageinspect module to handle unused and empty > hash index page. Attached is the patch > (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for > the same. > 1. @@ -70,6 +70,17 @@ verify_hash_page(bytea *raw_page, int flags) pageopaque = (HashPageOpaque) PageGetSpecialPointer(page); + + /* Check if it is an unused hash page. */ + if (pageopaque->hasho_flag == LH_UNUSED_PAGE) + return page; I don't see we need to have a separate check for unused page, why can't it be checked with other page types in below check. if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE && pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE) 2. + /* Check if it is an empty hash page. */ + if (PageIsEmpty(page)) + ereport(ERROR, + (errcode(ERRCODE_INDEX_CORRUPTED), + errmsg("index table contains empty page"))); Do we want to give a separate message for EMPTY and NEW pages? Isn't it better that the same error message can be given for both of them as from user perspective there is not much difference between both the messages? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Tue, Mar 21, 2017 at 10:15 AM, Ashutosh Sharma wrote: >> >> I think it is not just happening for freed overflow but also for newly >> allocated bucket page. It would be good if we could mark freed >> overflow page as UNUSED page rather than just initialising it's header >> portion and leaving the page type in special area as it is. Attached >> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that >> marks a freed overflow page as an unused page. >> >> Also, I have now changed pageinspect module to handle unused and empty >> hash index page. Attached is the patch >> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for >> the same. >> >> [1] - >> https://www.postgresql.org/message-id/CAE9k0P%3DN%2BJjzqnHqrURE7ZQMgySRpv%3DBkjafbz%3DpeD4cbCgbhA%40mail.gmail.com >> I have yet to review your patches, let's first try to clear other things before doing so. > > I think when expanding hash index table we are only initialising the > pages that will be in-use after split or the last block in the index. > The initial few pages where tuples will be moved from old to new pages > has no issues but the last block that is just initialised and is not > marked as hash page needs to be handled along with the freed overflow > page. > I think PageIsEmpty() check in hashfuncs.c is sufficient for same. I don't see any value in treating the last page allocated during _hash_alloc_buckets() any different from other pages which are prior to that but will get allocated when required. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Mon, Mar 20, 2017 at 6:53 PM, Ashutosh Sharma wrote: > On Mon, Mar 20, 2017 at 9:31 AM, Amit Kapila wrote: >> On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma >> wrote: >>> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila >>> wrote: On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma wrote: > On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes wrote: >> While trying to figure out some bloating in the newly logged hash >> indexes, >> I'm looking into the type of each page in the index. But I get an error: >> >> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) >> from >> generate_series(1650,1650) f(x)" >> >> ERROR: page is not a hash page >> DETAIL: Expected ff80, got . >> >> The contents of the page are: >> >> \xa400d8f203bf65c91800f01ff01f0420... >> >> (where the elided characters at the end are all zero) >> >> What kind of page is that actually? > > it is basically either a newly allocated bucket page or a freed overflow > page. > What makes you think that it can be a newly allocated page? Basically, we always initialize the special space of newly allocated page, so not sure what makes you deduce that it can be newly allocated page. >>> >>> I came to know this from the following experiment. >>> >>> I created a hash index and kept on inserting data in it till the split >>> happens. >>> >>> When split happened, I could see following values for firstblock and >>> lastblock in _hash_alloc_buckets() >>> >>> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34, >>> nblocks=32) at hashpage.c:993 >>> (gdb) n >>> (gdb) pfirstblock >>> $15 = 34 >>> (gdb) pnblocks >>> $16 = 32 >>> (gdb) n >>> (gdb) plastblock >>> $17 = 65 >>> >>> AFAIU, this bucket split resulted in creation of new bucket pages from >>> block number 34 to 65. >>> >>> The contents for metap are as follows, >>> >>> (gdb) p*metap >>> $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples = >>> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096, >>> hashm_bmshift = 15, >>> hashm_maxbucket = 32,hashm_highmask = 63, hashm_lowmask = 31, >>> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1, >>> hashm_procid = 450, >>> hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 }, >>> hashm_mapp = {33,0 }} >>> >>> Now, if i try to check the page type for block number 65, this is what i >>> see, >>> >>> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65)); >>> ERROR: page is not a hash page >>> DETAIL: Expected ff80, got . >>> test=# >>> >> >> The contents of such a page should be zero and Jeff has reported some >> valid-looking contents of the page. If you see this page contents as >> zero, then we can conclude what Jeff is seeing was an freed overflow >> page. > > As shown in the mail thread-[1], the contents of metapage are, > > (gdb) p*metap > $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples > =2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096, > hashm_bmshift = 15, hashm_maxbucket = 32,hashm_highmask = 63, > hashm_lowmask = 31, hashm_ovflpoint = 6, hashm_firstfree = 0, > hashm_nmaps = 1, > hashm_procid = 450, hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 25 times>}, hashm_mapp = {33,0 }} > > postgres=# select spares from > hash_metapage_info(get_raw_page('con_hash_index', 0)); > spares > --- > {0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0} > (1 row) > > Here, if you see the spare page count is just 1 which corresponds to > bitmap page. So, that means there is no overflow page in hash index > table and neither I have ran any DELETE statements in my experiment > that would result in freeing an overflow page. > > Also, the page header content for such a page is, > > $9 = {pd_lsn = {xlogid = 0, xrecoff = 23638120}, pd_checksum = 0, > pd_flags = 0,pd_lower = 24, pd_upper = 8176,pd_special = 8176, > pd_pagesize_version = 8196, pd_prune_xid = 0,pd_linp = 0x1f3aa88} > > From values of pd_lower and pd_upper it is clear that it is an empty page. > > The content of this page is, > > \xb0308a011800f01ff01f042000. > > I think it is not just happening for freed overflow but also for newly > allocated bucket page. It would be good if we could mark freed > overflow page as UNUSED page rather than just initialising it's header > portion and leaving the page type in special area as it is. Attached > is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that > marks a freed overflow page as an unused page. > > Also, I have now changed pageinspect module to handle unused and empty > hash index page. Attached is the patch > (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for > the same.
Re: [HACKERS] pageinspect and hash indexes
On Mon, Mar 20, 2017 at 9:31 AM, Amit Kapila wrote: > On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma > wrote: >> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila wrote: >>> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma >>> wrote: On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes wrote: > While trying to figure out some bloating in the newly logged hash indexes, > I'm looking into the type of each page in the index. But I get an error: > > psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) > from > generate_series(1650,1650) f(x)" > > ERROR: page is not a hash page > DETAIL: Expected ff80, got . > > The contents of the page are: > > \xa400d8f203bf65c91800f01ff01f0420... > > (where the elided characters at the end are all zero) > > What kind of page is that actually? it is basically either a newly allocated bucket page or a freed overflow page. >>> >>> What makes you think that it can be a newly allocated page? >>> Basically, we always initialize the special space of newly allocated >>> page, so not sure what makes you deduce that it can be newly allocated >>> page. >> >> I came to know this from the following experiment. >> >> I created a hash index and kept on inserting data in it till the split >> happens. >> >> When split happened, I could see following values for firstblock and >> lastblock in _hash_alloc_buckets() >> >> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34, >> nblocks=32) at hashpage.c:993 >> (gdb) n >> (gdb) pfirstblock >> $15 = 34 >> (gdb) pnblocks >> $16 = 32 >> (gdb) n >> (gdb) plastblock >> $17 = 65 >> >> AFAIU, this bucket split resulted in creation of new bucket pages from >> block number 34 to 65. >> >> The contents for metap are as follows, >> >> (gdb) p*metap >> $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples = >> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096, >> hashm_bmshift = 15, >> hashm_maxbucket = 32,hashm_highmask = 63, hashm_lowmask = 31, >> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1, >> hashm_procid = 450, >> hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 }, >> hashm_mapp = {33,0 }} >> >> Now, if i try to check the page type for block number 65, this is what i see, >> >> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65)); >> ERROR: page is not a hash page >> DETAIL: Expected ff80, got . >> test=# >> > > The contents of such a page should be zero and Jeff has reported some > valid-looking contents of the page. If you see this page contents as > zero, then we can conclude what Jeff is seeing was an freed overflow > page. As shown in the mail thread-[1], the contents of metapage are, (gdb) p*metap $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples =2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096, hashm_bmshift = 15, hashm_maxbucket = 32,hashm_highmask = 63, hashm_lowmask = 31, hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1, hashm_procid = 450, hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 }, hashm_mapp = {33,0 }} postgres=# select spares from hash_metapage_info(get_raw_page('con_hash_index', 0)); spares --- {0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0} (1 row) Here, if you see the spare page count is just 1 which corresponds to bitmap page. So, that means there is no overflow page in hash index table and neither I have ran any DELETE statements in my experiment that would result in freeing an overflow page. Also, the page header content for such a page is, $9 = {pd_lsn = {xlogid = 0, xrecoff = 23638120}, pd_checksum = 0, pd_flags = 0,pd_lower = 24, pd_upper = 8176,pd_special = 8176, pd_pagesize_version = 8196, pd_prune_xid = 0,pd_linp = 0x1f3aa88} >From values of pd_lower and pd_upper it is clear that it is an empty page. The content of this page is, \xb0308a011800f01ff01f042000. I think it is not just happening for freed overflow but also for newly allocated bucket page. It would be good if we could mark freed overflow page as UNUSED page rather than just initialising it's header portion and leaving the page type in special area as it is. Attached is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that marks a freed overflow page as an unused page. Also, I have now changed pageinspect module to handle unused and empty hash index page. Attached is the patch (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for the same. [1] - https://www.postgresql.org/message-id/CAE9k0P%3DN%2BJjzqnHqrURE7ZQMgySRpv%3DBkjafbz%3DpeD4cbCgbhA%40mail.gmail.com -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com 0001-mark_freed_ovflpage_as_UNUSED_pagetype.p
Re: [HACKERS] pageinspect and hash indexes
On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma wrote: > On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila wrote: >> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma >> wrote: >>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes wrote: While trying to figure out some bloating in the newly logged hash indexes, I'm looking into the type of each page in the index. But I get an error: psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from generate_series(1650,1650) f(x)" ERROR: page is not a hash page DETAIL: Expected ff80, got . The contents of the page are: \xa400d8f203bf65c91800f01ff01f0420... (where the elided characters at the end are all zero) What kind of page is that actually? >>> >>> it is basically either a newly allocated bucket page or a freed overflow >>> page. >>> >> >> What makes you think that it can be a newly allocated page? >> Basically, we always initialize the special space of newly allocated >> page, so not sure what makes you deduce that it can be newly allocated >> page. > > I came to know this from the following experiment. > > I created a hash index and kept on inserting data in it till the split > happens. > > When split happened, I could see following values for firstblock and > lastblock in _hash_alloc_buckets() > > Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34, > nblocks=32) at hashpage.c:993 > (gdb) n > (gdb) pfirstblock > $15 = 34 > (gdb) pnblocks > $16 = 32 > (gdb) n > (gdb) plastblock > $17 = 65 > > AFAIU, this bucket split resulted in creation of new bucket pages from > block number 34 to 65. > > The contents for metap are as follows, > > (gdb) p*metap > $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples = > 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096, > hashm_bmshift = 15, > hashm_maxbucket = 32,hashm_highmask = 63, hashm_lowmask = 31, > hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1, > hashm_procid = 450, > hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 }, > hashm_mapp = {33,0 }} > > Now, if i try to check the page type for block number 65, this is what i see, > > test=# select * from hash_page_type(get_raw_page('con_hash_index', 65)); > ERROR: page is not a hash page > DETAIL: Expected ff80, got . > test=# > The contents of such a page should be zero and Jeff has reported some valid-looking contents of the page. If you see this page contents as zero, then we can conclude what Jeff is seeing was an freed overflow page. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila wrote: > On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma > wrote: >> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes wrote: >>> While trying to figure out some bloating in the newly logged hash indexes, >>> I'm looking into the type of each page in the index. But I get an error: >>> >>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from >>> generate_series(1650,1650) f(x)" >>> >>> ERROR: page is not a hash page >>> DETAIL: Expected ff80, got . >>> >>> The contents of the page are: >>> >>> \xa400d8f203bf65c91800f01ff01f0420... >>> >>> (where the elided characters at the end are all zero) >>> >>> What kind of page is that actually? >> >> it is basically either a newly allocated bucket page or a freed overflow >> page. >> > > What makes you think that it can be a newly allocated page? > Basically, we always initialize the special space of newly allocated > page, so not sure what makes you deduce that it can be newly allocated > page. I came to know this from the following experiment. I created a hash index and kept on inserting data in it till the split happens. When split happened, I could see following values for firstblock and lastblock in _hash_alloc_buckets() Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34, nblocks=32) at hashpage.c:993 (gdb) n (gdb) pfirstblock $15 = 34 (gdb) pnblocks $16 = 32 (gdb) n (gdb) plastblock $17 = 65 AFAIU, this bucket split resulted in creation of new bucket pages from block number 34 to 65. The contents for metap are as follows, (gdb) p*metap $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples = 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096, hashm_bmshift = 15, hashm_maxbucket = 32,hashm_highmask = 63, hashm_lowmask = 31, hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1, hashm_procid = 450, hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 }, hashm_mapp = {33,0 }} Now, if i try to check the page type for block number 65, this is what i see, test=# select * from hash_page_type(get_raw_page('con_hash_index', 65)); ERROR: page is not a hash page DETAIL: Expected ff80, got . test=# -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma wrote: > On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes wrote: >> While trying to figure out some bloating in the newly logged hash indexes, >> I'm looking into the type of each page in the index. But I get an error: >> >> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from >> generate_series(1650,1650) f(x)" >> >> ERROR: page is not a hash page >> DETAIL: Expected ff80, got . >> >> The contents of the page are: >> >> \xa400d8f203bf65c91800f01ff01f0420... >> >> (where the elided characters at the end are all zero) >> >> What kind of page is that actually? > > it is basically either a newly allocated bucket page or a freed overflow page. > What makes you think that it can be a newly allocated page? Basically, we always initialize the special space of newly allocated page, so not sure what makes you deduce that it can be newly allocated page. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Sat, Mar 18, 2017 at 1:42 AM, Tom Lane wrote: > Ashutosh Sharma writes: >> Basically, when we started working on WAL for hash index, we found >> that WAL routine 'XLogReadBufferExtended' does not expect a page to be >> completely zero page else it returns Invalid Buffer. To fix this, we >> started initializing freed overflow page or new bucket pages using >> _hash_pageinit() which basically initialises page header portion but >> not it's special area where page type information is present. That's >> why you are seeing an ERROR saying 'page is not a hash page'. Actually >> pageinspect module needs to handle this type of page. Currently it is >> just handling zero pages but not an empty pages. I will submit a patch >> for this. > > That seems like entirely the wrong approach. You should make the special > space valid, instead, so that tools like pg_filedump can make sense of > the page. > We were not aware that external tools like pg_filedump are dependent on special space of index, but after looking at the code of pg_filedump, I agree with you that we need to initialize the special space in this case (free overflow page). I think we can mark such a page type as LH_UNUSED_PAGE and then initialize the other fields of special space. Nonetheless, I think we still need modifications in hashfuncs.c so that it can understand this type of page. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
Ashutosh Sharma writes: > Basically, when we started working on WAL for hash index, we found > that WAL routine 'XLogReadBufferExtended' does not expect a page to be > completely zero page else it returns Invalid Buffer. To fix this, we > started initializing freed overflow page or new bucket pages using > _hash_pageinit() which basically initialises page header portion but > not it's special area where page type information is present. That's > why you are seeing an ERROR saying 'page is not a hash page'. Actually > pageinspect module needs to handle this type of page. Currently it is > just handling zero pages but not an empty pages. I will submit a patch > for this. That seems like entirely the wrong approach. You should make the special space valid, instead, so that tools like pg_filedump can make sense of the page. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On 3/17/17 13:24, Jeff Janes wrote: > And isn't it unhelpful to have the pageinspect module throw errors, > rather than returning a dummy value to indicate there was an error? Even if one agreed with that premise, it would be hard to satisfy reliably, because the user-facing pageinspect functions might reach into lower-level code to do some of the decoding, which are free to error out. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect and hash indexes
On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes wrote: > While trying to figure out some bloating in the newly logged hash indexes, > I'm looking into the type of each page in the index. But I get an error: > > psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from > generate_series(1650,1650) f(x)" > > ERROR: page is not a hash page > DETAIL: Expected ff80, got . > > The contents of the page are: > > \xa400d8f203bf65c91800f01ff01f0420... > > (where the elided characters at the end are all zero) > > What kind of page is that actually? it is basically either a newly allocated bucket page or a freed overflow page. You are seeing this error because of following change in the WAL patch for hash index, Basically, when we started working on WAL for hash index, we found that WAL routine 'XLogReadBufferExtended' does not expect a page to be completely zero page else it returns Invalid Buffer. To fix this, we started initializing freed overflow page or new bucket pages using _hash_pageinit() which basically initialises page header portion but not it's special area where page type information is present. That's why you are seeing an ERROR saying 'page is not a hash page'. Actually pageinspect module needs to handle this type of page. Currently it is just handling zero pages but not an empty pages. I will submit a patch for this. And isn't it unhelpful to have the > pageinspect module throw errors, rather than returning a dummy value to > indicate there was an error? Well, this is something not specific to hash index. So, I have no answer :) -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com -- With Regards, Ashutosh Sharma EnterpriseDB:http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pageinspect and hash indexes
While trying to figure out some bloating in the newly logged hash indexes, I'm looking into the type of each page in the index. But I get an error: psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from generate_series(1650,1650) f(x)" ERROR: page is not a hash page DETAIL: Expected ff80, got . The contents of the page are: \xa400d8f203bf65c91800f01ff01f0420... (where the elided characters at the end are all zero) What kind of page is that actually? And isn't it unhelpful to have the pageinspect module throw errors, rather than returning a dummy value to indicate there was an error? Thanks, Jeff