subject:"\[HACKERS\] pageinspect and hash indexes"

Re: [HACKERS] pageinspect and hash indexes

2017-04-05 Thread Robert Haas

On Fri, Mar 24, 2017 at 3:44 PM, Ashutosh Sharma  wrote:
> Agreed. Moreover, previous approach might even change the current
> behaviour of functions like hash_page_items() and hash_page_stats()
> basically when we pass UNUSED pages to these functions. Attached is
> the newer version of patch.

This patch doesn't completely solve the problem:

pgbench -i -s 40
alter table pgbench_accounts drop constraint pgbench_accounts_pkey;
create index on pgbench_accounts using hash (aid);
insert into pgbench_accounts select * from pgbench_accounts;
select hash_page_type, min(n), max(n), count(*) from (select n,
hash_page_type(get_raw_page('pgbench_accounts_aid_idx'::text, n)) from
generate_series(0,
(pg_relation_size('pgbench_accounts_aid_idx')/8192)::integer - 1) n) x
group by 1;
ERROR:  index table contains zero page

This happens because of the sparse allocation forced by
_hash_alloc_buckets.  Perhaps the real fix is to have that function
initialize all of the pages instead of just the last one, but unless
and until we decide to do that, we've got to cope with zero pages in
the index.  Actually, even if we did fix that I think we'd still need
to do this, because the way relation extension works in general is
that we first write a page of zeroes and then go back and fill in the
data; an intervening crash can leave behind the intermediate state.

A second problem is that the one page initialized by
_hash_alloc_buckets needs to end up with a valid special space;
otherwise, you get the same error Jeff complained about originally
when you try to use hash_page_type() on it.

I fixed those issues and committed this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-24 Thread Ashutosh Sharma

 Thanks for reviewing my patch. I have removed the extra white space.
 Attached are both the patches.
>>>
>>> Sorry, I have mistakenly attached wrong patch. Here are the correct
>>> set of patches.
>>
>> The latest version of patches looks fine to me.
>
> I don't really like 0002.  What about this, instead?
>
> --- a/contrib/pageinspect/hashfuncs.c
> +++ b/contrib/pageinspect/hashfuncs.c
> @@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags)
>  /* Check that page type is sane. */
>  pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE;
>  if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
> -pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
> +pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE &&
> +pagetype != LH_UNUSED_PAGE)
>  ereport(ERROR,
>  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>   errmsg("invalid hash page type %08x", pagetype)));
>
> The advantage of that is (1) it won't get confused if in the future we
> have an unused page that has some flag bit not in LH_PAGE_TYPE set,
> and (2) if in the future we want to add any other checks to this
> function which should apply to unused pages also, they won't get
> bypassed by an early return statement.

Agreed. Moreover, previous approach might even change the current
behaviour of functions like hash_page_items() and hash_page_stats()
basically when we pass UNUSED pages to these functions. Attached is
the newer version of patch.
From 5c54fa58895cd279ff44424a3eb1feafdaca69d5 Mon Sep 17 00:00:00 2001
From: ashu 
Date: Sat, 25 Mar 2017 01:02:35 +0530
Subject: [PATCH] Allow pageinspect to handle UNUSED hash pages v3

---
 contrib/pageinspect/hashfuncs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c
index 812a03f..2632287 100644
--- a/contrib/pageinspect/hashfuncs.c
+++ b/contrib/pageinspect/hashfuncs.c
@@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags)
 	/* Check that page type is sane. */
 	pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE;
 	if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
-		pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
+		pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE &&
+		pagetype != LH_UNUSED_PAGE)
 		ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
  errmsg("invalid hash page type %08x", pagetype)));
-- 
1.8.3.1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-24 Thread Robert Haas

On Fri, Mar 24, 2017 at 3:54 AM, Amit Kapila  wrote:
> On Fri, Mar 24, 2017 at 9:50 AM, Ashutosh Sharma  
> wrote:
>> On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma  
>> wrote:
>>>
>>> Thanks for reviewing my patch. I have removed the extra white space.
>>> Attached are both the patches.
>>
>> Sorry, I have mistakenly attached wrong patch. Here are the correct
>> set of patches.
>
> The latest version of patches looks fine to me.

I don't really like 0002.  What about this, instead?

--- a/contrib/pageinspect/hashfuncs.c
+++ b/contrib/pageinspect/hashfuncs.c
@@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags)
 /* Check that page type is sane. */
 pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE;
 if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
-pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
+pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE &&
+pagetype != LH_UNUSED_PAGE)
 ereport(ERROR,
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
  errmsg("invalid hash page type %08x", pagetype)));

The advantage of that is (1) it won't get confused if in the future we
have an unused page that has some flag bit not in LH_PAGE_TYPE set,
and (2) if in the future we want to add any other checks to this
function which should apply to unused pages also, they won't get
bypassed by an early return statement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-24 Thread Amit Kapila

On Fri, Mar 24, 2017 at 9:50 AM, Ashutosh Sharma  wrote:
> On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma  
> wrote:
>>
>> Thanks for reviewing my patch. I have removed the extra white space.
>> Attached are both the patches.
>
> Sorry, I have mistakenly attached wrong patch. Here are the correct
> set of patches.
>

The latest version of patches looks fine to me.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-23 Thread Ashutosh Sharma

On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma  wrote:
> On Fri, Mar 24, 2017 at 9:21 AM, Amit Kapila  wrote:
>> On Thu, Mar 23, 2017 at 6:46 PM, Ashutosh Sharma  
>> wrote:

 Oh, okay, but my main objection was that we should not check hash page
 type (hasho_flag) without ensuring whether it is a hash page.  Can you
 try to adjust the above code so that this check can be moved after
 hasho_page_id check?
>>>
>>> Yes, I got your point. I have done that but then i had to remove the
>>> check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will
>>> only be true for one page in entire hash index table and can be
>>> ignored. If you wish, I could mention about it in the documentation.
>>>
>>
>> Yeah, I think it is worth adding a Note in docs, but we can do that
>> separately if required.
>>

> To avoid
> this, at the start of verify_hash_page function itself if we recognise
> page as UNUSED page we return immediately.
>
>>
>> 2.
>> + /* Check if it is an empty hash page. */
>> + if (PageIsEmpty(page))
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INDEX_CORRUPTED),
>> + errmsg("index table contains empty page")));
>>
>>
>> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
>> it better that the same error message can be given for both of them as
>> from user perspective there is not much difference between both the
>> messages?
>
> I think we should show separate message because they are two different
> type of pages. In the sense like, one is initialised whereas other is
> completely zero.
>

 I understand your point, but not sure if it makes any difference to user.

>>>
>>> okay, I have now anyways removed the check for PageIsEmpty. Please
>>> refer to the attached '0002
>>> allow_pageinspect_handle_UNUSED_hash_pages.patch'
>>>
>>
>> +
>> if (pageopaque->hasho_page_id != HASHO_PAGE_ID)
>> ereport(ERROR,
>>
>> spurious white space.
>>
>>>
>>> Also, I have attached
>>> '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that
>>> handles your comment mentioned in [1].
>>>
>>
>> In general, we have to initialize prevblock with max_bucket, but here
>> it is okay, because we anyway initialize it after this page is
>> allocated as overflow page.
>>
>> Your patches look good to me except for small white space change.
>
> Thanks for reviewing my patch. I have removed the extra white space.
> Attached are both the patches.

Sorry, I have mistakenly attached wrong patch. Here are the correct
set of patches.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com
From 86eb711ea80b495c9d2f5228558682d0e95c41eb Mon Sep 17 00:00:00 2001
From: ashu 
Date: Thu, 23 Mar 2017 16:47:17 +0530
Subject: [PATCH] Mark freed overflow page as UNUSED pagetype v2

---
 src/backend/access/hash/hash_xlog.c |  9 +
 src/backend/access/hash/hashovfl.c  | 13 -
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index d9ac42c..2ccaf46 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -697,11 +697,20 @@ hash_xlog_squeeze_page(XLogReaderState *record)
 	if (XLogReadBufferForRedo(record, 2, &ovflbuf) == BLK_NEEDS_REDO)
 	{
 		Page		ovflpage;
+		HashPageOpaque ovflopaque;
 
 		ovflpage = BufferGetPage(ovflbuf);
 
 		_hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
 
+		ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
+
+		ovflopaque->hasho_prevblkno = InvalidBlockNumber;
+		ovflopaque->hasho_nextblkno = InvalidBlockNumber;
+		ovflopaque->hasho_bucket = -1;
+		ovflopaque->hasho_flag = LH_UNUSED_PAGE;
+		ovflopaque->hasho_page_id = HASHO_PAGE_ID;
+
 		PageSetLSN(ovflpage, lsn);
 		MarkBufferDirty(ovflbuf);
 	}
diff --git a/src/backend/access/hash/hashovfl.c b/src/backend/access/hash/hashovfl.c
index a3cae21..d3eaee0 100644
--- a/src/backend/access/hash/hashovfl.c
+++ b/src/backend/access/hash/hashovfl.c
@@ -591,9 +591,20 @@ _hash_freeovflpage(Relation rel, Buffer bucketbuf, Buffer ovflbuf,
 	/*
 	 * Initialize the freed overflow page.  Just zeroing the page won't work,
 	 * because WAL replay routines expect pages to be initialized. See
-	 * explanation of RBM_NORMAL mode atop XLogReadBufferExtended.
+	 * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. Also, mark
+	 * the page as UNUSED type and retain it's page id. This allows the tools
+	 * like pageinspect to consider it as a hash page.
 	 */
 	_hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
+
+	ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
+
+	ovflopaque->hasho_prevblkno = InvalidBlockNumber;
+	ovflopaque->hasho_nextblkno = InvalidBlockNumber;
+	ovflopaque->hasho_bucket = -1;
+	ovflopaque->hasho_flag = LH_UNUSED_PAGE;
+	ovflopaque->hasho_page_id = HASHO_PAGE_ID;
+
 	MarkBufferDirty(ovflbuf);
 
 	if

Re: [HACKERS] pageinspect and hash indexes

2017-03-23 Thread Ashutosh Sharma

On Fri, Mar 24, 2017 at 9:21 AM, Amit Kapila  wrote:
> On Thu, Mar 23, 2017 at 6:46 PM, Ashutosh Sharma  
> wrote:
>>>
>>> Oh, okay, but my main objection was that we should not check hash page
>>> type (hasho_flag) without ensuring whether it is a hash page.  Can you
>>> try to adjust the above code so that this check can be moved after
>>> hasho_page_id check?
>>
>> Yes, I got your point. I have done that but then i had to remove the
>> check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will
>> only be true for one page in entire hash index table and can be
>> ignored. If you wish, I could mention about it in the documentation.
>>
>
> Yeah, I think it is worth adding a Note in docs, but we can do that
> separately if required.
>
>>>
 To avoid
 this, at the start of verify_hash_page function itself if we recognise
 page as UNUSED page we return immediately.

>
> 2.
> + /* Check if it is an empty hash page. */
> + if (PageIsEmpty(page))
> + ereport(ERROR,
> + (errcode(ERRCODE_INDEX_CORRUPTED),
> + errmsg("index table contains empty page")));
>
>
> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
> it better that the same error message can be given for both of them as
> from user perspective there is not much difference between both the
> messages?

 I think we should show separate message because they are two different
 type of pages. In the sense like, one is initialised whereas other is
 completely zero.

>>>
>>> I understand your point, but not sure if it makes any difference to user.
>>>
>>
>> okay, I have now anyways removed the check for PageIsEmpty. Please
>> refer to the attached '0002
>> allow_pageinspect_handle_UNUSED_hash_pages.patch'
>>
>
> +
> if (pageopaque->hasho_page_id != HASHO_PAGE_ID)
> ereport(ERROR,
>
> spurious white space.
>
>>
>> Also, I have attached
>> '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that
>> handles your comment mentioned in [1].
>>
>
> In general, we have to initialize prevblock with max_bucket, but here
> it is okay, because we anyway initialize it after this page is
> allocated as overflow page.
>
> Your patches look good to me except for small white space change.

Thanks for reviewing my patch. I have removed the extra white space.
Attached are both the patches.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com
From 86eb711ea80b495c9d2f5228558682d0e95c41eb Mon Sep 17 00:00:00 2001
From: ashu 
Date: Thu, 23 Mar 2017 16:47:17 +0530
Subject: [PATCH] Mark freed overflow page as UNUSED pagetype v2

---
 src/backend/access/hash/hash_xlog.c |  9 +
 src/backend/access/hash/hashovfl.c  | 13 -
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index d9ac42c..2ccaf46 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -697,11 +697,20 @@ hash_xlog_squeeze_page(XLogReaderState *record)
 	if (XLogReadBufferForRedo(record, 2, &ovflbuf) == BLK_NEEDS_REDO)
 	{
 		Page		ovflpage;
+		HashPageOpaque ovflopaque;
 
 		ovflpage = BufferGetPage(ovflbuf);
 
 		_hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
 
+		ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
+
+		ovflopaque->hasho_prevblkno = InvalidBlockNumber;
+		ovflopaque->hasho_nextblkno = InvalidBlockNumber;
+		ovflopaque->hasho_bucket = -1;
+		ovflopaque->hasho_flag = LH_UNUSED_PAGE;
+		ovflopaque->hasho_page_id = HASHO_PAGE_ID;
+
 		PageSetLSN(ovflpage, lsn);
 		MarkBufferDirty(ovflbuf);
 	}
diff --git a/src/backend/access/hash/hashovfl.c b/src/backend/access/hash/hashovfl.c
index a3cae21..d3eaee0 100644
--- a/src/backend/access/hash/hashovfl.c
+++ b/src/backend/access/hash/hashovfl.c
@@ -591,9 +591,20 @@ _hash_freeovflpage(Relation rel, Buffer bucketbuf, Buffer ovflbuf,
 	/*
 	 * Initialize the freed overflow page.  Just zeroing the page won't work,
 	 * because WAL replay routines expect pages to be initialized. See
-	 * explanation of RBM_NORMAL mode atop XLogReadBufferExtended.
+	 * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. Also, mark
+	 * the page as UNUSED type and retain it's page id. This allows the tools
+	 * like pageinspect to consider it as a hash page.
 	 */
 	_hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
+
+	ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
+
+	ovflopaque->hasho_prevblkno = InvalidBlockNumber;
+	ovflopaque->hasho_nextblkno = InvalidBlockNumber;
+	ovflopaque->hasho_bucket = -1;
+	ovflopaque->hasho_flag = LH_UNUSED_PAGE;
+	ovflopaque->hasho_page_id = HASHO_PAGE_ID;
+
 	MarkBufferDirty(ovflbuf);
 
 	if (BufferIsValid(prevbuf))
-- 
1.8.3.1

diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c
index 812a03f..6ba8847 100644
--- a/contrib/pageinspect/hashfuncs.c
+++ b/contrib/pageinspect/has

Re: [HACKERS] pageinspect and hash indexes

2017-03-23 Thread Amit Kapila

On Thu, Mar 23, 2017 at 6:46 PM, Ashutosh Sharma  wrote:
>>
>> Oh, okay, but my main objection was that we should not check hash page
>> type (hasho_flag) without ensuring whether it is a hash page.  Can you
>> try to adjust the above code so that this check can be moved after
>> hasho_page_id check?
>
> Yes, I got your point. I have done that but then i had to remove the
> check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will
> only be true for one page in entire hash index table and can be
> ignored. If you wish, I could mention about it in the documentation.
>

Yeah, I think it is worth adding a Note in docs, but we can do that
separately if required.

>>
>>> To avoid
>>> this, at the start of verify_hash_page function itself if we recognise
>>> page as UNUSED page we return immediately.
>>>

 2.
 + /* Check if it is an empty hash page. */
 + if (PageIsEmpty(page))
 + ereport(ERROR,
 + (errcode(ERRCODE_INDEX_CORRUPTED),
 + errmsg("index table contains empty page")));


 Do we want to give a separate message for EMPTY and NEW pages?  Isn't
 it better that the same error message can be given for both of them as
 from user perspective there is not much difference between both the
 messages?
>>>
>>> I think we should show separate message because they are two different
>>> type of pages. In the sense like, one is initialised whereas other is
>>> completely zero.
>>>
>>
>> I understand your point, but not sure if it makes any difference to user.
>>
>
> okay, I have now anyways removed the check for PageIsEmpty. Please
> refer to the attached '0002
> allow_pageinspect_handle_UNUSED_hash_pages.patch'
>

+
if (pageopaque->hasho_page_id != HASHO_PAGE_ID)
ereport(ERROR,

spurious white space.

>
> Also, I have attached
> '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that
> handles your comment mentioned in [1].
>

In general, we have to initialize prevblock with max_bucket, but here
it is okay, because we anyway initialize it after this page is
allocated as overflow page.

Your patches look good to me except for small white space change.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-23 Thread Ashutosh Sharma

Hi Amit,

>>> +
>>> + /* Check if it is an unused hash page. */
>>> + if (pageopaque->hasho_flag == LH_UNUSED_PAGE)
>>> + return page;
>>>
>>>
>>> I don't see we need to have a separate check for unused page, why
>>> can't it be checked with other page types in below check.
>>>
>>> if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
>>> pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
>>
>> That is because UNUSED page is as good as an empty page except that
>> empty page doesn't have any pagetype. If we add condition for checking
>> UNUSED page in above if check it will never show unused page as an
>> unsed page rather it will always show it as an empty page.
>>
>
> Oh, okay, but my main objection was that we should not check hash page
> type (hasho_flag) without ensuring whether it is a hash page.  Can you
> try to adjust the above code so that this check can be moved after
> hasho_page_id check?

Yes, I got your point. I have done that but then i had to remove the
check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will
only be true for one page in entire hash index table and can be
ignored. If you wish, I could mention about it in the documentation.

>
>> To avoid
>> this, at the start of verify_hash_page function itself if we recognise
>> page as UNUSED page we return immediately.
>>
>>>
>>> 2.
>>> + /* Check if it is an empty hash page. */
>>> + if (PageIsEmpty(page))
>>> + ereport(ERROR,
>>> + (errcode(ERRCODE_INDEX_CORRUPTED),
>>> + errmsg("index table contains empty page")));
>>>
>>>
>>> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
>>> it better that the same error message can be given for both of them as
>>> from user perspective there is not much difference between both the
>>> messages?
>>
>> I think we should show separate message because they are two different
>> type of pages. In the sense like, one is initialised whereas other is
>> completely zero.
>>
>
> I understand your point, but not sure if it makes any difference to user.
>

okay, I have now anyways removed the check for PageIsEmpty. Please
refer to the attached '0002
allow_pageinspect_handle_UNUSED_hash_pages.patch'


Also, I have attached
'0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that
handles your comment mentioned in [1].

[1] - 
https://www.postgresql.org/message-id/CAA4eK1%2BVE_TDRLWpyeOf%2B7%2B6if68kgPNwO4guKo060rm_t3O5w%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com
From 86eb711ea80b495c9d2f5228558682d0e95c41eb Mon Sep 17 00:00:00 2001
From: ashu 
Date: Thu, 23 Mar 2017 16:47:17 +0530
Subject: [PATCH] Mark freed overflow page as UNUSED pagetype v2

---
 src/backend/access/hash/hash_xlog.c |  9 +
 src/backend/access/hash/hashovfl.c  | 13 -
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index d9ac42c..2ccaf46 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -697,11 +697,20 @@ hash_xlog_squeeze_page(XLogReaderState *record)
 	if (XLogReadBufferForRedo(record, 2, &ovflbuf) == BLK_NEEDS_REDO)
 	{
 		Page		ovflpage;
+		HashPageOpaque ovflopaque;
 
 		ovflpage = BufferGetPage(ovflbuf);
 
 		_hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
 
+		ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
+
+		ovflopaque->hasho_prevblkno = InvalidBlockNumber;
+		ovflopaque->hasho_nextblkno = InvalidBlockNumber;
+		ovflopaque->hasho_bucket = -1;
+		ovflopaque->hasho_flag = LH_UNUSED_PAGE;
+		ovflopaque->hasho_page_id = HASHO_PAGE_ID;
+
 		PageSetLSN(ovflpage, lsn);
 		MarkBufferDirty(ovflbuf);
 	}
diff --git a/src/backend/access/hash/hashovfl.c b/src/backend/access/hash/hashovfl.c
index a3cae21..d3eaee0 100644
--- a/src/backend/access/hash/hashovfl.c
+++ b/src/backend/access/hash/hashovfl.c
@@ -591,9 +591,20 @@ _hash_freeovflpage(Relation rel, Buffer bucketbuf, Buffer ovflbuf,
 	/*
 	 * Initialize the freed overflow page.  Just zeroing the page won't work,
 	 * because WAL replay routines expect pages to be initialized. See
-	 * explanation of RBM_NORMAL mode atop XLogReadBufferExtended.
+	 * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. Also, mark
+	 * the page as UNUSED type and retain it's page id. This allows the tools
+	 * like pageinspect to consider it as a hash page.
 	 */
 	_hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
+
+	ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
+
+	ovflopaque->hasho_prevblkno = InvalidBlockNumber;
+	ovflopaque->hasho_nextblkno = InvalidBlockNumber;
+	ovflopaque->hasho_bucket = -1;
+	ovflopaque->hasho_flag = LH_UNUSED_PAGE;
+	ovflopaque->hasho_page_id = HASHO_PAGE_ID;
+
 	MarkBufferDirty(ovflbuf);
 
 	if (BufferIsValid(prevbuf))
-- 
1.8.3.1

diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c
index 812a03f..6ba8847 100644
--- a/contrib/pageinspect/hashfuncs.c
+

Re: [HACKERS] pageinspect and hash indexes

2017-03-22 Thread Amit Kapila

On Thu, Mar 23, 2017 at 10:02 AM, Ashutosh Sharma  wrote:
>>>
>>
>> 1.
>> @@ -70,6 +70,17 @@ verify_hash_page(bytea *raw_page, int flags)
>>   pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
>> +
>> + /* Check if it is an unused hash page. */
>> + if (pageopaque->hasho_flag == LH_UNUSED_PAGE)
>> + return page;
>>
>>
>> I don't see we need to have a separate check for unused page, why
>> can't it be checked with other page types in below check.
>>
>> if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
>> pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
>
> That is because UNUSED page is as good as an empty page except that
> empty page doesn't have any pagetype. If we add condition for checking
> UNUSED page in above if check it will never show unused page as an
> unsed page rather it will always show it as an empty page.
>

Oh, okay, but my main objection was that we should not check hash page
type (hasho_flag) without ensuring whether it is a hash page.  Can you
try to adjust the above code so that this check can be moved after
hasho_page_id check?

> To avoid
> this, at the start of verify_hash_page function itself if we recognise
> page as UNUSED page we return immediately.
>
>>
>> 2.
>> + /* Check if it is an empty hash page. */
>> + if (PageIsEmpty(page))
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INDEX_CORRUPTED),
>> + errmsg("index table contains empty page")));
>>
>>
>> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
>> it better that the same error message can be given for both of them as
>> from user perspective there is not much difference between both the
>> messages?
>
> I think we should show separate message because they are two different
> type of pages. In the sense like, one is initialised whereas other is
> completely zero.
>

I understand your point, but not sure if it makes any difference to user.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-22 Thread Ashutosh Sharma

>> I think it is not just happening for freed overflow but also for newly
>> allocated bucket page. It would be good if we could mark  freed
>> overflow page as UNUSED page rather than just initialising it's header
>> portion and leaving the page type in special area as it is. Attached
>> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
>> marks a freed overflow page as an unused page.
>>
>
>   _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
> +
> + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
> +
> + ovflopaque->hasho_prevblkno = InvalidBlockNumber;
> + ovflopaque->hasho_nextblkno = InvalidBlockNumber;
> + ovflopaque->hasho_bucket = -1;
> + ovflopaque->hasho_flag = LH_UNUSED_PAGE;
> + ovflopaque->hasho_page_id = HASHO_PAGE_ID;
> +
>
> You need similar initialization in replay routine as well.

I will do that and share the patch shortly. Thanks.

>
>> Also, I have now changed pageinspect module to handle unused and empty
>> hash index page. Attached is the patch
>> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
>> the same.
>>
>
> 1.
> @@ -70,6 +70,17 @@ verify_hash_page(bytea *raw_page, int flags)
>   pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
> +
> + /* Check if it is an unused hash page. */
> + if (pageopaque->hasho_flag == LH_UNUSED_PAGE)
> + return page;
>
>
> I don't see we need to have a separate check for unused page, why
> can't it be checked with other page types in below check.
>
> if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
> pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)

That is because UNUSED page is as good as an empty page except that
empty page doesn't have any pagetype. If we add condition for checking
UNUSED page in above if check it will never show unused page as an
unsed page rather it will always show it as an empty page. To avoid
this, at the start of verify_hash_page function itself if we recognise
page as UNUSED page we return immediately.

>
> 2.
> + /* Check if it is an empty hash page. */
> + if (PageIsEmpty(page))
> + ereport(ERROR,
> + (errcode(ERRCODE_INDEX_CORRUPTED),
> + errmsg("index table contains empty page")));
>
>
> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
> it better that the same error message can be given for both of them as
> from user perspective there is not much difference between both the
> messages?

I think we should show separate message because they are two different
type of pages. In the sense like, one is initialised whereas other is
completely zero.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-22 Thread Amit Kapila

On Mon, Mar 20, 2017 at 6:53 PM, Ashutosh Sharma  wrote:
>
> I think it is not just happening for freed overflow but also for newly
> allocated bucket page. It would be good if we could mark  freed
> overflow page as UNUSED page rather than just initialising it's header
> portion and leaving the page type in special area as it is. Attached
> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
> marks a freed overflow page as an unused page.
>

  _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
+
+ ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
+
+ ovflopaque->hasho_prevblkno = InvalidBlockNumber;
+ ovflopaque->hasho_nextblkno = InvalidBlockNumber;
+ ovflopaque->hasho_bucket = -1;
+ ovflopaque->hasho_flag = LH_UNUSED_PAGE;
+ ovflopaque->hasho_page_id = HASHO_PAGE_ID;
+

You need similar initialization in replay routine as well.

> Also, I have now changed pageinspect module to handle unused and empty
> hash index page. Attached is the patch
> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
> the same.
>

1.
@@ -70,6 +70,17 @@ verify_hash_page(bytea *raw_page, int flags)
  pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
+
+ /* Check if it is an unused hash page. */
+ if (pageopaque->hasho_flag == LH_UNUSED_PAGE)
+ return page;


I don't see we need to have a separate check for unused page, why
can't it be checked with other page types in below check.

if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)

2.
+ /* Check if it is an empty hash page. */
+ if (PageIsEmpty(page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INDEX_CORRUPTED),
+ errmsg("index table contains empty page")));


Do we want to give a separate message for EMPTY and NEW pages?  Isn't
it better that the same error message can be given for both of them as
from user perspective there is not much difference between both the
messages?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-21 Thread Amit Kapila

On Tue, Mar 21, 2017 at 10:15 AM, Ashutosh Sharma  wrote:
>>
>> I think it is not just happening for freed overflow but also for newly
>> allocated bucket page. It would be good if we could mark  freed
>> overflow page as UNUSED page rather than just initialising it's header
>> portion and leaving the page type in special area as it is. Attached
>> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
>> marks a freed overflow page as an unused page.
>>
>> Also, I have now changed pageinspect module to handle unused and empty
>> hash index page. Attached is the patch
>> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
>> the same.
>>
>> [1] - 
>> https://www.postgresql.org/message-id/CAE9k0P%3DN%2BJjzqnHqrURE7ZQMgySRpv%3DBkjafbz%3DpeD4cbCgbhA%40mail.gmail.com
>>

I have yet to review your patches, let's first try to clear other
things before doing so.

>
> I think when expanding hash index table we are only initialising the
> pages that will be in-use after split or the last block in the index.
> The initial few pages where tuples will be moved from old to new pages
> has no issues but the last block that is just initialised and is not
> marked as hash page needs to be handled along with the freed overflow
> page.
>

I think PageIsEmpty() check in hashfuncs.c is sufficient for same.  I
don't see any value in treating the last page allocated during
_hash_alloc_buckets() any different from other pages which are prior
to that but will get allocated when required.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-20 Thread Ashutosh Sharma

On Mon, Mar 20, 2017 at 6:53 PM, Ashutosh Sharma  wrote:
> On Mon, Mar 20, 2017 at 9:31 AM, Amit Kapila  wrote:
>> On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma  
>> wrote:
>>> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila  
>>> wrote:
 On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma  
 wrote:
> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes  wrote:
>> While trying to figure out some bloating in the newly logged hash 
>> indexes,
>> I'm looking into the type of each page in the index.  But I get an error:
>>
>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) 
>> from
>> generate_series(1650,1650) f(x)"
>>
>> ERROR:  page is not a hash page
>> DETAIL:  Expected ff80, got .
>>
>> The contents of the page are:
>>
>> \xa400d8f203bf65c91800f01ff01f0420...
>>
>> (where the elided characters at the end are all zero)
>>
>> What kind of page is that actually?
>
> it is basically either a newly allocated bucket page or a freed overflow 
> page.
>

 What makes you think that it can be a newly allocated page?
 Basically, we always initialize the special space of newly allocated
 page, so not sure what makes you deduce that it can be newly allocated
 page.
>>>
>>> I came to know this from the following experiment.
>>>
>>> I  created a hash index and kept on inserting data in it till the split 
>>> happens.
>>>
>>> When split happened, I could see following values for firstblock and
>>> lastblock in _hash_alloc_buckets()
>>>
>>> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
>>> nblocks=32) at hashpage.c:993
>>> (gdb) n
>>> (gdb) pfirstblock
>>> $15 = 34
>>> (gdb) pnblocks
>>> $16 = 32
>>> (gdb) n
>>> (gdb) plastblock
>>> $17 = 65
>>>
>>> AFAIU, this bucket split resulted in creation of new bucket pages from
>>> block number 34 to 65.
>>>
>>> The contents for metap are as follows,
>>>
>>> (gdb) p*metap
>>> $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples =
>>> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
>>> hashm_bmshift = 15,
>>>   hashm_maxbucket = 32,hashm_highmask = 63, hashm_lowmask = 31,
>>> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
>>> hashm_procid = 450,
>>>   hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 },
>>> hashm_mapp = {33,0 }}
>>>
>>> Now, if i try to check the page type for block number 65, this is what i 
>>> see,
>>>
>>> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
>>> ERROR:  page is not a hash page
>>> DETAIL:  Expected ff80, got .
>>> test=#
>>>
>>
>> The contents of such a page should be zero and Jeff has reported some
>> valid-looking contents of the page.  If you see this page contents as
>> zero, then we can conclude what Jeff is seeing was an freed overflow
>> page.
>
> As shown in the mail thread-[1], the contents of metapage are,
>
> (gdb) p*metap
> $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples
> =2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
> hashm_bmshift = 15, hashm_maxbucket = 32,hashm_highmask = 63,
> hashm_lowmask = 31, hashm_ovflpoint = 6, hashm_firstfree = 0,
> hashm_nmaps = 1,
> hashm_procid = 450, hashm_spares = {0, 0,0, 0, 0, 1, 1, 0  25 times>}, hashm_mapp = {33,0 }}
>
> postgres=# select spares from
> hash_metapage_info(get_raw_page('con_hash_index', 0));
>   spares
> ---
>  {0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
> (1 row)
>
> Here, if you see the spare page count  is just 1 which corresponds to
> bitmap page. So, that means there is no overflow page in hash index
> table and neither I have ran any DELETE statements in my experiment
> that would result in freeing an overflow page.
>
> Also, the page header content for such a page is,
>
> $9 = {pd_lsn = {xlogid = 0, xrecoff = 23638120}, pd_checksum = 0,
> pd_flags = 0,pd_lower = 24, pd_upper = 8176,pd_special = 8176,
>   pd_pagesize_version = 8196, pd_prune_xid = 0,pd_linp = 0x1f3aa88}
>
> From values of pd_lower and pd_upper it is clear that it is an empty page.
>
> The content of this page is,
>
> \xb0308a011800f01ff01f042000.
>
> I think it is not just happening for freed overflow but also for newly
> allocated bucket page. It would be good if we could mark  freed
> overflow page as UNUSED page rather than just initialising it's header
> portion and leaving the page type in special area as it is. Attached
> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
> marks a freed overflow page as an unused page.
>
> Also, I have now changed pageinspect module to handle unused and empty
> hash index page. Attached is the patch
> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
> the same.

Re: [HACKERS] pageinspect and hash indexes

2017-03-20 Thread Ashutosh Sharma

On Mon, Mar 20, 2017 at 9:31 AM, Amit Kapila  wrote:
> On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma  
> wrote:
>> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila  wrote:
>>> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma  
>>> wrote:
 On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes  wrote:
> While trying to figure out some bloating in the newly logged hash indexes,
> I'm looking into the type of each page in the index.  But I get an error:
>
> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) 
> from
> generate_series(1650,1650) f(x)"
>
> ERROR:  page is not a hash page
> DETAIL:  Expected ff80, got .
>
> The contents of the page are:
>
> \xa400d8f203bf65c91800f01ff01f0420...
>
> (where the elided characters at the end are all zero)
>
> What kind of page is that actually?

 it is basically either a newly allocated bucket page or a freed overflow 
 page.

>>>
>>> What makes you think that it can be a newly allocated page?
>>> Basically, we always initialize the special space of newly allocated
>>> page, so not sure what makes you deduce that it can be newly allocated
>>> page.
>>
>> I came to know this from the following experiment.
>>
>> I  created a hash index and kept on inserting data in it till the split 
>> happens.
>>
>> When split happened, I could see following values for firstblock and
>> lastblock in _hash_alloc_buckets()
>>
>> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
>> nblocks=32) at hashpage.c:993
>> (gdb) n
>> (gdb) pfirstblock
>> $15 = 34
>> (gdb) pnblocks
>> $16 = 32
>> (gdb) n
>> (gdb) plastblock
>> $17 = 65
>>
>> AFAIU, this bucket split resulted in creation of new bucket pages from
>> block number 34 to 65.
>>
>> The contents for metap are as follows,
>>
>> (gdb) p*metap
>> $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples =
>> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
>> hashm_bmshift = 15,
>>   hashm_maxbucket = 32,hashm_highmask = 63, hashm_lowmask = 31,
>> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
>> hashm_procid = 450,
>>   hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 },
>> hashm_mapp = {33,0 }}
>>
>> Now, if i try to check the page type for block number 65, this is what i see,
>>
>> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
>> ERROR:  page is not a hash page
>> DETAIL:  Expected ff80, got .
>> test=#
>>
>
> The contents of such a page should be zero and Jeff has reported some
> valid-looking contents of the page.  If you see this page contents as
> zero, then we can conclude what Jeff is seeing was an freed overflow
> page.

As shown in the mail thread-[1], the contents of metapage are,

(gdb) p*metap
$18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples
=2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
hashm_bmshift = 15, hashm_maxbucket = 32,hashm_highmask = 63,
hashm_lowmask = 31, hashm_ovflpoint = 6, hashm_firstfree = 0,
hashm_nmaps = 1,
hashm_procid = 450, hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 }, hashm_mapp = {33,0 }}

postgres=# select spares from
hash_metapage_info(get_raw_page('con_hash_index', 0));
  spares
---
 {0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
(1 row)

Here, if you see the spare page count  is just 1 which corresponds to
bitmap page. So, that means there is no overflow page in hash index
table and neither I have ran any DELETE statements in my experiment
that would result in freeing an overflow page.

Also, the page header content for such a page is,

$9 = {pd_lsn = {xlogid = 0, xrecoff = 23638120}, pd_checksum = 0,
pd_flags = 0,pd_lower = 24, pd_upper = 8176,pd_special = 8176,
  pd_pagesize_version = 8196, pd_prune_xid = 0,pd_linp = 0x1f3aa88}

>From values of pd_lower and pd_upper it is clear that it is an empty page.

The content of this page is,

\xb0308a011800f01ff01f042000.

I think it is not just happening for freed overflow but also for newly
allocated bucket page. It would be good if we could mark  freed
overflow page as UNUSED page rather than just initialising it's header
portion and leaving the page type in special area as it is. Attached
is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
marks a freed overflow page as an unused page.

Also, I have now changed pageinspect module to handle unused and empty
hash index page. Attached is the patch
(0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
the same.

[1] - 
https://www.postgresql.org/message-id/CAE9k0P%3DN%2BJjzqnHqrURE7ZQMgySRpv%3DBkjafbz%3DpeD4cbCgbhA%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


0001-mark_freed_ovflpage_as_UNUSED_pagetype.p

Re: [HACKERS] pageinspect and hash indexes

2017-03-19 Thread Amit Kapila

On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma  wrote:
> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila  wrote:
>> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma  
>> wrote:
>>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes  wrote:
 While trying to figure out some bloating in the newly logged hash indexes,
 I'm looking into the type of each page in the index.  But I get an error:

 psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) 
 from
 generate_series(1650,1650) f(x)"

 ERROR:  page is not a hash page
 DETAIL:  Expected ff80, got .

 The contents of the page are:

 \xa400d8f203bf65c91800f01ff01f0420...

 (where the elided characters at the end are all zero)

 What kind of page is that actually?
>>>
>>> it is basically either a newly allocated bucket page or a freed overflow 
>>> page.
>>>
>>
>> What makes you think that it can be a newly allocated page?
>> Basically, we always initialize the special space of newly allocated
>> page, so not sure what makes you deduce that it can be newly allocated
>> page.
>
> I came to know this from the following experiment.
>
> I  created a hash index and kept on inserting data in it till the split 
> happens.
>
> When split happened, I could see following values for firstblock and
> lastblock in _hash_alloc_buckets()
>
> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
> nblocks=32) at hashpage.c:993
> (gdb) n
> (gdb) pfirstblock
> $15 = 34
> (gdb) pnblocks
> $16 = 32
> (gdb) n
> (gdb) plastblock
> $17 = 65
>
> AFAIU, this bucket split resulted in creation of new bucket pages from
> block number 34 to 65.
>
> The contents for metap are as follows,
>
> (gdb) p*metap
> $18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples =
> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
> hashm_bmshift = 15,
>   hashm_maxbucket = 32,hashm_highmask = 63, hashm_lowmask = 31,
> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
> hashm_procid = 450,
>   hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 },
> hashm_mapp = {33,0 }}
>
> Now, if i try to check the page type for block number 65, this is what i see,
>
> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
> ERROR:  page is not a hash page
> DETAIL:  Expected ff80, got .
> test=#
>

The contents of such a page should be zero and Jeff has reported some
valid-looking contents of the page.  If you see this page contents as
zero, then we can conclude what Jeff is seeing was an freed overflow
page.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-18 Thread Ashutosh Sharma

On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila  wrote:
> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma  
> wrote:
>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes  wrote:
>>> While trying to figure out some bloating in the newly logged hash indexes,
>>> I'm looking into the type of each page in the index.  But I get an error:
>>>
>>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
>>> generate_series(1650,1650) f(x)"
>>>
>>> ERROR:  page is not a hash page
>>> DETAIL:  Expected ff80, got .
>>>
>>> The contents of the page are:
>>>
>>> \xa400d8f203bf65c91800f01ff01f0420...
>>>
>>> (where the elided characters at the end are all zero)
>>>
>>> What kind of page is that actually?
>>
>> it is basically either a newly allocated bucket page or a freed overflow 
>> page.
>>
>
> What makes you think that it can be a newly allocated page?
> Basically, we always initialize the special space of newly allocated
> page, so not sure what makes you deduce that it can be newly allocated
> page.

I came to know this from the following experiment.

I  created a hash index and kept on inserting data in it till the split happens.

When split happened, I could see following values for firstblock and
lastblock in _hash_alloc_buckets()

Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
nblocks=32) at hashpage.c:993
(gdb) n
(gdb) pfirstblock
$15 = 34
(gdb) pnblocks
$16 = 32
(gdb) n
(gdb) plastblock
$17 = 65

AFAIU, this bucket split resulted in creation of new bucket pages from
block number 34 to 65.

The contents for metap are as follows,

(gdb) p*metap
$18 = {hashm_magic = 105121344,hashm_version = 2, hashm_ntuples =
2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
hashm_bmshift = 15,
  hashm_maxbucket = 32,hashm_highmask = 63, hashm_lowmask = 31,
hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
hashm_procid = 450,
  hashm_spares = {0, 0,0, 0, 0, 1, 1, 0 },
hashm_mapp = {33,0 }}

Now, if i try to check the page type for block number 65, this is what i see,

test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
ERROR:  page is not a hash page
DETAIL:  Expected ff80, got .
test=#

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-18 Thread Amit Kapila

On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma  wrote:
> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes  wrote:
>> While trying to figure out some bloating in the newly logged hash indexes,
>> I'm looking into the type of each page in the index.  But I get an error:
>>
>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
>> generate_series(1650,1650) f(x)"
>>
>> ERROR:  page is not a hash page
>> DETAIL:  Expected ff80, got .
>>
>> The contents of the page are:
>>
>> \xa400d8f203bf65c91800f01ff01f0420...
>>
>> (where the elided characters at the end are all zero)
>>
>> What kind of page is that actually?
>
> it is basically either a newly allocated bucket page or a freed overflow page.
>

What makes you think that it can be a newly allocated page?
Basically, we always initialize the special space of newly allocated
page, so not sure what makes you deduce that it can be newly allocated
page.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-18 Thread Amit Kapila

On Sat, Mar 18, 2017 at 1:42 AM, Tom Lane  wrote:
> Ashutosh Sharma  writes:
>> Basically, when we started working on WAL for hash index, we found
>> that WAL routine 'XLogReadBufferExtended' does not expect a page to be
>> completely zero page else it returns Invalid Buffer. To fix this, we
>> started initializing freed overflow page or new bucket pages using
>> _hash_pageinit() which basically initialises page header portion but
>> not it's special area where page type information is present. That's
>> why you are seeing an ERROR saying 'page is not a hash page'. Actually
>> pageinspect module needs to handle this type of page. Currently it is
>> just handling zero pages but not an empty pages. I will submit a patch
>> for this.
>
> That seems like entirely the wrong approach.  You should make the special
> space valid, instead, so that tools like pg_filedump can make sense of
> the page.
>

We were not aware that external tools like pg_filedump are dependent
on special space of index, but after looking at the code of
pg_filedump, I agree with you that we need to initialize the special
space in this case (free overflow page).  I think we can mark such a
page type as LH_UNUSED_PAGE  and then initialize the other fields of
special space.  Nonetheless, I think we still need modifications in
hashfuncs.c so that it can understand this type of page.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-17 Thread Tom Lane

Ashutosh Sharma  writes:
> Basically, when we started working on WAL for hash index, we found
> that WAL routine 'XLogReadBufferExtended' does not expect a page to be
> completely zero page else it returns Invalid Buffer. To fix this, we
> started initializing freed overflow page or new bucket pages using
> _hash_pageinit() which basically initialises page header portion but
> not it's special area where page type information is present. That's
> why you are seeing an ERROR saying 'page is not a hash page'. Actually
> pageinspect module needs to handle this type of page. Currently it is
> just handling zero pages but not an empty pages. I will submit a patch
> for this.

That seems like entirely the wrong approach.  You should make the special
space valid, instead, so that tools like pg_filedump can make sense of
the page.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-17 Thread Peter Eisentraut

On 3/17/17 13:24, Jeff Janes wrote:
> And isn't it unhelpful to have the pageinspect module throw errors,
> rather than returning a dummy value to indicate there was an error?

Even if one agreed with that premise, it would be hard to satisfy
reliably, because the user-facing pageinspect functions might reach into
lower-level code to do some of the decoding, which are free to error out.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pageinspect and hash indexes

2017-03-17 Thread Ashutosh Sharma

On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes  wrote:
> While trying to figure out some bloating in the newly logged hash indexes,
> I'm looking into the type of each page in the index.  But I get an error:
>
> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
> generate_series(1650,1650) f(x)"
>
> ERROR:  page is not a hash page
> DETAIL:  Expected ff80, got .
>
> The contents of the page are:
>
> \xa400d8f203bf65c91800f01ff01f0420...
>
> (where the elided characters at the end are all zero)
>
> What kind of page is that actually?

it is basically either a newly allocated bucket page or a freed overflow page.

You are seeing this error because of following change in the WAL patch
for hash index,

Basically, when we started working on WAL for hash index, we found
that WAL routine 'XLogReadBufferExtended' does not expect a page to be
completely zero page else it returns Invalid Buffer. To fix this, we
started initializing freed overflow page or new bucket pages using
_hash_pageinit() which basically initialises page header portion but
not it's special area where page type information is present. That's
why you are seeing an ERROR saying 'page is not a hash page'. Actually
pageinspect module needs to handle this type of page. Currently it is
just handling zero pages but not an empty pages. I will submit a patch
for this.

And isn't it unhelpful to have the
> pageinspect module throw errors, rather than returning a dummy value to
> indicate there was an error?

Well, this is something not specific to hash index. So, I have no answer :)

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pageinspect and hash indexes

2017-03-17 Thread Jeff Janes

While trying to figure out some bloating in the newly logged hash indexes,
I'm looking into the type of each page in the index.  But I get an error:

psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x))
from generate_series(1650,1650) f(x)"

ERROR:  page is not a hash page
DETAIL:  Expected ff80, got .

The contents of the page are:

\xa400d8f203bf65c91800f01ff01f0420...

(where the elided characters at the end are all zero)

What kind of page is that actually?  And isn't it unhelpful to have the
pageinspect module throw errors, rather than returning a dummy value to
indicate there was an error?

Thanks,

Jeff

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

Re: [HACKERS] pageinspect and hash indexes

[HACKERS] pageinspect and hash indexes

22 matches

Site Navigation

Mail list logo

Footer information