Re: [HACKERS] segfault in hot standby for hash indexes

Ashutosh Sharma Wed, 22 Mar 2017 03:10:40 -0700

Hi,

On Wed, Mar 22, 2017 at 8:41 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Tue, Mar 21, 2017 at 11:49 PM, Ashutosh Sharma <ashu.coe...@gmail.com> 
> wrote:
>>>
>>> I can confirm that that fixes the seg faults for me.
>>
>> Thanks for confirmation.
>>
>>>
>>> Did you mean you couldn't reproduce the problem in the first place, or that
>>> you could reproduce it and now the patch fixes it?  If the first of those, I
>>> forget to say you do have to wait for hot standby to reach a consistency and
>>> open for connections, and then connect to the standby ("psql -p 9874"),
>>> before the seg fault will be triggered.
>>
>> I meant that I was not able to reproduce the issue on HEAD.
>>
>>>
>>> But, there are places where hash_xlog_vacuum_get_latestRemovedXid diverges
>>> from btree_xlog_delete_get_latestRemovedXid, which I don't understand the
>>> reason for the divergence.  Is there a reason we dropped the PANIC if we
>>> have not reached consistency?
>>
>> Well, I'm not quite sure how would standby allow any backend to
>> connect to it until it has reached to a consistent state. If you see
>> the definition of btree_xlog_delete_get_latestRemovedXid(), just
>> before consistency check there is a if-condition 'if
>> (CountDBBackends(InvalidOid) == 0)' which means
>> we are checking for consistent state only after knowing that there are
>> some backends connected to the standby. So, Is there a possibility of
>> having some backend connected to standby server without having it in
>> consistent state.
>>
>
> I don't think so, but I think we should have reachedConsistency check
> and elog(PANIC,..) similar to btree.  If you see other conditions
> where we PANIC in btree or hash xlog code, you will notice that those
> are also theoretically not possible cases.  It seems this is to save
> database from getting corrupt or behaving insanely if due to some
> reason (like a coding error or others) the check fails.


okay, agreed. I have included it in the attached patch.

However,  I am still able to see the crash reported by Jeff upthread -
[1]. I think there are couple of things that needs to be corrected
inside hash_xlog_vacuum_get_latestRemovedXid().

1) As Amit mentioned in his earlier mail [2], the block id used for
registering deleted items in XLOG_HASH_VACUUM_ONE_PAGE is different
than the block id used for fetching those items. I had fixed this and
shared the patch earlier. With this patch I still see the crash Jeff
reported yesterday [1].

2) When a full page image of registered block is taken, the modified
data associated with that block is not included in the WAL record. In
such a case, we won't be able to fetch the items array that we have
tried to include during xlog record (XLOG_HASH_VACUUM_ONE_PAGE)
creation using following function.

   XLogRegisterBufData(0, (char *) deletable, ndeletable *
sizeof(OffsetNumber));

If above is not included in the WAL record, then fetching the same
using 'XLogRecGetBlockData(record, 0, &len);' will return NULL
pointer.

 ptr = XLogRecGetBlockData(record, 0, &len);
 unused = (OffsetNumber *) ptr;
 ............
 iitemid = PageGetItemId(ipage, unused[i]);

Here, if ptr is NULL, reference to unused will cause a crash.

To fix this, I think we should pass 'REGBUF_KEEP_DATA' while
registering the buffer. Something like this,

-                       XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
+                       XLogRegisterBuffer(0, buf, REGBUF_STANDARD |
REGBUF_KEEP_DATA);

Attached is the patch that fixes this issue. Please have a look and
let me know if you still have any concerns. Thank you.

[1] - 
https://www.postgresql.org/message-id/CAMkU%3D1w-9Qe%3DFf1o6bSaXpNO9wqpo7_9GL8_CVhw4BoVVHasqg%40mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/CAA4eK1%2BYUedok0%2Bmeynnf8K3fqFsfdMpEFz1JiKLyyNv46hjaA%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index de7522e..75c7c43 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -977,12 +977,20 @@ hash_xlog_vacuum_get_latestRemovedXid(XLogReaderState *record)
 		return latestRemovedXid;
 
 	/*
+	 * Check if WAL replay has reached a consistent database state. If not,
+	 * we must PANIC. See the definition of btree_xlog_delete_get_latestRemovedXid
+	 * on which this function is based.
+	 */
+	if (!reachedConsistency)
+		elog(PANIC, "hash_xlog_vacuum_get_latestRemovedXid: cannot operate with inconsistent data");
+
+	/*
 	 * Get index page.  If the DB is consistent, this should not fail, nor
 	 * should any of the heap page fetches below.  If one does, we return
 	 * InvalidTransactionId to cancel all HS transactions.  That's probably
 	 * overkill, but it's safe, and certainly better than panicking here.
 	 */
-	XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+	XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
 	ibuffer = XLogReadBufferExtended(rnode, MAIN_FORKNUM, blkno, RBM_NORMAL);
 
 	if (!BufferIsValid(ibuffer))
@@ -994,7 +1002,7 @@ hash_xlog_vacuum_get_latestRemovedXid(XLogReaderState *record)
 	 * Loop through the deleted index items to obtain the TransactionId from
 	 * the heap items they point to.
 	 */
-	ptr = XLogRecGetBlockData(record, 1, &len);
+	ptr = XLogRecGetBlockData(record, 0, &len);
 
 	unused = (OffsetNumber *) ptr;
 
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 8640e85..57995c1 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -403,7 +403,7 @@ _hash_vacuum_one_page(Relation rel, Buffer metabuf, Buffer buf,
 			XLogBeginInsert();
 			XLogRegisterData((char *) &xlrec, SizeOfHashVacuumOnePage);
 
-			XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
+			XLogRegisterBuffer(0, buf, REGBUF_STANDARD | REGBUF_KEEP_DATA);
 			XLogRegisterBufData(0, (char *) deletable,
 						ndeletable * sizeof(OffsetNumber));

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] segfault in hot standby for hash indexes

Reply via email to