While working on write-ahead logging for hash indexes, I noticed that this function (_hash_alloc_buckets()) allocates buckets in batches. The mechanism it uses is to initialize only the last page of the batch with zeroes and to expect that the filesystem will ensure the intervening pages read as zeroes too.
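
To illustrate the filesystem behaviour this relies on, here is a minimal standalone sketch. It is not the PostgreSQL code: the names extend_by_writing_last_page(), page_reads_as_zeroes() and demo_batch_all_zero() are hypothetical, the 8192-byte page size is an assumption matching the default BLCKSZ, and it uses a scratch file under /tmp with plain POSIX pwrite()/pread() instead of the storage manager.

```c
#define _XOPEN_SOURCE 700       /* for pread(), pwrite(), mkstemp() */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE 8192          /* assumption: PostgreSQL's default BLCKSZ */

/*
 * Extend the file by writing a single zero-filled page at index last_page.
 * Nothing is written for the pages before it. Returns 0 on success, -1 on
 * failure.
 */
static int
extend_by_writing_last_page(int fd, unsigned last_page)
{
    char    page[PAGE_SIZE];
    ssize_t written;

    memset(page, 0, sizeof(page));
    written = pwrite(fd, page, sizeof(page), (off_t) last_page * PAGE_SIZE);
    return (written == (ssize_t) sizeof(page)) ? 0 : -1;
}

/* Returns 1 if the page reads back as all zeroes, 0 if not, -1 on error. */
static int
page_reads_as_zeroes(int fd, unsigned page_no)
{
    char    page[PAGE_SIZE];
    ssize_t nread;
    size_t  i;

    nread = pread(fd, page, sizeof(page), (off_t) page_no * PAGE_SIZE);
    if (nread != (ssize_t) sizeof(page))
        return -1;
    for (i = 0; i < sizeof(page); i++)
    {
        if (page[i] != 0)
            return 0;
    }
    return 1;
}

/*
 * Allocate a "batch" of npages (npages >= 1) in a scratch file by writing
 * only its last page, then check that every page in the batch reads as
 * zeroes. Returns 1 if so, 0 if some page is non-zero or unreadable, -1 if
 * the scratch file cannot be created.
 */
static int
demo_batch_all_zero(unsigned npages)
{
    char     path[] = "/tmp/hash_batch_XXXXXX";
    int      fd = mkstemp(path);
    int      ok;
    unsigned p;

    if (fd < 0)
        return -1;
    unlink(path);               /* file goes away once fd is closed */

    ok = (extend_by_writing_last_page(fd, npages - 1) == 0);
    for (p = 0; ok && p < npages; p++)
        ok = (page_reads_as_zeroes(fd, p) == 1);

    close(fd);
    return ok ? 1 : 0;
}
```

Calling demo_batch_all_zero(8) writes only the eighth page and then reads all eight back. The point of the sketch is that the intervening pages exist only as a hole in the file, so nothing resembling a page header is ever written for them.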
I think that to make it WAL-enabled, we need to initialize the page header (using PageInit() or equivalent) instead of filling the page with zeroes, because some parts of our WAL replay machinery expect that the page is not new, as I pointed out in another thread [1]. I think the WAL consistency check tool [2] also uses the same replay functions and will flag this as a problem if we don't initialize the page header.

What is not clear to me is whether this is okay as-is, or whether we should initialize each page of the batch during _hash_alloc_buckets(), considering that we are now trying to make hash indexes WAL-enabled. Offhand, I don't see any problem with initializing just the last page and writing WAL for it with log_newpage(); however, if we initialize all pages, there could be some performance penalty on split operations.

Thoughts?

[1] - https://www.postgresql.org/message-id/CAA4eK1JS%2BSiRSQBzEFpnsSmxZKingrRH7WNyWULJeEJSj1-%3D0w%40mail.gmail.com
[2] - https://commitfest.postgresql.org/10/741/

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)