Hello.

I reviewed this patch and I would like to share some comments.

It compiled with those 2 warnings:

verify_gin.c: In function 'gin_check_parent_keys_consistency':
verify_gin.c:481:38: warning: declaration of 'maxoff' shadows a previous local [-Wshadow=compatible-local] 481 | OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
      |                                      ^~~~~~
verify_gin.c:453:41: note: shadowed declaration is here
  453 |                                         maxoff;
      |                                         ^~~~~~
verify_gin.c:423:25: warning: unused variable 'heapallindexed' [-Wunused-variable]
  423 |         bool            heapallindexed = *((bool*)callback_state);
      |                         ^~~~~~~~~~~~~~


Also, I'm not sure about postgres' headers conventions, inside amcheck.h, there is "miscadmin.h" included, and inside verify_gin.c, verify_gist.h and verify_nbtree.c both amcheck.h and miscadmin.h are included.

About the documentation, the bt_index_parent_check has comments about the ShareLock and "SET client_min_messages = DEBUG1;", and both gist_index_parent_check and gin_index_parent_check lack it. verify_gin uses DEBUG3, I'm not sure if it is on purpose, but it would be nice to document it or put DEBUG1 to be consistent.

I lack enough context to do a deep review on the code, so in this area this patch needs more eyes.

I did the following test:

postgres=# create table teste (t text, tv tsvector);
CREATE TABLE
postgres=# insert into teste values ('hello', 'hello'::tsvector);
INSERT 0 1
postgres=# create index teste_tv on teste using gist(tv);
CREATE INDEX
postgres=# select pg_relation_filepath('teste_tv');
 pg_relation_filepath
----------------------
 base/5/16441
(1 row)

postgres=#
\q
$ bin/pg_ctl -D data -l log
waiting for server to shut down.... done
server stopped
$ okteta base/5/16441 # I couldn't figure out the dd syntax to change the 1FE9 to '0'
$ bin/pg_ctl -D data -l log
waiting for server to start.... done
server started
$ bin/psql -U ze postgres
psql (16devel)
Type "help" for help.

postgres=# SET client_min_messages = DEBUG3;
SET
postgres=# select gist_index_parent_check('teste_tv'::regclass, true);
DEBUG:  verifying that tuples from index "teste_tv" are present in "teste"
ERROR: heap tuple (0,1) from table "teste" lacks matching index tuple within index "teste_tv"
postgres=#

A simple index corruption in gin:

postgres=# CREATE TABLE "gin_check"("Column1" int[]);
CREATE TABLE
postgres=# insert into gin_check values (ARRAY[1]),(ARRAY[2]);
INSERT 0 2
postgres=# CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
CREATE INDEX
postgres=# select pg_relation_filepath('gin_check_idx');
 pg_relation_filepath
----------------------
 base/5/16453
(1 row)

postgres=#
\q
$ bin/pg_ctl -D data -l logfile stop
waiting for server to shut down.... done
server stopped
$ okteta data/base/5/16453 # edited some bits near 3FCC
$ bin/pg_ctl -D data -l logfile start
waiting for server to start.... done
server started
$ bin/psql -U ze postgres
psql (16devel)
Type "help" for help.

postgres=# SET client_min_messages = DEBUG3;
SET
postgres=# SELECT gin_index_parent_check('gin_check_idx', true);
ERROR: number of items mismatch in GIN entry tuple, 49 in tuple header, 1 decoded
postgres=#

There are more code paths to follow to check the entire code, and I had a hard time to corrupt the indices. Is there any automated code to corrupt index to test such code?


--
Jose Arthur Benetasso Villanova



Reply via email to