On Tue, Apr 10, 2018 at 1:37 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> _bt_mark_page_halfdead() looked like it had a problem, but it now
> looks like I was wrong.

I did find another problem, though. Looks like the idea to explicitly
represent the number of attributes has paid off already:

pg@~[3711]=# create table covering_bug (f1 int, f2 int, f3 text);
create unique index cov_idx on covering_bug (f1) include(f2);
insert into covering_bug select i, i * random() * 1000, i * random() *
100000 from generate_series(0,100000) i;
DEBUG:  building index "pg_toast_16451_index" on table "pg_toast_16451" serially
CREATE TABLE
DEBUG:  building index "cov_idx" on table "covering_bug" serially
CREATE INDEX
ERROR:  tuple has wrong number of attributes in index "cov_idx"

Note that amcheck can detect the issue with the index after the fact, too:

pg@~[3711]=# select bt_index_check('cov_idx');
ERROR:  wrong number of index tuple attributes for index "cov_idx"
DETAIL:  Index tid=(3,2) natts=2 points to index tid=(2,92) page lsn=0/170DC88.

I don't think that the issue is complicated. Looks like we missed a
place that we have to truncate within _bt_split(), located directly
after this comment block:

    /*
     * If the page we're splitting is not the rightmost page at its level in
     * the tree, then the first entry on the page is the high key for the
     * page.  We need to copy that to the right half.  Otherwise (meaning the
     * rightmost page case), all the items on the right half will be user
     * data.
     */
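
To make the invariant concrete, here is a toy sketch in C. It is not
PostgreSQL code: ToyTuple, toy_truncate() and toy_highkey_ok() are
hypothetical names invented for illustration, and the attribute counts
simply mirror cov_idx (one key column plus one INCLUDE column). The
assumption being modelled is only that a tuple used as a high key should
carry key attributes alone, so any path that copies a tuple into a high
key slot has to truncate it first.

```c
#include <assert.h>

/* Toy stand-in for an index tuple; only the attribute count is modelled. */
typedef struct ToyTuple
{
    int natts;              /* number of attributes stored in the tuple */
} ToyTuple;

/* Hypothetical index shape mirroring cov_idx: 1 key + 1 INCLUDE column. */
enum { TOY_NKEYATTS = 1, TOY_NATTS = 2 };

/* Return a copy of tup that keeps only the key attributes. */
static ToyTuple
toy_truncate(ToyTuple tup)
{
    if (tup.natts > TOY_NKEYATTS)
        tup.natts = TOY_NKEYATTS;
    return tup;
}

/* In this toy model, a high key is valid only with key attributes alone. */
static int
toy_highkey_ok(ToyTuple hikey)
{
    return hikey.natts == TOY_NKEYATTS;
}
```

In this model, copying a full tuple (natts=2) into a high key slot
without calling toy_truncate() fails toy_highkey_ok(), which is the same
kind of attribute count mismatch that the error and amcheck report above
complain about.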

I believe the reason we didn't find this bug prior to commit is that
only a single index tuple ends up with the wrong number of attributes
after an initial root page split through insertions, and the next root
page split masks the problem. I'm not 100% sure yet that that's why we
missed it, though.

This bug shouldn't be hard to fix. I'll take care of it as part of
that post-commit review patch I'm working on.

-- 
Peter Geoghegan
