On Sat, Jan 10, 2015 at 8:32 PM, Peter Geoghegan <p...@heroku.com> wrote:
> I also include various bugfixes to approach #2 to value locking (these
> were all previously separately posted, but are now integrated into the
> main ON CONFLICT commit). Specifically, these are fixes for the bugs
> that emerged thanks to Jeff Janes' great work on stress testing [4].
> With these fixes, I have been unable to reproduce any problem with
> this patch with the test suite, even after many days of running the
> script on a quad-core server, with constant concurrent VACUUM runs,
> etc.

I continued with this since posting V2.0. I've run this bash script,
that invokes Jeff's script at various client counts, with runs of
various duration (since each client does a fixed amount of work):

https://github.com/petergeoghegan/jjanes_upsert/blob/master/run_test.sh

As previously discussed, Jeff's script comprehensively verifies the
correctness of the final values of a few thousand rows within a table
after many concurrent upserts, within and across upserting sessions,
and with concurrent deletions, too.

When building Postgres for this stress test, I included Jeff's
modifications that increase the XID burn rate artificially (I chose a
burn rate of X50). This makes anti-wraparound VACUUMs much more
frequent. I'm also looking out for outlier query execution durations,
because in theory they could indicate an unknown lock starvation
problem. I haven't seen any notable outliers after over a week of
testing.

> I think that we still need to think about the issues that
> transpired with exclusion constraints, but since I couldn't find
> another problem with an adapted version of Jeff's tool that tested
> exclusion constraints, I'm inclined to think that it should be
> possible to support exclusion constraints for the IGNORE variant.

Exclusion constraints were my focus with stress testing today. I
performed equivalent verification of upserts using exclusion
constraints (this is a hack; exclusion constraints are only intended
to be used with the IGNORE variant, but I get better test coverage
than I might otherwise this way). Unfortunately, even with the recent
bugfixes, there are still problems. On this server (rather than my
laptop), with 8 clients I can see errors like this before too long
(note that this output includes custom instrumentation from Jeff):

"""""""
6670 2015-01-17 18:02:54 PST LOG:  JJ scan_all 1, relfrozenid -813636509
6670 2015-01-17 18:02:54 PST LOG:  JJ freezeLimit -661025537
6670 2015-01-17 18:02:54 PST LOG:  JJ freeze_min_age 50000000
vacuum_freeze_table_age 150000000 freeze_table_age 150000000 ReadNew
-611025384
6670 2015-01-17 18:02:54 PST LOG:  JJ scan_all 1, relfrozenid -813636101
6670 2015-01-17 18:02:54 PST LOG:  JJ transaction ID wrap limit is
1352632427, limited by database with OID 12746
6670 2015-01-17 18:02:54 PST LOG:  autovacuum: done processing
database "postgres" at recent Xid of 3683945176 recent mxid of 1
6668 2015-01-17 18:02:54 PST ERROR:  conflicting key value violates
exclusion constraint "upsert_race_test_index_excl"
6668 2015-01-17 18:02:54 PST DETAIL:  Key (index)=(7142) conflicts
with existing key (index)=(600).
6668 2015-01-17 18:02:54 PST STATEMENT:  insert into upsert_race_test
(index, count) values ('7142','1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count
"""""""

It's always an exclusion violation problem that I see here.

As you can see, the query involved has no "unique index inference"
specification, per the hack to make this work with exclusion
constraints (the artificially much greater XID burn rate might have
also increased the likelihood of this error dramatically). You'll also
note that the DETAIL message seems to indicate that this
btree_gist-based exclusion constraint doesn't behave like a unique
constraint at all, because the conflicting new value (7142) is not at
all the same as the existing value (600). But that's wrong -- it's
supposed to be B-Tree-like. In short, there are further race
conditions with exclusion constraints.

I think that the fundamental, unfixable race condition here is the
disconnect between index tuple insertion and checking for would-be
exclusion violations that exclusion constraints naturally have here,
that unique indexes naturally don't have [1] (note that I'm talking
only about approach #2 to value locking here; approach #1 isn't in
V2.0). I suspect that the feature is not technically feasible to make
work correctly with exclusion constraints, end of story. VACUUM
interlocking is probably also involved here, but the unfixable race
condition seems like our fundamental problem.

We could possibly spend a lot of time discussing whether or not I'm
right about it being inherently impossible to make INSERT ... ON
CONFLICT IGNORE work with exclusion constraints. However, I strongly
suggest that we cut scope and at least leave them out of any version
that can be committed for 9.5, and instead work on other areas,
because it is at least now clear that they are much harder to get
right than unique constraints. Besides, making exclusion constraints
work with INSERT ... ON CONFLICT IGNORE is nice, but ultimately not
all that important. For that matter I think that INSERT ... ON
CONFLICT IGNORE is more generally not all that important compared to
ON CONFLICT UPDATE. I'd cut scope by cutting ON CONFLICT IGNORE if
that was the consensus....we could add back ON CONFLICT IGNORE in 9.6
when we had a better sense of exclusion constraints here. Exclusion
constraints can never be useful with ON CONFLICT UPDATE anyway.

Please work with me towards a committable patch. I think we have every
chance of committing this for 9.5, with value locking approach #2,
provided we now cut scope a bit. As I mention above, V2.0 has stood up
to more than a week of aggressive, comprehensive stress testing/custom
correctness verification on an 8 core box (plus numerous other stress
tests in months past). UPSERT (which never involved exclusion
constraints) is a very comprehensive and mature effort, and I think it
now needs one big push from a senior community member. I feel that I
cannot do anything more without that input.

[1] http://www.postgresql.org/message-id/54a7c76d.3070...@vmware.com
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to