On Wed, Feb 4, 2015 at 10:03 AM, Peter Geoghegan <p...@heroku.com> wrote: > On Wed, Feb 4, 2015 at 9:54 AM, Heikki Linnakangas > <hlinnakan...@vmware.com> wrote: >> I had compiled with -O0, but without assertions. I tried now again with -O3. >> It's been running for about 10 minutes now, and I haven't seen any errors. > > Did you run with an artificially high XID burn rate (i.e. did you also > apply Jeff's modifications to Postgres, and specify a high burn rate > using his custom GUC)? Maybe that was important.
Excuse me: I now see that you specifically indicated that you did. But looks like your XID burn rate was quite a lot higher than mine (assuming that you were consistent in using " jj_xid=10000", although I'm not asserting that that was significant). I attach a log of output from an example session where exclusion constraints are shown to break (plus the corresponding server log, plus /proc/cpuinfo on the off chance that that's significant). As you can from the fact that the span of time recorded in the log is only a couple of minutes, this is really easy for me to recreate....sometimes. I could not recreate the problem with only 4 clients (on this 8 core server) after a few dozen attempts, and then I couldn't recreate the issue at all, so clearly those details matter. It might have something to do with CPU scaling, which I've found can significantly affect outcomes for things like this (looks like my hosting provider changed settings in the system BIOS recently, such that I cannot set the CPU governor to "performance"). Perhaps you could take a crack at recreating this, Jeff? Thanks -- Peter Geoghegan
Example run of jjanes_upsert tool: pg@gerbil:~/jjanes_upsert$ perl count_upsert_exclusion.pl 8 [Thu Feb 5 04:38:36 2015] NOTICE: extension "btree_gist" already exists, skipping [Thu Feb 5 04:38:36 2015] init done at count_upsert_exclusion.pl line 106. pg@gerbil:~/jjanes_upsert$ perl count_upsert_exclusion.pl 8 10000 [Thu Feb 5 04:38:56 2015] NOTICE: extension "btree_gist" already exists, skipping [Thu Feb 5 04:38:56 2015] init done at count_upsert_exclusion.pl line 106. [Thu Feb 5 04:39:01 2015] sum is 528 [Thu Feb 5 04:39:01 2015] count is 8740 [Thu Feb 5 04:39:01 2015] For tuple with index value 1, 6 != 3 at count_upsert_exclusion.pl line 174. pg@gerbil:~/jjanes_upsert$ perl count_upsert_exclusion.pl 8 10000 [Thu Feb 5 04:39:08 2015] NOTICE: extension "btree_gist" already exists, skipping [Thu Feb 5 04:39:08 2015] init done at count_upsert_exclusion.pl line 106. pg@gerbil:~/jjanes_upsert$ perl count_upsert_exclusion.pl 8 10000 [Thu Feb 5 04:39:20 2015] NOTICE: extension "btree_gist" already exists, skipping [Thu Feb 5 04:39:20 2015] init done at count_upsert_exclusion.pl line 106. Corresponding PostgreSQL server logs (note: This uses the PDT timezone, not UTC as above): pg@gerbil:~/postgresql$ pg_ctl start -o "--fsync=off --JJ_xid=50 --JJ_vac=1" -w 2>&1 | tee postgres.log waiting for server to start....31541 2015-02-04 19:36:02 PST LOG: database system was shut down at 2015-02-04 19:35:12 PST 31541 2015-02-04 19:36:02 PST LOG: JJ transaction ID wrap limit is 2147484359, limited by database with OID 1 31540 2015-02-04 19:36:02 PST LOG: database system is ready to accept connections done server started 31706 2015-02-04 19:38:42 PST LOG: duration: 161.921 ms statement: insert into upsert_race_test (index, count) values ('4751','1') on conflict update set count=TARGET.count + EXCLUDED.count where TARGET.index = EXCLUDED.index returning upsert_race_test.count 31697 2015-02-04 19:38:42 PST LOG: duration: 161.900 ms statement: insert into upsert_race_test (index, count) values ('6931','-1') on conflict update set count=TARGET.count + EXCLUDED.count where TARGET.index = EXCLUDED.index returning upsert_race_test.count 31695 2015-02-04 19:38:42 PST LOG: duration: 161.966 ms statement: insert into upsert_race_test (index, count) values ('5679','-1') on conflict update set count=TARGET.count + EXCLUDED.count where TARGET.index = EXCLUDED.index returning upsert_race_test.count 31707 2015-02-04 19:38:42 PST LOG: duration: 161.959 ms statement: insert into upsert_race_test (index, count) values ('5876','1') on conflict update set count=TARGET.count + EXCLUDED.count where TARGET.index = EXCLUDED.index returning upsert_race_test.count 31705 2015-02-04 19:38:42 PST LOG: duration: 162.009 ms statement: insert into upsert_race_test (index, count) values ('1507','1') on conflict update set count=TARGET.count + EXCLUDED.count where TARGET.index = EXCLUDED.index returning upsert_race_test.count 31704 2015-02-04 19:38:42 PST LOG: duration: 162.058 ms statement: insert into upsert_race_test (index, count) values ('678','-1') on conflict update set count=TARGET.count + EXCLUDED.count where TARGET.index = EXCLUDED.index returning upsert_race_test.count 31699 2015-02-04 19:38:42 PST LOG: duration: 162.186 ms statement: insert into upsert_race_test (index, count) values ('9289','1') on conflict update set count=TARGET.count + EXCLUDED.count where TARGET.index = EXCLUDED.index returning upsert_race_test.count 31750 2015-02-04 19:39:11 PST ERROR: conflicting key value violates exclusion constraint "upsert_race_test_index_excl" 31750 2015-02-04 19:39:11 PST DETAIL: Key (index)=(7999) conflicts with existing key (index)=(1909). 31750 2015-02-04 19:39:11 PST STATEMENT: insert into upsert_race_test (index, count) values ('7999','1') on conflict update set count=TARGET.count + EXCLUDED.count where TARGET.index = EXCLUDED.index returning upsert_race_test.count 31773 2015-02-04 19:39:22 PST ERROR: conflicting key value violates exclusion constraint "upsert_race_test_index_excl" 31773 2015-02-04 19:39:22 PST DETAIL: Key (index)=(4266) conflicts with existing key (index)=(6551). 31773 2015-02-04 19:39:22 PST STATEMENT: insert into upsert_race_test (index, count) values ('4266','1') on conflict update set count=TARGET.count + EXCLUDED.count where TARGET.index = EXCLUDED.index returning upsert_race_test.count cpuinfo (which may be significant in recreating the bug): pg@gerbil:~$ cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 2 model name : QEMU Virtual CPU version 1.1.2 stepping : 3 microcode : 0x1 cpu MHz : 3399.996 cache size : 4096 KB fpu : yes fpu_exception : yes cpuid level : 4 wp : yes flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt hypervisor lahf_lm abm bogomips : 6799.99 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 2 model name : QEMU Virtual CPU version 1.1.2 stepping : 3 microcode : 0x1 cpu MHz : 3399.996 cache size : 4096 KB fpu : yes fpu_exception : yes cpuid level : 4 wp : yes flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt hypervisor lahf_lm abm bogomips : 6799.99 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 2 model name : QEMU Virtual CPU version 1.1.2 stepping : 3 microcode : 0x1 cpu MHz : 3399.996 cache size : 4096 KB fpu : yes fpu_exception : yes cpuid level : 4 wp : yes flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt hypervisor lahf_lm abm bogomips : 6799.99 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 2 model name : QEMU Virtual CPU version 1.1.2 stepping : 3 microcode : 0x1 cpu MHz : 3399.996 cache size : 4096 KB fpu : yes fpu_exception : yes cpuid level : 4 wp : yes flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt hypervisor lahf_lm abm bogomips : 6799.99 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 4 vendor_id : GenuineIntel cpu family : 6 model : 2 model name : QEMU Virtual CPU version 1.1.2 stepping : 3 microcode : 0x1 cpu MHz : 3399.996 cache size : 4096 KB fpu : yes fpu_exception : yes cpuid level : 4 wp : yes flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt hypervisor lahf_lm abm bogomips : 6799.99 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 5 vendor_id : GenuineIntel cpu family : 6 model : 2 model name : QEMU Virtual CPU version 1.1.2 stepping : 3 microcode : 0x1 cpu MHz : 3399.996 cache size : 4096 KB fpu : yes fpu_exception : yes cpuid level : 4 wp : yes flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt hypervisor lahf_lm abm bogomips : 6799.99 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 6 vendor_id : GenuineIntel cpu family : 6 model : 2 model name : QEMU Virtual CPU version 1.1.2 stepping : 3 microcode : 0x1 cpu MHz : 3399.996 cache size : 4096 KB fpu : yes fpu_exception : yes cpuid level : 4 wp : yes flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt hypervisor lahf_lm abm bogomips : 6799.99 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: processor : 7 vendor_id : GenuineIntel cpu family : 6 model : 2 model name : QEMU Virtual CPU version 1.1.2 stepping : 3 microcode : 0x1 cpu MHz : 3399.996 cache size : 4096 KB fpu : yes fpu_exception : yes cpuid level : 4 wp : yes flags : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt hypervisor lahf_lm abm bogomips : 6799.99 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management:
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers