On Wed, Feb 4, 2015 at 10:03 AM, Peter Geoghegan <p...@heroku.com> wrote:
> On Wed, Feb 4, 2015 at 9:54 AM, Heikki Linnakangas
> <hlinnakan...@vmware.com> wrote:
>> I had compiled with -O0, but without assertions. I tried now again with -O3.
>> It's been running for about 10 minutes now, and I haven't seen any errors.
>
> Did you run with an artificially high XID burn rate (i.e. did you also
> apply Jeff's modifications to Postgres, and specify a high burn rate
> using his custom GUC)? Maybe that was important.

Excuse me: I now see that you specifically indicated that you did. But
looks like your XID burn rate was quite a lot higher than mine
(assuming that you were consistent in using " jj_xid=10000", although
I'm not asserting that that was significant).

I attach a log of output from an example session where exclusion
constraints are shown to break (plus the corresponding server log,
plus /proc/cpuinfo on the off chance that that's significant). As you
can from the fact that the span of time recorded in the log is only a
couple of minutes, this is really easy for me to
recreate....sometimes. I could not recreate the problem with only 4
clients (on this 8 core server) after a few dozen attempts, and then I
couldn't recreate the issue at all, so clearly those details matter.
It might have something to do with CPU scaling, which I've found can
significantly affect outcomes for things like this (looks like my
hosting provider changed settings in the system BIOS recently, such
that I cannot set the CPU governor to "performance").

Perhaps you could take a crack at recreating this, Jeff?

Thanks
-- 
Peter Geoghegan
Example run of jjanes_upsert tool:

pg@gerbil:~/jjanes_upsert$ perl count_upsert_exclusion.pl 8
[Thu Feb  5 04:38:36 2015] NOTICE:  extension "btree_gist" already exists, 
skipping
[Thu Feb  5 04:38:36 2015] init done at count_upsert_exclusion.pl line 106.
pg@gerbil:~/jjanes_upsert$ perl count_upsert_exclusion.pl 8 10000
[Thu Feb  5 04:38:56 2015] NOTICE:  extension "btree_gist" already exists, 
skipping
[Thu Feb  5 04:38:56 2015] init done at count_upsert_exclusion.pl line 106.
[Thu Feb  5 04:39:01 2015] sum is 528
[Thu Feb  5 04:39:01 2015] count is 8740
[Thu Feb  5 04:39:01 2015] For tuple with index value 1, 6 != 3 at 
count_upsert_exclusion.pl line 174.
pg@gerbil:~/jjanes_upsert$ perl count_upsert_exclusion.pl 8 10000
[Thu Feb  5 04:39:08 2015] NOTICE:  extension "btree_gist" already exists, 
skipping
[Thu Feb  5 04:39:08 2015] init done at count_upsert_exclusion.pl line 106.
pg@gerbil:~/jjanes_upsert$ perl count_upsert_exclusion.pl 8 10000
[Thu Feb  5 04:39:20 2015] NOTICE:  extension "btree_gist" already exists, 
skipping
[Thu Feb  5 04:39:20 2015] init done at count_upsert_exclusion.pl line 106.

Corresponding PostgreSQL server logs (note: This uses the PDT timezone, not UTC 
as above):

pg@gerbil:~/postgresql$ pg_ctl start -o "--fsync=off --JJ_xid=50 --JJ_vac=1" -w 
2>&1 | tee postgres.log
waiting for server to start....31541 2015-02-04 19:36:02 PST LOG:  database 
system was shut down at 2015-02-04 19:35:12 PST
31541 2015-02-04 19:36:02 PST LOG:  JJ transaction ID wrap limit is 2147484359, 
limited by database with OID 1
31540 2015-02-04 19:36:02 PST LOG:  database system is ready to accept 
connections
 done
server started
31706 2015-02-04 19:38:42 PST LOG:  duration: 161.921 ms  statement: insert 
into upsert_race_test (index, count) values ('4751','1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count
31697 2015-02-04 19:38:42 PST LOG:  duration: 161.900 ms  statement: insert 
into upsert_race_test (index, count) values ('6931','-1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count
31695 2015-02-04 19:38:42 PST LOG:  duration: 161.966 ms  statement: insert 
into upsert_race_test (index, count) values ('5679','-1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count
31707 2015-02-04 19:38:42 PST LOG:  duration: 161.959 ms  statement: insert 
into upsert_race_test (index, count) values ('5876','1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count
31705 2015-02-04 19:38:42 PST LOG:  duration: 162.009 ms  statement: insert 
into upsert_race_test (index, count) values ('1507','1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count
31704 2015-02-04 19:38:42 PST LOG:  duration: 162.058 ms  statement: insert 
into upsert_race_test (index, count) values ('678','-1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count
31699 2015-02-04 19:38:42 PST LOG:  duration: 162.186 ms  statement: insert 
into upsert_race_test (index, count) values ('9289','1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count
31750 2015-02-04 19:39:11 PST ERROR:  conflicting key value violates exclusion 
constraint "upsert_race_test_index_excl"
31750 2015-02-04 19:39:11 PST DETAIL:  Key (index)=(7999) conflicts with 
existing key (index)=(1909).
31750 2015-02-04 19:39:11 PST STATEMENT:  insert into upsert_race_test (index, 
count) values ('7999','1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count
31773 2015-02-04 19:39:22 PST ERROR:  conflicting key value violates exclusion 
constraint "upsert_race_test_index_excl"
31773 2015-02-04 19:39:22 PST DETAIL:  Key (index)=(4266) conflicts with 
existing key (index)=(6551).
31773 2015-02-04 19:39:22 PST STATEMENT:  insert into upsert_race_test (index, 
count) values ('4266','1') on conflict
                      update set count=TARGET.count + EXCLUDED.count
                      where TARGET.index = EXCLUDED.index
                      returning upsert_race_test.count

cpuinfo (which may be significant in recreating the bug):
pg@gerbil:~$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 1.1.2
stepping        : 3
microcode       : 0x1
cpu MHz         : 3399.996
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt 
hypervisor lahf_lm abm
bogomips        : 6799.99
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
 
processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 1.1.2
stepping        : 3
microcode       : 0x1
cpu MHz         : 3399.996
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt 
hypervisor lahf_lm abm
bogomips        : 6799.99
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
 
processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 1.1.2
stepping        : 3
microcode       : 0x1
cpu MHz         : 3399.996
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt 
hypervisor lahf_lm abm
bogomips        : 6799.99
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
 
processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 1.1.2
stepping        : 3
microcode       : 0x1
cpu MHz         : 3399.996
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt 
hypervisor lahf_lm abm
bogomips        : 6799.99
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
 
processor       : 4
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 1.1.2
stepping        : 3
microcode       : 0x1
cpu MHz         : 3399.996
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt 
hypervisor lahf_lm abm
bogomips        : 6799.99
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
 
processor       : 5
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 1.1.2
stepping        : 3
microcode       : 0x1
cpu MHz         : 3399.996
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt 
hypervisor lahf_lm abm
bogomips        : 6799.99
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
 
processor       : 6
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 1.1.2
stepping        : 3
microcode       : 0x1
cpu MHz         : 3399.996
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt 
hypervisor lahf_lm abm
bogomips        : 6799.99
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
 
processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 2
model name      : QEMU Virtual CPU version 1.1.2
stepping        : 3
microcode       : 0x1
cpu MHz         : 3399.996
cache size      : 4096 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni cx16 popcnt 
hypervisor lahf_lm abm
bogomips        : 6799.99
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to