One of our servers crashed last night like this:

< 2019-10-10 22:31:02.186 EDT postgres >STATEMENT:  REINDEX INDEX CONCURRENTLY child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx
< 2019-10-10 22:31:02.399 EDT  >LOG:  server process (PID 29857) was terminated by signal 11: Segmentation fault
< 2019-10-10 22:31:02.399 EDT  >DETAIL:  Failed process was running: REINDEX INDEX CONCURRENTLY child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx
< 2019-10-10 22:31:02.399 EDT  >LOG:  terminating any other active server processes

ts=# \d+ child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx
Index "child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx"
 Column  |  Type   | Key? | Definition | Storage | Stats target
---------+---------+------+------------+---------+--------------
 site_id | integer | yes  | site_id    | plain   |
btree, for table "child.eric_umts_rnc_utrancell_hsdsch_eul_201910"

That's an index on a table partition, but not itself a child of a relkind=I
index.
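
For anyone wanting to check the same thing on their own system, a catalog
query along these lines should show it. This is only a sketch, not output from
the crashed server: it assumes the standard pg_class and pg_inherits catalogs,
and the column aliases are made up for illustration. A NULL parent_index would
mean the index is not attached to any partitioned (relkind=I) index.

```sql
-- Sketch: report the index's relkind and any parent index it is attached to.
-- relkind 'i' = ordinary index, 'I' = partitioned index.
SELECT c.oid::regclass       AS index_name,
       c.relkind             AS index_relkind,
       i.inhparent::regclass AS parent_index,    -- NULL if not attached
       p.relkind             AS parent_relkind
FROM pg_class c
LEFT JOIN pg_inherits i ON i.inhrelid = c.oid
LEFT JOIN pg_class p ON p.oid = i.inhparent
WHERE c.oid = 'child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx'::regclass;
```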

Unfortunately, there was no core file, and I'm still trying to reproduce it.

I can't see that the table was INSERTed into during the reindex, but it looks
like it was SELECTed from, and the report finished within 1 sec of the crash:

(2019-10-10 22:30:50,485 - p1604 t140325365622592 - INFO): PID 1604 finished running report; est=None rows=552; cols=83; [...] duration:12

postgres=# SELECT log_time, pid, session_id, left(message,99), detail
FROM postgres_log_2019_10_10_2200
WHERE pid=29857 OR (log_time BETWEEN '2019-10-10 22:31:02.18' AND '2019-10-10 22:31:02.4' AND NOT message~'crash of another')
ORDER BY log_time LIMIT 9;
 2019-10-10 22:30:24.441-04 | 29857 | 5d9fe93f.74a1 | temporary file: path "base/pgsql_tmp/pgsql_tmp29857.0.sharedfileset/0.0", size 3096576      | 
 2019-10-10 22:30:24.442-04 | 29857 | 5d9fe93f.74a1 | temporary file: path "base/pgsql_tmp/pgsql_tmp29857.0.sharedfileset/1.0", size 2809856      | 
 2019-10-10 22:30:24.907-04 | 29857 | 5d9fe93f.74a1 | process 29857 still waiting for ShareLock on virtual transaction 30/103010 after 333.078 ms | Process holding the lock: 29671. Wait queue: 29857.
 2019-10-10 22:31:02.186-04 | 29857 | 5d9fe93f.74a1 | process 29857 acquired ShareLock on virtual transaction 30/103010 after 37611.995 ms        | 
 2019-10-10 22:31:02.186-04 | 29671 | 5d9fe92a.73e7 | duration: 50044.778 ms  statement: SELECT fn, sz FROM                                      +| 
                            |       |               | (SELECT file_name fn, file_size_bytes sz,                          +| 
                            |       |               |                                                                     | 
 2019-10-10 22:31:02.399-04 |  1161 | 5d9cad9e.489  | terminating any other active server processes                                               | 
 2019-10-10 22:31:02.399-04 |  1161 | 5d9cad9e.489  | server process (PID 29857) was terminated by signal 11: Segmentation fault                  | Failed process was running: REINDEX INDEX CONCURRENTLY child.eric_umts_rnc_utrancell_hsdsch_eul_201910_site_idx

Justin

