On Thu, Nov 15, 2018 at 04:17:51PM -0800, Andres Freund wrote:
> I'm about to commit some changes to 12/master that'd possibly make it
> easier to find issues like this.
Are you referring to this commit, or to a future one?
commit 763f2edd92095b1ca2f4476da073a28505c13820
Rejigger materializing and fetching a HeapTuple from a slot.
I was able to reproduce under HEAD with pg_restored data.
I guess you're right that the "memory alloc failure" is related, or the same
thing. I've seen it intermittently with queries which also sometimes crash (and
sometimes don't).
Note that when it crashes, it seems to take a longer time to do so than the
query would normally take. Like we're walking off the end of an array, say.
I've been able to reproduce the crash with a self join of a table (no view, no
expressions, no parallel, directly querying a relkind='r' child). In that
case, enable_bitmapscan=on and jit_tuple_deforming=on are both needed to crash,
and jit_debugging_support=on does not yield a useful bt.
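
For reference, the session settings involved look roughly like the sketch
below. jit and jit_above_cost=0 are taken from the session shown further down;
the other two are the settings described above as necessary for the crash.
jit_debugging_support is left out of the SET list on the assumption that it
has to be set at server or connection start (e.g. via PGOPTIONS), not
mid-session.

```sql
-- Settings used for the self-join reproduction (sketch, not a verified recipe).
SET jit = on;
SET jit_above_cost = 0;        -- force JIT even for cheap plans
SET enable_bitmapscan = on;    -- crash needs a bitmap scan in the plan
SET jit_tuple_deforming = on;  -- crash needs JIT-compiled tuple deforming

-- Confirm what the session is actually running with.
SHOW jit_tuple_deforming;
SHOW enable_bitmapscan;
```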
The table is not too special, but was probably ALTERed to add columns a good
number of times by one of our processes. It has ~1100 columns, including
arrays, and some with null_frac=1. I'm trying to come up with a test case
involving column types and order.
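
A starting point for such a test case might look like the sketch below. The
table name jit_wide_test, the column names, and the mix of int and int[]
columns are all hypothetical, and this is not known to reproduce the crash. It
only mimics the properties described above: many columns added by ALTER after
creation, some of them arrays, and all of the added columns entirely NULL.

```sql
-- Hypothetical wide table; names and types are made up for illustration.
CREATE TABLE jit_wide_test (
    start_time timestamp NOT NULL,
    sect_id    int NOT NULL
);

-- Add ~1100 columns one ALTER at a time; every tenth one is an array.
DO $$
BEGIN
    FOR i IN 1..1100 LOOP
        EXECUTE format('ALTER TABLE jit_wide_test ADD COLUMN c%s %s',
                       i,
                       CASE WHEN i % 10 = 0 THEN 'int[]' ELSE 'int' END);
    END LOOP;
END
$$;

-- Populate only the key columns, so every added column has null_frac=1.
INSERT INTO jit_wide_test (start_time, sect_id)
SELECT timestamp '2018-04-30' + g * interval '1 minute', g
FROM generate_series(1, 1000) AS g;

CREATE UNIQUE INDEX ON jit_wide_test (start_time, sect_id);
ANALYZE jit_wide_test;

-- Self join in the same shape as the failing query.
SET jit = on;
SET jit_above_cost = 0;
SELECT a.*
FROM jit_wide_test a
JOIN jit_wide_test b USING (start_time, sect_id)
WHERE a.start_time BETWEEN '2018-04-30' AND '2018-05-04'
  AND b.start_time BETWEEN '2018-04-30' AND '2018-05-04';
```

Whether the planner picks a bitmap scan here depends on the statistics, so the
plan should be checked with EXPLAIN before drawing any conclusion from a run
that does not crash.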
(gdb) bt
#0 0x00007f81a08b8b98 in ?? ()
#1 0x0000000000000000 in ?? ()
ts=# SET jit=on;SET jit_above_cost=0;explain(analyze off,verbose off) SELECT
a.* FROM child.daily_eric_umts_rnc_utrancell_view_201804 a JOIN
child.daily_eric_umts_rnc_utrancell_view_201804 b USING(start_time,sect_id)
WHERE a.start_time BETWEEN '2018-04-30' AND '2018-05-04' AND b.start_time
BETWEEN '2018-04-30' AND '2018-05-04';
SET
SET
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=527.36..1038.17 rows=1 width=7760)
   Hash Cond: ((a.start_time = b.start_time) AND (a.sect_id = b.sect_id))
   ->  Bitmap Heap Scan on daily_eric_umts_rnc_utrancell_view_201804 a  (cost=9.78..515.59 rows=133 width=7760)
         Recheck Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
         ->  Bitmap Index Scan on daily_eric_umts_rnc_utrancell_view_201804_unique_idx  (cost=0.00..9.74 rows=133 width=0)
               Index Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
   ->  Hash  (cost=515.59..515.59 rows=133 width=12)
         ->  Bitmap Heap Scan on daily_eric_umts_rnc_utrancell_view_201804 b  (cost=9.78..515.59 rows=133 width=12)
               Recheck Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
               ->  Bitmap Index Scan on daily_eric_umts_rnc_utrancell_view_201804_unique_idx  (cost=0.00..9.74 rows=133 width=0)
                     Index Cond: ((start_time >= '2018-04-30 00:00:00'::timestamp without time zone) AND (start_time <= '2018-05-04 00:00:00'::timestamp without time zone))
 JIT:
   Functions: 19
   Options: Inlining false, Optimization false, Expressions true, Deforming true
BTW, find attached a patch which I believe corrects some comments.
Justin
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 59e38d2..ab0c6d0 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -93,7 +93,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
funcname = llvm_expand_funcname(context, "deform");
/*
- * Check which columns do have to exist, so we don't have to check the
+ * Check which columns have to exist, so we don't have to check the
* rows natts unnecessarily.
*/
for (attnum = 0; attnum < desc->natts; attnum++)
@@ -252,7 +252,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
}
/*
- * Check if's guaranteed the all the desired attributes are available in
+ * Check if it's guaranteed that all the desired attributes are available in
* tuple. If so, we can start deforming. If not, need to make sure to
* fetch the missing columns.
*/
@@ -337,7 +337,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
/*
* If this is the first attribute, slot->tts_nvalid was 0. Therefore
- * reset offset to 0 to, it be from a previous execution.
+ * also reset offset to 0, it may be from a previous execution.
*/
if (attnum == 0)
{
@@ -554,7 +554,7 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc, int natts)
else if (att->attnotnull && attguaranteedalign && known_alignment >= 0)
{
/*
- * If the offset to the column was previously known a NOT NULL &
+ * If the offset to the column was previously known, a NOT NULL &
* fixed width column guarantees that alignment is just the
* previous alignment plus column width.
*/