Hi Tom,

Thanks for looking at it.  I should have included the exact reproducer in
the first email.

The failure needs histogram statistics on a "char" column, so simple
catalog joins such as the pg_type example do not necessarily reach the
problematic path.  For example, I just rechecked that your query plans as
a hash join for me:

  explain select * from pg_type a join pg_type b using (typtype);

To reach the failing path, this reproduces it for me on current master:

  create temp table char_stats_1 (c "char");
  create temp table char_stats_2 (c "char");

  insert into char_stats_1
  select v::"char"
  from unnest(array['I','S','c','i','m','p','r','t','v']) as v,
       generate_series(1,
                       case when v in ('i','v','r','t') then 50 else 1 end);

  insert into char_stats_2
  select v::"char"
  from unnest(array['a','e','i']) as v,
       generate_series(1, case when v = 'i' then 50 else 5 end);

  analyze char_stats_1;
  analyze char_stats_2;

  set enable_hashjoin = off;
  set enable_nestloop = off;

  explain (costs off)
  select count(*)
  from char_stats_1 s1
  join char_stats_2 s2 on s1.c = s2.c;

After ANALYZE, the first column has histogram statistics, e.g.:

  tablename    | attname | n_distinct | most_common_vals | histogram_bounds
  -------------+---------+------------+------------------+------------------
  char_stats_1 | c       |          9 | {i,r,t,v}        | {I,S,c,m,p}
  char_stats_2 | c       |          3 | {i,a,e}          |

Without the patch, planning fails with:

  ERROR:  XX000: cache lookup failed for collation 0
  LOCATION:  pg_newlocale_from_collation, pg_locale.c:1211

The reason is that mergejoinscansel() calls scalarineqsel() while costing
the merge join.  That can reach convert_to_scalar() for histogram bounds.
For CHAROID, convert_string_datum() builds a one-byte string, but then
still calls pg_newlocale_from_collation() with the clause input collation,
which is InvalidOid for the non-collatable "char" type.

The patch only skips that collation lookup for CHAROID.  For text,
varchar, bpchar, and name, the existing locale handling is unchanged.

Regards,
Feng


Reply via email to