Outer join tables, on text columns with different collates (perhaps charsets 
too), may propagate wrong LIKE predicate
---------------------------------------------------------------------------------------------------------------------

                 Key: CORE-6094
                 URL: http://tracker.firebirdsql.org/browse/CORE-6094
             Project: Firebird Core
          Issue Type: Bug
    Affects Versions: 3.0.3, 2.1.7
         Environment: Firebird 2.1.7, Firebird 3.unknown
            Reporter: Arioch


Initial report: 
https://stackoverflow.com/questions/56858452/why-did-my-where-clause-affect-my-left-join
Initial discussion: 
https://www.sql.ru/forum/1314381-a/fb3-join-vs-like-vs-case-sensitivity

The example is based upon mismatched collates but probably implementation 
behavior is the same with regard to charsets too.

DML:

~~~~~~~~~~~~~~~~~
create table t_CI (  -- case-INSENSITIVE
  id VarChar(10) character set UTF8 primary key collate UNICODE_CI,
  payload integer
);

create table t_CS (  -- case-SENSITIVE
  id VarChar(10) character set UTF8 primary key collate UNICODE,
  payload integer
);


insert into t_CS values ( 'ABCD', -100 );

insert into t_CI values ( 'ABCD', +22 );
~~~~~~~~~~~~~~

We now have two tables with ID fields having different collates. However the ID 
value is the same within both tables and is within both collates. 

When joining those two tables - the results should be either data from both 
tables or no data at all.

Situation, when upon join by ID data is fetched from only one of the two tables 
but not from another would be abnormal !!!



Abnormal query returning only half of the data, most probably due to 
propagation of `LIKE` predicate with wrong collation:
~~~~~~~~~~~
select * from T_CI
 left join T_CS on T_CS.id = T_CI.id
 where T_CI.id like '%bc%'
PLAN JOIN (T_CI NATURAL, T_CS INDEX (RDB$PRIMARY3))

Adapted plan:
PLAN JOIN (T_CI NATURAL, T_CS INDEX (INTEG_21))
~~~~~~~~~~~  
  
  
For comparison:


Correct queries returning data from BOTH tables:
~~~~~~~~
select * from T_CS,  T_CI where T_CS.id = T_CI.id

select * from T_CS
 left join T_CI on T_CS.id = T_CI.id

select * from T_CI
 left join T_CS on T_CS.id = T_CI.id

select * from T_CI
 left join T_CS on T_CS.id = T_CI.id
 where T_CI.id like '%BC%'

select * from T_CS
 left join T_CI on T_CS.id = T_CI.id
 where T_CS.id like '%BC%'

select * from T_CI
 left join T_CS on T_CS.id = T_CI.id
 where T_CI.id collate unicode_ci like '%bc%'
~~~~~~~~~~~~~~~


Correct queries returning no data at all:
~~~~~~~
select * from T_CS
 left join T_CI on T_CS.id = T_CI.id
 where T_CS.id like '%bc%'

select * from T_CI
 left join T_CS on T_CS.id = T_CI.id
 where T_CI.id collate unicode like '%bc%'
~~~~~~~


One more query, somewhat similar, different is engaging master-table index thus 
probably preventing LIKE-propagation
~~~~~~~~~~
select * from T_CI
 left join T_CS on T_CS.id = T_CI.id
 where T_CI.id starting with 'abc'

-- OK: data ~ data

Plan:
PLAN JOIN (T_CI INDEX (RDB$PRIMARY2), T_CS INDEX (RDB$PRIMARY3))
~~~~~~~~~~~~~~~~~~~~~


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://tracker.firebirdsql.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel

Reply via email to