alamb commented on code in PR #10221: URL: https://github.com/apache/datafusion/pull/10221#discussion_r1579870509
########## datafusion/sqllogictest/test_files/dictionary.slt: ########## @@ -386,3 +386,53 @@ drop table m3; statement ok drop table m3_source; + + +## Test that filtering on dictionary columns coerces the filter value to the dictionary type +statement ok +create table test as values + ('row1', arrow_cast('1', 'Dictionary(Int32, Utf8)')), + ('row2', arrow_cast('2', 'Dictionary(Int32, Utf8)')), + ('row3', arrow_cast('3', 'Dictionary(Int32, Utf8)')) +; + +# query using an string '1' which must be coerced into a dictionary string +query T? +SELECT * from test where column2 = '1'; +---- +row1 1 + +# filter should not have a cast on column2 +query TT +explain SELECT * from test where column2 = '1'; +---- +logical_plan +01)Filter: test.column2 = Dictionary(Int32, Utf8("1")) +02)--TableScan: test projection=[column1, column2] +physical_plan +01)CoalesceBatchesExec: target_batch_size=8192 +02)--FilterExec: column2@1 = 1 +03)----MemoryExec: partitions=1, partition_sizes=[1] + + +# Now query using an integer which must be coerced into a dictionary string +query T? +SELECT * from test where column2 = 1; +---- +row1 1 + +# filter should not have a cast on column2 +query TT +explain SELECT * from test where column2 = 1; +---- +logical_plan +01)Filter: test.column2 = Dictionary(Int32, Utf8("1")) Review Comment: the key is that there is no CAST on `column2` here ########## datafusion/sqllogictest/test_files/scalar.slt: ########## @@ -1768,52 +1768,61 @@ SELECT make_array(1, 2, 3); [1, 2, 3] # coalesce static empty value -query T -SELECT COALESCE('', 'test') +query TT +SELECT COALESCE('', 'test'), +arrow_typeof(COALESCE('', 'test')) ---- -(empty) +(empty) Utf8 # coalesce static value with null -query T -SELECT COALESCE(NULL, 'test') +query TT +SELECT COALESCE(NULL, 'test'), +arrow_typeof(COALESCE(NULL, 'test')) ---- -test - +test Utf8 +# Create table with a dictionary value statement ok create table test1 as values (arrow_cast('foo', 'Dictionary(Int32, Utf8)')), (null); -# test coercion string -query ? -select coalesce(column1, 'none_set') from test1; +# test coercion string (should preserve the dictionary type) +query ?T +select coalesce(column1, 'none_set'), +arrow_typeof(coalesce(column1, 'none_set')) +from test1; ---- -foo -none_set +foo Dictionary(Int32, Utf8) +none_set Dictionary(Int32, Utf8) -# test coercion Int -query I -select coalesce(34, arrow_cast(123, 'Dictionary(Int32, Int8)')); +# test coercion Int and Dictionary +query ?T Review Comment: As @erratic-pattern noted the difference here is that now that the output type of cealesce is `Dictionary` rather than Int. ########## datafusion/sqllogictest/test_files/scalar.slt: ########## @@ -1768,52 +1768,61 @@ SELECT make_array(1, 2, 3); [1, 2, 3] # coalesce static empty value -query T -SELECT COALESCE('', 'test') +query TT Review Comment: I updated these tests to show what the coerced type was as well -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org