alamb commented on code in PR #20441:
URL: https://github.com/apache/datafusion/pull/20441#discussion_r2842261771


##########
datafusion/physical-plan/src/joins/hash_join/inlist_builder.rs:
##########
@@ -33,15 +33,16 @@ pub(super) fn build_struct_fields(data_types: &[DataType]) -> Result<Fields> {
         .collect()
 }
 
-/// Flattens dictionary-encoded arrays to their underlying value arrays.
+/// Casts dictionary-encoded arrays to their underlying value type, preserving row count.
 /// Non-dictionary arrays are returned as-is.
-fn flatten_dictionary_array(array: &ArrayRef) -> ArrayRef {
-    downcast_dictionary_array! {
-        array => {
+fn flatten_dictionary_array(array: &ArrayRef) -> Result<ArrayRef> {

Review Comment:
   It seems to me like it still flattens dictionaries. Why would we rename it?
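
The difference between the two doc comments can be illustrated with a toy model. This is a minimal sketch in plain Rust, not the Arrow API: `ToyDictionary`, `values_only`, and `cast_to_values` are hypothetical names introduced only for illustration. It assumes the earlier version returned the dictionary's values array, which the diff hints at but does not show in full.

```rust
/// Toy stand-in for a dictionary-encoded string column.
/// Hypothetical type for illustration; this is not Arrow's DictionaryArray.
struct ToyDictionary {
    /// One entry per row; `None` models a null row.
    keys: Vec<Option<usize>>,
    /// Distinct values that the keys index into.
    values: Vec<String>,
}

impl ToyDictionary {
    /// Returns only the distinct values.
    /// The result has one entry per dictionary value, not one per row.
    fn values_only(&self) -> Vec<String> {
        self.values.clone()
    }

    /// Materializes one value per row, analogous to casting the column
    /// to its value type. The result has the same length as `keys`.
    fn cast_to_values(&self) -> Vec<Option<String>> {
        self.keys
            .iter()
            .map(|key| key.map(|index| self.values[index].clone()))
            .collect()
    }
}
```

For a column with keys `[0, 1, 0, null]` and values `["a", "b"]`, `values_only` yields 2 entries while `cast_to_values` yields 4, one per row. Under the assumption above, only the second form lines up row for row with the other join key columns.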



##########
datafusion/physical-plan/src/joins/hash_join/inlist_builder.rs:
##########
@@ -130,4 +133,41 @@ mod tests {
             )
         );
     }
+
+    #[test]
+    fn test_build_multi_column_inlist_with_dictionary() {

Review Comment:
   I am not sure this unit test adds much value -- it basically just reiterates how the current function works (it is testing some intermediate state).
   
   I double-checked that the .slt test fails without the code in this PR.



##########
datafusion/physical-plan/src/joins/hash_join/inlist_builder.rs:
##########
@@ -33,15 +33,16 @@ pub(super) fn build_struct_fields(data_types: &[DataType]) -> Result<Fields> {
         .collect()
 }
 
-/// Flattens dictionary-encoded arrays to their underlying value arrays.
+/// Casts dictionary-encoded arrays to their underlying value type, preserving row count.
 /// Non-dictionary arrays are returned as-is.
-fn flatten_dictionary_array(array: &ArrayRef) -> ArrayRef {
-    downcast_dictionary_array! {
-        array => {
+fn flatten_dictionary_array(array: &ArrayRef) -> Result<ArrayRef> {
+    match array.data_type() {
+        DataType::Dictionary(_, value_type) => {
+            let casted = cast(array, value_type)?;

Review Comment:
   I messed around with this PR and I don't really understand why the code is flattening arrays at all.
   
   I removed the flattening code entirely and all the tests seem to pass. I'll make a follow-on PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


