erratic-pattern commented on code in PR #10221: URL: https://github.com/apache/datafusion/pull/10221#discussion_r1581447540
########## datafusion/expr/src/type_coercion/binary.rs: ########## @@ -638,22 +638,26 @@ fn dictionary_coercion( preserve_dictionaries: bool, ) -> Option<DataType> { use arrow::datatypes::DataType::*; + let maybe_preserve_dictionaries = + |index_type: &Box<DataType>, value_type: DataType| -> DataType { + if preserve_dictionaries { + Dictionary(index_type.clone(), Box::new(value_type)) + } else { + value_type + } + }; match (lhs_type, rhs_type) { ( Dictionary(_lhs_index_type, lhs_value_type), Dictionary(_rhs_index_type, rhs_value_type), ) => comparison_coercion(lhs_value_type, rhs_value_type), Review Comment: I agree with @jayzhan211 that we probably don't want to cast everything to dictionaries in the way that we are currently doing it, and what we really want is a way for the optimizer to avoid expensive casts of dictionary columns, and more generally to just avoid column casts in favor of casting constants and scalar expressions. I think what we have works fine for now and fixes performance issues we're seeing on dictionary columns, but should be improved for the general case in subsequent PRs that redesign the type coercion logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org