rtpsw commented on code in PR #13880: URL: https://github.com/apache/arrow/pull/13880#discussion_r954636007
########## cpp/src/arrow/compute/exec/asof_join_node.cc: ########## @@ -294,10 +452,22 @@ class InputState { // Index of the time col col_index_t time_col_index_; // Index of the key col - col_index_t key_col_index_; + vec_col_index_t key_col_index_; + // Type id of the time column + Type::type time_type_id_; + // Type id of the key column + std::vector<Type::type> key_type_id_; + // Hasher for key elements + mutable KeyHasher* key_hasher_; + // True if hashing is mandatory + bool must_hash_; + // True if null by-key values are expected + bool nullable_by_key_; Review Comment: As implemented now, the faster path just applies `norm_value` to an integral value, while assuming it is not null, so it cannot work for a nullable column. It's possible to implement the fast path differently, such that the null bit is accounted for, by normalizing to a wider type (e.g., a 32-bit nullable value would be normalized to a 64-bit value); this would require special handling for 64-bit values (e.g., normalizing to a yet wider value, or resorting to hashing) and I opted to go with the current implementation. Let me know if you feel strongly about avoiding a nullable option being exposed to the user. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org