andre-cc-natzka opened a new pull request, #4104:
URL: https://github.com/apache/arrow-datafusion/pull/4104

   
   
   This PR aims at extending the Type Coercion optimization rule from 
Datafusion to support time types (Time32 and Time64), which is required for 
where clauses involving fields of these types.
   
   The main changes can be found at 
datafusion/expr/src/type_coercion/binary.rs, in the method temporal_coercion, 
which is adapted to ensure that, when getting a pair of types with a string and 
a time type, the later is taken as coerced type. Given that both Time32 and 
Time64 have an argument to specify the TimeUnit, one must be careful. In 
particular, the implementation must be consistent with the TimeUnits supported 
by Arrow, which include Second and Millisecond for Time32, and Microsecond and 
Nanosecond for Time64. These four cases are taken into account in 
temporal_coercion by calling a function check_time_unit that checks if the 
TimeUnit is consistent with the type and returns an option to it. The changes 
in binary.rs are checked in new tests, which are all passing.
   
   These modifications required a bunch of adaptations to be implemented in the 
datafusion code so as to consistently support Time32(TimeUnit::Second), 
Time32(TimeUnit::Millisecond), Time64(TimeUnit::Microsecond) and 
Time64(TimeUnit::Nanosecond). The changes are straightforward in general, 
except for the ones in the proto crate. I chose not to change the 
datafusion.proto file, which only supports a generic Time64 type, with no unit 
specification, and does not support Time32 at all. This is not a problem in 
itself, however the datafusion/proto/src/to_proto.rs and 
datafusion/proto/src/from_proto.rs files have to be adapted. Right now, the 
proto Time64 translates into a Time64Nanosecond, as specified in the 
datafusion/proto/src/from_proto.rs file. On the other hand, a proto Time64 can 
be obtained from all four Time64Nanosecond, Time64Microsecond, 
Time32Millisecond and Time32Second scalar value types, as implemented in 
datafusion/proto/src/to_proto.rs. This makes sense to me, but I am no
 t sure it is the best solution. Perhaps we could change the datafusion.proto?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to