eejbyfeldt commented on issue #481: URL: https://github.com/apache/datafusion-comet/issues/481#issuecomment-2150830474
@andygrove Then there is more than one issue. I also reproduced a similar crash using the date `290000-01-01` inside a timestamp. Spark correctly returns the year for this date, but Comet crashes. Stack trace:

```
at comet::parquet::read::values::<impl comet::parquet::read::PlainDecoding for comet::parquet::data_type::Int96TimestampMicrosType>::decode(/home/eejbyfeldt/dev/apache/datafusion-comet/core/src/parquet/read/values.rs:800)
at <comet::parquet::read::values::PlainDecoder<T> as comet::parquet::read::values::Decoder>::read_batch(/home/eejbyfeldt/dev/apache/datafusion-comet/core/src/parquet/read/values.rs:853)
at <comet::parquet::read::values::PlainDecoder<T> as comet::parquet::read::values::Decoder>::read(/home/eejbyfeldt/dev/apache/datafusion-comet/core/src/parquet/read/values.rs:844)
at comet::parquet::read::levels::LevelDecoder::read_batch(/home/eejbyfeldt/dev/apache/datafusion-comet/core/src/parquet/read/levels.rs:135)
at comet::parquet::read::column::TypedColumnReader<T>::read_batch(/home/eejbyfeldt/dev/apache/datafusion-comet/core/src/parquet/read/column.rs:546)
at comet::parquet::read::column::ColumnReader::read_batch(/home/eejbyfeldt/dev/apache/datafusion-comet/core/src/parquet/read/column.rs:444)
at comet::parquet::Java_org_apache_comet_parquet_Native_readBatch::{{closure}}(/home/eejbyfeldt/dev/apache/datafusion-comet/core/src/parquet/mod.rs:508)
at comet::errors::curry::{{closure}}(/home/eejbyfeldt/dev/apache/datafusion-comet/core/src/errors.rs:442)
at std::panicking::try::do_call(/rustc/ec08a0337f3556212525dbf1d3b41e19bdf27621/library/std/src/panicking.rs:526)
```

Even if the overflow is fixed at that location, full support might not be easy, because that date is outside the range supported by `chrono::DateTime` (see https://docs.rs/chrono/latest/chrono/naive/struct.NaiveDate.html), so the existing code that relies on it will be unable to return the date.
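To make the first failure mode concrete, here is a minimal sketch of checked INT96-to-microseconds arithmetic. It is not the code at `values.rs:800`. It only assumes the usual Parquet INT96 timestamp layout (8 little-endian bytes of nanoseconds within the day, followed by a 4-byte little-endian Julian day). The names `encode_int96` and `int96_to_micros_checked` are hypothetical and exist only for this illustration.

```rust
/// Julian day number of the Unix epoch (1970-01-01).
const JULIAN_DAY_OF_EPOCH: i64 = 2_440_588;
/// Microseconds in one day.
const MICROS_PER_DAY: i64 = 86_400_000_000;

/// Hypothetical helper (not Comet's API): pack nanoseconds-of-day and a
/// Julian day into the 12-byte little-endian INT96 layout.
fn encode_int96(nanos_of_day: i64, julian_day: i32) -> [u8; 12] {
    let mut out = [0u8; 12];
    out[..8].copy_from_slice(&nanos_of_day.to_le_bytes());
    out[8..].copy_from_slice(&julian_day.to_le_bytes());
    out
}

/// Hypothetical helper (not Comet's API): convert an INT96 timestamp to
/// microseconds since the Unix epoch, returning `None` on overflow
/// instead of panicking or wrapping.
fn int96_to_micros_checked(bytes: &[u8; 12]) -> Option<i64> {
    let mut nanos_buf = [0u8; 8];
    nanos_buf.copy_from_slice(&bytes[..8]);
    let mut day_buf = [0u8; 4];
    day_buf.copy_from_slice(&bytes[8..]);

    let nanos_of_day = i64::from_le_bytes(nanos_buf);
    // Widen to i64 before subtracting so the day difference cannot overflow.
    let julian_day = i64::from(i32::from_le_bytes(day_buf));
    let days = julian_day - JULIAN_DAY_OF_EPOCH;

    // The multiplication and addition are the steps that can exceed i64.
    days.checked_mul(MICROS_PER_DAY)?
        .checked_add(nanos_of_day / 1_000)
}
```

With this shape, a Julian day near `i32::MAX` yields `None` rather than a panic in a debug build or a silently wrapped value in a release build, and the caller can decide how to report it.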
If one allows the overflow at the first location (or runs in release mode, where integer overflow checks are off by default), it instead crashes here: https://github.com/apache/datafusion-comet/blob/24781fb7b3966f787cf72c97f42e2613f24fb2ac/core/src/execution/datafusion/expressions/utils.rs#L178 because that code assumes that every timestamp can be represented in the `chrono` type.
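For comparison, the year of a microsecond timestamp can be computed with plain integer arithmetic, without going through a `chrono` type at all. The sketch below is not taken from Comet or Spark. It uses the well-known days-to-civil-date algorithm for the proleptic Gregorian calendar (often attributed to Howard Hinnant), and the function names are hypothetical.

```rust
/// Microseconds in one day.
const MICROS_PER_DAY: i64 = 86_400_000_000;

/// Hypothetical helper: proleptic Gregorian year for a number of days
/// since 1970-01-01. Intended for day counts derived from an i64 of
/// microseconds, so `days + 719_468` cannot overflow here.
fn year_from_epoch_days(days: i64) -> i64 {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index with March = 0
    let year = yoe + era * 400;
    // January and February belong to the next calendar year in this scheme.
    if mp >= 10 { year + 1 } else { year }
}

/// Hypothetical helper: year of a timestamp given as microseconds since
/// the Unix epoch. Floor division keeps pre-1970 values correct.
fn year_from_micros(micros: i64) -> i64 {
    year_from_epoch_days(micros.div_euclid(MICROS_PER_DAY))
}
```

Because every intermediate value stays well inside `i64` for any microsecond timestamp, an approach like this has no representability limit of its own. Whether it fits the expression code linked above is a separate question.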