90degs2infty opened a new issue, #16736:
URL: https://github.com/apache/datafusion/issues/16736
### Describe the bug
I'm trying to implement a "poor man's" `any` function that checks whether any
row of a DataFrame matches a predicate:
```rust
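use datafusion::error::DataFusionError;
use datafusion::prelude::*;

async fn any(df: DataFrame, predicate: Expr) -> Result<bool, DataFusionError> {
    // Keep at most one matching row, then check whether anything is left.
    Ok(df.filter(predicate)?.limit(0, Some(1))?.count().await? > 0)
}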
```
Depending on the predicate, this panics with `attempt to subtract with
overflow` in `Interval::cardinality` (`interval_arithmetic.rs:921` in
`datafusion-expr-common` 48.0.1):
```rust
#[tokio::main]
async fn main() -> Result<(), DataFusionError> {
    let ctx = SessionContext::new();
    let df = ctx
        .read_parquet("value.parquet", ParquetReadOptions::new())
        .await?;
    df.clone().describe().await?.show().await?;

    // Works - comparing against negative 0
    let flag = any(df.clone(), col("value").lt(lit(-0f64))).await?;
    println!("{flag}");

    // Panics - comparing against positive 0
    let flag = any(df, col("value").lt(lit(0f64))).await?;
    println!("{flag}");

    Ok(())
}
```
`value.parquet` contains a single non-nullable column, `value`, of datatype
`f64`, with 1736 rows that are all either `0.0` or `1.0`. Note that `describe`
reports the column minimum as `-0.0`.
The full backtrace is under "Additional context" below.
### To Reproduce
I've set up a minimal example to reproduce the bug at
https://github.com/90degs2infty/datafusion-bug
### Expected behavior
No panics. `any` should simply output `true` or `false`.
### Additional context
Backtrace:
```console
> RUST_BACKTRACE=1 cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s
     Running `target/debug/datafusion_bug`
+------------+---------------------+
| describe   | value               |
+------------+---------------------+
| count      | 1736.0              |
| null_count | 0.0                 |
| mean       | 0.46543778801843316 |
| std        | 0.4989477499924608  |
| min        | -0.0                |
| max        | 1.0                 |
| median     | 0.0                 |
+------------+---------------------+
false
thread 'main' panicked at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-expr-common-48.0.1/src/interval_arithmetic.rs:921:33:
attempt to subtract with overflow
stack backtrace:
   0: rust_begin_unwind
             at /rustc/05f9846f893b09a1be1fc8560e33fc3c815cfecb/library/std/src/panicking.rs:695:5
   1: core::panicking::panic_fmt
             at /rustc/05f9846f893b09a1be1fc8560e33fc3c815cfecb/library/core/src/panicking.rs:75:14
   2: core::panicking::panic_const::panic_const_sub_overflow
             at /rustc/05f9846f893b09a1be1fc8560e33fc3c815cfecb/library/core/src/panicking.rs:178:21
   3: datafusion_expr_common::interval_arithmetic::Interval::cardinality
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-expr-common-48.0.1/src/interval_arithmetic.rs:921:33
   4: datafusion_expr_common::interval_arithmetic::cardinality_ratio
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-expr-common-48.0.1/src/interval_arithmetic.rs:1630:12
   5: datafusion_physical_expr::analysis::calculate_selectivity
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-expr-48.0.1/src/analysis.rs:286:24
   6: datafusion_physical_expr::analysis::shrink_boundaries
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-expr-48.0.1/src/analysis.rs:258:23
   7: datafusion_physical_expr::analysis::analyze
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-expr-48.0.1/src/analysis.rs:221:17
   8: datafusion_physical_plan::filter::FilterExec::statistics_helper
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-48.0.1/src/filter.rs:206:28
   9: datafusion_physical_plan::filter::FilterExec::compute_properties
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-48.0.1/src/filter.rs:269:21
  10: datafusion_physical_plan::filter::FilterExec::try_new
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-physical-plan-48.0.1/src/filter.rs:99:29
  11: datafusion::physical_planner::DefaultPhysicalPlanner::map_logical_node_to_physical::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-48.0.1/src/physical_planner.rs:783:30
  12: datafusion::physical_planner::DefaultPhysicalPlanner::task_helper::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-48.0.1/src/physical_planner.rs:393:26
  13: <futures_util::stream::futures_unordered::FuturesUnordered<Fut> as futures_core::stream::Stream>::poll_next
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/futures_unordered/mod.rs:528:17
  14: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/mod.rs:1638:9
  15: <futures_util::stream::stream::buffer_unordered::BufferUnordered<St> as futures_core::stream::Stream>::poll_next
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/stream/buffer_unordered.rs:75:15
  16: <S as futures_core::stream::TryStream>::try_poll_next
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-core-0.3.31/src/stream.rs:206:9
  17: <futures_util::stream::try_stream::try_collect::TryCollect<St,C> as core::future::future::Future>::poll
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/futures-util-0.3.31/src/stream/try_stream/try_collect.rs:46:26
  18: datafusion::physical_planner::DefaultPhysicalPlanner::create_initial_plan::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-48.0.1/src/physical_planner.rs:339:14
  19: <datafusion::physical_planner::DefaultPhysicalPlanner as datafusion::physical_planner::PhysicalPlanner>::create_physical_plan::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-48.0.1/src/physical_planner.rs:189:14
  20: <core::pin::Pin<P> as core::future::future::Future>::poll
             at ~/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:124:9
  21: <datafusion::execution::session_state::DefaultQueryPlanner as datafusion::execution::context::QueryPlanner>::create_physical_plan::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-48.0.1/src/execution/session_state.rs:1923:14
  22: <core::pin::Pin<P> as core::future::future::Future>::poll
             at ~/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:124:9
  23: datafusion::execution::session_state::SessionState::create_physical_plan::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-48.0.1/src/execution/session_state.rs:659:14
  24: datafusion::dataframe::DataFrame::create_physical_plan::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-48.0.1/src/dataframe/mod.rs:278:61
  25: datafusion::dataframe::DataFrame::collect::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-48.0.1/src/dataframe/mod.rs:1378:48
  26: datafusion::dataframe::DataFrame::count::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/datafusion-48.0.1/src/dataframe/mod.rs:1345:14
  27: datafusion_bug::any::{{closure}}
             at ./src/main.rs:4:57
  28: datafusion_bug::main::{{closure}}
             at ./src/main.rs:20:60
  29: <core::pin::Pin<P> as core::future::future::Future>::poll
             at ~/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:124:9
  30: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/runtime/park.rs:285:60
  31: tokio::task::coop::with_budget
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/task/coop/mod.rs:167:5
  32: tokio::task::coop::budget
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/task/coop/mod.rs:133:5
  33: tokio::runtime::park::CachedParkThread::block_on
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/runtime/park.rs:285:31
  34: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/runtime/context/blocking.rs:66:9
  35: tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/runtime/scheduler/multi_thread/mod.rs:87:13
  36: tokio::runtime::context::runtime::enter_runtime
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/runtime/context/runtime.rs:65:16
  37: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/runtime/scheduler/multi_thread/mod.rs:86:9
  38: tokio::runtime::runtime::Runtime::block_on_inner
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/runtime/runtime.rs:358:45
  39: tokio::runtime::runtime::Runtime::block_on
             at ~/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.46.1/src/runtime/runtime.rs:328:13
  40: datafusion_bug::main
             at ./src/main.rs:23:5
  41: core::ops::function::FnOnce::call_once
             at ~/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```
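
A possible explanation, offered as a guess rather than a confirmed diagnosis: `-0.0` and `0.0` compare equal as floats but have different bit patterns, so code that counts the values in a float interval by subtracting order-preserving bit representations can underflow when the bounds are `[0.0, -0.0]`. The sketch below uses only the standard library. `ordered_bits` and `steps_between` are hypothetical helpers written for illustration, and I have not verified that `Interval::cardinality` works this way.

```rust
/// Hypothetical helper (not DataFusion code): map an f64 to a u64 whose
/// unsigned order matches the numeric order of non-NaN floats.
fn ordered_bits(x: f64) -> u64 {
    let bits = x.to_bits();
    if bits >> 63 == 1 {
        // Sign bit set (negative values, including -0.0): flip every bit.
        !bits
    } else {
        // Sign bit clear (non-negative values): set the sign bit.
        bits | (1u64 << 63)
    }
}

/// Number of representable steps from `lower` to `upper`, or `None` when
/// `upper` sorts before `lower` in bit order. A plain `-` here would panic
/// in debug builds with "attempt to subtract with overflow".
fn steps_between(lower: f64, upper: f64) -> Option<u64> {
    ordered_bits(upper).checked_sub(ordered_bits(lower))
}

fn main() {
    // The two zeros are equal under float comparison...
    assert!(-0.0_f64 == 0.0_f64);
    // ...but they are distinct bit patterns, and -0.0 sorts just below 0.0.
    assert_ne!((-0.0_f64).to_bits(), 0.0_f64.to_bits());

    // Well-ordered bounds: the subtraction succeeds.
    println!("{:?}", steps_between(-0.0, 0.0));
    // Bounds that are "equal" as floats but inverted in bit order.
    println!("{:?}", steps_between(0.0, -0.0));
}
```

If this reading is right, it would also fit the observation that comparing against `-0.0` works while comparing against `0.0` panics, given that the column minimum is reported as `-0.0`.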