devinjdangelo commented on code in PR #11345:
URL: https://github.com/apache/datafusion/pull/11345#discussion_r1672594976
##########
datafusion/core/tests/memory_limit/mod.rs:
##########
@@ -323,6 +323,31 @@ async fn oom_recursive_cte() {
         .await
 }
+#[tokio::test]
+async fn oom_parquet_sink() {
+    let file = tempfile::Builder::new()
+        .suffix(".parquet")
+        .tempfile()
+        .unwrap();
+
+    TestCase::new()
+        .with_query(format!(
+            "
+            COPY (select * from t)
+            TO '{}'
+            STORED AS PARQUET OPTIONS (compression 'uncompressed');
+        ",
+            file.path().to_string_lossy()
+        ))
+        .with_expected_errors(vec![
+            // TODO: update error handling in ParquetSink
+            "Unable to send array to writer!",
Review Comment:
I think we need to update several `map_err` calls so that they propagate the
inner error message rather than discard it. For example,
https://github.com/apache/datafusion/blob/b96186fdef1ff410663ec8fce41186c018f8e09a/datafusion/core/src/datasource/file_format/parquet.rs#L880-L884
could be changed to something like:
```rust
col_array_channels[next_channel]
    .send(c)
    .await
    .map_err(|e| {
        // Include the channel error so the root cause is visible to the caller.
        internal_datafusion_err!("Unable to send array to writer due to error {e}")
    })
```
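
To illustrate the difference outside DataFusion, here is a minimal standalone sketch. It assumes a `std::sync::mpsc` channel in place of the async channels used in `ParquetSink`, and `SinkError` is a hypothetical stand-in for `DataFusionError`. Neither is the actual code in this PR.

```rust
use std::fmt;
use std::sync::mpsc;

// Hypothetical stand-in for DataFusionError::Internal; not the real type.
#[derive(Debug)]
struct SinkError(String);

impl fmt::Display for SinkError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Internal error: {}", self.0)
    }
}

// Discards the channel error, so the caller only ever sees a fixed message.
fn send_ignoring_cause(tx: &mpsc::Sender<i32>, value: i32) -> Result<(), SinkError> {
    tx.send(value)
        .map_err(|_| SinkError("Unable to send array to writer!".to_string()))
}

// Embeds the channel error's Display output in the returned message.
fn send_with_cause(tx: &mpsc::Sender<i32>, value: i32) -> Result<(), SinkError> {
    tx.send(value)
        .map_err(|e| SinkError(format!("Unable to send array to writer due to error {e}")))
}

fn main() {
    let (tx, rx) = mpsc::channel::<i32>();
    drop(rx); // dropping the receiver makes every later send fail

    println!("{}", send_ignoring_cause(&tx, 1).unwrap_err());
    println!("{}", send_with_cause(&tx, 1).unwrap_err());
}
```

The second message carries whatever the channel reports as the reason for the failure, which is the information the expected-error string in the test above currently cannot show.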
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]