cadonna commented on issue #16836: URL: https://github.com/apache/datafusion/issues/16836#issuecomment-3113873969
We do not have a reproducer yet. However, I would like to share some more data points with you.

We suspect the issue is related to partitioning for two reasons:

1. If we set the target partitions to 1, the issue does not happen.
2. In the following call to `execute_input_stream()`, the partition of the input stream is hard-coded to 0. To my beginner's eyes, it looks as if only partition 0 of the input stream is read and inserted into the sink table. Could that be?
   https://github.com/apache/datafusion/blob/dbc03fa4f6d47c8f3b97f3a3d979945b2b7ccce7/datafusion/datasource/src/sink.rs#L227

Another useful data point might be the workaround we found: instead of using `provider.insert_into()`, we use `df.clone().write_table(sink_table_name, ...)`.

I hope this is helpful!
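To make the suspicion above concrete, here is a toy model in plain Rust. It is not DataFusion code: `Partitions`, `repartition`, `sink_partition_zero_only`, and `sink_all_partitions` are hypothetical names made up for this sketch, and the round-robin split is an assumption chosen only for illustration. It shows why a sink that reads only partition 0 would look correct with one target partition and lose rows with more than one, if the input really does reach the sink in several partitions.

```rust
/// Toy stand-in for a partitioned input stream: one Vec of rows per partition.
/// (Hypothetical type, not a DataFusion API.)
type Partitions = Vec<Vec<i64>>;

/// Splits rows round-robin across `target_partitions` partitions.
/// Round-robin is an assumption for this sketch only.
fn repartition(rows: &[i64], target_partitions: usize) -> Partitions {
    let n = target_partitions.max(1);
    let mut parts = vec![Vec::new(); n];
    for (i, row) in rows.iter().enumerate() {
        parts[i % n].push(*row);
    }
    parts
}

/// Reads only partition 0, mimicking a hard-coded partition index.
fn sink_partition_zero_only(input: &Partitions) -> Vec<i64> {
    input.first().cloned().unwrap_or_default()
}

/// Reads every partition, which is what a complete insert would need.
fn sink_all_partitions(input: &Partitions) -> Vec<i64> {
    input.iter().flatten().copied().collect()
}
```

In this model, `sink_partition_zero_only(&repartition(&rows, 1))` returns every row, while the same call with three partitions returns only the rows that landed in partition 0. That matches both observations: the issue disappears with one target partition, and rows go missing otherwise. Whether the real plan actually delivers more than one partition to the sink is exactly the open question.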
