alamb commented on issue #6983: URL: https://github.com/apache/arrow-datafusion/issues/6983#issuecomment-1638250822
I made a POC in https://github.com/apache/arrow-datafusion/pull/6984 which demonstrates that the issue is indeed about using more cores to do the write.

However, the POC's implementation of the repartitioning is probably not right. I think the better approach would be to set the target partitions when writing into the memory table. Perhaps this could be done by creating a `LogicalPlan::DmlStatement` for the write and then letting the existing insert machinery do the work, rather than doing a custom "collect".

https://docs.rs/datafusion/latest/datafusion/logical_expr/logical_plan/struct.DmlStatement.html

Marking this as a good first issue, as I think the approach will work well and should be able to follow existing patterns. It was also asked for by a customer.
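To make the "more cores for the write" idea concrete, here is a minimal std-only sketch of writing into a partitioned in-memory table in parallel. It is an illustration under stated assumptions, not DataFusion code: `MemTableSketch`, `Batch`, `write_parallel`, and `row_count` are hypothetical names, and the round-robin split only stands in for whatever repartitioning the real insert machinery would perform once target partitions are set.

```rust
use std::sync::Mutex;
use std::thread;

/// Stand-in for a record batch: just a vector of row values (assumption).
type Batch = Vec<i64>;

/// Hypothetical in-memory table with one independently locked buffer per partition.
struct MemTableSketch {
    partitions: Vec<Mutex<Vec<Batch>>>,
}

impl MemTableSketch {
    /// Create a table with `target_partitions` empty partitions.
    fn new(target_partitions: usize) -> Self {
        assert!(target_partitions > 0, "need at least one partition");
        let partitions = (0..target_partitions)
            .map(|_| Mutex::new(Vec::new()))
            .collect();
        Self { partitions }
    }

    /// Split `input` round-robin into one stream per partition,
    /// then write each stream on its own thread.
    fn write_parallel(&self, input: Vec<Batch>) {
        let n = self.partitions.len();
        let mut streams: Vec<Vec<Batch>> = (0..n).map(|_| Vec::new()).collect();
        for (i, batch) in input.into_iter().enumerate() {
            streams[i % n].push(batch);
        }
        // Scoped threads may borrow `self.partitions`; all are joined before returning.
        thread::scope(|scope| {
            for (partition, stream) in self.partitions.iter().zip(streams) {
                scope.spawn(move || {
                    partition.lock().unwrap().extend(stream);
                });
            }
        });
    }

    /// Total number of rows across all partitions.
    fn row_count(&self) -> usize {
        self.partitions
            .iter()
            .map(|p| p.lock().unwrap().iter().map(Vec::len).sum::<usize>())
            .sum()
    }
}

fn main() {
    let table = MemTableSketch::new(4);
    // Ten batches of three rows each.
    let input: Vec<Batch> = (0..10).map(|i| vec![i; 3]).collect();
    table.write_parallel(input);
    println!("rows written: {}", table.row_count());
}
```

Each partition has its own lock, so writers never contend with each other. The sketch does not model the suggestion above of routing the write through `LogicalPlan::DmlStatement`; it only shows why the number of partitions bounds how many cores the write can use.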
