qifanlili opened a new issue, #9231: URL: https://github.com/apache/seatunnel/issues/9231
### Search before asking

- [x] I had searched in the [feature](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22Feature%22) and found no similar feature requirement.

### Description

When a job runs on the Spark engine, once all executors have finished, the Driver performs the rename operations in a single thread, moving the data files from /tmp/seatunnel to the final directory.

This rename step runs serially in a single thread, so it takes a very long time when there are many files. This is especially noticeable with object storage such as COS, which appears to follow the same logic.

<img width="888" alt="Image" src="https://github.com/user-attachments/assets/2acd82dc-67d8-482b-a55d-66570123207c" />

<img width="897" alt="Image" src="https://github.com/user-attachments/assets/82ac4a7a-d349-410a-93f7-7220ec8523e7" />

<img width="956" alt="Image" src="https://github.com/user-attachments/assets/1266c838-a6a1-4a9e-b8d9-c56b6de57c36" />

I have a few questions and suggestions:

1. Why was this step designed to run serially in a single thread? Is it meant to avoid some particular risk?
2. If it were to be optimized, could it follow Alibaba's jindo oss commit and be implemented with Multipart Upload? Or is there a more reasonable approach you would recommend?

### Usage Scenario

_No response_

### Related issues

_No response_

### Are you willing to submit a PR?

- [x] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
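
To make the optimization asked about in the Description more concrete, here is a minimal sketch of replacing a serial rename loop with a bounded thread pool. It is only an illustration under stated assumptions: `ParallelRenameSketch` and `renameAll` are hypothetical names, not SeaTunnel code, and it uses the local filesystem through `java.nio.file` instead of the Hadoop or COS APIs that a real commit would go through.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of a parallel rename step; not SeaTunnel's actual commit code. */
final class ParallelRenameSketch {

    /**
     * Moves every source path (map key) to its target (map value) using a
     * fixed-size thread pool. Waits for all submitted moves and throws on the
     * first failure it observes.
     */
    static void renameAll(Map<Path, Path> moves, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Map.Entry<Path, Path> move : moves.entrySet()) {
                pending.add(pool.submit(() -> {
                    try {
                        Path parent = move.getValue().getParent();
                        if (parent != null) {
                            // Safe to call concurrently for the same directory.
                            Files.createDirectories(parent);
                        }
                        Files.move(move.getKey(), move.getValue());
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }));
            }
            for (Future<?> result : pending) {
                try {
                    result.get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while renaming", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("rename failed", e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Parallelizing the loop does not make the commit atomic: if one move fails, other files may already be in the final directory, so any real change would still need a rollback or retry strategy. That may be relevant to question 1 about the risk a serial design is meant to avoid.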
