steveloughran commented on PR #5378: URL: https://github.com/apache/hadoop/pull/5378#issuecomment-1484965578
Not going to merge this just yet; I've been getting complaints about memory use in some jobs during commit. I think I will have to merge the manifest load with the file commit phase, which isn't done right now. The problem there is that directories need to be created before the renames begin; that needs to be optimised so it doesn't duplicate directory creation for every task, but isn't too blocking either. I will write some scale tests first to see whether the OOMs are coming from the committer or from problems with abfs input streams. Null hypothesis: my code.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org