steveloughran commented on PR #5378:
URL: https://github.com/apache/hadoop/pull/5378#issuecomment-1484965578

   Not going to merge this just yet; I've been getting complaints about memory
use in some jobs during commit. I think I will have to merge manifest loading
with the file commit phase, something that isn't done right now.
   
   The problem there is that directories need to be created before the renames
begin; that needs to be optimised so directory creation is not duplicated for
every task, without being too blocking either.
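
One way the "don't duplicate dir creation for every task" part could look is sketched below. This is only an illustration under assumptions, not the committer's actual code: the class `DirDedupSketch` and the method `dirsToCreate` are hypothetical names, and directories are plain strings rather than filesystem paths. Each task manifest is filtered against a shared concurrent set, so a directory is handed out for creation at most once even when manifests are processed in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; names and structure are assumptions, not Hadoop code. */
class DirDedupSketch {

    /** Directories already claimed by any task manifest seen so far. */
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    /**
     * Returns only the directories from one task manifest that no earlier
     * manifest has claimed. Set.add() on the concurrent set is atomic, so
     * concurrent callers never both receive the same directory.
     */
    List<String> dirsToCreate(List<String> manifestDirs) {
        List<String> fresh = new ArrayList<>();
        for (String dir : manifestDirs) {
            if (seen.add(dir)) {
                fresh.add(dir);
            }
        }
        return fresh;
    }
}
```

The sketch covers de-duplication only. It does not guarantee that a directory claimed by one thread already exists when another thread starts its renames, so the "not too blocking" ordering problem is left open here.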
   
   I will write some scale tests first to see whether the OOMs are coming from
the committer or from problems with abfs input streams. Null hypothesis: my code.



