Neer393 commented on PR #5994: URL: https://github.com/apache/hive/pull/5994#issuecomment-3117423557
What I have done here: we wanted to stop writing empty commitTask files, but simply skipping them was not a solution. Skipping caused issues on the read side, because we used to read the files in order, i.e. from 0 to numTasks, where numTasks is the number of mapper tasks for a mapper job or the number of reducer tasks for a reducer job.

So, to avoid creating empty commitTask files, we need to keep track of the files that are non-empty and were actually committed. For that I created a commitTasksInfo file, which stores the list of files that were saved. When we want to read, we take the committed files from this file, so we no longer have to create unwanted commitTask files.

Please review the idea as well as the PR @deniskuzZ @abstractdog
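The idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the PR: the class name `CommitTasksInfoSketch`, the `task-N.commitTask` naming scheme, the plain-text one-name-per-line format of the info file, and the use of the local file system through `java.nio.file` are all hypothetical and were chosen to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class CommitTasksInfoSketch {

    // Hypothetical name of the index file; the real name and format may differ.
    static final String INFO_FILE = "commitTasksInfo";

    // Hypothetical naming scheme for per-task commit files.
    static String taskFileName(int taskId) {
        return "task-" + taskId + ".commitTask";
    }

    // Write side: create a file only for tasks with non-empty output,
    // and record the names of the created files in the info file.
    static void commitTasks(Path jobDir, List<String> taskOutputs) throws IOException {
        List<String> committed = new ArrayList<>();
        for (int taskId = 0; taskId < taskOutputs.size(); taskId++) {
            String output = taskOutputs.get(taskId);
            if (output.isEmpty()) {
                continue; // nothing to commit, so no file is created for this task
            }
            String name = taskFileName(taskId);
            Files.writeString(jobDir.resolve(name), output);
            committed.add(name);
        }
        // One committed file name per line.
        Files.write(jobDir.resolve(INFO_FILE), committed);
    }

    // Read side: follow the info file instead of probing indices 0..numTasks,
    // so gaps left by skipped (empty) tasks cannot cause a missing-file error.
    static List<String> readCommitted(Path jobDir) throws IOException {
        List<String> contents = new ArrayList<>();
        for (String name : Files.readAllLines(jobDir.resolve(INFO_FILE))) {
            contents.add(Files.readString(jobDir.resolve(name)));
        }
        return contents;
    }
}
```

In this sketch, a job with three tasks where the middle one produced nothing ends up with two task files plus the info file, and the reader never tries to open a file for the skipped task. The trade-off is one extra small file per job in exchange for not writing an empty file per idle task.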