[ https://issues.apache.org/jira/browse/HIVE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13775834#comment-13775834 ]
Shuaishuai Nie commented on HIVE-4773:
--------------------------------------

Thanks [~ekoifman] and [~hari.s] for the comments.

- Eugene
I don't think it is a problem that the watcher first assigns stderr/stdout to 'out' and then reassigns 'out' to 'statusdir'. Only the last assignment of 'out' matters. The fix overrides close() so that stdout/stderr are not closed by writer.close() when 'out' actually points to stdout/stderr at the time of the call.

- Hari
1. I am not sure why close() should immediately close if flush() does not perform the same thing.
As I mentioned in the earlier comment, according to the book "Hadoop: The Definitive Guide", flush() does not ensure that the content of the stream is written to the file. On a distributed file system, the content won't be written to the file if a block is not filled.

2. Inside run() of Watcher why do you need to create a new object using PrintWriter writer = new PrintWriter(out);
I didn't change this in my patch. It is in the original code base. I think it is needed for the format of the log in the output file.

3. Even if you add CustomFilterOutputStream class, why do you need to add flush() inside close(). This looks like you are flushing twice.
This flush() is not necessary here. It is there just in case this class is used somewhere else where the flush may have an effect.

4. Do you necessarily need to make CustomFilterOutputStream class public. It doesnt look like its used elsewhere.
For now it is not used anywhere else, so I think it is OK to change it to protected.
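To illustrate the close() override discussed in this comment, here is a minimal sketch of a stream wrapper that flushes on close() but leaves the underlying stream open. It is not the code from the attached patch: the class names NonClosingOutputStream and NonClosingDemo are hypothetical, and System.out stands in for whatever 'out' points to in the Watcher.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

// Hypothetical wrapper for illustration; not the class from the HIVE-4773 patch.
class NonClosingOutputStream extends FilterOutputStream {
    NonClosingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // FilterOutputStream forwards one byte at a time by default;
        // delegate the whole range to the wrapped stream instead.
        out.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        // Push any buffered bytes down, but do not close the wrapped stream.
        flush();
    }
}

class NonClosingDemo {
    public static void main(String[] args) {
        PrintWriter writer = new PrintWriter(new NonClosingOutputStream(System.out));
        writer.println("written through the wrapper");
        writer.close(); // flushes; System.out itself stays open

        // Still works because the wrapper did not close System.out.
        System.out.println("stdout is still usable");
    }
}
```

With a wrapper like this, writer.close() can be called unconditionally. Whether the wrapper is applied would depend on whether 'out' refers to stdout/stderr or to a file under 'statusdir'. A file stream would still need a real close() so that its content is committed.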
> Templeton intermittently fail to commit output to file system
> -------------------------------------------------------------
>
>                 Key: HIVE-4773
>                 URL: https://issues.apache.org/jira/browse/HIVE-4773
>             Project: Hive
>          Issue Type: Bug
>          Components: WebHCat
>            Reporter: Shuaishuai Nie
>            Assignee: Shuaishuai Nie
>         Attachments: HIVE-4773.1.patch, HIVE-4773.2.patch
>
>
> With ASV as a default FS, we saw instances where output is not fully flushed
> to storage before the Templeton controller process exits. This results in
> stdout and stderr being empty even though the job completed successfully.