[ 
https://issues.apache.org/jira/browse/HIVE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13775834#comment-13775834
 ] 

Shuaishuai Nie commented on HIVE-4773:
--------------------------------------

Thanks [~ekoifman] and [~hari.s] for the comments.
-Eugene, I don't think it is a problem that the watcher first assigns 
stderr/stdout to 'out' and then reassigns 'out' to 'statusdir'. Only the last 
assignment to 'out' matters. The fix ensures that stdout/stderr are not closed 
by writer.close(): close() is overridden for the case where 'out' actually 
points to stdout/stderr at the time writer.close() is called.
-Hari
1. I am not sure why close() should immediately close if flush() does not 
perform the same thing.
As I mentioned in the earlier comment, according to the book "Hadoop: The 
Definitive Guide", flush() does not ensure that the content of the stream is 
written to the file. On a distributed file system, the data is not written to 
the file if a block has not been filled.
2. Inside run() of Watcher why do you need to create a new object using 
PrintWriter writer = new PrintWriter(out);
I didn't change it in my patch. It is in the original code base. I think it is 
needed for the format of the log in the output file.
3. Even if you add CustomFilterOutputStream class, why do you need to add 
flush() inside close(). This looks like you are flushing twice.
This flush() is not strictly necessary here. I added it just in case the class 
is used somewhere else where the flush may matter.
4. Do you necessarily need to make CustomFilterOutputStream class public. It 
doesn't look like it's used elsewhere.
For now it is not used anywhere else, so I think it is OK to change it to 
protected.
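
To make the close() discussion above concrete, here is a minimal sketch of the 
idea, not the code from the attached patch. The names KeepOpenOutputStream, 
TrackingStream, KeepOpenDemo and guard() are hypothetical and chosen only for 
illustration. The sketch assumes the wrapper extends java.io.FilterOutputStream, 
whose default close() already calls flush() before closing the wrapped stream. 
Here close() is overridden so that it only flushes and leaves the wrapped stream 
open.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

/** Hypothetical wrapper: close() flushes but leaves the wrapped stream open. */
class KeepOpenOutputStream extends FilterOutputStream {
    KeepOpenOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Delegate in bulk; FilterOutputStream would otherwise write byte by byte.
        out.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        // Push pending bytes down, but deliberately skip out.close().
        flush();
    }
}

/** Illustration-only sink that records whether close() reached it. */
class TrackingStream extends ByteArrayOutputStream {
    boolean closed = false;

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }
}

class KeepOpenDemo {
    /** Wraps 'out' only when it is one of the process-wide streams. */
    static OutputStream guard(OutputStream out) {
        if (out == System.out || out == System.err) {
            return new KeepOpenOutputStream(out);
        }
        return out;
    }

    public static void main(String[] args) {
        TrackingStream sink = new TrackingStream();
        PrintWriter writer = new PrintWriter(new KeepOpenOutputStream(sink));
        writer.println("job output");
        writer.close(); // flushes the writer chain; the sink itself stays open

        System.out.println("sink closed: " + sink.closed);
        System.out.println("sink content: " + sink.toString().trim());
    }
}
```

With a wrapper like this, writer.close() can be called unconditionally. When 
'out' is stdout/stderr the stream stays usable for the rest of the process, and 
when 'out' is a file in 'statusdir' the real close() still runs.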
                
> Templeton intermittently fail to commit output to file system
> -------------------------------------------------------------
>
>                 Key: HIVE-4773
>                 URL: https://issues.apache.org/jira/browse/HIVE-4773
>             Project: Hive
>          Issue Type: Bug
>          Components: WebHCat
>            Reporter: Shuaishuai Nie
>            Assignee: Shuaishuai Nie
>         Attachments: HIVE-4773.1.patch, HIVE-4773.2.patch
>
>
> With ASV as a default FS, we saw instances where output is not fully flushed 
> to storage before the Templeton controller process exits. This results in 
> stdout and stderr being empty even though the job completed successfully.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
