[ https://issues.apache.org/jira/browse/FLUME-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426372#comment-13426372 ]

Yongkun Wang commented on FLUME-1391:
-------------------------------------

Hi guys, thanks for the reviews and comments. 

I tested this patch with the Flume 1.2.0 release. It works well with Hadoop 
0.20.205.0 and with the latest Hadoop release, 1.0.3.

However, this patch alone is not enough to make Flume 1.2.0 work with Hadoop 
0.20.2: the HDFS sink cannot start because of the large changes to Hadoop 
security between 0.20.2 and 0.20.205.0+, so additional patches are needed. 
I will keep those extra patches internal for our current Hadoop cluster and 
wait for a Hadoop upgrade before moving to the latest Flume.
                
> Use sync() instead of syncFs() in HDFS Sink to be compatible with hadoop 0.20.2
> -------------------------------------------------------------------------------
>
>                 Key: FLUME-1391
>                 URL: https://issues.apache.org/jira/browse/FLUME-1391
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.1.0
>            Reporter: Yongkun Wang
>            Assignee: Yongkun Wang
>              Labels: hadoop
>             Fix For: v1.3.0
>
>         Attachments: HDFSSink-for-hadoop-0.20.2.patch
>
>
> For the HDFS sink, syncFs() is called in HDFSSequenceFile. But syncFs() is 
> not available in the legacy hadoop 0.20.2, which may still be a widely used 
> version. The sync() method is available in all hadoop versions, and in 
> hadoop's SequenceFile, syncFs() itself delegates to sync() on the underlying 
> output stream:
> {code}
>     /** create a sync point */
>     public void sync() throws IOException {
>       if (sync != null && lastSyncPos != out.getPos()) {
>         out.writeInt(SYNC_ESCAPE);                // mark the start of the sync
>         out.write(sync);                          // write sync
>         lastSyncPos = out.getPos();               // update lastSyncPos
>       }
>     }
>     /** flush all currently written data to the file system */
>     public void syncFs() throws IOException {
>       if (out != null) {
>         out.sync();                               // flush contents to file system
>       }
>     }
> {code}
> Therefore, using sync() in HDFSSequenceFile may be better.
> {code}
>   @Override
>   public void sync() throws IOException {
>     //writer.syncFs(); //for hadoop 0.20.205.0+
>     writer.sync(); //support hadoop 0.20.2+
>   }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
