[ https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667602#action_12667602 ]
Doug Judd commented on HADOOP-4379:
-----------------------------------
Hi Dhruba,
I tried your suggestion, but got the following exception when trying to open
the file with the 'append' method:
SEVERE: I/O exception while getting length of file '/hypertable/servers/10.0.30.102_38060/log/range_txn/0.log' -
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hypertable/servers/10.0.30.102_38060/log/range_txn/0.log for DFSClient_2003773208 on client 10.0.30.102, because this file is already being created by DFSClient_423127459 on 10.0.30.102
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1088)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1177)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.append(NameNode.java:321)
        at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
I could rewrite the whole thing so that it does not depend on knowing the log length; however, it seems like it ought to be possible to obtain the actual file length in this situation. The semantics of getFileStatus() seem a little odd: sometimes it returns the actual length of the file, and sometimes it returns a stale version of the length. I suppose this is acceptable as long as it is well documented, but it should be possible to obtain the actual length of a file.
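The stale-length behavior can be sketched against the 0.19-era FileSystem client API (the path below is the one from the trace; the configuration and cluster URI are assumed to come from the usual fs.default.name setting, and this is an illustration of the observed semantics, not a test of them):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StaleLengthSketch {
    public static void main(String[] args) throws IOException {
        // Assumes fs.default.name points at the HDFS cluster.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path log = new Path(
            "/hypertable/servers/10.0.30.102_38060/log/range_txn/0.log");

        // getFileStatus() returns the length the NameNode has recorded.
        // While another DFSClient still holds the file open for write, the
        // bytes in the last, partially written block are not yet reflected
        // here, so this value can lag behind what the writer has sync()'d.
        long reportedLen = fs.getFileStatus(log).getLen();
        System.out.println("NameNode-reported length: " + reportedLen);

        // Calling append() while the original writer still holds the lease
        // fails with AlreadyBeingCreatedException, as in the trace above:
        // fs.append(log);  // throws if DFSClient_423127459 has not released
    }
}
```

A hypothetical FileSystem.length(Path) that asked the primary datanode for the size of the last block would sidestep the stale NameNode metadata entirely.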
Would it be possible to add a FileSystem::length(Path path) method that returns
the accurate file length by fetching the size of the last block from the
primary datanode?
- Doug
> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
> Key: HADOOP-4379
> URL: https://issues.apache.org/jira/browse/HADOOP-4379
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: dhruba borthakur
> Fix For: 0.19.1
>
> Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt,
> fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, Reader.java,
> Reader.java, Writer.java, Writer.java
>
>
> In the append design doc
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before
> the reader opened the file
> However, this feature is not yet implemented. Note that the operation
> 'flushed' is now called "sync".
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.