[ https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667602#action_12667602 ]
Doug Judd commented on HADOOP-4379:
-----------------------------------
Hi Dhruba,
I tried your suggestion, but got the following exception when trying to open
the file with the 'append' method:
SEVERE: I/O exception while getting length of file '/hypertable/servers/10.0.30.102_38060/log/range_txn/0.log' -
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hypertable/servers/10.0.30.102_38060/log/range_txn/0.log for DFSClient_2003773208 on client 10.0.30.102, because this file is already being created by DFSClient_423127459 on 10.0.30.102
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1088)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1177)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.append(NameNode.java:321)
        at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
I could rewrite the whole thing so that it does not depend on knowing the log length; however, it seems like it ought to be possible to obtain the actual file length in this situation. The semantics of getFileStatus() seem a little odd: sometimes it returns the actual length of the file, and sometimes it returns a stale version of the length. I suppose this is acceptable as long as it is well documented, but it should be possible to obtain the actual length of a file.
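The stale-length behavior can be sketched against the 0.19-era FileSystem client API (the path below is the one from the trace; the configuration and cluster URI are assumed to come from the usual fs.default.name setting, and this is an illustration of the observed semantics, not a test of them):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StaleLengthSketch {
    public static void main(String[] args) throws IOException {
        // Assumes fs.default.name points at the HDFS cluster.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path log = new Path(
            "/hypertable/servers/10.0.30.102_38060/log/range_txn/0.log");

        // getFileStatus() returns the length the NameNode has recorded.
        // While another DFSClient still holds the file open for write, the
        // bytes in the last, partially written block are not yet reflected
        // here, so this value can lag behind what the writer has sync()'d.
        long reportedLen = fs.getFileStatus(log).getLen();
        System.out.println("NameNode-reported length: " + reportedLen);

        // Calling append() while the original writer still holds the lease
        // fails with AlreadyBeingCreatedException, as in the trace above:
        // fs.append(log);  // throws if DFSClient_423127459 has not released
    }
}
```

A hypothetical FileSystem.length(Path) that asked the primary datanode for the size of the last block would sidestep the stale NameNode metadata entirely.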
Would it be possible to add a FileSystem::length(Path path) method that returns
the accurate file length by fetching the size of the last block from the
primary datanode?
- Doug
> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
> Key: HADOOP-4379
> URL: https://issues.apache.org/jira/browse/HADOOP-4379
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: dhruba borthakur
> Fix For: 0.19.1
>
> Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt,
> fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, Reader.java,
> Reader.java, Writer.java, Writer.java
>
>
> In the append design doc
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before
> the reader opened the file
> However, this feature is not yet implemented. Note that the operation
> 'flushed' is now called "sync".
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.