[jira] [Commented] (HDFS-3179) failed to append data, DataStreamer throw an exception, nodes.length != original.length + 1 on single datanode cluster
[ https://issues.apache.org/jira/browse/HDFS-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245476#comment-13245476 ] Zhanwei.Wang commented on HDFS-3179:

@Uma and amith: This seems to be the same issue as HDFS-3091. I configured only one datanode and created a file with the default number of replicas (3). The condition existings(1) <= replication/2 (3/2 == 1) is satisfied, but the failed node cannot be replaced because no extra node exists in the cluster. The HDFS-3091 patch should be applied to the 0.23.2 branch.

failed to append data, DataStreamer throw an exception, nodes.length != original.length + 1 on single datanode cluster

Key: HDFS-3179
URL: https://issues.apache.org/jira/browse/HDFS-3179
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.23.2
Reporter: Zhanwei.Wang
Priority: Critical

Steps to reproduce: create a single-datanode cluster, disable permissions, enable webhdfs, start hdfs, and run the test script.
Expected result: a file named "test" is created and its content is "testtest".
Actual result: hdfs throws an exception on the second append operation.
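The condition described above (replace only when the pipeline has shrunk to half the replication or the write is an append) can be sketched as a standalone predicate. This is a simplified, hypothetical rendering of the DEFAULT ReplaceDatanodeOnFailure behavior, not the actual HDFS source:

```java
// Simplified sketch of the DEFAULT ReplaceDatanodeOnFailure condition.
// Method and class names here are hypothetical.
public class ReplacePolicySketch {
    /** Returns true if a failed datanode in the pipeline should be replaced. */
    static boolean shouldReplace(int replication, int existings,
                                 boolean isAppend, boolean isHflushed) {
        if (replication < 3) {
            return false;                   // small replication: never replace
        }
        if (existings <= replication / 2) {
            return true;                    // pipeline shrank to half or less
        }
        return isAppend || isHflushed;      // appended/flushed data is at risk
    }

    public static void main(String[] args) {
        // Single-datanode cluster, replication 3, append:
        // existings(1) <= replication/2 (3/2 == 1), so a replacement is
        // demanded even though no spare datanode can possibly exist.
        System.out.println(ReplacePolicySketch.shouldReplace(3, 1, true, false));
    }
}
```

With one datanode the predicate fires, the client then looks for a node to add, finds none, and the write fails with the exception shown in this issue.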
{code}
$ ./test.sh
{"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Failed to add a datanode: nodes.length != original.length + 1, nodes=[127.0.0.1:50010], original=[127.0.0.1:50010]"}}
{code}

Log in datanode:

{code}
2012-04-02 14:34:21,058 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Failed to add a datanode: nodes.length != original.length + 1, nodes=[127.0.0.1:50010], original=[127.0.0.1:50010]
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:834)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
2012-04-02 14:34:21,059 ERROR org.apache.hadoop.hdfs.DFSClient: Failed to close file /test
java.io.IOException: Failed to add a datanode: nodes.length != original.length + 1, nodes=[127.0.0.1:50010], original=[127.0.0.1:50010]
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:834)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
{code}

test.sh:

{code}
#!/bin/sh
echo test > test.txt
curl -L -X PUT "http://localhost:50070/webhdfs/v1/test?op=CREATE";
curl -L -X POST -T test.txt "http://localhost:50070/webhdfs/v1/test?op=APPEND";
curl -L -X POST -T test.txt "http://localhost:50070/webhdfs/v1/test?op=APPEND";
{code}

-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3179) failed to append data, DataStreamer throw an exception, nodes.length != original.length + 1 on single datanode cluster
[ https://issues.apache.org/jira/browse/HDFS-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245501#comment-13245501 ] Zhanwei.Wang commented on HDFS-3179:

@Uma and amith: Another question. In this test script I first create a new EMPTY file and then append to it twice. The first append succeeds because the file is empty: to create a pipeline, the stage is PIPELINE_SETUP_CREATE and the policy is not checked. The second append fails because the stage is PIPELINE_SETUP_APPEND and the policy is checked. So, from the user's point of view, the first append succeeds while the second fails. Is that a good idea?

{code}
// get new block from namenode
if (stage == BlockConstructionStage.PIPELINE_SETUP_CREATE) {
  if (DFSClient.LOG.isDebugEnabled()) {
    DFSClient.LOG.debug("Allocating new block");
  }
  nodes = nextBlockOutputStream(src);
  initDataStreaming();
} else if (stage == BlockConstructionStage.PIPELINE_SETUP_APPEND) {
  if (DFSClient.LOG.isDebugEnabled()) {
    DFSClient.LOG.debug("Append to block " + block);
  }
  setupPipelineForAppendOrRecovery(); // the policy is checked here
  initDataStreaming();
}
{code}
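The append path above ends in the sanity check that produces the reported error: after asking the namenode for an additional datanode, the client verifies that exactly one node was added. A standalone sketch of that check (hypothetical simplification of DFSOutputStream$DataStreamer.findNewDatanode, not the actual source):

```java
import java.io.IOException;
import java.util.Arrays;

public class FindNewDatanodeSketch {
    /** Returns the datanode present in nodes but not in original. */
    static String findNewDatanode(String[] original, String[] nodes)
            throws IOException {
        // The sanity check behind the error message in this issue: the
        // namenode must have returned exactly one additional datanode.
        if (nodes.length != original.length + 1) {
            throw new IOException("Failed to add a datanode:"
                + " nodes.length != original.length + 1"
                + ", nodes=" + Arrays.toString(nodes)
                + ", original=" + Arrays.toString(original));
        }
        for (String n : nodes) {
            boolean seen = false;
            for (String o : original) {
                if (o.equals(n)) { seen = true; break; }
            }
            if (!seen) {
                return n; // the newly added node
            }
        }
        throw new IOException("new datanode not found");
    }
}
```

On a single-datanode cluster the namenode can only return the node already in the pipeline, so `nodes` equals `original` and the check throws, which matches the log in the issue description.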
[jira] [Commented] (HDFS-3091) Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.
[ https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245515#comment-13245515 ] Zhanwei.Wang commented on HDFS-3091:

Hi Nicholas,
{quote} I would say the failures are expected. The feature is to guarantee the number of replicas that the user is asking. However, the cluster is too small that the guarantee is impossible. It makes sense to fail the write requests. {quote}
I agree with you, but have a look at the code. In HDFS-3179, I first create an EMPTY file and append twice; the first append finishes successfully, but the second fails, since there is only one datanode and the number of replicas is 3. Is that what you want to see? I think the policy check should fail on the first write to the file.

Update the usage limitations of ReplaceDatanodeOnFailure policy in the config description for the smaller clusters.

Key: HDFS-3091
URL: https://issues.apache.org/jira/browse/HDFS-3091
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node, hdfs client, name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Tsz Wo (Nicholas), SZE
Fix For: 2.0.0
Attachments: h3091_20120319.patch

While verifying the HDFS-1606 feature, I observed a couple of issues. Presently the ReplaceDatanodeOnFailure policy is satisfied even when there are not enough datanodes in the cluster to replace with, which results in write failure.
{quote}
12/03/13 14:27:12 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Failed to add a datanode: nodes.length != original.length + 1, nodes=[xx.xx.xx.xx:50010], original=[xx.xx.xx.xx1:50010]
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:834)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741)
	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416)
{quote}

Let's take some cases:

1) Replication factor 3, cluster size also 3, and unfortunately the pipeline drops to 1. ReplaceDatanodeOnFailure is satisfied because *existings(1) <= replication/2 (3/2 == 1)*. But when it tries to find a new node as a replacement, it obviously cannot find one, and the sanity check fails. This results in a write failure.

2) Replication factor 10 (the user accidentally sets the replication factor higher than the cluster size), and the cluster has only 5 datanodes. Here, even a single node failure makes the write fail for the same reason: the pipeline is at most 5 nodes wide, and after one datanode is killed, existings is 4, so *existings(4) <= replication/2 (10/2 == 5)* is satisfied; obviously no extra node exists in the cluster to replace with. This results in a write failure.

3) sync-related operations also fail in these situations (will post the clear scenarios).
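The two numeric cases above can be checked mechanically. This is a hypothetical helper for illustration (in HDFS the logic is split between ReplaceDatanodeOnFailure and DFSOutputStream); it assumes that datanodes which already failed out of the pipeline cannot be reused as replacements:

```java
public class PolicyCases {
    /**
     * True when replacement is demanded but impossible: the policy fires,
     * yet every cluster node was already in the original pipeline, so no
     * spare node exists to add.
     */
    static boolean writeFails(int replication, int clusterSize,
                              int originalPipeline, int existings) {
        boolean policyFires = existings <= replication / 2;
        int spareNodes = clusterSize - originalPipeline; // failed nodes unusable
        return policyFires && spareNodes == 0;
    }

    public static void main(String[] args) {
        // Case 1: replication 3, cluster 3, pipeline 3 -> drops to 1.
        System.out.println(PolicyCases.writeFails(3, 3, 3, 1));
        // Case 2: replication 10, cluster 5, pipeline 5 -> drops to 4.
        System.out.println(PolicyCases.writeFails(10, 5, 5, 4));
    }
}
```

With two spare nodes available (say replication 3 in a 5-node cluster), the same pipeline failure is recoverable and the write proceeds.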
[jira] [Commented] (HDFS-3179) failed to append data, DataStreamer throw an exception, nodes.length != original.length + 1 on single datanode cluster
[ https://issues.apache.org/jira/browse/HDFS-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245755#comment-13245755 ] Zhanwei.Wang commented on HDFS-3179:

I totally agree with you about the problem of one datanode with replication 3; I think this kind of operation should fail, or at least produce a warning. In my opinion, the purpose of the policy check is to make sure there is no potential data loss. In this one-datanode, three-replica case, although failing the first append would not cause data loss, the data appended after the first successful append is in danger, because there is only one replica rather than the 3 the user expects, and there is no warning to tell the user the truth. My suggestion is to make the first write to the empty file fail if there are not enough datanodes; in other words, make the policy check stricter. And make the error message friendlier than "nodes.length != original.length + 1".
[jira] [Commented] (HDFS-2656) Implement a pure c client based on webhdfs
[ https://issues.apache.org/jira/browse/HDFS-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243664#comment-13243664 ] Zhanwei.Wang commented on HDFS-2656:

Hi donal, good question. Performance is an important issue, and the lib needs to be designed and implemented carefully. On the lib side, I use libcurl to handle the HTTP protocol, with a buffer in the lib to optimize performance. The same design was used in another of our projects, and the performance of libcurl is OK. For transmission, HTTP uses a TCP connection. When reading data from the server, only the raw data is transferred. When writing to the server, I use chunked transfer encoding, and the overhead is just a small header per chunk. On the server side, performance depends on the Jetty server. In the previous prototype, the Jetty server or webhdfs had a performance problem when I used HTTP/1.1 to read data from the server, but the problem could not be reproduced when I switched to HTTP/1.0. I did a simple performance test on the previous prototype, and more performance testing is planned. Currently, writing to hdfs may still fail under heavy workload; I am not sure whether it is a bug in my code or in hdfs, and I am working on it (it seems not to be my bug -_-). The doc is being written, and the function test is finished. As soon as I get permission to open-source it and finish the doc, you can test it yourself. I think it will not take too long.

Implement a pure c client based on webhdfs

Key: HDFS-2656
URL: https://issues.apache.org/jira/browse/HDFS-2656
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Zhanwei.Wang

Currently, the implementation of libhdfs is based on JNI. The overhead of the JVM seems a little big, and libhdfs also cannot be used in an environment without hdfs. It seems a good idea to implement a pure C client by wrapping webhdfs. It could also be used to access different versions of hdfs.
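The claim that chunked transfer encoding costs "just a small header per chunk" is easy to quantify: each HTTP/1.1 chunk is framed as a hexadecimal size line plus two CRLF pairs, with a five-byte terminator at the end. A small worked calculation (illustrative helper, in Java rather than the client's C):

```java
public class ChunkedOverhead {
    /**
     * Bytes of HTTP/1.1 chunked-encoding framing for a payload sent in
     * fixed-size chunks: per chunk, the hex size digits plus "\r\n" before
     * the data and "\r\n" after it; plus the final "0\r\n\r\n" terminator.
     */
    static long framingBytes(long payload, int chunkSize) {
        long overhead = 5; // "0\r\n\r\n" terminator
        long remaining = payload;
        while (remaining > 0) {
            int n = (int) Math.min(remaining, chunkSize);
            overhead += Long.toHexString(n).length() + 4;
            remaining -= n;
        }
        return overhead;
    }

    public static void main(String[] args) {
        long payload = 64L * 1024 * 1024;           // 64 MB in 32 KB chunks
        long oh = ChunkedOverhead.framingBytes(payload, 32 * 1024);
        System.out.println(oh + " framing bytes ("
            + (100.0 * oh / payload) + "% of payload)");
    }
}
```

For 64 MB sent in 32 KB chunks the framing is on the order of 16 KB, a few hundredths of a percent, which supports the design choice above.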
[jira] [Commented] (HDFS-2656) Implement a pure c client based on webhdfs
[ https://issues.apache.org/jira/browse/HDFS-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238651#comment-13238651 ] Zhanwei.Wang commented on HDFS-2656:

Hi everyone, the code this jira proposes is almost finished and will be available soon. It is more complicated than I thought and took more time. The following is the current status of the code.

Status update:
1. Finished: most functions in hdfs.h of libhdfs are implemented.
2. Ongoing: function test, unit test, documentation.
3. Todo: kerberos support, http proxy support, some performance improvements.
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236298#comment-13236298 ] Zhanwei.Wang commented on HDFS-3107:

A problem with truncate is visibility. Since truncating a file requires acquiring the lease first, we do not need to worry about concurrent writes, but we do need to take care of concurrent reads while a file is being truncated. An hdfs client buffers some block info when it opens and reads a file, and those blocks may be truncated. Furthermore, the socket and the hdfs client may buffer some data that is about to be truncated. In the first edition of my truncate prototype, if the block or data the client requires has been truncated, the datanode throws an exception, and the client updates the metadata to check whether the data was truncated or a real error occurred. But this cannot prevent the client from reading already-buffered data. Any comments and suggestions?

HDFS truncate

Key: HDFS-3107
URL: https://issues.apache.org/jira/browse/HDFS-3107
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, name-node
Reporter: Lei Chang
Attachments: HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf
Original Estimate: 1,344h
Remaining Estimate: 1,344h

Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard Posix operation), which is the reverse operation of append. This makes upper-layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS.
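The exception-then-refresh protocol described above can be sketched as a single client-side decision: after a datanode rejects a read, the client re-fetches the file metadata and decides whether the requested range was truncated away or a genuine I/O error occurred. This is a hypothetical illustration of the prototype's idea, not its actual code:

```java
public class TruncateReadCheck {
    /**
     * After a datanode exception, decide whether the failed read is
     * explained by a concurrent truncate. refreshedLength is the file
     * length re-fetched from the namenode after the failure.
     */
    static boolean wasTruncated(long requestedOffset, long refreshedLength) {
        // If the refreshed length no longer covers the requested offset,
        // the data was truncated; otherwise a real error happened and the
        // client should surface it instead of retrying.
        return requestedOffset >= refreshedLength;
    }

    public static void main(String[] args) {
        // A read at offset 100 fails; the file is now only 50 bytes long.
        System.out.println(TruncateReadCheck.wasTruncated(100, 50));
    }
}
```

As the comment notes, this check alone cannot help with data the client had already buffered before the truncate, which is exactly the open visibility question.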
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236362#comment-13236362 ] Zhanwei.Wang commented on HDFS-3107:

To add more detail to my previous question: how do we define what a reader may see of a file that is being truncated? That is the visibility problem. If a file is opened and read just before truncation, should the truncated data be visible? Or does it depend on how far the truncation has progressed? And what if a file is opened before truncation and read after truncation?
[jira] [Commented] (HDFS-3100) failed to append data using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230923#comment-13230923 ] Zhanwei.Wang commented on HDFS-3100:

I also ran this test on hdfs 1.0.1 and the script finished successfully, but I found some strange things in the datanode log.
1) I also got "DataBlockScanner: Verification failed". Since my hdfs is a single-node cluster running on the local network and local disk, nothing should fail at all.
2) I got lots of "DataNode: Client calls recoverBlock" and I have no idea what happened.

{code}
2012-03-15 22:43:33,572 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_3505027625176300242_3086 src: /127.0.0.1:63879 dest: /127.0.0.1:50010
2012-03-15 22:43:33,590 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification failed for blk_8647935099647661204_1101. Its ok since it not in datanode dataset anymore.
2012-03-15 22:43:33,659 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen Block for append blk_3505027625176300242_3086
2012-03-15 22:43:33,662 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: setBlockPosition trying to set position to 275712 for block blk_3505027625176300242_3086 which is not a multiple of bytesPerChecksum 512
2012-03-15 22:43:33,662 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: computePartialChunkCrc sizePartialChunk 256 block blk_3505027625176300242_3086 offset in block 275456 offset in metafile 2159
2012-03-15 22:43:33,662 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Read in partial CRC chunk from disk for block blk_3505027625176300242_3086
2012-03-15 22:43:33,664 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /127.0.0.1:63879, dest: /127.0.0.1:50010, bytes: 307712, op: HDFS_WRITE, cliID: DFSClient_110493321, offset: 0, srvID: DS-1576952563-10.64.55.158-50010-1331870286875, blockid: blk_3505027625176300242_3086, duration: 2816000
2012-03-15 22:43:33,664 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_3505027625176300242_3086 terminating
2012-03-15 22:43:33,677 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Client calls recoverBlock(block=blk_3505027625176300242_3086, targets=[127.0.0.1:50010])
{code}

failed to append data using webhdfs

Key: HDFS-3100
URL: https://issues.apache.org/jira/browse/HDFS-3100
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.23.1
Reporter: Zhanwei.Wang
Assignee: Tsz Wo (Nicholas), SZE
Attachments: hadoop-wangzw-datanode-ubuntu.log, hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch

STEP:
1. Deploy a single-node hdfs 0.23.1 cluster and configure hdfs as follows: A) enable webhdfs; B) enable append; C) disable permissions.
2. Start hdfs.
3. Run the attached test script.

RESULT:
Expected: a file named testFile is created and populated with 32K * 5000 zeros, and HDFS is OK.
Actual: the script cannot finish; the file is created but not populated as expected, because the append operation fails. The datanode log shows that the block scanner reports a bad replica and the namenode decides to delete it. Since it is a single-node cluster, the append fails. The script fails every time, which makes no sense. Datanode and namenode logs are attached.
[jira] [Commented] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230891#comment-13230891 ] Zhanwei.Wang commented on HDFS-3101:

Well done, fixed very fast.

cannot read empty file using webhdfs

Key: HDFS-3101
URL: https://issues.apache.org/jira/browse/HDFS-3101
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client
Affects Versions: 0.23.1
Reporter: Zhanwei.Wang
Assignee: Tsz Wo (Nicholas), SZE
Fix For: 0.24.0, 1.1.0, 0.23.2, 1.0.2, 0.23.3
Attachments: h3101_20120315.patch, h3101_20120315_branch-1.patch

STEP:
1. Create a new EMPTY file.
2. Read it using webhdfs.

RESULT:
Expected: an empty file.
Actual: {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Offset=0 out of the range [0, 0); OPEN, path=/testFile"}}

First of all, [0, 0) is not a valid range, and I think reading an empty file should be OK.
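The error above comes from an offset range check that excludes the file length itself, so offset 0 on a zero-length file is rejected. A hypothetical standalone version of the corrected check (the actual fix is in the attached patches; names here are illustrative):

```java
import java.io.IOException;

public class OffsetCheck {
    /**
     * Validate a read offset against the file length. Treating
     * offset == fileLength as valid (an immediate EOF) is what makes
     * reading an empty file (offset 0, length 0) succeed; the buggy
     * behavior required offset to lie in [0, fileLength).
     */
    static void checkOffset(long offset, long fileLength) throws IOException {
        if (offset < 0 || offset > fileLength) {
            throw new IOException("Offset=" + offset
                + " out of the range [0, " + fileLength + "]");
        }
    }

    public static void main(String[] args) throws IOException {
        OffsetCheck.checkOffset(0, 0); // empty file: no exception
        System.out.println("empty-file read accepted");
    }
}
```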
[jira] [Commented] (HDFS-2656) Implement a pure c client based on webhdfs
[ https://issues.apache.org/jira/browse/HDFS-2656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166815#comment-13166815 ] Zhanwei.Wang commented on HDFS-2656:

I am working on a new pure C hdfs client named libchdfs. It has almost the same interface as libhdfs. Any comments are welcome.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira