[jira] [Updated] (HDFS-6300) Allows to run multiple balancer simultaneously
[ https://issues.apache.org/jira/browse/HDFS-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-6300: Fix Version/s: (was: 2.6.0) 2.7.0 Allows to run multiple balancer simultaneously -- Key: HDFS-6300 URL: https://issues.apache.org/jira/browse/HDFS-6300 Project: Hadoop HDFS Issue Type: Bug Components: balancer & mover Reporter: Rakesh R Assignee: Rakesh R Fix For: 2.7.0 Attachments: HDFS-6300.patch The Javadoc of Balancer.java says it will not allow a second balancer to run if the first one is in progress. But I've noticed that multiple balancers can run together, and the balancer.id implementation is not safeguarding against this. {code} * <li>Another balancer is running. Exiting... {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
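The guard the Javadoc describes can be sketched as an exclusive create of the id file: whoever creates it first wins, and a second instance must exit. This is a local-filesystem analogy only (the real Balancer writes balancer.id into HDFS; names here are illustrative):

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Local-filesystem analogy of the balancer.id guard: the second attempt to
// create the id file exclusively fails, which is what should tell a second
// balancer instance to exit.
public class BalancerIdLock {
    static boolean tryAcquire(Path idFile) {
        try {
            Files.createFile(idFile);  // atomic create-if-absent
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;              // "Another balancer is running. Exiting..."
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path idFile = Files.createTempDirectory("balancer").resolve("balancer.id");
        System.out.println(tryAcquire(idFile)); // true: first balancer wins
        System.out.println(tryAcquire(idFile)); // false: second must exit
    }
}
```

The bug report amounts to the observation that the real balancer.id handling does not provide this create-if-absent exclusivity.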
[jira] [Updated] (HDFS-3907) Allow multiple users for local block readers
[ https://issues.apache.org/jira/browse/HDFS-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3907: Fix Version/s: (was: 2.6.0) 2.7.0 Allow multiple users for local block readers Key: HDFS-3907 URL: https://issues.apache.org/jira/browse/HDFS-3907 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.7.0 Attachments: hdfs-3907.txt The {{dfs.block.local-path-access.user}} config added in HDFS-2246 only supports a single user. However, as long as blocks are group-readable by more than one user, the feature could be used by multiple users; to support this we just need to allow multiple users to be configured. In practice this also allows us to support HBase, where the client (RS) runs as the hbase system user and the DN runs as the hdfs system user. I think this should work in secure mode as well, since we're not using impersonation in the HBase case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
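If the patch takes the straightforward route of turning the single-user setting into a comma-separated list (an assumption; the actual property grammar is defined by the attached patch), the HBase-plus-HDFS case described above might be configured as:

```xml
<!-- hdfs-site.xml: hypothetical multi-user form of the HDFS-2246 setting -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase,hdfs</value>
</property>
```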
[jira] [Updated] (HDFS-5160) MiniDFSCluster webui does not work
[ https://issues.apache.org/jira/browse/HDFS-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5160: Fix Version/s: (was: 2.6.0) 2.7.0 MiniDFSCluster webui does not work -- Key: HDFS-5160 URL: https://issues.apache.org/jira/browse/HDFS-5160 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Fix For: 2.7.0 The webui does not work, when going to http://localhost:50070 you get: {code} Directory: / webapps/ 102 bytes Sep 4, 2013 9:32:55 AM {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6940) Initial refactoring to allow ConsensusNode implementation
[ https://issues.apache.org/jira/browse/HDFS-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-6940: Fix Version/s: (was: 2.6.0) 2.7.0 Initial refactoring to allow ConsensusNode implementation - Key: HDFS-6940 URL: https://issues.apache.org/jira/browse/HDFS-6940 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.6-alpha, 2.5.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 2.7.0 Attachments: HDFS-6940.patch Minor refactoring of FSNamesystem to open private methods that are needed for CNode implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7273) Add start and stop wrapper scripts for mover
[ https://issues.apache.org/jira/browse/HDFS-7273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-7273: Fix Version/s: (was: 2.6.0) 2.7.0 Add start and stop wrapper scripts for mover Key: HDFS-7273 URL: https://issues.apache.org/jira/browse/HDFS-7273 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Affects Versions: 2.6.0 Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Fix For: 2.7.0 Similar to the balancer, we need start/stop mover scripts to run the data migration tool as a daemon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3761) Uses of AuthenticatedURL should set the configured timeouts
[ https://issues.apache.org/jira/browse/HDFS-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3761: Fix Version/s: (was: 2.6.0) 2.7.0 Uses of AuthenticatedURL should set the configured timeouts --- Key: HDFS-3761 URL: https://issues.apache.org/jira/browse/HDFS-3761 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha, 2.0.2-alpha Reporter: Alejandro Abdelnur Priority: Critical Fix For: 2.7.0 Similar to {{URLUtils}}, the {{AuthenticatedURL}} should set the configured timeouts in URL connection instances. HADOOP-8644 is introducing a {{ConnectionConfigurator}} interface to the {{AuthenticatedURL}} class; it should be done via that interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4066) Fix compile warnings for libwebhdfs
[ https://issues.apache.org/jira/browse/HDFS-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4066: Fix Version/s: (was: 2.6.0) 2.7.0 Fix compile warnings for libwebhdfs --- Key: HDFS-4066 URL: https://issues.apache.org/jira/browse/HDFS-4066 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Fix For: 2.7.0 The compile of libwebhdfs still generates a bunch of warnings, which need to be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7067) ClassCastException while using a key created by keytool to create encryption zone.
[ https://issues.apache.org/jira/browse/HDFS-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-7067: Fix Version/s: (was: 2.6.0) 2.7.0 ClassCastException while using a key created by keytool to create encryption zone. --- Key: HDFS-7067 URL: https://issues.apache.org/jira/browse/HDFS-7067 Project: Hadoop HDFS Issue Type: Bug Components: encryption Affects Versions: 2.6.0 Reporter: Yi Yao Priority: Minor Fix For: 2.7.0 I'm using transparent encryption. If I create a key for the KMS keystore via keytool and use that key to create an encryption zone, I get a ClassCastException rather than an exception with a decent error message. I know we should use 'hadoop key create' to create the key; it would be better to provide a decent error message reminding the user of the right way to create a KMS key. [LOG] ERROR[user=hdfs] Method:'GET' Exception:'java.lang.ClassCastException: javax.crypto.spec.SecretKeySpec cannot be cast to org.apache.hadoop.crypto.key.JavaKeyStoreProvider$KeyMetadata' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
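The decent error message the reporter asks for amounts to checking the stored object's type before casting and failing with guidance instead. A minimal sketch, with a stand-in KeyMetadata class since the real one is an inner class of JavaKeyStoreProvider (all names here are illustrative, not the actual patch):

```java
import java.io.IOException;

// Sketch only: turn the ClassCastException into an actionable message.
// KeyMetadata stands in for JavaKeyStoreProvider$KeyMetadata.
public class KeyCastCheck {
    static class KeyMetadata {}

    static KeyMetadata asMetadata(String alias, Object stored) throws IOException {
        if (!(stored instanceof KeyMetadata)) {
            throw new IOException("Key '" + alias + "' was not created by "
                + "'hadoop key create'; keytool-created keys are not valid KMS keys.");
        }
        return (KeyMetadata) stored;
    }

    public static void main(String[] args) {
        try {
            asMetadata("mykey", new Object()); // simulates a keytool-created SecretKeySpec
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```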
[jira] [Updated] (HDFS-5687) Problem in accessing NN JSP page
[ https://issues.apache.org/jira/browse/HDFS-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5687: Fix Version/s: (was: 2.6.0) 2.7.0 Problem in accessing NN JSP page Key: HDFS-5687 URL: https://issues.apache.org/jira/browse/HDFS-5687 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: sathish Assignee: sathish Priority: Minor Fix For: 2.7.0 Attachments: HDFS-5687-0001.patch In the NN UI, after clicking the Browse File System link and then clicking the 'Go Back to DFS Home' icon, the dfshealth.jsp page is not accessed. The NN http URL comes out as http://nnaddr///nninfoaddr/dfshealth.jsp, which I think is why the page does not load. It should be http://nninfoaddr/dfshealth.jsp/ instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6694) TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms
[ https://issues.apache.org/jira/browse/HDFS-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-6694: Fix Version/s: (was: 2.6.0) 2.7.0 TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms Key: HDFS-6694 URL: https://issues.apache.org/jira/browse/HDFS-6694 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Priority: Critical Fix For: 2.7.0 Attachments: HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.001.dbg.patch, HDFS-6694.002.dbg.patch, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover-output.txt, org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.txt TestPipelinesFailover.testPipelineRecoveryStress tests fail intermittently with various symptoms. Typical failures are described in first comment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6833) DirectoryScanner should not register a deleting block with memory of DataNode
[ https://issues.apache.org/jira/browse/HDFS-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-6833: Target Version/s: 2.6.1 DirectoryScanner should not register a deleting block with memory of DataNode - Key: HDFS-6833 URL: https://issues.apache.org/jira/browse/HDFS-6833 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.5.0, 2.5.1 Reporter: Shinichi Yamashita Assignee: Shinichi Yamashita Priority: Critical Attachments: HDFS-6833-6-2.patch, HDFS-6833-6-3.patch, HDFS-6833-6.patch, HDFS-6833-7-2.patch, HDFS-6833-7.patch, HDFS-6833.8.patch, HDFS-6833.9.patch, HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch, HDFS-6833.patch When a block is deleted in the DataNode, the following messages are usually output. {code} 2014-08-07 17:53:11,606 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825 for deletion 2014-08-07 17:53:11,617 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825 {code} However, in the current implementation DirectoryScanner may run while the DataNode is deleting the block, and then the following messages are output.
{code} 2014-08-07 17:53:30,519 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825 for deletion 2014-08-07 17:53:31,426 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-1887080305-172.28.0.101-1407398838872 Total blocks: 1, missing metadata files:0, missing block files:0, missing blocks in memory:1, mismatched blocks:0 2014-08-07 17:53:31,426 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added missing block to memory FinalizedReplica, blk_1073741825_1001, FINALIZED getNumBytes() = 21230663 getBytesOnDisk() = 21230663 getVisibleLength()= 21230663 getVolume() = /hadoop/data1/dfs/data/current getBlockFile()= /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825 unlinked =false 2014-08-07 17:53:31,531 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-1887080305-172.28.0.101-1407398838872 blk_1073741825_1001 file /hadoop/data1/dfs/data/current/BP-1887080305-172.28.0.101-1407398838872/current/finalized/subdir0/subdir0/blk_1073741825 {code} The deleted block's information is thus registered back into the DataNode's memory, so when the DataNode sends a block report, the NameNode receives wrong block information. For example, when we recommission a node or change the replication factor, the NameNode may delete a valid block as an excess replica because of this problem, and Under-Replicated Blocks and Missing Blocks occur. When the DataNode runs DirectoryScanner, it should not register a block that is being deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
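The proposed guard can be sketched as a reconciliation step that consults the async-deletion queue before re-adding an on-disk block to memory. All names here are hypothetical stand-ins, not the DataNode's actual data structures:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the fix direction: a block that is on disk but absent from the
// in-memory replica map is only re-added if it is NOT pending deletion.
public class ScannerGuard {
    final Set<Long> replicaMap = new HashSet<>();      // blocks known in memory
    final Set<Long> pendingDeletion = new HashSet<>(); // scheduled for async delete

    boolean shouldAddMissingBlock(long blockId) {
        return !replicaMap.contains(blockId) && !pendingDeletion.contains(blockId);
    }

    public static void main(String[] args) {
        ScannerGuard g = new ScannerGuard();
        g.pendingDeletion.add(1073741825L);            // blk_1073741825 is being deleted
        System.out.println(g.shouldAddMissingBlock(1073741825L)); // false: skip it
        System.out.println(g.shouldAddMissingBlock(1073741826L)); // true: genuinely missing
    }
}
```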
[jira] [Updated] (HDFS-7274) Disable SSLv3 in HttpFS
[ https://issues.apache.org/jira/browse/HDFS-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-7274: Fix Version/s: 2.6.0 Disable SSLv3 in HttpFS --- Key: HDFS-7274 URL: https://issues.apache.org/jira/browse/HDFS-7274 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0 Reporter: Robert Kanter Assignee: Robert Kanter Priority: Blocker Fix For: 2.6.0, 2.5.2 Attachments: HDFS-7274.patch, HDFS-7274.patch We should disable SSLv3 in HttpFS to protect against the POODLE vulnerability. See [CVE-2014-3566|http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-3566] We have {{sslProtocol=TLS}} set to only allow TLS in ssl-server.xml, but when I checked, I could still connect with SSLv3. The Tomcat documentation is somewhat unclear about the difference between {{sslProtocol}}, {{sslProtocols}}, and {{sslEnabledProtocols}} and what exactly each value does. From what I can gather, {{sslProtocol=TLS}} actually includes SSLv3, and the only way to fix this is to explicitly list which TLS versions we support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
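Explicitly listing the allowed protocol versions, as the description suggests, would look roughly like this on the Tomcat connector. The port and attribute values are illustrative, and attribute support varies by Tomcat version; this is a sketch, not the committed config:

```xml
<!-- Hypothetical HttpFS connector: enumerate the protocols to allow,
     so SSLv3 is excluded by omission rather than relying on sslProtocol=TLS -->
<Connector port="14000" scheme="https" secure="true" SSLEnabled="true"
           sslEnabledProtocols="TLSv1,TLSv1.1,TLSv1.2" />
```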
[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS
[ https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209030#comment-14209030 ] Arun C Murthy commented on HDFS-7391: - Sounds good, thanks [~rkanter] and [~ywskycn]! I'll commit both shortly for RC1. Renable SSLv2Hello in HttpFS Key: HDFS-7391 URL: https://issues.apache.org/jira/browse/HDFS-7391 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0, 2.5.2 Reporter: Robert Kanter Assignee: Robert Kanter Priority: Blocker Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch We should re-enable SSLv2Hello, which is required by older clients (e.g. Java 6 with openssl 0.9.8x); they can't connect without it. Just to be clear, this does not mean SSLv2, which is insecure. I couldn't simply do an addendum patch on HDFS-7274 because it's already been closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7391) Renable SSLv2Hello in HttpFS
[ https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209148#comment-14209148 ] Arun C Murthy commented on HDFS-7391: - [~kasha] you just saved me some typing chores; I'll take care of this. Thanks! Renable SSLv2Hello in HttpFS Key: HDFS-7391 URL: https://issues.apache.org/jira/browse/HDFS-7391 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0, 2.5.2 Reporter: Robert Kanter Assignee: Robert Kanter Priority: Blocker Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch We should re-enable SSLv2Hello, which is required by older clients (e.g. Java 6 with openssl 0.9.8x); they can't connect without it. Just to be clear, this does not mean SSLv2, which is insecure. I couldn't simply do an addendum patch on HDFS-7274 because it's already been closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7391) Renable SSLv2Hello in HttpFS
[ https://issues.apache.org/jira/browse/HDFS-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-7391: Resolution: Fixed Fix Version/s: 2.6.0 Status: Resolved (was: Patch Available) I just committed this to branch-2.6. Thanks [~rkanter]! [~kasha] - I wasn't sure if you wanted this in branch-2.5, so I haven't cherry-picked it. Please do so if you want to. Thanks. Renable SSLv2Hello in HttpFS Key: HDFS-7391 URL: https://issues.apache.org/jira/browse/HDFS-7391 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.6.0, 2.5.2 Reporter: Robert Kanter Assignee: Robert Kanter Priority: Blocker Fix For: 2.6.0 Attachments: HDFS-7391-branch-2.5.patch, HDFS-7391.patch We should re-enable SSLv2Hello, which is required by older clients (e.g. Java 6 with openssl 0.9.8x); they can't connect without it. Just to be clear, this does not mean SSLv2, which is insecure. I couldn't simply do an addendum patch on HDFS-7274 because it's already been closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
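Re-enabling the hello format while still excluding SSLv3 amounts to adding SSLv2Hello to the enabled-protocol list, assuming Tomcat's sslEnabledProtocols attribute is the knob in play (illustrative values, not the committed config):

```xml
<!-- SSLv2Hello is only a handshake/hello format used by older clients;
     it is not the insecure SSLv2 protocol itself -->
<Connector port="14000" scheme="https" secure="true" SSLEnabled="true"
           sslEnabledProtocols="SSLv2Hello,TLSv1,TLSv1.1,TLSv1.2" />
```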
[jira] [Commented] (HDFS-7383) DataNode.requestShortCircuitFdsForRead may throw NullPointerException
[ https://issues.apache.org/jira/browse/HDFS-7383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14204245#comment-14204245 ] Arun C Murthy commented on HDFS-7383: - [~sureshms] this might be bad enough that we want it in 2.6? Thanks. DataNode.requestShortCircuitFdsForRead may throw NullPointerException - Key: HDFS-7383 URL: https://issues.apache.org/jira/browse/HDFS-7383 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Fix For: 2.7.0 Attachments: h7383_20141108.patch {noformat} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.DataNode.requestShortCircuitFdsForRead(DataNode.java:1525) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitFds(DataXceiver.java:286) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitFds(Receiver.java:185) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:89) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:234) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
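A bare NullPointerException from deep inside a request handler usually means a precondition went unchecked. The generic shape of such a fix is to validate the handles up front and fail with a descriptive IOException instead; the following is illustrative only (stand-in names, not the actual HDFS-7383 patch):

```java
import java.io.FileInputStream;
import java.io.IOException;

// Illustrative guard: fail fast with a message instead of an NPE when a
// required handle is missing. Names are stand-ins, not DataNode code.
public class FdGuard {
    static FileInputStream[] requireFds(FileInputStream blockIn, FileInputStream metaIn,
                                        String blockId) throws IOException {
        if (blockIn == null || metaIn == null) {
            throw new IOException("Cannot serve short-circuit read for " + blockId
                + ": block or meta file stream is unavailable");
        }
        return new FileInputStream[] { blockIn, metaIn };
    }

    public static void main(String[] args) {
        try {
            requireFds(null, null, "blk_42"); // simulates the failing precondition
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```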
[jira] [Updated] (HDFS-6573) TestDeadDatanode#testDeadDatanode fails if debug flag is turned on for the unit test
[ https://issues.apache.org/jira/browse/HDFS-6573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-6573: Fix Version/s: (was: 2.4.1) 2.5.0 TestDeadDatanode#testDeadDatanode fails if debug flag is turned on for the unit test Key: HDFS-6573 URL: https://issues.apache.org/jira/browse/HDFS-6573 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.2.0, 2.3.0, 2.4.0 Reporter: Jinghui Wang Assignee: Jinghui Wang Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6573.patch o.a.h.hdfs.server.namenode.NameNodeRpcServer#blockReport calls a function when in debug mode, which uses the StorageBlockReport data passed in by the test TestDeadDatanode#testDeadDatanode. The reported block list is a long array of size 3 in the test, but blockReport expects a long array of size 5. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6527) Edit log corruption due to defered INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039716#comment-14039716 ] Arun C Murthy commented on HDFS-6527: - Changed fix version to 2.4.1 since it got merged in for rc1. Edit log corruption due to defered INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 2.4.1 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen a SBN crashing with the following error: {panel} \[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2] java.io.FileNotFoundException: File does not exist: /xxx {panel} This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires FSN read lock and then write lock, a deletion can happen in between. Because of deferred inode removal outside FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by NN, so NN doesn't start up or SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6527) Edit log corruption due to defered INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-6527: Fix Version/s: (was: 2.5.0) 2.4.1 Edit log corruption due to defered INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 2.4.1 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen a SBN crashing with the following error: {panel} \[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2] java.io.FileNotFoundException: File does not exist: /xxx {panel} This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires FSN read lock and then write lock, a deletion can happen in between. Because of deferred inode removal outside FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by NN, so NN doesn't start up or SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
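The race described above can be reduced to a small sketch: a reference is looked up under the read lock, the lock is released, a concurrent delete lands in the gap, and the stale reference is then mutated under the write lock without re-checking the map. This is an analogy with a plain map and ReadWriteLock, not FSNamesystem code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the lock-gap bug: the write-lock section must re-validate that
// the inode is still present before mutating, which is what the fix enforces.
public class DeferredRemovalRace {
    static final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    static final Map<String, StringBuilder> inodeMap = new HashMap<>();

    static StringBuilder lookup(String path) {
        lock.readLock().lock();
        try { return inodeMap.get(path); } finally { lock.readLock().unlock(); }
    }

    public static void main(String[] args) {
        inodeMap.put("/xxx", new StringBuilder("file"));
        StringBuilder inode = lookup("/xxx"); // read lock released here
        inodeMap.remove("/xxx");              // a concurrent delete wins the gap
        lock.writeLock().lock();
        try {
            // BUG: appends a "block" to a deleted file; a correct version
            // would first re-check inodeMap.containsKey("/xxx").
            inode.append("+blk");
        } finally { lock.writeLock().unlock(); }
        System.out.println(inodeMap.containsKey("/xxx")); // false, yet inode was mutated
    }
}
```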
[jira] [Updated] (HDFS-5687) Problem in accessing NN JSP page
[ https://issues.apache.org/jira/browse/HDFS-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5687: Fix Version/s: (was: 2.4.0) 2.5.0 Problem in accessing NN JSP page Key: HDFS-5687 URL: https://issues.apache.org/jira/browse/HDFS-5687 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: sathish Assignee: sathish Priority: Minor Fix For: 2.5.0 Attachments: HDFS-5687-0001.patch In the NN UI, after clicking the Browse File System link and then clicking the 'Go Back to DFS Home' icon, the dfshealth.jsp page is not accessed. The NN http URL comes out as http://nnaddr///nninfoaddr/dfshealth.jsp, which I think is why the page does not load. It should be http://nninfoaddr/dfshealth.jsp/ instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4754) Add an API in the namenode to mark a datanode as stale
[ https://issues.apache.org/jira/browse/HDFS-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4754: Fix Version/s: (was: 2.4.0) 2.5.0 Add an API in the namenode to mark a datanode as stale -- Key: HDFS-4754 URL: https://issues.apache.org/jira/browse/HDFS-4754 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Fix For: 3.0.0, 2.5.0 Attachments: 4754.v1.patch, 4754.v2.patch, 4754.v4.patch, 4754.v4.patch There is a detection of stale datanodes in HDFS since HDFS-3703, with a timeout defaulted to 30s. There are two reasons to add an API to mark a node as stale even if the timeout is not yet reached: 1) ZooKeeper can detect that a client is dead at any moment. So, for HBase, we sometimes start the recovery before a node is marked stale (even with reasonable settings such as: stale: 20s; HBase ZK timeout: 30s). 2) Some third parties could detect that a node is dead before the timeout, hence saving us the cost of retrying. An example of such hardware is Arista, presented here by [~tsuna] http://tsunanet.net/~tsuna/fsf-hbase-meetup-april13.pdf, and confirmed in HBASE-6290. As usual, even if the node is dead it can come back before the 10-minute limit. So I would propose to set a time bound. The API would be namenode.markStale(String ipAddress, int port, long durationInMs); After durationInMs, the namenode would again rely only on its heartbeat to decide. Thoughts? If there are no objections, and if nobody in the hdfs dev team has the time to spend on it, I will give it a try for branch 2 & 3. -- This message was sent by Atlassian JIRA (v6.2#6252)
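The time-bounded semantics proposed above can be sketched as a small tracker: markStale records an expiry, and after durationInMs elapses the namenode would again rely only on heartbeats. The API shape is quoted from the description; the class and everything else here is illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the proposed time-bounded "mark stale" behavior.
// Only the markStale signature comes from the proposal; the rest is assumed.
public class StaleTracker {
    private final Map<String, Long> staleUntil = new HashMap<>();

    public void markStale(String ipAddress, int port, long durationInMs) {
        staleUntil.put(ipAddress + ":" + port, System.currentTimeMillis() + durationInMs);
    }

    public boolean isMarkedStale(String ipAddress, int port) {
        Long until = staleUntil.get(ipAddress + ":" + port);
        // After expiry, fall back to heartbeat-based staleness detection.
        return until != null && System.currentTimeMillis() < until;
    }

    public static void main(String[] args) {
        StaleTracker t = new StaleTracker();
        t.markStale("10.0.0.5", 50010, 60_000L);
        System.out.println(t.isMarkedStale("10.0.0.5", 50010)); // true
        System.out.println(t.isMarkedStale("10.0.0.6", 50010)); // false
    }
}
```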
[jira] [Updated] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4489: Fix Version/s: (was: 2.4.0) 2.5.0 Use InodeID as as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.5.0 Attachments: 4434.optimized.patch The benefits of using InodeID to uniquely identify a file are multi-fold. Here are a few of them: 1. uniquely identify a file across renames; related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed, the file name and size combination is not reliable, but the combination of file id and size is unique. 3. id-based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5160) MiniDFSCluster webui does not work
[ https://issues.apache.org/jira/browse/HDFS-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5160: Fix Version/s: (was: 2.4.0) 2.5.0 MiniDFSCluster webui does not work -- Key: HDFS-5160 URL: https://issues.apache.org/jira/browse/HDFS-5160 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Fix For: 2.5.0 The webui does not work, when going to http://localhost:50070 you get: {code} Directory: / webapps/ 102 bytes Sep 4, 2013 9:32:55 AM {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4066) Fix compile warnings for libwebhdfs
[ https://issues.apache.org/jira/browse/HDFS-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4066: Fix Version/s: (was: 2.4.0) 2.5.0 Fix compile warnings for libwebhdfs --- Key: HDFS-4066 URL: https://issues.apache.org/jira/browse/HDFS-4066 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Fix For: 2.5.0 The compile of libwebhdfs still generates a bunch of warnings, which need to be fixed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-3907) Allow multiple users for local block readers
[ https://issues.apache.org/jira/browse/HDFS-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3907: Fix Version/s: (was: 2.4.0) 2.5.0 Allow multiple users for local block readers Key: HDFS-3907 URL: https://issues.apache.org/jira/browse/HDFS-3907 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.5.0 Attachments: hdfs-3907.txt The {{dfs.block.local-path-access.user}} config added in HDFS-2246 only supports a single user. However, as long as blocks are group-readable by more than one user, the feature could be used by multiple users; to support this we just need to allow multiple users to be configured. In practice this also allows us to support HBase, where the client (RS) runs as the hbase system user and the DN runs as the hdfs system user. I think this should work in secure mode as well, since we're not using impersonation in the HBase case. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4198) the shell script error for Cygwin on windows7
[ https://issues.apache.org/jira/browse/HDFS-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4198: Fix Version/s: (was: 2.4.0) 2.5.0 the shell script error for Cygwin on windows7 - Key: HDFS-4198 URL: https://issues.apache.org/jira/browse/HDFS-4198 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 2.0.2-alpha Environment: windows7, cygwin. Reporter: Han Hui Wen Fix For: 2.5.0 See the following [comment|https://issues.apache.org/jira/browse/HDFS-4198?focusedCommentId=13498818&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13498818] for detailed description. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-5098) Enhance FileSystem.Statistics to have locality information
[ https://issues.apache.org/jira/browse/HDFS-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5098: Fix Version/s: (was: 2.4.0) 2.5.0 Enhance FileSystem.Statistics to have locality information -- Key: HDFS-5098 URL: https://issues.apache.org/jira/browse/HDFS-5098 Project: Hadoop HDFS Issue Type: Improvement Reporter: Bikas Saha Assignee: Suresh Srinivas Fix For: 2.5.0 Currently in MR/Tez we don't have a good and accurate means to detect how much of the IO was actually done locally. Getting this information from the source of truth would be much better. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-3988) HttpFS can't do GETDELEGATIONTOKEN without a prior authenticated request
[ https://issues.apache.org/jira/browse/HDFS-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3988: Fix Version/s: (was: 2.4.0) 2.5.0 HttpFS can't do GETDELEGATIONTOKEN without a prior authenticated request Key: HDFS-3988 URL: https://issues.apache.org/jira/browse/HDFS-3988 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.5.0 A request to obtain a delegation token cannot initiate an authentication sequence; it must be accompanied by an auth cookie obtained in a previous request using a different operation. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4286) Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server jar in hidden package of BKJM
[ https://issues.apache.org/jira/browse/HDFS-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4286: Fix Version/s: (was: 2.4.0) 2.5.0 Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server jar in hidden package of BKJM -- Key: HDFS-4286 URL: https://issues.apache.org/jira/browse/HDFS-4286 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Vinayakumar B Fix For: 3.0.0, 2.5.0 BOOKKEEPER-203 introduced changes to LedgerLayout to include the ManagerFactoryClass instead of the ManagerFactoryName. Because of this, BKJM cannot shade the bookkeeper-server jar inside the BKJM jar: the LAYOUT znode created by BookieServer is not readable by BKJM, as it has classes in hidden packages (and the same problem occurs vice versa). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-3592) libhdfs should expose ClientProtocol::mkdirs2
[ https://issues.apache.org/jira/browse/HDFS-3592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3592: Fix Version/s: (was: 2.4.0) 2.5.0 libhdfs should expose ClientProtocol::mkdirs2 - Key: HDFS-3592 URL: https://issues.apache.org/jira/browse/HDFS-3592 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 3.0.0, 2.5.0 It would be nice if libhdfs exposed mkdirs2. This version of mkdirs is much more verbose about any errors that occur: it throws AccessControlException, FileAlreadyExists, FileNotFoundException, ParentNotDirectoryException, etc. The original mkdirs just throws IOException if anything goes wrong. For something like fuse_dfs, it is very important to return the correct errno code when an error has occurred; mkdirs2 would allow us to do that. I'm not sure if we should just change hdfsMkdirs to use mkdirs2, or add an hdfsMkdirs2. Probably the latter, but the former course would maintain bug compatibility with ancient releases, if that is important. -- This message was sent by Atlassian JIRA (v6.2#6252)
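The point of the richer exceptions is that each one maps to a distinct errno for fuse_dfs, whereas a plain IOException can only become EIO. The mapping below is illustrative (the numeric values are the common Linux errno codes; the mapping itself is an assumption, not libhdfs code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative exception-to-errno mapping that mkdirs2's specific
// exceptions would enable for fuse_dfs.
public class ErrnoMap {
    static final Map<String, Integer> EXCEPTION_TO_ERRNO = new LinkedHashMap<>();
    static {
        EXCEPTION_TO_ERRNO.put("AccessControlException", 13);      // EACCES
        EXCEPTION_TO_ERRNO.put("FileAlreadyExistsException", 17);  // EEXIST
        EXCEPTION_TO_ERRNO.put("FileNotFoundException", 2);        // ENOENT
        EXCEPTION_TO_ERRNO.put("ParentNotDirectoryException", 20); // ENOTDIR
    }

    static int toErrno(String exceptionName) {
        // Anything unrecognized, e.g. plain IOException, collapses to EIO.
        return EXCEPTION_TO_ERRNO.getOrDefault(exceptionName, 5);
    }

    public static void main(String[] args) {
        System.out.println(toErrno("FileNotFoundException")); // 2
        System.out.println(toErrno("IOException"));           // 5
    }
}
```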
[jira] [Updated] (HDFS-3828) Block Scanner rescans blocks too frequently
[ https://issues.apache.org/jira/browse/HDFS-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3828: Fix Version/s: (was: 2.4.0) 2.5.0 Block Scanner rescans blocks too frequently --- Key: HDFS-3828 URL: https://issues.apache.org/jira/browse/HDFS-3828 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Andy Isaacson Assignee: Andy Isaacson Fix For: 2.5.0 Attachments: hdfs-3828-1.txt, hdfs-3828-2.txt, hdfs-3828-3.txt, hdfs3828.txt {{BlockPoolSliceScanner#scan}} calls cleanUp every time it's invoked from {{DataBlockScanner#run}} via {{scanBlockPoolSlice}}. But cleanUp unconditionally roll()s the verificationLogs, so after two iterations we have lost the first iteration of block verification times. As a result a cluster with just one block repeatedly rescans it every 10 seconds: {noformat} 2012-08-16 15:59:57,884 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 2012-08-16 16:00:07,904 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 2012-08-16 16:00:17,925 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 {noformat} To fix this, we need to avoid roll()ing the logs multiple times per period. -- This message was sent by Atlassian JIRA (v6.2#6252)
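The "avoid roll()ing multiple times per period" fix can be sketched as a small time gate. This is a minimal sketch of the idea, not the actual patch; the class name and period handling are assumptions.

```java
// Sketch of the proposed fix: roll the verification log at most once
// per scan period, instead of unconditionally on every cleanUp() call,
// so earlier verification times are not lost after two iterations.
public class RollGate {
    private final long periodMs;
    private long lastRollMs;

    public RollGate(long periodMs, long nowMs) {
        this.periodMs = periodMs;
        this.lastRollMs = nowMs;
    }

    // Returns true (and records the roll) only if a full period has
    // elapsed since the last roll; otherwise the log is left alone.
    public boolean maybeRoll(long nowMs) {
        if (nowMs - lastRollMs >= periodMs) {
            lastRollMs = nowMs;
            return true;
        }
        return false;
    }
}
```

An unconditional roll() is equivalent to maybeRoll() always returning true, which is why the one-block cluster above rescans every 10 seconds.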
[jira] [Updated] (HDFS-3761) Uses of AuthenticatedURL should set the configured timeouts
[ https://issues.apache.org/jira/browse/HDFS-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3761: Fix Version/s: (was: 2.4.0) 2.5.0 Uses of AuthenticatedURL should set the configured timeouts --- Key: HDFS-3761 URL: https://issues.apache.org/jira/browse/HDFS-3761 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha, 2.0.2-alpha Reporter: Alejandro Abdelnur Priority: Critical Fix For: 2.5.0 Similar to {{URLUtils}}, the {{AuthenticatedURL}} should set the configured timeouts in URL connection instances. HADOOP-8644 is introducing a {{ConnectionConfigurator}} interface to the {{AuthenticatedURL}} class; it should be done via that interface. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6143) WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13961906#comment-13961906 ] Arun C Murthy commented on HDFS-6143: - One concern: what is the impact on compatibility with 2.2 etc? WebHdfsFileSystem open should throw FileNotFoundException for non-existing paths Key: HDFS-6143 URL: https://issues.apache.org/jira/browse/HDFS-6143 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Attachments: HDFS-6143-branch-2.4.0.v01.patch, HDFS-6143-trunk-after-HDFS-5570.v01.patch, HDFS-6143.v01.patch, HDFS-6143.v02.patch, HDFS-6143.v03.patch, HDFS-6143.v04.patch, HDFS-6143.v04.patch, HDFS-6143.v05.patch, HDFS-6143.v06.patch WebHdfsFileSystem.open and HftpFileSystem.open incorrectly handle non-existing paths. - 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; it's deferred until the next read. It's counterintuitive and not how local FS or HDFS work. In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of the code that's broken because of this. - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths -- This message was sent by Atlassian JIRA (v6.2#6252)
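The POSIX-style behavior argued for above (ENOENT at open time, not at first read) can be illustrated with a toy filesystem. This is a self-contained sketch, not WebHdfsFileSystem's code; the map-backed "server" stands in for the HTTP round trip.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

// Sketch of eager open semantics: contacting the "server" (here a map
// of existing paths) at open time and failing immediately with
// FileNotFoundException, instead of deferring the error to first read.
public class EagerOpen {
    private final Map<String, byte[]> files = new HashMap<>();

    public void create(String path, byte[] data) {
        files.put(path, data);
    }

    public byte[] open(String path) throws FileNotFoundException {
        byte[] data = files.get(path);
        if (data == null) {
            // POSIX-style: ENOENT surfaces at open, not at read.
            throw new FileNotFoundException(path);
        }
        return data;
    }
}
```

Deferred-open implementations return a handle here unconditionally and only fail once read() is called, which is what breaks callers like LzoInputFormat.getSplits.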
[jira] [Commented] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13954997#comment-13954997 ] Arun C Murthy commented on HDFS-4564: - I'm going to commit this, based on [~jingzhao]'s +1 - I'd like to roll 2.4 rc0 asap since my availability over the next few days is limited due to the Hadoop Summit. Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples include rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4564: Resolution: Fixed Fix Version/s: 2.4.0 Status: Resolved (was: Patch Available) I just committed this. Thanks [~daryn] (and [~jingzhao] for the review)! Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Fix For: 2.4.0 Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples include rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.2#6252)
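The 401-vs-403 distinction the fix enforces is standard HTTP semantics: 401 means authentication failed (or is missing), 403 means the authenticated caller is not permitted. A minimal sketch (the class and method names are illustrative, not WebHDFS code):

```java
// Sketch of the status-code distinction: failures to *authenticate*
// get 401, while an authenticated user that is *not allowed* to
// perform the operation (e.g. an invalid proxy-user attempt, or
// renew/cancel with the wrong user) gets 403.
public class DeniedStatus {
    static final int SC_OK = 200;
    static final int SC_UNAUTHORIZED = 401; // "who are you?"
    static final int SC_FORBIDDEN = 403;    // "I know who you are; no."

    public static int statusFor(boolean authenticated, boolean authorized) {
        if (!authenticated) return SC_UNAUTHORIZED;
        if (!authorized) return SC_FORBIDDEN;
        return SC_OK;
    }
}
```

The bug was effectively returning SC_UNAUTHORIZED in the `!authorized` branch, which misleads clients into retrying authentication rather than reporting a permissions error.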
[jira] [Updated] (HDFS-6143) (WebHdfs|Hftp)FileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-6143: Target Version/s: 2.4.1 (was: 2.4.0) (WebHdfs|Hftp)FileSystem open should throw FileNotFoundException for non-existing paths --- Key: HDFS-6143 URL: https://issues.apache.org/jira/browse/HDFS-6143 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Attachments: HDFS-6143.v01.patch, HDFS-6143.v02.patch, HDFS-6143.v03.patch, HDFS-6143.v04.patch HftpFileSystem.open incorrectly handles non-existing paths. - 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; it's deferred until the next read. It's counterintuitive and not how local FS or HDFS work. In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of the code that's broken because of this. - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6143) (WebHdfs|Hftp)FileSystem open should throw FileNotFoundException for non-existing paths
[ https://issues.apache.org/jira/browse/HDFS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955014#comment-13955014 ] Arun C Murthy commented on HDFS-6143: - [~jira.shegalov] Saw this too late - I'm moving this to 2.4.1, let's try to get this in asap. I have rc0 going out; we can re-target 2.4.0 if I need to spin another rc. Thanks. (WebHdfs|Hftp)FileSystem open should throw FileNotFoundException for non-existing paths --- Key: HDFS-6143 URL: https://issues.apache.org/jira/browse/HDFS-6143 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Gera Shegalov Assignee: Gera Shegalov Priority: Blocker Attachments: HDFS-6143.v01.patch, HDFS-6143.v02.patch, HDFS-6143.v03.patch, HDFS-6143.v04.patch HftpFileSystem.open incorrectly handles non-existing paths. - 'open' does not really open anything, i.e., it does not contact the server, and therefore cannot discover FileNotFound; it's deferred until the next read. It's counterintuitive and not how local FS or HDFS work. In POSIX you get ENOENT on open. [LzoInputFormat.getSplits|https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/LzoInputFormat.java] is an example of the code that's broken because of this. - On the server side, FileDataServlet incorrectly sends SC_BAD_REQUEST instead of SC_NOT_FOUND for non-existing paths -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13954374#comment-13954374 ] Arun C Murthy commented on HDFS-4564: - [~daryn] - Can you pls clarify some of the above review comments? I'd like to move fwd with 2.4 ASAP. Tx! Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples include rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5138) Support HDFS upgrade in HA
[ https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944060#comment-13944060 ] Arun C Murthy commented on HDFS-5138: - [~jingzhao] / [~sureshms] - Should we close this one and mark HDFS-6135 as the blocker? Thanks. Support HDFS upgrade in HA -- Key: HDFS-5138 URL: https://issues.apache.org/jira/browse/HDFS-5138 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Kihwal Lee Assignee: Aaron T. Myers Priority: Blocker Fix For: 3.0.0 Attachments: HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, hdfs-5138-branch-2.txt With HA enabled, NN won't start with -upgrade. Since there has been a layout version change between 2.0.x and 2.1.x, starting NN in upgrade mode was necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way to get around this was to disable HA and upgrade. The NN and the cluster cannot be flipped back to HA until the upgrade is finalized. If HA is disabled only on the NN for the layout upgrade and HA is turned back on without involving DNs, things will work, but finalizeUpgrade won't work (the NN is in HA and it cannot be in upgrade mode) and the DNs' upgrade snapshots won't get removed. We will need a different way of doing layout upgrades and upgrade snapshots. I am marking this as a 2.1.1-beta blocker based on feedback from others. If there is a reasonable workaround that does not increase the maintenance window greatly, we can lower its priority from blocker to critical. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures
[ https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13944061#comment-13944061 ] Arun C Murthy commented on HDFS-5840: - [~sureshms] / [~atm] - Should this be a blocker for 2.4? Thanks. Follow-up to HDFS-5138 to improve error handling during partial upgrade failures Key: HDFS-5840 URL: https://issues.apache.org/jira/browse/HDFS-5840 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 3.0.0 Attachments: HDFS-5840.patch Suresh posted some good comment in HDFS-5138 after that patch had already been committed to trunk. This JIRA is to address those. See the first comment of this JIRA for the full content of the review. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932008#comment-13932008 ] Arun C Murthy commented on HDFS-4564: - How is this looking [~daryn]? Thanks. (Doing a pass over 2.4 blockers) Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples include rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-3828) Block Scanner rescans blocks too frequently
[ https://issues.apache.org/jira/browse/HDFS-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3828: Fix Version/s: (was: 2.3.0) 2.4.0 Block Scanner rescans blocks too frequently --- Key: HDFS-3828 URL: https://issues.apache.org/jira/browse/HDFS-3828 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Andy Isaacson Assignee: Andy Isaacson Fix For: 2.4.0 Attachments: hdfs-3828-1.txt, hdfs-3828-2.txt, hdfs-3828-3.txt, hdfs3828.txt {{BlockPoolSliceScanner#scan}} calls cleanUp every time it's invoked from {{DataBlockScanner#run}} via {{scanBlockPoolSlice}}. But cleanUp unconditionally roll()s the verificationLogs, so after two iterations we have lost the first iteration of block verification times. As a result a cluster with just one block repeatedly rescans it every 10 seconds: {noformat} 2012-08-16 15:59:57,884 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 2012-08-16 16:00:07,904 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 2012-08-16 16:00:17,925 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 {noformat} To fix this, we need to avoid roll()ing the logs multiple times per period. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4754) Add an API in the namenode to mark a datanode as stale
[ https://issues.apache.org/jira/browse/HDFS-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4754: Fix Version/s: (was: 2.3.0) 2.4.0 Add an API in the namenode to mark a datanode as stale -- Key: HDFS-4754 URL: https://issues.apache.org/jira/browse/HDFS-4754 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Fix For: 3.0.0, 2.4.0 Attachments: 4754.v1.patch, 4754.v2.patch, 4754.v4.patch, 4754.v4.patch There is a detection of stale datanodes in HDFS since HDFS-3703, with a timeout defaulted to 30s. There are two reasons to add an API to mark a node as stale even if the timeout is not yet reached: 1) ZooKeeper can detect that a client is dead at any moment. So, for HBase, we sometimes start the recovery before a node is marked stale (even with reasonable settings such as: stale: 20s; HBase ZK timeout: 30s). 2) Some third parties could detect that a node is dead before the timeout, hence saving us the cost of retrying. An example of such hardware is Arista, presented here by [~tsuna] http://tsunanet.net/~tsuna/fsf-hbase-meetup-april13.pdf, and confirmed in HBASE-6290. As usual, even if the node is dead it can come back before the 10 minutes limit. So I would propose to set a timebound. The API would be namenode.markStale(String ipAddress, int port, long durationInMs); After durationInMs, the namenode would again rely only on its heartbeat to decide. Thoughts? If there are no objections, and if nobody in the hdfs dev team has the time to spend some time on it, I will give it a try for branch 2 3. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
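The proposed time-bounded markStale API can be sketched as follows. This is a sketch of the proposal only; the class, the "ip:port" keying, and the explicit clock parameter are assumptions for the example, not namenode code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed API: an external caller (e.g. HBase, tipped
// off by ZooKeeper) marks a datanode stale for a bounded duration,
// after which the namenode falls back to heartbeat-based detection.
public class StaleTracker {
    // datanode key ("ip:port") -> expiry time in ms
    private final Map<String, Long> staleUntil = new HashMap<>();

    public void markStale(String ipAddress, int port,
                          long durationInMs, long nowMs) {
        staleUntil.put(ipAddress + ":" + port, nowMs + durationInMs);
    }

    // Stale only if explicitly marked and the mark has not expired.
    // Otherwise the normal heartbeat timeout logic decides.
    public boolean isMarkedStale(String ipAddress, int port, long nowMs) {
        Long until = staleUntil.get(ipAddress + ":" + port);
        return until != null && nowMs < until;
    }
}
```

The timebound is the key design choice: a node wrongly marked dead recovers automatically once the mark expires, so a misbehaving external detector cannot blacklist a node forever.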
[jira] [Updated] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-2832: Fix Version/s: (was: 2.3.0) 2.4.0 Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 3.0.0, 2.4.0 Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch, h2832_branch-2_20131226.patch, h2832_branch-2_20140103.patch HDFS currently supports a configuration where storages are a list of directories. Typically each of these directories corresponds to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose changing the current model, where a Datanode *is a* storage, to one where a Datanode *is a collection of* storages. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5098) Enhance FileSystem.Statistics to have locality information
[ https://issues.apache.org/jira/browse/HDFS-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5098: Fix Version/s: (was: 2.3.0) 2.4.0 Enhance FileSystem.Statistics to have locality information -- Key: HDFS-5098 URL: https://issues.apache.org/jira/browse/HDFS-5098 Project: Hadoop HDFS Issue Type: Improvement Reporter: Bikas Saha Assignee: Suresh Srinivas Fix For: 2.4.0 Currently in MR/Tez we don't have a good and accurate means to detect how much of the IO was actually done locally. Getting this information from the source of truth would be much better. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4489: Fix Version/s: (was: 2.3.0) 2.4.0 Use InodeID as as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.4.0 Attachments: 4434.optimized.patch The benefit of using InodeID to uniquely identify a file can be multi-fold. Here are a few of them: 1. uniquely identify a file across renames, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is not reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
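Point 2 above (distcp-style modification checks) can be sketched as a comparison on (fileId, length) instead of (path, length). This is an illustrative helper, not distcp or HDFS code; the method name and parameters are assumptions.

```java
// Sketch of the modification check the description argues for:
// (fileId, length) survives renames and same-path replacements,
// while (path, length) does not.
public class FileIdentity {
    // True only when the destination is still the same inode with the
    // same length. A file replaced via rename keeps its path but gets
    // a new inode id, so comparing ids catches what path+size misses.
    public static boolean unchanged(long srcFileId, long srcLen,
                                    long dstFileId, long dstLen) {
        return srcFileId == dstFileId && srcLen == dstLen;
    }
}
```

A path-and-size check would report "unchanged" for a replaced file of identical size; the inode-id check cannot be fooled that way.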
[jira] [Updated] (HDFS-3988) HttpFS can't do GETDELEGATIONTOKEN without a prior authenticated request
[ https://issues.apache.org/jira/browse/HDFS-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3988: Fix Version/s: (was: 2.3.0) 2.4.0 HttpFS can't do GETDELEGATIONTOKEN without a prior authenticated request Key: HDFS-3988 URL: https://issues.apache.org/jira/browse/HDFS-3988 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.4.0 A request to obtain a delegation token cannot initiate an authentication sequence; it must be accompanied by an auth cookie obtained in a previous request using a different operation. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5687) Problem in accessing NN JSP page
[ https://issues.apache.org/jira/browse/HDFS-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5687: Fix Version/s: (was: 2.3.0) 2.4.0 Problem in accessing NN JSP page Key: HDFS-5687 URL: https://issues.apache.org/jira/browse/HDFS-5687 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: sathish Assignee: sathish Priority: Minor Fix For: 2.4.0 Attachments: HDFS-5687-0001.patch On the NN UI, after clicking through to the Browse File System page, clicking the "Go Back to DFS Home" icon does not load the dfshealth.jsp page. The NN HTTP URL comes out as http://nnaddr///nninfoaddr/dfshealth.jsp, which I think is why the page does not load; it should be http://nninfoaddr/dfshealth.jsp/ instead. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-3761) Uses of AuthenticatedURL should set the configured timeouts
[ https://issues.apache.org/jira/browse/HDFS-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3761: Fix Version/s: (was: 2.3.0) 2.4.0 Uses of AuthenticatedURL should set the configured timeouts --- Key: HDFS-3761 URL: https://issues.apache.org/jira/browse/HDFS-3761 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha, 2.0.2-alpha Reporter: Alejandro Abdelnur Priority: Critical Fix For: 2.4.0 Similar to {{URLUtils}}, the {{AuthenticatedURL}} should set the configured timeouts in URL connection instances. HADOOP-8644 is introducing a {{ConnectionConfigurator}} interface to the {{AuthenticatedURL}} class; it should be done via that interface. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-3592) libhdfs should expose ClientProtocol::mkdirs2
[ https://issues.apache.org/jira/browse/HDFS-3592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3592: Fix Version/s: (was: 2.3.0) 2.4.0 libhdfs should expose ClientProtocol::mkdirs2 - Key: HDFS-3592 URL: https://issues.apache.org/jira/browse/HDFS-3592 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 3.0.0, 2.4.0 It would be nice if libhdfs exposed mkdirs2. This version of mkdirs is much more verbose about any errors that occur-- it throws AccessControlException, FileAlreadyExists, FileNotFoundException, ParentNotDirectoryException, etc. The original mkdirs just throws IOException if anything goes wrong. For something like fuse_dfs, it is very important to return the correct errno code when an error has occurred. mkdirs2 would allow us to do that. I'm not sure if we should just change hdfsMkdirs to use mkdirs2, or add an hdfsMkdirs2. Probably the latter, but the former course would maintain bug compatibility with ancient releases-- if that is important. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4286) Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server jar in hidden package of BKJM
[ https://issues.apache.org/jira/browse/HDFS-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4286: Fix Version/s: (was: 2.3.0) 2.4.0 Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server jar in hidden package of BKJM -- Key: HDFS-4286 URL: https://issues.apache.org/jira/browse/HDFS-4286 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Vinayakumar B Fix For: 3.0.0, 2.4.0 BOOKKEEPER-203 introduced changes to LedgerLayout to include ManagerFactoryClass instead of ManagerFactoryName. Because of this, BKJM cannot shade the bookkeeper-server jar inside the BKJM jar: the LAYOUT znode created by BookieServer is not readable by BKJM, as it has classes in hidden packages. (Same problem vice versa.) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-2802: Fix Version/s: (was: 2.3.0) 2.4.0 Support for RW/RO snapshots in HDFS --- Key: HDFS-2802 URL: https://issues.apache.org/jira/browse/HDFS-2802 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Reporter: Hari Mankude Assignee: Tsz Wo (Nicholas), SZE Fix For: 2.4.0 Attachments: 2802.diff, 2802.patch, 2802.patch, HDFS-2802-meeting-minutes-121101.txt, HDFS-2802.20121101.patch, HDFS-2802.branch-2.0510.patch, HDFSSnapshotsDesign.pdf, Snapshots.pdf, Snapshots20120429.pdf, Snapshots20121018.pdf, Snapshots20121030.pdf, h2802_20130417.patch, h2802_20130422.patch, h2802_20130423.patch, h2802_20130425.patch, h2802_20130426.patch, h2802_20130430.patch, h2802_20130430b.patch, h2802_20130508.patch, h2802_20130509.patch, snap.patch, snapshot-design.pdf, snapshot-design.tex, snapshot-one-pager.pdf, snapshot-testplan.pdf Snapshots are point-in-time images of parts of the filesystem or the entire filesystem. Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with more information. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5160) MiniDFSCluster webui does not work
[ https://issues.apache.org/jira/browse/HDFS-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5160: Fix Version/s: (was: 2.3.0) 2.4.0 MiniDFSCluster webui does not work -- Key: HDFS-5160 URL: https://issues.apache.org/jira/browse/HDFS-5160 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Fix For: 2.4.0 The webui does not work, when going to http://localhost:50070 you get: {code} Directory: / webapps/ 102 bytes Sep 4, 2013 9:32:55 AM {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-3747) balancer-Test failed because of time out
[ https://issues.apache.org/jira/browse/HDFS-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3747: Fix Version/s: (was: 2.3.0) 2.4.0 balancer-Test failed because of time out - Key: HDFS-3747 URL: https://issues.apache.org/jira/browse/HDFS-3747 Project: Hadoop HDFS Issue Type: Test Components: balancer Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: meng gong Assignee: meng gong Labels: test Fix For: 3.0.0, 2.4.0 Attachments: TestCaseForBanlancer.java When running a given test case for the balancer, the balancer thread tries to move some blocks across racks but cannot find any available blocks in the source rack. The thread then does not exit until the isTimeUp flag reaches 20 min, but Maven judges the test failed because the thread has already run for 15 min. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4198) the shell script error for Cygwin on windows7
[ https://issues.apache.org/jira/browse/HDFS-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4198: Fix Version/s: (was: 2.3.0) 2.4.0 the shell script error for Cygwin on windows7 - Key: HDFS-4198 URL: https://issues.apache.org/jira/browse/HDFS-4198 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 2.0.2-alpha Environment: windows7 ,cygwin. Reporter: Han Hui Wen Fix For: 2.4.0 See the following [comment|https://issues.apache.org/jira/browse/HDFS-4198?focusedCommentId=13498818page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13498818] for detailed description. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-3907) Allow multiple users for local block readers
[ https://issues.apache.org/jira/browse/HDFS-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3907: Fix Version/s: (was: 2.3.0) 2.4.0 Allow multiple users for local block readers Key: HDFS-3907 URL: https://issues.apache.org/jira/browse/HDFS-3907 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.4.0 Attachments: hdfs-3907.txt The {{dfs.block.local-path-access.user}} config added in HDFS-2246 only supports a single user; however, as long as blocks are group-readable by more than one user, the feature could be used by multiple users. To support this we just need to allow more than one user to be configured. In practice this allows us to also support HBase, where the client (RS) runs as the hbase system user and the DN runs as the hdfs system user. I think this should work securely as well, since we're not using impersonation in the HBase case. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
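If the property were extended as proposed, the configuration might look like the snippet below. This is hypothetical: the shipped {{dfs.block.local-path-access.user}} property accepts a single user, and the comma-separated form is exactly what this issue asks for, not current behavior.

```xml
<!-- Hypothetical hdfs-site.xml fragment: the property extended to a
     comma-separated list so both the hbase user and another trusted
     user may perform local block reads. -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase,mapred</value>
</property>
```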
[jira] [Updated] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop
[ https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3746: Fix Version/s: (was: 2.3.0) 2.4.0 Invalid counter tag in HDFS balancer which lead to infinite loop Key: HDFS-3746 URL: https://issues.apache.org/jira/browse/HDFS-3746 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: meng gong Assignee: meng gong Labels: patch Fix For: 3.0.0, 2.4.0 Every time the balancer tries to move a block across racks, it instantiates a new balancer for every NameNodeConnector, and the notChangedIterations counter is reset to 0. It therefore never reaches 5, so the thread never exits for having moved nothing in 5 consecutive iterations. This leads to an infinite loop. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
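The exit condition the counter is supposed to enforce can be sketched as follows. This is an illustrative reduction of the loop, not the balancer's code; the names follow the description, the structure is an assumption.

```java
// Sketch of the intended behavior: notChangedIterations must persist
// across iterations so the balancer gives up after 5 consecutive
// iterations with no bytes moved. Resetting it each iteration (the
// reported bug) would make the exit condition unreachable.
public class BalancerLoop {
    static final int MAX_NOT_CHANGED_ITERATIONS = 5;

    // Returns how many iterations run before the loop exits, given a
    // per-iteration report of whether any bytes moved.
    public static int iterationsUntilExit(boolean[] movedPerIteration) {
        int notChangedIterations = 0; // persists across iterations
        int i = 0;
        for (; i < movedPerIteration.length; i++) {
            if (movedPerIteration[i]) {
                notChangedIterations = 0; // progress: reset the streak
            } else if (++notChangedIterations >= MAX_NOT_CHANGED_ITERATIONS) {
                return i + 1; // no progress in 5 consecutive iterations
            }
        }
        return i;
    }
}
```

With the bug, `notChangedIterations` would restart at 0 on every iteration and the `>= 5` branch would never fire, which is the infinite loop described above.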
[jira] [Updated] (HDFS-5522) Datanode disk error check may be incorrectly skipped
[ https://issues.apache.org/jira/browse/HDFS-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5522: Fix Version/s: (was: 2.3.0) 2.4.0 Datanode disk error check may be incorrectly skipped Key: HDFS-5522 URL: https://issues.apache.org/jira/browse/HDFS-5522 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.9, 2.2.0 Reporter: Kihwal Lee Fix For: 0.23.11, 2.4.0 After HDFS-4581 and HDFS-4699, {{checkDiskError()}} is not called when network errors occur during processing of data node requests. This creates problems when a disk is having trouble but does not fail I/O quickly. If I/O hangs for a long time, a network read/write may time out first and the peer may close the connection. Although the error was caused by a faulty local disk, the disk check is not carried out in this case. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-4066) Fix compile warnings for libwebhdfs
[ https://issues.apache.org/jira/browse/HDFS-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4066: Fix Version/s: (was: 2.3.0) 2.4.0 Fix compile warnings for libwebhdfs --- Key: HDFS-4066 URL: https://issues.apache.org/jira/browse/HDFS-4066 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Fix For: 2.4.0 The compile of libwebhdfs still generates a bunch of warnings, which need to be fixed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order.
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5244: Fix Version/s: (was: 2.3.0) 2.4.0 TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order. Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Assignee: Jinghui Wang Fix For: 3.0.0, 2.1.0-beta, 2.4.0 Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, which does not have any predictable order. The directories that need to be purged are stored in a LinkedHashSet, which has a predictable order. So, when the directories get mocked for the test, they can already be out of the order in which they were added. Thus, the order in which the directories were actually purged can differ from the order in which they were added to the LinkedHashSet, causing the test to fail. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
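The ordering mismatch described above can be demonstrated with the standard collections: LinkedHashSet iterates in insertion order, while HashMap (and HashSet) make no ordering guarantee, so a test that pairs the two by iteration order can fail spuriously. A self-contained sketch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Demonstrates why comparing by iteration order against a HashMap is fragile:
// only LinkedHashSet (and friends) guarantee insertion order.
public class OrderDemo {
    public static void main(String[] args) {
        Set<String> purged = new LinkedHashSet<>();
        purged.add("dir3");
        purged.add("dir1");
        purged.add("dir2");
        // LinkedHashSet preserves insertion order exactly.
        List<String> seen = new ArrayList<>(purged);
        if (!seen.equals(Arrays.asList("dir3", "dir1", "dir2"))) {
            throw new AssertionError("LinkedHashSet lost insertion order");
        }
        // A HashMap keyed on the same strings may iterate in any order, so
        // the robust fix is to compare as sets (order-insensitive) or to use
        // an order-preserving structure on both sides.
        System.out.println("ok");
    }
}
```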
[jira] [Updated] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4564: Target Version/s: 2.3.0 (was: 3.0.0, 2.3.0) Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples including rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
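The distinction at issue above is the standard HTTP one: 401 means the caller's identity was not established (authentication), while 403 means the identity is known but the operation is denied (authorization). A minimal sketch of the intended mapping; the method and its parameters are illustrative, not the actual webhdfs code:

```java
// Illustrative mapping of the two failure modes to HTTP status codes.
public class HttpStatusMapper {
    static int statusFor(boolean authenticated, boolean authorized) {
        if (!authenticated) {
            return 401; // Unauthorized: credentials missing or invalid
        }
        if (!authorized) {
            return 403; // Forbidden: e.g. a rejected proxy-user attempt
        }
        return 200;
    }

    public static void main(String[] args) {
        if (statusFor(true, false) != 403) throw new AssertionError();
        if (statusFor(false, false) != 401) throw new AssertionError();
        if (statusFor(true, true) != 200) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Under this mapping, an invalid proxy-user attempt or a renew/cancel by the wrong user (an authenticated but unpermitted caller) yields 403, not 401.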
[jira] [Commented] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891086#comment-13891086 ] Arun C Murthy commented on HDFS-4564: - [~daryn] - Is this close? Thanks. Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples including rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-4564) Webhdfs returns incorrect http response codes for denied operations
[ https://issues.apache.org/jira/browse/HDFS-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891167#comment-13891167 ] Arun C Murthy commented on HDFS-4564: - Ok, thanks [~daryn]! Webhdfs returns incorrect http response codes for denied operations --- Key: HDFS-4564 URL: https://issues.apache.org/jira/browse/HDFS-4564 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Blocker Attachments: HDFS-4564.branch-23.patch, HDFS-4564.branch-23.patch, HDFS-4564.patch Webhdfs is returning 401 (Unauthorized) instead of 403 (Forbidden) when it's denying operations. Examples including rejecting invalid proxy user attempts and renew/cancel with an invalid user. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-4949) Centralized cache management in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855882#comment-13855882 ] Arun C Murthy commented on HDFS-4949: - We've been discussing through a proposal where-in we can leverage YARN's resource-management *and* workload-management capabilities (via delegation of resources, in this case RAM to HDFS) to provide a more general cache administration in YARN-1488. Centralized cache management in HDFS Key: HDFS-4949 URL: https://issues.apache.org/jira/browse/HDFS-4949 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Affects Versions: 3.0.0, 2.4.0 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: HDFS-4949-consolidated.patch, caching-design-doc-2013-07-02.pdf, caching-design-doc-2013-08-09.pdf, caching-design-doc-2013-10-24.pdf, caching-testplan.pdf HDFS currently has no support for managing or exposing in-memory caches at datanodes. This makes it harder for higher level application frameworks like Hive, Pig, and Impala to effectively use cluster memory, because they cannot explicitly cache important datasets or place their tasks for memory locality. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HDFS-5291) Standby namenode after transition to active goes into safemode
[ https://issues.apache.org/jira/browse/HDFS-5291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5291: Fix Version/s: (was: 2.2.0) 2.2.1 Standby namenode after transition to active goes into safemode -- Key: HDFS-5291 URL: https://issues.apache.org/jira/browse/HDFS-5291 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.1.1-beta Reporter: Arpit Gupta Assignee: Jing Zhao Priority: Critical Fix For: 2.2.1 Attachments: HDFS-5291.000.patch, HDFS-5291.001.patch, HDFS-5291.002.patch, HDFS-5291.003.patch, HDFS-5291.003.patch, nn.log Some log snippets standby state to active transition {code} 2013-10-02 00:13:49,482 INFO ipc.Server (Server.java:run(2068)) - IPC Server handler 69 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.renewLease from IP:33911 Call#1483 Retry#1: error: org.apache.hadoop.ipc.StandbyException: Operation category WRITE is not supported in state standby 2013-10-02 00:13:49,689 INFO ipc.Server (Server.java:saslProcess(1342)) - Auth successful for nn/hostn...@example.com (auth:SIMPLE) 2013-10-02 00:13:49,696 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful for nn/hostn...@example.com (auth:KERBEROS) for protocol=interface org.apache.hadoop.ha.HAServiceProtocol 2013-10-02 00:13:49,700 INFO namenode.FSNamesystem (FSNamesystem.java:stopStandbyServices(1013)) - Stopping services started for standby state 2013-10-02 00:13:49,701 WARN ha.EditLogTailer (EditLogTailer.java:doWork(336)) - Edit log tailer interrupted java.lang.InterruptedException: sleep interrupted at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:334) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279) at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:356) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1463) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:454) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTail 2013-10-02 00:13:49,704 INFO namenode.FSNamesystem (FSNamesystem.java:startActiveServices(885)) - Starting services required for active state 2013-10-02 00:13:49,719 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(419)) - Starting recovery process for unclosed journal segments... 2013-10-02 00:13:49,755 INFO ipc.Server (Server.java:saslProcess(1342)) - Auth successful for hbase/hostn...@example.com (auth:SIMPLE) 2013-10-02 00:13:49,761 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(111)) - Authorization successful for hbase/hostn...@example.com (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(421)) - Successfully started new epoch 85 2013-10-02 00:13:49,839 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(249)) - Beginning recovery of unclosed segment starting at txid 887112 2013-10-02 00:13:49,874 INFO client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(258)) - Recovery prepare phase complete. 
Responses: IP:8485: segmentState { startTxId: 887112 endTxId: 887531 isInProgress: true } lastWriterEpoch: 84 lastCommittedTxId: 887530 172.18.145.97:8485: segmentState { startTxId: 887112 endTxId: 887531 isInProgress: true } lastWriterEpoch: 84 lastCommittedTxId: 887530 2013-10-02 00:13:49,875 INFO client.QuorumJournalManager (QuorumJournalManager.java:recover {code} And then we get into safemode {code} Construction[IP:1019|RBW]]} size 0 2013-10-02 00:13:50,277 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: IP:1019 is added to blk_IP157{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[IP:1019|RBW], ReplicaUnderConstruction[172.18.145.96:1019|RBW], ReplicaUnde rConstruction[IP:1019|RBW]]} size 0 2013-10-02 00:13:50,279 INFO hdfs.StateChange (FSNamesystem.java:reportStatus(4703)) - STATE* Safe mode ON. The reported blocks 1071 needs additional
[jira] [Updated] (HDFS-5307) Support both HTTP and HTTPS in jsp pages
[ https://issues.apache.org/jira/browse/HDFS-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5307: Fix Version/s: (was: 2.2.0) 2.2.1 Support both HTTP and HTTPS in jsp pages Key: HDFS-5307 URL: https://issues.apache.org/jira/browse/HDFS-5307 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.2.1 Attachments: HDFS-5307.000.patch Adopt the new APIs provided by HDFS-5306 to support both HTTP and HTTPS in the jsp pages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5317) Go back to DFS Home link does not work on datanode webUI
[ https://issues.apache.org/jira/browse/HDFS-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5317: Fix Version/s: (was: 2.2.0) 2.2.1 Go back to DFS Home link does not work on datanode webUI Key: HDFS-5317 URL: https://issues.apache.org/jira/browse/HDFS-5317 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Suresh Srinivas Assignee: Haohui Mai Priority: Critical Fix For: 2.2.1 Attachments: HDFS-5317.000.patch, HDFS-5317.001.patch Here is how to reproduce this problem: # Goto https://namenode:https_port/dfshealth.jsp # Click on Live Nodes link # On the Live Nodes page, click one of the links for the datanodes. This link has always hardcodes the http link instead of http/https based on the context of the webpage. On datanode webUI page, the link Go back to DFS home does not work. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4817) make HDFS advisory caching configurable on a per-file basis
[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13787917#comment-13787917 ] Arun C Murthy commented on HDFS-4817: - [~cmccabe] In future, if you backport patches, please ensure you reflect it appropriately in all branches. For e.g. HDFS-4817. Make HDFS advisory caching configurable on a per-file basis. (Colin Patrick McCabe) shows up in Release 2.3.0 as an Improvement in trunk/branch-2 while it shows up as a New Feature in branch-2.1-beta. This will reduce the burden on making releases, makes it consistent for folks to figure release etc. Thanks. make HDFS advisory caching configurable on a per-file basis --- Key: HDFS-4817 URL: https://issues.apache.org/jira/browse/HDFS-4817 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.2.0 Attachments: HDFS-4817.001.patch, HDFS-4817.002.patch, HDFS-4817.004.patch, HDFS-4817.006.patch, HDFS-4817.007.patch, HDFS-4817.008.patch, HDFS-4817.009.patch, HDFS-4817.010.patch, HDFS-4817-b2.1.001.patch HADOOP-7753 and related JIRAs introduced some performance optimizations for the DataNode. One of them was readahead. When readahead is enabled, the DataNode starts reading the next bytes it thinks it will need in the block file, before the client requests them. This helps hide the latency of rotational media and send larger reads down to the device. Another optimization was drop-behind. Using this optimization, we could remove files from the Linux page cache after they were no longer needed. Using {{dfs.datanode.drop.cache.behind.writes}} and {{dfs.datanode.drop.cache.behind.reads}} can improve performance substantially on many MapReduce jobs. In our internal benchmarks, we have seen speedups of 40% on certain workloads. 
The reason is because if we know the block data will not be read again any time soon, keeping it out of memory allows more memory to be used by the other processes on the system. See HADOOP-7714 for more benchmarks. We would like to turn on these configurations on a per-file or per-client basis, rather than on the DataNode as a whole. This will allow more users to actually make use of them. It would also be good to add unit tests for the drop-cache code path, to ensure that it is functioning as we expect. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5225) datanode keeps logging the same 'is no longer in the dataset' message over and over again
[ https://issues.apache.org/jira/browse/HDFS-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5225: Priority: Blocker (was: Major) datanode keeps logging the same 'is no longer in the dataset' message over and over again - Key: HDFS-5225 URL: https://issues.apache.org/jira/browse/HDFS-5225 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.1.1-beta Reporter: Roman Shaposhnik Priority: Blocker Attachments: HDFS-5225.1.patch, HDFS-5225-reproduce.1.txt I was running the usual Bigtop testing on 2.1.1-beta RC1 with the following configuration: 4 nodes fully distributed cluster with security on. All of a sudden my DN ate up all of the space in /var/log logging the following message repeatedly: {noformat} 2013-09-18 20:51:12,046 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369 is no longer in the dataset {noformat} It wouldn't answer to a jstack and jstack -F ended up being useless. Here's what I was able to find in the NameNode logs regarding this block ID: {noformat} fgrep -rI 'blk_1073742189' hadoop-hdfs-namenode-ip-10-224-158-152.log 2013-09-18 18:03:16,972 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /user/jenkins/testAppendInputWedSep18180222UTC2013/test4.fileWedSep18180222UTC2013._COPYING_. 
BP-1884637155-10.224.158.152-1379524544853 blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.224.158.152:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.34.74.206:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,224 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.83.107.80:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,899 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369, newGenerationStamp=1370, newLength=1048576, newNodes=[10.83.107.80:1004, 10.34.74.206:1004, 10.224.158.152:1004], clientName=DFSClient_NONMAPREDUCE_-450304083_1) 2013-09-18 18:03:17,904 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369) successfully to BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370 2013-09-18 18:03:18,540 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370, newGenerationStamp=1371, newLength=2097152, newNodes=[10.83.107.80:1004, 10.34.74.206:1004, 10.224.158.152:1004], clientName=DFSClient_NONMAPREDUCE_-450304083_1) 2013-09-18 18:03:18,548 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370) successfully to BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1371 2013-09-18 18:03:26,150 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073742189_1371 10.83.107.80:1004 10.34.74.206:1004 10.224.158.152:1004 2013-09-18 18:03:26,847 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* InvalidateBlocks: ask 10.34.74.206:1004 to delete [blk_1073742178_1359, blk_1073742183_1362, blk_1073742184_1363, blk_1073742186_1366, blk_1073742188_1368, blk_1073742189_1371] 2013-09-18 18:03:29,848 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* InvalidateBlocks: ask 10.224.158.152:1004 to delete [blk_1073742177_1353, blk_1073742178_1359, blk_1073742179_1355, blk_1073742180_1356, blk_1073742181_1358, blk_1073742182_1361, blk_1073742185_1364, blk_1073742187_1367, blk_1073742188_1368, blk_1073742189_1371]
[jira] [Updated] (HDFS-5225) datanode keeps logging the same 'is no longer in the dataset' message over and over again
[ https://issues.apache.org/jira/browse/HDFS-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5225: Target Version/s: 2.1.2-beta datanode keeps logging the same 'is no longer in the dataset' message over and over again - Key: HDFS-5225 URL: https://issues.apache.org/jira/browse/HDFS-5225 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.1.1-beta Reporter: Roman Shaposhnik Priority: Blocker Attachments: HDFS-5225.1.patch, HDFS-5225-reproduce.1.txt I was running the usual Bigtop testing on 2.1.1-beta RC1 with the following configuration: 4 nodes fully distributed cluster with security on. All of a sudden my DN ate up all of the space in /var/log logging the following message repeatedly: {noformat} 2013-09-18 20:51:12,046 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369 is no longer in the dataset {noformat} It wouldn't answer to a jstack and jstack -F ended up being useless. Here's what I was able to find in the NameNode logs regarding this block ID: {noformat} fgrep -rI 'blk_1073742189' hadoop-hdfs-namenode-ip-10-224-158-152.log 2013-09-18 18:03:16,972 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /user/jenkins/testAppendInputWedSep18180222UTC2013/test4.fileWedSep18180222UTC2013._COPYING_. 
BP-1884637155-10.224.158.152-1379524544853 blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.224.158.152:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,222 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.34.74.206:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,224 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.83.107.80:1004 is added to blk_1073742189_1369{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.83.107.80:1004|RBW], ReplicaUnderConstruction[10.34.74.206:1004|RBW], ReplicaUnderConstruction[10.224.158.152:1004|RBW]]} size 0 2013-09-18 18:03:17,899 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369, newGenerationStamp=1370, newLength=1048576, newNodes=[10.83.107.80:1004, 10.34.74.206:1004, 10.224.158.152:1004], clientName=DFSClient_NONMAPREDUCE_-450304083_1) 2013-09-18 18:03:17,904 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1369) successfully to BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370 2013-09-18 18:03:18,540 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370, newGenerationStamp=1371, newLength=2097152, newNodes=[10.83.107.80:1004, 10.34.74.206:1004, 10.224.158.152:1004], clientName=DFSClient_NONMAPREDUCE_-450304083_1) 2013-09-18 18:03:18,548 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1370) successfully to BP-1884637155-10.224.158.152-1379524544853:blk_1073742189_1371 2013-09-18 18:03:26,150 INFO BlockStateChange: BLOCK* addToInvalidates: blk_1073742189_1371 10.83.107.80:1004 10.34.74.206:1004 10.224.158.152:1004 2013-09-18 18:03:26,847 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* InvalidateBlocks: ask 10.34.74.206:1004 to delete [blk_1073742178_1359, blk_1073742183_1362, blk_1073742184_1363, blk_1073742186_1366, blk_1073742188_1368, blk_1073742189_1371] 2013-09-18 18:03:29,848 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* InvalidateBlocks: ask 10.224.158.152:1004 to delete [blk_1073742177_1353, blk_1073742178_1359, blk_1073742179_1355, blk_1073742180_1356, blk_1073742181_1358, blk_1073742182_1361, blk_1073742185_1364, blk_1073742187_1367, blk_1073742188_1368, blk_1073742189_1371]
[jira] [Updated] (HDFS-5228) The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE
[ https://issues.apache.org/jira/browse/HDFS-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5228: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta The RemoteIterator returned by DistributedFileSystem.listFiles(..) may throw NPE Key: HDFS-5228 URL: https://issues.apache.org/jira/browse/HDFS-5228 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.1.0-beta Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Blocker Fix For: 2.1.2-beta Attachments: h5228_20130919.patch, h5228_20130919_test.patch, HDFS-5228.2.patch Get a RemoteIterator from DistributedFileSystem.listFiles(..) with a relative path. Then, calling hasNext() on the RemoteIterator results in a NullPointerException. This bug was discovered by Arnaud: http://hortonworks.com/community/forums/topic/new-bug-in-hdfs-listfiles-method/ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
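The defensive pattern behind fixes of this kind is to resolve a possibly-relative path against the working directory before handing it to code that assumes an absolute path. A sketch using the standard library (not the actual HDFS patch, whose details are in the attachments above):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative fix pattern: normalize relative paths early, so downstream
// code never sees a path it cannot resolve.
public class PathNormalize {
    static Path makeAbsolute(Path p, Path workingDir) {
        return p.isAbsolute() ? p : workingDir.resolve(p).normalize();
    }

    public static void main(String[] args) {
        Path wd = Paths.get("/user/alice");
        if (!makeAbsolute(Paths.get("data/file.txt"), wd)
                .equals(Paths.get("/user/alice/data/file.txt"))) {
            throw new AssertionError();
        }
        System.out.println("ok");
    }
}
```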
[jira] [Updated] (HDFS-5246) Hadoop nfs server binds to port 2049 which is the same as Linux nfs server
[ https://issues.apache.org/jira/browse/HDFS-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5246: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta Hadoop nfs server binds to port 2049 which is the same as Linux nfs server -- Key: HDFS-5246 URL: https://issues.apache.org/jira/browse/HDFS-5246 Project: Hadoop HDFS Issue Type: Improvement Components: nfs Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM Java 1.6 Reporter: Jinghui Wang Fix For: 3.0.0, 2.1.2-beta Attachments: HDFS-5246-2.patch, HDFS-5246-3.patch, HDFS-5246.patch Hadoop nfs binds the nfs server to port 2049, which is also the default port that Linux nfs uses. If Linux nfs is already running on the machine, then Hadoop nfs will not be able to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
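The natural remedy for a port collision like this is to make the listener port configurable, keeping 2049 as the default. A minimal sketch; the property name below is illustrative, not an actual Hadoop configuration key:

```java
import java.util.Properties;

// Illustrative sketch: read the NFS listener port from configuration with
// 2049 as the default, so it can be moved when the kernel nfsd owns 2049.
public class NfsPortConfig {
    static int nfsPort(Properties conf) {
        return Integer.parseInt(conf.getProperty("nfs3.server.port", "2049"));
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        if (nfsPort(defaults) != 2049) throw new AssertionError();

        Properties overridden = new Properties();
        overridden.setProperty("nfs3.server.port", "12049");
        if (nfsPort(overridden) != 12049) throw new AssertionError();
        System.out.println("ok");
    }
}
```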
[jira] [Updated] (HDFS-5244) TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order.
[ https://issues.apache.org/jira/browse/HDFS-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5244: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta TestNNStorageRetentionManager#testPurgeMultipleDirs fails because incorrectly expects Hashmap values to have order. Key: HDFS-5244 URL: https://issues.apache.org/jira/browse/HDFS-5244 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Environment: Red Hat Enterprise 6 with Sun Java 1.7 and IBM java 1.6 Reporter: Jinghui Wang Fix For: 3.0.0, 2.1.0-beta, 2.1.2-beta Attachments: HDFS-5244.patch The test o.a.h.hdfs.server.namenode.TestNNStorageRetentionManager uses a HashMap (dirRoots) to store the root storages to be mocked for the purging test, which does not have any predictable order. The directories that need to be purged are stored in a LinkedHashSet, which has a predictable order. So, when the directories get mocked for the test, they can already be out of the order in which they were added. Thus, the order in which the directories were actually purged can differ from the order in which they were added to the LinkedHashSet, causing the test to fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-5160) MiniDFSCluster webui does not work
[ https://issues.apache.org/jira/browse/HDFS-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5160: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta MiniDFSCluster webui does not work -- Key: HDFS-5160 URL: https://issues.apache.org/jira/browse/HDFS-5160 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Fix For: 2.1.2-beta The webui does not work; when going to http://localhost:50070 you get: {code} Directory: / webapps/ 102 bytes Sep 4, 2013 9:32:55 AM {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4754) Add an API in the namenode to mark a datanode as stale
[ https://issues.apache.org/jira/browse/HDFS-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4754: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta Add an API in the namenode to mark a datanode as stale -- Key: HDFS-4754 URL: https://issues.apache.org/jira/browse/HDFS-4754 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Fix For: 3.0.0, 2.1.2-beta Attachments: 4754.v1.patch, 4754.v2.patch, 4754.v4.patch, 4754.v4.patch There is a detection of stale datanodes in HDFS since HDFS-3703, with a timeout, defaulted to 30s. There are two reasons to add an API to mark a node as stale even if the timeout is not yet reached: 1) ZooKeeper can detect that a client is dead at any moment. So, for HBase, we sometimes start the recovery before a node is marked stale (even with reasonable settings such as stale: 20s; HBase ZK timeout: 30s). 2) Some third parties could detect that a node is dead before the timeout, hence saving us the cost of retrying. An example of such hardware is Arista, presented here by [~tsuna] http://tsunanet.net/~tsuna/fsf-hbase-meetup-april13.pdf, and confirmed in HBASE-6290. As usual, even if the node is dead it can come back before the 10 minutes limit. So I would propose to set a time bound. The API would be namenode.markStale(String ipAddress, int port, long durationInMs); After durationInMs, the namenode would again rely only on its heartbeat to decide. Thoughts? If there is no objections, and if nobody in the hdfs dev team has the time to spend some time on it, I will give it a try for branch 2 3. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
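The time-bounded semantics proposed above can be sketched as a small state holder: a node marked stale stays stale only until its deadline, after which the namenode falls back to heartbeat-based detection. All names are illustrative; the JIRA only proposes the `markStale(ipAddress, port, durationInMs)` signature:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of time-bounded manual stale marking. Time is passed in
// explicitly so the logic is deterministic and testable.
public class StaleMarker {
    private final Map<String, Long> staleUntil = new HashMap<>();

    void markStale(String node, long durationMs, long nowMs) {
        staleUntil.put(node, nowMs + durationMs);
    }

    boolean isMarkedStale(String node, long nowMs) {
        Long deadline = staleUntil.get(node);
        // After the deadline the mark expires and heartbeats decide again.
        return deadline != null && nowMs < deadline;
    }

    public static void main(String[] args) {
        StaleMarker m = new StaleMarker();
        m.markStale("10.0.0.1:1004", 20_000, 1_000_000);
        if (!m.isMarkedStale("10.0.0.1:1004", 1_010_000)) throw new AssertionError();
        if (m.isMarkedStale("10.0.0.1:1004", 1_020_001)) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The explicit expiry addresses the "node comes back before the 10 minutes limit" concern: a returning node is never permanently blacklisted by an external marker.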
[jira] [Updated] (HDFS-5098) Enhance FileSystem.Statistics to have locality information
[ https://issues.apache.org/jira/browse/HDFS-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5098: Fix Version/s: (was: 2.1.1-beta) 2.1.2-beta Enhance FileSystem.Statistics to have locality information -- Key: HDFS-5098 URL: https://issues.apache.org/jira/browse/HDFS-5098 Project: Hadoop HDFS Issue Type: Improvement Reporter: Bikas Saha Assignee: Suresh Srinivas Fix For: 2.1.2-beta Currently in MR/Tez we don't have a good and accurate means to detect how much of the IO was actually done locally. Getting this information from the source of truth would be much better.
[jira] [Updated] (HDFS-5139) Remove redundant -R option from setrep
[ https://issues.apache.org/jira/browse/HDFS-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5139: Fix Version/s: (was: 2.2.0) 2.1.2-beta Remove redundant -R option from setrep -- Key: HDFS-5139 URL: https://issues.apache.org/jira/browse/HDFS-5139 Project: Hadoop HDFS Issue Type: Improvement Components: tools Affects Versions: 3.0.0, 1.3.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 2.1.2-beta Attachments: HDFS-5139.01.patch, HDFS-5139.02.patch, HDFS-5139.03.patch, HDFS-5139.04.patch The -R option to setrep is redundant because it is required for directory targets and ignored for file targets. We can just remove the option and make -R the default for directories.
[jira] [Updated] (HDFS-5251) Race between the initialization of NameNode and the http server
[ https://issues.apache.org/jira/browse/HDFS-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5251: Fix Version/s: (was: 2.2.0) 2.1.2-beta Race between the initialization of NameNode and the http server --- Key: HDFS-5251 URL: https://issues.apache.org/jira/browse/HDFS-5251 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.1.2-beta Attachments: HDFS-5251.000.patch The constructor of NameNode starts an HTTP server before the FSNameSystem is initialized. Currently there is a race where the HTTP server can access the uninitialized namesystem variable, throwing a NullPointerException.
[jira] [Commented] (HDFS-3983) Hftp should support both SPNEGO and KSSL
[ https://issues.apache.org/jira/browse/HDFS-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768351#comment-13768351 ] Arun C Murthy commented on HDFS-3983: - Any update on this? Is this ready to go? Thanks. Hftp should support both SPNEGO and KSSL Key: HDFS-3983 URL: https://issues.apache.org/jira/browse/HDFS-3983 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Blocker Attachments: hdfs-3983.txt, hdfs-3983.txt Hftp currently doesn't work against a secure cluster unless you configure {{dfs.https.port}} to be the http port; otherwise the client can't fetch tokens: {noformat} $ hadoop fs -ls hftp://c1225.hal.cloudera.com:50070/ 12/09/26 18:02:00 INFO fs.FileSystem: Couldn't get a delegation token from http://c1225.hal.cloudera.com:50470 using http. ls: Security enabled but user not authenticated by filter {noformat} This is due to Hftp still using the https port. Post HDFS-2617 it should use the regular http port. Hsftp should still use the secure port; however, now that we have HADOOP-8581 it's worth considering removing Hsftp entirely. I'll start a separate thread about that.
[jira] [Updated] (HDFS-5138) Support HDFS upgrade in HA
[ https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5138: Target Version/s: (was: 2.1.1-beta) Support HDFS upgrade in HA -- Key: HDFS-5138 URL: https://issues.apache.org/jira/browse/HDFS-5138 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Kihwal Lee Priority: Blocker With HA enabled, the NN won't start with -upgrade. Since there has been a layout version change between 2.0.x and 2.1.x, starting the NN in upgrade mode was necessary when deploying 2.1.x to an existing 2.0.x cluster, but the only way to get around this was to disable HA and upgrade. The NN and the cluster cannot be flipped back to HA until the upgrade is finalized. If HA is disabled only on the NN for the layout upgrade and HA is turned back on without involving DNs, things will work, but finalizeUpgrade won't work (the NN is in HA and it cannot be in upgrade mode) and the DNs' upgrade snapshots won't get removed. We will need a different way of doing the layout upgrade and upgrade snapshot. I am marking this as a 2.1.1-beta blocker based on feedback from others. If there is a reasonable workaround that does not increase the maintenance window greatly, we can lower its priority from blocker to critical.
[jira] [Updated] (HDFS-4952) dfs -ls hftp:// fails on secure hadoop2 cluster
[ https://issues.apache.org/jira/browse/HDFS-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4952: Target Version/s: (was: 2.1.1-beta) dfs -ls hftp:// fails on secure hadoop2 cluster --- Key: HDFS-4952 URL: https://issues.apache.org/jira/browse/HDFS-4952 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Priority: Blocker Running: hadoop dfs -ls hftp://namenode:namenodeport/A WARN fs.FileSystem: Couldn't connect to http://namenode:50470, assuming security is disabled ls: Security enabled but user not authenticated by filter
[jira] [Resolved] (HDFS-4952) dfs -ls hftp:// fails on secure hadoop2 cluster
[ https://issues.apache.org/jira/browse/HDFS-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy resolved HDFS-4952. - Resolution: Fixed Duplicate of HDFS-3983 dfs -ls hftp:// fails on secure hadoop2 cluster --- Key: HDFS-4952 URL: https://issues.apache.org/jira/browse/HDFS-4952 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Priority: Blocker Running: hadoop dfs -ls hftp://namenode:namenodeport/A WARN fs.FileSystem: Couldn't connect to http://namenode:50470, assuming security is disabled ls: Security enabled but user not authenticated by filter
[jira] [Commented] (HDFS-5138) Support HDFS upgrade in HA
[ https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765519#comment-13765519 ] Arun C Murthy commented on HDFS-5138: - Guys, any luck? I'd like to get 2.1.1 out soon... thanks! Support HDFS upgrade in HA -- Key: HDFS-5138 URL: https://issues.apache.org/jira/browse/HDFS-5138 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Kihwal Lee Priority: Blocker With HA enabled, the NN won't start with -upgrade. Since there has been a layout version change between 2.0.x and 2.1.x, starting the NN in upgrade mode was necessary when deploying 2.1.x to an existing 2.0.x cluster, but the only way to get around this was to disable HA and upgrade. The NN and the cluster cannot be flipped back to HA until the upgrade is finalized. If HA is disabled only on the NN for the layout upgrade and HA is turned back on without involving DNs, things will work, but finalizeUpgrade won't work (the NN is in HA and it cannot be in upgrade mode) and the DNs' upgrade snapshots won't get removed. We will need a different way of doing the layout upgrade and upgrade snapshot. I am marking this as a 2.1.1-beta blocker based on feedback from others. If there is a reasonable workaround that does not increase the maintenance window greatly, we can lower its priority from blocker to critical.
[jira] [Updated] (HDFS-5132) Deadlock in NameNode between SafeModeMonitor#run and DatanodeManager#handleHeartbeat
[ https://issues.apache.org/jira/browse/HDFS-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5132: Priority: Blocker (was: Major) Deadlock in NameNode between SafeModeMonitor#run and DatanodeManager#handleHeartbeat Key: HDFS-5132 URL: https://issues.apache.org/jira/browse/HDFS-5132 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.0-beta Reporter: Arpit Gupta Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-5132.001.patch, HDFS-5132.patch, jstack.log While running the nn load generator with 15 threads for 20 mins, the standby namenode deadlocked.
[jira] [Updated] (HDFS-5132) Deadlock in NameNode between SafeModeMonitor#run and DatanodeManager#handleHeartbeat
[ https://issues.apache.org/jira/browse/HDFS-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-5132: Target Version/s: 2.1.1-beta Deadlock in NameNode between SafeModeMonitor#run and DatanodeManager#handleHeartbeat Key: HDFS-5132 URL: https://issues.apache.org/jira/browse/HDFS-5132 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.1.0-beta Reporter: Arpit Gupta Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-5132.001.patch, HDFS-5132.patch, jstack.log While running the nn load generator with 15 threads for 20 mins, the standby namenode deadlocked.
[jira] [Updated] (HDFS-4489) Use InodeID as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4489: Fix Version/s: (was: 2.1.0-beta) 2.3.0 Use InodeID as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.3.0 Attachments: 4434.optimized.patch The benefits of using the InodeID to uniquely identify a file are multi-fold. Here are a few of them: 1. uniquely identify a file across renames; related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is not reliable, but the combination of file id and size is unique. 3. id-based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385).
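The distcp point in (2) can be shown with a toy key class; the names here are hypothetical illustrations, not the actual distcp code:

```java
import java.util.Objects;

// Toy illustration: a (name, size) check cannot tell a replaced file
// from the original, but a (fileId, size) key can, because the inode
// id changes when a file is deleted and re-created at the same path.
class CopyKey {
    final long fileId;   // inode id, stable across renames
    final long length;

    CopyKey(long fileId, long length) {
        this.fileId = fileId;
        this.length = length;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof CopyKey)) return false;
        CopyKey k = (CopyKey) o;
        return fileId == k.fileId && length == k.length;
    }

    @Override public int hashCode() { return Objects.hash(fileId, length); }
}
```

Two files at the same path with the same size but different inode ids compare unequal, which is exactly the distinction a name-and-size check misses.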
[jira] [Updated] (HDFS-3747) balancer-Test failed because of time out
[ https://issues.apache.org/jira/browse/HDFS-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3747: Fix Version/s: (was: 2.1.0-beta) 2.3.0 balancer-Test failed because of time out - Key: HDFS-3747 URL: https://issues.apache.org/jira/browse/HDFS-3747 Project: Hadoop HDFS Issue Type: Test Components: balancer Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: meng gong Assignee: meng gong Labels: test Fix For: 3.0.0, 2.3.0 Attachments: TestCaseForBanlancer.java When running a given test case for the Balancer test, the balancer thread tries to move some blocks across the rack but cannot find any available blocks in the source rack. The thread then does not exit until the isTimeUp flag reaches 20 min, but Maven judges the test failed because the thread has run for 15 min.
[jira] [Updated] (HDFS-4559) WebHDFS does not allow resolution of Symlinks
[ https://issues.apache.org/jira/browse/HDFS-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4559: Fix Version/s: (was: 2.1.0-beta) 2.3.0 WebHDFS does not allow resolution of Symlinks - Key: HDFS-4559 URL: https://issues.apache.org/jira/browse/HDFS-4559 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.3-alpha Reporter: Plamen Jeliazkov Assignee: Plamen Jeliazkov Fix For: 3.0.0, 2.3.0 WebHDFS allows you to create symlinks via the CREATESYMLINK operation, but the GETFILEINFO operation specifically calls the getFileInfo() method of the NameNodeRpcServer, which does not resolve symlinks. I propose adding a parameter to GETFILEINFO such that, if true, it will call getFileLinkInfo() rather than getFileInfo(), which will resolve any symlinks.
[jira] [Updated] (HDFS-3988) HttpFS can't do GETDELEGATIONTOKEN without a prior authenticated request
[ https://issues.apache.org/jira/browse/HDFS-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3988: Fix Version/s: (was: 2.1.0-beta) 2.3.0 HttpFS can't do GETDELEGATIONTOKEN without a prior authenticated request Key: HDFS-3988 URL: https://issues.apache.org/jira/browse/HDFS-3988 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.3.0 A request to obtain a delegation token cannot itself initiate an authentication sequence; it must be accompanied by an auth cookie obtained in a previous request using a different operation.
[jira] [Updated] (HDFS-3987) add support for encrypted webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3987: Fix Version/s: (was: 2.1.0-beta) 2.3.0 add support for encrypted webhdfs - Key: HDFS-3987 URL: https://issues.apache.org/jira/browse/HDFS-3987 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 2.0.2-alpha Reporter: Alejandro Abdelnur Fix For: 2.3.0 This is a follow-up of HDFS-3983. We should have a new filesystem client impl/binding for encrypted WebHDFS, i.e. *webhdfss://* On the server side, for webhdfs and httpfs we should only need to start the service on a secured (HTTPS) endpoint.
[jira] [Updated] (HDFS-4286) Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server jar in hidden package of BKJM
[ https://issues.apache.org/jira/browse/HDFS-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4286: Fix Version/s: (was: 2.1.0-beta) 2.3.0 Changes from BOOKKEEPER-203 broken capability of including bookkeeper-server jar in hidden package of BKJM -- Key: HDFS-4286 URL: https://issues.apache.org/jira/browse/HDFS-4286 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Vinay Fix For: 3.0.0, 2.3.0 BOOKKEEPER-203 introduced changes to LedgerLayout to include the ManagerFactoryClass instead of the ManagerFactoryName. Because of this, BKJM cannot shade the bookkeeper-server jar inside the BKJM jar: the LAYOUT znode created by the BookieServer is not readable by BKJM, as it has classes in hidden packages (and the same problem applies vice versa).
[jira] [Updated] (HDFS-3592) libhdfs should expose ClientProtocol::mkdirs2
[ https://issues.apache.org/jira/browse/HDFS-3592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3592: Fix Version/s: (was: 2.1.0-beta) 2.3.0 libhdfs should expose ClientProtocol::mkdirs2 - Key: HDFS-3592 URL: https://issues.apache.org/jira/browse/HDFS-3592 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 3.0.0, 2.3.0 It would be nice if libhdfs exposed mkdirs2. This version of mkdirs is much more verbose about any errors that occur-- it throws AccessControlException, FileAlreadyExists, FileNotFoundException, ParentNotDirectoryException, etc. The original mkdirs just throws IOException if anything goes wrong. For something like fuse_dfs, it is very important to return the correct errno code when an error has occurred. mkdirs2 would allow us to do that. I'm not sure if we should just change hdfsMkdirs to use mkdirs2, or add an hdfsMkdirs2. Probably the latter, but the former course would maintain bug compatibility with ancient releases-- if that is important.
[jira] [Updated] (HDFS-3746) Invalid counter tag in HDFS balancer which lead to infinite loop
[ https://issues.apache.org/jira/browse/HDFS-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3746: Fix Version/s: (was: 2.1.0-beta) 2.3.0 Invalid counter tag in HDFS balancer which lead to infinite loop Key: HDFS-3746 URL: https://issues.apache.org/jira/browse/HDFS-3746 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: meng gong Assignee: meng gong Labels: patch Fix For: 3.0.0, 2.3.0 Every time the balancer tries to move a block across the rack, it instantiates a new Balancer for every NameNodeConnector, and the notChangedIterations counter is reset to 0. It therefore never reaches 5, so the thread is never exited for having moved nothing in 5 consecutive iterations. This leads to an infinite loop.
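The effect of resetting the counter can be modeled with a small simulation (hypothetical names; the real Balancer logic is more involved): a counter that survives across iterations exits after 5 idle iterations, while one reset each iteration, as described above, never does.

```java
// Simulates the "exit after 5 iterations with no blocks moved" rule.
// Returns the number of iterations run before exiting, or -1 if the
// loop hits maxIterations without exiting (the infinite-loop bug).
class BalancerLoopModel {
    static int iterationsUntilExit(boolean resetCounterEachIteration, int maxIterations) {
        int notChangedIterations = 0;
        for (int i = 1; i <= maxIterations; i++) {
            if (resetCounterEachIteration) {
                notChangedIterations = 0; // the bug: a fresh Balancer each time
            }
            long bytesMoved = 0; // no movable blocks in the source rack
            if (bytesMoved == 0) {
                notChangedIterations++;
                if (notChangedIterations >= 5) {
                    return i; // intended behavior: give up after 5 idle iterations
                }
            } else {
                notChangedIterations = 0;
            }
        }
        return -1; // never exited
    }
}
```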
[jira] [Updated] (HDFS-3907) Allow multiple users for local block readers
[ https://issues.apache.org/jira/browse/HDFS-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3907: Fix Version/s: (was: 2.1.0-beta) 2.3.0 Allow multiple users for local block readers Key: HDFS-3907 URL: https://issues.apache.org/jira/browse/HDFS-3907 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.3.0 Attachments: hdfs-3907.txt The {{dfs.block.local-path-access.user}} config added in HDFS-2246 only supports a single user. However, as long as blocks are group-readable by more than one user, the feature could be used by multiple users; to support this we just need to allow both to be configured. In practice this also allows us to support HBase, where the client (RS) runs as the hbase system user and the DN runs as the hdfs system user. I think this should work securely as well, since we're not using impersonation in the HBase case.
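Supporting multiple users is mostly a matter of treating the config value as a list. A minimal sketch, assuming a comma-separated value for {{dfs.block.local-path-access.user}} (the helper class here is hypothetical, not the code in the attached patch):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: parse a comma-separated user list and check
// whether a caller is permitted to use local block reads.
class LocalReadUsers {
    private final Set<String> allowed = new HashSet<>();

    LocalReadUsers(String confValue) {
        if (confValue != null) {
            for (String u : confValue.split(",")) {
                String trimmed = u.trim();
                if (!trimmed.isEmpty()) allowed.add(trimmed);
            }
        }
    }

    boolean isAllowed(String user) { return allowed.contains(user); }
}
```

With a value like "hbase,hdfs", both the HBase region server and the datanode user would pass the check while other users are rejected.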
[jira] [Updated] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-2802: Fix Version/s: (was: 2.1.0-beta) 2.3.0 Support for RW/RO snapshots in HDFS --- Key: HDFS-2802 URL: https://issues.apache.org/jira/browse/HDFS-2802 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Reporter: Hari Mankude Assignee: Tsz Wo (Nicholas), SZE Fix For: 2.3.0 Attachments: 2802.diff, 2802.patch, 2802.patch, h2802_20130417.patch, h2802_20130422.patch, h2802_20130423.patch, h2802_20130425.patch, h2802_20130426.patch, h2802_20130430b.patch, h2802_20130430.patch, h2802_20130508.patch, h2802_20130509.patch, HDFS-2802.20121101.patch, HDFS-2802.branch-2.0510.patch, HDFS-2802-meeting-minutes-121101.txt, HDFSSnapshotsDesign.pdf, snap.patch, snapshot-design.pdf, snapshot-design.tex, snapshot-one-pager.pdf, Snapshots20120429.pdf, Snapshots20121018.pdf, Snapshots20121030.pdf, Snapshots.pdf, snapshot-testplan.pdf Snapshots are point-in-time images of parts of the filesystem or the entire filesystem. Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with more information.
[jira] [Updated] (HDFS-3691) UserGroupInformation#spawnAutoRenewalThreadForUserCreds tries to renew even if the kerberos ticket cache is non-renewable
[ https://issues.apache.org/jira/browse/HDFS-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3691: Fix Version/s: (was: 2.1.0-beta) 2.3.0 UserGroupInformation#spawnAutoRenewalThreadForUserCreds tries to renew even if the kerberos ticket cache is non-renewable - Key: HDFS-3691 URL: https://issues.apache.org/jira/browse/HDFS-3691 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Priority: Minor Fix For: 2.3.0 UserGroupInformation#spawnAutoRenewalThreadForUserCreds tries to renew user credentials. However, it does this even if the kerberos ticket cache in question is non-renewable. This leads to an annoying error message being printed out all the time. {code} cmccabe@keter:/h klist Ticket cache: FILE:/tmp/krb5cc_1014 Default principal: hdfs/ke...@cloudera.com Valid starting Expires Service principal 07/18/12 15:24:15 07/19/12 15:24:13 krbtgt/cloudera@cloudera.com {code} {code} cmccabe@keter:/h ./bin/hadoop fs -ls / 15:21:39,882 WARN UserGroupInformation:739 - Exception encountered while running the renewal command. Aborting renew thread. org.apache.hadoop.util.Shell$ExitCodeException: kinit: KDC can't fulfill requested option while renewing credentials Found 3 items -rw-r--r-- 3 cmccabe users 0 2012-07-09 17:15 /b -rw-r--r-- 3 hdfs supergroup 0 2012-07-09 17:17 /c drwxrwxrwx - cmccabe audio 0 2012-07-19 11:25 /tmp {code}
[jira] [Updated] (HDFS-3828) Block Scanner rescans blocks too frequently
[ https://issues.apache.org/jira/browse/HDFS-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3828: Fix Version/s: (was: 2.1.0-beta) 2.3.0 Block Scanner rescans blocks too frequently --- Key: HDFS-3828 URL: https://issues.apache.org/jira/browse/HDFS-3828 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Andy Isaacson Assignee: Andy Isaacson Fix For: 2.3.0 Attachments: hdfs-3828-1.txt, hdfs-3828-2.txt, hdfs-3828-3.txt, hdfs3828.txt {{BlockPoolSliceScanner#scan}} calls cleanUp every time it's invoked from {{DataBlockScanner#run}} via {{scanBlockPoolSlice}}. But cleanUp unconditionally roll()s the verificationLogs, so after two iterations we have lost the first iteration of block verification times. As a result a cluster with just one block repeatedly rescans it every 10 seconds: {noformat} 2012-08-16 15:59:57,884 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 2012-08-16 16:00:07,904 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 2012-08-16 16:00:17,925 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 {noformat} To fix this, we need to avoid roll()ing the logs multiple times per period.
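One way to avoid rolling more than once per scan period is to remember the last roll time; a minimal sketch with hypothetical names, not the actual BlockPoolSliceScanner fix:

```java
// Hypothetical sketch: only roll the verification log when a full
// scan period has elapsed since the previous roll, so earlier
// verification times are not discarded on every invocation.
class RollLimiter {
    private final long periodMs;
    private long lastRollMs;
    private boolean rolledOnce = false;

    RollLimiter(long periodMs) { this.periodMs = periodMs; }

    // Returns true when a roll is actually performed.
    boolean maybeRoll(long nowMs) {
        if (rolledOnce && nowMs - lastRollMs < periodMs) {
            return false; // rolled too recently; keep the current log
        }
        lastRollMs = nowMs;
        rolledOnce = true;
        return true;
    }
}
```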
[jira] [Updated] (HDFS-3761) Uses of AuthenticatedURL should set the configured timeouts
[ https://issues.apache.org/jira/browse/HDFS-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-3761: Fix Version/s: (was: 2.1.0-beta) 2.3.0 Uses of AuthenticatedURL should set the configured timeouts --- Key: HDFS-3761 URL: https://issues.apache.org/jira/browse/HDFS-3761 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha, 2.0.2-alpha Reporter: Alejandro Abdelnur Priority: Critical Fix For: 2.3.0 Similar to {{URLUtils}}, the {{AuthenticatedURL}} should set the configured timeouts in URL connection instances. HADOOP-8644 is introducing a {{ConnectionConfigurator}} interface to the {{AuthenticatedURL}} class, it should be done via that interface.
[jira] [Updated] (HDFS-4198) the shell script error for Cygwin on windows7
[ https://issues.apache.org/jira/browse/HDFS-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4198: Fix Version/s: (was: 2.1.0-beta) 2.3.0 the shell script error for Cygwin on windows7 - Key: HDFS-4198 URL: https://issues.apache.org/jira/browse/HDFS-4198 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 2.0.2-alpha Environment: windows7, cygwin. Reporter: Han Hui Wen Fix For: 2.3.0 See the following [comment|https://issues.apache.org/jira/browse/HDFS-4198?focusedCommentId=13498818&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13498818] for detailed description.
[jira] [Updated] (HDFS-4066) Fix compile warnings for libwebhdfs
[ https://issues.apache.org/jira/browse/HDFS-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated HDFS-4066: Fix Version/s: (was: 2.1.0-beta) 2.3.0 Fix compile warnings for libwebhdfs --- Key: HDFS-4066 URL: https://issues.apache.org/jira/browse/HDFS-4066 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Fix For: 2.3.0 The compile of libwebhdfs still generates a bunch of warnings, which need to be fixed.