[jira] [Commented] (HDFS-7763) fix zkfc hung issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336528#comment-14336528 ] Hudson commented on HDFS-7763: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #106 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/106/]) HDFS-7763. fix zkfc hung issue due to not catching exception in a corner case. Contributed by Liang Xie. (wang: rev 7105ebaa9f370db04962a1e19a67073dc080433b) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java

fix zkfc hung issue due to not catching exception in a corner case -- Key: HDFS-7763 URL: https://issues.apache.org/jira/browse/HDFS-7763 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Fix For: 2.7.0 Attachments: HDFS-7763-001.txt, HDFS-7763-002.txt, jstack.4936

In our production cluster, both ZKFC processes hung after a ZooKeeper network outage. The zkfc log said:
{code}
2015-02-07,17:40:11,875 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 3334ms for sessionid 0x4a61bacdd9dfb2, closing socket connection and attempting reconnect
2015-02-07,17:40:11,977 FATAL org.apache.hadoop.ha.ActiveStandbyElector: Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retrying further znode monitoring connection errors.
2015-02-07,17:40:12,425 INFO org.apache.zookeeper.ZooKeeper: Session: 0x4a61bacdd9dfb2 closed
2015-02-07,17:40:12,425 FATAL org.apache.hadoop.ha.ZKFailoverController: Fatal error occurred:Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retrying further znode monitoring connection errors.
2015-02-07,17:40:12,425 INFO org.apache.hadoop.ipc.Server: Stopping server on 11300
2015-02-07,17:40:12,425 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2
(the WARN line above repeats 12 times)
2015-02-07,17:40:12,426 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.ActiveStandbyElector: Yielding from election
2015-02-07,17:40:12,426 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.HealthMonitor: Stopping HealthMonitor thread
2015-02-07,17:40:12,426 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 11300
{code}
The thread dump is also uploaded as an attachment. From the dump we can see that, because of unidentified non-daemon threads (pool-*-thread-*), the process did not exit, yet the critical threads, such as the health monitor and RPC threads, had already been stopped. As a result our watchdog (supervisord) never observed that the zkfc process was down or abnormal, and the subsequent namenode failover could not proceed as expected. There are two possible fixes here: 1) figure out where the threads with default pool names, such as pool-7-thread-1, come from, and close them or set their daemon property — I searched but have found nothing so far; 2) catch the exception from ZKFailoverController.run()
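A minimal sketch of fix option (2), reduced to the control flow that matters. The class and helper names below are illustrative, not the actual Hadoop code; the real change wraps ZKFailoverController.run() so that a fatal error always leads to process termination even when stray non-daemon pool threads would otherwise keep the JVM alive.

```java
// Hypothetical sketch of fix option (2): catch Throwable from the
// controller loop so a fatal error always yields an exit code, then
// call System.exit(), which terminates the JVM even if non-daemon
// threads (e.g. pool-7-thread-1) are still running.
public final class ZkfcExitSketch {

    // Returns 0 if the controller loop finishes normally, 1 if it
    // throws anything at all (including Errors).
    static int exitCode(Runnable controllerRun) {
        try {
            controllerRun.run();
            return 0;
        } catch (Throwable t) {
            t.printStackTrace();   // log the fatal error before exiting
            return 1;
        }
    }

    public static void main(String[] args) {
        // In the real fix, the Runnable stands in for ZKFailoverController.run().
        int rc = exitCode(() -> { /* controller loop would run here */ });
        // System.exit() runs shutdown hooks and then halts the JVM,
        // regardless of remaining non-daemon threads.
        System.exit(rc);
    }
}
```
The key design point is catching Throwable rather than Exception: the hang described above is exactly the case where an unanticipated error escapes the normal handlers, so only a top-level catch-all guarantees the process dies visibly for the watchdog.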
[jira] [Commented] (HDFS-7831) Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks()
[ https://issues.apache.org/jira/browse/HDFS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336512#comment-14336512 ] Hudson commented on HDFS-7831: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2047 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2047/]) HDFS-7831. Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks(). Contributed by Konstantin Shvachko. (jing9: rev 73bcfa99af61e5202f030510db8954c17cba43cc) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiffList.java

Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks() Key: HDFS-7831 URL: https://issues.apache.org/jira/browse/HDFS-7831 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 2.7.0 Attachments: HDFS-7831-01.patch

Currently the loop in {{FileDiffList.findEarlierSnapshotBlocks()}} starts from {{insertPoint + 1}}. It should start from {{insertPoint - 1}}. As noted in [Jing's comment|https://issues.apache.org/jira/browse/HDFS-7056?focusedCommentId=14333864&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14333864]

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
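The direction of that loop can be illustrated with a simplified model. The list shape and method below are a sketch, not the actual FileDiffList code: diffs are ordered by snapshot ID, so blocks recorded by an *earlier* snapshot live at indices below the insertion point, and the scan must walk from insertPoint - 1 down toward 0 rather than forward from insertPoint + 1.

```java
import java.util.List;

// Simplified illustration of the HDFS-7831 fix. Each list entry stands
// for the blocks (if any) recorded by one snapshot diff, ordered by
// snapshot ID. Earlier snapshots sit at lower indices, so the search
// for earlier blocks must iterate backwards from insertPoint - 1.
public final class EarlierSnapshotScan {

    // Returns the nearest non-null entry strictly before insertPoint,
    // or null if no earlier diff recorded blocks.
    static String findEarlierBlocks(List<String> diffBlocks, int insertPoint) {
        // Before the fix the loop was effectively: i = insertPoint + 1; i++
        // which scans LATER snapshots and never finds the earlier blocks.
        for (int i = insertPoint - 1; i >= 0; i--) {
            if (diffBlocks.get(i) != null) {
                return diffBlocks.get(i);
            }
        }
        return null;
    }
}
```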
[jira] [Commented] (HDFS-7831) Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks()
[ https://issues.apache.org/jira/browse/HDFS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336533#comment-14336533 ] Hudson commented on HDFS-7831: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #106 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/106/]) HDFS-7831. Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks(). Contributed by Konstantin Shvachko. (jing9: rev 73bcfa99af61e5202f030510db8954c17cba43cc) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiffList.java
[jira] [Commented] (HDFS-7008) xlator should be closed upon exit from DFSAdmin#genericRefresh()
[ https://issues.apache.org/jira/browse/HDFS-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336530#comment-14336530 ] Hudson commented on HDFS-7008: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #106 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/106/]) HDFS-7008. xlator should be closed upon exit from DFSAdmin#genericRefresh(). (ozawa) (ozawa: rev b53fd7163bc3a4eef4632afb55e5513c7c592fcf) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

xlator should be closed upon exit from DFSAdmin#genericRefresh() Key: HDFS-7008 URL: https://issues.apache.org/jira/browse/HDFS-7008 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu Assignee: Tsuyoshi Ozawa Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7008.1.patch, HDFS-7008.2.patch

{code}
GenericRefreshProtocol xlator =
    new GenericRefreshProtocolClientSideTranslatorPB(proxy);

// Refresh
Collection<RefreshResponse> responses = xlator.refresh(identifier, args);
{code}
GenericRefreshProtocolClientSideTranslatorPB#close() should be called on xlator before return.
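The fix is the usual close-in-finally pattern. The sketch below stubs the translator behind a simplified Refresher interface (a stand-in for GenericRefreshProtocolClientSideTranslatorPB, not the real Hadoop type), since only the shape of the DFSAdmin#genericRefresh() change matters here: close the translator whether or not refresh() throws.

```java
import java.util.Collection;

// Sketch of the HDFS-7008 pattern: the client-side translator wraps an
// RPC proxy, so it must be released before genericRefresh() returns,
// even on the exception path. Refresher is an illustrative stand-in.
public final class GenericRefreshSketch {

    public interface Refresher {
        Collection<String> refresh(String identifier, String[] args);
        void close();   // releases the underlying RPC proxy
    }

    static Collection<String> genericRefresh(Refresher xlator, String id, String[] args) {
        try {
            return xlator.refresh(id, args);   // may throw; close must still run
        } finally {
            xlator.close();                    // this call was missing before the fix
        }
    }
}
```
In the real code the translator implements Closeable, so a try-with-resources block would be an equally idiomatic way to express the same fix.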
[jira] [Commented] (HDFS-7763) fix zkfc hung issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336507#comment-14336507 ] Hudson commented on HDFS-7763: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2047 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2047/]) HDFS-7763. fix zkfc hung issue due to not catching exception in a corner case. Contributed by Liang Xie. (wang: rev 7105ebaa9f370db04962a1e19a67073dc080433b) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java
[jira] [Commented] (HDFS-7008) xlator should be closed upon exit from DFSAdmin#genericRefresh()
[ https://issues.apache.org/jira/browse/HDFS-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336509#comment-14336509 ] Hudson commented on HDFS-7008: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2047 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2047/]) HDFS-7008. xlator should be closed upon exit from DFSAdmin#genericRefresh(). (ozawa) (ozawa: rev b53fd7163bc3a4eef4632afb55e5513c7c592fcf) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7842) Blocks missed while performing downgrade immediately after rolling back the cluster.
[ https://issues.apache.org/jira/browse/HDFS-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336444#comment-14336444 ] Brahma Reddy Battula commented on HDFS-7842: AFAIK {{-rollingUpgrade downgrade}} is no longer available. Please check the following link for the same: https://issues.apache.org/jira/browse/HDFS-7302?focusedCommentId=14335036&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14335036 {quote} Restoring blocks from trash after downgrade can be avoided. {quote} I feel HDFS-7645 will solve this.

Blocks missed while performing downgrade immediately after rolling back the cluster. Key: HDFS-7842 URL: https://issues.apache.org/jira/browse/HDFS-7842 Project: Hadoop HDFS Issue Type: Bug Reporter: J.Andreina Assignee: J.Andreina Priority: Critical

Performing a downgrade immediately after rolling back the cluster will replace the blocks from trash. Since the block ID of a file created before rollback will be the same as that of a file created before downgrade, the namenode will get into safemode, as the block size reported from the Datanode will differ from the one in the block map (corrupt blocks).

Steps to Reproduce
{noformat}
Step 1: Prepare rolling upgrade using hdfs dfsadmin -rollingUpgrade prepare
Step 2: Shutdown SNN and NN
Step 3: Start NN with the hdfs namenode -rollingUpgrade started option.
Step 4: Execute hdfs dfsadmin -shutdownDatanode DATANODE_HOST:IPC_PORT upgrade and restart the Datanode
Step 5: Create File_1 of size 11526
Step 6: Shutdown both NN and DN
Step 7: Start NNs with the hdfs namenode -rollingUpgrade rollback option. Start DNs with the -rollback option.
Step 8: Prepare rolling upgrade using hdfs dfsadmin -rollingUpgrade prepare
Step 9: Shutdown SNN and NN
Step 10: Start NN with the hdfs namenode -rollingUpgrade started option.
Step 11: Execute hdfs dfsadmin -shutdownDatanode DATANODE_HOST:IPC_PORT upgrade and restart the Datanode
Step 12: Add File_2 with size 6324 (which has the same blockid as previously created File_1 with block size 11526)
Step 13: Shutdown both NN and DN
Step 14: Start NNs with the hdfs namenode -rollingUpgrade downgrade option. Start DNs normally.
{noformat}
[jira] [Commented] (HDFS-7831) Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks()
[ https://issues.apache.org/jira/browse/HDFS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336412#comment-14336412 ] Hudson commented on HDFS-7831: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #849 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/849/]) HDFS-7831. Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks(). Contributed by Konstantin Shvachko. (jing9: rev 73bcfa99af61e5202f030510db8954c17cba43cc) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiffList.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7008) xlator should be closed upon exit from DFSAdmin#genericRefresh()
[ https://issues.apache.org/jira/browse/HDFS-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336409#comment-14336409 ] Hudson commented on HDFS-7008: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #849 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/849/]) HDFS-7008. xlator should be closed upon exit from DFSAdmin#genericRefresh(). (ozawa) (ozawa: rev b53fd7163bc3a4eef4632afb55e5513c7c592fcf) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
[jira] [Commented] (HDFS-7763) fix zkfc hung issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336407#comment-14336407 ] Hudson commented on HDFS-7763: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #849 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/849/]) HDFS-7763. fix zkfc hung issue due to not catching exception in a corner case. Contributed by Liang Xie. (wang: rev 7105ebaa9f370db04962a1e19a67073dc080433b) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java
[jira] [Commented] (HDFS-7831) Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks()
[ https://issues.apache.org/jira/browse/HDFS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336594#comment-14336594 ] Hudson commented on HDFS-7831: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #115 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/115/]) HDFS-7831. Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks(). Contributed by Konstantin Shvachko. (jing9: rev 73bcfa99af61e5202f030510db8954c17cba43cc) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiffList.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7763) fix zkfc hung issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336589#comment-14336589 ] Hudson commented on HDFS-7763: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #115 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/115/]) HDFS-7763. fix zkfc hung issue due to not catching exception in a corner case. Contributed by Liang Xie. (wang: rev 7105ebaa9f370db04962a1e19a67073dc080433b) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-7008) xlator should be closed upon exit from DFSAdmin#genericRefresh()
[ https://issues.apache.org/jira/browse/HDFS-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336591#comment-14336591 ] Hudson commented on HDFS-7008: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #115 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/115/]) HDFS-7008. xlator should be closed upon exit from DFSAdmin#genericRefresh(). (ozawa) (ozawa: rev b53fd7163bc3a4eef4632afb55e5513c7c592fcf) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
[jira] [Created] (HDFS-7843) A truncated file is corrupted after rollback from a rolling upgrade
Tsz Wo Nicholas Sze created HDFS-7843: - Summary: A truncated file is corrupted after rollback from a rolling upgrade Key: HDFS-7843 URL: https://issues.apache.org/jira/browse/HDFS-7843 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Blocker Here is a rolling upgrade truncate test from [~brandonli]. The basic test steps are (3-node cluster with HA): 1. upload a file to HDFS 2. start rolling upgrade; finish rolling upgrade for the namenode and one datanode 3. truncate the file in HDFS to 1 byte 4. do rollback 5. download the file from HDFS and check that the file size is the original size. I see the file size in HDFS is correct but can't read it because the block is corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7843) A truncated file is corrupted after rollback from a rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7843: -- Attachment: h7843_20150226.patch h7843_20150226.patch: use copy-on-truncate for rolling upgrade. A truncated file is corrupted after rollback from a rolling upgrade --- Key: HDFS-7843 URL: https://issues.apache.org/jira/browse/HDFS-7843 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Blocker Fix For: 2.7.0 Attachments: h7843_20150226.patch Here is a rolling upgrade truncate test from [~brandonli]. The basic test step is: (3 nodes cluster with HA) 1. upload a file to hdfs 2. start rollingupgrade. finish rollingupgrade for namenode and one datanode. 3. truncate the file in hdfs to 1byte 4. do rollback 5. download file from hdfs, check file size to be original size I see the file size in hdfs is correct but can't read it because the block is corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7763) fix zkfc hung issue due to not catching exception in a corner case
[ https://issues.apache.org/jira/browse/HDFS-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336615#comment-14336615 ] Hudson commented on HDFS-7763: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2065 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2065/]) HDFS-7763. fix zkfc hung issue due to not catching exception in a corner case. Contributed by Liang Xie. (wang: rev 7105ebaa9f370db04962a1e19a67073dc080433b) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt fix zkfc hung issue due to not catching exception in a corner case -- Key: HDFS-7763 URL: https://issues.apache.org/jira/browse/HDFS-7763 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.6.0 Reporter: Liang Xie Assignee: Liang Xie Fix For: 2.7.0 Attachments: HDFS-7763-001.txt, HDFS-7763-002.txt, jstack.4936 In our production cluster, we hit an issue where both zkfc processes hung after a ZK network outage. The zkfc log said: {code} 2015-02-07,17:40:11,875 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 3334ms for sessionid 0x4a61bacdd9dfb2, closing socket connection and attempting reconnect 2015-02-07,17:40:11,977 FATAL org.apache.hadoop.ha.ActiveStandbyElector: Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retrying further znode monitoring connection errors. 2015-02-07,17:40:12,425 INFO org.apache.zookeeper.ZooKeeper: Session: 0x4a61bacdd9dfb2 closed 2015-02-07,17:40:12,425 FATAL org.apache.hadoop.ha.ZKFailoverController: Fatal error occurred:Received stat error from Zookeeper. code:CONNECTIONLOSS. Not retrying further znode monitoring connection errors. 
2015-02-07,17:40:12,425 INFO org.apache.hadoop.ipc.Server: Stopping server on 11300 2015-02-07,17:40:12,425 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 WARN org.apache.hadoop.ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x4a61bacdd9dfb2 2015-02-07,17:40:12,426 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.ActiveStandbyElector: Yielding from election 2015-02-07,17:40:12,426 INFO 
org.apache.hadoop.ipc.Server: Stopping IPC Server Responder 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ha.HealthMonitor: Stopping HealthMonitor thread 2015-02-07,17:40:12,426 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 11300 {code} The thread dump has also been uploaded as an attachment. From the dump, we can see that because of the unknown non-daemon threads (pool-*-thread-*) the process did not exit, but the critical threads, like the health monitor and RPC threads, had been stopped, so our watchdog (supervisord) did not observe that the zkfc process was down or abnormal, and the subsequent namenode failover could not be done as expected. There are two possible fixes here: 1) figure out where the threads with default names, like pool-7-thread-1, came from, and close them or set their daemon property; I tried to search but got nothing so far. 2) catch the exception from ZKFailoverController.run()
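Fix option 1 above hinges on daemon threads: the JVM exits when only daemon threads remain, so a thread pool built with a daemon ThreadFactory cannot keep a dead zkfc process alive the way the pool-*-thread-* threads in the jstack did. A minimal sketch of that option (not the committed patch, which took option 2):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class DaemonPoolDemo {
    // Marks every pool thread as a daemon so stray pool-N-thread-M threads
    // cannot keep the JVM alive once the critical threads have stopped.
    static final ThreadFactory DAEMON_FACTORY = r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    };

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1, DAEMON_FACTORY);
        pool.execute(() -> {
            while (true) {  // long-lived background work that never finishes
                try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
            }
        });
        // With daemon workers the JVM exits when main returns; with the default
        // (non-daemon) factory this program would hang, as in the attached jstack.
        System.out.println("main exiting");
    }
}
```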
[jira] [Commented] (HDFS-7008) xlator should be closed upon exit from DFSAdmin#genericRefresh()
[ https://issues.apache.org/jira/browse/HDFS-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336617#comment-14336617 ] Hudson commented on HDFS-7008: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2065 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2065/]) HDFS-7008. xlator should be closed upon exit from DFSAdmin#genericRefresh(). (ozawa) (ozawa: rev b53fd7163bc3a4eef4632afb55e5513c7c592fcf) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java xlator should be closed upon exit from DFSAdmin#genericRefresh() Key: HDFS-7008 URL: https://issues.apache.org/jira/browse/HDFS-7008 Project: Hadoop HDFS Issue Type: Bug Reporter: Ted Yu Assignee: Tsuyoshi Ozawa Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7008.1.patch, HDFS-7008.2.patch {code} GenericRefreshProtocol xlator = new GenericRefreshProtocolClientSideTranslatorPB(proxy); // Refresh Collection<RefreshResponse> responses = xlator.refresh(identifier, args); {code} GenericRefreshProtocolClientSideTranslatorPB#close() should be called on xlator before return. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7831) Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks()
[ https://issues.apache.org/jira/browse/HDFS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336620#comment-14336620 ] Hudson commented on HDFS-7831: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2065 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2065/]) HDFS-7831. Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks(). Contributed by Konstantin Shvachko. (jing9: rev 73bcfa99af61e5202f030510db8954c17cba43cc) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/FileDiffList.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Fix the starting index and end condition of the loop in FileDiffList.findEarlierSnapshotBlocks() Key: HDFS-7831 URL: https://issues.apache.org/jira/browse/HDFS-7831 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 2.7.0 Attachments: HDFS-7831-01.patch Currently the loop in {{FileDiffList.findEarlierSnapshotBlocks()}} starts from {{insertPoint + 1}}. It should start from {{insertPoint - 1}}. As noted in [Jing's comment|https://issues.apache.org/jira/browse/HDFS-7056?focusedCommentId=14333864page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14333864] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
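The off-by-two described above follows from the binary-search convention: for a missing key, Arrays.binarySearch returns -(insertionPoint) - 1, and every index at or after the insertion point holds an id >= the key, so a scan for an *earlier* snapshot must walk down from insertPoint - 1. A simplified sketch (hypothetical method and parameter names, standing in for FileDiffList.findEarlierSnapshotBlocks()):

```java
import java.util.Arrays;

public class FindEarlierDemo {
    // Snapshot ids are sorted ascending; return the index of the latest diff
    // strictly before snapshotId that actually carries blocks, or -1.
    static int findEarlierWithBlocks(int[] snapshotIds, boolean[] hasBlocks, int snapshotId) {
        int i = Arrays.binarySearch(snapshotIds, snapshotId);
        int insertPoint = (i >= 0) ? i : -i - 1;  // -(insertionPoint) - 1 on a miss
        // Starting at insertPoint + 1 (the bug) can only ever visit ids
        // >= snapshotId; the corrected scan starts at insertPoint - 1.
        for (int j = insertPoint - 1; j >= 0; j--) {
            if (hasBlocks[j]) {
                return j;
            }
        }
        return -1;  // no earlier diff with blocks
    }

    public static void main(String[] args) {
        int[] ids = {2, 5, 9};
        boolean[] blocks = {false, true, true};
        System.out.println(findEarlierWithBlocks(ids, blocks, 9));  // 1
        System.out.println(findEarlierWithBlocks(ids, blocks, 2));  // -1
    }
}
```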
[jira] [Updated] (HDFS-7843) A truncated file is corrupted after rollback from a rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7843: -- Status: Patch Available (was: Open) A truncated file is corrupted after rollback from a rolling upgrade --- Key: HDFS-7843 URL: https://issues.apache.org/jira/browse/HDFS-7843 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Blocker Fix For: 2.7.0 Attachments: h7843_20150226.patch Here is a rolling upgrade truncate test from [~brandonli]. The basic test step is: (3 nodes cluster with HA) 1. upload a file to hdfs 2. start rollingupgrade. finish rollingupgrade for namenode and one datanode. 3. truncate the file in hdfs to 1byte 4. do rollback 5. download file from hdfs, check file size to be original size I see the file size in hdfs is correct but can't read it because the block is corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7843) A truncated file is corrupted after rollback from a rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7843: -- Component/s: (was: datanode) Fix Version/s: (was: 2.7.0) A truncated file is corrupted after rollback from a rolling upgrade --- Key: HDFS-7843 URL: https://issues.apache.org/jira/browse/HDFS-7843 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Blocker Attachments: h7843_20150226.patch Here is a rolling upgrade truncate test from [~brandonli]. The basic test step is: (3 nodes cluster with HA) 1. upload a file to hdfs 2. start rollingupgrade. finish rollingupgrade for namenode and one datanode. 3. truncate the file in hdfs to 1byte 4. do rollback 5. download file from hdfs, check file size to be original size I see the file size in hdfs is correct but can't read it because the block is corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with +
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336700#comment-14336700 ] Kihwal Lee commented on HDFS-7816: -- [~wheat9], are you okay with restoring the compatibility here and address the standard-compliance in HDFS-7822? Unable to open webhdfs paths with + - Key: HDFS-7816 URL: https://issues.apache.org/jira/browse/HDFS-7816 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.7.0 Reporter: Jason Lowe Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-7816.patch, HDFS-7816.patch webhdfs requests to open files with % characters in the filename fail because the filename is not being decoded properly. For example: $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7729) Add logic to DFSOutputStream to support writing a file in striping layout
[ https://issues.apache.org/jira/browse/HDFS-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7729: Attachment: HDFS-7729-009.patch I've updated the patch slightly to fix Jenkins test failures. I think the patch works with all existing unit tests now. Will wait for another Jenkins run. [~jingzhao] Given that Bo is still in holiday and the DataStreamer refactor in trunk will take some more time, do you think we can commit this functional patch first? Since we are close to finish a minimum NN prototype, this client patch can help us do some preliminary end-to-end tests. When HDFS-7793 is in we can do a follow-on JIRA. Add logic to DFSOutputStream to support writing a file in striping layout -- Key: HDFS-7729 URL: https://issues.apache.org/jira/browse/HDFS-7729 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: Codec-tmp.patch, HDFS-7729-001.patch, HDFS-7729-002.patch, HDFS-7729-003.patch, HDFS-7729-004.patch, HDFS-7729-005.patch, HDFS-7729-006.patch, HDFS-7729-007.patch, HDFS-7729-008.patch, HDFS-7729-009.patch If client wants to directly write a file striping layout, we need to add some logic to DFSOutputStream. DFSOutputStream needs multiple DataStreamers to write each cell of a stripe to a remote datanode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with +
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336890#comment-14336890 ] Haohui Mai commented on HDFS-7816: -- bq. maintaining compatibility takes precedence over being standard compliant, because things stop working and business are impacted. IMO this is a bug of implementing the spec instead of a feature. One thing to note is that the scope of the compatibility of WebHDFS is much larger. There are many other investments that rely on the fact that the WebHDFS server is compliant with RFC 3986. Just take a few examples: * Microsoft .NET SDK For Hadoop (http://hadoopsdk.codeplex.com/wikipage?title=WebHDFS%20Client) * Python (https://pypi.python.org/pypi/WebHDFS/0.2.0) * Proxy-based solutions, like Knox and HttpFS * Reverse proxy standing in front of WebHDFS This bug prevents all of the above libraries working properly. I think that in this case we also need to consider the use cases of the above libraries as well. Unable to open webhdfs paths with + - Key: HDFS-7816 URL: https://issues.apache.org/jira/browse/HDFS-7816 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.7.0 Reporter: Jason Lowe Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-7816.patch, HDFS-7816.patch webhdfs requests to open files with % characters in the filename fail because the filename is not being decoded properly. For example: $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7843) A truncated file is corrupted after rollback from a rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337067#comment-14337067 ] Hadoop QA commented on HDFS-7843: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700791/h7843_20150226.patch against trunk revision 1a68fc4. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.namenode.TestFileTruncate org.apache.hadoop.hdfs.qjournal.TestNNWithQJM Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9664//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9664//console This message is automatically generated. A truncated file is corrupted after rollback from a rolling upgrade --- Key: HDFS-7843 URL: https://issues.apache.org/jira/browse/HDFS-7843 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Blocker Attachments: h7843_20150226.patch Here is a rolling upgrade truncate test from [~brandonli]. The basic test step is: (3 nodes cluster with HA) 1. upload a file to hdfs 2. start rollingupgrade. 
finish rollingupgrade for namenode and one datanode. 3. truncate the file in hdfs to 1byte 4. do rollback 5. download file from hdfs, check file size to be original size I see the file size in hdfs is correct but can't read it because the block is corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7749) Erasure Coding: Add striped block support in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337074#comment-14337074 ] Zhe Zhang commented on HDFS-7749: - Thanks Jing for the solid work! The overall structure is very clear and the main logic looks good. More detailed comments: # As a quick note for our future refactor to abstract out a common {{BlockInfoUnderConstruction}} class: the following added methods in {{BlockInfo}} can be simplified: #* {{convertToCompleteBlock}} #* {{commitBlock}} #* {{addReplica}} #* {{getNumExpectedLocations}} #* {{getExpectedStorageLocations}} #* {{setExpectedLocations}} #* {{getBlockRecoveryId}} #* Seems {{convertToBlockUnderConstruction}} could be unified too # Should {{BlockInfoStripedUnderConstruction#replicas}} represent a sorted list of blocks in the group? If so seems we should update {{addReplicaIfNotPresent}} to maintain the order while handling stale entries. # This constructor could use some Javadoc as well: {code} public BlockInfoStripedUnderConstruction(Block blk, short dataBlockNum, short parityBlockNum) { this(blk, dataBlockNum, parityBlockNum, UNDER_CONSTRUCTION, null); } {code} # Javadoc of {{convertToCompleteBlock}} method in both UC classes could be updated with the correct return type. But they might be merged later anyway.. # Minor log message error in {{BlockInfoContiguousUnderConstruction}} {code} if (replicas.size() == 0) { NameNode.blockStateChangeLog.warn("BLOCK* " + "BlockInfoUnderConstruction.initLeaseRecovery: " + "No blocks found, lease removed."); } {code} # This will probably also be addressed when we abstract out the common {{BlockInfoUnderConstruction}} class: should methods like {{commitBlock}} be static? # {{BlockManager#addStoredBlock}}: should we still check the block belongs to a file? {code} -assert bc != null : "Block must belong to a file"; {code} I'm still working on the patch after {{FSEditLog}}. Will finish the review soon. 
Erasure Coding: Add striped block support in INodeFile -- Key: HDFS-7749 URL: https://issues.apache.org/jira/browse/HDFS-7749 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-7749.000.patch This jira plan to add a new INodeFile feature to store the stripped blocks information in case that the INodeFile is erasure coded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7729) Add logic to DFSOutputStream to support writing a file in striping layout
[ https://issues.apache.org/jira/browse/HDFS-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336932#comment-14336932 ] Jing Zhao commented on HDFS-7729: - I think we'd better finish HDFS-7793 first. I will also spend more time this week on that jira if Bo is still on vacation. For testing NN prototype, before we write end-to-end tests, we'd better first add more tests with synthetic block reports and client requests. Add logic to DFSOutputStream to support writing a file in striping layout -- Key: HDFS-7729 URL: https://issues.apache.org/jira/browse/HDFS-7729 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: Codec-tmp.patch, HDFS-7729-001.patch, HDFS-7729-002.patch, HDFS-7729-003.patch, HDFS-7729-004.patch, HDFS-7729-005.patch, HDFS-7729-006.patch, HDFS-7729-007.patch, HDFS-7729-008.patch, HDFS-7729-009.patch If client wants to directly write a file striping layout, we need to add some logic to DFSOutputStream. DFSOutputStream needs multiple DataStreamers to write each cell of a stripe to a remote datanode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with +
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336954#comment-14336954 ] Haohui Mai commented on HDFS-7816: -- Taking a closer look at the RFC -- the syntax of HTTP URLs is specified in RFC 1738. {code} ; HTTP httpurl = "http://" hostport [ "/" hpath [ "?" search ]] hpath = hsegment *[ "/" hsegment ] hsegment = *[ uchar | ";" | ":" | "@" | "&" | "=" ] search = *[ uchar | ";" | ":" | "@" | "&" | "=" ] safe = "$" | "-" | "_" | "." | "+" extra = "!" | "*" | "'" | "(" | ")" | "," reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f" escape = "%" hex hex unreserved = alpha | digit | safe | extra uchar = unreserved | escape {code} So it looks like it is okay to not encode +. However, the current webhdfs client is still broken as it does not encode the '%' character: {code} final URL url = new URL(getTransportScheme(), nnAddr.getHostName(), nnAddr.getPort(), path + '?' + query); scala> new java.net.URL("http", "localhost", 80, "/+%asdlkf") res3: java.net.URL = http://localhost:80/+%asdlkf {code} So it looks like we need to fix the client anyway. Unable to open webhdfs paths with + - Key: HDFS-7816 URL: https://issues.apache.org/jira/browse/HDFS-7816 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.7.0 Reporter: Jason Lowe Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-7816.patch, HDFS-7816.patch webhdfs requests to open files with % characters in the filename fail because the filename is not being decoded properly. For example: $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
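The client-side fix the comment above points toward -- building the request URL from components that are percent-encoded per RFC rules -- can be illustrated with java.net.URI, whose multi-argument constructors quote illegal characters per component (unlike raw java.net.URL). The helper name below is hypothetical, not the actual webhdfs client code:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathEncodeDemo {
    // URI's multi-argument constructors percent-encode illegal characters in
    // each component: '%' in a path is escaped to %25, while '+' is a legal
    // path character under the RFC grammar and passes through unchanged.
    static String encodePath(String host, int port, String path) {
        try {
            return new URI("http", null, host, port, path, null, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // '%' gets escaped, '+' does not -- the distinction at issue here.
        System.out.println(encodePath("localhost", 80, "/abc%def"));
        System.out.println(encodePath("localhost", 80, "/a+b"));
    }
}
```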
[jira] [Commented] (HDFS-7729) Add logic to DFSOutputStream to support writing a file in striping layout
[ https://issues.apache.org/jira/browse/HDFS-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336962#comment-14336962 ] Zhe Zhang commented on HDFS-7729: - OK, adding more synthetic tests sounds a good idea to me -- we need them in the final patch anyway. Let's keep this JIRA open then, and refactor the patch after HDFS-7793. Add logic to DFSOutputStream to support writing a file in striping layout -- Key: HDFS-7729 URL: https://issues.apache.org/jira/browse/HDFS-7729 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Li Bo Assignee: Li Bo Attachments: Codec-tmp.patch, HDFS-7729-001.patch, HDFS-7729-002.patch, HDFS-7729-003.patch, HDFS-7729-004.patch, HDFS-7729-005.patch, HDFS-7729-006.patch, HDFS-7729-007.patch, HDFS-7729-008.patch, HDFS-7729-009.patch If client wants to directly write a file striping layout, we need to add some logic to DFSOutputStream. DFSOutputStream needs multiple DataStreamers to write each cell of a stripe to a remote datanode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with +
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336977#comment-14336977 ] Kihwal Lee commented on HDFS-7816: -- bq. So it looks like that we need to fix the client anyway. I am not disputing that. I just want it to be done in a way that does not affect existing users in HDFS-7822. Unable to open webhdfs paths with + - Key: HDFS-7816 URL: https://issues.apache.org/jira/browse/HDFS-7816 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.7.0 Reporter: Jason Lowe Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-7816.patch, HDFS-7816.patch webhdfs requests to open files with % characters in the filename fail because the filename is not being decoded properly. For example: $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with +
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337001#comment-14337001 ] Daryn Sharp commented on HDFS-7816: --- Use the force, I mean Javadoc, Luke! It needs to be fixed, but as Kihwal says, we must maintain compatibility. {code} The URL class does not itself encode or decode any URL components according to the escaping mechanism defined in RFC2396. It is the responsibility of the caller to encode any fields, which need to be escaped prior to calling URL, and also to decode any escaped fields, that are returned from URL. Furthermore, because URL has no knowledge of URL escaping, it does not recognise equivalence between the encoded or decoded form of the same URL. For example, the two URLs: http://foo.com/hello world/ and http://foo.com/hello%20world would be considered not equal to each other. Note, the java.net.URI class does perform escaping of its component fields in certain circumstances. The recommended way to manage the encoding and decoding of URLs is to use java.net.URI, and to convert between these two classes using toURI() and URI.toURL(). {code} Unable to open webhdfs paths with + - Key: HDFS-7816 URL: https://issues.apache.org/jira/browse/HDFS-7816 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.7.0 Reporter: Jason Lowe Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-7816.patch, HDFS-7816.patch webhdfs requests to open files with % characters in the filename fail because the filename is not being decoded properly. For example: $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7816) Unable to open webhdfs paths with +
[ https://issues.apache.org/jira/browse/HDFS-7816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336910#comment-14336910 ] Kihwal Lee commented on HDFS-7816: -- The server will still decode percent-encoded + and other special characters. So it won't break clients that encode according to the standard. Unable to open webhdfs paths with + - Key: HDFS-7816 URL: https://issues.apache.org/jira/browse/HDFS-7816 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.7.0 Reporter: Jason Lowe Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-7816.patch, HDFS-7816.patch webhdfs requests to open files with % characters in the filename fail because the filename is not being decoded properly. For example: $ hadoop fs -cat 'webhdfs://nn/user/somebody/abc%def' cat: File does not exist: /user/somebody/abc%25def -- This message was sent by Atlassian JIRA (v6.3.4#6332)
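The server-side behavior the comment above describes -- a percent-encoded %2B and a literal '+' both resolving to the same path -- matches standard URI path decoding, which can be illustrated with java.net.URI (this is an illustration of the decoding rule, not the actual webhdfs server code):

```java
import java.net.URI;

public class PathDecodeDemo {
    // URI#getPath() decodes escaped octets in the path, so %2B comes back as
    // '+', and a literal '+' is left alone -- unlike form decoding
    // (URLDecoder.decode), which would turn a bare '+' into a space.
    static String decodePath(String url) {
        return URI.create(url).getPath();
    }

    public static void main(String[] args) {
        System.out.println(decodePath("http://nn/user/somebody/a%2Bb"));  // /user/somebody/a+b
        System.out.println(decodePath("http://nn/user/somebody/a+b"));    // /user/somebody/a+b
    }
}
```

Both spellings decode to the same path, which is why decoding percent-encoded '+' on the server does not break standards-compliant clients.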
[jira] [Updated] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Smith updated HDFS-7460: - Status: Patch Available (was: Open) Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Attachments: HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337177#comment-14337177 ] Hadoop QA commented on HDFS-7460: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700855/HDFS-7460.patch against trunk revision 5731c0e. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9665//console This message is automatically generated. Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Attachments: HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Smith updated HDFS-7460: - Attachment: HDFS-7460.patch Based upon the code for kms. Please review. Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Attachments: HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Smith reassigned HDFS-7460: Assignee: John Smith Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7774) Unresolved symbols error while compiling HDFS on Windows 7/32 bit
[ https://issues.apache.org/jira/browse/HDFS-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7774: Labels: build native (was: build) Unresolved symbols error while compiling HDFS on Windows 7/32 bit - Key: HDFS-7774 URL: https://issues.apache.org/jira/browse/HDFS-7774 Project: Hadoop HDFS Issue Type: Bug Components: build, native Affects Versions: 2.6.0 Environment: Windows 7, 32 bit, Visual Studio 10. Windows PATH: PATH=C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;D:\PIG\pig-0.13.0\bin;C:\PROGRA~1\JAVA\JDK1.7.0_71\bin;C:\Program Files\Microsoft Windows Performance Toolkit\;C:\GNUWIN32\GETGNUWIN32\BIN;C:\CYGWIN\BIN;D:\git\cmd;D:\GIT\BIN;D:\MAVEN-3-2-3\APACHE-MAVEN-3.2.3-BIN\apache-maven-3.2.3\bin;D:\UTILS;c:\windows\Microsoft.NET\Framework\v4.0.30319;D:\cmake\bin;c:\progra~1\Micros~1.0\vc\crt\src; SDK Path: PATH=C:\Windows\Microsoft.NET\Framework\v4.0.30319;C:\Windows\Microsoft.NET\Framework\v3.5;;C:\Program Files\Microsoft Visual Studio 10.0\Common7\IDE;C:\Program Files\Microsoft Visual Studio 10.0\Common7\Tools;;C:\Program Files\Microsoft Visual Studio 10.0\VC\Bin;C:\Program Files\Microsoft Visual Studio 10.0\VC\Bin\VCPackages;;C:\Program Files\Microsoft SDKs\Windows\v7.1\Bin\NETFX 4.0 Tools;C:\Program Files\Microsoft SDKs\Windows\v7.1\Bin;;C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;D:\PIG\pig-0.13.0\bin;C:\PROGRA~1\JAVA\JDK1.7.0_71\bin;C:\Program Files\Microsoft Windows Performance Toolkit\;C:\GNUWIN32\GETGNUWIN32\BIN;C:\CYGWIN\BIN;D:\git\cmd;D:\GIT\BIN;D:\MAVEN-3-2-3\APACHE-MAVEN-3.2.3-BIN\apache-maven-3.2.3\bin;D:\UTILS;c:\windows\Microsoft.NET\Framework\v4.0.30319;D:\cmake\bin;c:\progra~1\Micros~1.0\vc\crt\src; Reporter: Venkatasubramaniam Ramakrishnan Assignee: Kiran Kumar M R Priority: Critical Labels: build Attachments: 
HDFS-7774-001.patch, Win32_Changes-temp.patch I am getting the following error in the hdfs module compilation: . . . [exec] ClCompile: [exec] All outputs are up-to-date. [exec] Lib: [exec] All outputs are up-to-date. [exec] hdfs_static.vcxproj - D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\target\bin\RelWithDebInfo\hdfs.lib [exec] FinalizeBuildStatus: [exec] Deleting file hdfs_static.dir\RelWithDebInfo\hdfs_static.unsuccessfulbuild. [exec] Touching hdfs_static.dir\RelWithDebInfo\hdfs_static.lastbuildstate. [exec] Done Building Project D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\hdfs_static.vcxproj (default targets). [exec] Done Building Project D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\ALL_BUILD.vcxproj (default targets) -- FAILED. [exec] [exec] Build FAILED. [exec] [exec] D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\ALL_BUILD.vcxproj (default target) (1) - [exec] D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\hdfs.vcxproj (default target) (3) - [exec] (Link target) - [exec] thread_local_storage.obj : error LNK2001: unresolved external symbol _tls_used [D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\hdfs.vcxproj] [exec] thread_local_storage.obj : error LNK2001: unresolved external symbol pTlsCallback [D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\hdfs.vcxproj] [exec] D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\target\bin\RelWithDebInfo\hdfs.dll : fatal error LNK1120: 2 unresolved externals [D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\hdfs.vcxproj] [exec] [exec] 0 Warning(s) [exec] 3 Error(s) [exec] [exec] Time Elapsed 00:00:40.39 [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Hadoop HDFS . FAILURE [02:27 min] [INFO] Apache Hadoop HttpFS ... SKIPPED -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7819) Log WARN message for the blocks which are not in Block ID based layout
[ https://issues.apache.org/jira/browse/HDFS-7819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337233#comment-14337233 ] Colin Patrick McCabe commented on HDFS-7819: +1. Thanks, [~rakeshr]. Log WARN message for the blocks which are not in Block ID based layout -- Key: HDFS-7819 URL: https://issues.apache.org/jira/browse/HDFS-7819 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Rakesh R Assignee: Rakesh R Attachments: HDFS-7819-1.patch, HDFS-7819-2.patch As per the discussion in HDFS-7648, it was decided that *DirectoryScanner* should identify the blocks which are not in the expected directory path computed from their *Block ID*, and just log a WARN message. For more background, please refer to the [discussion thread|https://issues.apache.org/jira/browse/HDFS-7648?focusedCommentId=14292443&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14292443] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7495) Remove updatePosition argument from DFSInputStream#getBlockAt()
[ https://issues.apache.org/jira/browse/HDFS-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337269#comment-14337269 ] Hudson commented on HDFS-7495: -- FAILURE: Integrated in Hadoop-trunk-Commit #7200 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7200/]) HDFS-7495. Remove updatePosition argument from DFSInputStream#getBlockAt() (cmccabe) (cmccabe: rev caa42adf208bfb5625d1b3ef665fbf334ffcccd9) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java Remove updatePosition argument from DFSInputStream#getBlockAt() --- Key: HDFS-7495 URL: https://issues.apache.org/jira/browse/HDFS-7495 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ted Yu Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7495.002.patch, hdfs-7495-001.patch There are two locks: one on DFSInputStream.this and one on DFSInputStream.infoLock. Normally the lock on DFSInputStream.this is obtained first, then the lock on infoLock. However, this order is not observed in DFSInputStream#getBlockAt(): {code} synchronized(infoLock) { ... if (updatePosition) { // synchronized not strictly needed, since we only get here // from synchronized caller methods synchronized(this) { {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
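For readers following along, the hazard described in HDFS-7495 is a classic lock-ordering inversion. A minimal, hypothetical Java sketch (the class and field names below are invented stand-ins, not the actual DFSInputStream code):

```java
// Hypothetical stand-in for the two-lock pattern discussed above; not HDFS code.
public class LockOrdering {
    private final Object outer = new Object();    // plays the role of DFSInputStream.this
    private final Object infoLock = new Object(); // plays the role of infoLock
    private long position;

    // Follows the documented order: outer lock first, then infoLock.
    public void setPosition(long pos) {
        synchronized (outer) {
            synchronized (infoLock) {
                position = pos;
            }
        }
    }

    // Takes the locks in the opposite order. Run concurrently with
    // setPosition(), this pattern can deadlock: each thread holds one lock
    // and waits forever for the other. getBlockAt() avoided the deadlock
    // only because its callers already held the outer lock, which is why
    // removing the updatePosition branch simplifies the reasoning.
    public long getPositionReversed() {
        synchronized (infoLock) {
            synchronized (outer) {
                return position;
            }
        }
    }
}
```

Single-threaded use is always safe; the danger appears only when two threads interleave the two acquisition orders.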
[jira] [Commented] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337216#comment-14337216 ] Allen Wittenauer commented on HDFS-7460: While you are fixing the patch :), any chance you can include HDFS-7794? Thanks! Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Attachments: HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7774) Unresolved symbols error while compiling HDFS on Windows 7/32 bit
[ https://issues.apache.org/jira/browse/HDFS-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7774: Labels: build (was: build native) Unresolved symbols error while compiling HDFS on Windows 7/32 bit - Key: HDFS-7774 URL: https://issues.apache.org/jira/browse/HDFS-7774 Project: Hadoop HDFS Issue Type: Bug Components: build, native Affects Versions: 2.6.0 Reporter: Venkatasubramaniam Ramakrishnan Assignee: Kiran Kumar M R Priority: Critical -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7774) Unresolved symbols error while compiling HDFS on Windows 7/32 bit
[ https://issues.apache.org/jira/browse/HDFS-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-7774: Component/s: native Unresolved symbols error while compiling HDFS on Windows 7/32 bit - Key: HDFS-7774 URL: https://issues.apache.org/jira/browse/HDFS-7774 Project: Hadoop HDFS Issue Type: Bug Components: build, native Affects Versions: 2.6.0 Reporter: Venkatasubramaniam Ramakrishnan Assignee: Kiran Kumar M R Priority: Critical -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7774) Unresolved symbols error while compiling HDFS on Windows 7/32 bit
[ https://issues.apache.org/jira/browse/HDFS-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337225#comment-14337225 ] Chris Nauroth commented on HDFS-7774: - Kiran, thank you for this patch. It looks pretty good. For parameterizing the CMake generator, I recommend using an ant condition to evaluate a generator property based on the value of the {{PLATFORM}} environment variable. Then, that generator property can be used in place of the hard-coded string in the cmake call. Something like this ought to work:
{code}
<condition property="generator" value="Visual Studio 10" else="Visual Studio 10 Win64">
  <equals arg1="Win32" arg2="${env.PLATFORM}"/>
</condition>
<mkdir dir="${project.build.directory}/native/"/>
<exec executable="cmake" dir="${project.build.directory}/native" failonerror="true">
  <arg line="${basedir}/src/ -DGENERATED_JAVAH=${project.build.directory}/native/javah -DJVM_ARCH_DATA_MODEL=${sun.arch.data.model} -DREQUIRE_LIBWEBHDFS=${require.libwebhdfs} -DREQUIRE_FUSE=${require.fuse} -G '${generator}'"/>
</exec>
{code}
Are the libhdfs tests passing for you with a 32-bit build?
I consistently get {{OutOfMemoryError}} while trying to create a thread: {code} [exec] nmdCreate: Builder#build error: [exec] java.lang.OutOfMemoryError: unable to create new native thread [exec] at java.lang.Thread.start0(Native Method) [exec] at java.lang.Thread.start(Thread.java:714) [exec] at io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:557) [exec] at io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:146) [exec] at io.netty.util.concurrent.AbstractEventExecutorGroup.shutdownGracefully(AbstractEventExecutorGroup.java:69) [exec] at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.close(DatanodeHttpServer.java:165) [exec] at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1632) [exec] at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1740) [exec] at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1715) [exec] at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1699) [exec] at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:836) [exec] at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:466) [exec] at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:425) [exec] TEST_ERROR: failed on ..\..\src\main\native\libhdfs\test_libhdfs_threaded.c:330 (errno: 12): got NULL from tlhCluster {code} Unresolved symbols error while compiling HDFS on Windows 7/32 bit - Key: HDFS-7774 URL: https://issues.apache.org/jira/browse/HDFS-7774 Project: Hadoop HDFS Issue Type: Bug Components: build, native Affects Versions: 2.6.0 Environment: Windows 7, 32 bit, Visual Studio 10. 
Reporter: Venkatasubramaniam Ramakrishnan Assignee: Kiran Kumar M R Priority: Critical Labels: build Attachments:
[jira] [Commented] (HDFS-7817) libhdfs3: fix strerror_r detection
[ https://issues.apache.org/jira/browse/HDFS-7817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337235#comment-14337235 ] Colin Patrick McCabe commented on HDFS-7817: That's one approach. Another approach is to force a standards-compliant strerror_r to be used by including special #ifdefs at the top of the file using strerror_r. Since sys_errlist is deprecated, that might be the better way. libhdfs3: fix strerror_r detection -- Key: HDFS-7817 URL: https://issues.apache.org/jira/browse/HDFS-7817 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Colin Patrick McCabe The signature of strerror_r is not quite detected correctly in libhdfs3. The code assumes that {{int foo = strerror_r}} will fail to compile with the GNU type signature, but this is not the case (C\+\+ will coerce the char* to an int in this case). Instead, we should do what the libhdfs {{terror}} (threaded error) function does here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7749) Erasure Coding: Add striped block support in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337241#comment-14337241 ] Jing Zhao commented on HDFS-7749: - Thanks for the review, Zhe! Yeah, we can do further code simplification for BlockInfoUC. I plan to do it in a separate jira. bq. Should BlockInfoStripedUnderConstruction#replicas represent a sorted list of blocks in the group? If so seems we should update addReplicaIfNotPresent to maintain the order while handling stale entries. Yes, this is necessary when NameNode restarts before completing the block. Currently I'm working on another patch which includes this part already. I will create a jira and post the patch there later. bq. BlockManager#addStoredBlock: should we still check the block belongs to a file? This actually has been checked by this code: {code} if (storedBlock == null || storedBlock.getBlockCollection() == null) { {code} I will update the patch after your further comments. Thanks again for the review! Erasure Coding: Add striped block support in INodeFile -- Key: HDFS-7749 URL: https://issues.apache.org/jira/browse/HDFS-7749 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-7749.000.patch This jira plans to add a new INodeFile feature to store the striped block information when the INodeFile is erasure coded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7495) Remove updatePosition argument from DFSInputStream#getBlockAt()
[ https://issues.apache.org/jira/browse/HDFS-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7495: --- Resolution: Fixed Fix Version/s: 2.7.0 Status: Resolved (was: Patch Available) Committed to 2.7. Thanks. Remove updatePosition argument from DFSInputStream#getBlockAt() --- Key: HDFS-7495 URL: https://issues.apache.org/jira/browse/HDFS-7495 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Ted Yu Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.7.0 Attachments: HDFS-7495.002.patch, hdfs-7495-001.patch There are two locks: one on DFSInputStream.this and one on DFSInputStream.infoLock. Normally the lock on DFSInputStream.this is obtained first, then the lock on infoLock. However, this order is not observed in DFSInputStream#getBlockAt(): {code} synchronized(infoLock) { ... if (updatePosition) { // synchronized not strictly needed, since we only get here // from synchronized caller methods synchronized(this) { {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7749) Erasure Coding: Add striped block support in INodeFile
[ https://issues.apache.org/jira/browse/HDFS-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337347#comment-14337347 ] Zhe Zhang commented on HDFS-7749: - The rest of the patch looks good to me. Some minor points (except the last one): # In {{FSImageFormatPBINode}}: {{stripFeature}} should be {{stripeFeature}} # In {{FileWithStripedBlocksFeature}}: {{size_1}} could use a better name, maybe just {{oldSize}} and always use {{oldSize - 1}}? # Should the following Preconditions check in {{INodeFile}} also check {{isStriped}}? {code} +Preconditions.checkArgument(blocks == null || blocks.length == 0, +    "The file contains contiguous blocks"); {code} # Maybe we should extend {{INodeFile#addBlock}} to handle striped blocks in this JIRA? Is it a simple matter of checking the type of {{newBlock}} and calling {{FileWithStripedBlocksFeature#addBlock}} if it's striped? This will allow us to test some basic client requests. Erasure Coding: Add striped block support in INodeFile -- Key: HDFS-7749 URL: https://issues.apache.org/jira/browse/HDFS-7749 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-7749.000.patch This jira plans to add a new INodeFile feature to store the striped block information when the INodeFile is erasure coded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
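On the last review point, the dispatch being suggested could indeed be a simple runtime type check. A hypothetical sketch of the idea (class and method names are invented for illustration; this is not the actual INodeFile or FileWithStripedBlocksFeature API):

```java
import java.util.ArrayList;
import java.util.List;

// Invented stand-ins for the two block kinds; not the real BlockInfo classes.
class Block { }
class StripedBlock extends Block { }

class FileNodeSketch {
    private final List<Block> contiguousBlocks = new ArrayList<>();
    private final List<Block> stripedBlocks = new ArrayList<>();

    // Dispatch on the runtime type of the new block: a striped block would be
    // delegated to the striped-blocks feature, a contiguous one kept as today.
    void addBlock(Block newBlock) {
        if (newBlock instanceof StripedBlock) {
            stripedBlocks.add(newBlock);
        } else {
            contiguousBlocks.add(newBlock);
        }
    }

    int stripedCount() { return stripedBlocks.size(); }
    int contiguousCount() { return contiguousBlocks.size(); }
}
```

The appeal of this shape is that callers keep a single addBlock entry point while the feature object owns the striped-block bookkeeping.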
[jira] [Commented] (HDFS-7371) Loading predefined EC schemas from configuration
[ https://issues.apache.org/jira/browse/HDFS-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337349#comment-14337349 ] Zhe Zhang commented on HDFS-7371: - HDFS-7839 will implement the basic XAttr structure for EC policies. Loading predefined EC schemas from configuration Key: HDFS-7371 URL: https://issues.apache.org/jira/browse/HDFS-7371 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng System administrators can configure multiple EC codecs in the hdfs-site.xml file, and codec instances or schemas in a new configuration file named ec-schema.xml in the conf folder. A codec can be referenced by its instances or schemas using the codec name, and a schema can be specified by its schema name for a folder or EC zone to enforce EC. Once a schema is used to define an EC zone, its associated parameter values will be stored as xattributes and respected thereafter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
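The issue does not specify the file format; purely as an illustration, a hypothetical ec-schema.xml along the lines described (all element, attribute, and schema names below are invented, not a format adopted by HDFS) might look like:

```xml
<?xml version="1.0"?>
<!-- Hypothetical sketch only: names are invented for illustration. -->
<schemas>
  <schema name="RS-6-3">
    <codec>rs</codec>          <!-- codec name referenced by this schema -->
    <numDataUnits>6</numDataUnits>
    <numParityUnits>3</numParityUnits>
  </schema>
</schemas>
```

Under the proposal, applying such a schema to a folder would persist its parameter values as xattrs on the EC zone.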
[jira] [Commented] (HDFS-7740) Test truncate with DataNodes restarting
[ https://issues.apache.org/jira/browse/HDFS-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337440#comment-14337440 ] Tsz Wo Nicholas Sze commented on HDFS-7740: --- TestFileTruncate.testTruncateWithDataNodesRestart may time out. Could you take a look? - https://builds.apache.org/job/PreCommit-HDFS-Build/9641/testReport/org.apache.hadoop.hdfs.server.namenode/TestFileTruncate/testTruncateWithDataNodesRestart/ - https://builds.apache.org/job/PreCommit-HDFS-Build/9664//testReport/org.apache.hadoop.hdfs.server.namenode/TestFileTruncate/testTruncateWithDataNodesRestart/ Test truncate with DataNodes restarting --- Key: HDFS-7740 URL: https://issues.apache.org/jira/browse/HDFS-7740 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7740.001.patch, HDFS-7740.002.patch, HDFS-7740.003.patch Add a test case, which ensures replica consistency when DNs are failing and restarting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7740) Test truncate with DataNodes restarting
[ https://issues.apache.org/jira/browse/HDFS-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337559#comment-14337559 ] Yi Liu commented on HDFS-7740: -- Sure, let me take a look. Test truncate with DataNodes restarting --- Key: HDFS-7740 URL: https://issues.apache.org/jira/browse/HDFS-7740 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7740.001.patch, HDFS-7740.002.patch, HDFS-7740.003.patch Add a test case, which ensures replica consistency when DNs are failing and restarting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337638#comment-14337638 ] Allen Wittenauer commented on HDFS-7460: Argh. Stupid failures. These make zero sense. Let's try one more time... Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337685#comment-14337685 ] Hadoop QA commented on HDFS-7537: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700935/Screen%20Shot%202015-02-26%20at%2010.30.35.png against trunk revision d140d76. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9670//console This message is automatically generated. fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher, some of those replicas are missing, and the namenode restarts, it isn't always obvious that the missing replicas are the reason the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet minimum replication vs. completely missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7843) A truncated file is corrupted after rollback from a rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337718#comment-14337718 ] Hudson commented on HDFS-7843: -- FAILURE: Integrated in Hadoop-trunk-Commit #7202 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7202/]) HDFS-7843. A truncated file is corrupted after rollback from a rolling upgrade. (szetszwo: rev 606f5b517ffbeae0140a8c80b4cddc012c7fb3c4) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRollingUpgrade.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java A truncated file is corrupted after rollback from a rolling upgrade --- Key: HDFS-7843 URL: https://issues.apache.org/jira/browse/HDFS-7843 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Blocker Fix For: 2.7.0 Attachments: h7843_20150226.patch Here is a rolling upgrade truncate test from [~brandonli]. The basic test step is: (3 nodes cluster with HA) 1. upload a file to hdfs 2. start rollingupgrade. finish rollingupgrade for namenode and one datanode. 3. truncate the file in hdfs to 1byte 4. do rollback 5. download file from hdfs, check file size to be original size I see the file size in hdfs is correct but can't read it because the block is corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7740) Test truncate with DataNodes restarting
[ https://issues.apache.org/jira/browse/HDFS-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337715#comment-14337715 ] Yi Liu edited comment on HDFS-7740 at 2/26/15 2:21 AM: --- It seems this is not the same issue as HDFS-7695, where the NN restarts and is unable to leave SafeMode. In {{testTruncateWithDataNodesRestart}} the NN is not restarted, and the hang is caused by {{triggerBlockReportForTests}} not finishing. The only possible reason is that the block report from that datanode did not succeed. I can't reproduce this issue locally, but I checked the log and found that there is indeed no block report after that DN restarts. Let me look into the details later today. was (Author: hitliuyi): It seems this is not the same issue as HDFS-7695, where the NN restarts and is unable to leave SafeMode. In {{testTruncateWithDataNodesRestart}} the NN is not restarted, and the hang is caused by {{triggerBlockReportForTests}} not finishing. The only possible reason is that the block report from that datanode did not succeed. I can't reproduce this issue locally, but I checked the log and found that there is indeed no block report after that DN restarts. Let me look into the details. Test truncate with DataNodes restarting --- Key: HDFS-7740 URL: https://issues.apache.org/jira/browse/HDFS-7740 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7740.001.patch, HDFS-7740.002.patch, HDFS-7740.003.patch Add a test case which ensures replica consistency when DNs are failing and restarting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7740) Test truncate with DataNodes restarting
[ https://issues.apache.org/jira/browse/HDFS-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337715#comment-14337715 ] Yi Liu commented on HDFS-7740: -- It seems this is not the same issue as HDFS-7695, where the NN restarts and is unable to leave SafeMode. In {{testTruncateWithDataNodesRestart}} the NN is not restarted, and the hang is caused by {{triggerBlockReportForTests}} not finishing. The only possible reason is that the block report from that datanode did not succeed. I can't reproduce this issue locally, but I checked the log and found that there is indeed no block report after that DN restarts. Let me look into the details. Test truncate with DataNodes restarting --- Key: HDFS-7740 URL: https://issues.apache.org/jira/browse/HDFS-7740 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7740.001.patch, HDFS-7740.002.patch, HDFS-7740.003.patch Add a test case which ensures replica consistency when DNs are failing and restarting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337761#comment-14337761 ] Allen Wittenauer commented on HDFS-7460: OK, that's what I would expect. :) +1 Let's get this committed to trunk. Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337765#comment-14337765 ] Hudson commented on HDFS-7460: -- FAILURE: Integrated in Hadoop-trunk-Commit #7203 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7203/]) HDFS-7460. Rewrite httpfs to use new shell framework (John Smith via aw) (aw: rev 8c4f76aa20e75635bd6d3de14924ec246a8a071a) * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/ServerSetup.apt.vm * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/ServerSetup.md.vm * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/tomcat/ssl-server.xml * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/conf/httpfs-env.sh * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/UsingHttpTools.apt.vm * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/libexec/httpfs-config.sh * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/index.md * hadoop-hdfs-project/hadoop-hdfs-httpfs/pom.xml * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/index.apt.vm * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/markdown/UsingHttpTools.md * hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/tomcat/ssl-server.xml.conf Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Fix For: 3.0.0 Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7847) Write a junit test to simulate a heavy BlockManager load
Colin Patrick McCabe created HDFS-7847: -- Summary: Write a junit test to simulate a heavy BlockManager load Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Write a junit test to simulate a heavy BlockManager load. Quantify native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
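A minimal, self-contained sketch of the kind of measurement HDFS-7847 asks for (all names here are invented, and a plain on-heap HashMap stands in for the real BlockManager structures): insert a large number of synthetic block entries, then report heap growth and average insert latency.

```java
// Illustrative load simulation, not the HDFS-7847 test itself: builds a map of
// synthetic block entries and prints the heap consumed and per-insert latency,
// the two numbers the JIRA asks to quantify.
import java.util.HashMap;
import java.util.Map;

public class BlockLoadSim {
    /** Builds a map of numBlocks synthetic entries: blockId -> {genStamp, length}. */
    public static Map<Long, long[]> buildLoad(int numBlocks) {
        Map<Long, long[]> blocksMap = new HashMap<>(numBlocks * 2);
        for (long id = 0; id < numBlocks; id++) {
            // genStamp and length stand in for real BlockInfo fields
            blocksMap.put(id, new long[] { id, 128L << 20 });
        }
        return blocksMap;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long before = rt.totalMemory() - rt.freeMemory();
        long t0 = System.nanoTime();
        Map<Long, long[]> map = buildLoad(1_000_000);
        long elapsedNs = System.nanoTime() - t0;
        long after = rt.totalMemory() - rt.freeMemory();
        System.out.printf("blocks=%d heapDeltaMB=%d avgInsertNs=%d%n",
                map.size(), (after - before) >> 20, elapsedNs / map.size());
    }
}
```

A real version would use the actual BlockManager APIs and also measure native heap, but the measurement skeleton is the same.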
[jira] [Updated] (HDFS-7844) Create an off-heap hash table implementation
[ https://issues.apache.org/jira/browse/HDFS-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7844: --- Target Version/s: HDFS-7836 Create an off-heap hash table implementation Key: HDFS-7844 URL: https://issues.apache.org/jira/browse/HDFS-7844 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6133) Make Balancer support exclude specified path
[ https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337787#comment-14337787 ] Tsz Wo Nicholas Sze commented on HDFS-6133: --- Sure, I have just filed HDFS-7849 for updating the documentation. Make Balancer support exclude specified path Key: HDFS-6133 URL: https://issues.apache.org/jira/browse/HDFS-6133 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover, datanode Reporter: zhaoyunjiong Assignee: zhaoyunjiong Fix For: 2.7.0 Attachments: HDFS-6133-1.patch, HDFS-6133-10.patch, HDFS-6133-11.patch, HDFS-6133-2.patch, HDFS-6133-3.patch, HDFS-6133-4.patch, HDFS-6133-5.patch, HDFS-6133-6.patch, HDFS-6133-7.patch, HDFS-6133-8.patch, HDFS-6133-9.patch, HDFS-6133.patch Currently, running the Balancer destroys the Regionserver's data locality. If getBlocks could exclude blocks belonging to files which have a specific path prefix, like /hbase, then we could run the Balancer without destroying the Regionserver's data locality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337799#comment-14337799 ] Tsz Wo Nicholas Sze commented on HDFS-7537: --- "..., I think changes in fsck should not influence TestLeaseRecovery2.java tests, right?" Agree. The failure of TestLeaseRecovery2 has nothing to do with the patch. fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337806#comment-14337806 ] GAO Rui commented on HDFS-7537: --- So, what should I do to get this patch applied to the trunk branch? fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7537: -- Resolution: Fixed Fix Version/s: 2.7.0 Status: Resolved (was: Patch Available) I have committed this. Thanks, Rui! fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Fix For: 2.7.0 Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337831#comment-14337831 ] GAO Rui commented on HDFS-7537: --- I see. Thank you very much for your kind help. Our team will try our best to contribute to HADOOP and HDFS. Thank you again. fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Fix For: 2.7.0 Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7785) Improve diagnostics for HttpPutFailedException
[ https://issues.apache.org/jira/browse/HDFS-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengbing Liu updated HDFS-7785: Summary: Improve diagnostics for HttpPutFailedException (was: Add detailed message for HttpPutFailedException) Improve diagnostics for HttpPutFailedException -- Key: HDFS-7785 URL: https://issues.apache.org/jira/browse/HDFS-7785 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Chengbing Liu Assignee: Chengbing Liu Attachments: HDFS-7785.01.patch One of our namenode logs shows the following exception message. ... Caused by: org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException: org.apache.hadoop.security.authentication.util.SignerException: Invalid signature at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImage(TransferFsImage.java:294) ... {{HttpPutFailedException}} should have its detailed information, such as status code and url, shown in the log to help debugging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-7460: --- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks! Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Fix For: 3.0.0 Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337772#comment-14337772 ] Allen Wittenauer commented on HDFS-7460: I'm going to mark this as security, since this patch also includes a fix for the password being displayed on the command line for SSL connections. Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Labels: security Fix For: 3.0.0 Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337818#comment-14337818 ] Hudson commented on HDFS-7537: -- FAILURE: Integrated in Hadoop-trunk-Commit #7204 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7204/]) HDFS-7537. Add UNDER MIN REPL'D BLOCKS count to fsck. Contributed by GAO Rui (szetszwo: rev 725cc499f00abeeab9f58cbc778e65522eec9d98) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Fix For: 2.7.0 Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337705#comment-14337705 ] Hadoop QA commented on HDFS-7460: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700895/HDFS-7460-01.patch against trunk revision d140d76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-httpfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9669//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9669//console This message is automatically generated. Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-3097) Use 'exec' to invoke catalina.sh in HttpFS's httpfs.sh
[ https://issues.apache.org/jira/browse/HDFS-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HDFS-3097. Resolution: Duplicate Use 'exec' to invoke catalina.sh in HttpFS's httpfs.sh -- Key: HDFS-3097 URL: https://issues.apache.org/jira/browse/HDFS-3097 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.1 Reporter: Herman Chen Without 'exec', the shell spawns a new process, which defeats the purpose when you would like to run it in the foreground with 'httpfs.sh run', which eventually invokes 'catalina.sh run'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-7794) Update httpfs documentation
[ https://issues.apache.org/jira/browse/HDFS-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer resolved HDFS-7794. Resolution: Duplicate Update httpfs documentation --- Key: HDFS-7794 URL: https://issues.apache.org/jira/browse/HDFS-7794 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Allen Wittenauer Attachments: HDFS-7794-00.patch Update httpfs documentation: * shell script changes * apt to markdown * remove HDFS proxy references, since it no longer exists -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7845) Compress block reports
Colin Patrick McCabe created HDFS-7845: -- Summary: Compress block reports Key: HDFS-7845 URL: https://issues.apache.org/jira/browse/HDFS-7845 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb We should optionally compress block reports using a low-cpu codec such as lz4 or snappy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
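As a rough sketch of the idea: lz4 and snappy are not in the JDK, so this example substitutes java.util.zip.Deflater at its fastest level to show the shape of compressing a serialized block report before transmission and restoring it on receipt. The class and method names are illustrative, not part of any actual patch.

```java
// Hedged sketch only: HDFS-7845 proposes lz4/snappy; Deflater(BEST_SPEED)
// stands in here as a stdlib low-CPU codec to illustrate the round trip.
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class BlockReportCodec {
    /** Compresses a serialized block report before sending it over the wire. */
    public static byte[] compress(byte[] report) {
        Deflater d = new Deflater(Deflater.BEST_SPEED);
        d.setInput(report);
        d.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!d.finished()) out.write(buf, 0, d.deflate(buf));
        d.end();
        return out.toByteArray();
    }

    /** Restores the original serialized report on the receiving side. */
    public static byte[] decompress(byte[] compressed) {
        Inflater inf = new Inflater();
        inf.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && inf.needsInput()) break; // truncated input guard
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt block report", e);
        } finally {
            inf.end();
        }
        return out.toByteArray();
    }
}
```

Block reports are long runs of similar-looking block records, so even a fast codec should compress them well; the CPU cost of BEST_SPEED-style settings is the trade-off the JIRA flags.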
[jira] [Created] (HDFS-7846) Create off-heap BlocksMap, BlockInfo, DataNodeInfo structures
Colin Patrick McCabe created HDFS-7846: -- Summary: Create off-heap BlocksMap, BlockInfo, DataNodeInfo structures Key: HDFS-7846 URL: https://issues.apache.org/jira/browse/HDFS-7846 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Create off-heap BlocksMap, BlockInfo, and DataNodeInfo structures. The BlocksMap will use the off-heap hash table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
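One plausible shape for such a structure, purely as an illustration (this is not the HDFS-7846 design): a fixed-capacity, open-addressed hash table for long keys and values stored in a direct ByteBuffer, so the entries live outside the garbage-collected Java heap. Key 0 is reserved as the empty-slot marker.

```java
// Speculative sketch of an off-heap long->long map; entry storage is a direct
// ByteBuffer (zero-initialized), probed with linear open addressing.
import java.nio.ByteBuffer;

public class OffHeapLongMap {
    private static final int SLOT = 16; // 8-byte key + 8-byte value
    private final ByteBuffer buf;
    private final int capacity;

    public OffHeapLongMap(int capacity) {
        this.capacity = capacity;
        this.buf = ByteBuffer.allocateDirect(capacity * SLOT); // zero-filled
    }

    private int slot(long key, int probe) {
        return ((int) Math.floorMod(key + probe, (long) capacity)) * SLOT;
    }

    public void put(long key, long value) {
        if (key == 0) throw new IllegalArgumentException("key 0 is reserved");
        for (int p = 0; p < capacity; p++) {
            int off = slot(key, p);
            long k = buf.getLong(off);
            if (k == 0 || k == key) {      // empty slot, or overwrite same key
                buf.putLong(off, key);
                buf.putLong(off + 8, value);
                return;
            }
        }
        throw new IllegalStateException("table full");
    }

    /** Returns the value, or null if the key is absent. */
    public Long get(long key) {
        for (int p = 0; p < capacity; p++) {
            int off = slot(key, p);
            long k = buf.getLong(off);
            if (k == key) return buf.getLong(off + 8);
            if (k == 0) return null;        // hit an empty slot: absent
        }
        return null;
    }
}
```

A real BlocksMap would need variable-length records, deletion, and resizing, but the core benefit shown here is the point of the series: block entries stop contributing to GC pressure and java heap size.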
[jira] [Updated] (HDFS-7846) Create off-heap BlocksMap, BlockInfo, DataNodeInfo structures
[ https://issues.apache.org/jira/browse/HDFS-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7846: --- Target Version/s: HDFS-7836 Affects Version/s: HDFS-7836 Create off-heap BlocksMap, BlockInfo, DataNodeInfo structures - Key: HDFS-7846 URL: https://issues.apache.org/jira/browse/HDFS-7846 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Create off-heap BlocksMap, BlockInfo, and DataNodeInfo structures. The BlocksMap will use the off-heap hash table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7848) Use off-heap BlocksMap, BlockInfo, etc. in FSNamesystem
Colin Patrick McCabe created HDFS-7848: -- Summary: Use off-heap BlocksMap, BlockInfo, etc. in FSNamesystem Key: HDFS-7848 URL: https://issues.apache.org/jira/browse/HDFS-7848 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Use off-heap BlocksMap, BlockInfo, etc. in FSNamesystem -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-7460: --- Labels: security (was: s) Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Labels: security Fix For: 3.0.0 Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-7460: --- Labels: s (was: ) Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Labels: s Fix For: 3.0.0 Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7849) Update documentation for enabling a new feature in rolling upgrade
Tsz Wo Nicholas Sze created HDFS-7849: - Summary: Update documentation for enabling a new feature in rolling upgrade Key: HDFS-7849 URL: https://issues.apache.org/jira/browse/HDFS-7849 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor When there is a new feature in a new software, the new feature may not work correctly with the old software. In such case, rolling upgrade should be done by the following steps: # disable the new feature, # upgrade the cluster, and # enable the new feature. We should clarify it in the documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337811#comment-14337811 ] GAO Rui commented on HDFS-7537: --- Do you mean this patch, HDFS-7537.1.patch? Or do you suggest that I attach a new patch which has the same content as HDFS-7537.2.patch but is named HDFS-7537.3.patch? fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7843) A truncated file is corrupted after rollback from a rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7843: -- Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Brandon for reviewing the patch. I have committed this. A truncated file is corrupted after rollback from a rolling upgrade --- Key: HDFS-7843 URL: https://issues.apache.org/jira/browse/HDFS-7843 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Blocker Fix For: 2.7.0 Attachments: h7843_20150226.patch Here is a rolling upgrade truncate test from [~brandonli]. The basic test step is: (3 nodes cluster with HA) 1. upload a file to hdfs 2. start rollingupgrade. finish rollingupgrade for namenode and one datanode. 3. truncate the file in hdfs to 1byte 4. do rollback 5. download file from hdfs, check file size to be original size I see the file size in hdfs is correct but can't read it because the block is corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-7460: --- Release Note: This deprecates the following environment variables: HTTPFS_CONFIG HTTPFS_LOG Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Fix For: 3.0.0 Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7844) Create an off-heap hash table implementation
Colin Patrick McCabe created HDFS-7844: -- Summary: Create an off-heap hash table implementation Key: HDFS-7844 URL: https://issues.apache.org/jira/browse/HDFS-7844 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Create an off-heap hash table implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7850) distribute-excludes and refresh-namenodes update to new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-7850: --- Description: These need to get updated to use new shell framework. distribute-excludes and refresh-namenodes update to new shell framework --- Key: HDFS-7850 URL: https://issues.apache.org/jira/browse/HDFS-7850 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer These need to get updated to use new shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7850) distribute-excludes and refresh-namenodes update to new shell framework
Allen Wittenauer created HDFS-7850: -- Summary: distribute-excludes and refresh-namenodes update to new shell framework Key: HDFS-7850 URL: https://issues.apache.org/jira/browse/HDFS-7850 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7537: -- Hadoop Flags: Reviewed +1 patch looks good. fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337827#comment-14337827 ] Tsz Wo Nicholas Sze commented on HDFS-7537: --- bq. you mean this patch HDFS-7537.1.patch? By +1, we mean that the patch is perfect and can be committed. Thanks again for working on this. fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Fix For: 2.7.0 Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7774) Unresolved symbols error while compiling HDFS on Windows 7/32 bit
[ https://issues.apache.org/jira/browse/HDFS-7774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14338044#comment-14338044 ] Kiran Kumar M R commented on HDFS-7774: --- Thanks for the suggestion [~cnauroth]. Please check the new patch; I have added an Ant condition to make the cmake generator parameterized. Tests passed for me. I had set {{-Xmx512M}}. I do not have access to pastebin from my workplace to post full test logs, so I am pasting some test output snippets here.
{code}
hadoop-hdfs-project\hadoop-hdfs>mvn test -Dtest=test_libhdfs_threaded
. . .
main:
 [echo] Running test_libhdfs_threaded
. . .
 [exec] testHdfsOperations(threadIdx=0): starting
 [exec] testHdfsOperations(threadIdx=1): starting
 [exec] testHdfsOperations(threadIdx=2): starting
. . .
 [exec] 2015-02-26 10:54:58,772 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped SelectChannelConnector@127.0.0.1:0
 [exec] 2015-02-26 10:54:58,773 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping DataNode metrics system...
 [exec] 2015-02-26 10:54:58,774 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) - DataNode metrics system stopped.
 [exec] 2015-02-26 10:54:58,774 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(600)) - DataNode metrics system shutdown complete.
 [echo] Finished test_native_mini_dfs
[INFO] Executed tasks
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 1:42.793s
[INFO] Finished at: Thu Feb 26 10:54:59 GMT+05:30 2015
[INFO] Final Memory: 35M/494M
[INFO]
{code}
Unresolved symbols error while compiling HDFS on Windows 7/32 bit - Key: HDFS-7774 URL: https://issues.apache.org/jira/browse/HDFS-7774 Project: Hadoop HDFS Issue Type: Bug Components: build, native Affects Versions: 2.6.0 Environment: Windows 7, 32 bit, Visual Studio 10.
Windows PATH: PATH=C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;D:\PIG\pig-0.13.0\bin;C:\PROGRA~1\JAVA\JDK1.7.0_71\bin;C:\Program Files\Microsoft Windows Performance Toolkit\;C:\GNUWIN32\GETGNUWIN32\BIN;C:\CYGWIN\BIN;D:\git\cmd;D:\GIT\BIN;D:\MAVEN-3-2-3\APACHE-MAVEN-3.2.3-BIN\apache-maven-3.2.3\bin;D:\UTILS;c:\windows\Microsoft.NET\Framework\v4.0.30319;D:\cmake\bin;c:\progra~1\Micros~1.0\vc\crt\src; SDK Path: PATH=C:\Windows\Microsoft.NET\Framework\v4.0.30319;C:\Windows\Microsoft.NET\Framework\v3.5;;C:\Program Files\Microsoft Visual Studio 10.0\Common7\IDE;C:\Program Files\Microsoft Visual Studio 10.0\Common7\Tools;;C:\Program Files\Microsoft Visual Studio 10.0\VC\Bin;C:\Program Files\Microsoft Visual Studio 10.0\VC\Bin\VCPackages;;C:\Program Files\Microsoft SDKs\Windows\v7.1\Bin\NETFX 4.0 Tools;C:\Program Files\Microsoft SDKs\Windows\v7.1\Bin;;C:\ProgramData\Oracle\Java\javapath;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;D:\PIG\pig-0.13.0\bin;C:\PROGRA~1\JAVA\JDK1.7.0_71\bin;C:\Program Files\Microsoft Windows Performance Toolkit\;C:\GNUWIN32\GETGNUWIN32\BIN;C:\CYGWIN\BIN;D:\git\cmd;D:\GIT\BIN;D:\MAVEN-3-2-3\APACHE-MAVEN-3.2.3-BIN\apache-maven-3.2.3\bin;D:\UTILS;c:\windows\Microsoft.NET\Framework\v4.0.30319;D:\cmake\bin;c:\progra~1\Micros~1.0\vc\crt\src; Reporter: Venkatasubramaniam Ramakrishnan Assignee: Kiran Kumar M R Priority: Critical Labels: build Attachments: HDFS-7774-001.patch, HDFS-7774-002.patch, Win32_Changes-temp.patch I am getting the following error in the hdfs module compilation: . . . [exec] ClCompile: [exec] All outputs are up-to-date. [exec] Lib: [exec] All outputs are up-to-date. 
[exec] hdfs_static.vcxproj - D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\target\bin\RelWithDebInfo\hdfs.lib [exec] FinalizeBuildStatus: [exec] Deleting file hdfs_static.dir\RelWithDebInfo\hdfs_static.unsuccessfulbuild. [exec] Touching hdfs_static.dir\RelWithDebInfo\hdfs_static.lastbuildstate. [exec] Done Building Project D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\hdfs_static.vcxproj (default targets). [exec] Done Building Project D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\ALL_BUILD.vcxproj (default targets) -- FAILED. [exec] [exec] Build FAILED. [exec] [exec] D:\h\hadoop-2.6.0-src\hadoop-hdfs-project\hadoop-hdfs\target\native\ALL_BUILD.vcxproj (default target) (1) -
[jira] [Updated] (HDFS-7467) Provide storage tier information for a directory via fsck
[ https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated HDFS-7467: --- Resolution: Fixed Status: Resolved (was: Patch Available) Provide storage tier information for a directory via fsck - Key: HDFS-7467 URL: https://issues.apache.org/jira/browse/HDFS-7467 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Affects Versions: 2.6.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-7467-002.patch, HDFS-7467-003.patch, HDFS-7467-004.patch, HDFS-7467.patch, storagepolicydisplay.pdf Currently _fsck_ provides information regarding blocks for a directory. It should be augmented to provide storage tier information (optionally). The sample report could be as follows:
{code}
Storage Tier Combination    # of blocks    % of blocks
DISK:1,ARCHIVE:2            340730         97.7393%
ARCHIVE:3                   3928           1.1268%
DISK:2,ARCHIVE:2            3122           0.8956%
DISK:2,ARCHIVE:1            748            0.2146%
DISK:1,ARCHIVE:3            44             0.0126%
DISK:3,ARCHIVE:2            30             0.0086%
DISK:3,ARCHIVE:1            9              0.0026%
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
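A report like the one above boils down to counting each block's replica-placement combination and printing its share of the total. The following is a minimal, self-contained sketch of that aggregation; the class and method names are hypothetical, and this is not the actual StoragePolicySummary code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch (not the actual StoragePolicySummary code): count blocks
// per storage-type combination, then print each combination's share of all blocks.
public class StorageTierSummary {

    // One entry per block: the storage type of each replica, e.g. ["DISK", "ARCHIVE", "ARCHIVE"].
    public static Map<String, Integer> summarize(List<List<String>> blockPlacements) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> replicas : blockPlacements) {
            // Normalize to a key like "ARCHIVE:2,DISK:1", independent of replica order.
            Map<String, Integer> perType = new TreeMap<>();
            for (String type : replicas) {
                perType.merge(type, 1, Integer::sum);
            }
            StringBuilder key = new StringBuilder();
            perType.forEach((type, n) -> {
                if (key.length() > 0) key.append(',');
                key.append(type).append(':').append(n);
            });
            counts.merge(key.toString(), 1, Integer::sum);
        }
        return counts;
    }

    public static String format(Map<String, Integer> counts) {
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        StringBuilder sb = new StringBuilder("Storage Tier Combination    # of blocks    % of blocks\n");
        counts.forEach((combo, n) ->
            sb.append(String.format("%-27s %11d %11.4f%%%n", combo, n, 100.0 * n / total)));
        return sb.toString();
    }
}
```

The percentages in the sample report are consistent with this kind of per-combination count over a total of roughly 348,600 blocks.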
[jira] [Updated] (HDFS-7435) PB encoding of block reports is very inefficient
[ https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-7435: -- Attachment: HDFS-7435.patch I don't need to bump the DN's min NN version. I leveraged the auto-detection I did before. So here's the deal. As wasteful as PBs can be in some cases, they are smart in other cases to avoid unnecessary copying. I've leveraged those APIs to their maximum. I've also optimized {{BlockListAsLongs}} to remove intermediate encodings. Instead of a pre-allocated array, a {{ByteString.Output}} stream can be used with a custom-specified buffer size. Internally the {{ByteString}} builds up a balanced tree of rope buffers of the given size. Extracting the buffer via {{Output#toByteString}} doesn't really copy anything. So on the DN we have cheaply built up a blocks buffer of {{ByteString}} that is internally segmented. I relented to sending the blocks buffer in a repeating field. {{ByteString}} provides no direct API for internally accessing its roped strings. However, we can slice it up with {{ByteString#substring}}. It prefers to create bounded instances (offset+len), much like {{String}} did in earlier JDKs. Alas, a wasted instantiation. However, a substring that aligns with an entire rope buffer will get the actual buffer reference, not wrapped with a bounded instance. So I slice it up that way. On the NN, if the new repeating field is not present, it decodes as the old-style format. Otherwise it uses {{ByteString#concat}} to reassemble the blocks buffer fragments. Concat doesn't copy, but again builds a balanced tree for later decoding. The best part is the encoding/decoding. {{BlockListAsLongs}} directly encodes the {{Replicas}} into a {{ByteString}}, not into a wasted intermediate {{long[]}}. To support the unlikely and surely short-lived case of a new DN reporting to an old NN, the PB layer will decode the {{ByteString}} and re-encode into old-style longs.
Not efficient, but it's the least ugly way to handle something that will probably never happen. {{BlockListAsLongs}} doesn't decode the buffer into a wasted and intermediate {{long[]}}. Instead, the iterator decodes on demand. I also changed the format of the encoded buffer. Might as well while the patient is on the table. It's documented in the code, but now finalized and uc blocks aren't treated specially. The old layout (finalized and uc counts, followed by 3-long finalized blocks, a 3-long delimiter, and 4-long uc blocks) is now just a total block count followed by 4-long blocks. Every block encodes its {{ReplicaState}}. One compelling reason is the capability to encode additional bits into the state field. One anticipated use is encoding the pinned block status. Phew. I need to write some new tests to verify old/new compatibility. I'll also manually test on a perf cluster. PB encoding of block reports is very inefficient Key: HDFS-7435 URL: https://issues.apache.org/jira/browse/HDFS-7435 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.patch, HDFS-7435.patch Block reports are encoded as a PB repeating long. Repeating fields use an {{ArrayList}} with a default capacity of 10. A block report containing tens or hundreds of thousands of longs (3 for each replica) is extremely expensive since the {{ArrayList}} must realloc many times. Also, decoding repeating fields will box the primitive longs, which must then be unboxed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
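The buffer layout described in the comment above (a total block count, then four longs per block, with the {{ReplicaState}} carried in the last slot) and the on-demand iterator can be illustrated with a self-contained sketch. All names and the state codes here are hypothetical; this is not the actual {{BlockListAsLongs}} code.

```java
import java.nio.LongBuffer;
import java.util.Iterator;

// Hypothetical sketch (not the actual BlockListAsLongs code) of the described
// layout: [total block count][id, length, genstamp, state] repeated per block.
// Decoding happens on demand via an iterator instead of materializing a long[].
public class CompactBlockReport {
    // Illustrative state codes; the real ReplicaState values differ.
    public static final long FINALIZED = 0;
    public static final long RBW = 1;

    // Each block is {id, length, genstamp, state}; every block carries its state,
    // so finalized and under-construction blocks need no special sections.
    public static long[] encode(long[][] blocks) {
        long[] buf = new long[1 + 4 * blocks.length];
        buf[0] = blocks.length;            // total block count comes first
        int i = 1;
        for (long[] b : blocks) {
            buf[i++] = b[0];
            buf[i++] = b[1];
            buf[i++] = b[2];
            buf[i++] = b[3];
        }
        return buf;
    }

    public static Iterator<long[]> decode(long[] buf) {
        LongBuffer lb = LongBuffer.wrap(buf);
        long count = lb.get();
        return new Iterator<long[]>() {
            long remaining = count;
            public boolean hasNext() { return remaining > 0; }
            public long[] next() {         // decode one 4-long block lazily
                remaining--;
                return new long[] { lb.get(), lb.get(), lb.get(), lb.get() };
            }
        };
    }
}
```

The lazy iterator avoids both the intermediate {{long[]}} and the boxing cost that the original repeating-long encoding incurred.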
[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient
[ https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337507#comment-14337507 ] Hadoop QA commented on HDFS-7435: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700906/HDFS-7435.patch against trunk revision caa42ad. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9668//console This message is automatically generated. PB encoding of block reports is very inefficient Key: HDFS-7435 URL: https://issues.apache.org/jira/browse/HDFS-7435 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.patch, HDFS-7435.patch Block reports are encoded as a PB repeating long. Repeating fields use an {{ArrayList}} with default capacity of 10. A block report containing tens or hundreds of thousand of longs (3 for each replica) is extremely expensive since the {{ArrayList}} must realloc many times. Also, decoding repeating fields will box the primitive longs which must then be unboxed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7467) Provide storage tier information for a directory via fsck
[ https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337573#comment-14337573 ] Hudson commented on HDFS-7467: -- FAILURE: Integrated in Hadoop-trunk-Commit #7201 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7201/]) HDFS-7467. Provide storage tier information for a directory via fsck. (Benoy Antony) (benoy: rev d140d76a43c88e326b9c2818578f22bd3563b969) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestStoragePolicySummary.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/StoragePolicySummary.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java Provide storage tier information for a directory via fsck - Key: HDFS-7467 URL: https://issues.apache.org/jira/browse/HDFS-7467 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Affects Versions: 2.6.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-7467-002.patch, HDFS-7467-003.patch, HDFS-7467-004.patch, HDFS-7467.patch, storagepolicydisplay.pdf Currently _fsck_ provides information regarding blocks for a directory. It should be augmented to provide storage tier information (optionally). The sample report could be as follows:
{code}
Storage Tier Combination    # of blocks    % of blocks
DISK:1,ARCHIVE:2            340730         97.7393%
ARCHIVE:3                   3928           1.1268%
DISK:2,ARCHIVE:2            3122           0.8956%
DISK:2,ARCHIVE:1            748            0.2146%
DISK:1,ARCHIVE:3            44             0.0126%
DISK:3,ARCHIVE:2            30             0.0086%
DISK:3,ARCHIVE:1            9              0.0026%
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7568) Support immutability (Write-once-read-many) in HDFS
[ https://issues.apache.org/jira/browse/HDFS-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337400#comment-14337400 ] Lei (Eddy) Xu commented on HDFS-7568: - bq. I don't understand why this would be a requirement. It is the active NameNode that is choosing to create or not create a file, not the standby. The standby just follows the decision which the active NameNode already took. And if there is a failover, the standby will replay all edits before becoming active. This is not necessarily a requirement to be implemented within HDFS. The regulation requires some unmodifiable time source for audit logs, retention, and such, so I think this can also be done by letting the ANN/SNN synchronize with an external time server. bq. Who are special personnel are and how they are different than system administrators? A few times you refer to super users, but we already have superusers in hdfs and they are system administrators. I agree that "super user" might be bad terminology here. What I meant is that certain operations (e.g., modifying an immutable file before its retention date) should only be granted to some authorized people (the staff from the Commission), who are different from the administrators who perform maintenance tasks. Also, in this environment, we assume that users do not have any kind of access to the underlying Linux system (e.g., ssh or physical access). This requirement cannot be implemented or enforced within HDFS either. Does that make sense to you, [~cmccabe]? Support immutability (Write-once-read-many) in HDFS --- Key: HDFS-7568 URL: https://issues.apache.org/jira/browse/HDFS-7568 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 2.7.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Many regulatory compliance regimes require storage to support WORM functionality to protect sensitive data from being modified or deleted. This jira proposes adding that feature to HDFS.
See the following comment for more description. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7360) Test libhdfs3 against MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337448#comment-14337448 ] Xiaoyu Yao commented on HDFS-7360: -- [~wangzw], Thanks for the work to make libhdfs3 build with Maven. I tried the patches with the mvn command executed from /hadoop/hadoop-hdfs-project/hadoop-hdfs/ above, but it failed because CMakeLists does not allow building the project in the source directory.
[exec] CMake Error at contrib/libhdfs3/CMakeLists.txt:26 (MESSAGE):
[exec]   cannot build the project in the source directory! Out-of-source build is
[exec]   enforced!
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-hdfs: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part ...exec dir=/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/native executable=cmake failonerror=true... @ 5:132 in /hadoop/hadoop-hdfs-project/hadoop-hdfs/target/antrun/build-main.xml
Test libhdfs3 against MiniDFSCluster Key: HDFS-7360 URL: https://issues.apache.org/jira/browse/HDFS-7360 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Haohui Mai Assignee: Zhanwei Wang Priority: Critical Attachments: HDFS-7360-pnative.002.patch, HDFS-7360.patch Currently the branch has enough code to interact with HDFS servers. We should test the code against MiniDFSCluster to ensure the correctness of the code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7460) Rewrite httpfs to use new shell framework
[ https://issues.apache.org/jira/browse/HDFS-7460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337469#comment-14337469 ] Hadoop QA commented on HDFS-7460: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700895/HDFS-7460-01.patch against trunk revision caa42ad. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9666//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9666//console This message is automatically generated. Rewrite httpfs to use new shell framework - Key: HDFS-7460 URL: https://issues.apache.org/jira/browse/HDFS-7460 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: John Smith Attachments: HDFS-7460-01.patch, HDFS-7460.patch httpfs shell code was not rewritten during HADOOP-9902. It should be modified to take advantage of the common shell framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7740) Test truncate with DataNodes restarting
[ https://issues.apache.org/jira/browse/HDFS-7740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337580#comment-14337580 ] Konstantin Shvachko commented on HDFS-7740: --- It could be related to HDFS-7695. I did not look closely, so it's just a possibility. Test truncate with DataNodes restarting --- Key: HDFS-7740 URL: https://issues.apache.org/jira/browse/HDFS-7740 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7740.001.patch, HDFS-7740.002.patch, HDFS-7740.003.patch Add a test case, which ensures replica consistency when DNs are failing and restarting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7467) Provide storage tier information for a directory via fsck
[ https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337588#comment-14337588 ] Benoy Antony commented on HDFS-7467: committed to trunk and branch-2. Provide storage tier information for a directory via fsck - Key: HDFS-7467 URL: https://issues.apache.org/jira/browse/HDFS-7467 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Affects Versions: 2.6.0 Reporter: Benoy Antony Assignee: Benoy Antony Fix For: 2.7.0 Attachments: HDFS-7467-002.patch, HDFS-7467-003.patch, HDFS-7467-004.patch, HDFS-7467.patch, storagepolicydisplay.pdf Currently _fsck_ provides information regarding blocks for a directory. It should be augmented to provide storage tier information (optionally). The sample report could be as follows:
{code}
Storage Tier Combination    # of blocks    % of blocks
DISK:1,ARCHIVE:2            340730         97.7393%
ARCHIVE:3                   3928           1.1268%
DISK:2,ARCHIVE:2            3122           0.8956%
DISK:2,ARCHIVE:1            748            0.2146%
DISK:1,ARCHIVE:3            44             0.0126%
DISK:3,ARCHIVE:2            30             0.0086%
DISK:3,ARCHIVE:1            9              0.0026%
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7843) A truncated file is corrupted after rollback from a rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-7843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337590#comment-14337590 ] Brandon Li commented on HDFS-7843: -- +1. The patch looks good to me. A truncated file is corrupted after rollback from a rolling upgrade --- Key: HDFS-7843 URL: https://issues.apache.org/jira/browse/HDFS-7843 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Blocker Attachments: h7843_20150226.patch Here is a rolling upgrade truncate test from [~brandonli]. The basic test step is: (3 nodes cluster with HA) 1. upload a file to hdfs 2. start rollingupgrade. finish rollingupgrade for namenode and one datanode. 3. truncate the file in hdfs to 1byte 4. do rollback 5. download file from hdfs, check file size to be original size I see the file size in hdfs is correct but can't read it because the block is corrupted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7467) Provide storage tier information for a directory via fsck
[ https://issues.apache.org/jira/browse/HDFS-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony updated HDFS-7467: --- Fix Version/s: 2.7.0 Provide storage tier information for a directory via fsck - Key: HDFS-7467 URL: https://issues.apache.org/jira/browse/HDFS-7467 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Affects Versions: 2.6.0 Reporter: Benoy Antony Assignee: Benoy Antony Fix For: 2.7.0 Attachments: HDFS-7467-002.patch, HDFS-7467-003.patch, HDFS-7467-004.patch, HDFS-7467.patch, storagepolicydisplay.pdf Currently _fsck_ provides information regarding blocks for a directory. It should be augmented to provide storage tier information (optionally). The sample report could be as follows:
{code}
Storage Tier Combination    # of blocks    % of blocks
DISK:1,ARCHIVE:2            340730         97.7393%
ARCHIVE:3                   3928           1.1268%
DISK:2,ARCHIVE:2            3122           0.8956%
DISK:2,ARCHIVE:1            748            0.2146%
DISK:1,ARCHIVE:3            44             0.0126%
DISK:3,ARCHIVE:2            30             0.0086%
DISK:3,ARCHIVE:1            9              0.0026%
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart
[ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-7537: -- Attachment: Screen Shot 2015-02-26 at 10.30.35.png Screen Shot 2015-02-26 at 10.27.46.png fsck is confusing when dfs.namenode.replication.min 1 missing replicas NN restart - Key: HDFS-7537 URL: https://issues.apache.org/jira/browse/HDFS-7537 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Allen Wittenauer Assignee: GAO Rui Attachments: HDFS-7537.1.patch, HDFS-7537.2.patch, Screen Shot 2015-02-26 at 10.27.46.png, Screen Shot 2015-02-26 at 10.30.35.png, dfs-min-2-fsck.png, dfs-min-2.png If minimum replication is set to 2 or higher and some of those replicas are missing and the namenode restarts, it isn't always obvious that the missing replicas are the reason why the namenode isn't leaving safemode. We should improve the output of fsck and the web UI to make it obvious that the missing blocks are from unmet replicas vs. completely/totally missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6488) HDFS superuser unable to access user's Trash files using NFSv3 mount
[ https://issues.apache.org/jira/browse/HDFS-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337660#comment-14337660 ] Brandon Li commented on HDFS-6488: -- The question is, how could the gateway know who the superuser is? It doesn't seem reasonable to assume the gateway is always started by the superuser. Any suggestions? HDFS superuser unable to access user's Trash files using NFSv3 mount Key: HDFS-6488 URL: https://issues.apache.org/jira/browse/HDFS-6488 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.3.0 Reporter: Stephen Chu As the hdfs superuser on the NFS mount, I cannot cd or ls the /user/schu/.Trash directory:
{code}
bash-4.1$ cd .Trash/
bash: cd: .Trash/: Permission denied
bash-4.1$ ls -la
total 2
drwxr-xr-x 4 schu 2584148964 128 Jan  7 10:42 .
drwxr-xr-x 4 hdfs 2584148964 128 Jan  6 16:59 ..
drwx------ 2 schu 2584148964  64 Jan  7 10:45 .Trash
drwxr-xr-x 2 hdfs hdfs        64 Jan  7 10:42 tt
bash-4.1$ ls .Trash
ls: cannot open directory .Trash: Permission denied
bash-4.1$
{code}
When using FsShell as the hdfs superuser, I have superuser permissions to schu's .Trash contents:
{code}
bash-4.1$ hdfs dfs -ls -R /user/schu/.Trash
drwx------   - schu supergroup          0 2014-01-07 10:48 /user/schu/.Trash/Current
drwx------   - schu supergroup          0 2014-01-07 10:48 /user/schu/.Trash/Current/user
drwx------   - schu supergroup          0 2014-01-07 10:48 /user/schu/.Trash/Current/user/schu
-rw-r--r--   1 schu supergroup          4 2014-01-07 10:48 /user/schu/.Trash/Current/user/schu/tf1
{code}
The NFSv3 logs don't produce any error when the superuser tries to access schu's .Trash contents. However, for other permission errors (e.g. schu tries to delete a directory owned by hdfs), there will be a permission error in the logs. I think this is perhaps not specific to the .Trash directory. I created a /user/schu/dir1 which has the same permissions as .Trash (700). When I try cd'ing into the directory from the NFSv3 mount as the hdfs superuser, I get the same permission denied.
{code}
[schu@hdfs-nfs ~]$ hdfs dfs -ls
Found 4 items
drwx------   - schu supergroup          0 2014-01-07 10:57 .Trash
drwx------   - schu supergroup          0 2014-01-07 11:05 dir1
-rw-r--r--   1 schu supergroup          4 2014-01-07 11:05 tf1
drwxr-xr-x   - hdfs hdfs                0 2014-01-07 10:42 tt
bash-4.1$ whoami
hdfs
bash-4.1$ pwd
/hdfs_nfs_mount/user/schu
bash-4.1$ cd dir1
bash: cd: dir1: Permission denied
bash-4.1$
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)