[jira] [Commented] (HDFS-7299) Hadoop Namenode failing because of negative value in fsimage
[ https://issues.apache.org/jira/browse/HDFS-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189717#comment-14189717 ] Vishnu Ganth commented on HDFS-7299: I was able to bring the namenode up by commenting out the following lines in org.apache.hadoop.hdfs.protocol.Block.java: if (numBytes < 0) { throw new IOException("Unexpected block size: " + numBytes); } But I am not sure how numBytes ended up with a negative value in the fsimage. [~huLiu] > Hadoop Namenode failing because of negative value in fsimage > > > Key: HDFS-7299 > URL: https://issues.apache.org/jira/browse/HDFS-7299 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Vishnu Ganth > > Hadoop Namenode is failing because of an unexpected block size value in the > fsimage. > Stack trace: > {code} > 2014-10-27 16:22:12,107 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: > STARTUP_MSG: > / > STARTUP_MSG: Starting NameNode > STARTUP_MSG: host = / > STARTUP_MSG: args = [] > STARTUP_MSG: version = 2.0.0-cdh4.4.0 > STARTUP_MSG: classpath = > 
> (long CDH classpath listing truncated)
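The guard commented out above can be sketched in isolation. This is a hypothetical, illustrative stand-in for the check in org.apache.hadoop.hdfs.protocol.Block, not the actual Hadoop source; it shows why disabling it merely masks fsimage corruption rather than fixing it, since a legal block length can never be negative.

```java
import java.io.IOException;

// Hypothetical stand-in for the sanity check in
// org.apache.hadoop.hdfs.protocol.Block; illustrative only.
public class BlockSizeCheck {
    // A negative numBytes read from an fsimage indicates on-disk
    // corruption. Removing this guard lets the NameNode start, but
    // leaves the corrupt entry in place.
    public static long validateNumBytes(long numBytes) throws IOException {
        if (numBytes < 0) {
            throw new IOException("Unexpected block size: " + numBytes);
        }
        return numBytes;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(validateNumBytes(1024L)); // prints 1024
        try {
            validateNumBytes(-1L);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A safer route than disabling the check would be inspecting the fsimage for the corrupt entry before starting the NameNode.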
[jira] [Commented] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189702#comment-14189702 ] Hadoop QA commented on HDFS-6385: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678111/HDFS-6385.2.patch against trunk revision 0126cf1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8596//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8596//console This message is automatically generated. > Show when block deletion will start after NameNode startup in WebUI > --- > > Key: HDFS-6385 > URL: https://issues.apache.org/jira/browse/HDFS-6385 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao >Assignee: Chris Nauroth > Attachments: HDFS-6385.1.patch, HDFS-6385.2.patch, HDFS-6385.png > > > HDFS-6186 provides functionality to delay block deletion for a period of time > after NameNode startup. 
Currently we only show the number of pending block > deletions in WebUI. We should also show when the block deletion will start in > WebUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7035) Make adding volume an atomic operation.
[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189703#comment-14189703 ] Hadoop QA commented on HDFS-7035: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678104/HDFS-7035.014.patch against trunk revision 2a6be65. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1266 javac compiler warnings (more than the trunk's current 1265 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestAllowFormat org.apache.hadoop.hdfs.server.namenode.TestCheckPointForSecurityTokens org.apache.hadoop.hdfs.server.datanode.TestRefreshNamenodes org.apache.hadoop.hdfs.TestEncryptedTransfer org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotMetrics org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream org.apache.hadoop.hdfs.TestSnapshotCommands org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshottableDirListing org.apache.hadoop.hdfs.TestRead org.apache.hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots org.apache.hadoop.hdfs.TestBlocksScheduledCounter org.apache.hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate org.apache.hadoop.hdfs.TestDFSPermission org.apache.hadoop.hdfs.server.namenode.TestCheckpoint org.apache.hadoop.hdfs.server.namenode.TestStartup org.apache.hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics org.apache.hadoop.hdfs.server.namenode.TestFSImageWithXAttr org.apache.hadoop.hdfs.TestDFSClientFailover org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestRbwSpaceReservation org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks org.apache.hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality org.apache.hadoop.hdfs.TestLeaseRecovery2 org.apache.hadoop.hdfs.server.namenode.TestFSImageWithAcl org.apache.hadoop.hdfs.TestWriteConfigurationToDFS org.apache.hadoop.hdfs.server.namenode.TestFSEditLogLoader org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCacheRevocation org.apache.hadoop.hdfs.web.TestHttpsFileSystem org.apache.hadoop.hdfs.server.namenode.TestFSDirectory org.apache.hadoop.hdfs.server.datanode.TestIncrementalBlockReports org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestInterDatanodeProtocol org.apache.hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles 
org.apache.hadoop.hdfs.TestDFSOutputStream org.apache.hadoop.hdfs.TestSetTimes org.apache.hadoop.hdfs.server.blockmanagement.TestHeartbeatHandling org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS org.apache.hadoop.hdfs.TestDatanodeDeath org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode org.apache.hadoop.hdfs.TestDFSRollback org.apache.hadoop.hdfs.TestClientBlockVerification org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache org.apache.hadoop.hdfs.server.datanode.TestDataNodeExit org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache org.apache.hadoop.hdfs.server.datanode.TestBlockHasMultipleReplicasOnSameDN org.apache.hadoop.hdfs.TestFileCreationE
[jira] [Commented] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189650#comment-14189650 ] Jing Zhao commented on HDFS-6385: - +1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189630#comment-14189630 ] Haohui Mai commented on HDFS-6385: -- +1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HDFS-7263) Snapshot read can reveal future bytes for appended files.
[ https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189574#comment-14189574 ] Konstantin Shvachko edited comment on HDFS-7263 at 10/30/14 3:40 AM: - I just committed this. Thank you Tao. was (Author: shv): I jsut committed this. Thank you Tao. > Snapshot read can reveal future bytes for appended files. > - > > Key: HDFS-7263 > URL: https://issues.apache.org/jira/browse/HDFS-7263 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Konstantin Shvachko >Assignee: Tao Luo > Fix For: 2.7.0 > > Attachments: HDFS-7263.patch, HDFS-7263.patch, HDFS-7263.patch, > TestSnapshotRead.java > > > The following sequence of steps will produce extra bytes, that should not be > visible, because they are not in the snapshot. > * Create a file of size L, where {{L % blockSize != 0}}. > * Create a snapshot > * Append bytes to the file > * Read file in the snapshot (not the current file) > * You will see that bytes are read beyond the original file size L -- This message was sent by Atlassian JIRA (v6.3.4#6332)
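The steps in the quoted description show appended bytes leaking through a snapshot read. Conceptually, any fix has to clamp snapshot reads to the file length recorded at snapshot time. The sketch below is an illustrative model of that clamping, not the actual HDFS client code:

```java
// Illustrative model (not the HDFS implementation) of the fix: reads
// through a snapshot are clamped to the file length recorded when the
// snapshot was taken, so bytes appended afterwards can never leak out.
public class SnapshotReadModel {
    public static int clampReadLength(long snapshotLength, long position, int requested) {
        long remaining = snapshotLength - position;
        if (remaining <= 0) {
            return 0; // at or past the snapshot's end-of-file
        }
        return (int) Math.min(remaining, requested); // never read "future" bytes
    }

    public static void main(String[] args) {
        // File was 100 bytes at snapshot time; more bytes were appended later.
        System.out.println(clampReadLength(100, 90, 64));  // prints 10
        System.out.println(clampReadLength(100, 100, 64)); // prints 0
    }
}
```

The {{L % blockSize != 0}} condition in the reproduction matters because the leak occurs within the partially filled last block, where the on-disk block holds more bytes than the snapshot-time length.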
[jira] [Commented] (HDFS-7263) Snapshot read can reveal future bytes for appended files.
[ https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189577#comment-14189577 ] Hudson commented on HDFS-7263: -- FAILURE: Integrated in Hadoop-trunk-Commit #6391 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6391/]) HDFS-7263. Snapshot read can reveal future bytes for appended files. Contributed by Tao Luo. (shv: rev 0126cf16b73843da2e504b6a03fee8bd93a404d5) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotFileLength.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7263) Snapshot read can reveal future bytes for appended files.
[ https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-7263: -- Resolution: Fixed Fix Version/s: 2.7.0 Status: Resolved (was: Patch Available) I just committed this. Thank you Tao. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6385: Attachment: HDFS-6385.2.patch Here is patch v2. This fixes the test by parsing the return value of {{FSNamesystem#getNNStarted}} to determine start time. [~jingzhao], are you still +1 for this version of the patch? bq. I wonder, why the information needs to be exported on both metrics and JMX? Thanks for reviewing, Haohui. I was aiming for consistency with PendingDeletionBlocks, but it's not really necessary. I removed the {{Metric}} annotation in this version of the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
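The parsing approach mentioned above can be sketched as follows. The comment only says the return value of {{FSNamesystem#getNNStarted}} is parsed to recover the start time; the {{Date#toString}}-style format assumed below is an illustration, not confirmed by the patch:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hedged sketch of turning a NameNode start-time string back into a
// millisecond timestamp. The date format here is an assumption made
// for illustration.
public class StartTimeParse {
    public static long parseStartTime(String nnStarted) throws ParseException {
        // Locale.US keeps "Wed"/"Oct" parseable regardless of system locale.
        SimpleDateFormat fmt =
            new SimpleDateFormat("EEE MMM dd HH:mm:ss zzz yyyy", Locale.US);
        return fmt.parse(nnStarted).getTime();
    }

    public static void main(String[] args) throws ParseException {
        long millis = parseStartTime("Wed Oct 29 16:22:12 PDT 2014");
        System.out.println(millis > 0); // prints true
    }
}
```

Once the start timestamp is known, the delayed-deletion start time is just that timestamp plus the configured block deletion delay.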
[jira] [Updated] (HDFS-7263) Snapshot read can reveal future bytes for appended files.
[ https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-7263: -- Summary: Snapshot read can reveal future bytes for appended files. (was: Snapshot read of an appended file returns more bytes than the file length.) +1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189554#comment-14189554 ] Hadoop QA commented on HDFS-7276: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678065/h7276_20141029b.patch against trunk revision 6f5f604. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDFSZKFailoverController org.apache.hadoop.hdfs.server.namenode.ha.TestBootstrapStandby {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8593//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8593//console This message is automatically generated. 
> Limit the number of byte arrays used by DFSOutputStream > --- > > Key: HDFS-7276 > URL: https://issues.apache.org/jira/browse/HDFS-7276 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h7276_20141021.patch, h7276_20141022.patch, > h7276_20141023.patch, h7276_20141024.patch, h7276_20141027.patch, > h7276_20141027b.patch, h7276_20141028.patch, h7276_20141029.patch, > h7276_20141029b.patch > > > When there are a lot of DFSOutputStreams writing concurrently, the number of > outstanding packets could be large. The byte arrays created by those packets > could occupy a lot of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
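The memory problem the issue describes — many concurrent writers each holding packet buffers — is typically solved by bounding outstanding allocations. The sketch below illustrates that back-pressure idea with a semaphore; it is a simplified stand-in, not the actual ByteArrayManager from the patch:

```java
import java.util.concurrent.Semaphore;

// Simplified illustration of bounding byte-array allocations: once the
// cap is reached, writers block until an array is returned, so packet
// buffers cannot grow the heap without bound.
public class BoundedArrayAllocator {
    private final Semaphore permits;

    public BoundedArrayAllocator(int maxArrays) {
        permits = new Semaphore(maxArrays);
    }

    // Blocks once maxArrays allocations are outstanding, applying
    // back-pressure to fast writers.
    public byte[] allocate(int size) throws InterruptedException {
        permits.acquire();
        return new byte[size];
    }

    public void release(byte[] array) {
        permits.release();
    }

    public int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedArrayAllocator alloc = new BoundedArrayAllocator(2);
        byte[] a = alloc.allocate(64);
        alloc.allocate(64);
        System.out.println(alloc.available()); // prints 0
        alloc.release(a);
        System.out.println(alloc.available()); // prints 1
    }
}
```

A production design would also recycle the released arrays rather than rely on the garbage collector, but the bounding alone already caps peak memory.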
[jira] [Commented] (HDFS-7035) Make adding volume an atomic operation.
[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189553#comment-14189553 ] Hadoop QA commented on HDFS-7035: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678067/HDFS-7035.013.patch against trunk revision 6f5f604. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1272 javac compiler warnings (more than the trunk's current 1267 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestBootstrapStandby {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8594//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8594//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8594//console This message is automatically generated. > Make adding volume an atomic operation. 
> --- > > Key: HDFS-7035 > URL: https://issues.apache.org/jira/browse/HDFS-7035 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.5.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7035.000.combo.patch, HDFS-7035.000.patch, > HDFS-7035.001.combo.patch, HDFS-7035.001.patch, HDFS-7035.002.patch, > HDFS-7035.003.patch, HDFS-7035.003.patch, HDFS-7035.004.patch, > HDFS-7035.005.patch, HDFS-7035.007.patch, HDFS-7035.008.patch, > HDFS-7035.009.patch, HDFS-7035.010.patch, HDFS-7035.010.patch, > HDFS-7035.011.patch, HDFS-7035.012.patch, HDFS-7035.013.patch, > HDFS-7035.014.patch > > > It refactors {{DataStorage}} and {{BlockPoolSliceStorage}} to reduce the > duplicate code and supports atomic adding volume operations. Also it > parallels loading data volume operation: each thread loads one volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7017) Implement OutputStream for libhdfs3
[ https://issues.apache.org/jira/browse/HDFS-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189546#comment-14189546 ] Zhanwei Wang commented on HDFS-7017: Hi [~wheat9] and [~cmccabe] Any comments on this patch? > Implement OutputStream for libhdfs3 > --- > > Key: HDFS-7017 > URL: https://issues.apache.org/jira/browse/HDFS-7017 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: Zhanwei Wang >Assignee: Zhanwei Wang > Attachments: HDFS-7017-pnative.002.patch, HDFS-7017.patch > > > Implement pipeline and OutputStream C++ interface -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7035) Make adding volume an atomic operation.
[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-7035: Attachment: HDFS-7035.014.patch [~cmccabe] I made changes based on your comments. Thanks! bq. this isn't needed because VolumeBuilder is in the same Java package as Storage. This function is {{BlockPoolSliceStorage#addStorageDir}}, which is called by {{DataStorage#VolumeBuilder}}. I could not use {{protected}} or package-private visibility here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7173) Only keep successfully loaded volumes in the configuration.
[ https://issues.apache.org/jira/browse/HDFS-7173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-7173: Resolution: Won't Fix Status: Resolved (was: Patch Available) The changes have been merged into HDFS-7035 and reviewed there. > Only keep successfully loaded volumes in the configuration. > --- > > Key: HDFS-7173 > URL: https://issues.apache.org/jira/browse/HDFS-7173 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 3.0.0, 2.6.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7173.000.combo.patch, HDFS-7173.000.patch, > HDFS-7173.001.combo.patch, HDFS-7173.001.patch, HDFS-7173.002.combo.patch, > HDFS-7173.002.patch, HDFS-7173.003.combo.patch, HDFS-7173.003.patch > > > Hot swapping data volumes might fail. The user should be able to fix the > failed volumes and disks, then ask the {{DataNode}} to retry the previously > failed volumes. > To attempt to reload a failed volume on the same directory, the failed > directory must not be present in the {{Configuration}} object that the > {{DataNode}} holds. Therefore, it should only put successfully loaded volumes > into the {{Configuration}} object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7263) Snapshot read of an appended file returns more bytes than the file length.
[ https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189493#comment-14189493 ] Hadoop QA commented on HDFS-7263: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678051/HDFS-7263.patch against trunk revision 3ae84e1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8592//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8592//console This message is automatically generated. > Snapshot read of an appended file returns more bytes than the file length. > -- > > Key: HDFS-7263 > URL: https://issues.apache.org/jira/browse/HDFS-7263 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Konstantin Shvachko >Assignee: Tao Luo > Attachments: HDFS-7263.patch, HDFS-7263.patch, HDFS-7263.patch, > TestSnapshotRead.java > > > The following sequence of steps will produce extra bytes, that should not be > visible, because they are not in the snapshot. 
> * Create a file of size L, where {{L % blockSize != 0}}. > * Create a snapshot > * Append bytes to the file > * Read file in the snapshot (not the current file) > * You will see that bytes are read beyond the original file size L -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7035) Make adding volume an atomic operation.
[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189469#comment-14189469 ] Colin Patrick McCabe commented on HDFS-7035: {{Storage.java}}: remove unnecessary whitespace change {code} 129 // Expose visibility for VolumeBuilder#commit(). 130 public void addStorageDir(StorageDirectory sd) { 131 super.addStorageDir(sd); 132 } {code} This isn't needed because VolumeBuilder is in the same Java package as Storage, so it can just call the parent method directly. {code} /** The unchanged locations that exist in the old configuration. */ {code} Should be "existed in the old configuration" {code} builder.addBpStorageDirtectories( {code} Should be "directories" not "dirtectories" {code} 167 // 2. Do transitions 168 // Each storage directory is treated individually. 169 // During startup some of them can upgrade or roll back 170 // while others could be up-to-date for the regular startup. 171 doTransition(datanode, sd, nsInfo, startOpt); 172 assert getCTime() == nsInfo.getCTime() 173 : "Data-node and name-node CTimes must be the same."; {code} This should be throwing an IOE, not an assert. Otherwise we're bringing down the DataNode because someone tried to add a storage directory that wasn't valid... not good. {code} 327 // bpStorage does not add loaded volume immediately. The volume will be 328 // added when calling builder.build() later. However, several 329 // members (e.g., Storage#layoutVersion, Storage#cTime will be updated 330 // in BlockPoolSliceStorage#format() and 331 // BlockPoolSliceStorage#loadStorageDirectory. But since these values are 332 // considered constant during the DataNode execution, we do not revert the 333 // the changes on such members. {code} I think this comment belongs in the JavaDoc for the function. I also feel like the current form of the comment is somewhat confusing. 
I would say something like "prepareVolume creates a builder which can be used to add to the volume. If the volume cannot be added, it is OK to discard the builder later." removeVolumes: can you document in the JavaDoc for this function that even when the IOE is thrown, the volumes are still removed? +1 once these are addressed. > Make adding volume an atomic operation. > --- > > Key: HDFS-7035 > URL: https://issues.apache.org/jira/browse/HDFS-7035 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.5.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7035.000.combo.patch, HDFS-7035.000.patch, > HDFS-7035.001.combo.patch, HDFS-7035.001.patch, HDFS-7035.002.patch, > HDFS-7035.003.patch, HDFS-7035.003.patch, HDFS-7035.004.patch, > HDFS-7035.005.patch, HDFS-7035.007.patch, HDFS-7035.008.patch, > HDFS-7035.009.patch, HDFS-7035.010.patch, HDFS-7035.010.patch, > HDFS-7035.011.patch, HDFS-7035.012.patch, HDFS-7035.013.patch > > > It refactors {{DataStorage}} and {{BlockPoolSliceStorage}} to reduce the > duplicate code and supports atomic adding volume operations. Also it > parallels loading data volume operation: each thread loads one volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
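The assert-vs-IOE point in the review can be sketched as follows (hypothetical method and parameter names; the real change lives in {{DataStorage#doTransition}}'s caller): an explicit check that throws IOException fails the add-volume request cleanly instead of taking down the DataNode, and it also fires when JVM assertions are disabled, which is the default.

```java
import java.io.IOException;

// Sketch of the suggested change: validate with an explicit check instead of
// an assert, so a bad storage directory fails the request, not the DataNode.
class CTimeCheck {
    static void checkCTime(long dataNodeCTime, long nameNodeCTime) throws IOException {
        if (dataNodeCTime != nameNodeCTime) {
            throw new IOException("Data-node and name-node CTimes must be the same: "
                + dataNodeCTime + " != " + nameNodeCTime);
        }
    }
}
```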
[jira] [Commented] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189431#comment-14189431 ] Hadoop QA commented on HDFS-7276: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678023/h7276_20141029.patch against trunk revision d33e07d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.util.TestByteArrayManager {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8591//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8591//console This message is automatically generated. 
> Limit the number of byte arrays used by DFSOutputStream > --- > > Key: HDFS-7276 > URL: https://issues.apache.org/jira/browse/HDFS-7276 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h7276_20141021.patch, h7276_20141022.patch, > h7276_20141023.patch, h7276_20141024.patch, h7276_20141027.patch, > h7276_20141027b.patch, h7276_20141028.patch, h7276_20141029.patch, > h7276_20141029b.patch > > > When there are a lot of DFSOutputStream's writing concurrently, the number of > outstanding packets could be large. The byte arrays created by those packets > could occupy a lot of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
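The improvement described above can be sketched with a small bounded pool (illustrative only; the actual HDFS-7276 {{ByteArrayManager}} is more elaborate): once a configured number of arrays is outstanding, further allocations are refused, bounding the total packet-buffer memory across concurrent writers.

```java
import java.util.ArrayDeque;

// Illustrative sketch, not the HDFS-7276 implementation: cap how many byte
// arrays may be outstanding at once, and recycle released arrays.
class BoundedArrayPool {
    private final int capacity;   // max arrays out at once
    private final int arraySize;
    private int outstanding = 0;
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();

    BoundedArrayPool(int capacity, int arraySize) {
        this.capacity = capacity;
        this.arraySize = arraySize;
    }

    // Returns null when the limit is reached; a production version would
    // block the writer instead, which is what bounds memory under load.
    synchronized byte[] allocate() {
        if (outstanding >= capacity) {
            return null;
        }
        outstanding++;
        byte[] a = free.poll();
        return (a != null) ? a : new byte[arraySize];
    }

    synchronized void release(byte[] a) {
        outstanding--;
        free.push(a);
    }
}
```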
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189430#comment-14189430 ] Vinod Kumar Vavilapalli commented on HDFS-7295: --- bq. Vinod, We're probably not on the same wavelength. I agree with all that you said about keytabs being the solution for services. But I'm trying to find a solution for apps that are started by regular users. There are no keytabs here. We are. I am saying that the services we are bringing to YARN are the same services that exist today outside of YARN. And they have keytabs. I am not sure how Spark Streaming works today in a secure cluster outside of YARN without any access to keytabs. > Support arbitrary max expiration times for delegation token > --- > > Key: HDFS-7295 > URL: https://issues.apache.org/jira/browse/HDFS-7295 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > > Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. > This is a problem for different users of HDFS such as long running YARN apps. > Users should be allowed to optionally specify max lifetime for their tokens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7035) Make adding volume an atomic operation.
[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189421#comment-14189421 ] Hadoop QA commented on HDFS-7035: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12678017/HDFS-7035.012.patch against trunk revision d33e07d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1268 javac compiler warnings (more than the trunk's current 1267 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8590//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8590//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8590//console This message is automatically generated. > Make adding volume an atomic operation. 
> --- > > Key: HDFS-7035 > URL: https://issues.apache.org/jira/browse/HDFS-7035 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.5.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7035.000.combo.patch, HDFS-7035.000.patch, > HDFS-7035.001.combo.patch, HDFS-7035.001.patch, HDFS-7035.002.patch, > HDFS-7035.003.patch, HDFS-7035.003.patch, HDFS-7035.004.patch, > HDFS-7035.005.patch, HDFS-7035.007.patch, HDFS-7035.008.patch, > HDFS-7035.009.patch, HDFS-7035.010.patch, HDFS-7035.010.patch, > HDFS-7035.011.patch, HDFS-7035.012.patch, HDFS-7035.013.patch > > > It refactors {{DataStorage}} and {{BlockPoolSliceStorage}} to reduce the > duplicate code and supports atomic adding volume operations. Also it > parallels loading data volume operation: each thread loads one volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-4882) Namenode LeaseManager checkLeases() runs into infinite loop
[ https://issues.apache.org/jira/browse/HDFS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189396#comment-14189396 ] Hadoop QA commented on HDFS-4882: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12586700/4882.patch against trunk revision d33e07d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestDeleteRace {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8589//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8589//console This message is automatically generated. 
> Namenode LeaseManager checkLeases() runs into infinite loop > --- > > Key: HDFS-4882 > URL: https://issues.apache.org/jira/browse/HDFS-4882 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client, namenode >Affects Versions: 2.0.0-alpha >Reporter: Zesheng Wu > Attachments: 4882.1.patch, 4882.patch, 4882.patch > > > Scenario: > 1. cluster with 4 DNs > 2. the size of the file to be written is a little more than one block > 3. write the first block to 3 DNs, DN1->DN2->DN3 > 4. all the data packets of the first block are successfully acked and the client > sets the pipeline stage to PIPELINE_CLOSE, but the last packet isn't sent out > 5. DN2 and DN3 are down > 6. the client recovers the pipeline, but no new DN is added to the pipeline > because the current pipeline stage is PIPELINE_CLOSE > 7. the client continues writing the last block, and tries to close the file after > writing all the data > 8. NN finds that the penultimate block doesn't have enough replicas (our > dfs.namenode.replication.min=2), the client's close runs into an indefinite > loop (HDFS-2936), and at the same time, NN sets the last block's state to > COMPLETE > 9. shut down the client > 10. the file's lease exceeds the hard limit > 11. LeaseManager realizes that and begins to do lease recovery by calling > fsnamesystem.internalReleaseLease() > 12. but the last block's state is COMPLETE, and this triggers the lease manager's > infinite loop and prints massive logs like this: > {noformat} > 2013-06-05,17:42:25,695 INFO > org.apache.hadoop.hdfs.server.namenode.LeaseManager: Lease [Lease. Holder: > DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1] has expired hard > limit > 2013-06-05,17:42:25,695 INFO > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease.
> Holder: DFSClient_NONMAPREDUCE_-1252656407_1, pendingcreates: 1], src= > /user/h_wuzesheng/test.dat > 2013-06-05,17:42:25,695 WARN org.apache.hadoop.hdfs.StateChange: DIR* > NameSystem.internalReleaseLease: File = /user/h_wuzesheng/test.dat, block > blk_-7028017402720175688_1202597, > lastBLockState=COMPLETE > 2013-06-05,17:42:25,695 INFO > org.apache.hadoop.hdfs.server.namenode.LeaseManager: Started block recovery > for file /user/h_wuzesheng/test.dat lease [Lease. Holder: DFSClient_NONM > APREDUCE_-1252656407_1, pendingcreates: 1] > {noformat} > (the 3rd line log is a debug log added by us) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
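The infinite loop in step 12 comes down to a missing progress check, sketched here in plain Java (illustrative names, not the actual {{LeaseManager}} code): the scan over expired leases must stop when a recovery attempt cannot advance the oldest lease, instead of re-examining the same lease forever.

```java
import java.util.Queue;
import java.util.function.Predicate;

// Illustrative sketch of a progress guard for a checkLeases()-style scan.
class LeaseScanDemo {
    // Returns how many expired leases were actually recovered.
    static int checkLeases(Queue<String> expired, Predicate<String> tryRecover) {
        int recovered = 0;
        while (!expired.isEmpty()) {
            String oldest = expired.peek();
            if (!tryRecover.test(oldest)) {
                break;   // no progress possible: bail out instead of spinning
            }
            expired.poll();
            recovered++;
        }
        return recovered;
    }
}
```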
[jira] [Commented] (HDFS-7225) Failed DataNode lookup can crash NameNode with NullPointerException
[ https://issues.apache.org/jira/browse/HDFS-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189369#comment-14189369 ] Zhe Zhang commented on HDFS-7225: - Thanks [~jingzhao] for the advice. If that's the case we should indeed remove the block invalidation tasks once a new storage UUID has been discovered. I'll submit an updated patch. > Failed DataNode lookup can crash NameNode with NullPointerException > --- > > Key: HDFS-7225 > URL: https://issues.apache.org/jira/browse/HDFS-7225 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-7225-v1.patch, HDFS-7225-v2.patch > > > {{BlockManager#invalidateWorkForOneNode}} looks up a DataNode by the > {{datanodeUuid}} and passes the resultant {{DatanodeDescriptor}} to > {{InvalidateBlocks#invalidateWork}}. However, if a wrong or outdated > {{datanodeUuid}} is used, a null pointer will be passed to {{invalidateWork}} > which will use it to lookup in a {{TreeMap}}. Since the key type is > {{DatanodeDescriptor}}, key comparison is based on the IP address. A null key > will crash the NameNode with an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
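The NPE mechanics described above can be shown with plain collections (illustrative, not the {{BlockManager}} code): a {{TreeMap}} ordered by comparison throws NullPointerException on a null-key lookup, so the result of the DataNode lookup must be null-checked before it is used as a map key.

```java
import java.util.TreeMap;

// Illustrative sketch: guard a lookup result before using it as a TreeMap key.
class NullKeyGuardDemo {
    static boolean safeContains(TreeMap<String, Integer> map, String key) {
        if (key == null) {
            return false;   // a null key would throw NPE inside the TreeMap
        }
        return map.containsKey(key);
    }
}
```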
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189366#comment-14189366 ] Zhe Zhang commented on HDFS-7285: - A meeting has been scheduled: * When: Friday Oct. 31st 10am~12pm * Where: Cloudera Headquarters, 1001 Page Mill Road, Palo Alto. Both the lobby (for guest check-in) and the meeting room (Hadoop) are in building #2 * URL: https://cloudera.webex.com/cloudera/j.php?MTID=me26394d0a3559c7a9498f18ad7de8962 * Call-in: 1-650-479-3208 (US/Canada) with access code: 290 472 605. Please drop me a note (zhezh...@cloudera.com) if you prefer a different time. Thanks [~drankye] for the suggestion. The interface of the erasure coding feature potentially has a close relationship with HSM (HDFS-2832) and archival storage (HDFS-6584). We'll make sure to cover this topic in the meeting and share the summary here. > Erasure Coding Support inside HDFS > -- > > Key: HDFS-7285 > URL: https://issues.apache.org/jira/browse/HDFS-7285 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Weihua Jiang >Assignee: Zhe Zhang > Attachments: HDFSErasureCodingDesign-20141028.pdf > > > Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing > data reliability, compared to the existing HDFS 3-replica approach. For > example, if we use a 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, > with the storage overhead being only 40%. This makes EC a quite attractive > alternative for big data storage, particularly for cold data. > Facebook had a related open source project called HDFS-RAID. It used to be > one of the contributed packages in HDFS but was removed in Hadoop 2.0 > for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends > on MapReduce to do encoding and decoding tasks; 2) it can only be used for > cold files that are not intended to be appended to anymore; 3) the pure Java EC > coding implementation is extremely slow in practical use.
Due to these, it > might not be a good idea to just bring HDFS-RAID back. > We (Intel and Cloudera) are working on a design to build EC into HDFS that > gets rid of any external dependencies, making it self-contained and > independently maintained. This design builds the EC feature on top of the storage-type > support and is intended to be compatible with existing HDFS features such as caching, > snapshots, encryption, and high availability. This design will also > support different EC coding schemes, implementations, and policies for > different deployment scenarios. By utilizing advanced libraries (e.g. the Intel > ISA-L library), an implementation can greatly improve the performance of EC > encoding/decoding and make the EC solution even more attractive. We will > post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
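The overhead figures quoted in the description are simple arithmetic; a tiny worked check (names are ours, not HDFS code): a (10, 4) Reed-Solomon scheme stores 4 parity blocks per 10 data blocks, versus 2 extra full copies per block under 3-way replication.

```java
// Worked check of the storage-overhead comparison in the description.
class EcOverheadDemo {
    static double ecOverhead(int dataBlocks, int parityBlocks) {
        return (double) parityBlocks / dataBlocks;  // extra storage / useful data
    }

    static double replicationOverhead(int replicas) {
        return replicas - 1.0;                      // extra full copies per block
    }
}
```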
[jira] [Updated] (HDFS-7281) Missing block is marked as corrupted block
[ https://issues.apache.org/jira/browse/HDFS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongjun Zhang updated HDFS-7281: Labels: supportability (was: ) > Missing block is marked as corrupted block > -- > > Key: HDFS-7281 > URL: https://issues.apache.org/jira/browse/HDFS-7281 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Labels: supportability > Attachments: HDFS-7281-2.patch, HDFS-7281.patch > > > In the situation where the block lost all its replicas, fsck shows the block > is missing as well as corrupted. Perhaps it is better not to mark the block > corrupted in this case. The reason it is marked as corrupted is > numCorruptNodes == numNodes == 0 in the following code. > {noformat} > BlockManager > final boolean isCorrupt = numCorruptNodes == numNodes; > {noformat} > Would like to clarify if it is the intent to mark missing block as corrupted > or it is just a bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7281) Missing block is marked as corrupted block
[ https://issues.apache.org/jira/browse/HDFS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189343#comment-14189343 ] Yongjun Zhang commented on HDFS-7281: - Thanks [~mingma]. Hi [~atm], the latest patch looks good to me. I wonder if you would have time to do a review here? thanks. > Missing block is marked as corrupted block > -- > > Key: HDFS-7281 > URL: https://issues.apache.org/jira/browse/HDFS-7281 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-7281-2.patch, HDFS-7281.patch > > > In the situation where the block lost all its replicas, fsck shows the block > is missing as well as corrupted. Perhaps it is better not to mark the block > corrupted in this case. The reason it is marked as corrupted is > numCorruptNodes == numNodes == 0 in the following code. > {noformat} > BlockManager > final boolean isCorrupt = numCorruptNodes == numNodes; > {noformat} > Would like to clarify if it is the intent to mark missing block as corrupted > or it is just a bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
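The degenerate case in the quoted snippet is easy to see in isolation (a sketch of one possible fix direction, not the committed patch): with zero replicas, {{numCorruptNodes == numNodes}} is trivially {{0 == 0}}, so a missing block is reported as corrupt; requiring at least one known replica separates the two states.

```java
// Sketch only: distinguish "missing" (no replicas at all) from "corrupt"
// (every known replica is corrupt).
class CorruptCheckDemo {
    static boolean isCorrupt(int numCorruptNodes, int numNodes) {
        return numNodes > 0 && numCorruptNodes == numNodes;
    }
}
```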
[jira] [Commented] (HDFS-7199) DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O exception
[ https://issues.apache.org/jira/browse/HDFS-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189339#comment-14189339 ] Hadoop QA commented on HDFS-7199: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677996/HDFS-7199.patch against trunk revision d33e07d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8588//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8588//console This message is automatically generated. 
> DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O > exception > --- > > Key: HDFS-7199 > URL: https://issues.apache.org/jira/browse/HDFS-7199 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Rushabh S Shah >Priority: Critical > Attachments: HDFS-7199-WIP.patch, HDFS-7199.patch > > > If the DataStreamer thread encounters a non-I/O exception then it closes the > output stream but does not set lastException. When the client later calls > close on the output stream then it will see the stream is already closed with > lastException == null, mistakenly think this is a redundant close call, and > fail to report any error to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
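The failure mode in the description can be sketched in plain Java (illustrative names; the real fix is in {{DataStreamer}}): the streamer body must record any Throwable, not only IOExceptions, so that a later {{close()}} finds a non-null last exception and surfaces the error instead of treating the already-closed stream as a redundant close.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch of the fix: catch Throwable, not just IOException,
// so close() can report the failure to the caller.
class StreamerDemo {
    private final AtomicReference<Throwable> lastException = new AtomicReference<>();
    private volatile boolean streamerClosed = false;

    void runStreamer(Runnable work) {
        try {
            work.run();
        } catch (Throwable t) {                       // record *any* failure
            lastException.compareAndSet(null, t);
        } finally {
            streamerClosed = true;
        }
    }

    void close() throws IOException {
        Throwable t = lastException.get();
        if (streamerClosed && t != null) {
            throw new IOException("stream failed", t);  // report, don't swallow
        }
    }
}
```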
[jira] [Updated] (HDFS-7035) Make adding volume an atomic operation.
[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-7035: Attachment: HDFS-7035.013.patch Hi, [~cmccabe] Thanks for your quick reviews. I have updated the patch based on your comments. {{conf}} is now only updated at the end of {{refreshVolumes}}. bq. I am also curious what happens when we fail midway through. I can see that DataStorage#prepareVolume adds the volume to DataStorage#bpStorageMap, is there anywhere where we remove it if the addition fails? I have also added comments here. Basically, {{refreshVolume}} does not change the state of {{BlockPoolSliceStorage}} except for updating a few members with constant values. Would you take another look? Thanks! > Make adding volume an atomic operation. > --- > > Key: HDFS-7035 > URL: https://issues.apache.org/jira/browse/HDFS-7035 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.5.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7035.000.combo.patch, HDFS-7035.000.patch, > HDFS-7035.001.combo.patch, HDFS-7035.001.patch, HDFS-7035.002.patch, > HDFS-7035.003.patch, HDFS-7035.003.patch, HDFS-7035.004.patch, > HDFS-7035.005.patch, HDFS-7035.007.patch, HDFS-7035.008.patch, > HDFS-7035.009.patch, HDFS-7035.010.patch, HDFS-7035.010.patch, > HDFS-7035.011.patch, HDFS-7035.012.patch, HDFS-7035.013.patch > > > It refactors {{DataStorage}} and {{BlockPoolSliceStorage}} to reduce the > duplicate code and supports atomic adding volume operations. Also it > parallels loading data volume operation: each thread loads one volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7276: -- Attachment: h7276_20141029b.patch h7276_20141029b.patch: reverts PacketHeader and slightly changes some javadoc and error messages. > Limit the number of byte arrays used by DFSOutputStream > --- > > Key: HDFS-7276 > URL: https://issues.apache.org/jira/browse/HDFS-7276 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h7276_20141021.patch, h7276_20141022.patch, > h7276_20141023.patch, h7276_20141024.patch, h7276_20141027.patch, > h7276_20141027b.patch, h7276_20141028.patch, h7276_20141029.patch, > h7276_20141029b.patch > > > When there are a lot of DFSOutputStream's writing concurrently, the number of > outstanding packets could be large. The byte arrays created by those packets > could occupy a lot of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7225) Failed DataNode lookup can crash NameNode with NullPointerException
[ https://issues.apache.org/jira/browse/HDFS-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189314#comment-14189314 ] Jing Zhao commented on HDFS-7225: - Currently, if a reported block belongs to no file, the block will eventually be marked as invalid (though not during the first block report) and will eventually be deleted. > Failed DataNode lookup can crash NameNode with NullPointerException > --- > > Key: HDFS-7225 > URL: https://issues.apache.org/jira/browse/HDFS-7225 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-7225-v1.patch, HDFS-7225-v2.patch > > > {{BlockManager#invalidateWorkForOneNode}} looks up a DataNode by the > {{datanodeUuid}} and passes the resultant {{DatanodeDescriptor}} to > {{InvalidateBlocks#invalidateWork}}. However, if a wrong or outdated > {{datanodeUuid}} is used, a null pointer will be passed to {{invalidateWork}} > which will use it to lookup in a {{TreeMap}}. Since the key type is > {{DatanodeDescriptor}}, key comparison is based on the IP address. A null key > will crash the NameNode with an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189286#comment-14189286 ] Hudson commented on HDFS-7305: -- FAILURE: Integrated in Hadoop-trunk-Commit #6388 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6388/]) HDFS-7305. NPE seen in wbhdfs FS while running SLive. Contributed by Jing Zhao. (jing9: rev 6f5f604a798b545faf6fadc9b66c8a8995b354db) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > NPE seen in wbhdfs FS while running SLive > - > > Key: HDFS-7305 > URL: https://issues.apache.org/jira/browse/HDFS-7305 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.6.0 >Reporter: Arpit Gupta >Assignee: Jing Zhao >Priority: Minor > Fix For: 2.6.0 > > Attachments: HDFS-7305.000.patch > > > {code} > 2014-10-23 10:22:49,066 WARN [main] org.apache.hadoop.mapred.Task: Task > status: "Failed at running due to java.lang.NullPointerException > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:529) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:605) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:458) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:487) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:483) > at > 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem.append(WebHdfsFileSystem.java:1154) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1163) > at org.apache.hadoop.fs.slive.AppendOp.run(AppendOp.java:80) > at org.apache.hadoop.fs.slive.ObserveableOp.run(ObserveableOp.java:63) > at > org.apache.hadoop.fs.slive.SliveMapper.runOperation(SliveMapper.java:122) > at org.apache.hadoop.fs.slive.SliveMapper.map(SliveMapper.java:168) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > " truncated to max limit (512 characters) > Activity > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7305: Component/s: webhdfs > NPE seen in wbhdfs FS while running SLive > - > > Key: HDFS-7305 > URL: https://issues.apache.org/jira/browse/HDFS-7305 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.6.0 >Reporter: Arpit Gupta >Assignee: Jing Zhao >Priority: Minor > Fix For: 2.6.0 > > Attachments: HDFS-7305.000.patch > > > {code} > 2014-10-23 10:22:49,066 WARN [main] org.apache.hadoop.mapred.Task: Task > status: "Failed at running due to java.lang.NullPointerException > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:529) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:605) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:458) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:487) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:483) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.append(WebHdfsFileSystem.java:1154) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1163) > at org.apache.hadoop.fs.slive.AppendOp.run(AppendOp.java:80) > at org.apache.hadoop.fs.slive.ObserveableOp.run(ObserveableOp.java:63) > at > org.apache.hadoop.fs.slive.SliveMapper.runOperation(SliveMapper.java:122) > at 
org.apache.hadoop.fs.slive.SliveMapper.map(SliveMapper.java:168) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > " truncated to max limit (512 characters) > Activity > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7305: Resolution: Fixed Fix Version/s: 2.6.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Arpit for the report and Haohui for the review. I've committed this to trunk, branch-2 and branch-2.6. > NPE seen in wbhdfs FS while running SLive > - > > Key: HDFS-7305 > URL: https://issues.apache.org/jira/browse/HDFS-7305 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.6.0 >Reporter: Arpit Gupta >Assignee: Jing Zhao >Priority: Minor > Fix For: 2.6.0 > > Attachments: HDFS-7305.000.patch > > > {code} > 2014-10-23 10:22:49,066 WARN [main] org.apache.hadoop.mapred.Task: Task > status: "Failed at running due to java.lang.NullPointerException > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:529) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:605) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:458) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:487) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:483) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.append(WebHdfsFileSystem.java:1154) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1163) > at org.apache.hadoop.fs.slive.AppendOp.run(AppendOp.java:80) > at 
org.apache.hadoop.fs.slive.ObserveableOp.run(ObserveableOp.java:63) > at > org.apache.hadoop.fs.slive.SliveMapper.runOperation(SliveMapper.java:122) > at org.apache.hadoop.fs.slive.SliveMapper.map(SliveMapper.java:168) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > " truncated to max limit (512 characters) > Activity > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7225) Failed DataNode lookup can crash NameNode with NullPointerException
[ https://issues.apache.org/jira/browse/HDFS-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189281#comment-14189281 ] Zhe Zhang commented on HDFS-7225: - AFAICT, the NN won't try to delete orphan blocks. I verified with the following test: {code} public void testOrphanBlocks() throws IOException {
  DataNode dn = cluster.getDataNodes().get(0);
  DatanodeRegistration dnReg = dn.getDNRegistrationForBP(bpid);
  StorageBlockReport[] reports =
      new StorageBlockReport[cluster.getStoragesPerDatanode()];
  ArrayList<Block> blocks = new ArrayList<>();
  for (int i = 0; i < 10; i++) {
    blocks.add(new Block());
  }
  for (int i = 0; i < cluster.getStoragesPerDatanode(); ++i) {
    BlockListAsLongs bll = new BlockListAsLongs(blocks, null);
    FsVolumeSpi v = dn.getFSDataset().getVolumes().get(i);
    DatanodeStorage dns = new DatanodeStorage(v.getStorageID());
    reports[i] = new StorageBlockReport(dns, bll.getBlockListAsLongs());
  }
  cluster.getNameNodeRpc().blockReport(dnReg, bpid, reports);
  LOG.debug("Scheduling to delete "
      + cluster.getNameNode().getNamesystem().getBlockManager()
          .getPendingDeletionBlocksCount() + " blocks");
}
{code} I wonder whether it is the intended behavior for the NN to keep orphan blocks, or whether we should add logic to delete them. [~andrew.wang] Do you have a clue? > Failed DataNode lookup can crash NameNode with NullPointerException > --- > > Key: HDFS-7225 > URL: https://issues.apache.org/jira/browse/HDFS-7225 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.6.0 >Reporter: Zhe Zhang >Assignee: Zhe Zhang > Attachments: HDFS-7225-v1.patch, HDFS-7225-v2.patch > > > {{BlockManager#invalidateWorkForOneNode}} looks up a DataNode by the > {{datanodeUuid}} and passes the resultant {{DatanodeDescriptor}} to > {{InvalidateBlocks#invalidateWork}}. However, if a wrong or outdated > {{datanodeUuid}} is used, a null pointer will be passed to {{invalidateWork}} > which will use it as a lookup key in a {{TreeMap}}. 
Since the key type is > {{DatanodeDescriptor}}, key comparison is based on the IP address. A null key > will crash the NameNode with an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
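The crash mode described above, a null key handed to a {{TreeMap}} lookup, can be reproduced in isolation. The following is a minimal plain-Java sketch, not Hadoop code: under natural ordering, {{TreeMap#get}} invokes {{compareTo}} against the key, so a null key throws a NullPointerException as soon as the map is non-empty.

```java
import java.util.TreeMap;

public class NullKeyLookup {
    // With natural ordering, TreeMap.get(null) ends up calling compareTo
    // on the null key and throws NullPointerException, mirroring the NN
    // crash when a failed DataNode lookup returns null.
    public static boolean lookupWithNullKeyThrows() {
        TreeMap<String, Integer> nodes = new TreeMap<>();
        nodes.put("datanode-uuid-1", 1);  // hypothetical entry
        try {
            nodes.get(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        assert lookupWithNullKeyThrows();
    }
}
```

The fix direction suggested by the patches is simply to null-check the descriptor before it ever reaches the map.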
[jira] [Commented] (HDFS-7281) Missing block is marked as corrupted block
[ https://issues.apache.org/jira/browse/HDFS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189241#comment-14189241 ] Ming Ma commented on HDFS-7281: --- Thanks, Yongjun. HADOOP-11045 is useful. Both TestEncryptionZonesWithHA and TestLeaseRecovery2 pass in a local run. > Missing block is marked as corrupted block > -- > > Key: HDFS-7281 > URL: https://issues.apache.org/jira/browse/HDFS-7281 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-7281-2.patch, HDFS-7281.patch > > > In the situation where the block lost all its replicas, fsck shows the block > is missing as well as corrupted. Perhaps it is better not to mark the block > corrupted in this case. The reason it is marked as corrupted is > numCorruptNodes == numNodes == 0 in the following code. > {noformat} > BlockManager > final boolean isCorrupt = numCorruptNodes == numNodes; > {noformat} > Would like to clarify if it is the intent to mark missing block as corrupted > or it is just a bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
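The one-line check quoted in the description can be modeled outside Hadoop to show why a block with zero replicas reads as corrupt: with {{numCorruptNodes == numNodes == 0}}, the equality holds vacuously. The guarded variant below is an illustration of one possible fix, not the actual HDFS-7281 patch.

```java
public class CorruptCheck {
    // The check as quoted from BlockManager: vacuously true when there are
    // no replicas at all (0 == 0), so a *missing* block is flagged corrupt.
    public static boolean flawedIsCorrupt(int numCorruptNodes, int numNodes) {
        return numCorruptNodes == numNodes;
    }

    // Illustrative guard: only call a block corrupt when replicas exist.
    public static boolean guardedIsCorrupt(int numCorruptNodes, int numNodes) {
        return numNodes > 0 && numCorruptNodes == numNodes;
    }

    public static void main(String[] args) {
        assert flawedIsCorrupt(0, 0);    // missing block reported as corrupt
        assert !guardedIsCorrupt(0, 0);  // missing, but not corrupt
        assert guardedIsCorrupt(2, 2);   // every live replica really is corrupt
    }
}
```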
[jira] [Commented] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189243#comment-14189243 ] Hadoop QA commented on HDFS-7305: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677978/HDFS-7305.000.patch against trunk revision d33e07d. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8587//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8587//console This message is automatically generated. 
> NPE seen in wbhdfs FS while running SLive > - > > Key: HDFS-7305 > URL: https://issues.apache.org/jira/browse/HDFS-7305 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Arpit Gupta >Assignee: Jing Zhao >Priority: Minor > Attachments: HDFS-7305.000.patch > > > {code} > 2014-10-23 10:22:49,066 WARN [main] org.apache.hadoop.mapred.Task: Task > status: "Failed at running due to java.lang.NullPointerException > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:529) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:605) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:458) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:487) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:483) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.append(WebHdfsFileSystem.java:1154) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1163) > at org.apache.hadoop.fs.slive.AppendOp.run(AppendOp.java:80) > at org.apache.hadoop.fs.slive.ObserveableOp.run(ObserveableOp.java:63) > at > org.apache.hadoop.fs.slive.SliveMapper.runOperation(SliveMapper.java:122) > at org.apache.hadoop.fs.slive.SliveMapper.map(SliveMapper.java:168) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at 
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > " truncated to max limit (512 characters) > Activity > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7263) Snapshot read of an appended file returns more bytes than the file length.
[ https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Luo updated HDFS-7263: -- Status: Patch Available (was: In Progress) > Snapshot read of an appended file returns more bytes than the file length. > -- > > Key: HDFS-7263 > URL: https://issues.apache.org/jira/browse/HDFS-7263 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Konstantin Shvachko >Assignee: Tao Luo > Attachments: HDFS-7263.patch, HDFS-7263.patch, HDFS-7263.patch, > TestSnapshotRead.java > > > The following sequence of steps will produce extra bytes that should not be > visible, because they are not in the snapshot. > * Create a file of size L, where {{L % blockSize != 0}}. > * Create a snapshot > * Append bytes to the file > * Read file in the snapshot (not the current file) > * You will see that bytes are read beyond the original file size L -- This message was sent by Atlassian JIRA (v6.3.4#6332)
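The reproduction steps above amount to a missing length cap: the snapshot records the file length L, but the read path followed the live (appended) last block instead. Below is a simplified model of the correct behavior; {{readSnapshot}} is a hypothetical stand-in, not an HDFS API.

```java
import java.util.Arrays;

public class SnapshotLengthCap {
    // Model: the live last block has grown past the length L recorded in
    // the snapshot. A correct snapshot read caps at L; the bug was
    // returning bytes up to the live block's current size.
    public static byte[] readSnapshot(byte[] liveBlock, int snapshotLength) {
        int capped = Math.min(snapshotLength, liveBlock.length);
        return Arrays.copyOfRange(liveBlock, 0, capped);
    }

    public static void main(String[] args) {
        byte[] live = {1, 2, 3, 4, 5};  // two bytes appended after snapshot
        assert readSnapshot(live, 3).length == 3;  // only L bytes visible
    }
}
```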
[jira] [Updated] (HDFS-7263) Snapshot read of an appended file returns more bytes than the file length.
[ https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Luo updated HDFS-7263: -- Attachment: HDFS-7263.patch Simplified the test per Konstantin's review. > Snapshot read of an appended file returns more bytes than the file length. > -- > > Key: HDFS-7263 > URL: https://issues.apache.org/jira/browse/HDFS-7263 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Konstantin Shvachko >Assignee: Tao Luo > Attachments: HDFS-7263.patch, HDFS-7263.patch, HDFS-7263.patch, > TestSnapshotRead.java > > > The following sequence of steps will produce extra bytes that should not be > visible, because they are not in the snapshot. > * Create a file of size L, where {{L % blockSize != 0}}. > * Create a snapshot > * Append bytes to the file > * Read file in the snapshot (not the current file) > * You will see that bytes are read beyond the original file size L -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189217#comment-14189217 ] Haohui Mai commented on HDFS-6385: -- I wonder: why does the information need to be exported through both metrics and JMX? > Show when block deletion will start after NameNode startup in WebUI > --- > > Key: HDFS-6385 > URL: https://issues.apache.org/jira/browse/HDFS-6385 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao >Assignee: Chris Nauroth > Attachments: HDFS-6385.1.patch, HDFS-6385.png > > > HDFS-6186 provides functionality to delay block deletion for a period of time > after NameNode startup. Currently we only show the number of pending block > deletions in WebUI. We should also show when the block deletion will start in > WebUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189216#comment-14189216 ] Konstantin Shvachko commented on HDFS-3107: --- Yes, good point, Nicholas. During an upgrade, old blocks are hard-linked and the hard links are stored in a separate directory, which allows the blocks themselves to be deleted while the file system is updated. So when we roll back, the deleted blocks are still available via those hard links. With truncate, the same applies to the blocks that were deleted. For the blocks that are truncated to a smaller length, we will need to do copy-on-truncate recovery, the same as we do for snapshots. That is, if an upgrade is in progress, the NN will schedule copy-on-truncate whether snapshots are present or not. The bottom line is that implementing copy-on-truncate is needed both for snapshots and for upgrades. I'll make a note to update the design. > HDFS truncate > - > > Key: HDFS-3107 > URL: https://issues.apache.org/jira/browse/HDFS-3107 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Reporter: Lei Chang >Assignee: Plamen Jeliazkov > Attachments: HDFS-3107.008.patch, HDFS-3107.patch, HDFS-3107.patch, > HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, > HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, > HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, > editsStored, editsStored, editsStored.xml > > Original Estimate: 1,344h > Remaining Estimate: 1,344h > > Systems with transaction support often need to undo changes made to the > underlying storage when a transaction is aborted. 
Currently HDFS does not > support truncate (a standard POSIX operation), which is the reverse operation of > append; this makes upper-layer applications use ugly workarounds (such as > keeping track of the discarded byte range per file in a separate metadata > store, and periodically running a vacuum process to rewrite compacted files) > to overcome this limitation of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
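The rule Konstantin describes can be condensed into a single predicate. This is a sketch of the design decision as stated in the comment, with hypothetical names, not the eventual HDFS-3107 implementation.

```java
public class TruncateRecoveryPolicy {
    // Per the discussion above: truncating the last block in place would
    // destroy bytes that a snapshot still exposes, or bytes that a rollback
    // after an upgrade must restore; in either case the block has to be
    // copied before truncation (copy-on-truncate).
    public static boolean needsCopyOnTruncate(boolean blockInSnapshot,
                                              boolean upgradeInProgress) {
        return blockInSnapshot || upgradeInProgress;
    }

    public static void main(String[] args) {
        assert needsCopyOnTruncate(true, false);    // a snapshot references the block
        assert needsCopyOnTruncate(false, true);    // rollback must stay possible
        assert !needsCopyOnTruncate(false, false);  // safe to truncate in place
    }
}
```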
[jira] [Commented] (HDFS-7263) Snapshot read of an appended file returns more bytes than the file length.
[ https://issues.apache.org/jira/browse/HDFS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189193#comment-14189193 ] Konstantin Shvachko commented on HDFS-7263: --- I couldn't see the test failures or patch-related Java warnings locally, with or without the patch. I have one nit: in the test you added for append, you create a new snapshot2. This is not necessary for this particular test case. Could you also add a comment that you are testing the snapshot read of a file opened for append? > Snapshot read of an appended file returns more bytes than the file length. > -- > > Key: HDFS-7263 > URL: https://issues.apache.org/jira/browse/HDFS-7263 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Konstantin Shvachko >Assignee: Tao Luo > Attachments: HDFS-7263.patch, HDFS-7263.patch, TestSnapshotRead.java > > > The following sequence of steps will produce extra bytes that should not be > visible, because they are not in the snapshot. > * Create a file of size L, where {{L % blockSize != 0}}. > * Create a snapshot > * Append bytes to the file > * Read file in the snapshot (not the current file) > * You will see that bytes are read beyond the original file size L -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189186#comment-14189186 ] Jing Zhao commented on HDFS-7276: - Thanks for updating the patch, Nicholas! The new patch looks pretty good to me. Maybe we can remove the change in PacketHeader? Other than that +1. > Limit the number of byte arrays used by DFSOutputStream > --- > > Key: HDFS-7276 > URL: https://issues.apache.org/jira/browse/HDFS-7276 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h7276_20141021.patch, h7276_20141022.patch, > h7276_20141023.patch, h7276_20141024.patch, h7276_20141027.patch, > h7276_20141027b.patch, h7276_20141028.patch, h7276_20141029.patch > > > When there are a lot of DFSOutputStream's writing concurrently, the number of > outstanding packets could be large. The byte arrays created by those packets > could occupy a lot of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7281) Missing block is marked as corrupted block
[ https://issues.apache.org/jira/browse/HDFS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189175#comment-14189175 ] Yongjun Zhang commented on HDFS-7281: - Hi [~mingma], Thanks for addressing my comments; the change looks good to me. About the test failure, I used the tool from HADOOP-11045 and found the following: {code}
Recently FAILED builds in url: https://builds.apache.org//job/PreCommit-Hdfs-Build
THERE ARE 95 builds (out of 100) that have failed tests in the past 7 days, as listed below:
..
Among 100 runs examined, all failed tests <#failedRuns: testName>:
6: org.apache.hadoop.hdfs.TestLeaseRecovery2.testLeaseRecoverByAnotherUser
6: org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecovery
6: org.apache.hadoop.hdfs.TestLeaseRecovery2.testHardLeaseRecoveryWithRenameAfterNameNodeRestart
5: org.apache.hadoop.hdfs.server.balancer.TestBalancer.testUnknownDatanode
5: org.apache.hadoop.hdfs.TestLeaseRecovery2.testThreadName
3: org.apache.hadoop.hdfs.TestDFSClientRetries.testFailuresArePerOperation
...
{code} So the TestLeaseRecovery2 failures are not relevant to your change, as we expected. For completeness, I suggest you run both this test and the timed-out TestEncryptionZonesWithHA locally and see whether they pass with your patch. Thanks. > Missing block is marked as corrupted block > -- > > Key: HDFS-7281 > URL: https://issues.apache.org/jira/browse/HDFS-7281 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-7281-2.patch, HDFS-7281.patch > > > In the situation where the block lost all its replicas, fsck shows the block > is missing as well as corrupted. Perhaps it is better not to mark the block > corrupted in this case. The reason it is marked as corrupted is > numCorruptNodes == numNodes == 0 in the following code. 
> {noformat} > BlockManager > final boolean isCorrupt = numCorruptNodes == numNodes; > {noformat} > Would like to clarify if it is the intent to mark missing block as corrupted > or it is just a bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7035) Make adding volume an atomic operation.
[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189165#comment-14189165 ] Colin Patrick McCabe commented on HDFS-7035: P.S. I like the VolumeBuilder concept. Can you add some JavaDoc about the usage? I am also curious what happens when we fail midway through. I can see that {{DataStorage#prepareVolume}} adds the volume to {{DataStorage#bpStorageMap}}, is there anywhere where we remove it if the addition fails? Perhaps we need an {{abort}} function in the Builder? > Make adding volume an atomic operation. > --- > > Key: HDFS-7035 > URL: https://issues.apache.org/jira/browse/HDFS-7035 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.5.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7035.000.combo.patch, HDFS-7035.000.patch, > HDFS-7035.001.combo.patch, HDFS-7035.001.patch, HDFS-7035.002.patch, > HDFS-7035.003.patch, HDFS-7035.003.patch, HDFS-7035.004.patch, > HDFS-7035.005.patch, HDFS-7035.007.patch, HDFS-7035.008.patch, > HDFS-7035.009.patch, HDFS-7035.010.patch, HDFS-7035.010.patch, > HDFS-7035.011.patch, HDFS-7035.012.patch > > > It refactors {{DataStorage}} and {{BlockPoolSliceStorage}} to reduce the > duplicate code and supports atomic adding volume operations. Also it > parallels loading data volume operation: each thread loads one volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7199) DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O exception
[ https://issues.apache.org/jira/browse/HDFS-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189160#comment-14189160 ] Rushabh S Shah commented on HDFS-7199: -- The TestDFSStorageStateRecovery test runs without any errors on my local cluster. > DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O > exception > --- > > Key: HDFS-7199 > URL: https://issues.apache.org/jira/browse/HDFS-7199 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Rushabh S Shah >Priority: Critical > Attachments: HDFS-7199-WIP.patch, HDFS-7199.patch > > > If the DataStreamer thread encounters a non-I/O exception then it closes the > output stream but does not set lastException. When the client later calls > close on the output stream then it will see the stream is already closed with > lastException == null, mistakenly think this is a redundant close call, and > fail to report any error to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
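The bug description points at a general pattern: a worker thread that dies with a non-IOException leaves the stream "closed" but records no cause, so a later {{close()}} looks redundant and the error is swallowed. Below is a minimal sketch of the defensive shape; {{SafeStream}} and its methods are hypothetical, not the actual DFSOutputStream fix.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

public class SafeStream {
    private final AtomicReference<Throwable> lastException = new AtomicReference<>();
    private volatile boolean closed;

    // Worker analogue of DataStreamer: record *any* Throwable before
    // marking the stream closed, not just IOExceptions.
    void workerFailed(Throwable t) {
        lastException.compareAndSet(null, t);
        closed = true;
    }

    public void close() throws IOException {
        if (closed) {
            Throwable t = lastException.get();
            if (t != null) {
                // Surface the worker's failure instead of treating this
                // as a redundant close.
                throw new IOException("stream already failed", t);
            }
            return;  // genuinely redundant close: nothing was lost
        }
        closed = true;
    }

    public static boolean demoSurfacesError() {
        SafeStream s = new SafeStream();
        s.workerFailed(new RuntimeException("streamer died"));
        try {
            s.close();
            return false;  // silent drop: the bug being described
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        assert demoSurfacesError();
    }
}
```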
[jira] [Commented] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189157#comment-14189157 ] Hudson commented on HDFS-7300: -- FAILURE: Integrated in Hadoop-trunk-Commit #6387 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6387/]) HDFS-7300. HDFS-7300. The getMaxNodesPerRack() method in (kihwal: rev 3ae84e1ba8928879b3eda90e79667ba5a45d60f8) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppendRestart.java > The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed > > > Key: HDFS-7300 > URL: https://issues.apache.org/jira/browse/HDFS-7300 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Fix For: 2.6.0 > > Attachments: HDFS-7300.patch, HDFS-7300.v2.patch > > > The {{getMaxNodesPerRack()}} can produce an undesirable result in some cases. > - Three replicas on two racks. The max is 3, so everything can go to one rack. > - Two replicas on two or more racks. The max is 2, both replicas can end up > in the same rack. > {{BlockManager#isNeededReplication()}} fixes this after block/file is closed > because {{blockHasEnoughRacks()}} will return fail. This is not only extra > work, but also can break the favored nodes feature. > When there are two racks and two favored nodes are specified in the same > rack, NN may allocate the third replica on a node in the same rack, because > {{maxNodesPerRack}} is 3. When closing the file, NN moves a block to the > other rack. 
There is 66% chance that a favored node is moved. If > {{maxNodesPerRack}} was 2, this would not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
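The two undesirable cases in the description follow directly from the old per-rack limit. The expression below is an assumption about its rough shape based on the behavior quoted above (see the attached patches for the authoritative code):

```java
public class MaxNodesPerRack {
    // Sketch of the pre-HDFS-7300 limit as described in the report
    // (an assumption, not the literal BlockPlacementPolicyDefault code).
    public static int oldMax(int totalReplicas, int numRacks) {
        return (totalReplicas - 1) / numRacks + 2;
    }

    public static void main(String[] args) {
        // Three replicas, two racks: the max is 3, so everything can
        // land on one rack.
        assert oldMax(3, 2) == 3;
        // Two replicas, two racks: the max is 2, so both replicas can
        // end up in the same rack.
        assert oldMax(2, 2) == 2;
    }
}
```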
[jira] [Commented] (HDFS-7035) Make adding volume an atomic operation.
[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189148#comment-14189148 ] Colin Patrick McCabe commented on HDFS-7035: This looks good. Thanks, Eddy. Comments below. {code}
private synchronized void refreshVolumes(String newVolumes) throws IOException {
  conf.set(DFS_DATANODE_DATA_DIR_KEY, newVolumes);
{code} What's the purpose of setting this at the beginning of the function? At the end of the function we set it to the actual volumes that got added (plus the existing ones). It seems like it only needs to be set once? Also, we should add some JavaDoc stating that even if an IOException is thrown from this function, some new volumes may have been successfully added. {code}
LOG.info("Analyzed volume - " + dir + ", StorageType: " + storageType);
{code} Should this say "added volume", since it is at the end of addVolume? {{updateReplicaUnderRecovery}}: can we avoid changing the whitespace here? It's distracting. {{SimulatedFSDataset#addVolume}}: this needs an Override annotation; Findbugs or something will probably complain. {code}
public void write(StorageDirectory sd) throws IOException {
  this.layoutVersion = getServiceLayoutVersion();
  writeProperties(sd);
}
{code} I realize that you modelled this on {{Storage#writeAll}}. But I find this to be a weird (and weirdly named) API. It's a function named "write" that updates the layoutVersion? And then writes just the properties file? I think we should have an API named setServiceLayoutVersion, and then just do setServiceLayoutVersion(getServiceLayoutVersion()). Better yet, rename getServiceLayoutVersion to getLatestServiceLayoutVersion, since that's really what it's doing. Then we could just call: {code}
storage.setServiceLayoutVersion(getLatestServiceLayoutVersion());
storage.writeProperties(sd);
{code} and it would be obvious what was going on. > Make adding volume an atomic operation. 
> --- > > Key: HDFS-7035 > URL: https://issues.apache.org/jira/browse/HDFS-7035 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.5.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7035.000.combo.patch, HDFS-7035.000.patch, > HDFS-7035.001.combo.patch, HDFS-7035.001.patch, HDFS-7035.002.patch, > HDFS-7035.003.patch, HDFS-7035.003.patch, HDFS-7035.004.patch, > HDFS-7035.005.patch, HDFS-7035.007.patch, HDFS-7035.008.patch, > HDFS-7035.009.patch, HDFS-7035.010.patch, HDFS-7035.010.patch, > HDFS-7035.011.patch, HDFS-7035.012.patch > > > It refactors {{DataStorage}} and {{BlockPoolSliceStorage}} to reduce the > duplicate code and supports atomic adding volume operations. Also it > parallels loading data volume operation: each thread loads one volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-7300: - Resolution: Fixed Fix Version/s: 2.6.0 Target Version/s: 2.6.0 (was: 2.7.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the review, Daryn. Committed to trunk, branch-2 and branch-2.6. > The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed > > > Key: HDFS-7300 > URL: https://issues.apache.org/jira/browse/HDFS-7300 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Fix For: 2.6.0 > > Attachments: HDFS-7300.patch, HDFS-7300.v2.patch > > > The {{getMaxNodesPerRack()}} can produce an undesirable result in some cases. > - Three replicas on two racks. The max is 3, so everything can go to one rack. > - Two replicas on two or more racks. The max is 2, both replicas can end up > in the same rack. > {{BlockManager#isNeededReplication()}} fixes this after block/file is closed > because {{blockHasEnoughRacks()}} will return fail. This is not only extra > work, but also can break the favored nodes feature. > When there are two racks and two favored nodes are specified in the same > rack, NN may allocate the third replica on a node in the same rack, because > {{maxNodesPerRack}} is 3. When closing the file, NN moves a block to the > other rack. There is 66% chance that a favored node is moved. If > {{maxNodesPerRack}} was 2, this would not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189127#comment-14189127 ] Chris Nauroth commented on HDFS-6385: - {quote} -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock {quote} I think the problem is that my test was comparing a {{Time#now}} value (wrapper over {{System#currentTimeMillis}}) to a {{Time#monotonicNow}} value (wrapper over {{System#nanoTime}}). Values returned from {{System#nanoTime}} can be negative though. It passed locally on my system, but that was just coincidental. I'll need to change my test code. > Show when block deletion will start after NameNode startup in WebUI > --- > > Key: HDFS-6385 > URL: https://issues.apache.org/jira/browse/HDFS-6385 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao >Assignee: Chris Nauroth > Attachments: HDFS-6385.1.patch, HDFS-6385.png > > > HDFS-6186 provides functionality to delay block deletion for a period of time > after NameNode startup. Currently we only show the number of pending block > deletions in WebUI. We should also show when the block deletion will start in > WebUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
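The test failure Chris diagnoses is a general pitfall: {{System#currentTimeMillis}} and {{System#nanoTime}} run on different clocks with different origins, so their values must never be compared to each other. A small sketch of the safe pattern, using plain JDK calls:

```java
public class ClockDiscipline {
    // Safe: elapsed time measured entirely on the monotonic clock.
    // System.nanoTime has an arbitrary, possibly negative origin, so its
    // raw values are only meaningful as differences against themselves.
    public static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Same-clock differences are always non-negative.
        assert elapsedMillis(() -> { }) >= 0;
        // By contrast, something like
        //   System.currentTimeMillis() - System.nanoTime() / 1_000_000
        // is an arbitrary number; asserting anything about it is the
        // kind of coincidental pass described above.
    }
}
```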
[jira] [Commented] (HDFS-7300) The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
[ https://issues.apache.org/jira/browse/HDFS-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189122#comment-14189122 ] Daryn Sharp commented on HDFS-7300: --- +1 Looks good. > The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed > > > Key: HDFS-7300 > URL: https://issues.apache.org/jira/browse/HDFS-7300 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Attachments: HDFS-7300.patch, HDFS-7300.v2.patch > > > The {{getMaxNodesPerRack()}} can produce an undesirable result in some cases. > - Three replicas on two racks. The max is 3, so everything can go to one rack. > - Two replicas on two or more racks. The max is 2, both replicas can end up > in the same rack. > {{BlockManager#isNeededReplication()}} fixes this after block/file is closed > because {{blockHasEnoughRacks()}} will return fail. This is not only extra > work, but also can break the favored nodes feature. > When there are two racks and two favored nodes are specified in the same > rack, NN may allocate the third replica on a node in the same rack, because > {{maxNodesPerRack}} is 3. When closing the file, NN moves a block to the > other rack. There is 66% chance that a favored node is moved. If > {{maxNodesPerRack}} was 2, this would not happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7199) DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O exception
[ https://issues.apache.org/jira/browse/HDFS-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189114#comment-14189114 ] Hadoop QA commented on HDFS-7199: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12675017/HDFS-7199-WIP.patch against trunk revision 5c900b5. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 13 warning messages. See https://builds.apache.org/job/PreCommit-HDFS-Build/8585//artifact/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8585//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8585//console This message is automatically generated. 
> DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O > exception > --- > > Key: HDFS-7199 > URL: https://issues.apache.org/jira/browse/HDFS-7199 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Rushabh S Shah >Priority: Critical > Attachments: HDFS-7199-WIP.patch, HDFS-7199.patch > > > If the DataStreamer thread encounters a non-I/O exception then it closes the > output stream but does not set lastException. When the client later calls > close on the output stream then it will see the stream is already closed with > lastException == null, mistakenly think this is a redundant close call, and > fail to report any error to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
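The bug described above boils down to a closed-with-null-lastException state. A minimal sketch of that failure mode, with hypothetical class and field names that only loosely mirror {{DFSOutputStream}}:

```java
// Minimal sketch of the HDFS-7199 failure mode; not the Hadoop code.
public class SilentDropSketch {
    private volatile boolean closed = false;
    private volatile Exception lastException = null;

    // Simulates the streamer thread dying on a non-I/O exception:
    // it marks the stream closed but never records the cause.
    void streamerCrash(RuntimeException cause) {
        closed = true; // bug: lastException is left null
    }

    // Simulates the client calling close() afterwards.
    void close() throws Exception {
        if (closed) {
            if (lastException != null) {
                throw lastException; // the error would surface here
            }
            return; // treated as a redundant close -> data loss is silent
        }
        closed = true;
    }

    public static void main(String[] args) throws Exception {
        SilentDropSketch out = new SilentDropSketch();
        out.streamerCrash(new RuntimeException("streamer died"));
        out.close(); // returns quietly; the caller never learns data was lost
        System.out.println("close() returned without error");
    }
}
```

The fix direction implied by the report is that the streamer must record its failure in {{lastException}} before closing, so the later {{close()}} rethrows it.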
[jira] [Commented] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189113#comment-14189113 ] Hadoop QA commented on HDFS-6385: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677969/HDFS-6385.1.patch against trunk revision c2575fb. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8586//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8586//console This message is automatically generated. 
> Show when block deletion will start after NameNode startup in WebUI > --- > > Key: HDFS-6385 > URL: https://issues.apache.org/jira/browse/HDFS-6385 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Jing Zhao >Assignee: Chris Nauroth > Attachments: HDFS-6385.1.patch, HDFS-6385.png > > > HDFS-6186 provides functionality to delay block deletion for a period of time > after NameNode startup. Currently we only show the number of pending block > deletions in WebUI. We should also show when the block deletion will start in > WebUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7276: -- Attachment: h7276_20141029.patch h7276_20141029.patch: uses a power-of-two array size and addresses Jing's comments, except #1. > Limit the number of byte arrays used by DFSOutputStream > --- > > Key: HDFS-7276 > URL: https://issues.apache.org/jira/browse/HDFS-7276 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h7276_20141021.patch, h7276_20141022.patch, > h7276_20141023.patch, h7276_20141024.patch, h7276_20141027.patch, > h7276_20141027b.patch, h7276_20141028.patch, h7276_20141029.patch > > > When there are a lot of DFSOutputStream's writing concurrently, the number of > outstanding packets could be large. The byte arrays created by those packets > could occupy a lot of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189109#comment-14189109 ] Tsz Wo Nicholas Sze commented on HDFS-7276: --- Filed HDFS-7308 for fixing the computation. > Limit the number of byte arrays used by DFSOutputStream > --- > > Key: HDFS-7276 > URL: https://issues.apache.org/jira/browse/HDFS-7276 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h7276_20141021.patch, h7276_20141022.patch, > h7276_20141023.patch, h7276_20141024.patch, h7276_20141027.patch, > h7276_20141027b.patch, h7276_20141028.patch > > > When there are a lot of DFSOutputStream's writing concurrently, the number of > outstanding packets could be large. The byte arrays created by those packets > could occupy a lot of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7287) The OfflineImageViewer (OIV) can output invalid XML depending on the filename
[ https://issues.apache.org/jira/browse/HDFS-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189103#comment-14189103 ] Ravi Prakash commented on HDFS-7287: Thanks Colin! I've filed HDFS-7309 for clarification on the mangling. Please feel free to close it as invalid if you think it's unreasonable. > The OfflineImageViewer (OIV) can output invalid XML depending on the filename > - > > Key: HDFS-7287 > URL: https://issues.apache.org/jira/browse/HDFS-7287 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Fix For: 2.6.0 > > Attachments: HDFS-7287.1.patch, HDFS-7287.2.patch, HDFS-7287.patch, > testXMLOutput > > > If the filename contains a character which is invalid in XML, > TextWriterImageVisitor.write() or PBImageXmlWriter.o() prints out the string > unescaped. For us this was the character 0x0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7035) Make adding volume an atomic operation.
[ https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-7035: Attachment: HDFS-7035.012.patch [~cmccabe] Thanks for your detailed comments! In this updated patch, I have made the following major changes: * Merged the changes from HDFS-7173 so that this patch can be more self-contained. * Removed the {{StagedAddVolume}} interface and renamed {{DataStorage#DataStorageAddedVolume}} to {{DataStorage#VolumeBuilder}}. * Removed {{FsDatasetImpl#StagedAddVolume}}. * Moved the logic of calling {{DataStorage#addVolume}} from {{DataNode#refreshVolume}} to {{FsDatasetImpl#addVolume}}. * Changed {{DataNode#refreshVolume}} to a {{synchronized}} method so that it cannot race with DN startup or shutdown activity. * Moved {{DataNode#conf}} and {{DataNode#dataDir}} recovery into the {{finally}} segment. > Make adding volume an atomic operation. > --- > > Key: HDFS-7035 > URL: https://issues.apache.org/jira/browse/HDFS-7035 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Affects Versions: 2.5.0 >Reporter: Lei (Eddy) Xu >Assignee: Lei (Eddy) Xu > Attachments: HDFS-7035.000.combo.patch, HDFS-7035.000.patch, > HDFS-7035.001.combo.patch, HDFS-7035.001.patch, HDFS-7035.002.patch, > HDFS-7035.003.patch, HDFS-7035.003.patch, HDFS-7035.004.patch, > HDFS-7035.005.patch, HDFS-7035.007.patch, HDFS-7035.008.patch, > HDFS-7035.009.patch, HDFS-7035.010.patch, HDFS-7035.010.patch, > HDFS-7035.011.patch, HDFS-7035.012.patch > > > It refactors {{DataStorage}} and {{BlockPoolSliceStorage}} to reduce > duplicate code and to support adding volumes atomically. It also > parallelizes loading of data volumes: each thread loads one volume. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7309) XMLUtils.mangleXmlString doesn't seem to handle less than sign
[ https://issues.apache.org/jira/browse/HDFS-7309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-7309: --- Attachment: HDFS-7309.patch Here's a unit test to illustrate the problem > XMLUtils.mangleXmlString doesn't seem to handle less than sign > -- > > Key: HDFS-7309 > URL: https://issues.apache.org/jira/browse/HDFS-7309 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.1.0-beta >Reporter: Ravi Prakash >Priority: Minor > Attachments: HDFS-7309.patch > > > My expectation was that "" + XMLUtils.mangleXmlString( > "Containing" would be a string > acceptable to a SAX parser. However this was not true. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7308) DFSClient write packet size may > 64kB
[ https://issues.apache.org/jira/browse/HDFS-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189098#comment-14189098 ] Yongjun Zhang commented on HDFS-7308: - Good catch Nicholas! > DFSClient write packet size may > 64kB > -- > > Key: HDFS-7308 > URL: https://issues.apache.org/jira/browse/HDFS-7308 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Minor > > In DFSOutputStream.computePacketChunkSize(..), > {code} > private void computePacketChunkSize(int psize, int csize) { > final int chunkSize = csize + getChecksumSize(); > chunksPerPacket = Math.max(psize/chunkSize, 1); > packetSize = chunkSize*chunksPerPacket; > if (DFSClient.LOG.isDebugEnabled()) { > ... > } > } > {code} > We have the following > || variables || usual values || > | psize | dfsClient.getConf().writePacketSize = 64kB | > | csize | bytesPerChecksum = 512B | > | getChecksumSize(), i.e. CRC size | 32B | > | chunkSize = csize + getChecksumSize() | 544B (not a power of two) | > | psize/chunkSize | 120.47 | > | chunksPerPacket = max(psize/chunkSize, 1) | 120 | > | packetSize = chunkSize*chunksPerPacket (not including header) | 65280B | > | PacketHeader.PKT_MAX_HEADER_LEN | 33B | > | actual packet size | 65280 + 33 = *65313* < 65536 = 64k | > It is fortunate that the usual packet size = 65313 < 64k although the > calculation above does not guarantee it always happens (e.g. if > PKT_MAX_HEADER_LEN=257, then actual packet size=65537 > 64k.) We should fix > the computation in order to guarantee actual packet size < 64k. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
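The table above can be checked mechanically. The sketch below reproduces the arithmetic with the values as stated in the report (64kB write packet size, 512B chunks, 32B checksum, 33B max header); it illustrates the computation, not the actual {{DFSOutputStream}} code:

```java
// Reproduces the HDFS-7308 packet-size arithmetic using the values
// stated in the report. Illustrative only.
public class PacketSizeSketch {
    static int actualPacketSize(int psize, int csize, int checksumSize,
                                int headerLen) {
        int chunkSize = csize + checksumSize;                 // 544, not a power of two
        int chunksPerPacket = Math.max(psize / chunkSize, 1); // 120
        int packetSize = chunkSize * chunksPerPacket;         // 65280, header excluded
        return packetSize + headerLen;                        // 65313
    }

    public static void main(String[] args) {
        int actual = actualPacketSize(64 * 1024, 512, 32, 33);
        System.out.println(actual);             // 65313
        System.out.println(actual < 64 * 1024); // true, but only by luck:
        // with headerLen = 257 the result would be 65537 > 64k,
        // which is exactly the hazard the report describes.
    }
}
```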
[jira] [Created] (HDFS-7309) XMLUtils.mangleXmlString doesn't seem to handle less than sign
Ravi Prakash created HDFS-7309: -- Summary: XMLUtils.mangleXmlString doesn't seem to handle less than sign Key: HDFS-7309 URL: https://issues.apache.org/jira/browse/HDFS-7309 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Ravi Prakash Priority: Minor My expectation was that "" + XMLUtils.mangleXmlString( "Containing" would be a string acceptable to a SAX parser. However this was not true. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
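For context on what correct mangling must produce: a raw {{<}} in XML character data is malformed, so an escaping routine has to emit the predefined entity {{&lt;}}. The sketch below is a plain-Java illustration of such escaping, not Hadoop's {{XMLUtils.mangleXmlString}} implementation:

```java
// Plain-Java illustration of XML character escaping; not Hadoop's XMLUtils.
public class XmlEscapeSketch {
    static String escape(String s) {
        // The five characters that must never appear raw in XML text.
        // '&' must be replaced first so later entities are not re-escaped.
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    public static void main(String[] args) {
        // "<" raw inside an element is malformed XML; escaped it parses fine.
        System.out.println("<e>" + escape("<") + "</e>"); // <e>&lt;</e>
    }
}
```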
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189063#comment-14189063 ] Konstantin Boudnik commented on HDFS-3107: -- bq. I posted it as a demonstration. I think to make it more robust we would want [...] And it needs to be atomic, e.g. not involving 5 RPC calls; otherwise recovery would be a nightmare. > HDFS truncate > - > > Key: HDFS-3107 > URL: https://issues.apache.org/jira/browse/HDFS-3107 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Reporter: Lei Chang >Assignee: Plamen Jeliazkov > Attachments: HDFS-3107.008.patch, HDFS-3107.patch, HDFS-3107.patch, > HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, > HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, > HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, > editsStored, editsStored, editsStored.xml > > Original Estimate: 1,344h > Remaining Estimate: 1,344h > > Systems with transaction support often need to undo changes made to the > underlying storage when a transaction is aborted. Currently HDFS does not > support truncate (a standard POSIX operation), which is the reverse operation of > append. This makes upper-layer applications use ugly workarounds (such as > keeping track of the discarded byte range per file in a separate metadata > store, and periodically running a vacuum process to rewrite compacted files) > to overcome this limitation of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7308) DFSClient write packet size may > 64kB
Tsz Wo Nicholas Sze created HDFS-7308: - Summary: DFSClient write packet size may > 64kB Key: HDFS-7308 URL: https://issues.apache.org/jira/browse/HDFS-7308 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor In DFSOutputStream.computePacketChunkSize(..), {code} private void computePacketChunkSize(int psize, int csize) { final int chunkSize = csize + getChecksumSize(); chunksPerPacket = Math.max(psize/chunkSize, 1); packetSize = chunkSize*chunksPerPacket; if (DFSClient.LOG.isDebugEnabled()) { ... } } {code} We have the following || variables || usual values || | psize | dfsClient.getConf().writePacketSize = 64kB | | csize | bytesPerChecksum = 512B | | getChecksumSize(), i.e. CRC size | 32B | | chunkSize = csize + getChecksumSize() | 544B (not a power of two) | | psize/chunkSize | 120.47 | | chunksPerPacket = max(psize/chunkSize, 1) | 120 | | packetSize = chunkSize*chunksPerPacket (not including header) | 65280B | | PacketHeader.PKT_MAX_HEADER_LEN | 33B | | actual packet size | 65280 + 33 = *65313* < 65536 = 64k | It is fortunate that the usual packet size = 65313 < 64k although the calculation above does not guarantee it always happens (e.g. if PKT_MAX_HEADER_LEN=257, then actual packet size=65537 > 64k.) We should fix the computation in order to guarantee actual packet size < 64k. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7308) DFSClient write packet size may > 64kB
[ https://issues.apache.org/jira/browse/HDFS-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7308: -- Component/s: hdfs-client > DFSClient write packet size may > 64kB > -- > > Key: HDFS-7308 > URL: https://issues.apache.org/jira/browse/HDFS-7308 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze >Priority: Minor > > In DFSOutputStream.computePacketChunkSize(..), > {code} > private void computePacketChunkSize(int psize, int csize) { > final int chunkSize = csize + getChecksumSize(); > chunksPerPacket = Math.max(psize/chunkSize, 1); > packetSize = chunkSize*chunksPerPacket; > if (DFSClient.LOG.isDebugEnabled()) { > ... > } > } > {code} > We have the following > || variables || usual values || > | psize | dfsClient.getConf().writePacketSize = 64kB | > | csize | bytesPerChecksum = 512B | > | getChecksumSize(), i.e. CRC size | 32B | > | chunkSize = csize + getChecksumSize() | 544B (not a power of two) | > | psize/chunkSize | 120.47 | > | chunksPerPacket = max(psize/chunkSize, 1) | 120 | > | packetSize = chunkSize*chunksPerPacket (not including header) | 65280B | > | PacketHeader.PKT_MAX_HEADER_LEN | 33B | > | actual packet size | 65280 + 33 = *65313* < 65536 = 64k | > It is fortunate that the usual packet size = 65313 < 64k although the > calculation above does not guarantee it always happens (e.g. if > PKT_MAX_HEADER_LEN=257, then actual packet size=65537 > 64k.) We should fix > the computation in order to guarantee actual packet size < 64k. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7307) Need 'force close'
Allen Wittenauer created HDFS-7307: -- Summary: Need 'force close' Key: HDFS-7307 URL: https://issues.apache.org/jira/browse/HDFS-7307 Project: Hadoop HDFS Issue Type: Bug Reporter: Allen Wittenauer Until HDFS-4882 and HDFS-7306 get real fixes, operations teams need a way to force-close files. DNs are essentially held hostage by broken clients that never close. This situation will get worse as long-running and permanent jobs become more common. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
[ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14189037#comment-14189037 ] Tsz Wo Nicholas Sze commented on HDFS-7276: --- > ... However, it is unfortunate that our full packet size is 64k + header > length, which will round up to 128k. I was wrong about the full packet size. In DFSOutputStream.computePacketChunkSize(..), {code} private void computePacketChunkSize(int psize, int csize) { final int chunkSize = csize + getChecksumSize(); chunksPerPacket = Math.max(psize/chunkSize, 1); packetSize = chunkSize*chunksPerPacket; if (DFSClient.LOG.isDebugEnabled()) { ... } } {code} So we have the following || variables || usual values || | psize | dfsClient.getConf().writePacketSize = 64kB | | csize | bytesPerChecksum = 512B | | getChecksumSize(), i.e. CRC size | 32B | | chunkSize = csize + getChecksumSize() | 544B (not a power of two) | | psize/chunkSize | 120.47 | | chunksPerPacket = max(psize/chunkSize, 1) | 120 | | packetSize = chunkSize*chunksPerPacket (not including header) | 65280 | | PacketHeader.PKT_MAX_HEADER_LEN | 33B | | actual packet size | 65280 + 33 = *65313* < 65536 = 64k | It is fortunate that the usual packetSize = 65313 < 64k although the calculation above does not guarantee it happens (e.g. if PKT_MAX_HEADER_LEN=257, then actual packet size=65537 > 64k.) I will fix the computation in order to guarantee actual packet size < 64k. 
> Limit the number of byte arrays used by DFSOutputStream > --- > > Key: HDFS-7276 > URL: https://issues.apache.org/jira/browse/HDFS-7276 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: h7276_20141021.patch, h7276_20141022.patch, > h7276_20141023.patch, h7276_20141024.patch, h7276_20141027.patch, > h7276_20141027b.patch, h7276_20141028.patch > > > When there are a lot of DFSOutputStream's writing concurrently, the number of > outstanding packets could be large. The byte arrays created by those packets > could occupy a lot of memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7306) can't decommission w/under construction blocks
Allen Wittenauer created HDFS-7306: -- Summary: can't decommission w/under construction blocks Key: HDFS-7306 URL: https://issues.apache.org/jira/browse/HDFS-7306 Project: Hadoop HDFS Issue Type: Bug Reporter: Allen Wittenauer We need a way to decommission a node with open blocks. Now that HDFS supports append, this should be do-able. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7199) DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O exception
[ https://issues.apache.org/jira/browse/HDFS-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-7199: - Status: Patch Available (was: Open) Changed the patch to address Colin's comment. Did the same manual testing as before. > DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O > exception > --- > > Key: HDFS-7199 > URL: https://issues.apache.org/jira/browse/HDFS-7199 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Rushabh S Shah >Priority: Critical > Attachments: HDFS-7199-WIP.patch, HDFS-7199.patch > > > If the DataStreamer thread encounters a non-I/O exception then it closes the > output stream but does not set lastException. When the client later calls > close on the output stream then it will see the stream is already closed with > lastException == null, mistakenly think this is a redundant close call, and > fail to report any error to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7199) DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O exception
[ https://issues.apache.org/jira/browse/HDFS-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-7199: - Attachment: HDFS-7199.patch > DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O > exception > --- > > Key: HDFS-7199 > URL: https://issues.apache.org/jira/browse/HDFS-7199 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Rushabh S Shah >Priority: Critical > Attachments: HDFS-7199-WIP.patch, HDFS-7199.patch > > > If the DataStreamer thread encounters a non-I/O exception then it closes the > output stream but does not set lastException. When the client later calls > close on the output stream then it will see the stream is already closed with > lastException == null, mistakenly think this is a redundant close call, and > fail to report any error to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7199) DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O exception
[ https://issues.apache.org/jira/browse/HDFS-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-7199: - Status: Open (was: Patch Available) > DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O > exception > --- > > Key: HDFS-7199 > URL: https://issues.apache.org/jira/browse/HDFS-7199 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Rushabh S Shah >Priority: Critical > Attachments: HDFS-7199-WIP.patch, HDFS-7199.patch > > > If the DataStreamer thread encounters a non-I/O exception then it closes the > output stream but does not set lastException. When the client later calls > close on the output stream then it will see the stream is already closed with > lastException == null, mistakenly think this is a redundant close call, and > fail to report any error to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188962#comment-14188962 ] bc Wong commented on HDFS-7295: --- Thanks, [~aw]. Services vs Apps is an argument that I can appreciate. There are sites running Spark Streaming as apps. So I'll have to check with them first. > Support arbitrary max expiration times for delegation token > --- > > Key: HDFS-7295 > URL: https://issues.apache.org/jira/browse/HDFS-7295 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > > Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. > This is a problem for different users of HDFS such as long running YARN apps. > Users should be allowed to optionally specify max lifetime for their tokens. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
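For reference, the 7-day cap discussed above corresponds (to the best of our knowledge) to a cluster-wide NameNode setting; the property name and default below are assumptions, and the point of the JIRA is precisely that this knob is cluster-wide, with no per-token override:

```xml
<!-- hdfs-site.xml: assumed cluster-wide maximum delegation token lifetime.
     604800000 ms = 7 days; applies to every token, with no per-user override. -->
<property>
  <name>dfs.namenode.delegation.token.max-lifetime</name>
  <value>604800000</value>
</property>
```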
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188946#comment-14188946 ] Kihwal Lee commented on HDFS-7097: -- The test case failed because: {{java.net.BindException: Port in use: localhost:40123}} We are getting this sort of failure more often nowadays from precommit. Both test cases pass when run on my machine. > Allow block reports to be processed during checkpointing on standby name node > - > > Key: HDFS-7097 > URL: https://issues.apache.org/jira/browse/HDFS-7097 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Assignee: Kihwal Lee >Priority: Critical > Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, > HDFS-7097.patch > > > On a reasonably busy HDFS cluster, there is a stream of creates, causing data > nodes to generate incremental block reports. When a standby name node is > checkpointing, RPC handler threads trying to process a full or incremental > block report are blocked on the name system's {{fsLock}}, because the > checkpointer acquires the read lock on it. This can create a serious problem > if the size of name space is big and checkpointing takes a long time. > All available RPC handlers can be tied up very quickly. If you have 100 > handlers, it only takes 34 file creates. If a separate service RPC port is > not used, HA transition will have to wait in the call queue for minutes. Even > if a separate service RPC port is configured, heartbeats from datanodes will > be blocked. A standby NN with a big name space can lose all data nodes after > checkpointing. The RPC calls will also be retransmitted by data nodes many > times, filling up the call queue and potentially causing listen queue > overflow. > Since block reports are not modifying any state that is being saved to > fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188909#comment-14188909 ] Haohui Mai commented on HDFS-7305: -- +1 pending jenkins > NPE seen in wbhdfs FS while running SLive > - > > Key: HDFS-7305 > URL: https://issues.apache.org/jira/browse/HDFS-7305 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Arpit Gupta >Assignee: Jing Zhao >Priority: Minor > Attachments: HDFS-7305.000.patch > > > {code} > 2014-10-23 10:22:49,066 WARN [main] org.apache.hadoop.mapred.Task: Task > status: "Failed at running due to java.lang.NullPointerException > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:529) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:605) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:458) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:487) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:483) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.append(WebHdfsFileSystem.java:1154) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1163) > at org.apache.hadoop.fs.slive.AppendOp.run(AppendOp.java:80) > at org.apache.hadoop.fs.slive.ObserveableOp.run(ObserveableOp.java:63) > at > org.apache.hadoop.fs.slive.SliveMapper.runOperation(SliveMapper.java:122) > at 
org.apache.hadoop.fs.slive.SliveMapper.map(SliveMapper.java:168) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > " truncated to max limit (512 characters) > Activity > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7305: Priority: Minor (was: Critical) > NPE seen in wbhdfs FS while running SLive > - > > Key: HDFS-7305 > URL: https://issues.apache.org/jira/browse/HDFS-7305 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Arpit Gupta >Assignee: Jing Zhao >Priority: Minor > Attachments: HDFS-7305.000.patch > > > {code} > 2014-10-23 10:22:49,066 WARN [main] org.apache.hadoop.mapred.Task: Task > status: "Failed at running due to java.lang.NullPointerException > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:529) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:605) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:458) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:487) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:483) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.append(WebHdfsFileSystem.java:1154) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1163) > at org.apache.hadoop.fs.slive.AppendOp.run(AppendOp.java:80) > at org.apache.hadoop.fs.slive.ObserveableOp.run(ObserveableOp.java:63) > at > org.apache.hadoop.fs.slive.SliveMapper.runOperation(SliveMapper.java:122) > at org.apache.hadoop.fs.slive.SliveMapper.map(SliveMapper.java:168) > at 
org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > " truncated to max limit (512 characters) > Activity > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-7305:
----------------------------
    Status: Patch Available  (was: Open)

> NPE seen in wbhdfs FS while running SLive
> -----------------------------------------
>
>                 Key: HDFS-7305
>                 URL: https://issues.apache.org/jira/browse/HDFS-7305
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>            Priority: Critical
>         Attachments: HDFS-7305.000.patch
>
> {code}
> 2014-10-23 10:22:49,066 WARN [main] org.apache.hadoop.mapred.Task: Task status: "Failed at running due to java.lang.NullPointerException
> 	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:358)
> 	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:91)
> 	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:529)
> 	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:605)
> 	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:458)
> 	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:487)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:483)
> 	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.append(WebHdfsFileSystem.java:1154)
> 	at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1163)
> 	at org.apache.hadoop.fs.slive.AppendOp.run(AppendOp.java:80)
> 	at org.apache.hadoop.fs.slive.ObserveableOp.run(ObserveableOp.java:63)
> 	at org.apache.hadoop.fs.slive.SliveMapper.runOperation(SliveMapper.java:122)
> 	at org.apache.hadoop.fs.slive.SliveMapper.map(SliveMapper.java:168)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> " truncated to max limit (512 characters)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Assigned] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao reassigned HDFS-7305:
-------------------------------
    Assignee: Jing Zhao
[jira] [Updated] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-7305:
----------------------------
    Attachment: HDFS-7305.000.patch

Simple patch to fix.
[jira] [Commented] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
[ https://issues.apache.org/jira/browse/HDFS-7305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188899#comment-14188899 ]

Jing Zhao commented on HDFS-7305:
---------------------------------

Looks more like the NPE is caused by a possibly null message contained in the exception:
{code}
if (re.getMessage().startsWith(
    SecurityUtil.FAILED_TO_GET_UGI_MSG_HEADER)) {
{code}
{code}
public static RemoteException toRemoteException(final Map<?, ?> json) {
  final Map<?, ?> m = (Map<?, ?>)json.get(RemoteException.class.getSimpleName());
  final String message = (String)m.get("message");
  final String javaClassName = (String)m.get("javaClassName");
  return new RemoteException(javaClassName, message);
}
{code}
Note that {{toRemoteException}} passes the "message" field through unchecked, so a server response without a message yields a RemoteException whose {{getMessage()}} is null, and the {{startsWith}} call above then throws the NPE.
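The diagnosis above can be sketched as a minimal, self-contained reproduction. This is not the HDFS-7305 patch; the class and constant below are hypothetical stand-ins for `RemoteException` and `SecurityUtil.FAILED_TO_GET_UGI_MSG_HEADER`, used only to show why the null message NPEs and how a null-safe guard avoids it:

```java
// Sketch of the NPE scenario Jing Zhao describes, with a null-safe guard.
// RemoteExceptionStub and the header constant are illustrative stand-ins,
// not the real Hadoop classes.
public class NullMessageGuard {
    // Assumed value; the real constant lives in o.a.h.security.SecurityUtil.
    static final String FAILED_TO_GET_UGI_MSG_HEADER =
        "Failed to obtain user group information:";

    static class RemoteExceptionStub extends RuntimeException {
        RemoteExceptionStub(String message) { super(message); }
    }

    /** Unsafe check: throws NPE when the remote exception carries no message. */
    static boolean unsafeIsUgiFailure(RemoteExceptionStub re) {
        return re.getMessage().startsWith(FAILED_TO_GET_UGI_MSG_HEADER);
    }

    /** Null-safe variant: a missing message simply does not match. */
    static boolean safeIsUgiFailure(RemoteExceptionStub re) {
        String msg = re.getMessage();
        return msg != null && msg.startsWith(FAILED_TO_GET_UGI_MSG_HEADER);
    }

    public static void main(String[] args) {
        RemoteExceptionStub noMessage = new RemoteExceptionStub(null);
        try {
            unsafeIsUgiFailure(noMessage);
        } catch (NullPointerException e) {
            System.out.println("NPE, as seen in the SLive run");
        }
        System.out.println(safeIsUgiFailure(noMessage)); // prints: false
    }
}
```

The same effect could also be achieved by defaulting the message to an empty string inside `toRemoteException`; either way, the fix is to stop assuming the server always sends a "message" field.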
[jira] [Commented] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188890#comment-14188890 ]

Jing Zhao commented on HDFS-6385:
---------------------------------

Thanks, Chris! The patch looks pretty good to me. +1 pending Jenkins.

> Show when block deletion will start after NameNode startup in WebUI
> -------------------------------------------------------------------
>
>                 Key: HDFS-6385
>                 URL: https://issues.apache.org/jira/browse/HDFS-6385
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jing Zhao
>            Assignee: Chris Nauroth
>         Attachments: HDFS-6385.1.patch, HDFS-6385.png
>
> HDFS-6186 provides functionality to delay block deletion for a period of time after NameNode startup. Currently we only show the number of pending block deletions in the WebUI. We should also show when the block deletion will start in the WebUI.
[jira] [Created] (HDFS-7305) NPE seen in wbhdfs FS while running SLive
Arpit Gupta created HDFS-7305:
---------------------------------

             Summary: NPE seen in wbhdfs FS while running SLive
                 Key: HDFS-7305
                 URL: https://issues.apache.org/jira/browse/HDFS-7305
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.6.0
            Reporter: Arpit Gupta
            Priority: Critical

(The description is the same truncated NullPointerException stack trace quoted in the notifications above.)
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1416#comment-1416 ]

Colin Patrick McCabe commented on HDFS-3107:
--------------------------------------------

Good point, Nicholas. Note that the patch I posted does handle rollback correctly, since it never modifies any existing block files. I posted it as a demonstration. I think to make it more robust we would want to avoid having the client write out the last block and concat it, and instead have some other mechanism for duplicating + shortening the final block of the file -- possibly a new DN command similar to COPY_BLOCK, but taking a length argument.

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS-3107.008.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard POSIX operation), which is the reverse operation of append. This makes upper-layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS.
[jira] [Commented] (HDFS-7287) The OfflineImageViewer (OIV) can output invalid XML depending on the filename
[ https://issues.apache.org/jira/browse/HDFS-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188869#comment-14188869 ]

Hudson commented on HDFS-7287:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #6386 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6386/])
HDFS-7287. The OfflineImageViewer (OIV) can output invalid XML depending on the filename (Ravi Prakash via Colin P. McCabe) (cmccabe: rev d33e07dc49e00db138921fb3aa52c4ef00510161)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/PBImageXmlWriter.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/XmlImageVisitor.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java

> The OfflineImageViewer (OIV) can output invalid XML depending on the filename
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-7287
>                 URL: https://issues.apache.org/jira/browse/HDFS-7287
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>             Fix For: 2.6.0
>
>         Attachments: HDFS-7287.1.patch, HDFS-7287.2.patch, HDFS-7287.patch, testXMLOutput
>
> If the filename contains a character which is invalid in XML, TextWriterImageVisitor.write() or PBImageXmlWriter.o() prints out the string unescaped. For us this was the character 0x0.
[jira] [Updated] (HDFS-7287) The OfflineImageViewer (OIV) can output invalid XML depending on the filename
[ https://issues.apache.org/jira/browse/HDFS-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HDFS-7287:
---------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.6.0
           Status: Resolved  (was: Patch Available)
[jira] [Commented] (HDFS-7199) DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O exception
[ https://issues.apache.org/jira/browse/HDFS-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188861#comment-14188861 ]

Colin Patrick McCabe commented on HDFS-7199:
--------------------------------------------

I'm having trouble understanding this patch. Won't the exception you are setting with {{setLastException(new IOException("DataStreamer Exception: ", e))}} overwrite the exception set on these previous lines:
{code}
if (e instanceof IOException) {
  setLastException((IOException)e);
}
{code}
Wouldn't it make more sense to simply add an else statement here, where we wrap the non-IOE in an IOE?

bq. work in progress patch. I will work on creating the test case. It is a little bit hard.

It looks like this will end up being a 1- or 2-line patch, so we could potentially commit this JIRA and file a follow-up JIRA for the test case. I think it should be possible to write a good test case using Mockito or perhaps one of the fault injectors.

> DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O exception
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-7199
>                 URL: https://issues.apache.org/jira/browse/HDFS-7199
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.5.0
>            Reporter: Jason Lowe
>            Assignee: Rushabh S Shah
>            Priority: Critical
>         Attachments: HDFS-7199-WIP.patch
>
> If the DataStreamer thread encounters a non-I/O exception, it closes the output stream but does not set lastException. When the client later calls close on the output stream, it sees the stream is already closed with lastException == null, mistakenly thinks this is a redundant close call, and fails to report any error to the client.
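The else-branch being discussed can be sketched in isolation. This is a toy stand-in for DFSOutputStream's DataStreamer (class and method names below are hypothetical, not the committed patch): record the first failure, and wrap a non-IOException in an IOException instead of leaving lastException null or overwriting an already-recorded cause:

```java
// Sketch of the fix direction: wrap non-I/O failures so close() always sees
// a non-null lastException, and never overwrite the first recorded error.
import java.io.IOException;

public class DataStreamerSketch {
    private IOException lastException;

    void setLastException(IOException e) {
        if (lastException == null) {   // first error wins; later calls are no-ops
            lastException = e;
        }
    }

    /** Called when the streamer thread dies with any exception. */
    void onStreamerFailure(Exception e) {
        if (e instanceof IOException) {
            setLastException((IOException) e);
        } else {
            // the else branch: wrap the non-IOE so the error is not dropped
            setLastException(new IOException("DataStreamer Exception: ", e));
        }
    }

    IOException getLastException() { return lastException; }

    public static void main(String[] args) {
        DataStreamerSketch streamer = new DataStreamerSketch();
        streamer.onStreamerFailure(new RuntimeException("streamer died"));
        // non-null IOException wrapping the RuntimeException
        System.out.println(streamer.getLastException().getCause());
    }
}
```

With this shape, a client calling close() can consult `getLastException()` and surface the original non-I/O failure rather than mistaking the closed stream for a redundant close call.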
[jira] [Commented] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client
[ https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188855#comment-14188855 ]

Chris Nauroth commented on HDFS-6994:
-------------------------------------

If it's helpful, take a look at HDFS-573 for example usage of CMake on Windows.

> libhdfs3 - A native C/C++ HDFS client
> -------------------------------------
>
>                 Key: HDFS-6994
>                 URL: https://issues.apache.org/jira/browse/HDFS-6994
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs-client
>            Reporter: Zhanwei Wang
>            Assignee: Zhanwei Wang
>         Attachments: HDFS-6994-rpc-8.patch, HDFS-6994.patch
>
> Hi All
>
> I just got the permission to open source libhdfs3, which is a native C/C++ HDFS client based on the Hadoop RPC protocol and the HDFS Data Transfer Protocol.
>
> libhdfs3 provides a libhdfs-style C interface and a C++ interface. It supports both Hadoop RPC versions 8 and 9, NameNode HA, and Kerberos authentication.
>
> libhdfs3 is currently used by HAWQ of Pivotal.
>
> I'd like to integrate libhdfs3 into the HDFS source code to benefit others.
>
> You can find the libhdfs3 code on github:
> https://github.com/PivotalRD/libhdfs3
> http://pivotalrd.github.io/libhdfs3/
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188847#comment-14188847 ]

Allen Wittenauer commented on HDFS-7295:
----------------------------------------

bq. Especially since not everyone here understands the use case yet — There are no keytabs here. These are not services.

I think we do, actually. The argument back is (still): why shouldn't these be services? There are lots of examples of things that run for a long time (e.g., 0xdata, Samza, Storm) that people have been using for quite a while now. Spark isn't magically different because it is this year's new hotness.

In every single one of these cases that I'm familiar with, it is almost always in the best interest of the user to run these under a dedicated service account and treat them as a service rather than as the user. They are almost always managed by a team. They almost always feed multiple inputs and multiple outputs from various sources, usually from other teams. Plus there is the bus factor: if that user gets hit by a bus, who takes it over when that user account gets removed?

The only case that I know of where running as the user makes sense is during the experimentation phase. In that case, in my mind, they can live with their service dying after 7 days. With ACLs now in HDFS, it makes even less sense to run these as the user.

> Support arbitrary max expiration times for delegation token
> -----------------------------------------------------------
>
>                 Key: HDFS-7295
>                 URL: https://issues.apache.org/jira/browse/HDFS-7295
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>
> Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. This is a problem for different users of HDFS such as long-running YARN apps. Users should be allowed to optionally specify a max lifetime for their tokens.
[jira] [Commented] (HDFS-7287) The OfflineImageViewer (OIV) can output invalid XML depending on the filename
[ https://issues.apache.org/jira/browse/HDFS-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188841#comment-14188841 ]

Colin Patrick McCabe commented on HDFS-7287:
--------------------------------------------

bq. Here's the patch without the test barfing.

Great!

bq. The build doesn't have any test failures

It has a test timeout, on {{TestPread}}. However, this is clearly not related to your patch.

bq. I'm guessing this is because of the recent changes to test-patch.sh

test-patch.sh wasn't changed recently. smart-patch-apply.sh was, but that shouldn't have anything to do with this test timeout, I think.

bq. Besides this test seems unrelated. Could someone please review and merge?

+1, will commit shortly.
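The bug under review hinges on a property of XML itself: XML 1.0 permits only #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, and #x10000-#x10FFFF, so a character like 0x0 in a filename cannot be escaped at all; it must be dropped or replaced before the writer emits it. The helper below is a hypothetical sketch of that idea, not the code from the committed patch:

```java
// Sketch: filenames containing characters illegal in XML 1.0 (e.g. NUL)
// must be sanitized before being written, since no escape exists for them.
// XmlNameSanitizer is an illustrative helper, not part of the OIV.
public class XmlNameSanitizer {
    /** Legal XML 1.0 character ranges, per the XML specification. */
    static boolean isLegalXmlChar(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
            || (c >= 0x20 && c <= 0xD7FF)
            || (c >= 0xE000 && c <= 0xFFFD)
            || (c >= 0x10000 && c <= 0x10FFFF);
    }

    /** Replace illegal characters with U+FFFD so the output stays well-formed. */
    static String sanitize(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        name.codePoints()
            .forEach(c -> sb.appendCodePoint(isLegalXmlChar(c) ? c : 0xFFFD));
        return sb.toString();
    }

    public static void main(String[] args) {
        // The reported case: a NUL byte embedded in a filename.
        System.out.println(sanitize("file\u0000name").equals("file\uFFFDname")); // prints: true
    }
}
```

Standard entity escaping (`&amp;`, `&lt;`, numeric character references) still applies afterwards for legal-but-special characters; the point is that escaping alone cannot make 0x0 valid.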
[jira] [Updated] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated HDFS-6385:
--------------------------------
    Attachment: HDFS-6385.1.patch

I'm attaching a patch that adds a "Block Deletion Start Time" field to the web UI. I've also attached a screenshot showing the new field.
[jira] [Updated] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated HDFS-6385:
--------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated HDFS-6385:
--------------------------------
    Target Version/s: 2.6.0
[jira] [Updated] (HDFS-6385) Show when block deletion will start after NameNode startup in WebUI
[ https://issues.apache.org/jira/browse/HDFS-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated HDFS-6385:
--------------------------------
    Attachment: HDFS-6385.png
[jira] [Commented] (HDFS-6994) libhdfs3 - A native C/C++ HDFS client
[ https://issues.apache.org/jira/browse/HDFS-6994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188828#comment-14188828 ]

Colin Patrick McCabe commented on HDFS-6994:
--------------------------------------------

bq. Hi, I've been porting libhdfs3 to Windows Visual Studio 2013 and would like to contribute my effort back to the community.

Welcome!

bq. Should this be under HDFS-7188?

How big are the changes? We might want to break it up if it gets too big. If it can fit in a few kb, then maybe one JIRA is enough.

One more thing: if you can, please try to use CMake to build on Windows. The ability to have one build system for all platforms was a big reason to switch to CMake in the first place.
[jira] [Updated] (HDFS-7199) DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O exception
[ https://issues.apache.org/jira/browse/HDFS-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-7199:
-----------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HDFS-7287) The OfflineImageViewer (OIV) can output invalid XML depending on the filename
[ https://issues.apache.org/jira/browse/HDFS-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188784#comment-14188784 ]

Ravi Prakash commented on HDFS-7287:
------------------------------------

The build doesn't have any test failures: https://builds.apache.org/job/PreCommit-HDFS-Build/8581/ . The console says (note it's missing the "Tests run" string):
{code}
Running org.apache.hadoop.hdfs.TestPread
0, Errors: 0, Skipped: 0, Time elapsed: 4.719 sec - in org.apache.hadoop.hdfs.TestSmallBlock
{code}
I'm guessing this is because of the recent changes to test-patch.sh. Besides, this test seems unrelated. Could someone please review and merge?
[jira] [Commented] (HDFS-7279) Use netty to implement DatanodeWebHdfsMethods
[ https://issues.apache.org/jira/browse/HDFS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188759#comment-14188759 ]

Hadoop QA commented on HDFS-7279:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12677949/HDFS-7279.003.patch
  against trunk revision 5c900b5.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:red}-1 javac{color}. The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8584//console

This message is automatically generated.

> Use netty to implement DatanodeWebHdfsMethods
> ---------------------------------------------
>
>                 Key: HDFS-7279
>                 URL: https://issues.apache.org/jira/browse/HDFS-7279
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, webhdfs
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>         Attachments: HDFS-7279.000.patch, HDFS-7279.001.patch, HDFS-7279.002.patch, HDFS-7279.003.patch
>
> Currently the DN implements all related webhdfs functionality using jetty. Because the jetty version the DN currently uses (Jetty 6) lacks fine-grained buffer and connection management, the DN often suffers from long latency and OOM when its webhdfs component is under sustained heavy load.
>
> This jira proposes to implement the webhdfs component in the DN using netty, which can be more efficient and allows finer-grained control over webhdfs.
[jira] [Commented] (HDFS-7295) Support arbitrary max expiration times for delegation token
[ https://issues.apache.org/jira/browse/HDFS-7295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188754#comment-14188754 ] bc Wong commented on HDFS-7295:
-------------------------------

Vinod, we're probably not on the same wavelength. I agree with all that you said about keytabs being the solution for services. But I'm trying to find a solution for apps that are started by regular users. There are no keytabs here.

Steve, the let-user-push-new-token solution is possible, although the user experience is very bad, as it requires periodic intervention; i.e., the user can't go on a 2-week vacation.

bq. I guess you are disappointed by the negative feedback here: you had a simple solution to the problem of HDFS token expiry without having to distribute keytabs.

No, I don't feel emotional about this. I believe that we're all reasonably trying to find the right solution for the users. Especially since not everyone here understands the use case yet: there are no keytabs here. These are not services.

> Support arbitrary max expiration times for delegation token
> -----------------------------------------------------------
>
>                 Key: HDFS-7295
>                 URL: https://issues.apache.org/jira/browse/HDFS-7295
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>
> Currently the max lifetime of HDFS delegation tokens is hardcoded to 7 days. This is a problem for different users of HDFS, such as long-running YARN apps. Users should be allowed to optionally specify a max lifetime for their tokens.
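[Editorial aside] The seven-day ceiling being discussed can be illustrated with a toy model of the renew-until-max-date scheme. The class, method, and variable names below are invented for illustration; this is not HDFS's actual token code, only a sketch of the policy: renewal pushes expiry forward, but never past the issue time plus the max lifetime.

```java
// Toy model of delegation-token renewal against a hard max lifetime.
public class TokenLifetimeSketch {
    static final long DAY_MS = 24L * 3600 * 1000;

    /** Renewal extends expiry by renewInterval, but never past issueTime + maxLifetime. */
    static long renew(long issueTime, long now, long renewInterval, long maxLifetime) {
        return Math.min(now + renewInterval, issueTime + maxLifetime);
    }

    public static void main(String[] args) {
        long issue = 0;
        long renewInterval = 1 * DAY_MS; // a typical daily renew interval
        long maxLifetime = 7 * DAY_MS;   // the hardcoded ceiling this JIRA wants configurable

        // A diligent renewer renews every day, forever...
        long expiry = issue + renewInterval;
        for (int day = 1; day <= 14; day++) {
            expiry = renew(issue, day * DAY_MS, renewInterval, maxLifetime);
        }
        // ...but the token still dies 7 days after issue, no matter what.
        System.out.println("expiry day: " + expiry / DAY_MS); // prints: expiry day: 7
    }
}
```

This is why a long-running user app with no keytab cannot outlive the max lifetime even with perfect renewal, which is the crux of the disagreement above.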
[jira] [Updated] (HDFS-7279) Use netty to implement DatanodeWebHdfsMethods
[ https://issues.apache.org/jira/browse/HDFS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7279:
-----------------------------

    Attachment: HDFS-7279.003.patch

> Use netty to implement DatanodeWebHdfsMethods
> ---------------------------------------------
>
>                 Key: HDFS-7279
>                 URL: https://issues.apache.org/jira/browse/HDFS-7279
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, webhdfs
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>         Attachments: HDFS-7279.000.patch, HDFS-7279.001.patch, HDFS-7279.002.patch, HDFS-7279.003.patch
>
>
> Currently the DN implements all webhdfs-related functionality using Jetty. As the Jetty version the DN currently uses (Jetty 6) lacks fine-grained buffer and connection management, the DN often suffers from long latency and OOMs when its webhdfs component is under sustained heavy load.
> This jira proposes to implement the webhdfs component in the DN using Netty, which can be more efficient and allows finer-grained control over webhdfs.
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188700#comment-14188700 ] Hadoop QA commented on HDFS-7097:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12677893/HDFS-7097.patch
against trunk revision ec63a3f.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager

The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.TestPread

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8580//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8580//console

This message is automatically generated.
> Allow block reports to be processed during checkpointing on standby name node
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-7097
>                 URL: https://issues.apache.org/jira/browse/HDFS-7097
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch
>
>
> On a reasonably busy HDFS cluster, there is a stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report are blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the name space is big and checkpointing takes a long time.
> All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, heartbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The RPC calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow.
> Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing.
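[Editorial aside] The contention described above comes from read/write lock semantics: block-report processing needs the write side of the namesystem lock, and a long-held read lock (the checkpointer's) stalls every writer behind it. A toy, non-HDFS demonstration with `java.util.concurrent.locks.ReentrantReadWriteLock` (the "checkpointer" and "handler" here are just labels, and this sketch uses a single thread for determinism, relying on the fact that this lock does not allow upgrading a held read lock to a write lock):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FsLockContentionDemo {
    public static void main(String[] args) {
        ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);

        // The "checkpointer" holds the read lock for a long time.
        fsLock.readLock().lock();

        // A "block report handler" needs the write lock; it cannot get it
        // while any read lock is held (upgrade is not supported, so tryLock fails).
        boolean duringCheckpoint = fsLock.writeLock().tryLock();
        System.out.println("write lock during checkpoint: " + duringCheckpoint); // prints: false

        // Once the checkpointer releases the read lock, writers proceed.
        fsLock.readLock().unlock();
        boolean afterCheckpoint = fsLock.writeLock().tryLock();
        System.out.println("write lock after checkpoint:  " + afterCheckpoint); // prints: true
        fsLock.writeLock().unlock();
    }
}
```

With a handful of handler threads all parked like this, the RPC handler pool drains exactly as the description says; the JIRA's proposal is to let read-only block-report processing bypass that wait during checkpointing.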
[jira] [Commented] (HDFS-7287) The OfflineImageViewer (OIV) can output invalid XML depending on the filename
[ https://issues.apache.org/jira/browse/HDFS-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188699#comment-14188699 ] Hadoop QA commented on HDFS-7287:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12677907/HDFS-7287.2.patch
against trunk revision ec63a3f.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.TestPread

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8581//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8581//console

This message is automatically generated.

> The OfflineImageViewer (OIV) can output invalid XML depending on the filename
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-7287
>                 URL: https://issues.apache.org/jira/browse/HDFS-7287
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>         Attachments: HDFS-7287.1.patch, HDFS-7287.2.patch, HDFS-7287.patch, testXMLOutput
>
>
> If the filename contains a character which is invalid in XML, TextWriterImageVisitor.write() or PBImageXmlWriter.o() prints out the string unescaped. For us this was the character 0x0
[jira] [Commented] (HDFS-7279) Use netty to implement DatanodeWebHdfsMethods
[ https://issues.apache.org/jira/browse/HDFS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188673#comment-14188673 ] Hadoop QA commented on HDFS-7279:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12677765/HDFS-7279.002.patch
against trunk revision b056048.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:red}-1 javac{color}. The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8582//console

This message is automatically generated.

> Use netty to implement DatanodeWebHdfsMethods
> ---------------------------------------------
>
>                 Key: HDFS-7279
>                 URL: https://issues.apache.org/jira/browse/HDFS-7279
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, webhdfs
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>         Attachments: HDFS-7279.000.patch, HDFS-7279.001.patch, HDFS-7279.002.patch
>
>
> Currently the DN implements all webhdfs-related functionality using Jetty. As the Jetty version the DN currently uses (Jetty 6) lacks fine-grained buffer and connection management, the DN often suffers from long latency and OOMs when its webhdfs component is under sustained heavy load.
> This jira proposes to implement the webhdfs component in the DN using Netty, which can be more efficient and allows finer-grained control over webhdfs.
[jira] [Commented] (HDFS-7165) Separate block metrics for files with replication count 1
[ https://issues.apache.org/jira/browse/HDFS-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188655#comment-14188655 ] Hadoop QA commented on HDFS-7165:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12677935/HDFS-7165-branch-2.patch
against trunk revision b056048.

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8583//console

This message is automatically generated.

> Separate block metrics for files with replication count 1
> ---------------------------------------------------------
>
>                 Key: HDFS-7165
>                 URL: https://issues.apache.org/jira/browse/HDFS-7165
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.5.1
>            Reporter: Andrew Wang
>            Assignee: Zhe Zhang
>             Fix For: 3.0.0
>         Attachments: HDFS-7165-20141003-v1.patch, HDFS-7165-20141009-v1.patch, HDFS-7165-20141010-v1.patch, HDFS-7165-20141015-v1.patch, HDFS-7165-20141021-v1.patch, HDFS-7165-20141021-v2.patch, HDFS-7165-branch-2.patch
>
>
> We see a lot of escalations because someone has written teragen output with a replication factor of 1, a DN goes down, and a bunch of missing blocks show up. These are normally false positives, since teragen output is disposable, and generally speaking, users should understand this is true for all repl=1 files.
> It'd be nice to be able to separate out these repl=1 missing blocks from missing blocks with higher replication factors.
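[Editorial aside] The metric split this JIRA proposes amounts to bucketing missing blocks by their file's expected replication factor, so repl=1 losses (usually disposable data) can be reported separately from genuinely alarming ones. A toy sketch of that bookkeeping; the class and method names are invented and this is not the patch's actual implementation:

```java
// Sketch: split a set of missing blocks into repl=1 and repl>1 counters.
public class MissingBlockBuckets {

    /** Returns {missing with repl==1, missing with repl>1} given each missing block's replication. */
    static long[] countMissing(int[] replicationOfMissingBlocks) {
        long replOne = 0;
        for (int r : replicationOfMissingBlocks) {
            if (r == 1) replOne++;
        }
        return new long[] { replOne, replicationOfMissingBlocks.length - replOne };
    }

    public static void main(String[] args) {
        // Three missing blocks: two from repl=1 files (e.g. teragen output), one from a repl=3 file.
        long[] c = countMissing(new int[] { 1, 3, 1 });
        System.out.println("missing (repl=1): " + c[0] + ", missing (repl>1): " + c[1]);
        // prints: missing (repl=1): 2, missing (repl>1): 1
    }
}
```

An operator could then alert only on the second bucket, which is the point of the issue description above.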
[jira] [Updated] (HDFS-7165) Separate block metrics for files with replication count 1
[ https://issues.apache.org/jira/browse/HDFS-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7165:
----------------------------

    Attachment: HDFS-7165-branch-2.patch

> Separate block metrics for files with replication count 1
> ---------------------------------------------------------
>
>                 Key: HDFS-7165
>                 URL: https://issues.apache.org/jira/browse/HDFS-7165
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.5.1
>            Reporter: Andrew Wang
>            Assignee: Zhe Zhang
>             Fix For: 3.0.0
>         Attachments: HDFS-7165-20141003-v1.patch, HDFS-7165-20141009-v1.patch, HDFS-7165-20141010-v1.patch, HDFS-7165-20141015-v1.patch, HDFS-7165-20141021-v1.patch, HDFS-7165-20141021-v2.patch, HDFS-7165-branch-2.patch
>
>
> We see a lot of escalations because someone has written teragen output with a replication factor of 1, a DN goes down, and a bunch of missing blocks show up. These are normally false positives, since teragen output is disposable, and generally speaking, users should understand this is true for all repl=1 files.
> It'd be nice to be able to separate out these repl=1 missing blocks from missing blocks with higher replication factors.
[jira] [Commented] (HDFS-7165) Separate block metrics for files with replication count 1
[ https://issues.apache.org/jira/browse/HDFS-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14188642#comment-14188642 ] Zhe Zhang commented on HDFS-7165:
---------------------------------

There are two blockers for the branch-2 merge:
* HDFS-6252 wasn't completely ported to branch-2. HDFS-7301 has been created and resolved to fix this.
* HDFS-4366 was never ported to branch-2. In {{UnderReplicatedBlocks}} it removed {{priorityToReplIdx}}, which happens to be on the same line as our added variable {{corruptReplOneBlocks}}. I created a branch-2 patch containing this simple fix.

> Separate block metrics for files with replication count 1
> ---------------------------------------------------------
>
>                 Key: HDFS-7165
>                 URL: https://issues.apache.org/jira/browse/HDFS-7165
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.5.1
>            Reporter: Andrew Wang
>            Assignee: Zhe Zhang
>             Fix For: 3.0.0
>         Attachments: HDFS-7165-20141003-v1.patch, HDFS-7165-20141009-v1.patch, HDFS-7165-20141010-v1.patch, HDFS-7165-20141015-v1.patch, HDFS-7165-20141021-v1.patch, HDFS-7165-20141021-v2.patch
>
>
> We see a lot of escalations because someone has written teragen output with a replication factor of 1, a DN goes down, and a bunch of missing blocks show up. These are normally false positives, since teragen output is disposable, and generally speaking, users should understand this is true for all repl=1 files.
> It'd be nice to be able to separate out these repl=1 missing blocks from missing blocks with higher replication factors.