[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13053668#comment-13053668 ] Abel Perez commented on HDFS-1900: -- Hey Aaron, I'm trying to run the Ant test-patch task but I'm not sure what args I should be passing the task. Can you provide me with a sample command for the sh script or Ant task? thanks, - Abel Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13053673#comment-13053673 ] Eli Collins commented on HDFS-1900: --- Hey Abel, I use the following bash function: function ant-test-patch() { ant -Dpatch.file=$1 \ -Dforrest.home=$FORREST_HOME \ -Dfindbugs.home=$FINDBUGS_HOME \ -Djava5.home=$JAVA5_HOME \ test-patch } and apache-forrest-0.8 and findbugs-1.3.9. More at https://github.com/elicollins/hadoop-dev/blob/master/bin/hadoop-alias.sh Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13053673#comment-13053673 ] Eli Collins edited comment on HDFS-1900 at 6/23/11 6:31 AM: Hey Abel, I use the following bash function: {code} function ant-test-patch() { ant -Dpatch.file=$1 \ -Dforrest.home=$FORREST_HOME \ -Dfindbugs.home=$FINDBUGS_HOME \ -Djava5.home=$JAVA5_HOME \ test-patch } {code} and apache-forrest-0.8 and findbugs-1.3.9. More at https://github.com/elicollins/hadoop-dev/blob/master/bin/hadoop-alias.sh was (Author: eli): Hey Abel, I use the following bash function: function ant-test-patch() { ant -Dpatch.file=$1 \ -Dforrest.home=$FORREST_HOME \ -Dfindbugs.home=$FINDBUGS_HOME \ -Djava5.home=$JAVA5_HOME \ test-patch } and apache-forrest-0.8 and findbugs-1.3.9. More at https://github.com/elicollins/hadoop-dev/blob/master/bin/hadoop-alias.sh Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-2103) Read lock must be released before acquiring a write lock
[ https://issues.apache.org/jira/browse/HDFS-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Mundlapudi resolved HDFS-2103. -- Resolution: Not A Problem Didn't notice the finally block, where the read lock is released. I am closing this Jira. Read lock must be released before acquiring a write lock Key: HDFS-2103 URL: https://issues.apache.org/jira/browse/HDFS-2103 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 In the FSNamesystem.getBlockLocationsUpdateTimes function, we have the following code:
{code}
for (int attempt = 0; attempt < 2; attempt++) {
  if (attempt == 0) { // first attempt is with readlock
    readLock();
  } else { // second attempt is with write lock
    writeLock(); // writelock is needed to set accesstime
  }
  ...
  if (attempt == 0) {
    continue;
  }
{code}
In the above code, the readLock is acquired in attempt 0, and if execution enters the continue block, it tries to acquire the writeLock before releasing the readLock. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
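The pattern discussed above can be sketched outside HDFS with java.util.concurrent locks. This is a hypothetical, self-contained illustration, not the actual FSNamesystem code (the class, field, and return value are made up): the finally block releases the read lock even when the loop continues, so the write lock on the second attempt is only taken after the read lock is gone.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class TwoAttemptLocking {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean needsAccessTimeUpdate = true; // forces the second attempt

    String getBlockLocations() {
        for (int attempt = 0; attempt < 2; attempt++) {
            boolean isRead = (attempt == 0);
            if (isRead) {
                lock.readLock().lock();   // first attempt: read lock only
            } else {
                lock.writeLock().lock();  // second attempt: write lock
            }
            try {
                if (isRead && needsAccessTimeUpdate) {
                    continue;             // finally still runs here
                }
                return "locations";       // stand-in for the real result
            } finally {
                if (isRead) {
                    lock.readLock().unlock();  // the release the reporter missed
                } else {
                    lock.writeLock().unlock();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String r = new TwoAttemptLocking().getBlockLocations();
        if (!"locations".equals(r)) throw new AssertionError(r);
        System.out.println("read lock released before write lock was taken");
    }
}
```

Had the unlock not been in a finally block, the single thread above would deadlock on the write lock, since ReentrantReadWriteLock does not support upgrading a read lock to a write lock.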
[jira] [Updated] (HDFS-2102) 1073 Zero pad edits filename to make them lexically sortable
[ https://issues.apache.org/jira/browse/HDFS-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HDFS-2102: - Attachment: HDFS-2102.diff 1073 Zero pad edits filename to make them lexically sortable Key: HDFS-2102 URL: https://issues.apache.org/jira/browse/HDFS-2102 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: Edit log branch (HDFS-1073) Attachments: HDFS-2102.diff Zero pad the edit log filenames so they appear in the correct order when you ls on the filesystem. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2102) 1073 Zero pad edits filename to make them lexically sortable
[ https://issues.apache.org/jira/browse/HDFS-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HDFS-2102: - Status: Patch Available (was: Open) 1073 Zero pad edits filename to make them lexically sortable Key: HDFS-2102 URL: https://issues.apache.org/jira/browse/HDFS-2102 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: Edit log branch (HDFS-1073) Attachments: HDFS-2102.diff Zero pad the edit log filenames so they appear in the correct order when you ls on the filesystem. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
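The motivation for the padding is that plain lexical sort misorders numeric suffixes: "edits_100" sorts before "edits_9". A minimal sketch (the 19-digit width and name format here are illustrative, not necessarily what the branch uses):

```java
import java.util.Arrays;

class ZeroPadDemo {
    // Pad the transaction id to a fixed width so that lexical order
    // (what ls gives you) matches numeric order.
    static String editsName(long txid) {
        return String.format("edits_%019d", txid); // 19 digits covers Long.MAX_VALUE
    }

    public static void main(String[] args) {
        String[] unpadded = { "edits_9", "edits_10", "edits_100" };
        Arrays.sort(unpadded);
        // Lexically, "edits_10" and "edits_100" sort before "edits_9".
        System.out.println(Arrays.toString(unpadded));

        String[] padded = { editsName(9), editsName(10), editsName(100) };
        Arrays.sort(padded);
        // With padding, lexical sort matches numeric order: 9, 10, 100.
        System.out.println(Arrays.toString(padded));
    }
}
```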
[jira] [Updated] (HDFS-2018) Move all journal stream management code into one place
[ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HDFS-2018: - Status: Open (was: Patch Available) Move all journal stream management code into one place -- Key: HDFS-2018 URL: https://issues.apache.org/jira/browse/HDFS-2018 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: Edit log branch (HDFS-1073) Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff Currently in the HDFS-1073 branch, the code for creating output streams is in FileJournalManager and the code for input streams is in the inspectors. This change does a number of things. - Input and Output streams are now created by the JournalManager. - FSImageStorageInspectors now deals with URIs when referring to edit logs - Recovery of inprogress logs is performed by counting the number of transactions instead of looking at the length of the file. The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2018) Move all journal stream management code into one place
[ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HDFS-2018: - Attachment: HDFS-2018.diff Added RemoteEditLogManifest stuff to allow it to take segments from different journals. Move all journal stream management code into one place -- Key: HDFS-2018 URL: https://issues.apache.org/jira/browse/HDFS-2018 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: Edit log branch (HDFS-1073) Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff Currently in the HDFS-1073 branch, the code for creating output streams is in FileJournalManager and the code for input streams is in the inspectors. This change does a number of things. - Input and Output streams are now created by the JournalManager. - FSImageStorageInspectors now deals with URIs when referring to edit logs - Recovery of inprogress logs is performed by counting the number of transactions instead of looking at the length of the file. The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-2096) Mavenization of hadoop-hdfs
[ https://issues.apache.org/jira/browse/HDFS-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur reassigned HDFS-2096: Assignee: Alejandro Abdelnur Mavenization of hadoop-hdfs --- Key: HDFS-2096 URL: https://issues.apache.org/jira/browse/HDFS-2096 Project: Hadoop HDFS Issue Type: Task Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Same as HADOOP-6671 for hdfs -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1574) HDFS cannot be browsed from web UI while in safe mode
[ https://issues.apache.org/jira/browse/HDFS-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053916#comment-13053916 ] Todd Lipcon commented on HDFS-1574: --- I think HDFS should be browsable in safe mode -- perhaps in such a way that we just log a warning that delegation tokens will be non-persistent. HDFS cannot be browsed from web UI while in safe mode - Key: HDFS-1574 URL: https://issues.apache.org/jira/browse/HDFS-1574 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Priority: Blocker Labels: newbie As of HDFS-984, the NN does not issue delegation tokens while in safe mode (since it would require writing to the edit log). But the browsedfscontent servlet relies on getting a delegation token before redirecting to a random DN to browse the FS. Thus, the "browse the filesystem" link does not work while the NN is in safe mode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1321) If service port and main port are the same, there is no clear log message explaining the issue.
[ https://issues.apache.org/jira/browse/HDFS-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053969#comment-13053969 ] Hadoop QA commented on HDFS-1321: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12483619/HDFS-1321.patch against trunk revision 1138645. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/828//console This message is automatically generated. If service port and main port are the same, there is no clear log message explaining the issue. --- Key: HDFS-1321 URL: https://issues.apache.org/jira/browse/HDFS-1321 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: gary murry Assignee: Jim Plush Priority: Minor Labels: newbie Fix For: 0.23.0 Attachments: HDFS-1321.patch With the introduction of a service port to the namenode, there is now a chance for user error to set the two ports equal. This will cause the namenode to fail to start up. It would be nice if there was a log message explaining the port clash. Or just treat things as if the service port was not specified. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1606) Provide a stronger data guarantee in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053999#comment-13053999 ] Tsz Wo (Nicholas), SZE commented on HDFS-1606: -- Koji Noguchi has also provided a lot of input on this. Sorry that I failed to mention it in the [acknowledgement|https://issues.apache.org/jira/browse/HDFS-1606?focusedCommentId=13018958&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13018958]. Provide a stronger data guarantee in the write pipeline --- Key: HDFS-1606 URL: https://issues.apache.org/jira/browse/HDFS-1606 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client, name-node Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1606_20110210.patch, h1606_20110211.patch, h1606_20110217.patch, h1606_20110228.patch, h1606_20110404.patch, h1606_20110405.patch, h1606_20110405b.patch, h1606_20110406.patch, h1606_20110406b.patch, h1606_20110407.patch, h1606_20110407b.patch, h1606_20110407c.patch, h1606_20110408.patch, h1606_20110408b.patch In the current design, if there is a datanode/network failure in the write pipeline, DFSClient will try to remove the failed datanode from the pipeline and then continue writing with the remaining datanodes. As a result, the number of datanodes in the pipeline is decreased. Unfortunately, it is possible that DFSClient may incorrectly remove a healthy datanode but leave the failed datanode in the pipeline, because failure detection may be inaccurate under erroneous conditions. We propose a new mechanism for adding new datanodes to the pipeline in order to provide a stronger data guarantee. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
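The proposal above adds replacement datanodes when failures shrink the pipeline, instead of only dropping nodes. The committed policy lives in the client code and configuration; the following is only an illustrative sketch of the kind of decision involved, with made-up method names and thresholds:

```java
class ReplaceDatanodePolicy {
    // Decide whether to ask the namenode for a replacement datanode after a
    // pipeline failure. Illustrative rule (not HDFS's actual policy): replace
    // when fewer than two nodes remain, or when the pipeline has dropped
    // below half of the target replication.
    static boolean shouldAddReplacement(int targetReplication, int remaining) {
        if (remaining >= targetReplication) {
            return false; // nothing failed, pipeline is full
        }
        return remaining < 2 || remaining * 2 < targetReplication;
    }

    public static void main(String[] args) {
        // replication 3, one node failed: 2 remain -> keep writing with 2
        System.out.println(shouldAddReplacement(3, 2));
        // replication 3, two nodes failed: 1 remains -> ask for a replacement
        System.out.println(shouldAddReplacement(3, 1));
    }
}
```

The point of such a policy is exactly the data guarantee described in the issue: rather than letting an inaccurate failure detector silently shrink the pipeline, the client restores the pipeline width before continuing to write.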
[jira] [Closed] (HDFS-2103) Read lock must be released before acquiring a write lock
[ https://issues.apache.org/jira/browse/HDFS-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins closed HDFS-2103. - Read lock must be released before acquiring a write lock Key: HDFS-2103 URL: https://issues.apache.org/jira/browse/HDFS-2103 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 In the FSNamesystem.getBlockLocationsUpdateTimes function, we have the following code:
{code}
for (int attempt = 0; attempt < 2; attempt++) {
  if (attempt == 0) { // first attempt is with readlock
    readLock();
  } else { // second attempt is with write lock
    writeLock(); // writelock is needed to set accesstime
  }
  ...
  if (attempt == 0) {
    continue;
  }
{code}
In the above code, the readLock is acquired in attempt 0, and if execution enters the continue block, it tries to acquire the writeLock before releasing the readLock. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1321) If service port and main port are the same, there is no clear log message explaining the issue.
[ https://issues.apache.org/jira/browse/HDFS-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Plush updated HDFS-1321: Attachment: HDFS-1321-take2.txt Did a hard reset and re-applied the changes to try to get a cleaner patch file. If service port and main port are the same, there is no clear log message explaining the issue. --- Key: HDFS-1321 URL: https://issues.apache.org/jira/browse/HDFS-1321 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: gary murry Assignee: Jim Plush Priority: Minor Labels: newbie Fix For: 0.23.0 Attachments: HDFS-1321-take2.txt, HDFS-1321.patch With the introduction of a service port to the namenode, there is now a chance for user error to set the two ports equal. This will cause the namenode to fail to start up. It would be nice if there was a log message explaining the port clash. Or just treat things as if the service port was not specified. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054011#comment-13054011 ] Abel Perez commented on HDFS-1900: -- Hey Eli, thanks for the function. Not sure if my environment is properly set up; I tried running test-patch and got the following error:
{code}
[exec] Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/fop/messaging/MessageHandler
[exec]     at org.apache.cocoon.serialization.FOPSerializer.configure(FOPSerializer.java:122)
[exec]     at org.apache.avalon.framework.container.ContainerUtil.configure(ContainerUtil.java:201)
[exec]     at org.apache.avalon.excalibur.component.DefaultComponentFactory.newInstance(DefaultComponentFactory.java:289)
[exec]     at org.apache.avalon.excalibur.pool.InstrumentedResourceLimitingPool.newPoolable(InstrumentedResourceLimitingPool.java:655)
[exec]     at org.apache.avalon.excalibur.pool.InstrumentedResourceLimitingPool.get(InstrumentedResourceLimitingPool.java:371)
[exec]     at org.apache.avalon.excalibur.component.PoolableComponentHandler.doGet(PoolableComponentHandler.java:198)
[exec]     at org.apache.avalon.excalibur.component.ComponentHandler.get(ComponentHandler.java:381)
[exec]     at org.apache.avalon.excalibur.component.ExcaliburComponentSelector.select(ExcaliburComponentSelector.java:215)
[exec]     at org.apache.cocoon.components.ExtendedComponentSelector.select(ExtendedComponentSelector.java:268)
[exec]     at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setSerializer(AbstractProcessingPipeline.java:311)
[exec]     at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setSerializer(AbstractCachingProcessingPipeline.java:171)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.SerializeNode.invoke(SerializeNode.java:120)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.SelectNode.invoke(SelectNode.java:103)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:47)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:131)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:93)
[exec]     at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:235)
[exec]     at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:177)
[exec]     at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:254)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:118)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.SelectNode.invoke(SelectNode.java:98)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:93)
[exec]     at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:235)
[exec]     at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:177)
[exec]     at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:254)
[exec]     at org.apache.cocoon.Cocoon.process(Cocoon.java:699)
[exec]     at org.apache.cocoon.bean.CocoonWrapper.getPage(CocoonWrapper.java:514)
[exec]     at org.apache.cocoon.bean.CocoonBean.processTarget(CocoonBean.java:499)
[exec]     at org.apache.cocoon.bean.CocoonBean.process(CocoonBean.java:356)
[exec]     at org.apache.cocoon.Main.main(Main.java:321)
{code}
[jira] [Updated] (HDFS-2086) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-2086: --- Status: Open (was: Patch Available) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.patch As the title describes: if the include hosts list contains hostnames, then after the namenode is restarted, datanode registration is denied. This is because after the namenode restarts, the still-alive datanodes try to register themselves with the namenode, identifying themselves by *IP address*. However, the namenode only admits hosts that appear in its include list, and those entries are all hostnames, so the namenode denies the datanode registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
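One way to resolve the mismatch described above is to compare registrations against resolved addresses as well as the raw include-list strings. The following is a hedged sketch, not the actual patch: the class name is hypothetical, and a resolver map is injected in place of real DNS lookups (in real code that would be something like InetAddress.getByName(host).getHostAddress()).

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class HostsFileChecker {
    private final Set<String> allowed = new HashSet<>();

    // resolver maps hostname -> IP; injected so the sketch needs no network.
    HostsFileChecker(Set<String> includeList, Map<String, String> resolver) {
        for (String host : includeList) {
            allowed.add(host);       // accept the name itself
            String ip = resolver.get(host);
            if (ip != null) {
                allowed.add(ip);     // and its resolved IP address
            }
        }
    }

    // A datanode re-registering after a namenode restart identifies itself
    // by IP address; with IPs pre-resolved, it is still admitted.
    boolean isAllowed(String nodeId) {
        return allowed.contains(nodeId);
    }

    public static void main(String[] args) {
        HostsFileChecker checker = new HostsFileChecker(
            java.util.Collections.singleton("dn1.example.com"),
            java.util.Collections.singletonMap("dn1.example.com", "10.0.0.1"));
        System.out.println(checker.isAllowed("10.0.0.1"));  // admitted via resolution
        System.out.println(checker.isAllowed("10.0.0.2"));  // not in the list
    }
}
```

Resolving at load time rather than at registration time also keeps the hot registration path free of DNS latency, at the cost of going stale if a host's address changes before the list is re-read.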
[jira] [Commented] (HDFS-1321) If service port and main port are the same, there is no clear log message explaining the issue.
[ https://issues.apache.org/jira/browse/HDFS-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054032#comment-13054032 ] Hadoop QA commented on HDFS-1321: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12483625/HDFS-1321-take2.txt against trunk revision 1138645. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/829//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/829//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/829//console This message is automatically generated. If service port and main port are the same, there is no clear log message explaining the issue. --- Key: HDFS-1321 URL: https://issues.apache.org/jira/browse/HDFS-1321 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: gary murry Assignee: Jim Plush Priority: Minor Labels: newbie Fix For: 0.23.0 Attachments: HDFS-1321-take2.txt, HDFS-1321.patch With the introduction of a service port to the namenode, there is now a chance for user error to set the two ports equal. This will cause the namenode to fail to start up. It would be nice if there was a log message explaining the port clash. 
Or just treat things as if the service port was not specified. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2052) FSNamesystem should not sync the log with the write lock held
[ https://issues.apache.org/jira/browse/HDFS-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054073#comment-13054073 ] Suresh Srinivas commented on HDFS-2052: --- "but that would drop the audit logging so probably easier to pass a boolean that indicates whether to sync" It does not. Logging for delete is done in FSDirectory#delete(). In this case, you are not syncing the audit log, and it is done along with the editlog that logs file creation. FSNamesystem should not sync the log with the write lock held - Key: HDFS-2052 URL: https://issues.apache.org/jira/browse/HDFS-2052 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins FSNamesystem#deleteInternal releases the write lock before syncing the log; however, FSNamesystem#startFileInternal calls delete -> deleteInternal with the write lock held, which means deleteInternal will sync the log while holding the lock. We could fix cases like this by passing a flag indicating whether the function should sync (e.g. in this case the sync is not necessary because startFileInternal's callers will sync the log), or by modifying the current calls to sync to flag that a sync is necessary before returning to the caller rather than doing the sync right at the call site. This way the cost of syncing the log could be amortized over multiple function calls (and potentially multiple RPCs, if we didn't mind introducing some synchronization). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2052) FSNamesystem should not sync the log with the write lock held
[ https://issues.apache.org/jira/browse/HDFS-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054081#comment-13054081 ] Eli Collins commented on HDFS-2052: --- It would drop the audit logging in FSNamesystem#delete; I don't see audit logging in FSDirectory#delete. FSNamesystem should not sync the log with the write lock held - Key: HDFS-2052 URL: https://issues.apache.org/jira/browse/HDFS-2052 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins FSNamesystem#deleteInternal releases the write lock before syncing the log; however, FSNamesystem#startFileInternal calls delete -> deleteInternal with the write lock held, which means deleteInternal will sync the log while holding the lock. We could fix cases like this by passing a flag indicating whether the function should sync (e.g. in this case the sync is not necessary because startFileInternal's callers will sync the log), or by modifying the current calls to sync to flag that a sync is necessary before returning to the caller rather than doing the sync right at the call site. This way the cost of syncing the log could be amortized over multiple function calls (and potentially multiple RPCs, if we didn't mind introducing some synchronization). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
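The deferred-sync idea in the issue description can be sketched in miniature: operations append edits while holding the write lock, and the caller performs the slow sync only after releasing it. All names below are hypothetical stand-ins for the FSNamesystem/FSEditLog structure, with a simple counter in place of a real edit log.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class DeferredSyncSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int pendingEdits = 0; // stand-in for the unsynced edit-log tail
    private int syncedEdits = 0;  // stand-in for what has been flushed to disk

    // Internal op: appends an edit but does NOT sync; the caller decides when.
    private void deleteInternal(String path) {
        pendingEdits++; // stand-in for logEdit(OP_DELETE, path)
    }

    // Slow flush to stable storage; done outside the write lock so other
    // namespace operations are not blocked while it runs.
    private void logSync() {
        syncedEdits = pendingEdits;
    }

    void delete(String path) {
        lock.writeLock().lock();
        try {
            deleteInternal(path); // no sync while the lock is held
        } finally {
            lock.writeLock().unlock();
        }
        logSync(); // sync after releasing the write lock
    }

    int getSyncedEdits() {
        return syncedEdits;
    }

    public static void main(String[] args) {
        DeferredSyncSketch fs = new DeferredSyncSketch();
        fs.delete("/tmp/a");
        fs.delete("/tmp/b");
        System.out.println(fs.getSyncedEdits()); // both edits synced, lock-free
    }
}
```

Because logSync here flushes everything pending, a caller that performs several internal operations under one lock acquisition pays for a single sync, which is the amortization the description is after.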
[jira] [Updated] (HDFS-2086) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-2086: --- Attachment: HDFS-2086.7.patch Address the review comments. If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes: if the include hosts list contains hostnames, then after the namenode is restarted, datanode registration is denied. This is because after the namenode restarts, the still-alive datanodes try to register themselves with the namenode, identifying themselves by *IP address*. However, the namenode only admits hosts that appear in its include list, and those entries are all hostnames, so the namenode denies the datanode registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2086) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-2086: --- Status: Patch Available (was: Open) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes: if the include hosts list contains hostnames, then after the namenode is restarted, datanode registration is denied. This is because after the namenode restarts, the still-alive datanodes try to register themselves with the namenode, identifying themselves by *IP address*. However, the namenode only admits hosts that appear in its include list, and those entries are all hostnames, so the namenode denies the datanode registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2086) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054128#comment-13054128 ] Hadoop QA commented on HDFS-2086: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12483638/HDFS-2086.7.patch against trunk revision 1138645. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/830//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/830//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/830//console This message is automatically generated. If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes the problem: if the include host list contains host name, after restarting namenodes, the datanodes registrant is denied by namenodes. 
This is because after the namenode restarts, a still-alive datanode re-registers with the namenode and identifies itself by its *IP address*. The namenode, however, only admits hosts that appear in its include list, and those entries are all hostnames, so it denies the datanode's registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
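The mismatch Tanping describes is essentially a hostname/IP normalization problem. The sketch below is a hypothetical illustration of the idea behind the fix, not the actual FSNamesystem code: reverse-resolve the address a datanode registered with before matching it against an include list written with hostnames (`IncludeListCheck` and `inIncludeList` are invented names).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;

public class IncludeListCheck {
    // Returns true if the registering node matches the include list either by
    // the literal address it reported or by its reverse-resolved hostname.
    // An empty include list means "allow everyone", as in HDFS.
    static boolean inIncludeList(Set<String> includeHosts, String nodeAddr) {
        if (includeHosts.isEmpty() || includeHosts.contains(nodeAddr)) {
            return true;
        }
        try {
            // Resolve the address so an include list written with hostnames
            // still matches a node that registered by IP.
            String hostname = InetAddress.getByName(nodeAddr).getCanonicalHostName();
            return includeHosts.contains(hostname);
        } catch (UnknownHostException e) {
            // Unresolvable and not literally listed: deny.
            return false;
        }
    }
}
```

Without the resolution step, the `contains(nodeAddr)` check alone is exactly the failure mode described above: an IP never matches a hostname entry.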
[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054130#comment-13054130 ] Todd Lipcon commented on HDFS-1900: --- Are you using forrest 0.8 or 0.9? I think I've seen this problem using forrest 0.9 which is fairly new and apparently has some kind of problem with our build environment. Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Remove configuration object reference in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054134#comment-13054134 ] Bharath Mundlapudi commented on HDFS-2092: -- Also, existing unit tests should cover this path, so I haven't added new unit tests. Remove configuration object reference in DFSClient -- Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2086) If the include hosts list contains host names, after restarting the namenode, datanode registration is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-2086: --- Resolution: Fixed Status: Resolved (was: Patch Available) If the include hosts list contains host names, after restarting the namenode, datanode registration is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes: if the include hosts list contains hostnames, then after restarting the namenode, datanode registration is denied by the namenode. This is because after the namenode restarts, a still-alive datanode re-registers with the namenode and identifies itself by its *IP address*. The namenode, however, only admits hosts that appear in its include list, and those entries are all hostnames, so it denies the datanode's registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2086) If the include hosts list contains host names, after restarting the namenode, datanode registration is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054147#comment-13054147 ] Hudson commented on HDFS-2086: -- Integrated in Hadoop-Hdfs-trunk-Commit #755 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/755/]) HDFS-2086. If the include hosts list contains host names, after restarting namenode, data nodes registration is denied. Contributed by Tanping Wang. tanping : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1139090 Files : * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStartup.java If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes the problem: if the include host list contains host name, after restarting namenodes, the datanodes registrant is denied by namenodes. This is because after namenode is restarted, the still alive data node will try to register itself with the namenode and it identifies itself with its *IP address*. However, namenode only allows all the hosts in its hosts list to registrant and all of them are hostnames. So namenode would deny the datanode registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Remove configuration object reference in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054151#comment-13054151 ] Hudson commented on HDFS-2092: -- Integrated in Hadoop-Hdfs-trunk-Commit #756 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/756/]) HDFS-2092. Remove some object references to Configuration in DFSClient. Contributed by Bharath Mundlapudi szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1139097 Files : * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java Remove configuration object reference in DFSClient -- Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores reference to configuration object. Since, these configuration objects are pretty big at times can blot the processes which has multiple DFSClient objects like in TaskTracker. This is an attempt to remove the reference of conf object in DFSClient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Remove configuration object reference in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054159#comment-13054159 ] Eli Collins commented on HDFS-2092: --- Does this change mean that a Configuration object can now be freed because there's one fewer ref to it? Otherwise it seems like we're now allocating a DFSClient#conf in addition to a Configuration object, which increases overall memory usage, no? Remove configuration object reference in DFSClient -- Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-2104) 1073: Add a flag to 2NN to format its checkpoint dirs on startup
1073: Add a flag to 2NN to format its checkpoint dirs on startup Key: HDFS-2104 URL: https://issues.apache.org/jira/browse/HDFS-2104 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Todd Lipcon -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1125) Removing a datanode (failed or decommissioned) should not require a namenode restart
[ https://issues.apache.org/jira/browse/HDFS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054175#comment-13054175 ] Allen Wittenauer commented on HDFS-1125: The problem still seems to be present in 0.20.203, so I'm guessing no, the problem hasn't been fixed by HDFS-1773. How I tested:
a) create a grid with 203, filling in dfs.hosts
b) populate it with data
c) put host in dfs.exclude
d) -refreshNodes, verify host is in decom'ing nodes
e) let decom process finish
f) host now shows up in dead
g) remove host from dfs.host and dfs.exclude
h) -refreshNodes
i) node is still listed as dead by nn
j) kill DataNode process
k) node is still listed as dead by nn
l) 10 mins later, still listed...
Removing a datanode (failed or decommissioned) should not require a namenode restart Key: HDFS-1125 URL: https://issues.apache.org/jira/browse/HDFS-1125 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.20.2 Reporter: Alex Loddengaard Priority: Blocker I've heard of several Hadoop users using dfsadmin -report to monitor the number of dead nodes, and alert if that number is not 0. This mechanism tends to work pretty well, except when a node is decommissioned or fails, because then the namenode requires a restart for said node to be entirely removed from HDFS. More details here: http://markmail.org/search/?q=decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode#query:decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode+page:1+mid:7gwqwdkobgfuszb4+state:results Removal from the exclude file and a refresh should get rid of the dead node. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Remove configuration object reference in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054176#comment-13054176 ] Bharath Mundlapudi commented on HDFS-2092: -- Hi Eli, Does this change mean that a Configuration object can now be freed because there's one fewer ref to it? Yes, that is the direction of this patch. Eventually, we will pass around only DFSClient#conf, or only the required parameters, to the downstream code. That will be a big change and needs broader discussion. But you are right: the idea is to stop holding references to the conf object coming from the users. We want to let client code decide the scope of the conf object. Regarding memory, a few [key, value] pairs will be copied into DFSClient, which then frees the bloated conf object for the GC. That will be a big win on memory. Remove configuration object reference in DFSClient -- Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
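The "copy a few [key, value] pairs" idea discussed above can be sketched roughly as follows. This is an illustrative mock, not the actual HDFS-2092 patch: the class name, keys, and defaults are assumptions, and a plain Map stands in for Hadoop's Configuration.

```java
import java.util.Map;

// A small immutable holder populated once at construction, so the client
// keeps a handful of primitives instead of a reference to the full conf.
public class DfsClientConf {
    final int ioBufferSize;
    final long blockSize;
    final short replication;

    DfsClientConf(Map<String, String> conf) {
        this.ioBufferSize = Integer.parseInt(conf.getOrDefault("io.file.buffer.size", "4096"));
        this.blockSize = Long.parseLong(conf.getOrDefault("dfs.blocksize", "67108864"));
        this.replication = Short.parseShort(conf.getOrDefault("dfs.replication", "3"));
        // No reference to `conf` is retained past the constructor, so the
        // large configuration object becomes eligible for GC once the
        // caller drops its own reference.
    }
}
```

This is the trade-off Eli raised: a few extra primitive fields per client, in exchange for not pinning a potentially multi-hundred-kilobyte Configuration for the client's lifetime.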
[jira] [Updated] (HDFS-2087) Add methods to DataTransferProtocol interface
[ https://issues.apache.org/jira/browse/HDFS-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2087: - Resolution: Fixed Fix Version/s: 0.23.0 Release Note: Declare methods in DataTransferProtocol interface, and change Sender and Receiver to implement the interface. Hadoop Flags: [Incompatible change, Reviewed] (was: [Incompatible change]) Status: Resolved (was: Patch Available) Thanks Tanping for reviewing it. I have committed this. Add methods to DataTransferProtocol interface - Key: HDFS-2087 URL: https://issues.apache.org/jira/browse/HDFS-2087 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h2087_20110620.patch, h2087_20110621.patch, h2087_20110621b.patch The {{DataTransferProtocol}} interface is currently empty. The {{Sender}} and {{Receiver}} define similar methods individually. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-2105) Remove the references to configuration object from the DFSClient library.
Remove the references to configuration object from the DFSClient library. - Key: HDFS-2105 URL: https://issues.apache.org/jira/browse/HDFS-2105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 This is an umbrella jira to track removing the references to the conf object in the DFSClient library. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2087) Add methods to DataTransferProtocol interface
[ https://issues.apache.org/jira/browse/HDFS-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054190#comment-13054190 ] Hudson commented on HDFS-2087: -- Integrated in Hadoop-Hdfs-trunk-Commit #757 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/757/]) HDFS-2087. Declare methods in DataTransferProtocol interface, and change Sender and Receiver to implement the interface. szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1139124 Files : * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/protocol/datatransfer/Sender.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/DFSTestUtil.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/protocol/datatransfer/Receiver.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/protocol/datatransfer/DataTransferProtocol.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/BlockReader.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDataTransferProtocol.java * /hadoop/common/trunk/hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/DataTransferProtocolAspects.aj * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java Add methods to DataTransferProtocol 
interface - Key: HDFS-2087 URL: https://issues.apache.org/jira/browse/HDFS-2087 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h2087_20110620.patch, h2087_20110621.patch, h2087_20110621b.patch The {{DataTransferProtocol}} interface is currently empty. The {{Sender}} and {{Receiver}} define similar methods individually. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1480) All replicas for a block with repl=2 end up in same rack
[ https://issues.apache.org/jira/browse/HDFS-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1480: -- Attachment: hdfs-1480-test.txt Here's a test which fails after you loop it a few times. I added some debug log messages and could see that maxNodesPerRack is getting set to 4. All replicas for a block with repl=2 end up in same rack Key: HDFS-1480 URL: https://issues.apache.org/jira/browse/HDFS-1480 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.20.2 Reporter: T Meyarivan Attachments: hdfs-1480-test.txt It appears that all replicas of a block can end up in the same rack. The likelihood of such replicas seems to be directly related to decommissioning of nodes. Post rolling OS upgrade (decommission 3-10% of nodes, re-install etc, add them back) of a running cluster, all replicas of about 0.16% of blocks ended up in the same rack. Hadoop Namenode UI etc doesn't seem to know about such incorrectly replicated blocks. hadoop fsck .. does report that the blocks must be replicated on additional racks. 
Looking at ReplicationTargetChooser.java, the following seem suspect:
snippet-01:
{code}
int maxNodesPerRack = (totalNumOfReplicas-1)/clusterMap.getNumOfRacks()+2;
{code}
snippet-02:
{code}
case 2:
  if (clusterMap.isOnSameRack(results.get(0), results.get(1))) {
    chooseRemoteRack(1, results.get(0), excludedNodes, blocksize,
                     maxNodesPerRack, results);
  } else if (newBlock) {
    chooseLocalRack(results.get(1), excludedNodes, blocksize,
                    maxNodesPerRack, results);
  } else {
    chooseLocalRack(writer, excludedNodes, blocksize,
                    maxNodesPerRack, results);
  }
  if (--numOfReplicas == 0) {
    break;
  }
{code}
snippet-03:
{code}
do {
  DatanodeDescriptor[] selectedNodes = chooseRandom(1, nodes, excludedNodes);
  if (selectedNodes.length == 0) {
    throw new NotEnoughReplicasException(
        "Not able to place enough replicas");
  }
  result = (DatanodeDescriptor)(selectedNodes[0]);
} while (!isGoodTarget(result, blocksize, maxNodesPerRack, results));
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
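To see why snippet-01 looks suspect, plug in the common case of repl=2 on a two-rack cluster: (2-1)/2 + 2 = 2, so the per-rack cap permits both replicas on a single rack and does nothing by itself to force rack diversity. A quick check of the arithmetic (the class name is just for illustration):

```java
public class ReplicaCap {
    // The per-rack cap from snippet-01 in ReplicationTargetChooser.
    static int maxNodesPerRack(int totalNumOfReplicas, int numOfRacks) {
        return (totalNumOfReplicas - 1) / numOfRacks + 2;
    }
}
```

Note that the cap is always at least 2, and for repl=2 it equals the replica count, so the cap can never be the constraint that spreads two replicas across racks.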
[jira] [Commented] (HDFS-2092) Create a light inner conf class in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054224#comment-13054224 ] Aaron T. Myers commented on HDFS-2092: -- If I read that right, we're talking about a change that at the 99th percentile saves at most 386kb? I'm skeptical that those modest savings warrant this change. Also, how exactly were these gains measured, and in what unit can we expect these memory savings - per TT? Per DFSClient instance? Create a light inner conf class in DFSClient Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. This patch creates a light inner conf class and copies the required keys from the Configuration object. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1480) All replicas for a block with repl=2 end up in same rack
[ https://issues.apache.org/jira/browse/HDFS-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054231#comment-13054231 ] Todd Lipcon commented on HDFS-1480: --- Sorry, I think the above test actually fails because it will sometimes decommission all of the nodes on one of the test racks. But, if you bump it up to have 3 nodes in each rack, you'll see the new code path from HDFS-15 get triggered. -- you can see it first re-replicate the block to be all on one host, and then after it gets the addStoredBlock calls, it notices it's not on enough racks, re-replicates elsewhere, and eventually the random choice gets it on the right one. All replicas for a block with repl=2 end up in same rack Key: HDFS-1480 URL: https://issues.apache.org/jira/browse/HDFS-1480 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.20.2 Reporter: T Meyarivan Attachments: hdfs-1480-test.txt It appears that all replicas of a block can end up in the same rack. The likelihood of such replicas seems to be directly related to decommissioning of nodes. Post rolling OS upgrade (decommission 3-10% of nodes, re-install etc, add them back) of a running cluster, all replicas of about 0.16% of blocks ended up in the same rack. Hadoop Namenode UI etc doesn't seem to know about such incorrectly replicated blocks. hadoop fsck .. does report that the blocks must be replicated on additional racks.
Looking at ReplicationTargetChooser.java, the following seem suspect:
snippet-01:
{code}
int maxNodesPerRack = (totalNumOfReplicas-1)/clusterMap.getNumOfRacks()+2;
{code}
snippet-02:
{code}
case 2:
  if (clusterMap.isOnSameRack(results.get(0), results.get(1))) {
    chooseRemoteRack(1, results.get(0), excludedNodes, blocksize,
                     maxNodesPerRack, results);
  } else if (newBlock) {
    chooseLocalRack(results.get(1), excludedNodes, blocksize,
                    maxNodesPerRack, results);
  } else {
    chooseLocalRack(writer, excludedNodes, blocksize,
                    maxNodesPerRack, results);
  }
  if (--numOfReplicas == 0) {
    break;
  }
{code}
snippet-03:
{code}
do {
  DatanodeDescriptor[] selectedNodes = chooseRandom(1, nodes, excludedNodes);
  if (selectedNodes.length == 0) {
    throw new NotEnoughReplicasException(
        "Not able to place enough replicas");
  }
  result = (DatanodeDescriptor)(selectedNodes[0]);
} while (!isGoodTarget(result, blocksize, maxNodesPerRack, results));
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054243#comment-13054243 ] Aaron T. Myers commented on HDFS-1900: -- That's definitely Forrest 0.9. See: HADOOP-7394 Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Create a light inner conf class in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054245#comment-13054245 ] Bharath Mundlapudi commented on HDFS-2092: -- Hi Aaron, That was just a one-day sample of measurements; what we should care about here is the MAX. Also, going forward, PIG 0.9 will store lots of metadata in the conf, and one can even embed the PIG script itself in the conf. This can potentially blow up the TT. We can measure the approximate size of a conf from the job.xml file in the job history location. Since one can store anything in the job conf, we should be careful with references to this object - we should not hold them for long durations. Create a light inner conf class in DFSClient Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. This patch creates a light inner conf class and copies the required keys from the Configuration object. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira