[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13053668#comment-13053668 ] Abel Perez commented on HDFS-1900: -- Hey Aaron, I'm trying to run the Ant test-patch task but I'm not sure what args I should be passing the task. Can you provide me with a sample command for the sh script or Ant task? thanks, - Abel Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13053673#comment-13053673 ] Eli Collins commented on HDFS-1900: --- Hey Abel, I use the following bash function: function ant-test-patch() { ant -Dpatch.file=$1 \ -Dforrest.home=$FORREST_HOME \ -Dfindbugs.home=$FINDBUGS_HOME \ -Djava5.home=$JAVA5_HOME \ test-patch } and apache-forrest-0.8 and findbugs-1.3.9. More at https://github.com/elicollins/hadoop-dev/blob/master/bin/hadoop-alias.sh Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13053673#comment-13053673 ] Eli Collins edited comment on HDFS-1900 at 6/23/11 6:31 AM: Hey Abel, I use the following bash function: {code} function ant-test-patch() { ant -Dpatch.file=$1 \ -Dforrest.home=$FORREST_HOME \ -Dfindbugs.home=$FINDBUGS_HOME \ -Djava5.home=$JAVA5_HOME \ test-patch } {code} and apache-forrest-0.8 and findbugs-1.3.9. More at https://github.com/elicollins/hadoop-dev/blob/master/bin/hadoop-alias.sh was (Author: eli): Hey Abel, I use the following bash function: function ant-test-patch() { ant -Dpatch.file=$1 \ -Dforrest.home=$FORREST_HOME \ -Dfindbugs.home=$FINDBUGS_HOME \ -Djava5.home=$JAVA5_HOME \ test-patch } and apache-forrest-0.8 and findbugs-1.3.9. More at https://github.com/elicollins/hadoop-dev/blob/master/bin/hadoop-alias.sh Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-2103) Read lock must be released before acquiring a write lock
[ https://issues.apache.org/jira/browse/HDFS-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Mundlapudi resolved HDFS-2103. -- Resolution: Not A Problem Didn't notice the finally block, where the read lock is released. I am closing this Jira. Read lock must be released before acquiring a write lock Key: HDFS-2103 URL: https://issues.apache.org/jira/browse/HDFS-2103 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 In the FSNamesystem.getBlockLocationsUpdateTimes function, we have the following code:
{code}
for (int attempt = 0; attempt < 2; attempt++) {
  if (attempt == 0) { // first attempt is with readlock
    readLock();
  } else { // second attempt is with write lock
    writeLock(); // writelock is needed to set accesstime
  }
  ...
  if (attempt == 0) {
    continue;
  }
{code}
In the above code, the readLock is acquired in attempt 0, and if execution enters the continue block, it tries to acquire the writeLock before releasing the readLock. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
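The pattern discussed above can be sketched outside HDFS with java.util.concurrent locks. This is a hypothetical, self-contained illustration, not the actual FSNamesystem code (the class, field, and return value are made up): the finally block releases the read lock even when the loop continues, so the write lock on the second attempt is only taken after the read lock is gone.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class TwoAttemptLocking {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean needsAccessTimeUpdate = true; // forces the second attempt

    String getBlockLocations() {
        for (int attempt = 0; attempt < 2; attempt++) {
            boolean isRead = (attempt == 0);
            if (isRead) {
                lock.readLock().lock();   // first attempt: read lock only
            } else {
                lock.writeLock().lock();  // second attempt: write lock
            }
            try {
                if (isRead && needsAccessTimeUpdate) {
                    continue;             // finally still runs here
                }
                return "locations";       // stand-in for the real result
            } finally {
                if (isRead) {
                    lock.readLock().unlock();  // the release the reporter missed
                } else {
                    lock.writeLock().unlock();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String r = new TwoAttemptLocking().getBlockLocations();
        if (!"locations".equals(r)) throw new AssertionError(r);
        System.out.println("read lock released before write lock was taken");
    }
}
```

Had the unlock not been in a finally block, the single thread above would deadlock on the write lock, since ReentrantReadWriteLock does not support upgrading a read lock to a write lock.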
[jira] [Updated] (HDFS-2102) 1073 Zero pad edits filename to make them lexically sortable
[ https://issues.apache.org/jira/browse/HDFS-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HDFS-2102: - Attachment: HDFS-2102.diff 1073 Zero pad edits filename to make them lexically sortable Key: HDFS-2102 URL: https://issues.apache.org/jira/browse/HDFS-2102 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: Edit log branch (HDFS-1073) Attachments: HDFS-2102.diff Zero pad the edit log filenames so they appear in the correct order when you ls on the filesystem. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2102) 1073 Zero pad edits filename to make them lexically sortable
[ https://issues.apache.org/jira/browse/HDFS-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HDFS-2102: - Status: Patch Available (was: Open) 1073 Zero pad edits filename to make them lexically sortable Key: HDFS-2102 URL: https://issues.apache.org/jira/browse/HDFS-2102 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: Edit log branch (HDFS-1073) Attachments: HDFS-2102.diff Zero pad the edit log filenames so they appear in the correct order when you ls on the filesystem. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
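The motivation for the padding is that plain lexical sort misorders numeric suffixes: "edits_100" sorts before "edits_9". A minimal sketch (the 19-digit width and name format here are illustrative, not necessarily what the branch uses):

```java
import java.util.Arrays;

class ZeroPadDemo {
    // Pad the transaction id to a fixed width so that lexical order
    // (what ls gives you) matches numeric order.
    static String editsName(long txid) {
        return String.format("edits_%019d", txid); // 19 digits covers Long.MAX_VALUE
    }

    public static void main(String[] args) {
        String[] unpadded = { "edits_9", "edits_10", "edits_100" };
        Arrays.sort(unpadded);
        // Lexically, "edits_10" and "edits_100" sort before "edits_9".
        System.out.println(Arrays.toString(unpadded));

        String[] padded = { editsName(9), editsName(10), editsName(100) };
        Arrays.sort(padded);
        // With padding, lexical sort matches numeric order: 9, 10, 100.
        System.out.println(Arrays.toString(padded));
    }
}
```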
[jira] [Updated] (HDFS-2018) Move all journal stream management code into one place
[ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HDFS-2018: - Status: Open (was: Patch Available) Move all journal stream management code into one place -- Key: HDFS-2018 URL: https://issues.apache.org/jira/browse/HDFS-2018 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: Edit log branch (HDFS-1073) Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff Currently in the HDFS-1073 branch, the code for creating output streams is in FileJournalManager and the code for input streams is in the inspectors. This change does a number of things. - Input and Output streams are now created by the JournalManager. - FSImageStorageInspectors now deals with URIs when referring to edit logs - Recovery of inprogress logs is performed by counting the number of transactions instead of looking at the length of the file. The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2018) Move all journal stream management code into one place
[ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Kelly updated HDFS-2018: - Attachment: HDFS-2018.diff Added RemoteEditLogManifest stuff to allow it to take segments from different journals. Move all journal stream management code into one place -- Key: HDFS-2018 URL: https://issues.apache.org/jira/browse/HDFS-2018 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: Edit log branch (HDFS-1073) Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff Currently in the HDFS-1073 branch, the code for creating output streams is in FileJournalManager and the code for input streams is in the inspectors. This change does a number of things. - Input and Output streams are now created by the JournalManager. - FSImageStorageInspectors now deals with URIs when referring to edit logs - Recovery of inprogress logs is performed by counting the number of transactions instead of looking at the length of the file. The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-2096) Mavenization of hadoop-hdfs
[ https://issues.apache.org/jira/browse/HDFS-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur reassigned HDFS-2096: Assignee: Alejandro Abdelnur Mavenization of hadoop-hdfs --- Key: HDFS-2096 URL: https://issues.apache.org/jira/browse/HDFS-2096 Project: Hadoop HDFS Issue Type: Task Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Same as HADOOP-6671 for hdfs -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1574) HDFS cannot be browsed from web UI while in safe mode
[ https://issues.apache.org/jira/browse/HDFS-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053916#comment-13053916 ] Todd Lipcon commented on HDFS-1574: --- I think HDFS should be browsable in safe mode -- perhaps in such a way that we just log a warning that delegation tokens will be non-persistent. HDFS cannot be browsed from web UI while in safe mode - Key: HDFS-1574 URL: https://issues.apache.org/jira/browse/HDFS-1574 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Priority: Blocker Labels: newbie As of HDFS-984, the NN does not issue delegation tokens while in safe mode (since it would require writing to the edit log). But the browsedfscontent servlet relies on getting a delegation token before redirecting to a random DN to browse the FS. Thus, the "browse the filesystem" link does not work while the NN is in safe mode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1321) If service port and main port are the same, there is no clear log message explaining the issue.
[ https://issues.apache.org/jira/browse/HDFS-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053969#comment-13053969 ] Hadoop QA commented on HDFS-1321: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12483619/HDFS-1321.patch against trunk revision 1138645. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/828//console This message is automatically generated. If service port and main port are the same, there is no clear log message explaining the issue. --- Key: HDFS-1321 URL: https://issues.apache.org/jira/browse/HDFS-1321 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: gary murry Assignee: Jim Plush Priority: Minor Labels: newbie Fix For: 0.23.0 Attachments: HDFS-1321.patch With the introduction of a service port to the namenode, there is now a chance for user error to set the two ports equal. This will cause the namenode to fail to start up. It would be nice if there was a log message explaining the port clash. Or just treat things as if the service port was not specified. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1606) Provide a stronger data guarantee in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053999#comment-13053999 ] Tsz Wo (Nicholas), SZE commented on HDFS-1606: -- Koji Noguchi has also provided a lot of input on this. Sorry that I failed to mention it in the [acknowledgement|https://issues.apache.org/jira/browse/HDFS-1606?focusedCommentId=13018958&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13018958]. Provide a stronger data guarantee in the write pipeline --- Key: HDFS-1606 URL: https://issues.apache.org/jira/browse/HDFS-1606 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client, name-node Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1606_20110210.patch, h1606_20110211.patch, h1606_20110217.patch, h1606_20110228.patch, h1606_20110404.patch, h1606_20110405.patch, h1606_20110405b.patch, h1606_20110406.patch, h1606_20110406b.patch, h1606_20110407.patch, h1606_20110407b.patch, h1606_20110407c.patch, h1606_20110408.patch, h1606_20110408b.patch In the current design, if there is a datanode/network failure in the write pipeline, DFSClient will try to remove the failed datanode from the pipeline and then continue writing with the remaining datanodes. As a result, the number of datanodes in the pipeline is decreased. Unfortunately, it is possible that DFSClient may incorrectly remove a healthy datanode but leave the failed datanode in the pipeline, because failure detection may be inaccurate under erroneous conditions. We propose a new mechanism for adding new datanodes to the pipeline in order to provide a stronger data guarantee. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
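The proposal above adds replacement datanodes when failures shrink the pipeline, instead of only dropping nodes. The committed policy lives in the client code and configuration; the following is only an illustrative sketch of the kind of decision involved, with made-up method names and thresholds:

```java
class ReplaceDatanodePolicy {
    // Decide whether to ask the namenode for a replacement datanode after a
    // pipeline failure. Illustrative rule (not HDFS's actual policy): replace
    // when fewer than two nodes remain, or when the pipeline has dropped
    // below half of the target replication.
    static boolean shouldAddReplacement(int targetReplication, int remaining) {
        if (remaining >= targetReplication) {
            return false; // nothing failed, pipeline is full
        }
        return remaining < 2 || remaining * 2 < targetReplication;
    }

    public static void main(String[] args) {
        // replication 3, one node failed: 2 remain -> keep writing with 2
        System.out.println(shouldAddReplacement(3, 2));
        // replication 3, two nodes failed: 1 remains -> ask for a replacement
        System.out.println(shouldAddReplacement(3, 1));
    }
}
```

The point of such a policy is exactly the data guarantee described in the issue: rather than letting an inaccurate failure detector silently shrink the pipeline, the client restores the pipeline width before continuing to write.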
[jira] [Closed] (HDFS-2103) Read lock must be released before acquiring a write lock
[ https://issues.apache.org/jira/browse/HDFS-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins closed HDFS-2103. - Read lock must be released before acquiring a write lock Key: HDFS-2103 URL: https://issues.apache.org/jira/browse/HDFS-2103 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 In the FSNamesystem.getBlockLocationsUpdateTimes function, we have the following code:
{code}
for (int attempt = 0; attempt < 2; attempt++) {
  if (attempt == 0) { // first attempt is with readlock
    readLock();
  } else { // second attempt is with write lock
    writeLock(); // writelock is needed to set accesstime
  }
  ...
  if (attempt == 0) {
    continue;
  }
{code}
In the above code, the readLock is acquired in attempt 0, and if execution enters the continue block, it tries to acquire the writeLock before releasing the readLock. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1321) If service port and main port are the same, there is no clear log message explaining the issue.
[ https://issues.apache.org/jira/browse/HDFS-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Plush updated HDFS-1321: Attachment: HDFS-1321-take2.txt Did a hard reset and re-applied the changes to try to get a cleaner patch file. If service port and main port are the same, there is no clear log message explaining the issue. --- Key: HDFS-1321 URL: https://issues.apache.org/jira/browse/HDFS-1321 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: gary murry Assignee: Jim Plush Priority: Minor Labels: newbie Fix For: 0.23.0 Attachments: HDFS-1321-take2.txt, HDFS-1321.patch With the introduction of a service port to the namenode, there is now a chance for user error to set the two ports equal. This will cause the namenode to fail to start up. It would be nice if there was a log message explaining the port clash. Or just treat things as if the service port was not specified. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054011#comment-13054011 ] Abel Perez commented on HDFS-1900: -- Hey Eli, thanks for the function. Not sure if my environment is properly set up; I tried running test-patch and got the following error:
{code}
[exec] Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/fop/messaging/MessageHandler
[exec]     at org.apache.cocoon.serialization.FOPSerializer.configure(FOPSerializer.java:122)
[exec]     at org.apache.avalon.framework.container.ContainerUtil.configure(ContainerUtil.java:201)
[exec]     at org.apache.avalon.excalibur.component.DefaultComponentFactory.newInstance(DefaultComponentFactory.java:289)
[exec]     at org.apache.avalon.excalibur.pool.InstrumentedResourceLimitingPool.newPoolable(InstrumentedResourceLimitingPool.java:655)
[exec]     at org.apache.avalon.excalibur.pool.InstrumentedResourceLimitingPool.get(InstrumentedResourceLimitingPool.java:371)
[exec]     at org.apache.avalon.excalibur.component.PoolableComponentHandler.doGet(PoolableComponentHandler.java:198)
[exec]     at org.apache.avalon.excalibur.component.ComponentHandler.get(ComponentHandler.java:381)
[exec]     at org.apache.avalon.excalibur.component.ExcaliburComponentSelector.select(ExcaliburComponentSelector.java:215)
[exec]     at org.apache.cocoon.components.ExtendedComponentSelector.select(ExtendedComponentSelector.java:268)
[exec]     at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.setSerializer(AbstractProcessingPipeline.java:311)
[exec]     at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.setSerializer(AbstractCachingProcessingPipeline.java:171)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.SerializeNode.invoke(SerializeNode.java:120)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.SelectNode.invoke(SelectNode.java:103)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:47)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:131)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:93)
[exec]     at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:235)
[exec]     at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:177)
[exec]     at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:254)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:118)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.SelectNode.invoke(SelectNode.java:98)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:143)
[exec]     at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:69)
[exec]     at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:93)
[exec]     at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:235)
[exec]     at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:177)
[exec]     at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:254)
[exec]     at org.apache.cocoon.Cocoon.process(Cocoon.java:699)
[exec]     at org.apache.cocoon.bean.CocoonWrapper.getPage(CocoonWrapper.java:514)
[exec]     at org.apache.cocoon.bean.CocoonBean.processTarget(CocoonBean.java:499)
[exec]     at org.apache.cocoon.bean.CocoonBean.process(CocoonBean.java:356)
[exec]     at org.apache.cocoon.Main.main(Main.java:321)
{code}
[jira] [Updated] (HDFS-2086) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-2086: --- Status: Open (was: Patch Available) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.patch As the title describes: if the include hosts list contains hostnames, then after the namenode is restarted, datanode registration is denied. This is because after the namenode restarts, the still-alive datanodes try to register themselves with the namenode, identifying themselves by *IP address*. However, the namenode only admits hosts that appear in its include list, and those entries are all hostnames, so the namenode denies the datanode registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
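One way to resolve the mismatch described above is to compare registrations against resolved addresses as well as the raw include-list strings. The following is a hedged sketch, not the actual patch: the class name is hypothetical, and a resolver map is injected in place of real DNS lookups (in real code that would be something like InetAddress.getByName(host).getHostAddress()).

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class HostsFileChecker {
    private final Set<String> allowed = new HashSet<>();

    // resolver maps hostname -> IP; injected so the sketch needs no network.
    HostsFileChecker(Set<String> includeList, Map<String, String> resolver) {
        for (String host : includeList) {
            allowed.add(host);       // accept the name itself
            String ip = resolver.get(host);
            if (ip != null) {
                allowed.add(ip);     // and its resolved IP address
            }
        }
    }

    // A datanode re-registering after a namenode restart identifies itself
    // by IP address; with IPs pre-resolved, it is still admitted.
    boolean isAllowed(String nodeId) {
        return allowed.contains(nodeId);
    }

    public static void main(String[] args) {
        HostsFileChecker checker = new HostsFileChecker(
            java.util.Collections.singleton("dn1.example.com"),
            java.util.Collections.singletonMap("dn1.example.com", "10.0.0.1"));
        System.out.println(checker.isAllowed("10.0.0.1"));  // admitted via resolution
        System.out.println(checker.isAllowed("10.0.0.2"));  // not in the list
    }
}
```

Resolving at load time rather than at registration time also keeps the hot registration path free of DNS latency, at the cost of going stale if a host's address changes before the list is re-read.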
[jira] [Commented] (HDFS-1321) If service port and main port are the same, there is no clear log message explaining the issue.
[ https://issues.apache.org/jira/browse/HDFS-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054032#comment-13054032 ] Hadoop QA commented on HDFS-1321: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12483625/HDFS-1321-take2.txt against trunk revision 1138645. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/829//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/829//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/829//console This message is automatically generated. If service port and main port are the same, there is no clear log message explaining the issue. --- Key: HDFS-1321 URL: https://issues.apache.org/jira/browse/HDFS-1321 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: gary murry Assignee: Jim Plush Priority: Minor Labels: newbie Fix For: 0.23.0 Attachments: HDFS-1321-take2.txt, HDFS-1321.patch With the introduction of a service port to the namenode, there is now a chance for user error to set the two ports equal. This will cause the namenode to fail to start up. It would be nice if there was a log message explaining the port clash. 
Or just treat things as if the service port was not specified. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2052) FSNamesystem should not sync the log with the write lock held
[ https://issues.apache.org/jira/browse/HDFS-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054073#comment-13054073 ] Suresh Srinivas commented on HDFS-2052: --- "but that would drop the audit logging so probably easier to pass a boolean that indicates whether to sync" It does not. Logging for delete is done in FSDirectory#delete(). In this case, you are not syncing the audit log, and it is done along with the editlog that logs file creation. FSNamesystem should not sync the log with the write lock held - Key: HDFS-2052 URL: https://issues.apache.org/jira/browse/HDFS-2052 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins FSNamesystem#deleteInternal releases the write lock before syncing the log; however, FSNamesystem#startFileInternal calls delete -> deleteInternal with the write lock held, which means deleteInternal will sync the log while holding the lock. We could fix cases like this by passing a flag indicating whether the function should sync (e.g. in this case the sync is not necessary because startFileInternal's callers will sync the log), or by modifying the current calls to sync to flag that a sync is necessary before returning to the caller rather than doing the sync right at the call site. This way the cost of syncing the log could be amortized over multiple function calls (and potentially multiple RPCs, if we didn't mind introducing some synchronization). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2052) FSNamesystem should not sync the log with the write lock held
[ https://issues.apache.org/jira/browse/HDFS-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054081#comment-13054081 ] Eli Collins commented on HDFS-2052: --- It would drop the audit logging in FSNamesystem#delete; I don't see audit logging in FSDirectory#delete. FSNamesystem should not sync the log with the write lock held - Key: HDFS-2052 URL: https://issues.apache.org/jira/browse/HDFS-2052 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins FSNamesystem#deleteInternal releases the write lock before syncing the log; however, FSNamesystem#startFileInternal calls delete -> deleteInternal with the write lock held, which means deleteInternal will sync the log while holding the lock. We could fix cases like this by passing a flag indicating whether the function should sync (e.g. in this case the sync is not necessary because startFileInternal's callers will sync the log), or by modifying the current calls to sync to flag that a sync is necessary before returning to the caller rather than doing the sync right at the call site. This way the cost of syncing the log could be amortized over multiple function calls (and potentially multiple RPCs, if we didn't mind introducing some synchronization). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
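The deferred-sync idea in the issue description can be sketched in miniature: operations append edits while holding the write lock, and the caller performs the slow sync only after releasing it. All names below are hypothetical stand-ins for the FSNamesystem/FSEditLog structure, with a simple counter in place of a real edit log.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class DeferredSyncSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int pendingEdits = 0; // stand-in for the unsynced edit-log tail
    private int syncedEdits = 0;  // stand-in for what has been flushed to disk

    // Internal op: appends an edit but does NOT sync; the caller decides when.
    private void deleteInternal(String path) {
        pendingEdits++; // stand-in for logEdit(OP_DELETE, path)
    }

    // Slow flush to stable storage; done outside the write lock so other
    // namespace operations are not blocked while it runs.
    private void logSync() {
        syncedEdits = pendingEdits;
    }

    void delete(String path) {
        lock.writeLock().lock();
        try {
            deleteInternal(path); // no sync while the lock is held
        } finally {
            lock.writeLock().unlock();
        }
        logSync(); // sync after releasing the write lock
    }

    int getSyncedEdits() {
        return syncedEdits;
    }

    public static void main(String[] args) {
        DeferredSyncSketch fs = new DeferredSyncSketch();
        fs.delete("/tmp/a");
        fs.delete("/tmp/b");
        System.out.println(fs.getSyncedEdits()); // both edits synced, lock-free
    }
}
```

Because logSync here flushes everything pending, a caller that performs several internal operations under one lock acquisition pays for a single sync, which is the amortization the description is after.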
[jira] [Updated] (HDFS-2086) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-2086: --- Attachment: HDFS-2086.7.patch Address the review comments. If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes: if the include hosts list contains hostnames, then after the namenode is restarted, datanode registration is denied. This is because after the namenode restarts, the still-alive datanodes try to register themselves with the namenode, identifying themselves by *IP address*. However, the namenode only admits hosts that appear in its include list, and those entries are all hostnames, so the namenode denies the datanode registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2086) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-2086: --- Status: Patch Available (was: Open) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes: if the include hosts list contains hostnames, then after the namenode is restarted, datanode registration is denied. This is because after the namenode restarts, the still-alive datanodes try to register themselves with the namenode, identifying themselves by *IP address*. However, the namenode only admits hosts that appear in its include list, and those entries are all hostnames, so the namenode denies the datanode registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2086) If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054128#comment-13054128 ] Hadoop QA commented on HDFS-2086: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12483638/HDFS-2086.7.patch against trunk revision 1138645. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/830//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/830//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/830//console This message is automatically generated. If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes the problem: if the include host list contains host name, after restarting namenodes, the datanodes registrant is denied by namenodes. 
This is because after the namenode restarts, a still-alive datanode re-registers with the namenode and identifies itself by its *IP address*. The namenode, however, only admits hosts that appear in its include list, and those entries are all hostnames, so it denies the datanode's registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
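The mismatch Tanping describes is essentially a hostname/IP normalization problem. The sketch below is a hypothetical illustration of the idea behind the fix, not the actual FSNamesystem code: reverse-resolve the address a datanode registered with before matching it against an include list written with hostnames (`IncludeListCheck` and `inIncludeList` are invented names).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;

public class IncludeListCheck {
    // Returns true if the registering node matches the include list either by
    // the literal address it reported or by its reverse-resolved hostname.
    // An empty include list means "allow everyone", as in HDFS.
    static boolean inIncludeList(Set<String> includeHosts, String nodeAddr) {
        if (includeHosts.isEmpty() || includeHosts.contains(nodeAddr)) {
            return true;
        }
        try {
            // Resolve the address so an include list written with hostnames
            // still matches a node that registered by IP.
            String hostname = InetAddress.getByName(nodeAddr).getCanonicalHostName();
            return includeHosts.contains(hostname);
        } catch (UnknownHostException e) {
            // Unresolvable and not literally listed: deny.
            return false;
        }
    }
}
```

Without the resolution step, the `contains(nodeAddr)` check alone is exactly the failure mode described above: an IP never matches a hostname entry.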
[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054130#comment-13054130 ] Todd Lipcon commented on HDFS-1900: --- Are you using forrest 0.8 or 0.9? I think I've seen this problem using forrest 0.9 which is fairly new and apparently has some kind of problem with our build environment. Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Remove configuration object reference in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054134#comment-13054134 ] Bharath Mundlapudi commented on HDFS-2092: -- Also, existing unit tests should cover this path, so I haven't added new unit tests. Remove configuration object reference in DFSClient -- Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2086) If the include hosts list contains host names, after restarting the namenode, datanode registration is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tanping Wang updated HDFS-2086: --- Resolution: Fixed Status: Resolved (was: Patch Available) If the include hosts list contains host names, after restarting the namenode, datanode registration is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes: if the include hosts list contains hostnames, then after restarting the namenode, datanode registration is denied by the namenode. This is because after the namenode restarts, a still-alive datanode re-registers with the namenode and identifies itself by its *IP address*. The namenode, however, only admits hosts that appear in its include list, and those entries are all hostnames, so it denies the datanode's registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2086) If the include hosts list contains host names, after restarting the namenode, datanode registration is denied
[ https://issues.apache.org/jira/browse/HDFS-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054147#comment-13054147 ] Hudson commented on HDFS-2086: -- Integrated in Hadoop-Hdfs-trunk-Commit #755 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/755/]) HDFS-2086. If the include hosts list contains host names, after restarting namenode, data nodes registration is denied. Contributed by Tanping Wang. tanping : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1139090 Files : * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestStartup.java If the include hosts list contains host name, after restarting namenode, datanodes registrant is denied Key: HDFS-2086 URL: https://issues.apache.org/jira/browse/HDFS-2086 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Tanping Wang Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-2086.2.patch, HDFS-2086.3.patch, HDFS-2086.4.patch, HDFS-2086.5.patch, HDFS-2086.6.patch, HDFS-2086.7.patch, HDFS-2086.patch As the title describes the problem: if the include host list contains host name, after restarting namenodes, the datanodes registrant is denied by namenodes. This is because after namenode is restarted, the still alive data node will try to register itself with the namenode and it identifies itself with its *IP address*. However, namenode only allows all the hosts in its hosts list to registrant and all of them are hostnames. So namenode would deny the datanode registration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Remove configuration object reference in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054151#comment-13054151 ] Hudson commented on HDFS-2092: -- Integrated in Hadoop-Hdfs-trunk-Commit #756 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/756/]) HDFS-2092. Remove some object references to Configuration in DFSClient. Contributed by Bharath Mundlapudi szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1139097 Files : * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java Remove configuration object reference in DFSClient -- Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores reference to configuration object. Since, these configuration objects are pretty big at times can blot the processes which has multiple DFSClient objects like in TaskTracker. This is an attempt to remove the reference of conf object in DFSClient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Remove configuration object reference in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054159#comment-13054159 ] Eli Collins commented on HDFS-2092: --- Does this change mean that a Configuration object can now be freed because there's one fewer ref to it? Otherwise it seems like we're now allocating a DFSClient#conf in addition to a Configuration object, which increases overall memory usage, no? Remove configuration object reference in DFSClient -- Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-2104) 1073: Add a flag to 2NN to format its checkpoint dirs on startup
1073: Add a flag to 2NN to format its checkpoint dirs on startup Key: HDFS-2104 URL: https://issues.apache.org/jira/browse/HDFS-2104 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Todd Lipcon -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1125) Removing a datanode (failed or decommissioned) should not require a namenode restart
[ https://issues.apache.org/jira/browse/HDFS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054175#comment-13054175 ] Allen Wittenauer commented on HDFS-1125: The problem still seems to be present in 0.20.203, so I'm guessing no, the problem hasn't been fixed by HDFS-1773. How I tested:
a) create a grid with 203, filling in dfs.hosts
b) populate it with data
c) put host in dfs.exclude
d) -refreshNodes, verify host is in decom'ing nodes
e) let decom process finish
f) host now shows up in dead
g) remove host from dfs.host and dfs.exclude
h) -refreshNodes
i) node is still listed as dead by nn
j) kill DataNode process
k) node is still listed as dead by nn
l) 10 mins later, still listed...
Removing a datanode (failed or decommissioned) should not require a namenode restart Key: HDFS-1125 URL: https://issues.apache.org/jira/browse/HDFS-1125 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.20.2 Reporter: Alex Loddengaard Priority: Blocker I've heard of several Hadoop users using dfsadmin -report to monitor the number of dead nodes, and alert if that number is not 0. This mechanism tends to work pretty well, except when a node is decommissioned or fails, because then the namenode requires a restart for said node to be entirely removed from HDFS. More details here: http://markmail.org/search/?q=decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode#query:decommissioned%20node%20showing%20up%20ad%20dead%20node%20in%20web%20based%09interface%20to%20namenode+page:1+mid:7gwqwdkobgfuszb4+state:results Removal from the exclude file and a refresh should get rid of the dead node. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Remove configuration object reference in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054176#comment-13054176 ] Bharath Mundlapudi commented on HDFS-2092: -- Hi Eli, Does this change mean that a Configuration object can now be freed because there's one fewer ref to it? Yes, that is the direction of this patch. Eventually, we will pass around only DFSClient#conf, or only the required parameters, to the downstream code. That will be a big change and needs broader discussion. But you are right: the idea is to stop holding references to the conf object coming from the users. We want to let client code decide the scope of the conf object. Regarding memory, a few [key, value] pairs will be copied into DFSClient, which then frees the bloated conf object for the GC. That will be a big win on memory. Remove configuration object reference in DFSClient -- Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
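The "copy a few [key, value] pairs" idea discussed above can be sketched roughly as follows. This is an illustrative mock, not the actual HDFS-2092 patch: the class name, keys, and defaults are assumptions, and a plain Map stands in for Hadoop's Configuration.

```java
import java.util.Map;

// A small immutable holder populated once at construction, so the client
// keeps a handful of primitives instead of a reference to the full conf.
public class DfsClientConf {
    final int ioBufferSize;
    final long blockSize;
    final short replication;

    DfsClientConf(Map<String, String> conf) {
        this.ioBufferSize = Integer.parseInt(conf.getOrDefault("io.file.buffer.size", "4096"));
        this.blockSize = Long.parseLong(conf.getOrDefault("dfs.blocksize", "67108864"));
        this.replication = Short.parseShort(conf.getOrDefault("dfs.replication", "3"));
        // No reference to `conf` is retained past the constructor, so the
        // large configuration object becomes eligible for GC once the
        // caller drops its own reference.
    }
}
```

This is the trade-off Eli raised: a few extra primitive fields per client, in exchange for not pinning a potentially multi-hundred-kilobyte Configuration for the client's lifetime.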
[jira] [Updated] (HDFS-2087) Add methods to DataTransferProtocol interface
[ https://issues.apache.org/jira/browse/HDFS-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2087: - Resolution: Fixed Fix Version/s: 0.23.0 Release Note: Declare methods in DataTransferProtocol interface, and change Sender and Receiver to implement the interface. Hadoop Flags: [Incompatible change, Reviewed] (was: [Incompatible change]) Status: Resolved (was: Patch Available) Thanks Tanping for reviewing it. I have committed this. Add methods to DataTransferProtocol interface - Key: HDFS-2087 URL: https://issues.apache.org/jira/browse/HDFS-2087 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h2087_20110620.patch, h2087_20110621.patch, h2087_20110621b.patch The {{DataTransferProtocol}} interface is currently empty. The {{Sender}} and {{Receiver}} define similar methods individually. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-2105) Remove the references to configuration object from the DFSClient library.
Remove the references to configuration object from the DFSClient library. - Key: HDFS-2105 URL: https://issues.apache.org/jira/browse/HDFS-2105 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 This is an umbrella jira to track removing the references to the conf object in the DFSClient library. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2087) Add methods to DataTransferProtocol interface
[ https://issues.apache.org/jira/browse/HDFS-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054190#comment-13054190 ] Hudson commented on HDFS-2087: -- Integrated in Hadoop-Hdfs-trunk-Commit #757 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/757/]) HDFS-2087. Declare methods in DataTransferProtocol interface, and change Sender and Receiver to implement the interface. szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1139124 Files : * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/protocol/datatransfer/Sender.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/DFSTestUtil.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/protocol/datatransfer/Receiver.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/protocol/datatransfer/DataTransferProtocol.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/BlockReader.java * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDataTransferProtocol.java * /hadoop/common/trunk/hdfs/src/test/aop/org/apache/hadoop/hdfs/server/datanode/DataTransferProtocolAspects.aj * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java Add methods to DataTransferProtocol 
interface - Key: HDFS-2087 URL: https://issues.apache.org/jira/browse/HDFS-2087 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h2087_20110620.patch, h2087_20110621.patch, h2087_20110621b.patch The {{DataTransferProtocol}} interface is currently empty. The {{Sender}} and {{Receiver}} define similar methods individually. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1480) All replicas for a block with repl=2 end up in same rack
[ https://issues.apache.org/jira/browse/HDFS-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1480: -- Attachment: hdfs-1480-test.txt Here's a test which fails after you loop it a few times. I added some debug log messages and could see that maxNodesPerRack is getting set to 4. All replicas for a block with repl=2 end up in same rack Key: HDFS-1480 URL: https://issues.apache.org/jira/browse/HDFS-1480 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.20.2 Reporter: T Meyarivan Attachments: hdfs-1480-test.txt It appears that all replicas of a block can end up in the same rack. The likelihood of such replicas seems to be directly related to decommissioning of nodes. Post rolling OS upgrade (decommission 3-10% of nodes, re-install etc, add them back) of a running cluster, all replicas of about 0.16% of blocks ended up in the same rack. Hadoop Namenode UI etc doesn't seem to know about such incorrectly replicated blocks. hadoop fsck .. does report that the blocks must be replicated on additional racks. 
Looking at ReplicationTargetChooser.java, the following seem suspect:
snippet-01:
{code}
int maxNodesPerRack = (totalNumOfReplicas-1)/clusterMap.getNumOfRacks()+2;
{code}
snippet-02:
{code}
case 2:
  if (clusterMap.isOnSameRack(results.get(0), results.get(1))) {
    chooseRemoteRack(1, results.get(0), excludedNodes, blocksize,
                     maxNodesPerRack, results);
  } else if (newBlock) {
    chooseLocalRack(results.get(1), excludedNodes, blocksize,
                    maxNodesPerRack, results);
  } else {
    chooseLocalRack(writer, excludedNodes, blocksize,
                    maxNodesPerRack, results);
  }
  if (--numOfReplicas == 0) {
    break;
  }
{code}
snippet-03:
{code}
do {
  DatanodeDescriptor[] selectedNodes = chooseRandom(1, nodes, excludedNodes);
  if (selectedNodes.length == 0) {
    throw new NotEnoughReplicasException(
        "Not able to place enough replicas");
  }
  result = (DatanodeDescriptor)(selectedNodes[0]);
} while (!isGoodTarget(result, blocksize, maxNodesPerRack, results));
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
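To see why snippet-01 looks suspect, plug in the common case of repl=2 on a two-rack cluster: (2-1)/2 + 2 = 2, so the per-rack cap permits both replicas on a single rack and does nothing by itself to force rack diversity. A quick check of the arithmetic (the class name is just for illustration):

```java
public class ReplicaCap {
    // The per-rack cap from snippet-01 in ReplicationTargetChooser.
    static int maxNodesPerRack(int totalNumOfReplicas, int numOfRacks) {
        return (totalNumOfReplicas - 1) / numOfRacks + 2;
    }
}
```

Note that the cap is always at least 2, and for repl=2 it equals the replica count, so the cap can never be the constraint that spreads two replicas across racks.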
[jira] [Commented] (HDFS-2092) Create a light inner conf class in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054224#comment-13054224 ] Aaron T. Myers commented on HDFS-2092: -- If I read that right, we're talking about a change that at the 99th percentile saves at most 386kb? I'm skeptical that those modest savings warrant this change. Also, how exactly were these gains measured, and in what unit can we expect these memory savings - per TT? Per DFSClient instance? Create a light inner conf class in DFSClient Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. This patch creates a light inner conf class and copies the required keys from the Configuration object. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1480) All replicas for a block with repl=2 end up in same rack
[ https://issues.apache.org/jira/browse/HDFS-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054231#comment-13054231 ] Todd Lipcon commented on HDFS-1480: --- Sorry, I think the above test actually fails because it will sometimes decommission all of the nodes on one of the test racks. But, if you bump it up to have 3 nodes in each rack, you'll see the new code path from HDFS-15 get triggered. -- you can see it first re-replicate the block to be all on one host, and then after it gets the addStoredBlock calls, it notices it's not on enough racks, re-replicates elsewhere, and eventually the random choice gets it on the right one. All replicas for a block with repl=2 end up in same rack Key: HDFS-1480 URL: https://issues.apache.org/jira/browse/HDFS-1480 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.20.2 Reporter: T Meyarivan Attachments: hdfs-1480-test.txt It appears that all replicas of a block can end up in the same rack. The likelihood of such replicas seems to be directly related to decommissioning of nodes. Post rolling OS upgrade (decommission 3-10% of nodes, re-install etc, add them back) of a running cluster, all replicas of about 0.16% of blocks ended up in the same rack. Hadoop Namenode UI etc doesn't seem to know about such incorrectly replicated blocks. hadoop fsck .. does report that the blocks must be replicated on additional racks.
Looking at ReplicationTargetChooser.java, the following seem suspect:
snippet-01:
{code}
int maxNodesPerRack = (totalNumOfReplicas-1)/clusterMap.getNumOfRacks()+2;
{code}
snippet-02:
{code}
case 2:
  if (clusterMap.isOnSameRack(results.get(0), results.get(1))) {
    chooseRemoteRack(1, results.get(0), excludedNodes, blocksize,
                     maxNodesPerRack, results);
  } else if (newBlock) {
    chooseLocalRack(results.get(1), excludedNodes, blocksize,
                    maxNodesPerRack, results);
  } else {
    chooseLocalRack(writer, excludedNodes, blocksize,
                    maxNodesPerRack, results);
  }
  if (--numOfReplicas == 0) {
    break;
  }
{code}
snippet-03:
{code}
do {
  DatanodeDescriptor[] selectedNodes = chooseRandom(1, nodes, excludedNodes);
  if (selectedNodes.length == 0) {
    throw new NotEnoughReplicasException(
        "Not able to place enough replicas");
  }
  result = (DatanodeDescriptor)(selectedNodes[0]);
} while (!isGoodTarget(result, blocksize, maxNodesPerRack, results));
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1900) Use the block size key defined by common
[ https://issues.apache.org/jira/browse/HDFS-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054243#comment-13054243 ] Aaron T. Myers commented on HDFS-1900: -- That's definitely Forrest 0.9. See: HADOOP-7394 Use the block size key defined by common - Key: HDFS-1900 URL: https://issues.apache.org/jira/browse/HDFS-1900 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.21.1 Reporter: Eli Collins Assignee: Abel Perez Labels: newbie Fix For: 0.22.0 Attachments: HDFS-1900.txt HADOOP-4952 added a dfs.block.size key to common configuration, defined in o.a.h.fs.FsConfig. This conflicts with the original HDFS block size key of the same name, which is now deprecated in favor of dfs.blocksize. It doesn't make sense to have two different keys for the block size (ie they can disagree). Why doesn't HDFS just use the key defined in common? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2092) Create a light inner conf class in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054245#comment-13054245 ] Bharath Mundlapudi commented on HDFS-2092: -- Hi Aaron, That was just a one-day sample of measurements; what we should care about here is the MAX. Also, going forward, PIG 0.9 will store lots of metadata in the conf, and one can even embed the PIG script itself in the conf. This can potentially blow up the TT. We can measure the approximate size of a conf from the job.xml file in the job history location. Since one can store anything in the job conf, we should be careful with references to this object - we should not hold them for long durations. Create a light inner conf class in DFSClient Key: HDFS-2092 URL: https://issues.apache.org/jira/browse/HDFS-2092 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Fix For: 0.23.0 Attachments: HDFS-2092-1.patch, HDFS-2092-2.patch At present, DFSClient stores a reference to a configuration object. Since these configuration objects can be quite large at times, they can bloat processes that hold multiple DFSClient objects, such as the TaskTracker. This is an attempt to remove the reference to the conf object in DFSClient. This patch creates a light inner conf class and copies the required keys from the Configuration object. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira