[jira] [Commented] (HDFS-1117) HDFS portion of HADOOP-6728 (overhaul metrics framework)

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032880#comment-13032880
 ] 

Hadoop QA commented on HDFS-1117:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12478923/HDFS-1117.2.patch
  against trunk revision 1102513.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 42 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.server.common.TestDistributedUpgrade
  org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics
  org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode
  org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestDFSUpgradeFromImage
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.hdfs.TestLargeBlock

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/509//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/509//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/509//console

This message is automatically generated.

> HDFS portion of HADOOP-6728 (overhaul metrics framework)
> ---
>
> Key: HDFS-1117
> URL: https://issues.apache.org/jira/browse/HDFS-1117
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.20.2
>Reporter: Luke Lu
>Assignee: Luke Lu
> Fix For: 0.23.0
>
> Attachments: HDFS-1117.2.patch, HDFS-1117.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032882#comment-13032882
 ] 

Hadoop QA commented on HDFS-1921:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479025/hdfs1921_v23.patch
  against trunk revision 1102513.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/512//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/512//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/512//console

This message is automatically generated.

> Save namespace can cause NN to be unable to come up on restart
> --
>
> Key: HDFS-1921
> URL: https://issues.apache.org/jira/browse/HDFS-1921
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Matt Foley
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: hdfs-1505-1-test.txt, hdfs1921_v23.patch, 
> hdfs1921_v23.patch
>
>
> I discovered this in the course of trying to implement a fix for HDFS-1505.
> Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save 
> namespace proceeds in the following order:
> # rename current to lastcheckpoint.tmp for all of them,
> # save image and recreate edits for all of them,
> # rename lastcheckpoint.tmp to previous.checkpoint.
> The problem is that step 3 occurs even if an error occurred for all storage 
> directories in step 2. Upon restart, the NN will see 
> non-existent or corrupt {{current}} directories, and no 
> {{lastcheckpoint.tmp}} directories, and so will conclude that the storage 
> directories are not formatted.
> This issue appears to be present on both 0.22 and 0.23. This should arguably 
> be a 0.22/0.23 blocker.
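
A minimal sketch of the fix this implies, with hypothetical helper names (the real 
logic lives in {{FSImage.saveNamespace(...)}}): step 3 should run only for 
directories whose step 2 succeeded, so a fully failed save leaves 
{{lastcheckpoint.tmp}} in place for recovery.
{code}
// Hypothetical sketch, not the actual FSImage code.
for (StorageDirectory sd : storageDirs) {
  renameCurrentToLastCheckpointTmp(sd);     // step 1
}
List<StorageDirectory> saved = new ArrayList<StorageDirectory>();
for (StorageDirectory sd : storageDirs) {
  try {
    saveImageAndRecreateEdits(sd);          // step 2
    saved.add(sd);
  } catch (IOException ioe) {
    reportError(sd, ioe);                   // remember the failed directory
  }
}
for (StorageDirectory sd : saved) {
  renameLastCheckpointTmpToPrevious(sd);    // step 3: only for good dirs
}
{code}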

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1926) Remove references to StorageDirectory from JournalManager interface

2011-05-13 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-1926:
-

Status: Open  (was: Patch Available)

> Remove references to StorageDirectory from JournalManager interface
> ---
>
> Key: HDFS-1926
> URL: https://issues.apache.org/jira/browse/HDFS-1926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-1926.diff
>
>
> The JournalManager interface introduced by HDFS-1799 has a 
> getStorageDirectory method which is out of place in a generic interface. This 
> JIRA removes that call by refactoring the error handling for FSEditLog. Each 
> EditLogFileOutputStream is now an NNStorageListener and listens for errors on 
> its containing StorageDirectory. If an error occurs from FSImage, the stream 
> will be aborted. If the error occurs in FSEditLog, the stream will be aborted 
> and NNStorage will be notified that the StorageDirectory is no longer valid.
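
For context, a rough sketch of the listener arrangement described above 
(hypothetical signatures, not the actual interface):
{code}
// Hypothetical sketch of the listener pattern.
interface NNStorageListener {
  // Invoked by NNStorage when a storage directory is marked as failed.
  void errorOccurred(StorageDirectory sd) throws IOException;
}

class EditLogFileOutputStream implements NNStorageListener {
  private final StorageDirectory myDir;
  private boolean aborted = false;

  EditLogFileOutputStream(StorageDirectory dir) { this.myDir = dir; }

  public void errorOccurred(StorageDirectory sd) {
    if (sd == myDir) {
      aborted = true;   // stop writing to a directory known to be bad
    }
  }
}
{code}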

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1926) Remove references to StorageDirectory from JournalManager interface

2011-05-13 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-1926:
-

Status: Patch Available  (was: Open)

> Remove references to StorageDirectory from JournalManager interface
> ---
>
> Key: HDFS-1926
> URL: https://issues.apache.org/jira/browse/HDFS-1926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-1926.diff, HDFS-1926.diff
>
>
> The JournalManager interface introduced by HDFS-1799 has a 
> getStorageDirectory method which is out of place in a generic interface. This 
> JIRA removes that call by refactoring the error handling for FSEditLog. Each 
> EditLogFileOutputStream is now an NNStorageListener and listens for errors on 
> its containing StorageDirectory. If an error occurs from FSImage, the stream 
> will be aborted. If the error occurs in FSEditLog, the stream will be aborted 
> and NNStorage will be notified that the StorageDirectory is no longer valid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1926) Remove references to StorageDirectory from JournalManager interface

2011-05-13 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-1926:
-

Attachment: HDFS-1926.diff

> Remove references to StorageDirectory from JournalManager interface
> ---
>
> Key: HDFS-1926
> URL: https://issues.apache.org/jira/browse/HDFS-1926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-1926.diff, HDFS-1926.diff
>
>
> The JournalManager interface introduced by HDFS-1799 has a 
> getStorageDirectory method which is out of place in a generic interface. This 
> JIRA removes that call by refactoring the error handling for FSEditLog. Each 
> EditLogFileOutputStream is now an NNStorageListener and listens for errors on 
> its containing StorageDirectory. If an error occurs from FSImage, the stream 
> will be aborted. If the error occurs in FSEditLog, the stream will be aborted 
> and NNStorage will be notified that the StorageDirectory is no longer valid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1931) Update tests for du/dus/df

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032890#comment-13032890
 ] 

Hadoop QA commented on HDFS-1931:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479040/HDFS-1931.patch
  against trunk revision 1102513.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 47 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.server.balancer.TestBalancer
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestDFSUpgradeFromImage
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/511//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/511//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/511//console

This message is automatically generated.

> Update tests for du/dus/df
> --
>
> Key: HDFS-1931
> URL: https://issues.apache.org/jira/browse/HDFS-1931
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1931.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1933) Update tests for FsShell's "test"

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032911#comment-13032911
 ] 

Hadoop QA commented on HDFS-1933:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479047/HDFS-1933.patch
  against trunk revision 1102513.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery
  org.apache.hadoop.hdfs.TestDFSShell
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage
  org.apache.hadoop.tools.TestJMXGet

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/510//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/510//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/510//console

This message is automatically generated.

> Update tests for FsShell's "test"
> -
>
> Key: HDFS-1933
> URL: https://issues.apache.org/jira/browse/HDFS-1933
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1933.patch
>
>
> Fix tests broken by refactoring.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1899) GenericTestUtils.formatNamenode is misplaced

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032913#comment-13032913
 ] 

Hadoop QA commented on HDFS-1899:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12478847/HDFS-1899.patch
  against trunk revision 1102513.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 36 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.TestDFSClientRetries
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/513//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/513//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/513//console

This message is automatically generated.

> GenericTestUtils.formatNamenode is misplaced
> 
>
> Key: HDFS-1899
> URL: https://issues.apache.org/jira/browse/HDFS-1899
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1899.patch
>
>
> This function belongs in DFSTestUtil, the standard place for putting 
> cluster-related utils.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1926) Remove references to StorageDirectory from JournalManager interface

2011-05-13 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032914#comment-13032914
 ] 

Ivan Kelly commented on HDFS-1926:
--

I've uploaded the correct patch. 

The reason I see for NNStorageListener is that if a directory fails you don't 
want to keep trying it. This may not be a problem though. It certainly would be 
nice to get rid of NNStorageListener. Without it, if a StorageDirectory fails 
(disk failure, etc.) during an image save, then FSEditLog wouldn't know anything 
about it until the next time it went to write an entry, at which point the 
write would fail and the stream would be removed from rotation. I guess the 
only problem here would be if the write didn't fail immediately, but blocked 
indefinitely (if the StorageDirectory pointed to a hard-mounted NFS share, for 
example).

I'll try getting rid of it completely from the edit log side of things, run the 
tests and see how it goes.
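
Without the listener, the failure surfaces lazily on the next write; roughly 
(hypothetical names, not the actual FSEditLog code):
{code}
// Hypothetical sketch: FSEditLog only learns of a bad directory when a
// write throws, at which point the stream is dropped from rotation.
void logEdit(byte[] record) {
  for (Iterator<EditLogOutputStream> it = streams.iterator(); it.hasNext();) {
    EditLogOutputStream stream = it.next();
    try {
      stream.write(record);
    } catch (IOException ioe) {
      stream.abort();   // best-effort close of the failed stream
      it.remove();      // drop it from rotation so it is not retried
    }
  }
}
{code}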

> Remove references to StorageDirectory from JournalManager interface
> ---
>
> Key: HDFS-1926
> URL: https://issues.apache.org/jira/browse/HDFS-1926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-1926.diff, HDFS-1926.diff
>
>
> The JournalManager interface introduced by HDFS-1799 has a 
> getStorageDirectory method which is out of place in a generic interface. This 
> JIRA removes that call by refactoring the error handling for FSEditLog. Each 
> EditLogFileOutputStream is now an NNStorageListener and listens for errors on 
> its containing StorageDirectory. If an error occurs from FSImage, the stream 
> will be aborted. If the error occurs in FSEditLog, the stream will be aborted 
> and NNStorage will be notified that the StorageDirectory is no longer valid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032948#comment-13032948
 ] 

Hadoop QA commented on HDFS-1371:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479061/HDFS-1371.0513.patch
  against trunk revision 1102513.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 11 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/514//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/514//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/514//console

This message is automatically generated.

> One bad node can incorrectly flag many files as corrupt
> ---
>
> Key: HDFS-1371
> URL: https://issues.apache.org/jira/browse/HDFS-1371
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, name-node
>Affects Versions: 0.20.1
> Environment: yahoo internal version 
> [knoguchi@gwgd4003 ~]$ hadoop version
> Hadoop 0.20.104.3.1007030707
>Reporter: Koji Noguchi
>Assignee: Tanping Wang
> Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, 
> HDFS-1371.0513.patch
>
>
> On our cluster, 12 files were reported as corrupt by fsck even though the 
> replicas on the datanodes were healthy.
> Turns out that all the replicas (12 files x 3 replicas per file) were 
> reported as corrupt by one node.
> Surprisingly, these files were still readable/accessible from dfsclient 
> (-get/-cat) without any problems.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1926) Remove references to StorageDirectory from JournalManager interface

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032952#comment-13032952
 ] 

Hadoop QA commented on HDFS-1926:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479067/HDFS-1926.diff
  against trunk revision 1102513.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/515//console

This message is automatically generated.

> Remove references to StorageDirectory from JournalManager interface
> ---
>
> Key: HDFS-1926
> URL: https://issues.apache.org/jira/browse/HDFS-1926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-1926.diff, HDFS-1926.diff
>
>
> The JournalManager interface introduced by HDFS-1799 has a 
> getStorageDirectory method which is out of place in a generic interface. This 
> JIRA removes that call by refactoring the error handling for FSEditLog. Each 
> EditLogFileOutputStream is now an NNStorageListener and listens for errors on 
> its containing StorageDirectory. If an error occurs from FSImage, the stream 
> will be aborted. If the error occurs in FSEditLog, the stream will be aborted 
> and NNStorage will be notified that the StorageDirectory is no longer valid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1895) Setting up of cluster using ssh - Scripts that help in minimising the cluster setup efforts

2011-05-13 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HDFS-1895:
-

Attachment: ssh-hadoop-config.sh

> Setting up of cluster using ssh - Scripts that help in minimising the cluster 
> setup efforts
> ---
>
> Key: HDFS-1895
> URL: https://issues.apache.org/jira/browse/HDFS-1895
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Reporter: ramkrishna.s.vasudevan
>Priority: Minor
> Attachments: ssh-hadoop-config.sh
>
>
> Sometimes when we have a large cluster we may have to specify the 
> passwords of the different machines that we are using as slaves (datanodes).
>  
> If the cluster is very large we may have to repeat this every time.
> So we would like to suggest a way to avoid this:
>  
> 1. Generate an SSH key on the name node machine.
> 2. Read the entries from the conf/slaves file, and for every entry add the key 
> generated in step 1 to a file on the slave machine.
> 3. Repeat the same for the master file also.
>  
> When you execute step 1 it will prompt for the password.  This is only for 
> the first time.
>  
> After that, whenever you need to start the cluster, the password need not be 
> specified.
>  
> This scenario is valid when we are sure of the cluster that we will be 
> maintaining and we are aware of the credentials of the machines.
>  
> This will help the cluster administrator.
>  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1933) Update tests for FsShell's "test"

2011-05-13 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033095#comment-13033095
 ] 

Daryn Sharp commented on HDFS-1933:
---

(Mishap caused by forgetting to mvn-install hadoop-common before running the HDFS 
tests...)

No changes necessary to this patch.  I fixed the related HADOOP-7285 patch to 
return the correct exit codes expected by these test changes.  Note that the other 
test failures are not related to the "test" command.

> Update tests for FsShell's "test"
> -
>
> Key: HDFS-1933
> URL: https://issues.apache.org/jira/browse/HDFS-1933
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1933.patch
>
>
> Fix tests broken by refactoring.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1926) Remove references to StorageDirectory from JournalManager interface

2011-05-13 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-1926:
-

Attachment: HDFS-1926-nolistener.diff

> Remove references to StorageDirectory from JournalManager interface
> ---
>
> Key: HDFS-1926
> URL: https://issues.apache.org/jira/browse/HDFS-1926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-1926-nolistener.diff, HDFS-1926.diff, HDFS-1926.diff
>
>
> The JournalManager interface introduced by HDFS-1799 has a 
> getStorageDirectory method which is out of place in a generic interface. This 
> JIRA removes that call by refactoring the error handling for FSEditLog. Each 
> EditLogFileOutputStream is now an NNStorageListener and listens for errors on 
> its containing StorageDirectory. If an error occurs from FSImage, the stream 
> will be aborted. If the error occurs in FSEditLog, the stream will be aborted 
> and NNStorage will be notified that the StorageDirectory is no longer valid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1926) Remove references to StorageDirectory from JournalManager interface

2011-05-13 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033096#comment-13033096
 ] 

Ivan Kelly commented on HDFS-1926:
--

I've just uploaded a patch which removes the NNStorageListener completely. It 
did require one test change, as TestStorageRestore simulated failure only by 
marking the directory as failed, so a subsequent write from the edit log would not 
fail. I fixed that to use Mockito-mocked versions of EditLogFileOutputStream, so 
that when the failure is triggered, writing to the edit log fails with an 
IOException.
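
A rough sketch of that style of test change, assuming JUnit 4 and Mockito (the 
interface below is a stand-in, not the real stream class):
{code}
import static org.mockito.Mockito.*;

import java.io.IOException;
import org.junit.Test;

public class TestSimulatedStreamFailure {
  // Stand-in for the real edit log output stream API.
  interface EditStream { void write(byte[] op) throws IOException; }

  @Test(expected = IOException.class)
  public void writeFailsOnceFailureIsTriggered() throws IOException {
    EditStream stream = mock(EditStream.class);
    // Trigger the simulated disk failure: all further writes throw.
    doThrow(new IOException("simulated volume failure"))
        .when(stream).write(any(byte[].class));
    stream.write(new byte[] { 1, 2, 3 });  // fails with IOException
  }
}
{code}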

> Remove references to StorageDirectory from JournalManager interface
> ---
>
> Key: HDFS-1926
> URL: https://issues.apache.org/jira/browse/HDFS-1926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-1926-nolistener.diff, HDFS-1926.diff, HDFS-1926.diff
>
>
> The JournalManager interface introduced by HDFS-1799 has a 
> getStorageDirectory method which is out of place in a generic interface. This 
> JIRA removes that call by refactoring the error handling for FSEditLog. Each 
> EditLogFileOutputStream is now an NNStorageListener and listens for errors on 
> its containing StorageDirectory. If an error occurs from FSImage, the stream 
> will be aborted. If the error occurs in FSEditLog, the stream will be aborted 
> and NNStorage will be notified that the StorageDirectory is no longer valid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1926) Remove references to StorageDirectory from JournalManager interface

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033098#comment-13033098
 ] 

Hadoop QA commented on HDFS-1926:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479133/HDFS-1926-nolistener.diff
  against trunk revision 1102513.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/516//console

This message is automatically generated.

> Remove references to StorageDirectory from JournalManager interface
> ---
>
> Key: HDFS-1926
> URL: https://issues.apache.org/jira/browse/HDFS-1926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-1926-nolistener.diff, HDFS-1926.diff, HDFS-1926.diff
>
>
> The JournalManager interface introduced by HDFS-1799 has a 
> getStorageDirectory method which is out of place in a generic interface. This 
> JIRA removes that call by refactoring the error handling for FSEditLog. Each 
> EditLogFileOutputStream is now an NNStorageListener and listens for errors on 
> its containing StorageDirectory. If an error occurs from FSImage, the stream 
> will be aborted. If the error occurs in FSEditLog, the stream will be aborted 
> and NNStorage will be notified that the StorageDirectory is no longer valid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format

2011-05-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033111#comment-13033111
 ] 

Suresh Srinivas commented on HDFS-1905:
---

Doug, please read the previous comments on the rationale.

This is one command changing, and it will be documented. While this is a change, I 
am not sure it adds huge complexity.

> Improve the usability of namenode -format 
> --
>
> Key: HDFS-1905
> URL: https://issues.apache.org/jira/browse/HDFS-1905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Bharath Mundlapudi
>Assignee: Bharath Mundlapudi
>Priority: Minor
> Fix For: 0.23.0
>
>
> While setting up a 0.23-based cluster, I ran into this issue. When I 
> issue a format namenode command, which got changed in 0.23, it should let the 
> user know how to use this command in cases where the complete options were not 
> specified.
> ./hdfs namenode -format
> I get the following error msg, but it is still not clear what the command 
> expects and how the user should use it.
> 11/05/09 15:36:25 ERROR namenode.NameNode: 
> java.lang.IllegalArgumentException: Format must be provided with clusterid
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)
>  
> The usability of this command can be improved.
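
A sketch of the kind of guard that would improve this (hypothetical snippet, not 
the actual NameNode code; the exact option names may differ):
{code}
// Hypothetical sketch: print a usage line instead of a bare stack trace
// when the clusterid argument is missing.
if (clusterId == null || clusterId.isEmpty()) {
  System.err.println("Usage: hdfs namenode -format -clusterid <cid>");
  throw new IllegalArgumentException("Format must be provided with clusterid");
}
{code}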

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1825) Remove thriftfs contrib

2011-05-13 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-1825:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Remove thriftfs contrib
> ---
>
> Key: HDFS-1825
> URL: https://issues.apache.org/jira/browse/HDFS-1825
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nigel Daley
>Assignee: Nigel Daley
> Fix For: 0.22.0
>
> Attachments: HDFS-1825.patch
>
>
> As per vote on general@ 
> (http://mail-archives.apache.org/mod_mbox/hadoop-general/201102.mbox/%3cef44cfe2-692f-4956-8b33-d125d05e2...@mac.com%3E)
>  thriftfs can be removed: 
> svn remove hdfs/trunk/src/contrib/thriftfs
> and wiki updated:
> http://wiki.apache.org/hadoop/Attic

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-1936) Updating the layout version from HDFS-1822 causes problems where logic depends on layout version

2011-05-13 Thread Suresh Srinivas (JIRA)
Updating the layout version from HDFS-1822 causes problems where logic depends 
on layout version


 Key: HDFS-1936
 URL: https://issues.apache.org/jira/browse/HDFS-1936
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.22.0, 0.23.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
Priority: Blocker
 Fix For: 0.22.0, 0.23.0


In HDFS_1822 and HDFS_1842, the layout versions for 203, 204, 22 and trunk were 
changed. Some of the namenode logic that depends on layout version is broken 
because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1926) Remove references to StorageDirectory from JournalManager interface

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033125#comment-13033125
 ] 

Todd Lipcon commented on HDFS-1926:
---

bq. I guess the only problem here would be if write didn't fail immediately, 
but blocked indefinitely (if the StorageDirectory pointed to a hard mounted NFS 
share for example).

Yea, I thought about that as well (it's something we see a lot with customers 
who forget to set their NFS mounts to soft with a low timeout).

But, I think, we never know whether we'll see this problem first with an 
editlog write vs an image save, so we have to handle this case in both places 
anyway, perhaps by adding timeouts to such operations. I think we should deal 
with that as part of HDFS-1603.

It made a lot of sense before when we needed agreement between images and 
editlogs about whether a directory was active, in order to coordinate 
image/edit rolls. But now that the two are entirely orthogonal (except for 
happening to share a directory) I don't see much benefit.

Let me ping Sanjay as well to see if he has an opinion here.

> Remove references to StorageDirectory from JournalManager interface
> ---
>
> Key: HDFS-1926
> URL: https://issues.apache.org/jira/browse/HDFS-1926
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Attachments: HDFS-1926-nolistener.diff, HDFS-1926.diff, HDFS-1926.diff
>
>
> The JournalManager interface introduced by HDFS-1799 has a 
> getStorageDirectory method which is out of place in a generic interface. This 
> JIRA removes that call by refactoring the error handling for FSEditLog. Each 
> EditLogFileOutputStream is now an NNStorageListener and listens for errors on 
> its containing StorageDirectory. If an error occurs from FSImage, the stream 
> will be aborted. If the error occurs in FSEditLog, the stream will be aborted 
> and NNStorage will be notified that the StorageDirectory is no longer valid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1936) Updating the layout version from HDFS-1822 causes problems where logic depends on layout version

2011-05-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1936:
--

Description: In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 
22 and trunk were changed. Some of the namenode logic that depends on layout 
version is broken because of this. Read the comment for more description.  
(was: In HDFS_1822 and HDFS_1842, the layout versions for 203, 204, 22 and 
trunk were changed. Some of the namenode logic that depends on layout version 
is broken because of this. Read the comment for more description.)

> Updating the layout version from HDFS-1822 causes problems where logic 
> depends on layout version
> 
>
> Key: HDFS-1936
> URL: https://issues.apache.org/jira/browse/HDFS-1936
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
>
> In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk 
> were changed. Some of the namenode logic that depends on layout version is 
> broken because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1933) Update tests for FsShell's "test"

2011-05-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033143#comment-13033143
 ] 

Aaron T. Myers commented on HDFS-1933:
--

+1. Patch looks good to me. Thanks, Daryn.

> Update tests for FsShell's "test"
> -
>
> Key: HDFS-1933
> URL: https://issues.apache.org/jira/browse/HDFS-1933
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1933.patch
>
>
> Fix tests broken by refactoring.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1936) Updating the layout version from HDFS-1822 causes upgrade problems.

2011-05-13 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-1936:
--

Summary: Updating the layout version from HDFS-1822 causes upgrade 
problems.  (was: Updating the layout version from HDFS-1822 causes problems 
where logic depends on layout version)

> Updating the layout version from HDFS-1822 causes upgrade problems.
> ---
>
> Key: HDFS-1936
> URL: https://issues.apache.org/jira/browse/HDFS-1936
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
>
> In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk 
> were changed. Some of the namenode logic that depends on layout version is 
> broken because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1936) Updating the layout version from HDFS-1822 causes problems where logic depends on layout version

2011-05-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033144#comment-13033144
 ] 

Suresh Srinivas commented on HDFS-1936:
---

The changes were:
* 0.20.203 moved from LV -19 to -31
* 0.20.204 moved to LV -32
* 0.22 from -27 to -33
* Trunk to -34

This results in the following problems (all ranges are inclusive):
# Functionality added from LV -20 to -30 is not in the 0.20.203 release. LVs -28 to 
-32 are not in 0.22.
# As an example, take FSImage compression, added in LV -25. When upgrading to 
trunk from 0.20.203 (LV -31), the namenode code expects FSImage compression to be 
available (because -31 is a later LV than -25). This functionality is in 0.22 and 
trunk but not in 0.20.203. Hence the upgrade fails.

Solution:
# Change the checks for LV in 0.22 for ranges -20 to -27 to -33.
# Change the checks for LV in trunk for ranges -28 to -33 to -34.


> Updating the layout version from HDFS-1822 causes problems where logic 
> depends on layout version
> 
>
> Key: HDFS-1936
> URL: https://issues.apache.org/jira/browse/HDFS-1936
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
>
> In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk 
> were changed. Some of the namenode logic that depends on layout version is 
> broken because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1936) Updating the layout version from HDFS-1822 causes upgrade problems.

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033153#comment-13033153
 ] 

Todd Lipcon commented on HDFS-1936:
---

Good digging. Can I suggest that we add some static helper methods here, 
something like 
{{FSImageFormat.versionSupportsFSImageCompression(layoutVersion)}}, rather than 
the hardcoded version numbers we've got all over?

Medium term, maybe we can move away from the single-integer versioning to 
something like a list of flags? e.g. 
{code}supported_features={image_compression,edits_checksum,image_checksum,delegation_token_ops,...}
 {code}. That would have gotten us out of this mess somewhat, right?

> Updating the layout version from HDFS-1822 causes upgrade problems.
> ---
>
> Key: HDFS-1936
> URL: https://issues.apache.org/jira/browse/HDFS-1936
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
>
> In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk 
> were changed. Some of the namenode logic that depends on layout version is 
> broken because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated

2011-05-13 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-1592:
---

Status: Patch Available  (was: Open)

> Datanode startup doesn't honor volumes.tolerated 
> -
>
> Key: HDFS-1592
> URL: https://issues.apache.org/jira/browse/HDFS-1592
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.204.0
>Reporter: Bharath Mundlapudi
>Assignee: Bharath Mundlapudi
> Fix For: 0.20.204.0, 0.23.0
>
> Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, 
> HDFS-1592-rel20.patch
>
>
> Datanode startup doesn't honor volumes.tolerated in the Hadoop 0.20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1917) Clean up duplication of dependent jar files

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033162#comment-13033162
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1917:
--

+1 patch looks good.

> Clean up duplication of dependent jar files
> ---
>
> Key: HDFS-1917
> URL: https://issues.apache.org/jira/browse/HDFS-1917
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
> Environment: Java 6, RHEL 5.5
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: HDFS-1917-1.patch, HDFS-1917.patch
>
>
> For trunk, the build and deployment tree looks like this:
> hadoop-common-0.2x.y
> hadoop-hdfs-0.2x.y
> hadoop-mapred-0.2x.y
> Technically, HDFS's third-party dependent jar files should be fetched from 
> hadoop-common.  However, they are currently fetched from hadoop-hdfs/lib only.  
> It would be nice to eliminate the need to repeat duplicated jar files at 
> build time.
> There are two options to manage this dependency list: continue to enhance the 
> ant build structure to fetch and filter jar file dependencies using Ivy, or 
> take this as a good opportunity to convert the build structure to 
> Maven and use Maven to manage the provided jar files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1917) Clean up duplication of dependent jar files

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-1917:
-

   Resolution: Fixed
Fix Version/s: 0.23.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I have committed this.  Thanks, Eric!

> Clean up duplication of dependent jar files
> ---
>
> Key: HDFS-1917
> URL: https://issues.apache.org/jira/browse/HDFS-1917
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
> Environment: Java 6, RHEL 5.5
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: 0.23.0
>
> Attachments: HDFS-1917-1.patch, HDFS-1917.patch
>
>
> For trunk, the build and deployment tree looks like this:
> hadoop-common-0.2x.y
> hadoop-hdfs-0.2x.y
> hadoop-mapred-0.2x.y
> Technically, HDFS's third-party dependent jar files should be fetched from 
> hadoop-common.  However, they are currently fetched from hadoop-hdfs/lib only.  
> It would be nice to eliminate the need to repeat duplicated jar files at 
> build time.
> There are two options to manage this dependency list: continue to enhance the 
> ant build structure to fetch and filter jar file dependencies using Ivy, or 
> take this as a good opportunity to convert the build structure to 
> Maven and use Maven to manage the provided jar files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1814) HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033168#comment-13033168
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1814:
--

Is this an incompatible change?

> HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent
> ---
>
> Key: HDFS-1814
> URL: https://issues.apache.org/jira/browse/HDFS-1814
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 0.23.0
>
> Attachments: hdfs-1814.0.txt, hdfs-1814.1.txt, hdfs-1814.2.txt, 
> hdfs-1814.3.patch, hdfs-1814.4.patch, hdfs-1814.5.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1814) HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033184#comment-13033184
 ] 

Todd Lipcon commented on HDFS-1814:
---

This just adds a new command - not sure why it would be considered incompatible?

> HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent
> ---
>
> Key: HDFS-1814
> URL: https://issues.apache.org/jira/browse/HDFS-1814
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 0.23.0
>
> Attachments: hdfs-1814.0.txt, hdfs-1814.1.txt, hdfs-1814.2.txt, 
> hdfs-1814.3.patch, hdfs-1814.4.patch, hdfs-1814.5.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1936) Updating the layout version from HDFS-1822 causes upgrade problems.

2011-05-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033186#comment-13033186
 ] 

Suresh Srinivas commented on HDFS-1936:
---

> Can I suggest that we add some static helper methods here
I plan to add that.

> Medium term maybe we can move away from the single-integer versioning to 
> something like a list of flags
I think the layout version is referenced in various places, so we cannot replace 
it. But I agree with you on adding a feature matrix that looks something 
like:
LV <-> EnumSet of supported features.

With that, we could check for a supported feature instead of depending on the LV 
directly. This also enables building a simple tool which, given two layout 
versions, can tell whether an upgrade is possible.
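
A minimal sketch of such a matrix (hypothetical names; the feature names are taken 
from the flags suggested above):
{code}
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: map each layout version to the EnumSet of features
// it supports, instead of comparing raw LV integers everywhere.
enum Feature {
  IMAGE_COMPRESSION, EDITS_CHECKSUM, IMAGE_CHECKSUM, DELEGATION_TOKEN_OPS
}

class LayoutFeatures {
  private static final Map<Integer, EnumSet<Feature>> MATRIX =
      new HashMap<Integer, EnumSet<Feature>>();
  static {
    // One entry per layout version; e.g. -25 introduced image compression,
    // while 0.20.203's -31 does not include it.
    MATRIX.put(-25, EnumSet.of(Feature.IMAGE_COMPRESSION));
    MATRIX.put(-31, EnumSet.noneOf(Feature.class));
  }

  static boolean supports(int layoutVersion, Feature f) {
    EnumSet<Feature> features = MATRIX.get(layoutVersion);
    return features != null && features.contains(f);
  }
}
{code}
An upgrade-compatibility tool could then simply check that the target version's 
set contains everything in the source version's set.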


> Updating the layout version from HDFS-1822 causes upgrade problems.
> ---
>
> Key: HDFS-1936
> URL: https://issues.apache.org/jira/browse/HDFS-1936
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
>
> In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk 
> were changed. Some of the namenode logic that depends on layout version is 
> broken because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1814) HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent

2011-05-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033188#comment-13033188
 ] 

Aaron T. Myers commented on HDFS-1814:
--

I would think this would be considered incompatible only if we consider the 
help output of {{`hdfs'}} and {{`mapred'}} to be interfaces whose backward 
compatibility we care about. I don't think we should.

> HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent
> ---
>
> Key: HDFS-1814
> URL: https://issues.apache.org/jira/browse/HDFS-1814
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 0.23.0
>
> Attachments: hdfs-1814.0.txt, hdfs-1814.1.txt, hdfs-1814.2.txt, 
> hdfs-1814.3.patch, hdfs-1814.4.patch, hdfs-1814.5.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1936) Updating the layout version from HDFS-1822 causes upgrade problems.

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033191#comment-13033191
 ] 

Todd Lipcon commented on HDFS-1936:
---

Nice idea with the EnumSet.

> Updating the layout version from HDFS-1822 causes upgrade problems.
> ---
>
> Key: HDFS-1936
> URL: https://issues.apache.org/jira/browse/HDFS-1936
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
>
> In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk 
> were changed. Some of the namenode logic that depends on layout version is 
> broken because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1931) Update tests for du/dus/df

2011-05-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033193#comment-13033193
 ] 

Aaron T. Myers commented on HDFS-1931:
--

+1. The patch looks good to me. Thanks a lot, Daryn.

> Update tests for du/dus/df
> --
>
> Key: HDFS-1931
> URL: https://issues.apache.org/jira/browse/HDFS-1931
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1931.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033197#comment-13033197
 ] 

Ted Yu commented on HDFS-1332:
--

Todd suggested using the NNThroughputBenchmark to show that the patch doesn't 
cause a significant performance degradation.
How does that sound?

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues in pseudo-distributed mode (e.g. 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1814) HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033204#comment-13033204
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1814:
--

Hey Todd, it was just a question.

The new protocol {{GetUserMappingsProtocol}} was added by HADOOP-7214.  The old 
client won't use it.  If a new client uses it to talk to an old server, it 
will get an exception.  It seems there is no compatibility issue.  Do you agree?

> HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent
> ---
>
> Key: HDFS-1814
> URL: https://issues.apache.org/jira/browse/HDFS-1814
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 0.23.0
>
> Attachments: hdfs-1814.0.txt, hdfs-1814.1.txt, hdfs-1814.2.txt, 
> hdfs-1814.3.patch, hdfs-1814.4.patch, hdfs-1814.5.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1933) Update tests for FsShell's "test"

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033205#comment-13033205
 ] 

Todd Lipcon commented on HDFS-1933:
---

In the tests, why do we need the try..catch around all of the invocations? It 
seems to me that we don't expect anything to be thrown, so we should just let 
the exception propagate, so that it fails the test if thrown.
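
A minimal sketch of the two styles under discussion (the FsShell setup and the 
path are illustrative, not the actual test):
{code}
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FsShell;
import org.junit.Test;

public class TryCatchStyleSketch {
  private final FsShell shell = new FsShell(new Configuration());

  // Style in the current patch: the catch swallows the failure, so the
  // test only fails later, on an assert against a stale return code.
  @Test
  public void withTryCatch() {
    int ret = -1;
    try {
      ret = shell.run(new String[] {"-test", "-e", "/some/path"});
    } catch (Exception e) {
      // swallowed: the root cause is lost
    }
    assertEquals(0, ret);
  }

  // Suggested style: declare the exception and let JUnit fail the test
  // with the original stack trace.
  @Test
  public void withoutTryCatch() throws Exception {
    assertEquals(0, shell.run(new String[] {"-test", "-e", "/some/path"}));
  }
}
{code}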

> Update tests for FsShell's "test"
> -
>
> Key: HDFS-1933
> URL: https://issues.apache.org/jira/browse/HDFS-1933
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1933.patch
>
>
> Fix tests broken by refactoring.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1814) HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033218#comment-13033218
 ] 

Todd Lipcon commented on HDFS-1814:
---

As far as I know we have not yet reached the point where protocol-level changes 
between releases are considered incompatible -- i.e. no one expects a 0.22 
client to talk to a 0.23 server or vice versa.

Though I also agree that in this case, even if we had that guarantee, this 
wouldn't break anything since it is an additional protocol rather than a change 
to an old one.

> HDFS portion of HADOOP-7214 - Hadoop /usr/bin/groups equivalent
> ---
>
> Key: HDFS-1814
> URL: https://issues.apache.org/jira/browse/HDFS-1814
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 0.23.0
>
> Attachments: hdfs-1814.0.txt, hdfs-1814.1.txt, hdfs-1814.2.txt, 
> hdfs-1814.3.patch, hdfs-1814.4.patch, hdfs-1814.5.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1933) Update tests for FsShell's "test"

2011-05-13 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033217#comment-13033217
 ] 

Daryn Sharp commented on HDFS-1933:
---

I just followed the pattern that was there, but I can change it.  I suppose 
it's to let all the tests run, since they are all part of one giant JUnit test.

> Update tests for FsShell's "test"
> -
>
> Key: HDFS-1933
> URL: https://issues.apache.org/jira/browse/HDFS-1933
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1933.patch
>
>
> Fix tests broken by refactoring.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1787) "Not enough xcievers" error should propagate to client

2011-05-13 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033219#comment-13033219
 ] 

Jonathan Hsieh commented on HDFS-1787:
--

#2.  Some tabs got in there; reformatting to look like the subsequent code 
blocks.

#3.  Ok, just incrementing the counter in this case, and using the previous, 
more verbose message format.

#4.  It needs to be a reference to an integer because it is being incremented 
in one thread and read by another.  A plain Integer is not really trustworthy 
in these situations (it can end up reusing a cached constant), so I chose to 
use AtomicInteger.  In the input case, there is only a single thread.  Since 
this should be a rare error condition, I really wouldn't be concerned about 
its performance.

#5.  I believe the string that I use has a different purpose than shipping 
error messages; it normally carries a node name, and I hijacked it.  If the 
value is not a node name, this can cause an ArrayIndexOutOfBoundsException 
(the default value is initialized to -1).

More responses pending. 
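
A minimal sketch of the point in #4 (class and method names are illustrative, 
not the patch):
{code}
import java.util.concurrent.atomic.AtomicInteger;

// A counter written by one thread and read by another needs a
// thread-safe holder.
class ErrorCounter {
  // A plain Integer is immutable: "incrementing" one just rebinds the
  // reference (often to a cached boxed constant), and without
  // synchronization the reading thread may never observe the update.
  private final AtomicInteger errors = new AtomicInteger(0);

  void record() { errors.incrementAndGet(); }  // writer thread
  int snapshot() { return errors.get(); }      // reader thread
}
{code}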

> "Not enough xcievers" error should propagate to client
> --
>
> Key: HDFS-1787
> URL: https://issues.apache.org/jira/browse/HDFS-1787
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Jonathan Hsieh
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: hdfs-1787.patch
>
>
> We find that users often run into the default transceiver limits in the DN. 
> Putting aside the inherent issues with xceiver threads, it would be nice if 
> the "xceiver limit exceeded" error propagated to the client. Currently, 
> clients simply see an EOFException which is hard to interpret, and have to go 
> slogging through DN logs to find the underlying issue.
> The data transfer protocol should be extended to either have a special error 
> code for "not enough xceivers" or should have some error code for generic 
> errors with which a string can be attached and propagated.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033220#comment-13033220
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1332:
--

Ted, please correct me if I am wrong: {{NNThroughputBenchmark}} does not count 
replication and GC.

> How about adding a static boolean, blockPlacementDebug, ...

I actually like your idea.  In Hadoop, we use log levels for such purposes.  
That is why I suggested adding {{BlockPlacementPolicyDefault.LOG}} (it probably 
is better to add the log in {{BlockPlacementPolicy.LOG}}).  How does that sound 
to you?
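
A sketch of what the suggested log-level approach could look like, assuming 
such a LOG field is added (the class, method, and message are illustrative):
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class PlacementLoggingSketch {
  static final Log LOG = LogFactory.getLog(PlacementLoggingSketch.class);

  void logExclusion(String node, String reason) {
    // The isDebugEnabled() guard keeps the string concatenation off the
    // normal code path, addressing the performance concern.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Datanode " + node + " was not chosen: " + reason);
    }
  }
}
{code}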

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1936) Updating the layout version from HDFS-1822 causes upgrade problems.

2011-05-13 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-1936:
--

Attachment: HDFS-1936.trunk.patch
HDFS-1936.22.patch

Patch with the changes proposed in the jira.

For trunk, I will further clean this up in a separate jira to add an 
isFeatureSupported() check and remove the per-version checks.
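
A hypothetical sketch of that cleanup (names and version numbers are 
illustrative only, not the committed change):
{code}
// Tie each feature to the layout version that introduced it, instead of
// comparing raw version numbers at every call site.
class LayoutVersionSketch {
  enum Feature {
    SOME_FEATURE(-35),
    ANOTHER_FEATURE(-36);

    final int introducedIn;
    Feature(int lv) { this.introducedIn = lv; }
  }

  // HDFS layout versions are negative and decrease as features are
  // added, so "supported" means the on-disk version is at or past the
  // version that introduced the feature.
  static boolean isFeatureSupported(Feature f, int layoutVersion) {
    return layoutVersion <= f.introducedIn;
  }
}
{code}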

> Updating the layout version from HDFS-1822 causes upgrade problems.
> ---
>
> Key: HDFS-1936
> URL: https://issues.apache.org/jira/browse/HDFS-1936
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: HDFS-1936.22.patch, HDFS-1936.trunk.patch
>
>
> In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk 
> were changed. Some of the namenode logic that depends on layout version is 
> broken because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033231#comment-13033231
 ] 

Hadoop QA commented on HDFS-1592:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479026/HDFS-1592-2.patch
  against trunk revision 1102833.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/518//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/518//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/518//console

This message is automatically generated.

> Datanode startup doesn't honor volumes.tolerated 
> -
>
> Key: HDFS-1592
> URL: https://issues.apache.org/jira/browse/HDFS-1592
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.204.0
>Reporter: Bharath Mundlapudi
>Assignee: Bharath Mundlapudi
> Fix For: 0.20.204.0, 0.23.0
>
> Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, 
> HDFS-1592-rel20.patch
>
>
> Datanode startup doesn't honor volumes.tolerated for hadoop 20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033233#comment-13033233
 ] 

Ted Yu commented on HDFS-1332:
--

In this method:
{code}
  private void chooseRandom(int numOfReplicas,
                            String nodes,
                            HashMap<Node, Node> excludedNodes,
                            long blocksize,
                            int maxNodesPerRack,
                            List<DatanodeDescriptor> results)
{code}
it is possible that numOfAvailableNodes is greater than numOfReplicas.  That 
means that even if we see a bad target, NotEnoughReplicasException may not be 
thrown at the end of the method.
Logging each refusal without first collecting the reasons in a map would raise 
unnecessary alarms.
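
A sketch of that concern, with hypothetical helper names (pickRandomNode() and 
whyNotGoodTarget() stand in for the real selection and checking logic): collect 
the per-node reasons while choosing, and only surface them if placement 
actually fails.
{code}
Map<Node, String> reasons = new HashMap<Node, String>();
while (numOfReplicas > 0 && numOfAvailableNodes > 0) {
  DatanodeDescriptor target = pickRandomNode();
  numOfAvailableNodes--;
  String reason = whyNotGoodTarget(target);
  if (reason == null) {
    results.add(target);
    numOfReplicas--;
  } else {
    reasons.put(target, reason);  // remember, but do not log yet
  }
}
if (numOfReplicas > 0) {
  // Only here has placement really failed, so only here is the
  // collected detail worth reporting.
  throw new NotEnoughReplicasException("Nodes refused: " + reasons);
}
{code}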

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1933) Update tests for FsShell's "test"

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033282#comment-13033282
 ] 

Todd Lipcon commented on HDFS-1933:
---

Ah, fair enough... might as well be internally consistent.  I don't think it 
lets all the tests run, though: after any of the catch clauses, the next assert 
will fail because the variable wouldn't be set to a correct return code.  Patch 
looks good except there is one tab character:
+   args = new String[2];


> Update tests for FsShell's "test"
> -
>
> Key: HDFS-1933
> URL: https://issues.apache.org/jira/browse/HDFS-1933
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1933.patch
>
>
> Fix tests broken by refactoring.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033284#comment-13033284
 ] 

Todd Lipcon commented on HDFS-1332:
---

I think there's some confusion what Nicholas's suggestion is. Is it (a) that we 
use the log level to decide whether to add detail to the exception message? or 
(b) that we don't change the exception message at all, but rather add debug 
level messages for each of the cases where a node is ignored?

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1927) audit logs could ignore certain xsactions and also could contain "ip=null"

2011-05-13 Thread John George (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John George updated HDFS-1927:
--

Status: Open  (was: Patch Available)

Will "Submit patch after the corresponding common change goes in"

> audit logs could ignore certain xsactions and also could contain "ip=null"
> --
>
> Key: HDFS-1927
> URL: https://issues.apache.org/jira/browse/HDFS-1927
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: John George
>Assignee: John George
> Attachments: HDFS-1927.patch
>
>
> Namenode audit logs could be ignoring certain transactions that are 
> successfully completed. This is because it check if the RemoteIP is null to 
> decide if a transaction is remote or not. In certain cases, RemoteIP could 
> return null but the xsaction could still be "remote". An example is a case 
> where a client gets killed while in the middle of the transaction. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1933) Update tests for FsShell's "test"

2011-05-13 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-1933:
--

Attachment: HDFS-1933-2.patch

Phew.  I was trying to split up the giant test, which would require sharing a 
mini cluster object across tests (and would, incidentally, drastically reduce 
their runtime), but resetting the cluster and/or tearing it down was turning 
into a rabbit hole...

In this patch I simply removed the errant tab as requested.  Thanks!

> Update tests for FsShell's "test"
> -
>
> Key: HDFS-1933
> URL: https://issues.apache.org/jira/browse/HDFS-1933
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1933-2.patch, HDFS-1933.patch
>
>
> Fix tests broken by refactoring.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt

2011-05-13 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033299#comment-13033299
 ] 

Jitendra Nath Pandey commented on HDFS-1371:


DFSInputStream.java:
1. Please don't remove the public method getCurrentBlock.  The change in 
DFSClient won't be required if we don't change the method.  Also, there is no 
need to introduce the public method getCurrentLocatedBlock().
2. The javadoc comments for reportCheckSumFailure and addIntoCorruptedBlockMap 
don't list all parameters.
3. The corruptedBlockMap is created outside the loop in the read methods; 
therefore, after reporting the checksum failure you should still clear the map.
4. TestClientReportBadBlock: the comment "The order of data nodes..." should 
be moved before the loop.

Minor:
5. TestFsck.java: the indentation starting at the comment "// corrupt replicas" 
has two extra spaces.
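
An illustrative shape for item 3 (names are hypothetical, not the patch 
itself; reportCorruptedBlocks() is a stand-in for the real reporting call):
{code}
// The map outlives the loop, so it must be cleared after each report or
// stale entries get re-reported on the next iteration.
Map<ExtendedBlock, Set<DatanodeInfo>> corruptedBlockMap =
    new HashMap<ExtendedBlock, Set<DatanodeInfo>>();
int retries = 3;
while (retries-- > 0) {
  try {
    return readBuffer(buf, off, len, corruptedBlockMap);
  } catch (ChecksumException ce) {
    reportCorruptedBlocks(corruptedBlockMap);
    corruptedBlockMap.clear();  // the point of review item 3
  }
}
{code}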



> One bad node can incorrectly flag many files as corrupt
> ---
>
> Key: HDFS-1371
> URL: https://issues.apache.org/jira/browse/HDFS-1371
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, name-node
>Affects Versions: 0.20.1
> Environment: yahoo internal version 
> [knoguchi@gwgd4003 ~]$ hadoop version
> Hadoop 0.20.104.3.1007030707
>Reporter: Koji Noguchi
>Assignee: Tanping Wang
> Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, 
> HDFS-1371.0513.patch
>
>
> On our cluster, 12 files were reported as corrupt by fsck even though the 
> replicas on the datanodes were healthy.
> Turns out that all the replicas (12 files x 3 replicas per file) were 
> reported corrupt from one node.
> Surprisingly, these files were still readable/accessible from dfsclient 
> (-get/-cat) without any problems.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format

2011-05-13 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033307#comment-13033307
 ] 

Sanjay Radia commented on HDFS-1905:


The format command prints a usage message explaining its usage, so the issue is 
not as big as it is made out to be.
The notion of a cluster id will become more and more important over time and 
goes beyond federation; it has existed in Hadoop from the early days, but the 
design was not completely correct.  But let's not debate that; instead, we will 
change the NN format command to generate a cluster id if one is not supplied.  
This will maintain compatibility for this operator CLI.
Are folks happy with that?
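
A hedged sketch of the proposed fallback (all names are hypothetical, not the 
committed change):
{code}
import java.util.UUID;

// Fall back to a generated cluster id so plain "hdfs namenode -format"
// keeps working without extra arguments.
class FormatSketch {
  static String resolveClusterId(String supplied) {
    if (supplied == null || supplied.isEmpty()) {
      String generated = "CID-" + UUID.randomUUID();
      System.out.println("No cluster id provided; generated " + generated);
      return generated;
    }
    return supplied;
  }
}
{code}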

> Improve the usability of namenode -format 
> --
>
> Key: HDFS-1905
> URL: https://issues.apache.org/jira/browse/HDFS-1905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Bharath Mundlapudi
>Assignee: Bharath Mundlapudi
>Priority: Minor
> Fix For: 0.23.0
>
>
> While setting up 0.23 version based cluster, i ran into this issue. When i 
> issue a format namenode command, which got changed in 23, it should let the 
> user know to how to use this command in case where complete options were not 
> specified.
> ./hdfs namenode -format
> I get the following error msg, still its not clear what and how user should 
> use this command.
> 11/05/09 15:36:25 ERROR namenode.NameNode: 
> java.lang.IllegalArgumentException: Format must be provided with clusterid
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)
>  
> The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1881) After taking snapshot (upgrade) on Federation, the current directory of data node is emtpy.

2011-05-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033308#comment-13033308
 ] 

Suresh Srinivas commented on HDFS-1881:
---

Patch looks good. Thanks for catching the bug. One minor comment:
In the tests, use MiniDFSCluster#getFinalizedDir() instead of building the path 
from various fragments.  There are examples in other tests.


> After taking snapshot (upgrade) on Federation,  the current directory of data 
> node is emtpy.
> 
>
> Key: HDFS-1881
> URL: https://issues.apache.org/jira/browse/HDFS-1881
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tanping Wang
>Assignee: Tanping Wang
> Attachments: HDFS-1881.patch
>
>
> After taking a snapshot in Federation (by starting up namenode with option 
> -upgrade), it appears that the current directory of data node does not 
> contain the block files.  We have also verified that upgrading from 20 to 
> Federation does not have this problem.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1933) Update tests for FsShell's "test"

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033313#comment-13033313
 ] 

Todd Lipcon commented on HDFS-1933:
---

I seem to be getting the following failure even with the hadoop-side patch:

Testcase: testDFSShell took 0.961 sec
FAILED
expected:<1> but was:<-1>
junit.framework.AssertionFailedError: expected:<1> but was:<-1>
at 
org.apache.hadoop.hdfs.TestDFSShell.testDFSShell(TestDFSShell.java:1055)


> Update tests for FsShell's "test"
> -
>
> Key: HDFS-1933
> URL: https://issues.apache.org/jira/browse/HDFS-1933
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1933-2.patch, HDFS-1933.patch
>
>
> Fix tests broken by refactoring.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033315#comment-13033315
 ] 

Todd Lipcon commented on HDFS-1905:
---

Sounds good to me, Sanjay.

> Improve the usability of namenode -format 
> --
>
> Key: HDFS-1905
> URL: https://issues.apache.org/jira/browse/HDFS-1905
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Bharath Mundlapudi
>Assignee: Bharath Mundlapudi
>Priority: Minor
> Fix For: 0.23.0
>
>
> While setting up 0.23 version based cluster, i ran into this issue. When i 
> issue a format namenode command, which got changed in 23, it should let the 
> user know to how to use this command in case where complete options were not 
> specified.
> ./hdfs namenode -format
> I get the following error msg, still its not clear what and how user should 
> use this command.
> 11/05/09 15:36:25 ERROR namenode.NameNode: 
> java.lang.IllegalArgumentException: Format must be provided with clusterid
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)
>  
> The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1117) HDFS portion of HADOOP-6728 (ovehaul metrics framework)

2011-05-13 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-1117:
--

Attachment: HDFS-1117.3.patch

Some tests are failing due to OutOfMemoryError.  Increasing the JUnit Java heap 
size from 512m to 1024m.

> HDFS portion of HADOOP-6728 (ovehaul metrics framework)
> ---
>
> Key: HDFS-1117
> URL: https://issues.apache.org/jira/browse/HDFS-1117
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.20.2
>Reporter: Luke Lu
>Assignee: Luke Lu
> Fix For: 0.23.0
>
> Attachments: HDFS-1117.2.patch, HDFS-1117.3.patch, HDFS-1117.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1117) HDFS portion of HADOOP-6728 (ovehaul metrics framework)

2011-05-13 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-1117:
--

Status: Open  (was: Patch Available)

> HDFS portion of HADOOP-6728 (ovehaul metrics framework)
> ---
>
> Key: HDFS-1117
> URL: https://issues.apache.org/jira/browse/HDFS-1117
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.20.2
>Reporter: Luke Lu
>Assignee: Luke Lu
> Fix For: 0.23.0
>
> Attachments: HDFS-1117.2.patch, HDFS-1117.3.patch, HDFS-1117.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1117) HDFS portion of HADOOP-6728 (ovehaul metrics framework)

2011-05-13 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-1117:
--

Status: Patch Available  (was: Open)

> HDFS portion of HADOOP-6728 (ovehaul metrics framework)
> ---
>
> Key: HDFS-1117
> URL: https://issues.apache.org/jira/browse/HDFS-1117
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.20.2
>Reporter: Luke Lu
>Assignee: Luke Lu
> Fix For: 0.23.0
>
> Attachments: HDFS-1117.2.patch, HDFS-1117.3.patch, HDFS-1117.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033324#comment-13033324
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1332:
--

From the summary and description, I think this is about log messages.  So my 
suggestion is

(c) If {{BlockPlacementPolicy.LOG.isDebugEnabled()}}, construct a log message 
and then print it before throwing the exception.  Otherwise, neither a HashMap 
nor any strings should be created.
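
A sketch of the shape of option (c), with illustrative names (the reasons map 
here is assumed to be populated only under the same debug guard earlier in the 
method, so the non-debug path allocates neither a HashMap nor any strings):
{code}
if (LOG.isDebugEnabled()) {
  StringBuilder sb = new StringBuilder("Not able to place enough replicas: ");
  for (Map.Entry<Node, String> e : reasons.entrySet()) {
    sb.append(e.getKey()).append(" [").append(e.getValue()).append("] ");
  }
  LOG.debug(sb.toString());
}
throw new NotEnoughReplicasException("Not able to place enough replicas");
{code}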

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart

2011-05-13 Thread Matt Foley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033326#comment-13033326
 ] 

Matt Foley commented on HDFS-1921:
--

Todd and Aaron, regarding v23: I understand this may be modified by work in 
HDFS-1073, but until HDFS-1073 is ready to come out I'd like to keep trunk as 
clean as possible.  So I think this patch should go into both v22 and v23.  Is 
there any serious clash with patches already done for HDFS-1073?  Thanks.

> Save namespace can cause NN to be unable to come up on restart
> --
>
> Key: HDFS-1921
> URL: https://issues.apache.org/jira/browse/HDFS-1921
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Matt Foley
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: hdfs-1505-1-test.txt, hdfs1921_v23.patch, 
> hdfs1921_v23.patch
>
>
> I discovered this in the course of trying to implement a fix for HDFS-1505.
> Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save 
> namespace proceeds in the following order:
> # rename current to lastcheckpoint.tmp for all of them,
> # save image and recreate edits for all of them,
> # rename lastcheckpoint.tmp to previous.checkpoint.
> The problem is that step 3 occurs regardless of whether or not an error 
> occurs for all storage directories in step 2. Upon restart, the NN will see 
> non-existent or corrupt {{current}} directories, and no 
> {{lastcheckpoint.tmp}} directories, and so will conclude that the storage 
> directories are not formatted.
> This issue appears to be present on both 0.22 and 0.23. This should arguably 
> be a 0.22/0.23 blocker.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart

2011-05-13 Thread Matt Foley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033329#comment-13033329
 ] 

Matt Foley commented on HDFS-1921:
--

None of the test errors are related to this patch (all four are recurring; see 
HDFS-1852).
I agree with Aaron that his new unit test for HDFS-1505 is a good test for this 
patch too, so no additional unit tests are needed (the core of that unit test 
is attached to this Jira and passes local testing).

> Save namespace can cause NN to be unable to come up on restart
> --
>
> Key: HDFS-1921
> URL: https://issues.apache.org/jira/browse/HDFS-1921
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Matt Foley
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: hdfs-1505-1-test.txt, hdfs1921_v23.patch, 
> hdfs1921_v23.patch
>
>
> I discovered this in the course of trying to implement a fix for HDFS-1505.
> Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save 
> namespace proceeds in the following order:
> # rename current to lastcheckpoint.tmp for all of them,
> # save image and recreate edits for all of them,
> # rename lastcheckpoint.tmp to previous.checkpoint.
> The problem is that step 3 occurs regardless of whether or not an error 
> occurs for all storage directories in step 2. Upon restart, the NN will see 
> non-existent or corrupt {{current}} directories, and no 
> {{lastcheckpoint.tmp}} directories, and so will conclude that the storage 
> directories are not formatted.
> This issue appears to be present on both 0.22 and 0.23. This should arguably 
> be a 0.22/0.23 blocker.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1933) Update tests for FsShell's "test"

2011-05-13 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033340#comment-13033340
 ] 

Daryn Sharp commented on HDFS-1933:
---

Odd.  I just reran HDFS-1933-2.patch against hadoop-common's trunk and the test 
passed.  Would you please double-check?

> Update tests for FsShell's "test"
> -
>
> Key: HDFS-1933
> URL: https://issues.apache.org/jira/browse/HDFS-1933
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-1933-2.patch, HDFS-1933.patch
>
>
> Fix tests broken by refactoring.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated

2011-05-13 Thread Bharath Mundlapudi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033343#comment-13033343
 ] 

Bharath Mundlapudi commented on HDFS-1592:
--

These failing tests are not related to this patch.


> Datanode startup doesn't honor volumes.tolerated 
> -
>
> Key: HDFS-1592
> URL: https://issues.apache.org/jira/browse/HDFS-1592
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.204.0
>Reporter: Bharath Mundlapudi
>Assignee: Bharath Mundlapudi
> Fix For: 0.20.204.0, 0.23.0
>
> Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, 
> HDFS-1592-rel20.patch
>
>
> Datanode startup doesn't honor volumes.tolerated for hadoop 20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033346#comment-13033346
 ] 

Todd Lipcon commented on HDFS-1332:
---

Hey Nicholas. How do you feel about the following compromise:
- For the simple case that there are no datanodes in the cluster, we include 
some additional detail in the exception message indicating as much. This will 
help the common case of a new user whose datanodes failed to start and who is 
confused about why they can't write blocks. This should be in the IOException 
itself so that it propagates to the client.
- If debug is enabled, we construct the HashMap as above and log the "failure 
to allocate block" type messages at WARN level.
- If debug is not enabled, then we log a message that says something like 
"failure to allocate block ... For more information, please enable DEBUG level 
logging on the o.a.h.BlockPlacementPolicyDefault logger."

This should avoid any performance impact, but also point users down the right 
path to solving the issues.

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033347#comment-13033347
 ] 

Todd Lipcon commented on HDFS-1921:
---

bq. Is there any serious clash with patches already done for HDFS-1073? Thanks

Yes, this will most likely clash with 1073. If you want to commit it to trunk, 
that's fine - the next time I merge 1073 with trunk, I'll probably just add a 
TODO task for that branch to make sure we haven't regressed this behavior.

> Save namespace can cause NN to be unable to come up on restart
> --
>
> Key: HDFS-1921
> URL: https://issues.apache.org/jira/browse/HDFS-1921
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Matt Foley
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: hdfs-1505-1-test.txt, hdfs1921_v23.patch, 
> hdfs1921_v23.patch
>
>
> I discovered this in the course of trying to implement a fix for HDFS-1505.
> Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save 
> namespace proceeds in the following order:
> # rename current to lastcheckpoint.tmp for all of them,
> # save image and recreate edits for all of them,
> # rename lastcheckpoint.tmp to previous.checkpoint.
> The problem is that step 3 occurs regardless of whether or not an error 
> occurs for all storage directories in step 2. Upon restart, the NN will see 
> non-existent or corrupt {{current}} directories, and no 
> {{lastcheckpoint.tmp}} directories, and so will conclude that the storage 
> directories are not formatted.
> This issue appears to be present on both 0.22 and 0.23. This should arguably 
> be a 0.22/0.23 blocker.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1881) After taking snapshot (upgrade) on Federation, the current directory of data node is emtpy.

2011-05-13 Thread Tanping Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanping Wang updated HDFS-1881:
---

Attachment: HDFS-1881.2.patch

Upload a patch to address the review comments.

> After taking snapshot (upgrade) on Federation,  the current directory of data 
> node is emtpy.
> 
>
> Key: HDFS-1881
> URL: https://issues.apache.org/jira/browse/HDFS-1881
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Tanping Wang
>Assignee: Tanping Wang
> Attachments: HDFS-1881.2.patch, HDFS-1881.patch
>
>
> After taking a snapshot in Federation (by starting up namenode with option 
> -upgrade), it appears that the current directory of data node does not 
> contain the block files.  We have also verified that upgrading from 20 to 
> Federation does not have this problem.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart

2011-05-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033349#comment-13033349
 ] 

Suresh Srinivas commented on HDFS-1921:
---

Looks good.

Comments:
# The thread-starting logic is duplicated; it could be moved into a separate 
method.  Also, the continue in the catch block is redundant.
# Minor: per the coding guidelines, please add { } after if statements.
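
An illustrative sketch of comment 1 (names are hypothetical, not the patch):
{code}
// Hoist the duplicated thread-starting code into one helper, and brace
// the "if" per the second comment.
private void startSaver(Runnable saver, List<Thread> saveThreads) {
  Thread t = new Thread(saver, saver.toString());
  saveThreads.add(t);
  t.start();
}

// At the call site, braces even around a single statement:
if (shouldSaveImage(sd)) {                      // shouldSaveImage() is illustrative
  startSaver(new ImageSaver(sd), saveThreads);  // so is ImageSaver
}
{code}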


> Save namespace can cause NN to be unable to come up on restart
> --
>
> Key: HDFS-1921
> URL: https://issues.apache.org/jira/browse/HDFS-1921
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Matt Foley
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: hdfs-1505-1-test.txt, hdfs1921_v23.patch, 
> hdfs1921_v23.patch
>
>
> I discovered this in the course of trying to implement a fix for HDFS-1505.
> Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save 
> namespace proceeds in the following order:
> # rename current to lastcheckpoint.tmp for all of them,
> # save image and recreate edits for all of them,
> # rename lastcheckpoint.tmp to previous.checkpoint.
> The problem is that step 3 occurs regardless of whether or not an error 
> occurs for all storage directories in step 2. Upon restart, the NN will see 
> non-existent or corrupt {{current}} directories, and no 
> {{lastcheckpoint.tmp}} directories, and so will conclude that the storage 
> directories are not formatted.
> This issue appears to be present on both 0.22 and 0.23. This should arguably 
> be a 0.22/0.23 blocker.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1936) Updating the layout version from HDFS-1822 causes upgrade problems.

2011-05-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033354#comment-13033354
 ] 

Suresh Srinivas commented on HDFS-1936:
---

Todd, can you review the patch?

> Updating the layout version from HDFS-1822 causes upgrade problems.
> ---
>
> Key: HDFS-1936
> URL: https://issues.apache.org/jira/browse/HDFS-1936
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: HDFS-1936.22.patch, HDFS-1936.trunk.patch
>
>
> In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk 
> were changed. Some of the namenode logic that depends on layout version is 
> broken because of this. Read the comment for more description.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1899) GenericTestUtils.formatNamenode is misplaced

2011-05-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1899:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Ted!

> GenericTestUtils.formatNamenode is misplaced
> 
>
> Key: HDFS-1899
> URL: https://issues.apache.org/jira/browse/HDFS-1899
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1899.patch
>
>
> This function belongs in DFSTestUtil, the standard place for putting 
> cluster-related utils.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated

2011-05-13 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033356#comment-13033356
 ] 

Jitendra Nath Pandey commented on HDFS-1592:


+1 for the patch.

> Datanode startup doesn't honor volumes.tolerated 
> -
>
> Key: HDFS-1592
> URL: https://issues.apache.org/jira/browse/HDFS-1592
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.204.0
>Reporter: Bharath Mundlapudi
>Assignee: Bharath Mundlapudi
> Fix For: 0.20.204.0, 0.23.0
>
> Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, 
> HDFS-1592-rel20.patch
>
>
> Datanode startup doesn't honor volumes.tolerated for hadoop 20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1929) TestEditLogFileOutputStream fails if running on same host as NN

2011-05-13 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-1929:
-

Attachment: hdfs-1929.0.patch

Patch which reworks the test to use a {{MiniDFSCluster}} instead of starting an 
NN manually.  I verified that the test now passes on my box with another 
process bound to port 50070, whereas it failed before.

> TestEditLogFileOutputStream fails if running on same host as NN
> ---
>
> Key: HDFS-1929
> URL: https://issues.apache.org/jira/browse/HDFS-1929
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
> Attachments: hdfs-1929.0.patch
>
>
> This test instantiates NameNode directly rather than using MiniDFSCluster, so 
> it tries to claim the default port 50070 rather than using an ephemeral one. 
> This makes the test fail if you are running a NN on the same machine where 
> you're running the test.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1929) TestEditLogFileOutputStream fails if running on same host as NN

2011-05-13 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-1929:
-

Status: Patch Available  (was: Open)

> TestEditLogFileOutputStream fails if running on same host as NN
> ---
>
> Key: HDFS-1929
> URL: https://issues.apache.org/jira/browse/HDFS-1929
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
> Attachments: hdfs-1929.0.patch
>
>
> This test instantiates NameNode directly rather than using MiniDFSCluster, so 
> it tries to claim the default port 50070 rather than using an ephemeral one. 
> This makes the test fail if you are running a NN on the same machine where 
> you're running the test.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1929) TestEditLogFileOutputStream fails if running on same host as NN

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033362#comment-13033362
 ] 

Todd Lipcon commented on HDFS-1929:
---

Sorry to be a nit picker, but we should specify numDataNodes(0) so the test 
runs a little faster.
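
A hedged sketch of what the reworked setup might look like with this nit 
applied (not the exact patch):
{code}
// MiniDFSCluster picks ephemeral ports instead of binding 50070, and
// zero datanodes are enough to exercise the edit log.
Configuration conf = new HdfsConfiguration();
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
    .numDataNodes(0)
    .build();
try {
  // ... run the EditLogFileOutputStream checks against
  // cluster.getNameNode() here ...
} finally {
  cluster.shutdown();
}
{code}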

> TestEditLogFileOutputStream fails if running on same host as NN
> ---
>
> Key: HDFS-1929
> URL: https://issues.apache.org/jira/browse/HDFS-1929
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
> Attachments: hdfs-1929.0.patch
>
>
> This test instantiates NameNode directly rather than using MiniDFSCluster, so 
> it tries to claim the default port 50070 rather than using an ephemeral one. 
> This makes the test fail if you are running a NN on the same machine where 
> you're running the test.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033363#comment-13033363
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1332:
--

Todd, it sounds good to me.

Some optional optimizations:
- I believe the use of {{HashMap}} can be eliminated, since whenever we would 
put an entry into the map, we may simply append it to the string builder 
instead.
- Also, we may make the string builder {{ThreadLocal}} to minimize object 
creation.
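
A sketch of the {{ThreadLocal}} idea (illustrative, not a committed change):
{code}
// One reusable builder per handler thread, reset on each use instead of
// reallocated.
private static final ThreadLocal<StringBuilder> BUILDER =
    new ThreadLocal<StringBuilder>() {
      @Override
      protected StringBuilder initialValue() {
        return new StringBuilder(1024);
      }
    };

static StringBuilder threadLocalBuilder() {
  StringBuilder sb = BUILDER.get();
  sb.setLength(0);  // reuse the thread's buffer instead of allocating
  return sb;
}
{code}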

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1929) TestEditLogFileOutputStream fails if running on same host as NN

2011-05-13 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-1929:
-

Attachment: hdfs-1929.1.patch

Thanks for the review, Todd. Here's an updated patch which sets numDataNodes to 
0. This sped up the run of this test case by 10 seconds.

> TestEditLogFileOutputStream fails if running on same host as NN
> ---
>
> Key: HDFS-1929
> URL: https://issues.apache.org/jira/browse/HDFS-1929
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
> Attachments: hdfs-1929.0.patch, hdfs-1929.1.patch
>
>
> This test instantiates NameNode directly rather than using MiniDFSCluster, so 
> it tries to claim the default port 50070 rather than using an ephemeral one. 
> This makes the test fail if you are running a NN on the same machine where 
> you're running the test.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033366#comment-13033366
 ] 

Todd Lipcon commented on HDFS-1332:
---

Good ideas, both. Ted, want to update the patch?

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033368#comment-13033368
 ] 

Todd Lipcon commented on HDFS-1332:
---

Just a sample of why this JIRA is important, from #hadoop IRC a minute ago:
{code}
15:23 < user> Hi all. I have a strange problem - I can't seem to run a simple copyFromLocal command from my local machine, but everything works fine if i log into the master as the same user and issue command
15:25 < user> the error it throws is cryptic - java.io.IOException: File /foo/bar could only be replicated to 0 nodes, instead of 1
15:26 < user> i checked the namenode and tasktracker logs - nothing of interest there (except the error i mentioned, of course)
{code}

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033367#comment-13033367
 ] 

Ted Yu commented on HDFS-1332:
--

From 
http://sbdevel.wordpress.com/2009/03/12/threadlocal-stringbuilders-for-fast-text-processing/, 
Caveat Emptor:

So what’s the catch? You will be keeping one string builder around for each new 
thread that ever enters processRecord(). This could potentially end up as lots 
of string builders ...

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033372#comment-13033372
 ] 

Todd Lipcon commented on HDFS-1332:
---

bq. So what’s the catch? You will be keeping one string builder around for each 
new thread that ever enters processRecord(). This could potentially end up as 
lots of string builders ...

In this case, there is a bounded set of IPC handler threads in the NameNode, so 
each of those will keep around a StringBuilder, which might have a 1KB buffer. 
Total memory usage might therefore be a few hundred KB.
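
For concreteness, a minimal sketch of the thread-local builder pattern being 
discussed, assuming hypothetical names; the 1KB initial capacity and the 
setLength(0) reset are illustrative, not the actual patch:
{code}
// Each IPC handler thread keeps exactly one builder, so with a bounded
// handler pool the retained memory is bounded too (roughly threads * buffer).
public class ThreadLocalBuilderSketch {
  private static final ThreadLocal<StringBuilder> BUILDER =
      ThreadLocal.withInitial(() -> new StringBuilder(1024));

  static String buildMessage(String node, String reason) {
    StringBuilder sb = BUILDER.get();
    sb.setLength(0);  // reuse the buffer instead of allocating a new builder
    sb.append(node).append(" was not chosen because ").append(reason);
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(buildMessage("dn1", "the data dir on /tmp is full"));
  }
}
{code}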

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1117) HDFS portion of HADOOP-6728 (ovehaul metrics framework)

2011-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033374#comment-13033374
 ] 

Hadoop QA commented on HDFS-1117:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479174/HDFS-1117.3.patch
  against trunk revision 1102833.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 43 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.cli.TestHDFSCLI
  org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode
  org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestLargeBlock

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/519//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/519//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/519//console

This message is automatically generated.

> HDFS portion of HADOOP-6728 (ovehaul metrics framework)
> ---
>
> Key: HDFS-1117
> URL: https://issues.apache.org/jira/browse/HDFS-1117
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.20.2
>Reporter: Luke Lu
>Assignee: Luke Lu
> Fix For: 0.23.0
>
> Attachments: HDFS-1117.2.patch, HDFS-1117.3.patch, HDFS-1117.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-1937) Improve DataTransferProtocol

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)
Improve DataTransferProtocol


 Key: HDFS-1937
 URL: https://issues.apache.org/jira/browse/HDFS-1937
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE


This is an umbrella JIRA for improving {{DataTransferProtocol}}.

{{DataTransferProtocol}} is implemented directly on sockets, and the code is 
spread across the datanode classes and the {{DFSClient}} classes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1917) Clean up duplication of dependent jar files

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033379#comment-13033379
 ] 

Todd Lipcon commented on HDFS-1917:
---

Not sure why the testbot didn't catch this, but it looks like this commit broke 
the system test compilation. I see "Reference ivy-hdfs.classpath not found." 
when I run ant test-system; reverting this patch locally fixes it.

> Clean up duplication of dependent jar files
> ---
>
> Key: HDFS-1917
> URL: https://issues.apache.org/jira/browse/HDFS-1917
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
> Environment: Java 6, RHEL 5.5
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: 0.23.0
>
> Attachments: HDFS-1917-1.patch, HDFS-1917.patch
>
>
> For trunk, the build and deployment tree looks like this:
> hadoop-common-0.2x.y
> hadoop-hdfs-0.2x.y
> hadoop-mapred-0.2x.y
> Technically, HDFS's third-party dependent jar files should be fetched from 
> hadoop-common.  However, they are currently fetched from hadoop-hdfs/lib 
> only.  It would be nice to eliminate the need to repeat duplicated jar files 
> at build time.
> There are two options to manage this dependency list: continue to enhance 
> the ant build structure to fetch and filter jar file dependencies using ivy, 
> or take the opportunity to convert the build structure to maven and use 
> maven to manage the provided jar files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (HDFS-1917) Clean up duplication of dependent jar files

2011-05-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reopened HDFS-1917:
---


> Clean up duplication of dependent jar files
> ---
>
> Key: HDFS-1917
> URL: https://issues.apache.org/jira/browse/HDFS-1917
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
> Environment: Java 6, RHEL 5.5
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: 0.23.0
>
> Attachments: HDFS-1917-1.patch, HDFS-1917.patch
>
>
> For trunk, the build and deployment tree looks like this:
> hadoop-common-0.2x.y
> hadoop-hdfs-0.2x.y
> hadoop-mapred-0.2x.y
> Technically, HDFS's third-party dependent jar files should be fetched from 
> hadoop-common.  However, they are currently fetched from hadoop-hdfs/lib 
> only.  It would be nice to eliminate the need to repeat duplicated jar files 
> at build time.
> There are two options to manage this dependency list: continue to enhance 
> the ant build structure to fetch and filter jar file dependencies using ivy, 
> or take the opportunity to convert the build structure to maven and use 
> maven to manage the provided jar files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1937) Improve DataTransferProtocol

2011-05-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1937:
--

Attachment: hdfs-1937-some-preliminary-junk.txt

Hey Nicholas. I'm a big +1 for this idea; the code is kind of a mess.

I had started some work in this direction last year, so I figured I would 
upload the patch in case you find it useful as a starting point.

> Improve DataTransferProtocol
> 
>
> Key: HDFS-1937
> URL: https://issues.apache.org/jira/browse/HDFS-1937
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, hdfs client
>Reporter: Tsz Wo (Nicholas), SZE
> Attachments: hdfs-1937-some-preliminary-junk.txt
>
>
> This is an umbrella JIRA for improving {{DataTransferProtocol}}.
> {{DataTransferProtocol}} is implemented directly on sockets, and the code is 
> spread across the datanode classes and the {{DFSClient}} classes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-1332:
-

Attachment: (was: HDFS-1332.patch)

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-1332:
-

Attachment: HDFS-1332.patch

Updated patch that removes the HashMap and guards the additional logging with 
LOG.isDebugEnabled().

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



[jira] [Commented] (HDFS-1917) Clean up duplication of dependent jar files

2011-05-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033388#comment-13033388
 ] 

Todd Lipcon commented on HDFS-1917:
---

Jolly points out that making ivy-retrieve-common depend on ivy-retrieve-hdfs 
fixes the issue. Does that seem right to you guys?

> Clean up duplication of dependent jar files
> ---
>
> Key: HDFS-1917
> URL: https://issues.apache.org/jira/browse/HDFS-1917
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
> Environment: Java 6, RHEL 5.5
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: 0.23.0
>
> Attachments: HDFS-1917-1.patch, HDFS-1917.patch
>
>
> For trunk, the build and deployment tree looks like this:
> hadoop-common-0.2x.y
> hadoop-hdfs-0.2x.y
> hadoop-mapred-0.2x.y
> Technically, HDFS's third-party dependent jar files should be fetched from 
> hadoop-common.  However, they are currently fetched from hadoop-hdfs/lib 
> only.  It would be nice to eliminate the need to repeat duplicated jar files 
> at build time.
> There are two options to manage this dependency list: continue to enhance 
> the ant build structure to fetch and filter jar file dependencies using ivy, 
> or take the opportunity to convert the build structure to maven and use 
> maven to manage the provided jar files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-1332:
-

Attachment: (was: HDFS-1332.patch)

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded

2011-05-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HDFS-1332:
-

Attachment: HDFS-1332.patch

Clears the StringBuilder after we have used its contents.

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes 
> were excluded
> --
>
> Key: HDFS-1332
> URL: https://issues.apache.org/jira/browse/HDFS-1332
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Todd Lipcon
>Assignee: Ted Yu
>Priority: Minor
>  Labels: newbie
> Fix For: 0.23.0
>
> Attachments: HDFS-1332.patch
>
>
> Whenever the block placement policy determines that a node is not a "good 
> target" it could add the reason for exclusion to a list, and then when we log 
> "Not able to place enough replicas" we could say why each node was refused. 
> This would help new users who are having issues on pseudo-distributed (eg 
> because their data dir is on /tmp and /tmp is full). Right now it's very 
> difficult to figure out the issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1929) TestEditLogFileOutputStream fails if running on same host as NN

2011-05-13 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-1929:
-

Attachment: hdfs-1929.2.patch

Updated patch rebased against trunk.

> TestEditLogFileOutputStream fails if running on same host as NN
> ---
>
> Key: HDFS-1929
> URL: https://issues.apache.org/jira/browse/HDFS-1929
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
> Attachments: hdfs-1929.0.patch, hdfs-1929.1.patch, hdfs-1929.2.patch
>
>
> This test instantiates NameNode directly rather than using MiniDFSCluster, so 
> it tries to claim the default port 50070 rather than using an ephemeral one. 
> This makes the test fail if you are running a NN on the same machine where 
> you're running the test.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart

2011-05-13 Thread Matt Foley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033392#comment-13033392
 ] 

Matt Foley commented on HDFS-1921:
--

Dmytro, since this is a mod of HDFS-1071, would you like to review it?
It's short :-)  Thanks, if you have time.

> Save namespace can cause NN to be unable to come up on restart
> --
>
> Key: HDFS-1921
> URL: https://issues.apache.org/jira/browse/HDFS-1921
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Aaron T. Myers
>Assignee: Matt Foley
>Priority: Blocker
> Fix For: 0.22.0, 0.23.0
>
> Attachments: hdfs-1505-1-test.txt, hdfs1921_v23.patch, 
> hdfs1921_v23.patch
>
>
> I discovered this in the course of trying to implement a fix for HDFS-1505.
> Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save 
> namespace proceeds in the following order:
> # rename current to lastcheckpoint.tmp for all of them,
> # save image and recreate edits for all of them,
> # rename lastcheckpoint.tmp to previous.checkpoint.
> The problem is that step 3 occurs regardless of whether or not an error 
> occurs for all storage directories in step 2. Upon restart, the NN will see 
> non-existent or corrupt {{current}} directories, and no 
> {{lastcheckpoint.tmp}} directories, and so will conclude that the storage 
> directories are not formatted.
> This issue appears to be present on both 0.22 and 0.23. This should arguably 
> be a 0.22/0.23 blocker.
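
To illustrate the failure mode, a hedged sketch of the ordering described 
above; the directory interface and method names are hypothetical, not the 
actual FSImage code:
{code}
import java.util.List;

public class SaveNamespaceSketch {
  interface StorageDir {
    void renameCurrentToLastCheckpointTmp();            // step 1
    void saveImageAndRecreateEdits() throws Exception;  // step 2, may fail
    void renameLastCheckpointTmpToPrevious();           // step 3
  }

  static void saveNamespace(List<StorageDir> dirs) {
    for (StorageDir d : dirs) d.renameCurrentToLastCheckpointTmp();
    for (StorageDir d : dirs) {
      try {
        d.saveImageAndRecreateEdits();
      } catch (Exception e) {
        // BUG: the failure is swallowed, yet the directory still takes part
        // in step 3 below, so on restart it has neither a valid current/
        // nor a lastcheckpoint.tmp/ to fall back on.
      }
    }
    // Step 3 runs unconditionally -- this is the problem the issue describes.
    for (StorageDir d : dirs) d.renameLastCheckpointTmpToPrevious();
  }
}
{code}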

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1932) Add utility method to initialize HDFS default configurations

2011-05-13 Thread Jolly Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jolly Chen updated HDFS-1932:
-

Attachment: hdfs-1932.txt

> Add utility method to initialize HDFS default configurations
> 
>
> Key: HDFS-1932
> URL: https://issues.apache.org/jira/browse/HDFS-1932
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Priority: Critical
>  Labels: newbie
> Attachments: hdfs-1932.txt
>
>
> Currently we have code blocks like the following in lots of places:
> {code}
>   static{
> Configuration.addDefaultResource("hdfs-default.xml");
> Configuration.addDefaultResource("hdfs-site.xml");
>   }
> {code}
> This is dangerous since, if we don't remember to also classload 
> HdfsConfiguration, the config key deprecations won't work. We should add a 
> method like HdfsConfiguration.init() which would load the default resources 
> as well as ensure that deprecation gets initialized properly.
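
The description suggests a method like HdfsConfiguration.init(); a minimal 
sketch of what that could look like, with hypothetical internals (the real 
class may differ):
{code}
import org.apache.hadoop.conf.Configuration;

public class HdfsConfiguration extends Configuration {
  static {
    addDeprecatedKeys();  // hypothetical helper registering key deprecations
    Configuration.addDefaultResource("hdfs-default.xml");
    Configuration.addDefaultResource("hdfs-site.xml");
  }

  // Intentionally empty: calling init() merely forces this class to load,
  // which runs the static block above before any Configuration is created.
  public static void init() {
  }

  private static void addDeprecatedKeys() {
    // e.g. map old config key names to their replacements here.
  }
}
{code}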

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1932) Add utility method to initialize HDFS default configurations

2011-05-13 Thread Jolly Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jolly Chen updated HDFS-1932:
-

Assignee: Jolly Chen
  Status: Patch Available  (was: Open)

> Add utility method to initialize HDFS default configurations
> 
>
> Key: HDFS-1932
> URL: https://issues.apache.org/jira/browse/HDFS-1932
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Jolly Chen
>Priority: Critical
>  Labels: newbie
> Attachments: hdfs-1932.txt
>
>
> Currently we have code blocks like the following in lots of places:
> {code}
>   static{
> Configuration.addDefaultResource("hdfs-default.xml");
> Configuration.addDefaultResource("hdfs-site.xml");
>   }
> {code}
> This is dangerous since, if we don't remember to also classload 
> HdfsConfiguration, the config key deprecations won't work. We should add a 
> method like HdfsConfiguration.init() which would load the default resources 
> as well as ensure that deprecation gets initialized properly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-1938) Reference ivy-hdfs.classpath not found.

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)
 Reference ivy-hdfs.classpath not found.


 Key: HDFS-1938
 URL: https://issues.apache.org/jira/browse/HDFS-1938
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Affects Versions: 0.23.0
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Eric Yang
Priority: Minor


{noformat}
$ant test-system
...
BUILD FAILED
/export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:129: The following 
error occurred while executing this line:
/export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:183: The following 
error occurred while executing this line:
/export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:193: The following 
error occurred while executing this line:
/export/crawlspace/tsz/hdfs/h1/build.xml:449: Reference ivy-hdfs.classpath not 
found.
{noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-1917) Clean up duplication of dependent jar files

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE resolved HDFS-1917.
--

Resolution: Fixed

Todd, good catch.  Filed HDFS-1938 and Eric is looking at it.

> Clean up duplication of dependent jar files
> ---
>
> Key: HDFS-1917
> URL: https://issues.apache.org/jira/browse/HDFS-1917
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
> Environment: Java 6, RHEL 5.5
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: 0.23.0
>
> Attachments: HDFS-1917-1.patch, HDFS-1917.patch
>
>
> For trunk, the build and deployment tree looks like this:
> hadoop-common-0.2x.y
> hadoop-hdfs-0.2x.y
> hadoop-mapred-0.2x.y
> Technically, HDFS's third-party dependent jar files should be fetched from 
> hadoop-common.  However, they are currently fetched from hadoop-hdfs/lib 
> only.  It would be nice to eliminate the need to repeat duplicated jar files 
> at build time.
> There are two options to manage this dependency list: continue to enhance 
> the ant build structure to fetch and filter jar file dependencies using ivy, 
> or take the opportunity to convert the build structure to maven and use 
> maven to manage the provided jar files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1917) Clean up duplication of dependent jar files

2011-05-13 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033400#comment-13033400
 ] 

Eric Yang commented on HDFS-1917:
-

ivy-retrieve-common should not depend on ivy-retrieve-hdfs; that would create a 
circular dependency.  It looks like ivy-retrieve-hdfs is not initialized when 
test-system is called.  I will track it down. Thanks, Todd.

> Clean up duplication of dependent jar files
> ---
>
> Key: HDFS-1917
> URL: https://issues.apache.org/jira/browse/HDFS-1917
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
> Environment: Java 6, RHEL 5.5
>Reporter: Eric Yang
>Assignee: Eric Yang
> Fix For: 0.23.0
>
> Attachments: HDFS-1917-1.patch, HDFS-1917.patch
>
>
> For trunk, the build and deployment tree look like this:
> hadoop-common-0.2x.y
> hadoop-hdfs-0.2x.y
> hadoop-mapred-0.2x.y
> Technically, hdfs's the third party dependent jar files should be fetch from 
> hadoop-common.  However, it is currently fetching from hadoop-hdfs/lib only.  
> It would be nice to eliminate the need to repeat duplicated jar files at 
> build time.
> There are two options to manage this dependency list, continue to enhance ant 
> build structure to fetch and filter jar file dependencies using ivy.  On the 
> other hand, it would be a good opportunity to convert the build structure to 
> maven, and use maven to manage the provided jar files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1938) Reference ivy-hdfs.classpath not found.

2011-05-13 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated HDFS-1938:


Attachment: HDFS-1938.patch

Added ivy-retrieve-hdfs to the test-system classpath.

>  Reference ivy-hdfs.classpath not found.
> 
>
> Key: HDFS-1938
> URL: https://issues.apache.org/jira/browse/HDFS-1938
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Eric Yang
>Priority: Minor
> Attachments: HDFS-1938.patch
>
>
> {noformat}
> $ant test-system
> ...
> BUILD FAILED
> /export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:129: The following 
> error occurred while executing this line:
> /export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:183: The following 
> error occurred while executing this line:
> /export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:193: The following 
> error occurred while executing this line:
> /export/crawlspace/tsz/hdfs/h1/build.xml:449: Reference ivy-hdfs.classpath 
> not found.
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1938) Reference ivy-hdfs.classpath not found.

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-1938:
-

Hadoop Flags: [Reviewed]

+1

I verified that it fixes the problem.

>  Reference ivy-hdfs.classpath not found.
> 
>
> Key: HDFS-1938
> URL: https://issues.apache.org/jira/browse/HDFS-1938
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Eric Yang
>Priority: Minor
> Attachments: HDFS-1938.patch
>
>
> {noformat}
> $ant test-system
> ...
> BUILD FAILED
> /export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:129: The following 
> error occurred while executing this line:
> /export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:183: The following 
> error occurred while executing this line:
> /export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:193: The following 
> error occurred while executing this line:
> /export/crawlspace/tsz/hdfs/h1/build.xml:449: Reference ivy-hdfs.classpath 
> not found.
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-1938) Reference ivy-hdfs.classpath not found.

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE resolved HDFS-1938.
--

   Resolution: Fixed
Fix Version/s: 0.23.0

Skipping Hudson since it won't detect this.

I have committed this.  Thanks, Eric!

>  Reference ivy-hdfs.classpath not found.
> 
>
> Key: HDFS-1938
> URL: https://issues.apache.org/jira/browse/HDFS-1938
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.23.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Eric Yang
>Priority: Minor
> Fix For: 0.23.0
>
> Attachments: HDFS-1938.patch
>
>
> {noformat}
> $ant test-system
> ...
> BUILD FAILED
> /export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:129: The following 
> error occurred while executing this line:
> /export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:183: The following 
> error occurred while executing this line:
> /export/crawlspace/tsz/hdfs/h1/src/test/aop/build/aop.xml:193: The following 
> error occurred while executing this line:
> /export/crawlspace/tsz/hdfs/h1/build.xml:449: Reference ivy-hdfs.classpath 
> not found.
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-1939) ivy: test conf should not extend common conf

2011-05-13 Thread Tsz Wo (Nicholas), SZE (JIRA)
ivy: test conf should not extend common conf


 Key: HDFS-1939
 URL: https://issues.apache.org/jira/browse/HDFS-1939
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Eric Yang


Similar improvement as HADOOP-7289.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-1940) Datanode can have more than one copy of same block when a failed disk is coming back in datanode

2011-05-13 Thread Rajit (JIRA)
Datanode can have more than one copy of same block when a failed disk is coming 
back in datanode


 Key: HDFS-1940
 URL: https://issues.apache.org/jira/browse/HDFS-1940
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.20.204.0
Reporter: Rajit


There is a situation where one datanode can have more than one copy of the 
same block, when a disk fails and comes back after some time in a datanode. 
These duplicate blocks are not deleted even after the datanode and namenode 
restart.

This situation can only happen in a corner case: when, due to the disk 
failure, the data block is replicated to another disk of the same datanode.

To simulate this scenario I copied a data block and the associated .meta file 
from one disk to another disk of the same datanode, so the datanode has two 
copies of the same replica. Then I restarted the datanode and namenode. The 
extra data block and meta file are still not deleted from the datanode:

[hdfs@gsbl90192 rajsaha]$ ls -l `find 
/grid/{0,1,2,3}/hadoop/var/hdfs/data/current -name blk_*`
-rw-r--r-- 1 hdfs users 7814 May 13 21:05 
/grid/1/hadoop/var/hdfs/data/current/blk_1727421609840461376
-rw-r--r-- 1 hdfs users   71 May 13 21:05 
/grid/1/hadoop/var/hdfs/data/current/blk_1727421609840461376_579992.meta
-rw-r--r-- 1 hdfs users 7814 May 13 21:14 
/grid/3/hadoop/var/hdfs/data/current/blk_1727421609840461376
-rw-r--r-- 1 hdfs users   71 May 13 21:14 
/grid/3/hadoop/var/hdfs/data/current/blk_1727421609840461376_579992.meta
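
Purely as an illustration of the check one might run, a hedged sketch that 
groups block files from several volumes by name and flags blocks present on 
more than one volume; paths and behavior are hypothetical, not the eventual 
fix:
{code}
import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class DuplicateBlockScan {
  public static void main(String[] args) throws IOException {
    Map<String, List<Path>> byName = new HashMap<>();
    for (String dir : args) {  // e.g. /grid/0/hadoop/var/hdfs/data/current
      try (DirectoryStream<Path> files =
             Files.newDirectoryStream(Paths.get(dir), "blk_*")) {
        for (Path p : files) {
          byName.computeIfAbsent(p.getFileName().toString(),
              k -> new ArrayList<>()).add(p);
        }
      }
    }
    byName.forEach((name, paths) -> {
      if (paths.size() > 1) {
        System.out.println("duplicate replica " + name + " on " + paths);
      }
    });
  }
}
{code}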

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HDFS-1940) Datanode can have more than one copy of same block when a failed disk is coming back in datanode

2011-05-13 Thread Bharath Mundlapudi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharath Mundlapudi reassigned HDFS-1940:


Assignee: Bharath Mundlapudi

> Datanode can have more than one copy of same block when a failed disk is 
> coming back in datanode
> 
>
> Key: HDFS-1940
> URL: https://issues.apache.org/jira/browse/HDFS-1940
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 0.20.204.0
>Reporter: Rajit
>Assignee: Bharath Mundlapudi
>
> There is a situation where one datanode can have more than one copy of the 
> same block, when a disk fails and comes back after some time in a datanode. 
> These duplicate blocks are not deleted even after the datanode and namenode 
> restart.
> This situation can only happen in a corner case: when, due to the disk 
> failure, the data block is replicated to another disk of the same datanode.
> To simulate this scenario I copied a data block and the associated .meta file 
> from one disk to another disk of the same datanode, so the datanode has two 
> copies of the same replica. Then I restarted the datanode and namenode. The 
> extra data block and meta file are still not deleted from the datanode:
> [hdfs@gsbl90192 rajsaha]$ ls -l `find 
> /grid/{0,1,2,3}/hadoop/var/hdfs/data/current -name blk_*`
> -rw-r--r-- 1 hdfs users 7814 May 13 21:05 
> /grid/1/hadoop/var/hdfs/data/current/blk_1727421609840461376
> -rw-r--r-- 1 hdfs users   71 May 13 21:05 
> /grid/1/hadoop/var/hdfs/data/current/blk_1727421609840461376_579992.meta
> -rw-r--r-- 1 hdfs users 7814 May 13 21:14 
> /grid/3/hadoop/var/hdfs/data/current/blk_1727421609840461376
> -rw-r--r-- 1 hdfs users   71 May 13 21:14 
> /grid/3/hadoop/var/hdfs/data/current/blk_1727421609840461376_579992.meta

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

