[jira] Commented: (HDFS-905) Make changes to HDFS for the new UserGroupInformation APIs (HADOOP-6299)

2010-01-26 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805401#action_12805401
 ] 

Owen O'Malley commented on HDFS-905:


+1

> Make changes to HDFS for the new UserGroupInformation APIs (HADOOP-6299)
> 
>
> Key: HDFS-905
> URL: https://issues.apache.org/jira/browse/HDFS-905
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Devaraj Das
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: HDFS-905-mark3.patch, HDFS-905.patch, HDFS-905.patch
>
>
> This is about moving the HDFS code to use the new UserGroupInformation API as 
> described in HADOOP-6299.




[jira] Updated: (HDFS-899) Delegation Token Implementation

2010-01-26 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-899:
--

Attachment: HDFS-899-0_20.2.patch

Patch for hadoop-20 added.
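
For context, here is a minimal usage sketch of the client-side API this jira adds: issuing, renewing, and cancelling a delegation token. The method names follow the Hadoop security APIs but should be read as illustrative rather than as this patch's exact interface.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.token.Token;

public class DelegationTokenDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Ask the NameNode for a delegation token, naming the party
    // (e.g. the JobTracker) that is allowed to renew it.
    Token<?> token = fs.getDelegationToken("jobtracker");
    token.renew(conf);   // extend the token's lifetime
    token.cancel(conf);  // invalidate it once it is no longer needed
  }
}
{code}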

> Delegation Token Implementation
> ---
>
> Key: HDFS-899
> URL: https://issues.apache.org/jira/browse/HDFS-899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-899-0_20.2.patch, HDFS-899.1.patch, 
> HDFS-899.2.patch, HDFS-899.3.patch, HDFS-899.4.patch, HDFS-899.5.patch, 
> HDFS-899.6.patch, HDFS-899.7.patch
>
>
>   This jira tracks the implementation of delegation tokens and the 
> corresponding changes in the Namenode and DFS API to issue, renew, and cancel 
> delegation tokens.




[jira] Commented: (HDFS-927) DFSInputStream retries too many times for new block locations

2010-01-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805398#action_12805398
 ] 

Todd Lipcon commented on HDFS-927:
--

Can anyone explain this Hudson result? It says -1 core tests, but the Test 
results page shows no failures...

> DFSInputStream retries too many times for new block locations
> -
>
> Key: HDFS-927
> URL: https://issues.apache.org/jira/browse/HDFS-927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-927.txt
>
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
> to only go back to the NN max.block.acquires times, but in trunk it goes back 
> twice as many - the default is 3, but I am counting 7 calls to 
> getBlockLocations before an exception is thrown.




[jira] Commented: (HDFS-927) DFSInputStream retries too many times for new block locations

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805396#action_12805396
 ] 

Hadoop QA commented on HDFS-927:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431471/hdfs-927.txt
  against trunk revision 903381.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/106/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/106/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/106/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/106/console

This message is automatically generated.

> DFSInputStream retries too many times for new block locations
> -
>
> Key: HDFS-927
> URL: https://issues.apache.org/jira/browse/HDFS-927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-927.txt
>
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
> to only go back to the NN max.block.acquires times, but in trunk it goes back 
> twice as many - the default is 3, but I am counting 7 calls to 
> getBlockLocations before an exception is thrown.




[jira] Commented: (HDFS-455) Make NN and DN handle in a intuitive way comma-separated configuration strings

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805391#action_12805391
 ] 

Hadoop QA commented on HDFS-455:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431502/hdfs-455.txt
  against trunk revision 903547.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.  Please justify why no new tests are needed for this patch.  Also 
please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/209/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/209/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/209/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/209/console

This message is automatically generated.

> Make NN and DN handle in a intuitive way comma-separated configuration strings
> --
>
> Key: HDFS-455
> URL: https://issues.apache.org/jira/browse/HDFS-455
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, name-node
>Affects Versions: 0.20.1, 0.21.0
>Reporter: Michele (aka pirroh) Catasta
>Priority: Minor
> Attachments: HDFS-455.patch, hdfs-455.txt
>
>
> The following configuration causes problems:
> <property>
>   <name>dfs.data.dir</name>
>   <value>/mnt/hstore2/hdfs, /home/foo/dfs</value>
> </property>
> The problem is that the space after the comma causes the second storage 
> directory to be " /home/foo/dfs", i.e. a directory named " " (a single 
> space) containing a sub-dir named "home", created under the datanode's 
> default directory. This will typically cause the user's home partition to 
> fill, but will be very hard for the user to diagnose, since a directory 
> whose name is a whitespace character is easy to overlook.
> (ripped from HADOOP-2366)




[jira] Commented: (HDFS-922) Remove extra semicolon from HDFS-877 that really annoys Eclipse

2010-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805375#action_12805375
 ] 

Hudson commented on HDFS-922:
-

Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #208 (See 
[http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/208/])


> Remove extra semicolon from HDFS-877 that really annoys Eclipse
> ---
>
> Key: HDFS-922
> URL: https://issues.apache.org/jira/browse/HDFS-922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Jakob Homan
>Assignee: Jakob Homan
>Priority: Minor
> Attachments: HDFS-922.patch
>
>
> HDFS-877 introduced an extra semicolon on an empty line that Eclipse treats 
> as a syntax error and hence messes up its compilation.  




[jira] Commented: (HDFS-877) Client-driven block verification not functioning

2010-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805372#action_12805372
 ] 

Hudson commented on HDFS-877:
-

Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #208 (See 
[http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/208/])


> Client-driven block verification not functioning
> 
>
> Key: HDFS-877
> URL: https://issues.apache.org/jira/browse/HDFS-877
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, test
>Affects Versions: 0.20.1, 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.0
>
> Attachments: hdfs-877-branch20.txt, hdfs-877.txt, hdfs-877.txt, 
> hdfs-877.txt, hdfs-877.txt, hdfs-877.txt, hdfs-877.txt
>
>
> This is actually the reason for HDFS-734 (TestDatanodeBlockScanner timing 
> out). The issue is that DFSInputStream relies on readChunk being called one 
> last time at the end of the file in order to receive the 
> lastPacketInBlock=true packet from the DN. However, DFSInputStream.read 
> checks pos < getFileLength() before issuing the read. Thus gotEOS never 
> shifts to true and checksumOk() is never called.




[jira] Commented: (HDFS-630) In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.

2010-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805374#action_12805374
 ] 

Hudson commented on HDFS-630:
-

Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #208 (See 
[http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/208/])


> In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific 
> datanodes when locating the next block.
> ---
>
> Key: HDFS-630
> URL: https://issues.apache.org/jira/browse/HDFS-630
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client, name-node
>Affects Versions: 0.21.0
>Reporter: Ruyue Ma
>Assignee: Cosmin Lehene
> Fix For: 0.21.0, 0.22.0
>
> Attachments: 0001-Fix-HDFS-630-0.21-svn-1.patch, 
> 0001-Fix-HDFS-630-0.21-svn-2.patch, 0001-Fix-HDFS-630-0.21-svn.patch, 
> 0001-Fix-HDFS-630-for-0.21-and-trunk-unified.patch, 
> 0001-Fix-HDFS-630-for-0.21.patch, 0001-Fix-HDFS-630-svn.patch, 
> 0001-Fix-HDFS-630-svn.patch, 0001-Fix-HDFS-630-trunk-svn-1.patch, 
> 0001-Fix-HDFS-630-trunk-svn-2.patch, 0001-Fix-HDFS-630-trunk-svn-3.patch, 
> 0001-Fix-HDFS-630-trunk-svn-3.patch, 0001-Fix-HDFS-630-trunk-svn-4.patch, 
> hdfs-630-0.20.txt, HDFS-630.patch
>
>
> Created from HDFS-200.
> If, during a write, the dfsclient sees that a block replica location for a 
> newly allocated block is not connectable, it re-requests the NN for a fresh 
> set of replica locations for the block. It tries this 
> dfs.client.block.write.retries times (default 3), sleeping 6 seconds between 
> each retry (see DFSClient.nextBlockOutputStream).
> This setting works well on a reasonably sized cluster; if you have only a 
> few datanodes in the cluster, every retry may pick the same dead datanode 
> and the above logic bails out.
> Our solution: when getting block locations from the namenode, we give the NN 
> the excluded datanodes. The list of dead datanodes applies only to one block 
> allocation.
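
As a rough illustration of the mechanism described above, here is a hedged sketch of a client-side allocator that accumulates unreachable datanodes and hands them back to the namenode on each retry. The Namenode interface and all names are invented for the example; this is not the actual DFSClient or ClientProtocol code.

{code}
import java.util.ArrayList;
import java.util.List;

final class ExcludingBlockAllocator {
  interface Namenode {
    // Hypothetical stand-in for an addBlock RPC that accepts an exclusion list.
    String[] addBlock(String src, List<String> excludedNodes);
  }

  static String[] allocateBlock(Namenode nn, String src, int maxRetries) {
    // The exclusion list is scoped to this one block allocation.
    List<String> excluded = new ArrayList<String>();
    for (int retry = 0; retry < maxRetries; retry++) {
      String[] targets = nn.addBlock(src, excluded);
      String bad = firstUnreachable(targets);
      if (bad == null) {
        return targets;    // the whole pipeline is connectable
      }
      excluded.add(bad);   // never offer this node again for this block
    }
    throw new IllegalStateException("Unable to allocate a new block");
  }

  private static String firstUnreachable(String[] targets) {
    // Placeholder: a real client would try to open the write pipeline here.
    return null;
  }
}
{code}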




[jira] Commented: (HDFS-899) Delegation Token Implementation

2010-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805373#action_12805373
 ] 

Hudson commented on HDFS-899:
-

Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #208 (See 
[http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/208/])


> Delegation Token Implementation
> ---
>
> Key: HDFS-899
> URL: https://issues.apache.org/jira/browse/HDFS-899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-899.1.patch, HDFS-899.2.patch, HDFS-899.3.patch, 
> HDFS-899.4.patch, HDFS-899.5.patch, HDFS-899.6.patch, HDFS-899.7.patch
>
>
>   This jira tracks the implementation of delegation tokens and the 
> corresponding changes in the Namenode and DFS API to issue, renew, and cancel 
> delegation tokens.




[jira] Updated: (HDFS-844) Log the filename when file locking fails

2010-01-26 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HDFS-844:
---

   Resolution: Fixed
Fix Version/s: 0.22.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I've just committed this.

> Log the filename when file locking fails
> 
>
> Key: HDFS-844
> URL: https://issues.apache.org/jira/browse/HDFS-844
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Tom White
>Assignee: Tom White
> Fix For: 0.22.0
>
> Attachments: HDFS-844.patch, HDFS-844.patch
>
>
> When the storage lock cannot be acquired in StorageDirectory.tryLock(), the 
> error message does not show the storage volume that was at fault. The log 
> message is very generic:
> {code}
> common.Storage: java.io.IOException: Input/output error
> {code}
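
For context, a minimal sketch of the improvement, assuming a helper along the lines of StorageDirectory.tryLock(); the class and method here are illustrative, not the committed patch.

{code}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

final class StorageLock {
  // Include the lock file's path in every failure message so the
  // faulty storage volume can be identified from the log.
  static FileLock tryLock(File lockFile) throws IOException {
    RandomAccessFile file = new RandomAccessFile(lockFile, "rws");
    FileLock lock;
    try {
      lock = file.getChannel().tryLock();
    } catch (IOException e) {
      file.close();
      throw new IOException("Failed to lock storage " + lockFile, e);
    }
    if (lock == null) {
      file.close();
      throw new IOException("Cannot lock storage " + lockFile
          + ": the directory is already locked.");
    }
    return lock;
  }
}
{code}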




[jira] Commented: (HDFS-919) Create test to validate the BlocksVerified metric

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805369#action_12805369
 ] 

Hadoop QA commented on HDFS-919:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431382/HDFS-919.patch
  against trunk revision 903381.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/208/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/208/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/208/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/208/console

This message is automatically generated.

> Create test to validate the BlocksVerified metric
> -
>
> Key: HDFS-919
> URL: https://issues.apache.org/jira/browse/HDFS-919
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.20.2
>Reporter: gary murry
> Attachments: HDFS-919.patch, HDFS-919.patch, HDFS-919_0.20.patch, 
> HDFS-919_2.patch
>
>
> Just adding some tests to validate the BlocksVerified metric.




[jira] Commented: (HDFS-874) TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts

2010-01-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805363#action_12805363
 ] 

Todd Lipcon commented on HDFS-874:
--

The test that failed is a fault-injection hflush test, unrelated to this jira. 
I think this is ready for commit.

> TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts
> ---
>
> Key: HDFS-874
> URL: https://issues.apache.org/jira/browse/HDFS-874
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, test
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-874.txt
>
>
> On an internal build machine I see exceptions like this:
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://localhost:47262/data/1/scratch/patchqueue/patch-worker-20518/patch_21/svnrepo/build/test/data/test/test/testRenameWithQuota/srcdir,
>  expected: hdfs://localhost.localdomain:47262
> "hostname" and "hostname -f" both show the machine's FQDN (not localhost). 
> /etc/hosts is stock after CentOS 5 install. "host 127.0.0.1" reverses to 
> "localhost"




[jira] Commented: (HDFS-874) TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805352#action_12805352
 ] 

Hadoop QA commented on HDFS-874:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12429689/hdfs-874.txt
  against trunk revision 903381.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/105/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/105/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/105/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/105/console

This message is automatically generated.

> TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts
> ---
>
> Key: HDFS-874
> URL: https://issues.apache.org/jira/browse/HDFS-874
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, test
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-874.txt
>
>
> On an internal build machine I see exceptions like this:
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://localhost:47262/data/1/scratch/patchqueue/patch-worker-20518/patch_21/svnrepo/build/test/data/test/test/testRenameWithQuota/srcdir,
>  expected: hdfs://localhost.localdomain:47262
> "hostname" and "hostname -f" both show the machine's FQDN (not localhost). 
> /etc/hosts is stock after CentOS 5 install. "host 127.0.0.1" reverses to 
> "localhost"




[jira] Updated: (HDFS-905) Make changes to HDFS for the new UserGroupInformation APIs (HADOOP-6299)

2010-01-26 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-905:
-

Attachment: HDFS-905-mark3.patch

Attaching final patch.  Passes all tests.  
Modified test-patch to use new common jar:
{noformat} [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 76 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.{noformat}
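
For readers unfamiliar with HADOOP-6299, here is a minimal sketch of the new UserGroupInformation usage pattern that HDFS moves onto with this patch; it illustrates the API style and is not a snippet from the patch itself.

{code}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

public class UgiDemo {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();
    // Build a UGI for a named user and run filesystem calls as that user.
    UserGroupInformation ugi = UserGroupInformation.createRemoteUser("hdfsuser");
    FileSystem fs = ugi.doAs(new PrivilegedExceptionAction<FileSystem>() {
      public FileSystem run() throws Exception {
        return FileSystem.get(conf);
      }
    });
    System.out.println(fs.getUri());
  }
}
{code}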

> Make changes to HDFS for the new UserGroupInformation APIs (HADOOP-6299)
> 
>
> Key: HDFS-905
> URL: https://issues.apache.org/jira/browse/HDFS-905
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Devaraj Das
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: HDFS-905-mark3.patch, HDFS-905.patch, HDFS-905.patch
>
>
> This is about moving the HDFS code to use the new UserGroupInformation API as 
> described in HADOOP-6299.




[jira] Updated: (HDFS-905) Make changes to HDFS for the new UserGroupInformation APIs (HADOOP-6299)

2010-01-26 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-905:
-

Status: Open  (was: Patch Available)

> Make changes to HDFS for the new UserGroupInformation APIs (HADOOP-6299)
> 
>
> Key: HDFS-905
> URL: https://issues.apache.org/jira/browse/HDFS-905
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Devaraj Das
>Assignee: Jakob Homan
> Fix For: 0.22.0
>
> Attachments: HDFS-905-mark3.patch, HDFS-905.patch, HDFS-905.patch
>
>
> This is about moving the HDFS code to use the new UserGroupInformation API as 
> described in HADOOP-6299.




[jira] Updated: (HDFS-455) Make NN and DN handle in a intuitive way comma-separated configuration strings

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-455:
-

Attachment: hdfs-455.txt

The earlier patch did not apply to trunk. This one should.
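
For reference, a hedged sketch of the kind of fix involved: trim the whitespace around each entry when splitting a comma-separated directory list, so a value like "/mnt/hstore2/hdfs, /home/foo/dfs" no longer yields a directory name that starts with a space. The helper below is illustrative, not the attached patch.

{code}
import java.util.ArrayList;
import java.util.List;

final class DirListParser {
  // Split a comma-separated directory list, trimming surrounding
  // whitespace so " /home/foo/dfs" becomes "/home/foo/dfs".
  static List<String> parseDirs(String value) {
    List<String> dirs = new ArrayList<String>();
    if (value == null) {
      return dirs;
    }
    for (String dir : value.split(",")) {
      String trimmed = dir.trim();
      if (!trimmed.isEmpty()) {
        dirs.add(trimmed);
      }
    }
    return dirs;
  }
}
{code}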

> Make NN and DN handle in a intuitive way comma-separated configuration strings
> --
>
> Key: HDFS-455
> URL: https://issues.apache.org/jira/browse/HDFS-455
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, name-node
>Affects Versions: 0.20.1, 0.21.0
>Reporter: Michele (aka pirroh) Catasta
>Priority: Minor
> Attachments: HDFS-455.patch, hdfs-455.txt
>
>
> The following configuration causes problems:
> <property>
>   <name>dfs.data.dir</name>
>   <value>/mnt/hstore2/hdfs, /home/foo/dfs</value>
> </property>
> The problem is that the space after the comma causes the second storage 
> directory to be " /home/foo/dfs", i.e. a directory named " " (a single 
> space) containing a sub-dir named "home", created under the datanode's 
> default directory. This will typically cause the user's home partition to 
> fill, but will be very hard for the user to diagnose, since a directory 
> whose name is a whitespace character is easy to overlook.
> (ripped from HADOOP-2366)




[jira] Updated: (HDFS-455) Make NN and DN handle in a intuitive way comma-separated configuration strings

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-455:
-

Status: Patch Available  (was: Open)

> Make NN and DN handle in a intuitive way comma-separated configuration strings
> --
>
> Key: HDFS-455
> URL: https://issues.apache.org/jira/browse/HDFS-455
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, name-node
>Affects Versions: 0.20.1, 0.21.0
>Reporter: Michele (aka pirroh) Catasta
>Priority: Minor
> Attachments: HDFS-455.patch, hdfs-455.txt
>
>
> The following configuration causes problems:
> <property>
>   <name>dfs.data.dir</name>
>   <value>/mnt/hstore2/hdfs, /home/foo/dfs</value>
> </property>
> The problem is that the space after the comma causes the second storage 
> directory to be " /home/foo/dfs", i.e. a directory named " " (a single 
> space) containing a sub-dir named "home", created under the datanode's 
> default directory. This will typically cause the user's home partition to 
> fill, but will be very hard for the user to diagnose, since a directory 
> whose name is a whitespace character is easy to overlook.
> (ripped from HADOOP-2366)




[jira] Updated: (HDFS-455) Make NN and DN handle in a intuitive way comma-separated configuration strings

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-455:
-

Status: Open  (was: Patch Available)

> Make NN and DN handle in a intuitive way comma-separated configuration strings
> --
>
> Key: HDFS-455
> URL: https://issues.apache.org/jira/browse/HDFS-455
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, name-node
>Affects Versions: 0.20.1, 0.21.0
>Reporter: Michele (aka pirroh) Catasta
>Priority: Minor
> Attachments: HDFS-455.patch, hdfs-455.txt
>
>
> The following configuration causes problems:
> <property>
>   <name>dfs.data.dir</name>
>   <value>/mnt/hstore2/hdfs, /home/foo/dfs</value>
> </property>
> The problem is that the space after the comma causes the second storage 
> directory to be " /home/foo/dfs", i.e. a directory named " " (a single 
> space) containing a sub-dir named "home", created under the datanode's 
> default directory. This will typically cause the user's home partition to 
> fill, but will be very hard for the user to diagnose, since a directory 
> whose name is a whitespace character is easy to overlook.
> (ripped from HADOOP-2366)




[jira] Commented: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-01-26 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805333#action_12805333
 ] 

Ruyue Ma commented on HDFS-923:
---

Firstly, to resolve this problem, we should decide whether the current 
hdfsRead API semantics are correct.

I support the following:

If the length returned by hdfsRead is not equal to the buffer length, the 
caller can be sure that the file is at EOF.

Alternatively, we could provide another API: hdfsReadFully().

Any suggestions?
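
To make the proposed read-fully semantics concrete, here is a minimal Java sketch of the loop such an API would hide from callers; the helper name is hypothetical and it is written against a generic java.io.InputStream rather than libhdfs.

{code}
import java.io.IOException;
import java.io.InputStream;

final class ReadFully {
  // Keep reading until the buffer is full or EOF is reached. A single
  // read() may legitimately return fewer bytes than requested without
  // the stream being at EOF, which is why terminating a loop on
  // "bytesRead == bufferSize" alone can stop too early.
  static int readFully(InputStream in, byte[] buf) throws IOException {
    int total = 0;
    while (total < buf.length) {
      int n = in.read(buf, total, buf.length - total);
      if (n < 0) {
        break; // EOF
      }
      total += n;
    }
    return total; // less than buf.length only at EOF
  }
}
{code}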

> libhdfs hdfs_read example uses hdfsRead wrongly
> ---
>
> Key: HDFS-923
> URL: https://issues.apache.org/jira/browse/HDFS-923
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: contrib/libhdfs
>Affects Versions: 0.20.1
>Reporter: Ruyue Ma
>Assignee: Ruyue Ma
> Fix For: 0.21.0
>
>
> In the examples of libhdfs, hdfs_read.c uses hdfsRead wrongly. 
> {noformat}
> // read from the file
> tSize curSize = bufferSize;
> for (; curSize == bufferSize;) {
>     curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
> }
> {noformat} 
> The condition curSize == bufferSize is problematic: a read may return fewer 
> bytes than requested before the end of the file, so the loop can stop early.




[jira] Commented: (HDFS-928) Ability to provide custom DatanodeProtocol implementation

2010-01-26 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805330#action_12805330
 ] 

Konstantin Boudnik commented on HDFS-928:
-

I would like to hear your argument about how such a thing would help with 
testing, if possible.

> Ability to provide custom DatanodeProtocol implementation
> -
>
> Key: HDFS-928
> URL: https://issues.apache.org/jira/browse/HDFS-928
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: data-node
>Reporter: Zlatin Balevsky
>Priority: Trivial
>
> This should make testing easier as well as allow users to provide their own 
> RPC/namenode implementations.  It's pretty straightforward:
> 1. add 
> interface DatanodeProtocolProvider {
>   DatanodeProtocol getNameNode(Configuration conf);
> }
> 2. add a config setting like "dfs.datanode.protocol.impl"
> 3. create a default implementation and copy/paste the RPC initialization code 
> there




[jira] Updated: (HDFS-927) DFSInputStream retries too many times for new block locations

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-927:
-

Status: Patch Available  (was: Open)

Argh, same flaky NoClassDefFound Hudson junk. Resubmitting.

> DFSInputStream retries too many times for new block locations
> -
>
> Key: HDFS-927
> URL: https://issues.apache.org/jira/browse/HDFS-927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-927.txt
>
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
> to only go back to the NN max.block.acquires times, but in trunk it goes back 
> twice as many - the default is 3, but I am counting 7 calls to 
> getBlockLocations before an exception is thrown.




[jira] Updated: (HDFS-927) DFSInputStream retries too many times for new block locations

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-927:
-

Status: Open  (was: Patch Available)

> DFSInputStream retries too many times for new block locations
> -
>
> Key: HDFS-927
> URL: https://issues.apache.org/jira/browse/HDFS-927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-927.txt
>
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
> to only go back to the NN max.block.acquires times, but in trunk it goes back 
> twice as many - the default is 3, but I am counting 7 calls to 
> getBlockLocations before an exception is thrown.




[jira] Commented: (HDFS-839) The NameNode should forward block reports to BackupNode

2010-01-26 Thread Wang Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805318#action_12805318
 ] 

Wang Xu commented on HDFS-839:
--

I agree with Todd and Eli. 

Even manual fail-over requires a suite of operational interfaces, which could 
also be invoked by external, mature HA tools.

As a trade-off between consistency and performance, the block-related info 
could be forwarded to the BN in non-blocking ways:
* from the DN, or even the client, directly to the BN; or
* forwarded from the NN to the BN asynchronously.

And I think the choice between the above two, together with forwarding the 
info in the EditStream, is the discussion in this issue.

> The NameNode should forward block reports to BackupNode
> ---
>
> Key: HDFS-839
> URL: https://issues.apache.org/jira/browse/HDFS-839
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> The BackupNode (via HADOOP-4539) receives a stream of transactions from 
> NameNode. However, the BackupNode does not have block locations of blocks. It 
> would be nice if the NameNode can forward all block reports (that it receives 
> from DataNodes) to the BackupNode.




[jira] Commented: (HDFS-919) Create test to validate the BlocksVerified metric

2010-01-26 Thread gary murry (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805315#action_12805315
 ] 

gary murry commented on HDFS-919:
-

These results look to be unrelated to my patch.  They reference two other 
patches.  I am going to resubmit to see if that clears up the issue.

> Create test to validate the BlocksVerified metric
> -
>
> Key: HDFS-919
> URL: https://issues.apache.org/jira/browse/HDFS-919
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.20.2
>Reporter: gary murry
> Attachments: HDFS-919.patch, HDFS-919.patch, HDFS-919_0.20.patch, 
> HDFS-919_2.patch
>
>
> Just adding some tests to validate the BlocksVerified metric.




[jira] Updated: (HDFS-919) Create test to validate the BlocksVerified metric

2010-01-26 Thread gary murry (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gary murry updated HDFS-919:


Status: Open  (was: Patch Available)

> Create test to validate the BlocksVerified metric
> -
>
> Key: HDFS-919
> URL: https://issues.apache.org/jira/browse/HDFS-919
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.20.2
>Reporter: gary murry
> Attachments: HDFS-919.patch, HDFS-919.patch, HDFS-919_0.20.patch, 
> HDFS-919_2.patch
>
>
> Just adding some tests to validate the BlocksVerified metric.




[jira] Updated: (HDFS-919) Create test to validate the BlocksVerified metric

2010-01-26 Thread gary murry (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gary murry updated HDFS-919:


Status: Patch Available  (was: Open)

> Create test to validate the BlocksVerified metric
> -
>
> Key: HDFS-919
> URL: https://issues.apache.org/jira/browse/HDFS-919
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.20.2
>Reporter: gary murry
> Attachments: HDFS-919.patch, HDFS-919.patch, HDFS-919_0.20.patch, 
> HDFS-919_2.patch
>
>
> Just adding some tests to validate the BlocksVerified metric.




[jira] Commented: (HDFS-607) HDFS should support SNMP

2010-01-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805306#action_12805306
 ] 

Allen Wittenauer commented on HDFS-607:
---

We're basically 'getting by' with ganglia and have the more critical stuff 
pushed into zenoss now.

> HDFS should support SNMP
> 
>
> Key: HDFS-607
> URL: https://issues.apache.org/jira/browse/HDFS-607
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Allen Wittenauer
>
> HDFS should provide key statistics over a standard protocol such as SNMP.  
> This would allow for much easier integration into common software packages 
> that are already established in the industry.




[jira] Commented: (HDFS-927) DFSInputStream retries too many times for new block locations

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805298#action_12805298
 ] 

Hadoop QA commented on HDFS-927:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431471/hdfs-927.txt
  against trunk revision 903381.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/207/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/207/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/207/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/207/console

This message is automatically generated.

> DFSInputStream retries too many times for new block locations
> -
>
> Key: HDFS-927
> URL: https://issues.apache.org/jira/browse/HDFS-927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-927.txt
>
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
> to only go back to the NN max.block.acquires times, but in trunk it goes back 
> twice as many - the default is 3, but I am counting 7 calls to 
> getBlockLocations before an exception is thrown.




[jira] Commented: (HDFS-919) Create test to validate the BlocksVerified metric

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805296#action_12805296
 ] 

Hadoop QA commented on HDFS-919:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431382/HDFS-919.patch
  against trunk revision 903381.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/104/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/104/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/104/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/104/console

This message is automatically generated.

> Create test to validate the BlocksVerified metric
> -
>
> Key: HDFS-919
> URL: https://issues.apache.org/jira/browse/HDFS-919
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.20.2
>Reporter: gary murry
> Attachments: HDFS-919.patch, HDFS-919.patch, HDFS-919_0.20.patch, 
> HDFS-919_2.patch
>
>
> Just adding some tests to validate the BlocksVerified metric.




[jira] Commented: (HDFS-607) HDFS should support SNMP

2010-01-26 Thread Tomer Shiran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805295#action_12805295
 ] 

Tomer Shiran commented on HDFS-607:
---

I'm surprised we haven't heard more people ask for SNMP. Do you have this need 
at LinkedIn?

> HDFS should support SNMP
> 
>
> Key: HDFS-607
> URL: https://issues.apache.org/jira/browse/HDFS-607
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Allen Wittenauer
>
> HDFS should provide key statistics over a standard protocol such as SNMP.  
> This would allow for much easier integration into common software packages 
> that are already established in the industry.




[jira] Commented: (HDFS-928) Ability to provide custom DatanodeProtocol implementation

2010-01-26 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805292#action_12805292
 ] 

Eli Collins commented on HDFS-928:
--

Moving the DN to Avro will help here. 

> Ability to provide custom DatanodeProtocol implementation
> -
>
> Key: HDFS-928
> URL: https://issues.apache.org/jira/browse/HDFS-928
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: data-node
>Reporter: Zlatin Balevsky
>Priority: Trivial
>
> This should make testing easier as well as allow users to provide their own 
> RPC/namenode implementations.  It's pretty straightforward:
> 1. add 
> interface DatanodeProtocolProvider {
>   DatanodeProtocol getNameNode(Configuration conf);
> }
> 2. add a config setting like "dfs.datanode.protocol.impl"
> 3. create a default implementation and copy/paste the RPC initialization code 
> there




[jira] Created: (HDFS-928) Ability to provide custom DatanodeProtocol implementation

2010-01-26 Thread Zlatin Balevsky (JIRA)
Ability to provide custom DatanodeProtocol implementation
-

 Key: HDFS-928
 URL: https://issues.apache.org/jira/browse/HDFS-928
 Project: Hadoop HDFS
  Issue Type: Wish
  Components: data-node
Reporter: Zlatin Balevsky
Priority: Trivial


This should make testing easier as well as allow users to provide their own 
RPC/namenode implementations.  It's pretty straightforward:

1. add 
interface DatanodeProtocolProvider {
  DatanodeProtocol getNameNode(Configuration conf);
}

2. add a config setting like "dfs.datanode.protocol.impl"

3. create a default implementation and copy/paste the RPC initialization code 
there
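
A hedged sketch of how such a provider could be wired up with the standard Hadoop reflection utilities; the provider classes and the "dfs.datanode.protocol.impl" key follow the proposal above and are not an existing HDFS API.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol;
import org.apache.hadoop.util.ReflectionUtils;

interface DatanodeProtocolProvider {
  DatanodeProtocol getNameNode(Configuration conf);
}

// Stand-in for "a default implementation" holding today's RPC setup code.
class DefaultDatanodeProtocolProvider implements DatanodeProtocolProvider {
  public DatanodeProtocol getNameNode(Configuration conf) {
    throw new UnsupportedOperationException("RPC proxy creation goes here");
  }
}

final class DatanodeProtocolProviders {
  static DatanodeProtocol createNameNodeProxy(Configuration conf) {
    // Resolve the provider class from the config, defaulting to plain RPC.
    Class<? extends DatanodeProtocolProvider> clazz = conf.getClass(
        "dfs.datanode.protocol.impl",
        DefaultDatanodeProtocolProvider.class,
        DatanodeProtocolProvider.class);
    return ReflectionUtils.newInstance(clazz, conf).getNameNode(conf);
  }
}
{code}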




[jira] Commented: (HDFS-872) DFSClient 0.20.1 is incompatible with HDFS 0.20.2

2010-01-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805268#action_12805268
 ] 

Todd Lipcon commented on HDFS-872:
--

Sounds good. Thanks for the help reviewing.

> DFSClient 0.20.1 is incompatible with HDFS 0.20.2
> -
>
> Key: HDFS-872
> URL: https://issues.apache.org/jira/browse/HDFS-872
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client
>Affects Versions: 0.20.1, 0.20.2
>Reporter: Bassam Tabbara
>Assignee: Todd Lipcon
> Fix For: 0.20.2
>
> Attachments: hdfs-793-branch20.txt, hdfs-793-branch20.txt, 
> hdfs-872.txt
>
>
> After upgrading to the latest HDFS 0.20.2 (r896310 from 
> /branches/branch-0.20), old DFS clients (0.20.1) seem to not work anymore. 
> HBase uses the 0.20.1 hadoop core jars and the HBase master will no longer 
> start up. Here is the exception from the HBase master log:
> {code}
> 2010-01-06 09:59:46,762 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: 
> java.io.IOException: Could not obtain block: blk_338051
> 259657728_1002 file=/hbase/hbase.version
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1788)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1616)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1743)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1673)
> at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
> at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:189)
> at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:208)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:208)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1241)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1282)
> 2010-01-06 09:59:46,763 FATAL org.apache.hadoop.hbase.master.HMaster: Not 
> starting HMaster because:
> java.io.IOException: Could not obtain block: blk_338051259657728_1002 
> file=/hbase/hbase.version
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1788)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1616)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1743)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1673)
> at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
> at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:189)
> at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:208)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:208)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1241)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1282)
> {code}
> If I switch the hadoop jars in the hbase/lib directory to the 0.20.2 version, 
> it works well, which is what led me to open this bug here and not in the 
> HBASE project.




[jira] Resolved: (HDFS-872) DFSClient 0.20.1 is incompatible with HDFS 0.20.2

2010-01-26 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang resolved HDFS-872.


  Resolution: Fixed
Hadoop Flags: [Reviewed]

I am resolving this for now. If we ever want to port HDFS-101 to 0.20, let's 
reopen HDFS-101 or create a new jira.

> DFSClient 0.20.1 is incompatible with HDFS 0.20.2
> -
>
> Key: HDFS-872
> URL: https://issues.apache.org/jira/browse/HDFS-872
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, hdfs client
>Affects Versions: 0.20.1, 0.20.2
>Reporter: Bassam Tabbara
>Assignee: Todd Lipcon
> Fix For: 0.20.2
>
> Attachments: hdfs-793-branch20.txt, hdfs-793-branch20.txt, 
> hdfs-872.txt
>
>
> After upgrading to the latest HDFS 0.20.2 (r896310 from 
> /branches/branch-0.20), old DFS clients (0.20.1) seem to not work anymore. 
> HBase uses the 0.20.1 hadoop core jars and the HBase master will no longer 
> start up. Here is the exception from the HBase master log:
> {code}
> 2010-01-06 09:59:46,762 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: 
> java.io.IOException: Could not obtain block: blk_338051
> 259657728_1002 file=/hbase/hbase.version
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1788)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1616)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1743)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1673)
> at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
> at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:189)
> at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:208)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:208)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1241)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1282)
> 2010-01-06 09:59:46,763 FATAL org.apache.hadoop.hbase.master.HMaster: Not 
> starting HMaster because:
> java.io.IOException: Could not obtain block: blk_338051259657728_1002 
> file=/hbase/hbase.version
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1788)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1616)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1743)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1673)
> at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
> at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:189)
> at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:208)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:208)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at org.apache.hadoop.hbase.master.HMaster.doMain(HMaster.java:1241)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1282)
> {code}
> If I switch the hadoop jars in the hbase/lib directory to the 0.20.2 version, 
> it works well, which is what led me to open this bug here and not in the 
> HBASE project.




[jira] Updated: (HDFS-874) TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-874:
-

Status: Patch Available  (was: Open)

> TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts
> ---
>
> Key: HDFS-874
> URL: https://issues.apache.org/jira/browse/HDFS-874
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, test
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-874.txt
>
>
> On an internal build machine I see exceptions like this:
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://localhost:47262/data/1/scratch/patchqueue/patch-worker-20518/patch_21/svnrepo/build/test/data/test/test/testRenameWithQuota/srcdir,
>  expected: hdfs://localhost.localdomain:47262
> "hostname" and "hostname -f" both show the machine's FQDN (not localhost). 
> /etc/hosts is stock after CentOS 5 install. "host 127.0.0.1" reverses to 
> "localhost"




[jira] Updated: (HDFS-874) TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-874:
-

Status: Open  (was: Patch Available)

> TestHDFSFileContextMainOperations fails on weirdly configured DNS hosts
> ---
>
> Key: HDFS-874
> URL: https://issues.apache.org/jira/browse/HDFS-874
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, test
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-874.txt
>
>
> On an internal build machine I see exceptions like this:
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://localhost:47262/data/1/scratch/patchqueue/patch-worker-20518/patch_21/svnrepo/build/test/data/test/test/testRenameWithQuota/srcdir,
>  expected: hdfs://localhost.localdomain:47262
> "hostname" and "hostname -f" both show the machine's FQDN (not localhost). 
> /etc/hosts is stock after CentOS 5 install. "host 127.0.0.1" reverses to 
> "localhost"




[jira] Updated: (HDFS-919) Create test to validate the BlocksVerified metric

2010-01-26 Thread gary murry (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gary murry updated HDFS-919:


Status: Patch Available  (was: Open)

> Create test to validate the BlocksVerified metric
> -
>
> Key: HDFS-919
> URL: https://issues.apache.org/jira/browse/HDFS-919
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: test
>Affects Versions: 0.20.2
>Reporter: gary murry
> Attachments: HDFS-919.patch, HDFS-919.patch, HDFS-919_0.20.patch, 
> HDFS-919_2.patch
>
>
> Just adding some tests to validate the BlocksVerified metric.




[jira] Commented: (HDFS-839) The NameNode should forward block reports to BackupNode

2010-01-26 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805230#action_12805230
 ] 

Eli Collins commented on HDFS-839:
--

I think we should keep whether the failover is manual or automatic largely 
orthogonal to HDFS, i.e. we need to provide the necessary interfaces to allow 
external software to do either manual or automatic failover, and rely on that 
software to drive the failover. In other words, let's leverage existing software 
where we can.

My understanding is that this jira is about enabling a faster failover (either 
manual or automatic) from the NN to the BN by actively syncing the necessary 
state between the two. It seems like the next step is to identify the set of 
jiras we need beyond forwarding the block report: e.g. have the BN maintain an 
up-to-date edits log, reconstruct leases, etc. Reasonable?

> The NameNode should forward block reports to BackupNode
> ---
>
> Key: HDFS-839
> URL: https://issues.apache.org/jira/browse/HDFS-839
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> The BackupNode (via HADOOP-4539) receives a stream of transactions from 
> NameNode. However, the BackupNode does not have block locations of blocks. It 
> would be nice if the NameNode can forward all block reports (that it receives 
> from DataNodes) to the BackupNode.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-927) DFSInputStream retries too many times for new block locations

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-927:
-

Status: Patch Available  (was: Open)

> DFSInputStream retries too many times for new block locations
> -
>
> Key: HDFS-927
> URL: https://issues.apache.org/jira/browse/HDFS-927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-927.txt
>
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
> to only go back to the NN max.block.acquires times, but in trunk it goes back 
> twice as many - the default is 3, but I am counting 7 calls to 
> getBlockLocations before an exception is thrown.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-927) DFSInputStream retries too many times for new block locations

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-927:
-

Attachment: hdfs-927.txt

The crux of this issue is that the original HDFS-127 patch was bad. I'm not 
sure why it caused an infinite loop on 0.20 but not on later branches, but 
either way it doesn't do what it was supposed to.

This patch adds test cases to check infinite loop behavior and also to verify 
that the correct number of retries are taken. I also took the approach I 
outlined at 
https://issues.apache.org/jira/browse/HDFS-127?focusedCommentId=12803077&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12803077
 to fix HDFS-127's retry logic.
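
For reference, a hedged sketch of the accounting the fix aims for (names and the 
helper shape are illustrative, not the patch itself): exactly one failure is 
charged per NameNode round-trip that yields no usable locations, so a default of 
3 means at most 3 calls to getBlockLocations before the client gives up.

{noformat}
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;

// Sketch: askNameNode stands in for one getBlockLocations RPC.
public class BoundedRetry {
  static <T> T fetchWithRetries(Callable<List<T>> askNameNode,
                                int maxBlockAcquireFailures)
      throws IOException {
    int failures = 0;
    while (true) {
      List<T> locations;
      try {
        locations = askNameNode.call();      // one NameNode round-trip
      } catch (Exception e) {
        throw new IOException("RPC failed", e);
      }
      if (locations != null && !locations.isEmpty()) {
        return locations.get(0);             // success: stop counting
      }
      if (++failures >= maxBlockAcquireFailures) {
        throw new IOException("Could not obtain block after "
            + failures + " attempts");
      }
      try { Thread.sleep(3000); }            // back off, then re-ask the NN
      catch (InterruptedException ie) { /* fall through and retry */ }
    }
  }
}
{noformat}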

> DFSInputStream retries too many times for new block locations
> -
>
> Key: HDFS-927
> URL: https://issues.apache.org/jira/browse/HDFS-927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-927.txt
>
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
> to only go back to the NN max.block.acquires times, but in trunk it goes back 
> twice as many - the default is 3, but I am counting 7 calls to 
> getBlockLocations before an exception is thrown.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-927) DFSInputStream retries too many times for new block locations

2010-01-26 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-927:
-

Description: I think this is a regression caused by HDFS-127 -- 
DFSInputStream is supposed to only go back to the NN max.block.acquires times, 
but in trunk it goes back twice as many - the default is 3, but I am counting 7 
calls to getBlockLocations before an exception is thrown.  (was: I think this 
is a regression caused by HDFS-127 -- DFSInputStream is supposed to only go 
back to the NN max.block.acquires times, but in trunk it goes back twice as 
many - the default is 6, but I am counting 7 calls to getBlockLocations before 
an exception is thrown.)

> DFSInputStream retries too many times for new block locations
> -
>
> Key: HDFS-927
> URL: https://issues.apache.org/jira/browse/HDFS-927
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
>
> I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
> to only go back to the NN max.block.acquires times, but in trunk it goes back 
> twice as many - the default is 3, but I am counting 7 calls to 
> getBlockLocations before an exception is thrown.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-927) DFSInputStream retries too many times for new block locations

2010-01-26 Thread Todd Lipcon (JIRA)
DFSInputStream retries too many times for new block locations
-

 Key: HDFS-927
 URL: https://issues.apache.org/jira/browse/HDFS-927
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.21.0, 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical


I think this is a regression caused by HDFS-127 -- DFSInputStream is supposed 
to only go back to the NN max.block.acquires times, but in trunk it goes back 
twice as many - the default is 6, but I am counting 7 calls to 
getBlockLocations before an exception is thrown.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-899) Delegation Token Implementation

2010-01-26 Thread Boris Shkolnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805202#action_12805202
 ] 

Boris Shkolnik commented on HDFS-899:
-

committed, thanks Jitendra.

> Delegation Token Implementation
> ---
>
> Key: HDFS-899
> URL: https://issues.apache.org/jira/browse/HDFS-899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-899.1.patch, HDFS-899.2.patch, HDFS-899.3.patch, 
> HDFS-899.4.patch, HDFS-899.5.patch, HDFS-899.6.patch, HDFS-899.7.patch
>
>
>   This jira tracks implementation of delegation token and corresponding 
> changes in Namenode and DFS Api to issue, renew and cancel delegation tokens.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-839) The NameNode should forward block reports to BackupNode

2010-01-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805185#action_12805185
 ] 

Todd Lipcon commented on HDFS-839:
--

bq. Does it make sense?

Absolutely. In particular, I think the _automatic_ standby modes should be 
punted to external tools for the initial implementation. There are a lot of 
good tools for this, and as long as the manual failover modes are scriptable 
and reliable (read: automatically tested) we should feel comfortable using 
LinuxHA, ZK, or any other failure detector to trigger the failover.

As I understand it, we have had "cold HA" for quite some time already. The 
BackupNode in 0.21 adds "warm standby". This JIRA is starting a discussion 
about "hot standby".

Is there a temperature in between warm and hot where the standby does not have 
block reports, but when a failover occurs, the DNs are instructed to report 
immediately? Thus the failover would not have to wait for an entire 
block-report interval to be operational. (A toy sketch of this idea follows 
below.)
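
Purely as a sketch of that in-between idea (every name here is made up; this is 
not the real DatanodeProtocol): after a failover the new primary flags each 
DataNode exactly once, and the heartbeat handling tells it to send its block 
report now rather than waiting out the normal block-report interval.

{noformat}
import java.util.HashSet;
import java.util.Set;

// Toy sketch only: tracks which DataNodes have reported since failover.
public class FailoverReportTrigger {
  private final Set<String> reportedSinceFailover = new HashSet<String>();
  private volatile boolean justBecamePrimary = false;

  public synchronized void onFailover() {
    reportedSinceFailover.clear();
    justBecamePrimary = true;
  }

  /** Called for every heartbeat; true means "send a block report now". */
  public synchronized boolean shouldReportNow(String datanodeId) {
    return justBecamePrimary && reportedSinceFailover.add(datanodeId);
  }
}
{noformat}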

> The NameNode should forward block reports to BackupNode
> ---
>
> Key: HDFS-839
> URL: https://issues.apache.org/jira/browse/HDFS-839
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> The BackupNode (via HADOOP-4539) receives a stream of transactions from 
> NameNode. However, the BackupNode does not have block locations of blocks. It 
> would be nice if the NameNode can forward all block reports (that it receives 
> from DataNodes) to the BackupNode.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-839) The NameNode should forward block reports to BackupNode

2010-01-26 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805180#action_12805180
 ] 

Konstantin Shvachko commented on HDFS-839:
--

Wang> the information which BN has, decides the latency of fail-over.

Correct. And this is the difference between warm standby and hot standby. By 
_warm_ standby I mean a BN which has only namespace information, without block 
locations. So in order to start up it will wait for block reports from 
data-nodes that are switching to the BN as the new primary. Waiting for all 
block reports can take time, which depends on the cluster size (it could be 
10-20 minutes on large clusters).
But this is still faster than an NN _cold_ startup, when it needs to read the 
image and edits before processing the block reports.
With _hot_ standby you will need the BN to have all the information the primary 
NN has.

Following yesterday's discussion with folks from Facebook and Y! I want to 
clarify the classification of HA solutions in my previous comment.
# _Manual or automatic cold HA_. Start a new NN from scratch when the old one 
dies. Does not need a BN. Does not need changes to HDFS code if external tools 
are used for failure detection and restart. Suresh experimented with Linux HA, 
and Cloudera has a write-up on that.
# _Manual warm standby_. An admin command is issued to switch to the BN. The BN 
does not have block locations - only the namespace.
# _Automatic warm standby_. Cluster components automatically switch to the BN 
when the primary NN dies. The BN does not have block locations.
# _Manual hot standby_. An admin command is issued to switch to the BN. The BN 
has maximum information to take over.
# _Automatic hot standby_. The cluster automatically switches to the BN. The BN 
has maximum information to take over.

I am arguing that we should not jump straight to automatic hot standby because 
it is a hard problem, but rather do it step by step, starting from manual warm 
standby. I am sure there will be a substantial amount of work at each step. 
Lease recovery (as Todd proposes) and other requirements can be additional 
constraints defining how warm or how hot we want the standby node to be.
Does it make sense?

> The NameNode should forward block reports to BackupNode
> ---
>
> Key: HDFS-839
> URL: https://issues.apache.org/jira/browse/HDFS-839
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> The BackupNode (via HADOOP-4539) receives a stream of transactions from 
> NameNode. However, the BackupNode does not have block locations of blocks. It 
> would be nice if the NameNode can forward all block reports (that it receives 
> from DataNodes) to the BackupNode.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-877) Client-driven block verification not functioning

2010-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805176#action_12805176
 ] 

Hudson commented on HDFS-877:
-

Integrated in Hadoop-Hdfs-trunk-Commit #179 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/179/])
HDFS-922. Remove unnecessary semicolon added by HDFS-877 that causes
problems for Eclipse compilation.


> Client-driven block verification not functioning
> 
>
> Key: HDFS-877
> URL: https://issues.apache.org/jira/browse/HDFS-877
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, test
>Affects Versions: 0.20.1, 0.21.0, 0.22.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.22.0
>
> Attachments: hdfs-877-branch20.txt, hdfs-877.txt, hdfs-877.txt, 
> hdfs-877.txt, hdfs-877.txt, hdfs-877.txt, hdfs-877.txt
>
>
> This is actually the reason for HDFS-734 (TestDatanodeBlockScanner timing 
> out). The issue is that DFSInputStream relies on readChunk being called one 
> last time at the end of the file in order to receive the 
> lastPacketInBlock=true packet from the DN. However, DFSInputStream.read 
> checks pos < getFileLength() before issuing the read. Thus gotEOS never 
> shifts to true and checksumOk() is never called.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-899) Delegation Token Implementation

2010-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805177#action_12805177
 ] 

Hudson commented on HDFS-899:
-

Integrated in Hadoop-Hdfs-trunk-Commit #179 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/179/])
HDFS-899. Delegation Token Implementation


> Delegation Token Implementation
> ---
>
> Key: HDFS-899
> URL: https://issues.apache.org/jira/browse/HDFS-899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-899.1.patch, HDFS-899.2.patch, HDFS-899.3.patch, 
> HDFS-899.4.patch, HDFS-899.5.patch, HDFS-899.6.patch, HDFS-899.7.patch
>
>
>   This jira tracks implementation of delegation token and corresponding 
> changes in Namenode and DFS Api to issue, renew and cancel delegation tokens.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-922) Remove extra semicolon from HDFS-877 that really annoys Eclipse

2010-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805178#action_12805178
 ] 

Hudson commented on HDFS-922:
-

Integrated in Hadoop-Hdfs-trunk-Commit #179 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/179/])
HDFS-922. Remove unnecessary semicolon added by HDFS-877 that causes
problems for Eclipse compilation.


> Remove extra semicolon from HDFS-877 that really annoys Eclipse
> ---
>
> Key: HDFS-922
> URL: https://issues.apache.org/jira/browse/HDFS-922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Jakob Homan
>Assignee: Jakob Homan
>Priority: Minor
> Attachments: HDFS-922.patch
>
>
> HDFS-877 introduced an extra semicolon on an empty line that Eclipse treats 
> as a syntax error and hence messes up its compilation.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-165) NPE in datanode.handshake()

2010-01-26 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805162#action_12805162
 ] 

Steve Loughran commented on HDFS-165:
-

OK, if that's the case then it's only possible to recreate this condition with 
my Service lifecycle (in which case it should go into that patch), or if someone 
has subclassed the datanode and has somehow started a thread while the super() 
class is starting up. I will merge it into the lifecycle patch rather than 
splitting it out (as I have done here).

> NPE in datanode.handshake()
> ---
>
> Key: HDFS-165
> URL: https://issues.apache.org/jira/browse/HDFS-165
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-165.patch
>
>
> It appears possible to raise an NPE in DataNode.handshake() if the startup 
> protocol gets interrupted or fails in some manner

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-884) DataNode makeInstance should report the directory list when failing to start up

2010-01-26 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805158#action_12805158
 ] 

Steve Loughran commented on HDFS-884:
-

There are no tests here as this change only affects logged output, and we don't 
have a test setup that captures/analyses logs.

> DataNode makeInstance should report the directory list when failing to start 
> up
> ---
>
> Key: HDFS-884
> URL: https://issues.apache.org/jira/browse/HDFS-884
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-884.patch, HDFS-884.patch
>
>
> When {{Datanode.makeInstance()}} cannot work with one of the directories in 
> dfs.data.dir, it logs this at warn level (while losing the stack trace). 
> It should include the nested exception for better troubleshooting. Then, when 
> all dirs in the list fail, an exception is thrown, but this exception does 
> not include the list of directories. It should list the absolute path of 
> every missing/failing directory, so that whoever sees the exception can see 
> where to start looking for problems: either the filesystem or the 
> configuration. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-922) Remove extra semicolon from HDFS-877 that really annoys Eclipse

2010-01-26 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HDFS-922:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I've committed this.

> Remove extra semicolon from HDFS-877 that really annoys Eclipse
> ---
>
> Key: HDFS-922
> URL: https://issues.apache.org/jira/browse/HDFS-922
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Jakob Homan
>Assignee: Jakob Homan
>Priority: Minor
> Attachments: HDFS-922.patch
>
>
> HDFS-877 introduced an extra semicolon on an empty line that Eclipse treats 
> as a syntax error and hence messes up its compilation.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-165) NPE in datanode.handshake()

2010-01-26 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805145#action_12805145
 ] 

Konstantin Shvachko commented on HDFS-165:
--

Steve, I don't understand the scenario in which you get the NPE. 
{{shouldRun}} is set to {{false}} in {{DataNode.shutdown()}} and 
{{DataNode.handleDiskError()}} only. And {{handshake()}} is called during 
data-node startup, when it is still single-threaded. The only way I can see to 
get your NPE is to call {{shutdown()}} in a separate thread before the 
data-node has started. But in order to call {{shutdown()}} you need an instance 
of DataNode, and creating a DataNode instance is not possible without a 
successful {{handshake()}}.
Could you please clarify?

> NPE in datanode.handshake()
> ---
>
> Key: HDFS-165
> URL: https://issues.apache.org/jira/browse/HDFS-165
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-165.patch
>
>
> It appears possible to raise an NPE in DataNode.handshake() if the startup 
> protocol gets interrupted or fails in some manner

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-899) Delegation Token Implementation

2010-01-26 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-899:
--

Attachment: HDFS-899.7.patch

HDFS-899.7.patch is updated because of recent commits in the trunk.

> Delegation Token Implementation
> ---
>
> Key: HDFS-899
> URL: https://issues.apache.org/jira/browse/HDFS-899
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-899.1.patch, HDFS-899.2.patch, HDFS-899.3.patch, 
> HDFS-899.4.patch, HDFS-899.5.patch, HDFS-899.6.patch, HDFS-899.7.patch
>
>
>   This jira tracks implementation of delegation token and corresponding 
> changes in Namenode and DFS Api to issue, renew and cancel delegation tokens.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-926) BufferedDFSInputStream

2010-01-26 Thread Zlatin Balevsky (JIRA)
BufferedDFSInputStream
--

 Key: HDFS-926
 URL: https://issues.apache.org/jira/browse/HDFS-926
 Project: Hadoop HDFS
  Issue Type: Wish
  Components: hdfs client
Reporter: Zlatin Balevsky
Priority: Minor


Self-explanatory.  The buffer size can be provided as a number of blocks.  It 
could be implemented trivially with heap storage and several BlockReaders, or 
could have more advanced features like the following (a rough sketch of the 
simplest variant appears after the list):

* logic to ensure that blocks are not pulled from the same Datanode(s).
* local filesystem store for buffered blocks
* adaptive parallelism 
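
A toy sketch of the trivial variant, with made-up class names and a plain 
InputStream standing in for the block readers (a real BufferedDFSInputStream 
would hold several BlockReaders and spread them across distinct DataNodes):

{noformat}
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: one daemon thread reads ahead into a bounded queue of buffers.
public class PrefetchingInputStream extends InputStream {
  private final BlockingQueue<byte[]> buffers;
  private byte[] current;
  private int pos;
  private boolean eof;

  public PrefetchingInputStream(final InputStream in, int numBuffers,
                                final int bufSize) {
    this.buffers = new ArrayBlockingQueue<byte[]>(numBuffers);
    Thread prefetcher = new Thread(new Runnable() {
      public void run() {
        try {
          while (true) {
            byte[] buf = new byte[bufSize];
            int n = in.read(buf);
            if (n < 0) { buffers.put(new byte[0]); return; } // empty = EOF
            if (n < bufSize) {                // trim short reads to size
              byte[] exact = new byte[n];
              System.arraycopy(buf, 0, exact, 0, n);
              buf = exact;
            }
            buffers.put(buf);   // blocks when the read-ahead window is full
          }
        } catch (Exception e) { /* sketch: real code must propagate this */ }
      }
    });
    prefetcher.setDaemon(true);
    prefetcher.start();
  }

  @Override
  public int read() throws IOException {
    if (eof) return -1;
    while (current == null || pos == current.length) {
      try {
        current = buffers.take();
        pos = 0;
      } catch (InterruptedException e) {
        throw new IOException("interrupted while waiting for prefetch");
      }
      if (current.length == 0) { eof = true; return -1; }
    }
    return current[pos++] & 0xff;
  }
}
{noformat}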


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-165) NPE in datanode.handshake()

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805070#action_12805070
 ] 

Hadoop QA commented on HDFS-165:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12429761/HDFS-165.patch
  against trunk revision 903098.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/103/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/103/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/103/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/103/console

This message is automatically generated.

> NPE in datanode.handshake()
> ---
>
> Key: HDFS-165
> URL: https://issues.apache.org/jira/browse/HDFS-165
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-165.patch
>
>
> It appears possible to raise an NPE in DataNode.handshake() if the startup 
> protocol gets interrupted or fails in some manner

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-708) A stress-test tool for HDFS.

2010-01-26 Thread Wang Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805065#action_12805065
 ] 

Wang Xu commented on HDFS-708:
--

I have posted our design illustration here:
http://gnawux.info/hadoop/2010/01/a-simple-hdfs-performance-test-tool/

And I will post the code on Google Code or elsewhere tomorrow.

In our test program, the synchronizer is a server written in Python that 
accepts requests from the test programs running on the test nodes. Having 
received requests from all nodes, it lets them all start applying pressure 
simultaneously.

The test program is written in Java, and it starts several threads that write 
or read with DFSClient. Each pressure thread records the amount of data it has 
written in a variable; the main thread of the test program collects these 
periodically and writes them into an XML file.

By analyzing the XML output file, we can tell the read and write performance.

The test program supports read-only, write-only and read-write modes, and it 
can be set to read back files written by itself, or random files.
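
To make the writer side concrete, here is a hedged sketch in the same spirit 
(not the posted tool; class and path names are made up, and the XML output and 
synchronizer handshake are omitted): N threads write fixed-size chunks through 
the FileSystem API while per-thread counters are sampled by the main thread.

{noformat}
import java.io.OutputStream;
import java.net.URI;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: writes forever, so run it against a disposable test cluster.
public class StressWriter {
  public static void main(String[] args) throws Exception {
    final FileSystem fs =
        FileSystem.get(URI.create(args[0]), new Configuration());
    final int threads = Integer.parseInt(args[1]);
    final AtomicLong[] written = new AtomicLong[threads];
    for (int i = 0; i < threads; i++) {
      written[i] = new AtomicLong();
      final int id = i;
      new Thread(new Runnable() {
        public void run() {
          try {
            OutputStream out = fs.create(new Path("/stress/worker-" + id));
            byte[] chunk = new byte[64 * 1024];
            while (true) {                   // sustained write pressure
              out.write(chunk);
              written[id].addAndGet(chunk.length);
            }
          } catch (Exception e) { e.printStackTrace(); }
        }
      }).start();
    }
    while (true) {                           // periodic sampling
      Thread.sleep(10000);
      long total = 0;
      for (AtomicLong w : written) total += w.get();
      System.out.println("bytes written: " + total);
    }
  }
}
{noformat}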

> A stress-test tool for HDFS.
> 
>
> Key: HDFS-708
> URL: https://issues.apache.org/jira/browse/HDFS-708
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: test, tools
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
> Fix For: 0.22.0
>
>
> It would be good to have a tool for automatic stress testing HDFS, which 
> would provide IO-intensive load on HDFS cluster.
> The idea is to start the tool, let it run overnight, and then be able to 
> analyze possible failures.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-884) DataNode makeInstance should report the directory list when failing to start up

2010-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805060#action_12805060
 ] 

Hadoop QA commented on HDFS-884:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12431420/HDFS-884.patch
  against trunk revision 903098.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/206/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/206/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/206/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/206/console

This message is automatically generated.

> DataNode makeInstance should report the directory list when failing to start 
> up
> ---
>
> Key: HDFS-884
> URL: https://issues.apache.org/jira/browse/HDFS-884
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-884.patch, HDFS-884.patch
>
>
> When {{Datanode.makeInstance()}} cannot work with one of the directories in 
> dfs.data.dir, it logs this at warn level (while losing the stack trace). 
> It should include the nested exception for better troubleshooting. Then, when 
> all dirs in the list fail, an exception is thrown, but this exception does 
> not include the list of directories. It should list the absolute path of 
> every missing/failing directory, so that whoever sees the exception can see 
> where to start looking for problems: either the filesystem or the 
> configuration. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-630) In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.

2010-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805032#action_12805032
 ] 

Hudson commented on HDFS-630:
-

Integrated in Hadoop-Hdfs-trunk #212 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk/212/])
HDFS-630. In DFSOutputStream.nextBlockOutputStream(), the client can exclude
specific datanodes when locating the next block


> In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific 
> datanodes when locating the next block.
> ---
>
> Key: HDFS-630
> URL: https://issues.apache.org/jira/browse/HDFS-630
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client, name-node
>Affects Versions: 0.21.0
>Reporter: Ruyue Ma
>Assignee: Cosmin Lehene
> Fix For: 0.21.0, 0.22.0
>
> Attachments: 0001-Fix-HDFS-630-0.21-svn-1.patch, 
> 0001-Fix-HDFS-630-0.21-svn-2.patch, 0001-Fix-HDFS-630-0.21-svn.patch, 
> 0001-Fix-HDFS-630-for-0.21-and-trunk-unified.patch, 
> 0001-Fix-HDFS-630-for-0.21.patch, 0001-Fix-HDFS-630-svn.patch, 
> 0001-Fix-HDFS-630-svn.patch, 0001-Fix-HDFS-630-trunk-svn-1.patch, 
> 0001-Fix-HDFS-630-trunk-svn-2.patch, 0001-Fix-HDFS-630-trunk-svn-3.patch, 
> 0001-Fix-HDFS-630-trunk-svn-3.patch, 0001-Fix-HDFS-630-trunk-svn-4.patch, 
> hdfs-630-0.20.txt, HDFS-630.patch
>
>
> created from hdfs-200.
> If during a write the dfsclient sees that a block replica location for a 
> newly allocated block is not connectable, it re-requests the NN to get a 
> fresh set of replica locations for the block. It tries this 
> dfs.client.block.write.retries times (default 3), sleeping 6 seconds between 
> each retry (see DFSClient.nextBlockOutputStream).
> This setting works well when you have a reasonably sized cluster; if you have 
> few datanodes in the cluster, every retry may pick the dead datanode and 
> the above logic bails out.
> Our solution: when getting block locations from the namenode, we give the NN 
> the excluded datanodes. The list of dead datanodes is only for one block 
> allocation.
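
The shape of that client-side loop, as a hedged sketch (askNameNodeExcluding 
and tryConnect are stand-ins for the real ClientProtocol.addBlock call and the 
pipeline setup, not actual signatures):

{noformat}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch: failed nodes are remembered for one block allocation only and
// passed back to the NameNode so a retry cannot land on the same node.
public abstract class ExcludingBlockLocator<Node> {
  protected abstract List<Node> askNameNodeExcluding(List<Node> excluded)
      throws IOException;
  /** @return a node that refused the connection, or null on success. */
  protected abstract Node tryConnect(List<Node> targets);

  public List<Node> locateNewBlock(int retries) throws IOException {
    List<Node> excluded = new ArrayList<Node>(); // lives for ONE allocation
    for (int i = 0; i < retries; i++) {
      List<Node> targets = askNameNodeExcluding(excluded);
      Node bad = tryConnect(targets);
      if (bad == null) {
        return targets;        // pipeline established
      }
      excluded.add(bad);       // never offered again for this block
    }
    throw new IOException("Unable to create new block");
  }
}
{noformat}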

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Moved: (HDFS-925) Make it harder to accidentally close a shared DFSClient

2010-01-26 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran moved HADOOP-5933 to HDFS-925:
-

  Component/s: (was: fs)
   hdfs client
Fix Version/s: (was: 0.22.0)
   0.22.0
Affects Version/s: (was: 0.21.0)
   0.21.0
  Key: HDFS-925  (was: HADOOP-5933)
  Project: Hadoop HDFS  (was: Hadoop Common)

> Make it harder to accidentally close a shared DFSClient
> ---
>
> Key: HDFS-925
> URL: https://issues.apache.org/jira/browse/HDFS-925
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 0.21.0
>Reporter: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HADOOP-5933.patch, HADOOP-5933.patch
>
>
> Every so often I get stack traces telling me that DFSClient is closed, 
> usually in {{org.apache.hadoop.hdfs.DFSClient.checkOpen()}}. The root cause 
> of this is usually that one thread has closed a shared fsclient while another 
> thread still has a reference to it. If the other thread then asks for a new 
> client it will get one (and the cache is repopulated), but if it has one 
> already, then I get to see a stack trace. 
> It's effectively a race condition between clients in different threads. 
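
One way to make close() safer for a cached, shared client is reference 
counting. A hedged sketch, not the attached patch (class and method names are 
made up): each thread retains the client before use, and the underlying 
connection is torn down only when the last reference is released.

{noformat}
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: the creator holds the initial reference; each sharer retains.
public class SharedClient {
  private final AtomicInteger refs = new AtomicInteger(1);

  public SharedClient retain() {
    if (refs.getAndIncrement() <= 0) {
      throw new IllegalStateException("client already closed");
    }
    return this;
  }

  public void release() {
    if (refs.decrementAndGet() == 0) {
      reallyClose();   // tear down sockets, caches, etc.
    }
  }

  private void reallyClose() { /* ... */ }
}
{noformat}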

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-165) NPE in datanode.handshake()

2010-01-26 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-165:


Status: Open  (was: Patch Available)

> NPE in datanode.handshake()
> ---
>
> Key: HDFS-165
> URL: https://issues.apache.org/jira/browse/HDFS-165
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-165.patch
>
>
> It appears possible to raise an NPE in DataNode.handshake() if the startup 
> protocol gets interrupted or fails in some manner

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-883) Datanode shutdown should log problems with Storage.unlockAll()

2010-01-26 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805008#action_12805008
 ] 

Steve Loughran commented on HDFS-883:
-

As I said in the patch: no tests here, as all that is happening is logging of 
shutdown problems that are fairly hard to simulate, but which, when they 
happen, should at least be logged.
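
For illustration, the change amounts to something like the following inside 
DataNode.shutdown() (a sketch of the pattern, not the patch verbatim; storage 
and LOG are the existing DataNode fields):

{noformat}
try {
  this.storage.unlockAll();
} catch (IOException ie) {
  // previously swallowed silently; now surfaced for diagnosis
  LOG.warn("Exception when unlocking storage: " + ie, ie);
}
{noformat}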

> Datanode shutdown should log problems with Storage.unlockAll()
> --
>
> Key: HDFS-883
> URL: https://issues.apache.org/jira/browse/HDFS-883
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Attachments: HDFS-883.patch
>
>
> When shutting down, Datanode calls {{Storage.unlockAll()}}, but discards any 
> exceptions thrown in the process.
> These exceptions could be useful in diagnosing problems, and should be logged.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-165) NPE in datanode.handshake()

2010-01-26 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-165:


Status: Patch Available  (was: Open)

> NPE in datanode.handshake()
> ---
>
> Key: HDFS-165
> URL: https://issues.apache.org/jira/browse/HDFS-165
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-165.patch
>
>
> It appears possible to raise an NPE in DataNode.handshake() if the startup 
> protocol gets interrupted or fails in some manner

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-884) DataNode makeInstance should report the directory list when failing to start up

2010-01-26 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-884:


Status: Patch Available  (was: Open)

This has more detailed reporting of bad directories/URIs, but it does not raise 
any exception if the number of data dirs == 0.
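
A hedged sketch of the reporting pattern being described (not the patch 
verbatim; the class name and the isDirectory/canWrite stand-in for the real 
DiskChecker.checkDir call are illustrative): each bad directory is logged with 
its cause, and every failed absolute path is named in the final exception.

{noformat}
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;

public class DataDirChecker {
  static ArrayList<File> checkDirs(String[] dataDirs) throws IOException {
    ArrayList<File> good = new ArrayList<File>();
    StringBuilder failed = new StringBuilder();
    for (String name : dataDirs) {
      File dir = new File(name);
      // The real check (DiskChecker.checkDir) throws, so the WARN there
      // should also carry the nested exception's stack trace.
      if (dir.isDirectory() && dir.canWrite()) {
        good.add(dir);
      } else {
        System.err.println("WARN Invalid directory in dfs.data.dir: "
            + dir.getAbsolutePath());
        failed.append(' ').append(dir.getAbsolutePath());
      }
    }
    if (good.isEmpty()) {
      throw new IOException("All directories in dfs.data.dir are invalid:"
          + failed);
    }
    return good;
  }
}
{noformat}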

> DataNode makeInstance should report the directory list when failing to start 
> up
> ---
>
> Key: HDFS-884
> URL: https://issues.apache.org/jira/browse/HDFS-884
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-884.patch, HDFS-884.patch
>
>
> When {{Datanode.makeInstance()}} cannot work with one of the directories in 
> dfs.data.dir, it logs this at warn level (while losing the stack trace). 
> It should include the nested exception for better troubleshooting. Then, when 
> all dirs in the list fail, an exception is thrown, but this exception does 
> not include the list of directories. It should list the absolute path of 
> every missing/failing directory, so that whoever sees the exception can see 
> where to start looking for problems: either the filesystem or the 
> configuration. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-884) DataNode makeInstance should report the directory list when failing to start up

2010-01-26 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-884:


Attachment: HDFS-884.patch

Resync with head and URI list

> DataNode makeInstance should report the directory list when failing to start 
> up
> ---
>
> Key: HDFS-884
> URL: https://issues.apache.org/jira/browse/HDFS-884
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-884.patch, HDFS-884.patch
>
>
> When {{Datanode.makeInstance()}} cannot work with one of the directories in 
> dfs.data.dir, it logs this at warn level (while losing the stack trace). 
> It should include the nested exception for better troubleshooting. Then, when 
> all dirs in the list fail, an exception is thrown, but this exception does 
> not include the list of directories. It should list the absolute path of 
> every missing/failing directory, so that whoever sees the exception can see 
> where to start looking for problems: either the filesystem or the 
> configuration. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-884) DataNode makeInstance should report the directory list when failing to start up

2010-01-26 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-884:


Status: Open  (was: Patch Available)

The move to URIs breaks this patch.

> DataNode makeInstance should report the directory list when failing to start 
> up
> ---
>
> Key: HDFS-884
> URL: https://issues.apache.org/jira/browse/HDFS-884
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node
>Affects Versions: 0.22.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: HDFS-884.patch
>
>
> When {{Datanode.makeInstance()}} cannot work with one of the directories in 
> dfs.data.dir, it logs this at warn level (while losing the stack trace). 
> It should include the nested exception for better troubleshooting. Then, when 
> all dirs in the list fail, an exception is thrown, but this exception does 
> not include the list of directories. It should list the absolute path of 
> every missing/failing directory, so that whoever sees the exception can see 
> where to start looking for problems: either the filesystem or the 
> configuration. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-924) Support reading and writing sequencefile in libhdfs

2010-01-26 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804979#action_12804979
 ] 

Ruyue Ma commented on HDFS-924:
---

We have implemented it using JNI. 
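
For context, this is the Java-side reader such a JNI wrapper has to drive. A 
minimal read loop against an existing sequence file, using the standard 
SequenceFile.Reader API (the class name and the path-from-args are illustrative 
only):

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SeqFileDump {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Reader reader =
        new SequenceFile.Reader(fs, new Path(args[0]), conf);
    // Key/value classes are recorded in the file header.
    Writable key = (Writable)
        ReflectionUtils.newInstance(reader.getKeyClass(), conf);
    Writable value = (Writable)
        ReflectionUtils.newInstance(reader.getValueClass(), conf);
    while (reader.next(key, value)) {
      System.out.println(key + "\t" + value);
    }
    reader.close();
  }
}
{noformat}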

> Support reading and writing sequencefile in libhdfs 
> 
>
> Key: HDFS-924
> URL: https://issues.apache.org/jira/browse/HDFS-924
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Ruyue Ma
>Assignee: Ruyue Ma
>Priority: Minor
>
> Some use cases may need to read and write sequence files through libhdfs. 
> We should provide a read and write API for sequence files in libhdfs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-924) Support reading and writing sequencefile in libhdfs

2010-01-26 Thread Ruyue Ma (JIRA)
Support reading and writing sequencefile in libhdfs 


 Key: HDFS-924
 URL: https://issues.apache.org/jira/browse/HDFS-924
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Ruyue Ma
Assignee: Ruyue Ma
Priority: Minor


Some use cases may need to read and write sequence files through libhdfs. 

We should provide a read and write API for sequence files in libhdfs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-01-26 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804975#action_12804975
 ] 

Ruyue Ma commented on HDFS-923:
---

Our modification for hdfs.c is: 

{noformat}

tSize hdfsRead(hdfsFS fs, hdfsFile f, void* buffer, tSize length)
{
    // JAVA EQUIVALENT:
    //   byte [] bR = new byte[length];
    //   fis.read(bR);

    // Get the JNIEnv* corresponding to the current thread
    JNIEnv* env = getJNIEnv();
    if (env == NULL) {
        errno = EINTERNAL;
        return -1;
    }

    // Parameters
    jobject jInputStream = (jobject)(f ? f->file : NULL);

    jbyteArray jbRarray;
    jint noReadBytes = 0;
    jvalue jVal;
    jthrowable jExc = NULL;

    int hasReadBytes = 0;

    // Sanity check
    if (!f || f->type == UNINITIALIZED) {
        errno = EBADF;
        return -1;
    }

    // Error checking... make sure that this file is 'readable'
    if (f->type != INPUT) {
        fprintf(stderr, "Cannot read from a non-InputStream object!\n");
        errno = EINVAL;
        return -1;
    }

    // > OUR MODIFICATION: keep reading until the caller's buffer is full
    // or EOF, since a single InputStream.read() may return fewer bytes
    // than requested.
    jbRarray = (*env)->NewByteArray(env, length);
    while (hasReadBytes < length) {
        if (invokeMethod(env, &jVal, &jExc, INSTANCE, jInputStream,
                         HADOOP_ISTRM, "read", JMETHOD3("[B", "I", "I", "I"),
                         jbRarray, hasReadBytes, length - hasReadBytes) != 0) {
            errno = errnoFromException(jExc, env, "org.apache.hadoop.fs."
                                       "FSDataInputStream::read");
            // Fail fast rather than looping forever on a repeated exception.
            destroyLocalReference(env, jbRarray);
            return -1;
        }
        noReadBytes = jVal.i;
        if (noReadBytes < 0) {
            // This is a valid case: there aren't any bytes left to read!
            break;
        }
        // read() filled the Java array starting at offset hasReadBytes,
        // so copy out from that same offset (not from 0).
        (*env)->GetByteArrayRegion(env, jbRarray, hasReadBytes, noReadBytes,
                                   (jbyte*)buffer + hasReadBytes);
        hasReadBytes += noReadBytes;
        errno = 0;
    }
    // < OUR MODIFICATION

    destroyLocalReference(env, jbRarray);
    return hasReadBytes;
}
{noformat}
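
The substance of the change is the outer while loop: a single read on the Java 
side may legitimately return fewer bytes than requested, so hdfsRead keeps 
issuing reads until the caller's buffer is full or EOF is hit. That is what 
makes the curSize == bufferSize loop condition in the example above safe.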

> libhdfs hdfs_read example uses hdfsRead wrongly
> ---
>
> Key: HDFS-923
> URL: https://issues.apache.org/jira/browse/HDFS-923
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: contrib/libhdfs
>Affects Versions: 0.20.1
>Reporter: Ruyue Ma
>Assignee: Ruyue Ma
> Fix For: 0.21.0
>
>
> In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly. 
> {noformat}
> // read from the file
> tSize curSize = bufferSize;
> for (; curSize == bufferSize;) {
> curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
> }
> {noformat} 
> The condition curSize == bufferSize is problematic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-01-26 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804974#action_12804974
 ] 

Ruyue Ma commented on HDFS-923:
---

My opinion: 

The user API of hdfsRead is not good. We should support the following use case.

{noformat}
// read from the file
tSize curSize = bufferSize;
for (; curSize == bufferSize;) {
curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
}

{noformat}

> libhdfs hdfs_read example uses hdfsRead wrongly
> ---
>
> Key: HDFS-923
> URL: https://issues.apache.org/jira/browse/HDFS-923
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: contrib/libhdfs
>Affects Versions: 0.20.1
>Reporter: Ruyue Ma
>Assignee: Ruyue Ma
> Fix For: 0.21.0
>
>
> In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly. 
> {noformat}
> // read from the file
> tSize curSize = bufferSize;
> for (; curSize == bufferSize;) {
> curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
> }
> {noformat} 
> The condition curSize == bufferSize is problematic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-01-26 Thread Ruyue Ma (JIRA)
libhdfs hdfs_read example uses hdfsRead wrongly
---

 Key: HDFS-923
 URL: https://issues.apache.org/jira/browse/HDFS-923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Affects Versions: 0.20.1
Reporter: Ruyue Ma
Assignee: Ruyue Ma
 Fix For: 0.21.0


In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly. 

{noformat}
// read from the file
tSize curSize = bufferSize;
for (; curSize == bufferSize;) {
curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
}
{noformat} 

The condition curSize == bufferSize is problematic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-630) In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.

2010-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804948#action_12804948
 ] 

Hudson commented on HDFS-630:
-

Integrated in Hadoop-Hdfs-trunk-Commit #178 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/178/])
HDFS-630. In DFSOutputStream.nextBlockOutputStream(), the client can exclude
specific datanodes when locating the next block


> In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific 
> datanodes when locating the next block.
> ---
>
> Key: HDFS-630
> URL: https://issues.apache.org/jira/browse/HDFS-630
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client, name-node
>Affects Versions: 0.21.0
>Reporter: Ruyue Ma
>Assignee: Cosmin Lehene
> Fix For: 0.21.0, 0.22.0
>
> Attachments: 0001-Fix-HDFS-630-0.21-svn-1.patch, 
> 0001-Fix-HDFS-630-0.21-svn-2.patch, 0001-Fix-HDFS-630-0.21-svn.patch, 
> 0001-Fix-HDFS-630-for-0.21-and-trunk-unified.patch, 
> 0001-Fix-HDFS-630-for-0.21.patch, 0001-Fix-HDFS-630-svn.patch, 
> 0001-Fix-HDFS-630-svn.patch, 0001-Fix-HDFS-630-trunk-svn-1.patch, 
> 0001-Fix-HDFS-630-trunk-svn-2.patch, 0001-Fix-HDFS-630-trunk-svn-3.patch, 
> 0001-Fix-HDFS-630-trunk-svn-3.patch, 0001-Fix-HDFS-630-trunk-svn-4.patch, 
> hdfs-630-0.20.txt, HDFS-630.patch
>
>
> created from hdfs-200.
> If during a write the dfsclient sees that a block replica location for a 
> newly allocated block is not connectable, it re-requests the NN to get a 
> fresh set of replica locations for the block. It tries this 
> dfs.client.block.write.retries times (default 3), sleeping 6 seconds between 
> each retry (see DFSClient.nextBlockOutputStream).
> This setting works well when you have a reasonably sized cluster; if you have 
> few datanodes in the cluster, every retry may pick the dead datanode and 
> the above logic bails out.
> Our solution: when getting block locations from the namenode, we give the NN 
> the excluded datanodes. The list of dead datanodes is only for one block 
> allocation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-630) In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.

2010-01-26 Thread Cosmin Lehene (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804947#action_12804947
 ] 

Cosmin Lehene commented on HDFS-630:


I'm glad it finally got into both 0.21 and trunk. It was a long-lived issue. 
Thanks for the support! :)

> In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific 
> datanodes when locating the next block.
> ---
>
> Key: HDFS-630
> URL: https://issues.apache.org/jira/browse/HDFS-630
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client, name-node
>Affects Versions: 0.21.0
>Reporter: Ruyue Ma
>Assignee: Cosmin Lehene
> Fix For: 0.21.0, 0.22.0
>
> Attachments: 0001-Fix-HDFS-630-0.21-svn-1.patch, 
> 0001-Fix-HDFS-630-0.21-svn-2.patch, 0001-Fix-HDFS-630-0.21-svn.patch, 
> 0001-Fix-HDFS-630-for-0.21-and-trunk-unified.patch, 
> 0001-Fix-HDFS-630-for-0.21.patch, 0001-Fix-HDFS-630-svn.patch, 
> 0001-Fix-HDFS-630-svn.patch, 0001-Fix-HDFS-630-trunk-svn-1.patch, 
> 0001-Fix-HDFS-630-trunk-svn-2.patch, 0001-Fix-HDFS-630-trunk-svn-3.patch, 
> 0001-Fix-HDFS-630-trunk-svn-3.patch, 0001-Fix-HDFS-630-trunk-svn-4.patch, 
> hdfs-630-0.20.txt, HDFS-630.patch
>
>
> created from hdfs-200.
> If during a write the dfsclient sees that a block replica location for a 
> newly allocated block is not connectable, it re-requests the NN to get a 
> fresh set of replica locations for the block. It tries this 
> dfs.client.block.write.retries times (default 3), sleeping 6 seconds between 
> each retry (see DFSClient.nextBlockOutputStream).
> This setting works well when you have a reasonably sized cluster; if you have 
> few datanodes in the cluster, every retry may pick the dead datanode and 
> the above logic bails out.
> Our solution: when getting block locations from the namenode, we give the NN 
> the excluded datanodes. The list of dead datanodes is only for one block 
> allocation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.