[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840051#action_12840051
 ] 

Hadoop QA commented on HDFS-1001:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/257/console

This message is automatically generated.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.
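
As a rough illustration of the handshake described above, here is a minimal, 
self-contained sketch (plain Java, not the actual DataTransferProtocol; the 
status value and method names are invented) of a reader that sends CHECKSUM_OK 
only after verifying the whole block, and a sender side that consumes it before 
reusing the connection:

{code}
import java.io.*;

// Hypothetical, simplified model of the CHECKSUM_OK exchange; not HDFS code.
public class ChecksumOkSketch {
    // In the real protocol this is an op-status code; the value here is made up.
    static final int CHECKSUM_OK = 5;

    // Client side (BlockReader): acknowledge only after the whole block is verified.
    static void sendChecksumOk(DataOutputStream out) throws IOException {
        out.writeByte(CHECKSUM_OK);
        out.flush();
    }

    // Server side (DataXceiver): with cached connections the ack must be consumed
    // here, otherwise it would later be misread as the next op code.
    static void expectChecksumOk(DataInputStream in) throws IOException {
        int code = in.readByte();
        if (code != CHECKSUM_OK) {
            throw new IOException("expected CHECKSUM_OK, got " + code);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        sendChecksumOk(new DataOutputStream(wire));
        expectChecksumOk(new DataInputStream(new ByteArrayInputStream(wire.toByteArray())));
        System.out.println("both sides agree on CHECKSUM_OK");
    }
}
{code}

With connection caching, whichever side skips its half of this exchange leaves a 
stray byte (or a missing one) on the wire for the next operation, which is the 
disagreement the report describes.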

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-01 Thread bc Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840048#action_12840048
 ] 

bc Wong commented on HDFS-1001:
---

I have a patch. But I can't assign this issue to myself. Could someone please 
fix Jira to let me work on it? Thanks.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-01 Thread bc Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bc Wong updated HDFS-1001:
--

Status: Patch Available  (was: Open)

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> -
>
> Key: HDFS-1001
> URL: https://issues.apache.org/jira/browse/HDFS-1001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: bc Wong
>
> Running the TestPread with additional debug statements reveals that the 
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
> Currently it doesn't matter since DataXceiver closes the connection after 
> each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-984) Delegation Tokens should be persisted in Namenode

2010-03-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840041#action_12840041
 ] 

Hudson commented on HDFS-984:
-

Integrated in Hadoop-Mapreduce-trunk-Commit #252 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/252/])


> Delegation Tokens should be persisted in Namenode
> -
>
> Key: HDFS-984
> URL: https://issues.apache.org/jira/browse/HDFS-984
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Fix For: 0.22.0
>
> Attachments: HDFS-984-0_20.4.patch, HDFS-984.10.patch, 
> HDFS-984.11.patch, HDFS-984.12.patch, HDFS-984.14.patch, HDFS-984.7.patch
>
>
> Delegation tokens should be persisted in the FsImage and EditLogs so that 
> they remain valid after a namenode shutdown and restart.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-204) Revive number of files listed metrics

2010-03-01 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-204:
---

Attachment: getFileNum-yahoo20.patch

This patch ports the feature to the Yahoo 20 branch. In addition, it fixes a 
bug where an NPE was thrown when calling getListing on a non-existent path. I 
also added two more test cases: one lists a non-existent path and the other 
lists a path that represents a file.
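
The two extra cases are easy to model. The sketch below is a toy, in-memory 
stand-in for the namenode's getListing (not the real FSDirectory or test code), 
showing the null guard for a non-existent path and the behaviour when the 
listed path is a file:

{code}
import java.util.*;

// Toy model of getListing; illustrates the edge cases only, not the HDFS code.
public class GetListingSketch {
    private final Map<String, List<String>> dirs = new HashMap<String, List<String>>();
    private final Set<String> files = new HashSet<String>();

    void mkdir(String path, String... children) { dirs.put(path, Arrays.asList(children)); }
    void createFile(String path) { files.add(path); }

    // Returns null for a non-existent path (instead of dereferencing a missing
    // node and throwing NPE), the file itself for a file, and the children for
    // a directory -- loosely mirroring FileSystem#listStatus.
    List<String> getListing(String path) {
        if (files.contains(path)) {
            return Collections.singletonList(path);
        }
        return dirs.get(path);   // null when the path does not exist
    }

    public static void main(String[] args) {
        GetListingSketch fs = new GetListingSketch();
        fs.mkdir("/user/a", "f1", "f2");
        fs.createFile("/user/a/f1");
        System.out.println(fs.getListing("/no/such/path"));  // null, no NPE
        System.out.println(fs.getListing("/user/a/f1"));     // [/user/a/f1]
        System.out.println(fs.getListing("/user/a"));        // [f1, f2]
    }
}
{code}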

> Revive number of files listed metrics
> -
>
> Key: HDFS-204
> URL: https://issues.apache.org/jira/browse/HDFS-204
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Affects Versions: 0.21.0
>Reporter: Koji Noguchi
>Assignee: Jitendra Nath Pandey
> Fix For: 0.21.0
>
> Attachments: getFileNum-yahoo20.patch, HDFS-204-2.patch, 
> HDFS-204-2.patch, HDFS-204.patch, HDFS-204.patch
>
>
> When the namenode became unresponsive due to HADOOP-4693 (large filelist 
> calls), metrics were helpful in finding out the cause.
> When gc time spiked, the "FileListed" metric also spiked.
> In 0.18, after we *fixed* the "FileListed" metric so that it shows the number 
> of operations instead of the number of files listed (HADOOP-3683), I stopped 
> seeing this relationship in the graphs.
> Can we bring back a "NumberOfFilesListed" metric?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840009#action_12840009
 ] 

Hadoop QA commented on HDFS-1014:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12437557/HDFS-1014.2.patch
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/console

This message is automatically generated.

> Error in reading delegation tokens from edit logs.
> --
>
> Key: HDFS-1014
> URL: https://issues.apache.org/jira/browse/HDFS-1014
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1014-y20.1.patch, HDFS-1014.2.patch
>
>
>  When delegation tokens are read from the edit logs, the same object is used 
> to read the identifier and is then stored in the token cache. This is wrong 
> because that same object keeps getting updated.
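
The aliasing bug described above can be reproduced in a few lines. The sketch 
below is a toy illustration, not the actual edit-log loading code: reading 
every record into one shared mutable identifier and caching that same object 
leaves every cache entry pointing at the last value read, whereas allocating a 
fresh identifier per record behaves correctly.

{code}
import java.util.*;

public class TokenLoadSketch {
    // Stand-in for a mutable DelegationTokenIdentifier.
    static class Ident { String owner; }

    public static void main(String[] args) {
        String[] editLog = {"alice", "bob"};   // two token records to load

        // Buggy pattern: one shared object is read into and cached repeatedly.
        Map<Integer, Ident> buggyCache = new HashMap<Integer, Ident>();
        Ident shared = new Ident();
        for (int i = 0; i < editLog.length; i++) {
            shared.owner = editLog[i];    // "readFields" into the shared object
            buggyCache.put(i, shared);    // every entry now aliases it
        }
        System.out.println(buggyCache.get(0).owner);  // prints "bob", not "alice"

        // Fixed pattern: allocate a fresh identifier per record before caching.
        Map<Integer, Ident> fixedCache = new HashMap<Integer, Ident>();
        for (int i = 0; i < editLog.length; i++) {
            Ident id = new Ident();
            id.owner = editLog[i];
            fixedCache.put(i, id);
        }
        System.out.println(fixedCache.get(0).owner);  // prints "alice"
    }
}
{code}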

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839975#action_12839975
 ] 

Rodrigo Schmidt commented on HDFS-729:
--

The errors are the same as before, while other patches seem to be going 
through.

Dhruba, could you please double check that everything is fine with this patch?

> fsck option to list only corrupted files
> 
>
> Key: HDFS-729
> URL: https://issues.apache.org/jira/browse/HDFS-729
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: dhruba borthakur
>Assignee: Rodrigo Schmidt
> Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
> HDFS-729.1.patch, HDFS-729.2.patch, HDFS-729.3.patch
>
>
> An option to fsck to list only corrupted files will be very helpful for 
> frequent monitoring.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839974#action_12839974
 ] 

Hadoop QA commented on HDFS-729:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12437530/HDFS-729.3.patch
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/console

This message is automatically generated.

> fsck option to list only corrupted files
> 
>
> Key: HDFS-729
> URL: https://issues.apache.org/jira/browse/HDFS-729
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: dhruba borthakur
>Assignee: Rodrigo Schmidt
> Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
> HDFS-729.1.patch, HDFS-729.2.patch, HDFS-729.3.patch
>
>
> An option to fsck to list only corrupted files will be very helpful for 
> frequent monitoring.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-988) saveNamespace can corrupt edits log

2010-03-01 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-988:
--

Tags: hbase

> saveNamespace can corrupt edits log
> ---
>
> Key: HDFS-988
> URL: https://issues.apache.org/jira/browse/HDFS-988
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Reporter: dhruba borthakur
> Attachments: saveNamespace.txt
>
>
> The administrator puts the namenode in safemode and then issues the 
> savenamespace command. This can corrupt the edits log. The problem is that 
> when the NN enters safemode, there could still be pending logSyncs occurring 
> from other threads. Now, the saveNamespace command, when executed, would save 
> an edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853
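
A simplified, hypothetical model of the race and of one way to close it (the 
names are illustrative, not the FSEditLog API, and the attached patch may take 
a different approach): if saveNamespace acquires the same lock that writers 
hold while appending and syncing a record, the saved edits can never end in a 
partial record.

{code}
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class SaveNamespaceSketch {
    private final Object syncLock = new Object();
    private final ByteArrayOutputStream edits = new ByteArrayOutputStream();

    // Writers append a whole record and sync it under the lock
    // (a stand-in for logEdit + logSync taken together).
    void logSync(byte[] record) throws IOException {
        synchronized (syncLock) {
            edits.write(record);
            edits.flush();
        }
    }

    // saveNamespace takes the same lock, so the snapshot it copies can only
    // ever contain complete records -- never a partially written one.
    byte[] saveNamespace() {
        synchronized (syncLock) {
            return edits.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        final SaveNamespaceSketch log = new SaveNamespaceSketch();
        Thread writer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < 1000; i++) {
                        log.logSync(new byte[] {1, 2, 3, 4});
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        });
        writer.start();
        byte[] snapshot = log.saveNamespace();   // length is always a multiple of 4
        writer.join();
        System.out.println("snapshot holds " + snapshot.length / 4 + " whole records");
    }
}
{code}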

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-1014:
---

Status: Patch Available  (was: Open)

> Error in reading delegation tokens from edit logs.
> --
>
> Key: HDFS-1014
> URL: https://issues.apache.org/jira/browse/HDFS-1014
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1014-y20.1.patch, HDFS-1014.2.patch
>
>
>  When delegation tokens are read from the edit logs, the same object is used 
> to read the identifier and is then stored in the token cache. This is wrong 
> because that same object keeps getting updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839962#action_12839962
 ] 

Jitendra Nath Pandey commented on HDFS-1014:


HDFS-1014.2.patch is for trunk.

> Error in reading delegation tokens from edit logs.
> --
>
> Key: HDFS-1014
> URL: https://issues.apache.org/jira/browse/HDFS-1014
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1014-y20.1.patch, HDFS-1014.2.patch
>
>
>  When delegation tokens are read from the edit logs, the same object is used 
> to read the identifier and is then stored in the token cache. This is wrong 
> because that same object keeps getting updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-1014:
---

Attachment: HDFS-1014.2.patch

> Error in reading delegation tokens from edit logs.
> --
>
> Key: HDFS-1014
> URL: https://issues.apache.org/jira/browse/HDFS-1014
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1014-y20.1.patch, HDFS-1014.2.patch
>
>
>  When delegation tokens are read from the edit logs, the same object is used 
> to read the identifier and is then stored in the token cache. This is wrong 
> because that same object keeps getting updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839944#action_12839944
 ] 

Konstantin Shvachko commented on HDFS-1014:
---

+1 Patch looks good to me.

> Error in reading delegation tokens from edit logs.
> --
>
> Key: HDFS-1014
> URL: https://issues.apache.org/jira/browse/HDFS-1014
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1014-y20.1.patch
>
>
>  When delegation tokens are read from the edit logs, the same object is used 
> to read the identifier and is then stored in the token cache. This is wrong 
> because that same object keeps getting updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reassigned HDFS-1014:
--

Assignee: Jitendra Nath Pandey

> Error in reading delegation tokens from edit logs.
> --
>
> Key: HDFS-1014
> URL: https://issues.apache.org/jira/browse/HDFS-1014
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-1014-y20.1.patch
>
>
>  When delegation tokens are read from the edit logs, the same object is used 
> to read the identifier and is then stored in the token cache. This is wrong 
> because that same object keeps getting updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-1014:
---

Attachment: HDFS-1014-y20.1.patch

Patch for hadoop-20 is uploaded.

> Error in reading delegation tokens from edit logs.
> --
>
> Key: HDFS-1014
> URL: https://issues.apache.org/jira/browse/HDFS-1014
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jitendra Nath Pandey
> Attachments: HDFS-1014-y20.1.patch
>
>
>  When delegation tokens are read from the edit logs, the same object is used 
> to read the identifier and is then stored in the token cache. This is wrong 
> because that same object keeps getting updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)
Error in reading delegation tokens from edit logs.
--

 Key: HDFS-1014
 URL: https://issues.apache.org/jira/browse/HDFS-1014
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jitendra Nath Pandey


 When delegation tokens are read from the edit logs, the same object is used to 
read the identifier and is then stored in the token cache. This is wrong 
because that same object keeps getting updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-458) Create target for 10 minute patch test build for hdfs

2010-03-01 Thread Erik Steffl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Steffl updated HDFS-458:
-

Attachment: jira.HDFS-458.branch-0.21.1xx.patch

> Create target for 10 minute patch test build for hdfs
> -
>
> Key: HDFS-458
> URL: https://issues.apache.org/jira/browse/HDFS-458
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: build, test
>Reporter: Jakob Homan
>Assignee: Jakob Homan
> Fix For: 0.21.0
>
> Attachments: build.xml, HDFS-458.patch, HDFS-458.patch, 
> jira.HDFS-458.branch-0.21.1xx.patch, TenMinuteTestData.xlsx
>
>
> It would be good to identify a subset of hdfs tests that provide strong test 
> code coverage within 10 minutes, as is the goal of MAPREDUCE-670 and 
> HADOOP-5628.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated HDFS-729:
-

Status: Patch Available  (was: Open)

> fsck option to list only corrupted files
> 
>
> Key: HDFS-729
> URL: https://issues.apache.org/jira/browse/HDFS-729
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: dhruba borthakur
>Assignee: Rodrigo Schmidt
> Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
> HDFS-729.1.patch, HDFS-729.2.patch, HDFS-729.3.patch
>
>
> An option to fsck to list only corrupted files will be very helpful for 
> frequent monitoring.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated HDFS-729:
-

Attachment: HDFS-729.3.patch

new patch attached. (3 is my lucky number)

I made all the changes suggested by Dhruba.

> fsck option to list only corrupted files
> 
>
> Key: HDFS-729
> URL: https://issues.apache.org/jira/browse/HDFS-729
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: dhruba borthakur
>Assignee: Rodrigo Schmidt
> Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
> HDFS-729.1.patch, HDFS-729.2.patch, HDFS-729.3.patch
>
>
> An option to fsck to list only corrupted files will be very helpful for 
> frequent monitoring.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1013) Miscellaneous improvements to HTML markup for web UIs

2010-03-01 Thread Todd Lipcon (JIRA)
Miscellaneous improvements to HTML markup for web UIs
-

 Key: HDFS-1013
 URL: https://issues.apache.org/jira/browse/HDFS-1013
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Todd Lipcon
Priority: Minor


The Web UIs have various bits of bad markup (e.g. missing  sections, some 
pages missing CSS links, inconsistent td vs th for table headings). We should 
fix this up.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839875#action_12839875
 ] 

Konstantin Shvachko commented on HDFS-826:
--

What is the point of introducing the new {{Replicable}} interface if it is not 
used anywhere? The new method {{getNumCurrentReplicas()}} in {{FSOutputSummer}} 
would work fine.

> Allow a mechanism for an application to detect that datanode(s)  have died in 
> the write pipeline
> 
>
> Key: HDFS-826
> URL: https://issues.apache.org/jira/browse/HDFS-826
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
> ReplicableHdfs3.txt
>
>
> HDFS does not replicate the last block of the file that is currently being 
> written to by an application. Every datanode death in the write pipeline 
> decreases the reliability of that currently-being-written block. This 
> situation can be improved if the application can be notified of a datanode 
> death in the write pipeline. Then, the application can decide what is the 
> right course of action to take on this event.
> In our use-case, the application can close the file on the first datanode 
> death, and start writing to a newly created file. This ensures that the 
> reliability guarantee of a block stays close to 3 at all times.
> One idea is to make DFSOutputStream.write() throw an exception if the number 
> of datanodes in the write pipeline falls below the minimum.replication.factor 
> that is set on the client (this is backward compatible).
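
A hedged usage sketch of the pattern this issue proposes: the application polls 
the current pipeline width while writing and rolls to a new file when it drops 
below the desired replication. {{PipelineAwareStream}} and {{currentReplicas()}} 
below are stand-ins for whatever method the patch finally exposes (the comment 
above mentions {{getNumCurrentReplicas()}}), not an existing HDFS API.

{code}
import java.io.IOException;
import java.io.OutputStream;

public class PipelineWatchSketch {
    // Hypothetical view of an output stream that can report its pipeline width.
    interface PipelineAwareStream {
        OutputStream stream();
        int currentReplicas() throws IOException;   // datanodes still in the pipeline
        void close() throws IOException;
    }

    // Hypothetical factory that opens the next file to write to.
    interface StreamFactory {
        PipelineAwareStream create() throws IOException;
    }

    static void writeRecords(StreamFactory factory, byte[][] records, int minReplication)
            throws IOException {
        PipelineAwareStream out = factory.create();
        for (byte[] record : records) {
            if (out.currentReplicas() < minReplication) {
                // A datanode died in the pipeline: close this file and start a
                // new one so the tail block regains its full replication.
                out.close();
                out = factory.create();
            }
            out.stream().write(record);
        }
        out.close();
    }
}
{code}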

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-898) Sequential generation of block ids

2010-03-01 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839870#action_12839870
 ] 

Konstantin Shvachko commented on HDFS-898:
--

Great! Yet another cluster could have its block ids converted collision-free 
using the 8-bit projection. Thanks, Dmytro.

> Sequential generation of block ids
> --
>
> Key: HDFS-898
> URL: https://issues.apache.org/jira/browse/HDFS-898
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.20.1
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Fix For: 0.22.0
>
> Attachments: DuplicateBlockIds.patch, FreeBlockIds.pdf, 
> HighBitProjection.pdf
>
>
> This is a proposal to replace random generation of block ids with a 
> sequential generator in order to avoid block id reuse in the future.
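
A minimal sketch of what such a generator could look like, as an illustration 
under stated assumptions rather than the namenode's actual implementation: ids 
come from a monotonically increasing counter, and any id already minted by the 
old random scheme is skipped so it is never reused.

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

public class SequentialBlockIdSketch {
    private final AtomicLong nextId;
    private final Set<Long> existingIds;   // ids minted by the legacy random generator

    SequentialBlockIdSketch(long firstId, Set<Long> existingIds) {
        this.nextId = new AtomicLong(firstId);
        this.existingIds = existingIds;
    }

    // Hand out the next free id, skipping anything already in use.
    long nextBlockId() {
        long id;
        do {
            id = nextId.getAndIncrement();
        } while (existingIds.contains(id));
        return id;
    }

    public static void main(String[] args) {
        Set<Long> legacy = new HashSet<Long>(Arrays.asList(1L, 2L, 5L));
        SequentialBlockIdSketch gen = new SequentialBlockIdSketch(1, legacy);
        System.out.println(gen.nextBlockId());  // 3 (skips 1 and 2)
        System.out.println(gen.nextBlockId());  // 4
        System.out.println(gen.nextBlockId());  // 6 (skips 5)
    }
}
{code}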

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839858#action_12839858
 ] 

Hadoop QA commented on HDFS-826:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12436152/ReplicableHdfs3.txt
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/console

This message is automatically generated.

> Allow a mechanism for an application to detect that datanode(s)  have died in 
> the write pipeline
> 
>
> Key: HDFS-826
> URL: https://issues.apache.org/jira/browse/HDFS-826
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
> ReplicableHdfs3.txt
>
>
> HDFS does not replicate the last block of the file that is currently being 
> written to by an application. Every datanode death in the write pipeline 
> decreases the reliability of that currently-being-written block. This 
> situation can be improved if the application can be notified of a datanode 
> death in the write pipeline. Then, the application can decide what is the 
> right course of action to take on this event.
> In our use-case, the application can close the file on the first datanode 
> death, and start writing to a newly created file. This ensures that the 
> reliability guarantee of a block stays close to 3 at all times.
> One idea is to make DFSOutputStream.write() throw an exception if the number 
> of datanodes in the write pipeline falls below the minimum.replication.factor 
> that is set on the client (this is backward compatible).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839853#action_12839853
 ] 

dhruba borthakur commented on HDFS-826:
---

Thanks Todd for the review. I agree with your recommendation and will post a 
new patch. 

> Allow a mechanism for an application to detect that datanode(s)  have died in 
> the write pipeline
> 
>
> Key: HDFS-826
> URL: https://issues.apache.org/jira/browse/HDFS-826
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
> ReplicableHdfs3.txt
>
>
> HDFS does not replicate the last block of the file that is currently being 
> written to by an application. Every datanode death in the write pipeline 
> decreases the reliability of that currently-being-written block. This 
> situation can be improved if the application can be notified of a datanode 
> death in the write pipeline. Then, the application can decide what is the 
> right course of action to take on this event.
> In our use-case, the application can close the file on the first datanode 
> death, and start writing to a newly created file. This ensures that the 
> reliability guarantee of a block stays close to 3 at all times.
> One idea is to make DFSOutputStream.write() throw an exception if the number 
> of datanodes in the write pipeline falls below the minimum.replication.factor 
> that is set on the client (this is backward compatible).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1007) HFTP needs to be updated to use delegation tokens

2010-03-01 Thread Kan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839844#action_12839844
 ] 

Kan Zhang commented on HDFS-1007:
-

The delegation token should be fetched by the distcp client, not by 
HftpFilesystem (or HsftpFilesystem).

> HFTP needs to be updated to use delegation tokens
> -
>
> Key: HDFS-1007
> URL: https://issues.apache.org/jira/browse/HDFS-1007
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.22.0
>Reporter: Devaraj Das
> Fix For: 0.22.0
>
> Attachments: distcp-hftp.1.patch, distcp-hftp.2.1.patch, 
> distcp-hftp.2.patch, distcp-hftp.patch
>
>
> HFTPFileSystem should be updated to use the delegation tokens so that it can 
> talk to the secure namenodes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839837#action_12839837
 ] 

Todd Lipcon commented on HDFS-826:
--

Patch looks good to me. I question whether returning 0 for the case in between 
blocks is a good idea - this seems a bit confusing from the API user's 
perspective. Since it is well documented it may not be an issue, but I wonder 
if it would make more sense to actually return the intended replication in this 
case.

> Allow a mechanism for an application to detect that datanode(s)  have died in 
> the write pipeline
> 
>
> Key: HDFS-826
> URL: https://issues.apache.org/jira/browse/HDFS-826
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
> ReplicableHdfs3.txt
>
>
> HDFS does not replicate the last block of the file that is currently being 
> written to by an application. Every datanode death in the write pipeline 
> decreases the reliability of that currently-being-written block. This 
> situation can be improved if the application can be notified of a datanode 
> death in the write pipeline. Then, the application can decide what is the 
> right course of action to take on this event.
> In our use-case, the application can close the file on the first datanode 
> death, and start writing to a newly created file. This ensures that the 
> reliability guarantee of a block stays close to 3 at all times.
> One idea is to make DFSOutputStream.write() throw an exception if the number 
> of datanodes in the write pipeline falls below the minimum.replication.factor 
> that is set on the client (this is backward compatible).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-898) Sequential generation of block ids

2010-03-01 Thread Dmytro Molkov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839805#action_12839805
 ] 

Dmytro Molkov commented on HDFS-898:


I just ran a tool on FB cluster. Here is the output:

Bit map applied: ff00
Number of collisions = 0
=
Number of blocks = 57909756
Number of negative ids = 57909756
Number of positive ids = 0
Largest segment = (-277768208, 9223372036854775807)
Segment size = 9.223372037132544E18
Expected max = 318542942464


> Sequential generation of block ids
> --
>
> Key: HDFS-898
> URL: https://issues.apache.org/jira/browse/HDFS-898
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.20.1
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Fix For: 0.22.0
>
> Attachments: DuplicateBlockIds.patch, FreeBlockIds.pdf, 
> HighBitProjection.pdf
>
>
> This is a proposal to replace random generation of block ids with a 
> sequential generator in order to avoid block id reuse in the future.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-01 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: testFileStatus.patch

This patch fixed a bug in TestFileStatus.java.

> HDFS should issue multiple RPCs for listing a large directory
> -
>
> Key: HDFS-985
> URL: https://issues.apache.org/jira/browse/HDFS-985
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.22.0
>
> Attachments: iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, 
> testFileStatus.patch
>
>
> Currently HDFS issues one RPC from the client to the NameNode for listing a 
> directory. However, some directories are large and contain thousands or 
> millions of items. Listing such a large directory in one RPC has a few 
> shortcomings:
> 1. The list operation holds the global fsnamesystem lock for a long time, thus 
> blocking other requests. If a large number (like thousands) of such list 
> requests hit the NameNode in a short period of time, the NameNode will be 
> significantly slowed down. Users end up noticing longer response times or lost 
> connections to the NameNode.
> 2. The response message is uncontrollably big. We observed a response as big 
> as 50M bytes when listing a directory of 300 thousand items. Even with the 
> optimization introduced in HDFS-946, which may be able to cut the response by 
> 20-50%, the response size will still be in the magnitude of 10 megabytes.
> I propose to implement directory listing using multiple RPCs. Here is the 
> plan:
> 1. Each getListing RPC has an upper limit on the number of items returned. 
> This limit could be configurable, but I am thinking of setting it to a fixed 
> number like 500.
> 2. Each RPC additionally specifies a start position for this listing request. 
> I am thinking of using the last item of the previous listing RPC as the 
> indicator. Since the NameNode stores all items in a directory as a sorted 
> array, it uses the last item to locate the start item of this listing even if 
> that last item is deleted between the two consecutive calls. This has the 
> advantage of avoiding duplicate entries on the client side.
> 3. The return value additionally specifies whether the whole directory has 
> been listed. If the client sees a false flag, it will continue to issue 
> another RPC.
> This proposal changes the semantics of large directory listing in the sense 
> that listing is no longer an atomic operation if the directory's content is 
> changing while the listing operation is in progress.
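
A client-side sketch of the loop this plan implies. {{NameNodeListing}} and 
{{Partial}} below are stand-ins for the eventual getListing RPC and its return 
type, not the real ClientProtocol: each call returns a bounded batch plus a 
has-more flag, and the next call passes the last name seen as the start cursor.

{code}
import java.util.ArrayList;
import java.util.List;

public class IterativeListingSketch {
    // One batch of results: the entries plus whether the directory has more.
    static class Partial {
        final List<String> entries;
        final boolean hasMore;
        Partial(List<String> entries, boolean hasMore) {
            this.entries = entries;
            this.hasMore = hasMore;
        }
    }

    // Hypothetical RPC: entries of `dir` sorting after `startAfter`, up to a server limit.
    interface NameNodeListing {
        Partial getListing(String dir, String startAfter);
    }

    static List<String> listAll(NameNodeListing nn, String dir) {
        List<String> all = new ArrayList<String>();
        String cursor = "";                 // empty cursor = start of the directory
        Partial batch;
        do {
            batch = nn.getListing(dir, cursor);
            all.addAll(batch.entries);
            if (!batch.entries.isEmpty()) {
                // Resume after the last item seen, as the proposal describes.
                cursor = batch.entries.get(batch.entries.size() - 1);
            }
        } while (batch.hasMore);
        return all;
    }
}
{code}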

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-826:
--

Status: Patch Available  (was: Open)

Can somebody please review this patch? This is needed to make HBase work 
efficiently. Thanks.

> Allow a mechanism for an application to detect that datanode(s)  have died in 
> the write pipeline
> 
>
> Key: HDFS-826
> URL: https://issues.apache.org/jira/browse/HDFS-826
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
> ReplicableHdfs3.txt
>
>
> HDFS does not replicate the last block of the file that is currently being 
> written to by an application. Every datanode death in the write pipeline 
> decreases the reliability of that currently-being-written block. This 
> situation can be improved if the application can be notified of a datanode 
> death in the write pipeline. Then, the application can decide what is the 
> right course of action to take on this event.
> In our use-case, the application can close the file on the first datanode 
> death, and start writing to a newly created file. This ensures that the 
> reliability guarantee of a block stays close to 3 at all times.
> One idea is to make DFSOutputStream.write() throw an exception if the number 
> of datanodes in the write pipeline falls below the minimum.replication.factor 
> that is set on the client (this is backward compatible).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-729:
--

Status: Open  (was: Patch Available)

Thanks Rodrigo. I will wait for your new patch.

> fsck option to list only corrupted files
> 
>
> Key: HDFS-729
> URL: https://issues.apache.org/jira/browse/HDFS-729
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: dhruba borthakur
>Assignee: Rodrigo Schmidt
> Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
> HDFS-729.1.patch, HDFS-729.2.patch
>
>
> An option to fsck to list only corrupted files will be very helpful for 
> frequent monitoring.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1006) getImage/putImage http requests should be https for the case of security enabled.

2010-03-01 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-1006:
--

Attachment: HDFS-1006-Y20.1.patch

Minor updates to the previous patch.

> getImage/putImage http requests should be https for the case of security 
> enabled.
> -
>
> Key: HDFS-1006
> URL: https://issues.apache.org/jira/browse/HDFS-1006
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Boris Shkolnik
>Assignee: Boris Shkolnik
> Attachments: HDFS-1006-BP20.patch, HDFS-1006-Y20.1.patch, 
> HDFS-1006-Y20.patch
>
>
> should use https:// and port 50475

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1007) HFTP needs to be updated to use delegation tokens

2010-03-01 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-1007:
--

Attachment: distcp-hftp.2.1.patch

This patch is a bugfix on top of the distcp-hftp.2.patch.

> HFTP needs to be updated to use delegation tokens
> -
>
> Key: HDFS-1007
> URL: https://issues.apache.org/jira/browse/HDFS-1007
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.22.0
>Reporter: Devaraj Das
> Fix For: 0.22.0
>
> Attachments: distcp-hftp.1.patch, distcp-hftp.2.1.patch, 
> distcp-hftp.2.patch, distcp-hftp.patch
>
>
> HFTPFileSystem should be updated to use the delegation tokens so that it can 
> talk to the secure namenodes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-1005) Fsck security

2010-03-01 Thread Boris Shkolnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839597#action_12839597
 ] 

Boris Shkolnik commented on HDFS-1005:
--

HDFS-1005-BP20.patch is for a previous version of Hadoop. Not for commit.




> Fsck security
> -
>
> Key: HDFS-1005
> URL: https://issues.apache.org/jira/browse/HDFS-1005
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Boris Shkolnik
> Attachments: HDFS-1005-BP20.patch, HDFS-1005-y20.1.patch
>
>
> This jira tracks implementation of security for Fsck. Fsck should make an 
> authenticated connection to the namenode.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1007) HFTP needs to be updated to use delegation tokens

2010-03-01 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-1007:
--

Attachment: distcp-hftp.2.patch

Updates HsFtpFileSystem. Patch for Y20. Not for commit here.

> HFTP needs to be updated to use delegation tokens
> -
>
> Key: HDFS-1007
> URL: https://issues.apache.org/jira/browse/HDFS-1007
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.22.0
>Reporter: Devaraj Das
> Fix For: 0.22.0
>
> Attachments: distcp-hftp.1.patch, distcp-hftp.2.patch, 
> distcp-hftp.patch
>
>
> HFTPFileSystem should be updated to use the delegation tokens so that it can 
> talk to the secure namenodes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-1012) documentLocation attribute in LdapEntry for HDFSProxy isn't specific to a cluster

2010-03-01 Thread Srikanth Sundarrajan (JIRA)
documentLocation attribute in LdapEntry for HDFSProxy isn't specific to a 
cluster
-

 Key: HDFS-1012
 URL: https://issues.apache.org/jira/browse/HDFS-1012
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: contrib/hdfsproxy
Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0
Reporter: Srikanth Sundarrajan


The list of allowed document locations accessible through HDFSProxy isn't 
specific to a cluster. LDAP entries can include the name of the cluster to 
which the path belongs, giving better control over which clusters/paths a user 
can access through HDFSProxy.
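
As an illustration of the proposed check, the sketch below treats each allowed 
documentLocation as a "<cluster>:<path prefix>" pair, so a path is served only 
if it is permitted for the cluster the proxy fronts. The entry format and 
method names are assumptions for illustration, not the actual LDAP schema or 
HDFSProxy code.

{code}
import java.util.Arrays;
import java.util.List;

public class DocumentLocationSketch {
    // Each entry is assumed to look like "<cluster>:<path prefix>".
    static boolean allowed(String cluster, String path, List<String> documentLocations) {
        for (String loc : documentLocations) {
            int sep = loc.indexOf(':');
            String locCluster = loc.substring(0, sep);
            String prefix = loc.substring(sep + 1);
            if (locCluster.equals(cluster) && path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> locs = Arrays.asList("clusterA:/data/public", "clusterB:/projects/x");
        System.out.println(allowed("clusterA", "/data/public/f1", locs));  // true
        System.out.println(allowed("clusterB", "/data/public/f1", locs));  // false
    }
}
{code}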


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.