[jira] Created: (HDFS-1012) documentLocation attribute in LdapEntry for HDFSProxy isn't specific to a cluster

2010-03-01 Thread Srikanth Sundarrajan (JIRA)
documentLocation attribute in LdapEntry for HDFSProxy isn't specific to a 
cluster
-

 Key: HDFS-1012
 URL: https://issues.apache.org/jira/browse/HDFS-1012
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: contrib/hdfsproxy
Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0
Reporter: Srikanth Sundarrajan


The list of allowed document locations accessible through HDFSProxy isn't 
specific to a cluster. LDAP entries can include the name of the cluster to 
which each path belongs, giving better control over which clusters/paths a 
user can access through HDFSProxy.
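The improvement amounts to keying the allowed-location list by cluster. Below is a minimal sketch of such a check; `ProxyAcl` and its methods are illustrative stand-ins, not the actual HDFSProxy LDAP code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: allowed document locations are keyed by cluster, so a path is
// authorized only for the cluster it belongs to. Names are illustrative.
public class ProxyAcl {
    private final Map<String, Set<String>> allowedByCluster = new HashMap<>();

    // Corresponds to one LDAP entry: (cluster, documentLocation).
    public void allow(String cluster, String location) {
        allowedByCluster.computeIfAbsent(cluster, c -> new HashSet<>()).add(location);
    }

    // A path is permitted if it equals, or lies under, an allowed location
    // registered for that specific cluster.
    public boolean permitted(String cluster, String path) {
        Set<String> locations = allowedByCluster.get(cluster);
        if (locations == null) return false;
        for (String loc : locations) {
            String prefix = loc.endsWith("/") ? loc : loc + "/";
            if (path.equals(loc) || path.startsWith(prefix)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ProxyAcl acl = new ProxyAcl();
        acl.allow("clusterA", "/data/public");
        System.out.println(acl.permitted("clusterA", "/data/public/part-0")); // true
        System.out.println(acl.permitted("clusterB", "/data/public/part-0")); // false
    }
}
```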


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1007) HFTP needs to be updated to use delegation tokens

2010-03-01 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-1007:
--

Attachment: distcp-hftp.2.patch

Updates HsFtpFileSystem. Patch for Y20. Not for commit here.

 HFTP needs to be updated to use delegation tokens
 -

 Key: HDFS-1007
 URL: https://issues.apache.org/jira/browse/HDFS-1007
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security
Affects Versions: 0.22.0
Reporter: Devaraj Das
 Fix For: 0.22.0

 Attachments: distcp-hftp.1.patch, distcp-hftp.2.patch, 
 distcp-hftp.patch


 HFTPFileSystem should be updated to use the delegation tokens so that it can 
 talk to the secure namenodes.




[jira] Commented: (HDFS-1005) Fsck security

2010-03-01 Thread Boris Shkolnik (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839597#action_12839597
 ] 

Boris Shkolnik commented on HDFS-1005:
--

HDFS-1005-BP20.patch is for a previous version of Hadoop. Not for commit.




 Fsck security
 -

 Key: HDFS-1005
 URL: https://issues.apache.org/jira/browse/HDFS-1005
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Boris Shkolnik
 Attachments: HDFS-1005-BP20.patch, HDFS-1005-y20.1.patch


 This jira tracks implementation of security for Fsck. Fsck should make an 
 authenticated connection to the namenode.




[jira] Updated: (HDFS-1007) HFTP needs to be updated to use delegation tokens

2010-03-01 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-1007:
--

Attachment: distcp-hftp.2.1.patch

This patch is a bugfix on top of the distcp-hftp.2.patch.

 HFTP needs to be updated to use delegation tokens
 -

 Key: HDFS-1007
 URL: https://issues.apache.org/jira/browse/HDFS-1007
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security
Affects Versions: 0.22.0
Reporter: Devaraj Das
 Fix For: 0.22.0

 Attachments: distcp-hftp.1.patch, distcp-hftp.2.1.patch, 
 distcp-hftp.2.patch, distcp-hftp.patch


 HFTPFileSystem should be updated to use the delegation tokens so that it can 
 talk to the secure namenodes.




[jira] Updated: (HDFS-1006) getImage/putImage http requests should be https for the case of security enabled.

2010-03-01 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HDFS-1006:
--

Attachment: HDFS-1006-Y20.1.patch

Minor updates to the previous patch.

 getImage/putImage http requests should be https for the case of security 
 enabled.
 -

 Key: HDFS-1006
 URL: https://issues.apache.org/jira/browse/HDFS-1006
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Boris Shkolnik
Assignee: Boris Shkolnik
 Attachments: HDFS-1006-BP20.patch, HDFS-1006-Y20.1.patch, 
 HDFS-1006-Y20.patch


 Should use https:// and port 50475.




[jira] Updated: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-729:
--

Status: Open  (was: Patch Available)

Thanks Rodrigo. I will wait for your new patch.

 fsck option to list only corrupted files
 

 Key: HDFS-729
 URL: https://issues.apache.org/jira/browse/HDFS-729
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: Rodrigo Schmidt
 Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
 HDFS-729.1.patch, HDFS-729.2.patch


 An option to fsck to list only corrupted files will be very helpful for 
 frequent monitoring.




[jira] Updated: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-826:
--

Status: Patch Available  (was: Open)

Can somebody please review this patch? It is needed to make HBase work 
efficiently. Thanks.

 Allow a mechanism for an application to detect that datanode(s)  have died in 
 the write pipeline
 

 Key: HDFS-826
 URL: https://issues.apache.org/jira/browse/HDFS-826
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
 ReplicableHdfs3.txt


 HDFS does not replicate the last block of a file that is currently being 
 written to by an application. Every datanode death in the write pipeline 
 decreases the reliability of that last, currently-being-written block. This 
 situation can be improved if the application can be notified of a datanode 
 death in the write pipeline; the application can then decide the right 
 course of action to take on this event.
 In our use case, the application can close the file on the first datanode 
 death and start writing to a newly created file. This ensures that the 
 reliability guarantee of a block stays close to 3 at all times.
 One idea is to make DFSOutputStream.write() throw an exception if the number 
 of datanodes in the write pipeline falls below the minimum.replication.factor 
 that is set on the client (this is backward compatible).
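The last idea can be sketched as follows. `PipelineWriter` and its members are illustrative stand-ins for the real DFSOutputStream internals, under the assumption of a client-side minimum replication setting.

```java
import java.io.IOException;

// Sketch: the output stream fails fast once the live write pipeline falls
// below a client-side minimum replication, so the application learns of
// datanode deaths. Names are illustrative, not the real HDFS client API.
public class PipelineWriter {
    private int liveDatanodes;          // datanodes still alive in the pipeline
    private final int minReplication;   // client-side minimum.replication.factor

    public PipelineWriter(int pipelineSize, int minReplication) {
        this.liveDatanodes = pipelineSize;
        this.minReplication = minReplication;
    }

    // How many replicas the current last block is actually being written to.
    public int getNumCurrentReplicas() { return liveDatanodes; }

    // Called when a pipeline datanode is detected dead.
    public void onDatanodeDeath() { liveDatanodes--; }

    // write() throws when the pipeline is too thin, so the application can
    // close this file and continue writing to a newly created one.
    public void write(byte[] data) throws IOException {
        if (liveDatanodes < minReplication) {
            throw new IOException("only " + liveDatanodes
                + " datanodes left in pipeline, need " + minReplication);
        }
        // ... stream the data to the remaining datanodes ...
    }
}
```

In the HBase use case described above, the application would catch this exception (or poll `getNumCurrentReplicas()`), close the file, and open a fresh one.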




[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory

2010-03-01 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-985:
---

Attachment: testFileStatus.patch

This patch fixes a bug in TestFileStatus.java.

 HDFS should issue multiple RPCs for listing a large directory
 -

 Key: HDFS-985
 URL: https://issues.apache.org/jira/browse/HDFS-985
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Hairong Kuang
Assignee: Hairong Kuang
 Fix For: 0.22.0

 Attachments: iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, 
 testFileStatus.patch


 Currently HDFS issues one RPC from the client to the NameNode for listing a 
 directory. However, some directories are so large that they contain 
 thousands or even millions of items. Listing such large directories in one 
 RPC has a few shortcomings:
 1. The list operation holds the global fsnamesystem lock for a long time, 
 thus blocking other requests. If a large number (like thousands) of such 
 list requests hit the NameNode in a short period of time, the NameNode will 
 be significantly slowed down. Users end up noticing longer response times or 
 lost connections to the NameNode.
 2. The response message is uncontrollably big. We observed a response as big 
 as 50M bytes when listing a directory of 300 thousand items. Even with the 
 optimization introduced in HDFS-946, which may be able to cut the response 
 by 20-50%, the response size will still be on the order of 10 megabytes.
 I propose to implement directory listing using multiple RPCs. Here is the 
 plan:
 1. Each getListing RPC has an upper limit on the number of items returned. 
 This limit could be configurable, but I am thinking of setting it to a fixed 
 number like 500.
 2. Each RPC additionally specifies a start position for the listing request. 
 I am thinking of using the last item of the previous listing RPC as the 
 indicator. Since the NameNode stores all items in a directory as a sorted 
 array, it can use the last item to locate the start item of the next listing 
 even if that item is deleted between the two consecutive calls. This has the 
 advantage of avoiding duplicate entries at the client side.
 3. The return value additionally specifies whether the whole directory has 
 been fully listed. If the client sees a false flag, it will continue to 
 issue another RPC.
 This proposal changes the semantics of large directory listing in the sense 
 that listing is no longer an atomic operation if a directory's content is 
 changing while the listing operation is in progress.
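The three-step plan above can be sketched with a sorted map standing in for the NameNode's directory. `PagedLister`, `LIMIT`, and `listAfter` are illustrative names, not the real getListing API, and a short page stands in for the explicit "done" flag of step 3.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch of the proposed multi-RPC directory listing.
public class PagedLister {
    static final int LIMIT = 3; // per-RPC cap (the proposal suggests ~500)

    // One "getListing RPC": up to LIMIT entries strictly after startAfter.
    // startAfter is only a range bound, so resumption works even if that
    // entry was deleted between the two calls (step 2 of the plan).
    static List<String> listAfter(TreeMap<String, Long> dir, String startAfter) {
        NavigableMap<String, Long> rest =
            startAfter == null ? dir : dir.tailMap(startAfter, false);
        List<String> page = new ArrayList<>();
        for (String name : rest.keySet()) {
            if (page.size() == LIMIT) break;
            page.add(name);
        }
        return page;
    }

    // Client side: keep issuing "RPCs", using the last returned name as the
    // cursor; a page shorter than LIMIT signals the directory is exhausted.
    static List<String> listAll(TreeMap<String, Long> dir) {
        List<String> all = new ArrayList<>();
        String cursor = null;
        List<String> page;
        do {
            page = listAfter(dir, cursor);
            all.addAll(page);
            if (!page.isEmpty()) cursor = page.get(page.size() - 1);
        } while (page.size() == LIMIT);
        return all;
    }

    public static void main(String[] args) {
        TreeMap<String, Long> dir = new TreeMap<>();
        for (int i = 0; i < 7; i++) dir.put("file" + i, (long) i);
        System.out.println(PagedLister.listAll(dir));
    }
}
```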




[jira] Commented: (HDFS-898) Sequential generation of block ids

2010-03-01 Thread Dmytro Molkov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839805#action_12839805
 ] 

Dmytro Molkov commented on HDFS-898:


I just ran a tool on an FB cluster. Here is the output:

Bit map applied: ff00
Number of collisions = 0
=
Number of blocks = 57909756
Number of negative ids = 57909756
Number of positive ids = 0
Largest segment = (-277768208, 9223372036854775807)
Segment size = 9.223372037132544E18
Expected max = 318542942464
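The collision count above comes from checking whether any existing block id already falls into the band an 8-bit high-bit projection would reserve. A minimal re-creation of that check follows; the mask value and class name are assumptions for illustration, not the attached tool's actual code.

```java
// With an 8-bit high-bit projection, sequentially generated ids occupy a
// band distinguished by the top byte of the 64-bit id, so the check is to
// count existing (randomly generated) ids whose top byte already matches
// the reserved band.
public class HighBitCheck {
    static final long HIGH_BYTE_MASK = 0xff00000000000000L;

    // Counts ids that fall into the band reserved for new sequential ids.
    static long countCollisions(long[] existingIds, long reservedBand) {
        long collisions = 0;
        for (long id : existingIds) {
            if ((id & HIGH_BYTE_MASK) == (reservedBand & HIGH_BYTE_MASK)) {
                collisions++;
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // All-negative ids (top bit set), as in the cluster output above.
        long[] ids = { -277768208L, -1L, Long.MIN_VALUE };
        System.out.println("collisions with band 0x00: "
            + countCollisions(ids, 0L));
    }
}
```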


 Sequential generation of block ids
 --

 Key: HDFS-898
 URL: https://issues.apache.org/jira/browse/HDFS-898
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.20.1
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: 0.22.0

 Attachments: DuplicateBlockIds.patch, FreeBlockIds.pdf, 
 HighBitProjection.pdf


 This is a proposal to replace random generation of block ids with a 
 sequential generator in order to avoid block id reuse in the future.




[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839837#action_12839837
 ] 

Todd Lipcon commented on HDFS-826:
--

Patch looks good to me. I question whether returning 0 for the in-between-blocks 
case is a good idea - it seems a bit confusing from the API user's perspective. 
Since it is well documented, it may not be an issue, but I wonder if it would 
make more sense to return the intended replication in this case.

 Allow a mechanism for an application to detect that datanode(s)  have died in 
 the write pipeline
 

 Key: HDFS-826
 URL: https://issues.apache.org/jira/browse/HDFS-826
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
 ReplicableHdfs3.txt


 HDFS does not replicate the last block of a file that is currently being 
 written to by an application. Every datanode death in the write pipeline 
 decreases the reliability of that last, currently-being-written block. This 
 situation can be improved if the application can be notified of a datanode 
 death in the write pipeline; the application can then decide the right 
 course of action to take on this event.
 In our use case, the application can close the file on the first datanode 
 death and start writing to a newly created file. This ensures that the 
 reliability guarantee of a block stays close to 3 at all times.
 One idea is to make DFSOutputStream.write() throw an exception if the number 
 of datanodes in the write pipeline falls below the minimum.replication.factor 
 that is set on the client (this is backward compatible).




[jira] Commented: (HDFS-1007) HFTP needs to be updated to use delegation tokens

2010-03-01 Thread Kan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839844#action_12839844
 ] 

Kan Zhang commented on HDFS-1007:
-

The delegation token should be fetched by the distcp client, not by 
HftpFilesystem (or HsftpFilesystem).

 HFTP needs to be updated to use delegation tokens
 -

 Key: HDFS-1007
 URL: https://issues.apache.org/jira/browse/HDFS-1007
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security
Affects Versions: 0.22.0
Reporter: Devaraj Das
 Fix For: 0.22.0

 Attachments: distcp-hftp.1.patch, distcp-hftp.2.1.patch, 
 distcp-hftp.2.patch, distcp-hftp.patch


 HFTPFileSystem should be updated to use the delegation tokens so that it can 
 talk to the secure namenodes.




[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839853#action_12839853
 ] 

dhruba borthakur commented on HDFS-826:
---

Thanks, Todd, for the review. I agree with your recommendation and will post 
a new patch.

 Allow a mechanism for an application to detect that datanode(s)  have died in 
 the write pipeline
 

 Key: HDFS-826
 URL: https://issues.apache.org/jira/browse/HDFS-826
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
 ReplicableHdfs3.txt


 HDFS does not replicate the last block of a file that is currently being 
 written to by an application. Every datanode death in the write pipeline 
 decreases the reliability of that last, currently-being-written block. This 
 situation can be improved if the application can be notified of a datanode 
 death in the write pipeline; the application can then decide the right 
 course of action to take on this event.
 In our use case, the application can close the file on the first datanode 
 death and start writing to a newly created file. This ensures that the 
 reliability guarantee of a block stays close to 3 at all times.
 One idea is to make DFSOutputStream.write() throw an exception if the number 
 of datanodes in the write pipeline falls below the minimum.replication.factor 
 that is set on the client (this is backward compatible).




[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839858#action_12839858
 ] 

Hadoop QA commented on HDFS-826:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12436152/ReplicableHdfs3.txt
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/console

This message is automatically generated.

 Allow a mechanism for an application to detect that datanode(s)  have died in 
 the write pipeline
 

 Key: HDFS-826
 URL: https://issues.apache.org/jira/browse/HDFS-826
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
 ReplicableHdfs3.txt


 HDFS does not replicate the last block of a file that is currently being 
 written to by an application. Every datanode death in the write pipeline 
 decreases the reliability of that last, currently-being-written block. This 
 situation can be improved if the application can be notified of a datanode 
 death in the write pipeline; the application can then decide the right 
 course of action to take on this event.
 In our use case, the application can close the file on the first datanode 
 death and start writing to a newly created file. This ensures that the 
 reliability guarantee of a block stays close to 3 at all times.
 One idea is to make DFSOutputStream.write() throw an exception if the number 
 of datanodes in the write pipeline falls below the minimum.replication.factor 
 that is set on the client (this is backward compatible).




[jira] Commented: (HDFS-898) Sequential generation of block ids

2010-03-01 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839870#action_12839870
 ] 

Konstantin Shvachko commented on HDFS-898:
--

Great! Yet another cluster would have its block ids converted collision-free 
using the 8-bit projection. Thanks, Dmytro.

 Sequential generation of block ids
 --

 Key: HDFS-898
 URL: https://issues.apache.org/jira/browse/HDFS-898
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.20.1
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: 0.22.0

 Attachments: DuplicateBlockIds.patch, FreeBlockIds.pdf, 
 HighBitProjection.pdf


 This is a proposal to replace random generation of block ids with a 
 sequential generator in order to avoid block id reuse in the future.




[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2010-03-01 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839875#action_12839875
 ] 

Konstantin Shvachko commented on HDFS-826:
--

What is the point of introducing the new {{Replicable}} interface if it is not 
used anywhere? The new method {{getNumCurrentReplicas()}} in 
{{FSOutputSummer}} alone would work fine.

 Allow a mechanism for an application to detect that datanode(s)  have died in 
 the write pipeline
 

 Key: HDFS-826
 URL: https://issues.apache.org/jira/browse/HDFS-826
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Reporter: dhruba borthakur
Assignee: dhruba borthakur
 Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, 
 ReplicableHdfs3.txt


 HDFS does not replicate the last block of a file that is currently being 
 written to by an application. Every datanode death in the write pipeline 
 decreases the reliability of that last, currently-being-written block. This 
 situation can be improved if the application can be notified of a datanode 
 death in the write pipeline; the application can then decide the right 
 course of action to take on this event.
 In our use case, the application can close the file on the first datanode 
 death and start writing to a newly created file. This ensures that the 
 reliability guarantee of a block stays close to 3 at all times.
 One idea is to make DFSOutputStream.write() throw an exception if the number 
 of datanodes in the write pipeline falls below the minimum.replication.factor 
 that is set on the client (this is backward compatible).




[jira] Updated: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated HDFS-729:
-

Attachment: HDFS-729.3.patch

New patch attached (3 is my lucky number). I made all the changes suggested 
by Dhruba.

 fsck option to list only corrupted files
 

 Key: HDFS-729
 URL: https://issues.apache.org/jira/browse/HDFS-729
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: Rodrigo Schmidt
 Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
 HDFS-729.1.patch, HDFS-729.2.patch, HDFS-729.3.patch


 An option to fsck to list only corrupted files will be very helpful for 
 frequent monitoring.




[jira] Updated: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated HDFS-729:
-

Status: Patch Available  (was: Open)

 fsck option to list only corrupted files
 

 Key: HDFS-729
 URL: https://issues.apache.org/jira/browse/HDFS-729
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: Rodrigo Schmidt
 Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
 HDFS-729.1.patch, HDFS-729.2.patch, HDFS-729.3.patch


 An option to fsck to list only corrupted files will be very helpful for 
 frequent monitoring.




[jira] Updated: (HDFS-458) Create target for 10 minute patch test build for hdfs

2010-03-01 Thread Erik Steffl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Steffl updated HDFS-458:
-

Attachment: jira.HDFS-458.branch-0.21.1xx.patch

 Create target for 10 minute patch test build for hdfs
 -

 Key: HDFS-458
 URL: https://issues.apache.org/jira/browse/HDFS-458
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: build, test
Reporter: Jakob Homan
Assignee: Jakob Homan
 Fix For: 0.21.0

 Attachments: build.xml, HDFS-458.patch, HDFS-458.patch, 
 jira.HDFS-458.branch-0.21.1xx.patch, TenMinuteTestData.xlsx


 It would be good to identify a subset of hdfs tests that provide strong test 
 code coverage within 10 minutes, as is the goal of MAPREDUCE-670 and 
 HADOOP-5628.




[jira] Created: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)
Error in reading delegation tokens from edit logs.
--

 Key: HDFS-1014
 URL: https://issues.apache.org/jira/browse/HDFS-1014
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jitendra Nath Pandey


 When delegation tokens are read from the edit logs, the same object is used 
to read each identifier and is then stored in the token cache. This is wrong 
because that one object keeps getting updated.
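The bug class described here can be demonstrated in miniature: reusing one mutable identifier object while replaying a log means every cache entry aliases the same object, which ends up holding only the last record read. `TokenId` and the load methods below are illustrative names, not the HDFS code.

```java
import java.util.ArrayList;
import java.util.List;

// Demonstrates the aliasing bug and its fix.
public class TokenCacheDemo {
    static class TokenId { int seq; }

    // Buggy: a single TokenId is mutated and re-added for every record.
    static List<TokenId> loadBuggy(int[] records) {
        List<TokenId> cache = new ArrayList<>();
        TokenId id = new TokenId();      // one shared object...
        for (int r : records) {
            id.seq = r;                  // ...updated in place on each read
            cache.add(id);               // every entry is the same object
        }
        return cache;
    }

    // Fixed: a fresh TokenId per record, so cached entries stay distinct.
    static List<TokenId> loadFixed(int[] records) {
        List<TokenId> cache = new ArrayList<>();
        for (int r : records) {
            TokenId id = new TokenId();
            id.seq = r;
            cache.add(id);
        }
        return cache;
    }

    public static void main(String[] args) {
        System.out.println("buggy first entry: "
            + loadBuggy(new int[]{1, 2, 3}).get(0).seq);  // overwritten to 3
        System.out.println("fixed first entry: "
            + loadFixed(new int[]{1, 2, 3}).get(0).seq);  // stays 1
    }
}
```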




[jira] Updated: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-1014:
---

Attachment: HDFS-1014-y20.1.patch

Patch for hadoop-20 is uploaded.

 Error in reading delegation tokens from edit logs.
 --

 Key: HDFS-1014
 URL: https://issues.apache.org/jira/browse/HDFS-1014
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
 Attachments: HDFS-1014-y20.1.patch


  When delegation tokens are read from the edit logs, the same object is used 
 to read each identifier and is then stored in the token cache. This is wrong 
 because that one object keeps getting updated.




[jira] Commented: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839944#action_12839944
 ] 

Konstantin Shvachko commented on HDFS-1014:
---

+1 Patch looks good to me.

 Error in reading delegation tokens from edit logs.
 --

 Key: HDFS-1014
 URL: https://issues.apache.org/jira/browse/HDFS-1014
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HDFS-1014-y20.1.patch


  When delegation tokens are read from the edit logs, the same object is used 
 to read each identifier and is then stored in the token cache. This is wrong 
 because that one object keeps getting updated.




[jira] Updated: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-1014:
---

Attachment: HDFS-1014.2.patch

 Error in reading delegation tokens from edit logs.
 --

 Key: HDFS-1014
 URL: https://issues.apache.org/jira/browse/HDFS-1014
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HDFS-1014-y20.1.patch, HDFS-1014.2.patch


  When delegation tokens are read from the edit logs, the same object is used 
 to read each identifier and is then stored in the token cache. This is wrong 
 because that one object keeps getting updated.




[jira] Updated: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-1014:
---

Status: Patch Available  (was: Open)

 Error in reading delegation tokens from edit logs.
 --

 Key: HDFS-1014
 URL: https://issues.apache.org/jira/browse/HDFS-1014
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HDFS-1014-y20.1.patch, HDFS-1014.2.patch


  When delegation tokens are read from the edit logs, the same object is used 
 to read each identifier and is then stored in the token cache. This is wrong 
 because that one object keeps getting updated.




[jira] Commented: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839962#action_12839962
 ] 

Jitendra Nath Pandey commented on HDFS-1014:


HDFS-1014.2.patch is for trunk.

 Error in reading delegation tokens from edit logs.
 --

 Key: HDFS-1014
 URL: https://issues.apache.org/jira/browse/HDFS-1014
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HDFS-1014-y20.1.patch, HDFS-1014.2.patch


  When delegation tokens are read from the edit logs, the same object is used 
 to read each identifier and is then stored in the token cache. This is wrong 
 because that one object keeps getting updated.




[jira] Updated: (HDFS-988) saveNamespace can corrupt edits log

2010-03-01 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-988:
--

Tags: hbase

 saveNamespace can corrupt edits log
 ---

 Key: HDFS-988
 URL: https://issues.apache.org/jira/browse/HDFS-988
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Reporter: dhruba borthakur
 Attachments: saveNamespace.txt


 The administrator puts the namenode in safemode and then issues the 
 savenamespace command. This can corrupt the edits log. The problem is that 
 when the NN enters safemode, there could still be pending logSyncs occurring 
 from other threads. The saveNamespace command, when executed, would then 
 save an edits log with partial writes. I have seen this happen on 0.20.
 https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853
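The failure mode can be pictured with a toy edits buffer: a snapshot taken while a logSync is still in flight ends mid-record. This is only a deterministic illustration of the race under assumed names and record format, not the namenode's actual buffering code.

```java
// Toy illustration: saveNamespace snapshots the edits buffer while another
// thread's logSync is still mid-record, so the saved log is torn.
public class SaveRace {
    // A record is complete only if it ends at a record boundary ('\n' here).
    static boolean snapshotIsClean(String editsBuffer) {
        return editsBuffer.isEmpty() || editsBuffer.endsWith("\n");
    }

    public static void main(String[] args) {
        String edits = "OP_ADD /a\n";     // fully synced record
        edits += "OP_AD";                 // in-flight logSync from another thread
        // saveNamespace would persist 'edits' as-is at this point.
        System.out.println(snapshotIsClean(edits) ? "clean" : "torn record");
    }
}
```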




[jira] Commented: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839974#action_12839974
 ] 

Hadoop QA commented on HDFS-729:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12437530/HDFS-729.3.patch
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/console

This message is automatically generated.

 fsck option to list only corrupted files
 

 Key: HDFS-729
 URL: https://issues.apache.org/jira/browse/HDFS-729
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: Rodrigo Schmidt
 Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
 HDFS-729.1.patch, HDFS-729.2.patch, HDFS-729.3.patch


 An option to fsck to list only corrupted files will be very helpful for 
 frequent monitoring.




[jira] Commented: (HDFS-729) fsck option to list only corrupted files

2010-03-01 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839975#action_12839975
 ] 

Rodrigo Schmidt commented on HDFS-729:
--

The errors are the same as before, while other patches seem to be going 
through.

Dhruba, could you please double check that everything is fine with this patch?

 fsck option to list only corrupted files
 

 Key: HDFS-729
 URL: https://issues.apache.org/jira/browse/HDFS-729
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: Rodrigo Schmidt
 Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, 
 HDFS-729.1.patch, HDFS-729.2.patch, HDFS-729.3.patch


 An option to fsck to list only corrupted files will be very helpful for 
 frequent monitoring.




[jira] Commented: (HDFS-1014) Error in reading delegation tokens from edit logs.

2010-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840009#action_12840009
 ] 

Hadoop QA commented on HDFS-1014:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12437557/HDFS-1014.2.patch
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/console


 Error in reading delegation tokens from edit logs.
 --

 Key: HDFS-1014
 URL: https://issues.apache.org/jira/browse/HDFS-1014
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HDFS-1014-y20.1.patch, HDFS-1014.2.patch


 When delegation tokens are read from the edit logs, the same object is used to 
 read each identifier, and that object is then stored in the token cache. This 
 is wrong because the same object keeps getting updated on subsequent reads.




[jira] Updated: (HDFS-204) Revive number of files listed metrics

2010-03-01 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HDFS-204:
---

Attachment: getFileNum-yahoo20.patch

This patch ports the feature to the Yahoo 20 branch. In addition, it fixes a 
bug where an NPE would be thrown by getListing on a non-existent path, and it 
adds two more test cases: one listing a non-existent path and one listing a 
path that represents a file.
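The NPE fix mentioned above boils down to resolving the path before dereferencing it. A minimal, hypothetical sketch follows; the Map-backed namespace and method names are illustrative, not the real FSDirectory/FileSystem API: a non-existent path returns null rather than triggering an NPE, and a file path returns a one-entry listing.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ListingSketch {
    // Toy namespace: directories map to their children; files are a flat set.
    final Map<String, String[]> dirs = new HashMap<>();
    final Set<String> files = new HashSet<>();

    String[] getListing(String path) {
        String[] children = dirs.get(path);
        if (children != null) {
            return children;              // directory: list its children
        }
        if (files.contains(path)) {
            return new String[] { path }; // file: the listing is the file itself
        }
        return null;                      // non-existent path: null, not an NPE
    }

    public static void main(String[] args) {
        ListingSketch fs = new ListingSketch();
        fs.dirs.put("/user", new String[] { "/user/a", "/user/b" });
        fs.files.add("/user/a");
        System.out.println(fs.getListing("/user").length);   // 2
        System.out.println(fs.getListing("/user/a")[0]);     // /user/a
        System.out.println(fs.getListing("/no/such/path"));  // null
    }
}
```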

 Revive number of files listed metrics
 -

 Key: HDFS-204
 URL: https://issues.apache.org/jira/browse/HDFS-204
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: name-node
Affects Versions: 0.21.0
Reporter: Koji Noguchi
Assignee: Jitendra Nath Pandey
 Fix For: 0.21.0

 Attachments: getFileNum-yahoo20.patch, HDFS-204-2.patch, 
 HDFS-204-2.patch, HDFS-204.patch, HDFS-204.patch


 When the namenode becomes unresponsive due to HADOOP-4693 (large file-list 
 calls), metrics have been helpful in finding the cause.
 When GC time spiked, the FileListed metric spiked as well.
 In 0.18, after we *fixed* the FileListed metric so that it shows the number of 
 operations instead of the number of files listed (HADOOP-3683), I stopped 
 seeing this relationship in the graph.
 Can we bring back the NumberOfFilesListed metric?




[jira] Commented: (HDFS-984) Delegation Tokens should be persisted in Namenode

2010-03-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840041#action_12840041
 ] 

Hudson commented on HDFS-984:
-

Integrated in Hadoop-Mapreduce-trunk-Commit #252 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/252/])


 Delegation Tokens should be persisted in Namenode
 -

 Key: HDFS-984
 URL: https://issues.apache.org/jira/browse/HDFS-984
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Fix For: 0.22.0

 Attachments: HDFS-984-0_20.4.patch, HDFS-984.10.patch, 
 HDFS-984.11.patch, HDFS-984.12.patch, HDFS-984.14.patch, HDFS-984.7.patch


 The Delegation tokens should be persisted in the FsImage and EditLogs so that 
 they are valid to be used after namenode shutdown and restart.




[jira] Updated: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-01 Thread bc Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bc Wong updated HDFS-1001:
--

Status: Patch Available  (was: Open)

 DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
 -

 Key: HDFS-1001
 URL: https://issues.apache.org/jira/browse/HDFS-1001
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: bc Wong

 Running the TestPread with additional debug statements reveals that the 
 BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
 Currently it doesn't matter since DataXceiver closes the connection after 
 each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
 cache connections, they need to agree on the exchange of CHECKSUM_OK.
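The agreement the report asks for can be sketched as both peers deriving the trailer from one shared rule, rather than each guessing independently. Everything below is hypothetical (the constants, the names, and the "full block read" rule); it only illustrates the idea of keeping sender and receiver in lockstep so that cached connections stay usable.

```java
public class ChecksumOkSketch {
    static final int CHECKSUM_OK = 5;  // hypothetical status code values
    static final int EOF = -1;

    // One rule, shared by both peers, for whether CHECKSUM_OK ends the
    // exchange (here, assumed: the read covered the full block).
    static boolean sendsChecksumOk(long bytesRead, long blockLen) {
        return bytesRead == blockLen;
    }

    // BlockReader side: what it appends after consuming the data stream.
    static int readerTrailer(long bytesRead, long blockLen) {
        return sendsChecksumOk(bytesRead, blockLen) ? CHECKSUM_OK : EOF;
    }

    // DataXceiver side: what it waits for before reusing the connection.
    static int xceiverExpected(long bytesRead, long blockLen) {
        return sendsChecksumOk(bytesRead, blockLen) ? CHECKSUM_OK : EOF;
    }

    public static void main(String[] args) {
        // Full-block read: both sides agree CHECKSUM_OK is on the wire.
        System.out.println(readerTrailer(1024, 1024) == xceiverExpected(1024, 1024)); // true
        // Partial read (e.g. a pread): both sides agree it is not.
        System.out.println(readerTrailer(512, 1024) == xceiverExpected(512, 1024));   // true
    }
}
```

Because both sides evaluate the same predicate, neither can be left blocked waiting for a trailer the other never sends, which is the failure mode connection caching would expose.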




[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-01 Thread bc Wong (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840048#action_12840048
 ] 

bc Wong commented on HDFS-1001:
---

I have a patch. But I can't assign this issue to myself. Could someone please 
fix Jira to let me work on it? Thanks.

 DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
 -

 Key: HDFS-1001
 URL: https://issues.apache.org/jira/browse/HDFS-1001
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: bc Wong

 Running the TestPread with additional debug statements reveals that the 
 BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
 Currently it doesn't matter since DataXceiver closes the connection after 
 each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
 cache connections, they need to agree on the exchange of CHECKSUM_OK.




[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK

2010-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840051#action_12840051
 ] 

Hadoop QA commented on HDFS-1001:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org
  against trunk revision 916902.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.  Please justify why no new tests are needed for this patch.  Also 
please list what manual steps were performed to verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/257/console


 DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
 -

 Key: HDFS-1001
 URL: https://issues.apache.org/jira/browse/HDFS-1001
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: bc Wong

 Running the TestPread with additional debug statements reveals that the 
 BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it. 
 Currently it doesn't matter since DataXceiver closes the connection after 
 each op, and CHECKSUM_OK is the last thing on the wire. But if we want to 
 cache connections, they need to agree on the exchange of CHECKSUM_OK.
