[jira] Commented: (HDFS-599) Improve Namenode robustness by prioritizing datanode heartbeats over client requests

2009-09-08 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752375#action_12752375
 ] 

dhruba borthakur commented on HDFS-599:
---

One proposal is to make the DatanodeProtocol have higher priority than 
ClientProtocol. How do folks feel about this one?

 Improve Namenode robustness by prioritizing datanode heartbeats over client 
 requests
 

 Key: HDFS-599
 URL: https://issues.apache.org/jira/browse/HDFS-599
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: dhruba borthakur
Assignee: dhruba borthakur

 The namenode processes RPC requests from clients that are reading/writing to 
 files as well as heartbeats/block reports from datanodes.
 Sometimes, for various reasons (Java GC runs, inconsistent performance of 
 the NFS filer that stores the HDFS transaction logs, etc.), the namenode 
 encounters transient slowness. For example, if the device that stores the 
 HDFS transaction logs becomes sluggish, the Namenode's ability to process 
 RPCs slows down to a certain extent. During this time, the RPCs from clients 
 as well as the RPCs from datanodes suffer in similar fashion. If the 
 underlying problem becomes worse, the NN's ability to process a heartbeat 
 from a DN is severely impacted, thus causing the NN to declare that the DN is 
 dead. Then the NN starts replicating blocks that used to reside on the 
 now-declared-dead datanode. This adds extra load to the NN. Then the 
 now-declared-dead datanode finally re-establishes contact with the NN and 
 sends a block report. The block report processing on the NN is another 
 heavyweight activity, thus causing more load on the already overloaded 
 namenode. 
 My proposal is that the NN should try its best to continue processing RPCs 
 from datanodes and give lower priority to serving client requests. The 
 Datanode RPCs are integral to the consistency and performance of the Hadoop 
 file system, and it is better to protect them at all costs. This will ensure 
 that the NN recovers from the hiccup much faster than it does now.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-412) Hadoop JMX usage makes Nagios monitoring impossible

2009-09-08 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated HDFS-412:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Brian!

 Hadoop JMX usage makes Nagios monitoring impossible
 ---

 Key: HDFS-412
 URL: https://issues.apache.org/jira/browse/HDFS-412
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Brian Bockelman
Assignee: Brian Bockelman
 Fix For: 0.21.0

 Attachments: hadoop-4482.patch, hdfs-412.patch, jmx_name.patch, 
 jmx_name_replaced.patch


 When Hadoop reports Datanode information to JMX, the bean uses the name 
 "DataNode-" + storageId.  The storage ID incorporates a random number and is 
 unpredictable.
 This prevents me from monitoring DFS datanodes through Hadoop using the JMX 
 interface; in order to do that, you must be able to specify the bean name on 
 the command line.
 The fix is simple; a patch will be coming momentarily.  However, there was 
 probably a reason for giving the datanodes all-unique names which I'm unaware 
 of, so it'd be nice to hear from the metrics maintainer.
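
 As a rough illustration of the naming problem, here is a minimal JMX sketch 
 (not the actual DataNode code; the ObjectName strings and the MBean interface 
 are assumptions) contrasting a name that embeds the random storage ID with a 
 fixed name a Nagios check could be pointed at:

{noformat}
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class DataNodeJmxSketch {
  // Hypothetical MBean, standing in for the real DataNode bean.
  public interface DataNodeStatusMBean {
    String getStorageId();
  }

  public static class DataNodeStatus implements DataNodeStatusMBean {
    private final String storageId;
    public DataNodeStatus(String storageId) { this.storageId = storageId; }
    public String getStorageId() { return storageId; }
  }

  public static void register(String storageId) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    // Problematic: the name embeds the random, unpredictable storage ID,
    // so a monitoring check cannot know the bean name in advance.
    ObjectName unpredictable =
        new ObjectName("hadoop:service=DataNode,name=DataNode-" + storageId);
    // Monitoring-friendly: a fixed name that can be given on a command line.
    ObjectName fixed =
        new ObjectName("hadoop:service=DataNode,name=DataNodeStatus");
    server.registerMBean(new DataNodeStatus(storageId), fixed);
  }
}
{noformat}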

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-585) Datanode should serve up to visible length of a replica for read requests

2009-09-08 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752611#action_12752611
 ] 

Hairong Kuang commented on HDFS-585:


1. I think the DataNode should get the on-disk block length from ReplicaInfo, 
not from disk, because the DataNode guarantees that the crc file is also 
updated when updating the replica's in-memory bytesOnDisk. 
2. Should we have unit tests that cover reading from an unclosed file?
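
A schematic sketch of point 1, with a made-up accessor standing in for 
whatever the replica classes actually expose (the method name here is an 
assumption, not the real ReplicaInfo API):

{noformat}
// Illustrative only: getBytesOnDisk() stands in for the real accessor.
interface ReplicaSketch {
  long getBytesOnDisk();  // updated in lock-step with the crc file (per above)
}

class OnDiskLengthSketch {
  static long onDiskLength(ReplicaSketch replica, java.io.File blockFile) {
    // Preferred: the in-memory value, which stays consistent with the
    // checksum file; blockFile.length() could run ahead of the crc data.
    return replica.getBytesOnDisk();
  }
}
{noformat}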

 Datanode should serve up to visible length of a replica for read requests
 -

 Key: HDFS-585
 URL: https://issues.apache.org/jira/browse/HDFS-585
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node
Affects Versions: Append Branch
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: Append Branch

 Attachments: h585_20090903.patch, h585_20090904.patch, 
 h585_20090904b.patch


 As a part of the design in HDFS-265, datanodes should return all bytes within 
 the visible length of a replica to the DFSClient in order to support read 
 consistency.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-599) Improve Namenode robustness by prioritizing datanode heartbeats over client requests

2009-09-08 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752633#action_12752633
 ] 

dhruba borthakur commented on HDFS-599:
---

Maybe we can have two different ports that the Namenode listens on. The first 
port is the same as the current one, on which clients contact the namenode. The 
second port (the new one) will be used for the NamenodeProtocol and 
DatanodeProtocol (between the NN, DNs and the Secondary NN). This approach does 
not need any enhancement to the RPC layer. (It also allows the cluster 
administrator to avoid opening the second port to machines outside the 
cluster.) In the future, we can also assign higher Java thread priorities to 
the Handler threads that serve the second port.
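
A rough sketch of the two-port idea, using the 0.20-era {{RPC.getServer}} 
signature; the configuration key, port numbers and handler counts below are 
hypothetical, not existing settings:

{noformat}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;
import org.apache.hadoop.ipc.Server;

class TwoPortNameNodeSketch {
  Server clientServer;   // existing port: ClientProtocol
  Server serviceServer;  // new port: DatanodeProtocol / NamenodeProtocol

  void start(Object namenode, Configuration conf) throws IOException {
    clientServer = RPC.getServer(namenode, "0.0.0.0", 8020, 10, false, conf);
    // Second listener for datanodes and the secondary namenode only; the
    // administrator can firewall this port off from machines outside the
    // cluster, and its handler threads could later get a higher priority.
    int servicePort = conf.getInt("dfs.namenode.service.port", 8021);
    serviceServer =
        RPC.getServer(namenode, "0.0.0.0", servicePort, 10, false, conf);
    clientServer.start();
    serviceServer.start();
  }
}
{noformat}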

 Improve Namenode robustness by prioritizing datanode heartbeats over client 
 requests
 

 Key: HDFS-599
 URL: https://issues.apache.org/jira/browse/HDFS-599
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: dhruba borthakur
Assignee: dhruba borthakur

 The namenode processes RPC requests from clients that are reading/writing to 
 files as well as heartbeats/block reports from datanodes.
 Sometimes, for various reasons (Java GC runs, inconsistent performance of 
 the NFS filer that stores the HDFS transaction logs, etc.), the namenode 
 encounters transient slowness. For example, if the device that stores the 
 HDFS transaction logs becomes sluggish, the Namenode's ability to process 
 RPCs slows down to a certain extent. During this time, the RPCs from clients 
 as well as the RPCs from datanodes suffer in similar fashion. If the 
 underlying problem becomes worse, the NN's ability to process a heartbeat 
 from a DN is severely impacted, thus causing the NN to declare that the DN is 
 dead. Then the NN starts replicating blocks that used to reside on the 
 now-declared-dead datanode. This adds extra load to the NN. Then the 
 now-declared-dead datanode finally re-establishes contact with the NN and 
 sends a block report. The block report processing on the NN is another 
 heavyweight activity, thus causing more load on the already overloaded 
 namenode. 
 My proposal is that the NN should try its best to continue processing RPCs 
 from datanodes and give lower priority to serving client requests. The 
 Datanode RPCs are integral to the consistency and performance of the Hadoop 
 file system, and it is better to protect them at all costs. This will ensure 
 that the NN recovers from the hiccup much faster than it does now.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-602) Attempt to make a directory under an existing file on DistributedFileSystem should throw a FileAlreadyExitsException instead of FileNotFoundException

2009-09-08 Thread Boris Shkolnik (JIRA)
Attempt to make a directory under an existing file on DistributedFileSystem 
should throw a FileAlreadyExitsException instead of FileNotFoundException
-

 Key: HDFS-602
 URL: https://issues.apache.org/jira/browse/HDFS-602
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Boris Shkolnik


Attempt to make a directory under an existing file on DistributedFileSystem 
should throw a FileAlreadyExitsException instead of FileNotFoundException.
Also, we should unwrap this exception from RemoteException.
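
For the last sentence, a sketch of the kind of client-side unwrapping being 
asked for, assuming the {{RemoteException.unwrapRemoteException}} helper and 
the corrected exception class name proposed later in this issue:

{noformat}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileAlreadyExistsException;
import org.apache.hadoop.ipc.RemoteException;

class UnwrapSketch {
  // Convert the wrapped server-side exception back into its concrete type,
  // so callers can catch FileAlreadyExistsException directly.
  static IOException unwrap(RemoteException re) {
    return re.unwrapRemoteException(FileAlreadyExistsException.class,
                                    FileNotFoundException.class);
  }
}
{noformat}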

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-595) FsPermission tests need to be updated for new octal configuration parameter from HADOOP-6234

2009-09-08 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-595.
--

Resolution: Fixed

I committed the changes. Thanks Jakob.

 FsPermission tests need to be updated for new octal configuration parameter 
 from HADOOP-6234
 

 Key: HDFS-595
 URL: https://issues.apache.org/jira/browse/HDFS-595
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs client
Reporter: Jakob Homan
Assignee: Jakob Homan
 Fix For: 0.21.0

 Attachments: HDFS-595.patch, HDFS-595.patch


 HADOOP-6234 changed the format of the configuration umask value.  Tests that 
 use this value need to be updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-597) Modification introduced by HDFS-537 breaks an advice binding in FSDatasetAspects

2009-09-08 Thread Hairong Kuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752677#action_12752677
 ] 

Hairong Kuang commented on HDFS-597:


Cos, thanks for catching this. What I did in HDFS-537 was move 
createBlockWriteStream from FSDataset to ReplicaInPipeline.

 Modification introduced by HDFS-537 breaks an advice binding in 
 FSDatasetAspects
 ---

 Key: HDFS-597
 URL: https://issues.apache.org/jira/browse/HDFS-597
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: Append Branch
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik
 Attachments: HDFS-597.patch


 HDFS-537's patch removed the {{createBlockWriteStreams}} method, which was 
 bound by the {{FSDatasetAspects.callCreateBlockWriteStream}} pointcut.
 While this hasn't broken any tests, a number of JIRAs were reproduced by 
 the injection of this particular fault.
 The AJC compiler issues warnings during the build when something like the 
 above happens. These warnings have to be watched carefully.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-578) Support for using server default values for blockSize and replication when creating a file

2009-09-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752682#action_12752682
 ] 

Hudson commented on HDFS-578:
-

Integrated in Hadoop-Hdfs-trunk-Commit #21 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/21/])
HDFS-578. Add support for a new FileSystem method for clients to get server 
defaults. Contributed by Kan Zhang.


 Support for using server default values for blockSize and replication when 
 creating a file
 --

 Key: HDFS-578
 URL: https://issues.apache.org/jira/browse/HDFS-578
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client, name-node
Reporter: Kan Zhang
Assignee: Kan Zhang
 Fix For: 0.21.0

 Attachments: h578-13.patch, h578-14.patch, h578-16.patch


 This is a sub-task of HADOOP-4952. This improvement makes it possible for a 
 client to specify that it wants to use the server default values for 
 blockSize and replication params when creating a file.
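
 A sketch of how a client might use the new call, assuming the 
 {{FsServerDefaults}} accessors that this change and HADOOP-4952 introduce:

{noformat}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsServerDefaults;
import org.apache.hadoop.fs.Path;

class ServerDefaultsSketch {
  // Let the server, rather than client-side config, decide the block size
  // and replication for a newly created file.
  static void createWithServerDefaults(FileSystem fs, Path p)
      throws IOException {
    FsServerDefaults d = fs.getServerDefaults();
    fs.create(p, true, d.getFileBufferSize(),
              d.getReplication(), d.getBlockSize());
  }
}
{noformat}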

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-602) Attempt to make a directory under an existing file on DistributedFileSystem should throw a FileAlreadyExistsException instead of FileNotFoundException

2009-09-08 Thread Boris Shkolnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boris Shkolnik updated HDFS-602:


Description: 
Attempt to make a directory under an existing file on DistributedFileSystem 
should throw a FileAlreadyExistsException instead of FileNotFoundException.
Also, we should unwrap this exception from RemoteException.

  was:
Attempt to make a directory under an existing file on DistributedFileSystem 
should throw a FileAlreadyExitsException instead of FileNotFoundException.
Also, we should unwrap this exception from RemoteException.

Summary: Attempt to make a directory under an existing file on 
DistributedFileSystem should throw a FileAlreadyExistsException instead of 
FileNotFoundException  (was: Attempt to make a directory under an existing file 
on DistributedFileSystem should throw a FileAlreadyExitsException instead of 
FileNotFoundException)

 Attempt to make a directory under an existing file on DistributedFileSystem 
 should throw a FileAlreadyExistsException instead of FileNotFoundException
 --

 Key: HDFS-602
 URL: https://issues.apache.org/jira/browse/HDFS-602
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Boris Shkolnik

 Attempt to make a directory under an existing file on DistributedFileSystem 
 should throw a FileAlreadyExistsException instead of FileNotFoundException.
 Also, we should unwrap this exception from RemoteException.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-601) TestBlockReport should obtain data directories from MiniHDFSCluster

2009-09-08 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-601:


Status: Patch Available  (was: Open)

The trivial patch has been provided, so let's see what Hudson has to say about 
it :-)

 TestBlockReport should obtain data directories from MiniHDFSCluster
 ---

 Key: HDFS-601
 URL: https://issues.apache.org/jira/browse/HDFS-601
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.21.0, Append Branch
Reporter: Konstantin Shvachko
Assignee: Konstantin Boudnik
 Fix For: 0.21.0, Append Branch

 Attachments: HDFS-601.patch


 TestBlockReport relies on the test.build.data property being set in the 
 configuration, which is not always the case, e.g. when you run the test from 
 Eclipse. It would be better to get the data directories directly from the 
 mini-cluster.
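
 A sketch of the proposed direction; {{getDataDirectory()}} is an assumed 
 accessor name, standing in for whatever the mini-cluster actually exposes:

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

class DataDirSketch {
  static String dataDir() throws java.io.IOException {
    MiniDFSCluster cluster =
        new MiniDFSCluster(new Configuration(), 1, true, null);
    // Instead of System.getProperty("test.build.data"), which may be unset
    // when the test runs from an IDE, ask the cluster itself.
    return cluster.getDataDirectory();  // accessor name is an assumption
  }
}
{noformat}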

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-601) TestBlockReport should obtain data directories from MiniHDFSCluster

2009-09-08 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-601:


Attachment: HDFS-601.patch

Oops, my bad - the previous patch was made against the Append branch, which is 
slightly behind the trunk.
This one is for trunk and should apply cleanly.

 TestBlockReport should obtain data directories from MiniHDFSCluster
 ---

 Key: HDFS-601
 URL: https://issues.apache.org/jira/browse/HDFS-601
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.21.0, Append Branch
Reporter: Konstantin Shvachko
Assignee: Konstantin Boudnik
 Fix For: 0.21.0, Append Branch

 Attachments: HDFS-601.patch, HDFS-601.patch


 TestBlockReport relies on the test.build.data property being set in the 
 configuration, which is not always the case, e.g. when you run the test from 
 Eclipse. It would be better to get the data directories directly from the 
 mini-cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-603) mReplica related classes cannot be accessed

2009-09-08 Thread Tsz Wo (Nicholas), SZE (JIRA)
mReplica related classes cannot be accessed
---

 Key: HDFS-603
 URL: https://issues.apache.org/jira/browse/HDFS-603
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Affects Versions: Append Branch
Reporter: Tsz Wo (Nicholas), SZE
 Fix For: Append Branch


Replica related classes cannot be accessed above FSDatasetInterface.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-603) Most replica related classes cannot be accessed

2009-09-08 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-603:


Description: 
Currently, there are several replica related classes in 
org.apache.hadoop.hdfs.server.datanode:
- ReplicaInfo
- ReplicaInPipelineInterface
- ReplicaInPipeline
- ReplicaUnderRecovery
- ReplicaWaitingToBeRecovered
- ReplicaBeingWritten
- FinalizedReplica

All these classes cannot be accessed above FSDatasetInterface.

  was:Replica related classes cannot be accessed above FSDatasetInterface.

Summary: Most replica related classes cannot be accessed  (was: 
mReplica related classes cannot be accessed)

I suggest adding a new interface called Replica.  It will ultimately replace 
Block in the datanode package, since the namenode manages blocks but datanodes 
manage replicas.
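
A minimal sketch of what the suggested interface might look like; the exact 
method set is an assumption, not part of the proposal above:

{noformat}
// Hypothetical shape of the suggested Replica interface.
interface Replica {
  long getBlockId();
  long getGenerationStamp();
  long getNumBytes();       // length of the replica on this datanode
  long getVisibleLength();  // what readers are allowed to see
}
{noformat}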

 Most replica related classes cannot be accessed
 ---

 Key: HDFS-603
 URL: https://issues.apache.org/jira/browse/HDFS-603
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Affects Versions: Append Branch
Reporter: Tsz Wo (Nicholas), SZE
 Fix For: Append Branch


 Currently, there are several replica related classes in 
 org.apache.hadoop.hdfs.server.datanode:
 - ReplicaInfo
 - ReplicaInPipelineInterface
 - ReplicaInPipeline
 - ReplicaUnderRecovery
 - ReplicaWaitingToBeRecovered
 - ReplicaBeingWritten
 - FinalizedReplica
 All these classes cannot be accessed above FSDatasetInterface.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-235) Add support for byte-ranges to hftp

2009-09-08 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-235:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I committed this. Thank you Bill.

 Add support for byte-ranges to hftp
 ---

 Key: HDFS-235
 URL: https://issues.apache.org/jira/browse/HDFS-235
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 0.21.0
Reporter: Venkatesh S
Assignee: Bill Zeller
 Fix For: 0.21.0

 Attachments: hdfs-235-1.patch, hdfs-235-2.patch, hdfs-235-3.patch


 Support should be similar to HTTP byte-serving.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-602) Attempt to make a directory under an existing file on DistributedFileSystem should throw a FileAlreadyExistsException instead of FileNotFoundException

2009-09-08 Thread Boris Shkolnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boris Shkolnik updated HDFS-602:


Attachment: HDFS-602.patch

 Attempt to make a directory under an existing file on DistributedFileSystem 
 should throw a FileAlreadyExistsException instead of FileNotFoundException
 --

 Key: HDFS-602
 URL: https://issues.apache.org/jira/browse/HDFS-602
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Boris Shkolnik
 Attachments: HDFS-602.patch


 Attempt to make a directory under an existing file on DistributedFileSystem 
 should throw a FileAlreadyExistsException instead of FileNotFoundException.
 Also, we should unwrap this exception from RemoteException.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-202) Add a bulk FileSystem.getFileBlockLocations

2009-09-08 Thread Sanjay Radia (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752748#action_12752748
 ] 

Sanjay Radia commented on HDFS-202:
---

Is the optimization for sending only partial block reports really necessary? 
Most files have very few blocks ...
Also, Arun's point about needing an extra call for getFileStatus() is valid.

Why not create a class called DetailedFileStatus which contains both the file 
status and block locations:


DetailedFileStatus[] statuses = getBlockLocations(Path[] paths);  // 1:1 
mapping between the two arrays, as Doug suggested.

We can add the range version later if we really need that optimization.
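
A sketch of what the proposed class might look like; neither it nor the bulk 
call exists yet, so every name here is a placeholder for the proposal above:

{noformat}
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;

// Placeholder implementation of the proposed DetailedFileStatus.
class DetailedFileStatus {
  final FileStatus status;          // the usual per-file metadata
  final BlockLocation[] locations;  // block locations for the whole file

  DetailedFileStatus(FileStatus status, BlockLocation[] locations) {
    this.status = status;
    this.locations = locations;
  }
}
{noformat}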

 Add a bulk FileSystem.getFileBlockLocations
 ---

 Key: HDFS-202
 URL: https://issues.apache.org/jira/browse/HDFS-202
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Arun C Murthy
Assignee: Jakob Homan

 Currently map-reduce applications (specifically file-based input-formats) use 
 FileSystem.getFileBlockLocations to compute splits. However they are forced 
 to call it once per file.
 The downsides are multiple:
# Even with a few thousand files to process, the number of RPCs quickly 
 starts getting noticeable
# The current implementation of getFileBlockLocations is too slow, since 
 each call results in a 'search' in the namesystem. Assuming a few thousand 
 input files, it results in that many RPCs and 'searches'.
 It would be nice to have a FileSystem.getFileBlockLocations which can take in 
 a directory, and return the block-locations for all files in that directory. 
 We could eliminate both the per-file RPC and also the 'search' by a 'scan'.
 When I tested this for terasort, a moderate job with 8000 input files, the 
 runtime halved from the current 8s to 4s. Clearly this is much more important 
 for latency-sensitive applications...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-595) FsPermission tests need to be updated for new octal configuration parameter from HADOOP-6234

2009-09-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752705#action_12752705
 ] 

Hudson commented on HDFS-595:
-

Integrated in Hadoop-Hdfs-trunk-Commit #22 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/22/])
HDFS-595. umask settings in configuration may now use octal or symbolic values 
instead of decimal. Contributed by Jakob Homan.


 FsPermission tests need to be updated for new octal configuration parameter 
 from HADOOP-6234
 

 Key: HDFS-595
 URL: https://issues.apache.org/jira/browse/HDFS-595
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs client
Reporter: Jakob Homan
Assignee: Jakob Homan
 Fix For: 0.21.0

 Attachments: HDFS-595.patch, HDFS-595.patch


 HADOOP-6234 changed the format of the configuration umask value.  Tests that 
 use this value need to be updated.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-601) TestBlockReport should obtain data directories from MiniHDFSCluster

2009-09-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752721#action_12752721
 ] 

Hadoop QA commented on HDFS-601:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418957/HDFS-601.patch
  against trunk revision 812656.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/16/console

This message is automatically generated.

 TestBlockReport should obtain data directories from MiniHDFSCluster
 ---

 Key: HDFS-601
 URL: https://issues.apache.org/jira/browse/HDFS-601
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.21.0, Append Branch
Reporter: Konstantin Shvachko
Assignee: Konstantin Boudnik
 Fix For: 0.21.0, Append Branch

 Attachments: HDFS-601.patch


 TestBlockReport relies on the test.build.data property being set in the 
 configuration, which is not always the case, e.g. when you run the test from 
 Eclipse. It would be better to get the data directories directly from the 
 mini-cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-599) Improve Namenode robustness by prioritizing datanode heartbeats over client requests

2009-09-08 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752779#action_12752779
 ] 

Allen Wittenauer commented on HDFS-599:
---

I think separation of the two ports is a good idea.  This has a number of 
advantages:

- it makes it easier to segregate pure clients from the rest of the name node 
via a firewall
- it means that client requests could potentially be encrypted, leaving pure DN 
ops unencrypted
- it allows for easier implementation of Quality of Service (QoS) 
configurations at the network layer [in particular, MR client requests are 
second-class traffic compared to data node requests from the same IP addr.]

I'm sure there are more. 

 Improve Namenode robustness by prioritizing datanode heartbeats over client 
 requests
 

 Key: HDFS-599
 URL: https://issues.apache.org/jira/browse/HDFS-599
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: dhruba borthakur
Assignee: dhruba borthakur

 The namenode processes RPC requests from clients that are reading/writing to 
 files as well as heartbeats/block reports from datanodes.
 Sometimes, for various reasons (Java GC runs, inconsistent performance of 
 the NFS filer that stores the HDFS transaction logs, etc.), the namenode 
 encounters transient slowness. For example, if the device that stores the 
 HDFS transaction logs becomes sluggish, the Namenode's ability to process 
 RPCs slows down to a certain extent. During this time, the RPCs from clients 
 as well as the RPCs from datanodes suffer in similar fashion. If the 
 underlying problem becomes worse, the NN's ability to process a heartbeat 
 from a DN is severely impacted, thus causing the NN to declare that the DN is 
 dead. Then the NN starts replicating blocks that used to reside on the 
 now-declared-dead datanode. This adds extra load to the NN. Then the 
 now-declared-dead datanode finally re-establishes contact with the NN and 
 sends a block report. The block report processing on the NN is another 
 heavyweight activity, thus causing more load on the already overloaded 
 namenode. 
 My proposal is that the NN should try its best to continue processing RPCs 
 from datanodes and give lower priority to serving client requests. The 
 Datanode RPCs are integral to the consistency and performance of the Hadoop 
 file system, and it is better to protect them at all costs. This will ensure 
 that the NN recovers from the hiccup much faster than it does now.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-235) Add support for byte-ranges to hftp

2009-09-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752802#action_12752802
 ] 

Hudson commented on HDFS-235:
-

Integrated in Hadoop-Hdfs-trunk-Commit #24 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/24/])
HDFS-235. Add support for byte ranges in HftpFileSystem to serve a range of 
bytes from a file. Contributed by Bill Zeller.


 Add support for byte-ranges to hftp
 ---

 Key: HDFS-235
 URL: https://issues.apache.org/jira/browse/HDFS-235
 Project: Hadoop HDFS
  Issue Type: New Feature
Affects Versions: 0.21.0
Reporter: Venkatesh S
Assignee: Bill Zeller
 Fix For: 0.21.0

 Attachments: hdfs-235-1.patch, hdfs-235-2.patch, hdfs-235-3.patch


 Support should be similar to HTTP byte-serving.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-576) Extend Block report to include under-construction replicas

2009-09-08 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko resolved HDFS-576.
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]

I committed this to the append branch.

 Extend Block report to include under-construction replicas
 --

 Key: HDFS-576
 URL: https://issues.apache.org/jira/browse/HDFS-576
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, name-node
Affects Versions: Append Branch
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: Append Branch

 Attachments: BlockReport.htm, NewBlockReport.patch, 
 NewBlockReport.patch


 Current data-node block reports include only finalized (in append terminology) 
 blocks. Data-nodes should report all block replicas except for the temporary 
 ones, so that clients can read from incomplete replicas and block recovery 
 becomes possible.
 The attached design document goes into more detail on the new block reports.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-604) Block report processing for append

2009-09-08 Thread Konstantin Shvachko (JIRA)
Block report processing for append
--

 Key: HDFS-604
 URL: https://issues.apache.org/jira/browse/HDFS-604
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Append Branch
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: Append Branch


Implement new block report processing on the name-node as stated in the append 
design and HDFS-576.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-604) Block report processing for append

2009-09-08 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752810#action_12752810
 ] 

Konstantin Shvachko commented on HDFS-604:
--

This is the link to the block report design document.
https://issues.apache.org/jira/secure/attachment/12417932/BlockReport.htm

 Block report processing for append
 --

 Key: HDFS-604
 URL: https://issues.apache.org/jira/browse/HDFS-604
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Append Branch
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: Append Branch


 Implement new block report processing on the name-node as stated in the 
 append design and HDFS-576.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-605) There's no need to run fault-injection tests in the 'run-test-hdfs-with-mr' target

2009-09-08 Thread Konstantin Boudnik (JIRA)
There's no need to run fault-injection tests in the 'run-test-hdfs-with-mr' target


 Key: HDFS-605
 URL: https://issues.apache.org/jira/browse/HDFS-605
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: build, test
Reporter: Konstantin Boudnik


It turns out that running fault-injection tests doesn't make any sense when 
the {{run-test-hdfs-with-mr}} target is being executed. Thus, {{build.xml}} has 
to be modified.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-605) There's no need to run fault-injection tests in the 'run-test-hdfs-with-mr' target

2009-09-08 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-605:


Attachment: HDFS-605.patch

The patch solves the problem by removing all 'with-mr' related code from 
the fault-injection context.

 There's no need to run fault-injection tests in the 'run-test-hdfs-with-mr' target
 

 Key: HDFS-605
 URL: https://issues.apache.org/jira/browse/HDFS-605
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: build, test
Reporter: Konstantin Boudnik
 Attachments: HDFS-605.patch


 It turns out that running fault-injection tests doesn't make any sense when 
 the {{run-test-hdfs-with-mr}} target is being executed. Thus, {{build.xml}} 
 has to be modified.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-605) There's no need to run fault-injection tests in the 'run-test-hdfs-with-mr' target

2009-09-08 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-605:


Affects Version/s: 0.21.0
Fix Version/s: 0.21.0
 Assignee: Konstantin Boudnik
 Hadoop Flags: [Reviewed]

+1

 There's no need to run fault-injection tests in the 'run-test-hdfs-with-mr' target
 

 Key: HDFS-605
 URL: https://issues.apache.org/jira/browse/HDFS-605
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.21.0
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik
 Fix For: 0.21.0

 Attachments: HDFS-605.patch


 It turns out that running fault-injection tests doesn't make any sense when 
 the {{run-test-hdfs-with-mr}} target is being executed. Thus, {{build.xml}} 
 has to be modified.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-602) Attempt to make a directory under an existing file on DistributedFileSystem should throw a FileAlreadyExistsException instead of FileNotFoundException

2009-09-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752864#action_12752864
 ] 

Hadoop QA commented on HDFS-602:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12418972/HDFS-602.patch
  against trunk revision 812701.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/17/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/17/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/17/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/17/console

This message is automatically generated.

 Attempt to make a directory under an existing file on DistributedFileSystem 
 should throw a FileAlreadyExistsException instead of FileNotFoundException
 --

 Key: HDFS-602
 URL: https://issues.apache.org/jira/browse/HDFS-602
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Boris Shkolnik
Assignee: Boris Shkolnik
 Attachments: HDFS-602.patch


 Attempt to make a directory under an existing file on DistributedFileSystem 
 should throw a FileAlreadyExistsException instead of FileNotFoundException.
 Also, we should unwrap this exception from RemoteException.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-606) ConcurrentModificationException in invalidateCorruptReplicas()

2009-09-08 Thread Konstantin Shvachko (JIRA)
ConcurrentModificationException in invalidateCorruptReplicas()
--

 Key: HDFS-606
 URL: https://issues.apache.org/jira/browse/HDFS-606
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.21.0
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: 0.21.0


{{BlockManager.invalidateCorruptReplicas()}} iterates over DatanodeDescriptor-s 
while removing corrupt replicas from the descriptors. This causes a 
{{ConcurrentModificationException}} if there is more than one replica of the 
block. I ran into this exception while debugging different scenarios in append, 
but it should be fixed in the trunk too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-606) ConcurrentModificationException in invalidateCorruptReplicas()

2009-09-08 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-606:
-

Status: Patch Available  (was: Open)

 ConcurrentModificationException in invalidateCorruptReplicas()
 --

 Key: HDFS-606
 URL: https://issues.apache.org/jira/browse/HDFS-606
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.21.0
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: 0.21.0

 Attachments: CMEinCorruptReplicas.patch


 {{BlockManager.invalidateCorruptReplicas()}} iterates over 
 DatanodeDescriptor-s while removing corrupt replicas from the descriptors. 
 This causes a {{ConcurrentModificationException}} if there is more than one 
 replica of the block. I ran into this exception while debugging different 
 scenarios in append, but it should be fixed in the trunk too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-606) ConcurrentModificationException in invalidateCorruptReplicas()

2009-09-08 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-606:
-

Attachment: CMEinCorruptReplicas.patch

This is a rather obvious patch, which makes a copy of the list before iterating.
It should not require any new tests, because this condition is hardly testable 
at all.
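
A generic sketch of the copy-before-iterate pattern (not the actual 
BlockManager code): iterating over a snapshot lets the loop body remove 
entries from the live collection without triggering a 
ConcurrentModificationException.

{noformat}
import java.util.ArrayList;
import java.util.Collection;

class CopyBeforeIterateSketch {
  static <T> void removeEach(Collection<T> live) {
    // Iterate the copy; mutate the original. The live collection's own
    // iterator would throw ConcurrentModificationException on removal.
    for (T item : new ArrayList<T>(live)) {
      live.remove(item);
    }
  }
}
{noformat}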

 ConcurrentModificationException in invalidateCorruptReplicas()
 --

 Key: HDFS-606
 URL: https://issues.apache.org/jira/browse/HDFS-606
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.21.0
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: 0.21.0

 Attachments: CMEinCorruptReplicas.patch


 {{BlockManager.invalidateCorruptReplicas()}} iterates over 
 DatanodeDescriptor-s while removing corrupt replicas from the descriptors. 
 This causes a {{ConcurrentModificationException}} if there is more than one 
 replica of the block. I ran into this exception while debugging different 
 scenarios in append, but it should be fixed in the trunk too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-605) There's no need to run fault-injection tests in the 'run-test-hdfs-with-mr' target

2009-09-08 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated HDFS-605:


Status: Patch Available  (was: Open)

All tests have passed as expected. The sequence of test executions is like this:
{noformat}
run-test-hdfs:
...
run-test-hdfs-with-mr:
...
run-test-hdfs-fault-inject:
...
run-test-hdfs:
[mkdir] Created dir: /home/cos/work/Hdfs/build-fi/test/data
[mkdir] Created dir: /home/cos/work/Hdfs/build-fi/test/logs
 [copy] Copying 1 file to /home/cos/work/Hdfs/build-fi/test/extraconf
 [copy] Copying 1 file to /home/cos/work/Hdfs/build-fi/test/extraconf
[junit] Running 
org.apache.hadoop.hdfs.server.datanode.TestFiDataTransferProtocol
[junit] Tests run: 16, Failures: 0, Errors: 0, Time elapsed: 243.421 sec

checkfailure:

BUILD SUCCESSFUL
{noformat}

so no 'with-mr' tests are run in fault-injection mode.

 There's no need to run fault-injection tests in the 'run-test-hdfs-with-mr' target
 

 Key: HDFS-605
 URL: https://issues.apache.org/jira/browse/HDFS-605
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: build, test
Affects Versions: 0.21.0
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik
 Fix For: 0.21.0

 Attachments: HDFS-605.patch


 It turns out that running fault-injection tests doesn't make any sense when 
 the {{run-test-hdfs-with-mr}} target is being executed. Thus, {{build.xml}} 
 has to be modified.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-606) ConcurrentModificationException in invalidateCorruptReplicas()

2009-09-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752911#action_12752911
 ] 

Hadoop QA commented on HDFS-606:


-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12419013/CMEinCorruptReplicas.patch
  against trunk revision 812701.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/18/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/18/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/18/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/18/console

This message is automatically generated.

 ConcurrentModificationException in invalidateCorruptReplicas()
 --

 Key: HDFS-606
 URL: https://issues.apache.org/jira/browse/HDFS-606
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.21.0
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: 0.21.0

 Attachments: CMEinCorruptReplicas.patch


 {{BlockManager.invalidateCorruptReplicas()}} iterates over 
 DatanodeDescriptor-s while removing corrupt replicas from the descriptors. 
 This causes a {{ConcurrentModificationException}} if there is more than one 
 replica of the block. I ran into this exception while debugging different 
 scenarios in append, but it should be fixed in the trunk too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.