[jira] [Commented] (HDFS-5462) Fail to compile in Branch HDFS-2832 with COMPILATION ERROR
[ https://issues.apache.org/jira/browse/HDFS-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813736#comment-13813736 ] Junping Du commented on HDFS-5462: -- Hi Wenwu, these should be warnings rather than errors, and they are not caused by any changes on HDFS-2832. Please check your build env.

Fail to compile in Branch HDFS-2832 with COMPILATION ERROR
---
Key: HDFS-5462 URL: https://issues.apache.org/jira/browse/HDFS-5462 Project: Hadoop HDFS Issue Type: Bug Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: wenwupeng

Failed to compile HDFS in branch HDFS-2832 with a COMPILATION ERROR ("OutputFormat is Sun proprietary API and may be removed in a future release"):

[INFO] Compiling 276 source files to /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/target/classes
[ERROR] COMPILATION ERROR :
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[32,48] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[33,48] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java:[337,34] unreported exception java.io.IOException; must be caught or declared to be thrown
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[134,41] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[135,14] sun.misc.Cleaner is Sun proprietary API and may be removed in a future release
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[136,22] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,4] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,33] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,4] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,35] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release
[INFO] 10 errors
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5443) Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file.
[ https://issues.apache.org/jira/browse/HDFS-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813747#comment-13813747 ] Jing Zhao commented on HDFS-5443: -
bq. Here actual problem is not containing the 0-sized blocks, but counting them also in safemode threshold as these are loaded as COMPLETE blocks
Agree. But in the meanwhile, we should also clear these 0-sized blocks, since if the corresponding file is only in a snapshot, I guess no one will finalize the block. That's why I think maybe we should fix this part in a separate jira. For the safemode part, as Vinay mentioned, the key issue is still that the current code, while loading the fsimage, fails to recognize an INodeFileUC if the file is in a snapshot and the deletion was on its parent/ancestral directory. I think HDFS-5428 can solve the problem, but it may be overkill, because in the current HDFS-5428 patch we need to keep records in the lease map and maintain these records even across snapshot deletion and renaming. Since the safemode issue only happens when starting the NN, can we fix the problem by:
1. recording extra information in the fsimage to indicate INodeFileUC that are only in snapshots
2. re-generating all the INodeFileUC when loading the fsimage
3. using a similar workaround as in HDFS-5283
For 1 and 2, we need to cover the files that are deleted through an ancestral directory. To avoid fsimage incompatibility, we can put the extra information into the under-construction-files section of the fsimage.

Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file. Key: HDFS-5443 URL: https://issues.apache.org/jira/browse/HDFS-5443 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Affects Versions: 3.0.0, 2.2.0 Reporter: Uma Maheswara Rao G Assignee: sathish

This issue is reported by Prakash and Sathish. On looking into the issue, the following things are happening:
1) Client added a block at the NN and just did logsync, so the NN has the block ID persisted.
2) Before returning the addBlock response to the client, take a snapshot of the root or a parent directory of that file.
3) Delete the parent directory of that file.
4) Now crash the NN without responding success to the client for that addBlock call.
Now on restart, the Namenode will get stuck in safemode.
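The stuck-in-safemode symptom comes down to the safe-block threshold arithmetic: the 0-sized blocks are loaded from the fsimage as COMPLETE and counted in the total, but no DataNode ever reports them as safe. A minimal sketch of that check, with our own names rather than the actual NameNode code:

```java
// Illustrative sketch (names are ours, not the NN's) of why phantom
// 0-sized COMPLETE blocks keep the NameNode in safemode: safemode exits
// only once reported (safe) blocks reach threshold * total blocks.
public class SafeModeCheck {
    static boolean canLeaveSafeMode(long safeBlocks, long totalBlocks, double threshold) {
        return safeBlocks >= (long) Math.ceil(threshold * totalBlocks);
    }

    public static void main(String[] args) {
        // 100 real blocks, all reported by DataNodes.
        System.out.println(canLeaveSafeMode(100, 100, 0.999)); // prints "true"
        // Same 100 reported blocks, plus 2 phantom 0-sized blocks loaded
        // as COMPLETE from the fsimage but existing on no DataNode.
        System.out.println(canLeaveSafeMode(100, 102, 0.999)); // prints "false"
    }
}
```

With the default-style threshold of 0.999, even two unreported phantom blocks are enough to keep the check false forever.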
[jira] [Updated] (HDFS-5443) Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file.
[ https://issues.apache.org/jira/browse/HDFS-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5443: Attachment: 5443-test.patch
Uploaded unit tests to reproduce the issue while clearing the 0-sized blocks.
[jira] [Commented] (HDFS-5443) Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file.
[ https://issues.apache.org/jira/browse/HDFS-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813770#comment-13813770 ] Uma Maheswara Rao G commented on HDFS-5443: ---
{quote} problem of 0-sized blocks is there with normal files also (HDFS-4516), but that will not cause in NN safemode because file will be an under construction file and 0-sized block will not be counted in safemode threshold. {quote}
Yep, this JIRA explains the same. Please see the description and first comment.
{quote} but counting them also in safemode threshold as these are loaded as COMPLETE blocks {quote}
The point here was that we don't need to keep them in snapshotted files (there was an inconsistency in the flow). If there is a simple way to wipe out all of a file's 0-sized blocks consistently, that would be good for addressing this. Anyway, maintaining leases may solve it, as that would be the same as a normal under-construction file. Let Sathish verify this with that patch. But I am a little uncomfortable with managing leases for snapshotted files, as they are read-only files with no need for leases. If all others are ok on that point, I will not object.
[jira] [Commented] (HDFS-5443) Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file.
[ https://issues.apache.org/jira/browse/HDFS-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813776#comment-13813776 ] Uma Maheswara Rao G commented on HDFS-5443: --- Oh, I did not see your comment. Thanks Jing for the patch.
{quote} I think HDFS-5428 can solve the problem, but it may overkill the problem because in the current HDFS-5428 patch we need to keep records in the lease map, and maintain these records even for snapshot deletion and renaming. {quote}
Exactly. This is what I was trying to indicate with my comment above.
[jira] [Commented] (HDFS-5443) Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file.
[ https://issues.apache.org/jira/browse/HDFS-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813777#comment-13813777 ] Vinay commented on HDFS-5443: -
bq. 1. recording extra information in fsimage to indicate INodeFileUC that are only in snapshots
This extra information is kept only as snapshot leases. It will be tracked all the time instead of only at checkpoint time.
bq. 2. re-generating all the INodeFileUC when loading fsimage
This will happen while loading the leases, and the blocksmap will also be updated with the UNDER_CONSTRUCTION state.
bq. 3. using a similar workaround as in HDFS-5283.
As we are already excluding under-construction blocks, this workaround is no longer required.
bq. To avoid the incompatibility of fsimage, we can put the extra information to the under construction files section of the fsimage.
Yes, exactly for this reason I went for the approach of storing these files as leases: this section is stored from the leases, and the leases are loaded from this section.
[jira] [Commented] (HDFS-5458) Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs
[ https://issues.apache.org/jira/browse/HDFS-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813782#comment-13813782 ] Hadoop QA commented on HDFS-5458: -
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612086/HDFS-5458-1.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5335//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5335//console
This message is automatically generated.
Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs
--
Key: HDFS-5458 URL: https://issues.apache.org/jira/browse/HDFS-5458 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Mike Mellenthin Attachments: HDFS-5458-1.patch

Saw a stacktrace of datanode startup with a bad volume, where even listing directories would throw an IOException. The failed volume threshold was set to 1, but startup would fatally error out in {{File#getCanonicalPath}} in {{getDataDirsFromURIs}}:
{code}
File dir = new File(dirURI.getPath());
try {
  dataNodeDiskChecker.checkDir(localFS, new Path(dir.toURI()));
  dirs.add(dir);
} catch (IOException ioe) {
  LOG.warn("Invalid " + DFS_DATANODE_DATA_DIR_KEY + " " + dir + " : ", ioe);
  invalidDirs.append("\"").append(dir.getCanonicalPath()).append("\" ");
}
{code}
Since {{getCanonicalPath}} can need to do I/O and thus throw an IOException, this catch clause doesn't properly protect startup from a failed volume.
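One way to keep a second I/O failure from escaping the catch block is to resolve the directory name defensively. A minimal sketch of that pattern (the helper name is ours, not the actual HDFS-5458 patch):

```java
import java.io.File;
import java.io.IOException;

public class SafeDirName {
    // Hypothetical helper, not the HDFS-5458 fix itself: produce a
    // directory name for an error message without letting a second I/O
    // failure escape. File#getCanonicalPath() may do I/O and throw
    // IOException; File#getAbsolutePath() never touches the disk.
    static String safeName(File dir) {
        try {
            return dir.getCanonicalPath();
        } catch (IOException ioe) {
            return dir.getAbsolutePath();
        }
    }

    public static void main(String[] args) {
        // Works even for paths that do not exist on this machine.
        System.out.println(safeName(new File("subdir/leaf")).endsWith("leaf"));
    }
}
```

Called inside the catch clause above, this keeps a failed volume from turning a warning into a fatal startup error.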
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813784#comment-13813784 ] Konstantin Shvachko commented on HDFS-2832: --- UUID#randomUUID generates RFC-4122 compliant UUIDs, which are unique *for all practical purposes*. RFC-4122 has a special note about distributed applications. But let's just think about it in general. randomUUID is based on a pseudo-random sequence of numbers, which is like a Mobius strip or just a loop. It actually works well if you generate IDs on a single node, because the sequence lasts long without repetitions. In our case we initiate thousands of pseudo-random sequences (one per node), each starting from a random number. Let's mark those starting numbers on the Mobius strip or the loop. Then we have actually decreased the probability of uniqueness, because now, in order to get a collision, one of the nodes only needs to reach the starting point of another node, rather than going all the way around the loop. So in a distributed environment we increase the probability of collision with each new node added. And when you add more storage types per node, you further increase the collision probability. "For all practical purposes", as I understand it, in this case means that the probability of non-unique IDs is low. But it does not mean impossible. The consequences of a storageID collision are pretty bad, and hard to detect and recover from. At the same time {{DataNode.createNewStorageId()}} generates unique IDs as of today. Why change it to a problematic approach? Part of the rationale is in HDFS-5115: making them UUIDs simplifies the generation logic. Looks like HDFS-5115 was based on an incomplete assumption:
bq. The Storage ID is currently generated from the DataNode's IP+Port+Random components
while in fact it also includes currentTime, which guarantees the uniqueness of IDs generated on the same node, unless somebody resets the machine clock to the past.

Enable support for heterogeneous storages in HDFS
-
Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch

HDFS currently supports a configuration where storages are a list of directories. Typically each of these directories corresponds to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose changing the current model, where a Datanode *is a* storage, to one where a Datanode *is a collection* of storages.
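To make the two ID schemes under discussion concrete, here is an illustrative sketch; the names and exact format are ours, not the actual DataNode code. The legacy-style ID mixes host, port, a random component, and wall-clock time, so two IDs minted on the same node cannot collide unless the clock moves backwards, while the UUID approach rests on randomness alone:

```java
import java.security.SecureRandom;
import java.util.UUID;

public class StorageIds {
    private static final SecureRandom RNG = new SecureRandom();

    // Legacy-style ID: host + port + random + current time. Uniqueness
    // per node is guaranteed by the timestamp component.
    static String legacyStyleId(String ip, int port) {
        return "DS-" + RNG.nextLong() + "-" + ip + "-" + port + "-"
                + System.currentTimeMillis();
    }

    // UUID-style ID: uniqueness rests entirely on the randomness of the
    // generator, which is the property questioned in the comment above.
    static String uuidStyleId() {
        return "DS-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        System.out.println(legacyStyleId("127.0.0.1", 50010));
        System.out.println(uuidStyleId());
    }
}
```

The sketch is only meant to show where the uniqueness guarantee comes from in each scheme, not to reproduce the real {{DataNode.createNewStorageId()}} format.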
[jira] [Commented] (HDFS-5427) not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart
[ https://issues.apache.org/jira/browse/HDFS-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813845#comment-13813845 ] Hudson commented on HDFS-5427: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #383 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/383/]) HDFS-5427. Not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart. Contributed by Vinay. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1538875)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotBlocksMap.java

not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart
-
Key: HDFS-5427 URL: https://issues.apache.org/jira/browse/HDFS-5427 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Affects Versions: 3.0.0, 2.2.0 Reporter: Vinay Assignee: Vinay Priority: Blocker Fix For: 2.3.0 Attachments: HDFS-5427-v2.patch, HDFS-5427.patch, HDFS-5427.patch

1. allow snapshots under dir /foo
2. create a file /foo/bar
3. create a snapshot s1 under /foo
4. delete the file /foo/bar
5. wait till checkpoint or do saveNamespace
6. restart NN
7. now try to read the file from snapshot /foo/.snapshot/s1/bar
The client will get a BlockMissingException. The reason is that while loading the deleted file list for a snapshottable dir from the fsimage, the blocks were not updated in the blocksmap.
[jira] [Commented] (HDFS-5456) NameNode startup progress creates new steps if caller attempts to create a counter for a step that doesn't already exist.
[ https://issues.apache.org/jira/browse/HDFS-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813846#comment-13813846 ] Hudson commented on HDFS-5456: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #383 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/383/]) HDFS-5456. NameNode startup progress creates new steps if caller attempts to create a counter for a step that doesn't already exist. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1538872)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/StartupProgress.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/TestStartupProgress.java

NameNode startup progress creates new steps if caller attempts to create a counter for a step that doesn't already exist.
-
Key: HDFS-5456 URL: https://issues.apache.org/jira/browse/HDFS-5456 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0, 2.2.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Priority: Critical Fix For: 3.0.0, 2.2.1 Attachments: HDFS-5456.1.patch

NameNode startup progress is supposed to be immutable after startup has completed. All methods are coded to ignore update attempts after startup has completed. However, {{StartupProgress#getCounter}} does not implement this correctly. If a caller attempts to get a counter for a new step that hasn't been seen before, then the method accidentally creates the step. This allocates additional space in the internal tracking data structures, so ultimately this is a memory leak.
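The bug class described above is a getter with get-or-create semantics that silently allocates tracking state for unknown keys even after the structure is supposed to be frozen. An illustrative sketch (our own names, not the actual StartupProgress code) of both the leaky pattern and the fix:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the real StartupProgress class.
public class ProgressCounters {
    private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();
    private volatile boolean frozen = false;

    void freeze() { frozen = true; }

    // Leaky pattern: computeIfAbsent creates the step as a side effect of
    // a read, growing the internal map for every unknown key.
    AtomicLong getCounterLeaky(String step) {
        return counters.computeIfAbsent(step, s -> new AtomicLong());
    }

    // Fixed pattern: after freezing, unknown steps get a throwaway counter
    // and the internal tracking map is left untouched.
    AtomicLong getCounter(String step) {
        AtomicLong c = counters.get(step);
        if (c == null) {
            if (frozen) return new AtomicLong(); // updates are ignored, not tracked
            c = counters.computeIfAbsent(step, s -> new AtomicLong());
        }
        return c;
    }

    int trackedSteps() { return counters.size(); }

    public static void main(String[] args) {
        ProgressCounters p = new ProgressCounters();
        p.freeze();
        p.getCounter("phantom").incrementAndGet(); // goes to a throwaway counter
        System.out.println(p.trackedSteps()); // prints "0": no step was created
    }
}
```

Returning a throwaway counter keeps callers working while guaranteeing the frozen structure never grows, which is the behavior the HDFS-5456 description calls for.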
[jira] [Commented] (HDFS-5427) not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart
[ https://issues.apache.org/jira/browse/HDFS-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813899#comment-13813899 ] Hudson commented on HDFS-5427: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1600 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1600/]) HDFS-5427. Not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart. Contributed by Vinay. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1538875)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotBlocksMap.java
[jira] [Commented] (HDFS-5456) NameNode startup progress creates new steps if caller attempts to create a counter for a step that doesn't already exist.
[ https://issues.apache.org/jira/browse/HDFS-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813900#comment-13813900 ] Hudson commented on HDFS-5456: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1600 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1600/]) HDFS-5456. NameNode startup progress creates new steps if caller attempts to create a counter for a step that doesn't already exist. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1538872)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/StartupProgress.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/TestStartupProgress.java
[jira] [Created] (HDFS-5463) NameNode should limit to number of blocks per file
Vinay created HDFS-5463: --- Summary: NameNode should limit to number of blocks per file Key: HDFS-5463 URL: https://issues.apache.org/jira/browse/HDFS-5463 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinay Assignee: Vinay

Currently there is no limit to the number of blocks a user can write to a file, and the blocksize can also be set to the minimum possible. A user can write any number of blocks continuously, which may create problems for the NameNode's performance and service as the number of blocks in the file increases, because each time a new block is allocated, all blocks of the file are persisted; this can cause serious performance degradation. So the proposal is to limit the maximum number of blocks a user can write to a file, maybe 1024 blocks (with a 128 MB block size, the maximum file size would then be 128 GB).
[jira] [Updated] (HDFS-5463) NameNode should limit the number of blocks per file
[ https://issues.apache.org/jira/browse/HDFS-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay updated HDFS-5463: Summary: NameNode should limit the number of blocks per file (was: NameNode should limit to number of blocks per file)
[jira] [Commented] (HDFS-5427) not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart
[ https://issues.apache.org/jira/browse/HDFS-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813917#comment-13813917 ] Hudson commented on HDFS-5427: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1574 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1574/]) HDFS-5427. Not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart. Contributed by Vinay. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1538875)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotBlocksMap.java
[jira] [Commented] (HDFS-5456) NameNode startup progress creates new steps if caller attempts to create a counter for a step that doesn't already exist.
[ https://issues.apache.org/jira/browse/HDFS-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813918#comment-13813918 ] Hudson commented on HDFS-5456: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1574 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1574/]) HDFS-5456. NameNode startup progress creates new steps if caller attempts to create a counter for a step that doesn't already exist. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1538872)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/StartupProgress.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/startupprogress/TestStartupProgress.java
[jira] [Commented] (HDFS-5463) NameNode should limit the number of blocks per file
[ https://issues.apache.org/jira/browse/HDFS-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813940#comment-13813940 ] Uma Maheswara Rao G commented on HDFS-5463: --- Hi Vinay, I think this is already addressed. Please see this parameter: {code} public static final String DFS_NAMENODE_MAX_BLOCKS_PER_FILE_KEY = "dfs.namenode.fs-limits.max-blocks-per-file"; public static final long DFS_NAMENODE_MAX_BLOCKS_PER_FILE_DEFAULT = 1024*1024; {code} {code} if (pendingFile.getBlocks().length >= maxBlocksPerFile) { throw new IOException("File has reached the limit on maximum number of" + " blocks (" + DFSConfigKeys.DFS_NAMENODE_MAX_BLOCKS_PER_FILE_KEY + "): " + pendingFile.getBlocks().length + " >= " + maxBlocksPerFile); } {code} Addressed as part of HDFS-4305. NameNode should limit the number of blocks per file --- Key: HDFS-5463 URL: https://issues.apache.org/jira/browse/HDFS-5463 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinay Assignee: Vinay Currently there is no limit to the number of blocks a user can write to a file, and the block size can also be set to the minimum possible. A user can write any number of blocks continuously, which may create problems for the NameNode's performance and service as the number of blocks in the file increases, because each time a new block is allocated, all blocks of the file are persisted; this can cause serious performance degradation. So the proposal is to limit the maximum number of blocks a user can write to a file, maybe 1024 blocks (if 128 MB is the block size, then 128 GB would be the max file size). -- This message was sent by Atlassian JIRA (v6.1#6144)
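As a quick sanity check on the numbers in this thread: with the default of 1024*1024 blocks per file quoted above and a 128 MB block size, the effective cap on a single file works out to 128 TB (the issue description's "1024 blocks / 128 GB" figure is a much tighter proposal than the default that was actually committed):

```java
// Arithmetic for the per-file block limit: max blocks * block size.
public class MaxFileSize {
    public static long maxFileBytes(long maxBlocks, long blockSizeBytes) {
        return maxBlocks * blockSizeBytes;
    }

    public static void main(String[] args) {
        long maxBlocks = 1024L * 1024L;         // default max-blocks-per-file
        long blockSize = 128L * 1024L * 1024L;  // 128 MB block size
        long tb = maxFileBytes(maxBlocks, blockSize) / (1024L * 1024L * 1024L * 1024L);
        System.out.println(tb + " TB");         // prints: 128 TB
    }
}
```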
[jira] [Resolved] (HDFS-5462) Fail to compile in Branch HDFS-2832 with COMPILATION ERROR
[ https://issues.apache.org/jira/browse/HDFS-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Sirianni resolved HDFS-5462. - Resolution: Fixed Fail to compile in Branch HDFS-2832 with COMPILATION ERROR --- Key: HDFS-5462 URL: https://issues.apache.org/jira/browse/HDFS-5462 Project: Hadoop HDFS Issue Type: Bug Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: wenwupeng Failed to compile HDFS in Branch HDFS-2832 with COMPILATION ERROR , OutputFormat is Sun proprietary API and may be removed in a future release [INFO] Compiling 276 source files to /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/target/classes [INFO] - [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[32,48] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[33,48] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java:[337,34] unreported exception java.io.IOException; must be caught or declared to be thrown [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[134,41] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release [ERROR] 
/home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[135,14] sun.misc.Cleaner is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[136,22] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,4] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,33] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,4] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,35] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [INFO] 10 errors -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5462) Fail to compile in Branch HDFS-2832 with COMPILATION ERROR
[ https://issues.apache.org/jira/browse/HDFS-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813942#comment-13813942 ] Eric Sirianni commented on HDFS-5462: - There seems to be an issue in the maven-compiler-plugin whereby, when an ERROR is detected, it incorrectly marks the compiler warnings as ERRORs as well. This makes it hard to see the actual ERROR in all the noise. It looks like [MCOMPILER-179|http://jira.codehaus.org/browse/MCOMPILER-179] (though that has been marked as fixed in Maven 3.0...). At any rate, the actual error is {code} [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java:[337,34] unreported exception java.io.IOException; must be caught or declared to be thrown {code} This has been fixed by [~arpitagarwal] (see my comment on HDFS-5448). Fail to compile in Branch HDFS-2832 with COMPILATION ERROR --- Key: HDFS-5462 URL: https://issues.apache.org/jira/browse/HDFS-5462 Project: Hadoop HDFS Issue Type: Bug Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: wenwupeng Failed to compile HDFS in Branch HDFS-2832 with COMPILATION ERROR , OutputFormat is Sun proprietary API and may be removed in a future release [INFO] Compiling 276 source files to /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/target/classes [INFO] - [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[32,48] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[33,48] 
com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java:[337,34] unreported exception java.io.IOException; must be caught or declared to be thrown [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[134,41] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[135,14] sun.misc.Cleaner is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[136,22] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,4] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,33] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,4] 
com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,35] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [INFO] 10 errors -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5462) Fail to compile in Branch HDFS-2832 with COMPILATION ERROR
[ https://issues.apache.org/jira/browse/HDFS-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813960#comment-13813960 ] Arpit Agarwal commented on HDFS-5462: - Thanks for responding to this Eric. Wenwu, please resync to the latest revision. Fail to compile in Branch HDFS-2832 with COMPILATION ERROR --- Key: HDFS-5462 URL: https://issues.apache.org/jira/browse/HDFS-5462 Project: Hadoop HDFS Issue Type: Bug Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: wenwupeng Failed to compile HDFS in Branch HDFS-2832 with COMPILATION ERROR , OutputFormat is Sun proprietary API and may be removed in a future release [INFO] Compiling 276 source files to /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/target/classes [INFO] - [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[32,48] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[33,48] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java:[337,34] unreported exception java.io.IOException; must be caught or declared to be thrown [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[134,41] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release 
[ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[135,14] sun.misc.Cleaner is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[136,22] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,4] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,33] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,4] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,35] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [INFO] 10 errors -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814025#comment-13814025 ] Arpit Agarwal commented on HDFS-2832: - Konstantin, UUID generation uses a cryptographically secure PRNG. On Linux this is /dev/random, the fallback is SHA1PRNG with a period of 2^160. With a billion nodes the probability of a collision in a 128-bit space is less than 1 in 10^20. Note that what was previously the storageID is now the datanode UUID and it is generated once for the lifetime of a datanode. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch HDFS currently supports a configuration where storages are a list of directories. Typically each of these directories corresponds to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose changing the current model, where a Datanode *is a* storage, to one where a Datanode *is a collection* of storages. -- This message was sent by Atlassian JIRA (v6.1#6144)
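The collision estimate above follows from the standard birthday approximation p ≈ n(n-1) / (2 · 2^b) for n random IDs drawn from a b-bit space. A back-of-the-envelope check, treating the full 128 bits as random as the comment does:

```java
// Birthday-bound approximation for random ID collisions.
public class UuidCollisionBound {
    public static double collisionProbability(double n, int bits) {
        return n * (n - 1.0) / (2.0 * Math.pow(2.0, bits));
    }

    public static void main(String[] args) {
        double p = collisionProbability(1e9, 128); // a billion nodes
        System.out.println(p < 1e-20);             // prints: true (p is roughly 1.5e-21)
    }
}
```

A version-4 UUID actually has 122 random bits rather than 128, which weakens the bound to roughly 1 in 10^19 — still vanishingly small for any realistic cluster.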
[jira] [Commented] (HDFS-5333) Improvement of current HDFS Web UI
[ https://issues.apache.org/jira/browse/HDFS-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814070#comment-13814070 ] Luke Lu commented on HDFS-5333: --- Auto-redirect from the index page is nice. A link to alternative UI at the bottom would be very convenient for dev/qa/user to checkout alternative without having to enable/disable js and/or type explicit URLs. Improvement of current HDFS Web UI -- Key: HDFS-5333 URL: https://issues.apache.org/jira/browse/HDFS-5333 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Haohui Mai This is an umbrella jira for improving the current JSP-based HDFS Web UI. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5333) Improvement of current HDFS Web UI
[ https://issues.apache.org/jira/browse/HDFS-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814087#comment-13814087 ] Haohui Mai commented on HDFS-5333: -- Thanks for the feedback. I'll make sure it is addressed in HDFS-5444. Improvement of current HDFS Web UI -- Key: HDFS-5333 URL: https://issues.apache.org/jira/browse/HDFS-5333 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Haohui Mai This is an umbrella jira for improving the current JSP-based HDFS Web UI. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5458) Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs
[ https://issues.apache.org/jira/browse/HDFS-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814089#comment-13814089 ] Andrew Wang commented on HDFS-5458: --- Hey Mike, the patch looks good. I think this is small enough that we can commit it without a test. +1, thanks for the contribution. Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs -- Key: HDFS-5458 URL: https://issues.apache.org/jira/browse/HDFS-5458 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Mike Mellenthin Attachments: HDFS-5458-1.patch Saw a stacktrace of datanode startup with a bad volume, where even listing directories would throw an IOException. The failed volume threshold was set to 1, but it would fatally error out in {{File#getCanonicalPath}} in {{getDataDirsFromURIs}}: {code} File dir = new File(dirURI.getPath()); try { dataNodeDiskChecker.checkDir(localFS, new Path(dir.toURI())); dirs.add(dir); } catch (IOException ioe) { LOG.warn("Invalid " + DFS_DATANODE_DATA_DIR_KEY + " " + dir + " : ", ioe); invalidDirs.append("\"").append(dir.getCanonicalPath()).append("\" "); } {code} Since {{getCanonicalPath}} may need to do I/O and thus throw an IOException, this catch clause doesn't properly protect startup from a failed volume. -- This message was sent by Atlassian JIRA (v6.1#6144)
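One way to keep I/O out of the catch clause described above is to build the error message with File#getAbsolutePath, which is pure path manipulation and is not declared to throw IOException, unlike File#getCanonicalPath, which may touch the filesystem. This is a sketch of the general technique, not necessarily the committed patch:

```java
import java.io.File;

// Build the "invalid dir" message without doing any I/O, so a dead
// volume cannot throw from inside the error-reporting path itself.
public class InvalidDirMessage {
    public static String quote(File dir) {
        // getAbsolutePath never performs I/O and never throws IOException.
        return "\"" + dir.getAbsolutePath() + "\" ";
    }

    public static void main(String[] args) {
        System.out.println(quote(new File("/data/dn1")));
    }
}
```

The trade-off is that the absolute path may contain unresolved symlinks or "..", but for a log/error message that is usually acceptable, and it keeps the failed-volume threshold logic intact.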
[jira] [Updated] (HDFS-5458) Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs
[ https://issues.apache.org/jira/browse/HDFS-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-5458: -- Resolution: Fixed Fix Version/s: 2.2.1 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed for 2.2.1. Thanks again, Mike! Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs -- Key: HDFS-5458 URL: https://issues.apache.org/jira/browse/HDFS-5458 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Mike Mellenthin Fix For: 2.2.1 Attachments: HDFS-5458-1.patch Saw a stacktrace of datanode startup with a bad volume, where even listing directories would throw an IOException. The failed volume threshold was set to 1, but it would fatally error out in {{File#getCanonicalPath}} in {{getDataDirsFromURIs}}: {code} File dir = new File(dirURI.getPath()); try { dataNodeDiskChecker.checkDir(localFS, new Path(dir.toURI())); dirs.add(dir); } catch (IOException ioe) { LOG.warn("Invalid " + DFS_DATANODE_DATA_DIR_KEY + " " + dir + " : ", ioe); invalidDirs.append("\"").append(dir.getCanonicalPath()).append("\" "); } {code} Since {{getCanonicalPath}} may need to do I/O and thus throw an IOException, this catch clause doesn't properly protect startup from a failed volume. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5458) Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs
[ https://issues.apache.org/jira/browse/HDFS-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814106#comment-13814106 ] Hudson commented on HDFS-5458: -- FAILURE: Integrated in Hadoop-trunk-Commit #4695 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4695/]) HDFS-5458. Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs. Contributed by Mike Mellenthin. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1539091) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java Datanode failed volume threshold ignored if exception is thrown in getDataDirsFromURIs -- Key: HDFS-5458 URL: https://issues.apache.org/jira/browse/HDFS-5458 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Mike Mellenthin Fix For: 2.2.1 Attachments: HDFS-5458-1.patch Saw a stacktrace of datanode startup with a bad volume, where even listing directories would throw an IOException. The failed volume threshold was set to 1, but it would fatally error out in {{File#getCanonicalPath}} in {{getDataDirsFromURIs}}: {code} File dir = new File(dirURI.getPath()); try { dataNodeDiskChecker.checkDir(localFS, new Path(dir.toURI())); dirs.add(dir); } catch (IOException ioe) { LOG.warn("Invalid " + DFS_DATANODE_DATA_DIR_KEY + " " + dir + " : ", ioe); invalidDirs.append("\"").append(dir.getCanonicalPath()).append("\" "); } {code} Since {{getCanonicalPath}} may need to do I/O and thus throw an IOException, this catch clause doesn't properly protect startup from a failed volume. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HDFS-5463) NameNode should limit the number of blocks per file
[ https://issues.apache.org/jira/browse/HDFS-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang resolved HDFS-5463. --- Resolution: Duplicate As Uma said above, I think this is handled as of 2.1.0 by HDFS-4305. Please re-open if you feel this is incorrect. Thanks Vinay. NameNode should limit the number of blocks per file --- Key: HDFS-5463 URL: https://issues.apache.org/jira/browse/HDFS-5463 Project: Hadoop HDFS Issue Type: Bug Reporter: Vinay Assignee: Vinay Currently there is no limit to the number of blocks a user can write to a file, and the block size can also be set to the minimum possible. A user can write any number of blocks continuously, which may create problems for the NameNode's performance and service as the number of blocks in the file increases, because each time a new block is allocated, all blocks of the file are persisted; this can cause serious performance degradation. So the proposal is to limit the maximum number of blocks a user can write to a file, maybe 1024 blocks (if 128 MB is the block size, then 128 GB would be the max file size). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5252) Stable write is not handled correctly in someplace
[ https://issues.apache.org/jira/browse/HDFS-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814133#comment-13814133 ] Jing Zhao commented on HDFS-5252: - The patch looks good to me. One minor point is that maybe we do not need to always call sync-and-update-length here, which always fires an RPC call to the NN. Instead, maybe we can do hsync for DATA_SYNC, and only update the length when the stable flag is FILE_SYNC. This will save us some RPC calls for larger files. Stable write is not handled correctly in someplace -- Key: HDFS-5252 URL: https://issues.apache.org/jira/browse/HDFS-5252 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-5252.001.patch When the client asks for a stable write but the prerequisite writes are not transferred to the NFS gateway, the stableness can't be honored. The NFS gateway has to treat the write as an unstable write and set the flag to UNSTABLE in the write response. One bug was found during testing with the Ubuntu client when copying one 1KB file. For small files like a 1KB file, the Ubuntu client does one stable write (with the FILE_SYNC flag). However, the NFS gateway missed one place (OpenFileCtx#doSingleWrite) where it sends the response with the flag NOT updated to UNSTABLE. With this bug, the client thinks the write is on disk and thus doesn't send COMMIT anymore. The following test tries to read the data back and of course fails to do so since the data was not synced. -- This message was sent by Atlassian JIRA (v6.1#6144)
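The optimization suggested in the comment above can be sketched as a dispatch on the NFS stable-write flag: sync the data for DATA_SYNC, and only pay the extra NameNode RPC that persists the file length when the client asked for FILE_SYNC. This is a hypothetical illustration; SyncTarget and its methods are stand-ins, not real HDFS or NFS-gateway APIs:

```java
import java.io.IOException;

// Dispatch a sync request according to the NFS "stable_how" flag.
public class StableWriteDispatch {
    interface SyncTarget {
        void hsync() throws IOException;                 // flush data to disks
        void hsyncAndUpdateLength() throws IOException;  // flush + extra NN RPC
    }

    enum StableHow { UNSTABLE, DATA_SYNC, FILE_SYNC }

    static void sync(SyncTarget out, StableHow how) throws IOException {
        switch (how) {
            case FILE_SYNC:
                out.hsyncAndUpdateLength(); // metadata must be durable too
                break;
            case DATA_SYNC:
                out.hsync();                // data only; skip the NN RPC
                break;
            default:
                break;                      // UNSTABLE: nothing to force
        }
    }

    public static void main(String[] args) throws IOException {
        sync(new SyncTarget() {
            public void hsync() { System.out.println("hsync"); }
            public void hsyncAndUpdateLength() { System.out.println("hsync+updateLength"); }
        }, StableHow.DATA_SYNC); // prints: hsync
    }
}
```

For large files written with many DATA_SYNC requests, this saves one NameNode round trip per sync, which is exactly the RPC cost the review comment calls out.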
[jira] [Assigned] (HDFS-3752) BOOTSTRAPSTANDBY for new Standby node will not work just after saveNameSpace at ANN in case of BKJM
[ https://issues.apache.org/jira/browse/HDFS-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HDFS-3752: - Assignee: (was: Todd Lipcon) BOOTSTRAPSTANDBY for new Standby node will not work just after saveNameSpace at ANN in case of BKJM --- Key: HDFS-3752 URL: https://issues.apache.org/jira/browse/HDFS-3752 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: 2.0.0-alpha Reporter: Vinay 1. do {{saveNameSpace}} in ANN node by entering into safemode 2. in another new node, install standby NN and do BOOTSTRAPSTANDBY 3. Now StandBy NN will not able to copy the fsimage_txid from ANN This is because, SNN not able to find the next txid (txid+1) in shared storage. Just after {{saveNameSpace}} shared storage will have the new logsegment with only START_LOG_SEGEMENT edits op. and BookKeeper will not be able to read last entry from inprogress ledger. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-3752) BOOTSTRAPSTANDBY for new Standby node will not work just after saveNameSpace at ANN in case of BKJM
[ https://issues.apache.org/jira/browse/HDFS-3752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814206#comment-13814206 ] Todd Lipcon commented on HDFS-3752: --- This might have gotten fixed by HDFS-5080. I don't know anything about BKJM, so not going to work on this. BOOTSTRAPSTANDBY for new Standby node will not work just after saveNameSpace at ANN in case of BKJM --- Key: HDFS-3752 URL: https://issues.apache.org/jira/browse/HDFS-3752 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: 2.0.0-alpha Reporter: Vinay 1. do {{saveNameSpace}} in ANN node by entering into safemode 2. in another new node, install standby NN and do BOOTSTRAPSTANDBY 3. Now StandBy NN will not able to copy the fsimage_txid from ANN This is because, SNN not able to find the next txid (txid+1) in shared storage. Just after {{saveNameSpace}} shared storage will have the new logsegment with only START_LOG_SEGEMENT edits op. and BookKeeper will not be able to read last entry from inprogress ledger. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5326) add modifyDirective to cacheAdmin
[ https://issues.apache.org/jira/browse/HDFS-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5326: --- Attachment: HDFS-5326.003.patch * adjust protobufs as recommended in HDFS-5166. Specifically, create a {{PathBasedCacheDirectiveInfoProto}}, and use it in the add, modify, and list RPCs. This avoids having to duplicate those fields everywhere. * get rid of the descriptor / directive division. Having many different types for the same thing is confusing to users. The only advantage of the division is that it prevented using a Directive with an ID in a context where that was inappropriate; however, we can simply validate this in the one case it matters (in FSNamesystem when doing addDirective). This also gets rid of some long-standing WTFs (why does -removeDirective remove a descriptor?, etc.) * Both the directive type and the protobuf now have all fields optional. We can simply validate that fields exist when we need them. This will be helpful later in allowing us to compatibly add new fields, once compatibility becomes a big concern (in branch-2). * add {{modifyPathBasedCacheDirective}}, which modifies an existing PBCD. * in CacheManager, there were a few cases where we were converting a PBCE to a PBCD, just to get some field we could have accessed directly in the PBCE. Just access the field directly from the PBCE. * in CacheManager, use try ... catch and log all {{IOException}} objects that were thrown, rather than making the programmer duplicate the failure message in the log and in the thrown exception. This does change some indentation but it makes things much cleaner on the whole. * use standardized exceptions like {{AccessControlException}} rather than custom ones like {{AddPathBasedCacheDirectiveException}}. Add {{IdNotFoundException}} to the common set of exceptions. * {{addPathBasedCacheDirective}} now returns an ID, not a Directive. 
The previous situation was confusing because the object that was being returned had its ID based on what the NameNode set, but the rest of the fields left identical to what the client passed. This could result in some of the fields being wrong. So just return to the client what the server returned. * Similarly, {{removePathBasedCacheDirective}} now just takes an ID, not an object. It's confusing to take an object, since it obscures the fact that we only look at one field (ID). Making the parameter an object encourages people to try to remove by path or some other field, which simply won't work. Calling Directive#getId is straightforward and makes it obvious what is going on. * Make sure that AddPathBasedCacheDirectiveOp stores the ID of the created directive. Previously, we were relying on the ordering of the directives and the ID assignment order, which is brittle. If any edit log entries are unreadable, this strategy fails completely. Storing the ID is much more robust. add modifyDirective to cacheAdmin - Key: HDFS-5326 URL: https://issues.apache.org/jira/browse/HDFS-5326 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5326.003.patch We should add a way of modifying cache directives on the command-line, similar to how modifyCachePool works. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5326) add modifyDirective to cacheAdmin
[ https://issues.apache.org/jira/browse/HDFS-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5326: --- Status: Patch Available (was: Open) add modifyDirective to cacheAdmin - Key: HDFS-5326 URL: https://issues.apache.org/jira/browse/HDFS-5326 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5326.003.patch We should add a way of modifying cache directives on the command-line, similar to how modifyCachePool works. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5464) Simplify block report diff calculation
Tsz Wo (Nicholas), SZE created HDFS-5464: Summary: Simplify block report diff calculation Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5464: - Status: Patch Available (was: Open) Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5464: - Attachment: h5464_20131105.patch h5464_20131105.patch: remove the delimiter logic. Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5364) Add OpenFileCtx cache
[ https://issues.apache.org/jira/browse/HDFS-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5364: - Attachment: HDFS-5364.006.patch Add OpenFileCtx cache - Key: HDFS-5364 URL: https://issues.apache.org/jira/browse/HDFS-5364 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-5364.001.patch, HDFS-5364.002.patch, HDFS-5364.003.patch, HDFS-5364.004.patch, HDFS-5364.005.patch, HDFS-5364.006.patch The NFS gateway can run out of memory when the stream timeout is set to a relatively long period (e.g., 1 minute) and a user uploads thousands of files in parallel. Each stream's DFSClient creates a DataStreamer thread, so the gateway will eventually run out of memory by creating too many threads. The NFS gateway should have an OpenFileCtx cache to limit the total number of open files. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5364) Add OpenFileCtx cache
[ https://issues.apache.org/jira/browse/HDFS-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814335#comment-13814335 ] Brandon Li commented on HDFS-5364: -- Thanks for the review. 1. done 2 and 3 are optimization of the eviction method. As we discussed offline, I will file a following up JIRA for that. 4. done. The lock needs to be held there to synchronize with insert operation. 5. done. nice catch! 6. done. Add OpenFileCtx cache - Key: HDFS-5364 URL: https://issues.apache.org/jira/browse/HDFS-5364 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-5364.001.patch, HDFS-5364.002.patch, HDFS-5364.003.patch, HDFS-5364.004.patch, HDFS-5364.005.patch, HDFS-5364.006.patch NFS gateway can run out of memory when the stream timeout is set to a relatively long period(e.g., 1 minute) and user uploads thousands of files in parallel. Each stream DFSClient creates a DataStreamer thread, and will eventually run out of memory by creating too many threads. NFS gateway should have a OpenFileCtx cache to limit the total opened files. -- This message was sent by Atlassian JIRA (v6.1#6144)
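The bounded-cache idea discussed above can be sketched with an access-ordered map that evicts the least-recently-used entry once a limit is reached. This is a hypothetical illustration of the LRU mechanism only — the real OpenFileCtx cache in HDFS-5364 also has to dump data and close the evicted stream and synchronize with concurrent writers, and none of the names below come from the patch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU sketch: an access-ordered LinkedHashMap with a size cap. */
public class OpenFileCache<K, V> {
    private final Map<K, V> map;

    public OpenFileCache(final int maxEntries) {
        // accessOrder = true, so get() refreshes an entry's recency
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;  // evict the LRU entry beyond the cap
            }
        };
    }

    public synchronized void put(K key, V value) { map.put(key, value); }
    public synchronized V get(K key) { return map.get(key); }
    public synchronized int size() { return map.size(); }

    public static void main(String[] args) {
        OpenFileCache<String, String> cache = new OpenFileCache<>(2);
        cache.put("/a", "ctxA");
        cache.put("/b", "ctxB");
        cache.get("/a");          // touch /a so /b becomes least recently used
        cache.put("/c", "ctxC");  // exceeds the cap, evicting /b
        System.out.println(cache.get("/b") == null); // true
        System.out.println(cache.size());            // 2
    }
}
```

Evicting on insert keeps the number of open DataStreamer threads bounded regardless of how many files clients upload in parallel.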
[jira] [Commented] (HDFS-5427) not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart
[ https://issues.apache.org/jira/browse/HDFS-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814338#comment-13814338 ] Hari Mankude commented on HDFS-5427: Is this patch going to be backported to 2.2 also? not able to read deleted files from snapshot directly under snapshottable dir after checkpoint and NN restart - Key: HDFS-5427 URL: https://issues.apache.org/jira/browse/HDFS-5427 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Affects Versions: 3.0.0, 2.2.0 Reporter: Vinay Assignee: Vinay Priority: Blocker Fix For: 2.3.0 Attachments: HDFS-5427-v2.patch, HDFS-5427.patch, HDFS-5427.patch 1. allow snapshots under dir /foo 2. create a file /foo/bar 3. create a snapshot s1 under /foo 4. delete the file /foo/bar 5. wait till checkpoint or do saveNameSpace 6. restart NN. 7. Now try to read the file from snapshot /foo/.snapshot/s1/bar The client will get a BlockMissingException. The reason is that while loading the deleted file list for a snapshottable dir from the fsimage, the blocks were not updated in the blocksMap. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5436) Move HsFtpFileSystem and HFtpFileSystem into org.apache.hdfs.web
[ https://issues.apache.org/jira/browse/HDFS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814339#comment-13814339 ] Sandy Ryza commented on HDFS-5436: -- This appears to be causing the following when I try to set up a pseudo-distributed cluster. Any idea why? {code} java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.HftpFileSystem not found at java.util.ServiceLoader.fail(ServiceLoader.java:214) at java.util.ServiceLoader.access$400(ServiceLoader.java:164) at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:350) at java.util.ServiceLoader$1.next(ServiceLoader.java:421) at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2282) at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2293) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2310) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2349) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2331) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:353) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:446) at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:274) at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at 
java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {code} Also, the old package name is still referred to in ./hadoop-hdfs-project/hadoop-hdfs/src/site/apt/Hftp.apt.vm and ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_1329348432655_0001_conf.xml. Move HsFtpFileSystem and HFtpFileSystem into org.apache.hdfs.web Key: HDFS-5436 URL: https://issues.apache.org/jira/browse/HDFS-5436 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.3.0 Attachments: HDFS-5436.000.patch, HDFS-5436.001.patch, HDFS-5436.002.patch Currently HsftpFilesystem, HftpFileSystem and WebHdfsFileSystem reside in different packages. This forces several methods in ByteInputStream and URLConnectionFactory to be public. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5436) Move HsFtpFileSystem and HFtpFileSystem into org.apache.hdfs.web
[ https://issues.apache.org/jira/browse/HDFS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814350#comment-13814350 ] Haohui Mai commented on HDFS-5436: -- The service loader should be looking at the contents of {noformat} /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem {noformat} Can you please double-check whether the file is up-to-date? Thanks for catching the bugs in the documentation; I'll file a jira to fix them shortly. Move HsFtpFileSystem and HFtpFileSystem into org.apache.hdfs.web Key: HDFS-5436 URL: https://issues.apache.org/jira/browse/HDFS-5436 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.3.0 Attachments: HDFS-5436.000.patch, HDFS-5436.001.patch, HDFS-5436.002.patch Currently HsftpFilesystem, HftpFileSystem and WebHdfsFileSystem reside in different packages. This forces several methods in ByteInputStream and URLConnectionFactory to be public. -- This message was sent by Atlassian JIRA (v6.1#6144)
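The symptom reported above (a ServiceConfigurationError because a provider-configuration file names a class that no longer exists) can be investigated with a small diagnostic. The resource name comes from the comment; the class and method names below are made up for illustration, and on a classpath without Hadoop jars the scan simply finds nothing:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Lists every copy of the FileSystem provider-configuration file on the
 * classpath and flags provider classes that cannot be loaded. A stale jar
 * shows up as a file that still names org.apache.hadoop.hdfs.HftpFileSystem
 * while the class itself is gone.
 */
public class FsProviderCheck {
    static final String RES = "META-INF/services/org.apache.hadoop.fs.FileSystem";

    /** Returns "url -> provider" lines, marking unloadable providers. */
    public static List<String> scan(ClassLoader cl) throws Exception {
        List<String> out = new ArrayList<>();
        for (URL url : Collections.list(cl.getResources(RES))) {
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = r.readLine()) != null) {
                    String name = line.trim();
                    if (name.isEmpty() || name.startsWith("#")) continue;
                    boolean loadable;
                    try { Class.forName(name, false, cl); loadable = true; }
                    catch (ClassNotFoundException e) { loadable = false; }
                    out.add(url + " -> " + name + (loadable ? "" : "  [MISSING]"));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        for (String s : scan(FsProviderCheck.class.getClassLoader())) {
            System.out.println(s);
        }
    }
}
```

A `[MISSING]` entry is exactly the condition under which `ServiceLoader` throws the `Provider ... not found` error seen in the stack trace.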
[jira] [Created] (HDFS-5465) Update the package names for hsftp / hftp in the documentation
Haohui Mai created HDFS-5465: Summary: Update the package names for hsftp / hftp in the documentation Key: HDFS-5465 URL: https://issues.apache.org/jira/browse/HDFS-5465 Project: Hadoop HDFS Issue Type: Bug Reporter: Haohui Mai Assignee: Haohui Mai Priority: Minor HDFS-5436 move HftpFileSystem and HsftpFileSystem to a different package. The documentation should be updated as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5364) Add OpenFileCtx cache
[ https://issues.apache.org/jira/browse/HDFS-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814357#comment-13814357 ] Hadoop QA commented on HDFS-5364: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612265/HDFS-5364.006.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5338//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5338//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs-nfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5338//console This message is automatically generated. 
Add OpenFileCtx cache - Key: HDFS-5364 URL: https://issues.apache.org/jira/browse/HDFS-5364 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-5364.001.patch, HDFS-5364.002.patch, HDFS-5364.003.patch, HDFS-5364.004.patch, HDFS-5364.005.patch, HDFS-5364.006.patch NFS gateway can run out of memory when the stream timeout is set to a relatively long period(e.g., 1 minute) and user uploads thousands of files in parallel. Each stream DFSClient creates a DataStreamer thread, and will eventually run out of memory by creating too many threads. NFS gateway should have a OpenFileCtx cache to limit the total opened files. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5326) add modifyDirective to cacheAdmin
[ https://issues.apache.org/jira/browse/HDFS-5326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814361#comment-13814361 ] Hadoop QA commented on HDFS-5326: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612243/HDFS-5326.003.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.protocolPB.TestClientNamenodeProtocolServerSideTranslatorPB org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5336//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5336//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5336//console This message is automatically generated. 
add modifyDirective to cacheAdmin - Key: HDFS-5326 URL: https://issues.apache.org/jira/browse/HDFS-5326 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5326.003.patch We should add a way of modifying cache directives on the command-line, similar to how modifyCachePool works. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5466) Update storage IDs when the pipeline is updated
Tsz Wo (Nicholas), SZE created HDFS-5466: Summary: Update storage IDs when the pipeline is updated Key: HDFS-5466 URL: https://issues.apache.org/jira/browse/HDFS-5466 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE In DFSOutputStream, when the nodes in the pipeline is updated, we should also update the storage IDs. Otherwise, the node list and the storage ID list are mismatched. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5466) Update storage IDs when the pipeline is updated
[ https://issues.apache.org/jira/browse/HDFS-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5466: - Attachment: h5466_20131105.patch h5466_20131105.patch: update storage IDs. Update storage IDs when the pipeline is updated --- Key: HDFS-5466 URL: https://issues.apache.org/jira/browse/HDFS-5466 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h5466_20131105.patch In DFSOutputStream, when the nodes in the pipeline is updated, we should also update the storage IDs. Otherwise, the node list and the storage ID list are mismatched. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5436) Move HsFtpFileSystem and HFtpFileSystem into org.apache.hdfs.web
[ https://issues.apache.org/jira/browse/HDFS-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814383#comment-13814383 ] Sandy Ryza commented on HDFS-5436: -- My bad, it looks like I had some old jars lying around. Thanks, [~wheat9]. Move HsFtpFileSystem and HFtpFileSystem into org.apache.hdfs.web Key: HDFS-5436 URL: https://issues.apache.org/jira/browse/HDFS-5436 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.3.0 Attachments: HDFS-5436.000.patch, HDFS-5436.001.patch, HDFS-5436.002.patch Currently HsftpFilesystem, HftpFileSystem and WebHdfsFileSystem reside in different packages. This forces several methods in ByteInputStream and URLConnectionFactory to be public. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5466) Update storage IDs when the pipeline is updated
[ https://issues.apache.org/jira/browse/HDFS-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5466: - Attachment: h5466_20131105b.patch h5466_20131105b.patch: add setPipeline(..) methods. BTW, this will fix TestEncryptedTransfer and TestCrcCorruption. Update storage IDs when the pipeline is updated --- Key: HDFS-5466 URL: https://issues.apache.org/jira/browse/HDFS-5466 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h5466_20131105.patch, h5466_20131105b.patch In DFSOutputStream, when the nodes in the pipeline is updated, we should also update the storage IDs. Otherwise, the node list and the storage ID list are mismatched. -- This message was sent by Atlassian JIRA (v6.1#6144)
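The invariant behind the setPipeline(..) change can be sketched as follows: the datanode list and the storage-ID list must always be replaced together and have the same length, otherwise they become mismatched after pipeline recovery. Types are simplified to strings here and the class name is illustrative — the real method lives in DFSOutputStream:

```java
import java.util.Arrays;

/** Sketch of a pipeline that only accepts matched node/storage-ID updates. */
public class Pipeline {
    private String[] nodes = new String[0];
    private String[] storageIDs = new String[0];

    /** Replace both lists atomically; reject mismatched lengths. */
    public synchronized void setPipeline(String[] newNodes, String[] newStorageIDs) {
        if (newNodes.length != newStorageIDs.length) {
            throw new IllegalArgumentException("nodes and storage IDs differ in length: "
                + newNodes.length + " vs " + newStorageIDs.length);
        }
        this.nodes = newNodes;
        this.storageIDs = newStorageIDs;
    }

    public synchronized String describe() {
        return Arrays.toString(nodes) + " / " + Arrays.toString(storageIDs);
    }

    public static void main(String[] args) {
        Pipeline p = new Pipeline();
        p.setPipeline(new String[] {"dn1", "dn2"}, new String[] {"storage1", "storage2"});
        System.out.println(p.describe());
        try {
            // A recovery that updates nodes without storage IDs is rejected.
            p.setPipeline(new String[] {"dn1"}, new String[] {"storage1", "storage2"});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Funnelling every update through one method is what prevents the "node list updated but storage-ID list stale" state the issue describes.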
[jira] [Commented] (HDFS-5394) fix race conditions in DN caching and uncaching
[ https://issues.apache.org/jira/browse/HDFS-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814426#comment-13814426 ] Andrew Wang commented on HDFS-5394: --- Sorry for leaving this for so long, got tied up in a variety of different things. Thanks for bumping it. Based on feedback thus far, I think we're close.
* I like the test stub for mlock

Nits:
* Unused imports in FsDatasetImpl and FsVolumeImpl
* Do we still need to rename {{getExecutor}} to {{getCacheExecutor}} in FsVolumeImpl?
* {{State#isUncaching()}} is unused
* Could use a core pool size of 0 for {{uncachingExecutor}}; I don't think it's that latency sensitive
* usedBytes javadoc: "more things to cache that we can't actually do because of" is an awkward turn of phrase; maybe say "assign more blocks than we can actually cache because of" instead
* MappableBlock#load javadoc: the visibleLeng parameter should be renamed to length. The return value is now also a MappableBlock, not a boolean.
* Key: rename {{id}} to {{blockId}} for clarity? Or add a bit of javadoc.
* Naming the HashMap {{replicaMap}} is confusing since there's already a datanode {{ReplicaMap}} class. Maybe {{mappableBlockMap}} instead?

Impl:
* Caching can fail if the underlying block is invalidated in between getting the block's filename and running the CacheTask. It'd be nice to distinguish this race from a real error for when we do metrics (and also quash the exception).
* If we get a {{DNA_CACHE}} for a block that is currently being uncached, shouldn't we try to cancel the uncache and re-cache it? The NN will resend the command, but it'd be better not to have to wait for that.
{code}
if ((value == null) || (value.state != State.CACHING)) {
{code}
* Could this be written with {{value.state == State.CACHING_CANCELLED}} instead? It would be clearer, and I believe equivalent, since {{uncacheBlock}} won't set the state to {{UNCACHING}} if it's {{CACHING}} or {{CACHING_CANCELLED}}.
* Even better would be interrupting a {{CachingTask}} on uncache, since it'll save us I/O and CPU.
* Could we combine {{CACHING_CANCELLED}} into {{UNCACHING}}? It seems like {{CachingTask}} could check for {{UNCACHING}} in that if statement at the end and uncache; same sort of change for {{uncacheBlock}}.
* I think using a switch/case on prevValue.state in uncacheBlock would be clearer.

Test:
* 6,000,000 milliseconds seems like a very long test timeout :) Can we change them to, say, 60,000?
* Are these new log prints for sanity checking? Maybe we can just remove them.
* Some of the comments seem to refer to a previous patch version that used a countdown latch.
* It's unclear what this is testing beyond caching and then uncaching a bunch of blocks. Can we check for log prints to see that it's actually cancelling as expected? Any other ideas for definitively hitting cancellation?

fix race conditions in DN caching and uncaching --- Key: HDFS-5394 URL: https://issues.apache.org/jira/browse/HDFS-5394 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5394-caching.001.patch, HDFS-5394-caching.002.patch, HDFS-5394-caching.003.patch, HDFS-5394-caching.004.patch, HDFS-5394.005.patch, HDFS-5394.006.patch The DN needs to handle situations where it is asked to cache the same replica more than once. (Currently, it can actually do two mmaps and mlocks.) It also needs to handle the situation where caching a replica is cancelled before said caching completes. -- This message was sent by Atlassian JIRA (v6.1#6144)
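The CACHING / CACHING_CANCELLED / UNCACHING interplay the review discusses can be sketched as a tiny state machine: an uncache request arriving while a block is still CACHING marks it cancelled rather than uncaching immediately, and the caching task checks for cancellation when it finishes. The state names follow the review comments; the locking, mmap/mlock work, and executor hand-offs of the actual patch are omitted, and this is not the patch's code:

```java
/** Toy replica-caching state machine for the cancellation race above. */
public class CacheState {
    public enum State { CACHING, CACHING_CANCELLED, CACHED, UNCACHING }

    private State state = State.CACHING;

    /** Called when a DNA_UNCACHE command arrives. */
    public synchronized void uncache() {
        // While the caching task is still running, flag cancellation instead
        // of jumping straight to UNCACHING.
        state = (state == State.CACHING || state == State.CACHING_CANCELLED)
                ? State.CACHING_CANCELLED : State.UNCACHING;
    }

    /** Called by the caching task once its work completes. */
    public synchronized State finishCaching() {
        state = (state == State.CACHING_CANCELLED) ? State.UNCACHING : State.CACHED;
        return state;
    }

    public synchronized State state() { return state; }

    public static void main(String[] args) {
        CacheState cancelled = new CacheState();
        cancelled.uncache();                           // DNA_UNCACHE during CACHING
        System.out.println(cancelled.finishCaching()); // UNCACHING
        CacheState normal = new CacheState();
        System.out.println(normal.finishCaching());    // CACHED
    }
}
```

This is the shape of the review's suggestion to merge CACHING_CANCELLED handling into the end of the caching task rather than racing two threads over the same entry.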
[jira] [Commented] (HDFS-5394) fix race conditions in DN caching and uncaching
[ https://issues.apache.org/jira/browse/HDFS-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814437#comment-13814437 ] Colin Patrick McCabe commented on HDFS-5394: OK, I figured out the test failure. It seems that when computing how much data we can mlock, we must round up mmap'ed regions to the operating system page size. In the case of Linux, that is almost always 4096. The reason is that the OS manages memory in units of 4096 bytes; it is simply impossible to lock at a finer granularity than that. So we should take this into account in our statistics. I adjusted the test accordingly, and also added a skip if we don't have enough lockable memory available. fix race conditions in DN caching and uncaching --- Key: HDFS-5394 URL: https://issues.apache.org/jira/browse/HDFS-5394 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5394-caching.001.patch, HDFS-5394-caching.002.patch, HDFS-5394-caching.003.patch, HDFS-5394-caching.004.patch, HDFS-5394.005.patch, HDFS-5394.006.patch The DN needs to handle situations where it is asked to cache the same replica more than once. (Currently, it can actually do two mmaps and mlocks.) It also needs to handle the situation where caching a replica is cancelled before said caching completes. -- This message was sent by Atlassian JIRA (v6.1#6144)
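The page-size rounding described in that comment amounts to rounding every mapped length up to the next multiple of the OS page size (almost always 4096 on Linux) before charging it against the mlock budget. A minimal sketch, with illustrative names rather than the actual HDFS-5394 code:

```java
/** mlock/mmap operate on whole pages, so accounting must round up. */
public class PageRounder {
    /** Smallest multiple of pageSize that is >= length. */
    public static long roundUp(long length, long pageSize) {
        return ((length + pageSize - 1) / pageSize) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(roundUp(1, 4096));    // 4096: even 1 byte pins a whole page
        System.out.println(roundUp(4096, 4096)); // 4096: exact multiples are unchanged
        System.out.println(roundUp(4097, 4096)); // 8192: one byte over costs another page
    }
}
```

Without this rounding, the "used bytes" statistic undercounts and the datanode can attempt to mlock more than its configured limit actually allows.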
[jira] [Resolved] (HDFS-5466) Update storage IDs when the pipeline is updated
[ https://issues.apache.org/jira/browse/HDFS-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-5466. - Resolution: Fixed Fix Version/s: Heterogeneous Storage (HDFS-2832) Hadoop Flags: Reviewed +1 for the patch. I committed it to branch HDFS-2832. Update storage IDs when the pipeline is updated --- Key: HDFS-5466 URL: https://issues.apache.org/jira/browse/HDFS-5466 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: Heterogeneous Storage (HDFS-2832) Attachments: h5466_20131105.patch, h5466_20131105b.patch In DFSOutputStream, when the nodes in the pipeline is updated, we should also update the storage IDs. Otherwise, the node list and the storage ID list are mismatched. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5364) Add OpenFileCtx cache
[ https://issues.apache.org/jira/browse/HDFS-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-5364: - Attachment: HDFS-5364.007.patch Add OpenFileCtx cache - Key: HDFS-5364 URL: https://issues.apache.org/jira/browse/HDFS-5364 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-5364.001.patch, HDFS-5364.002.patch, HDFS-5364.003.patch, HDFS-5364.004.patch, HDFS-5364.005.patch, HDFS-5364.006.patch, HDFS-5364.007.patch NFS gateway can run out of memory when the stream timeout is set to a relatively long period(e.g., 1 minute) and user uploads thousands of files in parallel. Each stream DFSClient creates a DataStreamer thread, and will eventually run out of memory by creating too many threads. NFS gateway should have a OpenFileCtx cache to limit the total opened files. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5364) Add OpenFileCtx cache
[ https://issues.apache.org/jira/browse/HDFS-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814459#comment-13814459 ] Hadoop QA commented on HDFS-5364: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612290/HDFS-5364.007.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5339//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5339//console This message is automatically generated. Add OpenFileCtx cache - Key: HDFS-5364 URL: https://issues.apache.org/jira/browse/HDFS-5364 Project: Hadoop HDFS Issue Type: Sub-task Components: nfs Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-5364.001.patch, HDFS-5364.002.patch, HDFS-5364.003.patch, HDFS-5364.004.patch, HDFS-5364.005.patch, HDFS-5364.006.patch, HDFS-5364.007.patch NFS gateway can run out of memory when the stream timeout is set to a relatively long period(e.g., 1 minute) and user uploads thousands of files in parallel. 
Each stream DFSClient creates a DataStreamer thread, and will eventually run out of memory by creating too many threads. NFS gateway should have a OpenFileCtx cache to limit the total opened files. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5443) Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file.
[ https://issues.apache.org/jira/browse/HDFS-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5443: Attachment: HDFS-5443.000.patch Upload a simple patch that tries to delete the 0-sized block for INodeFileUC. Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file. Key: HDFS-5443 URL: https://issues.apache.org/jira/browse/HDFS-5443 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Affects Versions: 3.0.0, 2.2.0 Reporter: Uma Maheswara Rao G Assignee: sathish Attachments: 5443-test.patch, HDFS-5443.000.patch This issue is reported by Prakash and Sathish. On looking into the issue, the following things are happening: 1) The client added a block at the NN, which just did a logsync, so the NN has the block ID persisted. 2) Before returning the addBlock response to the client, take a snapshot of the root or parent directories for that file. 3) Delete the parent directory for that file. 4) Now crash the NN without responding success to the client for that addBlock call. Now, on restart, the Namenode will get stuck in safemode. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5467) Remove tab characters in hdfs-default.xml
Andrew Wang created HDFS-5467: - Summary: Remove tab characters in hdfs-default.xml Key: HDFS-5467 URL: https://issues.apache.org/jira/browse/HDFS-5467 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Priority: Trivial The retrycache parameters are indented with tabs rather than the normal 2 spaces. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5443) Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file.
[ https://issues.apache.org/jira/browse/HDFS-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5443: Status: Patch Available (was: Open) Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file. Key: HDFS-5443 URL: https://issues.apache.org/jira/browse/HDFS-5443 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Affects Versions: 2.2.0, 3.0.0 Reporter: Uma Maheswara Rao G Assignee: sathish Attachments: 5443-test.patch, HDFS-5443.000.patch This issue is reported by Prakash and Sathish. On looking into the issue following things are happening. . 1) Client added block at NN and just did logsync So, NN has block ID persisted. 2)Before returning addblock response to client take a snapshot for root or parent directories for that file 3) Delete parent directory for that file 4) Now crash the NN with out responding success to client for that addBlock call Now on restart of the Namenode, it will stuck in safemode. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HDFS-5467) Remove tab characters in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-5467: - Assignee: Andrew Wang Remove tab characters in hdfs-default.xml - Key: HDFS-5467 URL: https://issues.apache.org/jira/browse/HDFS-5467 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Andrew Wang Assignee: Andrew Wang Priority: Trivial Labels: newbie The retrycache parameters are indented with tabs rather than the normal 2 spaces. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814517#comment-13814517 ] Konstantin Shvachko commented on HDFS-5464: --- So in the regular case you will be adding 100,000 replicas to the {{toRemove}} list only to delete most of them later. How does that make things simpler? The delimiter lets you keep the calculated lists as small as possible, reducing memory consumption and avoiding frequent GCs. Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
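For reference, the diff both sides are discussing has a simple contract: blocks the NameNode currently associates with a DataNode but which are absent from the new report go to toRemove, and reported-but-unknown blocks go to toAdd. The sketch below shows that contract with plain set arithmetic only — it deliberately materializes the intermediate sets, which is exactly the memory cost the delimiter technique in reportDiff(..) avoids by splicing a marker block into the DN's replica list; names and types are illustrative:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Contract-only sketch of a block report diff, keyed by block ID. */
public class ReportDiff {
    public static Map<String, Set<Long>> diff(Set<Long> stored, Set<Long> reported) {
        Set<Long> toRemove = new HashSet<>(stored);
        toRemove.removeAll(reported);        // stored but no longer reported
        Set<Long> toAdd = new HashSet<>(reported);
        toAdd.removeAll(stored);             // reported but not yet stored
        Map<String, Set<Long>> out = new HashMap<>();
        out.put("toRemove", toRemove);
        out.put("toAdd", toAdd);
        return out;
    }

    public static void main(String[] args) {
        Set<Long> stored = new HashSet<>();
        Set<Long> reported = new HashSet<>();
        stored.add(1L); stored.add(2L); stored.add(3L);
        reported.add(2L); reported.add(3L); reported.add(4L);
        System.out.println(diff(stored, reported));
    }
}
```

With 100,000 mostly-unchanged replicas, toRemove here briefly holds all 100,000 entries before removeAll() shrinks it, which is Konstantin's objection in concrete form.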
[jira] [Commented] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814527#comment-13814527 ] Tsz Wo (Nicholas), SZE commented on HDFS-5464: -- Hi Konstantin, you may be correct that the new code uses more memory; however, I bet you will agree that the new code is simpler than the existing code. :) I will think about how to reduce the memory usage. Thanks for the input. Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5464: - Attachment: h5466_20131105b.patch Actually, it is unnecessary to add all the blocks to the remove list. Here is a new patch. h5466_20131105b.patch Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch, h5464_20131105b.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5464: - Attachment: h5464_20131105b.patch Here is the correct file: h5464_20131105b.patch Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch, h5464_20131105b.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5464: - Attachment: (was: h5466_20131105b.patch) Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch, h5464_20131105b.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5462) Fail to compile in Branch HDFS-2832 with COMPILATION ERROR
[ https://issues.apache.org/jira/browse/HDFS-5462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814549#comment-13814549 ] wenwupeng commented on HDFS-5462: - Thanks for the helpful response, Eric and Arpit. It passes after syncing to the latest version. Fail to compile in Branch HDFS-2832 with COMPILATION ERROR --- Key: HDFS-5462 URL: https://issues.apache.org/jira/browse/HDFS-5462 Project: Hadoop HDFS Issue Type: Bug Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: wenwupeng Failed to compile HDFS in Branch HDFS-2832 with COMPILATION ERROR , OutputFormat is Sun proprietary API and may be removed in a future release [INFO] Compiling 276 source files to /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/target/classes [INFO] - [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[32,48] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[33,48] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java:[337,34] unreported exception java.io.IOException; must be caught or declared to be thrown [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[134,41] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future 
release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[135,14] sun.misc.Cleaner is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/MappableBlock.java:[136,22] sun.nio.ch.DirectBuffer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,4] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[55,33] com.sun.org.apache.xml.internal.serialize.OutputFormat is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,4] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [ERROR] /home/jenkins/slave/workspace/HVE-PostCommit-HDFS-2832/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/XmlEditsVisitor.java:[59,35] com.sun.org.apache.xml.internal.serialize.XMLSerializer is Sun proprietary API and may be removed in a future release [INFO] 10 errors -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5439) Fix TestPendingReplication
[ https://issues.apache.org/jira/browse/HDFS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814552#comment-13814552 ] Arpit Agarwal commented on HDFS-5439: - Thanks for the patch Junping! Most of your changes look fine. A remaining issue was that {{PendingReplicationBlock}} should track targets as {{DatanodeDescriptor}} instead of {{DatanodeStorageInfo}}. Will post a consolidated patch along with your changes. Fix TestPendingReplication -- Key: HDFS-5439 URL: https://issues.apache.org/jira/browse/HDFS-5439 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-5439-demo1.patch {{TestPendingReplication}} fails with the following exception: {code} java.lang.AssertionError: expected:4 but was:3 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication.testBlockReceived(TestPendingReplication.java:186) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5448) Datanode should generate its ID on first registration
[ https://issues.apache.org/jira/browse/HDFS-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814555#comment-13814555 ] Arpit Agarwal commented on HDFS-5448: - Thanks for the correction, you're right. Datanode should generate its ID on first registration - Key: HDFS-5448 URL: https://issues.apache.org/jira/browse/HDFS-5448 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: Heterogeneous Storage (HDFS-2832) Attachments: h5448.01.patch, h5448.03.patch, h5448.04.addendum.patch, h5448.04.patch Prior to the heterogeneous storage feature, each Datanode had a single storage ID which was generated by the Namenode on first registration. The storage ID used fixed Datanode identifiers like IP address and port, so that in a federated cluster, for example, all NameNodes would generate the same storage ID. With Heterogeneous storage, we have replaced the storage ID with a per-datanode identifier called the Datanode-UUID. The Datanode UUID is also assigned by a NameNode on first registration. In a federated cluster with multiple namenodes, there are two ways to ensure a unique Datanode UUID allocation: # Synchronize initial registration requests from the BPServiceActors. If a Datanode UUID is already assigned we don't need to synchronize. # The datanode assigns itself a UUID on initialization. -- This message was sent by Atlassian JIRA (v6.1#6144)
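Option 2 from the HDFS-5448 description (the datanode assigns itself a UUID on initialization, avoiding any NN-side synchronization) can be sketched in a few lines. This is an illustrative snippet with invented names, not the actual DataNode code:

```java
import java.util.UUID;

// Sketch: generate a Datanode UUID locally at startup if none was persisted,
// so federated NameNodes never need to coordinate the initial assignment.
public class DatanodeUuidSketch {
    private String datanodeUuid;  // loaded from storage; may be null or empty

    public synchronized String checkOrGenerateUuid() {
        if (datanodeUuid == null || datanodeUuid.isEmpty()) {
            // First startup: self-assign a random UUID and persist it afterwards.
            datanodeUuid = UUID.randomUUID().toString();
        }
        return datanodeUuid;      // subsequent calls return the same value
    }
}
```

Because `UUID.randomUUID()` is globally unique with overwhelming probability, every BPServiceActor can register concurrently with the same self-assigned identifier.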
[jira] [Updated] (HDFS-5394) fix race conditions in DN caching and uncaching
[ https://issues.apache.org/jira/browse/HDFS-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-5394: --- Attachment: HDFS-5394.007.patch fix race conditions in DN caching and uncaching --- Key: HDFS-5394 URL: https://issues.apache.org/jira/browse/HDFS-5394 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5394-caching.001.patch, HDFS-5394-caching.002.patch, HDFS-5394-caching.003.patch, HDFS-5394-caching.004.patch, HDFS-5394.005.patch, HDFS-5394.006.patch, HDFS-5394.007.patch The DN needs to handle situations where it is asked to cache the same replica more than once. (Currently, it can actually do two mmaps and mlocks.) It also needs to handle the situation where caching a replica is cancelled before said caching completes. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5394) fix race conditions in DN caching and uncaching
[ https://issues.apache.org/jira/browse/HDFS-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814568#comment-13814568 ] Colin Patrick McCabe commented on HDFS-5394: bq. Unused imports in FsDatasetImpl and FsVolumeImpl removed bq. Do we still need to rename getExecutor to getCacheExecutor in FsVolumeImpl? Well, the name of the variable is {{cacheExecutor}}; shouldn't the getter be {{getCacheExecutor}}? bq. State#isUncaching() is unused removed bq. Could use a core pool size of 0 for uncachingExecutor, I don't think it's that latency sensitive agreed bq. usedBytes javadoc: more things to cache that we can't actually do because of is an awkward turn of phrase, maybe say assign more blocks than we can actually cache because of instead ok bq. MappableBlock#load javadoc: visibleLeng parameter should be renamed to length. The return value is now also a MappableBlock, not a boolean. fixed bq. Key: rename id to blockId for clarity? or add a bit of javadoc added javadoc bq. Naming the HashMap replicaMap is confusing since there's already a datanode ReplicaMap class. Maybe mappableBlockMap instead? ok bq. Caching can fail if the underlying block is invalidated in between getting the block's filename and running the CacheTask. It'd be nice to distinguish this race from a real error for when we do metrics (and also quash the exception). I just added a catch block for the {{FileNotFound}} exception which both {{getBlockInputStream}} and {{getMetaDataInputStream}} can throw. I still think we want to log this exception, but at INFO rather than WARN. We will retry sending the {{DNA_CACHE}} command (once 5366 is committed), so hitting this narrow race if a block is being moved is just a temporary setback. bq. If we get a DNA_CACHE for a block that is currently being uncached, shouldn't we try to cancel the uncache and re-cache it? The NN will resend the command, but it'd be better to not have to wait for that. 
We don't know how far along the uncaching process is. We can't cancel it if we already called {{munmap}}. We could allow cancellation of pending uncaches by splitting {{UNCACHING}} into {{UNCACHING_SCHEDULED}} and {{UNCACHING_IN_PROGRESS}}, and only allowing cancellation on the former. This might be a good improvement to make as part of 5182. But for now, the uncaching process is really quick, so let's keep it simple. bq. Could this be written with value.state == State.CACHING_CANCELLED instead? Would be clearer, and I believe equivalent since uncacheBlock won't set the state to UNCACHING if it's CACHING or CACHING_CANCELLED. Well, if value is null, you don't want to be dereferencing that, right? bq. Even better would be interrupting a CachingTask on uncache since it'll save us I/O and CPU. That kind of interruption logic gets complex quickly. I'd rather save that for a potential performance improvement JIRA later down the line. I also think that if we're thrashing (cancelling caching requests right and left) the real fix might be on the NameNode anyway... bq. Could we combine CACHING_CANCELLED into UNCACHING? It seems like CachingTask could check for UNCACHING in that if statement at the end and uncache, same sort of change for uncacheBlock. I would rather not do that, since right now we can look at entries in the map and instantly know that anything in state {{UNCACHING}} has an associated {{Runnable}} scheduled in the {{Executor}}. Cancelled is not really the same thing as uncaching, since in the former case there is actually nothing to do! bq. I think using a switch/case on the prevValue.state in uncacheBlock would be clearer ok bq. 6,000,000 milliseconds seems like a very long test timeout. Can we change them to, say, 60,000? The general idea is to do stuff that can time out in {{GenericTestUtils#waitFor}} blocks. The waitFor blocks actually give useful backtraces and messages when they time out, unlike the generic test timeouts. 
I wanted to avoid the scenario where the test-level timeouts kick in, but out of paranoia, I set the overall test timeout to 10 minutes in case there was some other unexpected timeout. I wanted to avoid the issues we've had with zombie tests in Jenkins causing heisenfailures. bq. Are these new log prints for sanity checking? Maybe we can just remove them. it's more so you can see what's going on in the sea of log messages. otherwise, it becomes hard to debug. bq. Some of the comments seem to refer to a previous patch version that used a countdown latch. fixed bq. It's unclear what this is testing beyond caching and then uncaching a bunch of blocks. Can we check for log prints to see that it's actually cancelling as expected? Any other ideas for definitively hitting cancellation? we could add callback hooks to more points in the system, and set up a bunch of countdown latches (or similar), but it might
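The per-block states discussed in this exchange (CACHING, CACHING_CANCELLED, UNCACHING, plus a cached state) form a small state machine, and the switch-on-previous-state suggested for uncacheBlock can be sketched as follows. This is a hedged illustration with invented names; the actual FsDatasetCache/MappableBlock code differs in detail:

```java
// Sketch of the uncache-request transition discussed above. Invariant from the
// thread: only UNCACHING entries have a Runnable scheduled on the executor;
// CACHING_CANCELLED entries are cleaned up by the CachingTask itself.
public class CacheStateSketch {
    public enum State { CACHING, CACHING_CANCELLED, CACHED, UNCACHING }

    // What an uncacheBlock() call would do for each previous state.
    public static State onUncacheRequest(State prev) {
        switch (prev) {
            case CACHING:
                // Caching still in flight: mark cancelled; no uncache work needed,
                // the in-flight CachingTask notices the flag and backs out.
                return State.CACHING_CANCELLED;
            case CACHED:
                // Fully cached: schedule an uncaching Runnable on the executor.
                return State.UNCACHING;
            default:
                // CACHING_CANCELLED or UNCACHING: nothing more to do.
                return prev;
        }
    }
}
```

Keeping CACHING_CANCELLED distinct from UNCACHING preserves the invariant Colin describes: a reader of the map can tell at a glance which entries have pending executor work.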
[jira] [Commented] (HDFS-5443) Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file.
[ https://issues.apache.org/jira/browse/HDFS-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814570#comment-13814570 ] Hadoop QA commented on HDFS-5443: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612292/HDFS-5443.000.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5340//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5340//console This message is automatically generated. Namenode can stuck in safemode on restart if it crashes just after addblock logsync and after taking snapshot for such file. Key: HDFS-5443 URL: https://issues.apache.org/jira/browse/HDFS-5443 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Affects Versions: 3.0.0, 2.2.0 Reporter: Uma Maheswara Rao G Assignee: sathish Attachments: 5443-test.patch, HDFS-5443.000.patch This issue is reported by Prakash and Sathish. On looking into the issue, the following things are happening:
1) Client added a block at the NN and just did a logsync, so the NN has the block ID persisted. 2) Before returning the addblock response to the client, take a snapshot of the root or a parent directory of that file. 3) Delete the parent directory of that file. 4) Now crash the NN without responding success to the client for that addBlock call. Now on restart, the Namenode will get stuck in safemode. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5468) CacheAdmin help command does not recognize commands
Stephen Chu created HDFS-5468: - Summary: CacheAdmin help command does not recognize commands Key: HDFS-5468 URL: https://issues.apache.org/jira/browse/HDFS-5468 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 3.0.0, 2.3.0 Reporter: Stephen Chu Priority: Minor Currently, the hdfs cacheadmin -help command will not recognize correct command inputs: {code} [hdfs@hdfs-cache ~]# hdfs cacheadmin -help listPools Sorry, I don't know the command 'listPools'. Valid command names are: -addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help [hdfs@hdfs-cache ~]# hdfs cacheadmin -help -listPools Sorry, I don't know the command 'listPools'. Valid command names are: -addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help {code} In the code, we strip the input command of leading hyphens, but then compare it to the command names, which are all prefixed by a hyphen. Also, cacheadmin -removeDirectives requires specifying a path with -path but -path is not shown in the usage. We should fix this as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
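The mismatch described in HDFS-5468 can be reproduced in miniature. This sketch uses invented names and is not the actual CacheAdmin code; the fix shown (normalizing both sides before comparing) is just one possible approach:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the CacheAdmin help-lookup bug: the input is stripped of its
// leading hyphens, but the registered command names keep theirs, so the
// equality comparison can never succeed.
public class HelpLookupSketch {
    static final List<String> NAMES = Arrays.asList("-addPool", "-listPools", "-help");

    // Buggy behavior: only the input loses its hyphen.
    public static boolean buggyMatch(String input) {
        String stripped = input.replaceFirst("^-*", "");
        return NAMES.contains(stripped);   // "listPools" never equals "-listPools"
    }

    // One possible fix: strip hyphens on both sides before comparing.
    public static boolean fixedMatch(String input) {
        String stripped = input.replaceFirst("^-*", "");
        for (String name : NAMES) {
            if (name.replaceFirst("^-*", "").equals(stripped)) {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, `buggyMatch` rejects both `-listPools` and `listPools`, matching the behavior in the issue report, while `fixedMatch` accepts either form.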
[jira] [Updated] (HDFS-5468) CacheAdmin help command does not recognize commands
[ https://issues.apache.org/jira/browse/HDFS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu updated HDFS-5468: -- Attachment: HDFS-5468.patch Attached a patch that avoids removing leading hyphens on the input command, so getting help usage works when executing something like _hdfs cacheadmin -help -addPool_. Added a unit test to exercise the cacheadmin help command. Added the -path specifier to the help usage of the removeDirectives command. CacheAdmin help command does not recognize commands --- Key: HDFS-5468 URL: https://issues.apache.org/jira/browse/HDFS-5468 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 3.0.0, 2.3.0 Reporter: Stephen Chu Priority: Minor Attachments: HDFS-5468.patch Currently, the hdfs cacheadmin -help command will not recognize correct command inputs: {code} [hdfs@hdfs-cache ~]# hdfs cacheadmin -help listPools Sorry, I don't know the command 'listPools'. Valid command names are: -addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help [hdfs@hdfs-cache ~]# hdfs cacheadmin -help -listPools Sorry, I don't know the command 'listPools'. Valid command names are: -addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help {code} In the code, we strip the input command of leading hyphens, but then compare it to the command names, which are all prefixed by a hyphen. Also, cacheadmin -removeDirectives requires specifying a path with -path but -path is not shown in the usage. We should fix this as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5468) CacheAdmin help command does not recognize commands
[ https://issues.apache.org/jira/browse/HDFS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu updated HDFS-5468: -- Status: Patch Available (was: Open) CacheAdmin help command does not recognize commands --- Key: HDFS-5468 URL: https://issues.apache.org/jira/browse/HDFS-5468 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 3.0.0, 2.3.0 Reporter: Stephen Chu Priority: Minor Attachments: HDFS-5468.patch Currently, the hdfs cacheadmin -help command will not recognize correct command inputs: {code} [hdfs@hdfs-cache ~]# hdfs cacheadmin -help listPools Sorry, I don't know the command 'listPools'. Valid command names are: -addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help [hdfs@hdfs-cache ~]# hdfs cacheadmin -help -listPools Sorry, I don't know the command 'listPools'. Valid command names are: -addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help {code} In the code, we strip the input command of leading hyphens, but then compare it to the command names, which are all prefixed by a hyphen. Also, cacheadmin -removeDirectives requires specifying a path with -path but -path is not shown in the usage. We should fix this as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814583#comment-13814583 ] Hadoop QA commented on HDFS-5464: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612253/h5464_20131105.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5341//console This message is automatically generated. 
Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch, h5464_20131105b.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HDFS-5468) CacheAdmin help command does not recognize commands
[ https://issues.apache.org/jira/browse/HDFS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Chu reassigned HDFS-5468: - Assignee: Stephen Chu CacheAdmin help command does not recognize commands --- Key: HDFS-5468 URL: https://issues.apache.org/jira/browse/HDFS-5468 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 3.0.0, 2.3.0 Reporter: Stephen Chu Assignee: Stephen Chu Priority: Minor Attachments: HDFS-5468.patch Currently, the hdfs cacheadmin -help command will not recognize correct command inputs: {code} [hdfs@hdfs-cache ~]# hdfs cacheadmin -help listPools Sorry, I don't know the command 'listPools'. Valid command names are: -addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help [hdfs@hdfs-cache ~]# hdfs cacheadmin -help -listPools Sorry, I don't know the command 'listPools'. Valid command names are: -addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help {code} In the code, we strip the input command of leading hyphens, but then compare it to the command names, which are all prefixed by a hyphen. Also, cacheadmin -removeDirectives requires specifying a path with -path but -path is not shown in the usage. We should fix this as well. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5439) Fix TestPendingReplication
[ https://issues.apache.org/jira/browse/HDFS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814587#comment-13814587 ] Junping Du commented on HDFS-5439: -- Hi Arpit, do you mean targets in PendingBlockInfo? I think storageID info is necessary and computeReplicationWorkForBlocks() in BlockManager is something we should fix. Thoughts? Fix TestPendingReplication -- Key: HDFS-5439 URL: https://issues.apache.org/jira/browse/HDFS-5439 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-5439-demo1.patch {{TestPendingReplication}} fails with the following exception: {code} java.lang.AssertionError: expected:4 but was:3 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication.testBlockReceived(TestPendingReplication.java:186) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5411) Update Bookkeeper dependency to 4.2.1
[ https://issues.apache.org/jira/browse/HDFS-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-5411: --- Issue Type: Sub-task (was: Improvement) Parent: HDFS-3399 Update Bookkeeper dependency to 4.2.1 - Key: HDFS-5411 URL: https://issues.apache.org/jira/browse/HDFS-5411 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Robert Rati Priority: Minor Attachments: HDFS-5411.patch Update the bookkeeper dependency to 4.2.1. This eases compilation on Fedora platforms -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5411) Update Bookkeeper dependency to 4.2.1
[ https://issues.apache.org/jira/browse/HDFS-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814602#comment-13814602 ] Rakesh R commented on HDFS-5411: Hi Robert, Recently Bookkeeper released version 4.2.2; it would be good to use this stable version. What's your opinion? Update Bookkeeper dependency to 4.2.1 - Key: HDFS-5411 URL: https://issues.apache.org/jira/browse/HDFS-5411 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Robert Rati Priority: Minor Attachments: HDFS-5411.patch Update the bookkeeper dependency to 4.2.1. This eases compilation on Fedora platforms -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5439) Fix TestPendingReplication
[ https://issues.apache.org/jira/browse/HDFS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-5439: Attachment: h5439.04.patch Attaching patch to fix the following: # Bug in {{BlockPlacementPolicyDefault#chooseRandom}} hit when numOfReplicas > 2. # {{PendingReplicationBlocks}} tracks replicas by {{DatanodeDescriptor}} instead of {{DatanodeStorageInfo}} # Update a couple of logs (copied from Junping's patch), remove obsolete TODO in {{BPServiceActor}} # Update {{TestPendingReplications}} Junping, I did not understand your comment about {{computeReplicationWorkForBlocks}}. Also I don't think we need 1 and 2 from your list. Fix TestPendingReplication -- Key: HDFS-5439 URL: https://issues.apache.org/jira/browse/HDFS-5439 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-5439-demo1.patch, h5439.04.patch {{TestPendingReplication}} fails with the following exception: {code} java.lang.AssertionError: expected:4 but was:3 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication.testBlockReceived(TestPendingReplication.java:186) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5439) Fix TestPendingReplication
[ https://issues.apache.org/jira/browse/HDFS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814616#comment-13814616 ] Arpit Agarwal commented on HDFS-5439: - To clarify, the reason we don't need 1 anymore is because {{PendingReplicationBlocks}} uses DatanodeDescriptor as the key (thanks to [~szetszwo] for the idea). Fix TestPendingReplication -- Key: HDFS-5439 URL: https://issues.apache.org/jira/browse/HDFS-5439 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-5439-demo1.patch, h5439.04.patch {{TestPendingReplication}} fails with the following exception: {code} java.lang.AssertionError: expected:4 but was:3 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication.testBlockReceived(TestPendingReplication.java:186) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
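The idea of keying pending replications by datanode rather than by storage can be sketched as follows. These are invented names, not the actual PendingReplicationBlocks class; the point is only that an ack from any storage on a target node should clear that node's pending entry:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: track pending replication targets per datanode (not per storage),
// so a block-received report from any storage on a node decrements the count.
public class PendingSketch {
    private final Map<String, Set<String>> pending = new HashMap<>(); // blockId -> datanode UUIDs

    public void increment(String blockId, String... datanodes) {
        pending.computeIfAbsent(blockId, k -> new HashSet<>())
               .addAll(Arrays.asList(datanodes));
    }

    // Called when the block is reported received from some storage on a datanode.
    public void decrement(String blockId, String datanodeUuid) {
        Set<String> targets = pending.get(blockId);
        if (targets != null && targets.remove(datanodeUuid) && targets.isEmpty()) {
            pending.remove(blockId);   // all targets have reported in
        }
    }

    public int getNumReplicas(String blockId) {
        Set<String> targets = pending.get(blockId);
        return targets == null ? 0 : targets.size();
    }
}
```

Because the key is the datanode, the tracker stays correct even if the datanode stores the replica on a different storage than the one originally chosen.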
[jira] [Updated] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-5464: - Attachment: h5464_20131105c.patch h5464_20131105c.patch: reverts the changes causing the findbugs warning. Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch, h5464_20131105b.patch, h5464_20131105c.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814620#comment-13814620 ] Hadoop QA commented on HDFS-5464: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612309/h5464_20131105b.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks
    org.apache.hadoop.hdfs.TestLeaseRecovery2
The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks
    org.apache.hadoop.hdfs.server.namenode.TestFsck
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5342//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5342//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5342//console
This message is automatically generated.
Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch, h5464_20131105b.patch, h5464_20131105c.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5394) fix race conditions in DN caching and uncaching
[ https://issues.apache.org/jira/browse/HDFS-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814640#comment-13814640 ] Hadoop QA commented on HDFS-5394: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612314/HDFS-5394.007.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:red}-1 javac{color}. The applied patch generated 1548 javac compiler warnings (more than the trunk's current 1547 warnings).
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.server.namenode.TestPathBasedCacheRequests
    org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5343//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5343//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5343//console
This message is automatically generated.
fix race conditions in DN caching and uncaching --- Key: HDFS-5394 URL: https://issues.apache.org/jira/browse/HDFS-5394 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5394-caching.001.patch, HDFS-5394-caching.002.patch, HDFS-5394-caching.003.patch, HDFS-5394-caching.004.patch, HDFS-5394.005.patch, HDFS-5394.006.patch, HDFS-5394.007.patch The DN needs to handle situations where it is asked to cache the same replica more than once. (Currently, it can actually do two mmaps and mlocks.) It also needs to handle the situation where caching a replica is cancelled before said caching completes. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5468) CacheAdmin help command does not recognize commands
[ https://issues.apache.org/jira/browse/HDFS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814647#comment-13814647 ] Hadoop QA commented on HDFS-5468: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12612319/HDFS-5468.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5344//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5344//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5344//console
This message is automatically generated.
CacheAdmin help command does not recognize commands --- Key: HDFS-5468 URL: https://issues.apache.org/jira/browse/HDFS-5468 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 3.0.0, 2.3.0 Reporter: Stephen Chu Assignee: Stephen Chu Priority: Minor Attachments: HDFS-5468.patch Currently, the hdfs cacheadmin -help command will not recognize correct command inputs:
{code}
[hdfs@hdfs-cache ~]# hdfs cacheadmin -help listPools
Sorry, I don't know the command 'listPools'.
Valid command names are:
-addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help

[hdfs@hdfs-cache ~]# hdfs cacheadmin -help -listPools
Sorry, I don't know the command 'listPools'.
Valid command names are:
-addDirective, -removeDirective, -removeDirectives, -listDirectives, -addPool, -modifyPool, -removePool, -listPools, -help
{code}
In the code, we strip the input command of leading hyphens, but then compare it to the command names, which are all prefixed by a hyphen. Also, cacheadmin -removeDirectives requires specifying a path with -path but -path is not shown in the usage. We should fix this as well.
-- This message was sent by Atlassian JIRA (v6.1#6144)
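The mismatch described in the report can be sketched in a few lines (a minimal illustration, not the actual CacheAdmin source; names here are hypothetical): the input is stripped of leading hyphens but compared against command names that still carry their hyphen prefix, so no input can ever match.

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch of the help-lookup bug and one possible fix:
// normalize both sides before comparing.
public class HelpLookupSketch {
    static final List<String> COMMANDS = Arrays.asList(
        "-addDirective", "-removeDirective", "-removeDirectives",
        "-listDirectives", "-addPool", "-modifyPool", "-removePool",
        "-listPools", "-help");

    // Buggy lookup: "-listPools" -> "listPools", which never equals "-listPools".
    static boolean knownBuggy(String input) {
        String stripped = input.replaceFirst("^-+", "");
        return COMMANDS.contains(stripped);
    }

    // Fixed lookup: strip hyphens from both the input and the command names.
    static boolean knownFixed(String input) {
        String stripped = input.replaceFirst("^-+", "");
        return COMMANDS.stream()
            .anyMatch(c -> c.replaceFirst("^-+", "").equals(stripped));
    }

    public static void main(String[] args) {
        System.out.println(knownBuggy("-listPools")); // prints false
        System.out.println(knownFixed("-listPools")); // prints true
        System.out.println(knownFixed("listPools"));  // prints true
    }
}
```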
[jira] [Updated] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-2832: Attachment: h2832_20131105.patch Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch HDFS currently supports a configuration where storages are a list of directories. Typically each of these directories corresponds to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose changing the current model, where a Datanode *is a* storage, to one where a Datanode *is a collection* of storages. -- This message was sent by Atlassian JIRA (v6.1#6144)
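The proposed model change can be pictured with a small sketch (the class and field names below are illustrative stand-ins, not the real HDFS-2832 classes): instead of a datanode being identified as a single storage, it exposes a collection of storages, each with its own ID and type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "Datanode is a collection of storages".
public class StorageModelSketch {
    enum StorageType { DISK, SSD }

    static class Storage {
        final String storageId;
        final StorageType type;
        Storage(String storageId, StorageType type) {
            this.storageId = storageId;
            this.type = type;
        }
    }

    static class DataNode {
        // One datanode reports many storages, possibly of different types.
        final List<Storage> storages = new ArrayList<>();
        void addStorage(Storage s) { storages.add(s); }
        int numStorages() { return storages.size(); }
    }

    public static void main(String[] args) {
        DataNode dn = new DataNode();
        dn.addStorage(new Storage("DS-1", StorageType.DISK));
        dn.addStorage(new Storage("DS-2", StorageType.SSD));
        System.out.println(dn.numStorages()); // prints 2
    }
}
```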
[jira] [Commented] (HDFS-5439) Fix TestPendingReplication
[ https://issues.apache.org/jira/browse/HDFS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814660#comment-13814660 ] Tsz Wo (Nicholas), SZE commented on HDFS-5439: -- Patch looks good. Just a question: why change BlockPlacementPolicyDefault? Is there a bug? Fix TestPendingReplication -- Key: HDFS-5439 URL: https://issues.apache.org/jira/browse/HDFS-5439 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-5439-demo1.patch, h5439.04.patch {{TestPendingReplication}} fails with the following exception:
{code}
java.lang.AssertionError: expected:4 but was:3
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.failNotEquals(Assert.java:647)
    at org.junit.Assert.assertEquals(Assert.java:128)
    at org.junit.Assert.assertEquals(Assert.java:472)
    at org.junit.Assert.assertEquals(Assert.java:456)
    at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication.testBlockReceived(TestPendingReplication.java:186)
{code}
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5439) Fix TestPendingReplication
[ https://issues.apache.org/jira/browse/HDFS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814662#comment-13814662 ] Arpit Agarwal commented on HDFS-5439: - Thanks for reviewing, Nicholas. Yes, the bug is that goodTarget was not reset after processing the first datanode but it was used as a terminating condition in the for loop. So the function would always fail when numOfReplicas 2. Fix TestPendingReplication -- Key: HDFS-5439 URL: https://issues.apache.org/jira/browse/HDFS-5439 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: HDFS-5439-demo1.patch, h5439.04.patch {{TestPendingReplication}} fails with the following exception:
{code}
java.lang.AssertionError: expected:4 but was:3
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.failNotEquals(Assert.java:647)
    at org.junit.Assert.assertEquals(Assert.java:128)
    at org.junit.Assert.assertEquals(Assert.java:472)
    at org.junit.Assert.assertEquals(Assert.java:456)
    at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication.testBlockReceived(TestPendingReplication.java:186)
{code}
-- This message was sent by Atlassian JIRA (v6.1#6144)
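The class of bug described above can be shown with a minimal sketch (purely illustrative; these are not the actual BlockPlacementPolicyDefault names or loop): a result variable used in the loop condition is assigned on the first pass but never reset, so later iterations terminate the loop prematurely.

```java
// Hypothetical sketch of a "loop-state not reset" bug and its fix.
public class LoopResetSketch {
    // Buggy: goodTarget is set after the first candidate and never reset,
    // and since the loop continues only while !goodTarget, it exits after
    // choosing a single replica regardless of numOfReplicas.
    static int chooseBuggy(int numOfReplicas) {
        int chosen = 0;
        boolean goodTarget = false;
        for (int i = 0; i < numOfReplicas && !goodTarget; i++) {
            goodTarget = true; // stays true for every later iteration
            chosen++;
        }
        return chosen;
    }

    // Fixed: the per-candidate result is re-initialized on every iteration.
    static int chooseFixed(int numOfReplicas) {
        int chosen = 0;
        for (int i = 0; i < numOfReplicas; i++) {
            boolean goodTarget = true; // reset for each candidate
            if (goodTarget) {
                chosen++;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        System.out.println(chooseBuggy(3)); // prints 1
        System.out.println(chooseFixed(3)); // prints 3
    }
}
```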
[jira] [Resolved] (HDFS-5439) Fix TestPendingReplication
[ https://issues.apache.org/jira/browse/HDFS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-5439. - Resolution: Fixed Fix Version/s: Heterogeneous Storage (HDFS-2832) Hadoop Flags: Reviewed I committed this to branch HDFS-2832. Thanks for contributing part of the fix, Junping. Fix TestPendingReplication -- Key: HDFS-5439 URL: https://issues.apache.org/jira/browse/HDFS-5439 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: Heterogeneous Storage (HDFS-2832) Attachments: HDFS-5439-demo1.patch, h5439.04.patch {{TestPendingReplication}} fails with the following exception:
{code}
java.lang.AssertionError: expected:4 but was:3
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.failNotEquals(Assert.java:647)
    at org.junit.Assert.assertEquals(Assert.java:128)
    at org.junit.Assert.assertEquals(Assert.java:472)
    at org.junit.Assert.assertEquals(Assert.java:456)
    at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication.testBlockReceived(TestPendingReplication.java:186)
{code}
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5439) Fix TestPendingReplication
[ https://issues.apache.org/jira/browse/HDFS-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814664#comment-13814664 ] Tsz Wo (Nicholas), SZE commented on HDFS-5439: -- +1 good catch! Fix TestPendingReplication -- Key: HDFS-5439 URL: https://issues.apache.org/jira/browse/HDFS-5439 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: Heterogeneous Storage (HDFS-2832) Attachments: HDFS-5439-demo1.patch, h5439.04.patch {{TestPendingReplication}} fails with the following exception:
{code}
java.lang.AssertionError: expected:4 but was:3
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.failNotEquals(Assert.java:647)
    at org.junit.Assert.assertEquals(Assert.java:128)
    at org.junit.Assert.assertEquals(Assert.java:472)
    at org.junit.Assert.assertEquals(Assert.java:456)
    at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication.testBlockReceived(TestPendingReplication.java:186)
{code}
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5464) Simplify block report diff calculation
[ https://issues.apache.org/jira/browse/HDFS-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814670#comment-13814670 ] Tsz Wo (Nicholas), SZE commented on HDFS-5464: -- h5464_20131105[bc].patch won't work. h5464_20131105.patch works, but it uses more memory and has a longer running time. I agree with Konstantin that the original code is better although it is complicated. I will leave this for a while and see if I can come up with a better solution. Simplify block report diff calculation -- Key: HDFS-5464 URL: https://issues.apache.org/jira/browse/HDFS-5464 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h5464_20131105.patch, h5464_20131105b.patch, h5464_20131105c.patch The current calculation in BlockManager.reportDiff(..) is unnecessarily complicated. We could simplify the calculation. -- This message was sent by Atlassian JIRA (v6.1#6144)