[jira] [Commented] (HDFS-7438) Consolidate implementation of rename()

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225897#comment-14225897
 ] 

Hadoop QA commented on HDFS-7438:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683763/HDFS-7438.001.patch
  against trunk revision 4a31611.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestDFSPermission
  org.apache.hadoop.security.TestPermissionSymlinks
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestDisallowModifyROSnapshot
  org.apache.hadoop.fs.permission.TestStickyBit

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8843//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8843//console

This message is automatically generated.

 Consolidate implementation of rename()
 --

 Key: HDFS-7438
 URL: https://issues.apache.org/jira/browse/HDFS-7438
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7438.000.patch, HDFS-7438.001.patch


 The implementation of {{rename()}} resides in both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate them in a single class.
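
 For illustration, a minimal sketch of the consolidation pattern being proposed;
 all class and method names below are assumptions, not the attached patch
 (HDFS-7440 later in this digest shows the analogous, committed
 {{FSDirSnapshotOp}}). The namesystem keeps its locking while the rename logic
 moves to static methods on one op class:
{code}
// Hypothetical op class holding all rename logic in one place.
class FSDirRenameOpSketch {
  static boolean renameTo(FSDirectorySketch fsd, String src, String dst) {
    // ... validate paths, check permissions, mutate the inode tree ...
    return fsd.unprotectedRename(src, dst);
  }
}

class FSDirectorySketch {
  boolean unprotectedRename(String src, String dst) {
    return true; // placeholder for the actual tree mutation
  }
}

class FSNamesystemSketch {
  private final FSDirectorySketch dir = new FSDirectorySketch();

  boolean renameTo(String src, String dst) {
    // writeLock(); try { ... } finally { writeUnlock(); } -- locking stays here
    return FSDirRenameOpSketch.renameTo(dir, src, dst);
  }
}
{code}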



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7447) Maximum number of ACL entries on a File/Folder should be made user configurable rather than hardcoded

2014-11-26 Thread J.Andreina (JIRA)
J.Andreina created HDFS-7447:


 Summary: Maximum number of ACL entries on a File/Folder should be 
made user configurable rather than hardcoded
 Key: HDFS-7447
 URL: https://issues.apache.org/jira/browse/HDFS-7447
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: security
Reporter: J.Andreina



By default, creating folder1 gives it 6 ACL entries. On top of that, once the 
ACL entries assigned on folder1 exceed 32, we are unable to assign ACLs for a 
group/user to folder1.
{noformat}
2014-11-20 18:55:06,553 ERROR [qtp1279235236-17 - /rolexml/role/modrole] Error 
occured while setting permissions for Resource:[ hdfs://hacluster/folder1 ] and 
Error message is : Invalid ACL: ACL has 33 entries, which exceeds maximum of 32.
at 
org.apache.hadoop.hdfs.server.namenode.AclTransformation.buildAndValidateAcl(AclTransformation.java:274)
at 
org.apache.hadoop.hdfs.server.namenode.AclTransformation.mergeAclEntries(AclTransformation.java:181)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedModifyAclEntries(FSDirectory.java:2771)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.modifyAclEntries(FSDirectory.java:2757)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.modifyAclEntries(FSNamesystem.java:7734)
{noformat}

Here the value 32 is hardcoded, which can be made user configurable. 

{noformat}
private static List<AclEntry> buildAndValidateAcl(ArrayList<AclEntry> aclBuilder)
    throws AclException
{
    if (aclBuilder.size() > 32)
        throw new AclException((new StringBuilder()).append("Invalid ACL: ACL has ")
            .append(aclBuilder.size()).append(" entries, which exceeds maximum of ")
            .append(32).append(".").toString());
    :
    :
}
{noformat}
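
A minimal sketch of the requested change, assuming a new configuration key 
{{dfs.namenode.acls.max.entries}} (a hypothetical name, not an existing Hadoop 
property) that defaults to today's hardcoded 32:

{code}
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.protocol.AclException;

class AclLimitSketch {
  // Hypothetical key and default; only the 32 comes from the current code.
  static final String MAX_ACL_ENTRIES_KEY = "dfs.namenode.acls.max.entries";
  static final int MAX_ACL_ENTRIES_DEFAULT = 32;

  private final int maxEntries;

  AclLimitSketch(Configuration conf) {
    this.maxEntries = conf.getInt(MAX_ACL_ENTRIES_KEY, MAX_ACL_ENTRIES_DEFAULT);
  }

  // Same check as buildAndValidateAcl, with the limit read from configuration.
  void validateSize(List<?> aclBuilder) throws AclException {
    if (aclBuilder.size() > maxEntries) {
      throw new AclException("Invalid ACL: ACL has " + aclBuilder.size()
          + " entries, which exceeds maximum of " + maxEntries + ".");
    }
  }
}
{code}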



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6133) Make Balancer support exclude specified path

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225982#comment-14225982
 ] 

Hadoop QA commented on HDFS-6133:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683784/HDFS-6133-5.patch
  against trunk revision 4a31611.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8845//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8845//console

This message is automatically generated.

 Make Balancer support exclude specified path
 

 Key: HDFS-6133
 URL: https://issues.apache.org/jira/browse/HDFS-6133
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover, namenode
Reporter: zhaoyunjiong
Assignee: zhaoyunjiong
 Attachments: HDFS-6133-1.patch, HDFS-6133-2.patch, HDFS-6133-3.patch, 
 HDFS-6133-4.patch, HDFS-6133-5.patch, HDFS-6133.patch


 Currently, running the Balancer will destroy the RegionServer's data locality.
 If getBlocks could exclude blocks belonging to files which have a specific path 
 prefix, like /hbase, then we could run the Balancer without destroying the 
 RegionServer's data locality.
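
 As a rough illustration of the request (names here are illustrative, not the
 attached patches): drop any block whose owning file sits under an excluded
 prefix before the Balancer considers it.
{code}
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Sketch of an exclude-path filter; the real getBlocks() works on block
// objects, so assume each block can report the path of its owning file.
class ExcludePathFilterSketch {
  private final Set<String> excludedPrefixes; // e.g. {"/hbase"}

  ExcludePathFilterSketch(Set<String> excludedPrefixes) {
    this.excludedPrefixes = excludedPrefixes;
  }

  boolean isExcluded(String owningFilePath) {
    return excludedPrefixes.stream().anyMatch(owningFilePath::startsWith);
  }

  List<String> filter(List<String> owningFilePaths) {
    return owningFilePaths.stream()
        .filter(p -> !isExcluded(p))
        .collect(Collectors.toList());
  }
}
{code}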



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7310) Mover can give first priority to local DN if it has target storage type available in local DN

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225981#comment-14225981
 ] 

Hadoop QA commented on HDFS-7310:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683779/HDFS-7310-004.patch
  against trunk revision 4a31611.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8844//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8844//console

This message is automatically generated.

 Mover can give first priority to local DN if it has target storage type 
 available in local DN
 -

 Key: HDFS-7310
 URL: https://issues.apache.org/jira/browse/HDFS-7310
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Vinayakumar B
 Attachments: HDFS-7310-001.patch, HDFS-7310-002.patch, 
 HDFS-7310-003.patch, HDFS-7310-004.patch


 Currently the Mover logic may move blocks to any DN which has the target storage 
 type. But if the source DN has the target storage type, then the Mover can give 
 highest priority to the local DN. If the local DN does not contain the target 
 storage type, then it can assign to any DN, as the current logic does.
   This is just a thought; I have not gone through the code fully yet.
 Thoughts?
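
 A hedged sketch of the proposed ordering (helper names are assumptions): try
 the source DN first when it carries the target storage type, otherwise fall
 back to any candidate that does, as today.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class MoverTargetOrderSketch {
  // Returns candidate targets with the local (source) DN first when eligible.
  static <D> List<D> orderTargets(D sourceDn, List<D> candidates,
                                  Predicate<D> hasTargetStorageType) {
    List<D> ordered = new ArrayList<>();
    if (hasTargetStorageType.test(sourceDn)) {
      ordered.add(sourceDn);            // highest priority: stay local
    }
    for (D dn : candidates) {
      if (!dn.equals(sourceDn) && hasTargetStorageType.test(dn)) {
        ordered.add(dn);                // then any other DN with the type
      }
    }
    return ordered;
  }
}
{code}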



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-11-26 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-6803:
-
Assignee: stack

 Documenting DFSClient#DFSInputStream expectations reading and preading in 
 concurrent context
 

 Key: HDFS-6803
 URL: https://issues.apache.org/jira/browse/HDFS-6803
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 2.4.1
Reporter: stack
Assignee: stack
 Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
 DocumentingDFSClientDFSInputStream.v2.pdf, HDFS-6803v2.txt, HDFS-6803v3.txt, 
 fsdatainputstream.md.v3.html


 Reviews of the patch posted on the parent task suggest that we be more explicit 
 about how DFSIS is expected to behave when being read by contending threads. 
 It is also suggested that presumptions made internally be made explicit by 
 documenting expectations.
 Before we put up a patch we've made a document of assertions we'd like to 
 make into tenets of DFSInputStream.  If there is agreement, we'll attach to 
 this issue a patch that weaves the assumptions into DFSIS as javadoc and class 
 comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-11-26 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HDFS-6803.
--
   Resolution: Fixed
Fix Version/s: 2.7.0

Committed patch - thanks!

 Documenting DFSClient#DFSInputStream expectations reading and preading in 
 concurrent context
 

 Key: HDFS-6803
 URL: https://issues.apache.org/jira/browse/HDFS-6803
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 2.4.1
Reporter: stack
Assignee: stack
 Fix For: 2.7.0

 Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
 DocumentingDFSClientDFSInputStream.v2.pdf, HDFS-6803v2.txt, HDFS-6803v3.txt, 
 fsdatainputstream.md.v3.html


 Reviews of the patch posted on the parent task suggest that we be more explicit 
 about how DFSIS is expected to behave when being read by contending threads. 
 It is also suggested that presumptions made internally be made explicit by 
 documenting expectations.
 Before we put up a patch we've made a document of assertions we'd like to 
 make into tenets of DFSInputStream.  If there is agreement, we'll attach to 
 this issue a patch that weaves the assumptions into DFSIS as javadoc and class 
 comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7440) Consolidate snapshot related operations in a single class

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226069#comment-14226069
 ] 

Hudson commented on HDFS-7440:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #17 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/17/])
HDFS-7440. Consolidate snapshot related operations in a single class. 
Contributed by Haohui Mai. (wheat9: rev 
4a3161182905afaf450a60d02528161ed1f97471)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java


 Consolidate snapshot related operations in a single class
 -

 Key: HDFS-7440
 URL: https://issues.apache.org/jira/browse/HDFS-7440
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7440.000.patch, HDFS-7440.001.patch, 
 HDFS-7440.002.patch


 Currently the snapshot-related code scatters across both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate the logic in a single 
 class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226067#comment-14226067
 ] 

Hudson commented on HDFS-7097:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #17 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/17/])
HDFS-7097. Allow block reports to be processed during checkpointing on standby 
name node. (kihwal via wang) (wang: rev 
f43a20c529ac3f104add95b222de6580757b3763)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Allow block reports to be processed during checkpointing on standby name node
 -

 Key: HDFS-7097
 URL: https://issues.apache.org/jira/browse/HDFS-7097
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Critical
 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, 
 HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch


 On a reasonably busy HDFS cluster, there is a steady stream of creates, causing 
 data nodes to generate incremental block reports.  When a standby name node is 
 checkpointing, RPC handler threads trying to process a full or incremental 
 block report are blocked on the name system's {{fsLock}}, because the 
 checkpointer acquires the read lock on it.  This can create a serious problem 
 if the name space is big and checkpointing takes a long time.
 All available RPC handlers can be tied up very quickly. If you have 100 
 handlers, it only takes 34 file creates.  If a separate service RPC port is 
 not used, HA transition will have to wait in the call queue for minutes. Even 
 if a separate service RPC port is configured, heartbeats from datanodes will 
 be blocked. A standby NN with a big name space can lose all data nodes after 
 checkpointing.  The RPC calls will also be retransmitted by data nodes many 
 times, filling up the call queue and potentially causing listen queue 
 overflow.
 Since block reports are not modifying any state that is being saved to 
 fsimage, I propose letting them through during checkpointing. 
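
 A standalone sketch (plain {{java.util.concurrent}}, not HDFS code) of the
 contention described above: one thread holds the read lock for a long
 "checkpoint" while another needs the write lock to process a "block report"
 and stays parked until the read lock drops.
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CheckpointLockSketch {
  private static final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);

  public static void main(String[] args) throws InterruptedException {
    Thread checkpointer = new Thread(() -> {
      fsLock.readLock().lock();         // held for the whole "checkpoint"
      try { Thread.sleep(5_000); } catch (InterruptedException ignored) {}
      fsLock.readLock().unlock();
    });
    Thread handler = new Thread(() -> {
      fsLock.writeLock().lock();        // "block report" needs the write lock
      try { System.out.println("block report processed"); }
      finally { fsLock.writeLock().unlock(); }
    });
    checkpointer.start();
    Thread.sleep(100);                  // ensure the read lock is taken first
    handler.start();                    // blocks ~5s behind the checkpointer
    handler.join();
  }
}
{code}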



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient

2014-11-26 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226076#comment-14226076
 ] 

Yi Liu commented on HDFS-7435:
--

I opened HADOOP-11339 to reuse the buffer for Hadoop RPC, as stated in my 
comment above. 

 PB encoding of block reports is very inefficient
 

 Key: HDFS-7435
 URL: https://issues.apache.org/jira/browse/HDFS-7435
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HDFS-7435.000.patch, HDFS-7435.patch


 Block reports are encoded as a PB repeating long.  Repeating fields use an 
 {{ArrayList}} with a default capacity of 10.  A block report containing tens or 
 hundreds of thousands of longs (3 for each replica) is extremely expensive 
 since the {{ArrayList}} must realloc many times.  Also, decoding repeating 
 fields will box the primitive longs, which must then be unboxed.
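
 A rough sketch of the two costs named above (the report size is an assumed
 number, not a benchmark): boxing one {{Long}} per value plus repeated realloc
 from the default capacity of 10, versus a primitive array sized up front.
{code}
import java.util.ArrayList;
import java.util.List;

public class BlockReportDecodeSketch {
  public static void main(String[] args) {
    int replicas = 200_000;                  // hypothetical report size
    List<Long> decoded = new ArrayList<>();  // default capacity 10 -> many reallocs
    for (long i = 0; i < replicas * 3L; i++) {
      decoded.add(i);                        // autoboxing: one Long object per value
    }
    long[] raw = new long[replicas * 3];     // one allocation, no boxing
    for (int i = 0; i < raw.length; i++) {
      raw[i] = i;
    }
    System.out.println(decoded.size() + " boxed vs " + raw.length + " primitive longs");
  }
}
{code}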



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226079#comment-14226079
 ] 

Hudson commented on HDFS-6803:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6609 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6609/])
HDFS-6803 Document DFSClient#DFSInputStream expectations reading and preading  
in concurrent context. (stack via stevel) (stevel: rev 
aa7dac335960950d2254a5a78bd1f0786a290538)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md


 Documenting DFSClient#DFSInputStream expectations reading and preading in 
 concurrent context
 

 Key: HDFS-6803
 URL: https://issues.apache.org/jira/browse/HDFS-6803
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 2.4.1
Reporter: stack
Assignee: stack
 Fix For: 2.7.0

 Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
 DocumentingDFSClientDFSInputStream.v2.pdf, HDFS-6803v2.txt, HDFS-6803v3.txt, 
 fsdatainputstream.md.v3.html


 Reviews of the patch posted on the parent task suggest that we be more explicit 
 about how DFSIS is expected to behave when being read by contending threads. 
 It is also suggested that presumptions made internally be made explicit by 
 documenting expectations.
 Before we put up a patch we've made a document of assertions we'd like to 
 make into tenets of DFSInputStream.  If there is agreement, we'll attach to 
 this issue a patch that weaves the assumptions into DFSIS as javadoc and class 
 comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226087#comment-14226087
 ] 

Hudson commented on HDFS-7097:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #755 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/755/])
HDFS-7097. Allow block reports to be processed during checkpointing on standby 
name node. (kihwal via wang) (wang: rev 
f43a20c529ac3f104add95b222de6580757b3763)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java


 Allow block reports to be processed during checkpointing on standby name node
 -

 Key: HDFS-7097
 URL: https://issues.apache.org/jira/browse/HDFS-7097
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Critical
 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, 
 HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch


 On a reasonably busy HDFS cluster, there is a steady stream of creates, causing 
 data nodes to generate incremental block reports.  When a standby name node is 
 checkpointing, RPC handler threads trying to process a full or incremental 
 block report are blocked on the name system's {{fsLock}}, because the 
 checkpointer acquires the read lock on it.  This can create a serious problem 
 if the name space is big and checkpointing takes a long time.
 All available RPC handlers can be tied up very quickly. If you have 100 
 handlers, it only takes 34 file creates.  If a separate service RPC port is 
 not used, HA transition will have to wait in the call queue for minutes. Even 
 if a separate service RPC port is configured, heartbeats from datanodes will 
 be blocked. A standby NN with a big name space can lose all data nodes after 
 checkpointing.  The RPC calls will also be retransmitted by data nodes many 
 times, filling up the call queue and potentially causing listen queue 
 overflow.
 Since block reports are not modifying any state that is being saved to 
 fsimage, I propose letting them through during checkpointing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7440) Consolidate snapshot related operations in a single class

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226089#comment-14226089
 ] 

Hudson commented on HDFS-7440:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #755 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/755/])
HDFS-7440. Consolidate snapshot related operations in a single class. 
Contributed by Haohui Mai. (wheat9: rev 
4a3161182905afaf450a60d02528161ed1f97471)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java


 Consolidate snapshot related operations in a single class
 -

 Key: HDFS-7440
 URL: https://issues.apache.org/jira/browse/HDFS-7440
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7440.000.patch, HDFS-7440.001.patch, 
 HDFS-7440.002.patch


 Currently the snapshot-related code scatters across both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate the logic in a single 
 class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7414) Namenode got shutdown and can't recover where edit update might be missed

2014-11-26 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226141#comment-14226141
 ] 

Brahma Reddy Battula commented on HDFS-7414:


Saw the following error in the namenode log when it initially came up:

{noformat}
2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
CloseOp [length=0, inodeId=0, 
path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
 replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
clientMachine=, opCode=OP_CLOSE, txid=162982] | 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
{noformat}

And let me describe the full scenario:

We were deleting a dir (which has 100 files) while running a mapreduce job.

I think that while getting the last INode in the path ({{getLastINodeInPath}}) 
it may get null, since I am deleting the dir and its edits will not have been 
synced:

{code}
  final INodesInPath iip = fsDir.getLastINodeInPath(path);
  final INodeFile file = INodeFile.valueOf(iip.getINode(0), path);
{code}

Anyone have any thoughts on this?
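
To make the suspected null concrete, an extension of the fragment above (a 
sketch under that assumption, not a proposed fix): if the dir was already 
deleted, the resolved path carries no inode, and {{INodeFile.valueOf(null, 
path)}} is what throws "File does not exist" during replay.

{code}
final INodesInPath iip = fsDir.getLastINodeInPath(path);
final INode inode = iip.getINode(0);
if (inode == null) {
  // hypothetical guard: report (or skip) the stale CloseOp instead of
  // letting valueOf() abort the whole edit-log load
  throw new IOException("CloseOp replay: no inode for deleted path " + path);
}
final INodeFile file = INodeFile.valueOf(inode, path);
{code}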




 Namenode got shutdown and can't recover where edit update might be missed
 -

 Key: HDFS-7414
 URL: https://issues.apache.org/jira/browse/HDFS-7414
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.4.1, 2.5.1
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Blocker

 Scenario:
 
 Was running a mapreduce job.
 CPU usage crossed 190% for the Datanode and the machine became slow,
 and we saw the following exception. 
  *Did not get the exact root cause, but with CPU usage that high an edit log 
 update might have been missed... Need to dig more... does anyone have any 
 thoughts?* 
 {noformat}
 2014-11-20 05:01:18,430 | ERROR | main | Encountered exception on operation 
 CloseOp [length=0, inodeId=0, 
 path=/outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025,
  replication=2, mtime=1416409309023, atime=1416409290816, blockSize=67108864, 
 blocks=[blk_1073766144_25321, blk_1073766154_25331, blk_1073766160_25337], 
 permissions=mapred:supergroup:rw-r--r--, aclEntries=null, clientName=, 
 clientMachine=, opCode=OP_CLOSE, txid=162982] | 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:224)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:133)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:805)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:665)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:272)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:893)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:640)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:519)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:575)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:741)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:724)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1387)
 at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1459)
 2014-11-20 05:01:18,654 | WARN  | main | Encountered exception loading 
 fsimage | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:642)
 java.io.FileNotFoundException: File does not exist: 
 /outDir2/_temporary/1/_temporary/attempt_1416390004064_0002_m_25_1/part-m-00025
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
 at 
 org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:409)
 at 
 

[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226238#comment-14226238
 ] 

Hudson commented on HDFS-7097:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1945 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1945/])
HDFS-7097. Allow block reports to be processed during checkpointing on standby 
name node. (kihwal via wang) (wang: rev 
f43a20c529ac3f104add95b222de6580757b3763)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java


 Allow block reports to be processed during checkpointing on standby name node
 -

 Key: HDFS-7097
 URL: https://issues.apache.org/jira/browse/HDFS-7097
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Critical
 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, 
 HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch


 On a reasonably busy HDFS cluster, there is a steady stream of creates, causing 
 data nodes to generate incremental block reports.  When a standby name node is 
 checkpointing, RPC handler threads trying to process a full or incremental 
 block report are blocked on the name system's {{fsLock}}, because the 
 checkpointer acquires the read lock on it.  This can create a serious problem 
 if the name space is big and checkpointing takes a long time.
 All available RPC handlers can be tied up very quickly. If you have 100 
 handlers, it only takes 34 file creates.  If a separate service RPC port is 
 not used, HA transition will have to wait in the call queue for minutes. Even 
 if a separate service RPC port is configured, heartbeats from datanodes will 
 be blocked. A standby NN with a big name space can lose all data nodes after 
 checkpointing.  The RPC calls will also be retransmitted by data nodes many 
 times, filling up the call queue and potentially causing listen queue 
 overflow.
 Since block reports are not modifying any state that is being saved to 
 fsimage, I propose letting them through during checkpointing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226248#comment-14226248
 ] 

Hudson commented on HDFS-7097:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #17 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/17/])
HDFS-7097. Allow block reports to be processed during checkpointing on standby 
name node. (kihwal via wang) (wang: rev 
f43a20c529ac3f104add95b222de6580757b3763)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java


 Allow block reports to be processed during checkpointing on standby name node
 -

 Key: HDFS-7097
 URL: https://issues.apache.org/jira/browse/HDFS-7097
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Critical
 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, 
 HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch


 On a reasonably busy HDFS cluster, there is a steady stream of creates, causing 
 data nodes to generate incremental block reports.  When a standby name node is 
 checkpointing, RPC handler threads trying to process a full or incremental 
 block report are blocked on the name system's {{fsLock}}, because the 
 checkpointer acquires the read lock on it.  This can create a serious problem 
 if the name space is big and checkpointing takes a long time.
 All available RPC handlers can be tied up very quickly. If you have 100 
 handlers, it only takes 34 file creates.  If a separate service RPC port is 
 not used, HA transition will have to wait in the call queue for minutes. Even 
 if a separate service RPC port is configured, heartbeats from datanodes will 
 be blocked. A standby NN with a big name space can lose all data nodes after 
 checkpointing.  The RPC calls will also be retransmitted by data nodes many 
 times, filling up the call queue and potentially causing listen queue 
 overflow.
 Since block reports are not modifying any state that is being saved to 
 fsimage, I propose letting them through during checkpointing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7440) Consolidate snapshot related operations in a single class

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226240#comment-14226240
 ] 

Hudson commented on HDFS-7440:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1945 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1945/])
HDFS-7440. Consolidate snapshot related operations in a single class. 
Contributed by Haohui Mai. (wheat9: rev 
4a3161182905afaf450a60d02528161ed1f97471)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java


 Consolidate snapshot related operations in a single class
 -

 Key: HDFS-7440
 URL: https://issues.apache.org/jira/browse/HDFS-7440
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7440.000.patch, HDFS-7440.001.patch, 
 HDFS-7440.002.patch


 Currently the snapshot-related code scatters across both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate the logic in a single 
 class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7440) Consolidate snapshot related operations in a single class

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226250#comment-14226250
 ] 

Hudson commented on HDFS-7440:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #17 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/17/])
HDFS-7440. Consolidate snapshot related operations in a single class. 
Contributed by Haohui Mai. (wheat9: rev 
4a3161182905afaf450a60d02528161ed1f97471)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java


 Consolidate snapshot related operations in a single class
 -

 Key: HDFS-7440
 URL: https://issues.apache.org/jira/browse/HDFS-7440
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7440.000.patch, HDFS-7440.001.patch, 
 HDFS-7440.002.patch


 Currently the snapshot-related code scatters across both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate the logic in a single 
 class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6407) new namenode UI, lost ability to sort columns in datanode tab

2014-11-26 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226255#comment-14226255
 ] 

Benoy Antony commented on HDFS-6407:


Why not use the plugin?
It's a widely used plugin, used in other Apache projects like Storm. Based on 
its history, it's quite mature. It has a small size of 30 KB. It seemed to 
function very well during my testing. Instead of re-inventing the wheel, why 
not use a standard solution?

Enabling sorting on a specific table with the plugin is quite straightforward, 
as I explained in the comment above. 
It is desirable to be able to sort any table. Currently, I added sorting on 
data nodes, snapshots and filesystem browsing.  But if a table doesn't need 
sorting for some reason, it can easily be removed. 

BTW, pagination was never supported. It is normally needed because the client 
doesn't have all the data, which is not the case here.



 new namenode UI, lost ability to sort columns in datanode tab
 -

 Key: HDFS-6407
 URL: https://issues.apache.org/jira/browse/HDFS-6407
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.4.0
Reporter: Nathan Roberts
Assignee: Benoy Antony
Priority: Minor
 Attachments: HDFS-6407.patch, browse_directory.png, datanodes.png, 
 snapshots.png


 The old ui supported clicking on a column header to sort on that column. The 
 new ui seems to have dropped this very useful feature.
 There are a few tables in the Namenode UI to display datanode information, 
 directory listings and snapshots.
 When there are many items in the tables, it is useful to have the ability to 
 sort on the different columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient

2014-11-26 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226276#comment-14226276
 ] 

Daryn Sharp commented on HDFS-7435:
---

Upon quick scan, it appears that the chunk size is encoded in the buffer?  If 
yes, I think each side should be able to choose an appropriate chunk size - as 
an implementation detail - which is not leaked into the PB encoding.  If we can 
agree on that point, can we move the segmentation to a separate jira?

 PB encoding of block reports is very inefficient
 

 Key: HDFS-7435
 URL: https://issues.apache.org/jira/browse/HDFS-7435
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HDFS-7435.000.patch, HDFS-7435.patch


 Block reports are encoded as a PB repeating long.  Repeating fields use an 
 {{ArrayList}} with a default capacity of 10.  A block report containing tens or 
 hundreds of thousands of longs (3 for each replica) is extremely expensive 
 since the {{ArrayList}} must realloc many times.  Also, decoding repeating 
 fields will box the primitive longs, which must then be unboxed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226294#comment-14226294
 ] 

Hudson commented on HDFS-6803:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1969 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1969/])
HDFS-6803 Document DFSClient#DFSInputStream expectations reading and preading  
in concurrent context. (stack via stevel) (stevel: rev 
aa7dac335960950d2254a5a78bd1f0786a290538)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md


 Documenting DFSClient#DFSInputStream expectations reading and preading in 
 concurrent context
 

 Key: HDFS-6803
 URL: https://issues.apache.org/jira/browse/HDFS-6803
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 2.4.1
Reporter: stack
Assignee: stack
 Fix For: 2.7.0

 Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
 DocumentingDFSClientDFSInputStream.v2.pdf, HDFS-6803v2.txt, HDFS-6803v3.txt, 
 fsdatainputstream.md.v3.html


 Reviews of the patch posted on the parent task suggest that we be more explicit 
 about how DFSIS is expected to behave when being read by contending threads. 
 It is also suggested that presumptions made internally be made explicit by 
 documenting expectations.
 Before we put up a patch we've made a document of assertions we'd like to 
 make into tenets of DFSInputStream.  If there is agreement, we'll attach to 
 this issue a patch that weaves the assumptions into DFSIS as javadoc and class 
 comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226302#comment-14226302
 ] 

Hudson commented on HDFS-7097:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1969 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1969/])
HDFS-7097. Allow block reports to be processed during checkpointing on standby 
name node. (kihwal via wang) (wang: rev 
f43a20c529ac3f104add95b222de6580757b3763)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java


 Allow block reports to be processed during checkpointing on standby name node
 -

 Key: HDFS-7097
 URL: https://issues.apache.org/jira/browse/HDFS-7097
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Critical
 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, 
 HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch


 On a reasonably busy HDFS cluster, there is a steady stream of creates, causing 
 data nodes to generate incremental block reports.  When a standby name node is 
 checkpointing, RPC handler threads trying to process a full or incremental 
 block report are blocked on the name system's {{fsLock}}, because the 
 checkpointer acquires the read lock on it.  This can create a serious problem 
 if the name space is big and checkpointing takes a long time.
 All available RPC handlers can be tied up very quickly. If you have 100 
 handlers, it only takes 34 file creates.  If a separate service RPC port is 
 not used, HA transition will have to wait in the call queue for minutes. Even 
 if a separate service RPC port is configured, heartbeats from datanodes will 
 be blocked. A standby NN with a big name space can lose all data nodes after 
 checkpointing.  The RPC calls will also be retransmitted by data nodes many 
 times, filling up the call queue and potentially causing listen queue 
 overflow.
 Since block reports are not modifying any state that is being saved to 
 fsimage, I propose letting them through during checkpointing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7440) Consolidate snapshot related operations in a single class

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226304#comment-14226304
 ] 

Hudson commented on HDFS-7440:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1969 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1969/])
HDFS-7440. Consolidate snapshot related operations in a single class. 
Contributed by Haohui Mai. (wheat9: rev 
4a3161182905afaf450a60d02528161ed1f97471)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java


 Consolidate snapshot related operations in a single class
 -

 Key: HDFS-7440
 URL: https://issues.apache.org/jira/browse/HDFS-7440
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7440.000.patch, HDFS-7440.001.patch, 
 HDFS-7440.002.patch


 Currently the snapshot-related code scatters across both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate the logic in a single 
 class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226311#comment-14226311
 ] 

Hudson commented on HDFS-6803:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #17 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/17/])
HDFS-6803 Document DFSClient#DFSInputStream expectations reading and preading  
in concurrent context. (stack via stevel) (stevel: rev 
aa7dac335960950d2254a5a78bd1f0786a290538)
* 
hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Documenting DFSClient#DFSInputStream expectations reading and preading in 
 concurrent context
 

 Key: HDFS-6803
 URL: https://issues.apache.org/jira/browse/HDFS-6803
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 2.4.1
Reporter: stack
Assignee: stack
 Fix For: 2.7.0

 Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
 DocumentingDFSClientDFSInputStream.v2.pdf, HDFS-6803v2.txt, HDFS-6803v3.txt, 
 fsdatainputstream.md.v3.html


 Reviews of the patch posted on the parent task suggest that we be more explicit 
 about how DFSIS is expected to behave when being read by contending threads. 
 It is also suggested that presumptions made internally be made explicit by 
 documenting expectations.
 Before we put up a patch we've made a document of assertions we'd like to 
 make into tenets of DFSInputStream.  If there is agreement, we'll attach to 
 this issue a patch that weaves the assumptions into DFSIS as javadoc and class 
 comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226319#comment-14226319
 ] 

Hudson commented on HDFS-7097:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #17 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/17/])
HDFS-7097. Allow block reports to be processed during checkpointing on standby 
name node. (kihwal via wang) (wang: rev 
f43a20c529ac3f104add95b222de6580757b3763)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java


 Allow block reports to be processed during checkpointing on standby name node
 -

 Key: HDFS-7097
 URL: https://issues.apache.org/jira/browse/HDFS-7097
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Critical
 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, 
 HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch


 On a reasonably busy HDFS cluster, there is a steady stream of creates, causing 
 data nodes to generate incremental block reports.  When a standby name node is 
 checkpointing, RPC handler threads trying to process a full or incremental 
 block report are blocked on the name system's {{fsLock}}, because the 
 checkpointer acquires the read lock on it.  This can create a serious problem 
 if the name space is big and checkpointing takes a long time.
 All available RPC handlers can be tied up very quickly. If you have 100 
 handlers, it only takes 34 file creates.  If a separate service RPC port is 
 not used, HA transition will have to wait in the call queue for minutes. Even 
 if a separate service RPC port is configured, heartbeats from datanodes will 
 be blocked. A standby NN with a big name space can lose all data nodes after 
 checkpointing.  The RPC calls will also be retransmitted by data nodes many 
 times, filling up the call queue and potentially causing listen queue 
 overflow.
 Since block reports are not modifying any state that is being saved to 
 fsimage, I propose letting them through during checkpointing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7440) Consolidate snapshot related operations in a single class

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226321#comment-14226321
 ] 

Hudson commented on HDFS-7440:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #17 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/17/])
HDFS-7440. Consolidate snapshot related operations in a single class. 
Contributed by Haohui Mai. (wheat9: rev 
4a3161182905afaf450a60d02528161ed1f97471)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirSnapshotOp.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java


 Consolidate snapshot related operations in a single class
 -

 Key: HDFS-7440
 URL: https://issues.apache.org/jira/browse/HDFS-7440
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7440.000.patch, HDFS-7440.001.patch, 
 HDFS-7440.002.patch


 Currently the snapshot-related code scatters across both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate the logic in a single 
 class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7448) TestBookKeeperHACheckpoints fails in trunk build

2014-11-26 Thread Ted Yu (JIRA)
Ted Yu created HDFS-7448:


 Summary: TestBookKeeperHACheckpoints fails in trunk build
 Key: HDFS-7448
 URL: https://issues.apache.org/jira/browse/HDFS-7448
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor


The test failed against both java 7 and java 8.
From https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/17/console :
{code}
testStandbyExceptionThrownDuringCheckpoint(org.apache.hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints)
  Time elapsed: 6.822 sec  <<< ERROR!
org.apache.hadoop.ipc.RemoteException: File /testFile could only be replicated 
to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and 
no node(s) are excluded in this operation.
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1558)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3024)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:699)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2125)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2121)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1683)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2119)

at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
at com.sun.proxy.$Proxy21.addBlock(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1544)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
{code}
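
The "0 datanode(s) running" message means the write started before any datanode 
had registered with the namenode. A hedged sketch of the usual guard in 
{{MiniDFSCluster}}-based tests (illustrative of the failure mode, not 
necessarily the fix this JIRA needs):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class WaitActiveSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .numDataNodes(1)
        .build();
    try {
      cluster.waitActive();                     // block until the DN registers
      FileSystem fs = cluster.getFileSystem();
      fs.create(new Path("/testFile")).close(); // now a replication target exists
    } finally {
      cluster.shutdown();
    }
  }
}
{code}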



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-11-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226343#comment-14226343
 ] 

stack commented on HDFS-6803:
-

[~stev...@iseran.com] Where do I send the $1.00?

 Documenting DFSClient#DFSInputStream expectations reading and preading in 
 concurrent context
 

 Key: HDFS-6803
 URL: https://issues.apache.org/jira/browse/HDFS-6803
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 2.4.1
Reporter: stack
Assignee: stack
 Fix For: 2.7.0

 Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
 DocumentingDFSClientDFSInputStream.v2.pdf, HDFS-6803v2.txt, HDFS-6803v3.txt, 
 fsdatainputstream.md.v3.html


 Reviews of the patch posted on the parent task suggest that we be more explicit 
 about how DFSIS is expected to behave when being read by contending threads. 
 It is also suggested that presumptions made internally be made explicit by 
 documenting expectations.
 Before we put up a patch we've made a document of assertions we'd like to 
 make into tenets of DFSInputStream.  If there is agreement, we'll attach to 
 this issue a patch that weaves the assumptions into DFSIS as javadoc and class 
 comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context

2014-11-26 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226405#comment-14226405
 ] 

Steve Loughran commented on HDFS-6803:
--

hang on to it until I need you to vote up something of mine

 Documenting DFSClient#DFSInputStream expectations reading and preading in 
 concurrent context
 

 Key: HDFS-6803
 URL: https://issues.apache.org/jira/browse/HDFS-6803
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Affects Versions: 2.4.1
Reporter: stack
Assignee: stack
 Fix For: 2.7.0

 Attachments: 9117.md.txt, DocumentingDFSClientDFSInputStream (1).pdf, 
 DocumentingDFSClientDFSInputStream.v2.pdf, HDFS-6803v2.txt, HDFS-6803v3.txt, 
 fsdatainputstream.md.v3.html


 Reviews of the patch posted to the parent task suggest that we be more explicit 
 about how DFSIS is expected to behave when being read by contending threads. 
 It is also suggested that presumptions made internally be made explicit by 
 documenting expectations.
 Before we put up a patch we've made a document of assertions we'd like to 
 make into tenets of DFSInputStream.  If there is agreement, we'll attach to this 
 issue a patch that weaves the assumptions into DFSIS as javadoc and class comments. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7448) TestBookKeeperHACheckpoints fails in trunk build

2014-11-26 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA reassigned HDFS-7448:
---

Assignee: Akira AJISAKA

 TestBookKeeperHACheckpoints fails in trunk build
 

 Key: HDFS-7448
 URL: https://issues.apache.org/jira/browse/HDFS-7448
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Akira AJISAKA
Priority: Minor

 The test failed against both java 7 and java 8.
 From https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/17/console :
 {code}
 testStandbyExceptionThrownDuringCheckpoint(org.apache.hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints)
   Time elapsed: 6.822 sec  <<< ERROR!
 org.apache.hadoop.ipc.RemoteException: File /testFile could only be 
 replicated to 0 nodes instead of minReplication (=1).  There are 0 
 datanode(s) running and no node(s) are excluded in this operation.
   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1558)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3024)
   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:699)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2125)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2121)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1683)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2119)
   at org.apache.hadoop.ipc.Client.call(Client.java:1468)
   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
   at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
   at com.sun.proxy.$Proxy21.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1544)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node

2014-11-26 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226450#comment-14226450
 ] 

Akira AJISAKA commented on HDFS-7097:
-

The patch was committed to branch-2 and trunk. Updating the status.

 Allow block reports to be processed during checkpointing on standby name node
 -

 Key: HDFS-7097
 URL: https://issues.apache.org/jira/browse/HDFS-7097
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Critical
 Fix For: 2.7.0

 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, 
 HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch


 On a reasonably busy HDFS cluster, there is a stream of creates, causing data 
 nodes to generate incremental block reports.  When a standby name node is 
 checkpointing, RPC handler threads trying to process a full or incremental 
 block report are blocked on the name system's {{fsLock}}, because the 
 checkpointer acquires the read lock on it.  This can create a serious problem 
 if the size of the name space is big and checkpointing takes a long time.
 All available RPC handlers can be tied up very quickly. If you have 100 
 handlers, it only takes 34 file creates.  If a separate service RPC port is 
 not used, HA transition will have to wait in the call queue for minutes. Even 
 if a separate service RPC port is configured, heartbeats from datanodes will 
 be blocked. A standby NN with a big name space can lose all data nodes after 
 checkpointing.  The rpc calls will also be retransmitted by data nodes many 
 times, filling up the call queue and potentially causing listen queue 
 overflow.
 Since block reports are not modifying any state that is being saved to 
 fsimage, I propose letting them through during checkpointing. 
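
As an illustration of the lock interaction described above, here is a minimal sketch using a plain {{ReentrantReadWriteLock}} as a stand-in for the name system's {{fsLock}} (a simplification; the real FSNamesystem locking has more moving parts):

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Why every handler stalls: the checkpointer holds the read lock for the
// whole checkpoint, so any thread that needs the write lock (for example,
// to apply a block report) queues behind it until the checkpoint finishes.
ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);

// Checkpointer thread:
fsLock.readLock().lock();
try {
  // save the namespace to fsimage; can take minutes on a large namespace
} finally {
  fsLock.readLock().unlock();
}

// RPC handler thread processing a block report:
fsLock.writeLock().lock();  // blocks until the checkpointer releases readLock
try {
  // update the block map
} finally {
  fsLock.writeLock().unlock();
}
{code}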



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node

2014-11-26 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-7097:

   Resolution: Fixed
Fix Version/s: 2.7.0
   Status: Resolved  (was: Patch Available)

 Allow block reports to be processed during checkpointing on standby name node
 -

 Key: HDFS-7097
 URL: https://issues.apache.org/jira/browse/HDFS-7097
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee
Assignee: Kihwal Lee
Priority: Critical
 Fix For: 2.7.0

 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, 
 HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch


 On a reasonably busy HDFS cluster, there is a stream of creates, causing data 
 nodes to generate incremental block reports.  When a standby name node is 
 checkpointing, RPC handler threads trying to process a full or incremental 
 block report are blocked on the name system's {{fsLock}}, because the 
 checkpointer acquires the read lock on it.  This can create a serious problem 
 if the size of the name space is big and checkpointing takes a long time.
 All available RPC handlers can be tied up very quickly. If you have 100 
 handlers, it only takes 34 file creates.  If a separate service RPC port is 
 not used, HA transition will have to wait in the call queue for minutes. Even 
 if a separate service RPC port is configured, heartbeats from datanodes will 
 be blocked. A standby NN with a big name space can lose all data nodes after 
 checkpointing.  The rpc calls will also be retransmitted by data nodes many 
 times, filling up the call queue and potentially causing listen queue 
 overflow.
 Since block reports are not modifying any state that is being saved to 
 fsimage, I propose letting them through during checkpointing. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7448) TestBookKeeperHACheckpoints fails in trunk build

2014-11-26 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-7448:

Attachment: HDFS-7448-001.patch

Attaching a simple patch to change the number of DNs.
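
The patch itself is not inlined in this thread. As a rough sketch of the kind of change described (the builder calls and DN count below are assumptions, not the actual patch), an HA checkpoint test built on {{MiniDFSCluster}} would bump the datanode count so {{/testFile}} can satisfy minReplication:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.MiniDFSNNTopology;

// Hypothetical sketch: the "replicated to 0 nodes" failure suggests the
// test cluster was started without any datanodes.
Configuration conf = new Configuration();
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
    .nnTopology(MiniDFSNNTopology.simpleHATopology())
    .numDataNodes(1)  // at least one DN so writes meet minReplication (=1)
    .build();
cluster.waitActive();
{code}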

 TestBookKeeperHACheckpoints fails in trunk build
 

 Key: HDFS-7448
 URL: https://issues.apache.org/jira/browse/HDFS-7448
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: HDFS-7448-001.patch


 The test failed against both java 7 and java 8.
 From https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/17/console :
 {code}
 testStandbyExceptionThrownDuringCheckpoint(org.apache.hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints)
   Time elapsed: 6.822 sec  <<< ERROR!
 org.apache.hadoop.ipc.RemoteException: File /testFile could only be 
 replicated to 0 nodes instead of minReplication (=1).  There are 0 
 datanode(s) running and no node(s) are excluded in this operation.
   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1558)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3024)
   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:699)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2125)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2121)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1683)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2119)
   at org.apache.hadoop.ipc.Client.call(Client.java:1468)
   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
   at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
   at com.sun.proxy.$Proxy21.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1544)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7448) TestBookKeeperHACheckpoints fails in trunk build

2014-11-26 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-7448:

Target Version/s: 2.7.0
  Status: Patch Available  (was: Open)

 TestBookKeeperHACheckpoints fails in trunk build
 

 Key: HDFS-7448
 URL: https://issues.apache.org/jira/browse/HDFS-7448
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: HDFS-7448-001.patch


 The test failed against both java 7 and java 8.
 From https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/17/console :
 {code}
 testStandbyExceptionThrownDuringCheckpoint(org.apache.hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints)
   Time elapsed: 6.822 sec  <<< ERROR!
 org.apache.hadoop.ipc.RemoteException: File /testFile could only be 
 replicated to 0 nodes instead of minReplication (=1).  There are 0 
 datanode(s) running and no node(s) are excluded in this operation.
   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1558)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3024)
   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:699)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2125)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2121)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1683)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2119)
   at org.apache.hadoop.ipc.Client.call(Client.java:1468)
   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
   at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
   at com.sun.proxy.$Proxy21.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1544)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7448) TestBookKeeperHACheckpoints fails in trunk build

2014-11-26 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226460#comment-14226460
 ] 

Akira AJISAKA commented on HDFS-7448:
-

Thanks for the report, [~tedyu].

 TestBookKeeperHACheckpoints fails in trunk build
 

 Key: HDFS-7448
 URL: https://issues.apache.org/jira/browse/HDFS-7448
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: HDFS-7448-001.patch


 The test failed against both java 7 and java 8.
 From https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/17/console :
 {code}
 testStandbyExceptionThrownDuringCheckpoint(org.apache.hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints)
   Time elapsed: 6.822 sec  <<< ERROR!
 org.apache.hadoop.ipc.RemoteException: File /testFile could only be 
 replicated to 0 nodes instead of minReplication (=1).  There are 0 
 datanode(s) running and no node(s) are excluded in this operation.
   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1558)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3024)
   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:699)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2125)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2121)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1683)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2119)
   at org.apache.hadoop.ipc.Client.call(Client.java:1468)
   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
   at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
   at com.sun.proxy.$Proxy21.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1544)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7448) TestBookKeeperHACheckpoints fails in trunk build

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226524#comment-14226524
 ] 

Hadoop QA commented on HDFS-7448:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683870/HDFS-7448-001.patch
  against trunk revision aa7dac3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8846//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8846//console

This message is automatically generated.

 TestBookKeeperHACheckpoints fails in trunk build
 

 Key: HDFS-7448
 URL: https://issues.apache.org/jira/browse/HDFS-7448
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ted Yu
Assignee: Akira AJISAKA
Priority: Minor
 Attachments: HDFS-7448-001.patch


 The test failed against both java 7 and java 8.
 From https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/17/console :
 {code}
 testStandbyExceptionThrownDuringCheckpoint(org.apache.hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints)
   Time elapsed: 6.822 sec  <<< ERROR!
 org.apache.hadoop.ipc.RemoteException: File /testFile could only be 
 replicated to 0 nodes instead of minReplication (=1).  There are 0 
 datanode(s) running and no node(s) are excluded in this operation.
   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1558)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3024)
   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:699)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2125)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2121)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1683)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2119)
   at org.apache.hadoop.ipc.Client.call(Client.java:1468)
   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
   at com.sun.proxy.$Proxy20.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
   at com.sun.proxy.$Proxy21.addBlock(Unknown Source)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1544)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
 {code}

[jira] [Commented] (HDFS-7310) Mover can give first priority to local DN if it has target storage type available in local DN

2014-11-26 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226529#comment-14226529
 ] 

Uma Maheswara Rao G commented on HDFS-7310:
---

Thanks a lot Vinay for the update. Latest patch looks good to me. +1
Will commit the patch shortly.

 Mover can give first priority to local DN if it has target storage type 
 available in local DN
 -

 Key: HDFS-7310
 URL: https://issues.apache.org/jira/browse/HDFS-7310
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Vinayakumar B
 Attachments: HDFS-7310-001.patch, HDFS-7310-002.patch, 
 HDFS-7310-003.patch, HDFS-7310-004.patch


 Currently the Mover logic may move blocks to any DN which has the target 
 storage type. But if the src DN has the target storage type, then the Mover can 
 give highest priority to the local DN. If the local DN does not contain the 
 target storage type, then it can assign to any DN as the current logic does.
   This is a thought; I have not gone through the code fully yet.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7431) log message for InvalidMagicNumberException may be incorrect

2014-11-26 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226538#comment-14226538
 ] 

Chris Nauroth commented on HDFS-7431:
-

Hello, Yi.  Thank you for investigating this and posting a patch.  I have a 
possible idea for distinguishing the 2 cases.  We throw 
{{InvalidMagicNumberException}} from 
{{SaslDataTransferServer#doSaslHandshake}}.  Within this method, we have the 
information we need to distinguish between the 2 cases:
* {{if (dnConf.getEncryptDataTransfer())}}, then it's the encrypted case.
* {{if (dnConf.getSaslPropsResolver() != null)}}, then it's the data transfer 
protection case.

After checking that, we could throw exceptions with different messages 
depending on the case.  This could either be done with 2 distinct subclasses of 
{{InvalidMagicNumberException}} or by adding some kind of type tag as a member.  
For the text of the messages, I suggest:

{code}
LOG.info("Failed to read expected encryption handshake from client " +
    "at " + peer.getRemoteAddressString() + ". Perhaps the client " +
    "is running an older version of Hadoop which does not support " +
    "encryption");
{code}

{code}
LOG.info("Failed to read expected SASL data transfer protection " +
    "handshake from client " +
    "at " + peer.getRemoteAddressString() + ". Perhaps the client " +
    "is running an older version of Hadoop which does not support " +
    "SASL data transfer protection");
{code}

What are your thoughts on this?
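
To make the suggestion concrete, here is a minimal sketch of the branching (assuming {{dnConf}} and {{peer}} are in scope, as they are in {{SaslDataTransferServer#doSaslHandshake}}; the real fix could equally use the 2 distinct subclasses or a type tag instead of plain logging):

{code}
// Sketch only: branch on the DNConf state before reporting the failure,
// so each configuration gets the message that matches its actual cause.
if (dnConf.getEncryptDataTransfer()) {
  // encrypted case: the client likely predates encryption support
  LOG.info("Failed to read expected encryption handshake from client at "
      + peer.getRemoteAddressString() + ". Perhaps the client is running"
      + " an older version of Hadoop which does not support encryption");
} else if (dnConf.getSaslPropsResolver() != null) {
  // data transfer protection case: the client likely predates SASL support
  LOG.info("Failed to read expected SASL data transfer protection handshake"
      + " from client at " + peer.getRemoteAddressString() + ". Perhaps the"
      + " client is running an older version of Hadoop which does not"
      + " support SASL data transfer protection");
}
{code}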

 log message for InvalidMagicNumberException may be incorrect
 

 Key: HDFS-7431
 URL: https://issues.apache.org/jira/browse/HDFS-7431
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Minor
 Attachments: HDFS-7431.001.patch


 For security mode, HDFS now supports that Datanodes don't require root or 
 jsvc if {{dfs.data.transfer.protection}} is configured.
 Log message for {{InvalidMagicNumberException}}, we miss one case: 
 when the datanodes run on unprivileged port and 
 {{dfs.data.transfer.protection}} is configured to {{authentication}} but 
 {{dfs.encrypt.data.transfer}} is not configured. SASL handshake is required 
 and a low version dfs client is used, then {{InvalidMagicNumberException}} is 
 thrown and we write log:
 {quote}
 Failed to read expected encryption handshake from client at  Perhaps the 
 client is running an older version of Hadoop which does not support encryption
 {quote}
 Recently I run HDFS built on trunk and security is enabled, but the client is 
 2.5.1 version. Then I got the above log message, but actually I have not 
 configured encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7310) Mover can give first priority to local DN if it has target storage type available in local DN

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226556#comment-14226556
 ] 

Hudson commented on HDFS-7310:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6610 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6610/])
HDFS-7310. Mover can give first priority to local DN if it has target storage 
type available in local DN. (Vinayakumar B via umamahesh) (umamahesh: rev 
058af60c56207907f2bedf76df4284e86d923e0c)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeDescriptor.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeStorageInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/RamDiskAsyncLazyPersistService.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java


 Mover can give first priority to local DN if it has target storage type 
 available in local DN
 -

 Key: HDFS-7310
 URL: https://issues.apache.org/jira/browse/HDFS-7310
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Vinayakumar B
 Attachments: HDFS-7310-001.patch, HDFS-7310-002.patch, 
 HDFS-7310-003.patch, HDFS-7310-004.patch


 Currently the Mover logic may move blocks to any DN which has the target 
 storage type. But if the src DN has the target storage type, then the Mover can 
 give highest priority to the local DN. If the local DN does not contain the 
 target storage type, then it can assign to any DN as the current logic does.
   This is a thought; I have not gone through the code fully yet.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7310) Mover can give first priority to local DN if it has target storage type available in local DN

2014-11-26 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-7310:
--
   Resolution: Fixed
Fix Version/s: 2.7.0
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I have just committed this to trunk and branch-2

 Mover can give first priority to local DN if it has target storage type 
 available in local DN
 -

 Key: HDFS-7310
 URL: https://issues.apache.org/jira/browse/HDFS-7310
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Vinayakumar B
 Fix For: 3.0.0, 2.7.0

 Attachments: HDFS-7310-001.patch, HDFS-7310-002.patch, 
 HDFS-7310-003.patch, HDFS-7310-004.patch


 Currently the Mover logic may move blocks to any DN which has the target 
 storage type. But if the src DN has the target storage type, then the Mover can 
 give highest priority to the local DN. If the local DN does not contain the 
 target storage type, then it can assign to any DN as the current logic does.
   This is a thought; I have not gone through the code fully yet.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient

2014-11-26 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226567#comment-14226567
 ] 

Jing Zhao commented on HDFS-7435:
-

In the current demo patch the max chunk size is not encoded in the buffer. It 
is currently simply determined by the DataNode based on configuration. The 
segmentation is done by the DataNode, and the NameNode does not need to know the 
max chunk size. For simplicity each chunk still follows the same format as the 
original long[] in BlockListAsLongs (i.e., it still encodes the number of 
finalized blocks and the number of uc replicas in the first two elements).

I guess letting the DataNode be the only side doing segmentation can avoid the 
NameNode having to allocate a big contiguous array before chunking. Thus I had to 
change the {{optional bytes blocksBuffer}} into {{repeated bytes blocksBuffers}}. 
Maybe we can use {{repeated bytes blocksBuffers}} here but assume the number of 
buffers is always 1, and then move the real segmentation change into a separate jira?
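
For illustration, here is a minimal sketch of DataNode-side segmentation under the assumptions above ({{segmentReport}} is a hypothetical helper, not code from the demo patch, and the per-chunk finalized/uc header is omitted for brevity):

{code}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Split the report's long[] into fixed-size buffers on the DataNode, so the
// NameNode never has to materialize one big contiguous array. The chunk size
// is DN-side configuration, matching the comment above.
static List<ByteBuffer> segmentReport(long[] blocksAsLongs, int longsPerChunk) {
  List<ByteBuffer> chunks = new ArrayList<ByteBuffer>();
  for (int off = 0; off < blocksAsLongs.length; off += longsPerChunk) {
    int len = Math.min(longsPerChunk, blocksAsLongs.length - off);
    ByteBuffer buf = ByteBuffer.allocate(len * 8);  // 8 bytes per long
    for (int i = 0; i < len; i++) {
      buf.putLong(blocksAsLongs[off + i]);
    }
    buf.flip();  // each buffer becomes one element of the repeated bytes field
    chunks.add(buf);
  }
  return chunks;
}
{code}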

 PB encoding of block reports is very inefficient
 

 Key: HDFS-7435
 URL: https://issues.apache.org/jira/browse/HDFS-7435
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HDFS-7435.000.patch, HDFS-7435.patch


 Block reports are encoded as a PB repeating long.  Repeating fields use an 
 {{ArrayList}} with a default capacity of 10.  A block report containing tens or 
 hundreds of thousands of longs (3 for each replica) is extremely expensive 
 since the {{ArrayList}} must realloc many times.  Also, decoding repeating 
 fields will box the primitive longs, which must then be unboxed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7444) convertToBlockUnderConstruction should preserve BlockCollection

2014-11-26 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226584#comment-14226584
 ] 

Jing Zhao commented on HDFS-7444:
-

+1. Thanks for the fix, [~wheat9]!

 convertToBlockUnderConstruction should preserve BlockCollection
 ---

 Key: HDFS-7444
 URL: https://issues.apache.org/jira/browse/HDFS-7444
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7444.000.patch, HDFS-7444.001.patch


 {{BlockInfo#convertToBlockUnderConstruction}} converts a {{BlockInfo}} object 
 to a {{BlockInfoUnderConstruction}} object. The callee instead of the caller 
 should preserve the {{BlockCollection}} field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7438) Consolidate implementation of rename()

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7438:
-
Attachment: HDFS-7438.002.patch

 Consolidate implementation of rename()
 --

 Key: HDFS-7438
 URL: https://issues.apache.org/jira/browse/HDFS-7438
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7438.000.patch, HDFS-7438.001.patch, 
 HDFS-7438.002.patch


 The implementation of {{rename()}} resides in both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate them in a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7437) Storing block ids instead of BlockInfo object in INodeFile

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7437:
-
Attachment: HDFS-7437.001.patch

 Storing block ids instead of BlockInfo object in INodeFile
 --

 Key: HDFS-7437
 URL: https://issues.apache.org/jira/browse/HDFS-7437
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7437.000.patch, HDFS-7437.001.patch


 Currently {{INodeFile}} stores the lists of blocks as references to 
 {{BlockInfo}} objects instead of the block ids. This creates an implicit 
 dependency between the namespace and the block manager.
 The dependency blocks several recent efforts, such as separating the block 
 manager out as a standalone service, moving block information off heap, and 
 optimizing the memory usage of the block manager.
 This jira proposes to decouple the dependency by storing block ids instead of 
 object references in {{INodeFile}} objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7444) convertToBlockUnderConstruction should preserve BlockCollection

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226640#comment-14226640
 ] 

Hudson commented on HDFS-7444:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6611 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6611/])
HDFS-7444. convertToBlockUnderConstruction should preserve BlockCollection. 
Contributed by Haohui Mai. (wheat9: rev 
f5b163117986886eaba8a0cc255ec741dd14c4c6)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java


 convertToBlockUnderConstruction should preserve BlockCollection
 ---

 Key: HDFS-7444
 URL: https://issues.apache.org/jira/browse/HDFS-7444
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7444.000.patch, HDFS-7444.001.patch


 {{BlockInfo#convertToBlockUnderConstruction}} converts a {{BlockInfo}} object 
 to a {{BlockInfoUnderConstruction}} object. The callee instead of the caller 
 should preserve the {{BlockCollection}} field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7437) Storing block ids instead of BlockInfo object in INodeFile

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226641#comment-14226641
 ] 

Hadoop QA commented on HDFS-7437:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683891/HDFS-7437.001.patch
  against trunk revision f5b1631.

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8849//console

This message is automatically generated.

 Storing block ids instead of BlockInfo object in INodeFile
 --

 Key: HDFS-7437
 URL: https://issues.apache.org/jira/browse/HDFS-7437
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7437.000.patch, HDFS-7437.001.patch


 Currently {{INodeFile}} stores the lists of blocks as references to 
 {{BlockInfo}} objects instead of the block ids. This creates an implicit 
 dependency between the namespace and the block manager.
 The dependency blocks several recent efforts, such as separating the block 
 manager out as a standalone service, moving block information off heap, and 
 optimizing the memory usage of the block manager.
 This jira proposes to decouple the dependency by storing block ids instead of 
 object references in {{INodeFile}} objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7437) Storing block ids instead of BlockInfo object in INodeFile

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7437:
-
   Resolution: Fixed
Fix Version/s: 2.7.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~jingzhao] for the 
reviews.

 Storing block ids instead of BlockInfo object in INodeFile
 --

 Key: HDFS-7437
 URL: https://issues.apache.org/jira/browse/HDFS-7437
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7437.000.patch, HDFS-7437.001.patch


 Currently {{INodeFile}} stores the lists of blocks as references to 
 {{BlockInfo}} objects instead of the block ids. This creates an implicit 
 dependency between the namespace and the block manager.
 The dependency blocks several recent efforts, such as separating the block 
 manager out as a standalone service, moving block information off heap, and 
 optimizing the memory usage of the block manager.
 This jira proposes to decouple the dependency by storing block ids instead of 
 object references in {{INodeFile}} objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HDFS-7437) Storing block ids instead of BlockInfo object in INodeFile

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reopened HDFS-7437:
--

Reopening the issue; I resolved the wrong jira.

 Storing block ids instead of BlockInfo object in INodeFile
 --

 Key: HDFS-7437
 URL: https://issues.apache.org/jira/browse/HDFS-7437
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7437.000.patch, HDFS-7437.001.patch


 Currently {{INodeFile}} stores the lists of blocks as references to 
 {{BlockInfo}} objects instead of the block ids. This creates an implicit 
 dependency between the namespace and the block manager.
 The dependency blocks several recent efforts, such as separating the block 
 manager out as a standalone service, moving block information off heap, and 
 optimizing the memory usage of the block manager.
 This jira proposes to decouple the dependency by storing block ids instead of 
 object references in {{INodeFile}} objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HDFS-7437) Storing block ids instead of BlockInfo object in INodeFile

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7437:
-
Comment: was deleted

(was: I've committed the patch to trunk and branch-2. Thanks [~jingzhao] for 
the reviews.)

 Storing block ids instead of BlockInfo object in INodeFile
 --

 Key: HDFS-7437
 URL: https://issues.apache.org/jira/browse/HDFS-7437
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7437.000.patch, HDFS-7437.001.patch


 Currently {{INodeFile}} stores the lists of blocks as references to 
 {{BlockInfo}} objects instead of the block ids. This creates an implicit 
 dependency between the namespace and the block manager.
 The dependency blocks several recent efforts, such as separating the block 
 manager out as a standalone service, moving block information off heap, and 
 optimizing the memory usage of the block manager.
 This jira proposes to decouple the dependency by storing block ids instead of 
 object references in {{INodeFile}} objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7444) convertToBlockUnderConstruction should preserve BlockCollection

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7444:
-
   Resolution: Fixed
Fix Version/s: 2.7.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~jingzhao] for the 
reviews

 convertToBlockUnderConstruction should preserve BlockCollection
 ---

 Key: HDFS-7444
 URL: https://issues.apache.org/jira/browse/HDFS-7444
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7444.000.patch, HDFS-7444.001.patch


 {{BlockInfo#convertToBlockUnderConstruction}} converts a {{BlockInfo}} object 
 to a {{BlockInfoUnderConstruction}} object. The callee instead of the caller 
 should preserve the {{BlockCollection}} field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7437) Storing block ids instead of BlockInfo object in INodeFile

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7437:
-
Fix Version/s: (was: 2.7.0)
 Hadoop Flags:   (was: Reviewed)
   Status: Patch Available  (was: Reopened)

 Storing block ids instead of BlockInfo object in INodeFile
 --

 Key: HDFS-7437
 URL: https://issues.apache.org/jira/browse/HDFS-7437
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7437.000.patch, HDFS-7437.001.patch


 Currently {{INodeFile}} stores the lists of blocks as references to 
 {{BlockInfo}} objects instead of the block ids. This creates an implicit 
 dependency between the namespace and the block manager.
 The dependency blocks several recent efforts, such as separating the block 
 manager out as a standalone service, moving block information off heap, and 
 optimizing the memory usage of the block manager.
 This jira proposes to decouple the dependency by storing block ids instead of 
 object references in {{INodeFile}} objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7424) Add web UI for NFS gateway

2014-11-26 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-7424:
-
Issue Type: New Feature  (was: Improvement)

 Add web UI for NFS gateway
 --

 Key: HDFS-7424
 URL: https://issues.apache.org/jira/browse/HDFS-7424
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: nfs
Affects Versions: 2.2.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-7424.001.patch, HDFS-7424.002.patch


 This JIRA is to track the effort to add web UI for NFS gateway to show some 
 metrics and configuration related information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7449) Add metrics to NFS gateway

2014-11-26 Thread Brandon Li (JIRA)
Brandon Li created HDFS-7449:


 Summary: Add metrics to NFS gateway
 Key: HDFS-7449
 URL: https://issues.apache.org/jira/browse/HDFS-7449
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: nfs
Affects Versions: 2.7.0
Reporter: Brandon Li
Assignee: Brandon Li






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7444) convertToBlockUnderConstruction should preserve BlockCollection

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226657#comment-14226657
 ] 

Hudson commented on HDFS-7444:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #6612 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6612/])
HDFS-7444. Addendum patch to resolve conflicts between HDFS-7444 and HDFS-7310. 
(wheat9: rev 978736d486a8775bc21d074010f58c28ae0fda41)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java


 convertToBlockUnderConstruction should preserve BlockCollection
 ---

 Key: HDFS-7444
 URL: https://issues.apache.org/jira/browse/HDFS-7444
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 2.7.0

 Attachments: HDFS-7444.000.patch, HDFS-7444.001.patch


 {{BlockInfo#convertToBlockUnderConstruction}} converts a {{BlockInfo}} object 
 to a {{BlockInfoUnderConstruction}} object. The callee instead of the caller 
 should preserve the {{BlockCollection}} field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7310) Mover can give first priority to local DN if it has target storage type available in local DN

2014-11-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226658#comment-14226658
 ] 

Hudson commented on HDFS-7310:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #6612 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6612/])
HDFS-7444. Addendum patch to resolve conflicts between HDFS-7444 and HDFS-7310. 
(wheat9: rev 978736d486a8775bc21d074010f58c28ae0fda41)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java


 Mover can give first priority to local DN if it has target storage type 
 available in local DN
 -

 Key: HDFS-7310
 URL: https://issues.apache.org/jira/browse/HDFS-7310
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Vinayakumar B
 Fix For: 3.0.0, 2.7.0

 Attachments: HDFS-7310-001.patch, HDFS-7310-002.patch, 
 HDFS-7310-003.patch, HDFS-7310-004.patch


 Currently the Mover logic may move blocks to any DN which has the target 
 storage type. But if the src DN has the target storage type, then the Mover can 
 give highest priority to the local DN. If the local DN does not contain the 
 target storage type, then it can assign to any DN as the current logic does.
   This is a thought; I have not gone through the code fully yet.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7310) Mover can give first priority to local DN if it has target storage type available in local DN

2014-11-26 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226665#comment-14226665
 ] 

Jing Zhao commented on HDFS-7310:
-

Oops sorry I missed this jira... Thanks Uma for the review and Vinay for the 
work!

 Mover can give first priority to local DN if it has target storage type 
 available in local DN
 -

 Key: HDFS-7310
 URL: https://issues.apache.org/jira/browse/HDFS-7310
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 3.0.0
Reporter: Uma Maheswara Rao G
Assignee: Vinayakumar B
 Fix For: 3.0.0, 2.7.0

 Attachments: HDFS-7310-001.patch, HDFS-7310-002.patch, 
 HDFS-7310-003.patch, HDFS-7310-004.patch


 Currently the Mover logic may move blocks to any DN which has the target 
 storage type. But if the src DN has the target storage type, then the Mover can 
 give highest priority to the local DN. If the local DN does not contain the 
 target storage type, then it can assign to any DN as the current logic does.
   This is a thought; I have not gone through the code fully yet.
 Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7023) use libexpat instead of libxml2 for libhdfs3

2014-11-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226705#comment-14226705
 ] 

Colin Patrick McCabe commented on HDFS-7023:


bq. Seems that you forget to modify CMake file to add libexpat.

Good catch, I will take a look at that.

bq. I wonder, are there any interests on creating a configuration that does not 
depend on XML parser at all?  What we can do is to create a Options class which 
captures all configuration parameters directly. The XML parser can set the 
configuration parameters accordingly.

Hmm.  Maybe I'm misunderstanding your comment, but I think this is already 
implemented.  The {{Config}} class doesn't depend on the XML parser in my 
patch... you can create a {{Config}} programmatically and set the values to 
whatever you want.  {{XmlConfigParser}} is just a helper class that fills 
in a {{Config}} class by calling the normal setter methods.  If we want to add 
another helper class that fills it in by a different method later, we can.

 use libexpat instead of libxml2 for libhdfs3
 

 Key: HDFS-7023
 URL: https://issues.apache.org/jira/browse/HDFS-7023
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Zhanwei Wang
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7023.001.pnative.patch


 As commented in HDFS-6994, libxml2 may have some thread-safety issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7023) use libexpat instead of libxml2 for libhdfs3

2014-11-26 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7023:
---
Attachment: HDFS-7023-pnative.002.patch

* Add {{FIND_PACKAGE}} for {{libexpat}} to {{CMakeLists.txt}}

 use libexpat instead of libxml2 for libhdfs3
 

 Key: HDFS-7023
 URL: https://issues.apache.org/jira/browse/HDFS-7023
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Zhanwei Wang
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7023-pnative.002.patch, HDFS-7023.001.pnative.patch


 As commented in HDFS-6994, libxml2 may have some thread-safety issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7446) HDFS inotify should have the ability to determine what txid it has read up to

2014-11-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226755#comment-14226755
 ] 

Colin Patrick McCabe commented on HDFS-7446:


bq. It feels like we have a mismatch between the underlying data and our 
objects. The need for the VHS-rewind in getTxidBatchSize is one example, what 
we really want there is an iterator of EditEvents, with one EditEvents per txid 
(name is just a suggestion).

Basically, the code in {{getTxidBatchSize}} is converting what we get 
back from the NameNode RPC (a bunch of Events, not grouped by txid) into what we 
want to return (a batch of events that all have the same txid).

I guess we could do the conversion earlier, in 
{{PBHelper#convert(GetEditsFromTxidResponseProto resp)}}.  This would mean 
pushing the batch concept into {{EventsList}}.  I'm not sure there's much 
benefit in this, though.  {{EventsList}} is used on the {{NameNode}} as well, 
and the NN doesn't care about grouping Events of the same txid into batches.  
But we'd have to implement all that logic anyway if we went down this route.  
(The NN has its own concept of batching... how many Events it sends back in a 
single RPC... and of course it will not split a txid across RPCs.  But that's 
the only batching we get out of it.)

I admit that rewinding the iterator feels awkward.  I was looking for a utility 
function in {{Iterators}} or something that would help with this, but I didn't 
find anything.  Java doesn't have the concept of duplicating iterators that C++ 
does, so you can't just make a copy of the iterator and then push only that 
copy forward.  I wanted to avoid allocating an array until I knew what size I 
wanted.  I suppose I could create a {{LinkedList}} rather than an array, and 
avoid the rewinding that way.  That's not very efficient, though.  One slightly 
awkward function feels like an acceptable tradeoff for memory efficiency.
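
For what it's worth, here is a sketch of grouping a flat event stream into 
per-txid batches without rewinding, using Guava's {{PeekingIterator}}.  The 
{{Event#getTxid}} accessor is hypothetical (as discussed below, txid is not on 
the Java {{Event}} class today):

{code}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import com.google.common.collect.Iterators;
import com.google.common.collect.PeekingIterator;

class TxidBatcher {
  // Hypothetical event type carrying the txid it was logged under.
  static final class Event {
    private final long txid;
    Event(long txid) { this.txid = txid; }
    long getTxid() { return txid; }
  }

  // Returns the next batch: all consecutive events sharing one txid.
  // peek() lets us stop without consuming the first event of the next
  // batch, so no rewinding (and no intermediate LinkedList) is needed.
  static List<Event> nextBatch(PeekingIterator<Event> it) {
    List<Event> batch = new ArrayList<Event>();
    if (!it.hasNext()) {
      return batch;
    }
    long txid = it.peek().getTxid();
    while (it.hasNext() && it.peek().getTxid() == txid) {
      batch.add(it.next());
    }
    return batch;
  }

  static void drain(Iterator<Event> events) {
    PeekingIterator<Event> it = Iterators.peekingIterator(events);
    while (it.hasNext()) {
      List<Event> batch = nextBatch(it);  // one batch per txid
      // ... hand the batch to the caller
    }
  }
}
{code}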

bq. The txid could also be moved into EditEvents which would also save some 
bytes.

Most batches are small; I think the median will be a batch of size 1, with 
maybe some outliers at 2, 3, or 4.  I don't think there's a lot of memory 
savings to be had by moving the txid into {{EditEvents}}.

A bigger issue is that we need (well maybe not need, but the code will be 
really, really awkward if we don't) to add txid to the protobuf {{EventProto}} 
structure in {{inotify.proto}}.  This is because there can be gaps in the 
txids returned by {{EventsListProto}}.  Currently, all {{EventsListProto}} has 
is firstTxid, lastTxid, and a list of Events.  But you don't know what the txid 
of any of those Events actually is... if firstTxid is 100 and lastTxid is 200, 
and you have 50 events, where are they in that sequence?  Who knows.  And if 
you add this to the PB, it's extremely awkward not to add it to the Java 
{{Event}} class as well.

 HDFS inotify should have the ability to determine what txid it has read up to
 -

 Key: HDFS-7446
 URL: https://issues.apache.org/jira/browse/HDFS-7446
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: dfsclient
Affects Versions: 2.6.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-7446.001.patch


 HDFS inotify should have the ability to determine what txid it has read up 
 to.  This will allow users who want to avoid missing any events to record 
 this txid and use it to resume reading events at the spot they left off.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7342) Lease Recovery doesn't happen some times

2014-11-26 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226765#comment-14226765
 ] 

Yongjun Zhang commented on HDFS-7342:
-

Hi [~vinayrpet],

Thanks for further explanation. I studied your latest two comments, and I 
agree with your analysis: the case of the penultimate block being in COMMITTED 
state with minimal replication met may not exist. In other words, when the 
penultimate block is in COMMITTED state, minimal replication is not met.

Going back to the source that triggered this discussion, it's scenario #1 
described in:
 
https://issues.apache.org/jira/browse/HDFS-4882?focusedCommentId=14213992&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14213992

Here is what I think now:

scenario #1 exists for the last block but not the penultimate block
scenario #2 exists for both the penultimate and last blocks

My suggested solution for this jira is to share the same code that handles the 
case when the last block is COMMITTED, with some modifications:

* For scenario #1 (last block), my suggested solution is to forceComplete the 
block before calling {{finalizeINodeFileUnderConstruction}}:
{code}
if (lastBlockState == BlockUCState.COMMITTED) {
  getBlockManager().forceCompleteBlock(pendingFile,
      (BlockInfoUnderConstruction) lastBlock);
}
finalizeINodeFileUnderConstruction(src, pendingFile,
    iip.getLatestSnapshotId());
{code}
thus avoiding the exception thrown from {{finalizeINodeFileUnderConstruction}} 
caused by
{code}
Preconditions.checkState(blocks[i].isComplete(), "Failed to finalize"
    + " %s %s since blocks[%s] is non-complete, where blocks=%s.",
    getClass().getSimpleName(), this, i, Arrays.asList(blocks));
{code}

* For scenario #2, the same code can be shared to handle the case where the 
penultimate block is COMMITTED with minimal replication NOT met; the lease 
will be recovered even in that state.

On top of this suggested solution, we can add what Kihwal suggested.

What do you think?

Thanks.


 Lease Recovery doesn't happen some times
 

 Key: HDFS-7342
 URL: https://issues.apache.org/jira/browse/HDFS-7342
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-7342.1.patch, HDFS-7342.2.patch, HDFS-7342.3.patch


 In some cases, LeaseManager tries to recover a lease but is not able to. 
 HDFS-4882 describes one possibility of that. We should fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7384) 'getfacl' command and 'getAclStatus' output should be in sync

2014-11-26 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226786#comment-14226786
 ] 

Chris Nauroth commented on HDFS-7384:
-

Hi [~vinayrpet].  Thank you for the follow-up.  For this version of the patch, 
I have one compatibility concern and a few minor nitpicks.

# {{AclStatus#getEffectivePermission}}: I think there is a compatibility 
problem in this method.  Let's assume this patch goes into 2.7.0, and then we 
run a 2.7.0 client connected to a 2.6.0 NameNode.  The old NameNode will not 
populate the new permissions field in the outbound {{AclStatus}}.  The 2.7.0 
client would go into the null check path and not apply any mask, resulting in 
{{hdfs dfs -getfacl}} reporting incorrect effective permissions.  For 
compatibility, I think the shell will need a way to detect that the NameNode 
didn't populate permissions, and fall back to the current logic of using 
permissions from {{FileStatus}} (see the sketch after this list).
# {{AclStatus#getPermission}}: I suggest adding JavaDocs.
# {{AclStatus#Builder#setPermission}}: I suggest removing the word "default" 
here, just to prevent any confusion that this is somehow related to default 
ACLs.  Same thing for the private {{AclStatus}} constructor.
# Let's update the documentation in WebHDFS.apt.vm to show the new fields in 
the GETACLSTATUS example JSON response.
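
Regarding item 1, here is a minimal sketch of the client-side fallback, 
assuming the patch's proposed {{AclStatus#getPermission}} accessor (which 
would return null when the NameNode did not populate the field):

{code}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.permission.AclStatus;
import org.apache.hadoop.fs.permission.FsPermission;

class GetfaclCompat {
  // Prefer the permission the NameNode embedded in AclStatus (the new
  // field); fall back to FileStatus when an older NameNode leaves it
  // unpopulated.
  static FsPermission permissionForEffectiveCalc(AclStatus acl,
      FileStatus stat) {
    FsPermission fromAcl = acl.getPermission();  // null from a 2.6.0 NameNode
    return fromAcl != null ? fromAcl : stat.getPermission();
  }
}
{code}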


 'getfacl' command and 'getAclStatus' output should be in sync
 -

 Key: HDFS-7384
 URL: https://issues.apache.org/jira/browse/HDFS-7384
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Attachments: HDFS-7384-001.patch, HDFS-7384-002.patch, 
 HDFS-7384-003.patch, HDFS-7384-004.patch, HDFS-7384-005.patch


 *getfacl* command will print all the entries including basic and extended 
 entries, mask entries and effective permissions.
 But, *getAclStatus* FileSystem API will return only extended ACL entries set 
 by the user. But this will not include the mask entry as well as effective 
 permissions.
 To benefit the client using API, better to include 'mask' entry and effective 
 permissions in the return list of entries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226791#comment-14226791
 ] 

Colin Patrick McCabe commented on HDFS-6735:


* {{readWithStrategy}} and {{blockSeekTo}} should be marked {{synchronized}}.  
Yes, they are called from a {{synchronized}} function, but let's make it clear. 
 It's kind of confusing to see us fooling around with {{pos}} and other stuff 
without seeing a {{synchronized}} on the function.

* We should document in a comment that we cannot try to take the DFSInputStream 
lock when holding the infoLock.  We need to be careful to avoid deadlock, and 
maintaining this lock ordering is the easiest way.

* I noticed that in {{blockSeekTo}}, we are holding the {{infoLock}} when 
calling {{BlockReaderFactory#build}}.  It would be nice to avoid this.  That 
function does a lot of stuff... if we're creating a {{RemoteBlockReader2}}, it 
potentially blocks while a TCP connection to the DataNode is opened.  It seems 
like all you need the {{infoLock}} for here is to get the {{cachingStrategy}} 
and determine if {{shortCircuitForbidden}}, and you could pull this out into a 
synchronized block prior to the {{Builder#build}} call, similar to how 
{{actualGetFromOneDataNode}} does it.
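
A self-contained sketch of that pattern (field and method names are 
illustrative, not the actual {{DFSInputStream}} members): snapshot the 
immutable state under {{infoLock}}, then do the potentially blocking work 
outside it.

{code}
class LockOrderingSketch {
  // Immutable value object, like HDFS's CachingStrategy.
  static final class CachingStrategy {
    final boolean dropBehind;
    final long readahead;
    CachingStrategy(boolean dropBehind, long readahead) {
      this.dropBehind = dropBehind;
      this.readahead = readahead;
    }
  }

  private final Object infoLock = new Object();
  private CachingStrategy cachingStrategy = new CachingStrategy(false, 4096);

  Object buildReader() throws InterruptedException {
    final CachingStrategy strategy;
    synchronized (infoLock) {
      strategy = cachingStrategy;  // immutable: safe to use after release
    }
    // Outside infoLock: the potentially blocking part, e.g. opening a
    // TCP connection to the DataNode.
    Thread.sleep(10);  // stand-in for the blocking connect
    return strategy;
  }
}
{code}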

Incidentally, the findbugs warning is probably because findbugs doesn't realize 
that {{CachingStrategy}} is an immutable class, and so it's safe to access it 
without locking.  (The only thing you need locking for is actually reading the 
current reference to the object, not for accessing the object itself.)

+1 once those are addressed

 A minor optimization to avoid pread() be blocked by read() inside the same 
 DFSInputStream
 -

 Key: HDFS-6735
 URL: https://issues.apache.org/jira/browse/HDFS-6735
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
 HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735.txt


 In the current DFSInputStream impl, there are a couple of coarse-grained 
 locks in the read/pread path, and it has become an HBase read-latency pain 
 point. In HDFS-6698, I made a minor patch against the first encountered 
 lock, around getFileLength; indeed, after reading the code and testing, it 
 shows there are still other locks we could improve.
 In this jira, I'll make a patch against the other locks, and a simple test 
 case to show the issue and the improved result.
 This is important for the HBase application, since in the current HFile read 
 path, we issue all read()/pread() requests in the same DFSInputStream for 
 one HFile. (A multi-stream solution is another story I had planned, but it 
 will probably take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7435) PB encoding of block reports is very inefficient

2014-11-26 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7435:

Attachment: HDFS-7435.001.patch

Fix a bug in the demo patch.

 PB encoding of block reports is very inefficient
 

 Key: HDFS-7435
 URL: https://issues.apache.org/jira/browse/HDFS-7435
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, HDFS-7435.patch


 Block reports are encoded as a PB repeating long.  Repeating fields use an 
 {{ArrayList}} with a default capacity of 10.  A block report containing tens 
 or hundreds of thousands of longs (3 for each replica) is extremely 
 expensive, since the {{ArrayList}} must realloc many times.  Also, decoding 
 repeating fields will box the primitive longs, which must then be unboxed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7210) Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226841#comment-14226841
 ] 

Hadoop QA commented on HDFS-7210:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683498/HDFS-7210-005.patch
  against trunk revision 058af60.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8847//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8847//console

This message is automatically generated.

 Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an 
 append call from DFSClient
 ---

 Key: HDFS-7210
 URL: https://issues.apache.org/jira/browse/HDFS-7210
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client, namenode
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Attachments: HDFS-7210-001.patch, HDFS-7210-002.patch, 
 HDFS-7210-003.patch, HDFS-7210-004.patch, HDFS-7210-005.patch


 Currently DFSClient does 2 RPCs to the namenode for an append operation:
 {{append()}} to re-open the file and get the last block, and
 {{getFileInfo()}} to get the HdfsFileStatus.
 If we can combine the results of these 2 calls into one RPC, then it can 
 reduce load on the NameNode.
 For backward compatibility we need to keep the existing {{append()}} call as 
 is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7450) Consolidate GetFileInfo, GetListings and GetContentSummary into a single class

2014-11-26 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-7450:


 Summary: Consolidate GetFileInfo, GetListings and 
GetContentSummary into a single class
 Key: HDFS-7450
 URL: https://issues.apache.org/jira/browse/HDFS-7450
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai


This jira proposes to consolidate the implementation of {{GetFileInfo}}, 
{{GetListings}} and {{GetContentSummary}} into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7450) Consolidate GetFileInfo, GetListings and GetContentSummary into a single class

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7450:
-
Status: Patch Available  (was: Open)

 Consolidate GetFileInfo, GetListings and GetContentSummary into a single class
 --

 Key: HDFS-7450
 URL: https://issues.apache.org/jira/browse/HDFS-7450
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7450.000.patch


 This jira proposes to consolidate the implementation of {{GetFileInfo}}, 
 {{GetListings}} and {{GetContentSummary}} into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7450) Consolidate GetFileInfo, GetListings and GetContentSummary into a single class

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7450:
-
Attachment: HDFS-7450.000.patch

 Consolidate GetFileInfo, GetListings and GetContentSummary into a single class
 --

 Key: HDFS-7450
 URL: https://issues.apache.org/jira/browse/HDFS-7450
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7450.000.patch


 This jira proposes to consolidate the implementation of {{GetFileInfo}}, 
 {{GetListings}} and {{GetContentSummary}} into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7438) Consolidate implementation of rename()

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226854#comment-14226854
 ] 

Hadoop QA commented on HDFS-7438:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683889/HDFS-7438.002.patch
  against trunk revision 058af60.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8848//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8848//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8848//console

This message is automatically generated.

 Consolidate implementation of rename()
 --

 Key: HDFS-7438
 URL: https://issues.apache.org/jira/browse/HDFS-7438
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7438.000.patch, HDFS-7438.001.patch, 
 HDFS-7438.002.patch


 The implementation of {{rename()}} resides in both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate them in a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7437) Storing block ids instead of BlockInfo object in INodeFile

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226864#comment-14226864
 ] 

Hadoop QA commented on HDFS-7437:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683891/HDFS-7437.001.patch
  against trunk revision 978736d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8850//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8850//console

This message is automatically generated.

 Storing block ids instead of BlockInfo object in INodeFile
 --

 Key: HDFS-7437
 URL: https://issues.apache.org/jira/browse/HDFS-7437
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7437.000.patch, HDFS-7437.001.patch


 Currently {{INodeFile}} stores the list of blocks as references to 
 {{BlockInfo}} objects instead of the block ids. This creates an implicit 
 dependency between the namespace and the block manager.
 The dependency blocks several recent efforts, such as separating the block 
 manager out as a standalone service, moving block information off heap, and 
 optimizing the memory usage of the block manager.
 This jira proposes to decouple the dependency by storing block ids instead of 
 object references in {{INodeFile}} objects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7450) Consolidate GetFileInfo, GetListings and GetContentSummary into a single class

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7450:
-
Attachment: HDFS-7450.001.patch

 Consolidate GetFileInfo, GetListings and GetContentSummary into a single class
 --

 Key: HDFS-7450
 URL: https://issues.apache.org/jira/browse/HDFS-7450
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7450.000.patch, HDFS-7450.001.patch


 This jira proposes to consolidate the implementation of {{GetFileInfo}}, 
 {{GetListings}} and {{GetContentSummary}} into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7433) DatanodeMap lookups & DatanodeID hashCodes are inefficient

2014-11-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226882#comment-14226882
 ] 

Colin Patrick McCabe commented on HDFS-7433:


Can we separate out the {{DatanodeId#hashCode}} caching part of this into 
another change?  Although it's not a lot of lines of code, it feels kind of 
tricky.  And as I mentioned, it doesn't really relate to this change, since 
we're keying on Java's good old immutable String class here, not 
{{DatanodeId}}.

Otherwise, I'm +1 on the TreeMap -> HashMap part.
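
To spell out why the caching feels tricky, a simplified, hypothetical sketch 
is below; it is only safe if the hashed fields never change after the object 
is published, which is exactly what is hard to guarantee for a mutable 
{{DatanodeId}}:

{code}
class DatanodeIdSketch {
  private final String ipAddr;  // caching breaks if these can mutate
  private final int xferPort;
  private int cachedHash;       // 0 means "not yet computed"

  DatanodeIdSketch(String ipAddr, int xferPort) {
    this.ipAddr = ipAddr;
    this.xferPort = xferPort;
  }

  @Override
  public int hashCode() {
    int h = cachedHash;
    if (h == 0) {
      h = 31 * ipAddr.hashCode() + xferPort;
      cachedHash = h;           // benign race: recomputation is idempotent
    }
    return h;
  }
  // equals() omitted for brevity; a real class must keep it consistent.
}
{code}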

 DatanodeMap lookups & DatanodeID hashCodes are inefficient
 --

 Key: HDFS-7433
 URL: https://issues.apache.org/jira/browse/HDFS-7433
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HDFS-7433.patch


 The datanode map is currently a {{TreeMap}}.  For many thousands of 
 datanodes, tree lookups are ~10X more expensive than a {{HashMap}}.  
 Insertions and removals are up to 100X more expensive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226896#comment-14226896
 ] 

stack commented on HDFS-6735:
-

Interesting that CachingStrategy can be changed on a DFSIS post-construction. 
We could avoid infoLock on cachingStrategy if we pre-made the readahead and 
dropbehinds, but that's probably OTT.

Nice doc'ing of the locking strategy around data members.

If we are doing an openInfo call, we can't service a fileLength; I suppose 
that's how it should be: if we are updating our block info, the file length 
could change, and if we're updating block info, something's up with the block 
set we currently have...  At least the lock has climbed down from a lock on 
'this'.  Good.

Patch LGTM (I like the Colin feedback above).  Numbers still pretty good 
[~larsh]?







 A minor optimization to avoid pread() be blocked by read() inside the same 
 DFSInputStream
 -

 Key: HDFS-6735
 URL: https://issues.apache.org/jira/browse/HDFS-6735
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
 HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735.txt


 In the current DFSInputStream impl, there are a couple of coarse-grained 
 locks in the read/pread path, and it has become an HBase read-latency pain 
 point. In HDFS-6698, I made a minor patch against the first encountered 
 lock, around getFileLength; indeed, after reading the code and testing, it 
 shows there are still other locks we could improve.
 In this jira, I'll make a patch against the other locks, and a simple test 
 case to show the issue and the improved result.
 This is important for the HBase application, since in the current HFile read 
 path, we issue all read()/pread() requests in the same DFSInputStream for 
 one HFile. (A multi-stream solution is another story I had planned, but it 
 will probably take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7450) Consolidate GetFileInfo, GetListings and GetContentSummary into a single class

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226901#comment-14226901
 ] 

Hadoop QA commented on HDFS-7450:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683928/HDFS-7450.001.patch
  against trunk revision c1f2bb2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8853//console

This message is automatically generated.

 Consolidate GetFileInfo, GetListings and GetContentSummary into a single class
 --

 Key: HDFS-7450
 URL: https://issues.apache.org/jira/browse/HDFS-7450
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7450.000.patch, HDFS-7450.001.patch


 This jira proposes to consolidate the implementation of {{GetFileInfo}}, 
 {{GetListings}} and {{GetContentSummary}} into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7424) Add web UI for NFS gateway

2014-11-26 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226908#comment-14226908
 ] 

Brandon Li commented on HDFS-7424:
--

Uploaded a patch to address Haohui's comments.

 Add web UI for NFS gateway
 --

 Key: HDFS-7424
 URL: https://issues.apache.org/jira/browse/HDFS-7424
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: nfs
Affects Versions: 2.2.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-7424.001.patch, HDFS-7424.002.patch, 
 HDFS-7424.003.patch


 This JIRA is to track the effort to add a web UI for the NFS gateway to show 
 some metrics and configuration-related information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7424) Add web UI for NFS gateway

2014-11-26 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-7424:
-
Attachment: HDFS-7424.003.patch

 Add web UI for NFS gateway
 --

 Key: HDFS-7424
 URL: https://issues.apache.org/jira/browse/HDFS-7424
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: nfs
Affects Versions: 2.2.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-7424.001.patch, HDFS-7424.002.patch, 
 HDFS-7424.003.patch


 This JIRA is to track the effort to add a web UI for the NFS gateway to show 
 some metrics and configuration-related information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7424) Add web UI for NFS gateway

2014-11-26 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226912#comment-14226912
 ] 

Haohui Mai commented on HDFS-7424:
--

{code}
+@InterfaceAudience.Private
{code}

This line can be removed since the class is a package-local class now. +1 after 
addressing it.

 Add web UI for NFS gateway
 --

 Key: HDFS-7424
 URL: https://issues.apache.org/jira/browse/HDFS-7424
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: nfs
Affects Versions: 2.2.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-7424.001.patch, HDFS-7424.002.patch, 
 HDFS-7424.003.patch


 This JIRA is to track the effort to add a web UI for the NFS gateway to show 
 some metrics and configuration-related information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7438) Consolidate implementation of rename()

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7438:
-
Attachment: HDFS-7438.003.patch

 Consolidate implementation of rename()
 --

 Key: HDFS-7438
 URL: https://issues.apache.org/jira/browse/HDFS-7438
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7438.000.patch, HDFS-7438.001.patch, 
 HDFS-7438.002.patch, HDFS-7438.003.patch


 The implementation of {{rename()}} resides in both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate them in a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7435) PB encoding of block reports is very inefficient

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226975#comment-14226975
 ] 

Hadoop QA commented on HDFS-7435:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683918/HDFS-7435.001.patch
  against trunk revision 8f1454c.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 10 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
  
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.TestDFSClientRetries

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8851//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8851//console

This message is automatically generated.

 PB encoding of block reports is very inefficient
 

 Key: HDFS-7435
 URL: https://issues.apache.org/jira/browse/HDFS-7435
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical
 Attachments: HDFS-7435.000.patch, HDFS-7435.001.patch, HDFS-7435.patch


 Block reports are encoded as a PB repeating long.  Repeating fields use an 
 {{ArrayList}} with a default capacity of 10.  A block report containing tens 
 or hundreds of thousands of longs (3 for each replica) is extremely 
 expensive, since the {{ArrayList}} must realloc many times.  Also, decoding 
 repeating fields will box the primitive longs, which must then be unboxed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7450) Consolidate GetFileInfo, GetListings and GetContentSummary into a single class

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7450:
-
Attachment: HDFS-7450.002.patch

 Consolidate GetFileInfo, GetListings and GetContentSummary into a single class
 --

 Key: HDFS-7450
 URL: https://issues.apache.org/jira/browse/HDFS-7450
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7450.000.patch, HDFS-7450.001.patch, 
 HDFS-7450.002.patch


 This jira proposes to consolidate the implementation of {{GetFileInfo}}, 
 {{GetListings}} and {{GetContentSummary}} into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7450) Consolidate GetFileInfo, GetListings and GetContentSummary into a single class

2014-11-26 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-7450:
-
Attachment: HDFS-7450.003.patch

 Consolidate GetFileInfo, GetListings and GetContentSummary into a single class
 --

 Key: HDFS-7450
 URL: https://issues.apache.org/jira/browse/HDFS-7450
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7450.000.patch, HDFS-7450.001.patch, 
 HDFS-7450.002.patch, HDFS-7450.003.patch


 This jira proposes to consolidate the implementation of {{GetFileInfo}}, 
 {{GetListings}} and {{GetContentSummary}} into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227031#comment-14227031
 ] 

Lars Hofhansl commented on HDFS-6735:
-

Per my comment above, my preference would still be to just make the 
cachingStrategy reference volatile in DFSInputStream. It is immutable, and 
hence the volatile reference would make access safe in all cases without any 
locking. The same is true for fileEncryptionInfo, btw (immutable already; it 
just needs a volatile reference, with no locking needed at all).

I'll make a new patch.
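
A short sketch of that approach (illustrative names, not the actual 
{{DFSInputStream}} code): because the object is immutable, a volatile 
reference is enough for safe unlocked reads, and writers replace the whole 
object rather than mutating it.

{code}
class VolatileStrategySketch {
  // Immutable, like HDFS's CachingStrategy.
  static final class CachingStrategy {
    final boolean dropBehind;
    final long readahead;
    CachingStrategy(boolean dropBehind, long readahead) {
      this.dropBehind = dropBehind;
      this.readahead = readahead;
    }
  }

  private volatile CachingStrategy cachingStrategy =
      new CachingStrategy(false, 4L * 1024 * 1024);

  CachingStrategy getCachingStrategy() {
    return cachingStrategy;  // no lock: the volatile read suffices
  }

  void setDropBehind(boolean dropBehind) {
    CachingStrategy cur = cachingStrategy;
    // Replace, never mutate: readers always see a consistent object.
    cachingStrategy = new CachingStrategy(dropBehind, cur.readahead);
  }
}
{code}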

 A minor optimization to avoid pread() be blocked by read() inside the same 
 DFSInputStream
 -

 Key: HDFS-6735
 URL: https://issues.apache.org/jira/browse/HDFS-6735
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
 HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735.txt


 In the current DFSInputStream impl, there are a couple of coarse-grained 
 locks in the read/pread path, and it has become an HBase read-latency pain 
 point. In HDFS-6698, I made a minor patch against the first encountered 
 lock, around getFileLength; indeed, after reading the code and testing, it 
 shows there are still other locks we could improve.
 In this jira, I'll make a patch against the other locks, and a simple test 
 case to show the issue and the improved result.
 This is important for the HBase application, since in the current HFile read 
 path, we issue all read()/pread() requests in the same DFSInputStream for 
 one HFile. (A multi-stream solution is another story I had planned, but it 
 will probably take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7424) Add web UI for NFS gateway

2014-11-26 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-7424:
-
Attachment: HDFS-7424.004.patch

 Add web UI for NFS gateway
 --

 Key: HDFS-7424
 URL: https://issues.apache.org/jira/browse/HDFS-7424
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: nfs
Affects Versions: 2.2.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-7424.001.patch, HDFS-7424.002.patch, 
 HDFS-7424.003.patch, HDFS-7424.004.patch


 This JIRA is to track the effort to add a web UI for the NFS gateway to show 
 some metrics and configuration-related information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-6735:

Attachment: HDFS-6735-v6.txt

Updated patch.
The findbugs tweak is still necessary. Locking was correct before; findbugs 
does not seem to realize that all references to cachingStrategy are always 
guarded by the infoLock.

I'll run a 2.4.1 version of this patch against HBase again.


 A minor optimization to avoid pread() be blocked by read() inside the same 
 DFSInputStream
 -

 Key: HDFS-6735
 URL: https://issues.apache.org/jira/browse/HDFS-6735
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
 HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v6.txt, HDFS-6735.txt


 In the current DFSInputStream impl, there are a couple of coarse-grained 
 locks in the read/pread path, and it has become an HBase read-latency pain 
 point. In HDFS-6698, I made a minor patch against the first encountered 
 lock, around getFileLength; indeed, after reading the code and testing, it 
 shows there are still other locks we could improve.
 In this jira, I'll make a patch against the other locks, and a simple test 
 case to show the issue and the improved result.
 This is important for the HBase application, since in the current HFile read 
 path, we issue all read()/pread() requests in the same DFSInputStream for 
 one HFile. (A multi-stream solution is another story I had planned, but it 
 will probably take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7424) Add web UI for NFS gateway

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227145#comment-14227145
 ] 

Hadoop QA commented on HDFS-7424:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683937/HDFS-7424.003.patch
  against trunk revision c1f2bb2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-nfs:

org.apache.hadoop.hdfs.server.namenode.TestBackupNode
org.apache.hadoop.hdfs.nfs.nfs3.TestNfs3HttpServer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8854//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8854//console

This message is automatically generated.

 Add web UI for NFS gateway
 --

 Key: HDFS-7424
 URL: https://issues.apache.org/jira/browse/HDFS-7424
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: nfs
Affects Versions: 2.2.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-7424.001.patch, HDFS-7424.002.patch, 
 HDFS-7424.003.patch, HDFS-7424.004.patch


 This JIRA is to track the effort to add a web UI for the NFS gateway to show 
 some metrics and configuration-related information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227148#comment-14227148
 ] 

Lars Hofhansl commented on HDFS-6735:
-

Tested -v6 with HBase. Still good from the DFSInputStream angle.
I do see now that much more time is spent in ShortCircuitCache.fetchOrCreate 
and unref (I rechecked that this is true for -v3 as well).
It's still better, but the can is kicked down the road a bit.


 A minor optimization to avoid pread() be blocked by read() inside the same 
 DFSInputStream
 -

 Key: HDFS-6735
 URL: https://issues.apache.org/jira/browse/HDFS-6735
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
 HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v6.txt, HDFS-6735.txt


 In the current DFSInputStream impl, there are a couple of coarse-grained 
 locks in the read/pread path, and it has become an HBase read-latency pain 
 point. In HDFS-6698, I made a minor patch against the first encountered 
 lock, around getFileLength; indeed, after reading the code and testing, it 
 shows there are still other locks we could improve.
 In this jira, I'll make a patch against the other locks, and a simple test 
 case to show the issue and the improved result.
 This is important for the HBase application, since in the current HFile read 
 path, we issue all read()/pread() requests in the same DFSInputStream for 
 one HFile. (A multi-stream solution is another story I had planned, but it 
 will probably take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227155#comment-14227155
 ] 

Lars Hofhansl commented on HDFS-6735:
-

So, to be specific, the improvement I see above is still there. It's just that 
the next thing to tackle is the ShortCircuitCache.

 A minor optimization to avoid pread() be blocked by read() inside the same 
 DFSInputStream
 -

 Key: HDFS-6735
 URL: https://issues.apache.org/jira/browse/HDFS-6735
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie
 Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
 HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v6.txt, HDFS-6735.txt


 In the current DFSInputStream impl, there are a couple of coarse-grained 
 locks in the read/pread path, and it has become an HBase read-latency pain 
 point. In HDFS-6698, I made a minor patch against the first encountered 
 lock, around getFileLength; indeed, after reading the code and testing, it 
 shows there are still other locks we could improve.
 In this jira, I'll make a patch against the other locks, and a simple test 
 case to show the issue and the improved result.
 This is important for the HBase application, since in the current HFile read 
 path, we issue all read()/pread() requests in the same DFSInputStream for 
 one HFile. (A multi-stream solution is another story I had planned, but it 
 will probably take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reassigned HDFS-6735:
---

Assignee: Lars Hofhansl  (was: Liang Xie)

 A minor optimization to avoid pread() be blocked by read() inside the same 
 DFSInputStream
 -

 Key: HDFS-6735
 URL: https://issues.apache.org/jira/browse/HDFS-6735
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Lars Hofhansl
 Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
 HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v6.txt, HDFS-6735.txt


 In the current DFSInputStream impl, there are a couple of coarse-grained 
 locks in the read/pread path, and it has become an HBase read-latency pain 
 point. In HDFS-6698, I made a minor patch against the first encountered 
 lock, around getFileLength; indeed, after reading the code and testing, it 
 shows there are still other locks we could improve.
 In this jira, I'll make a patch against the other locks, and a simple test 
 case to show the issue and the improved result.
 This is important for the HBase application, since in the current HFile read 
 path, we issue all read()/pread() requests in the same DFSInputStream for 
 one HFile. (A multi-stream solution is another story I had planned, but it 
 will probably take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7438) Consolidate implementation of rename()

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227160#comment-14227160
 ] 

Hadoop QA commented on HDFS-7438:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683945/HDFS-7438.003.patch
  against trunk revision c1f2bb2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8856//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8856//console

This message is automatically generated.

 Consolidate implementation of rename()
 --

 Key: HDFS-7438
 URL: https://issues.apache.org/jira/browse/HDFS-7438
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7438.000.patch, HDFS-7438.001.patch, 
 HDFS-7438.002.patch, HDFS-7438.003.patch


 The implementation of {{rename()}} resides in both {{FSNameSystem}} and 
 {{FSDirectory}}. This jira proposes to consolidate them in a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7450) Consolidate GetFileInfo, GetListings and GetContentSummary into a single class

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227173#comment-14227173
 ] 

Hadoop QA commented on HDFS-7450:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683948/HDFS-7450.003.patch
  against trunk revision c1f2bb2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8858//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8858//console

This message is automatically generated.

 Consolidate GetFileInfo, GetListings and GetContentSummary into a single class
 --

 Key: HDFS-7450
 URL: https://issues.apache.org/jira/browse/HDFS-7450
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-7450.000.patch, HDFS-7450.001.patch, 
 HDFS-7450.002.patch, HDFS-7450.003.patch


 This jira proposes to consolidate the implementation of {{GetFileInfo}}, 
 {{GetListings}} and {{GetContentSummary}} into a single class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227174#comment-14227174
 ] 

Hadoop QA commented on HDFS-6735:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683956/HDFS-6735-v6.txt
  against trunk revision c1f2bb2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-hdfs-project/hadoop-hdfs 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8860//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8860//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8860//console

This message is automatically generated.

 A minor optimization to avoid pread() be blocked by read() inside the same 
 DFSInputStream
 -

 Key: HDFS-6735
 URL: https://issues.apache.org/jira/browse/HDFS-6735
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.0.0
Reporter: Liang Xie
Assignee: Lars Hofhansl
 Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735-v4.txt, 
 HDFS-6735-v5.txt, HDFS-6735-v6.txt, HDFS-6735-v6.txt, HDFS-6735.txt


 In the current DFSInputStream impl, there are a couple of coarse-grained 
 locks in the read/pread path, and it has become an HBase read-latency pain 
 point. In HDFS-6698, I made a minor patch against the first encountered 
 lock, around getFileLength; indeed, after reading the code and testing, it 
 shows there are still other locks we could improve.
 In this jira, I'll make a patch against the other locks, and a simple test 
 case to show the issue and the improved result.
 This is important for the HBase application, since in the current HFile read 
 path, we issue all read()/pread() requests in the same DFSInputStream for 
 one HFile. (A multi-stream solution is another story I had planned, but it 
 will probably take more time than I expected.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7424) Add web UI for NFS gateway

2014-11-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227223#comment-14227223
 ] 

Hadoop QA commented on HDFS-7424:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12683955/HDFS-7424.004.patch
  against trunk revision c1f2bb2.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-nfs:

org.apache.hadoop.hdfs.nfs.nfs3.TestNfs3HttpServer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8859//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8859//console

This message is automatically generated.

 Add web UI for NFS gateway
 --

 Key: HDFS-7424
 URL: https://issues.apache.org/jira/browse/HDFS-7424
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: nfs
Affects Versions: 2.2.0
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-7424.001.patch, HDFS-7424.002.patch, 
 HDFS-7424.003.patch, HDFS-7424.004.patch


 This JIRA is to track the effort to add a web UI for the NFS gateway to show 
 some metrics and configuration-related information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-5098) Enhance FileSystem.Statistics to have locality information

2014-11-26 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha resolved HDFS-5098.
--
Resolution: Duplicate

 Enhance FileSystem.Statistics to have locality information
 --

 Key: HDFS-5098
 URL: https://issues.apache.org/jira/browse/HDFS-5098
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Bikas Saha
Assignee: Suresh Srinivas
 Fix For: 2.6.0


 Currently in MR/Tez we don't have a good and accurate means to detect how 
 much of the IO was actually done locally. Getting this information from the 
 source of truth would be much better.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)