[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException
[ https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043082#comment-14043082 ] Yongjun Zhang commented on HDFS-6475: - Thanks a lot [~atm]! Many thanks to [~daryn] and [~jingzhao] for the review and comments. I will follow up with the getTrueCause issue in HDFS-6588. WebHdfs clients fail without retry because incorrect handling of StandbyException - Key: HDFS-6475 URL: https://issues.apache.org/jira/browse/HDFS-6475 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Fix For: 2.5.0 Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, HDFS-6475.008.patch, HDFS-6475.009.patch With WebHdfs clients connected to an HA HDFS service, the delegation token was previously initialized with the active NN. When a client tries to issue a request, the NNs it can contact are stored in a map returned by DFSUtil.getNNServiceRpcAddresses(conf), and the client contacts them in order, so the first NN it reaches is likely the standby NN. If the standby NN doesn't have the updated client credential, it throws a SecurityException that wraps a StandbyException. The client is expected to retry another NN, but due to the insufficient handling of the SecurityException mentioned above, it fails instead.
Example message: {code}
{RemoteException={message=Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException, javaClassName=java.lang.SecurityException, exception=SecurityException}}
org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
	at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
	at kclient1.kclient$1.run(kclient.java:64)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:356)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
	at kclient1.kclient.main(kclient.java:58)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
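To make the broken retry concrete, here is a minimal sketch of the kind of check the client side needs (a hypothetical helper for illustration, not the actual HDFS-6475 patch): a StandbyException that should trigger failover may survive only as text inside a wrapping SecurityException's message, so the retry decision has to look through the wrapper.

```java
// Hypothetical sketch (not the HDFS-6475 patch): walk an exception's
// cause chain and messages to decide whether a wrapped StandbyException
// should trigger a retry against another NameNode.
public class RetryDecision {
    static final String STANDBY = "StandbyException";

    // Returns true if the exception, or anything it wraps, is or
    // mentions a StandbyException -- the signal to fail over.
    public static boolean isRetriable(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur.getClass().getSimpleName().equals(STANDBY)) {
                return true;
            }
            String msg = cur.getMessage();
            if (msg != null && msg.contains(STANDBY)) {
                return true;
            }
        }
        return false;
    }
}
```

A client loop that consults such a check after each failed request could move on to the next NN in the map instead of surfacing the SecurityException directly.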
[jira] [Commented] (HDFS-6595) Configure the maximum threads allowed for balancing on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043087#comment-14043087 ] Hadoop QA commented on HDFS-6595: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652334/HDFS-6595.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7228//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7228//console This message is automatically generated. Configure the maximum threads allowed for balancing on datanodes Key: HDFS-6595 URL: https://issues.apache.org/jira/browse/HDFS-6595 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-6595.patch, HDFS-6595.patch Currently the datanode allows a max of 5 threads to be used for balancing. In some cases, it may make sense to use a different number of threads for moving blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
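As a sketch of what such a knob could look like in hdfs-site.xml — the property name below is an assumption chosen for illustration, patterned after existing datanode balancing settings, and is not necessarily the name the patch finally uses:

```xml
<!-- Hypothetical example: cap the number of threads a datanode may
     devote to balancer-driven block moves (the hard-coded default
     described above is 5). -->
<property>
  <name>dfs.datanode.balance.max.concurrent.moves</name>
  <value>10</value>
</property>
```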
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043160#comment-14043160 ] Hadoop QA commented on HDFS-2856: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652338/HDFS-2856.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7229//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7229//console This message is automatically generated. 
Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc to get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an HMAC of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
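The challenge-response idea above can be sketched generically (an illustrative sketch of the technique, not the actual DataTransferProtocol change; class and method names are hypothetical): the datanode issues a random nonce, the client proves knowledge of the shared block-token secret by returning an HMAC of the nonce, and the secret itself never crosses the wire.

```java
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of a nonce/HMAC challenge-response.
public class NonceChallenge {
    // Datanode side: generate a fresh random challenge per connection.
    public static byte[] newNonce() {
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    // Client side: answer the challenge with an HMAC keyed by the
    // shared secret, instead of sending the secret itself.
    public static byte[] hmac(byte[] secret, byte[] nonce) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(nonce);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Datanode side: recompute and compare in constant time.
    public static boolean verify(byte[] secret, byte[] nonce, byte[] reply) {
        return MessageDigest.isEqual(hmac(secret, nonce), reply);
    }
}
```

Because only a keyed digest of the nonce is transmitted, an eavesdropper learns nothing useful, which is what removes the need for a privileged port.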
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: HDFS-6591.txt while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: (was: HDFS-6591.txt) while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Attachment: HDFS-6596.1.patch An initial patch. Since {{DataInputStream#readFully(byte[], int, int)}} is final and {{FSDataInputStream}} can't override it, we implement a readFully with ByteBuffer. Improve InputStream when read spans two blocks -- Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6596.1.patch In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as follows: {code}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
if (locatedBlocks.isLastBlockComplete()) {
  realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
}
int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
{code} From the above code, we can conclude that read will return at most (blockEnd - pos + 1) bytes. As a result, when a read spans two blocks, the caller must call read() a second time to complete the request, and must wait a second time to acquire the DFSInputStream lock (read() is synchronized for DFSInputStream). For latency-sensitive applications, such as HBase, this becomes a latency pain point under heavy contention. So here we propose that read() should loop internally to do a best-effort read. The current implementation of pread (read(position, buffer, offset, length)) already loops internally to do a best-effort read, so we can refactor to support this on the normal read path. -- This message was sent by Atlassian JIRA (v6.2#6252)
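The internal loop proposed above can be sketched with a generic helper over java.io.InputStream (a hypothetical illustration of the looping idea, not the actual DFSInputStream change): keep calling read() until the requested length is satisfied or EOF is reached, so a request spanning a block boundary completes in one caller-visible call instead of two.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical best-effort read loop: repeatedly call read() until
// `len` bytes are gathered or the stream hits EOF.
public class BestEffortRead {
    public static int readFully(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {           // EOF: return whatever we got
                break;
            }
            total += n;
        }
        return total;
    }
}
```

In DFSInputStream the analogous loop would sit inside the synchronized read(), so the lock is acquired once per caller request rather than once per block.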
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043285#comment-14043285 ] Liang Xie commented on HDFS-6591: - Retry. Debugging revealed a rare race: the CountDownLatch is counted down inside the Callable, but there's no guarantee that when countDown has happened, one of the tasks is done. See: http://stackoverflow.com/questions/9604713/future-isdone-returns-false-even-if-the-task-is-done . Really tricky... I rewrote the synchronization-related code in the latest patch, and all the TestPread cases passed in a shell loop :) while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
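The race described above can be sketched generically (a hypothetical illustration, not the actual patch): a CountDownLatch counted down *inside* a Callable fires before the executor marks the corresponding Future done, so a waiter can wake up, poll Future.isDone(), see false, and spin. A CompletionService sidesteps the race because take() only ever hands back Futures that are already complete.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical sketch: wait for the first hedged request to finish
// without polling isDone() on individual Futures.
public class HedgedWait {
    public static String firstResult(ExecutorService pool) {
        try {
            CompletionService<String> cs = new ExecutorCompletionService<>(pool);
            cs.submit(() -> "replica-1");   // hedged request to one replica
            cs.submit(() -> "replica-2");   // hedged request to another
            Future<String> done = cs.take(); // returns only completed Futures
            return done.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape, the waiter is notified by the executor after the Future transitions to done, so the "countDown happened but isDone() is false" window never opens.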
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: (was: HDFS-6591.txt) while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: HDFS-6591.txt while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6587) Bug in TestBPOfferService can cause test failure
[ https://issues.apache.org/jira/browse/HDFS-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043323#comment-14043323 ] Hudson commented on HDFS-6587: -- FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/594/]) HDFS-6587. Fix a typo in message issued from explorer.js. Contributed by Yongjun Zhang. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605184)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js
Bug in TestBPOfferService can cause test failure Key: HDFS-6587 URL: https://issues.apache.org/jira/browse/HDFS-6587 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Zhilei Xu Assignee: Zhilei Xu Fix For: 3.0.0, 2.5.0 Attachments: patch_TestBPOfferService.txt We need to fix a bug in TestBPOfferService#waitForBlockReceived that fails on trunk, e.g. in Build #1781. Details: in this test, the utility function waitForBlockReceived() has a bug: the parameter mockNN is never used; the hard-coded mockNN1 is used instead. This bug introduces nondeterministic test failures when testBasicFunctionality() calls ret = waitForBlockReceived(FAKE_BLOCK, mockNN2); and the call finishes before the actual interaction with mockNN2 happens. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6593) Move SnapshotDiffInfo out of INodeDirectorySnapshottable
[ https://issues.apache.org/jira/browse/HDFS-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043335#comment-14043335 ] Hudson commented on HDFS-6593: -- FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/594/]) HDFS-6593. Move SnapshotDiffInfo out of INodeDirectorySnapshottable. Contributed by Jing Zhao. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605169)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/SnapshotDiffReport.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotDiffInfo.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
Move SnapshotDiffInfo out of INodeDirectorySnapshottable Key: HDFS-6593 URL: https://issues.apache.org/jira/browse/HDFS-6593 Project: Hadoop HDFS Issue Type: Improvement Components: namenode, snapshots Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6593.000.patch, HDFS-6593.001.patch, HDFS-6593.002.patch Per discussion in HDFS-4667, we can move SnapshotDiffInfo out of INodeDirectorySnapshottable as an individual class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6486) Add user doc for XAttrs via WebHDFS.
[ https://issues.apache.org/jira/browse/HDFS-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043327#comment-14043327 ] Hudson commented on HDFS-6486: -- FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/594/]) HDFS-6486. Add user doc for XAttrs via WebHDFS. Contributed by Yi Liu. (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605062)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm
Add user doc for XAttrs via WebHDFS. Key: HDFS-6486 URL: https://issues.apache.org/jira/browse/HDFS-6486 Project: Hadoop HDFS Issue Type: Task Components: webhdfs Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6486.patch Add the user doc for XAttrs via WebHDFS. Set xattr: {code}
curl -i -X PUT 'http://HOST:PORT/webhdfs/v1/PATH?op=SETXATTR&xattr.name=XATTRNAME&xattr.value=XATTRVALUE&flag=FLAG'
{code} Remove xattr: {code}
curl -i -X PUT 'http://HOST:PORT/webhdfs/v1/PATH?op=REMOVEXATTR&xattr.name=XATTRNAME'
{code} Get an xattr: {code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&xattr.name=XATTRNAME&encoding=ENCODING'
{code} Get multiple xattrs (XATTRNAME1, XATTRNAME2, XATTRNAME3): {code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&xattr.name=XATTRNAME1&xattr.name=XATTRNAME2&xattr.name=XATTRNAME3&encoding=ENCODING'
{code} Get all xattrs: {code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&encoding=ENCODING'
{code} List xattrs: {code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=LISTXATTRS'
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6430) HTTPFS - Implement XAttr support
[ https://issues.apache.org/jira/browse/HDFS-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043337#comment-14043337 ] Hudson commented on HDFS-6430: -- FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/594/]) HDFS-6430. HTTPFS - Implement XAttr support. (Yi Liu via tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605118)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSParametersProvider.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/EnumSetParam.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/Parameters.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/ParametersProvider.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServerNoXAttrs.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/test/TestHdfsHelper.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
HTTPFS - Implement XAttr support Key: HDFS-6430 URL: https://issues.apache.org/jira/browse/HDFS-6430 Project: Hadoop HDFS Issue Type: Task Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.5.0 Attachments: HDFS-6430.1.patch, HDFS-6430.2.patch, HDFS-6430.3.patch, HDFS-6430.4.patch, HDFS-6430.5.patch, HDFS-6430.patch Add xattr support to HttpFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Status: Patch Available (was: Open) Improve InputStream when read spans two blocks -- Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6596.1.patch In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as follows: {code}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
if (locatedBlocks.isLastBlockComplete()) {
  realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
}
int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
{code} From the above code, we can conclude that read will return at most (blockEnd - pos + 1) bytes. As a result, when a read spans two blocks, the caller must call read() a second time to complete the request, and must wait a second time to acquire the DFSInputStream lock (read() is synchronized for DFSInputStream). For latency-sensitive applications, such as HBase, this becomes a latency pain point under heavy contention. So here we propose that read() should loop internally to do a best-effort read. The current implementation of pread (read(position, buffer, offset, length)) already loops internally to do a best-effort read, so we can refactor to support this on the normal read path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6556) Refine XAttr permissions
[ https://issues.apache.org/jira/browse/HDFS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043365#comment-14043365 ] Uma Maheswara Rao G commented on HDFS-6556: --- Thanks a lot, Yi, for the review. [~andrew.wang] or [~cnauroth], do you want to take a look? If either of you +1, I can go ahead and commit. Refine XAttr permissions Key: HDFS-6556 URL: https://issues.apache.org/jira/browse/HDFS-6556 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Yi Liu Assignee: Uma Maheswara Rao G Attachments: RefinedPermissions-HDFS-6556-1.patch, RefinedPermissions-HDFS-6556.patch, refinedPermissions-HDFS-6556-2.patch After discussing with Uma, we should refine the permissions for setting {{user}} and {{trusted}} namespace xattrs. *1.* For {{user}} namespace xattrs: HDFS-6374 says setXAttr should require the user to be the owner of the file or directory, but we had a bit of a misunderstanding. It actually is: {quote} The access permissions for user attributes are defined by the file permission bits. Only regular files and directories can have extended attributes. For sticky directories, only the owner and privileged user can write attributes. {quote} We can refer to the Linux source code at http://lxr.free-electrons.com/source/fs/xattr.c?v=2.6.35 I also checked in Linux: access is controlled by the file permission bits for regular files and non-sticky directories. *2.* For the {{trusted}} namespace, we currently require the user to be both the owner and a superuser. Actually, superuser is enough. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043411#comment-14043411 ] Hadoop QA commented on HDFS-6591: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652381/HDFS-6591.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7230//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7230//console This message is automatically generated. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043433#comment-14043433 ] Hadoop QA commented on HDFS-6591: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652383/HDFS-6591.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.balancer.TestBalancer
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7232//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7232//console This message is automatically generated. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043443#comment-14043443 ] Hadoop QA commented on HDFS-6591: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652383/HDFS-6591.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7231//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7231//console This message is automatically generated. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException
[ https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043493#comment-14043493 ] Hudson commented on HDFS-6475: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6475. WebHdfs clients fail without retry because incorrect handling of StandbyException. Contributed by Yongjun Zhang. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java WebHdfs clients fail without retry because incorrect handling of StandbyException - Key: HDFS-6475 URL: https://issues.apache.org/jira/browse/HDFS-6475 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Fix For: 2.5.0 Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, HDFS-6475.008.patch, HDFS-6475.009.patch With WebHdfs clients connected to an HA HDFS service, the delegation token is initially obtained from the active NN. When a client issues a request, the NNs to contact are stored in a map returned by DFSUtil.getNNServiceRpcAddresses(conf), and the client contacts them in order, so the first one it reaches is likely the standby NN. If the standby NN doesn't have the updated client credential, it throws a SecurityException that wraps a StandbyException. The client is expected to retry the other NN, but because of the insufficient handling of the SecurityException described above, it fails instead. 
Example message:
{code}
{RemoteException={message=Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException, javaClassName=java.lang.SecurityException, exception=SecurityException}}
org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
	at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
	at kclient1.kclient$1.run(kclient.java:64)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:356)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
	at kclient1.kclient.main(kclient.java:58)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
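The fix centers on how the server side unwraps nested causes before deciding whether the client should retry. As a rough illustration (not Hadoop's actual ExceptionHandler code; the StandbyException class below is a stand-in for org.apache.hadoop.ipc.StandbyException), walking the cause chain is enough to tell a retriable standby condition buried inside a SecurityException apart from a genuine security failure:

```java
// Hypothetical sketch, not Hadoop source: detect a StandbyException that
// has been wrapped (possibly several levels deep) inside another exception.
class CauseUnwrapSketch {
    // Stand-in for org.apache.hadoop.ipc.StandbyException.
    static class StandbyException extends Exception {}

    /** True if any cause in t's chain is a StandbyException. */
    static boolean isWrappedStandby(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof StandbyException) {
                return true;
            }
        }
        return false;
    }
}
```

A caller that sees isWrappedStandby() return true would map the error to a retriable response, so the WebHDFS client moves on to the other NN instead of failing outright.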
[jira] [Commented] (HDFS-6587) Bug in TestBPOfferService can cause test failure
[ https://issues.apache.org/jira/browse/HDFS-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043491#comment-14043491 ] Hudson commented on HDFS-6587: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6587. Fix a typo in message issued from explorer.js. Contributed by Yongjun Zhang. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605184) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js Bug in TestBPOfferService can cause test failure Key: HDFS-6587 URL: https://issues.apache.org/jira/browse/HDFS-6587 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Zhilei Xu Assignee: Zhilei Xu Fix For: 3.0.0, 2.5.0 Attachments: patch_TestBPOfferService.txt We need to fix a bug in TestBPOfferService#waitForBlockReceived that fails the trunk, e.g. in Build #1781. Details: in this test, the utility function waitForBlockReceived() has a bug: the parameter mockNN is never used; the hard-coded mockNN1 is used instead. This bug introduces nondeterministic test failures when testBasicFunctionality() calls ret = waitForBlockReceived(FAKE_BLOCK, mockNN2); and the call finishes before the actual interaction with mockNN2 happens. -- This message was sent by Atlassian JIRA (v6.2#6252)
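The bug pattern described above, a helper that accepts a mock as a parameter but checks a hard-coded field, can be reduced to a small sketch. The names below are illustrative stand-ins, not the actual TestBPOfferService code:

```java
// Hypothetical reduction of the bug: the buggy helper ignores its
// parameter and polls a hard-coded field, so a caller waiting on a
// different mock gets an answer about the wrong object.
class ParamShadowSketch {
    static class FakeNN { int blocksReceived; }

    static FakeNN mockNN1 = new FakeNN(); // hard-coded field, starts at 0

    // Buggy version: the nn parameter is never used.
    static boolean waitForBlockReceivedBuggy(FakeNN nn) {
        return mockNN1.blocksReceived > 0; // should check nn, not mockNN1
    }

    // Fixed version: checks the mock it was actually given.
    static boolean waitForBlockReceivedFixed(FakeNN nn) {
        return nn.blocksReceived > 0;
    }
}
```

With a second mock that has received a block, the buggy helper still reports false because it is silently looking at the first mock, which is exactly the nondeterministic failure mode described in the issue.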
[jira] [Commented] (HDFS-6486) Add user doc for XAttrs via WebHDFS.
[ https://issues.apache.org/jira/browse/HDFS-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043495#comment-14043495 ] Hudson commented on HDFS-6486: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6486. Add user doc for XAttrs via WebHDFS. Contributed by Yi Liu. (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605062) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm Add user doc for XAttrs via WebHDFS. Key: HDFS-6486 URL: https://issues.apache.org/jira/browse/HDFS-6486 Project: Hadoop HDFS Issue Type: Task Components: webhdfs Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6486.patch Add the user doc for XAttrs via WebHDFS. Set xattr:
{code}
curl -i -X PUT 'http://HOST:PORT/webhdfs/v1/PATH?op=SETXATTR&xattr.name=XATTRNAME&xattr.value=XATTRVALUE&flag=FLAG'
{code}
Remove xattr:
{code}
curl -i -X PUT 'http://HOST:PORT/webhdfs/v1/PATH?op=REMOVEXATTR&xattr.name=XATTRNAME'
{code}
Get an xattr:
{code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&xattr.name=XATTRNAME&encoding=ENCODING'
{code}
Get multiple xattrs (XATTRNAME1, XATTRNAME2, XATTRNAME3):
{code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&xattr.name=XATTRNAME1&xattr.name=XATTRNAME2&xattr.name=XATTRNAME3&encoding=ENCODING'
{code}
Get all xattrs:
{code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&encoding=ENCODING'
{code}
List xattrs:
{code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=LISTXATTRS'
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6430) HTTPFS - Implement XAttr support
[ https://issues.apache.org/jira/browse/HDFS-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043505#comment-14043505 ] Hudson commented on HDFS-6430: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6430. HTTPFS - Implement XAttr support. (Yi Liu via tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1605118) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSParametersProvider.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/EnumSetParam.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/Parameters.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/ParametersProvider.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServerNoXAttrs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/test/TestHdfsHelper.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HTTPFS - Implement XAttr support Key: HDFS-6430 URL: https://issues.apache.org/jira/browse/HDFS-6430 Project: Hadoop HDFS Issue Type: Task Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.5.0 Attachments: HDFS-6430.1.patch, HDFS-6430.2.patch, HDFS-6430.3.patch, HDFS-6430.4.patch, HDFS-6430.5.patch, HDFS-6430.patch Add xattr support to HttpFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6593) Move SnapshotDiffInfo out of INodeDirectorySnapshottable
[ https://issues.apache.org/jira/browse/HDFS-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043503#comment-14043503 ] Hudson commented on HDFS-6593: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6593. Move SnapshotDiffInfo out of INodeDirectorySnapshottable. Contributed by Jing Zhao. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1605169) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/SnapshotDiffReport.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotDiffInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java Move SnapshotDiffInfo out of INodeDirectorySnapshottable Key: HDFS-6593 URL: https://issues.apache.org/jira/browse/HDFS-6593 Project: Hadoop HDFS Issue Type: Improvement Components: namenode, snapshots Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6593.000.patch, HDFS-6593.001.patch, HDFS-6593.002.patch Per discussion in HDFS-4667, we can move SnapshotDiffInfo out of INodeDirectorySnapshottable as an individual class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043545#comment-14043545 ] Hadoop QA commented on HDFS-6596: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652382/HDFS-6596.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7233//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7233//console This message is automatically generated. 
Improve InputStream when read spans two blocks -- Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6596.1.patch In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as follows:
{code}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
if (locatedBlocks.isLastBlockComplete()) {
  realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
}
int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
{code}
From the above code we can see that read will return at most (blockEnd - pos + 1) bytes. As a result, when a read spans two blocks, the caller must call read() a second time to complete the request, and must wait a second time to acquire the DFSInputStream lock (read() is synchronized on the DFSInputStream). For latency-sensitive applications such as HBase, this becomes a latency pain point under heavy contention. So here we propose looping internally in read() to do a best-effort read. The current implementation of pread (read(position, buffer, offset, length)) already loops internally to do a best-effort read, so we can refactor normal read to support this as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
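The proposed best-effort read can be sketched as follows. This is a simplified stand-in, not DFSInputStream itself; BlockSource is a hypothetical interface modeling readBuffer()'s contract of never reading past the current block boundary:

```java
// Minimal sketch of the proposal: loop inside read() so a single call
// can span a block boundary instead of returning early at blockEnd.
class LoopingReadSketch {
    interface BlockSource {
        // Reads up to len bytes but never past the current block boundary;
        // returns the number of bytes read, or -1 at EOF.
        int readWithinBlock(byte[] buf, int off, int len);
    }

    /** Best-effort read: keep calling until len bytes are read or EOF. */
    static int readBestEffort(BlockSource src, byte[] buf, int off, int len) {
        int total = 0;
        while (total < len) {
            int n = src.readWithinBlock(buf, off + total, len - total);
            if (n < 0) {                       // hit EOF mid-request
                return total > 0 ? total : -1;
            }
            total += n;
        }
        return total;
    }
}
```

The payoff is that the lock around the whole loop is acquired once per caller request instead of once per block touched, which is the contention the issue describes for HBase-style workloads.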
[jira] [Commented] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043586#comment-14043586 ] Kihwal Lee commented on HDFS-6527: -- [~jingzhao] You are right. Since it re-resolves inside the write lock, it will detect the deletion. I will revert it from 2.4.1. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 2.4.1 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crash with the following error:
{panel}
\[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2]
java.io.FileNotFoundException: File does not exist: /xxx
{panel}
This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the deferred inode removal happens outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-6527. -- Resolution: Fixed Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crash with the following error:
{panel}
\[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2]
java.io.FileNotFoundException: File does not exist: /xxx
{panel}
This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the deferred inode removal happens outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6527: - Fix Version/s: (was: 2.4.1) 2.5.0 3.0.0 Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crash with the following error:
{panel}
\[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2]
java.io.FileNotFoundException: File does not exist: /xxx
{panel}
This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the deferred inode removal happens outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043596#comment-14043596 ] Kihwal Lee commented on HDFS-6527: -- Reverted it from branch-2.4.1 and also updated the release note. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crash with the following error:
{panel}
\[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2]
java.io.FileNotFoundException: File does not exist: /xxx
{panel}
This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the deferred inode removal happens outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
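The essence of the fix discussed above, re-resolving the inode under the write lock before appending OP_ADD_BLOCK, can be sketched with a plain map standing in for FSDirectory's inode map. This is illustrative only: a HashMap replaces the real inode map, and Java's monitor lock replaces the FSN write lock; none of the names below are Hadoop's.

```java
// Hypothetical sketch of the guard: re-check that the inode still exists
// while holding the lock, so a file deleted between the read-lock and
// write-lock phases can no longer have a block appended to it.
import java.util.HashMap;
import java.util.Map;

class RecheckUnderLockSketch {
    final Map<Long, String> inodeMap = new HashMap<>(); // inode id -> path

    /** Refuses the block addition if the file was deleted concurrently. */
    synchronized boolean addBlockIfStillPresent(long inodeId) {
        if (!inodeMap.containsKey(inodeId)) {
            return false; // deleted between lock acquisitions: reject
        }
        // ...safe to log OP_ADD_BLOCK for inodeMap.get(inodeId) here...
        return true;
    }
}
```

Rejecting the stale request here is what prevents the OP_ADD, OP_DELETE, OP_ADD_BLOCK sequence from ever reaching the edit log.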
[jira] [Updated] (HDFS-6601) Issues in finalizing rolling upgrade when there is a layout version change
[ https://issues.apache.org/jira/browse/HDFS-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6601: - Priority: Blocker (was: Major) Issues in finalizing rolling upgrade when there is a layout version change -- Key: HDFS-6601 URL: https://issues.apache.org/jira/browse/HDFS-6601 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-6601.patch After HDFS-6545, we have noticed a couple of issues. - The storage dir's VERSION file is not properly updated. This becomes a problem when there is a layout version change. We can have the finalization do {{storage.writeAll()}} - {{OP_ROLLING_UPGRADE_FINALIZE}} cannot be replayed, once the corresponding {{OP_ROLLING_UPGRADE_START}} is consumed and a new fsimage is created (e.g. rollback image). On restart, NN terminates complaining it can't finalize something that it didn't start. We can make NN ignore {{OP_ROLLING_UPGRADE_FINALIZE}} if no rolling upgrade is in progress. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6597) Add a new option to NN upgrade to terminate the process after upgrade on NN is completed
[ https://issues.apache.org/jira/browse/HDFS-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043654#comment-14043654 ] Danilo Vunjak commented on HDFS-6597: - Hi guys, you have a point when saying -force is not the right name. I would pick -upgradeOnly as perhaps the best option. What is your opinion? [~jingzhao] Yes, the NN is included in all services. Datanodes need it up in order to upgrade themselves. Thanks, Danilo Add a new option to NN upgrade to terminate the process after upgrade on NN is completed Key: HDFS-6597 URL: https://issues.apache.org/jira/browse/HDFS-6597 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Danilo Vunjak Attachments: JIRA-HDFS-30.patch Currently, when the namenode is started for upgrade (the hadoop namenode -upgrade command), after finishing the metadata upgrade the namenode starts working normally and waits for datanodes to upgrade themselves and connect to the NN. We need an option for upgrading only the NN metadata, so that after the upgrade is finished on the NN, the process terminates. I have tested it by changing the file hdfs.server.namenode.NameNode.java, method public static NameNode createNameNode(String argv[], Configuration conf), adding the following case to the switch:
{code}
case UPGRADE: {
  DefaultMetricsSystem.initialize("NameNode");
  NameNode nameNode = new NameNode(conf);
  if (startOpt.getForceUpgrade()) {
    terminate(0);
    return null;
  }
  return nameNode;
}
{code}
This upgraded the metadata and terminated the process when finished; later, when all services were started, the upgrade of the datanodes finished successfully and the system ran. What I'm suggesting is to add a new startup parameter -force, so the namenode can be started like hadoop namenode -upgrade -force, indicating that we want to terminate the process after the NN metadata upgrade is finished. The old functionality should be preserved, so users can run hadoop namenode -upgrade in the same way and with the same behaviour as before. 
Thanks, Danilo -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException
[ https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043677#comment-14043677 ] Hudson commented on HDFS-6475: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/]) HDFS-6475. WebHdfs clients fail without retry because incorrect handling of StandbyException. Contributed by Yongjun Zhang. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java WebHdfs clients fail without retry because incorrect handling of StandbyException - Key: HDFS-6475 URL: https://issues.apache.org/jira/browse/HDFS-6475 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Fix For: 2.5.0 Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, HDFS-6475.008.patch, HDFS-6475.009.patch With WebHdfs clients connected to an HA HDFS service, the delegation token is initially obtained from the active NN. When a client issues a request, the NNs to contact are stored in a map returned by DFSUtil.getNNServiceRpcAddresses(conf), and the client contacts them in order, so the first one it reaches is likely the standby NN. If the standby NN doesn't have the updated client credential, it throws a SecurityException that wraps a StandbyException. The client is expected to retry the other NN, but because of the insufficient handling of the SecurityException described above, it fails instead. 
Example message:
{code}
{RemoteException={message=Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException, javaClassName=java.lang.SecurityException, exception=SecurityException}}
org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
	at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
	at kclient1.kclient$1.run(kclient.java:64)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:356)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
	at kclient1.kclient.main(kclient.java:58)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6430) HTTPFS - Implement XAttr support
[ https://issues.apache.org/jira/browse/HDFS-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043688#comment-14043688 ] Hudson commented on HDFS-6430: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/]) HDFS-6430. HTTPFS - Implement XAttr support. (Yi Liu via tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1605118) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSParametersProvider.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/EnumSetParam.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/Parameters.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/ParametersProvider.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServerNoXAttrs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/test/TestHdfsHelper.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HTTPFS - Implement XAttr support Key: HDFS-6430 URL: https://issues.apache.org/jira/browse/HDFS-6430 Project: Hadoop HDFS Issue Type: Task Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.5.0 Attachments: HDFS-6430.1.patch, HDFS-6430.2.patch, HDFS-6430.3.patch, HDFS-6430.4.patch, HDFS-6430.5.patch, HDFS-6430.patch Add xattr support to HttpFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6593) Move SnapshotDiffInfo out of INodeDirectorySnapshottable
[ https://issues.apache.org/jira/browse/HDFS-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043686#comment-14043686 ] Hudson commented on HDFS-6593: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/]) HDFS-6593. Move SnapshotDiffInfo out of INodeDirectorySnapshottable. Contributed by Jing Zhao. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1605169) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/SnapshotDiffReport.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotDiffInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java Move SnapshotDiffInfo out of INodeDirectorySnapshottable Key: HDFS-6593 URL: https://issues.apache.org/jira/browse/HDFS-6593 Project: Hadoop HDFS Issue Type: Improvement Components: namenode, snapshots Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6593.000.patch, HDFS-6593.001.patch, HDFS-6593.002.patch Per discussion in HDFS-4667, we can move SnapshotDiffInfo out of INodeDirectorySnapshottable as an individual class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6587) Bug in TestBPOfferService can cause test failure
[ https://issues.apache.org/jira/browse/HDFS-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043675#comment-14043675 ] Hudson commented on HDFS-6587: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/]) HDFS-6587. Fix a typo in message issued from explorer.js. Contributed by Yongjun Zhang. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605184) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js Bug in TestBPOfferService can cause test failure Key: HDFS-6587 URL: https://issues.apache.org/jira/browse/HDFS-6587 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Zhilei Xu Assignee: Zhilei Xu Fix For: 3.0.0, 2.5.0 Attachments: patch_TestBPOfferService.txt We need to fix a bug in TestBPOfferService#waitForBlockReceived that fails the trunk, e.g. in Build #1781. Details: in this test, the utility function waitForBlockReceived() has a bug: the parameter mockNN is never used; the hard-coded mockNN1 is used instead. This bug introduces nondeterministic test failures when testBasicFunctionality() calls ret = waitForBlockReceived(FAKE_BLOCK, mockNN2); and the call finishes before the actual interaction with mockNN2 happens. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
Kihwal Lee created HDFS-6602: Summary: PendingDeletionBlocks on SBN keeps increasing Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
[ https://issues.apache.org/jira/browse/HDFS-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043736#comment-14043736 ] Kihwal Lee commented on HDFS-6602: -- Since {{ReplicationMonitor}} is not running on SBN, {{invalidateBlocks}} is not consumed. Only when the SBN becomes active, it will be cleared. {{invalidateBlocks}} is populated during block report processing. I think no queue including {{invalidateBlocks}} should be populated in standby. PendingDeletionBlocks on SBN keeps increasing - Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
[ https://issues.apache.org/jira/browse/HDFS-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043736#comment-14043736 ] Kihwal Lee edited comment on HDFS-6602 at 6/25/14 5:05 PM: --- Since {{ReplicationMonitor}} is not running or not generating any work on SBN, {{invalidateBlocks}} is not consumed. Only when SBN becomes active, it will be cleared. {{invalidateBlocks}} is populated during block report processing. I think no queues including {{invalidateBlocks}} should be populated in standby. was (Author: kihwal): Since {{ReplicationMonitor}} is not running on SBN, {{invalidateBlocks}} is not consumed. Only when the SBN becomes active, it will be cleared. {{invalidateBlocks}} is populated during block report processing. I think no queue including {{invalidateBlocks}} should be populated in standby. PendingDeletionBlocks on SBN keeps increasing - Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6556) Refine XAttr permissions
[ https://issues.apache.org/jira/browse/HDFS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043753#comment-14043753 ] Chris Nauroth commented on HDFS-6556: - Hi, [~umamaheswararao]. The patch looks good. I have one minor suggestion. I see this code block is repeated in {{FSNamesystem#setXAttrInt}} and {{FSNamesystem#removeXAttr}}:
{code}
if (isPermissionEnabled && xAttr.getNameSpace() == XAttr.NameSpace.USER) {
  if (isStickyBitDirectory(src)) {
    if (!pc.isSuperUser()) {
      checkOwner(pc, src);
    }
  } else {
    checkPathAccess(pc, src, FsAction.WRITE);
  }
}
{code}
We could remove the {{isStickyBitDirectory}} method and instead add a method named something like {{checkXAttrChangeAccess}} that fully encapsulates all of the above logic. This would reduce code duplication. What do you think? Refine XAttr permissions Key: HDFS-6556 URL: https://issues.apache.org/jira/browse/HDFS-6556 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Yi Liu Assignee: Uma Maheswara Rao G Attachments: RefinedPermissions-HDFS-6556-1.patch, RefinedPermissions-HDFS-6556.patch, refinedPermissions-HDFS-6556-2.patch After discussing with Uma, we should refine the permission checks for setting {{user}} and {{trusted}} namespace xattrs. *1.* For {{user}} namespace xattrs: HDFS-6374 says setXAttr should require the user to be the owner of the file or directory, but we had a bit of a misunderstanding. It actually is: {quote} The access permissions for user attributes are defined by the file permission bits. Only regular files and directories can have extended attributes. For sticky directories, only the owner and privileged user can write attributes. {quote} We can refer to the Linux source code at http://lxr.free-electrons.com/source/fs/xattr.c?v=2.6.35 I also checked in Linux; it's controlled by the file permission bits for regular files and directories (not sticky).
*2.* For the {{trusted}} namespace, we currently require the user to be both owner and superuser. Actually, superuser is enough. -- This message was sent by Atlassian JIRA (v6.2#6252)
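Chris's suggestion can be sketched as a single helper that encapsulates the repeated block. The following is an illustrative, self-contained model of the access decision only; the method and parameter names are hypothetical, and the real code would work against {{FSPermissionChecker}} and inode state rather than booleans:

```java
public class XAttrAccessSketch {
    /**
     * Consolidated check for changing a user-namespace xattr, modeling the
     * block duplicated in setXAttrInt/removeXAttr. Returns true if allowed.
     */
    static boolean checkXAttrChangeAccess(boolean permissionEnabled,
                                          boolean userNamespace,
                                          boolean stickyBitDirectory,
                                          boolean superUser,
                                          boolean owner,
                                          boolean writeAccess) {
        if (!permissionEnabled || !userNamespace) {
            return true;                    // no extra check applies here
        }
        if (stickyBitDirectory) {
            // Sticky parent dir: only the owner or a superuser may write attributes.
            return superUser || owner;
        }
        // Regular case: governed by the file's permission bits (WRITE access).
        return writeAccess;
    }

    public static void main(String[] args) {
        // Sticky dir: owner may change; a non-owner without superuser may not.
        assert checkXAttrChangeAccess(true, true, true, false, true, false);
        assert !checkXAttrChangeAccess(true, true, true, false, false, true);
        // Non-sticky: plain WRITE permission decides.
        assert checkXAttrChangeAccess(true, true, false, false, false, true);
        System.out.println("ok");
    }
}
```

The point of the refactoring is that both call sites reduce to one call, so a future change to the sticky-bit rule happens in exactly one place.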
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043766#comment-14043766 ] Owen O'Malley commented on HDFS-6134: - {quote} I don’t see a previous -1 in any of the related JIRAs. {quote} I had consistently stated objections and some of them have been addressed, but the fundamentals have become clear through this jira. I am always hesitant to use a -1 and I certainly don't do so lightly. Through the discussion, my opinion is transparent encryption in HDFS is a *really* bad idea. Let's run through the case: The one claimed benefit of integrating encryption into HDFS is that the user doesn't need to change the URLs that they use. I believe this to be a *disadvantage* because it hides the fact that these files are encrypted. That said, a better approach if that is the desired goal is to create a *NEW* filter filesystem that the user can configure to respond to hdfs urls that does silent encryption. This imposes *NO* penalty on people who don't want encryption and does not require hacks to the FileSystem API. {quote} FileSystem will had a new create()/open() signature to support this, if you have access to the file but not the key, you can use the new signatures to copy files as per the usecase you are mentioning. {quote} This will break every backup application. Some of them, such as HAR and DistCp you can hack to handle HDFS as a special case, but this kind of special casing always comes back to haunt us as a project. Changing FileSystem API is a really bad idea and inducing more differences between the various implementations will create many more problems than you are trying to avoid. 
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043780#comment-14043780 ] Todd Lipcon commented on HDFS-6134: --- bq. The one claimed benefit of integrating encryption into HDFS is that the user doesn't need to change the URLs that they use. I believe this to be a disadvantage because it hides the fact that these files are encrypted This is the transparent part of the design, and it's billed as a positive feature in many products in the storage market. For example, from the NetApp Storage Encryption (NSE) [datasheet|http://www.jivesoftware.com/wp-content/uploads/2014/03/Datasheet-Encryption-at-rest.pdf]: {quote} While higher level SAN and NAS fabric encryption solutions provide more flexibility, they can also present a challenge to everyday operations. Data encrypted before it is sent to the storage module cannot be compressed, deduplicated, or scanned for viruses, and it might need to be decrypted before it can be replicated to a backup site or archived to tape. Contrast this with NSE, which transparently supports these NetApp® storage efficiency features. NSE can help you lower your overall storage costs, while preventing old data from being accessed if a drive is repurposed. {quote} The same advantages hold for HDFS -- if we add features such as transparent compression, it's crucial that the encryption be done _after_ compression. The other point that this datasheet makes is that transparent at-rest encryption acts as a backstop in case an administrator forgets to configure or misconfigures higher-level encryption. That is to say, users may still use encrypted file formats on top of HDFS using a scheme like you're proposing, but many regulations require that all data at rest is encrypted. Asking users to configure and use wrapper filesystems leaves it quite possible (even likely) that data will land on HDFS without being encrypted.
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5321) Clean up the HTTP-related configuration in HDFS
[ https://issues.apache.org/jira/browse/HDFS-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043779#comment-14043779 ] Haohui Mai commented on HDFS-5321: -- Hi [~atm], thanks for bringing this up. I understand your concerns on compatibility, but note that {{dfs.http.port}} and {{dfs.https.port}} are private configurations. They are not exposed in {{hdfs-default.xml}}. Since HDFS maintains no compatibility guarantees for private configurations, it should be okay to include this in minor releases. Clean up the HTTP-related configuration in HDFS --- Key: HDFS-5321 URL: https://issues.apache.org/jira/browse/HDFS-5321 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.4.0 Attachments: HDFS-5321.000.patch, HDFS-5321.001.patch Currently there are multiple configuration keys that control the ports that the NameNode and DataNode listen to, and the default ports that the hftp/webhdfs clients are connecting to. Below is a quick summary of these configuration: || Keys || Description || | dfs.namenode.http-address | The address that the namenode http server binds to | | dfs.namenode.https-address | The address that the namenode https server binds to | | dfs.http.port | The default port that the hftp/webhdfs client use to connect to the remote server| | dfs.https.port | The default port that the hsftp client use to connect to the remote server| I propose to deprecate dfs.http.port and dfs.https.port to avoid potential confusions (e.g., HDFS-5316). Note that this removes no functionality, since the users can specify ports in hftp / webhdfs URLs when they need to connect to HDFS servers with non-default ports. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-2856: Attachment: HDFS-2856.5.patch The test failures are unrelated. {{TestPipelinesFailover}} has been failing intermittently on other unrelated patches. {{TestBalancerWithSaslDataTransfer}} reruns tests from {{TestBalancer}} under a secure configuration, and {{TestBalancer}} has also experienced intermittent failures lately. However, reviewing logs from the test runs made me notice that {{MiniDFSCluster}} was printing a bogus warning about failure to bind to a privileged port, which isn't relevant when SASL is configured on DataTransferProtocol. This could cause confusion for people running the tests in the future, so I'd like to stop those log messages. I'm attaching patch v5 with a minor change in {{MiniDFSCluster}} to stifle the bogus log messages. Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.5.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an HMAC of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
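The nonce/HMAC exchange described in the issue can be sketched with the JDK's javax.crypto primitives. This is a hedged illustration of the idea only, not the actual DataTransferProtocol wire format: the server sends a fresh nonce, and the client proves possession of the shared secret by returning an HMAC of the nonce, so the secret itself never crosses the wire.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class NonceHmacSketch {
    // HMAC-SHA256 of the nonce under the shared secret (algorithm choice is
    // illustrative; the real protocol negotiates its own mechanism via SASL).
    static byte[] hmac(byte[] secret, byte[] nonce) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret, "HmacSHA256"));
        return mac.doFinal(nonce);
    }

    public static void main(String[] args) throws Exception {
        byte[] sharedSecret = "block-token-secret".getBytes(StandardCharsets.UTF_8);

        // Server side: generate a fresh random nonce per connection.
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce);

        // Client side: respond with HMAC(secret, nonce) instead of the secret.
        byte[] response = hmac(sharedSecret, nonce);

        // Server side: recompute and compare; a match proves key possession.
        assert Arrays.equals(response, hmac(sharedSecret, nonce));
        // A client holding the wrong secret fails the check.
        assert !Arrays.equals(response,
                hmac("wrong".getBytes(StandardCharsets.UTF_8), nonce));
        System.out.println("handshake ok");
    }
}
```

In production code the comparison should use a constant-time equality check (e.g. MessageDigest.isEqual) rather than Arrays.equals, to avoid timing side channels.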
[jira] [Commented] (HDFS-6387) HDFS CLI admin tool for creating deleting an encryption zone
[ https://issues.apache.org/jira/browse/HDFS-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043797#comment-14043797 ] Charles Lamb commented on HDFS-6387: Thanks for the review [~cmccabe]. The .004 patch fixes those two minor issues. I also noticed that {{CryptoAdmin.ListZonesCommand#getLongUsage()}} didn't need to create a {{TableListing}} so I removed that (effectively) dead code. HDFS CLI admin tool for creating deleting an encryption zone -- Key: HDFS-6387 URL: https://issues.apache.org/jira/browse/HDFS-6387 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6387.002.patch, HDFS-6387.003.patch, HDFS-6387.1.patch CLI admin tool to create/delete an encryption zone in HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6387) HDFS CLI admin tool for creating deleting an encryption zone
[ https://issues.apache.org/jira/browse/HDFS-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb resolved HDFS-6387. Resolution: Fixed Fix Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) Committed to fs-encryption. HDFS CLI admin tool for creating deleting an encryption zone -- Key: HDFS-6387 URL: https://issues.apache.org/jira/browse/HDFS-6387 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: HDFS-6387.002.patch, HDFS-6387.003.patch, HDFS-6387.1.patch CLI admin tool to create/delete an encryption zone in HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043839#comment-14043839 ] Owen O'Malley commented on HDFS-6134: - I'll also point out that I've provided a solution that doesn't change the HDFS core and still lets you use your hdfs urls with encryption... Finally, adding compression to the crypto file system would be a great addition and *still* not require any changes to HDFS or its API. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043836#comment-14043836 ] Owen O'Malley commented on HDFS-6134: - Todd, it is *still* transparent encryption if you use cfs:// instead of hdfs://. The important piece is that the application doesn't need to change to access the decrypted storage. My problem is that by refusing to layer the change over the storage layer, this jira is making many disruptive and unnecessary changes to the critical infrastructure and its API. NSE is whole-disk encryption and is equivalent to using dm-crypt to encrypt the block files. That level of encryption is always very transparent and is already available in HDFS without a code change. Aaron, I can't do a meeting tomorrow afternoon. How about tomorrow morning? Say 10am-noon? Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043846#comment-14043846 ] Alejandro Abdelnur commented on HDFS-6134: -- bq. Todd, it is still transparent encryption if you use cfs:// instead of hdfs://. Owen, that is NOT transparent. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6389) Rename restrictions for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043880#comment-14043880 ] Colin Patrick McCabe commented on HDFS-6389: OK, I re-checked this and there are some tests posted, but just in a separate patch file. It looks reasonable, but let's put it all into one patch as per usual. Thanks. Rename restrictions for encryption zones Key: HDFS-6389 URL: https://issues.apache.org/jira/browse/HDFS-6389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6389.001.patch, HDFS-6389.tests.patch Files and directories should not be moved in or out an encryption zone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043893#comment-14043893 ] Owen O'Malley commented on HDFS-6134: - {quote} Owen, that is NOT transparent. {quote} Transparent means that you shouldn't have to change your application code. Hacking HDFS to add encryption is transparent for one set of apps, but completely breaks others. Changing URLs requires no code changes to any apps. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6389) Rename restrictions for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043891#comment-14043891 ] Charles Lamb commented on HDFS-6389: bq. let's put it all into one patch as per usual. Yup. The only reason I made the exception this time is because the diffs for the tests were dependent on another non-committed patch (HDFS-6387). When I post the revised diffs, they'll of course be in one patch file. Rename restrictions for encryption zones Key: HDFS-6389 URL: https://issues.apache.org/jira/browse/HDFS-6389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6389.001.patch, HDFS-6389.tests.patch Files and directories should not be moved in or out an encryption zone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6569) OOB message can't be sent to the client when DataNode shuts down for upgrade
[ https://issues.apache.org/jira/browse/HDFS-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043915#comment-14043915 ] Brandon Li commented on HDFS-6569: -- The current code looks good logically, and it tries not to close streams before the OOB is sent. I think the problem is triggered by the NIO implementation. When the DataNode is shut down for restart, it interrupts all the DataXceiver threads. The NIO channels in NioInetPeer are bound to the threads doing the block receiving. If these threads are interrupted, the stream/channel is closed due to IO safety issues. So once a DataXceiver thread is interrupted, the OOB can rarely be sent before the NIO channel is closed automatically. One possible fix is to send the OOB message before interrupting the DataXceiver threads. Thoughts? OOB message can't be sent to the client when DataNode shuts down for upgrade Key: HDFS-6569 URL: https://issues.apache.org/jira/browse/HDFS-6569 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0 Reporter: Brandon Li The socket is closed too early, before the OOB message can be sent to the client, which causes the write pipeline failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
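The mechanism Brandon describes is standard NIO behavior: interrupting a thread blocked on an interruptible channel closes the channel and raises ClosedByInterruptException, so any OOB bytes must be written before the interrupt. A minimal stdlib-only demonstration of that behavior (this models the failure mode, not DataNode code):

```java
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;

public class InterruptClosesChannel {
    // Blocks a thread on a channel read, interrupts it, and reports what the
    // reader observed, modeling a DataXceiver interrupted during shutdown.
    static Class<?> interruptBlockedReader() throws Exception {
        Pipe pipe = Pipe.open();
        final Throwable[] seen = new Throwable[1];
        Thread reader = new Thread(() -> {
            try {
                pipe.source().read(ByteBuffer.allocate(64)); // blocks: no data
            } catch (Throwable t) {
                seen[0] = t;
            }
        });
        reader.start();
        Thread.sleep(200);   // let the reader block inside read()
        reader.interrupt();  // the shutdown path interrupts the thread...
        reader.join();
        return seen[0] == null ? null : seen[0].getClass();
    }

    public static void main(String[] args) throws Exception {
        // ...and the JVM closes the channel underneath it, so nothing more
        // can be sent on it. Hence: write the OOB response first, then interrupt.
        Class<?> observed = interruptBlockedReader();
        assert observed == ClosedByInterruptException.class;
        System.out.println("reader saw: " + observed.getSimpleName());
    }
}
```

This is why the proposed ordering (send OOB on each open stream, then interrupt the DataXceiver threads) works where the reverse ordering almost never does.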
[jira] [Updated] (HDFS-6595) Configure the maximum threads allowed for balancing on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6595: -- Component/s: balancer Priority: Minor (was: Major) Hadoop Flags: Reviewed +1 patch looks good. Configure the maximum threads allowed for balancing on datanodes Key: HDFS-6595 URL: https://issues.apache.org/jira/browse/HDFS-6595 Project: Hadoop HDFS Issue Type: Improvement Components: balancer, datanode Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Attachments: HDFS-6595.patch, HDFS-6595.patch Currently the datanode allows a max of 5 threads to be used for balancing. In some cases, it may make sense to use a different number of threads for the purpose of moving blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6595) Configure the maximum threads allowed for balancing on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6595: -- Resolution: Fixed Fix Version/s: 2.5.0 Status: Resolved (was: Patch Available) I have committed this. Thanks, Benoy! Configure the maximum threads allowed for balancing on datanodes Key: HDFS-6595 URL: https://issues.apache.org/jira/browse/HDFS-6595 Project: Hadoop HDFS Issue Type: Improvement Components: balancer, datanode Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6595.patch, HDFS-6595.patch Currently the datanode allows a max of 5 threads to be used for balancing. In some cases, it may make sense to use a different number of threads for the purpose of moving blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6595) Configure the maximum threads allowed for balancing on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043953#comment-14043953 ] Hudson commented on HDFS-6595: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5779 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5779/]) HDFS-6595. Allow the maximum threads for balancing on datanodes to be configurable. Contributed by Benoy Antony (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605565) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java Configure the maximum threads allowed for balancing on datanodes Key: HDFS-6595 URL: https://issues.apache.org/jira/browse/HDFS-6595 Project: Hadoop HDFS Issue Type: Improvement Components: balancer, datanode Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6595.patch, HDFS-6595.patch Currently the datanode allows a max of 5 threads to be used for balancing. In some cases, it may make sense to use a different number of threads for the purpose of moving blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
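With HDFS-6595 committed, the former hard-coded cap of 5 mover threads becomes a per-datanode setting. The key name below is an assumption based on the DFSConfigKeys change in this commit; verify it against your release's hdfs-default.xml before relying on it. A sketch of the hdfs-site.xml override:

```xml
<!-- hdfs-site.xml on each datanode: raise the balancer mover thread cap.
     Key name assumed; check DFSConfigKeys / hdfs-default.xml of your release. -->
<property>
  <name>dfs.datanode.balance.max.concurrent.moves</name>
  <value>20</value>
</property>
```

The datanode reads this at startup, so it must be set on the datanodes themselves (and they must be restarted), not only on the host running the balancer.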
[jira] [Commented] (HDFS-5546) race condition crashes hadoop ls -R when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043956#comment-14043956 ] Colin Patrick McCabe commented on HDFS-5546: I agree with a lot of the stuff that's been presented, but I also think our behavior should be consistent between {{ls /a1/b /a2/b}} and {{ls /a\{1,2\}/b}}, and right now I can't see a good way to achieve that if we catch IOE (since the globber does not catch IOE). On the other hand, if we catch FNF and continue if a top-level directory disappears on us, then we are making things more consistent, since the globber catches and ignores IOEs (when dealing with globs). bq. Colin Patrick McCabe shouldn't the globStatus() be out of scope for this JIRA? Maybe we should open another related JIRA? I'm not sure how the globber would report IOE other than throwing it. We'd have to return a list of {{Option<FileStatus, IOException>}} or something? It doesn't seem like the kind of change that could be made compatibly, since we'd need a new interface. So overall I would lean towards just catching FNF at the top level, like the earlier patch did, and maybe revisiting this later if we have better ideas about how to handle this in the globber as well. [~daryn], [~eddyxu], does that make sense? Or am I trying too hard to be consistent? :) race condition crashes hadoop ls -R when directories are moved/removed Key: HDFS-5546 URL: https://issues.apache.org/jira/browse/HDFS-5546 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Colin Patrick McCabe Assignee: Lei (Eddy) Xu Priority: Minor Fix For: 3.0.0 Attachments: HDFS-5546.1.patch, HDFS-5546.2.000.patch, HDFS-5546.2.001.patch, HDFS-5546.2.002.patch, HDFS-5546.2.003.patch, HDFS-5546.2.004.patch This seems to be a rare race condition where we have a sequence of events like this: 1. org.apache.hadoop.shell.Ls calls DFS#getFileStatus on directory D. 2. someone deletes or moves directory D 3.
org.apache.hadoop.shell.Ls calls PathData#getDirectoryContents(D), which calls DFS#listStatus(D). This throws FileNotFoundException. 4. ls command terminates with FNF -- This message was sent by Atlassian JIRA (v6.2#6252)
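The "catch FNF and continue" behavior discussed above can be sketched in miniature. This uses {{java.nio.file}} as a stand-in for Hadoop's FileSystem/PathData API; the class and method names are illustrative, not from the patch:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class FnfTolerantLs {
    // Recursively list entries under dir. If a directory vanishes between
    // the status check and the listing (the race described above), catch
    // NoSuchFileException and keep going instead of aborting the whole walk.
    public static List<Path> listRecursive(Path dir) {
        List<Path> out = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            for (Path p : (Iterable<Path>) entries::iterator) {
                out.add(p);
                if (Files.isDirectory(p)) {
                    out.addAll(listRecursive(p));
                }
            }
        } catch (NoSuchFileException e) {
            // Directory disappeared mid-walk: skip it, mirroring the
            // "catch FNF and continue" idea from the comment above.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

A vanished directory simply yields no children instead of crashing the whole `ls -R`.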
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043975#comment-14043975 ] Aaron T. Myers commented on HDFS-6134: -- bq. Aaron, I can't do a meeting tomorrow afternoon. How about tomorrow morning? Say 10am-noon? Sounds good. Here's the address of Cloudera's SF Office: 433 California Street, Floor 6 San Francisco, CA 94104 I'll post the remote meeting details later today on this JIRA once I get those figured out. See you tomorrow! Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
[ https://issues.apache.org/jira/browse/HDFS-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043993#comment-14043993 ] Kihwal Lee commented on HDFS-6602: -- Block report processing is actually okay. All block report processing goes through {{BlockManager#processReportedBlock()}}, and any report from the future is queued. It is {{delete()}} that causes this queue to be populated. After collecting all blocks to be invalidated, {{BlockManager#removeBlock()}} is called, which calls {{addToInvalidates()}}. If the NN is in standby, {{addToInvalidates()}} should not be called. PendingDeletionBlocks on SBN keeps increasing - Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
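The fix described above amounts to a guard on the invalidation path. The toy class below only illustrates the idea; the real logic lives in {{BlockManager}} and consults the HA state through other means, and all names here are illustrative:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class InvalidateGuardSketch {
    public enum HAState { ACTIVE, STANDBY }

    private final Queue<String> invalidateQueue = new ArrayDeque<>();
    private final HAState state;

    public InvalidateGuardSketch(HAState state) { this.state = state; }

    // Sketch of the fix: after delete() collects blocks to invalidate,
    // only enqueue them for deletion when this NameNode is active. A
    // standby must not populate invalidateBlocks, or PendingDeletionBlocks
    // grows without bound, as described in the issue.
    public void removeBlock(String blockId) {
        if (state == HAState.STANDBY) {
            return; // skip the addToInvalidates() equivalent on the standby
        }
        invalidateQueue.add(blockId); // stands in for addToInvalidates()
    }

    public int pendingDeletionBlocks() { return invalidateQueue.size(); }
}
```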
[jira] [Resolved] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
[ https://issues.apache.org/jira/browse/HDFS-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-6602. -- Resolution: Duplicate Assignee: Kihwal Lee It's already fixed by HDFS-6424! PendingDeletionBlocks on SBN keeps increasing - Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044008#comment-14044008 ] Hadoop QA commented on HDFS-2856: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652461/HDFS-2856.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7234//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7234//console This message is automatically generated. 
Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.5.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an HMAC of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
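The nonce idea in the description can be sketched with a standard HMAC. This is a minimal illustration assuming HMAC-SHA256 and illustrative names; it is not the protocol from the attached design documents:

```java
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class NonceHmacSketch {
    // Datanode side: generate a random nonce to challenge the client.
    public static byte[] newNonce() {
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    // Client side: prove knowledge of the shared block-token secret by
    // returning HMAC(secret, nonce) instead of the secret itself, so
    // nothing sensitive crosses the unencrypted connection.
    public static byte[] respond(byte[] sharedSecret, byte[] nonce) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
        return mac.doFinal(nonce);
    }

    // Datanode side: recompute the HMAC and compare with the response.
    public static boolean verify(byte[] sharedSecret, byte[] nonce, byte[] response)
            throws Exception {
        return Arrays.equals(respond(sharedSecret, nonce), response);
    }
}
```

Because only the HMAC of a fresh nonce travels over the wire, a listener learns neither the secret nor a replayable credential, which is what removes the need for a privileged port.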
[jira] [Commented] (HDFS-6560) Byte array native checksumming on DN side
[ https://issues.apache.org/jira/browse/HDFS-6560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044060#comment-14044060 ] Tsz Wo Nicholas Sze commented on HDFS-6560: --- For Java 7, NativeCrc32 with direct buffer is faster than zip.CRC32 for bytes-per-crc < 512 but slower than it for bytes-per-crc > 512. For bytes-per-crc == 512 (which is an important case), their performances are similar. {noformat}
java.version = 1.7.0_60
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.7.0_60-b19
java.vm.version = 24.60-b09
java.vm.vendor = Oracle Corporation
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.7
java.specification.version = 1.7
os.arch = x86_64
os.name = Mac OS X
os.version = 10.9.3
DATA_LENGTH = 67108864
TRIALS = 10
{noformat} Performance Table (bpc = bytes-per-crc; throughput in MB/sec; #T = #Threads)
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 32 | 1 | 237.7 | 789.8 | 232.3% | 1624.2 | 583.3% | 105.7% |
| 32 | 2 | 207.6 | 604.5 | 191.2% | 1608.3 | 674.8% | 166.1% |
| 32 | 4 | 179.8 | 609.8 | 239.2% | 1387.8 | 671.9% | 127.6% |
| 32 | 8 | 163.4 | 356.8 | 118.3% | 910.4 | 457.1% | 155.1% |
| 32 | 16 | 81.6 | 183.7 | 125.0% | 490.9 | 501.4% | 167.3% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 64 | 1 | 423.7 | 1027.0 | 142.4% | 1654.4 | 290.4% | 61.1% |
| 64 | 2 | 417.7 | 1031.8 | 147.0% | 1640.1 | 292.7% | 59.0% |
| 64 | 4 | 366.0 | 693.8 | 89.5% | 1381.7 | 277.5% | 99.2% |
| 64 | 8 | 280.2 | 443.5 | 58.3% | 1046.8 | 273.5% | 136.0% |
| 64 | 16 | 143.3 | 233.0 | 62.6% | 556.3 | 288.2% | 138.8% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 128 | 1 | 716.1 | 1229.9 | 71.7% | 1628.6 | 127.4% | 32.4% |
| 128 | 2 | 703.0 | 1221.4 | 73.7% | 1610.0 | 129.0% | 31.8% |
| 128 | 4 | 708.1 | 998.7 | 41.0% | 1408.1 | 98.8% | 41.0% |
| 128 | 8 | 503.3 | 583.7 | 16.0% | 1059.4 | 110.5% | 81.5% |
| 128 | 16 | 259.6 | 316.4 | 21.9% | 610.3 | 135.2% | 92.9% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 256 | 1 | 1217.4 | 1346.6 | 10.6% | 1554.3 | 27.7% | 15.4% |
| 256 | 2 | 1186.3 | 1339.0 | 12.9% | 1556.6 | 31.2% | 16.3% |
| 256 | 4 | 1094.9 | 1102.9 | 0.7% | 1389.3 | 26.9% | 26.0% |
| 256 | 8 | 768.3 | 656.8 | -14.5% | 1109.4 | 44.4% | 68.9% |
| 256 | 16 | 394.6 | 358.7 | -9.1% | 597.8 | 51.5% | 66.7% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 512 | 1 | 1632.0 | 1391.3 | -14.7% | 1548.1 | -5.1% | 11.3% |
| 512 | 2 | 1608.9 | 1377.8 | -14.4% | 1550.1 | -3.7% | 12.5% |
| 512 | 4 | 1465.2 | 1092.6 | -25.4% | 1420.8 | -3.0% | 30.0% |
| 512 | 8 | 1027.7 | 721.7 | -29.8% | 1124.4 | 9.4% | 55.8% |
| 512 | 16 | 551.6 | 397.9 | -27.9% | 628.2 | 13.9% | 57.9% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 1024 | 1 | 1980.3 | 1411.7 | -28.7% | 1570.7 | -20.7% | 11.3% |
| 1024 | 2 | 1909.4 | 1396.7 | -26.9% | 1534.7 | -19.6% | 9.9% |
| 1024 | 4 | 1747.4 | 1159.9 | -33.6% | 1426.2 | -18.4% | 23.0% |
| 1024 | 8 | 1245.6 | 752.7 | -39.6% | 1149.8 | -7.7% | 52.8% |
| 1024 | 16 | 660.6 | 380.2 | -42.4% | 618.1 | -6.4% | 62.6% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 2048 | 1 | 2140.4 | 1390.2 | -35.0% | 1570.3 | -26.6% | 13.0% |
| 2048 | 2 | 2126.5 | 1374.5 | -35.4% | 1538.9 | -27.6% | 12.0% |
| 2048 | 4 | 1769.0 | 1132.9 | -36.0% | 1411.5 | -20.2% | 24.6% |
| 2048 | 8 | 1358.6 | 754.8 | -44.4% | 1207.0 | -11.2% | 59.9% |
| 2048 | 16 | 749.4 | 394.4 | -47.4% | 639.9 | -14.6% | 62.2% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 4096 | 1 | 2325.5 | 1427.0 | -38.6% | 1531.2 | -34.2% | 7.3% |
| 4096 | 2 | 2199.7 | 1375.1 | -37.5% | 1524.4 | -30.7% | 10.9% |
| 4096 | 4 | 1927.3 | 1103.8 | -42.7% | 1412.7 | -26.7% | 28.0% |
| 4096 | 8 | 1427.1 | 773.2 | -45.8% | 1206.2 | -15.5% | 56.0% |
| 4096 | 16 | 761.0 | 401.3 | -47.3% | 632.6 | -16.9% | 57.6% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 8192 | 1 | 2364.7 | 1431.6 | -39.5% | 1566.2 | -33.8%
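The benchmark methodology above (checksumming a buffer in bytes-per-crc-sized chunks and reporting MB/sec) can be roughly sketched with {{java.util.zip.CRC32}}. This single-threaded sketch is illustrative only; it is not the harness that produced the table:

```java
import java.util.zip.CRC32;

public class CrcChunkBench {
    // Checksums `data` in chunks of bytesPerCrc (the "bpc" column above),
    // the way a DataNode computes one CRC per fixed-size chunk, and
    // returns the resulting throughput in MB/sec.
    public static double mbPerSec(byte[] data, int bytesPerCrc) {
        long start = System.nanoTime();
        CRC32 crc = new CRC32();
        for (int off = 0; off < data.length; off += bytesPerCrc) {
            crc.reset();
            crc.update(data, off, Math.min(bytesPerCrc, data.length - off));
        }
        double secs = (System.nanoTime() - start) / 1e9;
        return data.length / (1024.0 * 1024.0) / secs;
    }

    public static void main(String[] args) {
        byte[] data = new byte[8 << 20]; // 8 MB buffer
        for (int bpc : new int[] {32, 512, 4096}) {
            System.out.printf("bpc=%4d  %.1f MB/sec%n", bpc, mbPerSec(data, bpc));
        }
    }
}
```

The per-chunk reset/update overhead is what makes small bytes-per-crc values slower, which matches the trend visible in the Zip column.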
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044087#comment-14044087 ] Aaron T. Myers commented on HDFS-6134: -- Here's the WebEx information for those who are planning on joining remotely tomorrow from 10am-noon Pacific Time: {noformat} --- To start or join the online meeting --- Go to https://cloudera.webex.com/cloudera/j.php?MTID=me67e0b50829b1dc39077ac5ca323038a --- Audio Only conference information --- Call-in toll number (US/Canada): 1-650-479-3208 Access code:627 373 149 Global call-in numbers: https://cloudera.webex.com/cloudera/globalcallin.php?serviceType=MCED=321024932tollFree=0 {noformat} Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044093#comment-14044093 ] stack commented on HDFS-6591: - Nice test and added metrics, [~xieliang007]. Looks good on first pass. Let me give it another pass. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
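The reported spinning comes from how a retry loop waits on its {{CompletionService}}. A minimal sketch (illustrative names, not the DFSInputStream code) contrasting a non-blocking {{poll()}} busy-spin with a bounded blocking {{poll(timeout)}}:

```java
import java.util.concurrent.*;

public class HedgedPollSketch {
    // Waits for whichever submitted read finishes first and counts loop
    // iterations. With timeoutMs == 0 the loop calls the non-blocking
    // poll() and spins (the tens-of-thousands-of-iterations symptom);
    // with a timeout, poll(timeout) parks the thread between checks, so
    // the loop runs only a handful of times.
    public static int iterationsUntilResult(CompletionService<String> svc, long timeoutMs)
            throws InterruptedException {
        int loops = 0;
        while (true) {
            loops++;
            Future<String> done = (timeoutMs == 0)
                    ? svc.poll()                                   // busy spin
                    : svc.poll(timeoutMs, TimeUnit.MILLISECONDS);  // blocking wait
            if (done != null) {
                return loops;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CompletionService<String> svc = new ExecutorCompletionService<>(pool);
        svc.submit(() -> { Thread.sleep(50); return "block data"; });
        System.out.println("loops with blocking poll: " + iterationsUntilResult(svc, 10));
        pool.shutdown();
    }
}
```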
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044109#comment-14044109 ] Owen O'Malley commented on HDFS-6134: - Any chance for the PA office? Otherwise I'll be dialing in. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044124#comment-14044124 ] Aaron T. Myers commented on HDFS-6134: -- Unfortunately not, all of Tucu, Andrew, Charlie, Colin, Todd, and I are all based out of the SF office and it's quite a hike for us to get down there. Sure you can't come up to SF? I'll buy you lunch after the meeting. :) Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6389) Rename restrictions for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6389: --- Attachment: HDFS-6389.002.patch [~cmccabe], Thanks for the review. My intent of putting it in FSN was to have the code fail sooner rather than after both the FSN and FSD locks were taken, but longer term I agree that it should be moved down to FSD from FSN. The revised diffs are the .002 version of the file. Rename restrictions for encryption zones Key: HDFS-6389 URL: https://issues.apache.org/jira/browse/HDFS-6389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6389.001.patch, HDFS-6389.002.patch, HDFS-6389.tests.patch Files and directories should not be moved in or out an encryption zone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044172#comment-14044172 ] Akira AJISAKA commented on HDFS-6591: - The fix looks nice to me. In TestPread.java, {code} } isHedgedRead = true; } {code} Would you please create an {{@Before}} method and initialize the variables there, instead of setting them at the end of the {{@Test}} method as above? Minor nits: 1. In DFSInputStream.java:1107, {code} Future<ByteBuffer> future = null; {code} Now that {{future}} is not used in the else clause, would you move the declaration into the try-catch clause? 2. There is a trailing white space in {code} +CompletionService<ByteBuffer> hedgedService = {code} while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
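The reviewer's suggestion, restated as a plain-Java sketch without a JUnit dependency (all names are illustrative): reset shared state in a setUp() method, the way a JUnit {{@Before}} method would run before each test, instead of restoring it at the end of each test:

```java
public class BeforeStyleSketch {
    // A shared flag like TestPread's isHedgedRead. Resetting it in setUp()
    // before every test means no test has to remember to restore it, and
    // state cannot leak from one test into the next.
    static boolean isHedgedRead;

    static void setUp() {            // stands in for a JUnit @Before method
        isHedgedRead = false;
    }

    static boolean runHedgedReadTest() {
        isHedgedRead = true;         // the test flips the flag...
        return isHedgedRead;         // ...and no longer needs to restore it
    }

    static boolean runPlainReadTest() {
        return isHedgedRead;         // sees a clean value thanks to setUp()
    }
}
```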
[jira] [Updated] (HDFS-5369) Support negative caching of user-group mapping
[ https://issues.apache.org/jira/browse/HDFS-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-5369: Attachment: HDFS-5369.000.patch This patch re-enables the negative cache behavior in user-group mapping, so that it does not result in _*long-lasting*_, frequent retries when the user/group resolution service has transient issues. The difference between this patch and HADOOP-8088 is that this patch adds another timeout for negatively cached items. Thus it should be able to differentiate the expiration times for the normal case and the negatively cached case. It also reduces the error messages generated from the Ldap and Shell based GroupsMappings. It would be great to see whether this patch fits the different cases from [~andrew.wang], [~vinayrpet] and [~kihwal]. All feedback is welcome :) Support negative caching of user-group mapping -- Key: HDFS-5369 URL: https://issues.apache.org/jira/browse/HDFS-5369 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: Andrew Wang Attachments: HDFS-5369.000.patch We've seen a situation at a couple of our customers where interactions from an unknown user lead to a high rate of group mapping calls. In one case, this was happening at a rate of 450 calls per second with the shell-based group mapping, enough to severely impact overall namenode performance and also leading to large amounts of log spam (it prints a stack trace each time). Let's consider negative caching of group mapping, as well as quashing the rate of this log message. -- This message was sent by Atlassian JIRA (v6.2#6252)
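The two-timeout idea above can be sketched as follows. The names and API are illustrative, not the actual {{Groups}} / group-mapping code in the patch: cache successful lookups with one TTL, and cache the *absence* of a user with a separate (typically shorter) TTL, so an unknown user cannot trigger hundreds of lookups per second:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class NegativeCacheSketch {
    private static final class Entry {
        final List<String> groups;   // null marks a negative entry
        final long expiresAtMs;
        Entry(List<String> groups, long expiresAtMs) {
            this.groups = groups;
            this.expiresAtMs = expiresAtMs;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final long positiveTtlMs, negativeTtlMs;
    private final Function<String, List<String>> resolver;
    public long resolverCalls; // exposed so the effect is observable

    public NegativeCacheSketch(long positiveTtlMs, long negativeTtlMs,
                               Function<String, List<String>> resolver) {
        this.positiveTtlMs = positiveTtlMs;
        this.negativeTtlMs = negativeTtlMs;
        this.resolver = resolver;
    }

    public List<String> getGroups(String user) {
        long now = System.currentTimeMillis();
        Entry e = cache.get(user);
        if (e != null && now < e.expiresAtMs) {
            return e.groups; // may be null: a cached "no such user"
        }
        resolverCalls++;
        List<String> groups = resolver.apply(user); // null on failure
        long ttl = (groups == null) ? negativeTtlMs : positiveTtlMs;
        cache.put(user, new Entry(groups, now + ttl));
        return groups;
    }
}
```

Keeping the negative TTL separate lets operators expire "unknown user" answers quickly without shortening the lifetime of good entries.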
[jira] [Commented] (HDFS-5369) Support negative caching of user-group mapping
[ https://issues.apache.org/jira/browse/HDFS-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044201#comment-14044201 ] Lei (Eddy) Xu commented on HDFS-5369: - [~andrew.wang] Thanks for your comments and for pointing to the issues raised in HADOOP-8088. I would consider this patch a reference implementation, so it has not yet addressed the error handling of a transient error. I will address it, as well as SLAs, after getting some input. Moreover, the default value for the negative cache timeout is still not clear to me. It might need more field data to choose an appropriate timeout value here. Support negative caching of user-group mapping -- Key: HDFS-5369 URL: https://issues.apache.org/jira/browse/HDFS-5369 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: Andrew Wang Attachments: HDFS-5369.000.patch We've seen a situation at a couple of our customers where interactions from an unknown user lead to a high rate of group mapping calls. In one case, this was happening at a rate of 450 calls per second with the shell-based group mapping, enough to severely impact overall namenode performance and also leading to large amounts of log spam (it prints a stack trace each time). Let's consider negative caching of group mapping, as well as quashing the rate of this log message. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5321) Clean up the HTTP-related configuration in HDFS
[ https://issues.apache.org/jira/browse/HDFS-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044254#comment-14044254 ] Aaron T. Myers commented on HDFS-5321: -- I very much disagree with the notion of private configurations. To my knowledge we've never made such a distinction, and if we have been then it certainly should have been called out more explicitly for each individual setting than the mere absence of them from {{hdfs-default.xml}}. bq. Since HDFS maintains no compatibility guarantees for private configurations, it should be okay to include this in minor releases. Where are you concluding this from? Our compatibility guide makes no mention of it, or the concept of private configurations at all: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html I think we should seriously consider reverting this change. The stated benefits seem quite minor, these conf settings have never been deprecated properly using DeprecationDelta (meaning users have never seen printed warnings about the deprecation), and this is a clearly incompatible change that has the potential to break existing applications as-written. Clean up the HTTP-related configuration in HDFS --- Key: HDFS-5321 URL: https://issues.apache.org/jira/browse/HDFS-5321 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.4.0 Attachments: HDFS-5321.000.patch, HDFS-5321.001.patch Currently there are multiple configuration keys that control the ports that the NameNode and DataNode listen to, and the default ports that the hftp/webhdfs clients are connecting to. 
Below is a quick summary of these configurations:
|| Keys || Description ||
| dfs.namenode.http-address | The address that the namenode http server binds to |
| dfs.namenode.https-address | The address that the namenode https server binds to |
| dfs.http.port | The default port that the hftp/webhdfs clients use to connect to the remote server |
| dfs.https.port | The default port that the hsftp client uses to connect to the remote server |
I propose to deprecate dfs.http.port and dfs.https.port to avoid potential confusion (e.g., HDFS-5316). Note that this removes no functionality, since users can specify ports in hftp / webhdfs URLs when they need to connect to HDFS servers with non-default ports. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5321) Clean up the HTTP-related configuration in HDFS
[ https://issues.apache.org/jira/browse/HDFS-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044276#comment-14044276 ] Haohui Mai commented on HDFS-5321: -- Points taken. I'm okay with putting these two configurations back into branch-2, but it looks to me that it requires more work than simply reverting the changes. Can you please create a jira for this task? Clean up the HTTP-related configuration in HDFS --- Key: HDFS-5321 URL: https://issues.apache.org/jira/browse/HDFS-5321 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.4.0 Attachments: HDFS-5321.000.patch, HDFS-5321.001.patch Currently there are multiple configuration keys that control the ports that the NameNode and DataNode listen to, and the default ports that the hftp/webhdfs clients are connecting to. Below is a quick summary of these configurations:
|| Keys || Description ||
| dfs.namenode.http-address | The address that the namenode http server binds to |
| dfs.namenode.https-address | The address that the namenode https server binds to |
| dfs.http.port | The default port that the hftp/webhdfs clients use to connect to the remote server |
| dfs.https.port | The default port that the hsftp client uses to connect to the remote server |
I propose to deprecate dfs.http.port and dfs.https.port to avoid potential confusion (e.g., HDFS-5316). Note that this removes no functionality, since users can specify ports in hftp / webhdfs URLs when they need to connect to HDFS servers with non-default ports. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044321#comment-14044321 ] Liang Xie commented on HDFS-6591: - The attached v2 should address the above comments. Thank you, [~stack] and [~ajisakaa], for reviewing! while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591-v2.txt, HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: HDFS-6591-v2.txt while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591-v2.txt, HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)