[jira] [Created] (HADOOP-10779) Generalize DFS_PERMISSIONS_SUPERUSERGROUP_KEY for any HCFS

2014-07-03 Thread Martin Bukatovic (JIRA)
Martin Bukatovic created HADOOP-10779:
-

 Summary: Generalize DFS_PERMISSIONS_SUPERUSERGROUP_KEY for any HCFS
 Key: HADOOP-10779
 URL: https://issues.apache.org/jira/browse/HADOOP-10779
 Project: Hadoop Common
  Issue Type: Wish
  Components: fs
Reporter: Martin Bukatovic
Priority: Minor


HDFS has the configuration option {{dfs.permissions.superusergroup}}, stored in
the {{hdfs-site.xml}} configuration file:

{noformat}
<property>
  <name>dfs.permissions.superusergroup</name>
  <value>supergroup</value>
  <description>The name of the group of super-users.</description>
</property>
{noformat}

Since we have the option to use alternative Hadoop-compatible filesystems (HCFS), there is
a question of how to specify a supergroup in that case.

E.g. would it make sense to introduce an HCFS option for this in, say, {{core-site.xml}},
as shown below?

{noformat}
<property>
  <name>hcfs.permissions.superusergroup</name>
  <value>${dfs.permissions.superusergroup}</value>
  <description>The name of the group of super-users.</description>
</property>
{noformat}

Or would you solve it in a different way? I would like to at least declare 
a recommended approach for alternative Hadoop filesystems to follow.
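
For illustration, a filesystem or service could resolve such a key with a fall-back chain. A minimal sketch, assuming the proposed {{hcfs.permissions.superusergroup}} key (not an existing Hadoop property) and a hypothetical {{SupergroupResolver}} helper:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class SupergroupResolver {
  // Proposed key from this issue -- hypothetical, not yet defined anywhere in Hadoop.
  static final String HCFS_SUPERUSERGROUP_KEY = "hcfs.permissions.superusergroup";
  // Existing HDFS key, used here as the fall-back.
  static final String DFS_SUPERUSERGROUP_KEY = "dfs.permissions.superusergroup";

  /** Resolve the super-user group: HCFS key first, then the HDFS key, then "supergroup". */
  public static String superUserGroup(Configuration conf) {
    String dfsGroup = conf.get(DFS_SUPERUSERGROUP_KEY, "supergroup");
    return conf.get(HCFS_SUPERUSERGROUP_KEY, dfsGroup);
  }
}
{code}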




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10392) Use FileSystem#makeQualified(Path) instead of Path#makeQualified(FileSystem)

2014-07-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051186#comment-14051186
 ] 

Hadoop QA commented on HADOOP-10392:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12644781/HADOOP-10392.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 24 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-mapreduce-project/hadoop-mapreduce-examples 
hadoop-tools/hadoop-archives hadoop-tools/hadoop-extras 
hadoop-tools/hadoop-gridmix hadoop-tools/hadoop-openstack 
hadoop-tools/hadoop-rumen hadoop-tools/hadoop-streaming:

org.apache.hadoop.mapred.pipes.TestPipeApplication

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4204//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4204//console

This message is automatically generated.

> Use FileSystem#makeQualified(Path) instead of Path#makeQualified(FileSystem)
> 
>
> Key: HADOOP-10392
> URL: https://issues.apache.org/jira/browse/HADOOP-10392
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: 2.3.0
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Minor
>  Labels: newbie
> Attachments: HADOOP-10392.2.patch, HADOOP-10392.3.patch, 
> HADOOP-10392.4.patch, HADOOP-10392.4.patch, HADOOP-10392.patch
>
>
> There are some methods calling Path.makeQualified(FileSystem), which causes a 
> javac warning.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-07-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051191#comment-14051191
 ] 

Steve Loughran commented on HADOOP-9361:



Andrew: thanks, will merge it in today

Juan: thanks for the testing. The entire Swift test suite is skipped if there's 
no auth-keys file, though we could migrate that to the 
contract-test-options.xml file. The reason for that policy is that 
# some of the JUnit 3 test suites that are subclassed for hadoop-common-test 
aren't skippable (JUnit 3, you see); this is why in Hadoop Common the S3 & FTP 
tests don't start with Test*. While the contract tests are designed to be 
self-skipping, and so logged in test reports, I left the JUnit 3 stuff with a 
Test profile: you can't really test the Swift client without the settings, 
except for some minor unit tests.

Jay: tighter exceptions provide more information to clients, and let you 
explicitly catch by type in your code, e.g. {{catch (EOFException e)}}. General 
IOExceptions with text have to be caught as IOE and then tested, and are 
incredibly brittle to changes in the text. That's why I didn't rename the text messages 
of exceptions in the common filesystems, even when I tightened their class: 
we don't know which callers are searching for that text. 

Whenever you can, use explicit types. I also recommend using constants for 
the message text, constants that tests can look for, and in those tests use 
{{Exception.toString().contains()}} as the check rather than equality, so that if more 
details are added the test still works. 
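
A minimal sketch of that pattern; the names here ({{SeekPastEofExample}}, {{NEGATIVE_SEEK}}) are made up for illustration, not the actual constants in the filesystem code:

{code:java}
import java.io.EOFException;
import java.io.IOException;

public class SeekPastEofExample {
  // Constant for the message text, so tests can look for it without exact string equality.
  static final String NEGATIVE_SEEK = "Cannot seek to a negative offset";

  static void seek(long pos) throws IOException {
    if (pos < 0) {
      // Tight exception type: callers can catch EOFException explicitly.
      throw new EOFException(NEGATIVE_SEEK + ": " + pos);
    }
    // ... perform the actual seek ...
  }

  public static void main(String[] args) throws IOException {
    try {
      seek(-1);
    } catch (EOFException e) {
      // In a test, check with contains(), not equals(), so extra detail doesn't break it.
      if (!e.toString().contains(NEGATIVE_SEEK)) {
        throw new AssertionError("unexpected message: " + e);
      }
    }
  }
}
{code}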

> Strictly define the expected behavior of filesystem APIs and write tests to 
> verify compliance
> -
>
> Key: HADOOP-9361
> URL: https://issues.apache.org/jira/browse/HADOOP-9361
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, test
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, 
> HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, 
> HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, 
> HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, 
> HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, 
> HADOOP-9361-016.patch, HADOOP-9361-017.patch, HADOOP-9361-018.patch, 
> HADOOP-9361.awang-addendum.patch
>
>
> {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while 
> HDFS gets tested downstream, other filesystems, such as blobstore bindings, 
> don't.
> The only tests that are common are those of {{FileSystemContractTestBase}}, 
> which HADOOP-9258 shows is incomplete.
> I propose 
> # writing more tests which clarify expected behavior
> # testing operations in the interface being in their own JUnit4 test classes, 
> instead of one big test suite. 
> # Having each FS declare via a properties file what behaviors they offer, 
> such as atomic-rename, atomic-delete, umask, immediate-consistency -test 
> methods can downgrade to skipped test cases if a feature is missing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10778) Use NativeCrc32 only if it is faster

2014-07-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051197#comment-14051197
 ] 

Steve Loughran commented on HADOOP-10778:
-

This is interesting - the speedup may depend on the CPU as well as other 
factors. 

Maybe the number "512" could be provided by the NativeCRC code itself, so it 
can make that decision based on what it knows about the platform ... the pending ARM 
patch could supply a different number than the x86 parts, etc.
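
A minimal sketch of that idea, assuming the native layer could report its own crossover point; the names ({{NativeCrcPolicy}}, {{nativeFasterUpToBytes}}) are hypothetical, not the existing NativeCrc32 API:

{code:java}
public class NativeCrcPolicy {
  /**
   * Hypothetical value reported by the native library itself: the largest
   * bytesPerChecksum for which the native CRC is still faster than
   * java.util.zip.CRC32 on this CPU (e.g. 512 on current x86 parts,
   * possibly something else once the pending ARM patch lands).
   */
  static int nativeFasterUpToBytes() {
    return 512; // placeholder; would come from the JNI layer
  }

  /** Decide whether to take the native CRC path for a given bytesPerChecksum. */
  static boolean useNativeCrc(int bytesPerChecksum, boolean nativeLoaded) {
    return nativeLoaded && bytesPerChecksum <= nativeFasterUpToBytes();
  }
}
{code}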

> Use NativeCrc32 only if it is faster
> 
>
> Key: HADOOP-10778
> URL: https://issues.apache.org/jira/browse/HADOOP-10778
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: util
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: c10778_20140702.patch
>
>
> From the benchmark post in [this 
> comment|https://issues.apache.org/jira/browse/HDFS-6560?focusedCommentId=14044060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14044060],
>  NativeCrc32 is slower than java.util.zip.CRC32 for Java 7 and above when 
> bytesPerChecksum > 512.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10720) KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API

2014-07-03 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated HADOOP-10720:
-

Attachment: HADOOP-10720.2.patch

[~tucu00], Updated with your suggestions. Thanks for the pointer on using Guava 
{{Cache}} instead of {{ConcurrentHashMap}}. 

bq. KMSClientProvider.java, keyQueueFiller runnable, the for loop should clone 
the keyQueues.entry(), even if the Map is a concurrent one.
Cloning the Entry object does not seem to be possible, so I made a copy of the 
Map before iterating. If the number of keys is really large, this might take a 
slight perf hit, I guess.
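
A generic illustration of the copy-before-iterate approach, with invented names ({{QueueFiller}}, {{fillQueues}}); this is only a sketch, not the actual KMSClientProvider code:

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;

public class QueueFiller {
  private final Map<String, Queue<byte[]>> keyQueues = new ConcurrentHashMap<>();

  /** Refill each key's queue, iterating over a snapshot of the live map. */
  void fillQueues(int lowWatermark) {
    // Copy the map (not the individual entries) so the filler sees a stable view
    // even if keys are added or removed concurrently. With very many keys the
    // copy itself has a small cost, as noted above.
    Map<String, Queue<byte[]>> snapshot = new HashMap<>(keyQueues);
    for (Map.Entry<String, Queue<byte[]>> e : snapshot.entrySet()) {
      while (e.getValue().size() < lowWatermark) {
        e.getValue().add(generateEncryptedKey(e.getKey()));
      }
    }
  }

  private byte[] generateEncryptedKey(String keyName) {
    return new byte[16]; // stand-in for a call to the KMS
  }
}
{code}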

> KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API
> ---
>
> Key: HADOOP-10720
> URL: https://issues.apache.org/jira/browse/HADOOP-10720
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: COMBO.patch, COMBO.patch, COMBO.patch, COMBO.patch, 
> COMBO.patch, HADOOP-10720.1.patch, HADOOP-10720.2.patch, HADOOP-10720.patch, 
> HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch
>
>
> KMS client/server should implement support for generating encrypted keys and 
> decrypting them via the REST API being introduced by HADOOP-10719.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10650) Add ability to specify a reverse ACL (black list) of users and groups

2014-07-03 Thread Benoy Antony (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051258#comment-14051258
 ] 

Benoy Antony commented on HADOOP-10650:
---

Hi [~daryn], could you please review this patch?


> Add ability to specify a reverse ACL (black list) of users and groups
> -
>
> Key: HADOOP-10650
> URL: https://issues.apache.org/jira/browse/HADOOP-10650
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Reporter: Benoy Antony
>Assignee: Benoy Antony
> Attachments: HADOOP-10650.patch, HADOOP-10650.patch
>
>
> Currently, it is possible to define an ACL (users and groups) for a service. 
> To temporarily remove authorization for a set of users, the administrator needs 
> to remove the users from the specific group, and this may be a lengthy process 
> (update LDAP groups, flush caches on machines).
>  If there is a facility to define a reverse ACL for services, then the 
> administrator can disable users by specifying them in the reverse ACL. In 
> other words, one can specify a whitelist of users and groups as well as a 
> blacklist of users and groups. 
> One can also specify a default blacklist to disable the users from accessing 
> any service.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10774) Update KerberosTestUtils for hadoop-auth tests when using IBM Java

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-10774:


Environment: AIX

> Update KerberosTestUtils for hadoop-auth tests when using IBM Java
> --
>
> Key: HADOOP-10774
> URL: https://issues.apache.org/jira/browse/HADOOP-10774
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: AIX
>Reporter: sangamesh
> Attachments: HADOOP-10774.patch
>
>
> There are two issues when IBM Java is used to run the hadoop-auth tests.
>  
> 1) Bad JAAS configuration: unrecognized option: isInitiator
> 2) Cannot retrieve key from keytab HTTP/localh...@example.com
> #1 is caused because isInitiator isn't defined when we use IBM Java. There is 
> already a defect in JIRA for this: https://issues.apache.org/jira/browse/SENTRY-169
> But we need to apply it to KerberosTestUtils.java for some tests in 
> hadoop-auth to pass.
> #2 is caused because, for IBM_JAVA, the keytab file must be an absolute path with 
> file:// as the prefix for the useKeytab option, but the file path is relative. 
> This change will work with both openjdk & IBM_JAVA. 
> The attached patch resolves all failures that happen when we use IBM Java.
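
A rough sketch of the kind of conditional option map this implies; the {{KerberosOptionsSketch}} class and the exact option names are assumptions based on the description above, not the actual KerberosTestUtils code:

{code:java}
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class KerberosOptionsSketch {
  // Vendor check of the kind Hadoop uses to detect IBM Java.
  static final boolean IBM_JAVA = System.getProperty("java.vendor").contains("IBM");

  /** Build JAAS keytab options; per the description, IBM Java has no isInitiator
   *  option and wants useKeytab as an absolute file:// path. */
  static Map<String, String> keytabOptions(String principal, File keytab) {
    Map<String, String> options = new HashMap<>();
    options.put("principal", principal);
    if (IBM_JAVA) {
      options.put("useKeytab", "file://" + keytab.getAbsolutePath());
    } else {
      options.put("useKeyTab", "true");
      options.put("keyTab", keytab.getAbsolutePath());
      options.put("isInitiator", "true");
    }
    return options;
  }
}
{code}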



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10772) Generating RPMs for common, hdfs, httpfs, mapreduce, yarn and tools

2014-07-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051285#comment-14051285
 ] 

Steve Loughran commented on HADOOP-10772:
-

Eric: we had RPMs in the past, but they were under-maintained, not tested, and had 
scripts that weren't up to date. What we get from Bigtop isn't just the RPM 
packaging, but the tests of the RPMs, the init.d scripts, etc. 

If Bigtop is x86-only, that's a bug in Bigtop.

> Generating RPMs for common, hdfs, httpfs, mapreduce, yarn and tools 
> -
>
> Key: HADOOP-10772
> URL: https://issues.apache.org/jira/browse/HADOOP-10772
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Jinghui Wang
>Assignee: Jinghui Wang
> Attachments: HADOOP-10772.patch
>
>
> Generating RPMs for hadoop-common, hadoop-hdfs, hadoop-hdfs-httpfs, 
> hadoop-mapreduce, hadoop-yarn-project and hadoop-tools-dist with the dist build 
> profile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10774) Update KerberosTestUtils for hadoop-auth tests when using IBM Java

2014-07-03 Thread sangamesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sangamesh updated HADOOP-10774:
---

Environment: 
AIX
OS: RHEL (64 bit), Ubuntu (64bit)

  was:AIX


> Update KerberosTestUtils for hadoop-auth tests when using IBM Java
> --
>
> Key: HADOOP-10774
> URL: https://issues.apache.org/jira/browse/HADOOP-10774
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: AIX
> OS: RHEL (64 bit), Ubuntu (64bit)
>Reporter: sangamesh
> Attachments: HADOOP-10774.patch
>
>
> There are two issues when IBM Java is used to run the hadoop-auth tests.
>  
> 1) Bad JAAS configuration: unrecognized option: isInitiator
> 2) Cannot retrieve key from keytab HTTP/localh...@example.com
> #1 is caused because isInitiator isn't defined when we use IBM Java. There is 
> already a defect in JIRA for this: https://issues.apache.org/jira/browse/SENTRY-169
> But we need to apply it to KerberosTestUtils.java for some tests in 
> hadoop-auth to pass.
> #2 is caused because, for IBM_JAVA, the keytab file must be an absolute path with 
> file:// as the prefix for the useKeytab option, but the file path is relative. 
> This change will work with both openjdk & IBM_JAVA. 
> The attached patch resolves all failures that happen when we use IBM Java.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10774) Update KerberosTestUtils for hadoop-auth tests when using IBM Java

2014-07-03 Thread sangamesh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sangamesh updated HADOOP-10774:
---

Environment: 
AIX
RHEL (64 bit), Ubuntu (64bit)

  was:
AIX
OS: RHEL (64 bit), Ubuntu (64bit)


> Update KerberosTestUtils for hadoop-auth tests when using IBM Java
> --
>
> Key: HADOOP-10774
> URL: https://issues.apache.org/jira/browse/HADOOP-10774
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.4.1
> Environment: AIX
> RHEL (64 bit), Ubuntu (64bit)
>Reporter: sangamesh
> Attachments: HADOOP-10774.patch
>
>
> There are two issues when IBM Java is used to run the hadoop-auth tests.
>  
> 1) Bad JAAS configuration: unrecognized option: isInitiator
> 2) Cannot retrieve key from keytab HTTP/localh...@example.com
> #1 is caused because isInitiator isn't defined when we use IBM Java. There is 
> already a defect in JIRA for this: https://issues.apache.org/jira/browse/SENTRY-169
> But we need to apply it to KerberosTestUtils.java for some tests in 
> hadoop-auth to pass.
> #2 is caused because, for IBM_JAVA, the keytab file must be an absolute path with 
> file:// as the prefix for the useKeytab option, but the file path is relative. 
> This change will work with both openjdk & IBM_JAVA. 
> The attached patch resolves all failures that happen when we use IBM Java.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10312) Shell.ExitCodeException to have more useful toString

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-10312:


   Resolution: Fixed
Fix Version/s: 2.5.0
   Status: Resolved  (was: Patch Available)

> Shell.ExitCodeException to have more useful toString
> 
>
> Key: HADOOP-10312
> URL: https://issues.apache.org/jira/browse/HADOOP-10312
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch
>
>
> Shell's ExitCodeException doesn't include the exit code in the toString 
> value, so isn't that useful in diagnosing container start failures in YARN



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HADOOP-10533) S3 input stream NPEs in MapReduce job

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-10533:
---

Assignee: Steve Loughran

> S3 input stream NPEs in MapReduce job
> -
>
> Key: HADOOP-10533
> URL: https://issues.apache.org/jira/browse/HADOOP-10533
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 1.0.0, 1.0.3, 3.0.0, 2.4.0
> Environment: Hadoop with default configurations
>Reporter: Benjamin Kim
>Assignee: Steve Loughran
>Priority: Minor
>
> I'm running a wordcount MR as follows
> hadoop jar WordCount.jar wordcount.WordCountDriver 
> s3n://bucket/wordcount/input s3n://bucket/wordcount/output
>  
> s3n://bucket/wordcount/input is an s3 object that contains other input files.
> However I get the following NPE error:
> 12/10/02 18:56:23 INFO mapred.JobClient:  map 0% reduce 0%
> 12/10/02 18:56:54 INFO mapred.JobClient:  map 50% reduce 0%
> 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : 
> attempt_201210021853_0001_m_01_0, Status : FAILED
> java.lang.NullPointerException
> at 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
> at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
> at java.io.FilterInputStream.close(FilterInputStream.java:155)
> at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
> at 
> org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> MR runs fine if I specify a more specific input path such as 
> s3n://bucket/wordcount/input/file.txt
> MR fails if I pass s3 folder as a parameter
> In summary,
> This works
>  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
> /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/
> This doesn't work
>  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
> s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/
> (both input paths are directories)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10312) Shell.ExitCodeException to have more useful toString

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051349#comment-14051349
 ] 

Hudson commented on HADOOP-10312:
-

SUCCESS: Integrated in Hadoop-trunk-Commit #5817 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5817/])
HADOOP-10312 Shell.ExitCodeException to have more useful toString (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607591)
* /hadoop/common/trunk
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java


> Shell.ExitCodeException to have more useful toString
> 
>
> Key: HADOOP-10312
> URL: https://issues.apache.org/jira/browse/HADOOP-10312
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch
>
>
> Shell's ExitCodeException doesn't include the exit code in the toString 
> value, so isn't that useful in diagnosing container start failures in YARN



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9495) Define behaviour of Seekable.seek(), write tests, fix all hadoop implementations for compliance

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9495:
---

Issue Type: Improvement  (was: Sub-task)
Parent: (was: HADOOP-9361)

> Define behaviour of Seekable.seek(), write tests, fix all hadoop 
> implementations for compliance
> ---
>
> Key: HADOOP-9495
> URL: https://issues.apache.org/jira/browse/HADOOP-9495
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 1.2.0, 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-9495.patch, HADOOP-9545-002.patch
>
>
> {{Seekable.seek()}} seems a good starting point for specifying, testing and 
> implementing FS API compliance: one method, relatively unambiguous 
> semantics, and easily assessed usage in the Hadoop codebase. Specify and test it 
> first.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9565) Add a Blobstore interface to add to blobstore FileSystems

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9565:
---

Issue Type: Improvement  (was: Sub-task)
Parent: (was: HADOOP-9361)

> Add a Blobstore interface to add to blobstore FileSystems
> -
>
> Key: HADOOP-9565
> URL: https://issues.apache.org/jira/browse/HADOOP-9565
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.0.4-alpha
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> We can make explicit the fact that some {{FileSystem}} implementations are really 
> blobstores, with different atomicity and consistency guarantees, by adding a 
> {{Blobstore}} interface to them. 
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that 
> all blobstores implement a server-side copy operation as a substitute for 
> rename.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9651) Filesystems to throw FileAlreadyExistsException in createFile(path, overwrite=false) when the file exists

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9651:
---

Status: Open  (was: Patch Available)

incorporated in the uber-JIRA HADOOP-9361

> Filesystems to throw FileAlreadyExistsException in createFile(path, 
> overwrite=false) when the file exists
> -
>
> Key: HADOOP-9651
> URL: https://issues.apache.org/jira/browse/HADOOP-9651
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: 2.1.0-beta
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-9651.patch
>
>
> While HDFS and other filesystems throw a {{FileAlreadyExistsException}} if 
> you try to create a file that exists and you have set {{overwrite=false}}, 
> {{RawLocalFileSystem}} throws a plain {{IOException}}. This makes it 
> impossible to distinguish a create operation failing due to a fixable problem 
> (the file is there) from one failing due to something more fundamental.
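
For illustration, a caller relying on the tighter exception looks roughly like the sketch below; the path handling is made up, while {{FileSystem.create(path, false)}} and {{org.apache.hadoop.fs.FileAlreadyExistsException}} are the real APIs in question:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileAlreadyExistsException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreateNoOverwriteExample {
  public static void main(String[] args) throws Exception {
    Path path = new Path(args[0]);                 // e.g. a path that may already exist
    FileSystem fs = path.getFileSystem(new Configuration());
    try (FSDataOutputStream out = fs.create(path, false /* overwrite */)) {
      out.writeUTF("hello");
    } catch (FileAlreadyExistsException e) {
      // Fixable problem: the file is already there, so pick a new name or delete it.
      System.err.println("File exists, not overwriting: " + path);
    }
    // Any other IOException still signals something more fundamental.
  }
}
{code}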



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9371) Define Semantics of FileSystem more rigorously

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9371:
---

Summary: Define Semantics of FileSystem more rigorously  (was: Define 
Semantics of FileSystem and FileContext more rigorously)

> Define Semantics of FileSystem more rigorously
> --
>
> Key: HADOOP-9371
> URL: https://issues.apache.org/jira/browse/HADOOP-9371
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: 1.2.0, 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: HADOOP-9361.2.patch, HADOOP-9361.patch, 
> HADOOP-9371-003.patch, HadoopFilesystemContract.pdf
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The semantics of {{FileSystem}} and {{FileContext}} are not completely 
> defined in terms of 
> # core expectations of a filesystem
> # consistency requirements.
> # concurrency requirements.
> # minimum scale limits
> Furthermore, methods are not defined strictly enough in terms of their 
> outcomes and failure modes.
> The requirements and method semantics should be defined more strictly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (HADOOP-9597) FileSystem open() API is not clear if FileNotFoundException is thrown when the path does not exist

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-9597:
--

Assignee: Steve Loughran

> FileSystem open() API is not clear if FileNotFoundException is thrown when the 
> path does not exist
> -
>
> Key: HADOOP-9597
> URL: https://issues.apache.org/jira/browse/HADOOP-9597
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs
>Affects Versions: 2.0.4-alpha
>Reporter: Jerry He
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
>
> The current FileSystem open() method throws a generic IOException in its API 
> specification.
> Some FileSystem implementations (DFS, RawLocalFileSystem ...) throw a more 
> specific FileNotFoundException if the path does not exist. Some throw 
> IOException only (FTPFileSystem, HftpFileSystem ...). 
> If we have a new FileSystem implementation, what should we follow exactly for 
> open()?
> What should the application expect in this case?
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-9597) FileSystem open() API is not clear if FileNotFoundException is thrown when the path does not exist

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-9597.


   Resolution: Done
Fix Version/s: 2.5.0

> FileSystem open() API is not clear if FileNotFoundException is thrown when the 
> path does not exist
> -
>
> Key: HADOOP-9597
> URL: https://issues.apache.org/jira/browse/HADOOP-9597
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs
>Affects Versions: 2.0.4-alpha
>Reporter: Jerry He
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
>
> The current FileSystem open() method throws a generic IOException in its API 
> specification.
> Some FileSystem implementations (DFS, RawLocalFileSystem ...) throw a more 
> specific FileNotFoundException if the path does not exist. Some throw 
> IOException only (FTPFileSystem, HftpFileSystem ...). 
> If we have a new FileSystem implementation, what should we follow exactly for 
> open()?
> What should the application expect in this case?
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9597) FileSystem open() API is not clear if FileNotFoundException is thrown when the path does not exist

2014-07-03 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051361#comment-14051361
 ] 

Steve Loughran commented on HADOOP-9597:


It's {{FileNotFoundException}}; the parent JIRA makes sure this is the case in all 
filesystems it has coverage for. Marking as done.
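
A small illustration of what callers can therefore rely on; the path is made up, while {{FileSystem.open()}} and {{FileNotFoundException}} are the APIs in question:

{code:java}
import java.io.FileNotFoundException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OpenMissingPathExample {
  public static void main(String[] args) throws Exception {
    Path path = new Path("/no/such/file");                   // illustrative path
    FileSystem fs = path.getFileSystem(new Configuration());
    try (FSDataInputStream in = fs.open(path)) {
      System.out.println(in.read());
    } catch (FileNotFoundException e) {
      // Per the FS contract work, a missing path surfaces as FileNotFoundException.
      System.err.println("Path does not exist: " + path);
    }
  }
}
{code}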

> FileSystem open() API is not clear if FileNotFoundException is thrown when the 
> path does not exist
> -
>
> Key: HADOOP-9597
> URL: https://issues.apache.org/jira/browse/HADOOP-9597
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, fs
>Affects Versions: 2.0.4-alpha
>Reporter: Jerry He
>Priority: Minor
> Fix For: 2.5.0
>
>
> The current FileSystem open() method throws a generic IOException in its API 
> specification.
> Some FileSystem implementations (DFS, RawLocalFileSystem ...) throw a more 
> specific FileNotFoundException if the path does not exist. Some throw 
> IOException only (FTPFileSystem, HftpFileSystem ...). 
> If we have a new FileSystem implementation, what should we follow exactly for 
> open()?
> What should the application expect in this case?
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10246) define FS permissions model with tests

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-10246:


Issue Type: Improvement  (was: Sub-task)
Parent: (was: HADOOP-9361)

> define FS permissions model with tests
> --
>
> Key: HADOOP-10246
> URL: https://issues.apache.org/jira/browse/HADOOP-10246
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Reporter: Steve Loughran
>Priority: Minor
>
> It's interesting that HDFS mkdirs(dir, permission) uses the umask, but 
> setPermissions() does not.
> The permissions model, including the umask logic, should be defined and have tests 
> implemented by those filesystems that support permissions-based security.
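
A short sketch of the asymmetry described above, using the public FileSystem API; the path is illustrative:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class UmaskAsymmetry {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path dir = new Path("/tmp/perm-test");        // illustrative path
    FsPermission requested = new FsPermission((short) 0777);

    // Per the description: HDFS applies the configured umask here ...
    fs.mkdirs(dir, requested);
    System.out.println("after mkdirs:        " + fs.getFileStatus(dir).getPermission());

    // ... but not here, where the permission is set exactly as given.
    fs.setPermission(dir, requested);
    System.out.println("after setPermission: " + fs.getFileStatus(dir).getPermission());
  }
}
{code}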



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051375#comment-14051375
 ] 

Hudson commented on HADOOP-9361:


SUCCESS: Integrated in Hadoop-trunk-Commit #5818 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5818/])
HADOOP-9361: Strictly define FileSystem APIs - HDFS portion (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607597)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/HDFSContract.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractAppend.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractConcat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractCreate.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractDelete.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractMkdir.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractOpen.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractRename.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractRootDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractSeek.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/contract
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/contract/hdfs.xml
HADOOP-9361: Strictly define FileSystem APIs (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607596)
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BufferedFSInputStream.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataOutputStream.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSExceptionMessages.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ftp/FTPFileSystem.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ftp/FTPInputStream.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/s3/S3FileSystem.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/s3native/Jets3tNativeFileSystemStore.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/s3native/NativeS3FileSystem.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/extending.md
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/index.md
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/introduction.md
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/model.md
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/notation.md
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/testing.md
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestLocalFileSystem.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract
* 
/hadoop/comm

[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051384#comment-14051384
 ] 

Hudson commented on HADOOP-9361:


SUCCESS: Integrated in Hadoop-trunk-Commit #5819 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5819/])
HADOOP-9361: site and gitignore (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607601)
* /hadoop/common/trunk/.gitignore
* /hadoop/common/trunk/hadoop-project/src/site/site.xml
HADOOP-9361: Strictly define FileSystem APIs - OpenStack portion (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607600)
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/exceptions/SwiftNotDirectoryException.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/exceptions/SwiftPathExistsException.java
HADOOP-9361: Strictly define FileSystem APIs - OpenStack portion (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607599)
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/StrictBufferedFSInputStream.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeFileSystem.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeFileSystemStore.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeInputStream.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeOutputStream.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemBasicOps.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemContract.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemRename.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/SwiftContract.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractCreate.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractDelete.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractMkdir.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractOpen.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractRename.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractRootDir.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractSeek.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/hdfs2/TestV2LsOperations.java
* /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/resources/contract
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/resources/contract/swift.xml


> Strictly define the expected behavior of filesystem APIs and write tests to 
> verify compliance
> -
>
> Key: HADOOP-9361
> URL: https://issues.apache.org/jira/browse/HADOOP-9361
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, test
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, 
> HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, 
> HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, 
> HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, 
> HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, 
> HADOOP-9361-016.patch, HADOOP-9361-017.patch, HADOOP-9361-018.patch, 
> HADOOP-9361.awang-addendum.patch
>
>
> {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while 
> HDFS gets tested downstream, other filesystems, such as blobstore bindings, 
> don't.
> The only tests that are common are those of {{FileSystemContractTestBase}}, 
> which HADOOP-9258 shows is incomplete.
> I propose 
> # writing more tests which clarify expected b

[jira] [Resolved] (HADOOP-10419) BufferedFSInputStream NPEs on getPos() on a closed stream

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-10419.
-

   Resolution: Fixed
Fix Version/s: 2.5.0

> BufferedFSInputStream NPEs on getPos() on a closed stream
> -
>
> Key: HADOOP-10419
> URL: https://issues.apache.org/jira/browse/HADOOP-10419
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> If you call getPos on a {{ChecksumFileSystem}} after a {{close()}} you get an 
> NPE.
> While throwing an exception in this state is legitimate (HDFS does, RawLocal 
> does not), it should be an {{IOException}}.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-9711) Write contract tests for S3Native; fix places where it breaks

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-9711.


   Resolution: Fixed
Fix Version/s: 2.5.0
 Assignee: Steve Loughran

> Write contract tests for S3Native; fix places where it breaks
> -
>
> Key: HADOOP-9711
> URL: https://issues.apache.org/jira/browse/HADOOP-9711
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 1.2.0, 3.0.0, 2.1.0-beta
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HADOOP-9711-004.patch
>
>
> Implement the abstract contract tests for S3, identify where it is failing to 
> meet expectations and, where possible, fix it. Blobstores tend to treat 0-byte 
> files as directories, so tests overwriting files with dirs and vice versa may 
> fail and have to be skipped.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-10533) S3 input stream NPEs in MapReduce job

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-10533.
-

   Resolution: Fixed
Fix Version/s: 2.5.0

> S3 input stream NPEs in MapReduce job
> -
>
> Key: HADOOP-10533
> URL: https://issues.apache.org/jira/browse/HADOOP-10533
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 1.0.0, 1.0.3, 3.0.0, 2.4.0
> Environment: Hadoop with default configurations
>Reporter: Benjamin Kim
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
>
> I'm running a wordcount MR as follows
> hadoop jar WordCount.jar wordcount.WordCountDriver 
> s3n://bucket/wordcount/input s3n://bucket/wordcount/output
>  
> s3n://bucket/wordcount/input is an s3 object that contains other input files.
> However I get the following NPE error:
> 12/10/02 18:56:23 INFO mapred.JobClient:  map 0% reduce 0%
> 12/10/02 18:56:54 INFO mapred.JobClient:  map 50% reduce 0%
> 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : 
> attempt_201210021853_0001_m_01_0, Status : FAILED
> java.lang.NullPointerException
> at 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106)
> at java.io.BufferedInputStream.close(BufferedInputStream.java:451)
> at java.io.FilterInputStream.close(FilterInputStream.java:155)
> at org.apache.hadoop.util.LineReader.close(LineReader.java:83)
> at 
> org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> MR runs fine if I specify a more specific input path such as 
> s3n://bucket/wordcount/input/file.txt
> MR fails if I pass s3 folder as a parameter
> In summary,
> This works
>  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
> /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/
> This doesn't work
>  hadoop jar ./hadoop-examples-1.0.3.jar wordcount 
> s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/
> (both input paths are directories)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-9495) Define behaviour of Seekable.seek(), write tests, fix all hadoop implementations for compliance

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-9495.


   Resolution: Fixed
Fix Version/s: 2.5.0

> Define behaviour of Seekable.seek(), write tests, fix all hadoop 
> implementations for compliance
> ---
>
> Key: HADOOP-9495
> URL: https://issues.apache.org/jira/browse/HADOOP-9495
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 1.2.0, 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 2.5.0
>
> Attachments: HADOOP-9495.patch, HADOOP-9545-002.patch
>
>
> {{Seekable.seek()}} seems a good starting point for specifying, testing and 
> implementing FS API compliance: one method, relatively unambiguous 
> semantics, and easily assessed usage in the Hadoop codebase. Specify and test it 
> first.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9712) Write contract tests for FTP filesystem, fix places where it breaks

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9712:
---

   Resolution: Fixed
Fix Version/s: 2.5.0
   Status: Resolved  (was: Patch Available)

> Write contract tests for FTP filesystem, fix places where it breaks
> ---
>
> Key: HADOOP-9712
> URL: https://issues.apache.org/jira/browse/HADOOP-9712
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 1.2.0, 3.0.0, 2.1.0-beta
>Reporter: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HADOOP-9712-001.patch
>
>
> Implement the abstract contract tests for FTP, identify where it is failing to 
> meet expectations and, where possible, fix it. 
> FTPFS appears to be the least tested (& presumably least used) hadoop filesystem 
> implementation; there may be some bug reports that have been around for years 
> that could drive test cases and fixes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-9371) Define Semantics of FileSystem more rigorously

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-9371.


   Resolution: Fixed
Fix Version/s: 2.5.0

> Define Semantics of FileSystem more rigorously
> --
>
> Key: HADOOP-9371
> URL: https://issues.apache.org/jira/browse/HADOOP-9371
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs
>Affects Versions: 1.2.0, 3.0.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: 2.5.0
>
> Attachments: HADOOP-9361.2.patch, HADOOP-9361.patch, 
> HADOOP-9371-003.patch, HadoopFilesystemContract.pdf
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The semantics of {{FileSystem}} and {{FileContext}} are not completely 
> defined in terms of 
> # core expectations of a filesystem
> # consistency requirements.
> # concurrency requirements.
> # minimum scale limits
> Furthermore, methods are not defined strictly enough in terms of their 
> outcomes and failure modes.
> The requirements and method semantics should be defined more strictly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051418#comment-14051418
 ] 

Hudson commented on HADOOP-9361:


SUCCESS: Integrated in Hadoop-trunk-Commit #5820 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5820/])
HADOOP-9361: changes.txt (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607620)
* /hadoop/common/trunk
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


> Strictly define the expected behavior of filesystem APIs and write tests to 
> verify compliance
> -
>
> Key: HADOOP-9361
> URL: https://issues.apache.org/jira/browse/HADOOP-9361
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, test
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, 
> HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, 
> HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, 
> HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, 
> HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, 
> HADOOP-9361-016.patch, HADOOP-9361-017.patch, HADOOP-9361-018.patch, 
> HADOOP-9361.awang-addendum.patch
>
>
> {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while 
> HDFS gets tested downstream, other filesystems, such as blobstore bindings, 
> don't.
> The only tests that are common are those of {{FileSystemContractTestBase}}, 
> which HADOOP-9258 shows is incomplete.
> I propose 
> # writing more tests which clarify expected behavior
> # testing operations in the interface being in their own JUnit4 test classes, 
> instead of one big test suite. 
> # Having each FS declare via a properties file what behaviors they offer, 
> such as atomic-rename, atomic-delete, umask, immediate-consistency -test 
> methods can downgrade to skipped test cases if a feature is missing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10312) Shell.ExitCodeException to have more useful toString

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051417#comment-14051417
 ] 

Hudson commented on HADOOP-10312:
-

SUCCESS: Integrated in Hadoop-trunk-Commit #5820 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5820/])
HADOOP-10312 changes.text updated in wrong place (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607631)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt


> Shell.ExitCodeException to have more useful toString
> 
>
> Key: HADOOP-10312
> URL: https://issues.apache.org/jira/browse/HADOOP-10312
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch
>
>
> Shell's ExitCodeException doesn't include the exit code in the toString 
> value, so isn't that useful in diagnosing container start failures in YARN



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9651) Filesystems to throw FileAlreadyExistsException in createFile(path, overwrite=false) when the file exists

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9651:
---

Issue Type: Bug  (was: Sub-task)
Parent: (was: HADOOP-9361)

> Filesystems to throw FileAlreadyExistsException in createFile(path, 
> overwrite=false) when the file exists
> -
>
> Key: HADOOP-9651
> URL: https://issues.apache.org/jira/browse/HADOOP-9651
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.1.0-beta
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: HADOOP-9651.patch
>
>
> While HDFS and other filesystems throw a {{FileAlreadyExistsException}} if 
> you try to create a file that exists and you have set {{overwrite=false}}, 
> {{RawLocalFileSystem}} throws a plain {{IOException}}. This makes it 
> impossible to distinguish a create operation failing due to a fixable problem 
> (the file is there) from one failing due to something more fundamental.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.

2014-07-03 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051427#comment-14051427
 ] 

Yi Liu commented on HADOOP-10734:
-

Thanks [~cmccabe], [~apurtell], [~andrew.wang] for the comments.

I'll summarize several ways to generate secure random numbers in Linux, and why RdRand.

*  /dev/random uses an entropy pool fed by several entropy sources, such as 
mouse movement, keyboard typing and so on. If the entropy pool is empty, reads from 
/dev/random block until additional environmental noise is gathered. 
RdRand is used to improve the entropy by combining the values received from 
RdRand with other sources of randomness.
The reason for combining is that some developers are concerned there may be 
back doors in RdRand, but that isn't true.
*  /dev/urandom reuses the internal entropy pool and will return as many 
random bytes as requested. The call does not block, and the output may contain 
less entropy than the corresponding read from /dev/random. If the entropy pool 
is empty, it generates data using SHA or other algorithms.
* In Java, new SecureRandom() will read bytes from /dev/urandom and {{xor}} them 
with bytes from the Java SHA1PRNG. 
* RdRand is a hardware generator. OpenSSL recommends using hardware 
generators and says their entropy is always nearly 100%. We can use RdRand 
directly.

So we can see that option 4, RdRand, is faster than the others and its entropy is 
nearly 100%.

http://en.wikipedia.org/wiki/RdRand
http://wiki.openssl.org/index.php/Random_Numbers
http://en.wikipedia.org/?title=/dev/random
http://docs.oracle.com/javase/7/docs/api/java/security/SecureRandom.html
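
For comparison only, plain Java usage corresponding to option 3 above; this is not the OpenSSL/RdRand implementation proposed in this patch:

{code:java}
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class SecureRandomSources {
  public static void main(String[] args) throws NoSuchAlgorithmException {
    byte[] buf = new byte[32];

    // Option 3: the default SecureRandom. On Linux this is typically the
    // NativePRNG provider, which mixes /dev/urandom with SHA1PRNG output.
    SecureRandom def = new SecureRandom();
    def.nextBytes(buf);

    // Pure software PRNG, for comparison; seeded from the system entropy source.
    SecureRandom sha1 = SecureRandom.getInstance("SHA1PRNG");
    sha1.nextBytes(buf);

    System.out.println("default provider: " + def.getProvider() + " / " + def.getAlgorithm());
  }
}
{code}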


> Implementation of true secure random with high performance using hardware 
> random number generator.
> --
>
> Key: HADOOP-10734
> URL: https://issues.apache.org/jira/browse/HADOOP-10734
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10734.patch
>
>
> This JIRA is to implement Secure random using JNI to OpenSSL, and 
> implementation should be thread-safe.
> Utilize RdRand to return random numbers from hardware random number 
> generator. It's TRNG(True Random Number generators) having much higher 
> performance than {{java.security.SecureRandom}}. 
> https://wiki.openssl.org/index.php/Random_Numbers
> http://en.wikipedia.org/wiki/RdRand
> https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-9361:
---

   Resolution: Fixed
Fix Version/s: 2.5.0
   Status: Resolved  (was: Patch Available)

> Strictly define the expected behavior of filesystem APIs and write tests to 
> verify compliance
> -
>
> Key: HADOOP-9361
> URL: https://issues.apache.org/jira/browse/HADOOP-9361
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, test
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Fix For: 2.5.0
>
> Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, 
> HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, 
> HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, 
> HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, 
> HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, 
> HADOOP-9361-016.patch, HADOOP-9361-017.patch, HADOOP-9361-018.patch, 
> HADOOP-9361.awang-addendum.patch
>
>
> {{FileSystem}} and {{FileContract}} aren't tested rigorously enough - while 
> HDFS gets tested downstream, other filesystems, such as blobstore bindings, 
> aren't.
> The only tests that are common are those of {{FileSystemContractTestBase}}, 
> which HADOOP-9258 shows is incomplete.
> I propose 
> # writing more tests which clarify expected behavior
> # testing operations in the interface being in their own JUnit4 test classes, 
> instead of one big test suite. 
> # Having each FS declare via a properties file what behaviors they offer, 
> such as atomic-rename, atomic-delete, umask, immediate-consistency -test 
> methods can downgrade to skipped test cases if a feature is missing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10458) swifts should throw FileAlreadyExistsException on attempt to overwrite file

2014-07-03 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-10458:


   Resolution: Fixed
Fix Version/s: 2.5.0
   Status: Resolved  (was: Patch Available)

> swifts should throw FileAlreadyExistsException on attempt to overwrite file
> ---
>
> Key: HADOOP-10458
> URL: https://issues.apache.org/jira/browse/HADOOP-10458
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HADOOP-10458-001.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> the swift:// filesystem checks for and rejects {{create()}} calls over an 
> existing file if overwrite = false, but it throws a custom exception, 
> {{SwiftPathExistsException}}.
> If it threw an {{org.apache.hadoop.fs.FileAlreadyExistsException}} it would 
> match HDFS.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.

2014-07-03 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051432#comment-14051432
 ] 

Yi Liu commented on HADOOP-10734:
-

We could also add an enable flag to the configuration so that users can disable it.

> Implementation of true secure random with high performance using hardware 
> random number generator.
> --
>
> Key: HADOOP-10734
> URL: https://issues.apache.org/jira/browse/HADOOP-10734
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10734.patch
>
>
> This JIRA is to implement Secure random using JNI to OpenSSL, and 
> implementation should be thread-safe.
> Utilize RdRand to return random numbers from hardware random number 
> generator. It's TRNG(True Random Number generators) having much higher 
> performance than {{java.security.SecureRandom}}. 
> https://wiki.openssl.org/index.php/Random_Numbers
> http://en.wikipedia.org/wiki/RdRand
> https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10312) Shell.ExitCodeException to have more useful toString

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051468#comment-14051468
 ] 

Hudson commented on HADOOP-10312:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #1793 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1793/])
HADOOP-10312 Shell.ExitCodeException to have more useful toString (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607591)
* /hadoop/common/trunk
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java


> Shell.ExitCodeException to have more useful toString
> 
>
> Key: HADOOP-10312
> URL: https://issues.apache.org/jira/browse/HADOOP-10312
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch
>
>
> Shell's ExitCodeException doesn't include the exit code in the toString 
> value, so isn't that useful in diagnosing container start failures in YARN



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.

2014-07-03 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051500#comment-14051500
 ] 

Yi Liu commented on HADOOP-10734:
-

[~cmccabe], thanks for the review :-). 

{quote}
I actually have the same problem with the scheme here: JNI calls are 
expensive... do we know how many random bits the API user is getting at a time? 
If that number is small, we might want to implement batching. 
{quote}
In most cases, we use it to generate keys (16 bytes, 32 bytes, 128 bytes, 256 bytes), 
IVs (16 bytes), and longs (8 bytes).
Furthermore, to make the random bytes good enough we can't avoid JNI; even 
{{java.security.SecureRandom}} uses JNI.

{quote}
I also think we should consider using ByteBuffer rather than byte[] array, if 
performance is the primary goal.
{quote}
I suppose you mean a direct ByteBuffer. Per my understanding, the merit of a direct 
ByteBuffer is avoiding byte copies. But {{SecureRandom#nextBytes}} accepts a 
pre-allocated byte[] array; if we use a direct ByteBuffer for JNI, there is an 
additional copy in the Java layer, so the performance is the same and we would 
also need to manage the direct ByteBuffer.

{quote}
{code}
+  final protected int next(int numBits) {
{code}
Should be private
{quote}
OK, I will update it.

{quote}
{code}
+  public long nextLong() {
+return ((long)(next(32)) << 32) + next(32);
+  }
{code}
Why use addition rather than bitwise OR here?
{quote}
Bitwise OR is also OK. Actually {{nextLong}}, {{nextFloat}} and {{nextDouble}} 
are copied from the implementations in {{java.security.SecureRandom}}.
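As a side note, if we did switch to bitwise OR, the low word would need to be 
masked to avoid sign extension. A small illustrative sketch (not part of the 
patch):

{code}
// next(32) returns a signed int; when widened to long, a negative low word
// would sign-extend and clobber the high word, so mask it before OR-ing.
static long nextLongWithOr(int hi, int lo) {
  return ((long) hi << 32) | (lo & 0xFFFFFFFFL);
}

// The addition form used by java.util.Random. Individual values can differ
// from the OR form when lo < 0, but both map two uniform 32-bit words onto a
// uniform 64-bit long.
static long nextLongWithAdd(int hi, int lo) {
  return ((long) hi << 32) + lo;
}
{code}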

{quote}
This is not correct. The type of {{pthread_t}} is not known. If you want a 
numeric thread ID, you could try gettid on Linux.
{quote}
Can you explain a bit more? I'm not sure I get your meaning. Per my 
understanding, {{pthread_t}} is defined in {{/usr/include/bits/pthreadtypes.h}} 
as
{code}
typedef unsigned long int pthread_t;
{code}
And this patch is compiled and run successfully on my Linux server.


> Implementation of true secure random with high performance using hardware 
> random number generator.
> --
>
> Key: HADOOP-10734
> URL: https://issues.apache.org/jira/browse/HADOOP-10734
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10734.patch
>
>
> This JIRA is to implement Secure random using JNI to OpenSSL, and 
> implementation should be thread-safe.
> Utilize RdRand to return random numbers from hardware random number 
> generator. It's TRNG(True Random Number generators) having much higher 
> performance than {{java.security.SecureRandom}}. 
> https://wiki.openssl.org/index.php/Random_Numbers
> http://en.wikipedia.org/wiki/RdRand
> https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10312) Shell.ExitCodeException to have more useful toString

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051563#comment-14051563
 ] 

Hudson commented on HADOOP-10312:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1820 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1820/])
HADOOP-10312 changes.text updated in wrong place (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607631)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
HADOOP-10312 Shell.ExitCodeException to have more useful toString (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607591)
* /hadoop/common/trunk
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java


> Shell.ExitCodeException to have more useful toString
> 
>
> Key: HADOOP-10312
> URL: https://issues.apache.org/jira/browse/HADOOP-10312
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Affects Versions: 2.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.5.0
>
> Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch
>
>
> Shell's ExitCodeException doesn't include the exit code in the toString 
> value, so isn't that useful in diagnosing container start failures in YARN



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance

2014-07-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051573#comment-14051573
 ] 

Hudson commented on HADOOP-9361:


FAILURE: Integrated in Hadoop-Mapreduce-trunk #1820 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1820/])
HADOOP-9361: changes.txt (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607620)
* /hadoop/common/trunk
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
HADOOP-9361: site and gitignore (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607601)
* /hadoop/common/trunk/.gitignore
* /hadoop/common/trunk/hadoop-project/src/site/site.xml
HADOOP-9361: Strictly define FileSystem APIs - OpenStack portion (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607600)
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/exceptions/SwiftNotDirectoryException.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/exceptions/SwiftPathExistsException.java
HADOOP-9361: Strictly define FileSystem APIs - OpenStack portion (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607599)
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/StrictBufferedFSInputStream.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeFileSystem.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeFileSystemStore.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeInputStream.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeOutputStream.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemBasicOps.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemContract.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemRename.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/SwiftContract.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractCreate.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractDelete.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractMkdir.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractOpen.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractRename.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractRootDir.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractSeek.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/hdfs2/TestV2LsOperations.java
* /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/resources/contract
* 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/resources/contract/swift.xml
HADOOP-9361: Strictly define FileSystem APIs - HDFS portion (stevel: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607597)
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/HDFSContract.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractAppend.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractConcat.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractCreate.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractDelete.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/h

[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider

2014-07-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051628#comment-14051628
 ] 

Alejandro Abdelnur commented on HADOOP-10769:
-

Lets assume you have a {{DelegationTokenKeyProviderExtension}} providing a 
{{DelegationTokenExtension}} interface, it would be something like this: 

{code}
public class DelegationTokenKeyProviderExtension extends 
    KeyProviderExtension {

  public interface DelegationTokenExtension extends Extension {
    public Token getDelegationToken(String renewer) throws IOException;
  }

  private DelegationTokenKeyProviderExtension(KeyProvider kp, 
      DelegationTokenExtension dte) {
    super(kp, dte);
  }

  public Token getDelegationToken(String renewer) throws IOException {
    Token token = null;
    if (getExtension() != null) {
      token = getExtension().getDelegationToken(renewer);
    }
    return token;
  }

  // Default no-op extension for providers without delegation token support.
  private static class DefaultDelegationTokenExtension implements 
      DelegationTokenExtension {
    public Token getDelegationToken(String renewer) throws IOException {
      return null;
    }
  }

  public static DelegationTokenKeyProviderExtension getExtension(KeyProvider 
      kp) {
    DelegationTokenExtension dte = (kp instanceof DelegationTokenExtension) ? 
        (DelegationTokenExtension) kp : null;
    return new DelegationTokenKeyProviderExtension(kp, dte);
  }
}
{code}

When using the {{DelegationTokenKeyProviderExtension}} to get tokens, you get 
the same semantics as you would if {{getDelegationToken()}} were baked into the 
{{KeyProvider}} API, but without having the token retrieval in the 
{{KeyProvider}} API itself, which is your source of concern.
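A hypothetical caller would then look roughly like this ({{fetchToken}} is an 
illustrative helper, not part of the proposal):

{code}
// Callers work against the extension wrapper, never against KeyProvider itself.
Token fetchToken(KeyProvider kp, String renewer) throws IOException {
  DelegationTokenKeyProviderExtension ext =
      DelegationTokenKeyProviderExtension.getExtension(kp);
  return ext.getDelegationToken(renewer); // null if kp has no token support
}
{code}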



> Add getDelegationToken() method to KeyProvider
> --
>
> Key: HADOOP-10769
> URL: https://issues.apache.org/jira/browse/HADOOP-10769
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
>
> The KeyProvider API needs to return delegation tokens to enable access to the 
> KeyProvider from processes without Kerberos credentials (ie Yarn containers).
> This is required for HDFS encryption and KMS integration.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider

2014-07-03 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051744#comment-14051744
 ] 

Larry McCay commented on HADOOP-10769:
--

That seems pretty convoluted.

Let's step back a second so that the full use case is clear.

* consumers of the managed keys will need access to them from services/tasks at 
execution time
* some of the keys will be unknown until file access time
* so, at job submission time, KMS delegation tokens are needed so that the 
services/tasks can later access the required keys as the submitting user, as 
they discover which specific keys they need from the HDFS extended attributes
* therefore the delegation tokens have to be in the credentials file
* they will also need to be made available to the KMSClientKeyProvider to 
include in the request to the KMS

So, we need:

1. the ability to get the KMS delegation token at job submission time
2. the ability to add it to and get it from the credentials file (already 
available in Credentials)
- though it seems that this has to be done by the consuming code not the 
KMSClientKeyProvider code
3. the ability to supply the delegation token to the KMSClientKeyProvider when 
requesting keys

My questions:

A. For #1 can't we have a standalone DelegationTokenClient component - 
especially since there is another jira for refactoring delegation token support 
out into common to be more reusable? Such a client could then potentially be 
used inside the KMSClientKeyProvider.
B. Wouldn't it be better if providers that know they need delegation tokens 
were able to handle #2 themselves?
C. How is #3 above going to be handled using the current interfaces? I don't 
see how it is being added to the interaction currently.
D. If the KMSClientKeyProvider had access to the credentials object ( already 
have access to UserKeyProvider) or some other execution context itself then 
could that be a way that #3 could be addressed?


> Add getDelegationToken() method to KeyProvider
> --
>
> Key: HADOOP-10769
> URL: https://issues.apache.org/jira/browse/HADOOP-10769
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
>
> The KeyProvider API needs to return delegation tokens to enable access to the 
> KeyProvider from processes without Kerberos credentials (ie Yarn containers).
> This is required for HDFS encryption and KMS integration.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider

2014-07-03 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051745#comment-14051745
 ] 

Arun Suresh commented on HADOOP-10719:
--

[~tucu00], the problem with having just the static method you specified is that 
you are restricting the composability of the 
{{KeyProviderCryptoExtension}}. For instance, how will you combine a 
{{JavaKeyStoreProvider}} with a different type of {{CryptoExtension}} other 
than {{DefaultCryptoExtension}}?


> Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
> ---
>
> Key: HADOOP-10719
> URL: https://issues.apache.org/jira/browse/HADOOP-10719
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch
>
>
> This is a follow up on 
> [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044]
> KeyProvider API should  have 2 new methods:
> * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv)
> * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion 
> encryptedKey)
> The implementation would do a known transformation on the IV (i.e.: xor with 
> 0xff the original IV).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider

2014-07-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051748#comment-14051748
 ] 

Alejandro Abdelnur commented on HADOOP-10719:
-

As mentioned before, I wouldn't worry about composability; I would rather say no 
to it and have different extensions wrapping the same provider instance, 
one for each use.

> Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
> ---
>
> Key: HADOOP-10719
> URL: https://issues.apache.org/jira/browse/HADOOP-10719
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch
>
>
> This is a follow up on 
> [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044]
> KeyProvider API should  have 2 new methods:
> * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv)
> * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion 
> encryptedKey)
> The implementation would do a known transformation on the IV (i.e.: xor with 
> 0xff the original IV).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider

2014-07-03 Thread Mike Yoder (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051747#comment-14051747
 ] 

Mike Yoder commented on HADOOP-10719:
-

Crypto-nerd comments - in generateEncryptedKey()...
- The line "SecureRandom.getInstance("SHA1PRNG").nextBytes(newKey);" - two 
things: SHA1 is obsolete, can you choose something stronger?  I don't know what 
the set of valid options are, but if there is one that resembles "NIST SP 
800-90" then pick that one.  Also you're doing the getInstance call every time 
through this function, better to call it once for the class and then just call 
nextBytes in this function?  We probably also will want to build in new 
re-seeding logic around this random stream.  Key generation is highly 
scrutinized, trust me!
- The line "Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");" - can you 
please use CBC mode instead of CTR mode?  If we use CTR mode we're subjecting 
the encrypted DEK to all the attacks we're trying to avoid for the data itself. 
 CBC mode has none of the nasty ciphertext attack problems that CTR mode has.
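A rough sketch of both suggestions (illustrative only, not the patch itself; the 
class and method names here are made up, and key/IV handling is reduced to the 
bare minimum):

{code}
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class KeyWrapSketch {
  // Created once for the class instead of calling getInstance() per key.
  private static final SecureRandom RANDOM = new SecureRandom();

  // Generate a new data encryption key of the given bit length.
  static byte[] newKey(int bits) {
    byte[] key = new byte[bits / 8];
    RANDOM.nextBytes(key);
    return key;
  }

  // Wrap the DEK under the key-encryption key using CBC rather than CTR.
  static byte[] wrap(byte[] kek, byte[] iv, byte[] dek) throws Exception {
    Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
    cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(kek, "AES"),
        new IvParameterSpec(iv));
    return cipher.doFinal(dek);
  }
}
{code}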

> Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
> ---
>
> Key: HADOOP-10719
> URL: https://issues.apache.org/jira/browse/HADOOP-10719
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch
>
>
> This is a follow up on 
> [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044]
> KeyProvider API should  have 2 new methods:
> * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv)
> * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion 
> encryptedKey)
> The implementation would do a known transformation on the IV (i.e.: xor with 
> 0xff the original IV).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider

2014-07-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051768#comment-14051768
 ] 

Alejandro Abdelnur commented on HADOOP-10719:
-

[~yoderme], the use of the JDK JCE {{Cipher}} is temporary until we integrate with 
the fs-encryption branch, where all of this is taken care of by the 
{{CryptoCodec}} API.

> Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
> ---
>
> Key: HADOOP-10719
> URL: https://issues.apache.org/jira/browse/HADOOP-10719
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch
>
>
> This is a follow up on 
> [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044]
> KeyProvider API should  have 2 new methods:
> * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv)
> * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion 
> encryptedKey)
> The implementation would do a known transformation on the IV (i.e.: xor with 
> 0xff the original IV).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider

2014-07-03 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated HADOOP-10719:
-

Attachment: HADOOP-10719.3.patch

Uploading a patch with the feedback addressed. Thank you all for the review!
This patch is to be applied to trunk.

> Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
> ---
>
> Key: HADOOP-10719
> URL: https://issues.apache.org/jira/browse/HADOOP-10719
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, 
> HADOOP-10719.3.patch, HADOOP-10719.patch, HADOOP-10719.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch
>
>
> This is a follow up on 
> [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044]
> KeyProvider API should  have 2 new methods:
> * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv)
> * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion 
> encryptedKey)
> The implementation would do a known transformation on the IV (i.e.: xor with 
> 0xff the original IV).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10772) Generating RPMs for common, hdfs, httpfs, mapreduce , yarn and tools

2014-07-03 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051838#comment-14051838
 ] 

Eric Yang commented on HADOOP-10772:


Hadoop RPM packaging was removed during the conversion to Maven; the source code 
refactoring is what stopped the RPM packaging from evolving. Maven did not have 
good support for building RPM packages in the past. Now that there is a Maven RPM 
plugin, it should be much easier to build packages within Hadoop. When packaging 
lives outside of Hadoop, a developer needs to modify both Hadoop and BigTop in 
order to support a new platform, which is a sub-optimal way to keep the project 
modular. I view this patch as the community's response to a needed feature. RPM 
packages have been maintained for Hadoop 1.2.1 in parity with the latest 1.2.1 
release, and the community's efforts there are much appreciated. This effort 
could make future Hadoop versions more polished on other platforms without 
depending on Bigtop's release cycle.

> Generating RPMs for common, hdfs, httpfs, mapreduce , yarn and tools 
> -
>
> Key: HADOOP-10772
> URL: https://issues.apache.org/jira/browse/HADOOP-10772
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Jinghui Wang
>Assignee: Jinghui Wang
> Attachments: HADOOP-10772.patch
>
>
> Generating RPMs for hadoop-common, hadoop-hdfs, hadoop-hdfs-httpfs, 
> hadoop-mapreduce , hadoop-yarn-project and hadoop-tools-dist with dist build 
> profile.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HADOOP-10780) namenode throws java.lang.OutOfMemoryError upon DatanodeProtocol.versionRequest from datanode

2014-07-03 Thread Dmitry Sivachenko (JIRA)
Dmitry Sivachenko created HADOOP-10780:
--

 Summary: namenode throws java.lang.OutOfMemoryError upon 
DatanodeProtocol.versionRequest from datanode
 Key: HADOOP-10780
 URL: https://issues.apache.org/jira/browse/HADOOP-10780
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.4.1
 Environment: FreeBSD-10/stable
openjdk version "1.7.0_60"
OpenJDK Runtime Environment (build 1.7.0_60-b19)
OpenJDK 64-Bit Server VM (build 24.60-b09, mixed mode)

Reporter: Dmitry Sivachenko


I am trying hadoop-2.4.1 on FreeBSD-10/stable.
The namenode starts up, but after the first datanode contacts it, it throws an 
exception.
All limits seem to be high enough:

% limits -a
Resource limits (current):
  cputime  infinity secs
  filesize infinity kB
  datasize 33554432 kB
  stacksize  524288 kB
  coredumpsize infinity kB
  memoryuseinfinity kB
  memorylocked infinity kB
  maxprocesses   122778
  openfiles  14
  sbsize   infinity bytes
  vmemoryuse   infinity kB
  pseudo-terminals infinity
  swapuse  infinity kB

14944  1  S0:06.59 /usr/local/openjdk7/bin/java -Dproc_namenode 
-Xmx1000m -Dhadoop.log.dir=/var/log/hadoop 
-Dhadoop.log.file=hadoop-hdfs-namenode-nezabudka3-00.log 
-Dhadoop.home.dir=/usr/local -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA 
-Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true 
-Xmx32768m -Xms32768m -Djava.library.path=/usr/local/lib -Xmx32768m -Xms32768m 
-Djava.library.path=/usr/local/lib -Xmx32768m -Xms32768m 
-Djava.library.path=/usr/local/lib -Dhadoop.security.logger=INFO,RFAS 
org.apache.hadoop.hdfs.server.namenode.NameNode


From the namenode's log:

2014-07-03 23:28:15,070 WARN  [IPC Server handler 5 on 8020] ipc.Server 
(Server.java:run(2032)) - IPC Server handler 5 on 8020, call 
org.apache.hadoop.hdfs.server.protocol.Datano
deProtocol.versionRequest from 5.255.231.209:57749 Call#842 Retry#0
java.lang.OutOfMemoryError
at 
org.apache.hadoop.security.JniBasedUnixGroupsMapping.getGroupsForUser(Native 
Method)
at 
org.apache.hadoop.security.JniBasedUnixGroupsMapping.getGroups(JniBasedUnixGroupsMapping.java:80)
at 
org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:50)
at org.apache.hadoop.security.Groups.getGroups(Groups.java:139)
at 
org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1417)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.(FSPermissionChecker.java:81)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3331)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:5491)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.versionRequest(NameNodeRpcServer.java:1082)
at 
org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.versionRequest(DatanodeProtocolServerSideTranslatorPB.java:234)
at 
org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28069)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)


I did not have such an issue with hadoop-1.2.1.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10693) Implementation of AES-CTR CryptoCodec using JNI to OpenSSL

2014-07-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051886#comment-14051886
 ] 

Colin Patrick McCabe commented on HADOOP-10693:
---

{code}
+static int loadAesCtr(JNIEnv *env)
+{
+#ifdef UNIX
+  dlerror(); // Clear any existing error
+  dlsym_EVP_aes_256_ctr = dlsym(openssl, "EVP_aes_256_ctr");
+  dlsym_EVP_aes_128_ctr = dlsym(openssl, "EVP_aes_128_ctr");
+  if (dlerror() != NULL) {
+return -1;
+  }
+#endif
+
+#ifdef WINDOWS
+  dlsym_EVP_aes_256_ctr = (__dlsym_EVP_aes_256_ctr) GetProcAddress(openssl,  \
+  "EVP_aes_256_ctr");
+  dlsym_EVP_aes_128_ctr = (__dlsym_EVP_aes_128_ctr) GetProcAddress(openssl,  \
+  "EVP_aes_128_ctr");
+  if (dlsym_EVP_aes_256_ctr == NULL || dlsym_EVP_aes_128_ctr == NULL) {
+return -1;
+  }
+#endif
+  
+  return 0;
+}
{code}

If the first call to dlsym fails, the second call will clear the dlerror state. 
 So this isn't quite going to work, I think.
I think it would be easier to just use the LOAD_DYNAMIC_SYMBOL macro, and then 
check for the exception afterwards.  You'd need something like this:

{code}
void loadAes(void)
{
LOAD_DYNAMIC_SYMBOL(1...)
LOAD_DYNAMIC_SYMBOL(2...)
}

JNIEXPORT void JNICALL Java_org_apache_hadoop_crypto_OpensslCipher_initIDs
(JNIEnv *env, jclass clazz)
{
loadAes();
jthrowable jthr = (*env)->ExceptionOccurred();
if (jthr) {
(*env)->DeleteLocalRef(env, jthr);
THROW(...)
return;
}
...
}
{code}

Or something like that.  +1 once this is addressed

> Implementation of AES-CTR CryptoCodec using JNI to OpenSSL
> --
>
> Key: HADOOP-10693
> URL: https://issues.apache.org/jira/browse/HADOOP-10693
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10693.1.patch, HADOOP-10693.2.patch, 
> HADOOP-10693.3.patch, HADOOP-10693.4.patch, HADOOP-10693.5.patch, 
> HADOOP-10693.6.patch, HADOOP-10693.7.patch, HADOOP-10693.patch
>
>
> In HADOOP-10603, we have an implementation of AES-CTR CryptoCodec using Java 
> JCE provider. 
> To get high performance, the configured JCE provider should utilize native 
> code and AES-NI, but in JDK6,7 the Java embedded provider doesn't support it.
>  
> Considering not all hadoop user will use the provider like Diceros or able to 
> get signed certificate from oracle to develop a custom provider, so this JIRA 
> will have an implementation of AES-CTR CryptoCodec using JNI to OpenSSL 
> directly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10725) Implement listStatus and getFileInfo in the native client

2014-07-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051944#comment-14051944
 ] 

Colin Patrick McCabe commented on HADOOP-10725:
---

bq. (hadoop-native-core/src/main/native/fs/fs.c) Pull this out into a separate 
function? Seems like an operation that will have to be done frequently.

This is a bit of a special case just for the connection URI.  I guess the issue 
is that you have people connecting with stuff like "localhost:8020", which 
isn't technically a well-formed URI, but which we sort of have to handle (by 
looking at it as authority=localhost, port=8020).  On the other hand, when 
someone gives you a path that looks like "myfile:123", you just want to parse 
it with the standard URI parsing code.  We might need more massaging for files 
with colons in them later, but it's a bit of a grey area (see HDFS-13) so I'd 
like to avoid dealing with it for now.  For now, I'd like to keep this hack for 
the connection uri, but not for others.
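For illustration, this is how a stock URI parser treats the two shapes (plain 
{{java.net.URI}} here, just to show the ambiguity; the native client has its own 
parsing code):

{code}
import java.net.URI;

public class UriShapes {
  public static void main(String[] args) throws Exception {
    // "localhost:8020" parses as scheme "localhost" with opaque part "8020",
    // not as host + port, which is why the connection URI needs special handling.
    URI bare = new URI("localhost:8020");
    System.out.println(bare.getScheme() + " / " + bare.getHost()); // localhost / null

    // With an explicit scheme, the authority parses the way users expect.
    URI full = new URI("hdfs://localhost:8020/foo/bar");
    System.out.println(full.getHost() + ":" + full.getPort());     // localhost:8020
  }
}
{code}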

bq. (hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c) Should 
precedence be given to the explicitly defined "port" member or the pre-existing 
port in the URI? It seems like an explicit definition in the builder should 
take precedence?

So there are three options:
1. fail with error message (current behavior in trunk)
2. hdfsBuilderSetNameNodePort wins if set
3. URI port wins over hdfsBuilderSetNameNodePort if set

#2 is hard to implement for jniFS.  If you're given a URI such as 
hdfs://server:123/foo/bar, you'd have to replace 123 with whatever port you 
liked through string operations, prior to sending along the URI to the java 
code.

I wish we had never added {{hdfsBuilderSetNameNodePort}}... it's definitely 
superfluous, since the port can be in the URI.  Maybe we should just stick with 
option #1 for now and error out when there is a conflict.

bq. (hadoop-native-core/src/main/native/ndfs/ndfs.c) Is this how the previous 
HDFS clients worked? Using the previous seen filename won't work if the file 
has been removed. Just curious...

Yes, this is how the Java code works.  I don't think there's an issue with the 
previous filename getting removed, either.  Doing a listStatus with a filename 
just means that you want filenames that sort after that filename, not that you 
necessarily think there is such a filename.

bq. (hadoop-native-core/src/main/native/jni/jnifs.c) This code segment appears 
to be exactly the same as 
hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c. Maybe a utility 
function would be useful?

The src/main/native/libhdfs directory is going away, to be replaced by the 
jnifs/ directory.  I haven't done that yet, but it's just an svn delete, not a 
very interesting patch.

> Implement listStatus and getFileInfo in the native client
> -
>
> Key: HADOOP-10725
> URL: https://issues.apache.org/jira/browse/HADOOP-10725
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: native
>Affects Versions: HADOOP-10388
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HADOOP-10725-pnative.001.patch, 
> HADOOP-10725-pnative.002.patch
>
>
> Implement listStatus and getFileInfo in the native client.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10693) Implementation of AES-CTR CryptoCodec using JNI to OpenSSL

2014-07-03 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HADOOP-10693:


Attachment: HADOOP-10693.8.patch

Thanks [~cmccabe], I have updated the patch.

> Implementation of AES-CTR CryptoCodec using JNI to OpenSSL
> --
>
> Key: HADOOP-10693
> URL: https://issues.apache.org/jira/browse/HADOOP-10693
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10693.1.patch, HADOOP-10693.2.patch, 
> HADOOP-10693.3.patch, HADOOP-10693.4.patch, HADOOP-10693.5.patch, 
> HADOOP-10693.6.patch, HADOOP-10693.7.patch, HADOOP-10693.8.patch, 
> HADOOP-10693.patch
>
>
> In HADOOP-10603, we have an implementation of AES-CTR CryptoCodec using Java 
> JCE provider. 
> To get high performance, the configured JCE provider should utilize native 
> code and AES-NI, but in JDK6,7 the Java embedded provider doesn't support it.
>  
> Considering not all hadoop user will use the provider like Diceros or able to 
> get signed certificate from oracle to develop a custom provider, so this JIRA 
> will have an implementation of AES-CTR CryptoCodec using JNI to OpenSSL 
> directly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine

2014-07-03 Thread Alex Newman (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051959#comment-14051959
 ] 

Alex Newman commented on HADOOP-10641:
--

[~rakeshr] zab or zk?

> Introduce Coordination Engine
> -
>
> Key: HADOOP-10641
> URL: https://issues.apache.org/jira/browse/HADOOP-10641
> Project: Hadoop Common
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
> Attachments: HADOOP-10641.patch, HADOOP-10641.patch, 
> HADOOP-10641.patch
>
>
> Coordination Engine (CE) is a system which allows agreement on a sequence of 
> events in a distributed system. In order to be reliable, the CE should itself 
> be distributed.
> Coordination Engine can be based on different algorithms (paxos, raft, 2PC, 
> zab) and have different implementations, depending on use cases, reliability, 
> availability, and performance requirements.
> CE should have a common API, so that it could serve as a pluggable component 
> in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and 
> HBase (HBASE-10909).
> First implementation is proposed to be based on ZooKeeper.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider

2014-07-03 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated HADOOP-10719:
-

Status: Patch Available  (was: Open)

> Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
> ---
>
> Key: HADOOP-10719
> URL: https://issues.apache.org/jira/browse/HADOOP-10719
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, 
> HADOOP-10719.3.patch, HADOOP-10719.patch, HADOOP-10719.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch
>
>
> This is a follow up on 
> [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044]
> KeyProvider API should  have 2 new methods:
> * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv)
> * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion 
> encryptedKey)
> The implementation would do a known transformation on the IV (i.e.: xor with 
> 0xff the original IV).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10720) KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API

2014-07-03 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated HADOOP-10720:
-

Status: Patch Available  (was: Open)

> KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API
> ---
>
> Key: HADOOP-10720
> URL: https://issues.apache.org/jira/browse/HADOOP-10720
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: COMBO.patch, COMBO.patch, COMBO.patch, COMBO.patch, 
> COMBO.patch, HADOOP-10720.1.patch, HADOOP-10720.2.patch, HADOOP-10720.patch, 
> HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch
>
>
> KMS client/server should implement support for generating encrypted keys and 
> decrypting them via the REST API being introduced by HADOOP-10719.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10720) KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API

2014-07-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051987#comment-14051987
 ] 

Hadoop QA commented on HADOOP-10720:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12653792/HADOOP-10720.2.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4205//console

This message is automatically generated.

> KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API
> ---
>
> Key: HADOOP-10720
> URL: https://issues.apache.org/jira/browse/HADOOP-10720
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: COMBO.patch, COMBO.patch, COMBO.patch, COMBO.patch, 
> COMBO.patch, HADOOP-10720.1.patch, HADOOP-10720.2.patch, HADOOP-10720.patch, 
> HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch
>
>
> KMS client/server should implement support for generating encrypted keys and 
> decrypting them via the REST API being introduced by HADOOP-10719.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10693) Implementation of AES-CTR CryptoCodec using JNI to OpenSSL

2014-07-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051997#comment-14051997
 ] 

Colin Patrick McCabe commented on HADOOP-10693:
---

+1.  Thanks, Yi.

> Implementation of AES-CTR CryptoCodec using JNI to OpenSSL
> --
>
> Key: HADOOP-10693
> URL: https://issues.apache.org/jira/browse/HADOOP-10693
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10693.1.patch, HADOOP-10693.2.patch, 
> HADOOP-10693.3.patch, HADOOP-10693.4.patch, HADOOP-10693.5.patch, 
> HADOOP-10693.6.patch, HADOOP-10693.7.patch, HADOOP-10693.8.patch, 
> HADOOP-10693.patch
>
>
> In HADOOP-10603, we have an implementation of AES-CTR CryptoCodec using Java 
> JCE provider. 
> To get high performance, the configured JCE provider should utilize native 
> code and AES-NI, but in JDK6,7 the Java embedded provider doesn't support it.
>  
> Considering not all hadoop user will use the provider like Diceros or able to 
> get signed certificate from oracle to develop a custom provider, so this JIRA 
> will have an implementation of AES-CTR CryptoCodec using JNI to OpenSSL 
> directly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (HADOOP-10693) Implementation of AES-CTR CryptoCodec using JNI to OpenSSL

2014-07-03 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe resolved HADOOP-10693.
---

Resolution: Fixed

committed to fs-encryption branch

> Implementation of AES-CTR CryptoCodec using JNI to OpenSSL
> --
>
> Key: HADOOP-10693
> URL: https://issues.apache.org/jira/browse/HADOOP-10693
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10693.1.patch, HADOOP-10693.2.patch, 
> HADOOP-10693.3.patch, HADOOP-10693.4.patch, HADOOP-10693.5.patch, 
> HADOOP-10693.6.patch, HADOOP-10693.7.patch, HADOOP-10693.8.patch, 
> HADOOP-10693.patch
>
>
> In HADOOP-10603, we have an implementation of AES-CTR CryptoCodec using Java 
> JCE provider. 
> To get high performance, the configured JCE provider should utilize native 
> code and AES-NI, but in JDK6,7 the Java embedded provider doesn't support it.
>  
> Considering not all hadoop user will use the provider like Diceros or able to 
> get signed certificate from oracle to develop a custom provider, so this JIRA 
> will have an implementation of AES-CTR CryptoCodec using JNI to OpenSSL 
> directly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.

2014-07-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052008#comment-14052008
 ] 

Colin Patrick McCabe commented on HADOOP-10734:
---

bq. \[discussion of randomness methods\]

I don't think {{/dev/random}} is a practical choice, due to the blocking issue. 
 From what I've heard, {{/dev/urandom}} can often be a good choice.  I'm 
tempted to try a super-simple piece of code that just periodically fills a big 
buffer from {{/dev/urandom}} and see if that performs well.  I have a hunch 
that it would (and it would also use RDRAND on supported platforms.)  But I 
think it's fine if you want to provide an option to go through openssl as 
well... we already have a dependency on that library.
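Something along these lines is what I have in mind (Java, purely illustrative; 
the class name and buffer size are made up):

{code}
import java.io.FileInputStream;
import java.io.IOException;

/** Illustrative only: serve random bytes out of a buffer refilled from /dev/urandom. */
public class UrandomBuffer {
  private final byte[] buffer = new byte[1 << 20];   // 1 MB refill unit
  private int pos = buffer.length;                   // force a fill on first use

  public synchronized void nextBytes(byte[] out) throws IOException {
    int off = 0;
    while (off < out.length) {
      if (pos == buffer.length) {
        fill();
      }
      int n = Math.min(out.length - off, buffer.length - pos);
      System.arraycopy(buffer, pos, out, off, n);
      pos += n;
      off += n;
    }
  }

  private void fill() throws IOException {
    try (FileInputStream in = new FileInputStream("/dev/urandom")) {
      int read = 0;
      while (read < buffer.length) {
        int n = in.read(buffer, read, buffer.length - read);
        if (n < 0) {
          throw new IOException("unexpected EOF from /dev/urandom");
        }
        read += n;
      }
    }
    pos = 0;
  }
}
{code}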

bq. And we can also add enable flag in configuration and user can disable it.

I agree.  I think we should have a configuration option like 
{{encryption.random.number.generator}} which specifies a comma-separated list 
of class names to try to use.  That way a user could specify the openssl one 
plus a fallback to the standard java one if they so chose.  Or alternately, the 
user could enable just the java one (and configure it to use /dev/urandom) to 
get something which used RDRAND plus some additional randomness.  If you want 
to do this in a follow-on JIRA, that's OK too.
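Roughly, the fallback loop could look like this (the config key is the one 
proposed above; the OpenSSL-backed class name is just a placeholder):

{code}
import java.security.SecureRandom;
import org.apache.hadoop.conf.Configuration;

public class RandomFactory {
  static SecureRandom create(Configuration conf) {
    // Comma-separated, ordered list of implementations to try.
    String[] classes = conf.getStrings("encryption.random.number.generator",
        "org.apache.hadoop.crypto.random.OpenSslSecureRandom",  // hypothetical
        SecureRandom.class.getName());
    for (String name : classes) {
      try {
        return (SecureRandom) conf.getClassByName(name).newInstance();
      } catch (Exception e) {
        // fall through to the next configured implementation
      }
    }
    return new SecureRandom();
  }
}
{code}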

I think it's confusing that we have both 
{{org.apache.hadoop.crypto.random.SecureRandom}} and 
{{java.security.SecureRandom}}.  Maybe a better name for this new class would 
be {{OpenSslSecureRandom}} or something like that, to emphasize that it is 
using OpenSSL to get random bits.

{code}
+/**
+ * Utilize RdRand to return random numbers from hardware random number 
+ * generator. It's TRNG(True Random Number generators) having high 
performance. 
+ * https://wiki.openssl.org/index.php/Random_Numbers#Hardware
+ * http://en.wikipedia.org/wiki/RdRand
+ */
+static ENGINE * rdrand_init(JNIEnv *env)
{code}

I think the comment is a bit misleading here.  Openssl compiles on a lot of 
platforms that don't have RDRAND.  So all we really know here is that we're 
using openssl, not that we're using RDRAND.  I think it's appropriate to have a 
comment saying, "if you are using an Intel chipset with RDRAND, the 
high-performance random number generator will be used", or something like that. 
 But it's platform specific and we may be compiling on another platform.

{code}
+  @Test(timeout=12)
+  public void testRandomInt() throws Exception {
+SecureRandom random = new SecureRandom();
+
+int rand1 = random.nextInt();
+int rand2 = random.nextInt();
+Assert.assertFalse(rand1 == rand2);
+  }
{code}

It's definitely difficult to test something which is returning true random 
numbers.  It requires a lot of mathematics.  So I see why you did it this way.  
Just one comment... maybe I'm being overly paranoid here, but can we loop until 
rand2 is not equal to rand1?
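i.e. the quoted test reworked to loop (sketch only):

{code}
SecureRandom random = new SecureRandom();
int rand1 = random.nextInt();
int rand2 = random.nextInt();
while (rand1 == rand2) {   // keep drawing instead of asserting inequality once
  rand2 = random.nextInt();
}
Assert.assertFalse(rand1 == rand2);
{code}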

bq. I suppose you mean direct ByteBuffer. Per my understanding, merit of direct 
ByteBuffer is to avoid bytes copy. But SecureRandom#nextBytes will accept an 
pre-allocated byte[] array, if we use direct ByteBuffer for JNI, then there is 
additional copy in java layer, so the performance is the same, and we need to 
manage the direct ByteBuffer.

OK.

bq. Can you explain a bit more, I’m not sure I get your meaning. Per my 
understanding, pthread_t is defined in /usr/include/bits/pthreadtypes.h as

The stuff in {{/usr/include/bits}} is not public; it is an implementation 
detail that could change at any time.

from {{man pthread_self}}:
bq. POSIX.1 allows an implementation wide freedom in choosing the type used to 
represent a thread ID; for example, representation using either an arithmetic  
type  or a structure is permitted.  Therefore, variables of type pthread_t 
can't portably be compared using the C equality operator (==); use 
pthread_equal(3) instead.  Thread identifiers should be considered opaque: any 
attempt to use a thread ID other than in pthreads calls is nonportable and can 
lead to  unspecified results.

> Implementation of true secure random with high performance using hardware 
> random number generator.
> --
>
> Key: HADOOP-10734
> URL: https://issues.apache.org/jira/browse/HADOOP-10734
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10734.patch
>
>
> This JIRA is to implement Secure random using JNI to OpenSSL, and 
> implementation should be thread-safe.
> Utilize RdRand to return random numbers from hardware random number 
> g

[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider

2014-07-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052012#comment-14052012
 ] 

Hadoop QA commented on HADOOP-10719:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12653974/HADOOP-10719.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common:

  org.apache.hadoop.ha.TestZKFailoverControllerStress

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4206//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/4206//console

This message is automatically generated.

> Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
> ---
>
> Key: HADOOP-10719
> URL: https://issues.apache.org/jira/browse/HADOOP-10719
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
> Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, 
> HADOOP-10719.3.patch, HADOOP-10719.patch, HADOOP-10719.patch, 
> HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch
>
>
> This is a follow up on 
> [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044]
> KeyProvider API should  have 2 new methods:
> * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv)
> * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion 
> encryptedKey)
> The implementation would do a known transformation on the IV (i.e.: xor with 
> 0xff the original IV).



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.

2014-07-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052014#comment-14052014
 ] 

Alejandro Abdelnur commented on HADOOP-10734:
-

Any reason for not baking this into the CryptoCodec for OpenSSL instead of a new 
Java class?

> Implementation of true secure random with high performance using hardware 
> random number generator.
> --
>
> Key: HADOOP-10734
> URL: https://issues.apache.org/jira/browse/HADOOP-10734
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10734.patch
>
>
> This JIRA is to implement Secure random using JNI to OpenSSL, and 
> implementation should be thread-safe.
> Utilize RdRand to return random numbers from hardware random number 
> generator. It's TRNG(True Random Number generators) having much higher 
> performance than {{java.security.SecureRandom}}. 
> https://wiki.openssl.org/index.php/Random_Numbers
> http://en.wikipedia.org/wiki/RdRand
> https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-5353) add progress callback feature to the slow FileUtil operations with ability to cancel the work

2014-07-03 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HADOOP-5353:
--

Description: 
This is something only of relevance to people doing front ends to FS 
operations, and as they could take the code in FSUtil and add something with 
this feature, it's a blocker to none of them. 

Current FileUtil.copy can take a long time to move large files around, but 
there is no progress indicator to GUIs, or a way to cancel the operation 
mid-way, short of interrupting the thread or closing the filesystem.

I propose a FileIOProgress interface to the copy ops, one that had a single 
method to notify listeners of bytes read and written, and the number of files 
handled.

{code}
interface FileIOProgress {
 boolean progress(int files, long bytesRead, long bytesWritten);
}
{code}

The return value would be true to continue the operation, or false to stop the 
copy and leave the FS in whatever incomplete state it is in currently. 

It could even be fancier: have beginFileOperation and endFileOperation 
callbacks to pass in the name of the current file being worked on, though I 
don't have a personal need for that.

GUIs could show progress bars and cancel buttons, other tools could use the 
interface to pass any cancellation notice upstream.

The FileUtil.copy operations would call this interface (blocking) after every 
block copy, so the frequency of invocation would depend on block size and 
network/disk speeds. Which is also why I don't propose having any percentage 
done indicators; it's too hard to predict percentage of time done for 
distributed file IO with any degree of accuracy.

  was:
This is something only of relevance of people doing front ends to FS 
operations, and as they could take the code in FSUtil and add something with 
this feature, its a blocker to none of them. 

Current FileUtil.copy can take a long time to move large files around, but 
there is no progress indicator to GUIs, or a way to cancel the operation 
mid-way, short of interrupting the thread or closing the filesystem.

I propose a FileIOProgress interface to the copy ops, one that had a single 
method to notify listeners of bytes read and written, and the number of files 
handled.

{code}
interface FileIOProgress {
 boolean progress(int files, long bytesRead, long bytesWritten);
}

The return value would be true to continue the operation, or false to stop the 
copy and leave the FS in whatever incomplete state it is in currently. 

it could even be fancier: have  beginFileOperation and endFileOperation 
callbacks to pass in the name of the current file being worked on, though I 
don't have a personal need for that.

GUIs could show progress bars and cancel buttons, other tools could use the 
interface to pass any cancellation notice upstream.

The FileUtil.copy operations would call this interface (blocking) after every 
block copy, so the frequency of invocation would depend on block size and 
network/disk speeds. Which is also why I don't propose having any percentage 
done indicators; it's too hard to predict percentage of time done for 
distributed file IO with any degree of accuracy.


> add progress callback feature to the slow FileUtil operations with ability to 
> cancel the work
> -
>
> Key: HADOOP-5353
> URL: https://issues.apache.org/jira/browse/HADOOP-5353
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 0.21.0
>Reporter: Steve Loughran
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HADOOP-5353.000.patch
>
>
> This is something only of relevance of people doing front ends to FS 
> operations, and as they could take the code in FSUtil and add something with 
> this feature, its a blocker to none of them. 
> Current FileUtil.copy can take a long time to move large files around, but 
> there is no progress indicator to GUIs, or a way to cancel the operation 
> mid-way, j interrupting the thread or closing the filesystem.
> I propose a FileIOProgress interface to the copy ops, one that had a single 
> method to notify listeners of bytes read and written, and the number of files 
> handled.
> {code}
> interface FileIOProgress {
>  boolean progress(int files, long bytesRead, long bytesWritten);
> }
> The return value would be true to continue the operation, or false to stop 
> the copy and leave the FS in whatever incomplete state it is in currently. 
> it could even be fancier: have  beginFileOperation and endFileOperation 
> callbacks to pass in the name of the current file being worked on, though I 
> don't have a personal need for that.
> GUIs could show progress bars and cancel buttons, other tools could use the 
> interface to pass any cancellation notice upstream.

[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider

2014-07-03 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052049#comment-14052049
 ] 

Aaron T. Myers commented on HADOOP-10769:
-

bq. That seems pretty convoluted.

In the future, I would appreciate you explaining why a proposal seems convoluted. 
It seems quite straightforward to me, so I'm not sure how to address this comment. 
This proposal is an attempt to compromise and address your concern, which I 
understood to be not wanting to have this method baked into the {{KeyProvider}} 
interface, thus allowing some implementations to use it and others not.

bq. A. For #1 can't we have a standalone DelegationTokenClient component - 
especially since there is another jira for refactoring delegation token support 
out into common to be more reusable? Such a client could then potentially be 
used inside the KMSClientKeyProvider.

That JIRA seems to me to be orthogonal to this one, so I don't think we should 
couple the two. How the {{KmsClientKeyProvider}} gets tokens under the hood 
shouldn't have anything to do with the API. Also, as you point out later in 
question C, it will still be necessary for the submitting code to somehow 
call/interact with the tokens/credentials of the KeyProvider at submission 
time, so I don't think it's actually possible to entirely encapsulate the 
delegation token fetching/storage within the {{KeyProvider}} implementation.

bq. B. Wouldn't it be better if providers that know they need delegation tokens 
were able to handle #2 themselves?

How about changing the proposal to mimic what's done in FileSystem today and 
add a method like "{{public Token[] addDelegationTokens(final String 
renewer, Credentials credentials)}}" to the {{KeyProvider}} API? The default 
behavior would be to add no tokens to the provided {{Credentials}} object, but 
the {{KmsClientKeyProvider}} could instead fetch and stash away the tokens in 
the provided {{Credentials}} object.
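
A rough sketch of what that default-no-op method could look like; the wrapper class name is hypothetical and exists only to keep the example compilable without restating all of {{KeyProvider}}:

{code}
// Hypothetical wrapper name; the proposal is to put this method directly on
// KeyProvider.  Default behavior adds no tokens; a KMS-backed provider would
// override it, fetch its delegation tokens, and stash them in 'credentials'.
import java.io.IOException;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;

public abstract class DelegationTokenIssuingKeyProvider {
  public Token<?>[] addDelegationTokens(String renewer, Credentials credentials)
      throws IOException {
    return new Token<?>[0];
  }
}
{code}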

bq. C. How is #3 above going to be handled using the current interfaces - I 
don't see how it is being added to the interaction currently?

I believe this will happen transparently, because the tokens contained in the 
{{Credentials}} object will be added to the UGI object which will then be used 
to authenticate all the RPCs. The {{KeyProvider}} shouldn't need access to the 
tokens in the tasks.
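
For example, submission-time code could look roughly like this (again purely illustrative; it reuses the hypothetical provider class sketched above, and no such method exists on {{KeyProvider}} today):

{code}
// Purely illustrative submission-time flow.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.security.Credentials;

public class SubmitWithProviderTokens {
  static void submit(Configuration conf, DelegationTokenIssuingKeyProvider provider,
      String renewer) throws IOException, InterruptedException, ClassNotFoundException {
    Job job = Job.getInstance(conf, "job-reading-encrypted-data");
    Credentials creds = job.getCredentials();
    // The provider fetches its delegation tokens at submission time, while we
    // still hold Kerberos credentials, and stashes them in the job Credentials.
    provider.addDelegationTokens(renewer, creds);
    // The Credentials travel with the job; at runtime they are added to the
    // task UGI, which authenticates the tasks' RPCs back to the KMS.
    job.submit();
  }
}
{code}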

bq. D. If the KMSClientKeyProvider had access to the credentials object ( 
already have access to UserKeyProvider) or some other execution context itself 
then could that be a way that #3 could be addressed?

If I'm understanding you correctly, I think this is basically the same as what I'm 
proposing above in response to your question B. Am I right about that? Will 
this work for you?

> Add getDelegationToken() method to KeyProvider
> --
>
> Key: HADOOP-10769
> URL: https://issues.apache.org/jira/browse/HADOOP-10769
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
>
> The KeyProvider API needs to return delegation tokens to enable access to the 
> KeyProvider from processes without Kerberos credentials (ie Yarn containers).
> This is required for HDFS encryption and KMS integration.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider

2014-07-03 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052059#comment-14052059
 ] 

Larry McCay commented on HADOOP-10769:
--

Hey [~atm] - Sorry for the "convoluted" statement - that came across stronger 
than I intended. I just really don't like the instanceof check in there to 
return an instance of an extension or null. I do understand the motivation but 
think that something simpler would be better.

Your proposal is actually very similar to my context proposal but limited to 
delegation tokens. In general, I am in favor of this approach. Do you think 
that we could make it more generic though? Out of curiosity, why does it return 
an array of Tokens?

If we were to open it up to include other things, like keys or passwords, etc 
then we could just make it an add credentials method call:
{{HashMap addToCredentials(HashMap props, Credentials creds)}}
or
{{HashMap setupCredentials(HashMap props, Credentials creds)}}

The renewer would be in the props when a given provider expects it.
But we could also include the keyversions of the keys we know about at 
submission time and have them added.
We could provide the names of passwords that may be needed by a given provider 
as well.

Still not sure how the returned tokens are used in your proposal but they could 
be returned in the hashmap in this proposal - as well as anything else that 
would make sense.

We would just need a couple well-known property names to represent:
* renewer
* keyversions
* passwords
* returned tokens?

Does that make any sense?

> Add getDelegationToken() method to KeyProvider
> --
>
> Key: HADOOP-10769
> URL: https://issues.apache.org/jira/browse/HADOOP-10769
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
>
> The KeyProvider API needs to return delegation tokens to enable access to the 
> KeyProvider from processes without Kerberos credentials (ie Yarn containers).
> This is required for HDFS encryption and KMS integration.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider

2014-07-03 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052061#comment-14052061
 ] 

Larry McCay commented on HADOOP-10769:
--

Of course, both of those proposals seem very strange to be part of the 
UserProvider which actually sits on top of the credentials object. :/

> Add getDelegationToken() method to KeyProvider
> --
>
> Key: HADOOP-10769
> URL: https://issues.apache.org/jira/browse/HADOOP-10769
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
>
> The KeyProvider API needs to return delegation tokens to enable access to the 
> KeyProvider from processes without Kerberos credentials (ie Yarn containers).
> This is required for HDFS encryption and KMS integration.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider

2014-07-03 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052066#comment-14052066
 ] 

Aaron T. Myers commented on HADOOP-10769:
-

bq. Do you think that we could make it more generic though?

I'm sure we could, but I suggest we cross that bridge when we come to it. 
Hadoop currently does delegated authentication via {{DelegationTokens}} 
everywhere, so let's do something to support that and move on. If in the future 
we have need for other stuff, we'll amend the API appropriately. Seems quite 
premature to me to attempt to design a generic API when we don't have any 
concrete alternate use-cases.

bq. Out of curiosity, why does it return an array of Tokens?

The various callers use it for different things, e.g. in some places just to 
log which tokens were renewed. I don't think it's actually integral to the 
functioning of the API, just a convenience.

bq. If we were to open it up to include other things, like keys or passwords, 
etc then we could just make it an add credentials method call:

In general I'm really leery of a {{HashMap}}-based API. That 
seems quite fragile to me, and very overly-generic for the common use case of 
just dealing with DTs.

How about as a way forward with this JIRA we go with the "{{public Token[] 
addDelegationTokens(final String renewer, Credentials credentials)}}" added to 
{{KeyProvider}} as I proposed, and revisit a more generic API in the future 
when we actually have a concrete need for it? We could then perhaps later add a 
"{{addAdditionalCredentials}}" API call or something to accommodate 
non-DT-based implementations. It is *soft*ware, after all. :)

> Add getDelegationToken() method to KeyProvider
> --
>
> Key: HADOOP-10769
> URL: https://issues.apache.org/jira/browse/HADOOP-10769
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
>
> The KeyProvider API needs to return delegation tokens to enable access to the 
> KeyProvider from processes without Kerberos credentials (ie Yarn containers).
> This is required for HDFS encryption and KMS integration.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider

2014-07-03 Thread Larry McCay (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052078#comment-14052078
 ] 

Larry McCay commented on HADOOP-10769:
--

Well, it isn't much different than the original getDelegationToken proposal 
this way.

So - you don't think that it makes sense to add a method that can move a list 
of specified keyversions into the credentials object?
That seems to imply that all keys will be fetched at runtime, rather than 
adding the ones we know about at submission time.

Incidentally, we wouldn't have to make *that* generic - we could come up with a 
type-safe context that includes the same properties (a rough sketch follows the list below):
* renewer
* keyversions
* passwords (can leave this one out until we need it)
* returned tokens (if they are needed)
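
{code}
// Hypothetical, illustrative only: a type-safe context carrying the properties
// listed above instead of a HashMap.
import java.util.List;
import org.apache.hadoop.security.token.Token;

public class KeyProviderCredentialContext {
  private String renewer;                  // who may renew any obtained tokens
  private List<String> keyVersions;        // key versions known at submission time
  private List<String> passwordNames;      // optional, can be added when needed
  private List<Token<?>> returnedTokens;   // tokens handed back by the provider

  public String getRenewer() { return renewer; }
  public void setRenewer(String renewer) { this.renewer = renewer; }
  public List<String> getKeyVersions() { return keyVersions; }
  public void setKeyVersions(List<String> keyVersions) { this.keyVersions = keyVersions; }
  public List<String> getPasswordNames() { return passwordNames; }
  public void setPasswordNames(List<String> passwordNames) { this.passwordNames = passwordNames; }
  public List<Token<?>> getReturnedTokens() { return returnedTokens; }
  public void setReturnedTokens(List<Token<?>> returnedTokens) { this.returnedTokens = returnedTokens; }
}
{code}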

Anyway, I've beaten this one to death. 
Thanks for accommodating my nit-picking.

I think that I'll let [~owen.omalley] weigh in when he is back.

> Add getDelegationToken() method to KeyProvider
> --
>
> Key: HADOOP-10769
> URL: https://issues.apache.org/jira/browse/HADOOP-10769
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Alejandro Abdelnur
>Assignee: Arun Suresh
>
> The KeyProvider API needs to return delegation tokens to enable access to the 
> KeyProvider from processes without Kerberos credentials (ie Yarn containers).
> This is required for HDFS encryption and KMS integration.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.

2014-07-03 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052086#comment-14052086
 ] 

Yi Liu commented on HADOOP-10734:
-

Thanks [~cmccabe] for your review, I will update the patch and respond to you 
later.

> Implementation of true secure random with high performance using hardware 
> random number generator.
> --
>
> Key: HADOOP-10734
> URL: https://issues.apache.org/jira/browse/HADOOP-10734
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10734.patch
>
>
> This JIRA is to implement Secure random using JNI to OpenSSL, and 
> implementation should be thread-safe.
> Utilize RdRand to return random numbers from hardware random number 
> generator. It's TRNG(True Random Number generators) having much higher 
> performance than {{java.security.SecureRandom}}. 
> https://wiki.openssl.org/index.php/Random_Numbers
> http://en.wikipedia.org/wiki/RdRand
> https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.

2014-07-03 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052094#comment-14052094
 ] 

Yi Liu commented on HADOOP-10734:
-

Thanks [~tucu00] for your comments. 

OpenSSL secure random is separate functionality; it is used in 
{{OpensslAesCtrCryptoCodec}} and can also be used directly. We should have a 
Java class coupled with the JNI implementation. It's not suitable to put
{code}
private native static void initSR();
private native boolean nextRandBytes(byte[] bytes);
{code}
into {{OpensslCipher}}. Having two classes makes the code clearer. 
{{OpensslAesCtrCryptoCodec}} doesn't contain native methods directly; it will 
use {{OpensslCipher}} and the OpenSSL secure random class.


> Implementation of true secure random with high performance using hardware 
> random number generator.
> --
>
> Key: HADOOP-10734
> URL: https://issues.apache.org/jira/browse/HADOOP-10734
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: security
>Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134)
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: fs-encryption (HADOOP-10150 and HDFS-6134)
>
> Attachments: HADOOP-10734.patch
>
>
> This JIRA is to implement Secure random using JNI to OpenSSL, and 
> implementation should be thread-safe.
> Utilize RdRand to return random numbers from hardware random number 
> generator. It's TRNG(True Random Number generators) having much higher 
> performance than {{java.security.SecureRandom}}. 
> https://wiki.openssl.org/index.php/Random_Numbers
> http://en.wikipedia.org/wiki/RdRand
> https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-8808) Update FsShell documentation to mention deprecation of some of the commands, and mention alternatives

2014-07-03 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052178#comment-14052178
 ] 

Akira AJISAKA commented on HADOOP-8808:
---

The test failure is not related to the patch. It was reported at HADOOP-10406.

> Update FsShell documentation to mention deprecation of some of the commands, 
> and mention alternatives
> -
>
> Key: HADOOP-8808
> URL: https://issues.apache.org/jira/browse/HADOOP-8808
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation, fs
>Affects Versions: 2.2.0
>Reporter: Hemanth Yamijala
>Assignee: Akira AJISAKA
> Attachments: HADOOP-8808.2.patch, HADOOP-8808.patch
>
>
> In HADOOP-7286, we deprecated the following 3 commands dus, lsr and rmr, in 
> favour of du -s, ls -r and rm -r respectively. The FsShell documentation 
> should be updated to mention these, so that users can start switching. Also, 
> there are places where we refer to the deprecated commands as alternatives. 
> This can be changed as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10468) TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately

2014-07-03 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052185#comment-14052185
 ] 

Akira AJISAKA commented on HADOOP-10468:


bq. I'm hoping there's a way we can fix the underlying issue without breaking 
existing metrics2 property files
I agree with you, [~jlowe]. I'm trying to find a way.

> TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately
> ---
>
> Key: HADOOP-10468
> URL: https://issues.apache.org/jira/browse/HADOOP-10468
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.5.0
>
> Attachments: HADOOP-10468.000.patch, HADOOP-10468.001.patch
>
>
> {{TestMetricsSystemImpl.testMultiThreadedPublish}} can fail intermediately 
> due to the insufficient size of the sink queue:
> {code}
> 2014-04-06 21:34:55,269 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> 2014-04-06 21:34:55,270 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> 2014-04-06 21:34:55,271 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> {code}
> The unit test should increase the default queue size to avoid intermediate 
> failure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10780) namenode throws java.lang.OutOfMemoryError upon DatanodeProtocol.versionRequest from datanode

2014-07-03 Thread Dmitry Sivachenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Sivachenko updated HADOOP-10780:
---

Status: Patch Available  (was: Open)

This is because of the incorrect type of the buf_sz variable in 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c,
function hadoop_user_info_alloc(void):

Currently its type is size_t (which is unsigned), and during the assignment
buf_sz = sysconf(_SC_GETPW_R_SIZE_MAX);
sysconf() can return -1 (and it does on FreeBSD). So buf_sz gets a very large 
positive value (the unsigned equivalent of -1), and then malloc() fails with 
OutOfMemory.

The correct solution is to change the type of buf_sz to long (because 
sysconf() returns long).

> namenode throws java.lang.OutOfMemoryError upon 
> DatanodeProtocol.versionRequest from datanode
> -
>
> Key: HADOOP-10780
> URL: https://issues.apache.org/jira/browse/HADOOP-10780
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.1
> Environment: FreeBSD-10/stable
> openjdk version "1.7.0_60"
> OpenJDK Runtime Environment (build 1.7.0_60-b19)
> OpenJDK 64-Bit Server VM (build 24.60-b09, mixed mode)
>Reporter: Dmitry Sivachenko
>
> I am trying hadoop-2.4.1 on FreeBSD-10/stable.
> namenode starts up, but after first datanode contacts it, it throws an 
> exception.
> All limits seem to be high enough:
> % limits -a
> Resource limits (current):
>   cputime  infinity secs
>   filesize infinity kB
>   datasize 33554432 kB
>   stacksize  524288 kB
>   coredumpsize infinity kB
>   memoryuseinfinity kB
>   memorylocked infinity kB
>   maxprocesses   122778
>   openfiles  14
>   sbsize   infinity bytes
>   vmemoryuse   infinity kB
>   pseudo-terminals infinity
>   swapuse  infinity kB
> 14944  1  S0:06.59 /usr/local/openjdk7/bin/java -Dproc_namenode 
> -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop 
> -Dhadoop.log.file=hadoop-hdfs-namenode-nezabudka3-00.log 
> -Dhadoop.home.dir=/usr/local -Dhadoop.id.str=hdfs 
> -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml 
> -Djava.net.preferIPv4Stack=true -Xmx32768m -Xms32768m 
> -Djava.library.path=/usr/local/lib -Xmx32768m -Xms32768m 
> -Djava.library.path=/usr/local/lib -Xmx32768m -Xms32768m 
> -Djava.library.path=/usr/local/lib -Dhadoop.security.logger=INFO,RFAS 
> org.apache.hadoop.hdfs.server.namenode.NameNode
> From the namenode's log:
> 2014-07-03 23:28:15,070 WARN  [IPC Server handler 5 on 8020] ipc.Server 
> (Server.java:run(2032)) - IPC Server handler 5 on 8020, call 
> org.apache.hadoop.hdfs.server.protocol.Datano
> deProtocol.versionRequest from 5.255.231.209:57749 Call#842 Retry#0
> java.lang.OutOfMemoryError
> at 
> org.apache.hadoop.security.JniBasedUnixGroupsMapping.getGroupsForUser(Native 
> Method)
> at 
> org.apache.hadoop.security.JniBasedUnixGroupsMapping.getGroups(JniBasedUnixGroupsMapping.java:80)
> at 
> org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:50)
> at org.apache.hadoop.security.Groups.getGroups(Groups.java:139)
> at 
> org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1417)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.(FSPermissionChecker.java:81)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3331)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:5491)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.versionRequest(NameNodeRpcServer.java:1082)
> at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.versionRequest(DatanodeProtocolServerSideTranslatorPB.java:234)
> at 
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28069)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

[jira] [Created] (HADOOP-10781) Unportable getgrouplist() usage breaks FreeBSD

2014-07-03 Thread Dmitry Sivachenko (JIRA)
Dmitry Sivachenko created HADOOP-10781:
--

 Summary: Unportable getgrouplist() usage breaks FreeBSD
 Key: HADOOP-10781
 URL: https://issues.apache.org/jira/browse/HADOOP-10781
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.4.1
Reporter: Dmitry Sivachenko


getgrouplist() has different return values on Linux and FreeBSD:
Linux: either the number of groups (positive) or -1 on error
FreeBSD: 0 on success or -1 on error

The return value of getgrouplist() is analyzed in Linux-specific way in 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c,
 in function hadoop_user_info_getgroups() which breaks FreeBSD.

In this function you have 3 choices for the return value 
ret = getgrouplist(uinfo->pwd.pw_name, uinfo->pwd.pw_gid,
 uinfo->gids, &ngroups);

1) ret > 0 : OK for Linux, it will be zero on FreeBSD.  I propose to change 
this to ret >= 0
2) First condition is false and ret != -1:  impossible according to manpage
3) ret == -1 -- an error, handled the same way on both Linux and FreeBSD

So I propose to change "ret > 0" to "ret >= 0" and (optionally) remove the 2nd case.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load

2014-07-03 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HADOOP-10597:
-

Attachment: HADOOP-10597-2.patch
RPCClientBackoffDesignAndEvaluation.pdf

[~lohit] provided some feedback. Here is the design document with some 
evaluation results. The updated patch also includes unit tests and makes the 
server-side retry policy pluggable.

> Evaluate if we can have RPC client back off when server is under heavy load
> ---
>
> Key: HADOOP-10597
> URL: https://issues.apache.org/jira/browse/HADOOP-10597
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HADOOP-10597-2.patch, HADOOP-10597.patch, 
> RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently if an application hits NN too hard, RPC requests be in blocking 
> state, assuming OS connection doesn't run out. Alternatively RPC or NN can 
> throw some well defined exception back to the client based on certain 
> policies when it is under heavy load; client will understand such exception 
> and do exponential back off, as another implementation of 
> RetryInvocationHandler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10468) TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately

2014-07-03 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052200#comment-14052200
 ] 

Akira AJISAKA commented on HADOOP-10468:


Attaching a patch to revert HADOOP-10468 and change 'Collector' to lower case.
{code}
-  .add("Test.sink.Collector." + MetricsConfig.QUEUE_CAPACITY_KEY,
+  .add("test.sink.collector." + MetricsConfig.QUEUE_CAPACITY_KEY,
{code}
{code}
-ms.registerSink("Collector",
+ms.registerSink("collector",
{code}
Using a debugger, I confirmed the queue capacity of the sink was set to 10.

> TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately
> ---
>
> Key: HADOOP-10468
> URL: https://issues.apache.org/jira/browse/HADOOP-10468
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.5.0
>
> Attachments: HADOOP-10468.000.patch, HADOOP-10468.001.patch
>
>
> {{TestMetricsSystemImpl.testMultiThreadedPublish}} can fail intermediately 
> due to the insufficient size of the sink queue:
> {code}
> 2014-04-06 21:34:55,269 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> 2014-04-06 21:34:55,270 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> 2014-04-06 21:34:55,271 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> {code}
> The unit test should increase the default queue size to avoid intermediate 
> failure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10468) TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately

2014-07-03 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HADOOP-10468:
---

Attachment: HADOOP-10468.2.patch

> TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately
> ---
>
> Key: HADOOP-10468
> URL: https://issues.apache.org/jira/browse/HADOOP-10468
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.5.0
>
> Attachments: HADOOP-10468.000.patch, HADOOP-10468.001.patch, 
> HADOOP-10468.2.patch
>
>
> {{TestMetricsSystemImpl.testMultiThreadedPublish}} can fail intermediately 
> due to the insufficient size of the sink queue:
> {code}
> 2014-04-06 21:34:55,269 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> 2014-04-06 21:34:55,270 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> 2014-04-06 21:34:55,271 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> {code}
> The unit test should increase the default queue size to avoid intermediate 
> failure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load

2014-07-03 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HADOOP-10597:
-

Status: Patch Available  (was: Open)

> Evaluate if we can have RPC client back off when server is under heavy load
> ---
>
> Key: HADOOP-10597
> URL: https://issues.apache.org/jira/browse/HADOOP-10597
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HADOOP-10597-2.patch, HADOOP-10597.patch, 
> RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently if an application hits NN too hard, RPC requests be in blocking 
> state, assuming OS connection doesn't run out. Alternatively RPC or NN can 
> throw some well defined exception back to the client based on certain 
> policies when it is under heavy load; client will understand such exception 
> and do exponential back off, as another implementation of 
> RetryInvocationHandler.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10468) TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately

2014-07-03 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HADOOP-10468:
---

Affects Version/s: 2.5.0
   Status: Patch Available  (was: Reopened)

> TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately
> ---
>
> Key: HADOOP-10468
> URL: https://issues.apache.org/jira/browse/HADOOP-10468
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 2.5.0
>
> Attachments: HADOOP-10468.000.patch, HADOOP-10468.001.patch, 
> HADOOP-10468.2.patch
>
>
> {{TestMetricsSystemImpl.testMultiThreadedPublish}} can fail intermediately 
> due to the insufficient size of the sink queue:
> {code}
> 2014-04-06 21:34:55,269 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> 2014-04-06 21:34:55,270 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> 2014-04-06 21:34:55,271 WARN  impl.MetricsSinkAdapter 
> (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full 
> queue and can't consume the given metrics.
> {code}
> The unit test should increase the default queue size to avoid intermediate 
> failure.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HADOOP-10781) Unportable getgrouplist() usage breaks FreeBSD

2014-07-03 Thread Dmitry Sivachenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Sivachenko updated HADOOP-10781:
---

Status: Patch Available  (was: Open)

--- 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c.bak
   2014-06-21 09:40:12.0 +0400
+++ 
hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c
   2014-07-04 10:53:05.0 +0400
@@ -193,7 +193,7 @@
   ngroups = uinfo->gids_size;
   ret = getgrouplist(uinfo->pwd.pw_name, uinfo->pwd.pw_gid, 
  uinfo->gids, &ngroups);
-  if (ret > 0) {
+  if (ret > 0 /* Linux */ || ret == 0 /* FreeBSD */) {
 uinfo->num_gids = ngroups;
 ret = put_primary_gid_first(uinfo);
 if (ret) {


> Unportable getgrouplist() usage breaks FreeBSD
> --
>
> Key: HADOOP-10781
> URL: https://issues.apache.org/jira/browse/HADOOP-10781
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.1
>Reporter: Dmitry Sivachenko
>
> getgrouplist() has different return values on Linux and FreeBSD:
> Linux: either the number of groups (positive) or -1 on error
> FreeBSD: 0 on success or -1 on error
> The return value of getgrouplist() is analyzed in Linux-specific way in 
> hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c,
>  in function hadoop_user_info_getgroups() which breaks FreeBSD.
> In this function you have 3 choices for the return value 
> ret = getgrouplist(uinfo->pwd.pw_name, uinfo->pwd.pw_gid,
>  uinfo->gids, &ngroups);
> 1) ret > 0 : OK for Linux, it will be zero on FreeBSD.  I propose to change 
> this to ret >= 0
> 2) First condition is false and ret != -1:  impossible according to manpage
> 3) ret == 1 -- OK for both Linux and FreeBSD
> So I propose to change "ret > 0" to "ret >= 0" and (optionally) return 2nd 
> case.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HADOOP-10781) Unportable getgrouplist() usage breaks FreeBSD

2014-07-03 Thread Dmitry Sivachenko (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052209#comment-14052209
 ] 

Dmitry Sivachenko commented on HADOOP-10781:


And also remove this code:

-  } else if (ret != -1) {
-// Any return code that is not -1 is considered as error.
-// Since the user lookup was successful, there should be at least one
-// group for this user.
-return EIO;


Because according to manpage this is impossible.

> Unportable getgrouplist() usage breaks FreeBSD
> --
>
> Key: HADOOP-10781
> URL: https://issues.apache.org/jira/browse/HADOOP-10781
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.4.1
>Reporter: Dmitry Sivachenko
>
> getgrouplist() has different return values on Linux and FreeBSD:
> Linux: either the number of groups (positive) or -1 on error
> FreeBSD: 0 on success or -1 on error
> The return value of getgrouplist() is analyzed in Linux-specific way in 
> hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c,
>  in function hadoop_user_info_getgroups() which breaks FreeBSD.
> In this function you have 3 choices for the return value 
> ret = getgrouplist(uinfo->pwd.pw_name, uinfo->pwd.pw_gid,
>  uinfo->gids, &ngroups);
> 1) ret > 0 : OK for Linux, it will be zero on FreeBSD.  I propose to change 
> this to ret >= 0
> 2) First condition is false and ret != -1:  impossible according to manpage
> 3) ret == 1 -- OK for both Linux and FreeBSD
> So I propose to change "ret > 0" to "ret >= 0" and (optionally) return 2nd 
> case.



--
This message was sent by Atlassian JIRA
(v6.2#6252)