[jira] [Created] (HADOOP-10779) Generalize DFS_PERMISSIONS_SUPERUSERGROUP_KEY for any HCFS
Martin Bukatovic created HADOOP-10779: - Summary: Generalize DFS_PERMISSIONS_SUPERUSERGROUP_KEY for any HCFS Key: HADOOP-10779 URL: https://issues.apache.org/jira/browse/HADOOP-10779 Project: Hadoop Common Issue Type: Wish Components: fs Reporter: Martin Bukatovic Priority: Minor

HDFS has a configuration option {{dfs.permissions.superusergroup}} stored in the {{hdfs-site.xml}} configuration file:

{noformat}
<property>
  <name>dfs.permissions.superusergroup</name>
  <value>supergroup</value>
  <description>The name of the group of super-users.</description>
</property>
{noformat}

Since we have the option to use alternative Hadoop filesystems (HCFS), the question is how to specify a supergroup in that case. E.g. would it make sense to introduce an HCFS option for this in, say, {{core-site.xml}}, as shown below?

{noformat}
<property>
  <name>hcfs.permissions.superusergroup</name>
  <value>${dfs.permissions.superusergroup}</value>
  <description>The name of the group of super-users.</description>
</property>
{noformat}

Or would you solve it in a different way? I would like to at least declare a recommended approach for alternative Hadoop filesystems to follow. -- This message was sent by Atlassian JIRA (v6.2#6252)
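Whatever the property ends up being called, the check any HCFS would make with it is simple group membership. A minimal sketch, with a hypothetical helper name and the issue's proposed default; none of this is an existing Hadoop API:

```java
import java.util.Set;

public class SuperGroupCheck {
    // Default mirrors hdfs-site.xml's dfs.permissions.superusergroup default.
    static final String DEFAULT_SUPERGROUP = "supergroup";

    // A user is a superuser if one of their groups matches the configured
    // supergroup name; null means the property was not set.
    static boolean isSuperUser(Set<String> userGroups, String configuredSupergroup) {
        String group = configuredSupergroup != null ? configuredSupergroup : DEFAULT_SUPERGROUP;
        return userGroups.contains(group);
    }

    public static void main(String[] args) {
        System.out.println(isSuperUser(Set.of("staff", "supergroup"), null)); // true
        System.out.println(isSuperUser(Set.of("staff"), "admins"));           // false
    }
}
```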
[jira] [Commented] (HADOOP-10392) Use FileSystem#makeQualified(Path) instead of Path#makeQualified(FileSystem)
[ https://issues.apache.org/jira/browse/HADOOP-10392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051186#comment-14051186 ] Hadoop QA commented on HADOOP-10392:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12644781/HADOOP-10392.4.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 24 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-examples hadoop-tools/hadoop-archives hadoop-tools/hadoop-extras hadoop-tools/hadoop-gridmix hadoop-tools/hadoop-openstack hadoop-tools/hadoop-rumen hadoop-tools/hadoop-streaming: org.apache.hadoop.mapred.pipes.TestPipeApplication
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4204//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4204//console

This message is automatically generated.
> Use FileSystem#makeQualified(Path) instead of Path#makeQualified(FileSystem) > > > Key: HADOOP-10392 > URL: https://issues.apache.org/jira/browse/HADOOP-10392 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs >Affects Versions: 2.3.0 >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA >Priority: Minor > Labels: newbie > Attachments: HADOOP-10392.2.patch, HADOOP-10392.3.patch, > HADOOP-10392.4.patch, HADOOP-10392.4.patch, HADOOP-10392.patch > > > There are some methods calling Path.makeQualified(FileSystem), which cause > javac warnings. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051191#comment-14051191 ] Steve Loughran commented on HADOOP-9361: Andrew: thanks, will merge in today. Juan: thanks for the testing. The entire Swift test suite is skipped if there's no auth-keys file, though we could migrate that to the contract-test-options.xml file. The reason for that policy is that some of the junit 3 test suites that are subclassed for hadoop-common-test aren't skippable (a junit 3 limitation); this is why in Hadoop common the s3 & ftp tests don't start with Test*. While the contract tests are designed to be self-skipping, and so logged in test reports, I left the junit 3 stuff with a Test profile: you can't really test the swift client without the settings, except for some minor unit tests. Jay: tighter exceptions provide more information to clients and let you explicitly catch by type in your code, e.g. {{catch (EOFException e)}}. Generic IOExceptions with text have to be caught as IOE and then tested, and are incredibly brittle to changes in text. That's why I didn't rename text messages from exceptions in the common filesystems, even when I tightened their class: we don't know which callers are searching for the text. Whenever you can, use explicit types. I also recommend using constants for text, constants that tests can look for, and in those tests use {{Exception.toString().contains()}} as the check, not equality, so that if more details are added the test still works.
> Strictly define the expected behavior of filesystem APIs and write tests to > verify compliance > - > > Key: HADOOP-9361 > URL: https://issues.apache.org/jira/browse/HADOOP-9361 > Project: Hadoop Common > Issue Type: Improvement > Components: fs, test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Blocker > Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, > HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, > HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, > HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, > HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, > HADOOP-9361-016.patch, HADOOP-9361-017.patch, HADOOP-9361-018.patch, > HADOOP-9361.awang-addendum.patch > > > {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while > HDFS gets tested downstream, other filesystems, such as blobstore bindings, > don't. > The only tests that are common are those of {{FileSystemContractTestBase}}, > which HADOOP-9258 shows is incomplete. > I propose > # writing more tests which clarify expected behavior > # testing operations in the interface being in their own JUnit4 test classes, > instead of one big test suite. > # Having each FS declare via a properties file what behaviors they offer, > such as atomic-rename, atomic-delete, umask, immediate-consistency -test > methods can downgrade to skipped test cases if a feature is missing. -- This message was sent by Atlassian JIRA (v6.2#6252)
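The catch-by-type point in the comment above can be shown with a stdlib-only sketch (the class and method names here are invented for the example): an {{EOFException}} is a subclass of {{IOException}}, so a caller can distinguish the failure without parsing message text.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class TighterExceptions {
    // Reading past the end of a stream raises EOFException, a subclass of
    // IOException, so callers can catch the specific failure by type.
    static String classify() {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(new byte[2]));
        try {
            in.readInt();            // needs 4 bytes; only 2 are available
            return "ok";
        } catch (EOFException e) {   // specific type: robust to message changes
            return "eof";
        } catch (IOException e) {    // generic fallback for everything else
            return "io";
        }
    }

    public static void main(String[] args) {
        // A brittle alternative would be: catch (IOException e) and then
        // test e.getMessage() text, which breaks whenever the text changes.
        System.out.println(classify()); // prints "eof"
    }
}
```

Tests checking such failures are then best written with `toString().contains(...)` against a message constant, as the comment recommends, so extra detail added later does not break them.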
[jira] [Commented] (HADOOP-10778) Use NativeCrc32 only if it is faster
[ https://issues.apache.org/jira/browse/HADOOP-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051197#comment-14051197 ] Steve Loughran commented on HADOOP-10778: - This is interesting - the speedup may depend on the CPU as well as other factors. Maybe the number "512" could be provided by the NativeCRC code itself, so it can make those decisions based on its knowledge of things ... the pending Arm patch could supply a different number than x86 parts, etc. > Use NativeCrc32 only if it is faster > > > Key: HADOOP-10778 > URL: https://issues.apache.org/jira/browse/HADOOP-10778 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Tsz Wo Nicholas Sze >Assignee: Tsz Wo Nicholas Sze > Attachments: c10778_20140702.patch > > > From the benchmark post in [this > comment|https://issues.apache.org/jira/browse/HDFS-6560?focusedCommentId=14044060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14044060], > NativeCrc32 is slower than java.util.zip.CRC32 for Java 7 and above when > bytesPerChecksum > 512. -- This message was sent by Atlassian JIRA (v6.2#6252)
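The pure-Java implementation in question is {{java.util.zip.CRC32}}. A minimal sketch of a threshold-based chooser, assuming the shape suggested by the discussion (the constant name, cutoff semantics and method names are illustrative, not the actual patch):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class CrcChooser {
    // Hypothetical cutoff: the linked benchmark suggests the JDK CRC32 wins
    // when bytesPerChecksum > 512 on Java 7+. The comment above proposes
    // letting the native code supply this number per-architecture.
    static final int NATIVE_CUTOFF = 512;

    static boolean useNative(int bytesPerChecksum, boolean nativeAvailable) {
        return nativeAvailable && bytesPerChecksum <= NATIVE_CUTOFF;
    }

    // The fallback path: java.util.zip.CRC32 (standard CRC-32/ISO-HDLC).
    static long jdkCrc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        // 0xCBF43926 is the well-known CRC-32 check value for "123456789".
        System.out.printf("%08X%n", jdkCrc32(check));
    }
}
```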
[jira] [Updated] (HADOOP-10720) KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API
[ https://issues.apache.org/jira/browse/HADOOP-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated HADOOP-10720: - Attachment: HADOOP-10720.2.patch [~tucu00], Updated with your suggestions. Thanks for the pointer on using Guava {{Cache}} instead of {{ConcurrentHashMap}}. bq. KMSClientProvider.java, keyQueueFiller runnable, the for loop should clone the keyQueues.entry(), even if the Map is a concurrent one. Cloning the Entry object does not seem to be possible, so I made a copy of the Map before iterating. If the number of keys is really large, this might take a slight perf hit, I guess. > KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API > --- > > Key: HADOOP-10720 > URL: https://issues.apache.org/jira/browse/HADOOP-10720 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: COMBO.patch, COMBO.patch, COMBO.patch, COMBO.patch, > COMBO.patch, HADOOP-10720.1.patch, HADOOP-10720.2.patch, HADOOP-10720.patch, > HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch > > > KMS client/server should implement support for generating encrypted keys and > decrypting them via the REST API being introduced by HADOOP-10719. -- This message was sent by Atlassian JIRA (v6.2#6252)
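The copy-before-iterate idea from the comment can be sketched with stdlib types (the names below are invented for the example, not the KMSClientProvider code): copying a {{ConcurrentHashMap}} into a plain map gives the filler thread a stable snapshot, at the cost of one O(n) copy per pass.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class QueueFillerSketch {
    // Copy the live map so the filler iterates over a stable snapshot;
    // concurrent writers can keep mutating the original meanwhile.
    static Map<String, Integer> snapshot(ConcurrentHashMap<String, Integer> live) {
        return new HashMap<>(live);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> keyQueues = new ConcurrentHashMap<>();
        keyQueues.put("key1", 3);
        Map<String, Integer> copy = snapshot(keyQueues);
        keyQueues.put("key2", 5);        // mutation after the snapshot
        System.out.println(copy.size()); // snapshot is unaffected: prints 1
    }
}
```

Note that iterating the concurrent map directly is also legal (its iterators are weakly consistent, never throwing ConcurrentModificationException); the snapshot just pins down exactly which entries one pass of the filler sees.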
[jira] [Commented] (HADOOP-10650) Add ability to specify a reverse ACL (black list) of users and groups
[ https://issues.apache.org/jira/browse/HADOOP-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051258#comment-14051258 ] Benoy Antony commented on HADOOP-10650: --- hi [~daryn], Could you please review this patch ? > Add ability to specify a reverse ACL (black list) of users and groups > - > > Key: HADOOP-10650 > URL: https://issues.apache.org/jira/browse/HADOOP-10650 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Reporter: Benoy Antony >Assignee: Benoy Antony > Attachments: HADOOP-10650.patch, HADOOP-10650.patch > > > Currently , it is possible to define a ACL (user and groups) for a service. > To temporarily remove authorization for a set of users, administrator needs > to remove the users from the specific group and this may be a lengthy process > ( update ldap groups, flush caches on machines). > If there is a facility to define a reverse ACL for services, then > administrator can disable users by specifying the users in reverse ACL. In > other words, one can specify a whitelist of users and groups as well as a > blacklist of users and groups. > One can also specify a default blacklist to disable the users from accessing > any service. -- This message was sent by Atlassian JIRA (v6.2#6252)
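The whitelist-plus-blacklist semantics described above can be sketched in a few lines (a hypothetical helper, not the patch's actual ServiceAuthorizationManager code): the blacklist always wins, so a blacklisted user is denied even if whitelisted.

```java
import java.util.Set;

public class AclCheck {
    // Reverse-ACL semantics as proposed: membership in the blacklist
    // overrides membership in the whitelist.
    static boolean isAuthorized(String user, Set<String> whitelist, Set<String> blacklist) {
        return whitelist.contains(user) && !blacklist.contains(user);
    }

    public static void main(String[] args) {
        Set<String> allow = Set.of("alice", "bob");
        Set<String> deny = Set.of("bob");
        System.out.println(isAuthorized("alice", allow, deny)); // true
        System.out.println(isAuthorized("bob", allow, deny));   // false: blacklisted
    }
}
```

This is what makes the feature useful operationally: disabling a user is one config change, with no LDAP group edits or cache flushes.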
[jira] [Updated] (HADOOP-10774) Update KerberosTestUtils for hadoop-auth tests when using IBM Java
[ https://issues.apache.org/jira/browse/HADOOP-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-10774: Environment: AIX > Update KerberosTestUtils for hadoop-auth tests when using IBM Java > -- > > Key: HADOOP-10774 > URL: https://issues.apache.org/jira/browse/HADOOP-10774 > Project: Hadoop Common > Issue Type: Bug > Components: security >Affects Versions: 2.4.1 > Environment: AIX >Reporter: sangamesh > Attachments: HADOOP-10774.patch > > > There are two issues if IBM Java is used while testing hadoop-auth tests. > > 1) Bad JAAS configuration: unrecognized option: isInitiator > 2) Cannot retrieve key from keytab HTTP/localh...@example.com > #1 is caused because isInitiator isn't defined when we use IBM Java. There is > already a defect in jira https://issues.apache.org/jira/browse/SENTRY-169 >But we need to apply it to KerberosTestUtils.java for some tests in > hadoop-auth to pass. > #2 is caused because, for IBM_JAVA, the keytab file must be an absolute path with > file:// as the prefix for the useKeytab option. >But the file path is relative. This change will work with both openjdk & > IBM_JAVA. > > The attached patch will resolve all failures that happen when IBM Java is used. -- This message was sent by Atlassian JIRA (v6.2#6252)
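The absolute-path fix for issue #2 can be sketched with the stdlib alone (the helper name is invented; the actual patch edits KerberosTestUtils.java): {{File.toURI()}} always yields an absolute {{file:}} URI, even when the input path is relative, which is the portable way to build the useKeytab value for both OpenJDK and IBM Java.

```java
import java.io.File;

public class KeytabPath {
    // The IBM JDK's Kerberos login module wants an absolute keytab
    // location with a file: prefix for the useKeytab option; deriving it
    // from the (possibly relative) path keeps the tests JDK-neutral.
    // Note: File.toURI() emits "file:/abs/path" (single slash form).
    static String keytabUri(String path) {
        return new File(path).getAbsoluteFile().toURI().toString();
    }

    public static void main(String[] args) {
        System.out.println(keytabUri("krb5.keytab"));
    }
}
```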
[jira] [Commented] (HADOOP-10772) Generating RPMs for common, hdfs, httpfs, mapreduce , yarn and tools
[ https://issues.apache.org/jira/browse/HADOOP-10772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051285#comment-14051285 ] Steve Loughran commented on HADOOP-10772: - eric -we had RPMs in the past but they were undermaintained, not tested and had scripts that weren't up to date. What we get from bigtop isn't just the RPM packaging, but the tests of the RPMs, init.d scripts etc. if bigtop is x86-only, that's a bug in bigtop > Generating RPMs for common, hdfs, httpfs, mapreduce , yarn and tools > - > > Key: HADOOP-10772 > URL: https://issues.apache.org/jira/browse/HADOOP-10772 > Project: Hadoop Common > Issue Type: Improvement > Components: build >Reporter: Jinghui Wang >Assignee: Jinghui Wang > Attachments: HADOOP-10772.patch > > > Generating RPMs for hadoop-common, hadoop-hdfs, hadoop-hdfs-httpfs, > hadoop-mapreduce , hadoop-yarn-project and hadoop-tools-dist with dist build > profile. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10774) Update KerberosTestUtils for hadoop-auth tests when using IBM Java
[ https://issues.apache.org/jira/browse/HADOOP-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sangamesh updated HADOOP-10774: --- Environment: AIX OS: RHEL (64 bit), Ubuntu (64bit) was:AIX > Update KerberosTestUtils for hadoop-auth tests when using IBM Java > -- > > Key: HADOOP-10774 > URL: https://issues.apache.org/jira/browse/HADOOP-10774 > Project: Hadoop Common > Issue Type: Bug > Components: security >Affects Versions: 2.4.1 > Environment: AIX > OS: RHEL (64 bit), Ubuntu (64bit) >Reporter: sangamesh > Attachments: HADOOP-10774.patch > > > There are two issues if IBM Java is used to while testing hadoop-auth tests. > > 1) Bad JAAS configuration: unrecognized option: isInitiator > 2) Cannot retrieve key from keytab HTTP/localh...@example.com > #1 Is caused as isInitiator isn't defined when we use IBM JAVA. There is > already a defect in jira https://issues.apache.org/jira/browse/SENTRY-169 >But we need to apply it to KerberosTestUtils.java for some tests in > hadoop-auth to pass. > #2 IS caused as, For IBM_JAVA keytab file must be a absolute path with > file:// as the prefix for the useKeytab option. >But the file path is relative. This change will work with both openjdk & > IBM_JAVA. > > Attached patch will resolve all failures happening if we use IBM Java. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10774) Update KerberosTestUtils for hadoop-auth tests when using IBM Java
[ https://issues.apache.org/jira/browse/HADOOP-10774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sangamesh updated HADOOP-10774: --- Environment: AIX RHEL (64 bit), Ubuntu (64bit) was: AIX OS: RHEL (64 bit), Ubuntu (64bit) > Update KerberosTestUtils for hadoop-auth tests when using IBM Java > -- > > Key: HADOOP-10774 > URL: https://issues.apache.org/jira/browse/HADOOP-10774 > Project: Hadoop Common > Issue Type: Bug > Components: security >Affects Versions: 2.4.1 > Environment: AIX > RHEL (64 bit), Ubuntu (64bit) >Reporter: sangamesh > Attachments: HADOOP-10774.patch > > > There are two issues if IBM Java is used to while testing hadoop-auth tests. > > 1) Bad JAAS configuration: unrecognized option: isInitiator > 2) Cannot retrieve key from keytab HTTP/localh...@example.com > #1 Is caused as isInitiator isn't defined when we use IBM JAVA. There is > already a defect in jira https://issues.apache.org/jira/browse/SENTRY-169 >But we need to apply it to KerberosTestUtils.java for some tests in > hadoop-auth to pass. > #2 IS caused as, For IBM_JAVA keytab file must be a absolute path with > file:// as the prefix for the useKeytab option. >But the file path is relative. This change will work with both openjdk & > IBM_JAVA. > > Attached patch will resolve all failures happening if we use IBM Java. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10312) Shell.ExitCodeException to have more useful toString
[ https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-10312: Resolution: Fixed Fix Version/s: 2.5.0 Status: Resolved (was: Patch Available) > Shell.ExitCodeException to have more useful toString > > > Key: HADOOP-10312 > URL: https://issues.apache.org/jira/browse/HADOOP-10312 > Project: Hadoop Common > Issue Type: Bug > Components: util >Affects Versions: 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch > > > Shell's ExitCodeException doesn't include the exit code in the toString > value, so isn't that useful in diagnosing container start failures in YARN -- This message was sent by Atlassian JIRA (v6.2#6252)
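The shape of the fix can be sketched with stdlib types (the class below and its exact toString format are illustrative, not the committed Shell.java code): carry the exit code in the exception and surface it in {{toString()}}, so YARN container-launch diagnostics show the code without anyone having to catch and unpack the exception.

```java
import java.io.IOException;

public class ExitCodeDemo {
    // Sketch: an IOException subclass that keeps the process exit code
    // and includes it in toString(), which is what ends up in logs.
    static class ExitCodeException extends IOException {
        final int exitCode;

        ExitCodeException(int exitCode, String message) {
            super(message);
            this.exitCode = exitCode;
        }

        @Override
        public String toString() {
            return "ExitCodeException exitCode=" + exitCode + ": " + getMessage();
        }
    }

    public static void main(String[] args) {
        // prints "ExitCodeException exitCode=127: command not found"
        System.out.println(new ExitCodeException(127, "command not found"));
    }
}
```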
[jira] [Assigned] (HADOOP-10533) S3 input stream NPEs in MapReduce job
[ https://issues.apache.org/jira/browse/HADOOP-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran reassigned HADOOP-10533: --- Assignee: Steve Loughran > S3 input stream NPEs in MapReduce jon > - > > Key: HADOOP-10533 > URL: https://issues.apache.org/jira/browse/HADOOP-10533 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 1.0.0, 1.0.3, 3.0.0, 2.4.0 > Environment: Hadoop with default configurations >Reporter: Benjamin Kim >Assignee: Steve Loughran >Priority: Minor > > I'm running a wordcount MR as follows > hadoop jar WordCount.jar wordcount.WordCountDriver > s3n://bucket/wordcount/input s3n://bucket/wordcount/output > > s3n://bucket/wordcount/input is a s3 object that contains other input files. > However I get following NPE error > 12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0% > 12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0% > 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : > attempt_201210021853_0001_m_01_0, Status : FAILED > java.lang.NullPointerException > at > org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106) > at java.io.BufferedInputStream.close(BufferedInputStream.java:451) > at java.io.FilterInputStream.close(FilterInputStream.java:155) > at org.apache.hadoop.util.LineReader.close(LineReader.java:83) > at > org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) > at org.apache.hadoop.mapred.Child$4.run(Child.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) > at 
org.apache.hadoop.mapred.Child.main(Child.java:249) > MR runs fine if i specify more specific input path such as > s3n://bucket/wordcount/input/file.txt > MR fails if I pass s3 folder as a parameter > In summary, > This works > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/ > This doesn't work > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/ > (both input path are directories) -- This message was sent by Atlassian JIRA (v6.2#6252)
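The stack trace points at {{NativeS3FsInputStream.close()}} dereferencing an inner stream that is null. A generic null-guarded close can be sketched with stdlib types (this is an illustrative wrapper, not the actual NativeS3FileSystem fix):

```java
import java.io.IOException;
import java.io.InputStream;

public class GuardedStream {
    // Sketch: guard the delegate before delegating, so close() is safe
    // even when the inner stream was never opened or was already cleared.
    static class SafeCloseStream extends InputStream {
        private InputStream in;   // may be null

        SafeCloseStream(InputStream in) {
            this.in = in;
        }

        @Override
        public int read() throws IOException {
            if (in == null) {
                throw new IOException("stream closed");
            }
            return in.read();
        }

        @Override
        public void close() throws IOException {
            if (in != null) {     // the guard that prevents the NPE
                in.close();
                in = null;        // make close() idempotent
            }
        }
    }

    public static void main(String[] args) throws IOException {
        SafeCloseStream s = new SafeCloseStream(null);
        s.close();                // no NullPointerException
        System.out.println("closed ok");
    }
}
```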
[jira] [Commented] (HADOOP-10312) Shell.ExitCodeException to have more useful toString
[ https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051349#comment-14051349 ] Hudson commented on HADOOP-10312: - SUCCESS: Integrated in Hadoop-trunk-Commit #5817 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5817/]) HADOOP-10312 Shell.ExitCodeException to have more useful toString (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607591) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java > Shell.ExitCodeException to have more useful toString > > > Key: HADOOP-10312 > URL: https://issues.apache.org/jira/browse/HADOOP-10312 > Project: Hadoop Common > Issue Type: Bug > Components: util >Affects Versions: 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch > > > Shell's ExitCodeException doesn't include the exit code in the toString > value, so isn't that useful in diagnosing container start failures in YARN -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9495) Define behaviour of Seekable.seek(), write tests, fix all hadoop implementations for compliance
[ https://issues.apache.org/jira/browse/HADOOP-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9495: --- Issue Type: Improvement (was: Sub-task) Parent: (was: HADOOP-9361) > Define behaviour of Seekable.seek(), write tests, fix all hadoop > implementations for compliance > --- > > Key: HADOOP-9495 > URL: https://issues.apache.org/jira/browse/HADOOP-9495 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 1.2.0, 3.0.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-9495.patch, HADOOP-9545-002.patch > > > {{Seekable.seek()}} seems a good starting point for specifying, testing and > implementing FS API compliance: one method, relatively non-ambiguous > semantics, and its use in the Hadoop codebase is easily assessed. Specify and test it > first -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9565) Add a Blobstore interface to add to blobstore FileSystems
[ https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9565: --- Issue Type: Improvement (was: Sub-task) Parent: (was: HADOOP-9361) > Add a Blobstore interface to add to blobstore FileSystems > - > > Key: HADOOP-9565 > URL: https://issues.apache.org/jira/browse/HADOOP-9565 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 2.0.4-alpha >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > > We can make explicit the fact that some {{FileSystem}} implementations are really > blobstores, with different atomicity and consistency guarantees, by adding a > {{Blobstore}} interface to them. > This could also be a place to add a {{Copy(Path,Path)}} method, assuming that > all blobstores implement a server-side copy operation as a substitute for > rename. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9651) Filesystems to throw FileAlreadyExistsException in createFile(path, overwrite=false) when the file exists
[ https://issues.apache.org/jira/browse/HADOOP-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9651: --- Status: Open (was: Patch Available) incorporated in the uber-JIRA HADOOP-9361 > Filesystems to throw FileAlreadyExistsException in createFile(path, > overwrite=false) when the file exists > - > > Key: HADOOP-9651 > URL: https://issues.apache.org/jira/browse/HADOOP-9651 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs >Affects Versions: 2.1.0-beta >Reporter: Steve Loughran >Priority: Minor > Attachments: HADOOP-9651.patch > > > While HDFS and other filesystems throw a {{FileAlreadyExistsException}} if > you try to create a file that exists and you have set {{overwrite=false}}, > {{RawLocalFileSystem}} throws a plain {{IOException}}. This makes it > impossible to distinguish a create operation failing due to a fixable problem > (the file is there) from something more fundamental. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9371) Define Semantics of FileSystem more rigorously
[ https://issues.apache.org/jira/browse/HADOOP-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9371: --- Summary: Define Semantics of FileSystem more rigorously (was: Define Semantics of FileSystem and FileContext more rigorously) > Define Semantics of FileSystem more rigorously > -- > > Key: HADOOP-9371 > URL: https://issues.apache.org/jira/browse/HADOOP-9371 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs >Affects Versions: 1.2.0, 3.0.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: HADOOP-9361.2.patch, HADOOP-9361.patch, > HADOOP-9371-003.patch, HadoopFilesystemContract.pdf > > Original Estimate: 48h > Remaining Estimate: 48h > > The semantics of {{FileSystem}} and {{FileContext}} are not completely > defined in terms of > # core expectations of a filesystem > # consistency requirements. > # concurrency requirements. > # minimum scale limits > Furthermore, methods are not defined strictly enough in terms of their > outcomes and failure modes. > The requirements and method semantics should be defined more strictly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HADOOP-9597) FileSystem open() API is not clear if FileNotFoundException is thrown when the path does not exist
[ https://issues.apache.org/jira/browse/HADOOP-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran reassigned HADOOP-9597: -- Assignee: Steve Loughran > FileSystem open() API is not clear if FileNotFoundException is throw when the > path does not exist > - > > Key: HADOOP-9597 > URL: https://issues.apache.org/jira/browse/HADOOP-9597 > Project: Hadoop Common > Issue Type: Sub-task > Components: documentation, fs >Affects Versions: 2.0.4-alpha >Reporter: Jerry He >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > > The current FileSystem open() method throws a generic IOException in its API > specification. > Some FileSystem implementations (DFS, RawLocalFileSystem ...) throws more > specific FileNotFoundException if the path does not exist. Some throws > IOException only (FTPFileSystem, HftpFileSystem ...). > If we have a new FileSystem implementation, what should we follow exactly for > open()? > What should the application expect in this case. > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-9597) FileSystem open() API is not clear if FileNotFoundException is thrown when the path does not exist
[ https://issues.apache.org/jira/browse/HADOOP-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-9597. Resolution: Done Fix Version/s: 2.5.0 > FileSystem open() API is not clear if FileNotFoundException is throw when the > path does not exist > - > > Key: HADOOP-9597 > URL: https://issues.apache.org/jira/browse/HADOOP-9597 > Project: Hadoop Common > Issue Type: Sub-task > Components: documentation, fs >Affects Versions: 2.0.4-alpha >Reporter: Jerry He >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > > The current FileSystem open() method throws a generic IOException in its API > specification. > Some FileSystem implementations (DFS, RawLocalFileSystem ...) throws more > specific FileNotFoundException if the path does not exist. Some throws > IOException only (FTPFileSystem, HftpFileSystem ...). > If we have a new FileSystem implementation, what should we follow exactly for > open()? > What should the application expect in this case. > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9597) FileSystem open() API is not clear if FileNotFoundException is thrown when the path does not exist
[ https://issues.apache.org/jira/browse/HADOOP-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051361#comment-14051361 ] Steve Loughran commented on HADOOP-9597: It's {{FileNotFoundException()}}; the parent JIRA makes sure it is this in all filesystems that it has coverage for. Marking as done > FileSystem open() API is not clear if FileNotFoundException is throw when the > path does not exist > - > > Key: HADOOP-9597 > URL: https://issues.apache.org/jira/browse/HADOOP-9597 > Project: Hadoop Common > Issue Type: Sub-task > Components: documentation, fs >Affects Versions: 2.0.4-alpha >Reporter: Jerry He >Priority: Minor > Fix For: 2.5.0 > > > The current FileSystem open() method throws a generic IOException in its API > specification. > Some FileSystem implementations (DFS, RawLocalFileSystem ...) throws more > specific FileNotFoundException if the path does not exist. Some throws > IOException only (FTPFileSystem, HftpFileSystem ...). > If we have a new FileSystem implementation, what should we follow exactly for > open()? > What should the application expect in this case. > -- This message was sent by Atlassian JIRA (v6.2#6252)
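The contract being pinned down here is the same one {{java.io}} already follows, and a stdlib-only sketch shows why callers care (method and path names below are invented for the example): {{FileNotFoundException}} is a subclass of {{IOException}}, so "no such file" is distinguishable from other I/O failures by type alone.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class OpenMissing {
    // Opening a missing path should fail with FileNotFoundException,
    // letting callers branch on the failure type rather than message text.
    static String openResult(String path) {
        try (FileInputStream in = new FileInputStream(path)) {
            return "opened";
        } catch (FileNotFoundException e) {  // the specific contract
            return "not-found";
        } catch (IOException e) {            // everything else
            return "io-error";
        }
    }

    public static void main(String[] args) {
        System.out.println(openResult("/no/such/file/definitely-missing")); // not-found
    }
}
```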
[jira] [Updated] (HADOOP-10246) define FS permissions model with tests
[ https://issues.apache.org/jira/browse/HADOOP-10246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-10246: Issue Type: Improvement (was: Sub-task) Parent: (was: HADOOP-9361) > define FS permissions model with tests > -- > > Key: HADOOP-10246 > URL: https://issues.apache.org/jira/browse/HADOOP-10246 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Steve Loughran >Priority: Minor > > It's interesting that HDFS mkdirs(dir, permission) uses the umask, but > setPermissions() does not > The permissions model, including umask logic should be defined and have tests > implemented by those filesystems that support permissions-based security -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051375#comment-14051375 ] Hudson commented on HADOOP-9361: SUCCESS: Integrated in Hadoop-trunk-Commit #5818 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5818/]) HADOOP-9361: Strictly define FileSystem APIs - HDFS portion (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607597) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/HDFSContract.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractAppend.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractConcat.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractCreate.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractDelete.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractMkdir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractOpen.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractRename.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractRootDirectory.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractSeek.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/contract * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/contract/hdfs.xml HADOOP-9361: Strictly define FileSystem APIs (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607596) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BufferedFSInputStream.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataOutputStream.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSExceptionMessages.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSInputChecker.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/RawLocalFileSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ftp/FTPFileSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ftp/FTPInputStream.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/s3/S3FileSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/s3native/Jets3tNativeFileSystemStore.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/s3native/NativeS3FileSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/extending.md * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/filesystem.md * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/index.md * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/introduction.md * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/model.md * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/notation.md * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/markdown/filesystem/testing.md * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestLocalFileSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract * /hadoop/comm
[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051384#comment-14051384 ] Hudson commented on HADOOP-9361: SUCCESS: Integrated in Hadoop-trunk-Commit #5819 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5819/]) HADOOP-9361: site and gitignore (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607601) * /hadoop/common/trunk/.gitignore * /hadoop/common/trunk/hadoop-project/src/site/site.xml HADOOP-9361: Strictly define FileSystem APIs - OpenStack portion (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607600) * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/exceptions/SwiftNotDirectoryException.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/exceptions/SwiftPathExistsException.java HADOOP-9361: Strictly define FileSystem APIs - OpenStack portion (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607599) * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/StrictBufferedFSInputStream.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeFileSystem.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeFileSystemStore.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeInputStream.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeOutputStream.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemBasicOps.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemContract.java * 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemRename.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/SwiftContract.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractCreate.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractDelete.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractMkdir.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractOpen.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractRename.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractRootDir.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractSeek.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/hdfs2/TestV2LsOperations.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/resources/contract * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/resources/contract/swift.xml > Strictly define the expected behavior of filesystem APIs and write tests to > verify compliance > - > > Key: HADOOP-9361 > URL: https://issues.apache.org/jira/browse/HADOOP-9361 > Project: Hadoop Common > Issue Type: Improvement > Components: fs, test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Blocker > Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, > 
HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, > HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, > HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, > HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, > HADOOP-9361-016.patch, HADOOP-9361-017.patch, HADOOP-9361-018.patch, > HADOOP-9361.awang-addendum.patch > > > {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while > HDFS gets tested downstream, other filesystems, such as blobstore bindings, > don't. > The only tests that are common are those of {{FileSystemContractTestBase}}, > which HADOOP-9258 shows is incomplete. > I propose > # writing more tests which clarify expected b
[jira] [Resolved] (HADOOP-10419) BufferedFSInputStream NPEs on getPos() on a closed stream
[ https://issues.apache.org/jira/browse/HADOOP-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-10419. - Resolution: Fixed Fix Version/s: 2.5.0 > BufferedFSInputStream NPEs on getPos() on a closed stream > - > > Key: HADOOP-10419 > URL: https://issues.apache.org/jira/browse/HADOOP-10419 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 3.0.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > If you call getPos on a {{ChecksumFileSystem}} after a {{close()}} you get an > NPE. > While throwing an exception in this state is legitimate (HDFS does, RawLocal > does not), it should be an {{IOException}} -- This message was sent by Atlassian JIRA (v6.2#6252)
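The fix described above amounts to a closed-state guard, so that getPos() on a closed stream fails with a well-defined IOException instead of an NPE. A minimal sketch (class and field names are invented for illustration, not the actual Hadoop patch):

```java
import java.io.IOException;

// Sketch: track closed state explicitly and fail getPos() with an
// IOException rather than dereferencing a nulled-out buffer.
class GuardedInputStream {
    private long pos = 0;
    private boolean closed = false;

    public long getPos() throws IOException {
        if (closed) {
            throw new IOException("Stream is closed");
        }
        return pos;
    }

    public void close() {
        closed = true;  // real code would also release the wrapped stream
    }
}
```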
[jira] [Resolved] (HADOOP-9711) Write contract tests for S3Native; fix places where it breaks
[ https://issues.apache.org/jira/browse/HADOOP-9711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-9711. Resolution: Fixed Fix Version/s: 2.5.0 Assignee: Steve Loughran > Write contract tests for S3Native; fix places where it breaks > - > > Key: HADOOP-9711 > URL: https://issues.apache.org/jira/browse/HADOOP-9711 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 1.2.0, 3.0.0, 2.1.0-beta >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > Attachments: HADOOP-9711-004.patch > > > implement the abstract contract tests for S3, identify where it is failing to > meet expectations and, where possible, fix. Blobstores tend to treat 0 byte > files as directories, so tests overwriting files with dirs and vice versa may > fail and have to be skipped -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-10533) S3 input stream NPEs in MapReduce job
[ https://issues.apache.org/jira/browse/HADOOP-10533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-10533. - Resolution: Fixed Fix Version/s: 2.5.0 > S3 input stream NPEs in MapReduce jon > - > > Key: HADOOP-10533 > URL: https://issues.apache.org/jira/browse/HADOOP-10533 > Project: Hadoop Common > Issue Type: Bug > Components: fs/s3 >Affects Versions: 1.0.0, 1.0.3, 3.0.0, 2.4.0 > Environment: Hadoop with default configurations >Reporter: Benjamin Kim >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > > I'm running a wordcount MR as follows > hadoop jar WordCount.jar wordcount.WordCountDriver > s3n://bucket/wordcount/input s3n://bucket/wordcount/output > > s3n://bucket/wordcount/input is a s3 object that contains other input files. > However I get following NPE error > 12/10/02 18:56:23 INFO mapred.JobClient: map 0% reduce 0% > 12/10/02 18:56:54 INFO mapred.JobClient: map 50% reduce 0% > 12/10/02 18:56:56 INFO mapred.JobClient: Task Id : > attempt_201210021853_0001_m_01_0, Status : FAILED > java.lang.NullPointerException > at > org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.close(NativeS3FileSystem.java:106) > at java.io.BufferedInputStream.close(BufferedInputStream.java:451) > at java.io.FilterInputStream.close(FilterInputStream.java:155) > at org.apache.hadoop.util.LineReader.close(LineReader.java:83) > at > org.apache.hadoop.mapreduce.lib.input.LineRecordReader.close(LineRecordReader.java:144) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:497) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) > at org.apache.hadoop.mapred.Child$4.run(Child.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) > at 
org.apache.hadoop.mapred.Child.main(Child.java:249) > MR runs fine if i specify more specific input path such as > s3n://bucket/wordcount/input/file.txt > MR fails if I pass s3 folder as a parameter > In summary, > This works > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > /user/hadoop/wordcount/input/ s3n://bucket/wordcount/output/ > This doesn't work > hadoop jar ./hadoop-examples-1.0.3.jar wordcount > s3n://bucket/wordcount/input/ s3n://bucket/wordcount/output/ > (both input path are directories) -- This message was sent by Atlassian JIRA (v6.2#6252)
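The NPE in {{NativeS3FsInputStream.close()}} in the stack trace above is the classic double-close/never-opened-stream pattern. A hedged sketch of the usual defensive fix (names are invented; this is not the actual Hadoop patch):

```java
import java.io.IOException;
import java.io.InputStream;

// Sketch: guard the wrapped stream so close() is idempotent and safe
// even when the inner stream was never opened.
class SafeCloseStream {
    private InputStream in;

    SafeCloseStream(InputStream in) {
        this.in = in;
    }

    public void close() throws IOException {
        if (in != null) {
            in.close();
            in = null;   // second close() becomes a no-op, not an NPE
        }
    }
}
```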
[jira] [Resolved] (HADOOP-9495) Define behaviour of Seekable.seek(), write tests, fix all hadoop implementations for compliance
[ https://issues.apache.org/jira/browse/HADOOP-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-9495. Resolution: Fixed Fix Version/s: 2.5.0 > Define behaviour of Seekable.seek(), write tests, fix all hadoop > implementations for compliance > --- > > Key: HADOOP-9495 > URL: https://issues.apache.org/jira/browse/HADOOP-9495 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 1.2.0, 3.0.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Fix For: 2.5.0 > > Attachments: HADOOP-9495.patch, HADOOP-9545-002.patch > > > {{Seekable.seek()}} seems a good starting point for specifying, testing and > implementing FS API compliance: one method, relatively non-ambiguous > semantics, and its use in the Hadoop codebase is easily assessed. Specify and test it > first -- This message was sent by Atlassian JIRA (v6.2#6252)
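As an illustration of the kind of ambiguity such a specification resolves, here is a hypothetical seek() with fully pinned-down failure modes: a negative offset fails with an IOException, and seeking past EOF fails eagerly with an EOFException. Whether past-EOF seeks fail eagerly or only on the next read is exactly the sort of choice the contract work documents; this sketch is not the actual contract text.

```java
import java.io.EOFException;
import java.io.IOException;

// Sketch of a seek() whose edge-case behaviour is fully specified.
class SeekableSketch {
    private final long length;
    private long pos = 0;

    SeekableSketch(long length) {
        this.length = length;
    }

    public void seek(long target) throws IOException {
        if (target < 0) {
            throw new IOException("Cannot seek to negative offset " + target);
        }
        if (target > length) {
            // This sketch fails eagerly; an implementation could instead
            // defer the failure to the next read().
            throw new EOFException("Seek past end of file: " + target);
        }
        pos = target;
    }

    public long getPos() {
        return pos;
    }
}
```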
[jira] [Updated] (HADOOP-9712) Write contract tests for FTP filesystem, fix places where it breaks
[ https://issues.apache.org/jira/browse/HADOOP-9712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9712: --- Resolution: Fixed Fix Version/s: 2.5.0 Status: Resolved (was: Patch Available) > Write contract tests for FTP filesystem, fix places where it breaks > --- > > Key: HADOOP-9712 > URL: https://issues.apache.org/jira/browse/HADOOP-9712 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 1.2.0, 3.0.0, 2.1.0-beta >Reporter: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > Attachments: HADOOP-9712-001.patch > > > Implement the abstract contract tests for the FTP filesystem, identify where it is failing to > meet expectations and, where possible, fix. > FTPFS appears to be the least tested (& presumably least-used) hadoop filesystem > implementation; there may be some bug reports that have been around for years > that could drive test cases and fixes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-9371) Define Semantics of FileSystem more rigorously
[ https://issues.apache.org/jira/browse/HADOOP-9371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-9371. Resolution: Fixed Fix Version/s: 2.5.0 > Define Semantics of FileSystem more rigorously > -- > > Key: HADOOP-9371 > URL: https://issues.apache.org/jira/browse/HADOOP-9371 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs >Affects Versions: 1.2.0, 3.0.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Fix For: 2.5.0 > > Attachments: HADOOP-9361.2.patch, HADOOP-9361.patch, > HADOOP-9371-003.patch, HadoopFilesystemContract.pdf > > Original Estimate: 48h > Remaining Estimate: 48h > > The semantics of {{FileSystem}} and {{FileContext}} are not completely > defined in terms of > # core expectations of a filesystem > # consistency requirements. > # concurrency requirements. > # minimum scale limits > Furthermore, methods are not defined strictly enough in terms of their > outcomes and failure modes. > The requirements and method semantics should be defined more strictly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051418#comment-14051418 ] Hudson commented on HADOOP-9361: SUCCESS: Integrated in Hadoop-trunk-Commit #5820 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5820/]) HADOOP-9361: changes.txt (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607620) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt > Strictly define the expected behavior of filesystem APIs and write tests to > verify compliance > - > > Key: HADOOP-9361 > URL: https://issues.apache.org/jira/browse/HADOOP-9361 > Project: Hadoop Common > Issue Type: Improvement > Components: fs, test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Blocker > Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, > HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, > HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, > HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, > HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, > HADOOP-9361-016.patch, HADOOP-9361-017.patch, HADOOP-9361-018.patch, > HADOOP-9361.awang-addendum.patch > > > {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while > HDFS gets tested downstream, other filesystems, such as blobstore bindings, > don't. > The only tests that are common are those of {{FileSystemContractTestBase}}, > which HADOOP-9258 shows is incomplete. > I propose > # writing more tests which clarify expected behavior > # testing operations in the interface being in their own JUnit4 test classes, > instead of one big test suite. 
> # Having each FS declare via a properties file what behaviors they offer, > such as atomic-rename, atomic-delete, umask, immediate-consistency -test > methods can downgrade to skipped test cases if a feature is missing. -- This message was sent by Atlassian JIRA (v6.2#6252)
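Proposal #3 above (per-filesystem feature declarations driving test skips) can be sketched with a plain properties lookup. The property names here are invented for illustration, not the actual contract option keys:

```java
import java.util.Properties;

// Sketch: a filesystem ships a properties file declaring the behaviours
// it supports; contract tests consult it and skip when a feature is
// absent rather than failing.
class ContractOptions {
    private final Properties props;

    ContractOptions(Properties props) {
        this.props = props;
    }

    // Undeclared features default to unsupported, so new tests skip
    // against filesystems that have not declared them.
    boolean isSupported(String feature) {
        return Boolean.parseBoolean(props.getProperty(feature, "false"));
    }
}
```

A test method would then start with a check such as `if (!options.isSupported("atomic-rename")) { /* skip */ return; }`, or with JUnit 4's `Assume.assumeTrue(...)` so the runner reports the case as skipped.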
[jira] [Commented] (HADOOP-10312) Shell.ExitCodeException to have more useful toString
[ https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051417#comment-14051417 ] Hudson commented on HADOOP-10312: - SUCCESS: Integrated in Hadoop-trunk-Commit #5820 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5820/]) HADOOP-10312 changes.text updated in wrong place (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607631) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt > Shell.ExitCodeException to have more useful toString > > > Key: HADOOP-10312 > URL: https://issues.apache.org/jira/browse/HADOOP-10312 > Project: Hadoop Common > Issue Type: Bug > Components: util >Affects Versions: 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch > > > Shell's ExitCodeException doesn't include the exit code in the toString > value, so isn't that useful in diagnosing container start failures in YARN -- This message was sent by Atlassian JIRA (v6.2#6252)
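The fix being tracked above amounts to carrying the exit code into toString(). A minimal sketch (field and message layout are invented, not the actual Hadoop patch):

```java
import java.io.IOException;

// Sketch: an exception whose toString() surfaces the process exit code,
// so container-start failures in YARN are diagnosable from logs.
class ExitCodeException extends IOException {
    private final int exitCode;

    ExitCodeException(int exitCode, String message) {
        super(message);
        this.exitCode = exitCode;
    }

    public int getExitCode() {
        return exitCode;
    }

    @Override
    public String toString() {
        return "ExitCodeException exitCode=" + exitCode + ": " + getMessage();
    }
}
```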
[jira] [Updated] (HADOOP-9651) Filesystems to throw FileAlreadyExistsException in createFile(path, overwrite=false) when the file exists
[ https://issues.apache.org/jira/browse/HADOOP-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9651: --- Issue Type: Bug (was: Sub-task) Parent: (was: HADOOP-9361) > Filesystems to throw FileAlreadyExistsException in createFile(path, > overwrite=false) when the file exists > - > > Key: HADOOP-9651 > URL: https://issues.apache.org/jira/browse/HADOOP-9651 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.1.0-beta >Reporter: Steve Loughran >Priority: Minor > Attachments: HADOOP-9651.patch > > > While HDFS and other filesystems throw a {{FileAlreadyExistsException}} if > you try to create a file that exists and you have set {{overwrite=false}}, > {{RawLocalFileSystem}} throws a plain {{IOException}}. This makes it > impossible to tell whether a failed create operation indicates a fixable problem > (the file is there) or something more fundamental. -- This message was sent by Atlassian JIRA (v6.2#6252)
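The contract being asked for can be illustrated with plain java.nio standing in for the Hadoop FileSystem API; this is a sketch of the desired behaviour, not the patch:

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

class CreateSketch {
    // create(path, overwrite=false) over an existing file fails with
    // FileAlreadyExistsException (an IOException subclass), so callers
    // can tell "the file is there" apart from fundamental I/O failures.
    static void create(Path p, boolean overwrite) throws IOException {
        if (!overwrite && Files.exists(p)) {
            throw new FileAlreadyExistsException(p.toString());
        }
        Files.write(p, new byte[0]);
    }
}
```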
[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.
[ https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051427#comment-14051427 ] Yi Liu commented on HADOOP-10734: - Thanks [~cmccabe], [~apurtell], [~andrew.wang] for the comments. I summarize several ways to generate secure random numbers in Linux, and why RdRand is preferred. * /dev/random uses an entropy pool fed by several sources, such as mouse movement, keyboard typing and so on. If the entropy pool is empty, reads from /dev/random block until additional environmental noise is gathered. RdRand is used to improve the entropy by combining its values with other sources of randomness. The reason for combining is that some developers are concerned there may be back doors in RdRand, but that is not true. * /dev/urandom reuses the internal entropy pool and returns as many random bytes as requested. The call does not block, and the output may contain less entropy than the corresponding read from /dev/random. If the entropy pool is empty, it generates data using SHA or other algorithms. * In Java, new SecureRandom() reads bytes from /dev/urandom and does {{xor}} with bytes from the Java SHA1PRNG. * RdRand, a hardware generator. OpenSSL recommends using hardware generators, saying their entropy is always nearly 100%. We can use RdRand directly. So we can see that option 4, RdRand, is faster than the others and its entropy is nearly 100%. http://en.wikipedia.org/wiki/RdRand http://wiki.openssl.org/index.php/Random_Numbers http://en.wikipedia.org/?title=/dev/random http://docs.oracle.com/javase/7/docs/api/java/security/SecureRandom.html > Implementation of true secure random with high performance using hardware > random number generator. 
> -- > > Key: HADOOP-10734 > URL: https://issues.apache.org/jira/browse/HADOOP-10734 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10734.patch > > > This JIRA is to implement Secure random using JNI to OpenSSL, and > implementation should be thread-safe. > Utilize RdRand to return random numbers from hardware random number > generator. It's TRNG(True Random Number generators) having much higher > performance than {{java.security.SecureRandom}}. > https://wiki.openssl.org/index.php/Random_Numbers > http://en.wikipedia.org/wiki/RdRand > https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl -- This message was sent by Atlassian JIRA (v6.2#6252)
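The baseline being compared against, {{java.security.SecureRandom}}, generates the small key/IV-sized buffers under discussion like this (the helper class and method are invented for illustration):

```java
import java.security.SecureRandom;

// Baseline: generating key/IV-sized buffers with the JDK's
// SecureRandom, the class the proposed OpenSSL/RdRand-backed
// implementation aims to outperform.
class RandomBaseline {
    private static final SecureRandom RNG = new SecureRandom();

    static byte[] randomBytes(int n) {
        byte[] out = new byte[n];
        RNG.nextBytes(out);  // fills the pre-allocated array in place
        return out;
    }
}
```

Typical calls would be `randomBytes(16)` for an AES IV and `randomBytes(32)` for a 256-bit key; since each request is tiny, per-call overhead dominates, which is the motivation for the JNI batching discussion.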
[jira] [Updated] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9361: --- Resolution: Fixed Fix Version/s: 2.5.0 Status: Resolved (was: Patch Available) > Strictly define the expected behavior of filesystem APIs and write tests to > verify compliance > - > > Key: HADOOP-9361 > URL: https://issues.apache.org/jira/browse/HADOOP-9361 > Project: Hadoop Common > Issue Type: Improvement > Components: fs, test >Affects Versions: 3.0.0, 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Blocker > Fix For: 2.5.0 > > Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, > HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, > HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, > HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, > HADOOP-9361-013.patch, HADOOP-9361-014.patch, HADOOP-9361-015.patch, > HADOOP-9361-016.patch, HADOOP-9361-017.patch, HADOOP-9361-018.patch, > HADOOP-9361.awang-addendum.patch > > > {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while > HDFS gets tested downstream, other filesystems, such as blobstore bindings, > don't. > The only tests that are common are those of {{FileSystemContractTestBase}}, > which HADOOP-9258 shows is incomplete. > I propose > # writing more tests which clarify expected behavior > # testing operations in the interface being in their own JUnit4 test classes, > instead of one big test suite. > # Having each FS declare via a properties file what behaviors they offer, > such as atomic-rename, atomic-delete, umask, immediate-consistency -test > methods can downgrade to skipped test cases if a feature is missing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10458) swifts should throw FileAlreadyExistsException on attempt to overwrite file
[ https://issues.apache.org/jira/browse/HADOOP-10458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-10458: Resolution: Fixed Fix Version/s: 2.5.0 Status: Resolved (was: Patch Available) > swifts should throw FileAlreadyExistsException on attempt to overwrite file > --- > > Key: HADOOP-10458 > URL: https://issues.apache.org/jira/browse/HADOOP-10458 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > Attachments: HADOOP-10458-001.patch > > Original Estimate: 0.5h > Remaining Estimate: 0.5h > > the swift:// filesystem checks for and rejects {{create()}} calls over an > existing file if overwrite = false, but it throws a custom exception. > {{SwiftPathExistsException}} > If it threw a {{org.apache.hadoop.fs.FileAlreadyExistsException}} it would > match HDFS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.
[ https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051432#comment-14051432 ] Yi Liu commented on HADOOP-10734: - And we can also add enable flag in configuration and user can disable it. > Implementation of true secure random with high performance using hardware > random number generator. > -- > > Key: HADOOP-10734 > URL: https://issues.apache.org/jira/browse/HADOOP-10734 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10734.patch > > > This JIRA is to implement Secure random using JNI to OpenSSL, and > implementation should be thread-safe. > Utilize RdRand to return random numbers from hardware random number > generator. It's TRNG(True Random Number generators) having much higher > performance than {{java.security.SecureRandom}}. > https://wiki.openssl.org/index.php/Random_Numbers > http://en.wikipedia.org/wiki/RdRand > https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10312) Shell.ExitCodeException to have more useful toString
[ https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051468#comment-14051468 ] Hudson commented on HADOOP-10312: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1793 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1793/]) HADOOP-10312 Shell.ExitCodeException to have more useful toString (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607591) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java > Shell.ExitCodeException to have more useful toString > > > Key: HADOOP-10312 > URL: https://issues.apache.org/jira/browse/HADOOP-10312 > Project: Hadoop Common > Issue Type: Bug > Components: util >Affects Versions: 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch > > > Shell's ExitCodeException doesn't include the exit code in the toString > value, so isn't that useful in diagnosing container start failures in YARN -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.
[ https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051500#comment-14051500 ] Yi Liu commented on HADOOP-10734: - [~cmccabe], thanks for the review :-). {quote} I actually have the same problem with the scheme here: JNI calls are expensive... do we know how many random bits the API user is getting at a time? If that number is small, we might want to implement batching. {quote} In most cases, we use it to generate keys (16 bytes, 32 bytes, 128 bytes, 256 bytes), IVs (16 bytes), and longs (8 bytes). Furthermore, to make the random bytes good enough, we can't avoid JNI; even {{java.security.SecureRandom}} uses JNI. {quote} I also think we should consider using ByteBuffer rather than byte[] array, if performance is the primary goal. {quote} I suppose you mean a direct ByteBuffer. Per my understanding, the merit of a direct ByteBuffer is avoiding a byte copy. But {{SecureRandom#nextBytes}} accepts a pre-allocated byte[] array, so if we use a direct ByteBuffer for JNI there is an additional copy in the Java layer; the performance is the same, and we would also need to manage the direct ByteBuffer. {quote} {code} + final protected int next(int numBits) { {code} Should be private {quote} OK, I will update it. {quote} {code} + public long nextLong() { +return ((long)(next(32)) << 32) + next(32); + } {code} Why use addition rather than bitwise OR here? {quote} Bitwise OR is also OK. Actually {{nextLong}}, {{nextFloat}} and {{nextDouble}} are copied from implementations in {{java.security.SecureRandom}}. {quote} This is not correct. The type of {{pthread_t}} is not known. If you want a numeric thread ID, you could try gettid on Linux. {quote} Can you explain a bit more? I'm not sure I get your meaning. Per my understanding, {{pthread_t}} is defined in {{/usr/include/bits/pthreadtypes.h}} as {code} typedef unsigned long int pthread_t; {code} And this patch compiles and runs successfully on my Linux server. 
> Implementation of true secure random with high performance using hardware > random number generator. > -- > > Key: HADOOP-10734 > URL: https://issues.apache.org/jira/browse/HADOOP-10734 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10734.patch > > > This JIRA is to implement Secure random using JNI to OpenSSL, and > implementation should be thread-safe. > Utilize RdRand to return random numbers from hardware random number > generator. It's TRNG(True Random Number generators) having much higher > performance than {{java.security.SecureRandom}}. > https://wiki.openssl.org/index.php/Random_Numbers > http://en.wikipedia.org/wiki/RdRand > https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl -- This message was sent by Atlassian JIRA (v6.2#6252)
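On the addition-vs-OR question quoted in the review comment above: if the low 32-bit word is zero-extended (masked with 0xFFFFFFFFL), the two operands share no bits, so `+` and `|` produce identical results; without the mask, Java sign-extends the int, and the two operators differ for negative low words. A sketch with hypothetical helper names:

```java
// Building a 64-bit value from two 32-bit draws with the low word
// zero-extended: the operands occupy disjoint bit ranges, so addition
// and bitwise OR are interchangeable.
class LongCompose {
    static long composeAdd(int hi, int lo) {
        return ((long) hi << 32) + (lo & 0xFFFFFFFFL);
    }

    static long composeOr(int hi, int lo) {
        return ((long) hi << 32) | (lo & 0xFFFFFFFFL);
    }
}
```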
[jira] [Commented] (HADOOP-10312) Shell.ExitCodeException to have more useful toString
[ https://issues.apache.org/jira/browse/HADOOP-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051563#comment-14051563 ] Hudson commented on HADOOP-10312: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1820 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1820/]) HADOOP-10312 changes.text updated in wrong place (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607631) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt HADOOP-10312 Shell.ExitCodeException to have more useful toString (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607591) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java > Shell.ExitCodeException to have more useful toString > > > Key: HADOOP-10312 > URL: https://issues.apache.org/jira/browse/HADOOP-10312 > Project: Hadoop Common > Issue Type: Bug > Components: util >Affects Versions: 2.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 2.5.0 > > Attachments: HADOOP-10312-001.patch, HADOOP-10312-002.patch > > > Shell's ExitCodeException doesn't include the exit code in the toString > value, so isn't that useful in diagnosing container start failures in YARN -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051573#comment-14051573 ] Hudson commented on HADOOP-9361: FAILURE: Integrated in Hadoop-Mapreduce-trunk #1820 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1820/]) HADOOP-9361: changes.txt (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607620) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt HADOOP-9361: site and gitignore (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607601) * /hadoop/common/trunk/.gitignore * /hadoop/common/trunk/hadoop-project/src/site/site.xml HADOOP-9361: Strictly define FileSystem APIs - OpenStack portion (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607600) * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/exceptions/SwiftNotDirectoryException.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/exceptions/SwiftPathExistsException.java HADOOP-9361: Strictly define FileSystem APIs - OpenStack portion (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607599) * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/StrictBufferedFSInputStream.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeFileSystem.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeFileSystemStore.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeInputStream.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/main/java/org/apache/hadoop/fs/swift/snative/SwiftNativeOutputStream.java * 
/hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemBasicOps.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemContract.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/TestSwiftFileSystemRename.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/SwiftContract.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractCreate.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractDelete.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractMkdir.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractOpen.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractRename.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractRootDir.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/contract/TestSwiftContractSeek.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/java/org/apache/hadoop/fs/swift/hdfs2/TestV2LsOperations.java * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/resources/contract * /hadoop/common/trunk/hadoop-tools/hadoop-openstack/src/test/resources/contract/swift.xml HADOOP-9361: Strictly define FileSystem APIs - HDFS portion (stevel: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1607597) * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/HDFSContract.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractAppend.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractConcat.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractCreate.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/contract/hdfs/TestHDFSContractDelete.java * /hadoop/common/trunk/hadoop-hdfs-project/h
[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051628#comment-14051628 ] Alejandro Abdelnur commented on HADOOP-10769: - Let's assume you have a {{DelegationTokenKeyProviderExtension}} providing a {{DelegationTokenExtension}} interface; it would be something like this: {code} public class DelegationTokenKeyProviderExtension extends KeyProviderExtension { public interface DelegationTokenExtension extends Extension { public Token getDelegationToken(String renewer) throws IOException; } private DelegationTokenKeyProviderExtension(KeyProvider kp, DelegationTokenExtension dte) { super(kp, dte); } public Token getDelegationToken(String renewer) throws IOException { Token token = null; if (getExtension() != null) { token = getExtension().getDelegationToken(renewer); } return token; } private static class DefaultDelegationTokenExtension implements DelegationTokenExtension { public Token getDelegationToken(String renewer) throws IOException { return null; } } public static DelegationTokenKeyProviderExtension getExtension(KeyProvider kp) { DelegationTokenExtension dte = (kp instanceof DelegationTokenExtension) ? (DelegationTokenExtension) kp : null; return new DelegationTokenKeyProviderExtension(kp, dte); } } {code} When using the {{DelegationTokenKeyProviderExtension}} to get tokens, you get the same semantics as if {{getDelegationToken()}} were baked into the {{KeyProvider}} API, but without having token retrieval in the {{KeyProvider}} API itself, which is your source of concern. 
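The wrapper pattern above can be sketched in miniature with plain interfaces — all names below are hypothetical stand-ins, not the actual Hadoop classes:

```java
// Miniature stand-in for the extension pattern: a wrapper exposes an
// optional capability (here, token retrieval) without widening the base API.
public class ExtensionDemo {
    interface Provider { String name(); }
    interface TokenCapable { String getToken(String renewer); }

    // A provider that happens to support tokens.
    static class TokenProvider implements Provider, TokenCapable {
        public String name() { return "kms"; }
        public String getToken(String renewer) { return "token-for-" + renewer; }
    }

    // A provider that does not.
    static class PlainProvider implements Provider {
        public String name() { return "jks"; }
    }

    // The wrapper: returns null when the wrapped provider lacks the capability.
    static class TokenExtension {
        private final TokenCapable capability;
        TokenExtension(Provider p) {
            this.capability = (p instanceof TokenCapable) ? (TokenCapable) p : null;
        }
        String getToken(String renewer) {
            return (capability != null) ? capability.getToken(renewer) : null;
        }
    }

    public static void main(String[] args) {
        System.out.println(new TokenExtension(new TokenProvider()).getToken("alice"));
        System.out.println(new TokenExtension(new PlainProvider()).getToken("alice"));
    }
}
```

Callers always go through the wrapper, so the base {{Provider}} interface never learns about tokens; providers that can supply them simply implement the extra interface.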
> Add getDelegationToken() method to KeyProvider > -- > > Key: HADOOP-10769 > URL: https://issues.apache.org/jira/browse/HADOOP-10769 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > > The KeyProvider API needs to return delegation tokens to enable access to the > KeyProvider from processes without Kerberos credentials (ie Yarn containers). > This is required for HDFS encryption and KMS integration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051744#comment-14051744 ] Larry McCay commented on HADOOP-10769: -- That seems pretty convoluted. Let's step back a second, so that the full use case is clear. * consumers of the managed keys will need access to them from services/tasks at execution time * some of the keys will be unknown until file access time * so, at job submission time, KMS delegation tokens are needed so that the services/tasks can later access the required keys as the submitting user, as they discover the need for specific keys from HDFS extended attributes * therefore the delegation tokens have to be in the credentials file * they will also need to be made available to the KMSClientKeyProvider to include in the request to the KMS So, we need: 1. the ability to get the KMS delegation token at job submission time 2. the ability to add it to and get it from the credentials file (already available in Credentials) - though it seems that this has to be done by the consuming code, not the KMSClientKeyProvider code 3. the ability to supply the delegation token to the KMSClientKeyProvider when requesting keys My questions: A. For #1, can't we have a standalone DelegationTokenClient component - especially since there is another jira for refactoring delegation token support out into common to be more reusable? Such a client could then potentially be used inside the KMSClientKeyProvider. B. Wouldn't it be better if providers that know they need delegation tokens were able to handle #2 themselves? C. How is #3 going to be handled using the current interfaces? I don't see how it is being added to the interaction currently. D. If the KMSClientKeyProvider had access to the credentials object (it already has access to UserKeyProvider) or some other execution context itself, could that be a way to address #3? 
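The submit-time/run-time handoff in steps 1-3 can be sketched with a stand-in credentials map — every class and alias below is hypothetical, not Hadoop's actual {{Credentials}} API:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the token handoff: the submitter fetches a delegation token
// and stores it under a well-known alias; the task-side client later looks
// it up from the propagated credentials before calling the key server.
public class TokenFlowDemo {
    static final String KMS_TOKEN_ALIAS = "kms-dt"; // hypothetical alias

    // Steps 1+2: at job submission, obtain the token and add it to credentials.
    static void submitTime(Map<String, String> credentials) {
        String token = "dt-for-submitter"; // stands in for a real KMS token
        credentials.put(KMS_TOKEN_ALIAS, token);
    }

    // Step 3: at execution time, the client pulls the token from credentials.
    static String executionTime(Map<String, String> credentials) {
        return credentials.get(KMS_TOKEN_ALIAS);
    }

    public static void main(String[] args) {
        Map<String, String> creds = new HashMap<>();
        submitTime(creds);
        System.out.println(executionTime(creds)); // prints dt-for-submitter
    }
}
```

Question B above amounts to asking whether {{submitTime}} should live inside the provider rather than in the consuming job-submission code.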
> Add getDelegationToken() method to KeyProvider > -- > > Key: HADOOP-10769 > URL: https://issues.apache.org/jira/browse/HADOOP-10769 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > > The KeyProvider API needs to return delegation tokens to enable access to the > KeyProvider from processes without Kerberos credentials (ie Yarn containers). > This is required for HDFS encryption and KMS integration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051745#comment-14051745 ] Arun Suresh commented on HADOOP-10719: -- [~tucu00], The problem with having just the static method you specified is that it somewhat restricts the composability of the {{KeyProviderCryptoExtension}}. For instance, how would you combine a {{JavaKeyStoreProvider}} with some type of {{CryptoExtension}} other than {{DefaultCryptoExtension}}? > Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider > --- > > Key: HADOOP-10719 > URL: https://issues.apache.org/jira/browse/HADOOP-10719 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, > HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch, > HADOOP-10719.patch, HADOOP-10719.patch > > > This is a follow up on > [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044] > KeyProvider API should have 2 new methods: > * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv) > * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion > encryptedKey) > The implementation would do a known transformation on the IV (i.e.: xor with > 0xff the original IV). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051748#comment-14051748 ] Alejandro Abdelnur commented on HADOOP-10719: - As mentioned before, I wouldn't worry about composability; I would rather say no to it and have different extensions wrapping the same provider instance, one for each use. > Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider > --- > > Key: HADOOP-10719 > URL: https://issues.apache.org/jira/browse/HADOOP-10719 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, > HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch, > HADOOP-10719.patch, HADOOP-10719.patch > > > This is a follow up on > [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044] > KeyProvider API should have 2 new methods: > * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv) > * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion > encryptedKey) > The implementation would do a known transformation on the IV (i.e.: xor with > 0xff the original IV). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051747#comment-14051747 ] Mike Yoder commented on HADOOP-10719: - Crypto-nerd comments on generateEncryptedKey()... - The line "SecureRandom.getInstance("SHA1PRNG").nextBytes(newKey);" - two things: SHA1 is obsolete; can you choose something stronger? I don't know what the set of valid options is, but if there is one that resembles "NIST SP 800-90" then pick that one. Also, you're doing the getInstance call every time through this function; better to call it once for the class and then just call nextBytes here. We will probably also want to build re-seeding logic around this random stream. Key generation is highly scrutinized, trust me! - The line "Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");" - can you please use CBC mode instead of CTR mode? If we use CTR mode we're subjecting the encrypted DEK to all the attacks we're trying to avoid for the data itself. CBC mode has none of the nasty ciphertext attack problems that CTR mode has. 
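The suggestion to hoist {{getInstance}} out of the hot path can be sketched as follows — a hypothetical illustration, not the patch's code, and the algorithm name is just a placeholder for whatever the review settles on:

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Cache one SecureRandom for the class instead of calling getInstance()
// on every key generation; nextBytes() itself is thread-safe.
public class KeyMaterialDemo {
    private static final SecureRandom RNG = createRng();

    private static SecureRandom createRng() {
        try {
            return SecureRandom.getInstance("SHA1PRNG"); // placeholder algorithm
        } catch (NoSuchAlgorithmException e) {
            // Fall back to the platform's default entropy source.
            return new SecureRandom();
        }
    }

    static byte[] newKey(int lengthBytes) {
        byte[] key = new byte[lengthBytes];
        RNG.nextBytes(key); // no per-call getInstance()
        return key;
    }

    public static void main(String[] args) {
        System.out.println(newKey(16).length); // prints 16
    }
}
```

Re-seeding policy, as the comment notes, would be layered on top of this cached instance rather than by re-creating it per call.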
> Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider > --- > > Key: HADOOP-10719 > URL: https://issues.apache.org/jira/browse/HADOOP-10719 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, > HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch, > HADOOP-10719.patch, HADOOP-10719.patch > > > This is a follow up on > [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044] > KeyProvider API should have 2 new methods: > * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv) > * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion > encryptedKey) > The implementation would do a known transformation on the IV (i.e.: xor with > 0xff the original IV). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051768#comment-14051768 ] Alejandro Abdelnur commented on HADOOP-10719: - [~yoderme], the use of JDK JCE {{Cipher}} is temporary until we integrate with fs-encryption branch where we have all this taken care by the {{CryptoCodec}} API. > Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider > --- > > Key: HADOOP-10719 > URL: https://issues.apache.org/jira/browse/HADOOP-10719 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, > HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch, > HADOOP-10719.patch, HADOOP-10719.patch > > > This is a follow up on > [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044] > KeyProvider API should have 2 new methods: > * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv) > * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion > encryptedKey) > The implementation would do a known transformation on the IV (i.e.: xor with > 0xff the original IV). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated HADOOP-10719: - Attachment: HADOOP-10719.3.patch Uploading a patch with the feedback addressed. Thank you all for the review! This patch is to be applied to trunk. > Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider > --- > > Key: HADOOP-10719 > URL: https://issues.apache.org/jira/browse/HADOOP-10719 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, > HADOOP-10719.3.patch, HADOOP-10719.patch, HADOOP-10719.patch, > HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch > > > This is a follow up on > [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044] > KeyProvider API should have 2 new methods: > * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv) > * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion > encryptedKey) > The implementation would do a known transformation on the IV (i.e.: xor with > 0xff the original IV). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10772) Generating RPMs for common, hdfs, httpfs, mapreduce , yarn and tools
[ https://issues.apache.org/jira/browse/HADOOP-10772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051838#comment-14051838 ] Eric Yang commented on HADOOP-10772: Hadoop rpm packaging was removed during the conversion to maven; the source-code refactoring is what discontinued rpm packaging. Maven did not have good support for building rpm packages in the past. Now that there is a maven rpm plugin, it should be much easier to build packages in Hadoop. When packaging lives outside of Hadoop, a developer needs to modify both Hadoop and BigTop in order to support a new platform, which is a sub-optimal way to keep the project modular. I view this as the community's response to a needed feature. RPM packages have been maintained for Hadoop 1.2.1 in parity with the latest release, and the community's efforts are much appreciated. This effort could make future Hadoop versions more polished on other platforms without depending on Bigtop's release cycle. > Generating RPMs for common, hdfs, httpfs, mapreduce , yarn and tools > - > > Key: HADOOP-10772 > URL: https://issues.apache.org/jira/browse/HADOOP-10772 > Project: Hadoop Common > Issue Type: Improvement > Components: build >Reporter: Jinghui Wang >Assignee: Jinghui Wang > Attachments: HADOOP-10772.patch > > > Generating RPMs for hadoop-common, hadoop-hdfs, hadoop-hdfs-httpfs, > hadoop-mapreduce , hadoop-yarn-project and hadoop-tools-dist with dist build > profile. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10780) namenode throws java.lang.OutOfMemoryError upon DatanodeProtocol.versionRequest from datanode
Dmitry Sivachenko created HADOOP-10780: -- Summary: namenode throws java.lang.OutOfMemoryError upon DatanodeProtocol.versionRequest from datanode Key: HADOOP-10780 URL: https://issues.apache.org/jira/browse/HADOOP-10780 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.4.1 Environment: FreeBSD-10/stable openjdk version "1.7.0_60" OpenJDK Runtime Environment (build 1.7.0_60-b19) OpenJDK 64-Bit Server VM (build 24.60-b09, mixed mode) Reporter: Dmitry Sivachenko I am trying hadoop-2.4.1 on FreeBSD-10/stable. namenode starts up, but after first datanode contacts it, it throws an exception. All limits seem to be high enough: % limits -a Resource limits (current): cputime infinity secs filesize infinity kB datasize 33554432 kB stacksize 524288 kB coredumpsize infinity kB memoryuseinfinity kB memorylocked infinity kB maxprocesses 122778 openfiles 14 sbsize infinity bytes vmemoryuse infinity kB pseudo-terminals infinity swapuse infinity kB 14944 1 S0:06.59 /usr/local/openjdk7/bin/java -Dproc_namenode -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop -Dhadoop.log.file=hadoop-hdfs-namenode-nezabudka3-00.log -Dhadoop.home.dir=/usr/local -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx32768m -Xms32768m -Djava.library.path=/usr/local/lib -Xmx32768m -Xms32768m -Djava.library.path=/usr/local/lib -Xmx32768m -Xms32768m -Djava.library.path=/usr/local/lib -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.NameNode >From the namenode's log: 2014-07-03 23:28:15,070 WARN [IPC Server handler 5 on 8020] ipc.Server (Server.java:run(2032)) - IPC Server handler 5 on 8020, call org.apache.hadoop.hdfs.server.protocol.Datano deProtocol.versionRequest from 5.255.231.209:57749 Call#842 Retry#0 java.lang.OutOfMemoryError at org.apache.hadoop.security.JniBasedUnixGroupsMapping.getGroupsForUser(Native Method) at 
org.apache.hadoop.security.JniBasedUnixGroupsMapping.getGroups(JniBasedUnixGroupsMapping.java:80) at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:50) at org.apache.hadoop.security.Groups.getGroups(Groups.java:139) at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1417) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.(FSPermissionChecker.java:81) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3331) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:5491) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.versionRequest(NameNodeRpcServer.java:1082) at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.versionRequest(DatanodeProtocolServerSideTranslatorPB.java:234) at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28069) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) I did not have such an issue with hadoop-1.2.1. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10693) Implementation of AES-CTR CryptoCodec using JNI to OpenSSL
[ https://issues.apache.org/jira/browse/HADOOP-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051886#comment-14051886 ] Colin Patrick McCabe commented on HADOOP-10693: --- {code} +static int loadAesCtr(JNIEnv *env) +{ +#ifdef UNIX + dlerror(); // Clear any existing error + dlsym_EVP_aes_256_ctr = dlsym(openssl, "EVP_aes_256_ctr"); + dlsym_EVP_aes_128_ctr = dlsym(openssl, "EVP_aes_128_ctr"); + if (dlerror() != NULL) { +return -1; + } +#endif + +#ifdef WINDOWS + dlsym_EVP_aes_256_ctr = (__dlsym_EVP_aes_256_ctr) GetProcAddress(openssl, \ + "EVP_aes_256_ctr"); + dlsym_EVP_aes_128_ctr = (__dlsym_EVP_aes_128_ctr) GetProcAddress(openssl, \ + "EVP_aes_128_ctr"); + if (dlsym_EVP_aes_256_ctr == NULL || dlsym_EVP_aes_128_ctr == NULL) { +return -1; + } +#endif + + return 0; +} {code} If the first call to dlsym fails, the second call will clear the dlerror state. So this isn't quite going to work, I think. I think it would be easier to just use the LOAD_DYNAMIC_SYMBOL macro, and then check for the exception afterwards. You'd need something like this: {code} void loadAes(void) { LOAD_DYNAMIC_SYMBOL(1...) LOAD_DYNAMIC_SYMBOL(2...) } JNIEXPORT void JNICALL Java_org_apache_hadoop_crypto_OpensslCipher_initIDs (JNIEnv *env, jclass clazz) { loadAes(); jthrowable jthr = (*env)->ExceptionOccurred(); if (jthr) { (*env)->DeleteLocalRef(env, jthr); THROW(...) return; } ... } {code} Or something like that. 
+1 once this is addressed > Implementation of AES-CTR CryptoCodec using JNI to OpenSSL > -- > > Key: HADOOP-10693 > URL: https://issues.apache.org/jira/browse/HADOOP-10693 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10693.1.patch, HADOOP-10693.2.patch, > HADOOP-10693.3.patch, HADOOP-10693.4.patch, HADOOP-10693.5.patch, > HADOOP-10693.6.patch, HADOOP-10693.7.patch, HADOOP-10693.patch > > > In HADOOP-10603, we have an implementation of AES-CTR CryptoCodec using Java > JCE provider. > To get high performance, the configured JCE provider should utilize native > code and AES-NI, but in JDK6,7 the Java embedded provider doesn't support it. > > Considering not all hadoop user will use the provider like Diceros or able to > get signed certificate from oracle to develop a custom provider, so this JIRA > will have an implementation of AES-CTR CryptoCodec using JNI to OpenSSL > directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10725) Implement listStatus and getFileInfo in the native client
[ https://issues.apache.org/jira/browse/HADOOP-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051944#comment-14051944 ] Colin Patrick McCabe commented on HADOOP-10725: --- bq. (hadoop-native-core/src/main/native/fs/fs.c) Pull this out into a separate function? Seems like an operation that will have to be done frequently. This is a bit of a special case just for the connection URI. I guess the issue is that you have people connecting with stuff like "localhost:8020", which isn't technically a well-formed URI, but which we sort of have to handle (by looking at it as authority=localhost, port=8020). On the other hand, when someone gives you a path that looks like "myfile:123", you just want to parse it with the standard URI parsing code. We might need more massaging for files with colons in them later, but it's a bit of a grey area (see HDFS-13), so I'd like to avoid dealing with it for now. For now, I'd like to keep this hack for the connection URI, but not for others. bq. (hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c) Should precedence be given to the explicitly defined "port" member or the pre-existing port in the URI? It seems like an explicit definition in the builder should take precedence? So there are three options: 1. fail with an error message (current behavior in trunk) 2. hdfsBuilderSetNameNodePort wins if set 3. the URI port wins over hdfsBuilderSetNameNodePort if set #2 is hard to implement for jniFS. If you're given a URI such as hdfs://server:123/foo/bar, you'd have to replace 123 with whatever port you liked through string operations, prior to sending the URI along to the Java code. I wish we had never added {{hdfsBuilderSetNameNodePort}}... it's definitely superfluous, since the port can be in the URI. Maybe we should just stick with option #1 for now and error out when there is a conflict. bq. (hadoop-native-core/src/main/native/ndfs/ndfs.c) Is this how the previous HDFS clients worked? 
Using the previously seen filename won't work if the file has been removed. Just curious... Yes, this is how the Java code works. I don't think there's an issue with the previous filename getting removed, either. Doing a listStatus with a filename just means that you want filenames that sort after that filename, not that you necessarily think such a filename exists. bq. (hadoop-native-core/src/main/native/jni/jnifs.c) This code segment appears to be exactly the same as hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/hdfs.c. Maybe a utility function would be useful? The src/main/native/libhdfs directory is going away, to be replaced by the jnifs/ directory. I haven't done that yet, but it's just an svn delete, not a very interesting patch. > Implement listStatus and getFileInfo in the native client > - > > Key: HADOOP-10725 > URL: https://issues.apache.org/jira/browse/HADOOP-10725 > Project: Hadoop Common > Issue Type: Sub-task > Components: native >Affects Versions: HADOOP-10388 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HADOOP-10725-pnative.001.patch, > HADOOP-10725-pnative.002.patch > > > Implement listStatus and getFileInfo in the native client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10693) Implementation of AES-CTR CryptoCodec using JNI to OpenSSL
[ https://issues.apache.org/jira/browse/HADOOP-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-10693: Attachment: HADOOP-10693.8.patch Thanks [~cmccabe], I have updated the patch. > Implementation of AES-CTR CryptoCodec using JNI to OpenSSL > -- > > Key: HADOOP-10693 > URL: https://issues.apache.org/jira/browse/HADOOP-10693 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10693.1.patch, HADOOP-10693.2.patch, > HADOOP-10693.3.patch, HADOOP-10693.4.patch, HADOOP-10693.5.patch, > HADOOP-10693.6.patch, HADOOP-10693.7.patch, HADOOP-10693.8.patch, > HADOOP-10693.patch > > > In HADOOP-10603, we have an implementation of AES-CTR CryptoCodec using Java > JCE provider. > To get high performance, the configured JCE provider should utilize native > code and AES-NI, but in JDK6,7 the Java embedded provider doesn't support it. > > Considering not all hadoop user will use the provider like Diceros or able to > get signed certificate from oracle to develop a custom provider, so this JIRA > will have an implementation of AES-CTR CryptoCodec using JNI to OpenSSL > directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10641) Introduce Coordination Engine
[ https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051959#comment-14051959 ] Alex Newman commented on HADOOP-10641: -- [~rakeshr] zab or zk? > Introduce Coordination Engine > - > > Key: HADOOP-10641 > URL: https://issues.apache.org/jira/browse/HADOOP-10641 > Project: Hadoop Common > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Konstantin Shvachko >Assignee: Plamen Jeliazkov > Attachments: HADOOP-10641.patch, HADOOP-10641.patch, > HADOOP-10641.patch > > > Coordination Engine (CE) is a system which allows the nodes of a distributed > system to agree on a sequence of events. In order to be reliable, the CE should > itself be distributed. > Coordination Engine can be based on different algorithms (paxos, raft, 2PC, > zab) and have different implementations, depending on use cases, reliability, > availability, and performance requirements. > CE should have a common API, so that it could serve as a pluggable component > in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and > HBase (HBASE-10909). > First implementation is proposed to be based on ZooKeeper. -- This message was sent by Atlassian JIRA (v6.2#6252)
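As a rough illustration of the pluggable API the description calls for, a minimal single-process stand-in might look like the sketch below. Every name here is hypothetical; a real engine (such as the proposed ZooKeeper-based one) would reach agreement across nodes rather than in one JVM:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: clients submit proposals and listeners learn them
// back in one agreed global order. In this single-process stand-in the
// "agreed" order is simply submission order.
public class CoordinationSketch {
    interface AgreementListener {
        void agreementReached(long seq, byte[] proposal);
    }

    interface CoordinationEngine {
        void registerListener(AgreementListener listener);
        void submitProposal(byte[] proposal);
    }

    static class LocalEngine implements CoordinationEngine {
        private final List<AgreementListener> listeners = new ArrayList<>();
        private long nextSeq = 0;

        public void registerListener(AgreementListener listener) {
            listeners.add(listener);
        }

        public void submitProposal(byte[] proposal) {
            long seq = nextSeq++;  // global sequence number for this agreement
            for (AgreementListener l : listeners) {
                l.agreementReached(seq, proposal);
            }
        }
    }
}
```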
[jira] [Updated] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated HADOOP-10719: - Status: Patch Available (was: Open) > Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider > --- > > Key: HADOOP-10719 > URL: https://issues.apache.org/jira/browse/HADOOP-10719 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, > HADOOP-10719.3.patch, HADOOP-10719.patch, HADOOP-10719.patch, > HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch > > > This is a follow up on > [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044] > KeyProvider API should have 2 new methods: > * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv) > * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion > encryptedKey) > The implementation would do a known transformation on the IV (i.e.: xor with > 0xff the original IV). -- This message was sent by Atlassian JIRA (v6.2#6252)
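The "known transformation on the IV" mentioned in the description (xor with 0xff) is just a bit-flip of every byte. A minimal sketch, with an illustrative method name:

```java
// Sketch of the IV transformation described above: xor every byte with 0xff.
// Applying it twice returns the original IV, since x ^ 0xff ^ 0xff == x.
public class IvTransform {
    public static byte[] deriveIv(byte[] iv) {
        byte[] out = new byte[iv.length];
        for (int i = 0; i < iv.length; i++) {
            out[i] = (byte) (iv[i] ^ 0xff);
        }
        return out;
    }
}
```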
[jira] [Updated] (HADOOP-10720) KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API
[ https://issues.apache.org/jira/browse/HADOOP-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated HADOOP-10720: - Status: Patch Available (was: Open) > KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API > --- > > Key: HADOOP-10720 > URL: https://issues.apache.org/jira/browse/HADOOP-10720 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: COMBO.patch, COMBO.patch, COMBO.patch, COMBO.patch, > COMBO.patch, HADOOP-10720.1.patch, HADOOP-10720.2.patch, HADOOP-10720.patch, > HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch > > > KMS client/server should implement support for generating encrypted keys and > decrypting them via the REST API being introduced by HADOOP-10719. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10720) KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API
[ https://issues.apache.org/jira/browse/HADOOP-10720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051987#comment-14051987 ] Hadoop QA commented on HADOOP-10720: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12653792/HADOOP-10720.2.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4205//console This message is automatically generated. > KMS: Implement generateEncryptedKey and decryptEncryptedKey in the REST API > --- > > Key: HADOOP-10720 > URL: https://issues.apache.org/jira/browse/HADOOP-10720 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: COMBO.patch, COMBO.patch, COMBO.patch, COMBO.patch, > COMBO.patch, HADOOP-10720.1.patch, HADOOP-10720.2.patch, HADOOP-10720.patch, > HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch, HADOOP-10720.patch > > > KMS client/server should implement support for generating encrypted keys and > decrypting them via the REST API being introduced by HADOOP-10719. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10693) Implementation of AES-CTR CryptoCodec using JNI to OpenSSL
[ https://issues.apache.org/jira/browse/HADOOP-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14051997#comment-14051997 ] Colin Patrick McCabe commented on HADOOP-10693: --- +1. Thanks, Yi. > Implementation of AES-CTR CryptoCodec using JNI to OpenSSL > -- > > Key: HADOOP-10693 > URL: https://issues.apache.org/jira/browse/HADOOP-10693 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10693.1.patch, HADOOP-10693.2.patch, > HADOOP-10693.3.patch, HADOOP-10693.4.patch, HADOOP-10693.5.patch, > HADOOP-10693.6.patch, HADOOP-10693.7.patch, HADOOP-10693.8.patch, > HADOOP-10693.patch > > > In HADOOP-10603, we have an implementation of AES-CTR CryptoCodec using Java > JCE provider. > To get high performance, the configured JCE provider should utilize native > code and AES-NI, but in JDK6,7 the Java embedded provider doesn't support it. > > Considering not all hadoop user will use the provider like Diceros or able to > get signed certificate from oracle to develop a custom provider, so this JIRA > will have an implementation of AES-CTR CryptoCodec using JNI to OpenSSL > directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-10693) Implementation of AES-CTR CryptoCodec using JNI to OpenSSL
[ https://issues.apache.org/jira/browse/HADOOP-10693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe resolved HADOOP-10693. --- Resolution: Fixed committed to fs-encryption branch > Implementation of AES-CTR CryptoCodec using JNI to OpenSSL > -- > > Key: HADOOP-10693 > URL: https://issues.apache.org/jira/browse/HADOOP-10693 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10693.1.patch, HADOOP-10693.2.patch, > HADOOP-10693.3.patch, HADOOP-10693.4.patch, HADOOP-10693.5.patch, > HADOOP-10693.6.patch, HADOOP-10693.7.patch, HADOOP-10693.8.patch, > HADOOP-10693.patch > > > In HADOOP-10603, we have an implementation of AES-CTR CryptoCodec using Java > JCE provider. > To get high performance, the configured JCE provider should utilize native > code and AES-NI, but in JDK6,7 the Java embedded provider doesn't support it. > > Considering not all hadoop user will use the provider like Diceros or able to > get signed certificate from oracle to develop a custom provider, so this JIRA > will have an implementation of AES-CTR CryptoCodec using JNI to OpenSSL > directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.
[ https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052008#comment-14052008 ] Colin Patrick McCabe commented on HADOOP-10734: --- bq. \[discussion of randomness methods\] I don't think {{/dev/random}} is a practical choice, due to the blocking issue. From what I've heard, {{/dev/urandom}} can often be a good choice. I'm tempted to try a super-simple piece of code that just periodically fills a big buffer from {{/dev/urandom}} and see if that performs well. I have a hunch that it would (and it would also use RDRAND on supported platforms.) But I think it's fine if you want to provide an option to go through openssl as well... we already have a dependency on that library. bq. And we can also add enable flag in configuration and user can disable it. I agree. I think we should have a configuration option like {{encryption.random.number.generator}} which specifies a comma-separated list of class names to try to use. That way a user could specify the openssl one plus a fallback to the standard java one if they so chose. Or alternately, the user could enable just the java one (and configure it to use /dev/urandom) to get something which used RDRAND plus some additional randomness. If you want to do this in a follow-on JIRA, that's OK too. I think it's confusing that we have both {{org.apache.hadoop.crypto.random.SecureRandom}} and {{java.security.SecureRandom}}. Maybe a better name for this new class would be {{OpenSslSecureRandom}} or something like that, to emphasize that it is using OpenSSL to get random bits. {code} +/** + * Utilize RdRand to return random numbers from hardware random number + * generator. It's TRNG(True Random Number generators) having high performance. + * https://wiki.openssl.org/index.php/Random_Numbers#Hardware + * http://en.wikipedia.org/wiki/RdRand + */ +static ENGINE * rdrand_init(JNIEnv *env) {code} I think the comment is a bit misleading here. 
Openssl compiles on a lot of platforms that don't have RDRAND. So all we really know here is that we're using openssl, not that we're using RDRAND. I think it's appropriate to have a comment saying, "if you are using an Intel chipset with RDRAND, the high-performance random number generator will be used", or something like that. But it's platform specific and we may be compiling on another platform. {code} + \@Test(timeout=12) + public void testRandomInt() throws Exception { +SecureRandom random = new SecureRandom(); + +int rand1 = random.nextInt(); +int rand2 = random.nextInt(); +Assert.assertFalse(rand1 == rand2); + } {code} It's definitely difficult to test something which is returning true random numbers. It requires a lot of mathematics. So I see why you did it this way. Just one comment... maybe I'm being overly paranoid here, but can we loop until rand2 is not equal to rand1? bq. I suppose you mean direct ByteBuffer. Per my understanding, merit of direct ByteBuffer is to avoid bytes copy. But SecureRandom#nextBytes will accept an pre-allocated byte[] array, if we use direct ByteBuffer for JNI, then there is additional copy in java layer, so the performance is the same, and we need to manage the direct ByteBuffer. OK. bq. Can you explain a bit more, I’m not sure I get your meaning. Per my understanding, pthread_t is defined in /usr/include/bits/pthreadtypes.h as The stuff in {{/usr/include/bits}} is not public; it is an implementation detail that could change at any time. from {{man pthread_self}}: bq. POSIX.1 allows an implementation wide freedom in choosing the type used to represent a thread ID; for example, representation using either an arithmetic type or a structure is permitted. Therefore, variables of type pthread_t can't portably be compared using the C equality operator (==); use pthread_equal(3) instead. 
Thread identifiers should be considered opaque: any attempt to use a thread ID other than in pthreads calls is nonportable and can lead to unspecified results. > Implementation of true secure random with high performance using hardware > random number generator. > -- > > Key: HADOOP-10734 > URL: https://issues.apache.org/jira/browse/HADOOP-10734 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10734.patch > > > This JIRA is to implement Secure random using JNI to OpenSSL, and > implementation should be thread-safe. > Utilize RdRand to return random numbers from hardware random number > g
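Colin's "super-simple" suggestion above, periodically refilling a big buffer from {{/dev/urandom}}, could look roughly like this. The class name and buffer size are illustrative, this is not the class under review, and it only runs on platforms that expose /dev/urandom:

```java
import java.io.FileInputStream;
import java.io.IOException;

// Hedged sketch: serve random bytes out of a large buffer that is refilled
// from /dev/urandom whenever it runs dry.
public class BufferedUrandom {
    private final byte[] buf;
    private int pos;
    private final FileInputStream in;

    public BufferedUrandom(int bufSize) throws IOException {
        buf = new byte[bufSize];
        in = new FileInputStream("/dev/urandom");
        refill();
    }

    private void refill() throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                throw new IOException("unexpected EOF on /dev/urandom");
            }
            off += n;
        }
        pos = 0;
    }

    public synchronized void nextBytes(byte[] out) throws IOException {
        for (int i = 0; i < out.length; i++) {
            if (pos == buf.length) {
                refill();
            }
            out[i] = buf[pos++];
        }
    }

    public void close() throws IOException {
        in.close();
    }
}
```

Unlike {{/dev/random}}, urandom never blocks, which sidesteps the blocking issue raised in the discussion.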
[jira] [Commented] (HADOOP-10719) Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052012#comment-14052012 ] Hadoop QA commented on HADOOP-10719: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12653974/HADOOP-10719.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common: org.apache.hadoop.ha.TestZKFailoverControllerStress {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4206//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4206//console This message is automatically generated. 
> Add generateEncryptedKey and decryptEncryptedKey methods to KeyProvider > --- > > Key: HADOOP-10719 > URL: https://issues.apache.org/jira/browse/HADOOP-10719 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > Attachments: HADOOP-10719.1.patch, HADOOP-10719.2.patch, > HADOOP-10719.3.patch, HADOOP-10719.patch, HADOOP-10719.patch, > HADOOP-10719.patch, HADOOP-10719.patch, HADOOP-10719.patch > > > This is a follow up on > [HDFS-6134|https://issues.apache.org/jira/browse/HDFS-6134?focusedCommentId=14036044&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14036044] > KeyProvider API should have 2 new methods: > * KeyVersion generateEncryptedKey(String keyVersionName, byte[] iv) > * KeyVersion decryptEncryptedKey(String keyVersionName, byte[] iv, KeyVersion > encryptedKey) > The implementation would do a known transformation on the IV (i.e.: xor with > 0xff the original IV). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.
[ https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052014#comment-14052014 ] Alejandro Abdelnur commented on HADOOP-10734: - Any reason for not baking this into the CryptoCodec for OpenSSL instead of a new java class? > Implementation of true secure random with high performance using hardware > random number generator. > -- > > Key: HADOOP-10734 > URL: https://issues.apache.org/jira/browse/HADOOP-10734 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10734.patch > > > This JIRA is to implement Secure random using JNI to OpenSSL, and > implementation should be thread-safe. > Utilize RdRand to return random numbers from hardware random number > generator. It's TRNG(True Random Number generators) having much higher > performance than {{java.security.SecureRandom}}. > https://wiki.openssl.org/index.php/Random_Numbers > http://en.wikipedia.org/wiki/RdRand > https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-5353) add progress callback feature to the slow FileUtil operations with ability to cancel the work
[ https://issues.apache.org/jira/browse/HADOOP-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HADOOP-5353: -- Description: This is something only of relevance of people doing front ends to FS operations, and as they could take the code in FSUtil and add something with this feature, its a blocker to none of them. Current FileUtil.copy can take a long time to move large files around, but there is no progress indicator to GUIs, or a way to cancel the operation mid-way, short of interrupting the thread or closing the filesystem. I propose a FileIOProgress interface to the copy ops, one that had a single method to notify listeners of bytes read and written, and the number of files handled. {code} interface FileIOProgress { boolean progress(int files, long bytesRead, long bytesWritten); } {code} The return value would be true to continue the operation, or false to stop the copy and leave the FS in whatever incomplete state it is in currently. it could even be fancier: have beginFileOperation and endFileOperation callbacks to pass in the name of the current file being worked on, though I don't have a personal need for that. GUIs could show progress bars and cancel buttons, other tools could use the interface to pass any cancellation notice upstream. The FileUtil.copy operations would call this interface (blocking) after every block copy, so the frequency of invocation would depend on block size and network/disk speeds. Which is also why I don't propose having any percentage done indicators; it's too hard to predict percentage of time done for distributed file IO with any degree of accuracy. was: This is something only of relevance of people doing front ends to FS operations, and as they could take the code in FSUtil and add something with this feature, its a blocker to none of them. 
Current FileUtil.copy can take a long time to move large files around, but there is no progress indicator to GUIs, or a way to cancel the operation mid-way, short of interrupting the thread or closing the filesystem. I propose a FileIOProgress interface to the copy ops, one that had a single method to notify listeners of bytes read and written, and the number of files handled. {code} interface FileIOProgress { boolean progress(int files, long bytesRead, long bytesWritten); } {code} The return value would be true to continue the operation, or false to stop the copy and leave the FS in whatever incomplete state it is in currently. it could even be fancier: have beginFileOperation and endFileOperation callbacks to pass in the name of the current file being worked on, though I don't have a personal need for that. GUIs could show progress bars and cancel buttons, other tools could use the interface to pass any cancellation notice upstream. The FileUtil.copy operations would call this interface (blocking) after every block copy, so the frequency of invocation would depend on block size and network/disk speeds. Which is also why I don't propose having any percentage done indicators; it's too hard to predict percentage of time done for distributed file IO with any degree of accuracy. > add progress callback feature to the slow FileUtil operations with ability to > cancel the work > - > > Key: HADOOP-5353 > URL: https://issues.apache.org/jira/browse/HADOOP-5353 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 0.21.0 >Reporter: Steve Loughran >Assignee: Lei (Eddy) Xu >Priority: Minor > Attachments: HADOOP-5353.000.patch > > > This is something only of relevance of people doing front ends to FS > operations, and as they could take the code in FSUtil and add something with > this feature, its a blocker to none of them. 
> Current FileUtil.copy can take a long time to move large files around, but > there is no progress indicator to GUIs, or a way to cancel the operation > mid-way, short of interrupting the thread or closing the filesystem. > I propose a FileIOProgress interface to the copy ops, one that had a single > method to notify listeners of bytes read and written, and the number of files > handled. > {code} > interface FileIOProgress { > boolean progress(int files, long bytesRead, long bytesWritten); > } > {code} > The return value would be true to continue the operation, or false to stop > the copy and leave the FS in whatever incomplete state it is in currently. > it could even be fancier: have beginFileOperation and endFileOperation > callbacks to pass in the name of the current file being worked on, though I > don't have a personal need for that. > GUIs could show progress bars and cancel buttons, other tools could use the > interface to pass any cancellation notice upstream.
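A copy loop honouring the proposed callback could be sketched as below. The FileIOProgress interface is taken from the issue description; the CancellableCopy class and the buffer size are illustrative, not the real FileUtil code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CancellableCopy {
    // Interface as proposed in the issue description.
    interface FileIOProgress {
        boolean progress(int files, long bytesRead, long bytesWritten);
    }

    // Returns true if the copy completed, false if the callback cancelled it,
    // leaving the destination in whatever partial state it was in.
    public static boolean copy(InputStream in, OutputStream out,
                               FileIOProgress cb) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) > 0) {
            out.write(buf, 0, n);
            total += n;
            // Blocking callback after every buffer, so callback frequency
            // depends on buffer size and network/disk speed.
            if (!cb.progress(1, total, total)) {
                return false;
            }
        }
        return true;
    }
}
```

A GUI would pass a callback that updates a progress bar and returns false once the user hits cancel.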
[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052049#comment-14052049 ] Aaron T. Myers commented on HADOOP-10769: - bq. That seems pretty convoluted. In the future, would appreciate you explaining why a proposal seems convoluted. It seems quite straightforward to me, so not sure how to address this comment. This proposal is an attempt to compromise and address your concern, which I understood to be not wanting to have this method baked into the {{KeyProvider}} interface, and thus allow some implementations to use it and others to not. bq. A. For #1 can't we have a standalone DelegationTokenClient component - especially since there is another jira for refactoring delegation token support out into common to be more reusable? Such a client could then potentially be used inside the KMSClientKeyProvider. That JIRA seems to me to be orthogonal to this one, so I don't think we should couple the two. How the {{KmsClientKeyProvider}} gets tokens under the hood shouldn't have anything to do with the API. Also, as you point out later in question C, it will still be necessary for the submitting code to somehow call/interact with the tokens/credentials of the KeyProvider at submission time, so I don't think it's actually possible to entirely encapsulate the delegation token fetching/storage within the {{KeyProvider}} implementation. bq. B. Wouldn't it be better if providers that know they need delegation tokens were able to handle #2 themselves? How about changing the proposal to mimic what's done in FileSystem today and add a method like "{{public Token[] addDelegationTokens(final String renewer, Credentials credentials)}}" to the {{KeyProvider}} API? The default behavior would be to add no tokens to the provided {{Credentials}} object, but the {{KmsClientKeyProvider}} could instead fetch and stash away the tokens in the provided {{Credentials}} object. bq. C. 
How is #3 above going to be handled using the current interfaces - I don't see how it is being added to the interaction currently? I believe this will happen transparently, because the tokens contained in the {{Credentials}} object will be added to the UGI object which will then be used to authenticate all the RPCs. The {{KeyProvider}} shouldn't need access to the tokens in the tasks. bq. D. If the KMSClientKeyProvider had access to the credentials object ( already have access to UserKeyProvider) or some other execution context itself then could that be a way that #3 could be addressed? If I'm understanding you correctly, I think this is basically the same as what I'm proposing above in response to your question B. Am I right about that? Will this work for you? > Add getDelegationToken() method to KeyProvider > -- > > Key: HADOOP-10769 > URL: https://issues.apache.org/jira/browse/HADOOP-10769 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > > The KeyProvider API needs to return delegation tokens to enable access to the > KeyProvider from processes without Kerberos credentials (ie Yarn containers). > This is required for HDFS encryption and KMS integration. -- This message was sent by Atlassian JIRA (v6.2#6252)
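The proposal above, mimicking FileSystem's addDelegationTokens with a no-op default, can be sketched with simplified stand-in types. These are not the real Hadoop classes: Token and Credentials are modeled here as a plain string and a list, and the KMS token fetch is faked:

```java
import java.util.ArrayList;
import java.util.List;

public class TokenSketch {
    // Stand-in for org.apache.hadoop.security.Credentials.
    static class Credentials {
        final List<String> tokens = new ArrayList<>();
        void addToken(String t) {
            tokens.add(t);
        }
    }

    static class KeyProviderBase {
        // Default behavior: this provider needs no delegation tokens,
        // so nothing is added to the supplied Credentials.
        String[] addDelegationTokens(String renewer, Credentials creds) {
            return new String[0];
        }
    }

    static class KmsProvider extends KeyProviderBase {
        @Override
        String[] addDelegationTokens(String renewer, Credentials creds) {
            // In reality this would be fetched from the KMS over REST.
            String token = "kms-token-for-" + renewer;
            creds.addToken(token);  // stash for the submitted job's tasks
            return new String[] { token };
        }
    }
}
```

The returned array is a convenience for callers (e.g. logging which tokens were obtained); the tokens stashed in Credentials are what the tasks later use, via the UGI, to authenticate their RPCs.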
[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052059#comment-14052059 ] Larry McCay commented on HADOOP-10769: -- Hey [~atm] - Sorry for the "convoluted" statement - that came across stronger than I intended. I just really don't like the instanceof check in there to return an instance of an extension or null. I do understand the motivation but think that something simpler would be better. Your proposal is actually very similar to my context proposal but limited to delegation tokens. In general, I am in favor of this approach. Do you think that we could make it more generic though? Out of curiosity, why does it return an array of Tokens? If we were to open it up to include other things, like keys or passwords, etc then we could just make it an add credentials method call: {{HashMap addToCredentials(HashMap props, Credentials creds)}} or {{HashMap setupCredentials(HashMap props, Credentials creds)}} Where renewer would be in the props when a given provider expects it. But we could also include the keyversions of the keys we know about at submission time and they be added. We could provide the names of passwords that may be needed by a given provider as well. Still not sure how the returned tokens are used in your proposal but they could be returned in the hashmap in this proposal - as well as anything else that would make sense. We would just need a couple well-known property names to represent: * renewer * keyversions * passwords * returned tokens? Does that make any sense? 
> Add getDelegationToken() method to KeyProvider > -- > > Key: HADOOP-10769 > URL: https://issues.apache.org/jira/browse/HADOOP-10769 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > > The KeyProvider API needs to return delegation tokens to enable access to the > KeyProvider from processes without Kerberos credentials (ie Yarn containers). > This is required for HDFS encryption and KMS integration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052061#comment-14052061 ] Larry McCay commented on HADOOP-10769: -- Of course, both of those proposals seem very strange to be part of the UserProvider which actually sits on top of the credentials object. :/ > Add getDelegationToken() method to KeyProvider > -- > > Key: HADOOP-10769 > URL: https://issues.apache.org/jira/browse/HADOOP-10769 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > > The KeyProvider API needs to return delegation tokens to enable access to the > KeyProvider from processes without Kerberos credentials (ie Yarn containers). > This is required for HDFS encryption and KMS integration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052066#comment-14052066 ] Aaron T. Myers commented on HADOOP-10769: - bq. Do you think that we could make it more generic though? I'm sure we could, but I suggest we cross that bridge when we come to it. Hadoop currently does delegated authentication via {{DelegationTokens}} everywhere, so let's do something to support that and move on. If in the future we have need for other stuff, we'll amend the API appropriately. Seems quite premature to me to attempt to design a generic API when we don't have any concrete alternate use-cases. bq. Out of curiosity, why does it return an array of Tokens? The various callers use it for different things, e.g. in some places just to log which tokens were renewed. I don't think it's actually integral to the functioning of the API, just a convenience. bq. If we were to open it up to include other things, like keys or passwords, etc then we could just make it an add credentials method call: In general I'm really leery of a {{HashMap}}-based API. That seems quite fragile to me, and very overly-generic for the common use case of just dealing with DTs. How about as a way forward with this JIRA we go with the "{{public Token[] addDelegationTokens(final String renewer, Credentials credentials)}}" added to {{KeyProvider}} as I proposed, and revisit a more generic API in the future when we actually have a concrete need for it? We could then perhaps later add a "{{addAdditionalCredentials}}" API call or something to accommodate non-DT-based implementations. It is *soft*ware, after all. 
:) > Add getDelegationToken() method to KeyProvider > -- > > Key: HADOOP-10769 > URL: https://issues.apache.org/jira/browse/HADOOP-10769 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > > The KeyProvider API needs to return delegation tokens to enable access to the > KeyProvider from processes without Kerberos credentials (ie Yarn containers). > This is required for HDFS encryption and KMS integration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10769) Add getDelegationToken() method to KeyProvider
[ https://issues.apache.org/jira/browse/HADOOP-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052078#comment-14052078 ] Larry McCay commented on HADOOP-10769: -- Well, it isn't much different than the original getDelegationToken proposal this way. So - you don't think that it makes sense to add a method that can move a list of specified keyversions into the credentials object? That seems to imply that all keys will be fetched at runtime rather than those we know about at submission time being added then. Incidentally, we wouldn't have to make *that* generic - we could come up with a type safe context that includes the same properties: * renewer * keyversions * passwords (can leave this one out until we need it) * returned tokens (if they are needed) Anyway, I've beaten this one to death. Thanks for accommodating my nit-picking. I think that I'll let [~owen.omalley] weigh in when he is back. > Add getDelegationToken() method to KeyProvider > -- > > Key: HADOOP-10769 > URL: https://issues.apache.org/jira/browse/HADOOP-10769 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 3.0.0 >Reporter: Alejandro Abdelnur >Assignee: Arun Suresh > > The KeyProvider API needs to return delegation tokens to enable access to the > KeyProvider from processes without Kerberos credentials (ie Yarn containers). > This is required for HDFS encryption and KMS integration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.
[ https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052086#comment-14052086 ] Yi Liu commented on HADOOP-10734: - Thanks [~cmccabe] for your review, I will update the patch and respond to you later. > Implementation of true secure random with high performance using hardware > random number generator. > -- > > Key: HADOOP-10734 > URL: https://issues.apache.org/jira/browse/HADOOP-10734 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10734.patch > > > This JIRA is to implement Secure random using JNI to OpenSSL, and > implementation should be thread-safe. > Utilize RdRand to return random numbers from hardware random number > generator. It's TRNG(True Random Number generators) having much higher > performance than {{java.security.SecureRandom}}. > https://wiki.openssl.org/index.php/Random_Numbers > http://en.wikipedia.org/wiki/RdRand > https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10734) Implementation of true secure random with high performance using hardware random number generator.
[ https://issues.apache.org/jira/browse/HADOOP-10734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052094#comment-14052094 ] Yi Liu commented on HADOOP-10734: - Thanks [~tucu00] for your comments. Openssl secure random is separate functionality and is used in {{OpensslAesCtrCryptoCodec}}, and can also be used directly. We should have a Java class coupled with the JNI implementation. It’s not suitable to put {code} private native static void initSR(); private native boolean nextRandBytes(byte[] bytes); {code} into {{OpensslCipher}}; having two classes makes the code clearer. {{OpensslAesCtrCryptoCodec}} doesn't contain native methods directly; it will use {{OpensslCipher}} and the Openssl secure random. > Implementation of true secure random with high performance using hardware > random number generator. > -- > > Key: HADOOP-10734 > URL: https://issues.apache.org/jira/browse/HADOOP-10734 > Project: Hadoop Common > Issue Type: Sub-task > Components: security >Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) > > Attachments: HADOOP-10734.patch > > > This JIRA is to implement Secure random using JNI to OpenSSL, and > implementation should be thread-safe. > Utilize RdRand to return random numbers from hardware random number > generator. It's TRNG(True Random Number generators) having much higher > performance than {{java.security.SecureRandom}}. > https://wiki.openssl.org/index.php/Random_Numbers > http://en.wikipedia.org/wiki/RdRand > https://software.intel.com/en-us/articles/performance-impact-of-intel-secure-key-on-openssl -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-8808) Update FsShell documentation to mention deprecation of some of the commands, and mention alternatives
[ https://issues.apache.org/jira/browse/HADOOP-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052178#comment-14052178 ] Akira AJISAKA commented on HADOOP-8808: --- The test failure is not related to the patch. It was reported at HADOOP-10406. > Update FsShell documentation to mention deprecation of some of the commands, > and mention alternatives > - > > Key: HADOOP-8808 > URL: https://issues.apache.org/jira/browse/HADOOP-8808 > Project: Hadoop Common > Issue Type: Bug > Components: documentation, fs >Affects Versions: 2.2.0 >Reporter: Hemanth Yamijala >Assignee: Akira AJISAKA > Attachments: HADOOP-8808.2.patch, HADOOP-8808.patch > > > In HADOOP-7286, we deprecated the following 3 commands dus, lsr and rmr, in > favour of du -s, ls -r and rm -r respectively. The FsShell documentation > should be updated to mention these, so that users can start switching. Also, > there are places where we refer to the deprecated commands as alternatives. > This can be changed as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10468) TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately
[ https://issues.apache.org/jira/browse/HADOOP-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052185#comment-14052185 ] Akira AJISAKA commented on HADOOP-10468: bq. I'm hoping there's a way we can fix the underlying issue without breaking existing metrics2 property files I agree with you, [~jlowe]. I'm trying to find the way. > TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately > --- > > Key: HADOOP-10468 > URL: https://issues.apache.org/jira/browse/HADOOP-10468 > Project: Hadoop Common > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: 2.5.0 > > Attachments: HADOOP-10468.000.patch, HADOOP-10468.001.patch > > > {{TestMetricsSystemImpl.testMultiThreadedPublish}} can fail intermediately > due to the insufficient size of the sink queue: > {code} > 2014-04-06 21:34:55,269 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > 2014-04-06 21:34:55,270 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > 2014-04-06 21:34:55,271 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > {code} > The unit test should increase the default queue size to avoid intermediate > failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10780) namenode throws java.lang.OutOfMemoryError upon DatanodeProtocol.versionRequest from datanode
[ https://issues.apache.org/jira/browse/HADOOP-10780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Sivachenko updated HADOOP-10780: --- Status: Patch Available (was: Open) This is because of the incorrect type of the buf_sz variable in hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c, function hadoop_user_info_alloc(void): Currently its type is size_t (which is unsigned), and during the assignment buf_sz = sysconf(_SC_GETPW_R_SIZE_MAX); sysconf() can return -1 (and it does on FreeBSD). So buf_sz gets a very large positive value (the unsigned equivalent of signed -1), and then malloc() fails with OutOfMemory. The correct solution is to change the type of buf_sz to long (because sysconf() returns long). > namenode throws java.lang.OutOfMemoryError upon > DatanodeProtocol.versionRequest from datanode > - > > Key: HADOOP-10780 > URL: https://issues.apache.org/jira/browse/HADOOP-10780 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.4.1 > Environment: FreeBSD-10/stable > openjdk version "1.7.0_60" > OpenJDK Runtime Environment (build 1.7.0_60-b19) > OpenJDK 64-Bit Server VM (build 24.60-b09, mixed mode) >Reporter: Dmitry Sivachenko > > I am trying hadoop-2.4.1 on FreeBSD-10/stable. > namenode starts up, but after the first datanode contacts it, it throws an > exception. 
> All limits seem to be high enough: > % limits -a > Resource limits (current): > cputime infinity secs > filesize infinity kB > datasize 33554432 kB > stacksize 524288 kB > coredumpsize infinity kB > memoryuseinfinity kB > memorylocked infinity kB > maxprocesses 122778 > openfiles 14 > sbsize infinity bytes > vmemoryuse infinity kB > pseudo-terminals infinity > swapuse infinity kB > 14944 1 S0:06.59 /usr/local/openjdk7/bin/java -Dproc_namenode > -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop > -Dhadoop.log.file=hadoop-hdfs-namenode-nezabudka3-00.log > -Dhadoop.home.dir=/usr/local -Dhadoop.id.str=hdfs > -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml > -Djava.net.preferIPv4Stack=true -Xmx32768m -Xms32768m > -Djava.library.path=/usr/local/lib -Xmx32768m -Xms32768m > -Djava.library.path=/usr/local/lib -Xmx32768m -Xms32768m > -Djava.library.path=/usr/local/lib -Dhadoop.security.logger=INFO,RFAS > org.apache.hadoop.hdfs.server.namenode.NameNode > From the namenode's log: > 2014-07-03 23:28:15,070 WARN [IPC Server handler 5 on 8020] ipc.Server > (Server.java:run(2032)) - IPC Server handler 5 on 8020, call > org.apache.hadoop.hdfs.server.protocol.Datano > deProtocol.versionRequest from 5.255.231.209:57749 Call#842 Retry#0 > java.lang.OutOfMemoryError > at > org.apache.hadoop.security.JniBasedUnixGroupsMapping.getGroupsForUser(Native > Method) > at > org.apache.hadoop.security.JniBasedUnixGroupsMapping.getGroups(JniBasedUnixGroupsMapping.java:80) > at > org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:50) > at org.apache.hadoop.security.Groups.getGroups(Groups.java:139) > at > org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1417) > at > org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.(FSPermissionChecker.java:81) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3331) > at > 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:5491) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.versionRequest(NameNodeRpcServer.java:1082) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.versionRequest(DatanodeProtocolServerSideTranslatorPB.java:234) > at > org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28069) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
[jira] [Created] (HADOOP-10781) Unportable getgrouplist() usage breaks FreeBSD
Dmitry Sivachenko created HADOOP-10781: -- Summary: Unportable getgrouplist() usage breaks FreeBSD Key: HADOOP-10781 URL: https://issues.apache.org/jira/browse/HADOOP-10781 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.4.1 Reporter: Dmitry Sivachenko getgrouplist() has different return values on Linux and FreeBSD: Linux: either the number of groups (positive) or -1 on error FreeBSD: 0 on success or -1 on error The return value of getgrouplist() is analyzed in a Linux-specific way in hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c, in function hadoop_user_info_getgroups(), which breaks FreeBSD. In this function you have 3 choices for the return value ret = getgrouplist(uinfo->pwd.pw_name, uinfo->pwd.pw_gid, uinfo->gids, &ngroups); 1) ret > 0 : OK for Linux, it will be zero on FreeBSD. I propose to change this to ret >= 0 2) First condition is false and ret != -1: impossible according to the manpage 3) ret == -1 -- the error case, handled the same way on both Linux and FreeBSD So I propose to change "ret > 0" to "ret >= 0" and (optionally) remove the 2nd case. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load
[ https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HADOOP-10597: - Attachment: HADOOP-10597-2.patch RPCClientBackoffDesignAndEvaluation.pdf [~lohit] provided some feedback. Here is the design document with some evaluation results. The updated patch also includes unit tests and makes the server-side retry policy pluggable. > Evaluate if we can have RPC client back off when server is under heavy load > --- > > Key: HADOOP-10597 > URL: https://issues.apache.org/jira/browse/HADOOP-10597 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HADOOP-10597-2.patch, HADOOP-10597.patch, > RPCClientBackoffDesignAndEvaluation.pdf > > > Currently if an application hits NN too hard, RPC requests be in blocking > state, assuming OS connection doesn't run out. Alternatively RPC or NN can > throw some well defined exception back to the client based on certain > policies when it is under heavy load; client will understand such exception > and do exponential back off, as another implementation of > RetryInvocationHandler. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10468) TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately
[ https://issues.apache.org/jira/browse/HADOOP-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052200#comment-14052200 ] Akira AJISAKA commented on HADOOP-10468: Attaching a patch to revert HADOOP-10468 and make 'Collector' to lower case. {code} - .add("Test.sink.Collector." + MetricsConfig.QUEUE_CAPACITY_KEY, + .add("test.sink.collector." + MetricsConfig.QUEUE_CAPACITY_KEY, {code} {code} -ms.registerSink("Collector", +ms.registerSink("collector", {code} Using debugger, I confirmed the queue capacity of the sink was set to 10. > TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately > --- > > Key: HADOOP-10468 > URL: https://issues.apache.org/jira/browse/HADOOP-10468 > Project: Hadoop Common > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: 2.5.0 > > Attachments: HADOOP-10468.000.patch, HADOOP-10468.001.patch > > > {{TestMetricsSystemImpl.testMultiThreadedPublish}} can fail intermediately > due to the insufficient size of the sink queue: > {code} > 2014-04-06 21:34:55,269 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > 2014-04-06 21:34:55,270 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > 2014-04-06 21:34:55,271 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > {code} > The unit test should increase the default queue size to avoid intermediate > failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
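A minimal hadoop-metrics2 properties sketch of what the patch fragments above amount to (the FileSink class and exact option names here are assumptions based on metrics2 naming conventions, not taken from the patch):

```properties
# hypothetical metrics2 config for the test: the sink instance name
# 'collector' in these keys must match the name passed to registerSink()
# exactly, which is why both were changed to lower case together
test.sink.collector.class=org.apache.hadoop.metrics2.sink.FileSink
test.sink.collector.queue.capacity=10
```

If the key says Collector while registerSink() is called with collector (or vice versa), the queue.capacity option is never picked up and the sink keeps its small default queue, reproducing the intermittent "full queue" failures quoted below.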
[jira] [Updated] (HADOOP-10468) TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately
[ https://issues.apache.org/jira/browse/HADOOP-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HADOOP-10468: --- Attachment: HADOOP-10468.2.patch > TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately > --- > > Key: HADOOP-10468 > URL: https://issues.apache.org/jira/browse/HADOOP-10468 > Project: Hadoop Common > Issue Type: Bug >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: 2.5.0 > > Attachments: HADOOP-10468.000.patch, HADOOP-10468.001.patch, > HADOOP-10468.2.patch > > > {{TestMetricsSystemImpl.testMultiThreadedPublish}} can fail intermediately > due to the insufficient size of the sink queue: > {code} > 2014-04-06 21:34:55,269 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > 2014-04-06 21:34:55,270 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > 2014-04-06 21:34:55,271 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > {code} > The unit test should increase the default queue size to avoid intermediate > failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10597) Evaluate if we can have RPC client back off when server is under heavy load
[ https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HADOOP-10597: - Status: Patch Available (was: Open) > Evaluate if we can have RPC client back off when server is under heavy load > --- > > Key: HADOOP-10597 > URL: https://issues.apache.org/jira/browse/HADOOP-10597 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HADOOP-10597-2.patch, HADOOP-10597.patch, > RPCClientBackoffDesignAndEvaluation.pdf > > > Currently if an application hits NN too hard, RPC requests be in blocking > state, assuming OS connection doesn't run out. Alternatively RPC or NN can > throw some well defined exception back to the client based on certain > policies when it is under heavy load; client will understand such exception > and do exponential back off, as another implementation of > RetryInvocationHandler. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10468) TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately
[ https://issues.apache.org/jira/browse/HADOOP-10468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HADOOP-10468: --- Affects Version/s: 2.5.0 Status: Patch Available (was: Reopened) > TestMetricsSystemImpl.testMultiThreadedPublish fails intermediately > --- > > Key: HADOOP-10468 > URL: https://issues.apache.org/jira/browse/HADOOP-10468 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.5.0 >Reporter: Haohui Mai >Assignee: Haohui Mai > Fix For: 2.5.0 > > Attachments: HADOOP-10468.000.patch, HADOOP-10468.001.patch, > HADOOP-10468.2.patch > > > {{TestMetricsSystemImpl.testMultiThreadedPublish}} can fail intermediately > due to the insufficient size of the sink queue: > {code} > 2014-04-06 21:34:55,269 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > 2014-04-06 21:34:55,270 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > 2014-04-06 21:34:55,271 WARN impl.MetricsSinkAdapter > (MetricsSinkAdapter.java:putMetricsImmediate(107)) - Collector has a full > queue and can't consume the given metrics. > {code} > The unit test should increase the default queue size to avoid intermediate > failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10781) Unportable getgrouplist() usage breaks FreeBSD
[ https://issues.apache.org/jira/browse/HADOOP-10781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Sivachenko updated HADOOP-10781: --- Status: Patch Available (was: Open)
--- hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c.bak	2014-06-21 09:40:12.0 +0400
+++ hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c	2014-07-04 10:53:05.0 +0400
@@ -193,7 +193,7 @@
   ngroups = uinfo->gids_size;
   ret = getgrouplist(uinfo->pwd.pw_name, uinfo->pwd.pw_gid,
                      uinfo->gids, &ngroups);
-  if (ret > 0) {
+  if (ret > 0 /* Linux */ || ret == 0 /* FreeBSD */) {
     uinfo->num_gids = ngroups;
     ret = put_primary_gid_first(uinfo);
     if (ret) {
> Unportable getgrouplist() usage breaks FreeBSD > -- > > Key: HADOOP-10781 > URL: https://issues.apache.org/jira/browse/HADOOP-10781 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.4.1 >Reporter: Dmitry Sivachenko > > getgrouplist() has different return values on Linux and FreeBSD: > Linux: either the number of groups (positive) or -1 on error > FreeBSD: 0 on success or -1 on error > The return value of getgrouplist() is analyzed in a Linux-specific way in > hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c, > in function hadoop_user_info_getgroups(), which breaks FreeBSD. > In this function you have 3 choices for the return value > ret = getgrouplist(uinfo->pwd.pw_name, uinfo->pwd.pw_gid, > uinfo->gids, &ngroups); > 1) ret > 0 : OK for Linux, it will be zero on FreeBSD. I propose to change > this to ret >= 0 > 2) First condition is false and ret != -1: impossible according to the manpage > 3) ret == -1 -- the error case, handled the same way on both Linux and FreeBSD > So I propose to change "ret > 0" to "ret >= 0" and (optionally) remove the 2nd > case. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10781) Unportable getgrouplist() usage breaks FreeBSD
[ https://issues.apache.org/jira/browse/HADOOP-10781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052209#comment-14052209 ] Dmitry Sivachenko commented on HADOOP-10781: And also remove this code:
-  } else if (ret != -1) {
-    // Any return code that is not -1 is considered as error.
-    // Since the user lookup was successful, there should be at least one
-    // group for this user.
-    return EIO;
Because, according to the manpage, this case is impossible.
> Unportable getgrouplist() usage breaks FreeBSD > -- > > Key: HADOOP-10781 > URL: https://issues.apache.org/jira/browse/HADOOP-10781 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.4.1 >Reporter: Dmitry Sivachenko > > getgrouplist() has different return values on Linux and FreeBSD: > Linux: either the number of groups (positive) or -1 on error > FreeBSD: 0 on success or -1 on error > The return value of getgrouplist() is analyzed in a Linux-specific way in > hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/security/hadoop_user_info.c, > in function hadoop_user_info_getgroups(), which breaks FreeBSD. > In this function you have 3 choices for the return value > ret = getgrouplist(uinfo->pwd.pw_name, uinfo->pwd.pw_gid, > uinfo->gids, &ngroups); > 1) ret > 0 : OK for Linux, it will be zero on FreeBSD. I propose to change > this to ret >= 0 > 2) First condition is false and ret != -1: impossible according to the manpage > 3) ret == -1 -- the error case, handled the same way on both Linux and FreeBSD > So I propose to change "ret > 0" to "ret >= 0" and (optionally) remove the 2nd > case. -- This message was sent by Atlassian JIRA (v6.2#6252)