[jira] [Updated] (HADOOP-10632) Minor improvements to Crypto input and output streams
[ https://issues.apache.org/jira/browse/HADOOP-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-10632: Attachment: HADOOP-10632.4.patch Thanks Alejandro, updated the patch. Minor improvements to Crypto input and output streams - Key: HADOOP-10632 URL: https://issues.apache.org/jira/browse/HADOOP-10632 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Alejandro Abdelnur Assignee: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-10632.1.patch, HADOOP-10632.2.patch, HADOOP-10632.3.patch, HADOOP-10632.4.patch, HADOOP-10632.patch Minor follow-up feedback on the crypto streams -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10635) Add a method to CryptoCodec to generate SRNs for IV
[ https://issues.apache.org/jira/browse/HADOOP-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HADOOP-10635: Attachment: HADOOP-10635.1.patch Thanks Alejandro and Charles for the review; the new patch includes updates addressing your comments. Add a method to CryptoCodec to generate SRNs for IV --- Key: HADOOP-10635 URL: https://issues.apache.org/jira/browse/HADOOP-10635 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Alejandro Abdelnur Assignee: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-10635.1.patch, HADOOP-10635.patch SRN generators are provided by crypto libraries. Since the CryptoCodec gives access to a crypto library, it makes sense to expose the SRN generator on the CryptoCodec API. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10632) Minor improvements to Crypto input and output streams
[ https://issues.apache.org/jira/browse/HADOOP-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013362#comment-14013362 ] Alejandro Abdelnur commented on HADOOP-10632: - +1 Minor improvements to Crypto input and output streams - Key: HADOOP-10632 URL: https://issues.apache.org/jira/browse/HADOOP-10632 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Alejandro Abdelnur Assignee: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-10632.1.patch, HADOOP-10632.2.patch, HADOOP-10632.3.patch, HADOOP-10632.4.patch, HADOOP-10632.patch Minor follow up feedback on the crypto streams -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10635) Add a method to CryptoCodec to generate SRNs for IV
[ https://issues.apache.org/jira/browse/HADOOP-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013366#comment-14013366 ] Alejandro Abdelnur commented on HADOOP-10635: - LGTM, +1. I would make one minor change before committing: move the DEFAULT_SECURE_RANDOM_ALG constant to CommonConfigurationKeysPublic.java. Add a method to CryptoCodec to generate SRNs for IV --- Key: HADOOP-10635 URL: https://issues.apache.org/jira/browse/HADOOP-10635 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Alejandro Abdelnur Assignee: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-10635.1.patch, HADOOP-10635.patch SRN generators are provided by crypto libraries. Since the CryptoCodec gives access to a crypto library, it makes sense to expose the SRN generator on the CryptoCodec API. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10624) Fix some minor typos and add more test cases for hadoop_err
[ https://issues.apache.org/jira/browse/HADOOP-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenwu Peng updated HADOOP-10624: Attachment: HADOOP-10624-pnative.004.patch Fix some minor typos and add more test cases for hadoop_err --- Key: HADOOP-10624 URL: https://issues.apache.org/jira/browse/HADOOP-10624 Project: Hadoop Common Issue Type: Sub-task Affects Versions: HADOOP-10388 Reporter: Wenwu Peng Assignee: Wenwu Peng Attachments: HADOOP-10624-pnative.001.patch, HADOOP-10624-pnative.002.patch, HADOOP-10624-pnative.003.patch, HADOOP-10624-pnative.004.patch Changes: 1. Add more test cases to cover the methods hadoop_lerr_alloc and hadoop_uverr_alloc 2. Fix typos as follows: 1) Change hadoop_uverr_alloc(int cod to hadoop_uverr_alloc(int code in hadoop_err.h 2) Change OutOfMemory to OutOfMemoryException to be consistent with other exceptions in hadoop_err.c 3) Change DBUG to DEBUG in messenger.c 4) Change DBUG to DEBUG in reactor.c -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10635) Add a method to CryptoCodec to generate SRNs for IV
[ https://issues.apache.org/jira/browse/HADOOP-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013369#comment-14013369 ] Yi Liu commented on HADOOP-10635: - Thanks Alejandro :-) Add a method to CryptoCodec to generate SRNs for IV --- Key: HADOOP-10635 URL: https://issues.apache.org/jira/browse/HADOOP-10635 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Alejandro Abdelnur Assignee: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-10635.1.patch, HADOOP-10635.patch SRN generators are provided by crypto libraries. Since the CryptoCodec gives access to a crypto library, it makes sense to expose the SRN generator on the CryptoCodec API. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-10632) Minor improvements to Crypto input and output streams
[ https://issues.apache.org/jira/browse/HADOOP-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu resolved HADOOP-10632. - Resolution: Fixed Minor improvements to Crypto input and output streams - Key: HADOOP-10632 URL: https://issues.apache.org/jira/browse/HADOOP-10632 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Alejandro Abdelnur Assignee: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-10632.1.patch, HADOOP-10632.2.patch, HADOOP-10632.3.patch, HADOOP-10632.4.patch, HADOOP-10632.patch Minor follow up feedback on the crypto streams -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10632) Minor improvements to Crypto input and output streams
[ https://issues.apache.org/jira/browse/HADOOP-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013403#comment-14013403 ] Yi Liu commented on HADOOP-10632: - Committed to branch. Minor improvements to Crypto input and output streams - Key: HADOOP-10632 URL: https://issues.apache.org/jira/browse/HADOOP-10632 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Alejandro Abdelnur Assignee: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-10632.1.patch, HADOOP-10632.2.patch, HADOOP-10632.3.patch, HADOOP-10632.4.patch, HADOOP-10632.patch Minor follow up feedback on the crypto streams -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-10635) Add a method to CryptoCodec to generate SRNs for IV
[ https://issues.apache.org/jira/browse/HADOOP-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu resolved HADOOP-10635. - Resolution: Fixed Hadoop Flags: Reviewed Committed to branch. Moved {{DEFAULT_SECURE_RANDOM_ALG}} to CommonConfigurationKeysPublic.java: {code} /** Default value for HADOOP_SECURITY_SECURE_RANDOM_ALGORITHM_KEY */ public static final String HADOOP_SECURITY_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"; {code} Add a method to CryptoCodec to generate SRNs for IV --- Key: HADOOP-10635 URL: https://issues.apache.org/jira/browse/HADOOP-10635 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Alejandro Abdelnur Assignee: Yi Liu Fix For: 3.0.0 Attachments: HADOOP-10635.1.patch, HADOOP-10635.patch SRN generators are provided by crypto libraries. Since the CryptoCodec gives access to a crypto library, it makes sense to expose the SRN generator on the CryptoCodec API. -- This message was sent by Atlassian JIRA (v6.2#6252)
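For context, a minimal sketch of what a configurable secure-random source for IVs could look like; the class and method names here are illustrative, not the actual CryptoCodec API from the patch:

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class IvGenerator {
    // Mirrors the default named in the JIRA comment above.
    static final String DEFAULT_ALG = "SHA1PRNG";

    // Obtain a SecureRandom for the configured algorithm,
    // falling back to the platform default if it is unavailable.
    static SecureRandom forAlgorithm(String alg) {
        try {
            return SecureRandom.getInstance(alg);
        } catch (NoSuchAlgorithmException e) {
            return new SecureRandom();
        }
    }

    // Fill a 16-byte AES IV with secure random bytes.
    public static byte[] generateIv(String alg) {
        byte[] iv = new byte[16];
        forAlgorithm(alg).nextBytes(iv);
        return iv;
    }

    public static void main(String[] args) {
        System.out.println(generateIv(DEFAULT_ALG).length); // 16
    }
}
```

Exposing this on the codec keeps IV generation tied to the same crypto provider the codec itself uses, which is the rationale stated in the issue description.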
[jira] [Updated] (HADOOP-10400) Incorporate new S3A FileSystem implementation
[ https://issues.apache.org/jira/browse/HADOOP-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-10400: Component/s: fs/s3 Affects Version/s: 2.4.0 Incorporate new S3A FileSystem implementation - Key: HADOOP-10400 URL: https://issues.apache.org/jira/browse/HADOOP-10400 Project: Hadoop Common Issue Type: Improvement Components: fs, fs/s3 Affects Versions: 2.4.0 Reporter: Jordan Mendelson Assignee: Jordan Mendelson Attachments: HADOOP-10400-1.patch, HADOOP-10400-2.patch, HADOOP-10400-3.patch, HADOOP-10400-4.patch, HADOOP-10400-5.patch The s3native filesystem has a number of limitations (some of which were recently fixed by HADOOP-9454). This patch adds an s3a filesystem which uses the aws-sdk instead of the jets3t library. There are a number of improvements over s3native, including: - Parallel copy (rename) support (dramatically speeds up commits on large files) - AWS S3 explorer-compatible empty-directory files: xyz/ instead of xyz_$folder$ (reduces littering) - Ignores _$folder$ files created by s3native and other S3 browsing utilities - Supports multiple output buffer dirs to even out IO when uploading files - Supports IAM role-based authentication - Allows setting a default canned ACL for uploads (public, private, etc.) - Better error recovery handling - Should handle input seeks without having to download the whole file (frequently used for splits) This code is a copy of https://github.com/Aloisius/hadoop-s3a with patches to various pom files to get it to build against trunk. I've been using 0.0.1 in production with CDH 4 for several months and CDH 5 for a few days. The version here is 0.0.2, which changes around some keys to hopefully bring the key name style more in line with the rest of Hadoop 2.x. 
*Tunable parameters:* fs.s3a.access.key - Your AWS access key ID (omit for role authentication) fs.s3a.secret.key - Your AWS secret key (omit for role authentication) fs.s3a.connection.maximum - Controls how many parallel connections HttpClient spawns (default: 15) fs.s3a.connection.ssl.enabled - Enables or disables SSL connections to S3 (default: true) fs.s3a.attempts.maximum - How many times we should retry commands on transient errors (default: 10) fs.s3a.connection.timeout - Socket connect timeout (default: 5000) fs.s3a.paging.maximum - How many keys to request from S3 at a time when doing directory listings (default: 5000) fs.s3a.multipart.size - How big (in bytes) to split an upload or copy operation into (default: 104857600) fs.s3a.multipart.threshold - Until a file is this large (in bytes), use non-parallel upload (default: 2147483647) fs.s3a.acl.default - Set a canned ACL on newly created/copied objects (private | public-read | public-read-write | authenticated-read | log-delivery-write | bucket-owner-read | bucket-owner-full-control) fs.s3a.multipart.purge - True if you want to purge existing multipart uploads that may not have been completed/aborted correctly (default: false) fs.s3a.multipart.purge.age - Minimum age in seconds of multipart uploads to purge (default: 86400) fs.s3a.buffer.dir - Comma-separated list of directories used to buffer file writes (default: ${hadoop.tmp.dir}/s3a ) *Caveats*: Hadoop uses a standard output committer which uploads files as filename.COPYING before renaming them. This can cause unnecessary performance issues with S3, because S3 does not have a rename operation and already verifies uploads against an MD5 that the driver sets on the upload request. While this FileSystem should be significantly faster than the built-in s3native driver because of parallel copy support, you may want to consider setting a null output committer on your jobs to further improve performance. 
Because S3 requires the file length and MD5 to be known before a file is uploaded, all output is first buffered to a temporary file, similar to the s3native driver. Due to the lack of native rename() for S3, renaming extremely large files or directories may take a while. Unfortunately, there is no way to notify Hadoop that progress is still being made for rename operations, so your job may time out unless you increase the task timeout. This driver will fully ignore _$folder$ files. This was necessary so that it could interoperate with repositories that have had the s3native driver used on them, but it means that it won't recognize empty directories that s3native has been used on. Statistics for the filesystem may be calculated differently from the s3native filesystem. When uploading a file, we do not count writing the temporary file on the
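The tunable parameters above go into core-site.xml; a minimal illustrative fragment (values are placeholders, not recommendations):

```xml
<configuration>
  <!-- Credentials; omit both when using IAM role-based authentication. -->
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
  <!-- Split uploads/copies into 100 MB parts (the documented default). -->
  <property>
    <name>fs.s3a.multipart.size</name>
    <value>104857600</value>
  </property>
  <!-- Default canned ACL for newly created/copied objects. -->
  <property>
    <name>fs.s3a.acl.default</name>
    <value>private</value>
  </property>
</configuration>
```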
[jira] [Commented] (HADOOP-10643) Add NativeS3Fs that delegates calls from FileContext APIs to native s3 fs implementation
[ https://issues.apache.org/jira/browse/HADOOP-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013453#comment-14013453 ] Steve Loughran commented on HADOOP-10643: - this could be useful, when ready. We're looking for all our future S3 support, other than bug fixes, to be on HADOOP-10400 and the proposed new s3a filesystem, in a new module {{hadoop-tools/hadoop-aws}} # Can you look at the code proposed for HADOOP-10400 and build on that? If not, what's wrong with that patch? # all new code will need to go into the proposed new module, which will have tests that only run if the relevant authentication details are provided. This gives you an opportunity to write contract compliance tests for the new FS Add NativeS3Fs that delegates calls from FileContext APIs to native s3 fs implementation --- Key: HADOOP-10643 URL: https://issues.apache.org/jira/browse/HADOOP-10643 Project: Hadoop Common Issue Type: New Feature Components: fs/s3 Affects Versions: 2.4.0 Reporter: Sumit Kumar Attachments: HADOOP-10643.patch The new set of file system related APIs (FileContext/AbstractFileSystem) already support the local filesystem, HDFS, and viewfs; however, they don't support s3n. This patch is to add that support using configurations like fs.AbstractFileSystem.s3n.impl = org.apache.hadoop.fs.s3native.NativeS3Fs This patch, however, doesn't provide a new implementation; instead it relies on the DelegateToFileSystem abstract class to delegate all calls from FileContext APIs for s3n to the NativeS3FileSystem implementation. -- This message was sent by Atlassian JIRA (v6.2#6252)
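The binding described in the issue would be a single core-site.xml entry along these lines; the NativeS3Fs class name comes from the proposed patch, and the fragment is a sketch, not the committed configuration:

```xml
<!-- Bind the FileContext/AbstractFileSystem s3n scheme to the
     DelegateToFileSystem-based wrapper proposed in HADOOP-10643. -->
<property>
  <name>fs.AbstractFileSystem.s3n.impl</name>
  <value>org.apache.hadoop.fs.s3native.NativeS3Fs</value>
</property>
```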
[jira] [Commented] (HADOOP-10602) Documentation has broken Go Back hyperlinks.
[ https://issues.apache.org/jira/browse/HADOOP-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013527#comment-14013527 ] Hudson commented on HADOOP-10602: - FAILURE: Integrated in Hadoop-Yarn-trunk #568 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/568/]) HADOOP-10602. Correct CHANGES.txt. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598339) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt HADOOP-10602. Documentation has broken Go Back hyperlinks. Contributed by Akira AJISAKA. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598337) * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/site/apt/BuildingIt.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/site/apt/Configuration.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/site/apt/Examples.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-kms/src/site/apt/index.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/ServerSetup.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/UsingHttpTools.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/CentralizedCacheManagement.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ExtendedAttributes.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ViewFs.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapReduce_Compatibility_Hadoop1_Hadoop2.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/site/apt/SchedulerLoadSimulator.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManager.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerRest.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/TimelineServer.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WebServicesIntro.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WritingYarnApplications.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/YarnCommands.apt.vm Documentation has broken Go Back hyperlinks. 
-- Key: HADOOP-10602 URL: https://issues.apache.org/jira/browse/HADOOP-10602 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Akira AJISAKA Priority: Trivial Labels: newbie Fix For: 3.0.0, 2.5.0 Attachments: HADOOP-10602.2.patch, HADOOP-10602.3.patch, HADOOP-10602.patch Multiple pages of our documentation have Go Back links that are broken, because they point to an incorrect relative path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9968) ProxyUsers does not work with NetGroups
[ https://issues.apache.org/jira/browse/HADOOP-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013525#comment-14013525 ] Hudson commented on HADOOP-9968: FAILURE: Integrated in Hadoop-Yarn-trunk #568 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/568/]) HADOOP-9968. Update CHANGES.txt in trunk. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598442) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt ProxyUsers does not work with NetGroups --- Key: HADOOP-9968 URL: https://issues.apache.org/jira/browse/HADOOP-9968 Project: Hadoop Common Issue Type: Improvement Components: security Reporter: Benoy Antony Assignee: Benoy Antony Fix For: 3.0.0, 2.5.0 Attachments: HADOOP-9968.patch, HADOOP-9968.patch, HADOOP-9968.patch, hadoop-9968-1.2.patch It is possible to use NetGroups for ACLs. This requires specifying the config property hadoop.security.group.mapping as org.apache.hadoop.security.JniBasedUnixGroupsNetgroupMapping or org.apache.hadoop.security.ShellBasedUnixGroupsNetgroupMapping. The authorization to proxy a user by another user is specified as a list of groups hadoop.proxyuser.user-name.groups. The Group resolution does not work if we are using NetGroups. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10448) Support pluggable mechanism to specify proxy user settings
[ https://issues.apache.org/jira/browse/HADOOP-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013522#comment-14013522 ] Hudson commented on HADOOP-10448: - FAILURE: Integrated in Hadoop-Yarn-trunk #568 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/568/]) HADOOP-10448. Support pluggable mechanism to specify proxy user settings (Contributed by Benoy Antony) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598396) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/DefaultImpersonationProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/ImpersonationProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/ProxyUsers.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/MiniRPCBenchmark.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestDoAsEffectiveUser.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/authorize/TestProxyUsers.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/TestReaddir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestDelegationTokenForProxyUser.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/common/TestJspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/security/TestRefreshUserMappings.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java Support pluggable mechanism to specify proxy user settings -- Key: HADOOP-10448 URL: https://issues.apache.org/jira/browse/HADOOP-10448 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: 2.3.0 Reporter: Benoy Antony Assignee: Benoy Antony Fix For: 3.0.0, 2.5.0 Attachments: HADOOP-10448-branch2.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch We have a requirement to support large number of superusers. (users who impersonate as another user) (http://hadoop.apache.org/docs/r1.2.1/Secure_Impersonation.html) Currently each superuser needs to be defined in the core-site.xml via proxyuser settings. This will be cumbersome when there are 1000 entries. It seems useful to have a pluggable mechanism to specify proxy user settings with the current approach as the default. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10639) FileBasedKeyStoresFactory initialization is not using default for SSL_REQUIRE_CLIENT_CERT_KEY
[ https://issues.apache.org/jira/browse/HADOOP-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013524#comment-14013524 ] Hudson commented on HADOOP-10639: - FAILURE: Integrated in Hadoop-Yarn-trunk #568 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/568/]) HADOOP-10639. FileBasedKeyStoresFactory initialization is not using default for SSL_REQUIRE_CLIENT_CERT_KEY. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598413) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ssl/FileBasedKeyStoresFactory.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/ssl/TestSSLFactory.java FileBasedKeyStoresFactory initialization is not using default for SSL_REQUIRE_CLIENT_CERT_KEY - Key: HADOOP-10639 URL: https://issues.apache.org/jira/browse/HADOOP-10639 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.4.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.5.0 Attachments: HADOOP-10639.patch The FileBasedKeyStoresFactory initialization is defaulting SSL_REQUIRE_CLIENT_CERT_KEY to true instead of the default DEFAULT_SSL_REQUIRE_CLIENT_CERT (false). -- This message was sent by Atlassian JIRA (v6.2#6252)
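The bug class described above is easy to reproduce with a toy configuration reader. The Conf class below is a stand-in for Hadoop's Configuration, and the key/constant names merely mirror the JIRA description:

```java
import java.util.HashMap;
import java.util.Map;

public class SslConfDemo {
    static final String SSL_REQUIRE_CLIENT_CERT_KEY = "ssl.require.client.cert";
    static final boolean DEFAULT_SSL_REQUIRE_CLIENT_CERT = false;

    // Minimal stand-in for a key/value configuration with typed getters.
    static class Conf {
        private final Map<String, String> props = new HashMap<>();
        boolean getBoolean(String key, boolean defaultValue) {
            String v = props.get(key);
            return v == null ? defaultValue : Boolean.parseBoolean(v);
        }
    }

    public static void main(String[] args) {
        Conf conf = new Conf(); // key left unset, as in a default deployment
        // Buggy: a hard-coded true default flips behavior when the key is absent.
        boolean buggy = conf.getBoolean(SSL_REQUIRE_CLIENT_CERT_KEY, true);
        // Fixed: use the named default constant (false).
        boolean fixed = conf.getBoolean(SSL_REQUIRE_CLIENT_CERT_KEY,
                                        DEFAULT_SSL_REQUIRE_CLIENT_CERT);
        System.out.println(buggy + " " + fixed); // true false
    }
}
```

The fix in the patch is precisely this substitution: pass the named default constant instead of a literal, so unset configurations keep the documented behavior.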
[jira] [Commented] (HADOOP-10638) Updating hadoop-daemon.sh to work as expected when nfs is started as a privileged user.
[ https://issues.apache.org/jira/browse/HADOOP-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013526#comment-14013526 ] Hudson commented on HADOOP-10638: - FAILURE: Integrated in Hadoop-Yarn-trunk #568 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/568/]) HADOOP-10638. Updating hadoop-daemon.sh to work as expected when nfs is started as a privileged user. Contributed by Manikandan Narayanaswamy. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1598451) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh Updating hadoop-daemon.sh to work as expected when nfs is started as a privileged user. Key: HADOOP-10638 URL: https://issues.apache.org/jira/browse/HADOOP-10638 Project: Hadoop Common Issue Type: Bug Components: nfs Affects Versions: 2.4.0 Reporter: Manikandan Narayanaswamy Assignee: Manikandan Narayanaswamy Labels: patch Fix For: 2.5.0 Attachments: 0001-Picking-the-right-pid-file-when-running-NFS-as-privi.patch When NFS is started as a privileged user, this change sets up required environment variables: HADOOP_PID_DIR = $HADOOP_PRIVILEGED_NFS_PID_DIR HADOOP_LOG_DIR = $HADOOP_PRIVILEGED_NFS_LOG_DIR HADOOP_IDENT_STRING = $HADOOP_PRIVILEGED_NFS_USER Along with the above, we now collect ulimits for the right user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10644) Remote principal name case sensitivity issue introduced on Windows by HADOOP-10418
Remus Rusanu created HADOOP-10644: - Summary: Remote principal name case sensitivity issue introduced on Windows by HADOOP-10418 Key: HADOOP-10644 URL: https://issues.apache.org/jira/browse/HADOOP-10644 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.4.0 Reporter: Remus Rusanu HADOOP-10418 caused the SPN to be generated using KRB_NT_SRV_HST type. This results in a wrong case FQDN name and the {code} isPrincipalValid = serverPrincipal.equals(confPrincipal); {code} check fails due to case difference. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10644) Remote principal name case sensitivity issue introduced on Windows by HADOOP-10418
[ https://issues.apache.org/jira/browse/HADOOP-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HADOOP-10644: -- Labels: windows (was: ) Remote principal name case sensitivity issue introduced on Windows by HADOOP-10418 -- Key: HADOOP-10644 URL: https://issues.apache.org/jira/browse/HADOOP-10644 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.4.0 Reporter: Remus Rusanu Labels: windows HADOOP-10418 caused the SPN to be generated using KRB_NT_SRV_HST type. This results in a wrong case FQDN name and the {code} isPrincipalValid = serverPrincipal.equals(confPrincipal); {code} check fails due to case difference. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10644) Remote principal name case sensitivity issue introduced on Windows by HADOOP-10418
[ https://issues.apache.org/jira/browse/HADOOP-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013625#comment-14013625 ] Remus Rusanu commented on HADOOP-10644: --- Note that per RFC 4120 the fix is to lowercase the confPrincipal: {code} 6.2.1. Name of Server Principals ... Where the name of the host is not case sensitive (for example, with Internet domain names) the name of the host MUST be lowercase {code} Remote principal name case sensitivity issue introduced on Windows by HADOOP-10418 -- Key: HADOOP-10644 URL: https://issues.apache.org/jira/browse/HADOOP-10644 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.4.0 Reporter: Remus Rusanu Labels: windows HADOOP-10418 caused the SPN to be generated using KRB_NT_SRV_HST type. This results in a wrong case FQDN name and the {code} isPrincipalValid = serverPrincipal.equals(confPrincipal); {code} check fails due to case difference. -- This message was sent by Atlassian JIRA (v6.2#6252)
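A sketch of the proposed fix, lowercasing the host component of a service principal before comparison per RFC 4120 section 6.2.1; the class and helper names are hypothetical, not the actual Hadoop patch:

```java
public class PrincipalCheck {
    // Lowercase only the host part of a service/host@REALM principal;
    // other principal forms are returned unchanged.
    static String normalize(String principal) {
        int slash = principal.indexOf('/');
        int at = principal.indexOf('@');
        if (slash < 0 || at < 0 || at < slash) {
            return principal;
        }
        String host = principal.substring(slash + 1, at);
        return principal.substring(0, slash + 1)
            + host.toLowerCase(java.util.Locale.US)
            + principal.substring(at);
    }

    static boolean isPrincipalValid(String serverPrincipal, String confPrincipal) {
        return normalize(serverPrincipal).equals(normalize(confPrincipal));
    }

    public static void main(String[] args) {
        // KRB_NT_SRV_HST can yield a mixed-case FQDN on Windows; after
        // normalization the comparison succeeds.
        System.out.println(isPrincipalValid(
            "hdfs/NameNode.Example.COM@EXAMPLE.COM",
            "hdfs/namenode.example.com@EXAMPLE.COM")); // true
    }
}
```

Normalizing both sides (rather than only confPrincipal) makes the check robust regardless of which side carries the mixed-case FQDN.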
[jira] [Commented] (HADOOP-10639) FileBasedKeyStoresFactory initialization is not using default for SSL_REQUIRE_CLIENT_CERT_KEY
[ https://issues.apache.org/jira/browse/HADOOP-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013643#comment-14013643 ] Hudson commented on HADOOP-10639: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1759 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1759/]) HADOOP-10639. FileBasedKeyStoresFactory initialization is not using default for SSL_REQUIRE_CLIENT_CERT_KEY. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598413) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ssl/FileBasedKeyStoresFactory.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/ssl/TestSSLFactory.java FileBasedKeyStoresFactory initialization is not using default for SSL_REQUIRE_CLIENT_CERT_KEY - Key: HADOOP-10639 URL: https://issues.apache.org/jira/browse/HADOOP-10639 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.4.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.5.0 Attachments: HADOOP-10639.patch The FileBasedKeyStoresFactory initialization is defaulting SSL_REQUIRE_CLIENT_CERT_KEY to true instead of the default DEFAULT_SSL_REQUIRE_CLIENT_CERT (false). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10448) Support pluggable mechanism to specify proxy user settings
[ https://issues.apache.org/jira/browse/HADOOP-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013641#comment-14013641 ] Hudson commented on HADOOP-10448: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1759 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1759/]) HADOOP-10448. Support pluggable mechanism to specify proxy user settings (Contributed by Benoy Antony) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598396) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/DefaultImpersonationProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/ImpersonationProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/ProxyUsers.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/MiniRPCBenchmark.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestDoAsEffectiveUser.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/authorize/TestProxyUsers.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/TestReaddir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestDelegationTokenForProxyUser.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/common/TestJspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/security/TestRefreshUserMappings.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java Support pluggable mechanism to specify proxy user settings -- Key: HADOOP-10448 URL: https://issues.apache.org/jira/browse/HADOOP-10448 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: 2.3.0 Reporter: Benoy Antony Assignee: Benoy Antony Fix For: 3.0.0, 2.5.0 Attachments: HADOOP-10448-branch2.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch We have a requirement to support large number of superusers. (users who impersonate as another user) (http://hadoop.apache.org/docs/r1.2.1/Secure_Impersonation.html) Currently each superuser needs to be defined in the core-site.xml via proxyuser settings. This will be cumbersome when there are 1000 entries. It seems useful to have a pluggable mechanism to specify proxy user settings with the current approach as the default. -- This message was sent by Atlassian JIRA (v6.2#6252)
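The pluggable mechanism described above can be sketched in miniature. This is a hypothetical illustration loosely modeled on the ImpersonationProvider abstraction this patch introduces; the interface and the map-backed default below are illustrative, not the actual Hadoop API.

```java
import java.util.Map;
import java.util.Set;

public class ImpersonationSketch {

    /** Minimal provider contract: may superUser act on behalf of proxiedUser? */
    public interface Provider {
        boolean authorize(String superUser, String proxiedUser);
    }

    /** Default behavior: explicit per-superuser allow lists, core-site.xml style. */
    public static Provider mapBacked(Map<String, Set<String>> allowed) {
        return (superUser, proxiedUser) -> {
            Set<String> users = allowed.get(superUser);
            return users != null && users.contains(proxiedUser);
        };
    }

    public static void main(String[] args) {
        // A deployment with thousands of superusers could swap this lambda for an
        // LDAP- or database-backed lookup without touching core-site.xml at all.
        Provider p = mapBacked(Map.of("oozie", Set.of("alice", "bob")));
        System.out.println(p.authorize("oozie", "alice"));   // true
        System.out.println(p.authorize("oozie", "mallory")); // false
    }
}
```

The point of the plugin boundary is exactly this: the caller only sees the authorize check, so the backing store can scale independently of the config file.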
[jira] [Commented] (HADOOP-10638) Updating hadoop-daemon.sh to work as expected when nfs is started as a privileged user.
[ https://issues.apache.org/jira/browse/HADOOP-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013645#comment-14013645 ] Hudson commented on HADOOP-10638: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1759 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1759/]) HADOOP-10638. Updating hadoop-daemon.sh to work as expected when nfs is started as a privileged user. Contributed by Manikandan Narayanaswamy. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1598451) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh Updating hadoop-daemon.sh to work as expected when nfs is started as a privileged user. Key: HADOOP-10638 URL: https://issues.apache.org/jira/browse/HADOOP-10638 Project: Hadoop Common Issue Type: Bug Components: nfs Affects Versions: 2.4.0 Reporter: Manikandan Narayanaswamy Assignee: Manikandan Narayanaswamy Labels: patch Fix For: 2.5.0 Attachments: 0001-Picking-the-right-pid-file-when-running-NFS-as-privi.patch When NFS is started as a privileged user, this change sets up the required environment variables: HADOOP_PID_DIR = $HADOOP_PRIVILEGED_NFS_PID_DIR HADOOP_LOG_DIR = $HADOOP_PRIVILEGED_NFS_LOG_DIR HADOOP_IDENT_STRING = $HADOOP_PRIVILEGED_NFS_USER Along with the above, we now also collect ulimits for the right user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10602) Documentation has broken Go Back hyperlinks.
[ https://issues.apache.org/jira/browse/HADOOP-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013646#comment-14013646 ] Hudson commented on HADOOP-10602: - FAILURE: Integrated in Hadoop-Hdfs-trunk #1759 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1759/]) HADOOP-10602. Correct CHANGES.txt. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598339) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt HADOOP-10602. Documentation has broken Go Back hyperlinks. Contributed by Akira AJISAKA. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598337) * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/site/apt/BuildingIt.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/site/apt/Configuration.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/site/apt/Examples.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-kms/src/site/apt/index.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/ServerSetup.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/UsingHttpTools.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/CentralizedCacheManagement.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ExtendedAttributes.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ViewFs.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapReduce_Compatibility_Hadoop1_Hadoop2.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/site/apt/SchedulerLoadSimulator.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManager.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerRest.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/TimelineServer.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WebServicesIntro.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WritingYarnApplications.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/YarnCommands.apt.vm Documentation has broken Go Back hyperlinks. 
-- Key: HADOOP-10602 URL: https://issues.apache.org/jira/browse/HADOOP-10602 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Akira AJISAKA Priority: Trivial Labels: newbie Fix For: 3.0.0, 2.5.0 Attachments: HADOOP-10602.2.patch, HADOOP-10602.3.patch, HADOOP-10602.patch Multiple pages of our documentation have Go Back links that are broken, because they point to an incorrect relative path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9968) ProxyUsers does not work with NetGroups
[ https://issues.apache.org/jira/browse/HADOOP-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013644#comment-14013644 ] Hudson commented on HADOOP-9968: FAILURE: Integrated in Hadoop-Hdfs-trunk #1759 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1759/]) HADOOP-9968. Update CHANGES.txt in trunk. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598442) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt ProxyUsers does not work with NetGroups --- Key: HADOOP-9968 URL: https://issues.apache.org/jira/browse/HADOOP-9968 Project: Hadoop Common Issue Type: Improvement Components: security Reporter: Benoy Antony Assignee: Benoy Antony Fix For: 3.0.0, 2.5.0 Attachments: HADOOP-9968.patch, HADOOP-9968.patch, HADOOP-9968.patch, hadoop-9968-1.2.patch It is possible to use NetGroups for ACLs. This requires specifying the config property hadoop.security.group.mapping as org.apache.hadoop.security.JniBasedUnixGroupsNetgroupMapping or org.apache.hadoop.security.ShellBasedUnixGroupsNetgroupMapping. The authorization to proxy a user by another user is specified as a list of groups hadoop.proxyuser.user-name.groups. The Group resolution does not work if we are using NetGroups. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-10644) Remote principal name case sensitivity issue introduced on Windows by HADOOP-10418
[ https://issues.apache.org/jira/browse/HADOOP-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu resolved HADOOP-10644. --- Resolution: Not a Problem OK, the issue comes from config: the principals in *-site.xml must be properly cased (lowercase hostnames). SecurityUtil.replacePattern would have done the right thing, had it had the chance. Remote principal name case sensitivity issue introduced on Windows by HADOOP-10418 -- Key: HADOOP-10644 URL: https://issues.apache.org/jira/browse/HADOOP-10644 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.4.0 Reporter: Remus Rusanu Labels: windows HADOOP-10418 caused the SPN to be generated using the KRB_NT_SRV_HST type. This results in a wrong-case FQDN name, and the {code} isPrincipalValid = serverPrincipal.equals(confPrincipal); {code} check fails due to the case difference. -- This message was sent by Atlassian JIRA (v6.2#6252)
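The failure mode above is easy to demonstrate. This is an illustrative sketch, not the actual SecurityUtil code: a byte-for-byte equals() fails when Windows returns an upper-case FQDN, while lower-casing only the host component of a service/host@REALM principal before comparing would match.

```java
public class PrincipalCase {

    /** Naive check, as in the snippet above: any case difference breaks it. */
    public static boolean strictMatch(String server, String conf) {
        return server.equals(conf);
    }

    /** Lower-case only the host component of service/host@REALM. */
    public static String normalizeHost(String principal) {
        int slash = principal.indexOf('/');
        int at = principal.indexOf('@');
        if (slash < 0 || at < slash) return principal; // not service/host@REALM form
        return principal.substring(0, slash + 1)
            + principal.substring(slash + 1, at).toLowerCase()
            + principal.substring(at); // realm case is kept significant
    }

    public static boolean relaxedMatch(String server, String conf) {
        return normalizeHost(server).equals(normalizeHost(conf));
    }

    public static void main(String[] args) {
        String server = "nn/HOST1.EXAMPLE.COM@EXAMPLE.COM"; // KRB_NT_SRV_HST result
        String conf   = "nn/host1.example.com@EXAMPLE.COM"; // *-site.xml value
        System.out.println(strictMatch(server, conf));  // false
        System.out.println(relaxedMatch(server, conf)); // true
    }
}
```

As the resolution notes, the practical fix was on the config side (lowercase hostnames in *-site.xml), not a code change.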
[jira] [Commented] (HADOOP-10561) Copy command with preserve option should handle Xattrs
[ https://issues.apache.org/jira/browse/HADOOP-10561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013740#comment-14013740 ] Yi Liu commented on HADOOP-10561: - In MAPREDUCE-5898, we added distcp support for preserving HDFS xattrs, using the option flag -px. As [~andrew.wang] said, {{cp}} with the \-p option doesn't preserve xattrs in Linux; you have to specify \-\-preserve=xattr or \-\-preserve=all. So we have two choices: - the same as in Linux, using \-\-preserve or \-\-preserve=all - similar to distcp, using the -px flag. Per Andrew's comment, and to stay consistent with distcp, we choose option 2. Copy command with preserve option should handle Xattrs -- Key: HADOOP-10561 URL: https://issues.apache.org/jira/browse/HADOOP-10561 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 3.0.0 Reporter: Uma Maheswara Rao G Assignee: Yi Liu The design docs for XAttrs stated that we handle preserve options with copy commands. From the doc: the preserve option of commands like the "cp -p" shell command and "distcp -p" should work on XAttrs. In the case where the source fs supports XAttrs but the target fs does not, XAttrs will be ignored with a warning message. -- This message was sent by Atlassian JIRA (v6.2#6252)
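The "-px"-style flag discussed above is a "-p" prefix followed by single-letter attribute selectors. A hypothetical sketch of parsing such a flag into a set of attributes to preserve; the letters and attribute names here are illustrative and are not the exact flag set of distcp or the shell cp command.

```java
import java.util.EnumSet;

public class PreserveFlags {
    public enum Attr { PERMISSION, XATTR, TIMES }

    /** Parse e.g. "-px" into {XATTR}, "-ppt" into {PERMISSION, TIMES}. */
    public static EnumSet<Attr> parse(String flag) {
        if (!flag.startsWith("-p")) {
            throw new IllegalArgumentException("not a preserve flag: " + flag);
        }
        EnumSet<Attr> out = EnumSet.noneOf(Attr.class);
        for (char c : flag.substring(2).toCharArray()) {
            switch (c) {
                case 'p': out.add(Attr.PERMISSION); break;
                case 'x': out.add(Attr.XATTR); break;
                case 't': out.add(Attr.TIMES); break;
                default: throw new IllegalArgumentException("unknown attribute: " + c);
            }
        }
        return out;
    }
}
```

The appeal of option 2 is visible here: one flag token scales to new attributes (like xattrs) by adding a letter, rather than a new long option per attribute.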
[jira] [Commented] (HADOOP-10639) FileBasedKeyStoresFactory initialization is not using default for SSL_REQUIRE_CLIENT_CERT_KEY
[ https://issues.apache.org/jira/browse/HADOOP-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013770#comment-14013770 ] Hudson commented on HADOOP-10639: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1786 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1786/]) HADOOP-10639. FileBasedKeyStoresFactory initialization is not using default for SSL_REQUIRE_CLIENT_CERT_KEY. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1598413) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ssl/FileBasedKeyStoresFactory.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/ssl/TestSSLFactory.java FileBasedKeyStoresFactory initialization is not using default for SSL_REQUIRE_CLIENT_CERT_KEY - Key: HADOOP-10639 URL: https://issues.apache.org/jira/browse/HADOOP-10639 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.4.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.5.0 Attachments: HADOOP-10639.patch The FileBasedKeyStoresFactory initialization is defaulting SSL_REQUIRE_CLIENT_CERT_KEY to true instead of the default DEFAULT_SSL_REQUIRE_CLIENT_CERT (false). -- This message was sent by Atlassian JIRA (v6.2#6252)
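The bug class described above, reading a boolean config key with a hard-coded default instead of the declared DEFAULT_* constant, can be shown in a few lines. The key and constant names below mirror the ones mentioned in the issue, but the Configuration stand-in (a Properties object) is illustrative, not the actual FileBasedKeyStoresFactory code.

```java
import java.util.Properties;

public class DefaultBool {
    static final String SSL_REQUIRE_CLIENT_CERT_KEY = "ssl.require.client.cert";
    static final boolean DEFAULT_SSL_REQUIRE_CLIENT_CERT = false;

    static boolean getBoolean(Properties conf, String key, boolean defVal) {
        String v = conf.getProperty(key);
        return v == null ? defVal : Boolean.parseBoolean(v.trim());
    }

    /** Buggy: silently requires client certs whenever the key is unset. */
    public static boolean buggy(Properties conf) {
        return getBoolean(conf, SSL_REQUIRE_CLIENT_CERT_KEY, true);
    }

    /** Fixed: falls back to the declared default (false). */
    public static boolean fixed(Properties conf) {
        return getBoolean(conf, SSL_REQUIRE_CLIENT_CERT_KEY,
            DEFAULT_SSL_REQUIRE_CLIENT_CERT);
    }
}
```

With an unset key the two paths diverge, which is exactly why the regression only shows up on deployments that never set the property explicitly.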
[jira] [Commented] (HADOOP-10638) Updating hadoop-daemon.sh to work as expected when nfs is started as a privileged user.
[ https://issues.apache.org/jira/browse/HADOOP-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013771#comment-14013771 ] Hudson commented on HADOOP-10638: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1786 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1786/]) HADOOP-10638. Updating hadoop-daemon.sh to work as expected when nfs is started as a privileged user. Contributed by Manikandan Narayanaswamy. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1598451) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh Updating hadoop-daemon.sh to work as expected when nfs is started as a privileged user. Key: HADOOP-10638 URL: https://issues.apache.org/jira/browse/HADOOP-10638 Project: Hadoop Common Issue Type: Bug Components: nfs Affects Versions: 2.4.0 Reporter: Manikandan Narayanaswamy Assignee: Manikandan Narayanaswamy Labels: patch Fix For: 2.5.0 Attachments: 0001-Picking-the-right-pid-file-when-running-NFS-as-privi.patch When NFS is started as a privileged user, this change sets up the required environment variables: HADOOP_PID_DIR = $HADOOP_PRIVILEGED_NFS_PID_DIR HADOOP_LOG_DIR = $HADOOP_PRIVILEGED_NFS_LOG_DIR HADOOP_IDENT_STRING = $HADOOP_PRIVILEGED_NFS_USER Along with the above, we now also collect ulimits for the right user. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9968) ProxyUsers does not work with NetGroups
[ https://issues.apache.org/jira/browse/HADOOP-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013760#comment-14013760 ] Hudson commented on HADOOP-9968: FAILURE: Integrated in Hadoop-Mapreduce-trunk #1786 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1786/]) HADOOP-9968. Update CHANGES.txt in trunk. (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598442) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt ProxyUsers does not work with NetGroups --- Key: HADOOP-9968 URL: https://issues.apache.org/jira/browse/HADOOP-9968 Project: Hadoop Common Issue Type: Improvement Components: security Reporter: Benoy Antony Assignee: Benoy Antony Fix For: 3.0.0, 2.5.0 Attachments: HADOOP-9968.patch, HADOOP-9968.patch, HADOOP-9968.patch, hadoop-9968-1.2.patch It is possible to use NetGroups for ACLs. This requires specifying the config property hadoop.security.group.mapping as org.apache.hadoop.security.JniBasedUnixGroupsNetgroupMapping or org.apache.hadoop.security.ShellBasedUnixGroupsNetgroupMapping. The authorization to proxy a user by another user is specified as a list of groups hadoop.proxyuser.user-name.groups. The Group resolution does not work if we are using NetGroups. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10625) Configuration: names should be trimmed when putting/getting to properties
[ https://issues.apache.org/jira/browse/HADOOP-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013764#comment-14013764 ] Hudson commented on HADOOP-10625: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1786 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1786/]) HADOOP-10625. Trim configuration names when putting/getting them to properties (xgong: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1598072) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java Configuration: names should be trimmed when putting/getting to properties - Key: HADOOP-10625 URL: https://issues.apache.org/jira/browse/HADOOP-10625 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 2.4.0 Reporter: Wangda Tan Assignee: Wangda Tan Fix For: 2.5.0 Attachments: HADOOP-10625.patch, HADOOP-10625.patch, HADOOP-10625.patch Currently, Hadoop does not trim the name when putting a k/v pair into properties, but when loading configuration from a file, names are trimmed: (In Configuration.java) {code} if (name.equals(field.getTagName()) && field.hasChildNodes()) attr = StringInterner.weakIntern( ((Text)field.getFirstChild()).getData().trim()); if (value.equals(field.getTagName()) && field.hasChildNodes()) value = StringInterner.weakIntern( ((Text)field.getFirstChild()).getData()); {code} With this behavior, the following steps are problematic: 1. User incorrectly sets hadoop.key=value (with a space before hadoop.key) 2. User tries to get hadoop.key, cannot get the value 3. Serialize/deserialize the configuration (as MR does) 4. User tries to get hadoop.key and now gets the value, which creates an inconsistency. -- This message was sent by Atlassian JIRA (v6.2#6252)
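The symmetry fix this issue asks for can be sketched with a HashMap standing in for Hadoop's Configuration: trim the name on both set() and get(), so a key written with stray whitespace is still found under its trimmed form. This is a minimal illustration of the idea, not the actual Configuration code.

```java
import java.util.HashMap;
import java.util.Map;

public class TrimmedConf {
    private final Map<String, String> props = new HashMap<>();

    public void set(String name, String value) {
        props.put(name.trim(), value);   // trim on write...
    }

    public String get(String name) {
        return props.get(name.trim());   // ...and on read, so both sides agree
    }
}
```

Trimming on only one side reproduces exactly the inconsistency in the four steps above: " hadoop.key" and "hadoop.key" become distinct entries until a load/save round trip silently merges them.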
[jira] [Commented] (HADOOP-10602) Documentation has broken Go Back hyperlinks.
[ https://issues.apache.org/jira/browse/HADOOP-10602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013772#comment-14013772 ] Hudson commented on HADOOP-10602: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1786 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1786/]) HADOOP-10602. Correct CHANGES.txt. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598339) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt HADOOP-10602. Documentation has broken Go Back hyperlinks. Contributed by Akira AJISAKA. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598337) * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/site/apt/BuildingIt.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/site/apt/Configuration.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-auth/src/site/apt/Examples.apt.vm * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-kms/src/site/apt/index.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/ServerSetup.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/site/apt/UsingHttpTools.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/CentralizedCacheManagement.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ExtendedAttributes.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsNfsGateway.apt.vm * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/ViewFs.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/DistributedCacheDeploy.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/EncryptedShuffle.apt.vm * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapReduce_Compatibility_Hadoop1_Hadoop2.apt.vm * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm * /hadoop/common/trunk/hadoop-tools/hadoop-sls/src/site/apt/SchedulerLoadSimulator.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManager.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerRest.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/TimelineServer.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WebServicesIntro.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/WritingYarnApplications.apt.vm * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/YarnCommands.apt.vm Documentation has broken Go Back hyperlinks. 
-- Key: HADOOP-10602 URL: https://issues.apache.org/jira/browse/HADOOP-10602 Project: Hadoop Common Issue Type: Bug Components: documentation Affects Versions: 3.0.0, 2.4.0 Reporter: Chris Nauroth Assignee: Akira AJISAKA Priority: Trivial Labels: newbie Fix For: 3.0.0, 2.5.0 Attachments: HADOOP-10602.2.patch, HADOOP-10602.3.patch, HADOOP-10602.patch Multiple pages of our documentation have Go Back links that are broken, because they point to an incorrect relative path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10448) Support pluggable mechanism to specify proxy user settings
[ https://issues.apache.org/jira/browse/HADOOP-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013768#comment-14013768 ] Hudson commented on HADOOP-10448: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1786 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1786/]) HADOOP-10448. Support pluggable mechanism to specify proxy user settings (Contributed by Benoy Antony) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598396) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/DefaultImpersonationProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/ImpersonationProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/authorize/ProxyUsers.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/MiniRPCBenchmark.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestDoAsEffectiveUser.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/authorize/TestProxyUsers.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/TestReaddir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestDelegationTokenForProxyUser.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/common/TestJspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/security/TestRefreshUserMappings.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMAdminService.java Support pluggable mechanism to specify proxy user settings -- Key: HADOOP-10448 URL: https://issues.apache.org/jira/browse/HADOOP-10448 Project: Hadoop Common Issue Type: Sub-task Components: security Affects Versions: 2.3.0 Reporter: Benoy Antony Assignee: Benoy Antony Fix For: 3.0.0, 2.5.0 Attachments: HADOOP-10448-branch2.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch, HADOOP-10448.patch We have a requirement to support large number of superusers. (users who impersonate as another user) (http://hadoop.apache.org/docs/r1.2.1/Secure_Impersonation.html) Currently each superuser needs to be defined in the core-site.xml via proxyuser settings. This will be cumbersome when there are 1000 entries. It seems useful to have a pluggable mechanism to specify proxy user settings with the current approach as the default. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9704) Write metrics sink plugin for Hadoop/Graphite
[ https://issues.apache.org/jira/browse/HADOOP-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Babak Behzad updated HADOOP-9704: - Attachment: HADOOP-9704.patch Write metrics sink plugin for Hadoop/Graphite - Key: HADOOP-9704 URL: https://issues.apache.org/jira/browse/HADOOP-9704 Project: Hadoop Common Issue Type: New Feature Affects Versions: 2.0.3-alpha Reporter: Chu Tong Attachments: 0001-HADOOP-9704.-Write-metrics-sink-plugin-for-Hadoop-Gr.patch, HADOOP-9704.patch, HADOOP-9704.patch, HADOOP-9704.patch, Hadoop-9704.patch Write a metrics sink plugin for Hadoop to send metrics directly to Graphite, in addition to the current Ganglia and file ones. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9704) Write metrics sink plugin for Hadoop/Graphite
[ https://issues.apache.org/jira/browse/HADOOP-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013912#comment-14013912 ] Babak Behzad commented on HADOOP-9704: -- Thanks to both Ravi and Luke. I fixed the tabs and the formatting and attached a new patch; hopefully it is correct now. Sorry about that. Write metrics sink plugin for Hadoop/Graphite - Key: HADOOP-9704 URL: https://issues.apache.org/jira/browse/HADOOP-9704 Project: Hadoop Common Issue Type: New Feature Affects Versions: 2.0.3-alpha Reporter: Chu Tong Attachments: 0001-HADOOP-9704.-Write-metrics-sink-plugin-for-Hadoop-Gr.patch, HADOOP-9704.patch, HADOOP-9704.patch, HADOOP-9704.patch, Hadoop-9704.patch Write a metrics sink plugin for Hadoop to send metrics directly to Graphite, in addition to the current Ganglia and file ones. -- This message was sent by Atlassian JIRA (v6.2#6252)
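For context on what such a sink has to emit: Graphite's plaintext protocol is one "metric.path value epoch-seconds" line per datapoint. The sketch below shows only that wire format, with an illustrative prefix/metric naming scheme; a real Hadoop metrics sink would additionally manage the socket connection and flushing.

```java
public class GraphiteLine {
    /** Format one datapoint in Graphite's plaintext protocol. */
    public static String format(String prefix, String metric,
                                double value, long epochSeconds) {
        return prefix + "." + metric + " " + value + " " + epochSeconds + "\n";
    }

    public static void main(String[] args) {
        // e.g. written to a TCP socket on the Carbon daemon's plaintext port
        System.out.print(format("hadoop.nn", "rpc.queue_time", 1.5, 1401000000L));
    }
}
```

Keeping the formatting separate from the transport is what makes a sink like this easy to unit-test, which matters given the formatting review comments above.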
[jira] [Updated] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9361: --- Status: Open (was: Patch Available) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance - Key: HADOOP-9361 URL: https://issues.apache.org/jira/browse/HADOOP-9361 Project: Hadoop Common Issue Type: Improvement Components: fs, test Affects Versions: 2.4.0, 2.2.0, 3.0.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while HDFS gets tested downstream, other filesystems, such as blobstore bindings, don't. The only tests that are common are those of {{FileSystemContractTestBase}}, which HADOOP-9258 shows is incomplete. I propose # writing more tests which clarify expected behavior # testing operations in the interface being in their own JUnit4 test classes, instead of one big test suite. # Having each FS declare via a properties file what behaviors they offer, such as atomic-rename, atomic-delete, umask, immediate-consistency -test methods can downgrade to skipped test cases if a feature is missing. -- This message was sent by Atlassian JIRA (v6.2#6252)
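Point 3 of the proposal above, each filesystem declaring its behaviors via a properties file so tests can downgrade to skips, can be sketched with java.util.Properties. The property-name scheme ("fs.contract." plus a feature key) is illustrative, not the eventual contract-option keys.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class ContractOptions {
    private final Properties props;

    public ContractOptions(Properties props) { this.props = props; }

    /** Absent features default to unsupported, so tests skip rather than fail. */
    public boolean supports(String feature) {
        return Boolean.parseBoolean(
            props.getProperty("fs.contract." + feature, "false"));
    }

    /** Load a declaration, e.g. the contents of an fs-specific .properties file. */
    public static ContractOptions load(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return new ContractOptions(p);
    }
}
```

A test method would then begin with something like "if (!options.supports("atomic-rename")) skip;" so blobstores are exercised by the same suite without spurious failures.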
[jira] [Commented] (HADOOP-10400) Incorporate new S3A FileSystem implementation
[ https://issues.apache.org/jira/browse/HADOOP-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013959#comment-14013959 ] Steve Loughran commented on HADOOP-10400: - Amadeep, the patch is not good to go until it works, which is what the extra HADOOP-9361 tests will verify -testing a lot more than the existing FS contract does. I am confident that this is the case, because HADOOP-10533 shows that the last update to s3n caused a lot of regressions. We need the extra tests so that we can be confident that the code works as expected. I think '9361 is nearly ready to go in -if we can get the core spec and abstract tests in, then s3a can pick them up quickly, and there's less to worry about in terms of backwards compatibility in any changes. One thing we could do, quickly, is create a stub {{hadoop-tools/hadoop-aws}} module that has nothing but the code structure and the maven dependency -this patch could then use that as the basis for code -rather than build- changes. I can help do that with a build that doesn't run tests until a test configuration resource file is present. Incorporate new S3A FileSystem implementation - Key: HADOOP-10400 URL: https://issues.apache.org/jira/browse/HADOOP-10400 Project: Hadoop Common Issue Type: Improvement Components: fs, fs/s3 Affects Versions: 2.4.0 Reporter: Jordan Mendelson Assignee: Jordan Mendelson Attachments: HADOOP-10400-1.patch, HADOOP-10400-2.patch, HADOOP-10400-3.patch, HADOOP-10400-4.patch, HADOOP-10400-5.patch The s3native filesystem has a number of limitations (some of which were recently fixed by HADOOP-9454). This patch adds an s3a filesystem which uses the aws-sdk instead of the jets3t library. 
There are a number of improvements over s3native, including:
- Parallel copy (rename) support (dramatically speeds up commits on large files)
- AWS S3 explorer-compatible empty directories (files named xyz/ instead of xyz_$folder$, which reduces littering)
- Ignores _$folder$ files created by s3native and other S3 browsing utilities
- Supports multiple output buffer dirs to even out IO when uploading files
- Supports IAM role-based authentication
- Allows setting a default canned ACL for uploads (public, private, etc.)
- Better error-recovery handling
- Should handle input seeks without having to download the whole file (used a lot for splits)
This code is a copy of https://github.com/Aloisius/hadoop-s3a with patches to various pom files to get it to build against trunk. I've been using 0.0.1 in production with CDH 4 for several months and CDH 5 for a few days. The version here is 0.0.2, which changes some keys to bring the key-name style more in line with the rest of hadoop 2.x. 
*Tunable parameters:*
- fs.s3a.access.key - Your AWS access key ID (omit for role authentication)
- fs.s3a.secret.key - Your AWS secret key (omit for role authentication)
- fs.s3a.connection.maximum - Controls how many parallel connections HttpClient spawns (default: 15)
- fs.s3a.connection.ssl.enabled - Enables or disables SSL connections to S3 (default: true)
- fs.s3a.attempts.maximum - How many times we should retry commands on transient errors (default: 10)
- fs.s3a.connection.timeout - Socket connect timeout (default: 5000)
- fs.s3a.paging.maximum - How many keys to request from S3 at a time when doing directory listings (default: 5000)
- fs.s3a.multipart.size - How big (in bytes) to split an upload or copy operation into (default: 104857600)
- fs.s3a.multipart.threshold - Until a file is this large (in bytes), use non-parallel upload (default: 2147483647)
- fs.s3a.acl.default - Set a canned ACL on newly created/copied objects (private | public-read | public-read-write | authenticated-read | log-delivery-write | bucket-owner-read | bucket-owner-full-control)
- fs.s3a.multipart.purge - True if you want to purge existing multipart uploads that may not have been completed/aborted correctly (default: false)
- fs.s3a.multipart.purge.age - Minimum age in seconds of multipart uploads to purge (default: 86400)
- fs.s3a.buffer.dir - Comma-separated list of directories that will be used to buffer file writes out of (default: uses ${hadoop.tmp.dir}/s3a)
*Caveats*: Hadoop uses a standard output committer which uploads files as filename.COPYING before renaming them. This can cause unnecessary performance issues with S3 because it does not have a rename operation, and S3 already verifies uploads against an MD5 that the driver sets on the upload request. While this FileSystem should be significantly faster than the built-in s3native driver because of parallel copy support, you may want to consider setting a null output
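The tunable parameters above are ordinary Hadoop configuration keys, so they would be set in core-site.xml like any other. A minimal illustrative fragment; the values shown are just the documented defaults and an example ACL, not recommendations:

```xml
<property>
  <name>fs.s3a.connection.maximum</name>
  <value>15</value>
</property>
<property>
  <name>fs.s3a.multipart.size</name>
  <value>104857600</value>
</property>
<property>
  <name>fs.s3a.acl.default</name>
  <value>bucket-owner-full-control</value>
</property>
```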
[jira] [Updated] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9361: --- Attachment: HADOOP-9361-013.patch This iteration has the following main changes:
# docs cover testing and extending the specification
# tests have options for specific behaviours of HDFS and the local fs on rename corner cases (overwrite, destination is a nonexistent directory)
# FTP mostly works, except for the detail that rename() fails consistently.
# Swift FS exceptions are in sync with what the contract tests expect, as well as the existing tests.
# S3N has all the fixes needed for HADOOP-10589 -as well as the tests to show the issue.
I think this is complete enough to go in -if jenkins is happy. We just need to split up the individual patches into manageable units: core + local, hdfs, s3, ftp, swift for actual application. Given the state of s3, I'd give that priority. If someone wants to run these tests, have a look at the testing.md file -it explains how to do it. Strictly define the expected behavior of filesystem APIs and write tests to verify compliance - Key: HADOOP-9361 URL: https://issues.apache.org/jira/browse/HADOOP-9361 Project: Hadoop Common Issue Type: Improvement Components: fs, test Affects Versions: 3.0.0, 2.2.0, 2.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, HADOOP-9361-013.patch {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while HDFS gets tested downstream, other filesystems, such as blobstore bindings, don't. The only tests that are common are those of {{FileSystemContractTestBase}}, which HADOOP-9258 shows is incomplete. 
I propose:
# writing more tests which clarify expected behavior
# testing operations in the interface in their own JUnit4 test classes, instead of one big test suite.
# having each FS declare via a properties file what behaviors it offers, such as atomic-rename, atomic-delete, umask, immediate-consistency -test methods can downgrade to skipped test cases if a feature is missing.
-- This message was sent by Atlassian JIRA (v6.2#6252)
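The third point could work roughly like this (a minimal sketch of the idea only; the property names and class name here are illustrative, not the ones any patch actually uses):

```java
import java.io.StringReader;
import java.util.Properties;

// Sketch: each filesystem ships a properties file declaring which contract
// behaviors it offers, and test methods downgrade themselves to skipped
// test cases when a required feature is absent.
public class ContractOptions {

    // True when the filesystem declares support for the named feature;
    // an undeclared feature is treated as unsupported.
    static boolean featureEnabled(Properties opts, String key) {
        return Boolean.parseBoolean(opts.getProperty(key, "false"));
    }

    public static void main(String[] args) throws Exception {
        // A hypothetical declaration for an object-store binding.
        Properties opts = new Properties();
        opts.load(new StringReader(
            "fs.contract.supports-atomic-rename=false\n"
          + "fs.contract.is-case-sensitive=true\n"));

        // A JUnit4 test method would call Assume.assumeTrue(...) here
        // instead of printing, so the case shows up as skipped.
        if (!featureEnabled(opts, "fs.contract.supports-atomic-rename")) {
            System.out.println("SKIP: rename tests (atomic rename not declared)");
        }
    }
}
```

This keeps the test suite common across filesystems while letting each binding opt out of semantics it cannot provide.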
[jira] [Updated] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-9361: --- Affects Version/s: (was: 2.2.0) Status: Patch Available (was: Open) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance - Key: HADOOP-9361 URL: https://issues.apache.org/jira/browse/HADOOP-9361 Project: Hadoop Common Issue Type: Improvement Components: fs, test Affects Versions: 2.4.0, 3.0.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, HADOOP-9361-013.patch {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while HDFS gets tested downstream, other filesystems, such as blobstore bindings, don't. The only tests that are common are those of {{FileSystemContractTestBase}}, which HADOOP-9258 shows is incomplete. I propose # writing more tests which clarify expected behavior # testing operations in the interface being in their own JUnit4 test classes, instead of one big test suite. # Having each FS declare via a properties file what behaviors they offer, such as atomic-rename, atomic-delete, umask, immediate-consistency -test methods can downgrade to skipped test cases if a feature is missing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-8065) discp should have an option to compress data while copying.
[ https://issues.apache.org/jira/browse/HADOOP-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014095#comment-14014095 ] Ken Krugler commented on HADOOP-8065: - I think this is reasonable functionality to add to distcp. For reference (based on user input) see what Amazon has added to their version of distcp (S3distcp): http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_s3distcp.html They support a --outputCodec codec parameter, to specify what compression to use. discp should have an option to compress data while copying. --- Key: HADOOP-8065 URL: https://issues.apache.org/jira/browse/HADOOP-8065 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.20.2 Reporter: Suresh Antony Priority: Minor Labels: distcp Fix For: 0.20.2 Attachments: patch.distcp.2012-02-10 We would like to compress the data while transferring it from our source system to the target system. One way to do this is to write a map/reduce job to compress it before/after the transfer, but that looks inefficient. Since distcp is already reading and writing the data, it would be better if it could compress while doing so. The flip side is that the distcp -update option cannot check file size before copying the data; it can only check for the existence of the file. So I propose that if the -compress option is given, the file size is not checked. Also, when we copy a file, an appropriate extension needs to be added to it depending on the compression type. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014179#comment-14014179 ] Andrew Wang commented on HADOOP-9361: - Hey Steve, this is a big patch but I'd like to help review. Do you have any kind of feedback in particular you're looking for? I don't have exposure to the various FileSystems besides local and HDFS, but I can at least look for potential compat issues (I see return types and exception types being changed), and also proofread the documentation you added. Running the Swift and S3 tests might be a bit hard too, so I'll just trust you that they work :) If you want to attack this piece by piece, I'll also wait for the patch split. Strictly define the expected behavior of filesystem APIs and write tests to verify compliance - Key: HADOOP-9361 URL: https://issues.apache.org/jira/browse/HADOOP-9361 Project: Hadoop Common Issue Type: Improvement Components: fs, test Affects Versions: 3.0.0, 2.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, HADOOP-9361-013.patch {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while HDFS gets tested downstream, other filesystems, such as blobstore bindings, don't. The only tests that are common are those of {{FileSystemContractTestBase}}, which HADOOP-9258 shows is incomplete. I propose # writing more tests which clarify expected behavior # testing operations in the interface being in their own JUnit4 test classes, instead of one big test suite. 
# Having each FS declare via a properties file what behaviors they offer, such as atomic-rename, atomic-delete, umask, immediate-consistency -test methods can downgrade to skipped test cases if a feature is missing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014184#comment-14014184 ] jay vyas commented on HADOOP-9361: -- Hi Andrew, I can help review it: it's an important part of the HCFS initiative. Possibly we could review it in person at Hadoop Summit if you are going, as I will be there for the week. Strictly define the expected behavior of filesystem APIs and write tests to verify compliance - Key: HADOOP-9361 URL: https://issues.apache.org/jira/browse/HADOOP-9361 Project: Hadoop Common Issue Type: Improvement Components: fs, test Affects Versions: 3.0.0, 2.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, HADOOP-9361-013.patch {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while HDFS gets tested downstream, other filesystems, such as blobstore bindings, don't. The only tests that are common are those of {{FileSystemContractTestBase}}, which HADOOP-9258 shows is incomplete. I propose # writing more tests which clarify expected behavior # testing operations in the interface being in their own JUnit4 test classes, instead of one big test suite. # Having each FS declare via a properties file what behaviors they offer, such as atomic-rename, atomic-delete, umask, immediate-consistency -test methods can downgrade to skipped test cases if a feature is missing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10611: Summary: KMS, keyVersion name should not be assumed to be keyName@versionNumber (was: KeyVersion name should not be assumed to be the 'key name @ the version number) KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur The KeyProvider public API should treat the keyVersion name as an opaque value. Same for the KMS client/server. Methods like {{KeyProvider#buildVersionName()}} and {{KeyProvider#getBaseName()}} should not be part of the {{KeyProvider}}. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10611: Description: Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. was: The KeyProvider public API should treat the keyVersion name as an opaque value. Same for the KMS client/server. Methods like {{KeyProvider#buildVersionName()}} and {{KeyProvider#getBaseName()}} should not be part of the {{KeyProvider}}. KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014193#comment-14014193 ] Andrew Wang commented on HADOOP-9361: - I wasn't planning on attending Summit, but my quick skim of the patch was that the bulk of it was testing and documentation, which I think I can get through without too much hand holding. I'd need to read through it anyway before we could talk about FS semantics, which is I think the real meat of this patch. Strictly define the expected behavior of filesystem APIs and write tests to verify compliance - Key: HADOOP-9361 URL: https://issues.apache.org/jira/browse/HADOOP-9361 Project: Hadoop Common Issue Type: Improvement Components: fs, test Affects Versions: 3.0.0, 2.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, HADOOP-9361-013.patch {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while HDFS gets tested downstream, other filesystems, such as blobstore bindings, don't. The only tests that are common are those of {{FileSystemContractTestBase}}, which HADOOP-9258 shows is incomplete. I propose # writing more tests which clarify expected behavior # testing operations in the interface being in their own JUnit4 test classes, instead of one big test suite. # Having each FS declare via a properties file what behaviors they offer, such as atomic-rename, atomic-delete, umask, immediate-consistency -test methods can downgrade to skipped test cases if a feature is missing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014198#comment-14014198 ] Alejandro Abdelnur commented on HADOOP-10611: - Changed the scope of this JIRA to the KMS classes. KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9361) Strictly define the expected behavior of filesystem APIs and write tests to verify compliance
[ https://issues.apache.org/jira/browse/HADOOP-9361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014199#comment-14014199 ] Hadoop QA commented on HADOOP-9361: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647652/HADOOP-9361-013.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 77 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1279 javac compiler warnings (more than the trunk's current 1278 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-tools/hadoop-openstack: org.apache.hadoop.hdfs.server.namenode.TestStartup {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3985//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/3985//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html Javac warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/3985//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3985//console This message is automatically generated. 
Strictly define the expected behavior of filesystem APIs and write tests to verify compliance - Key: HADOOP-9361 URL: https://issues.apache.org/jira/browse/HADOOP-9361 Project: Hadoop Common Issue Type: Improvement Components: fs, test Affects Versions: 3.0.0, 2.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: HADOOP-9361-001.patch, HADOOP-9361-002.patch, HADOOP-9361-003.patch, HADOOP-9361-004.patch, HADOOP-9361-005.patch, HADOOP-9361-006.patch, HADOOP-9361-007.patch, HADOOP-9361-008.patch, HADOOP-9361-009.patch, HADOOP-9361-011.patch, HADOOP-9361-012.patch, HADOOP-9361-013.patch {{FileSystem}} and {{FileContract}} aren't tested rigorously enough -while HDFS gets tested downstream, other filesystems, such as blobstore bindings, don't. The only tests that are common are those of {{FileSystemContractTestBase}}, which HADOOP-9258 shows is incomplete. I propose # writing more tests which clarify expected behavior # testing operations in the interface being in their own JUnit4 test classes, instead of one big test suite. # Having each FS declare via a properties file what behaviors they offer, such as atomic-rename, atomic-delete, umask, immediate-consistency -test methods can downgrade to skipped test cases if a feature is missing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10611: Status: Patch Available (was: Open) KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10611: Attachment: HADOOP-10611.patch KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10444) add pom.xml infrastructure for hadoop-native-core
[ https://issues.apache.org/jira/browse/HADOOP-10444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014214#comment-14014214 ] Colin Patrick McCabe commented on HADOOP-10444: --- I think the issue with HADOOP-9648 is that it needs someone with more familiarity with YARN (and MacOS) to look at it. I know there are a few YARN developers who use MacOS, so maybe start by asking them? add pom.xml infrastructure for hadoop-native-core - Key: HADOOP-10444 URL: https://issues.apache.org/jira/browse/HADOOP-10444 Project: Hadoop Common Issue Type: Sub-task Reporter: Colin Patrick McCabe Assignee: Binglin Chang Attachments: HADOOP-10444.v1.patch, HADOOP-10444.v2.patch Add pom.xml infrastructure for hadoop-native-core, so that it builds under Maven. We can look to how we integrated CMake into hadoop-hdfs-project and hadoop-common-project for inspiration here. In the long term, it would be nice to use a Maven plugin here (see HADOOP-8887) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9704) Write metrics sink plugin for Hadoop/Graphite
[ https://issues.apache.org/jira/browse/HADOOP-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Babak Behzad updated HADOOP-9704: - Attachment: (was: HADOOP-9704.patch) Write metrics sink plugin for Hadoop/Graphite - Key: HADOOP-9704 URL: https://issues.apache.org/jira/browse/HADOOP-9704 Project: Hadoop Common Issue Type: New Feature Affects Versions: 2.0.3-alpha Reporter: Chu Tong Attachments: 0001-HADOOP-9704.-Write-metrics-sink-plugin-for-Hadoop-Gr.patch, HADOOP-9704.patch, HADOOP-9704.patch, Hadoop-9704.patch Write a metrics sink plugin for Hadoop to send metrics directly to Graphite in addition to the current ganglia and file ones. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9704) Write metrics sink plugin for Hadoop/Graphite
[ https://issues.apache.org/jira/browse/HADOOP-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Babak Behzad updated HADOOP-9704: - Attachment: (was: Hadoop-9704.patch) Write metrics sink plugin for Hadoop/Graphite - Key: HADOOP-9704 URL: https://issues.apache.org/jira/browse/HADOOP-9704 Project: Hadoop Common Issue Type: New Feature Affects Versions: 2.0.3-alpha Reporter: Chu Tong Attachments: 0001-HADOOP-9704.-Write-metrics-sink-plugin-for-Hadoop-Gr.patch, HADOOP-9704.patch, HADOOP-9704.patch, HADOOP-9704.patch Write a metrics sink plugin for Hadoop to send metrics directly to Graphite in addition to the current ganglia and file ones. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9704) Write metrics sink plugin for Hadoop/Graphite
[ https://issues.apache.org/jira/browse/HADOOP-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Babak Behzad updated HADOOP-9704: - Attachment: HADOOP-9704.patch Write metrics sink plugin for Hadoop/Graphite - Key: HADOOP-9704 URL: https://issues.apache.org/jira/browse/HADOOP-9704 Project: Hadoop Common Issue Type: New Feature Affects Versions: 2.0.3-alpha Reporter: Chu Tong Attachments: 0001-HADOOP-9704.-Write-metrics-sink-plugin-for-Hadoop-Gr.patch, HADOOP-9704.patch, HADOOP-9704.patch, HADOOP-9704.patch Write a metrics sink plugin for Hadoop to send metrics directly to Graphite in addition to the current ganglia and file ones. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9902) Shell script rewrite
[ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-9902: - Attachment: HADOOP-9902.patch Here's the latest version of this patch against trunk (svn revision 1598750). There are still a few things I want to fix and I'm sure there are bugs floating around here and there. I'd appreciate any feedback on the patch thus far! Shell script rewrite Key: HADOOP-9902 URL: https://issues.apache.org/jira/browse/HADOOP-9902 Project: Hadoop Common Issue Type: Improvement Components: scripts Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt Umbrella JIRA for shell script rewrite. See more-info.txt for more details. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014229#comment-14014229 ] Andrew Wang commented on HADOOP-10611: -- I'd like it if we tackled this for the common classes too, for consistency. It'd be nice to do something like make {{buildVersionName}} abstract and provide the current impl as a static helper in e.g. {{KeyProviderUtils}} as Tucu proposed. I'm not sure exactly what method signature these third-party key providers need though. Is there a JIRA for this? +1 for the KMS side stuff though, this part at least looks good. KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014235#comment-14014235 ] Alejandro Abdelnur commented on HADOOP-10611: - [~andrew.wang], how about tackling that in a diff JIRA? KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014240#comment-14014240 ] Andrew Wang commented on HADOOP-10611: -- Yea totally, I think we're good on this one as soon as Jenkins comes back. If you can file a follow-on that'd be swell. KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014259#comment-14014259 ] Hadoop QA commented on HADOOP-10611: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647681/HADOOP-10611.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-common-project/hadoop-kms: org.apache.hadoop.crypto.key.kms.server.TestKMS {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3986//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3986//console This message is automatically generated. KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10342) Extend UserGroupInformation to return a UGI given a preauthenticated kerberos Subject
[ https://issues.apache.org/jira/browse/HADOOP-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014258#comment-14014258 ] Owen O'Malley commented on HADOOP-10342: This is good for branch-2. Extend UserGroupInformation to return a UGI given a preauthenticated kerberos Subject - Key: HADOOP-10342 URL: https://issues.apache.org/jira/browse/HADOOP-10342 Project: Hadoop Common Issue Type: Bug Components: security Reporter: Larry McCay Assignee: Larry McCay Attachments: 10342.branch-1.2.patch, 10342.branch-1.patch, 10342.branch-2.3.patch, 10342.branch-2.patch, 10342.patch We need the ability to use a Subject that was created inside an embedding application through a kerberos authentication. For example, an application that uses JAAS to authenticate to a KDC should be able to provide the resulting Subject and get a UGI instance to call doAs on. Example:
{code}
UserGroupInformation.setConfiguration(conf);
LoginContext context = new LoginContext("com.sun.security.jgss.login",
    new UserNamePasswordCallbackHandler(userName, password));
context.login();
Subject subject = context.getSubject();
final UserGroupInformation ugi2 = UserGroupInformation.getUGIFromSubject(subject);
ugi2.doAs(new PrivilegedExceptionAction<Object>() {
  @Override
  public Object run() throws Exception {
    final FileSystem fs = FileSystem.get(conf);
    int i = 0;
    for (FileStatus status : fs.listStatus(new Path("/user"))) {
      System.out.println(status.getPath());
      System.out.println(status);
      if (i++ > 10) {
        System.out.println("only first 10 showed...");
        break;
      }
    }
    return null;
  }
});
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HADOOP-9629: -- Attachment: HADOOP-9629.trunk.2.patch I'm re-uploading the same HADOOP-9629.trunk.2.patch file, just to trigger a Jenkins run. Support Windows Azure Storage - Blob as a file system in Hadoop --- Key: HADOOP-9629 URL: https://issues.apache.org/jira/browse/HADOOP-9629 Project: Hadoop Common Issue Type: Improvement Reporter: Mostafa Elhemali Assignee: Mike Liddell Attachments: HADOOP-9629 - Azure Filesystem - Information for developers.docx, HADOOP-9629 - Azure Filesystem - Information for developers.pdf, HADOOP-9629.2.patch, HADOOP-9629.3.patch, HADOOP-9629.patch, HADOOP-9629.trunk.1.patch, HADOOP-9629.trunk.2.patch h2. Description This JIRA incorporates adding a new file system implementation for accessing Windows Azure Storage - Blob from within Hadoop, such as using blobs as input to MR jobs or configuring MR jobs to put their output directly into blob storage. h2. High level design At a high level, the code here extends the FileSystem class to provide an implementation for accessing blob storage; the scheme wasb is used for accessing it over HTTP, and wasbs for accessing over HTTPS. We use the URI scheme: {code}wasb[s]://container@account/path/to/file{code} to address individual blobs. We use the standard Azure Java SDK (com.microsoft.windowsazure) to do most of the work. In order to map a hierarchical file system over the flat name-value pair nature of blob storage, we create a specially tagged blob named path/to/dir whenever we create a directory called path/to/dir, then files under that are stored as normal blobs path/to/dir/file. We have many metrics implemented for it using the Metrics2 interface. Tests are implemented mostly using a mock implementation for the Azure SDK functionality, with an option to test against a real blob storage if configured (instructions provided in README.txt). h2. 
Credits and history This has been ongoing work for a while, and an early version of this work can be seen in HADOOP-8079. This JIRA is a significant revision of that; we'll post the patch here for Hadoop trunk first, then post a patch for branch-1 as well for backporting the functionality if accepted. Credit for this work goes to the early team: [~minwei], [~davidlao], [~lengningliu] and [~stojanovic], as well as multiple people who have taken over this work since then (hope I don't forget anyone): [~dexterb], Johannes Klein, [~ivanmi], Michael Rys, [~mostafae], [~brian_swan], [~mikelid], [~xifang], and [~chuanliu]. h2. Test Besides unit tests, we have used WASB as the default file system in our service product. (HDFS is also used, but not as the default file system.) Various customer and test workloads have been run against clusters with such configurations for quite some time. The current version reflects the version of the code tested and used in our production environment. -- This message was sent by Atlassian JIRA (v6.2#6252)
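The wasb[s]://container@account/path/to/file addressing described in the issue maps directly onto standard URI components, which is what makes the scheme convenient to work with from Java. As a minimal JDK-only illustration (the container and account names below are made up, and this is not code from the HADOOP-9629 patch):

```java
import java.net.URI;

public class WasbUriDemo {
    public static void main(String[] args) {
        // wasb selects HTTP, wasbs selects HTTPS; names below are hypothetical.
        URI uri = URI.create("wasbs://mycontainer@myaccount/path/to/file");
        System.out.println("scheme    = " + uri.getScheme());   // transport (wasb or wasbs)
        System.out.println("container = " + uri.getUserInfo()); // blob container
        System.out.println("account   = " + uri.getHost());     // storage account
        System.out.println("blob path = " + uri.getPath());     // name of the blob
    }
}
```

Directory semantics then come from the convention in the description: creating path/to/dir writes a specially tagged blob of that name, and path/to/dir/file is simply another blob whose name shares that prefix.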
[jira] [Updated] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HADOOP-9629: -- Attachment: (was: HADOOP-9629.trunk.2.patch) Support Windows Azure Storage - Blob as a file system in Hadoop --- Key: HADOOP-9629 URL: https://issues.apache.org/jira/browse/HADOOP-9629 Project: Hadoop Common Issue Type: Improvement Reporter: Mostafa Elhemali Assignee: Mike Liddell Attachments: HADOOP-9629 - Azure Filesystem - Information for developers.docx, HADOOP-9629 - Azure Filesystem - Information for developers.pdf, HADOOP-9629.2.patch, HADOOP-9629.3.patch, HADOOP-9629.patch, HADOOP-9629.trunk.1.patch, HADOOP-9629.trunk.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-9902) Shell script rewrite
[ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-9902: - Attachment: (was: HADOOP-9902.patch) Shell script rewrite Key: HADOOP-9902 URL: https://issues.apache.org/jira/browse/HADOOP-9902 Project: Hadoop Common Issue Type: Improvement Components: scripts Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt Umbrella JIRA for shell script rewrite. See more-info.txt for more details. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HADOOP-10641) Introduce Coordination Engine
[ https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Plamen Jeliazkov reassigned HADOOP-10641: - Assignee: Plamen Jeliazkov Introduce Coordination Engine - Key: HADOOP-10641 URL: https://issues.apache.org/jira/browse/HADOOP-10641 Project: Hadoop Common Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Plamen Jeliazkov A Coordination Engine (CE) is a system that allows agreement on a sequence of events in a distributed system. To be reliable, the CE should itself be distributed. A Coordination Engine can be based on different algorithms (Paxos, Raft, 2PC, ZAB) and have different implementations, depending on use cases and on reliability, availability, and performance requirements. The CE should have a common API so that it can serve as a pluggable component in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and HBase (HBASE-10909). The first implementation is proposed to be based on ZooKeeper. -- This message was sent by Atlassian JIRA (v6.2#6252)
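The common, pluggable CE API the description calls for can be sketched roughly as below. The interface and class names are illustrative assumptions, not taken from any HADOOP-10641 patch; the single-node stand-in simply treats submission order as the agreed order, where a real engine would run ZooKeeper, Paxos, or Raft underneath:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hedged sketch of a pluggable Coordination Engine contract. */
interface CoordinationEngine<P> {
    /** Submit a proposal; the engine decides its position in the global order. */
    void submitProposal(P proposal);

    /** Register a learner invoked once per agreed proposal, in the agreed
     *  order, on every node. */
    void registerLearner(Consumer<P> learner);
}

/** Trivial single-node stand-in: agreement order equals submission order.
 *  A distributed implementation would delegate ordering to a consensus
 *  protocol instead of applying proposals immediately. */
class LocalCoordinationEngine<P> implements CoordinationEngine<P> {
    private final List<Consumer<P>> learners = new ArrayList<>();

    @Override
    public synchronized void submitProposal(P proposal) {
        for (Consumer<P> learner : learners) {
            learner.accept(proposal);
        }
    }

    @Override
    public synchronized void registerLearner(Consumer<P> learner) {
        learners.add(learner);
    }
}
```

A namespace like the HDFS NameNode would submit each mutation as a proposal and apply it only when the learner callback delivers it, so every replica applies the same sequence.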
[jira] [Updated] (HADOOP-9902) Shell script rewrite
[ https://issues.apache.org/jira/browse/HADOOP-9902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-9902: - Attachment: HADOOP-9902.patch Let's try this again: git rev ca2d0153bf3ec2f7f228bb1e68c0cadf4fb2d6c5 svn rev 1598764 Shell script rewrite Key: HADOOP-9902 URL: https://issues.apache.org/jira/browse/HADOOP-9902 Project: Hadoop Common Issue Type: Improvement Components: scripts Affects Versions: 3.0.0 Reporter: Allen Wittenauer Assignee: Allen Wittenauer Attachments: HADOOP-9902.patch, HADOOP-9902.txt, hadoop-9902-1.patch, more-info.txt Umbrella JIRA for shell script rewrite. See more-info.txt for more details. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014319#comment-14014319 ] Chris Nauroth commented on HADOOP-9629: --- I imported the patch into ReviewBoard: https://reviews.apache.org/r/22096/ Support Windows Azure Storage - Blob as a file system in Hadoop --- Key: HADOOP-9629 URL: https://issues.apache.org/jira/browse/HADOOP-9629 Project: Hadoop Common Issue Type: Improvement Reporter: Mostafa Elhemali Assignee: Mike Liddell -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014338#comment-14014338 ] Hadoop QA commented on HADOOP-9629: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647704/HADOOP-9629.trunk.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 25 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 100 warning messages. See https://builds.apache.org/job/PreCommit-HADOOP-Build/3987//artifact/trunk/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-azure hadoop-tools/hadoop-tools-dist. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3987//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3987//console This message is automatically generated. 
Support Windows Azure Storage - Blob as a file system in Hadoop --- Key: HADOOP-9629 URL: https://issues.apache.org/jira/browse/HADOOP-9629 Project: Hadoop Common Issue Type: Improvement Reporter: Mostafa Elhemali Assignee: Mike Liddell -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10645) TestKMS fails because race condition writing acl files
Alejandro Abdelnur created HADOOP-10645: --- Summary: TestKMS fails because race condition writing acl files Key: HADOOP-10645 URL: https://issues.apache.org/jira/browse/HADOOP-10645 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur The {{TestKMS#testACLs()}} test randomly fails because of a race condition while updating the hot-reloaded acls file. We should disable the background thread that does the reload and trigger it manually for the purposes of the test. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014340#comment-14014340 ] Alejandro Abdelnur commented on HADOOP-10611: - The failure is unrelated to this JIRA; it is caused by a race condition in the testcase while modifying a hot-reloadable config file. I opened HADOOP-10645 to fix it. KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10645) TestKMS fails because race condition writing acl files
[ https://issues.apache.org/jira/browse/HADOOP-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10645: Attachment: HADOOP-10645.patch Patch stopping the automatic reloader and instead running the reload manually in the testcase. TestKMS fails because race condition writing acl files -- Key: HADOOP-10645 URL: https://issues.apache.org/jira/browse/HADOOP-10645 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10645.patch The {{TestKMS#testACLs()}} test randomly fails because of a race condition while updating the hot-reloaded acls file. We should disable the background thread that does the reload and trigger it manually for the purposes of the test. -- This message was sent by Atlassian JIRA (v6.2#6252)
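The idea behind the patch — stop the background reloader and let the test trigger the reload step deterministically — can be sketched generically. This is an illustrative pattern only, not the actual TestKMS or KMS code; the class and method names are assumptions:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Illustrative sketch of the fix's idea: expose the reload step so a test
 *  can run it deterministically instead of racing a background polling
 *  thread that re-reads a config file it is concurrently rewriting. */
class HotReloadable<T> {
    private final AtomicReference<T> current = new AtomicReference<>();
    private final Supplier<T> loader;

    HotReloadable(Supplier<T> loader) {
        this.loader = loader;
        reload(); // initial load
    }

    /** In production a scheduled background thread calls this periodically;
     *  in the test, that thread is disabled and the test calls this directly
     *  after it finishes rewriting the file. */
    final void reload() {
        current.set(loader.get());
    }

    T get() {
        return current.get();
    }
}
```

With the background thread gone, the test's sequence is strictly write-then-reload-then-assert, so the half-written-file race disappears.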
[jira] [Updated] (HADOOP-10645) TestKMS fails because race condition writing acl files
[ https://issues.apache.org/jira/browse/HADOOP-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10645: Status: Patch Available (was: Open) TestKMS fails because race condition writing acl files -- Key: HADOOP-10645 URL: https://issues.apache.org/jira/browse/HADOOP-10645 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10645.patch The {{TestKMS#testACLs()}} test randomly fails because of a race condition while updating the hot-reloaded acls file. We should disable the background thread that does the reload and trigger it manually for the purposes of the test. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10645) TestKMS fails because race condition writing acl files
[ https://issues.apache.org/jira/browse/HADOOP-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014343#comment-14014343 ] Andrew Wang commented on HADOOP-10645: -- +1. Thanks tucu, nice find. TestKMS fails because race condition writing acl files -- Key: HADOOP-10645 URL: https://issues.apache.org/jira/browse/HADOOP-10645 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HADOOP-10645.patch The {{TestKMS#testACLs()}} test randomly fails because of a race condition while updating the hot-reloaded acls file. We should disable the background thread that does the reload and trigger it manually for the purposes of the test. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10645) TestKMS fails because race condition writing acl files
[ https://issues.apache.org/jira/browse/HADOOP-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10645: Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. TestKMS fails because race condition writing acl files -- Key: HADOOP-10645 URL: https://issues.apache.org/jira/browse/HADOOP-10645 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 3.0.0 Attachments: HADOOP-10645.patch The {{TestKMS#testACLs()}} test randomly fails because of a race condition while updating the hot-reloaded acls file. We should disable the background thread that does the reload and trigger it manually for the purposes of the test. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-10624) Fix some minors typo and add more test cases for hadoop_err
[ https://issues.apache.org/jira/browse/HADOOP-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe resolved HADOOP-10624. --- Resolution: Fixed Fix Version/s: HADOOP-10388 Committed, thanks! Fix some minors typo and add more test cases for hadoop_err --- Key: HADOOP-10624 URL: https://issues.apache.org/jira/browse/HADOOP-10624 Project: Hadoop Common Issue Type: Sub-task Affects Versions: HADOOP-10388 Reporter: Wenwu Peng Assignee: Wenwu Peng Fix For: HADOOP-10388 Attachments: HADOOP-10624-pnative.001.patch, HADOOP-10624-pnative.002.patch, HADOOP-10624-pnative.003.patch, HADOOP-10624-pnative.004.patch Changes: 1. Add more test cases to cover the methods hadoop_lerr_alloc and hadoop_uverr_alloc 2. Fix typos as follows: 1) Change hadoop_uverr_alloc(int cod to hadoop_uverr_alloc(int code in hadoop_err.h 2) Change OutOfMemory to OutOfMemoryException to be consistent with other Exceptions in hadoop_err.c 3) Change DBUG to DEBUG in messenger.c 4) Change DBUG to DEBUG in reactor.c -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10624) Fix some minor typos and add more test cases for hadoop_err
[ https://issues.apache.org/jira/browse/HADOOP-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HADOOP-10624: -- Summary: Fix some minor typos and add more test cases for hadoop_err (was: Fix some minors typo and add more test cases for hadoop_err) Fix some minor typos and add more test cases for hadoop_err --- Key: HADOOP-10624 URL: https://issues.apache.org/jira/browse/HADOOP-10624 Project: Hadoop Common Issue Type: Sub-task Affects Versions: HADOOP-10388 Reporter: Wenwu Peng Assignee: Wenwu Peng Fix For: HADOOP-10388 Attachments: HADOOP-10624-pnative.001.patch, HADOOP-10624-pnative.002.patch, HADOOP-10624-pnative.003.patch, HADOOP-10624-pnative.004.patch Changes: 1. Add more test cases to cover the methods hadoop_lerr_alloc and hadoop_uverr_alloc 2. Fix typos as follows: 1) Change hadoop_uverr_alloc(int cod to hadoop_uverr_alloc(int code in hadoop_err.h 2) Change OutOfMemory to OutOfMemoryException to be consistent with other Exceptions in hadoop_err.c 3) Change DBUG to DEBUG in messenger.c 4) Change DBUG to DEBUG in reactor.c -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-10637) Add snapshot and several dfsadmin tests into TestCLI
[ https://issues.apache.org/jira/browse/HADOOP-10637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dasha Boudnik resolved HADOOP-10637. Resolution: Duplicate Duplicate of HDFS-6297. Add snapshot and several dfsadmin tests into TestCLI Key: HADOOP-10637 URL: https://issues.apache.org/jira/browse/HADOOP-10637 Project: Hadoop Common Issue Type: Improvement Components: test Affects Versions: 3.0.0 Reporter: Dasha Boudnik Attachments: HADOOP-10637.patch, HADOOP-10637.patch Add the following commands to TestCLI: appendToFile, text, rmdir, rmdir with ignore-fail-on-non-empty, df, expunge, getmerge, allowSnapshot, disallowSnapshot, createSnapshot, renameSnapshot, deleteSnapshot, refreshUserToGroupsMappings, refreshSuperUserGroupsConfiguration, setQuota, clrQuota, setSpaceQuota, setBalancerBandwidth, finalizeUpgrade -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Work started] (HADOOP-10640) Implement Namenode RPCs in HDFS native client
[ https://issues.apache.org/jira/browse/HADOOP-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HADOOP-10640 started by Colin Patrick McCabe. Implement Namenode RPCs in HDFS native client - Key: HADOOP-10640 URL: https://issues.apache.org/jira/browse/HADOOP-10640 Project: Hadoop Common Issue Type: Sub-task Components: native Affects Versions: HADOOP-10388 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HADOOP-10640-pnative.001.patch Implement the parts of libhdfs that just involve making RPCs to the Namenode, such as mkdir, rename, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HADOOP-10646) KeyProvider buildVersion method should be moved to a utils class
Alejandro Abdelnur created HADOOP-10646: --- Summary: KeyProvider buildVersion method should be moved to a utils class Key: HADOOP-10646 URL: https://issues.apache.org/jira/browse/HADOOP-10646 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 3.0.0 The buildVersion() method should not be part of the KeyProvider public API because keyversions could be opaque (not built based on the keyname and key generation counter). KeyProvider implementations may choose to use buildVersion() for reasons such as described in HADOOP-10611. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10611: Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) committed to trunk. KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 3.0.0 Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10646) KeyProvider buildVersionName method should be moved to a utils class
[ https://issues.apache.org/jira/browse/HADOOP-10646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10646: Description: The buildVersionName() method should not be part of the KeyProvider public API because keyversions could be opaque (not built based on the keyname and key generation counter). KeyProvider implementations may choose to use buildVersionName() for reasons such as described in HADOOP-10611. was: The buildVersion() method should not be part of the KeyProvider public API because keyversions could be opaque (not built based on the keyname and key generation counter). KeyProvider implementations may choose to use buildVersion() for reasons such as described in HADOOP-10611. KeyProvider buildVersionName method should be moved to a utils class Key: HADOOP-10646 URL: https://issues.apache.org/jira/browse/HADOOP-10646 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 3.0.0 The buildVersionName() method should not be part of the KeyProvider public API because keyversions could be opaque (not built based on the keyname and key generation counter). KeyProvider implementations may choose to use buildVersionName() for reasons such as described in HADOOP-10611. -- This message was sent by Atlassian JIRA (v6.2#6252)
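For context, the helper in question is essentially a naming convention. A rough sketch of what the moved utility might look like follows; the class name KeyProviderUtils is an assumption, since the JIRA only says the method should leave the KeyProvider public API:

```java
/** Hypothetical utils class; the actual destination class in the
 *  HADOOP-10646 patch may differ. */
final class KeyProviderUtils {
    private KeyProviderUtils() {}

    /** Builds the conventional keyName@versionNumber form. Providers whose
     *  key versions are opaque simply never call this helper. */
    static String buildVersionName(String keyName, int version) {
        return keyName + "@" + version;
    }
}
```

Moving the helper out of KeyProvider makes the convention opt-in: implementations that want keyName@versionNumber call the utility, while the public API stays silent about key version structure.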
[jira] [Commented] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014348#comment-14014348 ] Alejandro Abdelnur commented on HADOOP-10611: - Created HADOOP-10646 to move buildVersionName() to a util class. KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 3.0.0 Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HADOOP-10646) KeyProvider buildVersionName method should be moved to a utils class
[ https://issues.apache.org/jira/browse/HADOOP-10646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HADOOP-10646: Summary: KeyProvider buildVersionName method should be moved to a utils class (was: KeyProvider buildVersion method should be moved to a utils class) KeyProvider buildVersionName method should be moved to a utils class Key: HADOOP-10646 URL: https://issues.apache.org/jira/browse/HADOOP-10646 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 3.0.0 The buildVersion() method should not be part of the KeyProvider public API because keyversions could be opaque (not built based on the keyname and key generation counter). KeyProvider implementations may choose to use buildVersion() for reasons such as described in HADOOP-10611. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10645) TestKMS fails because race condition writing acl files
[ https://issues.apache.org/jira/browse/HADOOP-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014359#comment-14014359 ] Hudson commented on HADOOP-10645: - SUCCESS: Integrated in Hadoop-trunk-Commit #5636 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5636/]) HADOOP-10645. TestKMS fails because race condition writing acl files. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598773) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/TestKMS.java TestKMS fails because race condition writing acl files -- Key: HADOOP-10645 URL: https://issues.apache.org/jira/browse/HADOOP-10645 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 3.0.0 Attachments: HADOOP-10645.patch The {{TestKMS#testACLs()}} test randomly fails because of a race condition while updating the hot-reloaded acls file. We should disable the background thread that does the reload and trigger it manually for the purposes of the test. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10611) KMS, keyVersion name should not be assumed to be keyName@versionNumber
[ https://issues.apache.org/jira/browse/HADOOP-10611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14014358#comment-14014358 ] Hudson commented on HADOOP-10611: - SUCCESS: Integrated in Hadoop-trunk-Commit #5636 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5636/]) HADOOP-10611. KMS, keyVersion name should not be assumed to be keyName@versionNumber. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1598775) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-kms/src/main/java/org/apache/hadoop/crypto/key/kms/server/KMSCacheKeyProvider.java * /hadoop/common/trunk/hadoop-common-project/hadoop-kms/src/test/java/org/apache/hadoop/crypto/key/kms/server/TestKMS.java KMS, keyVersion name should not be assumed to be keyName@versionNumber -- Key: HADOOP-10611 URL: https://issues.apache.org/jira/browse/HADOOP-10611 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 3.0.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 3.0.0 Attachments: HADOOP-10611.patch Some KMS classes are assuming the keyVersion is keyName@versionNumber. The keyVersion should be handled as an opaque value. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10444) add pom.xml infrastructure for hadoop-native-core
[ https://issues.apache.org/jira/browse/HADOOP-10444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014367#comment-14014367 ] Colin Patrick McCabe commented on HADOOP-10444: --- +1. Thanks, Binglin. add pom.xml infrastructure for hadoop-native-core - Key: HADOOP-10444 URL: https://issues.apache.org/jira/browse/HADOOP-10444 Project: Hadoop Common Issue Type: Sub-task Reporter: Colin Patrick McCabe Assignee: Binglin Chang Attachments: HADOOP-10444.v1.patch, HADOOP-10444.v2.patch Add pom.xml infrastructure for hadoop-native-core, so that it builds under Maven. We can look to how we integrated CMake into hadoop-hdfs-project and hadoop-common-project for inspiration here. In the long term, it would be nice to use a Maven plugin here (see HADOOP-8887) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HADOOP-10444) add pom.xml infrastructure for hadoop-native-core
[ https://issues.apache.org/jira/browse/HADOOP-10444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe resolved HADOOP-10444. --- Resolution: Fixed Fix Version/s: HADOOP-10388 Target Version/s: HADOOP-10388 add pom.xml infrastructure for hadoop-native-core - Key: HADOOP-10444 URL: https://issues.apache.org/jira/browse/HADOOP-10444 Project: Hadoop Common Issue Type: Sub-task Reporter: Colin Patrick McCabe Assignee: Binglin Chang Fix For: HADOOP-10388 Attachments: HADOOP-10444.v1.patch, HADOOP-10444.v2.patch Add pom.xml infrastructure for hadoop-native-core, so that it builds under Maven. We can look to how we integrated CMake into hadoop-hdfs-project and hadoop-common-project for inspiration here. In the long term, it would be nice to use a Maven plugin here (see HADOOP-8887) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-10624) Fix some minor typos and add more test cases for hadoop_err
[ https://issues.apache.org/jira/browse/HADOOP-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014380#comment-14014380 ] Wenwu Peng commented on HADOOP-10624: - Thanks! Colin Fix some minor typos and add more test cases for hadoop_err --- Key: HADOOP-10624 URL: https://issues.apache.org/jira/browse/HADOOP-10624 Project: Hadoop Common Issue Type: Sub-task Affects Versions: HADOOP-10388 Reporter: Wenwu Peng Assignee: Wenwu Peng Fix For: HADOOP-10388 Attachments: HADOOP-10624-pnative.001.patch, HADOOP-10624-pnative.002.patch, HADOOP-10624-pnative.003.patch, HADOOP-10624-pnative.004.patch Changes:
1. Add more test cases to cover the methods hadoop_lerr_alloc and hadoop_uverr_alloc
2. Fix typos as follows:
1) Change hadoop_uverr_alloc(int cod to hadoop_uverr_alloc(int code in hadoop_err.h
2) Change OutOfMemory to OutOfMemoryException to be consistent with other exceptions in hadoop_err.c
3) Change DBUG to DEBUG in messenger.c
4) Change DBUG to DEBUG in reactor.c
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HADOOP-9629) Support Windows Azure Storage - Blob as a file system in Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014377#comment-14014377 ] Chris Nauroth commented on HADOOP-9629: --- [~mikeliddell], and everyone involved, this is very nice work. Thank you all for the contribution. I've completed a pass through the whole patch, and overall it looks good. I ran the test suites successfully against an Azure storage account. Additionally, we have confidence in this code from additional system testing performed at Microsoft and Hortonworks during the past year (maybe longer). Please refer to the ReviewBoard link for feedback on a couple of points in the code. Let's try to address these points before committing. Additionally, we ask that new classes have appropriate annotations for {{InterfaceAudience}} and {{InterfaceStability}}. My suggestion on this is:
* {{NativeAzureFileSystem}} gets {{InterfaceAudience#Public}} and {{InterfaceStability#Stable}} (by extension of the fact that the base {{FileSystem}} class is categorized this way).
* {{Wasb}} gets {{InterfaceAudience#Public}} and {{InterfaceStability#Evolving}}. (Again, this falls from how its base class is categorized.)
* All other classes get {{InterfaceAudience#Private}}. You can skip the {{InterfaceStability}} annotation for these, because when the audience is private, it is assumed that there is no stability guarantee.
* These annotations are only necessary for main classes, not test classes.
Let me know whether or not you agree that classification makes sense. bq. -1 javadoc. The javadoc tool appears to have generated 100 warning messages. The patch actually introduced just 1 new JavaDoc warning, which I mentioned in the ReviewBoard feedback. For some reason, it flagged a bunch of unrelated things under hadoop-common, so maybe this is a bug in our automation.
Support Windows Azure Storage - Blob as a file system in Hadoop --- Key: HADOOP-9629 URL: https://issues.apache.org/jira/browse/HADOOP-9629 Project: Hadoop Common Issue Type: Improvement Reporter: Mostafa Elhemali Assignee: Mike Liddell Attachments: HADOOP-9629 - Azure Filesystem - Information for developers.docx, HADOOP-9629 - Azure Filesystem - Information for developers.pdf, HADOOP-9629.2.patch, HADOOP-9629.3.patch, HADOOP-9629.patch, HADOOP-9629.trunk.1.patch, HADOOP-9629.trunk.2.patch h2. Description This JIRA adds a new file system implementation for accessing Windows Azure Storage - Blob from within Hadoop, such as using blobs as input to MR jobs or configuring MR jobs to put their output directly into blob storage. h2. High level design At a high level, the code here extends the FileSystem class to provide an implementation for accessing blob storage; the scheme wasb is used for accessing it over HTTP, and wasbs for accessing over HTTPS. We use the URI scheme: {code}wasb[s]://container@account/path/to/file{code} to address individual blobs. We use the standard Azure Java SDK (com.microsoft.windowsazure) to do most of the work. In order to map a hierarchical file system over the flat name-value pair nature of blob storage, we create a specially tagged blob named path/to/dir whenever we create a directory called path/to/dir, then files under that are stored as normal blobs path/to/dir/file. We have many metrics implemented for it using the Metrics2 interface. Tests are implemented mostly using a mock implementation of the Azure SDK functionality, with an option to test against real blob storage if configured (instructions provided in README.txt). h2. Credits and history This has been ongoing work for a while, and the early version of this work can be seen in HADOOP-8079. 
This JIRA is a significant revision of that, and we'll post the patch here for Hadoop trunk first, then post a patch for branch-1 as well for backporting the functionality if accepted. Credit for this work goes to the early team: [~minwei], [~davidlao], [~lengningliu] and [~stojanovic], as well as multiple people who have taken over this work since then (hope I don't forget anyone): [~dexterb], Johannes Klein, [~ivanmi], Michael Rys, [~mostafae], [~brian_swan], [~mikelid], [~xifang], and [~chuanliu]. h2. Test Besides unit tests, we have used WASB as the default file system in our service product. (HDFS is also used, but not as the default file system.) Various customer and test workloads have been run against clusters with such configurations for quite some time. The current version reflects the version of the code tested and used in our production environment. -- This message was sent by Atlassian JIRA (v6.2#6252)
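The directory-marker design described in the HADOOP-9629 overview (a specially tagged blob named path/to/dir for each directory, with files stored as plain blobs under that prefix) can be sketched in miniature. {{BlobNamespace}} and the marker value are illustrative stand-ins; the real implementation tags the marker via blob metadata through the Azure SDK:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch only: mapping a hierarchical namespace onto a flat name-value
// store, the way wasb maps directories onto blob storage.
public class BlobNamespace {
    // Flat name -> value store standing in for a blob container.
    private final Map<String, String> blobs = new TreeMap<>();

    // Creating path/to/dir writes a specially tagged marker blob
    // under exactly that name.
    public void mkdir(String path) {
        blobs.put(path, "<directory-marker>");
    }

    // Files under the directory are ordinary blobs keyed by full path,
    // e.g. path/to/dir/file.
    public void createFile(String path, String contents) {
        blobs.put(path, contents);
    }

    // A name is a directory iff its blob carries the marker tag.
    public boolean isDirectory(String path) {
        return "<directory-marker>".equals(blobs.get(path));
    }

    public static void main(String[] args) {
        BlobNamespace ns = new BlobNamespace();
        ns.mkdir("path/to/dir");
        ns.createFile("path/to/dir/file", "data");
        System.out.println(ns.isDirectory("path/to/dir"));      // true
        System.out.println(ns.isDirectory("path/to/dir/file")); // false
    }
}
```

Because the store itself is flat, listing a directory reduces to a prefix scan over blob names, which is why the marker blob is needed to represent empty directories at all.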