[jira] [Updated] (HADOOP-15189) backport HADOOP-15039 to branch-2 and branch-3
[ https://issues.apache.org/jira/browse/HADOOP-15189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] SammiChen updated HADOOP-15189: --- Fix Version/s: 2.10.0 > backport HADOOP-15039 to branch-2 and branch-3 > -- > > Key: HADOOP-15189 > URL: https://issues.apache.org/jira/browse/HADOOP-15189 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Genmao Yu >Assignee: Genmao Yu >Priority: Blocker > Fix For: 2.10.0, 2.9.1, 3.0.1 > > Attachments: HADOOP-15189-branch-2.001.patch, > HADOOP-15189-branch-2.9.001.patch, HADOOP-15189-branch-3.0.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey reassigned HADOOP-15171: - Assignee: Lokesh Jain > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Assignee: Lokesh Jain >Priority: Blocker > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
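The workaround described in the report above — copying the compressed segment out of the direct buffer and inflating it on-heap with the JDK Inflater — can be sketched as follows. This is only a minimal illustration of that fallback path, not the Hive/ORC code itself; class, method and buffer names are hypothetical, and it assumes the decompressed output fits in the destination buffer.

{code:java}
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

public final class HeapInflateFallback {

  /**
   * Copy the compressed bytes out of a direct buffer and decompress them
   * with the JDK Inflater, writing the result into the destination buffer.
   * Returns the number of bytes produced (155 in the log above).
   */
  static int inflateViaHeap(ByteBuffer compressedDirect, ByteBuffer dest)
      throws DataFormatException {
    // "memcopy": pull the compressed segment (127 bytes in the report) on-heap
    byte[] compressed = new byte[compressedDirect.remaining()];
    compressedDirect.duplicate().get(compressed);

    Inflater inflater = new Inflater();
    try {
      inflater.setInput(compressed);
      // Assumes the inflated data fits into dest; a real implementation
      // would loop until inflater.finished().
      byte[] out = new byte[dest.remaining()];
      int produced = inflater.inflate(out);
      dest.put(out, 0, produced);
      return produced;
    } finally {
      inflater.end();
    }
  }

  private HeapInflateFallback() {
  }
}
{code}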
[jira] [Commented] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344497#comment-16344497 ] genericqa commented on HADOOP-15191: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 29s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 38s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 29s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 55s{color} | {color:orange} root: The patch generated 12 new + 39 unchanged - 1 fixed = 51 total (was 40) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 6m 47s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 25s{color} | {color:red} hadoop-tools_hadoop-aws generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 1s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 58s{color} | {color:green} hadoop-distcp in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 29s{color} | {color:green} hadoop-aws in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}164m 4s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HADOOP-15191 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908245/HADOOP-15191-002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 06bee037e2db 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Comment Edited] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344463#comment-16344463 ] Vishwajeet Dusane edited comment on HADOOP-15186 at 1/30/18 3:54 AM: - Thanks [~ste...@apache.org]. Is it possible to back port this CR to 3.0.0, 2.9 and 2.8 branch ? was (Author: vishwajeet.dusane): Thanks [~ste...@apache.org] > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Fix For: 3.0.1 > > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344463#comment-16344463 ] Vishwajeet Dusane commented on HADOOP-15186: Thanks [~ste...@apache.org] > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Fix For: 3.0.1 > > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15191: Status: Patch Available (was: Open) > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch, HADOOP-15191-002.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
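From the description quoted above, the proposed interface declares a page size and a bulk delete of a list of paths. A plausible shape, purely as a sketch — the actual patch may use different names, a different package, and different semantics for partial failures — would be:

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical shape of the Private/Unstable bulk-delete interface
 * described in the issue; illustration only.
 */
public interface BulkDelete {

  /** Maximum number of paths accepted by a single bulkDelete() call. */
  int getBulkDeleteFilesLimit();

  /**
   * Delete a list of files in as few store operations as possible,
   * for example one S3 multi-object DELETE request per page of keys.
   */
  void bulkDelete(List<Path> filesToDelete) throws IOException;
}
{code}

With a page size around the S3 multi-object-delete limit, DistCP's CopyCommitter could hand over its sorted list of missing files page by page instead of issuing one DELETE call per file, which is the 1/1000 reduction in REST calls described in the issue.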
[jira] [Comment Edited] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344385#comment-16344385 ] Steve Loughran edited comment on HADOOP-15191 at 1/30/18 2:00 AM: -- h2. Proposed * New interface {{org.apache.hadoop.fs.store.BulkIO}} * S3A to implement this, relaying to {{S3ABulkOperations}} * {{S3ABulkOperations}} to implement an optimised delete If you look at the cost of the delete(file), it's not just the DELETE call its: # getFileStatus(file) : HEAD, [HEAD], [LIST]. # DELETE # getFileStatus(file.parent) HEAD, HEAD, LIST. # if not found, PUT file.parent + "/" FWIW, we could maybe optimise that second getFileStatus in the assumption that there's no file or dir marker there; all you need to do is check for the LIST call returning 1+ entry. Anyway. you are looking at ~7 HTTP requests per delete. Optimising that directory creation is equally important. Now, we could just have the bulk IO operation say "outcome of empty directories is undefined". I'm happy with that, but it's more of a change to the observable outcome of a distcp call. New {{S3ABulkOperations.bulkDeleteFiles}} * No check for a file existing before delete * Issues a bulk delete with the configured page size * builds up a tree of parent paths, and only attempts to creates fake directories for the parent directories at the bottom of the tree. That is, if you delete the paths {code} /A/B.txt /A/C/D.txt /A/C/E.txt {code} Then the only directory to consider creating is /A/C/; after which you know that the parent /A path will have an entry, so doesn't need any work. The number of fake directory creation therefore goes from O(files) to O(leaves in directory tree). At best, Ω(1), at worst O(files). One caveat: we now create an empty dir even if the source file doesn't exist. h2. Testing I've made the page size configurable (fs.s3a.experimental.bulkdelete.pagesize). We can switch on the paged delete mode with a very small page size, and so check it works properly even for a small number of files. New unit test suite {{TestS3ABulkOperations}}, primarily checks tree logic for the directory creation process. New integration test suite {{ITestS3ABulkOperations}} performs bulk IO and sees what it does. The existing {{AbstractContractDistCpTest}} test extends its {{deepDirectoryStructureToRemote}} test to become {{deepDirectoryStructureToRemoteWithSync}}, doing an update with some files added, some removed, and assertions about the final state. This verifies that distcp is happy. I've also reviewed the logs to see that all is well there. h2. Alternate Design: publish summary and do it independently The other tactic for doing this would be to not integrate DistCP with the bulk delete, and instead have it publish the files of input & output for a followup reconciler. Good: * No changes to DistCP delete process * No need to add any explicit API/interface in hadoop-common Bad: * New visible option to distcp to save output * May lead to expectations of future maintenance of the option * and also a persistent format for the data You'd still need to add the bulk delete calls alongside the S3A Fs, and any other stores to which the bulk IO was also added (Wasb could save on directory setup, by the look of things, as would oss: and swift was (Author: ste...@apache.org): h2. 
Proposed * New interface {{org.apache.hadoop.fs.store.BulkIO}} * S3A to implement this, relaying to {{S3ABulkOperations}} * {{S3ABulkOperations}} to implement an optimised delete If you look at the cost of the delete(file), it's not just the DELETE call its: # getFileStatus(file) : HEAD, [HEAD], [LIST]. # DELETE # getFileStatus(file.parent) HEAD, HEAD, LIST. # if not found, PUT file.parent + "/" FWIW, we could maybe optimise that second getFileStatus in the assumption that there's no file or dir marker there; all you need to do is check for the LIST call returning 1+ entry. Anyway. you are looking at ~7 HTTP requests per delete. Optimising that directory creation is equally important. Now, we could just have the bulk IO operation say "outcome of empty directories is undefined". I'm happy with that, but it's more of a change to the observable outcome of a distcp call. New {{S3ABulkOperations.bulkDeleteFiles}} * No check for a file existing before delete * Issues a bulk delete with the configured page size * builds up a tree of parent paths, and only attempts to creates fake directories for the parent directories at the bottom of the tree. That is, if you delete the paths {code} /A/B.txt /A/C/D.txt /A/C/E.txt {code} Then the only directory to consider creating is /A/C/; after which you know that the parent /A path will have an entry, so doesn't need any work. The number of fake directory creation therefore goes from O(files) to O(leaves in directory tree). At best, Ω(1), at worst O(files). One caveat:
[jira] [Commented] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344385#comment-16344385 ] Steve Loughran commented on HADOOP-15191: - h2. Proposed * New interface {{org.apache.hadoop.fs.store.BulkIO}} * S3A to implement this, relaying to {{S3ABulkOperations}} * {{S3ABulkOperations}} to implement an optimised delete If you look at the cost of the delete(file), it's not just the DELETE call its: # getFileStatus(file) : HEAD, [HEAD], [LIST]. # DELETE # getFileStatus(file.parent) HEAD, HEAD, LIST. # if not found, PUT file.parent + "/" FWIW, we could maybe optimise that second getFileStatus in the assumption that there's no file or dir marker there; all you need to do is check for the LIST call returning 1+ entry. Anyway. you are looking at ~7 HTTP requests per delete. Optimising that directory creation is equally important. Now, we could just have the bulk IO operation say "outcome of empty directories is undefined". I'm happy with that, but it's more of a change to the observable outcome of a distcp call. New {{S3ABulkOperations.bulkDeleteFiles}} * No check for a file existing before delete * Issues a bulk delete with the configured page size * builds up a tree of parent paths, and only attempts to creates fake directories for the parent directories at the bottom of the tree. That is, if you delete the paths {code} /A/B.txt /A/C/D.txt /A/C/E.txt {code} Then the only directory to consider creating is /A/C/; after which you know that the parent /A path will have an entry, so doesn't need any work. The number of fake directory creation therefore goes from O(files) to O(leaves in directory tree). At best, Ω(1), at worst O(files). One caveat: we now create an empty dir even if the source file doesn't exist. h2. Testing I've made the page size configurable (fs.s3a.experimental.bulkdelete.pagesize). We can switch on the paged delete mode with a very small page size, and so check it works properly even for a small number of files. New unit test suite {{TestS3ABulkOperations}}, primarily checks tree logic for the directory creation process. New integration test suite {{ITestS3ABulkOperations}} performs bulk IO and sees what it does. The existing {{AbstractContractDistCpTest}} test extends its {{deepDirectoryStructureToRemote}} test to become {{deepDirectoryStructureToRemoteWithSync}}, doing an update with some files added, some removed, and assertions about the final state. This verifies that distcp is happy. I've also reviewed the logs to see that all is well there. h2. Alternate Design: publish summary and do it independently The other tactic for doing this would be to not integrate DistCP with the bulk delete, and instead have it publish the files of input & output for a followup reconciler. 
Good: * No changes to DistCP delete process * No need to add any explicit API/interface in hadoop-common Bad: * New visible option to distcp to save output * May lead to expectations of future maintenance of the option * and also a persistent format for the data You'd still need to add the bulk delete calls alongside the S3A Fs, and any other stores to which the bulk IO was also added (Wasb could save on directory setup, by the look of things, as would oss: and swift:) > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch, HADOOP-15191-002.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
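The directory-recreation optimisation in the comment above boils down to: collect the parent of every deleted file, then keep only those parents that are not ancestors of another parent. A minimal sketch of that leaf computation follows; the names are illustrative and the real S3ABulkOperations logic may differ.

{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.hadoop.fs.Path;

public final class LeafParentDirs {

  /**
   * Given the files being deleted, return only those parent directories
   * which are not ancestors of another parent: the "leaves" of the tree.
   * Deleting /A/B.txt, /A/C/D.txt and /A/C/E.txt yields just /A/C.
   */
  static Set<Path> leafParents(List<Path> deletedFiles) {
    Set<Path> parents = new HashSet<>();
    for (Path file : deletedFiles) {
      Path parent = file.getParent();
      if (parent != null) {
        parents.add(parent);
      }
    }
    Set<Path> leaves = new HashSet<>();
    for (Path candidate : parents) {
      boolean isAncestorOfAnother = false;
      for (Path other : parents) {
        if (!other.equals(candidate) && isAncestor(candidate, other)) {
          isAncestorOfAnother = true;
          break;
        }
      }
      if (!isAncestorOfAnother) {
        leaves.add(candidate);
      }
    }
    return leaves;
  }

  /** True if ancestor appears on the path from descendant up to the root. */
  private static boolean isAncestor(Path ancestor, Path descendant) {
    for (Path p = descendant.getParent(); p != null; p = p.getParent()) {
      if (p.equals(ancestor)) {
        return true;
      }
    }
    return false;
  }

  private LeafParentDirs() {
  }
}
{code}

For the example in the comment, the three deleted paths reduce to the single leaf /A/C, so only one fake directory marker ever needs to be considered; /A is covered implicitly because it has /A/C beneath it.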
[jira] [Commented] (HADOOP-15168) Add kdiag tool to hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344383#comment-16344383 ] genericqa commented on HADOOP-15168: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 5m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 1s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 31s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} shellcheck {color} | {color:red} 0m 55s{color} | {color:red} The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 8s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 51s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 2s{color} | {color:green} hadoop-yarn in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 65m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HADOOP-15168 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908239/HADOOP-15168.02.patch | | Optional Tests | asflicense mvnsite unit shellcheck shelldocs | | uname | Linux f64e3bcd0bab 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fde95d4 | | maven | version: Apache Maven 3.3.9 | | shellcheck | v0.4.6 | | shellcheck | https://builds.apache.org/job/PreCommit-HADOOP-Build/14046/artifact/out/diff-patch-shellcheck.txt | | whitespace | https://builds.apache.org/job/PreCommit-HADOOP-Build/14046/artifact/out/whitespace-eol.txt | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/14046/testReport/ | | Max. process+thread count | 439 (vs. ulimit of 5000) | | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-yarn-project/hadoop-yarn U: . | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/14046/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add kdiag tool to hadoop command > > > Key: HADOOP-15168 > URL:
[jira] [Updated] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15191: Attachment: HADOOP-15191-002.patch > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch, HADOOP-15191-002.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15191: Status: Open (was: Patch Available) > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch, HADOOP-15191-002.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344320#comment-16344320 ] genericqa commented on HADOOP-12897: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 24s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-common-project/hadoop-auth: The patch generated 2 new + 40 unchanged - 1 fixed = 42 total (was 41) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 35s{color} | {color:green} hadoop-auth in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 86m 14s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HADOOP-12897 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908225/HADOOP-12897.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 992137aff30d 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fde95d4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/14045/artifact/out/diff-checkstyle-hadoop-common-project_hadoop-auth.txt | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/14045/testReport/ | | Max. process+thread count | 340 (vs. ulimit of 5000) | | modules | C: hadoop-common-project/hadoop-auth U: hadoop-common-project/hadoop-auth | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/14045/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT
[jira] [Commented] (HADOOP-15168) Add kdiag tool to hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344300#comment-16344300 ] Bharat Viswanadham commented on HADOOP-15168: - Hi [~hanishakoneru] Thank you for review. Addressed review comments in v02 patch > Add kdiag tool to hadoop command > > > Key: HADOOP-15168 > URL: https://issues.apache.org/jira/browse/HADOOP-15168 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Minor > Attachments: HADOOP-15168.00.patch, HADOOP-15168.01.patch, > HADOOP-15168.02.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15168) Add kdiag tool to hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HADOOP-15168: Attachment: HADOOP-15168.02.patch > Add kdiag tool to hadoop command > > > Key: HADOOP-15168 > URL: https://issues.apache.org/jira/browse/HADOOP-15168 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Minor > Attachments: HADOOP-15168.00.patch, HADOOP-15168.01.patch, > HADOOP-15168.02.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344268#comment-16344268 ] Steve Loughran commented on HADOOP-15171: - There was another JIRA on this wasn't there? Sergei, can you find it? > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Blocker > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344204#comment-16344204 ] Gopal V commented on HADOOP-15171: -- bq. this is becoming a pain This is a huge perf hit right now, the workaround is much slower than the original codepath. > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Blocker > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15140) S3guard mistakes root URI without / as non-absolute path
[ https://issues.apache.org/jira/browse/HADOOP-15140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344201#comment-16344201 ] Abraham Fine commented on HADOOP-15140: --- [~ste...@apache.org] I went ahead and added two tests to {{AbstractContractGetFileStatusTest}}: {code:java} @Test public void testGetFileStatusRootURI() throws Throwable { String fileSystemURI = getFileSystem().getUri().toString(); assertTrue("uri should not end with '/': " + fileSystemURI, fileSystemURI.endsWith("//") || !fileSystemURI.endsWith("/")); ContractTestUtils.assertIsDirectory( getFileSystem().getFileStatus(new Path(fileSystemURI))); } @Test public void testGetFileStatusRootFromChild() throws Throwable { ContractTestUtils.assertIsDirectory( getFileSystem().getFileStatus(new Path("/dir").getParent())); } {code} These tests ran against S3, Azure, Azure Data Lake, and HDFS using: {{mvn test -Dtest="**/*ContractGetFileStatus*" -DS3guard -fae -Dmaven.test.failure.ignore=true}} {{testGetFileStatusRootFromChild}} never appears to fail. {{testGetFileStatusRootURI}} does on HDFS, S3, and Azure Data Lake (Azure native passes). Local file systems also pass. Here are the failures and their corresponding stack traces: {code:java} [INFO] Running org.apache.hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus [ERROR] Tests run: 20, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.344 s <<< FAILURE! - in org.apache.hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus [ERROR] testGetFileStatusRootURI(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus) Time elapsed: 0.02 s <<< ERROR! java.lang.IllegalArgumentException: Pathname from hdfs://localhost:63826 is not a valid DFS filename. at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:242) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1568) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1565) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1580) at org.apache.hadoop.fs.contract.AbstractContractGetFileStatusTest.testGetFileStatusRootURI(AbstractContractGetFileStatusTest.java:86) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) [INFO] Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus [ERROR] Tests run: 20, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 113.029 s <<< FAILURE! 
- in org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus [ERROR] testGetFileStatusRootURI(org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus) Time elapsed: 1.27 s <<< ERROR! java.lang.IllegalArgumentException: path must be absolute at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.(PathMetadata.java:68) at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.(PathMetadata.java:60) at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.(PathMetadata.java:56) at org.apache.hadoop.fs.s3a.s3guard.S3Guard.putAndReturn(S3Guard.java:149) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2130) at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2070) at org.apache.hadoop.fs.contract.AbstractContractGetFileStatusTest.testGetFileStatusRootURI(AbstractContractGetFileStatusTest.java:86) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at
[jira] [Updated] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HADOOP-15171: -- Priority: Blocker (was: Critical) > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Blocker > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344184#comment-16344184 ] Sergey Shelukhin commented on HADOOP-15171: --- [~ste...@apache.org] [~jnp] is it possible to get some traction on this actually? We now also have to work around this in ORC project, and this is becoming a pain > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Critical > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HADOOP-15171: -- Fix Version/s: 3.0.1 3.1.0 > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Critical > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HADOOP-12897: Attachment: HADOOP-12897.004.patch > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch, HADOOP-12897.004.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HADOOP-12897: Attachment: (was: HADOOP-12897.004.patch) > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch, HADOOP-12897.004.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HADOOP-12897: Attachment: HADOOP-12897.004.patch > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch, HADOOP-12897.004.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344165#comment-16344165 ] Ajay Kumar commented on HADOOP-12897: - Patch v4 adds log message at DEBUG level to address [~ste...@apache.org] suggestion on {{wrapExceptionWithMessage}}. > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
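Since NetUtils.wrapException lives outside hadoop-auth, the helper referred to above presumably rebuilds the IOException so the target URL appears in its message, with the original failure also logged at DEBUG. A minimal sketch of such a wrapper, assuming nothing about the actual patch beyond the name mentioned in the comment:

{code:java}
import java.io.IOException;
import java.net.URL;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class AuthExceptionUtil {

  private static final Logger LOG =
      LoggerFactory.getLogger(AuthExceptionUtil.class);

  /**
   * Wrap an IOException so the failing URL is part of the message,
   * logging the original failure at DEBUG level as well.
   */
  static IOException wrapExceptionWithMessage(URL url, IOException e) {
    LOG.debug("Connection to {} failed", url, e);
    return new IOException(
        "Error connecting to " + url + ": " + e.getMessage(), e);
  }

  private AuthExceptionUtil() {
  }
}
{code}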
[jira] [Comment Edited] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344139#comment-16344139 ] Ajay Kumar edited comment on HADOOP-12897 at 1/29/18 10:46 PM: --- [~arpitagarwal], thanks for suggestion. Using multi-catch in this case gives another compile time error which basically expects throws clause in function to be Exception. Seems like a bug in JDK. was (Author: ajayydv): [~arpitagarwal], thanks for suggestion. Using multi-catch in this case gives another compile time error which basically expects throws clause in function to Exception. Seems like a bug in JDK. > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344139#comment-16344139 ] Ajay Kumar commented on HADOOP-12897: - [~arpitagarwal], thanks for suggestion. Using multi-catch in this case gives another compile time error which basically expects throws clause in function to Exception. Seems like a bug in JDK. > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344136#comment-16344136 ] Steve Loughran commented on HADOOP-15191: - The patch I'm working on now (bigger, passing tests) doesn't contain any attempts to recover from partially failed deletes. That's a more complex issue which needs to be implemented and tested more broadly, and is only relevant when you are mixing permissions down a tree. As S3A doesn't yet even handle delete(file) properly there, this new operation isn't making things worse. > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
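As a sketch of the interface being proposed, the method names below follow the issue description ("declares a page size and offers a bulkDelete(List) operation"); the package, javadoc, and exact types are assumptions rather than what the attached patch defines.

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical shape of the private/unstable bulk delete API: a store
 * declares its page size and accepts a list of paths to delete, issuing
 * (at most) one store request per page instead of one per file.
 */
public interface BulkDelete {

  /** Maximum number of paths the store accepts in a single bulk delete call. */
  int getBulkDeletePageSize();

  /** Delete the given paths; callers should keep the list within the page size. */
  void bulkDelete(List<Path> pathsToDelete) throws IOException;
}
{code}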
[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344105#comment-16344105 ] Xiao Chen commented on HADOOP-14445: Thanks [~daryn] for circling back with the new idea. Mixed feeling (and head scratching)! :) I think a new and standardized token kind should work, and conveniently eliminate the need for changing client configs, so SGTM. We may also check in the RM, when its {{DelegationTokenRenewer}} received a set of tokens, and there are both kms-dt and KMS_D_T with the same sequence number, only renew the KMS_D_T. For that to work, we'd need a new {{KMSDelegationTokenIdentifier}} class and a new {{DelegationTokenAuthenticationHandler}} too. Curious: with the current approach (patch 3) we need just an additional config deployment after the upgrade, right? What changed your mind from [earlier|https://issues.apache.org/jira/browse/HADOOP-14445?focusedCommentId=16279134=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16279134] (assuming the implementation comments are addressed) ? I'd rather prefer not to sacrifice old RM + new client. True for RU should still work, but there is still support burden for a new client connecting to an existing cluster. Token issues are not the easiest to figure out, and IMO we should avoid this case when we can. > Delegation tokens are not shared between KMS instances > -- > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: kms >Affects Versions: 2.8.0, 3.0.0-alpha1 > Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah >Priority: Major > Attachments: HADOOP-14445-branch-2.8.002.patch, > HADOOP-14445-branch-2.8.patch, HADOOP-14445.002.patch, HADOOP-14445.003.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(), > url.getPort()); > Text service = SecurityUtil.buildTokenService(serviceAddr); > dToken = creds.getToken(service); > {code} > But KMS doc states: > {quote} > Delegation Tokens > Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation > tokens too. > Under HA, A KMS instance must verify the delegation token given by another > KMS instance, by checking the shared secret used to sign the delegation > token. To do this, all KMS instances must be able to retrieve the shared > secret from ZooKeeper. > {quote} > We should either update the KMS documentation, or fix this code to share > delegation tokens. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
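To make the "two kinds of the same token" compatibility idea concrete, here is a hedged sketch of registering one KMS delegation token under both the legacy host:port service and a new URI-based service and kind. The kind string and service layout are assumptions taken from the discussion, not what the eventual patch settled on.

{code:java}
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

// Hedged sketch: duplicate one KMS token under the old kind/service (kms-dt
// keyed by host:port) and a new kind/service (keyed by the provider URI), so
// both old and new clients can look it up from the same Credentials.
public class KmsTokenCompat {
  public static <T extends TokenIdentifier> void addCompatTokens(
      Credentials creds, Token<T> kmsToken,
      String legacyHostPort, String providerUri) {
    // Old clients look the token up by host:port.
    creds.addToken(new Text(legacyHostPort), kmsToken);

    // New clients look it up by the provider URI, under the new kind.
    Token<T> copy = new Token<>(kmsToken);           // same identifier and secret
    copy.setKind(new Text("KMS_DELEGATION_TOKEN"));  // assumed new kind name
    copy.setService(new Text(providerUri));
    creds.addToken(new Text(providerUri), copy);
  }
}
{code}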
[jira] [Commented] (HADOOP-15129) Datanode caches namenode DNS lookup failure and cannot startup
[ https://issues.apache.org/jira/browse/HADOOP-15129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344037#comment-16344037 ] Ajay Kumar commented on HADOOP-15129: - {quote}local host is: (unknown); {quote} Can we pass "localhost" to {{NetUtils.wrapException}} instead of null? The above message in the logs is a little misleading. > Datanode caches namenode DNS lookup failure and cannot startup > -- > > Key: HADOOP-15129 > URL: https://issues.apache.org/jira/browse/HADOOP-15129 > Project: Hadoop Common > Issue Type: Bug > Components: ipc >Affects Versions: 2.8.2 > Environment: Google Compute Engine. > I'm using Java 8, Debian 8, Hadoop 2.8.2. >Reporter: Karthik Palaniappan >Assignee: Karthik Palaniappan >Priority: Minor > Attachments: HADOOP-15129.001.patch, HADOOP-15129.002.patch > > > On startup, the Datanode creates an InetSocketAddress to register with each > namenode. Though there are retries on connection failure throughout the > stack, the same InetSocketAddress is reused. > InetSocketAddress is an interesting class, because it resolves DNS names to > IP addresses on construction, and it is never refreshed. Hadoop re-creates an > InetSocketAddress in some cases just in case the remote IP has changed for a > particular DNS name: https://issues.apache.org/jira/browse/HADOOP-7472. > Anyway, on startup, you can see the Datanode log: "Namenode...remains > unresolved" -- referring to the fact that DNS lookup failed. > {code:java} > 2017-11-02 16:01:55,115 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Refresh request received for nameservices: null > 2017-11-02 16:01:55,153 WARN org.apache.hadoop.hdfs.DFSUtilClient: Namenode > for null remains unresolved for ID null. Check your hdfs-site.xml file to > ensure namenodes are configured properly. > 2017-11-02 16:01:55,156 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Starting BPOfferServices for nameservices: > 2017-11-02 16:01:55,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Block pool (Datanode Uuid unassigned) service to > cluster-32f5-m:8020 starting to offer service > {code} > The Datanode then proceeds to use this unresolved address, as it may work if > the DN is configured to use a proxy. Since I'm not using a proxy, it forever > prints out this message: > {code:java} > 2017-12-15 00:13:40,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:45,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:50,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:55,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:14:00,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > {code} > Unfortunately, the log doesn't contain the exception that triggered it, but > the culprit is actually in IPC Client: > https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L444. > This line was introduced in https://issues.apache.org/jira/browse/HADOOP-487 > to give a clear error message when somebody misspells an address. > However, the fix in HADOOP-7472 doesn't apply here, because that code happens > in Client#getConnection after the Connection is constructed. 
> My proposed fix (will attach a patch) is to move this exception out of the > constructor and into a place that will trigger HADOOP-7472's logic to > re-resolve addresses. If the DNS failure was temporary, this will allow the > connection to succeed. If not, the connection will fail after ipc client > retries (default 10 seconds worth of retries). > I want to fix this in ipc client rather than just in Datanode startup, as > this fixes temporary DNS issues for all of Hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
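A minimal sketch of the re-resolution idea referenced above (HADOOP-7472): if the cached address was created while DNS lookup was failing, construct a fresh InetSocketAddress so the lookup is retried. Exactly where this hooks into the IPC client is what the attached patches work out; this is only an illustration.

{code:java}
import java.net.InetSocketAddress;

// Hedged sketch: constructing a new InetSocketAddress triggers a fresh DNS
// lookup, so an address that resolved to "unresolved" can recover later.
public static InetSocketAddress reResolveIfUnresolved(InetSocketAddress addr) {
  if (addr.isUnresolved()) {
    return new InetSocketAddress(addr.getHostName(), addr.getPort());
  }
  return addr;
}
{code}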
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344003#comment-16344003 ] genericqa commented on HADOOP-15170: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 53s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 8m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 52s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 75m 5s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HADOOP-15170 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908190/HADOOP-15170.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e7438a831884 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 7fd287b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/14044/testReport/ | | Max. process+thread count | 1430 (vs. ulimit of 5000) | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/14044/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 >
[jira] [Commented] (HADOOP-15006) Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS
[ https://issues.apache.org/jira/browse/HADOOP-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343975#comment-16343975 ] Steve Moist commented on HADOOP-15006: -- >what's your proposal for letting the client encryption be an optional feature, >with key? Config: if s3a.client.encryption.enabled=true, then check for a BEZ; if one exists, encrypt objects, else no encryption for the bucket. Or enable it if the BEZI provider is configured as well, rather than just the flag. >Is the file length as returned in listings 100% consistent with the amount of >data you get to read? Yes. >I'm not going to touch this right now as its at the too raw stage That's why I submitted it, for you and everyone else to play with to evaluate if this is something that we should move forward with. If needed I can go fix the broken S3Guard/Committer/byte comparison tests and have yetus pass it, but the actual code is going to be about the same. > Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS > --- > > Key: HADOOP-15006 > URL: https://issues.apache.org/jira/browse/HADOOP-15006 > Project: Hadoop Common > Issue Type: New Feature > Components: fs/s3, kms >Reporter: Steve Moist >Priority: Minor > Attachments: S3-CSE Proposal.pdf, s3-cse-poc.patch > > > This is for the proposal to introduce Client Side Encryption to S3 in such a > way that it can leverage HDFS transparent encryption, use the Hadoop KMS to > manage keys, use the `hdfs crypto` command line tools to manage encryption > zones in the cloud, and enable distcp to copy from HDFS to S3 (and > vice-versa) with data still encrypted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
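A small sketch of the gating logic described in the first answer above. The property name is taken verbatim from the comment, and the encryption-zone check is reduced to a boolean parameter; both are assumptions about a proposal, not a final implementation.

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hedged sketch: client-side encryption applies only when the flag is set
// and an encryption zone (BEZ) covers the bucket; otherwise write plaintext.
public static boolean shouldEncryptClientSide(Configuration conf,
    boolean bucketHasEncryptionZone) {
  return conf.getBoolean("s3a.client.encryption.enabled", false)
      && bucketHasEncryptionZone;
}
{code}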
[jira] [Commented] (HADOOP-15129) Datanode caches namenode DNS lookup failure and cannot startup
[ https://issues.apache.org/jira/browse/HADOOP-15129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343948#comment-16343948 ] Arpit Agarwal commented on HADOOP-15129: I haven't looked at the test cases yet but the change looks fine to me. Will review the new tests. [~kihwal], do you have any thoughts since you added the original re-resolution logic? > Datanode caches namenode DNS lookup failure and cannot startup > -- > > Key: HADOOP-15129 > URL: https://issues.apache.org/jira/browse/HADOOP-15129 > Project: Hadoop Common > Issue Type: Bug > Components: ipc >Affects Versions: 2.8.2 > Environment: Google Compute Engine. > I'm using Java 8, Debian 8, Hadoop 2.8.2. >Reporter: Karthik Palaniappan >Assignee: Karthik Palaniappan >Priority: Minor > Attachments: HADOOP-15129.001.patch, HADOOP-15129.002.patch > > > On startup, the Datanode creates an InetSocketAddress to register with each > namenode. Though there are retries on connection failure throughout the > stack, the same InetSocketAddress is reused. > InetSocketAddress is an interesting class, because it resolves DNS names to > IP addresses on construction, and it is never refreshed. Hadoop re-creates an > InetSocketAddress in some cases just in case the remote IP has changed for a > particular DNS name: https://issues.apache.org/jira/browse/HADOOP-7472. > Anyway, on startup, you can see the Datanode log: "Namenode...remains > unresolved" -- referring to the fact that DNS lookup failed. > {code:java} > 2017-11-02 16:01:55,115 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Refresh request received for nameservices: null > 2017-11-02 16:01:55,153 WARN org.apache.hadoop.hdfs.DFSUtilClient: Namenode > for null remains unresolved for ID null. Check your hdfs-site.xml file to > ensure namenodes are configured properly. > 2017-11-02 16:01:55,156 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Starting BPOfferServices for nameservices: > 2017-11-02 16:01:55,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Block pool (Datanode Uuid unassigned) service to > cluster-32f5-m:8020 starting to offer service > {code} > The Datanode then proceeds to use this unresolved address, as it may work if > the DN is configured to use a proxy. Since I'm not using a proxy, it forever > prints out this message: > {code:java} > 2017-12-15 00:13:40,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:45,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:50,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:55,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:14:00,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > {code} > Unfortunately, the log doesn't contain the exception that triggered it, but > the culprit is actually in IPC Client: > https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L444. > This line was introduced in https://issues.apache.org/jira/browse/HADOOP-487 > to give a clear error message when somebody misspells an address. > However, the fix in HADOOP-7472 doesn't apply here, because that code happens > in Client#getConnection after the Connection is constructed. 
> My proposed fix (will attach a patch) is to move this exception out of the > constructor and into a place that will trigger HADOOP-7472's logic to > re-resolve addresses. If the DNS failure was temporary, this will allow the > connection to succeed. If not, the connection will fail after ipc client > retries (default 10 seconds worth of retries). > I want to fix this in ipc client rather than just in Datanode startup, as > this fixes temporary DNS issues for all of Hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343884#comment-16343884 ] Arpit Agarwal commented on HADOOP-12897: Minor comment. You can simplify the following code: {code} } catch (IOException ex) { throw wrapExceptionWithMessage(ex, "Error while authenticating with endpoint: " + url); } catch (AuthenticationException ex) { throw wrapExceptionWithMessage(ex, "Error while authenticating with endpoint: " + url); } {code} as follows: {code} } catch (IOException | AuthenticationException ex) { throw wrapExceptionWithMessage(ex, "Error while authenticating with endpoint: " + url); } {code} > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15168) Add kdiag tool to hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343880#comment-16343880 ] Hanisha Koneru commented on HADOOP-15168: - {quote}I think, the commands which are related to hdfs, we add in hdfs script, similar for yarn. Or do we add to all scripts in general? {quote} No, if you are adding, you should add it in {{hdfs}} and {{yarn}} scripts. I am not sure if it is required as they do not have other kerberos related commands (such as {{key}} and {{kerbname}}). I meant to say we should change the following lines in {{SecureMode.md}} to reflect the changes introduced by this Jira. {code:java} The `KDiag` command has its own entry point; it is currently not hooked up to the end-user CLI. It is invoked simply by passing its full classname to one of the `bin/hadoop`, `bin/hdfs` or `bin/yarn` commands. Accordingly, it will display the kerberos client state of the command used to invoke it. ``` hadoop org.apache.hadoop.security.KDiag hdfs org.apache.hadoop.security.KDiag yarn org.apache.hadoop.security.KDiag {code} > Add kdiag tool to hadoop command > > > Key: HADOOP-15168 > URL: https://issues.apache.org/jira/browse/HADOOP-15168 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Minor > Attachments: HADOOP-15168.00.patch, HADOOP-15168.01.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343828#comment-16343828 ] Ajay Kumar commented on HADOOP-15170: - [~jlowe], thanks for review. Updated patch v3 with suggested changes. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
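As an illustration of the approach the issue describes — mapping a tar symlink entry onto java.nio's createSymbolicLink — a hedged sketch using the commons-compress entry accessors; the attached patches may handle relative targets, overwrites, and error reporting differently.

{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;

// Hedged sketch: when the Java untar path meets a symlink entry, create the
// link with java.nio instead of skipping it.
public static void maybeCreateSymlink(File outputDir, TarArchiveEntry entry)
    throws IOException {
  if (entry.isSymbolicLink()) {
    Files.createSymbolicLink(
        Paths.get(outputDir.getPath(), entry.getName()), // link to create
        Paths.get(entry.getLinkName()));                 // target recorded in the archive
  }
  // Regular files and directories are handled by the existing extraction code.
}
{code}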
[jira] [Updated] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HADOOP-15170: Attachment: HADOOP-15170.003.patch > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14671) Upgrade to Apache Yetus 0.7.0
[ https://issues.apache.org/jira/browse/HADOOP-14671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343778#comment-16343778 ] Allen Wittenauer commented on HADOOP-14671: --- YETUS-609 is pretty much a blocker for Hadoop to go to 0.7.0. But 0.6.0 is always an option. > Upgrade to Apache Yetus 0.7.0 > - > > Key: HADOOP-14671 > URL: https://issues.apache.org/jira/browse/HADOOP-14671 > Project: Hadoop Common > Issue Type: Improvement > Components: build, documentation, test >Affects Versions: 3.0.0-beta1 >Reporter: Allen Wittenauer >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-14671.001.patch > > > Apache Yetus 0.7.0 was released. Let's upgrade the bundled reference to the > new version. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343754#comment-16343754 ] Daryn Sharp commented on HADOOP-14445: -- This fell off my radar. Quick recap since conversation has been fragmented across multiple jiras: The LB provider requests 1 token, like it should, but it’s used only for that specific kms. Ironically, the load balancer increased load, since it only works by retries cycling back to that kms, doesn't tolerate if that kms goes down, and it went unnoticed. This Jira originally proposed obtaining n-many tokens from each subordinate kms, even though a token from 1 will work for all. The RM would have to unnecessarily renew n-many tokens and if one renew fails, job submission fails. Not good. Rushabh's original goal addresses a huge kms token renewal issue: it always uses the conf. A server like the RM cannot support a multi-kms environment. The fix is to use the kms provider's uri as the token service so the same provider can later be instantiated for renewal. This also elegantly allows the LB provider to use a single token for all subordinate providers by using its own uri. But it poses compatibility issues for jobs submitted by a new client that runs old tasks. –– The semantics for getDelegationTokenService are oddly cyclical. I'd expect it, like other hadoop clients, to premeditate the service name. The latest patch is looking at the creds to decide the service based on whether a token exists so it can attempt to look up a token for that service – which it already looked up. I’d prefer for the compatibility to be cleaner, and easier to revoke in the future. The patch falls back to conf by assuming URISyntaxException means old service, however a malformed new service should fail to avoid surprises. If it looks like a uri, it must be a valid uri. Simplest approach is to check if it contains ://. I'm also uneasy about a client-side config to control compatibility since clients are notoriously hard to upgrade. An alternative could remove the service guesswork, client conf, and be a bit more compatible by using a new token kind. The current one is “kms-dt” whereas the standard naming convention should be “KMS_DELEGATION_TOKEN”. The old token kind could continue using the conf, as today, while the new kind requires a service uri. Effectively the current/old code remains unchanged. There are tradeoffs to support old clients that must use the host:port. I know I objected to duplicating tokens, but I’ll acquiesce if it provides a cleaner approach. Duplicating a new KMS_DELEGATION_TOKEN/uri token into a single kms-dt/host:port is "no worse than today": * Pro: Old client finds kms-dt from old client. * Pro: Old client finds kms-dt from new client. * Pro: New client finds kms-dt from old client. * Pro: New client finds KMS_DELEGATION_TOKEN from new client. * Pro: Old RM renews the kms-dt for both old/new clients. * *Con*: New RM renews KMS_DELEGATION_TOKEN from new clients, effectively a double renew for the same token as kms-dt. If we are willing to sacrifice a bit for new client + old RM: Abuse the fact that old kms clients look for a host:port service regardless of kind. We can trick the RM into not renewing the unknown kind, ex. “kms-dt-deprecated”, to avoid the double renew. * Pro: Old client finds kms-dt from old client. * Pro: Old client finds kms-dt-deprecated from new client (remember, doesn't care about kind) * Pro: New client finds kms-dt from old client. 
* Pro: New client finds KMS_DELEGATION_TOKEN from new client. * Pro: Old RM renews the kms-dt for old clients (all it knows about) * *Con*: Old RM renews nothing for new clients (doesn't know KMS_DELEGATION_TOKEN or kms-dt-deprecated) * Pro: New RM renews kms-dt for old clients. * Pro: New RM renews KMS_DELEGATION_TOKEN for new clients (not kms-dt-deprecated) Thoughts? > Delegation tokens are not shared between KMS instances > -- > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: kms >Affects Versions: 2.8.0, 3.0.0-alpha1 > Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah >Priority: Major > Attachments: HADOOP-14445-branch-2.8.002.patch, > HADOOP-14445-branch-2.8.patch, HADOOP-14445.002.patch, HADOOP-14445.003.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new
[jira] [Commented] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343733#comment-16343733 ] Hudson commented on HADOOP-15186: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13577 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13577/]) HADOOP-15186. Allow Azure Data Lake SDK dependency version to be set on (stevel: rev 7fd287b4af5a191f18ea92850b7d904e4b4fb693) * (edit) hadoop-tools/hadoop-azure-datalake/pom.xml > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Fix For: 3.0.1 > > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14969) Improve diagnostics in secure DataNode startup
[ https://issues.apache.org/jira/browse/HADOOP-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343727#comment-16343727 ] Ajay Kumar commented on HADOOP-14969: - Failed test is unrelated, passes locally. > Improve diagnostics in secure DataNode startup > -- > > Key: HADOOP-14969 > URL: https://issues.apache.org/jira/browse/HADOOP-14969 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HADOOP-14969.001.patch, HADOOP-14969.002.patch, > HADOOP-14969.003.patch, HADOOP-14969.004.patch, HADOOP-14969.005.patch, > HADOOP-14969.006.patch > > > When DN secure mode configuration is incorrect, it throws the following > exception from Datanode#checkSecureConfig > {code} > private static void checkSecureConfig(DNConf dnConf, Configuration conf, > SecureResources resources) throws RuntimeException { > if (!UserGroupInformation.isSecurityEnabled()) { > return; > } > ... > throw new RuntimeException("Cannot start secure DataNode without " + > "configuring either privileged resources or SASL RPC data transfer " + > "protection and SSL for HTTP. Using privileged resources in " + > "combination with SASL RPC data transfer protection is not supported."); > {code} > The DN should print more useful diagnostics as to what exactly what went > wrong. > Also when starting secure DN with resources then the startup scripts should > launch the SecureDataNodeStarter class. If no SASL is configured and > SecureDataNodeStarter is not used, then we could mention that too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-15006) Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS
[ https://issues.apache.org/jira/browse/HADOOP-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342920#comment-16342920 ] Steve Loughran edited comment on HADOOP-15006 at 1/29/18 5:49 PM: -- I'm not going to touch this right now as its at the too raw stage, but progressing. I'll let yetus be the style police, including rejecting files for lack of ASF copyright, line endings etc. Ignoring that * what's your proposal for letting the client encryption be an optional feature, with key? Config * Once its configurable, the test would need to use two FS instances, one without encryption, one with. * Is the file length as returned in listings 100% consistent with the amount of data you get to read? was (Author: ste...@apache.org): I'm going to touch this right now as its at the too raw stage, but progressing. I'll let yetus be the style police, including rejecting files for lack of ASF copyright, line endings etc. Ignoring that * what's your proposal for letting the client encryption be an optional feature, with key? Config * Once its configurable, the test would need to use two FS instances, one without encryption, one with. * Is the file length as returned in listings 100% consistent with the amount of data you get to read? > Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS > --- > > Key: HADOOP-15006 > URL: https://issues.apache.org/jira/browse/HADOOP-15006 > Project: Hadoop Common > Issue Type: New Feature > Components: fs/s3, kms >Reporter: Steve Moist >Priority: Minor > Attachments: S3-CSE Proposal.pdf, s3-cse-poc.patch > > > This is for the proposal to introduce Client Side Encryption to S3 in such a > way that it can leverage HDFS transparent encryption, use the Hadoop KMS to > manage keys, use the `hdfs crypto` command line tools to manage encryption > zones in the cloud, and enable distcp to copy from HDFS to S3 (and > vice-versa) with data still encrypted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15186: Resolution: Fixed Fix Version/s: 3.0.1 Status: Resolved (was: Patch Available) +1, committed > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Fix For: 3.0.1 > > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15186: Summary: Allow Azure Data Lake SDK dependency version to be set on the command line (was: Allow Azure Data Lake SDK dependency version to override from the command line) > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14671) Upgrade to Apache Yetus 0.7.0
[ https://issues.apache.org/jira/browse/HADOOP-14671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343554#comment-16343554 ] Allen Wittenauer commented on HADOOP-14671: --- 0.7.0 was released today. It includes a lot of key fixes. Note that releasedocmaker has changed, so this will likely be a more invasive upgrade than just increasing the version number. > Upgrade to Apache Yetus 0.7.0 > - > > Key: HADOOP-14671 > URL: https://issues.apache.org/jira/browse/HADOOP-14671 > Project: Hadoop Common > Issue Type: Improvement > Components: build, documentation, test >Affects Versions: 3.0.0-beta1 >Reporter: Allen Wittenauer >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-14671.001.patch > > > Apache Yetus 0.7.0 was released. Let's upgrade the bundled reference to the > new version. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14671) Upgrade to Apache Yetus 0.7.0
[ https://issues.apache.org/jira/browse/HADOOP-14671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-14671: -- Description: Apache Yetus 0.7.0 was released. Let's upgrade the bundled reference to the new version. (was: Apache Yetus 0.5.0 was released. Let's upgrade the bundled reference to the new version.) > Upgrade to Apache Yetus 0.7.0 > - > > Key: HADOOP-14671 > URL: https://issues.apache.org/jira/browse/HADOOP-14671 > Project: Hadoop Common > Issue Type: Improvement > Components: build, documentation, test >Affects Versions: 3.0.0-beta1 >Reporter: Allen Wittenauer >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-14671.001.patch > > > Apache Yetus 0.7.0 was released. Let's upgrade the bundled reference to the > new version. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14671) Upgrade to Apache Yetus 0.7.0
[ https://issues.apache.org/jira/browse/HADOOP-14671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-14671: -- Summary: Upgrade to Apache Yetus 0.7.0 (was: Upgrade to Apache Yetus 0.5.1) > Upgrade to Apache Yetus 0.7.0 > - > > Key: HADOOP-14671 > URL: https://issues.apache.org/jira/browse/HADOOP-14671 > Project: Hadoop Common > Issue Type: Improvement > Components: build, documentation, test >Affects Versions: 3.0.0-beta1 >Reporter: Allen Wittenauer >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-14671.001.patch > > > Apache Yetus 0.5.0 was released. Let's upgrade the bundled reference to the > new version. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15151) MapFile.fix creates a wrong index file in case of block-compressed data file.
[ https://issues.apache.org/jira/browse/HADOOP-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343324#comment-16343324 ] Grigori Rybkine commented on HADOOP-15151: -- Thank you very much for the comment about the existing code, [~chris.douglas]. I am willing to provide a patch. Please, let me know how it would be better to proceed or maybe open a ticket and assign it to me? > MapFile.fix creates a wrong index file in case of block-compressed data file. > - > > Key: HADOOP-15151 > URL: https://issues.apache.org/jira/browse/HADOOP-15151 > Project: Hadoop Common > Issue Type: Bug > Components: common >Reporter: Grigori Rybkine >Assignee: Grigori Rybkine >Priority: Major > Labels: patch > Fix For: 2.9.1 > > Attachments: HADOOP-15151.001.patch, HADOOP-15151.002.patch, > HADOOP-15151.003.patch, HADOOP-15151.004.patch, HADOOP-15151.004.patch, > HADOOP-15151.005.patch > > > Index file created with MapFile.fix for an ordered block-compressed data file > does not allow to find values for keys existing in the data file via the > MapFile.get method. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
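A hedged reproduction sketch of the symptom in the summary: rebuild the index with MapFile.fix over a block-compressed MapFile directory, then look a known key up again. The directory path and key below are hypothetical placeholders.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

// Hedged sketch: after regenerating the index with MapFile.fix, a
// block-compressed MapFile should still return values for existing keys.
public class MapFileFixCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.getLocal(conf);
    Path dir = new Path("/tmp/example-mapfile");      // hypothetical MapFile directory
    MapFile.fix(fs, dir, Text.class, Text.class, false, conf);

    MapFile.Reader reader = new MapFile.Reader(dir, conf);
    try {
      Text value = new Text();
      // Before the fix in this issue, this could return null for keys that
      // are present in the block-compressed data file.
      System.out.println(reader.get(new Text("some-key"), value));
    } finally {
      reader.close();
    }
  }
}
{code}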
[jira] [Commented] (HADOOP-15151) MapFile.fix creates a wrong index file in case of block-compressed data file.
[ https://issues.apache.org/jira/browse/HADOOP-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343316#comment-16343316 ] Grigori Rybkine commented on HADOOP-15151: -- Thank you, [~chris.douglas], very much for reviewing and committing the patch. > MapFile.fix creates a wrong index file in case of block-compressed data file. > - > > Key: HADOOP-15151 > URL: https://issues.apache.org/jira/browse/HADOOP-15151 > Project: Hadoop Common > Issue Type: Bug > Components: common >Reporter: Grigori Rybkine >Assignee: Grigori Rybkine >Priority: Major > Labels: patch > Fix For: 2.9.1 > > Attachments: HADOOP-15151.001.patch, HADOOP-15151.002.patch, > HADOOP-15151.003.patch, HADOOP-15151.004.patch, HADOOP-15151.004.patch, > HADOOP-15151.005.patch > > > Index file created with MapFile.fix for an ordered block-compressed data file > does not allow to find values for keys existing in the data file via the > MapFile.get method. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org