[jira] [Updated] (HADOOP-15189) backport HADOOP-15039 to branch-2 and branch-3
[ https://issues.apache.org/jira/browse/HADOOP-15189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] SammiChen updated HADOOP-15189: --- Fix Version/s: 2.10.0 > backport HADOOP-15039 to branch-2 and branch-3 > -- > > Key: HADOOP-15189 > URL: https://issues.apache.org/jira/browse/HADOOP-15189 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Genmao Yu >Assignee: Genmao Yu >Priority: Blocker > Fix For: 2.10.0, 2.9.1, 3.0.1 > > Attachments: HADOOP-15189-branch-2.001.patch, > HADOOP-15189-branch-2.9.001.patch, HADOOP-15189-branch-3.0.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey reassigned HADOOP-15171: - Assignee: Lokesh Jain > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Assignee: Lokesh Jain >Priority: Blocker > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
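The workaround described in the report above — copying the compressed segment out of the direct buffer and inflating it on-heap with the JDK Inflater — can be sketched as follows. This is only a minimal illustration of that fallback path, not the Hive/ORC code itself; class, method and buffer names are hypothetical, and it assumes the decompressed output fits in the destination buffer.

{code:java}
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

public final class HeapInflateFallback {

  /**
   * Copy the compressed bytes out of a direct buffer and decompress them
   * with the JDK Inflater, writing the result into the destination buffer.
   * Returns the number of bytes produced (155 in the log above).
   */
  static int inflateViaHeap(ByteBuffer compressedDirect, ByteBuffer dest)
      throws DataFormatException {
    // "memcopy": pull the compressed segment (127 bytes in the report) on-heap
    byte[] compressed = new byte[compressedDirect.remaining()];
    compressedDirect.duplicate().get(compressed);

    Inflater inflater = new Inflater();
    try {
      inflater.setInput(compressed);
      // Assumes the inflated data fits into dest; a real implementation
      // would loop until inflater.finished().
      byte[] out = new byte[dest.remaining()];
      int produced = inflater.inflate(out);
      dest.put(out, 0, produced);
      return produced;
    } finally {
      inflater.end();
    }
  }

  private HeapInflateFallback() {
  }
}
{code}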
[jira] [Commented] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344497#comment-16344497 ] genericqa commented on HADOOP-15191: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 29s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 38s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 29s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 3m 55s{color} | {color:orange} root: The patch generated 12 new + 39 unchanged - 1 fixed = 51 total (was 40) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 6m 47s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 25s{color} | {color:red} hadoop-tools_hadoop-aws generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 1s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 58s{color} | {color:green} hadoop-distcp in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 29s{color} | {color:green} hadoop-aws in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}164m 4s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HADOOP-15191 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908245/HADOOP-15191-002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 06bee037e2db 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Comment Edited] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344463#comment-16344463 ] Vishwajeet Dusane edited comment on HADOOP-15186 at 1/30/18 3:54 AM: - Thanks [~ste...@apache.org]. Is it possible to back port this CR to 3.0.0, 2.9 and 2.8 branch ? was (Author: vishwajeet.dusane): Thanks [~ste...@apache.org] > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Fix For: 3.0.1 > > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344463#comment-16344463 ] Vishwajeet Dusane commented on HADOOP-15186: Thanks [~ste...@apache.org] > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Fix For: 3.0.1 > > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15191: Status: Patch Available (was: Open) > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch, HADOOP-15191-002.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
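From the description quoted above, the proposed interface declares a page size and a bulk delete of a list of paths. A plausible shape, purely as a sketch — the actual patch may use different names, a different package, and different semantics for partial failures — would be:

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical shape of the Private/Unstable bulk-delete interface
 * described in the issue; illustration only.
 */
public interface BulkDelete {

  /** Maximum number of paths accepted by a single bulkDelete() call. */
  int getBulkDeleteFilesLimit();

  /**
   * Delete a list of files in as few store operations as possible,
   * for example one S3 multi-object DELETE request per page of keys.
   */
  void bulkDelete(List<Path> filesToDelete) throws IOException;
}
{code}

With a page size around the S3 multi-object-delete limit, DistCP's CopyCommitter could hand over its sorted list of missing files page by page instead of issuing one DELETE call per file, which is the 1/1000 reduction in REST calls described in the issue.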
[jira] [Comment Edited] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344385#comment-16344385 ] Steve Loughran edited comment on HADOOP-15191 at 1/30/18 2:00 AM: -- h2. Proposed * New interface {{org.apache.hadoop.fs.store.BulkIO}} * S3A to implement this, relaying to {{S3ABulkOperations}} * {{S3ABulkOperations}} to implement an optimised delete If you look at the cost of the delete(file), it's not just the DELETE call its: # getFileStatus(file) : HEAD, [HEAD], [LIST]. # DELETE # getFileStatus(file.parent) HEAD, HEAD, LIST. # if not found, PUT file.parent + "/" FWIW, we could maybe optimise that second getFileStatus in the assumption that there's no file or dir marker there; all you need to do is check for the LIST call returning 1+ entry. Anyway. you are looking at ~7 HTTP requests per delete. Optimising that directory creation is equally important. Now, we could just have the bulk IO operation say "outcome of empty directories is undefined". I'm happy with that, but it's more of a change to the observable outcome of a distcp call. New {{S3ABulkOperations.bulkDeleteFiles}} * No check for a file existing before delete * Issues a bulk delete with the configured page size * builds up a tree of parent paths, and only attempts to creates fake directories for the parent directories at the bottom of the tree. That is, if you delete the paths {code} /A/B.txt /A/C/D.txt /A/C/E.txt {code} Then the only directory to consider creating is /A/C/; after which you know that the parent /A path will have an entry, so doesn't need any work. The number of fake directory creation therefore goes from O(files) to O(leaves in directory tree). At best, Ω(1), at worst O(files). One caveat: we now create an empty dir even if the source file doesn't exist. h2. Testing I've made the page size configurable (fs.s3a.experimental.bulkdelete.pagesize). We can switch on the paged delete mode with a very small page size, and so check it works properly even for a small number of files. New unit test suite {{TestS3ABulkOperations}}, primarily checks tree logic for the directory creation process. New integration test suite {{ITestS3ABulkOperations}} performs bulk IO and sees what it does. The existing {{AbstractContractDistCpTest}} test extends its {{deepDirectoryStructureToRemote}} test to become {{deepDirectoryStructureToRemoteWithSync}}, doing an update with some files added, some removed, and assertions about the final state. This verifies that distcp is happy. I've also reviewed the logs to see that all is well there. h2. Alternate Design: publish summary and do it independently The other tactic for doing this would be to not integrate DistCP with the bulk delete, and instead have it publish the files of input & output for a followup reconciler. Good: * No changes to DistCP delete process * No need to add any explicit API/interface in hadoop-common Bad: * New visible option to distcp to save output * May lead to expectations of future maintenance of the option * and also a persistent format for the data You'd still need to add the bulk delete calls alongside the S3A Fs, and any other stores to which the bulk IO was also added (Wasb could save on directory setup, by the look of things, as would oss: and swift was (Author: ste...@apache.org): h2. 
Proposed * New interface {{org.apache.hadoop.fs.store.BulkIO}} * S3A to implement this, relaying to {{S3ABulkOperations}} * {{S3ABulkOperations}} to implement an optimised delete If you look at the cost of the delete(file), it's not just the DELETE call its: # getFileStatus(file) : HEAD, [HEAD], [LIST]. # DELETE # getFileStatus(file.parent) HEAD, HEAD, LIST. # if not found, PUT file.parent + "/" FWIW, we could maybe optimise that second getFileStatus in the assumption that there's no file or dir marker there; all you need to do is check for the LIST call returning 1+ entry. Anyway. you are looking at ~7 HTTP requests per delete. Optimising that directory creation is equally important. Now, we could just have the bulk IO operation say "outcome of empty directories is undefined". I'm happy with that, but it's more of a change to the observable outcome of a distcp call. New {{S3ABulkOperations.bulkDeleteFiles}} * No check for a file existing before delete * Issues a bulk delete with the configured page size * builds up a tree of parent paths, and only attempts to creates fake directories for the parent directories at the bottom of the tree. That is, if you delete the paths {code} /A/B.txt /A/C/D.txt /A/C/E.txt {code} Then the only directory to consider creating is /A/C/; after which you know that the parent /A path will have an entry, so doesn't need any work. The number of fake directory creation therefore goes from O(files) to O(leaves in directory tree). At best, Ω(1), at worst O(files). One caveat:
[jira] [Commented] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344385#comment-16344385 ] Steve Loughran commented on HADOOP-15191: - h2. Proposed * New interface {{org.apache.hadoop.fs.store.BulkIO}} * S3A to implement this, relaying to {{S3ABulkOperations}} * {{S3ABulkOperations}} to implement an optimised delete If you look at the cost of the delete(file), it's not just the DELETE call its: # getFileStatus(file) : HEAD, [HEAD], [LIST]. # DELETE # getFileStatus(file.parent) HEAD, HEAD, LIST. # if not found, PUT file.parent + "/" FWIW, we could maybe optimise that second getFileStatus in the assumption that there's no file or dir marker there; all you need to do is check for the LIST call returning 1+ entry. Anyway. you are looking at ~7 HTTP requests per delete. Optimising that directory creation is equally important. Now, we could just have the bulk IO operation say "outcome of empty directories is undefined". I'm happy with that, but it's more of a change to the observable outcome of a distcp call. New {{S3ABulkOperations.bulkDeleteFiles}} * No check for a file existing before delete * Issues a bulk delete with the configured page size * builds up a tree of parent paths, and only attempts to creates fake directories for the parent directories at the bottom of the tree. That is, if you delete the paths {code} /A/B.txt /A/C/D.txt /A/C/E.txt {code} Then the only directory to consider creating is /A/C/; after which you know that the parent /A path will have an entry, so doesn't need any work. The number of fake directory creation therefore goes from O(files) to O(leaves in directory tree). At best, Ω(1), at worst O(files). One caveat: we now create an empty dir even if the source file doesn't exist. h2. Testing I've made the page size configurable (fs.s3a.experimental.bulkdelete.pagesize). We can switch on the paged delete mode with a very small page size, and so check it works properly even for a small number of files. New unit test suite {{TestS3ABulkOperations}}, primarily checks tree logic for the directory creation process. New integration test suite {{ITestS3ABulkOperations}} performs bulk IO and sees what it does. The existing {{AbstractContractDistCpTest}} test extends its {{deepDirectoryStructureToRemote}} test to become {{deepDirectoryStructureToRemoteWithSync}}, doing an update with some files added, some removed, and assertions about the final state. This verifies that distcp is happy. I've also reviewed the logs to see that all is well there. h2. Alternate Design: publish summary and do it independently The other tactic for doing this would be to not integrate DistCP with the bulk delete, and instead have it publish the files of input & output for a followup reconciler. 
Good: * No changes to DistCP delete process * No need to add any explicit API/interface in hadoop-common Bad: * New visible option to distcp to save output * May lead to expectations of future maintenance of the option * and also a persistent format for the data You'd still need to add the bulk delete calls alongside the S3A Fs, and any other stores to which the bulk IO was also added (Wasb could save on directory setup, by the look of things, as would oss: and swift:) > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch, HADOOP-15191-002.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
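The directory-recreation optimisation in the comment above boils down to: collect the parent of every deleted file, then keep only those parents that are not ancestors of another parent. A minimal sketch of that leaf computation follows; the names are illustrative and the real S3ABulkOperations logic may differ.

{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.hadoop.fs.Path;

public final class LeafParentDirs {

  /**
   * Given the files being deleted, return only those parent directories
   * which are not ancestors of another parent: the "leaves" of the tree.
   * Deleting /A/B.txt, /A/C/D.txt and /A/C/E.txt yields just /A/C.
   */
  static Set<Path> leafParents(List<Path> deletedFiles) {
    Set<Path> parents = new HashSet<>();
    for (Path file : deletedFiles) {
      Path parent = file.getParent();
      if (parent != null) {
        parents.add(parent);
      }
    }
    Set<Path> leaves = new HashSet<>();
    for (Path candidate : parents) {
      boolean isAncestorOfAnother = false;
      for (Path other : parents) {
        if (!other.equals(candidate) && isAncestor(candidate, other)) {
          isAncestorOfAnother = true;
          break;
        }
      }
      if (!isAncestorOfAnother) {
        leaves.add(candidate);
      }
    }
    return leaves;
  }

  /** True if ancestor appears on the path from descendant up to the root. */
  private static boolean isAncestor(Path ancestor, Path descendant) {
    for (Path p = descendant.getParent(); p != null; p = p.getParent()) {
      if (p.equals(ancestor)) {
        return true;
      }
    }
    return false;
  }

  private LeafParentDirs() {
  }
}
{code}

For the example in the comment, the three deleted paths reduce to the single leaf /A/C, so only one fake directory marker ever needs to be considered; /A is covered implicitly because it has /A/C beneath it.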
[jira] [Commented] (HADOOP-15168) Add kdiag tool to hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344383#comment-16344383 ] genericqa commented on HADOOP-15168: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 5m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 1s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 31s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} shellcheck {color} | {color:red} 0m 55s{color} | {color:red} The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 8s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 51s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 2s{color} | {color:green} hadoop-yarn in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 65m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HADOOP-15168 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908239/HADOOP-15168.02.patch | | Optional Tests | asflicense mvnsite unit shellcheck shelldocs | | uname | Linux f64e3bcd0bab 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fde95d4 | | maven | version: Apache Maven 3.3.9 | | shellcheck | v0.4.6 | | shellcheck | https://builds.apache.org/job/PreCommit-HADOOP-Build/14046/artifact/out/diff-patch-shellcheck.txt | | whitespace | https://builds.apache.org/job/PreCommit-HADOOP-Build/14046/artifact/out/whitespace-eol.txt | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/14046/testReport/ | | Max. process+thread count | 439 (vs. ulimit of 5000) | | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-yarn-project/hadoop-yarn U: . | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/14046/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add kdiag tool to hadoop command > > > Key: HADOOP-15168 > URL:
[jira] [Updated] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15191: Attachment: HADOOP-15191-002.patch > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch, HADOOP-15191-002.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15191: Status: Open (was: Patch Available) > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch, HADOOP-15191-002.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344320#comment-16344320 ] genericqa commented on HADOOP-12897: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 0s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 24s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 25s{color} | {color:orange} hadoop-common-project/hadoop-auth: The patch generated 2 new + 40 unchanged - 1 fixed = 42 total (was 41) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 48s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 35s{color} | {color:green} hadoop-auth in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 86m 14s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HADOOP-12897 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908225/HADOOP-12897.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 992137aff30d 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fde95d4 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/14045/artifact/out/diff-checkstyle-hadoop-common-project_hadoop-auth.txt | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/14045/testReport/ | | Max. process+thread count | 340 (vs. ulimit of 5000) | | modules | C: hadoop-common-project/hadoop-auth U: hadoop-common-project/hadoop-auth | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/14045/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT
[jira] [Commented] (HADOOP-15168) Add kdiag tool to hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344300#comment-16344300 ] Bharat Viswanadham commented on HADOOP-15168: - Hi [~hanishakoneru] Thank you for review. Addressed review comments in v02 patch > Add kdiag tool to hadoop command > > > Key: HADOOP-15168 > URL: https://issues.apache.org/jira/browse/HADOOP-15168 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Minor > Attachments: HADOOP-15168.00.patch, HADOOP-15168.01.patch, > HADOOP-15168.02.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15168) Add kdiag tool to hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HADOOP-15168: Attachment: HADOOP-15168.02.patch > Add kdiag tool to hadoop command > > > Key: HADOOP-15168 > URL: https://issues.apache.org/jira/browse/HADOOP-15168 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Minor > Attachments: HADOOP-15168.00.patch, HADOOP-15168.01.patch, > HADOOP-15168.02.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344268#comment-16344268 ] Steve Loughran commented on HADOOP-15171: - There was another JIRA on this wasn't there? Sergei, can you find it? > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Blocker > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344204#comment-16344204 ] Gopal V commented on HADOOP-15171: -- bq. this is becoming a pain This is a huge perf hit right now, the workaround is much slower than the original codepath. > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Blocker > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15140) S3guard mistakes root URI without / as non-absolute path
[ https://issues.apache.org/jira/browse/HADOOP-15140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344201#comment-16344201 ] Abraham Fine commented on HADOOP-15140: --- [~ste...@apache.org] I went ahead and added two tests to {{AbstractContractGetFileStatusTest}}: {code:java} @Test public void testGetFileStatusRootURI() throws Throwable { String fileSystemURI = getFileSystem().getUri().toString(); assertTrue("uri should not end with '/': " + fileSystemURI, fileSystemURI.endsWith("//") || !fileSystemURI.endsWith("/")); ContractTestUtils.assertIsDirectory( getFileSystem().getFileStatus(new Path(fileSystemURI))); } @Test public void testGetFileStatusRootFromChild() throws Throwable { ContractTestUtils.assertIsDirectory( getFileSystem().getFileStatus(new Path("/dir").getParent())); } {code} These tests ran against S3, Azure, Azure Data Lake, and HDFS using: {{mvn test -Dtest="**/*ContractGetFileStatus*" -DS3guard -fae -Dmaven.test.failure.ignore=true}} {{testGetFileStatusRootFromChild}} never appears to fail. {{testGetFileStatusRootURI}} does on HDFS, S3, and Azure Data Lake (Azure native passes). Local file systems also pass. Here are the failures and their corresponding stack traces: {code:java} [INFO] Running org.apache.hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus [ERROR] Tests run: 20, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.344 s <<< FAILURE! - in org.apache.hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus [ERROR] testGetFileStatusRootURI(org.apache.hadoop.fs.contract.hdfs.TestHDFSContractGetFileStatus) Time elapsed: 0.02 s <<< ERROR! java.lang.IllegalArgumentException: Pathname from hdfs://localhost:63826 is not a valid DFS filename. at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:242) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1568) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1565) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1580) at org.apache.hadoop.fs.contract.AbstractContractGetFileStatusTest.testGetFileStatusRootURI(AbstractContractGetFileStatusTest.java:86) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) [INFO] Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus [ERROR] Tests run: 20, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 113.029 s <<< FAILURE! 
- in org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus [ERROR] testGetFileStatusRootURI(org.apache.hadoop.fs.contract.s3a.ITestS3AContractGetFileStatus) Time elapsed: 1.27 s <<< ERROR! java.lang.IllegalArgumentException: path must be absolute at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.(PathMetadata.java:68) at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.(PathMetadata.java:60) at org.apache.hadoop.fs.s3a.s3guard.PathMetadata.(PathMetadata.java:56) at org.apache.hadoop.fs.s3a.s3guard.S3Guard.putAndReturn(S3Guard.java:149) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2130) at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2070) at org.apache.hadoop.fs.contract.AbstractContractGetFileStatusTest.testGetFileStatusRootURI(AbstractContractGetFileStatusTest.java:86) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at
[jira] [Updated] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HADOOP-15171: -- Priority: Blocker (was: Critical) > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Blocker > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344184#comment-16344184 ] Sergey Shelukhin commented on HADOOP-15171: --- [~ste...@apache.org] [~jnp] is it possible to get some traction on this actually? We now also have to work around this in ORC project, and this is becoming a pain > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Critical > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15171) Hadoop native ZLIB decompressor produces 0 bytes for some input
[ https://issues.apache.org/jira/browse/HADOOP-15171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HADOOP-15171: -- Fix Version/s: 3.0.1 3.1.0 > Hadoop native ZLIB decompressor produces 0 bytes for some input > --- > > Key: HADOOP-15171 > URL: https://issues.apache.org/jira/browse/HADOOP-15171 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.1.0 >Reporter: Sergey Shelukhin >Priority: Critical > Fix For: 3.1.0, 3.0.1 > > > While reading some ORC file via direct buffers, Hive gets a 0-sized buffer > for a particular compressed segment of the file. We narrowed it down to > Hadoop native ZLIB codec; when the data is copied to heap-based buffer and > the JDK Inflater is used, it produces correct output. Input is only 127 bytes > so I can paste it here. > All the other (many) blocks of the file are decompressed without problems by > the same code. > {noformat} > 2018-01-13T02:47:40,815 TRACE [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Decompressing > 127 bytes to dest buffer pos 524288, limit 786432 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: The codec has > produced 0 bytes for 127 bytes at pos 0, data hash 1719565039: [e3 92 e1 62 > 66 60 60 10 12 e5 98 e0 27 c4 c7 f1 e8 12 8f 40 c3 7b 5e 89 09 7f 6e 74 73 04 > 30 70 c9 72 b1 30 14 4d 60 82 49 37 bd e7 15 58 d0 cd 2f 31 a1 a1 e3 35 4c fa > 15 a3 02 4c 7a 51 37 bf c0 81 e5 02 12 13 5a b6 9f e2 04 ea 96 e3 62 65 b8 c3 > b4 01 ae fd d0 72 01 81 07 87 05 25 26 74 3c 5b c9 05 35 fd 0a b3 03 50 7b 83 > 11 c8 f2 c3 82 02 0f 96 0b 49 34 7c fa ff 9f 2d 80 01 00 > 2018-01-13T02:47:40,816 WARN [IO-Elevator-Thread-0 > (1515637158315_0079_1_00_00_0)] encoded.EncodedReaderImpl: Fell back to > JDK decompressor with memcopy; got 155 bytes > {noformat} > Hadoop version is based on 3.1 snapshot. > The size of libhadoop.so is 824403 bytes, and libgplcompression is 78273 > FWIW. Not sure how to extract versions from those. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HADOOP-12897: Attachment: HADOOP-12897.004.patch > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch, HADOOP-12897.004.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HADOOP-12897: Attachment: (was: HADOOP-12897.004.patch) > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch, HADOOP-12897.004.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HADOOP-12897: Attachment: HADOOP-12897.004.patch > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch, HADOOP-12897.004.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344165#comment-16344165 ] Ajay Kumar commented on HADOOP-12897: - Patch v4 adds log message at DEBUG level to address [~ste...@apache.org] suggestion on {{wrapExceptionWithMessage}}. > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
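Since NetUtils.wrapException lives outside hadoop-auth, the helper referred to above presumably rebuilds the IOException so the target URL appears in its message, with the original failure also logged at DEBUG. A minimal sketch of such a wrapper, assuming nothing about the actual patch beyond the name mentioned in the comment:

{code:java}
import java.io.IOException;
import java.net.URL;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class AuthExceptionUtil {

  private static final Logger LOG =
      LoggerFactory.getLogger(AuthExceptionUtil.class);

  /**
   * Wrap an IOException so the failing URL is part of the message,
   * logging the original failure at DEBUG level as well.
   */
  static IOException wrapExceptionWithMessage(URL url, IOException e) {
    LOG.debug("Connection to {} failed", url, e);
    return new IOException(
        "Error connecting to " + url + ": " + e.getMessage(), e);
  }

  private AuthExceptionUtil() {
  }
}
{code}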
[jira] [Comment Edited] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344139#comment-16344139 ] Ajay Kumar edited comment on HADOOP-12897 at 1/29/18 10:46 PM: --- [~arpitagarwal], thanks for suggestion. Using multi-catch in this case gives another compile time error which basically expects throws clause in function to be Exception. Seems like a bug in JDK. was (Author: ajayydv): [~arpitagarwal], thanks for suggestion. Using multi-catch in this case gives another compile time error which basically expects throws clause in function to Exception. Seems like a bug in JDK. > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344139#comment-16344139 ] Ajay Kumar commented on HADOOP-12897: - [~arpitagarwal], thanks for suggestion. Using multi-catch in this case gives another compile time error which basically expects throws clause in function to Exception. Seems like a bug in JDK. > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15191) Add Private/Unstable BulkDelete operations to supporting object stores for DistCP
[ https://issues.apache.org/jira/browse/HADOOP-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344136#comment-16344136 ] Steve Loughran commented on HADOOP-15191: - The patch I'm working on now (bigger, passing tests) doesn't contain any attempts to recover from partially failed deletes. That's a more complex issue which needs to be implemented and tested more broadly, and is only relevant when you are mixing permissions down a tree. As S3A doesn't yet even handle delete(file) properly there, this new operation isn't making things worse. > Add Private/Unstable BulkDelete operations to supporting object stores for > DistCP > - > > Key: HADOOP-15191 > URL: https://issues.apache.org/jira/browse/HADOOP-15191 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, tools/distcp >Affects Versions: 2.9.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Attachments: HADOOP-15191-001.patch > > > Large scale DistCP with the -delete option doesn't finish in a viable time > because of the final CopyCommitter doing a 1 by 1 delete of all missing > files. This isn't randomized (the list is sorted), and it's throttled by AWS. > If bulk deletion of files was exposed as an API, distCP would do 1/1000 of > the REST calls, so not get throttled. > Proposed: add an initially private/unstable interface for stores, > {{BulkDelete}} which declares a page size and offers a > {{bulkDelete(List)}} operation for the bulk deletion. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
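As a sketch of the interface being proposed, the method names below follow the issue description ("declares a page size and offers a bulkDelete(List) operation"); the package, javadoc, and exact types are assumptions rather than what the attached patch defines.

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical shape of the private/unstable bulk delete API: a store
 * declares its page size and accepts a list of paths to delete, issuing
 * (at most) one store request per page instead of one per file.
 */
public interface BulkDelete {

  /** Maximum number of paths the store accepts in a single bulk delete call. */
  int getBulkDeletePageSize();

  /** Delete the given paths; callers should keep the list within the page size. */
  void bulkDelete(List<Path> pathsToDelete) throws IOException;
}
{code}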
[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344105#comment-16344105 ] Xiao Chen commented on HADOOP-14445: Thanks [~daryn] for circling back with the new idea. Mixed feeling (and head scratching)! :) I think a new and standardized token kind should work, and conveniently eliminate the need for changing client configs, so SGTM. We may also check in the RM, when its {{DelegationTokenRenewer}} received a set of tokens, and there are both kms-dt and KMS_D_T with the same sequence number, only renew the KMS_D_T. For that to work, we'd need a new {{KMSDelegationTokenIdentifier}} class and a new {{DelegationTokenAuthenticationHandler}} too. Curious: with the current approach (patch 3) we need just an additional config deployment after the upgrade, right? What changed your mind from [earlier|https://issues.apache.org/jira/browse/HADOOP-14445?focusedCommentId=16279134=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16279134] (assuming the implementation comments are addressed) ? I'd rather prefer not to sacrifice old RM + new client. True for RU should still work, but there is still support burden for a new client connecting to an existing cluster. Token issues are not the easiest to figure out, and IMO we should avoid this case when we can. > Delegation tokens are not shared between KMS instances > -- > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: kms >Affects Versions: 2.8.0, 3.0.0-alpha1 > Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah >Priority: Major > Attachments: HADOOP-14445-branch-2.8.002.patch, > HADOOP-14445-branch-2.8.patch, HADOOP-14445.002.patch, HADOOP-14445.003.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new InetSocketAddress(url.getHost(), > url.getPort()); > Text service = SecurityUtil.buildTokenService(serviceAddr); > dToken = creds.getToken(service); > {code} > But KMS doc states: > {quote} > Delegation Tokens > Similar to HTTP authentication, KMS uses Hadoop Authentication for delegation > tokens too. > Under HA, A KMS instance must verify the delegation token given by another > KMS instance, by checking the shared secret used to sign the delegation > token. To do this, all KMS instances must be able to retrieve the shared > secret from ZooKeeper. > {quote} > We should either update the KMS documentation, or fix this code to share > delegation tokens. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
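To make the "two kinds of the same token" compatibility idea concrete, here is a hedged sketch of registering one KMS delegation token under both the legacy host:port service and a new URI-based service and kind. The kind string and service layout are assumptions taken from the discussion, not what the eventual patch settled on.

{code:java}
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

// Hedged sketch: duplicate one KMS token under the old kind/service (kms-dt
// keyed by host:port) and a new kind/service (keyed by the provider URI), so
// both old and new clients can look it up from the same Credentials.
public class KmsTokenCompat {
  public static <T extends TokenIdentifier> void addCompatTokens(
      Credentials creds, Token<T> kmsToken,
      String legacyHostPort, String providerUri) {
    // Old clients look the token up by host:port.
    creds.addToken(new Text(legacyHostPort), kmsToken);

    // New clients look it up by the provider URI, under the new kind.
    Token<T> copy = new Token<>(kmsToken);           // same identifier and secret
    copy.setKind(new Text("KMS_DELEGATION_TOKEN"));  // assumed new kind name
    copy.setService(new Text(providerUri));
    creds.addToken(new Text(providerUri), copy);
  }
}
{code}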
[jira] [Commented] (HADOOP-15129) Datanode caches namenode DNS lookup failure and cannot startup
[ https://issues.apache.org/jira/browse/HADOOP-15129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344037#comment-16344037 ] Ajay Kumar commented on HADOOP-15129: - {quote}local host is: (unknown); {quote} Can we pass "localhost" to {{NetUtils.wrapException}} instead of null? The above message in the logs is a little misleading. > Datanode caches namenode DNS lookup failure and cannot startup > -- > > Key: HADOOP-15129 > URL: https://issues.apache.org/jira/browse/HADOOP-15129 > Project: Hadoop Common > Issue Type: Bug > Components: ipc >Affects Versions: 2.8.2 > Environment: Google Compute Engine. > I'm using Java 8, Debian 8, Hadoop 2.8.2. >Reporter: Karthik Palaniappan >Assignee: Karthik Palaniappan >Priority: Minor > Attachments: HADOOP-15129.001.patch, HADOOP-15129.002.patch > > > On startup, the Datanode creates an InetSocketAddress to register with each > namenode. Though there are retries on connection failure throughout the > stack, the same InetSocketAddress is reused. > InetSocketAddress is an interesting class, because it resolves DNS names to > IP addresses on construction, and it is never refreshed. Hadoop re-creates an > InetSocketAddress in some cases just in case the remote IP has changed for a > particular DNS name: https://issues.apache.org/jira/browse/HADOOP-7472. > Anyway, on startup, you can see the Datanode log: "Namenode...remains > unresolved" -- referring to the fact that DNS lookup failed. > {code:java} > 2017-11-02 16:01:55,115 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Refresh request received for nameservices: null > 2017-11-02 16:01:55,153 WARN org.apache.hadoop.hdfs.DFSUtilClient: Namenode > for null remains unresolved for ID null. Check your hdfs-site.xml file to > ensure namenodes are configured properly. > 2017-11-02 16:01:55,156 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Starting BPOfferServices for nameservices: > 2017-11-02 16:01:55,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Block pool (Datanode Uuid unassigned) service to > cluster-32f5-m:8020 starting to offer service > {code} > The Datanode then proceeds to use this unresolved address, as it may work if > the DN is configured to use a proxy. Since I'm not using a proxy, it forever > prints out this message: > {code:java} > 2017-12-15 00:13:40,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:45,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:50,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:55,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:14:00,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > {code} > Unfortunately, the log doesn't contain the exception that triggered it, but > the culprit is actually in IPC Client: > https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L444. > This line was introduced in https://issues.apache.org/jira/browse/HADOOP-487 > to give a clear error message when somebody misspells an address. > However, the fix in HADOOP-7472 doesn't apply here, because that code happens > in Client#getConnection after the Connection is constructed. 
> My proposed fix (will attach a patch) is to move this exception out of the > constructor and into a place that will trigger HADOOP-7472's logic to > re-resolve addresses. If the DNS failure was temporary, this will allow the > connection to succeed. If not, the connection will fail after ipc client > retries (default 10 seconds worth of retries). > I want to fix this in ipc client rather than just in Datanode startup, as > this fixes temporary DNS issues for all of Hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
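A minimal sketch of the re-resolution idea referenced above (HADOOP-7472): if the cached address was created while DNS lookup was failing, construct a fresh InetSocketAddress so the lookup is retried. Exactly where this hooks into the IPC client is what the attached patches work out; this is only an illustration.

{code:java}
import java.net.InetSocketAddress;

// Hedged sketch: constructing a new InetSocketAddress triggers a fresh DNS
// lookup, so an address that resolved to "unresolved" can recover later.
public static InetSocketAddress reResolveIfUnresolved(InetSocketAddress addr) {
  if (addr.isUnresolved()) {
    return new InetSocketAddress(addr.getHostName(), addr.getPort());
  }
  return addr;
}
{code}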
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344003#comment-16344003 ] genericqa commented on HADOOP-15170: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 53s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 8m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 52s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 75m 5s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 | | JIRA Issue | HADOOP-15170 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908190/HADOOP-15170.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux e7438a831884 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 7fd287b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_151 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/14044/testReport/ | | Max. process+thread count | 1430 (vs. ulimit of 5000) | | modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common | | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/14044/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 >
[jira] [Commented] (HADOOP-15006) Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS
[ https://issues.apache.org/jira/browse/HADOOP-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343975#comment-16343975 ] Steve Moist commented on HADOOP-15006: -- >what's your proposal for letting the client encryption be an optional feature, >with key? Config: if s3a.client.encryption.enabled=true, then check for a BEZ; if one exists, encrypt objects, else no encryption for the bucket. Or enable it if the BEZI provider is configured as well, rather than just the flag. >Is the file length as returned in listings 100% consistent with the amount of >data you get to read? Yes. >I'm not going to touch this right now as its at the too raw stage That's why I submitted it, for you and everyone else to play with to evaluate if this is something that we should move forward with. If needed I can go fix the broken S3Guard/Committer/byte comparison tests and have yetus pass it, but the actual code is going to be about the same. > Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS > --- > > Key: HADOOP-15006 > URL: https://issues.apache.org/jira/browse/HADOOP-15006 > Project: Hadoop Common > Issue Type: New Feature > Components: fs/s3, kms >Reporter: Steve Moist >Priority: Minor > Attachments: S3-CSE Proposal.pdf, s3-cse-poc.patch > > > This is for the proposal to introduce Client Side Encryption to S3 in such a > way that it can leverage HDFS transparent encryption, use the Hadoop KMS to > manage keys, use the `hdfs crypto` command line tools to manage encryption > zones in the cloud, and enable distcp to copy from HDFS to S3 (and > vice-versa) with data still encrypted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
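A small sketch of the gating logic described in the first answer above. The property name is taken verbatim from the comment, and the encryption-zone check is reduced to a boolean parameter; both are assumptions about a proposal, not a final implementation.

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hedged sketch: client-side encryption applies only when the flag is set
// and an encryption zone (BEZ) covers the bucket; otherwise write plaintext.
public static boolean shouldEncryptClientSide(Configuration conf,
    boolean bucketHasEncryptionZone) {
  return conf.getBoolean("s3a.client.encryption.enabled", false)
      && bucketHasEncryptionZone;
}
{code}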
[jira] [Commented] (HADOOP-15129) Datanode caches namenode DNS lookup failure and cannot startup
[ https://issues.apache.org/jira/browse/HADOOP-15129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343948#comment-16343948 ] Arpit Agarwal commented on HADOOP-15129: I haven't looked at the test cases yet but the change looks fine to me. Will review the new tests. [~kihwal], do you have any thoughts since you added the original re-resolution logic? > Datanode caches namenode DNS lookup failure and cannot startup > -- > > Key: HADOOP-15129 > URL: https://issues.apache.org/jira/browse/HADOOP-15129 > Project: Hadoop Common > Issue Type: Bug > Components: ipc >Affects Versions: 2.8.2 > Environment: Google Compute Engine. > I'm using Java 8, Debian 8, Hadoop 2.8.2. >Reporter: Karthik Palaniappan >Assignee: Karthik Palaniappan >Priority: Minor > Attachments: HADOOP-15129.001.patch, HADOOP-15129.002.patch > > > On startup, the Datanode creates an InetSocketAddress to register with each > namenode. Though there are retries on connection failure throughout the > stack, the same InetSocketAddress is reused. > InetSocketAddress is an interesting class, because it resolves DNS names to > IP addresses on construction, and it is never refreshed. Hadoop re-creates an > InetSocketAddress in some cases just in case the remote IP has changed for a > particular DNS name: https://issues.apache.org/jira/browse/HADOOP-7472. > Anyway, on startup, you can see the Datanode log: "Namenode...remains > unresolved" -- referring to the fact that DNS lookup failed. > {code:java} > 2017-11-02 16:01:55,115 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Refresh request received for nameservices: null > 2017-11-02 16:01:55,153 WARN org.apache.hadoop.hdfs.DFSUtilClient: Namenode > for null remains unresolved for ID null. Check your hdfs-site.xml file to > ensure namenodes are configured properly. > 2017-11-02 16:01:55,156 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Starting BPOfferServices for nameservices: > 2017-11-02 16:01:55,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Block pool (Datanode Uuid unassigned) service to > cluster-32f5-m:8020 starting to offer service > {code} > The Datanode then proceeds to use this unresolved address, as it may work if > the DN is configured to use a proxy. Since I'm not using a proxy, it forever > prints out this message: > {code:java} > 2017-12-15 00:13:40,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:45,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:50,712 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:13:55,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > 2017-12-15 00:14:00,713 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Problem connecting to server: cluster-32f5-m:8020 > {code} > Unfortunately, the log doesn't contain the exception that triggered it, but > the culprit is actually in IPC Client: > https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L444. > This line was introduced in https://issues.apache.org/jira/browse/HADOOP-487 > to give a clear error message when somebody misspells an address. > However, the fix in HADOOP-7472 doesn't apply here, because that code happens > in Client#getConnection after the Connection is constructed. 
> My proposed fix (will attach a patch) is to move this exception out of the > constructor and into a place that will trigger HADOOP-7472's logic to > re-resolve addresses. If the DNS failure was temporary, this will allow the > connection to succeed. If not, the connection will fail after ipc client > retries (default 10 seconds worth of retries). > I want to fix this in ipc client rather than just in Datanode startup, as > this fixes temporary DNS issues for all of Hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-12897) KerberosAuthenticator.authenticate to include URL on IO failures
[ https://issues.apache.org/jira/browse/HADOOP-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343884#comment-16343884 ] Arpit Agarwal commented on HADOOP-12897: Minor comment. You can simplify the following code: {code} } catch (IOException ex) { throw wrapExceptionWithMessage(ex, "Error while authenticating with endpoint: " + url); } catch (AuthenticationException ex) { throw wrapExceptionWithMessage(ex, "Error while authenticating with endpoint: " + url); } {code} as follows: {code} } catch (IOException | AuthenticationException ex) { throw wrapExceptionWithMessage(ex, "Error while authenticating with endpoint: " + url); } {code} > KerberosAuthenticator.authenticate to include URL on IO failures > > > Key: HADOOP-12897 > URL: https://issues.apache.org/jira/browse/HADOOP-12897 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-12897.001.patch, HADOOP-12897.002.patch, > HADOOP-12897.003.patch > > > If {{KerberosAuthenticator.authenticate}} can't connect to the endpoint, you > get a stack trace, but without the URL it is trying to talk to. > That is: it doesn't have any equivalent of the {{NetUtils.wrapException}} > handler —which can't be called here as its not in the {{hadoop-auth}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15168) Add kdiag tool to hadoop command
[ https://issues.apache.org/jira/browse/HADOOP-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343880#comment-16343880 ] Hanisha Koneru commented on HADOOP-15168: - {quote}I think, the commands which are related to hdfs, we add in hdfs script, similar for yarn. Or do we add to all scripts in general? {quote} No, if you are adding, you should add it in {{hdfs}} and {{yarn}} scripts. I am not sure if it is required as they do not have other kerberos related commands (such as {{key}} and {{kerbname}}). I meant to say we should change the following lines in {{SecureMode.md}} to reflect the changes introduced by this Jira. {code:java} The `KDiag` command has its own entry point; it is currently not hooked up to the end-user CLI. It is invoked simply by passing its full classname to one of the `bin/hadoop`, `bin/hdfs` or `bin/yarn` commands. Accordingly, it will display the kerberos client state of the command used to invoke it. ``` hadoop org.apache.hadoop.security.KDiag hdfs org.apache.hadoop.security.KDiag yarn org.apache.hadoop.security.KDiag {code} > Add kdiag tool to hadoop command > > > Key: HADOOP-15168 > URL: https://issues.apache.org/jira/browse/HADOOP-15168 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Bharat Viswanadham >Assignee: Bharat Viswanadham >Priority: Minor > Attachments: HADOOP-15168.00.patch, HADOOP-15168.01.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343828#comment-16343828 ] Ajay Kumar commented on HADOOP-15170: - [~jlowe], thanks for review. Updated patch v3 with suggested changes. > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
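As an illustration of the approach the issue describes — mapping a tar symlink entry onto java.nio's createSymbolicLink — a hedged sketch using the commons-compress entry accessors; the attached patches may handle relative targets, overwrites, and error reporting differently.

{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;

// Hedged sketch: when the Java untar path meets a symlink entry, create the
// link with java.nio instead of skipping it.
public static void maybeCreateSymlink(File outputDir, TarArchiveEntry entry)
    throws IOException {
  if (entry.isSymbolicLink()) {
    Files.createSymbolicLink(
        Paths.get(outputDir.getPath(), entry.getName()), // link to create
        Paths.get(entry.getLinkName()));                 // target recorded in the archive
  }
  // Regular files and directories are handled by the existing extraction code.
}
{code}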
[jira] [Updated] (HADOOP-15170) Add symlink support to FileUtil#unTarUsingJava
[ https://issues.apache.org/jira/browse/HADOOP-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HADOOP-15170: Attachment: HADOOP-15170.003.patch > Add symlink support to FileUtil#unTarUsingJava > --- > > Key: HADOOP-15170 > URL: https://issues.apache.org/jira/browse/HADOOP-15170 > Project: Hadoop Common > Issue Type: Improvement > Components: util >Reporter: Jason Lowe >Assignee: Ajay Kumar >Priority: Minor > Attachments: HADOOP-15170.001.patch, HADOOP-15170.002.patch, > HADOOP-15170.003.patch > > > Now that JDK7 or later is required, we can leverage > java.nio.Files.createSymbolicLink in FileUtil.unTarUsingJava to support > archives that contain symbolic links. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14671) Upgrade to Apache Yetus 0.7.0
[ https://issues.apache.org/jira/browse/HADOOP-14671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343778#comment-16343778 ] Allen Wittenauer commented on HADOOP-14671: --- YETUS-609 is pretty much a blocker for Hadoop to go to 0.7.0. But 0.6.0 is always an option. > Upgrade to Apache Yetus 0.7.0 > - > > Key: HADOOP-14671 > URL: https://issues.apache.org/jira/browse/HADOOP-14671 > Project: Hadoop Common > Issue Type: Improvement > Components: build, documentation, test >Affects Versions: 3.0.0-beta1 >Reporter: Allen Wittenauer >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-14671.001.patch > > > Apache Yetus 0.7.0 was released. Let's upgrade the bundled reference to the > new version. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14445) Delegation tokens are not shared between KMS instances
[ https://issues.apache.org/jira/browse/HADOOP-14445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343754#comment-16343754 ] Daryn Sharp commented on HADOOP-14445: -- This fell off my radar. Quick recap since conversation has been fragmented across multiple jiras: The LB provider requests 1 token, like it should, but it’s used only for that specific kms. Ironically, the load balancer increased load, since it only works by retries cycling back to that kms, doesn't tolerate if that kms goes down, and it went unnoticed. This Jira originally proposed obtaining n-many tokens from each subordinate kms, even though a token from 1 will work for all. The RM would have to unnecessarily renew n-many tokens and if one renew fails, job submission fails. Not good. Rushabh's original goal addresses a huge kms token renewal issue: it always uses the conf. A server like the RM cannot support a multi-kms environment. The fix is to use the kms provider's uri as the token service so the same provider can later be instantiated for renewal. This also elegantly allows the LB provider to use a single token for all subordinate providers by using its own uri. But it poses compatibility issues for jobs submitted by a new client that runs old tasks. –– The semantics for getDelegationTokenService are oddly cyclical. I'd expect it, like other hadoop clients, to premeditate the service name. The latest patch is looking at the creds to decide the service based on whether a token exists so it can attempt to look up a token for that service – which it already looked up. I’d prefer for the compatibility to be cleaner, and easier to revoke in the future. The patch falls back to conf by assuming URISyntaxException means old service, however a malformed new service should fail to avoid surprises. If it looks like a uri, it must be a valid uri. Simplest approach is to check if it contains ://. I'm also uneasy about a client-side config to control compatibility since clients are notoriously hard to upgrade. An alternative could remove the service guesswork, client conf, and be a bit more compatible by using a new token kind. The current one is “kms-dt” whereas the standard naming convention should be “KMS_DELEGATION_TOKEN”. The old token kind could continue using the conf, as today, while the new kind requires a service uri. Effectively the current/old code remains unchanged. There are tradeoffs to support old clients that must use the host:port. I know I objected to duplicating tokens, but I’ll acquiesce if it provides a cleaner approach. Duplicating a new KMS_DELEGATION_TOKEN/uri token into a single kms-dt/host:port is "no worse than today": * Pro: Old client finds kms-dt from old client. * Pro: Old client finds kms-dt from new client. * Pro: New client finds kms-dt from old client. * Pro: New client finds KMS_DELEGATION_TOKEN from new client. * Pro: Old RM renews the kms-dt for both old/new clients. * *Con*: New RM renews KMS_DELEGATION_TOKEN from new clients, effectively a double renew for the same token as kms-dt. If we are willing to sacrifice a bit for new client + old RM: Abuse the fact that old kms clients look for a host:port service regardless of kind. We can trick the RM into not renewing the unknown kind, ex. “kms-dt-deprecated”, to avoid the double renew. * Pro: Old client finds kms-dt from old client. * Pro: Old client finds kms-dt-deprecated from new client (remember, doesn't care about kind) * Pro: New client finds kms-dt from old client. 
* Pro: New client finds KMS_DELEGATION_TOKEN from new client. * Pro: Old RM renews the kms-dt for old clients (all it knows about) * *Con*: Old RM renews nothing for new clients (doesn't know KMS_DELEGATION_TOKEN or kms-dt-deprecated) * Pro: New RM renews kms-dt for old clients. * Pro: New RM renews KMS_DELEGATION_TOKEN for new clients (not kms-dt-deprecated) Thoughts? > Delegation tokens are not shared between KMS instances > -- > > Key: HADOOP-14445 > URL: https://issues.apache.org/jira/browse/HADOOP-14445 > Project: Hadoop Common > Issue Type: Bug > Components: kms >Affects Versions: 2.8.0, 3.0.0-alpha1 > Environment: CDH5.7.4, Kerberized, SSL, KMS-HA, at rest encryption >Reporter: Wei-Chiu Chuang >Assignee: Rushabh S Shah >Priority: Major > Attachments: HADOOP-14445-branch-2.8.002.patch, > HADOOP-14445-branch-2.8.patch, HADOOP-14445.002.patch, HADOOP-14445.003.patch > > > As discovered in HADOOP-14441, KMS HA using LoadBalancingKMSClientProvider do > not share delegation tokens. (a client uses KMS address/port as the key for > delegation token) > {code:title=DelegationTokenAuthenticatedURL#openConnection} > if (!creds.getAllTokens().isEmpty()) { > InetSocketAddress serviceAddr = new
[jira] [Commented] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343733#comment-16343733 ] Hudson commented on HADOOP-15186: - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13577 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13577/]) HADOOP-15186. Allow Azure Data Lake SDK dependency version to be set on (stevel: rev 7fd287b4af5a191f18ea92850b7d904e4b4fb693) * (edit) hadoop-tools/hadoop-azure-datalake/pom.xml > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Fix For: 3.0.1 > > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14969) Improve diagnostics in secure DataNode startup
[ https://issues.apache.org/jira/browse/HADOOP-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343727#comment-16343727 ] Ajay Kumar commented on HADOOP-14969: - Failed test is unrelated, passes locally. > Improve diagnostics in secure DataNode startup > -- > > Key: HADOOP-14969 > URL: https://issues.apache.org/jira/browse/HADOOP-14969 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Attachments: HADOOP-14969.001.patch, HADOOP-14969.002.patch, > HADOOP-14969.003.patch, HADOOP-14969.004.patch, HADOOP-14969.005.patch, > HADOOP-14969.006.patch > > > When DN secure mode configuration is incorrect, it throws the following > exception from Datanode#checkSecureConfig > {code} > private static void checkSecureConfig(DNConf dnConf, Configuration conf, > SecureResources resources) throws RuntimeException { > if (!UserGroupInformation.isSecurityEnabled()) { > return; > } > ... > throw new RuntimeException("Cannot start secure DataNode without " + > "configuring either privileged resources or SASL RPC data transfer " + > "protection and SSL for HTTP. Using privileged resources in " + > "combination with SASL RPC data transfer protection is not supported."); > {code} > The DN should print more useful diagnostics as to what exactly what went > wrong. > Also when starting secure DN with resources then the startup scripts should > launch the SecureDataNodeStarter class. If no SASL is configured and > SecureDataNodeStarter is not used, then we could mention that too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-15006) Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS
[ https://issues.apache.org/jira/browse/HADOOP-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342920#comment-16342920 ] Steve Loughran edited comment on HADOOP-15006 at 1/29/18 5:49 PM: -- I'm not going to touch this right now as its at the too raw stage, but progressing. I'll let yetus be the style police, including rejecting files for lack of ASF copyright, line endings etc. Ignoring that * what's your proposal for letting the client encryption be an optional feature, with key? Config * Once its configurable, the test would need to use two FS instances, one without encryption, one with. * Is the file length as returned in listings 100% consistent with the amount of data you get to read? was (Author: ste...@apache.org): I'm going to touch this right now as its at the too raw stage, but progressing. I'll let yetus be the style police, including rejecting files for lack of ASF copyright, line endings etc. Ignoring that * what's your proposal for letting the client encryption be an optional feature, with key? Config * Once its configurable, the test would need to use two FS instances, one without encryption, one with. * Is the file length as returned in listings 100% consistent with the amount of data you get to read? > Encrypt S3A data client-side with Hadoop libraries & Hadoop KMS > --- > > Key: HADOOP-15006 > URL: https://issues.apache.org/jira/browse/HADOOP-15006 > Project: Hadoop Common > Issue Type: New Feature > Components: fs/s3, kms >Reporter: Steve Moist >Priority: Minor > Attachments: S3-CSE Proposal.pdf, s3-cse-poc.patch > > > This is for the proposal to introduce Client Side Encryption to S3 in such a > way that it can leverage HDFS transparent encryption, use the Hadoop KMS to > manage keys, use the `hdfs crypto` command line tools to manage encryption > zones in the cloud, and enable distcp to copy from HDFS to S3 (and > vice-versa) with data still encrypted. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15186: Resolution: Fixed Fix Version/s: 3.0.1 Status: Resolved (was: Patch Available) +1, committed > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Fix For: 3.0.1 > > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-15186) Allow Azure Data Lake SDK dependency version to be set on the command line
[ https://issues.apache.org/jira/browse/HADOOP-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated HADOOP-15186: Summary: Allow Azure Data Lake SDK dependency version to be set on the command line (was: Allow Azure Data Lake SDK dependency version to override from the command line) > Allow Azure Data Lake SDK dependency version to be set on the command line > -- > > Key: HADOOP-15186 > URL: https://issues.apache.org/jira/browse/HADOOP-15186 > Project: Hadoop Common > Issue Type: Improvement > Components: build, fs/adl >Affects Versions: 3.0.0 >Reporter: Vishwajeet Dusane >Assignee: Vishwajeet Dusane >Priority: Major > Attachments: HADOOP-15186-001.patch, HADOOP-15186-002.patch, > HADOOP-15186-003.patch > > > For backward/forward release of Java SDK compatibility test against Hadoop > driver. Allow Azure Data Lake Java SDK dependency version to override from > command line. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14671) Upgrade to Apache Yetus 0.7.0
[ https://issues.apache.org/jira/browse/HADOOP-14671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343554#comment-16343554 ] Allen Wittenauer commented on HADOOP-14671: --- 0.7.0 was released today. It includes a lot of key fixes. Note that releasedocmaker has changed, so this will likely be a more invasive upgrade than just increasing the version number. > Upgrade to Apache Yetus 0.7.0 > - > > Key: HADOOP-14671 > URL: https://issues.apache.org/jira/browse/HADOOP-14671 > Project: Hadoop Common > Issue Type: Improvement > Components: build, documentation, test >Affects Versions: 3.0.0-beta1 >Reporter: Allen Wittenauer >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-14671.001.patch > > > Apache Yetus 0.7.0 was released. Let's upgrade the bundled reference to the > new version. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14671) Upgrade to Apache Yetus 0.7.0
[ https://issues.apache.org/jira/browse/HADOOP-14671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-14671: -- Description: Apache Yetus 0.7.0 was released. Let's upgrade the bundled reference to the new version. (was: Apache Yetus 0.5.0 was released. Let's upgrade the bundled reference to the new version.) > Upgrade to Apache Yetus 0.7.0 > - > > Key: HADOOP-14671 > URL: https://issues.apache.org/jira/browse/HADOOP-14671 > Project: Hadoop Common > Issue Type: Improvement > Components: build, documentation, test >Affects Versions: 3.0.0-beta1 >Reporter: Allen Wittenauer >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-14671.001.patch > > > Apache Yetus 0.7.0 was released. Let's upgrade the bundled reference to the > new version. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14671) Upgrade to Apache Yetus 0.7.0
[ https://issues.apache.org/jira/browse/HADOOP-14671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HADOOP-14671: -- Summary: Upgrade to Apache Yetus 0.7.0 (was: Upgrade to Apache Yetus 0.5.1) > Upgrade to Apache Yetus 0.7.0 > - > > Key: HADOOP-14671 > URL: https://issues.apache.org/jira/browse/HADOOP-14671 > Project: Hadoop Common > Issue Type: Improvement > Components: build, documentation, test >Affects Versions: 3.0.0-beta1 >Reporter: Allen Wittenauer >Assignee: Akira Ajisaka >Priority: Major > Attachments: HADOOP-14671.001.patch > > > Apache Yetus 0.5.0 was released. Let's upgrade the bundled reference to the > new version. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-15151) MapFile.fix creates a wrong index file in case of block-compressed data file.
[ https://issues.apache.org/jira/browse/HADOOP-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343324#comment-16343324 ] Grigori Rybkine commented on HADOOP-15151: -- Thank you very much for the comment about the existing code, [~chris.douglas]. I am willing to provide a patch. Please, let me know how it would be better to proceed or maybe open a ticket and assign it to me? > MapFile.fix creates a wrong index file in case of block-compressed data file. > - > > Key: HADOOP-15151 > URL: https://issues.apache.org/jira/browse/HADOOP-15151 > Project: Hadoop Common > Issue Type: Bug > Components: common >Reporter: Grigori Rybkine >Assignee: Grigori Rybkine >Priority: Major > Labels: patch > Fix For: 2.9.1 > > Attachments: HADOOP-15151.001.patch, HADOOP-15151.002.patch, > HADOOP-15151.003.patch, HADOOP-15151.004.patch, HADOOP-15151.004.patch, > HADOOP-15151.005.patch > > > Index file created with MapFile.fix for an ordered block-compressed data file > does not allow to find values for keys existing in the data file via the > MapFile.get method. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
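A hedged reproduction sketch of the symptom in the summary: rebuild the index with MapFile.fix over a block-compressed MapFile directory, then look a known key up again. The directory path and key below are hypothetical placeholders.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

// Hedged sketch: after regenerating the index with MapFile.fix, a
// block-compressed MapFile should still return values for existing keys.
public class MapFileFixCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.getLocal(conf);
    Path dir = new Path("/tmp/example-mapfile");      // hypothetical MapFile directory
    MapFile.fix(fs, dir, Text.class, Text.class, false, conf);

    MapFile.Reader reader = new MapFile.Reader(dir, conf);
    try {
      Text value = new Text();
      // Before the fix in this issue, this could return null for keys that
      // are present in the block-compressed data file.
      System.out.println(reader.get(new Text("some-key"), value));
    } finally {
      reader.close();
    }
  }
}
{code}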
[jira] [Commented] (HADOOP-15151) MapFile.fix creates a wrong index file in case of block-compressed data file.
[ https://issues.apache.org/jira/browse/HADOOP-15151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343316#comment-16343316 ] Grigori Rybkine commented on HADOOP-15151: -- Thank you, [~chris.douglas], very much for reviewing and committing the patch. > MapFile.fix creates a wrong index file in case of block-compressed data file. > - > > Key: HADOOP-15151 > URL: https://issues.apache.org/jira/browse/HADOOP-15151 > Project: Hadoop Common > Issue Type: Bug > Components: common >Reporter: Grigori Rybkine >Assignee: Grigori Rybkine >Priority: Major > Labels: patch > Fix For: 2.9.1 > > Attachments: HADOOP-15151.001.patch, HADOOP-15151.002.patch, > HADOOP-15151.003.patch, HADOOP-15151.004.patch, HADOOP-15151.004.patch, > HADOOP-15151.005.patch > > > Index file created with MapFile.fix for an ordered block-compressed data file > does not allow to find values for keys existing in the data file via the > MapFile.get method. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org