[jira] [Assigned] (HADOOP-11223) Offer a read-only conf alternative to new Configuration()

2019-02-09 Thread Harsh J (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reassigned HADOOP-11223:


Assignee: Michael Miller  (was: Varun Saxena)

> Offer a read-only conf alternative to new Configuration()
> -
>
> Key: HADOOP-11223
> URL: https://issues.apache.org/jira/browse/HADOOP-11223
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: conf
>Reporter: Gopal V
>Assignee: Michael Miller
>Priority: Major
>  Labels: Performance
> Attachments: HADOOP-11223.001.patch
>
>
> new Configuration() is called from several static blocks across Hadoop.
> This is incredibly inefficient, since each one of those involves primarily 
> XML parsing at a point where the JIT won't be triggered & interpreter mode is 
> essentially forced on the JVM.
> An alternative would be to offer a {{Configuration::getDefault()}} method 
> which disallows any modifications.
> At the very least, such a method would need to be called from 
> # org.apache.hadoop.io.nativeio.NativeIO::<clinit>()
> # org.apache.hadoop.security.SecurityUtil::<clinit>()
> # org.apache.hadoop.yarn.factory.providers.RecordFactoryProvider::<clinit>()
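
A minimal sketch of what such a read-only default accessor could look like follows; the class name, the {{getDefault()}} entry point and the {{UnsupportedOperationException}} behaviour are assumptions drawn from this proposal, not existing Hadoop API:

{code}
// Hypothetical sketch only -- Configuration has no getDefault() today; the names
// and behaviour here are assumptions based on the proposal above.
import org.apache.hadoop.conf.Configuration;

public final class ReadOnlyConfiguration extends Configuration {
  // Parse the default XML resources once, instead of once per static block.
  private static final ReadOnlyConfiguration INSTANCE = new ReadOnlyConfiguration();

  public static Configuration getDefault() {
    return INSTANCE;
  }

  private ReadOnlyConfiguration() {
    super(true); // load the usual default resources
  }

  @Override
  public void set(String name, String value) {
    throw new UnsupportedOperationException("This configuration is read-only");
  }
}
{code}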






[jira] [Assigned] (HADOOP-12549) Extend HDFS-7456 default generically to all pattern lookups

2018-05-15 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reassigned HADOOP-12549:


Assignee: (was: Harsh J)

> Extend HDFS-7456 default generically to all pattern lookups
> ---
>
> Key: HADOOP-12549
> URL: https://issues.apache.org/jira/browse/HADOOP-12549
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, security
>Affects Versions: 2.7.1
>Reporter: Harsh J
>Priority: Minor
> Attachments: HADOOP-12549.patch
>
>
> In HDFS-7546 we added an hdfs-default.xml property to bring back the regular 
> behaviour of trusting all principals (as was the case before HADOOP-9789). 
> However, the change only targeted HDFS users, and only those that used 
> the default-loading mechanism of the Configuration class (i.e. not {{new 
> Configuration(false)}} users).
> I'd like to propose adding the same default to the generic RPC client code 
> as well, so that the default affects all forms of clients equally.
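
To illustrate the gap, a small sketch (the property key is the one introduced by HDFS-7546; treat the exact key and its "*" default as assumptions here):

{code}
// Sketch of the behaviour described above; key name and default assumed from HDFS-7546.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class PrincipalPatternGap {
  public static void main(String[] args) {
    // A client that loads the default resources picks up the permissive "*"
    // default that HDFS-7546 placed in hdfs-default.xml.
    Configuration withDefaults = new HdfsConfiguration();
    System.out.println(withDefaults.get("dfs.namenode.kerberos.principal.pattern"));

    // A new Configuration(false) client never loads hdfs-default.xml, so it
    // still enforces the strict HADOOP-9789 matching -- hence the proposal to
    // move the default into the generic RPC client code.
    Configuration noDefaults = new Configuration(false);
    System.out.println(noDefaults.get("dfs.namenode.kerberos.principal.pattern")); // null
  }
}
{code}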






[jira] [Assigned] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2018-05-15 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reassigned HADOOP-13694:


Assignee: (was: Harsh J)

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Priority: Major
>
> Configuring AES 128 or AES 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
>   at 
> 
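
For context, the cipher strength here is selected by a single configuration key on both the client and the datanode; a minimal sketch using the values from this report (the surrounding encryption flag is assumed context):

{code}
// Sketch only: shows the key discussed in this report. 128 and 256 work with the
// OpenSSL-backed codec; 192 produces the "Invalid key length" error above.
import org.apache.hadoop.conf.Configuration;

public class CipherBitlengthExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.encrypt.data.transfer", true);      // assumed to already be enabled
    conf.setInt("dfs.encrypt.data.transfer.cipher.key.bitlength", 192); // fails; use 128 or 256
    System.out.println(conf.getInt("dfs.encrypt.data.transfer.cipher.key.bitlength", 128));
  }
}
{code}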

[jira] [Updated] (HADOOP-6801) io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

2017-03-03 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-6801:

Release Note: 
Two new configuration keys, seq.io.sort.mb and seq.io.sort.factor, have been 
introduced for SequenceFile's Sorter feature to replace the older, deprecated 
property keys io.sort.mb and io.sort.factor.

This only affects direct users of the org.apache.hadoop.io.SequenceFile.Sorter 
Java class. For controlling MR2's internal sorting instead, use the existing 
config keys of mapreduce.task.io.sort.mb and mapreduce.task.io.sort.factor.

Thank you [~ajisakaa]!
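
For direct users of the Sorter, the switch is a matter of setting the new keys before constructing it; a brief sketch (paths and sizes are illustrative only):

{code}
// Illustrative only: shows where the new seq.io.sort.* keys apply.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SorterExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Replaces the deprecated io.sort.mb / io.sort.factor for this class only.
    conf.setInt("seq.io.sort.mb", 200);
    conf.setInt("seq.io.sort.factor", 50);

    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Sorter sorter =
        new SequenceFile.Sorter(fs, Text.class, IntWritable.class, conf);
    sorter.sort(new Path("/tmp/unsorted.seq"), new Path("/tmp/sorted.seq"));
  }
}
{code}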

> io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are 
> still in CommonConfigurationKeysPublic.java and used in SequenceFile.java
> ---
>
> Key: HADOOP-6801
> URL: https://issues.apache.org/jira/browse/HADOOP-6801
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Erik Steffl
>Assignee: Harsh J
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: HADOOP-6801.05.patch, HADOOP-6801.r1.diff, 
> HADOOP-6801.r2.diff, HADOOP-6801.r3.diff, HADOOP-6801.r4.diff, 
> HADOOP-6801.r5.diff
>
>
> The following configuration keys in CommonConfigurationKeysPublic.java (formerly 
> CommonConfigurationKeys.java):
> public static final String  IO_SORT_MB_KEY = "io.sort.mb";
> public static final String  IO_SORT_FACTOR_KEY = "io.sort.factor";
> are partially moved:
>   - they were renamed to mapreduce.task.io.sort.mb and 
> mapreduce.task.io.sort.factor respectively
>   - they were moved to the mapreduce project and documented in mapred-default.xml
> However:
>   - they are still listed in CommonConfigurationKeysPublic.java as quoted 
> above
>   - the strings "io.sort.mb" and "io.sort.factor" are still used in SequenceFile.java 
> in the Hadoop Common project
> It is not clear what the solution is; these constants should probably be removed 
> from CommonConfigurationKeysPublic.java, but I am not sure what the best 
> solution is for SequenceFile.java.






[jira] [Updated] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13817:
-
Release Note: A newly introduced configuration key 
"hadoop.security.groups.shell.command.timeout" allows applying a finite wait 
timeout to the 'id' commands launched by the ShellBasedUnixGroupsMapping 
plugin. Values may be specified in any valid time duration unit: 
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/conf/Configuration.html#getTimeDuration-java.lang.String-long-java.util.concurrent.TimeUnit-
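
For example, a deployment could cap the lookups at 45 seconds (the value is arbitrary; any {{Configuration#getTimeDuration}} unit works) and exercise the mapping directly, as in this sketch:

{code}
// Sketch: exercises the shell-based mapping with the new timeout applied.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.Groups;

public class ShellGroupsTimeoutExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("hadoop.security.group.mapping",
        "org.apache.hadoop.security.ShellBasedUnixGroupsMapping");
    // Any unit accepted by Configuration#getTimeDuration works; 0 (the default)
    // preserves today's behaviour of waiting indefinitely.
    conf.set("hadoop.security.groups.shell.command.timeout", "45s");

    Groups groups = new Groups(conf);
    System.out.println(groups.getGroups(System.getProperty("user.name")));
  }
}
{code}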

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (it is set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end, where it is forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.






[jira] [Updated] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13817:
-
Release Note: 
A newly introduced configuration key 
"hadoop.security.groups.shell.command.timeout" allows applying a finite wait 
timeout to the 'id' commands launched by the ShellBasedUnixGroupsMapping 
plugin. Values may be specified in any valid time duration unit: 
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/conf/Configuration.html#getTimeDuration-java.lang.String-long-java.util.concurrent.TimeUnit-

The value defaults to 0, indicating an infinite wait (preserving existing behaviour).

  was:A new introduced configuration key 
"hadoop.security.groups.shell.command.timeout" allows applying a finite wait 
timeout over the 'id' commands launched by the ShellBasedUnixGroupsMapping 
plugin. Values specified can be in any valid time duration units: 
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/conf/Configuration.html#getTimeDuration-java.lang.String-long-java.util.concurrent.TimeUnit-


> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (it is set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end, where it is forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.






[jira] [Updated] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13817:
-
  Resolution: Fixed
   Fix Version/s: 3.0.0-beta1
  2.9.0
Target Version/s:   (was: 2.9.0, 3.0.0-beta1)
  Status: Resolved  (was: Patch Available)

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (it is set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end, where it is forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.






[jira] [Updated] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13817:
-
Hadoop Flags: Reviewed

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (it is set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end, where it is forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.






[jira] [Updated] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13817:
-
Target Version/s: 2.9.0, 3.0.0-beta1

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (it is set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end, where it is forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.






[jira] [Updated] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-24 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13817:
-
Attachment: HADOOP-13817.000.patch
HADOOP-13817-branch-2.000.patch

Uploading direct patches to trigger a Jenkins build.

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-13817.000.patch, HADOOP-13817-branch-2.000.patch
>
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (it is set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end, where it is forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.






[jira] [Commented] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2017-02-16 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870061#comment-15870061
 ] 

Harsh J commented on HADOOP-13694:
--

[~jojochuang] - Could you help in reviewing this?

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring AES 128 or AES 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
>   at 
> 

[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2017-02-16 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870058#comment-15870058
 ] 

Harsh J commented on HADOOP-13817:
--

[~jojochuang] - Could you help in reviewing this? Current diff is at 
https://github.com/apache/hadoop/pull/161

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (it is set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end, where it is forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.






[jira] [Resolved] (HADOOP-10434) Is it possible to use "df" to calculate the dfs usage instead of "du"

2016-12-18 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HADOOP-10434.
--
Resolution: Duplicate

> Is it possible to use "df" to calculate the dfs usage instead of "du"
> -
>
> Key: HADOOP-10434
> URL: https://issues.apache.org/jira/browse/HADOOP-10434
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.3.0
>Reporter: MaoYuan Xian
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.8.0
>
> Attachments: HADOOP-10434-1.patch
>
>
> When we run the datanode on a machine with a big disk volume, we find that the du 
> operations from org.apache.hadoop.fs.DU's DURefreshThread cost a lot of disk 
> performance.
> As we use the whole disk for HDFS storage, it is possible to calculate volume 
> usage via the "df" command. Would it make sense to add a "df" option for usage 
> calculation in HDFS 
> (org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice)?
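
A rough sketch of the trade-off being proposed, using the existing DU and DF helpers (constructor shapes follow the 2.x code base and should be treated as assumptions):

{code}
// Rough illustration only; API shapes assumed from the 2.x code base.
import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.DF;
import org.apache.hadoop.fs.DU;

public class DuVsDfExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    File volume = new File("/data/1/dfs/dn");

    // DU walks the whole directory tree -- expensive on big volumes, as reported above.
    DU du = new DU(volume, conf);
    du.start();
    System.out.println("du-based usage: " + du.getUsed());

    // DF asks the filesystem for the mount point's usage -- cheap, but only accurate
    // when the volume is dedicated to HDFS block storage.
    DF df = new DF(volume, conf);
    System.out.println("df-based usage: " + df.getUsed());
  }
}
{code}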






[jira] [Reopened] (HADOOP-10434) Is it possible to use "df" to calculate the dfs usage instead of "du"

2016-12-18 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reopened HADOOP-10434:
--

Reopening to close with a Duplicate resolution instead of Fixed.

> Is it possible to use "df" to calculate the dfs usage instead of "du"
> -
>
> Key: HADOOP-10434
> URL: https://issues.apache.org/jira/browse/HADOOP-10434
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.3.0
>Reporter: MaoYuan Xian
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10434-1.patch
>
>
> When we run the datanode on a machine with a big disk volume, we find that the du 
> operations from org.apache.hadoop.fs.DU's DURefreshThread cost a lot of disk 
> performance.
> As we use the whole disk for HDFS storage, it is possible to calculate volume 
> usage via the "df" command. Would it make sense to add a "df" option for usage 
> calculation in HDFS 
> (org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice)?






[jira] [Updated] (HADOOP-10434) Is it possible to use "df" to calculate the dfs usage instead of "du"

2016-12-18 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-10434:
-
Fix Version/s: (was: 2.8.0)

> Is it possible to use "df" to calculate the dfs usage instead of "du"
> -
>
> Key: HADOOP-10434
> URL: https://issues.apache.org/jira/browse/HADOOP-10434
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.3.0
>Reporter: MaoYuan Xian
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-10434-1.patch
>
>
> When we run the datanode on a machine with a big disk volume, we find that the du 
> operations from org.apache.hadoop.fs.DU's DURefreshThread cost a lot of disk 
> performance.
> As we use the whole disk for HDFS storage, it is possible to calculate volume 
> usage via the "df" command. Would it make sense to add a "df" option for usage 
> calculation in HDFS 
> (org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice)?






[jira] [Updated] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable

2016-11-25 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-1381:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-beta1
 Release Note: The default sync interval within new SequenceFile writes is 
now 100KB, up from the older default of 2000B. The sync interval is now also 
manually configurable via the SequenceFile.Writer API.  (was: Made sync 
interval of sequencefiles configurable and raised default from 2000 bytes to 
100 kilobytes, to optimize for large files.)
   Status: Resolved  (was: Patch Available)

Thank you [~ajisakaa]! Pushed to trunk.
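
A short sketch of the Writer-level control described in the release note; the {{syncInterval(...)}} option name is assumed here and should be treated as such, while the remaining options are the standard createWriter pattern:

{code}
// Sketch: raising the sync interval for a very large file. The syncInterval(...)
// option name is an assumption; the remaining options are standard.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SyncIntervalExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(new Path("/tmp/example.seq")),
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(IntWritable.class),
        // Default is now 100KB (previously 2000 bytes).
        SequenceFile.Writer.syncInterval(4 * 1024 * 1024))) {
      writer.append(new Text("key"), new IntWritable(1));
    }
  }
}
{code}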

> The distance between sync blocks in SequenceFiles should be configurable
> 
>
> Key: HADOOP-1381
> URL: https://issues.apache.org/jira/browse/HADOOP-1381
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 2.0.0-alpha
>Reporter: Owen O'Malley
>Assignee: Harsh J
> Fix For: 3.0.0-beta1
>
> Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
> HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, 
> HADOOP-1381.r5.diff
>
>
> Currently SequenceFiles insert sync blocks every 2000 bytes. It would be much 
> better if this were configurable, with a much higher default (1 MB or so?).






[jira] [Commented] (HADOOP-6801) io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

2016-11-25 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15696276#comment-15696276
 ] 

Harsh J commented on HADOOP-6801:
-

Thank you [~ajisakaa], updated the commit with that addressed.

> io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are 
> still in CommonConfigurationKeysPublic.java and used in SequenceFile.java
> ---
>
> Key: HADOOP-6801
> URL: https://issues.apache.org/jira/browse/HADOOP-6801
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Erik Steffl
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-6801.05.patch, HADOOP-6801.r1.diff, 
> HADOOP-6801.r2.diff, HADOOP-6801.r3.diff, HADOOP-6801.r4.diff, 
> HADOOP-6801.r5.diff
>
>
> The following configuration keys in CommonConfigurationKeysPublic.java (formerly 
> CommonConfigurationKeys.java):
> public static final String  IO_SORT_MB_KEY = "io.sort.mb";
> public static final String  IO_SORT_FACTOR_KEY = "io.sort.factor";
> are partially moved:
>   - they were renamed to mapreduce.task.io.sort.mb and 
> mapreduce.task.io.sort.factor respectively
>   - they were moved to the mapreduce project and documented in mapred-default.xml
> However:
>   - they are still listed in CommonConfigurationKeysPublic.java as quoted 
> above
>   - the strings "io.sort.mb" and "io.sort.factor" are still used in SequenceFile.java 
> in the Hadoop Common project
> It is not clear what the solution is; these constants should probably be removed 
> from CommonConfigurationKeysPublic.java, but I am not sure what the best 
> solution is for SequenceFile.java.






[jira] [Commented] (HADOOP-12549) Extend HDFS-7456 default generically to all pattern lookups

2016-11-24 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694990#comment-15694990
 ] 

Harsh J commented on HADOOP-12549:
--

[~daryn] / [~yzhangal] - Could you help take a look at this one?

> Extend HDFS-7456 default generically to all pattern lookups
> ---
>
> Key: HADOOP-12549
> URL: https://issues.apache.org/jira/browse/HADOOP-12549
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, security
>Affects Versions: 2.7.1
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-12549.patch
>
>
> In HDFS-7546 we added an hdfs-default.xml property to bring back the regular 
> behaviour of trusting all principals (as was the case before HADOOP-9789). 
> However, the change only targeted HDFS users, and only those that used 
> the default-loading mechanism of the Configuration class (i.e. not {{new 
> Configuration(false)}} users).
> I'd like to propose adding the same default to the generic RPC client code 
> as well, so that the default affects all forms of clients equally.






[jira] [Commented] (HADOOP-6801) io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

2016-11-24 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694987#comment-15694987
 ] 

Harsh J commented on HADOOP-6801:
-

[~cnauroth] / [~ajisakaa] - Could you help take a look at this one?

> io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are 
> still in CommonConfigurationKeysPublic.java and used in SequenceFile.java
> ---
>
> Key: HADOOP-6801
> URL: https://issues.apache.org/jira/browse/HADOOP-6801
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Erik Steffl
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-6801.05.patch, HADOOP-6801.r1.diff, 
> HADOOP-6801.r2.diff, HADOOP-6801.r3.diff, HADOOP-6801.r4.diff, 
> HADOOP-6801.r5.diff
>
>
> The following configuration keys in CommonConfigurationKeysPublic.java (formerly 
> CommonConfigurationKeys.java):
> public static final String  IO_SORT_MB_KEY = "io.sort.mb";
> public static final String  IO_SORT_FACTOR_KEY = "io.sort.factor";
> are partially moved:
>   - they were renamed to mapreduce.task.io.sort.mb and 
> mapreduce.task.io.sort.factor respectively
>   - they were moved to the mapreduce project and documented in mapred-default.xml
> However:
>   - they are still listed in CommonConfigurationKeysPublic.java as quoted 
> above
>   - the strings "io.sort.mb" and "io.sort.factor" are still used in SequenceFile.java 
> in the Hadoop Common project
> It is not clear what the solution is; these constants should probably be removed 
> from CommonConfigurationKeysPublic.java, but I am not sure what the best 
> solution is for SequenceFile.java.






[jira] [Updated] (HADOOP-9424) The "hadoop jar" invocation should include the passed jar on the classpath as a whole

2016-11-24 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-9424:

Assignee: (was: Harsh J)

> The "hadoop jar" invocation should include the passed jar on the classpath as 
> a whole
> -
>
> Key: HADOOP-9424
> URL: https://issues.apache.org/jira/browse/HADOOP-9424
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Affects Versions: 2.0.3-alpha
>Reporter: Harsh J
>Priority: Minor
> Attachments: HADOOP-9424.patch
>
>
> When you have a case such as this:
> {{X.jar -> Classes = Main, Foo}}
> {{Y.jar -> Classes = Bar}}
> With implementation details such as:
> * Main references Bar and invokes a public, static method on it.
> * Bar does a class lookup to find Foo (Class.forName("Foo")).
> Then when you do a {{HADOOP_CLASSPATH=Y.jar hadoop jar X.jar Main}}, 
> Bar's method fails with a ClassNotFoundException because of the way RunJar 
> runs.
> RunJar extracts the passed jar and includes its contents on the ClassLoader 
> of its current thread, but the {{Class.forName(…)}} call from another class 
> does not check that class loader and hence cannot find the class, as it is not 
> on any classpath it is aware of.
> The "hadoop jar" script should ideally add the passed jar argument to 
> the CLASSPATH before RunJar is invoked, for the above case to pass.
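
A distilled illustration of the failure mode (Main, Foo and Bar are the hypothetical classes from the description, not Hadoop code):

{code}
// Bar ships in Y.jar (system classpath via HADOOP_CLASSPATH); Main and Foo ship
// in X.jar, which RunJar loads through its own class loader.
public class Bar {
  public static void lookupFoo() throws ClassNotFoundException {
    // Resolves against Bar's defining class loader, which never saw X.jar,
    // so this throws ClassNotFoundException even though Main can see Foo.
    Class.forName("Foo");

    // A caller-side workaround would be to pass the thread context loader
    // (which RunJar does point at the extracted jar):
    // Class.forName("Foo", true, Thread.currentThread().getContextClassLoader());
  }
}
{code}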






[jira] [Commented] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-24 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694982#comment-15694982
 ] 

Harsh J commented on HADOOP-13817:
--

[~xyao] - Thanks for the reviews so far. Could you take a look at the current 
patch-set?

[~yzhangal] - Could you help take a look at this one?

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (it is set to 0, which 
> implies infinite).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end, where it is forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return the result equivalent of no groups found.






[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable

2016-11-24 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694979#comment-15694979
 ] 

Harsh J commented on HADOOP-1381:
-

[~ajisakaa] / [~ozawa] - Could you help take a look at this one?

> The distance between sync blocks in SequenceFiles should be configurable
> 
>
> Key: HADOOP-1381
> URL: https://issues.apache.org/jira/browse/HADOOP-1381
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 2.0.0-alpha
>Reporter: Owen O'Malley
>Assignee: Harsh J
> Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
> HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, 
> HADOOP-1381.r5.diff
>
>
> Currently SequenceFiles insert sync blocks every 2000 bytes. It would be much 
> better if this were configurable, with a much higher default (1 MB or so?).






[jira] [Commented] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2016-11-24 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15694976#comment-15694976
 ] 

Harsh J commented on HADOOP-13694:
--

[~cnauroth] / [~hitliuyi] - Would you be able to take a look at this one?

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring AES 128 or AES 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
>   at 

[jira] [Commented] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable

2016-11-15 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669509#comment-15669509
 ] 

Harsh J commented on HADOOP-1381:
-

Fixed all javac and checkstyle issues except this one:

{code}
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java:1264:
void init(Configuration config, FSDataOutputStream outStream,:10: More than 
7 parameters (found 8).
{code}

This can only be cleaned up when the deprecated API constructors are removed.

> The distance between sync blocks in SequenceFiles should be configurable
> 
>
> Key: HADOOP-1381
> URL: https://issues.apache.org/jira/browse/HADOOP-1381
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 2.0.0-alpha
>Reporter: Owen O'Malley
>Assignee: Harsh J
> Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
> HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, 
> HADOOP-1381.r5.diff
>
>
> Currently SequenceFiles insert sync blocks every 2000 bytes. It would be much 
> better if this were configurable, with a much higher default (1 MB or so?).






[jira] [Updated] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable

2016-11-15 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-1381:

Summary: The distance between sync blocks in SequenceFiles should be 
configurable  (was: The distance between sync blocks in SequenceFiles should be 
configurable rather than hard coded to 2000 bytes)

> The distance between sync blocks in SequenceFiles should be configurable
> 
>
> Key: HADOOP-1381
> URL: https://issues.apache.org/jira/browse/HADOOP-1381
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 2.0.0-alpha
>Reporter: Owen O'Malley
>Assignee: Harsh J
> Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
> HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, 
> HADOOP-1381.r5.diff
>
>
> Currently SequenceFiles insert sync blocks every 2000 bytes. It would be much 
> better if this were configurable, with a much higher default (1 MB or so?).






[jira] [Updated] (HADOOP-6801) io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

2016-11-14 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-6801:

Affects Version/s: (was: 0.22.0)
   2.0.0-alpha
 Target Version/s:   (was: )
   Status: Open  (was: Patch Available)

> io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are 
> still in CommonConfigurationKeysPublic.java and used in SequenceFile.java
> ---
>
> Key: HADOOP-6801
> URL: https://issues.apache.org/jira/browse/HADOOP-6801
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Erik Steffl
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-6801.05.patch, HADOOP-6801.r1.diff, 
> HADOOP-6801.r2.diff, HADOOP-6801.r3.diff, HADOOP-6801.r4.diff, 
> HADOOP-6801.r5.diff
>
>
> The following configuration keys in CommonConfigurationKeysPublic.java (formerly 
> CommonConfigurationKeys.java):
> public static final String  IO_SORT_MB_KEY = "io.sort.mb";
> public static final String  IO_SORT_FACTOR_KEY = "io.sort.factor";
> are partially moved:
>   - they were renamed to mapreduce.task.io.sort.mb and 
> mapreduce.task.io.sort.factor respectively
>   - they were moved to the mapreduce project and documented in mapred-default.xml
> However:
>   - they are still listed in CommonConfigurationKeysPublic.java as quoted 
> above
>   - the strings "io.sort.mb" and "io.sort.factor" are still used in SequenceFile.java 
> in the Hadoop Common project
> It is not clear what the solution is; these constants should probably be removed 
> from CommonConfigurationKeysPublic.java, but I am not sure what the best 
> solution is for SequenceFile.java.






[jira] [Updated] (HADOOP-6801) io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

2016-11-14 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-6801:

Status: Patch Available  (was: Open)

> io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are 
> still in CommonConfigurationKeysPublic.java and used in SequenceFile.java
> ---
>
> Key: HADOOP-6801
> URL: https://issues.apache.org/jira/browse/HADOOP-6801
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha
>Reporter: Erik Steffl
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-6801.05.patch, HADOOP-6801.r1.diff, 
> HADOOP-6801.r2.diff, HADOOP-6801.r3.diff, HADOOP-6801.r4.diff, 
> HADOOP-6801.r5.diff
>
>
> The following configuration keys in CommonConfigurationKeysPublic.java (formerly 
> CommonConfigurationKeys.java):
> public static final String  IO_SORT_MB_KEY = "io.sort.mb";
> public static final String  IO_SORT_FACTOR_KEY = "io.sort.factor";
> are partially moved:
>   - they were renamed to mapreduce.task.io.sort.mb and 
> mapreduce.task.io.sort.factor respectively
>   - they were moved to the mapreduce project and documented in mapred-default.xml
> However:
>   - they are still listed in CommonConfigurationKeysPublic.java as quoted 
> above
>   - the strings "io.sort.mb" and "io.sort.factor" are still used in SequenceFile.java 
> in the Hadoop Common project
> It is not clear what the solution is; these constants should probably be removed 
> from CommonConfigurationKeysPublic.java, but I am not sure what the best 
> solution is for SequenceFile.java.






[jira] [Updated] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2016-11-14 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13694:
-
Status: Open  (was: Patch Available)

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring AES 128 or AES 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
>   at 
> 

[jira] [Updated] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2016-11-14 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13694:
-
Status: Patch Available  (was: Open)

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring aes 128 or aes 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
>   at 
> 

[jira] [Updated] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping

2016-11-14 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13817:
-
Status: Patch Available  (was: Open)

> Add a finite shell command timeout to ShellBasedUnixGroupsMapping
> -
>
> Key: HADOOP-13817
> URL: https://issues.apache.org/jira/browse/HADOOP-13817
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.6.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> The ShellBasedUnixGroupsMapping runs various {{id}} commands via the 
> ShellCommandExecutor modules without a timeout set (it is set to 0, which 
> implies an infinite wait).
> If this command hangs for a long time on the OS end due to an unresponsive 
> groups backend or other reasons, it also blocks the handlers that use it on 
> the NameNode (or other services that use this class). That inadvertently 
> causes odd timeout troubles on the client end, where it is forced to retry (only 
> to likely run into such hangs again with every attempt until at least one 
> command returns).
> It would be helpful to have a finite command timeout after which we may give 
> up on the command and return a result equivalent to no groups found.
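
A minimal sketch of the proposed behaviour, assuming the ShellCommandExecutor constructor that accepts a timeout in milliseconds; the class name and the timeout value are illustrative, not the committed patch:

{code}
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import org.apache.hadoop.util.Shell;
import org.apache.hadoop.util.Shell.ShellCommandExecutor;

// Illustrative only: run the group lookup with a finite timeout and return
// "no groups" instead of blocking handler threads when the command hangs.
public class TimedGroupLookup {
  private static final long TIMEOUT_MS = 60_000L;  // hypothetical default

  static List<String> getGroups(String user) {
    ShellCommandExecutor exec = new ShellCommandExecutor(
        Shell.getGroupsForUserCommand(user), null, null, TIMEOUT_MS);
    try {
      exec.execute();
    } catch (IOException e) {
      return Collections.emptyList();  // failed or killed after the timeout
    }
    if (exec.isTimedOut()) {
      return Collections.emptyList();  // expired without producing output
    }
    return Arrays.asList(exec.getOutput().trim().split("\\s+"));
  }
}
{code}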



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-6801) io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

2016-11-01 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15625461#comment-15625461
 ] 

Harsh J commented on HADOOP-6801:
-

- Updated patch addresses the checkstyle issues

> io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are 
> still in CommonConfigurationKeysPublic.java and used in SequenceFile.java
> ---
>
> Key: HADOOP-6801
> URL: https://issues.apache.org/jira/browse/HADOOP-6801
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Erik Steffl
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-6801.05.patch, HADOOP-6801.r1.diff, 
> HADOOP-6801.r2.diff, HADOOP-6801.r3.diff, HADOOP-6801.r4.diff, 
> HADOOP-6801.r5.diff
>
>
> Following configuration keys in CommonConfigurationKeysPublic.java (former 
> CommonConfigurationKeys.java):
> public static final String  IO_SORT_MB_KEY = "io.sort.mb";
> public static final String  IO_SORT_FACTOR_KEY = "io.sort.factor";
> are partially moved:
>   - they were renamed to mapreduce.task.io.sort.mb and 
> mapreduce.task.io.sort.factor respectively
>   - they were moved to mapreduce project, documented in mapred-default.xml
> However:
>   - they are still listed in CommonConfigurationKeysPublic.java as quoted 
> above
>   - strings "io.sort.mb" and "io.sort.factor" are used in SequenceFile.java 
> in Hadoop Common project
> Not sure what the solution is, these constants should probably be removed 
> from CommonConfigurationKeysPublic.java but I am not sure what's the best 
> solution for SequenceFile.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-1381:

Release Note: Made sync interval of sequencefiles configurable and raised 
default from 2000 bytes to 100 kilobytes, to optimize for large files.  (was: 
Made sync interval of sequencefiles configurable and raised default from 100 
bytes to 100 kilobytes, to optimize for large files.)

> The distance between sync blocks in SequenceFiles should be configurable 
> rather than hard coded to 2000 bytes
> -
>
> Key: HADOOP-1381
> URL: https://issues.apache.org/jira/browse/HADOOP-1381
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 2.0.0-alpha
>Reporter: Owen O'Malley
>Assignee: Harsh J
> Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
> HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, 
> HADOOP-1381.r5.diff
>
>
> Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much 
> better if it was configurable with a much higher default (1mb or so?).
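
Back-of-the-envelope arithmetic on why the default matters, assuming the roughly 20-byte sync marker (4-byte escape plus 16-byte hash) that SequenceFile writes about every 2000 bytes; the numbers are approximate since syncs land on record boundaries:

{code}
// Rough overhead of sync markers per written gigabyte at two intervals.
public class SyncOverhead {
  static long overheadBytes(long fileSize, long syncInterval, int syncSize) {
    return (fileSize / syncInterval) * syncSize;
  }

  public static void main(String[] args) {
    long oneGb = 1L << 30;
    // ~10 MB of markers per GB at the hard-coded 2000-byte interval
    System.out.println(overheadBytes(oneGb, 2000, 20) / (1 << 20) + " MB");
    // ~200 KB of markers per GB at a 100 KB interval
    System.out.println(overheadBytes(oneGb, 100 * 1024, 20) / 1024 + " KB");
  }
}
{code}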



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-1381) The distance between sync blocks in SequenceFiles should be configurable rather than hard coded to 2000 bytes

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-1381:

Target Version/s:   (was: )
  Status: Patch Available  (was: Open)

> The distance between sync blocks in SequenceFiles should be configurable 
> rather than hard coded to 2000 bytes
> -
>
> Key: HADOOP-1381
> URL: https://issues.apache.org/jira/browse/HADOOP-1381
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: io
>Affects Versions: 2.0.0-alpha
>Reporter: Owen O'Malley
>Assignee: Harsh J
> Attachments: HADOOP-1381.r1.diff, HADOOP-1381.r2.diff, 
> HADOOP-1381.r3.diff, HADOOP-1381.r4.diff, HADOOP-1381.r5.diff, 
> HADOOP-1381.r5.diff
>
>
> Currently SequenceFiles put in sync blocks every 2000 bytes. It would be much 
> better if it was configurable with a much higher default (1mb or so?).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-8134) DNS claims to return a hostname but returns a PTR record in some cases

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HADOOP-8134.
-
Resolution: Not A Problem
  Assignee: (was: Harsh J)

This hasn't proven to be a problem of late. Closing as stale.

> DNS claims to return a hostname but returns a PTR record in some cases
> --
>
> Key: HADOOP-8134
> URL: https://issues.apache.org/jira/browse/HADOOP-8134
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.23.0
>Reporter: Harsh J
>Priority: Minor
>
> Per Shrijeet on HBASE-4109:
> {quote}
> If you are using an interface other than 'default' (literally that 
> keyword), DNS.java's getDefaultHost will return a string with a 
> trailing period at the end. The javadoc of reverseDns in DNS.java (see 
> below) conflicts with what that function is actually doing: 
> it returns a PTR record while claiming to return a hostname. The PTR 
> record always has a period at the end, RFC: 
> http://irbs.net/bog-4.9.5/bog47.html
> We call DNS.getDefaultHost in more than one place and treat the result as the 
> actual hostname.
> Quoting HRegionServer for example
> String machineName = DNS.getDefaultHost(conf.get(
> "hbase.regionserver.dns.interface", "default"), conf.get(
> "hbase.regionserver.dns.nameserver", "default"));
> We may want to sanitize the string returned from the DNS class. Or, better, we 
> can overhaul the way we do DNS name matching all over.
> {quote}
> While HBase has worked around the issue, we should fix the methods that 
> aren't doing what they're intended to do.
> 1. We fix the method. This may be an 'incompatible change'. But I do not know 
> who outside of us uses the DNS classes.
> 2. We fix HDFS's DN at the calling end, because that is affected by the 
> trailing period in its reporting back to the NN as well (just affects NN->DN 
> weblinks, non-critical).
> For 2, we can close this and open a HDFS JIRA.
> Thoughts?
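
For reference, a caller-side sanitisation of the kind suggested above; this is a sketch, not HBase's actual workaround:

{code}
import java.net.UnknownHostException;
import org.apache.hadoop.net.DNS;

// Strip the trailing period a PTR-style answer may carry before using the
// value as a hostname.
public class HostnameSanitizer {
  static String getDefaultHostname(String iface, String nameserver)
      throws UnknownHostException {
    String host = DNS.getDefaultHost(iface, nameserver);  // may end with '.'
    return host.endsWith(".") ? host.substring(0, host.length() - 1) : host;
  }
}
{code}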



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-9424) The "hadoop jar" invocation should include the passed jar on the classpath as a whole

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-9424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-9424:

  Resolution: Invalid
Target Version/s:   (was: )
  Status: Resolved  (was: Patch Available)

This proposed approach does not appear to be entirely desirable.

> The "hadoop jar" invocation should include the passed jar on the classpath as 
> a whole
> -
>
> Key: HADOOP-9424
> URL: https://issues.apache.org/jira/browse/HADOOP-9424
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Affects Versions: 2.0.3-alpha
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HADOOP-9424.patch
>
>
> When you have a case such as this:
> {{X.jar -> Classes = Main, Foo}}
> {{Y.jar -> Classes = Bar}}
> With implementation details such as:
> * Main references Bar and invokes a public, static method on it.
> * Bar does a class lookup to find Foo (Class.forName("Foo")).
> Then when you do a {{HADOOP_CLASSPATH=Y.jar hadoop jar X.jar Main}}, 
> Bar's method fails with a ClassNotFoundException because of the way RunJar 
> runs.
> RunJar extracts the passed jar and includes its contents on the ClassLoader 
> of its current thread, but the {{Class.forName(…)}} call from another class 
> does not check that class loader and hence cannot find the class, as it's not 
> on any classpath it is aware of.
> The "hadoop jar" script should ideally add the passed jar argument to 
> the CLASSPATH before RunJar is invoked, for the above case to pass.
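
Besides the script change proposed above, the lookup itself can be made loader-aware; a hedged sketch of that library-side workaround (Bar and Foo are the hypothetical classes from the description):

{code}
// Resolve Foo against the thread context class loader, which RunJar points at
// the extracted jar contents, instead of Bar's own defining loader.
public class Bar {
  public static Object createFoo() throws ReflectiveOperationException {
    ClassLoader ctx = Thread.currentThread().getContextClassLoader();
    Class<?> foo = Class.forName("Foo", true, ctx);
    return foo.getDeclaredConstructor().newInstance();
  }
}
{code}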



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-6801) io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

2016-10-26 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-6801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607784#comment-15607784
 ] 

Harsh J commented on HADOOP-6801:
-

[~cnauroth] - Thanks for reviewing! I added the ordering test as well. Updated 
patch is in the PR.

> io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are 
> still in CommonConfigurationKeysPublic.java and used in SequenceFile.java
> ---
>
> Key: HADOOP-6801
> URL: https://issues.apache.org/jira/browse/HADOOP-6801
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Erik Steffl
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-6801.05.patch, HADOOP-6801.r1.diff, 
> HADOOP-6801.r2.diff, HADOOP-6801.r3.diff, HADOOP-6801.r4.diff, 
> HADOOP-6801.r5.diff
>
>
> Following configuration keys in CommonConfigurationKeysPublic.java (former 
> CommonConfigurationKeys.java):
> public static final String  IO_SORT_MB_KEY = "io.sort.mb";
> public static final String  IO_SORT_FACTOR_KEY = "io.sort.factor";
> are partially moved:
>   - they were renamed to mapreduce.task.io.sort.mb and 
> mapreduce.task.io.sort.factor respectively
>   - they were moved to mapreduce project, documented in mapred-default.xml
> However:
>   - they are still listed in CommonConfigurationKeysPublic.java as quoted 
> above
>   - strings "io.sort.mb" and "io.sort.factor" are used in SequenceFile.java 
> in Hadoop Common project
> Not sure what the solution is, these constants should probably be removed 
> from CommonConfigurationKeysPublic.java but I am not sure what's the best 
> solution for SequenceFile.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-7505) EOFException in RPC stack should have a nicer error message

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HADOOP-7505.
-
Resolution: Duplicate
  Assignee: (was: Harsh J)

This seems to be taken care (in part) via HADOOP-7346

> EOFException in RPC stack should have a nicer error message
> ---
>
> Key: HADOOP-7505
> URL: https://issues.apache.org/jira/browse/HADOOP-7505
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc
>Affects Versions: 0.23.0
>Reporter: Eli Collins
>Priority: Minor
>
> Lots of user logs involve a user running mismatched versions, and for some 
> reason or another, they get EOFException instead of a proper version mismatch 
> exception. We should be able to catch this at appropriate points, and have a 
> nicer exception message explaining that it's a possible version mismatch, or 
> that they're trying to connect to the incorrect port.
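
A sketch of the kind of wrapping being asked for, catching the bare EOFException at a connection-handling boundary; the helper name and wording are illustrative:

{code}
import java.io.EOFException;
import java.io.IOException;
import java.net.InetSocketAddress;

// Re-throw an EOFException with a hint about the two usual causes.
public class FriendlyEof {
  static IOException wrap(EOFException e, InetSocketAddress server) {
    return new IOException("Unexpected end of stream while talking to " + server
        + "; this often indicates mismatched client/server versions or a"
        + " connection to the wrong port.", e);
  }
}
{code}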



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-8579) Websites for HDFS and MapReduce both send users to video training resource which is non-public

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HADOOP-8579.
-
Resolution: Not A Problem
  Assignee: (was: Harsh J)

This does not appear to be a problem after the project re-merge.

> Websites for HDFS and MapReduce both send users to video training resource 
> which is non-public
> --
>
> Key: HADOOP-8579
> URL: https://issues.apache.org/jira/browse/HADOOP-8579
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: website
>Reporter: David L. Willson
>Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Main pages for HDFS and MapReduce send new user to unavailable training 
> resource.
> These two pages:
> http://hadoop.apache.org/mapreduce/
> http://hadoop.apache.org/hdfs/
> Link to this page:
> http://vimeo.com/3584536
> That page is not public, and not shared to all registered Vimeo users, and I 
> see nothing indicating how to ask for access to the resource.
> Please make the vids public, or remove the link of disappointment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-8863) Eclipse plugin may not be working on Juno due to changes in it

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HADOOP-8863.
-
Resolution: Won't Fix
  Assignee: (was: Harsh J)

The eclipse plugin has been formally removed.

> Eclipse plugin may not be working on Juno due to changes in it
> --
>
> Key: HADOOP-8863
> URL: https://issues.apache.org/jira/browse/HADOOP-8863
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: contrib/eclipse-plugin
>Affects Versions: 1.2.0
>Reporter: Harsh J
>
> We need to debug/investigate why it is so.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-12549) Extend HDFS-7456 default generically to all pattern lookups

2016-10-26 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607688#comment-15607688
 ] 

Harsh J commented on HADOOP-12549:
--

[~yzhangal] or [~aw] - Could you help review this one? The change should help 
some limited forms of clients that deal with multiple RM/KMS/etc. services (HDFS 
is already covered via a hdfs-default.xml change, for the most classic use case 
of DistCp).

> Extend HDFS-7456 default generically to all pattern lookups
> ---
>
> Key: HADOOP-12549
> URL: https://issues.apache.org/jira/browse/HADOOP-12549
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, security
>Affects Versions: 2.7.1
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-12549.patch
>
>
> In HDFS-7546 we added a hdfs-default.xml property to bring back the regular 
> behaviour of trusting all principals (as was the case before HADOOP-9789). 
> However, the change only targeted HDFS users and also only those that used 
> the default-loading mechanism of Configuration class (i.e. not {{new 
> Configuration(false)}} users).
> I'd like to propose adding the same default to the generic RPC client code 
> also, so the default affects all forms of clients equally.
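
A hypothetical illustration of the proposal, assuming the hdfs-default.xml property in question is dfs.namenode.kerberos.principal.pattern; the helper itself is illustrative, not actual Hadoop code:

{code}
import org.apache.hadoop.conf.Configuration;

// If no pattern is configured anywhere -- e.g. for new Configuration(false)
// users who never load hdfs-default.xml -- fall back to the permissive "*"
// that restores the pre-HADOOP-9789 trust-all behaviour.
public class PrincipalPatternDefault {
  static final String PATTERN_KEY = "dfs.namenode.kerberos.principal.pattern";

  static String getEffectivePattern(Configuration conf) {
    return conf.get(PATTERN_KEY, "*");
  }
}
{code}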



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13056) Print expected values when rejecting a server's determined principal

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13056:
-
Target Version/s:   (was: 2.9.0)

> Print expected values when rejecting a server's determined principal
> 
>
> Key: HADOOP-13056
> URL: https://issues.apache.org/jira/browse/HADOOP-13056
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Harsh J
>Priority: Trivial
> Attachments: HADOOP-13056.000.patch
>
>
> When a service principal constructed by the client from the server address does 
> not match the provided pattern or the configured principal property, the error 
> is very uninformative about the specific cause. Currently the only error printed 
> is, in both cases:
> {code}
>  java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
> hdfs/host.internal@REALM
> {code}
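
A sketch of the more informative message being requested; the exact wording and call site are illustrative, not the change that eventually landed via HADOOP-13503:

{code}
// Include what was expected alongside the rejected principal.
public class PrincipalCheck {
  static IllegalArgumentException invalidPrincipal(
      String actual, String expectedPrincipal, String pattern) {
    return new IllegalArgumentException(
        "Server has invalid Kerberos principal: " + actual
        + "; expected principal " + expectedPrincipal
        + " or a match against pattern " + pattern);
  }
}
{code}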



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13056) Print expected values when rejecting a server's determined principal

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13056:
-
Resolution: Duplicate
  Assignee: (was: Harsh J)
Status: Resolved  (was: Patch Available)

Thanks [~steve_l], sorry for the delay. This seems to have been done (in messages 
alone at least) by HADOOP-13503. Closing out as a dupe.

> Print expected values when rejecting a server's determined principal
> 
>
> Key: HADOOP-13056
> URL: https://issues.apache.org/jira/browse/HADOOP-13056
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Harsh J
>Priority: Trivial
> Attachments: HADOOP-13056.000.patch
>
>
> When a service principal constructed by the client from the server address does 
> not match the provided pattern or the configured principal property, the error 
> is very uninformative about the specific cause. Currently the only error printed 
> is, in both cases:
> {code}
>  java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
> hdfs/host.internal@REALM
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2016-10-26 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607544#comment-15607544
 ] 

Harsh J commented on HADOOP-13694:
--

[~hitliuyi] or [~jojochuang], could you help review this one? The patch extends 
the OpensslCipher implementation to cover AES-192, and adds tests for all 
available AES variants (existing tests just cover 128-bit).
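
For context, all three AES key lengths are valid on the JCE codec path; a small probe like the sketch below (not part of the patch) initialises an AES/CTR cipher with 16-, 24- and 32-byte keys, where the 24-byte case is the one the OpenSSL-backed codec used to reject. Very old JDKs may additionally need the unlimited-strength policy files for 192/256-bit keys.

{code}
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Initialise an AES/CTR cipher with 128-, 192- and 256-bit keys.
public class AesKeyLengthProbe {
  public static void main(String[] args) throws Exception {
    SecureRandom random = new SecureRandom();
    for (int bits : new int[] {128, 192, 256}) {
      byte[] key = new byte[bits / 8];
      byte[] iv = new byte[16];  // AES block size
      random.nextBytes(key);
      random.nextBytes(iv);
      Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
      cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
          new IvParameterSpec(iv));
      System.out.println("AES-" + bits + " initialised fine");
    }
  }
}
{code}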

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring aes 128 or aes 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> 

[jira] [Updated] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13694:
-
Status: Patch Available  (was: Open)

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring aes 128 or aes 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
>   at 
> 

[jira] [Updated] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2016-10-26 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13694:
-
Status: Open  (was: Patch Available)

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring aes 128 or aes 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
>   at 
> 

[jira] [Commented] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2016-10-13 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571320#comment-15571320
 ] 

Harsh J commented on HADOOP-13694:
--

- Updated the patch to address the {{cc}} warnings (w.r.t. the comparison of the 
asprintf return value)
- The new checkstyle warnings on the test were not from my changes, but the 
updated PR addresses them as well.

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring aes 128 or aes 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> 

[jira] [Updated] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2016-10-06 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13694:
-
Status: Patch Available  (was: Open)

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring aes 128 or aes 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1318)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1266)
>   at 
> 

[jira] [Moved] (HADOOP-13694) Data transfer encryption with AES 192: Invalid key length.

2016-10-06 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J moved HDFS-9878 to HADOOP-13694:


Affects Version/s: (was: 2.7.2)
   2.7.2
  Component/s: (was: hdfs-client)
   (was: datanode)
   security
   Issue Type: Improvement  (was: Bug)
  Key: HADOOP-13694  (was: HDFS-9878)
  Project: Hadoop Common  (was: Hadoop HDFS)

> Data transfer encryption with AES 192: Invalid key length.
> --
>
> Key: HADOOP-13694
> URL: https://issues.apache.org/jira/browse/HADOOP-13694
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.7.2
> Environment: OS: Ubuntu 14.04
> /hadoop-2.7.2/bin$ uname -a
> Linux wkstn-kpalaniappan 3.13.0-79-generic #123-Ubuntu SMP Fri Feb 19 
> 14:27:58 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> /hadoop-2.7.2/bin$ java -version
> java version "1.7.0_95"
> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
> Hadoop version: 2.7.2
>Reporter: Karthik Palaniappan
>Assignee: Harsh J
>
> Configuring aes 128 or aes 256 encryption 
> (dfs.encrypt.data.transfer.cipher.key.bitlength = [128, 256]) works perfectly 
> fine. Trying to use AES 192 generates this exception on the datanode:
> 16/02/29 17:34:10 ERROR datanode.DataNode: 
> wkstn-kpalaniappan:50010:DataXceiver error processing unknown operation  src: 
> /127.0.0.1:57237 dst: /127.0.0.1:50010
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.doSaslHandshake(SaslDataTransferServer.java:396)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.getEncryptedStreams(SaslDataTransferServer.java:178)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferServer.receive(SaslDataTransferServer.java:110)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:193)
>   at java.lang.Thread.run(Thread.java:745)
> And this exception on the client:
> /hadoop-2.7.2/bin$ ./hdfs dfs -copyFromLocal ~/.vimrc /vimrc
> 16/02/29 17:34:10 WARN hdfs.DFSClient: DataStreamer Exception
> java.lang.IllegalArgumentException: Invalid key length.
>   at org.apache.hadoop.crypto.OpensslCipher.init(Native Method)
>   at org.apache.hadoop.crypto.OpensslCipher.init(OpensslCipher.java:176)
>   at 
> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec$OpensslAesCtrCipher.init(OpensslAesCtrCryptoCodec.java:116)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.updateDecryptor(CryptoInputStream.java:290)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.resetStreamOffset(CryptoInputStream.java:303)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:128)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:109)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.<init>(CryptoInputStream.java:133)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.createStreamPair(DataTransferSaslUtil.java:345)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:490)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>   at 
> 

[jira] [Created] (HADOOP-13515) Redundant transitionToActive call can cause a NameNode to crash

2016-08-18 Thread Harsh J (JIRA)
Harsh J created HADOOP-13515:


 Summary: Redundant transitionToActive call can cause a NameNode to 
crash
 Key: HADOOP-13515
 URL: https://issues.apache.org/jira/browse/HADOOP-13515
 Project: Hadoop Common
  Issue Type: Bug
  Components: ha
Affects Versions: 2.5.0
Reporter: Harsh J
Priority: Minor


The situation in parts is similar to HADOOP-8217, but the cause is different 
and so is the result.

Consider this situation:

- At the beginning NN1 is Active, NN2 is Standby
- ZKFC1 faces a ZK disconnect (not a session timeout, just a socket disconnect) 
and thereby reconnects

{code}
2016-08-11 07:00:46,068 INFO org.apache.zookeeper.ClientCnxn: Client session 
timed out, have not heard from server in 4000ms for sessionid 
0x4566f0c97500bd9, closing socket connection and attempting reconnect
2016-08-11 07:00:46,169 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session 
disconnected. Entering neutral mode...
…
2016-08-11 07:00:46,610 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session 
connected.
{code}

- The reconnection on ZKFC1 triggers the elector code, and the elector 
re-run finds that NN1 should be the new active (a redundant decision, because NN1 
is already active)

{code}
2016-08-11 07:00:46,615 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
Checking for any old active which needs to be fenced...
2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old 
node exists: …
2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: But old 
node has our own data, so don't need to fence it.
{code}

- ZKFC1 sets the new ZK data, and fires a transitionToActive RPC call to NN1

{code}
2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing 
znode /hadoop-ha/nameservice1/ActiveBreadCrumb to indicate that the local node 
is the most recent active...
2016-08-11 07:00:46,649 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 175: 
Call -> nn01/10.10.10.10:8022: transitionToActive {reqInfo { reqSource: 
REQUEST_BY_ZKFC }}
{code}

- While the transitionToActive call is in progress at NN1, but 
not yet complete, the ZK session of ZKFC1 is timed out by the ZK quorum, and a 
watch notification is sent to ZKFC2

{code}
2016-08-11 07:01:00,003 DEBUG org.apache.zookeeper.ClientCnxn: Got notification 
sessionid:0x4566f0c97500bde
2016-08-11 07:01:00,004 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent 
state:SyncConnected type:NodeDeleted 
path:/hadoop-ha/nameservice1/ActiveStandbyElectorLock for sessionid 
0x4566f0c97500bde
{code}

- ZKFC2 responds by fencing NN1, i.e. marking it standby, which succeeds (NN1 hasn't handled 
the earlier transitionToActive call yet due to its busy status, but handles the 
transitionToStandby before it)

{code}
2016-08-11 07:01:00,013 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
Checking for any old active which needs to be fenced...
2016-08-11 07:01:00,018 INFO org.apache.hadoop.ha.ZKFailoverController: Should 
fence: NameNode at nn01/10.10.10.10:8022
2016-08-11 07:01:00,020 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412: 
Call -> nn01/10.10.10.10:8022: transitionToStandby {reqInfo { reqSource: 
REQUEST_BY_ZKFC }}
2016-08-11 07:01:03,880 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
transitionToStandby took 3860ms
{code}

- ZKFC2 then marks NN2 as active, and NN2 begins its transition (it is in the midst of 
it, not yet done at this point)

{code}
2016-08-11 07:01:03,894 INFO org.apache.hadoop.ha.ZKFailoverController: Trying 
to make NameNode at nn02/11.11.11.11:8022 active...
2016-08-11 07:01:03,895 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412: 
Call -> nn02/11.11.11.11:8022: transitionToActive {reqInfo { reqSource: 
REQUEST_BY_ZKFC }}
…
{code}

{code}
2016-08-11 07:01:09,558 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required 
for active state
…
2016-08-11 07:01:19,968 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
edit logs at txnid 5635
{code}

- In parallel, NN1 finally processes the transitionToActive request 
and becomes active

{code}
2016-08-11 07:01:13,281 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required 
for active state
…
2016-08-11 07:01:19,599 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
edit logs at txnid 5635
…
2016-08-11 07:01:19,602 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Starting log segment at 5635
{code}

- NN2's active transition fails as a result of this parallel active transition 
on NN1, which completed right before NN2 tries to take over

{code}
2016-08-11 07:01:19,968 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
edit logs at txnid 5635
2016-08-11 07:01:22,799 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: 
Error encountered requiring NN 

[jira] [Commented] (HADOOP-13410) RunJar adds the content of the jar twice to the classpath

2016-08-09 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413827#comment-15413827
 ] 

Harsh J commented on HADOOP-13410:
--

There is a use-case that requires something similar; see HADOOP-9424. It's not 
directly related to what is being pointed out here, though.

> RunJar adds the content of the jar twice to the classpath
> -
>
> Key: HADOOP-13410
> URL: https://issues.apache.org/jira/browse/HADOOP-13410
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: util
>Reporter: Sangjin Lee
>Assignee: Yuanbo Liu
> Attachments: HADOOP-13410.001.patch
>
>
> Today when you run a "hadoop jar" command, the jar is unzipped to a temporary 
> location and gets added to the classloader.
> However, the original jar itself is still added to the classpath.
> {code}
>   List<URL> classPath = new ArrayList<>();
>   classPath.add(new File(workDir + "/").toURI().toURL());
>   classPath.add(file.toURI().toURL());
>   classPath.add(new File(workDir, "classes/").toURI().toURL());
>   File[] libs = new File(workDir, "lib").listFiles();
>   if (libs != null) {
> for (File lib : libs) {
>   classPath.add(lib.toURI().toURL());
> }
>   }
> {code}
> As a result, the contents of the jar are present in the classpath *twice* and 
> are completely redundant. Although this does not necessarily cause 
> correctness issues, some stricter code written to require a single presence 
> of files may fail.
> I cannot think of a good reason why the jar should be added to the classpath 
> if the unjarred content was added to it. I think we should remove the jar 
> from the classpath.
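
A minimal sketch of the suggested direction (a hypothetical helper, not RunJar's actual code; it mirrors the quoted snippet but simply omits the original jar URL):

{code}
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class ClasspathSketch {
  /** Builds the classpath from the unpacked workDir only; the original jar is not re-added. */
  static List<URL> buildClassPath(File workDir) throws MalformedURLException {
    List<URL> classPath = new ArrayList<>();
    classPath.add(new File(workDir + "/").toURI().toURL());
    // The original jar file is intentionally not added here, since its unpacked
    // contents under workDir already cover it.
    classPath.add(new File(workDir, "classes/").toURI().toURL());
    File[] libs = new File(workDir, "lib").listFiles();
    if (libs != null) {
      for (File lib : libs) {
        classPath.add(lib.toURI().toURL());
      }
    }
    return classPath;
  }
}
{code}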



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13099) Globbing does not return file whose name has nonprintable character

2016-05-05 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272428#comment-15272428
 ] 

Harsh J commented on HADOOP-13099:
--

This looks like a dupe of HADOOP-13051?

> Globbing does not return file whose name has nonprintable character
> ---
>
> Key: HADOOP-13099
> URL: https://issues.apache.org/jira/browse/HADOOP-13099
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>
> In a directory, create a file with a name containing non-printable character, 
> e.g., '\r'.  {{dfs -ls dir}} can list such file, but {{dfs -ls dir/*}} can 
> not.
> {noformat}
> $ hdfs dfs -touchz /tmp/test/abc
> $ hdfs dfs -touchz $'/tmp/test/abc\rdef'
> $ hdfs dfs -ls /tmp/test
> Found 2 items
> -rw-r--r--   3 systest supergroup  0 2016-05-05 01:35 /tmp/test/abc
> def-r--r--   3 systest supergroup  0 2016-05-05 01:36 /tmp/test/abc
> $ hdfs dfs -ls /tmp/test | od -c
> 000   F   o   u   n   d   2   i   t   e   m   s  \n   -   r
> 020   w   -   r   -   -   r   -   -   3   s   y   s
> 040   t   e   s   t   s   u   p   e   r   g   r   o   u   p
> 060   0   2   0   1   6   -
> 100   0   5   -   0   5   0   1   :   3   5   /   t   m   p
> 120   /   t   e   s   t   /   a   b   c  \n   -   r   w   -   r   -
> 140   -   r   -   -   3   s   y   s   t   e   s   t
> 160   s   u   p   e   r   g   r   o   u   p
> 200   0   2   0   1   6   -   0   5   -   0
> 220   5   0   1   :   3   6   /   t   m   p   /   t   e   s
> 240   t   /   a   b   c  \r   d   e   f  \n
> 252
> $ hdfs dfs -ls /tmp/test/*
> -rw-r--r--   3 systest supergroup  0 2016-05-05 01:35 /tmp/test/abc
> {noformat}
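
For reference, the same glob can be exercised programmatically through FileSystem#globStatus, which is where the dropped match surfaces (a minimal sketch; the path and default-FS configuration are assumptions):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobCheck {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Both /tmp/test/abc and /tmp/test/abc\rdef should be returned here.
    FileStatus[] matches = fs.globStatus(new Path("/tmp/test/*"));
    if (matches != null) {
      for (FileStatus status : matches) {
        System.out.println(status.getPath());
      }
    }
  }
}
{code}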



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-13051) Test for special characters in path being respected during globPaths

2016-04-23 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13051:
-
Attachment: HADOOP-13051.000.patch

(Renamed attachment to the right project)

> Test for special characters in path being respected during globPaths
> 
>
> Key: HADOOP-13051
> URL: https://issues.apache.org/jira/browse/HADOOP-13051
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-13051.000.patch
>
>
> On {{branch-2}}, the below is the (incorrect) behaviour today, where paths 
> with special characters get dropped during globStatus calls:
> {code}
> bin/hdfs dfs -mkdir /foo
> bin/hdfs dfs -touchz /foo/foo1
> bin/hdfs dfs -touchz $'/foo/foo1\r'
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> {code}
> Whereas trunk has the right behaviour, subtly fixed via the pattern library 
> change of HADOOP-12436:
> {code}
> bin/hdfs dfs -mkdir /foo
> bin/hdfs dfs -touchz /foo/foo1
> bin/hdfs dfs -touchz $'/foo/foo1\r'
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> {code}
> (I've placed a ^M explicitly to indicate presence of the intentional hidden 
> character)
> We should still add a simple test-case to cover this situation for future 
> regressions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-13051) Test for special characters in path being respected during globPaths

2016-04-23 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13051:
-
Attachment: (was: HDFS-13051.000.patch)

> Test for special characters in path being respected during globPaths
> 
>
> Key: HADOOP-13051
> URL: https://issues.apache.org/jira/browse/HADOOP-13051
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
>
> On {{branch-2}}, the below is the (incorrect) behaviour today, where paths 
> with special characters get dropped during globStatus calls:
> {code}
> bin/hdfs dfs -mkdir /foo
> bin/hdfs dfs -touchz /foo/foo1
> bin/hdfs dfs -touchz $'/foo/foo1\r'
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> {code}
> Whereas trunk has the right behaviour, subtly fixed via the pattern library 
> change of HADOOP-12436:
> {code}
> bin/hdfs dfs -mkdir /foo
> bin/hdfs dfs -touchz /foo/foo1
> bin/hdfs dfs -touchz $'/foo/foo1\r'
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> {code}
> (I've placed a ^M explicitly to indicate presence of the intentional hidden 
> character)
> We should still add a simple test-case to cover this situation for future 
> regressions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-13056) Print expected values when rejecting a server's determined principal

2016-04-23 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13056:
-
Status: Patch Available  (was: Open)

> Print expected values when rejecting a server's determined principal
> 
>
> Key: HADOOP-13056
> URL: https://issues.apache.org/jira/browse/HADOOP-13056
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Trivial
> Attachments: HADOOP-13056.000.patch
>
>
> When an address-constructed service principal by a client does not match a 
> provided pattern or the configured principal property, the error is very 
> uninformative on what the specific cause is. Currently the only error printed 
> is, in both cases:
> {code}
>  java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
> hdfs/host.internal@REALM
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-13056) Print expected values when rejecting a server's determined principal

2016-04-23 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13056:
-
Attachment: HADOOP-13056.000.patch

> Print expected values when rejecting a server's determined principal
> 
>
> Key: HADOOP-13056
> URL: https://issues.apache.org/jira/browse/HADOOP-13056
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.5.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Trivial
> Attachments: HADOOP-13056.000.patch
>
>
> When an address-constructed service principal by a client does not match a 
> provided pattern or the configured principal property, the error is very 
> uninformative on what the specific cause is. Currently the only error printed 
> is, in both cases:
> {code}
>  java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
> hdfs/host.internal@REALM
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-13056) Print expected values when rejecting a server's determined principal

2016-04-22 Thread Harsh J (JIRA)
Harsh J created HADOOP-13056:


 Summary: Print expected values when rejecting a server's 
determined principal
 Key: HADOOP-13056
 URL: https://issues.apache.org/jira/browse/HADOOP-13056
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 2.5.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial


When an address-constructed service principal by a client does not match a 
provided pattern or the configured principal property, the error is very 
uninformative on what the specific cause is. Currently the only error printed 
is, in both cases:

{code}
 java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
hdfs/host.internal@REALM
{code}
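
A rough sketch of the kind of check and message this issue is asking for (names and exact wording are illustrative only, not the attached patch):

{code}
public class PrincipalCheckSketch {
  /** Illustrative only: reject with a message that also states what was expected. */
  static void checkServerPrincipal(String derived, String pattern, String configured) {
    boolean ok = (pattern != null && derived.matches(pattern))
        || derived.equals(configured);
    if (!ok) {
      throw new IllegalArgumentException(
          "Server has invalid Kerberos principal: " + derived
              + "; expected principal '" + configured
              + "' or a match against pattern '" + pattern + "'");
    }
  }

  public static void main(String[] args) {
    // Still fails, but now the message names the configured principal it was compared against.
    checkServerPrincipal("hdfs/host.internal@REALM", null, "hdfs/_HOST@REALM");
  }
}
{code}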



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12436) GlobPattern regex library has performance issues with wildcard characters

2016-04-22 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15255017#comment-15255017
 ] 

Harsh J commented on HADOOP-12436:
--

This change subtly fixes the issue described in HADOOP-13051 (test-case added 
there for regression's sake)

> GlobPattern regex library has performance issues with wildcard characters
> -
>
> Key: HADOOP-12436
> URL: https://issues.apache.org/jira/browse/HADOOP-12436
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.2.0, 2.7.1
>Reporter: Matthew Paduano
>Assignee: Matthew Paduano
> Fix For: 3.0.0
>
> Attachments: HADOOP-12436.01.patch, HADOOP-12436.02.patch, 
> HADOOP-12436.03.patch, HADOOP-12436.04.patch, HADOOP-12436.05.patch
>
>
> java.util.regex classes have performance problems with certain wildcard 
> patterns.  Namely, consecutive * characters in a file name (not properly 
> escaped as literals) will cause commands such as "hadoop fs -ls 
> file**name" to consume 100% CPU and probably never return in a reasonable 
> time (time scales with number of *'s). 
> Here is an example:
> {noformat}
> hadoop fs -touchz 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D\\\+\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\*\\\+\\\+\\\+...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> hadoop fs -ls 
> /user/mattp/job_1429571161900_4222-1430338332599-tda%2D%2D+**+++...%270%27%28Stage-1430338580443-39-2000-SUCCEEDED-production%2Dhigh-1430338340360.jhist
> {noformat}
> causes:
> {noformat}
> PIDCOMMAND  %CPU   TIME  
> 14526  java 100.0  01:18.85 
> {noformat}
> Not every string of *'s causes this, but the above filename reproduces this 
> reliably.
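
For intuition, here is a standalone illustration of the underlying java.util.regex behaviour (not the actual GlobPattern code): each unescaped {{*}} in the glob becomes a greedy {{.*}}, and a non-matching input forces the engine to backtrack over a combinatorial number of ways to split the input among them.

{code}
import java.util.regex.Pattern;

public class GlobBacktrackDemo {
  public static void main(String[] args) {
    // Roughly what a glob like "file**********name" compiles to: ten greedy ".*" in a row.
    StringBuilder regex = new StringBuilder("file");
    for (int i = 0; i < 10; i++) {
      regex.append(".*");
    }
    regex.append("name");

    // An input that starts with "file" but never contains "name" afterwards.
    StringBuilder input = new StringBuilder("file");
    for (int i = 0; i < 30; i++) {
      input.append('x');
    }

    long start = System.nanoTime();
    boolean matched = Pattern.matches(regex.toString(), input.toString());
    // The time grows combinatorially with the number of '*'s and the input length;
    // a few more characters or '*'s and this effectively never returns.
    System.out.printf("matched=%b after %.2f s%n", matched,
        (System.nanoTime() - start) / 1e9);
  }
}
{code}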



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-13051) Test for special characters in path being respected during globPaths

2016-04-22 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13051:
-
Attachment: HDFS-13051.000.patch

Patch that fails in {{branch-2}} but passes in trunk after the mentioned fix.

> Test for special characters in path being respected during globPaths
> 
>
> Key: HADOOP-13051
> URL: https://issues.apache.org/jira/browse/HADOOP-13051
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HDFS-13051.000.patch
>
>
> On {{branch-2}}, the below is the (incorrect) behaviour today, where paths 
> with special characters get dropped during globStatus calls:
> {code}
> bin/hdfs dfs -mkdir /foo
> bin/hdfs dfs -touchz /foo/foo1
> bin/hdfs dfs -touchz $'/foo/foo1\r'
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> {code}
> Whereas trunk has the right behaviour, subtly fixed via the pattern library 
> change of HADOOP-12436:
> {code}
> bin/hdfs dfs -mkdir /foo
> bin/hdfs dfs -touchz /foo/foo1
> bin/hdfs dfs -touchz $'/foo/foo1\r'
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> {code}
> (I've placed a ^M explicitly to indicate presence of the intentional hidden 
> character)
> We should still add a simple test-case to cover this situation for future 
> regressions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-13051) Test for special characters in path being respected during globPaths

2016-04-22 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-13051:
-
Status: Patch Available  (was: Open)

> Test for special characters in path being respected during globPaths
> 
>
> Key: HADOOP-13051
> URL: https://issues.apache.org/jira/browse/HADOOP-13051
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs
>Affects Versions: 3.0.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HDFS-13051.000.patch
>
>
> On {{branch-2}}, the below is the (incorrect) behaviour today, where paths 
> with special characters get dropped during globStatus calls:
> {code}
> bin/hdfs dfs -mkdir /foo
> bin/hdfs dfs -touchz /foo/foo1
> bin/hdfs dfs -touchz $'/foo/foo1\r'
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> {code}
> Whereas trunk has the right behaviour, subtly fixed via the pattern library 
> change of HADOOP-12436:
> {code}
> bin/hdfs dfs -mkdir /foo
> bin/hdfs dfs -touchz /foo/foo1
> bin/hdfs dfs -touchz $'/foo/foo1\r'
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> bin/hdfs dfs -ls '/foo/*'
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
> -rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
> {code}
> (I've placed a ^M explicitly to indicate presence of the intentional hidden 
> character)
> We should still add a simple test-case to cover this situation for future 
> regressions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-13051) Test for special characters in path being respected during globPaths

2016-04-22 Thread Harsh J (JIRA)
Harsh J created HADOOP-13051:


 Summary: Test for special characters in path being respected 
during globPaths
 Key: HADOOP-13051
 URL: https://issues.apache.org/jira/browse/HADOOP-13051
 Project: Hadoop Common
  Issue Type: Test
  Components: fs
Affects Versions: 3.0.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Minor


On {{branch-2}}, the below is the (incorrect) behaviour today, where paths with 
special characters get dropped during globStatus calls:

{code}
bin/hdfs dfs -mkdir /foo
bin/hdfs dfs -touchz /foo/foo1
bin/hdfs dfs -touchz $'/foo/foo1\r'
bin/hdfs dfs -ls '/foo/*'
-rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
-rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
bin/hdfs dfs -ls '/foo/*'
-rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
{code}

Whereas trunk has the right behaviour, subtly fixed via the pattern library 
change of HADOOP-12436:

{code}
bin/hdfs dfs -mkdir /foo
bin/hdfs dfs -touchz /foo/foo1
bin/hdfs dfs -touchz $'/foo/foo1\r'
bin/hdfs dfs -ls '/foo/*'
-rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
-rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
bin/hdfs dfs -ls '/foo/*'
-rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1
-rw-r--r--   3 harsh supergroup  0 2016-04-22 17:35 /foo/foo1^M
{code}

(I've placed a ^M explicitly to indicate presence of the intentional hidden 
character)

We should still add a simple test-case to cover this situation for future 
regressions.
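
Something along these lines would do (a minimal sketch against the local filesystem, with an arbitrary test path; the real test would more likely sit next to the existing glob tests and could equally use a MiniDFSCluster):

{code}
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Test;

public class TestGlobSpecialChars {
  @Test
  public void testGlobKeepsPathsWithSpecialCharacters() throws Exception {
    FileSystem fs = FileSystem.getLocal(new Configuration());
    Path dir = new Path("/tmp/globtest");
    try {
      fs.mkdirs(dir);
      fs.createNewFile(new Path(dir, "foo1"));
      fs.createNewFile(new Path(dir, "foo1\r"));   // name carries a hidden \r
      FileStatus[] matches = fs.globStatus(new Path(dir, "foo*"));
      // Both entries, including the one with the special character, must be returned.
      assertEquals(2, matches.length);
    } finally {
      fs.delete(dir, true);
    }
  }
}
{code}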



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12990) lz4 incompatibility between OS and Hadoop

2016-04-01 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222434#comment-15222434
 ] 

Harsh J commented on HADOOP-12990:
--

There also does not seem to be any benefit to using the LZ4 Frame format with 
Hadoop, based on the below point from its current documentation: 
http://cyan4973.github.io/lz4/lz4_Frame_format.html:

bq. The data format defined by this specification does not attempt to allow 
random access to compressed data.

It does, however, appear to support concatenating frames in sequential 
order, so one could technically index such a file and then make splits out of it (but 
only for concatenated files).

It doesn't sound like a lot of good would come out of the amount of work needed 
to support the frame format; LZ4 seems best used in the existing container 
formats we already have (SequenceFiles, HFiles, any file format that accepts a 
codec, or extensions of those), rather than for direct data compression (such as text files).
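
For example (a minimal sketch; the path and record types are arbitrary choices, and Lz4Codec needs the Hadoop native library to be loaded), using the existing Lz4Codec inside a SequenceFile keeps the data in a splittable container:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.Lz4Codec;

public class Lz4SequenceFileWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path("/tmp/data-lz4.seq");   // illustrative path
    // Block-compressed SequenceFile using LZ4 for the record blocks.
    try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(path),
        SequenceFile.Writer.keyClass(LongWritable.class),
        SequenceFile.Writer.valueClass(Text.class),
        SequenceFile.Writer.compression(SequenceFile.CompressionType.BLOCK, new Lz4Codec()))) {
      writer.append(new LongWritable(1L), new Text("hadoop,foo"));
    }
  }
}
{code}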

> lz4 incompatibility between OS and Hadoop
> -
>
> Key: HADOOP-12990
> URL: https://issues.apache.org/jira/browse/HADOOP-12990
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io, native
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Priority: Minor
>
> {{hdfs dfs -text}} hit exception when trying to view the compression file 
> created by Linux lz4 tool.
> The Hadoop version has HADOOP-11184 "update lz4 to r123", thus it is using 
> LZ4 library in release r123.
> Linux lz4 version:
> {code}
> $ /tmp/lz4 -h 2>&1 | head -1
> *** LZ4 Compression CLI 64-bits r123, by Yann Collet (Apr  1 2016) ***
> {code}
> Test steps:
> {code}
> $ cat 10rows.txt
> 001|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 002|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 003|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 004|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 005|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 006|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 007|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 008|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 009|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 010|c1|c2|c3|c4|c5|c6|c7|c8|c9
> $ /tmp/lz4 10rows.txt 10rows.txt.r123.lz4
> Compressed 310 bytes into 105 bytes ==> 33.87%
> $ hdfs dfs -put 10rows.txt.r123.lz4 /tmp
> $ hdfs dfs -text /tmp/10rows.txt.r123.lz4
> 16/04/01 08:19:07 INFO compress.CodecPool: Got brand-new decompressor [.lz4]
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:123)
> at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:98)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:106)
> at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:101)
> at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
> at 
> org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
> at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
> at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
> at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:118)
> at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
> at org.apache.hadoop.fs.FsShell.run(FsShell.java:315)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at org.apache.hadoop.fs.FsShell.main(FsShell.java:372)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12990) lz4 incompatibility between OS and Hadoop

2016-04-01 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221946#comment-15221946
 ] 

Harsh J commented on HADOOP-12990:
--

I don't think our Lz4Codec implementation actually uses the FRAME specification 
(http://cyan4973.github.io/lz4/lz4_Frame_format.html) when creating text-based 
files. It seems it was added as a codec for use inside block compression 
formats such as SequenceFiles/HFiles/etc., but from the looks of it was not oriented towards text 
files, or was introduced at a time when there was no FRAME 
specification for LZ4.

The lz4 utility uses the frame specification:

{code}
# cat actual-file.txt
hadoop,foo,hadoop,foo,hadoop,foo,hadoop,foo
# lz4 actual-file.txt
# lz4cat actual-file.txt.lz4
hadoop,foo,hadoop,foo,hadoop,foo,hadoop,foo
# cat actual-file.txt.lz4 | od -X
000 184d2204 15a74064 bf00 6f646168
020 662c706f 0b2c6f6f 2c500900 0a6f6f66
040  cf718d62
050
{code}

Note the header magic bytes that match the FRAME specification: {{184d2204}}, 
as per http://cyan4973.github.io/lz4/lz4_Frame_format.html

Whereas in Hadoop, we produce just the actual block compression form:

{code}
# cat StreamCompressor.java 
import org.apache.hadoop.util.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.io.compress.*;
import org.apache.hadoop.conf.*;

public class StreamCompressor {

  public static void main(String[] args) throws Exception {
    String codecClassname = args[0];
    Class<?> codecClass = Class.forName(codecClassname);
    Configuration conf = new Configuration();
    CompressionCodec codec = (CompressionCodec)
        ReflectionUtils.newInstance(codecClass, conf);

    CompressionOutputStream out = codec.createOutputStream(System.out);
    IOUtils.copyBytes(System.in, out, 4096, false);
    out.finish();
  }
}
# javac -cp $(hadoop classpath) StreamCompressor.java
# java -cp $PWD StreamCompressor < actual-file.txt > hadoop-file.txt.lz4
# cat hadoop-file.txt.lz4 | od -X
000 2c00 1500 646168bf 2c706f6f
020 2c6f6f66 5009000b 6f6f662c 000a
035
{code}

Note that we are not writing any of the FRAME required elements (magic header, 
etc.), but are only writing the compressed block directly.

Therefore, fundamentally, we are not interoperable with the {{lz4}} utility. 
The difference is very similar to GPLExtras' {{LzoCodec}} vs. 
{{LzopCodec}}: the former is just the data compression algorithm, while the 
latter is an actual framed format interoperable with the {{lzop}} CLI utility.

To make ourselves interoperable, we'll need to introduce a new frame wrapping 
codec such as {{LZ4FrameCodec}}, and users could use that when they want to 
decompress or compress text data produced/readable by {{lz4/lz4cat}} CLI 
utilities.
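
As a rough illustration of what the reader side of such a codec has to account for (not an actual implementation), the frame format can at least be recognized by its magic number before deciding how to decode:

{code}
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class Lz4FrameSniffer {
  // Magic number of the LZ4 Frame format; stored little-endian on disk as 04 22 4d 18.
  private static final int LZ4_FRAME_MAGIC = 0x184D2204;

  /** Returns true if the file starts with the LZ4 Frame magic number. */
  static boolean isLz4Frame(String file) throws IOException {
    try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
      int b0 = in.read(), b1 = in.read(), b2 = in.read(), b3 = in.read();
      if ((b0 | b1 | b2 | b3) < 0) {
        return false;                                         // fewer than 4 bytes
      }
      int magic = b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);   // little-endian
      return magic == LZ4_FRAME_MAGIC;
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(isLz4Frame(args[0]));
  }
}
{code}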

> lz4 incompatibility between OS and Hadoop
> -
>
> Key: HADOOP-12990
> URL: https://issues.apache.org/jira/browse/HADOOP-12990
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: io, native
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Priority: Minor
>
> {{hdfs dfs -text}} hit exception when trying to view the compression file 
> created by Linux lz4 tool.
> The Hadoop version has HADOOP-11184 "update lz4 to r123", thus it is using 
> LZ4 library in release r123.
> Linux lz4 version:
> {code}
> $ /tmp/lz4 -h 2>&1 | head -1
> *** LZ4 Compression CLI 64-bits r123, by Yann Collet (Apr  1 2016) ***
> {code}
> Test steps:
> {code}
> $ cat 10rows.txt
> 001|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 002|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 003|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 004|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 005|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 006|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 007|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 008|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 009|c1|c2|c3|c4|c5|c6|c7|c8|c9
> 010|c1|c2|c3|c4|c5|c6|c7|c8|c9
> $ /tmp/lz4 10rows.txt 10rows.txt.r123.lz4
> Compressed 310 bytes into 105 bytes ==> 33.87%
> $ hdfs dfs -put 10rows.txt.r123.lz4 /tmp
> $ hdfs dfs -text /tmp/10rows.txt.r123.lz4
> 16/04/01 08:19:07 INFO compress.CodecPool: Got brand-new decompressor [.lz4]
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:123)
> at 
> org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:98)
> at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
> at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:106)
> at 

[jira] [Commented] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-04-01 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221793#comment-15221793
 ] 

Harsh J commented on HADOOP-11687:
--

Oops, sorry Steve, I misunderstood the LGTM... Thanks for backporting; I wasn't 
sure what the protocol was for adding to it.

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
>Assignee: Harsh J
> Fix For: 2.8.0
>
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch, 
> HADOOP-11687.003.patch, HADOOP-11687.004.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-04-01 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-11687:
-
  Resolution: Fixed
   Fix Version/s: 2.9.0
Target Version/s:   (was: 2.9.0)
  Status: Resolved  (was: Patch Available)

Pushed to branch-2 and trunk. Resolving.

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
>Assignee: Harsh J
> Fix For: 2.9.0
>
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch, 
> HADOOP-11687.003.patch, HADOOP-11687.004.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-04-01 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-11687:
-
Target Version/s: 2.9.0

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
>Assignee: Harsh J
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch, 
> HADOOP-11687.003.patch, HADOOP-11687.004.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-04-01 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221388#comment-15221388
 ] 

Harsh J commented on HADOOP-11687:
--

Ran the S3A tests a few more times just in case; they still succeed just fine. 
Will commit this shortly.

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
>Assignee: Harsh J
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch, 
> HADOOP-11687.003.patch, HADOOP-11687.004.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-03-31 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15219672#comment-15219672
 ] 

Harsh J commented on HADOOP-11687:
--

I also did a manual "Connection: close" header-injection test prior to the clone 
point; it simulated the issue of HADOOP-12970 without the fix, but had no 
impact with the fix added.

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
>Assignee: Harsh J
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch, 
> HADOOP-11687.003.patch, HADOOP-11687.004.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-03-31 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-11687:
-
Attachment: HADOOP-11687.004.patch

* Fixes the checkstyle warning

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
>Assignee: Harsh J
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch, 
> HADOOP-11687.003.patch, HADOOP-11687.004.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-03-31 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15219523#comment-15219523
 ] 

Harsh J commented on HADOOP-11687:
--

All of the S3A tests pass with this change when running {{mvn test}} under the 
{{hadoop-aws}} module. Awaiting new Jenkins results.

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch, 
> HADOOP-11687.003.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-03-31 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reassigned HADOOP-11687:


Assignee: Harsh J

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
>Assignee: Harsh J
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch, 
> HADOOP-11687.003.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-03-31 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-11687:
-
Attachment: HADOOP-11687.003.patch

Updated patch

* Added null checks during clone on fields that are nullable, as they appear to 
cause a problem later during the CopyObject operation if a null gets through
* Fixed the findbugs suggestion
* Unrelated but added a note about core-site.xml for testing

{code}
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractRename
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 130.767 sec - 
in org.apache.hadoop.fs.contract.s3a.TestS3AContractRename
{code}

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch, 
> HADOOP-11687.003.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-03-31 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15219511#comment-15219511
 ] 

Harsh J commented on HADOOP-11687:
--

Thanks [~ste...@apache.org], I am currently running the rename operation 
tests. Will update with the test results and changes/a new patch as 
required.

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-03-31 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-11687:
-
Status: Patch Available  (was: Open)

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-03-31 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-11687:
-
Attachment: HADOOP-11687.002.patch

Updating patch to address Steve's earlier review comment.

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
> Attachments: HADOOP-11687.001.patch, HADOOP-11687.002.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting by 
> x-emc-* (like Amazon does with x-amz-*).
> Headers starting by x-emc-* should be included in the signature computation, 
> but it's not done by the Amazon S3 Java SDK (it's done by the EMC S3 SDK).
> When s3a copy an object it copies all the headers, but when the object 
> includes x-emc-* headers, it generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing the x-* which aren't x-amz-* headers from the copy would allow s3a 
> to be compatible with any object storage platform which is using proprietary 
> headers



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due connection closure

2016-03-30 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217513#comment-15217513
 ] 

Harsh J commented on HADOOP-12970:
--

Just for posterity: I did get my upstream patch into aws-sdk-java: 
https://github.com/aws/aws-sdk-java/blob/1.10.65/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/internal/AbstractS3ResponseHandler.java#L58,
 it's available in AWS Java SDK versions 1.10.65 onwards, so using that version 
in the future (within hadoop-aws) will resolve the issue too.

> Intermittent signature match failures in S3AFileSystem due connection closure
> -
>
> Key: HADOOP-12970
> URL: https://issues.apache.org/jira/browse/HADOOP-12970
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: HADOOP-12970.patch, HADOOP-12970.patch
>
>
> S3AFileSystem's use of {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail in circumstances where the connection 
> used for obtaining the metadata is closed by the server (i.e. response 
> carries a {{Connection: close}} header). Due to this header not being 
> stripped away when the {{ObjectMetadata}} is created, and due to us cloning 
> it for use in the next {{CopyObjectRequest}}, it causes the request to use 
> {{Connection: close}} headers as a part of itself.
> This causes signer related exceptions because the client now includes the 
> {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server 
> does not receive the same value for it ({{Connection}} headers are likely 
> stripped away before the S3 Server tries to match signature hashes), causing 
> a failure like below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 Server does not always add a 
> {{Connection: close}} directive in its response, but whenever we receive it 
> AND we clone it, the above exception would happen for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when a lot 
> of the MR attempt files on {{s3a://}} destination filesystem are to be moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream with AWS Java SDK to strip out the 
> {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but 
> until that release is available and can be used by us, we'll need to 
> workaround the clone approach by manually excluding the {{Connection}} header 
> (not straight-forward due to the {{metadata}} object being private with no 
> mutable access). We can remove such a change in future when there's a release 
> available with the upstream fix.
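
In the absence of that SDK release, the workaround idea looks roughly like the sketch below (illustrative only, built from ObjectMetadata's public accessors; the committed patch may differ):

{code}
import java.util.Map;

import com.amazonaws.services.s3.model.ObjectMetadata;

public class MetadataCloneSketch {
  /**
   * Illustrative only: rebuild the metadata from its raw headers while leaving out
   * the hop-by-hop "Connection" header, so it cannot leak into the signed headers
   * of the follow-up CopyObjectRequest.
   */
  static ObjectMetadata cloneWithoutConnectionHeader(ObjectMetadata source) {
    ObjectMetadata copy = new ObjectMetadata();
    for (Map.Entry<String, Object> header : source.getRawMetadata().entrySet()) {
      if (!"Connection".equalsIgnoreCase(header.getKey())) {
        copy.setHeader(header.getKey(), header.getValue());
      }
    }
    copy.setUserMetadata(source.getUserMetadata());
    return copy;
  }
}
{code}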



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due connection closure

2016-03-29 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-12970:
-
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

> Intermittent signature match failures in S3AFileSystem due connection closure
> -
>
> Key: HADOOP-12970
> URL: https://issues.apache.org/jira/browse/HADOOP-12970
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: HADOOP-12970.patch, HADOOP-12970.patch
>
>
> S3AFileSystem's use of {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail in circumstances where the connection 
> used for obtaining the metadata is closed by the server (i.e. response 
> carries a {{Connection: close}} header). Due to this header not being 
> stripped away when the {{ObjectMetadata}} is created, and due to us cloning 
> it for use in the next {{CopyObjectRequest}}, it causes the request to use 
> {{Connection: close}} headers as a part of itself.
> This causes signer related exceptions because the client now includes the 
> {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server 
> does not receive the same value for it ({{Connection}} headers are likely 
> stripped away before the S3 Server tries to match signature hashes), causing 
> a failure like below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 Server does not always add a 
> {{Connection: close}} directive in its response, but whenever we receive it 
> AND we clone it, the above exception would happen for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when a lot 
> of the MR attempt files on {{s3a://}} destination filesystem are to be moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream with AWS Java SDK to strip out the 
> {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but 
> until that release is available and can be used by us, we'll need to 
> workaround the clone approach by manually excluding the {{Connection}} header 
> (not straight-forward due to the {{metadata}} object being private with no 
> mutable access). We can remove such a change in future when there's a release 
> available with the upstream fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11687) Ignore x-* and response headers when copying an Amazon S3 object

2016-03-29 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216184#comment-15216184
 ] 

Harsh J commented on HADOOP-11687:
--

Yes this can solve HADOOP-12970 too. I am unsure if skipping all headers would 
be harmful in any way though (in my approach I only skipped the problematic 
header vs. all of them).

For a test case, are we looking at a manual Reflection-based injection of the 
bad header? I can help with adding such a test if needed. It looks like this is 
currently being worked on by [~petera5], so I'll defer to him.
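For reference, a rough sketch of what such a Reflection-based injection could 
look like in a test follows. This is only an illustration under the assumption 
that the AWS SDK's {{ObjectMetadata}} keeps its raw headers in a private 
{{metadata}} map; the field name may differ between SDK versions, and this is 
not part of any attached patch.

{code}
import java.lang.reflect.Field;
import java.util.Map;

import com.amazonaws.services.s3.model.ObjectMetadata;

public final class ConnectionHeaderInjector {
  private ConnectionHeaderInjector() {
  }

  /**
   * Forces a raw "Connection: close" header into an ObjectMetadata instance so
   * that a unit test can simulate the server response which triggers the
   * signing failure, without needing the real S3 service to close connections.
   */
  @SuppressWarnings("unchecked")
  public static void injectConnectionClose(ObjectMetadata om) throws Exception {
    // Assumed private field name; adjust if the SDK version stores headers elsewhere.
    Field metadataField = ObjectMetadata.class.getDeclaredField("metadata");
    metadataField.setAccessible(true);
    ((Map<String, Object>) metadataField.get(om)).put("Connection", "close");
  }
}
{code}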

> Ignore x-* and response headers when copying an Amazon S3 object
> 
>
> Key: HADOOP-11687
> URL: https://issues.apache.org/jira/browse/HADOOP-11687
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Denis Jannot
> Attachments: HADOOP-11687.001.patch
>
>
> The EMC ViPR/ECS object storage platform uses proprietary headers starting 
> with x-emc-* (much as Amazon does with x-amz-*).
> Headers starting with x-emc-* should be included in the signature 
> computation, but the Amazon S3 Java SDK does not do this (the EMC S3 SDK does).
> When s3a copies an object it copies all of the headers, but when the object 
> includes x-emc-* headers, this generates a signature mismatch.
> Removing the x-emc-* headers from the copy would allow s3a to be compatible 
> with the EMC ViPR/ECS object storage platform.
> Removing any x-* headers which aren't x-amz-* from the copy would allow s3a 
> to be compatible with any object storage platform that uses proprietary 
> headers.
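A minimal sketch of the filtering rule described above: carry over only the 
x-amz-* headers among the x-* family and drop response/hop-by-hop headers 
before issuing the copy. The header names listed are illustrative, not an 
exhaustive set, and this is not the attached patch.

{code}
import java.util.Locale;

public final class CopyHeaderFilter {
  private CopyHeaderFilter() {
  }

  /** Returns true if a header from the source object should be re-signed on the copy. */
  public static boolean shouldCopy(String headerName) {
    String h = headerName.toLowerCase(Locale.ROOT);
    if (h.startsWith("x-")) {
      // Keep standard Amazon headers, drop proprietary ones such as x-emc-*.
      return h.startsWith("x-amz-");
    }
    // Drop response/hop-by-hop headers that the copy request should not carry.
    return !(h.equals("connection")
        || h.equals("etag")
        || h.equals("last-modified")
        || h.equals("accept-ranges"));
  }
}
{code}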



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due connection closure

2016-03-29 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216179#comment-15216179
 ] 

Harsh J commented on HADOOP-12970:
--

Yes it seems to be doing the same thing as this one, but it ignores all headers 
vs. just the Connection one. I don't know if that's right to do, but from the 
Connection POV it does solve the issue.

I'll comment on the other JIRA.

> Intermittent signature match failures in S3AFileSystem due connection closure
> -
>
> Key: HADOOP-12970
> URL: https://issues.apache.org/jira/browse/HADOOP-12970
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: HADOOP-12970.patch, HADOOP-12970.patch
>
>
> S3AFileSystem's use of {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail in circumstances where the connection 
> used for obtaining the metadata is closed by the server (i.e. response 
> carries a {{Connection: close}} header). Due to this header not being 
> stripped away when the {{ObjectMetadata}} is created, and due to us cloning 
> it for use in the next {{CopyObjectRequest}}, it causes the request to use 
> {{Connection: close}} headers as a part of itself.
> This causes signer related exceptions because the client now includes the 
> {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server 
> does not receive the same value for it ({{Connection}} headers are likely 
> stripped away before the S3 Server tries to match signature hashes), causing 
> a failure like below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 Server does not always add a 
> {{Connection: close}} directive in its response, but whenever we receive it 
> AND we clone it, the above exception would happen for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when a lot 
> of the MR attempt files on {{s3a://}} destination filesystem are to be moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream with AWS Java SDK to strip out the 
> {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but 
> until that release is available and can be used by us, we'll need to 
> workaround the clone approach by manually excluding the {{Connection}} header 
> (not straight-forward due to the {{metadata}} object being private with no 
> mutable access). We can remove such a change in future when there's a release 
> available with the upstream fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due connection closure

2016-03-29 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215880#comment-15215880
 ] 

Harsh J commented on HADOOP-12970:
--

Thanks [~ste...@apache.org]!

bq. you do need to use StringUtils.equalsIgnoreCase() though, unless you want 
to field support calls from Turkey at UTC+2.00 hours.
Sorry to hear! Happy to make this change to not let that happen again ;)

> Intermittent signature match failures in S3AFileSystem due connection closure
> -
>
> Key: HADOOP-12970
> URL: https://issues.apache.org/jira/browse/HADOOP-12970
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: HADOOP-12970.patch, HADOOP-12970.patch
>
>
> S3AFileSystem's use of {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail in circumstances where the connection 
> used for obtaining the metadata is closed by the server (i.e. response 
> carries a {{Connection: close}} header). Due to this header not being 
> stripped away when the {{ObjectMetadata}} is created, and due to us cloning 
> it for use in the next {{CopyObjectRequest}}, it causes the request to use 
> {{Connection: close}} headers as a part of itself.
> This causes signer related exceptions because the client now includes the 
> {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server 
> does not receive the same value for it ({{Connection}} headers are likely 
> stripped away before the S3 Server tries to match signature hashes), causing 
> a failure like below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 Server does not always add a 
> {{Connection: close}} directive in its response, but whenever we receive it 
> AND we clone it, the above exception would happen for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when a lot 
> of the MR attempt files on {{s3a://}} destination filesystem are to be moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream with AWS Java SDK to strip out the 
> {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but 
> until that release is available and can be used by us, we'll need to 
> workaround the clone approach by manually excluding the {{Connection}} header 
> (not straight-forward due to the {{metadata}} object being private with no 
> mutable access). We can remove such a change in future when there's a release 
> available with the upstream fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due connection closure

2016-03-29 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-12970:
-
Attachment: HADOOP-12970.patch

> Intermittent signature match failures in S3AFileSystem due connection closure
> -
>
> Key: HADOOP-12970
> URL: https://issues.apache.org/jira/browse/HADOOP-12970
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: HADOOP-12970.patch, HADOOP-12970.patch
>
>
> S3AFileSystem's use of {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail in circumstances where the connection 
> used for obtaining the metadata is closed by the server (i.e. response 
> carries a {{Connection: close}} header). Due to this header not being 
> stripped away when the {{ObjectMetadata}} is created, and due to us cloning 
> it for use in the next {{CopyObjectRequest}}, it causes the request to use 
> {{Connection: close}} headers as a part of itself.
> This causes signer related exceptions because the client now includes the 
> {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server 
> does not receive the same value for it ({{Connection}} headers are likely 
> stripped away before the S3 Server tries to match signature hashes), causing 
> a failure like below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 Server does not always add a 
> {{Connection: close}} directive in its response, but whenever we receive it 
> AND we clone it, the above exception would happen for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when a lot 
> of the MR attempt files on {{s3a://}} destination filesystem are to be moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream with AWS Java SDK to strip out the 
> {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but 
> until that release is available and can be used by us, we'll need to 
> workaround the clone approach by manually excluding the {{Connection}} header 
> (not straight-forward due to the {{metadata}} object being private with no 
> mutable access). We can remove such a change in future when there's a release 
> available with the upstream fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due connection closure

2016-03-28 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215335#comment-15215335
 ] 

Harsh J commented on HADOOP-12970:
--

bq. Patch generated 1 ASF License warnings.

This is unrelated to the patch; per the report below, it seems to have been caused elsewhere:

{code}
Lines that start with ? in the ASF License  report indicate files that do 
not have an Apache license header:
 !? 
/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/dfs.hosts.json
{code}

bq. The patch doesn't appear to include any new or modified tests. Please 
justify why no new tests are needed for this patch. Also please list what 
manual steps were performed to verify this patch.

Writing a test-case for this would be non-trivial as it would involve 
controlling the S3 service to send back a connection close header. I doubt 
there's a way to control/simulate that, as S3 does not always respond back with 
"Connection: close" to the object metadata requests.

> Intermittent signature match failures in S3AFileSystem due connection closure
> -
>
> Key: HADOOP-12970
> URL: https://issues.apache.org/jira/browse/HADOOP-12970
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: HADOOP-12970.patch
>
>
> S3AFileSystem's use of {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail in circumstances where the connection 
> used for obtaining the metadata is closed by the server (i.e. response 
> carries a {{Connection: close}} header). Due to this header not being 
> stripped away when the {{ObjectMetadata}} is created, and due to us cloning 
> it for use in the next {{CopyObjectRequest}}, it causes the request to use 
> {{Connection: close}} headers as a part of itself.
> This causes signer related exceptions because the client now includes the 
> {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server 
> does not receive the same value for it ({{Connection}} headers are likely 
> stripped away before the S3 Server tries to match signature hashes), causing 
> a failure like below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 Server does not always add a 
> {{Connection: close}} directive in its response, but whenever we receive it 
> AND we clone it, the above exception would happen for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when a lot 
> of the MR attempt files on {{s3a://}} destination filesystem are to be moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream with AWS Java SDK to strip out the 
> {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but 
> until that release is available and can be used by us, we'll need to 
> workaround the clone approach by manually excluding the {{Connection}} header 
> (not straight-forward due to the {{metadata}} object being private with no 
> mutable access). We can remove such a change in future when there's a release 
> available with the upstream fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due connection closure

2016-03-28 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-12970:
-
Status: Patch Available  (was: Open)

> Intermittent signature match failures in S3AFileSystem due connection closure
> -
>
> Key: HADOOP-12970
> URL: https://issues.apache.org/jira/browse/HADOOP-12970
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: HADOOP-12970.patch
>
>
> S3AFileSystem's use of {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail in circumstances where the connection 
> used for obtaining the metadata is closed by the server (i.e. response 
> carries a {{Connection: close}} header). Due to this header not being 
> stripped away when the {{ObjectMetadata}} is created, and due to us cloning 
> it for use in the next {{CopyObjectRequest}}, it causes the request to use 
> {{Connection: close}} headers as a part of itself.
> This causes signer related exceptions because the client now includes the 
> {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server 
> does not receive the same value for it ({{Connection}} headers are likely 
> stripped away before the S3 Server tries to match signature hashes), causing 
> a failure like below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 Server does not always add a 
> {{Connection: close}} directive in its response, but whenever we receive it 
> AND we clone it, the above exception would happen for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when a lot 
> of the MR attempt files on {{s3a://}} destination filesystem are to be moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream with AWS Java SDK to strip out the 
> {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but 
> until that release is available and can be used by us, we'll need to 
> workaround the clone approach by manually excluding the {{Connection}} header 
> (not straight-forward due to the {{metadata}} object being private with no 
> mutable access). We can remove such a change in future when there's a release 
> available with the upstream fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due connection closure

2016-03-28 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-12970:
-
Attachment: HADOOP-12970.patch

> Intermittent signature match failures in S3AFileSystem due connection closure
> -
>
> Key: HADOOP-12970
> URL: https://issues.apache.org/jira/browse/HADOOP-12970
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
>Reporter: Harsh J
>Assignee: Harsh J
> Attachments: HADOOP-12970.patch
>
>
> S3AFileSystem's use of {{ObjectMetadata#clone()}} method inside the 
> {{copyFile}} implementation may fail in circumstances where the connection 
> used for obtaining the metadata is closed by the server (i.e. response 
> carries a {{Connection: close}} header). Due to this header not being 
> stripped away when the {{ObjectMetadata}} is created, and due to us cloning 
> it for use in the next {{CopyObjectRequest}}, it causes the request to use 
> {{Connection: close}} headers as a part of itself.
> This causes signer related exceptions because the client now includes the 
> {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server 
> does not receive the same value for it ({{Connection}} headers are likely 
> stripped away before the S3 Server tries to match signature hashes), causing 
> a failure like below:
> {code}
> 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
> org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
> Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
> SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
>  Signature=MNOPQRSTUVWXYZ[\r][\n]"
> …
> com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
> calculated does not match the signature you provided. Check your key and 
> signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
> SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
> {code}
> This is intermittent because the S3 Server does not always add a 
> {{Connection: close}} directive in its response, but whenever we receive it 
> AND we clone it, the above exception would happen for the copy request. The 
> copy request is often used in the context of FileOutputCommitter, when a lot 
> of the MR attempt files on {{s3a://}} destination filesystem are to be moved 
> to their parent directories post-commit.
> I've also submitted a fix upstream with AWS Java SDK to strip out the 
> {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending 
> acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but 
> until that release is available and can be used by us, we'll need to 
> workaround the clone approach by manually excluding the {{Connection}} header 
> (not straight-forward due to the {{metadata}} object being private with no 
> mutable access). We can remove such a change in future when there's a release 
> available with the upstream fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due connection closure

2016-03-28 Thread Harsh J (JIRA)
Harsh J created HADOOP-12970:


 Summary: Intermittent signature match failures in S3AFileSystem 
due connection closure
 Key: HADOOP-12970
 URL: https://issues.apache.org/jira/browse/HADOOP-12970
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Affects Versions: 2.7.0
Reporter: Harsh J
Assignee: Harsh J


S3AFileSystem's use of the {{ObjectMetadata#clone()}} method inside the 
{{copyFile}} implementation may fail in circumstances where the connection used 
for obtaining the metadata is closed by the server (i.e. the response carries a 
{{Connection: close}} header). Because this header is not stripped away when 
the {{ObjectMetadata}} is created, and because we clone it for use in the next 
{{CopyObjectRequest}}, the copy request ends up carrying a {{Connection: close}} 
header itself.

This causes signer-related exceptions because the client now includes the 
{{Connection}} header as part of the {{SignedHeaders}}, but the S3 server does 
not receive the same value for it ({{Connection}} headers are likely stripped 
away before the S3 server tries to match signature hashes), causing a failure 
like the one below:

{code}
2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] 
org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 
Credential=XXX/20160329/eu-central-1/s3/aws4_request, 
SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id,
 Signature=MNOPQRSTUVWXYZ[\r][\n]"
…
com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we 
calculated does not match the signature you provided. Check your key and 
signing method. (Service: Amazon S3; Status Code: 403; Error Code: 
SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ
{code}

This is intermittent because the S3 server does not always add a {{Connection: 
close}} directive to its response, but whenever we receive it AND we clone it, 
the above exception occurs for the copy request. The copy request is often used 
in the context of FileOutputCommitter, when a lot of the MR attempt files on 
the {{s3a://}} destination filesystem are to be moved to their parent 
directories post-commit.

I've also submitted a fix upstream to the AWS Java SDK to strip out the 
{{Connection}} headers when dealing with {{ObjectMetadata}}; it is pending 
acceptance and release at https://github.com/aws/aws-sdk-java/pull/669. Until 
that release is available and can be used by us, we'll need to work around it 
in the clone approach by manually excluding the {{Connection}} header (not 
straightforward, because the {{metadata}} object is private with no mutable 
access). We can remove this change in the future once a release with the 
upstream fix is available.
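A minimal sketch of the manual exclusion described above (not the attached 
HADOOP-12970 patch), assuming the AWS SDK v1 {{ObjectMetadata}} API 
({{getRawMetadata()}}, {{setHeader()}}, {{getUserMetadata()}}): build a fresh 
metadata object for the {{CopyObjectRequest}} and copy everything except the 
{{Connection}} header.

{code}
import java.util.Map;

import com.amazonaws.services.s3.model.ObjectMetadata;

public final class MetadataCopier {
  private MetadataCopier() {
  }

  /** Copies source metadata for a CopyObjectRequest, dropping the Connection header. */
  public static ObjectMetadata copyWithoutConnection(ObjectMetadata source) {
    ObjectMetadata copy = new ObjectMetadata();
    // Raw (system) headers: copy everything except Connection.
    for (Map.Entry<String, Object> entry : source.getRawMetadata().entrySet()) {
      if (!"Connection".equalsIgnoreCase(entry.getKey())) {
        copy.setHeader(entry.getKey(), entry.getValue());
      }
    }
    // User metadata (x-amz-meta-*) is carried over unchanged.
    copy.setUserMetadata(source.getUserMetadata());
    return copy;
  }
}
{code}

{{copyFile}} could then pass {{copyWithoutConnection(srcMetadata)}} to the 
{{CopyObjectRequest}} instead of {{srcMetadata.clone()}}.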



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-9567) Provide auto-renewal for keytab based logins

2016-03-24 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-9567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15211143#comment-15211143
 ] 

Harsh J commented on HADOOP-9567:
-

Thanks [~subrotosanyal], that is indeed what handles it automatically for us 
today. I was thinking more from the perspective of an app that does not utilise 
Hadoop or HBase IPCs: some form of scheduled background daemon thread could do 
periodic renewals for each held LoginContext. There could be locking aspects to 
consider around this though, if someone uses getCurrentUser vs. getLoginUser.

I agree that in Hadoop's context we have it already covered, so we could also 
close this out. Any direct users of UGI should self-ensure and call the 
checkTGTAndReloginFromKeytab functionality themselves.
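As an illustration of the scheduled background thread idea above, here is a 
minimal sketch for applications that use UGI directly without going through 
Hadoop/HBase IPC. The period and thread name are arbitrary choices, and it only 
covers the process-wide login user, not every held LoginContext.

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.security.UserGroupInformation;

public final class KeytabReloginThread {
  private KeytabReloginThread() {
  }

  /** Schedules a periodic relogin check for the process-wide keytab login. */
  public static ScheduledExecutorService start(long periodMinutes) {
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
          Thread t = new Thread(r, "ugi-keytab-relogin");
          t.setDaemon(true);
          return t;
        });
    scheduler.scheduleWithFixedDelay(() -> {
      try {
        UserGroupInformation ugi = UserGroupInformation.getLoginUser();
        if (ugi.isFromKeytab()) {
          // No-op if the TGT is still well within its lifetime.
          ugi.checkTGTAndReloginFromKeytab();
        }
      } catch (Exception e) {
        // Swallow and retry on the next tick rather than killing the thread.
      }
    }, periodMinutes, periodMinutes, TimeUnit.MINUTES);
    return scheduler;
  }
}
{code}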

bq. but when you have all datanodes (or other client processes) started at the 
same time, this can lead to a thundering herd effect, where all processes pile 
on the KDC at the same time.

This seems like a separate issue to me (it wasn't my intention when filing 
this). Perhaps such a check should only be added to the DN login/re-login code, 
because on the UGI end it would affect all users (even those who do not desire 
the jitter)?

> Provide auto-renewal for keytab based logins
> 
>
> Key: HADOOP-9567
> URL: https://issues.apache.org/jira/browse/HADOOP-9567
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Gary Helmling
>Priority: Minor
>
> We do a renewal for cached tickets (obtained via kinit before using a Hadoop 
> application) but we explicitly seem to avoid doing a renewal for keytab based 
> logins (done from within the client code) when we could do that as well via a 
> similar thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12559) KMS connection failures should trigger TGT renewal

2016-03-18 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198683#comment-15198683
 ] 

Harsh J commented on HADOOP-12559:
--

(We'll need also HADOOP-12682 for tests if backporting BTW)

> KMS connection failures should trigger TGT renewal
> --
>
> Key: HADOOP-12559
> URL: https://issues.apache.org/jira/browse/HADOOP-12559
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0
>
> Attachments: HADOOP-12559.00.patch, HADOOP-12559.01.patch, 
> HADOOP-12559.02.patch, HADOOP-12559.03.patch, HADOOP-12559.04.patch, 
> HADOOP-12559.05.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11404) Clarify the "expected client Kerberos principal is null" authorization message

2016-03-10 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-11404:
-
   Resolution: Fixed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

Committed to branch-2 and trunk.

> Clarify the "expected client Kerberos principal is null" authorization message
> --
>
> Key: HADOOP-11404
> URL: https://issues.apache.org/jira/browse/HADOOP-11404
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.2.0
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>Priority: Minor
>  Labels: BB2015-05-TBR, supportability
> Fix For: 2.9.0
>
> Attachments: HADOOP-11404.001.patch, HADOOP-11404.002.patch, 
> HADOOP-11404.003.patch
>
>
> In {{ServiceAuthorizationManager#authorize}}, we throw an 
> {{AuthorizationException}} with message "expected client Kerberos principal 
> is null" when authorization fails.
> However, this is a confusing log message, because it leads users to believe 
> there was a Kerberos authentication problem, when in fact the user could 
> have authenticated successfully.
> {code}
> if((clientPrincipal != null && !clientPrincipal.equals(user.getUserName())) 
> || 
>acls.length != 2  || !acls[0].isUserAllowed(user) || 
> acls[1].isUserAllowed(user)) {
>   AUDITLOG.warn(AUTHZ_FAILED_FOR + user + " for protocol=" + protocol
>   + ", expected client Kerberos principal is " + clientPrincipal);
>   throw new AuthorizationException("User " + user + 
>   " is not authorized for protocol " + protocol + 
>   ", expected client Kerberos principal is " + clientPrincipal);
> }
> AUDITLOG.info(AUTHZ_SUCCESSFUL_FOR + user + " for protocol="+protocol);
> {code}
> In the above code, if clientPrincipal is null, then the user is authenticated 
> successfully but denied by a configured ACL, not a Kerberos issue. We should 
> improve this log message to state this.
> Thanks to [~tlipcon] for finding this and proposing a fix.
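For illustration only (not necessarily the wording of the attached patches), 
the denial reason could be split into the two distinct cases along these lines, 
reusing the names from the snippet above:

{code}
/** Builds a more precise denial reason for the audit log and the exception. */
static String denialReason(String clientPrincipal,
    org.apache.hadoop.security.UserGroupInformation user, Class<?> protocol) {
  if (clientPrincipal != null && !clientPrincipal.equals(user.getUserName())) {
    return "client principal " + user.getUserName()
        + " does not match expected Kerberos principal " + clientPrincipal;
  }
  return "denied by the configured service ACL for protocol " + protocol.getName();
}
{code}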



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11404) Clarify the "expected client Kerberos principal is null" authorization message

2016-03-10 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189177#comment-15189177
 ] 

Harsh J commented on HADOOP-11404:
--

Thanks [~ste...@apache.org]! Committing shortly.

> Clarify the "expected client Kerberos principal is null" authorization message
> --
>
> Key: HADOOP-11404
> URL: https://issues.apache.org/jira/browse/HADOOP-11404
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.2.0
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>Priority: Minor
>  Labels: BB2015-05-TBR, supportability
> Attachments: HADOOP-11404.001.patch, HADOOP-11404.002.patch, 
> HADOOP-11404.003.patch
>
>
> In {{ServiceAuthorizationManager#authorize}}, we throw an 
> {{AuthorizationException}} with message "expected client Kerberos principal 
> is null" when authorization fails.
> However, this is a confusing log message, because it leads users to believe 
> there was a Kerberos authentication problem, when in fact the user could 
> have authenticated successfully.
> {code}
> if((clientPrincipal != null && !clientPrincipal.equals(user.getUserName())) 
> || 
>acls.length != 2  || !acls[0].isUserAllowed(user) || 
> acls[1].isUserAllowed(user)) {
>   AUDITLOG.warn(AUTHZ_FAILED_FOR + user + " for protocol=" + protocol
>   + ", expected client Kerberos principal is " + clientPrincipal);
>   throw new AuthorizationException("User " + user + 
>   " is not authorized for protocol " + protocol + 
>   ", expected client Kerberos principal is " + clientPrincipal);
> }
> AUDITLOG.info(AUTHZ_SUCCESSFUL_FOR + user + " for protocol="+protocol);
> {code}
> In the above code, if clientPrincipal is null, then the user is authenticated 
> successfully but denied by a configured ACL, not a Kerberos issue. We should 
> improve this log message to state this.
> Thanks to [~tlipcon] for finding this and proposing a fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11404) Clarify the "expected client Kerberos principal is null" authorization message

2016-03-10 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-11404:
-
Attachment: HADOOP-11404.003.patch

Fixing the indent issue. Retrying.

> Clarify the "expected client Kerberos principal is null" authorization message
> --
>
> Key: HADOOP-11404
> URL: https://issues.apache.org/jira/browse/HADOOP-11404
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.2.0
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>Priority: Minor
>  Labels: BB2015-05-TBR, supportability
> Attachments: HADOOP-11404.001.patch, HADOOP-11404.002.patch, 
> HADOOP-11404.003.patch
>
>
> In {{ServiceAuthorizationManager#authorize}}, we throw an 
> {{AuthorizationException}} with message "expected client Kerberos principal 
> is null" when authorization fails.
> However, this is a confusing log message, because it leads users to believe 
> there was a Kerberos authentication problem, when in fact the user could 
> have authenticated successfully.
> {code}
> if((clientPrincipal != null && !clientPrincipal.equals(user.getUserName())) 
> || 
>acls.length != 2  || !acls[0].isUserAllowed(user) || 
> acls[1].isUserAllowed(user)) {
>   AUDITLOG.warn(AUTHZ_FAILED_FOR + user + " for protocol=" + protocol
>   + ", expected client Kerberos principal is " + clientPrincipal);
>   throw new AuthorizationException("User " + user + 
>   " is not authorized for protocol " + protocol + 
>   ", expected client Kerberos principal is " + clientPrincipal);
> }
> AUDITLOG.info(AUTHZ_SUCCESSFUL_FOR + user + " for protocol="+protocol);
> {code}
> In the above code, if clientPrincipal is null, then the user is authenticated 
> successfully but denied by a configured ACL, not a Kerberos issue. We should 
> improve this log message to state this.
> Thanks to [~tlipcon] for finding this and proposing a fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11404) Clarify the "expected client Kerberos principal is null" authorization message

2016-03-09 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-11404:
-
Attachment: HADOOP-11404.002.patch

Resubmitting for Jenkins (fixed an 84-character line to clear a probable 
checkstyle nit).

> Clarify the "expected client Kerberos principal is null" authorization message
> --
>
> Key: HADOOP-11404
> URL: https://issues.apache.org/jira/browse/HADOOP-11404
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.2.0
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>Priority: Minor
>  Labels: BB2015-05-TBR, supportability
> Attachments: HADOOP-11404.001.patch, HADOOP-11404.002.patch
>
>
> In {{ServiceAuthorizationManager#authorize}}, we throw an 
> {{AuthorizationException}} with message "expected client Kerberos principal 
> is null" when authorization fails.
> However, this is a confusing log message, because it leads users to believe 
> there was a Kerberos authentication problem, when in fact the user could 
> have authenticated successfully.
> {code}
> if((clientPrincipal != null && !clientPrincipal.equals(user.getUserName())) 
> || 
>acls.length != 2  || !acls[0].isUserAllowed(user) || 
> acls[1].isUserAllowed(user)) {
>   AUDITLOG.warn(AUTHZ_FAILED_FOR + user + " for protocol=" + protocol
>   + ", expected client Kerberos principal is " + clientPrincipal);
>   throw new AuthorizationException("User " + user + 
>   " is not authorized for protocol " + protocol + 
>   ", expected client Kerberos principal is " + clientPrincipal);
> }
> AUDITLOG.info(AUTHZ_SUCCESSFUL_FOR + user + " for protocol="+protocol);
> {code}
> In the above code, if clientPrincipal is null, then the user is authenticated 
> successfully but denied by a configured ACL, not a Kerberos issue. We should 
> improve this log message to state this.
> Thanks to [~tlipcon] for finding this and proposing a fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11404) Clarify the "expected client Kerberos principal is null" authorization message

2016-03-09 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188715#comment-15188715
 ] 

Harsh J commented on HADOOP-11404:
--

+1, patch still applies. Will commit in a day unless you have any objection, 
[~ste...@apache.org].

> Clarify the "expected client Kerberos principal is null" authorization message
> --
>
> Key: HADOOP-11404
> URL: https://issues.apache.org/jira/browse/HADOOP-11404
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.2.0
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>Priority: Minor
>  Labels: BB2015-05-TBR, supportability
> Attachments: HADOOP-11404.001.patch
>
>
> In {{ServiceAuthorizationManager#authorize}}, we throw an 
> {{AuthorizationException}} with message "expected client Kerberos principal 
> is null" when authorization fails.
> However, this is a confusing log message, because it leads users to believe 
> there was a Kerberos authentication problem, when in fact the user could 
> have authenticated successfully.
> {code}
> if((clientPrincipal != null && !clientPrincipal.equals(user.getUserName())) 
> || 
>acls.length != 2  || !acls[0].isUserAllowed(user) || 
> acls[1].isUserAllowed(user)) {
>   AUDITLOG.warn(AUTHZ_FAILED_FOR + user + " for protocol=" + protocol
>   + ", expected client Kerberos principal is " + clientPrincipal);
>   throw new AuthorizationException("User " + user + 
>   " is not authorized for protocol " + protocol + 
>   ", expected client Kerberos principal is " + clientPrincipal);
> }
> AUDITLOG.info(AUTHZ_SUCCESSFUL_FOR + user + " for protocol="+protocol);
> {code}
> In the above code, if clientPrincipal is null, then the user is authenticated 
> successfully but denied by a configured ACL, not a Kerberos issue. We should 
> improve this log message to state this.
> Thanks to [~tlipcon] for finding this and proposing a fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12894) Add yarn.app.mapreduce.am.log.level to mapred-default.xml

2016-03-05 Thread Harsh J (JIRA)
Harsh J created HADOOP-12894:


 Summary: Add yarn.app.mapreduce.am.log.level to mapred-default.xml
 Key: HADOOP-12894
 URL: https://issues.apache.org/jira/browse/HADOOP-12894
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.9.0
Reporter: Harsh J
Assignee: Harsh J
Priority: Trivial






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12559) KMS connection failures should trigger TGT renewal

2015-12-11 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052650#comment-15052650
 ] 

Harsh J commented on HADOOP-12559:
--

[~zhz] - I've isolated the issue to a problem with tgt lifetime vs. when the NN 
renews it, so that trace can be ignored in this JIRA. I'll log a separate JIRA 
for it and follow up here later. Thanks for taking a look!

> KMS connection failures should trigger TGT renewal
> --
>
> Key: HADOOP-12559
> URL: https://issues.apache.org/jira/browse/HADOOP-12559
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HADOOP-12559.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12559) KMS connection failures should trigger TGT renewal

2015-12-11 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052880#comment-15052880
 ] 

Harsh J commented on HADOOP-12559:
--

Although, if it's possible for the patch to not be specific to re-login during 
decrypt calls only, it could solve the NN TGT expiry issue as well.

> KMS connection failures should trigger TGT renewal
> --
>
> Key: HADOOP-12559
> URL: https://issues.apache.org/jira/browse/HADOOP-12559
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HADOOP-12559.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-12559) KMS connection failures should trigger TGT renewal

2015-12-08 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046632#comment-15046632
 ] 

Harsh J commented on HADOOP-12559:
--

The reason I ask is that the NameNode also sees the same error (outside of a 
DFSClient):

{code}
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:743)
at 
org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2530)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2664)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2560)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:585)
at 
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:110)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:395)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)
at 
com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:289)
at 
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276)
at 
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111)
at 
com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:132)
at 
com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2381)
at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2351)
at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
at 
com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
at 
org.apache.hadoop.crypto.key.kms.ValueQueue.getAtMost(ValueQueue.java:266)
at 
org.apache.hadoop.crypto.key.kms.ValueQueue.getNext(ValueQueue.java:226)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:738)
... 16 more
Caused by: java.io.IOException: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:488)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.access$100(KMSClientProvider.java:83)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider$EncryptedQueueRefiller.fillQueueForKey(KMSClientProvider.java:132)
at 
org.apache.hadoop.crypto.key.kms.ValueQueue$1.load(ValueQueue.java:181)
at 
org.apache.hadoop.crypto.key.kms.ValueQueue$1.load(ValueQueue.java:175)
at 
com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
... 24 more
Caused by: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)
at 

[jira] [Commented] (HADOOP-11187) NameNode - KMS communication fails after a long period of inactivity

2015-12-08 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046601#comment-15046601
 ] 

Harsh J commented on HADOOP-11187:
--

This doesn't cover the retries properly if the connection setup itself fails 
(the retries are limited to the call(…) method being executed, but the 
connection is set up before that):

{code}
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:743)
at 
org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:371)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.generateEncryptedDataEncryptionKey(FSNamesystem.java:2530)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2664)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2560)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:585)
at 
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:110)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:395)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)
at 
com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:289)
at 
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276)
at 
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:111)
at 
com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:132)
at 
com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2381)
at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2351)
at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
at 
com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
at 
org.apache.hadoop.crypto.key.kms.ValueQueue.getAtMost(ValueQueue.java:266)
at 
org.apache.hadoop.crypto.key.kms.ValueQueue.getNext(ValueQueue.java:226)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.generateEncryptedKey(KMSClientProvider.java:738)
... 16 more
Caused by: java.io.IOException: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:488)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider.access$100(KMSClientProvider.java:83)
at 
org.apache.hadoop.crypto.key.kms.KMSClientProvider$EncryptedQueueRefiller.fillQueueForKey(KMSClientProvider.java:132)
at 
org.apache.hadoop.crypto.key.kms.ValueQueue$1.load(ValueQueue.java:181)
at 
org.apache.hadoop.crypto.key.kms.ValueQueue$1.load(ValueQueue.java:175)
at 
com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
... 24 more
Caused by: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)
at 

[jira] [Commented] (HADOOP-12559) KMS connection failures should trigger TGT renewal

2015-11-15 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-12559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005962#comment-15005962
 ] 

Harsh J commented on HADOOP-12559:
--

On the client-end, or the server-end BTW?

> KMS connection failures should trigger TGT renewal
> --
>
> Key: HADOOP-12559
> URL: https://issues.apache.org/jira/browse/HADOOP-12559
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-12549) Extend HDFS-7456 default generically to all pattern lookups

2015-11-03 Thread Harsh J (JIRA)
Harsh J created HADOOP-12549:


 Summary: Extend HDFS-7456 default generically to all pattern 
lookups
 Key: HADOOP-12549
 URL: https://issues.apache.org/jira/browse/HADOOP-12549
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc, security
Affects Versions: 2.7.1
Reporter: Harsh J
Assignee: Harsh J
Priority: Minor


In HDFS-7546 we added a hdfs-default.xml property to bring back the regular 
behaviour of trusting all principals (as was the case before HADOOP-9789). 
However, the change only targeted HDFS users and also only those that used the 
default-loading mechanism of Configuration class (i.e. not {{new 
Configuration(false)}} users).

I'd like to propose adding the same default to the generic RPC client code 
also, so the default affects all forms of clients equally.
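As a rough sketch of the proposal (the property key and helper names below are 
placeholders, not the attached patch): the generic client-side check would read 
a glob pattern that defaults to {{*}}, so an unset configuration trusts any 
server principal, mirroring the HDFS-side default.

{code}
import java.util.regex.Pattern;

import org.apache.hadoop.conf.Configuration;

public final class PrincipalPatternCheck {
  private PrincipalPatternCheck() {
  }

  /** Returns true if the server principal matches the configured glob (default "*"). */
  public static boolean serverPrincipalTrusted(Configuration conf,
      String patternKey, String serverPrincipal) {
    String glob = conf.get(patternKey, "*");  // an unset pattern means trust everyone
    // Convert the simple glob to a regex: '*' matches anything, other characters are literal.
    String regex = "\\Q" + glob.replace("*", "\\E.*\\Q") + "\\E";
    return Pattern.matches(regex, serverPrincipal);
  }
}
{code}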



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12549) Extend HDFS-7456 default generically to all pattern lookups

2015-11-03 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-12549:
-
Status: Patch Available  (was: Open)

> Extend HDFS-7456 default generically to all pattern lookups
> ---
>
> Key: HADOOP-12549
> URL: https://issues.apache.org/jira/browse/HADOOP-12549
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, security
>Affects Versions: 2.7.1
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-12549.patch
>
>
> In HDFS-7546 we added a hdfs-default.xml property to bring back the regular 
> behaviour of trusting all principals (as was the case before HADOOP-9789). 
> However, the change only targeted HDFS users and also only those that used 
> the default-loading mechanism of Configuration class (i.e. not {{new 
> Configuration(false)}} users).
> I'd like to propose adding the same default to the generic RPC client code 
> also, so the default affects all forms of clients equally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-12549) Extend HDFS-7456 default generically to all pattern lookups

2015-11-03 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HADOOP-12549:
-
Attachment: HADOOP-12549.patch

> Extend HDFS-7456 default generically to all pattern lookups
> ---
>
> Key: HADOOP-12549
> URL: https://issues.apache.org/jira/browse/HADOOP-12549
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: ipc, security
>Affects Versions: 2.7.1
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Attachments: HADOOP-12549.patch
>
>
> In HDFS-7546 we added a hdfs-default.xml property to bring back the regular 
> behaviour of trusting all principals (as was the case before HADOOP-9789). 
> However, the change only targeted HDFS users and also only those that used 
> the default-loading mechanism of Configuration class (i.e. not {{new 
> Configuration(false)}} users).
> I'd like to propose adding the same default to the generic RPC client code 
> also, so the default affects all forms of clients equally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

