[jira] [Created] (HADOOP-10421) Enable Kerberos profiled UTs to run with IBM JAVA
Jinghui Wang created HADOOP-10421:
-------------------------------------

             Summary: Enable Kerberos profiled UTs to run with IBM JAVA
                 Key: HADOOP-10421
                 URL: https://issues.apache.org/jira/browse/HADOOP-10421
             Project: Hadoop Common
          Issue Type: Test
          Components: security, test
    Affects Versions: 2.2.0
            Reporter: Jinghui Wang
             Fix For: 2.3.0


KerberosTestUtils in both hadoop-auth and hadoop-httpfs does not support IBM JAVA, whose Kerberos login module uses different configuration options.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
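For illustration, a minimal sketch of the vendor switch such test utilities need. The two Krb5LoginModule class names are the standard Oracle and IBM ones; the exact keytab option set shown is an assumption, not the final patch:

{code:java}
import java.util.HashMap;
import java.util.Map;

public class KerberosVendorSupport {
  // IBM JDKs report a vendor string containing "IBM".
  private static final boolean IBM_JAVA =
      System.getProperty("java.vendor", "").contains("IBM");

  /** Vendor-specific JAAS login module class name. */
  public static String getKrb5LoginModuleName() {
    return IBM_JAVA
        ? "com.ibm.security.auth.module.Krb5LoginModule"
        : "com.sun.security.auth.module.Krb5LoginModule";
  }

  /** Vendor-specific JAAS options for a keytab login (illustrative). */
  public static Map<String, String> getKeytabOptions(String principal,
                                                     String keytab) {
    Map<String, String> options = new HashMap<String, String>();
    options.put("principal", principal);
    if (IBM_JAVA) {
      // IBM's module takes a keytab URL and a credentials-type flag.
      options.put("useKeytab", "file://" + keytab);
      options.put("credsType", "both");
    } else {
      options.put("useKeyTab", "true");
      options.put("keyTab", keytab);
      options.put("storeKey", "true");
      options.put("doNotPrompt", "true");
    }
    return options;
  }
}
{code}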
[jira] [Created] (HADOOP-10422) Remove redundant logging of retry attempts.
Chris Nauroth created HADOOP-10422:
-------------------------------------

             Summary: Remove redundant logging of retry attempts.
                 Key: HADOOP-10422
                 URL: https://issues.apache.org/jira/browse/HADOOP-10422
             Project: Hadoop Common
          Issue Type: Bug
          Components: ipc
    Affects Versions: 2.3.0
            Reporter: Chris Nauroth
            Assignee: Chris Nauroth


{{RetryUtils}} logs each retry attempt at both info level and debug level.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
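The redundant pattern, reduced to a sketch (illustrative only, not the actual {{RetryUtils}} source):

{code:java}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class RetryLoggingSketch {
  private static final Log LOG = LogFactory.getLog(RetryLoggingSketch.class);

  void logRetry(String msg, Exception e) {
    // Current behavior: the same message goes out twice whenever debug
    // logging is enabled, since a DEBUG-level logger also emits info.
    LOG.info(msg, e);
    LOG.debug(msg, e);
    // Fix: pick a single level and log once.
  }
}
{code}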
[jira] [Created] (HADOOP-10423) Clarify compatibility policy document for combination of new client and old server.
Chris Nauroth created HADOOP-10423:
-------------------------------------

             Summary: Clarify compatibility policy document for combination of new client and old server.
                 Key: HADOOP-10423
                 URL: https://issues.apache.org/jira/browse/HADOOP-10423
             Project: Hadoop Common
          Issue Type: Improvement
          Components: documentation
    Affects Versions: 2.3.0
            Reporter: Chris Nauroth
            Assignee: Chris Nauroth


As discussed on the dev mailing lists and MAPREDUCE-4052, we need to update the text of the compatibility policy to discuss a new client combined with an old server.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
Re: [DISCUSS] Clarification on Compatibility Policy: Upgraded Client + Old Server
Thank you, everyone, for the discussion. There is general agreement, so I have filed HADOOP-10423 with a patch to update the compatibility documentation.

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Thu, Mar 20, 2014 at 11:24 AM, Colin McCabe <cmcc...@alumni.cmu.edu> wrote:

+1 for making this guarantee explicit. It also definitely seems like a good idea to test mixed versions in Bigtop. HDFS is not immune to new-client/old-server scenarios, because the HDFS client gets bundled into a lot of places.

Colin

On Mar 20, 2014 10:55 AM, Chris Nauroth <cnaur...@hortonworks.com> wrote:

Our use of protobuf helps mitigate a lot of compatibility concerns, but there still can be situations that require careful coding on our part. When adding a new field to a protobuf message, the client might need to do a null check, even if the server-side implementation in the new version always populates the field. When adding a whole new RPC endpoint, the client might need to consider the possibility that the RPC endpoint isn't there on an old server, and degrade gracefully after the RPC fails. The original issue in MAPREDUCE-4052 concerned the script commands passed in a YARN container submission, where protobuf doesn't provide any validation beyond the fact that they're strings.

Forward compatibility is harder than backward compatibility, and testing is a big challenge. Our test suites in the Hadoop repo don't cover this. Does anyone know if anything in Bigtop tries to run with mixed versions?

I agree that we need to make it clear in the language that upgrading the client alone is insufficient to get access to new server-side features, including new YARN APIs.

Thanks for the suggestions, Steve.

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Thu, Mar 20, 2014 at 5:53 AM, Steve Loughran <ste...@hortonworks.com> wrote:

I'm clearly supportive of this, though of course the testing costs needed to back up the assertion make it more expensive than just a statement. Two issues:

- We'd need to make clear that new cluster features that a client can invoke won't be available. You can't expect snapshot or symlink support running against a 2.2.0 cluster, even if the client supports it.
- In YARN, there are no guarantees that an app compiled against later YARN APIs will work in old clusters, because YARN apps upload themselves to the server and run with their own hadoop, hdfs, and yarn libraries. We already have to do a bit of introspection in our code to support this situation. The compatibility doc would need to be clear on that too: YARN apps that use new APIs (including new fields in data structures) can expect link exceptions.

On 20 March 2014 04:25, Vinayakumar B <vinayakuma...@huawei.com> wrote:

+1, I agree with your point, Chris. It depends on how the client application uses the hdfs jars in its classpath. As the implementation already supports compatibility (through protobuf), no extra code changes are required to support a new client + old server. I feel it would be good to explicitly mention the compatibility of existing APIs across both versions. Anyway, this is not applicable to new APIs in the latest client, and that is understood. We can make it explicit in the document, though.

Regards,
Vinayakumar B

-----Original Message-----
From: Chris Nauroth [mailto:cnaur...@hortonworks.com]
Sent: 20 March 2014 05:36
To: common-dev@hadoop.apache.org
Cc: mapreduce-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org
Subject: Re: [DISCUSS] Clarification on Compatibility Policy: Upgraded Client + Old Server

I think this kind of compatibility issue still could surface for HDFS, particularly for custom applications (i.e. something not executed via "hadoop jar" on a cluster node, where the client classes ought to be injected into the classpath automatically). Running DistCp between 2 clusters of different versions could result in a 2.4.0 client calling a 2.3.0 NameNode. Someone could potentially pick up the 2.4.0 WebHDFS client as a dependency and try to use it to make HTTP calls to a 2.3.0 HDFS cluster.

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Wed, Mar 19, 2014 at 4:28 PM, Vinod Kumar Vavilapalli <vino...@apache.org> wrote:

It makes sense only for YARN today, where we separated out the clients. HDFS is still a monolithic jar, so this compatibility issue is kind of invalid there.

+vinod

On Mar 19, 2014, at 1:59 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:

I'd like to discuss clarification of part of our compatibility policy. Here is a link to the compatibility documentation for release 2.3.0:
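The two defensive patterns Chris describes above, reduced to a sketch. The proto and proxy types here are hypothetical stand-ins for generated protobuf classes, not real Hadoop protocols; only the presence check and the graceful fallback are the point:

{code:java}
import java.io.IOException;
import org.apache.hadoop.ipc.RemoteException;

class MixedVersionClient {
  private final StatusProtocol proxy;  // hypothetical RPC proxy

  MixedVersionClient(StatusProtocol proxy) { this.proxy = proxy; }

  String fetchExtraInfo() throws IOException {
    GetStatusResponseProto response = proxy.getStatus();
    // New-field case: an old server never sets a field added later,
    // so check presence rather than assuming it is populated.
    return response.hasExtraInfo()
        ? response.getExtraInfo()
        : "(not reported by this server version)";
  }

  void runFancyOperation() throws IOException {
    // New-endpoint case: the RPC may not exist on an old server,
    // so degrade gracefully when the call is rejected as unknown.
    try {
      proxy.fancyOperation();
    } catch (RemoteException e) {
      useLegacyCodePath();  // e.g. on an RpcNoSuchMethodException
    }
  }

  private void useLegacyCodePath() { /* pre-upgrade behavior */ }

  // Minimal hypothetical interfaces so the sketch stands alone.
  interface StatusProtocol {
    GetStatusResponseProto getStatus() throws IOException;
    void fancyOperation() throws IOException;
  }
  interface GetStatusResponseProto {
    boolean hasExtraInfo();
    String getExtraInfo();
  }
}
{code}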
Re: [DISCUSS] Clarification on Compatibility Policy: Upgraded Client + Old Server
Adding back all *-dev lists to make sure everyone is covered.

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Mon, Mar 24, 2014 at 2:02 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:

Thank you, everyone, for the discussion. There is general agreement, so I have filed HADOOP-10423 with a patch to update the compatibility documentation.
[jira] [Created] (HADOOP-10424) TestStreamingTaskLog is failing
Mit Desai created HADOOP-10424:
-------------------------------------

             Summary: TestStreamingTaskLog is failing
                 Key: HADOOP-10424
                 URL: https://issues.apache.org/jira/browse/HADOOP-10424
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 3.0.0
            Reporter: Mit Desai


testStreamingTaskLogWithHadoopCmd(org.apache.hadoop.streaming.TestStreamingTaskLog)  Time elapsed: 44.069 sec  FAILURE!
java.lang.AssertionError: environment set for child is wrong
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.assertTrue(Assert.java:43)
	at org.apache.hadoop.streaming.TestStreamingTaskLog.runStreamJobAndValidateEnv(TestStreamingTaskLog.java:157)
	at org.apache.hadoop.streaming.TestStreamingTaskLog.testStreamingTaskLogWithHadoopCmd(TestStreamingTaskLog.java:107)

Results :

Failed tests:
  TestStreamingTaskLog.testStreamingTaskLogWithHadoopCmd:107->runStreamJobAndValidateEnv:157 environment set for child is wrong

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Created] (HADOOP-10425) Incompatible behavior of LocalFileSystem:getContentSummary
Brandon Li created HADOOP-10425:
-------------------------------------

             Summary: Incompatible behavior of LocalFileSystem:getContentSummary
                 Key: HADOOP-10425
                 URL: https://issues.apache.org/jira/browse/HADOOP-10425
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 2.3.0
            Reporter: Brandon Li
            Assignee: Tsz Wo Nicholas Sze


Unlike in Hadoop 1, FilterFileSystem overrides getContentSummary, which causes the content summary to be computed on the RawLocalFileSystem in local mode. This impacts Hive's stats computation, which gets back file sizes that include the size of the crc files.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
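A quick way to see the discrepancy on a local directory (the path below is an example):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocalFileSystem;
import org.apache.hadoop.fs.Path;

public class ContentSummaryCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    LocalFileSystem localFs = FileSystem.getLocal(conf);
    Path dir = new Path("/tmp/summary-test");  // example path

    // Checksummed view vs. raw view of the same directory.
    ContentSummary viaLocal = localFs.getContentSummary(dir);
    ContentSummary viaRaw = localFs.getRawFileSystem().getContentSummary(dir);

    // With the bug, the "local" length matches the raw length because the
    // call is delegated to the raw layer, so hidden .crc files get counted
    // instead of being filtered out of the checksummed view.
    System.out.println("local: " + viaLocal.getLength()
        + ", raw: " + viaRaw.getLength());
  }
}
{code}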
[jira] [Created] (HADOOP-10426) CreateOpts.getOpt(..) should declare with generic type argument
Tsz Wo Nicholas Sze created HADOOP-10426:
-------------------------------------

             Summary: CreateOpts.getOpt(..) should declare with generic type argument
                 Key: HADOOP-10426
                 URL: https://issues.apache.org/jira/browse/HADOOP-10426
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs
            Reporter: Tsz Wo Nicholas Sze
            Assignee: Tsz Wo Nicholas Sze
            Priority: Minor


Similar to CreateOpts.setOpt(..), CreateOpts.getOpt(..) should also be declared with a generic type parameter T extends CreateOpts. Then, all the casting from CreateOpts to its subclasses can be avoided.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
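The shape of the proposed change, simplified (the real CreateOpts lives in org.apache.hadoop.fs.Options and has more subclasses than shown):

{code:java}
abstract class CreateOpts {
  static class BlockSize extends CreateOpts {
    final long size;
    BlockSize(long size) { this.size = size; }
  }

  // Before: callers get the base type back and must cast.
  static CreateOpts getOptRaw(Class<? extends CreateOpts> clazz,
                              CreateOpts... opts) {
    for (CreateOpts o : opts) {
      if (clazz.isInstance(o)) {
        return o;
      }
    }
    return null;
  }

  // After: a type parameter lets the compiler carry the subclass type,
  // so Class.cast does the work and call sites need no explicit cast.
  static <T extends CreateOpts> T getOpt(Class<T> clazz, CreateOpts... opts) {
    for (CreateOpts o : opts) {
      if (clazz.isInstance(o)) {
        return clazz.cast(o);
      }
    }
    return null;
  }
}
{code}

With the generic signature, {{BlockSize bs = CreateOpts.getOpt(BlockSize.class, opts);}} compiles without a cast.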
[jira] [Created] (HADOOP-10427) KeyProvider implementations should be thread safe
Alejandro Abdelnur created HADOOP-10427:
-------------------------------------

             Summary: KeyProvider implementations should be thread safe
                 Key: HADOOP-10427
                 URL: https://issues.apache.org/jira/browse/HADOOP-10427
             Project: Hadoop Common
          Issue Type: Improvement
          Components: security
    Affects Versions: 3.0.0
            Reporter: Alejandro Abdelnur
            Assignee: Alejandro Abdelnur


The {{KeyProvider}} API should be thread-safe so it can be used safely in server apps.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
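One way an implementation could meet the requirement, as a sketch (not the actual patch): guard the shared keystore state with a read/write lock so concurrent server threads see consistent state.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ThreadSafeKeyStore {
  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, byte[]> material = new HashMap<String, byte[]>();

  byte[] getKeyMaterial(String name) {
    lock.readLock().lock();
    try {
      byte[] m = material.get(name);
      // Clone so callers can never mutate internal state.
      return m == null ? null : m.clone();
    } finally {
      lock.readLock().unlock();
    }
  }

  void rollNewVersion(String name, byte[] newMaterial) {
    lock.writeLock().lock();
    try {
      material.put(name, newMaterial.clone());
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}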
[jira] [Created] (HADOOP-10428) JavaKeyStoreProvider should accept keystore password via configuration falling back to ENV VAR
Alejandro Abdelnur created HADOOP-10428:
-------------------------------------

             Summary: JavaKeyStoreProvider should accept keystore password via configuration falling back to ENV VAR
                 Key: HADOOP-10428
                 URL: https://issues.apache.org/jira/browse/HADOOP-10428
             Project: Hadoop Common
          Issue Type: Improvement
          Components: security
    Affects Versions: 3.0.0
            Reporter: Alejandro Abdelnur
            Assignee: Alejandro Abdelnur


Currently the password for the {{JavaKeyStoreProvider}} must be set in an ENV VAR. Allowing the password to be set via configuration enables applications to interactively ask for the password before initializing the {{JavaKeyStoreProvider}}.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
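The proposed lookup order, sketched (both the property name and the variable name below are placeholders, not necessarily what a patch would use):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class KeystorePasswordLookup {
  // Hypothetical names, for illustration only.
  static final String PASSWORD_KEY = "hadoop.security.keystore.password";
  static final String PASSWORD_ENV = "HADOOP_KEYSTORE_PASSWORD";

  static char[] resolvePassword(Configuration conf) {
    // Prefer configuration, which an application can populate
    // interactively before constructing the provider...
    String password = conf.get(PASSWORD_KEY);
    if (password == null) {
      // ...falling back to the ENV VAR for existing deployments.
      password = System.getenv(PASSWORD_ENV);
    }
    return password == null ? null : password.toCharArray();
  }
}
{code}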
[jira] [Created] (HADOOP-10429) KeyStores should have methods to generate the materials themselves, KeyShell should use them
Alejandro Abdelnur created HADOOP-10429:
-------------------------------------

             Summary: KeyStores should have methods to generate the materials themselves, KeyShell should use them
                 Key: HADOOP-10429
                 URL: https://issues.apache.org/jira/browse/HADOOP-10429
             Project: Hadoop Common
          Issue Type: Improvement
          Components: security
    Affects Versions: 3.0.0
            Reporter: Alejandro Abdelnur
            Assignee: Alejandro Abdelnur


Currently, the {{KeyProvider}} API expects the caller to provide the key material, and the {{KeyShell}} generates that material itself. For security reasons, {{KeyProvider}} implementations may want to generate the key material internally and hide it from the user creating the key.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
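Provider-side generation could look something like this sketch (the method name and its placement are assumptions): the caller names the cipher and key size, and the provider creates the material internally without ever returning it to the user.

{code:java}
import java.security.NoSuchAlgorithmException;
import javax.crypto.KeyGenerator;

public class ProviderSideKeyGen {
  static byte[] generateMaterial(String cipher, int bitLength)
      throws NoSuchAlgorithmException {
    // Cipher strings like "AES/CTR/NoPadding" carry the algorithm first.
    String algorithm = cipher.split("/")[0];
    KeyGenerator generator = KeyGenerator.getInstance(algorithm);
    generator.init(bitLength);
    // The material stays inside the provider; only key names and
    // metadata ever need to reach the KeyShell user.
    return generator.generateKey().getEncoded();
  }
}
{code}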
[jira] [Created] (HADOOP-10430) KeyProvider Metadata should have an optional label, there should be a method to retrieve the metadata from all keys
Alejandro Abdelnur created HADOOP-10430:
-------------------------------------

             Summary: KeyProvider Metadata should have an optional label, there should be a method to retrieve the metadata from all keys
                 Key: HADOOP-10430
                 URL: https://issues.apache.org/jira/browse/HADOOP-10430
             Project: Hadoop Common
          Issue Type: Improvement
          Components: security
    Affects Versions: 3.0.0
            Reporter: Alejandro Abdelnur
            Assignee: Alejandro Abdelnur


Being able to attach an optional label (and show it when displaying metadata) makes it possible to give some context on the keys.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
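A hypothetical shape for the extended metadata (field and method names are illustrative, not a committed API):

{code:java}
class KeyMetadata {
  private final String cipher;
  private final int bitLength;
  private final String label;  // optional; may be null

  KeyMetadata(String cipher, int bitLength, String label) {
    this.cipher = cipher;
    this.bitLength = bitLength;
    this.label = label;
  }

  @Override
  public String toString() {
    // Surface the label when displaying metadata to give the key context.
    return "cipher: " + cipher + ", length: " + bitLength
        + (label == null ? "" : ", label: " + label);
  }
}

// The companion bulk accessor would let callers fetch metadata for many
// keys in one call rather than one round trip per key, e.g.:
//   Map<String, KeyMetadata> getKeysMetadata(String... names);
{code}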
[jira] [Created] (HADOOP-10431) Change visibility of KeyStore KeyVersion/Metadata/Options constructor and methods to public
Alejandro Abdelnur created HADOOP-10431:
-------------------------------------

             Summary: Change visibility of KeyStore KeyVersion/Metadata/Options constructor and methods to public
                 Key: HADOOP-10431
                 URL: https://issues.apache.org/jira/browse/HADOOP-10431
             Project: Hadoop Common
          Issue Type: Improvement
          Components: security
    Affects Versions: 3.0.0
            Reporter: Alejandro Abdelnur
            Assignee: Alejandro Abdelnur


Making the KeyVersion/Metadata/Options constructors and methods public will make it easier for {{KeyProvider}} implementations to use those classes.

--
This message was sent by Atlassian JIRA
(v6.2#6252)