[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2015-04-27 Thread mag-
Github user mag- commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-96587264 Are you aware that all these regexp hacks will break when Hadoop changes version to 3.0.0? --- If your project is set up for it, you can reply to this email and have your

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2015-04-27 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-96611185 @mag- if you're talking about what I think you are, it was a temporary thing that's long since gone already https://github.com/apache/spark/pull/629/files

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2015-04-27 Thread LuqmanSahaf
Github user LuqmanSahaf commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-96522017 @darose I am facing the VerifyError you mentioned in one of the comments. Can you tell me how you solved that error?

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2015-04-27 Thread mag-
Github user mag- commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-96642739 Well: `val jets3tVersion = if ("^2\\.[3-9]+".r.findFirstIn(hadoopVersion).isDefined) "0.9.0" else "0.7.1"` It probably should be the other way round, if hadoop version is lower
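The regex in the comment above matches only versions that start with `2.` followed by digits 3 to 9, so a version such as 3.0.0 (or 2.10.0) falls through to 0.7.1. A minimal sketch of a numeric comparison that avoids this, assuming (hypothetically) that every Hadoop release from 2.3 onward wants jets3t 0.9.0; the object and method names are illustrative and are not taken from the Spark build:

```scala
// Hypothetical helper for illustration only; not the code used in the Spark build.
object Jets3tVersion {
  // Extracts the leading "major.minor" numbers, e.g. "2.3.0-cdh5.0.0" -> Some((2, 3)).
  def majorMinor(hadoopVersion: String): Option[(Int, Int)] = {
    val MajorMinor = """^(\d+)\.(\d+).*""".r
    hadoopVersion match {
      case MajorMinor(major, minor) => Some((major.toInt, minor.toInt))
      case _                        => None
    }
  }

  // Assumption: Hadoop 2.3 and everything newer needs jets3t 0.9.0;
  // older or unparseable versions keep 0.7.1.
  def forHadoop(hadoopVersion: String): String =
    majorMinor(hadoopVersion) match {
      case Some((major, minor)) if major > 2 || (major == 2 && minor >= 3) => "0.9.0"
      case _                                                               => "0.7.1"
    }
}
```

Comparing parsed numbers rather than matching digit characters means a future major version does not silently select the old dependency.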

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2015-04-27 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-96643172 Agree but that doesn't exist in `master` anyway. Now the SBT build drives off the Maven build.

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2015-04-27 Thread darose
Github user darose commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-96883761 On 04/27/2015 07:11 AM, Sean Owen wrote: @mag- if you're talking about what I think you are, it was a temporary thing that's long since gone already

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-05 Thread CodingCat
Github user CodingCat closed the pull request at: https://github.com/apache/spark/pull/468

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-05 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42253192 fixed in #629

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-03 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42102935 @pwendell Before I begin can I propose a refactoring of profiles that will make this and similar issues easy to deal with? Probably it's for a different PR, but will

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-03 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42109604 @srowen Not everyone uses the same version of HDFS vs YARN.

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-03 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42109781 @witgo Hm, is there an example that comes up repeatedly? Is it ever intentional, or just some accident of someone's legacy deployment? I don't know of a case of this, and

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-03 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42110042 @srowen Related discussion in [PR 502](https://github.com/apache/spark/pull/502). @berngp Can you explain the reason for not using the same version of HDFS vs YARN?

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-03 Thread berngp
Github user berngp commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42112284 I think in general it is an edge case, but there are folks still using HDFS 1.0.x with a different version of YARN. That said, it is not my case. I like what you

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-03 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42113320 @srowen YARN version does need to be separate from hadoop version. Downstream consumers of our build sometimes do this. For instance, if they want to build against a

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-02 Thread darose
Github user darose commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42027309 Man oh man, I cannot get this to work no way no how. I tried rebuilding spark using the jets3t 0.9 jar, then tried rebuilding shark doing the same. I keep getting a

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-02 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42096004 @srowen I'd prefer not to remove it from the dependency graph if possible because it will break local builds. The best solution I see is to add a profile for Hadoop 2.3

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-05-02 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-42096201 @srowen if you'd like to take a crack at this by the way, please do. I'll probably look at it on Sunday if no one else has.

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-30 Thread darose
Github user darose commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41764125 So @srowen, I think @mateiz is right, the CDH5 spark-core package (on Ubuntu, it's version 0.9.0+cdh5.0.0+31-1.cdh5.0.0.p0.31~precise-cdh5.0.0) won't function correctly

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-30 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41770038 @darose this can be patched downstream, but that would not fix this for any other distro. Ideally, the dependency is set to 0.9.0 when built against Hadoop 2.3.0+. As

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-30 Thread darose
Github user darose commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41797308 What I can confirm is that trying to remove the jets3t 0.7 jars from the CDH spark-core package and replace them with 0.9 jars doesn't fix the issue. (I'm guessing

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-30 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41804304 @darose what about removing the library from the assembly entirely? so there is no copy in your app or in the deployed Spark jars? May not be a viable solution in general,

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-30 Thread darose
Github user darose commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41804694 Definitely worth a shot! Will give that a try and report back.

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-30 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41804895 Hi, @srowen, do you want to take over the patch? I'm concerned that I cannot fix it in the following days, considering my schedule and my knowledge level on mvn and sbt.

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-30 Thread darose
Github user darose commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41808390 Sigh. Was a promising idea, but no dice. Even with the 0.7 jars out of the way, I'm still getting java.lang.NoClassDefFoundError: org/jets3t/service/S3ServiceException

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-30 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41831614 @CodingCat I can make a patch, but it will mean introducing a new profile like hadoop230 that one has to enable when building for Hadoop 2.3.0. I always hate to add that

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-30 Thread darose
Github user darose commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41841626 FYI - I think I might have figured out why deleting the jets3t jar didn't fix the issue. It looks like the spark build process bundles the jets3t classes into the spark
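One way to check whether a given jar carries its own copy of the jets3t classes is to list its entries under `org/jets3t/`. A minimal sketch using the JDK's `java.util.jar.JarFile`; the object name, method name, and the jar path in the usage note are hypothetical:

```scala
import java.util.jar.JarFile
import scala.collection.mutable.ListBuffer

// Hypothetical diagnostic helper; not part of Spark.
object FindBundledClasses {
  // Returns the names of all entries in the jar whose path starts with `prefix`.
  def bundledEntries(jarPath: String, prefix: String = "org/jets3t/"): List[String] = {
    val jar = new JarFile(jarPath)
    try {
      val entries = jar.entries()
      val found = ListBuffer.empty[String]
      while (entries.hasMoreElements) {
        val name = entries.nextElement().getName
        if (name.startsWith(prefix)) found += name
      }
      found.toList
    } finally {
      jar.close()
    }
  }
}
```

For example, `FindBundledClasses.bundledEntries("/path/to/spark-assembly.jar").take(5).foreach(println)` would print the first few bundled jets3t entries, if any; a non-empty result suggests that removing the standalone jets3t jar alone will not change which classes are loaded.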

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-29 Thread darose
Github user darose commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41732946 Is there any way to apply this fix without a rebuild of spark? E.g., to just replace jets3t-0.7.1.jar with jets3t-0.9.0.jar in a deployed spark package? I'm running into

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-29 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41747427 You can try adding jets3t 0.9 as a Maven dependency in your application, but unfortunately I think that goes after the Spark assembly JAR when running an app. In 1.0 there

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-29 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41747796 @mateiz for @darose's question, how about compiling the application against a customized spark jar (with newer jets3t)? I think in that case, he does not need to restart

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-29 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41748362 BTW the right way to do it would be to make hadoop-client have a Maven dependency on the right version of Jets3t. Then Spark would just build with the right version out of

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-29 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41748394 @CodingCat the problem is that on worker nodes there will be the wrong jets3t in the Spark JAR.

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41639344 Merged build triggered.

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41640655 Merged build finished. All automated tests passed.

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-22 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41009471 Unfortunately this will not work in older Hadoop versions as far as I know. Can you still build Spark against Hadoop 1.0.4 and run it with this change? It might be

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-22 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41014340 @mateiz I thought the same thing, that `hadoop-client` pulls this in, but it does not. Only things like `hadoop-hdfs`. I agree with updating the dependency, but to

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-22 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41059208 In that case let's see exactly which Hadoop 2.x version bumped up the dependency, because I don't think 2.0 and 2.1 did it (could be wrong though).

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-22 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41064211 @mateiz It looks like it went to 0.8.1 in Hadoop 1.3.0 (https://issues.apache.org/jira/browse/HADOOP-8136) and 0.9.0 in 2.3.0

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-22 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41073254 Great, so there's no easy way to set it based on profiles and support all Hadoop versions :). Maybe for Hadoop 2.3+ users, we can just tell them to add a new version of

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-22 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41079532 Hi, @mateiz @srowen, if Spark built with Hadoop 1.0.4/2.x (x < 3) and jets3t 0.9.0 can access S3 smoothly, does it also mean that bumping to 0.9.0 is safe?

[GitHub] spark pull request: SPARK-1556: bump jets3t version to 0.9.0

2014-04-22 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/468#issuecomment-41079837 Sure, that would work. Please try it. Unfortunately I remember it having problems, but I could be wrong.