Github user mag- commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-96587264
Are you aware that all these regexp hacks will break when Hadoop changes
version to 3.0.0?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-96611185
@mag- if you're talking about what I think you are, it was a temporary
thing that's long since gone already
https://github.com/apache/spark/pull/629/files
---
Github user LuqmanSahaf commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-96522017
@darose I am facing the VerifyError you mentioned in one of the comments.
Can you tell me how you solved that error?
---
Github user mag- commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-96642739
Well:
`val jets3tVersion = if ("^2\\.[3-9]+".r.findFirstIn(hadoopVersion).isDefined) "0.9.0" else "0.7.1"`
It probably should be the other way round, if the Hadoop version is lower
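For illustration, the regex pitfall mag- describes could be avoided by comparing parsed version numbers instead of pattern-matching the string. This is a hypothetical sketch, not code from the Spark build; `jets3tFor` is an invented name:

```scala
// Hypothetical sketch: choose the jets3t version by comparing parsed
// (major, minor) numbers rather than a regex, so that Hadoop 3.x and
// later still fall into the newer-jets3t branch.
def jets3tFor(hadoopVersion: String): String = {
  val Array(major, minor) = hadoopVersion.split("\\.").take(2).map(_.toInt)
  // jets3t 0.9.0 for Hadoop 2.3.0 and later; 0.7.1 for anything older.
  if (major > 2 || (major == 2 && minor >= 3)) "0.9.0" else "0.7.1"
}
```

Under this check, a future "3.0.0" selects 0.9.0, whereas the quoted regex `^2\.[3-9]+` would silently fall back to 0.7.1.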
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-96643172
Agree but that doesn't exist in `master` anyway. Now the SBT build drives
off the Maven build.
---
Github user darose commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-96883761
On 04/27/2015 07:11 AM, Sean Owen wrote:
@mag- if you're talking about what I think you are, it was a temporary
thing that's long since gone already
Github user CodingCat closed the pull request at:
https://github.com/apache/spark/pull/468
---
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42253192
fixed in #629
---
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42102935
@pwendell Before I begin, can I propose a refactoring of profiles that would
make this and similar issues easier to deal with? It's probably for a different
PR, but will
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42109604
@srowen Not everyone uses the same version of HDFS vs. YARN.
---
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42109781
@witgo Hm, is there an example that comes up repeatedly? Is it ever
intentional, or just some accident of someone's legacy deployment? I don't
know of a case of this, and
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42110042
@srowen Related discussion in [PR
502](https://github.com/apache/spark/pull/502).
@berngp Can you explain the reason for not using the same version of HDFS
vs. YARN?
Github user berngp commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42112284
I think in general it's an edge case, but there are folks still using HDFS
1.0.x with a different version of YARN; that said, it is not my case.
I like what you
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42113320
@srowen YARN version does need to be separate from hadoop version.
Downstream consumers of our build sometimes do this. For instance, if they want
to build against a
Github user darose commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42027309
Man oh man, I cannot get this to work no way no how. I tried rebuilding
spark using the jets3t 0.9 jar, then tried rebuilding shark doing the same. I
keep getting a
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42096004
@srowen I'd prefer not to remove it from the dependency graph if possible
because it will break local builds. The best solution I see is to add a profile
for Hadoop 2.3
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-42096201
@srowen if you'd like to take a crack at this by the way, please do. I'll
probably look at it on Sunday if no one else has.
---
Github user darose commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41764125
So @srowen, I think @mateiz is right, the CDH5 spark-core package (on
Ubuntu, it's version 0.9.0+cdh5.0.0+31-1.cdh5.0.0.p0.31~precise-cdh5.0.0) won't
function correctly
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41770038
@darose this can be patched downstream, but that would not fix this for any
other distro. Ideally, the dependency is set to 0.9.0 when built against Hadoop
2.3.0+. As
Github user darose commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41797308
What I can confirm is that trying to remove the jets3t 0.7 jars from the
CDH spark-core package and replace them with 0.9 jars doesn't fix the issue.
(I'm guessing
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41804304
@darose what about removing the library from the assembly entirely? so
there is no copy in your app or in the deployed Spark jars? May not be a viable
solution in general,
Github user darose commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41804694
Definitely worth a shot! Will give that a try and report back.
---
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41804895
Hi, @srowen, do you want to take over the patch? I'm concerned I cannot
fix it in the coming days, considering my schedule and my knowledge level on
mvn and sbt.
Github user darose commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41808390
Sigh. Was a promising idea, but no dice. Even with the 0.7 jars out of
the way, I'm still getting java.lang.NoClassDefFoundError:
org/jets3t/service/S3ServiceException
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41831614
@CodingCat I can make a patch, but it will mean introducing a new profile
like hadoop230 that one has to enable when building for Hadoop 2.3.0. I
always hate to add that
Github user darose commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41841626
FYI - I think I might have figured out why deleting the jets3t jar didn't
fix the issue. It looks like the spark build process bundles the jets3t
classes into the spark
Github user darose commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41732946
Is there any way to apply this fix without a rebuild of spark? E.g., to
just replace jets3t-0.7.1.jar with jets3t-0.9.0.jar in a deployed spark
package? I'm running into
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41747427
You can try adding jets3t 0.9 as a Maven dependency in your application,
but unfortunately I think that goes after the Spark assembly JAR when running
an app. In 1.0 there
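In sbt terms, the suggestion above might look like the following sketch. The coordinates are the standard jets3t ones on Maven Central; whether this copy actually wins over the one shaded into the Spark assembly depends on classpath ordering, which is exactly the caveat raised here:

```scala
// build.sbt sketch (assumption: the application's own classpath, not the
// Spark assembly, is what ends up loading jets3t at runtime).
// Pin jets3t 0.9.0 explicitly in the application build:
libraryDependencies += "net.java.dev.jets3t" % "jets3t" % "0.9.0"
```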
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41747796
@mateiz for @darose's question, how about compiling the application against
a customized Spark jar (with a newer jets3t)? I think in that case, he does not
need to restart
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41748362
BTW the right way to do it would be to make hadoop-client have a Maven
dependency on the right version of Jets3t. Then Spark would just build with the
right version out of
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41748394
@CodingCat the problem is that on worker nodes there will be the wrong
jets3t in the Spark JAR.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41639344
Merged build triggered.
---
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41640655
Merged build finished. All automated tests passed.
---
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41009471
Unfortunately this will not work in older Hadoop versions as far as I know.
Can you still build Spark against Hadoop 1.0.4 and run it with this change?
It might be
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41014340
@mateiz I thought the same thing, that `hadoop-client` pulls this in, but
it does not. Only things like `hadoop-hdfs`.
I agree with updating the dependency, but to
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41059208
In that case let's see exactly which Hadoop 2.x version bumped up the
dependency, because I don't think 2.0 and 2.1 did it (could be wrong though).
---
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41064211
@mateiz It looks like it went to 0.8.1 in Hadoop 1.3.0
(https://issues.apache.org/jira/browse/HADOOP-8136) and 0.9.0 in 2.3.0
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41073254
Great, so there's no easy way to set it based on profiles and support all
Hadoop versions :). Maybe for Hadoop 2.3+ users, we can just tell them to add a
new version of
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41079532
Hi, @mateiz @srowen, if Spark built with Hadoop 1.0.4/2.x (x < 3) and
jets3t 0.9.0 can access S3 smoothly, does it also mean that bumping to 0.9.0 is
safe?
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/468#issuecomment-41079837
Sure, that would work. Please try it. Unfortunately I remember it having
problems, but I could be wrong.
---