So, I'm leaning more towards using reflection for this.  Maven profiles
could work, but it's tough since we have new stuff coming in with 2.4, 2.5,
etc., and the number of profiles will multiply quickly if we have to do it
that way.  Reflection is the approach HBase took in a similar situation.
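
Just to illustrate the shape of it, the reflection path usually ends up
looking something like this (the method name below is a placeholder, not
the real API in question):

// Rough sketch: probe for an API that only exists in newer Hadoop releases.
// "getNewApiValue" is a made-up name standing in for a 2.4+ method.
import org.apache.hadoop.conf.Configuration

def newApiValueIfAvailable(conf: Configuration): Option[AnyRef] = {
  try {
    val m = conf.getClass.getMethod("getNewApiValue")
    Some(m.invoke(conf))
  } catch {
    case _: NoSuchMethodException => None  // older Hadoop: fall back to the old code path
  }
}

It works, but multiply that by every version-specific call and you can see
where the "ugly code" concern comes from.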

best,
Colin


On Fri, Jul 25, 2014 at 11:23 AM, Colin McCabe <cmcc...@alumni.cmu.edu>
wrote:

> I have a similar issue with SPARK-1767.  There are basically three ways to
> resolve it:
>
> 1. Use reflection to access classes newer than 0.21 (or whatever the
> oldest version of Hadoop is that Spark supports)
> 2. Add a build variant (in Maven this would be a profile) that deals with
> this.
> 3. Auto-detect which classes are available and use those.
>
> #1 is the easiest for end-users, but it can lead to some ugly code.
>
> #2 makes the code look nicer, but requires some effort on the part of
> people building Spark.  It can also lead to headaches in IDEs if people
> don't remember to select the new profile.  (For example, in IntelliJ, you
> can't see any of the YARN classes when you import the project from Maven
> without the YARN profile selected.)
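>
> For what it's worth, on the sbt side a variant like that could probably be
> expressed by switching source directories on a property.  A rough, untested
> sketch (the directory names and the "hadoop.version" property are made up):
>
> // build.sbt sketch: pick a version-specific source tree from -Dhadoop.version
> val hadoopVersion = sys.props.getOrElse("hadoop.version", "1.0.4")
> val hadoopIs24Plus = {
>   val parts = hadoopVersion.split('.')
>     .map(_.takeWhile(_.isDigit)).filter(_.nonEmpty).map(_.toInt)
>   parts.length >= 2 && (parts(0) > 2 || (parts(0) == 2 && parts(1) >= 4))
> }
> unmanagedSourceDirectories in Compile += {
>   val dir = if (hadoopIs24Plus) "hadoop-2.4" else "hadoop-pre-2.4"
>   baseDirectory.value / "src" / dir / "scala"
> }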
>
> #3 is something that... I don't know how to do in sbt or Maven.  I've been
> told that an antrun task might work here, but it seems like it could get
> really tricky.
>
> Overall, I'd lean more towards #2 here.
>
> best,
> Colin
>
>
> On Tue, Jul 22, 2014 at 12:47 AM, innowireless TaeYun Kim <
> taeyun....@innowireless.co.kr> wrote:
>
>> (I'm resending this mail since it seems that it was not sent. Sorry if this
>> was already sent.)
>>
>> Hi,
>>
>>
>>
>> A couple of months ago, I made a pull request to fix
>> https://issues.apache.org/jira/browse/SPARK-1825.
>>
>> My pull request is here: https://github.com/apache/spark/pull/899
>>
>>
>>
>> But that pull request has problems:
>>
>> - It is Hadoop 2.4.0+ only. It won't compile on the versions below it.
>>
>> - The related Hadoop API is marked as '@Unstable'.
>>
>>
>>
>> Here is an idea to remedy the problems: a new Spark configuration
>> variable.
>>
>> Maybe it could be named "spark.yarn.submit.crossplatform".
>>
>> If it is set to "true" (the default is false), the related Spark code can
>> use hard-coded strings identical to the values the Hadoop API provides,
>> thus avoiding compile errors on Hadoop versions below 2.4.0.
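>>
>> Roughly, what I have in mind is something like the following sketch. The
>> helper name, and the literal "<CPS>" and "{{VAR}}" strings that I believe
>> the Hadoop 2.4 API resolves to, are my assumptions, so please verify them:
>>
>> import org.apache.spark.SparkConf
>>
>> // Sketch only: use literal markers when crossplatform mode is on,
>> // and keep the current client-side values otherwise.
>> def crossPlatformHelpers(conf: SparkConf): (String, String => String) = {
>>   val cross = conf.getBoolean("spark.yarn.submit.crossplatform", false)
>>   // Separator used when assembling the container classpath:
>>   val cpSeparator =
>>     if (cross) "<CPS>"               // expanded by YARN on the cluster side
>>     else java.io.File.pathSeparator  // current behavior: the client OS's separator
>>   // Environment variable reference written into the launch command:
>>   val envRef = (name: String) =>
>>     if (cross) s"{{$name}}"          // also expanded on the cluster side
>>     else s"$$$name"                  // stand-in for the current client-side form
>>   (cpSeparator, envRef)
>> }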
>>
>>
>>
>> Can someone implement this feature, if this idea is acceptable?
>>
>> Currently my knowledge of the Spark source code and Scala is too limited
>> to implement it myself.
>>
>> For the right person, the modification should be trivial.
>>
>> You can refer to the source code changes of my pull request.
>>
>>
>>
>> Thanks.
>>
>>
>>
>>
>
