On Fri, May 30, 2014 at 12:05 PM, Colin McCabe <cmcc...@alumni.cmu.edu> wrote:
> I don't know if Scala provides any mechanisms to do this beyond what Java 
> provides.

In fact it does. You can say something like "private[foo]" and the
annotated element will be visible to all classes under "foo" (where
"foo" is any package in the hierarchy leading up to the class). That's
used a lot in Spark.
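As a quick sketch (package and class names here are made up for
illustration, not from Spark):

```scala
package foo {
  // Visible to everything under package foo (including foo.bar), but
  // hidden from code outside foo. Note that at the bytecode level this
  // is emitted as a public member, so Java code could still see it.
  private[foo] class Secret {
    def value: Int = 42
  }

  package bar {
    object Caller {
      // foo.bar is nested inside foo, so this access compiles fine.
      def read(): Int = new Secret().value
    }
  }
}
```

Outside package foo, `new foo.Secret` would be rejected by the compiler,
while `foo.bar.Caller.read()` works from anywhere.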

I haven't fully looked at how @DeveloperApi is used, but I agree
with you - annotations are not a good way to do this. The Scala
feature above would be much better, but it might still leak things at
the Java bytecode level (I don't know how Scala implements it under
the covers, but I assume it's not by declaring the element as a Java
"private").

Another thing is that in Scala the default visibility is public, which
makes it very easy to inadvertently add things to the API. I'd like to
see more care in making things have the proper visibility - I
generally declare things private first, and relax that as needed.
Using @VisibleForTesting would be great too, when the Scala
private[foo] approach doesn't work.

> Does Spark also expose its CLASSPATH in
> this way to executors?  I was under the impression that it did.

If you're using the Spark assemblies, yes, there are a lot of things
your app gets exposed to. For example, you can see Guava and
Jetty (and many other things) there. This is something that has always
bugged me, but I don't really have a good suggestion for how to fix it;
shading goes a certain way, but it also breaks code that uses
reflection (e.g. Class.forName()-style class loading).

What is worse is that Spark doesn't even agree with the Hadoop code it
depends on; e.g., Spark uses Guava 14.x while Hadoop is still on Guava
11.x. So when you run your Scala app, which version gets loaded?

> At some point we will also have to confront the Scala version issue.  Will
> there be flag days where Spark jobs need to be upgraded to a new,
> incompatible version of Scala to run on the latest Spark?

Yes, this could be an issue - I'm not sure Scala has a policy on
this, but updates (even minor ones, e.g. 2.9 -> 2.10) tend to break
binary compatibility.

Scala also makes some API updates tricky - e.g., adding a new argument
with a default value to a Scala method is not a binary-compatible
change (while, e.g., adding a new keyword argument to a Python method
is just fine). The use of implicits and other Scala features makes
this even more opaque...
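A sketch of the default-argument case (class names are made up; the
"before" and "after" versions are shown side by side only so the
snippet is self-contained):

```scala
// "Version 1" of some library class:
class GreeterV1 {
  def greet(name: String): String = s"Hello, $name"
}

// "Version 2" adds an argument with a default value:
class GreeterV2 {
  def greet(name: String, punct: String = "!"): String =
    s"Hello, $name$punct"
}
// Source-compatible: greet("world") still compiles against version 2.
// Binary-incompatible: a caller compiled against version 1 links to the
// JVM method greet(String)String, which version 2 no longer has -- it
// only has the two-argument descriptor, plus a synthetic
// greet$default$2 method that supplies the default at call sites.
```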

Anyway, not really any solutions in this message, just a few comments
I wanted to throw out there. :-)

-- 
Marcelo
