Hi Ron,
whatever API you have in Scala, you can most likely use from Java; Scala is
interoperable with Java and vice versa. Scala, being both object-oriented
and functional, will make your job easier on the JVM, and it is more concise
than Java. Take it as an opportunity and start learning Scala ;).
A pretty large fraction of users use Java, but a few features are still not
available in it. JdbcRDD is one of them -- this functionality will likely be
superseded by Spark SQL when we add JDBC as a data source. In the meantime, to
use it, I'd recommend writing a small class in Scala that exposes a
Java-friendly method around JdbcRDD, and calling that from Java.
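For concreteness, here is a minimal sketch of the kind of wrapper being
suggested. The object name JavaJdbcRDD and the method name firstColumn are my
own inventions; the JdbcRDD constructor arguments (a connection factory, a SQL
string with two '?' placeholders for the partition bounds, the bounds
themselves, a partition count, and a row-mapping function) follow the Spark
1.x API:

import java.sql.{DriverManager, ResultSet}

import org.apache.spark.SparkContext
import org.apache.spark.api.java.JavaRDD
import org.apache.spark.rdd.JdbcRDD

// Hypothetical Scala shim exposing JdbcRDD through a Java-friendly method.
object JavaJdbcRDD {
  def firstColumn(
      sc: SparkContext,
      url: String,
      sql: String,          // must contain two '?' placeholders, e.g.
                            // "SELECT name FROM users WHERE id >= ? AND id <= ?"
      lowerBound: Long,
      upperBound: Long,
      numPartitions: Int): JavaRDD[String] = {
    val rdd = new JdbcRDD(
      sc,
      () => DriverManager.getConnection(url),  // fresh connection per partition
      sql,
      lowerBound,
      upperBound,
      numPartitions,
      (rs: ResultSet) => rs.getString(1))      // map each row to its first column
    JavaRDD.fromRDD(rdd)                       // hand back a Java-friendly RDD
  }
}

From the Java side this then becomes an ordinary call returning a
JavaRDD<String>, with no Scala closures or ClassTags to deal with.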
I interpret this to mean you have to learn Scala in order to work with Spark in
Scala (goes without saying) and also to work with Spark in Java (since you have
to jump through some hoops for basic functionality).
The best path here is to take this as a learning opportunity and sit down and
learn Scala.
The overridable methods of RDD are marked as @DeveloperApi, which means that
these are internal APIs used by people that might want to extend Spark, but are
not guaranteed to remain stable across Spark versions (unlike Spark's public
APIs).
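To make that concrete, here is a toy, entirely hypothetical subclass showing
where those overridable hooks live -- getPartitions and compute are what you
override, and their signatures are exactly the kind of internal detail that
may shift between Spark releases:

import org.apache.spark.{Partition, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

// Made-up example: an RDD that yields one copy of `value` per partition.
case class SlicePartition(index: Int) extends Partition

class ConstantRDD(sc: SparkContext, value: Int, slices: Int)
  extends RDD[Int](sc, Nil) {

  // Internal hooks, not part of the stable public API
  override protected def getPartitions: Array[Partition] =
    (0 until slices).map(i => SlicePartition(i): Partition).toArray

  override def compute(split: Partition, context: TaskContext): Iterator[Int] =
    Iterator.single(value)
}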
BTW, if you want a way to do this that does not depend on those internal
APIs, the Spark SQL JDBC data source mentioned above will be the supported
route once it lands.
I believe that you are overstating your case.
If you want to work with Spark, then the Java API is entirely adequate,
with very few exceptions -- unfortunately, though, one of those
exceptions is something you are interested in, JdbcRDD.
If you want to work on Spark -- that is, on its internals -- then learning
Scala is unavoidable.
Don't be too concerned about the Scala hoop. Before making the
commitment to Scala, I had coded up a modest analytic prototype in
Hadoop MapReduce. Once I made the commitment, it took ten days to
(1) learn enough Scala and (2) rewrite the prototype in Spark in
Scala.