PS: all that is needed to run a submitted application is to tell the context
not to look for the standard Mahout jars (assuming Mahout is not installed on
every cluster node) but rather to ship the uber jar the app was started from
(which is trivially resolved, for example, via the Hadoop jar-locating
utility). I don't remember the details exactly, but I can probably clarify
further if needed.
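
In case it helps, a minimal sketch of what I mean, assuming Spark's Scala API
(the driver class and app name here are hypothetical; `SparkContext.jarOfClass`
is the Spark utility that resolves the jar a class was loaded from, similar in
spirit to Hadoop's jar-by-class lookup):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MyMahoutApp { // hypothetical driver class
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("my-mahout-app")
    // Resolve the uber jar this class was launched from and ship it to the
    // executors, instead of relying on Mahout jars on every cluster node.
    SparkContext.jarOfClass(this.getClass).foreach(jar => conf.setJars(Seq(jar)))
    val sc = new SparkContext(conf)
    // ... run the job ...
    sc.stop()
  }
}
```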

But the point is, everything needed to support custom class paths is already
in the API, including situations that involve running through the spark-submit
machinery.

Unfortunately, the whole Spark submit story is pretty awkward. One usually
needs to have the app's build produce an uber jar, including all transitive
dependencies except those already in Spark, and then either use the
undocumented submit class or the official route, which is running the
spark-submit script.
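
For reference, the official route looks like this (the class name, master URL,
and jar name below are just placeholders):

```shell
spark-submit \
  --class com.example.MyMahoutApp \
  --master spark://host:7077 \
  myapp-assembly.jar
```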

I think the undocumented API was rewritten once again in later Spark versions
and was declared user-level after that, IIRC.

That means there is very little incentive to put submit code into the Mahout
project itself, since it would only be able to handle submits that rely
exclusively on code already in Mahout; the external code would still have to
take care of building the uber jar, which is really the main thing there is to
take care of.
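
To illustrate what that uber jar building amounts to on the application side, a
build.sbt sketch assuming the sbt-assembly plugin (artifact names and versions
are illustrative, not a recommendation): Spark is marked "provided" so it is
left out of the assembly, while Mahout and other transitive dependencies get
bundled in.

```scala
// build.sbt (assumes sbt-assembly is enabled in project/plugins.sbt)
libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"   % "1.6.3"  % "provided", // version illustrative
  "org.apache.mahout" %% "mahout-spark" % "0.13.0"               // version illustrative
)

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _                             => MergeStrategy.first
}
```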
