I already looked. My main concern is that it meddles with the Spark
interpreter code too much, which may create friction with the Spark
interpreter in the future. It may be hard to keep the integration code for
two products coherent in one component (in this case, the same interpreter
class/file). I don't want to put this comment in the Zeppelin discussion,
but internally I think it should be a concern for us.

Is it possible to have a standalone mahout-spark interpreter that uses the
same Spark configuration as the one configured for the Spark interpreter?
If yes, I would very much prefer not to have Spark-only and Spark+Mahout
code intermingled in the same interpreter class.
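
Roughly, the wiring I have in mind is just this (a sketch; how exactly the
standalone interpreter would get hold of the SparkContext the Spark
interpreter already created is the open question, I'm not assuming any
particular Zeppelin API here):

import org.apache.spark.SparkContext
import org.apache.mahout.math.drm.DistributedContext
import org.apache.mahout.sparkbindings._

// reuse the already-configured SparkContext; no second context and no
// duplicated Spark configuration, just wrap it as a mahout context
def wireMahout(sc: SparkContext): DistributedContext = sc2sdc(sc)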

Visually, it would probably also be preferable to have a block that would
require boilerplate of something like:

%spark.mahout

... blah ....
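
For illustration, such a paragraph could then be pure Mahout DSL (a
sketch, assuming the interpreter pre-wires the implicit sdc and the usual
Mahout imports):

%spark.mahout

val a = dense((1, 2, 3), (3, 4, 5))
val drmA = drmParallelize(a)   // distributed row matrix via the implicit sdc
val drmAtA = drmA.t %*% drmA   // A'A, computed on the cluster
println(drmAtA.collect)        // small result brought back in-core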

On Thu, Jun 2, 2016 at 8:24 AM, Trevor Grant <trevor.d.gr...@gmail.com>
wrote:

> Would you mind having a look at
> https://github.com/apache/incubator-zeppelin/pull/928/files
> to see if I'm missing anything critical.
>
> The idea is the user specifies a directory containing the necessary jars
> (to be covered in the setup documentation), and the jars are loaded from
> there. It also adds some configuration settings (mainly Kryo) when
> 'spark.mahout' is true. Finally, it imports Mahout and sets up the sdc
> from the already declared sc.
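>
> In rough terms the setup amounts to something like this (a sketch of the
> idea, not the literal PR code; the property name and the conf/sc
> variables are placeholders):
>
> // user-supplied directory holding the mahout jars (placeholder name)
> val jarDir = new java.io.File(props.getProperty("mahout.jar.dir"))
> val mahoutJars = jarDir.listFiles().filter(_.getName.endsWith(".jar"))
>
> // Kryo settings that mahout's spark bindings expect
> conf.set("spark.serializer",
>   "org.apache.spark.serializer.KryoSerializer")
> conf.set("spark.kryo.registrator",
>   "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator")
>
> // once sc exists, wrap it as mahout's distributed context
> import org.apache.mahout.sparkbindings._
> implicit val sdc = sc2sdc(sc)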
>
> Based on my testing, that works in local and cluster mode.
>
> Thanks,
> tg
>
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>
>
> On Wed, Jun 1, 2016 at 12:48 PM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
>
> > On Wed, Jun 1, 2016 at 10:46 AM, Dmitriy Lyubimov <dlie...@gmail.com>
> > wrote:
> >
> > >
> > >
> > > On Wed, Jun 1, 2016 at 7:47 AM, Trevor Grant
> > > <trevor.d.gr...@gmail.com> wrote:
> > >
> > >>
> > >> Other approaches?
> > >>
> > >> For background, Zeppelin starts a Spark shell and we need to make
> > >> sure all of the required Mahout jars get loaded into the classpath
> > >> when Spark starts. The question is where all of these JARs live,
> > >> relative to the installation.
> > >>
> > >
> > > How does Zeppelin cope with extra dependencies for other interpreters
> > > (even Spark itself)? I guess we should follow the same practice there.
> > >
> > > Release independence of the location algorithm largely depends on the
> > > jar filters (again, see the filters in the spark bindings package). It
> > > is possible that the required artifacts may change, but that is not
> > > very likely (I don't think they have changed since 0.10), so it should
> > > be possible to build (Mahout) release-independent logic to locate,
> > > filter and assert the necessary jars.
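> > >
> > > In sketch form (the filter names here are illustrative, the real ones
> > > are in the spark bindings package), locate-filter-assert could be as
> > > simple as:
> > >
> > > // find each required mahout artifact by name prefix, failing fast
> > > // if any expected jar is missing from the directory
> > > def locateMahoutJars(dir: java.io.File): Seq[java.io.File] = {
> > >   val wanted = Seq("mahout-math-", "mahout-math-scala_",
> > >                    "mahout-spark_", "mahout-hdfs-")
> > >   val files = Option(dir.listFiles()).getOrElse(Array.empty[java.io.File])
> > >   wanted.map { prefix =>
> > >     files.find(f => f.getName.startsWith(prefix) &&
> > >                     f.getName.endsWith(".jar"))
> > >       .getOrElse(sys.error(s"missing $prefix*.jar under $dir"))
> > >   }
> > > }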
> > >
> >
> > PS this may change soon, though: if/when custom javacpp code is built,
> > we may want to keep all native things as separate release artifacts, as
> > they are basically treated as optionally available accelerators and may
> > or may not be properly loaded in all situations. Hence they may warrant
> > a separate jar vehicle.
> >
> > >
> > >
> > >>
> > >> Thanks for any feedback,
> > >> tg
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> Trevor Grant
> > >> Data Scientist
> > >> https://github.com/rawkintrevo
> > >> http://stackexchange.com/users/3002022/rawkintrevo
> > >> http://trevorgrant.org
> > >>
> > >> *"Fortunate is he, who is able to know the causes of things."
> -Virgil*
> > >>
> > >
> > >
> >
>
