What I am asking is this: for algorithms that mix engine-specific operations
(such as aggregation) with DSL code, what is the best way of handling
them?

i) Should we add the distributed operations to the Mahout codebase, as
proposed in #62?

ii) Should we have [engine]-ml modules (like spark-bindings and
h2o-bindings) where we can mix the DSL and engine-specific code?

Picking i) has the advantage that an ML algorithm is written once and can
then run on alternative engines, but it requires wrapping or duplicating
existing distributed operations.

Picking ii) has the advantage that we avoid writing distributed operations,
but since we would be mixing the DSL and engine-specific code, an ML
algorithm written for one engine would not be available for the others.
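
To make the trade-off concrete, here is a minimal sketch in plain Python. It is not Mahout code: `Engine`, `LocalEngine`, `ChunkedEngine`, `map_partitions`, `column_means` and `column_means_chunked` are all hypothetical names made up for illustration, and `ChunkedEngine` is only a toy stand-in for a distributed backend. Under those assumptions, `column_means` shows the shape of option i) (written once against a shared `aggregate` operation that every engine has to wrap), while `column_means_chunked` shows option ii) (written directly against one engine's own primitive).

```python
from abc import ABC, abstractmethod
from functools import reduce


class Engine(ABC):
    """Hypothetical engine-neutral interface: the operation option i) would add."""

    @abstractmethod
    def aggregate(self, rows, zero, seq_op, comb_op):
        """Fold rows with seq_op, then merge partial results with comb_op."""


class LocalEngine(Engine):
    """Toy in-memory backend: a single fold, nothing to combine."""

    def aggregate(self, rows, zero, seq_op, comb_op):
        return reduce(seq_op, rows, zero)


class ChunkedEngine(Engine):
    """Toy stand-in for a distributed backend (an assumption, not a real engine)."""

    def __init__(self, partitions=2):
        self.partitions = partitions

    def _split(self, rows):
        rows = list(rows)
        size = max(1, -(-len(rows) // self.partitions))  # ceiling division
        return [rows[i:i + size] for i in range(0, len(rows), size)]

    def map_partitions(self, rows, fn):
        # Engine-specific primitive: only this backend exposes it.
        return [fn(part) for part in self._split(rows)]

    def aggregate(self, rows, zero, seq_op, comb_op):
        # The wrapping cost of option i): express the shared operation
        # in terms of the engine's own primitive.
        partials = self.map_partitions(rows, lambda p: reduce(seq_op, p, zero))
        return reduce(comb_op, partials, zero)


def column_means(engine, rows):
    """Option i): written once, runs on any Engine."""
    rows = list(rows)
    zero = ((0.0,) * len(rows[0]), 0)  # immutable (column sums, row count)

    def seq_op(acc, row):
        return tuple(s + x for s, x in zip(acc[0], row)), acc[1] + 1

    def comb_op(a, b):
        return tuple(x + y for x, y in zip(a[0], b[0])), a[1] + b[1]

    sums, n = engine.aggregate(rows, zero, seq_op, comb_op)
    return [s / n for s in sums]


def column_means_chunked(engine, rows):
    """Option ii): tied to ChunkedEngine.map_partitions, not portable."""
    partials = engine.map_partitions(
        rows, lambda part: (tuple(map(sum, zip(*part))), len(part))
    )
    n = sum(count for _, count in partials)
    sums = [sum(col) for col in zip(*(s for s, _ in partials))]
    return [s / n for s in sums]
```

With this sketch, `column_means(LocalEngine(), rows)` and `column_means(ChunkedEngine(2), rows)` give the same result from one implementation, whereas `column_means_chunked` only works where `map_partitions` exists.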

I just wanted to hear some opinions.

Gokhan

On Thu, Feb 5, 2015 at 4:11 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> I took it that Gokhan had objections himself, based on his comments, if we
> are talking about #62.
>
> He also expressed concerns about computing GSGD, but I suspect it can still
> be algebraically computed.
>
> On Wed, Feb 4, 2015 at 5:52 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
>
> > BTW Ted and Andrew have both expressed interest in the distributed
> > aggregation stuff. It sounds like we are agreeing that
> > non-algebra, computation-method type things can be engine-specific.
> >
> > So does anyone have an objection to Gokhan pushing his PR?
> >
> > On Feb 4, 2015, at 2:20 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> >
> > On Wed, Feb 4, 2015 at 1:51 PM, Andrew Palumbo <ap....@outlook.com> wrote:
> >
> > >
> > > My thought was not to bring primitive engine-specific aggregators,
> > > combiners, etc. into math-scala.
> > >
> >
> > Yeah. +1. I would like to support that as an experiment, see where it
> > goes. Clearly some distributed use cases are simple enough while also
> > pervasive enough.
> >
> >
>
