Well... I think it is an issue that has to do with figuring out how to
*avoid* import and export as much as possible.


On Tue, Apr 15, 2014 at 6:36 PM, Pat Ferrel <[email protected]> wrote:

> Which is why it’s an import/export issue.
>
> On Apr 15, 2014, at 5:48 PM, Ted Dunning <[email protected]> wrote:
>
> On Tue, Apr 15, 2014 at 10:58 AM, Pat Ferrel <[email protected]>
> wrote:
>
> > The statement "There is not, nor do i think there will be a way to
> > run this stuff with CLI" seems unduly misleading. Really, does anyone
> > second this?
> >
> > There will be Scala scripts to drive this stuff, and yes, even from the
> > CLI. Do you imagine that every Mahout USER will be a Scala + Mahout DSL
> > programmer? That may be fine for committers, but users will be PHP devs,
> > Ruby devs, Python or Java devs, maybe even a few C# devs. I think you
> > are confusing Mahout DEVS with USERS. Few users are R devs moving into
> > production work; they are production engineers moving into ML who want
> > a black box. They will need a language-agnostic way to drive Mahout.
> > Making statements like this only confuses potential users and drives
> > them away to no purpose. I’m happy for the nascent Mahout-Scala shell,
> > but it’s not in the typical user’s world view.
> >
>
> Yes, ultimately there may need to be command line programs of various
> sorts, but the fact is, we need to make sure that we avoid files as the API
> for moving large amounts of data. That means that we have to have some way
> of controlling the persistence of in-memory objects and in many cases, that
> means that processing chains will not typically be integrated at the level
> of command line programs.
>
> Dmitriy's comment about R is apropos.  You can put scripts together for
> various end-user purposes, but you don't have a CLI for every R command.
> Nor for every Perl, Python, or PHP command either.
>
> To the extent we have in-memory persistence across the lifetime of
> multiple driver programs, a sort of CLI interface will be possible.  I
> know that h2o will do that, but I am not entirely clear on the lifetime of
> RDDs in Spark relative to Mahout DSL programs.  Regardless of possibility,
> I don't expect a CLI interface to be the primary integration path for these
> new capabilities.
>
>
