Ditto, thanks for reaching out, Jim; grateful for your offer. We are cutting
a 0.13 release in the next couple of weeks, and I know we could use help
with testing/signing/etc.

Best
Andrew

On Thu, Feb 9, 2017 at 10:48 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> Jim, let me start by stating it's an (unexpected on my side) honor. Are you
> willing to get hands-on at this point in numerical problems (or have
> resources that can get hands-on)?
>
> Short modern Mahout story (as short as it is possible to be short)
>
> Most nagging problem: lack of support from industry and/or academia. We have
> capable committers but less capable backers in terms of willingness to
> sanction contributions.
>
> Current Mahout development goes two ways: (a) the platform (aka `samsara`);
> and (b) useful, preferably end-to-end use case scenarios, or just
> methodology implementations. Note that while (b) is intended to use (a) (and
> gain backend portability as a bonus), it is not strictly required as long
> as the backend-specific code can be fairly easily ported to other
> backends. Still, if we come across a need for custom code, we try to
> analyze whether it is something that might be a fairly common
> abstraction, so we can add it to the list of formalisms we have in the
> platform and avoid repetition in the future. A platform primer can be found
> on the site; I won't be getting into that now.
>
> In the platform, problem #1 currently is performance. Not that it
> is generally bad, but some pieces are limited by the backends. We did some
> in-memory work to integrate better-performing backends there, but the effort
> is constrained by our immediate capacity to contribute, and the most
> glaring issue (as one visitor duly noted in JIRA) is that the
> distributed backends we are trying to run on are severely limited in terms
> of interconnected algebraic problems. We have ideas about what to do here,
> though.
>
> It is precisely the distributed performance on interconnected numerical
> problems of the current backends (Flink, Spark) that precludes Mahout from
> being a practical platform for implementing deep learning at scale, for
> example. I suppose in-memory performance should be OK for that purpose once
> we have added GPU support and DL-specific GPU primitives. The in-memory
> improvements do not yet cover everything that would be ideal, but there has
> been some notable progress there.
>
> With methodologies, well, there's no single most pressing problem; it
> is really just defined by the practical problem one has at hand. Currently,
> Trevor does most of this outstanding work. It simply, and preferably,
> should be edgier than what most distributed packages offer.
>
> E.g., decent-to-good Bayesian optimization for hyperparameters. Or, say, I
> have been suggesting for a few years that we experiment with LRFM
> recommendation techniques, as they significantly expand the types of
> predictors the method can take, and their treatment, compared to things like
> COO or implicit-feedback behavior-based recommenders. Another example: there
> is no good coverage in clustering in terms of _type_ of clustering --
> mixtures, density, spectral, not just traditional centroid-type methods.
> Visualization techniques, even as simple as 2D density estimators for big
> datasets, are also in demand. Generally speaking, industry has moved far
> ahead of what is commonly available in open source software in terms of
> visualization approaches. Bottom line, the only guidance I see here is:
> "don't be trivial. Seek a unique value proposition". But the main guiding
> principle so far has been people's pragmatism: "I have an actual production
> use case and/or very specific requirements for it, I want to use
> methodology X for that, and I don't seem to be able to find it elsewhere
> under the management of a distributed platform Y".
>
> -d
>
>
> On Thu, Feb 9, 2017 at 6:34 AM, Jim Jagielski <j...@jagunet.com> wrote:
>
> >
> > > On Feb 8, 2017, at 11:50 PM, Suneel Marthi <smar...@apache.org> wrote:
> > >
> > > Curious JimJag,
> > > Did some dude from CapitalOne poke u about Mahout
> > >
> >
> > Not really, no...
> >
>
