Re: machine learning API, common models

Tyler Akidau Tue, 17 May 2016 15:18:27 -0700

On Sat, May 14, 2016 at 4:53 AM Kavulya, Soila P <[email protected]>
wrote:


> Hi Tyler,
>
> Thank you so much for your feedback. I agree that starting with the
> high-level API is a good direction. We are interested in Python because it
> is the language that our data scientists are most familiar with. I think
> starting with Java would be the best approach, because the Python API can
> be a thin wrapper around the Java API.
>
> In Spark, the Scala, Java and Python APIs are identical. Flink does not
> have a Python API for ML pipelines at present.
>
> Could you point me to the updated runner API?
>

Sorry for the delay; I've been traveling. The runner API proposal is here:
https://docs.google.com/document/d/1bao-5B6uBuf-kwH1meenAuXXS0c9cBQ1B2J59I3FiyI/edit

-Tyler


>
> Soila
>
> -----Original Message-----
> From: Tyler Akidau [mailto:[email protected]]
> Sent: Friday, May 13, 2016 6:34 PM
> To: [email protected]
> Subject: Re: machine learning API, common models
>
> Hi Kam & Soila,
>
> Thanks a lot for writing this up. I ran the doc past some of the folks
> who've been doing ML work here at Google, and they were generally happy
> with the distillation of common methods in the doc. I'd be curious to hear
> what folks on the Flink and Spark runner sides think.
>
> To me, this seems like a good direction for a high-level API. Presumably,
> once a high-level API is in place, we could begin looking at what it would
> take to add lower-level ML algorithm support (e.g. iterative) to the Beam
> Model. Is this essentially what you're thinking?
>
> Some more specific questions/comments:
>
>    - Presumably you'd want to tackle this in Java first, since that's the
>    only language we currently support? Given that half of your examples
>    are in Python, I'm also assuming Python will be interesting once it's
>    available.
>
>    - Along those lines, what languages are represented in the capability
>    matrix? E.g. is Spark ML support as detailed there identical across
>    Java/Scala and Python?
>
>    - Have you thought about how this would tie in at the runner level,
>    particularly given the updated Runner API changes that are coming? I'm
>    assuming they'd be provided as composite transforms that (for now) would
>    have no default implementation, given the lack of low-level primitives
>    for ML algorithms, but am curious what your thoughts are there.
>
>    - I still don't fully understand how incremental updates due to model
>    drift would tie in at the API level. There's a comment thread in the doc
>    still open tracking this, so no need to comment here additionally. Just
>    pointing it out as one of the things that stands out to me as
>    potentially having API-level impacts and that doesn't seem 100% fleshed
>    out in the doc yet (though that admittedly may just be my limited
>    understanding at this point :-).
>
> -Tyler
>
>
>
>
> On Fri, May 13, 2016 at 10:48 AM Kam Kasravi <[email protected]> wrote:
>
> > Hi Tyler - my bad. Comments should be enabled now.
> >
> > On Fri, May 13, 2016 at 10:45 AM, Tyler Akidau <[email protected]>
> > wrote:
> >
> > > Thanks a lot, Kam. Can you please enable comment access on the doc? I
> > > seem to have view access only.
> > >
> > > -Tyler
> > >
> > > On Fri, May 13, 2016 at 9:54 AM Kam Kasravi <[email protected]>
> > > wrote:
> > >
> > > > Hi
> > > >
> > > > A number of readers have made comments on this topic recently. We
> > > > have created a document that does some analysis of common ML
> > > > models and related APIs. We hope this can drive an approach that
> > > > will result in an API, compatibility matrix and involvement from
> > > > the same groups that are implementing transformation runners
> > > > (Spark, Flink, etc.). We welcome comments here or in the document
> > > > itself.
> > > >
> > > >
> > > >
> > > > https://docs.google.com/document/d/17cRZk_yqHm3C0fljivjN66MbLkeKS1yjo4PBECHb-xA/edit?usp=sharing
> > > >
> > >
> >
>
