The Beam portability framework will enable this in Java too; not sure we
can do much sooner than that!

On Fri, Feb 24, 2017 at 3:33 PM, Amit Sela <amitsel...@gmail.com> wrote:

> That's great! Many people have asked me about that and I'm glad to see this
> happening.
> Does anyone know if there's something in the works for the Java SDK
> (assuming I don't want to wait for Fn API support)?
>
> On Fri, Feb 24, 2017 at 8:44 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > Fantastic!
> >
> > That's a great addition, and it's awesome to see that with Beam!
> >
> > Regards
> > JB
> >
> > On 02/24/2017 02:51 AM, Robert Bradshaw wrote:
> > > One thing I'm really excited about with this library is that it allows
> > > one to more easily express transforms on columnar data (which is useful
> > > beyond just ML). For example, if your input elements have two fields "x"
> > > and "y" then you can write functions like
> > >
> > > def preprocessing_fn(inputs):
> > >     x_centered = tft.map(lambda x, mean: x - mean, inputs['x'],
> > >                          tft.mean(inputs['x']))
> > >     y_normalized = tft.scale_to_0_1(inputs['y'])
> > >     return {
> > >         'x_centered': x_centered,
> > >         'y_normalized': y_normalized,
> > >         'x_centered_times_y_normalized': tft.map(
> > >             operations.mul, x_centered, y_normalized)
> > >     }
> > >
> > > # Read PCollection of dicts with 'x' and 'y' keys and numeric values
> > > input = p | Read(...)
> > >
> > > # output will contain dicts with 'x_centered', 'y_normalized', and
> > > # 'x_centered_times_y_normalized' keys with the expected values, and fn
> > > # can be used to transform other data using the statistics (mean, mins,
> > > # and maxes) without re-analysis.
> > > output, fn = ((input, schema)
> > >               | beam_impl.AnalyzeAndTransformDataset(preprocessing_fn))
> > >
> > > This automatically injects the relevant global aggregations (which can
> > > be interleaved) and builds up tensorflow graphs to apply the
> > > transformations very efficiently.
> > >
> > >
> > > On Thu, Feb 23, 2017 at 4:55 PM, Davor Bonaci <da...@apache.org> wrote:
> > >
> > >> Beam and TensorFlow coming together -- a big deal for us!
> > >>
> > >> On Thu, Feb 23, 2017 at 3:49 PM, Ahmet Altay <al...@google.com.invalid>
> > >> wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> Yesterday, the TensorFlow community announced the new tf.Transform
> > >>> library [1]. It allows users to define pre-processing pipelines and
> > >>> run them using large-scale data processing frameworks, and it is
> > >>> specifically designed to work with Apache Beam. It is great to see the
> > >>> Python SDK getting a larger ecosystem and increased usage.
> > >>>
> > >>> Also worth mentioning: PMC member Robert Bradshaw was one of the
> > >>> contributors to this new library.
> > >>>
> > >>> Thank you,
> > >>> Ahmet
> > >>>
> > >>> [1] https://research.googleblog.com/2017/02/preprocessing-for-machine-learning-with.html
> > >>>
> > >>
> > >
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
>

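The analyze-then-transform split described in Robert's quoted example can be
illustrated without TensorFlow or Beam. What follows is only a minimal
plain-Python sketch of the idea, not how tf.Transform is implemented: the
helper names analyze() and make_transform() are hypothetical, and the handling
of a constant 'y' column is an assumption made for this sketch. It shows one
full pass that computes the global statistics (mean of x, min and max of y),
followed by a per-element function that reuses those statistics on new data
without re-analysis.

```python
# Illustrative only: plain Python, no TensorFlow or Beam.
# analyze() and make_transform() are hypothetical names, not tf.Transform APIs.


def analyze(rows):
    """Full pass over the data: compute the global statistics."""
    xs = [row['x'] for row in rows]
    ys = [row['y'] for row in rows]
    return {
        'x_mean': sum(xs) / len(xs),
        'y_min': min(ys),
        'y_max': max(ys),
    }


def make_transform(stats):
    """Build a per-element function that closes over the fixed statistics."""
    y_range = stats['y_max'] - stats['y_min']

    def transform(row):
        x_centered = row['x'] - stats['x_mean']
        # Assumption for this sketch: a constant 'y' column maps to 0.0.
        if y_range:
            y_normalized = (row['y'] - stats['y_min']) / y_range
        else:
            y_normalized = 0.0
        return {
            'x_centered': x_centered,
            'y_normalized': y_normalized,
            'x_centered_times_y_normalized': x_centered * y_normalized,
        }

    return transform


rows = [
    {'x': 1.0, 'y': 10.0},
    {'x': 2.0, 'y': 20.0},
    {'x': 3.0, 'y': 30.0},
]

stats = analyze(rows)               # analysis phase: needs the whole dataset
fn = make_transform(stats)          # reusable transform, statistics frozen
output = [fn(row) for row in rows]  # transform phase: purely per element

# The same fn applies to unseen data without recomputing the statistics.
new_output = fn({'x': 5.0, 'y': 15.0})
```

In this sketch the analysis phase is the only step that needs to see the whole
dataset; the returned fn is a pure per-element function, which is what makes
it reusable on other data.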