Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-03 Thread Thomas Weise
Amit,

Thanks for this pointer as well, CoderHelpers helps indeed!

Thomas

On Thu, Jun 2, 2016 at 12:51 PM, Amit Sela wrote:
> Oh sorry, of course I meant Thomas Groh in my previous email. But @Thomas
> Weise this example
> <https://github.com/apache/incubator-beam/blob/master/runners/spark/

Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-03 Thread Thomas Weise
Thanks, works like a charm! For such hidden gems there should be a Beam runner newbie guide ;-)

Thomas

On Thu, Jun 2, 2016 at 11:59 AM, Thomas Groh wrote:
> The Beam Model ensures that all PCollections have a Coder; the PCollection
> Coder is the standard way to materialize the elements of a
>

Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-02 Thread Amit Sela
Oh sorry, of course I meant Thomas Groh in my previous email. But @Thomas Weise, this example might help; it shows how the Spark runner uses Coders

Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-02 Thread Amit Sela
Thomas is right, though in my case, I encountered this issue when using Spark's new API that uses Encoders not just for serialization but also for "translating" the object into a schema of o

Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-02 Thread Thomas Groh
The Beam Model ensures that all PCollections have a Coder; the PCollection Coder is the standard way to materialize the elements of a PCollection[1][2]. Most SDK-provided classes that will need to be transferred across the wire have an associated coder, and some additional default datatypes have co
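
To illustrate the point about materializing elements through a coder rather than through reflection-based serialization, here is a minimal, self-contained sketch. It does not use the Beam API: `SimpleCoder`, `TimestampedString`, `TimestampedStringCoder`, and the `toByteArray`/`fromByteArray` helpers are all hypothetical names made up for this example, and the real Beam `Coder` has a different, richer contract. The only idea it tries to show is that once an element has been turned into bytes by its coder, a runner's own serializer only ever has to move a `byte[]` around.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class CoderRoundTrip {

    /** Hypothetical stand-in for a coder; NOT Beam's actual Coder interface. */
    interface SimpleCoder<T> {
        void encode(T value, OutputStream out) throws IOException;
        T decode(InputStream in) throws IOException;
    }

    /** Illustrative element type (an assumption): a payload plus a timestamp. */
    static final class TimestampedString {
        final String value;
        final long timestampMillis;

        TimestampedString(String value, long timestampMillis) {
            this.value = value;
            this.timestampMillis = timestampMillis;
        }
    }

    /** Writes each field explicitly, so no reflection on the element class is needed. */
    static final class TimestampedStringCoder implements SimpleCoder<TimestampedString> {
        @Override
        public void encode(TimestampedString element, OutputStream out) throws IOException {
            DataOutputStream data = new DataOutputStream(out);
            data.writeUTF(element.value);
            data.writeLong(element.timestampMillis);
            data.flush();
        }

        @Override
        public TimestampedString decode(InputStream in) throws IOException {
            DataInputStream data = new DataInputStream(in);
            String value = data.readUTF();
            long timestampMillis = data.readLong();
            return new TimestampedString(value, timestampMillis);
        }
    }

    /** Encodes a value to bytes with the given coder. */
    static <T> byte[] toByteArray(T value, SimpleCoder<T> coder) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            coder.encode(value, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decodes a value from bytes with the given coder. */
    static <T> T fromByteArray(byte[] bytes, SimpleCoder<T> coder) {
        try {
            return coder.decode(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TimestampedStringCoder coder = new TimestampedStringCoder();
        byte[] bytes = toByteArray(new TimestampedString("hello", 42L), coder);
        // Only the byte[] would need to cross the wire; decode it on the other side.
        TimestampedString copy = fromByteArray(bytes, coder);
        System.out.println(copy.value + " @ " + copy.timestampMillis);
    }
}
```

In a runner, the equivalent of `bytes` is what would be handed to Kryo or any other transport-level serializer, which sidesteps the need for that serializer to understand private inner classes.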

Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-02 Thread Thomas Weise
Hi Amit, Thanks for the help. I implemented the same serialization workaround for the PipelineOptions. Since every distributed runner will have to solve this, would it make sense to provide the serialization support along with the interface proxy? Here is the exception I get with WindowedVal

Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-01 Thread Amit Sela
Hi Thomas, Spark and the Spark runner are using Kryo for serialization and it seems to work just fine. What is your exact problem? Do you have a stack trace or message? I've hit an issue with Guava's ImmutableList/Map etc. and used https://github.com/magro/kryo-serializers for that. For PipelineOptions you can

Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-05-31 Thread Thomas Weise
Hi, I'm working on putting together a basic runner for Apache Apex and am hitting a couple of serialization-related issues when running tests. Apex uses Kryo for serialization by default (and Kryo can delegate to other serialization frameworks). The inner classes of WindowedValue are private and h