Good idea, Jesse!

On Nov 8, 2016, at 14:45, Jesse Anderson <[email protected]> wrote:
> Moving this thread to the dev mailing list for discussion.
>
> On Tue, Nov 8, 2016 at 1:24 AM Jean-Baptiste Onofré <[email protected]> wrote:
>
>> Hi Jesse,
>>
>> Coder is not for type conversion, but for serialization.
>>
>> I'm using the same approach as you:
>>
>> https://github.com/jbonofre/beam-samples/blob/master/EventsByLocation/src/main/java/org/apache/beam/samples/EventsByLocation.java#L111
>>
>> with a SimpleFunction (that I can reuse in different MapElements calls).
>>
>> I had the same need as you in a different situation (like having a
>> PCollection<Foo> and wanting a PCollection<String> just by calling
>> toString() on Foo). I think it could be helpful to have a TypeConverter
>> like we have in Apache Camel.
>> A list of (implicit) TypeConverters can be present in the Pipeline
>> context as something like:
>>
>> Element Source Type -> Element Target Type -> TypeConverter
>>
>> (of course a user could add their own type converter with a
>> source/target type).
>>
>> Implicitly, when we have a PCollection<Source> and want a
>> PCollection<Target>, the type converter can be called.
>>
>> A TypeConverter could basically be a PTransform.
>>
>> Just thinking out loud ;)
>>
>> Regards
>> JB
>>
>> On 11/08/2016 12:56 AM, Jesse Anderson wrote:
>> > Is there a way to directly take a PCollection<KV> and make it a
>> > PCollection<String>? I need to make the PCollection a
>> > PCollection<String> before writing it out with TextIO.Write.
>> >
>> > I tried using:
>> > withCoder(KvCoder.of(StringDelegateCoder.of(String.class),
>> >     StringDelegateCoder.of(Long.class)))
>> >
>> > but that causes binary data to be written out by the KV coder.
>> >
>> > The only way appears to be a manual transform with:
>> > PCollection<String> stringCounts = counts.apply(MapElements
>> >     .via((KV<String, Long> count) ->
>> >         count.getKey() + ":" + count.getValue())
>> >     .withOutputType(TypeDescriptors.strings()));
>> >
>> > If this is missing, that manual step should be baked into the API.
>> > That should be something either in StringDelegateCoder or a new String
>> > transform. The new StringDelegateCoder method would take in any KV (or
>> > list types) and put in a specific String delimiter. The new transform
>> > would take in any type in a PCollection<T> and make it a
>> > PCollection<String> using a specific String delimiter.
>> >
>> > Thanks,
>> >
>> > Jesse
>>
>> --
>> Jean-Baptiste Onofré
>> [email protected]
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
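
To make the "Element Source Type -> Element Target Type -> TypeConverter" lookup from JB's message concrete, here is a minimal sketch in plain Java, without any Beam dependency. Everything in it is an assumption for illustration: the class name TypeConverterRegistry and the methods register and convert are hypothetical and are not part of Beam or Camel. It only shows the two-level map and the lookup. How such a converter would be wrapped in a PTransform and applied implicitly is left open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical registry (not a Beam or Camel API) that maps
 * source type -> target type -> converter function.
 */
final class TypeConverterRegistry {

    // Outer key: source element type. Inner key: target element type.
    private final Map<Class<?>, Map<Class<?>, Function<?, ?>>> converters = new HashMap<>();

    /** Registers a converter for one (source, target) pair, replacing any earlier one. */
    <S, T> void register(Class<S> source, Class<T> target,
                         Function<? super S, ? extends T> converter) {
        converters.computeIfAbsent(source, k -> new HashMap<>()).put(target, converter);
    }

    /**
     * Converts a value using the converter registered for its exact runtime class.
     * Superclasses and interfaces are not searched in this sketch.
     */
    @SuppressWarnings("unchecked")
    <S, T> T convert(S value, Class<T> target) {
        Map<Class<?>, Function<?, ?>> byTarget = converters.get(value.getClass());
        Function<S, T> converter =
                byTarget == null ? null : (Function<S, T>) byTarget.get(target);
        if (converter == null) {
            throw new IllegalArgumentException(
                    "No converter from " + value.getClass().getName()
                            + " to " + target.getName());
        }
        return converter.apply(value);
    }
}
```

A user-supplied converter would be added with something like registry.register(Long.class, String.class, v -> "count:" + v), after which registry.convert(42L, String.class) returns the formatted string. Keying on the exact runtime class keeps the sketch short. A real implementation would probably need to handle generic types such as KV<String, Long>, which a plain Class key cannot distinguish.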
