This is a thread moved over from the user mailing list.

I think there needs to be a way to convert a PCollection<KV> to a
PCollection<String>.

To do a minimal WordCount, you have to manually convert the KV to a String:
        p
                .apply(TextIO.Read.from("playing_cards.tsv"))
                .apply(Regex.split("\\W+"))
                .apply(Count.perElement())
                // Manual KV-to-String conversion:
                .apply(MapElements.via((KV<String, Long> count) ->
                            count.getKey() + ":" + count.getValue()
                        ).withOutputType(TypeDescriptors.strings()))
                .apply(TextIO.Write.to("output/stringcounts"));

This code really should be something like:
        p
                .apply(TextIO.Read.from("playing_cards.tsv"))
                .apply(Regex.split("\\W+"))
                .apply(Count.perElement())
                // Proposed replacement for the manual conversion:
                .apply(ToString.stringify())
                .apply(TextIO.Write.to("output/stringcounts"));

To summarize the discussion:

   - JA: Add a method to StringDelegateCoder to output any KV or list
   - JA and DH: Add a SimpleFunction that takes a type and runs toString()
   on it:
   class ToStringFn<InputT> extends SimpleFunction<InputT, String> {
       @Override
       public String apply(InputT input) {
           return input.toString();
       }
   }
   - JB: Add a general purpose type converter like in Apache Camel.
   - JA: Add Object support to TextIO.Write that would write out the
   toString of any Object.

My thoughts:

Is converting to a PCollection<String> mostly needed when you're using
TextIO.Write? Will a general-purpose transform only work in certain cases,
so that you'll normally have to write custom code to format the strings the
way you want them?

IMHO, it's yes to both. I'd prefer to add Object support to TextIO.Write or
a SimpleFunction that takes a delimiter as an argument. A SimpleFunction
that accepts a delimiter (and perhaps a prefix and suffix) should cover the
majority of formats and cases.
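
To make the delimiter idea concrete, here is a rough sketch. It is not Beam
API: DelimitedKvFn is a hypothetical name, and java.util.Map.Entry and
java.util.function.Function stand in for Beam's KV and SimpleFunction so the
example compiles without Beam on the classpath. A real version would
presumably extend SimpleFunction<KV<K, V>, String> instead.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.function.Function;

// Hypothetical formatter. Map.Entry stands in for Beam's KV and
// Function stands in for SimpleFunction (both are assumptions).
class DelimitedKvFn<K, V> implements Function<Map.Entry<K, V>, String> {
    private final String delimiter;
    private final String prefix;
    private final String suffix;

    // Delimiter only, with no prefix or suffix.
    DelimitedKvFn(String delimiter) {
        this(delimiter, "", "");
    }

    DelimitedKvFn(String delimiter, String prefix, String suffix) {
        this.delimiter = delimiter;
        this.prefix = prefix;
        this.suffix = suffix;
    }

    // Formats one pair as prefix + key + delimiter + value + suffix.
    @Override
    public String apply(Map.Entry<K, V> kv) {
        return prefix + kv.getKey() + delimiter + kv.getValue() + suffix;
    }

    public static void main(String[] args) {
        Map.Entry<String, Long> count = new SimpleEntry<>("ace", 4L);

        // Tab-separated output, e.g. for a .tsv file.
        System.out.println(new DelimitedKvFn<String, Long>("\t").apply(count));

        // Colon-delimited with brackets; prints "[ace:4]".
        System.out.println(
            new DelimitedKvFn<String, Long>(":", "[", "]").apply(count));
    }
}
```

With something like this, the WordCount conversion step would shrink to a
single apply that passes ":" as the delimiter, while anything needing a
fancier format would still fall back to custom code.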

Thanks,

Jesse
