Re: PCollection to PCollection Conversion

Dan Halperin Thu, 29 Dec 2016 15:18:07 -0800

On Thu, Dec 29, 2016 at 2:10 PM, Jesse Anderson <[email protected]>
wrote:


> I agree MapElements isn't hard to use. I think there is a demand for this
> built-in conversion.
>
> My thought on the formatter is that, worst case, we could do runtime type
> checking. It would be ugly and not as performant, but it should work. As
> we've said, we'd point them to MapElements for better code. We'd write the
> JavaDoc accordingly.
>

I think it will be good to see these proposals in PR form. I would stay far
away from reflection and varargs if possible, but properly-typed bits of
code (possibly exposed as SerializableFunctions in ToString?) would
probably make sense.

In the short-term, I can't find anyone arguing against a ToString.create()
that simply does input.toString().

To get started, how about we ask Vikas to clean up the PR to be more
future-proof for now? Aka make `ToString` itself not a PTransform,  but
instead ToString.create() returns ToString.Default which is a private class
implementing what ToString is now (PTransform<T, String>, wrapping
MapElements).

Then we can send PRs adding new features to that.
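
A rough sketch of that shape, in plain Java so it runs without Beam on the classpath. This is an illustration under stated assumptions, not the code in the PR: java.util.function.Function stands in for PTransform/MapElements, and the class name ToStringSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in: Function<T, String> plays the role that
// PTransform<T, String> would play in the real SDK.
final class ToStringSketch {
  // Not instantiable: the outer class is only a namespace for factories.
  private ToStringSketch() {}

  // Callers see only the factory. The implementing class stays private,
  // so further factories can be added later without changing this type.
  static <T> Function<T, String> create() {
    return new Default<>();
  }

  // Private implementation of the simplest behaviour: input.toString().
  private static final class Default<T> implements Function<T, String> {
    @Override
    public String apply(T input) {
      return input.toString();
    }
  }

  public static void main(String[] args) {
    List<Object> elements = Arrays.asList(1, "two", 3.0);
    List<String> strings =
        elements.stream().map(ToStringSketch.<Object>create()).collect(Collectors.toList());
    System.out.println(strings); // prints [1, two, 3.0]
  }
}
```

Because callers only ever name the factory method, swapping Default for something richer later would not break them.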

> IME and to Ben's point, these will mostly be used in development. Some of
> our assumptions will break down when programmers aren't the ones using
> Beam. I can see from the user traffic already that not everyone using Beam
> is a programmer and they'll need classes like this to be productive.


> On Thu, Dec 29, 2016 at 1:46 PM Dan Halperin <[email protected]>
> wrote:
>
> On Thu, Dec 29, 2016 at 1:36 PM, Jesse Anderson <[email protected]>
> wrote:
>
> > I prefer JB's take. I think there should be three overloaded methods on
> the
> > class. I like Vikas' name ToString. The methods for a simple conversion
> > should be:
> >
> > ToString.strings() - Outputs the .toString() of the objects in the
> > PCollection
> > ToString.strings(String delimiter) - Outputs the .toString() of KVs,
> Lists,
> > etc with the delimiter between every entry
> > ToString.formatted(String format) - Outputs the formatted
> > <https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html>
> > string
> > with the object passed in. For objects made up of different parts like
> KVs,
> > each one is passed in as separate toString() of a varargs.
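
For reference, the "runtime type checking" fallback for a formatted(String format) method might look roughly like the sketch below. It is plain Java rather than Beam code: Map.Entry stands in for KV, and FormattedSketch and its method are hypothetical names. A format string that does not match the element's shape only fails at runtime, which is the drawback raised elsewhere in this thread.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: inspect the element at runtime and spread its parts
// over the format arguments. Map.Entry stands in for Beam's KV.
final class FormattedSketch {
  private FormattedSketch() {}

  static String formatted(String format, Object element) {
    if (element instanceof Map.Entry) {
      // A key/value pair becomes two format arguments.
      Map.Entry<?, ?> entry = (Map.Entry<?, ?>) element;
      return String.format(
          format, String.valueOf(entry.getKey()), String.valueOf(entry.getValue()));
    }
    if (element instanceof Iterable) {
      // Each item of an iterable becomes one format argument.
      List<String> parts = new ArrayList<>();
      for (Object item : (Iterable<?>) element) {
        parts.add(String.valueOf(item));
      }
      return String.format(format, parts.toArray());
    }
    // Anything else is passed as a single argument.
    return String.format(format, String.valueOf(element));
  }

  public static void main(String[] args) {
    System.out.println(formatted("%s: %s", new SimpleEntry<>("ace", 4L))); // prints ace: 4
    System.out.println(formatted("%s|%s|%s", Arrays.asList(1, 2, 3)));     // prints 1|2|3
  }
}
```

If the format has more specifiers than the element has parts, String.format throws MissingFormatArgumentException, so mistakes surface per element rather than at compile time.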
> >
>
> Riffing a little, with some types:
>
> ToString.<T>of() -- PTransform<T, String> that is equivalent to a ParDo
> that takes in a T and outputs T.toString().
>
> ToString.<K,V>kv(String delimiter) -- PTransform<KV<K, V>, String> that is
> equivalent to a ParDo that takes in a KV<K,V> and outputs
> kv.getKey().toString() + delimiter + kv.getValue().toString()
>
> ToString.<T>iterable(String delimiter) -- PTransform<? extends Iterable<T>,
> String> that is equivalent to a ParDo that takes in an Iterable<T> and
> outputs the iterable[0] + delimiter + iterable[1] + delimiter + ... +
> delimiter + iterable[N-1]
>
> ToString.<T>custom(SerializableFunction<T, String> formatter) ?
>
> The last one is just MapElements.via, except you don't need to set the
> output type.
>
> I don't see a way to make the generic .formatted() that you propose that
> just works with anything "made of different parts".
>
> I think adding too many overloads beyond "of" and "custom" is opening
> up a Pandora's Box. The KV one might want to have left and right
> delimiters, might want to take custom formatters for K and V, etc. etc. The
> iterable one might want to have a special configuration for an empty
> iterable. So I'm inclined towards simplicity with the awareness that
> MapElements.via is just not that hard to use.
>
> Dan
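
The four typed factories riffed on above could be sketched as follows. This is a plain-Java analogue for illustration only: Function stands in for PTransform, Map.Entry stands in for KV, and the class name ToStringVariants is hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Function;

// Hypothetical plain-Java analogue of the typed factories: Function stands
// in for PTransform and Map.Entry stands in for Beam's KV.
final class ToStringVariants {
  private ToStringVariants() {}

  // of(): outputs element.toString().
  static <T> Function<T, String> of() {
    return Object::toString;
  }

  // kv(delimiter): key.toString() + delimiter + value.toString().
  static <K, V> Function<Map.Entry<K, V>, String> kv(String delimiter) {
    return entry -> entry.getKey().toString() + delimiter + entry.getValue().toString();
  }

  // iterable(delimiter): items joined by the delimiter, none after the last.
  static <T> Function<Iterable<T>, String> iterable(String delimiter) {
    return items -> {
      StringBuilder sb = new StringBuilder();
      Iterator<T> it = items.iterator();
      while (it.hasNext()) {
        sb.append(it.next());
        if (it.hasNext()) {
          sb.append(delimiter);
        }
      }
      return sb.toString();
    };
  }

  // custom(formatter): the caller supplies the whole conversion.
  static <T> Function<T, String> custom(Function<T, String> formatter) {
    return formatter;
  }

  public static void main(String[] args) {
    System.out.println(
        ToStringVariants.<String, Long>kv(":").apply(new SimpleEntry<>("ace", 4L)));
    // prints ace:4
    System.out.println(
        ToStringVariants.<String>iterable(",").apply(Arrays.asList("one", "two", "three")));
    // prints one,two,three
  }
}
```

Each factory is fully typed, so a kv transform applied to a non-KV input is rejected at compile time rather than at runtime.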
>
>
> >
> > I think doing these three methods would cover every simple and advanced
> > "simple conversions." As JB says, we'll need other specific converters
> for
> > other formats like XML.
> >
> > I'd really like to see this class in the next version of Beam. What does
> > everyone think of the class name, methods name, and method operations so
> we
> > can have Vikas finish up?
> >
> > Thanks,
> >
> > Jesse
> >
> > On Wed, Dec 28, 2016 at 12:28 PM Jean-Baptiste Onofré <[email protected]>
> > wrote:
> >
> > > Hi Vikas,
> > >
> > > did you take a look at:
> > >
> > >
> > > https://github.com/jbonofre/beam/tree/DATAFORMAT/sdks/
> > java/extensions/dataformat
> > >
> > > You can see KV2String and ToString could be part of this extension.
> > > I'm also using JAXB for XML and Jackson for JSON
> > > marshalling/unmarshalling. I'm planning to deal with Avro
> > (IndexedRecord).
> > >
> > > Regards
> > > JB
> > >
> > > On 12/28/2016 08:37 PM, Vikas Kedigehalli wrote:
> > > > Hi All,
> > > >
> > > >   Not being aware of the discussion here, I sent out a PR
> > > > <https://github.com/apache/beam/pull/1704> but JB and others
> directed
> > > me to
> > > > this thread. Having converted PCollection<T> to PCollection<String>
> > > several
> > > > times, I feel something like 'ToString' transform is common enough to
> > be
> > > > part of the core. What do you all think?
> > > >
> > > > Also, if someone else is already working on or interested in tackling
> > > this,
> > > > then I am happy to discard the PR.
> > > >
> > > > Regards,
> > > > Vikas
> > > >
> > > > On Tue, Dec 13, 2016 at 1:56 AM, Amit Sela <[email protected]>
> > wrote:
> > > >
> > > >> It seems that there were a lot of good points raised here, and I
> tend
> > to
> > > >> agree that something as trivial and lean as "ToString" should be a
> > part
> > > of
> > > >> core.
> > > >> I'm particularly fond of makeString(prefix, toString, suffix) in
> > various
> > > >> combinations (Scala-like).
> > > >> For "fromString", I think JB has a good point leveraging JAXB and
> > > Jackson -
> > > >> though I think this should be in extensions as it is not as lean as
> > > >> toString.
> > > >>
> > > >> Thanks,
> > > >> Amit
> > > >>
> > > >> On Wed, Nov 30, 2016 at 5:13 AM Jean-Baptiste Onofré <
> [email protected]
> > >
> > > >> wrote:
> > > >>
> > > >>> Hi Jesse,
> > > >>>
> > > >>> yes, I started something there (using JAXB and Jackson). Let me
> > polish
> > > >>> and push.
> > > >>>
> > > >>> Regards
> > > >>> JB
> > > >>>
> > > >>> On 11/29/2016 10:00 PM, Jesse Anderson wrote:
> > > >>>> I went through the string conversions. Do you have an example of
> > > >> writing
> > > >>>> out XML/JSON/etc too?
> > > >>>>
> > > >>>> On Tue, Nov 29, 2016 at 3:46 PM Jean-Baptiste Onofré <
> > [email protected]
> > > >
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Hi Jesse,
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>> https://github.com/jbonofre/incubator-beam/tree/
> > DATAFORMAT/sdks/java/
> > > >> extensions/dataformat
> > > >>>>>
> > > >>>>> it's very simple and stupid and of course not complete at all (I
> > have
> > > >>>>> other commits but not merged as they need some polishing), but as
> I
> > > >>>>> said, it's a base of discussion.
> > > >>>>>
> > > >>>>> Regards
> > > >>>>> JB
> > > >>>>>
> > > >>>>> On 11/29/2016 09:23 PM, Jesse Anderson wrote:
> > > >>>>>> @jb Sounds good. Just let us know once you've pushed.
> > > >>>>>>
> > > >>>>>> On Tue, Nov 29, 2016 at 2:54 PM Jean-Baptiste Onofré <
> > > >> [email protected]>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Good point Eugene.
> > > >>>>>>>
> > > >>>>>>> Right now, it's a DoFn collection to experiment a bit (a pure
> > > >>>>>>> extension). It's pretty stupid ;)
> > > >>>>>>>
> > > >>>>>>> But, you are right, depending on the direction of such extension,
> it
> > > >>> could
> > > >>>>>>> cover more use cases (even if it's not my first intention ;)).
> > > >>>>>>>
> > > >>>>>>> Let me push the branch (pretty small) as an illustration, and
> in
> > > the
> > > >>>>>>> mean time, I'm preparing a document (more focused on the use
> > > cases).
> > > >>>>>>>
> > > >>>>>>> WDYT ?
> > > >>>>>>>
> > > >>>>>>> Regards
> > > >>>>>>> JB
> > > >>>>>>>
> > > >>>>>>> On 11/29/2016 08:47 PM, Eugene Kirpichov wrote:
> > > >>>>>>>> Hi JB,
> > > >>>>>>>> Depending on the scope of what you want to ultimately
> accomplish
> > > >> with
> > > >>>>>>> this
> > > >>>>>>>> extension, I think it may make sense to write a proposal
> > document
> > > >> and
> > > >>>>>>>> discuss it.
> > > >>>>>>>> If it's just a collection of utility DoFn's for various
> > > >> well-defined
> > > >>>>>>>> source/target format pairs, then that's probably not needed,
> but
> > > if
> > > >>>>> it's
> > > >>>>>>>> anything more, then I think it is.
> > > >>>>>>>> That will help avoid a lot of churn if people propose
> reasonable
> > > >>>>>>>> significant changes.
> > > >>>>>>>>
> > > >>>>>>>> On Tue, Nov 29, 2016 at 11:15 AM Jean-Baptiste Onofré <
> > > >>> [email protected]
> > > >>>>>>
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> By the way Jesse, I gonna push my DATAFORMAT branch on my
> > github
> > > >>> and I
> > > >>>>>>>>> will post on the dev mailing list when done.
> > > >>>>>>>>>
> > > >>>>>>>>> Regards
> > > >>>>>>>>> JB
> > > >>>>>>>>>
> > > >>>>>>>>> On 11/29/2016 07:01 PM, Jesse Anderson wrote:
> > > >>>>>>>>>> I want to bring this thread back up since we've had time to
> > > think
> > > >>>>> about
> > > >>>>>>>>> it
> > > >>>>>>>>>> more and make a plan.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I think a format-specific converter will be more time
> > consuming
> > > >>> task
> > > >>>>>>> than
> > > >>>>>>>>>> we originally thought. It'd have to be a writer that takes
> > > >> another
> > > >>>>>>> writer
> > > >>>>>>>>>> as a parameter.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I think a string converter can be done as a simple
> transform.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I think we should start with a simple string converter and
> > plan
> > > >>> for a
> > > >>>>>>>>>> format-specific writer.
> > > >>>>>>>>>>
> > > >>>>>>>>>> What are your thoughts?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Thanks,
> > > >>>>>>>>>>
> > > >>>>>>>>>> Jesse
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Thu, Nov 10, 2016 at 10:33 AM Jesse Anderson <
> > > >>>>> [email protected]
> > > >>>>>>>>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> I was thinking about what the outputs would look like last
> > > >> night. I
> > > >>>>>>>>>> realized that more complex formats like JSON and XML may or
> > may
> > > >> not
> > > >>>>>>>>> output
> > > >>>>>>>>>> the data in a valid format.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Doing a direct conversion on unbounded collections would
> work
> > > >> just
> > > >>>>>>> fine.
> > > >>>>>>>>>> They're self-contained. For writing out bounded collections,
> > > >> that's
> > > >>>>>>> where
> > > >>>>>>>>>> we'll hit the issues. This changes the uber conversion
> > transform
> > > >>>>> into a
> > > >>>>>>>>>> transform that needs to be a writer.
> > > >>>>>>>>>>
> > > >>>>>>>>>> If a transform executes a JSON conversion on a per element
> > > basis,
> > > >>>>> we'd
> > > >>>>>>>>> get
> > > >>>>>>>>>> this:
> > > >>>>>>>>>> {
> > > >>>>>>>>>> "key": "value"
> > > >>>>>>>>>> }, {
> > > >>>>>>>>>> "key": "value"
> > > >>>>>>>>>> },
> > > >>>>>>>>>>
> > > >>>>>>>>>> That isn't valid JSON.
> > > >>>>>>>>>>
> > > >>>>>>>>>> The conversion transform would need to do several
> things
> > > >> when
> > > >>>>>>>>> writing
> > > >>>>>>>>>> out a file. It would need to add brackets for an array. Now
> we
> > > >>> have:
> > > >>>>>>>>>> [
> > > >>>>>>>>>> {
> > > >>>>>>>>>> "key": "value"
> > > >>>>>>>>>> }, {
> > > >>>>>>>>>> "key": "value"
> > > >>>>>>>>>> },
> > > >>>>>>>>>> ]
> > > >>>>>>>>>>
> > > >>>>>>>>>> We still don't have valid JSON. We have to remove the last
> > comma
> > > >> or
> > > >>>>>>> have
> > > >>>>>>>>>> the uber transform start putting in the commas, except for
> the
> > > >> last
> > > >>>>>>>>> element.
> > > >>>>>>>>>>
> > > >>>>>>>>>> [
> > > >>>>>>>>>> {
> > > >>>>>>>>>> "key": "value"
> > > >>>>>>>>>> }, {
> > > >>>>>>>>>> "key": "value"
> > > >>>>>>>>>> }
> > > >>>>>>>>>> ]
> > > >>>>>>>>>>
> > > >>>>>>>>>> Only by doing this do we have valid JSON.
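
The bracket-and-comma handling described above can be sketched in plain Java as below. JsonArrayAssembly is a hypothetical name, and the sketch assumes every element has already been serialized to a JSON value and that all elements are visible in one place, which is the reason a per-element transform alone cannot do this.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical sketch: wrap already-serialized JSON values in one array.
final class JsonArrayAssembly {
  private JsonArrayAssembly() {}

  // StringJoiner adds the opening bracket, a comma between elements only
  // (never after the last one), and the closing bracket.
  static String toJsonArray(Iterable<String> jsonElements) {
    StringJoiner joiner = new StringJoiner(",", "[", "]");
    for (String element : jsonElements) {
      joiner.add(element);
    }
    return joiner.toString();
  }

  public static void main(String[] args) {
    List<String> perElement =
        Arrays.asList("{\"key\": \"value\"}", "{\"key\": \"value\"}");
    System.out.println(toJsonArray(perElement));
    // prints [{"key": "value"},{"key": "value"}]
  }
}
```

An empty input yields [], which is still a valid JSON document.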
> > > >>>>>>>>>>
> > > >>>>>>>>>> I'd argue we'd have a similar issue with XML. Some parsers
> > > >> require
> > > >>> a
> > > >>>>>>> root
> > > >>>>>>>>>> element for everything. The uber transform would have to put
> > the
> > > >>> root
> > > >>>>>>>>>> element tags at the beginning and end of the file.
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Wed, Nov 9, 2016 at 11:36 PM Manu Zhang <
> > > >>> [email protected]>
> > > >>>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> I would love to see a lean core and abundant Transforms at
> the
> > > >> same
> > > >>>>>>> time.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Maybe we can look at what Confluent <
> > > >>> https://github.com/confluentinc
> > > >>>>>>
> > > >>>>>>>>> does
> > > >>>>>>>>>> for kafka-connect. They have official extensions support for
> > > >> JDBC,
> > > >>>>> HDFS
> > > >>>>>>>>> and
> > > >>>>>>>>>> ElasticSearch under https://github.com/confluentinc. They
> put
> > > >> them
> > > >>>>>>> along
> > > >>>>>>>>>> with other community extensions on
> > > >>>>>>>>>> https://www.confluent.io/product/connectors/ for
> visibility.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Although not a commercial company, can we have a GitHub user
> > > like
> > > >>>>>>>>>> beam-community to host projects we build around beam but not
> > > >>> suitable
> > > >>>>>>> for
> > > >>>>>>>>>> https://github.com/apache/incubator-beam. In the future, we
> > may
> > > >>> have
> > > >>>>>>>>>> beam-algebra like http://github.com/twitter/algebird for
> > > algebra
> > > >>>>>>>>> operations
> > > >>>>>>>>>> and beam-ml / beam-dl for machine learning / deep learning.
> > > Also,
> > > >>>>> there
> > > >>>>>>>>>> will be beam related projects elsewhere maintained by
> > other
> > > >>>>>>>>>> communities. We can put all of them on the beam-website or
> > like
> > > >>> spark
> > > >>>>>>>>>> packages as mentioned by Amit.
> > > >>>>>>>>>>
> > > >>>>>>>>>> My $0.02
> > > >>>>>>>>>> Manu
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Thu, Nov 10, 2016 at 2:59 AM Kenneth Knowles
> > > >>>>> <[email protected]
> > > >>>>>>>>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> On this point from Amit and Ismaël, I agree: we could
> benefit
> > > >>> from a
> > > >>>>>>>>> place
> > > >>>>>>>>>>> for miscellaneous non-core helper transformations.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> We have sdks/java/extensions but it is organized as
> separate
> > > >>>>>>> artifacts.
> > > >>>>>>>>> I
> > > >>>>>>>>>>> think that is fine, considering the nature of Join and
> > > >> SortValues.
> > > >>>>> But
> > > >>>>>>>>> for
> > > >>>>>>>>>>> simpler transforms, importing one artifact per tiny
> transform
> > > is
> > > >>> too
> > > >>>>>>>>> much
> > > >>>>>>>>>>> overhead. It also seems unlikely that we will have enough
> > > >>>>> commonality
> > > >>>>>>>>>> among
> > > >>>>>>>>>>> the transforms to call the artifact anything other than
> [some
> > > >>>>> synonym
> > > >>>>>>>>> for]
> > > >>>>>>>>>>> "miscellaneous".
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I wouldn't want to take this too far - even though the SDK has
> > many
> > > >>>>>>>>>> transforms*
> > > >>>>>>>>>>> that are not required for the model [1], I like that the
> SDK
> > > >>>>> artifact
> > > >>>>>>>>> has
> > > >>>>>>>>>>> everything a user might need in their "getting started"
> phase
> > > of
> > > >>>>> use.
> > > >>>>>>>>> This
> > > >>>>>>>>>>> user-friendliness (the user doesn't care that ParDo is core
> > and
> > > >>> Sum
> > > >>>>> is
> > > >>>>>>>>>> not)
> > > >>>>>>>>>>> plus the difficulty of judging which transforms go where,
> are
> > > >>>>> probably
> > > >>>>>>>>> why
> > > >>>>>>>>>>> we have them mostly all in one place.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Models to look at, off the top of my head, include Pig's
> > > >> PiggyBank
> > > >>>>> and
> > > >>>>>>>>>>> Apex's Malhar. These have different levels of support
> > implied.
> > > >>>>> Others?
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Kenn
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> [1] ApproximateQuantiles, ApproximateUnique, Count,
> Distinct,
> > > >>>>> Filter,
> > > >>>>>>>>>>> FlatMapElements, Keys, Latest, MapElements, Max, Mean, Min,
> > > >>> Values,
> > > >>>>>>>>>> KvSwap,
> > > >>>>>>>>>>> Partition, Regex, Sample, Sum, Top, Values, WithKeys,
> > > >>> WithTimestamps
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> * at least they are separate classes and not methods on
> > > >>> PCollection
> > > >>>>>>> :-)
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Wed, Nov 9, 2016 at 6:03 AM, Ismaël Mejía <
> > > [email protected]
> > > >>>
> > > >>>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Nice discussion, and thanks Jesse for bringing this
> subject
> > > >>> back.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I agree 100% with Amit and the idea of having a home for
> > those
> > > >>>>>>>>>> transforms
> > > >>>>>>>>>>>> that are not core enough to be part of the sdk, but that
> we
> > > all
> > > >>> end
> > > >>>>>>> up
> > > >>>>>>>>>>>> re-writing somehow.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> This is a needed improvement to be more developer
> friendly,
> > > but
> > > >>>>> also
> > > >>>>>>> as
> > > >>>>>>>>>> a
> > > >>>>>>>>>>>> reference of good practices of Beam development, and for
> > this
> > > >>>>> reason
> > > >>>>>>> I
> > > >>>>>>>>>>>> agree with JB that at this moment it would be better for
> > these
> > > >>>>>>>>>> transforms
> > > >>>>>>>>>>>> to reside in the Beam repository at least for visibility
> > > >> reasons.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> One additional question is if these transforms represent a
> > > >>>>> different
> > > >>>>>>>>> DSL
> > > >>>>>>>>>>> or
> > > >>>>>>>>>>>> if those could be grouped with the current extensions
> (e.g.
> > > >> Join
> > > >>>>> and
> > > >>>>>>>>>>>> SortValues) into something more general that we as a
> > community
> > > >>>>> could
> > > >>>>>>>>>>>> maintain, but well even if it is not the case, it would be
> > > >> really
> > > >>>>>>> nice
> > > >>>>>>>>>> to
> > > >>>>>>>>>>>> start working on something like this.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Ismaël Mejía
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Wed, Nov 9, 2016 at 11:59 AM, Jean-Baptiste Onofré <
> > > >>>>>>> [email protected]
> > > >>>>>>>>>>
> > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> Related to spark-package, we also have Apache Bahir to
> host
> > > >>>>>>>>>>>>> connectors/transforms for Spark and Flink.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> IMHO, right now, Beam should host this, not sure if it
> > makes
> > > >>> sense
> > > >>>>>>>>>>>>> directly in the core.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> It reminds me of the "Integration" DSL we discussed in the
> > > >>> technical
> > > >>>>>>>>>>> vision
> > > >>>>>>>>>>>>> document.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Regards
> > > >>>>>>>>>>>>> JB
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On 11/09/2016 11:17 AM, Amit Sela wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> I think Jesse has a very good point on one hand, while
> > > Luke's
> > > >>> and
> > > >>>>>>>>>>>>>> Kenneth's
> > > >>>>>>>>>>>>>> worries about committing users to specific
> implementations
> > > is
> > > >>> in
> > > >>>>>>>>>>> place.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> The Spark community has a 3rd party repository for
> useful
> > > >>>>> libraries
> > > >>>>>>>>>>> that
> > > >>>>>>>>>>>>>> for various reasons are not a part of the Apache Spark
> > > >> project:
> > > >>>>>>>>>>>>>> https://spark-packages.org/.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Maybe a "common-transformations" package would serve
> both
> > > >> users
> > > >>>>>>> quick
> > > >>>>>>>>>>>>>> ramp-up and ease-of-use while keeping Beam more
> > "enabling" ?
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 9:03 PM Kenneth Knowles
> > > >>>>>>>>>> <[email protected]
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> It seems useful for small scale debugging / demoing to
> > have
> > > >>>>>>>>>>>>>>> Dump.toString(). I think it should be named to clearly
> > > >>> indicate
> > > >>>>>>> its
> > > >>>>>>>>>>>>>>> limited
> > > >>>>>>>>>>>>>>> scope. Maybe other stuff could go in the Dump
> namespace,
> > > but
> > > >>>>>>>>>>>>>>> "Dump.toJson()" would be for humans to read - so it
> > should
> > > >> be
> > > >>>>>>> pretty
> > > >>>>>>>>>>>>>>> printed, not treated as a machine-to-machine wire
> format.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> The broader question of representing data in JSON or
> XML,
> > > >> etc,
> > > >>>>> is
> > > >>>>>>>>>>>> already
> > > >>>>>>>>>>>>>>> the subject of many mature libraries which are already
> > easy
> > > >> to
> > > >>>>> use
> > > >>>>>>>>>>> with
> > > >>>>>>>>>>>>>>> Beam.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> The more esoteric practice of implicit or semi-implicit
> > > >>>>> coercions
> > > >>>>>>>>>>> seems
> > > >>>>>>>>>>>>>>> like it is also already addressed in many ways
> elsewhere.
> > > >>>>>>>>>>>>>>> Transform.via(TypeConverter) is basically the same as
> > > >>>>>>>>>>>>>>> MapElements.via(<lambda>) and also easy to use with
> Beam.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> In both of the last cases, there are many reasonable
> > > >>> approaches,
> > > >>>>>>> and
> > > >>>>>>>>>>> we
> > > >>>>>>>>>>>>>>> shouldn't commit our users to one of them.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 10:15 AM, Lukasz Cwik
> > > >>>>>>>>>>> <[email protected]
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> The suggestions you give seem good except for the
> XML
> > > >>> cases.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Might want to have the XML be a document per line
> > similar
> > > >> to
> > > >>>>> the
> > > >>>>>>>>>>> JSON
> > > >>>>>>>>>>>>>>>> examples you have been giving.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 12:00 PM, Jesse Anderson <
> > > >>>>>>>>>>>> [email protected]>
> > > >>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> @lukasz Agreed there would have to be KV handling. I
> was
> > > >> more
> > > >>>>>>> thinking
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> that
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> whatever the addition, it shouldn't just handle KV. It
> > > >> should
> > > >>>>>>>>>> handle
> > > >>>>>>>>>>>>>>>>> Iterables, Lists, Sets, and KVs.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> For JSON and XML, I wonder if we'd be able to give
> > > someone
> > > >>>>>>>>>>> something
> > > >>>>>>>>>>>>>>>>> general purpose enough that you would just end up
> > writing
> > > >>> your
> > > >>>>>>> own
> > > >>>>>>>>>>>> code
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> to
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> handle it anyway.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Here are some ideas on what it could look like with a
> > > >> method
> > > >>>>> and
> > > >>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>> resulting string output:
> > > >>>>>>>>>>>>>>>>> *Stringify.toJSON()*
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> With KV:
> > > >>>>>>>>>>>>>>>>> {"key": "value"}
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> With Iterables:
> > > >>>>>>>>>>>>>>>>> ["one", "two", "three"]
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> *Stringify.toXML("rootelement")*
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> With KV:
> > > >>>>>>>>>>>>>>>>> <rootelement key=value />
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> With Iterables:
> > > >>>>>>>>>>>>>>>>> <rootelement>
> > > >>>>>>>>>>>>>>>>>   <item>one</item>
> > > >>>>>>>>>>>>>>>>>   <item>two</item>
> > > >>>>>>>>>>>>>>>>>   <item>three</item>
> > > >>>>>>>>>>>>>>>>> </rootelement>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> *Stringify.toDelimited(",")*
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> With KV:
> > > >>>>>>>>>>>>>>>>> key,value
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> With Iterables:
> > > >>>>>>>>>>>>>>>>> one,two,three
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Do you think that would strike a good balance between
> > > >>> reusable
> > > >>>>>>>>>> code
> > > >>>>>>>>>>>> and
> > > >>>>>>>>>>>>>>>>> writing your own for more difficult formatting?
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Jesse
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 11:01 AM Lukasz Cwik
> > > >>>>>>>>>>> <[email protected]
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Jesse, I believe if one format gets special treatment
> > in
> > > >>>>> TextIO,
> > > >>>>>>>>>>>> people
> > > >>>>>>>>>>>>>>>>> will then ask why JSON, XML, ... are not also
> > > >> supported.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Also, the example that you provide is using the fact
> > that
> > > >>> the
> > > >>>>>>>>>> input
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> format
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> is an Iterable<Item>. You had posted a question about
> > > >> using
> > > >>> KV
> > > >>>>>>>>>> with
> > > >>>>>>>>>>>>>>>>> TextIO.Write which wouldn't align with the proposed
> > input
> > > >>>>> format
> > > >>>>>>>>>>> and
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> still
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> would require to write a type conversion function,
> this
> > > >> time
> > > >>>>>>> from
> > > >>>>>>>>>>> KV
> > > >>>>>>>>>>>> to
> > > >>>>>>>>>>>>>>>>> Iterable<Item> instead of KV to string.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 9:50 AM, Jesse Anderson <
> > > >>>>>>>>>>>> [email protected]>
> > > >>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Lukasz,
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> I don't think you'd need complicated logic for
> > > >>> TextIO.Write.
> > > >>>>>>> For
> > > >>>>>>>>>>> CSV
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> call would look like:
> > > >>>>>>>>>>>>>>>>>> Stringify.to("", ",", "\n");
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Where the arguments would be Stringify.to(prefix,
> > > >>> delimiter,
> > > >>>>>>>>>>>> suffix).
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The code would be something like:
> > > >>>>>>>>>>>>>>>>>> StringBuffer buffer = new StringBuffer(prefix);
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> for (Item item : list) {
> > > >>>>>>>>>>>>>>>>>>   buffer.append(item.toString());
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>   if(notLast) {
> > > >>>>>>>>>>>>>>>>>>     buffer.append(delimiter);
> > > >>>>>>>>>>>>>>>>>>   }
> > > >>>>>>>>>>>>>>>>>> }
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> buffer.append(suffix);
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> c.output(buffer.toString());
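
A runnable version of the loop above, as a hedged sketch rather than the proposed API: the notLast check is done with Iterator.hasNext(), StringBuilder replaces StringBuffer since no synchronization is needed, and the class name StringifySketch and the extra items parameter are assumptions made so the example can run outside a DoFn.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch of Stringify.to(prefix, delimiter, suffix) applied to
// one element that is itself an Iterable.
final class StringifySketch {
  private StringifySketch() {}

  static <T> String to(String prefix, String delimiter, String suffix, Iterable<T> items) {
    StringBuilder buffer = new StringBuilder(prefix);
    Iterator<T> it = items.iterator();
    while (it.hasNext()) {
      // append(Object) uses String.valueOf, so a null item becomes "null".
      buffer.append(it.next());
      // "notLast" from the pseudocode: more items remain after this one.
      if (it.hasNext()) {
        buffer.append(delimiter);
      }
    }
    buffer.append(suffix);
    return buffer.toString();
  }

  public static void main(String[] args) {
    // CSV line: no prefix, comma delimiter, newline suffix.
    System.out.print(to("", ",", "\n", Arrays.asList("one", "two", "three")));
    // prints one,two,three followed by a newline
  }
}
```

For items that are already strings, java.util.StringJoiner(delimiter, prefix, suffix) gives the same result with less code.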
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> That would allow you to do the basic CSV, TSV, and
> > other
> > > >>>>>>> formats
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> without
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> complicated logic. The same sort of thing could be
> done
> > > >> for
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> TextIO.Write.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Jesse
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 10:30 AM Lukasz Cwik
> > > >>>>>>>>>>>> <[email protected]
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The conversion from object to string will have uses
> > > >> outside
> > > >>>>> of
> > > >>>>>>>>>>> just
> > > >>>>>>>>>>>>>>>>>>> TextIO.Write so it seems logical that we would want
> > to
> > > >>> have
> > > >>>>> a
> > > >>>>>>>>>>> ParDo
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> do
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> conversion.
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> Text file formats have a lot of variance, even if
> you
> > > >>>>> consider
> > > >>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> subset
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> of CSV like formats where it could have fixed width
> > > >> fields,
> > > >>>>> or
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> escaping
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> and
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> quoting around other fields, or headers that should
> > be
> > > >>>>> placed
> > > >>>>>>> at
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> the
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> top.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> Having all these format conversions within
> > TextIO.Write
> > > >>>>> seems
> > > >>>>>>>>>>> like
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> a
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> lot
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> of
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> logic to contain in that transform which should
> just
> > > >> focus
> > > >>>>> on
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> writing
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> to
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> files.
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 8:15 AM, Jesse Anderson <
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> [email protected]>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> This is a thread moved over from the user mailing
> > list.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> I think there needs to be a way to convert a
> > > >>>>> PCollection<KV>
> > > >>>>>>> to
> > > >>>>>>>>>>>>>>>>>>>> PCollection<String>.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> To do a minimal WordCount, you have to manually
> > > convert
> > > >>> the
> > > >>>>>>> KV
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> to a
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> String:
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>         p
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>  .apply(TextIO.Read.from("playing_cards.tsv"))
> > > >>>>>>>>>>>>>>>>>>>>                 .apply(Regex.split("\\W+"))
> > > >>>>>>>>>>>>>>>>>>>>                 .apply(Count.perElement())
> > > >>>>>>>>>>>>>>>>>>>> *                .apply(MapElements.via((KV<
> String,
> > > >> Long>
> > > >>>>>>>>>> count)
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> ->*
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> *                            count.getKey() + ":" +
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> count.getValue()*
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> *                        ).withOutputType(
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> TypeDescriptors.strings()))*
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>                 .apply(TextIO.Write.to
> > > >>>>>>> ("output/stringcounts"));
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> This code really should be something like:
> > > >>>>>>>>>>>>>>>>>>>>         p
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>  .apply(TextIO.Read.from("playing_cards.tsv"))
> > > >>>>>>>>>>>>>>>>>>>>                 .apply(Regex.split("\\W+"))
> > > >>>>>>>>>>>>>>>>>>>>                 .apply(Count.perElement())
> > > >>>>>>>>>>>>>>>>>>>> *                .apply(ToString.stringify())*
> > > >>>>>>>>>>>>>>>>>>>>                 .apply(TextIO.Write.to
> > > >>>>>>>>> ("output/stringcounts"));
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> To summarize the discussion:
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>>    - JA: Add a method to StringDelegateCoder to output any KV or list
> > > >>>>>>>>>>>>>>>>>>>>    - JA and DH: Add a SimpleFunction that takes any type and runs
> > > >>>>>>>>>>>>>>>>>>>>    toString() on it:
> > > >>>>>>>>>>>>>>>>>>>>    class ToStringFn<InputT> extends SimpleFunction<InputT, String> {
> > > >>>>>>>>>>>>>>>>>>>>        @Override
> > > >>>>>>>>>>>>>>>>>>>>        public String apply(InputT input) {
> > > >>>>>>>>>>>>>>>>>>>>            return input.toString();
> > > >>>>>>>>>>>>>>>>>>>>        }
> > > >>>>>>>>>>>>>>>>>>>>    }
> > > >>>>>>>>>>>>>>>>>>>>    - JB: Add a general purpose type converter like in Apache Camel.
> > > >>>>>>>>>>>>>>>>>>>>    - JA: Add Object support to TextIO.Write that would write out the
> > > >>>>>>>>>>>>>>>>>>>>    toString of any Object.
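
As a rough illustration of the "general purpose type converter" item in the summary above, here is a minimal plain-Java sketch. It is not Camel's or Beam's API: the class name TypeConverterRegistry and its register/convert methods are hypothetical, and the toString() fallback is an assumption added so that any object can still become a String.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical converter registry (an assumption for illustration only;
 * neither Camel nor Beam exposes this class).
 */
final class TypeConverterRegistry {
    // Converters keyed by "sourceClass->targetClass".
    private final Map<String, Function<Object, Object>> converters = new HashMap<>();

    private static String key(Class<?> from, Class<?> to) {
        return from.getName() + "->" + to.getName();
    }

    /** Registers a converter for one exact (source, target) class pair. */
    <A, B> void register(Class<A> from, Class<B> to, Function<A, B> fn) {
        converters.put(key(from, to), value -> fn.apply(from.cast(value)));
    }

    /**
     * Converts using a registered converter. If none is registered and the
     * target is String, falls back to value.toString().
     */
    <B> B convert(Object value, Class<B> to) {
        Function<Object, Object> fn = converters.get(key(value.getClass(), to));
        if (fn != null) {
            return to.cast(fn.apply(value));
        }
        if (String.class.equals(to)) {
            return to.cast(value.toString());
        }
        throw new IllegalArgumentException(
            "No converter from " + value.getClass().getName() + " to " + to.getName());
    }

    public static void main(String[] args) {
        TypeConverterRegistry registry = new TypeConverterRegistry();
        registry.register(String.class, Integer.class, Integer::valueOf);
        System.out.println(registry.convert("42", Integer.class)); // registered converter
        System.out.println(registry.convert(7L, String.class));    // toString() fallback
    }
}
```

The lookup happens at runtime on the element's class, so a registry like this gives up the compile-time typing that a MapElements lambda keeps.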
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> My thoughts:
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Is converting to a PCollection<String> mostly needed when you're
> > > >>>>>>>>>>>>>>>>>>>> using TextIO.Write? Will a general purpose transform only work in
> > > >>>>>>>>>>>>>>>>>>>> certain cases, so that you'll normally have to write custom code to
> > > >>>>>>>>>>>>>>>>>>>> format the strings the way you want them?
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> IMHO, it's yes to both. I'd prefer to add Object support to
> > > >>>>>>>>>>>>>>>>>>>> TextIO.Write or a SimpleFunction that takes a delimiter as an
> > > >>>>>>>>>>>>>>>>>>>> argument. Making a SimpleFunction that's able to specify a delimiter
> > > >>>>>>>>>>>>>>>>>>>> (and perhaps a prefix and suffix) should cover the majority of
> > > >>>>>>>>>>>>>>>>>>>> formats and cases.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Thanks,
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Jesse
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>> --
> > > >>>>>>>>>>>>> Jean-Baptiste Onofré
> > > >>>>>>>>>>>>> [email protected]
> > > >>>>>>>>>>>>> http://blog.nanthrax.net
> > > >>>>>>>>>>>>> Talend - http://www.talend.com
