@Ben which idea do you like?

On Thu, Dec 29, 2016 at 3:20 PM Ben Chambers <bchamb...@google.com.invalid> wrote:
> I like that idea, with the caveat that we should probably come up with a
> better name. Perhaps "ToString.elements()" and ToString.Elements or
> something? Calling one the "default" and using "create" for it seems
> moderately non-future proof.
>
> On Thu, Dec 29, 2016 at 3:17 PM Dan Halperin <dhalp...@google.com.invalid>
> wrote:
>
> > On Thu, Dec 29, 2016 at 2:10 PM, Jesse Anderson <je...@smokinghand.com>
> > wrote:
> >
> > > I agree MapElements isn't hard to use. I think there is a demand for this
> > > built-in conversion.
> > >
> > > My thought on the formatter is that, worst case, we could do runtime type
> > > checking. It would be ugly and not as performant, but it should work. As
> > > we've said, we'd point them to MapElements for better code. We'd write the
> > > JavaDoc accordingly.
> > >
> >
> > I think it will be good to see these proposals in PR form. I would stay far
> > away from reflection and varargs if possible, but properly-typed bits of
> > code (possibly exposed as SerializableFunctions in ToString?) would
> > probably make sense.
> >
> > In the short-term, I can't find anyone arguing against a ToString.create()
> > that simply does input.toString().
> >
> > To get started, how about we ask Vikas to clean up the PR to be more
> > future-proof for now? Aka make `ToString` itself not a PTransform, but
> > instead ToString.create() returns ToString.Default which is a private class
> > implementing what ToString is now (PTransform<T, String>, wrapping
> > MapElements).
> >
> > Then we can send PRs adding new features to that.
> >
> > > IME and to Ben's point, these will mostly be used in development. Some of
> > > our assumptions will break down when programmers aren't the ones using
> > > Beam. I can see from the user traffic already that not everyone using Beam
> > > is a programmer and they'll need classes like this to be productive.
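
A minimal sketch of the shape Dan describes above, written in plain Java so it runs without Beam on the classpath. `ToStringSketch` is a hypothetical name, and `java.util.function.Function` is only a stand-in for Beam's `PTransform`; a real version would presumably wrap `MapElements` as the message says. It illustrates the structure only: the outer class is a namespace, `create()` is the public entry point, and the implementation is a private nested class, so further factories could be added later without changing what callers already use.

```java
import java.util.function.Function;

// Hypothetical stand-in for the proposed ToString class. Function replaces
// Beam's PTransform here purely so the sketch is self-contained.
final class ToStringSketch {
    private ToStringSketch() {}  // namespace only; never instantiated

    // Public entry point. Callers see only the returned interface type.
    static <T> Function<T, String> create() {
        return new Default<>();
    }

    // Private implementation that simply calls toString() on each element.
    // Like the proposal, it does not special-case null inputs.
    private static final class Default<T> implements Function<T, String> {
        @Override
        public String apply(T input) {
            return input.toString();
        }
    }
}
```

For example, `ToStringSketch.<Integer>create().apply(42)` returns the string `42`.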
> > >
> > >
> > > On Thu, Dec 29, 2016 at 1:46 PM Dan Halperin <dhalp...@google.com.invalid>
> > > wrote:
> > >
> > > On Thu, Dec 29, 2016 at 1:36 PM, Jesse Anderson <je...@smokinghand.com>
> > > wrote:
> > >
> > > > I prefer JB's take. I think there should be three overloaded methods on the
> > > > class. I like Vikas' name ToString. The methods for a simple conversion
> > > > should be:
> > > >
> > > > ToString.strings() - Outputs the .toString() of the objects in the
> > > > PCollection
> > > > ToString.strings(String delimiter) - Outputs the .toString() of KVs, Lists,
> > > > etc with the delimiter between every entry
> > > > ToString.formatted(String format) - Outputs the formatted
> > > > <https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html>
> > > > string with the object passed in. For objects made up of different parts like
> > > > KVs, each one is passed in as separate toString() of a varargs.
> > > >
> > >
> > > Riffing a little, with some types:
> > >
> > > ToString.<T>of() -- PTransform<T, String> that is equivalent to a ParDo
> > > that takes in a T and outputs T.toString().
> > >
> > > ToString.<K,V>kv(String delimiter) -- PTransform<KV<K, V>, String> that is
> > > equivalent to a ParDo that takes in a KV<K,V> and outputs
> > > kv.getKey().toString() + delimiter + kv.getValue().toString()
> > >
> > > ToString.<T>iterable(String delimiter) -- PTransform<? extends Iterable<T>,
> > > String> that is equivalent to a ParDo that takes in an Iterable<T> and
> > > outputs the iterable[0] + delimiter + iterable[1] + delimiter + ... +
> > > delimiter + iterable[N-1]
> > >
> > > ToString.<T>custom(SerializableFunction<T, String> formatter) ?
> > >
> > > The last one is just MapElements.via, except you don't need to set the
> > > output type.
> > >
> > > I don't see a way to make the generic .formatted() that you propose that
> > > just works with anything "made of different parts".
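
The `kv` and `iterable` variants listed above can be sketched in plain Java as follows. This is an illustration under assumptions, not the Beam API: `ToStringVariants` is a hypothetical name, `Map.Entry` stands in for Beam's `KV`, and `java.util.function.Function` stands in for `PTransform`.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helpers mirroring the kv(delimiter) and iterable(delimiter)
// variants from the message above, without any Beam dependency.
final class ToStringVariants {
    private ToStringVariants() {}

    // Produces key.toString() + delimiter + value.toString().
    static <K, V> Function<Map.Entry<K, V>, String> kv(String delimiter) {
        return entry -> entry.getKey().toString() + delimiter + entry.getValue().toString();
    }

    // Produces element[0] + delimiter + element[1] + ... + element[N-1].
    // The delimiter is written only between elements, never after the last.
    static <T> Function<Iterable<T>, String> iterable(String delimiter) {
        return elements -> {
            StringBuilder out = new StringBuilder();
            Iterator<T> it = elements.iterator();
            while (it.hasNext()) {
                out.append(it.next().toString());
                if (it.hasNext()) {
                    out.append(delimiter);
                }
            }
            return out.toString();
        };
    }
}
```

Returning an empty string for an empty iterable is an arbitrary choice made for this sketch; the proposal itself does not say what should happen in that case.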
> > >
> > > I think adding too many overrides beyond "of" and "custom" is opening
> > > up a Pandora's Box. The KV one might want to have left and right
> > > delimiters, might want to take custom formatters for K and V, etc. etc. The
> > > iterable one might want to have a special configuration for an empty
> > > iterable. So I'm inclined towards simplicity with the awareness that
> > > MapElements.via is just not that hard to use.
> > >
> > > Dan
> > >
> > >
> > > >
> > > > I think doing these three methods would cover every simple and advanced
> > > > "simple conversions." As JB says, we'll need other specific converters for
> > > > other formats like XML.
> > > >
> > > > I'd really like to see this class in the next version of Beam. What does
> > > > everyone think of the class name, methods name, and method operations so we
> > > > can have Vikas finish up?
> > > >
> > > > Thanks,
> > > >
> > > > Jesse
> > > >
> > > > On Wed, Dec 28, 2016 at 12:28 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > wrote:
> > > >
> > > > > Hi Vikas,
> > > > >
> > > > > did you take a look on:
> > > > >
> > > > > https://github.com/jbonofre/beam/tree/DATAFORMAT/sdks/java/extensions/dataformat
> > > > >
> > > > > You can see KV2String and ToString could be part of this extension.
> > > > > I'm also using JAXB for XML and Jackson for JSON
> > > > > marshalling/unmarshalling. I'm planning to deal with Avro (IndexedRecord).
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On 12/28/2016 08:37 PM, Vikas Kedigehalli wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > Not being aware of the discussion here, I sent out a PR
> > > > > > <https://github.com/apache/beam/pull/1704> but JB and others directed me to
> > > > > > this thread.
> > > > > > Having converted PCollection<T> to PCollection<String> several
> > > > > > times, I feel something like 'ToString' transform is common enough to be
> > > > > > part of the core. What do you all think?
> > > > > >
> > > > > > Also, if someone else is already working on or interested in tackling this,
> > > > > > then I am happy to discard the PR.
> > > > > >
> > > > > > Regards,
> > > > > > Vikas
> > > > > >
> > > > > > On Tue, Dec 13, 2016 at 1:56 AM, Amit Sela <amitsel...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> It seems that there were a lot of good points raised here, and I tend to
> > > > > >> agree that something as trivial and lean as "ToString" should be a part of
> > > > > >> core.
> > > > > >> I'm particularly fond of makeString(prefix, toString, suffix) in various
> > > > > >> combinations (Scala-like).
> > > > > >> For "fromString", I think JB has a good point leveraging JAXB and Jackson -
> > > > > >> though I think this should be in extensions as it is not as lean as
> > > > > >> toString.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Amit
> > > > > >>
> > > > > >> On Wed, Nov 30, 2016 at 5:13 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Hi Jesse,
> > > > > >>>
> > > > > >>> yes, I started something there (using JAXB and Jackson). Let me polish
> > > > > >>> and push.
> > > > > >>>
> > > > > >>> Regards
> > > > > >>> JB
> > > > > >>>
> > > > > >>> On 11/29/2016 10:00 PM, Jesse Anderson wrote:
> > > > > >>>> I went through the string conversions. Do you have an example of
> > > > > >>>> writing out XML/JSON/etc too?
> > > > > >>>>
> > > > > >>>> On Tue, Nov 29, 2016 at 3:46 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>>> Hi Jesse,
> > > > > >>>>>
> > > > > >>>>> https://github.com/jbonofre/incubator-beam/tree/DATAFORMAT/sdks/java/extensions/dataformat
> > > > > >>>>>
> > > > > >>>>> it's very simple and stupid and of course not complete at all (I have
> > > > > >>>>> other commits but not merged as they need some polishing), but as I
> > > > > >>>>> said, it's a base of discussion.
> > > > > >>>>>
> > > > > >>>>> Regards
> > > > > >>>>> JB
> > > > > >>>>>
> > > > > >>>>> On 11/29/2016 09:23 PM, Jesse Anderson wrote:
> > > > > >>>>>> @jb Sounds good. Just let us know once you've pushed.
> > > > > >>>>>>
> > > > > >>>>>> On Tue, Nov 29, 2016 at 2:54 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > > >>>>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Good point Eugene.
> > > > > >>>>>>>
> > > > > >>>>>>> Right now, it's a DoFn collection to experiment a bit (a pure
> > > > > >>>>>>> extension). It's pretty stupid ;)
> > > > > >>>>>>>
> > > > > >>>>>>> But, you are right, depending the direction of such extension, it could
> > > > > >>>>>>> cover more use cases (even if it's not my first intention ;)).
> > > > > >>>>>>>
> > > > > >>>>>>> Let me push the branch (pretty small) as an illustration, and in the
> > > > > >>>>>>> mean time, I'm preparing a document (more focused on the use cases).
> > > > > >>>>>>>
> > > > > >>>>>>> WDYT ?
> > > > > >>>>>>>
> > > > > >>>>>>> Regards
> > > > > >>>>>>> JB
> > > > > >>>>>>>
> > > > > >>>>>>> On 11/29/2016 08:47 PM, Eugene Kirpichov wrote:
> > > > > >>>>>>>> Hi JB,
> > > > > >>>>>>>> Depending on the scope of what you want to ultimately accomplish with this
> > > > > >>>>>>>> extension, I think it may make sense to write a proposal document and
> > > > > >>>>>>>> discuss it.
> > > > > >>>>>>>> If it's just a collection of utility DoFn's for various well-defined
> > > > > >>>>>>>> source/target format pairs, then that's probably not needed, but if it's
> > > > > >>>>>>>> anything more, then I think it is.
> > > > > >>>>>>>> That will help avoid a lot of churn if people propose reasonable
> > > > > >>>>>>>> significant changes.
> > > > > >>>>>>>>
> > > > > >>>>>>>> On Tue, Nov 29, 2016 at 11:15 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> > > > > >>>>>>>> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>>> By the way Jesse, I gonna push my DATAFORMAT branch on my github and I
> > > > > >>>>>>>>> will post on the dev mailing list when done.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Regards
> > > > > >>>>>>>>> JB
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On 11/29/2016 07:01 PM, Jesse Anderson wrote:
> > > > > >>>>>>>>>> I want to bring this thread back up since we've had time to think about it
> > > > > >>>>>>>>>> more and make a plan.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> I think a format-specific converter will be more time consuming task than
> > > > > >>>>>>>>>> we originally thought. It'd have to be a writer that takes another writer
> > > > > >>>>>>>>>> as a parameter.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> I think a string converter can be done as a simple transform.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> I think we should start with a simple string converter and plan for a
> > > > > >>>>>>>>>> format-specific writer.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> What are your thoughts?
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Thanks,
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Jesse
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> On Thu, Nov 10, 2016 at 10:33 AM Jesse Anderson <je...@smokinghand.com>
> > > > > >>>>>>>>>> wrote:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> I was thinking about what the outputs would look like last night. I
> > > > > >>>>>>>>>> realized that more complex formats like JSON and XML may or may not output
> > > > > >>>>>>>>>> the data in a valid format.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Doing a direct conversion on unbounded collections would work just fine.
> > > > > >>>>>>>>>> They're self-contained. For writing out bounded collections, that's where
> > > > > >>>>>>>>>> we'll hit the issues. This changes the uber conversion transform into a
> > > > > >>>>>>>>>> transform that needs to be a writer.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> If a transform executes a JSON conversion on a per element basis, we'd get
> > > > > >>>>>>>>>> this:
> > > > > >>>>>>>>>> {
> > > > > >>>>>>>>>>   "key": "value"
> > > > > >>>>>>>>>> }, {
> > > > > >>>>>>>>>>   "key": "value"
> > > > > >>>>>>>>>> },
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> That isn't valid JSON.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> The conversion transform would need to do several things when writing
> > > > > >>>>>>>>>> out a file. It would need to add brackets for an array. Now we have:
> > > > > >>>>>>>>>> [
> > > > > >>>>>>>>>> {
> > > > > >>>>>>>>>>   "key": "value"
> > > > > >>>>>>>>>> }, {
> > > > > >>>>>>>>>>   "key": "value"
> > > > > >>>>>>>>>> },
> > > > > >>>>>>>>>> ]
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> We still don't have valid JSON. We have to remove the last comma or have
> > > > > >>>>>>>>>> the uber transform start putting in the commas, except for the last
> > > > > >>>>>>>>>> element.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> [
> > > > > >>>>>>>>>> {
> > > > > >>>>>>>>>>   "key": "value"
> > > > > >>>>>>>>>> }, {
> > > > > >>>>>>>>>>   "key": "value"
> > > > > >>>>>>>>>> }
> > > > > >>>>>>>>>> ]
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Only by doing this do we have valid JSON.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> I'd argue we'd have a similar issue with XML. Some parsers require a root
> > > > > >>>>>>>>>> element for everything. The uber transform would have to put the root
> > > > > >>>>>>>>>> element tags at the beginning and end of the file.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> On Wed, Nov 9, 2016 at 11:36 PM Manu Zhang <owenzhang1...@gmail.com>
> > > > > >>>>>>>>>> wrote:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> I would love to see a lean core and abundant Transforms at the same time.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> Maybe we can look at what Confluent <https://github.com/confluentinc> does
> > > > > >>>>>>>>>> for kafka-connect. They have official extensions support for JDBC, HDFS and
> > > > > >>>>>>>>>> ElasticSearch under https://github.com/confluentinc.
> They > > > put > > > > > >> them > > > > > >>>>>>> along > > > > > >>>>>>>>>> with other community extensions on > > > > > >>>>>>>>>> https://www.confluent.io/product/connectors/ for > > > visibility. > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> Although not a commercial company, can we have a GitHub > > user > > > > > like > > > > > >>>>>>>>>> beam-community to host projects we build around beam but > > not > > > > > >>> suitable > > > > > >>>>>>> for > > > > > >>>>>>>>>> https://github.com/apache/incubator-beam. In the > future, > > we > > > > may > > > > > >>> have > > > > > >>>>>>>>>> beam-algebra like http://github.com/twitter/algebird > for > > > > > algebra > > > > > >>>>>>>>> operations > > > > > >>>>>>>>>> and beam-ml / beam-dl for machine learning / deep > > learning. > > > > > Also, > > > > > >>>>> there > > > > > >>>>>>>>>> will will be beam related projects elsewhere maintained > by > > > > other > > > > > >>>>>>>>>> communities. We can put all of them on the beam-website > or > > > > like > > > > > >>> spark > > > > > >>>>>>>>>> packages as mentioned by Amit. > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> My $0.02 > > > > > >>>>>>>>>> Manu > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> > > > > > >>>>>>>>>> On Thu, Nov 10, 2016 at 2:59 AM Kenneth Knowles > > > > > >>>>> <k...@google.com.invalid > > > > > >>>>>>>> > > > > > >>>>>>>>>> wrote: > > > > > >>>>>>>>>> > > > > > >>>>>>>>>>> On this point from Amit and Ismaël, I agree: we could > > > benefit > > > > > >>> from a > > > > > >>>>>>>>> place > > > > > >>>>>>>>>>> for miscellaneous non-core helper transformations. > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> We have sdks/java/extensions but it is organized as > > > separate > > > > > >>>>>>> artifacts. > > > > > >>>>>>>>> I > > > > > >>>>>>>>>>> think that is fine, considering the nature of Join and > > > > > >> SortValues. 
> > > > > >>>>> But > > > > > >>>>>>>>> for > > > > > >>>>>>>>>>> simpler transforms, Importing one artifact per tiny > > > transform > > > > > is > > > > > >>> too > > > > > >>>>>>>>> much > > > > > >>>>>>>>>>> overhead. It also seems unlikely that we will have > enough > > > > > >>>>> commonality > > > > > >>>>>>>>>> among > > > > > >>>>>>>>>>> the transforms to call the artifact anything other than > > > [some > > > > > >>>>> synonym > > > > > >>>>>>>>> for] > > > > > >>>>>>>>>>> "miscellaneous". > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> I wouldn't want to take this too far - even though the > > SDK > > > > many > > > > > >>>>>>>>>> transforms* > > > > > >>>>>>>>>>> that are not required for the model [1], I like that > the > > > SDK > > > > > >>>>> artifact > > > > > >>>>>>>>> has > > > > > >>>>>>>>>>> everything a user might need in their "getting started" > > > phase > > > > > of > > > > > >>>>> use. > > > > > >>>>>>>>> This > > > > > >>>>>>>>>>> user-friendliness (the user doesn't care that ParDo is > > core > > > > and > > > > > >>> Sum > > > > > >>>>> is > > > > > >>>>>>>>>> not) > > > > > >>>>>>>>>>> plus the difficulty of judging which transforms go > where, > > > are > > > > > >>>>> probably > > > > > >>>>>>>>> why > > > > > >>>>>>>>>>> we have them mostly all in one place. > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> Models to look at, off the top of my head, include > Pig's > > > > > >> PiggyBank > > > > > >>>>> and > > > > > >>>>>>>>>>> Apex's Malhar. These have different levels of support > > > > implied. > > > > > >>>>> Others? 
> > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> Kenn > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> [1] ApproximateQuantiles, ApproximateUnique, Count, > > > Distinct, > > > > > >>>>> Filter, > > > > > >>>>>>>>>>> FlatMapElements, Keys, Latest, MapElements, Max, Mean, > > Min, > > > > > >>> Values, > > > > > >>>>>>>>>> KvSwap, > > > > > >>>>>>>>>>> Partition, Regex, Sample, Sum, Top, Values, WithKeys, > > > > > >>> WithTimestamps > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> * at least they are separate classes and not methods on > > > > > >>> PCollection > > > > > >>>>>>> :-) > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>> On Wed, Nov 9, 2016 at 6:03 AM, Ismaël Mejía < > > > > > ieme...@gmail.com > > > > > >>> > > > > > >>>>>>> wrote: > > > > > >>>>>>>>>>> > > > > > >>>>>>>>>>>> Nice discussion, and thanks Jesse for bringing this > > > subject > > > > > >>> back. > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> I agree 100% with Amit and the idea of having a home > for > > > > those > > > > > >>>>>>>>>> transforms > > > > > >>>>>>>>>>>> that are not core enough to be part of the sdk, but > that > > > we > > > > > all > > > > > >>> end > > > > > >>>>>>> up > > > > > >>>>>>>>>>>> re-writing somehow. > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> This is a needed improvement to be more developer > > > friendly, > > > > > but > > > > > >>>>> also > > > > > >>>>>>> as > > > > > >>>>>>>>>> a > > > > > >>>>>>>>>>>> reference of good practices of Beam development, and > for > > > > this > > > > > >>>>> reason > > > > > >>>>>>> I > > > > > >>>>>>>>>>>> agree with JB that at this moment it would be better > for > > > > these > > > > > >>>>>>>>>> transforms > > > > > >>>>>>>>>>>> to reside in the Beam repository at least for > visibility > > > > > >> reasons. 
> > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> One additional question is if these transforms > > represent a > > > > > >>>>> different > > > > > >>>>>>>>> DSL > > > > > >>>>>>>>>>> or > > > > > >>>>>>>>>>>> if those could be grouped with the current extensions > > > (e.g. > > > > > >> Join > > > > > >>>>> and > > > > > >>>>>>>>>>>> SortValues) into something more general that we as a > > > > community > > > > > >>>>> could > > > > > >>>>>>>>>>>> maintain, but well even if it is not the case, it > would > > be > > > > > >> really > > > > > >>>>>>> nice > > > > > >>>>>>>>>> to > > > > > >>>>>>>>>>>> start working on something like this. > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> Ismaël Mejía > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>> On Wed, Nov 9, 2016 at 11:59 AM, Jean-Baptiste Onofré > < > > > > > >>>>>>> j...@nanthrax.net > > > > > >>>>>>>>>> > > > > > >>>>>>>>>>>> wrote: > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>>> Related to spark-package, we also have Apache Bahir > to > > > host > > > > > >>>>>>>>>>>>> connectors/transforms for Spark and Flink. > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> IMHO, right now, Beam should host this, not sure if > it > > > > makes > > > > > >>> sense > > > > > >>>>>>>>>>>>> directly in the core. > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> It reminds me the "Integration" DSL we discussed in > the > > > > > >>> technical > > > > > >>>>>>>>>>> vision > > > > > >>>>>>>>>>>>> document. > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> Regards > > > > > >>>>>>>>>>>>> JB > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>> On 11/09/2016 11:17 AM, Amit Sela wrote: > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>> I think Jesse has a very good point on one hand, > while > > > > > Luke's > > > > > >>> and > > > > > >>>>>>>>>>>>>> Kenneth's > > > > > >>>>>>>>>>>>>> worries about committing users to specific > > > implementations > > > > > is > > > > > >>> in > > > > > >>>>>>>>>>> place. 
> > > > > >>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>> The Spark community has a 3rd party repository for > > > useful > > > > > >>>>> libraries > > > > > >>>>>>>>>>> that > > > > > >>>>>>>>>>>>>> for various reasons are not a part of the Apache > Spark > > > > > >> project: > > > > > >>>>>>>>>>>>>> https://spark-packages.org/. > > > > > >>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>> Maybe a "common-transformations" package would serve > > > both > > > > > >> users > > > > > >>>>>>> quick > > > > > >>>>>>>>>>>>>> ramp-up and ease-of-use while keeping Beam more > > > > "enabling" ? > > > > > >>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 9:03 PM Kenneth Knowles > > > > > >>>>>>>>>> <k...@google.com.invalid > > > > > >>>>>>>>>>>> > > > > > >>>>>>>>>>>>>> wrote: > > > > > >>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>> It seems useful for small scale debugging / demoing > to > > > > have > > > > > >>>>>>>>>>>>>>> Dump.toString(). I think it should be named to > > clearly > > > > > >>> indicate > > > > > >>>>>>> its > > > > > >>>>>>>>>>>>>>> limited > > > > > >>>>>>>>>>>>>>> scope. Maybe other stuff could go in the Dump > > > namespace, > > > > > but > > > > > >>>>>>>>>>>>>>> "Dump.toJson()" would be for humans to read - so it > > > > should > > > > > >> be > > > > > >>>>>>> pretty > > > > > >>>>>>>>>>>>>>> printed, not treated as a machine-to-machine wire > > > format. > > > > > >>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>> The broader question of representing data in JSON > or > > > XML, > > > > > >> etc, > > > > > >>>>> is > > > > > >>>>>>>>>>>> already > > > > > >>>>>>>>>>>>>>> the subject of many mature libraries which are > > already > > > > easy > > > > > >> to > > > > > >>>>> use > > > > > >>>>>>>>>>> with > > > > > >>>>>>>>>>>>>>> Beam. 
> > > > > >>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>> The more esoteric practice of implicit or > > semi-implicit > > > > > >>>>> coercions > > > > > >>>>>>>>>>> seems > > > > > >>>>>>>>>>>>>>> like it is also already addressed in many ways > > > elsewhere. > > > > > >>>>>>>>>>>>>>> Transform.via(TypeConverter) is basically the same > as > > > > > >>>>>>>>>>>>>>> MapElements.via(<lambda>) and also easy to use with > > > Beam. > > > > > >>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>> In both of the last cases, there are many > reasonable > > > > > >>> approaches, > > > > > >>>>>>> and > > > > > >>>>>>>>>>> we > > > > > >>>>>>>>>>>>>>> shouldn't commit our users to one of them. > > > > > >>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 10:15 AM, Lukasz Cwik > > > > > >>>>>>>>>>> <lc...@google.com.invalid > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>> wrote: > > > > > >>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>> The suggestions you give seem good except for the > the > > > XML > > > > > >>> cases. > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> Might want to have the XML be a document per line > > > > similar > > > > > >> to > > > > > >>>>> the > > > > > >>>>>>>>>>> JSON > > > > > >>>>>>>>>>>>>>>> examples you have been giving. > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 12:00 PM, Jesse Anderson < > > > > > >>>>>>>>>>>> je...@smokinghand.com> > > > > > >>>>>>>>>>>>>>>> wrote: > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> @lukasz Agreed there would have to be KV > handling. I > > > was > > > > > >> more > > > > > >>>>>>> think > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> that > > > > > >>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> whatever the addition, it shouldn't just handle > KV. > > It > > > > > >> should > > > > > >>>>>>>>>> handle > > > > > >>>>>>>>>>>>>>>>> Iterables, Lists, Sets, and KVs. 
> > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> For JSON and XML, I wonder if we'd be able to > give > > > > > someone > > > > > >>>>>>>>>>> something > > > > > >>>>>>>>>>>>>>>>> general purpose enough that you would just end up > > > > writing > > > > > >>> your > > > > > >>>>>>> own > > > > > >>>>>>>>>>>> code > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> to > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> handle it anyway. > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> Here are some ideas on what it could look like > > with a > > > > > >> method > > > > > >>>>> and > > > > > >>>>>>>>>>> the > > > > > >>>>>>>>>>>>>>>>> resulting string output: > > > > > >>>>>>>>>>>>>>>>> *Stringify.toJSON()* > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> With KV: > > > > > >>>>>>>>>>>>>>>>> {"key": "value"} > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> With Iterables: > > > > > >>>>>>>>>>>>>>>>> ["one", "two", "three"] > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> *Stringify.toXML("rootelement")* > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> With KV: > > > > > >>>>>>>>>>>>>>>>> <rootelement key=value /> > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> With Iterables: > > > > > >>>>>>>>>>>>>>>>> <rootelement> > > > > > >>>>>>>>>>>>>>>>> <item>one</item> > > > > > >>>>>>>>>>>>>>>>> <item>two</item> > > > > > >>>>>>>>>>>>>>>>> <item>three</item> > > > > > >>>>>>>>>>>>>>>>> </rootelement> > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> *Stringify.toDelimited(",")* > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> With KV: > > > > > >>>>>>>>>>>>>>>>> key,value > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> With Iterables: > > > > > >>>>>>>>>>>>>>>>> one,two,three > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> Do you think that would strike a good balance > > between > > > > > >>> reusable > > > > > >>>>>>>>>> code > > > > > >>>>>>>>>>>> and > > > > > >>>>>>>>>>>>>>>>> writing 
your own for more difficult formatting? > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> Thanks, > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> Jesse > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 11:01 AM Lukasz Cwik > > > > > >>>>>>>>>>> <lc...@google.com.invalid > > > > > >>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> wrote: > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> Jesse, I believe if one format gets special > > treatment > > > > in > > > > > >>>>> TextIO, > > > > > >>>>>>>>>>>> people > > > > > >>>>>>>>>>>>>>>>> will then ask why doesn't JSON, XML, ... also not > > > > > >> supported. > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> Also, the example that you provide is using the > > fact > > > > that > > > > > >>> the > > > > > >>>>>>>>>> input > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> format > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> is an Iterable<Item>. You had posted a question > > about > > > > > >> using > > > > > >>> KV > > > > > >>>>>>>>>> with > > > > > >>>>>>>>>>>>>>>>> TextIO.Write which wouldn't align with the > proposed > > > > input > > > > > >>>>> format > > > > > >>>>>>>>>>> and > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> still > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> would require to write a type conversion > function, > > > this > > > > > >> time > > > > > >>>>>>> from > > > > > >>>>>>>>>>> KV > > > > > >>>>>>>>>>>> to > > > > > >>>>>>>>>>>>>>>>> Iterable<Item> instead of KV to string. > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 9:50 AM, Jesse Anderson < > > > > > >>>>>>>>>>>> je...@smokinghand.com> > > > > > >>>>>>>>>>>>>>>>> wrote: > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> Lukasz, > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> I don't think you'd need complicated logic for > > > > > >>> TextIO.Write. 
> > > > > >>>>>>> For > > > > > >>>>>>>>>>> CSV > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> the > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> call would look like: > > > > > >>>>>>>>>>>>>>>>>> Stringify.to("", ",", "\n"); > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> Where the arguments would be > Stringify.to(prefix, > > > > > >>> delimiter, > > > > > >>>>>>>>>>>> suffix). > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> The code would be something like: > > > > > >>>>>>>>>>>>>>>>>> StringBuffer buffer = new StringBuffer(prefix); > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> for (Item item : list) { > > > > > >>>>>>>>>>>>>>>>>> buffer.append(item.toString()); > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> if(notLast) { > > > > > >>>>>>>>>>>>>>>>>> buffer.append(delimiter); > > > > > >>>>>>>>>>>>>>>>>> } > > > > > >>>>>>>>>>>>>>>>>> } > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> buffer.append(suffix); > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> c.output(buffer.toString()); > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> That would allow you to do the basic CSV, TSV, > and > > > > other > > > > > >>>>>>> formats > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> without > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> complicated logic. The same sort of thing could > be > > > done > > > > > >> for > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> TextIO.Write. 
> > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> Thanks, > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> Jesse > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> On Tue, Nov 8, 2016 at 10:30 AM Lukasz Cwik > > > > > >>>>>>>>>>>> <lc...@google.com.invalid > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> wrote: > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> The conversion from object to string will have > > uses > > > > > >> outside > > > > > >>>>> of > > > > > >>>>>>>>>>> just > > > > > >>>>>>>>>>>>>>>>>>> TextIO.Write so it seems logical that we would > > want > > > > to > > > > > >>> have > > > > > >>>>> a > > > > > >>>>>>>>>>> ParDo > > > > > >>>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> do > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> the > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>>> conversion. > > > > > >>>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>>> Text file formats have a lot of variance, even > if > > > you > > > > > >>>>> consider > > > > > >>>>>>>>>>> the > > > > > >>>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> subset > > > > > >>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> of CSV like formats where it could have fixed > > width > > > > > >> fields, > > > > > >>>>> or > > > > > >>>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> escaping > > > > > >>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>> and > > > > > >>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>>> quoting around other fields, or headers that > > should > > > > be > > > > > >>>>> placed > > > > > >>>>>>> at > > > > > >>>>>>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>>>> the > > > > > >>>>>>>>>>>>>>> > > > > > >>>>>>>>>>>>>>>> top. 
> >
> > Having all these format conversions within TextIO.Write seems like a lot
> > of logic to contain in that transform, which should just focus on writing
> > to files.
> >
> > On Tue, Nov 8, 2016 at 8:15 AM, Jesse Anderson <je...@smokinghand.com> wrote:
> >
> > > This is a thread moved over from the user mailing list.
> > >
> > > I think there needs to be a way to convert a PCollection<KV> to a
> > > PCollection<String>.
> > >
> > > To do a minimal WordCount, you have to manually convert the KV to a
> > > String:
> > >
> > >     p
> > >         .apply(TextIO.Read.from("playing_cards.tsv"))
> > >         .apply(Regex.split("\\W+"))
> > >         .apply(Count.perElement())
> > >         .apply(MapElements.via((KV<String, Long> count) ->
> > >                 count.getKey() + ":" + count.getValue()
> > >             ).withOutputType(TypeDescriptors.strings()))
> > >         .apply(TextIO.Write.to("output/stringcounts"));
> > >
> > > This code really should be something like:
> > >
> > >     p
> > >         .apply(TextIO.Read.from("playing_cards.tsv"))
> > >         .apply(Regex.split("\\W+"))
> > >         .apply(Count.perElement())
> > >         .apply(ToString.stringify())
> > >         .apply(TextIO.Write.to("output/stringcounts"));
> > >
> > > To summarize the discussion:
> > >
> > >    - JA: Add a method to StringDelegateCoder to output any KV or list
> > >    - JA and DH: Add a SimpleFunction that takes a type and runs
> > >      toString() on it:
> > >
> > >          class ToStringFn<InputT> extends SimpleFunction<InputT, String> {
> > >            public String apply(InputT input) {
> > >              return input.toString();
> > >            }
> > >          }
> > >
> > >    - JB: Add a general purpose type converter like in Apache Camel.
> > >    - JA: Add Object support to TextIO.Write that would write out the
> > >      toString of any Object.
> > >
> > > My thoughts:
> > >
> > > Is converting to a PCollection<String> mostly needed when you're using
> > > TextIO.Write?
> > > Will a general purpose transform only work in certain cases and you'll
> > > normally have to write custom code to format the strings the way you
> > > want them?
> > >
> > > IMHO, it's yes to both. I'd prefer to add Object support to TextIO.Write
> > > or a SimpleFunction that takes a delimiter as an argument. Making a
> > > SimpleFunction that's able to specify a delimiter (and perhaps a prefix
> > > and suffix) should cover the majority of formats and cases.
> > >
> > > Thanks,
> > >
> > > Jesse

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
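
As a rough illustration of the prefix/delimiter/suffix idea raised in this thread, the formatting step can be sketched in plain Java. The class name DelimitedFormatter and the method format are hypothetical and are not part of Beam. The sketch assumes the elements of one record are available as an Iterable, and it uses String.valueOf rather than toString() so that a null element becomes "null" instead of throwing.

```java
import java.util.Iterator;

/** Hypothetical helper (not a Beam class): builds prefix + item,item,... + suffix. */
final class DelimitedFormatter {

  private DelimitedFormatter() {}

  /** Converts each item to a string and writes the delimiter between items only. */
  static String format(Iterable<?> items, String prefix, String delimiter, String suffix) {
    StringBuilder buffer = new StringBuilder(prefix);
    Iterator<?> it = items.iterator();
    while (it.hasNext()) {
      // String.valueOf is null-safe; a plain toString() call would throw on null.
      buffer.append(String.valueOf(it.next()));
      // Checking hasNext() after consuming an item replaces the "notLast" test.
      if (it.hasNext()) {
        buffer.append(delimiter);
      }
    }
    return buffer.append(suffix).toString();
  }
}
```

With these assumptions, DelimitedFormatter.format(Arrays.asList("king", "hearts", 13), "", ",", "\n") yields the line king,hearts,13 followed by a newline. In a pipeline, logic like this would presumably live inside a function handed to MapElements, as suggested earlier in the thread.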