Does DirectRunner do this today?

On Mon, Jun 4, 2018 at 9:10 PM Lukasz Cwik <lc...@google.com> wrote:
> Shouldn't the runner isolate each instance of the pipeline behind an appropriate class loader?
>
> On Sun, Jun 3, 2018 at 12:45 PM Reuven Lax <re...@google.com> wrote:
>
> Just an update: Romain and I chatted on Slack, and I think I understand his concern. The concern wasn't specifically about schemas, but rather about having a generic way to register per-ParDo state that has worker lifetime. As evidence that such a thing is needed: in many cases static variables are used to simulate it. Static variables, however, have downsides - if two pipelines are run on the same JVM (which happens often with unit tests, and there's nothing that prevents a runner from doing so in a production environment), these static variables will interfere with each other.
>
> On Thu, May 24, 2018 at 12:30 AM Reuven Lax <re...@google.com> wrote:
>
> Romain, maybe it would be useful for us to find some time on Slack. I'd like to understand your concerns. Also keep in mind that I'm tagging all these classes as Experimental for now, so we can definitely change these interfaces around if we decide they are not the best ones.
>
> Reuven

On Tue, May 22, 2018 at 11:35 PM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:

> Why not extend ProcessContext to add the new remapped output? But it looks good (the part I don't like is that creating a new context each time a new feature is added hurts users. What about when Beam adds some reactive support - a ReactiveOutputReceiver?)
>
> Pipeline sounds like the wrong storage, since once distributed you have serialized the instances, so you kind of broke the lifecycle of the original instance and have no real release/close hook on them anymore, right? Not sure we can do better than DoFn/source embedded instances today.

On Wed, May 23, 2018 at 08:02, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:

> On Wed, May 23, 2018 at 07:55, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
>> Hi,
>>
>> IMHO, it would be better to have an explicit transform/IO as converter. It would be easier for users.
>>
>> Another option would be to use a "TypeConverter/SchemaConverter" map as we do in Camel: Beam could check the source/destination "type" and check in the map if there's a converter available. This map can be stored as part of the pipeline (as we do for filesystem registration).
>
> It works in Camel because it is not strongly typed, isn't it? So it can require a new Beam pipeline API.
>
> +1 for the explicit transform; if added to the pipeline API as a coder it wouldn't break the fluent API: p.apply(io).setOutputType(Foo.class)
>
> Coders can be a workaround since they own the type, but since the PCollection is the real owner it is surely saner this way, no?
>
> Also it probably needs to ensure all converters are present before running the pipeline; no implicit environment converter support is probably a good starting point, to avoid late surprises.
>
>> My $0.01
>>
>> Regards
>> JB
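For illustration, a minimal sketch of the Camel-style "TypeConverter/SchemaConverter" map floated above, assuming converters are looked up by source/target type before the pipeline runs so that a missing converter fails fast. ConverterRegistry and SchemaConverter are hypothetical names, not an existing Beam API.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical Camel-style converter registry keyed by (source type, target type).
    public class ConverterRegistry {

      // Hypothetical converter contract.
      public interface SchemaConverter<A, B> {
        B convert(A value);
      }

      private final Map<String, SchemaConverter<?, ?>> converters = new ConcurrentHashMap<>();

      public <A, B> void register(Class<A> from, Class<B> to, SchemaConverter<A, B> converter) {
        converters.put(key(from, to), converter);
      }

      @SuppressWarnings("unchecked")
      public <A, B> SchemaConverter<A, B> lookup(Class<A> from, Class<B> to) {
        SchemaConverter<?, ?> c = converters.get(key(from, to));
        if (c == null) {
          // Fail before the pipeline runs, as suggested above, instead of at execution time.
          throw new IllegalStateException("No converter registered for " + key(from, to));
        }
        return (SchemaConverter<A, B>) c;
      }

      private static String key(Class<?> from, Class<?> to) {
        return from.getName() + "->" + to.getName();
      }
    }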
On 23/05/2018 07:51, Romain Manni-Bucau wrote:

> How does it work on the pipeline side? Do you generate these "virtual" IOs at build time to enable the fluent API to work without erasing generics?
>
> Example: SQL(row)->BigQuery(native) will not compile, so we need a SQL(row)->BigQuery(row).
>
> Side note unrelated to Row: if you add another registry, maybe a pre-task is to ensure Beam has a kind of singleton/context so it isn't duplicated or left untracked. These kinds of converters generally need a global close, not only a per-record one: converter.init(); converter.convert(row); ...; converter.destroy(); - otherwise it easily leaks. This is why it can require some way to avoid recreating it. A quick fix, if you are in ByteBuddy already, can be to add it to setup/teardown probably; being more global would be nicer but is more challenging.
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> | Blog <https://rmannibucau.metawerx.net/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/application-development/java-ee-8-high-performance>

On Wed, May 23, 2018 at 07:22, Reuven Lax <re...@google.com> wrote:

> No - the only modules we need to add to core are the ones we choose to add. For example, I will probably add a registration for TableRow/TableSchema (GCP BigQuery) so these can work seamlessly with schemas. However, I will add that to the GCP module, so only someone depending on that module needs to pull in that dependency. The Java ServiceLoader framework can be used by these modules to register schemas for their types (we already do something similar for FileSystem and for coders as well).
>
> BTW, right now the conversion back and forth between Row objects is done in the ByteBuddy-generated bytecode that we generate in order to invoke DoFns.
>
> Reuven

On Tue, May 22, 2018 at 10:04 PM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:

> Hmm, the pluggability part is close to what I wanted to do with JsonObject as a main API (to avoid redoing a "row" API and a schema API).
>
> Row.as(Class<T>) sounds good, but then does it mean we'll get beam-sdk-java-row-jsonobject-like modules (I'm not against, just trying to understand here)? If so, how can an IO use as() with the type it expects? Doesn't it lead to having tons of these modules in the end?
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> | Blog <https://rmannibucau.metawerx.net/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/application-development/java-ee-8-high-performance>
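For illustration, a minimal sketch of the setup/teardown idea Romain mentions above: the converter is created once per DoFn instance (roughly worker lifetime) instead of per record, and released explicitly. MyConverter is a hypothetical stand-in for a resource with the init/convert/destroy lifecycle; this is not an existing Beam helper.

    import org.apache.beam.sdk.transforms.DoFn;

    // Ties a hypothetical init/convert/destroy resource to the DoFn lifecycle.
    class ConvertingFn extends DoFn<String, String> {

      // Hypothetical resource with the lifecycle discussed in the thread.
      static class MyConverter {
        void init() {}
        String convert(String in) { return in; }
        void destroy() {}
      }

      private transient MyConverter converter;

      @Setup
      public void setup() {
        converter = new MyConverter();
        converter.init();            // global init, not per record
      }

      @ProcessElement
      public void process(@Element String element, OutputReceiver<String> out) {
        out.output(converter.convert(element));
      }

      @Teardown
      public void teardown() {
        converter.destroy();         // global close, avoids the leak mentioned above
      }
    }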
On Wed, May 23, 2018 at 04:57, Reuven Lax <re...@google.com> wrote:

> By the way Romain, if you have specific scenarios in mind, I would love to hear them. I can try to guess what exactly you would like to get out of schemas, but it would work better if you gave me concrete scenarios that you would like to work.
>
> Reuven

On Tue, May 22, 2018 at 7:45 PM Reuven Lax <re...@google.com> wrote:

> Yeah, what I'm working on will help with IO. Basically, if you register a function with SchemaRegistry that converts back and forth between a type (say JsonObject) and a Beam Row, then it is applied by the framework behind the scenes as part of DoFn invocation. Concrete example: let's say I have an IO that reads JSON objects:
>
>     class MyJsonIORead extends PTransform<PBegin, PCollection<JsonObject>> {...}
>
> If you register a schema for this type (or you can also just set the schema directly on the output PCollection), then Beam knows how to convert back and forth between JsonObject and Row. So the next ParDo can look like:
>
>     p.apply(new MyJsonIORead())
>      .apply(ParDo.of(new DoFn<JsonObject, T>() {
>          @ProcessElement
>          void process(@Element Row row) { ... }
>      }))
>
> And Beam will automatically convert JsonObject to a Row for processing (you aren't forced to do this, of course - you can always ask for it as a JsonObject).
>
> The same is true for output. If you have a sink that takes in JsonObject but the transform before it produces Row objects (for instance, because the transform before it is Beam SQL), Beam can automatically convert Row back to JsonObject for you.
>
> All of this was detailed in the Schema doc I shared a few months ago. There was a lot of discussion on that document from various parties, and some of this API is a result of that discussion. This is also working in the branch JB and I were working on, though not yet integrated back to master.
>
> I would like to actually go further and make Row an interface, and provide a way to automatically put a Row interface on top of any other object (e.g. JsonObject, POJO, etc.). This won't change the way the user writes code, but instead of Beam having to copy and convert at each stage (e.g. from JsonObject to Row), it will simply create a Row object that uses the JsonObject as its underlying storage.
>
> Reuven

On Tue, May 22, 2018 at 11:37 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:

> Well, Beam can implement a new mapper, but it doesn't help for IO. Most modern backends will take JSON directly, even the javax one, and it must stay generic.
>
> Then, since JSON-to-POJO mapping has already been done a dozen times, I'm not sure it is worth it for now.
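To ground the SchemaRegistry flow Reuven describes above, a rough sketch of what the consuming side might look like. It assumes `p` is a Pipeline, that MyJsonIORead is the transform from the quoted message, and that a JsonObject <-> Row conversion has already been registered (via the experimental SchemaRegistry or by setting the schema on the output PCollection); the "name" field is a made-up example.

    import javax.json.JsonObject;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.ParDo;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.Row;

    PCollection<JsonObject> json = p.apply(new MyJsonIORead());

    PCollection<String> names = json.apply(ParDo.of(
        new DoFn<JsonObject, String>() {
          @ProcessElement
          public void process(@Element Row row, OutputReceiver<String> out) {
            // The framework converts each JsonObject to a Row before invoking this.
            out.output(row.getString("name"));
          }
        }));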
On Tue, May 22, 2018 at 20:27, Reuven Lax <re...@google.com> wrote:

> We can do even better, btw: building a SchemaRegistry where automatic conversions can be registered between schemas and Java data types. With this, the user won't even need a DoFn to do the conversion.

On Tue, May 22, 2018 at 10:13 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:

> Hi guys,
>
> I checked out what has been done on the schema model and think it is acceptable - regarding the JSON debate - if https://issues.apache.org/jira/browse/BEAM-4381 can be fixed.
>
> At a high level, it is about providing a mainstream and not-too-impacting model OOTB, and JSON seems the most valid option for now, at least for IO and some user transforms.
>
> Wdyt?

On Fri, Apr 27, 2018 at 18:36, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:

> Can give it a try end of May, sure (holidays and work constraints will make it hard before).

On Apr 27, 2018 at 18:26, Anton Kedin <ke...@google.com> wrote:

> Romain,
>
> I don't believe that the JSON approach was investigated very thoroughly. I mentioned a few reasons which make it not the best choice in my opinion, but I may be wrong. Can you put together a design doc or a prototype?
>
> Thank you,
> Anton
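For readers following the JSON debate below, a minimal sketch of the JSON-P (javax.json) JsonObject API being proposed as the record abstraction; the field names are made up for illustration.

    import javax.json.Json;
    import javax.json.JsonObject;

    // Building and reading a JSON-P record.
    JsonObject record = Json.createObjectBuilder()
        .add("name", "alice")
        .add("age", 42)
        .build();

    String name = record.getString("name");
    int age = record.getInt("age");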
On Thu, Apr 26, 2018 at 10:17 PM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:

> On Apr 26, 2018 at 23:13, "Anton Kedin" <ke...@google.com> wrote:
>
>> BeamRecord (Row) has very little in common with JsonObject (I assume you're talking about javax.json), except maybe some similarities of the API. A few reasons why JsonObject doesn't work:
>>
>> * it is a Java EE API:
>>   o Beam SDK is not limited to Java. There are probably similar APIs for other languages, but they might not necessarily carry the same semantics / APIs;
>
> Not a big deal I think. At least not a technical blocker.
>
>>   o it can change between Java versions;
>
> No, this is Java EE ;).
>
>>   o the current Beam Java implementation is an experimental feature to identify what's needed from such an API; in the end we might end up with something similar to the JsonObject API, but likely not;
>
> I don't get that point as a blocker.
>
>> * it represents JSON, which is not an API but an object notation:
>>   o it is defined as a unicode string in a certain format. If you choose to adhere to ECMA-404, then it doesn't sound like JsonObject can represent an Avro object, if I'm reading it right;
>
> That is in the generator impl; you can implement an Avro generator.
>
>> * it doesn't define a type system (JSON does, but it's lacking):
>>   o for example, JSON doesn't define semantics for numbers;
>>   o it doesn't define date/time types;
>>   o it doesn't allow extending the JSON type system at all;
>
> That is why you need a metadata object, or simpler, a schema with that data. JSON or Beam record doesn't help here, and you end up with the same outcome if you think about it.
>
>> * it lacks schemas;
>
> JSON Schemas are standard, widely spread and tooled compared to the alternative.
>
>> You can definitely try to loosen the requirements and define everything in JSON in userland, but the point of Row/Schema is to avoid that and define everything in the Beam model, which can be extended and mapped to JSON, Avro, BigQuery schemas, custom binary formats etc., with the same semantics across Beam SDKs.
>
> This is what JSON-P would allow, with the benefit of natural POJO support through JSON-B.

On Thu, Apr 26, 2018 at 12:28 PM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:

> Just to make it clear and let me understand: how is BeamRecord different from a JsonObject, which is an API without an implementation (not even a JSON one OOTB)? The advantages of the JSON *API* are indeed the natural mapping (JSON-B is based on JSON-P, so there is no new binding to reinvent) and simple serialization (JSON+gzip for example, or Avro if you want to be geeky).
>
> I fail to see the point of rebuilding an ecosystem ATM.

On Apr 26, 2018 at 19:12, "Reuven Lax" <re...@google.com> wrote:

> Exactly what JB said. We will write a generic conversion from Avro (or JSON) to Beam schemas, which will make them work transparently with SQL. The plan is also to migrate Anton's work so that POJOs work generically for any schema.
>
> Reuven
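To make the "natural POJO support through JSON-B" point concrete, a minimal sketch using the javax.json.bind API; User is an illustrative POJO, not a Beam type, and a JSON-B implementation (e.g. Yasson) is assumed on the classpath at runtime.

    import javax.json.bind.Jsonb;
    import javax.json.bind.JsonbBuilder;

    // Round-trips a POJO through JSON with no Beam-specific binding layer.
    public class JsonbExample {

      public static class User {
        public String name;
        public int age;
      }

      public static void main(String[] args) {
        Jsonb jsonb = JsonbBuilder.create();
        User user = jsonb.fromJson("{\"name\":\"alice\",\"age\":42}", User.class);
        System.out.println(user.name + " / " + user.age);
        System.out.println(jsonb.toJson(user));
      }
    }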
On Thu, Apr 26, 2018 at 1:17 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> For now we have a generic schema interface. JSON-B can be an impl, Avro could be another one.
>
> Regards
> JB

On Apr 26, 2018 at 12:08, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:

> Hmm,
>
> Avro still has the pitfall of an uncontrolled stack which brings way too many dependencies to be part of any API. This is why I proposed a JSON-P based API (JsonObject) with a custom Beam entry for some metadata (headers "à la Camel").
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> | Blog <https://rmannibucau.metawerx.net/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-04-26 9:59 GMT+02:00 Jean-Baptiste Onofré <j...@nanthrax.net>:

> Hi Ismael,
>
> You mean directly in Beam SQL?
>
> That will be part of the schema support: a generic record could be one of the payloads, with a schema across them.
>
> Regards
> JB

On Apr 26, 2018 at 11:39, "Ismaël Mejía" <ieme...@gmail.com> wrote:

> Hello Anton,
>
> Thanks for the descriptive email and the really useful work. Any plans to tackle PCollections of GenericRecord/IndexedRecords? It seems Avro is a natural fit for this approach too.
>
> Regards,
> Ismaël
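As context for the GenericRecord question, a minimal sketch of the kind of schema-described Avro element being discussed; the record layout is made up for illustration.

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;

    // Parse an Avro schema and build a GenericRecord that conforms to it.
    Schema avroSchema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
            + "{\"name\":\"name\",\"type\":\"string\"},"
            + "{\"name\":\"age\",\"type\":\"int\"}]}");

    GenericRecord record = new GenericData.Record(avroSchema);
    record.put("name", "alice");
    record.put("age", 42);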
On Wed, Apr 25, 2018 at 9:04 PM, Anton Kedin <ke...@google.com> wrote:

> Hi,
>
> I want to highlight a couple of improvements to Beam SQL we have been working on recently which are targeted at making the Beam SQL API easier to use. Specifically, these features simplify the conversion of Java Beans and JSON strings to Rows.
>
> Feel free to try this and send any bugs/comments/PRs my way.
>
> **Caveat: this is still work in progress, and has known bugs and incomplete features, see below for details.**
>
> Background
>
> Beam SQL queries can only be applied to PCollection<Row>. This means that users need to convert whatever PCollection elements they have to Rows before querying them with SQL. This usually requires
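For reference, a sketch of the manual element-to-Row conversion step described above, assuming the Schema/Row builder API and SqlTransform (BeamSql in older releases). MyPojo, `input` (an existing PCollection<MyPojo>), the field names, and the query are made-up examples.

    import org.apache.beam.sdk.extensions.sql.SqlTransform;
    import org.apache.beam.sdk.schemas.Schema;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.ParDo;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.Row;

    // Define the schema the Rows will carry.
    final Schema schema = Schema.builder()
        .addStringField("name")
        .addInt32Field("age")
        .build();

    // Convert each element to a Row by hand, then attach the schema.
    PCollection<Row> rows = input
        .apply(ParDo.of(new DoFn<MyPojo, Row>() {
          @ProcessElement
          public void process(@Element MyPojo pojo, OutputReceiver<Row> out) {
            out.output(Row.withSchema(schema).addValues(pojo.name, pojo.age).build());
          }
        }))
        .setRowSchema(schema);

    // Query the Rows with Beam SQL.
    PCollection<Row> adults =
        rows.apply(SqlTransform.query("SELECT name FROM PCOLLECTION WHERE age > 21"));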