Correct, SqlTransform works well without explicit conversion. The Beam SQL Walkthrough page was a bit misleading. It says "Before applying a SQL query to a PCollection, the data in the collection must be in Row format" and shows examples of how to achieve it.

Thank you!
Gyorgy

On Wed, Apr 6, 2022 at 5:47 PM Brian Hulette <bhule...@google.com> wrote:

> Thanks Reuven!
>
> Gyorgy - please also note that we'd like it if users didn't actually have
> to interact with Rows directly. Beam should automatically convert to Row
> under the hood when you apply a schema-aware transform (e.g. SqlTransform or
> anything in org.apache.beam.sdk.schemas.transforms [1]) to PCollection<XYZ>.
> Why is it that you need to convert to Row?
>
> Brian
>
> [1]
> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/schemas/transforms/package-summary.html
>
> On Tue, Apr 5, 2022 at 10:13 PM Reuven Lax <re...@google.com> wrote:
>
>> You can apply Convert.toRows()
>>
>> On Tue, Apr 5, 2022 at 10:02 PM Balogh, György <bog...@ultinous.com>
>> wrote:
>>
>>> Hi Brian,
>>> Thank you, it worked, now I have a schema of my PCollection<XYZ>. The
>>> next step is still not clear. I'd like to convert this to PCollection<Row>
>>> to be able to query with SQL. The doc has an example on how to assemble the
>>> row but I assume there should be a way to do this automatically.
>>> Thank you,
>>> Gyorgy
>>>
>>> On Tue, Apr 5, 2022 at 10:53 PM Brian Hulette <bhule...@google.com>
>>> wrote:
>>>
>>>> Hi Gyorgy,
>>>>
>>>> You should be able to register ProtoMessageSchema [1] as the
>>>> SchemaProvider for your protobuf type, something like:
>>>>
>>>> SchemaRegistry.createDefault().registerSchemaProvider(XYZ.class, new ProtoMessageSchema())
>>>>
>>>> This is similar to annotating XYZ
>>>> with @DefaultSchema(ProtoMessageSchema.class), which of course doesn't work
>>>> in this case since you don't control the class.
>>>>
>>>> Adding @Reuven Lax <re...@google.com> in case he has a better solution.
>>>>
>>>> Brian
>>>>
>>>> [1]
>>>> https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/extensions/protobuf/ProtoMessageSchema.html
>>>>
>>>> On Tue, Apr 5, 2022 at 12:03 PM Balogh, György <bog...@ultinous.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> I'm using the Java Beam SDK.
>>>>> I have a PCollection<XYZ> where XYZ is a class generated from a proto2
>>>>> file with protoc.
>>>>> Is it possible to infer schema and have a PCollection<Row> from this?
>>>>> Thank you,
>>>>> Gyorgy
>>>>> --
>>>>>
>>>>> György Balogh
>>>>> CEO
>>>>> E gyorgy.bal...@ultinous.com
>>>>> M +36 30 270 8342
>>>>> A HU, 1117 Budapest, Budafoki út 209.
>>>>> W www.ultinous.com