Team,
Any update on this?
On Mon, Jun 13, 2022 at 8:39 PM Ravi Kapoor <[email protected]> wrote:
> Hi Team,
>
> I am currently using Beam in my project with Dataflow Runner.
> I am trying to create a pipeline where the data flows from the source to
> staging then to target such as:
>
> A (Source) -> B(Staging) -> C (Target)
>
> When I create a pipeline as below:
>
> PCollection<TableRow> table_A_records =
>     p.apply(BigQueryIO.readTableRows().from("project:dataset.table_A"));
>
> table_A_records.apply(BigQueryIO.writeTableRows()
>     .to("project:dataset.table_B")
>     .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
>     .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));
>
> PCollection<TableRow> table_B_records =
>     p.apply(BigQueryIO.readTableRows().from("project:dataset.table_B"));
>
> table_B_records.apply(BigQueryIO.writeTableRows()
>     .to("project:dataset.table_C")
>     .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
>     .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));
>
> p.run().waitUntilFinish();
>
>
> It creates two parallel job graphs in Dataflow instead of the chained
> transformation I expected:
> A -> B
> B -> C
> I need to build a data pipeline that flows the data in a chain, like:
>        D
>       /
> A -> B -> C
>       \
>        E
> Is there a way to achieve this transformation in between source and target
> tables?
>
> Thanks,
> Ravi
>
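For clarity, here is a sketch of the chained shape I am trying to build. This assumes Beam's Java SDK for Dataflow; ToStagingFn and ToTargetFn below are hypothetical DoFn<TableRow, TableRow> stand-ins for the real transformation logic. The idea is to derive each stage from the previous PCollection inside the pipeline, rather than re-reading the intermediate table from BigQuery, so the runner sees one connected graph:

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

// Read the source table once.
PCollection<TableRow> aRecords =
    p.apply("ReadA", BigQueryIO.readTableRows().from("project:dataset.table_A"));

// Derive the staging data from A in-pipeline (ToStagingFn is hypothetical),
// instead of writing table_B and then re-reading it. The re-read is what
// splits the job into two disconnected graphs.
PCollection<TableRow> bRecords =
    aRecords.apply("AtoB", ParDo.of(new ToStagingFn()));

// Persist the staging stage to table_B as a side effect of the chain.
bRecords.apply("WriteB", BigQueryIO.writeTableRows()
    .to("project:dataset.table_B")
    .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
    .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));

// Continue the chain from the same staged PCollection for the target;
// further branches (D, E) would each apply their own transform to bRecords.
bRecords.apply("BtoC", ParDo.of(new ToTargetFn()))
    .apply("WriteC", BigQueryIO.writeTableRows()
        .to("project:dataset.table_C")
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));

p.run().waitUntilFinish();
```

Is this the idiomatic way to get the A -> B -> C chain (with D and E fanning out from B) into a single Dataflow job graph?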
--
Thanks,
Ravi Kapoor
+91-9818764564
[email protected]