Team,
Any update on this?
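For clarity, here is the shape I am after, sketched as a single Beam pipeline. This is only my rough guess at an approach, assuming the chain can be built by reusing the in-memory PCollection (rather than re-reading table_B from BigQuery); the "Transform" step and its identity DoFn are placeholders for the real staging-to-target logic:

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

public class ChainedPipeline {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Read once from the source table A.
    PCollection<TableRow> aRecords = p.apply("ReadA",
        BigQueryIO.readTableRows().from("project:dataset.table_A"));

    // Write the raw records to staging table B.
    aRecords.apply("WriteB", BigQueryIO.writeTableRows()
        .to("project:dataset.table_B")
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));

    // Continue the chain from the in-memory PCollection instead of
    // re-reading table_B, so the graph stays connected instead of
    // splitting into two independent jobs.
    PCollection<TableRow> transformed = aRecords.apply("Transform",
        ParDo.of(new DoFn<TableRow, TableRow>() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            c.output(c.element()); // identity for now; placeholder logic
          }
        }));

    transformed.apply("WriteC", BigQueryIO.writeTableRows()
        .to("project:dataset.table_C")
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));

    p.run().waitUntilFinish();
  }
}
```

That said, I am not sure whether this guarantees the write to table_B completes before the records for table_C are derived, which is part of my question.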

On Mon, Jun 13, 2022 at 8:39 PM Ravi Kapoor <[email protected]> wrote:

> Hi Team,
>
> I am currently using Beam in my project with Dataflow Runner.
> I am trying to create a pipeline where the data flows from the source to
> staging then to target such as:
>
> A (Source) -> B(Staging) -> C (Target)
>
> When I create a pipeline as below:
>
> PCollection<TableRow> table_A_records = p.apply(BigQueryIO.readTableRows()
>         .from("project:dataset.table_A"));
>
> table_A_records.apply(BigQueryIO.writeTableRows()
>         .to("project:dataset.table_B")
>         .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
>         .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));
>
> PCollection<TableRow> table_B_records = p.apply(BigQueryIO.readTableRows()
>         .from("project:dataset.table_B"));
>
> table_B_records.apply(BigQueryIO.writeTableRows()
>         .to("project:dataset.table_C")
>         .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
>         .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE));
>
> p.run().waitUntilFinish();
>
>
> It basically creates two parallel job graphs in Dataflow instead of
> creating a chained transformation as expected:
> A -> B
> B -> C
> I need to create a data pipeline that flows the data in a chain, with
> branches after the final stage, like:
>
>               D
>              /
> A -> B -> C
>              \
>               E
>
> Is there a way to achieve this transformation between the source and
> target tables?
>
> Thanks,
> Ravi
>


-- 
Thanks,
Ravi Kapoor
+91-9818764564
[email protected]