Hi, Sorry for the delay. So sounds like you want to do something after writing a window of data to BigQuery is complete. I think this should be possible: expansion of BigQueryIO.write() returns a WriteResult and you can apply other transforms to it. Have you tried that?
On Sat, Aug 26, 2017 at 1:10 PM Chaim Turkel <[email protected]> wrote: > I have documents from a mongo db that i need to migrate to bigquery. > Since it is mongodb i do not know they schema ahead of time, so i have > two pipelines, one to run over the documents and update the bigquery > schema, then wait a few minutes (i can take for bigquery to be able to > use the new schema) then with the other pipline copy all the > documents. > To know as to where i got with the different piplines i have a status > table so that at the start i know from where to continue. > So i need the option to update the status table with the success of > the copy and some time value of the last copied document > > > chaim > > On Fri, Aug 25, 2017 at 6:53 PM, Eugene Kirpichov > <[email protected]> wrote: > > I'd like to know more about your both use cases, can you clarify? I think > > making sinks output something that can be waited on by another pipeline > > step is a reasonable request, but more details would help refine this > > suggestion. > > > > On Fri, Aug 25, 2017, 8:46 AM Chamikara Jayalath <[email protected]> > > wrote: > > > >> Can you do this from the program that runs the Beam job, after job is > >> complete (you might have to use a blocking runner or poll for the > status of > >> the job) ? > >> > >> - Cham > >> > >> On Fri, Aug 25, 2017 at 8:44 AM Steve Niemitz <[email protected]> > wrote: > >> > >> > I also have a similar use case (but with BigTable) that I feel like I > had > >> > to hack up to make work. It'd be great to hear if there is a way to > do > >> > something like this already, or if there are plans in the future. > >> > > >> > On Fri, Aug 25, 2017 at 9:46 AM, Chaim Turkel <[email protected]> > wrote: > >> > > >> > > Hi, > >> > > I have a few piplines that are an ETL from different systems to > >> > bigquery. > >> > > I would like to write the status of the ETL after all records have > >> > > been updated to the bigquery. > >> > > The problem is that writing to bigquery is a sink and you cannot > have > >> > > any other steps after the sink. > >> > > I tried a sideoutput, but this is called in no correlation to the > >> > > writing to bigquery, so i don't know if it succeeded or failed. > >> > > > >> > > > >> > > any ideas? > >> > > chaim > >> > > > >> > > >> >
