PCollections are usually persistent within a pipeline, so you can reuse them in other parts of the same pipeline without any problem.
There is no notion of state across pipelines; every pipeline is independent. If you want state across pipelines, you can write the PCollection out to a set of files and read them back in the new pipeline.

On Tue, Oct 18, 2022 at 11:45 PM Ravi Kapoor <kapoorrav...@gmail.com> wrote:

> Hi Team,
> Can we stage PCollection<TableRow> or PCollection<Row> data? Let's say we
> want to avoid repeating the expensive operations between two complex BQ
> tables time and again, and instead materialize the result in some temp
> view which will be deleted after the session.
>
> Is it possible to do that in a Beam pipeline?
> We could later use the temp view in another pipeline to read the data
> from and do further processing.
>
> More generally, I would like to know: do we ever stage a PCollection?
> Let's say I want to create another instance of the same job, which has
> complex processing.
> Does the pipeline re-perform the computation, or would it pick up the
> data already processed by the previous instance, which must be staged
> somewhere?
>
> In Spark, for example, we have createOrReplaceTempView, which is used to
> create a temp table from a Spark DataFrame or Dataset.
>
> Please advise.
>
> --
> Thanks,
> Ravi Kapoor
> +91-9818764564
> kapoorrav...@gmail.com