Another approach is to let BeamSQL support it natively, as the title of this thread says: "as a Table in BeamSQL".
We might be able to define a table with properties that say the table returns a PCollectionView. That would make a trigger-based PCollectionView available in SQL rel nodes, so SQL would be able to implement [*Pattern: Slowly-changing lookup cache*]. This way, users only need to construct a table and set it on SqlTransform <https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/SqlTransform.java#L186>.

Created a JIRA to track this idea: https://jira.apache.org/jira/browse/BEAM-7758

-Rui

On Tue, Jul 16, 2019 at 7:12 AM Reza Rokni <r...@google.com> wrote:

> Hi Rahul,
>
> FYI, that pattern is also available in the Beam docs (with an updated code
> example):
> https://beam.apache.org/documentation/patterns/side-input-patterns/
>
> Please note that in the DoFn that feeds the View.asSingleton() you will
> need to call BigQuery manually using the BigQuery client.
>
> Regards
>
> Reza
>
> On Tue, 16 Jul 2019 at 14:37, rahul patwari <rahulpatwari8...@gmail.com>
> wrote:
>
>> Hi,
>>
>> We are following [*Pattern: Slowly-changing lookup cache*] from
>> https://cloud.google.com/blog/products/gcp/guide-to-common-cloud-dataflow-use-case-patterns-part-1
>>
>> We have a use case to read slowly changing bounded data as a PCollection
>> along with the main PCollection from Kafka (windowed) and use it in a
>> BeamSQL query.
>>
>> Is it possible to design such a use case with the Beam Java SDK?
>>
>> Approaches tried without success:
>>
>> 1) GenerateSequence => GlobalWindow with Data Trigger => Composite
>> Transform (which applies Beam I/O on the
>> pipeline [PCollection.getPipeline()]) => Convert the resulting PCollection
>> to PCollection<Row> => Apply BeamSQL
>> Comments: Beam I/O reads the data only once, even though a long value is
>> generated periodically by GenerateSequence. The expectation is that
>> whenever a long value is generated, Beam I/O will be used to read the
>> latest data. Is this because of optimizations in the DAG? Can the
>> optimizations be overridden?
>>
>> 2) The pipeline is the same as in approach 1, but instead of a
>> composite transform, a DoFn is used in which a for loop emits each Row of
>> the PCollection.
>> Comments: The output PCollection is unbounded, but we need a bounded
>> PCollection because it is JOINed with the PCollection of each
>> window from Kafka. How can we convert an unbounded PCollection to a bounded
>> PCollection inside a DoFn?
>>
>> Are there any better approaches?
>>
>> Regards,
>> Rahul
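
For readers following the thread, here is a small Beam-free sketch in plain Java of the behaviour discussed above. In approach 1 of the quoted message the read happens when the pipeline is constructed, so it runs once. In the pattern Reza points to, the read is done inside the per-element function that feeds View.asSingleton(), so it runs again on every tick. All class and method names below (LookupSnapshotSketch, LatestSnapshot, onTick, enrich) are hypothetical and are not part of Beam. The HashMap stands in for an external store such as BigQuery. This only illustrates the semantics and is not a pipeline implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Plain-Java illustration only; none of these names exist in Beam.
final class LookupSnapshotSketch {

    /** Keeps the latest copy of the lookup data, similar in spirit to a singleton side input. */
    static final class LatestSnapshot<K, V> {
        private final AtomicReference<Map<K, V>> current =
                new AtomicReference<>(Collections.<K, V>emptyMap());
        private final Supplier<Map<K, V>> reader;

        LatestSnapshot(Supplier<Map<K, V>> reader) {
            this.reader = reader;
        }

        /** Called once per tick: re-reads the source and replaces the previous copy. */
        void onTick() {
            current.set(Collections.unmodifiableMap(new HashMap<>(reader.get())));
        }

        /** Returns the copy taken at the most recent tick (empty before the first tick). */
        Map<K, V> get() {
            return current.get();
        }
    }

    /** Joins main-input keys against whichever snapshot is passed in. */
    static List<String> enrich(List<String> keys, Map<String, String> lookup) {
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            out.add(key + "=" + lookup.getOrDefault(key, "unknown"));
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the slowly changing external table.
        Map<String, String> externalTable = new HashMap<>();
        externalTable.put("a", "1");

        LatestSnapshot<String, String> side = new LatestSnapshot<>(() -> externalTable);
        List<String> window = Arrays.asList("a", "b");

        side.onTick(); // first tick reads the table
        System.out.println(enrich(window, side.get())); // [a=1, b=unknown]

        externalTable.put("b", "2"); // the table changes, but no tick has fired yet
        System.out.println(enrich(window, side.get())); // [a=1, b=unknown]

        side.onTick(); // next tick re-reads the table
        System.out.println(enrich(window, side.get())); // [a=1, b=2]
    }
}
```

The second printed line is the important one: a change in the external table becomes visible only after the next tick re-runs the read, which is why the read has to live in the code that executes per generated element rather than in a transform applied once at construction time.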