Hi, Please add me as a contributor to the Beam Issue Tracker. I would like to work on this feature. My ASF Jira Username: "rahul8383"
Thanks, Rahul On Wed, Jul 17, 2019 at 1:06 AM Rui Wang <ruw...@google.com> wrote: > Another approach is to let BeamSQL support it natively, as the title of > this thread says: "as a Table in BeamSQL". > > We might be able to define a table with properties that says this table > return a PCollectionView. By doing so we will have a trigger based > PCollectionView available in SQL rel nodes, thus SQL will be able to > implement [*Pattern: Slowly-changing lookup cache].* By this way, users > only need to construct a table and set it to SqlTransform > <https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/SqlTransform.java#L186> > *. * > > Create a JIRA to track this idea: > https://jira.apache.org/jira/browse/BEAM-7758 > > > -Rui > > > On Tue, Jul 16, 2019 at 7:12 AM Reza Rokni <r...@google.com> wrote: > >> Hi Rahul, >> >> FYI, that patterns is also available in the Beam docs ( with updated >> code example ) >> https://beam.apache.org/documentation/patterns/side-input-patterns/. >> >> Please note in the DoFn that feeds the View.asSingleton() you will need >> to manually call BigQuery using the BigQuery client. >> >> Regards >> >> Reza >> >> On Tue, 16 Jul 2019 at 14:37, rahul patwari <rahulpatwari8...@gmail.com> >> wrote: >> >>> Hi, >>> >>> we are following [*Pattern: Slowly-changing lookup cache*] from >>> https://cloud.google.com/blog/products/gcp/guide-to-common-cloud-dataflow-use-case-patterns-part-1 >>> >>> We have a use case to read slowly changing bounded data as a PCollection >>> along with the main PCollection from Kafka(windowed) and use it in the >>> query of BeamSql. >>> >>> Is it possible to design such a use case with Beam Java SDK? >>> >>> Approaches followed but not Successful: >>> >>> 1) GenerateSequence => GlobalWindow with Data Trigger => Composite >>> Transform(which applies Beam I/O on the >>> pipeline[PCollection.getPipeline()]) => Convert the resulting PCollection >>> to PCollection<Row> Apply BeamSQL >>> Comments: Beam I/O reads data only once even though a long value is >>> generated from GenerateSequece with periodicity. The expectation is that >>> whenever a long value is generated, Beam I/O will be used to read the >>> latest data. Is this because of optimizations in the DAG? Can the >>> optimizations be overridden? >>> >>> 2) The pipeline is the same as approach 1. But, instead of using a >>> composite transform, a DoFn is used where a for loop will emit each Row of >>> the PCollection. >>> comments: The output PCollection is unbounded. But, we need a bounded >>> PCollection as this PCollection is used to JOIN with PCollection of each >>> window from Kafka. How can we convert an Unbounded PCollection to Bounded >>> PCollection inside a DoFn? >>> >>> Are there any better Approaches? >>> >>> Regards, >>> Rahul >>> >>> >>> >> >> -- >> >> This email may be confidential and privileged. If you received this >> communication by mistake, please don't forward it to anyone else, please >> erase all copies and attachments, and please let me know that it has gone >> to the wrong person. >> >> The above terms reflect a potential business arrangement, are provided >> solely as a basis for further discussion, and are not intended to be and do >> not constitute a legally binding obligation. No legally binding obligations >> will be created, implied, or inferred until an agreement in final form is >> executed in writing by all parties involved. >> >