You can join a streaming Dataset with a static Dataset. I would prefer your first approach (a DataFrame created via spark.read.format("jdbc")). The problem with this approach is performance: unless you cache the static Dataset, every time a join query fires it will fetch the latest records from the table.
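A minimal sketch of the stream-static join being discussed, in Scala. The JDBC URL, the Kafka source, and the table/column names (dim_accounts, account_id) are all illustrative assumptions, not anything from Satyajit's setup; the point is the `.cache()` on the static side, which trades freshness for not re-querying the database on every micro-batch:

```scala
import org.apache.spark.sql.SparkSession

object StreamStaticJoin {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("stream-static-join")
      .getOrCreate()

    // Static side: read the lookup table once over JDBC and cache it,
    // so each micro-batch joins against the in-memory copy instead of
    // hitting the database again. The cached copy will NOT see later
    // updates to the table.
    val staticDf = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://dbhost:5432/reports") // assumption
      .option("dbtable", "dim_accounts")                      // assumption
      .load()
      .cache()

    // Streaming side: the table's change records arriving from a
    // source (a Kafka topic here, purely for illustration).
    val streamDf = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // assumption
      .option("subscribe", "table_changes")             // assumption
      .load()
      .selectExpr(
        "CAST(key AS STRING) AS account_id",
        "CAST(value AS STRING) AS payload")

    // Stream-static join: supported by Structured Streaming with no
    // watermark needed, because the static side is bounded.
    val joined = streamDf.join(staticDf, Seq("account_id"))

    joined.writeStream
      .format("console")
      .start()
      .awaitTermination()
  }
}
```

If the lookup table changes often, dropping `.cache()` (or periodically unpersisting and re-reading the static DataFrame) gives fresher results at the cost of a database round-trip per micro-batch.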
Regards,
Rishitesh Mishra,
SnappyData (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra

On Tue, Dec 12, 2017 at 6:29 AM, satyajit vegesna <satyajit.apas...@gmail.com> wrote:
> Hi All,
>
> I am working on a real-time reporting project and have a question about a
> structured streaming job that streams a particular table's records and has
> to join them to an existing table.
>
> Stream ----> query/join to another DF/DS ---> update the stream data record.
>
> Now the question is how to approach the middle layer (the query/join to
> another DF/DS): should I create a DataFrame from spark.read.format("JDBC"),
> or "stream and maintain the data in a memory sink", or is there a better
> way to do it?
>
> Would like to know if anyone has faced a similar scenario and has any
> suggestions on how to go ahead.
>
> Regards,
> Satyajit.