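My recommendation is to use materialized views (MVs) created in Hive together with Spark Structured Streaming and Change Data Capture (CDC). This is a good combination for efficiently streaming view data updates in your scenario: the MV holds the view result precomputed, so it need not be rebuilt for every incoming id, and CDC tells you which rows actually changed.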
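As a rough illustration of the idea only, here is a minimal plain-Python sketch (not Spark or Hive code) of refreshing just the rows touched by a batch of changed ids instead of re-running the whole view. The tables `customers` and `orders`, the function names `view_row` and `apply_changes`, and the shape of the view are all hypothetical stand-ins. In Spark, similar per-batch logic would typically live in a `foreachBatch` handler, but that wiring is an assumption and is not shown here.

```python
# Hypothetical base tables keyed by id; stand-ins for the tables behind the view.
customers = {1: "Ann", 2: "Bob"}
orders = {1: 100, 2: 250}


def view_row(id_):
    """Compute the view for a single id (stand-in for the view query filtered by id)."""
    if id_ not in customers or id_ not in orders:
        return None  # the id no longer appears in the view result
    return {"id": id_, "name": customers[id_], "total": orders[id_]}


def apply_changes(materialized, changed_ids):
    """Refresh only the rows whose ids arrived in one batch of change events.

    `materialized` plays the role of the precomputed view (id -> row).
    Returns the rows to upsert and the ids to delete in the downstream index.
    """
    upserts, deletes = [], []
    for id_ in sorted(set(changed_ids)):  # de-duplicate ids within the batch
        row = view_row(id_)
        if row is None:
            materialized.pop(id_, None)
            deletes.append(id_)
        else:
            materialized[id_] = row
            upserts.append(row)
    return upserts, deletes
```

The point of the sketch is that the work per batch is proportional to the number of changed ids, not to the size of the view; the returned upserts and deletes are what would be pushed to the index.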
HTH

Mich Talebzadeh,
Technologist | Architect | Data Engineer | Generative AI | FinCrime
London
United Kingdom

view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, "one test result is worth one-thousand expert opinions" (Wernher von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>).

On Thu, 2 May 2024 at 21:25, Karthick Nk <kcekarth...@gmail.com> wrote:

> Hi All,
>
> Requirements:
> I am working on a data flow that uses a view definition (already defined in the schema). The view definition references multiple tables. We want to stream the view data into an Elastic index whenever data changes in any of the tables used in the view definition.
>
> Current flow:
> 1. We insert ids from the tables used in the view definition into a common table.
> 2. From the common table, using those ids, we stream the view data with Spark Structured Streaming, checking whether any incoming id is present in the collective ids of all tables used in the view definition.
>
> Issue:
> 1. For each incoming id we run the view definition (so it reads all the data from all the tables) and check whether any incoming id is present in the collective ids of the view result. Because of this, it uses more memory on the cluster driver and takes more time to process.
>
> I am expecting an alternative solution that avoids a full scan of the view definition every time. If you have an alternative design flow for achieving this result, please suggest it.
>
> Note: It would also be helpful if you could share details of a community forum or platform for discussing this kind of design-related topic.
>