Re: ETL Using Spark
Hi Avadhut Narayan Joshi,

The use case is achievable using Spark. Connecting to SQL Server is possible, as Mich mentioned below, as long as there is a JDBC driver that can connect to SQL Server.

For production workloads, these are the important points to consider:

>> What are the QoS requirements for your case: at-least-once, at-most-once, or exactly-once?
>> How will you handle Spark Streaming job restarts (because of an error, or because you have to deploy a new version of the application)?
>> What are your error handling strategies?
>> How do you deal with late-arriving data, since you are doing aggregations?

It is best to make downstream systems idempotent; that is a much less troublesome way to keep production workloads maintainable.

Best Regards
VP

thanks
Vijay

-- Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
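To make the point about idempotent downstream systems concrete: one common approach (an assumption here, not something stated in this reply) is to write each micro-batch to a staging table and then upsert it into the target by key, so that a replayed batch after a restart overwrites rows instead of duplicating them. The sketch below only builds a T-SQL MERGE statement; the table and column names are hypothetical, and running the statement against SQL Server is left to whatever database client you already use.

```python
def build_merge_sql(target, staging, key_cols, value_cols):
    """Build a T-SQL MERGE that upserts staging rows into target by key.

    Re-running the statement with the same staging rows leaves the target
    unchanged, which is what makes the write idempotent.
    All table and column names passed in are hypothetical examples.
    """
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in value_cols)
    all_cols = list(key_cols) + list(value_cols)
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {staging} AS s\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals});"
    )


if __name__ == "__main__":
    # Hypothetical aggregate table keyed by device and window start.
    print(build_merge_sql(
        target="dbo.device_totals",
        staging="dbo.device_totals_stage",
        key_cols=["device_id", "window_start"],
        value_cols=["total"],
    ))
```

Keying the upsert on the aggregation key plus the window start means a late or replayed result for the same window updates the existing row, which also gives one possible answer to the late-arriving-data question.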
Re: ETL Using Spark
Ok

1. What information are you fetching from MSSQL? Is this reference data?
2. What information are you processing through Spark via topics?
3. Assuming you are combining data from MSSQL and Spark and enriching it, are you posting back to another table in the same database?

Specifically, you can fetch data from MSSQL through a JDBC connection. The enriched data can also be written back to MSSQL through JDBC.

HTH

LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Thu, 21 May 2020 at 16:15, Avadhut Narayan Joshi wrote:

> Hello Team
>
> I am working on ETL using Spark.
>
> - I am fetching streaming data from Confluent Kafka
> - Wanted to do aggregations by combining streaming data with data from SQL Server
>
> For achieving above use case
>
> 1. Can I fetch data from SQL Server into Spark based on where conditions?
> 2. Can such data fetched from SQL Server combined with Streaming data and again streamed back into SQL Server?
>
> Is above use case valid? Do we have any examples for above?
>
> Regards
> Avadhut
>
> Schlumberger-Private
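The JDBC read and write-back described in this reply could look roughly like the PySpark sketch below. It assumes Structured Streaming, the Kafka source package (`spark-sql-kafka`) and the Microsoft SQL Server JDBC driver on the classpath; the host, table, topic, and column names (`device_id`, `payload`) are hypothetical placeholders, not details from the thread. The WHERE condition is pushed to SQL Server by passing a parenthesised subquery as `dbtable`.

```python
def pushdown_query(table, where):
    """Wrap a filtered SELECT as a derived table so SQL Server applies the WHERE clause."""
    return f"(SELECT * FROM {table} WHERE {where}) AS src"


def jdbc_options(host, database, dbtable, user, password, port=1433):
    """Options for Spark's JDBC data source pointing at SQL Server."""
    return {
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "dbtable": dbtable,
        "user": user,
        "password": password,
    }


def start_enrichment(spark, kafka_servers, topic, read_opts, write_opts, checkpoint_dir):
    """Join a Kafka stream with SQL Server reference data and write the result back."""
    # Imported here so the helpers above can be used without pyspark installed.
    from pyspark.sql import functions as F

    # Static reference data, filtered on the SQL Server side.
    reference = spark.read.format("jdbc").options(**read_opts).load()

    # Streaming side: Kafka key and value arrive as binary, so cast them.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", kafka_servers)
        .option("subscribe", topic)
        .load()
        .select(
            F.col("key").cast("string").alias("device_id"),   # hypothetical join key
            F.col("value").cast("string").alias("payload"),
        )
    )

    # Stream-static join; aggregations would be added on top of this.
    enriched = events.join(reference, on="device_id", how="inner")

    def write_batch(batch_df, batch_id):
        # Each micro-batch is written with the ordinary batch JDBC writer.
        batch_df.write.format("jdbc").options(**write_opts).mode("append").save()

    return (
        enriched.writeStream.foreachBatch(write_batch)
        .option("checkpointLocation", checkpoint_dir)
        .start()
    )
```

A call might pass `jdbc_options(host, db, pushdown_query("dbo.devices", "region = 'EU'"), user, password)` as `read_opts` and a second options dict naming the output table as `write_opts`. Note that a plain append is not idempotent on its own, so a restarted batch can produce duplicate rows unless the target is keyed or upserted.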
ETL Using Spark
Hello Team

I am working on ETL using Spark.

* I am fetching streaming data from Confluent Kafka
* Wanted to do aggregations by combining streaming data with data from SQL Server

For achieving above use case

1. Can I fetch data from SQL Server into Spark based on where conditions?
2. Can such data fetched from SQL Server combined with Streaming data and again streamed back into SQL Server?

Is above use case valid? Do we have any examples for above?

Regards
Avadhut

Schlumberger-Private