Re: [Spark Kafka Structured Streaming] Adding partition and topic to the kafka dynamically

2020-08-27 Thread Amit Joshi
Any pointers will be appreciated.

On Thursday, August 27, 2020, Amit Joshi wrote:
> Hi All,
> I am trying to understand the effect of adding topics and partitions to a topic in Kafka, which is being consumed by Spark Structured Streaming applications.
> Do we have to restart the Spark

[Spark Kafka Structured Streaming] Adding partition and topic to the kafka dynamically

2020-08-27 Thread Amit Joshi
Hi All, I am trying to understand the effect of adding topics and partitions to a topic in Kafka, which is being consumed by Spark Structured Streaming applications. Do we have to restart the Spark Structured Streaming application to read from the newly added topic? Do we have to restart the
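For what it's worth, one way to avoid a restart when topics are added is to subscribe with the Kafka source's `subscribePattern` option instead of a fixed `subscribe` list, since the pattern is matched against the broker's topic list while the query runs (new partitions of already-subscribed topics are also picked up automatically). A minimal sketch; the broker address and topic names below are made up, and only the regex-matching part is actually executed:

```python
import re

# Options a Spark Kafka source reader might take (illustrative values only).
# "subscribePattern" is a regex over topic names, re-evaluated at runtime.
kafka_options = {
    "kafka.bootstrap.servers": "broker:9092",  # placeholder address
    "subscribePattern": "orders-.*",           # matches future topics too
}

def matching_topics(pattern: str, topics: list[str]) -> list[str]:
    """Return the topics a pattern-based subscription would cover."""
    rx = re.compile(pattern)
    return [t for t in topics if rx.fullmatch(t)]

# A topic created later ("orders-eu") is matched without any restart:
before = matching_topics(kafka_options["subscribePattern"], ["orders-us"])
after = matching_topics(kafka_options["subscribePattern"],
                        ["orders-us", "orders-eu"])
```

With a fixed `subscribe` list, by contrast, picking up a brand-new topic does require changing the option and restarting the query.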

Re: Connecting to Oracle Autonomous Data warehouse (ADW) from Spark via JDBC

2020-08-27 Thread kuassi . mensah
Mich, That's right, referring to you guys. Cheers, Kuassi

On 8/27/20 9:27 AM, Mich Talebzadeh wrote:
> Thanks Kuassi, I presume you mean the Spark dev team by "they are using ...". Cheers, Mich
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: Connecting to Oracle Autonomous Data warehouse (ADW) from Spark via JDBC

2020-08-27 Thread Mich Talebzadeh
Thanks Kuassi, I presume you mean the Spark dev team by "they are using ...". Cheers, Mich
LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*Disclaimer:* Use it at

Re: Connecting to Oracle Autonomous Data warehouse (ADW) from Spark via JDBC

2020-08-27 Thread kuassi . mensah
According to our dev team, from the error it is evident that they are using a JDBC jar which does not support setting tns_admin in the URL. They might have some old jar in the class-path which is being used instead of the 18.3 jar. You can ask them to use either the full URL or the TNS alias format URL with
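To illustrate the two URL shapes being suggested, here is a sketch that just builds the connection strings; the host, service, alias, and wallet path are placeholders, not a real ADW endpoint. The `TNS_ADMIN` URL property is, to my knowledge, supported by the Oracle thin driver from release 18.3 onward:

```python
def full_descriptor_url(host: str, port: int, service: str) -> str:
    """Full connect-descriptor URL; needs no tnsnames.ora lookup."""
    return (
        "jdbc:oracle:thin:@(DESCRIPTION="
        f"(ADDRESS=(PROTOCOL=tcps)(HOST={host})(PORT={port}))"
        f"(CONNECT_DATA=(SERVICE_NAME={service})))"
    )

def tns_alias_url(alias: str, wallet_dir: str) -> str:
    """TNS alias URL; TNS_ADMIN points at the directory holding the
    unzipped ADW wallet (tnsnames.ora, sqlnet.ora, keystores)."""
    return f"jdbc:oracle:thin:@{alias}?TNS_ADMIN={wallet_dir}"

# Hypothetical alias and wallet location:
url = tns_alias_url("mydb_high", "/opt/oracle/wallet")
```

An older driver jar shadowing the 18.3 one on the class-path would silently ignore or reject the `TNS_ADMIN` URL property, which matches the error described.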

Some sort of chaos monkey for spark jobs, do we have it?

2020-08-27 Thread Ivan Petrov
Hi, I'm feeling pain trying to insert 2-3 million records into Mongo using a plain Spark RDD. There were so many hidden problems. I would like to avoid this in the future and am looking for a way to kill individual Spark tasks at a specific stage and verify the expected behaviour of my Spark job.
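I'm not aware of a built-in chaos monkey for individual Spark tasks; a common workaround is to inject failures inside the job itself (for example, a `mapPartitions` function that throws on chosen partitions) and rely on Spark's task retry policy to verify the job still completes. A pure-Python sketch of that idea; the class and parameter names are made up, and the retry loop mimics `spark.task.maxFailures` (default 4) rather than using Spark itself:

```python
class FlakyTask:
    """Simulates a task that a chaos injector kills on its first few
    attempts, then lets succeed."""

    def __init__(self, failures_before_success: int):
        self.remaining_failures = failures_before_success

    def run(self, partition: list[int]) -> int:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise RuntimeError("task killed by chaos injector")
        return sum(partition)

def run_with_retries(task: FlakyTask, partition: list[int],
                     max_failures: int = 4) -> int:
    # Spark retries a failed task up to spark.task.maxFailures times
    # before failing the stage; this loop mimics that policy.
    for attempt in range(max_failures):
        try:
            return task.run(partition)
        except RuntimeError:
            if attempt == max_failures - 1:
                raise  # retries exhausted: the "stage" fails
    raise AssertionError("unreachable")

# Two injected kills still finish within the retry budget:
result = run_with_retries(FlakyTask(failures_before_success=2), [1, 2, 3])
```

Inside a real job the same pattern works: have the injected exception fire only for selected partition indices and attempt numbers (`TaskContext.get()` exposes both), then assert on the job's final output and side effects.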

Export subset of Oracle database

2020-08-27 Thread pduflot
Dear Spark users, I am trying to figure out whether Spark is a good tool for my use case. I'm trying to ETL a subset of a customers/orders database from Oracle to JSON, roughly 3-5% of the overall customers table. I tried to use the Spark JDBC datasource but it ends up fetching the entire
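One way to keep the 3-5% filter on the Oracle side is to pass a filtered subquery as the JDBC source's `dbtable` option (or use the `query` option in Spark 2.4+), so Oracle applies the predicate instead of Spark fetching the whole table. A sketch that only builds the option strings; the table, column, and connection details are made up:

```python
def pushdown_dbtable(table: str, predicate: str) -> str:
    """Wrap a filtered subquery so the database, not Spark, applies
    the predicate; pass the result as the JDBC "dbtable" option."""
    return f"(SELECT * FROM {table} WHERE {predicate}) subset"

# Hypothetical filter selecting the ~3-5% of rows of interest:
dbtable = pushdown_dbtable("customers",
                           "last_order_date >= DATE '2020-01-01'")

# The option map a Spark JDBC reader would take (sketch, not executed):
jdbc_options = {
    "url": "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1",  # placeholder
    "dbtable": dbtable,
    # For parallel reads, also set partitionColumn together with
    # lowerBound, upperBound, and numPartitions on a numeric column:
    "partitionColumn": "customer_id",
}
```

Simple filters applied via `df.filter(...)` after `spark.read.jdbc(...)` are often pushed down too, but embedding the predicate in the query makes the pushdown explicit and easy to verify in the Oracle session.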