Re: I cannot use spark 2.3.0 and kafka 0.9?

2018-05-08 Thread Shixiong(Ryan) Zhu
"note that the 0.8 integration is compatible with later 0.9 and 0.10 brokers, but the 0.10 integration is not compatible with earlier brokers." This is pretty clear. You can use 0.8 integration to talk to 0.9 broker. Best Regards, Shixiong Zhu Databricks Inc. shixi...@databricks.com

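For reference, picking the 0.8 integration mentioned in the reply above is a build-time choice. The following is a minimal build.sbt sketch, assuming an sbt project on Scala 2.11 with Spark 2.3.0; the Scala patch version is an assumption, not something stated in the thread.

```scala
// build.sbt (sketch): use the Kafka 0.8 integration, which the quoted docs
// describe as compatible with later 0.9 and 0.10 brokers.
scalaVersion := "2.11.12" // assumption: any 2.11.x release supported by Spark 2.3.0

libraryDependencies ++= Seq(
  // Supplied by the cluster at runtime, so it is not bundled into the jar.
  "org.apache.spark" %% "spark-streaming" % "2.3.0" % "provided",
  // The 0.8 integration artifact (not spark-streaming-kafka-0-10).
  "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.3.0"
)
```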
Re: Error submitting Spark Job in yarn-cluster mode on EMR

2018-05-08 Thread Marco Mistroni
Did you by any chance leave a sparkSession.setMaster("local") lurking in your code? Last time I checked, to run on YARN you have to package a 'fat jar'. Could you make sure the Spark dependencies in your jar match the version you are running on YARN? Alternatively, please share code including
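One common way to keep the jar and the cluster from disagreeing on the Spark version is to compile against Spark but leave it out of the fat jar. This is a minimal build.sbt sketch of that idea; the project name and the version number are placeholders, not values taken from the thread.

```scala
// build.sbt (sketch): compile against Spark without bundling it.
name := "example-job" // hypothetical project name

// Placeholder: replace with the Spark version actually installed on the cluster.
val sparkVersion = "2.3.0"

libraryDependencies ++= Seq(
  // "provided" keeps these out of the assembly jar, so the cluster's own
  // Spark classes are the ones used at runtime.
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided"
)

// The master is then left out of the code and chosen at submit time, e.g.
//   spark-submit --master yarn --deploy-mode cluster ...
```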

Re: Guava dependency issue

2018-05-08 Thread Koert Kuipers
we shade guava in our fat jar/assembly jar/application jar On Tue, May 8, 2018 at 12:31 PM, Marcelo Vanzin wrote: > Using a custom Guava version with Spark is not that simple. Spark > shades Guava, but a lot of libraries Spark uses do not - the main one > being all of the

Re: Guava dependency issue

2018-05-08 Thread Marcelo Vanzin
Using a custom Guava version with Spark is not that simple. Spark shades Guava, but a lot of libraries Spark uses do not - the main one being all of the Hadoop ones, and they need a quite old Guava. So you have two options: shade/relocate Guava in your application, or use
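The first option named above (shading/relocating Guava in the application) could look roughly like this with the sbt-assembly plugin. This is a sketch under assumptions: the plugin version and the target package name `shaded.guava` are illustrative choices, not something specified in the thread.

```scala
// project/plugins.sbt would need the plugin, for example:
//   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6") // version is an assumption

// build.sbt (sketch): relocate Guava classes inside the application jar so
// they cannot clash with the older Guava that the Hadoop libraries expect.
assembly / assemblyShadeRules := Seq(
  ShadeRule
    .rename("com.google.common.**" -> "shaded.guava.@1") // hypothetical target package
    .inAll // apply to the application's classes and all bundled dependencies
)
```

With a rule like this, the application's own bytecode references are rewritten to the relocated package, so it can use a newer Guava while Spark and Hadoop continue to see their own.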

Re: Advice on multiple streaming job

2018-05-08 Thread Peter Liu
Hi Dhaval, I'm using the YARN scheduler (without the need to specify the port in the submit). Not sure why there is a port issue here. Gerard seems to have a good point about having the multiple topics managed within your application (to avoid the port issue) - Not sure if you're using Spark Streaming or

Re: Guava dependency issue

2018-05-08 Thread Stephen Boesch
I downgraded to Spark 2.0.1 and it fixed that *particular* runtime exception, but then a similar one appears when saving to Parquet. A Stack Overflow question on this was created a month ago, and today further details plus an open bounty were added to it:

Re: Help Required - Unable to run spark-submit on YARN client mode

2018-05-08 Thread Deepak Sharma
Can you try increasing the number of partitions for the base RDD/DataFrame that you are working on? On Tue, May 8, 2018 at 5:05 PM, Debabrata Ghosh wrote: > Hi Everyone, > I have been trying to run spark-shell in YARN client mode, but am getting > lot of ClosedChannelException

Help Required - Unable to run spark-submit on YARN client mode

2018-05-08 Thread Debabrata Ghosh
Hi Everyone, I have been trying to run spark-shell in YARN client mode, but am getting a lot of ClosedChannelException errors; however, the program works fine in local mode. I am using Spark 2.2.0 built for Hadoop 2.7.3. If you are familiar with this error, could you please help with the possible

Error submitting Spark Job in yarn-cluster mode on EMR

2018-05-08 Thread SparkUser6
I have a simple program that works fine in local mode, but I am having issues when I try to run it in yarn-cluster mode. I know a "no such method" error usually happens when the compile-time and runtime versions mismatch, but I made sure I used the same version. 205 [main] INFO