Re: Kafka with Spark Streaming work on local but it doesn't work in Standalone mode

2020-07-24 Thread Gabor Somogyi
Hi Davide,

Please see the doc: *Note: Kafka 0.8 support is deprecated as of Spark 2.3.0.*

Have you tried the same with Structured Streaming rather than with DStreams? If you somehow insist on DStreams, you can use the spark-streaming-kafka-0-10 connector instead.

BR, G

On Fri, Jul 24, 2020 at 12:08 PM
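The suggestion to switch to Structured Streaming can be sketched as below. This is a hedged example, not code from the thread: it assumes pyspark is installed, the spark-sql-kafka-0-10 package is on the classpath, and a Kafka broker and topic exist at the hypothetical addresses shown.

```python
# Sketch: consuming Kafka with Structured Streaming instead of the
# deprecated spark-streaming-kafka-0-8 DStream connector.
# Broker address and topic name below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("PythonStructuredStreamingKafka")
         .getOrCreate())

stream = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")  # hypothetical broker
          .option("subscribe", "my-topic")                      # hypothetical topic
          .load())

# Kafka records arrive as binary key/value columns; cast them to strings.
messages = stream.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

query = (messages.writeStream
         .format("console")
         .outputMode("append")
         .start())
query.awaitTermination()
```

Note that the spark-streaming-kafka-0-10 DStream connector has no Python API, so for a PySpark job like the one in this thread, Structured Streaming is the practical path.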

Re: spark exception

2020-07-24 Thread Russell Spitzer
Usually this is just a sign that one of the executors quit unexpectedly, which explains the dead executors you see in the UI. The next step is usually to go and look at those executor logs and see if there's any reason for the termination. If you end up seeing an abrupt truncation of the log that

spark exception

2020-07-24 Thread Amit Sharma
Hi All, sometimes I get this error in the Spark logs. I notice a few executors are shown as dead in the executor tab while this error occurs, although my job still succeeds. Please help me find the root cause of this issue. I have 3 workers with 30 cores and 64 GB RAM each. My job uses 3 cores per executor
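The resource layout described above (3 workers, 30 cores and 64 GB each, 3 cores per executor) might be expressed in a spark-submit invocation roughly as follows. This is a sketch: the master URL, memory-per-executor value, and script name are assumptions, not taken from the thread.

```shell
# Hypothetical spark-submit settings matching the described cluster:
# 3 workers x 30 cores = 90 total cores, 3 cores per executor.
# --executor-memory is an assumed value; the thread does not state it.
spark-submit \
  --master spark://master-host:7077 \
  --executor-cores 3 \
  --total-executor-cores 90 \
  --executor-memory 6g \
  my_job.py
```

With 3 cores per executor, these settings would allow up to 30 executors across the cluster; an executor dying mid-job shows up as "dead" in the executors tab exactly as described.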

Kafka with Spark Streaming work on local but it doesn't work in Standalone mode

2020-07-24 Thread Davide Curcio
Hi, I'm trying to use Spark Streaming with a very simple script like this:

from pyspark import SparkContext, SparkConf
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="PythonSparkStreamingKafka")
ssc =

How to introduce reset logic when aggregating/joining streaming dataframe with static dataframe for spark streaming

2020-07-24 Thread Yong Yuan
A good feature of Spark Structured Streaming is that it can join a static dataframe with a streaming dataframe. To cite an example: users is a static dataframe read from a database, and transactionStream comes from a stream. By joining them, we can get the spending of each country
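The stream-static join described above can be sketched as follows. This is a hedged illustration, not the poster's code: the file paths, source format, and column names (user_id, amount, country) are assumptions chosen to match the description.

```python
# Sketch of a stream-static join in Structured Streaming: a static
# `users` table joined to a transaction stream, aggregated per country.
# Paths, schema, and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("StreamStaticJoin").getOrCreate()

# Static side: loaded once, e.g. from a database or a snapshot file.
users = spark.read.parquet("/data/users")  # assumed columns: user_id, country

# Streaming side: transactions arriving continuously (directory source
# used here as a stand-in for whatever stream the poster reads from).
schema = (StructType()
          .add("user_id", StringType())
          .add("amount", DoubleType()))
transactionStream = (spark.readStream
                     .schema(schema)
                     .json("/data/transactions"))  # hypothetical path

# Stream-static join: each micro-batch of transactions is joined against
# the static users table, then spending is summed per country.
spending = (transactionStream
            .join(users, "user_id")
            .groupBy("country")
            .agg(F.sum("amount").alias("total_spent")))

query = (spending.writeStream
         .outputMode("complete")   # emit the full aggregate table each trigger
         .format("console")
         .start())
query.awaitTermination()
```

As for the "reset" logic the subject line asks about: a common approach is to aggregate over time windows instead of globally, e.g. `groupBy(F.window("event_time", "1 hour"), "country")` with a watermark, so each window's totals start fresh rather than accumulating forever.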