Spark Streaming fails with unable to get records after polling for 512 ms

2017-11-14 Thread jkagitala
Hi, I'm trying to add Spark Streaming to our Kafka topic, but I keep getting this error: java.lang.AssertionError: assertion failed: Failed to get record after polling for 512 ms. I tried setting different params like max.poll.interval.ms and spark.streaming.kafka.consumer.poll.ms to 1ms in…
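A minimal sketch (not from the thread; the broker, group id, topic, and the 10-second timeout are placeholder values) of raising spark.streaming.kafka.consumer.poll.ms, whose 512 ms default is the figure in the assertion, for the Kafka 0-10 direct stream:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class PollTimeoutExample {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf()
                    .setAppName("poll-timeout-example")
                    // Give the cached consumer more time before the
                    // "Failed to get record after polling" assertion fires.
                    .set("spark.streaming.kafka.consumer.poll.ms", "10000");
            JavaStreamingContext jssc =
                    new JavaStreamingContext(conf, Durations.seconds(5));

            Map<String, Object> kafkaParams = new HashMap<>();
            kafkaParams.put("bootstrap.servers", "broker1:9092"); // placeholder
            kafkaParams.put("key.deserializer", StringDeserializer.class);
            kafkaParams.put("value.deserializer", StringDeserializer.class);
            kafkaParams.put("group.id", "example-group");         // placeholder

            JavaInputDStream<ConsumerRecord<String, String>> stream =
                    KafkaUtils.createDirectStream(
                            jssc,
                            LocationStrategies.PreferConsistent(),
                            ConsumerStrategies.<String, String>Subscribe(
                                    Collections.singletonList("my-topic"),
                                    kafkaParams));

            stream.count().print(); // any output operation, so the job runs
            jssc.start();
            jssc.awaitTermination();
        }
    }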

Executor not getting added SparkUI & Spark Eventlog in deploymode:cluster

2017-11-14 Thread Mamillapalli, Purna Pradeep
Hi all, I'm performing a spark-submit using the Spark REST API POST operation on port 6066 with the config below. Launch Command: "/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.141-1.b16.el7_3.x86_64/jre/bin/java" "-cp" "/usr/local/spark/conf/:/usr/local/spark/jars/*" "-Xmx4096M"…
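For context, a hedged sketch of the kind of REST submission being described; the /v1/submissions/create endpoint and CreateSubmissionRequest payload belong to the standalone master's undocumented REST protocol, and the host, jar path, main class, and Spark version below are placeholders:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class RestSubmit {
        public static void main(String[] args) throws Exception {
            // CreateSubmissionRequest body for cluster-mode submission.
            String body =
                "{\n" +
                "  \"action\": \"CreateSubmissionRequest\",\n" +
                "  \"clientSparkVersion\": \"2.2.0\",\n" +
                "  \"appResource\": \"hdfs:///apps/my-app.jar\",\n" +
                "  \"mainClass\": \"com.example.Main\",\n" +
                "  \"appArgs\": [],\n" +
                "  \"environmentVariables\": {\"SPARK_ENV_LOADED\": \"1\"},\n" +
                "  \"sparkProperties\": {\n" +
                "    \"spark.app.name\": \"my-app\",\n" +
                "    \"spark.submit.deployMode\": \"cluster\",\n" +
                "    \"spark.master\": \"spark://spark-master:7077\"\n" +
                "  }\n" +
                "}";
            URL url = new URL("http://spark-master:6066/v1/submissions/create");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json");
            conn.setDoOutput(true);
            try (OutputStream os = conn.getOutputStream()) {
                os.write(body.getBytes(StandardCharsets.UTF_8));
            }
            System.out.println("HTTP " + conn.getResponseCode());
        }
    }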

Re: Process large JSON file without causing OOM

2017-11-14 Thread Alec Swan
Thanks all. I am not submitting a Spark job explicitly. Instead, I am using the Spark library functionality embedded in my web service, as shown in the code I included in the previous email. So, effectively, Spark SQL runs in the web service's JVM. Therefore, the --driver-memory option would not (and…
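A minimal sketch of the embedded setup being described (the service class and path are invented for illustration): because the driver is the web service's JVM, heap comes from the -Xmx the service was launched with, and setting spark.driver.memory after startup is too late.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class EmbeddedSparkService {
        // The SparkSession runs inside this service's JVM, so driver heap is
        // whatever -Xmx the service process was started with; --driver-memory
        // and spark.driver.memory have no effect on an already-running JVM.
        private final SparkSession spark = SparkSession.builder()
                .appName("embedded-json-service")
                .master("local[*]")
                .getOrCreate();

        public long countRecords(String jsonPath) {
            Dataset<Row> df = spark.read().json(jsonPath);
            return df.count();
        }
    }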

Re: Use of Accumulators

2017-11-14 Thread Kedarnath Dixit
Yes! Thanks! ~Kedar Dixit, Bigdata Analytics at Persistent Systems Ltd.

Re: Use of Accumulators

2017-11-14 Thread Holden Karau
And where do you want to read the toggle back from? On the driver? On Tue, Nov 14, 2017 at 3:52 AM Kedarnath Dixit <kedarnath_di...@persistent.com> wrote: > Hi, inside the transformation, if there is any change to the variable's associated data, I want to just toggle it, saying there…
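A hedged sketch of the pattern under discussion, using a built-in LongAccumulator as the toggle (names and sample data are invented); updates made inside transformations may be re-applied on task retries, so read the value as "non-zero means something changed" rather than an exact count:

    import java.util.Arrays;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.util.LongAccumulator;

    public class AccumulatorToggle {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("accumulator-toggle")
                    .master("local[*]")
                    .getOrCreate();
            JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

            LongAccumulator changed = jsc.sc().longAccumulator("dataChanged");

            long n = jsc.parallelize(Arrays.asList(1, 2, 3, 4))
                    .map(x -> {
                        if (x % 2 == 0) {   // stand-in for "data changed"
                            changed.add(1); // toggle the flag
                        }
                        return x;
                    })
                    .count();               // an action forces the updates

            // Read the toggle back on the driver, as Holden asks about.
            System.out.println("changed? " + (changed.value() > 0));
            spark.stop();
        }
    }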

Spark Structured Streaming + Kafka

2017-11-14 Thread Agostino Calamita
Hi, I have a problem with Structured Streaming and Kafka. I have 2 brokers and a topic with 8 partitions and replication factor 2. This is my driver program: public static void main(String[] args) { SparkSession spark = SparkSession.builder()…
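A minimal completion of the truncated driver (broker addresses and topic name are placeholders, not the poster's), reading the topic with Structured Streaming and echoing rows to the console:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.streaming.StreamingQuery;

    public class StructuredKafka {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder()
                    .appName("structured-kafka")
                    .getOrCreate();

            // Source: the 8-partition topic, replicated across both brokers.
            Dataset<Row> df = spark.readStream()
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
                    .option("subscribe", "my-topic")
                    .load();

            // Sink: print decoded key/value pairs as each micro-batch arrives.
            StreamingQuery query = df
                    .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
                    .writeStream()
                    .format("console")
                    .start();

            query.awaitTermination();
        }
    }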

Re: Measuring cluster utilization of a streaming job

2017-11-14 Thread Teemu Heikkilä
Without knowing anything about your pipeline, the best estimate of the resources needed is to run the job with the same ingestion rate as the normal production load. With Kafka you can enable backpressure, so under high load your latency will just increase, but you don't have to have capacity for…
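A hedged sketch of the backpressure settings referred to, for the DStream-based Kafka integration (the rate values are illustrative, not recommendations):

    import org.apache.spark.SparkConf;

    public class BackpressureConf {
        public static SparkConf build() {
            return new SparkConf()
                    .setAppName("backpressure-example")
                    // Let Spark adapt the ingestion rate to batch processing time.
                    .set("spark.streaming.backpressure.enabled", "true")
                    // Cap the very first batch, before the rate estimator kicks in.
                    .set("spark.streaming.backpressure.initialRate", "1000")
                    // Hard ceiling per Kafka partition per second (direct stream).
                    .set("spark.streaming.kafka.maxRatePerPartition", "5000");
        }
    }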

Measuring cluster utilization of a streaming job

2017-11-14 Thread Nadeem Lalani
Hi, I was wondering if anyone has done some work around measuring the cluster resource utilization of a "typical" Spark streaming job. We are trying to build a message ingestion system which will read from Kafka and do some processing. We have had some concerns raised in the team that a 24*7…

RE: Use of Accumulators

2017-11-14 Thread Kedarnath Dixit
Hi, Inside the transformation, if there is any change to the variable's associated data, I want to just toggle it to say there is some change while processing the data. Please let me know if we can do this at runtime. Thanks! ~Kedar Dixit, Bigdata Analytics at Persistent Systems Ltd.