Have you tried just printing each message, to see which ones are being processed?
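
Something along these lines might do it. This is only a rough sketch against the Spark 1.5.x Kafka direct stream API: the object name and taking the topic list from args(0) are my assumptions, and the broker address is copied from your earlier mail. Printing the offset ranges shows which partitions and offsets each batch actually covers, and collect() is only sensible here because the test is about 100 messages.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

// Hypothetical object name; not from the original job.
object PrintDirectStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("PrintDirectStream")
    // 2 second batches, matching the 2000ms batch interval mentioned in the thread.
    val ssc = new StreamingContext(conf, Seconds(2))

    // Assumption: comma-separated topic names passed as the first argument.
    val topicSet = args(0).split(",").toSet
    val kafkaParams = Map[String, String]("bootstrap.servers" -> "datanode4.isdp.com:9092")

    val k = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topicSet)

    k.foreachRDD { rdd =>
      // The cast only works on the RDD that comes straight from the direct stream,
      // before any map/filter/repartition is applied.
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      ranges.foreach { r =>
        println(s"topic=${r.topic} partition=${r.partition} " +
          s"from=${r.fromOffset} until=${r.untilOffset}")
      }

      // Bring the batch back to the driver and print every message.
      // Fine for a ~100 message test, not for production volumes.
      rdd.collect().foreach { case (key, value) =>
        println(s"key=$key value=$value")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the per-partition offset ranges add up to 100 across batches, the messages are all arriving and the question is only how they are split between batches.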

On Fri, Feb 5, 2016 at 1:41 PM, Diwakar Dhanuskodi <
diwakar.dhanusk...@gmail.com> wrote:

> I am able to see no of messages processed per event in
> sparkstreaming web UI . Also I am counting the messages inside
> foreachRDD .
> Removed the settings for backpressure but still the same .
>
> Sent from Samsung Mobile.
>
> -------- Original message --------
> From: Cody Koeninger <c...@koeninger.org>
> Date:06/02/2016 00:33 (GMT+05:30)
> To: Diwakar Dhanuskodi <diwakar.dhanusk...@gmail.com>
> Cc: user@spark.apache.org
> Subject: Re: Kafka directsream receiving rate
>
> How are you counting the number of messages?
>
> I'd go ahead and remove the settings for backpressure and
> maxrateperpartition, just to eliminate that as a variable.
>
> On Fri, Feb 5, 2016 at 12:22 PM, Diwakar Dhanuskodi <
> diwakar.dhanusk...@gmail.com> wrote:
>
>> I am using one directsream. Below is the call to directsream:-
>>
>> val topicSet = topics.split(",").toSet
>> val kafkaParams = Map[String,String]("bootstrap.servers" -> "
>> datanode4.isdp.com:9092")
>> val k =
>> KafkaUtils.createDirectStream[String,String,StringDecoder,StringDecoder](ssc,
>> kafkaParams, topicSet)
>>
>> When I replace DirectStream call to createStream, all messages
>> were read by one Dstream block.:-
>> val k = KafkaUtils.createStream(ssc,
>> "datanode4.isdp.com:2181","resp",topicMap
>> ,StorageLevel.MEMORY_ONLY)
>>
>> I am using below spark-submit to execute:
>> ./spark-submit --master yarn-client --conf
>> "spark.dynamicAllocation.enabled=true" --conf
>> "spark.shuffle.service.enabled=true" --conf
>> "spark.sql.tungsten.enabled=false" --conf "spark.sql.codegen=false" --conf
>> "spark.sql.unsafe.enabled=false" --conf
>> "spark.streaming.backpressure.enabled=true" --conf "spark.locality.wait=1s"
>> --conf "spark.shuffle.consolidateFiles=true" --conf
>> "spark.streaming.kafka.maxRatePerPartition=1000000" --driver-memory 2g
>> --executor-memory 1g --class com.tcs.dime.spark.SparkReceiver --files
>> /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml,/etc/hadoop/conf/mapred-site.xml,/etc/hadoop/conf/yarn-site.xml,/etc/hive/conf/hive-site.xml
>> --jars
>> /root/dime/jars/spark-streaming-kafka-assembly_2.10-1.5.1.jar,/root/Jars/sparkreceiver.jar
>> /root/Jars/sparkreceiver.jar
>>
>> Sent from Samsung Mobile.
>>
>> -------- Original message --------
>> From: Cody Koeninger <c...@koeninger.org>
>> Date:05/02/2016 22:07 (GMT+05:30)
>> To: Diwakar Dhanuskodi <diwakar.dhanusk...@gmail.com>
>> Cc: user@spark.apache.org
>> Subject: Re: Kafka directsream receiving rate
>>
>> If you're using the direct stream, you have 0 receivers. Do you mean you
>> have 1 executor?
>>
>> Can you post the relevant call to createDirectStream from your code, as
>> well as any relevant spark configuration?
>>
>> On Thu, Feb 4, 2016 at 8:13 PM, Diwakar Dhanuskodi <
>> diwakar.dhanusk...@gmail.com> wrote:
>>
>>> Adding more info
>>>
>>> Batch interval is 2000ms.
>>> I expect all 100 messages go thru one dstream from directsream but it
>>> receives at rate of 10 messages at time. Am I missing some
>>> configurations here. Any help appreciated.
>>>
>>> Regards
>>> Diwakar.
>>>
>>> Sent from Samsung Mobile.
>>>
>>> -------- Original message --------
>>> From: Diwakar Dhanuskodi <diwakar.dhanusk...@gmail.com>
>>> Date:05/02/2016 07:33 (GMT+05:30)
>>> To: user@spark.apache.org
>>> Cc:
>>> Subject: Kafka directsream receiving rate
>>>
>>> Hi,
>>> Using spark 1.5.1.
>>> I have a topic with 20 partitions. When I publish 100 messages. Spark
>>> direct stream is receiving 10 messages per dstream. I have only one
>>> receiver . When I used createStream the receiver received entire 100
>>> messages at once.
>>>
>>> Appreciate any help .
>>>
>>> Regards
>>> Diwakar
>>>
>>> Sent from Samsung Mobile.