Have you tried just printing each message, to see which ones are being
processed?
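
For example (a quick sketch, assuming the direct stream is bound to k as in
your snippet):

k.foreachRDD { rdd =>
  // print every (key, value) pair so you can see exactly which
  // messages turn up in each batch
  rdd.collect().foreach(println)
}

collect() is fine for a 100-message test, but don't leave it in place for
real volumes.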

On Fri, Feb 5, 2016 at 1:41 PM, Diwakar Dhanuskodi <
diwakar.dhanusk...@gmail.com> wrote:

> I am able to see the number of messages processed per batch in the
> Spark Streaming web UI. I am also counting the messages inside
> foreachRDD.
> I removed the settings for backpressure, but the behaviour is still the same.
>
>
> -------- Original message --------
> From: Cody Koeninger <c...@koeninger.org>
> Date:06/02/2016 00:33 (GMT+05:30)
> To: Diwakar Dhanuskodi <diwakar.dhanusk...@gmail.com>
> Cc: user@spark.apache.org
> Subject: Re: Kafka directsream receiving rate
>
> How are you counting the number of messages?
>
> I'd go ahead and remove the settings for backpressure and
> maxRatePerPartition, just to eliminate those as variables.
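>
> A per-batch count is the least ambiguous check (again a sketch, assuming
> the stream is bound to k):
>
> k.foreachRDD { rdd =>
>   // rdd.count() is the exact number of Kafka messages in this batch
>   println(s"messages in this batch: ${rdd.count()}")
> }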
>
> On Fri, Feb 5, 2016 at 12:22 PM, Diwakar Dhanuskodi <
> diwakar.dhanusk...@gmail.com> wrote:
>
>> I am using one direct stream. Below is the call to createDirectStream:
>>
>> val topicSet = topics.split(",").toSet
>> val kafkaParams = Map[String, String](
>>   "bootstrap.servers" -> "datanode4.isdp.com:9092")
>> val k = KafkaUtils.createDirectStream[String, String, StringDecoder,
>>   StringDecoder](ssc, kafkaParams, topicSet)
>>
>> When I replace the createDirectStream call with createStream, all
>> messages are read by one DStream block:
>>
>> val k = KafkaUtils.createStream(ssc, "datanode4.isdp.com:2181",
>>   "resp", topicMap, StorageLevel.MEMORY_ONLY)
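>>
>> (topicMap isn't shown above; it's the usual topic -> consumer-thread-count
>> map the receiver API expects, built along the lines of
>> val topicMap = topics.split(",").map((_, 1)).toMap .)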
>>
>> I am using the spark-submit invocation below to execute it:
>> ./spark-submit --master yarn-client \
>>   --conf "spark.dynamicAllocation.enabled=true" \
>>   --conf "spark.shuffle.service.enabled=true" \
>>   --conf "spark.sql.tungsten.enabled=false" \
>>   --conf "spark.sql.codegen=false" \
>>   --conf "spark.sql.unsafe.enabled=false" \
>>   --conf "spark.streaming.backpressure.enabled=true" \
>>   --conf "spark.locality.wait=1s" \
>>   --conf "spark.shuffle.consolidateFiles=true" \
>>   --conf "spark.streaming.kafka.maxRatePerPartition=1000000" \
>>   --driver-memory 2g --executor-memory 1g \
>>   --class com.tcs.dime.spark.SparkReceiver \
>>   --files /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml,/etc/hadoop/conf/mapred-site.xml,/etc/hadoop/conf/yarn-site.xml,/etc/hive/conf/hive-site.xml \
>>   --jars /root/dime/jars/spark-streaming-kafka-assembly_2.10-1.5.1.jar,/root/Jars/sparkreceiver.jar \
>>   /root/Jars/sparkreceiver.jar
>>
>> -------- Original message --------
>> From: Cody Koeninger <c...@koeninger.org>
>> Date:05/02/2016 22:07 (GMT+05:30)
>> To: Diwakar Dhanuskodi <diwakar.dhanusk...@gmail.com>
>> Cc: user@spark.apache.org
>> Subject: Re: Kafka directsream receiving rate
>>
>> If you're using the direct stream, you have 0 receivers.  Do you mean you
>> have 1 executor?
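>>
>> With the direct stream, each batch reads every Kafka partition directly,
>> one Spark partition per Kafka partition, so there is no receiver to
>> throttle. If it helps, you can print exactly what each batch read from
>> each partition (a rough sketch against the 1.5.x API; stream stands in
>> for whatever createDirectStream returned):
>>
>> import org.apache.spark.streaming.kafka.HasOffsetRanges
>>
>> stream.foreachRDD { rdd =>
>>   // one OffsetRange per Kafka partition consumed in this batch
>>   for (o <- rdd.asInstanceOf[HasOffsetRanges].offsetRanges) {
>>     println(s"${o.topic} partition ${o.partition}: ${o.fromOffset} -> ${o.untilOffset}")
>>   }
>> }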
>>
>> Can you post the relevant call to createDirectStream from your code, as
>> well as any relevant spark configuration?
>>
>> On Thu, Feb 4, 2016 at 8:13 PM, Diwakar Dhanuskodi <
>> diwakar.dhanusk...@gmail.com> wrote:
>>
>>> Adding more info:
>>>
>>> The batch interval is 2000 ms.
>>> I expect all 100 messages to go through one DStream from the direct
>>> stream, but it receives them at a rate of 10 messages at a time. Am I
>>> missing some configuration here? Any help appreciated.
>>>
>>> Regards
>>> Diwakar.
>>>
>>> -------- Original message --------
>>> From: Diwakar Dhanuskodi <diwakar.dhanusk...@gmail.com>
>>> Date:05/02/2016 07:33 (GMT+05:30)
>>> To: user@spark.apache.org
>>> Cc:
>>> Subject: Kafka directsream receiving rate
>>>
>>> Hi,
>>> I am using Spark 1.5.1.
>>> I have a topic with 20 partitions. When I publish 100 messages, the
>>> Spark direct stream receives 10 messages per DStream. I have only one
>>> receiver. When I used createStream, the receiver received the entire
>>> 100 messages at once.
>>>
>>> Appreciate any help.
>>>
>>> Regards
>>> Diwakar
>>>
>>
>>
>
