How are you creating your kafka streams in Spark? If you have 10 partitions for a topic, you can call "createStream" ten times to create 10 parallel receivers/executors and then use "union" to combine all the dStreams.
On Wed, Sep 10, 2014 at 7:16 AM, richiesgr <richie...@gmail.com> wrote: > Hi (my previous post as been used by someone else) > > I'm building a application the read from kafka stream event. In production > we've 5 consumers that share 10 partitions. > But on spark streaming kafka only 1 worker act as a consumer then > distribute > the tasks to workers so I can have only 1 machine acting as consumer but I > need more because only 1 consumer means Lags. > > Do you've any idea what I can do ? Another point is interresting the master > is not loaded at all I can get up more than 10 % CPU > > I've tried to increase the queued.max.message.chunks on the kafka client to > read more records thinking it'll speed up the read but I only get > > ERROR consumer.ConsumerFetcherThread: > > [ConsumerFetcherThread-SparkEC2_ip-10-138-59-194.ec2.internal-1410182950783-5c49c8e8-0-174167372], > Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 73; ClientId: > > SparkEC2-ConsumerFetcherThread-SparkEC2_ip-10-138-59-194.ec2.internal-1410182950783-5c49c8e8-0-174167372; > ReplicaId: -1; MaxWait: 100 ms; MinBytes: 1 bytes; RequestInfo: [IA2,7] -> > PartitionFetchInfo(929838589,1048576),[IA2,6] -> > PartitionFetchInfo(929515796,1048576),[IA2,9] -> > PartitionFetchInfo(929577946,1048576),[IA2,8] -> > PartitionFetchInfo(930751599,1048576),[IA2,2] -> > PartitionFetchInfo(926457704,1048576),[IA2,5] -> > PartitionFetchInfo(930774385,1048576),[IA2,0] -> > PartitionFetchInfo(929913213,1048576),[IA2,3] -> > PartitionFetchInfo(929268891,1048576),[IA2,4] -> > PartitionFetchInfo(929949877,1048576),[IA2,1] -> > PartitionFetchInfo(930063114,1048576) > java.lang.OutOfMemoryError: Java heap space > > Is someone have ideas ? > Thanks > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/How-to-scale-more-consumer-to-Kafka-stream-tp13883.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > --------------------------------------------------------------------- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > >