I struggled with kinesis for a long time and got all my findings documented at -
http://stackoverflow.com/questions/35567440/spark-not-able-to-fetch-events-from-amazon-kinesis Let me know if it helps. Cheers, Yash - Thanks, via mobile, excuse brevity. On Jul 16, 2016 6:05 AM, "dharmendra" <d17...@gmail.com> wrote: > I have created small spark streaming program to fetch data from Kinesis and > put some data in database. > When i ran it in spark standalone cluster using master as local[*] it is > working fine but when i tried to run in yarn cluster with master as "yarn" > application doesn't receive any data. > > I submit job using following command > spark-submit --class <<className>> --master yarn --deploy-mode cluster > --queue default --executor-cores 2 --executor-memory 2G --num-executors 4 > <<jar> > > My java code is like > > JavaDStream<Aggregation> enrichStream = > javaDStream.flatMap(sparkRecordProcessor); > > enrichStream.mapToPair(new PairFunction<Aggregation, Aggregation, > Integer>() > { > @Override > public Tuple2<Aggregation, Integer> call(Aggregation s) throws > Exception { > LOGGER.info("creating tuple " + s); > return new Tuple2<>(s, 1); > } > }).reduceByKey(new Function2<Integer, Integer, Integer>() { > @Override > public Integer call(Integer i1, Integer i2) throws Exception { > LOGGER.info("reduce by key {}, {}", i1, i2); > return i1 + i2; > } > }).foreach(sparkDatabaseProcessor); > > > I have put some logs in sparkRecordProcessor and sparkDatabaseProcessor. > I can see that sparkDatabaseProcessor executed every batch interval(10 sec) > and but find no log in sparkRecordProcessor. > There is no event(avg/sec) in Spark Streaming UI. > In Executor tab i can see 3 executors. Data against these executors are > also > continuously updated. > I also check Dynamodb table in Amazon and leaseCounter is updated regularly > from my application. > But spark streaming gets no data from Kinesis in yarn. > I see "shuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 > blocks" many times in log. > I don't know what else i need to do to run spark streaming on yarn. > > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Streaming-from-Kinesis-is-not-getting-data-in-Yarn-cluster-tp27345.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > --------------------------------------------------------------------- > To unsubscribe e-mail: user-unsubscr...@spark.apache.org > >