Yeah, so that means the driver talked to Kafka, and Kafka told it the
highest available offset was 2723431.  Then, when the executor tried to
consume messages, it stopped getting them before reaching that offset.
That almost certainly means something's wrong with Kafka; have you
looked at your Kafka logs?  I doubt it has anything to do with
Elasticsearch.
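
If it helps to see exactly which range each task was handed, here's a
minimal sketch (Spark 1.x direct stream; the app name, broker address
and batch interval are placeholders) that logs the per-partition offset
ranges the driver computed -- the same fromOffset/untilOffset numbers
that show up in the assertion:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object OffsetRangeDebug {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("offset-range-debug"), Seconds(10))

    // Placeholder broker list -- point this at your own cluster.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("kafka-global-paas"))

    stream.foreachRDD { rdd =>
      // The driver fixes these ranges when it schedules the batch; each
      // executor task then asserts it can read up to untilOffset.
      rdd.asInstanceOf[HasOffsetRanges].offsetRanges.foreach { r =>
        println(s"${r.topic} partition ${r.partition}: " +
          s"${r.fromOffset} -> ${r.untilOffset}")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}

If untilOffset is ever ahead of what the broker itself reports as the
log-end offset (kafka.tools.GetOffsetShell with --time -1 will show you
that), the log shrank between scheduling and execution; retention
kicking in mid-batch or an unclean leader election would both explain
it.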

On Fri, May 6, 2016 at 4:22 AM, Guillermo Ortiz <konstt2...@gmail.com> wrote:
> This is the complete error.
>
> 2016-05-06 11:18:05,424 [task-result-getter-0] INFO  org.apache.spark.scheduler.TaskSetManager - Finished task 5.0 in stage 13.0 (TID 60) in 11692 ms on xxxxxxxxxx (6/8)
> 2016-05-06 11:18:08,978 [task-result-getter-1] WARN  org.apache.spark.scheduler.TaskSetManager - Lost task 7.0 in stage 13.0 (TID 62, xxxxxxxxxxxxxxx): java.lang.AssertionError: assertion failed: Ran out of messages before reaching ending offset 2723431 for topic kafka-global-paas partition 2 start 2705506. This should not happen, and indicates that messages may have been lost
> at scala.Predef$.assert(Predef.scala:179)
> at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.getNext(KafkaRDD.scala:211)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
> at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:48)
> at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
> at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> 2016-05-06 11:18:08,982 [sparkDriver-akka.actor.default-dispatcher-18] INFO  org.apache.spark.scheduler.TaskSetManager - Starting task 7.1 in stage 13.0 (TID 63, xxxxxxxxxxxx, partition 7,RACK_LOCAL, 2052 bytes)
> 2016-05-06 11:18:10,013 [JobGenerator] INFO  org.apache.spark.streaming.scheduler.JobScheduler - Added jobs for time 1462526290000 ms
> 2016-05-06 11:18:10,015 [JobGenerator] INFO  org.apache.spark.streaming.scheduler.JobGenerator - Checkpointing graph for time 1462526290000 ms
>
> 2016-05-06 11:11 GMT+02:00 Guillermo Ortiz <konstt2...@gmail.com>:
>> I think it's a Kafka error, but I'm starting to wonder whether it
>> could be something related to Elasticsearch, since I've seen other
>> people hit the same error while using Elasticsearch. I have no idea.
>>
>> 2016-05-06 11:05 GMT+02:00 Guillermo Ortiz <konstt2...@gmail.com>:
>>> I'm trying to read data with Spark and index it into ES with its
>>> library (es-hadoop, version 2.2.1).
>>> It was working fine for a while, but now this error has started
>>> happening.
>>> I have deleted the checkpoint and even the Kafka topic, and restarted
>>> all the machines running Kafka and ZooKeeper, but it didn't fix it.
>>>
>>> User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 1.0 failed 4 times, most recent failure: Lost task 4.3 in stage 1.0 (TID 12, xxxxxxxxx): java.lang.AssertionError: assertion failed: Ran out of messages before reaching ending offset 1226116 for topic kafka-global-paas partition 7 start 1212156. This should not happen, and indicates that messages may have been lost
>>> at scala.Predef$.assert(Predef.scala:179)
>>> at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.getNext(KafkaRDD.scala:211)
>>> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
>>> at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
>>> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>>
>>> I have read some threads about this error, but they didn't help me.
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
