[GitHub] [incubator-hudi] bhasudha commented on issue #1653: [SUPPORT]: Hudi Deltastreamer OffsetoutofRange Exception reading from Kafka topic (12 partitions)

2020-05-22 Thread GitBox


bhasudha commented on issue #1653:
URL: https://github.com/apache/incubator-hudi/issues/1653#issuecomment-632516157


   @prashanthpdesai sorry, I assumed you were referring to your own 
checkpointing. Your understanding is right: checkpoints are written to the 
hoodie commit metadata after each round of a DeltaStreamer run.
   
   The exception you described can occur if the offsets supplied are larger 
or smaller than what the server has for a given partition. I suspect this 
could be because the retention policy of the Kafka topic is kicking in. It 
should be easy to check. I think a command like 
   `kafka-topics.sh --bootstrap-server server_ip:9092 --describe --topic topic_name` 
will help print the topic config. Can we start debugging from there?
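
To make the "larger or smaller than what the server has" condition concrete, here is a minimal Python sketch of the range check. It assumes the checkpoint string has the form `topic,partition:offset,partition:offset,...`; that format, the helper names, and the sample numbers are all assumptions for illustration, not taken from this thread. The earliest and latest offsets per partition would come from the broker.

```python
from typing import Dict, Tuple


def parse_checkpoint(checkpoint: str) -> Tuple[str, Dict[int, int]]:
    """Split an assumed 'topic,p:offset,p:offset' string into topic and offsets."""
    topic, *parts = checkpoint.split(",")
    offsets = {}
    for part in parts:
        partition, offset = part.split(":")
        offsets[int(partition)] = int(offset)
    return topic, offsets


def find_out_of_range(
    checkpointed: Dict[int, int],
    earliest: Dict[int, int],
    latest: Dict[int, int],
) -> Dict[int, str]:
    """Return the partitions whose checkpointed offset the broker cannot serve."""
    problems = {}
    for partition, offset in checkpointed.items():
        if offset < earliest[partition]:
            # Typical symptom of retention having deleted older messages.
            problems[partition] = "below earliest"
        elif offset > latest[partition]:
            problems[partition] = "beyond latest"
    return problems


# Hypothetical values, for illustration only.
topic, checkpointed = parse_checkpoint("topic_name,0:100,1:250,2:50")
earliest = {0: 120, 1: 0, 2: 0}
latest = {0: 500, 1: 200, 2: 90}
problems = find_out_of_range(checkpointed, earliest, latest)
```

A partition reported as "below earliest" would be consistent with the retention theory above. "Beyond latest" would point to something else, such as the topic having been recreated.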



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] bhasudha commented on issue #1653: [SUPPORT]: Hudi Deltastreamer OffsetoutofRange Exception reading from Kafka topic (12 partitions)

2020-05-21 Thread GitBox


bhasudha commented on issue #1653:
URL: https://github.com/apache/incubator-hudi/issues/1653#issuecomment-632405067


   @prashanthpdesai quick questions.
   
   Where do you checkpoint the offsets between mini batches, and how do you 
configure that in DeltaStreamer? 
   
   Do you have the offset that you used to run this batch (the one which 
failed with out of range)? 
   If yes, can you check whether that message offset is still present in the 
Kafka topic (to rule out your Kafka retention policy having deleted those 
messages)? You can get the smallest offset available for a topic partition by 
running this Kafka command line:
   `bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list 
 --topic  --time -2
   `
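
As a rough aid for reading the result, here is a small Python sketch that turns `GetOffsetShell` output into a per-partition map. It assumes each output line has the form `topic:partition:offset`. The function name and the sample output are hypothetical, and partitions printed without an offset are not handled.

```python
from typing import Dict


def parse_get_offset_shell(output: str) -> Dict[int, int]:
    """Map partition -> offset from lines assumed to be 'topic:partition:offset'."""
    offsets = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split from the right so only the last two fields are interpreted.
        _topic, partition, offset = line.rsplit(":", 2)
        offsets[int(partition)] = int(offset)
    return offsets


# Hypothetical output for a run with --time -2 (earliest offsets).
sample_output = """topic_name:0:120
topic_name:1:0
topic_name:2:37
"""
earliest_offsets = parse_get_offset_shell(sample_output)
```

Any checkpointed offset lower than the value reported for its partition is no longer available on the broker.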


