Hi, I'm using Spark Streaming with Kafka, and I need to clear the offsets and recompute everything. I deleted the checkpoint directory in HDFS and reset the Kafka offset with "kafka-run-class kafka.tools.ImportZkOffsets". I can confirm the offset is set to 0 in Kafka:
~ > kafka-run-class kafka.tools.ConsumerOffsetChecker --group adhoc_data_spark --topic adhoc_data --zookeeper szq1.appadhoc.com:2181
Group             Topic       Pid  Offset  logSize  Lag      Owner
adhoc_data_spark  adhoc_data  0    0       5280743  5280743  none

But when I restart Spark Streaming, the offset is reset to logSize. I cannot figure out why. Can anybody help? Thanks.