I think I've found the reason. It seems that the smallest available offset is not 0, so I should not have set the offset to 0.
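In case it helps anyone else, here is a minimal sketch of the check I should have done before writing the offset back. It is not Spark or Kafka API code. The function name clamp_offset and the earliest offset value are hypothetical, made up for illustration. Only the logSize of 5280743 comes from the ConsumerOffsetChecker output quoted below. The idea is that a reset offset has to lie between the earliest offset the broker still retains and the log end, otherwise the consumer may treat it as out of range.

```python
def clamp_offset(requested, earliest, latest):
    """Return `requested` if it lies in [earliest, latest], else the nearest bound.

    Hypothetical helper for illustration only; not part of Kafka or Spark.
    """
    if earliest > latest:
        raise ValueError("earliest offset must not exceed latest offset")
    return max(earliest, min(requested, latest))


# Assumed value: the earliest offset still retained by the broker.
# The real number has to be looked up for the topic and partition.
EARLIEST_RETAINED = 1200000

# Taken from the logSize column in the quoted ConsumerOffsetChecker output.
LOG_SIZE = 5280743

# Asking for offset 0 yields the earliest retained offset instead.
safe_offset = clamp_offset(0, EARLIEST_RETAINED, LOG_SIZE)
print(safe_offset)
```

With a value like this written to ZooKeeper instead of 0, the offset should at least be inside the valid range when the streaming job restarts.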
Bin Wang <wbi...@gmail.com> wrote on Mon, Sep 14, 2015 at 2:46 PM:

> Hi,
>
> I'm using Spark Streaming with Kafka and I need to clear the offset and
> re-compute everything. I deleted the checkpoint directory in HDFS and reset
> the Kafka offset with "kafka-run-class kafka.tools.ImportZkOffsets". I can
> confirm the offset is set to 0 in Kafka:
>
> ~ > kafka-run-class kafka.tools.ConsumerOffsetChecker --group adhoc_data_spark --topic adhoc_data --zookeeper szq1.appadhoc.com:2181
> Group             Topic       Pid  Offset  logSize  Lag      Owner
> adhoc_data_spark  adhoc_data  0    0       5280743  5280743  none
>
> But when I restart Spark Streaming, the offset is reset to logSize. I
> cannot figure out why that is. Can anybody help? Thanks.