vinothchandar commented on a change in pull request #650: Fix up offsets not
available on leader exception
URL: https://github.com/apache/incubator-hudi/pull/650#discussion_r279958442
##########
File path:
hoodie-utilities/src/main/java/com/uber/hoodie/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -235,6 +237,30 @@ public KafkaOffsetGen(TypedProperties props) {
return offsetRanges;
}
+ /***
+ * check up checkpoint offsets is valid or not, if true, return checkpoint offsets, else return earliest offsets
+ * @param cluster,
+ * @param checkpointOffsets
+ * @param topicPartitions
+ * @return fromOffset
+ */
+ private HashMap<TopicAndPartition, KafkaCluster.LeaderOffset> checkupValidOffsets(KafkaCluster cluster,
+ HashMap<TopicAndPartition, KafkaCluster.LeaderOffset> checkpointOffsets,
+ Set<TopicAndPartition> topicPartitions ) {
+ java.util.Set<TopicAndPartition> partitions = checkpointOffsets.keySet();
+ HashMap<TopicAndPartition, KafkaCluster.LeaderOffset> fromOffsets;
+ fromOffsets = new HashMap(ScalaHelpers.toJavaMap(
+ cluster.getEarliestLeaderOffsets(topicPartitions).right().get()));
+ for (TopicAndPartition partition: partitions) {
+ if (checkpointOffsets.get(partition).offset() < fromOffsets.get(partition).offset() ) {
Review comment:
How does this handle new partitions that were added since the last checkpoint?
In that case, the partition won't be found in `checkpointOffsets`, and the job
will keep crashing with an NPE?
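
One possible way to guard against this, as a minimal sketch only: start from the earliest offsets of every current partition, then overlay a checkpointed offset only when one exists and is still at or above the earliest offset. The class `OffsetResolver`, the method `resolveStartOffsets`, and the plain `String` keys and `Long` offsets standing in for `TopicAndPartition` and `KafkaCluster.LeaderOffset` are all hypothetical, chosen so the example runs without Kafka on the classpath. This is not the code in the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of KafkaOffsetGen.
final class OffsetResolver {
  private OffsetResolver() {}

  /**
   * For every partition present in earliestOffsets, returns the checkpointed
   * offset if one exists and is still available (>= earliest); otherwise
   * returns the earliest offset. Partitions missing from the checkpoint,
   * such as newly added ones, never trigger a null dereference.
   */
  static Map<String, Long> resolveStartOffsets(Map<String, Long> checkpointOffsets,
                                               Map<String, Long> earliestOffsets) {
    // Default every current partition to its earliest available offset.
    Map<String, Long> fromOffsets = new HashMap<>(earliestOffsets);
    for (Map.Entry<String, Long> entry : earliestOffsets.entrySet()) {
      Long checkpointed = checkpointOffsets.get(entry.getKey());
      // Null check covers partitions added since the last checkpoint.
      if (checkpointed != null && checkpointed >= entry.getValue()) {
        fromOffsets.put(entry.getKey(), checkpointed);
      }
    }
    return fromOffsets;
  }
}
```

Iterating over the current partitions rather than over the checkpointed ones also means a partition that appears only in the checkpoint is ignored instead of being looked up in the earliest-offset map.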