garyli1019 commented on a change in pull request #1652: URL: https://github.com/apache/hudi/pull/1652#discussion_r430074707
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -207,6 +208,11 @@ public KafkaOffsetGen(TypedProperties props) {
     maxEventsToReadFromKafka = (maxEventsToReadFromKafka == Long.MAX_VALUE || maxEventsToReadFromKafka == Integer.MAX_VALUE)
         ? Config.maxEventsFromKafkaSource : maxEventsToReadFromKafka;
     long numEvents = sourceLimit == Long.MAX_VALUE ? maxEventsToReadFromKafka : sourceLimit;
+
+    if (numEvents < toOffsets.size()) {

Review comment:
I think that if people really set `sourceLimit` to 0, they should consume no new data. That is the intended behavior, and it is preferable to throwing an exception. If they want to consume data, they should set this number higher.

Review comment:
@pratyakshsharma thoughts?
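
To make the suggestion concrete, here is a rough sketch of the "limit of 0 means consume nothing" behavior. It is not the actual `KafkaOffsetGen` code: the class `SourceLimitSketch`, the constant `DEFAULT_MAX_EVENTS`, and both helper methods are hypothetical names, and the in-order allocation across partitions is a simplification I am assuming purely for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

public class SourceLimitSketch {
    // Hypothetical stand-in for the configured default cap on events per batch.
    static final long DEFAULT_MAX_EVENTS = 5_000_000L;

    // Mirrors the resolution logic quoted in the diff: fall back to the default
    // when the configured max is "unbounded", and let an explicit sourceLimit win.
    static long resolveNumEvents(long sourceLimit, long maxEventsToReadFromKafka) {
        long max = (maxEventsToReadFromKafka == Long.MAX_VALUE
                || maxEventsToReadFromKafka == Integer.MAX_VALUE)
                ? DEFAULT_MAX_EVENTS : maxEventsToReadFromKafka;
        return sourceLimit == Long.MAX_VALUE ? max : sourceLimit;
    }

    // Returns the "until" offset per partition. With numEvents <= 0 the result
    // equals the starting offsets, so nothing is consumed and nothing is thrown.
    static Map<Integer, Long> computeUntilOffsets(Map<Integer, Long> fromOffsets,
                                                  Map<Integer, Long> latestOffsets,
                                                  long numEvents) {
        Map<Integer, Long> until = new TreeMap<>(fromOffsets);
        long remaining = Math.max(numEvents, 0L);
        // Simplified allocation (assumption): fill partitions in key order.
        for (Map.Entry<Integer, Long> e : new TreeMap<>(fromOffsets).entrySet()) {
            if (remaining == 0) {
                break;
            }
            long available = Math.max(latestOffsets.get(e.getKey()) - e.getValue(), 0L);
            long take = Math.min(available, remaining);
            until.put(e.getKey(), e.getValue() + take);
            remaining -= take;
        }
        return until;
    }
}
```

With this shape, a limit smaller than the partition count just leaves some partitions untouched in that batch. Whether that is acceptable, or whether a validation error is still wanted for that case, is the open question for this thread.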