[ 
https://issues.apache.org/jira/browse/BEAM-13171?focusedWorklogId=686133&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-686133
 ]

ASF GitHub Bot logged work on BEAM-13171:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Nov/21 23:39
            Start Date: 24/Nov/21 23:39
    Worklog Time Spent: 10m 
      Work Description: lukecwik commented on a change in pull request #15951:
URL: https://github.com/apache/beam/pull/15951#discussion_r756478187



##########
File path: CHANGES.md
##########
@@ -63,6 +63,7 @@
 * We changed the data type for ranges in `JdbcIO.readWithPartitions` from 
`int` to `long`. This is a relatively minor
     breaking change, which we're implementing to improve the usability of the 
transform without increasing cruft.
     This transform is relatively new, so we may implement other breaking 
changes in the future to improve its usability.
+* Support for stopReadTime on KafkaIO SDF 
(Java).([BEAM-13171](https://issues.apache.org/jira/browse/BEAM-13171)).

Review comment:
       This will likely go under the 2.36 release now since the 2.35 release 
branch has been cut and is unlikely to take in new features.

##########
File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/ReadFromKafkaDoFn.java
##########
@@ -338,6 +355,9 @@ public ProcessContinuation processElement(
         // When there are no records available for the current TopicPartition, 
self-checkpoint
         // and move to process the next element.
         if (rawRecords.isEmpty()) {
+          if (expectedOffset > endOffset) {

Review comment:
       Note that `OffsetRange` is half-open, `[from, to)`, so shouldn't this 
check be `expectedOffset >= endOffset`?
   
   Also, I believe this line is here to cover the case where you want to read 
up to offset `X`, the cluster only has messages up to `X-1`, and you don't want 
to wait for the message at offset `X` to be published. In that case it makes 
sense to move this `if` down into the `for (record : ...) { ... }` loop, right 
after the `outputWithTimestamp` call. It would also be good to add a comment to 
this effect; otherwise others will not understand why we aren't relying on 
`tryClaim` returning false.
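
   To illustrate the half-open point, here is a minimal stand-alone sketch. It 
does not use Beam or Kafka classes; `HalfOpenRangeSketch`, `rangeExhausted`, and 
`processBatch` are hypothetical names, and the in-range test inside the loop 
only stands in for what `tryClaim` would decide. It is not the code in 
`ReadFromKafkaDoFn`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the end-of-range check for a half-open range [from, to).
final class HalfOpenRangeSketch {

  // The last readable offset is endOffset - 1, so the range is exhausted as soon
  // as the next expected offset equals endOffset. Using '>' would miss that case.
  static boolean rangeExhausted(long expectedOffset, long endOffset) {
    return expectedOffset >= endOffset;
  }

  // Emits the offsets of one polled batch into 'out' and returns true when the
  // range is finished, so the caller does not need to wait for a record at
  // endOffset to be published.
  static boolean processBatch(long[] batchOffsets, long endOffset, List<Long> out) {
    for (long offset : batchOffsets) {
      if (offset >= endOffset) {
        return true; // stands in for tryClaim returning false
      }
      out.add(offset); // stands in for outputWithTimestamp
      long expectedOffset = offset + 1;
      // Check right after emitting: stop once the record at endOffset - 1 is out.
      if (rangeExhausted(expectedOffset, endOffset)) {
        return true;
      }
    }
    return false; // more offsets in the range are still expected
  }

  public static void main(String[] args) {
    List<Long> out = new ArrayList<>();
    // The range ends at 5 (exclusive) and the batch ends at offset 4.
    boolean done = processBatch(new long[] {2, 3, 4}, 5, out);
    System.out.println("emitted=" + out + " done=" + done);
  }
}
```

   With `>` instead of `>=`, the batch ending at offset 4 would leave 
`expectedOffset` at 5, the check would not fire, and the reader would keep 
polling for an offset that lies outside the range.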






Issue Time Tracking
-------------------

    Worklog Id:     (was: 686133)
    Time Spent: 1h 20m  (was: 1h 10m)

> Support for stopReadTime on KafkaIO SDF 
> ----------------------------------------
>
>                 Key: BEAM-13171
>                 URL: https://issues.apache.org/jira/browse/BEAM-13171
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-kafka
>            Reporter: Mostafa Aghajani
>            Assignee: Mostafa Aghajani
>            Priority: P2
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Support for startReadTime already exists when using SDF, when the Kafka 
> version is supported.
> I want to add support for stopReadTime so that we can read messages from 
> Kafka only up to a point in time, after which the task finishes.
> One use case: re-processing (re-reading) a specific period of time for a 
> Kafka topic in your pipeline.


