[ https://issues.apache.org/jira/browse/BEAM-9977?focusedWorklogId=435076&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-435076 ]
ASF GitHub Bot logged work on BEAM-9977:
----------------------------------------

Author: ASF GitHub Bot
Created on: 19/May/20 17:23
Start Date: 19/May/20 17:23
Worklog Time Spent: 10m
Work Description: boyuanzz commented on pull request #11715:
URL: https://github.com/apache/beam/pull/11715#issuecomment-630963389

> > > Should we be using the RangeEndEstimator when providing progress/splitting for ranges not ending at `Long.MAX_VALUE`?
> > > Let's say the range estimate is bad and is `MAX_VALUE - 3` but the real end is `5000`, then after a split we end up with `[0, (MAX_VALUE - 3) * 0.5)` and `[(MAX_VALUE - 3) * 0.5, MAX_VALUE)`. We may quickly learn that the residual is empty and then lose all effective progress on the primary.
> >
> >
> > I can see the benefit of using `RangeEndEstimator` for the finite range here. But as long as we don't modify the range end to the estimated end or use the estimated end in `tryClaim`, we still cannot say the residual is empty.
>
> That is true but I was thinking it would make better splitting decisions instead of creating a bunch of empty splits trimming the range down. The advantage of not using the estimator is that we don't have to invoke it, since it could be expensive for the user and in many situations will produce a value greater than `to`.
>
> We can leave it out for now unless some compelling use case comes up.

Do we want to have a TODO here to track this?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 435076)
Time Spent: 2.5h (was: 2h 20m)

> Build Kafka Read on top of Java SplittableDoFn
> ----------------------------------------------
>
> Key: BEAM-9977
> URL: https://issues.apache.org/jira/browse/BEAM-9977
> Project: Beam
> Issue Type: New Feature
> Components: io-java-kafka
> Reporter: Boyuan Zhang
> Assignee: Boyuan Zhang
> Priority: P2
> Time Spent: 2.5h
> Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
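
The split arithmetic discussed in the quoted comment can be sketched roughly as follows. This is only an illustration of the numbers in the example (a bad end estimate of `MAX_VALUE - 3` against a real end of `5000`), not the code in the pull request or Beam's actual tracker logic. The class `SplitSketch` and the helper `splitOffset` are hypothetical names introduced here.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch; not taken from the pull request or from Beam.
class SplitSketch {

  // Assumed helper: the offset at which [from, end) is cut when splitting
  // at the given fraction. BigDecimal avoids the precision loss that double
  // arithmetic would have near Long.MAX_VALUE.
  static long splitOffset(long from, long end, double fraction) {
    BigDecimal size = BigDecimal.valueOf(end).subtract(BigDecimal.valueOf(from));
    BigDecimal delta =
        size.multiply(BigDecimal.valueOf(fraction)).setScale(0, RoundingMode.FLOOR);
    return BigDecimal.valueOf(from).add(delta).longValueExact();
  }

  public static void main(String[] args) {
    long realEnd = 5000L;                   // where the data actually stops
    long badEstimate = Long.MAX_VALUE - 3;  // the bad estimate from the example

    // Splitting against the bad estimate puts the cut far past the real end,
    // so the residual holds no data and the primary still holds all of it.
    long cut = splitOffset(0L, badEstimate, 0.5);
    System.out.println("primary  = [0, " + cut + ")");
    System.out.println("residual = [" + cut + ", " + Long.MAX_VALUE + ")");
    System.out.println("residual starts past real end: " + (cut >= realEnd));

    // Splitting against an accurate end divides the real work in half.
    System.out.println("cut with accurate end: " + splitOffset(0L, realEnd, 0.5));
  }
}
```

Under these assumptions, the cut computed from the bad estimate lies far beyond offset `5000`, which is the situation the reviewer describes: the residual turns out to be empty and the primary keeps all of the remaining work.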