lukecwik commented on pull request #11715:
URL: https://github.com/apache/beam/pull/11715#issuecomment-630955102


   > > Should we be using the RangeEndEstimator when providing 
progress/splitting for ranges not ending at `Long.MAX_VALUE`?
   > > Let's say the range estimate is bad and is `MAX_VALUE - 3` but the real 
end is `5000`, then after a split we end up with `[0, (MAX_VALUE - 3) * 0.5)` 
and `[(MAX_VALUE - 3) * 0.5, MAX_VALUE)`. We may quickly learn that the 
residual is empty and then lose all effective progress on the primary.
   > 
   > I can see the benefit of using `RangeEndEstimator` for the finite range 
here. But as long as we don't modify the range end to the estimated end or use 
the estimated end in `tryClaim`, we still cannot say the residual is empty.
   
   That is true, but I was thinking it would make better splitting decisions 
instead of creating a bunch of empty splits that trim the range down. The 
advantage of not using the estimator is that we don't have to invoke it, since 
it could be expensive for the user and in many situations will produce a value 
greater than `to`.
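
To make the trade-off concrete, here is a rough sketch of the arithmetic only. It is not the tracker code from this PR: `SplitSketch`, `splitPosition`, and `residualIsEmpty` are hypothetical names, and the numbers are the ones from the example quoted above (a bad end of `MAX_VALUE - 3` versus a real end of `5000`).

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical sketch, not Beam's tracker implementation.
public class SplitSketch {

  /**
   * Position at which the remainder [current, end) is cut when the primary
   * keeps `fraction` of it. BigDecimal avoids precision loss near Long.MAX_VALUE.
   */
  public static long splitPosition(long current, long end, double fraction) {
    BigDecimal remaining = BigDecimal.valueOf(end).subtract(BigDecimal.valueOf(current));
    BigDecimal offset =
        remaining.multiply(BigDecimal.valueOf(fraction)).setScale(0, RoundingMode.FLOOR);
    return BigDecimal.valueOf(current).add(offset).longValueExact();
  }

  /** True when a residual starting at splitPos lies entirely past the real end. */
  public static boolean residualIsEmpty(long splitPos, long realEnd) {
    return splitPos >= realEnd;
  }

  public static void main(String[] args) {
    long realEnd = 5000L;                  // where the data actually stops
    long badEnd = Long.MAX_VALUE - 3;      // the bad end from the example above

    // Splitting against the bad end puts the cut far beyond the real data,
    // so the residual has nothing to process.
    long badSplit = splitPosition(0L, badEnd, 0.5);
    System.out.println("bad split at " + badSplit
        + ", residual empty: " + residualIsEmpty(badSplit, realEnd));

    // Splitting against an accurate end leaves work on both sides.
    long goodSplit = splitPosition(0L, realEnd, 0.5);
    System.out.println("good split at " + goodSplit
        + ", residual empty: " + residualIsEmpty(goodSplit, realEnd));
  }
}
```

With a bad end, each split hands out a residual that turns out to be empty and only halves the over-sized range, which is the "bunch of empty splits" case. A better end gives useful splits, at the price of calling a possibly expensive estimator.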



