Eugene Kirpichov created BEAM-2716:
--------------------------------------

             Summary: AvroReader should refuse dynamic splits while in the last block
                 Key: BEAM-2716
                 URL: https://issues.apache.org/jira/browse/BEAM-2716
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-core
            Reporter: Eugene Kirpichov
            Assignee: Eugene Kirpichov
            Priority: Minor


AvroReader is able to detect when it's in the last block:
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java#L728

It could also use this information to refuse dynamic split requests whose 
split position falls within the current block, since such splits are wasteful.

One way to do this would be to give OffsetRangeTracker a "claim range" 
operation: claiming the range [a, b) is, in terms of correctness, equivalent to 
claiming "a" (it checks whether "a" is within the range), but it sets the last 
claimed position to "b" rather than "a", thus protecting more positions from 
being split away.
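
A minimal sketch of the proposed "claim range" semantics follows. It is an 
illustration only: ClaimRangeTrackerSketch, tryClaimRange and trySplitAt are 
hypothetical names invented here, not the existing OffsetRangeTracker API, and 
the real tracker has more state and checks than this.

```java
// Hypothetical, simplified tracker; NOT Beam's OffsetRangeTracker.
final class ClaimRangeTrackerSketch {
    private final long startOffset;            // inclusive start of the tracked range
    private long stopOffset;                   // exclusive end; shrinks after a successful split
    private long lastClaimed = Long.MIN_VALUE; // highest position protected from splitting

    ClaimRangeTrackerSketch(long startOffset, long stopOffset) {
        if (stopOffset < startOffset) {
            throw new IllegalArgumentException("stopOffset must be >= startOffset");
        }
        this.startOffset = startOffset;
        this.stopOffset = stopOffset;
    }

    /**
     * Claims [a, b). Succeeds exactly when claiming "a" alone would succeed,
     * but records "b" as the last claimed position.
     */
    synchronized boolean tryClaimRange(long a, long b) {
        if (b <= a) {
            throw new IllegalArgumentException("claim range must be non-empty");
        }
        if (a < startOffset || a < lastClaimed) {
            throw new IllegalStateException("claims must be in range and non-decreasing");
        }
        if (a >= stopOffset) {
            return false; // "a" is outside the (possibly shrunk) range
        }
        lastClaimed = b;  // protect every position up to "b" from being split away
        return true;
    }

    /** Attempts to shrink the range to [startOffset, splitOffset). */
    synchronized boolean trySplitAt(long splitOffset) {
        if (splitOffset <= lastClaimed) {
            return false; // falls inside positions that were already claimed
        }
        if (splitOffset <= startOffset || splitOffset >= stopOffset) {
            return false; // would produce an empty primary or residual range
        }
        stopOffset = splitOffset;
        return true;
    }

    public static void main(String[] args) {
        ClaimRangeTrackerSketch tracker = new ClaimRangeTrackerSketch(0, 100);
        System.out.println(tracker.tryClaimRange(10, 40)); // claim a block spanning [10, 40)
        System.out.println(tracker.trySplitAt(25));        // split point inside the claimed block
        System.out.println(tracker.trySplitAt(60));        // split point past the claimed block
    }
}
```

Under this sketch, a reader that knows it is in the last block could claim a 
range extending to the end of its source, so that any later split request is 
refused. Whether AvroReader would pass the block end or the range end as "b" 
is left open here.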



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
