[ https://issues.apache.org/jira/browse/FLINK-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhijiang closed FLINK-13589.
----------------------------
    Resolution: Fixed

Merged in master: 0bd083e5eeb5eb5adeddfbe3a9928860f3b4a6eb

Merged in release-1.9: db531e79807acba1ba28d9922bfed912fd78dd03

Merged in release-1.10: 1e716e4a43018caeb77beaa5d8f16cedfedbd887

> DelimitedInputFormat index error on multi-byte delimiters with whole file input splits
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-13589
>                 URL: https://issues.apache.org/jira/browse/FLINK-13589
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem, Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>    Affects Versions: 1.8.1
>            Reporter: Adric Eckstein
>            Assignee: Arvid Heise
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.9.2, 1.10.0
>
>         Attachments: delimiter-bug.patch
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The DelimitedInputFormat can drop bytes when reading input splits that have a
> length of -1 (meaning the whole file is read). This looks like a simple bug in
> handling a delimiter that falls on a buffer boundary: the logic is
> inconsistent between the different split types.
> Attached is a possible patch with a fix and a test.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
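
For readers unfamiliar with this class of bug, the following is a minimal, self-contained sketch of reading delimited records through a small refillable buffer when the delimiter is longer than one byte. It is not Flink's DelimitedInputFormat code and not the attached patch; the class name DelimiterBoundarySketch and the methods readRecords and endsWith are hypothetical names chosen for illustration. The point it tries to show is that the bytes of the current record, including a partially seen delimiter, have to survive a buffer refill, otherwise bytes around the boundary can be lost.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Flink.
final class DelimiterBoundarySketch {

    // Reads the whole stream (comparable to a "whole file" split) in chunks of
    // bufferSize bytes and splits it into records on a multi-byte delimiter.
    static List<String> readRecords(InputStream in, byte[] delimiter, int bufferSize)
            throws IOException {
        if (delimiter.length == 0 || bufferSize <= 0) {
            throw new IllegalArgumentException("delimiter and bufferSize must be non-empty/positive");
        }
        List<String> records = new ArrayList<>();
        byte[] buffer = new byte[bufferSize];
        // Bytes of the record in progress. This state is kept across refills,
        // so a delimiter split over two buffers is still recognized.
        byte[] pending = new byte[64];
        int pendingLen = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            for (int i = 0; i < read; i++) {
                if (pendingLen == pending.length) {
                    pending = Arrays.copyOf(pending, pending.length * 2);
                }
                pending[pendingLen++] = buffer[i];
                if (endsWith(pending, pendingLen, delimiter)) {
                    // Emit the record without its trailing delimiter.
                    records.add(new String(
                            pending, 0, pendingLen - delimiter.length, StandardCharsets.UTF_8));
                    pendingLen = 0;
                }
            }
        }
        if (pendingLen > 0) {
            // Last record of the stream has no trailing delimiter.
            records.add(new String(pending, 0, pendingLen, StandardCharsets.UTF_8));
        }
        return records;
    }

    // True if the first `length` bytes of `data` end with `suffix`.
    static boolean endsWith(byte[] data, int length, byte[] suffix) {
        if (length < suffix.length) {
            return false;
        }
        int offset = length - suffix.length;
        for (int i = 0; i < suffix.length; i++) {
            if (data[offset + i] != suffix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "ab||cd||ef".getBytes(StandardCharsets.UTF_8);
        byte[] delimiter = "||".getBytes(StandardCharsets.UTF_8);
        // With a 3-byte buffer the first "||" is split across two refills.
        System.out.println(readRecords(new ByteArrayInputStream(data), delimiter, 3));
        // prints [ab, cd, ef]
    }
}
```

The suffix check after every byte is deliberately simple rather than fast. A production reader would more likely track a match position inside the buffer, which is exactly where off-by-one errors at buffer boundaries tend to appear.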