PigStorage may miss records when loading a file
-----------------------------------------------
Key: PIG-1080
URL: https://issues.apache.org/jira/browse/PIG-1080
Project: Pig
Issue Type: Bug
Reporter: Richard Ding
Assignee: Richard Ding
When a file is assigned to multiple mappers (one block per mapper), the blocks
may not end at the exact record boundary. Special care is taken to ensure that
all records are loaded by mappers (and exactly once), even for records that
cross the block boundary.
The PigStorage, however, doesn't correctly handle the case where a block ends
at exactly record boundary and results in missing records.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.