multi file input format for loaders
-----------------------------------
Key: PIG-1518
URL: https://issues.apache.org/jira/browse/PIG-1518
Project: Pig
Issue Type: Improvement
Reporter: Olga Natkovich
Assignee: Yan Zhou
Fix For: 0.8.0
We frequently run in the situation where Pig needs to deal with small files in
the input. In this case a separate map is created for each file which could be
very inefficient.
It would be greate to have an umbrella input format that can take multiple
files and use them in a single split. We would like to see this working with
different data formats if possible.
There are already a couple of input formats doing similar thing:
MultifileInputFormat as well as CombinedInputFormat; howevere, neither works
with ne Hadoop 20 API.
We at least want to do a feasibility study for Pig 0.8.0.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.