PERFORMANCE: improve how data is stored between M-R jobs and between Map and Reduce
-----------------------------------------------------------------------------------
Key: PIG-686
URL: https://issues.apache.org/jira/browse/PIG-686
Project: Pig
Issue Type: Improvement
Affects Versions: types_branch
Reporter: Olga Natkovich
Fix For: types_branch

Currently, there is quite a bit of overhead in how the data is serialized in both cases, because type information is stored with each field. However, most of the time the data has a known and consistent schema, in which case it is sufficient to store the schema once. This change could significantly decrease the amount of intermediate data generated.