Aneesh Sharma created PIG-2774:
----------------------------------

             Summary: Fix merge join to work with many duplicate left keys
                 Key: PIG-2774
                 URL: https://issues.apache.org/jira/browse/PIG-2774
             Project: Pig
          Issue Type: Bug
            Reporter: Aneesh Sharma


A merge join can throw an out-of-memory (OOM) error when a large number of 
left tuples share the same key, because it accumulates all of them in memory. 
There are two possible ways around this problem:
1. Serialize the accumulated tuples to disk once they exceed a certain size.
2. Emit join output periodically, and re-seek on the right-hand-side index.
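
To illustrate option 2, here is a minimal sketch in Python. It is not Pig's 
implementation. The function name chunked_merge_join, the batch_size 
parameter, and the use of a sorted in-memory list plus bisect as a stand-in 
for the right-hand-side index are all assumptions made for illustration. The 
idea is to hold at most batch_size left tuples for a key, emit their join 
output, and then re-seek the right side for the next batch of the same key.

```python
from bisect import bisect_left
from itertools import groupby, islice
from operator import itemgetter


def chunked_merge_join(left, right, batch_size=2):
    """Inner-join two inputs that are sorted by their first field.

    `left` is any iterable of tuples and is streamed once. `right` is a
    sorted list standing in for an indexed, seekable right-hand input
    (hypothetical; a real index would live on disk).
    """
    # Stand-in for the right-side index: the sorted list of right keys.
    right_keys = [t[0] for t in right]

    for key, run in groupby(left, key=itemgetter(0)):
        while True:
            # Buffer at most `batch_size` left tuples that share this key.
            batch = list(islice(run, batch_size))
            if not batch:
                break
            # Re-seek the right side to the first tuple with this key.
            pos = bisect_left(right_keys, key)
            while pos < len(right) and right[pos][0] == key:
                for left_tuple in batch:
                    # Emit output now instead of accumulating it.
                    yield left_tuple + right[pos][1:]
                pos += 1


left = [(1, "a"), (1, "b"), (1, "c"), (2, "d")]
right = [(1, "x"), (1, "y"), (3, "z")]
result = list(chunked_merge_join(left, right, batch_size=2))
```

With this approach, memory use on the left side is bounded by batch_size 
regardless of how many duplicates a key has. The cost is one extra seek and 
one extra scan of the matching right-hand tuples per batch, so a larger 
batch_size means fewer re-seeks but more memory.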


        
