[ https://issues.apache.org/jira/browse/PIG-554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shravan Matthur Narayanamurthy reassigned PIG-554:
--------------------------------------------------

    Assignee: Shravan Matthur Narayanamurthy

> Fragment Replicate Join
> -----------------------
>
>                 Key: PIG-554
>                 URL: https://issues.apache.org/jira/browse/PIG-554
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: types_branch
>            Reporter: Shravan Matthur Narayanamurthy
>            Assignee: Shravan Matthur Narayanamurthy
>             Fix For: types_branch
>
>
> Fragment Replicate Join (FRJ) is useful when we want a join between a huge
> table and a very small table (small enough to fit in memory) and the join
> doesn't expand the data by much. The idea is to distribute the processing of
> the huge file by fragmenting it and replicating the small file to all
> machines receiving a fragment of the huge file. Because the entire small file
> is available on each machine, the join becomes a trivial task that needs no
> break in the pipeline. Exhaustive tests have been done to determine the
> improvement we get out of FRJ. I will post the details in a wiki and add a
> link here.
> The patch makes changes to the parts of the code where new operators are
> introduced. Currently, when a new operator is introduced, its alias is not
> set. For schema computation, I have modified this behaviour to set the alias
> of the new operator to that of its predecessor. The logical side of the patch
> mimics the cogroup behavior, as the join syntax closely resembles that of
> cogroup. Currently, this patch supports only inner joins.
> The rest of the code has been documented.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
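
The fragment-and-replicate idea described in the issue can be sketched in a few lines of Python. This is only an illustrative, single-process sketch under stated assumptions, not the code in the patch: the function names (`build_replicated_index`, `join_fragment`, `fragment_replicate_join`), the tuple-based rows, and the round-robin fragmenting are all hypothetical. It shows why no break in the pipeline is needed: the small table is loaded once into an in-memory hash map, and every fragment of the huge table is then joined against that map independently. Only inner joins are covered, matching the limitation stated in the issue.

```python
from collections import defaultdict


def build_replicated_index(small_rows, key_index):
    """Load the whole small table into a hash map keyed on the join column.

    In a distributed setting, each machine would hold its own copy of this map.
    """
    index = defaultdict(list)
    for row in small_rows:
        index[row[key_index]].append(row)
    return index


def join_fragment(fragment_rows, key_index, replicated_index):
    """Inner-join one fragment of the huge table against the replicated map."""
    for row in fragment_rows:
        # .get() avoids creating empty entries for keys with no match
        for match in replicated_index.get(row[key_index], ()):
            yield row + match


def fragment_replicate_join(huge_rows, small_rows, huge_key, small_key, num_fragments):
    """Simulate FRJ: split the huge table, join every fragment independently."""
    index = build_replicated_index(small_rows, small_key)
    # Round-robin split stands in for the input splits handed to each machine
    fragments = [huge_rows[i::num_fragments] for i in range(num_fragments)]
    joined = []
    for fragment in fragments:
        joined.extend(join_fragment(fragment, huge_key, index))
    return joined


if __name__ == "__main__":
    huge = [(1, "a"), (2, "b"), (3, "c"), (1, "d")]
    small = [(1, "x"), (3, "y")]
    for row in fragment_replicate_join(huge, small, 0, 0, num_fragments=2):
        print(row)
```

Because each fragment only reads the shared map, the per-fragment joins need no shuffle or grouping step between them, which is the property the issue relies on.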