liyunzhang_intel created PIG-4891:
-------------------------------------

             Summary: Implement FR join by broadcasting small RDD instead of 
making more copies of data
                 Key: PIG-4891
                 URL: https://issues.apache.org/jira/browse/PIG-4891
             Project: Pig
          Issue Type: Sub-task
          Components: spark
            Reporter: liyunzhang_intel


The current implementation of FRJoin (PIG-4771) simply sets the replication 
of the data to 10 to make data access more efficient, because the existing 
FRJoin algorithms can be reused this way. We need to figure out how to 
implement FRJoin in the current code base by broadcasting the small RDD, if 
we find that broadcasting improves performance significantly.
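
For illustration only, here is a minimal sketch of the broadcast idea in plain 
Python, with no Pig or Spark dependency. The function and variable names 
(broadcast_hash_join, large_partitions, small_rows) are hypothetical and do 
not come from the Pig code base. It assumes the small relation fits in memory 
and that an inner join is wanted. In Spark, the lookup table would presumably 
be shipped to the executors once with SparkContext.broadcast rather than by 
raising the replication of the underlying files.

```python
from collections import defaultdict


def broadcast_hash_join(large_partitions, small_rows):
    """Inner-join every partition of the large relation against one
    shared in-memory copy of the small relation."""
    # Build the lookup table once. This stands in for the broadcast value
    # that each executor would receive a single copy of.
    lookup = defaultdict(list)
    for key, value in small_rows:
        lookup[key].append(value)

    joined = []
    for partition in large_partitions:
        # Map-side probe: the large relation is never shuffled.
        for key, value in partition:
            for small_value in lookup.get(key, []):
                joined.append((key, value, small_value))
    return joined


# Two partitions of the large relation and one small relation (made-up data).
large = [[(1, "a"), (2, "b")], [(1, "c"), (3, "d")]]
small = [(1, "x"), (2, "y"), (2, "z")]
result = broadcast_hash_join(large, small)
```

The point of the sketch is that the small side is materialized once and 
probed locally by each partition, so no additional copies of the data files 
are required.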



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)