[ 
https://issues.apache.org/jira/browse/PIG-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated PIG-2163:
-----------------------------

    Attachment: PIG-2163_1.patch

Hi Daniel,

I modified the patch according to your comments.

Now the right-most bag is streamed and the cross product is generated on the 
fly. Additionally, to make the order of the generated tuples reasonable, I 
reversed the iteration order of the n bags: they are now iterated in n, n - 1, 
..., 2, 1 order, which avoids the strange 2, 3, ..., n - 1, n, 1 order. For 
example, given three bags from left to right:

bag #1 {(a, 1), (a, 2)}
bag #2 {(a, 11), (a, 22)}
bag #3 {(a, 111), (a, 222)}

the generated bag will be:
{
(a, 1, a, 11, a, 111),
(a, 2, a, 11, a, 111),
(a, 1, a, 22, a, 111),
(a, 2, a, 22, a, 111),
(a, 1, a, 11, a, 222),
(a, 2, a, 11, a, 222),
(a, 1, a, 22, a, 222),
(a, 2, a, 22, a, 222)
}
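
The ordering above can be illustrated with a short Python sketch. This is not 
the Java code in the patch, only a model of the behaviour described here; the 
function name nested_cross and its parameters are made up for illustration. It 
assumes that every bag except the right-most one is already materialized and 
that the right-most bag is consumed exactly once as the outermost loop.

```python
from itertools import product


def nested_cross(materialized_bags, streamed_bag):
    """Lazily yield the cross product of the bags.

    materialized_bags: in-memory bags #1 .. #(n-1), left to right (hypothetical name).
    streamed_bag: any iterable for bag #n; it is consumed exactly once.
    """
    # itertools.product varies its last argument fastest, so reversing the
    # bags makes bag #1 the innermost (fastest-varying) loop.
    reversed_bags = materialized_bags[::-1]
    for streamed_tuple in streamed_bag:  # outermost loop: the streamed bag
        for combo in product(*reversed_bags):
            out = ()
            # combo is ordered (bag n-1, ..., bag 1); restore the
            # left-to-right field order before emitting.
            for t in reversed(combo):
                out += t
            yield out + streamed_tuple


bag1 = [("a", 1), ("a", 2)]
bag2 = [("a", 11), ("a", 22)]
bag3 = (t for t in [("a", 111), ("a", 222)])  # a one-pass stream

result = list(nested_cross([bag1, bag2], bag3))
for row in result:
    print(row)
```

Because the streamed bag sits in the outermost loop, each of its tuples is 
read once and never needs to be kept, while the materialized bags are 
re-iterated for every streamed tuple.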

> Improve nested cross to stream one relation
> -------------------------------------------
>
>                 Key: PIG-2163
>                 URL: https://issues.apache.org/jira/browse/PIG-2163
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.10
>            Reporter: Daniel Dai
>            Assignee: Zhijie Shen
>             Fix For: 0.10
>
>         Attachments: PIG-2163.patch, PIG-2163_1.patch
>
>
> PIG-1916 added nested cross support to Pig. One optimization is that, instead 
> of materializing all bags before producing the result, we can stream one of 
> the inputs to save memory.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
