[ 
https://issues.apache.org/jira/browse/PIG-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated PIG-4542:
---------------------------------
    Attachment: PIG-4542.1.patch

> OutputConsumerIterator should flush buffered records
> ----------------------------------------------------
>
>                 Key: PIG-4542
>                 URL: https://issues.apache.org/jira/browse/PIG-4542
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>    Affects Versions: spark-branch
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>             Fix For: spark-branch
>
>         Attachments: PIG-4542.1.patch, PIG-4542.patch
>
>
> Certain operators may buffer their output. We need to flush the last set of 
> records from such operators when we encounter the last input record, before 
> calling getNextTuple() for the last time.
> Currently, to flush the last set of records, we compute RDD.count() and 
> compare the count with a running counter to determine if we have reached the 
> last record. This is unnecessary and inefficient.
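
One way to avoid the RDD.count() pass is to let the iterator itself notice when its input is exhausted and only then ask the operator to flush. The following is a minimal sketch of that idea, not the code in the attached patch: BufferingOperator, FlushingIterator, BatchSum and FlushDemo are hypothetical names introduced here for illustration, and plain java.util iterators stand in for the Pig and Spark types.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an operator that may hold records back.
interface BufferingOperator<I, O> {
    void attach(I input);   // feed one input record
    void endOfInput();      // signal that no more input will arrive
    O poll();               // next ready output, or null if none is ready
}

// Wraps an input iterator and flushes the operator once the input is exhausted,
// so no up-front record count is needed to detect the last record.
final class FlushingIterator<I, O> implements Iterator<O> {
    private final Iterator<I> input;
    private final BufferingOperator<I, O> op;
    private O next;
    private boolean flushed;

    FlushingIterator(Iterator<I> input, BufferingOperator<I, O> op) {
        this.input = input;
        this.op = op;
    }

    @Override
    public boolean hasNext() {
        while (next == null) {
            next = op.poll();
            if (next != null) {
                break;
            }
            if (input.hasNext()) {
                op.attach(input.next());
            } else if (!flushed) {
                op.endOfInput();   // input exhausted: ask the operator to flush
                flushed = true;
            } else {
                return false;      // input exhausted and buffer drained
            }
        }
        return true;
    }

    @Override
    public O next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        O out = next;
        next = null;
        return out;
    }
}

// Toy operator: emits the sum of every `size` inputs and, on endOfInput(),
// emits whatever partial batch is still buffered.
final class BatchSum implements BufferingOperator<Integer, Integer> {
    private final int size;
    private final Deque<Integer> ready = new ArrayDeque<>();
    private int sum;
    private int count;

    BatchSum(int size) { this.size = size; }

    @Override public void attach(Integer v) {
        sum += v;
        count++;
        if (count == size) {
            emit();
        }
    }

    @Override public void endOfInput() {
        if (count > 0) {
            emit();
        }
    }

    @Override public Integer poll() { return ready.poll(); }

    private void emit() {
        ready.add(sum);
        sum = 0;
        count = 0;
    }
}

// Small driver that collects everything the wrapped iterator produces.
final class FlushDemo {
    static List<Integer> run(List<Integer> input, int batch) {
        List<Integer> out = new ArrayList<>();
        new FlushingIterator<>(input.iterator(), new BatchSum(batch))
                .forEachRemaining(out::add);
        return out;
    }
}
```

With a batch size of 2, calling FlushDemo.run on the inputs 1 through 5 yields the two full batches followed by the flushed partial batch, without the input ever being counted in advance.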



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
