[ 
https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kexianda updated PIG-4655:
--------------------------
    Attachment: PIG-4655.patch

Hi [~mohitsabharwal], 

PIG-4655.patch is attached. Would you please help review the code? Thanks.

This patch is based on PIG-4634 and PIG-4645.
As in MR mode, we add a custom counter for each input file, built on a 
Spark accumulator.
1. Create a counter in LoadConverter.
2. Increment the counter in the RDD.map() transform.
3. Collect the input info into SparkPigStats.
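
For illustration only, the three steps above can be sketched in plain Java
without a Spark dependency. This is not the code in the patch: the class and
method names (InputCounterSketch, createCounter, load, collectInputStats) are
hypothetical, java.util.concurrent.atomic.LongAdder stands in for the Spark
accumulator, and Stream.map() stands in for RDD.map().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names are taken from PIG-4655.patch.
class InputCounterSketch {

    // One counter per input file. LongAdder is an assumed stand-in
    // for a Spark accumulator.
    private final Map<String, LongAdder> counters = new LinkedHashMap<>();

    // Step 1: create (or reuse) the counter when the load is set up.
    LongAdder createCounter(String inputFile) {
        return counters.computeIfAbsent(inputFile, f -> new LongAdder());
    }

    // Step 2: increment the counter inside the map() that converts records.
    List<String> load(String inputFile, List<String> rawRecords) {
        LongAdder counter = createCounter(inputFile);
        return rawRecords.stream()
                .map(r -> {
                    counter.increment();
                    return r.trim();
                })
                .collect(Collectors.toList());
    }

    // Step 3: collect the per-file record counts for the stats report.
    Map<String, Long> collectInputStats() {
        Map<String, Long> stats = new LinkedHashMap<>();
        counters.forEach((file, c) -> stats.put(file, c.sum()));
        return stats;
    }

    public static void main(String[] args) {
        InputCounterSketch sketch = new InputCounterSketch();
        sketch.load("a.txt", Arrays.asList("1", "2", "3"));
        sketch.load("empty.txt", Collections.<String>emptyList());
        System.out.println(sketch.collectInputStats());
        // prints {a.txt=3, empty.txt=0}
    }
}
```

Because the counter is created before any record is mapped, an empty input
still gets an entry with a count of 0 in this sketch.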

How to test:
Run TestPigRunner.testEmptyFileCounter().

Regards,
Xianda




> Support InputStats in spark mode
> --------------------------------
>
>                 Key: PIG-4655
>                 URL: https://issues.apache.org/jira/browse/PIG-4655
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: kexianda
>            Assignee: kexianda
>             Fix For: spark-branch
>
>         Attachments: PIG-4655.patch
>
>
> Currently, InputStats is not implemented in Spark mode,
> so the JUnit test TestPigRunner.testEmptyFileCounter() fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
