[
https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
kexianda updated PIG-4655:
--------------------------
Attachment: PIG-4655.patch
Hi [~mohitsabharwal],
PIG-4655.patch is attached. Would you please help review the code? Thanks.
The patch is based on PIG-4634 and PIG-4645.
As in MR mode, we add a custom counter for each input file, built on a
Spark accumulator:
1. Create a counter in LoadConverter.
2. Increment the counter in the RDD.map() transformation.
3. Collect the input info into SparkPigStats.
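A rough sketch of the three steps above, in plain Java. This is not the code from the patch and it does not use the Spark API: RecordCounter is a hypothetical stand-in for a Spark accumulator, load() stands in for the map() transformation in LoadConverter, and collectStats() stands in for the collection into SparkPigStats. The file names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

public class InputCounterSketch {

    /** Hypothetical stand-in for a Spark accumulator: one counter per input file. */
    static final class RecordCounter {
        private final String inputName;
        private final LongAdder count = new LongAdder();

        RecordCounter(String inputName) { this.inputName = inputName; }

        void increment() { count.increment(); }
        long value() { return count.sum(); }
        String inputName() { return inputName; }
    }

    /** Steps 1 and 2: the caller creates the counter; map() increments it per record. */
    static List<String> load(List<String> records, RecordCounter counter) {
        return records.stream()
                .map(record -> { counter.increment(); return record; })
                .collect(Collectors.toList());
    }

    /** Step 3: collect the per-input record counts into one stats map. */
    static Map<String, Long> collectStats(List<RecordCounter> counters) {
        Map<String, Long> stats = new LinkedHashMap<>();
        for (RecordCounter c : counters) {
            stats.put(c.inputName(), c.value());
        }
        return stats;
    }

    public static void main(String[] args) {
        RecordCounter data = new RecordCounter("input/data.txt");   // assumed name
        RecordCounter empty = new RecordCounter("input/empty.txt"); // assumed name

        load(Arrays.asList("a", "b", "c"), data);
        load(Collections.<String>emptyList(), empty);

        // An empty input still gets an entry, with a count of 0.
        System.out.println(collectStats(Arrays.asList(data, empty)));
    }
}
```

The point of creating the counter up front, rather than on the first record, is that an empty input file still reports a count of 0 instead of being absent from the stats.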
How to test:
Run TestPigRunner.testEmptyFileCounter().
Regards,
Xianda
> Support InputStats in spark mode
> --------------------------------
>
> Key: PIG-4655
> URL: https://issues.apache.org/jira/browse/PIG-4655
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: kexianda
> Assignee: kexianda
> Fix For: spark-branch
>
> Attachments: PIG-4655.patch
>
>
> Currently, InputStats is not implemented in spark mode.
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.