[jira] [Updated] (PIG-4655) Support InputStats in spark mode
[ https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xianda Ke updated PIG-4655:
---
    Attachment: PIG-4655-4.patch

Hi [~xuefuz],
Rebased on PIG-4634, latest PIG-4655-4.patch is attached.

> Support InputStats in spark mode
>
>                 Key: PIG-4655
>                 URL: https://issues.apache.org/jira/browse/PIG-4655
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Xianda Ke
>            Assignee: Xianda Ke
>             Fix For: spark-branch
>
>         Attachments: PIG-4655-2.patch, PIG-4655-3.patch, PIG-4655-4.patch, PIG-4655.patch
>
> Currently, InputStats is not implemented in spark mode.
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PIG-4655) Support InputStats in spark mode
[ https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xianda Ke updated PIG-4655:
---
    Attachment: PIG-4655-2.patch

RB request: https://reviews.apache.org/r/37636/

> Support InputStats in spark mode
>
>                 Key: PIG-4655
>                 URL: https://issues.apache.org/jira/browse/PIG-4655
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Xianda Ke
>            Assignee: Xianda Ke
>             Fix For: spark-branch
>
>         Attachments: PIG-4655-2.patch, PIG-4655.patch
>
> Currently, InputStats is not implemented in spark mode.
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.
[jira] [Updated] (PIG-4655) Support InputStats in spark mode
[ https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xianda Ke updated PIG-4655:
---
    Attachment: PIG-4655-3.patch

PIG-4655-3.patch is attached. It adds some informative comments.

> Support InputStats in spark mode
>
>                 Key: PIG-4655
>                 URL: https://issues.apache.org/jira/browse/PIG-4655
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Xianda Ke
>            Assignee: Xianda Ke
>             Fix For: spark-branch
>
>         Attachments: PIG-4655-2.patch, PIG-4655-3.patch, PIG-4655.patch
>
> Currently, InputStats is not implemented in spark mode.
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.
[jira] [Updated] (PIG-4655) Support InputStats in spark mode
[ https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kexianda updated PIG-4655:
---
    Attachment: PIG-4655.patch

Hi [~mohitsabharwal],

PIG-4655.patch is attached. Would you please help review the code? Thanks.

The patch is based on PIG-4634 and PIG-4645.

As in MR mode, we add a customized counter for each input file, based on the Spark accumulator:
1. Create a counter in LoadConverter.
2. Increment the counter in the RDD.map() transform.
3. Collect the input info into SparkPigStats.

How to test: run TestPigRunner.testEmptyFileCounter()

Regards,
Xianda

> Support InputStats in spark mode
>
>                 Key: PIG-4655
>                 URL: https://issues.apache.org/jira/browse/PIG-4655
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: kexianda
>            Assignee: kexianda
>             Fix For: spark-branch
>
>         Attachments: PIG-4655.patch
>
> Currently, InputStats is not implemented in spark mode.
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.
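
For readers following the thread, the three steps in the first patch description (create a counter per input, increment it in the map transform, collect the counts into the stats object) can be sketched roughly as below. This is not code from the patch. It is a minimal plain-Java illustration in which an AtomicLong stands in for the Spark accumulator and a java.util.stream map() stands in for RDD.map(). The names InputCounterSketch, InputCounters, create, load and snapshot are all hypothetical and do not correspond to Pig or Spark classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;

public class InputCounterSketch {

    // Hypothetical stand-in for the per-input accumulators: one counter per input name.
    public static final class InputCounters {
        private final Map<String, AtomicLong> counters = new LinkedHashMap<>();

        // Step 1: register the counter when the load is set up, before any
        // record is read, so an empty input still reports a count of 0.
        public AtomicLong create(String inputName) {
            return counters.computeIfAbsent(inputName, k -> new AtomicLong());
        }

        // Step 3: copy the current counts out for the stats report.
        public Map<String, Long> snapshot() {
            Map<String, Long> out = new LinkedHashMap<>();
            counters.forEach((name, counter) -> out.put(name, counter.get()));
            return out;
        }
    }

    // Step 2: bump the counter once per record inside the map transform.
    // A sequential stream is used here; a real Spark job would rely on the
    // accumulator to merge per-task counts.
    public static List<String> load(String inputName, List<String> records, InputCounters stats) {
        AtomicLong counter = stats.create(inputName);
        return records.stream()
                .map(record -> {
                    counter.incrementAndGet();
                    return record;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        InputCounters stats = new InputCounters();
        load("input/a.txt", Arrays.asList("r1", "r2", "r3"), stats);
        load("input/empty.txt", Collections.<String>emptyList(), stats);
        // prints {input/a.txt=3, input/empty.txt=0}
        System.out.println(stats.snapshot());
    }
}
```

Registering the counter at load setup rather than on the first record is what lets an empty input show up with a count of 0, which appears to be the case TestPigRunner.testEmptyFileCounter() is meant to cover.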