[jira] [Updated] (PIG-4655) Support InputStats in spark mode

2015-10-30 Thread Xianda Ke (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianda Ke updated PIG-4655:
---
Attachment: PIG-4655-4.patch

Hi [~xuefuz],
Rebased on PIG-4634; the latest patch, PIG-4655-4.patch, is attached.

> Support InputStats in spark mode
> 
>
> Key: PIG-4655
> URL: https://issues.apache.org/jira/browse/PIG-4655
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: Xianda Ke
> Assignee: Xianda Ke
> Fix For: spark-branch
>
> Attachments: PIG-4655-2.patch, PIG-4655-3.patch, PIG-4655-4.patch, 
> PIG-4655.patch
>
>
> Currently, InputStats is not implemented in spark mode. 
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4655) Support InputStats in spark mode

2015-08-31 Thread Xianda Ke (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianda Ke updated PIG-4655:
---
Attachment: PIG-4655-2.patch

RB request: https://reviews.apache.org/r/37636/

> Support InputStats in spark mode
> 
>
> Key: PIG-4655
> URL: https://issues.apache.org/jira/browse/PIG-4655
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: Xianda Ke
> Assignee: Xianda Ke
> Fix For: spark-branch
>
> Attachments: PIG-4655-2.patch, PIG-4655.patch
>
>
> Currently, InputStats is not implemented in spark mode. 
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.





[jira] [Updated] (PIG-4655) Support InputStats in spark mode

2015-08-31 Thread Xianda Ke (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianda Ke updated PIG-4655:
---
Attachment: PIG-4655-3.patch

PIG-4655-3.patch is attached. It adds some informative comments.

> Support InputStats in spark mode
> 
>
> Key: PIG-4655
> URL: https://issues.apache.org/jira/browse/PIG-4655
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Xianda Ke
>Assignee: Xianda Ke
> Fix For: spark-branch
>
> Attachments: PIG-4655-2.patch, PIG-4655-3.patch, PIG-4655.patch
>
>
> Currently, InputStats is not implemented in spark mode. 
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.





[jira] [Updated] (PIG-4655) Support InputStats in spark mode

2015-08-18 Thread kexianda (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kexianda updated PIG-4655:
--
Attachment: PIG-4655.patch

Hi [~mohitsabharwal], 

PIG-4655.patch is attached. Would you please help review the code? Thanks.

The patch is based on PIG-4634 and PIG-4645.
As in MR mode, we add a custom counter for each input file, based on a
Spark accumulator:
1. Create a counter in LoadConverter.
2. Increment the counter in the RDD.map() transform.
3. Collect the input info into SparkPigStats.
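
The three steps above can be sketched roughly as follows. This is not the code in the patch: it is a Spark-free illustration in which java.util.concurrent.atomic.LongAdder stands in for a Spark accumulator and Stream.map() stands in for RDD.map(). The class and method names (InputCounterSketch, createCounter, load, collectInputStats) are hypothetical. Creating the counter up front means an input with no records still reports a count of 0, which appears to be the kind of case the empty-file test is about.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

// Spark-free sketch. LongAdder is a stand-in for a Spark accumulator and
// Stream.map() is a stand-in for RDD.map(). All names are hypothetical.
class InputCounterSketch {
    // One counter per input file, keyed by file name.
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Step 1: create the counter when the load is set up, so that an
    // input with no records still reports a count of 0.
    LongAdder createCounter(String inputFile) {
        return counters.computeIfAbsent(inputFile, f -> new LongAdder());
    }

    // Step 2: increment the counter inside the map transform and pass
    // each record through unchanged.
    List<String> load(String inputFile, List<String> records) {
        LongAdder counter = createCounter(inputFile);
        return records.stream()
                .map(r -> { counter.increment(); return r; })
                .collect(Collectors.toList());
    }

    // Step 3: collect the per-file record counts after the job has run.
    Map<String, Long> collectInputStats() {
        return counters.entrySet().stream()
                .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue().sum()));
    }

    public static void main(String[] args) {
        InputCounterSketch stats = new InputCounterSketch();
        stats.load("a.txt", Arrays.asList("r1", "r2", "r3"));
        stats.load("empty.txt", Collections.<String>emptyList());
        System.out.println(stats.collectInputStats());
    }
}
```

In real Spark code the counter would have to be an accumulator registered on the driver, since a plain in-memory counter like the one above is not shared between executors.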

How to test:
Run TestPigRunner.testEmptyFileCounter().

Regards,
Xianda




> Support InputStats in spark mode
> 
>
> Key: PIG-4655
> URL: https://issues.apache.org/jira/browse/PIG-4655
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: kexianda
> Assignee: kexianda
> Fix For: spark-branch
>
> Attachments: PIG-4655.patch
>
>
> Currently, InputStats is not implemented in spark mode. 
> The JUnit case TestPigRunner.testEmptyFileCounter() will fail.


