[ 
https://issues.apache.org/jira/browse/PIG-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168574#comment-15168574
 ] 

prateek vaishnav commented on PIG-4621:
---------------------------------------

After investigating the issue, I have found the problem. It is the same 
one that [~kellyzly] pointed out.

If you look at the class ExampleGenerator, the execution engine and the 
local job simulator are hard-coded to their MapReduce types (a generic 
localJobSimulator does not exist currently):
private MRExecutionEngine execEngine;
private LocalMapReduceSimulator localMRRunner;

If you look further below, in method getData() -
localMRRunner.launchPig(plan, baseData, lineage, attacher, this, pigContext);

This launchPig runs the MR job and illustrates the data at different points.
We need similar classes for spark and tez as well.

To solve the issue for spark, I propose creating the following classes:

- LocalJobSimulator
- LocalMRSimulator extends LocalJobSimulator
- LocalSparkSimulator extends LocalJobSimulator

A method getLocalJobSimulator() in the HExecutionEngine class will return 
the respective localJobSimulator, which will be used in ExampleGenerator. Since 
LocalMapReduceSimulator is only used in ExampleGenerator, the risk is fairly low.
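
To make the proposal concrete, here is a rough, self-contained sketch of the 
shape I have in mind. It is not Pig code: the *Sketch classes are hypothetical 
stand-ins for HExecutionEngine, its MR/Spark subclasses and ExampleGenerator, 
and launchPig is reduced to a single String argument instead of the real 
(plan, baseData, lineage, attacher, this, pigContext) parameters. Only the 
names LocalJobSimulator, LocalMRSimulator, LocalSparkSimulator and 
getLocalJobSimulator() come from the proposal above.

```java
// Demo entry point; everything in this file is an illustrative assumption.
class SimulatorSketchDemo {
    public static void main(String[] args) {
        // The generator no longer names a concrete simulator; the engine decides.
        ExampleGeneratorSketch mr = new ExampleGeneratorSketch(new MREngineSketch());
        ExampleGeneratorSketch spark = new ExampleGeneratorSketch(new SparkEngineSketch());
        System.out.println(mr.getData("demo-plan"));
        System.out.println(spark.getData("demo-plan"));
    }
}

// Proposed common base type for all local simulators.
abstract class LocalJobSimulator {
    // Simplified stand-in for launchPig(...); the real method takes more arguments.
    abstract String launchPig(String plan);
}

// Would wrap the logic that lives in LocalMapReduceSimulator today.
class LocalMRSimulator extends LocalJobSimulator {
    @Override
    String launchPig(String plan) {
        return "mr:" + plan; // placeholder for running the MR job locally
    }
}

// New simulator that would run the Spark plan locally.
class LocalSparkSimulator extends LocalJobSimulator {
    @Override
    String launchPig(String plan) {
        return "spark:" + plan; // placeholder for running the Spark job locally
    }
}

// Hypothetical stand-in for HExecutionEngine with the proposed factory method.
abstract class ExecutionEngineSketch {
    abstract LocalJobSimulator getLocalJobSimulator();
}

class MREngineSketch extends ExecutionEngineSketch {
    @Override
    LocalJobSimulator getLocalJobSimulator() {
        return new LocalMRSimulator();
    }
}

class SparkEngineSketch extends ExecutionEngineSketch {
    @Override
    LocalJobSimulator getLocalJobSimulator() {
        return new LocalSparkSimulator();
    }
}

// Hypothetical stand-in for ExampleGenerator: fields use the generic types.
class ExampleGeneratorSketch {
    private final ExecutionEngineSketch execEngine;
    private final LocalJobSimulator localJobSimulator;

    ExampleGeneratorSketch(ExecutionEngineSketch execEngine) {
        this.execEngine = execEngine;
        this.localJobSimulator = execEngine.getLocalJobSimulator();
    }

    // Mirrors getData(): delegate to whichever simulator the engine supplied.
    String getData(String plan) {
        return localJobSimulator.launchPig(plan);
    }
}
```

With this layout, adding tez later would only need another LocalJobSimulator 
subclass and an override of getLocalJobSimulator() in its engine; 
ExampleGenerator itself would not change again.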

Any thoughts or concerns are welcome. 

> Enable Illustrate in spark
> --------------------------
>
>                 Key: PIG-4621
>                 URL: https://issues.apache.org/jira/browse/PIG-4621
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: prateek vaishnav
>             Fix For: spark-branch
>
>
> Currently we don't support illustrate in spark mode.
> For how illustrate works, see:
> http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#ILLUSTRATE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
