[ 
https://issues.apache.org/jira/browse/PIG-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048202#comment-15048202
 ] 

liyunzhang_intel commented on PIG-4621:
---------------------------------------

[~zulfiali], [~mohitsabharwal] and [~xuefuz]:
The patch modifies JobControlCompiler.java, and I am not sure whether the 
community will accept that change.
Here is an example that shows why a script with illustrate fails in spark mode:
{code}
visits = LOAD './visits' AS (user:chararray, url:chararray, timestamp:chararray);
recent_visits = FILTER visits BY timestamp >= '20081201';
dump recent_visits;
illustrate recent_visits;
{code}

The exception is a ClassCastException (SparkScriptState cannot be cast to 
MRScriptState). The stack trace is:
{code}
 
org.apache.pig.tools.pigstats.mapreduce.MRScriptState.get(MRScriptState.java:67)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:512)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:327)
org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:110)
org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:259)
org.apache.pig.pen.ExampleGenerator.readBaseData(ExampleGenerator.java:223)
org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:155)
org.apache.pig.PigServer.getExamples(PigServer.java:1305)
org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:812)
org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:818)
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:385)
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
org.apache.pig.Main.run(Main.java:624)
org.apache.pig.Main.main(Main.java:170)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:606)
org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code}

The exception is thrown because illustrate calls 
LocalMapReduceSimulator.launchPig, which reaches MRScriptState#get through 
JobControlCompiler.getJob. In spark mode ScriptState.get() returns a 
SparkScriptState, and the cast to MRScriptState fails:
{code}
public static MRScriptState get() {
    return (MRScriptState) ScriptState.get();
}
{code}
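
To make the failure concrete, here is a minimal, self-contained sketch that reproduces the same kind of cast error. The classes below are simplified stand-ins that I wrote for illustration; they only borrow the names from Pig and are not the real implementations. The start() and castFails() helpers are hypothetical.

```java
// Simplified stand-ins for Pig's ScriptState hierarchy (NOT the real classes).
public class ScriptStateCastDemo {

    static class ScriptState {
        // Assumption: the current state is kept per thread.
        private static final ThreadLocal<ScriptState> CURRENT = new ThreadLocal<>();

        // Hypothetical helper that registers the state for the current thread.
        static void start(ScriptState state) {
            CURRENT.set(state);
        }

        static ScriptState get() {
            return CURRENT.get();
        }
    }

    static class MRScriptState extends ScriptState {
        // Same unchecked downcast as in the snippet quoted above.
        static MRScriptState get() {
            return (MRScriptState) ScriptState.get();
        }
    }

    static class SparkScriptState extends ScriptState {
    }

    // Hypothetical helper: true if the downcast throws for the current state.
    static boolean castFails() {
        try {
            MRScriptState.get();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Spark mode: the registered state is not an MRScriptState.
        ScriptState.start(new SparkScriptState());
        System.out.println("cast fails in spark mode: " + castFails()); // true

        // MR mode: the downcast succeeds.
        ScriptState.start(new MRScriptState());
        System.out.println("cast fails in mr mode: " + castFails()); // false
    }
}
```

The cast only succeeds when the state registered for the thread really is an MRScriptState, which is why any MR-only code path that runs in spark mode hits this exception.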

In the mr implementation, illustrate calls LocalMapReduceSimulator#launchPig 
to run the script on a small set of data and show the execution steps. If we 
really need to implement illustrate in spark mode, I think we need to 
implement a class like LocalMapReduceSimulator for spark. After investigating 
the code, I found that illustrate is currently implemented *only* in mr mode.
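
As a rough sketch of what an engine-specific simulator could look like, the code below selects a simulator by execution type and reports a clear "not supported" message when none exists, instead of falling into the MR-only path. Everything here is hypothetical (ExecType, IllustrateSimulator, MapReduceSimulator, simulatorFor, illustrate); none of these names come from Pig, and this is not the patch under review.

```java
import java.util.Optional;

// Hypothetical design sketch; none of these types exist in Pig.
public class IllustrateSimulatorDemo {

    enum ExecType { MAPREDUCE, SPARK }

    // Hypothetical abstraction over "run the plan on a small data sample".
    interface IllustrateSimulator {
        String name();
    }

    // Stand-in for the role LocalMapReduceSimulator plays in mr mode.
    static class MapReduceSimulator implements IllustrateSimulator {
        @Override
        public String name() {
            return "mr simulator";
        }
    }

    // Only mr has a simulator today; a spark one would be added as a new case.
    static Optional<IllustrateSimulator> simulatorFor(ExecType type) {
        switch (type) {
            case MAPREDUCE:
                return Optional.of(new MapReduceSimulator());
            default:
                return Optional.empty();
        }
    }

    // Returns a description of what illustrate would do for this engine.
    static String illustrate(ExecType type) {
        return simulatorFor(type)
                .map(s -> "illustrate via " + s.name())
                .orElse("illustrate is not supported in " + type + " mode");
    }

    public static void main(String[] args) {
        System.out.println(illustrate(ExecType.MAPREDUCE));
        System.out.println(illustrate(ExecType.SPARK));
    }
}
```

With a structure like this, supporting spark would mean adding one more implementation of the interface rather than touching the MR classes.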

> Enable Illustrate in spark
> --------------------------
>
>                 Key: PIG-4621
>                 URL: https://issues.apache.org/jira/browse/PIG-4621
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: Syed Zulfiqar Ali
>             Fix For: spark-branch
>
>
> Currently we don't support illustrate in spark mode.
> For how illustrate works, see:
> http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#ILLUSTRATE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
