[
https://issues.apache.org/jira/browse/PIG-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048202#comment-15048202
]
liyunzhang_intel commented on PIG-4621:
---------------------------------------
[~zulfiali], [~mohitsabharwal] and [~xuefuz]:
The patch modifies JobControlCompiler.java. I am not sure whether the
community will accept this change.
Let's use an example to show why a script with illustrate fails in spark mode:
{code}
visits = LOAD './visits' AS (user:chararray, url:chararray,
timestamp:chararray);
recent_visits = FILTER visits BY timestamp >= '20081201';
dump recent_visits;
illustrate recent_visits;
{code}
The script fails with a ClassCastException (SparkScriptState cannot be cast to
MRScriptState). The stack trace is:
{code}
org.apache.pig.tools.pigstats.mapreduce.MRScriptState.get(MRScriptState.java:67)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:512)
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:327)
org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:110)
org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:259)
org.apache.pig.pen.ExampleGenerator.readBaseData(ExampleGenerator.java:223)
org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:155)
org.apache.pig.PigServer.getExamples(PigServer.java:1305)
org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:812)
org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:818)
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:385)
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
org.apache.pig.Main.run(Main.java:624)
org.apache.pig.Main.main(Main.java:170)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:606)
org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code}
The exception is thrown because illustrate calls
LocalMapReduceSimulator.launchPig, which in turn reaches MRScriptState#get
(via JobControlCompiler#getJob). In spark mode ScriptState.get() returns a
SparkScriptState, so the cast to MRScriptState in the following method fails:
{code}
public static MRScriptState get() {
    return (MRScriptState) ScriptState.get();
}
{code}
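To make the failure easier to see in isolation, here is a minimal, self-contained sketch of the same downcast. The classes below are simplified stand-ins that only reuse the names from the stack trace; they are not Pig's real implementations, and holding the state in a ThreadLocal is an assumption made for the sketch.

```java
// Simplified stand-ins for the Pig classes named in the stack trace.
// They only model the class hierarchy and the downcast, nothing else.
public class ScriptStateCastDemo {

    static class ScriptState {
        // Assumption for this sketch: the current state is kept per thread.
        private static final ThreadLocal<ScriptState> CURRENT = new ThreadLocal<>();

        static void start(ScriptState state) {
            CURRENT.set(state);
        }

        static ScriptState get() {
            return CURRENT.get();
        }
    }

    static class MRScriptState extends ScriptState {
        // Same shape as the method quoted above: an unchecked downcast.
        static MRScriptState get() {
            return (MRScriptState) ScriptState.get();
        }
    }

    static class SparkScriptState extends ScriptState {
    }

    /** Returns true if the MR-specific getter fails while a Spark state is active. */
    public static boolean castFails() {
        ScriptState.start(new SparkScriptState()); // what spark mode registers
        try {
            MRScriptState.get();                   // what the MR-only code path calls
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("cast fails: " + castFails());
    }
}
```

SparkScriptState and MRScriptState are siblings under ScriptState, so the cast can never succeed while the spark engine is active; any MR-only code path that calls MRScriptState.get() will hit the same error.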
In MR mode, illustrate calls LocalMapReduceSimulator#launchPig to run the
script on a small sample of the data and show the execution steps. If we
really need to support illustrate in spark mode, I think we need to implement
a class similar to LocalMapReduceSimulator. After investigating the code, I
found that illustrate is currently implemented *only* in MR mode.
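One possible shape for this, purely as a sketch: illustrate looks up a simulator for the active execution mode and fails with a clear message when none exists, instead of falling into the MR-only path. The interface, registry, and mode strings below are hypothetical and do not exist in Pig; a spark counterpart of LocalMapReduceSimulator would be registered the same way as the MR entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Pig.
public class IllustrateDispatchSketch {

    /** Hypothetical per-engine simulator; the MR entry plays the role of LocalMapReduceSimulator. */
    interface IllustrateSimulator {
        String launch(String plan);
    }

    // Registry keyed by an assumed execution mode name.
    private static final Map<String, IllustrateSimulator> SIMULATORS = new HashMap<>();

    static {
        // Only the MR simulator is registered, mirroring the current situation.
        SIMULATORS.put("mapreduce", plan -> "simulated with MR: " + plan);
    }

    /** Runs illustrate for the given mode, or fails fast if no simulator is registered. */
    public static String illustrate(String mode, String plan) {
        IllustrateSimulator simulator = SIMULATORS.get(mode);
        if (simulator == null) {
            throw new UnsupportedOperationException(
                    "illustrate is not implemented in " + mode + " mode");
        }
        return simulator.launch(plan);
    }

    public static void main(String[] args) {
        System.out.println(illustrate("mapreduce", "recent_visits"));
        try {
            illustrate("spark", "recent_visits");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```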
> Enable Illustrate in spark
> --------------------------
>
> Key: PIG-4621
> URL: https://issues.apache.org/jira/browse/PIG-4621
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: Syed Zulfiqar Ali
> Fix For: spark-branch
>
>
> Currently we don't support illustrate in spark mode.
> How illustrate works
> see: http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#ILLUSTRATE