Hi,
I am able to run *Pig on Spark* using the spork branch:
https://github.com/twitter/pig/tree/spork
In Spark mode, LOAD without PigStorage works fine.
But when I use PigStorage to load data, the following error occurs:
[Result resolver thread-2] WARN org.apache.spark.scheduler.cluster.ClusterTaskSetManager - Lost TID 2 (task 0.0:0)
[Result resolver thread-2] INFO org.apache.spark.scheduler.cluster.ClusterTaskSetManager - Loss was due to java.lang.RuntimeException: org.apache.pig.backend.executionengine.ExecException: *ERROR 0: Error while executing ForEach at [raw[-1,-1]] [duplicate 2]*
*Error Log file:*
ERROR 2043: Unexpected error during execution.
org.apache.pig.backend.executionengine.ExecException: ERROR 2043: Unexpected error during execution.
at org.apache.pig.PigServer.launchPlan(PigServer.java:1339)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1310)
at org.apache.pig.PigServer.execute(PigServer.java:1300)
at org.apache.pig.PigServer.executeBatch(PigServer.java:381)
at org.apache.pig.PigServer.executeBatch(PigServer.java:358)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:137)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:609)
at org.apache.pig.Main.main(Main.java:158)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
*Caused by*: org.apache.spark.SparkException: Job aborted: Task 0.0:0 failed more than 4 times
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:827)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:825)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:825)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:440)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:502)
at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:157)
=================================================================
*Is there anything I am missing?* Does PigStorage work in Spark mode or not?
Is there a patch available for this?
Thanks in advance.
--
Lalit Yadav