Re: queries on Spork (Pig on Spark)
Log file contents:

Pig Stack Trace
---
ERROR 2998: Unhandled internal error. Could not initialize class org.apache.spark.rdd.RDDOperationScope$

java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.rdd.RDDOperationScope$
    at org.apache.spark.SparkContext.withScope(SparkContext.scala:681)
    at org.apache.spark.SparkContext.newAPIHadoopRDD(SparkContext.scala:1094)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.LoadConverter.convert(LoadConverter.java:91)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.LoadConverter.convert(LoadConverter.java:61)
    at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:666)
    at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:633)
    at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:633)
    at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.sparkOperToRDD(SparkLauncher.java:585)
    at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.sparkPlanToRDD(SparkLauncher.java:534)
    at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:209)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:301)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
    at org.apache.pig.PigServer.storeEx(PigServer.java:1034)
    at org.apache.pig.PigServer.store(PigServer.java:997)
    at org.apache.pig.PigServer.openIterator(PigServer.java:910)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:754)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:376)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
    at org.apache.pig.Main.run(Main.java:558)
    at org.apache.pig.Main.main(Main.java:170)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

I don't understand what is causing this error.

Thanks,
Regards,
Divya

On 25 November 2015 at 14:00, Jeff Zhang wrote:
> >>> Details at logfile: /home/pig/pig_1448425672112.log
>
> You need to check the log file for details
Re: queries on Spork (Pig on Spark)
On Wed, Nov 25, 2015 at 1:57 PM, Divya Gehlot wrote:
> Details at logfile: /home/pig/pig_1448425672112.log

You need to check the log file for details.

--
Best Regards

Jeff Zhang
queries on Spork (Pig on Spark)
Hi,

As a beginner, I have the following questions about Spork (Pig on Spark). I cloned the spark branch with:

git clone https://github.com/apache/pig -b spark

1. Which versions of Pig and Spark is Spork built against?
2. I followed the steps in https://issues.apache.org/jira/browse/PIG-4059 and tried to run a simple Pig script that loads a file and then dumps or stores it. I get the following error:

grunt> A = load '/tmp/words_tb.txt' using PigStorage('\t') as (empNo:chararray,empName:chararray,salary:chararray);
grunt> Store A into '/tmp/spork';

2015-11-25 05:35:52,502 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2015-11-25 05:35:52,875 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2015-11-25 05:35:52,883 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - Not MR mode. RollupHIIOptimizer is disabled
2015-11-25 05:35:52,894 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2015-11-25 05:35:52,966 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-11-25 05:35:52,983 [main] INFO org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher - add Files Spark Job
2015-11-25 05:35:53,137 [main] INFO org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher - Added jar pig-0.15.0-SNAPSHOT-core-h2.jar
2015-11-25 05:35:53,138 [main] INFO org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher - Added jar pig-0.15.0-SNAPSHOT-core-h2.jar
2015-11-25 05:35:53,138 [main] INFO org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher - Converting operator POLoad (Name: A: Load(/tmp/words_tb.txt:PigStorage(' ')) - scope-29 Operator Key: scope-29)
2015-11-25 05:35:53,205 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Could not initialize class org.apache.spark.rdd.RDDOperationScope$
Details at logfile: /home/pig/pig_1448425672112.log

Can you please help me find out what is wrong?

I appreciate your help.

Thanks,
Regards,
Divya