> On May 22, 2016, 9:57 p.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/backend/hadoop/hbase/HBaseStorage.java, line 313
> > <https://reviews.apache.org/r/45667/diff/1/?file=1323847#file1323847line313>
> >
> >     Just change the code to use UDFContext.getUDFContext().getJobConf(),
> >     which should not be null, instead of getClientSystemProps(). Not sure
> >     why it is using getClientSystemProps() in the first place.
>
> kelly zhang wrote:
>     If we change this to UDFContext.getUDFContext().getJobConf(), the
>     problem still exists.
>
>     We need to check whether UDFContext.getUDFContext().getJobConf() is
>     null because a Spark executor first initializes all the objects and
>     only then calls UDFContext.deserialize(). The HBaseStorage constructor
>     therefore runs before UDFContext.deserialize(), and without the null
>     check an NPE is thrown here.
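To make the ordering issue in the comment above concrete, here is a minimal sketch of it. FakeUdfContext and FakeStorage are hypothetical stand-ins written for this mail, not Pig classes, and the property key is only an example. The sketch assumes nothing about Spark beyond what the comment states: the constructor may run before deserialize().

```java
import java.util.Properties;

public class ConstructorOrderSketch {

    // Hypothetical stand-in for UDFContext: the job conf is null until deserialize() runs.
    public static final class FakeUdfContext {
        private Properties jobConf; // stays null before deserialize()

        public Properties getJobConf() {
            return jobConf;
        }

        public void deserialize(Properties shipped) {
            jobConf = shipped;
        }
    }

    // Hypothetical stand-in for a storage class whose constructor reads the context.
    public static final class FakeStorage {
        public final String quorum;

        public FakeStorage(FakeUdfContext ctx) {
            Properties conf = ctx.getJobConf();
            // Guard: the constructor can run before deserialize(), so conf may be null.
            // Without this check, conf.getProperty(...) would throw a NullPointerException.
            quorum = (conf == null) ? null : conf.getProperty("hbase.zookeeper.quorum");
        }
    }

    public static void main(String[] args) {
        FakeUdfContext ctx = new FakeUdfContext();

        // Constructed before deserialize(): the guard keeps this from failing.
        FakeStorage early = new FakeStorage(ctx);

        Properties shipped = new Properties();
        shipped.setProperty("hbase.zookeeper.quorum", "zk1");
        ctx.deserialize(shipped);

        // Constructed after deserialize(): the value is available.
        FakeStorage late = new FakeStorage(ctx);

        System.out.println(early.quorum + " " + late.quorum); // prints "null zk1"
    }
}
```

The sketch only shows why a null check was considered at this spot; it says nothing about how SparkEngineConf ships the properties.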

Updated PigOnSpark_3.patch. After PIG-4920, we store
UDFContext#getClientSystemProps and UDFContext#getUdfConfs in
SparkEngineConf, so HBaseStorage no longer needs to be modified.


> On May 22, 2016, 9:57 p.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/impl/PigContext.java, line 924
> > <https://reviews.apache.org/r/45667/diff/1/?file=1323849#file1323849line924>
> >
> >     This can be reverted. PigContext need not be serialized to the
> >     backend. See PIG-4866
>
> kelly zhang wrote:
>     PIG-4866 is about not serializing PigContext in the configuration,
>     whereas here we override PigContext#writeObject and
>     PigContext#readObject to serialize and deserialize only one attribute
>     (packageImportList) in Spark mode.

Updated PigOnSpark_3.patch. After PIG-4920, we store
UDFContext#getClientSystemProps and UDFContext#getUdfConfs in
SparkEngineConf, so PigContext no longer needs to be modified.


> On May 22, 2016, 9:57 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/test/TestBuiltin.java, line 3255
> > <https://reviews.apache.org/r/45667/diff/1/?file=1323870#file1323870line3255>
> >
> >     This testcase is broken if you have 0-0 repeating twice. It is not
> >     UniqueID anymore.
>
> kelly zhang wrote:
>     0-0 repeats twice because we use the task ID in UniqueID#exec:
>
>         public String exec(Tuple input) throws IOException {
>             String taskIndex =
>                 PigMapReduce.sJobConfInternal.get().get(PigConstants.TASK_INDEX);
>             String sequenceId = taskIndex + "-" + Long.toString(sequence);
>             sequence++;
>             return sequenceId;
>         }
>
>     In MR, we initialize PigConstants.TASK_INDEX in
>     org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.Reduce#setup:
>
>         protected void setup(Context context) throws IOException,
>                 InterruptedException {
>             ...
>             context.getConfiguration().set(PigConstants.TASK_INDEX,
>                 Integer.toString(context.getTaskAttemptID().getTaskID().getId()));
>             ...
>         }
>
>     But Spark does not provide a function like
>     PigGenericMapReduce.Reduce#setup to initialize PigConstants.TASK_INDEX
>     when a job starts.
>     I suggest filing a new JIRA (Initialize PigConstants.TASK_INDEX when
>     Spark job starts) and skipping this unit test until that JIRA is
>     resolved.

Updated PigOnSpark_3.patch. I have created PIG-5051 and added a comment on
TestBuiltin#testUniqueID (the behavior in Spark mode will be the same as in
MR until PIG-5051 is fixed).


- kelly


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45667/#review134255
-----------------------------------------------------------


On July 11, 2016, 4:32 a.m., Pallavi Rao wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45667/
> -----------------------------------------------------------
>
> (Updated July 11, 2016, 4:32 a.m.)
>
>
> Review request for pig, Daniel Dai and Rohini Palaniswamy.
>
>
> Bugs: PIG-4059 and PIG-4854
>     https://issues.apache.org/jira/browse/PIG-4059
>     https://issues.apache.org/jira/browse/PIG-4854
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> The patch contains all the work done in the spark branch, so far.
>
>
> Diffs
> -----
>
>   bin/pig 81f1426
>   build.xml 99ba1f4
>   ivy.xml dd9878e
>   ivy/libraries.properties 3a819a5
>   shims/test/hadoop20/org/apache/pig/test/SparkMiniCluster.java PRE-CREATION
>   shims/test/hadoop23/org/apache/pig/test/SparkMiniCluster.java PRE-CREATION
>   shims/test/hadoop23/org/apache/pig/test/TezMiniCluster.java 792a1bd
>   shims/test/hadoop23/org/apache/pig/test/YarnMiniCluster.java PRE-CREATION
>   src/META-INF/services/org.apache.pig.ExecType 5c034c8
>   src/docs/src/documentation/content/xdocs/start.xml 36f9952
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1ff1abd
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java ecf780c
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhysicalPlan.java 2376d03
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCollectedGroup.java bcbfe2b
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFRJoin.java d80951a
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 21b75f1
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POGlobalRearrange.java 52cfb73
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeJoin.java 13f70c0
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java c3a82c3
>   src/org/apache/pig/backend/hadoop/executionengine/spark/JobGraphBuilder.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/JobMetricsListener.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/KryoSerializer.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/MapReducePartitionerWrapper.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkExecType.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkExecutionEngine.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLocalExecType.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkUtil.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/UDFJarsFinder.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/CollectedGroupConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/CounterConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/DistinctConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/FRJoinConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/FilterConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/ForEachConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/GlobalRearrangeConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/IndexedKey.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/IteratorTransform.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LimitConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LoadConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LocalRearrangeConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/MergeCogroupConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/MergeJoinConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/OutputConsumerIterator.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/PackageConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/PigSecondaryKeyComparatorSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/RDDConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/RankConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/ReduceByConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SkewedJoinConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SortConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SplitConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/StoreConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/StreamConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/UnionConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/operator/NativeSparkOperator.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/operator/POGlobalRearrangeSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/operator/POReduceBySpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/AccumulatorOptimizer.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/CombinerOptimizer.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/MultiQueryOptimizerSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/NoopFilterRemover.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/ParallelismSetter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/SecondaryKeyOptimizerSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/DotSparkPrinter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkCompiler.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkCompilerException.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkOpPlanVisitor.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkOperPlan.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkOperator.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkPOPackageAnnotator.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkPrinter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/running/PigInputFormatSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/util/AccumulatorOptimizerUtil.java c4b44ad
>   src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 889c01b
>   src/org/apache/pig/backend/hadoop/executionengine/util/SecondaryKeyOptimizerUtil.java 0b59c9c
>   src/org/apache/pig/backend/hadoop/hbase/HBaseStorage.java e0581d9
>   src/org/apache/pig/data/SelfSpillBag.java d17f0a8
>   src/org/apache/pig/impl/PigContext.java d43949f
>   src/org/apache/pig/impl/plan/OperatorPlan.java 8b2e2e7
>   src/org/apache/pig/tools/pigstats/PigStatsUtil.java 542cc2e
>   src/org/apache/pig/tools/pigstats/spark/SparkCounter.java PRE-CREATION
>   src/org/apache/pig/tools/pigstats/spark/SparkCounterGroup.java PRE-CREATION
>   src/org/apache/pig/tools/pigstats/spark/SparkCounters.java PRE-CREATION
>   src/org/apache/pig/tools/pigstats/spark/SparkJobStats.java PRE-CREATION
>   src/org/apache/pig/tools/pigstats/spark/SparkPigStats.java PRE-CREATION
>   src/org/apache/pig/tools/pigstats/spark/SparkPigStatusReporter.java PRE-CREATION
>   src/org/apache/pig/tools/pigstats/spark/SparkScriptState.java PRE-CREATION
>   src/org/apache/pig/tools/pigstats/spark/SparkStatsUtil.java PRE-CREATION
>   test/e2e/pig/build.xml f7c38ba
>   test/e2e/pig/conf/spark.conf PRE-CREATION
>   test/e2e/pig/drivers/TestDriverPig.pm bf9c302
>   test/e2e/pig/tests/streaming.conf 18f2fb2
>   test/excluded-tests-spark PRE-CREATION
>   test/org/apache/pig/newplan/logical/relational/TestLocationInPhysicalPlan.java 94b34b3
>   test/org/apache/pig/spark/TestIndexedKey.java PRE-CREATION
>   test/org/apache/pig/spark/TestSecondarySortSpark.java PRE-CREATION
>   test/org/apache/pig/test/MiniGenericCluster.java 9347269
>   test/org/apache/pig/test/TestAssert.java 6d4b5c6
>   test/org/apache/pig/test/TestBuiltin.java fbc3f1e
>   test/org/apache/pig/test/TestCase.java c9bb2fa
>   test/org/apache/pig/test/TestCollectedGroup.java a958d33
>   test/org/apache/pig/test/TestCombiner.java df44293
>   test/org/apache/pig/test/TestCubeOperator.java de96e6c
>   test/org/apache/pig/test/TestEvalPipeline.java 9efde13
>   test/org/apache/pig/test/TestEvalPipeline2.java c8f51d7
>   test/org/apache/pig/test/TestEvalPipelineLocal.java c12d595
>   test/org/apache/pig/test/TestFinish.java f18c103
>   test/org/apache/pig/test/TestForEachNestedPlanLocal.java b0aa3a8
>   test/org/apache/pig/test/TestGrunt.java 9eaf298
>   test/org/apache/pig/test/TestHBaseStorage.java 8d2ad85
>   test/org/apache/pig/test/TestLimitVariable.java 53b9dae
>   test/org/apache/pig/test/TestMapSideCogroup.java 2c78b4a
>   test/org/apache/pig/test/TestMergeJoin.java f1a9608
>   test/org/apache/pig/test/TestMergeJoinOuter.java 81aee55
>   test/org/apache/pig/test/TestMultiQuery.java c32eab7
>   test/org/apache/pig/test/TestMultiQueryLocal.java b9ac035
>   test/org/apache/pig/test/TestNativeMapReduce.java c4f6573
>   test/org/apache/pig/test/TestNullConstant.java 3ea4509
>   test/org/apache/pig/test/TestPigRunner.java fde8609
>   test/org/apache/pig/test/TestPigServerLocal.java fbabd03
>   test/org/apache/pig/test/TestProjectRange.java 2e3e7b8
>   test/org/apache/pig/test/TestPruneColumn.java 3936332
>   test/org/apache/pig/test/TestRank1.java 9e4ef62
>   test/org/apache/pig/test/TestRank2.java fc802a9
>   test/org/apache/pig/test/TestRank3.java 43af10d
>   test/org/apache/pig/test/TestSecondarySort.java 8991010
>   test/org/apache/pig/test/TestSkewedJoin.java dba2241
>   test/org/apache/pig/test/TestStoreBase.java eb3b253
>   test/org/apache/pig/test/Util.java 36d01e8
>   test/spark-tests PRE-CREATION
>
> Diff: https://reviews.apache.org/r/45667/diff/
>
>
> Testing
> -------
>
> New UTs were added where required and ensure old UTs pass ->
> https://builds.apache.org/job/Pig-spark/
>
>
> Thanks,
>
> Pallavi Rao
>
>
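
P.S. on the TestBuiltin#testUniqueID thread: the sketch below illustrates why "0-0" can repeat. It is not Pig code. IdGenerator is a hypothetical stand-in for UniqueID that builds ids as taskIndex + "-" + sequence, as in the exec() snippet quoted earlier, and it assumes each task has its own generator with its own sequence counter. When every task sees the same task index, the ids collide.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class UniqueIdSketch {

    // Hypothetical stand-in for UniqueID: one instance per task, each with its own counter.
    public static final class IdGenerator {
        private final String taskIndex;
        private long sequence = 0;

        public IdGenerator(String taskIndex) {
            this.taskIndex = taskIndex;
        }

        public String next() {
            String id = taskIndex + "-" + Long.toString(sequence);
            sequence++;
            return id;
        }
    }

    // Emits perTask ids from each task; taskIndexes[i] is the index that task i sees.
    public static List<String> run(String[] taskIndexes, int perTask) {
        List<String> ids = new ArrayList<>();
        for (String taskIndex : taskIndexes) {
            IdGenerator generator = new IdGenerator(taskIndex);
            for (int i = 0; i < perTask; i++) {
                ids.add(generator.next());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Each task sees its own index: all ids are distinct.
        List<String> distinct = run(new String[] {"0", "1"}, 2);
        System.out.println(distinct); // prints [0-0, 0-1, 1-0, 1-1]

        // Every task sees index 0 (task index not initialized per task): ids repeat.
        List<String> repeated = run(new String[] {"0", "0"}, 2);
        System.out.println(repeated); // prints [0-0, 0-1, 0-0, 0-1]
        System.out.println(new HashSet<>(repeated).size()); // prints 2
    }
}
```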