> On May 22, 2016, 9:57 p.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/backend/hadoop/hbase/HBaseStorage.java, line 313
> > <https://reviews.apache.org/r/45667/diff/1/?file=1323847#file1323847line313>
> >
> >     Just change the code to use UDFContext.getUDFContext().getJobConf() 
> > which should not be null instead of getClientSystemProps(). Not sure why it 
> > is using getClientSystemProps() in the first place.
> 
> kelly zhang wrote:
>     Even if we change this to UDFContext.getUDFContext().getJobConf(), the
> problem still exists.
>     
>     
>     The reason we check whether UDFContext.getUDFContext().getJobConf() is
> null is that the Spark executor first initializes all the objects and only
> then calls UDFContext.deserialize(). The HBaseStorage constructor therefore
> runs before UDFContext.deserialize(), so we need the null check here;
> otherwise an NPE is thrown.

Updated PigOnSpark_3.patch. After PIG-4920, we store
UDFContext#getClientSystemProps and UDFContext#getUdfConfs in SparkEngineConf,
so HBaseStorage is no longer modified.
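
For reference, the ordering issue discussed above (a constructor running
before UDFContext.deserialize) can be illustrated with a minimal,
self-contained sketch. FakeUdfContext and FakeStorage are hypothetical
stand-ins written for this mail, not Pig classes, and the "caching" key is
made up:

```java
import java.util.Properties;

public class NullGuardSketch {
    // Hypothetical stand-in for UDFContext: jobConf stays null until deserialize() runs.
    static final class FakeUdfContext {
        private static Properties jobConf;
        static Properties getJobConf() { return jobConf; }
        static void deserialize(Properties conf) { jobConf = conf; }
        static void reset() { jobConf = null; }
    }

    // Hypothetical stand-in for a storage function whose constructor reads the job conf.
    static final class FakeStorage {
        final String caching;
        FakeStorage() {
            Properties conf = FakeUdfContext.getJobConf();
            // Guard: the constructor may run before deserialize(), so conf can be null.
            caching = (conf == null) ? "default" : conf.getProperty("caching", "default");
        }
    }

    public static void main(String[] args) {
        FakeUdfContext.reset();
        FakeStorage early = new FakeStorage();   // built before deserialize()
        Properties p = new Properties();
        p.setProperty("caching", "500");
        FakeUdfContext.deserialize(p);
        FakeStorage late = new FakeStorage();    // built after deserialize()
        System.out.println(early.caching + " " + late.caching); // prints "default 500"
    }
}
```

Without the null check, the first constructor call would throw an NPE, which
is the failure described in the earlier comment.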


> On May 22, 2016, 9:57 p.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/impl/PigContext.java, line 924
> > <https://reviews.apache.org/r/45667/diff/1/?file=1323849#file1323849line924>
> >
> >     This can be reverted. PigContext need not be serialized to the backend. 
> > See PIG-4866
> 
> kelly zhang wrote:
>     PIG-4866 is about not serializing PigContext in the configuration,
> whereas here we override PigContext#writeObject and PigContext#readObject so
> that only one attribute (packageImportList) is serialized and deserialized
> in spark mode.

Updated PigOnSpark_3.patch. After PIG-4920, we store
UDFContext#getClientSystemProps and UDFContext#getUdfConfs in SparkEngineConf,
so PigContext is no longer modified.
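
For reference, the approach described in the earlier comment (overriding
writeObject/readObject so that only one attribute travels to the backend)
follows the standard Java serialization hooks. A minimal sketch, assuming a
hypothetical FakeContext class rather than the real PigContext:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class PartialSerializationSketch {
    // Hypothetical stand-in for a context object; not the real PigContext.
    static final class FakeContext implements Serializable {
        private static final long serialVersionUID = 1L;
        transient List<String> packageImportList = new ArrayList<>();
        transient String otherState = "client-only"; // never written to the stream

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            // Only this one attribute is written explicitly.
            out.writeObject(new ArrayList<>(packageImportList));
        }

        @SuppressWarnings("unchecked")
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            packageImportList = (List<String>) in.readObject();
            // otherState is not restored and stays null on the receiving side.
        }
    }

    // Serializes and deserializes in memory to mimic shipping the object to a backend.
    static FakeContext roundTrip(FakeContext ctx) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(ctx);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FakeContext) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FakeContext ctx = new FakeContext();
        ctx.packageImportList.add("java.lang.");
        ctx.packageImportList.add("org.apache.pig.builtin.");
        FakeContext copy = roundTrip(ctx);
        System.out.println(copy.packageImportList + " " + copy.otherState);
    }
}
```

After the round trip, only packageImportList is populated; every other field
has to be rebuilt on the backend or left unset.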


> On May 22, 2016, 9:57 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/test/TestBuiltin.java, line 3255
> > <https://reviews.apache.org/r/45667/diff/1/?file=1323870#file1323870line3255>
> >
> >     This testcase is broken if you have 0-0 repeating twice. It is not 
> > UniqueID anymore.
> 
> kelly zhang wrote:
>     0-0 repeats twice because we use the task ID in UniqueID#exec:
>     public String exec(Tuple input) throws IOException {
>         String taskIndex = PigMapReduce.sJobConfInternal.get().get(PigConstants.TASK_INDEX);
>         String sequenceId = taskIndex + "-" + Long.toString(sequence);
>         sequence++;
>         return sequenceId;
>     }
>     In MR, we initialize PigConstants.TASK_INDEX in
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce.Reduce#setup:
>     protected void setup(Context context) throws IOException, InterruptedException {
>         ...
>         context.getConfiguration().set(PigConstants.TASK_INDEX,
>             Integer.toString(context.getTaskAttemptID().getTaskID().getId()));
>         ...
>     }
>     
>     But Spark does not provide a function like PigGenericMapReduce.Reduce#setup
> to initialize PigConstants.TASK_INDEX when a job starts.
>     I suggest filing a new JIRA (initialize PigConstants.TASK_INDEX when a
> Spark job starts) and skipping this unit test until that JIRA is resolved.

Updated PigOnSpark_3.patch. I have created PIG-5051 and added a comment on
TestBuiltin#testUniqueID (the behavior in spark mode will be the same as in MR
until PIG-5051 is fixed).
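
For reference, the collision discussed above can be reproduced outside Pig
with a small sketch. FakeUniqueId only mimics the shape of the quoted
UniqueID#exec; the "task.index" key and the assumption that every task sees
the same index "0" when it is not initialized per task are mine, made purely
for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UniqueIdSketch {
    static final String TASK_INDEX = "task.index"; // hypothetical key name

    // Mimics the quoted exec(): "<task index>-<per-instance sequence>".
    static final class FakeUniqueId {
        private final Map<String, String> conf;
        private long sequence = 0;
        FakeUniqueId(Map<String, String> conf) { this.conf = conf; }
        String exec() {
            String id = conf.get(TASK_INDEX) + "-" + sequence;
            sequence++;
            return id;
        }
    }

    // Each simulated task gets its own UDF instance, so sequence restarts at 0 per task.
    static List<String> runTasks(int tasks, int rowsPerTask, boolean setIndexPerTask) {
        List<String> ids = new ArrayList<>();
        for (int t = 0; t < tasks; t++) {
            Map<String, String> conf = new HashMap<>();
            // Assumption: without per-task setup, every task sees index "0".
            conf.put(TASK_INDEX, setIndexPerTask ? Integer.toString(t) : "0");
            FakeUniqueId udf = new FakeUniqueId(conf);
            for (int r = 0; r < rowsPerTask; r++) {
                ids.add(udf.exec());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(runTasks(2, 2, true));  // prints [0-0, 0-1, 1-0, 1-1]
        System.out.println(runTasks(2, 2, false)); // prints [0-0, 0-1, 0-0, 0-1]
    }
}
```

The second run shows why the IDs stop being unique once the task index is
not set per task: the sequence counter alone cannot distinguish tasks.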


- kelly


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45667/#review134255
-----------------------------------------------------------


On July 11, 2016, 4:32 a.m., Pallavi Rao wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45667/
> -----------------------------------------------------------
> 
> (Updated July 11, 2016, 4:32 a.m.)
> 
> 
> Review request for pig, Daniel Dai and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-4059 and PIG-4854
>     https://issues.apache.org/jira/browse/PIG-4059
>     https://issues.apache.org/jira/browse/PIG-4854
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> The patch contains all the work done in the spark branch, so far.
> 
> 
> Diffs
> -----
> 
>   bin/pig 81f1426 
>   build.xml 99ba1f4 
>   ivy.xml dd9878e 
>   ivy/libraries.properties 3a819a5 
>   shims/test/hadoop20/org/apache/pig/test/SparkMiniCluster.java PRE-CREATION 
>   shims/test/hadoop23/org/apache/pig/test/SparkMiniCluster.java PRE-CREATION 
>   shims/test/hadoop23/org/apache/pig/test/TezMiniCluster.java 792a1bd 
>   shims/test/hadoop23/org/apache/pig/test/YarnMiniCluster.java PRE-CREATION 
>   src/META-INF/services/org.apache.pig.ExecType 5c034c8 
>   src/docs/src/documentation/content/xdocs/start.xml 36f9952 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1ff1abd
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java ecf780c
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhysicalPlan.java 2376d03
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCollectedGroup.java bcbfe2b
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFRJoin.java d80951a
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 21b75f1
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POGlobalRearrange.java 52cfb73
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeJoin.java 13f70c0
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java c3a82c3
>   src/org/apache/pig/backend/hadoop/executionengine/spark/JobGraphBuilder.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/JobMetricsListener.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/KryoSerializer.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/MapReducePartitionerWrapper.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkExecType.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkExecutionEngine.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLocalExecType.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkUtil.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/UDFJarsFinder.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/CollectedGroupConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/CounterConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/DistinctConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/FRJoinConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/FilterConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/ForEachConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/GlobalRearrangeConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/IndexedKey.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/IteratorTransform.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LimitConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LoadConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LocalRearrangeConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/MergeCogroupConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/MergeJoinConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/OutputConsumerIterator.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/PackageConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/PigSecondaryKeyComparatorSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/RDDConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/RankConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/ReduceByConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SkewedJoinConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SortConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SplitConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/StoreConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/StreamConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/UnionConverter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/operator/NativeSparkOperator.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/operator/POGlobalRearrangeSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/operator/POReduceBySpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/AccumulatorOptimizer.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/CombinerOptimizer.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/MultiQueryOptimizerSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/NoopFilterRemover.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/ParallelismSetter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/SecondaryKeyOptimizerSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/DotSparkPrinter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkCompiler.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkCompilerException.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkOpPlanVisitor.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkOperPlan.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkOperator.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkPOPackageAnnotator.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/SparkPrinter.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/spark/running/PigInputFormatSpark.java PRE-CREATION
>   src/org/apache/pig/backend/hadoop/executionengine/util/AccumulatorOptimizerUtil.java c4b44ad
>   src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 889c01b
>   src/org/apache/pig/backend/hadoop/executionengine/util/SecondaryKeyOptimizerUtil.java 0b59c9c
>   src/org/apache/pig/backend/hadoop/hbase/HBaseStorage.java e0581d9 
>   src/org/apache/pig/data/SelfSpillBag.java d17f0a8 
>   src/org/apache/pig/impl/PigContext.java d43949f 
>   src/org/apache/pig/impl/plan/OperatorPlan.java 8b2e2e7 
>   src/org/apache/pig/tools/pigstats/PigStatsUtil.java 542cc2e 
>   src/org/apache/pig/tools/pigstats/spark/SparkCounter.java PRE-CREATION 
>   src/org/apache/pig/tools/pigstats/spark/SparkCounterGroup.java PRE-CREATION 
>   src/org/apache/pig/tools/pigstats/spark/SparkCounters.java PRE-CREATION 
>   src/org/apache/pig/tools/pigstats/spark/SparkJobStats.java PRE-CREATION 
>   src/org/apache/pig/tools/pigstats/spark/SparkPigStats.java PRE-CREATION 
>   src/org/apache/pig/tools/pigstats/spark/SparkPigStatusReporter.java PRE-CREATION
>   src/org/apache/pig/tools/pigstats/spark/SparkScriptState.java PRE-CREATION 
>   src/org/apache/pig/tools/pigstats/spark/SparkStatsUtil.java PRE-CREATION 
>   test/e2e/pig/build.xml f7c38ba 
>   test/e2e/pig/conf/spark.conf PRE-CREATION 
>   test/e2e/pig/drivers/TestDriverPig.pm bf9c302 
>   test/e2e/pig/tests/streaming.conf 18f2fb2 
>   test/excluded-tests-spark PRE-CREATION 
>   test/org/apache/pig/newplan/logical/relational/TestLocationInPhysicalPlan.java 94b34b3
>   test/org/apache/pig/spark/TestIndexedKey.java PRE-CREATION 
>   test/org/apache/pig/spark/TestSecondarySortSpark.java PRE-CREATION 
>   test/org/apache/pig/test/MiniGenericCluster.java 9347269 
>   test/org/apache/pig/test/TestAssert.java 6d4b5c6 
>   test/org/apache/pig/test/TestBuiltin.java fbc3f1e 
>   test/org/apache/pig/test/TestCase.java c9bb2fa 
>   test/org/apache/pig/test/TestCollectedGroup.java a958d33 
>   test/org/apache/pig/test/TestCombiner.java df44293 
>   test/org/apache/pig/test/TestCubeOperator.java de96e6c 
>   test/org/apache/pig/test/TestEvalPipeline.java 9efde13 
>   test/org/apache/pig/test/TestEvalPipeline2.java c8f51d7 
>   test/org/apache/pig/test/TestEvalPipelineLocal.java c12d595 
>   test/org/apache/pig/test/TestFinish.java f18c103 
>   test/org/apache/pig/test/TestForEachNestedPlanLocal.java b0aa3a8 
>   test/org/apache/pig/test/TestGrunt.java 9eaf298 
>   test/org/apache/pig/test/TestHBaseStorage.java 8d2ad85 
>   test/org/apache/pig/test/TestLimitVariable.java 53b9dae 
>   test/org/apache/pig/test/TestMapSideCogroup.java 2c78b4a 
>   test/org/apache/pig/test/TestMergeJoin.java f1a9608 
>   test/org/apache/pig/test/TestMergeJoinOuter.java 81aee55 
>   test/org/apache/pig/test/TestMultiQuery.java c32eab7 
>   test/org/apache/pig/test/TestMultiQueryLocal.java b9ac035 
>   test/org/apache/pig/test/TestNativeMapReduce.java c4f6573 
>   test/org/apache/pig/test/TestNullConstant.java 3ea4509 
>   test/org/apache/pig/test/TestPigRunner.java fde8609 
>   test/org/apache/pig/test/TestPigServerLocal.java fbabd03 
>   test/org/apache/pig/test/TestProjectRange.java 2e3e7b8 
>   test/org/apache/pig/test/TestPruneColumn.java 3936332 
>   test/org/apache/pig/test/TestRank1.java 9e4ef62 
>   test/org/apache/pig/test/TestRank2.java fc802a9 
>   test/org/apache/pig/test/TestRank3.java 43af10d 
>   test/org/apache/pig/test/TestSecondarySort.java 8991010 
>   test/org/apache/pig/test/TestSkewedJoin.java dba2241 
>   test/org/apache/pig/test/TestStoreBase.java eb3b253 
>   test/org/apache/pig/test/Util.java 36d01e8 
>   test/spark-tests PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/45667/diff/
> 
> 
> Testing
> -------
> 
> New UTs were added where required, and we ensured that the old UTs pass ->
> https://builds.apache.org/job/Pig-spark/
> 
> 
> Thanks,
> 
> Pallavi Rao
> 
>
