-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43875/#review120467
-----------------------------------------------------------


Ship It!

- Pallavi Rao


On Feb. 24, 2016, 7:47 a.m., prateek vaishnav wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43875/
> -----------------------------------------------------------
> 
> (Updated Feb. 24, 2016, 7:47 a.m.)
> 
> 
> Review request for pig and Pallavi Rao.
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/PIG-4807
> 
> The following test cases have been fixed:
> 1. org.apache.pig.test.TestEvalPipelineLocal.testSetLocationCalledInFE
> 2. org.apache.pig.test.TestEvalPipelineLocal.testExplainInDotGraph
> 3. org.apache.pig.test.TestEvalPipelineLocal.testSortWithUDF
> 
> Test 1 was failing because the UDF_CONTEXT configuration was not saved in 
> jobConf. As a result, the Properties returned by 
> UDFContext.getUDFProperties() were empty, so lookups on them returned null.
>  
> public Properties getUDFProperties(Class c) {
>     UDFContextKey k = generateKey(c, null);
>     Properties p = udfConfs.get(k);
>     if (p == null) {
>         p = new Properties();
>         udfConfs.put(k, p);
>     }
>     return p;
> }
> 
> Here, udfConfs remains empty even though it was populated while the Pig 
> query was being processed: the UDF configuration in jobConf is lost when 
> the job runs. The code is meant to preserve the UDF configuration by 
> serializing it into jobConf.
> 
> Currently, serialization happens in 'newJobConf(PigContext pigContext)', 
> before the configuration is loaded into jobConf. It needs to happen after 
> the configuration is loaded:
> 
> JobConf jobConf = SparkUtil.newJobConf(pigContext);
> configureLoader(physicalPlan, op, jobConf);
> UDFContext.getUDFContext().serialize(jobConf);
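The ordering problem described above can be illustrated with a small standalone sketch. It does not use the Pig or Spark classes: ToyUdfContext, configureLoader and the "udf." key prefix are hypothetical stand-ins, and it assumes that configuring the loader is what records the UDF property. It only shows why a snapshot taken before the loader is configured misses that property.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class SerializeOrderSketch {

    // Hypothetical stand-in for a UDF context that keeps properties in memory.
    static final class ToyUdfContext {
        private final Properties props = new Properties();

        Properties getProperties() {
            return props;
        }

        // Copies a snapshot of the current properties into the job configuration.
        void serialize(Map<String, String> jobConf) {
            for (String name : props.stringPropertyNames()) {
                jobConf.put("udf." + name, props.getProperty(name));
            }
        }
    }

    // Hypothetical stand-in for loader setup: the loader records a property
    // in the context while it is being configured (an assumption of this sketch).
    static void configureLoader(ToyUdfContext ctx) {
        ctx.getProperties().setProperty("location", "/some/input");
    }

    // Old order: snapshot first, then configure the loader.
    static Map<String, String> serializeBeforeLoading() {
        ToyUdfContext ctx = new ToyUdfContext();
        Map<String, String> jobConf = new HashMap<>();
        ctx.serialize(jobConf);
        configureLoader(ctx);
        return jobConf;
    }

    // New order: configure the loader first, then snapshot.
    static Map<String, String> serializeAfterLoading() {
        ToyUdfContext ctx = new ToyUdfContext();
        Map<String, String> jobConf = new HashMap<>();
        configureLoader(ctx);
        ctx.serialize(jobConf);
        return jobConf;
    }

    public static void main(String[] args) {
        System.out.println("serialize first: " + serializeBeforeLoading());
        System.out.println("serialize last:  " + serializeAfterLoading());
    }
}
```

In this sketch only the second ordering carries the loader's property into the job configuration, which matches the reasoning given for moving the serialize call.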
>             
> Test 2 was failing because pig-spark did not support 'explain' in dot 
> format. I have added DotSparkPrinter to fix this.
> 
> Test 3 was failing because the SortConverter class was using 
> SortComparator instead of UDFSortComparator:
> 
> JavaPairRDD<Tuple, Object> sorted = r.sortByKey(
>                 sortOperator.new SortComparator(), true);
> 
> It should use the mComparator stored in the POSort class. I have changed 
> it to the following:
> 
> JavaPairRDD<Tuple, Object> sorted = r.sortByKey(
>                 sortOperator.getMComparator(), true);
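The effect of asking the operator for its comparator, rather than always constructing the default one, can be sketched without Spark. ToySortOperator and sortWith are hypothetical names, and the fallback to natural ordering is an assumption of this sketch, not a statement about how POSort chooses mComparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ComparatorChoiceSketch {

    // Hypothetical stand-in for a sort operator that may carry a
    // user-defined comparator.
    static final class ToySortOperator {
        private final Comparator<String> comparator;

        ToySortOperator(Comparator<String> userDefined) {
            // Assumption: fall back to natural ordering only when no
            // user-defined comparator is supplied.
            this.comparator = (userDefined != null)
                    ? userDefined
                    : Comparator.<String>naturalOrder();
        }

        Comparator<String> getComparator() {
            return comparator;
        }
    }

    // Sorts a copy of the keys with whatever comparator the operator holds.
    static List<String> sortWith(ToySortOperator op, List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort(op.getComparator());
        return copy;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("pear", "fig", "banana");
        ToySortOperator byLength =
                new ToySortOperator(Comparator.comparingInt(String::length));
        System.out.println(sortWith(byLength, keys));
        System.out.println(sortWith(new ToySortOperator(null), keys));
    }
}
```

Because the caller never names a concrete comparator class, a user-defined ordering is honoured whenever the operator holds one.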
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java a759857 
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java b74977d 
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LoadConverter.java 90cff23 
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SortConverter.java f54f8fc 
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/DotSparkPrinter.java e69de29 
>   test/org/apache/pig/test/TestEvalPipelineLocal.java d73074c 
> 
> Diff: https://reviews.apache.org/r/43875/diff/
> 
> 
> Testing
> -------
> 
> Successfully ran TestEvalPipelineLocal in spark/mr/local mode.
> 
> 
> Thanks,
> 
> prateek vaishnav
> 
>
