-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43875/
-----------------------------------------------------------

Review request for pig and Pallavi Rao.


Repository: pig-git


Description
-------

https://issues.apache.org/jira/browse/PIG-4807

The following test cases have been fixed:
1. org.apache.pig.test.TestEvalPipelineLocal.testSetLocationCalledInFE
2. org.apache.pig.test.TestEvalPipelineLocal.testExplainInDotGraph
3. org.apache.pig.test.TestEvalPipelineLocal.testSortWithUDF

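Test 1 was failing because the UDF_CONTEXT configuration was not saved in 
jobConf. As a result, the lookup in UDFContext.getUDFProperties() comes back 
null and the method returns a new, empty Properties object instead of the 
saved one.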
 
public Properties getUDFProperties(Class c) {
    UDFContextKey k = generateKey(c, null);
    Properties p = udfConfs.get(k);
    if (p == null) {
        p = new Properties();
        udfConfs.put(k, p);
    }
    return p;
}

Here, udfConfs remains empty even though it was populated while the Pig query 
was being processed.
The UDF configuration held in jobConf is lost when the job runs.
The code is meant to preserve the UDF configuration by serializing it into 
jobConf.

Currently, serialization happens in 'newJobConf(PigContext pigContext)', 
before the configuration is loaded into jobConf.
It needs to happen after the configuration is loaded:

JobConf jobConf = SparkUtil.newJobConf(pigContext);
configureLoader(physicalPlan, op, jobConf);
UDFContext.getUDFContext().serialize(jobConf);
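
The effect of the call order can be reproduced with a small, self-contained 
sketch. It does not use Pig's classes: SerializeOrderSketch, the "MyLoader" 
key, the "location" property and the plain Map standing in for jobConf are 
all hypothetical names chosen for illustration. It only assumes that 
serialization takes a snapshot of whatever the in-memory UDF properties 
contain at the moment it is called.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical stand-in for the UDFContext / jobConf interaction; not Pig's actual classes. */
class SerializeOrderSketch {
    // Plays the role of udfConfs: per-UDF properties held in memory.
    static final Map<String, Properties> udfConfs = new HashMap<>();

    // Get-or-create lookup with the same shape as the getUDFProperties() quoted above.
    static Properties getUDFProperties(String udfKey) {
        return udfConfs.computeIfAbsent(udfKey, k -> new Properties());
    }

    // Plays the role of serialize(jobConf): copies the current state into the job configuration.
    static void serialize(Map<String, String> jobConf) {
        for (Map.Entry<String, Properties> e : udfConfs.entrySet()) {
            for (String name : e.getValue().stringPropertyNames()) {
                jobConf.put(e.getKey() + "." + name, e.getValue().getProperty(name));
            }
        }
    }

    // Plays the role of configureLoader(...): the loader records a property while it is set up.
    static void configureLoader() {
        getUDFProperties("MyLoader").setProperty("location", "input.txt");
    }

    // Builds a job configuration using the chosen call order and returns it.
    static Map<String, String> run(boolean serializeAfterConfigure) {
        udfConfs.clear();
        Map<String, String> jobConf = new HashMap<>();
        if (serializeAfterConfigure) {
            configureLoader();
            serialize(jobConf);   // snapshot includes the loader's property
        } else {
            serialize(jobConf);   // snapshot is taken too early and stays empty
            configureLoader();
        }
        return jobConf;
    }

    public static void main(String[] args) {
        System.out.println("serialize first: " + run(false));
        System.out.println("serialize last:  " + run(true));
    }
}
```

With serialization first, the map that stands in for jobConf stays empty; 
with serialization last, it contains the loader's property.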
            
Test 2 was failing because pig-spark did not support 'explain' in dot format. 
I have added DotSparkPrinter to fix this.
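
For readers unfamiliar with the dot format, the sketch below shows the kind 
of text such a printer emits for a small operator graph. It is a 
hypothetical, minimal emitter and does not reproduce DotSparkPrinter; the 
class name DotPlanSketch and the operator names are assumptions made for 
illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, minimal DOT emitter; it does not reproduce DotSparkPrinter. */
class DotPlanSketch {
    // Each edge is stored as a {from, to} pair of operator names.
    private final List<String[]> edges = new ArrayList<>();

    void addEdge(String from, String to) {
        edges.add(new String[] {from, to});
    }

    // Renders the edges as a Graphviz "digraph" block, one edge per line.
    String toDot(String graphName) {
        StringBuilder sb = new StringBuilder();
        sb.append("digraph ").append(graphName).append(" {\n");
        for (String[] e : edges) {
            sb.append("  \"").append(e[0]).append("\" -> \"").append(e[1]).append("\";\n");
        }
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        DotPlanSketch plan = new DotPlanSketch();
        plan.addEdge("Load", "Sort");
        plan.addEdge("Sort", "Store");
        System.out.print(plan.toDot("plan"));
    }
}
```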

Test 3 was failing because the SortConverter class was using SortComparator 
instead of UDFSortComparator:

JavaPairRDD<Tuple, Object> sorted = r.sortByKey(
                sortOperator.new SortComparator(), true);

It should use the mComparator stored in the POSort class. I have changed it to 
the following:

JavaPairRDD<Tuple, Object> sorted = r.sortByKey(
                sortOperator.getMComparator(), true);
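
Why the choice of comparator matters can be seen with plain Java 
collections. The sketch below involves no Pig or Spark classes; 
SortComparatorSketch and both comparators are hypothetical stand-ins, one 
for a default ordering and one for an ordering supplied by the user.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration with plain Java collections; no Pig or Spark classes involved. */
class SortComparatorSketch {
    // Stand-in for a default comparator: natural (lexicographic) string order.
    static final Comparator<String> DEFAULT_ORDER = Comparator.<String>naturalOrder();

    // Stand-in for a user-defined sort function: shorter strings first, ties in natural order.
    static final Comparator<String> USER_DEFINED_ORDER =
            Comparator.comparingInt(String::length)
                      .thenComparing(Comparator.<String>naturalOrder());

    // Sorts a copy of the input with whichever comparator the caller passes in.
    static List<String> sorted(List<String> input, Comparator<String> comparator) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(comparator);
        return copy;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("pear", "fig", "banana");
        System.out.println(sorted(keys, DEFAULT_ORDER));      // [banana, fig, pear]
        System.out.println(sorted(keys, USER_DEFINED_ORDER)); // [fig, pear, banana]
    }
}
```

The same input comes out in a different order depending on the comparator, 
which is why a sort that ignores the user-supplied comparator produces 
results the test does not expect.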


Diffs
-----

  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java a759857 
  src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java b74977d 
  src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LoadConverter.java 90cff23 
  src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SortConverter.java f54f8fc 
  src/org/apache/pig/backend/hadoop/executionengine/spark/plan/DotSparkPrinter.java e69de29 
  test/org/apache/pig/test/TestEvalPipelineLocal.java d73074c 

Diff: https://reviews.apache.org/r/43875/diff/


Testing
-------

Successfully ran TestEvalPipelineLocal in spark/mr/local mode.


Thanks,

prateek vaishnav
