-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57996/
-----------------------------------------------------------

Review request for pig, Daniel Dai, liyun zhang, Rohini Palaniswamy, and Xuefu 
Zhang.


Bugs: PIG-5186
    https://issues.apache.org/jira/browse/PIG-5186


Repository: pig-git


Description
-------

Aggregate warnings were not supported in Spark mode yet (hence the e2e Warning 
test case failures). I aim to enable this now.
In MR/Tez we use counters, and in Spark we rely on Accumulators (a means to 
support distributed counters).
Pig has some builtin warning enums in PigWarning, and also supports custom 
warnings for user defined functions.
This latter is problematic with Spark because you cannot register new 
accumulators on the backend and read their values later in the driver.

A workaround has been implemented in my patch whereas we define Map type of 
Accumulators (beside the Long type we already use). One for the builtin 
warnings, one for the custom ones. These are passed from driver to backend, 
where the executors can create entries in the maps or increment preexisting 
values.

Also added upgrade of DummyContextUDF, this will help fix HiveUDF_7 e2e test 
case on Spark.
Previously this was using org.apache.hadoop.mapred.Reporter we have to update 
this to PigHadoopLogger which supports Spark too.


Diffs
-----

  src/org/apache/pig/PigWarning.java fcda1145f4e7c16940a540222ac7cc5370e3db33 
  
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigHadoopLogger.java
 255650edb519acc452812a5d67f3ac2376c278c2 
  src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java 
36813b27be1090b04d577829080e4b931c5eb950 
  
src/org/apache/pig/backend/hadoop/executionengine/spark/running/PigInputFormatSpark.java
 8cf6513d3e5425e974d27d74c92592a6f0ed2cf2 
  src/org/apache/pig/tools/pigstats/PigStatusReporter.java 
5396535301b0e90dc5d3be2064cfe0bdf488bf6a 
  src/org/apache/pig/tools/pigstats/PigWarnCounterIncrementable.java 
PRE-CREATION 
  src/org/apache/pig/tools/pigstats/spark/SparkCounter.java 
2411f875ec996fedb870c1b709b99e949803ed50 
  src/org/apache/pig/tools/pigstats/spark/SparkCounterGroup.java 
c23624dfcd2e11429fd8355497d184b155450c1f 
  src/org/apache/pig/tools/pigstats/spark/SparkCounters.java 
5ca077ca519ad766c8f6a23ef5b69cd02f3abe99 
  src/org/apache/pig/tools/pigstats/spark/SparkJobStats.java 
808c3deb47bc8d9a212a701c81e0c9c6abe88f37 
  src/org/apache/pig/tools/pigstats/spark/SparkPigStats.java 
699219d30519c7db56a4b39c7690fa20d62df44d 
  src/org/apache/pig/tools/pigstats/spark/SparkStatsUtil.java 
2945c80dba4a23b07ee3d8a613b6a7e9319622ba 
  test/e2e/pig/udfs/java/org/apache/pig/test/udf/evalfunc/DummyContextUDF.java 
d5eb9ae660f94a444a17f2171107b6ff7e81819b 


Diff: https://reviews.apache.org/r/57996/diff/1/


Testing
-------

After this patch Warning E2E tests on Spark pass.


Thanks,

Adam Szita

Reply via email to