-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57996/
-----------------------------------------------------------

(Updated May 15, 2017, 1:27 p.m.)


Review request for pig, Daniel Dai, liyun zhang, Rohini Palaniswamy, and Xuefu 
Zhang.


Bugs: PIG-5186
    https://issues.apache.org/jira/browse/PIG-5186


Repository: pig-git


Description
-------

Aggregate warnings are not yet supported in Spark mode (hence the e2e Warning 
test case failures). This patch enables them.
In MR/Tez we use counters; in Spark we rely on Accumulators (Spark's mechanism 
for distributed counters).
Pig has some builtin warning enums in PigWarning, and also supports custom 
warnings for user-defined functions.
The latter is problematic with Spark because you cannot register new 
accumulators on the backend and read their values later in the driver.

As a workaround, my patch defines Map-typed Accumulators (besides the Long 
type we already use): one for the builtin warnings and one for the custom 
ones. These are passed from the driver to the backend, where the executors can 
create entries in the maps or increment existing values.
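
To illustrate the idea, here is a minimal plain-Java sketch of the semantics 
of a map-valued accumulator. It is not the code in the patch and does not use 
Spark's Accumulator API; the class name MapAccumulatorSketch and the warning 
keys are made up for illustration. Each simulated task fills its own partial 
map, and the driver sums the partial maps per key.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Plain-Java stand-in for a map-valued accumulator (hypothetical name;
 * neither Spark's Accumulator API nor the class from the patch).
 */
class MapAccumulatorSketch {
    private final Map<String, Long> counts = new HashMap<>();

    /** Executor side: create the entry on first use, otherwise increment it. */
    void add(String warningKey, long delta) {
        counts.merge(warningKey, delta, Long::sum);
    }

    /** Driver side: fold another partial map into this one, summing per key. */
    void merge(MapAccumulatorSketch other) {
        other.counts.forEach((key, value) -> counts.merge(key, value, Long::sum));
    }

    /** Returns the aggregated count for a key, or 0 if it was never reported. */
    long get(String warningKey) {
        return counts.getOrDefault(warningKey, 0L);
    }

    public static void main(String[] args) {
        // Two simulated tasks, each holding its own partial map.
        MapAccumulatorSketch task1 = new MapAccumulatorSketch();
        MapAccumulatorSketch task2 = new MapAccumulatorSketch();
        task1.add("MyUdf.BAD_INPUT", 2); // illustrative custom warning key
        task2.add("MyUdf.BAD_INPUT", 3);
        task2.add("DIVIDE_BY_ZERO", 1);  // illustrative builtin-style key

        // The driver merges the partial maps after the tasks finish.
        MapAccumulatorSketch driver = new MapAccumulatorSketch();
        driver.merge(task1);
        driver.merge(task2);

        System.out.println(driver.get("MyUdf.BAD_INPUT")); // prints 5
        System.out.println(driver.get("DIVIDE_BY_ZERO"));  // prints 1
    }
}
```

The point of keying by name is that a new custom warning needs only a new map 
entry, not a new accumulator registered on the backend.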

I also updated DummyContextUDF, which will help fix the HiveUDF_7 e2e test 
case on Spark.
Previously it used org.apache.hadoop.mapred.Reporter; it now has to use 
PigHadoopLogger, which supports Spark too.


Diffs (updated)
-----

  src/org/apache/pig/PigWarning.java fcda1145f4e7c16940a540222ac7cc5370e3db33
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigHadoopLogger.java 255650edb519acc452812a5d67f3ac2376c278c2
  src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java 85e3dc2f80c22b358560c716151d0ba931c78e4e
  src/org/apache/pig/backend/hadoop/executionengine/spark/running/PigInputFormatSpark.java a22c90dd8dd9b6463f22597effd884bc0f12bb73
  src/org/apache/pig/tools/pigstats/PigStatusReporter.java 5396535301b0e90dc5d3be2064cfe0bdf488bf6a
  src/org/apache/pig/tools/pigstats/PigWarnCounter.java PRE-CREATION
  src/org/apache/pig/tools/pigstats/spark/SparkCounter.java 2411f875ec996fedb870c1b709b99e949803ed50
  src/org/apache/pig/tools/pigstats/spark/SparkCounterGroup.java c23624dfcd2e11429fd8355497d184b155450c1f
  src/org/apache/pig/tools/pigstats/spark/SparkCounters.java 5ca077ca519ad766c8f6a23ef5b69cd02f3abe99
  src/org/apache/pig/tools/pigstats/spark/SparkJobStats.java 4c81a414be071d5e08689fde2a5abd847a5fd0b6
  src/org/apache/pig/tools/pigstats/spark/SparkPigStats.java 4e3644209817eeffabcec13d8aac5179bbad1c62
  src/org/apache/pig/tools/pigstats/spark/SparkStatsUtil.java 8f3bf6d736de08ae963152553e975e865f875ab2
  test/e2e/pig/tests/nightly.conf 2048740b6a1b73862694805cc07bdd084c66a300


Diff: https://reviews.apache.org/r/57996/diff/2/

Changes: https://reviews.apache.org/r/57996/diff/1-2/


Testing
-------

After this patch, the Warning e2e tests pass on Spark.


Thanks,

Adam Szita
