-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30262/
-----------------------------------------------------------
(Updated Jan. 26, 2015, 7:23 p.m.)


Review request for pig, liyun zhang and Praveen R.


Changes
-------

Incorporated feedback: removed the Spark version change from this patch.


Bugs: PIG-4393
    https://issues.apache.org/jira/browse/PIG-4393


Repository: pig-git


Description
-------

PIG-4393: Add stats and error reporting for Spark

After Pig submits a job to the Spark cluster, we need to report job progress, Spark-specific stats and any error logs back to the user.

This is an initial patch that adds Spark-specific stats, mostly to get feedback on the assumption that a separate Spark job is launched for each POStore operator.

It also refactors the code to correctly populate PigStats, which is used by most unit tests. This should fix a bunch of unit tests.

TODO items:
- We probably need to add counters that capture the number of records and bytes in the output file, in order to populate OutputStats.
- Though StatsReportListener prints Spark job progress in the logs, we probably also need to implement PigProgressNotificationListener for Spark.


Diffs (updated)
-----

  src/org/apache/pig/backend/hadoop/executionengine/spark/JobMetricsListener.java PRE-CREATION
  src/org/apache/pig/backend/hadoop/executionengine/spark/SparkExecutionEngine.java db152b5003ce6e79b001b2624010b91cc0f921d8
  src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java 6e9b29753fa2db360b5063da38c785675f1e5b57
  src/org/apache/pig/tools/pigstats/SparkStats.java fd45dd4f0be415dd48d9fb7381c57c861bbbf7ce
  src/org/apache/pig/tools/pigstats/spark/SparkJobStats.java PRE-CREATION
  src/org/apache/pig/tools/pigstats/spark/SparkPigStats.java PRE-CREATION
  src/org/apache/pig/tools/pigstats/spark/SparkStatsUtil.java PRE-CREATION

Diff: https://reviews.apache.org/r/30262/diff/


Testing
-------


Thanks,

Mohit Sabharwal