[ https://issues.apache.org/jira/browse/BEAM-6191?focusedWorklogId=175502&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-175502 ]
ASF GitHub Bot logged work on BEAM-6191:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Dec/18 17:44
            Start Date: 14/Dec/18 17:44
    Worklog Time Spent: 10m
      Work Description: swegner closed pull request #7220: [BEAM-6191] Remove redundant error logging for Dataflow exception handling
URL: https://github.com/apache/beam/pull/7220

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is reproduced below for the sake of provenance:

diff --git a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkItemStatusClient.java b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkItemStatusClient.java
index 2d840e3f4356..4473c04f3da8 100644
--- a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkItemStatusClient.java
+++ b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WorkItemStatusClient.java
@@ -117,17 +117,20 @@ public synchronized WorkItemServiceState reportError(Throwable e) throws IOExcep
     Status error = new Status();
     error.setCode(2); // Code.UNKNOWN. TODO: Replace with a generated definition.
     // TODO: Attach the stack trace as exception details, not to the message.
+    String logPrefix = String.format("Failure processing work item %s", uniqueWorkId());
     if (isOutOfMemoryError(t)) {
       String message =
           "An OutOfMemoryException occurred. Consider specifying higher memory "
               + "instances in PipelineOptions.\n";
-      LOG.error(message);
+      LOG.error("{}: {}", logPrefix, message);
       error.setMessage(message + DataflowWorkerLoggingHandler.formatException(t));
     } else {
-      LOG.error("Uncaught exception occurred during work unit execution. This will be retried.", t);
+      LOG.error(
+          "{}: Uncaught exception occurred during work unit execution. This will be retried.",
+          logPrefix,
+          t);
       error.setMessage(DataflowWorkerLoggingHandler.formatException(t));
     }
-    LOG.warn("Failure processing work item {}", uniqueWorkId());

     status.setErrors(ImmutableList.of(error));
     return execute(status);
diff --git a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/MapTaskExecutor.java b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/MapTaskExecutor.java
index d11e72fe95dd..5690814f956d 100644
--- a/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/MapTaskExecutor.java
+++ b/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/util/common/worker/MapTaskExecutor.java
@@ -84,7 +84,7 @@ public void execute() throws Exception {
           op.finish();
         }
       } catch (Exception | Error exn) {
-        LOG.warn("Aborting operations", exn);
+        LOG.debug("Aborting operations", exn);
         for (Operation op : operations) {
           try {
             op.abort();

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
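The first hunk's pattern (build one work-item-scoped prefix and emit a single error log, instead of a separate error plus a trailing "Failure processing work item" warning) can be sketched as a small self-contained Java snippet. This is an illustration only: `buildErrorLog` and the local `formatException` are hypothetical stand-ins; the actual worker code calls SLF4J's `LOG.error` and `DataflowWorkerLoggingHandler.formatException`.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Sketch of the consolidated logging pattern from the patch: one prefix,
// one error log per failure, rather than several overlapping messages.
public class WorkItemErrorLogging {

    // Stand-in for DataflowWorkerLoggingHandler.formatException: render the
    // throwable's stack trace as a string.
    static String formatException(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw));
        return sw.toString();
    }

    static boolean isOutOfMemoryError(Throwable t) {
        return t instanceof OutOfMemoryError;
    }

    // Builds the single log line the worker would emit for this failure,
    // always starting with the work-item-scoped prefix.
    static String buildErrorLog(String uniqueWorkId, Throwable t) {
        String logPrefix = String.format("Failure processing work item %s", uniqueWorkId);
        if (isOutOfMemoryError(t)) {
            return logPrefix
                + ": An OutOfMemoryException occurred. Consider specifying higher memory "
                + "instances in PipelineOptions.\n";
        }
        return logPrefix
            + ": Uncaught exception occurred during work unit execution. This will be retried.";
    }

    public static void main(String[] args) {
        System.out.println(buildErrorLog("work-123", new RuntimeException("boom")));
    }
}
```

One detail worth noting about the real patched call: SLF4J treats a `Throwable` passed as the final argument specially, rendering its stack trace rather than substituting it into a `{}` placeholder, which is why the new `LOG.error` passes `logPrefix` for the single placeholder and `t` last.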
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 175502)
    Time Spent: 1h  (was: 50m)

> Redundant error messages for failures in Dataflow runner
> --------------------------------------------------------
>
>                 Key: BEAM-6191
>                 URL: https://issues.apache.org/jira/browse/BEAM-6191
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-dataflow
>            Reporter: Scott Wegner
>            Assignee: Scott Wegner
>            Priority: Minor
>             Fix For: 2.10.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The Dataflow runner harness has redundant error logging from a couple
> different components, which creates log spam and confusion when failures do
> occur. We should dedupe redundant logs.
> From a typical user-code exception, we see at least 3 error logs from the
> worker: http://screen/QZxsJOVnvt6
> "Aborting operations"
> "Uncaught exception occurred during work unit execution. This will be
> retried."
> "Failure processing work item"

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)