debabhishek53 commented on code in PR #4187:
URL: https://github.com/apache/gobblin/pull/4187#discussion_r3115366930
##########
gobblin-restli/gobblin-flow-config-service/gobblin-flow-config-service-server/src/main/java/org/apache/gobblin/service/FlowExecutionResource.java:
##########
@@ -220,13 +220,25 @@ public static FlowExecution convertFlowStatus(FlowStatus monitoringFlowStatus,
 jobStatusArray.sort(Comparator.comparing((org.apache.gobblin.service.JobStatus js) -> js.getExecutionStatistics().getExecutionStartTime()));
+ // $UNKNOWN is a Pegasus in-memory sentinel (not declared in ExecutionStatus.pdl) that can arise
+ // when no flow-level (NA/NA) status event was persisted for this execution (e.g., orchestration
+ // race in ReevaluateDagProc). Serializing it produces HTTP 500 and poisons the whole collection
+ // response. Coerce to PENDING so the record serializes; the flow status is effectively "unknown
+ // but the flow exists", and polling callers will keep polling until a terminal state is known.
+ ExecutionStatus flowExecutionStatus = monitoringFlowStatus.getFlowExecutionStatus();
+ if (flowExecutionStatus == ExecutionStatus.$UNKNOWN) {
+ log.warn("FlowExecution {}/{}/{} has $UNKNOWN flow status; coercing to PENDING. Check state store for data quality issue.",
+ flowId.getFlowGroup(), flowId.getFlowName(), monitoringFlowStatus.getFlowExecutionId());
+ flowExecutionStatus = ExecutionStatus.PENDING;
Review Comment:
Why PENDING over RUNNING as the coercion target? RUNNING might be a closer
semantic match than PENDING, since the flow is clearly active enough for
executions to appear in the collection response.
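
To make the trade-off concrete, here is a minimal, self-contained sketch of the coercion being discussed. It is not the PR's code: the `Status` enum is a hypothetical stand-in for the generated `ExecutionStatus` (which declares more values), and the `TERMINAL` set and `keepsPolling` helper are assumptions about how a polling caller might decide when to stop. Under those assumptions either fallback keeps callers polling, so the choice between PENDING and RUNNING seems to come down to which state reads as less misleading to consumers.

```java
import java.util.EnumSet;
import java.util.Set;

class StatusCoercionSketch {
  // Hypothetical stand-in for the Pegasus-generated ExecutionStatus enum;
  // the real enum declares more values than shown here.
  public enum Status { PENDING, RUNNING, COMPLETE, FAILED, $UNKNOWN }

  // Assumption: polling callers stop only when they see one of these states.
  static final Set<Status> TERMINAL = EnumSet.of(Status.COMPLETE, Status.FAILED);

  // Replace the $UNKNOWN sentinel with a serializable fallback;
  // every other value passes through unchanged.
  public static Status coerceUnknown(Status status, Status fallback) {
    return status == Status.$UNKNOWN ? fallback : status;
  }

  // True when a caller following the assumed rule would poll again.
  public static boolean keepsPolling(Status status) {
    return !TERMINAL.contains(status);
  }

  public static void main(String[] args) {
    // Compare the two candidate coercion targets from the review thread.
    for (Status fallback : new Status[] {Status.PENDING, Status.RUNNING}) {
      Status coerced = coerceUnknown(Status.$UNKNOWN, fallback);
      System.out.println(fallback + " -> keepsPolling=" + keepsPolling(coerced));
    }
  }
}
```

Passing the fallback as a parameter keeps the sentinel check in one place, so switching the target later would be a one-argument change.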