phet commented on a change in pull request #3382:
URL: https://github.com/apache/gobblin/pull/3382#discussion_r704930592
##########
File path:
gobblin-restli/gobblin-flow-config-service/gobblin-flow-config-service-server/src/main/java/org/apache/gobblin/service/FlowExecutionResourceHandler.java
##########
@@ -36,6 +36,14 @@
*/
public List<FlowExecution> getLatestFlowExecution(PagingContext context,
FlowId flowId, Integer count, String tag, String executionStatus);
+ /**
+ * Get latest {@link FlowExecution} for every flow in `flowGroup`
+ *
+ * NOTE: `executionStatus` param not provided yet, without justifying use case, due to complexity of interaction with `count`
+ * and resulting efficiency concern of performing across many flows sharing the single named group.
+ */
+ public List<FlowExecution> getLatestFlowGroupExecutions(PagingContext context, String flowGroup, Integer count, String tag);
Review comment:
As we agreed offline, I've renamed `count` to `countPerFlow` (on the
`latestFlowGroupExecutions` finder).
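To make the intended `countPerFlow` semantics concrete, here is a minimal sketch of "keep the latest N executions for each flow in the group". It is only an illustration, not the handler's actual implementation: `CountPerFlowSketch`, `Exec`, and `latestPerFlow` are hypothetical names, and it assumes a larger `executionId` means a more recent execution.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CountPerFlowSketch {

  /** Hypothetical stand-in for a flow execution; NOT Gobblin's FlowExecution. */
  public static final class Exec {
    public final String flowName;
    public final long executionId; // assumption: larger id == more recent

    public Exec(String flowName, long executionId) {
      this.flowName = flowName;
      this.executionId = executionId;
    }
  }

  /** Keeps at most countPerFlow of the most recent executions for each flow. */
  public static List<Exec> latestPerFlow(List<Exec> execs, int countPerFlow) {
    // Sort a copy newest-first so the first entries seen per flow are the latest.
    List<Exec> sorted = new ArrayList<>(execs);
    sorted.sort(Comparator.comparingLong((Exec e) -> e.executionId).reversed());

    // Bucket by flow name, capping each bucket at countPerFlow entries.
    Map<String, List<Exec>> byFlow = new LinkedHashMap<>();
    for (Exec e : sorted) {
      List<Exec> kept = byFlow.computeIfAbsent(e.flowName, k -> new ArrayList<>());
      if (kept.size() < countPerFlow) {
        kept.add(e);
      }
    }

    // Flatten the buckets back into a single list, one flow after another.
    List<Exec> result = new ArrayList<>();
    for (List<Exec> kept : byFlow.values()) {
      result.addAll(kept);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Exec> execs = new ArrayList<>();
    execs.add(new Exec("flowA", 1L));
    execs.add(new Exec("flowA", 3L));
    execs.add(new Exec("flowA", 2L));
    execs.add(new Exec("flowB", 5L));
    for (Exec e : latestPerFlow(execs, 2)) {
      System.out.println(e.flowName + " " + e.executionId);
    }
  }
}
```

Note that the result size is bounded by `countPerFlow` times the number of flows in the group, not by `countPerFlow` alone, which is why the per-flow name is less ambiguous than plain `count`.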
As for the potential to time out from a large number of historical flow
executions (within the group): there is the `DatasetCleaner`, scheduled by
`KafkaJobStatusMonitor` to run in the background, which can keep historical
flow executions from accumulating forever. When the config property
named by `MultiCleanableDatasetFinder.datasetFinderClassKey` is set to
`org.apache.gobblin.data.management.retention.dataset.finder.TimeBasedDatasetStoreDatasetFinder`,
that finder will scan the `MysqlJobStatusStateStore` and offer its contents up
for retention/expiration.
So while there could still be lots of prior executions, there are
constraints on accumulation. I propose we gain operational experience to
determine whether there's a need to truncate or page through the results. In
practice we may not be in danger of timing out (but I agree with you that it's
worth keeping an eye out for).