phet commented on a change in pull request #3382:
URL: https://github.com/apache/gobblin/pull/3382#discussion_r704930592



##########
File path: gobblin-restli/gobblin-flow-config-service/gobblin-flow-config-service-server/src/main/java/org/apache/gobblin/service/FlowExecutionResourceHandler.java
##########
@@ -36,6 +36,14 @@
    */
   public List<FlowExecution> getLatestFlowExecution(PagingContext context, FlowId flowId, Integer count, String tag, String executionStatus);
 
+  /**
+   * Get latest {@link FlowExecution} for every flow in `flowGroup`
+   *
+   * NOTE: `executionStatus` param not provided yet, without justifying use case, due to complexity of interaction with `count`
+   * and resulting efficiency concern of performing across many flows sharing the single named group.
+   */
+  public List<FlowExecution> getLatestFlowGroupExecutions(PagingContext context, String flowGroup, Integer count, String tag);

Review comment:
   As we agreed offline, I've renamed `count` to `countPerFlow` (on the `latestFlowGroupExecutions` finder).
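
   To make the intended per-flow semantics concrete, here is a minimal sketch of what `countPerFlow` means. `Exec` and `CountPerFlowSketch` are hypothetical stand-ins for illustration (not Gobblin's `FlowExecution` or the actual handler code), and it assumes a larger execution id means a more recent execution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a flow execution; not Gobblin's FlowExecution.
final class Exec {
  final String flowName;
  final long executionId; // assumption: a larger id means a more recent execution

  Exec(String flowName, long executionId) {
    this.flowName = flowName;
    this.executionId = executionId;
  }
}

final class CountPerFlowSketch {
  // Keep at most countPerFlow of the most recent executions for EACH flow in the group,
  // rather than countPerFlow executions across the whole group.
  static Map<String, List<Exec>> latestPerFlow(List<Exec> groupExecutions, int countPerFlow) {
    List<Exec> newestFirst = new ArrayList<>(groupExecutions);
    newestFirst.sort(Comparator.comparingLong((Exec e) -> e.executionId).reversed());

    Map<String, List<Exec>> byFlow = new LinkedHashMap<>();
    for (Exec e : newestFirst) {
      List<Exec> kept = byFlow.computeIfAbsent(e.flowName, k -> new ArrayList<>());
      if (kept.size() < countPerFlow) {
        kept.add(e);
      }
    }
    return byFlow;
  }

  public static void main(String[] args) {
    List<Exec> execs = Arrays.asList(
        new Exec("flowA", 1L), new Exec("flowA", 2L), new Exec("flowA", 3L), new Exec("flowB", 4L));
    Map<String, List<Exec>> latest = latestPerFlow(execs, 2);
    for (Map.Entry<String, List<Exec>> entry : latest.entrySet()) {
      System.out.println(entry.getKey() + " kept " + entry.getValue().size() + " execution(s)");
    }
  }
}
```

   With `countPerFlow = 2`, a flow with three executions contributes its two newest and a flow with one execution contributes just that one, so the total result size grows with the number of flows in the group.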
   
   As for the potential to time out when a group has a large number of historical flow executions: the `DatasetCleaner`, which `KafkaJobStatusMonitor` schedules to run in the background, can keep historical flow executions from accumulating forever. When the config property named by `MultiCleanableDatasetFinder.datasetFinderClassKey` is set to `org.apache.gobblin.data.management.retention.dataset.finder.TimeBasedDatasetStoreDatasetFinder`, that finder scans the `MysqlJobStatusStateStore` and offers its contents up for retention/expiration.
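
   A minimal sketch of that setting, for reference. The property key is deliberately passed in as a parameter, since its literal string value is not spelled out here; `RetentionConfigSketch` and `withTimeBasedFinder` are hypothetical names, and only the finder class name comes from the comment above.

```java
import java.util.Properties;

// Hypothetical helper; only the finder class name below is taken from the discussion.
final class RetentionConfigSketch {
  static final String TIME_BASED_FINDER =
      "org.apache.gobblin.data.management.retention.dataset.finder.TimeBasedDatasetStoreDatasetFinder";

  // datasetFinderClassKey: the key named by MultiCleanableDatasetFinder.datasetFinderClassKey.
  // Its literal value is an input here, not something this sketch assumes.
  static Properties withTimeBasedFinder(String datasetFinderClassKey) {
    Properties props = new Properties();
    props.setProperty(datasetFinderClassKey, TIME_BASED_FINDER);
    return props;
  }
}
```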
   
   So while there could still be many prior executions, retention bounds how far they accumulate. I propose we gain operational experience before deciding whether we need to truncate or page through the results: in practice we may not be in danger of timing out (though I agree with you that it's worth keeping an eye on).
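
   If paging does turn out to be needed, it could look roughly like the sketch below. This is an illustration only, not the handler's implementation: `PagingSketch.page` is a hypothetical helper, and it assumes the `start` and `count` values would be taken from the `PagingContext` already passed to the finder.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper showing offset-based paging over an already-ordered result list.
final class PagingSketch {
  // Returns up to `count` elements beginning at index `start`; an out-of-range start yields an empty page.
  static <T> List<T> page(List<T> results, int start, int count) {
    if (start < 0 || count < 0) {
      throw new IllegalArgumentException("start and count must be non-negative");
    }
    int from = Math.min(start, results.size());
    // Widen to long before adding so a very large count cannot overflow.
    int to = (int) Math.min((long) from + count, (long) results.size());
    return new ArrayList<>(results.subList(from, to));
  }

  public static void main(String[] args) {
    List<Integer> executionIds = Arrays.asList(10, 20, 30, 40, 50);
    System.out.println(page(executionIds, 0, 2));
    System.out.println(page(executionIds, 4, 2));
  }
}
```

   Truncation is the special case `start = 0`; either way the full result set is still computed first, so this limits response size rather than query cost.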



