phet commented on code in PR #4089:
URL: https://github.com/apache/gobblin/pull/4089#discussion_r1900542681
##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/GenerateWorkUnitsImpl.java:
##########
@@ -150,26 +156,28 @@ public GenerateWorkUnitsResult
generateWorkUnits(Properties jobProps, EventSubmi
protected List<WorkUnit>
generateWorkUnitsForJobStateAndCollectCleanupPaths(JobState jobState,
EventSubmitterContext eventSubmitterContext, Closer closer,
Set<String> pathsToCleanUp)
throws ReflectiveOperationException {
+ // report (timer) metrics for "Work Discovery", *planning only* - NOT
including WU prep, like serialization, `DestinationDatasetHandlerService`ing,
etc.
+ // IMPORTANT: for accurate timing, SEPARATELY emit
`.createWorkPreparationTimer`, to record time prior to measuring the WU size
required for that one
Review Comment:
originally, in `AbstractJobLauncher` the "WU creation timer" measured only
the planning -
https://github.com/apache/gobblin/blob/7dbeebf7fecc748ea3ef90cc318214cf26ba5afa/gobblin-runtime/src/main/java/org/apache/gobblin/runtime/AbstractJobLauncher.java#L476
that is what's included in the `GaaSJobObservabilityEvent`.
the timer for WU prep happens a bit later -
https://github.com/apache/gobblin/blob/7dbeebf7fecc748ea3ef90cc318214cf26ba5afa/gobblin-runtime/src/main/java/org/apache/gobblin/runtime/AbstractJobLauncher.java#L549
so in this comment:
> "Work Discovery", *planning only* - NOT including WU prep, like
serialization, ...
I just meant that we're timing only planning/creation, not the preparation
such as serialization.
as for WU serialization, there is no existing, historical event strictly for
that. typically that only takes a long time when memory-constrained and
GC-bound. although we could consider adding a new event to time that, for
purposes of right-sizing, GC stats are more interesting than the duration it
happens to take. if anything, the former is what I'd prioritize.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]