phet commented on code in PR #3610: URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041819858
########## gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc: ########## @@ -0,0 +1,135 @@ +{ + "type": "record", + "name": "GaaSObservabilityEventExperimental", + "namespace" : "org.apache.gobblin.metrics", + "doc": "An experimental feature for GaaS to emit events during and after a job is executed.", + "fields": [ + { + "name": "timestamp", + "type": "long", + "doc": "Time at which event was created in millis" + }, { + "name" : "flowGroup", + "type" : "string", + "doc" : "Flow group for the GaaS flow", + "compliance" : "NONE" + }, { + "name" : "flowName", + "type" : "string", + "doc" : "Flow name for the GaaS flow", + "compliance" : "NONE" + }, { + "name" : "flowExecutionId", + "type" : "long", + "doc" : "Flow execution id for the GaaS flow", + "compliance" : "NONE" + }, { + "name": "jobOrchestratedTime", + "type": "long", + "doc": "Timestamp when the job was successfully sent to the job executor, -1 if it was unable to be sent." Review Comment: I prefer this to the boolean. and since now conceptually similar, I suggest moving adjacent to the `jobStartTime` field. also, as mentioned in reply to other comment, I prefer `null` to `-1`, so unless you have strong pref, I recommend that. reason is that gives notice when the null check has been forgotten, e.g. when read in scala, where `["null", "Foo"]` maps to `Option[Foo]` ########## gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc: ########## @@ -0,0 +1,162 @@ +{ + "type": "record", + "name": "GaaSObservabilityEventV0", + "namespace" : "org.apache.gobblin.metrics", + "fields": [ + { + "name": "timestamp", + "type": "long", + "doc": "Time at which event was created in millis" + }, { + "name" : "flowGroup", + "type" : "string", + "doc" : "Flow group for the GaaS flow", + "compliance" : "NONE" + }, { + "name" : "flowName", + "type" : "string", + "doc" : "Flow name for the GaaS flow", + "compliance" : "NONE" + }, { + "name" : "flowExecutionId", + "type" : "long", + "doc" : "Flow execution id for the GaaS flow", + "compliance" : "NONE" + }, { + "name": "jobSentToExecutor", + "type": "boolean", + "doc": "Whether or not this job was able to be sent to a job executor." + }, { + "name": "lastFlowModificationTime", + "type": "long", + "doc": "Timestamp in millis when the flow config was last modified" + }, { + "name" : "edgeId", + "type" : "string", + "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>", + "compliance" : "NONE" + }, { + "name": "jobName", + "type": "string", + "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs", + "compliance" : "NONE" + }, { + "name": "jobType", + "type": { + "type": "enum", + "name": "JobType", + "symbols": [ + "COPY", + "RETENTION", + "GOBBLIN" + ], + "symbolDocs": { + "COPY": "Gobblin distcp job", + "RETENTION": "Gobblin retention job", + "GOBBLIN": "Any Gobblin job" + } + }, + "doc": "Gobblin job type running on GaaS, determined by the compiled job template.", + "compliance": "NONE" + }, { + "name": "jobStatus", + "type": { + "type": "enum", + "name": "JobStatus", + "symbols": [ + "SUCCEEDED", + "FAILED", + "CANCELLED" + ], + "doc": "Final job status for this job in the GaaS flow", + "compliance": "NONE" + } + }, { + "name": "jobStartTime", + "type": "long", + "doc": "Start time of the job in millis", + "compliance": "NONE" + }, { + "name": "jobEndTime", + "type": "long", + "doc": "Finish time of the job in millis", + "compliance": "NONE" + }, { + "name": "proxyUser", + "type": "string", + "doc": "Proxy user (if applicable) that ran the Gobblin job", + "compliance": "NONE" + }, { + "name": "executorLink", + "type": "string", + "doc": "Link to where the job is running, currently limited to Azkaban", + "compliance": "NONE" + }, { + "name": "executorId", + "type": "string", Review Comment: great Q on compilation failure! probably both ought to be null. compilation error events should be purely for *scheduled flows* that previously compiled--failure at submission time should be an interactive error only, no event. nonetheless, as we need them, perhaps we're asking `jobStatus.FAILURE` to do too much: it's already execution failure as well as failure to send to executor--now compilation failure as well? it becomes a status *category*, a catch-all. better might be another status, like `INVALID` or more specific `COMPILATION_FAILURE` / `UNRESOLVED` (?). in fact, more I consider that, I'd seek to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (I dislike the overloaded terminology, since the user 'submits' a flow to gaas and then gaas 'submits' the job to the executor.) other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@gobblin.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org