dejii commented on code in PR #54101:
URL: https://github.com/apache/airflow/pull/54101#discussion_r2258809404
##########
providers/google/src/airflow/providers/google/cloud/triggers/cloud_storage_transfer_service.py:
##########
@@ -87,13 +88,9 @@ async def run(self) -> AsyncIterator[TriggerEvent]:
for job, operation in zip(jobs, operations):
if operation is None:
- yield TriggerEvent(
- {
- "status": "error",
- "message": f"Transfer job {job.name} has no latest operation.",
- }
- )
- return
+ self.log.info("Transfer job %s has no latest operation yet, waiting.", job.name)
+ all_operations_found = False
+ continue
Review Comment:
Hi, thanks for the feedback.
The TransferJob submitted by the operator uses a one-off schedule:
https://github.com/apache/airflow/blob/5318bd8a61c80d4ebc69d25da8dab164e511e8c6/providers/google/src/airflow/providers/google/cloud/transfers/s3_to_gcs.py#L296-L299
The REST API
[reference](https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs#Schedule)
also notes that the transfer run is triggered immediately in such cases:

> If startTimeOfDay is not specified: One-time transfers run immediately

I _think_ we should be able to rely on this API contract. Let me know your
thoughts.
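For reference, a one-off schedule of the kind described above can be sketched as a plain request body. This is an illustration only, not the operator's actual code: `one_off_schedule` is a hypothetical helper, and the field names follow the `Schedule` resource in the REST reference linked above.

```python
from datetime import date, datetime, timezone


def one_off_schedule(run_date: date) -> dict:
    """Build a Schedule body whose start and end dates are the same day.

    Hypothetical helper for illustration; not part of the provider.
    """
    day = {"year": run_date.year, "month": run_date.month, "day": run_date.day}
    # Identical start and end dates describe a one-time transfer. Leaving out
    # startTimeOfDay is the case the quoted note says runs immediately.
    return {"scheduleStartDate": dict(day), "scheduleEndDate": dict(day)}


schedule = one_off_schedule(datetime.now(timezone.utc).date())
```

Because `startTimeOfDay` is absent, this is the case that the reference describes as running immediately.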
In my experience with internal workloads, GCP reliably kicks off the run,
typically within 10 seconds. That said, if the job never starts for some
unknown reason, no operation will be created and the trigger could wait
indefinitely. I haven’t encountered that yet, but I’m happy to make the changes.
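If we want to guard against the indefinite wait, one option is a deadline on the "no operation yet" state. Below is a minimal sketch, assuming a hypothetical `fetch_latest_operation` coroutine that returns the operation name or `None`. The function name, the `timeout` parameter, and the `"found"` status are all made up for illustration and are not part of the trigger today.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


async def wait_for_operation(
    fetch_latest_operation: Callable[[], Awaitable[Optional[str]]],
    poll_interval: float = 10.0,
    timeout: float = 600.0,
) -> dict:
    """Poll until an operation name appears or the deadline passes.

    Hypothetical sketch; `fetch_latest_operation` stands in for whatever
    lookup the trigger performs for a job's latest operation.
    """
    deadline = time.monotonic() + timeout
    while True:
        operation = await fetch_latest_operation()
        if operation is not None:
            return {"status": "found", "operation": operation}
        if time.monotonic() >= deadline:
            # Give up instead of waiting forever for a run that never started.
            return {
                "status": "error",
                "message": "No operation was created before the timeout.",
            }
        await asyncio.sleep(poll_interval)
```

This keeps the "wait and retry" behaviour from the diff for the common case, and only reports an error once the deadline has passed.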