dstandish commented on code in PR #27597:
URL: https://github.com/apache/airflow/pull/27597#discussion_r1019517527
##########
airflow/sensors/base.py:
##########
@@ -122,6 +135,7 @@ def __init__(
self.timeout = timeout
self.mode = mode
self.exponential_backoff = exponential_backoff
+ self.max_wait = max_wait if max_wait is None else
coerce_timedelta(max_wait, key="max_retry_delay")
Review Comment:
i would put "if wait is none" logic in coerce timedelta so that ... if it is
none, it will return none, wdyt?
##########
airflow/sensors/base.py:
##########
@@ -72,6 +73,14 @@ def __bool__(self) -> bool:
return self.is_done
+def coerce_timedelta(value: float | timedelta, *, key: str) -> timedelta:
+ if isinstance(value, timedelta):
+ return value
+ if isinstance(value, Number):
+ return timedelta(seconds=value)
+ raise ValueError(f"{key} should be timedelta object or a number of
seconds")
Review Comment:
i think it's ok to fail here (though i'd add `if value is None: return None`
at the top)
BUT, i think it would be best to help the user by failing fast i.e. check
the type in operator init and raise accordingly (rather than waiting until code
gets deployed and task runs)
##########
airflow/sensors/base.py:
##########
@@ -72,6 +73,14 @@ def __bool__(self) -> bool:
return self.is_done
+def coerce_timedelta(value: float | timedelta, *, key: str) -> timedelta:
+ if isinstance(value, timedelta):
+ return value
+ if isinstance(value, Number):
+ return timedelta(seconds=value)
+ raise ValueError(f"{key} should be timedelta object or a number of
seconds")
Review Comment:
the other benefit is then... if we're failing fast... then we don't really
need this `key` argument which, at call site, is a little confusing.
if you validate the param in `__init__` then you can immediately raise with
a helpful message.
##########
airflow/sensors/base.py:
##########
@@ -122,6 +135,7 @@ def __init__(
self.timeout = timeout
self.mode = mode
self.exponential_backoff = exponential_backoff
+ self.max_wait = max_wait if max_wait is None else
coerce_timedelta(max_wait, key="max_retry_delay")
Review Comment:
i would get rid of `key=...` it's confusing and not nec esp if you validate
type in init
##########
airflow/sensors/base.py:
##########
@@ -72,6 +73,14 @@ def __bool__(self) -> bool:
return self.is_done
+def coerce_timedelta(value: float | timedelta, *, key: str) -> timedelta:
Review Comment:
let's make it private
##########
tests/sensors/test_base.py:
##########
@@ -509,6 +509,40 @@ def run_duration():
assert interval2 >= sensor.poke_interval
assert interval2 > interval1
+ def test_sensor_with_exponential_backoff_on_and_max_wait(self):
+
+ sensor = DummySensor(
+ task_id=SENSOR_OP,
+ return_value=None,
+ poke_interval=10,
+ timeout=60,
+ exponential_backoff=True,
+ max_wait=timedelta(seconds=30),
+ )
+
+ with patch("airflow.utils.timezone.utcnow") as mock_utctime:
+ mock_utctime.return_value = DEFAULT_DATE
+
+ started_at = timezone.utcnow() - timedelta(seconds=10)
+
+ def run_duration():
+ return (timezone.utcnow - started_at).total_seconds()
+
+ interval1 = sensor._get_next_poke_interval(started_at,
run_duration, 1)
+ interval2 = sensor._get_next_poke_interval(started_at,
run_duration, 2)
+ interval3 = sensor._get_next_poke_interval(started_at,
run_duration, 3)
+ interval4 = sensor._get_next_poke_interval(started_at,
run_duration, 4)
+
+ assert interval1 >= 0
+ assert interval1 <= sensor.poke_interval
+ assert interval2 >= sensor.poke_interval
+ assert interval2 > interval1
+ assert interval2 <= sensor.max_wait.total_seconds()
+ assert interval3 >= interval2
+ assert interval3 <= sensor.max_wait.total_seconds()
+ assert interval4 >= interval3
+ assert interval4 <= sensor.max_wait.total_seconds()
Review Comment:
```suggestion
for idx, expected in enumerate([2, 6, 13, 30, 30, 30, 30, 30]):
assert sensor._get_next_poke_interval(started_at,
run_duration, idx) == expected
```
i don't _think_ this is fragile... but it's def more readable
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]