Vitor-Avila commented on PR #28176:
URL: https://github.com/apache/superset/pull/28176#issuecomment-2072169549

   > I'm a bit worried about manipulating the crontab as a string to figure out 
the shortest interval — it can miss edge cases, and it forces constraints to 
the user due to the implementation details (in this case, the interval has to 
be in minutes, and has a maximum value of 60, which are artificial 
requirements).
   > 
   > An alternative solution would be to allow any minimum interval between 
reports:
   > 
   > ```python
   > ALERT_MINIMUM_INTERVAL_MINUTES = timedelta(seconds=90)
   > ```
   > 
   > Or, as it seems to be more common for intervals in `config.py`:
   > 
   > ```python
   > ALERT_MINIMUM_INTERVAL_MINUTES = int(timedelta(seconds=90).total_seconds())
   > ```
   > 
   > Then you can compute a few execution times programmatically and see if it 
violates the config (untested):
   > 
   > ```python
   > def is_valid_schedule(cron_schedule: str, config_value: int) -> bool:
   >     it = croniter(cron_schedule)
   >     previous = next(it)
   >     for _ in range(1000):  # how many? I have no idea...
   >         current = next(it)
   >         interval, previous = current - previous, current
   >         if interval.total_seconds() < config_value:
   >             return False
   > 
   >     return True
   > ```
   
   hey @betodealmeida that was the route I was initially going for too, but 
"how many repetitions?" (in the `for _ in range(x)`) is what made me think of 
another alternative. There's a lot of flexibility in the configuration (like 
the user could make it every 3 minutes until minute 58, but then re-execute on 
59, select multiple days with a different interval between each in a month, 
etc). 
   
   I can see benefits and concerns in both routes:
   * **PR approach:** Biggest advantage here is that its more limited scope can 
validate all executions, since it only cares to the minute piece. Biggest 
disadvantage is that it's more restrictive (limit can't be higher than 60 
minutes), and that it's handling a cron schedule as a string.
   
   * **Implementing cron validation:** No limit on the configuration side (user 
could create any interval needed like days, hours, etc). The solution might not 
iterate on all executions.
   
   Since I believe most Orgs would like to limit this in the minutes range 
(like at least 5/10 minutes interval), I thought it made more sense to go with 
the first route. But I'm happy to migrate to the second approach -- we could 
start with a loop with `1440` as the repetition count (amount of minutes in a 
day) and go from there.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org
For additional commands, e-mail: notifications-h...@superset.apache.org

Reply via email to