Hello everyone,

TL;DR: I would like to propose that we start treating MySQL 5.7 in a
special way (and add a warning about the upcoming end of support for the
older DB versions that users run). Oracle still supports MySQL 5.7 for more
than a year (until October 2023), but MySQL 5.7 has some behaviours that
make it less stable and less scalable, and that cause some deadlock problems.

# Why do I think it is needed?

We have consistent reports (not a huge number, but recurring) from our users
about deadlocks on MySQL. Some of them we have fixed, some of them are a bit
mysterious, but some of them are there because we rely a bit too much on
"SKIP_LOCKED" functionality in some cases (mini-scheduler) or because users,
despite the warnings, try to run multiple schedulers on 5.7:

Example: at least SOME of the deadlocks reported here are almost certainly
the result of the mini-scheduler running in parallel with the main scheduler:
https://github.com/apache/airflow/issues/16982
The problem is that with 5.7 we silently skip "SKIP_LOCKED" and this simply
leads to occasional deadlocks.

# What do I propose?

My proposal would be to implement a few "guards" and "warnings" to prevent
those cases from happening (rather than lose our time on false reports).
There are some legitimate deadlocks we should track down, but if we see a
number of deadlocks which come from the 5.7 limitations, it's a bit hard to
weed them out (not every user will share their version, and users often
simply piggyback by writing "I have the same issue" without telling us their
version). Also, I think it would be better for Airflow's reputation if users
did not experience the "Deadlock" issues in the first place. We already have
an easy way to detect whether SKIP_LOCKED is supported, and we use it.
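
To make the detection idea concrete, here is a minimal sketch of a
version-based check. It is not the check Airflow actually uses; the function
name and its arguments are hypothetical. It assumes the caller already knows
the dialect name and the server version as a tuple, and it relies on SKIP
LOCKED being available from MySQL 8.0.1 and PostgreSQL 9.5 onwards. MariaDB
is deliberately ignored here and would need separate handling.

```python
from typing import Tuple


def supports_skip_locked(dialect_name: str, server_version: Tuple[int, ...]) -> bool:
    """Hypothetical helper: guess SKIP LOCKED support from dialect and version.

    Assumption: `server_version` is a tuple such as (5, 7, 35) or (8, 0, 27).
    MariaDB (which also speaks the MySQL protocol) is not handled here.
    """
    if dialect_name == "mysql":
        # SELECT ... FOR UPDATE SKIP LOCKED was added in MySQL 8.0.1.
        return server_version >= (8, 0, 1)
    if dialect_name == "postgresql":
        # PostgreSQL has supported SKIP LOCKED since 9.5.
        return server_version >= (9, 5)
    # Be conservative for anything else.
    return False
```

With this sketch, `supports_skip_locked("mysql", (5, 7, 35))` is false and
`supports_skip_locked("mysql", (8, 0, 27))` is true, which is the
distinction the guards below would depend on.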

The proposals:

1) Create a lock at scheduler startup that prevents another scheduler from
starting. This can be a global lock (similar to what we do in migrations).
We can test whether the lock is set and fail the second scheduler if one is
already running, with an appropriate message ("please upgrade to MySQL 8
if you want to use multiple schedulers"). Should be very easy.

2) Hard-disable the mini-scheduler (no matter whether it is enabled or not)
for MySQL 5.7. Should be easy.

3) Check if we are running MySQL 5.7 and warn the user (both in logs and in
the UI) that the end of life of the database they use is coming and that
they should consider an upgrade. We could do this in general (Postgres 10
end of life is coming much sooner: November 2022), so that could be a
generally useful notification for everyone, I think. We could also add some
extra information in the case of MySQL 5.7: that it does not support all the
features of Airflow and that it's better to upgrade anyway. Because of that,
I'd treat MySQL 5.7 a bit more strictly and show such a deprecation notice a
year before end of life (for Postgres, I think ~6 months should be fine).
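
The three proposals above could be sketched roughly as follows. This is only
an illustration under stated assumptions, not existing Airflow code: all
function names, the lock name, and the `run_scalar` callable are
hypothetical. The lock uses MySQL's `GET_LOCK`, which returns 1 when the lock
is obtained and 0 on timeout, and which is held only as long as the
connection that took it stays open. The end-of-life dates are the months
mentioned in this email, with the day of the month as a placeholder.

```python
from datetime import date, timedelta
from typing import Callable, Optional

# Month-level end-of-life dates from this thread; the day is a placeholder.
EOL_DATES = {
    ("mysql", "5.7"): date(2023, 10, 1),
    ("postgresql", "10"): date(2022, 11, 1),
}
# Notice periods suggested above: a year for MySQL 5.7, ~6 months otherwise.
NOTICE_PERIODS = {("mysql", "5.7"): timedelta(days=365)}
DEFAULT_NOTICE = timedelta(days=183)

SCHEDULER_LOCK_NAME = "airflow_single_scheduler"  # hypothetical lock name


def try_scheduler_lock(run_scalar: Callable[[str], Optional[int]]) -> bool:
    """Proposal 1: return True only if this process obtained the global lock.

    `run_scalar` is a hypothetical callable that runs one SQL statement on a
    connection kept open for the scheduler's lifetime and returns the first
    column of the first row. A timeout of 0 makes GET_LOCK return at once.
    """
    return run_scalar(f"SELECT GET_LOCK('{SCHEDULER_LOCK_NAME}', 0)") == 1


def mini_scheduler_enabled(configured: bool, skip_locked_supported: bool) -> bool:
    """Proposal 2: ignore the user's setting when SKIP LOCKED is unavailable."""
    return configured and skip_locked_supported


def eol_warning(db: str, version: str, today: date) -> Optional[str]:
    """Proposal 3: return a warning once `today` is inside the notice period."""
    key = (db, version)
    eol = EOL_DATES.get(key)
    if eol is None:
        return None  # no known end-of-life date for this database version
    if today < eol - NOTICE_PERIODS.get(key, DEFAULT_NOTICE):
        return None  # too early to bother the user
    return f"{db} {version} reaches end of life around {eol:%B %Y}; please plan an upgrade."
```

A second scheduler on MySQL 5.7 would call `try_scheduler_lock(...)` at
startup and exit with the "please upgrade to MySQL 8" message when it returns
false. The same warning text from `eol_warning` could be written to the logs
and shown in the UI.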

WDYT?

J.
