potiuk commented on issue #23361:
URL: https://github.com/apache/airflow/issues/23361#issuecomment-1193861793

   > I do not actually know about is it supported by other DB engine and is it 
has exactly the same behaviour.
   
   @Taragolis would be worth checking. The DagRun lock `SELECT FOR UPDATE SKIP 
LOCKED` is very much the "Key" (pun intended) to make multiple schedulers work 
and it also (as you can see) spilled a bit to the mini-scheduler and "task run" 
in the form of just `SELECT FOR UPDATE`. The `SELECT FOR UPDATE SKIP LOCKED` is 
precisely the mechanism that allows multiple schedulers to run in parallel with 
basically no serialization and no "accidental state overrides". 
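
To make the "skip locked" idea concrete, here is a minimal pure-Python sketch of the semantics. It does not talk to a database and it is not Airflow's scheduler code: `DagRunRow`, `select_for_update_skip_locked`, `commit` and `demo` are hypothetical names made up for illustration, and a `threading.Lock` stands in for the row-level lock the database would hold.

```python
import threading


class DagRunRow:
    """Hypothetical stand-in for one dag_run row; the Lock plays the row lock."""

    def __init__(self, run_id):
        self.run_id = run_id
        self.row_lock = threading.Lock()


def select_for_update_skip_locked(rows, limit):
    """Lock and return up to `limit` rows, skipping rows someone else holds."""
    claimed = []
    for row in rows:
        if len(claimed) == limit:
            break
        # Non-blocking acquire: a locked row is skipped, never waited on.
        if row.row_lock.acquire(blocking=False):
            claimed.append(row)
    return claimed


def commit(claimed):
    """Release the row locks, as ending the transaction would."""
    for row in claimed:
        row.row_lock.release()


def demo():
    table = [DagRunRow(i) for i in range(1, 5)]
    scheduler_a = select_for_update_skip_locked(table, limit=2)
    # Scheduler B queries while A's "transaction" is still open.
    scheduler_b = select_for_update_skip_locked(table, limit=2)
    ids = ([r.run_id for r in scheduler_a], [r.run_id for r in scheduler_b])
    commit(scheduler_a)
    commit(scheduler_b)
    return ids
```

In this sketch the second caller neither blocks nor receives rows the first caller already holds, so the two work on disjoint sets of DagRuns. A plain `SELECT FOR UPDATE` would correspond to a blocking acquire instead: the second caller would wait for the first to commit.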
   
   And we need to make sure that it works for both MySQL 8 and Postgres, because 
this is our "baseline". We cannot rely on Postgres-only features, though we 
would love to. I started some threads in the past, mostly along the lines of 
"we are fed up with MySQL, let's dump it". See for example this 
"Elephant in the Room" thread at the devlist 
https://lists.apache.org/thread/dp78j0ssyhx62008lbtblrc856nbmlfb . The answer 
so far and the wisdom of the crowd is "No, as much as we would like to, we 
cannot get rid of MySQL". And if you see the results of our Survey 
https://airflow.apache.org/blog/airflow-survey-2022/ - while Postgres is by 
far the strongest (also because it is now the only supported DB for ALL managed 
Airflow services), there are still ~ 20% of people who use MySQL (or MariaDB, 
but we finally decided that we explicitly exclude MariaDB from supported 
databases and actively encourage people to migrate out if they use it). 
   
   So while I would love to start the discussion with "Can we use this Postgres 
feature?", when we think about the product development, the question is "Is 
this feature supported in both Postgres AND MySQL 8+?". If not, we won't even 
discuss it, because if we start actively using a Postgres-only feature to 
optimize stuff, we are going to impair our MySQL users and eventually we will 
implement things that only work for Postgres, and behaviours that will 
differ between Postgres and MySQL, and we certainly do not want that.
   
   I looked (very briefly) to see if a similar feature exists in MySQL, and it 
seems not, but I did not look too much. But if you think it is worth 
considering, starting with a deeper investigation and justifying both the 
benefits and the cross-db portability is something I advise you to do. 
   
   I think your question is phrased a bit wrongly:
   
   >  is it necessary use FOR UPDATE lock rather than FOR NO KEY UPDATE ?
   
   It should rather be:
   
   "I see that we can use that new feature NO KEY in Postgres and also the 
equivalent in MySQL. It has those and those benefits and we can make it 
cross-platform - doc here, doc here. Is this a good enough reason to switch to 
it?"
   
   
   

