Hi community


I have a question/suggestion about the various SQL database hooks and
operators.



Examples:

- MsSqlHook (Implements DbApiHook using pymssql)

- MySqlHook (Implements DbApiHook using MySQLdb)

- PostgresHook (Implements DbApiHook using psycopg2)



These hooks tend to have convenience methods for more complex DB
operations (e.g. bulk loading), which is useful.



However, they also have a downside.

First I need to figure out which operator and hook to use, and which driver
it relies on. Then I have to review the source code to understand what it
does and what it doesn't do. Finally (at my particular client site) I have
to request that the driver library be whitelisted and installed, and then
test it for my use case.



This can be quite time-consuming when working with several different DBs.



Does Airflow have a general-purpose SQL hook + operator?

I didn't see one but perhaps I'm missing it :)



I was considering implementing something based on SQLAlchemy.
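
To make the idea concrete, here is a rough sketch of what I have in mind.
GenericSqlHook and its methods are hypothetical names, not an existing
Airflow class, and a real hook would presumably build the URL from an
Airflow connection rather than take it directly. The example uses an
in-memory SQLite database so it needs no extra driver.

```python
from sqlalchemy import create_engine, text


class GenericSqlHook:
    """Hypothetical generic hook; not part of Airflow."""

    def __init__(self, url):
        # The dialect prefix of the URL selects the database and driver.
        self.engine = create_engine(url)

    def run(self, sql, parameters=None):
        # engine.begin() opens a transaction and commits it on success.
        with self.engine.begin() as conn:
            conn.execute(text(sql), parameters or {})

    def get_records(self, sql, parameters=None):
        # Return the result rows as a list of plain tuples.
        with self.engine.connect() as conn:
            result = conn.execute(text(sql), parameters or {})
            return [tuple(row) for row in result]


# Usage against an in-memory SQLite database (assumption: demo only).
hook = GenericSqlHook("sqlite://")
hook.run("CREATE TABLE t (id INTEGER, name TEXT)")
hook.run("INSERT INTO t (id, name) VALUES (:id, :name)", {"id": 1, "name": "a"})
rows = hook.get_records("SELECT id, name FROM t WHERE id = :id", {"id": 1})
```

Switching to another database would then only mean changing the URL,
provided the matching driver is installed.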



Advantages:

- Range of database dialects

- Simple to use, especially for prototyping

- Consistent - no DB-specific code

- No extra dependencies - Airflow already requires SQLAlchemy

- SQLAlchemy features might be useful, e.g. function-based querying
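
By function-based querying I mean something like the sketch below, which
builds statements from Python objects instead of SQL strings. The table and
column names are made up for illustration, and it again runs against an
in-memory SQLite database.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

metadata = MetaData()

# Hypothetical table definition, used only for this example.
users = Table(
    "users",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
)

engine = create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    # Passing a list of dicts inserts several rows in one call.
    conn.execute(users.insert(), [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}])
    # The WHERE clause is composed from column objects, not a string.
    query = users.select().where(users.c.id > 1)
    matches = [tuple(row) for row in conn.execute(query)]
```

SQLAlchemy renders the statement for whichever dialect the engine uses, so
the same Python code should work across databases.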



Disadvantages:

- Yet another way to do the same thing (it overlaps with the existing DB hooks)

- Possibly less efficient than the DB-specific hooks



What are your thoughts?



Kind regards
