This is an automated email from the ASF dual-hosted git repository.

eladkal pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
     new ccbd536d28f Added ADR document describing why the notion of dialects 
is introduced (#45456)
ccbd536d28f is described below

commit ccbd536d28fa6d5285d59b50915d832e02ba141d
Author: David Blain <i...@dabla.be>
AuthorDate: Fri Jan 24 09:45:27 2025 +0100

    Added ADR document describing why the notion of dialects is introduced 
(#45456)
    
    * docs: Added a markdown ADR document describing why the notion of dialects 
was introduced in the common sql provider
    
    * docs: Added reference to the dialects in the Airflow common sql provider 
documentation
    
    * docs: Reformatted the ADR
    
    * refactor: Added dialects reference in mssql and postgres provider
    
    * refactor: Duplicated dialects.rst to mssql and postgres provider until we 
sort out how to redirect to the common sql provider one
    
    ---------
    
    Co-authored-by: David Blain <david.bl...@infrabel.be>
---
 .../dialects.rst                                   | 55 +++++++++++++++++++
 docs/apache-airflow-providers-common-sql/index.rst |  1 +
 .../dialects.rst                                   | 55 +++++++++++++++++++
 .../index.rst                                      |  1 +
 .../apache-airflow-providers-postgres/dialects.rst | 55 +++++++++++++++++++
 docs/apache-airflow-providers-postgres/index.rst   |  1 +
 ...03-introduce-notion-of-dialects-in-dbapihook.md | 61 ++++++++++++++++++++++
 7 files changed, 229 insertions(+)

diff --git a/docs/apache-airflow-providers-common-sql/dialects.rst 
b/docs/apache-airflow-providers-common-sql/dialects.rst
new file mode 100644
index 00000000000..22feedc9454
--- /dev/null
+++ b/docs/apache-airflow-providers-common-sql/dialects.rst
@@ -0,0 +1,55 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+SQL Dialects
+=============
+
+The :class:`~airflow.providers.common.sql.dialects.dialect.Dialect` offers an abstraction layer between the
+:class:`~airflow.providers.common.sql.hooks.sql.DbApiHook` implementation and the database.  For some databases,
+multiple connection types are available, such as native, ODBC or JDBC.  As the :class:`~airflow.providers.odbc.hooks.odbc.OdbcHook`
+and the :class:`~airflow.providers.jdbc.hooks.jdbc.JdbcHook` are generic hooks which allow you to interact with any
+database that has a driver for it, an abstraction layer was needed that lets us run specialized queries
+depending on the database we connect to; that's why dialects were introduced.
+
+The default :class:`~airflow.providers.common.sql.dialects.dialect.Dialect` class offers the following operations,
+which use SQLAlchemy under the hood but can be overridden with specialized implementations per database:
+
+- ``placeholder`` specifies the database-specific placeholder used in prepared statements (default: ``%s``);
+- ``inspector`` returns the SQLAlchemy inspector, which allows us to retrieve database metadata;
+- ``extract_schema_from_table`` allows us to extract the schema name from a string;
+- ``get_column_names`` returns the column names for the given table and (optional) schema using the SQLAlchemy inspector;
+- ``get_primary_keys`` returns the primary keys for the given table and (optional) schema using the SQLAlchemy inspector;
+- ``get_target_fields`` returns the column names that aren't identity or auto-incremented columns; these are used by the ``insert_rows`` method of the :class:`~airflow.providers.common.sql.hooks.sql.DbApiHook` if the ``target_fields`` parameter wasn't specified and the Airflow property ``core.dbapihook_resolve_target_fields`` is set to True (default: False);
+- ``reserved_words`` returns the reserved SQL words for the target database using the SQLAlchemy inspector;
+- ``generate_insert_sql`` generates the insert SQL statement for the target database;
+- ``generate_replace_sql`` generates the upsert (replace) SQL statement for the target database.
+
+At the moment, only 3 dialects are available:
+
+- ``default`` :class:`~airflow.providers.common.sql.dialects.dialect.Dialect` reuses the generic functionality that was already available in the :class:`~airflow.providers.common.sql.hooks.sql.DbApiHook`;
+- ``mssql`` :class:`~airflow.providers.microsoft.mssql.dialects.mssql.MsSqlDialect` is specialized for Microsoft SQL Server;
+- ``postgresql`` :class:`~airflow.providers.postgres.dialects.postgres.PostgresDialect` is specialized for PostgreSQL.
+
+The dialect to be used is derived from the connection string, but this isn't always possible.  In that case,
+the dialect name can be specified through the extra options of the connection:
+
+.. code-block::
+
+  dialect_name: 'mssql'
+
+If no specific dialect is available for a database, the default one is used; the same happens when a
+non-existent dialect name is specified.
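A minimal sketch of how the extra option above can be combined with a generic hook. The connection id, URI, table and rows are made up for illustration, and it assumes the usual Airflow behaviour where URI query parameters end up in the connection extras; only ``dialect_name`` and ``insert_rows(..., replace=True)`` come from the documentation in this commit:

.. code-block:: python

  import os

  from airflow.providers.odbc.hooks.odbc import OdbcHook

  # Hypothetical ODBC connection to SQL Server; the "dialect_name" query
  # parameter lands in the connection extras and forces the mssql dialect,
  # since it can't always be derived from a generic ODBC connection string.
  os.environ["AIRFLOW_CONN_MSSQL_ODBC_EXAMPLE"] = (
      "odbc://user:pass@sqlserver-host:1433/mydb?dialect_name=mssql"
  )

  hook = OdbcHook(odbc_conn_id="mssql_odbc_example")

  # With the mssql dialect resolved, replace=True lets the generic hook emit a
  # SQL Server specific upsert (merge) statement even though the hook itself
  # is database agnostic.
  hook.insert_rows(
      table="dbo.users",  # hypothetical table
      rows=[(1, "Alice"), (2, "Bob")],
      target_fields=["id", "name"],
      replace=True,
  )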
diff --git a/docs/apache-airflow-providers-common-sql/index.rst 
b/docs/apache-airflow-providers-common-sql/index.rst
index b24ee78510c..6659b1affbc 100644
--- a/docs/apache-airflow-providers-common-sql/index.rst
+++ b/docs/apache-airflow-providers-common-sql/index.rst
@@ -43,6 +43,7 @@
 
     Python API <_api/airflow/providers/common/sql/index>
     Supported Database Types </supported-database-types>
+    Dialects <dialects>
 
 .. toctree::
     :hidden:
diff --git a/docs/apache-airflow-providers-microsoft-mssql/dialects.rst 
b/docs/apache-airflow-providers-microsoft-mssql/dialects.rst
new file mode 100644
index 00000000000..22feedc9454
--- /dev/null
+++ b/docs/apache-airflow-providers-microsoft-mssql/dialects.rst
@@ -0,0 +1,55 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+SQL Dialects
+=============
+
+The :class:`~airflow.providers.common.sql.dialects.dialect.Dialect` offers an abstraction layer between the
+:class:`~airflow.providers.common.sql.hooks.sql.DbApiHook` implementation and the database.  For some databases,
+multiple connection types are available, such as native, ODBC or JDBC.  As the :class:`~airflow.providers.odbc.hooks.odbc.OdbcHook`
+and the :class:`~airflow.providers.jdbc.hooks.jdbc.JdbcHook` are generic hooks which allow you to interact with any
+database that has a driver for it, an abstraction layer was needed that lets us run specialized queries
+depending on the database we connect to; that's why dialects were introduced.
+
+The default :class:`~airflow.providers.common.sql.dialects.dialect.Dialect` class offers the following operations,
+which use SQLAlchemy under the hood but can be overridden with specialized implementations per database:
+
+- ``placeholder`` specifies the database-specific placeholder used in prepared statements (default: ``%s``);
+- ``inspector`` returns the SQLAlchemy inspector, which allows us to retrieve database metadata;
+- ``extract_schema_from_table`` allows us to extract the schema name from a string;
+- ``get_column_names`` returns the column names for the given table and (optional) schema using the SQLAlchemy inspector;
+- ``get_primary_keys`` returns the primary keys for the given table and (optional) schema using the SQLAlchemy inspector;
+- ``get_target_fields`` returns the column names that aren't identity or auto-incremented columns; these are used by the ``insert_rows`` method of the :class:`~airflow.providers.common.sql.hooks.sql.DbApiHook` if the ``target_fields`` parameter wasn't specified and the Airflow property ``core.dbapihook_resolve_target_fields`` is set to True (default: False);
+- ``reserved_words`` returns the reserved SQL words for the target database using the SQLAlchemy inspector;
+- ``generate_insert_sql`` generates the insert SQL statement for the target database;
+- ``generate_replace_sql`` generates the upsert (replace) SQL statement for the target database.
+
+At the moment, only 3 dialects are available:
+
+- ``default`` :class:`~airflow.providers.common.sql.dialects.dialect.Dialect` reuses the generic functionality that was already available in the :class:`~airflow.providers.common.sql.hooks.sql.DbApiHook`;
+- ``mssql`` :class:`~airflow.providers.microsoft.mssql.dialects.mssql.MsSqlDialect` is specialized for Microsoft SQL Server;
+- ``postgresql`` :class:`~airflow.providers.postgres.dialects.postgres.PostgresDialect` is specialized for PostgreSQL.
+
+The dialect to be used is derived from the connection string, but this isn't always possible.  In that case,
+the dialect name can be specified through the extra options of the connection:
+
+.. code-block::
+
+  dialect_name: 'mssql'
+
+If no specific dialect is available for a database, the default one is used; the same happens when a
+non-existent dialect name is specified.
diff --git a/docs/apache-airflow-providers-microsoft-mssql/index.rst 
b/docs/apache-airflow-providers-microsoft-mssql/index.rst
index 62089e86545..b574f05bef9 100644
--- a/docs/apache-airflow-providers-microsoft-mssql/index.rst
+++ b/docs/apache-airflow-providers-microsoft-mssql/index.rst
@@ -43,6 +43,7 @@
     :caption: References
 
     Python API <_api/airflow/providers/microsoft/mssql/index>
+    Dialects <dialects>
 
 .. toctree::
     :hidden:
diff --git a/docs/apache-airflow-providers-postgres/dialects.rst 
b/docs/apache-airflow-providers-postgres/dialects.rst
new file mode 100644
index 00000000000..22feedc9454
--- /dev/null
+++ b/docs/apache-airflow-providers-postgres/dialects.rst
@@ -0,0 +1,55 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+SQL Dialects
+=============
+
+The :class:`~airflow.providers.common.sql.dialects.dialect.Dialect` offers an abstraction layer between the
+:class:`~airflow.providers.common.sql.hooks.sql.DbApiHook` implementation and the database.  For some databases,
+multiple connection types are available, such as native, ODBC or JDBC.  As the :class:`~airflow.providers.odbc.hooks.odbc.OdbcHook`
+and the :class:`~airflow.providers.jdbc.hooks.jdbc.JdbcHook` are generic hooks which allow you to interact with any
+database that has a driver for it, an abstraction layer was needed that lets us run specialized queries
+depending on the database we connect to; that's why dialects were introduced.
+
+The default :class:`~airflow.providers.common.sql.dialects.dialect.Dialect` class offers the following operations,
+which use SQLAlchemy under the hood but can be overridden with specialized implementations per database:
+
+- ``placeholder`` specifies the database-specific placeholder used in prepared statements (default: ``%s``);
+- ``inspector`` returns the SQLAlchemy inspector, which allows us to retrieve database metadata;
+- ``extract_schema_from_table`` allows us to extract the schema name from a string;
+- ``get_column_names`` returns the column names for the given table and (optional) schema using the SQLAlchemy inspector;
+- ``get_primary_keys`` returns the primary keys for the given table and (optional) schema using the SQLAlchemy inspector;
+- ``get_target_fields`` returns the column names that aren't identity or auto-incremented columns; these are used by the ``insert_rows`` method of the :class:`~airflow.providers.common.sql.hooks.sql.DbApiHook` if the ``target_fields`` parameter wasn't specified and the Airflow property ``core.dbapihook_resolve_target_fields`` is set to True (default: False);
+- ``reserved_words`` returns the reserved SQL words for the target database using the SQLAlchemy inspector;
+- ``generate_insert_sql`` generates the insert SQL statement for the target database;
+- ``generate_replace_sql`` generates the upsert (replace) SQL statement for the target database.
+
+At the moment, only 3 dialects are available:
+
+- ``default`` :class:`~airflow.providers.common.sql.dialects.dialect.Dialect` reuses the generic functionality that was already available in the :class:`~airflow.providers.common.sql.hooks.sql.DbApiHook`;
+- ``mssql`` :class:`~airflow.providers.microsoft.mssql.dialects.mssql.MsSqlDialect` is specialized for Microsoft SQL Server;
+- ``postgresql`` :class:`~airflow.providers.postgres.dialects.postgres.PostgresDialect` is specialized for PostgreSQL.
+
+The dialect to be used is derived from the connection string, but this isn't always possible.  In that case,
+the dialect name can be specified through the extra options of the connection:
+
+.. code-block::
+
+  dialect_name: 'mssql'
+
+If no specific dialect is available for a database, the default one is used; the same happens when a
+non-existent dialect name is specified.
diff --git a/docs/apache-airflow-providers-postgres/index.rst 
b/docs/apache-airflow-providers-postgres/index.rst
index 0979b194b18..b4d698ceb72 100644
--- a/docs/apache-airflow-providers-postgres/index.rst
+++ b/docs/apache-airflow-providers-postgres/index.rst
@@ -43,6 +43,7 @@
     :caption: References
 
     Python API <_api/airflow/providers/postgres/index>
+    Dialects <dialects>
 
 .. toctree::
     :hidden:
diff --git 
a/providers/src/airflow/providers/common/sql/doc/adr/0003-introduce-notion-of-dialects-in-dbapihook.md
 
b/providers/src/airflow/providers/common/sql/doc/adr/0003-introduce-notion-of-dialects-in-dbapihook.md
new file mode 100644
index 00000000000..09df3145e4e
--- /dev/null
+++ 
b/providers/src/airflow/providers/common/sql/doc/adr/0003-introduce-notion-of-dialects-in-dbapihook.md
@@ -0,0 +1,61 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# 3. Introduce notion of dialects in DbApiHook
+
+Date: 2025-01-07
+
+## Status
+
+Accepted
+
+## Context
+
+This ADR describes why we wanted to introduce dialects in the ``DbApiHook``.  We experienced that the
+``_insert_statement_format`` and ``_replace_statement_format`` string formatting properties used by the
+``insert_rows`` method in the ``DbApiHook`` were lacking in some cases, as the number of parameters passed to the
+string format is hard-coded and isn't always sufficient when using different databases through the
+generic JDBC and ODBC connection types.
+
+That's why we wanted a generic approach in which the code isn't tied to a 
specific database hook.
+
+For example, when using MsSQL through ODBC instead of the native ``MsSqlHook``, you won't have the merge-into
+(i.e. replace) functionality, as that was only available in the native ``MsSqlHook``.
+
+That's where the notion of dialects comes into play: it allows us to benefit from the same functionality
+independently of which connection type you want to use (ODBC/JDBC, or native if available) for a specific
+database.
+
+
+## Decision
+
+We decided to introduce the notion of dialects, which allows us to implement database-specific functionality
+independently of the connection type (i.e. hook) used.  That way, when using for example the ``insert_rows`` method
+on the ``DbApiHook`` with ODBC, JDBC or native connection types, it will always be possible to use the replace-into
+(i.e. merge-into) functionality, as it won't be tied to a specific hook implementation and thus to the
+connection type.
+
+
+## Consequences
+
+The consequence of this decision is that from now on, database-specific implementations should be done within the
+dialect for that database instead of in the specialized hook, unless the connection type is tied to the hook,
+meaning that only one connection type is possible and no ODBC/JDBC (and in the future maybe even ADBC,
+i.e. Apache Arrow) alternative is available.
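To make the decision above concrete, here is a minimal sketch of the pattern it enables: the dialect, not the hook, owns the database-specific replace statement. The class name, upsert syntax and the exact ``generate_replace_sql`` signature are assumptions; only the operation names (``placeholder``, ``get_primary_keys``, ``generate_replace_sql``) come from the dialects documentation added in this commit:

.. code-block:: python

  from airflow.providers.common.sql.dialects.dialect import Dialect


  class MyUpsertDialect(Dialect):
      """Hypothetical dialect for a database exposing ON CONFLICT upserts."""

      def generate_replace_sql(self, table, values, target_fields, **kwargs) -> str:
          # "values" is accepted to mirror the assumed signature; placeholders
          # are derived from target_fields here.
          columns = ", ".join(target_fields)
          placeholders = ", ".join([self.placeholder] * len(target_fields))
          # get_primary_keys is one of the operations listed on the Dialect
          # base class; used here to build the conflict target for the upsert.
          keys = ", ".join(self.get_primary_keys(table) or [])
          updates = ", ".join(f"{col} = excluded.{col}" for col in target_fields)
          return (
              f"INSERT INTO {table} ({columns}) VALUES ({placeholders}) "
              f"ON CONFLICT ({keys}) DO UPDATE SET {updates}"
          )

Because ``insert_rows`` can delegate statement generation to whichever dialect the connection resolves, the same pipeline code then works over native, ODBC or JDBC connections, which is the decoupling this ADR aims for.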
