potiuk commented on a change in pull request #18718:
URL: https://github.com/apache/airflow/pull/18718#discussion_r723957739



##########
File path: airflow/models/connection.py
##########
@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, **kwargs):

Review comment:
       There is a backwards-compatibility problem with that approach. While 
this change onm it's own is backwards, compatible and old providers will work, 
if you implement a change to a  provider that adds kwargs to `get_hook`, they 
will not be usable with already released versions of Airflow. 
   
   We have A LOT of operators derived from SQLOperator and using `get_hook()` 
directly. If we just change a signature of get_hook()  it will be very easy for 
anyone to start using it and breaking compatibility with released Airflow 
version without anyone noticing.
   
   I am not saying we should not do it, it's a good change but I think this 
change should be much more thought out and we need to figure out when and how 
to do it to avoid disruption.
   
   I dop not yet know how to do it  best - my current thinking is that we 
should rather introduce a new method (`get_hook with params()` rather than 
modify existing ones, and add a pre-commit check that will check if a provider 
is using that method, and signals that either the provider should have the 
"additional_dependency" added for Airflow >version X.Y.Z. We would also have to 
check the usages of SQL Operators the same way and flag problems with 
dependencies if we see that any of the BaseSQL-derived operators passes 
`hook_params`. The second part will be much more "brittle" I am afraid so maybe 
we should figure a bit different way here.
   
   Thoughts? 
   
   
   

##########
File path: airflow/models/connection.py
##########
@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, **kwargs):

Review comment:
       There is a backwards-compatibility problem with that approach. While 
this change on it's own is backwards, compatible and old providers will work, 
if you implement a change to a  provider that adds kwargs to `get_hook`, they 
will not be usable with already released versions of Airflow. 
   
   We have A LOT of operators derived from SQLOperator and using `get_hook()` 
directly. If we just change a signature of get_hook()  it will be very easy for 
anyone to start using it and breaking compatibility with released Airflow 
version without anyone noticing.
   
   I am not saying we should not do it, it's a good change but I think this 
change should be much more thought out and we need to figure out when and how 
to do it to avoid disruption.
   
   I dop not yet know how to do it  best - my current thinking is that we 
should rather introduce a new method (`get_hook with params()` rather than 
modify existing ones, and add a pre-commit check that will check if a provider 
is using that method, and signals that either the provider should have the 
"additional_dependency" added for Airflow >version X.Y.Z. We would also have to 
check the usages of SQL Operators the same way and flag problems with 
dependencies if we see that any of the BaseSQL-derived operators passes 
`hook_params`. The second part will be much more "brittle" I am afraid so maybe 
we should figure a bit different way here.
   
   Thoughts? 
   
   
   

##########
File path: airflow/models/connection.py
##########
@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, **kwargs):

Review comment:
       There is a backwards-compatibility problem with that approach. While 
this change on it's own is backwards, compatible and old providers will work, 
if you implement a change to a  provider that adds kwargs to `get_hook`, they 
will not be usable with already released versions of Airflow. 
   
   We have A LOT of operators derived from SQLOperator and using `get_hook()` 
directly. If we just change a signature of get_hook()  it will be very easy for 
anyone to start using it and breaking compatibility with released Airflow 
version without anyone noticing.
   
   I am not saying we should not do it, it's a good change but I think this 
change should be much more thought out and we need to figure out when and how 
to do it to avoid disruption.
   
   I dop not yet know how to do it  best - my current thinking is that we 
should rather introduce a new method (`get_hook with params()` rather than 
modify existing ones, and add a pre-commit check that will check if a provider 
is using that method, and signals that the provider should have the 
"additional_dependency" added for Airflow >version X.Y.Z. We would also have to 
check the usages of SQL Operators the same way and flag problems with 
dependencies if we see that any of the BaseSQL-derived operators passes 
`hook_params`. The second part will be much more "brittle" I am afraid so maybe 
we should figure a bit different way here.
   
   Thoughts? 
   
   
   

##########
File path: airflow/models/connection.py
##########
@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, **kwargs):

Review comment:
       There is a backwards-compatibility problem with that approach. While 
this change on it's own is backwards, compatible and old providers will work, 
if you implement a change to a  provider that adds kwargs to `get_hook`, they 
will not be usable with already released versions of Airflow. 
   
   We have A LOT of operators derived from SQLOperator and using `get_hook()` 
directly. If we just change a signature of get_hook()  it will be very easy for 
anyone to start using it and breaking compatibility with released Airflow 
version without anyone noticing.
   
   I am not saying we should not do it, it's a good change but I think this 
change should be much more thought out and we need to figure out when and how 
to do it to avoid disruption.
   
   I dop not yet know how to do it  best - my current thinking is that we 
should rather introduce a new method (`get_hook with params()` rather than 
modify existing ones, and add a pre-commit check that will check if a provider 
is using that method, and signals that the provider should have the 
"additional_dependency" added for Airflow >= X.Y.Z. We would also have to check 
the usages of SQL Operators the same way and flag problems with dependencies if 
we see that any of the BaseSQL-derived operators passes `hook_params`. The 
second part will be much more "brittle" I am afraid so maybe we should figure a 
bit different way here.
   
   Thoughts? 
   
   
   

##########
File path: airflow/models/connection.py
##########
@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, **kwargs):

Review comment:
       There is a backwards-compatibility problem with that approach. While 
this change on it's own is backwards, compatible and old providers will work, 
if you implement a change to a  provider that adds kwargs to `get_hook`, they 
will not be usable with already released versions of Airflow. 
   
   We have A LOT of operators derived from SQLOperator and using `get_hook()` 
directly. If we just change a signature of get_hook()  it will be very easy for 
anyone to start using it and breaking compatibility with released Airflow 
version without anyone noticing.
   
   I am not saying we should not do it, it's a good change but I think this 
change should be much more thought out and we need to figure out when and how 
to do it to avoid disruption.
   
   I dop not yet know how to do it  best - my current thinking is that we 
should rather introduce a new method (`get_hook with params()` rather than 
modify existing ones, and add a pre-commit check that will check if a provider 
is using that method, and signals that the provider should have the 
"additional_dependency" added for Airflow >= X.Y.Z. We would also have to check 
the usages of SQL Operators the same way and flag problems with dependencies if 
we see that any of the BaseSQL-derived operators passes `hook_params`. The 
second part will be much more "brittle" I am afraid, and difficult to detect , 
so maybe we should figure a bit different way here.
   
   Thoughts? 
   
   
   

##########
File path: airflow/models/connection.py
##########
@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, **kwargs):

Review comment:
       There is a backwards-compatibility problem with that approach. While 
this change on it's own is backwards, compatible and old providers will work, 
if you implement a change to a  provider that adds kwargs to `get_hook`, they 
will not be usable with already released versions of Airflow. 
   
   We have A LOT of operators derived from SQLOperator and using `get_hook()` 
directly. If we just change a signature of get_hook()  it will be very easy for 
anyone to start using it and breaking compatibility with released Airflow 
version without anyone noticing.
   
   I am not saying we should not do it, it's a good change but I think this 
change should be much more thought out and we need to figure out when and how 
to do it to avoid disruption.
   
   I dop not yet know how to do it  best - my current thinking is that we 
should rather introduce a new method (`get_hook with params()` rather than 
modify existing ones, and add a pre-commit check that will check if a provider 
is using that method, and signals that the provider should have the 
"additional_dependency" added for Airflow >= X.Y.Z. We would also have to check 
the usages of SQL Operators the same way and flag problems with dependencies if 
we see that any of the BaseSQL-derived operators passes `hook_params`. The 
second part will be much more "brittle" I am afraid, and difficult to detect , 
so maybe we should figure a bit different way of passing parameters here - one 
that will be easy to detect and flag by pre-commit.
   
   Thoughts? 
   
   
   

##########
File path: airflow/models/connection.py
##########
@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, **kwargs):

Review comment:
       There is a backwards-compatibility problem with that approach. While 
this change on it's own is backwards, compatible and old providers will work, 
if you implement a change to a  provider that adds kwargs to `get_hook`, they 
will not be usable with already released versions of Airflow. 
   
   We have A LOT of operators derived from SQLOperator or using `get_hook()` 
directly. If we just change a signature of get_hook()  it will be very easy for 
anyone to start using it and breaking compatibility with released Airflow 
version without anyone noticing.
   
   I am not saying we should not do it, it's a good change but I think this 
change should be much more thought out and we need to figure out when and how 
to do it to avoid disruption.
   
   I dop not yet know how to do it  best - my current thinking is that we 
should rather introduce a new method (`get_hook with params()` rather than 
modify existing ones, and add a pre-commit check that will check if a provider 
is using that method, and signals that the provider should have the 
"additional_dependency" added for Airflow >= X.Y.Z. We would also have to check 
the usages of SQL Operators the same way and flag problems with dependencies if 
we see that any of the BaseSQL-derived operators passes `hook_params`. The 
second part will be much more "brittle" I am afraid, and difficult to detect , 
so maybe we should figure a bit different way of passing parameters here - one 
that will be easy to detect and flag by pre-commit.
   
   Thoughts? 
   
   
   

##########
File path: airflow/models/connection.py
##########
@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
         if self._extra and self.is_extra_encrypted:
             self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()
 
-    def get_hook(self):
-        """Return hook based on conn_type."""
+    def get_hook(self, **kwargs):

Review comment:
       There is a backwards-compatibility (or rather future-compatibility) 
problem with that approach. While this change on it's own is backwards, 
compatible and old providers will work, if you implement a change to a  
provider that adds kwargs to `get_hook`, they will not be usable with already 
released versions of Airflow. 
   
   We have A LOT of operators derived from SQLOperator or using `get_hook()` 
directly. If we just change a signature of get_hook()  it will be very easy for 
anyone to start using it and breaking compatibility with released Airflow 
version without anyone noticing.
   
   I am not saying we should not do it, it's a good change but I think this 
change should be much more thought out and we need to figure out when and how 
to do it to avoid disruption.
   
   I dop not yet know how to do it  best - my current thinking is that we 
should rather introduce a new method (`get_hook with params()` rather than 
modify existing ones, and add a pre-commit check that will check if a provider 
is using that method, and signals that the provider should have the 
"additional_dependency" added for Airflow >= X.Y.Z. We would also have to check 
the usages of SQL Operators the same way and flag problems with dependencies if 
we see that any of the BaseSQL-derived operators passes `hook_params`. The 
second part will be much more "brittle" I am afraid, and difficult to detect , 
so maybe we should figure a bit different way of passing parameters here - one 
that will be easy to detect and flag by pre-commit.
   
   Thoughts? 
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to