[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-03-04 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r387642551
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/datasync.py
 ##
 @@ -29,10 +29,13 @@ class AWSDataSyncHook(AwsBaseHook):
 """
 Interact with AWS DataSync.
 
+Additional arguments (such as ``aws_conn_id``) may be specified and
+are passed down to the underlying AwsBaseHook.
 
 Review comment:
   Docs build is failing since I removed the indent:
   
https://travis-ci.org/apache/airflow/jobs/658184121?utm_medium=notification_source=github_status
   Adding it back, lets see
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-03-04 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r387620707
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -232,6 +261,38 @@ def get_resource_type(self, resource_type, 
region_name=None, config=None):
 resource_type, endpoint_url=endpoint_url, config=config, 
verify=self.verify
 )
 
+@cached_property
+def conn(self):
+"""Get the underlying boto3 client (cached).
+
+The return value from this method is cached for efficiency.
+
+:return: boto3.client or boto3.resource for the current
+client/resource type and region
+:rtype: boto3.client() or boto3.resource()
+:raises AirflowException: self.client_type or self.resource_type are 
not
+populated. These are usually specified to this class, by a subclass
+__init__ method.
+"""
+if self.client_type:
+return self.get_client_type(self.client_type, 
region_name=self.region_name)
+elif self.resource_type:
+return self.get_resource_type(self.resource_type, 
region_name=self.region_name)
+else:
+raise AirflowException(
+'Either self.client_type or self.resource_type'
+' must be specified in the subclass')
 
 Review comment:
   Yes, in the `__init__` I do the actual validation. The only way the 
`client_type`/`resource_type `could change after that is if a subclass changes 
it directly. It would represent a bug in the subclass implementation. Based on 
the Python docs on Exceptions, a `NotImplementedError `sounds like the best fit.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-03-04 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r387623564
 
 

 ##
 File path: tests/providers/amazon/aws/sensors/test_sqs.py
 ##
 @@ -18,7 +18,7 @@
 
 
 import unittest
-from unittest.mock import MagicMock, patch
+import unittest.mock as mock
 
 Review comment:
   I use @mock.patch so it should work. Will see what Travis says :)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-03-04 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r387622963
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/datasync.py
 ##
 @@ -29,10 +29,13 @@ class AWSDataSyncHook(AwsBaseHook):
 """
 Interact with AWS DataSync.
 
+Additional arguments (such as ``aws_conn_id``) may be specified and
+are passed down to the underlying AwsBaseHook.
 
 Review comment:
   Thanks, fixed up


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-03-04 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r387621654
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -232,6 +261,38 @@ def get_resource_type(self, resource_type, 
region_name=None, config=None):
 resource_type, endpoint_url=endpoint_url, config=config, 
verify=self.verify
 )
 
+@cached_property
+def conn(self):
+"""Get the underlying boto3 client (cached).
+
+The return value from this method is cached for efficiency.
+
+:return: boto3.client or boto3.resource for the current
+client/resource type and region
+:rtype: boto3.client() or boto3.resource()
+:raises AirflowException: self.client_type or self.resource_type are 
not
+populated. These are usually specified to this class, by a subclass
+__init__ method.
+"""
+if self.client_type:
+return self.get_client_type(self.client_type, 
region_name=self.region_name)
+elif self.resource_type:
+return self.get_resource_type(self.resource_type, 
region_name=self.region_name)
+else:
+raise AirflowException(
+'Either self.client_type or self.resource_type'
+' must be specified in the subclass')
+
+def get_conn(self):
+""" Get the underlying boto3 client (cached)
 
 Review comment:
   fixed


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-03-04 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r387621500
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -232,6 +261,38 @@ def get_resource_type(self, resource_type, 
region_name=None, config=None):
 resource_type, endpoint_url=endpoint_url, config=config, 
verify=self.verify
 )
 
+@cached_property
+def conn(self):
+"""Get the underlying boto3 client (cached).
 
 Review comment:
   Thanks, there were some older methods with a similar style. I'll update 
these to all be consistent. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-03-04 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r387620707
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -232,6 +261,38 @@ def get_resource_type(self, resource_type, 
region_name=None, config=None):
 resource_type, endpoint_url=endpoint_url, config=config, 
verify=self.verify
 )
 
+@cached_property
+def conn(self):
+"""Get the underlying boto3 client (cached).
+
+The return value from this method is cached for efficiency.
+
+:return: boto3.client or boto3.resource for the current
+client/resource type and region
+:rtype: boto3.client() or boto3.resource()
+:raises AirflowException: self.client_type or self.resource_type are 
not
+populated. These are usually specified to this class, by a subclass
+__init__ method.
+"""
+if self.client_type:
+return self.get_client_type(self.client_type, 
region_name=self.region_name)
+elif self.resource_type:
+return self.get_resource_type(self.resource_type, 
region_name=self.region_name)
+else:
+raise AirflowException(
+'Either self.client_type or self.resource_type'
+' must be specified in the subclass')
 
 Review comment:
   Yes, in the `__init__` I do the actual validation. The only way the 
client_type/resource_type could change after that is if a subclass changes it 
directly. It would represent a bug in the subclass implementation. Based on the 
Python docs on Exceptions, a NotImplementedError sounds like the best fit.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385543236
 
 

 ##
 File path: airflow/providers/amazon/aws/sensors/sqs.py
 ##
 @@ -56,6 +56,7 @@ def __init__(self,
 self.aws_conn_id = aws_conn_id
 self.max_messages = max_messages
 self.wait_time_seconds = wait_time_seconds
+self.hook = SQSHook(aws_conn_id=self.aws_conn_id)
 
 Review comment:
   Resolved, thanks


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385543222
 
 

 ##
 File path: airflow/providers/amazon/aws/sensors/s3_prefix.py
 ##
 @@ -69,12 +70,11 @@ def __init__(self,
 self.full_url = "s3://" + bucket_name + '/' + prefix
 self.aws_conn_id = aws_conn_id
 self.verify = verify
+self.hook = S3Hook(aws_conn_id=self.aws_conn_id, verify=self.verify)
 
 Review comment:
   Resolved, thanks


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385543208
 
 

 ##
 File path: airflow/providers/amazon/aws/sensors/redshift.py
 ##
 @@ -43,9 +43,9 @@ def __init__(self,
 self.cluster_identifier = cluster_identifier
 self.target_status = target_status
 self.aws_conn_id = aws_conn_id
+self.hook = RedshiftHook(aws_conn_id=self.aws_conn_id)
 
 Review comment:
   Resolved, thanks


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385542845
 
 

 ##
 File path: tests/providers/amazon/aws/operators/test_ecs.py
 ##
 @@ -48,12 +48,10 @@
 ]
 }
 
-
+# pylint: disable=unused-argument
+@mock.patch('airflow.providers.amazon.aws.operators.ecs.AwsBaseHook')
 
 Review comment:
   Found a way to fix this. 
   I had to call `get_hook()` during the `setUp()` function, while the hook was 
still mocked, to ensure that the operator saves the mocked hook in its 
`self.hook`
   If I call `get_hook()` for the first time _after_ the `setUp`  - (eg by the 
Operator execute) then it is no longer mocked.
   I guess this makes sense but it was really unexpected :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385542306
 
 

 ##
 File path: tests/providers/amazon/aws/operators/test_ecs.py
 ##
 @@ -171,7 +166,7 @@ def test_execute_with_failures(self):
 }
 )
 
-def test_wait_end_tasks(self):
+def test_wait_end_tasks(self, aws_hook_mock):
 
 Review comment:
   Found a way to fix the issue with `setUp`, so this arg will be removed :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385542156
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -47,11 +48,29 @@ class AwsBaseHook(BaseHook):
 :param verify: Whether or not to verify SSL certificates.
 
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
 :type verify: str or bool
+:param str region_name: AWS Region name to use. If this is None then the 
default boto3
+behaviour is used.
+:param str client_type: boto3 client_type used when creating 
boto3.client(). For
+example, 's3', 'emr', etc. Provided by specific hooks for these 
clients which
+subclass AwsBaseHook.
+:param str resource_type: boto3 resource_type used when creating 
boto3.resource(). For
+example, 's3'. Provided by specific hooks for these resources which
+subclass AwsBaseHook.
 
 Review comment:
   Resolved, thanks.
   I preferred the conciseness of `:param  : xyz`
   But have changed back to `:param` + `:type`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385542845
 
 

 ##
 File path: tests/providers/amazon/aws/operators/test_ecs.py
 ##
 @@ -48,12 +48,10 @@
 ]
 }
 
-
+# pylint: disable=unused-argument
+@mock.patch('airflow.providers.amazon.aws.operators.ecs.AwsBaseHook')
 
 Review comment:
   Found a way to fix this. 
   I had to call `get_hook()` during the `setUp `function, while the hook was 
still mocked, to ensure that the operator saves the mocked hook in its 
`self.hook`
   If I call `get_hook()` for the first time _after_ the `setUp` (eg by the 
Operator execute) then it is no longer mocked.
   I guess this makes sense but it was really unexpected :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385542185
 
 

 ##
 File path: tests/providers/amazon/aws/sensors/test_sqs.py
 ##
 @@ -18,8 +18,9 @@
 
 
 import unittest
-from unittest.mock import MagicMock, patch
+from unittest.mock import MagicMock
 
+import mock
 
 Review comment:
   Resolved, thanks


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385542306
 
 

 ##
 File path: tests/providers/amazon/aws/operators/test_ecs.py
 ##
 @@ -171,7 +166,7 @@ def test_execute_with_failures(self):
 }
 )
 
-def test_wait_end_tasks(self):
+def test_wait_end_tasks(self, aws_hook_mock):
 
 Review comment:
   Found a way to fix the issue with setUp, so this arg will be removed :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385542156
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -47,11 +48,29 @@ class AwsBaseHook(BaseHook):
 :param verify: Whether or not to verify SSL certificates.
 
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
 :type verify: str or bool
+:param str region_name: AWS Region name to use. If this is None then the 
default boto3
+behaviour is used.
+:param str client_type: boto3 client_type used when creating 
boto3.client(). For
+example, 's3', 'emr', etc. Provided by specific hooks for these 
clients which
+subclass AwsBaseHook.
+:param str resource_type: boto3 resource_type used when creating 
boto3.resource(). For
+example, 's3'. Provided by specific hooks for these resources which
+subclass AwsBaseHook.
 
 Review comment:
   Resolved, thanks.
   I preferred the conciseness of :param  : xyz
   But have changed back to :param + :type


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385541955
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -232,6 +251,38 @@ def get_resource_type(self, resource_type, 
region_name=None, config=None):
 resource_type, endpoint_url=endpoint_url, config=config, 
verify=self.verify
 )
 
+@cached_property
+def conn(self):
+"""Get the underlying boto3 client (cached).
+
+The return value from this method is cached for efficiency.
+
+:return: boto3.client or boto3.resource for the current
+client/resource type and region
+:rtype: boto3.client() or boto3.resource()
+:raises AirflowException: self.client_type or self.resource_type are 
not
+populated. These are usually specified to this class, by a subclass
+__init__ method.
+"""
+if self.client_type:
+return self.get_client_type(self.client_type, 
region_name=self.region_name)
+elif self.resource_type:
+return self.get_resource_type(self.resource_type, 
region_name=self.region_name)
+else:
+raise AirflowException(
+'Either self.client_type or self.resource_type'
+' must be specified in the subclass')
 
 Review comment:
   Resolved, thanks


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385517283
 
 

 ##
 File path: tests/providers/amazon/aws/operators/test_ecs.py
 ##
 @@ -48,12 +48,10 @@
 ]
 }
 
-
+# pylint: disable=unused-argument
+@mock.patch('airflow.providers.amazon.aws.operators.ecs.AwsBaseHook')
 
 Review comment:
   Agreed, it _should_ work :) I'll try and do some more digging. There must be 
a good reason why it isn't working like that at the moment, and I don't want to 
have flaky tests.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r38577
 
 

 ##
 File path: tests/providers/amazon/aws/operators/test_ecs.py
 ##
 @@ -48,12 +48,10 @@
 ]
 }
 
-
+# pylint: disable=unused-argument
+@mock.patch('airflow.providers.amazon.aws.operators.ecs.AwsBaseHook')
 
 Review comment:
   No luck. I can't mock the hook properly if I move it back to above the 
setUp, regardless of how I patch it. I'm not sure how it was working before...
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385110474
 
 

 ##
 File path: tests/providers/amazon/aws/operators/test_ecs.py
 ##
 @@ -171,7 +166,7 @@ def test_execute_with_failures(self):
 }
 )
 
-def test_wait_end_tasks(self):
+def test_wait_end_tasks(self, aws_hook_mock):
 
 Review comment:
   After moving the mock from setUp to above the class, all methods need to 
have this added (even if unused)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385091513
 
 

 ##
 File path: airflow/providers/amazon/aws/sensors/athena.py
 ##
 @@ -57,13 +57,11 @@ def __init__(self,
 super().__init__(*args, **kwargs)
 self.aws_conn_id = aws_conn_id
 self.query_execution_id = query_execution_id
-self.hook = None
 self.sleep_time = sleep_time
 self.max_retires = max_retires
+self.hook = AWSAthenaHook(self.aws_conn_id, self.sleep_time)
 
 Review comment:
   Thanks - many of the existing sensor code was doing it during __init__, so I 
thought it was convention. Will change to do it during execution which makes 
more sense.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385091780
 
 

 ##
 File path: tests/providers/amazon/aws/operators/test_ecs.py
 ##
 @@ -48,12 +48,10 @@
 ]
 }
 
-
+# pylint: disable=unused-argument
+@mock.patch('airflow.providers.amazon.aws.operators.ecs.AwsBaseHook')
 
 Review comment:
   I couldn't get it to work correctly with setUp, but will try again in case


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385061503
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -49,9 +50,19 @@ class AwsBaseHook(BaseHook):
 :type verify: str or bool
 """
 
-def __init__(self, aws_conn_id="aws_default", verify=None):
+def __init__(
+self,
+aws_conn_id="aws_default",
+verify=None,
+region_name=None,
+client_type=None,
+resource_type=None
 
 Review comment:
   @feluelle Thanks for the feedback, please take another look. I've updated 
the docstrings on all the hooks accordingly. There's only a "how to" document 
for the DataSync operator, no other separate docs for AWS - at least that I 
could see. Let me know if there's anywhere else you can think of :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385002535
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -232,6 +243,23 @@ def get_resource_type(self, resource_type, 
region_name=None, config=None):
 resource_type, endpoint_url=endpoint_url, config=config, 
verify=self.verify
 )
 
+@cached_property
+def conn(self):
+"Get the underlying boto3 client (cached)"
+if self.client_type:
+return self.get_client_type(self.client_type, 
region_name=self.region_name)
+elif self.resource_type:
+return self.get_resource_type(self.resource_type, 
region_name=self.region_name)
+else:
+raise AirflowException(
+'Either self.client_type or self.resource_type'
+' must be specified in the subclass')
+
+def get_conn(self):
+"Get the underlying boto3 client (cached)"
 
 Review comment:
   Thanks, resolved


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385002855
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -49,9 +50,19 @@ class AwsBaseHook(BaseHook):
 :type verify: str or bool
 """
 
-def __init__(self, aws_conn_id="aws_default", verify=None):
+def __init__(
+self,
+aws_conn_id="aws_default",
+verify=None,
+region_name=None,
+client_type=None,
+resource_type=None
 
 Review comment:
   Thanks, I forgot to do that step :p Am busy with it now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks should cache boto3 client

2020-02-27 Thread GitBox
baolsen commented on a change in pull request #7541: [AIRFLOW-6822] AWS hooks 
should cache boto3 client
URL: https://github.com/apache/airflow/pull/7541#discussion_r385002486
 
 

 ##
 File path: airflow/providers/amazon/aws/hooks/base_aws.py
 ##
 @@ -232,6 +243,23 @@ def get_resource_type(self, resource_type, 
region_name=None, config=None):
 resource_type, endpoint_url=endpoint_url, config=config, 
verify=self.verify
 )
 
+@cached_property
+def conn(self):
+"Get the underlying boto3 client (cached)"
 
 Review comment:
   Thanks, resolved


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services