[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times

2019-03-23 Thread GitBox
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw 
exception if job failed or cancelled or retry too many times
URL: https://github.com/apache/airflow/pull/4919#discussion_r268396887
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -74,7 +78,17 @@ def execute(self, context):
         self.result_configuration['OutputLocation'] = self.output_location
         self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context,
                                                       self.result_configuration, self.client_request_token)
-        self.hook.poll_query_status(self.query_execution_id)
+        query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries)
+
+        if query_status in AWSAthenaHook.FAILURE_STATES:
+            raise Exception(
+                'Final state of Athena job is {}, query_execution_id is {}.'
+                .format(query_status, self.query_execution_id))
+        elif not query_status or query_status in AWSAthenaHook.INTERMEDIATE_STATES:
+            raise Exception(
+                'Final state of Athena job is {}. \
+                Max tries of poll status exceeded, query_execution_id is {}.'
 
 Review comment:
   Kindly allow me a bit more time to have a closer look at your test update 
before I merge.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times

2019-03-20 Thread GitBox
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw 
exception if job failed or cancelled or retry too many times
URL: https://github.com/apache/airflow/pull/4919#discussion_r267606803
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -74,7 +78,17 @@ def execute(self, context):
         self.result_configuration['OutputLocation'] = self.output_location
         self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context,
                                                       self.result_configuration, self.client_request_token)
-        self.hook.poll_query_status(self.query_execution_id)
+        query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries)
+
+        if query_status in AWSAthenaHook.FAILURE_STATES:
+            raise Exception(
+                'Final state of Athena job is {}, query_execution_id is {}.'
+                .format(query_status, self.query_execution_id))
+        elif not query_status or query_status in AWSAthenaHook.INTERMEDIATE_STATES:
+            raise Exception(
+                'Final state of Athena job is {}. \
+                Max tries of poll status exceeded, query_execution_id is {}.'
 
 Review comment:
   Hi @bryanyang0528, the line break here will cause a minor display issue.
   
   You can try running
   ```python
   raise Exception(
       'Final state of Athena job is {}. \
       Max tries of poll status exceeded, query_execution_id is {}.'
       .format("AAA", "BBB"))
   ```
   
   What you will see is
   ```
   Exception: Final state of Athena job is AAA.     Max tries of poll status exceeded, query_execution_id is **BBB.**
   ```
   in which there are unnecessary extra spaces between the two sentences, because the indentation of the continuation line becomes part of the string.
   
   Please change how you break the line. Thanks.
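
For illustration, here is a minimal sketch of the difference between a backslash continuation inside a string literal and adjacent string literals. The function names and the amount of indentation are assumptions made for this example and are not taken from the PR.

```python
def message_with_backslash(state, execution_id):
    # The backslash joins the two physical lines, but the leading spaces of
    # the second line are still inside the string literal.
    return 'Final state of Athena job is {}. \
        Max tries of poll status exceeded, query_execution_id is {}.'.format(state, execution_id)


def message_with_adjacent_literals(state, execution_id):
    # Adjacent string literals are concatenated by the compiler, so the
    # indentation between them never becomes part of the message.
    return ('Final state of Athena job is {}. '
            'Max tries of poll status exceeded, query_execution_id is {}.'
            .format(state, execution_id))


if __name__ == '__main__':
    print(repr(message_with_backslash('AAA', 'BBB')))
    print(repr(message_with_adjacent_literals('AAA', 'BBB')))
```

Printing with `repr()` makes the run of spaces after the first sentence easy to spot in the first message; the second message has a single space there.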


[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times

2019-03-20 Thread GitBox
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw 
exception if job failed or cancelled or retry too many times
URL: https://github.com/apache/airflow/pull/4919#discussion_r267356523
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -74,7 +76,16 @@ def execute(self, context):
         self.result_configuration['OutputLocation'] = self.output_location
         self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context,
                                                       self.result_configuration, self.client_request_token)
-        self.hook.poll_query_status(self.query_execution_id)
+        query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries)
+
+        if not query_status or query_status in AWSAthenaHook.FAILURE_STATES:
+            raise Exception(
+                'Athena job failed. Final state is {}, query_execution_id is {}.'
+                .format(query_status, self.query_execution_id))
+        elif query_status in AWSAthenaHook.INTERMEDIATE_STATES:
 
 Review comment:
   Hi @bryanyang0528, if you check 
https://github.com/apache/airflow/blob/4655c3f2bbd6dbb442a9c8482559748bd9db0bd7/airflow/contrib/hooks/aws_athena_hook.py#L123-L140
   
   you will notice that neither `query_state is None` nor `query_state in self.INTERMEDIATE_STATES` results in a `break`. Instead, `poll_query_status()` only ends with `query_state is None` or `query_state in self.INTERMEDIATE_STATES` when `max_tries` is reached. It may be too strong an assumption to say that "`query_status` is `None`" means "failed".
   
   On the other hand, the `else:` branch (which includes `FAILURE_STATES`) causes an explicit `break`, so a status in `FAILURE_STATES` is for sure a failure.
   
   Hope this clarifies.
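
To make the control flow described above concrete, here is a simplified sketch of such a polling loop. It is not the hook's actual code: `check_state` is a hypothetical callable standing in for the status lookup, the state names in `INTERMEDIATE_STATES` are assumed, and the sleep between tries is left out.

```python
# Assumed state names; the real hook defines its own constants.
INTERMEDIATE_STATES = ('QUEUED', 'RUNNING')


def poll_query_status(check_state, max_tries=None):
    """Poll until a final state is seen or max_tries is reached.

    check_state is a hypothetical callable returning the current state
    (a string) or None when the state could not be determined.
    """
    try_number = 1
    while True:
        query_state = check_state()
        if query_state is None or query_state in INTERMEDIATE_STATES:
            # No exit here: an unknown or intermediate state means "keep polling".
            pass
        else:
            # Final state (success or failure): explicit exit from the loop.
            return query_state
        if max_tries and try_number >= max_tries:
            # Only this path can return None or an intermediate state.
            return query_state
        try_number += 1


if __name__ == '__main__':
    states = iter([None, 'RUNNING', 'FAILED'])
    print(poll_query_status(lambda: next(states)))
    states = iter(['QUEUED', 'RUNNING', 'RUNNING'])
    print(poll_query_status(lambda: next(states), max_tries=3))
```

In this sketch a return value of `None` or of an intermediate state can only mean that the tries ran out, which is why treating it as "failed" says more than the loop actually knows.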


[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times

2019-03-18 Thread GitBox
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw 
exception if job failed or cancelled or retry too many times
URL: https://github.com/apache/airflow/pull/4919#discussion_r266712619
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -74,7 +76,16 @@ def execute(self, context):
         self.result_configuration['OutputLocation'] = self.output_location
         self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context,
                                                       self.result_configuration, self.client_request_token)
-        self.hook.poll_query_status(self.query_execution_id)
+        query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries)
+
+        if not query_status or query_status in AWSAthenaHook.FAILURE_STATES:
+            raise Exception(
+                'Athena job failed. Final state is {}, query_execution_id is {}.'
+                .format(query_status, self.query_execution_id))
+        elif query_status in AWSAthenaHook.INTERMEDIATE_STATES:
 
 Review comment:
   Hi @bryanyang0528, it seems you only fixed the `.format()` call; the `if...elif` question has not been addressed yet.


[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times

2019-03-18 Thread GitBox
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw 
exception if job failed or cancelled or retry too many times
URL: https://github.com/apache/airflow/pull/4919#discussion_r266444808
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -47,7 +47,8 @@ class AWSAthenaOperator(BaseOperator):
 
     @apply_defaults
     def __init__(self, query, database, output_location, aws_conn_id='aws_default', client_request_token=None,
-                 query_execution_context=None, result_configuration=None, sleep_time=30, *args, **kwargs):
+                 query_execution_context=None, result_configuration=None, sleep_time=30, max_tries=None,
 
 Review comment:
   You didn't update the docstring for `max_tries`.
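
As a rough illustration of what such a docstring entry could look like, here is a sketch on a stand-in class. The class name, the parameter descriptions, and the stated meaning of `None` are all assumptions made for this example, not the wording used in the PR.

```python
class AthenaOperatorSketch:
    """
    Illustrative stand-in for the operator, not the real class.

    :param sleep_time: Time to wait between two consecutive status checks
        (assumed description).
    :type sleep_time: int
    :param max_tries: Number of status checks to make before giving up;
        ``None`` is assumed to mean "poll until a final state is reached".
    :type max_tries: int
    """

    def __init__(self, sleep_time=30, max_tries=None):
        # Defaults mirror the ones visible in the diff above.
        self.sleep_time = sleep_time
        self.max_tries = max_tries
```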


[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times

2019-03-18 Thread GitBox
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw 
exception if job failed or cancelled or retry too many times
URL: https://github.com/apache/airflow/pull/4919#discussion_r266444318
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -74,7 +76,16 @@ def execute(self, context):
         self.result_configuration['OutputLocation'] = self.output_location
         self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context,
                                                       self.result_configuration, self.client_request_token)
-        self.hook.poll_query_status(self.query_execution_id)
+        query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries)
+
+        if not query_status or query_status in AWSAthenaHook.FAILURE_STATES:
+            raise Exception(
+                'Athena job failed. Final state is {}, query_execution_id is {}.'
+                .format(query_status, self.query_execution_id))
+        elif query_status in AWSAthenaHook.INTERMEDIATE_STATES:
 
 Review comment:
   I guess it should look like the following?
   ```python
   if query_status in AWSAthenaHook.FAILURE_STATES:
       raise Exception("...")
   elif not query_status or query_status in AWSAthenaHook.INTERMEDIATE_STATES:
       raise Exception("...")
   ```
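
A self-contained sketch of this branch order, with the messages filled in, might look like the following. The function name is hypothetical, and the two tuples are assumed stand-ins for `AWSAthenaHook.FAILURE_STATES` and `AWSAthenaHook.INTERMEDIATE_STATES`.

```python
# Assumed stand-ins for the hook's state constants.
FAILURE_STATES = ('FAILED', 'CANCELLED')
INTERMEDIATE_STATES = ('QUEUED', 'RUNNING')


def raise_on_bad_final_state(query_status, query_execution_id):
    """Raise if polling ended in a failure state or ran out of tries."""
    if query_status in FAILURE_STATES:
        # The job itself reported a failure.
        raise Exception(
            'Final state of Athena job is {}, query_execution_id is {}.'
            .format(query_status, query_execution_id))
    elif not query_status or query_status in INTERMEDIATE_STATES:
        # Polling stopped before a final state was seen.
        raise Exception(
            'Final state of Athena job is {}. '
            'Max tries of poll status exceeded, query_execution_id is {}.'
            .format(query_status, query_execution_id))


if __name__ == '__main__':
    for status in ('FAILED', 'RUNNING', None, 'SUCCEEDED'):
        try:
            raise_on_bad_final_state(status, 'hypothetical-id')
            print(status, '-> no exception')
        except Exception as exc:
            print(status, '->', exc)
```

Keeping `None` in the second branch means the "failed" message is only used when the job actually reported a failure state.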


[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times

2019-03-18 Thread GitBox
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw 
exception if job failed or cancelled or retry too many times
URL: https://github.com/apache/airflow/pull/4919#discussion_r266443569
 
 

 ##
 File path: airflow/contrib/operators/aws_athena_operator.py
 ##
 @@ -74,7 +76,16 @@ def execute(self, context):
         self.result_configuration['OutputLocation'] = self.output_location
         self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context,
                                                       self.result_configuration, self.client_request_token)
-        self.hook.poll_query_status(self.query_execution_id)
+        query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries)
+
+        if not query_status or query_status in AWSAthenaHook.FAILURE_STATES:
+            raise Exception(
+                'Athena job failed. Final state is {}, query_execution_id is {}.'
+                .format(query_status, self.query_execution_id))
+        elif query_status in AWSAthenaHook.INTERMEDIATE_STATES:
+            raise Exception(
+                'Athena job failed. Max tries of poll status exceeded, query_execution_id is {}.'
+                .format(query_status, self.query_execution_id))
 
 Review comment:
   There is only one `{}`, but you pass two values to `.format()`.
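
A short sketch of why this matters: `str.format()` does not complain about surplus positional arguments, so the single `{}` is filled with the first value and the second one is silently dropped. The values below are made up for the example.

```python
template = 'Athena job failed. Max tries of poll status exceeded, query_execution_id is {}.'

query_status = 'RUNNING'                    # made-up value
query_execution_id = 'hypothetical-id-123'  # made-up value

# Two values for one placeholder: the status lands where the id should be.
wrong = template.format(query_status, query_execution_id)

# One value for one placeholder: the id is reported as intended.
right = template.format(query_execution_id)

if __name__ == '__main__':
    print(wrong)
    print(right)
```

No exception is raised in either case, so the mistake would only show up as a misleading message in the logs.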

