[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times URL: https://github.com/apache/airflow/pull/4919#discussion_r268396887 ## File path: airflow/contrib/operators/aws_athena_operator.py ## @@ -74,7 +78,17 @@ def execute(self, context): self.result_configuration['OutputLocation'] = self.output_location self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context, self.result_configuration, self.client_request_token) -self.hook.poll_query_status(self.query_execution_id) +query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries) + +if query_status in AWSAthenaHook.FAILURE_STATES: +raise Exception( +'Final state of Athena job is {}, query_execution_id is {}.' +.format(query_status, self.query_execution_id)) +elif not query_status or query_status in AWSAthenaHook.INTERMEDIATE_STATES: +raise Exception( +'Final state of Athena job is {}. \ + Max tries of poll status exceeded, query_execution_id is {}.' Review comment: Kindly allow me a bit more time to have a closer look at your test update before I merge. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times URL: https://github.com/apache/airflow/pull/4919#discussion_r267606803 ## File path: airflow/contrib/operators/aws_athena_operator.py ## @@ -74,7 +78,17 @@ def execute(self, context): self.result_configuration['OutputLocation'] = self.output_location self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context, self.result_configuration, self.client_request_token) -self.hook.poll_query_status(self.query_execution_id) +query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries) + +if query_status in AWSAthenaHook.FAILURE_STATES: +raise Exception( +'Final state of Athena job is {}, query_execution_id is {}.' +.format(query_status, self.query_execution_id)) +elif not query_status or query_status in AWSAthenaHook.INTERMEDIATE_STATES: +raise Exception( +'Final state of Athena job is {}. \ + Max tries of poll status exceeded, query_execution_id is {}.' Review comment: Hi @bryanyang0528 This line-breaking here will cause minor display issue. You can try to run ```python raise Exception( 'Final state of Athena job is {}. \ Max tries of poll status exceeded, query_execution_id is {}.' .format("AAA", "BBB")) ``` What you will see is ``` Exception: Final state of Athena job is AAA. Max tries of poll status exceeded, query_execution_id is **BBB.** ``` in which we have unnecessary extra spaces between the two sentences. Please change how you break the line. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times URL: https://github.com/apache/airflow/pull/4919#discussion_r267356523 ## File path: airflow/contrib/operators/aws_athena_operator.py ## @@ -74,7 +76,16 @@ def execute(self, context): self.result_configuration['OutputLocation'] = self.output_location self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context, self.result_configuration, self.client_request_token) -self.hook.poll_query_status(self.query_execution_id) +query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries) + +if not query_status or query_status in AWSAthenaHook.FAILURE_STATES: +raise Exception( +'Athena job failed. Final state is {}, query_execution_id is {}.' +.format(query_status, self.query_execution_id)) +elif query_status in AWSAthenaHook.INTERMEDIATE_STATES: Review comment: Hi @bryanyang0528 , if you check https://github.com/apache/airflow/blob/4655c3f2bbd6dbb442a9c8482559748bd9db0bd7/airflow/contrib/hooks/aws_athena_hook.py#L123-L140 You will notice that `query_state is None` or `query_state in self.INTERMEDIATE_STATES` would not result in `break`. Instead, `poll_query_status()` will only end with `query_state is None` or `query_state in self.INTERMEDIATE_STATES` when `max_tries` is reached. It may be a too strong assumption to say "`query_status` is `None` means `failed`". On the other hand, `else:` (including `FAILURE_STATES`) cause an explicit `break`, which is for sure `failed`. Hope this clarifies. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times URL: https://github.com/apache/airflow/pull/4919#discussion_r266712619 ## File path: airflow/contrib/operators/aws_athena_operator.py ## @@ -74,7 +76,16 @@ def execute(self, context): self.result_configuration['OutputLocation'] = self.output_location self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context, self.result_configuration, self.client_request_token) -self.hook.poll_query_status(self.query_execution_id) +query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries) + +if not query_status or query_status in AWSAthenaHook.FAILURE_STATES: +raise Exception( +'Athena job failed. Final state is {}, query_execution_id is {}.' +.format(query_status, self.query_execution_id)) +elif query_status in AWSAthenaHook.INTERMEDIATE_STATES: Review comment: Hi @bryanyang0528 seems you only fixed for the `.format()` method, but the `if...elif` question is not addressed yet. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times URL: https://github.com/apache/airflow/pull/4919#discussion_r266444808 ## File path: airflow/contrib/operators/aws_athena_operator.py ## @@ -47,7 +47,8 @@ class AWSAthenaOperator(BaseOperator): @apply_defaults def __init__(self, query, database, output_location, aws_conn_id='aws_default', client_request_token=None, - query_execution_context=None, result_configuration=None, sleep_time=30, *args, **kwargs): + query_execution_context=None, result_configuration=None, sleep_time=30, max_tries=None, Review comment: You didn't update docstring for `max_tries`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times URL: https://github.com/apache/airflow/pull/4919#discussion_r266444318 ## File path: airflow/contrib/operators/aws_athena_operator.py ## @@ -74,7 +76,16 @@ def execute(self, context): self.result_configuration['OutputLocation'] = self.output_location self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context, self.result_configuration, self.client_request_token) -self.hook.poll_query_status(self.query_execution_id) +query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries) + +if not query_status or query_status in AWSAthenaHook.FAILURE_STATES: +raise Exception( +'Athena job failed. Final state is {}, query_execution_id is {}.' +.format(query_status, self.query_execution_id)) +elif query_status in AWSAthenaHook.INTERMEDIATE_STATES: Review comment: I guess it should be like below? ```python if query_status in AWSAthenaHook.FAILURE_STATES: raise Exception("...") elif not query_status or query_status in AWSAthenaHook.INTERMEDIATE_STATES: raise Exception("...") ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times
XD-DENG commented on a change in pull request #4919: [AIRFLOW-4093] Throw exception if job failed or cancelled or retry too many times URL: https://github.com/apache/airflow/pull/4919#discussion_r266443569 ## File path: airflow/contrib/operators/aws_athena_operator.py ## @@ -74,7 +76,16 @@ def execute(self, context): self.result_configuration['OutputLocation'] = self.output_location self.query_execution_id = self.hook.run_query(self.query, self.query_execution_context, self.result_configuration, self.client_request_token) -self.hook.poll_query_status(self.query_execution_id) +query_status = self.hook.poll_query_status(self.query_execution_id, self.max_tries) + +if not query_status or query_status in AWSAthenaHook.FAILURE_STATES: +raise Exception( +'Athena job failed. Final state is {}, query_execution_id is {}.' +.format(query_status, self.query_execution_id)) +elif query_status in AWSAthenaHook.INTERMEDIATE_STATES: +raise Exception( +'Athena job failed. Max tries of poll status exceeded, query_execution_id is {}.' +.format(query_status, self.query_execution_id)) Review comment: Only one `{}`, but you provided two values in `.format()` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services