[jira] [Updated] (AIRFLOW-2834) can not see the dag page after build from the newest code in github
[ https://issues.apache.org/jira/browse/AIRFLOW-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rurui Ye updated AIRFLOW-2834:
------------------------------
    Priority: Blocker  (was: Major)

> can not see the dag page after build from the newest code in github
> -------------------------------------------------------------------
>
>                 Key: AIRFLOW-2834
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2834
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: Airflow 2.0
>            Reporter: Rurui Ye
>            Priority: Blocker
>         Attachments: image-2018-08-01-14-20-09-256.png
>
> After building and deploying the newest version of the code from GitHub, the web server starts but the DAGs page is blank, with the following error when requesting resources:
>
> !image-2018-08-01-14-20-09-256.png!

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (AIRFLOW-2834) can not see the dag page after build from the newest code in github
Rurui Ye created AIRFLOW-2834:
---------------------------------

             Summary: can not see the dag page after build from the newest code in github
                 Key: AIRFLOW-2834
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2834
             Project: Apache Airflow
          Issue Type: Bug
    Affects Versions: Airflow 2.0
            Reporter: Rurui Ye
         Attachments: image-2018-08-01-14-20-09-256.png

After building and deploying the newest version of the code from GitHub, the web server starts but the DAGs page is blank, with the following error when requesting resources:

!image-2018-08-01-14-20-09-256.png!
[jira] [Created] (AIRFLOW-2833) Delay in trigger of downstream tasks in DAG
Mishika Singh created AIRFLOW-2833:
--------------------------------------

             Summary: Delay in trigger of downstream tasks in DAG
                 Key: AIRFLOW-2833
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2833
             Project: Apache Airflow
          Issue Type: Bug
            Reporter: Mishika Singh
         Attachments: Screen Shot 2018-05-25 at 9.18.08 AM.png

There is around 2 minutes of delay in triggering the downstream tasks on completion of upstream tasks.
[jira] [Assigned] (AIRFLOW-2703) Scheduler crashes if Mysql Connectivity is lost
[ https://issues.apache.org/jira/browse/AIRFLOW-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

raman reassigned AIRFLOW-2703:
------------------------------
    Assignee: Mishika Singh

> Scheduler crashes if Mysql Connectivity is lost
> -----------------------------------------------
>
>                 Key: AIRFLOW-2703
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2703
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: Airflow 2.0, 1.9.0
>            Reporter: raman
>            Assignee: Mishika Singh
>            Priority: Major
>
> The Airflow scheduler crashes if connectivity to MySQL is lost. Below is the stack trace:
>
> Traceback (most recent call last):
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/jobs.py", line 371, in helper
>     pickle_dags)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in wrapper
>     result = func(*args, **kwargs)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/jobs.py", line 1762, in process_file
>     dag.sync_to_db()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in wrapper
>     result = func(*args, **kwargs)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/models.py", line 3816, in sync_to_db
>     session.commit()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 943, in commit
>     self.transaction.commit()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 471, in commit
>     t[1].commit()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1643, in commit
>     self._do_commit()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1674, in _do_commit
>     self.connection._commit_impl()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 726, in _commit_impl
>     self._handle_dbapi_exception(e, None, None, None, None)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1413, in _handle_dbapi_exception
>     exc_info
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
>     reraise(type(exception), exception, tb=exc_tb, cause=cause)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 724, in _commit_impl
>     self.engine.dialect.do_commit(self.connection)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/dialects/mysql/base.py", line 1784, in do_commit
>     dbapi_connection.commit()
> OperationalError: (_mysql_exceptions.OperationalError) (2013, 'Lost connection to MySQL server during query') (Background on this error at: http://sqlalche.me/e/e3q8)
> Process DagFileProcessor141318-Process:
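This is not part of the reported code, but one common mitigation for "2013, 'Lost connection to MySQL server during query'" errors is SQLAlchemy's `pool_pre_ping` option (available since SQLAlchemy 1.2), which tests each pooled connection with a lightweight ping on checkout and transparently replaces stale ones. A minimal sketch, using SQLite only so the snippet is self-contained; in a real deployment the URL would point at the MySQL metadata database:

```python
from sqlalchemy import create_engine, text

# pool_pre_ping issues a cheap "ping" when a connection is checked out and
# discards dead connections instead of raising OperationalError mid-commit.
# pool_recycle proactively retires connections older than N seconds, which
# avoids MySQL's wait_timeout closing them on the server side.
engine = create_engine(
    "sqlite://",  # illustrative; e.g. "mysql://user:pw@host/airflow" in practice
    pool_pre_ping=True,
    pool_recycle=1800,
)

with engine.connect() as conn:
    value = conn.execute(text("SELECT 1")).scalar()
print(value)
```

Whether this alone would stop the scheduler from crashing depends on where the failure happens; it does not retry a commit that is already in flight, it only prevents handing out already-dead connections.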
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564514#comment-16564514 ]

ASF GitHub Bot commented on AIRFLOW-2524:
-----------------------------------------

codecov-io edited a comment on issue #3658: [AIRFLOW-2524] Add Amazon SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#issuecomment-408564225

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=h1) Report
> Merging [#3658](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/096ba9ecd961cdaebd062599f408571ffb21165a?src=pr&el=desc) will **increase** coverage by `0.4%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3658/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=tree)

```diff
@@            Coverage Diff            @@
##           master    #3658     +/-   ##
=========================================
+ Coverage   77.11%   77.51%    +0.4%
=========================================
  Files         206      205       -1
  Lines       15772    15751      -21
=========================================
+ Hits        12162    12210      +48
+ Misses       3610     3541      -69
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [airflow/www/app.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `99.01% <0%> (-0.99%)` | :arrow_down: |
| [airflow/www/validators.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdmFsaWRhdG9ycy5weQ==) | `100% <0%> (ø)` | :arrow_up: |
| [airflow/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9fX2luaXRfXy5weQ==) | `80.43% <0%> (ø)` | :arrow_up: |
| [airflow/plugins\_manager.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=) | `92.59% <0%> (ø)` | :arrow_up: |
| [airflow/minihivecluster.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9taW5paGl2ZWNsdXN0ZXIucHk=) | | |
| [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `82.74% <0%> (+0.26%)` | :arrow_up: |
| [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `89.87% <0%> (+0.42%)` | :arrow_up: |
| [airflow/hooks/pig\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9waWdfaG9vay5weQ==) | `100% <0%> (+100%)` | :arrow_up: |

--
[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=footer). Last update [096ba9e...3f1e4b1](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Airflow integration with AWS Sagemaker
> --------------------------------------
>
>                 Key: AIRFLOW-2524
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2524
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: aws, contrib
>            Reporter: Rajeev Srinivasan
>            Assignee: Yang Yu
>            Priority: Major
>              Labels: AWS
>
> Would it be possible to orchestrate an end to end AWS Sagemaker job using Airflow?
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564491#comment-16564491 ]

ASF GitHub Bot commented on AIRFLOW-2814:
-----------------------------------------

XD-DENG commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default config
URL: https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409398992

Hi all, thanks for the inputs. I agree with you on the desired value as well (the objective of this PR was to fix the inconsistency between the `.cfg` and the comment in `jobs.py`, not to propose another value for this configuration item).

Hi @kaxil, regarding `dag_dir_list_interval`: personally I think it should be reduced. Five minutes is quite long for users to wait until a new DAG file is reflected.

> Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
> -----------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-2814
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2814
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>            Reporter: Xiaodong DENG
>            Assignee: Xiaodong DENG
>            Priority: Critical
>             Fix For: 2.0.0
>
> h2. Background
> In https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592, the default value of the argument *file_process_interval* is documented as 3 minutes (*file_process_interval*: parse and schedule each file no faster than this interval).
> The value is normally parsed from the default configuration. However, in the default config template its value is 0 rather than 180 seconds (https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432).
> h2. Issue
> This means that each file is parsed and scheduled without letting Airflow "rest". This conflicts with the design intent (180 seconds by default) and may affect performance significantly.
> h2. My Proposal
> Change the value in the config template from 0 to 180.
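For reference, the settings under discussion live in the `[scheduler]` section of `airflow.cfg`. A sketch with the values discussed in this thread (180 seconds as the documented intent; the exact defaults shipped vary by release, so treat these values as illustrative, not authoritative):

```ini
[scheduler]
# Parse and schedule each DAG file no faster than this interval (seconds).
# The inconsistency above: the config template shipped 0 here while the
# comment in jobs.py documented 3 minutes.
min_file_process_interval = 180

# How often (seconds) to scan the DAGs directory for new files.
# kaxil and XD-DENG discuss whether the 5-minute default is too long.
dag_dir_list_interval = 300
```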
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564482#comment-16564482 ] ASF GitHub Bot commented on AIRFLOW-2524: - troychen728 commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon SageMaker Training URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206711545 ## File path: tests/contrib/hooks/test_sagemaker_hook.py ## @@ -0,0 +1,341 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+#
+
+
+import json
+import unittest
+import copy
+try:
+    from unittest import mock
+except ImportError:
+    try:
+        import mock
+    except ImportError:
+        mock = None
+
+from airflow import configuration
+from airflow import models
+from airflow.utils import db
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.hooks.S3_hook import S3Hook
+from airflow.exceptions import AirflowException
+
+
+role = 'test-role'
+
+bucket = 'test-bucket'
+
+key = 'test/data'
+data_url = 's3://{}/{}'.format(bucket, key)
+
+job_name = 'test-job-name'
+
+image = 'test-image'
+
+test_arn_return = {'TrainingJobArn': 'testarn'}
+
+test_list_training_job_return = {
+    'TrainingJobSummaries': [
+        {
+            'TrainingJobName': job_name,
+            'TrainingJobStatus': 'InProgress'
+        },
+    ],
+    'NextToken': 'test-token'
+}
+
+test_list_tuning_job_return = {
+    'TrainingJobSummaries': [
+        {
+            'TrainingJobName': job_name,
+            'TrainingJobArn': 'testarn',
+            'TunedHyperParameters': {
+                'k': '3'
+            },
+            'TrainingJobStatus': 'InProgress'
+        },
+    ],
+    'NextToken': 'test-token'
+}
+
+output_url = 's3://{}/test/output'.format(bucket)
+create_training_params = \
+    {
+        'AlgorithmSpecification': {
+            'TrainingImage': image,
+            'TrainingInputMode': 'File'
+        },
+        'RoleArn': role,
+        'OutputDataConfig': {
+            'S3OutputPath': output_url
+        },
+        'ResourceConfig': {
+            'InstanceCount': 2,
+            'InstanceType': 'ml.c4.8xlarge',
+            'VolumeSizeInGB': 50
+        },
+        'TrainingJobName': job_name,
+        'HyperParameters': {
+            'k': '10',
+            'feature_dim': '784',
+            'mini_batch_size': '500',
+            'force_dense': 'True'
+        },
+        'StoppingCondition': {
+            'MaxRuntimeInSeconds': 60 * 60
+        },
+        'InputDataConfig': [
+            {
+                'ChannelName': 'train',
+                'DataSource': {
+                    'S3DataSource': {
+                        'S3DataType': 'S3Prefix',
+                        'S3Uri': data_url,
+                        'S3DataDistributionType': 'FullyReplicated'
+                    }
+                },
+                'CompressionType': 'None',
+                'RecordWrapperType': 'None'
+            }
+        ]
+    }
+
+create_tuning_params = {
+    'HyperParameterTuningJobName': job_name,
+    'HyperParameterTuningJobConfig': {
+        'Strategy': 'Bayesian',
+        'HyperParameterTuningJobObjective': {
+            'Type': 'Maximize',
+            'MetricName': 'test_metric'
+        },
+        'ResourceLimits': {
+            'MaxNumberOfTrainingJobs': 123,
+            'MaxParallelTrainingJobs': 123
+        },
+        'ParameterRanges': {
+            'IntegerParameterRanges': [
+                {
+                    'Name': 'k',
+                    'MinValue': '2',
+                    'MaxValue': '10'
+                },
+            ]
+        }
+    },
+    'TrainingJobDefinition': {
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564481#comment-16564481 ]

ASF GitHub Bot commented on AIRFLOW-2524:
-----------------------------------------

troychen728 commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206711515

## File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py
## @@ -0,0 +1,98 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.models import BaseOperator
+from airflow.utils import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerCreateTrainingJobOperator(BaseOperator):
+
+    """
+    Initiate a SageMaker training
+
+    This operator returns The ARN of the model created in Amazon SageMaker
+
+    :param training_job_config:
+        The configuration necessary to start a training job (templated)
+    :type training_job_config: dict
+    :param region_name: The AWS region_name
+    :type region_name: string
+    :param sagemaker_conn_id: The SageMaker connection ID to use.
+    :type aws_conn_id: string
+    :param use_db_config: Whether or not to use db config
+        associated with sagemaker_conn_id.
+        If set to true, will automatically update the training config
+        with what's in db, so the db config doesn't need to
+        included everything, but what's there does replace the ones
+        in the training_job_config, so be careful
+    :type use_db_config:
+    :param aws_conn_id: The AWS connection ID to use.
+    :type aws_conn_id: string
+
+    **Example**:
+        The following operator would start a training job when executed
+
+            sagemaker_training = SageMakerCreateTrainingJobOperator(
+                task_id='sagemaker_training',
+                training_job_config=config,
+                use_db_config=True,
+                region_name='us-west-2',
+                sagemaker_conn_id='sagemaker_customers_conn',
+                aws_conn_id='aws_customers_conn'
+            )
+    """
+
+    template_fields = ['training_job_config']
+    template_ext = ()
+    ui_color = '#ededed'
+
+    @apply_defaults
+    def __init__(self,
+                 sagemaker_conn_id=None,

Review comment: Changed the order

> Airflow integration with AWS Sagemaker
> --------------------------------------
>
>                 Key: AIRFLOW-2524
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2524
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564478#comment-16564478 ]

ASF GitHub Bot commented on AIRFLOW-2524:
-----------------------------------------

troychen728 commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206711354

## File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py
## @@ -0,0 +1,98 @@ (Apache license header and imports as above, then:)
+class SageMakerCreateTrainingJobOperator(BaseOperator):
+
+    """
+    Initiate a SageMaker training
+
+    This operator returns The ARN of the model created in Amazon SageMaker
+
+    :param training_job_config:
+        The configuration necessary to start a training job (templated)
+    :type training_job_config: dict
+    :param region_name: The AWS region_name
+    :type region_name: string
+    :param sagemaker_conn_id: The SageMaker connection ID to use.
+    :type aws_conn_id: string
+    :param use_db_config: Whether or not to use db config
+        associated with sagemaker_conn_id.

Review comment: Added

> Airflow integration with AWS Sagemaker
> --------------------------------------
>
>                 Key: AIRFLOW-2524
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2524
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564480#comment-16564480 ]

ASF GitHub Bot commented on AIRFLOW-2814:
-----------------------------------------

codecov-io commented on issue #3669: Revert [AIRFLOW-2814] - Change `min_file_process_interval` to 0
URL: https://github.com/apache/incubator-airflow/pull/3669#issuecomment-409396427

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=h1) Report
> Merging [#3669](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/ed972042a864cd010137190e0bbb1d25a9dcfe83?src=pr&el=desc) will **increase** coverage by `0.27%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3669/graphs/tree.svg?token=WdLKlKHOAU&src=pr&width=650&height=150)](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=tree)

```diff
@@            Coverage Diff             @@
##           master    #3669     +/-   ##
=========================================
+ Coverage   77.51%   77.79%   +0.27%
=========================================
  Files         205      205
  Lines       15751    16079     +328
=========================================
+ Hits        12210    12508     +298
- Misses       3541     3571      +30
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3669/diff?src=pr&el=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `84.63% <ø> (+1.88%)` | :arrow_up: |
| [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/incubator-airflow/pull/3669/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `89.45% <0%> (-0.43%)` | :arrow_down: |

--
[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=footer). Last update [ed97204...1ee1fc4](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

> Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
> -----------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-2814
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2814
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564434#comment-16564434 ]

ASF subversion and git services commented on AIRFLOW-2814:
----------------------------------------------------------

Commit 1ee1fc4ec0bab25d9e75a8ca1943fc1a91a85546 in incubator-airflow's branch refs/heads/revert-2814 from [~kaxilnaik]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=1ee1fc4 ]

Revert [AIRFLOW-2814] - Change `min_file_process_interval` to 0

> Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
> -----------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-2814
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2814
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564422#comment-16564422 ]

ASF GitHub Bot commented on AIRFLOW-2524:
-----------------------------------------

troychen728 commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206700100

## File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py
## @@ -0,0 +1,98 @@ (Apache license header and imports as above, then:)
+class SageMakerCreateTrainingJobOperator(BaseOperator):
+
+    """
+    Initiate a SageMaker training
+
+    This operator returns The ARN of the model created in Amazon SageMaker
+
+    :param training_job_config:
+        The configuration necessary to start a training job (templated)
+    :type training_job_config: dict
+    :param region_name: The AWS region_name
+    :type region_name: string
+    :param sagemaker_conn_id: The SageMaker connection ID to use.
+    :type aws_conn_id: string

Review comment: Hi Fokko, thank you so much for your review; I really appreciate the feedback. I didn't figure out how to reply to your request, so I'll reply here. The main reason I separated this into an operator and a sensor is that the success of a training job has two stages: successfully kicking off the training job, and the training job successfully finishing. The operator reports the first status and the sensor the latter. Also, since a training job runs on an AWS instance rather than the instance hosting Airflow, other operators can set their upstream to the operator, rather than the sensor, if they don't depend on the model actually being created. Finally, with a sensor users can set parameters like poke_interval, which makes more sense on a sensor than on an operator.

> Airflow integration with AWS Sagemaker
> --------------------------------------
>
>                 Key: AIRFLOW-2524
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2524
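The kick-off/poll split described in the reply above can be sketched generically. These classes and names are illustrative stand-ins, not Airflow's actual `BaseOperator`/`BaseSensorOperator` API or the SageMaker hook from the PR: an "operator" submits the job and returns immediately with its identifier, while a "sensor" polls job status at `poke_interval` until a terminal state or a timeout.

```python
import time


class FakeJobService:
    """Hypothetical stand-in for a remote training service."""
    def __init__(self, finish_after=2):
        self._polls = 0
        self._finish_after = finish_after

    def create_job(self, config):
        # Kick-off succeeds immediately; training continues remotely.
        return {'TrainingJobArn': 'arn:fake:job/' + config['name']}

    def job_status(self):
        # Reports 'InProgress' until the job eventually completes.
        self._polls += 1
        return 'Completed' if self._polls >= self._finish_after else 'InProgress'


def submit_job(service, config):
    """Operator-style step: start the job, return its ARN at once."""
    return service.create_job(config)['TrainingJobArn']


def wait_for_job(service, poke_interval=0.01, timeout=1.0):
    """Sensor-style step: poll until the job reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if service.job_status() == 'Completed':
            return True
        time.sleep(poke_interval)
    return False


service = FakeJobService()
arn = submit_job(service, {'name': 'demo'})   # downstream tasks that only need
done = wait_for_job(service)                  # the kick-off can depend on `arn`
print(arn, done)
```

The design point from the review thread: tasks that only need the job to have been started depend on the submit step, while tasks that need the trained model depend on the wait step.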
[jira] [Commented] (AIRFLOW-2658) Add GKE specific Kubernetes Pod Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564402#comment-16564402 ]

ASF GitHub Bot commented on AIRFLOW-2658:
-----------------------------------------

fenglu-g commented on issue #3532: [AIRFLOW-2658] Add GCP specific k8s pod operator
URL: https://github.com/apache/incubator-airflow/pull/3532#issuecomment-409378846

@Noremac201 please fix travis-ci, thanks.

> Add GKE specific Kubernetes Pod Operator
> ----------------------------------------
>
>                 Key: AIRFLOW-2658
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2658
>             Project: Apache Airflow
>          Issue Type: New Feature
>            Reporter: Cameron Moberg
>            Assignee: Cameron Moberg
>            Priority: Minor
>
> Currently there is a Kubernetes Pod operator, but it is not easy to make it work with GCP Kubernetes Engine; it would be nice to have one that does.
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564400#comment-16564400 ] ASF GitHub Bot commented on AIRFLOW-2814: - XD-DENG commented on issue #3669: Revert [AIRFLOW-2814] - Change `min_file_process_interval` to 0 URL: https://github.com/apache/incubator-airflow/pull/3669#issuecomment-409378082 Hi @kaxil , please be reminded to update the comment in https://github.com/apache/incubator-airflow/blob/master/airflow/jobs.py#L592 as well, otherwise the comment will be inconsistent with the configuration value again. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Default Arg "file_process_interval" for class SchedulerJob is inconsistent > with doc > --- > > Key: AIRFLOW-2814 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2814 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Critical > Fix For: 2.0.0 > > > h2. Background > In > [https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592] > , it was mentioned the default value of argument *file_process_interval* > should be 3 minutes (*file_process_interval:* Parse and schedule each file no > faster than this interval). > The value is normally parsed from the default configuration. However, in the > default config_template, its value is 0 rather than 180 seconds > ([https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432] > ). > h2. Issue > This means that each file is actually parsed and scheduled without > letting Airflow "rest". This conflicts with the design purpose (by default > let it be 180 seconds) and may affect performance significantly. > h2. My Proposal > Change the value in the config template from 0 to 180. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
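The effect of the interval being debated in this thread can be illustrated with a small throttling sketch. This is a hypothetical model of the documented behaviour, not Airflow's actual SchedulerJob code: a file is re-parsed only if at least the configured interval has elapsed since its last parse, so a value of 0 means every scheduler loop re-parses every file.

```python
import time


class FileProcessThrottle:
    """Re-parse a DAG file no faster than min_file_process_interval seconds.

    Hypothetical sketch; an injectable clock makes the behaviour testable."""

    def __init__(self, min_file_process_interval, clock=time.monotonic):
        self.interval = min_file_process_interval
        self.clock = clock
        self.last_parsed = {}  # file path -> timestamp of last parse

    def should_parse(self, path):
        now = self.clock()
        last = self.last_parsed.get(path)
        if last is not None and now - last < self.interval:
            return False  # let the scheduler "rest" on this file
        self.last_parsed[path] = now
        return True
```

With `interval=0` the guard condition can never hold, which is exactly the "no rest" behaviour the issue describes; with `interval=180` a file parsed at t=0 is skipped until t=180.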
[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files
[ https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564395#comment-16564395 ] ASF GitHub Bot commented on AIRFLOW-2832: - codecov-io commented on issue #3670: [AIRFLOW-2832] Lint and resolve inconsistencies in Markdown files URL: https://github.com/apache/incubator-airflow/pull/3670#issuecomment-409376218 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=h1) Report > Merging [#3670](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/ed972042a864cd010137190e0bbb1d25a9dcfe83?src=pr&el=desc) will **not change** coverage. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3670/graphs/tree.svg?width=650&src=pr&token=WdLKlKHOAU&height=150)](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=tree)
```diff
@@           Coverage Diff           @@
##           master   #3670   +/-   ##
=======================================
  Coverage   77.51%   77.51%
=======================================
  Files         205      205
  Lines       15751    15751
=======================================
  Hits        12210    12210
  Misses       3541     3541
```
-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=footer). Last update [ed97204...eef6fc8](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Inconsistencies and linter errors across markdown files > --- > > Key: AIRFLOW-2832 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2832 > Project: Apache Airflow > Issue Type: Improvement > Components: docs, Documentation >Reporter: Taylor Edmiston >Assignee: Taylor Edmiston >Priority: Minor > > There are a number of inconsistencies within and across markdown files in the > Airflow project. Most of these are simple formatting issues easily fixed by > linting (e.g., with mdl). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files
[ https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564396#comment-16564396 ] ASF GitHub Bot commented on AIRFLOW-2832: - codecov-io edited a comment on issue #3670: [AIRFLOW-2832] Lint and resolve inconsistencies in Markdown files URL: https://github.com/apache/incubator-airflow/pull/3670#issuecomment-409376218 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=h1) Report > Merging [#3670](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/ed972042a864cd010137190e0bbb1d25a9dcfe83?src=pr&el=desc) will **not change** coverage. > The diff coverage is `n/a`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3670/graphs/tree.svg?height=150&width=650&token=WdLKlKHOAU&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=tree)
```diff
@@           Coverage Diff           @@
##           master   #3670   +/-   ##
=======================================
  Coverage   77.51%   77.51%
=======================================
  Files         205      205
  Lines       15751    15751
=======================================
  Hits        12210    12210
  Misses       3541     3541
```
-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=footer). Last update [ed97204...eef6fc8](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Inconsistencies and linter errors across markdown files > --- > > Key: AIRFLOW-2832 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2832 > Project: Apache Airflow > Issue Type: Improvement > Components: docs, Documentation >Reporter: Taylor Edmiston >Assignee: Taylor Edmiston >Priority: Minor > > There are a number of inconsistencies within and across markdown files in the > Airflow project. Most of these are simple formatting issues easily fixed by > linting (e.g., with mdl). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564387#comment-16564387 ] ASF GitHub Bot commented on AIRFLOW-2803: - tedmiston commented on a change in pull request #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#discussion_r206688518 ## File path: airflow/www_rbac/templates/airflow/circles.html ## @@ -28,117 +28,111 @@ Airflow 404 = lots of circles
[jira] [Commented] (AIRFLOW-2817) Force explicit choice on GPL dependency
[ https://issues.apache.org/jira/browse/AIRFLOW-2817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564373#comment-16564373 ] ASF GitHub Bot commented on AIRFLOW-2817: - ashb commented on issue #3660: [AIRFLOW-2817] Force explicit choice on GPL dependency URL: https://github.com/apache/incubator-airflow/pull/3660#issuecomment-409370019 Charting is causing us quite the license head-ache isn't it? :( This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Force explicit choice on GPL dependency > --- > > Key: AIRFLOW-2817 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2817 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Bolke de Bruin >Priority: Major > > A more explicit choice on GPL dependency was required by the IPMC -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564370#comment-16564370 ] ASF GitHub Bot commented on AIRFLOW-2803: - ashb commented on a change in pull request #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#discussion_r206684313 ## File path: airflow/www_rbac/templates/airflow/circles.html ## @@ -28,117 +28,111 @@ Airflow 404 = lots of circles
[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running
[ https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564359#comment-16564359 ] ASF GitHub Bot commented on AIRFLOW-1104: - codecov-io edited a comment on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does not over schedule tasks URL: https://github.com/apache/incubator-airflow/pull/3568#issuecomment-401878707 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=h1) Report > Merging [#3568](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/3b35d360f6ff8694b6fb4387901c182ca39160b5?src=pr&el=desc) will **increase** coverage by `<.01%`. > The diff coverage is `100%`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3568/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=tree)
```diff
@@            Coverage Diff            @@
##           master   #3568      +/-   ##
=========================================
+ Coverage   77.51%   77.51%   +<.01%
=========================================
  Files         205      205
  Lines       15751    15751
=========================================
+ Hits        12209    12210       +1
+ Misses       3542     3541       -1
```
| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=tree) | Coverage Δ | | |---|---|---| | [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3568/diff?src=pr&el=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `82.74% <100%> (ø)` | :arrow_up: | | [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3568/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `88.58% <0%> (+0.04%)` | :arrow_up: | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? 
= missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=footer). Last update [3b35d36...b04c9b1](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Concurrency check in scheduler should count queued tasks as well as running > --- > > Key: AIRFLOW-1104 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1104 > Project: Apache Airflow > Issue Type: Bug > Environment: see https://github.com/apache/incubator-airflow/pull/2221 > "Tasks with the QUEUED state should also be counted below, but for now we > cannot count them. This is because there is no guarantee that queued tasks in > failed dagruns will or will not eventually run and queued tasks that will > never run will consume slots and can stall a DAG. Once we can guarantee that > all queued tasks in failed dagruns will never run (e.g. make sure that all > running/newly queued TIs have running dagruns), then we can include QUEUED > tasks here, with the constraint that they are in running dagruns." >Reporter: Alex Guziel >Priority: Minor > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
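The change under review in this thread makes the scheduler count QUEUED task instances against a DAG's concurrency limit alongside RUNNING ones, so tasks are no longer over-scheduled. A stripped-down sketch of that accounting (hypothetical helper using plain dicts in place of Airflow TaskInstance objects; the real logic lives in `_find_executable_task_instances` in jobs.py):

```python
RUNNING, QUEUED, SCHEDULED = "running", "queued", "scheduled"

# States that consume a concurrency slot; the fix in this PR adds QUEUED.
STATES_TO_COUNT_AS_RUNNING = (RUNNING, QUEUED)


def find_executable(task_instances, concurrency_limit):
    """Return the SCHEDULED task instances that fit under the DAG's limit,
    counting both RUNNING and QUEUED tasks as occupying slots."""
    occupied = sum(1 for ti in task_instances
                   if ti["state"] in STATES_TO_COUNT_AS_RUNNING)
    open_slots = max(concurrency_limit - occupied, 0)
    candidates = [ti for ti in task_instances if ti["state"] == SCHEDULED]
    return candidates[:open_slots]
```

With concurrency 3 and one task each in RUNNING, QUEUED, and SCHEDULED, only the single SCHEDULED task is executable, matching the unit test added in the PR; counting only RUNNING would have admitted one task too many.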
[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files
[ https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564354#comment-16564354 ] ASF GitHub Bot commented on AIRFLOW-2832: - tedmiston edited a comment on issue #3670: [AIRFLOW-2832] Lint and resolve inconsistencies in Markdown files URL: https://github.com/apache/incubator-airflow/pull/3670#issuecomment-409358478 This PR is now squashed and ready for review. I'm not sure that there's any one best person to review these changes but in a git log, I see that @bolkedebruin, @Fokko, and @r39132 have modified some of these files in recent history. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Inconsistencies and linter errors across markdown files > --- > > Key: AIRFLOW-2832 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2832 > Project: Apache Airflow > Issue Type: Improvement > Components: docs, Documentation >Reporter: Taylor Edmiston >Assignee: Taylor Edmiston >Priority: Minor > > There are a number of inconsistencies within and across markdown files in the > Airflow project. Most of these are simple formatting issues easily fixed by > linting (e.g., with mdl). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files
[ https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564342#comment-16564342 ] ASF GitHub Bot commented on AIRFLOW-2832: - tedmiston commented on issue #3670: [AIRFLOW-2832] Lint and resolve inconsistencies in Markdown files URL: https://github.com/apache/incubator-airflow/pull/3670#issuecomment-409358478 This PR is now squashed and ready for review. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Inconsistencies and linter errors across markdown files > --- > > Key: AIRFLOW-2832 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2832 > Project: Apache Airflow > Issue Type: Improvement > Components: docs, Documentation >Reporter: Taylor Edmiston >Assignee: Taylor Edmiston >Priority: Minor > > There are a number of inconsistencies within and across markdown files in the > Airflow project. Most of these are simple formatting issues easily fixed by > linting (e.g., with mdl). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files
[ https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564341#comment-16564341 ] ASF GitHub Bot commented on AIRFLOW-2832: - tedmiston opened a new pull request #3670: [AIRFLOW-2832] Lint and resolve inconsistencies in Markdown files URL: https://github.com/apache/incubator-airflow/pull/3670 Make sure you have checked _all_ steps below. ### JIRA - [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-2832 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue. ### Description - [x] Here are some details about my PR, including screenshots of any UI changes: - Inspired by other recent issues related to linter errors in Python and JS (AIRFLOW-2783, AIRFLOW-2800, AIRFLOW-2803) - This PR does a few things: - Resolves linter errors in markdown files across the project (ignores errors that aren't super useful on GitHub such as line wrapping and putting `` in brackets) - Clarifies that commit message length of 50 characters doesn't include the Jira issue tag - Replaces usage of JIRA with Jira the way it's styled nowadays by [Atlassian](https://www.atlassian.com/software/jira) and [Wikipedia](https://en.wikipedia.org/wiki/Jira_(software)) - Makes code block formatting consistent ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: The changes in this PR are restricted to linting documentation. ### Commits - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. 
In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" ### Documentation - [x] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. n/a ### Code Quality - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Inconsistencies and linter errors across markdown files > --- > > Key: AIRFLOW-2832 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2832 > Project: Apache Airflow > Issue Type: Improvement > Components: docs, Documentation >Reporter: Taylor Edmiston >Assignee: Taylor Edmiston >Priority: Minor > > There are a number of inconsistencies within and across markdown files in the > Airflow project. Most of these are simple formatting issues easily fixed by > linting (e.g., with mdl). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running
[ https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik resolved AIRFLOW-1104. - Resolution: Resolved Fix Version/s: 2.0.0 Resolved by https://github.com/apache/incubator-airflow/pull/3568 > Concurrency check in scheduler should count queued tasks as well as running > --- > > Key: AIRFLOW-1104 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1104 > Project: Apache Airflow > Issue Type: Bug > Environment: see https://github.com/apache/incubator-airflow/pull/2221 > "Tasks with the QUEUED state should also be counted below, but for now we > cannot count them. This is because there is no guarantee that queued tasks in > failed dagruns will or will not eventually run and queued tasks that will > never run will consume slots and can stall a DAG. Once we can guarantee that > all queued tasks in failed dagruns will never run (e.g. make sure that all > running/newly queued TIs have running dagruns), then we can include QUEUED > tasks here, with the constraint that they are in running dagruns." >Reporter: Alex Guziel >Priority: Minor > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running
[ https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564330#comment-16564330 ] ASF GitHub Bot commented on AIRFLOW-1104: - kaxil closed pull request #3568: AIRFLOW-1104 Update jobs.py so Airflow does not over schedule tasks URL: https://github.com/apache/incubator-airflow/pull/3568 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/jobs.py b/airflow/jobs.py
index 224ff185fb..a4252473cd 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -1075,9 +1075,6 @@ def _find_executable_task_instances(self, simple_dag_bag, states, session=None):
         :type states: Tuple[State]
         :return: List[TaskInstance]
         """
-        # TODO(saguziel): Change this to include QUEUED, for concurrency
-        # purposes we may want to count queued tasks
-        states_to_count_as_running = [State.RUNNING]
         executable_tis = []

         # Get all the queued task instances from associated with scheduled
@@ -1123,6 +1120,7 @@
         for task_instance in task_instances_to_examine:
             pool_to_task_instances[task_instance.pool].append(task_instance)

+        states_to_count_as_running = [State.RUNNING, State.QUEUED]
         task_concurrency_map = self.__get_task_concurrency_map(
             states=states_to_count_as_running, session=session)
@@ -1173,7 +1171,6 @@
             simple_dag = simple_dag_bag.get_dag(dag_id)
             if dag_id not in dag_id_to_possibly_running_task_count:
-                # TODO(saguziel): also check against QUEUED state, see AIRFLOW-1104
                 dag_id_to_possibly_running_task_count[dag_id] = \
                     DAG.get_num_task_instances(
                         dag_id,
diff --git a/tests/jobs.py b/tests/jobs.py
index 93f6574df4..c701214f1e 100644
--- a/tests/jobs.py
+++ b/tests/jobs.py
@@ -1493,6 +1493,39 @@ def test_find_executable_task_instances_concurrency(self):
         self.assertEqual(0, len(res))

+    def test_find_executable_task_instances_concurrency_queued(self):
+        dag_id = 'SchedulerJobTest.test_find_executable_task_instances_concurrency_queued'
+        dag = DAG(dag_id=dag_id, start_date=DEFAULT_DATE, concurrency=3)
+        task1 = DummyOperator(dag=dag, task_id='dummy1')
+        task2 = DummyOperator(dag=dag, task_id='dummy2')
+        task3 = DummyOperator(dag=dag, task_id='dummy3')
+        dagbag = self._make_simple_dag_bag([dag])
+
+        scheduler = SchedulerJob()
+        session = settings.Session()
+        dag_run = scheduler.create_dag_run(dag)
+
+        ti1 = TI(task1, dag_run.execution_date)
+        ti2 = TI(task2, dag_run.execution_date)
+        ti3 = TI(task3, dag_run.execution_date)
+        ti1.state = State.RUNNING
+        ti2.state = State.QUEUED
+        ti3.state = State.SCHEDULED
+
+        session.merge(ti1)
+        session.merge(ti2)
+        session.merge(ti3)
+
+        session.commit()
+
+        res = scheduler._find_executable_task_instances(
+            dagbag,
+            states=[State.SCHEDULED],
+            session=session)
+
+        self.assertEqual(1, len(res))
+        self.assertEqual(res[0].key, ti3.key)
+
     def test_find_executable_task_instances_task_concurrency(self):
         dag_id = 'SchedulerJobTest.test_find_executable_task_instances_task_concurrency'
         task_id_1 = 'dummy'

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Concurrency check in scheduler should count queued tasks as well as running > --- > > Key: AIRFLOW-1104 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1104 > Project: Apache Airflow > Issue Type: Bug > Environment: see https://github.com/apache/incubator-airflow/pull/2221 > "Tasks with the QUEUED state should also be counted below, but for now we > cannot count them. 
This is because there is no guarantee that queued tasks in > failed dagruns will or will not eventually run and queued tasks that will > never run will consume slots and can stall a DAG. Once we can guarantee that > all queued tasks in failed dagruns will never run (e.g. make sure that all
[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running
[ https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564331#comment-16564331 ] ASF subversion and git services commented on AIRFLOW-1104: -- Commit ed972042a864cd010137190e0bbb1d25a9dcfe83 in incubator-airflow's branch refs/heads/master from Dan Fowler [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=ed97204 ] [AIRFLOW-1104] Update jobs.py so Airflow does not over schedule tasks (#3568) This change will prevent tasks from getting scheduled and queued over the concurrency limits set for the dag > Concurrency check in scheduler should count queued tasks as well as running > --- > > Key: AIRFLOW-1104 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1104 > Project: Apache Airflow > Issue Type: Bug > Environment: see https://github.com/apache/incubator-airflow/pull/2221 > "Tasks with the QUEUED state should also be counted below, but for now we > cannot count them. This is because there is no guarantee that queued tasks in > failed dagruns will or will not eventually run and queued tasks that will > never run will consume slots and can stall a DAG. Once we can guarantee that > all queued tasks in failed dagruns will never run (e.g. make sure that all > running/newly queued TIs have running dagruns), then we can include QUEUED > tasks here, with the constraint that they are in running dagruns." >Reporter: Alex Guziel >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running
[ https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564327#comment-16564327 ] ASF GitHub Bot commented on AIRFLOW-1104: - dan-sf commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does not over schedule tasks URL: https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409355510 Sure, the changes have been rebased on master This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Concurrency check in scheduler should count queued tasks as well as running > --- > > Key: AIRFLOW-1104 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1104 > Project: Apache Airflow > Issue Type: Bug > Environment: see https://github.com/apache/incubator-airflow/pull/2221 > "Tasks with the QUEUED state should also be counted below, but for now we > cannot count them. This is because there is no guarantee that queued tasks in > failed dagruns will or will not eventually run and queued tasks that will > never run will consume slots and can stall a DAG. Once we can guarantee that > all queued tasks in failed dagruns will never run (e.g. make sure that all > running/newly queued TIs have running dagruns), then we can include QUEUED > tasks here, with the constraint that they are in running dagruns." >Reporter: Alex Guziel >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564316#comment-16564316 ] ASF GitHub Bot commented on AIRFLOW-2814: - kaxil commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default config URL: https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409351337 Agreed with everyone. Do you guys think we should decrease the time duration for `dag_dir_list_interval` as well? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Default Arg "file_process_interval" for class SchedulerJob is inconsistent > with doc > --- > > Key: AIRFLOW-2814 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2814 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Critical > Fix For: 2.0.0 > > > h2. Background > In > [https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592] > , it was mentioned the default value of argument *file_process_interval* > should be 3 minutes (*file_process_interval:* Parse and schedule each file no > faster than this interval). > The value is normally parsed from the default configuration. However, in the > default config_template, its value is 0 rather than 180 seconds > ([https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432] > ). > h2. Issue > This means that each file is actually parsed and scheduled without > letting Airflow "rest". This conflicts with the design purpose (by default > let it be 180 seconds) and may affect performance significantly. > h2. My Proposal > Change the value in the config template from 0 to 180. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running
[ https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564313#comment-16564313 ] ASF GitHub Bot commented on AIRFLOW-1104: - kaxil commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does not over schedule tasks URL: https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409350840 Can you squash your commits as well? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Concurrency check in scheduler should count queued tasks as well as running > --- > > Key: AIRFLOW-1104 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1104 > Project: Apache Airflow > Issue Type: Bug > Environment: see https://github.com/apache/incubator-airflow/pull/2221 > "Tasks with the QUEUED state should also be counted below, but for now we > cannot count them. This is because there is no guarantee that queued tasks in > failed dagruns will or will not eventually run and queued tasks that will > never run will consume slots and can stall a DAG. Once we can guarantee that > all queued tasks in failed dagruns will never run (e.g. make sure that all > running/newly queued TIs have running dagruns), then we can include QUEUED > tasks here, with the constraint that they are in running dagruns." >Reporter: Alex Guziel >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564312#comment-16564312 ] ASF GitHub Bot commented on AIRFLOW-2814: - feng-tao commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default config URL: https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409350792 +1 on keeping 0. 180 seconds is surely too high...
[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running
[ https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564311#comment-16564311 ] ASF GitHub Bot commented on AIRFLOW-1104: - dan-sf commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does not over schedule tasks URL: https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409350564 @kaxil Conflicts have been updated
[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running
[ https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564274#comment-16564274 ] ASF GitHub Bot commented on AIRFLOW-1104: - kaxil commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does not over schedule tasks URL: https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409343719 @dan-sf Can you please resolve the conflicts?
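The over-scheduling this PR addresses comes down to which task-instance states count against a DAG's concurrency limit. A minimal sketch, using string stand-ins rather than Airflow's real State constants (names here are illustrative only):

```python
# Simplified stand-ins for Airflow's State constants.
RUNNING, QUEUED, SUCCESS = "running", "queued", "success"

def occupied_slots(task_states, count_queued=True):
    """Count task instances that consume concurrency slots.

    The PR's point: queued tasks hold (or will soon hold) slots, so
    counting only RUNNING lets the scheduler over-schedule a DAG.
    """
    states = {RUNNING, QUEUED} if count_queued else {RUNNING}
    return sum(1 for s in task_states if s in states)

tis = [RUNNING, QUEUED, QUEUED, SUCCESS]
assert occupied_slots(tis, count_queued=False) == 1  # pre-fix behaviour
assert occupied_slots(tis) == 3                      # queued counted too
```

The caveat quoted in the issue's Environment field still applies: queued tasks in failed dagruns may never run, so a real implementation would restrict the QUEUED count to tasks whose dagruns are running.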
[jira] [Created] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files
Taylor Edmiston created AIRFLOW-2832: Summary: Inconsistencies and linter errors across markdown files Key: AIRFLOW-2832 URL: https://issues.apache.org/jira/browse/AIRFLOW-2832 Project: Apache Airflow Issue Type: Improvement Components: docs, Documentation Reporter: Taylor Edmiston Assignee: Taylor Edmiston There are a number of inconsistencies within and across markdown files in the Airflow project. Most of these are simple formatting issues easily fixed by linting (e.g., with mdl).
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564270#comment-16564270 ] ASF GitHub Bot commented on AIRFLOW-2814: - kaxil commented on issue #3669: Revert [AIRFLOW-2814] - Change `min_file_process_interval` to 0 URL: https://github.com/apache/incubator-airflow/pull/3669#issuecomment-409342022 @Fokko PTAL. Also, shouldn't we be reducing `dag_dir_list_interval` as well? It is 5 mins by default.
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564269#comment-16564269 ] ASF GitHub Bot commented on AIRFLOW-2814: - kaxil opened a new pull request #3669: Revert [AIRFLOW-2814] - Change `min_file_process_interval` to 0 URL: https://github.com/apache/incubator-airflow/pull/3669 Make sure you have checked _all_ steps below. ### JIRA - [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-XXX - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue. ### Description - [x] Here are some details about my PR, including screenshots of any UI changes: ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" ### Documentation - [x] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. ### Code Quality - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff` This is an automated message from the Apache Git Service. 
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564262#comment-16564262 ] ASF GitHub Bot commented on AIRFLOW-2524: - Fokko commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon SageMaker Training URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206654107 ## File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py ## @@ -0,0 +1,98 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +from airflow.contrib.hooks.sagemaker_hook import SageMakerHook +from airflow.models import BaseOperator +from airflow.utils import apply_defaults +from airflow.exceptions import AirflowException + + +class SageMakerCreateTrainingJobOperator(BaseOperator): + +""" + Initiate a SageMaker training + + This operator returns The ARN of the model created in Amazon SageMaker + + :param training_job_config: + The configuration necessary to start a training job (templated) + :type training_job_config: dict + :param region_name: The AWS region_name + :type region_name: string + :param sagemaker_conn_id: The SageMaker connection ID to use. 
+ :type aws_conn_id: string Review comment: Should be `sagemaker_conn_id` > Airflow integration with AWS Sagemaker > -- > > Key: AIRFLOW-2524 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2524 > Project: Apache Airflow > Issue Type: Improvement > Components: aws, contrib >Reporter: Rajeev Srinivasan >Assignee: Yang Yu >Priority: Major > Labels: AWS > > Would it be possible to orchestrate an end to end AWS Sagemaker job using > Airflow.
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564264#comment-16564264 ] ASF GitHub Bot commented on AIRFLOW-2524: - Fokko commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon SageMaker Training URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206655197 ## File path: tests/contrib/hooks/test_sagemaker_hook.py ## @@ -0,0 +1,341 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+# + + +import json +import unittest +import copy +try: +from unittest import mock +except ImportError: +try: +import mock +except ImportError: +mock = None + +from airflow import configuration +from airflow import models +from airflow.utils import db +from airflow.contrib.hooks.sagemaker_hook import SageMakerHook +from airflow.hooks.S3_hook import S3Hook +from airflow.exceptions import AirflowException + + +role = 'test-role' + +bucket = 'test-bucket' + +key = 'test/data' +data_url = 's3://{}/{}'.format(bucket, key) + +job_name = 'test-job-name' + +image = 'test-image' + +test_arn_return = {'TrainingJobArn': 'testarn'} + +test_list_training_job_return = { +'TrainingJobSummaries': [ +{ +'TrainingJobName': job_name, +'TrainingJobStatus': 'InProgress' +}, +], +'NextToken': 'test-token' +} + +test_list_tuning_job_return = { +'TrainingJobSummaries': [ +{ +'TrainingJobName': job_name, +'TrainingJobArn': 'testarn', +'TunedHyperParameters': { +'k': '3' +}, +'TrainingJobStatus': 'InProgress' +}, +], +'NextToken': 'test-token' +} + +output_url = 's3://{}/test/output'.format(bucket) +create_training_params = \ +{ +'AlgorithmSpecification': { +'TrainingImage': image, +'TrainingInputMode': 'File' +}, +'RoleArn': role, +'OutputDataConfig': { +'S3OutputPath': output_url +}, +'ResourceConfig': { +'InstanceCount': 2, +'InstanceType': 'ml.c4.8xlarge', +'VolumeSizeInGB': 50 +}, +'TrainingJobName': job_name, +'HyperParameters': { +'k': '10', +'feature_dim': '784', +'mini_batch_size': '500', +'force_dense': 'True' +}, +'StoppingCondition': { +'MaxRuntimeInSeconds': 60 * 60 +}, +'InputDataConfig': [ +{ +'ChannelName': 'train', +'DataSource': { +'S3DataSource': { +'S3DataType': 'S3Prefix', +'S3Uri': data_url, +'S3DataDistributionType': 'FullyReplicated' +} +}, +'CompressionType': 'None', +'RecordWrapperType': 'None' +} +] +} + +create_tuning_params = {'HyperParameterTuningJobName': job_name, +'HyperParameterTuningJobConfig': { +'Strategy': 'Bayesian', 
+'HyperParameterTuningJobObjective': { +'Type': 'Maximize', +'MetricName': 'test_metric' +}, +'ResourceLimits': { +'MaxNumberOfTrainingJobs': 123, +'MaxParallelTrainingJobs': 123 +}, +'ParameterRanges': { +'IntegerParameterRanges': [ +{ +'Name': 'k', +'MinValue': '2', +'MaxValue': '10' +}, +] +} +}, +'TrainingJobDefinition': { +
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564265#comment-16564265 ] ASF GitHub Bot commented on AIRFLOW-2524: - Fokko commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon SageMaker Training URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206654353 ## File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py ## + :param use_db_config: Whether or not to use db config + associated with sagemaker_conn_id. Review comment: Missing `:type use_db_config: bool`
[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker
[ https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564263#comment-16564263 ] ASF GitHub Bot commented on AIRFLOW-2524: - Fokko commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon SageMaker Training URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206654727 ## File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py ##
+    :param use_db_config: Whether or not to use db config
+        associated with sagemaker_conn_id.
+        If set to true, will automatically update the training config
+        with what's in db, so the db config doesn't need to
+        included everything, but what's there does replace the ones
+        in the training_job_config, so be careful
+    :type use_db_config:
+    :param aws_conn_id: The AWS connection ID to use.
+    :type aws_conn_id: string
+
+    **Example**:
+        The following operator would start a training job when executed
+
+        sagemaker_training = SageMakerCreateTrainingJobOperator(
+            task_id='sagemaker_training',
+            training_job_config=config,
+            use_db_config=True,
+            region_name='us-west-2'
+            sagemaker_conn_id='sagemaker_customers_conn',
+            aws_conn_id='aws_customers_conn'
+        )
+    """
+
+    template_fields = ['training_job_config']
+    template_ext = ()
+    ui_color = '#ededed'
+
+    @apply_defaults
+    def __init__(self,
+                 sagemaker_conn_id=None,
Review comment: Please make the order of the arguments congruent with the docstring, or the other way around
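The review point above, keeping the docstring's `:param` order congruent with the `__init__` signature, can even be checked mechanically. A hedged sketch (the class and parameter names below are illustrative stand-ins, not Airflow's real operator):

```python
import inspect
import re

def docstring_param_order(obj):
    """Extract ':param x:' names from a Sphinx-style docstring, in order."""
    return re.findall(r":param (\w+):", obj.__doc__ or "")

class ExampleOperator:
    """Toy operator used only to illustrate the review point.

    :param sagemaker_conn_id: connection ID (hypothetical name)
    :type sagemaker_conn_id: str
    :param use_db_config: whether to merge config stored in the DB
    :type use_db_config: bool
    :param aws_conn_id: AWS connection ID
    :type aws_conn_id: str
    """
    def __init__(self, sagemaker_conn_id=None, use_db_config=False,
                 aws_conn_id=None):
        self.sagemaker_conn_id = sagemaker_conn_id
        self.use_db_config = use_db_config
        self.aws_conn_id = aws_conn_id

# Docstring order matches the __init__ signature (minus 'self').
doc_order = docstring_param_order(ExampleOperator)
sig_order = list(inspect.signature(ExampleOperator.__init__).parameters)[1:]
assert doc_order == sig_order == ["sagemaker_conn_id", "use_db_config", "aws_conn_id"]
```

A check like this could run in CI to catch drift between docstrings and signatures.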
[jira] [Commented] (AIRFLOW-2670) SSHOperator's timeout parameter doesn't affect SSHHook timeout
[ https://issues.apache.org/jira/browse/AIRFLOW-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564247#comment-16564247 ] ASF GitHub Bot commented on AIRFLOW-2670: - Fokko closed pull request #3666: [AIRFLOW-2670] Update SSH Operator's Hook to respect timeout URL: https://github.com/apache/incubator-airflow/pull/3666 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/contrib/operators/ssh_operator.py b/airflow/contrib/operators/ssh_operator.py index 2e890f463e..747ad04ff0 100644 --- a/airflow/contrib/operators/ssh_operator.py +++ b/airflow/contrib/operators/ssh_operator.py @@ -69,16 +69,17 @@ def __init__(self, def execute(self, context): try: if self.ssh_conn_id and not self.ssh_hook: -self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id) +self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id, +timeout=self.timeout) if not self.ssh_hook: -raise AirflowException("can not operate without ssh_hook or ssh_conn_id") +raise AirflowException("Cannot operate without ssh_hook or ssh_conn_id.") if self.remote_host is not None: self.ssh_hook.remote_host = self.remote_host if not self.command: -raise AirflowException("no command specified so nothing to execute here.") +raise AirflowException("SSH command not specified. 
Aborting.") with self.ssh_hook.get_conn() as ssh_client: # Auto apply tty when its required in case of sudo diff --git a/tests/contrib/operators/test_ssh_operator.py b/tests/contrib/operators/test_ssh_operator.py index b97ba84a01..7ddd24b2ac 100644 --- a/tests/contrib/operators/test_ssh_operator.py +++ b/tests/contrib/operators/test_ssh_operator.py @@ -7,9 +7,9 @@ # to you under the Apache License, Version 2.0 (the # "License"); you may not use this file except in compliance # with the License. You may obtain a copy of the License at -# +# # http://www.apache.org/licenses/LICENSE-2.0 -# +# # Unless required by applicable law or agreed to in writing, # software distributed under the License is distributed on an # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY @@ -58,6 +58,23 @@ def setUp(self): self.hook = hook self.dag = dag +def test_hook_created_correctly(self): +TIMEOUT = 20 +SSH_ID = "ssh_default" +task = SSHOperator( +task_id="test", +command="echo -n airflow", +dag=self.dag, +timeout=TIMEOUT, +ssh_conn_id="ssh_default" +) +self.assertIsNotNone(task) + +task.execute(None) + +self.assertEquals(TIMEOUT, task.ssh_hook.timeout) +self.assertEquals(SSH_ID, task.ssh_hook.ssh_conn_id) + def test_json_command_execution(self): configuration.conf.set("core", "enable_xcom_pickling", "False") task = SSHOperator( This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > SSHOperator's timeout parameter doesn't affect SSHook timeoot > - > > Key: AIRFLOW-2670 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2670 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib >Affects Versions: Airflow 2.0 >Reporter: jin zhang >Priority: Major > > when I use SSHOperator, SSHOperator's timeout parameter can't set in SSHHook > and it's just effect exce_command. 
> old version: > self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id) > I change it to: > self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id, timeout=self.timeout)
[jira] [Commented] (AIRFLOW-2670) SSHOperator's timeout parameter doesn't affect SSHHook timeout
[ https://issues.apache.org/jira/browse/AIRFLOW-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564248#comment-16564248 ] ASF subversion and git services commented on AIRFLOW-2670: -- Commit 3b35d360f6ff8694b6fb4387901c182ca39160b5 in incubator-airflow's branch refs/heads/master from [~noremac201] [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=3b35d36 ] [AIRFLOW-2670] Update SSH Operator's Hook to respect timeout (#3666)
[jira] [Commented] (AIRFLOW-2670) SSHOperator's timeout parameter doesn't affect SSHHook timeout
[ https://issues.apache.org/jira/browse/AIRFLOW-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564246#comment-16564246 ] ASF GitHub Bot commented on AIRFLOW-2670: - Fokko commented on issue #3666: [AIRFLOW-2670] Update SSH Operator's Hook to respect timeout URL: https://github.com/apache/incubator-airflow/pull/3666#issuecomment-409338606 Nice one @Noremac201 Thanks
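The merged one-line fix is easy to demonstrate with stand-in classes. These are not Airflow's real SSHHook/SSHOperator, just minimal doubles showing the forwarding that the PR adds:

```python
class FakeSSHHook:
    """Stand-in for airflow.contrib.hooks.ssh_hook.SSHHook."""
    def __init__(self, ssh_conn_id=None, timeout=10):
        self.ssh_conn_id = ssh_conn_id
        self.timeout = timeout

class FakeSSHOperator:
    """Stand-in operator showing the one-line fix: forward ``timeout``."""
    def __init__(self, ssh_conn_id=None, timeout=10):
        self.ssh_conn_id = ssh_conn_id
        self.timeout = timeout
        self.ssh_hook = None

    def execute(self):
        if self.ssh_conn_id and not self.ssh_hook:
            # Before the fix, the hook was built without ``timeout=``,
            # so it silently fell back to the hook's own default.
            self.ssh_hook = FakeSSHHook(ssh_conn_id=self.ssh_conn_id,
                                        timeout=self.timeout)

op = FakeSSHOperator(ssh_conn_id="ssh_default", timeout=20)
op.execute()
assert op.ssh_hook.timeout == 20
assert op.ssh_hook.ssh_conn_id == "ssh_default"
```

This mirrors the added unit test `test_hook_created_correctly` in the quoted diff.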
[jira] [Commented] (AIRFLOW-2795) Oracle to Oracle Transfer Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564244#comment-16564244 ] ASF GitHub Bot commented on AIRFLOW-2795: - Fokko closed pull request #3639: [AIRFLOW-2795] Oracle to Oracle Transfer Operator URL: https://github.com/apache/incubator-airflow/pull/3639 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/contrib/operators/oracle_to_oracle_transfer.py b/airflow/contrib/operators/oracle_to_oracle_transfer.py
new file mode 100644
index 00..31eb89b7dd
--- /dev/null
+++ b/airflow/contrib/operators/oracle_to_oracle_transfer.py
@@ -0,0 +1,90 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.hooks.oracle_hook import OracleHook
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class OracleToOracleTransfer(BaseOperator):
+    """
+    Moves data from Oracle to Oracle.
+
+    :param oracle_destination_conn_id: destination Oracle connection.
+    :type oracle_destination_conn_id: str
+    :param destination_table: destination table to insert rows.
+    :type destination_table: str
+    :param oracle_source_conn_id: source Oracle connection.
+    :type oracle_source_conn_id: str
+    :param source_sql: SQL query to execute against the source Oracle
+        database. (templated)
+    :type source_sql: str
+    :param source_sql_params: Parameters to use in sql query. (templated)
+    :type source_sql_params: dict
+    :param rows_chunk: number of rows per chunk to commit.
+    :type rows_chunk: int
+    """
+
+    template_fields = ('source_sql', 'source_sql_params')
+    ui_color = '#e08c8c'
+
+    @apply_defaults
+    def __init__(
+            self,
+            oracle_destination_conn_id,
+            destination_table,
+            oracle_source_conn_id,
+            source_sql,
+            source_sql_params={},
+            rows_chunk=5000,
+            *args, **kwargs):
+        super(OracleToOracleTransfer, self).__init__(*args, **kwargs)
+        self.oracle_destination_conn_id = oracle_destination_conn_id
+        self.destination_table = destination_table
+        self.oracle_source_conn_id = oracle_source_conn_id
+        self.source_sql = source_sql
+        self.source_sql_params = source_sql_params
+        self.rows_chunk = rows_chunk
+
+    def _execute(self, src_hook, dest_hook, context):
+        with src_hook.get_conn() as src_conn:
+            cursor = src_conn.cursor()
+            self.log.info("Querying data from source: {0}".format(
+                self.oracle_source_conn_id))
+            cursor.execute(self.source_sql, self.source_sql_params)
+            target_fields = list(map(lambda field: field[0], cursor.description))
+
+            rows_total = 0
+            rows = cursor.fetchmany(self.rows_chunk)
+            while len(rows) > 0:
+                rows_total = rows_total + len(rows)
+                dest_hook.bulk_insert_rows(self.destination_table, rows,
+                                           target_fields=target_fields,
+                                           commit_every=self.rows_chunk)
+                rows = cursor.fetchmany(self.rows_chunk)
+            self.log.info("Total inserted: {0} rows".format(rows_total))
+
+        self.log.info("Finished data transfer.")
+        cursor.close()
+
+    def execute(self, context):
+        src_hook = OracleHook(oracle_conn_id=self.oracle_source_conn_id)
+        dest_hook = OracleHook(oracle_conn_id=self.oracle_destination_conn_id)
+        self._execute(src_hook, dest_hook, context)
diff --git a/docs/code.rst b/docs/code.rst
index 4f1b301711..f4f55b7b38 100644
--- a/docs/code.rst
+++ b/docs/code.rst
@@ -172,6 +172,7 @@ Operators
 .. autoclass:: airflow.contrib.operators.mongo_to_s3.MongoToS3Operator
 .. autoclass:: airflow.contrib.operators.mysql_to_gcs.MySqlToGoogleCloudStorag
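The core of `_execute` above is a chunked fetchmany/bulk-insert loop. A self-contained sketch of that same loop, using two in-memory SQLite connections as stand-ins for the Oracle hooks (an assumption made purely so the example runs anywhere; table and column names are hypothetical):

```python
import sqlite3

ROWS_CHUNK = 2  # deliberately tiny so the loop runs more than once

# Source database with five rows to move.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE src_table (id INTEGER, name TEXT)")
src.executemany("INSERT INTO src_table VALUES (?, ?)",
                [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")])

# Empty destination database.
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE dest_table (id INTEGER, name TEXT)")

cursor = src.cursor()
cursor.execute("SELECT id, name FROM src_table")
# Column names recovered from the cursor, as the operator does.
target_fields = [d[0] for d in cursor.description]

# Fetch and insert one chunk at a time, committing per chunk.
rows_total = 0
rows = cursor.fetchmany(ROWS_CHUNK)
while len(rows) > 0:
    rows_total += len(rows)
    dest.executemany("INSERT INTO dest_table VALUES (?, ?)", rows)
    dest.commit()
    rows = cursor.fetchmany(ROWS_CHUNK)

print(rows_total)  # 5
```

With `rows_chunk` at its default of 5000, memory use stays bounded by the chunk size no matter how large the source query result is.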
[jira] [Commented] (AIRFLOW-2795) Oracle to Oracle Transfer Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564245#comment-16564245 ] ASF subversion and git services commented on AIRFLOW-2795: -- Commit 9983466fd1f82faad7d74506fd428f2d007e3daf in incubator-airflow's branch refs/heads/master from [~marcus.r...@gmail.com] [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=9983466 ] [AIRFLOW-2795] Oracle to Oracle Transfer Operator (#3639) > Oracle to Oracle Transfer Operator > --- > > Key: AIRFLOW-2795 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2795 > Project: Apache Airflow > Issue Type: New Feature > Components: operators >Reporter: Marcus Rehm >Assignee: Marcus Rehm >Priority: Trivial > > This operator should help in transferring data from one Oracle instance to > another, or between tables in the same instance. It's suitable in use cases > where you don't want to, or are not allowed to, use dblink. > The operator needs a sql query and a destination table in order to work. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2825) S3ToHiveTransfer operator may not be able to handle GZIP file with uppercase ext in S3
[ https://issues.apache.org/jira/browse/AIRFLOW-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564241#comment-16564241 ] ASF GitHub Bot commented on AIRFLOW-2825: - Fokko closed pull request #3665: [AIRFLOW-2825]Fix S3ToHiveTransfer bug due to case URL: https://github.com/apache/incubator-airflow/pull/3665 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/operators/s3_to_hive_operator.py b/airflow/operators/s3_to_hive_operator.py
index 09eb8363c0..5faaf916b7 100644
--- a/airflow/operators/s3_to_hive_operator.py
+++ b/airflow/operators/s3_to_hive_operator.py
@@ -153,7 +153,7 @@ def execute(self, context):
         root, file_ext = os.path.splitext(s3_key_object.key)
         if (self.select_expression and self.input_compressed and
-                file_ext != '.gz'):
+                file_ext.lower() != '.gz'):
             raise AirflowException("GZIP is the only compression " +
                                    "format Amazon S3 Select supports")
diff --git a/tests/operators/s3_to_hive_operator.py b/tests/operators/s3_to_hive_operator.py
index 482e7fefc8..6ca6274a2c 100644
--- a/tests/operators/s3_to_hive_operator.py
+++ b/tests/operators/s3_to_hive_operator.py
@@ -89,6 +89,11 @@ def setUp(self):
                            mode="wb") as f_gz_h:
             self._set_fn(fn_gz, '.gz', True)
             f_gz_h.writelines([header, line1, line2])
+        fn_gz_upper = self._get_fn('.txt', True) + ".GZ"
+        with gzip.GzipFile(filename=fn_gz_upper,
+                           mode="wb") as f_gz_upper_h:
+            self._set_fn(fn_gz_upper, '.GZ', True)
+            f_gz_upper_h.writelines([header, line1, line2])
         fn_bz2 = self._get_fn('.txt', True) + '.bz2'
         with bz2.BZ2File(filename=fn_bz2,
                          mode="wb") as f_bz2_h:
@@ -105,6 +110,11 @@ def setUp(self):
                            mode="wb") as f_gz_nh:
             self._set_fn(fn_gz, '.gz', False)
             f_gz_nh.writelines([line1, line2])
+        fn_gz_upper = self._get_fn('.txt', False) + ".GZ"
+        with gzip.GzipFile(filename=fn_gz_upper,
+                           mode="wb") as f_gz_upper_nh:
+            self._set_fn(fn_gz_upper, '.GZ', False)
+            f_gz_upper_nh.writelines([line1, line2])
         fn_bz2 = self._get_fn('.txt', False) + '.bz2'
         with bz2.BZ2File(filename=fn_bz2,
                          mode="wb") as f_bz2_nh:
@@ -143,7 +153,7 @@ def _check_file_equality(self, fn_1, fn_2, ext):
         # gz files contain mtime and filename in the header that
         # causes filecmp to return False even if contents are identical
         # Hence decompress to test for equality
-        if(ext == '.gz'):
+        if(ext.lower() == '.gz'):
             with gzip.GzipFile(fn_1, 'rb') as f_1,\
                     NamedTemporaryFile(mode='wb') as f_txt_1,\
                     gzip.GzipFile(fn_2, 'rb') as f_2,\
@@ -220,14 +230,14 @@ def test_execute(self, mock_hiveclihook):
         conn.create_bucket(Bucket='bucket')
 
         # Testing txt, zip, bz2 files with and without header row
-        for (ext, has_header) in product(['.txt', '.gz', '.bz2'], [True, False]):
+        for (ext, has_header) in product(['.txt', '.gz', '.bz2', '.GZ'], [True, False]):
             self.kwargs['headers'] = has_header
             self.kwargs['check_headers'] = has_header
             logging.info("Testing {0} format {1} header".
                          format(ext, ('with' if has_header else 'without'))
                          )
-            self.kwargs['input_compressed'] = ext != '.txt'
+            self.kwargs['input_compressed'] = ext.lower() != '.txt'
             self.kwargs['s3_key'] = 's3://bucket/' + self.s3_key + ext
             ip_fn = self._get_fn(ext, self.kwargs['headers'])
             op_fn = self._get_fn(ext, False)
@@ -260,8 +270,8 @@ def test_execute_with_select_expression(self, mock_hiveclihook):
         # Only testing S3ToHiveTransfer calls S3Hook.select_key with
         # the right parameters and its execute method succeeds here,
         # since Moto doesn't support select_object_content as of 1.3.2.
-        for (ext, has_header) in product(['.txt', '.gz'], [True, False]):
-            input_compressed = ext != '.txt'
+        for (ext, has_header) in product(['.txt', '.gz', '.GZ'], [True, False]):
+            input_compressed = ext.lower() != '.txt'
             key = self.s3_key + ext
             self.kwargs['check_headers'] = False
--
[jira] [Commented] (AIRFLOW-2825) S3ToHiveTransfer operator may not be able to handle GZIP file with uppercase ext in S3
[ https://issues.apache.org/jira/browse/AIRFLOW-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564242#comment-16564242 ] ASF subversion and git services commented on AIRFLOW-2825: -- Commit c7e54461c68c70e11b5cd47e9dee9d52f6ee357b in incubator-airflow's branch refs/heads/master from XD-DENG [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=c7e5446 ] [AIRFLOW-2825]Fix S3ToHiveTransfer bug due to case Because upper/lower case was not considered in the file extension check, S3ToHiveTransfer operator may mistakenly think a GZIP file with uppercase ext ".GZ" is not a GZIP file and raise exception. > S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase > ext in S3 > --- > > Key: AIRFLOW-2825 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2825 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Critical > > Because upper/lower case was not considered in the extension check, > S3ToHiveTransfer operator may think a GZIP file with uppercase ext `.GZ` is > not a GZIP file and raise exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
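The whole fix reduces to lowering the extension before comparing it. A minimal standalone sketch of the check (the `is_gzip_key` helper is hypothetical, not part of the operator):

```python
import os


def is_gzip_key(key):
    """Case-insensitive extension check mirroring the fix: compare the
    lowered extension so '.GZ' and '.gz' are both treated as GZIP."""
    _, file_ext = os.path.splitext(key)
    return file_ext.lower() == '.gz'


print(is_gzip_key("s3://bucket/data.csv.GZ"))  # True
```

Before the fix, the uppercase `.GZ` key above would have failed the `file_ext != '.gz'` comparison and raised an `AirflowException` even though the object really is GZIP.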
[jira] [Commented] (AIRFLOW-2825) S3ToHiveTransfer operator may not be able to handle GZIP file with uppercase ext in S3
[ https://issues.apache.org/jira/browse/AIRFLOW-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564243#comment-16564243 ] ASF subversion and git services commented on AIRFLOW-2825: -- Commit 8d2f57cd104736f4a9b2b87182358a8c2e406c1a in incubator-airflow's branch refs/heads/master from [~Fokko] [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=8d2f57c ] Merge pull request #3665 from XD-DENG/patch-6 [AIRFLOW-2825] Fix S3ToHiveTransfer bug due to case > S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase > ext in S3 > --- > > Key: AIRFLOW-2825 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2825 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Critical > > Because upper/lower case was not considered in the extension check, > S3ToHiveTransfer operator may think a GZIP file with uppercase ext `.GZ` is > not a GZIP file and raise exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2825) S3ToHiveTransfer operator may not be able to handle GZIP file with uppercase ext in S3
[ https://issues.apache.org/jira/browse/AIRFLOW-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564228#comment-16564228 ] ASF GitHub Bot commented on AIRFLOW-2825: - Fokko commented on issue #3665: [AIRFLOW-2825]Fix S3ToHiveTransfer bug due to case URL: https://github.com/apache/incubator-airflow/pull/3665#issuecomment-409335560 LGTM, thanks @XD-DENG This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase > ext in S3 > --- > > Key: AIRFLOW-2825 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2825 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Critical > > Because upper/lower case was not considered in the extension check, > S3ToHiveTransfer operator may think a GZIP file with uppercase ext `.GZ` is > not a GZIP file and raise exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564226#comment-16564226 ] ASF GitHub Bot commented on AIRFLOW-2814: - Fokko commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default config URL: https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409335193 I would keep it at 0 by default. 3 minutes is definitely too high. 1 would also work for me as a compromise. Making changes to your dag and not seeing them in the UI would feel awkward to me. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Default Arg "file_process_interval" for class SchedulerJob is inconsistent > with doc > --- > > Key: AIRFLOW-2814 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2814 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Reporter: Xiaodong DENG >Assignee: Xiaodong DENG >Priority: Critical > Fix For: 2.0.0 > > > h2. Background > In > [https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592] > , it is mentioned that the default value of the argument *file_process_interval* > should be 3 minutes (*file_process_interval:* Parse and schedule each file no > faster than this interval). > The value is normally parsed from the default configuration. However, in the > default config_template, its value is 0 rather than 180 seconds > ([https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432] > ). > h2. Issue > This means that each file is actually parsed and scheduled without > letting Airflow "rest". This conflicts with the design purpose (by default > let it be 180 seconds) and may affect performance significantly. > h2. My Proposal > Change the value in the config template from 0 to 180. 
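The proposed change amounts to one line in the `[scheduler]` section of the config template. A sketch of the edited fragment (the key name `min_file_process_interval` is the 1.10-era name of this setting in `default_airflow.cfg`; verify against your version):

[scheduler]
# Parse and schedule each DAG file no faster than this interval, in seconds.
# The SchedulerJob docstring documents 180; the shipped template had 0.
min_file_process_interval = 180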
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2831) Logs not found through the UI when deployed on Kubernetes
Dustin Jenkins created AIRFLOW-2831: --- Summary: Logs not found through the UI when deployed on Kubernetes Key: AIRFLOW-2831 URL: https://issues.apache.org/jira/browse/AIRFLOW-2831 Project: Apache Airflow Issue Type: Bug Components: logging Affects Versions: 1.10.0 Reporter: Dustin Jenkins Attachments: Screen Shot 2018-07-31 at 11.19.34.png Kubernetes 1.11 on OpenStack Airflow 1.10.0rc2 Executor: KubernetesExecutor Operator(s): Mix of KubernetesPodOperator and PythonOperator When deploying Airflow tasks on Kubernetes, the logs are rarely accessible after a run, regardless of a successful or failed run (see attached screenshot). If I use the Kubernetes command-line client to shell into the Scheduler pod's running container and view the logs directly, I can see the output. At first I thought it was just the KubernetesPodOperator, but I've tested with the PythonOperator and the DummyOperator as well with the same results. I have the Web Server, Scheduler, and PostgreSQL instances all running in their own Pods. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2658) Add GKE specific Kubernetes Pod Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564085#comment-16564085 ] ASF GitHub Bot commented on AIRFLOW-2658: - fenglu-g commented on a change in pull request #3532: [AIRFLOW-2658] Add GCP specific k8s pod operator URL: https://github.com/apache/incubator-airflow/pull/3532#discussion_r206629560
## File path: airflow/contrib/operators/gcp_container_operator.py ##
@@ -170,3 +175,147 @@ def execute(self, context):
         hook = GKEClusterHook(self.project_id, self.location)
         create_op = hook.create_cluster(cluster=self.body)
         return create_op
+
+
+KUBE_CONFIG_ENV_VAR = "KUBECONFIG"
+G_APP_CRED = "GOOGLE_APPLICATION_CREDENTIALS"
+
+
+class GKEPodOperator(KubernetesPodOperator):
+    template_fields = ('project_id', 'location',
+                       'cluster_name') + KubernetesPodOperator.template_fields
+
+    @apply_defaults
+    def __init__(self,
+                 project_id,
+                 location,
+                 cluster_name,
+                 gcp_conn_id='google_cloud_default',
+                 *args,
+                 **kwargs):
+        """
+        Executes a task in a Kubernetes pod in the specified Google Kubernetes
+        Engine cluster
+
+        This Operator assumes that the system has gcloud installed and either
+        has working default application credentials or has configured a
+        connection id with a service account.
+
+        The **minimum** required to define a cluster to create are the variables
+        ``task_id``, ``project_id``, ``location``, ``cluster_name``, ``name``,
+        ``namespace``, and ``image``
+
+        **Operator Creation**: ::
+
+            operator = GKEPodOperator(task_id='pod_op',
+                                      project_id='my-project',
+                                      location='us-central1-a',
+                                      cluster_name='my-cluster-name',
+                                      name='task-name',
+                                      namespace='default',
+                                      image='perl')
+
+        .. seealso::
+            For more detail about application authentication have a look at the reference:
+            https://cloud.google.com/docs/authentication/production#providing_credentials_to_your_application
+
+        :param project_id: The Google Developers Console project id
+        :type project_id: str
+        :param location: The name of the Google Kubernetes Engine zone in which the
+            cluster resides, e.g. 'us-central1-a'
+        :type location: str
+        :param cluster_name: The name of the Google Kubernetes Engine cluster the pod
+            should be spawned in
+        :type cluster_name: str
+        :param gcp_conn_id: The google cloud connection id to use. This allows for
+            users to specify a service account.
+        :type gcp_conn_id: str
+        """
+        super(GKEPodOperator, self).__init__(*args, **kwargs)
+        self.project_id = project_id
+        self.location = location
+        self.cluster_name = cluster_name
+        self.gcp_conn_id = gcp_conn_id
+
+    def execute(self, context):
+        # Specifying a service account file allows the user to using non default
+        # authentication for creating a Kubernetes Pod. This is done by setting the
+        # environment variable `GOOGLE_APPLICATION_CREDENTIALS` that gcloud looks at.
+        key_file = None
+
+        # If gcp_conn_id is not specified gcloud will use the default
+        # service account credentials.
+        if self.gcp_conn_id:
+            from airflow.hooks.base_hook import BaseHook
+            # extras is a deserialized json object
+            extras = BaseHook.get_connection(self.gcp_conn_id).extra_dejson
+            # key_file only gets set if a json file is created from a JSON string in
+            # the web ui, else none
+            key_file = self._set_env_from_extras(extras=extras)
+
+        # Write config to a temp file and set the environment variable to point to it.
+        # This is to avoid race conditions of reading/writing a single file
+        with tempfile.NamedTemporaryFile() as conf_file:
+            os.environ[KUBE_CONFIG_ENV_VAR] = conf_file.name
+            # Attempt to get/update credentials
+            # We call gcloud directly instead of using google-cloud-python api
+            # because there is no way to write kubernetes config to a file, which is
+            # required by KubernetesPodOperator.
+            # The gcloud command looks at the env variable `KUBECONFIG` for where to save
+            # the kubernetes config file.
+            subprocess.check_call(
+                ["gcloud", "container", "clusters", "get-credentials",
+                 self.cluster_name,
+                 "--zone", self.location
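The temp-file pattern in that `execute` method can be sketched in isolation: point `KUBECONFIG` at a private `NamedTemporaryFile` so each task writes its own kubeconfig instead of racing on a shared file. A small `python -c` child process stands in for the real `gcloud container clusters get-credentials` call (an assumption so the sketch runs without gcloud installed):

```python
import os
import subprocess
import sys
import tempfile

KUBE_CONFIG_ENV_VAR = "KUBECONFIG"

with tempfile.NamedTemporaryFile() as conf_file:
    # Give the child its own environment with KUBECONFIG pointing at a
    # file only this task owns, avoiding read/write races between tasks.
    env = dict(os.environ)
    env[KUBE_CONFIG_ENV_VAR] = conf_file.name
    # Stand-in for `gcloud ... get-credentials`, which honours $KUBECONFIG
    # when deciding where to write the kubernetes config.
    subprocess.check_call(
        [sys.executable, "-c",
         "import os; open(os.environ['KUBECONFIG'], 'w').write('credentials')"],
        env=env)
    # The parent can now hand conf_file.name to whatever needs the config.
    written = open(conf_file.name).read()
```

Passing a modified `env=` to the child, rather than mutating `os.environ` as the PR snippet does, keeps the parent process environment clean; the race-avoidance idea is the same.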
[jira] [Resolved] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leo Gallucci resolved AIRFLOW-2822. --- Resolution: Fixed > PendingDeprecationWarning Invalid arguments: > HipChatAPISendRoomNotificationOperator > --- > > Key: AIRFLOW-2822 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2822 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, operators >Affects Versions: Airflow 2.0 >Reporter: Leo Gallucci >Assignee: Leo Gallucci >Priority: Trivial > Labels: easyfix > > Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) > gives: > {code:python} > airflow/models.py:2390: PendingDeprecationWarning: > Invalid arguments were passed to HipChatAPISendRoomNotificationOperator. > Support for passing such arguments will be dropped in Airflow 2.0. > Invalid arguments were: > *args: () > **kwargs: {'color': 'green'} > category=PendingDeprecationWarning > {code} > I've fixed this in my fork: > https://github.com/elgalu/apache-airflow/commit/83fc940f54e5d6531f66bff256f66765899dc055 > I will send a PR -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration
[ https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564040#comment-16564040 ] ASF GitHub Bot commented on AIRFLOW-2310: - suma-ps commented on issue #3504: [AIRFLOW-2310]: Add AWS Glue Job Compatibility to Airflow URL: https://github.com/apache/incubator-airflow/pull/3504#issuecomment-409303864 @OElesin Do you plan to resolve the merge issues soon? Looking forward to using the Glue operator soon, thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Enable AWS Glue Job Integration > --- > > Key: AIRFLOW-2310 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2310 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib >Reporter: Olalekan Elesin >Assignee: Olalekan Elesin >Priority: Major > Labels: AWS > > Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs > and ETL pipelines can be orchestrated with Airflow -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564022#comment-16564022 ] ASF GitHub Bot commented on AIRFLOW-2803: - codecov-io edited a comment on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#issuecomment-408503531
# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=h1) Report
> Merging [#3656](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/a338f3276835af45765d24a6e6d43ad4ba4d66ba?src=pr&el=desc) will **increase** coverage by `0.38%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3656/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=tree)

```diff
@@            Coverage Diff             @@
##           master    #3656      +/-   ##
==========================================
+ Coverage   77.12%   77.51%   +0.38%
==========================================
  Files         206      205       -1
  Lines       15772    15751      -21
==========================================
+ Hits        12164    12209      +45
+ Misses       3608     3542      -66
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [airflow/www/app.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `99.01% <0%> (-0.99%)` | :arrow_down: |
| [airflow/plugins\_manager.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=) | `92.59% <0%> (ø)` | :arrow_up: |
| [airflow/www/validators.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdmFsaWRhdG9ycy5weQ==) | `100% <0%> (ø)` | :arrow_up: |
| [airflow/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9fX2luaXRfXy5weQ==) | `80.43% <0%> (ø)` | :arrow_up: |
| [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `82.74% <0%> (ø)` | :arrow_up: |
| [airflow/minihivecluster.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9taW5paGl2ZWNsdXN0ZXIucHk=) | | |
| [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `89.87% <0%> (+0.42%)` | :arrow_up: |
| [airflow/hooks/pig\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9waWdfaG9vay5weQ==) | `100% <0%> (+100%)` | :arrow_up: |

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=footer). Last update [a338f32...ecbc873](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
> Fix all ESLint issues
> -
>
> Key: AIRFLOW-2803
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2803
> Project: Apache Airflow
> Issue Type: Improvement
> Reporter: Verdan Mahmood
> Assignee: Taylor Edmiston
> Priority: Major
>
> Most of the JS code in Apache Airflow has linting issues which are
> highlighted after the integration of ESLint.
> Once AIRFLOW-2783 merged in master branch, please fix all the javascript
> styling issues that we have in .js and .html files. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563965#comment-16563965 ] ASF GitHub Bot commented on AIRFLOW-2803: - tedmiston commented on a change in pull request #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#discussion_r206602944 ## File path: airflow/www_rbac/static/js/clock.js ## @@ -18,24 +18,25 @@ */ require('./jqClock.min'); -$(document).ready(function () { - x = new Date(); +$(document).ready(() => { Review comment: Sounds good. I will stick with the ES5 for now for this PR. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix all ESLint issues > - > > Key: AIRFLOW-2803 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2803 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Verdan Mahmood >Assignee: Taylor Edmiston >Priority: Major > > Most of the JS code in Apache Airflow has linting issues which are > highlighted after the integration of ESLint. > Once AIRFLOW-2783 merged in master branch, please fix all the javascript > styling issues that we have in .js and .html files. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563963#comment-16563963 ] ASF GitHub Bot commented on AIRFLOW-2803: - tedmiston edited a comment on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409266326 @verdan Sure! Typically I keep atomic commits while I'm working so everyone can follow small changes instead of one big diff, then squash down to one commit at the end. I updated the title to make it clear this is WIP. Since you're doing most of the reviewing here, do you have a preference on squashing throughout working vs just squashing pre-merge? I should have an update later today btw. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix all ESLint issues > - > > Key: AIRFLOW-2803 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2803 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Verdan Mahmood >Assignee: Taylor Edmiston >Priority: Major > > Most of the JS code in Apache Airflow has linting issues which are > highlighted after the integration of ESLint. > Once AIRFLOW-2783 merged in master branch, please fix all the javascript > styling issues that we have in .js and .html files. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563923#comment-16563923 ] ASF GitHub Bot commented on AIRFLOW-2803: - r39132 commented on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409282209 @verdan once @tedmiston is done, please provide your +1 and notify some of the committers on this PR that the PR is ready for validation and merge. Thx for your help on reviewing this PR! This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix all ESLint issues > - > > Key: AIRFLOW-2803 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2803 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Verdan Mahmood >Assignee: Taylor Edmiston >Priority: Major > > Most of the JS code in Apache Airflow has linting issues which are > highlighted after the integration of ESLint. > Once AIRFLOW-2783 merged in master branch, please fix all the javascript > styling issues that we have in .js and .html files. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563903#comment-16563903 ] ASF subversion and git services commented on AIRFLOW-2822: -- Commit 3eb0454cb1da1e96ae5d7ad88db7c1cca71109f3 in incubator-airflow's branch refs/heads/master from [~elgalu] [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=3eb0454 ] [AIRFLOW-2822] Fix HipChat Deprecation Warning Fixes PendingDeprecationWarning on HipChatAPISendRoomNotificationOperator Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) gives: airflow/models.py:2390: PendingDeprecationWarning: Invalid arguments were passed to HipChatAPISendRoomNotificationOperator. Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were: *args: () **kwargs: {'color': 'green'} category=PendingDeprecationWarning > PendingDeprecationWarning Invalid arguments: > HipChatAPISendRoomNotificationOperator > --- > > Key: AIRFLOW-2822 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2822 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, operators >Affects Versions: Airflow 2.0 >Reporter: Leo Gallucci >Assignee: Leo Gallucci >Priority: Trivial > Labels: easyfix > > Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) > gives: > {code:python} > airflow/models.py:2390: PendingDeprecationWarning: > Invalid arguments were passed to HipChatAPISendRoomNotificationOperator. > Support for passing such arguments will be dropped in Airflow 2.0. > Invalid arguments were: > *args: () > **kwargs: {'color': 'green'} > category=PendingDeprecationWarning > {code} > I've fixed this in my fork: > https://github.com/elgalu/apache-airflow/commit/83fc940f54e5d6531f66bff256f66765899dc055 > I will send a PR -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563902#comment-16563902 ]

ASF GitHub Bot commented on AIRFLOW-2822:
-----------------------------------------

r39132 closed pull request #3668: [AIRFLOW-2822] Fix HipChat Deprecation Warning
URL: https://github.com/apache/incubator-airflow/pull/3668

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

{code}
diff --git a/airflow/contrib/operators/hipchat_operator.py b/airflow/contrib/operators/hipchat_operator.py
index 381cd72cdf..adeca23079 100644
--- a/airflow/contrib/operators/hipchat_operator.py
+++ b/airflow/contrib/operators/hipchat_operator.py
@@ -99,24 +99,23 @@ class HipChatAPISendRoomNotificationOperator(HipChatAPIOperator):
     :param card: HipChat-defined card object
     :type card: dict
     """
-    template_fields = ('token', 'room_id', 'message')
+    template_fields = ('token', 'room_id', 'message', 'message_format',
+                       'color', 'frm', 'attach_to', 'notify', 'card')
     ui_color = '#2980b9'
 
     @apply_defaults
-    def __init__(self, room_id, message, *args, **kwargs):
+    def __init__(self, room_id, message, message_format='html',
+                 color='yellow', frm='airflow', attach_to=None,
+                 notify=False, card=None, *args, **kwargs):
         super(HipChatAPISendRoomNotificationOperator, self).__init__(*args, **kwargs)
         self.room_id = room_id
         self.message = message
-        default_options = {
-            'message_format': 'html',
-            'color': 'yellow',
-            'frm': 'airflow',
-            'attach_to': None,
-            'notify': False,
-            'card': None
-        }
-        for (prop, default) in default_options.items():
-            setattr(self, prop, kwargs.get(prop, default))
+        self.message_format = message_format
+        self.color = color
+        self.frm = frm
+        self.attach_to = attach_to
+        self.notify = notify
+        self.card = card
 
     def prepare_request(self):
         params = {
{code}

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> PendingDeprecationWarning Invalid arguments:
> HipChatAPISendRoomNotificationOperator
> ---
>
> Key: AIRFLOW-2822
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2822
> Project: Apache Airflow
> Issue Type: Bug
> Components: contrib, operators
> Affects Versions: Airflow 2.0
> Reporter: Leo Gallucci
> Assignee: Leo Gallucci
> Priority: Trivial
> Labels: easyfix
>
> Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0)
> gives:
> {code:python}
> airflow/models.py:2390: PendingDeprecationWarning:
> Invalid arguments were passed to HipChatAPISendRoomNotificationOperator.
> Support for passing such arguments will be dropped in Airflow 2.0.
> Invalid arguments were:
> *args: ()
> **kwargs: {'color': 'green'}
> category=PendingDeprecationWarning
> {code}
> I've fixed this in my fork:
> https://github.com/elgalu/apache-airflow/commit/83fc940f54e5d6531f66bff256f66765899dc055
> I will send a PR

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
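The merged change above replaces the `kwargs.get()` harvesting loop with explicit keyword parameters. A minimal runnable sketch of why that silences the warning — `FakeBaseOperator` below is a hypothetical, simplified stand-in for the check in airflow/models.py, not Airflow's real code:

```python
import warnings


class FakeBaseOperator(object):
    """Hypothetical stand-in for Airflow's BaseOperator: it warns when
    it receives arguments it does not recognise (simplified from the
    behaviour described in airflow/models.py)."""
    def __init__(self, *args, **kwargs):
        if args or kwargs:
            warnings.warn(
                "Invalid arguments were passed to %s. Invalid arguments "
                "were: *args: %s **kwargs: %s"
                % (type(self).__name__, args, kwargs),
                PendingDeprecationWarning)


class NotifyOperator(FakeBaseOperator):
    """The merged fix's pattern: every option is an explicit keyword
    parameter with a default, so nothing is left over in **kwargs and
    the base-class warning never fires."""
    def __init__(self, room_id, message, message_format='html',
                 color='yellow', notify=False, *args, **kwargs):
        super(NotifyOperator, self).__init__(*args, **kwargs)
        self.room_id = room_id
        self.message = message
        self.message_format = message_format
        self.color = color
        self.notify = notify


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    op = NotifyOperator(room_id='42', message='hi', color='green')

print(op.color)     # -> green
print(len(caught))  # -> 0 (no PendingDeprecationWarning recorded)
```

With the old `kwargs.get(prop, default)` style, `color='green'` stayed inside `**kwargs` on its way through the constructor chain, which is exactly what tripped the base class's unknown-argument check.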
[jira] [Commented] (AIRFLOW-2800) Remove airflow/ low-hanging linting errors
[ https://issues.apache.org/jira/browse/AIRFLOW-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563861#comment-16563861 ]

ASF GitHub Bot commented on AIRFLOW-2800:
-----------------------------------------

r39132 closed pull request #3638: [AIRFLOW-2800] Remove low-hanging linting errors
URL: https://github.com/apache/incubator-airflow/pull/3638

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

{code}
diff --git a/airflow/__init__.py b/airflow/__init__.py
index f40b08aab5..bc6a7bbe19 100644
--- a/airflow/__init__.py
+++ b/airflow/__init__.py
@@ -7,9 +7,9 @@
 # to you under the Apache License, Version 2.0 (the
 # "License"); you may not use this file except in compliance
 # with the License. You may obtain a copy of the License at
-# 
+#
 # http://www.apache.org/licenses/LICENSE-2.0
-# 
+#
 # Unless required by applicable law or agreed to in writing,
 # software distributed under the License is distributed on an
 # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -80,11 +80,12 @@ class AirflowMacroPlugin(object):
     def __init__(self, namespace):
         self.namespace = namespace
 
-from airflow import operators
+
+from airflow import operators  # noqa: E402
 from airflow import sensors  # noqa: E402
-from airflow import hooks
-from airflow import executors
-from airflow import macros
+from airflow import hooks  # noqa: E402
+from airflow import executors  # noqa: E402
+from airflow import macros  # noqa: E402
 
 operators._integrate_plugins()
 sensors._integrate_plugins()  # noqa: E402
diff --git a/airflow/contrib/auth/backends/ldap_auth.py b/airflow/contrib/auth/backends/ldap_auth.py
index eefaa1263b..516e121c9b 100644
--- a/airflow/contrib/auth/backends/ldap_auth.py
+++ b/airflow/contrib/auth/backends/ldap_auth.py
@@ -62,7 +62,7 @@ def get_ldap_connection(dn=None, password=None):
         cacert = configuration.conf.get("ldap", "cacert")
         tls_configuration = Tls(validate=ssl.CERT_REQUIRED, ca_certs_file=cacert)
         use_ssl = True
-    except:
+    except Exception:
         pass
 
     server = Server(configuration.conf.get("ldap", "uri"), use_ssl, tls_configuration)
@@ -94,7 +94,7 @@ def groups_user(conn, search_base, user_filter, user_name_att, username):
     search_filter = "(&({0})({1}={2}))".format(user_filter, user_name_att, username)
     try:
         memberof_attr = configuration.conf.get("ldap", "group_member_attr")
-    except:
+    except Exception:
         memberof_attr = "memberOf"
     res = conn.search(native(search_base), native(search_filter),
                       attributes=[native(memberof_attr)])
diff --git a/airflow/contrib/hooks/aws_hook.py b/airflow/contrib/hooks/aws_hook.py
index 69a1b0bed3..8ca1f3d744 100644
--- a/airflow/contrib/hooks/aws_hook.py
+++ b/airflow/contrib/hooks/aws_hook.py
@@ -72,7 +72,7 @@ def _parse_s3_config(config_file_name, config_format='boto', profile=None):
     try:
         access_key = config.get(cred_section, key_id_option)
         secret_key = config.get(cred_section, secret_key_option)
-    except:
+    except Exception:
         logging.warning("Option Error in parsing s3 config file")
         raise
     return access_key, secret_key
diff --git a/airflow/contrib/operators/awsbatch_operator.py b/airflow/contrib/operators/awsbatch_operator.py
index a5c86afce6..353fbbb0a0 100644
--- a/airflow/contrib/operators/awsbatch_operator.py
+++ b/airflow/contrib/operators/awsbatch_operator.py
@@ -139,7 +139,7 @@ def _wait_for_task_ended(self):
             if response['jobs'][-1]['status'] in ['SUCCEEDED', 'FAILED']:
                 retry = False
 
-            sleep( 1 + pow(retries * 0.1, 2))
+            sleep(1 + pow(retries * 0.1, 2))
             retries += 1
 
     def _check_success_task(self):
diff --git a/airflow/contrib/operators/mlengine_prediction_summary.py b/airflow/contrib/operators/mlengine_prediction_summary.py
index 17fc2c0903..4efe81e641 100644
--- a/airflow/contrib/operators/mlengine_prediction_summary.py
+++ b/airflow/contrib/operators/mlengine_prediction_summary.py
@@ -112,14 +112,14 @@ def decode(self, x):
 @beam.ptransform_fn
 def MakeSummary(pcoll, metric_fn, metric_keys):  # pylint: disable=invalid-name
     return (
-        pcoll
-        | "ApplyMetricFnPerInstance" >> beam.Map(metric_fn)
-        | "PairWith1" >> beam.Map(lambda tup: tup + (1,))
-        | "SumTuple" >> beam.CombineGlobally(beam.combiners.TupleCombineFn(
-            *([sum] * (len(metric_keys) + 1
-        | "AverageAndMakeDict" >> beam.Map(
+        pcoll |
+        "ApplyMetricFnPerInstance" >> beam.Map(metric_fn) |
+        "PairWith1" >> beam.Map(lambda tup: tup + (1,)) |
+        "SumTupl
{code}
[jira] [Commented] (AIRFLOW-2800) Remove airflow/ low-hanging linting errors
[ https://issues.apache.org/jira/browse/AIRFLOW-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563863#comment-16563863 ] ASF subversion and git services commented on AIRFLOW-2800: -- Commit 06584fc4b1d82a2dbba98e484d0b4515a169a818 in incubator-airflow's branch refs/heads/master from [~ajc] [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=06584fc ] [AIRFLOW-2800] Remove low-hanging linting errors > Remove airflow/ low-hanging linting errors > -- > > Key: AIRFLOW-2800 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2800 > Project: Apache Airflow > Issue Type: Bug >Reporter: Andy Cooper >Assignee: Andy Cooper >Priority: Major > > Removing low hanging linting errors from airflow directory > Focuses on > * E226 > * W291 > as well as *some* E501 (line too long) where it did not risk reducing > readability -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2800) Remove airflow/ low-hanging linting errors
[ https://issues.apache.org/jira/browse/AIRFLOW-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563857#comment-16563857 ] ASF GitHub Bot commented on AIRFLOW-2800: - r39132 commented on issue #3638: [AIRFLOW-2800] Remove low-hanging linting errors URL: https://github.com/apache/incubator-airflow/pull/3638#issuecomment-409269190 Cool. Running `flake8 airflow | wc -l` on master and this PR branch, I see a decrease from `458` down to `235`! Thanks for making these changes. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Remove airflow/ low-hanging linting errors > -- > > Key: AIRFLOW-2800 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2800 > Project: Apache Airflow > Issue Type: Bug >Reporter: Andy Cooper >Assignee: Andy Cooper >Priority: Major > > Removing low hanging linting errors from airflow directory > Focuses on > * E226 > * W291 > as well as *some* E501 (line too long) where it did not risk reducing > readability -- This message was sent by Atlassian JIRA (v7.6.3#76005)
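The PR above repeatedly rewrites bare `except:` clauses as `except Exception:` (flake8's E722). A short illustrative sketch of the difference — the `get_option` helper is hypothetical, modeled loosely on the fallback pattern fixed in ldap_auth.py, not Airflow's actual code:

```python
def get_option(options, key, default):
    """Fall back to a default for ordinary errors only.

    Before the linting pass this used a bare `except:`, which also
    traps SystemExit and KeyboardInterrupt -- exceptions that should
    normally abort the process rather than be swallowed.
    """
    try:
        return options[key]
    except Exception:  # was: bare `except:`
        return default


print(get_option({"group_member_attr": "member"}, "group_member_attr", "memberOf"))  # -> member
print(get_option({}, "group_member_attr", "memberOf"))                               # -> memberOf

# The behavioural difference: KeyboardInterrupt derives from BaseException,
# not Exception, so `except Exception` lets Ctrl-C propagate.
print(issubclass(KeyboardInterrupt, Exception))      # -> False
print(issubclass(KeyboardInterrupt, BaseException))  # -> True
```

This is why the change is safe to make mechanically: for any ordinary error the two forms behave identically, and only the process-control exceptions change hands.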
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563849#comment-16563849 ] ASF GitHub Bot commented on AIRFLOW-2803: - ashb commented on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409266779 FWIW I too am in favour of atomic/fixup! commits that then get squashed pre merge. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix all ESLint issues > - > > Key: AIRFLOW-2803 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2803 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Verdan Mahmood >Assignee: Taylor Edmiston >Priority: Major > > Most of the JS code in Apache Airflow has linting issues which are > highlighted after the integration of ESLint. > Once AIRFLOW-2783 merged in master branch, please fix all the javascript > styling issues that we have in .js and .html files. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563847#comment-16563847 ] ASF GitHub Bot commented on AIRFLOW-2803: - tedmiston commented on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409266326 @verdan Sure! Typically I keep atomic commits while I'm working so everyone can follow small changes instead of one big diff, then squash down to one commit at the end. I updated the title to make it clear this is WIP. Since you're doing most of the reviewing here, do you have a preference on squashing throughout working or just thinking about preparing for merge? I should have an update later today btw. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix all ESLint issues > - > > Key: AIRFLOW-2803 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2803 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Verdan Mahmood >Assignee: Taylor Edmiston >Priority: Major > > Most of the JS code in Apache Airflow has linting issues which are > highlighted after the integration of ESLint. > Once AIRFLOW-2783 merged in master branch, please fix all the javascript > styling issues that we have in .js and .html files. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563848#comment-16563848 ] ASF GitHub Bot commented on AIRFLOW-2803: - tedmiston edited a comment on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues URL: https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409266326 @verdan Sure! Typically I keep atomic commits while I'm working so everyone can follow small changes instead of one big diff, then squash down to one commit at the end. I updated the title to make it clear this is WIP. Since you're doing most of the reviewing here, do you have a preference on squashing throughout working vs just thinking about preparing for the merge with squashing at the end? I should have an update later today btw. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Fix all ESLint issues > - > > Key: AIRFLOW-2803 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2803 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Verdan Mahmood >Assignee: Taylor Edmiston >Priority: Major > > Most of the JS code in Apache Airflow has linting issues which are > highlighted after the integration of ESLint. > Once AIRFLOW-2783 merged in master branch, please fix all the javascript > styling issues that we have in .js and .html files. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Issue Comment Deleted] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leo Gallucci updated AIRFLOW-2822: -- Comment: was deleted (was: https://github.com/apache/incubator-airflow/pull/3668) > PendingDeprecationWarning Invalid arguments: > HipChatAPISendRoomNotificationOperator > --- > > Key: AIRFLOW-2822 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2822 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, operators >Affects Versions: Airflow 2.0 >Reporter: Leo Gallucci >Assignee: Leo Gallucci >Priority: Trivial > Labels: easyfix > > Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) > gives: > {code:python} > airflow/models.py:2390: PendingDeprecationWarning: > Invalid arguments were passed to HipChatAPISendRoomNotificationOperator. > Support for passing such arguments will be dropped in Airflow 2.0. > Invalid arguments were: > *args: () > **kwargs: {'color': 'green'} > category=PendingDeprecationWarning > {code} > I've fixed this in my fork: > https://github.com/elgalu/apache-airflow/commit/83fc940f54e5d6531f66bff256f66765899dc055 > I will send a PR -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563777#comment-16563777 ] Leo Gallucci commented on AIRFLOW-2822: --- https://github.com/apache/incubator-airflow/pull/3668 > PendingDeprecationWarning Invalid arguments: > HipChatAPISendRoomNotificationOperator > --- > > Key: AIRFLOW-2822 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2822 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, operators >Affects Versions: Airflow 2.0 >Reporter: Leo Gallucci >Assignee: Leo Gallucci >Priority: Trivial > Labels: easyfix > > Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) > gives: > {code:python} > airflow/models.py:2390: PendingDeprecationWarning: > Invalid arguments were passed to HipChatAPISendRoomNotificationOperator. > Support for passing such arguments will be dropped in Airflow 2.0. > Invalid arguments were: > *args: () > **kwargs: {'color': 'green'} > category=PendingDeprecationWarning > {code} > I've fixed this in my fork: > https://github.com/elgalu/apache-airflow/commit/83fc940f54e5d6531f66bff256f66765899dc055 > I will send a PR -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563767#comment-16563767 ] ASF GitHub Bot commented on AIRFLOW-2822: - elgalu opened a new pull request #3668: [AIRFLOW-2822] Fix HipChat Deprecation Warning URL: https://github.com/apache/incubator-airflow/pull/3668 [AIRFLOW-2822](https://issues.apache.org/jira/browse/AIRFLOW-2822) Fixes PendingDeprecationWarning on HipChatAPISendRoomNotificationOperator Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) gives: ```python airflow/models.py:2390: PendingDeprecationWarning: Invalid arguments were passed to HipChatAPISendRoomNotificationOperator. Support for passing such arguments will be dropped in Airflow 2.0. Invalid arguments were: *args: () **kwargs: {'color': 'green'} category=PendingDeprecationWarning ``` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > PendingDeprecationWarning Invalid arguments: > HipChatAPISendRoomNotificationOperator > --- > > Key: AIRFLOW-2822 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2822 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, operators >Affects Versions: Airflow 2.0 >Reporter: Leo Gallucci >Assignee: Leo Gallucci >Priority: Trivial > Labels: easyfix > > Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) > gives: > {code:python} > airflow/models.py:2390: PendingDeprecationWarning: > Invalid arguments were passed to HipChatAPISendRoomNotificationOperator. > Support for passing such arguments will be dropped in Airflow 2.0. 
> Invalid arguments were: > *args: () > **kwargs: {'color': 'green'} > category=PendingDeprecationWarning > {code} > I've fixed this in my fork: > https://github.com/elgalu/apache-airflow/commit/83fc940f54e5d6531f66bff256f66765899dc055 > I will send a PR -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2830) Worker subprocess crash results in tasks failing without retry
James Meickle created AIRFLOW-2830: -- Summary: Worker subprocess crash results in tasks failing without retry Key: AIRFLOW-2830 URL: https://issues.apache.org/jira/browse/AIRFLOW-2830 Project: Apache Airflow Issue Type: Bug Components: celery, scheduler, worker Affects Versions: 1.9.1 Reporter: James Meickle We ran across this fixed bug in production: [https://github.com/apache/incubator-airflow/pull/3040] Fair enough, it's fixed. However, that task had `retries=3` which never kicked in - that's a bug in its own right! I do see this in the documentation: {quote}Zombies & Undeads Task instances die all the time, usually as part of their normal life cycle, but sometimes unexpectedly. Zombie tasks are characterized by the absence of an heartbeat (emitted by the job periodically) and a running status in the database. {quote} I was not on call at the time so I don't have a full log of what happened with the task states. However, I am wondering if what happened looked something like this: * Scheduler detects that process needs to run * Scheduler changes state to "queued" * Scheduler adds to Celery queue * Worker pulls message off queue * Worker starts subprocess * Worker subprocess dies to bug when trying to load logging config, before changing task state to running * Worker never tries to actually run task, so it never sets task to "up_for_retry" * Message no longer exists in queue so worker won't grab task again * Scheduler never retries because the task wasn't "up_for_retry" * Scheduler never checks heartbeat because it's "queued", not "running" In general it's been disappointing to see so many ugly race conditions in Airflow. I'd love to see an Airflow enhancement proposal for converting the codebase to use a reliable state machine and better distributed system primitives. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
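The stranding scenario the reporter reconstructs can be sketched as a tiny state check. Everything below is an illustrative simplification of the report's reasoning, not Airflow's actual scheduler code: retries are only triggered from the "up_for_retry" state, and the zombie/heartbeat check only covers "running" tasks, so a task whose worker subprocess died while still "queued" is never revisited.

```python
# Task states as described in the bug report (illustrative constants).
QUEUED, RUNNING, UP_FOR_RETRY = "queued", "running", "up_for_retry"


def scheduler_action(task_state, has_heartbeat):
    """What a (simplified) scheduler pass does with one task instance."""
    if task_state == UP_FOR_RETRY:
        return "schedule_retry"
    if task_state == RUNNING and not has_heartbeat:
        return "reap_zombie"   # heartbeat check applies to RUNNING only
    return "no_action"         # QUEUED is assumed to be in-flight


# Worker subprocess crashed before it could set the task to RUNNING:
print(scheduler_action(QUEUED, has_heartbeat=False))   # -> no_action (stranded)
# Had the crash happened after the state change, the zombie check would fire:
print(scheduler_action(RUNNING, has_heartbeat=False))  # -> reap_zombie
```

In this model the Celery message is already consumed, so no path ever moves the task out of "queued" — which matches the reporter's observation that `retries=3` never kicked in.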
[jira] [Commented] (AIRFLOW-2824) Disable loading of default connections via airflow config
[ https://issues.apache.org/jira/browse/AIRFLOW-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563690#comment-16563690 ] Felix Uellendall commented on AIRFLOW-2824: --- Dont know, looks strange to me that it is called upgradedb when it does what you say. But I guess I will use it for now but I personally do not like that it is called upgradedb and it is actually an init without "examples". A doc patch would not be a fix but an improvement :) > Disable loading of default connections via airflow config > - > > Key: AIRFLOW-2824 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2824 > Project: Apache Airflow > Issue Type: Wish >Reporter: Felix Uellendall >Priority: Major > > I would love to have a variable I can set in the airflow.cfg, like the DAG > examples have, to not load the default connections. > Either by using {{load_examples}} that is already > [there|https://github.com/apache/incubator-airflow/blob/dfa7b26ddaca80ee8fd9915ee9f6eac50fac77f6/airflow/config_templates/default_airflow.cfg#L128] > for loading dag examples or by a new one like {{load_default_connections}} > to check if the user wants to have it or not. > The implementation of the default connections starts > [here|https://github.com/apache/incubator-airflow/blob/9e1d8ee837ea2c23e828d070b6a72a6331d98602/airflow/utils/db.py#L94] > Let me know what you guys think of it, pls. :) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2824) Disable loading of default connections via airflow config
[ https://issues.apache.org/jira/browse/AIRFLOW-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563592#comment-16563592 ] Ash Berlin-Taylor commented on AIRFLOW-2824: It is not well documented, but the answer I use for this is to never run {{airflow initdb}} -- that is what creates the sample connections. Instead I only ever run {{airflow upgradedb}} which will apply missing migrations but not create any "example" objects. {{upgradedb}} will work on an empty DB just fine. Perhaps the fix for this is a doc patch? > Disable loading of default connections via airflow config > - > > Key: AIRFLOW-2824 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2824 > Project: Apache Airflow > Issue Type: Wish >Reporter: Felix Uellendall >Priority: Major > > I would love to have a variable I can set in the airflow.cfg, like the DAG > examples have, to not load the default connections. > Either by using {{load_examples}} that is already > [there|https://github.com/apache/incubator-airflow/blob/dfa7b26ddaca80ee8fd9915ee9f6eac50fac77f6/airflow/config_templates/default_airflow.cfg#L128] > for loading dag examples or by a new one like {{load_default_connections}} > to check if the user wants to have it or not. > The implementation of the default connections starts > [here|https://github.com/apache/incubator-airflow/blob/9e1d8ee837ea2c23e828d070b6a72a6331d98602/airflow/utils/db.py#L94] > Let me know what you guys think of it, pls. :) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets
[ https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563402#comment-16563402 ] ASF subversion and git services commented on AIRFLOW-2238: -- Commit 6fdc79980b378222bb0706035bedfe5fcefb982d in incubator-airflow's branch refs/heads/master from [~ashb] [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=6fdc799 ] Merge pull request #3413 from ashb/pr-tool-git-config [AIRFLOW-2238] Switch PR tool to push to Github > Update dev/airflow-pr to work with gitub for merge targets > -- > > Key: AIRFLOW-2238 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2238 > Project: Apache Airflow > Issue Type: Improvement > Components: PR tool >Reporter: Ash Berlin-Taylor >Priority: Major > > We are planning on migrating the to the Apache "GitBox" project which lets > committers work directly on github. This will mean we might not _need_ to use > the pr tool, but we should update it so that it merges and pushes back to > github, not the ASF repo. > I think we need to do this before we ask the ASF infra team to migrate our > repo over. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets
[ https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563400#comment-16563400 ] ASF subversion and git services commented on AIRFLOW-2238: -- Commit 4484286e49b7272d2f82e022c0ee5a8690ccc564 in incubator-airflow's branch refs/heads/master from [~ashb] [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=4484286 ] [AIRFLOW-2238] Flake8 fixes on dev/airflow-pr > Update dev/airflow-pr to work with gitub for merge targets > -- > > Key: AIRFLOW-2238 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2238 > Project: Apache Airflow > Issue Type: Improvement > Components: PR tool >Reporter: Ash Berlin-Taylor >Priority: Major > > We are planning on migrating the to the Apache "GitBox" project which lets > committers work directly on github. This will mean we might not _need_ to use > the pr tool, but we should update it so that it merges and pushes back to > github, not the ASF repo. > I think we need to do this before we ask the ASF infra team to migrate our > repo over. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets
[ https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563401#comment-16563401 ] ASF subversion and git services commented on AIRFLOW-2238: -- Commit d3793c0a5021df6555a720e9038ccf14b79a1196 in incubator-airflow's branch refs/heads/master from [~ashb] [ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=d3793c0 ] [AIRFLOW-2238] Update PR tool to push directly to Github > Update dev/airflow-pr to work with gitub for merge targets > -- > > Key: AIRFLOW-2238 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2238 > Project: Apache Airflow > Issue Type: Improvement > Components: PR tool >Reporter: Ash Berlin-Taylor >Priority: Major > > We are planning on migrating the to the Apache "GitBox" project which lets > committers work directly on github. This will mean we might not _need_ to use > the pr tool, but we should update it so that it merges and pushes back to > github, not the ASF repo. > I think we need to do this before we ask the ASF infra team to migrate our > repo over. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets
[ https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563399#comment-16563399 ]

ASF GitHub Bot commented on AIRFLOW-2238:
-----------------------------------------

ashb closed pull request #3413: [AIRFLOW-2238] Switch PR tool to push to Github
URL: https://github.com/apache/incubator-airflow/pull/3413

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

{code}
diff --git a/dev/airflow-pr b/dev/airflow-pr
index 65243ddf87..08eb9aad36 100755
--- a/dev/airflow-pr
+++ b/dev/airflow-pr
@@ -54,7 +54,8 @@ except ImportError:
 try:
     import keyring
 except ImportError:
-    print("Could not find the keyring library. Run 'sudo pip install keyring' to install.")
+    print("Could not find the keyring library. "
+          "Run 'sudo pip install keyring' to install.")
     sys.exit(-1)
 
 # Location of your Airflow git development area
@@ -64,12 +65,12 @@ AIRFLOW_GIT_LOCATION = os.environ.get(
 
 # Remote name which points to the Gihub site
 GITHUB_REMOTE_NAME = os.environ.get("GITHUB_REMOTE_NAME", "github")
-# Remote name which points to Apache git
-APACHE_REMOTE_NAME = os.environ.get("APACHE_REMOTE_NAME", "apache")
-# OAuth key used for issuing requests against the GitHub API. If this is not defined, then requests
-# will be unauthenticated. You should only need to configure this if you find yourself regularly
-# exceeding your IP's unauthenticated request rate limit. You can create an OAuth key at
-# https://github.com/settings/tokens. This tool only requires the "public_repo" scope.
+# OAuth key used for issuing requests against the GitHub API. If this is not
+# defined, then requests will be unauthenticated. You should only need to
+# configure this if you find yourself regularly exceeding your IP's
+# unauthenticated request rate limit. You can create an OAuth key at
+# https://github.com/settings/tokens. This tool only requires the "public_repo"
+# scope.
 GITHUB_OAUTH_KEY = os.environ.get("GITHUB_OAUTH_KEY")
 
 GITHUB_BASE = "https://github.com/apache/incubator-airflow/pull"
@@ -172,7 +173,7 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
     pr_branch_name = "%s_MERGE_PR_%s" % (BRANCH_PREFIX, pr_num)
     target_branch_name = "%s_MERGE_PR_%s_%s" % (BRANCH_PREFIX, pr_num, target_ref.upper())
     run_cmd("git fetch %s pull/%s/head:%s" % (GITHUB_REMOTE_NAME, pr_num, pr_branch_name))
-    run_cmd("git fetch %s %s:%s" % (APACHE_REMOTE_NAME, target_ref, target_branch_name))
+    run_cmd("git fetch %s %s:%s" % (GITHUB_REMOTE_NAME, target_ref, target_branch_name))
     run_cmd("git checkout %s" % target_branch_name)
 
     had_conflicts = False
@@ -205,7 +206,8 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
     except Exception as e:
         msg = "Error merging: %s\nWould you like to manually fix-up this merge?" % e
         continue_maybe(msg)
-        msg = "Okay, please fix any conflicts and 'git add' conflicting files... Finished?"
+        msg = ("Okay, please fix any conflicts and 'git add' conflicting files... "
+               "Finished?")
         continue_maybe(msg)
         had_conflicts = True
@@ -216,7 +218,6 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
     if pr_commits:
         all_text += ' '.join(c['commit']['message'] for c in pr_commits)
     all_jira_refs = standardize_jira_ref(all_text, only_jira=True)
-    all_jira_issues = re.findall("AIRFLOW-[0-9]{1,6}", all_jira_refs)
 
     merge_message_flags = []
@@ -315,7 +316,6 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
         if primary_author == "":
             primary_author = distinct_authors[0]
 
-        authors = "\n".join(["Author: %s" % a for a in distinct_authors])
         merge_message_flags.append(u'--author="{}"'.format(primary_author))
     else:
@@ -327,7 +327,7 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
     # reflow commit message
     seen_first_line = False
     for i in range(1, len(merge_message_flags)):
-        if merge_message_flags[i-1] == '-m':
+        if merge_message_flags[i - 1] == '-m':
             # let the first line be as long as the user wants
             if not seen_first_line:
                 if '\n\n' in merge_message_flags[i]:
@@ -376,7 +376,7 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
     run_cmd(['git', 'commit'] + commit_flags, echo_cmd=False)
 
     if local:
-        msg ='\n' + reflow("""
+        msg = '\n' + reflow("""
         The PR has been merged locally in branch {}. You may leave this program running while you work on it. When you are finished, press any k
{code}
[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets
[ https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563350#comment-16563350 ]

ASF GitHub Bot commented on AIRFLOW-2238:
-----------------------------------------

codecov-io edited a comment on issue #3413: [AIRFLOW-2238] Switch PR tool to push to Github
URL: https://github.com/apache/incubator-airflow/pull/3413#issuecomment-391769983

# [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=h1) Report
> Merging [#3413](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/dfa7b26ddaca80ee8fd9915ee9f6eac50fac77f6?src=pr&el=desc) will **not change** coverage.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/3413/graphs/tree.svg?height=150&width=650&token=WdLKlKHOAU&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=tree)

```diff
@@           Coverage Diff           @@
##           master    #3413   +/-   ##
=======================================
  Coverage   77.51%   77.51%
=======================================
  Files         205      205
  Lines       15751    15751
=======================================
  Hits        12210    12210
  Misses       3541     3541
```

--

[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=footer). Last update [dfa7b26...d3793c0](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Update dev/airflow-pr to work with gitub for merge targets
> ----------------------------------------------------------
>
>                 Key: AIRFLOW-2238
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2238
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: PR tool
>            Reporter: Ash Berlin-Taylor
>            Priority: Major
>
> We are planning on migrating to the Apache "GitBox" project, which lets committers work directly on GitHub. This will mean we might not _need_ to use the PR tool, but we should update it so that it merges and pushes back to GitHub, not the ASF repo.
> I think we need to do this before we ask the ASF infra team to migrate our repo over.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
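The figures in the Codecov report for #3413 above are internally consistent: 12210 hits plus 3541 misses gives the 15751 total lines, and 12210/15751 yields the quoted 77.51%. A quick sanity check, assuming Codecov truncates (rather than rounds) the percentage to two decimal places:

```python
# Reproduce the coverage percentage from the hits/lines counts quoted in
# the Codecov report above. Truncation to two decimals is an assumption
# about Codecov's display, made here because 77.5189...% is shown as 77.51%.
def coverage_pct(hits, lines):
    """Coverage as a percentage, truncated to two decimal places."""
    return int(hits / lines * 10000) / 100
```

Under that assumption, `coverage_pct(12210, 15751)` reproduces the report's 77.51% for both `master` and the PR branch, which is why the bot says the merge "will not change coverage".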
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563319#comment-16563319 ]

ASF GitHub Bot commented on AIRFLOW-2803:
-----------------------------------------

verdan commented on issue #3656: [AIRFLOW-2803] Fix all ESLint issues
URL: https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409147448

@tedmiston can you please make sure:
- you squash your commits
- your commit message adheres to the [commit guidelines](https://github.com/apache/incubator-airflow/blob/master/.github/PULL_REQUEST_TEMPLATE.md#commits)

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Fix all ESLint issues
> ---------------------
>
>                 Key: AIRFLOW-2803
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2803
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Verdan Mahmood
>            Assignee: Taylor Edmiston
>            Priority: Major
>
> Most of the JS code in Apache Airflow has linting issues, which were highlighted after the integration of ESLint.
> Once AIRFLOW-2783 is merged into the master branch, please fix all the JavaScript styling issues that we have in .js and .html files.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues
[ https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563312#comment-16563312 ]

ASF GitHub Bot commented on AIRFLOW-2803:
-----------------------------------------

verdan commented on a change in pull request #3656: [AIRFLOW-2803] Fix all ESLint issues
URL: https://github.com/apache/incubator-airflow/pull/3656#discussion_r206443837

##########
## File path: airflow/www_rbac/static/js/clock.js
##########
@@ -18,24 +18,25 @@
  */
 require('./jqClock.min');

-$(document).ready(function () {
-  x = new Date();
+$(document).ready(() => {

Review comment:
Please note that most of the custom JS is written inline in .html files, and we are not yet running that JavaScript through webpack, which means we won't be able to transpile it to ES5. (That is fine for now.) I am working on another issue to extract all inline JS from HTML files into separate .js files: https://issues.apache.org/jira/browse/AIRFLOW-2804
My suggestion would be to implement the ES6->ES5 transpilation as part of this issue. Once this PR gets merged, we'll be able to extract all inline JS into separate .js files. We already have a JIRA issue for that: https://issues.apache.org/jira/browse/AIRFLOW-2730

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Fix all ESLint issues
> ---------------------
>
>                 Key: AIRFLOW-2803
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2803
>             Project: Apache Airflow
>          Issue Type: Improvement
>            Reporter: Verdan Mahmood
>            Assignee: Taylor Edmiston
>            Priority: Major
>
> Most of the JS code in Apache Airflow has linting issues, which were highlighted after the integration of ESLint.
> Once AIRFLOW-2783 is merged into the master branch, please fix all the JavaScript styling issues that we have in .js and .html files.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
[ https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563309#comment-16563309 ]

ASF GitHub Bot commented on AIRFLOW-2814:
-----------------------------------------

kaxil commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default config
URL: https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409144039

@bolkedebruin @Fokko Thoughts?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc
> -----------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-2814
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2814
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>            Reporter: Xiaodong DENG
>            Assignee: Xiaodong DENG
>            Priority: Critical
>             Fix For: 2.0.0
>
> h2. Background
> In [https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592], the documented default value of the argument *file_process_interval* is 3 minutes (*file_process_interval*: parse and schedule each file no faster than this interval).
> The value is normally parsed from the default configuration. However, in the default config template its value is 0 rather than 180 seconds ([https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432]).
> h2. Issue
> This means that each file is actually parsed and scheduled without letting Airflow "rest". This conflicts with the design intent (a 180-second default) and may affect performance significantly.
> h2. My Proposal
> Change the value in the config template from 0 to 180.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
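The throttle that AIRFLOW-2814 describes can be sketched in a few lines: a DAG file may only be re-parsed once `file_process_interval` seconds have passed since its last parse. This is a simplified toy model of the behavior described in the report, not the real `SchedulerJob` logic:

```python
# Toy model of file_process_interval, per the AIRFLOW-2814 description above:
# a DAG file may only be re-parsed once the interval has elapsed since its
# last parse. Illustrative only; the actual SchedulerJob code is more involved.
def can_parse(now, last_parse_ts, file_process_interval):
    """True if enough time has passed to parse the file again."""
    return now - last_parse_ts >= file_process_interval
```

With the shipped template default of 0, every scheduler pass re-parses the file immediately; with the documented default of 180, the file is skipped until three minutes have elapsed, which is the "rest" the report says Airflow currently never gets.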