[jira] [Updated] (AIRFLOW-2834) can not see the dag page after build from the newest code in github

2018-07-31 Thread Rurui Ye (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rurui Ye updated AIRFLOW-2834:
--
Priority: Blocker  (was: Major)

> can not see the dag page after build from the newest code in github
> ---
>
> Key: AIRFLOW-2834
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2834
> Project: Apache Airflow
>  Issue Type: Bug
>Affects Versions: Airflow 2.0
>Reporter: Rurui Ye
>Priority: Blocker
> Attachments: image-2018-08-01-14-20-09-256.png
>
>
> After building and deploying the newest version of the code from GitHub, the 
> web server starts, but the DAGs page is blank and the resource request 
> returns the following error.
>  
> !image-2018-08-01-14-20-09-256.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-2834) can not see the dag page after build from the newest code in github

2018-07-31 Thread Rurui Ye (JIRA)
Rurui Ye created AIRFLOW-2834:
-

 Summary: can not see the dag page after build from the newest code 
in github
 Key: AIRFLOW-2834
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2834
 Project: Apache Airflow
  Issue Type: Bug
Affects Versions: Airflow 2.0
Reporter: Rurui Ye
 Attachments: image-2018-08-01-14-20-09-256.png

After building and deploying the newest version of the code from GitHub, the 
web server starts, but the DAGs page is blank and the resource request 
returns the following error.

 

!image-2018-08-01-14-20-09-256.png!





[jira] [Created] (AIRFLOW-2833) Delay in trigger of downstream tasks in DAG

2018-07-31 Thread Mishika Singh (JIRA)
Mishika Singh created AIRFLOW-2833:
--

 Summary: Delay in trigger of downstream tasks in DAG
 Key: AIRFLOW-2833
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2833
 Project: Apache Airflow
  Issue Type: Bug
Reporter: Mishika Singh
 Attachments: Screen Shot 2018-05-25 at 9.18.08 AM.png

There is around 2 minutes of delay in triggering the downstream tasks on 
completion of upstream tasks.





[jira] [Assigned] (AIRFLOW-2703) Scheduler crashes if Mysql Connectivity is lost

2018-07-31 Thread raman (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raman reassigned AIRFLOW-2703:
--

Assignee: Mishika Singh

> Scheduler crashes if Mysql Connectivity is lost
> ---
>
> Key: AIRFLOW-2703
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2703
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: Airflow 2.0, 1.9.0
>Reporter: raman
>Assignee: Mishika Singh
>Priority: Major
>
> Airflow scheduler crashes if connectivity to Mysql is lost.
> Below is the stack Trace
> Traceback (most recent call last):
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/jobs.py", line 371, in helper
>     pickle_dags)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in wrapper
>     result = func(*args, **kwargs)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/jobs.py", line 1762, in process_file
>     dag.sync_to_db()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/utils/db.py", line 50, in wrapper
>     result = func(*args, **kwargs)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/airflow/models.py", line 3816, in sync_to_db
>     session.commit()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 943, in commit
>     self.transaction.commit()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 471, in commit
>     t[1].commit()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1643, in commit
>     self._do_commit()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1674, in _do_commit
>     self.connection._commit_impl()
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 726, in _commit_impl
>     self._handle_dbapi_exception(e, None, None, None, None)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1413, in _handle_dbapi_exception
>     exc_info
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
>     reraise(type(exception), exception, tb=exc_tb, cause=cause)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 724, in _commit_impl
>     self.engine.dialect.do_commit(self.connection)
>   File "/usr/src/venv/local/lib/python2.7/site-packages/sqlalchemy/dialects/mysql/base.py", line 1784, in do_commit
>     dbapi_connection.commit()
> OperationalError: (_mysql_exceptions.OperationalError) (2013, 'Lost connection to MySQL server during query') (Background on this error at: http://sqlalche.me/e/e3q8)
> Process DagFileProcessor141318-Process:
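
One way a long-running process could survive a transient database outage like the one in the trace above is to retry the failing commit with backoff instead of letting the exception end the process. The sketch below only illustrates that idea and is not the fix adopted for AIRFLOW-2703. `TransientDBError`, `retry_on_disconnect` and `flaky_commit` are hypothetical names; a real implementation would presumably catch `sqlalchemy.exc.OperationalError` instead.

```python
import time


class TransientDBError(Exception):
    """Hypothetical stand-in for sqlalchemy.exc.OperationalError."""


def retry_on_disconnect(func, retries=3, base_delay=0.01, sleep=time.sleep):
    """Call func(); on TransientDBError, wait and retry up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return func()
        except TransientDBError:
            if attempt == retries:
                raise  # out of retries: surface the original error
            sleep(base_delay * (2 ** attempt))  # exponential backoff


# Hypothetical commit that fails twice before the connection comes back.
calls = {"n": 0}


def flaky_commit():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientDBError("(2013, 'Lost connection to MySQL server during query')")
    return "committed"


result = retry_on_disconnect(flaky_commit)
```

The retry count and delay here are arbitrary; whether retrying is safe depends on the failed transaction being rolled back before the next attempt.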





[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564514#comment-16564514
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

codecov-io edited a comment on issue #3658: [AIRFLOW-2524] Add Amazon SageMaker 
Training
URL: 
https://github.com/apache/incubator-airflow/pull/3658#issuecomment-408564225
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=h1)
 Report
   > Merging 
[#3658](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/096ba9ecd961cdaebd062599f408571ffb21165a?src=pr&el=desc)
 will **increase** coverage by `0.4%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3658/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=tree)
   
   ```diff
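   @@            Coverage Diff            @@
   ##           master    #3658     +/-   ##
   =========================================
   + Coverage   77.11%   77.51%   +0.4%
   =========================================
     Files         206      205      -1
     Lines       15772    15751     -21
   =========================================
   + Hits        12162    12210     +48
   + Misses       3610     3541     -69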
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/www/app.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `99.01% <0%> (-0.99%)` | :arrow_down: |
   | [airflow/www/validators.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdmFsaWRhdG9ycy5weQ==) | `100% <0%> (ø)` | :arrow_up: |
   | [airflow/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9fX2luaXRfXy5weQ==) | `80.43% <0%> (ø)` | :arrow_up: |
   | [airflow/plugins\_manager.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=) | `92.59% <0%> (ø)` | :arrow_up: |
   | [airflow/minihivecluster.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9taW5paGl2ZWNsdXN0ZXIucHk=) | | |
   | [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `82.74% <0%> (+0.26%)` | :arrow_up: |
   | [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `89.87% <0%> (+0.42%)` | :arrow_up: |
   | [airflow/hooks/pig\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3658/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9waWdfaG9vay5weQ==) | `100% <0%> (+100%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=footer).
 Last update 
[096ba9e...3f1e4b1](https://codecov.io/gh/apache/incubator-airflow/pull/3658?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Airflow integration with AWS Sagemaker
> --
>
> Key: AIRFLOW-2524
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2524
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, contrib
>Reporter: Rajeev Srinivasan
>Assignee: Yang Yu
>Priority: Major
>  Labels: AWS
>
> Would it be possible to orchestrate an end-to-end AWS SageMaker job using 
> Airflow?





[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564491#comment-16564491
 ] 

ASF GitHub Bot commented on AIRFLOW-2814:
-

XD-DENG commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default config
URL: 
https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409398992
 
 
   Hi all, thanks for the input. I agree with you on the desired value as well 
(the objective of this PR was to fix the inconsistency between `.cfg` and the 
comment in `jobs.py`, not to propose another value for this configuration item).
   
   Hi @kaxil, regarding `dag_dir_list_interval`, personally I think it should 
be reduced. 5 minutes is quite long for users to wait until a new DAG file is 
reflected.




> Default Arg "file_process_interval" for class SchedulerJob is inconsistent 
> with doc
> ---
>
> Key: AIRFLOW-2814
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2814
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
> Fix For: 2.0.0
>
>
> h2. Background
> In 
> [https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592]
>  , a comment states that the default value of the argument 
> *file_process_interval* is 3 minutes (*file_process_interval:* Parse and 
> schedule each file no faster than this interval).
> The value is normally read from the default configuration. However, in the 
> default config template, its value is 0 rather than 180 seconds 
> ([https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432]
>  ). 
> h2. Issue
> This means that each file is parsed and scheduled continuously, without 
> letting Airflow "rest". This conflicts with the intended default of 
> 180 seconds and may affect performance significantly.
> h2. My Proposal
> Change the value in the config template from 0 to 180.
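
To make the effect of the two candidate values concrete, here is a minimal sketch of a "no faster than this interval" check. It is an illustration under assumptions, not the scheduler's actual code: `is_due` and its parameters are hypothetical names, and times are plain seconds.

```python
def is_due(last_finish_time, now, min_interval_seconds):
    """Return True if a DAG file may be parsed again.

    A file that was never processed is always due; otherwise at least
    `min_interval_seconds` must have passed since it last finished.
    """
    if last_finish_time is None:
        return True
    return (now - last_finish_time) >= min_interval_seconds


# With an interval of 0 the file is eligible again on every scheduler loop.
due_with_zero = is_due(last_finish_time=100.0, now=100.5, min_interval_seconds=0)
# With 180 seconds the same file has to wait.
due_with_180 = is_due(last_finish_time=100.0, now=100.5, min_interval_seconds=180)
due_later = is_due(last_finish_time=100.0, now=280.0, min_interval_seconds=180)
```

Under this reading, a value of 0 trades extra parsing load for faster pickup of DAG changes, which is the trade-off the thread is debating.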





[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564482#comment-16564482
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

troychen728 commented on a change in pull request #3658: [AIRFLOW-2524] Add 
Amazon SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206711545
 
 

 ##
 File path: tests/contrib/hooks/test_sagemaker_hook.py
 ##
 @@ -0,0 +1,341 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+
+import json
+import unittest
+import copy
+try:
+    from unittest import mock
+except ImportError:
+    try:
+        import mock
+    except ImportError:
+        mock = None
+
+from airflow import configuration
+from airflow import models
+from airflow.utils import db
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.hooks.S3_hook import S3Hook
+from airflow.exceptions import AirflowException
+
+
+role = 'test-role'
+
+bucket = 'test-bucket'
+
+key = 'test/data'
+data_url = 's3://{}/{}'.format(bucket, key)
+
+job_name = 'test-job-name'
+
+image = 'test-image'
+
+test_arn_return = {'TrainingJobArn': 'testarn'}
+
+test_list_training_job_return = {
+    'TrainingJobSummaries': [
+        {
+            'TrainingJobName': job_name,
+            'TrainingJobStatus': 'InProgress'
+        },
+    ],
+    'NextToken': 'test-token'
+}
+
+test_list_tuning_job_return = {
+    'TrainingJobSummaries': [
+        {
+            'TrainingJobName': job_name,
+            'TrainingJobArn': 'testarn',
+            'TunedHyperParameters': {
+                'k': '3'
+            },
+            'TrainingJobStatus': 'InProgress'
+        },
+    ],
+    'NextToken': 'test-token'
+}
+
+output_url = 's3://{}/test/output'.format(bucket)
+create_training_params = \
+    {
+        'AlgorithmSpecification': {
+            'TrainingImage': image,
+            'TrainingInputMode': 'File'
+        },
+        'RoleArn': role,
+        'OutputDataConfig': {
+            'S3OutputPath': output_url
+        },
+        'ResourceConfig': {
+            'InstanceCount': 2,
+            'InstanceType': 'ml.c4.8xlarge',
+            'VolumeSizeInGB': 50
+        },
+        'TrainingJobName': job_name,
+        'HyperParameters': {
+            'k': '10',
+            'feature_dim': '784',
+            'mini_batch_size': '500',
+            'force_dense': 'True'
+        },
+        'StoppingCondition': {
+            'MaxRuntimeInSeconds': 60 * 60
+        },
+        'InputDataConfig': [
+            {
+                'ChannelName': 'train',
+                'DataSource': {
+                    'S3DataSource': {
+                        'S3DataType': 'S3Prefix',
+                        'S3Uri': data_url,
+                        'S3DataDistributionType': 'FullyReplicated'
+                    }
+                },
+                'CompressionType': 'None',
+                'RecordWrapperType': 'None'
+            }
+        ]
+    }
+
+create_tuning_params = {'HyperParameterTuningJobName': job_name,
+                        'HyperParameterTuningJobConfig': {
+                            'Strategy': 'Bayesian',
+                            'HyperParameterTuningJobObjective': {
+                                'Type': 'Maximize',
+                                'MetricName': 'test_metric'
+                            },
+                            'ResourceLimits': {
+                                'MaxNumberOfTrainingJobs': 123,
+                                'MaxParallelTrainingJobs': 123
+                            },
+                            'ParameterRanges': {
+                                'IntegerParameterRanges': [
+                                    {
+                                        'Name': 'k',
+                                        'MinValue': '2',
+                                        'MaxValue': '10'
+                                    },
+                                ]
+                            }
+                        },
+                        'TrainingJobDefinition': {
+  

[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564481#comment-16564481
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

troychen728 commented on a change in pull request #3658: [AIRFLOW-2524] Add 
Amazon SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206711515
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py
 ##
 @@ -0,0 +1,98 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.models import BaseOperator
+from airflow.utils import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerCreateTrainingJobOperator(BaseOperator):
+
+    """
+       Initiate a SageMaker training
+
+       This operator returns the ARN of the model created in Amazon SageMaker
+
+       :param training_job_config:
+           The configuration necessary to start a training job (templated)
+       :type training_job_config: dict
+       :param region_name: The AWS region_name
+       :type region_name: string
+       :param sagemaker_conn_id: The SageMaker connection ID to use.
+       :type sagemaker_conn_id: string
+       :param use_db_config: Whether or not to use db config
+           associated with sagemaker_conn_id.
+           If set to true, will automatically update the training config
+           with what's in db, so the db config doesn't need to
+           include everything, but what's there does replace the ones
+           in the training_job_config, so be careful
+       :type use_db_config: bool
+       :param aws_conn_id: The AWS connection ID to use.
+       :type aws_conn_id: string
+
+       **Example**:
+           The following operator would start a training job when executed
+
+            sagemaker_training =
+               SageMakerCreateTrainingJobOperator(
+                   task_id='sagemaker_training',
+                   training_job_config=config,
+                   use_db_config=True,
+                   region_name='us-west-2',
+                   sagemaker_conn_id='sagemaker_customers_conn',
+                   aws_conn_id='aws_customers_conn'
+               )
+       """
+
+    template_fields = ['training_job_config']
+    template_ext = ()
+    ui_color = '#ededed'
+
+    @apply_defaults
+    def __init__(self,
+                 sagemaker_conn_id=None,
 
 Review comment:
   Changed the order
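
The `use_db_config` behaviour described in the docstring above (stored values override the ones passed to the operator) can be pictured as a dictionary merge. This is a minimal sketch under assumptions, not the hook's actual implementation: `merge_configs` is a hypothetical helper, and a shallow top-level override is assumed because the docstring does not say whether nested keys are merged.

```python
import copy


def merge_configs(training_job_config, db_config):
    """Return a copy of training_job_config with db_config's top-level keys applied on top."""
    merged = copy.deepcopy(training_job_config)
    merged.update(db_config)  # keys present in db_config win
    return merged


operator_config = {"TrainingJobName": "test-job-name", "RoleArn": "test-role"}
stored_config = {"RoleArn": "role-from-db"}  # hypothetical values stored with the connection

merged = merge_configs(operator_config, stored_config)
```

Copying before updating keeps the caller's dictionary unchanged, so the "be careful" warning applies only to the merged result.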




[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564478#comment-16564478
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

troychen728 commented on a change in pull request #3658: [AIRFLOW-2524] Add 
Amazon SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206711354
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py
 ##
 @@ -0,0 +1,98 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.models import BaseOperator
+from airflow.utils import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerCreateTrainingJobOperator(BaseOperator):
+
+    """
+       Initiate a SageMaker training
+
+       This operator returns the ARN of the model created in Amazon SageMaker
+
+       :param training_job_config:
+           The configuration necessary to start a training job (templated)
+       :type training_job_config: dict
+       :param region_name: The AWS region_name
+       :type region_name: string
+       :param sagemaker_conn_id: The SageMaker connection ID to use.
+       :type sagemaker_conn_id: string
+       :param use_db_config: Whether or not to use db config
+           associated with sagemaker_conn_id.
 
 Review comment:
   Added




[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564480#comment-16564480
 ] 

ASF GitHub Bot commented on AIRFLOW-2814:
-

codecov-io commented on issue #3669: Revert [AIRFLOW-2814] - Change 
`min_file_process_interval` to 0
URL: 
https://github.com/apache/incubator-airflow/pull/3669#issuecomment-409396427
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=h1)
 Report
   > Merging 
[#3669](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/ed972042a864cd010137190e0bbb1d25a9dcfe83?src=pr&el=desc)
 will **increase** coverage by `0.27%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3669/graphs/tree.svg?token=WdLKlKHOAU&src=pr&width=650&height=150)](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=tree)
   
   ```diff
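   @@            Coverage Diff             @@
   ##           master    #3669      +/-   ##
   ==========================================
   + Coverage   77.51%   77.79%   +0.27%
   ==========================================
     Files         205      205
     Lines       15751    16079     +328
   ==========================================
   + Hits        12210    12508     +298
   - Misses       3541     3571      +30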
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3669/diff?src=pr&el=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `84.63% <ø> (+1.88%)` | :arrow_up: |
   | [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/incubator-airflow/pull/3669/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `89.45% <0%> (-0.43%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=footer).
 Last update 
[ed97204...1ee1fc4](https://codecov.io/gh/apache/incubator-airflow/pull/3669?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564434#comment-16564434
 ] 

ASF subversion and git services commented on AIRFLOW-2814:
--

Commit 1ee1fc4ec0bab25d9e75a8ca1943fc1a91a85546 in incubator-airflow's branch 
refs/heads/revert-2814 from [~kaxilnaik]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=1ee1fc4 ]

Revert [AIRFLOW-2814] - Change `min_file_process_interval` to 0




[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564422#comment-16564422
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

troychen728 commented on a change in pull request #3658: [AIRFLOW-2524] Add 
Amazon SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206700100
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py
 ##
 @@ -0,0 +1,98 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.models import BaseOperator
+from airflow.utils import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerCreateTrainingJobOperator(BaseOperator):
+
+    """
+       Initiate a SageMaker training
+
+       This operator returns the ARN of the model created in Amazon SageMaker
+
+       :param training_job_config:
+           The configuration necessary to start a training job (templated)
+       :type training_job_config: dict
+       :param region_name: The AWS region_name
+       :type region_name: string
+       :param sagemaker_conn_id: The SageMaker connection ID to use.
+       :type sagemaker_conn_id: string
 
 Review comment:
   Hi Fokko, 
   Thank you so much for your review. I really appreciate your feedback. I 
couldn't figure out how to reply to your request directly, so I'll reply here. 
The main reason I split this into an operator and a sensor is that a training 
job succeeds in two stages: the job is kicked off successfully, and the job 
finishes successfully. The operator reports the first status and the sensor 
reports the second. Also, because a training job runs on an AWS instance rather 
than on the instance hosting Airflow, other operators can set the operator, 
rather than the sensor, as their upstream if they don't depend on the model 
actually being created. Finally, with a sensor, users can set parameters such 
as poke_interval, which make more sense for a sensor than for an operator.
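
The two-stage split described in this comment can be sketched as follows. This is an illustration under assumptions, not the code in the pull request: `FakeSageMakerClient`, `start_training` and `poke` are hypothetical names, and the fake client stands in for the real AWS API so the example runs on its own.

```python
class FakeSageMakerClient:
    """Hypothetical in-memory client; not the boto3 API."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def create_training_job(self, name):
        return {"TrainingJobArn": "arn:fake:" + name}

    def describe_training_job(self, name):
        return {"TrainingJobStatus": next(self._statuses)}


def start_training(client, name):
    """Operator-like step: succeeds as soon as the job is submitted."""
    return client.create_training_job(name)["TrainingJobArn"]


def poke(client, name):
    """Sensor-like check: True once the job has completed, error if it failed."""
    status = client.describe_training_job(name)["TrainingJobStatus"]
    if status == "Failed":
        raise RuntimeError("training job failed")
    return status == "Completed"


client = FakeSageMakerClient(["InProgress", "InProgress", "Completed"])
arn = start_training(client, "test-job-name")

pokes = 0
done = False
while not done:  # a real sensor would sleep for poke_interval between checks
    pokes += 1
    done = poke(client, "test-job-name")
```

Tasks that only need the job to be submitted could depend on the first step, while tasks that need the trained model would depend on the polling step.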




[jira] [Commented] (AIRFLOW-2658) Add GKE specific Kubernetes Pod Operator

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564402#comment-16564402
 ] 

ASF GitHub Bot commented on AIRFLOW-2658:
-

fenglu-g commented on issue #3532: [AIRFLOW-2658] Add GCP specific k8s pod 
operator
URL: 
https://github.com/apache/incubator-airflow/pull/3532#issuecomment-409378846
 
 
   @Noremac201 please fix travis-ci, thanks. 




> Add GKE specific Kubernetes Pod Operator
> 
>
> Key: AIRFLOW-2658
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2658
> Project: Apache Airflow
>  Issue Type: New Feature
>Reporter: Cameron Moberg
>Assignee: Cameron Moberg
>Priority: Minor
>
> Currently there is a Kubernetes Pod operator, but it is not easy to make it 
> work with GCP Kubernetes Engine. It would be nice to have a GKE-specific one.





[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564400#comment-16564400
 ] 

ASF GitHub Bot commented on AIRFLOW-2814:
-

XD-DENG commented on issue #3669: Revert [AIRFLOW-2814] - Change 
`min_file_process_interval` to 0
URL: 
https://github.com/apache/incubator-airflow/pull/3669#issuecomment-409378082
 
 
   Hi @kaxil, please remember to update the comment in 
   https://github.com/apache/incubator-airflow/blob/master/airflow/jobs.py#L592 
as well; otherwise the comment will again be inconsistent with the 
configuration value.




> Default Arg "file_process_interval" for class SchedulerJob is inconsistent 
> with doc
> ---
>
> Key: AIRFLOW-2814
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2814
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
> Fix For: 2.0.0
>
>
> h2. Background
> In 
> [https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592]
>  , it was mentioned the default value of argument *file_process_interval* 
> should be 3 minutes (*file_process_interval:* Parse and schedule each file no 
> faster than this interval).
> The value is normally parsed from the default configuration. However, in the 
> default config_template, its value is 0 rather than 180 seconds 
> ([https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432]
>  ). 
> h2. Issue
> This means that each file is parsed and scheduled without 
> letting Airflow "rest". This conflicts with the design purpose (a default 
> of 180 seconds) and may affect performance significantly.
> h2. My Proposal
> Change the value in the config template from 0 to 180.
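To make the effect of the two candidate defaults concrete, here is a minimal sketch of a "no faster than this interval" check. The function name `should_process` and its arguments are hypothetical and are not taken from `airflow/jobs.py`; the sketch only illustrates why a value of 0 removes any pause between parses of the same file.

```python
def should_process(last_finish_time, now, min_interval_seconds):
    """Return True when a file may be parsed again.

    last_finish_time is None if the file has never been processed.
    All times are plain seconds (for example, values from time.time()).
    """
    if last_finish_time is None:
        return True
    return (now - last_finish_time) >= min_interval_seconds


# A file that finished processing at t=100 and is checked again later.
with_zero = should_process(100.0, 130.0, 0)      # interval 0: always eligible
with_180 = should_process(100.0, 130.0, 180)     # only 30 s have elapsed
after_wait = should_process(100.0, 280.0, 180)   # exactly 180 s have elapsed
```

With an interval of 0 the check never holds a file back, which matches the behavior the issue describes; with 180 the same file is skipped until three minutes have passed.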





[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564395#comment-16564395
 ] 

ASF GitHub Bot commented on AIRFLOW-2832:
-

codecov-io commented on issue #3670: [AIRFLOW-2832] Lint and resolve 
inconsistencies in Markdown files
URL: 
https://github.com/apache/incubator-airflow/pull/3670#issuecomment-409376218
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=h1)
 Report
   > Merging 
[#3670](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/ed972042a864cd010137190e0bbb1d25a9dcfe83?src=pr&el=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3670/graphs/tree.svg?width=650&src=pr&token=WdLKlKHOAU&height=150)](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=tree)
   
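   ```diff
   @@           Coverage Diff           @@
   ##           master    #3670   +/-   ##
   =======================================
     Coverage   77.51%   77.51%
   =======================================
     Files         205      205
     Lines       15751    15751
   =======================================
     Hits        12210    12210
     Misses       3541     3541
   ```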
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=footer).
 Last update 
[ed97204...eef6fc8](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




> Inconsistencies and linter errors across markdown files
> ---
>
> Key: AIRFLOW-2832
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2832
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: docs, Documentation
>Reporter: Taylor Edmiston
>Assignee: Taylor Edmiston
>Priority: Minor
>
> There are a number of inconsistencies within and across markdown files in the 
> Airflow project.  Most of these are simple formatting issues easily fixed by 
> linting (e.g., with mdl).





[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564396#comment-16564396
 ] 

ASF GitHub Bot commented on AIRFLOW-2832:
-

codecov-io edited a comment on issue #3670: [AIRFLOW-2832] Lint and resolve 
inconsistencies in Markdown files
URL: 
https://github.com/apache/incubator-airflow/pull/3670#issuecomment-409376218
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=h1)
 Report
   > Merging 
[#3670](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/ed972042a864cd010137190e0bbb1d25a9dcfe83?src=pr&el=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3670/graphs/tree.svg?height=150&width=650&token=WdLKlKHOAU&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=tree)
   
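   ```diff
   @@           Coverage Diff           @@
   ##           master    #3670   +/-   ##
   =======================================
     Coverage   77.51%   77.51%
   =======================================
     Files         205      205
     Lines       15751    15751
   =======================================
     Hits        12210    12210
     Misses       3541     3541
   ```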
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=footer).
 Last update 
[ed97204...eef6fc8](https://codecov.io/gh/apache/incubator-airflow/pull/3670?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




> Inconsistencies and linter errors across markdown files
> ---
>
> Key: AIRFLOW-2832
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2832
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: docs, Documentation
>Reporter: Taylor Edmiston
>Assignee: Taylor Edmiston
>Priority: Minor
>
> There are a number of inconsistencies within and across markdown files in the 
> Airflow project.  Most of these are simple formatting issues easily fixed by 
> linting (e.g., with mdl).





[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564387#comment-16564387
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

tedmiston commented on a change in pull request #3656: [WIP][AIRFLOW-2803] Fix 
all ESLint issues
URL: https://github.com/apache/incubator-airflow/pull/3656#discussion_r206688518
 
 

 ##
 File path: airflow/www_rbac/templates/airflow/circles.html
 ##
 @@ -28,117 +28,111 @@ Airflow 404 = lots of circles
 
 
 

[jira] [Commented] (AIRFLOW-2817) Force explicit choice on GPL dependency

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564373#comment-16564373
 ] 

ASF GitHub Bot commented on AIRFLOW-2817:
-

ashb commented on issue #3660: [AIRFLOW-2817] Force explicit choice on GPL 
dependency
URL: 
https://github.com/apache/incubator-airflow/pull/3660#issuecomment-409370019
 
 
   Charting is causing us quite the license head-ache isn't it? :(




> Force explicit choice on GPL dependency
> ---
>
> Key: AIRFLOW-2817
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2817
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Bolke de Bruin
>Priority: Major
>
> A more explicit choice on GPL dependency was required by the IPMC





[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564370#comment-16564370
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

ashb commented on a change in pull request #3656: [WIP][AIRFLOW-2803] Fix all 
ESLint issues
URL: https://github.com/apache/incubator-airflow/pull/3656#discussion_r206684313
 
 

 ##
 File path: airflow/www_rbac/templates/airflow/circles.html
 ##
 @@ -28,117 +28,111 @@ Airflow 404 = lots of circles
 
 
 

[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564359#comment-16564359
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

codecov-io edited a comment on issue #3568: AIRFLOW-1104 Update jobs.py so 
Airflow does not over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-401878707
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=h1)
 Report
   > Merging 
[#3568](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/3b35d360f6ff8694b6fb4387901c182ca39160b5?src=pr&el=desc)
 will **increase** coverage by `<.01%`.
   > The diff coverage is `100%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3568/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=tree)
   
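   ```diff
   @@            Coverage Diff             @@
   ##           master    #3568      +/-   ##
   ==========================================
   + Coverage   77.51%   77.51%   +<.01%
   ==========================================
     Files         205      205
     Lines       15751    15751
   ==========================================
   + Hits        12209    12210       +1
   + Misses       3542     3541       -1
   ```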
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3568/diff?src=pr&el=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `82.74% <100%> (ø)` | :arrow_up: |
   | [airflow/models.py](https://codecov.io/gh/apache/incubator-airflow/pull/3568/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMucHk=) | `88.58% <0%> (+0.04%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=footer).
 Last update 
[3b35d36...b04c9b1](https://codecov.io/gh/apache/incubator-airflow/pull/3568?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




> Concurrency check in scheduler should count queued tasks as well as running
> ---
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we 
> cannot count them. This is because there is no guarantee that queued tasks in 
> failed dagruns will or will not eventually run and queued tasks that will 
> never run will consume slots and can stall a DAG. Once we can guarantee that 
> all queued tasks in failed dagruns will never run (e.g. make sure that all 
> running/newly queued TIs have running dagruns), then we can include QUEUED 
> tasks here, with the constraint that they are in running dagruns."
>Reporter: Alex Guziel
>Priority: Minor
> Fix For: 2.0.0
>
>






[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564354#comment-16564354
 ] 

ASF GitHub Bot commented on AIRFLOW-2832:
-

tedmiston edited a comment on issue #3670: [AIRFLOW-2832] Lint and resolve 
inconsistencies in Markdown files
URL: 
https://github.com/apache/incubator-airflow/pull/3670#issuecomment-409358478
 
 
   This PR is now squashed and ready for review.
   
   I'm not sure that there's any one best person to review these changes but in 
a git log, I see that @bolkedebruin, @Fokko, and @r39132 have modified some of 
these files in recent history.




> Inconsistencies and linter errors across markdown files
> ---
>
> Key: AIRFLOW-2832
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2832
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: docs, Documentation
>Reporter: Taylor Edmiston
>Assignee: Taylor Edmiston
>Priority: Minor
>
> There are a number of inconsistencies within and across markdown files in the 
> Airflow project.  Most of these are simple formatting issues easily fixed by 
> linting (e.g., with mdl).





[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564342#comment-16564342
 ] 

ASF GitHub Bot commented on AIRFLOW-2832:
-

tedmiston commented on issue #3670: [AIRFLOW-2832] Lint and resolve 
inconsistencies in Markdown files
URL: 
https://github.com/apache/incubator-airflow/pull/3670#issuecomment-409358478
 
 
   This PR is now squashed and ready for review.




> Inconsistencies and linter errors across markdown files
> ---
>
> Key: AIRFLOW-2832
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2832
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: docs, Documentation
>Reporter: Taylor Edmiston
>Assignee: Taylor Edmiston
>Priority: Minor
>
> There are a number of inconsistencies within and across markdown files in the 
> Airflow project.  Most of these are simple formatting issues easily fixed by 
> linting (e.g., with mdl).





[jira] [Commented] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564341#comment-16564341
 ] 

ASF GitHub Bot commented on AIRFLOW-2832:
-

tedmiston opened a new pull request #3670: [AIRFLOW-2832] Lint and resolve 
inconsistencies in Markdown files
URL: https://github.com/apache/incubator-airflow/pull/3670
 
 
   Make sure you have checked _all_ steps below.
   
   ### JIRA
   - [x] My PR addresses the following [Airflow 
JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
   - https://issues.apache.org/jira/browse/AIRFLOW-2832
   - In case you are fixing a typo in the documentation you can prepend 
your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue.
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   - Inspired by other recent issues related to linter errors in Python and JS 
(AIRFLOW-2783, AIRFLOW-2800, AIRFLOW-2803)
   - This PR does a few things:
 - Resolves linter errors in markdown files across the project (ignores 
errors that aren't super useful on GitHub such as line wrapping and putting 
`` in brackets)
 - Clarifies that commit message length of 50 characters doesn't include 
the Jira issue tag
 - Replaces usage of JIRA with Jira the way it's styled nowadays by 
[Atlassian](https://www.atlassian.com/software/jira) and 
[Wikipedia](https://en.wikipedia.org/wiki/Jira_(software))
 - Makes code block formatting consistent
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   The changes in this PR are restricted to linting documentation.
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
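Two of the subject-line rules above are mechanical enough to check automatically. The following is a rough sketch, not part of the Airflow tooling: `check_subject` and the `JIRA_TAG` pattern are hypothetical, and the tag shape is assumed from the `[AIRFLOW-XXX]` examples in the template. Following the clarification in this PR, the 50-character limit is applied after the Jira tag is stripped.

```python
import re

# Assumed tag shape, based on the "[AIRFLOW-XXX]" examples in the template.
JIRA_TAG = re.compile(r"^\[AIRFLOW-(?:\d+|XXX)\]\s*")


def check_subject(subject, limit=50):
    """Return a list of problems found in a commit subject line."""
    text = JIRA_TAG.sub("", subject)  # the tag does not count toward the limit
    problems = []
    if len(text) > limit:
        problems.append("subject longer than %d characters" % limit)
    if text.endswith("."):
        problems.append("subject ends with a period")
    return problems


ok = check_subject(
    "[AIRFLOW-2832] Lint and resolve inconsistencies in Markdown files")
bad = check_subject("[AIRFLOW-XXX] Fix typo.")
```

Rules such as the imperative mood or explaining "what" and "why" still need a human reviewer.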
   
   
   ### Documentation
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
   - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   n/a
   
   ### Code Quality
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   




> Inconsistencies and linter errors across markdown files
> ---
>
> Key: AIRFLOW-2832
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2832
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: docs, Documentation
>Reporter: Taylor Edmiston
>Assignee: Taylor Edmiston
>Priority: Minor
>
> There are a number of inconsistencies within and across markdown files in the 
> Airflow project.  Most of these are simple formatting issues easily fixed by 
> linting (e.g., with mdl).





[jira] [Resolved] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread Kaxil Naik (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaxil Naik resolved AIRFLOW-1104.
-
   Resolution: Resolved
Fix Version/s: 2.0.0

Resolved by https://github.com/apache/incubator-airflow/pull/3568

> Concurrency check in scheduler should count queued tasks as well as running
> ---
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we 
> cannot count them. This is because there is no guarantee that queued tasks in 
> failed dagruns will or will not eventually run and queued tasks that will 
> never run will consume slots and can stall a DAG. Once we can guarantee that 
> all queued tasks in failed dagruns will never run (e.g. make sure that all 
> running/newly queued TIs have running dagruns), then we can include QUEUED 
> tasks here, with the constraint that they are in running dagruns."
>Reporter: Alex Guziel
>Priority: Minor
> Fix For: 2.0.0
>
>






[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564330#comment-16564330
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

kaxil closed pull request #3568: AIRFLOW-1104 Update jobs.py so Airflow does 
not over schedule tasks
URL: https://github.com/apache/incubator-airflow/pull/3568
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/jobs.py b/airflow/jobs.py
index 224ff185fb..a4252473cd 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -1075,9 +1075,6 @@ def _find_executable_task_instances(self, simple_dag_bag, states, session=None):
         :type states: Tuple[State]
         :return: List[TaskInstance]
         """
-        # TODO(saguziel): Change this to include QUEUED, for concurrency
-        # purposes we may want to count queued tasks
-        states_to_count_as_running = [State.RUNNING]
         executable_tis = []
 
         # Get all the queued task instances from associated with scheduled
@@ -1123,6 +1120,7 @@ def _find_executable_task_instances(self, simple_dag_bag, states, session=None):
         for task_instance in task_instances_to_examine:
             pool_to_task_instances[task_instance.pool].append(task_instance)
 
+        states_to_count_as_running = [State.RUNNING, State.QUEUED]
         task_concurrency_map = self.__get_task_concurrency_map(
             states=states_to_count_as_running, session=session)
 
@@ -1173,7 +1171,6 @@ def _find_executable_task_instances(self, simple_dag_bag, states, session=None):
                 simple_dag = simple_dag_bag.get_dag(dag_id)
 
                 if dag_id not in dag_id_to_possibly_running_task_count:
-                    # TODO(saguziel): also check against QUEUED state, see AIRFLOW-1104
                     dag_id_to_possibly_running_task_count[dag_id] = \
                         DAG.get_num_task_instances(
                             dag_id,
diff --git a/tests/jobs.py b/tests/jobs.py
index 93f6574df4..c701214f1e 100644
--- a/tests/jobs.py
+++ b/tests/jobs.py
@@ -1493,6 +1493,39 @@ def test_find_executable_task_instances_concurrency(self):
 
         self.assertEqual(0, len(res))
 
+    def test_find_executable_task_instances_concurrency_queued(self):
+        dag_id = 'SchedulerJobTest.test_find_executable_task_instances_concurrency_queued'
+        dag = DAG(dag_id=dag_id, start_date=DEFAULT_DATE, concurrency=3)
+        task1 = DummyOperator(dag=dag, task_id='dummy1')
+        task2 = DummyOperator(dag=dag, task_id='dummy2')
+        task3 = DummyOperator(dag=dag, task_id='dummy3')
+        dagbag = self._make_simple_dag_bag([dag])
+
+        scheduler = SchedulerJob()
+        session = settings.Session()
+        dag_run = scheduler.create_dag_run(dag)
+
+        ti1 = TI(task1, dag_run.execution_date)
+        ti2 = TI(task2, dag_run.execution_date)
+        ti3 = TI(task3, dag_run.execution_date)
+        ti1.state = State.RUNNING
+        ti2.state = State.QUEUED
+        ti3.state = State.SCHEDULED
+
+        session.merge(ti1)
+        session.merge(ti2)
+        session.merge(ti3)
+
+        session.commit()
+
+        res = scheduler._find_executable_task_instances(
+            dagbag,
+            states=[State.SCHEDULED],
+            session=session)
+
+        self.assertEqual(1, len(res))
+        self.assertEqual(res[0].key, ti3.key)
+
     def test_find_executable_task_instances_task_concurrency(self):
         dag_id = 'SchedulerJobTest.test_find_executable_task_instances_task_concurrency'
         task_id_1 = 'dummy'


 




> Concurrency check in scheduler should count queued tasks as well as running
> ---
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we 
> cannot count them. This is because there is no guarantee that queued tasks in 
> failed dagruns will or will not eventually run and queued tasks that will 
> never run will consume slots and can stall a DAG. Once we can guarantee that 
> all queued tasks in failed dagruns will never run (e.g. make sure that all 

[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564331#comment-16564331
 ] 

ASF subversion and git services commented on AIRFLOW-1104:
--

Commit ed972042a864cd010137190e0bbb1d25a9dcfe83 in incubator-airflow's branch 
refs/heads/master from Dan Fowler
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=ed97204 ]

[AIRFLOW-1104] Update jobs.py so Airflow does not over schedule tasks (#3568)

This change will prevent tasks from getting scheduled and queued over
the concurrency limits set for the dag
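
The effect of this change can be illustrated with a small, self-contained sketch. It is not the scheduler's code: `open_slots` and the string state constants are hypothetical, and the only idea carried over from the commit is that QUEUED tasks now count toward the DAG's concurrency limit alongside RUNNING ones.

```python
from collections import Counter

# Hypothetical state labels; the real scheduler uses airflow.utils.state.State.
RUNNING, QUEUED, SCHEDULED = "running", "queued", "scheduled"


def open_slots(task_states, concurrency, states_to_count):
    """Slots left for a DAG once tasks in states_to_count are subtracted."""
    counts = Counter(task_states)
    occupied = sum(counts[state] for state in states_to_count)
    return max(concurrency - occupied, 0)


# One running task, two queued tasks, three waiting; the DAG limit is 3.
states = [RUNNING, QUEUED, QUEUED, SCHEDULED, SCHEDULED, SCHEDULED]

before = open_slots(states, 3, [RUNNING])          # queued tasks ignored
after = open_slots(states, 3, [RUNNING, QUEUED])   # queued tasks counted
```

Counting only RUNNING leaves two apparent slots even though three tasks are already running or queued, so two more would be queued beyond the limit; counting QUEUED as well leaves none.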

> Concurrency check in scheduler should count queued tasks as well as running
> ---
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we 
> cannot count them. This is because there is no guarantee that queued tasks in 
> failed dagruns will or will not eventually run and queued tasks that will 
> never run will consume slots and can stall a DAG. Once we can guarantee that 
> all queued tasks in failed dagruns will never run (e.g. make sure that all 
> running/newly queued TIs have running dagruns), then we can include QUEUED 
> tasks here, with the constraint that they are in running dagruns."
>Reporter: Alex Guziel
>Priority: Minor
>






[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564327#comment-16564327
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

dan-sf commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does 
not over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409355510
 
 
   Sure, the changes have been rebased on master




> Concurrency check in scheduler should count queued tasks as well as running
> ---
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
>  Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we 
> cannot count them. This is because there is no guarantee that queued tasks in 
> failed dagruns will or will not eventually run and queued tasks that will 
> never run will consume slots and can stall a DAG. Once we can guarantee that 
> all queued tasks in failed dagruns will never run (e.g. make sure that all 
> running/newly queued TIs have running dagruns), then we can include QUEUED 
> tasks here, with the constraint that they are in running dagruns."
>Reporter: Alex Guziel
>Priority: Minor
>






[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564316#comment-16564316
 ] 

ASF GitHub Bot commented on AIRFLOW-2814:
-

kaxil commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default config
URL: 
https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409351337
 
 
   Agreed with everyone. Do you guys think we should decrease the time duration 
for `dag_dir_list_interval` as well?




> Default Arg "file_process_interval" for class SchedulerJob is inconsistent 
> with doc
> ---
>
> Key: AIRFLOW-2814
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2814
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
> Fix For: 2.0.0
>
>
> h2. Background
> In 
> [https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592]
>  , it is stated that the default value of the argument *file_process_interval* 
> should be 3 minutes (*file_process_interval:* Parse and schedule each file no 
> faster than this interval).
> The value is normally read from the default configuration. However, in the 
> default config template, its value is 0 rather than 180 seconds 
> ([https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432]
>  ). 
> h2. Issue
> This means that each file is actually parsed and scheduled without 
> letting Airflow "rest". This conflicts with the design intent (a default 
> of 180 seconds) and may affect performance significantly.
> h2. My Proposal
> Change the value in the config template from 0 to 180.
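
To make the trade-off between 0 and 180 concrete, here is a minimal sketch of how a "parse each file no faster than this interval" rule behaves. It is an illustration only, not the scheduler's actual implementation; the function name `files_due` and the dictionary of finish times are assumptions.

```python
def files_due(last_finish, now, min_interval):
    """Return the DAG files that may be parsed again.

    last_finish: dict mapping file path -> time (seconds) its last parse finished.
    now: current time in seconds.
    min_interval: minimum number of seconds between two parses of one file.
    """
    return sorted(
        path
        for path, finished_at in last_finish.items()
        if now - finished_at >= min_interval
    )


last_finish = {"a.py": 100.0, "b.py": 250.0}

# With an interval of 0, every file is eligible on every loop.
print(files_due(last_finish, now=300.0, min_interval=0))

# With an interval of 180, only files idle for at least 3 minutes are eligible.
print(files_due(last_finish, now=300.0, min_interval=180))
```

With 0 the scheduler re-parses as fast as it can loop, which picks up DAG changes quickly at the cost of CPU; with 180 a file waits at least three minutes between parses.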





[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564313#comment-16564313
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

kaxil commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does not 
over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409350840
 
 
   Can you squash your commits as well?




[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564312#comment-16564312
 ] 

ASF GitHub Bot commented on AIRFLOW-2814:
-

feng-tao commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default 
config
URL: 
https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409350792
 
 
   +1 on keeping 0. 180 seconds is surely too high...




[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564311#comment-16564311
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

dan-sf commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does 
not over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409350564
 
 
   @kaxil Conflicts have been updated




[jira] [Commented] (AIRFLOW-1104) Concurrency check in scheduler should count queued tasks as well as running

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564274#comment-16564274
 ] 

ASF GitHub Bot commented on AIRFLOW-1104:
-

kaxil commented on issue #3568: AIRFLOW-1104 Update jobs.py so Airflow does not 
over schedule tasks
URL: 
https://github.com/apache/incubator-airflow/pull/3568#issuecomment-409343719
 
 
   @dan-sf Can you please resolve the conflicts?
   




[jira] [Created] (AIRFLOW-2832) Inconsistencies and linter errors across markdown files

2018-07-31 Thread Taylor Edmiston (JIRA)
Taylor Edmiston created AIRFLOW-2832:


 Summary: Inconsistencies and linter errors across markdown files
 Key: AIRFLOW-2832
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2832
 Project: Apache Airflow
  Issue Type: Improvement
  Components: docs, Documentation
Reporter: Taylor Edmiston
Assignee: Taylor Edmiston


There are a number of inconsistencies within and across markdown files in the 
Airflow project.  Most of these are simple formatting issues easily fixed by 
linting (e.g., with mdl).





[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564270#comment-16564270
 ] 

ASF GitHub Bot commented on AIRFLOW-2814:
-

kaxil commented on issue #3669: Revert [AIRFLOW-2814] - Change 
`min_file_process_interval` to 0
URL: 
https://github.com/apache/incubator-airflow/pull/3669#issuecomment-409342022
 
 
   @Fokko PTAL. Also, shouldn't we be reducing `dag_dir_list_interval` as well? 
It is 5 mins by default.




[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564269#comment-16564269
 ] 

ASF GitHub Bot commented on AIRFLOW-2814:
-

kaxil opened a new pull request #3669: Revert [AIRFLOW-2814] - Change 
`min_file_process_interval` to 0
URL: https://github.com/apache/incubator-airflow/pull/3669
 
 
   Make sure you have checked _all_ steps below.
   
   ### JIRA
   - [x] My PR addresses the following [Airflow 
JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references 
them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
   - https://issues.apache.org/jira/browse/AIRFLOW-XXX
   - In case you are fixing a typo in the documentation you can prepend 
your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue.
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots of any UI 
changes:
   
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   
   ### Documentation
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
   - When adding new operators/hooks/sensors, the autoclass documentation 
generation needs to be added.
   
   
   ### Code Quality
   - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
   




[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564262#comment-16564262
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

Fokko commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon 
SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206654107
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py
 ##
 @@ -0,0 +1,98 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.models import BaseOperator
+from airflow.utils import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerCreateTrainingJobOperator(BaseOperator):
+
+"""
+   Initiate a SageMaker training
+
+   This operator returns The ARN of the model created in Amazon SageMaker
+
+   :param training_job_config:
+   The configuration necessary to start a training job (templated)
+   :type training_job_config: dict
+   :param region_name: The AWS region_name
+   :type region_name: string
+   :param sagemaker_conn_id: The SageMaker connection ID to use.
+   :type aws_conn_id: string
 
 Review comment:
   Should be `sagemaker_conn_id`
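
A mismatch like this one (a `:type` line naming a different parameter than the preceding `:param` line) can be caught mechanically. The following is a rough sketch, assuming Sphinx-style `:param name:` / `:type name:` fields; the helper name `docstring_field_mismatches` is hypothetical and not part of Airflow.

```python
import re


def docstring_field_mismatches(doc):
    """Compare :param and :type field names in a Sphinx-style docstring.

    Returns a tuple (params_without_type, types_without_param),
    each in order of first appearance.
    """
    params = re.findall(r":param\s+(\w+):", doc)
    types = re.findall(r":type\s+(\w+):", doc)
    params_without_type = [p for p in params if p not in types]
    types_without_param = []
    for t in types:
        if t not in params and t not in types_without_param:
            types_without_param.append(t)
    return params_without_type, types_without_param


DOC = """
:param training_job_config:
The configuration necessary to start a training job (templated)
:type training_job_config: dict
:param region_name: The AWS region_name
:type region_name: string
:param sagemaker_conn_id: The SageMaker connection ID to use.
:type aws_conn_id: string
"""

print(docstring_field_mismatches(DOC))
```

For the docstring excerpt under review, the sketch reports `sagemaker_conn_id` as lacking a type and `aws_conn_id` as a type with no matching parameter.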




> Airflow integration with AWS Sagemaker
> --
>
> Key: AIRFLOW-2524
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2524
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: aws, contrib
>Reporter: Rajeev Srinivasan
>Assignee: Yang Yu
>Priority: Major
>  Labels: AWS
>
> Would it be possible to orchestrate an end-to-end AWS SageMaker job using 
> Airflow?





[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564264#comment-16564264
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

Fokko commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon 
SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206655197
 
 

 ##
 File path: tests/contrib/hooks/test_sagemaker_hook.py
 ##
 @@ -0,0 +1,341 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+
+import json
+import unittest
+import copy
+try:
+from unittest import mock
+except ImportError:
+try:
+import mock
+except ImportError:
+mock = None
+
+from airflow import configuration
+from airflow import models
+from airflow.utils import db
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.hooks.S3_hook import S3Hook
+from airflow.exceptions import AirflowException
+
+
+role = 'test-role'
+
+bucket = 'test-bucket'
+
+key = 'test/data'
+data_url = 's3://{}/{}'.format(bucket, key)
+
+job_name = 'test-job-name'
+
+image = 'test-image'
+
+test_arn_return = {'TrainingJobArn': 'testarn'}
+
+test_list_training_job_return = {
+'TrainingJobSummaries': [
+{
+'TrainingJobName': job_name,
+'TrainingJobStatus': 'InProgress'
+},
+],
+'NextToken': 'test-token'
+}
+
+test_list_tuning_job_return = {
+'TrainingJobSummaries': [
+{
+'TrainingJobName': job_name,
+'TrainingJobArn': 'testarn',
+'TunedHyperParameters': {
+'k': '3'
+},
+'TrainingJobStatus': 'InProgress'
+},
+],
+'NextToken': 'test-token'
+}
+
+output_url = 's3://{}/test/output'.format(bucket)
+create_training_params = \
+{
+'AlgorithmSpecification': {
+'TrainingImage': image,
+'TrainingInputMode': 'File'
+},
+'RoleArn': role,
+'OutputDataConfig': {
+'S3OutputPath': output_url
+},
+'ResourceConfig': {
+'InstanceCount': 2,
+'InstanceType': 'ml.c4.8xlarge',
+'VolumeSizeInGB': 50
+},
+'TrainingJobName': job_name,
+'HyperParameters': {
+'k': '10',
+'feature_dim': '784',
+'mini_batch_size': '500',
+'force_dense': 'True'
+},
+'StoppingCondition': {
+'MaxRuntimeInSeconds': 60 * 60
+},
+'InputDataConfig': [
+{
+'ChannelName': 'train',
+'DataSource': {
+'S3DataSource': {
+'S3DataType': 'S3Prefix',
+'S3Uri': data_url,
+'S3DataDistributionType': 'FullyReplicated'
+}
+},
+'CompressionType': 'None',
+'RecordWrapperType': 'None'
+}
+]
+}
+
+create_tuning_params = {'HyperParameterTuningJobName': job_name,
+'HyperParameterTuningJobConfig': {
+'Strategy': 'Bayesian',
+'HyperParameterTuningJobObjective': {
+'Type': 'Maximize',
+'MetricName': 'test_metric'
+},
+'ResourceLimits': {
+'MaxNumberOfTrainingJobs': 123,
+'MaxParallelTrainingJobs': 123
+},
+'ParameterRanges': {
+'IntegerParameterRanges': [
+{
+'Name': 'k',
+'MinValue': '2',
+'MaxValue': '10'
+},
+]
+}
+},
+'TrainingJobDefinition': {
+

[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564265#comment-16564265
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

Fokko commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon 
SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206654353
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py
 ##
 @@ -0,0 +1,98 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.models import BaseOperator
+from airflow.utils import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerCreateTrainingJobOperator(BaseOperator):
+
+"""
+   Initiate a SageMaker training
+
+   This operator returns The ARN of the model created in Amazon SageMaker
+
+   :param training_job_config:
+   The configuration necessary to start a training job (templated)
+   :type training_job_config: dict
+   :param region_name: The AWS region_name
+   :type region_name: string
+   :param sagemaker_conn_id: The SageMaker connection ID to use.
+   :type aws_conn_id: string
+   :param use_db_config: Whether or not to use db config
+   associated with sagemaker_conn_id.
 
 Review comment:
   Missing `:type use_db_config: bool`




[jira] [Commented] (AIRFLOW-2524) Airflow integration with AWS Sagemaker

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564263#comment-16564263
 ] 

ASF GitHub Bot commented on AIRFLOW-2524:
-

Fokko commented on a change in pull request #3658: [AIRFLOW-2524] Add Amazon 
SageMaker Training
URL: https://github.com/apache/incubator-airflow/pull/3658#discussion_r206654727
 
 

 ##
 File path: airflow/contrib/operators/sagemaker_create_training_job_operator.py
 ##
 @@ -0,0 +1,98 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.contrib.hooks.sagemaker_hook import SageMakerHook
+from airflow.models import BaseOperator
+from airflow.utils import apply_defaults
+from airflow.exceptions import AirflowException
+
+
+class SageMakerCreateTrainingJobOperator(BaseOperator):
+
+"""
+   Initiate a SageMaker training
+
+   This operator returns The ARN of the model created in Amazon SageMaker
+
+   :param training_job_config:
+   The configuration necessary to start a training job (templated)
+   :type training_job_config: dict
+   :param region_name: The AWS region_name
+   :type region_name: string
+   :param sagemaker_conn_id: The SageMaker connection ID to use.
+   :type aws_conn_id: string
+   :param use_db_config: Whether or not to use db config
+   associated with sagemaker_conn_id.
+   If set to true, will automatically update the training config
+   with what's in db, so the db config doesn't need to
+   included everything, but what's there does replace the ones
+   in the training_job_config, so be careful
+   :type use_db_config:
+   :param aws_conn_id: The AWS connection ID to use.
+   :type aws_conn_id: string
+
+   **Example**:
+   The following operator would start a training job when executed
+
+sagemaker_training =
+   SageMakerCreateTrainingJobOperator(
+   task_id='sagemaker_training',
+   training_job_config=config,
+   use_db_config=True,
+   region_name='us-west-2'
+   sagemaker_conn_id='sagemaker_customers_conn',
+   aws_conn_id='aws_customers_conn'
+   )
+   """
+
+template_fields = ['training_job_config']
+template_ext = ()
+ui_color = '#ededed'
+
+@apply_defaults
+def __init__(self,
+ sagemaker_conn_id=None,
 
 Review comment:
   Please make the order of the arguments congruent with the docstring, or the 
other way around
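
One way to check this kind of congruence automatically is to compare the signature order with the order of `:param` fields. This is only a sketch: it assumes the docstring lives on the function being inspected (in the operator above it lives on the class), and the helper name `param_order_matches` is hypothetical.

```python
import inspect
import re


def param_order_matches(func):
    """Return True if the signature order equals the :param order in the docstring.

    `self` and variadic *args / **kwargs parameters are ignored.
    """
    signature_names = [
        name
        for name, param in inspect.signature(func).parameters.items()
        if name != "self"
        and param.kind
        not in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    ]
    documented_names = re.findall(r":param\s+(\w+):", inspect.getdoc(func) or "")
    return signature_names == documented_names


def documented_in_order(training_job_config, region_name=None, *args, **kwargs):
    """
    :param training_job_config: job configuration
    :param region_name: AWS region
    """


def documented_out_of_order(region_name=None, training_job_config=None):
    """
    :param training_job_config: job configuration
    :param region_name: AWS region
    """


print(param_order_matches(documented_in_order))
print(param_order_matches(documented_out_of_order))
```

The first function passes the check and the second fails it, since its arguments are declared in the opposite order from its docstring.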




[jira] [Commented] (AIRFLOW-2670) SSHOperator's timeout parameter doesn't affect SSHook timeoot

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564247#comment-16564247
 ] 

ASF GitHub Bot commented on AIRFLOW-2670:
-

Fokko closed pull request #3666: [AIRFLOW-2670] Update SSH Operator's Hook to 
respect timeout
URL: https://github.com/apache/incubator-airflow/pull/3666
 
 
   

This PR was merged from a forked repository. Because GitHub hides the
original diff once such a PR is merged, the diff is reproduced below for
the sake of provenance:

diff --git a/airflow/contrib/operators/ssh_operator.py b/airflow/contrib/operators/ssh_operator.py
index 2e890f463e..747ad04ff0 100644
--- a/airflow/contrib/operators/ssh_operator.py
+++ b/airflow/contrib/operators/ssh_operator.py
@@ -69,16 +69,17 @@ def __init__(self,
     def execute(self, context):
         try:
             if self.ssh_conn_id and not self.ssh_hook:
-                self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id)
+                self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id,
+                                        timeout=self.timeout)
 
             if not self.ssh_hook:
-                raise AirflowException("can not operate without ssh_hook or ssh_conn_id")
+                raise AirflowException("Cannot operate without ssh_hook or ssh_conn_id.")
 
             if self.remote_host is not None:
                 self.ssh_hook.remote_host = self.remote_host
 
             if not self.command:
-                raise AirflowException("no command specified so nothing to execute here.")
+                raise AirflowException("SSH command not specified. Aborting.")
 
             with self.ssh_hook.get_conn() as ssh_client:
                 # Auto apply tty when its required in case of sudo
diff --git a/tests/contrib/operators/test_ssh_operator.py b/tests/contrib/operators/test_ssh_operator.py
index b97ba84a01..7ddd24b2ac 100644
--- a/tests/contrib/operators/test_ssh_operator.py
+++ b/tests/contrib/operators/test_ssh_operator.py
@@ -7,9 +7,9 @@
 # to you under the Apache License, Version 2.0 (the
 # "License"); you may not use this file except in compliance
 # with the License.  You may obtain a copy of the License at
-# 
+#
 #   http://www.apache.org/licenses/LICENSE-2.0
-# 
+#
 # Unless required by applicable law or agreed to in writing,
 # software distributed under the License is distributed on an
 # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -58,6 +58,23 @@ def setUp(self):
         self.hook = hook
         self.dag = dag
 
+    def test_hook_created_correctly(self):
+        TIMEOUT = 20
+        SSH_ID = "ssh_default"
+        task = SSHOperator(
+            task_id="test",
+            command="echo -n airflow",
+            dag=self.dag,
+            timeout=TIMEOUT,
+            ssh_conn_id="ssh_default"
+        )
+        self.assertIsNotNone(task)
+
+        task.execute(None)
+
+        self.assertEquals(TIMEOUT, task.ssh_hook.timeout)
+        self.assertEquals(SSH_ID, task.ssh_hook.ssh_conn_id)
+
     def test_json_command_execution(self):
         configuration.conf.set("core", "enable_xcom_pickling", "False")
         task = SSHOperator(


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> SSHOperator's timeout parameter doesn't affect SSHook timeoot
> -
>
> Key: AIRFLOW-2670
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2670
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: Airflow 2.0
>Reporter: jin zhang
>Priority: Major
>
> When I use SSHOperator, its timeout parameter is not passed on to SSHHook; 
> it only affects exec_command. 
> old version:
> self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id)
> I change it to :
> self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id, timeout=self.timeout)
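
The change described above can be sketched outside Airflow. This is a minimal
illustration under stated assumptions, not the operator's actual code:
SketchHook and SketchOperator are hypothetical stand-ins for SSHHook and
SSHOperator, resolve_hook is an invented helper name, and the default timeout
of 10 is chosen only for the example.

```python
class SketchHook(object):
    """Hypothetical stand-in for SSHHook; it only records its arguments."""

    def __init__(self, ssh_conn_id=None, timeout=10):
        self.ssh_conn_id = ssh_conn_id
        self.timeout = timeout


class SketchOperator(object):
    """Hypothetical stand-in for SSHOperator."""

    def __init__(self, ssh_conn_id=None, ssh_hook=None, timeout=10):
        self.ssh_conn_id = ssh_conn_id
        self.ssh_hook = ssh_hook
        self.timeout = timeout

    def resolve_hook(self):
        # Build the hook lazily and forward the operator's timeout to it,
        # so the connection and the command use the same setting.
        if self.ssh_conn_id and not self.ssh_hook:
            self.ssh_hook = SketchHook(ssh_conn_id=self.ssh_conn_id,
                                       timeout=self.timeout)
        if not self.ssh_hook:
            raise ValueError("Cannot operate without ssh_hook or ssh_conn_id.")
        return self.ssh_hook
```

With this shape, SketchOperator(ssh_conn_id="ssh_default", timeout=20)
produces a hook whose timeout is 20 instead of the hook's own default.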





[jira] [Commented] (AIRFLOW-2670) SSHOperator's timeout parameter doesn't affect SSHook timeoot

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564248#comment-16564248
 ] 

ASF subversion and git services commented on AIRFLOW-2670:
--

Commit 3b35d360f6ff8694b6fb4387901c182ca39160b5 in incubator-airflow's branch 
refs/heads/master from [~noremac201]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=3b35d36 ]

[AIRFLOW-2670] Update SSH Operator's Hook to respect timeout (#3666)



> SSHOperator's timeout parameter doesn't affect SSHook timeoot
> -
>
> Key: AIRFLOW-2670
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2670
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: Airflow 2.0
>Reporter: jin zhang
>Priority: Major
>
> When I use SSHOperator, its timeout parameter is not passed on to SSHHook; 
> it only affects exec_command. 
> old version:
> self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id)
> I change it to :
> self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id, timeout=self.timeout)





[jira] [Commented] (AIRFLOW-2670) SSHOperator's timeout parameter doesn't affect SSHook timeoot

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564246#comment-16564246
 ] 

ASF GitHub Bot commented on AIRFLOW-2670:
-

Fokko commented on issue #3666: [AIRFLOW-2670] Update SSH Operator's Hook to 
respect timeout
URL: 
https://github.com/apache/incubator-airflow/pull/3666#issuecomment-409338606
 
 
   Nice one @Noremac201 Thanks




> SSHOperator's timeout parameter doesn't affect SSHook timeoot
> -
>
> Key: AIRFLOW-2670
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2670
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Affects Versions: Airflow 2.0
>Reporter: jin zhang
>Priority: Major
>
> When I use SSHOperator, its timeout parameter is not passed on to SSHHook; 
> it only affects exec_command. 
> old version:
> self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id)
> I change it to :
> self.ssh_hook = SSHHook(ssh_conn_id=self.ssh_conn_id, timeout=self.timeout)





[jira] [Commented] (AIRFLOW-2795) Oracle to Oracle Transfer Operator

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564244#comment-16564244
 ] 

ASF GitHub Bot commented on AIRFLOW-2795:
-

Fokko closed pull request #3639: [AIRFLOW-2795] Oracle to Oracle Transfer 
Operator
URL: https://github.com/apache/incubator-airflow/pull/3639
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/oracle_to_oracle_transfer.py b/airflow/contrib/operators/oracle_to_oracle_transfer.py
new file mode 100644
index 00..31eb89b7dd
--- /dev/null
+++ b/airflow/contrib/operators/oracle_to_oracle_transfer.py
@@ -0,0 +1,90 @@
+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from airflow.hooks.oracle_hook import OracleHook
+from airflow.models import BaseOperator
+from airflow.utils.decorators import apply_defaults
+
+
+class OracleToOracleTransfer(BaseOperator):
+    """
+    Moves data from Oracle to Oracle.
+
+
+    :param oracle_destination_conn_id: destination Oracle connection.
+    :type oracle_destination_conn_id: str
+    :param destination_table: destination table to insert rows.
+    :type destination_table: str
+    :param oracle_source_conn_id: source Oracle connection.
+    :type oracle_source_conn_id: str
+    :param source_sql: SQL query to execute against the source Oracle
+        database. (templated)
+    :type source_sql: str
+    :param source_sql_params: Parameters to use in sql query. (templated)
+    :type source_sql_params: dict
+    :param rows_chunk: number of rows per chunk to commit.
+    :type rows_chunk: int
+    """
+
+    template_fields = ('source_sql', 'source_sql_params')
+    ui_color = '#e08c8c'
+
+    @apply_defaults
+    def __init__(
+            self,
+            oracle_destination_conn_id,
+            destination_table,
+            oracle_source_conn_id,
+            source_sql,
+            source_sql_params={},
+            rows_chunk=5000,
+            *args, **kwargs):
+        super(OracleToOracleTransfer, self).__init__(*args, **kwargs)
+        self.oracle_destination_conn_id = oracle_destination_conn_id
+        self.destination_table = destination_table
+        self.oracle_source_conn_id = oracle_source_conn_id
+        self.source_sql = source_sql
+        self.source_sql_params = source_sql_params
+        self.rows_chunk = rows_chunk
+
+    def _execute(self, src_hook, dest_hook, context):
+        with src_hook.get_conn() as src_conn:
+            cursor = src_conn.cursor()
+            self.log.info("Querying data from source: {0}".format(
+                self.oracle_source_conn_id))
+            cursor.execute(self.source_sql, self.source_sql_params)
+            target_fields = list(map(lambda field: field[0], cursor.description))
+
+            rows_total = 0
+            rows = cursor.fetchmany(self.rows_chunk)
+            while len(rows) > 0:
+                rows_total = rows_total + len(rows)
+                dest_hook.bulk_insert_rows(self.destination_table, rows,
+                                           target_fields=target_fields,
+                                           commit_every=self.rows_chunk)
+                rows = cursor.fetchmany(self.rows_chunk)
+                self.log.info("Total inserted: {0} rows".format(rows_total))
+
+            self.log.info("Finished data transfer.")
+            cursor.close()
+
+    def execute(self, context):
+        src_hook = OracleHook(oracle_conn_id=self.oracle_source_conn_id)
+        dest_hook = OracleHook(oracle_conn_id=self.oracle_destination_conn_id)
+        self._execute(src_hook, dest_hook, context)
diff --git a/docs/code.rst b/docs/code.rst
index 4f1b301711..f4f55b7b38 100644
--- a/docs/code.rst
+++ b/docs/code.rst
@@ -172,6 +172,7 @@ Operators
 .. autoclass:: airflow.contrib.operators.mongo_to_s3.MongoToS3Operator
 .. autoclass:: 
airflow.contrib.operators.mysql_to_gcs.MySqlToGoogleCloudStorag

[jira] [Commented] (AIRFLOW-2795) Oracle to Oracle Transfer Operator

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564245#comment-16564245
 ] 

ASF subversion and git services commented on AIRFLOW-2795:
--

Commit 9983466fd1f82faad7d74506fd428f2d007e3daf in incubator-airflow's branch 
refs/heads/master from [~marcus.r...@gmail.com]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=9983466 ]

[AIRFLOW-2795] Oracle to Oracle Transfer Operator (#3639)



> Oracle to Oracle Transfer Operator 
> ---
>
> Key: AIRFLOW-2795
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2795
> Project: Apache Airflow
>  Issue Type: New Feature
>  Components: operators
>Reporter: Marcus Rehm
>Assignee: Marcus Rehm
>Priority: Trivial
>
> This operator should help transfer data from one Oracle instance to 
> another, or between tables in the same instance. It's suitable for use cases 
> where you don't want to, or are not allowed to, use dblink.
> The operator needs a sql query and a destination table in order to work.
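
The chunked copy that this operator performs (query the source, then insert
rows_chunk rows at a time) can be tried without an Oracle instance. The sketch
below is an assumption-laden illustration rather than the operator itself: it
uses Python's built-in sqlite3 module in place of OracleHook, and
transfer_in_chunks is a hypothetical helper name. The table name is formatted
directly into the SQL, so it should only come from trusted input.

```python
import sqlite3


def transfer_in_chunks(src_conn, dest_conn, source_sql, destination_table,
                       rows_chunk=5000):
    """Copy the result of source_sql into destination_table chunk by chunk."""
    cursor = src_conn.cursor()
    cursor.execute(source_sql)
    # Column names of the result set become the insert's target fields.
    target_fields = [field[0] for field in cursor.description]
    insert_sql = "INSERT INTO {0} ({1}) VALUES ({2})".format(
        destination_table,
        ", ".join(target_fields),
        ", ".join("?" for _ in target_fields))

    rows_total = 0
    rows = cursor.fetchmany(rows_chunk)
    while rows:
        dest_conn.executemany(insert_sql, rows)
        dest_conn.commit()  # one commit per chunk
        rows_total += len(rows)
        rows = cursor.fetchmany(rows_chunk)
    cursor.close()
    return rows_total
```

Fetching with fetchmany keeps at most one chunk in memory, which is the main
reason to prefer it over fetchall for large source tables.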





[jira] [Commented] (AIRFLOW-2825) S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase ext in S3

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564241#comment-16564241
 ] 

ASF GitHub Bot commented on AIRFLOW-2825:
-

Fokko closed pull request #3665: [AIRFLOW-2825]Fix S3ToHiveTransfer bug due to 
case
URL: https://github.com/apache/incubator-airflow/pull/3665
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/operators/s3_to_hive_operator.py b/airflow/operators/s3_to_hive_operator.py
index 09eb8363c0..5faaf916b7 100644
--- a/airflow/operators/s3_to_hive_operator.py
+++ b/airflow/operators/s3_to_hive_operator.py
@@ -153,7 +153,7 @@ def execute(self, context):
 
         root, file_ext = os.path.splitext(s3_key_object.key)
         if (self.select_expression and self.input_compressed and
-                file_ext != '.gz'):
+                file_ext.lower() != '.gz'):
             raise AirflowException("GZIP is the only compression " +
                                    "format Amazon S3 Select supports")
 
diff --git a/tests/operators/s3_to_hive_operator.py 
b/tests/operators/s3_to_hive_operator.py
index 482e7fefc8..6ca6274a2c 100644
--- a/tests/operators/s3_to_hive_operator.py
+++ b/tests/operators/s3_to_hive_operator.py
@@ -89,6 +89,11 @@ def setUp(self):
mode="wb") as f_gz_h:
 self._set_fn(fn_gz, '.gz', True)
 f_gz_h.writelines([header, line1, line2])
+fn_gz_upper = self._get_fn('.txt', True) + ".GZ"
+with gzip.GzipFile(filename=fn_gz_upper,
+   mode="wb") as f_gz_upper_h:
+self._set_fn(fn_gz_upper, '.GZ', True)
+f_gz_upper_h.writelines([header, line1, line2])
 fn_bz2 = self._get_fn('.txt', True) + '.bz2'
 with bz2.BZ2File(filename=fn_bz2,
  mode="wb") as f_bz2_h:
@@ -105,6 +110,11 @@ def setUp(self):
mode="wb") as f_gz_nh:
 self._set_fn(fn_gz, '.gz', False)
 f_gz_nh.writelines([line1, line2])
+fn_gz_upper = self._get_fn('.txt', False) + ".GZ"
+with gzip.GzipFile(filename=fn_gz_upper,
+   mode="wb") as f_gz_upper_nh:
+self._set_fn(fn_gz_upper, '.GZ', False)
+f_gz_upper_nh.writelines([line1, line2])
 fn_bz2 = self._get_fn('.txt', False) + '.bz2'
 with bz2.BZ2File(filename=fn_bz2,
  mode="wb") as f_bz2_nh:
@@ -143,7 +153,7 @@ def _check_file_equality(self, fn_1, fn_2, ext):
 # gz files contain mtime and filename in the header that
 # causes filecmp to return False even if contents are identical
 # Hence decompress to test for equality
-if(ext == '.gz'):
+if(ext.lower() == '.gz'):
 with gzip.GzipFile(fn_1, 'rb') as f_1,\
  NamedTemporaryFile(mode='wb') as f_txt_1,\
  gzip.GzipFile(fn_2, 'rb') as f_2,\
@@ -220,14 +230,14 @@ def test_execute(self, mock_hiveclihook):
 conn.create_bucket(Bucket='bucket')
 
 # Testing txt, zip, bz2 files with and without header row
-for (ext, has_header) in product(['.txt', '.gz', '.bz2'], [True, False]):
+for (ext, has_header) in product(['.txt', '.gz', '.bz2', '.GZ'], [True, False]):
 self.kwargs['headers'] = has_header
 self.kwargs['check_headers'] = has_header
 logging.info("Testing {0} format {1} header".
  format(ext,
 ('with' if has_header else 'without'))
  )
-self.kwargs['input_compressed'] = ext != '.txt'
+self.kwargs['input_compressed'] = ext.lower() != '.txt'
 self.kwargs['s3_key'] = 's3://bucket/' + self.s3_key + ext
 ip_fn = self._get_fn(ext, self.kwargs['headers'])
 op_fn = self._get_fn(ext, False)
@@ -260,8 +270,8 @@ def test_execute_with_select_expression(self, mock_hiveclihook):
 # Only testing S3ToHiveTransfer calls S3Hook.select_key with
 # the right parameters and its execute method succeeds here,
 # since Moto doesn't support select_object_content as of 1.3.2.
-for (ext, has_header) in product(['.txt', '.gz'], [True, False]):
-input_compressed = ext != '.txt'
+for (ext, has_header) in product(['.txt', '.gz', '.GZ'], [True, False]):
+input_compressed = ext.lower() != '.txt'
 key = self.s3_key + ext
 
 self.kwargs['check_headers'] = False


 

--

[jira] [Commented] (AIRFLOW-2825) S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase ext in S3

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564242#comment-16564242
 ] 

ASF subversion and git services commented on AIRFLOW-2825:
--

Commit c7e54461c68c70e11b5cd47e9dee9d52f6ee357b in incubator-airflow's branch 
refs/heads/master from XD-DENG
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=c7e5446 ]

[AIRFLOW-2825]Fix S3ToHiveTransfer bug due to case

Because upper/lower case was not considered
in the file extension check, S3ToHiveTransfer
operator may mistakenly think a GZIP file with
uppercase ext ".GZ" is not a GZIP file and
raise exception.
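
The case-insensitive check from this commit can be illustrated in isolation.
This is a small sketch, not the operator's code: is_gzip_key is a hypothetical
helper name, and the sample keys used with it are made up.

```python
import os


def is_gzip_key(key):
    """Return True if the key's extension is .gz in any letter case."""
    _, file_ext = os.path.splitext(key)
    # Lowercase the extension so '.GZ' and '.Gz' match as well as '.gz'.
    return file_ext.lower() == '.gz'
```

Comparing file_ext directly against '.gz' would reject a key such as
'logs/part-0001.csv.GZ', which is the behaviour the issue reports.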


> S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase 
> ext in S3
> ---
>
> Key: AIRFLOW-2825
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2825
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
>
> Because upper/lower case was not considered in the extension check, 
> S3ToHiveTransfer operator may think a GZIP file with uppercase ext `.GZ` is 
> not a GZIP file and raise exception.





[jira] [Commented] (AIRFLOW-2825) S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase ext in S3

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564243#comment-16564243
 ] 

ASF subversion and git services commented on AIRFLOW-2825:
--

Commit 8d2f57cd104736f4a9b2b87182358a8c2e406c1a in incubator-airflow's branch 
refs/heads/master from [~Fokko]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=8d2f57c ]

Merge pull request #3665 from XD-DENG/patch-6

[AIRFLOW-2825] Fix S3ToHiveTransfer bug due to case

> S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase 
> ext in S3
> ---
>
> Key: AIRFLOW-2825
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2825
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
>
> Because upper/lower case was not considered in the extension check, 
> S3ToHiveTransfer operator may think a GZIP file with uppercase ext `.GZ` is 
> not a GZIP file and raise exception.





[jira] [Commented] (AIRFLOW-2825) S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase ext in S3

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564228#comment-16564228
 ] 

ASF GitHub Bot commented on AIRFLOW-2825:
-

Fokko commented on issue #3665: [AIRFLOW-2825]Fix S3ToHiveTransfer bug due to 
case
URL: 
https://github.com/apache/incubator-airflow/pull/3665#issuecomment-409335560
 
 
   LGTM, thanks @XD-DENG 




> S3ToHiveTransfer operator may not may able to handle GZIP file with uppercase 
> ext in S3
> ---
>
> Key: AIRFLOW-2825
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2825
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
>
> Because upper/lower case was not considered in the extension check, 
> S3ToHiveTransfer operator may think a GZIP file with uppercase ext `.GZ` is 
> not a GZIP file and raise exception.





[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564226#comment-16564226
 ] 

ASF GitHub Bot commented on AIRFLOW-2814:
-

Fokko commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default config
URL: 
https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409335193
 
 
   I would keep it at 0 by default. 3 minutes is definitely too high. 1 would 
also work for me as a compromise. Making changes to your DAG and not seeing them 
in the UI would feel awkward to me. 




> Default Arg "file_process_interval" for class SchedulerJob is inconsistent 
> with doc
> ---
>
> Key: AIRFLOW-2814
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2814
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
> Fix For: 2.0.0
>
>
> h2. Background
> In 
> [https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592]
>  , the documented default value of the argument *file_process_interval* 
> is 3 minutes (*file_process_interval:* Parse and schedule each file no 
> faster than this interval).
> The value is normally read from the default configuration. However, in the 
> default config_template, its value is 0 rather than 180 seconds 
> ([https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432]
>  ). 
> h2. Issue
> This means that each file is actually parsed and scheduled without 
> letting Airflow "rest". This conflicts with the design purpose (180 seconds 
> by default) and may affect performance significantly.
> h2. My Proposal
> Change the value in the config template from 0 to 180.
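
The effect of the two candidate defaults can be seen in a small throttle
check. This is only a sketch of the "no faster than this interval" rule
quoted above, not the scheduler's implementation: is_due and its parameters
are hypothetical names, and times are plain seconds.

```python
def is_due(last_finish_time, now, file_process_interval):
    """Return True if a file may be parsed again at time `now`."""
    if last_finish_time is None:
        # The file has never been processed, so it is always due.
        return True
    return (now - last_finish_time) >= file_process_interval
```

With an interval of 0 every file is due on every loop, so DAG changes show up
immediately at the cost of constant re-parsing; with 180 a file is skipped
until three minutes have passed since it was last processed.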





[jira] [Created] (AIRFLOW-2831) Logs not found through the UI when deployed on Kubernetes

2018-07-31 Thread Dustin Jenkins (JIRA)
Dustin Jenkins created AIRFLOW-2831:
---

 Summary: Logs not found through the UI when deployed on Kubernetes
 Key: AIRFLOW-2831
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2831
 Project: Apache Airflow
  Issue Type: Bug
  Components: logging
Affects Versions: 1.10.0
Reporter: Dustin Jenkins
 Attachments: Screen Shot 2018-07-31 at 11.19.34.png

Kubernetes 1.11 on OpenStack

Airflow 1.10.0rc2

Executor: KubernetesExecutor

Operator(s): Mix of KubernetesPodOperator and PythonOperator

When deploying Airflow tasks on Kubernetes, the logs are rarely accessible 
after a run, regardless of a successful or failed run (See attached screenshot).

If I use the Kubernetes command line client and shell into the Scheduler pod 
and onto the running Scheduler Container and view the logs directly I can see 
the output.

At first I thought it was just the KubernetesPodOperator, but I've tested with 
the PythonOperator and the DummyOperator as well with the same results.

I have the Web Server, Scheduler, and PostgreSQL instances all running in their 
own Pods.

 





[jira] [Commented] (AIRFLOW-2658) Add GKE specific Kubernetes Pod Operator

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564085#comment-16564085
 ] 

ASF GitHub Bot commented on AIRFLOW-2658:
-

fenglu-g commented on a change in pull request #3532: [AIRFLOW-2658] Add GCP 
specific k8s pod operator
URL: https://github.com/apache/incubator-airflow/pull/3532#discussion_r206629560
 
 

 ##
 File path: airflow/contrib/operators/gcp_container_operator.py
 ##
 @@ -170,3 +175,147 @@ def execute(self, context):
 hook = GKEClusterHook(self.project_id, self.location)
 create_op = hook.create_cluster(cluster=self.body)
 return create_op
+
+
+KUBE_CONFIG_ENV_VAR = "KUBECONFIG"
+G_APP_CRED = "GOOGLE_APPLICATION_CREDENTIALS"
+
+
+class GKEPodOperator(KubernetesPodOperator):
+template_fields = ('project_id', 'location',
+   'cluster_name') + KubernetesPodOperator.template_fields
+
+@apply_defaults
+def __init__(self,
+ project_id,
+ location,
+ cluster_name,
+ gcp_conn_id='google_cloud_default',
+ *args,
+ **kwargs):
+"""
+Executes a task in a Kubernetes pod in the specified Google Kubernetes
+Engine cluster
+
+This Operator assumes that the system has gcloud installed and either
+has working default application credentials or has configured a
+connection id with a service account.
+
+The **minimum** required to define a cluster to create are the 
variables
+``task_id``, ``project_id``, ``location``, ``cluster_name``, ``name``,
+``namespace``, and ``image``
+
+**Operator Creation**: ::
+
+operator = GKEPodOperator(task_id='pod_op',
+  project_id='my-project',
+  location='us-central1-a',
+  cluster_name='my-cluster-name',
+  name='task-name',
+  namespace='default',
+  image='perl')
+
+.. seealso::
+For more detail about application authentication have a look at 
the reference:
+
https://cloud.google.com/docs/authentication/production#providing_credentials_to_your_application
+
+:param project_id: The Google Developers Console project id
+:type project_id: str
+:param location: The name of the Google Kubernetes Engine zone in 
which the
+cluster resides, e.g. 'us-central1-a'
+:type location: str
+:param cluster_name: The name of the Google Kubernetes Engine cluster 
the pod
+should be spawned in
+:type cluster_name: str
+:param gcp_conn_id: The google cloud connection id to use. This allows 
for
+users to specify a service account.
+:type gcp_conn_id: str
+"""
+super(GKEPodOperator, self).__init__(*args, **kwargs)
+self.project_id = project_id
+self.location = location
+self.cluster_name = cluster_name
+self.gcp_conn_id = gcp_conn_id
+
+def execute(self, context):
+# Specifying a service account file allows the user to using non 
default
+# authentication for creating a Kubernetes Pod. This is done by 
setting the
+# environment variable `GOOGLE_APPLICATION_CREDENTIALS` that gcloud 
looks at.
+key_file = None
+
+# If gcp_conn_id is not specified gcloud will use the default
+# service account credentials.
+if self.gcp_conn_id:
+from airflow.hooks.base_hook import BaseHook
+# extras is a deserialized json object
+extras = BaseHook.get_connection(self.gcp_conn_id).extra_dejson
+# key_file only gets set if a json file is created from a JSON 
string in
+# the web ui, else none
+key_file = self._set_env_from_extras(extras=extras)
+
+# Write config to a temp file and set the environment variable to 
point to it.
+# This is to avoid race conditions of reading/writing a single file
+with tempfile.NamedTemporaryFile() as conf_file:
+os.environ[KUBE_CONFIG_ENV_VAR] = conf_file.name
+# Attempt to get/update credentials
+# We call gcloud directly instead of using google-cloud-python api
+# because there is no way to write kubernetes config to a file, 
which is
+# required by KubernetesPodOperator.
+# The gcloud command looks at the env variable `KUBECONFIG` for 
where to save
+# the kubernetes config file.
+subprocess.check_call(
+["gcloud", "container", "clusters", "get-credentials",
+ self.cluster_name,
+ "--zone", self.location

[jira] [Resolved] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator

2018-07-31 Thread Leo Gallucci (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leo Gallucci resolved AIRFLOW-2822.
---
Resolution: Fixed

> PendingDeprecationWarning Invalid arguments: 
> HipChatAPISendRoomNotificationOperator
> ---
>
> Key: AIRFLOW-2822
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2822
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib, operators
>Affects Versions: Airflow 2.0
>Reporter: Leo Gallucci
>Assignee: Leo Gallucci
>Priority: Trivial
>  Labels: easyfix
>
> Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) 
> gives:
> {code:python}
> airflow/models.py:2390: PendingDeprecationWarning:
> Invalid arguments were passed to HipChatAPISendRoomNotificationOperator.
> Support for passing such arguments will be dropped in Airflow 2.0.
> Invalid arguments were:
> *args: ()
> **kwargs: {'color': 'green'}
> category=PendingDeprecationWarning
> {code}
> I've fixed this in my fork:
> https://github.com/elgalu/apache-airflow/commit/83fc940f54e5d6531f66bff256f66765899dc055
> I will send a PR
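
One plausible way such a warning arises is that a subclass forwards a keyword
it does not declare, so the base class receives it as an unknown argument. The
sketch below illustrates that mechanism under stated assumptions: it does not
reproduce Airflow's BaseOperator or the fix in the linked commit, and
SketchBaseOperator, ForwardsColor and DeclaresColor are hypothetical classes.

```python
import warnings


class SketchBaseOperator(object):
    """Hypothetical stand-in for a base class that rejects unknown arguments."""

    def __init__(self, task_id, *args, **kwargs):
        if args or kwargs:
            # Anything left over here is unknown to the base class.
            warnings.warn(
                "Invalid arguments were passed to {0}. "
                "Invalid arguments were: *args: {1} **kwargs: {2}".format(
                    self.__class__.__name__, args, kwargs),
                category=PendingDeprecationWarning)
        self.task_id = task_id


class ForwardsColor(SketchBaseOperator):
    """Passes every extra keyword up to the base class, which then warns."""

    def __init__(self, message, *args, **kwargs):
        super(ForwardsColor, self).__init__(*args, **kwargs)
        self.message = message


class DeclaresColor(SketchBaseOperator):
    """Names `color` explicitly, so it never reaches the base class."""

    def __init__(self, message, color='yellow', *args, **kwargs):
        super(DeclaresColor, self).__init__(*args, **kwargs)
        self.message = message
        self.color = color
```

Python ignores PendingDeprecationWarning by default, so the difference between
the two classes is only visible when warnings are enabled, for example with
warnings.simplefilter("always").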





[jira] [Commented] (AIRFLOW-2310) Enable AWS Glue Job Integration

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564040#comment-16564040
 ] 

ASF GitHub Bot commented on AIRFLOW-2310:
-

suma-ps commented on issue #3504: [AIRFLOW-2310]: Add AWS Glue Job 
Compatibility to Airflow
URL: 
https://github.com/apache/incubator-airflow/pull/3504#issuecomment-409303864
 
 
   @OElesin  Do you plan to resolve the merge issues soon? Looking forward to 
using the Glue operator soon, thanks!




> Enable AWS Glue Job Integration
> ---
>
> Key: AIRFLOW-2310
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2310
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: contrib
>Reporter: Olalekan Elesin
>Assignee: Olalekan Elesin
>Priority: Major
>  Labels: AWS
>
> Would it be possible to integrate AWS Glue into Airflow, such that Glue jobs 
> and ETL pipelines can be orchestrated with Airflow





[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564022#comment-16564022
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

codecov-io edited a comment on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint 
issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-408503531
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=h1)
 Report
   > Merging 
[#3656](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/a338f3276835af45765d24a6e6d43ad4ba4d66ba?src=pr&el=desc)
 will **increase** coverage by `0.38%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3656/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=tree)
   
   ```diff
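   @@            Coverage Diff             @@
   ##           master    #3656      +/-   ##
   ==========================================
   + Coverage   77.12%   77.51%   +0.38%
   ==========================================
     Files         206      205       -1
     Lines       15772    15751      -21
   ==========================================
   + Hits        12164    12209      +45
   + Misses       3608     3542      -66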
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [airflow/www/app.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `99.01% <0%> (-0.99%)` | :arrow_down: |
   | [airflow/plugins\_manager.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9wbHVnaW5zX21hbmFnZXIucHk=) | `92.59% <0%> (ø)` | :arrow_up: |
   | [airflow/www/validators.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdmFsaWRhdG9ycy5weQ==) | `100% <0%> (ø)` | :arrow_up: |
   | [airflow/\_\_init\_\_.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9fX2luaXRfXy5weQ==) | `80.43% <0%> (ø)` | :arrow_up: |
   | [airflow/jobs.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9qb2JzLnB5) | `82.74% <0%> (ø)` | :arrow_up: |
   | [airflow/minihivecluster.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9taW5paGl2ZWNsdXN0ZXIucHk=) | | |
   | [airflow/utils/dag\_processing.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy91dGlscy9kYWdfcHJvY2Vzc2luZy5weQ==) | `89.87% <0%> (+0.42%)` | :arrow_up: |
   | [airflow/hooks/pig\_hook.py](https://codecov.io/gh/apache/incubator-airflow/pull/3656/diff?src=pr&el=tree#diff-YWlyZmxvdy9ob29rcy9waWdfaG9vay5weQ==) | `100% <0%> (+100%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=footer).
 Last update 
[a338f32...ecbc873](https://codecov.io/gh/apache/incubator-airflow/pull/3656?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




> Fix all ESLint issues
> -
>
> Key: AIRFLOW-2803
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2803
> Project: Apache Airflow
> Issue Type: Improvement
> Reporter: Verdan Mahmood
> Assignee: Taylor Edmiston
> Priority: Major
>
> Most of the JS code in Apache Airflow has linting issues, which were 
> highlighted by the integration of ESLint. 
> Once AIRFLOW-2783 is merged into the master branch, please fix all the 
> JavaScript styling issues that we have in .js and .html files. 





[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563965#comment-16563965
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

tedmiston commented on a change in pull request #3656: [WIP][AIRFLOW-2803] Fix 
all ESLint issues
URL: https://github.com/apache/incubator-airflow/pull/3656#discussion_r206602944
 
 

 ##
 File path: airflow/www_rbac/static/js/clock.js
 ##
 @@ -18,24 +18,25 @@
  */
 require('./jqClock.min');
 
-$(document).ready(function () {
-  x = new Date();
+$(document).ready(() => {
 
 Review comment:
   Sounds good.  I will stick with the ES5 for now for this PR.




[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563963#comment-16563963
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

tedmiston edited a comment on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint 
issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409266326
 
 
   @verdan Sure!  Typically I keep atomic commits while I'm working so everyone 
can follow small changes instead of one big diff, then squash down to one 
commit at the end.  I updated the title to make it clear this is WIP.  Since 
you're doing most of the reviewing here, do you have a preference on squashing 
throughout working vs just squashing pre-merge?
   
   I should have an update later today btw.




[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563923#comment-16563923
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

r39132 commented on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409282209
 
 
   @verdan once @tedmiston is done, please provide your +1 and notify some of 
the committers on this PR that the PR is ready for validation and merge. Thx 
for your help on reviewing this PR!




[jira] [Commented] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563903#comment-16563903
 ] 

ASF subversion and git services commented on AIRFLOW-2822:
--

Commit 3eb0454cb1da1e96ae5d7ad88db7c1cca71109f3 in incubator-airflow's branch 
refs/heads/master from [~elgalu]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=3eb0454 ]

[AIRFLOW-2822] Fix HipChat Deprecation Warning

Fixes PendingDeprecationWarning on HipChatAPISendRoomNotificationOperator

Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) 
gives:

airflow/models.py:2390: PendingDeprecationWarning:
Invalid arguments were passed to HipChatAPISendRoomNotificationOperator.
Support for passing such arguments will be dropped in Airflow 2.0.
Invalid arguments were:
*args: ()
**kwargs: {'color': 'green'}
category=PendingDeprecationWarning


> PendingDeprecationWarning Invalid arguments: 
> HipChatAPISendRoomNotificationOperator
> ---
>
> Key: AIRFLOW-2822
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2822
> Project: Apache Airflow
> Issue Type: Bug
> Components: contrib, operators
> Affects Versions: Airflow 2.0
> Reporter: Leo Gallucci
> Assignee: Leo Gallucci
> Priority: Trivial
> Labels: easyfix
>
> Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch (2.0) 
> gives:
> {code:python}
> airflow/models.py:2390: PendingDeprecationWarning:
> Invalid arguments were passed to HipChatAPISendRoomNotificationOperator.
> Support for passing such arguments will be dropped in Airflow 2.0.
> Invalid arguments were:
> *args: ()
> **kwargs: {'color': 'green'}
> category=PendingDeprecationWarning
> {code}
> I've fixed this in my fork:
> https://github.com/elgalu/apache-airflow/commit/83fc940f54e5d6531f66bff256f66765899dc055
> I will send a PR





[jira] [Commented] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563902#comment-16563902
 ] 

ASF GitHub Bot commented on AIRFLOW-2822:
-

r39132 closed pull request #3668: [AIRFLOW-2822] Fix HipChat Deprecation Warning
URL: https://github.com/apache/incubator-airflow/pull/3668
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/contrib/operators/hipchat_operator.py 
b/airflow/contrib/operators/hipchat_operator.py
index 381cd72cdf..adeca23079 100644
--- a/airflow/contrib/operators/hipchat_operator.py
+++ b/airflow/contrib/operators/hipchat_operator.py
@@ -99,24 +99,23 @@ class 
HipChatAPISendRoomNotificationOperator(HipChatAPIOperator):
 :param card: HipChat-defined card object
 :type card: dict
 """
-template_fields = ('token', 'room_id', 'message')
+template_fields = ('token', 'room_id', 'message', 'message_format',
+   'color', 'frm', 'attach_to', 'notify', 'card')
 ui_color = '#2980b9'
 
 @apply_defaults
-def __init__(self, room_id, message, *args, **kwargs):
+def __init__(self, room_id, message, message_format='html',
+ color='yellow', frm='airflow', attach_to=None,
+ notify=False, card=None, *args, **kwargs):
 super(HipChatAPISendRoomNotificationOperator, self).__init__(*args, 
**kwargs)
 self.room_id = room_id
 self.message = message
-default_options = {
-'message_format': 'html',
-'color': 'yellow',
-'frm': 'airflow',
-'attach_to': None,
-'notify': False,
-'card': None
-}
-for (prop, default) in default_options.items():
-setattr(self, prop, kwargs.get(prop, default))
+self.message_format = message_format
+self.color = color
+self.frm = frm
+self.attach_to = attach_to
+self.notify = notify
+self.card = card
 
 def prepare_request(self):
 params = {
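
The effect of the diff above can be illustrated with a minimal, self-contained sketch. It is not Airflow's actual code: `BaseOperator` here is a simplified stand-in that warns whenever it receives arguments it does not recognise, and `ImplicitOptions` / `ExplicitOptions` are hypothetical names for the before and after shapes of the operator. The assumption is that the warning comes from the option still sitting in `**kwargs` when the parent constructor is called.

```python
import warnings


class BaseOperator:
    """Simplified stand-in: warns about any argument it does not recognise."""

    def __init__(self, task_id, *args, **kwargs):
        if args or kwargs:
            warnings.warn(
                "Invalid arguments were passed to {}. *args: {} **kwargs: {}".format(
                    type(self).__name__, args, kwargs
                ),
                category=PendingDeprecationWarning,
            )
        self.task_id = task_id


class ImplicitOptions(BaseOperator):
    """Before: options are read out of kwargs but still forwarded to the parent."""

    def __init__(self, room_id, message, *args, **kwargs):
        super().__init__(*args, **kwargs)  # 'color' is still inside kwargs here
        self.room_id = room_id
        self.message = message
        self.color = kwargs.get("color", "yellow")


class ExplicitOptions(BaseOperator):
    """After: options are named parameters, so they never reach the parent."""

    def __init__(self, room_id, message, color="yellow", *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.room_id = room_id
        self.message = message
        self.color = color


def count_warnings(cls):
    """Instantiate cls with color='green' and count PendingDeprecationWarnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        cls(room_id="room", message="hello", color="green", task_id="notify")
    return sum(
        1 for w in caught if issubclass(w.category, PendingDeprecationWarning)
    )
```

Both variants end up with `color == 'green'`; only the first one leaks the option to the parent constructor and triggers the warning.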


 




[jira] [Commented] (AIRFLOW-2800) Remove airflow/ low-hanging linting errors

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563861#comment-16563861
 ] 

ASF GitHub Bot commented on AIRFLOW-2800:
-

r39132 closed pull request #3638: [AIRFLOW-2800] Remove low-hanging linting 
errors
URL: https://github.com/apache/incubator-airflow/pull/3638
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/airflow/__init__.py b/airflow/__init__.py
index f40b08aab5..bc6a7bbe19 100644
--- a/airflow/__init__.py
+++ b/airflow/__init__.py
@@ -7,9 +7,9 @@
 # to you under the Apache License, Version 2.0 (the
 # "License"); you may not use this file except in compliance
 # with the License.  You may obtain a copy of the License at
-# 
+#
 #   http://www.apache.org/licenses/LICENSE-2.0
-# 
+#
 # Unless required by applicable law or agreed to in writing,
 # software distributed under the License is distributed on an
 # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -80,11 +80,12 @@ class AirflowMacroPlugin(object):
 def __init__(self, namespace):
 self.namespace = namespace
 
-from airflow import operators
+
+from airflow import operators  # noqa: E402
 from airflow import sensors  # noqa: E402
-from airflow import hooks
-from airflow import executors
-from airflow import macros
+from airflow import hooks  # noqa: E402
+from airflow import executors  # noqa: E402
+from airflow import macros  # noqa: E402
 
 operators._integrate_plugins()
 sensors._integrate_plugins()  # noqa: E402
diff --git a/airflow/contrib/auth/backends/ldap_auth.py 
b/airflow/contrib/auth/backends/ldap_auth.py
index eefaa1263b..516e121c9b 100644
--- a/airflow/contrib/auth/backends/ldap_auth.py
+++ b/airflow/contrib/auth/backends/ldap_auth.py
@@ -62,7 +62,7 @@ def get_ldap_connection(dn=None, password=None):
 cacert = configuration.conf.get("ldap", "cacert")
 tls_configuration = Tls(validate=ssl.CERT_REQUIRED, 
ca_certs_file=cacert)
 use_ssl = True
-except:
+except Exception:
 pass
 
 server = Server(configuration.conf.get("ldap", "uri"), use_ssl, 
tls_configuration)
@@ -94,7 +94,7 @@ def groups_user(conn, search_base, user_filter, 
user_name_att, username):
 search_filter = "(&({0})({1}={2}))".format(user_filter, user_name_att, 
username)
 try:
 memberof_attr = configuration.conf.get("ldap", "group_member_attr")
-except:
+except Exception:
 memberof_attr = "memberOf"
 res = conn.search(native(search_base), native(search_filter),
   attributes=[native(memberof_attr)])
diff --git a/airflow/contrib/hooks/aws_hook.py 
b/airflow/contrib/hooks/aws_hook.py
index 69a1b0bed3..8ca1f3d744 100644
--- a/airflow/contrib/hooks/aws_hook.py
+++ b/airflow/contrib/hooks/aws_hook.py
@@ -72,7 +72,7 @@ def _parse_s3_config(config_file_name, config_format='boto', 
profile=None):
 try:
 access_key = config.get(cred_section, key_id_option)
 secret_key = config.get(cred_section, secret_key_option)
-except:
+except Exception:
 logging.warning("Option Error in parsing s3 config file")
 raise
 return access_key, secret_key
diff --git a/airflow/contrib/operators/awsbatch_operator.py 
b/airflow/contrib/operators/awsbatch_operator.py
index a5c86afce6..353fbbb0a0 100644
--- a/airflow/contrib/operators/awsbatch_operator.py
+++ b/airflow/contrib/operators/awsbatch_operator.py
@@ -139,7 +139,7 @@ def _wait_for_task_ended(self):
 if response['jobs'][-1]['status'] in ['SUCCEEDED', 'FAILED']:
 retry = False
 
-sleep( 1 + pow(retries * 0.1, 2))
+sleep(1 + pow(retries * 0.1, 2))
 retries += 1
 
 def _check_success_task(self):
diff --git a/airflow/contrib/operators/mlengine_prediction_summary.py 
b/airflow/contrib/operators/mlengine_prediction_summary.py
index 17fc2c0903..4efe81e641 100644
--- a/airflow/contrib/operators/mlengine_prediction_summary.py
+++ b/airflow/contrib/operators/mlengine_prediction_summary.py
@@ -112,14 +112,14 @@ def decode(self, x):
 @beam.ptransform_fn
 def MakeSummary(pcoll, metric_fn, metric_keys):  # pylint: disable=invalid-name
 return (
-pcoll
-| "ApplyMetricFnPerInstance" >> beam.Map(metric_fn)
-| "PairWith1" >> beam.Map(lambda tup: tup + (1,))
-| "SumTuple" >> beam.CombineGlobally(beam.combiners.TupleCombineFn(
-*([sum] * (len(metric_keys) + 1
-| "AverageAndMakeDict" >> beam.Map(
+pcoll |
+"ApplyMetricFnPerInstance" >> beam.Map(metric_fn) |
+"PairWith1" >> beam.Map(lambda tup: tup + (1,)) |
+"SumTupl

[jira] [Commented] (AIRFLOW-2800) Remove airflow/ low-hanging linting errors

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563863#comment-16563863
 ] 

ASF subversion and git services commented on AIRFLOW-2800:
--

Commit 06584fc4b1d82a2dbba98e484d0b4515a169a818 in incubator-airflow's branch 
refs/heads/master from [~ajc]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=06584fc ]

[AIRFLOW-2800] Remove low-hanging linting errors


> Remove airflow/ low-hanging linting errors
> --
>
> Key: AIRFLOW-2800
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2800
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Andy Cooper
> Assignee: Andy Cooper
> Priority: Major
>
> Removing low hanging linting errors from airflow directory
> Focuses on
>  * E226
>  * W291
> as well as *some* E501 (line too long) where it did not risk reducing 
> readability





[jira] [Commented] (AIRFLOW-2800) Remove airflow/ low-hanging linting errors

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563857#comment-16563857
 ] 

ASF GitHub Bot commented on AIRFLOW-2800:
-

r39132 commented on issue #3638: [AIRFLOW-2800] Remove low-hanging linting 
errors
URL: 
https://github.com/apache/incubator-airflow/pull/3638#issuecomment-409269190
 
 
   Cool. Running `flake8 airflow | wc -l` on master and this PR branch, I see a 
decrease from `458` down to `235`!
   
   Thanks for making these changes.
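
   For anyone who wants a per-code breakdown rather than a single line count, here is a rough sketch. It assumes flake8's default `path:row:col: CODE message` output format; the file names and sample lines are made up for illustration and are not taken from this PR.

```python
from collections import Counter


def tally_codes(flake8_output):
    """Count flake8 findings per error code from default-format output."""
    counts = Counter()
    for line in flake8_output.splitlines():
        # Default format: path:row:col: CODE message
        parts = line.split(":", 3)
        if len(parts) < 4:
            continue  # not a finding line
        fields = parts[3].split()
        if fields:
            counts[fields[0]] += 1
    return counts


def percent_reduction(before, after):
    """Relative decrease in findings, as a percentage rounded to one decimal."""
    return round(100.0 * (before - after) / before, 1)


# Hypothetical sample output, for illustration only.
SAMPLE = """\
airflow/example_a.py:10:1: E402 module level import not at top of file
airflow/example_a.py:42:80: E501 line too long (95 > 79 characters)
airflow/example_b.py:7:5: E722 do not use bare 'except'
airflow/example_b.py:9:14: W291 trailing whitespace
airflow/example_b.py:30:80: E501 line too long (88 > 79 characters)
"""
```

   With the numbers quoted above, `percent_reduction(458, 235)` works out to a drop of roughly 48.7%.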




[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563849#comment-16563849
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

ashb commented on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409266779
 
 
   FWIW I too am in favour of atomic/fixup! commits that then get squashed pre 
merge.




[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563847#comment-16563847
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

tedmiston commented on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409266326
 
 
   @verdan Sure!  Typically I keep atomic commits while I'm working so everyone 
can follow small changes instead of one big diff, then squash down to one 
commit at the end.  I updated the title to make it clear this is WIP.  Since 
you're doing most of the reviewing here, do you have a preference on squashing 
throughout working or just thinking about preparing for merge?
   
   I should have an update later today btw.




[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563848#comment-16563848
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

tedmiston edited a comment on issue #3656: [WIP][AIRFLOW-2803] Fix all ESLint 
issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409266326
 
 
   @verdan Sure!  Typically I keep atomic commits while I'm working so everyone 
can follow small changes instead of one big diff, then squash down to one 
commit at the end.  I updated the title to make it clear this is WIP.  Since 
you're doing most of the reviewing here, do you have a preference on squashing 
throughout working vs just thinking about preparing for the merge with 
squashing at the end?
   
   I should have an update later today btw.




[jira] [Issue Comment Deleted] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator

2018-07-31 Thread Leo Gallucci (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leo Gallucci updated AIRFLOW-2822:
--
Comment: was deleted

(was: https://github.com/apache/incubator-airflow/pull/3668)



[jira] [Commented] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator

2018-07-31 Thread Leo Gallucci (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563777#comment-16563777
 ] 

Leo Gallucci commented on AIRFLOW-2822:
---

https://github.com/apache/incubator-airflow/pull/3668



[jira] [Commented] (AIRFLOW-2822) PendingDeprecationWarning Invalid arguments: HipChatAPISendRoomNotificationOperator

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563767#comment-16563767
 ] 

ASF GitHub Bot commented on AIRFLOW-2822:
-

elgalu opened a new pull request #3668: [AIRFLOW-2822] Fix HipChat Deprecation 
Warning
URL: https://github.com/apache/incubator-airflow/pull/3668
 
 
   [AIRFLOW-2822](https://issues.apache.org/jira/browse/AIRFLOW-2822) Fixes 
PendingDeprecationWarning on HipChatAPISendRoomNotificationOperator
   
   Using `HipChatAPISendRoomNotificationOperator` on Airflow master branch 
(2.0) gives:
   
   ```python
   airflow/models.py:2390: PendingDeprecationWarning:
   Invalid arguments were passed to HipChatAPISendRoomNotificationOperator.
   Support for passing such arguments will be dropped in Airflow 2.0.
   Invalid arguments were:
   *args: ()
   **kwargs: {'color': 'green'}
   category=PendingDeprecationWarning
   ```




[jira] [Created] (AIRFLOW-2830) Worker subprocess crash results in tasks failing without retry

2018-07-31 Thread James Meickle (JIRA)
James Meickle created AIRFLOW-2830:
--

 Summary: Worker subprocess crash results in tasks failing without 
retry
 Key: AIRFLOW-2830
 URL: https://issues.apache.org/jira/browse/AIRFLOW-2830
 Project: Apache Airflow
  Issue Type: Bug
  Components: celery, scheduler, worker
Affects Versions: 1.9.1
Reporter: James Meickle


We ran across this fixed bug in production: 
[https://github.com/apache/incubator-airflow/pull/3040]

Fair enough, it's fixed. However, that task had `retries=3` which never kicked 
in - that's a bug in its own right!

I do see this in the documentation:
{quote}Zombies & Undeads
Task instances die all the time, usually as part of their normal life cycle, 
but sometimes unexpectedly.

Zombie tasks are characterized by the absence of an heartbeat (emitted by the 
job periodically) and a running status in the database.
{quote}
I was not on call at the time so I don't have a full log of what happened with 
the task states. However, I am wondering if what happened looked something like 
this:
 * Scheduler detects that process needs to run
 * Scheduler changes state to "queued"
 * Scheduler adds to Celery queue
 * Worker pulls message off queue
 * Worker starts subprocess
 * Worker subprocess dies due to a bug when trying to load logging config, 
before changing task state to running
 * Worker never tries to actually run task, so it never sets task to 
"up_for_retry"
 * Message no longer exists in queue so worker won't grab task again
 * Scheduler never retries because the task wasn't "up_for_retry"
 * Scheduler never checks heartbeat because it's "queued", not "running"
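
To make the suspected sequence concrete, here is a toy model of it. This is only a sketch of the hypothesis above, not Airflow's scheduler code: the function names, the three recovery paths, and the flags are all assumptions made for illustration.

```python
def recovery_path(state, has_heartbeat, message_in_queue):
    """Return which recovery path applies, or None if the task is stranded."""
    if state == "up_for_retry":
        return "retry"
    if state == "running" and not has_heartbeat:
        return "zombie_cleanup"
    if state == "queued" and message_in_queue:
        return "worker_pickup"
    return None


def simulate(crash_before_running):
    """Walk through the steps listed above and report how the task recovers."""
    state = "queued"            # scheduler marks the task as queued
    message_in_queue = True     # scheduler adds a message to the Celery queue
    message_in_queue = False    # worker pulls the message off the queue
    if not crash_before_running:
        state = "running"       # subprocess got far enough to update the state
    # The subprocess dies at this point, so no heartbeat is emitted.
    return recovery_path(state, has_heartbeat=False,
                         message_in_queue=message_in_queue)
```

In this model `simulate(crash_before_running=False)` is caught by the heartbeat check, while `simulate(crash_before_running=True)` returns `None`: the task is "queued", has no queue message, and matches none of the recovery paths.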

In general it's been disappointing to see so many ugly race conditions in 
Airflow. I'd love to see an Airflow enhancement proposal for converting the 
codebase to use a reliable state machine and better distributed system 
primitives.





[jira] [Commented] (AIRFLOW-2824) Disable loading of default connections via airflow config

2018-07-31 Thread Felix Uellendall (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563690#comment-16563690
 ] 

Felix Uellendall commented on AIRFLOW-2824:
---

Don't know, it looks strange to me that it is called upgradedb when it does 
what you say.

I guess I will use it for now, but I personally do not like that it is 
called upgradedb when it is actually an init without "examples".

A doc patch would not be a fix but an improvement :)

> Disable loading of default connections via airflow config
> -
>
> Key: AIRFLOW-2824
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2824
> Project: Apache Airflow
>  Issue Type: Wish
> Reporter: Felix Uellendall
> Priority: Major
>
> I would love to have a variable I can set in the airflow.cfg, like the DAG 
> examples have, to not load the default connections.
> Either by using {{load_examples}} that is already 
> [there|https://github.com/apache/incubator-airflow/blob/dfa7b26ddaca80ee8fd9915ee9f6eac50fac77f6/airflow/config_templates/default_airflow.cfg#L128]
>  for loading dag examples or by a new one like {{load_default_connections}} 
> to check if the user wants to have it or not.
> The implementation of the default connections starts 
> [here|https://github.com/apache/incubator-airflow/blob/9e1d8ee837ea2c23e828d070b6a72a6331d98602/airflow/utils/db.py#L94]
> Let me know what you guys think of it, pls. :)
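
A rough sketch of what such a switch could look like, using only the standard library. `load_default_connections` is the option name proposed above and does not exist yet; placing it in the `[core]` section next to `load_examples` and defaulting it to `True` (so existing installs behave as before) are assumptions of this sketch.

```python
import configparser


def should_load_default_connections(cfg_text):
    """Read the proposed flag from airflow.cfg-style text, defaulting to True."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # fallback covers both a missing option and a missing [core] section
    return parser.getboolean("core", "load_default_connections", fallback=True)


# Hypothetical config snippets, for illustration only.
OPTED_OUT = """\
[core]
load_examples = False
load_default_connections = False
"""

NOT_SET = """\
[core]
load_examples = True
"""
```

The code that creates the default connections would then be wrapped in a check on this value, so `OPTED_OUT` skips them and `NOT_SET` keeps the current behaviour.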





[jira] [Commented] (AIRFLOW-2824) Disable loading of default connections via airflow config

2018-07-31 Thread Ash Berlin-Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563592#comment-16563592
 ] 

Ash Berlin-Taylor commented on AIRFLOW-2824:


It is not well documented, but the answer I use for this is to never run 
{{airflow initdb}} -- that is what creates the sample connections. Instead I 
only ever run {{airflow upgradedb}}, which will apply missing migrations but 
not create any "example" objects. {{upgradedb}} will work on an empty DB just 
fine.

Perhaps the fix for this is a doc patch?
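
For illustration only, the config switch requested in the issue could be sketched roughly as below. This is not Airflow's actual code: the option name {{load_default_connections}} comes from the issue text, while the helper names, the {{[core]}} section placement, and the callback are assumptions made for the example.

```python
import configparser


def should_load_default_connections(cfg_text):
    """Return True unless [core] load_default_connections is set to a false value.

    The option name is hypothetical (taken from the issue text). Defaulting to
    True keeps the current behaviour when the option is absent.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.getboolean("core", "load_default_connections", fallback=True)


def init_connections(cfg_text, create_default_connections):
    """Invoke the supplied callback only when the config allows it.

    create_default_connections stands in for whatever code creates the
    sample connections; it is a placeholder, not a real Airflow function.
    """
    if should_load_default_connections(cfg_text):
        create_default_connections()
        return True
    return False
```

With such a flag, a deployment that sets the option to False would skip the sample connections, and one that omits it would behave as before.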

> Disable loading of default connections via airflow config
> -
>
> Key: AIRFLOW-2824
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2824
> Project: Apache Airflow
>  Issue Type: Wish
>Reporter: Felix Uellendall
>Priority: Major
>
> I would love to have a variable I can set in the airflow.cfg, like the DAG 
> examples have, to not load the default connections.
> Either by using {{load_examples}} that is already 
> [there|https://github.com/apache/incubator-airflow/blob/dfa7b26ddaca80ee8fd9915ee9f6eac50fac77f6/airflow/config_templates/default_airflow.cfg#L128]
>  for loading dag examples or by a new one like {{load_default_connections}} 
> to check if the user wants to have it or not.
> The implementation of the default connections starts 
> [here|https://github.com/apache/incubator-airflow/blob/9e1d8ee837ea2c23e828d070b6a72a6331d98602/airflow/utils/db.py#L94]
> Let me know what you guys think of it, pls. :)





[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563402#comment-16563402
 ] 

ASF subversion and git services commented on AIRFLOW-2238:
--

Commit 6fdc79980b378222bb0706035bedfe5fcefb982d in incubator-airflow's branch 
refs/heads/master from [~ashb]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=6fdc799 ]

Merge pull request #3413 from ashb/pr-tool-git-config

[AIRFLOW-2238] Switch PR tool to push to Github

> Update dev/airflow-pr to work with gitub for merge targets
> --
>
> Key: AIRFLOW-2238
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2238
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: PR tool
>Reporter: Ash Berlin-Taylor
>Priority: Major
>
> We are planning on migrating to the Apache "GitBox" project which lets 
> committers work directly on github. This will mean we might not _need_ to use 
> the pr tool, but we should update it so that it merges and pushes back to 
> github, not the ASF repo.
> I think we need to do this before we ask the ASF infra team to migrate our 
> repo over.





[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563400#comment-16563400
 ] 

ASF subversion and git services commented on AIRFLOW-2238:
--

Commit 4484286e49b7272d2f82e022c0ee5a8690ccc564 in incubator-airflow's branch 
refs/heads/master from [~ashb]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=4484286 ]

[AIRFLOW-2238] Flake8 fixes on dev/airflow-pr


> Update dev/airflow-pr to work with gitub for merge targets
> --
>
> Key: AIRFLOW-2238
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2238
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: PR tool
>Reporter: Ash Berlin-Taylor
>Priority: Major
>
> We are planning on migrating to the Apache "GitBox" project which lets 
> committers work directly on github. This will mean we might not _need_ to use 
> the pr tool, but we should update it so that it merges and pushes back to 
> github, not the ASF repo.
> I think we need to do this before we ask the ASF infra team to migrate our 
> repo over.





[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets

2018-07-31 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563401#comment-16563401
 ] 

ASF subversion and git services commented on AIRFLOW-2238:
--

Commit d3793c0a5021df6555a720e9038ccf14b79a1196 in incubator-airflow's branch 
refs/heads/master from [~ashb]
[ https://gitbox.apache.org/repos/asf?p=incubator-airflow.git;h=d3793c0 ]

[AIRFLOW-2238] Update PR tool to push directly to Github
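
As a rough sketch of what this change amounts to: both the PR head and the merge target are now fetched from the GitHub remote rather than from a separate Apache remote. The function below only builds the command strings. The function name and the default values for the remote and branch prefix are assumptions made for the example, not taken from dev/airflow-pr.

```python
def build_fetch_commands(pr_num, target_ref, remote="github", branch_prefix="PR_TOOL"):
    """Build the git commands used to stage a PR merge from a single remote.

    remote and branch_prefix defaults are illustrative assumptions.
    """
    pr_branch = "%s_MERGE_PR_%s" % (branch_prefix, pr_num)
    target_branch = "%s_MERGE_PR_%s_%s" % (branch_prefix, pr_num, target_ref.upper())
    return [
        # Fetch the pull request head into a local working branch.
        "git fetch %s pull/%s/head:%s" % (remote, pr_num, pr_branch),
        # Fetch the merge target from the same (GitHub) remote.
        "git fetch %s %s:%s" % (remote, target_ref, target_branch),
        "git checkout %s" % target_branch,
    ]
```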


> Update dev/airflow-pr to work with gitub for merge targets
> --
>
> Key: AIRFLOW-2238
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2238
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: PR tool
>Reporter: Ash Berlin-Taylor
>Priority: Major
>
> We are planning on migrating to the Apache "GitBox" project which lets 
> committers work directly on github. This will mean we might not _need_ to use 
> the pr tool, but we should update it so that it merges and pushes back to 
> github, not the ASF repo.
> I think we need to do this before we ask the ASF infra team to migrate our 
> repo over.





[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563399#comment-16563399
 ] 

ASF GitHub Bot commented on AIRFLOW-2238:
-

ashb closed pull request #3413: [AIRFLOW-2238] Switch PR tool to push to Github
URL: https://github.com/apache/incubator-airflow/pull/3413
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/dev/airflow-pr b/dev/airflow-pr
index 65243ddf87..08eb9aad36 100755
--- a/dev/airflow-pr
+++ b/dev/airflow-pr
@@ -54,7 +54,8 @@ except ImportError:
 try:
 import keyring
 except ImportError:
-print("Could not find the keyring library. Run 'sudo pip install keyring' to install.")
+print("Could not find the keyring library. "
+  "Run 'sudo pip install keyring' to install.")
 sys.exit(-1)
 
 # Location of your Airflow git development area
@@ -64,12 +65,12 @@ AIRFLOW_GIT_LOCATION = os.environ.get(
 
 # Remote name which points to the Gihub site
 GITHUB_REMOTE_NAME = os.environ.get("GITHUB_REMOTE_NAME", "github")
-# Remote name which points to Apache git
-APACHE_REMOTE_NAME = os.environ.get("APACHE_REMOTE_NAME", "apache")
-# OAuth key used for issuing requests against the GitHub API. If this is not defined, then requests
-# will be unauthenticated. You should only need to configure this if you find yourself regularly
-# exceeding your IP's unauthenticated request rate limit. You can create an OAuth key at
-# https://github.com/settings/tokens. This tool only requires the "public_repo" scope.
+# OAuth key used for issuing requests against the GitHub API. If this is not
+# defined, then requests will be unauthenticated. You should only need to
+# configure this if you find yourself regularly exceeding your IP's
+# unauthenticated request rate limit. You can create an OAuth key at
+# https://github.com/settings/tokens. This tool only requires the "public_repo"
+# scope.
 GITHUB_OAUTH_KEY = os.environ.get("GITHUB_OAUTH_KEY")
 
 GITHUB_BASE = "https://github.com/apache/incubator-airflow/pull";
@@ -172,7 +173,7 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
 pr_branch_name = "%s_MERGE_PR_%s" % (BRANCH_PREFIX, pr_num)
 target_branch_name = "%s_MERGE_PR_%s_%s" % (BRANCH_PREFIX, pr_num, target_ref.upper())
 run_cmd("git fetch %s pull/%s/head:%s" % (GITHUB_REMOTE_NAME, pr_num, pr_branch_name))
-run_cmd("git fetch %s %s:%s" % (APACHE_REMOTE_NAME, target_ref, target_branch_name))
+run_cmd("git fetch %s %s:%s" % (GITHUB_REMOTE_NAME, target_ref, target_branch_name))
 run_cmd("git checkout %s" % target_branch_name)
 
 had_conflicts = False
@@ -205,7 +206,8 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
 except Exception as e:
 msg = "Error merging: %s\nWould you like to manually fix-up this merge?" % e
 continue_maybe(msg)
-msg = "Okay, please fix any conflicts and 'git add' conflicting files... Finished?"
+msg = ("Okay, please fix any conflicts and 'git add' conflicting files... " +
+   "Finished?")
 continue_maybe(msg)
 had_conflicts = True
 
@@ -216,7 +218,6 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
 if pr_commits:
 all_text += ' '.join(c['commit']['message'] for c in pr_commits)
 all_jira_refs = standardize_jira_ref(all_text, only_jira=True)
-all_jira_issues = re.findall("AIRFLOW-[0-9]{1,6}", all_jira_refs)
 
 merge_message_flags = []
 
@@ -315,7 +316,6 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
 if primary_author == "":
 primary_author = distinct_authors[0]
 
-authors = "\n".join(["Author: %s" % a for a in distinct_authors])
 merge_message_flags.append(u'--author="{}"'.format(primary_author))
 
 else:
@@ -327,7 +327,7 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
 # reflow commit message
 seen_first_line = False
 for i in range(1, len(merge_message_flags)):
-if merge_message_flags[i-1] == '-m':
+if merge_message_flags[i - 1] == '-m':
 # let the first line be as long as the user wants
 if not seen_first_line:
 if '\n\n' in merge_message_flags[i]:
@@ -376,7 +376,7 @@ def merge_pr(pr_num, target_ref, title, body, pr_repo_desc, local):
 run_cmd(['git', 'commit'] + commit_flags, echo_cmd=False)
 
 if local:
-msg ='\n' + reflow("""
+msg = '\n' + reflow("""
 The PR has been merged locally in branch {}.
 You may leave this program running while you work on it. When
 you are finished, press any k

[jira] [Commented] (AIRFLOW-2238) Update dev/airflow-pr to work with gitub for merge targets

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563350#comment-16563350
 ] 

ASF GitHub Bot commented on AIRFLOW-2238:
-

codecov-io edited a comment on issue #3413: [AIRFLOW-2238] Switch PR tool to 
push to Github
URL: 
https://github.com/apache/incubator-airflow/pull/3413#issuecomment-391769983
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=h1)
 Report
   > Merging 
[#3413](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=desc)
 into 
[master](https://codecov.io/gh/apache/incubator-airflow/commit/dfa7b26ddaca80ee8fd9915ee9f6eac50fac77f6?src=pr&el=desc)
 will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-airflow/pull/3413/graphs/tree.svg?height=150&width=650&token=WdLKlKHOAU&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=tree)
   
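   ```diff
   @@           Coverage Diff           @@
   ##           master    #3413   +/-   ##
   =======================================
     Coverage   77.51%   77.51%
   =======================================
     Files         205      205
     Lines       15751    15751
   =======================================
     Hits        12210    12210
     Misses       3541     3541
   ```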
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=footer).
 Last update 
[dfa7b26...d3793c0](https://codecov.io/gh/apache/incubator-airflow/pull/3413?src=pr&el=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Update dev/airflow-pr to work with gitub for merge targets
> --
>
> Key: AIRFLOW-2238
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2238
> Project: Apache Airflow
>  Issue Type: Improvement
>  Components: PR tool
>Reporter: Ash Berlin-Taylor
>Priority: Major
>
> We are planning on migrating to the Apache "GitBox" project which lets 
> committers work directly on github. This will mean we might not _need_ to use 
> the pr tool, but we should update it so that it merges and pushes back to 
> github, not the ASF repo.
> I think we need to do this before we ask the ASF infra team to migrate our 
> repo over.





[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563319#comment-16563319
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

verdan commented on issue #3656: [AIRFLOW-2803] Fix all ESLint issues
URL: 
https://github.com/apache/incubator-airflow/pull/3656#issuecomment-409147448
 
 
   @tedmiston can you please make sure:
   - you squash your commits 
   - your commit message adheres to the [commit 
guidelines](https://github.com/apache/incubator-airflow/blob/master/.github/PULL_REQUEST_TEMPLATE.md#commits)




> Fix all ESLint issues
> -
>
> Key: AIRFLOW-2803
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2803
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Verdan Mahmood
>Assignee: Taylor Edmiston
>Priority: Major
>
> Most of the JS code in Apache Airflow has linting issues which are 
> highlighted after the integration of ESLint. 
> Once AIRFLOW-2783 is merged into the master branch, please fix all the 
> JavaScript styling issues that we have in .js and .html files. 





[jira] [Commented] (AIRFLOW-2803) Fix all ESLint issues

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563312#comment-16563312
 ] 

ASF GitHub Bot commented on AIRFLOW-2803:
-

verdan commented on a change in pull request #3656: [AIRFLOW-2803] Fix all 
ESLint issues
URL: https://github.com/apache/incubator-airflow/pull/3656#discussion_r206443837
 
 

 ##
 File path: airflow/www_rbac/static/js/clock.js
 ##
 @@ -18,24 +18,25 @@
  */
 require('./jqClock.min');
 
-$(document).ready(function () {
-  x = new Date();
+$(document).ready(() => {
 
 Review comment:
   Please note that most of the custom JS is written inline in .html files, and 
that JavaScript is not yet processed by webpack, which means we won't be able 
to transpile it to ES5 (which is fine for now).
   I am working on another issue to extract all inline JS from .html files into 
separate .js files: 
   https://issues.apache.org/jira/browse/AIRFLOW-2804
   
   My suggestion would be to implement the ES6->ES5 transpilation as part of 
this issue. Once this PR gets merged, we'll be able to extract all inline 
JS into separate .js files. 
   We already have a JIRA issue for that: 
https://issues.apache.org/jira/browse/AIRFLOW-2730




> Fix all ESLint issues
> -
>
> Key: AIRFLOW-2803
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2803
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Verdan Mahmood
>Assignee: Taylor Edmiston
>Priority: Major
>
> Most of the JS code in Apache Airflow has linting issues which are 
> highlighted after the integration of ESLint. 
> Once AIRFLOW-2783 is merged into the master branch, please fix all the 
> JavaScript styling issues that we have in .js and .html files. 





[jira] [Commented] (AIRFLOW-2814) Default Arg "file_process_interval" for class SchedulerJob is inconsistent with doc

2018-07-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563309#comment-16563309
 ] 

ASF GitHub Bot commented on AIRFLOW-2814:
-

kaxil commented on issue #3659: [AIRFLOW-2814] Fix inconsistent default config
URL: 
https://github.com/apache/incubator-airflow/pull/3659#issuecomment-409144039
 
 
   @bolkedebruin @Fokko Thoughts? 




> Default Arg "file_process_interval" for class SchedulerJob is inconsistent 
> with doc
> ---
>
> Key: AIRFLOW-2814
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2814
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Xiaodong DENG
>Assignee: Xiaodong DENG
>Priority: Critical
> Fix For: 2.0.0
>
>
> h2. Background
> In 
> [https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/jobs.py#L592]
>  , it was mentioned the default value of argument *file_process_interval* 
> should be 3 minutes (*file_process_interval:* Parse and schedule each file no 
> faster than this interval).
> The value is normally parsed from the default configuration. However, in the 
> default config_template, its value is 0 rather than 180 seconds 
> ([https://github.com/XD-DENG/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg#L432]
>  ). 
> h2. Issue
> This means that each file is actually parsed and scheduled without 
> letting Airflow "rest". This conflicts with the design purpose (a default 
> of 180 seconds) and may affect performance significantly.
> h2. My Proposal
> Change the value in the config template from 0 to 180.
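
To make the effect of the setting concrete, here is a minimal sketch of an interval throttle of the kind the docstring describes ("parse and schedule each file no faster than this interval"). It is not the scheduler's actual code: the function name, the dictionary of last-finish timestamps, and the use of plain seconds are assumptions made for the example.

```python
def files_due_for_processing(file_paths, last_finish_time, now, file_process_interval):
    """Return the files that may be parsed again at time `now`.

    last_finish_time maps a file path to the time (in seconds) its last
    parse finished. A file is due if it was never parsed, or if at least
    file_process_interval seconds have passed since then.
    """
    due = []
    for path in file_paths:
        last = last_finish_time.get(path)
        if last is None or now - last >= file_process_interval:
            due.append(path)
    return due
```

With an interval of 0, every file is due on every pass, which is the behaviour the issue reports. With 180, a file parsed 60 seconds ago is skipped until the interval has elapsed.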


