[jira] [Resolved] (AIRFLOW-1119) Redshift to S3 operator - headers not on first row
[ https://issues.apache.org/jira/browse/AIRFLOW-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1119. - Resolution: Fixed Fix Version/s: 1.8.2 Issue resolved by pull request #2245 [https://github.com/apache/incubator-airflow/pull/2245] > Redshift to S3 operator - headers not on first row > -- > > Key: AIRFLOW-1119 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1119 > Project: Apache Airflow > Issue Type: Bug >Reporter: Thomas H > Fix For: 1.8.2 > > Attachments: airflow-bug.png > > > The RedshiftToS3 operator adds headers when unloading data from redshift to > s3. However, there is a bug where the headers can appear in any row (see > screenshot). Need to add `ORDER BY 1 DESC` to the query below to ensure > headers are on first row. > https://github.com/apache/incubator-airflow/blob/master/airflow/operators/redshift_to_s3_operator.py#L93:L102 > More info regarding unloading data with headers: > http://stackoverflow.com/questions/24681214/unloading-from-redshift-to-s3-with-headers > https://medium.com/carwow-product-engineering/unloading-a-file-from-redshift-to-s3-with-headers-fb707f5480f7 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
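The fix proposed in AIRFLOW-1119 can be sketched as a query builder. This is an illustration under assumptions, not the operator's actual code: `build_unload_select` is a hypothetical helper, and the schema, table, and column names are made up. It unions a literal header row with the data cast to text and appends `ORDER BY 1 DESC`, as the issue suggests. That ordering only puts the header first when the header value in the first column sorts after every data value in that column.

```python
def build_unload_select(schema, table, columns):
    """Return a SELECT that emits a header row plus the table data as text.

    Hypothetical helper for illustration only; identifiers are assumed to
    come from trusted configuration, not from user input.
    """
    # Header row: each column name as a text literal.
    header = ", ".join("CAST('{0}' AS text) AS {0}".format(c) for c in columns)
    # Data rows: each column cast to text so both halves of the UNION match.
    data = ", ".join("CAST({0} AS text) AS {0}".format(c) for c in columns)
    return (
        "SELECT {header} "
        "UNION ALL "
        "SELECT {data} FROM {schema}.{table} "
        "ORDER BY 1 DESC"
    ).format(header=header, data=data, schema=schema, table=table)


query = build_unload_select("public", "users", ["id", "name"])
print(query)
```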
[jira] [Resolved] (AIRFLOW-1142) SubDAG Tasks Not Executed Even Though All Dependencies Met
[ https://issues.apache.org/jira/browse/AIRFLOW-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1142. - Resolution: Fixed Issue resolved by pull request #2260 [https://github.com/apache/incubator-airflow/pull/2260] > SubDAG Tasks Not Executed Even Though All Dependencies Met > -- > > Key: AIRFLOW-1142 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1142 > Project: Apache Airflow > Issue Type: Bug > Components: subdag >Affects Versions: 1.8.1 > Environment: 1.8.1rc1+incubating, Celery >Reporter: Joe Schmid >Priority: Blocker > Fix For: 1.8.1 > > Attachments: 2017-04-24T23-20-38-776547, run3-scheduler-stdout.log, > run3-task.log, SubDAGOperatorTaskLog-DEBUG.txt, Test_Nested_SubDAG_0.png, > Test_Nested_SubDAG_1-Zoomed.png, test_nested_subdag.py > > > Testing on 1.8.1rc1, we noticed that tasks in subdags were not getting > executed even though all dependencies had been met. > We were able to create a simple test DAG that re-creates the issue. Attached > is a test DAG, the log file of the subdag operator that shows it fails to run > even though dependencies are met, and screenshots of what the UI looks like. > This is definitely a regression as we have many similarly constructed DAGs > that have been running successfully on a pre-v1.8 version (a fork of > 1.7.1.3+master) for some time. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1100) Subdag tasks were not getting run
[ https://issues.apache.org/jira/browse/AIRFLOW-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1100. - Resolution: Duplicate > Subdag tasks were not getting run > - > > Key: AIRFLOW-1100 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1100 > Project: Apache Airflow > Issue Type: Bug >Reporter: Rui Wang >Assignee: Rui Wang > > Observed that some subdag tasks were not getting run because the states of such tasks > were reset to NONE in reset_state_for_orphaned_tasks by the scheduler, so > BackfillJob got into an infinite loop and could not send such tasks to the > executor, even though all dependencies were met. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1036) Exponential backoff should use randomization
[ https://issues.apache.org/jira/browse/AIRFLOW-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1036. - Resolution: Fixed Fix Version/s: 1.9.0 1.8.2 Issue resolved by pull request #2262 [https://github.com/apache/incubator-airflow/pull/2262] > Exponential backoff should use randomization > > > Key: AIRFLOW-1036 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1036 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Alex Guziel >Assignee: Alex Guziel > Fix For: 1.8.2, 1.9.0 > > > This prevents the thundering herd problem. I think with the current way this > is used, we would need to use some hashing function based on some subset of > the dag_run, task_id, dag_id, and execution_date to emulate the RNG. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
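The idea in AIRFLOW-1036 of emulating a random number generator with a hash can be sketched as follows. This is a minimal illustration, not the code from the pull request: the function name, the choice of SHA-1, the 50% to 100% jitter window, and the `max_delay` cap are all assumptions. Hashing the task identity gives each task instance a different but reproducible delay, which spreads retries out without requiring stored random state.

```python
import hashlib
from datetime import datetime, timedelta


def deterministic_backoff(base_delay, try_number, dag_id, task_id,
                          execution_date, max_delay=timedelta(hours=1)):
    """Exponential backoff with hash-derived jitter (hypothetical helper)."""
    # Exponential ceiling: base * 2^(try_number - 1), capped at max_delay.
    ceiling = base_delay.total_seconds() * (2 ** (try_number - 1))
    ceiling = min(ceiling, max_delay.total_seconds())

    # Derive a pseudo-random fraction in [0, 1] from the task identity.
    key = "{}#{}#{}#{}".format(
        dag_id, task_id, execution_date.isoformat(), try_number)
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    fraction = int(digest, 16) / float(16 ** len(digest))

    # Spread the delay between 50% and 100% of the ceiling.
    return timedelta(seconds=ceiling * (0.5 + 0.5 * fraction))


delay = deterministic_backoff(
    timedelta(seconds=10), 3, "example_dag", "example_task",
    datetime(2017, 5, 1))
print(delay)
```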
[jira] [Resolved] (AIRFLOW-1149) Allow custom filters to be added to jinja2
[ https://issues.apache.org/jira/browse/AIRFLOW-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1149. - Resolution: Fixed Fix Version/s: 1.9.0 1.8.2 Issue resolved by pull request #2258 [https://github.com/apache/incubator-airflow/pull/2258] > Allow custom filters to be added to jinja2 > -- > > Key: AIRFLOW-1149 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1149 > Project: Apache Airflow > Issue Type: Bug >Reporter: Niels Zeilemaker >Assignee: Niels Zeilemaker > Fix For: 1.8.2, 1.9.0 > > > Jinja supports a feature called custom filters (see > http://jinja.pocoo.org/docs/dev/api/#writing-filters). These filters can be > useful for custom post-processing of variables in a Jinja template. > Maybe Airflow could allow custom filters to be added to a DAG in the same way > as user_defined_macros. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1160) Upadte SparkSubmitOperator parameters
[ https://issues.apache.org/jira/browse/AIRFLOW-1160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1160. - Resolution: Fixed Fix Version/s: (was: Airflow 1.8) 1.9.0 1.8.2 Issue resolved by pull request #2265 [https://github.com/apache/incubator-airflow/pull/2265] > Upadte SparkSubmitOperator parameters > - > > Key: AIRFLOW-1160 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1160 > Project: Apache Airflow > Issue Type: Bug > Components: contrib >Affects Versions: Airflow 1.8 >Reporter: Xi Wang >Assignee: Connor Ameres > Fix For: 1.8.2, 1.9.0 > > > param executor_cores from spark_submit_hook.py is not compatible with > SparkSubmit, should be total-executor-cores instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1140) DatabricksSubmitRunOperator should template the "json" field.
[ https://issues.apache.org/jira/browse/AIRFLOW-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1140. - Resolution: Fixed Fix Version/s: 1.9.0 Issue resolved by pull request #2255 [https://github.com/apache/incubator-airflow/pull/2255] > DatabricksSubmitRunOperator should template the "json" field. > - > > Key: AIRFLOW-1140 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1140 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Andrew Chen >Assignee: Andrew Chen > Fix For: 1.9.0 > > > In the DatabricksSubmitRunOperator we should make sure to apply templating to > the json parameter. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (AIRFLOW-1166) _change_state_for_tis_without_dagrun can take very long and end up in deadlocks
Bolke de Bruin created AIRFLOW-1166: --- Summary: _change_state_for_tis_without_dagrun can take very long and end up in deadlocks Key: AIRFLOW-1166 URL: https://issues.apache.org/jira/browse/AIRFLOW-1166 Project: Apache Airflow Issue Type: Bug Affects Versions: 1.8.1 Reporter: Bolke de Bruin Assignee: Bolke de Bruin Fix For: 1.8.2 While testing multi-scheduler setups we've seen some serious delays (>8s, with just one dag running) in _change_state_for_tis_without_dagrun. This can also result in deadlocks in the db because the query in _change_state_for_tis_without_dagrun locks a significant number of task instances. This is unnecessary. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (AIRFLOW-1166) _change_state_for_tis_without_dagrun can take very long and end up in deadlocks
[ https://issues.apache.org/jira/browse/AIRFLOW-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993659#comment-15993659 ] Bolke de Bruin commented on AIRFLOW-1166: - [~saguziel] [~aoen] you probably want to be aware of this, as it might turn up in single-scheduler setups as well; besides, the delay is significant. > _change_state_for_tis_without_dagrun can take very long and end up in > deadlocks > --- > > Key: AIRFLOW-1166 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1166 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.8.1 >Reporter: Bolke de Bruin >Assignee: Bolke de Bruin > Labels: scheduler > Fix For: 1.8.2 > > > While testing multi-scheduler setups we've seen some serious delays (>8s, > with just one dag running) in _change_state_for_tis_without_dagrun. This can > also result in deadlocks in the db because the query in > _change_state_for_tis_without_dagrun locks a significant number of task > instances. This is unnecessary. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-945) Revert psycopg2 workaround when psycopg2 2.7.1 is released
[ https://issues.apache.org/jira/browse/AIRFLOW-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-945. Resolution: Fixed Fix Version/s: (was: 1.8.1) 1.8.2 Issue resolved by pull request #2272 [https://github.com/apache/incubator-airflow/pull/2272] > Revert psycopg2 workaround when psycopg2 2.7.1 is released > -- > > Key: AIRFLOW-945 > URL: https://issues.apache.org/jira/browse/AIRFLOW-945 > Project: Apache Airflow > Issue Type: Bug > Components: hooks >Affects Versions: 1.8.0rc5 >Reporter: Bolke de Bruin > Fix For: 1.8.2 > > > A workaround for https://github.com/psycopg/psycopg2/issues/517 is > present. We can revert this change when 2.7.1 is out. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1167) FTPHook file modification date does not support optional microseconds
[ https://issues.apache.org/jira/browse/AIRFLOW-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1167. - Resolution: Fixed Fix Version/s: 1.8.2 Issue resolved by pull request #2268 [https://github.com/apache/incubator-airflow/pull/2268] > FTPHook file modification date does not support optional microseconds > -- > > Key: AIRFLOW-1167 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1167 > Project: Apache Airflow > Issue Type: Bug >Reporter: Niels Zeilemaker >Assignee: Niels Zeilemaker > Fix For: 1.8.2 > > > According to the RFC, the MDTM operation can optionally return microseconds. > The FTPHook crashes if it does. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
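The parsing problem in AIRFLOW-1167 can be shown with the standard library alone. The sketch below is not the FTPHook code; `parse_mdtm` is a hypothetical helper, and it assumes the server reply has the form `213 YYYYMMDDHHMMSS` with an optional `.fraction` suffix. Python's `%f` directive accepts one to six digits, so a server that returns a longer fraction would still need extra handling.

```python
from datetime import datetime


def parse_mdtm(response):
    """Parse an MDTM reply with or without a fractional-seconds part.

    Hypothetical helper; expects e.g. '213 20170428010138' or
    '213 20170428010138.003'.
    """
    # Drop the three-digit reply code and the following space.
    timestamp = response[4:].strip()
    # Choose the format depending on whether a fraction is present.
    fmt = "%Y%m%d%H%M%S.%f" if "." in timestamp else "%Y%m%d%H%M%S"
    return datetime.strptime(timestamp, fmt)


print(parse_mdtm("213 20170428010138"))
print(parse_mdtm("213 20170428010138.003"))
```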
[jira] [Resolved] (AIRFLOW-1168) DbApiHook does not close connections when an exception occurs
[ https://issues.apache.org/jira/browse/AIRFLOW-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1168. - Resolution: Fixed Fix Version/s: 1.8.2 Issue resolved by pull request #2269 [https://github.com/apache/incubator-airflow/pull/2269] > DbApiHook does not close connections when an exception occurs > - > > Key: AIRFLOW-1168 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1168 > Project: Apache Airflow > Issue Type: Bug >Reporter: Niels Zeilemaker >Assignee: Niels Zeilemaker > Fix For: 1.8.2 > > > Most methods in the DbApiHook won't close connections/cursors when an > exception occurs. They need to be wrapped in try/finally clauses. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
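The cleanup pattern requested in AIRFLOW-1168 can be sketched with `sqlite3` standing in for a real database. This is an illustration, not the DbApiHook implementation: `get_records` here is a free function with a hypothetical `connect` factory argument. `contextlib.closing` is equivalent to a try/finally that calls `close()`, so the cursor and connection are released even when `execute` raises.

```python
import sqlite3
from contextlib import closing


def get_records(connect, sql, parameters=()):
    """Run a query and always close the cursor and connection.

    `connect` is a hypothetical zero-argument factory returning a
    DB-API connection.
    """
    with closing(connect()) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(sql, parameters)
            return cur.fetchall()


def connect_memory():
    # In-memory SQLite database, used only for the demonstration.
    return sqlite3.connect(":memory:")


print(get_records(connect_memory, "SELECT 1"))
```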
[jira] [Resolved] (AIRFLOW-1193) Add Checkr to Airflow user list
[ https://issues.apache.org/jira/browse/AIRFLOW-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1193. - Resolution: Fixed Fix Version/s: 1.9.0 Issue resolved by pull request #2276 [https://github.com/apache/incubator-airflow/pull/2276] > Add Checkr to Airflow user list > --- > > Key: AIRFLOW-1193 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1193 > Project: Apache Airflow > Issue Type: Task >Reporter: Tongbo Huang >Priority: Trivial > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1141) Remove DAG.crawl_for_tasks method
[ https://issues.apache.org/jira/browse/AIRFLOW-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1141. - Resolution: Fixed Fix Version/s: 1.9.0 Issue resolved by pull request #2275 [https://github.com/apache/incubator-airflow/pull/2275] > Remove DAG.crawl_for_tasks method > - > > Key: AIRFLOW-1141 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1141 > Project: Apache Airflow > Issue Type: Task >Reporter: Jeeyoung Kim >Priority: Trivial > Fix For: 1.9.0 > > > Currently {{DAG}} class has the method: > {noformat} > def crawl_for_tasks(objects): > """ > Typically called at the end of a script by passing globals() as a > parameter. This allows to not explicitly add every single task to the > dag explicitly. > """ > raise NotImplementedError("") > {noformat} > which is not called or tested from anywhere: > {noformat} > $ git grep crawl_for_tasks > airflow/models.py:def crawl_for_tasks(objects): > {noformat} > I suggest removing this method as it doesn't do anything and there's no > documentation related to it. Moreover, if one creates a subclass of DAG, > pylint returns the following warning: {{Method 'crawl_for_tasks' is abstract > in class 'DAG' but is not overridden (abstract-method)}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1150) Fix scripts execution in SparkSql hook
[ https://issues.apache.org/jira/browse/AIRFLOW-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1150. - Resolution: Fixed Fix Version/s: 1.8.2 Issue resolved by pull request #2259 [https://github.com/apache/incubator-airflow/pull/2259] > Fix scripts execution in SparkSql hook > --- > > Key: AIRFLOW-1150 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1150 > Project: Apache Airflow > Issue Type: Bug > Components: contrib >Affects Versions: Airflow 1.8 >Reporter: Giovanni Lanzani > Fix For: 1.8.2 > > > When using the SparkSqlOperator and submitting a file (ending with .sql > or .hql), a trailing whitespace needs to be appended; otherwise a Jinja error will be > raised. However, the trailing whitespace confuses the hook, as those file names will > no longer end with ".sql" or ".hql" but with ".sql " or ".hql " (note the trailing space). This PR fixes this. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
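The suffix check described in AIRFLOW-1150 can be made robust to the trailing whitespace by stripping before testing. The sketch below is an assumption about one way to do this, not the code from the pull request; `looks_like_sql_file` is a hypothetical name.

```python
def looks_like_sql_file(sql):
    """Return True if `sql` names a .sql or .hql file.

    Hypothetical helper: surrounding whitespace is ignored, so a value
    such as 'query.sql ' is still recognized as a file.
    """
    return sql.strip().endswith((".sql", ".hql"))


print(looks_like_sql_file("query.sql "))
print(looks_like_sql_file("SELECT 1"))
```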
[jira] [Resolved] (AIRFLOW-1175) Add Pronto Tools to Airflow user list
[ https://issues.apache.org/jira/browse/AIRFLOW-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1175. - Resolution: Fixed Fix Version/s: 1.9.0 Issue resolved by pull request #2277 [https://github.com/apache/incubator-airflow/pull/2277] > Add Pronto Tools to Airflow user list > - > > Key: AIRFLOW-1175 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1175 > Project: Apache Airflow > Issue Type: Task >Reporter: Kan Ouivirach >Priority: Trivial > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-823) Make task instance details available via API
[ https://issues.apache.org/jira/browse/AIRFLOW-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-823. Resolution: Fixed Fix Version/s: 1.9.0 Issue resolved by pull request #2045 [https://github.com/apache/incubator-airflow/pull/2045] > Make task instance details available via API > > > Key: AIRFLOW-823 > URL: https://issues.apache.org/jira/browse/AIRFLOW-823 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Robin Miller >Assignee: Robin Miller >Priority: Minor > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-860) Circular module dependency prevents loading of custom executor plugin
[ https://issues.apache.org/jira/browse/AIRFLOW-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-860. Resolution: Fixed Fix Version/s: 1.8.2 > Circular module dependency prevents loading of custom executor plugin > - > > Key: AIRFLOW-860 > URL: https://issues.apache.org/jira/browse/AIRFLOW-860 > Project: Apache Airflow > Issue Type: Bug > Components: executor, plugins >Affects Versions: Airflow 1.8 > Environment: Linux RHEL 7.6, Python 2.7.13 >Reporter: Nick Ivanov > Fix For: 1.8.2 > > > Custom plugins cannot be loaded, which prevents airflow from running, due to > apparent cyclic dependency in plugins_manager called in > {{executors.\_\_init\_\_}} -- the top-level {{\_\_init\_\_}} attempts to load > the default executor, which then goes back to plugins_manager etc. > {noformat} > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/EGG-INFO/scripts/airflow", > line 17, in > from airflow import configuration > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/__init__.py", > line 31, in > from airflow.models import DAG > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/models.py", > line 62, in > from airflow.executors import DEFAULT_EXECUTOR, LocalExecutor > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/executors/__init__.py", > line 55, in > raise AirflowException("Executor {0} not supported.".format(_EXECUTOR)) > airflow.exceptions.AirflowException: Executor TestExecutor not supported. 
> {noformat} > While attempting {{from airflow.plugins_manager import executors_modules}} > cycles right back where it came from: > {noformat} > >>> from airflow.plugins_manager import executor_modules > Traceback (most recent call last): > File "", line 1, in > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/__init__.py", > line 31, in > from airflow.models import DAG > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/models.py", > line 62, in > from airflow.executors import DEFAULT_EXECUTOR, LocalExecutor > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/executors/__init__.py", > line 55, in > raise AirflowException("Executor {0} not supported.".format(_EXECUTOR)) > airflow.exceptions.AirflowException: Executor LSFExecutor not supported. > {noformat} > To reproduce: > # Create a custom executor, e.g. by copying {{local_executor.py}} into > {{$AIRFLOW_HOME/plugins/test_executor.py}} and replacing occurrences of > "Local" with "Test" in the code, which will produce {{TestExecutor}} > # Update {{$AIRFLOW_HOME/airflow.cfg}} by setting in the {{\[core\]}} > section: {{executor = TestExecutor}} > # Run any airflow command, e.g. "airflow version" -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-935) Impossible to use plugin executors
[ https://issues.apache.org/jira/browse/AIRFLOW-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-935. Resolution: Fixed Fix Version/s: 1.8.2 > Impossible to use plugin executors > -- > > Key: AIRFLOW-935 > URL: https://issues.apache.org/jira/browse/AIRFLOW-935 > Project: Apache Airflow > Issue Type: Bug > Components: executor, plugins >Affects Versions: Airflow 1.8, 1.9.0 >Reporter: Stijn Verhaegen > Fix For: 1.8.2 > > > revision eb5982d (included in 1.8) breaks plugin executors altogether. It > makes a new module for every plugin, so import statements need to be adapted, > but the executor selection is left unchanged, so it ends up assigning the > plugin module as an executor. > To reproduce: take any plugin which defines a custom executor and try to get > it loaded by setting `executor` in the airflow.cfg -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1041) DockerOperator replaces its xcom_push method with a boolean
[ https://issues.apache.org/jira/browse/AIRFLOW-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1041. - Resolution: Fixed Fix Version/s: 1.8.2 > DockerOperator replaces its xcom_push method with a boolean > --- > > Key: AIRFLOW-1041 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1041 > Project: Apache Airflow > Issue Type: Bug > Components: docker >Affects Versions: 1.8.0 >Reporter: Ludovic Claude > Fix For: 1.8.2 > > > xcom_push method is not usable from DockerOperator as it is replaced by a > boolean value. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
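The failure mode in AIRFLOW-1041 is ordinary Python attribute shadowing and can be reproduced without Docker or Airflow. All class names below are hypothetical stand-ins, and the renamed flag is just one possible remedy; the sketch does not claim to mirror the actual fix.

```python
class BaseOperatorSketch(object):
    """Stand-in for a base class that provides an xcom_push method."""

    def xcom_push(self, key, value):
        return (key, value)


class ShadowingOperator(BaseOperatorSketch):
    def __init__(self, xcom_push=False):
        # The instance attribute hides the inherited method of the same name.
        self.xcom_push = xcom_push


class RenamedFlagOperator(BaseOperatorSketch):
    def __init__(self, xcom_push=False):
        # Storing the flag under a different name keeps the method usable.
        self.xcom_push_flag = xcom_push


try:
    ShadowingOperator(xcom_push=True).xcom_push("key", "value")
except TypeError as exc:
    print("shadowed:", exc)

print(RenamedFlagOperator(xcom_push=True).xcom_push("key", "value"))
```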
[jira] [Resolved] (AIRFLOW-1170) DbApiHook insert rows does not use prepared statement
[ https://issues.apache.org/jira/browse/AIRFLOW-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1170. - Resolution: Fixed Fix Version/s: 1.8.2 Issue resolved by pull request #2270 [https://github.com/apache/incubator-airflow/pull/2270] > DbApiHook insert rows does not use prepared statement > - > > Key: AIRFLOW-1170 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1170 > Project: Apache Airflow > Issue Type: Bug >Reporter: Niels Zeilemaker >Assignee: Niels Zeilemaker > Fix For: 1.8.2 > > > Let's not allow sql injections -- This message was sent by Atlassian JIRA (v6.3.15#6346)
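The point of AIRFLOW-1170 is that row values should be passed as bound parameters rather than interpolated into the SQL string. The sketch below uses `sqlite3` for a runnable demonstration and is not the DbApiHook code. The `?` placeholder is specific to drivers with the qmark paramstyle; other drivers use `%s` or named styles. Table and column names cannot be bound, so they are assumed to come from trusted code.

```python
import sqlite3


def insert_rows(conn, table, rows, target_fields):
    """Insert rows using bound parameters (hypothetical helper)."""
    placeholders = ", ".join("?" for _ in target_fields)
    sql = "INSERT INTO {0} ({1}) VALUES ({2})".format(
        table, ", ".join(target_fields), placeholders)
    cur = conn.cursor()
    try:
        # Values travel separately from the SQL text, so quotes in the
        # data cannot change the statement.
        cur.executemany(sql, rows)
        conn.commit()
    finally:
        cur.close()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
insert_rows(conn, "users", [("Robert'); DROP TABLE users;--",)], ["name"])
print(conn.execute("SELECT name FROM users").fetchall())
```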
[jira] [Resolved] (AIRFLOW-993) Dags should modify the start date and end date of tasks when they are added
[ https://issues.apache.org/jira/browse/AIRFLOW-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-993. Resolution: Fixed Fix Version/s: (was: 1.8.1) 1.8.2 Issue resolved by pull request #2157 [https://github.com/apache/incubator-airflow/pull/2157] > Dags should modify the start date and end date of tasks when they are added > --- > > Key: AIRFLOW-993 > URL: https://issues.apache.org/jira/browse/AIRFLOW-993 > Project: Apache Airflow > Issue Type: Bug > Components: DAG >Affects Versions: 1.8.0 >Reporter: Jeremiah Lowin >Assignee: Jeremiah Lowin >Priority: Minor > Fix For: 1.8.2 > > > When tasks are added to DAGs, the DAG checks if the task has a start_date. If > it doesn't, the DAG sets it to its own start date. This isn't done for > end_date, but it should be. > Otherwise, this simple code leads to a surprising failure as the backfill > tries to run the task every day, even though the DAG clearly has an end date > set. > {code} > from airflow.models import DAG > from airflow.operators.dummy_operator import DummyOperator > import datetime > dt = datetime.datetime(2017, 1, 1) > with DAG('test', start_date=dt, end_date=dt) as dag: > op = DummyOperator(task_id='dummy') > op.run() > {code} > Furthermore, it may make sense for the task start date to always be the later > of the task start date and the dag start date; similarly for the end date > (but using the earlier date) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
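The rule suggested at the end of AIRFLOW-993 (use the later of the two start dates and the earlier of the two end dates) can be sketched as a small pure function. `effective_dates` is a hypothetical helper written for illustration, not the logic merged in the pull request; `None` means "not set".

```python
from datetime import datetime


def effective_dates(task_start, task_end, dag_start, dag_end):
    """Combine task and DAG date bounds (hypothetical helper).

    Returns (start, end): the later start date and the earlier end date,
    ignoring bounds that are None.
    """
    starts = [d for d in (task_start, dag_start) if d is not None]
    ends = [d for d in (task_end, dag_end) if d is not None]
    start = max(starts) if starts else None
    end = min(ends) if ends else None
    return start, end


dt = datetime(2017, 1, 1)
# A task with no dates of its own inherits both bounds from the DAG.
print(effective_dates(None, None, dt, dt))
```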
[jira] [Resolved] (AIRFLOW-1180) Fix flask-wtf version in setup.py for test_csrf_rejection
[ https://issues.apache.org/jira/browse/AIRFLOW-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1180. - Resolution: Fixed Fix Version/s: 1.8.2 Issue resolved by pull request #2280 [https://github.com/apache/incubator-airflow/pull/2280] > Fix flask-wtf version in setup.py for test_csrf_rejection > - > > Key: AIRFLOW-1180 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1180 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kengo Seki >Assignee: Kengo Seki > Fix For: 1.8.2 > > > I locally ran the core tests and got the following error: > {code} > FAIL: test_csrf_rejection (tests.core.SecurityTests) > -- > Traceback (most recent call last): > File "/home/vagrant/incubator-airflow/tests/core.py", line 1530, in > test_csrf_rejection > self.assertIn('CSRF token is missing', response.data.decode('utf-8')) > AssertionError: 'CSRF token is missing' not found in u' "-//W3C//DTD HTML 3.2 Final//EN">\n400 Bad Request\nBad > Request\nCSRF token missing or incorrect.\n' > {code} > [The expected string is added from Flask-WTF > v0.14|https://github.com/lepture/flask-wtf/issues/239], but its version > specified in setup.py is v0.12. So this test failed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1145) Closest_date_partition not working with before = True
[ https://issues.apache.org/jira/browse/AIRFLOW-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1145. - Resolution: Fixed Fix Version/s: (was: Airflow 2.0) 1.8.2 Issue resolved by pull request #2257 [https://github.com/apache/incubator-airflow/pull/2257] > Closest_date_partition not working with before = True > - > > Key: AIRFLOW-1145 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1145 > Project: Apache Airflow > Issue Type: Bug > Components: hive_hooks, plugins >Affects Versions: Airflow 2.0, Airflow 1.8 >Reporter: Julien GRAND-MOURCEL >Assignee: Julien GRAND-MOURCEL >Priority: Minor > Labels: easyfix, features, newbie > Fix For: 1.8.2 > > Original Estimate: 1h > Remaining Estimate: 1h > > When using the closest_date_partition with the parameter "before" set to > True, the function always returns the oldest date for this partition. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
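The behaviour expected in AIRFLOW-1145 can be sketched as follows. The function name and the semantics of `before` are assumptions for illustration (True: closest date on or before the target, False: closest on or after, None: closest on either side); this is not the hive_hooks code.

```python
from datetime import date


def closest_date(target, candidates, before=None):
    """Return the candidate date closest to `target` (hypothetical helper)."""
    if before is True:
        pool = [d for d in candidates if d <= target]
    elif before is False:
        pool = [d for d in candidates if d >= target]
    else:
        pool = list(candidates)
    if not pool:
        return None
    # Smallest absolute distance wins, rather than the oldest date.
    return min(pool, key=lambda d: abs(target - d))


partitions = [date(2017, 1, 1), date(2017, 1, 10), date(2017, 1, 20)]
print(closest_date(date(2017, 1, 12), partitions, before=True))
```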
[jira] [Commented] (AIRFLOW-1098) parent_dag is not correctly set when subdags have subdags inside
[ https://issues.apache.org/jira/browse/AIRFLOW-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009476#comment-16009476 ] Bolke de Bruin commented on AIRFLOW-1098: - Please elaborate and provide an example. > parent_dag is not correctly set when subdags have subdags inside > > > Key: AIRFLOW-1098 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1098 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: Airflow 1.8 >Reporter: wangwenxiang > Original Estimate: 24h > Remaining Estimate: 24h > > Dag has a subdags method to recursively retrieve all sub dags associated. It > is incorrectly used to bag one dag and all its subdags when subdags have > subdags. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1186) Task duration shows weird graph for some dags
[ https://issues.apache.org/jira/browse/AIRFLOW-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1186. - Resolution: Fixed Fix Version/s: 1.8.2 Issue resolved by pull request #2284 [https://github.com/apache/incubator-airflow/pull/2284] > Task duration shows weird graph for some dags > - > > Key: AIRFLOW-1186 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1186 > Project: Apache Airflow > Issue Type: Bug >Reporter: Ignasi Peiró >Assignee: Ignasi Peiró > Fix For: 1.8.2 > > Attachments: duration.png > > > The duration chart sometimes has a bad behaviour and shows a "mess". See > attached screenshot -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1201) Update deprecated 'nose-parameterized' library to 'parameterized'
[ https://issues.apache.org/jira/browse/AIRFLOW-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1201. - Resolution: Fixed Fix Version/s: (was: Airflow 2.0) 1.8.2 > Update deprecated 'nose-parameterized' library to 'parameterized' > - > > Key: AIRFLOW-1201 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1201 > Project: Apache Airflow > Issue Type: Bug > Components: dependencies >Affects Versions: Airflow 2.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.8.2 > > > {code} > /home/travis/build/apache/incubator-airflow/.tox/py27-cdh-airflow_backend_mysql/lib/python2.7/site-packages/nose_parameterized/__init__.py:7: > UserWarning: The 'nose-parameterized' package has been renamed > 'parameterized'. For the two step migration instructions, see: > https://github.com/wolever/parameterized#migrating-from-nose-parameterized-to-parameterized > (set NOSE_PARAMETERIZED_NO_WARN=1 to suppress this warning) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (AIRFLOW-105) SqoopHook
[ https://issues.apache.org/jira/browse/AIRFLOW-105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin closed AIRFLOW-105. -- Resolution: Fixed Fix Version/s: 1.8.0 > SqoopHook > - > > Key: AIRFLOW-105 > URL: https://issues.apache.org/jira/browse/AIRFLOW-105 > Project: Apache Airflow > Issue Type: New Feature > Components: hooks >Affects Versions: Airflow 1.7.1 >Reporter: Bolke de Bruin > Fix For: 1.8.0 > > > Sqoop 1 Hook -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (AIRFLOW-216) Sqoop hook and operator required
[ https://issues.apache.org/jira/browse/AIRFLOW-216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin closed AIRFLOW-216. -- Resolution: Fixed Fix Version/s: 1.8.0 > Sqoop hook and operator required > > > Key: AIRFLOW-216 > URL: https://issues.apache.org/jira/browse/AIRFLOW-216 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, hooks, operators >Affects Versions: Airflow 1.7.0 >Reporter: Johan >Priority: Minor > Fix For: 1.8.0 > > Original Estimate: 24h > Remaining Estimate: 24h > > It would be nice to have a Sqoop hook and operator for Sqoop 1 which will > execute, import, and export Sqoop commands. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1213) Add hcatalog parameters to the sqoop operator/hook
[ https://issues.apache.org/jira/browse/AIRFLOW-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1213. - Resolution: Fixed Fix Version/s: 1.8.2 > Add hcatalog parameters to the sqoop operator/hook > -- > > Key: AIRFLOW-1213 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1213 > Project: Apache Airflow > Issue Type: Bug >Reporter: Fokko Driesprong > Fix For: 1.8.2 > > > Hi, > For the Sqoop hook and operator we would like to add the hcatalog parameters > --hcatalog-database and --hcatalog-table. These are added in Sqoop 1.4.6: > https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html > This allows the user to import/export Hive tables instead of hdfs directories. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
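How the two flags named in AIRFLOW-1213 might be appended to a Sqoop command line can be sketched as below. `build_sqoop_import` and its parameters are hypothetical and do not reflect the hook's real signature; only `--hcatalog-database` and `--hcatalog-table` come from the issue text, and the connection string is a made-up example.

```python
def build_sqoop_import(connect, table, hcatalog_database=None,
                       hcatalog_table=None):
    """Build an argument list for `sqoop import` (hypothetical helper)."""
    cmd = ["sqoop", "import", "--connect", connect, "--table", table]
    # The HCatalog options are only added when they are provided.
    if hcatalog_database:
        cmd += ["--hcatalog-database", hcatalog_database]
    if hcatalog_table:
        cmd += ["--hcatalog-table", hcatalog_table]
    return cmd


print(build_sqoop_import(
    "jdbc:mysql://db.example.com/shop", "orders",
    hcatalog_database="default", hcatalog_table="orders"))
```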
[jira] [Resolved] (AIRFLOW-1207) Enable utils.helpers unit tests
[ https://issues.apache.org/jira/browse/AIRFLOW-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1207. - Resolution: Fixed Fix Version/s: (was: Airflow 2.0) 1.9.0 > Enable utils.helpers unit tests > --- > > Key: AIRFLOW-1207 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1207 > Project: Apache Airflow > Issue Type: Bug > Components: tests >Affects Versions: Airflow 2.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.9.0 > > > Currently these tests are not running. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (AIRFLOW-1207) Enable utils.helpers unit tests
[ https://issues.apache.org/jira/browse/AIRFLOW-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1207: Affects Version/s: (was: Airflow 2.0) 1.8.1 > Enable utils.helpers unit tests > --- > > Key: AIRFLOW-1207 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1207 > Project: Apache Airflow > Issue Type: Bug > Components: tests >Affects Versions: 1.8.1 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.9.0 > > > Currently these tests are not running. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1200) Forbid creation of a variable with an empty key
[ https://issues.apache.org/jira/browse/AIRFLOW-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1200. - Resolution: Fixed Fix Version/s: (was: Airflow 2.0) 1.8.2 Issue resolved by pull request #2299 [https://github.com/apache/incubator-airflow/pull/2299] > Forbid creation of a variable with an empty key > --- > > Key: AIRFLOW-1200 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1200 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: Airflow 2.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.8.2 > > > This leads to the following exception: > {code} > Traceback (most recent call last): > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 2000, in __call__ > return self.wsgi_app(environ, start_response) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1991, in wsgi_app > response = self.make_response(self.handle_exception(e)) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1567, in handle_exception > reraise(exc_type, exc_value, tb) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1988, in wsgi_app > response = self.full_dispatch_request() > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1641, in full_dispatch_request > rv = self.handle_user_exception(e) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1544, in handle_user_exception > reraise(exc_type, exc_value, tb) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1639, in full_dispatch_request > rv = self.dispatch_request() > File > 
"/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1625, in dispatch_request > return self.view_functions[rule.endpoint](**req.view_args) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/base.py", > line 69, in inner > return self._run_view(f, *args, **kwargs) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/base.py", > line 368, in _run_view > return fn(self, *args, **kwargs) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/model/base.py", > line 1900, in index_view > return_url=self._get_list_url(view_args), > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/base.py", > line 308, in render > return render_template(template, **kwargs) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/templating.py", > line 134, in render_template > context, ctx.app) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/templating.py", > line 116, in _render > rv = template.render(context) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/jinja2/environment.py", > line 989, in render > return self.environment.handle_exception(exc_info, True) > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/jinja2/environment.py", > line 754, in handle_exception > reraise(exc_type, exc_value, tb) > File > "/home/skudriashev/Work/incubator-airflow/airflow/www/templates/airflow/variable_list.html", > line 18, in top-level template code > {% extends 'admin/model/list.html' %} > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/model/list.html", > line 6, in top-level template code > {% import 'admin/model/row_actions.html' as row_actions with context %} 
> File > "/home/skudriashev/Work/incubator-airflow/airflow/www/templates/admin/master.html", > line 18, in top-level template code > {% extends 'admin/base.html' %} > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/base.html", > line 30, in top-level template code > {% block page_body %} > File > "/home/skudriashev/Work/incubator-airflow/airflow/www/templates/admin/master.html", > line 104, in block "page_body" > {% block body %} > File > "/home/skudriashev/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/model/list.html", > line 62, in block "
[jira] [Resolved] (AIRFLOW-1199) Create modal looks like edit modal (edit tab selected)
[ https://issues.apache.org/jira/browse/AIRFLOW-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1199. - Resolution: Fixed Fix Version/s: (was: Airflow 2.0) 1.8.2 Issue resolved by pull request #2293 [https://github.com/apache/incubator-airflow/pull/2293] > Create modal looks like edit modal (edit tab selected) > -- > > Key: AIRFLOW-1199 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1199 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: Airflow 2.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.8.2 > > Attachments: imageedit_4_9834649179.jpg > > > !imageedit_4_9834649179.jpg|thumbnail! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1221) Fix DatabricksSubmitRunOperator Templating
[ https://issues.apache.org/jira/browse/AIRFLOW-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1221. - Resolution: Fixed Fix Version/s: 1.8.2 > Fix DatabricksSubmitRunOperator Templating > -- > > Key: AIRFLOW-1221 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1221 > Project: Apache Airflow > Issue Type: Bug >Reporter: Andrew Chen > Fix For: 1.8.2 > > > https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/operators/databricks_operator.py#L207 > Bug where we mixed up index and content from enumerate. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
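As a reminder of the pattern involved, enumerate yields (index, content) pairs in that order. The sketch below is not the operator's code; render_all and its render callback are hypothetical names used only to show the correct unpacking.

```python
def render_all(items, render):
    """Return a copy of `items` with `render` applied to each element."""
    rendered = list(items)
    # enumerate yields (index, content); swapping the two names is the
    # kind of mix-up the issue describes.
    for index, content in enumerate(items):
        rendered[index] = render(content)
    return rendered
```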
[jira] [Resolved] (AIRFLOW-1226) Remove empty column on the Jobs view
[ https://issues.apache.org/jira/browse/AIRFLOW-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1226. - Resolution: Fixed Fix Version/s: (was: Airflow 2.0) 1.8.2 Issue resolved by pull request #2309 [https://github.com/apache/incubator-airflow/pull/2309] > Remove empty column on the Jobs view > > > Key: AIRFLOW-1226 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1226 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: Airflow 2.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev >Priority: Minor > Fix For: 1.8.2 > > Attachments: imageedit_9_5615772146.jpg > > > First column on the Jobs view is redundant and should be removed. > !imageedit_9_5615772146.jpg! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1227) Remove empty column on the Logs view
[ https://issues.apache.org/jira/browse/AIRFLOW-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1227. - Resolution: Fixed Fix Version/s: (was: Airflow 2.0) 1.8.2 Issue resolved by pull request #2310 [https://github.com/apache/incubator-airflow/pull/2310] > Remove empty column on the Logs view > > > Key: AIRFLOW-1227 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1227 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: Airflow 2.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev >Priority: Minor > Fix For: 1.8.2 > > Attachments: imageedit_14_8306586340.jpg > > > First column on the Logs view is redundant and should be removed. > !imageedit_14_8306586340.jpg! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1233) Cover utils.json with unit tests
[ https://issues.apache.org/jira/browse/AIRFLOW-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1233. - Resolution: Fixed Fix Version/s: (was: Airflow 1.8) 1.9.0 Issue resolved by pull request #2316 [https://github.com/apache/incubator-airflow/pull/2316] > Cover utils.json with unit tests > > > Key: AIRFLOW-1233 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1233 > Project: Apache Airflow > Issue Type: Test > Components: tests >Affects Versions: Airflow 1.8 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev >Priority: Minor > Fix For: 1.9.0 > > > Cover utils.json with unit tests to get 100% coverage. > {code} > Name Stmts Miss Cover > > airflow/utils/json.py 23 0 100% > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1232) Remove deprecated readfp warning
[ https://issues.apache.org/jira/browse/AIRFLOW-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1232. - Resolution: Fixed Fix Version/s: (was: Airflow 1.8) 1.9.0 Issue resolved by pull request #2315 [https://github.com/apache/incubator-airflow/pull/2315] > Remove deprecated readfp warning > > > Key: AIRFLOW-1232 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1232 > Project: Apache Airflow > Issue Type: Bug > Components: dependencies >Affects Versions: Airflow 1.8 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev >Priority: Minor > Fix For: 1.9.0 > > > {code} > /home/travis/build/apache/incubator-airflow/airflow/configuration.py:128: > DeprecationWarning: This method will be removed in future versions. Use > 'parser.read_file()' instead. > self.readfp(StringIO.StringIO(string)) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
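The replacement the warning asks for can be sketched as follows. This assumes the Python 3 standard-library configparser and io modules (the warning above comes from a Python 2 environment), and load_config is a hypothetical helper, not Airflow's configuration class.

```python
import configparser
import io


def load_config(text):
    """Parse an INI-style string without the deprecated readfp()."""
    parser = configparser.ConfigParser()
    # read_file() accepts any iterable of lines, including a StringIO.
    parser.read_file(io.StringIO(text))
    return parser
```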
[jira] [Resolved] (AIRFLOW-1231) Use flask_wtf.CSRFProtect instead of flask_wtf.CsrfProtect to remove deprecation warning
[ https://issues.apache.org/jira/browse/AIRFLOW-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1231. - Resolution: Fixed Fix Version/s: (was: Airflow 1.8) 1.9.0 Issue resolved by pull request #2313 [https://github.com/apache/incubator-airflow/pull/2313] > Use flask_wtf.CSRFProtect instead of flask_wtf.CsrfProtect to remove > deprecation warning > > > Key: AIRFLOW-1231 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1231 > Project: Apache Airflow > Issue Type: Improvement > Components: core >Affects Versions: Airflow 1.8 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev >Priority: Minor > Fix For: 1.9.0 > > > Use flask_wtf.CSRFProtect instead of flask_wtf.CsrfProtect. > {code} > ...app.py:23: FlaskWTFDeprecationWarning: "flask_wtf.CsrfProtect" has been > renamed to "CSRFProtect" and will be removed in 1.0. > csrf = CsrfProtect() > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-645) HttpHook ignores https
[ https://issues.apache.org/jira/browse/AIRFLOW-645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-645. Resolution: Fixed Fix Version/s: 1.9.0 Issue resolved by pull request #2311 [https://github.com/apache/incubator-airflow/pull/2311] > HttpHook ignores https > -- > > Key: AIRFLOW-645 > URL: https://issues.apache.org/jira/browse/AIRFLOW-645 > Project: Apache Airflow > Issue Type: Bug > Components: hooks >Affects Versions: Airflow 2.0, Airflow 1.7.1 >Reporter: Ryan Morlok > Fix For: 1.9.0 > > > When loading an https connection from an environment variable, HttpHook > uses BaseHook's get_connection(...) method, which loads the string > from the environment variable and then parses the URI. > HttpHook then uses the connection's host as the base_url, and the host > never includes the protocol. It checks whether the base_url > starts with http, and since it never does, it always prepends http://, losing the > https. > I think > self.base_url = conn.host > in http_hook.py should be updated to: > self.base_url = conn.conn_type + conn.host -- This message was sent by Atlassian JIRA (v6.3.15#6346)
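The behaviour described above, and one possible way around it, can be sketched with the standard library alone. This is an assumption-laden illustration, not the fix merged in the pull request: build_base_url is a hypothetical function, and it works on a raw connection URI rather than on Airflow's Connection object.

```python
from urllib.parse import urlparse


def build_base_url(conn_uri: str) -> str:
    """Rebuild a base URL while keeping the scheme from the URI."""
    parsed = urlparse(conn_uri)
    # parsed.hostname never contains the scheme, which is why a check
    # such as host.startswith("http") can never succeed.
    scheme = parsed.scheme if parsed.scheme in ("http", "https") else "http"
    host = parsed.hostname or ""
    port = ":{}".format(parsed.port) if parsed.port else ""
    return "{}://{}{}".format(scheme, host, port)
```

Falling back to http only when the scheme is missing or unrecognized preserves https for connections that ask for it.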
[jira] [Resolved] (AIRFLOW-1217) Enable logging in Sqoop hook
[ https://issues.apache.org/jira/browse/AIRFLOW-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1217. - Resolution: Fixed Fix Version/s: 1.8.2 > Enable logging in Sqoop hook > > > Key: AIRFLOW-1217 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1217 > Project: Apache Airflow > Issue Type: Bug >Reporter: Fokko Driesprong > Fix For: 1.8.2 > > > Currently the sqoop hook does not give any output in Airflow which makes > debugging hard. We would like to change this and have proper logging from the > subprocess to airflow. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
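One common way to forward a subprocess's output to a logger is sketched below. It is only an illustration of the idea in the report: run_and_log and the logger name sqoop_sketch are hypothetical, and the actual hook may be structured differently.

```python
import logging
import subprocess
import sys

log = logging.getLogger("sqoop_sketch")  # hypothetical logger name


def run_and_log(cmd):
    """Run `cmd`, log each output line as it arrives, return the lines."""
    lines = []
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so nothing is lost
        universal_newlines=True,
    ) as proc:
        for raw in proc.stdout:
            line = raw.rstrip()
            log.info(line)
            lines.append(line)
    # Leaving the `with` block waits for the process to finish.
    if proc.returncode != 0:
        raise RuntimeError("command failed with exit code {}".format(proc.returncode))
    return lines


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    run_and_log([sys.executable, "-c", "print('hello from child')"])
```

Reading line by line, instead of calling communicate(), makes the output visible while a long-running job is still in progress.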
[jira] [Resolved] (AIRFLOW-1234) Cover utils.operator_helpers with unit tests
[ https://issues.apache.org/jira/browse/AIRFLOW-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1234. - Resolution: Fixed Fix Version/s: (was: Airflow 1.8) 1.9.0 Issue resolved by pull request #2317 [https://github.com/apache/incubator-airflow/pull/2317] > Cover utils.operator_helpers with unit tests > > > Key: AIRFLOW-1234 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1234 > Project: Apache Airflow > Issue Type: Test > Components: tests >Affects Versions: Airflow 1.8 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev >Priority: Minor > Fix For: 1.9.0 > > > Cover *utils.operator_helpers* with unit tests to have 100% coverage. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1245) Fix random failure of test_trigger_dag_for_date unit test
[ https://issues.apache.org/jira/browse/AIRFLOW-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1245. - Resolution: Fixed Issue resolved by pull request #2325 [https://github.com/apache/incubator-airflow/pull/2325] > Fix random failure of test_trigger_dag_for_date unit test > - > > Key: AIRFLOW-1245 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1245 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: 1.8.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.9.0 > > > This failure happens from time to time: > https://travis-ci.org/apache/incubator-airflow/jobs/235993912 > {code} > == > 14) FAIL: test_trigger_dag_for_date > (tests.www.api.experimental.test_endpoints.ApiExperimentalTests) > -- >Traceback (most recent call last): > tests/www/api/experimental/test_endpoints.py line 87 in > test_trigger_dag_for_date > self.assertEqual(200, response.status_code) >AssertionError: 200 != 404 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1243) DAGs table has no default entries to show
[ https://issues.apache.org/jira/browse/AIRFLOW-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1243. - Resolution: Fixed Fix Version/s: (was: 1.9.0) 1.8.2 Issue resolved by pull request #2323 [https://github.com/apache/incubator-airflow/pull/2323] > DAGs table has no default entries to show > - > > Key: AIRFLOW-1243 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1243 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: 1.8.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev >Priority: Minor > Fix For: 1.8.2 > > Attachments: imageedit_1_3250112213.jpg > > > There is no default value selected. It needs to be set. > !imageedit_1_3250112213.jpg! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1237) Fix IN-predicate sqlalchemy warning
[ https://issues.apache.org/jira/browse/AIRFLOW-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1237. - Resolution: Fixed Issue resolved by pull request #2320 [https://github.com/apache/incubator-airflow/pull/2320] > Fix IN-predicate sqlalchemy warning > --- > > Key: AIRFLOW-1237 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1237 > Project: Apache Airflow > Issue Type: Bug > Components: db >Affects Versions: 1.8.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.9.0 > > > {code} > /home/travis/build/apache/incubator-airflow/.tox/py27-cdh-airflow_backend_mysql/lib/python2.7/site-packages/sqlalchemy/sql/default_comparator.py:161: > SAWarning: The IN-predicate on "dag_stats.dag_id" was invoked with an empty > sequence. This results in a contradiction, which nonetheless can be expensive > to evaluate. Consider alternative strategies for improved performance. > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
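A usual way to avoid this warning is to skip the query when the list of values is empty. The sketch below shows that guard with the standard-library sqlite3 module instead of SQLAlchemy so that it stays self-contained; fetch_dag_stats and the state column are hypothetical, and only the dag_stats.dag_id name comes from the warning above.

```python
import sqlite3


def fetch_dag_stats(conn, dag_ids):
    """Return dag_stats rows for `dag_ids`, skipping the query if empty."""
    dag_ids = list(dag_ids)
    if not dag_ids:
        # An empty IN list can never match, so no query is needed.
        return []
    placeholders = ", ".join("?" for _ in dag_ids)
    sql = "SELECT dag_id, state FROM dag_stats WHERE dag_id IN ({})".format(placeholders)
    return conn.execute(sql, dag_ids).fetchall()
```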
[jira] [Updated] (AIRFLOW-1191) Contrib Spark Submit hook should permit override of spark-submit cmd
[ https://issues.apache.org/jira/browse/AIRFLOW-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1191: Fix Version/s: (was: 1.8.1) (was: Airflow 2.0) 1.8.2 1.9.0 > Contrib Spark Submit hook should permit override of spark-submit cmd > > > Key: AIRFLOW-1191 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1191 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, hooks >Affects Versions: 1.8.1 > Environment: Cloudera based Spark parcel >Reporter: Vianney FOUCAULT >Assignee: Vianney FOUCAULT > Fix For: 1.9.0, 1.8.2 > > > On a Cloudera-based cluster, the Spark 2 parcel renames spark-submit to > spark2-submit. > It should be possible to change the spark-submit command without setting an > environment variable. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (AIRFLOW-1191) Contrib Spark Submit hook should permit override of spark-submit cmd
[ https://issues.apache.org/jira/browse/AIRFLOW-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1191: Affects Version/s: (was: Airflow 1.8) (was: Airflow 2.0) 1.8.1 > Contrib Spark Submit hook should permit override of spark-submit cmd > > > Key: AIRFLOW-1191 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1191 > Project: Apache Airflow > Issue Type: Improvement > Components: contrib, hooks >Affects Versions: 1.8.1 > Environment: Cloudera based Spark parcel >Reporter: Vianney FOUCAULT >Assignee: Vianney FOUCAULT > Fix For: 1.9.0, 1.8.2 > > > On a Cloudera-based cluster, the Spark 2 parcel renames spark-submit to > spark2-submit. > It should be possible to change the spark-submit command without setting an > environment variable. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
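The requested override could look roughly like the sketch below, where the binary name is an ordinary argument with a default. The function build_spark_submit_cmd and the parameter name spark_binary are assumptions made for illustration, not necessarily what the hook exposes.

```python
def build_spark_submit_cmd(application, spark_binary="spark-submit", master=None):
    """Assemble a spark-submit argument list with a configurable binary."""
    # spark_binary is a hypothetical parameter; on the cluster described
    # above it would be set to "spark2-submit".
    cmd = [spark_binary]
    if master:
        cmd += ["--master", master]
    cmd.append(application)
    return cmd
```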
[jira] [Updated] (AIRFLOW-1197) Contrib Spark Submit Hook on_kill error
[ https://issues.apache.org/jira/browse/AIRFLOW-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1197: Fix Version/s: 1.8.2 > Contrib Spark Submit Hook on_kill error > --- > > Key: AIRFLOW-1197 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1197 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, hooks >Affects Versions: 1.8.1 >Reporter: Vianney FOUCAULT > Fix For: 1.8.2 > > > the on_kill method is not ok, this needs to be fixed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (AIRFLOW-1197) Contrib Spark Submit Hook on_kill error
[ https://issues.apache.org/jira/browse/AIRFLOW-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1197: Affects Version/s: 1.8.1 > Contrib Spark Submit Hook on_kill error > --- > > Key: AIRFLOW-1197 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1197 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, hooks >Affects Versions: 1.8.1 >Reporter: Vianney FOUCAULT > Fix For: 1.8.2 > > > the on_kill method is not ok, this needs to be fixed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-654) SSL for AMQP w/ Celery(Executor)
[ https://issues.apache.org/jira/browse/AIRFLOW-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-654. Resolution: Fixed Fix Version/s: (was: Airflow 1.7.1.3) 1.9.0 Issue resolved by pull request #2333 [https://github.com/apache/incubator-airflow/pull/2333] > SSL for AMQP w/ Celery(Executor) > > > Key: AIRFLOW-654 > URL: https://issues.apache.org/jira/browse/AIRFLOW-654 > Project: Apache Airflow > Issue Type: Improvement > Components: celery, executor >Affects Versions: Airflow 2.0, Airflow 1.8 > Environment: Tested on: > Airflow 1.7.1.3, celery[auth] 4.0, et al. >Reporter: Michael Otte > Labels: patch, security > Fix For: 1.9.0 > > > Add celery ssl certs for amqp (w/ rabbitmq) encryption. This can go in > celery_executor.py and be set with current airflow configuration practices (e.g. > explicit in airflow.cfg, env var, etc.) > TL;DR: > Currently, celery's AMQP messages cannot be encrypted using SSL unless an SSH > tunnel, VPN, or an alternative network encryption protocol is used. > This is the only feature addition required to be able to use Airflow in an > end-to-end encrypted, distributed system. > The webserver, the disk volume, etc. can be encrypted outside of Airflow with > good security practices (e.g. the webserver can be secured at the proxy > layer, GCM with AES can be used for in-state encryption, etc.) > Could technically use the certs from the webserver (link to commit/issue > comment below) if you're lazy and if the certs are issued from the same > certificate authority as the broker's certs. > https://issues.apache.org/jira/browse/AIRFLOW-91?focusedCommentId=15503562&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15503562 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
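For orientation, Celery accepts a dictionary of SSL options for the broker connection through its broker_use_ssl setting. The sketch below only builds such a dictionary; make_broker_ssl_options is a hypothetical helper, the file paths are placeholders supplied by the caller, and how Airflow reads these values from airflow.cfg is not shown here.

```python
import ssl


def make_broker_ssl_options(keyfile, certfile, ca_certs):
    """Build a dict suitable for Celery's broker_use_ssl setting."""
    return {
        "keyfile": keyfile,    # client private key
        "certfile": certfile,  # client certificate
        "ca_certs": ca_certs,  # CA bundle used to verify the broker
        # Require the broker to present a certificate signed by ca_certs.
        "cert_reqs": ssl.CERT_REQUIRED,
    }
```

The resulting dictionary would then be assigned to broker_use_ssl in the Celery configuration used by the executor.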
[jira] [Resolved] (AIRFLOW-1244) Forbid creation of a pool with empty name
[ https://issues.apache.org/jira/browse/AIRFLOW-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1244. - Resolution: Fixed Issue resolved by pull request #2324 [https://github.com/apache/incubator-airflow/pull/2324] > Forbid creation of a pool with empty name > - > > Key: AIRFLOW-1244 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1244 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: 1.8.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.9.0 > > > Currently creation of a pool with empty name fails with exception. This needs > to be fixed. > {code} > Traceback (most recent call last): > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 2000, in __call__ > return self.wsgi_app(environ, start_response) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1991, in wsgi_app > response = self.make_response(self.handle_exception(e)) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1567, in handle_exception > reraise(exc_type, exc_value, tb) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1988, in wsgi_app > response = self.full_dispatch_request() > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1641, in full_dispatch_request > rv = self.handle_user_exception(e) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1544, in handle_user_exception > reraise(exc_type, exc_value, tb) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1639, in full_dispatch_request > rv = self.dispatch_request() > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1625, in dispatch_request > 
return self.view_functions[rule.endpoint](**req.view_args) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/base.py", > line 69, in inner > return self._run_view(f, *args, **kwargs) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/base.py", > line 368, in _run_view > return fn(self, *args, **kwargs) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/model/base.py", > line 1900, in index_view > return_url=self._get_list_url(view_args), > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/base.py", > line 308, in render > return render_template(template, **kwargs) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/templating.py", > line 134, in render_template > context, ctx.app) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/templating.py", > line 116, in _render > rv = template.render(context) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/jinja2/environment.py", > line 989, in render > return self.environment.handle_exception(exc_info, True) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/jinja2/environment.py", > line 754, in handle_exception > reraise(exc_type, exc_value, tb) > File > "/home/stas/Work/incubator-airflow/airflow/www/templates/airflow/model_list.html", > line 18, in top-level template code > {% extends 'admin/model/list.html' %} > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/model/list.html", > line 6, in top-level template code > {% import 'admin/model/row_actions.html' as row_actions with context %} > File > "/home/stas/Work/incubator-airflow/airflow/www/templates/admin/master.html", > line 18, in top-level template code > {% extends 'admin/base.html' %} > File > 
"/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/base.html", > line 30, in top-level template code > {% block page_body %} > File > "/home/stas/Work/incubator-airflow/airflow/www/templates/admin/master.html", > line 104, in block "page_body" > {% block body %} > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/model/list.html", > line 62, in block "body" > {% block model_list_table %} > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/model/list.html", > line 110
[jira] [Updated] (AIRFLOW-1244) Forbid creation of a pool with empty name
[ https://issues.apache.org/jira/browse/AIRFLOW-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1244: Fix Version/s: 1.8.2 > Forbid creation of a pool with empty name > - > > Key: AIRFLOW-1244 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1244 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: 1.8.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.9.0, 1.8.2 > > > Currently creation of a pool with empty name fails with exception. This needs > to be fixed. > {code} > Traceback (most recent call last): > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 2000, in __call__ > return self.wsgi_app(environ, start_response) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1991, in wsgi_app > response = self.make_response(self.handle_exception(e)) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1567, in handle_exception > reraise(exc_type, exc_value, tb) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1988, in wsgi_app > response = self.full_dispatch_request() > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1641, in full_dispatch_request > rv = self.handle_user_exception(e) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1544, in handle_user_exception > reraise(exc_type, exc_value, tb) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1639, in full_dispatch_request > rv = self.dispatch_request() > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/app.py", > line 1625, in dispatch_request > return self.view_functions[rule.endpoint](**req.view_args) > File > 
"/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/base.py", > line 69, in inner > return self._run_view(f, *args, **kwargs) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/base.py", > line 368, in _run_view > return fn(self, *args, **kwargs) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/model/base.py", > line 1900, in index_view > return_url=self._get_list_url(view_args), > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/base.py", > line 308, in render > return render_template(template, **kwargs) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/templating.py", > line 134, in render_template > context, ctx.app) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask/templating.py", > line 116, in _render > rv = template.render(context) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/jinja2/environment.py", > line 989, in render > return self.environment.handle_exception(exc_info, True) > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/jinja2/environment.py", > line 754, in handle_exception > reraise(exc_type, exc_value, tb) > File > "/home/stas/Work/incubator-airflow/airflow/www/templates/airflow/model_list.html", > line 18, in top-level template code > {% extends 'admin/model/list.html' %} > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/model/list.html", > line 6, in top-level template code > {% import 'admin/model/row_actions.html' as row_actions with context %} > File > "/home/stas/Work/incubator-airflow/airflow/www/templates/admin/master.html", > line 18, in top-level template code > {% extends 'admin/base.html' %} > File > 
"/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/base.html", > line 30, in top-level template code > {% block page_body %} > File > "/home/stas/Work/incubator-airflow/airflow/www/templates/admin/master.html", > line 104, in block "page_body" > {% block body %} > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/model/list.html", > line 62, in block "body" > {% block model_list_table %} > File > "/home/stas/Work/incubator-airflow/.venv/lib/python2.7/site-packages/flask_admin/templates/bootstrap3/admin/model/list.html", > line 110, in block "model_list_table" > {% block list_row scoped %} > File > "/home/sta
[jira] [Resolved] (AIRFLOW-1277) Forbid creation of a known event with empty fields
[ https://issues.apache.org/jira/browse/AIRFLOW-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1277. - Resolution: Fixed Fix Version/s: (was: 1.8.2) 1.9.0 Issue resolved by pull request #2349 [https://github.com/apache/incubator-airflow/pull/2349] > Forbid creation of a known event with empty fields > -- > > Key: AIRFLOW-1277 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1277 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: 1.8.1 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.9.0 > > > Currently known event can be created with empty fields (no label, or no > dates) which doesn't really make sense. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1276) Forbid event creation with end_data earlier than start_date
[ https://issues.apache.org/jira/browse/AIRFLOW-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1276. - Resolution: Fixed Fix Version/s: (was: 1.8.2) 1.9.0 Issue resolved by pull request #2349 [https://github.com/apache/incubator-airflow/pull/2349] > Forbid event creation with end_data earlier than start_date > --- > > Key: AIRFLOW-1276 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1276 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: 1.8.1 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.9.0 > > Attachments: imageedit_2_5108357500.jpg > > > Forbid event creation with end_data earlier than start_date. > !imageedit_2_5108357500.jpg! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
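The two known-event checks requested in AIRFLOW-1277 and AIRFLOW-1276 (no empty fields, no end date before the start date) can be sketched as a plain function. This is only an illustration: `validate_known_event` and its error strings are hypothetical names and are not taken from pull request #2349, which may well implement the checks in the UI form layer instead.

```python
from datetime import datetime


def validate_known_event(label, start_date, end_date):
    """Return a list of validation errors; an empty list means the event is valid."""
    errors = []
    if not label or not label.strip():
        errors.append("label is required")
    if start_date is None:
        errors.append("start_date is required")
    if end_date is None:
        errors.append("end_date is required")
    # Only compare the dates when both are present.
    if start_date is not None and end_date is not None and end_date < start_date:
        errors.append("end_date must not be earlier than start_date")
    return errors
```

Returning a list rather than raising on the first problem lets a form report every invalid field at once.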
[jira] [Resolved] (AIRFLOW-1281) Sort variables by key field by default
[ https://issues.apache.org/jira/browse/AIRFLOW-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1281. - Resolution: Fixed Issue resolved by pull request #2347 [https://github.com/apache/incubator-airflow/pull/2347] > Sort variables by key field by default > -- > > Key: AIRFLOW-1281 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1281 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: 1.8.1 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev >Priority: Minor > Fix For: 1.8.2 > > Attachments: variables.jpg > > > Sort variables by key field by default. > !variables.jpg! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1192) Add support to run Zeppelin Notebooks with Qubole Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1192. - Resolution: Fixed Fix Version/s: 1.8.2 Issue resolved by pull request #2322 [https://github.com/apache/incubator-airflow/pull/2322] > Add support to run Zeppelin Notebooks with Qubole Operator > --- > > Key: AIRFLOW-1192 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1192 > Project: Apache Airflow > Issue Type: New Feature > Components: contrib, operators >Reporter: Sumit Maheshwari >Assignee: Sumit Maheshwari > Fix For: 1.8.2 > > > Qubole SDK (qds_sdk) supports running Zeppelin notebooks, but as of now > Airflow can't do the same using Qubole's sdk. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1208) Speed-up cli tests
[ https://issues.apache.org/jira/browse/AIRFLOW-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1208. - Resolution: Fixed Fix Version/s: (was: Airflow 2.0) 1.9.0 Issue resolved by pull request #2301 [https://github.com/apache/incubator-airflow/pull/2301] > Speed-up cli tests > -- > > Key: AIRFLOW-1208 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1208 > Project: Apache Airflow > Issue Type: Improvement > Components: tests >Affects Versions: Airflow 2.0 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev >Priority: Minor > Fix For: 1.9.0 > > > No need to do actual reset in unit tests if nothing is really verified in it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1166) _change_state_for_tis_without_dagrun can take very long and end up in deadlocks
[ https://issues.apache.org/jira/browse/AIRFLOW-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1166. - Resolution: Fixed Issue resolved by pull request #2267 [https://github.com/apache/incubator-airflow/pull/2267] > _change_state_for_tis_without_dagrun can take very long and end up in > deadlocks > --- > > Key: AIRFLOW-1166 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1166 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.8.1 >Reporter: Bolke de Bruin >Assignee: Bolke de Bruin > Labels: scheduler > Fix For: 1.8.2 > > > While testing multi-scheduler setups we've seen some serious delays (>8s, > with just one dag running) in _change_state_for_tis_without_dagrun. This can > also result in deadlocks in the db due to the fact the query in > _change_state_for_tis_without_dagrun is locking a significant amount of task > instances. This is unnecessary. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (AIRFLOW-1282) Fix known event column sorting
[ https://issues.apache.org/jira/browse/AIRFLOW-1282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1282. - Resolution: Fixed Issue resolved by pull request #2350 [https://github.com/apache/incubator-airflow/pull/2350] > Fix known event column sorting > -- > > Key: AIRFLOW-1282 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1282 > Project: Apache Airflow > Issue Type: Bug > Components: ui >Affects Versions: 1.8.1 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.8.2 > > Attachments: event_sorting.jpg > > > Fix known event column sorting by: > - Event Type > - Reported By > !event_sorting.jpg! -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (AIRFLOW-1294) Backfills can loose tasks to execute due to tasks setting themselves to NONE
Bolke de Bruin created AIRFLOW-1294: --- Summary: Backfills can loose tasks to execute due to tasks setting themselves to NONE Key: AIRFLOW-1294 URL: https://issues.apache.org/jira/browse/AIRFLOW-1294 Project: Apache Airflow Issue Type: Bug Components: backfill Affects Versions: 1.8.1 Reporter: Bolke de Bruin Priority: Blocker In backfills we can lose tasks to execute because a task sets its own state to NONE when concurrency limits are reached. This puts the task outside the scope the backfill is managing, so it will not be executed. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (AIRFLOW-1302) LocalTaskJob should propagate return code to executor
Bolke de Bruin created AIRFLOW-1302: --- Summary: LocalTaskJob should propagate return code to executor Key: AIRFLOW-1302 URL: https://issues.apache.org/jira/browse/AIRFLOW-1302 Project: Apache Airflow Issue Type: Bug Components: executor Affects Versions: 1.8.1 Reporter: Bolke de Bruin Priority: Critical Fix For: 1.8.2 The LocalTaskJob monitors execution of tasks, but it always reports success (exit code 0) when tasks have finished regardless of what the task or OS is reporting. This makes any monitoring by the executor not relevant. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
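The behaviour asked for in AIRFLOW-1302 amounts to handing the child's real exit code back to the caller instead of always reporting 0. A minimal sketch using only the standard library follows; `run_and_propagate` and `failing_command` are hypothetical helpers for illustration and do not reflect how LocalTaskJob is actually structured.

```python
import subprocess
import sys


def run_and_propagate(command):
    """Run a command and return its real exit code instead of assuming 0."""
    return subprocess.call(command)


def failing_command(code):
    # Hypothetical stand-in for a task process that exits with the given code.
    return [sys.executable, "-c", "import sys; sys.exit(%d)" % code]
```

A caller such as an executor could then treat any non-zero return value as a failed task rather than trusting a hard-coded success.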
[jira] [Updated] (AIRFLOW-1302) LocalTaskJob should propagate return code to executor
[ https://issues.apache.org/jira/browse/AIRFLOW-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1302: Priority: Major (was: Critical) > LocalTaskJob should propagate return code to executor > - > > Key: AIRFLOW-1302 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1302 > Project: Apache Airflow > Issue Type: Bug > Components: executor >Affects Versions: 1.8.1 >Reporter: Bolke de Bruin > Fix For: 1.8.3 > > > The LocalTaskJob monitors execution of tasks, but it always reports success > (exit code 0) when tasks have finished regardless of what the task or OS is > reporting. > This makes any monitoring by the executor not relevant. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (AIRFLOW-1302) LocalTaskJob should propagate return code to executor
[ https://issues.apache.org/jira/browse/AIRFLOW-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1302: Fix Version/s: (was: 1.8.2) 1.8.3 > LocalTaskJob should propagate return code to executor > - > > Key: AIRFLOW-1302 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1302 > Project: Apache Airflow > Issue Type: Bug > Components: executor >Affects Versions: 1.8.1 >Reporter: Bolke de Bruin > Fix For: 1.8.3 > > > The LocalTaskJob monitors execution of tasks, but it always reports success > (exit code 0) when tasks have finished regardless of what the task or OS is > reporting. > This makes any monitoring by the executor not relevant. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (AIRFLOW-1296) DAGs using operators involving cascading skipped tasks fail prematurely
[ https://issues.apache.org/jira/browse/AIRFLOW-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1296: Fix Version/s: 1.8.2 > DAGs using operators involving cascading skipped tasks fail prematurely > --- > > Key: AIRFLOW-1296 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1296 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.8.1 >Reporter: Daniel Huang >Priority: Blocker > Fix For: 1.8.2 > > > So this is basically the same issue as AIRFLOW-872 and AIRFLOW-719. A > workaround had fixed this > (https://github.com/apache/incubator-airflow/pull/2125), but was later > reverted (https://github.com/apache/incubator-airflow/pull/2195). I totally > agree with the reason for reverting, but I still think this is an issue. > The issue is related to any operators that involve cascading skipped tasks, > like ShortCircuitOperator or LatestOnlyOperator. These operators mark only > their *direct* downstream task as SKIPPED, but additional downstream tasks > from that skipped task are left up to the scheduler to cascade the SKIPPED > state (see latest only op docs about this expected behavior > https://airflow.incubator.apache.org/concepts.html#latest-run-only). Instead, > the scheduler marks the DAG run as FAILED prematurely, before the DAG > has a chance to skip all downstream tasks. > This example DAG should reproduce the issue: > https://gist.github.com/dhuang/61d38fb001c3a917edf4817bb0c915f9. > Expected result: DAG succeeds with tasks - latest_only (success) -> dummy1 > (skipped) -> dummy2 (skipped) -> dummy3 (skipped) > Actual result: DAG fails with tasks - latest_only (success) -> dummy1 > (skipped) -> dummy2 (none) -> dummy3 (none) > I believe the results I'm seeing are because of this deadlock prevention > logic, > https://github.com/apache/incubator-airflow/blob/1.8.1/airflow/models.py#L4182. 
> While the actual result shown above _could_ mean a deadlock, in this case > it shouldn't be treated as one. Since this {{update_state}} logic is reached first in each > scheduler run, dummy2/dummy3 don't get a chance to cascade the SKIPPED state. > Commenting out that block gives me the results I expect. > [~bolke] I know you spent a while trying to reproduce my issue and weren't > able to, but I'm still hitting this on a fresh environment, default configs, > sqlite/mysql dbs, local/sequential/celery executors, and 1.8.1/master. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
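The expected cascade described in AIRFLOW-1296 can be modelled in a few lines. This is a toy model, assuming that a task with no state is skipped as soon as any direct upstream task is skipped; `cascade_skipped` and the dictionaries below are hypothetical and are not the scheduler's code.

```python
SUCCESS, SKIPPED = "success", "skipped"


def cascade_skipped(upstream, states):
    """Propagate SKIPPED to tasks with no state whose upstream was skipped.

    upstream maps task id -> list of direct upstream task ids.
    states maps task id -> state (None means not yet decided).
    A new dict is returned; the input is left untouched.
    """
    states = dict(states)
    changed = True
    # Repeat until a full pass changes nothing, so skips travel down the chain.
    while changed:
        changed = False
        for task, parents in upstream.items():
            undecided = states.get(task) is None
            if undecided and any(states.get(p) == SKIPPED for p in parents):
                states[task] = SKIPPED
                changed = True
    return states


# The DAG from the report: latest_only -> dummy1 -> dummy2 -> dummy3
upstream = {
    "latest_only": [],
    "dummy1": ["latest_only"],
    "dummy2": ["dummy1"],
    "dummy3": ["dummy2"],
}
states = {"latest_only": SUCCESS, "dummy1": SKIPPED, "dummy2": None, "dummy3": None}
result = cascade_skipped(upstream, states)
```

In this model the run only reaches the expected end state if the cascade is allowed to finish before the run is judged, which is the ordering problem the report points at.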
[jira] [Updated] (AIRFLOW-1296) DAGs using operators involving cascading skipped tasks fail prematurely
[ https://issues.apache.org/jira/browse/AIRFLOW-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1296: Affects Version/s: 1.8.1 > DAGs using operators involving cascading skipped tasks fail prematurely > --- > > Key: AIRFLOW-1296 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1296 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.8.1 >Reporter: Daniel Huang >Priority: Blocker > Fix For: 1.8.2 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (AIRFLOW-1296) DAGs using operators involving cascading skipped tasks fail prematurely
[ https://issues.apache.org/jira/browse/AIRFLOW-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1296: Priority: Blocker (was: Major) > DAGs using operators involving cascading skipped tasks fail prematurely > --- > > Key: AIRFLOW-1296 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1296 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.8.1 >Reporter: Daniel Huang >Priority: Blocker > Fix For: 1.8.2 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (AIRFLOW-1296) DAGs using operators involving cascading skipped tasks fail prematurely
[ https://issues.apache.org/jira/browse/AIRFLOW-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-1296 started by Bolke de Bruin. --- > DAGs using operators involving cascading skipped tasks fail prematurely > --- > > Key: AIRFLOW-1296 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1296 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.8.1 >Reporter: Daniel Huang >Assignee: Bolke de Bruin >Priority: Blocker > Fix For: 1.8.2 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (AIRFLOW-1308) Dask's nanny is deprecated and results in errors on travis
Bolke de Bruin created AIRFLOW-1308: --- Summary: Dask's nanny is deprecated and results in errors on travis Key: AIRFLOW-1308 URL: https://issues.apache.org/jira/browse/AIRFLOW-1308 Project: Apache Airflow Issue Type: Bug Reporter: Bolke de Bruin -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (AIRFLOW-1308) Dask's nanny is deprecated and results in errors on travis
[ https://issues.apache.org/jira/browse/AIRFLOW-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1308: Component/s: travis > Dask's nanny is deprecated and results in errors on travis > -- > > Key: AIRFLOW-1308 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1308 > Project: Apache Airflow > Issue Type: Bug > Components: travis >Affects Versions: 1.8.1 >Reporter: Bolke de Bruin > Fix For: 1.8.2 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (AIRFLOW-1308) Dask's nanny is deprecated and results in errors on travis
[ https://issues.apache.org/jira/browse/AIRFLOW-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1308: Fix Version/s: 1.8.2 > Dask's nanny is deprecated and results in errors on travis > -- > > Key: AIRFLOW-1308 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1308 > Project: Apache Airflow > Issue Type: Bug > Components: travis >Affects Versions: 1.8.1 >Reporter: Bolke de Bruin > Fix For: 1.8.2 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (AIRFLOW-1308) Dask's nanny is deprecated and results in errors on travis
[ https://issues.apache.org/jira/browse/AIRFLOW-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-1308: Affects Version/s: 1.8.1 > Dask's nanny is deprecated and results in errors on travis > -- > > Key: AIRFLOW-1308 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1308 > Project: Apache Airflow > Issue Type: Bug > Components: travis >Affects Versions: 1.8.1 >Reporter: Bolke de Bruin > Fix For: 1.8.2 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (AIRFLOW-1308) Dask's nanny is deprecated and results in errors on travis
[ https://issues.apache.org/jira/browse/AIRFLOW-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1308. - Resolution: Fixed Issue resolved by pull request #2366 [https://github.com/apache/incubator-airflow/pull/2366] > Dask's nanny is deprecated and results in errors on travis > -- > > Key: AIRFLOW-1308 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1308 > Project: Apache Airflow > Issue Type: Bug > Components: travis >Affects Versions: 1.8.1 >Reporter: Bolke de Bruin > Fix For: 1.8.2 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (AIRFLOW-860) Circular module dependency prevents loading of custom executor plugin
[ https://issues.apache.org/jira/browse/AIRFLOW-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-860: --- Fix Version/s: 1.8.,2 > Circular module dependency prevents loading of custom executor plugin > - > > Key: AIRFLOW-860 > URL: https://issues.apache.org/jira/browse/AIRFLOW-860 > Project: Apache Airflow > Issue Type: Bug > Components: executor, plugins >Affects Versions: Airflow 1.8 > Environment: Linux RHEL 7.6, Python 2.7.13 >Reporter: Nick Ivanov > Fix For: 1.9.0, 1.8.,2 > > > Custom plugins cannot be loaded, which prevents airflow from running, due to > apparent cyclic dependency in plugins_manager called in > {{executors.\_\_init\_\_}} -- the top-level {{\_\_init\_\_}} attempts to load > the default executor, which then goes back to plugins_manager etc. > {noformat} > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/EGG-INFO/scripts/airflow", > line 17, in > from airflow import configuration > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/__init__.py", > line 31, in > from airflow.models import DAG > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/models.py", > line 62, in > from airflow.executors import DEFAULT_EXECUTOR, LocalExecutor > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/executors/__init__.py", > line 55, in > raise AirflowException("Executor {0} not supported.".format(_EXECUTOR)) > airflow.exceptions.AirflowException: Executor TestExecutor not supported. 
> {noformat} > While attempting {{from airflow.plugins_manager import executors_modules}} > cycles right back where it came from: > {noformat} > >>> from airflow.plugins_manager import executor_modules > Traceback (most recent call last): > File "", line 1, in > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/__init__.py", > line 31, in > from airflow.models import DAG > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/models.py", > line 62, in > from airflow.executors import DEFAULT_EXECUTOR, LocalExecutor > File > "/usr/lib/python2.7/site-packages/airflow-1.9.0dev0_apache.incubating-py2.7.egg/airflow/executors/__init__.py", > line 55, in > raise AirflowException("Executor {0} not supported.".format(_EXECUTOR)) > airflow.exceptions.AirflowException: Executor LSFExecutor not supported. > {noformat} > To reproduce: > # Create a custom executor, e.g. by copying {{local_executor.py}} into > {{$AIRFLOW_HOME/plugins/test_executor.py}} and replacing occurrences of > "Local" with "Test" in the code, which will produce {{TestExecutor}} > # Update {{$AIRFLOW_HOME/airflow.cfg}} by setting in the {{\[core\]}} > section: {{executor = TestExecutor}} > # Run any airflow command, e.g. "airflow version" -- This message was sent by Atlassian JIRA (v6.4.14#64029)
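One general way to avoid this kind of import-time cycle is to resolve the configured executor lazily, on first use, instead of while the package is being imported. The sketch below shows only that general technique with a hypothetical registry; `register_executor`, `get_executor` and `MyPluginExecutor` are made-up names and this is not presented as the change that closed AIRFLOW-860.

```python
_EXECUTORS = {}  # name -> executor class, filled by core code and by plugins


def register_executor(cls):
    """Record an executor class under its class name and return it unchanged."""
    _EXECUTORS[cls.__name__] = cls
    return cls


def get_executor(name):
    """Look the executor up at call time, after plugins had a chance to register."""
    try:
        return _EXECUTORS[name]()
    except KeyError:
        raise ValueError("Executor {0} not supported.".format(name))


@register_executor
class SequentialExecutor(object):
    """Stand-in for a built-in executor."""


# A plugin module loaded later registers its own executor (hypothetical name).
@register_executor
class MyPluginExecutor(object):
    """Stand-in for a custom executor defined in a plugin."""
```

Because nothing is looked up until `get_executor` is called, the order in which the core package and the plugin modules are imported no longer decides whether the configured name can be found.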
[jira] [Updated] (AIRFLOW-860) Circular module dependency prevents loading of custom executor plugin
[ https://issues.apache.org/jira/browse/AIRFLOW-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-860: --- Fix Version/s: (was: 1.8.,2) 1.8.2 > Circular module dependency prevents loading of custom executor plugin > - > > Key: AIRFLOW-860 > URL: https://issues.apache.org/jira/browse/AIRFLOW-860 > Project: Apache Airflow > Issue Type: Bug > Components: executor, plugins >Affects Versions: Airflow 1.8 > Environment: Linux RHEL 7.6, Python 2.7.13 >Reporter: Nick Ivanov > Fix For: 1.9.0, 1.8.2 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (AIRFLOW-935) Impossible to use plugin executors
[ https://issues.apache.org/jira/browse/AIRFLOW-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-935: --- Fix Version/s: 1.8.2 > Impossible to use plugin executors > -- > > Key: AIRFLOW-935 > URL: https://issues.apache.org/jira/browse/AIRFLOW-935 > Project: Apache Airflow > Issue Type: Bug > Components: executor, plugins >Affects Versions: Airflow 1.8, 1.9.0 >Reporter: Stijn Verhaegen > Fix For: 1.9.0, 1.8.2 > > > revision eb5982d (included in 1.8) breaks plugin executors altogether. It > makes a new module for every plugin, so import statements need to be adapted, > but the executor selection is left unchanged, so it ends up assigning the > plugin module as an executor. > To reproduce: take any plugin which defines a custom executor and try to get > it loaded by setting `executor` in the airflow.cfg -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (AIRFLOW-1296) DAGs using operators involving cascading skipped tasks fail prematurely
[ https://issues.apache.org/jira/browse/AIRFLOW-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1296. - Resolution: Fixed Issue resolved by pull request #2365 [https://github.com/apache/incubator-airflow/pull/2365] > DAGs using operators involving cascading skipped tasks fail prematurely > --- > > Key: AIRFLOW-1296 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1296 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.8.1 >Reporter: Daniel Huang >Assignee: Bolke de Bruin >Priority: Blocker > Fix For: 1.8.2 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (AIRFLOW-1275) Fix `airflow pool` command exception
[ https://issues.apache.org/jira/browse/AIRFLOW-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1275. - Resolution: Fixed Issue resolved by pull request #2346 [https://github.com/apache/incubator-airflow/pull/2346] > Fix `airflow pool` command exception > > > Key: AIRFLOW-1275 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1275 > Project: Apache Airflow > Issue Type: Bug > Components: cli >Affects Versions: 1.8.1 >Reporter: Stanislav Kudriashev >Assignee: Stanislav Kudriashev > Fix For: 1.9.0 > > > {code} > $ airflow pool > [2017-06-04 19:07:49,805] {driver.py:120} INFO - Generating grammar tables > from /usr/lib/python2.7/lib2to3/Grammar.txt > [2017-06-04 19:07:49,821] {driver.py:120} INFO - Generating grammar tables > from /usr/lib/python2.7/lib2to3/PatternGrammar.txt > [2017-06-04 19:07:49,896] {__init__.py:45} INFO - Using executor > SequentialExecutor > Traceback (most recent call last): > File "/home/stas/Work/incubator-airflow/.venv/bin/airflow", line 6, in > > exec(compile(open(__file__).read(), __file__, 'exec')) > File "/home/stas/Work/incubator-airflow/airflow/bin/airflow", line 28, in > > args.func(args) > File "/home/stas/Work/incubator-airflow/airflow/bin/cli.py", line 195, in > pool > .filter(Pool.pool == name) > UnboundLocalError: local variable 'name' referenced before assignment > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
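The traceback above is the usual symptom of a variable that is bound in only one branch and then used unconditionally. A minimal reproduction of that pattern and one possible guard follows; `POOLS`, `pool_buggy` and `pool_fixed` are hypothetical stand-ins and are not the code from `cli.py` or from pull request #2346.

```python
POOLS = {"default": 128, "etl": 4}  # hypothetical in-memory stand-in for the pool table


def pool_buggy(get=None):
    if get:
        name = get
    # Bug pattern: `name` is only bound inside the branch above.
    return {name: POOLS[name]}


def pool_fixed(get=None):
    if get:
        return {get: POOLS[get]}
    # No pool requested: list every pool instead of filtering on an unbound name.
    return dict(POOLS)
```

Calling `pool_buggy()` with no argument raises `UnboundLocalError`, mirroring `airflow pool` run without options, while `pool_fixed()` falls back to listing all pools.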
[jira] [Updated] (AIRFLOW-809) SqlAlchemy is_ ColumnOperator Causing Errors in MSSQL
[ https://issues.apache.org/jira/browse/AIRFLOW-809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-809: --- Fix Version/s: 1.8.2 > SqlAlchemy is_ ColumnOperator Causing Errors in MSSQL > - > > Key: AIRFLOW-809 > URL: https://issues.apache.org/jira/browse/AIRFLOW-809 > Project: Apache Airflow > Issue Type: Bug > Components: models, scheduler > Environment: Distributor ID: Ubuntu > Description: Ubuntu 14.04.5 LTS > Release: 14.04 > Codename: trusty > Backend: MSSQL > Executor: LocalExecutor >Reporter: Chad T Henderson >Assignee: Chad T Henderson > Fix For: 1.9.0, 1.8.2 > > > Starting the scheduler produces the following error, which I believe is > caused by the use of the is_ ColumnOperator when checking the value of the > external_trigger field on the DagRun object. > Error Text: > sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] > [FreeTDS][SQL Server]Incorrect syntax near '0'. (102) (SQLExecDirectW)") > [SQL: 'SELECT TOP 1 dag_run.state AS dag_run_state, dag_run.id AS dag_run_id, > dag_run.dag_id AS dag_run_dag_id, dag_run.execution_date AS > dag_run_execution_date, dag_run.start_date AS dag_run_start_date, > dag_run.end_date AS dag_run_end_date, dag_run.run_id AS dag_run_run_id, > dag_run.external_trigger AS dag_run_external_trigger, dag_run.conf AS > dag_run_conf \nFROM dag_run \nWHERE dag_run.dag_id = ? AND > dag_run.external_trigger IS 0 ORDER BY dag_run.execution_date DESC'] > [parameters: ('append_to_history',)] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-504) datetime.now() has microseconds Mysql does not store these resulting in issues
[ https://issues.apache.org/jira/browse/AIRFLOW-504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067201#comment-16067201 ] Bolke de Bruin commented on AIRFLOW-504: It is in updating.md > datetime.now() has microseconds Mysql does not store these resulting in issues > -- > > Key: AIRFLOW-504 > URL: https://issues.apache.org/jira/browse/AIRFLOW-504 > Project: Apache Airflow > Issue Type: Bug >Reporter: Bolke de Bruin >Priority: Critical > Fix For: 1.8.0 > > > >>> datetime.datetime.utcnow() > datetime.datetime(2016, 9, 11, 12, 22, 22, 483998) > >>> datetime.datetime.now() > datetime.datetime(2016, 9, 11, 14, 22, 31, 363392) > >>> > Mysql rounds those to "2016-05-13 00:00:00". If sqlalchemy now tries to find > the record by primary key (e.g. session.merge) it cannot find it due to > lost precision. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
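The precision problem described above can be illustrated without a database. In the sketch below, `truncate_microseconds` is a hypothetical helper and `stored` merely stands in for what a seconds-only column would hand back; real MySQL behaviour depends on the server version and column definition, and the issue does not say how the fix was implemented.

```python
import datetime


def truncate_microseconds(dt):
    # Drop sub-second precision so the value survives a round trip
    # through a column that keeps whole seconds only.
    return dt.replace(microsecond=0)


# Value taken from the utcnow() output quoted in the issue
now = datetime.datetime(2016, 9, 11, 12, 22, 22, 483998)

# Stand-in for the value a seconds-only column would return
stored = truncate_microseconds(now)

# Comparing the in-memory value with the stored one fails ...
lookup_with_micros = (now == stored)
# ... while a key truncated before use matches what was stored.
lookup_truncated = (truncate_microseconds(now) == stored)
```

A lookup by primary key behaves like the first comparison when the in-memory key still carries microseconds, which is why the record appears to be missing.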
[jira] [Closed] (AIRFLOW-504) datetime.now() has microseconds Mysql does not store these resulting in issues
[ https://issues.apache.org/jira/browse/AIRFLOW-504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin closed AIRFLOW-504. -- Resolution: Fixed Fix Version/s: 1.8.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (AIRFLOW-1388) Add Google Cloud ML operators to integration doc
[ https://issues.apache.org/jira/browse/AIRFLOW-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1388. - Resolution: Fixed Fix Version/s: 1.8.3 Issue resolved by pull request #2425 [https://github.com/apache/incubator-airflow/pull/2425] > Add Google Cloud ML operators to integration doc > > > Key: AIRFLOW-1388 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1388 > Project: Apache Airflow > Issue Type: Improvement > Components: Documentation >Reporter: Ming Zhong >Assignee: Ming Zhong >Priority: Trivial > Fix For: 1.8.3 > > > There has been some work adding Google Cloud ML operators, but they are not > mentioned in the integration docs. > We need to add the new CloudML operators (batch prediction, cloudml training, > model/version management) to the integration docs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (AIRFLOW-1387) Logging causes UnicodeEncodeError on wget.
[ https://issues.apache.org/jira/browse/AIRFLOW-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1387. - Resolution: Fixed Fix Version/s: 1.8.3 > Logging causes UnicodeEncodeError on wget. > -- > > Key: AIRFLOW-1387 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1387 > Project: Apache Airflow > Issue Type: Bug > Components: logging, operators, scheduler >Affects Versions: Airflow 1.8 > Environment: Ubuntu 16.04 vagrant machine, python 2.7 >Reporter: Artiom >Assignee: Artiom > Fix For: 1.8.3 > > > Encountered an issue breaking my DAGs after switching from 1.7.1.3 > The problem appears on my vagrant default ubuntu 16.04 machine. The output of > locale command. > LANG=en_US.UTF-8 > LANGUAGE=en_US: > LC_CTYPE="en_US.UTF-8" > LC_NUMERIC="en_US.UTF-8" > LC_TIME="en_US.UTF-8" > LC_COLLATE="en_US.UTF-8" > LC_MONETARY="en_US.UTF-8" > LC_MESSAGES="en_US.UTF-8" > LC_PAPER="en_US.UTF-8" > LC_NAME="en_US.UTF-8" > LC_ADDRESS="en_US.UTF-8" > LC_TELEPHONE="en_US.UTF-8" > LC_MEASUREMENT="en_US.UTF-8" > LC_IDENTIFICATION="en_US.UTF-8" > LC_ALL= > To replicate I created a DAG with single bash operator task that runs > 'download.sh' > The code for download.sh is pretty simple: > {code:java} > wget ftp://anonymous:gu...@ftp.debian.org/debian/README.mirrors.txt > {code} > It breaks on the first backquote. 
> {code:java} > Jul 05 15:27:00 vagrant airflow[29929]: Exception in thread Thread-1: > Jul 05 15:27:00 vagrant airflow[29929]: Traceback (most recent call last): > Jul 05 15:27:00 vagrant airflow[29929]: File > "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner > Jul 05 15:27:00 vagrant airflow[29929]: self.run() > Jul 05 15:27:00 vagrant airflow[29929]: File > "/usr/lib/python2.7/threading.py", line 754, in run > Jul 05 15:27:00 vagrant airflow[29929]: self._target(*self.args, > **self._kwargs) > Jul 05 15:27:00 vagrant airflow[29929]: File > "/var/lib/airflow/python2.7/site-packages/airflow/task_runner/base_task_runner.py", > line 95, in _read_task_logs > Jul 05 15:27:00 vagrant airflow[29929]: self.logger.info('Subtask: > {}'.format(line.rstrip('\n'))) > Jul 05 15:27:00 vagrant airflow[29929]: UnicodeEncodeError: 'ascii' codec > can't encode character u'\u2018' in position 58: ordinal not in range(128) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
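The traceback above shows the 'ascii' codec failing on u'\u2018', a curly quote. The sketch below reproduces that failure and shows one way around it, encoding explicitly as UTF-8. The sample line is modeled on wget output and is an assumption, as is the helper name `to_log_bytes`; this is not the fix that was merged.

```python
# Sample subprocess output containing curly quotes (assumed, modeled on wget)
line = u"Saving to: \u2018README.mirrors.txt\u2019\n"


def ascii_encodable(text):
    # True when the text survives an implicit ASCII encode
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def to_log_bytes(text):
    # Encode explicitly as UTF-8 instead of relying on the ASCII default
    return u"Subtask: {}".format(text.rstrip(u"\n")).encode("utf-8")
```

Here `ascii_encodable(line)` is false, which mirrors the error in the log, while `to_log_bytes(line)` succeeds and keeps the quote characters intact.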
[jira] [Resolved] (AIRFLOW-1382) DockerOperator should allow working directory to be specified
[ https://issues.apache.org/jira/browse/AIRFLOW-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1382. - Resolution: Fixed Fix Version/s: 1.8.3 Issue resolved by pull request #2419 [https://github.com/apache/incubator-airflow/pull/2419] > DockerOperator should allow working directory to be specified > - > > Key: AIRFLOW-1382 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1382 > Project: Apache Airflow > Issue Type: Improvement > Components: docker, operators >Affects Versions: 1.9.0 >Reporter: Benjamin Sims >Priority: Minor > Fix For: 1.8.3 > > Original Estimate: 1h > Remaining Estimate: 1h > > Docker allows the user to specify a working directory within a container, > using the -w option (see: > https://docs.docker.com/engine/reference/commandline/create/). > The DockerOperator does not currently support this; it would be helpful if it > were a parameter. Will submit a PR shortly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (AIRFLOW-1357) Scheduler does not support zip files
[ https://issues.apache.org/jira/browse/AIRFLOW-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1357. - Resolution: Fixed Fix Version/s: 1.8.3 Issue resolved by pull request #2406 [https://github.com/apache/incubator-airflow/pull/2406] > Scheduler does not support zip files > > > Key: AIRFLOW-1357 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1357 > Project: Apache Airflow > Issue Type: Bug > Components: scheduler >Affects Versions: 1.8.1 >Reporter: Ultrabug >Priority: Critical > Fix For: 1.8.3 > > Attachments: fix_scheduler_zip_file_support.patch > > > Users from 1.7 series of airflow using zipped DAGs are unable to migrate to > 1.8+ series because the scheduler does not detect/run them. > The zipfile support is still present in the models.py file so the zipped DAGs > do show up on `airflow list_dags` but they are never executed by the > scheduler since it only cares for .py files. > I'm attaching the fix for this, will also open a GH PR. > Tested ok & in production here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (AIRFLOW-1184) Contrib Spark Submit Hook does not split argument and argument value
[ https://issues.apache.org/jira/browse/AIRFLOW-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1184. - Resolution: Fixed Fix Version/s: (was: Airflow 1.8) (was: Airflow 2.0) 1.8.3 Issue resolved by pull request #2382 [https://github.com/apache/incubator-airflow/pull/2382] > Contrib Spark Submit Hook does not split argument and argument value > > > Key: AIRFLOW-1184 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1184 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, hooks >Affects Versions: Airflow 2.0, Airflow 1.8 >Reporter: Vianney FOUCAULT >Assignee: Vianney FOUCAULT > Fix For: 1.8.3 > > > Python's Popen expects the command as a list, and so does spark-submit, because: > * ['--option value'] > is not the same as > * ['--option', 'value'] > as far as Spark is concerned, e.g. in the Spark logs (yarn logs): > Error: Unknown option --end 2017-05-08 > Error: Unknown option --begin 2017-05-07 > Error: Unknown option --db_name mydb > Error: Missing option --begin > Error: Missing option --end > Error: Missing option --db_name -- This message was sent by Atlassian JIRA (v6.4.14#64029)
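The difference between the two list forms in the report can be observed directly by asking a child process what it received. The sketch below uses a small Python child as a stand-in for spark-submit; `child_argv` is a hypothetical helper.

```python
import json
import subprocess
import sys


def child_argv(args):
    # Start a child Python process and return the argument list it saw.
    echo = "import json, sys; print(json.dumps(sys.argv[1:]))"
    out = subprocess.check_output([sys.executable, "-c", echo] + list(args))
    return json.loads(out.decode("utf-8"))


# One list element stays one argument, embedded space included
one_arg = child_argv(["--option value"])
# Two list elements arrive as two arguments
two_args = child_argv(["--option", "value"])
```

Because no shell is involved, the child sees a single argument named `--option value` in the first case, which matches the "Unknown option --end 2017-05-08" style of error quoted in the report.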
[jira] [Resolved] (AIRFLOW-1326) Contrib Spark Submit Hook should be able to produce arguments containing spaces.
[ https://issues.apache.org/jira/browse/AIRFLOW-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1326. - Resolution: Fixed Fix Version/s: (was: Airflow 1.8) (was: Airflow 2.0) 1.8.3 Issue resolved by pull request #2382 [https://github.com/apache/incubator-airflow/pull/2382] > Contrib Spark Submit Hook should be able to produce arguments containing > spaces. > > > Key: AIRFLOW-1326 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1326 > Project: Apache Airflow > Issue Type: Bug > Components: contrib, hooks >Affects Versions: Airflow 2.0, Airflow 1.8 >Reporter: Ash Berlin-Taylor >Assignee: Vianney FOUCAULT > Fix For: 1.8.3 > > > AIRFLOW-1184 introduced a bug making it impossible to include an argument > with an embedded space in the SparkSubmitOperator. (I believe the original > bug report should have been closed as wontfix/not-a-bug as it was user error > in how the SparkSubmitOperator is being called) > i.e. the correct way to achieve what was wanted in the bug is like this: > {code} > 'application_args': [ > '-f', ' foo', > '--bar', 'bar', > '--start', '{{ macros.ds_add(ds, -1)}}', > '--end', '{{ ds }}' > ] > {code} > This was possibly compounded as the tests were showing something that would > never "work" -- This message was sent by Atlassian JIRA (v6.4.14#64029)
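The report does not show the offending code, so the following is only an illustration under the assumption that the regression came from splitting each argument on whitespace. Both helper names are hypothetical.

```python
def naive_flatten(application_args):
    # Assumed faulty behaviour: split every entry on whitespace
    flat = []
    for arg in application_args:
        flat.extend(arg.split())
    return flat


def passthrough(application_args):
    # Keep each list element as exactly one argument
    return list(application_args)


args = ["-f", " foo", "--title", "two words"]
```

With `naive_flatten(args)` the leading space in `' foo'` is lost and `'two words'` becomes two arguments, whereas `passthrough(args)` hands the list to the subprocess unchanged, which is what the `application_args` example in the report relies on.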
[jira] [Resolved] (AIRFLOW-1255) SparkSubmitOperator logs do not stream correctly
[ https://issues.apache.org/jira/browse/AIRFLOW-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1255. - Resolution: Fixed Fix Version/s: 1.8.3 Issue resolved by pull request #2438 [https://github.com/apache/incubator-airflow/pull/2438] > SparkSubmitOperator logs do not stream correctly > > > Key: AIRFLOW-1255 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1255 > Project: Apache Airflow > Issue Type: Bug > Components: hooks, operators >Affects Versions: Airflow 1.8 > Environment: Spark 1.6.0 with Yarn cluster > Airflow 1.8 >Reporter: Himanshu Jain >Priority: Minor > Labels: easyfix > Fix For: 1.8.3 > > > Logging in SparkSubmitOperator does not work as intended (continuous logging > as received in the subprocess). This is because, spark-submit internally > redirects all logs to stdout (including stderr), which causes the current two > iterator logging to get stuck with empty stderr pipe. The logs are written > only when the subprocess finishes. This leads to yarn_application_id not > being available until the end of application. > Specifically, > {code:title= spark_submit_hook.py (lines 217-220)|borderStyle=solid} > self._sp = subprocess.Popen(spark_submit_cmd, > stdout=subprocess.PIPE, > stderr=subprocess.PIPE, > **kwargs) > {code} > needs to be changed to > {code:title= spark_submit_hook.py|borderStyle=solid} > self._sp = subprocess.Popen(spark_submit_cmd, > stdout=subprocess.PIPE, > **kwargs) > {code} > with subsequent changes in the following lines. > I have not tested whether the issue exists with spark 2 versions as well or > not. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
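The report proposes dropping `stderr=subprocess.PIPE` so that a single pipe is read. A closely related variant, sketched below, merges stderr into stdout with `stderr=subprocess.STDOUT` and yields lines as they arrive. This is not the code from the pull request; `stream_lines` is a hypothetical helper and the child command is a stand-in for spark-submit.

```python
import subprocess
import sys


def stream_lines(cmd):
    # Merge stderr into stdout so only one pipe has to be drained
    proc = subprocess.Popen(cmd,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    # readline returns b"" at end of stream, which stops the iterator
    for raw in iter(proc.stdout.readline, b""):
        yield raw.decode("utf-8", "replace").rstrip()
    proc.stdout.close()
    proc.wait()


# Child that writes one line to each stream
child = [sys.executable, "-c",
         "import sys; print('out'); sys.stderr.write('err\\n')"]
collected = list(stream_lines(child))
```

With a single merged pipe the reader cannot block on an empty stderr while stdout fills up, so values such as a YARN application id can be parsed while the job is still running. The relative order of the two lines depends on buffering in the child.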
[jira] [Updated] (AIRFLOW-966) Celery Broker Transport Options
[ https://issues.apache.org/jira/browse/AIRFLOW-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-966: --- Priority: Major (was: Minor) > Celery Broker Transport Options > --- > > Key: AIRFLOW-966 > URL: https://issues.apache.org/jira/browse/AIRFLOW-966 > Project: Apache Airflow > Issue Type: Improvement > Components: celery, executor >Reporter: Desiree Cox >Assignee: Desiree Cox > Fix For: 1.8.3 > > > To use the CeleryExecutor with a Redis Sentinel broker, Celery must be > provided with a dictionary of broker_transport_options with a key/value pair > telling it the name of the desired redis service. There should be a > broker_transport_options in the Celery section of airflow.cfg and > CeleryExecutor should pass the options into the Celery worker configuration. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
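As a sketch of what the request above could look like, the snippet below reads a `broker_transport_options` value from a config section and turns it into the dictionary Celery expects. The JSON-in-INI layout, the sample values and the helper name are all assumptions made for illustration; only the `master_name` key for Sentinel comes from Celery's own documentation.

```python
import configparser
import json

# Hypothetical airflow.cfg fragment; the JSON value format is an assumption
SAMPLE_CFG = """
[celery]
broker_url = sentinel://localhost:26379
broker_transport_options = {"master_name": "mymaster"}
"""


def load_broker_transport_options(cfg_text):
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Fall back to an empty dict when the option is not set
    raw = parser.get("celery", "broker_transport_options", fallback="{}")
    return json.loads(raw)


options = load_broker_transport_options(SAMPLE_CFG)
```

The resulting dictionary is what an executor would then pass on as Celery's `broker_transport_options` setting.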
[jira] [Updated] (AIRFLOW-966) Celery Broker Transport Options
[ https://issues.apache.org/jira/browse/AIRFLOW-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-966: --- Fix Version/s: 1.8.3 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (AIRFLOW-1393) Enable Python3 tests on contrib Spark Submit Hook
[ https://issues.apache.org/jira/browse/AIRFLOW-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin resolved AIRFLOW-1393. - Resolution: Fixed Fix Version/s: 1.8.3 Issue resolved by pull request #2427 [https://github.com/apache/incubator-airflow/pull/2427] > Enable Python3 tests on contrib Spark Submit Hook > - > > Key: AIRFLOW-1393 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1393 > Project: Apache Airflow > Issue Type: Bug > Components: tests >Reporter: Ash Berlin-Taylor >Priority: Minor > Fix For: 1.8.3 > > > The unit tests in {{tests/contrib/hooks/test_spark_submit_hook.py}} currently > skip if run in Python3 because some test cases loop forever due to a > mismatch/misunderstanding about bytes vs string (i.e. the mocked data for > {{subprocess.Popen}} doesn't behave the same as actually running Popen) > The fix is to use bytes and {{six.BytesIO}} so that the tests work on Py2 and > Py3. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
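The endless loop mentioned above is presumably of the following kind: a read loop that stops on an empty-string sentinel never stops when the stream yields bytes, because on Python 3 `b''` does not compare equal to `''`. The sketch uses `io.BytesIO` (which is what `six.BytesIO` resolves to on Python 3) and a hypothetical `read_lines` helper with a safety limit so the mismatch can be shown without hanging.

```python
import io


def read_lines(stream, sentinel, limit=10):
    # 'limit' guards against the endless loop that a wrong sentinel causes
    lines = []
    for line in iter(stream.readline, sentinel):
        lines.append(line)
        if len(lines) >= limit:
            break
    return lines


data = b"line 1\nline 2\n"

# Bytes sentinel: iteration stops at end of stream
with_bytes_sentinel = read_lines(io.BytesIO(data), b"")
# Str sentinel: on Python 3 the loop only ends because of the limit
with_str_sentinel = read_lines(io.BytesIO(data), "")
```

Mocking `Popen` output with a bytes stream and a bytes sentinel keeps the test data in line with what a real subprocess pipe returns on both Python 2 and Python 3.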
[jira] [Commented] (AIRFLOW-401) scheduler gets stuck without a trace
[ https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089422#comment-16089422 ] Bolke de Bruin commented on AIRFLOW-401: [~rotten] That is expected behaviour. The LocalExecutor runs in process thus the scheduler cannot restart until all tasks have finished. Run without restarts (which is the recommended behaviour) and the scheduler will just happily continue. [~grecco-gn] What version of rabbitmq (incl python libs) and celery did you use? > scheduler gets stuck without a trace > > > Key: AIRFLOW-401 > URL: https://issues.apache.org/jira/browse/AIRFLOW-401 > Project: Apache Airflow > Issue Type: Bug > Components: executor, scheduler >Affects Versions: Airflow 1.7.1.3 >Reporter: Nadeem Ahmed Nazeer >Assignee: Bolke de Bruin >Priority: Minor > Attachments: Dag_code.txt, schduler_cpu100%.png, > scheduler_stuck_7hours.png, scheduler_stuck.png > > > The scheduler gets stuck without a trace or error. When this happens, the CPU > usage of scheduler service is at 100%. No jobs get submitted and everything > comes to a halt. Looks like it goes into some kind of infinite loop. > The only way I could make it run again is by manually restarting the > scheduler service. But again, after running some tasks it gets stuck. I've > tried with both Celery and Local executors but the same issue occurs. I am using > the -n 3 parameter while starting scheduler. > Scheduler configs, > job_heartbeat_sec = 5 > scheduler_heartbeat_sec = 5 > executor = LocalExecutor > parallelism = 32 > Please help. I would be happy to provide any other information needed -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-401) scheduler gets stuck without a trace
[ https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089429#comment-16089429 ] Bolke de Bruin commented on AIRFLOW-401: [~sudow...@gmail.com] It is expected behaviour as the LocalExecutor runs _in process_. Use 1.8.1+ and do not use "-n 5" and you will not encounter the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (AIRFLOW-401) scheduler gets stuck without a trace
[ https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089429#comment-16089429 ] Bolke de Bruin edited comment on AIRFLOW-401 at 7/17/17 7:32 AM: - [~sudow...@gmail.com] It is expected behaviour as the LocalExecutor runs _in process_. Use 1.8.1+ and do not use "-n 5" and you will not encounter the issue with LocalExecutor. was (Author: bolke): [~sudow...@gmail.com] It is expected behaviour as the LocalExecutor runs _in process_. Use 1.8.1+ and do not use "-n 5" and you will not encounter the issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-401) scheduler gets stuck without a trace
[ https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089440#comment-16089440 ] Bolke de Bruin commented on AIRFLOW-401: [~darkk] You are probably hitting a celery/kombu bug that only got fixed very recently. A workaround is to use Redis or use CELERY_WORKER_PREFETCH_MULTIPLIER = 0 & -P solo . -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (AIRFLOW-401) scheduler gets stuck without a trace
[ https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089440#comment-16089440 ] Bolke de Bruin edited comment on AIRFLOW-401 at 7/17/17 7:48 AM: - [~darkk] You are probably hitting a celery/kombu bug that only got fixed very recently. A workaround is to use Redis or use CELERY_WORKER_PREFETCH_MULTIPLIER = 0 & -P solo . See https://github.com/celery/celery/issues/3712 was (Author: bolke): [~darkk] You are probably hitting a celery/kombu bug that only got fixed very recently. A workaround is to use Redis or use CELERY_WORKER_PREFETCH_MULTIPLIER = 0 & -P solo . -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (AIRFLOW-401) scheduler gets stuck without a trace
[ https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bolke de Bruin updated AIRFLOW-401: --- Labels: celery kombu (was: ) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-1414) Add support for retriggering dependent workflows
[ https://issues.apache.org/jira/browse/AIRFLOW-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089572#comment-16089572 ] Bolke de Bruin commented on AIRFLOW-1414: - I made some comments on the PR. The logic is sound, but some minor improvements wanted. I would raise it on the mailinglist as well, maybe someone solved it differently. > Add support for retriggering dependent workflows > > > Key: AIRFLOW-1414 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1414 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Raphael Lopez Kaufman > > Currently when using an ExternalTaskSensor sensor, to have a dag wait for the > completion of a task in another dag, there is no way, when clearing the task > that is depended on, to also clear the ExternalTaskSensor task (and its > downstream/upstream tasks). > However, that might be important when teams have separate responsibilities > and create multi-staged data pipelines. Let's say there is a team responsible > for transforming logs to make them available in hadoop land. They maintain a > hourly dag processing the raw logs. Other teams wait on the completion of > this dag using an ExternalTaskSensor. If the logging team realizes that some > logs where corrupted, not only do they have to rerun the task that processed > these logs but also all the dags that use these logs to generate higher level > data. > Introducing the notion of descendants can solve this problem. The descendants > of dag A are the dags that have an ExternalTaskSensor pointing to a task of > dag A. Then, when clearing a task in the view, an option can be added to also > clear the tasks of the descendants (having the upstream, downstream, > recursive, future and past flags act accordingly). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
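The notion of descendants described above amounts to a graph walk over "has an ExternalTaskSensor pointing at" edges. A minimal sketch follows, assuming the sensor relationships are already available as a plain mapping; the mapping, the DAG ids and the function name are hypothetical, and the recursive behaviour is modeled as a breadth-first traversal rather than taken from the pull request.

```python
from collections import deque


def descendants(dag_id, sensor_edges):
    # sensor_edges maps a DAG id to the DAG ids whose ExternalTaskSensor
    # waits on one of its tasks (direct descendants).
    found = []
    seen = {dag_id}
    queue = deque([dag_id])
    while queue:
        current = queue.popleft()
        for child in sensor_edges.get(current, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


# Hypothetical pipeline: two DAGs wait on raw_logs, one waits on sessions
edges = {
    "raw_logs": ["sessions", "billing"],
    "sessions": ["weekly_report"],
}
to_clear = descendants("raw_logs", edges)
```

Clearing a task in `raw_logs` with the proposed option would then also clear the sensor tasks in every DAG listed in `to_clear`; the `seen` set keeps the walk finite even if sensors form a cycle.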
[jira] [Commented] (AIRFLOW-401) scheduler gets stuck without a trace
[ https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089747#comment-16089747 ] Bolke de Bruin commented on AIRFLOW-401: Running continuously is the default behaviour (no "num_runs" or "-n X" on the command line) . I might have misinterpreted what you meant by "restarted" . -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AIRFLOW-401) scheduler gets stuck without a trace
[ https://issues.apache.org/jira/browse/AIRFLOW-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089806#comment-16089806 ] Bolke de Bruin commented on AIRFLOW-401: The scheduler does not restart itself just like that. That is a configuration thing. It does iterate over tasks in a loop. Please explain what you mean by "restarting". Does it get a new PID? Then you are running with a configuration that tells the scheduler to terminate and you have an external way to restart it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)