[jira] [Commented] (AIRFLOW-1033) TypeError: can't compare datetime.datetime to NoneType in prev_dagrun_dep.py
[ https://issues.apache.org/jira/browse/AIRFLOW-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939742#comment-15939742 ]

Ruslan Dautkhanov commented on AIRFLOW-1033:
--------------------------------------------

Might be related to AIRFLOW-1001

> TypeError: can't compare datetime.datetime to NoneType in prev_dagrun_dep.py
> ----------------------------------------------------------------------------
>
>                 Key: AIRFLOW-1033
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1033
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DagRun
>    Affects Versions: Airflow 1.8
>         Environment: CentOS 7; db: PostgreSQL 9.5; Python 2.7; installed via pip
>            Reporter: Bert Desmet
>            Priority: Critical
>              Labels: bug, interval
>         Attachments: test_dag.py, test_dag.py.log
>
> When starting a specific new DAG we get the following error:
>
> [2017-03-23 16:51:16,354] {jobs.py:354} DagFileProcessor908 ERROR - Got an exception! Propagating...
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/airflow/jobs.py", line 346, in helper
>     pickle_dags)
>   File "/usr/lib/python2.7/site-packages/airflow/utils/db.py", line 53, in wrapper
>     result = func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/airflow/jobs.py", line 1581, in process_file
>     self._process_dags(dagbag, dags, ti_keys_to_schedule)
>   File "/usr/lib/python2.7/site-packages/airflow/jobs.py", line 1174, in _process_dags
>     self._process_task_instances(dag, tis_out)
>   File "/usr/lib/python2.7/site-packages/airflow/jobs.py", line 905, in _process_task_instances
>     session=session):
>   File "/usr/lib/python2.7/site-packages/airflow/utils/db.py", line 53, in wrapper
>     result = func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/airflow/models.py", line 1116, in are_dependencies_met
>     session=session):
>   File "/usr/lib/python2.7/site-packages/airflow/models.py", line 1140, in get_failed_dep_statuses
>     dep_context):
>   File "/usr/lib/python2.7/site-packages/airflow/ti_deps/deps/base_ti_dep.py", line 94, in get_dep_statuses
>     for dep_status in self._get_dep_statuses(ti, session, dep_context):
>   File "/usr/lib/python2.7/site-packages/airflow/ti_deps/deps/prev_dagrun_dep.py", line 47, in _get_dep_statuses
>     if dag.previous_schedule(ti.execution_date) < ti.task.start_date:
> TypeError: can't compare datetime.datetime to NoneType
>
> I added some debug code to prev_dagrun_dep.py:
>
>     dag = ti.task.dag
>     print 'Start dates:'
>     print 'previous_execution_date: %s' % (dag.previous_schedule(ti.execution_date))
>     print 'current start date: %s' % (ti.task.start_date)
>     if dag.catchup:
>         if dag.previous_schedule(ti.execution_date) < ti.task.start_date:
>
> This is the output I get:
>
> Start dates:
> previous_execution_date: None
> current start date: 2017-03-19 00:00:00
>
> I think it is normal that previous_execution_date is None, since this is the first time the DAG is being run. But why is the start_date of the DAG relevant here, rather than the start date of the run? I suspect the cause is the 'schedule_interval', which is set to None. Please find an example DAG and its log file attached.
>
> Bert

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
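The crash happens because dag.previous_schedule() returns None when the DAG has schedule_interval=None, and datetime objects raise TypeError when compared with None. A minimal, hypothetical sketch (not the actual Airflow patch; the function name is invented) of the kind of guard that avoids the comparison:

```python
from datetime import datetime

def prev_run_not_required(previous_schedule, task_start_date):
    """Illustrative guard: the previous-dagrun dependency can be skipped
    when there is no previous schedule at all (e.g. schedule_interval=None,
    as in this report, or the very first run) or when the previous schedule
    falls before the task's start_date."""
    if previous_schedule is None:
        return True
    return previous_schedule < task_start_date

# The reporter's situation: no previous schedule, start_date 2017-03-19.
print(prev_run_not_required(None, datetime(2017, 3, 19)))  # True
```

With such a guard, the first run of an unscheduled DAG would skip the previous-run check instead of crashing the DagFileProcessor.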
[jira] [Commented] (AIRFLOW-1001) Landing Time shows "unsupported operand type(s) for -: 'datetime.datetime' and 'NoneType'" on example_subdag_operator
[ https://issues.apache.org/jira/browse/AIRFLOW-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939743#comment-15939743 ]

Ruslan Dautkhanov commented on AIRFLOW-1001:
--------------------------------------------

Might be related to AIRFLOW-1033

> Landing Time shows "unsupported operand type(s) for -: 'datetime.datetime' and 'NoneType'" on example_subdag_operator
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-1001
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1001
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, ui
>    Affects Versions: 1.8.0, 1.8.0rc5
>            Reporter: Ruslan Dautkhanov
>
> The sample DAG example_subdag_operator shows the "Oops" page below on the "Landing Times" tab:
>
> {noformat}
> Traceback (most recent call last):
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
>     response = self.full_dispatch_request()
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
>     rv = self.handle_user_exception(e)
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
>     reraise(exc_type, exc_value, tb)
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
>     rv = self.dispatch_request()
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
>     return self.view_functions[rule.endpoint](**req.view_args)
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/flask_admin/base.py", line 69, in inner
>     return self._run_view(f, *args, **kwargs)
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/flask_admin/base.py", line 368, in _run_view
>     return fn(self, *args, **kwargs)
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/flask_login.py", line 755, in decorated_view
>     return func(*args, **kwargs)
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/airflow/www/utils.py", line 125, in wrapper
>     return f(*args, **kwargs)
>   File "/opt/cloudera/parcels/Anaconda/lib/python2.7/site-packages/airflow/www/views.py", line 1560, in landing_times
>     secs = (ti.end_date - ts).total_seconds()
> TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'NoneType'
> {noformat}
>
> http://somehost:18111/admin/airflow/landing_times?root==30_id=example_subdag_operator

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
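The failing line subtracts ts from ti.end_date, which is None for any task instance that has not finished yet. A hypothetical guard illustrating the shape of a fix (names are illustrative, not the actual views.py change):

```python
from datetime import datetime, timedelta

def landing_time_secs(end_date, schedule_ts):
    """Illustrative guard for a landing-times computation: a task instance
    that has not finished yet (end_date is None) has no landing time, so
    return None instead of crashing on datetime - None."""
    if end_date is None:
        return None
    return (end_date - schedule_ts).total_seconds()

ts = datetime(2017, 3, 23, 16, 0)
print(landing_time_secs(ts + timedelta(minutes=5), ts))  # 300.0
print(landing_time_secs(None, ts))                       # None
```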
[jira] [Created] (AIRFLOW-1036) Exponential backoff should use randomization
Alex Guziel created AIRFLOW-1036:
------------------------------------

             Summary: Exponential backoff should use randomization
                 Key: AIRFLOW-1036
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1036
             Project: Apache Airflow
          Issue Type: Improvement
            Reporter: Alex Guziel
            Assignee: Alex Guziel

Adding randomization (jitter) to the backoff prevents the thundering-herd problem. With the way the backoff is currently used, we would need a hashing function over some subset of the dag_run, task_id, dag_id, and execution_date to emulate the RNG.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
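A hedged sketch of the deterministic jitter the issue describes: hash the task's identity so that every scheduler process derives the same pseudo-random delay without shared RNG state. The function name, the SHA-1 choice, and the key layout are all assumptions for illustration, not Airflow's implementation:

```python
import hashlib

def jittered_backoff(base_delay_secs, try_number, task_id, dag_id,
                     execution_date, max_delay_secs=24 * 60 * 60):
    """Illustrative deterministic 'jitter': derive a pseudo-random delay in
    [1, window] seconds from a hash of (dag_id, task_id, execution_date,
    try_number), so retries of different tasks spread out while remaining
    reproducible across processes."""
    window = min(base_delay_secs * (2 ** try_number), max_delay_secs)
    key = '{}#{}#{}#{}'.format(dag_id, task_id, execution_date, try_number)
    digest = int(hashlib.sha1(key.encode('utf-8')).hexdigest(), 16)
    # Map the hash onto [1, window] seconds.
    return 1 + digest % int(window)
```

Because the delay is a pure function of the task's identity and try number, no random state needs to be persisted between scheduler runs.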
[jira] [Created] (AIRFLOW-1035) Exponential backoff retry logic should use 2 as base
Alex Guziel created AIRFLOW-1035:
------------------------------------

             Summary: Exponential backoff retry logic should use 2 as base
                 Key: AIRFLOW-1035
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1035
             Project: Apache Airflow
          Issue Type: Bug
            Reporter: Alex Guziel
            Assignee: Alex Guziel

Right now the exponential backoff logic computes the delay as (retry_period) ^ (retry_number) instead of retry_period * 2 ^ retry_number. See https://en.wikipedia.org/wiki/Exponential_backoff

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
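The difference is easy to see numerically: with a 300-second retry period and a third retry, raising the period itself to the retry number gives 300^3 = 27,000,000 seconds, while standard base-2 backoff gives 300 * 2^3 = 2,400 seconds. A small sketch (function names are illustrative, not Airflow's):

```python
def backoff_delay_buggy(retry_delay_secs, try_number):
    # The behaviour described in the report: the retry period itself is
    # used as the base of the exponent, so delays explode.
    return retry_delay_secs ** try_number

def backoff_delay_fixed(retry_delay_secs, try_number):
    # Standard exponential backoff: base 2, scaled by the retry period.
    return retry_delay_secs * 2 ** try_number

print(backoff_delay_buggy(300, 3))  # 27000000
print(backoff_delay_fixed(300, 3))  # 2400
```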
[jira] [Commented] (AIRFLOW-137) Airflow does not respect 'max_active_runs' when task from multiple dag runs cleared
[ https://issues.apache.org/jira/browse/AIRFLOW-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938887#comment-15938887 ]

Matthew Schmoyer commented on AIRFLOW-137:
------------------------------------------

[~jarr] Looks like [~g.toonstra] fixed this with https://github.com/apache/incubator-airflow/pull/1870 Are you still experiencing an issue, or is this good to close?

> Airflow does not respect 'max_active_runs' when task from multiple dag runs cleared
> -----------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-137
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-137
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Tomasz Bartczak
>            Priority: Minor
>
> Also requested at https://github.com/apache/incubator-airflow/issues/1442
>
> Environment:
> Airflow version: 1.7.0
> Airflow components: webserver, mysql, scheduler with celery executor
> Python version: 2.7.6
> Operating system: Linux Ubuntu 3.19.0-26-generic. The scheduler runs with --num-runs and gets restarted around every minute or so.
>
> What did you expect to happen?
> After running 'airflow clear -t spark_final_observations2csv -s 2016-04-07T01:00:00 -e 2016-04-11T01:00:00 MODELLING_V6' I expected this task to get executed in all dag runs in the given time range, respecting 'max_active_runs'.
> Dag configuration: concurrency = 3, max_active_runs = 2
>
> What happened instead?
> Airflow at first started executing 3 of those tasks, which already violates 'max_active_runs'; it looks like 'concurrency' was the limit applied here (3 running, 2 pending). After the first task was done, airflow scheduled all the other tasks, making 5 dags run at the same time, which violates every specified limit. In the GUI we saw a red warning (5/2 DAGs running ;-) ).
>
> Reproducing the issue:
> max_active_runs is respected on a day-to-day basis: when one of the tasks was stuck, airflow didn't start more than 2 dags concurrently.
> [screenshots in the original issue: https://github.com/apache/incubator-airflow/issues/1442]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Work stopped] (AIRFLOW-670) Make Airflow home page configurable
[ https://issues.apache.org/jira/browse/AIRFLOW-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on AIRFLOW-670 stopped by Bowen Li.

> Make Airflow home page configurable
> -----------------------------------
>
>                 Key: AIRFLOW-670
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-670
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: ui
>    Affects Versions: Airflow 1.8
>            Reporter: Bowen Li
>            Assignee: Bowen Li
>            Priority: Minor
>              Labels: beginner
>             Fix For: Airflow 1.8
>
> At OfferUp, we would like to make the UI of the home page configurable. Some desired scenarios are:
> 1. Being able to hide view tabs, e.g. Ad Hoc Query, Charts, and Configuration. Core tabs like DAGs shouldn't be suppressible.
> 2. Being able to add external links as tabs; e.g. at OfferUp we build internal Python documentation with Sphinx, and it would be great to add an external link as a 'Docs' tab.
> I'm not sure which version would be the best to have this improvement in, so I marked it as 1.8 for now. Please feel free to change it.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Closed] (AIRFLOW-670) Make Airflow home page configurable
[ https://issues.apache.org/jira/browse/AIRFLOW-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bowen Li closed AIRFLOW-670.
----------------------------
    Resolution: Fixed

> Make Airflow home page configurable
> -----------------------------------
>
>                 Key: AIRFLOW-670
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-670
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: ui
>    Affects Versions: Airflow 1.8
>            Reporter: Bowen Li
>            Assignee: Bowen Li
>            Priority: Minor
>              Labels: beginner
>             Fix For: Airflow 1.8
>
> At OfferUp, we would like to make the UI of the home page configurable. Some desired scenarios are:
> 1. Being able to hide view tabs, e.g. Ad Hoc Query, Charts, and Configuration. Core tabs like DAGs shouldn't be suppressible.
> 2. Being able to add external links as tabs; e.g. at OfferUp we build internal Python documentation with Sphinx, and it would be great to add an external link as a 'Docs' tab.
> I'm not sure which version would be the best to have this improvement in, so I marked it as 1.8 for now. Please feel free to change it.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Created] (AIRFLOW-1034) Make it possible to connect to S3 in sigv4 regions
Robin B created AIRFLOW-1034:
--------------------------------

             Summary: Make it possible to connect to S3 in sigv4 regions
                 Key: AIRFLOW-1034
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1034
             Project: Apache Airflow
          Issue Type: Bug
          Components: hooks
    Affects Versions: Airflow 1.8
            Reporter: Robin B

When trying to connect to S3 in regions that only allow signature v4 authentication, the S3Hook (via boto) fails with HostRequiredError; 403 Forbidden errors can also occur. The solution is to explicitly pass a host parameter to boto's S3Connection object.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
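Sigv4-only regions (e.g. eu-central-1) require a region-specific endpoint to be passed as boto's `host` parameter. A hypothetical helper for building that host string; the function name is an assumption, and the S3Connection call is shown only in a comment since it requires boto and live credentials:

```python
def sigv4_s3_host(region):
    """Illustrative: build the region-specific S3 endpoint to pass as the
    `host` parameter to boto's S3Connection for signature-v4-only regions
    such as eu-central-1 (Frankfurt)."""
    if region in (None, '', 'us-east-1'):
        # The classic global endpoint works for sigv2-capable regions.
        return 's3.amazonaws.com'
    return 's3.{}.amazonaws.com'.format(region)

# e.g. (assumed usage, not run here):
#   conn = boto.s3.connection.S3Connection(host=sigv4_s3_host('eu-central-1'))
print(sigv4_s3_host('eu-central-1'))  # s3.eu-central-1.amazonaws.com
```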
[jira] [Created] (AIRFLOW-1033) TypeError: can't compare datetime.datetime to NoneType in prev_dagrun_dep.py
Bert Desmet created AIRFLOW-1033:
------------------------------------

             Summary: TypeError: can't compare datetime.datetime to NoneType in prev_dagrun_dep.py
                 Key: AIRFLOW-1033
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1033
             Project: Apache Airflow
          Issue Type: Bug
          Components: DagRun
    Affects Versions: Airflow 1.8
         Environment: CentOS 7; db: PostgreSQL 9.5; Python 2.7; installed via pip
            Reporter: Bert Desmet
            Priority: Critical
         Attachments: test_dag.py, test_dag.py.log

When starting a specific new DAG we get the following error:

[2017-03-23 16:51:16,354] {jobs.py:354} DagFileProcessor908 ERROR - Got an exception! Propagating...
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/airflow/jobs.py", line 346, in helper
    pickle_dags)
  File "/usr/lib/python2.7/site-packages/airflow/utils/db.py", line 53, in wrapper
    result = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/airflow/jobs.py", line 1581, in process_file
    self._process_dags(dagbag, dags, ti_keys_to_schedule)
  File "/usr/lib/python2.7/site-packages/airflow/jobs.py", line 1174, in _process_dags
    self._process_task_instances(dag, tis_out)
  File "/usr/lib/python2.7/site-packages/airflow/jobs.py", line 905, in _process_task_instances
    session=session):
  File "/usr/lib/python2.7/site-packages/airflow/utils/db.py", line 53, in wrapper
    result = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/airflow/models.py", line 1116, in are_dependencies_met
    session=session):
  File "/usr/lib/python2.7/site-packages/airflow/models.py", line 1140, in get_failed_dep_statuses
    dep_context):
  File "/usr/lib/python2.7/site-packages/airflow/ti_deps/deps/base_ti_dep.py", line 94, in get_dep_statuses
    for dep_status in self._get_dep_statuses(ti, session, dep_context):
  File "/usr/lib/python2.7/site-packages/airflow/ti_deps/deps/prev_dagrun_dep.py", line 47, in _get_dep_statuses
    if dag.previous_schedule(ti.execution_date) < ti.task.start_date:
TypeError: can't compare datetime.datetime to NoneType

I added some debug code to prev_dagrun_dep.py:

    dag = ti.task.dag
    print 'Start dates:'
    print 'previous_execution_date: %s' % (dag.previous_schedule(ti.execution_date))
    print 'current start date: %s' % (ti.task.start_date)
    if dag.catchup:
        if dag.previous_schedule(ti.execution_date) < ti.task.start_date:

This is the output I get:

Start dates:
previous_execution_date: None
current start date: 2017-03-19 00:00:00

I think it is normal that previous_execution_date is None, since this is the first time the DAG is being run. But why is the start_date of the DAG relevant here, rather than the start date of the run? I suspect the cause is the 'schedule_interval', which is set to None. Please find an example DAG and its log file attached.

Bert

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (AIRFLOW-1025) Zoom into SubDAG doesn't show all DAG runs
[ https://issues.apache.org/jira/browse/AIRFLOW-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938622#comment-15938622 ]

Achraf commented on AIRFLOW-1025:
---------------------------------

From what I can see, it's not a UI issue. If the execution of the dag is scheduled, the tasks inside the subdag are executed. But for a manual execution, the subdag passes to success without executing the tasks inside it (in either the graph or tree view). You can also reproduce this behavior with "example_subdag_operator".

> Zoom into SubDAG doesn't show all DAG runs
> ------------------------------------------
>
>                 Key: AIRFLOW-1025
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1025
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: subdag, ui
>    Affects Versions: Airflow 1.8
>            Reporter: Joe Schmid
>         Attachments: 1-main_dag_graph_view_dag_runs.png, 2-main_dag_tree_view.png, 3_zoom_into_subdag_view_dag_runs.png, 4_zoom_into_subdag_tree_view.png, test_subdag.py
>
> I have a simple test SubDAG (attached as test_subdag.py) to demonstrate the issue. When viewing the DAG in the web UI in either Graph view or Tree view, I see all available DAG runs. When I click on the subdag operator and "Zoom into Sub DAG", it doesn't show these DAG runs. (It used to prior to 1.8.)

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Assigned] (AIRFLOW-1032) Zeppelin Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun M J reassigned AIRFLOW-1032:
---------------------------------

    Assignee: Arun M J

> Zeppelin Operator
> -----------------
>
>                 Key: AIRFLOW-1032
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1032
>             Project: Apache Airflow
>          Issue Type: New Feature
>          Components: operators
>            Reporter: Thomas John
>            Assignee: Arun M J
>
> Airflow should be able to execute Zeppelin notebooks. Currently we can run notebooks via the bash and http operators, but these do not provide execution status. It would be good if airflow could handle Zeppelin notebooks directly.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (AIRFLOW-1032) Zeppelin Operator
[ https://issues.apache.org/jira/browse/AIRFLOW-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas John updated AIRFLOW-1032:
---------------------------------

    Description: Airflow should be able to execute Zeppelin notebooks. Currently we can run notebooks via the bash and http operators, but these do not provide execution status. It would be good if airflow could handle Zeppelin notebooks directly.

> Zeppelin Operator
> -----------------
>
>                 Key: AIRFLOW-1032
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1032
>             Project: Apache Airflow
>          Issue Type: New Feature
>          Components: operators
>            Reporter: Thomas John
>
> Airflow should be able to execute Zeppelin notebooks. Currently we can run notebooks via the bash and http operators, but these do not provide execution status. It would be good if airflow could handle Zeppelin notebooks directly.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Created] (AIRFLOW-1032) Zeppelin Operator
Thomas John created AIRFLOW-1032: Summary: Zeppelin Operator Key: AIRFLOW-1032 URL: https://issues.apache.org/jira/browse/AIRFLOW-1032 Project: Apache Airflow Issue Type: New Feature Components: operators Reporter: Thomas John -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (AIRFLOW-1031) 'scheduled__' may replace with DagRun.ID_PREFIX in DagRun.run_id
Fred Liu created AIRFLOW-1031:
---------------------------------

             Summary: 'scheduled__' may replace with DagRun.ID_PREFIX in DagRun.run_id
                 Key: AIRFLOW-1031
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1031
             Project: Apache Airflow
          Issue Type: Improvement
          Components: DagRun
    Affects Versions: Airflow 1.8
            Reporter: Fred Liu
            Assignee: Fred Liu
            Priority: Trivial
             Fix For: Airflow 1.8

https://github.com/apache/incubator-airflow/blob/v1-8-stable/airflow/jobs.py#L840
The bare 'scheduled__' literal here could be replaced with DagRun.ID_PREFIX. I searched for 'scheduled__' and it does not seem to appear anywhere else in the code. Can I send a PR to fix this?

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
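For reference, a scheduled run_id built from the named constant rather than the bare literal might look like the sketch below. ID_PREFIX is copied inline so the example stands alone, and the helper name is hypothetical:

```python
from datetime import datetime

# Illustrative constant mirroring DagRun.ID_PREFIX in airflow/models.py.
ID_PREFIX = 'scheduled__'

def scheduled_run_id(execution_date):
    """Build a scheduled run_id the way jobs.py does, but referencing the
    named constant instead of repeating the 'scheduled__' literal."""
    return ID_PREFIX + execution_date.isoformat()

print(scheduled_run_id(datetime(2017, 3, 23)))  # scheduled__2017-03-23T00:00:00
```

Using the constant keeps the prefix in one place, so code that parses run_ids (e.g. to detect scheduled runs) cannot drift out of sync with the code that creates them.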
[jira] [Created] (AIRFLOW-1030) HttpHook error when creating HttpSensor
Paulius Dambrauskas created AIRFLOW-1030:
--------------------------------------------

             Summary: HttpHook error when creating HttpSensor
                 Key: AIRFLOW-1030
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1030
             Project: Apache Airflow
          Issue Type: Bug
          Components: core
    Affects Versions: Airflow 1.8
         Environment: python3
            Reporter: Paulius Dambrauskas
            Assignee: Paulius Dambrauskas

Task:
{code}
sensor = HttpSensor(
    task_id='http_sensor_check',
    http_conn_id='http_default',
    endpoint='',
    params={},
    poke_interval=5,
    dag=dag
)
{code}
Exception:
{code}
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/airflow-1.9.0.dev0+apache.incubating-py3.5.egg/airflow/models.py", line 268, in process_file
    m = imp.load_source(mod_name, filepath)
  File "/usr/lib/python3.5/imp.py", line 172, in load_source
    module = _load(spec)
  File "", line 693, in _load
  File "", line 673, in _load_unlocked
  File "", line 665, in exec_module
  File "", line 222, in _call_with_frames_removed
  File "/home/paulius/airflow/dags/cpg_4.py", line 43, in
    dag=dag)
  File "/usr/local/lib/python3.5/dist-packages/airflow-1.9.0.dev0+apache.incubating-py3.5.egg/airflow/utils/decorators.py", line 86, in wrapper
    result = func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/airflow-1.9.0.dev0+apache.incubating-py3.5.egg/airflow/operators/sensors.py", line 663, in __init__
    self.hook = hooks.http_hook.HttpHook(method='GET', http_conn_id=http_conn_id)
  File "/usr/local/lib/python3.5/dist-packages/airflow-1.9.0.dev0+apache.incubating-py3.5.egg/airflow/utils/helpers.py", line 436, in __getattr__
    raise AttributeError
AttributeError
{code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
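The bare `raise AttributeError` at the end of the traceback is what makes this hard to debug: the lazy attribute lookup in helpers.py swallows whatever went wrong when importing the hook module. A hypothetical sketch (not the actual helpers.py code) of a lazy loader that at least preserves the failing name in the error message:

```python
class LazyHooks(object):
    """Hypothetical stand-in for the lazy attribute lookup in
    airflow/utils/helpers.py. Unlike the reported behaviour (a bare
    AttributeError with no message), this names the attribute that failed,
    hinting that its module may not have imported."""

    def __init__(self, loaded):
        # Mapping of attribute name -> successfully imported object.
        self._loaded = loaded

    def __getattr__(self, name):
        try:
            return self._loaded[name]
        except KeyError:
            raise AttributeError(
                "no attribute {!r}; its module may have failed to import"
                .format(name))

hooks = LazyHooks({})
try:
    hooks.http_hook
except AttributeError as e:
    print(e)  # no attribute 'http_hook'; its module may have failed to import
```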
[jira] [Updated] (AIRFLOW-1029) https://issues.apache.org/jira/browse/AIRFLOW
[ https://issues.apache.org/jira/browse/AIRFLOW-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alessio Palma updated AIRFLOW-1029:
-----------------------------------

    Description:

I'm using:
AIRFLOW 1.8.0RC5
ERLANG 19.2
RABBIT 3.6.7
PYTHON 2.7

When I start a DAG from the panel (see picture), the scheduler stops working. After some investigation, the problem arises here, in celery_executor.py:

83     def sync(self):
84
85         self.logger.debug(
86             "Inquiring about {} celery task(s)".format(len(self.tasks)))
87
88         for key, async in list(self.tasks.items()):
90             state = async.state  < HERE

The Python stack trace says the connection is closed; capturing some TCP traffic, I can see that the connection to RabbitMQ is closed (TCP FIN) before a STOMP frame is sent, so RabbitMQ replies with TCP RST (see picture 2: 172.1.0.2 -> rabbitmq node, 172.1.0.1 -> airflow node). This exception stops the scheduler. If you are using airflow-scheduler-failover-controller the scheduler gets restarted, but that is just a workaround and does not fix the problem at its root. Is it safe to trap the exception?

was: the same description, without the celery_executor.py attribution for the code snippet.

> https://issues.apache.org/jira/browse/AIRFLOW
> ---------------------------------------------
>
>                 Key: AIRFLOW-1029
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1029
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.8.0rc5
>            Reporter: Alessio Palma
>              Labels: amqp_compliance, rabbitmq, scheduler
>         Attachments: image (1).png, PannelloAIRFLOW 2.png
>
> I'm using AIRFLOW 1.8.0RC5, ERLANG 19.2, RABBIT 3.6.7, PYTHON 2.7. When I start a DAG from the panel (see picture), the scheduler stops working. After some investigation, the problem arises in celery_executor.py, in sync(), at "state = async.state". The Python stack trace says the connection is closed; capturing some TCP traffic, I can see that the connection to RabbitMQ is closed (TCP FIN) before a STOMP frame is sent, so RabbitMQ replies with TCP RST (see picture 2: 172.1.0.2 -> rabbitmq node, 172.1.0.1 -> airflow node). This exception stops the scheduler. If you are using airflow-scheduler-failover-controller the scheduler gets restarted, but that is just a workaround and does not fix the problem at its root. Is it safe to trap the exception?

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Created] (AIRFLOW-1029) https://issues.apache.org/jira/browse/AIRFLOW
Alessio Palma created AIRFLOW-1029:
--------------------------------------

             Summary: https://issues.apache.org/jira/browse/AIRFLOW
                 Key: AIRFLOW-1029
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1029
             Project: Apache Airflow
          Issue Type: Bug
          Components: scheduler
    Affects Versions: 1.8.0rc5
            Reporter: Alessio Palma
         Attachments: image (1).png, PannelloAIRFLOW 2.png

I'm using:
AIRFLOW 1.8.0RC5
ERLANG 19.2
RABBIT 3.6.7
PYTHON 2.7

When I start a DAG from the panel (see picture), the scheduler stops working. After some investigation, the problem arises here:

83     def sync(self):
84
85         self.logger.debug(
86             "Inquiring about {} celery task(s)".format(len(self.tasks)))
87
88         for key, async in list(self.tasks.items()):
90             state = async.state  < HERE

The Python stack trace says the connection is closed; capturing some TCP traffic, I can see that the connection to RabbitMQ is closed (TCP FIN) before a STOMP frame is sent, so RabbitMQ replies with TCP RST (see picture 2: 172.1.0.2 -> rabbitmq node, 172.1.0.1 -> airflow node).

This exception stops the scheduler. If you are using airflow-scheduler-failover-controller the scheduler gets restarted, but that is just a workaround and does not fix the problem at its root. Is it safe to trap the exception?

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)