[jira] [Updated] (AIRFLOW-2233) Update updating.md to include the info of hdfs_sensors renaming
[ https://issues.apache.org/jira/browse/AIRFLOW-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Feng updated AIRFLOW-2233: -- Description: We changed the naming of hdfs_sensors in https://issues.apache.org/jira/browse/AIRFLOW-2211. We should add an entry to the updating.md file to reflect that. > Update updating.md to include the info of hdfs_sensors renaming > --- > > Key: AIRFLOW-2233 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2233 > Project: Apache Airflow > Issue Type: Improvement >Reporter: Tao Feng >Assignee: Tao Feng >Priority: Minor > > We changed the naming of hdfs_sensors in > https://issues.apache.org/jira/browse/AIRFLOW-2211. We should add an entry to the > updating.md file to reflect that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2233) Update updating.md to include the info of hdfs_sensors renaming
Tao Feng created AIRFLOW-2233: - Summary: Update updating.md to include the info of hdfs_sensors renaming Key: AIRFLOW-2233 URL: https://issues.apache.org/jira/browse/AIRFLOW-2233 Project: Apache Airflow Issue Type: Improvement Reporter: Tao Feng Assignee: Tao Feng -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2231) DAG with a relativedelta schedule_interval fails
[ https://issues.apache.org/jira/browse/AIRFLOW-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407340#comment-16407340 ] Joy Gao commented on AIRFLOW-2231: -- Can't replicate this. Can you provide the dag file? thanks! > DAG with a relativedelta schedule_interval fails > > > Key: AIRFLOW-2231 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2231 > Project: Apache Airflow > Issue Type: Bug > Components: DAG >Reporter: Kyle Brooks >Priority: Major > > The documentation for the DAG class says using > dateutil.relativedelta.relativedelta as a schedule_interval is supported but > it fails: > > Traceback (most recent call last): > File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 285, > in process_file > m = imp.load_source(mod_name, filepath) > File > "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/imp.py", > line 172, in load_source > module = _load(spec) > File "", line 675, in _load > File "", line 655, in _load_unlocked > File "", line 678, in exec_module > File "", line 205, in _call_with_frames_removed > File "/Users/k398995/airflow/dags/test_reldel.py", line 33, in > dagrun_timeout=timedelta(minutes=60)) > File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 2914, > in __init__ > if schedule_interval in cron_presets: > TypeError: unhashable type: 'relativedelta' > > It looks like the __init__ function for class DAG assumes the > schedule_interval is hashable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-2232) DAG must be imported for airflow dag discovery
[ https://issues.apache.org/jira/browse/AIRFLOW-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joy Gao resolved AIRFLOW-2232. -- Resolution: Duplicate Closing since it's a dupe. > DAG must be imported for airflow dag discovery > -- > > Key: AIRFLOW-2232 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2232 > Project: Apache Airflow > Issue Type: Bug > Components: DAG >Reporter: andy dreyfuss >Priority: Critical > > repro: put the following in the dags/ directory: > from my_dags import MyDag > d = MyDag()  # this is an airflow.DAG > > Expected: airflow list_dags lists the dag > Actual: airflow does not list the dag unless an unused `from airflow import DAG` is added -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-2232) DAG must be imported for airflow dag discovery
[ https://issues.apache.org/jira/browse/AIRFLOW-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407318#comment-16407318 ] Christopher Fei commented on AIRFLOW-2232: -- I believe this is a duplicate of https://issues.apache.org/jira/browse/AIRFLOW-97 and https://issues.apache.org/jira/browse/AIRFLOW-2198. > DAG must be imported for airflow dag discovery > -- > > Key: AIRFLOW-2232 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2232 > Project: Apache Airflow > Issue Type: Bug > Components: DAG >Reporter: andy dreyfuss >Priority: Critical > > repro: put the following in the dags/ directory: > from my_dags import MyDag > d = MyDag()  # this is an airflow.DAG > > Expected: airflow list_dags lists the dag > Actual: airflow does not list the dag unless an unused `from airflow import DAG` is added -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-2232) DAG must be imported for airflow dag discovery
andy dreyfuss created AIRFLOW-2232: -- Summary: DAG must be imported for airflow dag discovery Key: AIRFLOW-2232 URL: https://issues.apache.org/jira/browse/AIRFLOW-2232 Project: Apache Airflow Issue Type: Bug Components: DAG Reporter: andy dreyfuss repro: put the following in the dags/ directory: from my_dags import MyDag; d = MyDag()  # this is an airflow.DAG Expected: airflow list_dags lists the dag Actual: airflow does not list the dag unless an unused `from airflow import DAG` is added -- This message was sent by Atlassian JIRA (v7.6.3#76005)
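For background, the 1.x DagBag applies a fast pre-filter before importing a candidate file: in safe mode, a .py file is only parsed if its raw text contains both the substrings "airflow" and "DAG". A minimal, hypothetical sketch of why the repro file is skipped, assuming the on-disk file does not literally contain those strings (e.g. the explanatory comment was added for this report rather than present in the file):

```python
# Sketch of the DagBag safe-mode pre-filter believed to cause this; the
# real check lives in models.DagBag.process_file. Illustrative only.
def might_contain_dag(source: str) -> bool:
    return "airflow" in source and "DAG" in source

# The repro file: "MyDag" does not contain the uppercase substring "DAG",
# and "airflow" never appears, so the file is never imported.
skipped_file = "from my_dags import MyDag\nd = MyDag()\n"

# Adding the reported workaround satisfies the pre-filter:
fixed_file = "from airflow import DAG  # unused\n" + skipped_file

print(might_contain_dag(skipped_file))  # False: DAG will not be listed
print(might_contain_dag(fixed_file))    # True
```

This also explains why the import is "unused" yet required: the filter is a substring scan, not an analysis of what the module actually imports.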
[jira] [Created] (AIRFLOW-2231) DAG with a relativedelta schedule_interval fails
Kyle Brooks created AIRFLOW-2231: Summary: DAG with a relativedelta schedule_interval fails Key: AIRFLOW-2231 URL: https://issues.apache.org/jira/browse/AIRFLOW-2231 Project: Apache Airflow Issue Type: Bug Components: DAG Reporter: Kyle Brooks The documentation for the DAG class says using dateutil.relativedelta.relativedelta as a schedule_interval is supported but it fails: Traceback (most recent call last): File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 285, in process_file m = imp.load_source(mod_name, filepath) File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/imp.py", line 172, in load_source module = _load(spec) File "", line 675, in _load File "", line 655, in _load_unlocked File "", line 678, in exec_module File "", line 205, in _call_with_frames_removed File "/Users/k398995/airflow/dags/test_reldel.py", line 33, in dagrun_timeout=timedelta(minutes=60)) File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 2914, in __init__ if schedule_interval in cron_presets: TypeError: unhashable type: 'relativedelta' It looks like the __init__ function for class DAG assumes the schedule_interval is hashable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
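The failure mode in the traceback can be reproduced without Airflow or dateutil at all: in Python 3, a class that defines __eq__ without __hash__ is unhashable, and a dict membership test like `schedule_interval in cron_presets` hashes its left operand. A minimal sketch, where the hypothetical Interval class stands in for relativedelta and a one-entry dict stands in for Airflow's preset table:

```python
class Interval:
    """Stand-in for dateutil.relativedelta: defines __eq__, not __hash__."""
    def __init__(self, months):
        self.months = months
    def __eq__(self, other):
        return isinstance(other, Interval) and self.months == other.months
    # Defining __eq__ sets __hash__ to None, so instances are unhashable.

cron_presets = {"@daily": "0 0 * * *"}  # simplified stand-in

interval = Interval(1)
try:
    interval in cron_presets  # dict membership hashes the left operand
    print("hashable")
except TypeError:
    print("TypeError")  # prints: TypeError

# A type guard before the lookup, as DAG.__init__ could do, avoids it:
preset = cron_presets.get(interval) if isinstance(interval, str) else None
print(preset)  # prints: None
```

This supports the reporter's reading that DAG.__init__ assumes schedule_interval is hashable; guarding the preset lookup on the schedule_interval type would sidestep the TypeError.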
[jira] [Updated] (AIRFLOW-2230) [possible dup] tutorial does not specify initdb/upgradedb prerequisite command, although quick start does
[ https://issues.apache.org/jira/browse/AIRFLOW-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] andy dreyfuss updated AIRFLOW-2230: --- Summary: [possible dup] tutorial does not specify initdb/upgradedb prerequisite command, although quick start does (was: [possible dup] tutorial does not specify upgradedb prerequisite command) > [possible dup] tutorial does not specify initdb/upgradedb prerequisite > command, although quick start does > - > > Key: AIRFLOW-2230 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2230 > Project: Apache Airflow > Issue Type: Bug > Components: docs, Documentation >Reporter: andy dreyfuss >Priority: Critical > > Quick start specifies `initdb` but full tutorial docs afaict do not specify > this prerequisite command. If this is not run before everything else you end > up with: > sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: > connection [SQL: 'SELECT connection.conn_id AS connection_conn_id \nFROM > connection GROUP BY connection.conn_id'] (Background on this error at: > [http://sqlalche.me/e/e3q8]) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-2230) [possible dup] tutorial does not specify upgradedb prerequisite command
[ https://issues.apache.org/jira/browse/AIRFLOW-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] andy dreyfuss updated AIRFLOW-2230: --- Description: Quick start specifies `initdb` but full tutorial docs afaict do not specify this prerequisite command. If this is not run before everything else you end up with: sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: connection [SQL: 'SELECT connection.conn_id AS connection_conn_id \nFROM connection GROUP BY connection.conn_id'] (Background on this error at: [http://sqlalche.me/e/e3q8]) was: Tutorial docs afaict do not specify the prerequisite `airflow upgradedb` command. If this is not run before everything else you end up with: sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: connection [SQL: 'SELECT connection.conn_id AS connection_conn_id \nFROM connection GROUP BY connection.conn_id'] (Background on this error at: http://sqlalche.me/e/e3q8) > [possible dup] tutorial does not specify upgradedb prerequisite command > --- > > Key: AIRFLOW-2230 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2230 > Project: Apache Airflow > Issue Type: Bug > Components: docs, Documentation >Reporter: andy dreyfuss >Priority: Critical > > Quick start specifies `initdb` but full tutorial docs afaict do not specify > this prerequisite command. If this is not run before everything else you end > up with: > sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: > connection [SQL: 'SELECT connection.conn_id AS connection_conn_id \nFROM > connection GROUP BY connection.conn_id'] (Background on this error at: > [http://sqlalche.me/e/e3q8]) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-2202) Support filter in HiveMetastoreHook().max_partition()
[ https://issues.apache.org/jira/browse/AIRFLOW-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang resolved AIRFLOW-2202. - Resolution: Fixed Fixed by https://github.com/apache/incubator-airflow/pull/3117 > Support filter in HiveMetastoreHook().max_partition() > -- > > Key: AIRFLOW-2202 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2202 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Priority: Major > > Change made in https://issues.apache.org/jira/browse/AIRFLOW-2150 removed the > support for filter in max_partition(), which should be a valid use case. So > we're adding it back. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (AIRFLOW-2150) Use get_partition_names() instead of get_partitions() in HiveMetastoreHook().max_partition()
[ https://issues.apache.org/jira/browse/AIRFLOW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Yang resolved AIRFLOW-2150. - Resolution: Fixed > Use get_partition_names() instead of get_partitions() in > HiveMetastoreHook().max_partition() > > > Key: AIRFLOW-2150 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2150 > Project: Apache Airflow > Issue Type: Bug >Reporter: Kevin Yang >Assignee: Kevin Yang >Priority: Major > > get_partitions() is extremely expensive for large tables, max_partition() > should be using get_partition_names() instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
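The two HiveMetastoreHook issues above can be sketched together: compute the max partition from partition *name strings* (cheap to fetch) rather than full partition objects, while still honoring an optional filter. The name format and the filter_map semantics here are assumptions for illustration, not Airflow's actual implementation:

```python
# Hedged sketch: max partition value from Hive-style partition names
# such as "ds=2018-03-20/hr=01", with an optional sub-partition filter.
def max_partition(partition_names, key="ds", filter_map=None):
    best = None
    for name in partition_names:
        # "ds=2018-03-20/hr=01" -> {"ds": "2018-03-20", "hr": "01"}
        parts = dict(p.partition("=")[::2] for p in name.split("/"))
        if filter_map and any(parts.get(k) != v for k, v in filter_map.items()):
            continue  # partition does not match the requested filter
        value = parts.get(key)
        if value is not None and (best is None or value > best):
            best = value
    return best

names = ["ds=2018-03-19/hr=01", "ds=2018-03-20/hr=01", "ds=2018-03-20/hr=02"]
print(max_partition(names))                           # 2018-03-20
print(max_partition(names, filter_map={"hr": "01"}))  # 2018-03-20
print(max_partition(names, filter_map={"hr": "03"}))  # None
```

Working on name strings is what makes this cheap for large tables: no partition metadata objects ever leave the metastore, and the filter is applied client-side to the parsed names.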
[jira] [Created] (AIRFLOW-2230) [possible dup] tutorial does not specify upgradedb prerequisite command
andy dreyfuss created AIRFLOW-2230: -- Summary: [possible dup] tutorial does not specify upgradedb prerequisite command Key: AIRFLOW-2230 URL: https://issues.apache.org/jira/browse/AIRFLOW-2230 Project: Apache Airflow Issue Type: Bug Components: docs, Documentation Reporter: andy dreyfuss Tutorial docs afaict do not specify the prerequisite `airflow upgradedb` command. If this is not run before everything else you end up with: sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: connection [SQL: 'SELECT connection.conn_id AS connection_conn_id \nFROM connection GROUP BY connection.conn_id'] (Background on this error at: http://sqlalche.me/e/e3q8) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-2229) Scheduler cannot retry abrupt task failures within factory-generated DAGs
[ https://issues.apache.org/jira/browse/AIRFLOW-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Meickle updated AIRFLOW-2229: --- Description:

We had an issue where one of our tasks failed without the worker updating state (unclear why, but let's assume it was an OOM), resulting in this series of error messages:

Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: [2018-03-20 14:27:04,993] models.py:1595 ERROR - Executor reports task instance %s finished (%s) although the task says its %s. Was the task killed externally?
Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: NoneType
Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: [2018-03-20 14:27:04,994] jobs.py:1435 ERROR - Cannot load the dag bag to handle failure for . Setting task to FAILED without callbacks or retries. Do you have enough resources?

Mysterious failures are not unexpected; we are in the cloud, after all. The concern is the last line: callbacks and retries are skipped, and the message implies a lack of resources. However, the machine was totally underutilized at the time.

I dug into this code a bit more, and as far as I can tell the error is raised in this code path: https://github.com/apache/incubator-airflow/blob/1.9.0/airflow/jobs.py#L1427

    self.log.error(msg)
    try:
        simple_dag = simple_dag_bag.get_dag(dag_id)
        dagbag = models.DagBag(simple_dag.full_filepath)
        dag = dagbag.get_dag(dag_id)
        ti.task = dag.get_task(task_id)
        ti.handle_failure(msg)
    except Exception:
        self.log.error("Cannot load the dag bag to handle failure for %s"
                       ". Setting task to FAILED without callbacks or "
                       "retries. Do you have enough resources?", ti)
        ti.state = State.FAILED
        session.merge(ti)
        session.commit()

I am not very familiar with this code, nor do I have time to attach a debugger at the moment, but I think what is happening here is:
* I have a factory Python file, which imports and instantiates DAG code from other files.
* The scheduler loads the DAGs from the factory file on the filesystem. The fileloc it records in the DB is not the factory file, but the file the DAG code was loaded from.
* The scheduler makes a simple DAGBag from the instantiated DAGs.
* This line of code uses the simple DAG, which references the original DAG object's fileloc, to create a new DAGBag object.
* That DAGBag looks for the original DAG in the fileloc, which contains the DAG's _code_ but is not actually importable by Airflow.
* An exception is raised while loading the DAG from the DAGBag, which found nothing.
* Handling of the task failure never occurs.
* The over-broad except clause swallows all of the above.
* All that remains is a generic error message that is not helpful to a system operator.

If this is the case then, at minimum, the try/except should be rewritten to fail more gracefully and with a better error message. But I question whether this level of DAGBag abstraction/indirection isn't making this failure case worse than it needs to be; under normal conditions the scheduler is definitely able to find the relevant factory-generated DAGs and execute tasks within them as expected, even with the fileloc set to the code path and not the import path.

was: We had an issue where one of our tasks failed without the worker updating state (unclear why, but let's assume it was an OOM), resulting in this series of error messages:

Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: [2018-03-20 14:27:04,993] models.py:1595 ERROR - Executor reports task instance %s finished (%s) although the task says its %s. Was the task killed externally?
Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: NoneType
Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: [2018-03-20 14:27:04,994] jobs.py:1435 ERROR - Cannot load the dag bag to handle failure for . Setting task to FAILED without callbacks or retries. Do you have enough resources?

Mysterious failures are not unexpected, because we are in the cloud, after all. The concern is the last line: ignoring callbacks and retries, implying that it's a lack of resources. However, the machine was totally underutilized at the time. I dug into this code a bit more and as far as I can tell this error is happening in this code path: https://github.com/apache/incubator-airflow/blob/1.9.0/airflow/jobs.py#L1427

    self.log.error(msg)
    try:
        simple_dag = simple_dag_bag.get_dag(dag_id)
        dagbag =
[jira] [Created] (AIRFLOW-2229) Scheduler cannot retry abrupt task failures within factory-generated DAGs
James Meickle created AIRFLOW-2229: -- Summary: Scheduler cannot retry abrupt task failures within factory-generated DAGs Key: AIRFLOW-2229 URL: https://issues.apache.org/jira/browse/AIRFLOW-2229 Project: Apache Airflow Issue Type: Bug Components: scheduler Affects Versions: 1.9.0 Reporter: James Meickle

We had an issue where one of our tasks failed without the worker updating state (unclear why, but let's assume it was an OOM), resulting in this series of error messages:

Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: [2018-03-20 14:27:04,993] models.py:1595 ERROR - Executor reports task instance %s finished (%s) although the task says its %s. Was the task killed externally?
Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: NoneType
Mar 20 14:27:05 airflow-core-i-0fc1f995414837b8b.stg.int.dynoquant.com airflow_scheduler-stdout.log: [2018-03-20 14:27:04,994] jobs.py:1435 ERROR - Cannot load the dag bag to handle failure for . Setting task to FAILED without callbacks or retries. Do you have enough resources?

Mysterious failures are not unexpected; we are in the cloud, after all. The concern is the last line: callbacks and retries are skipped, and the message implies a lack of resources. However, the machine was totally underutilized at the time.

I dug into this code a bit more, and as far as I can tell the error is raised in this code path: https://github.com/apache/incubator-airflow/blob/1.9.0/airflow/jobs.py#L1427

    self.log.error(msg)
    try:
        simple_dag = simple_dag_bag.get_dag(dag_id)
        dagbag = models.DagBag(simple_dag.full_filepath)
        dag = dagbag.get_dag(dag_id)
        ti.task = dag.get_task(task_id)
        ti.handle_failure(msg)
    except Exception:
        self.log.error("Cannot load the dag bag to handle failure for %s"
                       ". Setting task to FAILED without callbacks or "
                       "retries. Do you have enough resources?", ti)
        ti.state = State.FAILED
        session.merge(ti)
        session.commit()

I am not very familiar with this code, nor do I have time to attach a debugger at the moment, but I think what is happening here is:
* I have a factory Python file, which imports and instantiates DAG code from other files.
* The scheduler loads the DAGs from the factory file on the filesystem. The fileloc it records in the DB is not the factory file, but the file the DAG code was loaded from.
* The scheduler makes a simple DAGBag from the instantiated DAGs.
* This line of code uses the simple DAG, which references the original DAG object's fileloc, to create a new DAGBag object.
* That DAGBag looks for the original DAG in the fileloc, which contains the DAG's _code_ but is not actually importable by Airflow.
* An exception is raised while loading the DAG from the DAGBag, which found nothing.
* Handling of the task failure never occurs.
* The over-broad except clause swallows all of the above.
* All that remains is a generic error message that is not helpful to a system operator.

If this is the case then, at minimum, the try/except should be rewritten to fail more gracefully and with a better error message. But I question whether this level of DAGBag abstraction/indirection isn't making this failure case worse than it needs to be; under normal conditions the scheduler is definitely able to find the relevant factory-generated DAGs and execute tasks within them as expected, even with the fileloc set to the code path and not the import path. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
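The more graceful rewrite the report asks for could look roughly like this: log the underlying exception instead of swallowing it, so an operator can see why the DagBag lookup failed. The function name and the TaskInstance/loader stubs below are hypothetical stand-ins, not Airflow code:

```python
import logging

log = logging.getLogger(__name__)

def handle_executor_mismatch(ti, msg, load_dag):
    """load_dag: a zero-argument callable standing in for the DagBag lookup."""
    try:
        dag = load_dag()
        ti.task = dag.get_task(ti.task_id)
        ti.handle_failure(msg)
    except Exception:
        # log.exception records the full traceback, so the operator can see
        # *why* the DAG could not be loaded (e.g. a non-importable fileloc),
        # instead of a generic "Do you have enough resources?" message.
        log.exception(
            "Could not load DAG to handle failure for %s; "
            "marking FAILED without callbacks or retries.", ti)
        ti.state = "failed"

# Demo: a stub TaskInstance and a loader that fails like a bad fileloc would.
class _TI:
    task_id = "t1"
    state = None
    def handle_failure(self, msg):
        self.state = "failed_with_callbacks"

def bad_load():
    raise ImportError("fileloc is not importable")

ti = _TI()
handle_executor_mismatch(ti, "executor/state mismatch", bad_load)
print(ti.state)  # prints: failed
```

This does not fix the fileloc/import-path mismatch itself, but it would turn the silent swallow into an actionable traceback.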
[jira] [Created] (AIRFLOW-2228) Enhancements in ValueCheck operator
Sakshi Bansal created AIRFLOW-2228: -- Summary: Enhancements in ValueCheck operator Key: AIRFLOW-2228 URL: https://issues.apache.org/jira/browse/AIRFLOW-2228 Project: Apache Airflow Issue Type: Improvement Reporter: Sakshi Bansal Assignee: Sakshi Bansal 1. Allow tolerance to take on a value of 1. This is important for defining a range of values the results can take on. Modify the condition r / (1 + self.tol) <= self.pass_value <= r / (1 - self.tol) in the ValueCheck operator accordingly. 2. Make pass_value a template field, so that its value can be determined at runtime. Convert pass_value to float, if possible, in the execute method itself, after the template has been rendered. 3. Include the tolerance value in the Airflow exception. This gives an idea of the allowed range for the resulting records. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
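A hedged sketch of item 1: the multiplicative form of the bound below is algebraically equivalent to the quoted condition for 0 <= tol < 1 and positive values, and it additionally accepts tol == 1 without dividing by zero. The function and argument names are illustrative, not Airflow's actual implementation:

```python
# Tolerance check in multiplicative form: avoids the division by zero
# that r / (1 - tol) would hit at tol == 1.
def within_tolerance(result: float, pass_value: float, tol: float) -> bool:
    return pass_value * (1 - tol) <= result <= pass_value * (1 + tol)

print(within_tolerance(100.0, 100.0, 0.0))  # True: exact match, no tolerance
print(within_tolerance(150.0, 100.0, 0.5))  # True: upper bound is 150
print(within_tolerance(0.0, 100.0, 1.0))    # True: tol == 1 allows 0..200
print(within_tolerance(201.0, 100.0, 1.0))  # False: above the 200 bound
```

Including pass_value, tol, and the resulting bounds in the exception message (item 3) would then tell the operator exactly which range the check enforced.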