[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-19 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140717#comment-17140717
 ] 

ASF subversion and git services commented on AIRFLOW-3973:
--

Commit 5bc50183e0934f7368d9cd991074b2b581114395 in airflow's branch 
refs/heads/v1-10-test from Elliott Shugerman
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=5bc5018 ]

[AIRFLOW-3973] Commit after each alembic migration (#4797)

If `Variable`s are used in DAGs, and Postgres is used for the internal
database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the
logs with error messages (but does not fail).

This commit corrects this by running each migration in a separate
transaction.

Co-authored-by: Elliott Shugerman 

(cherry picked from commit ea95e9c7236969acc807c65de0f12633d04753a0)
(cherry picked from commit 5b48a5394ecf5aa1f2b50a00807e6149ade21968)
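
One way to get a commit after each migration in Alembic is the `transaction_per_migration` option of `context.configure`. The sketch below is a minimal illustration under that assumption and is not necessarily the exact change made in #4797. In a real `env.py`, `context` would come from `from alembic import context`; here it is passed in as a parameter only so the snippet stands alone. The function name and the `connectable` argument are placeholders.

```python
def run_migrations_online(context, connectable, target_metadata=None):
    """Run Alembic migrations, committing after each revision.

    `context` is assumed to be Alembic's `alembic.context` proxy and
    `connectable` a SQLAlchemy Engine; both are injected so that this
    sketch has no import-time dependencies.
    """
    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
            # Wrap every migration script in its own transaction instead of
            # running the whole series inside a single one.
            transaction_per_migration=True,
        )
        with context.begin_transaction():
            context.run_migrations()
```

With this option set, tables created by an earlier migration are already committed by the time a later migration runs, so other connections can see them.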


> `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is 
> used for the internal database
> ---
>
> Key: AIRFLOW-3973
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3973
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Elliott Shugerman
> Assignee: Elliott Shugerman
> Priority: Minor
> Fix For: 2.0.0
>
>
> h2. Notes:
>  * This does not occur if the database is already initialized. If it is, run 
> `resetdb` instead to observe the bug.
>  * This does not occur with the default SQLite database.
> h2. Example
> {{ERROR [airflow.models.DagBag] Failed to import: /home/elliott/clean-airflow/dags/dag.py
> Traceback (most recent call last):
>   File "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1236, in _execute_context
>     cursor, statement, parameters, context
>   File "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 536, in do_execute
>     cursor.execute(statement, parameters)
> psycopg2.ProgrammingError: relation "variable" does not exist
> LINE 2: FROM variable}}
> h2. Explanation
> The first thing {{airflow initdb}} does is run the Alembic migrations. All 
> migrations are run in one transaction. Most tables, including the 
> {{variable}} table, are defined in the initial migration. A [later 
> migration|https://github.com/apache/airflow/blob/master/airflow/migrations/versions/cc1e65623dc7_add_max_tries_column_to_task_instance.py]
>  imports and initializes {{models.DagBag}}. Upon initialization, {{DagBag}} 
> calls its {{collect_dags}} method, which scans the DAGs directory and 
> attempts to load all DAGs it finds. When it loads a DAG that uses a 
> {{Variable}}, it will query the database to see if that {{Variable}} is 
> defined in the {{variable}} table. It's not clear to me how exactly the 
> connection for that query is created, but I think it is apparent that it does 
> _not_ use the same transaction that is used to run the migrations. Since the 
> migrations are not yet complete, and all migrations are run in one 
> transaction, the migration that creates the {{variable}} table has not yet 
> been committed, and therefore the table is not visible to any other 
> connection/transaction. The query raises {{ProgrammingError}}, which is caught 
> and logged by {{collect_dags}}.
>  
> h2. Proposed Solution
> Run each Alembic migration in its own transaction. I will open a pull request 
> which accomplishes this shortly.
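
The visibility problem described in the Explanation above can be illustrated with a toy model. This is plain Python, not Postgres or Alembic: `ToyDatabase`, `run_migrations`, and the two migration functions are hypothetical names that only mimic the rule that tables created in an uncommitted transaction are invisible to other connections.

```python
class ToyDatabase:
    """Tracks which tables are committed and which are pending in an open transaction."""

    def __init__(self):
        self.committed = set()
        self.pending = set()

    def create_table(self, name):
        # DDL inside the open transaction: not yet visible to other connections.
        self.pending.add(name)

    def commit(self):
        self.committed |= self.pending
        self.pending.clear()

    def visible_to_other_connection(self, name):
        # A second connection only sees committed tables.
        return name in self.committed


def run_migrations(db, per_migration_commit):
    """Return whether `variable` was visible when the later migration loaded DAGs."""
    observed = []

    def initial_migration():
        db.create_table("variable")

    def later_migration():
        # Stands in for DagBag loading a DAG that reads a Variable
        # through a separate connection.
        observed.append(db.visible_to_other_connection("variable"))

    for migration in (initial_migration, later_migration):
        migration()
        if per_migration_commit:
            db.commit()
    db.commit()  # final commit happens in both modes
    return observed[0]


print(run_migrations(ToyDatabase(), per_migration_commit=False))  # False
print(run_migrations(ToyDatabase(), per_migration_commit=True))   # True
```

In the single-transaction mode the later migration cannot see the `variable` table, which corresponds to the logged `ProgrammingError`; committing after each migration makes the table visible before the DAGs are loaded.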



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136036#comment-17136036
 ] 

ASF subversion and git services commented on AIRFLOW-3973:
--

Commit 8f552cc09a54989f6212c735d2afca1a76431576 in airflow's branch 
refs/heads/v1-10-test from Elliott Shugerman
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=8f552cc ]

[AIRFLOW-3973] Commit after each alembic migration (#4797)

If `Variable`s are used in DAGs, and Postgres is used for the internal
database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the
logs with error messages (but does not fail).

This commit corrects this by running each migration in a separate
transaction.

Co-authored-by: Elliott Shugerman 

(cherry picked from commit ea95e9c7236969acc807c65de0f12633d04753a0)



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-15 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135798#comment-17135798
 ] 

ASF subversion and git services commented on AIRFLOW-3973:
--

Commit b5e7914a4510f6e6a787002b249c3f0cfced2094 in airflow's branch 
refs/heads/v1-10-test from Elliott Shugerman
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=b5e7914 ]

[AIRFLOW-3973] Commit after each alembic migration (#4797)

If `Variable`s are used in DAGs, and Postgres is used for the internal
database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the
logs with error messages (but does not fail).

This commit corrects this by running each migration in a separate
transaction.

Co-authored-by: Elliott Shugerman 

(cherry picked from commit ea95e9c7236969acc807c65de0f12633d04753a0)



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134984#comment-17134984
 ] 

ASF subversion and git services commented on AIRFLOW-3973:
--

Commit 7348c0f4c6a9214df7116b8d63ea8922d26c9b97 in airflow's branch 
refs/heads/v1-10-test from Elliott Shugerman
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=7348c0f ]

[AIRFLOW-3973] Commit after each alembic migration (#4797)

If `Variable`s are used in DAGs, and Postgres is used for the internal
database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the
logs with error messages (but does not fail).

This commit corrects this by running each migration in a separate
transaction.

Co-authored-by: Elliott Shugerman 

(cherry picked from commit ea95e9c7236969acc807c65de0f12633d04753a0)



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129818#comment-17129818
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

eeshugerman commented on pull request #9182:
URL: https://github.com/apache/airflow/pull/9182#issuecomment-641595881


   :+1: I see, thanks, I was wondering how that works.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129761#comment-17129761
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

ashb edited a comment on pull request #9182:
URL: https://github.com/apache/airflow/pull/9182#issuecomment-641543813


   Ah, we don't need a PR in this case -- just marking the original PR for the 
milestone 1.10.11 is enough and we'd cherry pick it.
   
   Since you've got it opened I'll merge this though.
   
   (We like cherry-picks to have the "(cherry picked from commit ...)" footer 
that `-x` adds, and our tooling to work out which commits are yet to be 
backported for a release relies upon the `(#)` on the end of the commit, so 
I have manually done both of these for this PR.)
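
As an illustration of the two conventions mentioned above, the following sketch pulls the trailing PR number and the cherry-pick source out of a commit message. The regular expressions and the `parse_backport_info` helper are hypothetical and are not Airflow's actual release tooling; the sample message is abbreviated from the commit message quoted in this thread.

```python
import re

# Trailing "(#1234)" on the subject line.
PR_SUFFIX = re.compile(r"\(#(\d+)\)\s*$")
# Footer line of the form added by `git cherry-pick -x`.
CHERRY_PICK = re.compile(
    r"^\(cherry picked from commit ([0-9a-f]{40})\)$", re.MULTILINE
)


def parse_backport_info(message):
    """Return (pr_number or None, list of cherry-picked source hashes)."""
    subject = message.splitlines()[0]
    pr = PR_SUFFIX.search(subject)
    picks = CHERRY_PICK.findall(message)
    return (int(pr.group(1)) if pr else None, picks)


message = """[AIRFLOW-3973] Commit after each alembic migration (#4797)

This commit corrects this by running each migration in a separate
transaction.

(cherry picked from commit ea95e9c7236969acc807c65de0f12633d04753a0)
"""

print(parse_backport_info(message))
```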





[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129758#comment-17129758
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

ashb merged pull request #9182:
URL: https://github.com/apache/airflow/pull/9182


   





[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129759#comment-17129759
 ] 

ASF subversion and git services commented on AIRFLOW-3973:
--

Commit 5b48a5394ecf5aa1f2b50a00807e6149ade21968 in airflow's branch 
refs/heads/v1-10-stable from Elliott Shugerman
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=5b48a53 ]

[AIRFLOW-3973] Commit after each alembic migration (#4797)

If `Variable`s are used in DAGs, and Postgres is used for the internal
database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the
logs with error messages (but does not fail).

This commit corrects this by running each migration in a separate
transaction.

Co-authored-by: Elliott Shugerman 

(cherry picked from commit ea95e9c7236969acc807c65de0f12633d04753a0)

> `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is 
> used for the internal database
> ---
>
> Key: AIRFLOW-3973
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3973
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Elliott Shugerman
>Assignee: Elliott Shugerman
>Priority: Minor
> Fix For: 2.0.0
>
>
> h2. Notes:
>  * This does not occur if the database is already initialized. If it is, run 
> `resetdb` instead to observe the bug.
>  * This does not occur with the default SQLite database.
> h2. Example
> {{ERROR [airflow.models.DagBag] Failed to import: 
> /home/elliott/clean-airflow/dags/dag.py Traceback (most recent call last): 
> File 
> "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py",
>  line 1236, in _execute_context cursor, statement, parameters, context File 
> "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py",
>  line 536, in do_execute cursor.execute(statement, parameters) 
> psycopg2.ProgrammingError: relation "variable" does not exist LINE 2: FROM 
> variable}}
> h2. Explanation
> The first thing {{airflow initdb}} does is run the Alembic migrations. All 
> migrations are run in one transaction. Most tables, including the 
> {{variable}} table, are defined in the initial migration. A [later 
> migration|https://github.com/apache/airflow/blob/master/airflow/migrations/versions/cc1e65623dc7_add_max_tries_column_to_task_instance.py]
>  imports and initializes {{models.DagBag}}. Upon initialization, {{DagBag}} 
> calls its {{collect_dags}} method, which scans the DAGs directory and 
> attempts to load all DAGs it finds. When it loads a DAG that uses a 
> {{Variable}}, it will query the database to see if that {{Variable}} is 
> defined in the {{variable}} table. It's not clear to me how exactly the 
> connection for that query is created, but I think it is apparent that it does 
> _not_ use the same transaction that is used to run the migrations. Since the 
> migrations are not yet complete, and all migrations are run in one 
> transaction, the migration that creates the {{variable}} table has not yet 
> been committed, and therefore the table does not exist to any other 
> connection/transaction. This raises {{ProgrammingError}}, which is caught and 
> logged by {{collect_dags}}.
>  
> h2. Proposed Solution
> Run each Alembic migration in its own transaction. I will open a pull request 
> which accomplishes this shortly.
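
The visibility problem described in the Explanation above can be illustrated with a toy model. The sketch below is not Airflow or Alembic code: `ToyDatabase`, `ToyConnection`, and `run_migrations` are hypothetical names, and the model assumes only that a table created inside an open transaction stays invisible to other connections until that transaction commits, which is the behavior the report attributes to Postgres.

```python
class ToyDatabase:
    """Toy model of transactional DDL: a table is visible to other
    connections only after the creating transaction commits."""

    def __init__(self):
        self.committed_tables = set()

    def connect(self):
        return ToyConnection(self)


class ToyConnection:
    def __init__(self, db):
        self.db = db
        self.pending_tables = set()  # tables created in the open transaction

    def create_table(self, name):
        self.pending_tables.add(name)

    def commit(self):
        self.db.committed_tables |= self.pending_tables
        self.pending_tables.clear()

    def table_exists(self, name):
        # A connection sees committed tables plus its own uncommitted ones.
        return name in self.db.committed_tables or name in self.pending_tables


def run_migrations(db, transaction_per_migration):
    """Run two hypothetical migrations; return whether a second connection
    could see the `variable` table while the later migration was running."""
    migration_conn = db.connect()
    observed = []

    def initial_migration():
        migration_conn.create_table("variable")

    def later_migration():
        # Stands in for DagBag loading a DAG that reads a Variable
        # through a separate connection.
        other_conn = db.connect()
        observed.append(other_conn.table_exists("variable"))

    for migration in (initial_migration, later_migration):
        migration()
        if transaction_per_migration:
            migration_conn.commit()  # commit after each migration
    migration_conn.commit()  # otherwise, one commit at the very end
    return observed[0]


print(run_migrations(ToyDatabase(), transaction_per_migration=False))  # False
print(run_migrations(ToyDatabase(), transaction_per_migration=True))   # True
```

With a single transaction the second connection does not see the table, which corresponds to the logged `ProgrammingError`. With a commit after each migration it does. In a real Alembic `env.py`, per-migration transactions can be requested by passing `transaction_per_migration=True` to `context.configure()`; whether the pull request mentioned above takes exactly that route is an assumption here.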



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129757#comment-17129757
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

ashb commented on pull request #9182:
URL: https://github.com/apache/airflow/pull/9182#issuecomment-641543813


   Ah, we don't need a PR in this case -- just marking the original PR for the 
milestone 1.10.11 is enough and we'd cherry pick it.
   
   Since you've got it opened I'll merge this though.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129748#comment-17129748
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

eeshugerman commented on pull request #9182:
URL: https://github.com/apache/airflow/pull/9182#issuecomment-641534402


   This is already in `master`. I'm trying to backport to v1.10. Please see the 
discussion here: https://github.com/apache/airflow/pull/4797
   
   (Sorry for the confusion, I should have added a note of explanation to 
these.)



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129699#comment-17129699
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

ashb commented on pull request #9182:
URL: https://github.com/apache/airflow/pull/9182#issuecomment-641497817


   As mentioned in our other PR 
https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#airflow-git-branches
 -- this PR needs to target master, not v1-10-test/stable please.



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129698#comment-17129698
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

ashb closed pull request #9183:
URL: https://github.com/apache/airflow/pull/9183


   



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129697#comment-17129697
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

ashb commented on pull request #9183:
URL: https://github.com/apache/airflow/pull/9183#issuecomment-641497412


   Duplicate of #9182 - please see 
https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#airflow-git-branches



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129581#comment-17129581
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

Fokko commented on pull request #4797:
URL: https://github.com/apache/airflow/pull/4797#issuecomment-640588665


   @eeshugerman please make a PR to the `v1-10-test` branch and I'll merge it.



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129539#comment-17129539
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

potiuk commented on pull request #4797:
URL: https://github.com/apache/airflow/pull/4797#issuecomment-640687352


   Just a small update: I think the merge PR should go to v1-10-stable. In test we 
usually cherry-pick stuff (I have 23 cherry-picked commits to push soon), so it is 
better to keep it this way. 



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129502#comment-17129502
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

eeshugerman commented on pull request #4797:
URL: https://github.com/apache/airflow/pull/4797#issuecomment-640745946


   A cherry-pick/rebase was required to get a clean PR to `v1-10-stable` or 
`v1-10-test`, so I wasn't sure which to target. I opened a PR for each; feel free to 
close one or the other.
   
   https://github.com/apache/airflow/pull/9182
   https://github.com/apache/airflow/pull/9183
   
   I'm assuming the failing "PR is not up to date with master. Please rebase." 
check is not relevant here but let me know if I have the wrong idea.



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129498#comment-17129498
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

eeshugerman edited a comment on pull request #4797:
URL: https://github.com/apache/airflow/pull/4797#issuecomment-640745946


   A cherry-pick/rebase was required to get a clean PR to `v1-10-stable` or 
`v1-10-test`, so I wasn't sure which to target. I opened a PR for each; feel free to 
close one or the other.
   
   https://github.com/apache/airflow/pull/9182
   https://github.com/apache/airflow/pull/9183
   
   I'm assuming the failing `PR is not up to date with master. Please rebase.` 
check is not relevant here but let me know if I have the wrong idea.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is 
> used for the internal database
> ---
>
> Key: AIRFLOW-3973
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3973
> Project: Apache Airflow
>  Issue Type: Bug
>Reporter: Elliott Shugerman
>Assignee: Elliott Shugerman
>Priority: Minor
> Fix For: 2.0.0
>
>
> h2. Notes:
>  * This does not occur if the database is already initialized. If it is, run 
> `resetdb` instead to observe the bug.
>  * This does not occur with the default SQLite database.
> h2. Example
> {{ERROR [airflow.models.DagBag] Failed to import: 
> /home/elliott/clean-airflow/dags/dag.py Traceback (most recent call last): 
> File 
> "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/base.py",
>  line 1236, in _execute_context cursor, statement, parameters, context File 
> "/home/elliott/.virtualenvs/airflow/lib/python3.6/site-packages/sqlalchemy/engine/default.py",
>  line 536, in do_execute cursor.execute(statement, parameters) 
> psycopg2.ProgrammingError: relation "variable" does not exist LINE 2: FROM 
> variable}}
> h2. Explanation
> The first thing {{airflow initdb}} does is run the Alembic migrations. All 
> migrations are run in one transaction. Most tables, including the 
> {{variable}} table, are defined in the initial migration. A [later 
> migration|https://github.com/apache/airflow/blob/master/airflow/migrations/versions/cc1e65623dc7_add_max_tries_column_to_task_instance.py]
>  imports and initializes {{models.DagBag}}. Upon initialization, {{DagBag}} 
> calls its {{collect_dags}} method, which scans the DAGs directory and 
> attempts to load all DAGs it finds. When it loads a DAG that uses a 
> {{Variable}}, it will query the database to see if that {{Variable}} is 
> defined in the {{variable}} table. It's not clear to me how exactly the 
> connection for that query is created, but I think it is apparent that it does 
> _not_ use the same transaction that is used to run the migrations. Since the 
> migrations are not yet complete, and all migrations are run in one 
> transaction, the migration that creates the {{variable}} table has not yet 
> been committed, and therefore the table does not exist to any other 
> connection/transaction. This raises {{ProgrammingError}}, which is caught and 
> logged by {{collect_dags}}.
>  
> h2. Proposed Solution
> Run each Alembic migration in its own transaction. I will open a pull request 
> which accomplishes this shortly.
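
The visibility problem described in the Explanation can be illustrated in isolation. The following is a minimal sketch, not Airflow's or Alembic's code. It assumes a file-based SQLite database from the Python standard library purely as a self-contained stand-in for Postgres (the ticket notes the bug does not reproduce with Airflow's default SQLite setup, so this shows only the transaction-visibility mechanism). The names `MIGRATIONS`, `run_migrations`, `demo`, `migrator` and `dagbag`, and the table columns, are hypothetical.

```python
import os
import sqlite3
import tempfile

# Hypothetical stand-ins for two Alembic migrations. The second one "loads
# DAGs", which is modelled as a query on `variable` from another connection.
MIGRATIONS = [
    ("CREATE TABLE variable (key TEXT, val TEXT)", None),
    ("CREATE TABLE task_instance (max_tries INTEGER)", "SELECT key FROM variable"),
]


def run_migrations(db_path, per_migration_commit):
    """Apply MIGRATIONS and return the errors seen by the second connection."""
    # isolation_level=None disables implicit transactions, so every BEGIN and
    # COMMIT below is explicit.
    migrator = sqlite3.connect(db_path, isolation_level=None)
    dagbag = sqlite3.connect(db_path, isolation_level=None)
    errors = []
    try:
        if not per_migration_commit:
            migrator.execute("BEGIN")  # one transaction around everything
        for ddl, probe in MIGRATIONS:
            if per_migration_commit:
                migrator.execute("BEGIN")  # one transaction per migration
            migrator.execute(ddl)
            if probe is not None:
                try:
                    # Runs on a separate connection, outside the migrator's
                    # transaction, so it only sees committed tables.
                    dagbag.execute(probe).fetchall()
                except sqlite3.OperationalError as exc:
                    errors.append(str(exc))  # recorded, not re-raised
            if per_migration_commit:
                migrator.execute("COMMIT")
        if not per_migration_commit:
            migrator.execute("COMMIT")
    finally:
        dagbag.close()
        migrator.close()
    return errors


def demo(per_migration_commit):
    """Run the sketch against a throwaway database file."""
    with tempfile.TemporaryDirectory() as tmp:
        return run_migrations(os.path.join(tmp, "meta.db"), per_migration_commit)


if __name__ == "__main__":
    print(demo(per_migration_commit=False))
    print(demo(per_migration_commit=True))
```

With a single enclosing transaction the second connection should report `no such table: variable`; with a commit after each migration the list of errors is empty. In Alembic itself, `EnvironmentContext.configure()` accepts a `transaction_per_migration` flag with this effect; whether the pull request uses that flag is not stated in this thread.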



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129483#comment-17129483
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

eeshugerman opened a new pull request #9183:
URL: https://github.com/apache/airflow/pull/9183


   If `Variable`s are used in DAGs, and Postgres is used for the internal
   database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the
   logs with error messages (but does not fail).
   
   This commit corrects this by running each migration in a separate
   transaction.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129452#comment-17129452
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

eeshugerman opened a new pull request #9182:
URL: https://github.com/apache/airflow/pull/9182


   If `Variable`s are used in DAGs, and Postgres is used for the internal
   database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the
   logs with error messages (but does not fail).
   
   This commit corrects this by running each migration in a separate
   transaction.





[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2020-06-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127143#comment-17127143
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

eeshugerman commented on pull request #4797:
URL: https://github.com/apache/airflow/pull/4797#issuecomment-639885227


   @Fokko Could we backport this to v1.10? Is it just a matter of opening a PR
   to `v1-10-test`?





[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2019-03-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783415#comment-16783415
 ] 

ASF subversion and git services commented on AIRFLOW-3973:
--

Commit ea95e9c7236969acc807c65de0f12633d04753a0 in airflow's branch 
refs/heads/master from Elliott Shugerman
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=ea95e9c ]

[AIRFLOW-3973] Commit after each alembic migration (#4797)

If `Variable`s are used in DAGs, and Postgres is used for the internal
database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the
logs with error messages (but does not fail).

This commit corrects this by running each migration in a separate
transaction.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2019-03-04 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783414#comment-16783414
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

Fokko commented on pull request #4797: [AIRFLOW-3973] Run each Alembic 
migration in separate transaction
URL: https://github.com/apache/airflow/pull/4797
 
 
   
 



[jira] [Commented] (AIRFLOW-3973) `airflow initdb` logs errors when `Variable` is used in DAGs and Postgres is used for the internal database

2019-02-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16779731#comment-16779731
 ] 

ASF GitHub Bot commented on AIRFLOW-3973:
-

eeshugerman commented on pull request #4797: [AIRFLOW-3973] problem: `initdb` 
spams log with errors | solution: run each migration in its own transaction
URL: https://github.com/apache/airflow/pull/4797
 
 
   ### Jira
   https://issues.apache.org/jira/browse/AIRFLOW-3973
   
   ### Description
   If `Variable`s are used in DAGs, and Postgres is used for the internal
   database, a fresh `$ airflow initdb` (or `$ airflow resetdb`) spams the logs
   with error messages (but does not fail).
   
   This commit corrects this by running each migration in a separate
   transaction.
   
   See Jira ticket for more details.
   
   I have tested this change with the default SQLite database and, of course,
   with Postgres.
   
   ### Tests
   
   No tests included as this is a one-line change which adds no functionality
   whatsoever.
 
