[GitHub] [airflow] ashb commented on a change in pull request #5979: [AIRFLOW-5373] Super fast pre-commit check for basic python2 compatib…
ashb commented on a change in pull request #5979: [AIRFLOW-5373] Super fast pre-commit check for basic python2 compatib…
URL: https://github.com/apache/airflow/pull/5979#discussion_r320254712

## File path: airflow/contrib/example_dags/example_qubole_operator.py

@@ -198,7 +198,7 @@ def compare_result(ds, **kwargs):
 /** Computes an approximation to pi */
 object SparkPi {
-  def main(args: Array[String]) {
+  def main(args) {

Review comment:
Looks like https://pre-commit.com/#regular-expressions to exclude the whole file is the only way to do this one. A bit overly broad, but this file isn't changed very often, so that's probably okay. Alternatively, adjust the regex so it doesn't allow `) {` on the line (a trickier regex to get working).

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
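The second option mentioned in the review, tightening the check's pattern so Scala-style `def main(args) {` lines are not flagged, can be sketched with a negative lookahead. The pattern below is a hypothetical illustration written for this discussion, not the actual regex used by the pre-commit hook:

```python
import re

# Hypothetical sketch: flag Python-style "def name(...):" definitions
# while skipping Scala-style "def name(...) {" lines, using a negative
# lookahead so a brace after the closing paren suppresses the match.
PATTERN = re.compile(r"def \w+\([^)]*\)(?!\s*\{)")

assert PATTERN.search("def main(args):")                          # flagged
assert PATTERN.search("def main(args) {") is None                 # skipped
assert PATTERN.search("def main(args: Array[String]) {") is None  # skipped
```

A whole-file `exclude:` entry in `.pre-commit-config.yaml`, as the comment suggests, remains the simpler (if broader) alternative.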
[jira] [Commented] (AIRFLOW-5365) When switching between branches (master/v1-10-test) rebuild of image should not be needed for pre-commits
[ https://issues.apache.org/jira/browse/AIRFLOW-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921401#comment-16921401 ]

ASF subversion and git services commented on AIRFLOW-5365:
----------------------------------------------------------

Commit 6a213183b132b229908bd2b85c8abb7dc86e88d7 in airflow's branch refs/heads/v1-10-test from Jarek Potiuk
[ https://gitbox.apache.org/repos/asf?p=airflow.git;h=6a21318 ]

[AIRFLOW-5365] No need to do image rebuild when switching master/v1-10-test (#5972)

(cherry picked from commit 319b80437cf7e58d0ceecf9a58e336e14936b163)

> When switching between branches (master/v1-10-test) rebuild of image should
> not be needed for pre-commits
>
> Key: AIRFLOW-5365
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5365
> Project: Apache Airflow
> Issue Type: Improvement
> Components: ci
> Affects Versions: 2.0.0, 1.10.5
> Reporter: Jarek Potiuk
> Priority: Major
> Fix For: 1.10.5

--
This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Resolved] (AIRFLOW-5365) When switching between branches (master/v1-10-test) rebuild of image should not be needed for pre-commits
[ https://issues.apache.org/jira/browse/AIRFLOW-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Potiuk resolved AIRFLOW-5365.
-----------------------------------
    Fix Version/s: 1.10.5
       Resolution: Fixed

> When switching between branches (master/v1-10-test) rebuild of image should
> not be needed for pre-commits
>
> Key: AIRFLOW-5365
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5365
> Project: Apache Airflow
> Issue Type: Improvement
> Components: ci
> Affects Versions: 2.0.0, 1.10.5
> Reporter: Jarek Potiuk
> Priority: Major
> Fix For: 1.10.5
[GitHub] [airflow] potiuk commented on a change in pull request #5979: [AIRFLOW-5373] Super fast pre-commit check for basic python2 compatib…
potiuk commented on a change in pull request #5979: [AIRFLOW-5373] Super fast pre-commit check for basic python2 compatib…
URL: https://github.com/apache/airflow/pull/5979#discussion_r320250286

## File path: airflow/contrib/example_dags/example_qubole_operator.py

@@ -198,7 +198,7 @@ def compare_result(ds, **kwargs):
 /** Computes an approximation to pi */
 object SparkPi {
-  def main(args: Array[String]) {
+  def main(args) {

Review comment:
Ah, indeed. So I need to add a possibility to exclude such false positives. Will do soon.
[GitHub] [airflow] kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-52749

> How about instead of a background thread (I'm wary of using threads in python) could we instead query the last modified time of the serialised dag on each request?
>
> i.e. when asking for dag X we check if dag X has been updated in the db since we last loaded it?

How about this -> https://github.com/kaxil/airflow/commit/830b83cfa845c69319061076bb8515c3fa99553c

cc @coufon
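The "query the last modified time on each request" idea quoted above can be reduced to a small sketch. Everything here is hypothetical illustration, not Airflow's actual implementation: `fetch_last_updated` and `fetch_dag` stand in for the database queries against the serialized-DAG table.

```python
import datetime

# Hypothetical in-process cache: dag_id -> (loaded_at, dag)
_cache = {}

def get_dag(dag_id, fetch_last_updated, fetch_dag):
    """Return a cached DAG unless the DB copy is newer than our cached load.

    fetch_last_updated(dag_id) -> datetime of the row's last update (stand-in)
    fetch_dag(dag_id)          -> freshly deserialized DAG (stand-in)
    """
    now = datetime.datetime.now()
    cached = _cache.get(dag_id)
    if cached is not None:
        loaded_at, dag = cached
        if fetch_last_updated(dag_id) <= loaded_at:
            return dag  # DB copy unchanged since we loaded it: serve cache
    # Cache miss or stale: reload from the DB and remember when we did
    dag = fetch_dag(dag_id)
    _cache[dag_id] = (now, dag)
    return dag
```

The trade-off versus a background refresh thread is one extra (cheap, indexed) timestamp query per request in exchange for avoiding threads in the webserver workers.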
[GitHub] [airflow] kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-52749

> How about instead of a background thread (I'm wary of using threads in python) could we instead query the last modified time of the serialised dag on each request?
>
> i.e. when asking for dag X we check if dag X has been updated in the db since we last loaded it?

How about this -> https://github.com/kaxil/airflow/commit/830b83cfa845c69319061076bb8515c3fa99553c

cc @coufon @ashb
[GitHub] [airflow] kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-52749

> How about instead of a background thread (I'm wary of using threads in python) could we instead query the last modified time of the serialised dag on each request?
>
> i.e. when asking for dag X we check if dag X has been updated in the db since we last loaded it?

How about this -> https://github.com/kaxil/airflow/commit/d2ec5371ae987ff87f60ee6b0143eb66dd5dc1e7 , https://github.com/kaxil/airflow/commit/399dd656427e5d46e863fc2e7c0dda3b788fd8ed , https://github.com/kaxil/airflow/commit/c6a53b34b372466bf1a6a6f906d736284a115f55 ?

cc @coufon
[GitHub] [airflow] kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
kaxil edited a comment on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
URL: https://github.com/apache/airflow/pull/5743#issuecomment-52749

> How about instead of a background thread (I'm wary of using threads in python) could we instead query the last modified time of the serialised dag on each request?
>
> i.e. when asking for dag X we check if dag X has been updated in the db since we last loaded it?

How about this -> https://github.com/kaxil/airflow/commit/d2ec5371ae987ff87f60ee6b0143eb66dd5dc1e7 , https://github.com/kaxil/airflow/commit/399dd656427e5d46e863fc2e7c0dda3b788fd8ed , https://github.com/kaxil/airflow/commit/c6a53b34b372466bf1a6a6f906d736284a115f55 ?
[jira] [Updated] (AIRFLOW-5393) UI crash in the Ad Hoc Query menu
[ https://issues.apache.org/jira/browse/AIRFLOW-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ivan de los santos updated AIRFLOW-5393:
----------------------------------------
Description:

Airflow UI will crash in the browser, returning an "Oops" message and the traceback of the crashing error.

*How to replicate*:
# Launch airflow webserver -p 8080
# Go to the Airflow UI
# Click on "Data Profiling"
# Select any connection to a database.
# Click on the ".csv" button without writing any text in the query field.
# You will get an "Oops" message with the traceback.

*File causing the problem*: /python3.6/dist-packages/airflow/www/views.py (Line 2317)

*Reasons for the problem*:
# UnboundLocalError: local variable 'df' referenced before assignment
* This means "df" was never assigned; in fact, df is assigned inside a try/except block, so the except branch runs before df gets assigned.

{code:java}
Traceback (most recent call last):
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/www/utils.py", line 375, in view_func
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/www/views.py", line 2318, in query
    response=df.to_csv(index=False),
UnboundLocalError: local variable 'df' referenced before assignment
{code}

*Proposed solution*: check the *_error_* variable, which will be True because an exception "_(2006, "Unknown MySQL server host 'mysql' (2)")_" is raised and df is never assigned.

{code:java}
if csv:
    if not error:
        return Response(
            response=df.to_csv(index=False),
            status=200,
            mimetype="application/text")
{code}

I am willing to work on this issue; I think it might already be fixed in master, though. This is my first open issue.

Best regards,
Iván

was:

Airflow UI will crash in the browser, returning an "Oops" message and the traceback of the crashing error.

*How to replicate*:
# Launch airflow webserver -p 8080
# Go to the Airflow UI
# Click on "Data Profiling"
# Select any connection to a database.
# Click on the ".csv" button without writing any text in the query field.
# You will get an "Oops" message with the traceback.

*File causing the problem*: /python3.6/dist-packages/airflow/www/views.py (Line 2317)

*Reasons for the problem*:
# UnboundLocalError: local variable 'df' referenced before assignment
* This means "df" was never assigned; in fact, df is assigned inside a try/except block, so the except branch runs before df gets assigned.

{code:java}
Traceback (most recent call last):
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *args, **kwargs)
[jira] [Updated] (AIRFLOW-5393) UI crash in the Ad Hoc Query menu
[ https://issues.apache.org/jira/browse/AIRFLOW-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ivan de los santos updated AIRFLOW-5393:
----------------------------------------
Description:

Airflow UI will crash in the browser, returning an "Oops" message and the traceback of the crashing error.

*How to replicate*:
# Launch airflow webserver -p 8080
# Go to the Airflow UI
# Click on "Data Profiling"
# Select any connection to a database.
# Click on the ".csv" button without writing any text in the query field.
# You will get an "Oops" message with the traceback.

*File causing the problem*: /python3.6/dist-packages/airflow/www/views.py (Line 2317)

*Reasons for the problem*:
# UnboundLocalError: local variable 'df' referenced before assignment
* This means "df" was never assigned; in fact, df is assigned inside a try/except block, so the except branch runs before df gets assigned.

{code:java}
Traceback (most recent call last):
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/www/utils.py", line 375, in view_func
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/www/views.py", line 2318, in query
    response=df.to_csv(index=False),
UnboundLocalError: local variable 'df' referenced before assignment
{code}

*Proposed solution*: check the *_error_* variable, which will be True because an exception "_(2006, "Unknown MySQL server host 'mysql' (2)")_" is raised and df is never assigned.

{code:java}
if csv:
    if not error:
        return Response(
            response=df.to_csv(index=False),
            status=200,
            mimetype="application/text")
{code}

I am willing to work on this issue; I think it might already be fixed in master, though. This is my first open issue.

Best regards,
Iván

was:

Airflow UI will crash in the browser, returning an "Oops" message and the traceback of the crashing error.

*How to replicate*:
# Launch airflow webserver -p 8080
# Go to the Airflow UI
# Click on "Data Profiling"
# Select any connection to a database.
# Click on the ".csv" button without writing any text in the query field.
# You will get an "Oops" message with the traceback.

*File causing the problem*: /python3.6/dist-packages/airflow/www/views.py (Line 2317)

*Reasons for the problem*:
# UnboundLocalError: local variable 'df' referenced before assignment
* This means "df" was never assigned; in fact, df is assigned inside a try/except block, so the except will probably be launched before df gets an assignment.

{code:java}
Traceback (most recent call last):
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *
[jira] [Updated] (AIRFLOW-5393) UI crash in the Ad Hoc Query menu
[ https://issues.apache.org/jira/browse/AIRFLOW-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ivan de los santos updated AIRFLOW-5393:
----------------------------------------
Description:

Airflow UI will crash in the browser, returning an "Oops" message and the traceback of the crashing error.

*How to replicate*:
# Launch airflow webserver -p 8080
# Go to the Airflow UI
# Click on "Data Profiling"
# Select any connection to a database.
# Click on the ".csv" button without writing any text in the query field.
# You will get an "Oops" message with the traceback.

*File causing the problem*: /python3.6/dist-packages/airflow/www/views.py (Line 2317)

*Reasons for the problem*:
# UnboundLocalError: local variable 'df' referenced before assignment
* This means "df" was never assigned; in fact, df is assigned inside a try/except block, so the except will probably be launched before df gets an assignment.

{code:java}
Traceback (most recent call last):
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/www/utils.py", line 375, in view_func
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/www/views.py", line 2318, in query
    response=df.to_csv(index=False),
UnboundLocalError: local variable 'df' referenced before assignment
{code}

*Proposed solution*: Return a message indicating that the query is empty.

{code:java}
if csv:
    if not error:
        return Response(
            response=df.to_csv(index=False),
            status=200,
            mimetype="application/text")
{code}

I am willing to work on this issue; I think it might already be fixed in master, though. This is my first open issue.

Best regards,
Iván

was:

Airflow UI will crash in the browser, returning an "Oops" message and the traceback of the crashing error.

*How to replicate*:
# Launch airflow webserver -p 8080
# Go to the Airflow UI
# Click on "Data Profiling"
# Select any connection to a database.
# Click on the ".csv" button without writing any text in the query field.
# You will get an "Oops" message with the traceback.

*File causing the problem*: /python3.6/dist-packages/airflow/www/views.py (Line 2318)

*Reasons for the problem*:
# UnboundLocalError: local variable 'df' referenced before assignment
* This means "df" was never assigned; in fact, df is assigned inside a try/except block, so the except will probably be launched before df gets an assignment.

{code:java}
Traceback (most recent call last):
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/www/utils.py", line 375,
[jira] [Created] (AIRFLOW-5394) Invalid schedule interval issues
Shreyash hisariya created AIRFLOW-5394:
---------------------------------------
    Summary: Invalid schedule interval issues
    Key: AIRFLOW-5394
    URL: https://issues.apache.org/jira/browse/AIRFLOW-5394
    Project: Apache Airflow
    Issue Type: Bug
    Components: DagRun, scheduler
    Affects Versions: 1.10.2
    Reporter: Shreyash hisariya

I am facing issues with the schedule interval in Airflow. Since there is no documentation at all, it took me a few days to find out that the cron expression accepts only 5 or 6 fields. Even with 5 or 6 fields, the DAG still fails at times.

*For example: Invalid Cron expression: [0 15 10 * * ?] is not acceptable*

The above cron is valid, but Airflow doesn't accept it.
# Can you please put up documentation of what is valid and what is not?
# Why doesn't Airflow accept more than 6 fields?
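The rejected expression above, `0 15 10 * * ?`, is Quartz syntax: crontab-style parsers (which Airflow's schedule interval uses) expect 5 fields, or 6 with seconds, and do not support the Quartz-only `?` wildcard. A minimal, hypothetical pre-check illustrating the distinction (deliberately simplified, e.g. it ignores month/day names like `MON`, and it is not Airflow's actual validator):

```python
import re

# Simplified crontab field: digits, "*", steps, ranges, and lists only.
FIELD = re.compile(r"^[\d*,/-]+$")

def looks_like_crontab(expr):
    """Rough check that expr is a 5- or 6-field crontab-style expression."""
    fields = expr.split()
    if len(fields) not in (5, 6):
        return False  # 7-field Quartz expressions fail here
    return all(FIELD.match(f) for f in fields)

assert looks_like_crontab("0 15 10 * *")            # standard 5-field crontab
assert not looks_like_crontab("0 15 10 * * ?")      # Quartz "?" is rejected
assert not looks_like_crontab("0 0 12 * * ? 2019")  # 7 fields: Quartz only
```

The Quartz form roughly maps to crontab by dropping the seconds field and replacing `?` with `*`, e.g. `0 15 10 * * ?` becomes `15 10 * * *`.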
[jira] [Updated] (AIRFLOW-5393) UI crash in the Ad Hoc Query menu
[ https://issues.apache.org/jira/browse/AIRFLOW-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ivan de los santos updated AIRFLOW-5393:
----------------------------------------
Environment:
Linux
NAME="Ubuntu"
VERSION="18.04.2 LTS (Bionic Beaver)"
Airflow version 1.10.4

was: Operating system

> UI crash in the Ad Hoc Query menu
>
> Key: AIRFLOW-5393
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5393
> Project: Apache Airflow
> Issue Type: Bug
> Components: ui
> Affects Versions: 1.10.4
> Environment: Linux
> NAME="Ubuntu"
> VERSION="18.04.2 LTS (Bionic Beaver)"
> Airflow version 1.10.4
> Reporter: ivan de los santos
> Priority: Minor
> Labels: beginner, easyfix, patch
> Attachments: Captura de pantalla de 2019-09-03 13-42-02.png
>
> Airflow UI will crash in the browser, returning an "Oops" message and the
> traceback of the crashing error.
>
> *How to replicate*:
> # Launch airflow webserver -p 8080
> # Go to the Airflow UI
> # Click on "Data Profiling"
> # Select any connection to a database.
> # Click on the ".csv" button without writing any text in the query field.
> # You will get an "Oops" message with the traceback.
>
> *File causing the problem*: /python3.6/dist-packages/airflow/www/views.py (Line 2318)
>
> *Reasons for the problem*:
> # UnboundLocalError: local variable 'df' referenced before assignment
> * This means "df" was never assigned; in fact, df is assigned inside a try/except block, so the except will probably be launched before df gets an assignment.
>
> {code:java}
> Traceback (most recent call last):
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
>     response = self.full_dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
>     rv = self.handle_user_exception(e)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
>     reraise(exc_type, exc_value, tb)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
>     raise value
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
>     rv = self.dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
>     return self.view_functions[rule.endpoint](**req.view_args)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
>     return self._run_view(f, *args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
>     return fn(self, *args, **kwargs)
>   File "/usr/local/lib/python3.6/dist-packages/airflow/www/utils.py", line 375, in view_func
>     return f(*args, **kwargs)
>   File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/local/lib/python3.6/dist-packages/airflow/www/views.py", line 2318, in query
>     response=df.to_csv(index=False),
> UnboundLocalError: local variable 'df' referenced before assignment
> {code}
>
> *Proposed solution*: Return a message indicating that the query is empty.
>
> I am willing to work on this issue if someone with more experience could guide me on how he expects the application to behave.
> This is my first open issue.
>
> Best regards,
> Iván
[jira] [Updated] (AIRFLOW-5393) UI crash in the Ad Hoc Query menu
[ https://issues.apache.org/jira/browse/AIRFLOW-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ivan de los santos updated AIRFLOW-5393:
----------------------------------------
Attachment: Captura de pantalla de 2019-09-03 13-42-02.png

> UI crash in the Ad Hoc Query menu
>
> Key: AIRFLOW-5393
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5393
> Project: Apache Airflow
> Issue Type: Bug
> Components: ui
> Affects Versions: 1.10.4
> Environment: Operating system
> Reporter: ivan de los santos
> Priority: Minor
> Labels: beginner, easyfix, patch
> Attachments: Captura de pantalla de 2019-09-03 13-42-02.png
>
> Airflow UI will crash in the browser, returning an "Oops" message and the
> traceback of the crashing error.
>
> *How to replicate*:
> # Launch airflow webserver -p 8080
> # Go to the Airflow UI
> # Click on "Data Profiling"
> # Select any connection to a database.
> # Click on the ".csv" button without writing any text in the query field.
> # You will get an "Oops" message with the traceback.
>
> *File causing the problem*: /python3.6/dist-packages/airflow/www/views.py (Line 2318)
>
> *Reasons for the problem*:
> # UnboundLocalError: local variable 'df' referenced before assignment
> * This means "df" was never assigned; in fact, df is assigned inside a try/except block, so the except will probably be launched before df gets an assignment.
>
> {code:java}
> Traceback (most recent call last):
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
>     response = self.full_dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
>     rv = self.handle_user_exception(e)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
>     reraise(exc_type, exc_value, tb)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
>     raise value
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
>     rv = self.dispatch_request()
>   File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
>     return self.view_functions[rule.endpoint](**req.view_args)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
>     return self._run_view(f, *args, **kwargs)
>   File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
>     return fn(self, *args, **kwargs)
>   File "/usr/local/lib/python3.6/dist-packages/airflow/www/utils.py", line 375, in view_func
>     return f(*args, **kwargs)
>   File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/local/lib/python3.6/dist-packages/airflow/www/views.py", line 2318, in query
>     response=df.to_csv(index=False),
> UnboundLocalError: local variable 'df' referenced before assignment
> {code}
>
> *Proposed solution*: Return a message indicating that the query is empty.
>
> I am willing to work on this issue if someone with more experience could guide me on how he expects the application to behave.
> This is my first open issue.
>
> Best regards,
> Iván
[jira] [Created] (AIRFLOW-5393) UI crash in the Ad Hoc Query menu
ivan de los santos created AIRFLOW-5393: --- Summary: UI crash in the Ad Hoc Query menu Key: AIRFLOW-5393 URL: https://issues.apache.org/jira/browse/AIRFLOW-5393 Project: Apache Airflow Issue Type: Bug Components: ui Affects Versions: 1.10.4 Environment: Operating system Reporter: ivan de los santos Airflow UI will crash in the browser returning "Oops" message and the Traceback of the crashing error. *How to replicate*: # Launch airflow webserver -p 8080 # Go to the Airflow-UI # Click on "Data Profiling" # Select any connection to a database. # Click on ".csv" button without writing any text on the query field. # You will get an "oops" message with the Traceback. *File causing the problem*: /python3.6/dist-packages/airflow/www/views.py (Line 2318) *Reasons of the problem*: # UnboundLocalError: local variable 'df' referenced before assignment * This means "df" was never declared, infact df it is contained in a try / except block so the except will probably be launched before df gets an assignment. 
{code:java}
Traceback (most recent call last):
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/rde/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/rde/.local/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 69, in inner
    return self._run_view(f, *args, **kwargs)
  File "/home/rde/.local/lib/python3.6/site-packages/flask_admin/base.py", line 368, in _run_view
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/www/utils.py", line 375, in view_func
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/airflow/www/views.py", line 2318, in query
    response=df.to_csv(index=False),
UnboundLocalError: local variable 'df' referenced before assignment
{code}
*Proposed solution*: Return a message indicating that the query is empty. I am willing to work on this issue if someone with more experience could guide me on how the application is expected to behave. This is my first open issue. Best regards, Iván -- This message was sent by Atlassian Jira (v8.3.2#803003)
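To illustrate the proposed fix, here is a minimal, hypothetical sketch of guarding the CSV export path; the real handler lives in airflow/www/views.py and its actual variable and helper names differ — `run_query` below is a stand-in, not Airflow's API.

```python
# Hypothetical sketch: the real view assigns `df` inside a try/except, so an
# empty query reaches df.to_csv() with `df` unbound. Guarding before use and
# returning a message instead avoids the UnboundLocalError.

def export_query_as_csv(sql, run_query):
    """Return CSV text for `sql`, or a friendly message when the query is empty.

    `run_query` stands in for the code that executes SQL and returns a
    pandas-like DataFrame; it is not Airflow's real API.
    """
    df = None
    if sql and sql.strip():
        try:
            df = run_query(sql)
        except Exception:
            df = None
    if df is None:
        # Previously this path crashed at df.to_csv(index=False)
        return "The query is empty or failed; nothing to export."
    return df.to_csv(index=False)
```

The key design point is initializing `df` before the try block, so every code path sees a bound name.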
[jira] [Commented] (AIRFLOW-5129) Add typehint to gcp_dlp_hook.py
[ https://issues.apache.org/jira/browse/AIRFLOW-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921354#comment-16921354 ] ASF subversion and git services commented on AIRFLOW-5129: -- Commit a9ba91579960d1a642a7e19b35ac91a84375fff6 in airflow's branch refs/heads/master from Ryan Yuan [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=a9ba915 ] [AIRFLOW-5129] Add typehint to GCP DLP hook (#5980) > Add typehint to gcp_dlp_hook.py > --- > > Key: AIRFLOW-5129 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5129 > Project: Apache Airflow > Issue Type: Improvement > Components: gcp >Affects Versions: 1.10.3 >Reporter: Ryan Yuan >Assignee: Ryan Yuan >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (AIRFLOW-5129) Add typehint to gcp_dlp_hook.py
[ https://issues.apache.org/jira/browse/AIRFLOW-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921353#comment-16921353 ] ASF GitHub Bot commented on AIRFLOW-5129: - mik-laj commented on pull request #5980: [AIRFLOW-5129] Add typehint to GCP DLP hook URL: https://github.com/apache/airflow/pull/5980 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add typehint to gcp_dlp_hook.py > --- > > Key: AIRFLOW-5129 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5129 > Project: Apache Airflow > Issue Type: Improvement > Components: gcp >Affects Versions: 1.10.3 >Reporter: Ryan Yuan >Assignee: Ryan Yuan >Priority: Minor > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[GitHub] [airflow] mik-laj merged pull request #5980: [AIRFLOW-5129] Add typehint to GCP DLP hook
mik-laj merged pull request #5980: [AIRFLOW-5129] Add typehint to GCP DLP hook URL: https://github.com/apache/airflow/pull/5980 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
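For readers unfamiliar with what "add typehint" changes look like in a hook, here is a tiny illustrative sketch; the class and method names below are hypothetical, not the real `gcp_dlp_hook.py` API.

```python
from typing import Dict, List, Optional


class ExampleDlpHook:
    """Hypothetical hook showing the style of typehinted signatures the
    merged PR adds: parameter and return types instead of bare names."""

    def list_info_types(self,
                        language_code: Optional[str] = None,
                        result_filter: Optional[str] = None) -> List[Dict[str, str]]:
        """Return supported info types (stubbed result for illustration)."""
        _ = (language_code, result_filter)  # unused in this sketch
        return [{"name": "EMAIL_ADDRESS"}]
```

Typehints like these let mypy and IDEs catch call-site mistakes without changing runtime behavior.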
[GitHub] [airflow] mik-laj commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI
mik-laj commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI URL: https://github.com/apache/airflow/pull/5975#discussion_r320224848 ## File path: airflow/bin/cli.py ## @@ -441,6 +442,33 @@ def set_is_paused(is_paused, args): print("Dag: {}, paused: {}".format(args.dag_id, str(is_paused)))

+def show_dag(args):
+    dag = get_dag(args)
+    dot = render_dag(dag)
+    if args.save:
+        filename, _, fileformat = args.save.rpartition('.')
+        dot.render(filename=filename, format=fileformat, cleanup=True)
+        print("File {} saved".format(args.save))
+    elif args.imgcat:
+        data = dot.pipe(format='png')
+        try:
+            proc = subprocess.Popen("imgcat", stdout=subprocess.PIPE, stdin=subprocess.PIPE)
+        except OSError as e:
+            if e.errno == errno.ENOENT:
+                raise AirflowException(
+                    "Failed to execute. Make sure the imgcat executables are on your systems \'PATH\'"
+                )
+            else:
+                raise
+        out, err = proc.communicate(data)
+        if out:
+            print(out.decode('utf-8'))
+        if err:
+            print(err.decode('utf-8'))
+    else:
+        print(dot.source)

Review comment: After that, I thought about the problem of these logs appearing. I think Airflow should be completely independent of the CLI. Docker and Kubernetes work similarly. This is important if we want Airflow to be completely serverless and secure. If I understand correctly, the CLI currently always has full access to Airflow. This is on my list of ideas. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] mik-laj commented on a change in pull request #5944: [AIRFLOW-5362] Reorder imports
mik-laj commented on a change in pull request #5944: [AIRFLOW-5362] Reorder imports URL: https://github.com/apache/airflow/pull/5944#discussion_r320219308 ## File path: setup.py ## @@ -268,6 +268,7 @@ def write_version(filename: str = os.path.join(*["airflow", "git_version"])):

 'click==6.7',
 'flake8>=3.6.0',
 'flake8-colors',
+'flake8-isort',

Review comment: Discussion about flake8-isort in Airflow is available here: https://github.com/apache/airflow/pull/4892#issuecomment-472369397 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] mik-laj commented on a change in pull request #5944: [AIRFLOW-5362] Reorder imports
mik-laj commented on a change in pull request #5944: [AIRFLOW-5362] Reorder imports URL: https://github.com/apache/airflow/pull/5944#discussion_r320219308 ## File path: setup.py ## @@ -268,6 +268,7 @@ def write_version(filename: str = os.path.join(*["airflow", "git_version"])):

 'click==6.7',
 'flake8>=3.6.0',
 'flake8-colors',
+'flake8-isort',

Review comment: Discussion about flake8-import-order in Airflow is available here: https://github.com/apache/airflow/pull/4892#issuecomment-472369397 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] mik-laj edited a comment on issue #5975: [AIRFLOW-5368] Display DAG from the CLI
mik-laj edited a comment on issue #5975: [AIRFLOW-5368] Display DAG from the CLI URL: https://github.com/apache/airflow/pull/5975#issuecomment-527248123 Screenshot: https://user-images.githubusercontent.com/12058428/64134169-c8171a80-cddb-11e9-9fd6-46f5548fbfde.png I updated the PR: * Add task coloring according to the ui_color and ui_fgcolor properties * Tasks are drawn using rounded rectangles * Rendering logic has been moved to a separate file. It can be useful when someone wants to generate automatic documentation This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
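To make the rendering change concrete, here is a minimal sketch of emitting DOT text for tasks colored by their `ui_color`/`ui_fgcolor` and drawn as rounded rectangles. This writes DOT by hand purely for illustration; the real `render_dag` helper in the PR presumably builds the graph with the graphviz package instead.

```python
def render_tasks(tasks):
    """Sketch: emit Graphviz DOT where each (task_id, ui_color, ui_fgcolor)
    tuple becomes a filled, rounded rectangle node."""
    lines = ["digraph dag {"]
    for task_id, ui_color, ui_fgcolor in tasks:
        lines.append(
            '  "{0}" [label="{0}" shape=rectangle style="filled,rounded" '
            'fillcolor="{1}" fontcolor="{2}"]'.format(task_id, ui_color, ui_fgcolor))
    lines.append("}")
    return "\n".join(lines)

dot_source = render_tasks([("extract", "#f0ede4", "#000000"),
                           ("load", "#e8f7e4", "#000000")])
```

The resulting `dot_source` string can be fed to the `dot` binary or piped to `imgcat`, matching the CLI flow discussed in this thread.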
[jira] [Commented] (AIRFLOW-5088) To implement DAG JSON serialization and DB persistence for webserver scalability improvement
[ https://issues.apache.org/jira/browse/AIRFLOW-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921339#comment-16921339 ] ASF GitHub Bot commented on AIRFLOW-5088: - kaxil commented on pull request #5992: [AIRFLOW-5088][AIP-24][BackPort] Persisting serialized DAG in DB for webserver scalability URL: https://github.com/apache/airflow/pull/5992 Make sure you have checked _all_ steps below. ### Jira - [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-5088 - https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-24+DAG+Persistence+in+DB+using+JSON+for+Airflow+Webserver+and+%28optional%29+Scheduler ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: **Backport of https://github.com/apache/airflow/pull/5743 for v1-10-* branches ** Based on #5701, this PR implements functionalities including writing serialized DAGs to DB in scheduler, reading DAGs from DB in webserver, controlled by [core] dagcached The goal is to decouple webserver from the DAG folder, instead it reads everything from database. Rendering template by functions is an exception, in that case it needs to re-import DAG, because functions are stringified in serialized DAG. ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. 
Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - All the public functions and the classes in the PR contain docstrings that explain what it does - If you implement backwards incompatible changes, please leave a note in the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so we can assign it to an appropriate release ### Code Quality - [ ] Passes `flake8` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > To implement DAG JSON serialization and DB persistence for webserver > scalability improvement > > > Key: AIRFLOW-5088 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5088 > Project: Apache Airflow > Issue Type: Improvement > Components: DAG, webserver >Affects Versions: 1.10.5 >Reporter: Zhou Fang >Assignee: Zhou Fang >Priority: Major > > Created this issue for starting to implement DAG serialization using JSON and > persistence in DB. Serialized DAG will be used in webserver for solving the > webserver scalability issue. > > The implementation is based on AIP-24: > [https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-24+DAG+Persistence+in+DB+using+JSON+for+Airflow+Webserver+and+%28optional%29+Scheduler] > > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[GitHub] [airflow] kaxil opened a new pull request #5992: [AIRFLOW-5088][AIP-24][BackPort] Persisting serialized DAG in DB for webserver scalability
kaxil opened a new pull request #5992: [AIRFLOW-5088][AIP-24][BackPort] Persisting serialized DAG in DB for webserver scalability URL: https://github.com/apache/airflow/pull/5992 Make sure you have checked _all_ steps below. ### Jira - [x] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-5088 - https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-24+DAG+Persistence+in+DB+using+JSON+for+Airflow+Webserver+and+%28optional%29+Scheduler ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: **Backport of https://github.com/apache/airflow/pull/5743 for v1-10-* branches ** Based on #5701, this PR implements functionalities including writing serialized DAGs to DB in scheduler, reading DAGs from DB in webserver, controlled by [core] dagcached The goal is to decouple webserver from the DAG folder, instead it reads everything from database. Rendering template by functions is an exception, in that case it needs to re-import DAG, because functions are stringified in serialized DAG. ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. 
Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - All the public functions and the classes in the PR contain docstrings that explain what it does - If you implement backwards incompatible changes, please leave a note in the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so we can assign it to a appropriate release ### Code Quality - [ ] Passes `flake8` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] mik-laj commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI
mik-laj commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI URL: https://github.com/apache/airflow/pull/5975#discussion_r320214775 ## File path: airflow/gcp/example_dags/example_vision.py ## @@ -52,6 +52,7 @@ import airflow from airflow import models +from airflow.contrib.sensors.aws_glue_catalog_partition_sensor import AwsGlueCatalogPartitionSensor Review comment: Yes. Fixed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] mik-laj commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI
mik-laj commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI URL: https://github.com/apache/airflow/pull/5975#discussion_r320214538 ## File path: airflow/bin/cli.py ## @@ -441,6 +442,33 @@ def set_is_paused(is_paused, args): print("Dag: {}, paused: {}".format(args.dag_id, str(is_paused)))

+def show_dag(args):
+    dag = get_dag(args)
+    dot = render_dag(dag)
+    if args.save:
+        filename, _, fileformat = args.save.rpartition('.')
+        dot.render(filename=filename, format=fileformat, cleanup=True)
+        print("File {} saved".format(args.save))
+    elif args.imgcat:
+        data = dot.pipe(format='png')
+        try:
+            proc = subprocess.Popen("imgcat", stdout=subprocess.PIPE, stdin=subprocess.PIPE)
+        except OSError as e:
+            if e.errno == errno.ENOENT:
+                raise AirflowException(
+                    "Failed to execute. Make sure the imgcat executables are on your systems \'PATH\'"
+                )
+            else:
+                raise
+        out, err = proc.communicate(data)
+        if out:
+            print(out.decode('utf-8'))
+        if err:
+            print(err.decode('utf-8'))
+    else:
+        print(dot.source)

Review comment: It works. I checked it.
```
airflow dags show example_gcp_vision_explicit_id --save a.dot
```
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] kaxil commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
kaxil commented on issue #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability URL: https://github.com/apache/airflow/pull/5743#issuecomment-52749 > How about instead of a background thread (I'm wary of using threads in python) could we instead query the last modified time of the serialised dag on each request? > > i.e. when asking for dag X we check if dag X has been updated in the db since we last loaded it? How about this -> https://github.com/kaxil/airflow/commit/d2ec5371ae987ff87f60ee6b0143eb66dd5dc1e7 ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
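The "check last modified time on each request" idea referenced here can be sketched as a small staleness check: compare the DB row's `last_updated` value with the timestamp from the last load, and reload only when the row is newer. The `fetch_last_updated` and `load_from_db` callables below are stand-ins for real DB queries, not Airflow's actual implementation.

```python
class CachedDag:
    """Sketch of request-time cache invalidation: no background thread,
    just a cheap timestamp query before serving the cached DAG."""

    def __init__(self, dag_id, fetch_last_updated, load_from_db):
        self.dag_id = dag_id
        self._fetch_last_updated = fetch_last_updated  # cheap query
        self._load_from_db = load_from_db              # expensive deserialize
        self._loaded_at = None
        self._dag = None

    def get(self):
        last_updated = self._fetch_last_updated(self.dag_id)
        stale = (self._dag is None
                 or self._loaded_at is None
                 or (last_updated is not None and last_updated > self._loaded_at))
        if stale:
            self._dag = self._load_from_db(self.dag_id)
            self._loaded_at = last_updated
        return self._dag
```

The trade-off versus a refresh thread: one extra lightweight query per request, in exchange for no thread lifecycle to manage inside the webserver.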
[GitHub] [airflow] ashb commented on a change in pull request #5944: [AIRFLOW-5362] Reorder imports
ashb commented on a change in pull request #5944: [AIRFLOW-5362] Reorder imports URL: https://github.com/apache/airflow/pull/5944#discussion_r320202786 ## File path: setup.py ## @@ -268,6 +268,7 @@ def write_version(filename: str = os.path.join(*["airflow", "git_version"])): 'click==6.7', 'flake8>=3.6.0', 'flake8-colors', +'flake8-isort', Review comment: > License: GNU General Public License v2 (GPLv2) (GPL version 2) I can't remember what the outcome was on this for other devel only changes. @mik-laj Do you remember? (Somewhat annoyingly isort itself is MIT) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] ashb commented on a change in pull request #5944: [AIRFLOW-5362] Reorder imports
ashb commented on a change in pull request #5944: [AIRFLOW-5362] Reorder imports URL: https://github.com/apache/airflow/pull/5944#discussion_r320204446 ## File path: airflow/_vendor/nvd3/__init__.py ## @@ -16,14 +16,14 @@ 'scatterChart', 'discreteBarChart', 'multiBarChart'] +from . import ipynb +from .cumulativeLineChart import cumulativeLineChart Review comment: Can you exclude anything under airflow/_vendor from changes please? Perhaps `skip=airflow/_vendor` in setup.cfg might do the trick? Should do given `skip = build,.tox,venv` is used in some of isort's examples. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
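For reference, a sketch of the setup.cfg stanza being suggested, assuming isort's `skip` setting accepts a comma-separated list of paths; exact option names and values should be checked against the isort documentation (globs need `skip_glob` instead).

```ini
[isort]
# Illustrative values; only the airflow/_vendor entry reflects the suggestion above.
line_length = 110
skip = build, .tox, venv, airflow/_vendor
```

With this in place, flake8-isort and a bare `isort` run would both leave vendored code untouched.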
[GitHub] [airflow] Fokko edited a comment on issue #5990: [AIRFLOW-5390] Remove provide context
Fokko edited a comment on issue #5990: [AIRFLOW-5390] Remove provide context URL: https://github.com/apache/airflow/pull/5990#issuecomment-52746 Thanks @ashb for thinking along. Appreciate it. I've added the tests to the suite. I think we can fix the one with the op_args by skipping the first `len(op_args)` number of arguments. This would not introduce any change in behavior. Similar to the arguments in the `kwargs`, we could give the keys in `op_kwargs` priority. `kwargs` is actually handled: https://github.com/apache/airflow/blob/master/airflow/operators/python_operator.py#L108-L109 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] Fokko commented on issue #5990: [AIRFLOW-5390] Remove provide context
Fokko commented on issue #5990: [AIRFLOW-5390] Remove provide context URL: https://github.com/apache/airflow/pull/5990#issuecomment-52746 Thanks @ashb for thinking along. Appreciate it. I've added the tests to the suite. I think we can fix the one with the op_args by skipping the first `len(op_args)` number of arguments. This would not introduce any change in behavior. Similar to the arguments in the `kwargs`, we could give the keys in `op_kwargs` priority. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-5365) When switching between branches (master/v1-10-test) rebuild of image should not be needed for pre-commits
[ https://issues.apache.org/jira/browse/AIRFLOW-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921310#comment-16921310 ] ASF GitHub Bot commented on AIRFLOW-5365: - ashb commented on pull request #5972: [AIRFLOW-5365] No need to do image rebuild when switching master/v1-1… URL: https://github.com/apache/airflow/pull/5972 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > When switching between branches (master/v1-10-test) rebuild of image should > not be needed for pre-commits > - > > Key: AIRFLOW-5365 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5365 > Project: Apache Airflow > Issue Type: Improvement > Components: ci >Affects Versions: 2.0.0, 1.10.5 >Reporter: Jarek Potiuk >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (AIRFLOW-5365) When switching between branches (master/v1-10-test) rebuild of image should not be needed for pre-commits
[ https://issues.apache.org/jira/browse/AIRFLOW-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921311#comment-16921311 ] ASF subversion and git services commented on AIRFLOW-5365: -- Commit 319b80437cf7e58d0ceecf9a58e336e14936b163 in airflow's branch refs/heads/master from Jarek Potiuk [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=319b804 ] [AIRFLOW-5365] No need to do image rebuild when switching master/v1-10-test (#5972) > When switching between branches (master/v1-10-test) rebuild of image should > not be needed for pre-commits > - > > Key: AIRFLOW-5365 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5365 > Project: Apache Airflow > Issue Type: Improvement > Components: ci >Affects Versions: 2.0.0, 1.10.5 >Reporter: Jarek Potiuk >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[GitHub] [airflow] ashb merged pull request #5972: [AIRFLOW-5365] No need to do image rebuild when switching master/v1-1…
ashb merged pull request #5972: [AIRFLOW-5365] No need to do image rebuild when switching master/v1-1… URL: https://github.com/apache/airflow/pull/5972 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] ashb commented on a change in pull request #5979: [AIRFLOW-5373] Super fast pre-commit check for basic python2 compatib…
ashb commented on a change in pull request #5979: [AIRFLOW-5373] Super fast pre-commit check for basic python2 compatib… URL: https://github.com/apache/airflow/pull/5979#discussion_r320197482 ## File path: airflow/contrib/example_dags/example_qubole_operator.py ## @@ -198,7 +198,7 @@ def compare_result(ds, **kwargs): /** Computes an approximation to pi */ object SparkPi { - def main(args: Array[String]) { + def main(args) { Review comment: This one is a false positive -- this is Scala code :D This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
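The false positive and ashb's "don't allow `) {` on the line" suggestion can be sketched with regexes. This is purely illustrative: it assumes the hook greps for annotation-like `name: Type` inside `def` signatures (a python3-only syntax the check would flag); the actual pre-commit hook's pattern may differ.

```python
import re

# A naive pattern flagging "def name(arg: Type" also matches Scala's
# "def main(args: Array[String]) {" embedded in a Python string.
NAIVE = re.compile(r"def \w+\([^)]*\w+: *\w")

# Stricter variant: additionally require that the line does NOT end with ") {",
# which filters out Scala-style method bodies opened on the same line.
STRICTER = re.compile(r"def \w+\([^)]*\w+: *\w[^\n]*(?<!\) \{)$", re.MULTILINE)

scala = "  def main(args: Array[String]) {"
py3 = "def compare_result(ds: str, **kwargs):"

assert NAIVE.search(scala) is not None      # the false positive from the diff
assert STRICTER.search(py3) is not None     # real python3-only syntax still caught
assert STRICTER.search(scala) is None       # Scala line filtered out
```

The alternative from the review thread, a pre-commit `exclude` regex on the whole file, is simpler but broader, as ashb notes.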
[GitHub] [airflow] ssoto opened a new pull request #5991: Adds two companies to using Airflow section
ssoto opened a new pull request #5991: Adds two companies to using Airflow section URL: https://github.com/apache/airflow/pull/5991 https://issues.apache.org/jira/browse/AIRFLOW-5392 ### Description Adds two new companies that are using Apache Airflow ### Tests OK ### Commits Ok ### Documentation Ok ### Code Quality - [x] Passes `flake8` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (AIRFLOW-5392) README.md update
Sergio Soto created AIRFLOW-5392: Summary: README.md update Key: AIRFLOW-5392 URL: https://issues.apache.org/jira/browse/AIRFLOW-5392 Project: Apache Airflow Issue Type: Improvement Components: documentation Affects Versions: 1.10.4 Reporter: Sergio Soto Assignee: Sergio Soto Add [Logitravel Group|https://www.logitravel.com/] and [Bluekiri|https://bluekiri.com/] to the companies that use Apache Airflow -- This message was sent by Atlassian Jira (v8.3.2#803003)
[GitHub] [airflow] ashb commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability
ashb commented on a change in pull request #5743: [AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability URL: https://github.com/apache/airflow/pull/5743#discussion_r320186899 ## File path: airflow/models/serialized_dag.py ## @@ -0,0 +1,155 @@

+# -*- coding: utf-8 -*-
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Serialized DAG table in database."""
+
+import hashlib
+from typing import Any, Dict, List, Optional, TYPE_CHECKING
+
+from sqlalchemy import Column, Index, Integer, String, Text, and_
+from sqlalchemy.sql import exists
+
+from airflow.models.base import Base, ID_LEN
+from airflow.utils import db, timezone
+from airflow.utils.sqlalchemy import UtcDateTime
+
+
+if TYPE_CHECKING:
+    from airflow.dag.serialization.serialized_dag import SerializedDAG  # noqa: F401, E501; # pylint: disable=cyclic-import
+    from airflow.models import DAG  # noqa: F401; # pylint: disable=cyclic-import
+
+
+class SerializedDagModel(Base):
+    """A table for serialized DAGs.
+
+    serialized_dag table is a snapshot of DAG files synchronized by scheduler.
+    This feature is controlled by:
+    [core] dagcached = False: enable this feature
+    [core] dagcached_min_update_interval = 30 (s):
+        serialized DAGs are updated in DB when a file gets processed by scheduler,
+        to reduce DB write rate, there is a minimal interval of updating serialized DAGs.
+    [scheduler] dag_dir_list_interval = 300 (s):
+        interval of deleting serialized DAGs in DB when the files are deleted, suggest
+        to use a smaller interval such as 60
+
+    It is used by webserver to load dagbags when dagcached=True. Because reading from
+    database is lightweight compared to importing from files, it solves the webserver
+    scalability issue.
+    """
+    __tablename__ = 'serialized_dag'
+
+    dag_id = Column(String(ID_LEN), primary_key=True)
+    fileloc = Column(String(2000))
+    # The max length of fileloc exceeds the limit of indexing.
+    fileloc_hash = Column(Integer)
+    data = Column(Text)
+    last_updated = Column(UtcDateTime)
+
+    __table_args__ = (
+        Index('idx_fileloc_hash', fileloc_hash, unique=False),
+    )
+
+    def __init__(self, dag):
+        from airflow.dag.serialization import Serialization
+
+        self.dag_id = dag.dag_id
+        self.fileloc = dag.full_filepath
+        self.fileloc_hash = SerializedDagModel.dag_fileloc_hash(self.fileloc)
+        self.data = Serialization.to_json(dag)
+        self.last_updated = timezone.utcnow()
+
+    @staticmethod
+    def dag_fileloc_hash(full_filepath: str) -> int:
+        """Hashing file location for indexing.
+
+        :param full_filepath: full filepath of DAG file
+        :return: hashed full_filepath
+        """
+        # hashing is needed because the length of fileloc is 2000 as an Airflow convention,
+        # which is over the limit of indexing. If we can reduce the length of fileloc, then
+        # hashing is not needed.
+        return int(0x & int(

Review comment: Sounds good, yes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
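The `dag_fileloc_hash` idea in the quoted diff can be illustrated with a small sketch. Note the mask constant is garbled in the quoted diff (`int(0x & int(`), so the 32-bit width and the SHA-1 choice below are assumptions for illustration, not a claim about the merged code.

```python
import hashlib


def dag_fileloc_hash(full_filepath: str) -> int:
    """Sketch of the indexing trick discussed above: fileloc may be up to
    2000 chars (too long to index), so store and index a small integer
    digest of it instead. Mask width (32 bits here) is an assumption."""
    digest = int(hashlib.sha1(full_filepath.encode("utf-8")).hexdigest(), 16)
    return 0xFFFFFFFF & digest


h = dag_fileloc_hash("/opt/airflow/dags/example.py")
```

Because the hash is lossy, lookups must still compare the full `fileloc` after filtering by `fileloc_hash`, which is why the column is a non-unique index.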
[GitHub] [airflow] ashb commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI
ashb commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI URL: https://github.com/apache/airflow/pull/5975#discussion_r320185972 ## File path: airflow/gcp/example_dags/example_vision.py ## @@ -52,6 +52,7 @@ import airflow from airflow import models +from airflow.contrib.sensors.aws_glue_catalog_partition_sensor import AwsGlueCatalogPartitionSensor Review comment: Unrelated change? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] ashb commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI
ashb commented on a change in pull request #5975: [AIRFLOW-5368] Display DAG from the CLI URL: https://github.com/apache/airflow/pull/5975#discussion_r320185707 ## File path: airflow/bin/cli.py ## @@ -441,6 +442,33 @@ def set_is_paused(is_paused, args): print("Dag: {}, paused: {}".format(args.dag_id, str(is_paused)))

+def show_dag(args):
+    dag = get_dag(args)
+    dot = render_dag(dag)
+    if args.save:
+        filename, _, fileformat = args.save.rpartition('.')
+        dot.render(filename=filename, format=fileformat, cleanup=True)
+        print("File {} saved".format(args.save))
+    elif args.imgcat:
+        data = dot.pipe(format='png')
+        try:
+            proc = subprocess.Popen("imgcat", stdout=subprocess.PIPE, stdin=subprocess.PIPE)
+        except OSError as e:
+            if e.errno == errno.ENOENT:
+                raise AirflowException(
+                    "Failed to execute. Make sure the imgcat executables are on your systems \'PATH\'"
+                )
+            else:
+                raise
+        out, err = proc.communicate(data)
+        if out:
+            print(out.decode('utf-8'))
+        if err:
+            print(err.decode('utf-8'))
+    else:
+        print(dot.source)

Review comment: Given this also prints logging to stdout from airflow startup it might be worth adding a way to save the dot contents to a file. (Unless `dot.render(format="dot")` already works for that?) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [airflow] Fokko commented on a change in pull request #5990: [AIRFLOW-5390] Remove provide context
Fokko commented on a change in pull request #5990: [AIRFLOW-5390] Remove provide context
URL: https://github.com/apache/airflow/pull/5990#discussion_r320184548

## File path: airflow/operators/python_operator.py
## @@ -104,10 +97,21 @@ def execute(self, context):
                                            for k, v in airflow_context_vars.items()]))
         os.environ.update(airflow_context_vars)

-        if self.provide_context:
-            context.update(self.op_kwargs)
-            context['templates_dict'] = self.templates_dict
+        context.update(self.op_kwargs)
+        context['templates_dict'] = self.templates_dict
+
+        if {parameter for name, parameter
+            in signature(self.python_callable).parameters.items()
+            if str(parameter).startswith("**")}:
+            # If there is a **kwargs, **context or **_ then just pass everything.
+            self.op_kwargs = context
+        else:
+            # If there is only for example, an execution_date, then pass only these in :-)
+            self.op_kwargs = {
+                name: context[name] for name, parameter
+                in signature(self.python_callable).parameters.items()
+                if name in context  # If it isn't available on the context, then ignore
+            }

Review comment: 👍
[GitHub] [airflow] Fokko commented on a change in pull request #5990: [AIRFLOW-5390] Remove provide context
Fokko commented on a change in pull request #5990: [AIRFLOW-5390] Remove provide context
URL: https://github.com/apache/airflow/pull/5990#discussion_r320184507

## File path: airflow/operators/python_operator.py
## @@ -104,10 +97,21 @@ def execute(self, context):
                                            for k, v in airflow_context_vars.items()]))
         os.environ.update(airflow_context_vars)

-        if self.provide_context:
-            context.update(self.op_kwargs)
-            context['templates_dict'] = self.templates_dict
+        context.update(self.op_kwargs)
+        context['templates_dict'] = self.templates_dict
+
+        if {parameter for name, parameter
+            in signature(self.python_callable).parameters.items()
+            if str(parameter).startswith("**")}:

Review comment: Love it, thanks!
[GitHub] [airflow] ashb commented on issue #5990: [AIRFLOW-5390] Remove provide context
ashb commented on issue #5990: [AIRFLOW-5390] Remove provide context
URL: https://github.com/apache/airflow/pull/5990#issuecomment-527386507

Another example: what happens if I do:

```python
def fn(dag, **context):
    print(dag)

PythonOperator(
    op_args=[1],
    python_callable=fn
)
```
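The collision this question gestures at can be reproduced with `inspect.Signature.bind` alone — a standalone sketch (no Airflow involved) of what happens when `op_args` fills the `dag` slot positionally and the operator then also injects `dag` as a keyword from the context:

```python
from inspect import signature

def fn(dag, **context):
    return dag

sig = signature(fn)

# op_args=[1] binds positionally: the 1 lands in the 'dag' parameter.
assert sig.bind(1).arguments['dag'] == 1

# If the operator then also passes dag=<the real DAG> as a keyword,
# binding fails with "multiple values for argument 'dag'".
try:
    sig.bind(1, dag='real-dag-object')
    collided = False
except TypeError:
    collided = True
```

So a signature-based implementation has to decide whether positional `op_args` should shadow same-named context keys or raise.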
[GitHub] [airflow] BasPH commented on a change in pull request #5990: [AIRFLOW-5390] Remove provide context
BasPH commented on a change in pull request #5990: [AIRFLOW-5390] Remove provide context
URL: https://github.com/apache/airflow/pull/5990#discussion_r320181214

## File path: airflow/operators/python_operator.py
## @@ -104,10 +97,21 @@ def execute(self, context):
                                            for k, v in airflow_context_vars.items()]))
         os.environ.update(airflow_context_vars)

-        if self.provide_context:
-            context.update(self.op_kwargs)
-            context['templates_dict'] = self.templates_dict
+        context.update(self.op_kwargs)
+        context['templates_dict'] = self.templates_dict
+
+        if {parameter for name, parameter
+            in signature(self.python_callable).parameters.items()
+            if str(parameter).startswith("**")}:

Review comment: Bit shorter: `{param for param in sig.parameters.values() if str(param).startswith("**")}` or: `any(str(param).startswith("**") for param in sig.parameters.values())`
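BasPH's `any(...)` form can also be written against `inspect.Parameter.kind`, which is a little more robust than matching the string form of the parameter. A sketch of the check under discussion, not the PR's actual code:

```python
from inspect import Parameter, signature

def accepts_var_keyword(func):
    # True when the callable declares a **kwargs-style catch-all
    # (whatever it is named: **kwargs, **context, **_).
    return any(p.kind == Parameter.VAR_KEYWORD
               for p in signature(func).parameters.values())
```

Comparing `kind` against the `VAR_KEYWORD` enum asks the question directly instead of relying on how `Parameter.__str__` renders the `**` prefix.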
[GitHub] [airflow] BasPH commented on a change in pull request #5990: [AIRFLOW-5390] Remove provide context
BasPH commented on a change in pull request #5990: [AIRFLOW-5390] Remove provide context
URL: https://github.com/apache/airflow/pull/5990#discussion_r320182021

## File path: airflow/operators/python_operator.py
## @@ -104,10 +97,21 @@ def execute(self, context):
                                            for k, v in airflow_context_vars.items()]))
         os.environ.update(airflow_context_vars)

-        if self.provide_context:
-            context.update(self.op_kwargs)
-            context['templates_dict'] = self.templates_dict
+        context.update(self.op_kwargs)
+        context['templates_dict'] = self.templates_dict
+
+        if {parameter for name, parameter
+            in signature(self.python_callable).parameters.items()
+            if str(parameter).startswith("**")}:
+            # If there is a **kwargs, **context or **_ then just pass everything.
+            self.op_kwargs = context
+        else:
+            # If there is only for example, an execution_date, then pass only these in :-)
+            self.op_kwargs = {
+                name: context[name] for name, parameter
+                in signature(self.python_callable).parameters.items()
+                if name in context  # If it isn't available on the context, then ignore
+            }

Review comment: Same here: `name: context[name] for name in signature(self.python_callable).parameters.keys() if name in context`
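The dict comprehension BasPH suggests boils down to intersecting the callable's parameter names with the context keys. A self-contained sketch (the helper name is illustrative):

```python
from inspect import signature

def select_kwargs(func, context):
    # Keep only the context entries the callable explicitly names as
    # parameters; anything it does not ask for is dropped silently.
    return {name: context[name]
            for name in signature(func).parameters
            if name in context}
```

Iterating `signature(func).parameters` directly yields the parameter names, so the explicit `.keys()` call in the suggestion is optional.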
[GitHub] [airflow] Fokko commented on issue #5967: Enable authentication for druid query hook.
Fokko commented on issue #5967: Enable authentication for druid query hook. URL: https://github.com/apache/airflow/pull/5967#issuecomment-527383916 @scrawfor Can you follow the process and create a ticket in Jira? This helps us keep track of the releases.
[GitHub] [airflow] Fokko merged pull request #5966: [AIRFLOW-5360] Type annotations for BaseSensorOperator
Fokko merged pull request #5966: [AIRFLOW-5360] Type annotations for BaseSensorOperator URL: https://github.com/apache/airflow/pull/5966
[jira] [Commented] (AIRFLOW-5360) Type annotations for BaseSensorOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921292#comment-16921292 ] ASF subversion and git services commented on AIRFLOW-5360: -- Commit f46b54a10ec35485e7f507e308b9b36232168170 in airflow's branch refs/heads/master from TobKed [ https://gitbox.apache.org/repos/asf?p=airflow.git;h=f46b54a ] [AIRFLOW-5360] Type annotations for BaseSensorOperator (#5966) > Type annotations for BaseSensorOperator > --- > > Key: AIRFLOW-5360 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5360 > Project: Apache Airflow > Issue Type: Improvement > Components: core >Affects Versions: 1.10.4 >Reporter: Tobiasz Kedzierski >Assignee: Tobiasz Kedzierski >Priority: Trivial > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (AIRFLOW-5360) Type annotations for BaseSensorOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921291#comment-16921291 ] ASF GitHub Bot commented on AIRFLOW-5360: - Fokko commented on pull request #5966: [AIRFLOW-5360] Type annotations for BaseSensorOperator URL: https://github.com/apache/airflow/pull/5966 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Type annotations for BaseSensorOperator > --- > > Key: AIRFLOW-5360 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5360 > Project: Apache Airflow > Issue Type: Improvement > Components: core >Affects Versions: 1.10.4 >Reporter: Tobiasz Kedzierski >Assignee: Tobiasz Kedzierski >Priority: Trivial > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (AIRFLOW-5391) Clearing a task skipped by BranchPythonOperator will cause the task to execute
Qian Yu created AIRFLOW-5391: Summary: Clearing a task skipped by BranchPythonOperator will cause the task to execute Key: AIRFLOW-5391 URL: https://issues.apache.org/jira/browse/AIRFLOW-5391 Project: Apache Airflow Issue Type: Bug Components: operators Affects Versions: 1.10.4 Reporter: Qian Yu

I tried this on 1.10.3 and 1.10.4; both have this issue. E.g. in this example from the doc, branch_a was executed and branch_false was skipped because of the branching condition. However, if someone clears branch_false, it will execute. !https://airflow.apache.org/_images/branch_good.png!

This behaviour is understandable given how BranchPythonOperator is implemented. BranchPythonOperator does not store its decision anywhere; it skips its own downstream tasks in the branch at runtime. So there is currently no way for branch_false to know it should be skipped without rerunning the branching task. This is obviously counter-intuitive from the user's perspective: users would not expect branch_false to execute when they clear it, because the branching task should have skipped it.

There are a few ways to improve this:

Option 1): Make downstream tasks skipped by BranchPythonOperator not clearable without also clearing the upstream BranchPythonOperator. In this example, if someone clears branch_false without clearing branching, branch_false should not execute.

Option 2): Make BranchPythonOperator store the result of its skip condition somewhere, and make downstream tasks check for this stored decision and skip themselves if they should have been skipped by the condition. This probably means the decision of BranchPythonOperator needs to be stored in the db.

[kevcampb|https://blog.diffractive.io/author/kevcampb/] attempted a workaround on his blog, and acknowledged that it is not perfect and a better permanent fix is needed: [https://blog.diffractive.io/2018/08/07/replacement-shortcircuitoperator-for-airflow/]

-- This message was sent by Atlassian Jira (v8.3.2#803003)
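Option 2 from the ticket can be sketched in plain Python: persist the branching decision in some store (XCom or the metadata db in a real implementation) and have a cleared downstream task consult it before running. All names here are illustrative, not an existing Airflow API:

```python
def record_branch_choice(store, branch_task_id, chosen_task_id):
    # The BranchPythonOperator would call this when it decides which branch runs.
    store[("branch_choice", branch_task_id)] = chosen_task_id

def should_skip(store, branch_task_id, my_task_id):
    # A cleared downstream task re-checks the stored decision: skip unless the
    # branch explicitly chose this task. No decision recorded means run,
    # which preserves behaviour for tasks that were never branched over.
    chosen = store.get(("branch_choice", branch_task_id))
    return chosen is not None and chosen != my_task_id
```

With this, clearing branch_false after branching picked branch_a would re-skip it instead of executing it, which is the behaviour the ticket asks for.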
[jira] [Commented] (AIRFLOW-5390) Remove provide_context
[ https://issues.apache.org/jira/browse/AIRFLOW-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921232#comment-16921232 ] ASF GitHub Bot commented on AIRFLOW-5390: - Fokko commented on pull request #5990: [AIRFLOW-5390] Remove provide context URL: https://github.com/apache/airflow/pull/5990 Make sure you have checked _all_ steps below. I'm giving Apache Airflow training across Europe, and at every workshop that I provide, I find it a bit awkward to introduce the idea of the provide_context. Instead, I want to remove this thing and make it infer the variables automagically based on the signature of the Python callable. Less is more :-) ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-5390\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-5390 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. - In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)). - In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. 
Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - All the public functions and the classes in the PR contain docstrings that explain what it does - If you implement backwards incompatible changes, please leave a note in the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so we can assign it to an appropriate release ### Code Quality - [ ] Passes `flake8` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Remove provide_context > -- > > Key: AIRFLOW-5390 > URL: https://issues.apache.org/jira/browse/AIRFLOW-5390 > Project: Apache Airflow > Issue Type: Task > Components: core >Affects Versions: 1.10.4 >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong >Priority: Major > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[GitHub] [airflow] Fokko opened a new pull request #5990: [AIRFLOW-5390] Remove provide context
Fokko opened a new pull request #5990: [AIRFLOW-5390] Remove provide context URL: https://github.com/apache/airflow/pull/5990 Make sure you have checked _all_ steps below. I'm giving Apache Airflow training across Europe, and at every workshop that I provide, I find it a bit awkward to introduce the idea of the provide_context. Instead, I want to remove this thing and make it infer the variables automagically based on the signature of the Python callable. Less is more :-) ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-5390\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-5390 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. - In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)). - In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x). ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. 
Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - All the public functions and the classes in the PR contain docstrings that explain what it does - If you implement backwards incompatible changes, please leave a note in the [Updating.md](https://github.com/apache/airflow/blob/master/UPDATING.md) so we can assign it to an appropriate release ### Code Quality - [ ] Passes `flake8`
[jira] [Created] (AIRFLOW-5390) Remove provide_context
Fokko Driesprong created AIRFLOW-5390: - Summary: Remove provide_context Key: AIRFLOW-5390 URL: https://issues.apache.org/jira/browse/AIRFLOW-5390 Project: Apache Airflow Issue Type: Task Components: core Affects Versions: 1.10.4 Reporter: Fokko Driesprong Assignee: Fokko Driesprong Fix For: 2.0.0 -- This message was sent by Atlassian Jira (v8.3.2#803003)
[GitHub] [airflow] TobKed commented on issue #5980: [AIRFLOW-5129] Add typehint to GCP DLP hook
TobKed commented on issue #5980: [AIRFLOW-5129] Add typehint to GCP DLP hook URL: https://github.com/apache/airflow/pull/5980#issuecomment-527341514 cc @mik-laj