[jira] [Commented] (AIRFLOW-3118) DAGs not successful on new installation
[ https://issues.apache.org/jira/browse/AIRFLOW-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698576#comment-16698576 ] Iuliia Volkova commented on AIRFLOW-3118: - https://issues.apache.org/jira/browse/AIRFLOW-1561 - the fix was merged. [~huyanhvn], [~ashb], it would be good to close this task.
> DAGs not successful on new installation
> ---
> Key: AIRFLOW-3118
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3118
> Project: Apache Airflow
> Issue Type: Bug
> Components: DAG
> Affects Versions: 1.10.0
> Environment: Ubuntu 18.04, Python 3.6
> Reporter: Brylie Christopher Oxley
> Assignee: Huy Nguyen
> Priority: Blocker
> Fix For: 1.10.2
> Attachments: Screenshot_20180926_161837.png, image-2018-09-26-12-39-03-094.png
>
> When trying out Airflow on localhost, none of the DAG runs reach the 'success' state. They get stuck in 'running', or I manually label them as failed:
> !image-2018-09-26-12-39-03-094.png!
> h2. Steps to reproduce
> # create a new conda environment
> ** conda create -n airflow
> ** source activate airflow
> # install airflow
> ** pip install apache-airflow
> # initialize the Airflow db
> ** airflow initdb
> # disable the default paused setting in airflow.cfg
> ** dags_are_paused_at_creation = False
> # run the airflow webserver and scheduler (in separate terminals)
> ** airflow scheduler
> ** airflow webserver
> # unpause example_bash_operator
> ** airflow unpause example_bash_operator
> # log in to the Airflow UI
> # turn on example_bash_operator
> # click "Trigger DAG" in the `example_bash_operator` row
> h2. Observed result
> The `example_bash_operator` never leaves the "running" state.
> h2. Expected result
> The `example_bash_operator` would quickly enter the "success" state.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1934) Unable to Launch Example DAG if ~/AIRFLOW_HOME/dags folder is empty
[ https://issues.apache.org/jira/browse/AIRFLOW-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698575#comment-16698575 ] Iuliia Volkova commented on AIRFLOW-1934: - https://issues.apache.org/jira/browse/AIRFLOW-1561 - the fix was merged. [~ramandumcs], [~ashb], it would be good to close this task.
> Unable to Launch Example DAG if ~/AIRFLOW_HOME/dags folder is empty
> ---
> Key: AIRFLOW-1934
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1934
> Project: Apache Airflow
> Issue Type: Bug
> Components: DAG
> Affects Versions: 1.8.0
> Environment: RHEL
> Reporter: raman
> Priority: Major
>
> Steps to reproduce:
> 1. Install airflow
> 2. Keep the ~/{airflow_home}/dags folder empty
> 3. airflow initdb
> 4. airflow webserver and scheduler
> 5. Enable an example DAG and trigger it manually from the web UI.
> Result: a DAG run gets created in the dag_run table, and the task_instance table also gets the relevant entries, but the scheduler does not pick up the DAG.
> Workaround: create one sample dag in the ~/{airflow_home}/dags folder and the scheduler picks it up.
> The following code in jobs.py seems to be doing the trick, but it is only triggered if there is a dag inside the ~/{airflow_home}/dags folder.
> File: jobs.py
> Function: _find_executable_task_instances
> ti_query = (
>     session
>     .query(TI)
>     .filter(TI.dag_id.in_(simple_dag_bag.dag_ids))
>     .outerjoin(DR,
>         and_(DR.dag_id == TI.dag_id,
>              DR.execution_date == TI.execution_date))
>     .filter(or_(DR.run_id == None,
>             not_(DR.run_id.like(BackfillJob.ID_PREFIX + '%'))))
>     .outerjoin(DM, DM.dag_id == TI.dag_id)
>     .filter(or_(DM.dag_id == None,
>             not_(DM.is_paused)))
> )
[jira] [Commented] (AIRFLOW-3357) Scheduler doesn't work on example DAGs unless there's some dag file to process
[ https://issues.apache.org/jira/browse/AIRFLOW-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698574#comment-16698574 ] Iuliia Volkova commented on AIRFLOW-3357: - https://issues.apache.org/jira/browse/AIRFLOW-1561 - the fix was merged. [~villasv], [~ashb], it would be good to close this task.
> Scheduler doesn't work on example DAGs unless there's some dag file to process
> --
> Key: AIRFLOW-3357
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3357
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 1.10.0, 1.10.1
> Reporter: Victor Villas Bôas Chaves
> Priority: Major
>
> On a blank Airflow install, if you try to manually run one of the example DAGs, no tasks get queued or executed. They stay in state null.
> Steps to reproduce on a new airflow:
> # Enter the UI, turn on the example_bash_operator, manually trigger the example_bash_operator; 6 tasks go to the None state, nothing gets scheduled.
> # Reboot the scheduler (with debug logging on). Nothing gets scheduled.
> # Create a mytutorial.py in the dag folder with the code from tutorial.py but the DAG name changed; everything starts getting scheduled.
> A debug view of the logs is here: [https://gist.github.com/tomfaulhaber/4e72ed0884c9580c606e02e4b745ddff]
[GitHub] victornoel commented on issue #4089: adding image_pull_secrets into pod at creation inside operator
victornoel commented on issue #4089: adding image_pull_secrets into pod at creation inside operator URL: https://github.com/apache/incubator-airflow/pull/4089#issuecomment-441544845 @Fokko @Padarn oops yes, sorry, incorrect link :) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] victornoel edited a comment on issue #4089: adding image_pull_secrets into pod at creation inside operator
victornoel edited a comment on issue #4089: adding image_pull_secrets into pod at creation inside operator URL: https://github.com/apache/incubator-airflow/pull/4089#issuecomment-440568831 @Padarn this can be closed; another PR (~~#3251~~ #4188) fixed the issue.
[GitHub] xnuinside commented on issue #4236: [AIRFLOW-3395] Add REST API endpoints to the docs
xnuinside commented on issue #4236: [AIRFLOW-3395] Add REST API endpoints to the docs URL: https://github.com/apache/incubator-airflow/pull/4236#issuecomment-441543219 @Fokko, tests fail with:

Attempting to fetch rat
rm: cannot remove '/tmp/lib/apache-rat-0.12.jar': No such file or directory
Our attempt to download rat locally to /tmp/lib/apache-rat-0.12.jar failed. Please install rat manually.
ERROR: InvocationError for command '/app/scripts/ci/6-check-license.sh' (exited with code 255)
[jira] [Commented] (AIRFLOW-1823) API get_task_info is incompatible with manual runs created by UI
[ https://issues.apache.org/jira/browse/AIRFLOW-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698471#comment-16698471 ] Iuliia Volkova commented on AIRFLOW-1823: - [~ashb], the ticket is no longer relevant for version 1.10.
> API get_task_info is incompatible with manual runs created by UI
>
> Key: AIRFLOW-1823
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1823
> Project: Apache Airflow
> Issue Type: Bug
> Components: api
> Affects Versions: 2.0.0
> Environment: ubuntu, Airflow 1.9rc02
> commit: https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126
> Reporter: Jeremy Lewi
> Assignee: Iuliia Volkova
> Priority: Minor
>
> The API method [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126] doesn't work with manual runs created by the UI.
> The UI creates dag runs whose ids carry sub-second precision in the name. An example of a run created by the UI is 2017-11-16T20:23:32.045330.
> The endpoint for [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126], however, assumes the dag run id is of the form '%Y-%m-%dT%H:%M:%S'.
> Runs triggered via the CLI generate run ids in the form expected by the API.
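The format mismatch described above is easy to demonstrate with the standard library. The run-id strings below come from the report; the two-step parse at the end is just one illustrative way an endpoint could accept both styles, not Airflow's actual fix:

```python
from datetime import datetime

# A run id created via the CLI uses second precision and parses cleanly:
cli_run_date = "2017-11-16T20:23:32"
parsed_cli = datetime.strptime(cli_run_date, "%Y-%m-%dT%H:%M:%S")

# A run id created via the UI carries sub-second precision, so the same
# format string raises ValueError ("unconverted data remains: .045330"):
ui_run_date = "2017-11-16T20:23:32.045330"
try:
    datetime.strptime(ui_run_date, "%Y-%m-%dT%H:%M:%S")
    ui_parse_failed = False
except ValueError:
    ui_parse_failed = True

# Accepting both styles means also trying the microsecond format (%f):
parsed_ui = datetime.strptime(ui_run_date, "%Y-%m-%dT%H:%M:%S.%f")
```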
[jira] [Commented] (AIRFLOW-987) `airflow kerberos` ignores --keytab and --principal arguments
[ https://issues.apache.org/jira/browse/AIRFLOW-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698458#comment-16698458 ] ASF GitHub Bot commented on AIRFLOW-987: xnuinside opened a new pull request #4238: [AIRFLOW-987] pass kerberos cli args keytab and principal to kerberos….run() URL: https://github.com/apache/incubator-airflow/pull/4238

Make sure you have checked _all_ steps below.

### Jira
- [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "[AIRFLOW-987] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-987
  - In case you are fixing a typo in the documentation you can prepend your commit with [AIRFLOW-XXX]; code changes always need a Jira issue.

### Description
- [ ] Here are some details about my PR, including screenshots of any UI changes: default=conf.get('kerberos', 'principal'))

### Tests
- [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

### Commits
- [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  1. Subject is limited to 50 characters (not including Jira issue reference)
  1. Subject does not end with a period
  1. Subject uses the imperative mood ("add", not "adding")
  1. Body wraps at 72 characters
  1. Body explains "what" and "why", not "how"

### Documentation
- [ ] In case of new functionality, my PR adds documentation that describes how to use it.
  - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

### Code Quality
- [ ] Passes `flake8`

> `airflow kerberos` ignores --keytab and --principal arguments
> -
>
> Key: AIRFLOW-987
> URL: https://issues.apache.org/jira/browse/AIRFLOW-987
> Project: Apache Airflow
> Issue Type: Bug
> Components: security
> Affects Versions: 1.8.0
> Environment: 1.8-rc5
> Reporter: Ruslan Dautkhanov
> Assignee: Pratap20
> Priority: Major
> Labels: easyfix, kerberos, security
>
> No matter which arguments I pass to `airflow kerberos`, it always executes as `kinit -r 3600m -k -t airflow.keytab -c /tmp/airflow_krb5_ccache airflow`.
> So it fails with the expected "kinit: Keytab contains no suitable keys for airf...@corp.some.com while getting initial credentials".
> Tried different arguments, -kt and --keytab; here's one of the runs (some lines wrapped for readability):
> {noformat}
> $ airflow kerberos -kt /home/rdautkha/.keytab rdautkha...@corp.some.com
> [2017-03-14 23:50:11,523] {__init__.py:57} INFO - Using executor LocalExecutor
> [2017-03-14 23:50:12,069] {kerberos.py:43} INFO - Reinitting kerberos from keytab:
> kinit -r 3600m -k -t airflow.keytab -c /tmp/airflow_krb5_ccache airflow
> [2017-03-14 23:50:12,080] {kerberos.py:55} ERROR - Couldn't reinit from keytab!
> `kinit' exited with 1.
> kinit: Keytab contains no suitable keys for airf...@corp.some.com while getting initial credentials
> {noformat}
> 1.8-rc5
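A minimal sketch of the intended fix, assuming the CLI-supplied keytab and principal should simply flow into the kinit invocation. The function and argument names here are illustrative and do not mirror Airflow's actual kerberos module:

```python
def build_kinit_command(keytab, principal, renew_until="3600m",
                        ccache="/tmp/airflow_krb5_ccache"):
    """Assemble the kinit command from the values the user passed on the
    command line, rather than always using the configured defaults
    (illustrative sketch only)."""
    return ["kinit", "-r", renew_until, "-k", "-t", keytab, "-c", ccache, principal]

# With the bug, the arguments below would be ignored and the daemon would
# always run the equivalent of:
#   kinit -r 3600m -k -t airflow.keytab -c /tmp/airflow_krb5_ccache airflow
cmd = build_kinit_command("/home/user/.keytab", "user@EXAMPLE.COM")
```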
[GitHub] reubenvanammers commented on a change in pull request #4156: [AIRFLOW-3314] Changed auto inlets feature to work as described
reubenvanammers commented on a change in pull request #4156: [AIRFLOW-3314] Changed auto inlets feature to work as described URL: https://github.com/apache/incubator-airflow/pull/4156#discussion_r236112185

## File path: airflow/lineage/__init__.py ##
@@ -110,26 +114,31 @@ def wrapper(self, context, *args, **kwargs):
                       for i in inlets]
             self.inlets.extend(inlets)
-        if self._inlets['auto']:
-            # dont append twice
-            task_ids = set(self._inlets['task_ids']).symmetric_difference(
-                self.upstream_task_ids
-            )
-            inlets = self.xcom_pull(context,
-                                    task_ids=task_ids,
-                                    dag_id=self.dag_id,
-                                    key=PIPELINE_OUTLETS)
-            inlets = [item for sublist in inlets if sublist for item in sublist]
-            inlets = [DataSet.map_type(i['typeName'])(data=i['attributes'])
-                      for i in inlets]
-            self.inlets.extend(inlets)
-
-        if len(self._inlets['datasets']) > 0:
-            self.inlets.extend(self._inlets['datasets'])
+        if self._inlets["auto"]:
+            visited_task_ids = set(self._inlets["task_ids"])  # prevent double counting of outlets
+            stack = {self.task_id}
+            while stack:
+                task_id = stack.pop()
+                task = self._dag.task_dict[task_id]
+                visited_task_ids.add(task_id)
+                inlets = self.xcom_pull(

Review comment: Hi @bolkedebruin, can you elaborate on your objection? Once the code reaches a node of the DAG and finds outlets, it stops attempting to find further tasks upstream, as the upstream tasks are not placed into the stack. If, for example, a task has multiple (say, 2) upstream tasks, then xcom_pull will be called twice compared to the previous code. However, since xcom_pull fires off a separate database call (using get_one) for each task_id in the task_ids parameter, I don't think this should make an appreciable difference in speed. Regarding using something like topological sort, I feel that would lose too much information; what I regard as the 'expected' behaviour requires discriminating between upstream and sibling nodes, since you want to stop when outlets from a branch are found. Again, since that would do individual database queries, I'm not sure it would be more efficient. I could change it so that it doesn't do a database query on the task, but that's a relatively small change. Thanks for looking over the PR, and tell me if I have misunderstood something.
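The stop-at-first-outlets traversal being discussed can be sketched in isolation. The dict-based DAG and outlet lookup below are simplified stand-ins for Airflow's task and XCom structures, not the PR's actual code:

```python
def collect_upstream_outlets(upstream_ids, start_task_id, outlets_by_task):
    """Walk upstream from start_task_id; once a task on a branch yields
    outlets, collect them and stop exploring further up that branch.
    upstream_ids maps task_id -> set of its direct upstream task_ids."""
    visited = set()
    stack = {start_task_id}
    collected = []
    while stack:
        task_id = stack.pop()
        if task_id in visited:
            continue
        visited.add(task_id)
        # The triggering task contributes no outlets of its own.
        outlets = [] if task_id == start_task_id else outlets_by_task.get(task_id, [])
        if outlets:
            collected.extend(outlets)  # branch satisfied: don't walk further up
        else:
            stack.update(upstream_ids.get(task_id, set()))
    return collected

upstream = {"report": {"transform", "audit"},
            "transform": {"extract"}, "audit": set(), "extract": set()}
outlets = {"transform": ["cleaned_table"], "extract": ["raw_table"]}
# From "report": "transform" yields cleaned_table and blocks the walk to
# "extract", while "audit" yields nothing, so raw_table is never collected.
found = collect_upstream_outlets(upstream, "report", outlets)
```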
[GitHub] dimberman commented on issue #3519: [AIRFLOW-2642] fix wrong value git-sync initcontainer env GIT_SYNC_ROOT
dimberman commented on issue #3519: [AIRFLOW-2642] fix wrong value git-sync initcontainer env GIT_SYNC_ROOT URL: https://github.com/apache/incubator-airflow/pull/3519#issuecomment-441503578 @Fokko this LGTM.
[GitHub] dimberman commented on issue #3792: [AIRFLOW-2947] Added Helm chart for Kubernetes executor
dimberman commented on issue #3792: [AIRFLOW-2947] Added Helm chart for Kubernetes executor URL: https://github.com/apache/incubator-airflow/pull/3792#issuecomment-441503325 @jpds could you please rebase this? Would love to start offering this as a simple solution for testing out the k8sExecutor :).
[GitHub] dimberman commented on issue #4218: [AIRFLOW-3378] KubernetesPodOperator does not delete on timeout failure
dimberman commented on issue #4218: [AIRFLOW-3378] KubernetesPodOperator does not delete on timeout failure URL: https://github.com/apache/incubator-airflow/pull/4218#issuecomment-441502767 Ooo good catch @victornoel! Thank you for fixing this! @Fokko LGTM :).
[GitHub] jmcarp commented on issue #4207: [AIRFLOW-3367] Run celery integration test with redis broker.
jmcarp commented on issue #4207: [AIRFLOW-3367] Run celery integration test with redis broker. URL: https://github.com/apache/incubator-airflow/pull/4207#issuecomment-441499749 Rebased and fixed the conflict with #4090. No rush, but this should be ready for review when you have time @Fokko @ashb. This will help avoid the kind of regression I introduced in #4203.
[GitHub] Padarn closed pull request #4089: adding image_pull_secrets into pod at creation inside operator
Padarn closed pull request #4089: adding image_pull_secrets into pod at creation inside operator URL: https://github.com/apache/incubator-airflow/pull/4089

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/contrib/operators/kubernetes_pod_operator.py b/airflow/contrib/operators/kubernetes_pod_operator.py
index d4f1013876..0f250b0963 100644
--- a/airflow/contrib/operators/kubernetes_pod_operator.py
+++ b/airflow/contrib/operators/kubernetes_pod_operator.py
@@ -106,6 +106,7 @@ def execute(self, context):
         pod.service_account_name = self.service_account_name
         pod.secrets = self.secrets
+        pod.image_pull_secrets = self.image_pull_secrets
         pod.envs = self.env_vars
         pod.image_pull_policy = self.image_pull_policy
         pod.annotations = self.annotations
[GitHub] Padarn commented on issue #4089: adding image_pull_secrets into pod at creation inside operator
Padarn commented on issue #4089: adding image_pull_secrets into pod at creation inside operator URL: https://github.com/apache/incubator-airflow/pull/4089#issuecomment-441494562 Sorry I didn't have time to get to this. @Fokko yes, I believe the linked ticket is different, but there was a PR that fixed this: https://github.com/apache/incubator-airflow/pull/4188. Closing this PR.
[GitHub] KevinYang21 commented on issue #4235: [WIP][AIRFLOW-3392] Add index on dag_id in sla_miss table
KevinYang21 commented on issue #4235: [WIP][AIRFLOW-3392] Add index on dag_id in sla_miss table URL: https://github.com/apache/incubator-airflow/pull/4235#issuecomment-441479531 @Fokko Thanks for taking a look! I was retrying the CI and had not looked into the actual failures, as it has been pretty flaky over the past two days. I've updated the style and will later add a proper description (with performance improvement measurements) and remove the WIP from the title.
[GitHub] Fokko commented on issue #4235: [WIP][AIRFLOW-3392] Add index on dag_id in sla_miss table
Fokko commented on issue #4235: [WIP][AIRFLOW-3392] Add index on dag_id in sla_miss table URL: https://github.com/apache/incubator-airflow/pull/4235#issuecomment-441479042
```
flake8 runtests: commands[0] | flake8
./airflow/migrations/versions/03bc53e68815_add_sm_dag_index.py:9:2: W291 trailing whitespace
./airflow/migrations/versions/03bc53e68815_add_sm_dag_index.py:11:2: W291 trailing whitespace
./airflow/migrations/versions/03bc53e68815_add_sm_dag_index.py:33:1: E402 module level import not at top of file
```
[jira] [Updated] (AIRFLOW-3366) AirflowConfigParser getsection() doesn't pick environment overrides
[ https://issues.apache.org/jira/browse/AIRFLOW-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong updated AIRFLOW-3366: -- Fix Version/s: 2.0.0
> AirflowConfigParser getsection() doesn't pick environment overrides
> ---
> Key: AIRFLOW-3366
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3366
> Project: Apache Airflow
> Issue Type: Bug
> Components: configuration
> Affects Versions: 1.10.0
> Reporter: Victor Villas Bôas Chaves
> Priority: Major
> Fix For: 2.0.0
[GitHub] Fokko commented on issue #4218: [AIRFLOW-3378] KubernetesPodOperator does not delete on timeout failure
Fokko commented on issue #4218: [AIRFLOW-3378] KubernetesPodOperator does not delete on timeout failure URL: https://github.com/apache/incubator-airflow/pull/4218#issuecomment-441478046 @dimberman PTAL :-)
[GitHub] Fokko commented on issue #4089: adding image_pull_secrets into pod at creation inside operator
Fokko commented on issue #4089: adding image_pull_secrets into pod at creation inside operator URL: https://github.com/apache/incubator-airflow/pull/4089#issuecomment-441477991 @victornoel are you sure? It looks unrelated.
[jira] [Commented] (AIRFLOW-3365) Missing visibility_timeout will overwrite celery broker transport options
[ https://issues.apache.org/jira/browse/AIRFLOW-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698336#comment-16698336 ] ASF GitHub Bot commented on AIRFLOW-3365: - Fokko closed pull request #4211: [AIRFLOW-3365][AIRFLOW-3366] Allow celery_broker_transport_options to be set with environment variables URL: https://github.com/apache/incubator-airflow/pull/4211

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/airflow/config_templates/default_celery.py b/airflow/config_templates/default_celery.py
index d44f2b3448..5e72134de1 100644
--- a/airflow/config_templates/default_celery.py
+++ b/airflow/config_templates/default_celery.py
@@ -37,7 +37,7 @@ def _broker_supports_visibility_timeout(url):
 )
 if 'visibility_timeout' not in broker_transport_options:
     if _broker_supports_visibility_timeout(broker_url):
-        broker_transport_options = {'visibility_timeout': 21600}
+        broker_transport_options['visibility_timeout'] = 21600
 DEFAULT_CELERY_CONFIG = {
     'accept_content': ['json', 'pickle'],
diff --git a/airflow/configuration.py b/airflow/configuration.py
index d07faf1cf8..3662df8d06 100644
--- a/airflow/configuration.py
+++ b/airflow/configuration.py
@@ -324,6 +324,12 @@ def getsection(self, section):
         if section in self._sections:
             _section.update(copy.deepcopy(self._sections[section]))
+        section_prefix = 'AIRFLOW__{S}__'.format(S=section.upper())
+        for env_var in sorted(os.environ.keys()):
+            if env_var.startswith(section_prefix):
+                key = env_var.replace(section_prefix, '').lower()
+                _section[key] = self._get_env_var_option(section, key)
+
         for key, val in iteritems(_section):
             try:
                 val = int(val)
diff --git a/tests/test_configuration.py b/tests/test_configuration.py
index 09284c9972..acebd5732c 100644
--- a/tests/test_configuration.py
+++ b/tests/test_configuration.py
@@ -165,7 +165,7 @@ def test_getsection(self):
 key1 = awesome
 key2 = airflow
-[another]
+[testsection]
 key3 = value3
 '''
 test_conf = AirflowConfigParser(
@@ -177,18 +177,18 @@ def test_getsection(self):
         test_conf.getsection('test')
     )
     self.assertEqual(
-        OrderedDict([('key3', 'value3')]),
-        test_conf.getsection('another')
+        OrderedDict([
+            ('key3', 'value3'),
+            ('testkey', 'testvalue'),
+            ('testpercent', 'with%percent')]),
+        test_conf.getsection('testsection')
     )

 def test_broker_transport_options(self):
     section_dict = conf.getsection("celery_broker_transport_options")
     self.assertTrue(isinstance(section_dict['visibility_timeout'], int))
-    self.assertTrue(isinstance(section_dict['_test_only_bool'], bool))
-    self.assertTrue(isinstance(section_dict['_test_only_float'], float))
-    self.assertTrue(isinstance(section_dict['_test_only_string'], six.string_types))

 def test_deprecated_options(self):

> Missing visibility_timeout will overwrite celery broker transport options
> -
>
> Key: AIRFLOW-3365
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3365
> Project: Apache Airflow
> Issue Type: Bug
> Components: configuration
> Affects Versions: 1.10.0
> Reporter: Victor Villas Bôas Chaves
> Priority: Major
>
> If you specify the region in your airflow.cfg like this:
> {{[celery_broker_transport_options]}}
> {{region = us-east-2}}
> it gets ignored.
> This is ignored because Airflow checks whether the {{visibility_timeout}} parameter is set in the transport options and, if it's not set, replaces the transport options completely with {{{'visibility_timeout': 21600}}} (at [https://github.com/apache/incubator-airflow/blob/master/airflow/config_templates/default_celery.py#L40]).
> If you add {{visibility_timeout}}, the region is correctly set:
> {{[celery_broker_transport_options]}}
> {{region = us-east-2}}
> {{visibility_timeout = 21600}}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
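The one-character nature of the fix is worth spelling out: rebinding the name replaces the whole dict (discarding options like `region`), while subscript assignment preserves them. A standalone illustration, not Airflow's actual code:

```python
def apply_default_buggy(broker_transport_options):
    # Rebinding the name throws away everything already in the dict,
    # including e.g. a configured 'region'.
    if 'visibility_timeout' not in broker_transport_options:
        broker_transport_options = {'visibility_timeout': 21600}
    return broker_transport_options

def apply_default_fixed(broker_transport_options):
    # Assigning the key keeps the other options intact.
    if 'visibility_timeout' not in broker_transport_options:
        broker_transport_options['visibility_timeout'] = 21600
    return broker_transport_options

opts = {'region': 'us-east-2'}
buggy = apply_default_buggy(dict(opts))  # {'visibility_timeout': 21600} -- region lost
fixed = apply_default_fixed(dict(opts))  # region preserved alongside the default
```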
[jira] [Resolved] (AIRFLOW-3366) AirflowConfigParser getsection() doesn't pick environment overrides
[ https://issues.apache.org/jira/browse/AIRFLOW-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-3366. --- Resolution: Fixed Assignee: Victor Villas Bôas Chaves
> AirflowConfigParser getsection() doesn't pick environment overrides
> ---
> Key: AIRFLOW-3366
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3366
> Project: Apache Airflow
> Issue Type: Bug
> Components: configuration
> Affects Versions: 1.10.0
> Reporter: Victor Villas Bôas Chaves
> Assignee: Victor Villas Bôas Chaves
> Priority: Major
> Fix For: 2.0.0
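The `getsection()` side of the fix scans the process environment for `AIRFLOW__<SECTION>__<KEY>` variables. The prefix handling can be illustrated with a simplified stand-in; the real method also merges the config file and coerces value types:

```python
import os

def getsection_env_overrides(section, environ=None):
    """Collect AIRFLOW__<SECTION>__<KEY> environment overrides for a
    section (simplified stand-in for AirflowConfigParser.getsection)."""
    environ = os.environ if environ is None else environ
    prefix = 'AIRFLOW__{S}__'.format(S=section.upper())
    overrides = {}
    for env_var in sorted(environ):
        if env_var.startswith(prefix):
            key = env_var[len(prefix):].lower()  # strip prefix, normalise case
            overrides[key] = environ[env_var]
    return overrides

env = {'AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__REGION': 'us-east-2'}
region_override = getsection_env_overrides('celery_broker_transport_options', env)
```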
[jira] [Resolved] (AIRFLOW-3365) Missing visibility_timeout will overwrite celery broker transport options
[ https://issues.apache.org/jira/browse/AIRFLOW-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-3365. --- Resolution: Fixed Assignee: Victor Villas Bôas Chaves Fix Version/s: 2.0.0
> Missing visibility_timeout will overwrite celery broker transport options
> -
> Key: AIRFLOW-3365
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3365
> Project: Apache Airflow
> Issue Type: Bug
> Components: configuration
> Affects Versions: 1.10.0
> Reporter: Victor Villas Bôas Chaves
> Assignee: Victor Villas Bôas Chaves
> Priority: Major
> Fix For: 2.0.0
>
> If you specify the region in your airflow.cfg like this:
> {{[celery_broker_transport_options]}}
> {{region = us-east-2}}
> it gets ignored.
> This is ignored because Airflow checks whether the {{visibility_timeout}} parameter is set in the transport options and, if it's not set, replaces the transport options completely with {{{'visibility_timeout': 21600}}} (at [https://github.com/apache/incubator-airflow/blob/master/airflow/config_templates/default_celery.py#L40]).
> If you add {{visibility_timeout}}, the region is correctly set:
> {{[celery_broker_transport_options]}}
> {{region = us-east-2}}
> {{visibility_timeout = 21600}}
[GitHub] Fokko closed pull request #4211: [AIRFLOW-3365][AIRFLOW-3366] Allow celery_broker_transport_options to be set with environment variables
Fokko closed pull request #4211: [AIRFLOW-3365][AIRFLOW-3366] Allow celery_broker_transport_options to be set with environment variables URL: https://github.com/apache/incubator-airflow/pull/4211 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/config_templates/default_celery.py b/airflow/config_templates/default_celery.py index d44f2b3448..5e72134de1 100644 --- a/airflow/config_templates/default_celery.py +++ b/airflow/config_templates/default_celery.py @@ -37,7 +37,7 @@ def _broker_supports_visibility_timeout(url): ) if 'visibility_timeout' not in broker_transport_options: if _broker_supports_visibility_timeout(broker_url): -broker_transport_options = {'visibility_timeout': 21600} +broker_transport_options['visibility_timeout'] = 21600 DEFAULT_CELERY_CONFIG = { 'accept_content': ['json', 'pickle'], diff --git a/airflow/configuration.py b/airflow/configuration.py index d07faf1cf8..3662df8d06 100644 --- a/airflow/configuration.py +++ b/airflow/configuration.py @@ -324,6 +324,12 @@ def getsection(self, section): if section in self._sections: _section.update(copy.deepcopy(self._sections[section])) +section_prefix = 'AIRFLOW__{S}__'.format(S=section.upper()) +for env_var in sorted(os.environ.keys()): +if env_var.startswith(section_prefix): +key = env_var.replace(section_prefix, '').lower() +_section[key] = self._get_env_var_option(section, key) + for key, val in iteritems(_section): try: val = int(val) diff --git a/tests/test_configuration.py b/tests/test_configuration.py index 09284c9972..acebd5732c 100644 --- a/tests/test_configuration.py +++ b/tests/test_configuration.py @@ -165,7 +165,7 @@ def test_getsection(self): key1 = awesome key2 = airflow -[another] +[testsection] key3 = value3 ''' test_conf = AirflowConfigParser( @@ 
-177,18 +177,18 @@ def test_getsection(self): test_conf.getsection('test') ) self.assertEqual( -OrderedDict([('key3', 'value3')]), -test_conf.getsection('another') +OrderedDict([ +('key3', 'value3'), +('testkey', 'testvalue'), +('testpercent', 'with%percent')]), +test_conf.getsection('testsection') ) def test_broker_transport_options(self): section_dict = conf.getsection("celery_broker_transport_options") self.assertTrue(isinstance(section_dict['visibility_timeout'], int)) - self.assertTrue(isinstance(section_dict['_test_only_bool'], bool)) - self.assertTrue(isinstance(section_dict['_test_only_float'], float)) - self.assertTrue(isinstance(section_dict['_test_only_string'], six.string_types)) def test_deprecated_options(self): This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
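The `getsection()` change above relies on Airflow's `AIRFLOW__<SECTION>__<KEY>` environment-variable naming convention. A simplified, standalone version of the lookup the patch adds (extracted here for illustration; the real code lives inside `AirflowConfigParser`) looks like this:

```python
import os

def collect_env_overrides(section):
    """Collect AIRFLOW__<SECTION>__<KEY> environment overrides for a section."""
    prefix = 'AIRFLOW__{S}__'.format(S=section.upper())
    overrides = {}
    for env_var in sorted(os.environ.keys()):
        if env_var.startswith(prefix):
            # Strip the prefix and lowercase to recover the config key name.
            key = env_var.replace(prefix, '').lower()
            overrides[key] = os.environ[env_var]
    return overrides

# e.g. exporting AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__REGION=us-east-2
# now surfaces through getsection('celery_broker_transport_options'):
os.environ['AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__REGION'] = 'us-east-2'
assert collect_env_overrides('celery_broker_transport_options') == {'region': 'us-east-2'}
```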
[jira] [Resolved] (AIRFLOW-3250) Fix for Redis Hook for not authorised connection calls.
[ https://issues.apache.org/jira/browse/AIRFLOW-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-3250. --- Resolution: Fixed > Fix for Redis Hook for not authorised connection calls. > --- > > Key: AIRFLOW-3250 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3250 > Project: Apache Airflow > Issue Type: Bug > Components: db >Affects Versions: 1.9.0, 1.10.0, 1.10.1 >Reporter: Pawel Graczyk >Assignee: Pawel Graczyk >Priority: Minor > Fix For: 2.0.0 > > > Current implementation of AIRFLOW-999 needs fixes. > 1. Password stays None and not 'None' (str) in case there is no password set; > otherwise an AUTH call will be sent, which produces errors on connections to > Redis servers that do not expect authorisation calls. > 2. Reference to the connection is set in the get_conn hook method rather than > in __init__. > 3. Trivial method key_exists of the hook object is removed. > 4. Unit tests are fixed so they exercise the hook code and not related > dependencies such as Redis and DB connections. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-3250) Fix for Redis Hook for not authorised connection calls.
[ https://issues.apache.org/jira/browse/AIRFLOW-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong updated AIRFLOW-3250: -- Affects Version/s: (was: 2.0.0) 1.10.1 > Fix for Redis Hook for not authorised connection calls. > --- > > Key: AIRFLOW-3250 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3250 > Project: Apache Airflow > Issue Type: Bug > Components: db >Affects Versions: 1.9.0, 1.10.0, 1.10.1 >Reporter: Pawel Graczyk >Assignee: Pawel Graczyk >Priority: Minor > Fix For: 2.0.0 > > > Current implementation of AIRFLOW-999 needs fixes. > 1. Password stays None and not 'None' (str) in case there is no password set; > otherwise an AUTH call will be sent, which produces errors on connections to > Redis servers that do not expect authorisation calls. > 2. Reference to the connection is set in the get_conn hook method rather than > in __init__. > 3. Trivial method key_exists of the hook object is removed. > 4. Unit tests are fixed so they exercise the hook code and not related > dependencies such as Redis and DB connections. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-3250) Fix for Redis Hook for not authorised connection calls.
[ https://issues.apache.org/jira/browse/AIRFLOW-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong updated AIRFLOW-3250: -- Fix Version/s: (was: 1.10.0) (was: 1.9.0) 2.0.0 > Fix for Redis Hook for not authorised connection calls. > --- > > Key: AIRFLOW-3250 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3250 > Project: Apache Airflow > Issue Type: Bug > Components: db >Affects Versions: 1.9.0, 1.10.0, 2.0.0 >Reporter: Pawel Graczyk >Assignee: Pawel Graczyk >Priority: Minor > Fix For: 2.0.0 > > > Current implementation of AIRFLOW-999 needs fixes. > 1. Password stays None and not 'None' (str) in case there is no password set; > otherwise an AUTH call will be sent, which produces errors on connections to > Redis servers that do not expect authorisation calls. > 2. Reference to the connection is set in the get_conn hook method rather than > in __init__. > 3. Trivial method key_exists of the hook object is removed. > 4. Unit tests are fixed so they exercise the hook code and not related > dependencies such as Redis and DB connections. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
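Point 1 of the issue is the crux of the fix: a connection with no password may surface it as None, an empty string, or the literal string 'None', and only a real password should trigger an AUTH call. The merged hook normalises the value roughly as below (extracted as a standalone helper for illustration; the real code inlines this in `get_conn`):

```python
def normalize_redis_password(raw):
    # Treat None, '', 'none'/'None' and 'false'/'False' as "no password set"
    # so the client never sends AUTH to an unauthenticated Redis server.
    return None if str(raw).lower() in ['none', 'false', ''] else raw

assert normalize_redis_password(None) is None
assert normalize_redis_password('None') is None
assert normalize_redis_password('') is None
assert normalize_redis_password('s3cret') == 's3cret'
```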
[jira] [Commented] (AIRFLOW-3250) Fix for Redis Hook for not authorised connection calls.
[ https://issues.apache.org/jira/browse/AIRFLOW-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698335#comment-16698335 ] ASF GitHub Bot commented on AIRFLOW-3250: - Fokko closed pull request #4090: [AIRFLOW-3250] Fix for Redis Hook for not authorised connection calls URL: https://github.com/apache/incubator-airflow/pull/4090 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/contrib/hooks/redis_hook.py b/airflow/contrib/hooks/redis_hook.py index 650cc9308b..a34e880796 100644 --- a/airflow/contrib/hooks/redis_hook.py +++ b/airflow/contrib/hooks/redis_hook.py @@ -21,15 +21,13 @@ RedisHook module """ from redis import StrictRedis - -from airflow.exceptions import AirflowException from airflow.hooks.base_hook import BaseHook from airflow.utils.log.logging_mixin import LoggingMixin class RedisHook(BaseHook, LoggingMixin): """ -Hook to interact with Redis database +Wrapper for connection to interact with Redis in-memory data structure store """ def __init__(self, redis_conn_id='redis_default'): """ @@ -39,55 +37,31 @@ def __init__(self, redis_conn_id='redis_default'): we need to connect to Redis. """ self.redis_conn_id = redis_conn_id -self.client = None -conn = self.get_connection(self.redis_conn_id) -self.host = conn.host -self.port = int(conn.port) -self.password = conn.password -self.db = int(conn.extra_dejson.get('db', 0)) - -self.log.debug( -'''Connection "{conn}": -\thost: {host} -\tport: {port} -\textra: {extra} -'''.format( -conn=self.redis_conn_id, -host=self.host, -port=self.port, -extra=conn.extra_dejson -) -) +self.redis = None +self.host = None +self.port = None +self.password = None +self.db = None def get_conn(self): """ Returns a Redis connection. 
""" -if not self.client: +conn = self.get_connection(self.redis_conn_id) +self.host = conn.host +self.port = conn.port +self.password = None if str(conn.password).lower() in ['none', 'false', ''] else conn.password +self.db = conn.extra_dejson.get('db', None) + +if not self.redis: self.log.debug( -'generating Redis client for conn_id "%s" on %s:%s:%s', +'Initializing redis object for conn_id "%s" on %s:%s:%s', self.redis_conn_id, self.host, self.port, self.db ) -try: -self.client = StrictRedis( -host=self.host, -port=self.port, -password=self.password, -db=self.db) -except Exception as general_error: -raise AirflowException( -'Failed to create Redis client, error: {error}'.format( -error=str(general_error) -) -) - -return self.client - -def key_exists(self, key): -""" -Checks if a key exists in Redis database +self.redis = StrictRedis( +host=self.host, +port=self.port, +password=self.password, +db=self.db) -:param key: The key to check the existence. -:type key: str -""" -return self.get_conn().exists(key) +return self.redis diff --git a/airflow/contrib/sensors/redis_key_sensor.py b/airflow/contrib/sensors/redis_key_sensor.py index a2d190baae..4c0ac68840 100644 --- a/airflow/contrib/sensors/redis_key_sensor.py +++ b/airflow/contrib/sensors/redis_key_sensor.py @@ -23,25 +23,17 @@ class RedisKeySensor(BaseSensorOperator): """ -Checks for the existence of a key in a Redis database +Checks for the existence of a key in a Redis """ template_fields = ('key',) ui_color = '#f0eee4' @apply_defaults def __init__(self, key, redis_conn_id, *args, **kwargs): -""" -Create a new RedisKeySensor - -:param key: The key to be monitored -:type key: str -:param redis_conn_id: The connection ID to use when connecting to Redis DB. -:type redis_conn_id: str -""" super(RedisKeySensor, self).__init__(*args, **kwargs) self.redis_conn_id = redis_conn_id self.key = key def poke(self, context): -self.log.info('Sensor check existence of key: %s', self.key) -
[GitHub] Fokko closed pull request #4090: [AIRFLOW-3250] Fix for Redis Hook for not authorised connection calls
Fokko closed pull request #4090: [AIRFLOW-3250] Fix for Redis Hook for not authorised connection calls URL: https://github.com/apache/incubator-airflow/pull/4090 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/contrib/hooks/redis_hook.py b/airflow/contrib/hooks/redis_hook.py index 650cc9308b..a34e880796 100644 --- a/airflow/contrib/hooks/redis_hook.py +++ b/airflow/contrib/hooks/redis_hook.py @@ -21,15 +21,13 @@ RedisHook module """ from redis import StrictRedis - -from airflow.exceptions import AirflowException from airflow.hooks.base_hook import BaseHook from airflow.utils.log.logging_mixin import LoggingMixin class RedisHook(BaseHook, LoggingMixin): """ -Hook to interact with Redis database +Wrapper for connection to interact with Redis in-memory data structure store """ def __init__(self, redis_conn_id='redis_default'): """ @@ -39,55 +37,31 @@ def __init__(self, redis_conn_id='redis_default'): we need to connect to Redis. """ self.redis_conn_id = redis_conn_id -self.client = None -conn = self.get_connection(self.redis_conn_id) -self.host = conn.host -self.port = int(conn.port) -self.password = conn.password -self.db = int(conn.extra_dejson.get('db', 0)) - -self.log.debug( -'''Connection "{conn}": -\thost: {host} -\tport: {port} -\textra: {extra} -'''.format( -conn=self.redis_conn_id, -host=self.host, -port=self.port, -extra=conn.extra_dejson -) -) +self.redis = None +self.host = None +self.port = None +self.password = None +self.db = None def get_conn(self): """ Returns a Redis connection. 
""" -if not self.client: +conn = self.get_connection(self.redis_conn_id) +self.host = conn.host +self.port = conn.port +self.password = None if str(conn.password).lower() in ['none', 'false', ''] else conn.password +self.db = conn.extra_dejson.get('db', None) + +if not self.redis: self.log.debug( -'generating Redis client for conn_id "%s" on %s:%s:%s', +'Initializing redis object for conn_id "%s" on %s:%s:%s', self.redis_conn_id, self.host, self.port, self.db ) -try: -self.client = StrictRedis( -host=self.host, -port=self.port, -password=self.password, -db=self.db) -except Exception as general_error: -raise AirflowException( -'Failed to create Redis client, error: {error}'.format( -error=str(general_error) -) -) - -return self.client - -def key_exists(self, key): -""" -Checks if a key exists in Redis database +self.redis = StrictRedis( +host=self.host, +port=self.port, +password=self.password, +db=self.db) -:param key: The key to check the existence. -:type key: str -""" -return self.get_conn().exists(key) +return self.redis diff --git a/airflow/contrib/sensors/redis_key_sensor.py b/airflow/contrib/sensors/redis_key_sensor.py index a2d190baae..4c0ac68840 100644 --- a/airflow/contrib/sensors/redis_key_sensor.py +++ b/airflow/contrib/sensors/redis_key_sensor.py @@ -23,25 +23,17 @@ class RedisKeySensor(BaseSensorOperator): """ -Checks for the existence of a key in a Redis database +Checks for the existence of a key in a Redis """ template_fields = ('key',) ui_color = '#f0eee4' @apply_defaults def __init__(self, key, redis_conn_id, *args, **kwargs): -""" -Create a new RedisKeySensor - -:param key: The key to be monitored -:type key: str -:param redis_conn_id: The connection ID to use when connecting to Redis DB. 
-:type redis_conn_id: str -""" super(RedisKeySensor, self).__init__(*args, **kwargs) self.redis_conn_id = redis_conn_id self.key = key def poke(self, context): -self.log.info('Sensor check existence of key: %s', self.key) -return RedisHook(self.redis_conn_id).key_exists(self.key) +self.log.info('Sensor checks for existence of key: %s', self.key) +return RedisHook(self.redis_conn_id).get_conn().exists(self.key) diff --git a/airflow/utils/db.py b/airflow/utils/db
[GitHub] Fokko commented on issue #3675: [AIRFLOW-2834] fix build script for k8s docker
Fokko commented on issue #3675: [AIRFLOW-2834] fix build script for k8s docker URL: https://github.com/apache/incubator-airflow/pull/3675#issuecomment-441476807 @verdan PTAL This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Assigned] (AIRFLOW-1561) Scheduler doesn't pick up example dags unless there is at least 1 dag in dags folder
[ https://issues.apache.org/jira/browse/AIRFLOW-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong reassigned AIRFLOW-1561: - Assignee: Shintaro Murakami (was: Chao-Han Tsai) > Scheduler doesn't pick up example dags unless there is at least 1 dag in dags > folder > --- > > Key: AIRFLOW-1561 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1561 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.8.2 >Reporter: Sumit Maheshwari >Assignee: Shintaro Murakami >Priority: Major > Fix For: 2.0.0 > > Attachments: airflow_scheduler_log.txt > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1561) Scheduler doesn't pick up example dags unless there is at least 1 dag in dags folder
[ https://issues.apache.org/jira/browse/AIRFLOW-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698330#comment-16698330 ] ASF GitHub Bot commented on AIRFLOW-1561: - Fokko closed pull request #2635: [AIRFLOW-1561] Fix scheduler to pick up example DAGs without other DAGs URL: https://github.com/apache/incubator-airflow/pull/2635 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/jobs.py b/airflow/jobs.py index 9e68fad797..bcaf93b533 100644 --- a/airflow/jobs.py +++ b/airflow/jobs.py @@ -1675,7 +1675,7 @@ def process_file(self, file_path, zombies, pickle_dags=False, session=None): simple_dags = [] try: -dagbag = models.DagBag(file_path) +dagbag = models.DagBag(file_path, include_examples=False) except Exception: self.log.exception("Failed at reloading the DAG file %s", file_path) Stats.incr('dag_file_refresh_error', 1, 1) diff --git a/airflow/models.py b/airflow/models.py index 95ce629d3b..d505d9479e 100755 --- a/airflow/models.py +++ b/airflow/models.py @@ -292,12 +292,7 @@ def __init__( self.import_errors = {} self.has_logged = False -if include_examples: -example_dag_folder = os.path.join( -os.path.dirname(__file__), -'example_dags') -self.collect_dags(example_dag_folder) -self.collect_dags(dag_folder) +self.collect_dags(dag_folder, include_examples) def size(self): """ @@ -531,7 +526,8 @@ def bag_dag(self, dag, parent_dag, root_dag): def collect_dags( self, dag_folder=None, -only_if_updated=True): +only_if_updated=True, +include_examples=configuration.conf.getboolean('core', 'LOAD_EXAMPLES')): """ Given a file path or a folder, this method looks for python modules, imports them and adds them to the dagbag collection. 
@@ -551,7 +547,7 @@ def collect_dags( stats = [] FileLoadStat = namedtuple( 'FileLoadStat', "file duration dag_num task_num dags") -for filepath in list_py_file_paths(dag_folder): +for filepath in list_py_file_paths(dag_folder, include_examples): try: ts = timezone.utcnow() found_dags = self.process_file( diff --git a/airflow/utils/dag_processing.py b/airflow/utils/dag_processing.py index 47f473e9aa..1b2eeeff2a 100644 --- a/airflow/utils/dag_processing.py +++ b/airflow/utils/dag_processing.py @@ -215,7 +215,8 @@ def get_dag(self, dag_id): return self.dag_id_to_simple_dag[dag_id] -def list_py_file_paths(directory, safe_mode=True): +def list_py_file_paths(directory, safe_mode=True, + include_examples=conf.getboolean('core', 'LOAD_EXAMPLES')): """ Traverse a directory and look for Python files. @@ -284,6 +285,10 @@ def list_py_file_paths(directory, safe_mode=True): except Exception: log = LoggingMixin().log log.exception("Error while examining %s", f) +if include_examples: +import airflow.example_dags +example_dag_folder = airflow.example_dags.__path__[0] +file_paths.extend(list_py_file_paths(example_dag_folder, safe_mode, False)) return file_paths diff --git a/tests/test_jobs.py b/tests/test_jobs.py index af8ccc6c2e..a0b0c222fe 100644 --- a/tests/test_jobs.py +++ b/tests/test_jobs.py @@ -42,6 +42,7 @@ from airflow import AirflowException, settings, models from airflow import configuration from airflow.bin import cli +import airflow.example_dags from airflow.executors import BaseExecutor, SequentialExecutor from airflow.jobs import BaseJob, BackfillJob, SchedulerJob, LocalTaskJob from airflow.models import DAG, DagModel, DagBag, DagRun, Pool, TaskInstance as TI @@ -3335,7 +3336,18 @@ def test_list_py_file_paths(self): if file_name not in ignored_files: expected_files.add( '{}/{}'.format(TEST_DAGS_FOLDER, file_name)) -for file_path in list_py_file_paths(TEST_DAGS_FOLDER): +for file_path in list_py_file_paths(TEST_DAGS_FOLDER, include_examples=False): 
+detected_files.add(file_path) +self.assertEqual(detected_files, expected_files) + +example_dag_folder = airflow.example_dags.__path__[0] +for root, dirs, files in os.walk(example_dag_folder): +for file_name in files: +if file_name.endswith('.py') or file_name.endswith('.zip'): +if file_name not in ['__init__.py']: +
[GitHub] Fokko closed pull request #2635: [AIRFLOW-1561] Fix scheduler to pick up example DAGs without other DAGs
Fokko closed pull request #2635: [AIRFLOW-1561] Fix scheduler to pick up example DAGs without other DAGs URL: https://github.com/apache/incubator-airflow/pull/2635 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/jobs.py b/airflow/jobs.py index 9e68fad797..bcaf93b533 100644 --- a/airflow/jobs.py +++ b/airflow/jobs.py @@ -1675,7 +1675,7 @@ def process_file(self, file_path, zombies, pickle_dags=False, session=None): simple_dags = [] try: -dagbag = models.DagBag(file_path) +dagbag = models.DagBag(file_path, include_examples=False) except Exception: self.log.exception("Failed at reloading the DAG file %s", file_path) Stats.incr('dag_file_refresh_error', 1, 1) diff --git a/airflow/models.py b/airflow/models.py index 95ce629d3b..d505d9479e 100755 --- a/airflow/models.py +++ b/airflow/models.py @@ -292,12 +292,7 @@ def __init__( self.import_errors = {} self.has_logged = False -if include_examples: -example_dag_folder = os.path.join( -os.path.dirname(__file__), -'example_dags') -self.collect_dags(example_dag_folder) -self.collect_dags(dag_folder) +self.collect_dags(dag_folder, include_examples) def size(self): """ @@ -531,7 +526,8 @@ def bag_dag(self, dag, parent_dag, root_dag): def collect_dags( self, dag_folder=None, -only_if_updated=True): +only_if_updated=True, +include_examples=configuration.conf.getboolean('core', 'LOAD_EXAMPLES')): """ Given a file path or a folder, this method looks for python modules, imports them and adds them to the dagbag collection. 
@@ -551,7 +547,7 @@ def collect_dags( stats = [] FileLoadStat = namedtuple( 'FileLoadStat', "file duration dag_num task_num dags") -for filepath in list_py_file_paths(dag_folder): +for filepath in list_py_file_paths(dag_folder, include_examples): try: ts = timezone.utcnow() found_dags = self.process_file( diff --git a/airflow/utils/dag_processing.py b/airflow/utils/dag_processing.py index 47f473e9aa..1b2eeeff2a 100644 --- a/airflow/utils/dag_processing.py +++ b/airflow/utils/dag_processing.py @@ -215,7 +215,8 @@ def get_dag(self, dag_id): return self.dag_id_to_simple_dag[dag_id] -def list_py_file_paths(directory, safe_mode=True): +def list_py_file_paths(directory, safe_mode=True, + include_examples=conf.getboolean('core', 'LOAD_EXAMPLES')): """ Traverse a directory and look for Python files. @@ -284,6 +285,10 @@ def list_py_file_paths(directory, safe_mode=True): except Exception: log = LoggingMixin().log log.exception("Error while examining %s", f) +if include_examples: +import airflow.example_dags +example_dag_folder = airflow.example_dags.__path__[0] +file_paths.extend(list_py_file_paths(example_dag_folder, safe_mode, False)) return file_paths diff --git a/tests/test_jobs.py b/tests/test_jobs.py index af8ccc6c2e..a0b0c222fe 100644 --- a/tests/test_jobs.py +++ b/tests/test_jobs.py @@ -42,6 +42,7 @@ from airflow import AirflowException, settings, models from airflow import configuration from airflow.bin import cli +import airflow.example_dags from airflow.executors import BaseExecutor, SequentialExecutor from airflow.jobs import BaseJob, BackfillJob, SchedulerJob, LocalTaskJob from airflow.models import DAG, DagModel, DagBag, DagRun, Pool, TaskInstance as TI @@ -3335,7 +3336,18 @@ def test_list_py_file_paths(self): if file_name not in ignored_files: expected_files.add( '{}/{}'.format(TEST_DAGS_FOLDER, file_name)) -for file_path in list_py_file_paths(TEST_DAGS_FOLDER): +for file_path in list_py_file_paths(TEST_DAGS_FOLDER, include_examples=False): 
+detected_files.add(file_path) +self.assertEqual(detected_files, expected_files) + +example_dag_folder = airflow.example_dags.__path__[0] +for root, dirs, files in os.walk(example_dag_folder): +for file_name in files: +if file_name.endswith('.py') or file_name.endswith('.zip'): +if file_name not in ['__init__.py']: +expected_files.add(os.path.join(root, file_name)) +detected_files.clear() +for file_path in list_py_file_paths(TEST_DAGS_FOLDER, include_examples=True): detected_files.add(file_path) self.assertEqual(detect
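The core of the patch above is that `list_py_file_paths` itself now appends the example-DAG files when `include_examples` is enabled, so the scheduler sees them even with an empty dags folder. A stripped-down sketch of that flow (the single recursion with examples disabled mirrors the diff; everything else, including the parameters, is simplified):

```python
import os

def list_py_file_paths(directory, include_examples=False,
                       example_dag_folder=None):
    """Simplified stand-in for airflow.utils.dag_processing.list_py_file_paths."""
    file_paths = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith('.py'):
                file_paths.append(os.path.join(root, name))
    # As in the patch: recurse once into the example folder with examples
    # disabled, so the example DAG files are added exactly once.
    if include_examples and example_dag_folder:
        file_paths.extend(list_py_file_paths(example_dag_folder))
    return file_paths
```

In Airflow itself the example folder comes from `airflow.example_dags.__path__[0]` and the default for `include_examples` is `conf.getboolean('core', 'LOAD_EXAMPLES')`.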
[jira] [Resolved] (AIRFLOW-1561) Scheduler doesn't pick up example dags unless there is at least 1 dag in dags folder
[ https://issues.apache.org/jira/browse/AIRFLOW-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-1561. --- Resolution: Fixed Fix Version/s: 2.0.0 > Scheduler doesn't pick up example dags unless there is at least 1 dag in dags > folder > --- > > Key: AIRFLOW-1561 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1561 > Project: Apache Airflow > Issue Type: Bug >Affects Versions: 1.8.2 >Reporter: Sumit Maheshwari >Assignee: Chao-Han Tsai >Priority: Major > Fix For: 2.0.0 > > Attachments: airflow_scheduler_log.txt > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1196) Make trigger_dag_id a templated field of TriggerDagRunOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698326#comment-16698326 ] ASF GitHub Bot commented on AIRFLOW-1196: - Fokko closed pull request #4228: [AIRFLOW-1196][AIRFLOW-2399] Add templated field in TriggerDagRunOperator URL: https://github.com/apache/incubator-airflow/pull/4228 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/operators/dagrun_operator.py b/airflow/operators/dagrun_operator.py index 53814af410..8d339894f4 100644 --- a/airflow/operators/dagrun_operator.py +++ b/airflow/operators/dagrun_operator.py @@ -35,7 +35,7 @@ class TriggerDagRunOperator(BaseOperator): """ Triggers a DAG run for a specified ``dag_id`` -:param trigger_dag_id: the dag_id to trigger +:param trigger_dag_id: the dag_id to trigger (templated) :type trigger_dag_id: str :param python_callable: a reference to a python function that will be called while passing it the ``context`` object and a placeholder @@ -50,7 +50,7 @@ class TriggerDagRunOperator(BaseOperator): :param execution_date: Execution date for the dag :type execution_date: datetime.datetime """ -template_fields = tuple() +template_fields = ('trigger_dag_id',) template_ext = tuple() ui_color = '#ffefeb' This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Make trigger_dag_id a templated field of TriggerDagRunOperator > -- > > Key: AIRFLOW-1196 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1196 > Project: Apache Airflow > Issue Type: Improvement > Components: operators >Reporter: Arthur Vigil >Assignee: Kaxil Naik >Priority: Trivial > Labels: easyfix, improvement > Fix For: 2.0.0 > > > TriggerDagRunOperator currently has no templated fields. Adding > `trigger_dag_id` as a templated field should be a trivial change that > improves its flexibility and usefulness. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
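Declaring `trigger_dag_id` in `template_fields` means Airflow renders that attribute through Jinja with the task context before `execute()` runs, so the target DAG id can be computed per run. A stdlib-only sketch of the mechanism (`string.Template` stands in for Jinja here, and `FakeOperator` is made up purely for illustration):

```python
from string import Template  # stand-in for Airflow's Jinja rendering

class FakeOperator:
    # Fields listed here get rendered with the runtime context, mirroring
    # what template_fields = ('trigger_dag_id',) enables on the operator.
    template_fields = ('trigger_dag_id',)

    def __init__(self, trigger_dag_id):
        self.trigger_dag_id = trigger_dag_id

    def render_template_fields(self, context):
        for field in self.template_fields:
            rendered = Template(getattr(self, field)).substitute(context)
            setattr(self, field, rendered)

op = FakeOperator(trigger_dag_id='downstream_$ds_nodash')
op.render_template_fields({'ds_nodash': '20181126'})
assert op.trigger_dag_id == 'downstream_20181126'
```

With the real operator the field would carry a Jinja expression instead, e.g. `trigger_dag_id="downstream_{{ ds_nodash }}"`.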
[GitHub] codecov-io edited a comment on issue #4223: [AIRFLOW-XXX] Remove unnecessary usage of "# noqa" in airflow/bin/cli.py
codecov-io edited a comment on issue #4223: [AIRFLOW-XXX] Remove unnecessary usage of "# noqa" in airflow/bin/cli.py URL: https://github.com/apache/incubator-airflow/pull/4223#issuecomment-441178549 # [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4223?src=pr&el=h1) Report > Merging [#4223](https://codecov.io/gh/apache/incubator-airflow/pull/4223?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-airflow/commit/9f7f5e4a1eaae1da5f3ecdabe26984b9bcaa69fb?src=pr&el=desc) will **not change** coverage. > The diff coverage is `100%`. [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-airflow/pull/4223/graphs/tree.svg?width=650&token=WdLKlKHOAU&height=150&src=pr)](https://codecov.io/gh/apache/incubator-airflow/pull/4223?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## master #4223 +/- ## == Coverage 77.8% 77.8% == Files 201 201 Lines 16360 16360 == Hits 12729 12729 Misses 3631 3631 ``` | [Impacted Files](https://codecov.io/gh/apache/incubator-airflow/pull/4223?src=pr&el=tree) | Coverage Δ | | |---|---|---| | [airflow/bin/cli.py](https://codecov.io/gh/apache/incubator-airflow/pull/4223/diff?src=pr&el=tree#diff-YWlyZmxvdy9iaW4vY2xpLnB5) | `64.59% <100%> (ø)` | :arrow_up: | -- [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4223?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/apache/incubator-airflow/pull/4223?src=pr&el=footer). Last update [9f7f5e4...9df65bc](https://codecov.io/gh/apache/incubator-airflow/pull/4223?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Resolved] (AIRFLOW-1196) Make trigger_dag_id a templated field of TriggerDagRunOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-1196. --- Resolution: Fixed Fix Version/s: 2.0.0 > Make trigger_dag_id a templated field of TriggerDagRunOperator > -- > > Key: AIRFLOW-1196 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1196 > Project: Apache Airflow > Issue Type: Improvement > Components: operators >Reporter: Arthur Vigil >Assignee: Arthur Vigil >Priority: Trivial > Labels: easyfix, improvement > Fix For: 2.0.0 > > > TriggerDagRunOperator currently has no templated fields. Adding > `trigger_dag_id` as a templated field should be a trivial change that > improves its flexibility and usefulness. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] Fokko closed pull request #4228: [AIRFLOW-1196][AIRFLOW-2399] Add templated field in TriggerDagRunOperator
Fokko closed pull request #4228: [AIRFLOW-1196][AIRFLOW-2399] Add templated field in TriggerDagRunOperator URL: https://github.com/apache/incubator-airflow/pull/4228 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/operators/dagrun_operator.py b/airflow/operators/dagrun_operator.py index 53814af410..8d339894f4 100644 --- a/airflow/operators/dagrun_operator.py +++ b/airflow/operators/dagrun_operator.py @@ -35,7 +35,7 @@ class TriggerDagRunOperator(BaseOperator): """ Triggers a DAG run for a specified ``dag_id`` -:param trigger_dag_id: the dag_id to trigger +:param trigger_dag_id: the dag_id to trigger (templated) :type trigger_dag_id: str :param python_callable: a reference to a python function that will be called while passing it the ``context`` object and a placeholder @@ -50,7 +50,7 @@ class TriggerDagRunOperator(BaseOperator): :param execution_date: Execution date for the dag :type execution_date: datetime.datetime """ -template_fields = tuple() +template_fields = ('trigger_dag_id',) template_ext = tuple() ui_color = '#ffefeb' This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Resolved] (AIRFLOW-2399) TriggerDagRunOperator Should Permit Implementor to Choose Dag Id at Runtime
[ https://issues.apache.org/jira/browse/AIRFLOW-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong resolved AIRFLOW-2399. --- Resolution: Fixed Assignee: Kaxil Naik Fix Version/s: 2.0.0 > TriggerDagRunOperator Should Permit Implementor to Choose Dag Id at Runtime > --- > > Key: AIRFLOW-2399 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2399 > Project: Apache Airflow > Issue Type: Bug > Components: operators >Reporter: andy dreyfuss >Assignee: Kaxil Naik >Priority: Minor > Fix For: 2.0.0 > > > Currently TriggerDagRunOperator takes a string argument to specify the one > dag id that is triggered on success. But it would be just as easy and more > flexible to allow any. Incoming PR to demo what I mean ... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (AIRFLOW-1196) Make trigger_dag_id a templated field of TriggerDagRunOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong reassigned AIRFLOW-1196: - Assignee: Kaxil Naik (was: Arthur Vigil) > Make trigger_dag_id a templated field of TriggerDagRunOperator > -- > > Key: AIRFLOW-1196 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1196 > Project: Apache Airflow > Issue Type: Improvement > Components: operators >Reporter: Arthur Vigil >Assignee: Kaxil Naik >Priority: Trivial > Labels: easyfix, improvement > Fix For: 2.0.0 > > > TriggerDagRunOperator currently has no templated fields. Adding > `trigger_dag_id` as a templated field should be a trivial change that > improves its flexibility and usefulness. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] Fokko commented on a change in pull request #4231: [AIRFLOW-3066] Adding support for AWS Batch parameters
Fokko commented on a change in pull request #4231: [AIRFLOW-3066] Adding support for AWS Batch parameters URL: https://github.com/apache/incubator-airflow/pull/4231#discussion_r236091049 ## File path: airflow/contrib/example_dags/example_awsbatch_operator.py ## @@ -0,0 +1,95 @@ +# -*- coding: utf-8 -*- +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+ +import airflow +from airflow.utils.log.logging_mixin import LoggingMixin +from airflow.models import DAG +from datetime import timedelta + +log = LoggingMixin().log + +try: +# AWS Batch is optional, so not available in vanilla Airflow +# pip install apache-airflow[boto3] +from airflow.contrib.operators.awsbatch_operator import AWSBatchOperator + +default_args = { +'owner': 'airflow', +'depends_on_past': False, +'start_date': airflow.utils.dates.days_ago(2), +'email': ['airf...@airflow.com'], +'email_on_failure': False, +'email_on_retry': False, +'retries': 1, +'retry_delay': timedelta(minutes=5), +} + +dag = DAG( +'example_awsbatch_dag', default_args=default_args, schedule_interval=timedelta(1)) + +# vanilla example +t0 = AWSBatchOperator( +task_id='airflow-vanilla', +job_name='airflow-vanilla', +job_queue='airflow', +job_definition='airflow', +overrides={}, +queue='airflow', +dag=dag) + +# overrides example +t1 = AWSBatchOperator( +job_name='airflow-overrides', +task_id='airflow-overrides', +job_queue='airflow', +job_definition='airflow', +overrides={ +"command": [ +"echo", +"overrides" +] +}, +queue='airflow', +dag=dag) + +# parameters example +t2 = AWSBatchOperator( +job_name='airflow-parameters', +task_id='airflow-parameters', +job_queue='airflow', +job_definition='airflow', +overrides={ +"command": [ +"echo", +"Ref::input" +] +}, +parameters={ +"input": "Airflow2000" +}, +queue='airflow', +dag=dag) + +t0.set_upstream(t1) Review comment: Can we use the `>>` notation here? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
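The `>>` notation the reviewer asks for is Airflow operator-overloading sugar: `a >> b` declares that `a` runs before `b`, so `t0.set_upstream(t1)` would become `t1 >> t0`. A minimal sketch of the mechanism with a toy `Task` class (not Airflow's implementation):

```python
class Task:
    """Toy task supporting Airflow-style ``>>`` dependency chaining."""
    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = []

    def set_upstream(self, other):
        self.upstream.append(other)

    def __rshift__(self, other):
        # a >> b  means "a runs before b", i.e. b.set_upstream(a).
        other.set_upstream(self)
        return other  # returning `other` enables a >> b >> c chains


t0 = Task('t0')
t1 = Task('t1')
t1 >> t0  # equivalent to t0.set_upstream(t1)
print([t.task_id for t in t0.upstream])  # ['t1']
```

Returning the right-hand operand from `__rshift__` is what lets dependencies be written as a single readable chain, which is why reviewers prefer it over repeated `set_upstream` calls.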
[GitHub] Fokko commented on a change in pull request #4231: [AIRFLOW-3066] Adding support for AWS Batch parameters
Fokko commented on a change in pull request #4231: [AIRFLOW-3066] Adding support for AWS Batch parameters URL: https://github.com/apache/incubator-airflow/pull/4231#discussion_r236091104 ## File path: airflow/contrib/operators/awsbatch_operator.py ## @@ -94,11 +100,21 @@ def execute(self, context): ) try: -response = self.client.submit_job( -jobName=self.job_name, -jobQueue=self.job_queue, -jobDefinition=self.job_definition, -containerOverrides=self.overrides) +if self.parameters is None: Review comment: Do we need the `if` construction here? Can't we just pass None parameters? Or do `parameters =self.parameters or {}` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
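The reviewer's `parameters = self.parameters or {}` suggestion collapses the if/else into one expression passed straight through as a keyword argument. A minimal sketch of the idiom, with a hypothetical stand-in for the boto3 `submit_job` call:

```python
def submit_job(jobName, jobQueue, jobDefinition, containerOverrides, parameters):
    # Hypothetical stand-in for the boto3 Batch client call;
    # it just echoes back what it was given.
    return {'jobName': jobName, 'parameters': parameters}


class AWSBatchSketch:
    def __init__(self, job_name, parameters=None):
        self.job_name = job_name
        self.parameters = parameters

    def execute(self):
        # `self.parameters or {}` normalizes None to an empty dict,
        # so no if/else branch is needed around the call.
        return submit_job(
            jobName=self.job_name,
            jobQueue='airflow',
            jobDefinition='airflow',
            containerOverrides={},
            parameters=self.parameters or {})


print(AWSBatchSketch('demo').execute()['parameters'])  # {}
print(AWSBatchSketch('demo', {'input': 'Airflow2000'}).execute()['parameters'])
```

One caveat of `or {}`: it also replaces an explicitly passed empty dict with a fresh one, which is harmless here since both mean "no parameters".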
[jira] [Commented] (AIRFLOW-1823) API get_task_info is incompatible with manual runs created by UI
[ https://issues.apache.org/jira/browse/AIRFLOW-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698301#comment-16698301 ] Iuliia Volkova commented on AIRFLOW-1823: - this issue is only relevant to 1.9; in 1.10 it was already fixed and works correctly. [~bolke], can you close this task? > API get_task_info is incompatible with manual runs created by UI > > > Key: AIRFLOW-1823 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1823 > Project: Apache Airflow > Issue Type: Bug > Components: api >Affects Versions: 2.0.0 > Environment: ubuntu > Airflow 1.9rc02 > commit: > https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126 >Reporter: Jeremy Lewi >Assignee: Iuliia Volkova >Priority: Minor > > The API method > [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126] > doesn't work with manual runs created by the UI. > The UI creates dag runs with ids with sub second precision in the name. An > example of a run created by the UI is > 2017-11-16T20:23:32.045330 > The endpoint for > [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126] > however assumes the dag run id is of the form '%Y-%m-%dT%H:%M:%S'. > Runs triggered via the CLI generate run ids with the form expected by the API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (AIRFLOW-1823) API get_task_info is incompatible with manual runs created by UI
[ https://issues.apache.org/jira/browse/AIRFLOW-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698300#comment-16698300 ] ASF GitHub Bot commented on AIRFLOW-1823: - xnuinside closed pull request #4237: [WIP][AIRFLOW-1823] added possibility to send execution date with millisec… URL: https://github.com/apache/incubator-airflow/pull/4237 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/www_rbac/api/experimental/endpoints.py b/airflow/www_rbac/api/experimental/endpoints.py index 21aa7e8a1a..6589d203d2 100644 --- a/airflow/www_rbac/api/experimental/endpoints.py +++ b/airflow/www_rbac/api/experimental/endpoints.py @@ -168,18 +168,21 @@ def task_instance_info(dag_id, execution_date, task_id): """ Returns a JSON with a task instance's public instance variables. The format for the exec_date is expected to be -"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15". This will +"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15" or +"-mm-DDTHH:MM:SS.f", for example: "2016-11-16T11:34:15.680399". This will of course need to have been encoded for URL in the request. """ # Convert string datetime into actual datetime try: -execution_date = timezone.parse(execution_date) +execution_date = datetime.strptime( +execution_date, '%Y-%m-%dT%H:%M:%S.%f' if '.' in execution_date +else '%Y-%m-%dT%H:%M:%S') except ValueError: error_message = ( 'Given execution date, {}, could not be identified ' -'as a date. Example date format: 2015-11-16T14:34:15+00:00' -.format(execution_date)) +'as a date. 
Example date format: 2015-11-16T14:34:15 or ' +'2015-11-16T14:34:15.680399'.format(execution_date)) _log.info(error_message) response = jsonify({'error': error_message}) response.status_code = 400 @@ -209,17 +212,21 @@ def dag_run_status(dag_id, execution_date): """ Returns a JSON with a dag_run's public instance variables. The format for the exec_date is expected to be -"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15". This will +"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15" or +"-mm-DDTHH:MM:SS.f", for example: "2016-11-16T11:34:15.680399". This will of course need to have been encoded for URL in the request. """ # Convert string datetime into actual datetime try: -execution_date = timezone.parse(execution_date) +execution_date = datetime.strptime( +execution_date, '%Y-%m-%dT%H:%M:%S.%f' if '.' in execution_date +else '%Y-%m-%dT%H:%M:%S') except ValueError: error_message = ( 'Given execution date, {}, could not be identified ' -'as a date. Example date format: 2015-11-16T14:34:15+00:00'.format( +'as a date. Example date format: 2015-11-16T14:34:15+00:00 or ' +'2015-11-16T14:34:15.680399'.format( execution_date)) _log.info(error_message) response = jsonify({'error': error_message}) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > API get_task_info is incompatible with manual runs created by UI > > > Key: AIRFLOW-1823 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1823 > Project: Apache Airflow > Issue Type: Bug > Components: api >Affects Versions: 2.0.0 > Environment: ubuntu > Airflow 1.9rc02 > commit: > https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126 >Reporter: Jeremy Lewi >Assignee: Iuliia Volkova >Priority: Minor > > The API method > [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126] > doesn't work with manual runs created by the UI. > The UI creates dag runs with ids with sub second precision in the name. An > example of a run created by the UI is > 2017-11-16T20:23:32.045330 > The endpoint for > [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126] > however assumes the dag run id is of the form '%Y-%m-%dT%H:%M:%S'. > Runs triggered via
[GitHub] xnuinside commented on a change in pull request #4237: [WIP][AIRFLOW-1823] added possibility to send execution date with millisec…
xnuinside commented on a change in pull request #4237: [WIP][AIRFLOW-1823] added possibility to send execution date with millisec… URL: https://github.com/apache/incubator-airflow/pull/4237#discussion_r236090006 ## File path: airflow/www_rbac/api/experimental/endpoints.py ## @@ -168,18 +168,21 @@ def task_instance_info(dag_id, execution_date, task_id): """ Returns a JSON with a task instance's public instance variables. The format for the exec_date is expected to be -"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15". This will +"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15" or +"-mm-DDTHH:MM:SS.f", for example: "2016-11-16T11:34:15.680399". This will of course need to have been encoded for URL in the request. """ # Convert string datetime into actual datetime try: -execution_date = timezone.parse(execution_date) Review comment: yes, it works correctly in 1.10; sorry for the PR. I think we need to close the task - https://issues.apache.org/jira/projects/AIRFLOW/issues/AIRFLOW-1823 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] xnuinside closed pull request #4237: [WIP][AIRFLOW-1823] added possibility to send execution date with millisec…
xnuinside closed pull request #4237: [WIP][AIRFLOW-1823] added possibility to send execution date with millisec… URL: https://github.com/apache/incubator-airflow/pull/4237 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/www_rbac/api/experimental/endpoints.py b/airflow/www_rbac/api/experimental/endpoints.py index 21aa7e8a1a..6589d203d2 100644 --- a/airflow/www_rbac/api/experimental/endpoints.py +++ b/airflow/www_rbac/api/experimental/endpoints.py @@ -168,18 +168,21 @@ def task_instance_info(dag_id, execution_date, task_id): """ Returns a JSON with a task instance's public instance variables. The format for the exec_date is expected to be -"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15". This will +"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15" or +"-mm-DDTHH:MM:SS.f", for example: "2016-11-16T11:34:15.680399". This will of course need to have been encoded for URL in the request. """ # Convert string datetime into actual datetime try: -execution_date = timezone.parse(execution_date) +execution_date = datetime.strptime( +execution_date, '%Y-%m-%dT%H:%M:%S.%f' if '.' in execution_date +else '%Y-%m-%dT%H:%M:%S') except ValueError: error_message = ( 'Given execution date, {}, could not be identified ' -'as a date. Example date format: 2015-11-16T14:34:15+00:00' -.format(execution_date)) +'as a date. Example date format: 2015-11-16T14:34:15 or ' +'2015-11-16T14:34:15.680399'.format(execution_date)) _log.info(error_message) response = jsonify({'error': error_message}) response.status_code = 400 @@ -209,17 +212,21 @@ def dag_run_status(dag_id, execution_date): """ Returns a JSON with a dag_run's public instance variables. The format for the exec_date is expected to be -"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15". 
This will +"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15" or +"-mm-DDTHH:MM:SS.f", for example: "2016-11-16T11:34:15.680399". This will of course need to have been encoded for URL in the request. """ # Convert string datetime into actual datetime try: -execution_date = timezone.parse(execution_date) +execution_date = datetime.strptime( +execution_date, '%Y-%m-%dT%H:%M:%S.%f' if '.' in execution_date +else '%Y-%m-%dT%H:%M:%S') except ValueError: error_message = ( 'Given execution date, {}, could not be identified ' -'as a date. Example date format: 2015-11-16T14:34:15+00:00'.format( +'as a date. Example date format: 2015-11-16T14:34:15+00:00 or ' +'2015-11-16T14:34:15.680399'.format( execution_date)) _log.info(error_message) response = jsonify({'error': error_message}) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
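The parsing change in the diff above reduces to picking a `strptime` format based on whether the incoming string carries a fractional-seconds part. A minimal stand-alone sketch of that logic (naive datetimes only; the helper name is ours, not Airflow's):

```python
from datetime import datetime


def parse_execution_date(execution_date):
    # Choose the strptime format by the presence of a fractional-seconds
    # part, mirroring the diff; timezone offsets are not handled here,
    # which is exactly the concern raised in review.
    fmt = ('%Y-%m-%dT%H:%M:%S.%f' if '.' in execution_date
           else '%Y-%m-%dT%H:%M:%S')
    return datetime.strptime(execution_date, fmt)


print(parse_execution_date('2017-11-16T20:23:32.045330'))  # UI-style run id
print(parse_execution_date('2016-11-16T11:34:15'))         # CLI-style run id
```

Unlike `timezone.parse`, `strptime` with these formats rejects offsets such as `+00:00` and always returns a naive datetime, which is the trade-off that led to the PR being closed.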
[GitHub] xnuinside commented on a change in pull request #4237: [AIRFLOW-1823] added possibility to send execution date with millisec…
xnuinside commented on a change in pull request #4237: [AIRFLOW-1823] added possibility to send execution date with millisec… URL: https://github.com/apache/incubator-airflow/pull/4237#discussion_r236089830 ## File path: airflow/www_rbac/api/experimental/endpoints.py ## @@ -168,18 +168,21 @@ def task_instance_info(dag_id, execution_date, task_id): """ Returns a JSON with a task instance's public instance variables. The format for the exec_date is expected to be -"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15". This will +"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15" or +"-mm-DDTHH:MM:SS.f", for example: "2016-11-16T11:34:15.680399". This will of course need to have been encoded for URL in the request. """ # Convert string datetime into actual datetime try: -execution_date = timezone.parse(execution_date) Review comment: @bolkedebruin, oh, shame! I didn't notice that this was changed after 1.9 and just applied my fix from 1.9; sorry about that (it was - https://github.com/apache/incubator-airflow/blob/v1-9-stable/airflow/www/api/experimental/endpoints.py#L126) so maybe this task is no longer relevant in 1.10? https://issues.apache.org/jira/projects/AIRFLOW/issues/AIRFLOW-1823 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] bolkedebruin commented on a change in pull request #4237: [AIRFLOW-1823] added possibility to send execution date with millisec…
bolkedebruin commented on a change in pull request #4237: [AIRFLOW-1823] added possibility to send execution date with millisec… URL: https://github.com/apache/incubator-airflow/pull/4237#discussion_r236089571 ## File path: airflow/www_rbac/api/experimental/endpoints.py ## @@ -168,18 +168,21 @@ def task_instance_info(dag_id, execution_date, task_id): """ Returns a JSON with a task instance's public instance variables. The format for the exec_date is expected to be -"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15". This will +"-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15" or +"-mm-DDTHH:MM:SS.f", for example: "2016-11-16T11:34:15.680399". This will of course need to have been encoded for URL in the request. """ # Convert string datetime into actual datetime try: -execution_date = timezone.parse(execution_date) Review comment: You are removing timezone information here and rely on your own parsing mechanism. I don't think that's a good idea? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] xnuinside commented on issue #4237: [AIRFLOW-1823] added possibility to send execution date with millisec…
xnuinside commented on issue #4237: [AIRFLOW-1823] added possibility to send execution date with millisec… URL: https://github.com/apache/incubator-airflow/pull/4237#issuecomment-441471988 @Fokko, @ashb, @feng-tao, @bolkedebruin, somebody please review (I don't know what's wrong with TravisCI - strange failures). As I understand it, the fact that we get different timestamp formats when a DAG is triggered from the UI versus from the CLI is some kind of incorrect behavior, a 'bug', but something tells me it's not a quick fix: moving to a single format (with or without milliseconds) would cause backward-incompatible changes. Or am I wrong? I prepared a fix that makes the REST API work with both execution date formats - with milliseconds and without (as it was). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (AIRFLOW-1823) API get_task_info is incompatible with manual runs created by UI
[ https://issues.apache.org/jira/browse/AIRFLOW-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698276#comment-16698276 ] ASF GitHub Bot commented on AIRFLOW-1823: - xnuinside opened a new pull request #4237: [AIRFLOW-1823] added possibility to send execution date with millisec… URL: https://github.com/apache/incubator-airflow/pull/4237 …onds to task_instance_info REST API endpoint Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-1823\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-1823 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: added compatibility with UI triggered DAGs checked: https://user-images.githubusercontent.com/15959809/48984139-260a2980-f109-11e8-95cb-31b8bc1998d1.png";> ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. 
- When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. ### Code Quality - [ ] Passes `flake8` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > API get_task_info is incompatible with manual runs created by UI > > > Key: AIRFLOW-1823 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1823 > Project: Apache Airflow > Issue Type: Bug > Components: api >Affects Versions: 2.0.0 > Environment: ubuntu > Airflow 1.9rc02 > commit: > https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126 >Reporter: Jeremy Lewi >Assignee: Iuliia Volkova >Priority: Minor > > The API method > [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126] > doesn't work with manual runs created by the UI. > The UI creates dag runs with ids with sub second precision in the name. An > example of a run created by the UI is > 2017-11-16T20:23:32.045330 > The endpoint for > [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126] > however assumes the dag run id is of the form '%Y-%m-%dT%H:%M:%S'. > Runs triggered via the CLI generate run ids with the form expected by the API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] xnuinside opened a new pull request #4237: [AIRFLOW-1823] added possibility to send execution date with millisec…
xnuinside opened a new pull request #4237: [AIRFLOW-1823] added possibility to send execution date with millisec… URL: https://github.com/apache/incubator-airflow/pull/4237 …onds to task_instance_info REST API endpoint Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-1823\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-1823 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: added compatibility with UI triggered DAGs checked: https://user-images.githubusercontent.com/15959809/48984139-260a2980-f109-11e8-95cb-31b8bc1998d1.png";> ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. ### Code Quality - [ ] Passes `flake8` This is an automated message from the Apache Git Service. 
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Assigned] (AIRFLOW-1823) API get_task_info is incompatible with manual runs created by UI
[ https://issues.apache.org/jira/browse/AIRFLOW-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iuliia Volkova reassigned AIRFLOW-1823: --- Assignee: Iuliia Volkova (was: Tao Feng) > API get_task_info is incompatible with manual runs created by UI > > > Key: AIRFLOW-1823 > URL: https://issues.apache.org/jira/browse/AIRFLOW-1823 > Project: Apache Airflow > Issue Type: Bug > Components: api >Affects Versions: 2.0.0 > Environment: ubuntu > Airflow 1.9rc02 > commit: > https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126 >Reporter: Jeremy Lewi >Assignee: Iuliia Volkova >Priority: Minor > > The API method > [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126] > doesn't work with manual runs created by the UI. > The UI creates dag runs with ids with sub second precision in the name. An > example of a run created by the UI is > 2017-11-16T20:23:32.045330 > The endpoint for > [task_instance_info|https://github.com/apache/incubator-airflow/blob/master/airflow/www/api/experimental/endpoints.py#L126] > however assumes the dag run id is of the form '%Y-%m-%dT%H:%M:%S'. > Runs triggered via the CLI generate run ids with the form expected by the API. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] xnuinside commented on issue #4236: [AIRFLOW-3395] added the REST API endpoints to the doc
xnuinside commented on issue #4236: [AIRFLOW-3395] added the REST API endpoints to the doc URL: https://github.com/apache/incubator-airflow/pull/4236#issuecomment-441468648 @Fokko, two tests failed with strange errors; I believe they're not related to this PR :) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] kaxil commented on issue #4224: [AIRFLOW-3396] Make sql param as required in BigQueryOperator
kaxil commented on issue #4224: [AIRFLOW-3396] Make sql param as required in BigQueryOperator URL: https://github.com/apache/incubator-airflow/pull/4224#issuecomment-441466785 @bolkedebruin :) Updated. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (AIRFLOW-3396) Redundant sql param check in BigQueryOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik updated AIRFLOW-3396: Description: The 'sql' parameter in BigQueryOperator accepts None and then there is a check which checks if it is None and gives error. (was: The 'sql' p) > Redundant sql param check in BigQueryOperator > - > > Key: AIRFLOW-3396 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3396 > Project: Apache Airflow > Issue Type: Task > Components: gcp >Reporter: Kaxil Naik >Assignee: Kaxil Naik >Priority: Trivial > Fix For: 2.0.0 > > > The 'sql' parameter in BigQueryOperator accepts None and then there is a > check which checks if it is None and gives error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-3396) Redundant sql param check in BigQueryOperator
[ https://issues.apache.org/jira/browse/AIRFLOW-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaxil Naik updated AIRFLOW-3396: Description: The 'sql' p > Redundant sql param check in BigQueryOperator > - > > Key: AIRFLOW-3396 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3396 > Project: Apache Airflow > Issue Type: Task > Components: gcp >Reporter: Kaxil Naik >Assignee: Kaxil Naik >Priority: Trivial > Fix For: 2.0.0 > > > The 'sql' p -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] xnuinside commented on issue #4236: [AIRFLOW-3395] added the REST API endpoints to the doc
xnuinside commented on issue #4236: [AIRFLOW-3395] added the REST API endpoints to the doc URL: https://github.com/apache/incubator-airflow/pull/4236#issuecomment-441466117 @Fokko, fixed. Regarding https://github.com/soerface/flask-restful-swagger-2.0, I think we should treat it as a new feature in a separate task (my impression is that it is not quick to add, and we would first need to decide whether we want to use swagger at all).
[jira] [Created] (AIRFLOW-3396) Redundant sql param check in BigQueryOperator
Kaxil Naik created AIRFLOW-3396: --- Summary: Redundant sql param check in BigQueryOperator Key: AIRFLOW-3396 URL: https://issues.apache.org/jira/browse/AIRFLOW-3396 Project: Apache Airflow Issue Type: Task Components: gcp Reporter: Kaxil Naik Assignee: Kaxil Naik Fix For: 2.0.0
[GitHub] Fokko commented on issue #3560: [AIRFLOW-2697] Drop snakebite in favour of hdfs3
Fokko commented on issue #3560: [AIRFLOW-2697] Drop snakebite in favour of hdfs3 URL: https://github.com/apache/incubator-airflow/pull/3560#issuecomment-441464434 Maybe move to PyArrow directly then? https://arrow.apache.org/docs/python/filesystems.html
[GitHub] Fokko commented on issue #4233: [AIRFLOW-3391] Upgrade pendulum. to latest major version.
Fokko commented on issue #4233: [AIRFLOW-3391] Upgrade pendulum. to latest major version. URL: https://github.com/apache/incubator-airflow/pull/4233#issuecomment-441464347 There seems to be an issue with Travis, I've restarted the build.
[GitHub] Fokko commented on a change in pull request #4236: [AIRFLOW-3395] added the REST API endpoints to the doc
Fokko commented on a change in pull request #4236: [AIRFLOW-3395] added the REST API endpoints to the doc URL: https://github.com/apache/incubator-airflow/pull/4236#discussion_r236085824 ## File path: docs/api.rst ## @@ -24,10 +24,73 @@ available at /api/experimental/. Please note that we expect the endpoint definit Endpoints - -This is a place holder until the swagger definitions are active +.. http:post:: /api/experimental/dags/<DAG_ID>/dag_runs + + Creates a dag_run for a given dag id (POST). + + + **Trigger DAG with config, example:** + + .. code-block:: bash + +curl -X POST \ + http://localhost:8080/api/experimental/dags/<DAG_ID>/dag_runs \ + -H 'Cache-Control: no-cache' \ + -H 'Content-Type: application/json' \ + -d '{"conf":"{\"key\":\"value\"}"}' + + +.. http:get:: /api/experimental/dags/<DAG_ID>/dag_runs + + Returns a list of Dag Runs for a specific DAG ID. + +.. http:get:: /api/experimental/dags/<DAG_ID>/dag_runs/<execution_date> + + Returns a JSON with a dag_run's public instance variables. The format for the <execution_date> is expected to be "YYYY-mm-DDTHH:MM:SS", for example: "2016-11-16T11:34:15". + + +.. http:get:: /api/experimental/test + + Checks that the REST API server is working correctly. Returns status 'OK'. + + +.. http:get:: /api/experimental/dags/<DAG_ID>/tasks/<TASK_ID> + + Returns info for a task (GET). Review comment: Same here.
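One detail worth calling out in the curl example above: the `conf` value is a JSON-encoded *string* nested inside the JSON body, not a nested object, which trips up many users. A minimal Python sketch of building that body (the endpoint URL and dag_id in the comment are illustrative):

```python
import json


def build_trigger_body(conf: dict) -> str:
    """Build the request body for POST /api/experimental/dags/<dag_id>/dag_runs.

    Note the double encoding: 'conf' is itself a JSON string inside the
    JSON body, matching the curl example's '{"conf":"{\\"key\\":\\"value\\"}"}'.
    """
    return json.dumps({"conf": json.dumps(conf)})


body = build_trigger_body({"key": "value"})
# The body would then be sent with an HTTP client, e.g.:
#   requests.post("http://localhost:8080/api/experimental/dags/<dag_id>/dag_runs",
#                 data=body, headers={"Content-Type": "application/json"})
```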
[GitHub] Fokko commented on a change in pull request #4236: [AIRFLOW-3395] added the REST API endpoints to the doc
Fokko commented on a change in pull request #4236: [AIRFLOW-3395] added the REST API endpoints to the doc URL: https://github.com/apache/incubator-airflow/pull/4236#discussion_r236085805 ## File path: docs/api.rst ## @@ -24,10 +24,73 @@ available at /api/experimental/. Please note that we expect the endpoint definit Endpoints - -This is a place holder until the swagger definitions are active +.. http:post:: /api/experimental/dags/<DAG_ID>/dag_runs + + Creates a dag_run for a given dag id (POST). Review comment: Isn't the (POST) a bit redundant?
[GitHub] Fokko commented on issue #4236: [AIRFLOW-3395] added the REST API endpoints to the doc
Fokko commented on issue #4236: [AIRFLOW-3395] added the REST API endpoints to the doc URL: https://github.com/apache/incubator-airflow/pull/4236#issuecomment-441464074 Nice one @xnuinside Looks great. I've restarted the CI.
[GitHub] xnuinside commented on issue #4236: [AIRFLOW-3395] added the REST API endpoints to the doc
xnuinside commented on issue #4236: [AIRFLOW-3395] added the REST API endpoints to the doc URL: https://github.com/apache/incubator-airflow/pull/4236#issuecomment-441460902 @ashb, @kaxil, @feng-tao, please review
[jira] [Commented] (AIRFLOW-3395) add to documentation all existed REST API endpoints and example how to pass dag_runs params
[ https://issues.apache.org/jira/browse/AIRFLOW-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698240#comment-16698240 ] ASF GitHub Bot commented on AIRFLOW-3395: - xnuinside opened a new pull request #4236: [AIRFLOW-3395] added the REST API endpoints to the doc URL: https://github.com/apache/incubator-airflow/pull/4236 Make sure you have checked _all_ steps below. ### Jira - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-3395\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-3395 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue. ### Description - [ ] Here are some details about my PR, including screenshots of any UI changes: I don't see any existing task about starting to use swagger or something similar, so I added a temporary doc describing the existing endpoints. I also tried the flask autodoc extension (https://sphinxcontrib-httpdomain.readthedocs.io/en/stable/#module-sphinxcontrib.autohttp.flask) but ran into an issue with how the flask app is defined; I believe some refactoring would be needed to use it, so I used the plain httpdomain sphinx extension instead. Resulting doc preview: https://user-images.githubusercontent.com/15959809/48982716-3addc180-f0f7-11e8-8113-7667873ba110.png https://user-images.githubusercontent.com/15959809/48982717-3b765800-f0f7-11e8-8550-c7afe7612cb6.png ### Tests - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: ### Commits - [ ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)": 1. Subject is separated from body by a blank line 1. Subject is limited to 50 characters (not including Jira issue reference) 1. Subject does not end with a period 1. Subject uses the imperative mood ("add", not "adding") 1. Body wraps at 72 characters 1. Body explains "what" and "why", not "how" ### Documentation - [ ] In case of new functionality, my PR adds documentation that describes how to use it. - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added. ### Code Quality - [ ] Passes `flake8` > add to documentation all existed REST API endpoints and example how to pass > dag_runs params > --- > > Key: AIRFLOW-3395 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3395 > Project: Apache Airflow > Issue Type: Task >Affects Versions: 1.10.2 >Reporter: Iuliia Volkova >Assignee: Iuliia Volkova >Priority: Minor > > The doc lists only 2 endpoints: https://airflow.apache.org/api.html#endpoints > The source code > (https://github.com/apache/incubator-airflow/blob/v1-10-stable/airflow/www_rbac/api/experimental/endpoints.py) > has more, which leads users to think no other methods exist - I got several > questions about this on a work project and also saw related questions on > Stack Overflow: > https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run > > I want to add more information about the REST API
[GitHub] bolkedebruin commented on a change in pull request #4233: [AIRFLOW-3391] Upgrade pendulum. to latest major version.
bolkedebruin commented on a change in pull request #4233: [AIRFLOW-3391] Upgrade pendulum. to latest major version. URL: https://github.com/apache/incubator-airflow/pull/4233#discussion_r236078997 ## File path: tests/models.py ## @@ -230,9 +230,9 @@ def test_dag_topological_sort(self): self.assertEquals(tuple(), dag.topological_sort()) def test_dag_naive_default_args_start_date(self): -dag = DAG('DAG', default_args={'start_date': datetime.datetime(2018, 1, 1)}) +dag = DAG('DAG', default_args={'start_date': timezone.datetime(2018, 1, 1)}) Review comment: This is not naive?
[GitHub] bolkedebruin commented on a change in pull request #4233: [AIRFLOW-3391] Upgrade pendulum. to latest major version.
bolkedebruin commented on a change in pull request #4233: [AIRFLOW-3391] Upgrade pendulum. to latest major version. URL: https://github.com/apache/incubator-airflow/pull/4233#discussion_r236079009 ## File path: tests/models.py ## @@ -230,9 +230,9 @@ def test_dag_topological_sort(self): self.assertEquals(tuple(), dag.topological_sort()) def test_dag_naive_default_args_start_date(self): -dag = DAG('DAG', default_args={'start_date': datetime.datetime(2018, 1, 1)}) +dag = DAG('DAG', default_args={'start_date': timezone.datetime(2018, 1, 1)}) self.assertEqual(dag.timezone, settings.TIMEZONE) -dag = DAG('DAG', start_date=datetime.datetime(2018, 1, 1)) +dag = DAG('DAG', start_date=timezone.datetime(2018, 1, 1)) Review comment: Idem
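For context on the review comments: `timezone.datetime(...)` returns a timezone-aware value, so a test named `test_dag_naive_default_args_start_date` would no longer exercise a naive start date. The naive/aware distinction, shown with the standard library rather than Airflow's `timezone` helper:

```python
import datetime

# A naive datetime carries no tzinfo; an aware one does.
naive = datetime.datetime(2018, 1, 1)
aware = datetime.datetime(2018, 1, 1, tzinfo=datetime.timezone.utc)

assert naive.tzinfo is None       # naive: no timezone attached
assert aware.tzinfo is not None   # aware: timezone attached
# Note: comparing a naive and an aware datetime raises TypeError,
# which is why mixing them in DAG start dates is dangerous.
```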
[jira] [Work started] (AIRFLOW-3395) add to documentation all existed REST API endpoints and example how to pass dag_runs params
[ https://issues.apache.org/jira/browse/AIRFLOW-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-3395 started by Iuliia Volkova. --- > add to documentation all existed REST API endpoints and example how to pass > dag_runs params > --- > > Key: AIRFLOW-3395 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3395 > Project: Apache Airflow > Issue Type: Task >Affects Versions: 1.10.2 >Reporter: Iuliia Volkova >Assignee: Iuliia Volkova >Priority: Minor > > In doc exist only 2 endpoints https://airflow.apache.org/api.html#endpoints > In source code > (https://github.com/apache/incubator-airflow/blob/v1-10-stable/airflow/www_rbac/api/experimental/endpoints.py) > we have more, caused issues when users think that there are not more methods > - I got on work project several questions about it and also saw in > stackoverflow relative questions: > https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run > > I want to add more information about REST API -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (AIRFLOW-3395) add to documentation all existed REST API endpoints and example how to pass dag_runs params
[ https://issues.apache.org/jira/browse/AIRFLOW-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Iuliia Volkova updated AIRFLOW-3395: Description: The doc lists only 2 endpoints: https://airflow.apache.org/api.html#endpoints The source code (https://github.com/apache/incubator-airflow/blob/v1-10-stable/airflow/www_rbac/api/experimental/endpoints.py) has more, which leads users to think no other methods exist - I got several questions about this on a work project and also saw related questions on Stack Overflow: https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run I want to add more information about the REST API was: In doc exist only 2 endpoints https://airflow.apache.org/api.html#endpoints In source code we have several more, caused issues when users think that there are not more methods - I got on work project several questions about it and also saw in stackoverflow relative questions: https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run I want to add more information about REST API > add to documentation all existed REST API endpoints and example how to pass > dag_runs params > --- > > Key: AIRFLOW-3395 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3395 > Project: Apache Airflow > Issue Type: Task >Affects Versions: 1.10.2 >Reporter: Iuliia Volkova >Assignee: Iuliia Volkova >Priority: Minor > > The doc lists only 2 endpoints: https://airflow.apache.org/api.html#endpoints > The source code > (https://github.com/apache/incubator-airflow/blob/v1-10-stable/airflow/www_rbac/api/experimental/endpoints.py) > has more, which leads users to think no other methods exist - I got several > questions about this on a work project and also saw related questions on > Stack Overflow: > https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run > > I want to add more information about the REST API
[jira] [Work stopped] (AIRFLOW-3395) add to documentation all existed REST API endpoints and example how to pass dag_runs params
[ https://issues.apache.org/jira/browse/AIRFLOW-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-3395 stopped by Iuliia Volkova. --- > add to documentation all existed REST API endpoints and example how to pass > dag_runs params > --- > > Key: AIRFLOW-3395 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3395 > Project: Apache Airflow > Issue Type: Task >Affects Versions: 1.10.2 >Reporter: Iuliia Volkova >Assignee: Iuliia Volkova >Priority: Minor > > In doc exist only 2 endpoints https://airflow.apache.org/api.html#endpoints > In source code we have several more, caused issues when users think that > there are not more methods - I got on work project several questions about it > and also saw in stackoverflow relative questions: > https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run > > I want to add more information about REST API -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (AIRFLOW-3395) add to documentation all existed REST API endpoints and example how to pass dag_runs params
Iuliia Volkova created AIRFLOW-3395: --- Summary: add to documentation all existed REST API endpoints and example how to pass dag_runs params Key: AIRFLOW-3395 URL: https://issues.apache.org/jira/browse/AIRFLOW-3395 Project: Apache Airflow Issue Type: Task Affects Versions: 1.10.2 Reporter: Iuliia Volkova Assignee: Iuliia Volkova The doc lists only 2 endpoints: https://airflow.apache.org/api.html#endpoints The source code has several more, which leads users to think no other methods exist - I got several questions about this on a work project and also saw related questions on Stack Overflow: https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run I want to add more information about the REST API
[jira] [Work started] (AIRFLOW-3395) add to documentation all existed REST API endpoints and example how to pass dag_runs params
[ https://issues.apache.org/jira/browse/AIRFLOW-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on AIRFLOW-3395 started by Iuliia Volkova. --- > add to documentation all existed REST API endpoints and example how to pass > dag_runs params > --- > > Key: AIRFLOW-3395 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3395 > Project: Apache Airflow > Issue Type: Task >Affects Versions: 1.10.2 >Reporter: Iuliia Volkova >Assignee: Iuliia Volkova >Priority: Minor > > In doc exist only 2 endpoints https://airflow.apache.org/api.html#endpoints > In source code we have several more, caused issues when users think that > there are not more methods - I got on work project several questions about it > and also saw in stackoverflow relative questions: > https://stackoverflow.com/questions/50121593/pass-parameters-to-airflow-experimental-rest-api-when-creating-dag-run > > I want to add more information about REST API -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible
ultrabug commented on issue #2460: [AIRFLOW-1424] make the next execution date of DAGs visible URL: https://github.com/apache/incubator-airflow/pull/2460#issuecomment-441446264 > For instance if there are too many active DAG runs already for that DAG it will return None: do we want None displayed in the UI in that case or not. I would be fine with it, yes (user opinion). > Need to check if creating an instance of SchedulerJob will end up creating a row in the Job table or not? No it won't, that's why I added a new option and called it `dry_run` so it's clear that it will not touch the database. @ashb do you need anything else from me?
[GitHub] xnuinside commented on issue #3546: AIRFLOW-2664: Support filtering dag runs by id prefix in API.
xnuinside commented on issue #3546: AIRFLOW-2664: Support filtering dag runs by id prefix in API. URL: https://github.com/apache/incubator-airflow/pull/3546#issuecomment-441446114 @rcorre , tests failed, ``` [error] 0.62% tests.www_rbac.api.experimental.test_dag_runs_endpoint.TestDagRunsEndpoint.test_get_dag_runs_success_with_run_id__like_parameter: 8.7352s - Traceback (most recent call last): tests/www_rbac/api/experimental/test_dag_runs_endpoint.py line 90 in test_get_dag_runs_success_with_run_id__like_parameter execution_date=datetime.datetime.fromtimestamp('1539097214'), ``` do you plan to fix the issue?
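The quoted traceback fails because `datetime.fromtimestamp` expects a numeric POSIX timestamp, not a string. A minimal reproduction (the timestamp value is taken from the traceback):

```python
import datetime

# fromtimestamp() needs an int/float number of epoch seconds.
ts = 1539097214
dt = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)  # works

# The failing pattern from the test above: passing the same value
# as a string raises TypeError.
# datetime.datetime.fromtimestamp('1539097214')  ->  TypeError
```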
[jira] [Commented] (AIRFLOW-987) `airflow kerberos` ignores --keytab and --principal arguments
[ https://issues.apache.org/jira/browse/AIRFLOW-987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698199#comment-16698199 ] Iuliia Volkova commented on AIRFLOW-987: [~pratap20], you set yourself as Assignee; do you plan to fix the issue and open a PR? > `airflow kerberos` ignores --keytab and --principal arguments > - > > Key: AIRFLOW-987 > URL: https://issues.apache.org/jira/browse/AIRFLOW-987 > Project: Apache Airflow > Issue Type: Bug > Components: security >Affects Versions: 1.8.0 > Environment: 1.8-rc5 >Reporter: Ruslan Dautkhanov >Assignee: Pratap20 >Priority: Major > Labels: easyfix, kerberos, security > > No matter which arguments I pass to `airflow kerberos`, > it always executes as `kinit -r 3600m -k -t airflow.keytab -c > /tmp/airflow_krb5_ccache airflow` > So it fails with the expected "kinit: Keytab contains no suitable keys for > airf...@corp.some.com while getting initial credentials" > Tried different arguments, -kt and --keytab; here's one of the runs (some > lines wrapped for readability): > {noformat} > $ airflow kerberos -kt /home/rdautkha/.keytab rdautkha...@corp.some.com > [2017-03-14 23:50:11,523] {__init__.py:57} INFO - Using executor LocalExecutor > [2017-03-14 23:50:12,069] {kerberos.py:43} INFO - Reinitting kerberos from > keytab: > kinit -r 3600m -k -t airflow.keytab -c /tmp/airflow_krb5_ccache airflow > [2017-03-14 23:50:12,080] {kerberos.py:55} ERROR - > Couldn't reinit from keytab! `kinit' exited with 1. > kinit: Keytab contains no suitable keys for airf...@corp.some.com > while getting initial credentials > {noformat} > 1.8-rc5
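The likely shape of a fix is to build the kinit command from the CLI arguments and fall back to configured defaults only when they are absent. This is a hypothetical sketch, not Airflow's actual kerberos module; the paths and principal are illustrative (the configured defaults mirror the command shown in the log):

```python
def build_kinit_cmd(cli_keytab=None, cli_principal=None,
                    cfg_keytab="airflow.keytab", cfg_principal="airflow"):
    """Build the kinit command, letting CLI values override config defaults."""
    keytab = cli_keytab or cfg_keytab          # CLI --keytab wins if provided
    principal = cli_principal or cfg_principal  # CLI --principal wins if provided
    return ["kinit", "-r", "3600m", "-k", "-t", keytab,
            "-c", "/tmp/airflow_krb5_ccache", principal]
```

With the bug described in the ticket, the command is always built as if the CLI arguments were absent, so kinit only ever sees the configured `airflow.keytab`/`airflow` pair.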
[GitHub] bolkedebruin commented on issue #3560: [AIRFLOW-2697] Drop snakebite in favour of hdfs3
bolkedebruin commented on issue #3560: [AIRFLOW-2697] Drop snakebite in favour of hdfs3 URL: https://github.com/apache/incubator-airflow/pull/3560#issuecomment-441444001 I would love to have this moving forward! Can we get this rebased, so we can start getting this in 2.0?
[jira] [Commented] (AIRFLOW-2715) Dataflow template operator does not support region parameter
[ https://issues.apache.org/jira/browse/AIRFLOW-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698192#comment-16698192 ] ASF GitHub Bot commented on AIRFLOW-2715: - bolkedebruin closed pull request #4139: [AIRFLOW-2715] Pick up the region setting while launching Dataflow templates URL: https://github.com/apache/incubator-airflow/pull/4139 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/contrib/hooks/gcp_dataflow_hook.py b/airflow/contrib/hooks/gcp_dataflow_hook.py index 4fdb07c74d..0eee769d61 100644 --- a/airflow/contrib/hooks/gcp_dataflow_hook.py +++ b/airflow/contrib/hooks/gcp_dataflow_hook.py @@ -220,6 +220,7 @@ def label_formatter(labels_dict): def start_template_dataflow(self, job_name, variables, parameters, dataflow_template, append_job_name=True): +variables = self._set_variables(variables) name = self._build_dataflow_job_name(job_name, append_job_name) self._start_template_dataflow( name, variables, parameters, dataflow_template) @@ -278,8 +279,9 @@ def _start_template_dataflow(self, name, variables, parameters, "parameters": parameters, "environment": environment} service = self.get_conn() -request = service.projects().templates().launch( +request = service.projects().locations().templates().launch( projectId=variables['project'], +location=variables['region'], gcsPath=dataflow_template, body=body ) diff --git a/airflow/contrib/operators/dataflow_operator.py b/airflow/contrib/operators/dataflow_operator.py index 5378735f94..2a9020f2e2 100644 --- a/airflow/contrib/operators/dataflow_operator.py +++ b/airflow/contrib/operators/dataflow_operator.py @@ -187,6 +187,7 @@ class DataflowTemplateOperator(BaseOperator): default_args = { 'dataflow_default_options': { 'project': 
'my-gcp-project', + 'region': 'europe-west1', 'zone': 'europe-west1-d', 'tempLocation': 'gs://my-staging-bucket/staging/', } diff --git a/tests/contrib/hooks/test_gcp_dataflow_hook.py b/tests/contrib/hooks/test_gcp_dataflow_hook.py index 653993a7de..57932c47f6 100644 --- a/tests/contrib/hooks/test_gcp_dataflow_hook.py +++ b/tests/contrib/hooks/test_gcp_dataflow_hook.py @@ -255,8 +255,10 @@ def test_start_template_dataflow(self, internal_dataflow_mock): self.dataflow_hook.start_template_dataflow( job_name=JOB_NAME, variables=DATAFLOW_OPTIONS_TEMPLATE, parameters=PARAMETERS, dataflow_template=TEMPLATE) +options_with_region = {'region': 'us-central1'} +options_with_region.update(DATAFLOW_OPTIONS_TEMPLATE) internal_dataflow_mock.assert_called_once_with( -mock.ANY, DATAFLOW_OPTIONS_TEMPLATE, PARAMETERS, TEMPLATE) +mock.ANY, options_with_region, PARAMETERS, TEMPLATE) class DataFlowJobTest(unittest.TestCase): This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Dataflow template operator dosenot support region parameter > --- > > Key: AIRFLOW-2715 > URL: https://issues.apache.org/jira/browse/AIRFLOW-2715 > Project: Apache Airflow > Issue Type: Improvement > Components: operators >Affects Versions: 1.9.0 >Reporter: Mohammed Tameem >Priority: Critical > Fix For: 2.0.0 > > > The DataflowTemplateOperator uses dataflow.projects.templates.launch which > has a region parameter but only supports execution of the dataflow job in the > us-central1 region. Alternatively there is another api, > dataflow.projects.locations.templates.launch which supports execution of the > template in all regional endpoints provided by google cloud. 
> It would be great if, > # The base REST API of this operator could be changed from > "dataflow.projects.templates.launch" to > "dataflow.projects.locations.templates.launch" > # A templated region parameter was included in the operator to run the > dataflow job in the requested regional endpoint.
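The updated test in the diff relies on dict-merge ordering so that a user-supplied `region` overrides the `us-central1` default. A minimal sketch of that merge pattern (the function name is illustrative, not the hook's actual `_set_variables`):

```python
def set_region_default(variables, default_region="us-central1"):
    """Return variables with a 'region' default applied.

    The default is written first, then overwritten by update() if the
    caller already supplied a region, so user settings always win.
    """
    merged = {"region": default_region}
    merged.update(variables)
    return merged
```

This matches the test expectation: options without a region gain `us-central1`, while an explicit `region` such as `europe-west1` passes through untouched.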
[GitHub] bolkedebruin commented on issue #3584: [AIRFLOW-249] Refactor the SLA mechanism
bolkedebruin commented on issue #3584: [AIRFLOW-249] Refactor the SLA mechanism URL: https://github.com/apache/incubator-airflow/pull/3584#issuecomment-441443847 Can you rebase and fixup the commits? I will then review.
[GitHub] bolkedebruin commented on a change in pull request #4156: [AIRFLOW-3314] Changed auto inlets feature to work as described
bolkedebruin commented on a change in pull request #4156: [AIRFLOW-3314] Changed auto inlets feature to work as described URL: https://github.com/apache/incubator-airflow/pull/4156#discussion_r236074966 ## File path: airflow/lineage/__init__.py ## @@ -110,26 +114,31 @@ def wrapper(self, context, *args, **kwargs): for i in inlets] self.inlets.extend(inlets) -if self._inlets['auto']: -# dont append twice -task_ids = set(self._inlets['task_ids']).symmetric_difference( -self.upstream_task_ids -) -inlets = self.xcom_pull(context, -task_ids=task_ids, -dag_id=self.dag_id, -key=PIPELINE_OUTLETS) -inlets = [item for sublist in inlets if sublist for item in sublist] -inlets = [DataSet.map_type(i['typeName'])(data=i['attributes']) - for i in inlets] -self.inlets.extend(inlets) - -if len(self._inlets['datasets']) > 0: -self.inlets.extend(self._inlets['datasets']) +if self._inlets["auto"]: +visited_task_ids = set(self._inlets["task_ids"]) # prevent double counting of outlets +stack = {self.task_id} +while stack: +task_id = stack.pop() +task = self._dag.task_dict[task_id] +visited_task_ids.add(task_id) +inlets = self.xcom_pull( Review comment: I suggest getting all upstream tasks and using a topological sort to have the right order.
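The reviewer's suggestion — collect all upstream tasks and order them topologically instead of walking a set-based DFS — can be sketched without any Airflow internals. Here `upstream` is a hypothetical mapping from task id to its direct upstream task ids (in Airflow this would come from each operator's `upstream_task_ids`); the function is Kahn's algorithm restricted to the tasks reachable upstream of the target:

```python
from collections import deque

def upstream_in_topological_order(upstream, target):
    """Return every transitive upstream task id of `target`, ordered so
    that a task always appears after all of its own upstream tasks."""
    # 1. collect the transitive upstream set of `target`
    seen, stack = set(), [target]
    while stack:
        for u in upstream.get(stack.pop(), ()):
            if u not in seen:
                seen.add(u)
                stack.append(u)
    # 2. in-degrees and reverse edges within that set
    indeg = {t: sum(1 for u in upstream.get(t, ()) if u in seen)
             for t in seen}
    downstream = {}
    for t in seen:
        for u in upstream.get(t, ()):
            if u in seen:
                downstream.setdefault(u, []).append(t)
    # 3. repeatedly emit tasks whose upstreams have all been emitted
    ready = deque(t for t in seen if indeg[t] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in downstream.get(u, ()):
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order

# a -> b -> c -> d, plus a shortcut edge a -> c
deps = {'b': ['a'], 'c': ['b', 'a'], 'd': ['c']}
print(upstream_in_topological_order(deps, 'd'))  # ['a', 'b', 'c']
```

Pulling XCom outlets in this order guarantees that a task's inlets are assembled only after every producer upstream of it has been visited, which is exactly the ordering property the ad-hoc stack traversal lacks.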
[GitHub] bolkedebruin commented on issue #4083: [AIRFLOW-3211] Reattach to GCP Dataproc jobs upon Airflow restart
bolkedebruin commented on issue #4083: [AIRFLOW-3211] Reattach to GCP Dataproc jobs upon Airflow restart URL: https://github.com/apache/incubator-airflow/pull/4083#issuecomment-441443624 Any updates guys?
[GitHub] bolkedebruin commented on issue #4057: [AIRFLOW-3216] HiveServer2Hook need a password with LDAP authentication
bolkedebruin commented on issue #4057: [AIRFLOW-3216] HiveServer2Hook need a password with LDAP authentication URL: https://github.com/apache/incubator-airflow/pull/4057#issuecomment-441443541 Any update @jongyoul?
[GitHub] bolkedebruin commented on issue #4101: [AIRFLOW-3272] Add base grpc hook
bolkedebruin commented on issue #4101: [AIRFLOW-3272] Add base grpc hook URL: https://github.com/apache/incubator-airflow/pull/4101#issuecomment-441443485
> > Would it be possible to add some tests to this PR as well?
>
> In progress. I would lean to merge this first since the operator and some other functionality would rely on it. Local integration has been done

That doesn't make sense to me? Please add the tests before we merge.
[GitHub] XD-DENG commented on issue #4223: [AIRFLOW-XXX] Remove unnecessary usage of "# noqa" in airflow/bin/cli.py
XD-DENG commented on issue #4223: [AIRFLOW-XXX] Remove unnecessary usage of "# noqa" in airflow/bin/cli.py URL: https://github.com/apache/incubator-airflow/pull/4223#issuecomment-441443442 Hi @bolkedebruin: updated the commit message as requested. Sorry for having missed it and thanks.
[GitHub] bolkedebruin closed pull request #4139: [AIRFLOW-2715] Pick up the region setting while launching Dataflow templates
bolkedebruin closed pull request #4139: [AIRFLOW-2715] Pick up the region setting while launching Dataflow templates URL: https://github.com/apache/incubator-airflow/pull/4139 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/airflow/contrib/hooks/gcp_dataflow_hook.py b/airflow/contrib/hooks/gcp_dataflow_hook.py index 4fdb07c74d..0eee769d61 100644 --- a/airflow/contrib/hooks/gcp_dataflow_hook.py +++ b/airflow/contrib/hooks/gcp_dataflow_hook.py @@ -220,6 +220,7 @@ def label_formatter(labels_dict): def start_template_dataflow(self, job_name, variables, parameters, dataflow_template, append_job_name=True): +variables = self._set_variables(variables) name = self._build_dataflow_job_name(job_name, append_job_name) self._start_template_dataflow( name, variables, parameters, dataflow_template) @@ -278,8 +279,9 @@ def _start_template_dataflow(self, name, variables, parameters, "parameters": parameters, "environment": environment} service = self.get_conn() -request = service.projects().templates().launch( +request = service.projects().locations().templates().launch( projectId=variables['project'], +location=variables['region'], gcsPath=dataflow_template, body=body ) diff --git a/airflow/contrib/operators/dataflow_operator.py b/airflow/contrib/operators/dataflow_operator.py index 5378735f94..2a9020f2e2 100644 --- a/airflow/contrib/operators/dataflow_operator.py +++ b/airflow/contrib/operators/dataflow_operator.py @@ -187,6 +187,7 @@ class DataflowTemplateOperator(BaseOperator): default_args = { 'dataflow_default_options': { 'project': 'my-gcp-project', + 'region': 'europe-west1', 'zone': 'europe-west1-d', 'tempLocation': 'gs://my-staging-bucket/staging/', } diff --git a/tests/contrib/hooks/test_gcp_dataflow_hook.py 
b/tests/contrib/hooks/test_gcp_dataflow_hook.py index 653993a7de..57932c47f6 100644 --- a/tests/contrib/hooks/test_gcp_dataflow_hook.py +++ b/tests/contrib/hooks/test_gcp_dataflow_hook.py @@ -255,8 +255,10 @@ def test_start_template_dataflow(self, internal_dataflow_mock): self.dataflow_hook.start_template_dataflow( job_name=JOB_NAME, variables=DATAFLOW_OPTIONS_TEMPLATE, parameters=PARAMETERS, dataflow_template=TEMPLATE) +options_with_region = {'region': 'us-central1'} +options_with_region.update(DATAFLOW_OPTIONS_TEMPLATE) internal_dataflow_mock.assert_called_once_with( -mock.ANY, DATAFLOW_OPTIONS_TEMPLATE, PARAMETERS, TEMPLATE) +mock.ANY, options_with_region, PARAMETERS, TEMPLATE) class DataFlowJobTest(unittest.TestCase):
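The test change above depends on dict-merge ordering: the default region is placed first and the user's options are applied on top with `update()`, so an explicit 'region' key always wins. A small sketch of that defaulting (the `set_default_region` name and module-level constant are invented here for illustration; the patch routes variables through the hook's `_set_variables`):

```python
DEFAULT_DATAFLOW_LOCATION = 'us-central1'  # assumed default, mirroring the test

def set_default_region(variables):
    """Defaults first, caller's options second: update() lets an
    explicit 'region' key override the default."""
    merged = {'region': DEFAULT_DATAFLOW_LOCATION}
    merged.update(variables)
    return merged

print(set_default_region({'project': 'my-gcp-project'})['region'])  # us-central1
print(set_default_region({'region': 'europe-west1'})['region'])     # europe-west1
```

Reversing the two steps — starting from the user's options and calling `update({'region': ...})` — would silently clobber a user-supplied region, which is why the test builds `options_with_region` in exactly this order.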
[GitHub] bolkedebruin commented on issue #4224: [AIRFLOW-XXX] Make sql param as required in BigQueryOperator
bolkedebruin commented on issue #4224: [AIRFLOW-XXX] Make sql param as required in BigQueryOperator URL: https://github.com/apache/incubator-airflow/pull/4224#issuecomment-441443113 @kaxil please use a JIRA for this, it doesn't qualify as a doc change ;-)
[GitHub] bolkedebruin commented on issue #4223: [AIRFLOW-XXX] Remove unnecessary usage of "# noqa" in airflow/bin/cli.py
bolkedebruin commented on issue #4223: [AIRFLOW-XXX] Remove unnecessary usage of "# noqa" in airflow/bin/cli.py URL: https://github.com/apache/incubator-airflow/pull/4223#issuecomment-441443174 Can you fix your commit message? Then we can merge.
[GitHub] bolkedebruin commented on issue #4227: [AIRFLOW-3384] Allow higher versions of Sqlalchemy and Jinja2
bolkedebruin commented on issue #4227: [AIRFLOW-3384] Allow higher versions of Sqlalchemy and Jinja2 URL: https://github.com/apache/incubator-airflow/pull/4227#issuecomment-441443053 @jlricon can you please execute this on your own Travis? Opening and closing just for the sake of building is costing us money. If Travis fails, please check why it fails and fix cases accordingly. Obviously have your PR squashed and cleaned up as well, that makes integration a lot cleaner. Thanks!
[jira] [Commented] (AIRFLOW-3384) Allow higher versions of sqlalchemy and jinja
[ https://issues.apache.org/jira/browse/AIRFLOW-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698181#comment-16698181 ] ASF GitHub Bot commented on AIRFLOW-3384: - jlricon opened a new pull request #4227: [AIRFLOW-3384] Allow higher versions of Sqlalchemy and Jinja2 URL: https://github.com/apache/incubator-airflow/pull/4227 Make sure you have checked _all_ steps below. ### Jira - My PR addresses this JIRA issue: [AIRFLOW-3384] ### Description - This PR bumps up the allowed versions of sqlalchemy and jinja2 ### Tests - This PR does not add new functionality to be tested, nor changes the codebase. > Allow higher versions of sqlalchemy and jinja > --------------------------------------------- > > Key: AIRFLOW-3384 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3384 > Project: Apache Airflow > Issue Type: Improvement > Components: dependencies > Reporter: Jose Luis Ricon > Assignee: Jose Luis Ricon > Priority: Major > > At the moment airflow doesn't allow the installation of sqlalchemy version 1.2.11 and jinja2==2.10. Airflow works with both, and there is no reason not to allow higher versions. Projects downstream who are currently forcing the installation of said versions, overriding airflow's dependencies, will benefit from this as it will allow for version-compatible installations without loss in functionality.
[jira] [Commented] (AIRFLOW-3384) Allow higher versions of sqlalchemy and jinja
[ https://issues.apache.org/jira/browse/AIRFLOW-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698180#comment-16698180 ] ASF GitHub Bot commented on AIRFLOW-3384: - jlricon closed pull request #4227: [AIRFLOW-3384] Allow higher versions of Sqlalchemy and Jinja2 URL: https://github.com/apache/incubator-airflow/pull/4227 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/setup.py b/setup.py index e651f5a66e..0c21f7d9ec 100644 --- a/setup.py +++ b/setup.py @@ -299,7 +299,7 @@ def do_setup(): 'dill>=0.2.2, <0.3', 'flask>=0.12.4, <0.13', 'flask-appbuilder==1.12.1', -'flask-admin==1.4.1', +'flask-admin==1.5.2', 'flask-caching>=1.3.3, <1.4.0', 'flask-login>=0.3, <0.5', 'flask-swagger==0.2.13', @@ -310,7 +310,7 @@ def do_setup(): 'gunicorn>=19.4.0, <20.0', 'iso8601>=0.1.12', 'json-merge-patch==0.2', -'jinja2>=2.7.3, <2.9.0', +'jinja2>=2.7.3, <=2.10.0', 'lxml>=4.0.0', 'markdown>=2.5.2, <3.0', 'pandas>=0.17.1, <1.0.0', @@ -322,7 +322,7 @@ def do_setup(): 'python-nvd3==0.15.0', 'requests>=2.5.1, <3', 'setproctitle>=1.1.8, <2', -'sqlalchemy>=1.1.15, <1.2.0', +'sqlalchemy>=1.1.15, <1.3.0', 'tabulate>=0.7.5, <=0.8.2', 'tenacity==4.8.0', 'thrift>=0.9.2', This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
> Allow higher versions of sqlalchemy and jinja > --------------------------------------------- > > Key: AIRFLOW-3384 > URL: https://issues.apache.org/jira/browse/AIRFLOW-3384 > Project: Apache Airflow > Issue Type: Improvement > Components: dependencies > Reporter: Jose Luis Ricon > Assignee: Jose Luis Ricon > Priority: Major > > At the moment airflow doesn't allow the installation of sqlalchemy version 1.2.11 and jinja2==2.10. Airflow works with both, and there is no reason not to allow higher versions. Projects downstream who are currently forcing the installation of said versions, overriding airflow's dependencies, will benefit from this as it will allow for version-compatible installations without loss in functionality.
[GitHub] jlricon opened a new pull request #4227: [AIRFLOW-3384] Allow higher versions of Sqlalchemy and Jinja2
jlricon opened a new pull request #4227: [AIRFLOW-3384] Allow higher versions of Sqlalchemy and Jinja2 URL: https://github.com/apache/incubator-airflow/pull/4227 Make sure you have checked _all_ steps below. ### Jira - My PR addresses this JIRA issue: [AIRFLOW-3384] ### Description - This PR bumps up the allowed versions of sqlalchemy and jinja2 ### Tests - This PR does not add new functionality to be tested, nor changes the codebase.
[GitHub] jlricon closed pull request #4227: [AIRFLOW-3384] Allow higher versions of Sqlalchemy and Jinja2
jlricon closed pull request #4227: [AIRFLOW-3384] Allow higher versions of Sqlalchemy and Jinja2 URL: https://github.com/apache/incubator-airflow/pull/4227 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: diff --git a/setup.py b/setup.py index e651f5a66e..0c21f7d9ec 100644 --- a/setup.py +++ b/setup.py @@ -299,7 +299,7 @@ def do_setup(): 'dill>=0.2.2, <0.3', 'flask>=0.12.4, <0.13', 'flask-appbuilder==1.12.1', -'flask-admin==1.4.1', +'flask-admin==1.5.2', 'flask-caching>=1.3.3, <1.4.0', 'flask-login>=0.3, <0.5', 'flask-swagger==0.2.13', @@ -310,7+310,7 @@ def do_setup(): 'gunicorn>=19.4.0, <20.0', 'iso8601>=0.1.12', 'json-merge-patch==0.2', -'jinja2>=2.7.3, <2.9.0', +'jinja2>=2.7.3, <=2.10.0', 'lxml>=4.0.0', 'markdown>=2.5.2, <3.0', 'pandas>=0.17.1, <1.0.0', @@ -322,7+322,7 @@ def do_setup(): 'python-nvd3==0.15.0', 'requests>=2.5.1, <3', 'setproctitle>=1.1.8, <2', -'sqlalchemy>=1.1.15, <1.2.0', +'sqlalchemy>=1.1.15, <1.3.0', 'tabulate>=0.7.5, <=0.8.2', 'tenacity==4.8.0', 'thrift>=0.9.2',
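The loosened pins in the setup.py diff are the whole fix: the old upper bound `sqlalchemy<1.2.0` rejected 1.2.11, while `<1.3.0` admits it. A toy checker makes the comparison concrete — note this handles plain dotted numeric versions only and is a deliberate simplification of real PEP 440 specifier matching, not what pip does:

```python
def satisfies(version, lower, upper):
    """True if lower <= version < upper, comparing dotted numeric
    versions component-wise (simplified; not full PEP 440)."""
    key = lambda v: tuple(int(p) for p in v.split('.'))
    return key(lower) <= key(version) < key(upper)

print(satisfies('1.2.11', '1.1.15', '1.2.0'))  # False: the old pin blocked it
print(satisfies('1.2.11', '1.1.15', '1.3.0'))  # True: the new pin allows it
```

The tuple comparison also shows why string comparison would be wrong here: '1.2.11' < '1.2.0' lexicographically, but (1, 2, 11) > (1, 2, 0) numerically.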
[jira] [Updated] (AIRFLOW-3394) Cannot redirect output from subprocesses
[ https://issues.apache.org/jira/browse/AIRFLOW-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nimrod Morag updated AIRFLOW-3394:
----------------------------------
Description:
When using PythonOperator, I'm using the subprocess module and redirecting the output to sys.stdout in the parent process.

Since StreamLogWriter doesn't implement fileno(), this causes the task to fail:

[2018-11-25 08:00:56,064] {models.py:1760} ERROR - 'StreamLogWriter' object has no attribute 'fileno'
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/airflow/models.py", line 1659, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/vagrant/pysrc/Airflow/Utils.py", line 65, in execute
    branch = super(MyMultiBranchPythonOperator, self).execute(context)
  File "/usr/local/lib/python3.5/dist-packages/airflow/operators/python_operator.py", line 95, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.5/dist-packages/airflow/operators/python_operator.py", line 100, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/vagrant/pysrc/Airflow/Tasks.py", line 133, in triggerTraining
    run('sudo echo -e "Host bitbucket.org\n\tStrictHostKeyChecking no\n" >> ~/.ssh/config')
  File "/vagrant/pysrc/Airflow/Utils.py", line 23, in run
    if not subprocess.run(args=cmd, stdout=sys.stdout, stderr=sys.stderr, shell=True):
  File "/usr/lib/python3.5/subprocess.py", line 693, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.5/subprocess.py", line 911, in __init__
    errread, errwrite) = self._get_handles(stdin, stdout, stderr)
  File "/usr/lib/python3.5/subprocess.py", line 1404, in _get_handles
    c2pwrite = stdout.fileno()
AttributeError: 'StreamLogWriter' object has no attribute 'fileno'
[2018-11-25 08:00:56,068] {models.py:1791} INFO - Marking task as FAILED.

Fixed locally by adding to the StreamLogWriter class:

def fileno(self):
    return 1

(1 is stdout's file descriptor)

was: (previous revision of the same description, differing only in wiki formatting)

> Cannot redirect output from subprocesses
> ----------------------------------------
>
>                 Key: AIRFLOW-3394
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3394
>
Project: Apache Airflow > Issue Type: Bug > Components: logging >Affects Versions: 1.10.1 > Environment: ubuntu 16.04 > python 3.5.2 >Reporter: Nimrod Morag >Priority: Major > Labels: beginner, easyfix, patch, usability > Original Estimate: 0.25h > Remaining Estimate: 0.25h > > When using PythonOperator, I'm using the subprocess module and redirecting > the output to sys.stdout in the parent process > > since `StreamLogWriter` doesn't implement fileno() this causes the task to > fail: > {{[2018-11-25 08:00:56,064] \{models.py:1760} ERROR - 'StreamLogWriter' > object has no attribute 'fileno' Traceback (most recent call last):}} > {{File "/usr/local/lib/python3.5/dist-packages/airflow/models
[jira] [Created] (AIRFLOW-3394) Cannot redirect output from subprocesses
Nimrod Morag created AIRFLOW-3394:
----------------------------------

             Summary: Cannot redirect output from subprocesses
                 Key: AIRFLOW-3394
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3394
             Project: Apache Airflow
          Issue Type: Bug
          Components: logging
    Affects Versions: 1.10.1
         Environment: ubuntu 16.04
python 3.5.2
            Reporter: Nimrod Morag

When using PythonOperator, I'm using the subprocess module and redirecting the output to sys.stdout in the parent process.

Since StreamLogWriter doesn't implement fileno(), this causes the task to fail:

[2018-11-25 08:00:56,064] {models.py:1760} ERROR - 'StreamLogWriter' object has no attribute 'fileno'
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/airflow/models.py", line 1659, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/vagrant/pysrc/Airflow/Utils.py", line 65, in execute
    branch = super(MyMultiBranchPythonOperator, self).execute(context)
  File "/usr/local/lib/python3.5/dist-packages/airflow/operators/python_operator.py", line 95, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.5/dist-packages/airflow/operators/python_operator.py", line 100, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/vagrant/pysrc/Airflow/Tasks.py", line 133, in triggerTraining
    run('sudo echo -e "Host bitbucket.org\n\tStrictHostKeyChecking no\n" >> ~/.ssh/config')
  File "/vagrant/pysrc/Airflow/Utils.py", line 23, in run
    if not subprocess.run(args=cmd, stdout=sys.stdout, stderr=sys.stderr, shell=True):
  File "/usr/lib/python3.5/subprocess.py", line 693, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.5/subprocess.py", line 911, in __init__
    errread, errwrite) = self._get_handles(stdin, stdout, stderr)
  File "/usr/lib/python3.5/subprocess.py", line 1404, in _get_handles
    c2pwrite = stdout.fileno()
AttributeError: 'StreamLogWriter' object has no attribute 'fileno'
[2018-11-25 08:00:56,068] {models.py:1791} INFO - Marking task as FAILED.
Fixed locally by adding to the StreamLogWriter class:

def fileno(self):
    return 1

(1 is stdout's file descriptor)
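The reporter's local fix hardcodes file descriptor 1; a slightly safer variant delegates to the real stdout. The sketch below is a minimal log-redirecting writer showing where fileno() fits — `LoggingStreamWriter` is a hypothetical stand-in, not Airflow's actual StreamLogWriter:

```python
import sys

class LoggingStreamWriter:
    """Minimal file-like object that forwards complete lines to a
    logger callable (hypothetical stand-in for StreamLogWriter)."""

    def __init__(self, log=print):
        self._log = log
        self._buffer = ''

    def write(self, message):
        # buffer partial writes and emit one log call per full line
        self._buffer += message
        while '\n' in self._buffer:
            line, self._buffer = self._buffer.split('\n', 1)
            self._log(line)

    def flush(self):
        if self._buffer:
            self._log(self._buffer)
            self._buffer = ''

    def fileno(self):
        # subprocess.Popen(stdout=...) calls .fileno(); delegate to the
        # original stdout's descriptor instead of hardcoding 1
        return sys.__stdout__.fileno()
```

Note the trade-off: when a subprocess receives this descriptor it writes straight to the underlying terminal, bypassing the logger entirely; capturing the child's output with `stdout=subprocess.PIPE` and feeding it through `write()` would keep everything in the task log.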