Re: Where to put docs on configuring specific kinds of connections? (Or restructuring the docs the way Django does)

2018-05-21 Thread Taylor Edmiston
Hey Tim - I came to Airflow from the Django world as well and had the same thought that much of the work that's been put into their docs over time could be applied here too. In terms of documentation for large Python projects, perhaps they're the gold standard. Can you give a few examples of

Where to put docs on configuring specific kinds of connections? (Or restructuring the docs the way Django does)

2018-05-21 Thread Tim Swast
Hey folks, I'd like to write some docs on how to create a GCP connection (and leave room for documenting other kinds of connections as well). Currently it seems like there are a couple places such a thing could fit: - https://airflow.incubator.apache.org/configuration.html#connections -

Re: celery problem: cannot override celery_broker_transport_options

2018-05-21 Thread Craig Rodrigues
Bolke, Can you help me with this? You have worked on this code with respect to parsing celery broker options. I cannot figure out how to override the defaults, and wrong values are being passed down into the mysql backend, causing things to fail. This is blocking me from doing further testing

Re: S3keysonsor

2018-05-21 Thread Joe Napolitano
Great, I think we're in agreement on your definition of static. In my own experience, working with S3 keys can be painful if you can't anticipate the key name. I don't think the S3KeySensor will work as it's written. There's another operator that's not in the docs, but can be seen below the

Re: S3keysonsor

2018-05-21 Thread Rajesh C
The sensor allows wildcards (*), and there is also an S3PrefixSensor, which might help in some cases. In one of my DAGs, I have a similar structure. wait_on_s3_source_data = S3KeySensor( task_id='wait_on_s3_source_data',
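To illustrate the wildcard idea mentioned above, here is a minimal sketch of glob-style matching against the sample keys quoted elsewhere in this thread. It uses only Python's standard `fnmatch` module. It assumes that a wildcard key behaves like a Unix glob; the pattern and the third key are hypothetical and chosen only for illustration, and this is not the sensor's own code.

```python
from fnmatch import fnmatchcase

# Sample keys from this thread, with the bucket prefix stripped.
keys = [
    "20180425_111447_data1/_SUCCESS",
    "20180424_111241_data1/_SUCCESS",
    "20180424_111241_data2/_SUCCESS",  # hypothetical key that should not match
]

# Hypothetical pattern: the date and the trailing part are known,
# the time component in the middle is not.
pattern = "20180424_*_data1/_SUCCESS"

# fnmatchcase does case-sensitive glob matching; '*' also matches '/'.
matches = [key for key in keys if fnmatchcase(key, pattern)]
print(matches)
```

A pattern like this only helps when the unknown part of the key sits between parts that can be computed ahead of time, for example from the execution date.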

Re: S3keysonsor

2018-05-21 Thread purna pradeep
+ Joe On Mon, May 21, 2018 at 2:56 PM purna pradeep wrote: > I know it only to some extent. If you look at my sample S3 locations: > > s3a://mybucket/20180425_111447_data1/_SUCCESS > > s3a://mybucket/20180424_111241_data1/_SUCCESS > > > > The only values which are

Re: S3keysonsor

2018-05-21 Thread purna pradeep
I know it only to some extent. If you look at my sample S3 locations: s3a://mybucket/20180425_111447_data1/_SUCCESS s3a://mybucket/20180424_111241_data1/_SUCCESS The only static parts of these locations are s3a://mybucket/ and data1/_SUCCESS. Now I want to configure tolerance for

Re: Dockerised CI and testing environment

2018-05-21 Thread Daniel Imberman
Hi Gerardo, I left some comments on the PR. Could you please get the travis tests to pass + rebase your PR? Afterwards I'd be glad to try it out. On Mon, May 21, 2018 at 6:55 AM Gerardo Curiel wrote: > Hello folks, > > I just submitted a PR for using Docker as part of

Re: S3keysonsor

2018-05-21 Thread Joe Napolitano
Purna, with regard to "this path is not completely static," can you clarify what you mean? Do you mean that you don't know the actual key name beforehand? E.g. pertaining to "111447", "111241", and "111035" in your example? On Mon, May 21, 2018 at 2:23 PM, Brian Greene <

kwargs usage in BaseOperator

2018-05-21 Thread Tao Feng
Hi, I have a question regarding kwargs usage in BaseOperator (https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L2306-L2315). Since this PR (https://github.com/apache/incubator-airflow/pull/1285) was checked in, Airflow turns on the deprecation warning by default. And per the

Re: S3keysonsor

2018-05-21 Thread Brian Greene
I suggest it’ll work for your needs. Sent from a device with less than stellar autocorrect > On May 21, 2018, at 10:16 AM, purna pradeep wrote: > > Hi , > > I’m trying to evaluate airflow to see if it suits my needs. > > Basically i can have below steps in a DAG > >

Re: 答复: Airflow REST API proof of concept.

2018-05-21 Thread Maxime Beauchemin
Personally I think we should keep the architecture as simple as possible and use the same web server for REST and UI. As mentioned FAB (Flask App Builder) manages authentication and RBAC, so we can have consistent access rights in the UI and CLI. Max On Fri, May 11, 2018 at 5:42 AM Luke Diment

Re: celery problem, AttributeError: async

2018-05-21 Thread Bolke de Bruin
We will rebranch 1.10 from master. Sorry, I have been too busy with normal life to be able to follow up on the release of 1.10. B. > On 21 May 2018, at 19:54, Craig Rodrigues wrote: > > Kaxil, > > Thanks for merging this into master. > What is the procedure to get this

Re: Dags getting failed after 24 hours

2018-05-21 Thread Maxime Beauchemin
Even though it's possible to set an `execution_timeout` on any task and/or a `dagrun_timeout` on DAG runs, by default both are set to None (unless you're somehow setting the DAG's default parameters in some other way). Maybe you have some OS-level policies on long-running processes in your

Re: celery problem, AttributeError: async

2018-05-21 Thread Craig Rodrigues
I have submitted: https://github.com/apache/incubator-airflow/pull/3388 -- Craig On Mon, May 21, 2018 at 7:00 AM Naik Kaxil wrote: > Thanks. Please do that. > > On 21/05/2018, 14:59, "Craig Rodrigues" wrote: > > celery 4.1.1 was just released last

Re: celery problem, AttributeError: async

2018-05-21 Thread Naik Kaxil
Thanks. Please do that. On 21/05/2018, 14:59, "Craig Rodrigues" wrote: celery 4.1.1 was just released last night which has all the async problems fixed: https://github.com/celery/celery/commits/v4.1.1 I'll test this out, and then submit a PR to bump

Re: celery problem, AttributeError: async

2018-05-21 Thread Craig Rodrigues
celery 4.1.1 was just released last night which has all the async problems fixed: https://github.com/celery/celery/commits/v4.1.1 I'll test this out, and then submit a PR to bump airflow's celery version to 4.1.1 -- Craig On 2018/05/21 07:20:50, Craig Rodrigues wrote: >

Dockerised CI and testing environment

2018-05-21 Thread Gerardo Curiel
Hello folks, I just submitted a PR for using Docker as part of Airflow's build pipeline: https://github.com/apache/incubator-airflow/pull/3393 Currently, running unit tests is a difficult process. Airflow tests depend on many external services and other custom setup, which makes it hard for

Dags getting failed after 24 hours

2018-05-21 Thread ramandumcs
Hi All, We have a long-running DAG which is expected to take around 48 hours. But we are observing that it gets killed by the Airflow scheduler after ~24 hrs. We are not setting any DAG/task execution timeout explicitly. Is there any default timeout value that gets used? We are using LocalExecutor

celery problem: cannot override celery_broker_transport_options

2018-05-21 Thread Craig Rodrigues
Hi, I used this requirements.txt file to install airflow from the v1-10-test branch: git+https://github.com/celery/celery@master#egg=celery git+https://github.com/apache/incubator-airflow@v1-10-test#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3] kombu>=4.1.0

celery problem, AttributeError: async

2018-05-21 Thread Craig Rodrigues
Hi, I used a requirements.txt file with these three lines: git+https://github.com/apache/incubator-airflow@v1-10-test#egg=apache-airflow[celery,crypto,emr,hive,hdfs,ldap,mysql,postgres,redis,slack,s3] celery>=4.2.0rc3 kombu>=4.2.0 I ran `pip install -r requirements.txt`. When I started my