Re: [ACTION REQUIRED] Removal of v3 artifact actions on December 5th

2024-11-26 Thread Maxime Beauchemin
v4 ok? https://github.com/apache/superset/blob/master/.github/workflows/superset-e2e.yml#L139 On Mon, 25 Nov 2024 at 16:15, Jarek Potiuk wrote: > Yeah. I took a bit deeper look and found out even more in "airflow-site" > > > https://github.com/apache/airflow-site/blob/0d5111a7f0896f6c08839a6395f

Re: [VOTE] Release Apache Airflow 2.0.0 from 2.0.0rc3

2020-12-16 Thread Maxime Beauchemin
+1 binding !!! On Wed, Dec 16, 2020 at 2:42 PM Tomasz Urbaszek wrote: > +1 binding, tested it with kubernetes based deployments using > CeleryExecutor as well as locally > > Tomek > > On Wed, Dec 16, 2020 at 9:42 PM Ry Walker wrote: > >> +1 (non-binding) >> >> On Wed, D

Re: [VOTE] Enable Github Discussions on Apache Airflow Github Repo

2020-09-25 Thread Maxime Beauchemin
+1 (binding) On Tue, Sep 22, 2020 at 12:40 PM Kaxil Naik wrote: > @tomek - Yup we could do that > > On Tue, Sep 22, 2020 at 8:35 PM Tomasz Urbaszek > wrote: > >> +1 binding >> >> Should we change "Ask a question or get support" issue type to point >> to GH discussion? >> >> T. >> >> >> On Tue,

Re: [DISCUSS] Removing Pickling from Airflow 2.0

2020-09-18 Thread Maxime Beauchemin
I'm getting bad flashbacks of fighting with pickles early on in the history of the project. I've learned since then to stay away. Almost all solutions that involve pickles are bad solutions. Beyond but related to the security implication are the issues of pickle entanglement, not really knowing wha

Re: Discuss: should we allow HTML emails on dev@ list

2020-09-14 Thread Maxime Beauchemin
+1 how about +1000² to "let's make html email a default thing in all the lists" On Mon, Sep 14, 2020 at 12:14 PM Felix Uellendall wrote: > +1 > > On Mon, Sep 14, 2020 at 20:14, Kamil Breguła > wrote: > > > +1 (binding) > > > > On Mon, Sep 14, 2020, 19:52 Tao Feng wrote: > > > >> +1 > >> > >>

Re: Consider using stale bot for issues

2020-09-13 Thread Maxime Beauchemin
About labeling on Github, I wrote a bot for Superset that enables anyone to apply labels through commenting using emojis. It does other things like auto-labeling organization on an allow-list. We could teach it new tricks too. Generalizing it or porting it to work in other communities should be v

Re: Import style in Airflow codebase

2020-09-09 Thread Maxime Beauchemin
+1 On Wed, Sep 9, 2020 at 10:36 AM Vikram Koka wrote: > +1 on absolute import. Honestly, a huge fan of doing it as an absolute vs. > relative. > > On Wed, Sep 9, 2020 at 4:09 AM Kaxil Naik wrote: > > > No strong opinion but absolute import seems better from a user's > > perspective. > > > > On

Re: [VOTE] AIP-34 TaskGroup: A UI task grouping concept as an alternative to SubDagOperator

2020-08-26 Thread Maxime Beauchemin
+1 (binding) On Wed, Aug 26, 2020 at 12:34 AM Kevin Yang wrote: > +1 (binding) > > On Tue, Aug 25, 2020 at 2:47 PM Kamil Breguła > wrote: > > > +1 (binding) > > > > On Tue, Aug 25, 2020, 23:35 Daniel Imberman > > wrote: > > > > > +1 binding > > > > > > via Newton Mail > > > [ > > > > > > https

Re: Move "Who uses Airflow?" (to Ecosystem? or new "users" page)?

2020-08-25 Thread Maxime Beauchemin
INTHEWILD.md is as close to a standard as it gets for this. I'd advocate to move the list there and have a link to it in the README. Here's the Superset one for reference: https://github.com/apache/incubator-superset/blob/master/INTHEWILD.md Personally I like a short clean README with links to key

Re: [Discuss] Improving AIP process documentation

2020-08-24 Thread Maxime Beauchemin
For reference, here's SIP-0 (Superset Improvement Proposal 0) which defines the SIP process for Apache Superset: https://github.com/apache/incubator-superset/issues/5602 We also have a GitHub issue template for it: https://github.com/apache/incubator-superset/blob/master/.github/ISSUE_TEMPLATE/sip

Re: Faster builds on CI + increased stability + easier to reproduce CI problems

2020-08-24 Thread Maxime Beauchemin
Great work! Investments in CI pay dividends to the whole community. On Sat, Aug 22, 2020 at 8:12 AM Jarek Potiuk wrote: > Hello everyone, > > Just wanted to let you know that we merged last week quite an overhaul of > the CI architecture we have in Github Actions. > > TL;DR; It should be faster,

Re: [PROPOSAL][AIP-36 DAG Versioning]

2020-07-30 Thread Maxime Beauchemin
show what > > > > > was run. > > > > > > > > > > Example Dag v1: Task A -> Task B -> Task C > > > > > The worker has completed the execution of Task B and is just about > to > > > > > complete the

Re: [PROPOSAL][AIP-36 DAG Versioning]

2020-07-28 Thread Maxime Beauchemin
ks with lines across blocks when necessary. > > > Agreed, the plan is to do the best effort aligning. > At this point in time, task additions to the end of the DAG are expected to > be compatible, > but changes to task structure within the DAG may cause the tree view not to > inco

Re: [PROPOSAL][AIP-36 DAG Versioning]

2020-07-27 Thread Maxime Beauchemin
Some notes and ideas: *DAG Fingerprinting: *this can be tricky, especially in regards to dynamic DAGs, where in some cases each parsing of the DAG can result in a different fingerprint. I think DAG and tasks attributes are left out from the proposal that should be considered as part of the fingerp
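
The fingerprinting concern above can be illustrated with a small sketch. It is not taken from the AIP or from Airflow: `dag_fingerprint` is a hypothetical helper that assumes a DAG can be reduced to task ids and edges, and it canonicalizes both before hashing so that parse order alone does not change the result. A dynamic DAG that generates different tasks on each parse would still produce a different fingerprint every time, which is the tricky case the message points at.

```python
import hashlib
import json
from typing import Iterable, Tuple


def dag_fingerprint(task_ids: Iterable[str], edges: Iterable[Tuple[str, str]]) -> str:
    """Hypothetical helper: hash a canonical form of a DAG's structure."""
    canonical = json.dumps(
        {
            # Sort both collections so parse order does not affect the hash.
            "tasks": sorted(task_ids),
            "edges": sorted(list(edge) for edge in edges),
        },
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = dag_fingerprint(["a", "b", "c"], [("a", "b"), ("b", "c")])
second = dag_fingerprint(["c", "a", "b"], [("b", "c"), ("a", "b")])
print(first == second)
```

Task attributes (the part of the message that is cut off here) would have to be added to the canonical form for a fingerprint to reflect them.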

Re: [AIP-34] Rewrite SubDagOperator

2020-06-17 Thread Maxime Beauchemin
+1, proposal looks good. The original intention was really to have task groups and a zoom-in/out in the UI. The original reasoning was to reuse the DAG object since it is a group of tasks, but as highlighted here it does create underlying confusion since a DAG is much more than just a group of t

Re: [VOTE] Naming of the transfer operators/Hooks

2020-05-30 Thread Maxime Beauchemin
+1 for [1] XToYOperator On Sat, May 30, 2020 at 7:33 AM Felix Uellendall wrote: > +1 for [1] XToYOperator > > Best Regards, > Felix > > Sent from ProtonMail Mobile > > On Sat, May 30, 2020 at 13:58, Kaxil Naik wrote: > > > +1 for [1] XToYOperator > > > > On Sat, May 30, 2020, 12:56 Tomasz Urbas

Re: What's coming in Airflow 2.0 =- this Wednesday at NYC Online Meetup

2020-05-13 Thread Maxime Beauchemin
I was bummed about not being able to make it live today, but found that the video is available already and was able to watch it just now. https://www.crowdcast.io/e/whats-coming-airflow-2 Amazing work presenters and committers! It's fantastic to see all of this coming together. Max On Mon, May 11,

Re: [VOTE] AIP-15: Support Multiple-Schedulers for HA & Better Scheduling Performance

2020-03-19 Thread Maxime Beauchemin
+1 (binding) Solid work! On Tue, Mar 17, 2020 at 10:43 PM Jarek Potiuk wrote: > +1 (binding) > > On Tue, Mar 17, 2020 at 11:16 PM Kaxil Naik wrote: > > > > +1 (binding) > > > > On Tue, Mar 17, 2020 at 10:06 PM Deng Xiaodong > wrote: > > > > > +1 (binding). > > > > > > Thanks for proceeding th

Re: [VOTE] Switch from using Jira to Github Issues

2020-03-16 Thread Maxime Beauchemin
+1 binding On Mon, Mar 16, 2020, 7:45 AM Jarek Potiuk wrote: > +1 binding. > > Re: contention: I think we can split reviewers by components. I think the > assignment to components was done recently so duplicates will be rare. I am > happy to be one of the reviewers. > > On Mon, Mar 16, 2020 at 3

Re: Airflow growth graphic?

2020-02-26 Thread Maxime Beauchemin
durations (time until first comment on issues, time until first review on PRs, ...) * ... It should be useful for most communities, I'll share the link when I post Max On Wed, Feb 26, 2020 at 12:35 PM Maxime Beauchemin < maximebeauche...@gmail.com> wrote: > > https://star-histo

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-26 Thread Maxime Beauchemin
Hey, I wanted to echo the awesomeness once more, but also bring up the question as to whether any of this work may make it harder to distribute / HA the scheduler down the line (?) I almost started analyzing the code and thought it'd just be easier to ask the authors. Max On Wed, Feb 26, 2020 at

Re: Airflow growth graphic?

2020-02-26 Thread Maxime Beauchemin
https://star-history.t9t.io/#apache/airflow&apache/oozie&spotify/luigi&pinterest/pinball&fishtown-analytics/dbt&dagster-io/dagster&PrefectHQ/prefect [image: Screen Shot 2020-02-26 at 12.34.01 PM.png] On Wed, Feb 26, 2020 at 12:06 PM Alex Tronchin-James 949-412-7220 < alex.n.ja...@gmail.com> wrote:

Re: Big performance optimization of Scheduler - 10x faster , 2000+ fewer queries count

2020-02-25 Thread Maxime Beauchemin
Nice! On Tue, Feb 25, 2020 at 12:11 AM Robin Edwards wrote: > This is brilliant work, thank you! Looking forward to watching my RDS > metrics when this gets deployed :-) > > On Tue, 25 Feb 2020, 07:08 Driesprong, Fokko, > wrote: > > > Sweet work Kamil and others! I'll try to go through them tod

Re: Airflow and Machine Learning

2020-02-19 Thread Maxime Beauchemin
I'd have a lot of thoughts to unpack here, but top of mind is a deeper integration with [jupyter] notebooks and/or hosted notebooks-type systems. Notebooks [with papermill] can be parameterized predictably, and notebook files provide rich log outputs (organize

Re: [DISCUSS] Reduce (remove?) automated imports in Airflow 2.0

2020-02-17 Thread Maxime Beauchemin
+1 On Mon, Feb 17, 2020 at 7:32 AM Daniel Imberman wrote: > +1 on my end! > > via Newton Mail > [ > https://cloudmagic.com/k/d/mailapp?ct=dx&cv=10.0.32&pv=10.14.6&source=email_footer_2 > ] > On Mon, Feb 17, 2020 at 12:30 AM, Driesprong, Fokko > > wrote: > I like this as well. It will hopefully

Re: [DISCUSS] Packaging DAG/operator dependencies in wheels

2019-12-20 Thread Maxime Beauchemin
This reminds me of the "DagFetcher" idea. Basically a new abstraction that can fetch a DAG object from anywhere and run a task. In theory you could extend it to do "zip on s3", "pex on GFS", "docker on artifactory" or whatever makes sense to your organization. In the proposal I wrote about using a

Re: [VOTE] Add Probot Integrations to Airflow Github Repo

2019-12-20 Thread Maxime Beauchemin
+1! I forgot whether I shared this before, but I wrote a probot for Apache Superset that does a few nice things, and we can grow over time to do more things without having to beg Apache Infra for anything (currently only they have admin rights and can register apps on our repos) https://github.com

Re: Scheduler Stuck when using LocalExecutor and KubernetesPodOperator

2019-12-12 Thread Maxime Beauchemin
Friends don't let friends use LocalExecutor in production. LocalExecutor is essentially a subprocess pool running in-process. When I wrote it originally I never thought it would ever be used in production. Celery / CeleryExecutor is more reasonable as Celery is a proper process/thread pool that's c

Re: [DISCUSS] Using shared memory for inter-task communication

2019-11-27 Thread Maxime Beauchemin
If memory is shared across tasks, they are by definition not idempotent, which can be troublesome. What if you have a chain of 3 tasks and the last one failed while operating on the memory that came from task number 2? The whole chain may have to be re-executed, which to me sounds like it's really

Re: [VOTE] Accept new Airflow website contribution

2019-11-21 Thread Maxime Beauchemin
+1 (binding) this is amazing! it's hard to believe that the project gained so much traction despite the website that we had before. On Thu, Nov 21, 2019 at 5:01 PM Deng Xiaodong wrote: > +1 (binding) for the new website. It will be very value-adding. Thanks! > > XD > > On Fri, Nov 22, 2019 at 0

Re: [DISCUSS] Airflow Summits dates and locations

2019-11-21 Thread Maxime Beauchemin
What mailing list should I send this to? Should I assume that they have any context about the summits? On Thu, Nov 21, 2019 at 3:02 PM Aizhamal Nurmamat kyzy wrote: > Updated the proposal with the exact dates and locations [1] > > +Maxime Beauchemin could you please send the > pro

Re: [DISCUSS] Airflow Summits dates and locations

2019-11-20 Thread Maxime Beauchemin
+1, the proposed dates and location seem reasonable to me. My heart would pick Paris but London is more rational :) Max On Wed, Nov 20, 2019 at 3:46 PM Aizhamal Nurmamat kyzy wrote: > Hi all, > > I am sharing with you a list of some industry events that are happening > from April to October [

Re: Closing JIRA Issue for Merged PRs

2019-11-19 Thread Maxime Beauchemin
Quick note as I'm playing with Probot for Superset. It's possible to catch all sorts of Github events and trigger all sorts of side effects with it. On the Superset side we're looking to enable automation around Github comments and labeling. It also offers potential around enabling people (PMs, c

Re: [Discuss] Airflow Summits 2020

2019-11-19 Thread Maxime Beauchemin
add your comments to it. I also added names of volunteers who >> > wants >> > > to help to organize each summit, please add your names and affiliation >> > if I >> > > missed someone. My suggestion is that we don't have more than 2 >> > organ

Re: Donating code to add Common Workflow Language import to Airflow

2019-11-14 Thread Maxime Beauchemin
After all the exploration of this topic here in this thread, I'm a pretty hard -1 on this one. I think CWL and CWL-Airflow are great projects, but they can't rely on the Airflow community to evolve/maintain/package this integration. Personally I think that generally and *within reason* (winking a

Re: Donating code to add Common Workflow Language import to Airflow

2019-11-13 Thread Maxime Beauchemin
The big question is why can't it just be on its own Github repository and in its own PyPI package? Why does it have to be packaged with our PyPI package or live in the Airflow repo? Max On Wed, Nov 13, 2019 at 12:42 PM Andrey Kartashov wrote: > I don't quite get what this example should to prov

Re: Drop Python 3.5 support?

2019-11-12 Thread Maxime Beauchemin
+1 On Tue, Nov 12, 2019, 1:00 PM Bolke de Bruin wrote: > Hi All, > > Can we drop python 3.5 support and switch to 3.6 as a minimum? > > Cheers > Bolke >

[upcoming event] Data Orchestration Summit

2019-11-05 Thread Maxime Beauchemin
Hey, Quick last minute note about a somewhat-Airflow-related event coming up: the Data Orchestration Summit (@ Computer History Museum in Mountain View) this Thursday. I'm scheduled to be on a panel about creating open source projects and co

Re: Donating code to add Common Workflow Language import to Airflow

2019-10-30 Thread Maxime Beauchemin
As someone who has spent a lot of time acting as a maintainer, a code "donation" seems like a dangerous gift to accept. Personally I like the idea of an ecosystem of packages (and repos) managed and maintained by their specialists. That way they can have their own CI, their own release processes and

Re: Announcing SIG-Knative/ The Monthly Knative Executor Meetup/Call-for-contributers

2019-10-17 Thread Maxime Beauchemin
Nice! On Wed, Oct 16, 2019 at 1:19 PM Ash Berlin-Taylor wrote: > We meant October 30th (the date in the calendar is correct) > > -a > > > On 16 Oct 2019, at 18:57, Daniel Imberman > wrote: > > > > Fixing the calendar link :) > > > https://calendar.google.com/event?action=TEMPLATE&tmeid=b2I4Yml2

Re: AIP-7 completed :)

2019-09-18 Thread Maxime Beauchemin
**Enabling the enablers** Great work. Super meta! Think about it: building tools that enable the people building Airflow, that enables the data engineers building pipelines, that enable analysts with data, that enable decision makers with product insights, that enables customers with better produc

Re: Airflow 1.10.5 Python 2.7 compatibility

2019-09-16 Thread Maxime Beauchemin
Fokko, surprised to see you use 3.5. I thought most people went straight to 3.6 as soon as it came out, as there are very few reasons to be on the 3 to 3.5 range anymore. I'd say we should go straight to 3.6+ for Airflow as we deprecate 2.7 https://pythonclock.org/ On Mon, Sep 16, 2019 at 2:17

Re: Setting to add choice of schedule at end or schedule at start of interval

2019-09-05 Thread Maxime Beauchemin
>> day, max 30 days". Then on on-going basis, your daily loads would > be a > > >>> range of 1 day but then if server down for couple days, this could be > > >>> caught up in one task and if you backfill it could be up to 30-day > > >

Re: DAG "Schedule Filter Callback"?

2019-08-30 Thread Maxime Beauchemin
I remember thinking about these issues in the past and thought adding some sort of `should_task_be_skipped` callback as an arg to BaseOperator would be easy and useful. Method should probably just receive a ref to the task instance. By the very nature of interfacing with a method, we cannot guaran
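
A minimal sketch of the callback idea described above, assuming nothing about Airflow's real classes: `ToyOperator` and `TaskInstance` are illustrative stand-ins, and `should_task_be_skipped` is the hypothetical argument named in the message, not an existing `BaseOperator` parameter. As suggested, the callback receives only a reference to the task instance.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass
class TaskInstance:
    """Toy stand-in for a task instance; not Airflow's class."""
    task_id: str
    execution_date: date


class ToyOperator:
    """Toy stand-in for BaseOperator carrying the hypothetical callback."""

    def __init__(
        self,
        task_id: str,
        should_task_be_skipped: Optional[Callable[[TaskInstance], bool]] = None,
    ):
        self.task_id = task_id
        self.should_task_be_skipped = should_task_be_skipped

    def execute(self, ti: TaskInstance) -> str:
        return "ran"

    def run(self, ti: TaskInstance) -> str:
        # Consult the callback first; it only sees the task instance.
        if self.should_task_be_skipped and self.should_task_be_skipped(ti):
            return "skipped"
        return self.execute(ti)


def skip_weekends(ti: TaskInstance) -> bool:
    # date.weekday() returns 5 for Saturday and 6 for Sunday.
    return ti.execution_date.weekday() >= 5


op = ToyOperator("load", should_task_be_skipped=skip_weekends)
print(op.run(TaskInstance("load", date(2019, 8, 31))))
```

Because the callback is arbitrary user code, the framework cannot know in advance which runs will be skipped, which appears to be the caveat the message goes on to raise.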

Re: Setting to add choice of schedule at end or schedule at start of interval

2019-08-27 Thread Maxime Beauchemin
How about an alternative approach that would introduce 2 new keyword arguments that are clear (something like, but maybe better than `period_start_dttm`, `period_end_dttm`) and leave `execution_date` unchanged, but plan its deprecation. As a first step `execution_date` would be inferred from the n

Airflow maintenance blog post

2019-08-21 Thread Maxime Beauchemin
It's common to have these maintenance DAGs, but it's great to have people share theirs and blog about it. Thanks to Robert Sanders at Clairvoyant for this post and code repo https://blog.clairvoyantsoft.com/automated-maintenance-for-apache-airflow-8d844f32737d

Re: [IE] Re: [VOTE] Change the Airflow logo

2019-08-21 Thread Maxime Beauchemin
Option 1 ++ On Wed, Aug 21, 2019 at 9:52 AM Tao Feng wrote: > Option 1 > > On Wed, Aug 21, 2019 at 8:26 AM Leah Cole > wrote: > > > Option 1 > > > > On Wed, Aug 21, 2019 at 7:01 AM Sunil Varma Chiluvuri < > > sunilvarma.chiluv...@equifax.com> wrote: > > > > > Option 1 > > > > > > On Wed, Aug 21

Re: Airflow Dynamic tasks

2019-08-20 Thread Maxime Beauchemin
Bag parsing time: 3.93859595 > > > > Parsing in time of execution, when scheduler submits the DAGs: > > DagBag parsing time: 99.820316 > > > > Delay between the task run inside a single DAG grow from 30 sec to > 10 min, > >

Re: [ANNOUNCE] Please welcome new Airflow committer Chao-han Tsai

2019-08-19 Thread Maxime Beauchemin
Well deserved! Welcome aboard! On Mon, Aug 19, 2019, 9:54 AM Aizhamal Nurmamat kyzy wrote: > Congratulations, Chao-Han! Thank you for your contributions! > > On Fri, Aug 16, 2019 at 7:58 PM Jarek Potiuk > wrote: > > > Congrats Chao-Han! > > > > On Fri, Aug 16, 2019 at 6:32 PM Kevin Yang wrote

Re: Airflow Dynamic tasks

2019-08-15 Thread Maxime Beauchemin
What is your dynamic DAG doing? How long does it take to execute it just as a python script (`time python mydag.py`)? As an Airflow admin, people may want to lower the DAG parsing timeout configuration key to force people to not do crazy things in DAG module scope. At some point at Airbnb we had so
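
The `time python mydag.py` check can also be done from Python. The sketch below uses only the standard library; `time_dag_file` is a hypothetical helper name, and the file path is whatever DAG file is being investigated. It runs the file as a plain script, which exercises everything at module scope, and reports the elapsed wall-clock time.

```python
import runpy
import sys
import time


def time_dag_file(path: str) -> float:
    """Run a DAG file as a plain script and return elapsed seconds."""
    start = time.perf_counter()
    # Executes all module-level code, like `python mydag.py` would.
    runpy.run_path(path, run_name="__main__")
    return time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    print(f"{sys.argv[1]}: {time_dag_file(sys.argv[1]):.3f}s")
```

Unlike the shell `time` command, this measurement excludes interpreter startup, so it will read slightly lower for the same file.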

Re: [DISCUSS] Tweaks to the Airflow logo

2019-08-13 Thread Maxime Beauchemin
+1! (as the "designer" of the original) :) Please make sure to share the vector files in the repo as well Also don't forget to upload here https://www.apache.org/logos/ On Tue, Aug 13, 2019 at 12:12 PM Felix Uellendall wrote: > +1 looks so much cleaner and more modern :) > > Felix > > Am 13/08

Blog post: Upgrading & Scaling Airflow at Robinhood

2019-08-09 Thread Maxime Beauchemin
Thanks to Abhishek Ray @ Robinhood for this great post. I felt like I had to share it here https://robinhood.engineering/upgrading-scaling-airflow-at-robinhood-5b625dfaa2ee Max

Re: Airflow DAG Serialisation

2019-07-26 Thread Maxime Beauchemin
Great to see this happening! On Fri, Jul 26, 2019 at 8:54 AM Jarek Potiuk wrote: > Great! That's definitely one of the most painful aspects of Airflow. Happy > to help/comment/take part in the discussions and later in the > implementation. > > On Fri, Jul 26, 2019 at 4:48 PM Deng Xiaodong wrote

Re: Running Pylint/Flake/Mypy/Docs/Licence checks locally

2019-07-17 Thread Maxime Beauchemin
`pre-commit` is pretty great if someone wants to set it up. https://pre-commit.com/ Here's what our configuration looks like for Superset: https://github.com/apache/incubator-superset/blob/master/.pre-commit-config.yaml It's super fast too as it only executes for the files that have been touched

Re: [2.0 spring cleaning] Remove `dag >> task`?

2019-07-03 Thread Maxime Beauchemin
+1 To me the preferred method is to use the context manager (`with DAG(...) as dag:`). We should make sure all examples align with that method if that's not the case already. On Wed, Jul 3, 2019 at 10:57 AM Kamil Breguła wrote: > This is very confusing. > +1 > > On Wed, Jul 3, 2019 at 7:20 PM C
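
To show why the context-manager style removes the need to pass `dag=` to every task, here is a toy model of the mechanism. `ToyDAG` and `ToyTask` are illustrative classes written for this note and are not Airflow's implementation; the assumption is only that entering the `with` block records a "current DAG" that newly created tasks pick up.

```python
from typing import List, Optional


class ToyDAG:
    """Toy model of a DAG usable as a context manager."""

    _current: Optional["ToyDAG"] = None  # DAG active inside the `with` block

    def __init__(self, dag_id: str):
        self.dag_id = dag_id
        self.task_ids: List[str] = []

    def __enter__(self) -> "ToyDAG":
        # Remember the outer DAG so nested blocks restore it on exit.
        self._previous = ToyDAG._current
        ToyDAG._current = self
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        ToyDAG._current = self._previous


class ToyTask:
    """Toy task that attaches itself to the active DAG, if any."""

    def __init__(self, task_id: str, dag: Optional[ToyDAG] = None):
        self.task_id = task_id
        self.dag = dag or ToyDAG._current
        if self.dag is not None:
            self.dag.task_ids.append(task_id)


with ToyDAG("example") as dag:
    extract = ToyTask("extract")
    load = ToyTask("load")

print(dag.task_ids)
```

In real DAG files the same shape applies: operators instantiated inside `with DAG(...) as dag:` are attached to that DAG without an explicit `dag=` argument.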

Re: [PROPOSE] Introduce and encourage pre-commit hooks framework to Airflow developer workflow

2019-07-02 Thread Maxime Beauchemin
+1 On Mon, Jul 1, 2019 at 11:20 PM Kaxil Naik wrote: > +1 We have been using this on some of our Astronomer repositories as well > and have been happy with it. > > Regards, > Kaxil > > On Tue, Jul 2, 2019, 11:46 Jarek Potiuk wrote: > > > TL;DR: I would like to make a proposal to add (easily ma

Re: Latest Sphinx (2.1.2) causes master to fail

2019-06-24 Thread Maxime Beauchemin
< > > https://poetry.eustace.io/>)? How does this PR compare? > > > > -ash > > > > > > > On 24 Jun 2019, at 22:02, Maxime Beauchemin < > maximebeauche...@gmail.com> > > wrote: > > > > > > This Superset PR may be relevant a

Re: Latest Sphinx (2.1.2) causes master to fail

2019-06-24 Thread Maxime Beauchemin
This Superset PR may be relevant and useful for Airflow too (still WiP): https://github.com/apache/incubator-superset/pull/7762 Basically it allows your project to define a "requirements/" folder with a "requirements.json" as a single source of truth for all deps, including the details of what goe

Re: [VOTE] AIP-16: CLI: Use nested commands instead of flags

2019-06-11 Thread Maxime Beauchemin
+1 On Tue, Jun 11, 2019 at 12:29 PM Bas Harenslak < basharens...@godatadriven.com> wrote: > +1 (binding) > > > On 11 Jun 2019, at 18:59, Tao Feng fengta...@gmail.com>> wrote: > > +1 (binding) > > On Tue, Jun 11, 2019 at 4:15 AM Ash Berlin-Taylor a...@apache.org>> wrote: > > Hi Airflowers, > > T

Re: MySQL mysql_postoperator parameter - To be deprecated in 2.0.0 ?

2019-05-30 Thread Maxime Beauchemin
That allows for the logic to run atomically with the task. I don't think it's super important, but it can be nice to allow for this type of thing. I'm not sure if this applies for mysql specifically, but if you want to bulk load, you may have to drop indexes and/or constraints and recreate them af

Re: Cron schedule with DST-aware timezone

2019-05-13 Thread Maxime Beauchemin
It would be great if people can provide failing unit tests as PR with clear expectations stated out as code. It makes it easier for people to get consensus on expectations and for anyone to jump in and implement a fix. Max On Mon, May 13, 2019 at 12:48 PM David Klosowski wrote: > Damian is corr

Re: [ANNOUNCE] Please welcome new Airflow committer Kevin Yang

2019-05-01 Thread Maxime Beauchemin
Well deserved! Congrats. On Tue, Apr 30, 2019 at 6:48 PM Yingbo Wang wrote: > Congrats Kevin! > > On Tue, Apr 30, 2019 at 6:43 PM Deng Xiaodong wrote: > > > Congrats Kevin! > > > > > > XD > > > > On Wed, May 1, 2019 at 1:09 AM Daniel Imberman < > > dimberman.opensou...@gmail.com> wrote: > > > >

Re: Use -x when cherry-picking to v1-10 branches

2019-04-22 Thread Maxime Beauchemin
Always, always use `-x` when cherry-picking! On Mon, Apr 22, 2019 at 1:28 PM Jarek Potiuk wrote: > Hello Everyone (committers especially). > > I have a proposal to improve slightly the cherry-picking process between > master and v1-10-branches: We could use `-x` flag when cherry-picking. This >

Re: Is `airflow backfill` dysfunctional?

2019-04-16 Thread Maxime Beauchemin
ing :-) > > Cheers, Fokko > > Op za 13 apr. 2019 om 23:26 schreef Maxime Beauchemin < > maximebeauche...@gmail.com>: > > > +1, backfilling, and related "subdag surgeries" are core to a data > > engineer's job, and great tooling around this is super

Re: [VOTE] AIP-6 Apply Pylint to Airflow

2019-04-15 Thread Maxime Beauchemin
pylint and black are super solid, no questions there afaic Max On Mon, Apr 15, 2019 at 11:48 AM m...@maximilianroos.com wrote: > Hi there, > > I haven't been active in the airflow community so this should be weighed > appropriately. > > I'm a core dev of a couple of other libraries (xarray, pa

Re: Is `airflow backfill` dysfunctional?

2019-04-13 Thread Maxime Beauchemin
would've expected this to > work > > (nor was it indicated in the dry run) > > > > I ended up having to do manual recovery work in the database to turn the > > "backfill" runs back into scheduler runs, and then switch to using > `airflow > > cle

Re: [2.0 spring cleaning] Require unique conn_id

2019-04-13 Thread Maxime Beauchemin
People may rely on this feature for [poor man's] load balancing though, I forgot what the exact use case was but used this at Airbnb at some point. Maybe the solution is to make the UI/UX/log output much more clear around this. Making the CLI log more clear should be really easy to do, web server

Re: [2.0 spring cleaning] Deprecate adding Operators and Hooks via plugins?

2019-04-13 Thread Maxime Beauchemin
I'd say it could be great to deprecate the whole plugin system and use Python's "entry points" instead. I just didn't know that was an option and the standard way to do this when I originally wrote it... The current plugin system is a minefield for circular dependencies... On Sat, Apr 13, 2019 at
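
For readers unfamiliar with entry points, the sketch below shows the general discovery pattern using the standard library's `importlib.metadata`. The helper name `load_plugins` and the group name `example.plugins` are assumptions made for illustration; the message does not say which group name Airflow would use. A package advertises objects under a group in its packaging metadata, and the host application loads whatever is installed.

```python
import sys
from importlib import metadata


def load_plugins(group: str) -> dict:
    """Return {name: loaded object} for every installed entry point in `group`."""
    eps = metadata.entry_points()
    if sys.version_info >= (3, 10):
        # Newer Pythons expose a selectable collection.
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep.load() for ep in selected}


# "example.plugins" is a made-up group; with nothing installed under it
# the result is simply empty.
plugins = load_plugins("example.plugins")
print(sorted(plugins))
```

Because each plugin is imported only when `ep.load()` is called, the host package does not need to import plugin modules at its own import time, which is one way this approach sidesteps the circular-dependency problem mentioned above.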

Re: Airflow 1.10.3 has been released!

2019-04-12 Thread Maxime Beauchemin
Infinite kudos! Thanks Ash! On Fri, Apr 12, 2019 at 1:33 PM Kaxil Naik wrote: > Thanks Ash and all the contributors. > > On Fri, Apr 12, 2019, 20:38 Robin Edwards wrote: > > > Yes thanks everyone for their hard work :-) > > > > R > > > > On Fri, 12 Apr 2019 at 18:38, Feng Lu wrote: > > > > > >

Re: [2.0 spring cleaning] Rename and re-icon the refresh button

2019-04-12 Thread Maxime Beauchemin
Sounds to me like this is just a UI improvement, not a deprecation / spring cleanup. A simple PR should ensure this fix gets in the next version. On Fri, Apr 12, 2019 at 8:49 AM James Meickle wrote: > https://issues.apache.org/jira/browse/AIRFLOW-3816 > > To quote: > > There's a "Refresh" button

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-11 Thread Maxime Beauchemin
> variables for start & end datetime of a DAG run. > >> * Other: > >> * Removal of variables should be done in major version and > >> deprecation warnings should be added. > >> > >> So how about the following: > >> - We start by put

Re: [VOTE] AIP-6 Apply Pylint to Airflow

2019-04-11 Thread Maxime Beauchemin
+1 (binding) Also check out black to auto-pep8! https://github.com/ambv/black On Thu, Apr 11, 2019 at 5:12 PM Tao Feng wrote: > +1 > > On Thu, Apr 11, 2019 at 4:27 PM Beau Barker > wrote: > > > +1 non binding > > > > Pylint is extremely strict so your may want to be selective about the > > rul

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-11 Thread Maxime Beauchemin
> - We start by putting deprecation warnings on tables, latest_date, > > end_date and END_DATE and remove them in Airflow 2.0. > > - We add a lineage_enabled config which is false by default and thus > > inlets/outlets aren’t provided, unless set to true. > > - We continue discussion a

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-10 Thread Maxime Beauchemin
Making backwards incompatible changes that require altering the thousands (millions?!) of DAGs in the wild will alienate the community and prevent many from orchestrating an upgrade. Upgrading hundreds of DAGs and Airflow atomically would be hard and dangerous. To mitigate this, changes to the DAG

Re: Deployment guide for airflow on kubernetes

2019-04-09 Thread Maxime Beauchemin
Also looks like there's a Helm chart here. https://github.com/helm/charts/tree/master/stable/airflow I haven't used it personally, but looks good at first glance On Mon, Apr 8, 2019 at 9:43 PM Barni Seetharaman wrote: > Please checkout > https://github.com/GoogleCloudPlatform/airflow-operator >

Re: [VOTE] AIP-10: Multi-layered and multi-stage official Airflow image

2019-04-08 Thread Maxime Beauchemin
+1 binding Max On Sun, Apr 7, 2019 at 5:23 PM Jiajie Zhong wrote: > +1 non binding > > Best wish. > -- Jiajie > > From: Chao-Han Tsai > Sent: Sunday, April 7, 2019 0:45 > To: dev@airflow.apache.org > Subject: Re: [VOTE] AIP-10: Multi-layered and multi-stage off

Re: Potential AIP: Selenium/User testing

2019-04-02 Thread Maxime Beauchemin
Side note: I'd recommend using Cypress for that. We've had a good experience using it on Superset https://www.cypress.io/ Max On Tue, Apr 2, 2019 at 11:23 AM Daniel Imberman (BLOOMBERG/ SAN FRAN) < dimber...@bloomberg.net> wrote: > Hello fellow airflowers! > > I've noticed a few times on the k8s

FAB - New REST API in the works

2019-04-01 Thread Maxime Beauchemin
Hey! I wanted to point out that there's awesome work taking place in FAB around a new REST API provided by the framework, and ways to extend it. Daniel Gaspar (cced) is working on this currently, and looking for input on design / implementation. Check it out and chime in on the PR https://github

Re: [VOTE] Accept AIP-3: Drop support for Python 2

2019-03-24 Thread Maxime Beauchemin
+1 (binding) On Sun, Mar 24, 2019 at 8:13 AM Jiajie Zhong wrote: > +1 (binding) > > could make Airflow easy to maintenance. > > > Best wish. > -- jiajie > > From: Felix Uellendall > Sent: Sunday, March 24, 2019 19:43 > To: dev@airflow.apache.org > Subject: Re: [

Re: Welcome Daniel Imberman as a new committer!

2019-03-19 Thread Maxime Beauchemin
Well deserved, welcome aboard! On Tue, Mar 19, 2019 at 3:56 AM Szymon Przedwojski < szymon.przedwoj...@polidea.com> wrote: > Congrats Daniel! > > Szymon Przedwojski > Polidea | Software Engineer > > M: +48 500 330 790 > E: szymon.przedwoj...@polidea.com > > > On 19 Mar 2019, at 09:39, Bolke de Br

Re: [RFC] Prototype for an Airflow landing page

2019-03-18 Thread Maxime Beauchemin
I'm curious to see the source, do you mind sharing the repo? Side note: I used Gatsby recently and thought it was pretty amazing for building static sites and beyond. Quick nit-picky comments: * logo is too small height:50px (along with margin-top: -10px) looks much better * dark footer needs pad

Re: Multiple Schedulers - "scheduler_lock"

2019-03-17 Thread Maxime Beauchemin
AIP-15 (Support Multiple-Schedulers for HA & Better Scheduling > Performance) > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103092651 > < > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103092651>. > > > > > More inputs fro

Re: Suggestion for AIP improvement

2019-03-14 Thread Maxime Beauchemin
For reference, here's Superset's SIP-0 which defines the template and process for SIPs. https://github.com/apache/incubator-superset/issues/5602 Max On Wed, Mar 13, 2019 at 3:45 PM Bas Harenslak wrote: > Hi all, > > I suggest a new template + guidelines to improve the AIP process. Please > let

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-03-09 Thread Maxime Beauchemin
I want to raise the question of the amount of normalization we want to use here as it seems to be an area that needs more attention. The AIP suggests having DAG blobs, task blobs and edges (call it the fairly-normalized). I also like the idea of all-encompassing (call it very-denormalized) DAG

Re: Multiple Schedulers - "scheduler_lock"

2019-03-02 Thread Maxime Beauchemin
oblem < > >> https://en.wikipedia.org/wiki/Birthday_problem> > >> > >> > >>> On 2 Mar 2019, at 3:39 PM, Tao Feng wrote: > >>> > >>> Does the proposal use master-slave architecture(leader scheduler vs > slave > >>> s

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Maxime Beauchemin
Forgot to mention: the intention was to use the lock, but I never personally got to do the second phase which would consist of skipping the DAG if the lock is on, and expire the lock eventually based on a config setting. Max On Fri, Mar 1, 2019 at 1:57 PM Maxime Beauchemin wrote: > My origi
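
The "second phase" described above (skip a DAG while its lock is held, and let the lock expire after a configured time) can be sketched as follows. This is an in-memory illustration only: `DagLocks`, `try_acquire` and `lock_ttl` are hypothetical names, and a real scheduler would keep this state in the metadata database rather than in a Python dict.

```python
from datetime import datetime, timedelta
from typing import Dict


class DagLocks:
    """In-memory sketch of per-DAG scheduler locks with expiry."""

    def __init__(self, lock_ttl: timedelta):
        # Stand-in for the config setting that controls lock expiry.
        self.lock_ttl = lock_ttl
        self._locked_at: Dict[str, datetime] = {}

    def try_acquire(self, dag_id: str, now: datetime) -> bool:
        locked_at = self._locked_at.get(dag_id)
        if locked_at is not None and now - locked_at < self.lock_ttl:
            # Lock is held and still fresh: the caller should skip this DAG.
            return False
        # Either unlocked or the previous lock has expired: take it.
        self._locked_at[dag_id] = now
        return True

    def release(self, dag_id: str) -> None:
        self._locked_at.pop(dag_id, None)


locks = DagLocks(lock_ttl=timedelta(minutes=5))
start = datetime(2019, 3, 1, 12, 0)
print(locks.try_acquire("my_dag", start))
print(locks.try_acquire("my_dag", start + timedelta(minutes=1)))
```

The expiry is what keeps a crashed scheduler from holding a DAG forever; the check-then-set in `try_acquire` would need to be a single atomic database operation when several schedulers are involved.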

Re: Multiple Schedulers - "scheduler_lock"

2019-03-01 Thread Maxime Beauchemin
My original intention with the lock was preventing "double-triggering" of tasks (triggering refers to the scheduler putting the message in the queue). Airflow now has good "double-firing-prevention" of tasks (firing happens when the worker receives the message and starts the task), even if the sched

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-27 Thread Maxime Beauchemin
> in the end (but then a slightly smaller and serializable version of it). My > preference would be to simplify the DAG object and get rid of the BaseDag > and SimpleDag to simplify the object hierarchy. > > Cheers, Fokko > > On Wed, 27 Feb 2019 at 21:23, Maxime Beauchemin wrote

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-27 Thread Maxime Beauchemin
JinjaTemplate objects are not serializable for some odd obscure > > reason, I think the community can solve that easily, if someone wants a > > full brain dump on this I can share what I know > > What was the preference for using Pickle over Docker/PEX for serialization? > I thin

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-27 Thread Maxime Beauchemin
I'm > hopeful that Airflow develops in the direction of focusing on being a > principled Python framework for managing tasks/data executed in containers, > and the resulting execution state. > > > On Tue, Feb 26, 2019 at 8:55 PM Maxime Beauchemin < > maximebeauche...@gmail.com>

Re: [DISCUSS]: Remove Mesos Executor from Airflow 2.0.0?

2019-02-27 Thread Maxime Beauchemin
> > >> I’m glad yarn wasn’t the only option - it would have meant I’d have > never > >> been in a position to use Airflow! (Many of our workflows don’t touch > >> EMR/Hadoop, and running Celery is much more of a known element to a > python > >> deve

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-26 Thread Maxime Beauchemin
Related thoughts: * on the topic of serialization, let's be clear whether we're talking about unidirectional serialization and *not* deserialization back to the object. This works for making the web server stateless, but isn't a solution around how DAG definitions get shipped around on the cluster

Re: Short Airflow user survey

2019-02-25 Thread Maxime Beauchemin
+1, this is great and we should do it periodically! On Mon, Feb 25, 2019 at 10:42 AM Dan Davydov wrote: > This is very interesting and useful, big thanks for conducting the survey! > > On Mon, Feb 25, 2019 at 12:24 PM Ash Berlin-Taylor wrote: > > > Thanks for all those who answered, there's som

Re: [DISCUSS]: Remove Mesos Executor from Airflow 2.0.0?

2019-02-11 Thread Maxime Beauchemin
From memory, I think MesosExecutor depends on pickling to get DAG definitions to workers, which we should also deprecate. About CeleryExecutor, we never had the intention to make it the recommended option for production early on. The intent back in 2014 was to write a YarnExecutor quickly (that w

Re: AIP-12 Persist DAG into DB

2019-01-31 Thread Maxime Beauchemin
Right, it's been discussed extensively in the past and the main thing needed to get to a "stateless web server" (or at least a DagBag-free web server) is to drop the template rendering in the UI. Also we might need little workarounds (we'd have to dig in to check) around deleting task instances or

Re: [PROPOSAL] Add a landing page for Apache Airflow

2019-01-24 Thread Maxime Beauchemin
+1, also I hear Gatsby is great. I did some research on this a while back but never was able to commit the time. It'd be great to move off readthedocs and onto something more modern-looking, but the RST auto-code-api-documentation magic is valuable and hard to reproduce elsewhere. There's probably

Re: AIP-8 Split Hooks/Operators into Separate Packages

2019-01-10 Thread Maxime Beauchemin
community should agree on a plan. Refactoring the hooks and operators out, a set at a time, seems like a really good start. Max On Thu, Jan 10, 2019 at 8:44 AM Maxime Beauchemin < maximebeauche...@gmail.com> wrote: > That's not what I meant. If I apply what I meant to your exam

Re: AIP-8 Split Hooks/Operators into Separate Packages

2019-01-10 Thread Maxime Beauchemin
> This is very inconvenient. > > > Sent with ProtonMail Secure Email. > > ‐‐‐ Original Message ‐‐‐ > On Wednesday, January 9, 2019 9:29 PM, Maxime Beauchemin < > maximebeauche...@gmail.com> wrote: > > > If there's a strict policy of having a single hook

Re: AIP-8 Split Hooks/Operators into Separate Packages

2019-01-09 Thread Maxime Beauchemin
rators package. There is nothing whatsoever special about the > >>> |airflow.operators| package namespace, and for example Google could > >>> choose to release an airflow-gcp-operators package now and tell people > to > >>> |from gcp.airflow.operators import Some

Re: AIP-8 Split Hooks/Operators into Separate Packages

2019-01-07 Thread Maxime Beauchemin
Something to think about is how data transfer operators like the MysqlToHiveOperator usually rely on 2 hooks. With a package-specific approach that may mean something like `airflow-hive`, `airflow-mysql` and `airflow-mysql-hive` packages, where the `airflow-mysql-hive` package depends on the two
