Re: Airflow and Machine Learning

2020-02-25 Thread James Meickle
> challenges that are too hard to solve (e.g. modifying one line of code in > > your DAG causes the whole/the majority of a docker image to get rebuilt), > > then we probably will need to do something more like Jarek is talking > > about, but it definitely feels like a hack to

Re: Airflow and Machine Learning

2020-02-24 Thread James Meickle
ng those three cases differently, we can do very well for the ML > case. And we could automate it all - we could detect what kind of change > user did locally and act appropriately. > > J. > > > > On Mon, Feb 24, 2020 at 4:54 PM Ash Berlin-Taylor wrote: > > > >

Re: Airflow and Machine Learning

2020-02-24 Thread James Meickle
I really agree with most of what was posted above but particularly love what Evgeny wrote about having a DAG API. As an end user, I would love to be able to provide different implementations of core DAG functionality, similar to how Executor can already be subclassed. Some key behavior points I ei

Re: Internal APIs of Airflow

2020-02-24 Thread James Meickle
I think that trying to *forbid* this is really not Pythonic. A more appropriate way would be to have import paths ("from airflow.internals"), docstrings, and warnings (via the silenceable warnings module) indicating which APIs are "internal" (i.e., subject to change even in patch versions). That is
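The warning-based approach suggested in this message can be sketched with the standard library alone. This is a minimal illustration, not anything Airflow ships: the `InternalAPIWarning` category, the `internal_api` decorator, and the `resolve_dag_path` helper are all hypothetical names invented for the example.

```python
import functools
import warnings


class InternalAPIWarning(UserWarning):
    """Hypothetical warning category for APIs that may change in any release."""


def internal_api(func):
    """Mark a callable as internal: calling it warns, but is never forbidden."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{func.__name__} is an internal API and may change in patch versions",
            InternalAPIWarning,
            stacklevel=2,  # point the warning at the caller, not at this wrapper
        )
        return func(*args, **kwargs)

    return wrapper


@internal_api
def resolve_dag_path(name):
    """Stand-in for some internal helper (hypothetical)."""
    return f"/dags/{name}.py"
```

A caller who accepts the risk can opt out with `warnings.simplefilter("ignore", InternalAPIWarning)`, which is what makes this a convention rather than a prohibition.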

Re: [DISCUSS] Forwarding Slack Threads/Communication to mailing list

2020-01-28 Thread James Meickle
ue, Jan 28, 2020 at 8:40 AM Jarek Potiuk wrote: > On Tue, Jan 28, 2020 at 2:25 PM James Meickle > wrote: > > > Hi all, as an alternative I'd like to suggest a pattern I've seen a few > > orgs use for internal documentation. Conversations can occur in Slack, > but

Re: [DISCUSS] Forwarding Slack Threads/Communication to mailing list

2020-01-28 Thread James Meickle
Hi all, as an alternative I'd like to suggest a pattern I've seen a few orgs use for internal documentation. Conversations can occur in Slack, but if they're "worth archiving", someone can tag in a bot with a slash command. It will grab the conversation and archive it for easier indexing, with a li

Re: [DISCUSS] Using asserts in airflow code

2019-12-03 Thread James Meickle
Asserts are strictly a developer tool, and as an Airflow cluster operator I would _really_ want to know if something happens that "should never happen in reality" since those are the worst class of bugs. I think that almost any case we'd want to assert on should actually be an exception. Even if we
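The difference the message points at can be shown in a few lines. In CPython, `assert` statements are removed when the interpreter runs with `-O`, whereas an explicit `raise` always executes. The function names and the `InvariantViolation` class below are hypothetical, chosen only for the illustration.

```python
def check_with_assert(running_count):
    # Stripped entirely when Python runs with -O, so a bad state passes silently.
    assert running_count >= 0, "running task count went negative"
    return running_count


class InvariantViolation(RuntimeError):
    """Hypothetical exception for states that 'should never happen'."""


def check_with_exception(running_count):
    # Always evaluated, regardless of interpreter flags.
    if running_count < 0:
        raise InvariantViolation(
            f"running task count went negative: {running_count}"
        )
    return running_count
```

With the exception variant, a cluster operator sees the failure in logs and alerting even on an optimized interpreter.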

Re: [DISCUSS] Using shared memory for inter-task communication

2019-11-26 Thread James Meickle
I think this idea is running before we can even crawl. Before it makes any sense to implement this in Airflow, I think it needs three other things: - A reliable, well-designed component for passing data between tasks first (not XCom!); where shared memory is an _implementation_ of data passing - A

Re: KubernetesPodOperator can't use get_logs

2019-11-15 Thread James Meickle
I've also found this to be a huge problem with the current pod operator; it isn't very resilient to even temporary failures in the k8s API, which _will_ occur if doing a live watch of logs. On Fri, Nov 15, 2019 at 10:51 AM Daniel Mateus Pires wrote: > Hi there, > > We use the KubernetesPodOperat

Re: Proposed roadmap for Airflow 2.0

2019-10-21 Thread James Meickle
:38:38 BST, bharath palaksha < > bharath...@gmail.com > > > > > > > wrote: > > > > >Hi, > > > > > > > > > >Is there any discussion thread on adding priority to tasks and > > > > >cost-based > > > > >o

Re: Announcing SIG-Knative/ The Monthly Knative Executor Meetup/Call-for-contributers

2019-10-16 Thread James Meickle
Great stuff. I shared it with my team, and I'll try to make the first one at least. On Wed, Oct 16, 2019 at 1:35 PM Daniel Imberman wrote: > Hello fellow Airflowers! > > As some of you have heard, recently we’ve been looking into knative as a > new executor that will offer both the flexibility o

Re: Proposed roadmap for Airflow 2.0

2019-09-30 Thread James Meickle
for extending it > in later versions in backwards-compatible way (maybe then we should adopt > SemVer officially and follow it). > > J. > > > On Tue, Sep 24, 2019 at 11:52 PM James Meickle > wrote: > > > My question with that is, how often do we want to do maj

Re: Setting to add choice of schedule at end or schedule at start of interval

2019-09-26 Thread James Meickle
need > to > > >> convert existing dags - the default behaviour remains as it is as far > > as I > > >> understand. And this flag is much simpler to understand and reason > about > > >> than arbitrary function and it corresponds to real business cases:

Re: Proposed roadmap for Airflow 2.0

2019-09-24 Thread James Meickle
My question with that is, how often do we want to do major version increments? There are a few API-breaking changes I'd love to see, but whether to propose them for 2.0 depends on what the wait until 3.0 looks like (or whether we'll allow more minor version breakages in the future) On Tue, Sep 24,

Re: [PLEASE PARTICIPATE][AIP-11] UX phase of Airflow website

2019-09-18 Thread James Meickle
I don't like that the first thing on the "About" entry is a big visual timeline with only a few entries. That's taking up the entire above-the-fold area just to list three dates. Further down on the "About" page, I also don't like the switch between a 2x2 text grid and an image+text list. The latter is

Re: Setting to add choice of schedule at end or schedule at start of interval

2019-08-28 Thread James Meickle
Totally agree with Daniel here. I think that if we implement this feature as proposed, it will actively discourage us from implementing a better data-aware feature that would remain invisible to most users while neatly addressing a lot of edge cases that currently require really ugly hacks. I belie

Re: Setting to add choice of schedule at end or schedule at start of interval

2019-08-23 Thread James Meickle
d. And this flag is much simpler to understand and reason about > >> than arbitrary function and it corresponds to real business cases: > >> > >> 1) schedule_at_interval_end = True -> wait for the data to be ready for > >> the > >> interval (current/d

Re: [DISCUSS] Smarter test execution for CI (trivial changes without full tests)

2019-08-23 Thread James Meickle
GitHub recently introduced the idea of "Draft" PRs: https://github.blog/2019-02-14-introducing-draft-pull-requests/ Could we do something similar either with that system or something else? Run a minimal set until it's marked as "ready for testing", and then run a larger suite. On Fri, Aug 23, 201

Re: Setting to add choice of schedule at end or schedule at start of interval

2019-08-23 Thread James Meickle
This is a change to one of Airflow's core concepts, and it would require a lot of work for existing DAGs to cut over to it. Given that, my personal preference would be to allow arbitrary customization rather than just a bit toggle. For example, we could allow passing in a mapping function: given an interval's
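The message is cut off before it says what the mapping function would receive or return, so the following is only one possible reading: a callable that takes an interval's start and end and returns the moment the run should fire. Every name here (`next_run_time`, `schedule_fn`, and the three policy functions) is hypothetical and not part of Airflow.

```python
from datetime import datetime, timedelta


def run_at_interval_end(start, end):
    """Run once the interval has closed."""
    return end


def run_at_interval_start(start, end):
    """Run as the interval opens."""
    return start


def run_after_grace_period(start, end, grace=timedelta(hours=2)):
    """Something a boolean flag cannot express: wait for late-arriving data."""
    return end + grace


def next_run_time(start, end, schedule_fn=run_at_interval_end):
    """Hypothetical scheduler hook: delegate the decision to the mapping function."""
    return schedule_fn(start, end)
```

Under this reading, the two-valued flag discussed in the thread becomes just two of many possible functions, and existing DAGs keep the default.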

Re: [DISCUSS] Tweaks to the Airflow logo

2019-08-20 Thread James Meickle
> red-gold swoosh on Astronomer's icon, which can be seen at https://twitter.com/astronomerio?lang=en). Generally, it's better to

Outage report

2019-08-16 Thread James Meickle
We had an outage last night that was rather complex and difficult to debug. Rather than just writing up the bug, I included what we did for various debug steps. Hope some folks who are also cluster maintainers may find it interesting! https://issues.apache.org/jira/browse/AIRFLOW-5238

Re: [DISCUSS] Tweaks to the Airflow logo

2019-08-15 Thread James Meickle
Hi all, This thread got split somehow. To bring discussion of Daniel's proposal to this thread: Daniel Gruno Wed, Aug 14, 2:33 AM (1 day ago) to dev Hi James, I don't mean to butt in much here, but as the general "guy who fixes various logos around here" (I re-did the current airflow logo as th

Re: [DISCUSS] Tweaks to the Airflow logo

2019-08-13 Thread James Meickle
I'm gonna have to voice my discontent on this one. The previous logo was designed to look like a physical pinwheel, with shading and gradients to add depth. That makes it hard to reproduce, so of course it's worth considering a flatter replacement. This simplification keeps the same overall shape

Re: Removal of "run_duration" and its impact on orphaned tasks

2019-07-31 Thread James Meickle
t a higher default) > > Can you check when those tasks got into "scheduled" and what the time > difference is with "now"? > > B. > > Sent from my iPhone > > > On 31 Jul 2019, at 20:56, James Meickle > wrote: > > > > Ash: > > >

Re: Removal of "run_duration" and its impact on orphaned tasks

2019-07-31 Thread James Meickle
e actual bug, and running the orphan detection more > > often may just be replacing one patch (the run duration) with another one > > (running the orphan detection more than at start up). > > > > -ash > > > > > On 31 Jul 2019, at 16:43, James Meickle .INVALI

Removal of "run_duration" and its impact on orphaned tasks

2019-07-31 Thread James Meickle
In my testing of 1.10.4rc3, I discovered that we were getting hit by a process leak bug (which Ash has since fixed in 1.10.4rc4). This process leak had minimal impact for most users, but was exacerbated in our case by using "run_duration" to restart the scheduler every 10 minutes. To mitigate that

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-18 Thread James Meickle
https://issues.apache.org/jira/browse/AIRFLOW-4593 so that may be part of > the OOMing too. (If this is the cause of your problem then .4 isn't worse > than .3 I don't think?) > > -ash > > > On 18 Jul 2019, at 15:54, James Meickle > wrote: > > > >

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-18 Thread James Meickle
>>> > >>>>> Thanks for all your hard work > >>>>> > >>>>> R > >>>>> > >>>>> On Tue, 16 Jul 2019 at 21:15, Ash Berlin-Taylor < > >> a...@apache.org> wrote: > >>

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-16 Thread James Meickle
+1 (nonbinding) to the release, it fixes a lot of UI issues we've been seeing lately. Though two notes: 1) Tasks were unschedulable until I ran an upgradedb due to the default pool change. 2) I got crash loops because I based our custom logging file off of the previous version's template. The chang

Re: [2.0 spring cleaning] Remove `dag >> task`?

2019-07-03 Thread James Meickle
I didn't even know this was a feature. Seems like it's unnecessarily ambiguous, since you can't tell at a glance whether a variable is a dag or a task. Definitely in favor of removal. On Wed, Jul 3, 2019 at 8:49 AM Ash Berlin-Taylor wrote: > I'm just suggesting removing the `dag >> task` -- `tas

Re: [VOTE] Release Apache Airflow 1.10.4rc1 as 1.10.4

2019-07-01 Thread James Meickle
security context (#5474) [George Miller] > > f6f37f207 [AIRFLOW-3502] Update config template to reflect supporting > different Celery pool implementation (#5477) [Xiaodong] > > ece77b61a [AIRFLOW-3502] Add celery config option for setting "pool" > (#4308) [Gab

Re: [VOTE] AIP-16: CLI: Use nested commands instead of flags

2019-06-11 Thread James Meickle
+1 (non-binding) On Tue, Jun 11, 2019 at 7:15 AM Ash Berlin-Taylor wrote: > Hi Airflowers, > > This email calls for a vote to introduce restructure the CLI to use nested > commands instead of flags. The vote will last for at least 1 week (June > 18th 12:00 BST), and at least three +1 (binding) v

Github repo templates

2019-06-07 Thread James Meickle
https://github.blog/2019-06-06-generate-new-repositories-with-repository-templates/ This could be an interesting offering from Airflow, to go with the docs rework efforts. We could provide a reasonable starting DAG repository with some best practices set up.

Re: Call for fixes for 1.10.4

2019-05-21 Thread James Meickle
I would love to see this in: https://github.com/apache/airflow/pull/4277 though it would be helpful if someone with a functioning local environment scoped it out, since I don't have the time to build one out right this moment. On Sun, May 19, 2019 at 12:27 PM Kaxil Naik wrote: > Hello all, > > P

Re: Call for fixes for 1.10.4

2019-05-20 Thread James Meickle
Hi, This issue is fairly critical IMO, it's already hampered our incident recovery process a few times: https://issues.apache.org/jira/browse/AIRFLOW-4524 I sent in a PR that mitigates some of the issue by implementing a helper function, which we should probably use for any "checkbox" values. How

Re: [DISCUSS] AIP-5: Remote DAG Fetcher

2019-05-08 Thread James Meickle
I left some comments on this, and on AIP-20. I'm glad to see this moving forward as I think it's a foundational building block, even if it won't be so useful immediately :) On Thu, May 2, 2019 at 5:31 AM Ash Berlin-Taylor wrote: > I've left some comments on the wiki page. In short: nice idea, bu

Re: How to achieve Airflow worker fault tolerance?

2019-04-24 Thread James Meickle
Personally I'd find it really valuable if Airflow tasks had an understanding of the difference between a reported executor failure, an implied executor failure via timeout, and an application-level failure code returned by the executor. It's frustrating having to put retries on most tasks to cope w
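The three failure categories named in this message could be modelled as an explicit type, so that retry policy depends on who failed rather than being applied blindly to every task. This is a sketch under stated assumptions: `FailureKind`, `RETRYABLE`, and `should_retry` are hypothetical names, and the policy (retry infrastructure failures, surface application failures) is an illustration rather than Airflow behaviour.

```python
from enum import Enum


class FailureKind(Enum):
    EXECUTOR_REPORTED = "executor reported a failure"
    EXECUTOR_TIMEOUT = "executor went silent; failure implied by timeout"
    APPLICATION = "task code returned a failure"


# Hypothetical policy: infrastructure failures are retried, application failures are not.
RETRYABLE = {FailureKind.EXECUTOR_REPORTED, FailureKind.EXECUTOR_TIMEOUT}


def should_retry(kind, attempts_so_far, max_infra_retries=3):
    """Retry only when the infrastructure, not the task's own code, failed."""
    return kind in RETRYABLE and attempts_so_far < max_infra_retries
```

With a distinction like this, a task would not need a blanket `retries` setting just to survive a lost worker.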

Re: New job: I'm joining Astronomer.io to work on Airflow

2019-04-18 Thread James Meickle
That's great news! Congratulations. On Thu, Apr 18, 2019 at 2:37 PM Ash Berlin-Taylor wrote: > Hi everyone! > > I've hinted at this to a few people in Slack, and it's now official. I'm > super excited to be able to say that I've joined Astronomer as a full time > employee, and a good portion of

Re: Longer term Airflow planning

2019-04-15 Thread James Meickle
> > > > > > accessible to newcomers. For example the splitting up of the > > infamous > > > > > > models.py (a file with well over 6k lines), was quite a pain with > > > > > > circular > > > > > > imports. This is p

Re: [DISCUSS] period_start/period_end instead of execution_date/next_execution_date

2019-04-15 Thread James Meickle
> the > > > user that it's not running because its interval has not yet > completed. > > > Indicate this state visually, perhaps by using some transparency > or another > > > color. > > > > > > 2. Instead of

Re: [2.0 spring cleaning] Deprecate adding Operators and Hooks via plugins?

2019-04-15 Thread James Meickle
> to > > >> > those, > > >> > > but operators/hooks should be fairly static once created. > > >> > > > > >> > > On Fri, Apr 12, 2019 at 8:53 AM Ash Berlin-Taylor > > > > > >> > wrote: > > >> >

[2.0 spring cleaning] Rename and re-icon the refresh button

2019-04-12 Thread James Meickle
https://issues.apache.org/jira/browse/AIRFLOW-3816 To quote: There's a "Refresh" button on the Graph view that instantly refreshes the DAG state. There's also a "Refresh" button that forces the webserver to reload the DAG definition from the filesystem. These two buttons use the same icon and are

Re: [2.0 spring cleaning] Deprecate subdags

2019-04-12 Thread James Meickle
duction and haven't had much problem with > it. > > Can you list the issues you had? > > Regards, > Kaxil > > > On Fri, Apr 12, 2019, 16:16 James Meickle .invalid> > wrote: > > > Given their bad reputation, would it be appropriate to deprecate sub

Re: [2.0 spring cleaning] Deprecate subdags

2019-04-12 Thread James Meickle
ovide a useful > abstraction - but I agree right now they aren't great (I avoid them because > of this) > > I have half thoughts of how it should work, I just need to look at the > code in depth to see if that makes sense. Now 1.10.3 is out I might have a > bit more time to do t

[2.0 spring cleaning] Get SLA improvements merged

2019-04-12 Thread James Meickle
I would really love to get the SLA improvements I'd been working on into Airflow core: https://github.com/apache/airflow/pull/3584 The intended improvements, which have backwards compat for basic SLA usage but not for users with custom callbacks, are: - Extend the current "expected to finish by"

[2.0 spring cleaning] Require unique conn_id

2019-04-12 Thread James Meickle
Airflow fetches connections by name, but doesn't enforce unique names. My team got bit by this, since it's very unexpected behavior for most types of data entry. The reason for this behavior is explained in the docs: "Many connections with the same conn_id can be defined and when that is the case,
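Until uniqueness is enforced, a team could guard against the surprise described here with a simple check over its exported connection names. The sketch below works on a plain list of strings and does not touch Airflow; the function name `duplicate_conn_ids` is hypothetical.

```python
from collections import Counter


def duplicate_conn_ids(conn_ids):
    """Return the conn_id values that appear more than once, sorted."""
    counts = Counter(conn_ids)
    return sorted(cid for cid, n in counts.items() if n > 1)
```

For example, `duplicate_conn_ids(["pg", "s3", "pg"])` reports `pg` as ambiguous, which could fail a deployment check before the duplicate reaches production.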

[2.0 spring cleaning] Deprecate subdags

2019-04-12 Thread James Meickle
I think we should deprecate SubDAGs given the complexity they add and the limited usage and use cases. Or, we should invest effort in redesigning their API and implementation. I think that having to account for subdag-introduced complexity makes Airflow's code much harder to maintain and buggier, l

Re: [2.0 spring cleaning] Deprecate adding Operators and Hooks via plugins?

2019-04-12 Thread James Meickle
YES - I strongly agree with this! I first did it this way because I wanted to follow the instructions, assuming there was some Airflow magic, and later found it really frustrating to maintain. We should be clear that standard Python packaging is the way to go. That being said, what if Airflow had

Longer term Airflow planning

2019-04-10 Thread James Meickle
Hi all, I've been following Airflow development fairly actively for over a year. In that time, the company I work at (Quantopian) has gone all-in on Airflow. It's a core part of our business and required for daily operations. However, I've had some concerns over the future of the project. Part of

Re: [DISCUSS] period_start/period_end instead of execution_date/next_execution_date

2019-04-10 Thread James Meickle
use even after reading the > > scheduling section of the doc and the FAQ, it was still not clear in my > > mind. Btw, I find some ideas exposed by James Meickle in the [DISCUSS] > > AIRFLOW-4192 very interesting and I share his opinion that there's still > > room for im

Re: Difference between Kubernetes Executor vs PodOperator

2019-04-10 Thread James Meickle
a pipeline for building the images that the operator uses. On Wed, Apr 10, 2019 at 2:41 AM Ashwin Sai Shankar wrote: > Thanks, James and Kamil! Please let me know if you have any examples of > setting up Kubernetes Executor and Operator. > > On Tue, Apr 9, 2019 at 8:03 AM James Meickl

Re: Difference between Kubernetes Executor vs PodOperator

2019-04-09 Thread James Meickle
Yes, that summary is correct - the Executor is using Kubernetes to execute all Airflow tasks (each wrapped by a temporary Airflow process), while the PodOperator is using Kubernetes only for that task, to execute one Pod (which likely won't run any Airflow code at all). On Tue, Apr 9, 2019 at 3:17

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-09 Thread James Meickle
I agree with Ash here. The naming of "execution_date" is incredibly confusing to people who are new to Airflow, who think it has something to do with... execution. However, I think that there's still room for improvement with "period_start" and "period_end". Think about manually triggered tasks -

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-08 Thread James Meickle
I'm generally in favor of this idea. Several people on my team have been confused by the different date options and their meaning. For the dates, I think we should switch to providing alternate representations of dates exclusively via Jinja filters. So instead of "next_execution_date", you'd use "

Re: [Discuss] Airflow Kubernetes worker configuration should be parsed from YAML

2019-03-06 Thread James Meickle
I'm in favor of having a YAML-based option for Kubernetes. We've had to internally subclass the Kubernetes operator because it really isn't doing what we need out of the box, for example to intercept the object it creates right before it sends it so that we can patch in missing features. I think it wou

Re: Is `airflow backfill` disfunctional?

2019-03-04 Thread James Meickle
This is an old thread, but I wanted to bump it as I just had a really bad experience using backfill. I'd been hesitant to even try backfills out given what I've read about it, so I've just relied on the UI to "Clear" entire tasks. However, I wanted to give it a shot the "right" way. Issues I ran in

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-27 Thread James Meickle
mage and we lose > centralized control over that logic), I think we need some sort of > lightweight Airflow SDK that works over the REST API. The DAGs, instead of > importing the whole Airflow Python package would only import that SDK, and > the server side implementation of the ca

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-27 Thread James Meickle
On the topic of using Docker, I highly recommend looking at Argo Workflows and some of their sample code: https://github.com/argoproj/argo tl;dr is that it's a workflow management tool where DAGs are expressed as YAML manifests, and tasks are just containers run on Kubernetes. I think that there'

Re: 'Task Instance State' FAILED: Task is in the 'running' state which is not a valid state for execution. The task must be cleared in order to be run.

2019-02-13 Thread James Meickle
In some cases this is a double execute in Celery. Two workers grab the same task, but the first one to update the metadata db to "running" is the only one allowed to run. In our case this leads to confusing, but ultimately not incorrect, behavior: the failed task writes a log file and makes that av
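The "first one to update the metadata db wins" behaviour described here is a compare-and-set on the task's state. The sketch below reproduces the idea with an in-memory SQLite table; the `task_instance` schema, the `try_claim` function, and the `load_data` task are simplified stand-ins invented for the example, not Airflow's actual schema or code.

```python
import sqlite3


def try_claim(conn, task_id):
    """Move a task from 'queued' to 'running'; return True only for the winner."""
    cur = conn.execute(
        "UPDATE task_instance SET state = 'running' "
        "WHERE task_id = ? AND state = 'queued'",
        (task_id,),
    )
    conn.commit()
    # rowcount is 1 for the first caller and 0 for anyone who arrives later.
    return cur.rowcount == 1


def make_demo_db():
    """Build an in-memory table with one queued task (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance (task_id TEXT PRIMARY KEY, state TEXT)"
    )
    conn.execute("INSERT INTO task_instance VALUES ('load_data', 'queued')")
    conn.commit()
    return conn
```

The second worker's update matches zero rows, so it knows it lost the race and must not run the task, which is consistent with the "not a valid state for execution" message in the thread title.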

Requesting AIP access

2019-01-14 Thread James Meickle
Could I get AIP access for "eronarn"? Thank you!