I feel like for this, we can incorporate the smart sensor we have
implemented at Airbnb that we plan on open sourcing.
The TL;DR is that it works by having the Sensor task run briefly and
materialize some state into the DB, which master sensor tasks then poke for.
The poking can run at custom time intervals.
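
To make the idea concrete, here is a minimal sketch of that flow. It is not the Airbnb implementation: the dict standing in for the DB table and the names `sensor_state`, `sensor_task`, and `master_sensor_pass` are all hypothetical, made up for illustration.

```python
# Hypothetical in-memory stand-in for a table in the metadata DB.
sensor_state = {}


def sensor_task(task_id, poke, poke_interval):
    """Runs briefly: records what to poke for and how often, then exits."""
    sensor_state[task_id] = {
        "poke": poke,                # callable returning True once the condition holds
        "interval": poke_interval,   # custom per-sensor poke interval, in seconds
        "next_poke": 0.0,            # earliest time the master may poke again
        "done": False,
    }


def master_sensor_pass(now):
    """One pass of a master sensor: poke every entry that is due and unfinished."""
    finished = []
    for task_id, row in sensor_state.items():
        if row["done"] or now < row["next_poke"]:
            continue
        if row["poke"]():
            row["done"] = True
            finished.append(task_id)
        else:
            row["next_poke"] = now + row["interval"]
    return finished
```

The sensor task itself holds no worker slot while waiting; only the master pass does any polling, and each entry keeps its own interval.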
The issue came up before they re-licensed it. I believe it is now put to
bed, as the MIT license is Apache compatible.
On Wed, Nov 27, 2019 at 7:38 AM Kamil Breguła
wrote:
> But there is the question, does Apache have additional restrictions on
> this issue?
>
> On Wed, Nov 27, 2019 at 4:30 PM Colin Ingar
Agreed on running before we can crawl. The logical way to do this now is to
group it as one big task with more resources. With respect to affinity on
the same machine, that's basically what it is. I guess this hinges on how well
your solution can handle workloads with different resource requirements.
-1 (binding)
Good points made by Dan. We don't need to have the future plan implemented
completely but it would be nice to see more detailed notes about how this
will play out in the future. We shouldn't walk into a system that causes
more pain in the future. (I can't say for sure that it does, but
if people use either:
>
> try:
> ...
> except:
> ...
>
> Or
>
> try:
> ...
> finally:
> ...
>
> Their code will still run on this type of exit, but in the case of 1) this
> can at least be put down to poor python code and case 2) any code
Task_copy.on_kill() should probably be killing the underlying process, but
I think it's fuzzy where the exception gets thrown. I think the intention
is for the exception to get caught in that same block, so the cleanup can
happen, but this is not the case since it is thrown in the main thread. I
th
Actually, reading the docs, the handler throws it in the main thread. In
that case we should definitely change it to subclass SystemExit, or just
use sys.exit.
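
A small sketch of why subclassing SystemExit matters here. The class name `TaskKilled` is hypothetical, and the direct `raise` stands in for an exception raised by a signal handler in the main thread; this is an illustration, not the Airflow code.

```python
class TaskKilled(SystemExit):
    """Hypothetical exception; as a SystemExit subclass it is not an Exception."""


def guarded_with_except_exception():
    events = []
    try:
        try:
            raise TaskKilled("task killed")  # stands in for the handler's raise
        except Exception:
            events.append("swallowed")       # not reached: SystemExit is not an Exception
        finally:
            events.append("cleanup")         # finally blocks still run
    except TaskKilled:
        events.append("propagated")
    return events


def guarded_with_bare_except():
    events = []
    try:
        try:
            raise TaskKilled("task killed")
        except:  # noqa: E722 (a bare except catches SystemExit too)
            events.append("swallowed")
    except TaskKilled:
        events.append("propagated")          # not reached: the bare except caught it
    return events
```

So operator code using `except Exception` would let the kill through, while a bare `except:` would still swallow it.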
On Wed, Oct 2, 2019 at 12:53 PM Alex Guziel wrote:
> It's been a while since I've looked at this code, but the exc
It's been a while since I've looked at this code, but the exception thrown
there is thrown from a place where it should not be able to be caught by
your operator code, so the issue may be somewhere else.
On Wed, Oct 2, 2019 at 12:41 PM Shaw, Damian P. <
damian.sha...@credit-suisse.com> wrote:
> T
Agree with Bolke here. Not much is going on in the worker, as long as there
aren’t breaking changes.
On Sat, Sep 14, 2019 at 1:24 PM Bolke de Bruin wrote:
> I actually think that it is not that risky (although ymmv). Worker nodes
> are pretty independent from the scheduler/webserver. As long as the
>
Latest one looks great.
On Tue, Aug 20, 2019 at 11:22 AM Aizhamal Nurmamat kyzy
wrote:
> Great job Chris! Love it :) Thank you for your patience and such a big
> contribution!
>
> On Tue, Aug 20, 2019 at 10:45 AM Jarek Potiuk
> wrote:
>
> > All for it :)
> >
> > On Tue, Aug 20, 2019 at 1:08 PM
Congratulations Kevin!
On Tue, Apr 30, 2019 at 10:58 AM Tao Feng wrote:
> Congrats!
>
> On Tue, Apr 30, 2019 at 10:09 AM Daniel Imberman <
> dimberman.opensou...@gmail.com> wrote:
>
> > Congrats Kevin!
> >
> > On Tue, Apr 30, 2019 at 9:09 AM Aizhamal Nurmamat kyzy
> > wrote:
> >
> > > Congratul
flow
> clusters.
>
> On Wed, Apr 10, 2019 at 1:05 PM Alex Guziel
> wrote:
>
> > I'm not a huge fan of having foreign keys. I know Airbnb has and definitely
> > still has problems with DB load. I don't see any real convincing arguments
>
I'm not a huge fan of having foreign keys. I know Airbnb has had, and definitely
still has, problems with DB load. I don't see any real convincing arguments
for how adding referential integrity will improve Airflow meaningfully
(yet).
On Wed, Apr 10, 2019 at 12:38 PM Bas Harenslak <
basharens...@godatad
The sensor-service idea seems to open the door to making sensors a pubsub-type
deal where possible. For example, in Hive, you can keep an in-memory
registry of what partitions to sense for, and tail the audit log to see
when they are populated, instead of polling.
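
Roughly what I have in mind, as a sketch only. The `PartitionRegistry` class and the `ADD_PARTITION <table>/<partition>` line format are assumptions for illustration; a real Hive audit log looks different and would need its own parser.

```python
class PartitionRegistry:
    """In-memory registry of partitions that sensors are waiting on."""

    def __init__(self):
        self._waiters = {}

    def register(self, partition, callback):
        """Record that `callback` should fire once `partition` is populated."""
        self._waiters.setdefault(partition, []).append(callback)

    def handle_audit_line(self, line):
        """Process one tailed log line; return how many waiters were notified."""
        # Assumed (hypothetical) format: "ADD_PARTITION <table>/<partition>"
        parts = line.strip().split(maxsplit=1)
        if len(parts) != 2 or parts[0] != "ADD_PARTITION":
            return 0
        callbacks = self._waiters.pop(parts[1], [])
        for callback in callbacks:
            callback(parts[1])
        return len(callbacks)
```

Each log line is handled once as it arrives, so the cost scales with log volume, not with the number of waiting sensors times the poke frequency.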
On Wed, Mar 6, 2019 at 1:51 PM Alex
Smart sensor seems like a good idea, but I wonder how much performance will
be improved in practice. And of course, one must think about sharding and
such.
I'm not sure how helpful rescheduling sensors is, since it will seemingly add
scheduler and DB load, which are already bottlenecks.
On Wed, M
The scheduler isn't guaranteed to compute them in the order that maximizes
parallelism. You can imagine, in the case where m = n - 1, that it just
computes the m branches in parallel and then has to complete the nth branch
with parallelism 1.
On Tue, Feb 5, 2019 at 7:20 AM soma dhavala wrote:
> Imag