I think the biggest downsides were already mentioned by Tomek: more dependency management when using apache-beam (plus the possibility of conflicts between the dependencies of Beam and Airflow) and no support for data lineage solutions. Besides, we raise the entry threshold by requiring users to understand both Beam and Airflow concepts. That's why I am also in favor of considering both the generic and the Beam approaches. Maybe we will be able to adapt some concepts from Beam for the generic approach without creating a direct dependency. If no one is against it, I will try to take a closer look at Beam concepts and create an AIP next week.
Kamil

On Mon, Sep 7, 2020 at 3:54 PM Daniel Imberman <daniel.imber...@gmail.com> wrote:
> Ok that’s awesome. I’m also seeing that they have an s3 IO setting
> [https://beam.apache.org/releases/pydoc/2.23.0/apache_beam.io.aws.s3io.html].
> Seems that if it’s just a pip install we could start out with just File
> (I imagine on kubernetes this could even work with volume mounts) and S3,
> and then add more as time goes on? Are there any downsides with us tying
> this into Beam? (e.g. if we want to use a storage system not yet supported
> by beam).
>
> via Newton Mail
> [https://cloudmagic.com/k/d/mailapp?ct=dx&cv=10.0.50&pv=10.15.6&source=email_footer_2]
>
> On Sun, Sep 6, 2020 at 1:24 PM, Tomasz Urbaszek <turbas...@apache.org> wrote:
> I checked it with our Beam team and DirectRunner is supported by
> Python SDK and requires no JVM. That's the main reason I think it's
> worth considering it :) Hard dependency on JVM would be probably a
> no-go for us.
> https://beam.apache.org/documentation/runners/direct/
>
> Tomek
>
> On Sun, Sep 6, 2020 at 9:45 PM Daniel Imberman <daniel.imber...@gmail.com> wrote:
>> Oof ok yeah. I hadn't realized that beam had a hard JVM requirement. I
>> think that initially offering a local or block storage based solution with
>> easy extensions for users is totally in line with airflow philosophy. I
>> think that offering alternative transfer operators in providers is a great
>> idea!
>>
>> On Sun, Sep 6, 2020, 9:07 AM Ash Berlin-Taylor <a...@apache.org> wrote:
>>> No strong opinion - but it seems like generic is the easiest for us to
>>> code (as we have most of it already via hooks?) and adopt (and doesn't
>>> place a hard requirement on Beam/JVM, even if JVM would only be runtime.
>>> Still)
>>>
>>> This is possibly where Airflow has a core TransferOperator, and
>>> providers.apache.beam.operators.BeamTransferOperator?
>>> If the "same" python API could be used for both, and it doesn't
>>> needlessly complicate things.
>>>
>>> -a
>>>
>>> On 6 September 2020 16:20:37 BST, Tomasz Urbaszek <turbas...@apache.org> wrote:
>>>> Thanks, Ash for pointing to https://pypi.org/project/smart-open/ This
>>>> one looks really interesting for blob storages transfer!
>>>>
>>>> As stated in the initial design doc I don't think we should focus on
>>>> best performance but rather on versatility. Currently, we have many
>>>> AtoB operators that do not yield the highest performance but do their
>>>> work and are widely used.
>>>>
>>>> I would say that we should prepare an AIP that will propose two
>>>> approaches: generic vs beam. This will allow us to compare them and
>>>> then we can vote which one is better from the Airflow community
>>>> perspective.
>>>>
>>>> What do you think?
>>>>
>>>> Tomek
>>>>
>>>> On Sun, Sep 6, 2020 at 2:42 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>>>>> For background: in the past I had an S3 to S3 transfer using
>>>>> smartopen (since we wanted to split one giant ~300GB file onto smaller
>>>>> parts) and it took about 10mins, so even "large" uses can work fine in
>>>>> Airflow - no JVM required.
>>>>>
>>>>> -ash
>>>>>
>>>>> On 6 September 2020 12:01:24 BST, Tomasz Urbaszek <turbas...@apache.org> wrote:
>>>>>> I think using direct runner as default with the option to specify
>>>>>> other setup is a win-win. However, there are few doubts I have about
>>>>>> Beam based approach:
>>>>>>
>>>>>> 1. Dependency management. If I do `pip install apache-airflow[gcp]`
>>>>>> will it install `apache-beam[gcp]`? What if there's a version clash
>>>>>> between dependencies?
>>>>>>
>>>>>> 2. The initial approach using `DataSource` concept allowed users to
>>>>>> use it in any operator (not only transfer ones).
>>>>>> In case of relying on Beam we are losing this.
>>>>>>
>>>>>> 3. I'm not a Beam expert but it seems to not support any data lineage
>>>>>> solution?
>>>>>>
>>>>>> On Sun, Sep 6, 2020 at 6:15 AM Daniel Imberman <daniel.imber...@gmail.com> wrote:
>>>>>>> I think there are absolutely use-cases for both. I’m totally fine
>>>>>>> with saying “for small/medium use-cases, we come with an in-house
>>>>>>> system. However for larger cases, you’ll require spark/Flink/S3.” That’s
>>>>>>> totally in line with PLENTY of use-cases. This would be especially cool
>>>>>>> when matched with fast-follow as we could EVEN potentially tie in data
>>>>>>> locality.
>>>>>>>
>>>>>>> On Sat, Sep 5, 2020 at 5:11 PM, Austin Bennett <whatwouldausti...@gmail.com> wrote:
>>>>>>> I believe - for not large data - the direct runner is wholly doable, which
>>>>>>> seems in line with airflow patterns. I have, and have spoken with several
>>>>>>> others that have, been productive with that runner.
>>>>>>>
>>>>>>> For much larger transfers, the generic operator could accept parameters for
>>>>>>> submitting the compute to an actual runner. Though, imagining that
>>>>>>> (needing a runner) would not be the primary use case for such an operator.
>>>>>>>
>>>>>>> On Tue, Sep 1, 2020, 11:52 PM Tomasz Urbaszek <turbas...@apache.org> wrote:
>>>>>>>> Austin, you are right, Beam covers all (and more) important IOs.
>>>>>>>> However, using Apache Beam to design a generic transfer operator
>>>>>>>> requires Airflow users to have additional resources that will be used
>>>>>>>> as a runner (Spark, Flink, etc.). Unless you suggest using
>>>>>>>> DirectRunner?
>>>>>>>>
>>>>>>>> Can you please tell us more how exactly you think we can use Beam for
>>>>>>>> those Airflow transfer operators?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Tomek
>>>>>>>>
>>>>>>>> On Wed, Sep 2, 2020 at 12:37 AM Austin Bennett <whatwouldausti...@gmail.com> wrote:
>>>>>>>>> Are there IOs that would be desired for a generic transfer operator that
>>>>>>>>> don't exist in: https://beam.apache.org/documentation/io/built-in/ <-
>>>>>>>>> there is pretty solid coverage?
>>>>>>>>>
>>>>>>>>> Beam is getting to the point where even python beam can leverage the java
>>>>>>>>> IOs, which increases the range of IOs (and performance).
>>>>>>>>>
>>>>>>>>> On Tue, Sep 1, 2020 at 3:24 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>>>>>>> But I believe those two ideas are separate ones as Tomek explained :)
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 2, 2020 at 12:03 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>>>>>>>> I love the idea of connecting the projects more closely!
>>>>>>>>>>>
>>>>>>>>>>> I've been helping recently as a consultant in improving the Apache Beam
>>>>>>>>>>> build infrastructure (in many parts based on my Airflow experience and
>>>>>>>>>>> Github Actions - even recently they adopted the "cancel" action I developed
>>>>>>>>>>> for Apache Airflow). https://github.com/apache/beam/pull/12729
>>>>>>>>>>>
>>>>>>>>>>> Synergies in Apache projects are cool.
>>>>>>>>>>>
>>>>>>>>>>> J.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Sep 1, 2020 at 11:16 PM Gerard Casas Saez <gcasass...@twitter.com.invalid> wrote:
>>>>>>>>>>>> Agree on keeping those separate, just intervened as I believe it's a great
>>>>>>>>>>>> idea. But let's keep @beam and @spark to a separate thread.
>>>>>>>>>>>>
>>>>>>>>>>>> Gerard Casas Saez
>>>>>>>>>>>> Twitter | Cortex | @casassaez <http://twitter.com/casassaez>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Sep 1, 2020 at 2:14 PM Tomasz Urbaszek <turbas...@apache.org> wrote:
>>>>>>>>>>>>> Daniel is right we have few Apache Beam committers in Polidea so we
>>>>>>>>>>>>> will ask for advice. However, I would be highly in favor of having it
>>>>>>>>>>>>> as Gerard suggested as @beam decorator. This is something we should
>>>>>>>>>>>>> put into another AIP together with the mentioned @spark decorator.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Our proposition of transfer operators was mainly to create something
>>>>>>>>>>>>> Airflow-native that works out of the box and allows us to simplify
>>>>>>>>>>>>> read/write from external sources. Thus, it requires no external
>>>>>>>>>>>>> dependency other than the library to communicate with the API. In the
>>>>>>>>>>>>> case of Beam we need more than that I think.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Additionally, the ideas of Source and Destination play nicely with
>>>>>>>>>>>>> data lineage and may bring more interest to this feature of Airflow.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Tomek
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Sep 1, 2020 at 9:31 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
>>>>>>>>>>>>>> Nice. Just a note here, we will need to make sure that those "Source" and
>>>>>>>>>>>>>> "Destination" are serializable.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Sep 1, 2020, 20:00 Daniel Imberman <daniel.imber...@gmail.com> wrote:
>>>>>>>>>>>>>>> Interesting! Beam also could potentially allow transfers within Dask/any
>>>>>>>>>>>>>>> other system with a java/python SDK? I think @jarek and Polidea do a lot of
>>>>>>>>>>>>>>> work with Beam as well so I’d love their thoughts if this is a good use-case.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Sep 1, 2020 at 11:46 AM, Gerard Casas Saez <gcasass...@twitter.com.invalid> wrote:
>>>>>>>>>>>>>>> I would be highly in favour of having a generic Beam operator. Similar
>>>>>>>>>>>>>>> to @spark_task decorator. Something where you can easily define and wrap a
>>>>>>>>>>>>>>> beam pipeline and convert it to an Airflow operator.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Sep 1, 2020 at 12:44 PM Austin Bennett <whatwouldausti...@gmail.com> wrote:
>>>>>>>>>>>>>>>> Are you guys familiar with Beam <https://beam.apache.org>? Esp. if not
>>>>>>>>>>>>>>>> doing transforms, it might be rather straightforward to rely on the ecosystem
>>>>>>>>>>>>>>>> of connectors in that Apache Project to use as the foundations for a
>>>>>>>>>>>>>>>> generic transfer operator.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Sep 1, 2020 at 11:05 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Sep 1, 2020 at 1:35 PM Kamil Olszewski <kamil.olszew...@polidea.com> wrote:
>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>> since there have been no new comments shared in the POC doc
>>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/1o7Ph7RRNqLWkTbe7xkWjb100eFaK1Apjv27LaqHgNkE/edit>
>>>>>>>>>>>>>>>>>> for a couple of days, then I will proceed with creating an AIP for this
>>>>>>>>>>>>>>>>>> feature, if that is ok with everybody.
>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>> Kamil
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Aug 27, 2020 at 10:50 AM Tomasz Urbaszek <turbas...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>> I like the approach as it introduces another interesting operators'
>>>>>>>>>>>>>>>>>>> interface standardization.
>>>>>>>>>>>>>>>>>>> It would be awesome to hear more opinions :)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>> Tomek
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Aug 19, 2020 at 8:10 PM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>>>>>>>>>>>>>>>>>>>> I like the idea a lot. Similar things have been discussed before but the
>>>>>>>>>>>>>>>>>>>> proposal is I think rather pragmatic and solves a real problem (and it does
>>>>>>>>>>>>>>>>>>>> not seem to be too complex to implement)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> There is some discussion about it already in the document (please chime-in
>>>>>>>>>>>>>>>>>>>> for those interested) but here are a few points why I like it:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - performance and optimization is not a focus for that.
>>>>>>>>>>>>>>>>>>>>   For generic stuff it is usually to write "optimal" solution but once you
>>>>>>>>>>>>>>>>>>>>   admit you are not going to focus for optimisation, you come with simpler
>>>>>>>>>>>>>>>>>>>>   and easier to use solutions
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - on the other hand - it uses very "Python'y" approach with using
>>>>>>>>>>>>>>>>>>>>   Airflow's familiar concepts (connection, transfer) and has the potential of
>>>>>>>>>>>>>>>>>>>>   plugging in into 100s of hooks we have already easily - leveraging all the
>>>>>>>>>>>>>>>>>>>>   "providers" richness of Airflow.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - it aims to be easy to do "quick start" - if you have a number of
>>>>>>>>>>>>>>>>>>>>   different sources/targets and as a data scientist you would like to quickly
>>>>>>>>>>>>>>>>>>>>   start transferring data between them - you can do it easily with only
>>>>>>>>>>>>>>>>>>>>   basic python knowledge and simple DAG structure.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - it should be possible to plug it in into our new functional approach as
>>>>>>>>>>>>>>>>>>>>   well as future lineage discussions as it makes connection between sources
>>>>>>>>>>>>>>>>>>>>   and targets
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - it opens up possibilities of adding simple and flexible data
>>>>>>>>>>>>>>>>>>>>   transformation on-transfer.
>>>>>>>>>>>>>>>>>>>>   Not a replacement for any of the external services that Airflow should use
>>>>>>>>>>>>>>>>>>>>   (Airflow is an orchestrator, not data processing solution) but for the kind
>>>>>>>>>>>>>>>>>>>>   of quick-start scenarios I foresee it might be most useful, being able to
>>>>>>>>>>>>>>>>>>>>   apply simple data transformation on the fly by data scientist might be a
>>>>>>>>>>>>>>>>>>>>   big plus.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Suggestion: Pandas DataFrame as the format of the "data" component
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kamil - you should have access now.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> J.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Tue, Aug 18, 2020 at 6:53 PM Kamil Olszewski <kamil.olszew...@polidea.com> wrote:
>>>>>>>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>>>>>>> in Polidea we have come up with an idea for a generic transfer operator
>>>>>>>>>>>>>>>>>>>>> that would be able to transport data between two destinations of various
>>>>>>>>>>>>>>>>>>>>> types (file, database, storage, etc.) - please find the link with a short
>>>>>>>>>>>>>>>>>>>>> doc with POC
>>>>>>>>>>>>>>>>>>>>> <https://docs.google.com/document/d/1o7Ph7RRNqLWkTbe7xkWjb100eFaK1Apjv27LaqHgNkE/edit?usp=sharing>
>>>>>>>>>>>>>>>>>>>>> where we can discuss the design initially.
>>>>>>>>>>>>>>>>>>>>> Once we come to the initial conclusion I can create an AIP on cWiki - can I
>>>>>>>>>>>>>>>>>>>>> ask for permission to do so (my id is 'kamil.olszewski')? I believe that
>>>>>>>>>>>>>>>>>>>>> during the discussion we should definitely aim for this feature to be
>>>>>>>>>>>>>>>>>>>>> released only after Airflow 2.0 is out.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> What do you think about this idea? Would you find such an operator helpful
>>>>>>>>>>>>>>>>>>>>> in your pipelines? Maybe you already use a similar solution or know
>>>>>>>>>>>>>>>>>>>>> packages that could be used to implement it?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> Kamil Olszewski
>>>>>>>>>>>>>>>>>>>>> Polidea <https://www.polidea.com> | Software Engineer
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> M: +48 503 361 783
>>>>>>>>>>>>>>>>>>>>> E: kamil.olszew...@polidea.com
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Unique Tech
>>>>>>>>>>>>>>>>>>>>> Check out our projects! <https://www.polidea.com/our-work>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> Jarek Potiuk
>>>>>>>>>>>>>>>>>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> M: +48 660 796 129 <+48660796129>
>>>>>>>>>>>>>>>>>>>> [image: Polidea] <https://www.polidea.com/>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Tomasz Urbaszek
>>>>>>>>>>>>> Polidea | Software Engineer
>>>>>>>>>>>>>
>>>>>>>>>>>>> M: +48 505 628 493
>>>>>>>>>>>>> E: tomasz.urbas...@polidea.com
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unique Tech
>>>>>>>>>>>>> Check out our projects!

--
Kamil Olszewski
Polidea <https://www.polidea.com> | Software Engineer

M: +48 503 361 783
E: kamil.olszew...@polidea.com

Unique Tech
Check out our projects! <https://www.polidea.com/our-work>