RE: [PROPOSAL] Add streaming support to PartialOperator

2024-10-04 Thread Blain David
We clearly want to position ourselves as the second group. Not that we don't write Python code, in fact we do, but once we see that something could be solved in a generic way and contributed back to Airflow, we will propose that as well. I understand the principle of using hooks, we also do it, when

RE: [DISCUSS] Sensor Improvements With Triggers

2024-09-26 Thread Blain David
Hello Pavan Kumar, I really like your proposal, as it would have facilitated my implementation of the MSGraphSensor, which is actually implemented as a deferrable sensor using a triggerer. You can check here how I did it: https://github.com/apache/airflow/blob/main/airflow/providers/microsoft/azu

RE: [PROPOSAL] Add streaming support to PartialOperator

2024-09-19 Thread Blain David
The MSGraphAsyncOperator already handles multi-page responses internally using the defer mechanism, but the result is a set of indexed XComs, which means you have to expand over them to process them. We sometimes also use jq in combination with BashOperator to merge those multiple JSON responses

RE: [PROPOSAL] Add streaming support to PartialOperator

2024-09-18 Thread Blain David
The example I gave was a simplified version, and also a continuation of another DAG process. The issue I tried to solve in Airflow for this case (we also have other use cases where we ran into the same issue) was reading n users from MSGraph, which were updated and had to be syn

[PROPOSAL] Add streaming support to PartialOperator

2024-09-18 Thread Blain David
Hello, At our company we have DAGs which have to process lots of paged results (e.g. XCom results from REST endpoints or multiple files from FTP downloads). The XCom isn't the issue, as we use a custom provider which allows us to store large data on a Persistent Volume Claim without overloadi

[DISCUSS][AIP-38 Modern Web Application]

2024-06-26 Thread Blain David
Besides the new interface and getting rid of FAB in Airflow 3.0, a cool and handy feature would be the ability to group multiple DAGs so you could organize them by domain or whatever grouping you want to achieve. Okay, you can achieve the same with filtering, and maybe we could use that fe

RE: [PROPOSAL] Adding MSGraphSDK Async Operator to Airflow

2024-03-12 Thread Blain David
with tests, docs etc. is enough. On Mon, Mar 11, 2024 at 6:27 PM Blain David wrote: > > Hello Jarek, > > There is no particular need to add this into a separate provider, I just did > it as I wanted to deploy it myself, it could perfectly reside within the > Microsoft Azu

RE: [PROPOSAL] Adding MSGraphSDK Async Operator to Airflow

2024-03-11 Thread Blain David
under the impression there is something special about the SDK you use or "proprietaredness" (for lack of a better word) - but that seems like yet-another operator, hook, triggerer in `microsoft.azure`. Or am I missing something? J. On Fri, Mar 1, 2024 at 8:57 AM Blain David w

RE: Bad mixing of decorated and classic operators (users shooting themselves in their foot)

2024-03-01 Thread Blain David
It's certainly possible to check from where a Python method is being called using traceback. I do think prohibiting the execute method of an operator from being called manually would be a good idea. I've also come across this in multiple DAGs, and it is ugly and looks like a hack. Maybe in the be

RE: [PROPOSAL] Adding MSGraphSDK Async Operator to Airflow

2024-02-29 Thread Blain David
> My main concern here is who will provide ongoing maintenance for this > provider? > This provider is to handle a service by Microsoft yet Microsoft is not > in the picture here (as far as I can see) > > > On Sat

[PROPOSAL] Adding MSGraphSDK Async Operator to Airflow

2024-01-13 Thread Blain David
Hello everyone, I've already started a discussion about this on the Airflow discussions: https://github.com/apache/airflow/discussions/36315 As we have multiple DAGs interacting with MS Graph API endpoints, and as we want to avoid custom code as much as possible as we have to handle lots of