GET /dags/DAG_A/dagRuns/2020-01-01/taskInstances GET /dags/DAG_A/dagRuns/2020-01-01/taskInstances On Tue, May 12, 2020 at 12:10 PM Ash Berlin-Taylor <a...@apache.org> wrote: > > Hi Kamil, > > Thanks for re-opening this discussion! > > My initial complaint about "putting multiple values in a single path > element" is that it feels messy, and confusing, and very unlike any REST > API I've consumed or written before. > > However I've just realised that I may have _wildy_ misunderstood the proposal. > > I (somehow!) thought you were suggesting this: > > GET /dags/dag_id1-dag_id2-dag_id3/dagRuns/2020-01-01T00/taskInstances > > But on reading Google's https://aip.dev/159 that now makes more sense, > and that isn't what you were suggesting, but instaed a single, litteral > `-` to mean "any dag id". Is this correct? >
Yes. In detail, it means that the SQL query will not have a condition. /dag/{dag_Id}/taskInstanace SELECT * FROM task_instance WHERE dag_id= {dag_id} /dag/~/taskInstanace SELECT * FROM task_instance > So I think then my only ask is that we have a `dag_ids` parameter that > we can use to filter on :) > > > > > Additionally/related when selecting across multiple DAGs we would very > quickly run in to that was fixed in github.com/apache/airflow/pull/7364 > -- where asking for all the task stats for the DAGs on a single page > exceeded the HTTP request line limit. > > i.e. this could break if the DAG ids are large, or if they is a large > number of dags to be fetched. > > GET > /dags/-/dagRuns/2020-01-01T00/taskInstances?dag_ids=...&dag_ids=...&dag_ids=... > > How about this as an additional endpoint: > > POST /dags/-/dagRuns/2020-01-01/taskInstances/list > > dag_ids=...&dag_ids=...&dag_ids... > > > (The /list suffix isn't needed here as there is no POST for > taskInstances, but there is in other places, so making it explicit means > we can use the same pattern for, say `POST /dags/-/dagRuns/list`) > > We can add it, but I don't sure, I have the impression that this will require a separate specification that will require the design of BATCH API. Currently, if you would like to download resources from several collections, you can send many HTTP requests. GET /dags/DAG_A/dagRuns/2020-01-01/taskInstances GET /dags/DAG_B/dagRuns/2020-01-01/taskInstances GET /dags/DAG_C/dagRuns/2020-01-01/taskInstances I would like to focus on making all operations possible, but not necessarily making the API efficient in all cases. > > > If it is valid to pass just one of ExecutionDateStartRange and > ExecutionDateEndRange could we call it ExecutionDateGTE etc. (for > Greater-than-or-equal) so it makes more sense when a single parameter is > providerd. > > I will change it. > > > A minor style point: The filter/query string fields are all defined as > BumpyCase, but I'd prefer us to use snake_case (`execution_date_gte` > etc) - the only sort of applicable proposal I can find is https://aip.dev/140 > I will change it. > (As in I can't find anything in Google recs that talk about case for > case of query string/filter params. I guess they suggest having it all > as a single `?filter=` param :- https://aip.dev/160 ) > The API guides discusses the latest recommendations for API design. Google formerly used query parameters. Example: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/list > -ash > > > > On May 12 2020, at 12:12 am, Jarek Potiuk <jarek.pot...@polidea.com> wrote: > > > While I understand the benefits of the "clean" "REST" API where resources > > are identified by a nice hierarchy, I do not think it is necessary to > > follow it for this kind of queries.I think the "resources" based approach > > indeed breaks down when you want to make more complex queries and want to > > do it efficiently, without running many API calls or returning superfluous > > information. I think that's the reason where GraphQL shines compared to > > REST-like APIs. I think in this case "purity" impacts performance a lot. > > > > I know we did not choose GraphQL (and for good reasons - it's > > complexity is > > not needed in our case) - but I think there are cases where we want > > efficiency and we want to limit a number of requests or superfluous > > information retrieved. > > > > But looking at the balance between purity and efficiency for asking about > > "no matter which dag" or "no matter which executionDate" - I think the > > AIP-159 proposal strikes a good balance. It's pretty natural, easy to read > > and does not break the structure. > > > > I would not use "*" because it can be interpreted differently when > > URL-encoded/decoded because it is a reserved character ( > > https://tools.ietf.org/html/rfc3986#section-2.2) - and some languages > > (Apparently > > PHP > > <https://stackoverflow.com/questions/6533561/urlencode-the-asterisk-star-character> > > might have problems with encoding it accidentally) . I'd use one of the > > un-reserved ones (https://tools.ietf.org/html/rfc3986#section-2.3)). > > If - > > (dash) is out of the question due to compatibility, I'd use ~(tilde). > > > > So if we only want to focus on this kind of query, I think option II. (but > > with ~) is good for me. > > > > I have only one doubt about it. Do we want to handle other queries to > > taskInstances across dag_id or execution_date? For example > > corresponding to > > DAG_ID IN (DAG_1, DAG_2, DAG_3) or DAG_ID LIKE ("DAG_%") from typical SQL > > queries?. I think that might be a useful type of queries. even more so for > > executionDate. From what I see we do not have filters for those, but I > > wonder if this is not something we could do with additional filters and > > still keep the nice structure? > > > > For example: > > > > /dags/-/dagRuns/{execiton_date}/taskInstances?dag_id=DAG_1,DAG_2,DAG_3, > > > > Not sure if this is something we even want to follow? > > > > Ash - what are your thoughts about it? Why do you have doubts about the > > AIP159 approach ? What problems do you foresee with it ? > > > > J. > > > > > > > > On Mon, May 11, 2020 at 7:37 PM QP Hou <q...@scribd.com> wrote: > > > >> I would recommend option III. > >> > >> IMHO, hierarchy based endpoints are for CURD operations on a single > >> entity. The flat endpoint, like /taskinstance, is for read-only > >> multi-facet search queries. > >> > >> For example: > >> > >> * to create a resource Foo: "POST /api/parent/{parent_id}/foo" > >> * to update a resource Foo: "PUT /api/parent/{parent_id}/foo/{foo_id}" > >> * to get a single Foo entity: "GET /api/parent/{parent_id}/foo/{foo_id}" > >> * to search for multiple Foo entities using multiple filters: "GET > >> /api/foo?parent_id=id1,id2,id3&owner=user1" > >> > >> On Mon, May 11, 2020 at 5:08 AM Kamil Breguła <kamil.breg...@polidea.com> > >> wrote: > >> > > >> > Hello, > >> > > >> > I apologize for my very negative behavior. I misunderstood the rules. > >> > I updated the specification based on the discussion. I hope that > >> now I > >> > have not missed any suggestions that I had to make. I tried to make > >> > every change in a separate commit, so you can review the changes one > >> > by one. > >> > > >> > Here is new PR: https://github.com/apache/airflow/pull/8721 > >> > > >> > Short summary: > >> > * Changed type for DAGID, DAGRunID, TaskID. Now we use string instead > >> > of an integer. > >> > * EventLog collection is read-only > >> > * Used ExcutionDate in DagRuns endpoints > >> > * Used custom action to control task instances - Removed PATCH > >> > /dags/{dag_id}/taskInstances/{task_id}/{execution_date} endpoint and > >> > added PATCH /dags/{dag_id}/clearTaskInstanaces. > >> > * Added endpoint - POST /dagRuns "Trigger a DAG Run" > >> > * Added filter parameters to GET /dags/{dag_id}/taskInstances. > >> > * Added filter parameters to GET /dags/{dag_id}/dagRuns > >> > * Removed DELETE /taskInstances endpoint > >> > * The connection ID is a unique key that identifies the connection > >> > * Moved TaskInstances resource under DagRuns resource > >> > * Fixed many typos > >> > > >> > There is one more topic for discussion - reading resources across > >> > multiple collections. > >> > In other words - how do I retrieve task instances for multiple DAGs? > >> > We have several solutions. > >> > I) Currently, the endpoint that receives a list of task instances is > >> > available at /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances: > >> > This endpoint support reading resources across multiple DAGs by > >> > specifying a "-" as a dag_id or an execution_date. > >> > This is based on Google recommendations - AIP159 [1]. I relied on > >> > these recommendations because it is the most comprehensive study on > >> > API design principles. Ash Berlin-Taylor rightly pointed out that this > >> > would be a backward-incompatible change. > >> > II) We can use a different character that will have the same role - > >> > '*'. This character cannot be in Dag/Task ID, so it's safe. > >> > III) Ash proposed to add a separate endpoint that will not include the > >> > DAG ID in the address - /taskInstances > >> > > >> > If we want to choose one solution then I think it's worth looking at > >> > what the endpoints for DAGs look like. > >> > /dags > >> > /dags/{dag_id} > >> > /dags/{dag_id}/clearTaskInstanaces > >> > /dags/{dag_id}/dagRuns > >> > /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances > >> > /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/ > >> > /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/links > >> > > >> /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/logs/{task_try_number} > >> > > >> /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/xcomEntries > >> > > >> /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/xcomEntries/{key} > >> > /dags/{dag_id}/dagRuns/{execution_date} > >> > /dags/{dag_id}/structure > >> > /dags/{dag_id}/tasks > >> > /dags/{dag_id}/tasks/{task_id} > >> > In my opinion, here is a clear hierarchy of resources that will > >> > facilitate the use of API. The third solution causes that this > >> > hierarchy is disturbed and the endpoints that are used to receive the > >> > list of elements will be at the highest level. > >> > We will have the following endpoints: > >> > /dags > >> > /dags/{dag_id} > >> > /dags/{dag_id}/clearTaskInstanaces > >> > /dagRuns > >> > /dags/{dag_id}/dagRuns > >> > /taskInstances > >> > /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances > >> > /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/ > >> > /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/links > >> > > >> /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/logs/{task_try_number} > >> > /xcomEntries > >> > > >> /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/xcomEntries > >> > > >> /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/xcomEntries/{key} > >> > /dags/{dag_id}/dagRuns/{execution_date} > >> > /dags/{dag_id}/structure > >> > /dags/{dag_id}/tasks > >> > /dags/{dag_id}/tasks/{task_id} > >> > > >> > 4) Some endpoints will have similar behavior, so we can delete them. > >> > Then we will have the following list of endpoints. > >> > /dags > >> > /dags/{dag_id} > >> > /dags/{dag_id}/clearTaskInstanaces > >> > /dagRuns > >> > /taskInstances > >> > /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/ > >> > /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/links > >> > > >> /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/logs/{task_try_number} > >> > /xcomEntries > >> > > >> /dags/{dag_id}/dagRuns/{execiton_date}/taskInstances/{task_id}/xcomEntries/{key} > >> > /dags/{dag_id}/dagRuns/{execution_date} > >> > /dags/{dag_id}/structure > >> > /dags/{dag_id}/tasks > >> > /dags/{dag_id}/tasks/{task_id} > >> > > >> > Which solution do you like the most? I, II, III, or IV? > >> > > >> > Best regards, > >> > Kamil Breguła > >> > > >> > [1] https://aip.dev/159 > >> > > >> > On Wed, Apr 15, 2020 at 8:32 PM Daniel (Daniel Lamblin ) [Data > >> > Infrastructure] <lamb...@coupang.com> wrote: > >> > > > >> > > I'm glad the general mood is that connection_id should be unique. > >> > > FWIW when I had multiple connections in v1.8.2 with mysql I didn't > >> seem to be getting any randomized loadbalancing anyway. Then again, maybe > >> random was just 100 selections of 1 over 2. > >> > > There are many other ways to load balance connections, each specific > >> to the service type, so I don't see it Airflow's place to provide a > >> semi-generic option to do it. > >> > > > >> > > +1 for connection ID being unique. > >> > > > >> > > Pardon outlook for changing links to the ConnectionID, DagID and > >> PoolID being integers in a version of the API. > >> > > Are we past that decision already; I'd expect to use string names. > >> > > > >> > > I'd also asked about DAG run ID, task ID, and finally whether there'd > >> be an endpoint with which to clear tasks, because crud operations don't > >> model the interplay of task instance, jobs, and dag run state involved. > >> > > -Daniel > >> > > > >> > > On 4/14/20, 8:49 AM, "Xinbin Huang" <bin.huan...@gmail.com> wrote: > >> > > > >> > > [Warning]: This email originated from an external source. Do not > >> open links or attachments unless you know the content is safe. > >> > > [경고]: 본 이메일은 회사 외부에서 유입되었습니다. 내용이 안전한지 확인하기 전까지는 링크나 첨부파일을 열지 마십시오. > >> > > > >> > > > >> > > +1 on making connection IDs unique. It's confusing to have Airflow > >> handled > >> > > load balancing here. > >> > > > >> > > On Tue, Apr 14, 2020, 4:59 AM Kaxil Naik <kaxiln...@gmail.com> > >> wrote: > >> > > > >> > > > +1 to make connection ids unique > >> > > > > >> > > > On Tue, Apr 14, 2020 at 11:59 AM Jarek Potiuk < > >> jarek.pot...@polidea.com> > >> > > > wrote: > >> > > > > >> > > > > I am also supporting converting the connection to be unique. > >> > > > > > >> > > > > I've worked with similar approach long time ago (10s of years) > >> and it was > >> > > > > fine then where we have not yet figured out how to scale > >> client/server > >> > > > > architecture and we did not have all the nice infrastructure > >> like > >> > > > > load/balancing, cloud services etc. I believe in most of the > >> current > >> > > > > services/systems load-balancing should be handled by the > >> service itself > >> > > > or > >> > > > > some kind of proxy between - not by the client side - in > >> production > >> > > > > environments. > >> > > > > > >> > > > > It's far more robust and might provide much better control > >> (you can > >> > > > control > >> > > > > multiple independent client's connection and do not rely on > >> random > >> > > > > distribution of connections). There are scenarios that would > >> not work > >> > > > well > >> > > > > in this case - for example physical proximity of the workers, > >> current > >> > > > load > >> > > > > on different services etc. etc. It is very limiting to > >> rely on > >> this > >> > > > > feature. > >> > > > > > >> > > > > On Tue, Apr 14, 2020 at 11:36 AM Kamil Breguła < > >> > > > kamil.breg...@polidea.com> > >> > > > > wrote: > >> > > > > > >> > > > > > Hello, > >> > > > > > > >> > > > > > This can cause big problems with idempotence. According to > >> RFC-7231, > >> > > > > > the DELETE method should be idempotent. > >> > > > > > > >> > > > > > For example: > >> > > > > > If you want to delete items with index from 1 to 4, you > >> should set the > >> > > > > > following request > >> > > > > > DELETE /connections/hdfs_default/4 > >> > > > > > DELETE /connections/hdfs_default/3 > >> > > > > > DELETE /connections/hdfs_default/2 > >> > > > > > DELETE /connections/hdfs_default/1 > >> > > > > > > >> > > > > > However, if the user sends these requests in a different > >> order, they > >> > > > > > will only delete the first and third items. > >> > > > > > DELETE /connections/hdfs_default/1 > >> > > > > > DELETE /connections/hdfs_default/2 > >> > > > > > DELETE /connections/hdfs_default/3 > >> > > > > > DELETE /connections/hdfs_default/4 > >> > > > > > > >> > > > > > If you use asynchronous HTTP clients (a popular in Node), > >> the order of > >> > > > > > requests will not be kept. It will also be a big > >> problem with > >> > > > > > simultaneous modifications. > >> > > > > > > >> > > > > > I am also in favor of abandoning support for many > >> connections. This > >> > > > > > can be solved on a different layer. > >> > > > > > > >> > > > > > Best regards, > >> > > > > > Kamil > >> > > > > > > >> > > > > > > >> > > > > > On Thu, Apr 9, 2020 at 6:47 PM Ash Berlin-Taylor < > >> a...@apache.org> > >> > > > wrote: > >> > > > > > > > >> > > > > > > They are, but we _can_ make a choice to remove that > >> feature -- it is > >> > > > > not > >> > > > > > > widely used and is confusing to many when they > >> stumble on > >> it. > >> > > > > > > > >> > > > > > > It's not something we should do lightly, but it is a > >> possibility. > >> > > > > > > > >> > > > > > > I think I'm probably leaning towards the "ordinal" concept: > >> > > > > > > > >> > > > > > > /connections/hdfs_default -> list of connections with that > >> ID > >> > > > > > > /connections/hdfs_default/0 first connection of that type > >> > > > > > > > >> > > > > > > Something like that. > >> > > > > > > > >> > > > > > > On Apr 9 2020, at 2:31 pm, Shaw, Damian P. > >> > > > > > > <damian.sha...@credit-suisse.com> wrote: > >> > > > > > > > >> > > > > > > > FYI if you look back at the thread "Re: [2.0 spring > >> cleaning] > >> > > > Require > >> > > > > > > > unique conn_id" on 2019-04-14 you can see a message from > >> Kevin Yang > >> > > > > > > > stating that this random choice of connections is a > >> "feature" used > >> > > > to > >> > > > > > > > load balance connections in AirBnB. So users are relying > >> on this > >> > > > > > behavior. > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > -----Original Message----- > >> > > > > > > > From: Daniel (Daniel Lamblin ) [Data Infrastructure] > >> > > > > > > > <lamb...@coupang.com> > >> > > > > > > > Sent: Wednesday, April 8, 2020 20:01 > >> > > > > > > > To: dev@airflow.apache.org > >> > > > > > > > Subject: Re: API spec questions > >> > > > > > > > > >> > > > > > > > Having been bit by accidentally having two connections > >> by the same > >> > > > > > > > name or conn_id, I'd prefer if were made unique. In my > >> experience > >> > > > > > > > there's little utility in having multiple > >> connections by > >> the same > >> > > > > > > > name. Tasks that use a connection do to fairly randomly > >> choose one, > >> > > > > > > > rather they seem pretty consistent in using the one > >> created > >> > > > earliest, > >> > > > > > > > which often has the lower id integer. > >> > > > > > > > > >> > > > > > > > Circling back to how this is used by the API, from > >> a user > >> > > > > perspective, > >> > > > > > > > the following in path integer fields were ones I'd have > >> expected to > >> > > > > be > >> > > > > > > > strings instead: > >> > > > > > > > ConnectionID > >> > > > > > > >> > > > > > >> > > > > >> https://kor01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fairflow%2Fblob%2F11f1e0cad996d5596e3e4fb440eb4ec52c024f70%2Fopenapi.yaml%23L1845&data=02%7C01%7Clamblin%40coupang.com%7Cd8716c2432c54849f3d708d7e08b68f2%7Ce3098f96361b47c6a9f4ab7bafcaffe9%7C0%7C0%7C637224761681223746&sdata=ur3AAggOlQ7BskKBxQxRq0i%2FS36Ol72c7H7ddwXo5C4%3D&reserved=0 > >> > > > > > > > DAGID > >> > > > > > > >> > > > > > >> > > > > >> https://kor01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fairflow%2Fblob%2F11f1e0cad996d5596e3e4fb440eb4ec52c024f70%2Fopenapi.yaml%23L1853&data=02%7C01%7Clamblin%40coupang.com%7Cd8716c2432c54849f3d708d7e08b68f2%7Ce3098f96361b47c6a9f4ab7bafcaffe9%7C0%7C0%7C637224761681223746&sdata=vyhFiV6Fd3pnVvlAoxnscSvvcy2NX%2BMASGmuHKu%2Fy84%3D&reserved=0 > >> > > > > > > > PoolID > >> > > > > > > >> > > > > > >> > > > > >> https://kor01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fairflow%2Fblob%2F11f1e0cad996d5596e3e4fb440eb4ec52c024f70%2Fopenapi.yaml%23L1901&data=02%7C01%7Clamblin%40coupang.com%7Cd8716c2432c54849f3d708d7e08b68f2%7Ce3098f96361b47c6a9f4ab7bafcaffe9%7C0%7C0%7C637224761681223746&sdata=4Fog92AZju0%2BN44FoBdU4fj2DwhgCbUfKYUbdjeWpgo%3D&reserved=0 > >> > > > > > > > > >> > > > > > > > Though it’s a url-encoding hassle, I also expected that > >> DAGRunID > >> > > > > would > >> > > > > > > > be more like the Run ID E.G. > >> > > > > > > > "scheduled__2020-04-08T23:10:00+00:00", > >> > > > > > > > "manual__2020-04-08T23:00:56+00:00", > >> > > > > > > > "manual__2020-04-08T16:24:56.692868+00:00" , > >> > > > > > > > "backfill_2020-04-08T22:05:00+00:00" etc > >> > > > > > > > > >> > > > > > > > Then TaskID is confusing to me; AFAIK the PK to task > >> instances are > >> > > > > > > > task_id, dag_id, and execution_date and the api call > >> appears to > >> > > > align > >> > > > > > > > with that having the pattern: > >> > > > > > > > /dags/{dag_id}/taskInstances/{task_id}/{execution_date}: > >> > > > > > > > But if task_id is a numbered id, then… execution_date > >> isn't even > >> > > > > > > > needed… I'm thinking it should have been a string. > >> > > > > > > > > >> > > > > > > > An aside to this, I've always wondered what happens > >> if an > >> > > > externally > >> > > > > > > > triggered DAG Run has the same execution date as a > >> pre-existing > >> > > > > > > > scheduled DAG Run. They'd have different run_ids, EG > >> > > > > > > > "scheduled__2020-04-08T23:10:00+00:00" vs > >> > > > > > > > "manual__2020-04-08T23:10:00+00:00" but then task > >> instances for > >> > > > those > >> > > > > > > > runs might not be unique. > >> > > > > > > > > >> > > > > > > > Lastly, the UI and CLI operation of clearing tasks seems > >> analogous > >> > > > to > >> > > > > > > > the delete task instance API end point. But probably > >> it's not, and > >> > > > > > > > this could become confusing to users. > >> > > > > > > > There should be a non-db-model api call for clearing > >> tasks like you > >> > > > > > > > can from the UI and the CLI. If I read it right, > >> clearing does the > >> > > > > > > > following: it sets the state of the task instance to > >> None unless > >> > > > the > >> > > > > > > > state was Running then instead it sets it and related > >> job ids to > >> > > > > > > > Shutdown. It deletes reschedules of the TI and it sets > >> the dag runs > >> > > > > > > > for those task instances back to Running. > >> > > > > > > > This would be a lot to do for a user using PATCH calls > >> to change > >> > > > Task > >> > > > > > > > Instances and Dag Runs together (and there's no API for > >> Jobs). > >> > > > > > > > > >> > > > > > > > -Daniel L. > >> > > > > > > > P.S. as far as renaming all parameters on operators and > >> hooks with > >> > > > > > > > *_conn_id, I do not want to see that breaking change > >> happen. But IF > >> > > > > it > >> > > > > > > > HAS TO, I'm still of the opinion that the default_args > >> use for > >> > > > > > > > X_conn_id is not preferable to having all operators take > >> simpler > >> > > > > > > > source_conn_id and optional target_conn_id parameter > >> names that are > >> > > > > > > > consistent across all operators. > >> > > > > > > > > >> > > > > > > > On 4/8/20, 9:47 AM, "Ash Berlin-Taylor" <a...@apache.org> > >> wrote: > >> > > > > > > > > >> > > > > > > > [Warning]: This email originated from an external > >> source. Do not > >> > > > > > > > open links or attachments unless you know the > >> content is > >> safe. > >> > > > > > > > [경고]: 본 이메일은 회사 외부에서 유입되었습니다. 내용이 안전한지 확인하기 전까지는 링크나 > >> 첨부파일을 열지 > >> > > > > 마십시오. > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > To expand on the "so I think we need to do one of": > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > - we need to update the name of "conn_id" > >> somehow. I > >> can't think > >> > > > > of > >> > > > > > a > >> > > > > > > > better option, and given all the operators have > >> "x_conn_id" I > >> > > > > don't > >> > > > > > > > think that change should be made lightly. > >> > > > > > > > > >> > > > > > > > - make conn_id unique (this "poor" HA has been a > >> source of > >> > > > > > > > confusion in > >> > > > > > > > the past) and the ID we use in the API > >> > > > > > > > > >> > > > > > > > Or a third option: > >> > > > > > > > > >> > > > > > > > - Have the API take conn_id and either return all > >> conns for that > >> > > > > > > > conn_id, or conn_id plus an ordinal (?idx=0 type > >> thing) to > >> > > > return > >> > > > > a > >> > > > > > > > single value. > >> > > > > > > > > >> > > > > > > > On Apr 8 2020, at 5:42 pm, Ash Berlin-Taylor < > >> a...@apache.org> > >> > > > > > wrote: > >> > > > > > > > > >> > > > > > > > > Hi everyone, > >> > > > > > > > > > >> > > > > > > > > So as I mentioned in the AIP voting thread, I think > >> we need to > >> > > > > > give > >> > > > > > > > > some more thought to how we are exposing connection > >> ids in the > >> > > > > > API. > >> > > > > > > > > > >> > > > > > > > > Right now as proposed (and merged without approval, > >> not cool. > >> > > > > The > >> > > > > > AIP > >> > > > > > > > > we voted on did not contain a PR against > >> apache/airflow.) it > >> > > > has > >> > > > > > an > >> > > > > > > > > end point of `/connections/{connection_id} ` > >> > > > > > > > > > >> > > > > > > > > My issue here is as I said in the previous thread: > >> that is > >> > > > going > >> > > > > > > > to be > >> > > > > > > > > mightly confusing to our users because there > >> is a > >> "conn_id" > >> > > > > > concept > >> > > > > > > > > that is exposed, so people are going to try putting > >> > > > > "aws_default" > >> > > > > > etc > >> > > > > > > > > in there. I don't care what the API spec says, I > >> care what our > >> > > > > > users > >> > > > > > > > > expect, and having a connection_id/id and a conn_id > >> fields is > >> > > > > > > > just confusing > >> > > > > > > > > > >> > > > > > > > > So I think we need to do one of: > >> > > > > > > > > - we need to update the name of "conn_id" somehow, > >> make > >> > > > conn_id > >> > > > > > unique > >> > > > > > > > > (this "poor" HA has been a source of confusion in > >> the past) > >> > > > > > > > > > >> > > > > > > > > There are similar problems for the DAG run -- the > >> spec has the > >> > > > > > > > type as > >> > > > > > > > > an integer, but everything else about the Airflow > >> system uses > >> > > > > the > >> > > > > > > > > (unique) run_id parameter, and I would expect the > >> API to use > >> > > > > > > > that. The > >> > > > > > > > > autoinc. id on the run column is literally never > >> used in the > >> > > > > code > >> > > > > > > > > base, so exposing that via the API seems odd. > >> > > > > > > > > > >> > > > > > > > > A few other smaller points: > >> > > > > > > > > > >> > > > > > > > > EventLog: Those are "audit"/action logs, so we > >> probably > >> > > > > shouldn't > >> > > > > > let > >> > > > > > > > > people delete them via the API! > >> > > > > > > > > > >> > > > > > > > > pool_id: still an integer. It should be the "name". > >> > > > > > > > > > >> > > > > > > > > How should add/delete variables and connections > >> work in the > >> > > > API > >> > > > > > with > >> > > > > > > > > the addition of the new "Secrets Backends"? > >> > > > > > > > > > >> > > > > > > > > xcomValues: task_id is listed as an integer. > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > -ash > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> =============================================================================== > >> > > > > > > > Please access the attached hyperlink for an important > >> electronic > >> > > > > > > > communications disclaimer: > >> > > > > > > > > >> https://kor01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.credit-suisse.com%2Flegal%2Fen%2Fdisclaimer_email_ib.html&data=02%7C01%7Clamblin%40coupang.com%7Cd8716c2432c54849f3d708d7e08b68f2%7Ce3098f96361b47c6a9f4ab7bafcaffe9%7C0%7C0%7C637224761681223746&sdata=6e9Ck952oFmXDkoDYnM0XgCWsplYb7Dua37BkAPt22A%3D&reserved=0 > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> =============================================================================== > >> > > > > > > > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > -- > >> > > > > > >> > > > > Jarek Potiuk > >> > > > > Polidea < > >> https://kor01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.polidea.com%2F&data=02%7C01%7Clamblin%40coupang.com%7Cd8716c2432c54849f3d708d7e08b68f2%7Ce3098f96361b47c6a9f4ab7bafcaffe9%7C0%7C0%7C637224761681223746&sdata=tMOZFKjLGoXh9eAVW5rXhmJeJw8kYhlWAfwFICN%2Fpfw%3D&reserved=0> > >> | Principal Software Engineer > >> > > > > > >> > > > > M: +48 660 796 129 <+48660796129> > >> > > > > [image: Polidea] < > >> https://kor01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.polidea.com%2F&data=02%7C01%7Clamblin%40coupang.com%7Cd8716c2432c54849f3d708d7e08b68f2%7Ce3098f96361b47c6a9f4ab7bafcaffe9%7C0%7C0%7C637224761681223746&sdata=tMOZFKjLGoXh9eAVW5rXhmJeJw8kYhlWAfwFICN%2Fpfw%3D&reserved=0 > >> > > >> > > > > > >> > > > > >> > > > >> > > > >> > > > > > > -- > > > > Jarek Potiuk > > Polidea <https://www.polidea.com/> | Principal Software Engineer > > > > M: +48 660 796 129 <+48660796129> > > [image: Polidea] <https://www.polidea.com/> > >