Hi Mason,

Thanks for driving the discussion. +1 for the proposal.

How about we return all operator metrics in a vertex? (The response would be a map of operatorId/operatorName -> operator-metrics.) Correspondingly, the URL could be changed to /jobs/<jobid>/vertices/<vertexid>/operator-metrics

This way, users can skip the step of obtaining the operator ID.

Best,
Lijie

On Wed, Jan 17, 2024 at 10:31, Hang Ruan <ruanhang1...@gmail.com> wrote:

> Hi, Mason.
>
> The field `operatorName` in JobManagerOperatorQueryScopeInfo refers to the
> fields in OperatorQueryScopeInfo and chooses the operatorName instead of
> the OperatorID.
> It is fine on my side to change from operatorName to operatorID in this
> FLIP.
>
> Best,
> Hang
>
> On Wed, Jan 17, 2024 at 09:39, Mason Chen <mas.chen6...@gmail.com> wrote:
>
> > Hi Xuyang and Hang,
> >
> > Thanks for your support and feedback! See my responses below:
> >
> > > 1. IIRC, in a sense, operator ID and vertex ID are the same thing. The
> > > operator ID can be converted from the vertex ID[1]. Therefore, it is
> > > somewhat strange to have both vertex ID and operator ID in a single URL.
> >
> > I think Hang explained it perfectly. Essentially, a vertex may contain one
> > or more operators, so the operator ID is required to distinguish between
> > them.
> >
> > > 2. If I misunderstood the semantics of operator IDs here, then what is
> > > the relationship between vertex ID and operator ID, and do we need a url
> > > like `/jobs/<jobid>/vertices/<vertexid>/operators/`
> > > to list all operator ids under this vertex?
> >
> > Good question, we definitely need to expose operator IDs through the REST
> > API to make this usable. I'm looking at how users would currently discover
> > the vertex id to query. From the supported REST APIs [1], you can currently
> > obtain it from
> >
> > 1. `/jobs/<jobid>`
> > 2. `/jobs/<jobid>/plan`
> >
> > The responses of both these APIs include the vertex ids (the
> > vertices AND nodes fields), but not the operator ids.
> > We would need to add the logic to the plan generation [2]. The response
> > is a little confusing because there is a field in the vertices called
> > "operator name". I propose to add a new field called "operators" to the
> > vertex response object, which would be a list of objects with the structure
> >
> > Operator
> > {
> >   "id": "THE-FLINK-GENERATED-ID"
> > }
> >
> > > The JobManagerOperatorQueryScopeInfo has three fields: jobID, vertexID
> > > and operatorName. So we should use the operator name in the API.
> > > If you think we should use the operator id, more changes would be
> > > needed.
> >
> > I think we should use the operator id since it uniquely identifies an
> > operator, whereas the operator name does not (it may be empty or
> > repeated between operators by the user). I actually had a question on that
> > since you implemented the metric group. What's the reason we use operator
> > name currently? Could it also use operator id so we can match against the
> > id?
> >
> > [1]
> > https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/
> > [2]
> > https://github.com/apache/flink/blob/416cb7aaa02c176e01485ff11ab4269f76b5e9e2/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/jsonplan/JsonPlanGenerator.java#L53
> >
> > Best,
> > Mason
> >
> > On Thu, Jan 11, 2024 at 10:54 PM Hang Ruan <ruanhang1...@gmail.com> wrote:
> >
> > > Hi, Mason.
> > >
> > > Thanks for driving this FLIP.
> > >
> > > The JobManagerOperatorQueryScopeInfo has three fields: jobID, vertexID
> > > and operatorName. So we should use the operator name in the API.
> > > If you think we should use the operator id, more changes would be
> > > needed.
> > >
> > > Regarding Xuyang's questions, we add both vertexID and operatorID
> > > information because of the operator chain.
> > > An operator chain has a vertexID and contains many different operators.
> > > The operator information helps to distinguish them in the same operator
> > > chain.
> > >
> > > Best,
> > > Hang
> > >
> > > On Fri, Jan 12, 2024 at 10:21, Xuyang <xyzhong...@163.com> wrote:
> > >
> > > > Hi, Mason.
> > > > Thanks for driving this FLIP. I think it's important for external
> > > > systems to be able to perceive the metrics of the operator
> > > > coordinator. +1 for it.
> > > >
> > > > I just have the following minor questions and am looking forward to
> > > > your reply. Please forgive me if I have some misunderstandings.
> > > >
> > > > 1. IIRC, in a sense, operator ID and vertex ID are the same thing. The
> > > > operator ID can be converted from the vertex ID[1]. Therefore, it is
> > > > somewhat strange to have both vertex ID and operator ID in a single
> > > > URL.
> > > >
> > > > 2. If I misunderstood the semantics of operator IDs here, then what is
> > > > the relationship between vertex ID and operator ID, and do we need a
> > > > url like `/jobs/<jobid>/vertices/<vertexid>/operators/`
> > > > to list all operator ids under this vertex?
> > > >
> > > > [1]
> > > > https://github.com/apache/flink/blob/7bebd2d9fac517c28afc24c0c034d77cfe2b43a6/flink-runtime/src/main/java/org/apache/flink/runtime/jobgraph/OperatorID.java#L40C27-L40C27
> > > >
> > > > --
> > > >
> > > > Best!
> > > > Xuyang
> > > >
> > > > At 2024-01-12 04:20:03, "Mason Chen" <mas.chen6...@gmail.com> wrote:
> > > > >Hi Devs,
> > > > >
> > > > >I'm opening this thread to discuss a short FLIP for exposing
> > > > >JobManagerOperatorMetrics via REST API [1].
> > > > >
> > > > >The current set of REST APIs makes it impossible to query coordinator
> > > > >metrics. This FLIP proposes a new REST API to query the
> > > > >JobManagerOperatorMetrics.
> > > > >
> > > > >[1]
> > > > >https://cwiki.apache.org/confluence/display/FLINK/FLIP-417%3A+Expose+JobManagerOperatorMetrics+via+REST+API
> > > > >
> > > > >Best,
> > > > >Mason
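
For readers following the thread, here is a minimal Python sketch of the two shapes discussed above: the per-vertex `operator-metrics` path with a response keyed by operator ID, and the "operators" field proposed for the vertex response object. The function names, the sample IDs, and the metric names are all hypothetical placeholders; this is not Flink code and does not reflect what the FLIP finally adopts.

```python
from urllib.parse import quote


def operator_metrics_path(job_id, vertex_id):
    """Build the per-vertex path suggested in this thread (hypothetical helper)."""
    return (
        f"/jobs/{quote(job_id, safe='')}"
        f"/vertices/{quote(vertex_id, safe='')}/operator-metrics"
    )


def group_by_operator(flat_metrics):
    """Group (operator_id, metric_name, value) tuples into
    a map of operator_id -> {metric_name: value}."""
    grouped = {}
    for operator_id, name, value in flat_metrics:
        grouped.setdefault(operator_id, {})[name] = value
    return grouped


def with_operators(vertex, operator_ids):
    """Return a copy of a vertex dict with an "operators" list of {"id": ...}."""
    enriched = dict(vertex)
    enriched["operators"] = [{"id": op_id} for op_id in operator_ids]
    return enriched


if __name__ == "__main__":
    # Placeholder IDs and metric names, for illustration only.
    print(operator_metrics_path("job1", "vertex1"))
    print(group_by_operator([("op-a", "metricX", 1), ("op-b", "metricX", 2)]))
    print(with_operators({"id": "vertex1", "name": "Source"}, ["op-a", "op-b"]))
```

Keying the response by operator ID rather than operator name follows the point made in the thread that names may be empty or repeated, whereas IDs are unique.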