I see. So you are suggesting that the jobmanager support both /foo/bar and /jobs/:jobid/foo/bar, while the history server only supports the latter.
I was initially thinking that having two APIs in the jobmanager serving the exact same purpose is a bit tricky. Now I think it's a good point that these two APIs, despite currently returning the same results, can return different things in the future.

Junhan & Yangze, WDYT?

Best,

Xintong

On Fri, Jun 24, 2022 at 3:10 PM Chesnay Schepler <ches...@apache.org> wrote:

> This is pretty simple to explain.
>
> "I want to know the environment the job ran in." -> /jobs/:jobid/environment
> "I want to know the environment the JM ran in." -> /jobmanager/environment
>
> It's less about the JobID being a parameter, and more of a way for them
> to better model the resource they are interested in.
>
> In the future we could consider the job environment endpoint to return
> not just the JM environment, but also those from the CLI/TMs.
>
> On 24/06/2022 06:37, Xintong Song wrote:
> > Whether the job ID is actually used in the end isn't visible after all.
> >
> > I'm not sure about this. E.g., for an empty session cluster, users have to
> > understand they don't need to provide an actual jobid for requesting
> > jobmanager information via rest.
> >
> > I believe both ways work. I think this is a trade-off between a) explaining
> > to history server rest api users how the urls are different from jobmanager
> > and b) explaining to jobmanager rest api users why we need an unused jobid
> > for some of the cases. I'm leaning toward the current approach, because I'd
> > expect a smaller set of history server rest api users than (or even a
> > subset of) that of jobmanager.
> >
> > The plan is to document which (and how) the urls are different from
> > jobmanager in the history server page [1].
> >
> > Compatibility test indeed should be considered. Thanks for pointing it out.
> > Currently the compatibility of history server rest api is guaranteed by the
> > compatibility of jobmanager rest api.
> > I think the only thing we need is to
> > make sure /foo/bar of jobmanager is identical to /jobs/:jobid/foo/bar of
> > history server. We can introduce an interface, as a subtype of JsonArchivist,
> > that archives the json with a path that includes the jobid. Then we can
> > test against all relevant handlers as implementations of this interface.
> >
> > WDYT?
> >
> > Best,
> >
> > Xintong
> >
> > [1]
> > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/advanced/historyserver/#available-requests
> >
> > On Thu, Jun 23, 2022 at 5:07 PM Chesnay Schepler <ches...@apache.org> wrote:
> >
> >> The addition of the /jobs/:jobid/jobmanager/config / environment
> >> exclusively to the HS is a bit of a strange workaround.
> >> How do you intend to document those (and test compatibility)?
> >>
> >> Why not just add a general /jobs/:jobid/environment endpoint that works
> >> just like jobmanager/environment?
> >> To me that seems like a cleaner solution.
> >> It is somewhat mentioned as an alternative in the FLIP, but I don't
> >> understand what is supposed to be confusing about it.
> >> Whether the job ID is actually used in the end isn't visible after all.
> >>
> >> /jobmanager/config could be integrated into /jobs/:jobid/config.
> >>
> >> The same approach could maybe be used for logs; not really sure yet (not
> >> a fan of displaying logs in the HS in the first place).
> >>
> >> On 23/06/2022 06:55, junhan yang wrote:
> >>> Hi all,
> >>>
> >>> Thank you all for your feedback. As far as I can see, it looks like the
> >>> discussion on this FLIP has converged.
> >>>
> >>> I will start a new vote thread now.
> >>>
> >>> Best regards,
> >>> Junhan
> >>>
> >>> On Fri, Jun 17, 2022 at 2:05 PM Yangze Guo <karma...@gmail.com> wrote:
> >>>
> >>>> Thanks for the input, Jiangang.
> >>>>
> >>>> I think it's a valid demand to distinguish completed jobs with the same
> >>>> name.
> >>>> - If they are different jobs, I think users need to give them
> >>>> different meaningful names respectively.
> >>>> - If they are exactly the same job, IIUC, what you need is to figure
> >>>> out the order. ApplicationId in Yarn might help. But in this case, you
> >>>> can just sort them by start time.
> >>>>
> >>>> Best,
> >>>> Yangze Guo
> >>>>
> >>>> On Fri, Jun 17, 2022 at 12:13 PM Jiangang Liu <liujiangangp...@gmail.com> wrote:
> >>>>> Thanks for the FLIP. It is helpful to track detailed info for completed
> >>>>> jobs.
> >>>>> I want to ask another question. In our environment, sometimes it is hard to
> >>>>> distinguish jobs since the same job name may appear multiple times in the
> >>>>> completed jobs, because a job may run multiple times or different jobs may
> >>>>> have the same job name. I wonder whether we can enhance the completed jobs
> >>>>> display with more information, such as applicationId and application name
> >>>>> in yarn. Maybe it is different in k8s to identify a job.
> >>>>>
> >>>>> Best
> >>>>> Jiangang Liu
> >>>>>
> >>>>> On Fri, Jun 17, 2022 at 11:40 AM Yangze Guo <karma...@gmail.com> wrote:
> >>>>>
> >>>>>> Thanks for the feedback, Aitozi and Jing.
> >>>>>>
> >>>>>>> Will each attempt of the TaskManager or JobManager pods (if a failure
> >>>>>>> occurs) be shown in the UI?
> >>>>>>
> >>>>>> The info of the prior execution attempts will be archived; you could
> >>>>>> refer to `ArchivedExecutionVertex$priorExecutions`.
> >>>>>>
> >>>>>>> It seems that most of these metrics are more interesting to batch jobs.
> >>>>>>> Does it make sense to calculate them for pure streaming jobs too?
> >>>>>>
> >>>>>> All the proposed metrics will be calculated no matter what the job type is.
> >>>>>>
> >>>>>>> Why "duration is less interesting" which is mentioned in the FLIP?
> >>>>>>
> >>>>>> As a first step, we mainly focus on the most interesting status during
> >>>>>> the job lifecycle.
> >>>>>> The duration of final states like FINISHED and
> >>>>>> CANCELED is meaningless, while abnormal conditions like CANCELING will
> >>>>>> not be included at the moment.
> >>>>>>
> >>>>>>> Could you share your thoughts on "accumulated-busy-time"? It should
> >>>>>>> describe the time while the task is working as expected, i.e. the happy
> >>>>>>> path. When do we need it for analytics or diagnosis?
> >>>>>>
> >>>>>> A task could be busy or idle while it is working. Users may adjust the
> >>>>>> parallelism or the partition key according to the ratio between them.
> >>>>>>
> >>>>>> Best,
> >>>>>> Yangze Guo
> >>>>>>
> >>>>>> On Fri, Jun 17, 2022 at 5:08 AM Jing Ge <j...@ververica.com> wrote:
> >>>>>>> Hi Junhan
> >>>>>>>
> >>>>>>> This is must-have information for batch processing. Thanks for
> >>>>>>> bringing it up.
> >>>>>>>
> >>>>>>> I have some comments:
> >>>>>>>
> >>>>>>> 1. It seems that most of these metrics are more interesting to batch jobs.
> >>>>>>> Does it make sense to calculate them for pure streaming jobs too?
> >>>>>>> 2. Why "duration is less interesting" which is mentioned in the FLIP?
> >>>>>>> 3. Could you share your thoughts on "accumulated-busy-time"? It should
> >>>>>>> describe the time while the task is working as expected, i.e. the happy
> >>>>>>> path. When do we need it for analytics or diagnosis?
> >>>>>>>
> >>>>>>> BTW, you might want to optimize the format of the FLIP. Some text is
> >>>>>>> running out of the right border of the wiki page.
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>> Jing
> >>>>>>>
> >>>>>>> On Thu, Jun 16, 2022 at 4:40 PM Aitozi <gjying1...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Thanks Junhan for driving this. It is a great improvement for batch jobs.
> >>>>>>>> I'm looking forward to this feature in our internal use case. +1 for it.
> >>>>>>>> One more question:
> >>>>>>>>
> >>>>>>>> Will each attempt of the TaskManager or JobManager pods (if a failure
> >>>>>>>> occurs) be shown in the UI?
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Aitozi.
> >>>>>>>>
> >>>>>>>> On Thu, Jun 16, 2022 at 7:10 PM Yang Wang <danrtsey...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks Xintong for the explanation.
> >>>>>>>>>
> >>>>>>>>> It makes sense to leave the discussion about job result store in a
> >>>>>>>>> dedicated thread.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Yang
> >>>>>>>>>
> >>>>>>>>> On Thu, Jun 16, 2022 at 1:40 PM Xintong Song <tonysong...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> My impression of JobResultStore is more about fault tolerance and high
> >>>>>>>>>> availability. Using it for providing information to users sounds worth
> >>>>>>>>>> exploring. We probably need more time to think it through.
> >>>>>>>>>>
> >>>>>>>>>> Given that it doesn't conflict with what we have proposed in this FLIP, I'd
> >>>>>>>>>> suggest considering it as a separate thread and excluding it from the scope
> >>>>>>>>>> of this one.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>>
> >>>>>>>>>> Xintong
> >>>>>>>>>>
> >>>>>>>>>> On Thu, Jun 16, 2022 at 11:43 AM Yang Wang <danrtsey...@gmail.com> wrote:
> >>>>>>>>>>> This is a very useful feature both for finished streaming and batch jobs.
> >>>>>>>>>>> Besides the WebUI & REST API improvements, I am curious whether we could
> >>>>>>>>>>> also integrate some critical information (e.g. latest checkpoint) into the
> >>>>>>>>>>> job result store[1].
> >>>>>>>>>>> I am just feeling this is also somehow related to "Completed Jobs
> >>>>>>>>>>> Information Enhancement".
> >>>>>>>>>>> And I think the history server is not necessary for all the scenarios,
> >>>>>>>>>>> especially when users only want to check the job execution result.
> >>>>>>>>>>>
> >>>>>>>>>>> [1].
> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-194%3A+Introduce+the+JobResultStore
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Yang
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jun 15, 2022 at 3:37 PM Xintong Song <tonysong...@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Thanks Junhan,
> >>>>>>>>>>>>
> >>>>>>>>>>>> +1 for the proposed improvements.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Xintong
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jun 15, 2022 at 3:16 PM Yangze Guo <karma...@gmail.com> wrote:
> >>>>>>>>>>>>> Thanks for driving this, Junhan.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I think it's a valuable usability improvement for both streaming and
> >>>>>>>>>>>>> batch users. Looking forward to the community feedback.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Yangze Guo
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Jun 15, 2022 at 3:10 PM junhan yang <yangjunhan1...@gmail.com> wrote:
> >>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I would like to open a discussion on FLIP-241: Completed Jobs Information
> >>>>>>>>>>>>>> Enhancement.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> As far as we can tell, streaming and batch users have different interests
> >>>>>>>>>>>>>> in probing a job. As Flink grows into a unified streaming & batch processor
> >>>>>>>>>>>>>> and is adopted by more and more batch users, the user experience of
> >>>>>>>>>>>>>> inspecting completed jobs has become more and more important.
> >>>>>>>>>>>>>> After doing some market research, we spotted several potential
> >>>>>>>>>>>>>> improvements. This discussion is opened because the proposal involves
> >>>>>>>>>>>>>> WebUI & REST API changes, which should be openly discussed and voted on
> >>>>>>>>>>>>>> as FLIPs.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> You can find more details in the FLIP-241 document[1]. Looking forward to
> >>>>>>>>>>>>>> your feedback.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/x/dRD1D
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>> Junhan
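
A note on the compatibility check proposed earlier in the thread (making sure /foo/bar of the jobmanager is identical to /jobs/:jobid/foo/bar of the history server): the idea could be sketched roughly as below. This is only an illustration in Python, not Flink code. The function names (`history_server_path`, `archive_with_job_id`, `find_mismatches`) and the dict of handler callables are hypothetical stand-ins for the real REST handlers and for the JsonArchivist subtype mentioned in the thread.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for a REST handler: a callable returning a JSON-like value.
Handler = Callable[[], Any]


def history_server_path(jobmanager_path: str, job_id: str) -> str:
    """Map a cluster-level path such as /foo/bar to /jobs/<job_id>/foo/bar."""
    if not jobmanager_path.startswith("/"):
        raise ValueError("expected an absolute REST path")
    return f"/jobs/{job_id}{jobmanager_path}"


def archive_with_job_id(handlers: Dict[str, Handler], job_id: str) -> Dict[str, Any]:
    """Archive each handler's response under a path that includes the job ID."""
    return {
        history_server_path(path, job_id): handler()
        for path, handler in handlers.items()
    }


def find_mismatches(
    handlers: Dict[str, Handler], archive: Dict[str, Any], job_id: str
) -> List[str]:
    """Return the jobmanager paths whose archived JSON differs from the live response."""
    return [
        path
        for path, handler in handlers.items()
        if archive.get(history_server_path(path, job_id)) != handler()
    ]
```

A compatibility test would then run `find_mismatches` over every relevant handler and expect an empty list. Keeping the path mapping in one function means the documented difference between the two URL schemes lives in a single place.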