Hi all,

Thank you all for your feedback. As far as I can see, the discussion on this FLIP has converged.
I will start a new vote thread now.

Best regards,
Junhan

On Fri, Jun 17, 2022 at 14:05, Yangze Guo <karma...@gmail.com> wrote:

> Thanks for the input, Jiangang.
>
> I think it's a valid demand to distinguish completed jobs with the same name.
> - If they are different jobs, I think users need to give them different, meaningful names.
> - If they are exactly the same job, IIUC, what you need is to figure out the order. ApplicationId in Yarn might help. But in this case, you can just sort them by start time.
>
> Best,
> Yangze Guo
>
> On Fri, Jun 17, 2022 at 12:13 PM Jiangang Liu <liujiangangp...@gmail.com> wrote:
> >
> > Thanks for the FLIP. It is helpful to track detailed information for completed jobs.
> >
> > I want to ask another question. In our environment, it is sometimes hard to distinguish jobs, since the same job name may appear multiple times among the completed jobs, either because a job has run multiple times or because different jobs share the same name. I wonder whether we can enhance the completed jobs display with more information, such as the applicationId and application name in Yarn. Identifying a job may work differently in k8s.
> >
> > Best
> > Jiangang Liu
> >
> > On Fri, Jun 17, 2022 at 11:40, Yangze Guo <karma...@gmail.com> wrote:
> >
> > > Thanks for the feedback, Aitozi and Jing.
> > >
> > > > Will each attempt of the TaskManager or JobManager pods (if a failure occurs) be shown in the UI?
> > >
> > > The info of the prior execution attempts will be archived; you could refer to `ArchivedExecutionVertex$priorExecutions`.
> > >
> > > > It seems that most of these metrics are more interesting to batch jobs. Does it make sense to calculate them for pure streaming jobs too?
> > >
> > > All the proposed metrics will be calculated no matter what the job type is.
> > >
> > > > Why is "duration less interesting", as mentioned in the FLIP?
> > >
> > > As a first step, we mainly focus on the most interesting statuses during the job lifecycle. The duration of final states like FINISHED and CANCELED is meaningless, while abnormal conditions like CANCELING will not be included at the moment.
> > >
> > > > Could you share your thoughts on "accumulated-busy-time"? It should describe the time while the task is working as expected, i.e. the happy path. When do we need it for analytics or diagnosis?
> > >
> > > A task could be busy or idle while it is working. Users may adjust the parallelism or the partition key according to the ratio between them.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Fri, Jun 17, 2022 at 5:08 AM Jing Ge <j...@ververica.com> wrote:
> > > >
> > > > Hi Junhan
> > > >
> > > > This is must-have information for batch processing. Thanks for bringing it up.
> > > >
> > > > I have some comments:
> > > >
> > > > 1. It seems that most of these metrics are more interesting to batch jobs. Does it make sense to calculate them for pure streaming jobs too?
> > > > 2. Why is "duration less interesting", as mentioned in the FLIP?
> > > > 3. Could you share your thoughts on "accumulated-busy-time"? It should describe the time while the task is working as expected, i.e. the happy path. When do we need it for analytics or diagnosis?
> > > >
> > > > BTW, you might want to optimize the format of the FLIP. Some text is running past the right border of the wiki page.
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Thu, Jun 16, 2022 at 4:40 PM Aitozi <gjying1...@gmail.com> wrote:
> > > >
> > > > > Thanks Junhan for driving this. It's a great improvement for batch jobs.
> > > > > I'm looking forward to this feature in our internal use case. +1 for it.
> > > > >
> > > > > One more question:
> > > > >
> > > > > Will each attempt of the TaskManager or JobManager pods (if a failure occurs) be shown in the UI?
> > > > >
> > > > > Best,
> > > > > Aitozi.
> > > > >
> > > > > On Thu, Jun 16, 2022 at 19:10, Yang Wang <danrtsey...@gmail.com> wrote:
> > > > >
> > > > > > Thanks Xintong for the explanation.
> > > > > >
> > > > > > It makes sense to leave the discussion about the job result store to a dedicated thread.
> > > > > >
> > > > > > Best,
> > > > > > Yang
> > > > > >
> > > > > > On Thu, Jun 16, 2022 at 13:40, Xintong Song <tonysong...@gmail.com> wrote:
> > > > > >
> > > > > > > My impression of JobResultStore is that it is more about fault tolerance and high availability. Using it for providing information to users sounds worth exploring. We probably need more time to think it through.
> > > > > > >
> > > > > > > Given that it doesn't conflict with what we have proposed in this FLIP, I'd suggest considering it in a separate thread and excluding it from the scope of this one.
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Xintong
> > > > > > >
> > > > > > > On Thu, Jun 16, 2022 at 11:43 AM Yang Wang <danrtsey...@gmail.com> wrote:
> > > > > > >
> > > > > > > > This is a very useful feature for both finished streaming and batch jobs.
> > > > > > > >
> > > > > > > > Apart from the WebUI & REST API improvements, I am curious whether we could also integrate some critical information (e.g. the latest checkpoint) into the job result store[1].
> > > > > > > > I feel this is also somewhat related to "Completed Jobs Information Enhancement".
> > > > > > > > And I think the history server is not necessary for all scenarios, especially when users only want to check the job execution result.
> > > > > > > >
> > > > > > > > [1]. https://cwiki.apache.org/confluence/display/FLINK/FLIP-194%3A+Introduce+the+JobResultStore
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yang
> > > > > > > >
> > > > > > > > On Wed, Jun 15, 2022 at 15:37, Xintong Song <tonysong...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Thanks Junhan,
> > > > > > > > >
> > > > > > > > > +1 for the proposed improvements.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > >
> > > > > > > > > Xintong
> > > > > > > > >
> > > > > > > > > On Wed, Jun 15, 2022 at 3:16 PM Yangze Guo <karma...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Thanks for driving this, Junhan.
> > > > > > > > > >
> > > > > > > > > > I think it's a valuable usability improvement for both streaming and batch users. Looking forward to the community feedback.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > > > On Wed, Jun 15, 2022 at 3:10 PM junhan yang <yangjunhan1...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > I would like to open a discussion on FLIP-241: Completed Jobs Information Enhancement.
> > > > > > > > > > >
> > > > > > > > > > > As far as we can tell, streaming and batch users have different interests in probing a job. As Flink grows into a unified streaming & batch processor and is adopted by more and more batch users, the user experience of inspecting completed jobs has become more and more important. After some market research, we have spotted several potential improvements.
> > > > > > > > > > >
> > > > > > > > > > > The main reason for this discussion is that the proposal involves WebUI & REST API changes, which should be openly discussed and voted on as FLIPs.
> > > > > > > > > > >
> > > > > > > > > > > You can find more details in the FLIP-241 document[1]. Looking forward to your feedback.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://cwiki.apache.org/confluence/x/dRD1D
> > > > > > > > > > >
> > > > > > > > > > > Best regards,
> > > > > > > > > > > Junhan