Re: [DISCUSS] KIP-818: Introduce cache-size-bytes-total Task Level Metric
Thanks for the updates Sagar! On Thu, Feb 10, 2022 at 5:59 PM Sagar wrote: > Hi All, > > As discussed above, this KIP would be discarded and the new metric proposed > here would be added to KIP-770 as the need to add a new metric was > discovered when working on it. > > Thanks! > Sagar. > > On Thu, Feb 10, 2022 at 9:54 AM Sagar wrote: > > > Hi Guozhang, > > > > Sure. I will add it to the KIP. > > > > Thanks! > > Sagar. > > > > On Mon, Feb 7, 2022 at 6:22 AM Guozhang Wang wrote: > > > >> Since the PR is reopened and we are going to re-merged the fixed PRs, > what > >> about just adding that as part of the KIP as the addendum? > >> > >> On Fri, Feb 4, 2022 at 2:13 AM Sagar wrote: > >> > >> > Thanks Sophie/Guozhang. > >> > > >> > Yeah I could have amended the KIP but it slipped my mind when Guozhang > >> > proposed this in the PR. Later on, the PR was merged and KIP was > marked > >> as > >> > adopted so I thought I will write a new one. I know the PR had been > >> > reopened now :p . I dont have much preference on a new KIP v/s the > >> original > >> > one so anything is ok with me as well. > >> > > >> > I agree with the INFO part. I will make that change. > >> > > >> > Regarding task level, from my understanding, since every task's > >> > buffer/cache size might be different so if a certain task might be > >> > overshooting the limits then the task level metric might help people > to > >> > infer this. Also, thanks for the explanation Guozhang on why this > >> should be > >> > a task level metric. What are your thoughts on this @Sophie? > >> > > >> > Thanks! > >> > Sagar. > >> > > >> > > >> > On Fri, Feb 4, 2022 at 4:47 AM Guozhang Wang > >> wrote: > >> > > >> > > Thanks Sagar for proposing the KIP, and Sophie for sharing your > >> thoughts. > >> > > Here're my 2c: > >> > > > >> > > I think I agree with Sophie for making the two metrics (both the > added > >> > and > >> > > the newly proposed) on INFO level since we are always calculating > them > >> > > anyways. Regarding the level of the cache-size though, I'm thinking > a > >> bit > >> > > different with you two: today we do not actually keep that caches on > >> the > >> > > per-store level, but rather on the per-thread level, i.e. when the > >> cache > >> > is > >> > > full we would flush not only on the triggering state store but also > >> > > potentially on other state stores as well of the task that thread > >> owns. > >> > > This mechanism, in hindsight, is a bit weird and we have some > >> discussions > >> > > about refactoring that in the future already. Personally I'd like to > >> make > >> > > this new metric to be aligned with whatever our future design will > be. > >> > > > >> > > In the long run if we would not have a static assignment from tasks > to > >> > > threads, it may not make sense to keep a dedicated cache pool per > >> thread. > >> > > Instead all tasks will be dynamically sharing the globally > configured > >> max > >> > > cache size (dynamically here means, we would not just divide the > total > >> > size > >> > > by the num.tasks and then assign that to each task), and when a > cache > >> put > >> > > triggers the flushing because the sum now exceeds the global > >> configured > >> > > value, we would potentially flush all the cached records for that > >> task. > >> > If > >> > > this is the end stage, then I think keeping this metric at the task > >> level > >> > > is good. > >> > > > >> > > > >> > > > >> > > Guozhang > >> > > > >> > > > >> > > > >> > > > >> > > On Thu, Feb 3, 2022 at 10:15 AM Sophie Blee-Goldman > >> > > wrote: > >> > > > >> > > > Hey Sagar, thanks for the KIP! > >> > > > > >> > > > And yes, all metrics are considered part of the public API and > thus > >> > > require > >> > > > a KIP to add (or modify, etc...) Although in this particular case, > >> you > >> > > > could probably make a good case for just considering it as an > >> update to > >> > > the > >> > > > original KIP which added the analogous metric > >> > `input-buffer-bytes-total`. > >> > > > For things like this that weren't considered during the KIP > >> proposal > >> > but > >> > > > came up during the implementation or review, and are small changes > >> that > >> > > > would have made sense to include in that KIP had they been thought > >> of, > >> > > you > >> > > > can just send an update to the existing KIP's discussion and.or > >> voting > >> > > > thread that explains what you want to add or modify and maybe a > >> brief > >> > > > description why. > >> > > > > >> > > > It's always ok to make a new KIP when in doubt, but there are some > >> > cases > >> > > > where an update email is sufficient. If there are any concerns or > >> > > > suggestions that significantly expand the scope of the update, you > >> can > >> > > > always go create a new KIP and move the discussion there. > >> > > > > >> > > > I'd say you can feel free to proceed in whichever way you'd prefer > >> for > >> > > this > >> > > > new proposal -- it just needs to
Re: [DISCUSS] KIP-818: Introduce cache-size-bytes-total Task Level Metric
Hi All, As discussed above, this KIP would be discarded and the new metric proposed here would be added to KIP-770 as the need to add a new metric was discovered when working on it. Thanks! Sagar. On Thu, Feb 10, 2022 at 9:54 AM Sagar wrote: > Hi Guozhang, > > Sure. I will add it to the KIP. > > Thanks! > Sagar. > > On Mon, Feb 7, 2022 at 6:22 AM Guozhang Wang wrote: > >> Since the PR is reopened and we are going to re-merged the fixed PRs, what >> about just adding that as part of the KIP as the addendum? >> >> On Fri, Feb 4, 2022 at 2:13 AM Sagar wrote: >> >> > Thanks Sophie/Guozhang. >> > >> > Yeah I could have amended the KIP but it slipped my mind when Guozhang >> > proposed this in the PR. Later on, the PR was merged and KIP was marked >> as >> > adopted so I thought I will write a new one. I know the PR had been >> > reopened now :p . I dont have much preference on a new KIP v/s the >> original >> > one so anything is ok with me as well. >> > >> > I agree with the INFO part. I will make that change. >> > >> > Regarding task level, from my understanding, since every task's >> > buffer/cache size might be different so if a certain task might be >> > overshooting the limits then the task level metric might help people to >> > infer this. Also, thanks for the explanation Guozhang on why this >> should be >> > a task level metric. What are your thoughts on this @Sophie? >> > >> > Thanks! >> > Sagar. >> > >> > >> > On Fri, Feb 4, 2022 at 4:47 AM Guozhang Wang >> wrote: >> > >> > > Thanks Sagar for proposing the KIP, and Sophie for sharing your >> thoughts. >> > > Here're my 2c: >> > > >> > > I think I agree with Sophie for making the two metrics (both the added >> > and >> > > the newly proposed) on INFO level since we are always calculating them >> > > anyways. Regarding the level of the cache-size though, I'm thinking a >> bit >> > > different with you two: today we do not actually keep that caches on >> the >> > > per-store level, but rather on the per-thread level, i.e. when the >> cache >> > is >> > > full we would flush not only on the triggering state store but also >> > > potentially on other state stores as well of the task that thread >> owns. >> > > This mechanism, in hindsight, is a bit weird and we have some >> discussions >> > > about refactoring that in the future already. Personally I'd like to >> make >> > > this new metric to be aligned with whatever our future design will be. >> > > >> > > In the long run if we would not have a static assignment from tasks to >> > > threads, it may not make sense to keep a dedicated cache pool per >> thread. >> > > Instead all tasks will be dynamically sharing the globally configured >> max >> > > cache size (dynamically here means, we would not just divide the total >> > size >> > > by the num.tasks and then assign that to each task), and when a cache >> put >> > > triggers the flushing because the sum now exceeds the global >> configured >> > > value, we would potentially flush all the cached records for that >> task. >> > If >> > > this is the end stage, then I think keeping this metric at the task >> level >> > > is good. >> > > >> > > >> > > >> > > Guozhang >> > > >> > > >> > > >> > > >> > > On Thu, Feb 3, 2022 at 10:15 AM Sophie Blee-Goldman >> > > wrote: >> > > >> > > > Hey Sagar, thanks for the KIP! >> > > > >> > > > And yes, all metrics are considered part of the public API and thus >> > > require >> > > > a KIP to add (or modify, etc...) Although in this particular case, >> you >> > > > could probably make a good case for just considering it as an >> update to >> > > the >> > > > original KIP which added the analogous metric >> > `input-buffer-bytes-total`. >> > > > For things like this that weren't considered during the KIP >> proposal >> > but >> > > > came up during the implementation or review, and are small changes >> that >> > > > would have made sense to include in that KIP had they been thought >> of, >> > > you >> > > > can just send an update to the existing KIP's discussion and.or >> voting >> > > > thread that explains what you want to add or modify and maybe a >> brief >> > > > description why. >> > > > >> > > > It's always ok to make a new KIP when in doubt, but there are some >> > cases >> > > > where an update email is sufficient. If there are any concerns or >> > > > suggestions that significantly expand the scope of the update, you >> can >> > > > always go create a new KIP and move the discussion there. >> > > > >> > > > I'd say you can feel free to proceed in whichever way you'd prefer >> for >> > > this >> > > > new proposal -- it just needs to appear in some KIP somewhere, and >> have >> > > > given the community thew opportunity to discuss and provide feedback >> > on. >> > > > >> > > > On that note, I do have two suggestions: >> > > > >> > > > 1) since we need to measure the size of the cache (and the input >> > > buffer(s) >> > > > anyways, we may as well make `cache-size-bytes-total` -- and also >> the >>
Re: [DISCUSS] KIP-818: Introduce cache-size-bytes-total Task Level Metric
Hi Guozhang, Sure. I will add it to the KIP. Thanks! Sagar. On Mon, Feb 7, 2022 at 6:22 AM Guozhang Wang wrote: > Since the PR is reopened and we are going to re-merged the fixed PRs, what > about just adding that as part of the KIP as the addendum? > > On Fri, Feb 4, 2022 at 2:13 AM Sagar wrote: > > > Thanks Sophie/Guozhang. > > > > Yeah I could have amended the KIP but it slipped my mind when Guozhang > > proposed this in the PR. Later on, the PR was merged and KIP was marked > as > > adopted so I thought I will write a new one. I know the PR had been > > reopened now :p . I dont have much preference on a new KIP v/s the > original > > one so anything is ok with me as well. > > > > I agree with the INFO part. I will make that change. > > > > Regarding task level, from my understanding, since every task's > > buffer/cache size might be different so if a certain task might be > > overshooting the limits then the task level metric might help people to > > infer this. Also, thanks for the explanation Guozhang on why this should > be > > a task level metric. What are your thoughts on this @Sophie? > > > > Thanks! > > Sagar. > > > > > > On Fri, Feb 4, 2022 at 4:47 AM Guozhang Wang wrote: > > > > > Thanks Sagar for proposing the KIP, and Sophie for sharing your > thoughts. > > > Here're my 2c: > > > > > > I think I agree with Sophie for making the two metrics (both the added > > and > > > the newly proposed) on INFO level since we are always calculating them > > > anyways. Regarding the level of the cache-size though, I'm thinking a > bit > > > different with you two: today we do not actually keep that caches on > the > > > per-store level, but rather on the per-thread level, i.e. when the > cache > > is > > > full we would flush not only on the triggering state store but also > > > potentially on other state stores as well of the task that thread owns. > > > This mechanism, in hindsight, is a bit weird and we have some > discussions > > > about refactoring that in the future already. Personally I'd like to > make > > > this new metric to be aligned with whatever our future design will be. > > > > > > In the long run if we would not have a static assignment from tasks to > > > threads, it may not make sense to keep a dedicated cache pool per > thread. > > > Instead all tasks will be dynamically sharing the globally configured > max > > > cache size (dynamically here means, we would not just divide the total > > size > > > by the num.tasks and then assign that to each task), and when a cache > put > > > triggers the flushing because the sum now exceeds the global configured > > > value, we would potentially flush all the cached records for that task. > > If > > > this is the end stage, then I think keeping this metric at the task > level > > > is good. > > > > > > > > > > > > Guozhang > > > > > > > > > > > > > > > On Thu, Feb 3, 2022 at 10:15 AM Sophie Blee-Goldman > > > wrote: > > > > > > > Hey Sagar, thanks for the KIP! > > > > > > > > And yes, all metrics are considered part of the public API and thus > > > require > > > > a KIP to add (or modify, etc...) Although in this particular case, > you > > > > could probably make a good case for just considering it as an update > to > > > the > > > > original KIP which added the analogous metric > > `input-buffer-bytes-total`. > > > > For things like this that weren't considered during the KIP proposal > > but > > > > came up during the implementation or review, and are small changes > that > > > > would have made sense to include in that KIP had they been thought > of, > > > you > > > > can just send an update to the existing KIP's discussion and.or > voting > > > > thread that explains what you want to add or modify and maybe a brief > > > > description why. > > > > > > > > It's always ok to make a new KIP when in doubt, but there are some > > cases > > > > where an update email is sufficient. If there are any concerns or > > > > suggestions that significantly expand the scope of the update, you > can > > > > always go create a new KIP and move the discussion there. > > > > > > > > I'd say you can feel free to proceed in whichever way you'd prefer > for > > > this > > > > new proposal -- it just needs to appear in some KIP somewhere, and > have > > > > given the community thew opportunity to discuss and provide feedback > > on. > > > > > > > > On that note, I do have two suggestions: > > > > > > > > 1) since we need to measure the size of the cache (and the input > > > buffer(s) > > > > anyways, we may as well make `cache-size-bytes-total` -- and also the > > new > > > > input-buffer-bytes-total -- an INFO level metric. In general the more > > > > metrics the merrier, the only real reason for disabling some are if > > they > > > > have a performance impact or other cost that not everyone will want > to > > > pay. > > > > In this case we're already computing the value of these metrics, so > why > > > not > > > > expose it to the user as an INFO metric >
Re: [DISCUSS] KIP-818: Introduce cache-size-bytes-total Task Level Metric
Since the PR is reopened and we are going to re-merged the fixed PRs, what about just adding that as part of the KIP as the addendum? On Fri, Feb 4, 2022 at 2:13 AM Sagar wrote: > Thanks Sophie/Guozhang. > > Yeah I could have amended the KIP but it slipped my mind when Guozhang > proposed this in the PR. Later on, the PR was merged and KIP was marked as > adopted so I thought I will write a new one. I know the PR had been > reopened now :p . I dont have much preference on a new KIP v/s the original > one so anything is ok with me as well. > > I agree with the INFO part. I will make that change. > > Regarding task level, from my understanding, since every task's > buffer/cache size might be different so if a certain task might be > overshooting the limits then the task level metric might help people to > infer this. Also, thanks for the explanation Guozhang on why this should be > a task level metric. What are your thoughts on this @Sophie? > > Thanks! > Sagar. > > > On Fri, Feb 4, 2022 at 4:47 AM Guozhang Wang wrote: > > > Thanks Sagar for proposing the KIP, and Sophie for sharing your thoughts. > > Here're my 2c: > > > > I think I agree with Sophie for making the two metrics (both the added > and > > the newly proposed) on INFO level since we are always calculating them > > anyways. Regarding the level of the cache-size though, I'm thinking a bit > > different with you two: today we do not actually keep that caches on the > > per-store level, but rather on the per-thread level, i.e. when the cache > is > > full we would flush not only on the triggering state store but also > > potentially on other state stores as well of the task that thread owns. > > This mechanism, in hindsight, is a bit weird and we have some discussions > > about refactoring that in the future already. Personally I'd like to make > > this new metric to be aligned with whatever our future design will be. > > > > In the long run if we would not have a static assignment from tasks to > > threads, it may not make sense to keep a dedicated cache pool per thread. > > Instead all tasks will be dynamically sharing the globally configured max > > cache size (dynamically here means, we would not just divide the total > size > > by the num.tasks and then assign that to each task), and when a cache put > > triggers the flushing because the sum now exceeds the global configured > > value, we would potentially flush all the cached records for that task. > If > > this is the end stage, then I think keeping this metric at the task level > > is good. > > > > > > > > Guozhang > > > > > > > > > > On Thu, Feb 3, 2022 at 10:15 AM Sophie Blee-Goldman > > wrote: > > > > > Hey Sagar, thanks for the KIP! > > > > > > And yes, all metrics are considered part of the public API and thus > > require > > > a KIP to add (or modify, etc...) Although in this particular case, you > > > could probably make a good case for just considering it as an update to > > the > > > original KIP which added the analogous metric > `input-buffer-bytes-total`. > > > For things like this that weren't considered during the KIP proposal > but > > > came up during the implementation or review, and are small changes that > > > would have made sense to include in that KIP had they been thought of, > > you > > > can just send an update to the existing KIP's discussion and.or voting > > > thread that explains what you want to add or modify and maybe a brief > > > description why. > > > > > > It's always ok to make a new KIP when in doubt, but there are some > cases > > > where an update email is sufficient. If there are any concerns or > > > suggestions that significantly expand the scope of the update, you can > > > always go create a new KIP and move the discussion there. > > > > > > I'd say you can feel free to proceed in whichever way you'd prefer for > > this > > > new proposal -- it just needs to appear in some KIP somewhere, and have > > > given the community thew opportunity to discuss and provide feedback > on. > > > > > > On that note, I do have two suggestions: > > > > > > 1) since we need to measure the size of the cache (and the input > > buffer(s) > > > anyways, we may as well make `cache-size-bytes-total` -- and also the > new > > > input-buffer-bytes-total -- an INFO level metric. In general the more > > > metrics the merrier, the only real reason for disabling some are if > they > > > have a performance impact or other cost that not everyone will want to > > pay. > > > In this case we're already computing the value of these metrics, so why > > not > > > expose it to the user as an INFO metric > > > 2) I think it would be both more natural and easier to implement if > this > > > was a store-level metric. A single task could in theory contain > multiple > > > physical state store caches and we would have to roll them up to report > > the > > > size for the task as a whole. It's additional work just to lose some > > > information that the user may want to have > > > > >
Re: [DISCUSS] KIP-818: Introduce cache-size-bytes-total Task Level Metric
Thanks Sophie/Guozhang. Yeah I could have amended the KIP but it slipped my mind when Guozhang proposed this in the PR. Later on, the PR was merged and KIP was marked as adopted so I thought I will write a new one. I know the PR had been reopened now :p . I dont have much preference on a new KIP v/s the original one so anything is ok with me as well. I agree with the INFO part. I will make that change. Regarding task level, from my understanding, since every task's buffer/cache size might be different so if a certain task might be overshooting the limits then the task level metric might help people to infer this. Also, thanks for the explanation Guozhang on why this should be a task level metric. What are your thoughts on this @Sophie? Thanks! Sagar. On Fri, Feb 4, 2022 at 4:47 AM Guozhang Wang wrote: > Thanks Sagar for proposing the KIP, and Sophie for sharing your thoughts. > Here're my 2c: > > I think I agree with Sophie for making the two metrics (both the added and > the newly proposed) on INFO level since we are always calculating them > anyways. Regarding the level of the cache-size though, I'm thinking a bit > different with you two: today we do not actually keep that caches on the > per-store level, but rather on the per-thread level, i.e. when the cache is > full we would flush not only on the triggering state store but also > potentially on other state stores as well of the task that thread owns. > This mechanism, in hindsight, is a bit weird and we have some discussions > about refactoring that in the future already. Personally I'd like to make > this new metric to be aligned with whatever our future design will be. > > In the long run if we would not have a static assignment from tasks to > threads, it may not make sense to keep a dedicated cache pool per thread. > Instead all tasks will be dynamically sharing the globally configured max > cache size (dynamically here means, we would not just divide the total size > by the num.tasks and then assign that to each task), and when a cache put > triggers the flushing because the sum now exceeds the global configured > value, we would potentially flush all the cached records for that task. If > this is the end stage, then I think keeping this metric at the task level > is good. > > > > Guozhang > > > > > On Thu, Feb 3, 2022 at 10:15 AM Sophie Blee-Goldman > wrote: > > > Hey Sagar, thanks for the KIP! > > > > And yes, all metrics are considered part of the public API and thus > require > > a KIP to add (or modify, etc...) Although in this particular case, you > > could probably make a good case for just considering it as an update to > the > > original KIP which added the analogous metric `input-buffer-bytes-total`. > > For things like this that weren't considered during the KIP proposal but > > came up during the implementation or review, and are small changes that > > would have made sense to include in that KIP had they been thought of, > you > > can just send an update to the existing KIP's discussion and.or voting > > thread that explains what you want to add or modify and maybe a brief > > description why. > > > > It's always ok to make a new KIP when in doubt, but there are some cases > > where an update email is sufficient. If there are any concerns or > > suggestions that significantly expand the scope of the update, you can > > always go create a new KIP and move the discussion there. > > > > I'd say you can feel free to proceed in whichever way you'd prefer for > this > > new proposal -- it just needs to appear in some KIP somewhere, and have > > given the community thew opportunity to discuss and provide feedback on. > > > > On that note, I do have two suggestions: > > > > 1) since we need to measure the size of the cache (and the input > buffer(s) > > anyways, we may as well make `cache-size-bytes-total` -- and also the new > > input-buffer-bytes-total -- an INFO level metric. In general the more > > metrics the merrier, the only real reason for disabling some are if they > > have a performance impact or other cost that not everyone will want to > pay. > > In this case we're already computing the value of these metrics, so why > not > > expose it to the user as an INFO metric > > 2) I think it would be both more natural and easier to implement if this > > was a store-level metric. A single task could in theory contain multiple > > physical state store caches and we would have to roll them up to report > the > > size for the task as a whole. It's additional work just to lose some > > information that the user may want to have > > > > Let me know if anything here doesn't make sense or needs clarification. > And > > thanks for the quick followup to get this 2nd metric! > > > > -Sophie > > > > On Sat, Jan 29, 2022 at 4:27 AM Sagar wrote: > > > > > Hi All, > > > > > > I would like to open a discussion thread on the following KIP: > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=186878390 > > > > > >
Re: [DISCUSS] KIP-818: Introduce cache-size-bytes-total Task Level Metric
Thanks Sagar for proposing the KIP, and Sophie for sharing your thoughts. Here're my 2c: I think I agree with Sophie for making the two metrics (both the added and the newly proposed) on INFO level since we are always calculating them anyways. Regarding the level of the cache-size though, I'm thinking a bit different with you two: today we do not actually keep that caches on the per-store level, but rather on the per-thread level, i.e. when the cache is full we would flush not only on the triggering state store but also potentially on other state stores as well of the task that thread owns. This mechanism, in hindsight, is a bit weird and we have some discussions about refactoring that in the future already. Personally I'd like to make this new metric to be aligned with whatever our future design will be. In the long run if we would not have a static assignment from tasks to threads, it may not make sense to keep a dedicated cache pool per thread. Instead all tasks will be dynamically sharing the globally configured max cache size (dynamically here means, we would not just divide the total size by the num.tasks and then assign that to each task), and when a cache put triggers the flushing because the sum now exceeds the global configured value, we would potentially flush all the cached records for that task. If this is the end stage, then I think keeping this metric at the task level is good. Guozhang On Thu, Feb 3, 2022 at 10:15 AM Sophie Blee-Goldman wrote: > Hey Sagar, thanks for the KIP! > > And yes, all metrics are considered part of the public API and thus require > a KIP to add (or modify, etc...) Although in this particular case, you > could probably make a good case for just considering it as an update to the > original KIP which added the analogous metric `input-buffer-bytes-total`. > For things like this that weren't considered during the KIP proposal but > came up during the implementation or review, and are small changes that > would have made sense to include in that KIP had they been thought of, you > can just send an update to the existing KIP's discussion and.or voting > thread that explains what you want to add or modify and maybe a brief > description why. > > It's always ok to make a new KIP when in doubt, but there are some cases > where an update email is sufficient. If there are any concerns or > suggestions that significantly expand the scope of the update, you can > always go create a new KIP and move the discussion there. > > I'd say you can feel free to proceed in whichever way you'd prefer for this > new proposal -- it just needs to appear in some KIP somewhere, and have > given the community thew opportunity to discuss and provide feedback on. > > On that note, I do have two suggestions: > > 1) since we need to measure the size of the cache (and the input buffer(s) > anyways, we may as well make `cache-size-bytes-total` -- and also the new > input-buffer-bytes-total -- an INFO level metric. In general the more > metrics the merrier, the only real reason for disabling some are if they > have a performance impact or other cost that not everyone will want to pay. > In this case we're already computing the value of these metrics, so why not > expose it to the user as an INFO metric > 2) I think it would be both more natural and easier to implement if this > was a store-level metric. A single task could in theory contain multiple > physical state store caches and we would have to roll them up to report the > size for the task as a whole. It's additional work just to lose some > information that the user may want to have > > Let me know if anything here doesn't make sense or needs clarification. And > thanks for the quick followup to get this 2nd metric! > > -Sophie > > On Sat, Jan 29, 2022 at 4:27 AM Sagar wrote: > > > Hi All, > > > > I would like to open a discussion thread on the following KIP: > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=186878390 > > > > PS: This is about introducing a new metric and I am assuming that it > > requires a KIP. If that isn't the case, I can close it. > > > > Thanks! > > Sagar. > > > -- -- Guozhang
Re: [DISCUSS] KIP-818: Introduce cache-size-bytes-total Task Level Metric
Hey Sagar, thanks for the KIP! And yes, all metrics are considered part of the public API and thus require a KIP to add (or modify, etc...) Although in this particular case, you could probably make a good case for just considering it as an update to the original KIP which added the analogous metric `input-buffer-bytes-total`. For things like this that weren't considered during the KIP proposal but came up during the implementation or review, and are small changes that would have made sense to include in that KIP had they been thought of, you can just send an update to the existing KIP's discussion and.or voting thread that explains what you want to add or modify and maybe a brief description why. It's always ok to make a new KIP when in doubt, but there are some cases where an update email is sufficient. If there are any concerns or suggestions that significantly expand the scope of the update, you can always go create a new KIP and move the discussion there. I'd say you can feel free to proceed in whichever way you'd prefer for this new proposal -- it just needs to appear in some KIP somewhere, and have given the community thew opportunity to discuss and provide feedback on. On that note, I do have two suggestions: 1) since we need to measure the size of the cache (and the input buffer(s) anyways, we may as well make `cache-size-bytes-total` -- and also the new input-buffer-bytes-total -- an INFO level metric. In general the more metrics the merrier, the only real reason for disabling some are if they have a performance impact or other cost that not everyone will want to pay. In this case we're already computing the value of these metrics, so why not expose it to the user as an INFO metric 2) I think it would be both more natural and easier to implement if this was a store-level metric. A single task could in theory contain multiple physical state store caches and we would have to roll them up to report the size for the task as a whole. It's additional work just to lose some information that the user may want to have Let me know if anything here doesn't make sense or needs clarification. And thanks for the quick followup to get this 2nd metric! -Sophie On Sat, Jan 29, 2022 at 4:27 AM Sagar wrote: > Hi All, > > I would like to open a discussion thread on the following KIP: > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=186878390 > > PS: This is about introducing a new metric and I am assuming that it > requires a KIP. If that isn't the case, I can close it. > > Thanks! > Sagar. >
[DISCUSS] KIP-818: Introduce cache-size-bytes-total Task Level Metric
Hi All, I would like to open a discussion thread on the following KIP: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=186878390 PS: This is about introducing a new metric and I am assuming that it requires a KIP. If that isn't the case, I can close it. Thanks! Sagar.