> I propose to stick with a cache-group level metric (e.g.
> getIndexBuildProgress)

+1

> that returns a float from 0 to 1, which is calculated as [processedKeys] /
> [localCacheSize].

From my point of view, we shouldn’t do calculations on the Ignite side if we can avoid it.
I’d rather provide two separate metrics - processedKeys and localCacheSize.

> On 11 Aug 2020, at 16:26, Ivan Rakov <ivan.glu...@gmail.com> wrote:
>
>>
>> As a compromise, I can add jmx methods (rebuilding indexes in the process
>> and the percentage of rebuilding) for the entire node, but I tried to find
>> a suitable place and did not find it, tell me where to add it?
>
> I have checked existing JMX beans. To be honest, I struggle to find a
> suitable place as well.
> We have ClusterMetrics that may represent the state of a local node, but
> this class is also used for aggregated cluster metrics. I can't propose a
> reasonable way to merge percentages from different nodes.
> On the other hand, total index rebuild for all caches isn't a common
> scenario. It's either performed after manual index.bin removal or after
> index creation, both operations are performed on cache / cache-group level.
> Also, all other similar metrics are provided on cache-group level.
>
> I propose to stick with a cache-group level metric (e.g.
> getIndexBuildProgress) that returns a float from 0 to 1, which is
> calculated as [processedKeys] / [localCacheSize]. Even if a user handles
> metrics through Zabbix, I anticipate that he'll perform this calculation on
> his own in order to estimate progress. Let's help him a bit and perform it
> on the system side.
> If a per-group percentage metric is present, I
> think getIndexRebuildKeyProcessed becomes redundant.
>
> On Tue, Aug 11, 2020 at 8:20 AM ткаленко кирилл <tkalkir...@yandex.ru>
> wrote:
>
>> Hi, Ivan!
>>
>> What precision would be sufficient?
>>> If the progress is very slow, I don't see issues with tracking it if the
>>> percentage float has enough precision.
>>
>> I think we can add a mention of how to get the cache size.
>>> 1. Gain an understanding that local cache size
>>> (CacheMetricsImpl#getCacheSize) should be used as a 100% milestone (it
>>> is mentioned neither in javadoc nor in JMX method description).
>>
>> Do you think users collect metrics with their hands? I think this is done
>> by other systems, such as zabbix.
>>> 2. Manually calculate the sum of all metrics and divide it by the sum of
>>> all cache sizes.
>>
>> As a compromise, I can add jmx methods (rebuilding indexes in the process
>> and the percentage of rebuilding) for the entire node, but I tried to find
>> a suitable place and did not find it, tell me where to add it?
>>> On the other hand, % of index rebuild progress is self-descriptive. I don't
>>> understand why we tend to make user's life harder.
>>
>> 10.08.2020, 21:57, "Ivan Rakov" <ivan.glu...@gmail.com>:
>>>> This metric can be used only for local node, to get size of cache use
>>>> org.apache.ignite.internal.processors.cache.CacheMetricsImpl#getCacheSize.
>>>
>>> Got it, agree.
>>>
>>>> If there is a lot of data in node that can be rebuilt, percentage may
>>>> change very rarely and may not give an estimate of how much time is left.
>>>> If we see for example that 50_000 keys are rebuilt once a minute, and we
>>>> have 1_000_000_000 keys, then we can have an approximate estimate. What do
>>>> you think of that?
>>>
>>> If the progress is very slow, I don't see issues with tracking it if the
>>> percentage float has enough precision.
>>> Still, usability of the metric concerns me. In order to estimate remaining
>>> time of index rebuild, user should:
>>> 1. Gain an understanding that local cache size
>>> (CacheMetricsImpl#getCacheSize) should be used as a 100% milestone (it
>>> is mentioned neither in javadoc nor in JMX method description).
>>> 2. Manually calculate the sum of all metrics and divide it by the sum of
>>> all cache sizes.
>>> On the other hand, % of index rebuild progress is self-descriptive. I don't
>>> understand why we tend to make user's life harder.
>>>
>>> --
>>> Best regards,
>>> Ivan
>>>
>>> On Mon, Aug 10, 2020 at 8:53 PM ткаленко кирилл <tkalkir...@yandex.ru>
>>> wrote:
>>>
>>>> Hi, Ivan!
>>>>
>>>> For this you can use
>>>> org.apache.ignite.cache.CacheMetrics#IsIndexRebuildInProgress
>>>>> How can a local number of processed keys help us to understand when
>>>>> index rebuild will be finished?
>>>>
>>>> This metric can be used only for local node, to get size of cache use
>>>> org.apache.ignite.internal.processors.cache.CacheMetricsImpl#getCacheSize.
>>>>> We can't compare metric value with cache.size(). First one is node-local,
>>>>> while cache size covers all partitions in the cluster.
>>>>
>>>> If there is a lot of data in node that can be rebuilt, percentage may
>>>> change very rarely and may not give an estimate of how much time is left.
>>>> If we see for example that 50_000 keys are rebuilt once a minute, and we
>>>> have 1_000_000_000 keys, then we can have an approximate estimate. What do
>>>> you think of that?
>>>>> I find one single metric much more usable. It would be perfect if metric
>>>>> value is represented in percentage, e.g. current progress of local node
>>>>> index rebuild is 60%.
>>>>
>>>> 10.08.2020, 19:11, "Ivan Rakov" <ivan.glu...@gmail.com>:
>>>>> Folks,
>>>>>
>>>>> Sorry for coming late to the party. I've taken a look at this issue during
>>>>> review.
>>>>>
>>>>> How can a local number of processed keys help us to understand when
>>>>> index rebuild will be finished?
>>>>> We can't compare metric value with cache.size(). First one is node-local,
>>>>> while cache size covers all partitions in the cluster.
>>>>> Also, I don't understand why we need to keep separate metrics for all
>>>>> caches.
>>>>> Of course, the metric becomes more fair, but it is obviously harder to
>>>>> draw conclusions on whether "the index rebuild" process is over (and the
>>>>> cluster is ready to process queries quickly).
>>>>>
>>>>> I find one single metric much more usable. It would be perfect if metric
>>>>> value is represented in percentage, e.g. current progress of local node
>>>>> index rebuild is 60%.
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Ivan
>>>>>
>>>>> On Fri, Jul 24, 2020 at 1:35 PM Stanislav Lukyanov <stanlukya...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Got it. I thought that index building and index rebuilding are essentially
>>>>>> the same,
>>>>>> but now I see that they are different: index rebuilding cares about all
>>>>>> indexes at once while index building cares about particular ones.
>>>>>>
>>>>>> Kirill's approach sounds good.
>>>>>>
>>>>>> Stan
>>>>>>
>>>>>>> On 20 Jul 2020, at 14:54, Alexey Goncharuk <alexey.goncha...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Stan,
>>>>>>>
>>>>>>> Currently we never build indexes one-by-one - we always use a cache data
>>>>>>> row visitor which either updates all indexes (see IndexRebuildFullClosure)
>>>>>>> or updates a set of all indexes that need to catch up (see
>>>>>>> IndexRebuildPartialClosure). Given that, I do not see any need for
>>>>>>> per-index rebuild status as this status will be updated for all outdated
>>>>>>> indexes simultaneously.
>>>>>>>
>>>>>>> Kirill's approach for the total number of processed keys per cache seems
>>>>>>> reasonable to me.
>>>>>>>
>>>>>>> --AG
>>>>>>>
>>>>>>> Fri, 3 Jul 2020 at 10:12, ткаленко кирилл <tkalkir...@yandex.ru>:
>>>>>>>
>>>>>>>> Hi, Stan!
>>>>>>>>
>>>>>>>> Perhaps it is worth clarifying what exactly I wanted to say.
>>>>>>>> Now we have 2 processes: building and rebuilding indexes.
>>>>>>>>
>>>>>>>> At the moment, we have some metrics for rebuilding indexes:
>>>>>>>> "IsIndexRebuildInProgress", "IndexBuildCountPartitionsLeft".
>>>>>>>>
>>>>>>>> I suggest adding another metric "Indexrebuildkeyprocessed", which will
>>>>>>>> allow you to determine how many records are left to rebuild for cache.
>>>>>>>>
>>>>>>>> I think your comments are more about building an index that may need more
>>>>>>>> metrics, but I think you should do it in a separate ticket.
>>>>>>>>
>>>>>>>> 03.07.2020, 03:09, "Stanislav Lukyanov" <stanlukya...@gmail.com>:
>>>>>>>>> If multiple indexes are to be built "number of indexed keys" metric may
>>>>>>>>> be misleading.
>>>>>>>>>
>>>>>>>>> As a cluster admin, I'd like to know:
>>>>>>>>> - Are all indexes ready on a node?
>>>>>>>>> - How many indexes are to be built?
>>>>>>>>> - How much resources are used by the index building (how many threads
>>>>>>>>> are used)?
>>>>>>>>> - Which index(es?) is being built right now?
>>>>>>>>> - How much time until the current (single) index building finishes? Here
>>>>>>>>> "time" can be a lot of things: partitions, entries, percent of the cache,
>>>>>>>>> minutes and hours
>>>>>>>>> - How much time until all indexes are built?
>>>>>>>>> - How much does it take to build each of my indexes / a single index of
>>>>>>>>> my cache on average?
>>>>>>>>>
>>>>>>>>> I think we need a set of metrics and/or log messages to solve all of
>>>>>>>>> these questions.
>>>>>>>>> I imagine something like:
>>>>>>>>> - numberOfIndexesToBuild
>>>>>>>>> - a standard set of metrics on the index building thread pool (do we
>>>>>>>>> already have it?)
>>>>>>>>> - currentlyBuiltIndexName (assuming we only build one at a time which is
>>>>>>>>> probably not true)
>>>>>>>>> - for the "time" metrics I think percentage might be the best as it's
>>>>>>>>> the easiest to understand; we may add multiple metrics though.
>>>>>>>>> - For "time per each index" I'd add detailed log messages stating how
>>>>>>>>> long did it take to build a particular index
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Stan
>>>>>>>>>
>>>>>>>>>> On 26 Jun 2020, at 12:49, ткаленко кирилл <tkalkir...@yandex.ru>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi, Igniters.
>>>>>>>>>>
>>>>>>>>>> I would like to know if it is possible to estimate how much the index
>>>>>>>>>> rebuild will take?
>>>>>>>>>>
>>>>>>>>>> At the moment, I have found the following metrics [1] and [2] and
>>>>>>>>>> since the rebuild is based on caches, I think it would be useful to know
>>>>>>>>>> how many records are processed in indexing. This way we can estimate how
>>>>>>>>>> long we have to wait for the index to be rebuilt by subtracting [3] and how
>>>>>>>>>> many records are indexed.
>>>>>>>>>>
>>>>>>>>>> I think we should add this metric [4].
>>>>>>>>>>
>>>>>>>>>> Comments, suggestions?
>>>>>>>>>>
>>>>>>>>>> [1] - https://issues.apache.org/jira/browse/IGNITE-12184
>>>>>>>>>> [2] - org.apache.ignite.internal.processors.cache.CacheGroupMetricsImpl#idxBuildCntPartitionsLeft
>>>>>>>>>> [3] - org.apache.ignite.cache.CacheMetrics#getCacheSize
>>>>>>>>>> [4] - org.apache.ignite.cache.CacheMetrics#getNumberIndexedKeys
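
For reference, the two calculations debated in this thread (progress as [processedKeys] / [localCacheSize], and a remaining-time estimate from the observed rebuild rate) can be sketched roughly as follows. This is only an illustration: the class and method names are hypothetical and not part of the Ignite API, and the inputs are assumed to be the node-local processed-key count and the node-local cache size read from whatever metrics end up being exposed.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical helper, not an Ignite class: it only shows the arithmetic
// a monitoring system (or the metric itself) would have to perform.
final class IndexRebuildEstimate {
    /** Fraction of node-local keys already processed, in the range [0, 1]. */
    static double progress(long processedKeys, long localCacheSize) {
        if (localCacheSize <= 0)
            return 1.0; // assumption: an empty local cache has nothing left to rebuild

        return Math.min(1.0, (double) processedKeys / localCacheSize);
    }

    /**
     * Estimates the remaining rebuild time from two samples of the processed-key
     * counter taken sampleInterval apart. Returns empty if no progress was observed.
     */
    static Optional<Duration> estimateRemaining(long processedBefore, long processedNow,
        Duration sampleInterval, long localCacheSize) {
        long delta = processedNow - processedBefore;
        long intervalMillis = sampleInterval.toMillis();

        if (delta <= 0 || intervalMillis <= 0)
            return Optional.empty();

        long remainingKeys = Math.max(0, localCacheSize - processedNow);

        // remaining time = remaining keys / (keys per millisecond)
        long remainingMillis = Math.round((double) remainingKeys / delta * intervalMillis);

        return Optional.of(Duration.ofMillis(remainingMillis));
    }

    public static void main(String[] args) {
        // Numbers from the thread: 50_000 keys per minute, 1_000_000_000 keys in total.
        long total = 1_000_000_000L;

        System.out.printf("progress: %.6f%n", progress(50_000, total));

        estimateRemaining(0, 50_000, Duration.ofMinutes(1), total)
            .ifPresent(eta -> System.out.println("remaining minutes: " + eta.toMinutes()));
    }
}
```

With the numbers above the progress fraction is tiny (0.00005), while the rate-based estimate comes to 19,999 minutes, which is the case Kirill raised: on large caches a percentage alone changes slowly, so the raw counter plus the cache size may still be useful alongside it.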