Hi, Zhenya! Users can use it too; I see nothing wrong with having two metrics.
16.02.2021, 16:50, "Zhenya Stanilovsky" <[email protected]>:
> Kirill, is it good practice to have metrics for internal use? I don't think
> so.
> +1 with Nikolay: size is more readable than an abstract segment count.
>
>> Hi, Nikolay!
>>
>> For internal use, leave the metric that I propose and also add the metric:
>> Count of bytes logged in WAL. Why not "written"? Because for mmap we
>> cannot track when the physical writing will occur.
>>
>> 16.02.2021, 15:42, "Nikolay Izhikov" <[email protected]>:
>>> Kirill.
>>>
>>> «Count of segments» is a very internal thing for a regular user.
>>> A regular user doesn't want to know about such things.
>>>
>>> You suggest calculating the number (space required to store WAL) with
>>> some kind of rough calculation, and with the «Count of bytes written in
>>> WAL» we can have the exact number without any assumptions or calculations.
>>>
>>> Moreover, «Count of bytes written in WAL» is independent of the internal
>>> WAL implementation.
>>>
>>> So, I think an exact number is always better to have than some
>>> approximation.
>>>
>>> What do you think?
>>>
>>>> On 15 Feb 2021, at 20:45, ткаленко кирилл <[email protected]>
>>>> wrote:
>>>>
>>>> Hi, Nikolay!
>>>>
>>>> We set the number of segments in the working directory, and we also
>>>> delete by segment, so it seems that this is a matter of usability. I
>>>> prefer to stay with my own version: this is a simple metric that does
>>>> not hurt, and you can add more as needed.
>>>>
>>>> 15.02.2021, 17:10, "Nikolay Izhikov" <[email protected]>:
>>>>> My point is that «count of files» is a meaningless number.
>>>>> And «count of bytes written to the files» is a useful number to know and
>>>>> use for capacity planning.
>>>>>
>>>>>> On 15 Feb 2021, at 15:59, ткаленко кирилл <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Hi, Nikolay!
>>>>>>
>>>>>> There may be a number (count of segments * segment size) or there may
>>>>>> be a count of segments, whichever is more convenient for the user.
>>>>>>
>>>>>> 15.02.2021, 13:14, "Nikolay Izhikov" <[email protected]>:
>>>>>>> Hello, Kirill.
>>>>>>>
>>>>>>> Thanks for the answers.
>>>>>>> Now I understand your intentions.
>>>>>>>
>>>>>>>> It also seems that it will be more natural to operate not just
>>>>>>>> in bytes but in multiples of a segment.
>>>>>>>
>>>>>>> Can't agree here.
>>>>>>> From my point of view, it's better to know the exact number, not just
>>>>>>> «count of segments».
>>>>>>>
>>>>>>>> On 15 Feb 2021, at 13:00, ткаленко кирилл <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hello, Nikolay!
>>>>>>>>
>>>>>>>> The period of one day (24h) seems more natural; you can take more
>>>>>>>> or less. I think that one day may not be enough, and it is worth
>>>>>>>> collecting the metric for several days, for example a week. Yes,
>>>>>>>> the total size of the segments may not be
>>>>>>>> DataStorageConfiguration#getMaxWalArchiveSize, but for capacity
>>>>>>>> planning, accuracy is not so important to us, since the load can
>>>>>>>> always change; it will hurt users more if we overflow the archive and
>>>>>>>> the node is not able to start. So more is better than less here. It
>>>>>>>> also seems that it will be more natural to operate not just in bytes
>>>>>>>> but in multiples of a segment.
>>>>>>>>
>>>>>>>> In separate threads, you can discuss the metric that you propose
>>>>>>>> about page memory and index estimates.
>>>>>>>>
>>>>>>>> 14.02.2021, 11:54, "Nikolay Izhikov" <[email protected]>:
>>>>>>>>> Hello, Kirill
>>>>>>>>>
>>>>>>>>> Your conclusions are still not clear to me.
>>>>>>>>>
>>>>>>>>>> It is not possible for us to estimate how much space a user
>>>>>>>>>> will need in the archive so as not to overflow it under its load
>>>>>>>>>> We take the maximum 44 and multiply it by a
>>>>>>>>>> DataStorageConfiguration#getWalSegmentSize
>>>>>>>>>
>>>>>>>>> Why do you take a single day (24h) as the standard period? Is there
>>>>>>>>> any rationale behind this?
>>>>>>>>>
>>>>>>>>> 1. We have the `walAutoArchiveAfterInactivity` property. So a WAL
>>>>>>>>> segment can have a size less than the maximum.
>>>>>>>>> 2. For the CDC feature I want to introduce a «WAL force rollover
>>>>>>>>> timeout» to make data available for a consumer in a guaranteed period
>>>>>>>>> [1].
>>>>>>>>>
>>>>>>>>> Why does the user want to estimate those numbers in the first
>>>>>>>>> place?
>>>>>>>>> Are we talking about some kind of capacity planning?
>>>>>>>>>
>>>>>>>>> If yes, then maybe it will be better to have a metric for a count
>>>>>>>>> of bytes written in the WAL?
>>>>>>>>> With it, we will have the exact amount of space we need for WAL.
>>>>>>>>>
>>>>>>>>> How should the user estimate capacity for page memory and indexes?
>>>>>>>>>
>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-13582
>>>>>>>>>
>>>>>>>>>> On 14 Feb 2021, at 09:48, ткаленко кирилл
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi, Nikolay!
>>>>>>>>>>
>>>>>>>>>> The user will be able to take the getLastArchivedSegmentIndex
>>>>>>>>>> every day and record it, say, for several days.
>>>>>>>>>>
>>>>>>>>>> For example, when starting the application, the
>>>>>>>>>> getLastArchivedSegmentIndex is 0, then at the end of the first day
>>>>>>>>>> the value will be 30, at the end of the second 55, and at the end of
>>>>>>>>>> the third 99.
>>>>>>>>>> It turns out that 30 segments were used on the first day, 25
>>>>>>>>>> on the second and 44 on the third. We take the maximum 44 and
>>>>>>>>>> multiply it by a DataStorageConfiguration#getWalSegmentSize, and we
>>>>>>>>>> get the possible maximum with which an archive overflow is the least
>>>>>>>>>> likely. If the user uses compression, then it can be subtracted from
>>>>>>>>>> the result (result * getMaxSizeCompressedArchivedSegment).
>>>>>>>>>>
>>>>>>>>>> 13.02.2021, 10:47, "Nikolay Izhikov" <[email protected]>:
>>>>>>>>>>> Hello, Kirill.
>>>>>>>>>>>
>>>>>>>>>>>> It is not possible for us to estimate how much space a user
>>>>>>>>>>>> will need in the archive so as not to overflow it under its load
>>>>>>>>>>>
>>>>>>>>>>> It is still not clear to me why we need those metrics.
>>>>>>>>>>> Can you please write down a specific scenario: how will the user
>>>>>>>>>>> use these metrics to estimate the required WAL volume?
>>>>>>>>>>>
>>>>>>>>>>>> On 12 Feb 2021, at 19:35, ткаленко кирилл
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi, Nikolay!
>>>>>>>>>>>>
>>>>>>>>>>>> It is not possible for us to estimate how much space a user
>>>>>>>>>>>> will need in the archive so as not to overflow it under its load.
>>>>>>>>>>>> And the proposed metrics will allow you to make a rough estimate.
>>>>>>>>>>>>
>>>>>>>>>>>> 12.02.2021, 17:23, "Nikolay Izhikov" <[email protected]>:
>>>>>>>>>>>>> Hello, Kirill.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you please clarify: what question about WAL does the user
>>>>>>>>>>>>> have in mind?
>>>>>>>>>>>>> And what answers does he (or she) get with these new metrics?
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12 Feb 2021, at 14:26, ткаленко кирилл
>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi everyone!
>>>>>>>>>>>>>> At the moment, I have not found a way to
>>>>>>>>>>>>>> estimate how many WAL segments fall into the archive, say per
>>>>>>>>>>>>>> day.
>>>>>>>>>>>>>> So I created a ticket
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-14170 to add a
>>>>>>>>>>>>>> couple of new metrics.
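
For reference, the estimation procedure described in the thread (record getLastArchivedSegmentIndex once a day, take the largest day-over-day difference, multiply it by DataStorageConfiguration#getWalSegmentSize) can be sketched roughly as below. This is only an illustration under assumptions: the class and method names are hypothetical, the 64 MiB segment size is an assumed value, the samples are the example numbers from the message (0, 30, 55, 99), and no Ignite API is called. As noted in the thread, segments can be archived before they are full, so the result is an upper-bound style estimate rather than an exact byte count.

```java
import java.util.List;

/** Hypothetical helper: rough WAL archive sizing from daily index samples. */
class WalArchiveEstimate {
    /** Largest day-over-day growth of the last archived segment index. */
    public static long maxSegmentsPerDay(List<Long> dailyIndexes) {
        long max = 0;
        for (int i = 1; i < dailyIndexes.size(); i++) {
            long perDay = dailyIndexes.get(i) - dailyIndexes.get(i - 1);
            max = Math.max(max, perDay);
        }
        return max;
    }

    /** Busiest day multiplied by the full segment size, in bytes. */
    public static long estimateBytes(List<Long> dailyIndexes, long walSegmentSizeBytes) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact(maxSegmentsPerDay(dailyIndexes), walSegmentSizeBytes);
    }

    public static void main(String[] args) {
        // Values sampled at start and at the end of days 1..3 (example from the thread).
        List<Long> samples = List.of(0L, 30L, 55L, 99L);
        long segmentSize = 64L * 1024 * 1024; // assumed value, for illustration only

        System.out.println("max segments per day: " + maxSegmentsPerDay(samples));
        System.out.println("estimated bytes per day: " + estimateBytes(samples, segmentSize));
    }
}
```

With the sample values, the per-day differences are 30, 25 and 44, so the sketch picks 44 segments as the busiest day. A "bytes written to WAL" metric, as proposed by Nikolay, would make this multiplication unnecessary.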
