Hi, Zhenya!

Users can also use it; I see nothing wrong with having two metrics.

16.02.2021, 16:50, "Zhenya Stanilovsky" <[email protected]>:
> Kirill, is it good practice to have a metric for internal use? I don't think
> so.
> +1 with Nikolay: size is more readable than an abstract segment count.
>
>> Hi, Nikolay!
>>
>> For internal use, let's keep the metric that I propose and also add the metric
>> "Count of bytes logged in WAL". Why "logged" and not "written"? Because with mmap we
>> cannot track when the physical write will occur.
>>
>> 16.02.2021, 15:42, "Nikolay Izhikov" < [email protected] >:
>>>  Kirill.
>>>
>>>  «Count of segments» is a very internal thing for a regular user.
>>>  Regular users don’t want to know about such things.
>>>
>>>  You suggest calculating the number (space required to store WAL) with
>>> some kind of rough estimate, whereas with «Count of bytes written in
>>> WAL» we can have the exact number without any assumptions or calculations.
>>>
>>>  Moreover, «Count of bytes written in WAL» is independent of the internal WAL
>>> implementation.
>>>
>>>  So, I think an exact number is always better to have than some approximation.
>>>
>>>  What do you think?
>>>
>>>>   On 15 Feb 2021, at 20:45, ткаленко кирилл < [email protected] >
>>>> wrote:
>>>>
>>>>   Hi, Nikolay!
>>>>
>>>>   We configure the number of segments in the working directory and we also
>>>> delete by segment, so it seems that this is a matter of usability. I prefer
>>>> to stick with my own version: it is a simple metric that does no harm, and
>>>> more can be added as needed.
>>>>
>>>>   15.02.2021, 17:10, "Nikolay Izhikov" < [email protected] >:
>>>>>   My point is that «count of files» is a meaningless number,
>>>>>   while «count of bytes written to the files» is a useful number to know and
>>>>> use for capacity planning.
>>>>>
>>>>>>    On 15 Feb 2021, at 15:59, ткаленко кирилл < [email protected] >
>>>>>> wrote:
>>>>>>
>>>>>>    Hi, Nikolay!
>>>>>>
>>>>>>    There may be a number (count of segments * segment size) or there may 
>>>>>> be a count of segments, whichever is more convenient for the user.
>>>>>>
>>>>>>    15.02.2021, 13:14, "Nikolay Izhikov" < [email protected] >:
>>>>>>>    Hello, Kirill.
>>>>>>>
>>>>>>>    Thanks for the answers.
>>>>>>>    Now, I understand your intentions.
>>>>>>>
>>>>>>>>     It also seems that it will be more natural to operate not just
>>>>>>>> in bytes but in multiples of a segment.
>>>>>>>
>>>>>>>    I can’t agree here.
>>>>>>>    From my point of view, it’s better to know the exact number, not just
>>>>>>> the «count of segments».
>>>>>>>
>>>>>>>>     On 15 Feb 2021, at 13:00, ткаленко кирилл < [email protected] >
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>     Hello, Nikolay!
>>>>>>>>
>>>>>>>>     A period of one day (24h) seems the most natural; you can take more
>>>>>>>> or less. I think that one day may not be enough, and it is worth
>>>>>>>> collecting the metric over several days, for example a week.
>>>>>>>> Yes, the total size of the segments may not equal
>>>>>>>> DataStorageConfiguration#getMaxWalArchiveSize, but for capacity
>>>>>>>> planning accuracy is not so important to us, since the load can
>>>>>>>> always change. It will hurt users more if the archive overflows and
>>>>>>>> the node cannot start, so overestimating is better than
>>>>>>>> underestimating. It also seems that it will be more natural to operate
>>>>>>>> not just in bytes but in multiples of a segment.
>>>>>>>>
>>>>>>>>     The metrics you propose for page memory and index estimates can be
>>>>>>>> discussed in separate threads.
>>>>>>>>
>>>>>>>>     14.02.2021, 11:54, "Nikolay Izhikov" < [email protected] >:
>>>>>>>>>     Hello, Kirill
>>>>>>>>>
>>>>>>>>>     Your conclusions are still not clear to me.
>>>>>>>>>
>>>>>>>>>>       It is not possible for us to estimate how much space a user 
>>>>>>>>>> will need in the archive so as not to overflow it under its load
>>>>>>>>>>       We take the maximum 44 and multiply it by a 
>>>>>>>>>> DataStorageConfiguration#getWalSegmentSize
>>>>>>>>>
>>>>>>>>>     Why do you take a single day (24h) as the standard period? Is there
>>>>>>>>> any rationale behind this?
>>>>>>>>>
>>>>>>>>>     1. We have `walAutoArchiveAfterInactivity` property. So WAL 
>>>>>>>>> segment can have a size less than the maximum.
>>>>>>>>>     2. For the CDC feature I want to introduce a «WAL force rollover
>>>>>>>>> timeout» to make data available to a consumer within a guaranteed period
>>>>>>>>> [1].
>>>>>>>>>
>>>>>>>>>     Why does the user want to estimate those numbers in the first 
>>>>>>>>> place?
>>>>>>>>>     Are we talking about some kind of capacity planning?
>>>>>>>>>
>>>>>>>>>     If yes, then maybe it will be better to have a metric for a count 
>>>>>>>>> of bytes written in the WAL?
>>>>>>>>>     With it, we will have the exact amount of space we need for WAL.
>>>>>>>>>
>>>>>>>>>     How should the user estimate capacity for page memory and indexes?
>>>>>>>>>
>>>>>>>>>     [1] https://issues.apache.org/jira/browse/IGNITE-13582
>>>>>>>>>
>>>>>>>>>>      On 14 Feb 2021, at 09:48, ткаленко кирилл <
>>>>>>>>>> [email protected] > wrote:
>>>>>>>>>>
>>>>>>>>>>      Hi, Nikolay!
>>>>>>>>>>
>>>>>>>>>>      The user will be able to read getLastArchivedSegmentIndex
>>>>>>>>>> every day, record it, and keep doing so for, say, several days.
>>>>>>>>>>
>>>>>>>>>>      For example, when the application starts,
>>>>>>>>>> getLastArchivedSegmentIndex is 0; at the end of the first day
>>>>>>>>>> the value is 30, at the end of the second 55, and at the end of
>>>>>>>>>> the third 99.
>>>>>>>>>>      It turns out that 30 segments were used on the first day, 25
>>>>>>>>>> on the second and 44 on the third. We take the maximum, 44, and
>>>>>>>>>> multiply it by DataStorageConfiguration#getWalSegmentSize, and we
>>>>>>>>>> get an upper estimate with which archive overflow is the least
>>>>>>>>>> likely. If the user uses compression, the result can be reduced
>>>>>>>>>> accordingly (result * getMaxSizeCompressedArchivedSegment).
>>>>>>>>>>
>>>>>>>>>>      13.02.2021, 10:47, "Nikolay Izhikov" < [email protected] >:
>>>>>>>>>>>      Hello, Kirill.
>>>>>>>>>>>
>>>>>>>>>>>>       It is not possible for us to estimate how much space a user 
>>>>>>>>>>>> will need in the archive so as not to overflow it under its load
>>>>>>>>>>>
>>>>>>>>>>>      It is still not clear to me why we need those metrics.
>>>>>>>>>>>      Can you please write down a specific scenario: how will the user
>>>>>>>>>>> use these metrics to estimate the required WAL volume?
>>>>>>>>>>>
>>>>>>>>>>>>       On 12 Feb 2021, at 19:35, ткаленко кирилл <
>>>>>>>>>>>> [email protected] > wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>       Hi, Nikolay!
>>>>>>>>>>>>
>>>>>>>>>>>>       It is not possible for us to estimate how much space a user 
>>>>>>>>>>>> will need in the archive so as not to overflow it under its load. 
>>>>>>>>>>>> And the proposed metrics will allow you to make a rough estimate.
>>>>>>>>>>>>
>>>>>>>>>>>>       12.02.2021, 17:23, "Nikolay Izhikov" < [email protected] >:
>>>>>>>>>>>>>       Hello, Kirill.
>>>>>>>>>>>>>
>>>>>>>>>>>>>       Can you please clarify: what question about WAL does the user
>>>>>>>>>>>>> have in mind?
>>>>>>>>>>>>>       And what answers does he (or she) get with these new metrics?
>>>>>>>>>>>>>
>>>>>>>>>>>>>>        On 12 Feb 2021, at 14:26, ткаленко кирилл <
>>>>>>>>>>>>>> [email protected] > wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>        Hi everyone!
>>>>>>>>>>>>>>        At the moment, I have not found a way to
>>>>>>>>>>>>>> estimate how many WAL segments go into the archive, say, per
>>>>>>>>>>>>>> day.
>>>>>>>>>>>>>>        So I created a ticket 
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-14170 to add a 
>>>>>>>>>>>>>> couple of new metrics.
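
The estimate described in the 14.02.2021 message (record getLastArchivedSegmentIndex once a day, take the largest daily increase, multiply by the segment size) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the readings are the 0, 30, 55, 99 values from the thread, and the 64 MiB segment size is an assumed value that would in practice come from DataStorageConfiguration#getWalSegmentSize.

```java
/** Hypothetical helper, not part of Ignite: rough WAL archive sizing from daily index readings. */
public class WalArchiveEstimate {
    /** Assumed segment size (64 MiB); read the real value from the node configuration. */
    static final long ASSUMED_SEGMENT_SIZE_BYTES = 64L * 1024 * 1024;

    /**
     * Returns the largest day-over-day increase in the recorded index,
     * i.e. the maximum number of segments archived in a single day.
     */
    static long maxDailySegments(long[] dailyIndexReadings) {
        long max = 0;

        for (int i = 1; i < dailyIndexReadings.length; i++)
            max = Math.max(max, dailyIndexReadings[i] - dailyIndexReadings[i - 1]);

        return max;
    }

    /** Upper estimate of archive space needed per day, in bytes. */
    static long estimateArchiveBytes(long[] dailyIndexReadings, long segmentSizeBytes) {
        return maxDailySegments(dailyIndexReadings) * segmentSizeBytes;
    }

    public static void main(String[] args) {
        // Readings at start-up and at the end of days 1, 2 and 3 (values from the thread).
        long[] readings = {0, 30, 55, 99};

        // Daily usage is 30, 25 and 44 segments, so the maximum is 44.
        System.out.println("Max segments per day: " + maxDailySegments(readings));
        System.out.println("Estimated bytes per day: "
            + estimateArchiveBytes(readings, ASSUMED_SEGMENT_SIZE_BYTES));
    }
}
```

Note that this treats every segment as full-sized, so with walAutoArchiveAfterInactivity or a forced rollover the figure is an overestimate, which is the objection raised in favour of a bytes-based metric.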
