Kirill, is it good practice to have metrics for internal use only? I don't think so.
+1 with Nikolay: size is more readable than an abstract segment count.
 
>Hi, Nikolay!
>
>For internal use, let's keep the metric that I proposed and also add the metric
>"count of bytes logged in WAL". Why not "written"? Because with mmap we cannot
>track when the physical writing will occur.
>
>16.02.2021, 15:42, "Nikolay Izhikov" < nizhi...@apache.org >:
>> Kirill.
>>
>> «Count of segments» is a very internal thing for a regular user.
>> Regular users don’t want to know about such things.
>>
>> You suggest calculating the number (the space required to store WAL) with
>> some kind of rough calculation, whereas with the «Count of bytes written in
>> WAL» we can have the exact number without any assumptions or calculations.
>>
>> Moreover, «Count of bytes written in WAL» is independent of the internal WAL
>> implementation.
>>
>> So, I think an exact number is always better to have than some approximation.
>>
>> What do you think?
>>
>>>  On 15 Feb 2021, at 20:45, Kirill Tkalenko < tkalkir...@yandex.ru > wrote:
>>>
>>>  Hi, Nikolay!
>>>
>>>  We configure the number of segments in the working directory, and we also
>>> delete by segment, so it seems that this is a matter of usability. I prefer
>>> to stick with my own version: it is a simple metric that does no harm, and
>>> more can be added as needed.
>>>
>>>  15.02.2021, 17:10, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>  My point is that «count of files» is a meaningless number,
>>>>  while «count of bytes written to the files» is a useful number to know
>>>> and use for capacity planning.
>>>>
>>>>>   On 15 Feb 2021, at 15:59, Kirill Tkalenko < tkalkir...@yandex.ru > wrote:
>>>>>
>>>>>   Hi, Nikolay!
>>>>>
>>>>>   There may be a number (count of segments * segment size) or there may 
>>>>> be a count of segments, whichever is more convenient for the user.
>>>>>
>>>>>   15.02.2021, 13:14, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>>>   Hello, Kirill.
>>>>>>
>>>>>>   Thanks for the answers.
>>>>>>   Now, I understand your intentions.
>>>>>>
>>>>>>>    It also seems that it will be more natural to operate not just bytes
>>>>>>> but multiples of a segment.
>>>>>>
>>>>>>   Can’t agree here.
>>>>>>   From my point of view, it’s better to know the exact number, not just
>>>>>> «count of segments».
>>>>>>
>>>>>>>    On 15 Feb 2021, at 13:00, Kirill Tkalenko < tkalkir...@yandex.ru >
>>>>>>> wrote:
>>>>>>>
>>>>>>>    Hello, Nikolay!
>>>>>>>
>>>>>>>    A period of one day (24h) seems more natural; you can take more or
>>>>>>> less. I think that one day may not be enough, and it is worth
>>>>>>> collecting the metric over several days, for example a week.
>>>>>>> Yes, the total size of the segments may not match
>>>>>>> DataStorageConfiguration#getMaxWalArchiveSize, but for capacity
>>>>>>> planning accuracy is not so important to us, since the load can always
>>>>>>> change. It will hurt users more if the archive overflows and the node
>>>>>>> cannot start. So more is better than less here, and it also seems
>>>>>>> that it will be more natural to operate not just on bytes but on
>>>>>>> multiples of a segment.
>>>>>>>
>>>>>>>    The metrics that you propose for page memory and index estimates
>>>>>>> can be discussed in separate threads.
>>>>>>>
>>>>>>>    14.02.2021, 11:54, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>>>>>    Hello, Kirill
>>>>>>>>
>>>>>>>>    Your conclusions are still not clear to me.
>>>>>>>>
>>>>>>>>>      It is not possible for us to estimate how much space a user will 
>>>>>>>>> need in the archive so as not to overflow it under its load
>>>>>>>>>      We take the maximum 44 and multiply it by a 
>>>>>>>>> DataStorageConfiguration#getWalSegmentSize
>>>>>>>>
>>>>>>>>    Why do you take a single day (24h) as the standard period? Is there
>>>>>>>> any rationale behind this?
>>>>>>>>
>>>>>>>>    1. We have the `walAutoArchiveAfterInactivity` property, so a WAL
>>>>>>>> segment can be smaller than the maximum size.
>>>>>>>>    2. For the CDC feature, I want to introduce a «WAL force rollover
>>>>>>>> timeout» to make data available to a consumer within a guaranteed
>>>>>>>> period [1].
>>>>>>>>
>>>>>>>>    Why does the user want to estimate those numbers in the first place?
>>>>>>>>    Are we talking about some kind of capacity planning?
>>>>>>>>
>>>>>>>>    If yes, then maybe it would be better to have a metric for the count
>>>>>>>> of bytes written to the WAL?
>>>>>>>>    With it, we will know exactly how much space we need for WAL.
>>>>>>>>
>>>>>>>>    How should the user estimate capacity for page memory and indexes?
>>>>>>>>
>>>>>>>>    [1]  https://issues.apache.org/jira/browse/IGNITE-13582
>>>>>>>>
>>>>>>>>>     On 14 Feb 2021, at 09:48, Kirill Tkalenko < tkalkir...@yandex.ru >
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>     Hi, Nikolay!
>>>>>>>>>
>>>>>>>>>     The user will be able to read getLastArchivedSegmentIndex every
>>>>>>>>> day and record it, say, for several days.
>>>>>>>>>
>>>>>>>>>     For example, when the application starts,
>>>>>>>>> getLastArchivedSegmentIndex is 0; at the end of the first day the
>>>>>>>>> value is 30, at the end of the second 55, and at the end of the
>>>>>>>>> third 99.
>>>>>>>>>     It turns out that 30 segments were used on the first day, 25 on
>>>>>>>>> the second and 44 on the third. We take the maximum, 44, and multiply
>>>>>>>>> it by DataStorageConfiguration#getWalSegmentSize, and we get the
>>>>>>>>> size at which an archive overflow is least likely. If the user uses
>>>>>>>>> compression, the result can be reduced accordingly
>>>>>>>>> (result * getMaxSizeCompressedArchivedSegment).
>>>>>>>>>
>>>>>>>>>     13.02.2021, 10:47, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>>>>>>>     Hello, Kirill.
>>>>>>>>>>
>>>>>>>>>>>      It is not possible for us to estimate how much space a user 
>>>>>>>>>>> will need in the archive so as not to overflow it under its load
>>>>>>>>>>
>>>>>>>>>>     It is still not clear to me why we need those metrics.
>>>>>>>>>>     Can you please write down a specific scenario: how will the user
>>>>>>>>>> use these metrics to estimate the required WAL volume?
>>>>>>>>>>
>>>>>>>>>>>      On 12 Feb 2021, at 19:35, Kirill Tkalenko <
>>>>>>>>>>> tkalkir...@yandex.ru > wrote:
>>>>>>>>>>>
>>>>>>>>>>>      Hi, Nikolay!
>>>>>>>>>>>
>>>>>>>>>>>      It is not possible for us to estimate how much space a user 
>>>>>>>>>>> will need in the archive so as not to overflow it under its load. 
>>>>>>>>>>> And the proposed metrics will allow you to make a rough estimate.
>>>>>>>>>>>
>>>>>>>>>>>      12.02.2021, 17:23, "Nikolay Izhikov" < nizhi...@apache.org >:
>>>>>>>>>>>>      Hello, Kirill.
>>>>>>>>>>>>
>>>>>>>>>>>>      Can you please clarify what question about WAL the user has
>>>>>>>>>>>> in mind?
>>>>>>>>>>>>      And what answers does he (or she) get with these new metrics?
>>>>>>>>>>>>
>>>>>>>>>>>>>       On 12 Feb 2021, at 14:26, Kirill Tkalenko <
>>>>>>>>>>>>> tkalkir...@yandex.ru > wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>       Hi everyone!
>>>>>>>>>>>>>       At the moment, I have not found a way to estimate how
>>>>>>>>>>>>> many WAL segments go into the archive, say, per day.
>>>>>>>>>>>>>       So I created a ticket  
>>>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-14170 to add a 
>>>>>>>>>>>>> couple of new metrics. 
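
The segment-based estimate worked through in the thread (daily getLastArchivedSegmentIndex readings of 0, 30, 55 and 99) can be sketched roughly as below. This is only an illustration of the arithmetic: the function names are hypothetical and not part of Ignite, and the 64 MiB segment size is an assumed value standing in for whatever DataStorageConfiguration#getWalSegmentSize returns on a given node.

```python
# Sketch only: these helpers are hypothetical and not part of the Ignite API.

def per_day_segments(readings):
    """Turn cumulative last-archived-segment indexes into per-day counts."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


def estimate_archive_bytes(readings, wal_segment_size):
    """Multiply the busiest day's segment count by the configured segment size.

    The result over-estimates whenever segments are archived before they are
    full (for example after an inactivity or forced rollover).
    """
    daily = per_day_segments(readings)
    if not daily:
        raise ValueError("need at least two readings")
    return max(daily) * wal_segment_size


# Readings from the thread: index at start, then at the end of days 1 to 3.
readings = [0, 30, 55, 99]
# Assumed value for DataStorageConfiguration#getWalSegmentSize (64 MiB).
wal_segment_size = 64 * 1024 * 1024

print(per_day_segments(readings))                          # [30, 25, 44]
print(estimate_archive_bytes(readings, wal_segment_size))  # 2952790016
```

With these assumed inputs the busiest day archives 44 segments, which gives about 2.75 GiB. A byte-count metric, as argued for elsewhere in the thread, would replace the multiplication with a direct difference between two readings.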
 
 
 
 
