Re: [PR] OAK-10577 Advanced repository statistics [jackrabbit-oak]

via GitHub Fri, 22 Dec 2023 02:46:25 -0800


thomasmueller commented on PR #1247:
URL: https://github.com/apache/jackrabbit-oak/pull/1247#issuecomment-1867533785


   > * Would be nice to have a more structured output.
   
   Right. However, I would prefer to get some real-world data first, before we 
spend a lot of time tweaking the output. Right now we don't know what is most 
useful and what is not. We can guess, but a lot of the data will likely turn 
out to be not useful (or only useful initially), so we will remove it later, 
or some other data will turn out to be a lot more useful. So far I have spent 
most of the time on making sure things are "correct" (tests). Documentation is 
sparse on purpose. I would prefer that we don't do this kind of "optimization" 
right now, except for the obvious cases.
   
   Human-readable text values: I would prefer machine-readable output with a 
fixed, useful magnitude, e.g. GiB for binaries and millions for counts.
   
   Node count in millions: I think that "million" is a useful metric. Small 
environments will show 0 million, but that's fine. I don't remember ever having 
a use case that needed more accurate data.
   
   Histogram: I agree it's not readable. I changed it so that each bucket 
number is followed by its size range, e.g.: 3 (2B..4B), 13 (2KiB..4KiB), 
23 (2MiB..4MiB), 33 (2GiB..4GiB), 43 (2TiB..4TiB).
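
   To illustrate the histogram labels, here is a minimal sketch in Python (not 
the code in the PR). It assumes that bucket n covers sizes from 2^(n-2) up to, 
but not including, 2^(n-1); this is only inferred from the five examples above. 
The function names and the treatment of the bucket boundaries are hypothetical.

```python
# Binary unit names, one per 10 powers of two.
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def format_power_of_two(exp):
    """Format 2**exp bytes with a binary unit, e.g. exp=11 gives '2KiB'."""
    unit, rem = divmod(exp, 10)
    return f"{1 << rem}{UNITS[unit]}"


def bucket_label(n):
    """Return the label for bucket n, e.g. '13 (2KiB..4KiB)'.

    Assumption: bucket n covers 2**(n-2) .. 2**(n-1).
    """
    if n < 2:
        raise ValueError("bucket index must be at least 2")
    return f"{n} ({format_power_of_two(n - 2)}..{format_power_of_two(n - 1)})"


def bucket_index(size):
    """Return the bucket for a size in bytes.

    Assumption: the lower bound is inclusive and the upper bound exclusive,
    so 2048..4095 bytes all fall into bucket 13.
    """
    if size < 1:
        raise ValueError("size must be at least 1 byte")
    return size.bit_length() + 1


if __name__ == "__main__":
    for n in (3, 13, 23, 33, 43):
        print(bucket_label(n))
```

   Under these assumptions, a 3000-byte binary falls into bucket 13, which is 
labeled as 2KiB..4KiB.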
   
   For "size" and "count" data: I'll also add "size GiB", I guess that's the 
most interesting number; for "count", I'll add "count million".
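
   As a rough sketch of what the fixed-magnitude fields could look like (in 
Python, not the code in the PR): the keys "size GiB" and "count million" are 
the ones named above, while the dictionary layout, the function names, and the 
choice to truncate to whole numbers are assumptions. Truncation matches the 
remark that small environments show 0 million.

```python
GIB = 1 << 30          # bytes per GiB
MILLION = 1_000_000


def size_gib(size_bytes):
    """Size in whole GiB (assumption: truncated, not rounded)."""
    return size_bytes // GIB


def count_million(count):
    """Count in whole millions (assumption: truncated, not rounded)."""
    return count // MILLION


def add_fixed_magnitudes(stats):
    """Return a copy of stats with fixed-magnitude fields added.

    The input layout ({"size": bytes, "count": items}) is hypothetical.
    """
    result = dict(stats)
    if "size" in stats:
        result["size GiB"] = size_gib(stats["size"])
    if "count" in stats:
        result["count million"] = count_million(stats["count"])
    return result


if __name__ == "__main__":
    print(add_fixed_magnitudes({"size": 5 * GIB + 123, "count": 2_750_000}))
```

   The raw "size" and "count" values stay in the output, so nothing is lost 
for environments where the fixed magnitude is too coarse.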
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
