Dear all,

I’d like to hear your opinions on how we should interpret the database
attribute “active size”.

As you surely know, we use three different size attributes in the database
info: file, the size of the database file on disk; external, the uncompressed
size of the database contents; and active, defined as “the size of live data
inside the database” or “active bytes in the current MVCC snapshot”.

Some time ago I had a discussion with Paul Davis, and he pointed out an
ambiguity in that definition: is it live data before a compaction or after a
compaction? In other words, should we treat as “active” only the documents and
attachments on the btree’s leaves, or also include previous document revisions
while they can still be accessed? Code-wise it is the latter, both in the
current version of CouchDB and in 1.x, where active size was named data_size,
but intuitively it feels like it should be the former.

Although this may sound academic, it is a practical question: the difference
in active size before and after compaction can be rather noticeable, and since
active size is used as a trigger by the compaction daemon, it could skew the
disk usage pattern.
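
To make the difference concrete, here is a toy sketch in Python. It is not
CouchDB code: the Revision class, the byte counts and the file size are all
made up, and the fragmentation formula (file - active) / file is my assumption
about the general shape of the trigger, not a quote from the daemon.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    size: int         # bytes taken by this revision's body and attachments
    is_current: bool  # True only for the revision a compaction would keep


def active_size(revisions, include_old_revisions):
    """Sum revision sizes under one of the two readings of "active"."""
    return sum(r.size for r in revisions if include_old_revisions or r.is_current)


def fragmentation(file_size, active):
    """Percentage of the file not counted as active (assumed trigger formula)."""
    return (file_size - active) * 100 / file_size


# Hypothetical database: one document written three times, 1000 bytes each.
revs = [Revision(1000, False), Revision(1000, False), Revision(1000, True)]
file_size = 4000  # made-up on-disk size, including some btree overhead

latter = active_size(revs, include_old_revisions=True)   # old revisions counted
former = active_size(revs, include_old_revisions=False)  # current revisions only

frag_latter = fragmentation(file_size, latter)
frag_former = fragmentation(file_size, former)
```

In this made-up case the same file looks 25% fragmented under the latter
reading and 75% fragmented under the former, so a threshold-based trigger
would fire at very different moments depending on which definition we pick.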

Please share your thoughts. If we conclude that we want to change how active
size is calculated, I’m willing to take on the implementation, as I have a
recent PR in the same area of code.


Regards,
Eric