Dear all, I’d like to hear your opinions on how we should interpret the database attribute “active size”.
As you know, the database info reports three size attributes: file, the size of the database file on disk; external, the uncompressed size of the database contents; and active, defined as “the size of live data inside the database” or “active byte in the current MVCC snapshot”.

Some time ago I had a discussion with Paul Davis, and he pointed out an ambiguity in that definition: is it the live data before a compaction or after one? In other words, should we treat as “active” only the documents and attachments on the btree’s leaves, or should we also include previous document revisions for as long as they can still be accessed? Code-wise it is the latter, both in the current version of CouchDB and in 1.x, where active size was named data_size, but intuitively it feels like it should be the former.

Although this sounds academic, it is a practical question: the difference in active size before and after compaction can be quite noticeable, and since the compaction daemon uses active size as a trigger, it could skew disk usage patterns.

Please share your thoughts. If we conclude that we want to change how active size is calculated, I’m willing to take on the implementation, as I have a recent PR in the same area of the code.

Regards,
Eric
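
P.S. To make the practical impact concrete, here is a rough Python sketch of how the two readings of “active” could move a compaction trigger. It assumes the trigger compares a fragmentation percentage of the form (file - active) / file * 100 against a threshold. The function name, the sizes and the 50% threshold are all hypothetical values chosen for illustration, not measurements or actual CouchDB code.

```python
def fragmentation_pct(file_size: int, active_size: int) -> float:
    """Percentage of the file that is NOT counted as active data.

    Assumed formula: (file - active) / file * 100.
    """
    return 100 * (file_size - active_size) / file_size


# Hypothetical sizes in bytes, invented for this example.
FILE_SIZE = 1_000_000
LEAF_DATA = 400_000       # documents and attachments on the btree leaves
OLD_REVISIONS = 300_000   # previous revisions that can still be accessed

# Hypothetical trigger threshold, in percent.
THRESHOLD_PCT = 50.0

# Reading 1 ("the former"): only leaf data counts as active.
leaf_only = fragmentation_pct(FILE_SIZE, LEAF_DATA)

# Reading 2 ("the latter", what the code does today): old revisions count too.
with_old_revs = fragmentation_pct(FILE_SIZE, LEAF_DATA + OLD_REVISIONS)

for label, pct in [("leaf data only", leaf_only),
                   ("including old revisions", with_old_revs)]:
    triggered = pct > THRESHOLD_PCT
    print(f"{label}: fragmentation {pct:.1f}%, compaction triggered: {triggered}")
```

With these made-up numbers the same file crosses the threshold under the first reading and stays below it under the second, which is the kind of skew I am worried about.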