[ https://issues.apache.org/jira/browse/COUCHDB-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022162#comment-13022162 ]

Paul Joseph Davis commented on COUCHDB-1132:
--------------------------------------------

@Jan

That won't be doable until we make the b+tree balance itself during writes. But
once (or if) we get around to that, your request would happen more or less
automatically with a few tweaks to the compactor.

> Track used space of database and view index files
> -------------------------------------------------
>
>                 Key: COUCHDB-1132
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1132
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Database Core
>            Reporter: Filipe Manana
>             Fix For: 1.2
>
>
> Currently users have no reliable way to know if a database or view index 
> compaction is needed.
> Adam, Robert Dionne and I have been working on a feature to compute and
> expose the current data size (in bytes) of databases and view indexes. These
> computations are exposed as a single field in the database info and view 
> index info URIs.
> Comparing this new value with the disk_size value (the total space in bytes
> used by the database or view index file) would allow users to decide whether
> or not it's worth triggering a compaction.
> Adam and Robert's work can be found at:
> https://github.com/cloudant/bigcouch/compare/7d1adfa...a9410e6
> Mine can be found at:
> https://github.com/fdmanana/couchdb/compare/file_space
> After chatting with Adam on IRC, the main difference seems to be that their
> work accounts only for user data (document bodies + attachments), while mine
> also accounts for the btree values (including all meta information, keys, rev
> trees, etc) and the data added by couch_file (4-byte length prefix, md5s,
> block boundary markers).
> An example:
> $ curl http://localhost:5984/btree_db/_design/test/_info
> {"name":"test","view_index":{"signature":"aba9f066ed7f042f63d245ce0c7d870e","language":"javascript","disk_size":274556,"data_size":270455,"updater_running":false,"compact_running":false,"waiting_commit":false,"waiting_clients":0,"update_seq":1004,"purge_seq":0}}
> $ curl http://localhost:5984/btree_db
> {"db_name":"btree_db","doc_count":1004,"doc_del_count":0,"update_seq":1004,"purge_seq":0,"compact_running":false,"disk_size":6197361,"data_size":6186460,"instance_start_time":"1303231080936421","disk_format_version":5,"committed_update_seq":1004}
> This example was executed just after compacting the test database and view
> index. The new field "data_size" has a value very close to the final file
> size.
> The only things that my branch doesn't include in the data_size computation,
> for databases, are the size of the last header, the size of the _security
> object and the purged revs list. In practice these are so small that adding
> extra code to account for them doesn't seem worth it.
> I'm sure we can merge the best from both branches.
> Adam, Robert, thoughts?
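
The comparison of data_size against disk_size described in the quoted issue could be sketched on the client side roughly as follows. This is only an illustration, not part of either branch: the function names and the 0.3 threshold are hypothetical, and the input is a trimmed copy of the example responses quoted above.

```python
import json


def wasted_fraction(info):
    """Return the share of the file not occupied by live data (0.0 to 1.0)."""
    disk_size = info["disk_size"]
    data_size = info["data_size"]
    if disk_size <= 0:
        return 0.0
    # Clamp at zero in case data_size is reported slightly above disk_size.
    return max(0.0, (disk_size - data_size) / disk_size)


def needs_compaction(info, threshold=0.3):
    """Hypothetical policy: compact once 30% of the file is reclaimable."""
    return wasted_fraction(info) >= threshold


# Trimmed copies of the two example responses from the issue description.
db_info = json.loads(
    '{"db_name":"btree_db","disk_size":6197361,"data_size":6186460}'
)
view_info = json.loads(
    '{"name":"test","view_index":{"disk_size":274556,"data_size":270455}}'
)

print(needs_compaction(db_info))
# For a view index the two sizes sit under the "view_index" key.
print(needs_compaction(view_info["view_index"]))
```

Because both example responses were taken right after a compaction, the reclaimable share is tiny and neither check would ask for another compaction under this assumed threshold.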

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
