[ https://issues.apache.org/jira/browse/COUCHDB-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021733#comment-13021733 ]

Filipe Manana commented on COUCHDB-1132:
----------------------------------------

Thanks Adam. Not an issue for the replicator in trunk; it only uses binaries.

> Track used space of database and view index files
> -------------------------------------------------
>
>                 Key: COUCHDB-1132
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1132
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Database Core
>            Reporter: Filipe Manana
>             Fix For: 1.2
>
>
> Currently users have no reliable way to know if a database or view index
> compaction is needed.
> Adam, Robert Dionne and I have been working on a feature to compute and
> expose the current data size (in bytes) of databases and view indexes. These
> computations are exposed as a single field in the database info and view
> index info URIs.
> Comparing this new value with the disk_size value (the total space in bytes
> used by the database or view index file) would allow users to decide whether
> it's worth triggering a compaction.
> Adam and Robert's work can be found at:
> https://github.com/cloudant/bigcouch/compare/7d1adfa...a9410e6
> Mine can be found at:
> https://github.com/fdmanana/couchdb/compare/file_space
> After chatting with Adam on IRC, the main difference seems to be that their
> work accounts only for user data (document bodies + attachments), while mine
> also accounts for the btree values (including all meta information, keys, rev
> trees, etc) and the data added by couch_file (4 bytes length prefix, md5s,
> block boundary markers).
> An example:
> $ curl http://localhost:5984/btree_db/_design/test/_info
> {"name":"test","view_index":{"signature":"aba9f066ed7f042f63d245ce0c7d870e","language":"javascript","disk_size":274556,"data_size":270455,"updater_running":false,"compact_running":false,"waiting_commit":false,"waiting_clients":0,"update_seq":1004,"purge_seq":0}}
> $ curl http://localhost:5984/btree_db
> {"db_name":"btree_db","doc_count":1004,"doc_del_count":0,"update_seq":1004,"purge_seq":0,"compact_running":false,"disk_size":6197361,"data_size":6186460,"instance_start_time":"1303231080936421","disk_format_version":5,"committed_update_seq":1004}
> This example was executed just after compacting the test database and view
> index. The new field "data_size" has a value very close to the final file
> size.
> The only things that my branch doesn't include in the data_size computation,
> for databases, are the size of the last header, the size of the _security
> object and the purged revs list. In practice these are so small that adding
> extra code to account for them doesn't seem worth it.
> I'm sure we can merge the best from both branches.
> Adam, Robert, thoughts?
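
To illustrate the comparison of data_size against disk_size described in the issue, here is a minimal Python sketch of how a client might decide whether a compaction is worthwhile. The function names and the 0.3 threshold are assumptions made for illustration only; they are not part of CouchDB or of either branch. The input is the parsed JSON from the database info URI (or the "view_index" object from the view index info URI).

```python
def wasted_fraction(info):
    """Fraction of the file that is not live data, from an info dict
    containing the "disk_size" and "data_size" fields (both in bytes)."""
    disk_size = info["disk_size"]
    data_size = info["data_size"]
    if disk_size <= 0:
        return 0.0
    # Clamp at zero in case data_size is reported slightly above disk_size.
    return max(0.0, (disk_size - data_size) / disk_size)


def should_compact(info, threshold=0.3):
    """Hypothetical policy: compact once the wasted fraction reaches
    the threshold. The default of 0.3 is an arbitrary illustrative value."""
    return wasted_fraction(info) >= threshold


if __name__ == "__main__":
    # Sizes taken from the database info response quoted in the issue.
    db_info = {"disk_size": 6197361, "data_size": 6186460}
    print(wasted_fraction(db_info), should_compact(db_info))
```

With the sizes from the example in the issue, taken right after compaction, the wasted fraction is well below one percent, so this policy would not trigger another compaction.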