Hi guys,
just to keep you updated: I ran out of space last time while trying to
compact my 83 GB file...
I reran the test with half the number of docs, 2.3M:
-rw-r--r-- 1 root root 24G Feb 26 01:45 test.couch
{"db_name":"test","doc_count":2219598,"doc_del_count":0,"update_seq":2219598,"purge_seq":0,"compact_running":false,"disk_size":25017692071,"instance_start_time":"1235590552047908"}
curl -X POST http://localhost:5984/test/_compact
-rw-r--r-- 1 root root 17G Feb 26 13:00 test.couch
It took 3 hours to do the compaction... but we got 7 GB back, about
30%... quite a lot.
P.
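
For anyone who wants to script the same measurement, here is a minimal
sketch, not a definitive implementation. It assumes the database URL from
the curl call above and reads only the "compact_running" and "disk_size"
fields shown in the info document. The function names and the polling
interval are hypothetical, chosen for illustration.

```python
import json
import time
import urllib.request


def db_info(db_url):
    """Fetch the database info document (a GET on the database URL)."""
    with urllib.request.urlopen(db_url) as resp:
        return json.load(resp)


def start_compaction(db_url):
    """POST to _compact, the same request as the curl call in the message."""
    req = urllib.request.Request(
        db_url + "/_compact",
        data=b"",
        method="POST",
        # Assumption: later CouchDB releases expect a JSON content type here.
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_compaction(get_info, poll_seconds=30, sleep=time.sleep):
    """Poll until compact_running is false, then return the final info."""
    while True:
        info = get_info()
        if not info["compact_running"]:
            return info
        sleep(poll_seconds)


def reclaimed(before_bytes, after_bytes):
    """Return (bytes saved, fraction of the original size saved)."""
    saved = before_bytes - after_bytes
    return saved, saved / before_bytes


# Possible usage against a running server (not executed here):
#   url = "http://localhost:5984/test"
#   before = db_info(url)["disk_size"]
#   start_compaction(url)
#   after = wait_for_compaction(lambda: db_info(url))["disk_size"]
#   print(reclaimed(before, after))
```

Passing the info fetcher in as a callable keeps the polling loop usable
without a live server, which makes it easy to try out.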
Also, 0.8.1 compaction has a hard time compacting big dbs. Trunk is
better.
-Damien
On Feb 20, 2009, at 12:04 PM, Jan Lehnardt wrote:
On 20 Feb 2009, at 17:42, Pascal Borghino wrote:
Hi there, I do not have attachments...
$ ls -lh
-rw-r--r-- 1 root root 83G Feb 20 02:40 test.couch
-rw-r--r-- 1 root root 23G Feb 20 16:33 test.couch.compact
$ du -sh
107G .
Still... from 19 GB to 83 GB... a huge difference.
P.
The fact that there is a .compact file means that compaction
is still running (or was aborted). When you restart it, you
should see it in the "Status" section of Futon and how far
along it is. Compaction will continue where it left off. Please
let us know what the final database file size is when compaction
is finished.
If you inserted a lot of documents one at a time, the file can
become quite sparse. On large imports, use bulk inserts
(see the wiki) or, if that is not possible, compact every
once in a while during the import.
Cheers
Jan
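
A rough sketch of what a batched import could look like, assuming the
_bulk_docs endpoint and its {"docs": [...]} request body. The helper
names and the batch size of 1000 are hypothetical and would need tuning
for real data.

```python
import json
import urllib.request


def batches(docs, size):
    """Yield lists of at most `size` documents from any iterable."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_body(batch):
    """Encode one batch as a _bulk_docs request body."""
    return json.dumps({"docs": batch}).encode("utf-8")


def post_bulk(db_url, batch):
    """Send one batch to the database in a single request."""
    req = urllib.request.Request(
        db_url + "/_bulk_docs",
        data=bulk_body(batch),
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def import_docs(db_url, docs, size=1000, post=post_bulk):
    """Import all docs in batches and return how many were sent."""
    count = 0
    for batch in batches(docs, size):
        post(db_url, batch)
        count += len(batch)
    return count
```

With this shape, an import of 2M documents becomes about 2000 requests
instead of 2M single inserts.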
--
Robert Newson wrote:
I expect the b-tree wastage is minimal (though not zero).
I've wondered what happens on filesystems that don't support sparse
files; I assume they'd just be slower and use more disk space. Given
that the holes vanish after compaction, I suspected a bad calculation
in the code (couch_db.erl, I think), but I've not found one; it seems
to do the right thing. HFS+ doesn't support holes, but I'm pretty sure
NTFS does.
Btw, it's mostly around attachments. If you add lots of documents but
no attachments, ls and df are in close agreement.
B.
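
To check how sparse a given file is, one can compare its apparent size
(what ls -l shows) with the space actually allocated on disk (what du
counts). A small sketch, assuming a Unix-like system where st_blocks is
available and counted in 512-byte units. The function names are
hypothetical.

```python
import os


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    Assumes st_blocks is reported in 512-byte units, as on Linux and
    most Unix systems. The field does not exist on Windows.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def sparseness(path):
    """Fraction of the apparent size that is not backed by disk blocks."""
    apparent, allocated = file_sizes(path)
    if apparent == 0:
        return 0.0
    # Allocated can slightly exceed apparent because of block rounding.
    return max(0.0, 1 - allocated / apparent)
```

On a filesystem without sparse-file support the two sizes should be
nearly equal and sparseness() should stay close to zero.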
On Fri, Feb 20, 2009 at 4:00 PM, Jens Alfke <[email protected]> wrote:
On Feb 20, 2009, at 6:03 AM, Pascal Borghino wrote:
I am currently compacting it... even though 'Compaction rewrites the
database file, removing outdated document revisions and deleted
documents'... no document should be outdated or deleted...
In addition to the sparseness of the file, another reason for the size
difference might be obsolete b-tree nodes. The file is append-only, so any
time a b-tree changes, the old nodes remain in the file. If you've done a
large number of individual insertions, that space might be significant.
(Probably not gigabytes, though.)
[email protected] wrote:
I find the actual
consumed space is far, far less than 'ls' shows. CouchDB .couch files
are very sparse, with large gaps of unwritten data, ostensibly to keep
btree and document items separate, but these 'holes' vanish after
compaction, even if you have zero updates and deletes.
Hm. But not all filesystems support sparse files. HFS+, the Mac OS
filesystem, doesn't. (Does NTFS?) Is there an option to suppress
the gaps?
—Jens