On Thu, Jan 7, 2010 at 9:59 AM, Roger Binns <rog...@rogerbinns.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> I have 9.8 million documents which in a text file one document per line JSON
> encoded comes out as 2.3Gb.  Once loaded into couchdb, 21Gb of space is
> consumed (after the 24 hours it takes to do a compaction!).  Accounting for
> the _rev field this amounts to a nine times expansion of disk space.
>
> Is this massive expansion expected?  Are there any plans to make it more
> reasonable?
>
> I am going to regenerate my data so that it uses way shorter ids instead of
> random 16 byte ones.
>

You might try using CouchDB's built-in sequential "uuids". These should
give you somewhat better storage efficiency.
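
To illustrate the idea, here is a minimal Python sketch of ids that
share a random prefix and end in a counter that only grows, so
consecutive documents sort next to each other. This is an illustration
under assumptions, not CouchDB's actual implementation: the function
name, the 26 + 6 hex digit split and the step size are all chosen for
the example. In CouchDB itself you would normally switch the uuids
algorithm setting in the server config to "sequential" and fetch ids
from /_uuids rather than generate them client-side.

```python
import os
import random


def sequential_ids(count, prefix_bytes=13, max_step=0xFFE):
    """Yield 32-hex-digit ids: a shared random prefix plus a growing suffix.

    Hypothetical helper for illustration only; the layout and step size
    are assumptions, not taken from CouchDB's source.
    """
    prefix = os.urandom(prefix_bytes).hex()  # 13 bytes -> 26 hex digits
    suffix = 0
    for _ in range(count):
        # Advance by a random positive step so ids always increase.
        suffix += random.randint(1, max_step)
        if suffix > 0xFFFFFF:
            # Suffix no longer fits in 6 hex digits: start a new prefix.
            prefix = os.urandom(prefix_bytes).hex()
            suffix = random.randint(1, max_step)
        yield f"{prefix}{suffix:06x}"


ids = list(sequential_ids(5))
for doc_id in ids:
    print(doc_id)
```

Because ids generated this way arrive in sorted order, new documents
tend to land next to each other in the by-id index instead of being
scattered across it, which is the likely source of the space savings
compared with fully random ids.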

Thanks for reporting. The last time we tried, we were able to make
some major progress in storage size efficiency. I'm not sure how much
low-hanging fruit we have left, but trying sequential uuids would be
a good start.

Chris

> Roger
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAktGIRoACgkQmOOfHg372QQCyQCgvHF9+m2lZFNtsuTvwI+U2atC
> rLYAnjj5b7vvEIfv/6981PKTtBt3uccE
> =gwXw
> -----END PGP SIGNATURE-----
>
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io
