On Sep 6, 2012, at 7:18 AM, Tim Tisdall wrote:

> I had a database of about 10.8gb with almost 15 million records which
> was fully compacted.  I had to back it up by dumping all the JSON and
> then restoring it by inserting it back in.  After it was done and I
> compacted it the database was now only 8.8gb!  I shed 2gb because of
> dropping the revision stubs still in the database.  This is likely
> because each record had about 6 revisions (so around 90 million
> stubs).  All of this is understandable, but 2gb isn't really
> negligible when running on a virtualized instance of 35gb.  The
> problem, though, is the method I used to dump to JSON and place it
> back into couchdb took almost 12hrs!
> 
> Is there a way to drop all of the revision stubs and reset the
> document's revision tags back to "1-" values?  I know this would
> completely break any kind of replication, but in this instance I am
> not doing any.
> 
> The best method I can think of is to insert each record into a new DB
> (not through replication, though, because that takes the stubs over
> with it).  Then go through the _changes from when I started and recopy
> those over to make sure everything is up-to-date.  This would save me
> having things down for 12hrs, but I have no idea how long this process
> would take.
> 
> Suggestions?
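
The copy you describe is straightforward to script against the HTTP API: record the source database's update_seq, page through _all_docs?include_docs=true, strip each document's _rev, write the batches into the new database with _bulk_docs, then replay _changes from the recorded seq to pick up anything written while the copy ran. A rough, untested sketch, assuming Python with the requests library, CouchDB on localhost:5984, and placeholder database names "olddb"/"newdb":

    import json
    import requests

    COUCH = "http://localhost:5984"
    SRC, DST = "olddb", "newdb"              # placeholder database names
    HDRS = {"Content-Type": "application/json"}

    # Remember where the source's _changes feed is now, so we can catch up later.
    since = requests.get("%s/%s" % (COUCH, SRC)).json()["update_seq"]

    # Page through _all_docs, strip each _rev, and bulk-insert into the new
    # database so every copy starts over at a 1- revision.
    startkey = None
    while True:
        params = {"include_docs": "true", "limit": 1000}
        if startkey is not None:
            params["startkey"] = json.dumps(startkey)
            params["skip"] = 1               # don't repeat the boundary doc
        rows = requests.get("%s/%s/_all_docs" % (COUCH, SRC),
                            params=params).json()["rows"]
        if not rows:
            break
        docs = []
        for row in rows:
            doc = row["doc"]
            doc.pop("_rev", None)
            docs.append(doc)
        requests.post("%s/%s/_bulk_docs" % (COUCH, DST),
                      data=json.dumps({"docs": docs}), headers=HDRS)
        startkey = rows[-1]["key"]

    # Replay anything written to the source while the copy was running.
    changes = requests.get("%s/%s/_changes" % (COUCH, SRC),
                           params={"since": since, "include_docs": "true"}).json()
    for change in changes.get("results", []):
        if change.get("deleted"):
            continue                         # sketch only: deletions are skipped
        doc = change["doc"]
        doc.pop("_rev", None)
        cur = requests.get("%s/%s/%s" % (COUCH, DST, doc["_id"]))
        if cur.ok:
            doc["_rev"] = cur.json()["_rev"] # overwrite the copy made above
        requests.put("%s/%s/%s" % (COUCH, DST, doc["_id"]),
                     data=json.dumps(doc), headers=HDRS)

Larger _bulk_docs batches usually help throughput, so it may well beat the 12-hour dump and reload, though that's a guess rather than a measurement.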

You may find http://wiki.apache.org/couchdb/Purge_Documents interesting; since it 
can only purge leaf revisions, though, it would take some creative application, 
and I'm not sure what you'd gain over a scripted copy to a different database 
like the one sketched above. What are your uptime/consistency needs? Must the 
document ids be preserved?
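
For concreteness, the "creative application" I have in mind would be: purge each document's current leaf revision, which (as I understand it) drops the document together with its whole revision tree, then put the body back without a _rev so it starts again at 1-. Per document, and with the same assumptions as the sketch above ("somedoc" is a placeholder id), that would look roughly like:

    import json
    import requests

    COUCH = "http://localhost:5984"
    HDRS = {"Content-Type": "application/json"}

    # Read the current body and remember its leaf revision.
    doc = requests.get("%s/olddb/somedoc" % COUCH).json()
    rev = doc.pop("_rev")

    # Purge that leaf revision; the body format is {docid: [rev, ...]}.
    requests.post("%s/olddb/_purge" % COUCH,
                  data=json.dumps({doc["_id"]: [rev]}), headers=HDRS)

    # Write the body back; it comes back with a fresh 1- revision.
    requests.put("%s/olddb/%s" % (COUCH, doc["_id"]),
                 data=json.dumps(doc), headers=HDRS)

Doing that 15 million times over HTTP won't be quick either, and you'd still need a compaction afterwards to reclaim the space, which is why I'm not convinced it beats the copy.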

hth,
-nvw

