On Thu, Jun 18, 2009 at 4:47 AM, Nils Breunese <[email protected]> wrote:
> Joshua Bronson wrote:
>> I needed a script to dump a large (>30G) couchdb database on a nightly
>> basis for backup purposes, to be performed while couchdb is running, (...)
>
> Did you know that you can just use tools like cp to safely back up live
> CouchDB databases? Using rsync will give you an instant incremental
> backup tool.
>
> Nils Breunese.

Thanks for bringing this up. I was actually doing exactly that -- rsyncing
the .couch file -- before switching to JSON dumps. Here are the reasons I
switched:

- The format of the .couch files can change from one version of CouchDB to
  another, so if you ever upgrade CouchDB (which you probably will!), you'll
  no longer be able to swap your old .couch files back in.
- If the .couch file ever somehow gets corrupted, the corruption will
  propagate to your backups. Nobody wants to suffer the fate of ma.gnolia
  <http://corvusconsulting.ca/2009/02/ma-gnolias-bad-day/>!
- JSON is human-readable.
- It takes up less space, and can be compressed to take up much less. My
  30G .couch file produced a 17G _all_docs_by_seq dump, which then
  bzip2-compressed to 2.6G.

And now with the latest version of streamcouch.py
<https://svn.openplans.org/melk/util/streamcouch.py>, along with something
like my new wrapper script <https://svn.openplans.org/melk/util/backupcouch>,
it does incremental backups too.

So far I like doing it this way a lot better. If anyone's had a chance to
give it a whirl, I'd love to hear about your experiences with it.
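In case it helps make the incremental part concrete, here is a rough sketch
of the general idea -- not the actual streamcouch.py. It assumes the
0.9-era _all_docs_by_seq endpoint with include_docs and startkey
parameters, and the database URL and file names are made up for the
example:

    # Sketch of an incremental JSON dump via _all_docs_by_seq.
    # Names and file layout are illustrative, not what backupcouch does.
    import json
    import urllib.request

    COUCH = "http://localhost:5984/mydb"   # assumed database URL
    STATE_FILE = "last_seq"                # remembers the last dumped seq
    DUMP_FILE = "dump.jsonl"               # one JSON document per line

    def last_seq():
        # Start from 0 if we've never dumped before.
        try:
            with open(STATE_FILE) as f:
                return int(f.read().strip())
        except IOError:
            return 0

    def dump_since(seq):
        # Rows come back keyed by update sequence, so startkey=<last seq>
        # returns only documents changed since the previous run.
        url = "%s/_all_docs_by_seq?include_docs=true&startkey=%d" % (COUCH, seq)
        with urllib.request.urlopen(url) as resp:
            rows = json.load(resp)["rows"]   # buffers the whole batch in memory
        with open(DUMP_FILE, "a") as out:
            for row in rows:
                out.write(json.dumps(row["doc"]) + "\n")
                seq = max(seq, row["key"])   # each row's key is its update seq
        with open(STATE_FILE, "w") as f:
            f.write(str(seq))

    if __name__ == "__main__":
        dump_since(last_seq())

The real script presumably streams the response instead of buffering it
(you wouldn't want to hold a 17G dump in memory), but the startkey
bookkeeping above is the essential trick for only dumping what changed.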

