Good idea. Now we've got

> [info] [<0.33.0>] couch_db_repair for testwritesdb - scanning 1048576 bytes at 1380102
> [info] [<0.33.0>] couch_db_repair for testwritesdb - scanning 1048576 bytes at 331526
> [info] [<0.33.0>] couch_db_repair for testwritesdb - scanning 331526 bytes at 0
> [info] [<0.33.0>] couch_db_repair writing 12 updates to lost+found/testwritesdb
> [info] [<0.33.0>] couch_db_repair writing 9 updates to lost+found/testwritesdb
> [info] [<0.33.0>] couch_db_repair writing 8 updates to lost+found/testwritesdb

Adam

On Aug 10, 2010, at 2:29 PM, Robert Newson wrote:

> It took 20 minutes before the first 'update' line came out, but now
> seems to be recovering smoothly. machine load is back down to sane
> levels.
>
> Suggest feedback during the hunting phase.
>
> B.
>
> On Tue, Aug 10, 2010 at 7:11 PM, Adam Kocoloski <kocol...@apache.org> wrote:
>> Thanks for the crosscheck. I'm not aware of anything in the node finder
>> that would cause it to struggle mightily with healthy DBs. It pretty much
>> ignores the health of the DB, in fact. Would be interested to hear more.
>>
>> On Aug 10, 2010, at 1:59 PM, Robert Newson wrote:
>>
>>> I verified the new code's ability to repair the testwritesdb. system
>>> load was smooth from start to finish.
>>>
>>> I started a further test on a different (healthy) database and system
>>> load was severe again, just collecting the roots (the lost+found db
>>> was not yet created when I aborted the attempt). I suspect the fact
>>> that it's healthy is the issue, so if I'm right, perhaps a warning is
>>> useful.
>>>
>>> B.
>>>
>>> On Tue, Aug 10, 2010 at 6:53 PM, Adam Kocoloski <kocol...@apache.org> wrote:
>>>> Another update. This morning I took a different tack and, rather than try
>>>> to find root nodes, I just looked for all kv_nodes in the file and treated
>>>> each of those as a separate virtual DB to be replicated. This reduces the
>>>> algorithmic complexity of the repair, and it looks like testwritesdb
>>>> repairs in ~30 minutes or so. Also, this method results in the lost+found
>>>> DB containing every document, not just the missing ones.
>>>>
>>>> My branch does not currently include Randall's parallelization of the
>>>> replications. It's still CPU-limited, so that may be a worthwhile
>>>> optimization. On the other hand, I think we may be reaching a stage at
>>>> which performance for this repair tool is 'good enough', and pmaps can
>>>> make error handling a bit dicey.
>>>>
>>>> In short, I think this tool is now in good shape.
>>>>
>>>> http://github.com/kocolosk/couchdb/tree/db_repair
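
For readers following along, the kv_node hunt and the per-chunk "scanning N bytes at OFFSET" feedback discussed in this thread can be sketched roughly as below. This is an illustrative Python sketch, not the Erlang code in the db_repair branch. The names `chunk_ranges`, `find_markers`, and `report` are hypothetical, and searching for the literal bytes `kv_node` is an assumption made for the example; the real tool may match a different byte pattern. The only details taken from the thread are the 1048576-byte chunk size and the end-to-start scan order visible in the log lines.

```python
import io
from typing import BinaryIO, Callable, Iterator, List, Tuple

CHUNK_SIZE = 1048576  # 1 MiB, the chunk size that appears in the log output


def chunk_ranges(file_size: int,
                 chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pairs walking from the end of the file to 0."""
    pos = file_size
    while pos > 0:
        start = max(0, pos - chunk_size)
        yield start, pos - start
        pos = start


def find_markers(fd: BinaryIO, file_size: int, marker: bytes,
                 chunk_size: int = CHUNK_SIZE,
                 report: Callable[[str], None] = print) -> List[int]:
    """Return the sorted absolute offsets of every occurrence of marker.

    `marker` is a stand-in for whatever byte pattern identifies a kv_node
    on disk (an assumption; the thread does not spell it out).
    """
    if not marker:
        raise ValueError("marker must not be empty")
    # Read a few extra bytes so a marker straddling a chunk boundary is seen.
    overlap = len(marker) - 1
    hits: List[int] = []
    for offset, length in chunk_ranges(file_size, chunk_size):
        # Progress feedback during the hunting phase, one line per chunk.
        report(f"scanning {length} bytes at {offset}")
        fd.seek(offset)
        data = fd.read(length + overlap)
        idx = data.find(marker)
        # Only count matches that start inside this chunk; matches starting
        # in the overlap belong to the later chunk, which was scanned first.
        while idx != -1 and idx < length:
            hits.append(offset + idx)
            idx = data.find(marker, idx + 1)
    return sorted(hits)
```

Assuming a file of 2428678 bytes (1380102 + 1048576), `chunk_ranges` produces the same three ranges as the log excerpt above: 1048576 bytes at 1380102, 1048576 bytes at 331526, then 331526 bytes at 0. Each offset returned by `find_markers` would then be a candidate starting point for one of the "virtual DBs" to replicate into lost+found.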