On Jul 16, 2008, at 6:56 PM, David King wrote:
We'd love to hear what you come up with and also to solve any
problems you might encounter on your way. Please let us know.
Please note that CouchDB at this point is not optimised. We are
still in the 'getting it right' phase before we come to the
'getting it fast'. That said, CouchDB is plenty fast already, but
there is still potential to speed things up considerably.
So I'm trying a smaller version of this first (9 million records),
and I've hit a snag. I have some rather simple python code to read
from Postgres and write to couchdb (that uses couchdb-python, where
'db' is a couchdb.client.Database object):
    chunker = IteratorChunker(get_stuff())
    while not chunker.done:
        print "fetching"
        chunk = chunker.next_chunk(1000)
        if chunk:
            print "Adding %d items, starting with %s" % (len(chunk), chunk[0]['_id'])
            db.update(chunk)
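IteratorChunker is not shown in the message, so the following is only a minimal sketch of the same batching idea, assuming all that is needed is fixed-size lists pulled from an iterator. The name iter_chunks is hypothetical and is not part of couchdb-python, and the sketch is written for a current Python rather than the 2.5 used above.

```python
from itertools import islice


def iter_chunks(iterable, size):
    """Yield lists of up to `size` items until `iterable` is exhausted."""
    iterator = iter(iterable)
    while True:
        # islice pulls at most `size` items without loading the rest
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Under that assumption the loop above would reduce to something like "for chunk in iter_chunks(get_stuff(), 1000): db.update(chunk)", with no empty chunk to guard against.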
db.update(docs) (see <http://code.google.com/p/couchdb-python/source/browse/trunk/couchdb/client.py>,
line 360) uses the bulk API, like:
    data = self.resource.post('_bulk_docs', content={'docs': documents})
At apparently random points throughout this process, but almost
always before 15,000 records or so, the process dies with an
exception, the tail end of which looks like:
    File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 707, in send
      self.sock.sendall(str)
    File "<string>", line 1, in sendall
    socket.error: (54, 'Connection reset by peer')
If I have Futon open while it's running, I occasionally get a
JavaScript error along the lines of "killed" at the same time
(it is difficult to reproduce).
I could have it catch the reset connection and re-try, but why would
this be happening?
You appear to be hitting the weird mochiweb connection reset bug. It
causes test failures too. We are looking into it.
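
Until the reset bug is fixed, the catch-and-retry workaround mentioned above might look roughly like the sketch below. It is an illustration under stated assumptions, not a fix: update_with_retry and its parameters (attempts, delay, sleep) are hypothetical names, it is written for a current Python rather than 2.5, and it only retries when the error code is ECONNRESET.

```python
import errno
import socket
import time


def update_with_retry(update, chunk, attempts=3, delay=1.0, sleep=time.sleep):
    """Call update(chunk), retrying only on 'Connection reset by peer'."""
    for attempt in range(1, attempts + 1):
        try:
            return update(chunk)
        except socket.error as exc:
            # The first argument carries the errno (54 in the traceback above)
            code = exc.args[0] if exc.args else None
            if code != errno.ECONNRESET or attempt == attempts:
                raise
            # Simple linear backoff before sending the same chunk again
            sleep(delay * attempt)
```

It would be called as update_with_retry(db.update, chunk). Whether re-sending a chunk is safe if the server had already stored part of it is a separate question; it probably depends on the documents carrying their own _id values, as chunk[0]['_id'] suggests they do here.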