Igor Klimer created COUCHDB-2040:
------------------------------------

             Summary: Compaction fails when copying attachment
                 Key: COUCHDB-2040
                 URL: https://issues.apache.org/jira/browse/COUCHDB-2040
             Project: CouchDB
          Issue Type: Bug
          Components: Database Core
            Reporter: Igor Klimer


Original discussion from the user mailing list:
http://mail-archives.apache.org/mod_mbox/couchdb-user/201401.mbox/%3cd14f971a540b974bb75adc55f00f34ca69a35...@sex1.getback.ad2008r2.corp%3e

Digest:
During database compaction, the process fails at about 50% with the following 
error: http://pastebin.com/qeaZNHMj (CouchDB 1.2.0, Windows Server 2008 R2 
Enterprise).
After upgrading both the server and CouchDB, the error is still the same: 
http://pastebin.com/feJWu7bN (CouchDB 1.5.0, Ubuntu 12.04.3 LTS (GNU/Linux 
3.8.0-33-generic x86_64)).

There was one prior attempt at compaction that failed because of insufficient 
disk space: http://pastebin.com/S1URXN0p
After this initial failure, I've made sure that there's sufficient disk space 
for the .compact file.

The .compact file was always removed before trying compaction again.
At the request of Robert Samuel Newson, I've also tried with an empty .compact 
file - the results were the same: http://pastebin.com/MJCgGM8C.

Our I/O subsystem consists of several RAID5 arrays - the admins claim that 
they've been running error-free since inception ;) We have yet to run a parity 
check, since that'd require taking the array offline and I'd rather not do 
that without exhausting other options.

Config files from the 1.2.0/Windows server (since that's where the fault must 
have occurred):
default.ini: http://pastebin.com/kUz0qyNk
local.ini: http://pastebin.com/srZUMwzB

Other than delayed_commits, which is left at its default of true, no options 
are set that could affect fsync() behaviour and such.

I've run:
curl localhost:5984/ecrepo/_changes?include_docs=true
curl localhost:5984/ecrepo/_all_docs?include_docs=true
and both calls succeeded, which would suggest that the document bodies are 
readable and that a faulty attachment (incorrect checksum/length) is to blame 
somewhere.
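
Since neither call above reads attachment bodies, one way to narrow this down 
would be to fetch every attachment individually and note which ones fail or 
come back with the wrong size. The following is only a rough Python sketch, 
not something that has been run against this database: the helper names 
(http_get, find_bad_attachments) are made up for illustration, and it assumes 
that the attachment stubs returned with include_docs=true carry a "length" 
field and that percent-encoded document ids and attachment names are accepted 
in the URL.

```python
import http.client
import json
import urllib.request
from urllib.parse import quote

BASE = "http://localhost:5984/ecrepo"  # database from the curl calls above


def http_get(url):
    """Return the raw response body for url (hypothetical helper)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def find_bad_attachments(rows, fetch=http_get, base=BASE):
    """Fetch every attachment listed in the _all_docs rows.

    Returns a list of (doc_id, attachment_name, problem) tuples for
    attachments that could not be read or whose size differs from the
    "length" recorded in the stub (assumed to be present).
    """
    bad = []
    for row in rows:
        doc = row.get("doc") or {}
        for name, stub in doc.get("_attachments", {}).items():
            url = "%s/%s/%s" % (base, quote(doc["_id"], safe=""),
                                quote(name, safe=""))
            try:
                body = fetch(url)
            except (OSError, http.client.HTTPException) as exc:
                # Covers HTTP errors as well as truncated responses.
                bad.append((doc["_id"], name, "fetch failed: %s" % exc))
                continue
            if "length" in stub and stub["length"] != len(body):
                bad.append((doc["_id"], name, "got %d bytes, stub says %d"
                            % (len(body), stub["length"])))
    return bad


# Possible usage against a live server (loads the whole listing in memory):
#   listing = json.loads(http_get(BASE + "/_all_docs?include_docs=true"))
#   for doc_id, name, problem in find_bad_attachments(listing["rows"]):
#       print(doc_id, name, problem)
```

Any attachment reported this way would be a candidate for the one that breaks 
compaction; a clean run would not prove the file is healthy, only that every 
attachment of the current revisions can still be read in full.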



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
