I finally got access to the CouchDB logs, and it seems this is another
bad_utf8_character_code problem.
My document looks like this:
{"_id":"72de8a27-c3d1-3626-b6ff-190925a990e4",
"_rev":"1-64d806a111cbf19740f76e78bdae097b",
"_attachments":{"ea_seite03.png":{"content_type":"image/png","revpos":1,"length":84057,"stub":true},
"ea_seite01.png":{"content_type":"image/png","revpos":1,"length":141866,"stub":true},
"ea_seite02.png":{"content_type":"image/png","revpos":1,"length":30189,"stub":true},
"content.xml":{"content_type":"application/xml","revpos":1,"length":1882,"stub":true}}
}
When I try to access the first attachment, an HTTP 200 status is returned, but
the response body is empty. Additionally, the log shows a UTF-8 encoding problem:
[Mon, 12 Sep 2011 08:34:16 GMT] [info] [<0.25582.1>] 192.168.132.25 - - 'GET'
/updateserver/72de8a27-c3d1-3626-b6ff-190925a990e4/ea_seite03.png 200
[Mon, 12 Sep 2011 08:34:16 GMT] [error] [<0.25582.1>] Uncaught error in HTTP
request: {error,
{badmatch,
<<60,134,195,221,170,229,7,35,56,121,75,
219,13,218,29,250>>}}
[Mon, 12 Sep 2011 08:34:16 GMT] [info] [<0.25582.1>] Stacktrace:
[{couch_stream,foldl,6},
{couch_util,md5_final,1},
{couch_httpd_db,do_db_req,2},
{couch_httpd,handle_request_int,5},
{mochiweb_http,headers,5},
{proc_lib,init_p_do_apply,3}]
[Mon, 12 Sep 2011 08:34:16 GMT] [error] [<0.25582.1>] {error_report,<0.33.0>,
{<0.25582.1>,crash_report,
[[{initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
{pid,<0.25582.1>},
{registered_name,[]},
{error_info,
{exit,
{ucs,{bad_utf8_character_code}},
[{xmerl_ucs,from_utf8,1},
{mochijson2,json_encode_string,2},
{mochijson2,'-json_encode_proplist/2-fun-0-',3},
{lists,foldl,3},
{mochijson2,json_encode_proplist,2},
{couch_httpd,send_json,4},
{couch_httpd,handle_request_int,5},
{mochiweb_http,headers,5}]}},
{ancestors,
[couch_httpd,couch_secondary_services,couch_server_sup,<0.34.0>]},
{messages,[]},
{links,[<0.105.0>,#Port<0.4839>]},
{dictionary,
[{mochiweb_request_qs,[]},
{jsonp,no_jsonp},
{mochiweb_request_cookie,[]}]},
{trap_exit,false},
{status,running},
{heap_size,4181},
{stack_size,24},
{reductions,5429}],
[]]}}
[Mon, 12 Sep 2011 08:34:16 GMT] [error] [<0.105.0>] {error_report,<0.33.0>,
{<0.105.0>,std_error,
{mochiweb_socket_server,235,
{child_error,{ucs,{bad_utf8_character_code}}}}}}
It does not seem to be one of the known issues with UTF-8 encoding. I'm
accessing an attachment with content type image/png (so encoding should not
matter here) with a URL that seems to have no problematic characters.
Any ideas?
This problem happened with CouchDB 1.0.2 on Windows.
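In case it helps with the diagnosis: the binary in the badmatch error is 16
bytes long, which is the size of an MD5 digest, and the stack trace passes
through couch_util:md5_final. My assumption (not confirmed) is that this is a
checksum computed while streaming the attachment. A small Python sketch that
turns the logged bytes into hex and base64, so they can be compared with a
digest computed from the original file. The function and variable names are
only for illustration:

```python
import base64
import hashlib

# Bytes copied from the badmatch term in the log above.
LOGGED_BYTES = bytes([60, 134, 195, 221, 170, 229, 7, 35, 56, 121, 75,
                      219, 13, 218, 29, 250])


def md5_digest(data: bytes) -> bytes:
    """Return the raw 16-byte MD5 digest of data."""
    return hashlib.md5(data).digest()


def matches_logged_digest(data: bytes) -> bool:
    """True if data hashes to the logged value (assumes the value is an MD5)."""
    return md5_digest(data) == LOGGED_BYTES


# The same 16 bytes in the two notations digests are usually shown in.
logged_hex = LOGGED_BYTES.hex()
logged_b64 = base64.b64encode(LOGGED_BYTES).decode("ascii")

if __name__ == "__main__":
    print(len(LOGGED_BYTES), logged_hex, logged_b64)
```

Reading the original ea_seite03.png from disk and passing its bytes to
matches_logged_digest would show whether the logged value belongs to the
file as it was uploaded.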
Thanks,
Tillmann
On Sep 7, 2011, at 4:57 PM, Robert Newson wrote:
> If you supply a Content-MD5 header in your request we will verify it
> (and reject a mismatch) just like Amazon S3 does. That doesn't imply
> that couchdb routinely corrupts attachments (it doesn't).
>
> Can you paste a full request/response where you regard the result as
> truncated or corrupted? What client software are you using? Can you
> reproduce this with curl?
>
> B.
>
> On 7 September 2011 14:53, Tillmann Seidel <[email protected]> wrote:
>> Hi,
>>
>> I have a problem with data corruption on CouchDB. I'm creating documents
>> with attachments using PUT requests in CouchDB 1.0.2. Once in a while it
>> happens that a stored document is corrupt, i.e. an attachment is truncated
>> or has no data at all. CouchDB does not return an error though when the
>> document is created.
>>
>> The description of COUCHDB-558 makes me think that this is a problem that's
>> not unheard of:
>>
>> "We could detect in-flight data corruption if a client sends a Content-MD5
>> header along with the data and Couch validates the MD5 on arrival."
>>
>> Now my question is: what might cause such an in-flight data corruption? And
>> what could I do to prevent it? Or if I cannot prevent it, can I at least
>> make CouchDB detect it during creation?
>>
>> Thanks in advance
>> Tillmann
>>
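For reference, a minimal sketch of how a client could attach the Content-MD5
header mentioned above when uploading an attachment. The header value is the
base64-encoded MD5 digest of the request body. The helper names and the
localhost URL are placeholders of mine, and I have not verified that 1.0.2
checks the header:

```python
import base64
import hashlib
import urllib.request


def content_md5(body: bytes) -> str:
    """Base64-encoded MD5 digest of body, as used in a Content-MD5 header."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def build_attachment_put(url: str, body: bytes,
                         content_type: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that carries Content-MD5."""
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": content_type,
            "Content-MD5": content_md5(body),
        },
    )


if __name__ == "__main__":
    # Placeholder URL and body; a real upload would read the PNG from disk.
    req = build_attachment_put(
        "http://localhost:5984/updateserver/some-doc-id/ea_seite03.png",
        b"example bytes",
        "image/png",
    )
    print(req.get_method(), req.get_header("Content-md5"))
```

Sending the request with urllib.request.urlopen(req) would then let the server
compare the body it received against the digest and reject a mismatch.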
-----------------------------------
Tillmann Seidel
Innoopract Informationssysteme GmbH
Email: [email protected]
Tel: +49-721-66-47-33-0
Fax: +49-721-66-47-33-29
http://www.innoopract.com
Innoopract Informationssysteme GmbH
Lammstr. 21, 76133 Karlsruhe, Germany
General Manager: Jochen Krause
Registered Office: Karlsruhe, Commercial Register Mannheim HRB 107883