[ https://issues.apache.org/jira/browse/COUCHDB-345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749216#action_12749216 ]
Curt Arnold commented on COUCHDB-345:
-------------------------------------

I'm guessing that the performance difference would be reduced if couch_db:json_decode were modified so that it only called xmerl_ucs:from_utf8 when it encountered a byte value > 0x7F in the binary. I'll leave it to Adam to code and benchmark it, to avoid demonstrating my Erlang newbieness.

> "High ASCII" can be inserted into db but not retrieved
> ------------------------------------------------------
>
>                 Key: COUCHDB-345
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-345
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 0.9
>         Environment: OSX 10.5.6
>            Reporter: Joan Touzet
>         Attachments: badenc1.patch, badtext.tar.gz, enctest.zip,
> reject_invalid_utf8.patch
>
>
> It is possible to PUT/POST a document into CouchDB with a "high ASCII" value
> that cannot be retrieved. This results from not escaping a non-ASCII value
> into \u#### when PUT/POSTing the document.
> The attached sample code will recreate the problem using the hex value D8 (Ø)
> in a possibly unsavoury test string.
> Sample output against 0.9.0 is as follows:
> ================================================
> {
>     "ok": true
> }
> {
>     "id": "fail",
>     "ok": true,
>     "rev": "1-76726372"
> }
> {
>     "error": "ucs",
>     "reason": "{bad_utf8_character_code}"
> }
> ================================================
> Please note this defect turned up another problem, namely that the
> bad_utf8_character_code exception thrown by a design document attempting to
> map() the bad document caused Futon to fail silently in building the view,
> with no indication (except via debug log) that there was a failure. The log
> indicated two attempts to build the view, both failing, followed by an
> uncaught exception error for Futon.
> Based on this, there are likely other areas in the codebase that do not
> handle the bad_utf8_character_code exception correctly.
> My belief is that CouchDB shouldn't accept this input and should have
> rejected the PUT/POST, or should have escaped the input itself before the
> insertion.
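
For illustration only, here is a rough sketch of the two ideas in this thread: skipping the UTF-8 decoding step unless a byte above 0x7F is present, and rejecting input that is not valid UTF-8. It is written in Python rather than Erlang and is not CouchDB code; the function names decode_json_bytes and is_acceptable are hypothetical, and it has not been benchmarked.

```python
def decode_json_bytes(body: bytes) -> str:
    """Decode a request body, taking a fast path when it is pure ASCII."""
    # Fast path: if no byte is above 0x7F the body is plain ASCII,
    # so the full UTF-8 validation step can be skipped.
    if all(b <= 0x7F for b in body):
        return body.decode("ascii")
    # Slow path: strict UTF-8 decoding raises UnicodeDecodeError on
    # malformed input instead of storing it silently.
    return body.decode("utf-8", errors="strict")


def is_acceptable(body: bytes) -> bool:
    """Return True if the body could be stored and later read back."""
    try:
        decode_json_bytes(body)
        return True
    except UnicodeDecodeError:
        return False
```

With this sketch, a body containing the lone byte D8 from the report, such as b'{"a": "\xd8"}', is rejected up front, while the same character encoded as UTF-8 (bytes C3 98) is accepted.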