Is there any intention to fix couch's handling of "unusual" Unicode characters? One of these "unusual" characters is the right single quote (bytes 226, 128, 153), which is valid UTF-8 and, IMO, not very unusual at all.
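For reference, the byte sequence quoted above can be checked with a few lines of Python. This is only a sanity check of the encoding itself, using the standard library, and says nothing about what couch does internally; the variable names are illustrative.

```python
import unicodedata

# The three bytes quoted above, as decimal values.
raw = bytes([226, 128, 153])

# Decoding as UTF-8 succeeds and yields a single code point.
char = raw.decode("utf-8")

print(hex(ord(char)))               # code point of the decoded character
print(unicodedata.name(char))       # its official Unicode name
print(char.encode("utf-8") == raw)  # encoding it again gives the same bytes
```

So the sequence is well-formed UTF-8 for a single, ordinary punctuation character.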
I have an interface which allows users to add and edit text in a db document (again, not very unusual). This one came up because someone cut and pasted text from a source that used the right single quote as an apostrophe, which is just plain common. In fact, they are used in the online "Definitive Guide".

So I am having to maintain a switch statement which filters out these characters and replaces them with HTML entities before they get sent to couch. That is okay in my case, since the documents are just being used as HTML pages anyway. But it's an awkward and unnecessary solution: individual developers should not have to deal with this, and proper UTF-8 handling should be built into couch.

For one thing, it means that anyone worried about such "unusual" possibilities cannot use couchapp or couch directly, because the data has to be filtered server side first. Although spidermonkey handles UTF-8 fine, depending on client-side filtering is not always an option.

Sincerely, MK

-- 
"Enthusiasm is not the enemy of the intellect." (said of Irving Howe)
"The angel of history[...]is turned toward the past." (Walter Benjamin)
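
P.S. To make the workaround concrete, here is a minimal sketch of that kind of pre-filter. It is not the actual code (which is a switch statement over specific characters); it is written in Python for brevity, the function name escape_non_ascii is hypothetical, and it assumes it is acceptable to escape every non-ASCII character rather than a chosen few.

```python
def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a numeric HTML entity."""
    # The "xmlcharrefreplace" error handler substitutes each character
    # that cannot be encoded as ASCII with "&#NNNN;", where NNNN is its
    # decimal code point. ASCII characters pass through unchanged.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# A pasted right single quote used as an apostrophe.
print(escape_non_ascii("it\u2019s"))
```

This only makes sense when the stored text ends up rendered as HTML, which is exactly why it is a workaround and not a fix.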
