On 20 Aug 2012, at 18:03, Gregory Williams wrote:

> On Aug 20, 2012, at 12:49 PM, Steve Harris wrote:
>
>> How do other implementations represent the C0 control chars in SPARQL XML
>> result format?
>>
>> They're not legal in XML 1.0
>> (http://en.wikipedia.org/wiki/Valid_characters_in_XML#XML_1.0), and it seems
>> that many XML libraries choke on XML 1.1 data.
>>
>> This is a bit unfortunate if you have C0 chars in your literals.
>>
>> Things we've considered:
>>
>> * try to conneg XML 1.1 so at least our clients can take it (doesn't appear
>>   to be easy/obvious how, and some things are not even legal in XML 1.1)
>> * replace C0 chars with something else from unicode, and return a 203
>>   status, or something similar
>> * give an error
>>
>> None of these is terribly satisfactory though.
>
> I'm sure my system breaks on control chars, but my initial thought after
> reading your email was to use the replacement character (U+FFFD) in place of
> the control chars. I agree it's not terribly satisfying, though.

This is what we've gone with, returning a 200 code, on the basis that the
U+FFFD chars should be enough of a clue that there were representation issues.

- Steve

-- 
Steve Harris, CTO
Garlik, a part of Experian
+44 7854 417 874  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ
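
For illustration only, the U+FFFD substitution discussed above might be sketched in Python roughly as follows. This is not the code used in any of the implementations mentioned in this thread; the function name sanitize_for_xml10 and the constant names are hypothetical. It only handles the C0 range (leaving tab, LF and CR alone, since XML 1.0 allows those) and does not deal with other characters that XML 1.0 forbids, such as U+FFFE and U+FFFF.

```python
import re

# C0 control characters that XML 1.0 does not allow.
# Tab (U+0009), LF (U+000A) and CR (U+000D) are legal and are left untouched.
_ILLEGAL_C0 = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# Unicode replacement character, as suggested in the thread.
REPLACEMENT = "\ufffd"


def sanitize_for_xml10(value: str) -> str:
    """Return value with illegal C0 control chars replaced by U+FFFD.

    Hypothetical helper: intended to be applied to a literal's lexical
    form before it is written into a SPARQL XML results document.
    """
    return _ILLEGAL_C0.sub(REPLACEMENT, value)
```

A serializer would call this on each literal before XML-escaping it, so a client that sees U+FFFD in a result knows the original value could not be represented exactly.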
