In article <[EMAIL PROTECTED]>, "Mark A. Hershberger" <[EMAIL PROTECTED]> writes:

[...]

> I'm not using url-dav.el -- I'm using xml-rpc.el which I maintain.
> However, to eliminate the reliance on external code, I've pulled the bit
> from xml-rpc.el that makes the call to post to a weblog hosted on
> Blogger.com:

> (let ((url-debug t))
>   (setq url-request-data "<?xml version=\"1.0\" encoding=\"UTF-8\"?><methodCall><methodName>blogger.newPost</methodName><params><param><value><string>0123456789ABCDEF</string></value></param><param><value><string>9380140</string></value></param><param><value><string>usrname</string></value></param><param><value><string>passwrd</string></value></param><param><value><string>Iñtërnâtiônàlizætiøn from emacs with patch</string></value></param><param><value><boolean>1</boolean></value></param></params></methodCall>")
>   [...]
>       result))))))

> Without the patch that I supplied, this results in a server error:
>   "unexpected end of file found"
> With the patch, it works perfectly.  The result can be seen at
> http://emacs-weblogger.blogspot.com/

In the code above, you set url-request-data to a multibyte string.
All non-ASCII characters in "Iñtërnâtiônàlizætiøn" belong to
iso-8859-1, and Emacs internally represents each iso-8859-1 character
in 2 bytes.  That means string-bytes on url-request-data returns, by
chance, the same byte length as the result of encoding it in utf-8:

    (string-bytes "Iñtërnâtiônàlizætiøn")
      == (length (encode-coding-string "Iñtërnâtiônàlizætiøn" 'utf-8))
      == 27

That's why your change to url-http.el works for the above case.  But
that is just a coincidence.  If the string contains, for instance, an
Ethiopic character, it doesn't work.

What I still don't know is what value url-request-data should have.

If it should be an already encoded string (making it the caller's
responsibility to pre-encode the string), just using `length' as now
is OK, and you can use this kind of code:

(let ((url-debug t))
  (setq url-request-data
        (encode-coding-string "<?xml version=\"1.0\" encoding=\"UTF-8\"?><methodCall><methodName>blogger.newPost</methodName><params><param><value><string>0123456789ABCDEF</string></value></param><param><value><string>9380140</string></value></param><param><value><string>usrname</string></value></param><param><value><string>passwrd</string></value></param><param><value><string>Iñtërnâtiônàlizætiøn from emacs with patch</string></value></param><param><value><boolean>1</boolean></value></param></params></methodCall>"
                              'utf-8))

Please try it after cancelling your change.

If it should be a multibyte string, the correct way to calculate
Content-length: is to use this code:

    (length (encode-coding-string "Iñtërnâtiônàlizætiøn"
                                  url-request-coding-system))

with your patch for introducing url-request-coding-system.

Anyway, this change of yours:

> +      (set-process-coding-system connection
> +       (detect-coding-string url-request-data t)
> +       url-request-coding-system)

is bad, as Stefan wrote.  The second arg must be `binary', and we
should decode the received data according to the contents (perhaps by
parsing the header and detecting what charset is specified, and
falling back to Emacs' code detection routine).

---
Kenichi Handa
[EMAIL PROTECTED]
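As a footnote to the Content-length discussion above, here is a minimal sketch of counting bytes after encoding rather than counting characters. The helper name `my-request-body-length' is hypothetical and is not part of url-http.el or xml-rpc.el. The sketch assumes an Emacs recent enough to read the "\u1200" escape (Emacs 23 or later). It deliberately avoids string-bytes, which measures Emacs' internal representation and not what goes over the wire.

```elisp
;; Hypothetical helper for illustration only; not part of url-http.el.
(defun my-request-body-length (body coding-system)
  "Return the number of bytes BODY occupies once encoded with CODING-SYSTEM."
  (length (encode-coding-string body coding-system)))

(let ((latin "Iñtërnâtiônàlizætiøn")
      (ethiopic "\u1200"))              ; ETHIOPIC SYLLABLE HA
  (list (length latin)                               ; 20 characters
        (my-request-body-length latin 'utf-8)        ; 27 bytes
        (my-request-body-length latin 'iso-8859-1)   ; 20 bytes
        (length ethiopic)                            ; 1 character
        (my-request-body-length ethiopic 'utf-8)))   ; 3 bytes
;; => (20 27 20 1 3)
```

The byte count depends on the coding system actually used for the request, so whichever side does the encoding (the caller or the url library) is also the side that should compute the length.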