Hi David

On 10-08-13 17:59 , David Maus wrote:
2. request for help about an issue with multibyte character encoding
====================================================================

There is an issue with multibyte characters that appear in the input
as unescaped, multibyte-encoded characters (not as XML entities;
multibyte characters given as XML entities are substituted correctly).
I looked for an example with a character encoding specified in the
first line of the XML feed, like
<?xml version="1.0" encoding="utf-8"?>
and found one here:
http://www.openscreencast.de/blog/rss.xml
[...]

The problem with this feed is that it contains raw Unicode characters
that must be converted to UTF-8 before they can be properly inserted
in the target buffer.

The attached patch does this by explicitly decoding new entries
according to their detected character encoding.
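
The idea of decoding according to the detected encoding can be
illustrated with a minimal sketch. This is not the patch itself (which
is Emacs Lisp for org-feed.el) but a Python illustration; the function
names detect_xml_encoding and decode_feed are hypothetical, and the
fallback to utf-8 when no encoding is declared is an assumption.

```python
import re


def detect_xml_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, or `default`.

    Assumption: the declaration, if present, is ASCII-compatible and
    appears at the very start of the feed.
    """
    match = re.match(
        rb'\s*<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']', raw
    )
    if match:
        return match.group(1).decode("ascii").lower()
    return default


def decode_feed(raw: bytes) -> str:
    """Decode raw feed bytes using the declared (or default) encoding."""
    # errors="replace" keeps going on malformed bytes instead of raising.
    return raw.decode(detect_xml_encoding(raw), errors="replace")


# A feed that declares utf-8 and contains unescaped multibyte characters.
raw = '<?xml version="1.0" encoding="utf-8"?><title>Grüße</title>'.encode("utf-8")
text = decode_feed(raw)
```

Inserting the undecoded bytes directly would show each multibyte
character as several garbage characters; decoding first yields the
intended text.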

Btw., a helpful introduction to the topic is

The Absolute Minimum Every Software Developer Absolutely, Positively
Must Know About Unicode and Character Sets (No Excuses!)

by Joel Spolsky

http://www.joelonsoftware.com/articles/Unicode.html

Thank you very much for your patch; it resolves this issue with
org-feed.el as expected. I tested your patch with the two feeds
http://www.openscreencast.de/blog/rss.xml  (declared utf-8)
and
http://pod.drs.ch/world_music_special_mpx.xml  (not declared utf-8)
which I described in more detail earlier, and with a dozen other
feeds, all with character encoding utf-8.

Michael

_______________________________________________
Emacs-orgmode mailing list
Please use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode
