christian b wrote:
[...]
These are the feed addresses that I want to incorporate. Neither has
an encoding set (do RSS feeds have to have one?), but they
clearly contain UTF-8 encoded characters.
http://www.industrial-technology-and-witchcraft.de/index.php/ITW/itw-rss20/
http://www.netzpolitik.org/feed/
Hi Christian,
Have a look at the HTTP response headers[1] of those feeds. The
"netzpolitik" feed's header clearly states that it is iso-8859-1.
Recoding happens automagically when XML is generated from that source.
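The header check can also be scripted instead of being read off a browser extension. A minimal sketch in Python, using only the standard library; the function names charset_from_content_type and feed_charset are made up for this illustration:

```python
from email.message import Message
from typing import Optional
from urllib.request import urlopen


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and returns None if absent
    return msg.get_content_charset()


def feed_charset(url: str) -> Optional[str]:
    """Fetch a URL and report the charset declared in its HTTP response header."""
    with urlopen(url) as response:
        # response.headers is an email.message.Message subclass
        return response.headers.get_content_charset()
```

Calling charset_from_content_type("text/xml; charset=ISO-8859-1") gives "iso-8859-1"; a header without a charset parameter gives None. The header value here is an example, not a captured response from either feed.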
If you try the following snippet in your pipeline, the output
will have a parsing error[2], but its source will be encoded strictly
according to the encoding settings of your serializer:
<map:match pattern="netzpolitik">
  <map:generate src="http://www.netzpolitik.org/feed"/>
  <map:serialize/>
</map:match>
In the case of <http://www.netzpolitik.org/feed/>, you go in with iso-8859-1
and come out with utf-8 (unless you changed the settings of your
XML serializer).
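What happens to the bytes on that path can be illustrated outside Cocoon. The following Python sketch only shows the recoding step itself and is not how Cocoon implements it; the function name recode and the sample strings are assumptions:

```python
def recode(data: bytes, source_encoding: str, target_encoding: str = "utf-8") -> bytes:
    """Decode bytes using the declared source encoding, then re-encode them."""
    text = data.decode(source_encoding)
    return text.encode(target_encoding)


# "ü" (U+00FC) is one byte, 0xFC, in iso-8859-1 and two bytes, 0xC3 0xBC, in utf-8
latin1_bytes = "f\u00fcr".encode("iso-8859-1")
utf8_bytes = recode(latin1_bytes, "iso-8859-1")

# If the input is really utf-8 but declared as iso-8859-1, decoding still
# succeeds and every multi-byte character is silently encoded twice.
double_encoded = recode("\u00fc".encode("utf-8"), "iso-8859-1")
```

The last case matters for feeds whose declared encoding and actual content disagree: iso-8859-1 accepts any byte sequence, so the mismatch shows up as garbled characters in the output rather than as a decoding failure.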
You will also have to make sure that the character encoding of your output
<encoding>UTF-8</encoding>
matches the encoding information that your serializer sends in the HTTP
response header, e.g.
mime-type="application/xhtml+xml; charset=utf-8"
The following example xhtml serializer config contains both pieces of
information.
<map:serializer name="xhtml"
    mime-type="application/xhtml+xml; charset=utf-8"
    logger="sitemap.serializer.xhtml"
    pool-grow="2" pool-max="64" pool-min="2"
    src="org.apache.cocoon.components.serializers.XHTMLSerializer">
  <encoding>UTF-8</encoding>
  <indent>no</indent>
</map:serializer>
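Whether the two settings agree can be checked mechanically. A rough Python sketch, assuming the serializer element is available as a standalone string; the function name encodings_agree is hypothetical, and the xmlns:map declaration is added here only so the fragment parses on its own (in a real sitemap it sits on the root element):

```python
import xml.etree.ElementTree as ET
from email.message import Message

# Shortened copy of the config above, with the namespace declared inline
SERIALIZER_CONFIG = """
<map:serializer xmlns:map="http://apache.org/cocoon/sitemap/1.0"
    name="xhtml"
    mime-type="application/xhtml+xml; charset=utf-8"
    src="org.apache.cocoon.components.serializers.XHTMLSerializer">
  <encoding>UTF-8</encoding>
  <indent>no</indent>
</map:serializer>
"""


def encodings_agree(serializer_xml: str) -> bool:
    """True if the mime-type charset equals the <encoding> child, ignoring case."""
    element = ET.fromstring(serializer_xml)
    header = Message()
    header["Content-Type"] = element.get("mime-type", "")
    declared = header.get_content_charset()  # lower-cased, or None if missing
    configured = (element.findtext("encoding") or "").strip().lower()
    return declared is not None and declared == configured
```

For the config shown, encodings_agree(SERIALIZER_CONFIG) is true; changing only the <encoding> element to ISO-8859-1 makes it false, which is the mismatch to avoid.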
Which generator have you been using? Maybe I didn't fully
understand your problem ...
[1] <http://livehttpheaders.mozdev.org/> for Firefox/Mozilla users
[2] XML Parsing Error: not well-formed
Location: http://bodo:8080/netzpolitik
Line Number 27, Column 17: