I'm trying to retrieve an RSS feed using CFHTTP.  The problem is that
the feed uses an extended character set (it's a French feed) and the
extended characters aren't being returned properly in the
cfhttp.fileContent variable unless the charset is specified as
"iso-8859-1".  This is the character set specified within the feed
XML, but it's not specified in the response header.  The response
charset is an empty string.

Even specifying UTF-8 as the charset (I know it's the default, but it
was worth trying explicitly) does not return the characters properly.
My code:

<!--- charset: I also tried "utf-8", and tried omitting the attribute entirely --->
<cfhttp url="#form.feedURL#"
        method="GET"
        throwonerror="yes"
        charset="iso-8859-1"
></cfhttp>

I've run this three ways: without the charset attribute, with
charset="utf-8", and with charset="iso-8859-1" (shown above).  Only the
iso-8859-1 version returns the accented characters intact.

The feed I'm trying to retrieve is
http://www.lemonde.fr/rss/sequence/0,2-3208,1-0,0.xml.  I'd really
prefer to use UTF-8 as the charset because it gives me the most
flexibility.  What I'm wondering is:

1.  Why doesn't UTF-8 return the characters properly?  I thought that,
for most content, UTF-8 would handle the vast majority of characters -
certainly the French language's accented "e", etc.
2.  What options, if any, do I have for returning these characters
properly?
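
To make question 1 concrete, here is a minimal sketch of what I suspect
is going on, assuming the feed's bytes really are ISO-8859-1 as its XML
declaration says.  It doesn't touch the feed at all: it takes a single
accented "e" (chr(233)), converts it to bytes in each encoding, and then
decodes the ISO-8859-1 bytes both ways.  The variable names are mine and
purely illustrative.

```cfm
<!--- Illustration only: chr(233) is an accented "e"; variable names are hypothetical. --->
<cfset eAcute = chr(233)>

<!--- The same character as raw bytes in each encoding. --->
<cfset latin1Bytes = charsetDecode(eAcute, "iso-8859-1")>
<cfset utf8Bytes = charsetDecode(eAcute, "utf-8")>

<!--- Decode the ISO-8859-1 bytes with each charset. --->
<cfset decodedAsLatin1 = charsetEncode(latin1Bytes, "iso-8859-1")>
<cfset decodedAsUtf8 = charsetEncode(latin1Bytes, "utf-8")>

<cfoutput>
ISO-8859-1 bytes (hex): #binaryEncode(latin1Bytes, "hex")#<br>
UTF-8 bytes (hex): #binaryEncode(utf8Bytes, "hex")#<br>
<!--- compare() returns 0 when the two strings are identical. --->
Decoded as ISO-8859-1 vs. original: #compare(decodedAsLatin1, eAcute)#<br>
Decoded as UTF-8 vs. original: #compare(decodedAsUtf8, eAcute)#
</cfoutput>
```

In ISO-8859-1 the character is the single byte 0xE9, while in UTF-8 it
is the two bytes 0xC3 0xA9.  A lone 0xE9 is not a valid UTF-8 sequence,
so if the feed is sent as ISO-8859-1 and decoded as UTF-8, I'd expect
the accented characters to come out mangled, which matches what I'm
seeing.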

I'm familiar with character encoding, but hardly an expert.  Any
guidance would be appreciated.

Thanks.

-- 

Rob Wilkerson
