Dump the HTTP response headers and look at the Content-Type header.  It can 
carry a "charset" parameter that specifies the encoding of the response stream.
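
To illustrate what to look for in the dump, here is a minimal sketch in Python 
(not CF) that pulls the "charset" parameter out of a Content-Type value.  The 
helper name charset_from_content_type is made up for this example; the parsing 
is done by the standard library's email.message.Message.

```python
from email.message import Message


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None if absent.

    Hypothetical helper for illustration; the name is not from any library.
    """
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() lowercases the value and returns None when the
    # header carries no charset parameter.
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=ISO-8859-1"))
print(charset_from_content_type("text/html"))
```

The second call is the "missing charset" case: nothing in the header tells the 
client how to decode the body.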

Most likely it is missing or incorrect.

If the "charset" is missing, CF assumes UTF-8.  In UTF-8, however, any byte 
with a value above 127 is part of a multibyte sequence (up to 4 bytes in 
current UTF-8; the original definition allowed up to 6), and not every byte 
sequence is valid UTF-8.  When CF tries to decode a stream that is not valid 
UTF-8, the result is truncated or the conversion aborts.  Text in any 8-bit 
encoding that uses the high codes will not be converted correctly unless the 
proper charset setting is used.  That looks like your case.
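
The failure mode can be reproduced outside CF.  The sketch below uses Python 
only as a stand-in for any strict UTF-8 decoder; the sample text and the helper 
name try_decode are assumptions for the example, and CF's own behaviour on bad 
input (truncation or abort) may differ from the exception shown here.

```python
def try_decode(raw, charset):
    """Decode raw bytes with the given charset; return None if they are invalid.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return None


# "café" in an 8-bit encoding: the last character is the single byte 0xE9.
raw = "café".encode("latin-1")

# As UTF-8, 0xE9 announces a multibyte sequence, but no continuation bytes
# follow, so strict decoding rejects the stream.
print(try_decode(raw, "utf-8"))

# With the charset the bytes were really written in, decoding succeeds.
print(try_decode(raw, "latin-1"))

# A lenient decoder substitutes U+FFFD for the broken sequence instead.
print(raw.decode("utf-8", errors="replace"))
```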

The same happens if the HTTP header says UTF-8 but the stream is actually in 
some 8-bit national encoding.  MSXML does not attempt that conversion: it 
assumes your default (8-bit) codepage and makes UTF-16 out of the bytes.  This 
is why you get all the characters, but not necessarily the correct ones.
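
A small Python sketch of the "all characters, but the wrong ones" effect.  The 
two codepages are assumptions picked for the example: the sender is taken to 
write windows-1251 (Cyrillic) and the reader's default codepage is taken to be 
windows-1252.  This only imitates the byte-for-byte mapping described above; it 
does not call MSXML.

```python
original = "Привет"

# Bytes as the sender wrote them (assumed codepage: windows-1251).
raw = original.encode("cp1251")

# Reader maps each byte through its own default codepage (assumed: windows-1252).
# Every byte maps to some character, so nothing is lost or rejected.
misread = raw.decode("cp1252")

print(len(original), len(misread))   # same number of characters
print(misread)                       # but they are Latin letters, not Cyrillic

# The bytes are intact, so re-encoding with the wrong codepage and decoding
# with the right one recovers the text.
print(misread.encode("cp1252").decode("cp1251"))
```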

To sum up: provide the correct charset attribute in CFHTTP.  This can be 
tricky, since you may not always know how the original stream was encoded.
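
When the encoding really is unknown, one common workaround is to try strict 
decoders first and fall back to an 8-bit codepage.  The Python sketch below is 
only one possible policy, not something CF does for you: the function name 
decode_body, the order of attempts, and the cp1252 fallback are all assumptions 
for the example, and a wrong fallback still yields wrong characters.

```python
def decode_body(raw, declared=None, fallback="cp1252"):
    """Decode raw bytes, returning (text, charset_used).

    Hypothetical helper: tries the declared charset, then UTF-8, and only
    then the 8-bit fallback (an assumed default, adjust for your data).
    """
    for charset in (declared, "utf-8"):
        if not charset:
            continue
        try:
            return raw.decode(charset), charset
        except (UnicodeDecodeError, LookupError):
            # Invalid bytes for this charset, or an unknown charset name.
            continue
    # An 8-bit codepage accepts almost any byte, so this is a last resort.
    return raw.decode(fallback, errors="replace"), fallback


print(decode_body("café".encode("utf-8")))
print(decode_body("café".encode("cp1252")))
print(decode_body("café".encode("cp1252"), declared="utf-8"))
```

UTF-8 is tried before the fallback because valid UTF-8 is rarely produced by 
accident, whereas an 8-bit decoder will accept nearly anything.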

