thanks for the input Mark.

hmm, the webservice is built in CF pulling from a MySQL DB all on SunOS 5.8.

I was wondering: the DSN setup in CFAdmin has a checkbox for 'Enable
Unicode...'. Might that cure it? Thoughts?

I'm using a crude replace function like this one at the moment, but it's
not exhaustive.

function cleantext(text) {
    // Each code below is 65280 (0xFF00) plus a Windows 1252 code position:
    // 145-148 are the curly quotes, 150 is the en dash.
    text = Replace(text, Chr(65425), "'", "All");
    text = Replace(text, Chr(65426), "'", "All");
    text = Replace(text, Chr(65427), "'", "All");
    text = Replace(text, Chr(65428), "'", "All");
    text = Replace(text, Chr(65430), "-", "All");
    return text;
}

----- Original Message -----
From: Mark Woods <[EMAIL PROTECTED]>
Date: Fri, 10 Sep 2004 15:50:25 +0100
Subject: Re: charset issue
To: CF-Talk <[EMAIL PROTECTED]>

>running CFMX 6.1 on winblows with the almighty oracle.  We are
>grabbing some data from another server running CFMX 6.1 on Unix with
>MySQL via a webservice.  Seems the data we get is not displaying
>correctly.  Things like the - char display as a weird backward E among
>other things.  Is there a way to mod the returned data?  Or does the
>source have to be cleaned up?

Wild guess - the source contains characters from the Windows 1252 character
set at code positions that ISO-Latin1 and Unicode reserve for control
characters. Seems a bit odd seeing as the source is a Unix machine (usually
UTF-8 encoded Unicode characters), but I suppose it's possible.

Can you determine what character code the weird backward E is? This can be
done using JS btw (String.charCodeAt()). It may be code position 150, which
is used by Windows 1252 for en dash. In CFMX, which as far as I know uses
the Unicode character set (UTF-8 encoding) by default and does not support
Windows 1252, the character at code position 150 cannot be printed because
code positions 128-159 are reserved for control characters in Unicode
(note: Unicode is a superset of ISO-Latin1 so the same applies to Latin1).
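
To find out which code the odd character really has, something along these lines could be run in a browser or any JS console. It is only a rough sketch: listCharCodes is a made-up helper name, and the 128-159 check simply reflects the control-character range mentioned above.

```javascript
// List every character of a string with its code, flagging codes 128-159.
// listCharCodes is a hypothetical helper name, not part of any library.
function listCharCodes(text) {
  const rows = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    rows.push({
      index: i,
      code: code,
      // 128-159 are control characters in Latin1 and Unicode
      suspect: code >= 128 && code <= 159
    });
  }
  return rows;
}

// Usage: a test string with a raw code 150 between two digits
const sample = "9" + String.fromCharCode(150) + "5";
console.log(listCharCodes(sample).filter(function (row) { return row.suspect; }));
```

If the code that comes back is outside 128-159, the data has probably been mangled somewhere else along the way and the guess above does not apply.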

Warning: I'm actually running CF5, so I might have made a mistake here - no
doubt somebody will point it out if that's the case.

Here's a code snippet from the Html property in Speck CMS which will
convert these Windows 1252 characters into their nearest counterpart in
Latin1 or a html entity. You could alternatively convert them to their
Unicode equivalents, but you'll need to figure out that mapping yourself ;)

if ( convertWin1252 ) {
    // convert windows 1252 chars to nearest latin1 or html entity
    lFind =
"#chr(128)#,#chr(130)#,#chr(131)#,#chr(132)#,#chr(133)#,#chr(134)#,#chr(135)#,#chr(136)#,#chr(137)#,#chr(138)#,#chr(139)#,#chr(140)#,#chr(142)#,#chr(145)#,#chr(146)#,#chr(147)#,#chr(148)#,#chr(149)#,#chr(150)#,#chr(151)#,#chr(152)#,#chr(153)#,#chr(154)#,#chr(155)#,#chr(156)#,#chr(158)#,#chr(159)#";
    lReplace =
"&euro;,&sbquo;,&fnof;,&bdquo;,&hellip;,&dagger;,&Dagger;,&circ;,&permil;,&Scaron;,&lsaquo;,&OElig;,&##381;,&lsquo;,&rsquo;,&ldquo;,&rdquo;,&bull;,&ndash;,&mdash;,#chr(126)#,&trade;,&scaron;,&rsaquo;,&oelig;,&##382;,&Yuml;";
    inHTML = replaceList(inHTML, lFind, lReplace);
}
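
For the Unicode route, here is a minimal sketch of the mapping, written in JavaScript rather than CFScript so it can be tried in any JS console. It assumes the text really does arrive with code positions 128-159 intact. The names WIN1252_TO_UNICODE and win1252ToUnicode are made up for illustration and are not part of Speck.

```javascript
// Windows 1252 code positions 128-159 mapped to Unicode code points.
// Positions 129, 141, 143, 144 and 157 are unassigned in Windows 1252,
// so they are left out. Both names here are hypothetical.
const WIN1252_TO_UNICODE = {
  128: 0x20AC, // euro sign
  130: 0x201A, // single low-9 quotation mark
  131: 0x0192, // f with hook
  132: 0x201E, // double low-9 quotation mark
  133: 0x2026, // horizontal ellipsis
  134: 0x2020, // dagger
  135: 0x2021, // double dagger
  136: 0x02C6, // modifier circumflex
  137: 0x2030, // per mille sign
  138: 0x0160, // S with caron
  139: 0x2039, // single left angle quotation mark
  140: 0x0152, // OE ligature
  142: 0x017D, // Z with caron
  145: 0x2018, // left single quotation mark
  146: 0x2019, // right single quotation mark
  147: 0x201C, // left double quotation mark
  148: 0x201D, // right double quotation mark
  149: 0x2022, // bullet
  150: 0x2013, // en dash
  151: 0x2014, // em dash
  152: 0x02DC, // small tilde
  153: 0x2122, // trade mark sign
  154: 0x0161, // s with caron
  155: 0x203A, // single right angle quotation mark
  156: 0x0153, // oe ligature
  158: 0x017E, // z with caron
  159: 0x0178  // Y with diaeresis
};

// Replace each character in 128-159 that has a mapping; leave the rest alone.
function win1252ToUnicode(text) {
  return text.replace(/[\u0080-\u009F]/g, function (ch) {
    const mapped = WIN1252_TO_UNICODE[ch.charCodeAt(0)];
    return mapped === undefined ? ch : String.fromCharCode(mapped);
  });
}

// Usage: code 150 becomes a real en dash (U+2013)
console.log(win1252ToUnicode("9" + String.fromCharCode(150) + "5"));
```

Unlike the entity list above, this keeps the output as plain characters, so it only helps if the page is then served in an encoding that can represent them, such as UTF-8.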

Mark