On May 19, 2006, at 7:55 PM, Sarah Reichelt wrote:

Hi All,

I have a routine that downloads a web page and extracts certain text.
This works fine except when the characters are accented. I'm not sure
how well the characters will transfer in the email, but I'll try to
give an example:

Accented e (é) - I never could remember which was an acute and which
was a grave but it's numToChar(142). On the web page viewed in a
browser and checking the source, it looks perfect. When I download
that page into Rev, the é becomes "√©" i.e. square root &
copyright, charToNum 195 & 169.

I've tried using ISOtoMac and uniDecode, and the two combined in various
ways, but I can't get it to give me the correct accented e.

Any ideas?

Sarah,

If it's on a web page, it might be UTF-8 (the meta tag in the page source might tell you for sure), especially since Rev is showing the single character as two characters. You could try this to see what happens:

put url ("http://the.web.page/file.html") into tRawHtmlTxt
-- extract the stuff you want here
set the unicodetext of fld "myfld" to uniencode(tRawHtmlTxt,"utf8")

See if that helps.
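
To see why UTF-8 is the likely culprit, here is a rough sketch of the byte arithmetic. It is written in Python rather than Rev purely as an illustration, and it assumes the page is UTF-8 and that Rev on the Mac is treating the downloaded bytes as MacRoman; the variable names are made up for the example.

```python
# Illustration only: reproduce the byte values from the question.
# Assumes the page is UTF-8 and the bytes get read as MacRoman.
ACCENTED_E = "\u00e9"  # the accented e

# In MacRoman the accented e is a single byte, value 142.
mac_byte = ACCENTED_E.encode("mac_roman")

# In UTF-8 the same character is stored as two bytes, 195 and 169.
utf8_bytes = ACCENTED_E.encode("utf-8")

# Reading those two bytes as MacRoman gives two characters:
# square root (U+221A) followed by copyright (U+00A9).
misread = utf8_bytes.decode("mac_roman")

# Decoding the bytes as UTF-8 instead recovers the one character.
recovered = utf8_bytes.decode("utf-8")

print(list(mac_byte), list(utf8_bytes))
print([hex(ord(ch)) for ch in misread])
print(recovered == ACCENTED_E)
```

The 195 and 169 match the charToNum values reported above, which is what you would expect if the text is UTF-8 being displayed without conversion.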

Devin

Devin Asay
Humanities Technology and Research Support Center
Brigham Young University
