Hi,

I have a large (1.3 GB) XML file that I was trying to validate. It
turns out that the file contains a lot of non-ASCII characters, such as:
é
è
Ä
È
...etc

Encoding and internationalisation is an area I have no experience
with at all and, from what I've heard, it is rather complex and
difficult.

Being a lazy kind of guy, I thought I would cat the file and let Perl
make the substitutions wherever it found any of these characters. My
problem is that I am not sure how to write a regex for these characters,
except by looking for their hex values. Nor do I know of a way to
escape/encode them correctly.
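
To make the idea concrete, here is a rough sketch of the kind of
substitution I have in mind. It rests on assumptions I have not
confirmed: that the file is UTF-8, and that replacing each non-ASCII
character with an XML numeric character reference (e.g. &#xE9;) is an
acceptable fix. The name escape_non_ascii and the two command-line
arguments are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace every character above U+007F in an
# already-decoded string with an XML numeric character reference.
sub escape_non_ascii {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord($1))/ge;
    return $text;
}

# Stream line by line so the 1.3 GB file is never held in memory.
# ASSUMPTION: the input is UTF-8. If it is really something else
# (e.g. ISO-8859-1), the ':encoding(...)' layer has to change.
if (@ARGV == 2) {
    my ($in_path, $out_path) = @ARGV;
    open(my $in, '<:encoding(UTF-8)', $in_path)
        or die "Cannot read $in_path: $!";
    # After escaping, the output is pure ASCII, so no layer is needed.
    open(my $out, '>', $out_path)
        or die "Cannot write $out_path: $!";
    while (my $line = <$in>) {
        print {$out} escape_non_ascii($line);
    }
    close($in);
    close($out) or die "Cannot close $out_path: $!";
}
```

It would be run with an input path and an output path as its two
arguments. Matching on the range [^\x00-\x7F] avoids listing each
character by hand, but it only works if the decoding layer matches the
file's real encoding.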

I have seen the pragma utf8 but I am not sure my problem is what this
pragma was designed for. Does anyone have any suggestions for a
module or method that might take some of the pain out of detecting
and escaping such characters?

TIA,
Dp.

