Hi All,

I am trying to fetch a page with wget, but wget appears to be mangling
some of the characters in the page, and I'm not sure why.

For example, the command

# wget -d http://sites.google.com/a/dutymanagement.org/members/who-s-who/operations-team

shows that the character set is picked up correctly:

Content-Type: text/html; charset=utf-8

However, the downloaded file shows lines such as this in vi:

... </A><200e> &gt; <200e> ...

The page is garbled in a web browser too. I'm not quite sure what the
<200e> means or where it comes from.
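
In case it helps anyone reproduce this, below is a minimal Python sketch
for listing the non-ASCII characters in a line. It assumes that vi's
<200e> notation stands for a Unicode code point, which is my guess rather
than something wget or the page confirms. The sample string and the
describe_non_ascii helper are hypothetical, modeled on the line above.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Hypothetical sample: the fragment from the downloaded file, assuming
# <200e> in vi means the code point U+200E.
sample = "</A>\u200e &gt; \u200e"

for entry in describe_non_ascii(sample):
    print(entry)
```

Running the same helper on the real file contents (opened with
encoding="utf-8") should show whether those characters arrive intact or
are altered during the download.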

Is there an easy way to prevent this from occurring? (I've tried setting
the header with --header='Accept-Charset: UTF-8'.)

Many thanks,

Alex

