On 10.12.2005 23:42 (+0100), Norman Rasmussen wrote:
> Trying to get an object serialized into a utf-8 xml string _without_
> the 0xfeff header.
> 
> public static string GetStringFromObject(object Object, Type Type) {
>     MemoryStream ms = new MemoryStream();
>     StreamWriter sw = new StreamWriter(ms, Encoding.UTF8);
>     XmlTextWriter xw = new XmlTextWriter(sw);
>     xw.Formatting = Formatting.Indented;
> 
>     XmlSerializer serializer = new XmlSerializer(Type);
>     serializer.Serialize(xw, Object);
> 
>     return Encoding.UTF8.GetString(ms.ToArray()).TrimStart('\xfeff');
> }

I can't say much about the serialisation side of this, but with ordinary
string operations I have never seen a BOM (aka signature) turn up. This
function has worked correctly in several tests:

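// Re-encodes a string so that each char holds one UTF-8 byte (0...255).
// Requires: using System.Text;
private string ToUTF8(string data)
{
    // Encoding.UTF8.GetBytes(string) returns the raw bytes, without a BOM
    byte[] utf8Bytes = Encoding.UTF8.GetBytes(data);
    // Codepage 28591 is ISO-8859-1: one byte becomes one char, 1:1
    return Encoding.GetEncoding(28591).GetString(utf8Bytes);
}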

It converts text into UTF-8. Of course, .NET still stores the result as
UTF-16 internally, so it's actually UTF-8-in-UTF-16: the code points are
all within ISO-8859-1 (0...255), so the high byte of every UTF-16 code
unit is 0. Base64'ing needs a byte[] anyway, and encoding the string
back through ISO-8859-1 yields exactly the UTF-8 bytes, so all is
fine... :)
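
To make the round trip concrete, here is a minimal sketch of how the
packed string could be turned back into bytes before base64'ing. The
class name Utf8InStringDemo and the helper ToBytes are hypothetical
names made up for this illustration; ToUTF8 is the function from above,
only made static. It assumes code page 28591 is available, which should
be the case on a standard .NET installation.

```csharp
using System;
using System.Text;

static class Utf8InStringDemo
{
    // Same technique as ToUTF8 above: every UTF-8 byte becomes one char
    // in the range U+0000..U+00FF.
    public static string ToUTF8(string data)
    {
        return Encoding.GetEncoding(28591).GetString(
            Encoding.UTF8.GetBytes(data));
    }

    // Hypothetical inverse helper: turns the packed string back into the
    // raw UTF-8 bytes, one byte per char.
    public static byte[] ToBytes(string packed)
    {
        return Encoding.GetEncoding(28591).GetBytes(packed);
    }

    static void Main()
    {
        // "Grüße €" written with escapes so the source file encoding
        // does not matter.
        string original = "Gr\u00FC\u00DFe \u20AC";

        string packed = ToUTF8(original);
        byte[] raw = ToBytes(packed);

        // One char per UTF-8 byte, so packed is longer than original.
        Console.WriteLine(packed.Length == raw.Length);

        // No BOM: the packed string does not start with EF BB BF.
        Console.WriteLine(
            !packed.StartsWith("\u00EF\u00BB\u00BF", StringComparison.Ordinal));

        // Decoding the bytes as UTF-8 gives the original text back.
        Console.WriteLine(Encoding.UTF8.GetString(raw) == original);

        // The byte[] is what base64 actually operates on.
        Console.WriteLine(Convert.ToBase64String(raw));
    }
}
```

If the byte[] is all that is needed, calling Encoding.UTF8.GetBytes
directly skips the detour through the string; the packed form is only
useful when an API insists on a string.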

-- 
Yves Goergen "LonelyPixel" <[EMAIL PROTECTED]>
"Does the movement of the trees make the wind blow?"
http://newsboard.unclassified.de - Unclassified NewsBoard Forum
