Oh, actually I have.
I even have a case that does not work with mcs but works with csc -
i.e. the case where csc detects UTF-8 whether or not there is a BOM.
I forgot one thing - with regard to that remaining problem, we need
to fix the WinForms build, because KeyboardLayouts.cs seems to contain
raw non-ASCII characters:
syntax error, got token `IDENTIFIER'
System.Windows.Forms\KeyboardLayouts.cs(93,51): error CS1526: A new
expression requires () or [] after type
System.Windows.Forms\KeyboardLayouts.cs(97,62): error CS8025: Parsing error
Compilation failed: 2 error(s), 0 warnings
They should be replaced by \uXXXX but I have no idea what those
characters actually are :|
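
For illustration only, here is a minimal sketch of how raw non-ASCII
characters could be turned into \uXXXX escapes once the file has been
decoded with the right encoding. UnicodeEscaper and Escape are
hypothetical names, and the sample string is arbitrary; it is not taken
from KeyboardLayouts.cs.

```csharp
using System;
using System.Text;

public static class UnicodeEscaper
{
	// Hypothetical helper: replaces every UTF-16 code unit above U+007F
	// with a \uXXXX escape and leaves ASCII untouched.
	public static string Escape (string source)
	{
		StringBuilder sb = new StringBuilder (source.Length);
		foreach (char c in source) {
			if (c > 0x7F)
				sb.AppendFormat ("\\u{0:X4}", (int) c);
			else
				sb.Append (c);
		}
		return sb.ToString ();
	}

	public static void Main ()
	{
		// U+00E9 is an arbitrary sample character.
		Console.WriteLine (Escape ("caf\u00E9"));
	}
}
```

This only helps if the source was decoded correctly in the first place;
if the wrong code page is assumed, the escapes would encode the wrong
characters.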
Atsushi Eno
Marek Safar wrote:
Hello Eno,
Could you write some tests to cover this functionality? I mean e.g. a
simple test file with a UTF header.
Thanks,
Marek
Hi again,
Agreed. In fact, I was also fixing bug #75065, which may be a duplicate.
I have a fix for UTF8Encoding, but it uncovered another mcs bug:
mcs does not handle files that have a BOM with a specific encoding.
To summarize the situation:
- Currently driver.cs does not process source files with
the default encoding.
- UTF8Encoding.cs does not handle U+FEFF correctly.
- When we fix UTF8Encoding.cs to handle U+FEFF, mcs starts
to reject some source files that have a BOM
(CS8025: Parsing error).
- Even if we fix driver.cs to let StreamReader consider the BOM
(currently we disable it), some files still
break.
Am digging into this bug in depth. Hopefully I'll post a set of
fixes later.
... and now I have finished the fixes, as done in the attached patch:
- driver.cs :
a) uses Encoding.Default for the default input.
b) always passes true for BOM detection.
- support.cs : handles preamble_size precisely.
- UTF8Encoding.cs : it should not skip U+FEFF. This fixes
bug #73086 and #75065.
They should all be applied at the same time, except for a).
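
To make the intended behavior concrete, here is a small sketch (not the
patch itself) that contrasts raw UTF-8 decoding with a StreamReader that
has BOM detection enabled. BomDemo, DecodeRaw and ReadWithDetection are
hypothetical names; the calls they wrap are the standard
Encoding.UTF8.GetString and the StreamReader (Stream, Encoding, bool)
constructor.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
	// Raw decoding: the decoder is expected to keep U+FEFF as an
	// ordinary character rather than skip it.
	public static string DecodeRaw (byte [] data)
	{
		return Encoding.UTF8.GetString (data);
	}

	// Reader-based decoding: with detection set to true, StreamReader
	// consumes the preamble and switches away from the fallback encoding.
	public static string ReadWithDetection (byte [] data)
	{
		using (StreamReader reader = new StreamReader (
			new MemoryStream (data), Encoding.Default, true)) {
			return reader.ReadToEnd ();
		}
	}

	public static void Main ()
	{
		// UTF-8 BOM (EF BB BF) followed by the ASCII letter 'A'.
		byte [] data = new byte [] { 0xEF, 0xBB, 0xBF, 0x41 };

		string raw = DecodeRaw (data);
		Console.WriteLine ("raw: first char {0:X4}, length {1}",
			(int) raw [0], raw.Length);

		string text = ReadWithDetection (data);
		Console.WriteLine ("reader: length {0}", text.Length);
	}
}
```

On a runtime where UTF8Encoding behaves as described above, the raw
string should start with U+FEFF and have length 2, while the reader
should return just "A".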
Atsushi Eno
public class ì¯ ì¯¡ì¯¢
{
public string é¢é¡°é£³;
public static void Main ()
{
}
}
public class ì¯ ì¯¡ì¯¢
{
static string é¢é¡°é£³ = "é é ";
public static void Main ()
{
foreach (char c in é¢é¡°é£³)
System.Console.WriteLine ("{0:X04}", (int) c);
}
}
Index: Makefile
===================================================================
--- Makefile (revision 48630)
+++ Makefile (working copy)
@@ -2,7 +2,7 @@
include ../../build/rules.make
LIBRARY = System.Windows.Forms.dll
-LIB_MCS_FLAGS = /unsafe \
+LIB_MCS_FLAGS = /unsafe /codepage:65001 \
/r:$(corlib) /r:System.dll /r:System.Xml.dll \
/r:System.Drawing.dll /r:Accessibility.dll \
/r:System.Data.dll /r:Mono.Posix.dll \