> I agree, there are different ways to look at it. But the statement
>
> > > > A Unicode text file beginning with FEFF is
> > > > big-endian, and a file beginning with FFFE (not a legal Unicode
> > > > character for any other purpose) is little-endian
>
> is just plain wrong, since UTF-32, for example, could start with bytes
> FE FF.

Um, not legally in open interchange.

Either you have big-endian UTF-32 <FE FF nn mm ..>, which would correspond to U-FEFFnnmm ..., and that is out of range for both Unicode and 10646. Or you have little-endian UTF-32 <FE FF nn 00 ..>, which would correspond to U-00nnFFFE ..., where nn could be 00..10, but all such values are noncharacters and cannot be used in open interchange.

So if serialized "Unicode text" starts off <FE FF ...> and purports to be legal, it cannot be UTF-32, it cannot be UTF-8, and it cannot be little-endian.

--Ken
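
For anyone who wants to check the case analysis mechanically, here is a rough Python sketch. The helper names (is_noncharacter, classify_utf32_start) are made up for this illustration and are not from any library; the only facts assumed are the code space limit of 10FFFF and the usual noncharacter set (FDD0..FDEF plus the last two code points of each plane).

```python
MAX_CODE_POINT = 0x10FFFF  # upper limit of the Unicode / 10646 code space


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF in any plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def classify_utf32_start(first4, byteorder):
    """Classify the first UTF-32 code unit of a stream.

    first4 is the first four bytes; byteorder is "big" or "little".
    Returns "out of range", "noncharacter", or "ok".
    """
    cp = int.from_bytes(first4, byteorder)
    if cp > MAX_CODE_POINT:
        return "out of range"
    if is_noncharacter(cp):
        return "noncharacter"
    return "ok"


def utf8_accepts(data):
    """True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Collect every verdict for streams starting <FE FF nn mm>, in both byte orders.
verdicts = {
    classify_utf32_start(bytes([0xFE, 0xFF, nn, mm]), order)
    for nn in range(256)
    for mm in range(256)
    for order in ("big", "little")
}
print(verdicts)  # never contains "ok"
```

Under those assumptions, no four-byte sequence starting FE FF comes out as "ok" in either byte order, and utf8_accepts(b"\xfe\xff") is False because the bytes FE and FF never occur in well-formed UTF-8.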