Doug Ewell d...@ewellic.org wrote:
|Steven Atreju wrote:
|
| If Unicode *defines* that the so-called BOM is in fact a Unicode-
| indicating tag that MUST be present,
|
|But Unicode does not define that.
Nope. On http://unicode.org/faq/utf_bom.html i read:
Q: Why do some of the UTFs
Steven Atreju, Mon, 16 Jul 2012 13:35:04 +0200:
Doug Ewell d...@ewellic.org wrote:
And:
Q: Is the UTF-8 encoding scheme the same irrespective of whether
the underlying processor is little endian or big endian?
...
Where a BOM is used with UTF-8, it is only used as an encoding
Leif Halvard Silli xn dash dash mlform dash iua at xn dash dash mlform
dash iua dot no wrote:
So, in a way, the ZWNBSP - or any other non-ASCII character (it would
in fact be better to use U+200B, to reserve the U+FEFF for its
designated BOM purpose) could serve as a UTF-8 sniff character not
Doug Ewell, Sat, 14 Jul 2012 15:14:10 -0600:
Philippe Verdy wrote:
It would break if the only place where to place a BOM is just the
start of a file. But as I propose, we allow BOMs to occur anywhere to
specify which encoding to use to decode what follows each one, even
shell scripts would
Recently, the Canadian symbols 🅪 (marque de commerce) and 🅫 (marque
déposée) have been added to Unicode at U+1F16A and U+1F16B.
Would it be possible to add the copyleft symbol in the neighbourhood ?
It looks like a reversed ©. Today, to type it, I use a reversed c with a
combining enclosing
Le 14/07/12 23:14, Doug Ewell a écrit :
A related question, though, is why some people think the sky will fall
if a text file contains loose zero-width no-break spaces. U+FEFF is
the very model of a default ignorable code point.
I don’t think the sky will fall but I say there still are a few
Ↄ⃝ may be a better approximation.
Leo
On Mon, Jul 16, 2012 at 10:47 AM, Jean-François Colson j...@colson.eu wrote:
Recently, the Canadian symbols 🅪 (marque de commerce) and 🅫 (marque
déposée) have been added to Unicode at U+1F16A and U+1F16B.
Would it be possible to add the copyleft symbol
There was a discussion on this list around May 2000 regarding the
so-called copyleft symbol. There were concerns that it was not really a
symbol with legal standing, like © and ® and ™, but more of a logo,
notably one worn on T-shirts by followers of a sort of social movement.
Eventually it was
Steven Atreju wrote:
Q: Is the UTF-8 encoding scheme the same irrespective of whether
the underlying processor is little endian or big endian?
...
Where a BOM is used with UTF-8, it is only used as an encoding
signature to distinguish UTF-8 from other encodings — it has
nothing
2012/7/16 Leif Halvard Silli xn--mlform-...@xn--mlform-iua.no:
html element, then Chrome will sniff it as UTF-8 encoded. Whereas IE,
Webkit, Opera, Firefox will default to ISO-8858-1/Windows-1252.
Actually ISO 885**9**-1. But we've also been told that, given the C1
controls are simply invalid
2012/7/15 David Starner prosfil...@gmail.com:
/tmp $ echo -n a > file1
/tmp $ echo b > file2
/tmp $ cat file1 file2 > file3
/tmp $ echo ab | diff -q - file3
Once again the problem is the /bin/cat tool which is used for
everything and is agnostic about preserving text semantics. Using another
cat
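
A minimal sketch, not taken from the thread, of the point under discussion: byte-level concatenation (which is what a plain cat does) leaves a stray U+FEFF in the middle of the result when both inputs begin with a UTF-8 BOM. The helper names byte_concat and bom_aware_concat are hypothetical, and the BOM-aware variant is only one guess at what a text-aware concatenation tool might do, not a description of any existing program.

```python
import codecs

BOM = codecs.BOM_UTF8  # b'\xef\xbb\xbf', the UTF-8 encoding of U+FEFF


def byte_concat(*chunks: bytes) -> bytes:
    """Byte-level concatenation, as a plain cat would do."""
    return b"".join(chunks)


def bom_aware_concat(*chunks: bytes) -> bytes:
    """Hypothetical BOM-aware variant: drop the BOM of every chunk but the first."""
    out = []
    for i, chunk in enumerate(chunks):
        if i > 0 and chunk.startswith(BOM):
            chunk = chunk[len(BOM):]
        out.append(chunk)
    return b"".join(out)


# Assumption: both files were saved with a leading UTF-8 BOM.
file1 = BOM + b"a"
file2 = BOM + b"b\n"

# "utf-8-sig" strips only a BOM at the very start of the data.
naive = byte_concat(file1, file2).decode("utf-8-sig")
aware = bom_aware_concat(file1, file2).decode("utf-8-sig")

print(repr(naive))  # the second file's BOM survives as an interior U+FEFF
print(repr(aware))  # no interior U+FEFF
```

With these inputs the naive result differs from "ab" followed by a newline only by the interior U+FEFF, so a byte-wise diff against a BOM-less file would report a difference.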