"Doug Ewell" <d...@ewellic.org> wrote: |Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote: |> Not necessarily true. |> |> [602 words] | |This has nothing to do with the scenario I described, which involved |removing a "BOM" from the start of an arbitrary fragment of data, |thereby corrupting the data because the "BOM" was actually a ZWNBSP. | |If you have an arbitrary fragment of data, don't fiddle with it. | |If you know enough about the data to fiddle with it safely, it's not |arbitrary.
Yeah! E.g., on the all-UTF-8 Plan9 research operating system:

?0[9front.update_bomb_git]$ git ls-files --with-tree=master --|wc -l
44983
?0[9front.update_bomb_git]$ git grep -lI "`print '\ufeff'`" master|wc -l
12
?0[9front.update_bomb_git]$ git grep -lI "`print '\ufeff'`" master
master:9front.hg/lib/font/bit/MAP
master:9front.hg/lib/glass
master:9front.hg/sys/lib/troff/font/devutf/0100to25ff
master:9front.hg/sys/lib/troff/font/devutf/C
master:9front.hg/sys/lib/troff/font/devutf/CW
master:9front.hg/sys/lib/troff/font/devutf/H
master:9front.hg/sys/lib/troff/font/devutf/LucidaSans
master:9front.hg/sys/lib/troff/font/devutf/PA
master:9front.hg/sys/lib/troff/font/devutf/R
master:9front.hg/sys/lib/troff/font/devutf/R.nomath
master:9front.hg/sys/src/ape/lib/utf/runetype.c
master:9front.hg/sys/src/libc/port/runetype.c

--steffen

_______________________________________________
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode
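
The scenario from the quoted text can be sketched in a few lines of Python. This is only an illustration, not anything from the thread: the helper name strip_leading_bom and the sample strings are made up, and the split point is chosen so that the second fragment happens to begin with a U+FEFF that is content (a ZWNBSP), not a signature.

```python
BOM = "\ufeff"  # U+FEFF: a BOM at the start of a stream, a ZWNBSP anywhere else


def strip_leading_bom(text: str) -> str:
    """Naively drop one leading U+FEFF (hypothetical helper, for illustration)."""
    return text[1:] if text.startswith(BOM) else text


# A complete stream: signature, "ab", a ZWNBSP that is real content, "cd".
stream = BOM + "ab" + BOM + "cd"

# Cut the stream into two fragments at an arbitrary point. The second
# fragment now starts with the ZWNBSP that belongs to the data.
first, second = stream[:3], stream[3:]

# Stripping once, at the true start of the stream, keeps the ZWNBSP.
whole = strip_leading_bom(stream)

# Stripping each fragment on its own silently deletes the ZWNBSP.
rejoined = strip_leading_bom(first) + strip_leading_bom(second)

print(repr(whole))
print(repr(rejoined))
```

The two results differ by exactly the one character that was data. Code that only sees `second` has no way to tell whether its leading U+FEFF is a signature or content, which is the point being made above.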