Re: [DNG] grep handles ISO-8859 encoded text file as binary file.

Irrwahn Thu, 28 Apr 2016 10:52:50 -0700

On Thu, 28 Apr 2016 13:16:53 -0400, Hendrik Boom wrote:
> On Thu, Apr 28, 2016 at 06:53:35AM +0000, Noel Torres wrote:
>> Hughe Chung <janpeng...@riseup.net> wrote:
[...]
>>> $ grep tesselate dome_math.c
>>> Binary file dome_math.c matches
[...]
>> If I were to bet, I would say that the file dome_math.c is not
>> correctly formatted, or has an incorrect BOM at start, or so.
> 
> I've occasionally had a program that accepted UTF-8 reject a file 
> because it *had* a valid BOM at the start.
[...]


That would be because the notion of a BOM does not make much
sense for UTF-8. UTF-8 is defined as a sequence of bytes, so
there is no byte order issue, yet some brilliant mind thought
it would be a good idea to define and allow one (EF BB BF)
anyway. And, pray tell, other brilliant minds decided to use
it as a way to tell UTF-8 from traditional single-byte
encodings. This is absurd, as it is just as bad as any other
heuristic one may come up with to deduce a text file's
character encoding.
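
To illustrate the byte order point, here is a minimal Python
sketch (my own illustration, not anything grep does; the
helper name hexbytes is made up). It encodes U+FEFF, the code
point used as a BOM, in UTF-8 and in both UTF-16 byte orders:

```python
import codecs

BOM_CHAR = "\ufeff"  # the code point used as a byte order mark


def hexbytes(data: bytes) -> str:
    """Render bytes as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in data)


# UTF-8 has exactly one serialization of U+FEFF.
utf8_bom = BOM_CHAR.encode("utf-8")

# UTF-16 has two, depending on byte order; here the mark
# actually carries information.
utf16_le_bom = BOM_CHAR.encode("utf-16-le")
utf16_be_bom = BOM_CHAR.encode("utf-16-be")

print("UTF-8:    ", hexbytes(utf8_bom))      # EF BB BF
print("UTF-16-LE:", hexbytes(utf16_le_bom))  # FF FE
print("UTF-16-BE:", hexbytes(utf16_be_bom))  # FE FF
```

Only the UTF-16 results differ with byte order; the UTF-8
bytes are always the same three.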

To add insult to injury, some poorly written text editing
tools insert a BOM without any need, or without even being
asked to, deliberately breaking otherwise perfectly fine 7-bit
ASCII files and rendering them incompatible with legacy
software.
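
If you are stuck with such a file, removing the three bytes
is easy enough. A rough Python sketch, assuming the whole
file fits in memory; the function name strip_utf8_bom is
hypothetical, not part of any tool mentioned in this thread:

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# A pure ASCII file stops being pure ASCII once a BOM is prepended.
plain = b"int main(void) { return 0; }\n"
with_bom = codecs.BOM_UTF8 + plain

print(plain.isascii())                     # True
print(with_bom.isascii())                  # False
print(strip_utf8_bom(with_bom) == plain)   # True
```

Files without a BOM pass through unchanged, so running this
twice does no harm.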

My 2¢.

Regards
Urban


_______________________________________________
Dng mailing list
Dng@lists.dyne.org
https://mailinglists.dyne.org/cgi-bin/mailman/listinfo/dng
