On Mon, Dec 8, 2014 at 9:09 PM, Shal Farley <s...@cheshireeng.com> wrote:

> I would disagree with part of this statement. I agree that ASCII defines
> only the 7-bit code values, but I think this whole thread has run off the
> rails in talking about the content values as determining whether the file
> is "text" or "binary".
> ...

> separator is seldom carried as metadata, because it is usually uniform in a
> given system. But in these days of interoperable systems and multi-platform
> support, this detail also may be a necessary piece of metadata to know
> about a file.


I think the implication is that Fossil "could be enhanced to support
recording this" at the file level?

> And additionally, the character set used to represent text in a file must
> also be carried as metadata (because of the ISO-8859 and other code-page
> based character sets).
>

Back to heuristics: yes, in the general sense we can't define what a
"binary" file is, but Fossil must have a heuristic, and it takes the approach
that any byte value above 127 "is not ASCII, and therefore could contain data
of any of millions of different encodings," i.e. it must categorize such
content as "opaque binary" (with a special allowance for everyone's favourite
encoding, UTF-8).

FWIW, here's the exact heuristic fossil uses:

http://www.fossil-scm.org/index.html/info/390a2bc854f3a8ea8259139af7f88c497960da97?ln=29-34
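
To make the idea concrete, here is a rough Python sketch of that kind of
check. It is not Fossil's code (Fossil is written in C, and the link above
shows the real thing); the function name looks_like_text and the rule about
NUL bytes are my own assumptions for illustration:

```python
def looks_like_text(data: bytes) -> bool:
    """Rough guess: True if data is plain ASCII or valid UTF-8.

    Hypothetical helper for illustration only, not Fossil's heuristic.
    """
    if b"\x00" in data:
        # Assumption: a NUL byte is treated as a sign of binary content.
        return False
    try:
        # Pure 7-bit ASCII is also valid UTF-8, so one decode covers both.
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes above 127 that do not form valid UTF-8 could be any
        # code page, so the content is treated as opaque.
        return False
    return True
```

With this sketch an ISO-8859-1 file containing an accented character is
classified as opaque, which is exactly the case where explicit metadata
would be needed.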



> Only if all these items of metadata are known can the file content, or
> differences in the file content, be displayed in a useful form. So
> returning to this thread, it is convenient to have a heuristic that works
> most of the time to discriminate "text" from "binary" files, but it is
> necessary to also have a way for the user to explicitly provide that
> metadata (and ideally the character set metadata).


That's essentially what some of the various "xxx-glob" config options are
for, but they're admittedly a bit of a limited solution to the problem.
We've floated the idea of adding mimetype information to files (similar to
what svn allows with its properties, e.g. svn:mime-type), but nobody's ever
seriously looked into what it would take to implement it, AFAIK.
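
As a sketch of how glob-style overrides can sit in front of a content
heuristic, something like the following Python would do. The function
classify, its parameters binary_globs and text_globs, and the fallback rule
are all hypothetical; this does not describe how Fossil evaluates its
settings:

```python
from fnmatch import fnmatchcase


def classify(path, data, binary_globs=(), text_globs=()):
    """Return "text" or "binary" for a file.

    User-supplied glob patterns win; the content is only inspected when
    no pattern matches. All names here are illustrative assumptions.
    """
    if any(fnmatchcase(path, pattern) for pattern in binary_globs):
        return "binary"
    if any(fnmatchcase(path, pattern) for pattern in text_globs):
        return "text"
    # Fallback heuristic (assumption): no NUL bytes and valid UTF-8.
    if b"\x00" in data:
        return "binary"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "binary"
    return "text"
```

For example, classify("notes.txt", b"caf\xe9", text_globs=["*.txt"]) returns
"text" even though the Latin-1 byte would fail the fallback check. The
pattern only says "this is text", not which character set it uses, which is
the limitation mentioned above.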


-- 
----- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
