I agree with the idea of making Perl 6's filesystem/etc interface more abstract,
as previously discussed, and also that users should be able to choose between
different levels of abstraction where that makes sense, picking either a more
portable interface or a more platform-specific one.
Following up on Tim Bunce's comment about looking at prior art, I also recommend
looking at the SQLite DBMS, specifically its virtual file system (VFS) layer. It
is designed to give you deterministic behaviour and introspection over a wide
range of storage systems and attributes (PCs versus embedded devices, hard disks
versus flash, write-once versus write-many, etc), and it spells out a lot of
things that would otherwise be left as assumptions. One relevant URL is
http://sqlite.org/c3ref/vfs.html; for the moment I forget where the other good
URLs are.
Mark Overmeer wrote:
    $dir.case_sensitive(0);
    $*OS.filesystem('/home', type => 'xfs', name_encoding => 'latin1'
        , text_content_encoding => 'utf-8,bom', illegal_chars => "/\x0"
        , case_sensitive => 1, max_path => 1024);
I understand that the above, concerning case-sensitivity, is just meant to be an
example, but I want to explore that in more detail for a moment, as it reflects
a common perception that only scratches the surface and needs to be fleshed out
more.
To summarize, what we really want is something more generic than
case-sensitivity, namely text normalization and text folding in general, as
well as a clear separation between distinctness for representation (how a name
is stored and shown) and distinctness for mutual exclusivity (which names are
allowed to coexist).
For example, one file system will preserve your chosen case for a filename but
won't allow 2 files in the same directory whose filenames are identical when
uppercased; another file system, in contrast, would also store the filename
uppercased. For another example, one file system will not distinguish between
accents on letters while another would, and this is orthogonal to
case-sensitivity. Or, for another, one might treat a run of whitespace as
equivalent to a single whitespace character, or ignore whitespace characters
entirely.
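
To make the representation-versus-exclusivity split concrete, here is a rough
sketch in Python (chosen only because it runs as written; it is not a proposal
for Perl 6 syntax). The names fold_name and Directory, and the three option
names, are hypothetical, and the folding rules are simplified assumptions, not
the rules of any real file system. The directory stores each name exactly as
given but rejects a new name whose folded key matches an existing one.

```python
import unicodedata


def fold_name(name, ignores_case=False, ignores_accents=False,
              collapses_whitespace=False):
    """Return the key used to decide whether two names collide.

    Hypothetical helper; with every option left False, only Unicode
    NFC normalization is applied.
    """
    key = unicodedata.normalize("NFC", name)
    if ignores_accents:
        # Decompose, then drop combining marks (the accents).
        decomposed = unicodedata.normalize("NFD", key)
        key = "".join(c for c in decomposed if not unicodedata.combining(c))
    if ignores_case:
        key = key.casefold()
    if collapses_whitespace:
        # Treat any run of whitespace as one space; this also trims the ends.
        key = " ".join(key.split())
    return key


class Directory:
    """Stores each name as given; enforces uniqueness on the folded key."""

    def __init__(self, **folding):
        self.folding = folding
        self.entries = {}  # folded key -> name as stored

    def create(self, name):
        key = fold_name(name, **self.folding)
        if key in self.entries:
            raise FileExistsError(
                f"{name!r} collides with {self.entries[key]!r}")
        self.entries[key] = name

    def names(self):
        return list(self.entries.values())
```

With Directory(ignores_case=True), creating "ReadMe.txt" keeps that spelling in
names(), while a later create("README.TXT") raises FileExistsError. A file
system that also stores names uppercased would differ only in what it writes
into entries, not in how the key is computed.
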
Also, the paradigm that is the most distinguishing (case-sensitive,
accent-sensitive, whitespace-sensitive, etc) should be the default, and any
boolean option to change an aspect of this should be named so that a false value
is more distinguishing and a true value is less distinguishing. For example, a
flag should be named "ignores_case" rather than "case_sensitive"; this also
assumes that if named arguments are optional, then the common default value of a
boolean-typed argument is false. Naming something "case_sensitive" implies that
sensitivity is special whereas sensitivity should be considered normal, and
rather insensitivity should be considered special.
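
As an illustration of that naming rule, here is a small Python sketch (the
class and flag names are hypothetical, made up for this example). Every flag
is phrased as "ignores_..." and defaults to false, so an options object built
with no arguments is the most distinguishing one, and each flag a caller sets
makes the comparison strictly less distinguishing.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class NameComparison:
    # Hypothetical option names. Each flag defaults to False, which is
    # the most distinguishing setting; True relaxes one aspect.
    ignores_case: bool = False
    ignores_accents: bool = False
    ignores_whitespace: bool = False

    def is_fully_distinguishing(self):
        """True when no aspect of the comparison has been relaxed."""
        return not any(getattr(self, f.name) for f in fields(self))
```

Here NameComparison() is fully distinguishing without the caller saying
anything, while NameComparison(ignores_case=True) reads as an explicit request
for the special, less distinguishing behaviour.
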
-- Darren Duncan