I agree with the idea of making Perl 6's filesystem/etc interface more abstract, as previously discussed, and also that users should be able to choose between different levels of abstraction where that makes sense, picking either a more portable interface or a more platform-specific one.

Following up on Tim Bunce's comment about looking at prior art, I also recommend looking at the SQLite DBMS, specifically its virtual file system layer. It is designed to give you deterministic behaviour and introspection over a wide range of storage systems and attributes (PCs versus embedded devices, hard disks versus flash, write-once versus write-many, etc), and it spells out a lot of assumptions that are otherwise left implicit. One relevant url is http://sqlite.org/c3ref/vfs.html and for the moment I forget where other good urls are.

Mark Overmeer wrote:
   $dir.case_sensitive(0);

   $*OS.filesystem('/home', type => 'xfs', name_encoding => 'latin1'
    , text_content_encoding => 'utf-8,bom', illegal_chars => "/\x0"
    , case_sensitive => 1, max_path => 1024);

I understand that the above, concerning case-sensitivity, is just meant to be an example, but I want to explore that in more detail for a moment, as it reflects a common perception that only scratches the surface and needs to be fleshed out more.

To summarize, what we really want is something more generic than case-sensitivity, namely text normalization and text folding in general. We also want to treat two questions separately: which names are distinct for representation (how a name is stored and shown) and which names are distinct for mutual exclusivity (whether two names can coexist in the same directory).

For example, one file system will represent your chosen case for a filename but won't allow 2 files in the same directory whose filenames are non-distinct when uppercased; another file system, in contrast, would also represent the filename uppercased. For another example, one file system will not distinguish between accents on letters while another would, and this is orthogonal to case-sensitivity. Or for another, one file system might treat a run of whitespace as equivalent to a single whitespace character, or ignore whitespace characters entirely.
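
To make the representation versus mutual-exclusivity split concrete, here is a rough sketch in Python (not Perl 6, and not a proposal for the actual interface). Every name in it (Directory, represent, compare, strip_accents, collapse_whitespace) is made up for illustration; the only point is that a directory carries two independent folding functions, one applied to the name it stores and one applied to the key it uses to detect collisions.

```python
import unicodedata


def strip_accents(name):
    """Fold away accents: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def collapse_whitespace(name):
    """Treat any run of whitespace as one space (also trims both ends)."""
    return " ".join(name.split())


class Directory:
    """Hypothetical model of one directory with two separate policies.

    represent: folding applied to the name that is stored and shown.
    compare:   folding applied to the key used for mutual exclusivity.
    Both default to the identity function, the most distinguishing choice.
    """

    def __init__(self, represent=lambda n: n, compare=lambda n: n):
        self.represent = represent
        self.compare = compare
        self.entries = {}  # comparison key -> stored name

    def create(self, name):
        key = self.compare(name)
        if key in self.entries:
            raise FileExistsError(
                f"{name!r} collides with {self.entries[key]!r}")
        self.entries[key] = self.represent(name)
        return self.entries[key]


# Preserves your chosen case, but names must be distinct when uppercased.
preserving = Directory(compare=str.upper)
preserving.create("ReadMe.txt")      # stored as given
# preserving.create("README.TXT")    # would raise FileExistsError

# Also represents the name uppercased.
uppercasing = Directory(represent=str.upper, compare=str.upper)
uppercasing.create("ReadMe.txt")     # stored uppercased
```

Accent or whitespace folding would plug in the same way, for example Directory(compare=strip_accents), which is what I mean by these aspects being orthogonal to case.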

Also, the paradigm that is the most distinguishing (case-sensitive, accent-sensitive, whitespace-sensitive, etc) should be the default, and any boolean option to change an aspect of this should be named so that a false value is more distinguishing and a true value is less distinguishing. For example, a flag should be named "ignores_case" rather than "case_sensitive"; this also assumes that if named arguments are optional, then the common default value of a boolean-typed argument is false. Naming something "case_sensitive" implies that sensitivity is special, whereas sensitivity should be considered normal and insensitivity should be considered special.
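
As a rough illustration of that naming rule, again in Python rather than Perl 6, the sketch below uses a hypothetical NamePolicy whose flags all default to false. Only "ignores_case" comes from the text above; "ignores_accents", "ignores_whitespace" and the key method are names I am assuming for the example, and the whitespace flag shows just the "ignored entirely" variant.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class NamePolicy:
    """Hypothetical comparison policy; every flag defaults to False,
    so passing no arguments gives the most distinguishing behaviour."""
    ignores_case: bool = False
    ignores_accents: bool = False
    ignores_whitespace: bool = False

    def key(self, name: str) -> str:
        """Return the key under which two names count as the same."""
        if self.ignores_accents:
            decomposed = unicodedata.normalize("NFD", name)
            name = "".join(
                c for c in decomposed if not unicodedata.combining(c))
        if self.ignores_case:
            name = name.casefold()
        if self.ignores_whitespace:
            name = "".join(name.split())  # drop whitespace entirely
        return name


strict = NamePolicy()                    # sensitive to everything
loose = NamePolicy(ignores_case=True)    # opts in to one folding

print(strict.key("ReadMe") == strict.key("README"))  # False
print(loose.key("ReadMe") == loose.key("README"))    # True
```

With this naming, each argument you pass makes the comparison less distinguishing, and leaving everything out never silently merges two names.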

-- Darren Duncan
