Re: Filename literals

Jon Lang Mon, 17 Aug 2009 07:37:06 -0700

Timothy S. Nelson wrote:
> David Green wrote:
>> Jon Lang wrote:
>>> If so, could you give some examples of how such a distinction could be
>>> beneficial, or of how the lack of such a distinction is problematic?
>
>        Well, my main thought in this context is that the stuff that can be
> done to the inside of a file can also be done to other streams -- TCP
> sockets for example (I know, there are differences, but the two are a lot
> the same), whereas metadata makes less sense in the context of TCP sockets;
> I guess this was one of the thoughts that led me to want separate things
> here.


Ah.  I can see that.

>> Well, I definitely think there needs to be a class that combines the
>> inside and the outside, or the data and the metadata.  Certainly the
>> separate parts will exist separately for purposes of implementation, but
>> there needs to be a user-friendlier view wrapped around that.  Or maybe
>> there are (sort of) three levels, low, medium, and high; that is, the basic
>> implementation level (=P6 direct access to OS- and FS- system calls); the
>> combined level, where an IO or "File" object encompasses IO::FSnode and
>> IO::FSdata, etc.; and a gloss-over-the-details level with lots of sugar on
>> top (at the expense of losing control over some details).
>
>        Hmm.  With the quoting idea, I don't see the need for a "both" type
> of object.  I mean, I'd see the code happening something like this:
>
> if (path{/path/to/file}.e) {
>        @lines = slurp(path{/path/to/file});
> }
>
>        Or...
>
> if (path{/path/to/file}.e) {
>        $handle = open(path{/path/to/file});
> }
>
>
>
>        (I'm using one of David's suggested syntaxes above, but I'm not
> closely attached to it).

For the record, the above syntax was my suggestion.

>        I guess what I'm saying here is that I think we can do the things
> without people having to worry about the objects being separate unless they
> care.  So, separate objects, but hide it as much as possible.  Is that
> something you're fine with?

It looks good to me.

>>> In fact, having q, Q, or qq involved at all strikes me as wrong,
>>> since those three are specifically for generating strings.
>>
>> Pathnames still are strings, so that's fine.  In fact, there are different
>
>        Hmm.  I'm not so sure; maybe I'm just being picky, but I want to
> clarify things in case it's important (in other words, I'm thinking out loud
> here to see if it helps).
>
>        First, Q and friends don't generate strings, they generate
> string-like objects, which could be Str, or Match, or whatever.  Think of
> quoting constructs as a way of temporarily switching to a different
> sublanguage (cf. regex), and you'll have the idea that I have in mind.
>
>        As for pathnames being strings, you may be right FSVO string.  But
> I'd say that, while they may be strings, they're not Str, but they do Str,
> as in
>
> role    IO::FSNode does Str {...}
>
>        (FSNode may not be the right name here, but is used for illustrative
> purposes).

I'd go one step further.  Consider the Windows path 'C:\Program
Files\'.  Is the string what's really important, or is it the
directory to which the string refers?  I ask because, for legacy
reasons, the following points to the same directory: 'C:\PROGRA~1\'.
Then there's the matter of absolute and relative paths: if the current
working directory is 'C:\Program Files\', then the path 'thisfile'
actually refers to 'C:\Program Files\thisfile'.  And because of parent
directory and self-reference links, something like '/bin/../etc/.' is
just an overcomplicated way of pointing to '/etc'.  I'd like Perl 6's
treatment of filenames to be smart enough that smart-matching any of
these pairs of "alternative spellings" would result in a successful
match.  So while I'll agree that filenames are string-like, I really
don't want them to _be_ strings.
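
To make that concrete, here is a minimal sketch of path values that compare
by meaning rather than by spelling.  It is written in Python only because
the Perl 6 spelling is still undecided; the class name PosixPathName and its
cwd parameter are made up for illustration and are not a proposal for the
actual API.

```python
import ntpath
import posixpath


class PosixPathName:
    """Hypothetical path value that compares by normalized spelling."""

    def __init__(self, text, cwd="/"):
        self.text = text
        # Resolve a relative path against an explicit working directory,
        # then collapse "." and ".." components lexically.
        self.canonical = posixpath.normpath(posixpath.join(cwd, text))

    def __eq__(self, other):
        return (isinstance(other, PosixPathName)
                and self.canonical == other.canonical)

    def __hash__(self):
        return hash(self.canonical)


# Two spellings of the same directory compare equal.
print(PosixPathName("/bin/../etc/.") == PosixPathName("/etc"))  # True
# A relative path is equal to its absolute form under the given cwd.
print(PosixPathName("thisfile", cwd="/home/me")
      == PosixPathName("/home/me/thisfile"))                    # True

# Windows spelling: joining a relative name onto the working directory.
print(ntpath.join("C:\\Program Files\\", "thisfile"))
```

This only normalizes the text.  Short names such as 'C:\PROGRA~1\' and
symbolic links cannot be resolved without asking the filesystem, so a real
implementation would need more than string manipulation.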

>> things going on here; one is to have a way of conveniently quoting strings
>> that contain a lot of backslashes.  Just as Perl lets you pick different
>> quotation marks, to make it easier to quote strings that have a lot of " or
>> ' characters, so it should have a way to make it easy to quote strings with
>> a lot of backslashes.  (The most obvious example being Windows paths; but
>> there are other possibilities, such as needing to eval some code that
>> already has a lot of backslashes in it.)
>>
>> Now, you can already turn backwhacking on or off via Q's :backslash
>> adverb; Q:qq includes :b (and Q:q recognises a few limited escape sequences
>> like \\). So you could say Q[C:\some\path], and you could add scalar
>> interpolation to say Q:s[C:\some\path\$filename].  But there's no way to
>> have all of: literal backslashes + interpolation + escaped sigils.
>>
>> Perhaps instead of a simple :b toggle, we could have an :escape<Str>
>> adverb that defaults to :escape<\>? Then you could have
>> Q:scalar:escape("^")[C:\path\with\literal^$\$filename].
>
>        Maybe a global variable?  It's an interesting idea, and I'll see how
> others feel :).

I'm leery of global variables, per se; but I _do_ like the idea of
lexically-scoped options that let you customize the filename syntax.
Changing the default delimiter would be the most common example of
this.

>>> The ultimate in "path literals" would be to establish a similar
>>> "default delimiter".  [...]
>>>  `path`.size # how big is the file?  Returns number.
>>
>> There's something that slightly jars me here... I don't like the quotation
>> returning an IO object.  (I like the conciseness, but there's something a
>> bit off conceptually.)
>
>        Hmm.  But doesn't normal quoting return a Str object?  And regex
> quoting return an object (Regex?  Match?  Something, anyway).

Exactly.

>> Now, isn't Q:path[/some/file] just creating an IO object?  Unlike /foo/,
>> where "foo" just IS the pattern, "/some/file" is *not* an IO object, it's
>> just a filename.  So if the special path-quoting returned an IO::File::Name
>> object, I would be perfectly happy.  But you can't have $filename.size -- a
>> fileNAME doesn't have a size, the file itself does.  To get from the
>> filename to the file, you need another step, and it's that extra step I
>> don't like
>
>        We seem to agree that we're talking about at least three things here
> (however much some of them are disguised).
> -       Paths (ie. file names, URLs, whatever)
> -       File metadata (IO::FSNode, etc)
> -       File data (what S32/IO calls IO::File -- the stuff inside the file)
>
>        The difference in our approaches is that you seem keen to integrate
> closely the data and the metadata, whereas I'm trying to integrate the paths
> and the metadata.

Right.  My own sense is that it's better to integrate the paths and
the metadata.  I rather miss the relatively straightforward way that
Perl 5 let you say things like '-e "filename"'; and while I understand
the argument against attaching file testers directly to strings, I
would still like to be able to do something approximating that.  Right
now, smart-matching against a Pair is the officially endorsed approach
for handling this sort of thing; but to me, that also feels like a
kludge.  Letting you access metadata by means of methods called on a
path could come close to the original file tests in terms of utility,
without requiring the kludginess of Pair matching or Str methods.
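
As a rough illustration of "metadata by means of methods called on a path",
here is how it looks with Python's pathlib, which happens to follow that
design.  The helper name file_tests is hypothetical; exists() and
stat().st_size stand in for the Perl 5 '-e' and '-s' tests.

```python
import tempfile
from pathlib import Path


def file_tests(path):
    """Return (exists, size); size is None when the file is absent."""
    p = Path(path)
    if not p.exists():          # analogue of Perl 5's -e
        return (False, None)
    return (True, p.stat().st_size)   # analogue of Perl 5's -s


with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "present.txt"
    present.write_bytes(b"hello\n")
    missing = Path(tmp) / "missing.txt"

    print(file_tests(present))  # (True, 6)
    print(file_tests(missing))  # (False, None)
```

Note that the caller never opens the file: everything asked here comes from
the path and the metadata behind it.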

>> IO objects always to parse a literal string the right way.  (I.e. "/" is
>> made to represent the official cross-platform path separator.  Unless you
>> parse some literal text with Q:ntfs, which parses a literal "\" as "/"??  Or
>> say "use path-separator '\';" or something!)
>
>        Another problem, just to make things exciting:
>
> $ echo $PATH
> /home/wayland/local/bin:/usr/global/bin:/usr/local/bin:/bin:/usr/bin:/usr/sbin:/usr/local/sbin:/sbin
>
>        Now, which of these is the path?  The whole thing, or the individual
> directory paths?  And what's the path separator?  Slash (/) or colon (:)?
>  So I think we need to distinguish between a search path, and a
> file/directory path.  So we'd be doing "use fs-path-separator Q '\';" (note
> the Q -- your example would, I think, have needed a \\ in it).

IMHO, the Perl equivalent of your 'echo $PATH' example ought to return
a list of '/'-delimited paths - that is, what you're referring to as a
search path would merely be a list of file/directory paths.
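
A small sketch of that idea, again in Python as a neutral stand-in: the
function name split_search_path is hypothetical, and the sample value is a
shortened form of the $PATH quoted above.

```python
from pathlib import PurePosixPath


def split_search_path(value, separator=":"):
    """Turn a $PATH-style string into a list of directory paths."""
    # Simplification: empty entries are dropped here, although POSIX
    # shells treat an empty entry as the current directory.
    return [PurePosixPath(entry) for entry in value.split(separator) if entry]


search = split_search_path(
    "/home/wayland/local/bin:/usr/global/bin:/usr/local/bin:/bin"
)
for directory in search:
    print(directory)
```

The colon is then just the serialization format of the list, and '/' remains
the only separator that a single path needs to know about.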

>>> In fact, path literals may be more pattern-like than string-like, when
>>> and if you start taking wildcard characters into account:
>>>
>>> if $file ~~ path[./*.txt] {
>>>    say "$file is a text file in the current working directory."
>>> }
>>
>> Is that just a regex, in fact?
>
>        No, we're talking about a globbing sublanguage (see the glob()
> function).  In p5, this could be done with the <> quotes, but S02 says that
> in P6 you need to use the "glob" function if you want this kind of
> functionality.  The "glob" language is different than regex.  I was wanting
> to replace the "glob" language with something more like XPath, but that idea
> was vetoed by people who didn't want Tree-related objects to be part of the
> core, so I'm doing that as a library.

Right.  And if globbing is built into path literals, then a path
literal doesn't represent a filename, per se; it represents a filename
pattern which might match 0, 1, or multiple filenames.  That said, the
globbing sublanguage is such that if you _don't_ use wildcard
characters, you'll match one file at most.

_This_ might be a reason to go with three separate things: path,
metadata, and data.  Data would refer to a file's contents, and would
be represented by a stream to which you can gain access by opening the
file; metadata would represent the file itself and information that
you can glean without opening the file - essentially, p5's stat.
Path would be a filesystem pattern, used to test a particular file (as
identified by its metadata) for location.
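
Here is a minimal sketch of a path acting as a pattern, in Python for lack
of a settled Perl 6 syntax.  The function name path_matches is made up, and
matching one directory level at a time is my assumption about how the
globbing sublanguage should treat '/'.

```python
import posixpath
from fnmatch import fnmatchcase


def path_matches(pattern, path):
    """Test a path against a glob pattern, one component at a time."""
    # Normalize both sides so that './x' and 'x' are treated alike.
    pattern_parts = posixpath.normpath(pattern).split("/")
    path_parts = posixpath.normpath(path).split("/")
    # A wildcard never crosses a directory boundary, so the depth must agree.
    if len(pattern_parts) != len(path_parts):
        return False
    return all(fnmatchcase(part, pat)
               for pat, part in zip(pattern_parts, path_parts))


candidates = ["./notes.txt", "todo.txt", "sub/notes.txt", "notes.md"]

# With a wildcard, the pattern may match several files.
print([c for c in candidates if path_matches("./*.txt", c)])
# Without wildcards, it matches at most one distinct file.
print([c for c in candidates if path_matches("./notes.txt", c)])
# It may also match nothing at all.
print([c for c in candidates if path_matches("./*.pdf", c)])
```

The candidates here are plain strings for brevity; in the three-part model
they would be the file objects identified by their metadata.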

>> Though, to speak of file-types, this particular example would I think
>> better be handled by:
>>
>>  if $file.type ~~ MIME("text/plain") {...}

This doesn't include the other condition for which I was testing:
namely, that $file can be found in the current working directory.

>        Cool idea.  How would the type be determined?  Are you thinking of
> the algorithms in the unix "file" utility?  Please tell me you're not
> planning to use filename extensions -- that's bad :).

Wouldn't $file.type be metadata?

-- 
Jonathan "Dataweaver" Lang
