On Mon, 17 Aug 2009, Jon Lang wrote:

Well, I definitely think there needs to be a class that combines the
inside and the outside, or the data and the metadata.  Certainly the
separate parts will exist separately for purposes of implementation, but
there needs to be a user-friendlier view wrapped around that.  Or maybe
there are (sort of) three levels, low, medium, and high; that is, the basic
implementation level (=P6 direct access to OS- and FS- system calls); the
combined level, where an IO or "File" object encompasses IO::FSnode and
IO::FSdata, etc.; and a gloss-over-the-details level with lots of sugar on
top (at the expense of losing control over some details).

       Hmm.  With the quoting idea, I don't see the need for a "both" type
of object.  I mean, I'd see the code happening something like this:

if (path{/path/to/file}.e) {
       @lines = slurp(path{/path/to/file});
}

       Or...

if (path{/path/to/file}.e) {
       $handle = open(path{/path/to/file});
}



       (I'm using one of David's suggested syntaxes above, but I'm not
closely attached to it).

For the record, the above syntax was my suggestion.

        Ok, as long as I don't have to take the blame :).

Seriously, I was confused by trying to reply to two e-mails at once. Sorry.

In fact, having q, Q, or qq involved at all strikes me as wrong,
since those three are specifically for generating strings.

Pathnames still are strings, so that's fine.  In fact, there are different

       Hmm.  I'm not so sure; maybe I'm just being picky, but I want to
clarify things in case it's important (in other words, I'm thinking out loud
here to see if it helps).

       First, Q and friends don't generate strings, they generate
string-like objects, which could be Str, or Match, or whatever.  Think of
quoting constructs as a way of temporarily switching to a different
sublanguage (cf. regex), and you'll have the idea that I have in mind.

       As for pathnames being strings, you may be right FSVO string.  But
I'd say that, while they may be strings, they're not Str, but they do Str,
as in

role    IO::FSNode does Str {...}

       (FSNode may not be the right name here, but is used for illustrative
purposes).

I'd go one step further.  Consider the Windows path 'C:\Program
Files\'.  Is the string what's really important, or is it the
directory to which the string refers?  I ask because, for legacy
reasons, the following points to the same directory: 'C:\PROGRA~1\'.
Then there's the matter of absolute and relative paths: if the current
working directory is 'C:\Program Files\', then the path 'thisfile'
actually refers to 'C:\Program Files\thisfile'.  And because of parent
directory and self-reference links, something like '/bin/../etc/.' is
just an overcomplicated way of pointing to '/etc'.  I'd like Perl 6's
treatment of filenames to be smart enough that smart-matching any of
these pairs of "alternative spellings" would result in a successful
match.  So while I'll agree that filenames are string-like, I really
don't want them to _be_ strings.
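
A rough illustration of the "alternative spellings" idea, written in Python rather than Perl 6 because the path syntax discussed here is still hypothetical. The helper name same_location is made up for this sketch. It compares paths purely lexically, so it copes with relative paths and with '.' and '..' segments, but it cannot know that 'C:\PROGRA~1\' is a short name for 'C:\Program Files\' (that needs a question to the filesystem), and it can give the wrong answer when symlinks are involved.

```python
import posixpath


def same_location(a, b, cwd="/"):
    """Return True if two POSIX-style paths spell the same location.

    Hypothetical helper: purely lexical, never touches the filesystem.
    """
    def canonical(path):
        # join() leaves absolute paths alone and anchors relative ones at cwd
        return posixpath.normpath(posixpath.join(cwd, path))

    return canonical(a) == canonical(b)


print(same_location("/bin/../etc/.", "/etc"))
print(same_location("thisfile", "/home/user/thisfile", cwd="/home/user"))
print(same_location("/etc", "/bin"))
```

A smart-match on path objects could be defined along these lines, with a filesystem-aware comparison as the stricter alternative.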

Good ideas. But I still want it to have the same interface, so I can concatenate them easily in error messages :).

things going on here; one is to have a way of conveniently quoting strings
that contain a lot of backslashes.  Just as Perl lets you pick different
quotation marks, to make it easier to quote strings that have a lot of " or
' characters, so it should have a way to make it easy to quote strings with
a lot of backslashes.  (The most obvious example being Windows paths; but
there are other possibilities, such as needing to eval some code that
already has a lot of backslashes in it.)

Now, you can already turn backwhacking on or off via Q's :backslash
adverb; Q:qq includes :b (and Q:q recognises a few limited escape sequences
like \\). So you could say Q[C:\some\path], and you could add scalar
interpolation to say Q:s[C:\some\path\$filename].  But there's no way to
have all of: literal backslashes + interpolation + escaped sigils.

Perhaps instead of a simple :b toggle, we could have an :escape<Str>
adverb that defaults to :escape<\>? Then you could have
Q:scalar:escape("^")[C:\path\with\literal^$\$filename].
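
To make the :escape proposal concrete, here is a small Python sketch of the intended behaviour: backslashes stay literal, $name is interpolated, and the chosen escape character protects whatever follows it. The function name interpolate and the dictionary of variables are assumptions made for this sketch only; nothing here is existing Perl 6 syntax or semantics.

```python
import re


def interpolate(template, variables, escape="^"):
    """Interpolate $name, treating `escape` (not backslash) as the escape."""
    # Either "<escape><any char>" or "$identifier"; everything else is literal.
    pattern = re.compile(re.escape(escape) + r"(.)|\$(\w+)", re.DOTALL)

    def substitute(match):
        if match.group(1) is not None:
            return match.group(1)              # escaped character, kept as-is
        return str(variables[match.group(2)])  # interpolated variable

    return pattern.sub(substitute, template)


print(interpolate(r"C:\path\with\literal^$\$filename",
                  {"filename": "report.txt"}))
```

With filename set to report.txt, the example from the paragraph above comes out as C:\path\with\literal$\report.txt.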

       Maybe a global variable?  It's an interesting idea, and I'll see how
others feel :).

I'm leery of global variables, per se; but I _do_ like the idea of
lexically-scoped options that let you customize the filename syntax.
Changing the default delimiter would be the most common example of
this.

Yeah, global variable is probably a bad idea. But it *feels* like it should be some kind of global or semi-global setting :). By "semi-global", I mean something that you can override in your local scope, and have it revert, much as with the $*IN, etc, filehandles.
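
A sketch of what "semi-global" could mean in practice, using Python's contextvars as a stand-in for a Perl 6 dynamic variable like $*IN. The names path_separator and join_path are invented for the illustration; the point is only that the override applies inside a scope and reverts on the way out.

```python
import contextlib
import contextvars

# Default setting, visible everywhere unless overridden.
_separator = contextvars.ContextVar("separator", default="/")


@contextlib.contextmanager
def path_separator(value):
    """Override the separator for the duration of a with-block."""
    token = _separator.set(value)
    try:
        yield
    finally:
        _separator.reset(token)  # revert to whatever was in force before


def join_path(*parts):
    return _separator.get().join(parts)


print(join_path("some", "file"))
with path_separator("\\"):
    print(join_path("some", "file"))
print(join_path("some", "file"))
```

The first and last calls use the default separator; only the call inside the with-block sees the backslash.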

Now, isn't Q:path[/some/file] just creating an IO object?  Unlike /foo/,
where "foo" just IS the pattern, "/some/file" is *not* an IO object, it's
just a filename.  So if the special path-quoting returned an IO::File::Name
object, I would be perfectly happy.  But you can't have $filename.size -- a
fileNAME doesn't have a size, the file itself does.  To get from the
filename to the file, you need another step, and it's that extra step I
don't like

       We seem to agree that we're talking about at least three things here
(however much some of them are disguised).
-       Paths (ie. file names, URLs, whatever)
-       File metadata (IO::FSNode, etc)
-       File data (what S32/IO calls IO::File -- the stuff inside the file)

       The difference in our approaches is that you seem keen to integrate
closely the data and the metadata, whereas I'm trying to integrate the paths
and the metadata.

Right.  My own sense is that it's better to integrate the paths and
the metadata.  I rather miss the relatively straightforward way that
Perl 5 let you say things like '-e "filename"'; and while I understand
the argument against attaching file testers directly to strings, I
would still like to be able to do something approximating that.  Right
now, smart-matching against a Pair is the officially endorsed approach
for handling this sort of thing; but to me, that also feels like a
kludge.  Letting you access metadata by means of methods called on a
path could come close to the original file tests in terms of utility,
without requiring the kludginess of Pair matching or Str methods.
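
For comparison, Python's pathlib takes roughly this approach: the path object carries the metadata methods, and getting at the data is a separate step. The sketch below is an analogy only, not a proposal for the Perl 6 API; the helper name describe is made up.

```python
import tempfile
from pathlib import Path


def describe(path):
    """Collect metadata through the path object, without opening the file."""
    if not path.exists():            # roughly what Perl 5's -e answers
        return None
    info = path.stat()
    return {"size": info.st_size, "is_dir": path.is_dir()}


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.txt"
    target.write_text("hello")       # the data is reached by a separate call
    print(describe(target))
    print(describe(Path(tmp) / "missing"))
```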

        Thank you for explaining my rationale :).

IO objects always to parse a literal string the right way.  (I.e. "/" is
made to represent the official cross-platform path separator.  Unless you
parse some literal text with Q:ntfs, which parses a literal "\" as "/"??  Or
say "use path-separator '\';" or something!)

       Another problem, just to make things exciting:

$ echo $PATH
/home/wayland/local/bin:/usr/global/bin:/usr/local/bin:/bin:/usr/bin:/usr/sbin:/usr/local/sbin:/sbin

       Now, which of these is the path?  The whole thing, or the individual
directory paths?  And what's the path separator?  Slash (/) or colon (:)?
 So I think we need to distinguish between a search path, and a
file/directory path.  So we'd be doing "use fs-path-separator Q '\';" (note
the Q -- your example would, I think, have needed a \\ in it).

IMHO, the Perl equivalent of your 'echo $PATH' example ought to return
a list of '/'-delimited paths - that is, what you're referring to as a
search path would merely be a list of file/directory paths.
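
In Python terms, the "search path is just a list of paths" view might look like the sketch below. The function name split_search_path is hypothetical, and dropping empty entries is a simplification (POSIX shells treat an empty PATH entry as the current directory).

```python
from pathlib import PurePosixPath


def split_search_path(value, separator=":"):
    """Turn a PATH-style string into a list of directory paths."""
    return [PurePosixPath(entry) for entry in value.split(separator) if entry]


search_path = split_search_path("/usr/local/bin:/bin:/usr/bin")
print([str(directory) for directory in search_path])
```

Here the colon separates list elements and the slash separates components within one element, which keeps the two meanings of "separator" apart.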

I guess I'm making the point that the term "path separator" is ambiguous, and should be avoided :).

In fact, path literals may be more pattern-like than string-like, when
and if you start taking wildcard characters into account:

if $file ~~ path[./*.txt] {
   say "$file is a text file in the current working directory.."
}

Is that just a regex, in fact?

       No, we're talking about a globbing sublanguage (see the glob()
function).  In p5, this could be done with the <> quotes, but S02 says that
in P6 you need to use the "glob" function if you want this kind of
functionality.  The "glob" language is different than regex.  I was wanting
to replace the "glob" language with something more like XPath, but that idea
was vetoed by people who didn't want Tree-related objects to be part of the
core, so I'm doing that as a library.

Right.  And if globbing is built into path literals, then a path
literal doesn't represent a filename, per se; it represents a filename
pattern which might match 0, 1, or multiple filenames.  That said, the
globbing sublanguage is such that if you _don't_ use wildcard
characters, you'll match one file at best.
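
The "0, 1, or multiple" behaviour is easy to see with an existing glob implementation. The sketch below uses Python's glob module on a scratch directory; the helper name matches is made up, and Python's glob rules are only one possible reading of the globbing sublanguage.

```python
import glob
import os
import tempfile


def matches(directory, pattern):
    """Return the sorted file names in directory that match a glob pattern."""
    # escape() keeps wildcard characters in the directory name literal
    hits = glob.glob(os.path.join(glob.escape(directory), pattern))
    return sorted(os.path.basename(hit) for hit in hits)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "c.md"):
        open(os.path.join(tmp, name), "w").close()
    print(matches(tmp, "*.txt"))        # wildcard: may match several files
    print(matches(tmp, "a.txt"))        # no wildcard: at most one file
    print(matches(tmp, "missing.txt"))  # no wildcard, no such file
```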

_This_ might be a reason to go with three separate things: path,
metadata, and data.

The way I see it, there are at least 4 concepts -- glob, path, metadata, and data. So far, I've been suggesting putting the path and the metadata in the same object, the data in a separate object, and left the current S32/IO suggestion of having the .glob() on the Filesystems object alone, because I'm fine with that :). I still think of the glob language as a tree node selection language.

Data would refer to a file's contents, and would be represented by a stream to which you can gain access by opening the file; metadata would represent the file itself and information that you can glean without opening the file - essentially, p5's fstat. Path would be a filesystem pattern, used to test a particular file (as identified by its metadata) for location.

I think you mean *stat, since you really seem to want "stat" and "lstat".

Though, to speak of file-types, this particular example would I think
better be handled by:

 if $file.type ~~ MIME("text/plain") {...}

This doesn't include the other condition for which I was testing:
namely, that $file can be found in the current working directory.


       Cool idea.  How would the type be determined?  Are you thinking of
the algorithms in the unix "file" utility?  Please tell me you're not
planning to use filename extensions -- that's bad :).

Wouldn't $file.type be metadata?

Ok, now we're on to the definition of metadata. The question is, which of the following does "metadata" mean?:
1       The metadata that the filesystem attaches to the file
2       All the information that can be gathered without opening the file
3       Any information we can gather about the file that isn't the actual
        data contained in the file, but may involve reading at least part of
        it
4       Something else

I'm guessing that 1 and 2 are pretty much synonymous. I'd been using that definition without thinking about it much. If we assume, though, that the :r style tests are all metadata, then :T and :B prove that I'm wrong here, since those tests have to read part of the file.
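
The difference between definitions 2 and 3 can be shown with two small Python helpers (both names are invented for this sketch). The first guesses a type from the file name alone and never opens the file; the second has to read the first block of data. The NUL-byte check is a deliberately crude stand-in for a :T/:B style test and is not the algorithm Perl or the unix "file" utility actually uses.

```python
import mimetypes
import tempfile
from pathlib import Path


def type_from_name(path):
    """Guess a MIME type from the file name only (no file access)."""
    guessed, _encoding = mimetypes.guess_type(str(path))
    return guessed


def looks_like_text(path, block_size=512):
    """Crude content check: read the first block and look for NUL bytes."""
    with open(path, "rb") as handle:
        return b"\0" not in handle.read(block_size)


with tempfile.TemporaryDirectory() as tmp:
    misleading = Path(tmp) / "notes.txt"
    misleading.write_bytes(b"\x00\x01\x02 not really text")
    print(type_from_name(misleading))   # trusts the extension
    print(looks_like_text(misleading))  # looks at the content
```

The two helpers disagree about the same file, which is the reason extension-based typing is risky and the reason content-based typing cannot count as "gathered without opening the file".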

Anyway, I'm still not sure what you, Jon, mean when you ask "Wouldn't $file.type be metadata?"


---------------------------------------------------------------------
| Name: Tim Nelson                 | Because the Creator is,        |
| E-mail: wayl...@wayland.id.au    | I am                           |
---------------------------------------------------------------------

----BEGIN GEEK CODE BLOCK----
Version 3.12
GCS d+++ s+: a- C++$ U+++$ P+++$ L+++ E- W+ N+ w--- V- PE(+) Y+>++ PGP->+++ R(+) !tv b++ DI++++ D G+ e++>++++ h! y-
-----END GEEK CODE BLOCK-----
