Re: S26 - The Next Generation
On Sun, Aug 16, 2009 at 3:26 PM, Damian Conway <dam...@conway.org> wrote:

* The DOC statement prefix constrains any block to which it is applied (including BEGIN, CHECK, INIT and similar) to run only if -doc is specified on the commandline
* You can tell if you're running under -doc by checking $?DOC

Does this mean I can run code on some other machine when someone on that machine reads my documentation?

Kyle.
Re: Filename literals
Timothy S. Nelson wrote:
David Green wrote:
Jon Lang wrote:

If so, could you give some examples of how such a distinction could be beneficial, or of how the lack of such a distinction is problematic?

Well, my main thought in this context is that the stuff that can be done to the inside of a file can also be done to other streams -- TCP sockets for example (I know, there are differences, but the two are a lot the same), whereas metadata makes less sense in the context of TCP sockets; I guess this was one of the thoughts that led me to want separate things here.

Ah. I can see that.

Well, I definitely think there needs to be a class that combines the inside and the outside, or the data and the metadata. Certainly the separate parts will exist separately for purposes of implementation, but there needs to be a user-friendlier view wrapped around that. Or maybe there are (sort of) three levels, low, medium, and high; that is, the basic implementation level (=P6 direct access to OS- and FS- system calls); the combined level, where an IO or File object encompasses IO::FSnode and IO::FSdata, etc.; and a gloss-over-the-details level with lots of sugar on top (at the expense of losing control over some details).

Hmm. With the quoting idea, I don't see the need for a "both" type of object. I mean, I'd see the code happening something like this:

    if (path{/path/to/file}.e) {
        @lines = slurp(path{/path/to/file});
    }

Or...

    if (path{/path/to/file}.e) {
        $handle = open(path{/path/to/file});
    }

(I'm using one of David's suggested syntaxes above, but I'm not closely attached to it).

For the record, the above syntax was my suggestion.

I guess what I'm saying here is that I think we can do the things without people having to worry about the objects being separate unless they care. So, separate objects, but hide it as much as possible. Is that something you're fine with?

It looks good to me.
In fact, having q, Q, or qq involved at all strikes me as wrong, since those three are specifically for generating strings.

Pathnames still are strings, so that's fine. In fact, there are different

Hmm. I'm not so sure; maybe I'm just being picky, but I want to clarify things in case it's important (in other words, I'm thinking out loud here to see if it helps). First, Q and friends don't generate strings, they generate string-like objects, which could be Str, or Match, or whatever. Think of quoting constructs as a way of temporarily switching to a different sublanguage (cf. regex), and you'll have the idea that I have in mind.

As for pathnames being strings, you may be right FSVO string. But I'd say that, while they may be strings, they're not Str, but they do Str, as in

    role IO::FSNode does Str {...}

(FSNode may not be the right name here, but is used for illustrative purposes).

I'd go one step further. Consider the Windows path 'C:\Program Files\'. Is the string what's really important, or is it the directory to which the string refers? I ask because, for legacy reasons, the following points to the same directory: 'C:\PROGRA~1\'. Then there's the matter of absolute and relative paths: if the current working directory is 'C:\Program Files\', then the path 'thisfile' actually refers to 'C:\Program Files\thisfile'. And because of parent directory and self-reference links, things like '/bin/../etc/.' is just an overcomplicated way of pointing to '/etc'. I'd like Perl 6's treatment of filenames to be smart enough that smart-matching any of these pairs of alternative spellings would result in a successful match. So while I'll agree that filenames are string-like, I really don't want them to _be_ strings.

things going on here; one is to have a way of conveniently quoting strings that contain a lot of backslashes.
Just as Perl lets you pick different quotation marks, to make it easier to quote strings that have a lot of " or ' characters, so it should have a way to make it easy to quote strings with a lot of backslashes. (The most obvious example being Windows paths; but there are other possibilities, such as needing to eval some code that already has a lot of backslashes in it.)

Now, you can already turn backwhacking on or off via Q's :backslash adverb; Q:qq includes :b (and Q:q recognises a few limited escape sequences like \\). So you could say Q[C:\some\path], and you could add scalar interpolation to say Q:s[C:\some\path\$filename]. But there's no way to have all of: literal backslashes + interpolation + escaped sigils.

Perhaps instead of a simple :b toggle, we could have an :escape<Str> adverb that defaults to :escape<\>? Then you could have Q:scalar:escape(^)[C:\path\with\literal^$\$filename].

Maybe a global variable? It's an interesting idea, and I'll see how others feel :).

I'm leery of global variables, per se; but I _do_ like the idea of lexically-scoped options that let you customize the filename syntax.
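The behaviour asked for above (literal backslashes, scalar interpolation, and a configurable escape character that protects a sigil) can be modelled outside Perl 6. The following is only a Python sketch of those semantics, not the proposed Q adverb itself: the function name `quote_path`, its parameters, and the rule that the escape character protects exactly one following character are assumptions made for illustration.

```python
import re


def quote_path(template, variables, escape="^"):
    """Interpolate $name scalars while keeping backslashes literal.

    `escape` (hypothetical, modelled on the :escape(^) idea) makes the
    character that follows it literal, so "^$" yields a plain "$".
    """
    pattern = re.compile(
        re.escape(escape) + r"(.)"   # escape char + any char: keep that char literally
        r"|\$([A-Za-z_]\w*)",        # unescaped $name: interpolate
        re.DOTALL,
    )

    def substitute(match):
        if match.group(1) is not None:
            return match.group(1)
        return str(variables[match.group(2)])

    return pattern.sub(substitute, template)


# The example from the message, with a made-up value for $filename.
result = quote_path(r"C:\path\with\literal^$\$filename", {"filename": "report.txt"})
print(result)
```

Because the backslash is never treated specially here, Windows-style separators pass through untouched; only the chosen escape character and unescaped sigils are interpreted.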
Re: S26 - The Next Generation
Damian Conway wrote:

It's not yet committed, as there will (no doubt) be much discussion first. I apologize in advance: I am still travelling on my annual world tour, so my ability to participate in this discussion will be limited and erratic.

In the spirit of "ask for forgiveness rather than permission" I'd suggest to commit it early. People on #perl6 have been asking where it is already, since it's not at the usual location[tm].

Of course, all comments, suggestions, and patches are most welcome.

Then let me start with a huge praise: to me it seems much more practical to the Pod writer than the previous version. I appreciate the huge effort that has surely flowed into it.

However it seems we have to pay a price: each act of rendering a Pod file actually means executing the program that's being documented (at least the BEGIN blocks and other stuff that happens at compile time), with all the security risks implied. So we'll need a *very* good sandbox. Is that worth it?

Two minor comments:

ll 99: followed by a valid identifierN<A valid identifier is a sequence of alphanumerics and/or underscores, beginning with an alphabetic or underscore>

Is there a good reason to deviate from Perl 6's definition of an identifier? For the sake of consistency I'd just say that the Perl 6 rules apply.

ll 311:

    sub fu ( #= This text stored in C<fu.WHY>

This seems to be ignorant of multi subs. If I write

    multi sub fu () {#= some Pod

Then fu is a multi, not a particular candidate. Does it actually attach to the .WHY of the candidate? Or of the multi?

Cheers,
Moritz
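To make the identifier question concrete, here is a small Python sketch comparing the rule quoted from the draft with a rough approximation of Perl 6 identifiers. The second pattern is an assumption: it models the commonly described rule that a hyphen or apostrophe may join segments that each start with a letter or underscore, and it is not taken from the draft or from any grammar source.

```python
import re

# Rule quoted from the draft: alphanumerics and/or underscores,
# beginning with an alphabetic character or an underscore.
POD_IDENTIFIER = re.compile(r"[^\W\d]\w*")

# Approximation (assumption) of Perl 6 identifiers: the same shape, but
# segments may be joined by "-" or "'" when the next segment starts with
# a letter or underscore.
PERL6_IDENTIFIER = re.compile(r"[^\W\d]\w*(?:[-'][^\W\d]\w*)*")


def is_pod_identifier(text):
    """True if text satisfies the draft's identifier rule."""
    return POD_IDENTIFIER.fullmatch(text) is not None


def is_perl6_identifier(text):
    """True if text satisfies the approximated Perl 6 rule."""
    return PERL6_IDENTIFIER.fullmatch(text) is not None


for name in ["item", "_x1", "1abc", "doc-block", "isn't"]:
    print(name, is_pod_identifier(name), is_perl6_identifier(name))
```

Under these assumptions the two rules disagree only on names containing a hyphen or apostrophe, which is the deviation the question is about.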
Re: S26 - The Next Generation
However it seems we have to pay a price: each act of rendering a Pod file actually means executing the program that's being documented (at least the BEGIN blocks and other stuff that happens at compile time), with all the security risks implied. So we'll need a *very* good sandbox. Is that worth it?

From the spec:

However, during parsing and initialization under K<-doc>, the interpreter only executes those C<BEGIN>, C<CHECK>, and C<INIT> blocks (and equivalents, such as C<use> statements and subroutine declarations) that are preceded by the special prefix: C<DOC>

-- 
love, raiph
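The selection rule in the quoted spec text can be written down as a tiny model. This Python sketch does not execute anything; it only picks which compile-time blocks would run in each mode. The `PhaseBlock` type and the `blocks_to_run` function are hypothetical names, and the normal-mode behaviour (DOC-prefixed blocks are skipped) follows the earlier statement in this thread that a DOC-prefixed block runs only under -doc.

```python
from dataclasses import dataclass


@dataclass
class PhaseBlock:
    phase: str          # "BEGIN", "CHECK" or "INIT"
    doc_prefixed: bool  # True when written as e.g. `DOC INIT { ... }`
    label: str          # human-readable description, for the demo only


def blocks_to_run(blocks, doc_mode):
    """Select the compile-time blocks that would execute.

    Under -doc only DOC-prefixed blocks run; otherwise only the
    blocks without the DOC prefix run.
    """
    return [b for b in blocks if b.doc_prefixed == doc_mode]


program = [
    PhaseBlock("BEGIN", False, "load config"),
    PhaseBlock("INIT", True, "generate option table for the docs"),
    PhaseBlock("CHECK", False, "validate plugins"),
]

print([b.label for b in blocks_to_run(program, doc_mode=True)])
print([b.label for b in blocks_to_run(program, doc_mode=False)])
```

The model also shows why the prefix narrows the exposure without removing it: whatever is marked DOC still runs when the documentation is rendered.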
Re: S26 - The Next Generation
On Aug 17, 2009, at 14:27, Moritz Lenz wrote:

ll 99: followed by a valid identifierN<A valid identifier is a sequence of alphanumerics and/or underscores, beginning with an alphabetic or underscore>

Is there a good reason to deviate from Perl 6's definition of an identifier? For the sake of consistency I'd just say that the Perl 6 rules apply.

It occurs to me that *if* you are executing/evaluating (part of) the source, then it could be argued that an identifier should be defined by whatever language the parser ends up running, which might not be perl6.

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university    KF8NH
Re: S26 - The Next Generation
On Aug 17, 2009, at 14:34, raiph mellor wrote:

However it seems we have to pay a price: each act of rendering a Pod file actually means executing the program that's being documented (at least the BEGIN blocks and other stuff that happens at compile time), with all the security risks implied. So we'll need a *very* good sandbox. Is that worth it?

From the spec:

However, during parsing and initialization under K<-doc>, the interpreter only executes those C<BEGIN>, C<CHECK>, and C<INIT> blocks (and equivalents, such as C<use> statements and subroutine declarations) that are preceded by the special prefix: C<DOC>

Nonetheless,

    DOC INIT { system "rm -rf ." }

(or etc.) would be unfortunate.

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university    KF8NH
Re: S26 - The Next Generation
Nonetheless,

    DOC INIT { system "rm -rf ." }

(or etc.) would be unfortunate.

Gotcha.

Perhaps something like perl6 -DOC is needed to execute DOC blocks in the file passed on the command line and files it use's, whereas perl6 -doc only processes DOC blocks in the Setting or its use'd files, and merely parses but does not execute DOC blocks in the file passed on the command line and files it use's.

-- 
love, raiph
Re: S26 - The Next Generation
raiph mellor wrote:

However it seems we have to pay a price: each act of rendering a Pod file actually means executing the program that's being documented (at least the BEGIN blocks and other stuff that happens at compile time), with all the security risks implied. So we'll need a *very* good sandbox. Is that worth it?

From the spec:

However, during parsing and initialization under K<-doc>, the interpreter only executes those C<BEGIN>, C<CHECK>, and C<INIT> blocks (and equivalents, such as C<use> statements and subroutine declarations) that are preceded by the special prefix: C<DOC>

I didn't read that part, and I wonder how useful it is. Basically to produce a correct parse, any 'use' directive has to be executed, otherwise you can't know if something is a type name, a subroutine name or what not. Also modules can export special syntax, causing the parse to be significantly altered.

Cheers,
Moritz
Re: Filename literals
Hey,

Just joined the list, and I too have been thinking about a good path literal for Perl 6. Nice to see so many other people are thinking the same :).

Not knowing where to start in this long thread, I will instead try to show how I would like a path literal to work.

For me a path literal is a way to make the code pretty and clean. And for multi platform coding this is mostly where it gets hard to do. So I think a path literal should make it possible to use both a native style and a more modern portable one, without having to give up using spaces like in Path::Spec from Perl 5 or have to do verbose object creation.

First I think extending Q with a Q:path{} and making the alias Q:p{} and p{} would be the most consistent with the current string literal API. Also it should be possible to sub type the literals to further limit format and content. This should be done so we can get a compile time error when paths are known to be incorrect, or so that we throw an exception or return an undef with an error type (or whatever Larry called it) when we interpolate and return something that is known to be incorrect.

The default p{} should only allow / as separator and should not allow characters that won't work on modern Windows and Unix like \ / ? % * : | , etc. The reason for this is that portable Paths should be the default and if you really need platform specific behavior it should be shown in the code.

    my Path $path = p{../ext/dictonary.txt};

or

    my Path $path = p{c:/ext/dictonary.txt};

We should allow windows style paths so converting and maintaining code on this platform is not a pain.

    my Path $path = p:win{C:\Program Files\MS Access\file.file};

For Unix specific behavior we should have a p:unix{} literal, here the only limit is what is defined by locale. So we won't be able to write full Unicode if locale is set to Latin1. Writing filenames to the filesystem that other programs won't be able to read should be hard.
    my Path $path = p:unix{/usr/src/bla/myfile?:%.file};

And for people where this is a problem p:bin{} can be used as no checking is done here.

    my $path = p:bin{/usr/src/bla/??/adasd/myfile};

Old style Mac paths could also be supported where the : is used as separator.

    my Path $path = p:mac{usr:src:bla};

Or old DOS paths where 8 char limits and all the old DOS stuff apply.

    my Path $path = p:dos{c:\windows\test.fil};

URLs could also be supported with:

    my Path $path = p:url{file:///home/test.file}

** Path Object like File::Spec, etc. just nicer **

All the different variants for p{} return a Path object that offers much of what is found in File::Spec, Cwd and Path::Class in Perl 5 today in a more Perl 6 way.

    my Path $real_path = $path.realpath;  # Like Cwd's realpath
    my Path $volume = $path.volume;       # Returns the volume part if relevant
    my Path $dir = $path.dir;             # Returns the directory part
    my Path $file = $path.file;           # Returns the file part
    $path.shift();                        # Get rid of last part of path
    $path.pop();                          # Get rid of first part of path
    my @paths = $path.dirs;               # Returns the directory parts of the path

etc.

** Comparing Paths should do the right thing **

As we have the option of specifying what type a Path object is, this should also count when comparing them. So, e.g., p:win{} paths are case insensitive.

    my $file = p:win{c:\My File.txt};
    my $path = p:win{C:\Program Files\..};
    if($path.is_in($file)) {    # Check if the path is contained in another path
        say "$file is in $path\n";    # C:\My File.txt is C:
    }
    if(p{../test} ~~ p{../dir/../test}) {
        say "Comparing two Path works as it should";
    }

Also Path handles Unicode normalization so this won't be a problem: http://lists.zerezo.com/git/msg643117.html

Meaning that both "Märchen" written with a precomposed ä (U+00E4) and "Märchen" written as a plus a combining diaeresis (U+0308) are the same path, but without normalizing the path behind the user's back.

** Utility functions **

Path in itself knows nothing about the filesystem and files but might have a peek in $*CWD to do some path logic.
Except for that a number of File related functions might be available to make it easy to open and slurp a file a Path points to.

    my File $file = p{/etc/passwd}.open;
    if($file.type ~~ 'text/plain') {
        say "looks like a password file";
    }
    my @passwd = p{/etc/passwd}.lines;
    if(p{/etc/passwd}.exists) {
        say "passwd file exists";
    }

This is my thought so far, hope it helps the discussion.

Regards
Troels
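The comparison behaviour described in the proposal (lexical clean-up of `..` and `.`, case-insensitive Windows paths, Unicode normalization, and a containment check) can be sketched with Python's standard library. This is an illustration of the semantics only; the helper names are hypothetical, and the clean-up is purely lexical, so it ignores symbolic links, which a real implementation would have to consider.

```python
import ntpath
import posixpath
import unicodedata
from pathlib import PureWindowsPath


def same_posix_path(a, b):
    """Compare POSIX-style paths after Unicode (NFC) and lexical normalization."""
    def canon(p):
        return posixpath.normpath(unicodedata.normalize("NFC", p))
    return canon(a) == canon(b)


def same_windows_path(a, b):
    """Compare Windows-style paths; PureWindowsPath equality ignores case."""
    return (PureWindowsPath(ntpath.normpath(a))
            == PureWindowsPath(ntpath.normpath(b)))


def windows_dir_contains(directory, candidate):
    """True if candidate lies somewhere below directory (lexically)."""
    d = PureWindowsPath(ntpath.normpath(directory))
    c = PureWindowsPath(ntpath.normpath(candidate))
    return d in c.parents


print(same_posix_path("../test", "../dir/../test"))
print(same_posix_path("M\u00e4rchen", "Ma\u0308rchen"))
print(windows_dir_contains(r"C:\Program Files\..", r"c:\My File.txt"))
```

Note that the normalization happens only inside the comparison; the strings handed in are left as written, which matches the wish not to normalize paths behind the user's back.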
Re: Filename literals
Troels Liebe Bentsen wrote:

Hey,

Just joined the list, and I too have been thinking about a good path literal for Perl 6. Nice to see so many other people are thinking the same :).

Welcome to the list!

Not knowing where to start in this long thread, I will instead try to show how I would like a path literal to work.

A well-considered proposal, and one with which I mostly agree. Some thoughts:

The default p{} should only allow / as separator and should not allow characters that won't work on modern Windows and Unix like \ / ? % * : | , etc. The reason for this is that portable Path's should be the default and if you really need platform specific behavior it should be shown in the code.

I note that you explicitly included * and ? in the list of forbidden characters; I take it, then, that you're not in favor of Path as a glob-based pattern-matching utility? E.g.:

    my Path $path;
    ...
    unless $path ~~ p<astro*> { say "the file doesn't begin with 'astro'." }

Admittedly, this particular example _could_ be accomplished through the use of a regex; but there _are_ cases where the use of wildcard characters would be easier than the series of equivalent tests that Perl would otherwise have to perform in order to achieve the same result. Hmm... maybe we need something analogous to q vs. qq; that is:

    p<astro*>   #`{ syntax error: '*' is not a valid filename character. }
    pp<astro*>  #`{ returns an object that is used for Path pattern-matching; perhaps Pathglob or somesuch? }

We should allow windows style paths so converting and maintaining code on this platform is not a pain.
:
For Unix specific behavior we should have a p:unix{} literal, here the only limit are what is defined by locale.
:
And for people where this is a problem p:bin{} can be used as no checking is done here.
:
Old style Mac paths could also be supported where the : is used as separator.
:
Or old dos paths where 8 char limits and all the old dos stuff apply.

Hear, hear.
Note that these are all mutually exclusive, which suggests that the proper format ought to be something like:

    my Path $path = p:format<win>{C:\Program Files}

However, I have no problem with the idea that :win is short for :format<win>; the feature here is brevity.

Urls could also be support with:

    my Path $path = p:url{file:///home/test.file}

I would be very careful here, in that I wouldn't want to open the can of worms inherent in non-file protocols (e.g., ftp, http, gopher, mail), or even in file protocols with hosts other than localhost.

** Path Object like File::Spec, etc. just nicer **
:
** Comparing Paths should do the right thing **

Agreed on all counts.

** Utility functions **

Path in itself knows nothing about the filesystem and files but might have a peek in $*CWD to do some path logic. Except for that a number of File related functions might be available to make it easy to open and slurp a file a Path points to.

    my File $file = p{/etc/passwd}.open;
    if($file.type ~~ 'text/plain') {
        say "looks like a password file";
    }
    my @passwd = p{/etc/passwd}.lines;
    if(p{/etc/passwd}.exists) {
        say "passwd file exists";
    }

As soon as you allow methods such as .exists, it undermines your claim that Path knows nothing about the filesystem or files. IMHO, you should still include such methods.

-- 
Jonathan "Dataweaver" Lang
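The glob-style matching suggested in this reply (a path literal versus a separate pattern object) has a close analogue in Python, shown here purely as a sketch of the idea. The function name `name_matches` is hypothetical; `fnmatchcase` and `PurePosixPath` are standard-library facilities that never touch the filesystem.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def name_matches(path, pattern):
    """Match the final path component against a shell-style wildcard pattern."""
    return fnmatchcase(PurePosixPath(path).name, pattern)


print(name_matches("/home/user/astronomy.txt", "astro*"))
print(name_matches("/home/user/notes.txt", "astro*"))
```

Python's pathlib draws the same line discussed above: its "pure" path classes only manipulate names, while the concrete `Path` class adds filesystem methods such as `exists()`. That is one way to keep both views available without pretending they are the same thing.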
Re: S26 - The Next Generation
On Sun, Aug 16, 2009 at 1:26 PM, Damian Conway <dam...@conway.org> wrote:

* This means Pod can be indented; the = is no longer tied to the first column. The indentation preceding the opening = (using the ($?TABSTOP // 8) rule, as for heredocs) now specifies the zeroth column of the Pod block.

Will there be any ambiguity between Pod and wraparound operators that begin with =? e.g.,

    my Dog $spot
        = new Dog;    # Pod, or Perl assignment?

    if really_long_expression
        == value { ... }    # Pod, or equality operator?

* In addition to delimited, paragraphed, and abbreviated Pod blocks, documentation can now be specified in a fourth form:

    my $declared_thing;    #= Pod here until end of line

    sub declared_thing () {    #=[ Pod here until matching closing bracket ]
        ...
    }

Note the recent revisions to how Perl comments work - in particular, an embedded comment is now spelled #`[ ... ]. Should embedded attached Pod be spelled as #=`[ ... ]?

My preference would be to simply say that if the very first character within a comment is an =, then it becomes a Pod attachment. That is, we're not dealing with a variation of the Pod Comment syntax (i.e., s/#/#=/); rather, we're dealing with a special use of a normal comment. Thus, an embedded Pod attachment would be written as #`[=...].

The main benefit of this would be that if any further refinements occur to Perl's comment syntax, Pod will adapt to those changes seamlessly[1]. As well, this would help with any effort that might be made to integrate the use of Pod into other languages: e.g., Javascript-with-Pod would handle a Pod attachment as /*=...*/ or //=... (for embedded and end-of-line comments, respectively).

-- 
Jonathan "Dataweaver" Lang

[1] Not to derail the conversation, but I would consider this to be another argument in favor of the proposed '(#...)' syntax for embedded comments: with this syntax, an embedded Pod attachment would be spelled '(#=...)'. Much more aesthetically pleasing.
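The indentation rule in the quoted bullet (the opening line's indentation, measured with tabs expanded to a tab stop of 8, defines column zero of the block) can be sketched as follows. This is a Python illustration under stated assumptions: the function names are hypothetical, tabs are expanded everywhere on the line rather than only in the leading whitespace, and lines indented less than the opening line are simply left-trimmed.

```python
def indent_width(line, tabstop=8):
    """Number of leading columns once tabs are expanded to the tab stop."""
    expanded = line.expandtabs(tabstop)
    return len(expanded) - len(expanded.lstrip(" "))


def dedent_pod_block(lines, tabstop=8):
    """Remove the opening line's indentation from every line of the block."""
    margin = indent_width(lines[0], tabstop)
    result = []
    for line in lines:
        expanded = line.expandtabs(tabstop)
        # Never strip more than the line's own leading whitespace.
        strip = min(margin, indent_width(line, tabstop))
        result.append(expanded[strip:])
    return result


block = [
    "    =begin pod",
    "    Some text",
    "\t  indented more",
    "    =end pod",
]
for line in dedent_pod_block(block):
    print(repr(line))
```

In the example the third line starts with a tab plus two spaces, which counts as ten columns, so six columns of indentation remain after the block's four-column margin is removed.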
Re: S26 - The Next Generation
Could we also get =numbered and =term directives that are equivalent to =item :numbered and =item :term, respectively, for use with abbreviated blocks? E.g.:

    =numbered First Item
    =numbered Second Item
    =numbered Third Item

    =term First Name
    Definition
    =term Second Name
    Definition

Within tables, you should probably replace "whitespace" with "multiple whitespace" as a column delimiter; otherwise, the space between two words in an identifier would trigger a new column:

    column 1    column 2
    ^^^^^^ ^    ^^^^^^ ^

(Each group of ^'s would be a separate column.)

When using the code block alias, are the outermost curly braces considered to be part of the ambient code?

Why is =END a block, and not a directive?

-- 
Jonathan "Dataweaver" Lang
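The difference between the two column delimiters is easy to demonstrate. The Python sketch below is only an illustration of the suggestion, with a hypothetical function name; it treats a run of two or more whitespace characters as the delimiter and compares that with splitting on any whitespace.

```python
import re


def split_columns(row):
    """Split a table row on runs of two or more whitespace characters."""
    return re.split(r"\s{2,}", row.strip())


row = "column 1    column 2"
print(split_columns(row))  # multiple-whitespace delimiter
print(row.split())         # single-whitespace delimiter
```

With the multiple-whitespace rule the row has two cells; splitting on any whitespace breaks it into four, which is the problem pointed out above.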
Re: Filename literals
On Mon, 17 Aug 2009, Jon Lang wrote:

Well, I definitely think there needs to be a class that combines the inside and the outside, or the data and the metadata. Certainly the separate parts will exist separately for purposes of implementation, but there needs to be a user-friendlier view wrapped around that. Or maybe there are (sort of) three levels, low, medium, and high; that is, the basic implementation level (=P6 direct access to OS- and FS- system calls); the combined level, where an IO or File object encompasses IO::FSnode and IO::FSdata, etc.; and a gloss-over-the-details level with lots of sugar on top (at the expense of losing control over some details).

Hmm. With the quoting idea, I don't see the need for a "both" type of object. I mean, I'd see the code happening something like this:

    if (path{/path/to/file}.e) {
        @lines = slurp(path{/path/to/file});
    }

Or...

    if (path{/path/to/file}.e) {
        $handle = open(path{/path/to/file});
    }

(I'm using one of David's suggested syntaxes above, but I'm not closely attached to it).

For the record, the above syntax was my suggestion.

Ok, as long as I don't have to take the blame :). Seriously, I was confused by trying to reply to two e-mails at once. Sorry.

In fact, having q, Q, or qq involved at all strikes me as wrong, since those three are specifically for generating strings.

Pathnames still are strings, so that's fine. In fact, there are different

Hmm. I'm not so sure; maybe I'm just being picky, but I want to clarify things in case it's important (in other words, I'm thinking out loud here to see if it helps). First, Q and friends don't generate strings, they generate string-like objects, which could be Str, or Match, or whatever. Think of quoting constructs as a way of temporarily switching to a different sublanguage (cf. regex), and you'll have the idea that I have in mind.

As for pathnames being strings, you may be right FSVO string.
But I'd say that, while they may be strings, they're not Str, but they do Str, as in

    role IO::FSNode does Str {...}

(FSNode may not be the right name here, but is used for illustrative purposes).

I'd go one step further. Consider the Windows path 'C:\Program Files\'. Is the string what's really important, or is it the directory to which the string refers? I ask because, for legacy reasons, the following points to the same directory: 'C:\PROGRA~1\'. Then there's the matter of absolute and relative paths: if the current working directory is 'C:\Program Files\', then the path 'thisfile' actually refers to 'C:\Program Files\thisfile'. And because of parent directory and self-reference links, things like '/bin/../etc/.' is just an overcomplicated way of pointing to '/etc'. I'd like Perl 6's treatment of filenames to be smart enough that smart-matching any of these pairs of alternative spellings would result in a successful match. So while I'll agree that filenames are string-like, I really don't want them to _be_ strings.

Good ideas. But I still want it to have the same interface, so I can concatenate them easily in error messages :).

things going on here; one is to have a way of conveniently quoting strings that contain a lot of backslashes. Just as Perl lets you pick different quotation marks, to make it easier to quote strings that have a lot of " or ' characters, so it should have a way to make it easy to quote strings with a lot of backslashes. (The most obvious example being Windows paths; but there are other possibilities, such as needing to eval some code that already has a lot of backslashes in it.)

Now, you can already turn backwhacking on or off via Q's :backslash adverb; Q:qq includes :b (and Q:q recognises a few limited escape sequences like \\). So you could say Q[C:\some\path], and you could add scalar interpolation to say Q:s[C:\some\path\$filename]. But there's no way to have all of: literal backslashes + interpolation + escaped sigils.
Perhaps instead of a simple :b toggle, we could have an :escape<Str> adverb that defaults to :escape<\>? Then you could have Q:scalar:escape(^)[C:\path\with\literal^$\$filename].

Maybe a global variable? It's an interesting idea, and I'll see how others feel :).

I'm leery of global variables, per se; but I _do_ like the idea of lexically-scoped options that let you customize the filename syntax. Changing the default delimiter would be the most common example of this.

Yeah, global variable is probably a bad idea. But it *feels* like it should be some kind of global or semi-global setting :). By semi-global, I mean something that you can override in your local scope, and have it revert, much as with the $*IN, etc, filehandles.

Now, isn't Q:path[/some/file] just creating an IO object? Unlike /foo/, where foo just IS the pattern, /some/file is *not* an IO object, it's just a filename. So if the special path-quoting returned an
Re: Embedded comments: two proposed solutions to the comment-whole-lines problem
On Tue, Aug 11, 2009 at 12:42 PM, Jon Lang <datawea...@gmail.com> wrote:

jerry gay wrote:

for the latest spec changes regarding this item, see http://perlcabal.org/svn/pugs/revision/?rev=27959. is everyone equally miserable now? ;)

Already seen it. My latest points still stand, though: #`(...) is still vulnerable to ambiguity relative to #..., whereas `#(...), `#...#`, or (#...) don't share the same vulnerability.

With the latest S26 proposal (and its new declarator blocks) in mind, I think that I would be happiest if embedded comments used the (#...) syntax. Reason: you still get the flexibility of choosing your own delimiters (unlike `#...#`), and you don't have to worry about where the = goes (unlike #`(...): is it #=`(...), #=(...), or #`(=...)? Likewise for `#(...)).

-- 
Jonathan "Dataweaver" Lang
Re: S26 - The Next Generation
On 2009-Aug-17, at 12:27 pm, Moritz Lenz wrote:

However it seems we have to pay a price: each act of rendering a Pod file actually means executing the program that's being documented (at least the BEGIN blocks and other stuff that happens at compile time), with all the security risks implied. So we'll need a *very* good sandbox. Is that worth it?

Yes. In general, if you've installed a module, it's because you're going to use it, and you already trust it. So this is a problem only if you're looking at the documentation for the first time to decide whether you do want to use the module (and didn't already read the docs on CPAN.org first or something).

Of course, CPAN will need a static copy of the docs anyway, so the solution is that authors should provide a static file (preferably in a few formats, at least text and HTML). Sites like CPAN will probably make a static doc file a requirement, and even the cpan shell could warn users about any modules that don't include static docs -- in fact, I think it would be reasonable to refuse to install such modules by default.

-David