Re: A sketch of the security model
On 4/15/05, Shevek <[EMAIL PROTECTED]> wrote: > > How can dropping a privilege for the duration of a (dynamic) scope be > > implemented? Does this need to be implemented via a parrot intrinsic, > > such as: > > > > without_privs(list_of_privs, code_to_be_run_without_these_privs); > > > > ..or is it possible to do so with the primitives you sketched out above? > > This is usually done by creating a function "f(code) { code() }" without > any static privileges in list_of_privs. > > To evaluate a function g() > without those privileges, evaluate f(g), and the natural mechanisms of > the interpreter will ensure that these privileges are not held during > g(). I understand, thanks. Michael
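A minimal Python sketch of the wrapper trick described above, assuming a stack-inspection model in which the privileges in force are the intersection of the static privileges of every function on the call stack. The names `privileged`, `effective_privs` and `_stack` are invented for this illustration; they are not Parrot primitives.

```python
import functools

_stack = []  # static privilege sets of the functions currently executing


def privileged(*privs):
    """Decorator: attach a static privilege set to a function (hypothetical)."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            _stack.append(frozenset(privs))
            try:
                return fn(*args, **kwargs)
            finally:
                _stack.pop()
        return inner
    return wrap


def effective_privs():
    """Privileges held right now: the intersection over the whole call stack."""
    if not _stack:
        return frozenset()
    return frozenset.intersection(*_stack)


@privileged("read", "write", "net")
def g():
    return effective_privs()


@privileged("read")            # f lacks "write" and "net" statically
def f(code):
    return code()              # so anything run through f cannot hold them


@privileged("read", "write", "net")
def main():
    # First call g directly, then through the unprivileged wrapper f.
    return g(), f(g)
```

Under this assumed model no `without_privs` intrinsic is needed: calling `f(g)` is enough, because `f` sits on the stack for the whole of `g()`.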
Re: [RFC] some doubtable MMDs?
From: Larry Wall <[EMAIL PROTECTED]> Date: Fri, 15 Apr 2005 12:52:53 -0700 On Fri, Apr 15, 2005 at 02:38:36PM +0200, Leopold Toetsch wrote: : I'm not quite sure, but it seems that some of the MMD functions may : better be vtable methods: : : - bitwise_sh[rl]* shift by anything other than int? Shifting right by a positive BigInt (or left by a negative BigInt) can be optimized to -1 or 0. Shifting the other way could still produce a valid result for some values, even on a machine with 32-bit addresses. : - bitwise_lsr is missing generally : : or even just a plain opcode only: : : - logical_{or,and,xor} return a PMC depending on the boolean value : : What are HLLs expecting of these infix operations? Perl 6 tends to distinguish these as different operators, though Perl 5 did overload the bitwise ops on both strings and numbers, which newbies found confusing in ambiguous cases, which is why we changed it. [FWIW, Common Lisp can't use these ops, as it has a different idea of logical truth. And that's the honest (not nil). ;-} ] : OTOH it might be useful that the current get__keyed operations : (postcircumfix:[]) become MMD subroutines: : : Px = Py[Pz] Pz = String, Int, Key, Slice, ... At the moment, the Perl 6 optimizer is explicitly allowed to optimize array indices with the assumption that the subscript is a scalar (or slice) of integer, or something that converts to integer . . . Larry By the same token, couldn't one reasonably ask for a boolean array that required BigInt subscripts, even on said 32-bit machine? (Once boolean arrays actually store one element per bit, that is.) Or are subscripts this large ruled out? Or are you using "integer" conceptually to include both Integer and BigInt? -- Bob Rogers http://rgrjr.dyndns.org/
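The "-1 or 0" shortcut for oversized shift counts can be sketched as follows. This is an illustration only: `shift_right` and the 32-bit `word_bits` default are assumptions, and the value being shifted is taken to fit in a native word.

```python
def shift_right(value, count, word_bits=32):
    """Arithmetic right shift that short-circuits oversized counts.

    word_bits is an assumed native integer width, not a Parrot constant,
    and value is assumed to fit in that many bits.
    """
    if count < 0:
        raise ValueError("negative counts are out of scope for this sketch")
    if count >= word_bits:
        # Every value bit is shifted out; only the sign survives.
        return -1 if value < 0 else 0
    return value >> count
```

The count may be arbitrarily large (a BigInt) without any big-number arithmetic being done, which is the point of the optimization.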
Re: Various questions
According to Philip Taylor: > * I can usually handle unsigned numbers by pretending they're signed and > using 'I' registers, but some things appear to be awkward without new > ops - in particular, div and cmod, and le/lt/ge/gt comparisons. (As far > as I can tell, those are the only ones C would need; everything else > should work fine with the signed variants). Don't you also need unsigned assignment to N registers? double d = 0xUL; > I've added divu/leu/etc ops to math.ops/cmp.ops (and just made them cast > their operands into UINTVALs) - is that a reasonable thing to do? Would > they be better in a new .ops file? May as well leave them there. > * Should there be an 'isatty' op/method? I think so. I wouldn't tie it to the fileno() concept, because fileno() is less portable than isatty(filehandle), which is a reasonable sort of question beyond the bounds of Unix, in the Great Wilderness. > * Is it possible to merge PBC files together, like load_bytecode but at > compile-time? I'll punt on this one for now... Leo? > I've been using [gs]et_integer_keyed_int on a PMC to allow pointer > access. Since it reads whole ints, it probably crashes unnecessarily > when e.g. reading chars at unlucky addresses Yes ... on some arch's. Not x86, though, so I'm safe. :-) > but IMC code like "val = mem.read_i1(ptr)" feels unpleasantly > inefficient, particularly in string-processing loops. What about a native-code _function_ rather than an object method? -- Chip Salzenberg- a.k.a. -<[EMAIL PROTECTED]> Open Source is not an excuse to write fun code then leave the actual work to others.
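As a rough sketch of why only division and ordering need unsigned variants, here is the cast-to-unsigned idea in Python, assuming 32-bit two's-complement registers. The helper names mirror the ops mentioned above but are otherwise invented.

```python
MASK32 = 0xFFFFFFFF  # assumes 32-bit INTVAL registers


def as_unsigned(i):
    """Reinterpret a signed 32-bit value as unsigned (the UINTVAL cast)."""
    return i & MASK32


def as_signed(u):
    """Reinterpret an unsigned 32-bit value as signed."""
    u &= MASK32
    return u - (1 << 32) if u & 0x80000000 else u


def divu(a, b):
    """Unsigned division on values held in signed registers."""
    return as_signed(as_unsigned(a) // as_unsigned(b))


def ltu(a, b):
    """Unsigned less-than on values held in signed registers."""
    return as_unsigned(a) < as_unsigned(b)
```

For example, `ltu(1, -1)` is true because the bit pattern of -1 is the largest unsigned value, whereas the signed comparison `1 < -1` is false.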
Re: $*CWD instead of chdir() and cwd()
According to chromatic: > On Fri, 2005-04-15 at 23:52 +0200, Juerd wrote: > > Well, after failure it can be cwd() but false without breaking any real > > code, because normally, you'd never if (cwd) { ... }, simply because > > there's ALWAYS a cwd. > > Not always -- try removing a directory that's the pwd of another > process. Oh, the _directory_ is still there. :-) -- Chip Salzenberg- a.k.a. -<[EMAIL PROTECTED]> Open Source is not an excuse to write fun code then leave the actual work to others.
Unify cwd() [was: Re: $*CWD instead of chdir() and cwd()]
On Fri, Apr 15, 2005 at 08:31:57PM -0400, Chip Salzenberg wrote: > According to Michael G Schwern: > > And this is exactly what File::chdir does. $CWD is a tied scalar. > > I don't think current directory maps well on a variable. That won't > stop people from using it, of course. :-( > > There are several methods to determine the current directory. Each > one has its corner cases, strengths and weaknesses (thus the > proliferation of Cwd module functions), and it doesn't make any sense > to me to elevate one over the rest through the proposed $CWD. This is orthogonal to $CWD. Perl 6 is going to have to decide on some sort of standard internal getcwd technique, $CWD or not. In the same way that we have open() not fopen, fdopen, freopen... we can choose the safest and most sensible technique for determining the cwd and use that. You have to, because when a new user asks "how do I get the current working directory?" you want to say "cwd()" and not "Well, there are a variety of different techniques..." Cwd.pm is a perfect example of this problem. Which one should a user use? Most folks just won't care and the micro-differences between the functions in Cwd.pm aren't worth the trouble. Present a sensible default. Write a module with all the other options for those who need it. > mkdir '/tmp/foo'; > $CWD = '/tmp/foo'; > rename '../foo', '../bar'; > say $CWD; # Well? Which is it? It's exactly the same as... mkdir '/tmp/foo'; chdir '/tmp/foo'; rename '../foo', '../bar'; say cwd();
Re: $*CWD instead of chdir() and cwd()
According to Michael G Schwern: > And this is exactly what File::chdir does. $CWD is a tied scalar. I don't think current directory maps well on a variable. That won't stop people from using it, of course. :-( There are several methods to determine the current directory. Each one has its corner cases, strengths and weaknesses (thus the proliferation of Cwd module functions), and it doesn't make any sense to me to elevate one over the rest through the proposed $CWD. mkdir '/tmp/foo'; $CWD = '/tmp/foo'; rename '../foo', '../bar'; say $CWD; # Well? Which is it? -- Chip Salzenberg- a.k.a. -<[EMAIL PROTECTED]> Open Source is not an excuse to write fun code then leave the actual work to others.
Re: $*CWD instead of chdir() and cwd()
On Fri, Apr 15, 2005 at 03:22:48PM -0700, Michael G Schwern wrote: : On Fri, Apr 15, 2005 at 11:52:38PM +0200, Juerd wrote: : > > becomes an unverifiable operation. You have to use chdir() if you want to : > > error check and $CWD is reduced to a "scripting" feature. : > : > Well, after failure it can be cwd() but false without breaking any real : > code, because normally, you'd never if (cwd) { ... }, simply because : > there's ALWAYS a cwd. If this is done, the thing returned by the STORE : > can still be an lvalue and thus be properly reffed. : : Good idea! But if cwd() or chdir() doesn't fail(), you probably won't get any information on *why* the chdir failed in either the return value or $!. That could be construed as antisocial. In general I think "but" should be reserved for situations where the original interface designer showed sufficient lack of imagination to warrant such workarounds. That is how I treated all the RFCs that made use of "but" for built-in functionality, and I haven't seen any good reasons to alter my views on that. About the closest we get to it is that "interesting values of undef" can be thought of as new Exception(...) but undefined, or some such. But even that is usually hidden behind the fail() predicate, and the undef role is probably composed into exceptions in the first place. Or maybe it's the other way around. Larry
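The "interesting values of undef" idea, a false value that still carries the reason for the failure, might look roughly like this in Python. `Failure` and `try_chdir` are hypothetical names for this sketch and do not describe any Perl 6 API.

```python
import os


class Failure:
    """A false value that still says why it is false (hypothetical helper)."""

    def __init__(self, error):
        self.error = error

    def __bool__(self):
        return False

    def __repr__(self):
        return f"Failure({self.error!r})"


def try_chdir(path):
    """Return the new absolute cwd, or a Failure; the cwd is unchanged on error."""
    try:
        os.chdir(path)
    except OSError as exc:
        return Failure(exc)
    return os.getcwd()
```

Unlike "cwd() but false", the caller can test the result for truth and also ask it why the chdir failed.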
Re: $*CWD instead of chdir() and cwd()
On Fri, Apr 15, 2005 at 11:52:38PM +0200, Juerd wrote: > > becomes an unverifiable operation. You have to use chdir() if you want to > > error check and $CWD is reduced to a "scripting" feature. > > Well, after failure it can be cwd() but false without breaking any real > code, because normally, you'd never if (cwd) { ... }, simply because > there's ALWAYS a cwd. If this is done, the thing returned by the STORE > can still be an lvalue and thus be properly reffed. Good idea!
Re: [pugs] Quoting constructs
On 16 Apr, Roie Marianer wrote: : By the way, something tells me perl6-compiler isn't the best place for this : discussion. Is there a secret group of people that discusses cornercases for : perl6, and if so could someone tell me on what list they live? You most likely want perl6-language, where Larry, among others, participates. Steven
Announcing Test::TAP::Model and Test::TAP::HTMLMatrix
Hola... The code used to generate pugs smoke HTMLs (like http://nothingmuch.woobling.org/pugs_test_status/ - warning around 800K), was refactored into two perl (5) modules, now (that is, when your mirror has synced) available on the CPAN. This code is authored by many of the pugs authors. If you feel the need to discuss it, I think #perl6 on freenode is the place. In any case, I'm not authoritative, as this code is not only mine. In order to honor the fine tradition of releng breakage, both 0.01 versions are crummy. Use 0.02. Sorry =( The two darcs repos for these modules are: http://nothingmuch.woobling.org/Test-TAP-Model http://nothingmuch.woobling.org/Test-TAP-HTMLMatrix Test::TAP::Model wraps around Test::Harness::Straps and gives a sort of souped up DOM to the TAP data that was collected, and Test::TAP::HTMLMatrix creates the HTML using this DOM and a Petal template. Ciao! -- () Yuval Kogman <[EMAIL PROTECTED]> 0xEBD27418 perl hacker & /\ kung foo master: /methinks long and hard, and runs away: neeyah!!!
Re: Comparing rationals/floats
At 16:18 -0700 4/15/05, gcomnz wrote: >More questions stemming from cookbook work... Decimal Comparisons: > >The most common recipe around for comparisons is to use sprintf to cut >the decimals to size and then compare strings. Seems ugly. > >The non-stringification way to do it is usually along the lines of: > >if (abs($value1 - $value2) < abs($value1 * epsilon)) > >(From Mastering Algorithms with Perl errata) > >I'm wondering though, if C<$value1 == $value2> is always wrong (or >almost always wrong) then should it be smarter and: >SNIP >Marcus Adair I have longed for an OO class that might be called "measurement". An object would include a float, a unit of measure, and an estimate of accuracy. Mathematical operations would be overloaded so that the result of a calculation would appropriately handle propagation of the argument's accuracies into the result. It might even do unit conversions but that's another subject. Coercion of a float into a measurement would be automatic with infinite precision assumed. Given the new class it is easy to adjust comparison operators to calculate "within experimental error". -- --> Life begins at ovulation. Ladies should endeavor to get every young life fertilized. <--
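A minimal sketch of such a "measurement" class, in Python. It assumes independent errors combined in quadrature (first-order propagation), leaves units out entirely, and uses invented names throughout; it is one possible reading of the idea, not a proposal for the actual class.

```python
import math


class Measurement:
    """A value with an absolute uncertainty (units omitted in this sketch)."""

    def __init__(self, value, error=0.0):
        self.value = float(value)
        self.error = abs(float(error))

    @staticmethod
    def _coerce(other):
        # A bare number is treated as exact, as the post suggests.
        return other if isinstance(other, Measurement) else Measurement(other)

    def __add__(self, other):
        other = self._coerce(other)
        # Independent errors add in quadrature.
        return Measurement(self.value + other.value,
                           math.hypot(self.error, other.error))

    __radd__ = __add__

    def __mul__(self, other):
        other = self._coerce(other)
        value = self.value * other.value
        # First-order propagation: sqrt((y dx)^2 + (x dy)^2)
        error = math.hypot(other.value * self.error, self.value * other.error)
        return Measurement(value, error)

    __rmul__ = __mul__

    def __eq__(self, other):
        other = self._coerce(other)
        # Equal "within experimental error": the gap is no larger than
        # the combined uncertainty.
        return abs(self.value - other.value) <= math.hypot(self.error, other.error)

    def __repr__(self):
        return f"Measurement({self.value}, error={self.error})"
```

One consequence worth noting: equality defined this way is not transitive, so such objects are poor candidates for hash keys or sorting.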
Re: [pugs] Quoting constructs
On Friday 15 April 2005 3:27 am, Larry Wall wrote: > On Fri, Apr 15, 2005 at 03:27:27AM +0300, Roie Marianer wrote: > : > %hash<< a $key_b c >> :key<< a $value_b c >> > : > %hash« a $key_b c »:key« a $value_b c » > : > : Just to be certain, these are both equivalent to > : > : @hash{'a', $key_b, 'c'} key => ['a', $value_b, 'c'] > : > : in Perl 5, right? > > Close. It's actually more like: > > @hash{split " ", "a $key_b c"}key => [split " ", "a $value_b c"] I actually knew that, but in my head $key_b and $value_b were single words. But according to S02, the interpolation is protected by quotes. That is, if $key_b is q0/printf "Hello, world\n" or die"/, that's four words, correct? Or is it just if the quotes actually appear in the quoting construct? Basically I'm wondering if there's a detailed specification of how <<>> should work. Several only-slightly-related questions about interpolating: 1. qq x$varx eq $var? (That's how it works in Perl5, anyway) 2. If the delimiter is not a single character (I think this only applies to <<>>), does a backslash protect the first character or both? For example, in <>> or die Is that three words ['some', 'words', '>'] with the >> ending the construct, or is that ['some', 'words', '>>>', 'or', 'die']? (and the rest of the file is interpolated and split into words) 3. Are <<>>-style delimiters allowed in other quoting constructs? Is q<> the string "Hello", or the string "> yet at all.) My head hurts. :-) By the way, something tells me perl6-compiler isn't the best place for this discussion. Is there a secret group of people that discusses cornercases for perl6, and if so could someone tell me on what list they live? -- -Roie v2sw6+7CPhw5ln5pr4/6$ck2ma8+9u7/8LSw2l6Fi2e2+8t4TNDSb8/4Aen4+7g5Za22p7/8 [ http://www.hackerkey.com ]
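The two readings of the word-quoting question can be contrasted with a small Python sketch. Both function names are invented, and which behaviour S02 intends is exactly the open question here, so neither is presented as the answer.

```python
from string import Template


def quote_words(template, **variables):
    """Interpolate first, then split on whitespace, like split " ", "..."."""
    return Template(template).substitute(variables).split()


def quote_words_protected(template, **variables):
    """Split the template first so each interpolated value stays one word."""
    return [Template(word).substitute(variables) for word in template.split()]
```

With `key_b="two words"`, the first reading yields four words for `"a $key_b c"` and the second yields three.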
Comparing rationals/floats
More questions stemming from cookbook work... Decimal Comparisons: The most common recipe around for comparisons is to use sprintf to cut the decimals to size and then compare strings. Seems ugly. The non-stringification way to do it is usually along the lines of: if (abs($value1 - $value2) < abs($value1 * epsilon)) (From Mastering Algorithms with Perl errata) I'm wondering though, if C<$value1 == $value2> is always wrong (or almost always wrong) then should it be smarter and: a. throw a warning b. DWIM using overloaded operators (as in reduce precision then compare) c. throw a warning but have other comparison operators just for this case to make sure you know what you're doing I'd vote for b., but I don't know enough about the problem domain to know if that is safe, and realistically I just want to write the cookbook entry rather than start a math-geniuses flame war ;-) Which leads to another question: Are there $value.precision() and $value.accuracy() methods available for decimals? I'd really rather not do the string comparison if it can be avoided, maybe it's just the purist in me saying "leave the numbers be" :-) Apologies in advance if this is somewhere I missed. I did a lot of searching. Marcus Adair
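For reference, the relative-epsilon test quoted above looks like this in Python. `approx_equal` and its default epsilon are assumptions of this sketch; `math.isclose` is the standard-library alternative.

```python
import math


def approx_equal(a, b, epsilon=1e-9):
    """Relative comparison in the form quoted from the errata above."""
    return abs(a - b) < abs(a * epsilon)


# The classic case where == disappoints but a tolerance does not.
total = 0.1 + 0.2
exact_match = (total == 0.3)
close_match = approx_equal(total, 0.3)

# Edge case: the quoted form is strict and scaled by the first operand,
# so it reports two zeros as unequal.
zeros_match = approx_equal(0.0, 0.0)

# math.isclose is symmetric and treats equal values as close.
stdlib_match = math.isclose(total, 0.3)
```

The zero case is one reason a silent DWIM on `==` (option b) is risky: any built-in tolerance needs a policy for values near zero.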
Re: nbsp in \s, and <>
I thought we had just established that nbsp is not in Unicode's definition of whitespace. So why should \s match it? On 2005-04-15 18:56, "Larry Wall" <[EMAIL PROTECTED]> wrote: > On Sat, Apr 16, 2005 at 12:46:47AM +0200, Juerd wrote: > : Larry Wall skribis 2005-04-15 15:38 (-0700): > : > : Do \s and match non-breaking whitespace, U+00A0? > : > Yes. > : > : That makes \s+ and \s*, and thus very useless for anything but > : trimming whitespace. For splitting (including word wrapping), it'd do > : exactly the wrong thing. > > Maybe we just need a for breaking white space, or some such. > is primarily used in pattern matching with :w, where a > non-breaking space in the input would presumably be matched by a > non-breaking space in the pattern, or maybe an explicit . > As long as patterns (with or without :w) treat non-breaking spaces > as ordinary matching characters, it should work out, methinks. > Though it's probably a hair more readable to use an explicit ... > > Larry
Re: Heredocs: How equal are bunches of spaces to tabs?
On Sat, Apr 16, 2005 at 12:11:24AM +0200, Juerd wrote: : Pasted from pugs/examples/cookbook/01-00introduction.p6: : : # XXX - question: How equal are bunches of spaces to tabs? : # -- I'd say that's a question for perl6lang This seems to be singularly short on context, but if it has to do with trimming leading whitespace from heredocs, A2 already discusses this. Larry
Re: nbsp in \s, and <>
On Sat, Apr 16, 2005 at 12:46:47AM +0200, Juerd wrote: : Larry Wall skribis 2005-04-15 15:38 (-0700): : > : Do \s and match non-breaking whitespace, U+00A0? : > Yes. : : That makes \s+ and \s*, and thus very useless for anything but : trimming whitespace. For splitting (including word wrapping), it'd do : exactly the wrong thing. Maybe we just need a for breaking white space, or some such. is primarily used in pattern matching with :w, where a non-breaking space in the input would presumably be matched by a non-breaking space in the pattern, or maybe an explicit . As long as patterns (with or without :w) treat non-breaking spaces as ordinary matching characters, it should work out, methinks. Though it's probably a hair more readable to use an explicit ... Larry
Re: nbsp in \s, and <>
Larry Wall skribis 2005-04-15 15:38 (-0700): > : Do \s and match non-breaking whitespace, U+00A0? > Yes. That makes \s+ and \s*, and thus very useless for anything but trimming whitespace. For splitting (including word wrapping), it'd do exactly the wrong thing. > : \s is said (in S05) to match any unicode whitespace, but letting it > : match NBSP and then using \s for splitting things is wrong, I think. > Perhaps the default word split should not be based on \s then. It'd have to. Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: nbsp in \s, and <>
On Fri, Apr 15, 2005 at 11:44:03PM +0200, Juerd wrote: : Is there a -like thingy that is always \s+? Not currently, since \s+ is there. used to be that, but currently is defined as the magical whitespace matcher used by :words. : Do \s and match non-breaking whitespace, U+00A0? Yes. : How about: : : U+0008 backspace : U+00A0 no break space (Repeated for overview) : U+1361 ethiopic wordspace : U+2000 en quad : U+2001 em quad : U+2002 en space : U+2003 em space : U+2004 three per em space : U+2005 four per em space : U+2006 six per em space : U+2007 figure space : U+2008 punctuation space : U+2009 thin space : U+200A hair space : U+200B zero width space : U+202F narrow no break space : U+205F medium mathematic space : U+2060 word joiner (What is that, anyway?) : U+3000 ideographic space : U+FEFF zero width non-breaking space Yes, any Unicode whitespace, but you seem to have a different list than I do. Outside of the standard ASCIIish control-character whitespace, I count only the \pZ characters, not the \pC characters, so I don't have to tell you what a word-joiner is, since it's a \p[Cf] character. :-) I will also gleefully ignore the existence of BOMs. 
So I make it:

0020;SPACE;Zs;0;WS;N;
00A0;NO-BREAK SPACE;Zs;0;CS; 0020N;NON-BREAKING SPACE
1680;OGHAM SPACE MARK;Zs;0;WS;N;
180E;MONGOLIAN VOWEL SEPARATOR;Zs;0;WS;N;
2000;EN QUAD;Zs;0;WS;2002N;
2001;EM QUAD;Zs;0;WS;2003N;
2002;EN SPACE;Zs;0;WS; 0020N;
2003;EM SPACE;Zs;0;WS; 0020N;
2004;THREE-PER-EM SPACE;Zs;0;WS; 0020N;
2005;FOUR-PER-EM SPACE;Zs;0;WS; 0020N;
2006;SIX-PER-EM SPACE;Zs;0;WS; 0020N;
2007;FIGURE SPACE;Zs;0;WS; 0020N;
2008;PUNCTUATION SPACE;Zs;0;WS; 0020N;
2009;THIN SPACE;Zs;0;WS; 0020N;
200A;HAIR SPACE;Zs;0;WS; 0020N;
200B;ZERO WIDTH SPACE;Zs;0;BN;N;
2028;LINE SEPARATOR;Zl;0;WS;N;
2029;PARAGRAPH SEPARATOR;Zp;0;B;N;
202F;NARROW NO-BREAK SPACE;Zs;0;WS; 0020N;
205F;MEDIUM MATHEMATICAL SPACE;Zs;0;WS; 0020N;
3000;IDEOGRAPHIC SPACE;Zs;0;WS; 0020N;

: \s is said (in S05) to match any unicode whitespace, but letting it : match NBSP and then using \s for splitting things is wrong, I think. Perhaps the default word split should not be based on \s then. It's just one more difference, in addition to trimming leading and trailing whitespace like awk. : Are the contents of <> split using ? (Is <<$foo>>, where $foo is : "foo\xA0bar", one or two elements?) That is using the default word splitter (or it *is* the default word splitter), so if the default word split is based on <+[\s]-[\xA0]> it would be one element. Of course, the ZERO WIDTH SPACE is a nasty critter for anyone using whitespace to separate tokens. That and maybe thin spaces probably merit warnings in Perl code where they might cause visual ambiguity. Larry
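A word splitter that treats NBSP as an ordinary character, in the spirit of `<+[\s]-[\xA0]>`, can be sketched in Python as follows. The names `BREAKING_SPACE` and `words` are invented, and Python's `\s` stands in for whatever set Perl 6 finally adopts.

```python
import re
import unicodedata

NBSP = "\u00a0"

# Whitespace that is not a no-break space; a stand-in for <+[\s]-[\xA0]>.
BREAKING_SPACE = re.compile(r"[^\S\u00a0]+")


def words(text):
    """Trim, then split on runs of breaking whitespace."""
    return [w for w in BREAKING_SPACE.split(text) if w]


# NBSP is a space separator (Zs); the word joiner is a format character (Cf).
nbsp_category = unicodedata.category(NBSP)
word_joiner_category = unicodedata.category("\u2060")
```

With this splitter `"foo\u00a0bar baz"` gives two words, whereas Python's plain `str.split()` breaks at the NBSP and gives three.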
Re: nbsp in \s, and <>
Aaron Sherman skribis 2005-04-15 18:20 (-0400): > > Is there a -like thingy that is always \s+? > Not sure what that means exactly. is \s* or \s+, depending on its surroundings. > Thankfully, NBSP (U+00A0) is not Unicode whitespace. Thanks for sharing this information! Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: $*CWD instead of chdir() and cwd()
chromatic skribis 2005-04-15 15:18 (-0700): > > Well, after failure it can be cwd() but false without breaking any real > > code, because normally, you'd never if (cwd) { ... }, simply because > > there's ALWAYS a cwd. > Not always -- try removing a directory that's the pwd of another > process. Results in EPERM indeed :( Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: nbsp in \s, and <>
On Fri, 2005-04-15 at 17:44, Juerd wrote: > Is there a -like thingy that is always \s+? Not sure what that means exactly. > Do \s and match non-breaking whitespace, U+00A0? As I understood, Perl 6 was going to use the Unicode standard(s) to determine the whitespacishness of each codepoint. Going to Google, I find: http://www.fileformat.info/info/unicode/category/Zs/list.htm which lists all of the "separator, space" characters.

> How about:
>
> U+0008 backspace
Character.isWhitespace() No
> U+00A0 no break space (Repeated for overview)
Character.isWhitespace() No
> U+1361 ethiopic wordspace
Character.isWhitespace() No
> U+2000 en quad
Character.isWhitespace() Yes
> U+2001 em quad
Character.isWhitespace() Yes
> U+2002 en space
Character.isWhitespace() Yes
> U+2003 em space
Character.isWhitespace() Yes
> U+2004 three per em space
Character.isWhitespace() Yes
> U+2005 four per em space
Character.isWhitespace() Yes
> U+2006 six per em space
Character.isWhitespace() Yes
> U+2007 figure space
Character.isWhitespace() No
> U+2008 punctuation space
Character.isWhitespace() Yes
> U+2009 thin space
Character.isWhitespace() Yes
> U+200A hair space
Character.isWhitespace() Yes
> U+200B zero width space
Character.isWhitespace() Yes
> U+202F narrow no break space
Character.isWhitespace() No
> U+205F medium mathematic space
Character.isWhitespace() Yes
> U+2060 word joiner (What is that, anyway?)
Character.isWhitespace() No
Comments: WJ a zero width non-breaking space (only) intended for disambiguation of functions for byte order mark
> U+3000 ideographic space
Character.isWhitespace() Yes
> U+FEFF zero width non-breaking space
Character.isWhitespace() No

> \s is said (in S05) to match any unicode whitespace, but letting it > match NBSP and then using \s for splitting things is wrong, I think. Thankfully, NBSP (U+00A0) is not Unicode whitespace.
-- Aaron Sherman <[EMAIL PROTECTED]> Senior Systems Engineer and Toolsmith "It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: $*CWD instead of chdir() and cwd()
On Fri, 2005-04-15 at 23:52 +0200, Juerd wrote: > Well, after failure it can be cwd() but false without breaking any real > code, because normally, you'd never if (cwd) { ... }, simply because > there's ALWAYS a cwd. Not always -- try removing a directory that's the pwd of another process. -- c
Heredocs: How equal are bunches of spaces to tabs?
Pasted from pugs/examples/cookbook/01-00introduction.p6: # XXX - question: How equal are bunches of spaces to tabs? # -- I'd say that's a question for perl6lang Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: $*CWD instead of chdir() and cwd()
On Fri, Apr 15, 2005 at 01:12:46PM -0700, Michael G Schwern wrote: : Thus spake Larry Wall: : > Offhand, I guess my main semantic problem with it is that if a chdir : > fails, you aren't in an undefined location, which the new value of $CWD : > would seem to indicate. You're just where you were. Then the user : > either has to remember that, or there still has to be some other : > means of finding out the real location. : : To be clear: Only the store operation will return undef on failure. That doesn't square with the notion that an assignment returns the actual lvalue: ($new = $old) =~ s/foo/bar/; : Additional fetches on $CWD will continue to return the cwd. : : $CWD = '/path/which/exists'; : $CWD = '/i/do/not/exist' err warn $!; : print $CWD; : : This prints /path/which/exists/. Except that the err should be looking at $CWD, not some other return value of the assignment. : > The other problem with it is the fact that people will assign relative : > paths to it and expect to get the relative path back out instead : > of the absolute path. : : I honestly never had this problem until I sat down and thought about it. :) : THEN I got all confused and started to do things like $CWD .= '/subdir'; : instead of simply $CWD = 'subdir';. But the rule is simple and natural. : It takes a relative or absolute directory and ALWAYS returns an absolute : path. Lax in what inputs it accepts, strict in what it emits. This is no : more to remember than what chdir() and cwd() would do. : : The result from $CWD would simply be a Dir object similar to Ken Williams' : Path::Class or Ruby's Dir object. One of the methods would be .relative. : : I didn't bring up @CWD because I thought it would be too much in one sitting. : Basically it allows you to do this: : : pop @CWD; # chdir ('..'); : push @CWD, 'dir'; # chdir ('dir'); : print $CWD[0]; # (File::Spec->splitdir(abs_path()))[0]; : # ie. What top level directory am I in? 
: : and all sorts of other operations that would normally involve a lot of : splitdir'ing. : : And then there's %CWD which I'm toying with being a per-volume chdir like : you can do on Windows but that may be too much of a questionable thing. You could multiplex both the array and hash roles into the object returned by $CWD, much like the $/ pattern match result object can be subscripted as either $/[1] or $/. $CWD would itself behave like a string in string context, but $CWD[] would get you to the array value, and $CWD{} the hash value for systems that have more than one current directory. : > Your assumption there is a bit inaccurate--in P6 you are allowed to : > temporize (localize) the effects of functions and methods that are : > prepared to deal with it. : : Yeah, we were talking about it on #perl6 a bit. That seems to me the more : bizarre idea than assigning to something which can fail. Localizing an : assignment is easy, there's just one thing to revert. But function calls can : do lots of things. Just how much does it reverse? I guess if its used : sensibly on sharp functions, such as chdir, and the behavior is : user-definable it can work but I don't know if the behavior will ever : be obvious for anything beyond the trivial. The function reverses whatever its TEMP property's closure knows how to reverse. It's up to the function to know what its side effects are and arrange to undo them. : FWIW my prompting to write File::chdir was a desire was for "local chdir". : So if "temp chdir" can be made to work that would solve most of the problem. : : If nothing else perhaps chdir() should be eliminated and cwd() simply takes : an argument to make it a getter/setter. If you're going to throw away the verb then the noun might as well be a variable. But I like verbs for their readability, even if the verb is "push". Note that "push" could be made to work with "temp" as well: temp push $CWD, "subdir" err fail "..." 
This would automatically pop $CWD at the end of the dynamic scope. : > However, I agree that it's nice to have an : > easily interpolatable value. So I think I'd rather see $CWD always : > return the current absolute path even after failure : : The problem there is it leaves $CWD without an error mechanism and thus : becomes an unverifiable operation. You have to use chdir() if you want to : error check and $CWD is reduced to a "scripting" feature. That was my point. And if you look back at what you wrote, you just called $CWD an "operation". It's not--it's a noun. I like nouns, but I also like verbs, and unlike in Perl 5 we don't have to rely on the magical side effects of certain mystical nouns to do localization any more. But I don't understand what you mean by a "scripting" feature, or how getting reduced to one is antithetical to a blissful existence. : It could throw an exception but then you have to wrap everything in a try : block. U
Re: $*CWD instead of chdir() and cwd()
Michael G Schwern skribis 2005-04-15 13:12 (-0700): > To be clear: Only the store operation will return undef on failure. > Additional fetches on $CWD will continue to return the cwd. Still breaks $ref = \($CWD = $foo); I'm not sure this breakage matters, but if it breaks one thing, it's likely to break more than just that one thing, and I wonder how much attention this has been given. Hm, but $CWD++ is nice! Especially if after photos9 it goes to photos10, and not photot0. How does string ++ work in Perl 6, anyway? > The problem there is it leaves $CWD without an error mechanism and thus > becomes an unverifiable operation. You have to use chdir() if you want to > error check and $CWD is reduced to a "scripting" feature. Well, after failure it can be cwd() but false without breaking any real code, because normally, you'd never if (cwd) { ... }, simply because there's ALWAYS a cwd. If this is done, the thing returned by the STORE can still be an lvalue and thus be properly reffed. This would mean you'd use or instead of err, but I don't understand the point of err meaning "error" together with the introduction of true-but-false values anyway. Low-prec // should imo just be spelled dor. But it's too late for that, of course. Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
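The photos9 question comes down to two different increment rules. Below is a Python sketch of both: `magic_increment` imitates Perl 5's documented string auto-increment for strings of letters followed by digits, and `suffix_increment` is an invented alternative that only bumps a trailing number. Neither is a statement of what Perl 6 will do.

```python
import re


def magic_increment(s):
    """Perl 5 style ++ for strings matching /^[a-zA-Z]*[0-9]*$/."""
    if not s or not re.fullmatch(r"[a-zA-Z]*[0-9]*", s):
        raise ValueError("not a magic-increment string")
    chars = list(s)
    i = len(chars) - 1
    while i >= 0:
        c = chars[i]
        if c == "9":
            chars[i] = "0"      # wrap and carry left
        elif c == "z":
            chars[i] = "a"
        elif c == "Z":
            chars[i] = "A"
        else:
            chars[i] = chr(ord(c) + 1)
            return "".join(chars)
        i -= 1
    # Carried off the left edge: grow the string by one character.
    first = chars[0]
    prefix = "1" if first == "0" else first
    return prefix + "".join(chars)


def suffix_increment(s):
    """Increment only the trailing run of digits, keeping its width."""
    m = re.fullmatch(r"(.*?)([0-9]+)", s)
    if not m:
        raise ValueError("no numeric suffix")
    head, digits = m.groups()
    return head + str(int(digits) + 1).zfill(len(digits))
```

The first rule carries from the digit into the letters, which is where photot0 comes from; the second gives photos10.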
nbsp in \s, and <>
Is there a -like thingy that is always \s+? Do \s and match non-breaking whitespace, U+00A0? How about: U+0008 backspace U+00A0 no break space (Repeated for overview) U+1361 ethiopic wordspace U+2000 en quad U+2001 em quad U+2002 en space U+2003 em space U+2004 three per em space U+2005 four per em space U+2006 six per em space U+2007 figure space U+2008 punctuation space U+2009 thin space U+200A hair space U+200B zero width space U+202F narrow no break space U+205F medium mathematic space U+2060 word joiner (What is that, anyway?) U+3000 ideographic space U+FEFF zero width non-breaking space \s is said (in S05) to match any unicode whitespace, but letting it match NBSP and then using \s for splitting things is wrong, I think. Are the contents of <> split using ? (Is <<$foo>>, where $foo is "foo\xA0bar", one or two elements?) Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: Statement modifier scope
Paul Seamons skribis 2005-04-15 13:42 (-0600): > Each of the declarations my, our and local currently set the value to > undefined (unless set = to something). That's not true. use strict; $::foo = 5; our $foo; print $foo; # 5 Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: $*CWD instead of chdir() and cwd()
Thus spake Larry Wall:
> Offhand, I guess my main semantic problem with it is that if a chdir
> fails, you aren't in an undefined location, which the new value of $CWD
> would seem to indicate. You're just where you were. Then the user
> either has to remember that, or there still has to be some other
> means of finding out the real location.

To be clear: Only the store operation will return undef on failure. Additional fetches on $CWD will continue to return the cwd.

    $CWD = '/path/which/exists';
    $CWD = '/i/do/not/exist' err warn $!;
    print $CWD;

This prints /path/which/exists/.

> The other problem with it is the fact that people will assign relative
> paths to it and expect to get the relative path back out instead
> of the absolute path.

I honestly never had this problem until I sat down and thought about it. :) THEN I got all confused and started to do things like $CWD .= '/subdir'; instead of simply $CWD = 'subdir';.

But the rule is simple and natural. It takes a relative or absolute directory and ALWAYS returns an absolute path. Lax in what inputs it accepts, strict in what it emits. This is no more to remember than what chdir() and cwd() would do.

The result from $CWD would simply be a Dir object similar to Ken Williams' Path::Class or Ruby's Dir object. One of the methods would be .relative.

I didn't bring up @CWD because I thought it would be too much in one sitting. Basically it allows you to do this:

    pop @CWD;           # chdir ('..');
    push @CWD, 'dir';   # chdir ('dir');
    print $CWD[0];      # (File::Spec->splitdir(abs_path()))[0];
                        # ie. What top level directory am I in?

and all sorts of other operations that would normally involve a lot of splitdir'ing.

And then there's %CWD which I'm toying with being a per-volume chdir like you can do on Windows but that may be too much of a questionable thing.

> Your assumption there is a bit inaccurate--in P6 you are allowed to
> temporize (localize) the effects of functions and methods that are
> prepared to deal with it.

Yeah, we were talking about it on #perl6 a bit. That seems to me a more bizarre idea than assigning to something which can fail. Localizing an assignment is easy, there's just one thing to revert. But function calls can do lots of things. Just how much does it reverse? I guess if it's used sensibly on sharp functions, such as chdir, and the behavior is user-definable it can work but I don't know if the behavior will ever be obvious for anything beyond the trivial.

FWIW my prompting to write File::chdir was a desire for "local chdir". So if "temp chdir" can be made to work that would solve most of the problem.

If nothing else perhaps chdir() should be eliminated and cwd() simply takes an argument to make it a getter/setter.

> However, I agree that it's nice to have an
> easily interpolatable value. So I think I'd rather see $CWD always
> return the current absolute path even after failure

The problem there is it leaves $CWD without an error mechanism and thus becomes an unverifiable operation. You have to use chdir() if you want to error check and $CWD is reduced to a "scripting" feature.

It could throw an exception but then you have to wrap everything in a try block. Unless Perl 6 is going this route for I/O errors in general I'd rather not.

I'll give the error mechanism some more thought.

Anyhow, I encourage folks to play with File::chdir and see what they think of the idea. I'm fixing up the Windows nits in the tests now.
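[Archive note: the "local chdir" behaviour asked for above can be sketched outside Perl. The Python below is only an analogy under stated assumptions, not File::chdir's implementation or anything Perl 6 specifies; the name local_chdir is hypothetical. It accepts a relative or absolute path, always reports an absolute one, and leaves the process where it was if the change fails.]

```python
import os
from contextlib import contextmanager


@contextmanager
def local_chdir(path):
    """Change directory for the duration of a block, then change back.

    Hypothetical helper for illustration only.
    """
    previous = os.getcwd()      # absolute path of where we started
    os.chdir(path)              # raises OSError on failure; cwd is unchanged
    try:
        yield os.getcwd()       # always absolute, whatever was passed in
    finally:
        os.chdir(previous)      # the single thing there is to revert


if __name__ == "__main__":
    start = os.getcwd()
    with local_chdir(os.pardir) as here:
        print("inside:", here)
    print("restored:", os.getcwd() == start)
```

A failed change raises before the block is entered, so the "you're just where you were" property Larry describes holds without any extra bookkeeping.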
Re: [pugs] regexp "bug"?
On Fri, Apr 15, 2005 at 09:34:58AM -0700, Larry Wall wrote: > It doesn't have to be the default, though. But there has to be > some way of allowing illegal characters to be talked about, or > you can't write programs that talk about them. It's like saying Thoughtcrime acceptable. Doubleplusgood. Nicholas Clark
Re: [RFC] some doubtable MMDs?
On Fri, Apr 15, 2005 at 02:38:36PM +0200, Leopold Toetsch wrote:
: I'm not quite sure, but it seems that some of the MMD functions may
: better be vtable methods:
:
: - bitwise_sh[rl]*    shift by anything other than int?
: - bitwise_lsr        is missing generally
:
: or even just a plain opcode only:
:
: - logical_{or,and,xor} return a PMC depending on the boolean value
:
: What are HLLs expecting of these infix operations?

Perl 6 tends to distinguish these as different operators, though Perl 5 did overload the bitwise ops on both strings and numbers, which newbies found confusing in ambiguous cases, which is why we changed it.

: OTOH it might be useful that the current get__keyed operations
: (postcircumfix:[]) become MMD subroutines:
:
:   Px = Py[Pz]        Pz = String, Int, Key, Slice, ...

At the moment, the Perl 6 optimizer is explicitly allowed to optimize array indices with the assumption that the subscript is a scalar (or slice) of integer, or something that converts to integer. I'd be interested to know if that policy will actually buy us any performance. If it always has to go through MMD anyway, maybe it doesn't. But array indexing code tends to be pretty hot, so if we can keep it somewhat optimizable and/or jittable, that'd be nice.

Larry
Re: Statement modifier scope
> I'm imagining it will be different, as I expect temp to not hide the old
> thing. I'm not sure it will.

That is another good question. I just searched through the S and A's and couldn't find if temp will blank it out. I am thinking it will act like local.

Each of the declarations my, our and local currently set the value to undefined (unless set = to something). I imagine that temp and let will behave the same. In which case "local %h;" and "let %h" would allocate a new, empty variable in addition to the original variable (which is hidden but still retains its contents).

Paul
Various questions
I've been working on a C-to-Parrot compiler (actually an IMC backend for the LCC compiler), tentatively named Carrot, over the past week. It can currently do some reasonably useful things, like running the Cola compiler (with only a very small amount of cheating), but it has raised a few queries:

* I can usually handle unsigned numbers by pretending they're signed and using 'I' registers, but some things appear to be awkward without new ops - in particular, div and cmod, and le/lt/ge/gt comparisons. (As far as I can tell, those are the only ones C would need; everything else should work fine with the signed variants). I've added divu/leu/etc ops to math.ops/cmp.ops (and just made them cast their operands into UINTVALs) - is that a reasonable thing to do? Would they be better in a new .ops file?

* Should there be an 'isatty' op/method? (or is there something else that "isatty(fileno(file))" (which Cola's lexer uses) should do, in order to return a reasonable answer?)

* Is it possible to merge PBC files together, like load_bytecode but at compile-time? The compiler converts .c to .pbc (via .imc), then the linker just creates a program full of load_bytecode, so the actual linking gets done at run-time, which isn't very nice when you try moving/deleting one of the .pbcs. (And lcc always deletes the .pbcs, since it assumes they're temporary files.)

* How efficient are PMC method calls? (And are performance concerns documented anywhere, like "op calls are roughly n times faster than methods", so compiler-writers could avoid implementing things in stupid ways, or is it too early to be doing that?) I've been using [gs]et_integer_keyed_int on a PMC to allow pointer access. Since it reads whole ints, it probably crashes unnecessarily when e.g. reading chars at unlucky addresses - but IMC code like "val = mem.read_i1(ptr)" feels unpleasantly inefficient, particularly in string-processing loops.

Hmm... Should I just accept that C-on-Parrot will always be relatively slow, since its concept of memory is slightly incompatible with Parrot's, and anybody who wants speed can use a native C compiler, so I can stop worrying about it? :-)

Thanks,
--
Philip Taylor
[EMAIL PROTECTED]
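[Archive note: the "cast the operands to unsigned" approach described in the first bullet can be sketched as follows. This is a Python illustration only, not the Parrot ops themselves; the 32-bit register width and the helper names to_unsigned/to_signed/ltu are assumptions made for the example, and only divu mirrors a name used in the message.]

```python
WIDTH = 32                      # assumed integer register width
MASK = (1 << WIDTH) - 1
SIGN_BIT = 1 << (WIDTH - 1)


def to_unsigned(x):
    """Reinterpret a signed two's-complement value as unsigned."""
    return x & MASK


def to_signed(x):
    """Reinterpret an unsigned bit pattern as a signed value."""
    x &= MASK
    return x - (1 << WIDTH) if x & SIGN_BIT else x


def divu(a, b):
    """Unsigned division on values stored in signed registers."""
    return to_signed(to_unsigned(a) // to_unsigned(b))


def ltu(a, b):
    """Unsigned less-than on values stored in signed registers."""
    return to_unsigned(a) < to_unsigned(b)


if __name__ == "__main__":
    print(divu(-2, 2))   # the bit pattern 0xFFFFFFFE divided by 2
    print(ltu(1, -1))    # -1 is the largest unsigned value
```

Addition, subtraction and multiplication need no such variants because their low bits are the same for signed and unsigned operands, which matches the observation that only division, modulus and ordered comparisons are awkward.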
Re: Statement modifier scope
I would like to get rid of all those implicit scopes. The only exception would be that any topicalizing modifier allocates a private lexical $_ scoped to just that statement. But dynamic scoping may happen only at explicit block boundaries. I can see the argument for the other side, where any "deferred" code is treated as a kind of closure regardless of whether there are explicit curlies around it. That would solve certain problems like defining the scopes of the lexicals in $a = $x ?? my $y :: my $z; or the infamous my $x = 1 if $y; to extend only to the subexpressions in which they find themselves. But it's not what naive users expect, and it's hard to explain, so I think we should stick with explicit curlies for most of our scoping needs, even if it means letting certain variables hang around undefined because their initialization was never executed. Larry
Re: Statement modifier scope
Paul Seamons skribis 2005-04-15 12:41 (-0600):
> In Perl5
> perl -MData::Dumper -e '%h=qw(a 1 b 2); {local %h; $h{a}="one"; print Dumper \%h} print Dumper \%h;'
> $VAR1 = {
>   'a' => 'one'
> };
> $VAR1 = {
>   'a' => '1',
>   'b' => '2'
> };
> I'm imagining the behavior would be the same with Perl6. Notice that 'b' is

I'm imagining it will be different, as I expect temp to not hide the old thing. I'm not sure it will.

Juerd
--
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html
http://convolution.nl/gajigu_juerd_n.html
Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?
On Fri, Apr 15, 2005 at 11:28:31AM -0500, Rod Adams wrote: : David Wheeler wrote: : : >But the first person to write <[a...]> gets what's comin' to 'em. : : Is that nothing (since '.' lt 'a'), or everything after 'a'? Might as well make it everything after 'a' for consistency. One could also view the last dot as a special version of the ordinary "any" dot, and read it "a to whatever". Larry
Re: [pugs] regexp "bug"?
On Fri, Apr 15, 2005 at 05:12:54PM +, [EMAIL PROTECTED] wrote:
: Isn't that what the difference between byte-level and codepoint-level
: access to strings is all about. If you want to work with values that
: are illegal codepoints then you should be working at the byte-level
: not the codepoint-level, at least by default.

Sure, but there's no guarantee you have access to a lower level, depending on the interface presented by the object in question, and you probably shouldn't have to know that anyway, if there's a useful abstraction level at which "illegal character" means something as a unit to the higher level. The fact is that U+FFFF is an illegal character regardless of the encoding, and I'd like to be able to talk about it as a character, without having to know whether it's an illegal UTF-8 byte sequence, or an illegal UTF-16 byte sequence, or a 256-bit integer stored somewhere that you just aren't allowed to think about certain values of.

In short, "legal" Unicode strings should probably be viewed as a constrained subtype of strings, not as a storage type. I know you've known Ada from its infancy. :-)

Perl 6 makes the same distinction, and can presumably get at the unconstrained type for any constrained type. So if you hand me a Unicode string with arbitrary value restrictions, there had better be a way to view that string without the arbitrary restrictions. You need to be able to determine somehow that types Even or Odd have a storage class of type Int.

Larry
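[Archive note: the "constrained subtype, not storage type" idea can be sketched in Python, where a plain str may hold U+FFFF and a subclass adds the legality constraint. This is an illustration only, not a Perl 6 design; LegalStr and is_noncharacter are hypothetical names. The check follows the Unicode definition of noncharacters (U+FDD0..U+FDEF plus the last two code points of every plane), so it does not flag U+FEFF, which is an assigned character.]

```python
def is_noncharacter(cp):
    """True for the Unicode noncharacters: U+FDD0..U+FDEF, and any code
    point whose low 16 bits are FFFE or FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


class LegalStr(str):
    """A str constrained to contain no noncharacters (hypothetical type)."""

    def __new__(cls, value):
        bad = [c for c in value if is_noncharacter(ord(c))]
        if bad:
            raise ValueError("noncharacter U+%04X in string" % ord(bad[0]))
        return super().__new__(cls, value)


if __name__ == "__main__":
    raw = "a" + chr(0xFFFF)          # the unconstrained type can talk about it
    print(len(raw), is_noncharacter(ord(raw[1])))
    try:
        LegalStr(raw)                # the constrained subtype refuses it
    except ValueError as err:
        print("rejected:", err)
    print(str(LegalStr("abc")))      # view a legal string as the base type
```

Converting back with str() plays the role of "getting at the unconstrained type": the storage is the same, only the constraint is dropped.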
Re: [perl #34984] [PATCH] Fix segfault with const
On Fri, Apr 15, 2005 at 07:26:56PM +0100, Nick Glencross wrote: > +// Forbid assigning a string to anything other than a string const > +// for now In future, please don't use C99 comments. (apart from that, I don't have the knowledge to comment on this patch) Nicholas Clark
Re: Statement modifier scope
On Friday 15 April 2005 12:28 pm, Juerd wrote:
> temp %h{ %other.keys } = %other.values;

Oops missed that - I like that for solving this particular problem. It does even work in Perl5:

perl -MData::Dumper -e '%h=qw(a 1 b 2); {local @h{qw(a b)}=("one","two"); print Dumper \%h} print Dumper \%h'
$VAR1 = {
  'a' => 'one',
  'b' => 'two'
};
$VAR1 = {
  'a' => '1',
  'b' => '2'
};

I had never thought to do a hash slice in a local. That is great!!! Thank you very much! Wish I'd known about that three years ago.

But, it still doesn't answer the original question about scoping in the looping statement modifiers.

Paul
Re: Truely temporary variables
On Fri, 2005-04-15 at 13:10, Luke Palmer wrote:
> Aaron Sherman writes:
> > Among the various ways of declaring variables, will Perl 6 have a way to
> > say, "this variable is highly temporary, and may be re-declared within
> > the same scope, or in a nested scope without concern"? I often find
> > myself doing:
> >
> > my $sql = q{...};
> > ...do some DB stuff...
> > my $sql = q{...};
> > ...do more DB stuff...
>
> There's a pretty common idiom for this:
>
> {
>     my $sql = q{...};
>     # ... do some DB stuff ...
> }
> {
>     my $sql = q{...};
>     # ... do more DB stuff ...
> }
>
> You see it in test suites all over the CPANdom.

You see it all over my code too... it is always possible to simulate many kinds of trickery that way. For example, if you want to write a loop with a counter that is visible one statement after the loop completes, you can say:

    {
        my int $i;
        loop $i=0;...;$i++ { ... }
        do_stuff($i);
    }

But isn't:

    loop my int $i=0;...;$i++ { ...; LAST{do_stuff($i)} }

much cleaner? I think so, if for no other reason than it explicitly says what it means. That's one of the reasons that LAST is so handy.

So too, my mythical declarator would prevent a few steps that are otherwise quite easy, but cumbersome in the large.

Whatever, though. It was a simple suggestion, and seems to have sparked FAR more controversy than the small win warrants.

--
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: Statement modifier scope
> temp %h;
> %h{ %other.keys } = %other.values;
>
> or even
>
> temp %h{ %other.keys } = %other.values;
>
> should work well already?

Almost - but not quite. In Perl5

perl -MData::Dumper -e '%h=qw(a 1 b 2); {local %h; $h{a}="one"; print Dumper \%h} print Dumper \%h;'
$VAR1 = {
  'a' => 'one'
};
$VAR1 = {
  'a' => '1',
  'b' => '2'
};

I'm imagining the behavior would be the same with Perl6. Notice that 'b' is gone in the first print. I only want to temporarily modify "some" values (the ones from the %other hash). I don't want the contents of the %h to be identical to %other - I already have %other.

So in Perl5 this does work:

perl -MData::Dumper -e '%h=qw(a 1 b 2); {local %h=%h; $h{a}="one"; print Dumper \%h} print Dumper \%h;'
$VAR1 = {
  'a' => 'one',
  'b' => '2'
};
$VAR1 = {
  'a' => '1',
  'b' => '2'
};

But this won't work in Perl6 (temp $var = $var doesn't work in Perl6) and again it may be fine for small hashes with only a little data - but for a huge hash (1000+ keys) it is very inefficient.

This is good discussion - but it isn't the real focus of the original message in the thread - the question is about the local (temp) scoping of looping statement modifiers in Perl6. Though, I do appreciate your trying to get my example working as is.

Paul
Re: Statement modifier scope
Paul Seamons skribis 2005-04-15 12:16 (-0600): > For the given example, your code fits perfectly. A more common case I have > had to deal with is more like this: > my %h = > my %other = ; > { > temp %h{$_} = %other{$_} for %other.keys; Either temp %h; %h{$_} = %other{$_} for %other.keys; or temp %h; %h{ %other.keys } = %other.values; or even temp %h{ %other.keys } = %other.values; should work well already? > %h.say; > } I think it's hard to find an example that can't easily be rewritten as something that already works. Gather/take solves most. Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: Truely temporary variables
On Fri, 2005-04-15 at 11:21 -0500, Patrick R. Michaud wrote: > On Fri, Apr 15, 2005 at 09:17:13AM -0700, Larry Wall wrote: > > Maybe we could define an "ok" operator that suppresses only the > > *first* warning produced by its argument(s). Then if you get multiple > > warnings, you at least get some indication that you've overgeneralized, > > even if the "wrong" warning comes out. Or maybe it only suppresses > > the first warning till you get a second warning, and then it prints both. > And after the third warning, it sends you to your room with no supper. Talk about a strict permission system. If that's the case, I want a "I'm the human here, darnit!" option to bypass it. -- c
Re: [perl #34984] [PATCH] Fix segfault with const
Leopold Toetsch via RT wrote:
> I think, we could be a bit more graceful here for I/N mismatch and set
> for the above case the constant val->set to 'N'.

Let me redo that... I've just sent the wrong attachment which had a typo in it ...

[This should really address rare but possible Unicode strings, shouldn't it?]

Nick

Index: imcc/symreg.c
===================================================================
--- imcc/symreg.c (revision 7843)
+++ imcc/symreg.c (working copy)
@@ -307,6 +307,7 @@
     INS(interp, unit, "set_p_pc", "", r, 2, 0, 1);
     return NULL;
 }
+
 /* Makes a new identifier constant with value val */
 SymReg *
 mk_const_ident(Interp *interp,
@@ -314,6 +315,16 @@
 {
     SymReg *r;
 
+    // Forbid assigning a string to anything other than a string const
+    // for now
+    if (t != 'S' && val->set == 'S')
+        IMCC_fataly(interp, E_TypeError,
+                    "bad const initialisation");
+
+    // Cast value to const type
+    if (t == 'N' || t == 'I')
+        val->set = t;
+
     if (global) {
         if (t == 'P') {
             IMCC_fataly(interp, E_SyntaxError,
Re: Truely temporary variables
Brent 'Dax' Royal-Gordon skribis 2005-04-15 11:15 (-0700): > Anything wrong with: Yes, moving things around breaks it, as does removing the first. There is no real dependency on the first $sql and it'd be great if declaration wouldn't add one. temp $sql = q{...}; my $sql = q{...}; temp $sql = q{...}; Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: Statement modifier scope
On Friday 15 April 2005 11:57 am, Juerd wrote:
> Paul Seamons skribis 2005-04-15 11:50 (-0600):
> > my %h = ;
> > {
> >     temp %h{$_} ++ for %h.keys;
>
> Just make that two lines. Is that so bad?
>
> temp %h;
> %h.values »++;
>

For the given example, your code fits perfectly. A more common case I have had to deal with is more like this:

    my %h =
    my %other = ;
    {
        temp %h{$_} = %other{$_} for %other.keys;
        %h.say;
    }

Ideally that example would print

    aone btwo c3

It isn't possible any more to do something like

    {
        temp %h = (%h, %other);
    }

because that second %h is now hidden from scope (I forget which Apocalypse or mail thread I saw it in). Plus for huge hashes it just isn't very efficient.

I'd like to temporarily put the values of one hash into another (without wiping out all of the modified hash's values like "temp %h" would do), run some code, leave scope and have the modified hash go back to normal. In perl5 I've had to implement that programmatically by saving existing values into yet another hash - running the code - putting them back. It works but there are all sorts of issues with defined vs exists.

So yes - your code fits the limited example I gave. But I'd still like the other item to work.

Paul
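[Archive note: the save / run / put-back pattern Paul describes, including the "defined vs exists" wrinkle, can be sketched in Python. This is an analogy only, not how Perl 6's temp works; temp_items is a hypothetical name, and the sample values in the usage example are assumptions inferred from the "aone btwo c3" output above. A private sentinel separates "key was absent" from "key was present with a None value", and only the overridden keys are saved, so the cost scales with the smaller hash.]

```python
from contextlib import contextmanager

_MISSING = object()   # marks keys that did not exist before the override


@contextmanager
def temp_items(target, overrides):
    """Temporarily set target[k] = v for each item in overrides.

    Hypothetical helper for illustration only.
    """
    saved = {k: target.get(k, _MISSING) for k in overrides}
    target.update(overrides)
    try:
        yield target
    finally:
        for key, old in saved.items():
            if old is _MISSING:
                target.pop(key, None)   # key did not exist: remove it again
            else:
                target[key] = old       # key existed: restore its old value


if __name__ == "__main__":
    h = {"a": "1", "b": "2", "c": "3"}       # assumed sample data
    other = {"a": "one", "b": "two"}
    with temp_items(h, other):
        print(h)    # a and b overridden, c untouched
    print(h)        # back to the original values
```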
Re: Truely temporary variables
Aaron Sherman <[EMAIL PROTECTED]> wrote:
> What I'd really like to say is:
>
> throwawaytmpvar $sql = q{...};
> throwawaytmpvar $sql = q{...};

Anything wrong with:

    my $sql = q{...};
    temp $sql = q{...};
    temp $sql = q{...};

(Assuming C<temp> is made to work on lexicals, of course.)

--
Brent 'Dax' Royal-Gordon <[EMAIL PROTECTED]>
Perl and Parrot hacker

"I used to have a life, but I liked mail-reading so much better."
Re: Statement modifier scope
Paul Seamons skribis 2005-04-15 11:50 (-0600): > my %h = ; > { > temp %h{$_} ++ for %h.keys; Just make that two lines. Is that so bad? temp %h; %h.values »++; > %h.say; # values are incremented still > } > %h.say; # values are back to original values Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
Re: New language: Parrot Common Lisp
(If anyone is able to track down aforementioned DOD/GC problems, you'll earn my eternal gratitude.) Can you please provide a code snippet that exhibits the error. Just running the program gives me errors on both Linux/x86 and OS X. Running with GC disabled works fine. On OS X with GC enabled: forge:~/svn/parrot-lisp/trunk$ parrot lisp.pbc Can't find method '__set_string_native' for object 'LispSymbol' On OS X with GC disabled: forge:~/svn/parrot-lisp/trunk$ parrot -G lisp.pbc -> On Linux with GC enabled: anvil:~/svn/parrot-lisp/trunk$ parrot lisp.pbc Can't find method '__set_string_native' for object 'LispSymbol' On Linux with GC disabled: anvil:~/svn/parrot-lisp/trunk$ parrot lisp.pbc -> This is on the Parrot checked out of Subversion this morning (revision 7846). Which OS/build number were you using? -c
Statement modifier scope
The following chunks behave the same in Perl 5.6 as in Perl 5.8. Notice the output of "branching" statement modifiers vs. "looping" statement modifiers.

perl -e '$f=1; {local $f=2; print "$f"} print " - $f\n"'
# prints 2 - 1

perl -e '$f=1; {local $f=2 if 1; print "$f"} print " - $f\n"'
# prints 2 - 1

perl -e '$f=1; {local $f=2 unless 0; print "$f"} print " - $f\n"'
# prints 2 - 1

perl -e '$f=1; {local $f=2 for 1; print "$f"} print " - $f\n"'
# prints 1 - 1

perl -e '$f=1; {local $f=2 until 1; print "$f"} print " - $f\n"'
# prints 1 - 1

perl -e '$f=1; {local $f=2 while !$n++; print "$f"} print " - $f\n"'
# prints 1 - 1

It appears that there is an implicit block around statements with looping statement modifiers. perlsyn does state that the control variables of the "for" statement modifier are locally scoped, but doesn't really mention that the entire statement is as well. I'm not sure if this was in the original design spec or if it flowed out of the implementation details, but either way it seems to represent an inconsistency in the treatment of locality with regards to braces (ok I guess there are several in Perl5).

So the question is, what will it be like for Perl6? It would seem that all of the following should hold true because of scoping being tied to the blocks.

pugs -e 'our $f=1; {temp $f=2; print $f}; say " - $f"'
# should print 2 - 1 (currently prints 2 - 2 - but that is a compiler issue)

pugs -e 'our $f=1; {temp $f=2 if 1; print $f}; say " - $f"'
# should print 2 - 1 (currently dies with parse error)

pugs -e 'our $f=1; {temp $f=2 for 1; print $f}; say " - $f"'
# hopefully prints 2 - 1 (currently dies with parse error)

As a side note - pugs does work with:

pugs -e 'our $f=1; {$f=2 for 1; print $f}; say " - $f"'
# prints 2 - 2 (as it should. It seems that statement modifiers don't currently work with declarations - but that is a compiler issue - not a language issue.)

I have wanted to do this in Perl5 but couldn't but would love to be able to do in Perl6:

    my %h = ;
    {
        temp %h{$_} ++ for %h.keys;
        %h.say; # values are incremented still
    }
    %h.say; # values are back to original values

Paul
Re: [perl #34984] [PATCH] Fix segfault with const
Leopold Toetsch via RT wrote:
> Nick Glencross <[EMAIL PROTECTED]> wrote:
> > This patch fixes a problem which can occur in this example:
> >
> > .sub test
> >     .const float a = 12
> >     print a
> >     print_newline
> > .end
>
> Ah yep.
>
> > +if (t != 'P' && t != val->set)
> > +IMCC_fataly(interp, E_TypeError,
> > +"const types do not match");
>
> I think, we could be a bit more graceful here for I/N mismatch and set
> for the above case the constant val->set to 'N'.

Yes, I was planning to do something a bit more thorough, but fixing the immediate segfault was the first challenge.

I've looked over the code a bit more now, and see that the value is still stored textually at this point, so setting the type as you've said is pretty simple.

It's a shame that strings can be in a number of different formats, and probably quoted, preventing this from working for them too.

Anyhow, here's a new patch for you to review, and perhaps apply...?

Cheers,

Nick

Index: imcc/symreg.c
===================================================================
--- imcc/symreg.c (revision 7843)
+++ imcc/symreg.c (working copy)
@@ -307,6 +307,7 @@
     INS(interp, unit, "set_p_pc", "", r, 2, 0, 1);
     return NULL;
 }
+
 /* Makes a new identifier constant with value val */
 SymReg *
 mk_const_ident(Interp *interp,
@@ -314,6 +315,16 @@
 {
     SymReg *r;
 
+    // Forbid assigning a string to anything other than a string const
+    // for now
+    if (t != 'S' && val->set == 'S')
+        IMCC_fataly(interp, E_TypeError,
+                    "bad const initialisation");
+
+    // Cast value to const type
+    if (t == 'S' || t == 'I')
+        val->set = t;
+
     if (global) {
         if (t == 'P') {
             IMCC_fataly(interp, E_SyntaxError,
Re: [perl #35000] [PATCH] README.win32 & icu 3.2
On Fri, 2005-04-15 at 05:38 -0700, François PERRAD wrote: > small mistake in [perl #34986] : > with ICU 3.2, the library icudata.lib is renamed icudt.lib. Thanks, applied. -- c
Re: [pugs] regexp "bug"?
Isn't that what the difference between byte-level and codepoint-level access to strings is all about? If you want to work with values that are illegal codepoints then you should be working at the byte-level not the codepoint-level, at least by default.

--
Mark Biggar
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

> On Fri, Apr 15, 2005 at 12:56:14AM -0700, Mark A. Biggar wrote:
> : Yes, the value 0xFFFF can be stored as either 3 byte UTF-8 string or a 2
> : byte UCS-2 value, but the Unicode standard specifically says that the
> : values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should
> : never appear in a Unicode string. 0xFFFF is reserved for out-of-band
> : signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are
> : specifically reserved for out-of-band marking a UCS-2 file as being
> : either bigendian or littlendian, but are specifically not considered
> : part of the data. chr() is currently defined to mean convert an int
> : value to a Unicode codepoint. That's why I said that chr(65535) should
> : return an exception, it's an argument error similar to sqrt(-1).
>
> It has to at least be possible to Think Bad Thoughts in Perl.
> It doesn't have to be the default, though. But there has to be
> some way of allowing illegal characters to be talked about, or
> you can't write programs that talk about them. It's like saying
> it's okay to be an executioner as long as you don't kill anyone...
>
> Larry
Re: Truely temporary variables
Aaron Sherman writes: > Among the various ways of declaring variables, will Perl 6 have a way to > say, "this variable is highly temporary, and may be re-declared within > the same scope, or in a nested scope without concern"? I often find > myself doing: > > my $sql = q{...}; > ...do some DB stuff... > my $sql = q{...}; > ...do more DB stuff... There's a pretty common idiom for this: { my $sql = q{...}; # ... do some DB stuff ... } { my $sql = q{...}; # ... do more DB stuff ... } You see it in test suites all over the CPANdom. Luke
Re: Truely temporary variables
Rod Adams skribis 2005-04-15 11:53 (-0500):
> Wouldn't some form of trait make more sense:
>    my $sql = '...' is ok;

Depends. A unary ok operator would let you pinpoint very easily, *without* using parens:

    ok $fh.print($foo);  # no warnings about print (closed fh?)
                         # but warning about undef $foo remains

    $fh.print(ok $foo);  # warn about printing thingies, but not about
                         # undef $foo

    say $foo, $bar, ok $baz, $quux;  # complain about everything, except
                                     # what has to do with $baz

    my $foo;
    ok my $foo = "foo $bar baz";  # warn about $bar, but not the masking
    my $foo = ok "foo $bar baz";  # other way around

Juerd
--
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html
http://convolution.nl/gajigu_juerd_n.html
Re: Truely temporary variables
Larry Wall wrote: On Fri, Apr 15, 2005 at 06:04:32PM +0200, Juerd wrote: : No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and : ugly). Suggestions? Maybe we could define an "ok" operator that suppresses only the *first* warning produced by its argument(s). Then if you get multiple warnings, you at least get some indication that you've overgeneralized, even if the "wrong" warning comes out. Or maybe it only suppresses the first warning till you get a second warning, and then it prints both. Wouldn't some form of trait make more sense: my $sql = '...' is ok; Only trick would be getting "is ok" to bind to the thing in the preceding expression that produces the warning the programmer was expecting. Certainly {my $sql = '...'} is ok; get the point across that warnings are somewhat ignorable for the block, but that starts getting to look a lot like {my $sql = '...'} CATCH {default}; Except that one is run-time, the other compile-time. So one could interpret this thread as a cry for a compile-time exception handler. I see some interesting uses for this in conjunction with C, but I doubt I'm seeing the whole story. -- Rod Adams
Re: [pugs] regexp "bug"?
On Fri, Apr 15, 2005 at 12:56:14AM -0700, Mark A. Biggar wrote:
: Yes, the value 0xFFFF can be stored as either 3 byte UTF-8 string or a 2
: byte UCS-2 value, but the Unicode standard specifically says that the
: values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should
: never appear in a Unicode string. 0xFFFF is reserved for out-of-band
: signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are
: specifically reserved for out-of-band marking a UCS-2 file as being
: either bigendian or littlendian, but are specifically not considered
: part of the data. chr() is currently defined to mean convert an int
: value to a Unicode codepoint. That's why I said that chr(65535) should
: return an exception, it's an argument error similar to sqrt(-1).

It has to at least be possible to Think Bad Thoughts in Perl. It doesn't have to be the default, though. But there has to be some way of allowing illegal characters to be talked about, or you can't write programs that talk about them. It's like saying it's okay to be an executioner as long as you don't kill anyone...

Larry
Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?
David Wheeler wrote: But the first person to write <[a...]> gets what's comin' to 'em. Is that nothing (since '.' lt 'a'), or everything after 'a'? -- Rod Adams
Re: Truely temporary variables
On Fri, Apr 15, 2005 at 09:17:13AM -0700, Larry Wall wrote: > On Fri, Apr 15, 2005 at 06:04:32PM +0200, Juerd wrote: > : No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and > : ugly). Suggestions? > > Maybe we could define an "ok" operator that suppresses only the > *first* warning produced by its argument(s). Then if you get multiple > warnings, you at least get some indication that you've overgeneralized, > even if the "wrong" warning comes out. Or maybe it only suppresses > the first warning till you get a second warning, and then it prints both. And after the third warning, it sends you to your room with no supper. Pm
Re: Parrot bytecode reentrancy
15/04/2005 10:35:56, Leopold Toetsch <[EMAIL PROTECTED]> wrote:

>Nigel Sandever <[EMAIL PROTECTED]> wrote:
>
>> When a sub that closes over a variable
>
>> my $closure = 0;
>> sub do_something {
>>     return $closure++;
>> }
>
>> is called from two threads, do the threads share a single closure or
>> each get their own separate closure?
>
>AFAIK: the closure bytecode is shared,

Great.

>the Closure PMC with the lexical
>pad is distinct.

I think that makes perfect sense. No implicit sharing.

>But that all isn't implemented yet.
>

Understood. I am being premature in thinking about this.

But this is where I come unstuck. What would this mean/do when called from 2 threads?

    my $closure :shared = 0;
    sub do_something {
        return $closure++;
    }

or this:

    our $closure :shared = 0;
    sub do_something {
        return $closure++;
    }

It struck me a while back that there is a contradiction in the idea of a shared, 'my' variable. I want to say lexical, but a var declared with 'our' is in some sense lexical.

Where I am going is that "shared" implies global. Access can be constrained by requiring a lexical declaration using 'our', but 'my' variables should not be able to be marked 'shared'.

One nice thing that falls out of that, is that no 'my' vars would ever be shared, which means they never require semaphore checks. That would mean that a non threaded app running on a multi-threaded build of Parrot, need never incur a penalty of semaphore checks if it always uses 'my'. *I think*?

In effect, all vars declared 'our' would be implicitly shared, (and would require semaphoring), removing the need for a 'shared' attribute.

In P5, lexicals are already quicker than globals, so any additional penalty added to globals because of multithreading will not affect any single-threaded code that is striving for ultimate performance, because they would already be utilising lexicals.

Equally, things like filehandles are inherently process-global in scope and therefore sharable between threads and require semaphore checks.

I only throw this into the thought-pot because there seems to me to be a natural symmetry between the concept of 'global' and the concept of 'shared'. I won't argue the case for this, but I thought that if I mention it, it might also make some sense to others when the time comes for this stuff to be designed and implemented.

>> njs
>
>leo
>

njs
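[Archive note: the two cases discussed above, one closure shared by every thread versus a distinct closure per caller, can be illustrated with Python threads. This is an analogy only and says nothing about how Parrot will implement either case; make_counter and call_from_threads are hypothetical names. The lock stands in for the "semaphore check" that a shared variable would need.]

```python
import threading


def make_counter():
    """Return a closure over a private counter, like do_something above."""
    count = 0
    lock = threading.Lock()     # only needed because the closure may be shared

    def do_something():
        nonlocal count
        with lock:              # the per-access cost of sharing
            value = count
            count += 1
            return value

    return do_something


def call_from_threads(fn, n_threads=4, calls_per_thread=1000):
    """Call fn repeatedly from several threads and collect every result."""
    results = [[] for _ in range(n_threads)]

    def worker(slot):
        for _ in range(calls_per_thread):
            results[slot].append(fn())

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [v for chunk in results for v in chunk]


if __name__ == "__main__":
    shared = make_counter()
    values = call_from_threads(shared)
    print(len(values), len(set(values)))   # every call saw a distinct value
```

Two separate calls to make_counter() give independent counters that never need the lock, which is the point made above about unshared 'my' variables avoiding the semaphore penalty.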
Re: Truely temporary variables
On Fri, Apr 15, 2005 at 06:04:32PM +0200, Juerd wrote: : No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and : ugly). Suggestions? Maybe we could define an "ok" operator that suppresses only the *first* warning produced by its argument(s). Then if you get multiple warnings, you at least get some indication that you've overgeneralized, even if the "wrong" warning comes out. Or maybe it only suppresses the first warning till you get a second warning, and then it prints both. Larry
Re: Truely temporary variables
On Fri, Apr 15, 2005 at 11:45:16AM -0400, Aaron Sherman wrote: : Among the various ways of declaring variables, will Perl 6 have a way to : say, "this variable is highly temporary, and may be re-declared within : the same scope, or in a nested scope without concern"? I often find : myself doing: : : my $sql = q{...}; : ...do some DB stuff... : my $sql = q{...}; : ...do more DB stuff... : : This of course results in re-defining $sql, so I take out the second : "my", but then at some point I remove the first one, and strict chews me : out over not declaring $sql, so I make it "my" again. : : This is a cycle I've repeated with dozens of variations on more : occasions than I care to (could?) count. And at that point, why not just change it to this? my $sql; $sql = q{...}; ...do some DB stuff... $sql = q{...}; ...do more DB stuff... It seems to me that assignment does a pretty good job of clobbering a variable's value without the need to redeclare the container. If you really want to program in a definitional paradigm that requires every new definition to have a declaration, then you ought to be giving different definitions different names, seems like, or putting each of them into its own scope. Or write yourself a macro. Or just turn off the redefinition warning... It doesn't seem to rise to the level of a new keyword for me. Larry
Re: Truly temporary variables
Aaron Sherman skribis 2005-04-15 11:45 (-0400):
> What I'd really like to say is:
>     throwawaytmpvar $sql = q{...};
>     throwawaytmpvar $sql = q{...};

I like the idea and propose "a", aliased "an" for this.

> It should probably be illegal to:
>     throwawaytmpvar $sql = q{...};
>     my $sql = q{...}; # Error: temporary became normal lexical
> or for that matter even give it a new type:
>     throwawaytmpvar int $i = 0;
>     throwawaytmpvar str $i = "oops"; # Error: redefinition of type

Giving it a new type should be valid. That is, I think the variable is more useful if the old one is thrown away and a new one is created. This can perhaps be optimized by re-using the same thing if it has no external references anymore. In fact,

    a Str $foo = $foo;

is a nice way to indicate that from now on, you don't care about its numeric value anymore.

All in all, I think a|an can just be my without warnings and then do what you want.

Hm. Funny idea just occurred to me. What if something in ALLCAPS, or better, just Ucfirst would disable all warnings for just that thing?

    my $foo;
    say $foo;                     # warning about undef $foo
    Say $foo;                     # no warning
    $closed_fh.print(Int($foo));  # just a warning about the closed fh
    my $foo;                      # warning about new $foo masking first
    My $foo;                      # no warning

If you think this looks much like PHP's @, you're right. It's not so bad an idea, actually. The problem with PHP is that everything's a warning and almost nothing actually dies.

No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and ugly). Suggestions?

Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html
http://convolution.nl/gajigu_juerd_n.html
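For comparison, Perl 5 can already silence one category of warning for a single expression with a lexically scoped "no warnings". The following is only a sketch of that existing mechanism, not of the Ucfirst syntax floated in the message; the $SIG{__WARN__} handler is there purely to count the warnings that get through:

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be counted.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

my $foo;                              # deliberately left undefined

my $loud = "value: $foo";             # warns: use of uninitialized value

my $quiet = do {
    no warnings 'uninitialized';      # silences this category inside this block only
    "value: $foo";
};

print scalar(@seen), " warning(s) captured\n";
```

The difference from the proposal is granularity: the pragma names a warning category and a lexical block, whereas the suggestion in the message attaches the suppression to one keyword or call.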
Re: Macros [was: Whither "use English"?]
On Fri, Apr 15, 2005 at 12:45:14PM +1200, Sam Vilain wrote: : Larry Wall wrote: : > Well, only if you stick to a standard dialect. As soon as you start : > defining your own macros, it gets a little trickier. : : Interesting, I hadn't considered that. : : Having a quick browse through some of the discussions about macros, many : of the macros I saw[*] looked something like they could be conceptualised : as referring to the part of the AST where they were defined. : : ie, making the AST more of an Abstract Syntax Graph. And macros like : 'free' (ie, stack frame and scope-less) subs, with only the complication : of variable binding. The ability to have recursive macros would then : relate to this graph-ness. That is one variety of macro. : What are the shortcomings of this view of macros, as 'smart' (symbol : binding) AST shortcuts? The biggest problem with smart things is they're harder for not-so-smart people to understand. : The ability to know exactly what source corresponds to a given point on : the AST, as well as knowing straight after parse time (save for string : eval, of course) what each token in the source stream relates to is one : thing that I'm aiming to have work with Perldoc. I'm hoping this will : assist I18N efforts and other uses like smart editors. Yes, that's an important quality for many kinds of tools, whether documentation, debugging, or refactoring. : By smart editors, I'm talking about something that uses Perl/PPI as its : grammar parsing engine, and it highlights the code based on where each : token in the source stream ended up on the AST. This would work : completely with source that munges grammars (assuming the grammars are : working ;). Then, use cases like performing L10N for display to non- : English speakers would be 'easy'. 
I can think of other side-benefits : to such "regularity" of the language, such as allowing Programatica- : style systems for visually identifying 'proof-carrying code' and : 'testing certificates' (see http://xrl.us/programatica). Glad you think it's 'easy'. Maybe you should 'just do it' for us. :-) : macros that run at compile time, and insert strings back into the : document source seem hackish and scary to these sorts of prospects. We also allow (but discourage) textual substitution macros. They're essentially just lexically scoped source filters, and suffer the same problems as source filters, except for the fact that you can more easily limit the damage to a small patch of code. The problem is that the original patch of text has to be stored in the AST along with the new chunk of AST generated by the reparse, and it's not at all clear how a tool should handle that conflict. It's better to only parse once whenever possible, and just make sure the original text remains attached to the appropriate place in the AST. More basically, it's usually better to cooperate with the parser than to lie to it. : But then, one man's hackish and scary is another man's elegant : simplicity, I guess. : : * - in particular, messages like this: : - http://xrl.us/fr78 : : but this one gives me a hint that there is more to the story... I : don't grok the intent of 'is parsed' : - http://xrl.us/fr8a This is mostly talked about in the relevant Apocalypses, and maybe the Synopses. See dev.perl.org for more. Larry
Truly temporary variables
Among the various ways of declaring variables, will Perl 6 have a way to say, "this variable is highly temporary, and may be re-declared within the same scope, or in a nested scope without concern"? I often find myself doing:

    my $sql = q{...};
    ...do some DB stuff...
    my $sql = q{...};
    ...do more DB stuff...

This of course results in re-defining $sql, so I take out the second "my", but then at some point I remove the first one, and strict chews me out over not declaring $sql, so I make it "my" again.

This is a cycle I've repeated with dozens of variations on more occasions than I care to (could?) count.

What I'd really like to say is:

    throwawaytmpvar $sql = q{...};
    throwawaytmpvar $sql = q{...};

without problems. Of course, "throwawaytmpvar" is a bit long, but you get the idea.

It should probably be illegal to:

    throwawaytmpvar $sql = q{...};
    my $sql = q{...}; # Error: temporary became normal lexical

or for that matter even give it a new type:

    throwawaytmpvar int $i = 0;
    throwawaytmpvar str $i = "oops"; # Error: redefinition of type

There might be other assumptions that this implies. For example, it might be considered always thread-private and might be required to be a core, unboxed type. These extra assumptions are only worth it if they enhance the optimization possibilities surrounding such a value.

-- 
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback
Re: $*CWD instead of chdir() and cwd()
On Fri, Apr 15, 2005 at 03:11:59AM -0700, Michael G Schwern wrote: : Error handling is simple, a failed chdir returns undef and sets errno. : : $CWD = $dir err die "Can't chdir to $dir: $!"; Offhand, I guess my main semantic problem with it is that if a chdir fails, you aren't in an undefined location, which the new value of $CWD would seem to indicate. You're just where you were. Then the user either has to remember that, or there still has to be some other means of finding out the real location. The other problem with it is the fact that people will assign relative paths to it and expect to get the relative path back out instead of the absolute path. : I encourage Perl 6 to adapt $*CWD similar to File::chdir and simply eliminate : chdir() and cwd(). They're just an unlocalizable store and fetch for global : data. Your assumption there is a bit inaccurate--in P6 you are allowed to temporize (localize) the effects of functions and methods that are prepared to deal with it. However, I agree that it's nice to have an easily interpolatable value. So I think I'd rather see $CWD always return the current absolute path even after failure, and temp chdir($dir) err fail "Can't chdir to $dir: $!"; be made to work as a temporizable function at some point, via the TEMP mechanism described in A4. Larry
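A rough Perl 5 approximation of what a temporizable chdir() could look like: a guard object that restores the old directory when it goes out of scope. This is only a sketch; ScopedChdir is a hypothetical package name, and the TEMP mechanism from A4 is not modelled here. It also shows the point made above that a failed chdir leaves you where you were:

```perl
use strict;
use warnings;
use Cwd ();
use File::Spec;

# Hypothetical helper package, not part of any real module.
package ScopedChdir;

sub new {
    my ($class, $dir) = @_;
    my $orig = Cwd::getcwd();
    chdir $dir or return;               # on failure we are still in $orig; $! is set
    return bless { orig => $orig }, $class;
}

sub DESTROY {
    my ($self) = @_;
    chdir $self->{orig};                # go back when the guard leaves scope
}

package main;

my $before = Cwd::getcwd();
{
    my $guard = ScopedChdir->new(File::Spec->tmpdir)
        or die "Can't chdir to temp dir: $!";
    print "inside: ", Cwd::getcwd(), "\n";
}
my $after = Cwd::getcwd();

# A failed attempt returns nothing and does not move us anywhere.
my $failed      = ScopedChdir->new("/no/such/directory/hopefully");
my $after_fail  = Cwd::getcwd();

print "restored\n"  if $after eq $before;
print "unchanged\n" if !defined $failed && $after_fail eq $before;
```

The real location is always read back with getcwd(), so it stays an absolute path regardless of what was passed in.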
MMD 25 - multiply
One more, and my fingers & brain are getting tired of these changes. If someone wants to continue (and complete it during night here ;-), it's a simple job:

1) vtable.tbl
   - change existing signature of next infix operation
   - add inplace variant directly below it
2) imcc/parser_util.c:is_infix()
   - add the compare case for the MMD
3) make realclean; perl Configure.pl ... && make -s
4) fix all compiler errors in classes and dynclasses by looking at already converted functions and adding the inplace variants
4a) remove code from dynclasses/py*.pmc, if it's the same as the Parrot core base class, or adapt code
5) make test &&
6) svn ci

Thanks,
leo
Re: New language: Parrot Common Lisp
According to Cory Spencer: > I'd like to announce the creation of the Parrot Common Lisp project Excellent! > * It's not a compiler yet, although I've got plans for that down the > road. (declare (type PerlString s)) ? :-) -- Chip Salzenberg- a.k.a. -<[EMAIL PROTECTED]> Open Source is not an excuse to write fun code then leave the actual work to others.
Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?
On Fri, Apr 15, 2005 at 01:01:58PM -, Rafael Garcia-Suarez wrote: > Aaron Sherman wrote in perl.perl6.language : > > > > A silly question: is there a canonical character set from which we > > extract these ranges? Are we hard-coding Unicode here, or is there some > > way for the user to specify the character set for ranges? > > Perl 5 forces [a-z] (or [i-j] for that matter) to be a range of > lowercase alphabetic characters, even on EBCDIC platforms (where it's > not). At the moment, PGE (the part that implements the rule engine) is deferring such questions to Parrot, and otherwise assuming Unicode. Plus, S02 explicitly indicates that Perl is written in Unicode and has consistent Unicode semantics, so I think that's what we should go with. It's certainly the way the compiler will go, at least initially. Pm
[SVN ci] MMD 24 - add converted
MMD subroutines "add" are done.

* removed all mathematical functions from Tcl scalars - all is inherited now

I forgot to mention in MMD 23:

* If you have an overridden __add or __subtract function, either defined as @MULTI or registered via mmdvtregister, these functions must now return the destination PMC. For not yet converted MMD infix operations, the return result is ignored, but returning it does no harm either.

leo
Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?
Aaron Sherman wrote in perl.perl6.language : > > A silly question: is there a canonical character set from which we > extract these ranges? Are we hard-coding Unicode here, or is there some > way for the user to specify the character set for ranges? Perl 5 forces [a-z] (or [i-j] for that matter) to be a range of lowercase alphabetic characters, even on EBCDIC platforms (where it's not).
[PATCH] Minor spelling & punctuation errors
I've corrected a few spelling and punctuation errors; since I'm not done yet, I'd like to know whether I should continue, or whether the general consensus is that it's mostly needless nitpicking. Punctuation has only been corrected where it was already partly present; where it was totally absent, I left it alone, as punctuation does not always add to readability.

Steven

--- src/builtin.c Fri Apr 15 14:24:06 2005
+++ src/builtin.c Fri Apr 15 13:04:58 2005
@@ -4,7 +4,7 @@
 =head1 NAME
-src/builtin.c - Bultin Methods
+src/builtin.c - Builtin Methods
 =head1 SYNOPSIS
--- src/datatypes.c Fri Apr 15 14:24:27 2005
+++ src/datatypes.c Fri Apr 15 14:34:40 2005
@@ -1,6 +1,5 @@
 /*
-Copyright: (c) 2002 Leopold Toetsch <[EMAIL PROTECTED]>
-License: Artistic/GPL, see README and LICENSES for details
+Copyright: (c) 2002-2004 The Perl Foundation. All Rights Reserved.
 $Id: datatypes.c,v 1.11 2004/09/08 00:33:58 dan Exp $
 =head1 NAME
@@ -10,7 +9,7 @@
 =head1 DESCRIPTION
 The functions in this file are used in .ops files to access the C
-and C string constants for Parrot and native data types defined iin
+and C string constants for Parrot and native data types defined in
 F.
 =head2 Functions
--- src/debug.c Fri Apr 15 14:24:34 2005
+++ src/debug.c Fri Apr 15 13:30:21 2005
@@ -749,7 +749,7 @@
 PDB_line_t *line;
 long ln,i;
-/* If no line number was specified set it at the current line */
+/* If no line number was specified, set it at the current line */
 if (command && *command) {
 ln = atol(command);
@@ -944,7 +944,7 @@
 /* PDB_find_breakpoint
  *
  * Find breakpoint number N; returns NULL if the breakpoint doesn't
- * exist or if no breakpoint was specified
+ * exist or if no breakpoint was specified.
  *
  */
 /*
@@ -1470,8 +1470,8 @@
 dest[size++] = 'P';
 goto INTEGER;
 case PARROT_ARG_IC:
-/* If the opcode jumps and this is the last argument
- means this is a label */
+/* If the opcode jumps and this is the last argument,
+ that means this is a label */
 if ((j == info->arg_count - 1) &&
 (info->jump & PARROT_JUMP_RELATIVE))
 {
@@ -1888,7 +1888,7 @@
 =over 4
-=item * This should take the line get an instruction, get the opcode for
+=item * This should take the line, get an instruction, get the opcode for
 that instruction and check that is the correct one.
 =item * Decide what to do with macros if anything.
@@ -2265,7 +2265,8 @@
 =item C
-Description.
+Dumps the buflen, flags, bufused, strlen, offset associated
+with a string and the string itself.
 =cut
--- src/dod.c Fri Apr 15 14:24:42 2005
+++ src/dod.c Fri Apr 15 13:41:18 2005
@@ -97,13 +97,13 @@
 ++arena_base->num_extended_PMCs;
 /*
  * XXX this basically invalidates the high-priority marking
- * of PMCs by putting all PMCs onto the front of the list
+ * of PMCs by putting all PMCs onto the front of the list.
  * The reason for this is the by far better cache locality
- * when aggregates and their contents are marked "together"
+ * when aggregates and their contents are marked "together".
  *
  * To enable high priority marking again we should probably
  * use a second pointer chain, which is, when not empty,
- * processed first
+ * processed first.
  */
 if (tptr || hi_prio) {
 if (PMC_next_for_GC(tptr) == tptr) {
@@ -177,7 +177,7 @@
 if (*dod_flags & (PObj_is_special_PMC_FLAG << nm)) {
 /* All PMCs that need special treatment are handled here.
  * For normal PMCs, we don't touch the PMC memory itself
- * so that caches stay clean
+ * so that caches stay clean.
  */
 #if GC_VERBOSE
 if (PObj_report_TEST(obj)) {
@@ -210,7 +210,7 @@
 PObj_live_SET(obj);
 /* if object is a PMC and contains buffers or PMCs, then attach
- * the PMC to the chained mark list
+ * the PMC to the chained mark list.
  */
 if (PObj_is_special_PMC_TEST(obj)) {
 mark_special(interpreter, (PMC*) obj);
@@ -305,7 +305,7 @@
  * but t/library/dumper* fails w/o this marking.
  *
  * It seems that the Class PMC gets DODed - these should
- * get created as constant PMCs
+ * get created as constant PMCs.
  */
 for (i = 1; i < (unsigned int)enum_class_max; i++) {
 VTABLE *vtable;
@@ -404,10 +404,10 @@
  * First phase of mark is finished. Now if we are the owner
  * of a shared pool, we must run the mark phase of other
  * interpreters in our pool, so that live shared PMCs in that
- * interpreter are appended to our mark_ptrs chain
+ * interpreter are appended to our mark_ptrs chain.
  *
  * If there is a count of shared PMCs and we have already seen
- * all these, we could skip th
[perl #35000] [PATCH] README.win32 & icu 3.2
# New Ticket Created by François PERRAD
# Please include the string: [perl #35000]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/rt3/Ticket/Display.html?id=35000 >

small mistake in [perl #34986] : with ICU 3.2, the library icudata.lib is renamed icudt.lib.

Francois Perrad.

--- README.win32.orig 2005-04-15 11:08:34.0 +0200
+++ README.win32 2005-04-15 11:25:50.0 +0200
@@ -65,7 +65,7 @@
 mkdir C:\usr\lib\data
 set PATH=%PATH%;C:\usr\lib\icu\bin
 cd
-perl Configure.pl --icushared="C:\usr\lib\icu\lib\icudata.lib C:\usr\lib\icu\lib\icuuc.lib" --icuheaders="C:\usr\lib\icu\include" --icudatadir="C:\usr\local\icu\data"
+perl Configure.pl --icushared="C:\usr\lib\icu\lib\icudt.lib C:\usr\lib\icu\lib\icuuc.lib" --icuheaders="C:\usr\lib\icu\include" --icudatadir="C:\usr\local\icu\data"
 With MinGW32, use icu-3.2-Win32-msvc6.zip.
@@ -112,9 +112,9 @@
 With the ActiveState Perl distribution, tell Configure.pl to use gcc :
-perl Configure.pl --cc=gcc --icushared="C:\usr\lib\icu\lib\icudata.lib C:\usr\lib\icu\lib\icuuc.lib" --icuheaders="C:\usr\lib\icu\include" --icudatadir="C:\usr\local\icu\data"
-
-Nota: Use only the ICU binary distribution.
+perl Configure.pl --cc=gcc --icushared="C:\usr\lib\icu\lib\icudt.lib C:\usr\lib\icu\lib\icuuc.lib" --icuheaders="C:\usr\lib\icu\include" --icudatadir="C:\usr\local\icu\data"
+or
+perl Configure.pl --cc=gcc --without-icu
 =item Intel C++
Re: [perl #34994] [TODO] make useful parts of Parrot config available at runtime
On 15 Apr, Leopold Toetsch wrote:
: That stuff is all in Perl code under the config dir, e.g:
:
: $ find config -type f | xargs grep -w intsize

This clarifies some of my unapproved assumptions, although src has some files containing these keywords too.

: I think we should have:
:
:   INTVAL_t       # type of the INTVAL
:   FLOATVAL_t
:   INTVAL_size
:   int_size       # native c type
:
: and so on. See also include/parrot/datatypes.h

I will.

: leo

Steven
Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?
> > even sillier question: > if <[a.z]> matches "a", "." and "z" > and <[a...]> matches all characters from "a" including (for some > definition of 'all') > > how will be range \x21 .. \x2e written? > <[!..\.]>? (i.e. "." escaped?) > I was assuming from Larry's mail that <[a...]> would parse as either: 1) a character class containing the range from 'a' to '.' (what that means is a bit mind-bending for a friday afternoon) 2) a character class containing 'a' then a range from '.' to... oh, an error Which way might be ambiguous, but could of course be defined in the grammar. It hadn't occurred to me that ... for the range to infinity would be allowed or useful here. I suppose it could just mean 'up to the end of the available codepoints'. I do love the idea of <[a..f]> type ranges though. It's just what the three dots mean that's got me confused.
Re: <[]> ugly and hard to type
On Fri, Apr 15, 2005 at 02:58:44PM +0200, Juerd wrote:
> Am I the only one who thinks <[a-z]> is ugly and hard to type because of
> the nested brackets? The same goes for <{...}>. The latter can't easily
> be fixed, I think, but the former perhaps can.

Part of the thinking behind this is that the <[...]> construct is likely to be less common in p6 rules than [...] was in p5 regular expressions. For unicode reasons, one typically should be writing <alpha> instead of <[a-z]> anyway.

But yes, I understand the difficulty of typing <[...]> on non-US keyboards. :-)

> \letter[] could well replace <[]>, and \LETTER[] would then replace
> <-[]>. This is consistent with many other \letters.
>
> "c" for character is taken
> "r" for range is taken by carriage return
> "a" for any is taken by alarm (bell)
> "l" for list is taken by lcfirst

Actually, \L[...] is gone -- see S05 and A05. I'm not sure if \a exists, I haven't seen any reference to it in p6 rules. (One could claim that it's carried over from p5, but rules are so far different from regexes that I'm hesitant to make that assumption.) We could certainly declare \a to be something else.

This isn't a vote from me either in favor or against this idea... I'm just clarifying and making sure the discussion is up-to-date with the relevant specs.

Pm
<[]> ugly and hard to type
Am I the only one who thinks <[a-z]> is ugly and hard to type because of the nested brackets? The same goes for <{...}>. The latter can't easily be fixed, I think, but the former perhaps can. If there are more who think it needs to, that is. And <{}> is a bit easier to type because all four are shifted (US QWERTY and US Dvorak), while with <[]> I really have to think hard about when to press and when to release the shift key. \letter[] could well replace <[]>, and \LETTER[] would then replace <-[]>. This is consistent with many other \letters. "c" for character is taken "r" for range is taken by carriage return "a" for any is taken by alarm (bell) "l" for list is taken by lcfirst "m" is available, but I can't think of a mnemonic :) \m[a..z] \M[a..z] And to replace <[a..z]-[aoeui]> (does that construct even exist?), [ \m[a..z] & \M[aoeui] ]. IMO, that's the only step backwards. "a" would best communicate its function. Is the beep thing used enough? (\cG still does that thing if \a is gone.) Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
[RFC] some doubtable MMDs?
I'm not quite sure, but it seems that some of the MMD functions may better be vtable methods:

- bitwise_sh[rl]*    shift by anything other than int?
- bitwise_lsr        is missing generally

or even just a plain opcode only:

- logical_{or,and,xor}    return a PMC depending on the boolean value

What are HLLs expecting of these infix operations?

OTOH it might be useful that the current get__keyed operations (postcircumfix:[]) become MMD subroutines:

  Px = Py[Pz]    Pz = String, Int, Key, Slice, ...

Comments welcome,
leo
Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?
- Original Message -
From: "Aaron Sherman" <[EMAIL PROTECTED]>
To: "David Wheeler" <[EMAIL PROTECTED]>
Cc: "Perl6 Language List"
Sent: Friday, April 15, 2005 2:00 PM
Subject: Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

> On Thu, 2005-04-14 at 21:32 -0700, David Wheeler wrote:
> > On Apr 14, 2005, at 7:06 PM, Patrick R. Michaud wrote:
> >
> > > So, <[a.z]> matches "a", ".", and "z",
> > > while <[a..z]> matches characters "a" through "z" inclusive.
> >
> > I was going to say that that was inconsistent, but since you never need
> > to repeat a letter in a character class, well, I guess it isn't. But
> > the first person to write <[a...]> gets what's comin' to 'em.
>
> A silly question: is there a canonical character set from which we
> extract these ranges? Are we hard-coding Unicode here, or is there some
> way for the user to specify the character set for ranges?
>

even sillier question:
if <[a.z]> matches "a", "." and "z"
and <[a...]> matches all characters from "a" including (for some definition of 'all')

how will be range \x21 .. \x2e written?
<[!..\.]>? (i.e. "." escaped?)

braňo
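For reference, here is how the three character classes under discussion behave in their existing Perl 5 spelling, where "-" rather than ".." marks a range and a dot inside brackets is literal. This is only a sketch of the Perl 5 side; it says nothing about how Perl 6 will finally parse <[a...]> or <[!..\.]>:

```perl
use strict;
use warnings;

# Candidate characters: printable ASCII, 0x20 through 0x7e.
my @ascii = map { chr($_) } 0x20 .. 0x7e;

# [a.z]        three literal members: "a", "." and "z"
# [a-z]        the range "a" through "z"
# [\x21-\x2e]  the range "!" through ".", written with escapes
my @three = grep { /^[a.z]$/ }       @ascii;
my @range = grep { /^[a-z]$/ }       @ascii;
my @punct = grep { /^[\x21-\x2e]$/ } @ascii;

printf "%d %d %d\n", scalar @three, scalar @range, scalar @punct;
```

Writing the endpoints as \x21 and \x2e sidesteps the question of whether a literal "." needs escaping next to the range operator.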
Re: [pugs] regexp "bug"?
"Mark A. Biggar" <[EMAIL PROTECTED]> wrote: :BÁRTHÁZI András wrote: : :> Hi, :> :> This code: :> :> my $a='A'; :> $a ~~ s:perl5:g/A/{chr(65535)}/; :> say $a.bytes; :> :> Outputs "0". Why? :> :> Bye, :> Andras :> : :\u is not a legal unicode codepoint. chr(65535) should raise an :exception of some type. So the above code does seem show a possible :bug. But as that chr(65535) is an undefined char, who knows what the :code is acually doing. In perl5 at least, we support a wider concept of codepoints than the Unicode consortium. This allows us to use strings for a wider variety of things than just Unicode text (eg version strings, bit vectors etc). In perl6 the greatly expanded set of types will presumably allow us to distinguish actual Unicode data from more arbitrary sequences of codepoints, and I'd normally expect that the more constrained type would be a subtype of the less constrained type. In this case that means I'd expect "Unicode string" to be a subtype of something like "codepoint sequence". (In fact it'd probably be useful to have more levels than that - there are times when you need the Unicode concepts for things like [[:digit:]], but may be able to get better performance by avoiding the checks for 'legal Unicode codepoint'.) On the other hand you will probably be able to achieve the things p5 overloads onto strings using packed integer arrays, so maybe this all represents unnecessary complications. In which case maybe 'relaxed' variants of Unicode strings aren't needed. We will probably still want other sorts of strings though, such as ASCII. Hugo
Re: Test::Expect
* Adrian Howard <[EMAIL PROTECTED]> [2005-04-14T15:37:07] > On 14 Apr 2005, at 11:36, Leon Brocard wrote: > >Oh, I forgot to mention to perl-qa that I wrote Test::Expect: > > http://search.cpan.org/dist/Test-Expect/ > > It's nice. Already used it :-) Does anyone who has used both Test::Expect and Test::Output feel like giving a simple comparison? -- rjbs
Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?
On Thu, 2005-04-14 at 21:32 -0700, David Wheeler wrote: > On Apr 14, 2005, at 7:06 PM, Patrick R. Michaud wrote: > > > So, <[a.z]> matches "a", ".", and "z", > > while <[a..z]> matches characters "a" through "z" inclusive. > > I was going to say that that was inconsistent, but since you never need > to repeat a letter in a character class, well, I guess it isn't. But > the first person to write <[a...]> gets what's comin' to 'em. A silly question: is there a canonical character set from which we extract these ranges? Are we hard-coding Unicode here, or is there some way for the user to specify the character set for ranges?
[perl #34999] [TODO] remove more old stuff
# New Ticket Created by Leopold Toetsch
# Please include the string: [perl #34999]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/rt3/Ticket/Display.html?id=34999 >

Some outdated files:

  lib/Parrot/PackFile/*
  lib/Parrot/PackFile.pm
  lib/Parrot/PackFile2.*

what is:

  lib/Parrot/String.pm     old packfile code?
  lib/Parrot/Types.pm      same?
  lib/Parrot/Key.pm        same?

Do we still need:

  lib/Parrot/PMC.pm
  lib/Parrot/Makefile.PL

and what about the chartypes directory, seems to be created in lib/Parrot/Distribution.pm

Already discussed:

  classes/pmc2c.pl         old PMC compiler
  classes/pmcarray.pmc     wrapper for PerlArray

leo
Re: Some PMC's Questions
Bloves Mr <[EMAIL PROTECTED]> wrote:
> hi,folks.
> I am reading PMC C source code and reading some document
> ("http://www.perl.com/pub/a/2002/01/30/pmcs.html").

Although the text is rather old, it's still remarkably valid.

> Some questions:
> *this PMC design have changed?

The internal layout of the PMC structure has changed, yes. And it will likely change in the future. The internals of vtable calls and PMC structure data access are now hidden inside macros:

  SELF->data                => PMC_data(SELF)
  SELF->cache.int_val       => PMC_int_val(SELF)
  $1->vtable->get_bool()    => VTABLE_get_bool(INTERP, $1)

and so on. For details you might consult include/parrot/pmc.h.

> *any body offer some advice that learn PMC C source code and PMC's theory?

Just have a look at existing PMCs in classes. Commonly used core classes are a good place to begin, e.g.:

  classes/integer.pmc              ... the Integer PMC
  classes/resizablepmcarray.pmc    ... standard PMC array

or even

  classes/tqueue.pmc               ... experimental thread-safe queue

> Thanks.

leo
Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?
David Wheeler skribis 2005-04-14 21:32 (-0700): > I was going to say that that was inconsistent, but since you never need > to repeat a letter in a character class, well, I guess it isn't. But > the first person to write <[a...]> gets what's comin' to 'em. Given ASCII, <[\x20...]> would then be everything except control characters. Handy! By the way, does ...5 mean -Inf..5? ;) Juerd -- http://convolution.nl/maak_juerd_blij.html http://convolution.nl/make_juerd_happy.html http://convolution.nl/gajigu_juerd_n.html
$*CWD instead of chdir() and cwd()
I was doing some work on Parrot::Test today and was replacing this code with something more cross platform.

    # Run the command in a different directory
    my $command = 'some command';
    $command = "cd $dir && $command" if $dir;
    system($command);

I replaced it with this.

    my $orig_dir = cwd;
    chdir $dir if $dir;
    system $command;
    chdir $orig_dir;

Go into some new directory temporarily, run something, go back to the original. Hmm. Set a global to a new value temporarily and then return to the original value. Sounds a lot like local. So why not use it?

    {
        local chdir $dir if $dir;
        system $command;
    }

But localizing a function call makes no sense, especially if it has side effects. Well, the current working directory is just a filepath. Scalar data. Why have a function to change a scalar? Just change it directly. Now local() makes perfect sense.

    {
        local $CWD = $dir if $dir;
        system $command;
    }

And this is exactly what File::chdir does. $CWD is a tied scalar. Changing it changes the current working directory. Reading it tells you what the current working directory is. Localizing it allows you to safely change the cwd temporarily, for example within the scope of a subroutine. It eliminates both chdir() and cwd().

Error handling is simple, a failed chdir returns undef and sets errno.

    $CWD = $dir err die "Can't chdir to $dir: $!";

I encourage Perl 6 to adapt $*CWD similar to File::chdir and simply eliminate chdir() and cwd(). They're just an unlocalizable store and fetch for global data.

As a matter of fact, Autrijus is walking me through implementing it in Pugs right now.
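A minimal Perl 5 sketch of the tied-scalar idea described in this message. TinyCwd is a hypothetical package written for illustration; the real File::chdir module on CPAN is more careful and differs in detail. The guard against undef in STORE is there because, as far as I understand it, localizing a tied scalar can first store undef into it:

```perl
use strict;
use warnings;
use Cwd ();
use File::Spec;

# Hypothetical package name, for illustration only.
package TinyCwd;

sub TIESCALAR { my ($class) = @_; return bless {}, $class }

# Reading the variable always asks the OS for the current directory.
sub FETCH { return Cwd::getcwd() }

# Writing the variable changes directory; $! is set if chdir fails.
sub STORE {
    my ($self, $dir) = @_;
    return unless defined $dir;      # ignore the undef that "local" may store
    return chdir $dir;
}

package main;

our $CWD;
tie $CWD, 'TinyCwd';

my $start = $CWD;
{
    local $CWD = File::Spec->tmpdir;   # change directory for this block only
    print "inside: $CWD\n";
}
my $end = $CWD;                        # restored when the block was left

print "back where we started\n" if $start eq $end;
```

Because FETCH always calls getcwd(), reading the variable yields the real absolute location even after a failed or relative assignment.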
Re: A sketch of the security model
On Thu, 2005-04-14 at 09:11 -0400, Dan Sugalski wrote: > At 10:03 PM -0400 4/13/05, Michael Walter wrote: > > > Each running thread has two sets of privileges -- the active > >> privileges and the enableable privileges. Active privs are what's > >> actually in force at the moment, and can be dropped at any time. The > >> enableable privs are ones that code can turn on. It's possible to > >> have an active priv that's not in the enableable set, in which case > >> the current running code is allowed to do something but as soon as > >> the privilege is dropped it can't be re-enabled. > > > >How can dropping a privilege for the duration of a (dynamic) scope be > >implemented? Does this need to be implemented via a parrot intrinsic, > >such as: > > > > without_privs(list_of_privs, code_to_be_run_without_these_privs); > > > >..or is it possible to do so with the primitives you sketched out above? > > When a priv is dropped it stays dropped until it's reinstated. If > code drops a priv that it can't re-enable then the priv is gone. > (There are going to be issues with privileges attached to > continuations, since this could potentially mean that dropped privs > get un-dropped when you invoke a return continuation, though dropping > a privilege could ripple up the return continuation chain) Reinstating privileges when you return is normal, since potentially malicious code and data has now been removed from the stack. If you do NOT do it this way, then every piece of code must know the privileges of every child piece of code it calls (bye-bye virtual base classes with user implementations). See http://research.microsoft.com/~adg/Publications/MSR-TR-2001-103.pdf The ability to explicitly reenable a privilege via an opcode, rather than via the removal of the malicious party from the computation (by return) is almost definitely a bad idea. 
If you protect this opcode using some security mechanism, you will rapidly find that security mechanism can supersede the functionality provided by the opcode. > > > Additionally, subroutines may be marked as having privileges, which > >> means that as long as control is inside the sub the priv in question > >> is enabled. This allows for code that has elevated privs, generally > >> system-level code. > > > >Does the code marking a subroutines must have any other privilege than > >the one it is marking the subroutine with? > > Dunno, that's something we'll need to work out. It's possible that > sub marking needs to be done externally -- that is, it's bytecode > metadata or something like that which requires system privileges of > some sort to set. (Though there are issues with that) Marking code as > privileged is really a system administration task, though we've not > really put much thought into administering a parrot system yet. Actually, what usually happens is that subroutines (etc) are associated with a responsible party (principal), and privileges are granted to the principal; thus finding out the privileges of an opcode requires an extra indirection. This is not a problem. > > > ... Non-continuation > >> invokables (subs and methods) maintain the current set of privs, plus > >> possibly adding the sub-specific privs. > >Same for closures? > > Yeah, I think so. No, as before. You cannot execute based only on static privileges - this is what Unix does, and the Unix model is broken. You need either a stack inspection or a data inspection model, or a combination of the two. Ask me if you want formal descriptions or implementation details of these models. S.
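The point about privileges coming back on return can be sketched with dynamic scoping. The following Perl 5 model is an assumption-laden illustration, not Parrot's design: %ACTIVE, without_privs and can_do are hypothetical names, and a real stack-inspection model would consult the principals of every frame on the stack rather than one shared set:

```perl
use strict;
use warnings;

# Hypothetical model: the privileges currently in force, as a dynamically scoped set.
our %ACTIVE = (read_file => 1, open_socket => 1);

# Check whether a privilege is held right now.
sub can_do {
    my ($priv) = @_;
    return $ACTIVE{$priv} ? 1 : 0;
}

# Run $code with the listed privileges dropped. No explicit "re-enable" step
# exists: "local" undoes the change as soon as control returns from here.
sub without_privs {
    my ($privs, $code) = @_;
    local %ACTIVE = %ACTIVE;         # private copy for this dynamic scope
    delete @ACTIVE{@$privs};
    return $code->();
}

my $inside = without_privs(['open_socket'], sub { can_do('open_socket') });
my $kept   = without_privs(['open_socket'], sub { can_do('read_file') });
my $after  = can_do('open_socket');

print "inside=$inside kept=$kept after=$after\n";
```

The callee never gets an operation that re-enables a privilege; the only way it comes back is by the less-privileged code leaving the stack.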
Re: Parrot bytecode reentrancy
Nigel Sandever <[EMAIL PROTECTED]> wrote: > When a sub that closes over a variable > my $closure = 0; > sub do_something { > return $closure++; > } > is called from two threads, do the threads share a single closure or > each get their own separate closure? AFAIK: the closure bytecode is shared, the Closure PMC with the lexical pad is distinct. But none of that is implemented yet. > njs leo
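As an analogy only (this is Python, not Parrot, and it says nothing about how Parrot threads will actually behave), the split between shared code and a distinct lexical pad can be observed like this. `make_counter` is a hypothetical name.

```python
def make_counter():
    closure = 0                 # plays the role of "my $closure = 0"
    def do_something():
        nonlocal closure
        closure += 1
        return closure - 1      # same result as "return $closure++"
    return do_something

a = make_counter()
b = make_counter()

# One compiled body, two separate captured variables.
shared_code = a.__code__ is b.__code__
separate_state = a.__closure__[0] is not b.__closure__[0]
calls = (a(), a(), b())
```

Here the compiled body corresponds to the shared bytecode and each cell to a separate pad.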
Re: A sketch of the security model
On Wed, 2005-04-13 at 22:03 -0400, Michael Walter wrote: > Dan, > > On 4/13/05, Dan Sugalski <[EMAIL PROTECTED]> wrote: > > All security is done on a per-interpreter basis. (really on a > > per-thread basis, but since we're one-thread per interpreter it's > > essentially the same thing) > Just to get me back on track: Does this mean that when you spawn a > thread, a separate interpreter runs in/manages that thread, or > something else? > > > Each running thread has two sets of privileges -- the active > > privileges and the enableable privileges. Active privs are what's > > actually in force at the moment, and can be dropped at any time. The > > enableable privs are ones that code can turn on. It's possible to > > have an active priv that's not in the enableable set, in which case > > the current running code is allowed to do something but as soon as > > the privilege is dropped it can't be re-enabled. > > How can dropping a privilege for the duration of a (dynamic) scope be > implemented? Does this need to be implemented via a parrot intrinsic, > such as: > > without_privs(list_of_privs, code_to_be_run_without_these_privs); > > ..or is it possible to do so with the primitives you sketched out above? This is usually done by creating a function "f(code) { code() }" without any static privileges in list_of_privs. To evaluate a function g() without those privileges, evaluate f(g), and the natural mechanisms of the interpreter will ensure that these privileges are not held during g(). > > Additionally, subroutines may be marked as having privileges, which > > means that as long as control is inside the sub the priv in question > > is enabled. This allows for code that has elevated privs, generally > > system-level code. > > Does the code marking a subroutines must have any other privilege than > the one it is marking the subroutine with? > > > ... 
Non-continuation > > invokables (subs and methods) maintain the current set of privs, plus > > possibly adding the sub-specific privs. > > Same for closures? Closures may also capture a concept of the current context, which is used when they are evaluated. This is critical in, for example, the case of system code with higher static privileges returning a closure to a low privilege object which may evaluate it at any time. a) The closure must not have any privileges not held by the low privilege object, so clearly it cannot just hold its static privilege set, it must capture a current context. b) If it does wish to have higher privilege (very common), it may grant (Fournet+Gordon,2003) these privileges in a dynamic scope bounded below by itself. S.
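Both points in this message can be modelled in a few lines. This is a hedged Python sketch with hypothetical names (`call_with`, `make_closure`), assuming privilege sets combine by intersection; it does not cover the 'grant' operation of point b).

```python
def call_with(static_privs, current, code):
    """f(code) { code() }: run code with the caller's privileges narrowed
    by the static privileges of the wrapper f."""
    return code(current & frozenset(static_privs))

def make_closure(creation_context, body):
    """Return a closure that remembers the context it was created in."""
    captured = frozenset(creation_context)
    def closure(current):
        # Never more than was held at creation, never more than the caller has.
        return body(captured & current)
    return closure

ALL = frozenset({"read", "write", "net"})

# f holds no "net" privilege, so g (the lambda) runs without it.
seen_by_g = call_with(ALL - {"net"}, ALL, lambda privs: privs)

# A closure created while a caller holding only "read" was on the stack keeps
# that limit, even when a fully privileged context evaluates it later.
closure = make_closure({"read"}, lambda privs: privs)
seen_in_closure = closure(ALL)
```

The second half is case a) above: the closure cannot rely on the static privileges of the system code that built it.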
Re: A sketch of the security model
On Thu, 2005-04-14 at 09:51 -0700, Dave Whipp wrote: > Dan Sugalski wrote: > > > All security is done on a per-interpreter basis. (really on a per-thread > > basis, but since we're one-thread per interpreter it's essentially the > > same thing) > ... > >* Number of open files > >* IO operations/sec > >* IO operations total > ... > > Can an "application" get more resources simply by spawning threads? If Well, given that a child thread's dynamic access control context should include the dynamic context of the parent thread at the point where the thread was spawned, no. What I describe is a (provably) correct implementation. > the answer is "no, parent and child must divide or share their quotas" then > there is a load balancing problem. If the answer is "yes", then there's There is no load balancing problem assuming you are synchronized on the thread-create point, which is not a major overhead, since that pretty much has to be a synchronization point in the kernel anyway. > no real protection at all. A threads-per-second limit isn't an answer > here, either (a malicious app could sit around for a few hours, > launching threads at a low intensity, until it has enough to bring down > the system). > > Is a thread really the right thing to apply these limits to? It seems to Limits are applied to privilege sets, not to threads. > me that there needs to be some sort of token (cf. cash; cf "capability") > that an application can obtain/spend/refresh to do these ops. An Yes, that's about the same. > application could share its token(s) with any threads it creates. It > could probably even "loan" its token to a background thread that does > some operation on behalf of many other threads. Preferably not. I fear the concept of being able to hand out privileges to low privilege threads. 
If the low privilege thread has access to a (willing) object with static privileges allowing the operation, then that object should perform the operation on behalf of the thread in a dynamic context created by a 'grant' operation (See Fournet and Gordon, 2003). If the low privilege thread is made up entirely of low privilege objects, then it shouldn't have the privilege under any circumstances. S.
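The statement that limits apply to privilege sets rather than to threads can be illustrated with a small sketch. The `Quota` class and the numbers are hypothetical; the assumption is that every thread spawned from a context charges the same quota object, so spawning threads does not buy extra resources.

```python
import threading

class Quota:
    """Hypothetical quota attached to a privilege context, not to a thread."""
    def __init__(self, limit):
        self._limit = limit
        self._used = 0
        self._lock = threading.Lock()

    def charge(self, amount=1):
        """Consume part of the quota; return False once it is exhausted."""
        with self._lock:
            if self._used + amount > self._limit:
                return False
            self._used += amount
            return True

def worker(quota, results):
    # The child inherited its parent's context, so it charges the same quota.
    results.append(sum(quota.charge() for _ in range(10)))

quota = Quota(limit=15)            # e.g. "IO operations total"
results = []
threads = [threading.Thread(target=worker, args=(quota, results))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
granted = sum(results)             # total operations allowed across all threads
```

Four threads ask for 40 operations in total, but only as many as the shared limit allows are granted, regardless of how the work is split.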
[SVN ci] MMD 23 - convert subtract MMD functions and opcodes
Continuing the MMD infix plan, we now have:

1) The subtract MMD functions are converted to the new function signature:

   PMC* subtract(PMC* value, PMC* dest)

   If dest isn't NULL, it's set to the result of the operation and the result is returned. This is the existing behavior. The TODO new "n_sub" opcode will return a new destination with the result, as needed by languages like Python or Lisp.

2) There are now distinct inplace variants of subtract, with "i_" prepended to the function name:

   void i_subtract(PMC *value)

3) During opcode generation, the "sub" opcode is converted according to:

   sub Px, Py, Pz  =>  infix .MMD_SUBTRACT, Px, Py, Pz
   sub Px, Py      =>  infix .MMD_I_SUBTRACT, Px, Py
   sub Px, Px, Py  =>  infix .MMD_I_SUBTRACT, Px, Py

   I'm not quite sure if the latter is technically correct or useful. It might cause a problem when operators are overloaded. OTOH it can save a compare "if (dest == SELF) ...".

4) Tcl and Python scalars use the inherited subtract MMD of the Parrot core types Integer, Float, Complex, and BigInt. The old (duplicated, cut'n'pasted) variants of subtract were simply deleted in the Tcl and Python dynamic classes.

5) For type promotion on Integer overflow, I've changed the bignum vtables. We now have:

   PMC* VTABLE_get_bignum(INTERP, SELF)

   which returns a new big integer of the appropriate type, e.g. a PyLong. Along with morph these two functions are enough to preserve the HLL's view of types. There is a new test t/dynclass/pyint_26 that shows correct promotion of PyInt to PyLong.

6) While changing the scalar classes I found a lot of unused functions and vtables. E.g.

   - get_bool_keyed*   # unused, unneeded
   - set_bool_keyed*   # same
   - set_number
   - set_string        # no vtable slots, we have assign anyway

   This is partially cleaned up now.

7) make test succeeds; this includes t/dynclass/py*.t.

   cd languages/tcl
   TEST_PROG_ARGS=-G make test

   shows 46/228 failing; with DOD enabled almost all fail. I don't know yet what's going on here. It seems that TclParser is the culprit.
During class_init it creates a lot of strings, e.g. "bs_nl", which are declared static in that file. But these strings aren't anchored anywhere or registered with Parrot's DOD registry. leo PS: please "make realclean" so that vtable changes are propagated.
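The calling conventions from items 1 to 3 can be modelled outside of C. The following Python sketch is only an illustration: the `Integer` class and `convert_sub` are hypothetical stand-ins, and creating a fresh result when dest is missing is an assumption, since the message only describes the non-NULL case.

```python
class Integer:
    """Toy stand-in for a Parrot PMC; not the real C API."""
    def __init__(self, value=0):
        self.value = value

def subtract(self, value, dest=None):
    """Models PMC* subtract(PMC* value, PMC* dest): fill dest and return it."""
    if dest is None:
        dest = Integer()       # assumption: a fresh result when no dest is given
    dest.value = self.value - value.value
    return dest

def i_subtract(self, value):
    """Models void i_subtract(PMC* value): modify self in place."""
    self.value -= value.value

def convert_sub(args):
    """Rewrite the operands of a 'sub' opcode as described in item 3."""
    if len(args) == 2:                        # sub Px, Py
        return (".MMD_I_SUBTRACT", args[0], args[1])
    px, py, pz = args
    if px == py:                              # sub Px, Px, Py
        return (".MMD_I_SUBTRACT", px, pz)
    return (".MMD_SUBTRACT", px, py, pz)      # sub Px, Py, Pz

a, b, d = Integer(7), Integer(2), Integer()
result = subtract(a, b, d)     # d is filled and returned
i_subtract(a, b)               # a itself now holds the difference
```

The third rewrite is the one questioned in the message: with an overloaded operator, the in-place variant and the three-operand variant may not be interchangeable.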
Re: New language: Parrot Common Lisp
Cory Spencer <[EMAIL PROTECTED]> wrote: > I'd like to announce the creation of the Parrot Common Lisp project, which > aims to implement a significant subset of the Common Lisp language. Wow. I can even do something with it: $ ../parrot lisp.imc -> (+ 2 5) 7 -> (list 1 2 3) (1 . (2 . (3 . NIL))) Ehem, that's almost all I know about Lisp. > Depending on the system (I develop on both x86/Linux and g4/OS X), > you'll get a Bus Error, Segmentation Fault or some other random error > if you don't disable the GC. > (If anyone is able to track down aforementioned DOD/GC problems, > you'll earn my eternal gratitude.) Can you please provide a code snippet that exhibits the error? > -c leo
Re: [perl #34994] [TODO] make useful parts of Parrot config available at runtime
Steven Philip Schubiger wrote: [ cc'ed list, so that folks know about takers ] On 15 Apr, Leopold Toetsch wrote: : 5) along with bringing the config online, some cleanup and renaming : wouldn't harm e.g. "iv" vs "opcode_t", "intvalsize" vs "intsize" vs : "opcode_t_size" ... This part seems appealing to me, but bear in mind, I've never tampered with the Parrot C sources, although I've been heavily involved in other C-based projects (GNU coreutils et al.) That stuff is all in Perl code under the config dir, e.g: $ find config -type f | xargs grep -w intsize And do you have more examples or should I follow my guts? I think we should have: INTVAL_t # type of the INTVAL FLOATVAL_t INTVAL_size int_size # native c type and so on. See also include/parrot/datatypes.h Steven leo
Re: A sketch of the security model
On Wed, 2005-04-13 at 17:51 -0400, Aaron Sherman wrote: > On Wed, 2005-04-13 at 17:01, Dan Sugalski wrote: > > So here's what I was thinking of for Parrot's security and quota > > model. (Note that none of this is actually *implemented* yet...) > [...] > > It's actually pretty straightforward, the hard part being the whole > > "don't screw up when implementing" thing, along with designing the > > base set of privs. Personally I think taking the VMS priv and quota > > system as a base is a good way to go -- it's well-respected and > > well-tested, and so far as I know theoretically sound. Unix's priv > > model's a lot more primitive, and I don't think it's the one to take. > > (We could invent our own, but history shows that people who invent > > their own security system invent ones that suck, so that looks like > > something worth avoiding) > > VMS at least *is* a priv-based security model, but VMS privs are not > appropriate for parrot on the whole. The best known model for privileges (logic of authorisation over) is that of Oracle, RT, etc, where access over privileges is transitive. Will find good references on request/when I have more time. Bad references are available from Ravi Sandhu, but he doesn't handle transitivity or modification of rights well, if at all. S.
Re: A sketch of the security model
Someone's pointed this thread out to me, so I'm going to shove an oar in following a few posts. I've done a fair bit of security work, so feel free to ask me to explain, justify or provide references for anything. On Wed, 2005-04-13 at 17:01 -0400, Dan Sugalski wrote: > All security is done on a per-interpreter basis. (really on a > per-thread basis, but since we're one-thread per interpreter it's > essentially the same thing) What you actually mean (or what I believe you _should_ mean) is per-context, in the lambda-calculus sense of context. See notes below about continuations. > QUOTAs are limits on the number of resources or operations that an > interpreter can allocate or perform, either in absolute terms (i.e. > allocate no more than 10M of memory) or relative terms (i.e. can do > only 10 IO operations per second). Quotas are tracked by parrot, and > cover: The ability to manipulate and exceed QUOTAs should be controlled in dynamic context. > PRIVILEGEs are permissions to do certain things. Parrot will have a > number of privileges it checks before doing dangerous operations, and > user code may also assign and check privileges. > > Normally parrot runs with no quotas and no privilege checking. This > is the fastest way to run. Code may at any time enable privilege Actually, you can do privilege checking in an efficient engine, even using most of the reflection systems, with almost no overhead. See Java. > and/or quota checking. Once enabled code must have proper privileges > to disable it again. Typically AllPermission, otherwise you have the ability to perform privilege escalation. > Each running thread has two sets of privileges -- the active > privileges and the enableable privileges. Active privs are what's > actually in force at the moment, and can be dropped at any time. The > enableable privs are ones that code can turn on. 
It's possible to > have an active priv that's not in the enableable set, in which case > the current running code is allowed to do something but as soon as > the privilege is dropped it can't be re-enabled. Enableable privileges are usually called static privileges and are usually defined as the privileges held statically by the current object, or if we read ahead to your next point, subroutine. > Additionally, subroutines may be marked as having privileges, which > means that as long as control is inside the sub the priv in question > is enabled. This allows for code that has elevated privs, generally > system-level code. Please no. Privileges should be explicitly granted. You have just described the Unix SUID model, where as long as control is inside a root-owned daemon (for daemon, read subroutine), the root privilege is enabled. This always leads to privilege escalation and is BAD. What you _should_ mean, according to all prior research, is that "No code may be inside that routine and still hold a privilege not held by the routine". In shorter form, "The dynamic (current) privilege set must not exceed the static privilege set of any routine on the stack". A slightly different formulation applies for data inspection systems. See footnote. > Continuations, when taken, capture the current set of active and > enableable privs, and when invoked those privs are put into place. > (This is a spot that will require some thought, since there's a > potential for privilege leaks which worries me here) Non-continuation > invokables (subs and methods) maintain the current set of privs, plus > possibly adding the sub-specific privs. If you perform the above step correctly, then capturing a context and including it in future access control checks is not hard. Java does this by capturing a current AccessControlContext when a new ClassLoader is created in a thread to be used in a different thread. 
No code loaded by that ClassLoader IN ANY THREAD may exceed the privileges of the thread which created the classloader at the time it created it. > It's actually pretty straightforward, the hard part being the whole > "don't screw up when implementing" thing, along with designing the > base set of privs. Personally I think taking the VMS priv and quota > system as a base is a good way to go -- it's well-respected and > well-tested, and so far as I know theoretically sound. Unix's priv > model's a lot more primitive, and I don't think it's the one to take. > (We could invent our own, but history shows that people who invent > their own security system invent ones that suck, so that looks like > something worth avoiding) Better systems to inspect would be Java (stack inspection), Perl5 (data inspection). Please do not confuse the choice of privilege set and logic over it (authorisation system) with the mechanism for identifying the current set of privileges (identification of current principal). The key difference in security between stack inspection and data inspection systems for the purposes of parrot is that stack inspection considers for sec
Re: Parrot/PUGS Hack-a-thon at the Austrian Perl Workshop
Hi, There will be a Parrot/PUGS Hack-a-thon at the Austrian Perl Workshop, which takes place on 9th and 10th June in Vienna, Austria. Autrijus Tang, Chip Salzenberg and Leo Toetsch will be there. You should be there too :-) I'll be there, too. ;) Bye, Andras
Re: Hyper operator corner case?
John Williams wrote: Good point. Another one is: how does the meta_operator determine the "identity value" for user-defined operators? Does it have to? The definition of the identity value---BTW, I like the term "neutral value" better because identity also is a relation between two values---is that $x my_infix_op $neutral == $x. So the generic implementation that copies surplus elements is correct with respect to the resulting value. You shouldn't expect the operator being called as many times as there are elements in the bigger data structure, though. It's called only for positions where both structures have actual values. But that is the same as short-circuiting && and ||. And somewhat the reverse of autothreading from junctive values. I believe the fine points fall out like this: @a >>+<< 1 # replicate @a >>+<< (1) # replicate: (1) is still scalar @a >>+<< [1] # extend: [1] is an array (and will auto-deref) I think they fall out naturally from typing and dispatch. But note that the » « operator has three args. I haven't made the &op a dispatch selector. If the my_infix_op from above needs to handle neutral elements by itself just tell the dispatcher by defining &infix_circumfix_meta_operator:{'»','«'}:(List,List,&my_infix_op:) and construct the neutral elements when one of the lists runs out of elements. I hope the syntax I used does what I want to express. Note that in :(List,List,&my_infix_op:) the first two elements are types while &my_infix_op is a sub value. In that sense my &op was actually wrong but it was nice for wording my sentence. So the generic name should read &infix_circumfix_meta_operator:{'»','«'}:(List,List:Code) or perhaps &infix_circumfix_meta_operator:{'»','«'}:(List,List:&) if & is considered as the code sigil. Hmm, then we could also have :(@,@:&) meaning the same type spec? BTW, starting from these type specs I come (back) to the suggestion of using » « for hypering function calls and/or their arguments. Has that been decided? 
I'm not sure if specialisation on values is covered by the :() syntax. E.g. one could implement &infix:<*>:(0,Any) to return 0 without evaluating the Any term at all! But this needs either lazy evaluation in the functional paradigm or code morphing 'x() * y()' to '(($t = x()) != 0) ?? $t * y() :: 0' or some such. At the assembler level this morphing reduces to an additional check of a register for zero. But I'm not sure if the type system and the optimizer will be *that* strong in the near future ;) Regards -- TSa (Thomas Sandlaß)
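The replicate/extend behaviour described in this message can be sketched as follows. This is a Python model with a hypothetical `hyper` function, not Perl 6 semantics as specified; it assumes a scalar operand is replicated and that the surplus elements of the longer list are copied through without calling the operator.

```python
import operator

def hyper(op, left, right):
    """Sketch of a hyper operator applied to a list and a scalar or list."""
    if not isinstance(right, list):
        # replicate: the scalar is paired with every element
        return [op(item, right) for item in left]
    # extend: op is called only where both lists have an element ...
    paired = [op(l, r) for l, r in zip(left, right)]
    # ... and the surplus of the longer list is copied unchanged,
    # which is what pairing it with a neutral value would produce.
    longer = left if len(left) > len(right) else right
    return paired + longer[len(paired):]

replicated = hyper(operator.add, [1, 2, 3], 1)      # like @a >>+<< 1
extended = hyper(operator.add, [1, 2, 3], [1])      # like @a >>+<< [1]
```

In the second call the operator runs once, not three times, which is the point made above about not expecting one call per element of the bigger structure.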
Some PMC's Questions
Hi, folks. I am reading the PMC C source code and a document about it (http://www.perl.com/pub/a/2002/01/30/pmcs.html). Some questions: * Has this PMC design changed? * Can anybody offer advice on how to learn the PMC C source code and the theory behind PMCs? Thanks. /* p2p is a protocol or a compiler? */
Re: [pugs] regexp "bug"?
BÁRTHÁZI András wrote: Hi, >> This code: >> >> my $a='A'; >> $a ~~ s:perl5:g/A/{chr(65535)}/; >> say $a.bytes; >> >> Outputs "0". Why? > > > \uFFFF is not a legal unicode codepoint. chr(65535) should raise an exception of some type. So the above code does seem to show a possible bug. But as that chr(65535) is an undefined char, who knows what the code is actually doing. In my opinion (that can be wrong), \uFFFF can be stored as an UTF-8 character, it should be 0xEF~0xBF~0xBF. If I do it outside the regexp (I mean "say chr(65535).bytes"), it works well. Another "bug", I've found, it's not related to the regexps, but still a unicode character one: say chr(0x10).bytes; The answer: pugs: encodeUTF8: ord returned a value above 0x10FFFF And if I start to increment $b, I will get: pugs: Prelude.chr: bad argument I don't understand it, as I thought that unicode characters are in the range of 0x0000-0x7FFFFFFF. Is Haskell not supporting the whole set? There is a Unicode version, called UCS-2, that is just between 0x0000-0xFFFF, but it still does not answer the question. [...] Meanwhile, I've found this: http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2175.htm It can be the answer to my question. Yes, the value 0xFFFF can be stored as either a 3 byte UTF-8 string or a 2 byte UCS-2 value, but the Unicode standard specifically says that the values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should never appear in a Unicode string. 0xFFFF is reserved for out-of-band signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are specifically reserved for out-of-band marking of a UCS-2 file as being either big-endian or little-endian, but are specifically not considered part of the data. chr() is currently defined to mean convert an int value to a Unicode codepoint. That's why I said that chr(65535) should raise an exception, it's an argument error similar to sqrt(-1). -- [EMAIL PROTECTED] [EMAIL PROTECTED]
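The ranges discussed in this thread can be checked with a short sketch. It is written in Python rather than Pugs, and `is_valid_scalar` is a hypothetical helper. One caveat on the list above: U+FEFF itself is an assigned character (the byte order mark), so this sketch accepts it and rejects only U+FFFE, U+FFFF and the other noncharacters.

```python
def is_valid_scalar(cp):
    """True if cp is a code point a strict chr() could reasonably accept."""
    if cp < 0 or cp > 0x10FFFF:
        return False                  # outside the Unicode code space
    if 0xD800 <= cp <= 0xDFFF:
        return False                  # surrogates, only meaningful in UTF-16
    if (cp & 0xFFFE) == 0xFFFE:
        return False                  # noncharacters U+nFFFE and U+nFFFF
    if 0xFDD0 <= cp <= 0xFDEF:
        return False                  # the remaining noncharacter block
    return True

# Python does not reject noncharacters, so the three-byte UTF-8 form
# of U+FFFF mentioned above can be inspected directly.
utf8 = chr(0xFFFF).encode("utf-8")
byte_count = len(utf8)
```

Whether chr() should raise an exception for such values or merely allow them is a policy choice; the sketch only separates the cases.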
Re: [pugs] regexp "bug"?
BÁRTHÁZI András wrote: Hi, This code: my $a='A'; $a ~~ s:perl5:g/A/{chr(65535)}/; say $a.bytes; Outputs "0". Why? Bye, Andras \uFFFF is not a legal unicode codepoint. chr(65535) should raise an exception of some type. So the above code does seem to show a possible bug. But as that chr(65535) is an undefined char, who knows what the code is actually doing. -- [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: [pugs] regexp "bug"?
Hi, my $a='A'; $a ~~ s:perl5:g/A/{chr(65535)}/; say $a.bytes; Outputs "0". Why? \uFFFF is not a legal unicode codepoint. chr(65535) should raise an exception of some type. So the above code does seem to show a possible bug. But as that chr(65535) is an undefined char, who knows what the code is actually doing. It seems that it gives back 0 in the 0xE000-0xFFFF range. Do you still think it's normal? "Some Unicode code points are invalid and should not be used. [...] It can't be 0xFFFF or 0xFFFE, it can't be both <= 0xDFFF and >= 0xD800, and it can't be > 0x10FFFF and it can't be less than 0." http://www.elfdata.com/plugin/unicodefaqdata.html Bye, Andras