date:20050415

Re: A sketch of the security model

2005-04-15 Thread Michael Walter

On 4/15/05, Shevek <[EMAIL PROTECTED]> wrote:
> > How can dropping a privilege for the duration of a (dynamic) scope be
> > implemented? Does this need to be implemented via a parrot intrinsic,
> > such as:
> >
> >   without_privs(list_of_privs, code_to_be_run_without_these_privs);
> >
> > ..or is it possible to do so with the primitives you sketched out above?
> 
> This is usually done by creating a function "f(code) { code() }" without
> any static privileges in list_of_privs.
>
> To evaluate a function g() without those privileges, evaluate f(g), and
> the natural mechanisms of the interpreter will ensure that these
> privileges are not held during g().

I understand, thanks.
Michael
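
The f(g) trick described above can be sketched in a few lines. This is a Python model, not Parrot code, and it assumes (the thread does not spell this out) that the privileges in effect are the intersection of the static privileges of every function on the call stack. All names here (effective_privs, call_with_static_privs, without_privs) are hypothetical.

```python
# Hypothetical model: every function carries a static privilege set, and the
# privileges in effect are the intersection over all active frames.
_stack = [frozenset({"read", "write", "net"})]  # privileges of the top level


def effective_privs():
    """Privileges held right now: the intersection over the call stack."""
    privs = _stack[0]
    for frame in _stack[1:]:
        privs &= frame
    return privs


def call_with_static_privs(static_privs, code):
    """Run code() inside a frame whose static privileges are static_privs."""
    _stack.append(frozenset(static_privs))
    try:
        return code()
    finally:
        _stack.pop()  # the privileges come back when the frame exits


def without_privs(dropped, code):
    """The f(code) { code() } trick: f simply lacks the dropped privileges."""
    f_privs = _stack[0] - frozenset(dropped)
    return call_with_static_privs(f_privs, code)


def g():
    """Stand-in for the code to be run; it just reports what it may do."""
    return effective_privs()
```

Calling without_privs({"net"}, g) runs g with "net" missing, and the privilege is back as soon as the call returns, so no dedicated intrinsic is needed in this model.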

Re: [RFC] some doubtable MMDs?

2005-04-15 Thread Bob Rogers

   From: Larry Wall <[EMAIL PROTECTED]>
   Date: Fri, 15 Apr 2005 12:52:53 -0700

   On Fri, Apr 15, 2005 at 02:38:36PM +0200, Leopold Toetsch wrote:
   : I'm not quite sure, but it seems that some of the MMD functions may 
   : better be vtable methods:
   : 
   : - bitwise_sh[rl]*   shift by anything other than int?

Shifting right by a positive BigInt (or left by a negative BigInt) can
be optimized to -1 or 0.  Shifting the other way could still produce a
valid result for some values, even on a machine with 32-bit addresses.
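
A minimal sketch of the right-shift shortcut, in Python, whose integers are already arbitrary precision. It assumes the value being shifted fits in a signed word of word_bits bits; the function name and that parameter are made up for illustration.

```python
def shift_right(value, count, word_bits=32):
    """Arithmetic right shift of a word_bits-wide signed value by any count."""
    if count < 0:
        raise ValueError("a negative count means a left shift; not handled here")
    if count >= word_bits:
        # Every value bit has been shifted out; only the sign survives.
        return -1 if value < 0 else 0
    return value >> count
```

The shortcut branch never looks at how large count is beyond the comparison, so a BigInt count costs nothing extra.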

   : - bitwise_lsr       is missing generally
   : 
   : or even just a plain opcode only:
   : 
   : - logical_{or,and,xor}  return a PMC depending on the boolean value
   : 
   : What are HLLs expecting of these infix operations?

   Perl 6 tends to distinguish these as different operators, though Perl 5
   did overload the bitwise ops on both strings and numbers, which newbies
   found confusing in ambiguous cases, which is why we changed it.

[FWIW, Common Lisp can't use these ops, as it has a different idea of
logical truth.  And that's the honest (not nil).  ;-} ]

   : OTOH it might be useful that the current get__keyed operations 
   : (postcircumfix:[]) become MMD subroutines:
   : 
   :   Px = Py[Pz]       Pz = String, Int, Key, Slice, ...

   At the moment, the Perl 6 optimizer is explicitly allowed to optimize
   array indices with the assumption that the subscript is a scalar
   (or slice) of integer, or something that converts to integer . . .

   Larry

By the same token, couldn't one reasonably ask for a boolean array that
required BigInt subscripts, even on said 32-bit machine?  (Once boolean
arrays actually store one element per bit, that is.)  Or are subscripts
this large ruled out?

   Or are you using "integer" conceptually to include both Integer and
BigInt?

-- Bob Rogers
   http://rgrjr.dyndns.org/

Re: Various questions

2005-04-15 Thread Chip Salzenberg

According to Philip Taylor:
> * I can usually handle unsigned numbers by pretending they're signed and 
> using 'I' registers, but some things appear to be awkward without new 
> ops - in particular, div and cmod, and le/lt/ge/gt comparisons. (As far 
> as I can tell, those are the only ones C would need; everything else 
> should work fine with the signed variants).

Don't you also need unsigned assignment to N registers?

double d = 0xUL;

> I've added divu/leu/etc ops to math.ops/cmp.ops (and just made them cast 
> their operands into UINTVALs) - is that a reasonable thing to do? Would 
> they be better in a new .ops file?

May as well leave them there.
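
A rough Python model of what "cast the operands to UINTVAL" amounts to. It assumes 32-bit integer registers, which the thread does not state, and the helper names are illustrative rather than the actual ops.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit INTVAL registers


def to_unsigned(x):
    """Reinterpret a signed 32-bit value as unsigned (the UINTVAL cast)."""
    return x & MASK32


def to_signed(x):
    """Reinterpret an unsigned 32-bit value as signed."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def divu(a, b):
    """Unsigned division on signed register contents."""
    return to_signed(to_unsigned(a) // to_unsigned(b))


def ltu(a, b):
    """Unsigned less-than on signed register contents."""
    return to_unsigned(a) < to_unsigned(b)
```

The same cast covers the assignment to a floating-point register: float(to_unsigned(-1)) gives 4294967295.0 where a signed conversion would give -1.0.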

> * Should there be an 'isatty' op/method?

I think so.  I wouldn't tie it to the fileno() concept, because
fileno() is less portable than isatty(filehandle), which is a
reasonable sort of question beyond the bounds of Unix, in the Great
Wilderness.

> * Is it possible to merge PBC files together, like load_bytecode but at 
> compile-time?

I'll punt on this one for now... Leo?

> I've been using [gs]et_integer_keyed_int on a PMC to allow pointer 
> access. Since it reads whole ints, it probably crashes unnecessarily 
> when e.g. reading chars at unlucky addresses

Yes ... on some arch's.  Not x86, though, so I'm safe.  :-)

> but IMC code like "val = mem.read_i1(ptr)" feels unpleasantly
> inefficient, particularly in string-processing loops.

What about a native-code _function_ rather than an object method?
-- 
Chip Salzenberg- a.k.a. -<[EMAIL PROTECTED]>
 Open Source is not an excuse to write fun code
then leave the actual work to others.

Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Chip Salzenberg

According to chromatic:
> On Fri, 2005-04-15 at 23:52 +0200, Juerd wrote:
> > Well, after failure it can be cwd() but false without breaking any real
> > code, because normally, you'd never if (cwd) { ... }, simply because
> > there's ALWAYS a cwd.
> 
> Not always -- try removing a directory that's the pwd of another
> process.

Oh, the _directory_ is still there.  :-)

Unify cwd() [was: Re: $*CWD instead of chdir() and cwd()]

2005-04-15 Thread Michael G Schwern

On Fri, Apr 15, 2005 at 08:31:57PM -0400, Chip Salzenberg wrote:
> According to Michael G Schwern:
> > And this is exactly what File::chdir does.  $CWD is a tied scalar.
> 
> I don't think current directory maps well on a variable.  That won't
> stop people from using it, of course.  :-(
> 
> There are several methods to determine the current directory.  Each
> one has its corner cases, strengths and weaknesses (thus the
> proliferation of Cwd module functions), and it doesn't make any sense
> to me to elevate one over the rest through the proposed $CWD.

This is orthogonal to $CWD.  

Perl 6 is going to have to decide on some sort of standard internal getcwd 
technique, $CWD or not.  In the same way that we have open() not fopen, 
fdopen, freopen... we can choose the safest and most sensible technique for 
determining the cwd and use that.  You have to because when a new user asks 
"how do I get the current working directory?" you want to say "cwd()" and 
not "Well, there are a variety of different techniques..."  Cwd.pm is a 
perfect example of this problem.  Which one should a user use?  Most folks 
just won't care and the micro-differences between the functions in Cwd.pm
aren't worth the trouble.  

Present a sensible default.  Write a module with all the other options for 
those who need it.

>   mkdir '/tmp/foo';
>   $CWD = '/tmp/foo';
>   rename '../foo', '../bar';
>   say $CWD;  # Well?  Which is it?

It's exactly the same as...

mkdir '/tmp/foo';
chdir '/tmp/foo';
rename '../foo', '../bar';
say cwd();
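
The same sequence can be tried directly. This is a Python sketch working in a throwaway temporary directory instead of /tmp/foo. What it reports is up to the operating system: on Linux the renamed name is typically what comes back, but that is an assumption about the platform.

```python
import os
import shutil
import tempfile


def cwd_after_rename():
    """Mirror the mkdir/chdir/rename sequence; return the cwd's last component."""
    start = os.getcwd()
    base = os.path.realpath(tempfile.mkdtemp())
    try:
        os.mkdir(os.path.join(base, "foo"))
        os.chdir(os.path.join(base, "foo"))
        os.rename("../foo", "../bar")  # rename the directory we are standing in
        return os.path.basename(os.getcwd())
    finally:
        os.chdir(start)  # restore the original cwd before cleaning up
        shutil.rmtree(base, ignore_errors=True)
```

Whatever it prints, a variable and a function both get the answer from the same system call, so the spelling makes no difference.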

Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Chip Salzenberg

According to Michael G Schwern:
> And this is exactly what File::chdir does.  $CWD is a tied scalar.

I don't think current directory maps well on a variable.  That won't
stop people from using it, of course.  :-(

There are several methods to determine the current directory.  Each
one has its corner cases, strengths and weaknesses (thus the
proliferation of Cwd module functions), and it doesn't make any sense
to me to elevate one over the rest through the proposed $CWD.

mkdir '/tmp/foo';
$CWD = '/tmp/foo';
rename '../foo', '../bar';
say $CWD;  # Well?  Which is it?


Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 03:22:48PM -0700, Michael G Schwern wrote:
: On Fri, Apr 15, 2005 at 11:52:38PM +0200, Juerd wrote:
: > > becomes an unverifiable operation.  You have to use chdir() if you want to
: > > error check and $CWD is reduced to a "scripting" feature.
: > 
: > Well, after failure it can be cwd() but false without breaking any real
: > code, because normally, you'd never if (cwd) { ... }, simply because
: > there's ALWAYS a cwd. If this is done, the thing returned by the STORE
: > can still be an lvalue and thus be properly reffed.
: 
: Good idea!

But if cwd() or chdir() doesn't fail(), you probably won't get any
information on *why* the chdir failed in either the return value or $!.
That could be construed as antisocial.

In general I think "but" should be reserved for situations where the
original interface designer showed sufficient lack of imagination to
warrant such workarounds.  That is how I treated all the RFCs that
made use of "but" for built-in functionality, and I haven't seen any
good reasons to alter my views on that.  About the closest we get
to it is that "interesting values of undef" can be thought of as new
Exception(...) but undefined, or some such.  But even that is usually
hidden behind the fail() predicate, and the undef role is probably
composed into exceptions in the first place.  Or maybe it's the
other way around.

Larry

Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Michael G Schwern

On Fri, Apr 15, 2005 at 11:52:38PM +0200, Juerd wrote:
> > becomes an unverifiable operation.  You have to use chdir() if you want to
> > error check and $CWD is reduced to a "scripting" feature.
> 
> Well, after failure it can be cwd() but false without breaking any real
> code, because normally, you'd never if (cwd) { ... }, simply because
> there's ALWAYS a cwd. If this is done, the thing returned by the STORE
> can still be an lvalue and thus be properly reffed.

Good idea!

Re: [pugs] Quoting constructs

2005-04-15 Thread Steven Philip Schubiger

On 16 Apr, Roie Marianer wrote:

: By the way, something tells me perl6-compiler isn't the best place for this
: discussion. Is there a secret group of people that discusses cornercases for
: perl6, and if so could someone tell me on what list they live?

You most likely want perl6-language, where Larry, among others,
participates.

Steven

Announcing Test::TAP::Model and Test::TAP::HTMLMatrix

2005-04-15 Thread Yuval Kogman

Hola...

The code used to generate pugs smoke HTMLs (like
http://nothingmuch.woobling.org/pugs_test_status/ - warning around
800K), was refactored into two perl (5) modules, now (that is, when
your mirror has synched) available on the CPAN.

This code is authored by many of the pugs authors. If you feel the
need to discuss it, I think #perl6 on freenode is the place. In any
case, I'm not authoritative, as this code is not only mine.

In order to honor the fine tradition of releng breakage, both 0.01
versions are crummy. Use 0.02. Sorry =(

The two darcs repos for these modules are:

http://nothingmuch.woobling.org/Test-TAP-Model
http://nothingmuch.woobling.org/Test-TAP-HTMLMatrix

Test::TAP::Model wraps around Test::Harness::Straps and gives a sort
of souped up DOM to the TAP data that was collected, and
Test::TAP::HTMLMatrix creates the HTML using this DOM and a Petal
template.

Ciao!

-- 
 ()  Yuval Kogman <[EMAIL PROTECTED]> 0xEBD27418  perl hacker &
 /\  kung foo master: /methinks long and hard, and runs away: neeyah!!!




Re: Comparing rationals/floats

2005-04-15 Thread Doug McNutt

At 16:18 -0700 4/15/05, gcomnz wrote:
>More questions stemming from cookbook work... Decimal Comparisons:
>
>The most common recipe around for comparisons is to use sprintf to cut
>the decimals to size and then compare strings. Seems ugly.
>
>The non-stringification way to do it is usually along the lines of:
>
>if (abs($value1 - $value2) < abs($value1 * epsilon))
>
>(From Mastering Algorithms with Perl errata)
>
>I'm wondering though, if C<$value1 == $value2> is always wrong (or
>almost always wrong) then should it be smarter and:
>SNIP
>Marcus Adair

I have longed for an OO class that might be called "measurement". An object 
would include a float, a unit of measure, and an estimate of accuracy.

Mathematical operations would be overloaded so that the result of a calculation 
would appropriately handle propagation of the arguments' accuracies into the 
result. It might even do unit conversions but that's another subject. Coercion 
of a float into a measurement would be automatic with infinite precision 
assumed.

Given the new class it is easy to adjust comparison operators to calculate 
"within experimental error".
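
A minimal sketch of such a class, in Python. The class and method names are hypothetical, errors are assumed independent and are combined with the usual first-order quadrature rules, and unit handling is left out.

```python
import math


class Measurement:
    """A value with a unit label and an absolute uncertainty (names hypothetical)."""

    def __init__(self, value, uncertainty=0.0, unit=""):
        self.value = float(value)
        self.uncertainty = abs(float(uncertainty))
        self.unit = unit

    @staticmethod
    def coerce(other):
        # Plain numbers are treated as exact, i.e. zero uncertainty.
        return other if isinstance(other, Measurement) else Measurement(other)

    def __add__(self, other):
        other = Measurement.coerce(other)
        # Independent absolute errors add in quadrature.
        err = math.hypot(self.uncertainty, other.uncertainty)
        return Measurement(self.value + other.value, err, self.unit)

    def __mul__(self, other):
        other = Measurement.coerce(other)
        # First-order propagation: each error is scaled by the other factor.
        err = math.hypot(self.uncertainty * other.value,
                         other.uncertainty * self.value)
        return Measurement(self.value * other.value, err, self.unit)

    def __eq__(self, other):
        # "Equal within experimental error": the error intervals overlap.
        other = Measurement.coerce(other)
        return abs(self.value - other.value) <= self.uncertainty + other.uncertainty
```

With this, Measurement(10.0, 0.1) == Measurement(10.05, 0.1) is true while Measurement(10.0, 0.1) == Measurement(11.0, 0.1) is not. Note that equality defined this way is not transitive.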

-- 

--> Life begins at ovulation. Ladies should endeavor to get every young life 
fertilized. <--

Re: [pugs] Quoting constructs

2005-04-15 Thread Roie Marianer

On Friday 15 April 2005 3:27 am, Larry Wall wrote:
> On Fri, Apr 15, 2005 at 03:27:27AM +0300, Roie Marianer wrote:
> : > %hash<< a $key_b c >>  :key<< a $value_b c >>
> : > %hash« a $key_b c »:key« a $value_b c »
> :
> : Just to be certain, these are both equivalent to
> :
> :  @hash{'a', $key_b, 'c'} key => ['a', $value_b, 'c']
> :
> : in Perl 5, right?
>
> Close.  It's actually more like:
>
> @hash{split " ", "a $key_b c"}key => [split " ", "a $value_b c"]

I actually knew that, but in my head $key_b and $value_b were single words. 
But according to S02, the interpolation is protected by quotes. That is, if 
$key_b is q0/printf "Hello, world\n" or die"/, that's four words, correct? Or 
is it just if the quotes actually appear in the quoting construct? Basically 
I'm wondering if there's a detailed specification of how <<>> should work.
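
The split " " reading can be modeled in Python as follows. The sketch interpolates first and splits afterwards, and it deliberately ignores the quote protection that S02 is said to provide. The function name is made up.

```python
def quote_words(template, **variables):
    """Interpolate first, then split on whitespace, like split " ", "a $key_b c"."""
    return template.format(**variables).split()
```

Under this reading, quote_words("a {key_b} c", key_b="x y") yields four words, because the whitespace inside the interpolated value is split like any other.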

Several only-slightly-related questions about interpolating:

1. qq x$varx eq $var? (That's how it works in Perl5, anyway)

2. If the delimiter is not a single character (I think this only applies to 
<<>>), does a backslash protect the first character or both? For example, in
 <<some words \>>> or die
Is that three words ['some', 'words', '>'] with the >> ending the construct, 
or is that ['some', 'words', '>>>', 'or', 'die']? (and the rest of the file 
is interpolated and split into words)

3. Are <<>>-style delimiters allowed in other quoting constructs? Is 
q<> the string "Hello", or the string "> yet 
at all.)

My head hurts. :-)

By the way, something tells me perl6-compiler isn't the best place for this 
discussion. Is there a secret group of people that discusses cornercases for 
perl6, and if so could someone tell me on what list they live?
-- 
-Roie
v2sw6+7CPhw5ln5pr4/6$ck2ma8+9u7/8LSw2l6Fi2e2+8t4TNDSb8/4Aen4+7g5Za22p7/8
[ http://www.hackerkey.com ]

Comparing rationals/floats

2005-04-15 Thread gcomnz

More questions stemming from cookbook work... Decimal Comparisons:

The most common recipe around for comparisons is to use sprintf to cut
the decimals to size and then compare strings. Seems ugly.

The non-stringification way to do it is usually along the lines of: 

if (abs($value1 - $value2) < abs($value1 * epsilon))

(From Mastering Algorithms with Perl errata)
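
For reference, the same relative test as a small Python function. The default epsilon is an arbitrary choice made for this sketch, not a value from the book.

```python
def approx_equal(value1, value2, epsilon=1e-9):
    """Relative comparison: the difference is small compared with value1."""
    return abs(value1 - value2) < abs(value1 * epsilon)
```

One caveat of this exact formula: when value1 is zero the right-hand side is zero, so the test can never succeed, not even for approx_equal(0.0, 0.0). Python's math.isclose handles this case with a separate abs_tol argument.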

I'm wondering though, if C<$value1 == $value2> is always wrong (or
almost always wrong) then should it be smarter and:

a. throw a warning
b. DWIM using overloaded operators (as in reduce precision then compare)
c. throw a warning but have other comparison operators just for this
case to make sure you know what you're doing

I'd vote for b., but I don't know enough about the problem domain to
know if that is safe, and realistically I just want to write the
cookbook entry rather than start a math-geniuses flame war ;-)

Which leads to another question: Are there $value.precision() and
$value.accuracy() methods available for decimals? I'd really rather
not do the string comparison if it can be avoided, maybe it's just the
purist in me saying "leave the numbers be" :-)

Apologies in advance if this is somewhere I missed. I did a lot of searching.

Marcus Adair

Re: nbsp in \s, and <>

2005-04-15 Thread Mark Reed

I thought we had just established that nbsp is not in Unicode's definition
of whitespace.  So why should \s match it?



On 2005-04-15 18:56, "Larry Wall" <[EMAIL PROTECTED]> wrote:

> On Sat, Apr 16, 2005 at 12:46:47AM +0200, Juerd wrote:
> : Larry Wall skribis 2005-04-15 15:38 (-0700):
> : > : Do \s and  match non-breaking whitespace, U+00A0?
> : > Yes. 
> : 
> : That makes \s+ and \s*, and thus  very useless for anything but
> : trimming whitespace. For splitting (including word wrapping), it'd do
> : exactly the wrong thing.
> 
> Maybe we just need a  for breaking white space, or some such.
>  is primarily used in pattern matching with :w, where a
> non-breaking space in the input would presumably be matched by a
> non-breaking space in the pattern, or maybe an explicit .
> As long as patterns (with or without :w) treat non-breaking spaces
> as ordinary matching characters, it should work out, methinks.
> Though it's probably a hair more readable to use an explicit ...
> 
> Larry 
>

Re: Heredocs: How equal are bunches of spaces to tabs?

2005-04-15 Thread Larry Wall

On Sat, Apr 16, 2005 at 12:11:24AM +0200, Juerd wrote:
: Pasted from pugs/examples/cookbook/01-00introduction.p6:
: 
: # XXX - question: How equal are bunches of spaces to tabs?
: #   -- I'd say that's a question for perl6lang

This seems to be singularly short on context, but if it has to do with
trimming leading whitespace from heredocs, A2 already discusses this.

Larry

Re: nbsp in \s, and <>

2005-04-15 Thread Larry Wall

On Sat, Apr 16, 2005 at 12:46:47AM +0200, Juerd wrote:
: Larry Wall skribis 2005-04-15 15:38 (-0700):
: > : Do \s and  match non-breaking whitespace, U+00A0?
: > Yes.
: 
: That makes \s+ and \s*, and thus  very useless for anything but
: trimming whitespace. For splitting (including word wrapping), it'd do
: exactly the wrong thing.

Maybe we just need a  for breaking white space, or some such.
 is primarily used in pattern matching with :w, where a
non-breaking space in the input would presumably be matched by a
non-breaking space in the pattern, or maybe an explicit .
As long as patterns (with or without :w) treat non-breaking spaces
as ordinary matching characters, it should work out, methinks.
Though it's probably a hair more readable to use an explicit ...

Larry

Re: nbsp in \s, and <>

2005-04-15 Thread Juerd

Larry Wall skribis 2005-04-15 15:38 (-0700):
> : Do \s and  match non-breaking whitespace, U+00A0?
> Yes.

That makes \s+ and \s*, and thus  very useless for anything but
trimming whitespace. For splitting (including word wrapping), it'd do
exactly the wrong thing.

> : \s is said (in S05) to match any unicode whitespace, but letting it
> : match NBSP and then using \s for splitting things is wrong, I think.
> Perhaps the default word split should not be based on \s then.

It'd have to.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

Re: nbsp in \s, and <>

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 11:44:03PM +0200, Juerd wrote:
: Is there a -like thingy that is always \s+?

Not currently, since \s+ is there.   used to be that, but
currently is defined as the magical whitespace matcher used by :words.

: Do \s and  match non-breaking whitespace, U+00A0?

Yes.

: How about:
: 
: U+0008  backspace
: U+00A0  no break space (Repeated for overview)
: U+1361  ethiopic wordspace
: U+2000  en quad
: U+2001  em quad
: U+2002  en space
: U+2003  em space
: U+2004  three per em space
: U+2005  four per em space
: U+2006  six per em space
: U+2007  figure space
: U+2008  punctuation space
: U+2009  thin space 
: U+200A  hair space
: U+200B  zero width space
: U+202F  narrow no break space
: U+205F  medium mathematic space
: U+2060  word joiner (What is that, anyway?)
: U+3000  ideographic space
: U+FEFF  zero width non-breaking space

Yes, any Unicode whitespace, but you seem to have a different list than
I do.  Outside of the standard ASCIIish control-character whitespace,
I count only the \pZ characters, not the \pC characters, so I don't have
to tell you what a word-joiner is, since it's a \p[Cf] character.  :-)

I will also gleefully ignore the existence of BOMs.

So I make it:

0020;SPACE;Zs;0;WS;N;
00A0;NO-BREAK SPACE;Zs;0;CS; 0020;N;NON-BREAKING SPACE
1680;OGHAM SPACE MARK;Zs;0;WS;N;
180E;MONGOLIAN VOWEL SEPARATOR;Zs;0;WS;N;
2000;EN QUAD;Zs;0;WS;2002;N;
2001;EM QUAD;Zs;0;WS;2003;N;
2002;EN SPACE;Zs;0;WS; 0020;N;
2003;EM SPACE;Zs;0;WS; 0020;N;
2004;THREE-PER-EM SPACE;Zs;0;WS; 0020;N;
2005;FOUR-PER-EM SPACE;Zs;0;WS; 0020;N;
2006;SIX-PER-EM SPACE;Zs;0;WS; 0020;N;
2007;FIGURE SPACE;Zs;0;WS; 0020;N;
2008;PUNCTUATION SPACE;Zs;0;WS; 0020;N;
2009;THIN SPACE;Zs;0;WS; 0020;N;
200A;HAIR SPACE;Zs;0;WS; 0020;N;
200B;ZERO WIDTH SPACE;Zs;0;BN;N;
2028;LINE SEPARATOR;Zl;0;WS;N;
2029;PARAGRAPH SEPARATOR;Zp;0;B;N;
202F;NARROW NO-BREAK SPACE;Zs;0;WS; 0020;N;
205F;MEDIUM MATHEMATICAL SPACE;Zs;0;WS; 0020;N;
3000;IDEOGRAPHIC SPACE;Zs;0;WS; 0020;N;
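
The \pZ versus \pC distinction can be checked against whatever Unicode data a language ships with. Here is a Python sketch using the standard unicodedata module; the helper name is made up. The results depend on the Unicode version bundled with the interpreter, and later versions of the standard moved U+200B and U+180E to category Cf, so they may not agree with the list above.

```python
import unicodedata


def separator_category(codepoint):
    """Return the general category if it is a Z* separator, else None."""
    category = unicodedata.category(chr(codepoint))
    return category if category.startswith("Z") else None


for cp in (0x0020, 0x00A0, 0x2028, 0x2029, 0x2060, 0xFEFF):
    # Print the general category next to each code point.
    print(f"U+{cp:04X} {unicodedata.category(chr(cp))}")
```

U+00A0 comes back as Zs, while the word joiner U+2060 and the BOM U+FEFF are format characters (Cf) and so fall outside \pZ.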

: \s is said (in S05) to match any unicode whitespace, but letting it
: match NBSP and then using \s for splitting things is wrong, I think.

Perhaps the default word split should not be based on \s then.
It's just one more difference, in addition to trimming leading and
trailing whitespace like awk.

: Are the contents of <> split using ? (Is <<$foo>>, where $foo is
: "foo\xA0bar", one or two elements?)

That is using the default word splitter (or it *is* the default word
splitter), so if the default word split is based on <+[\s]-[\xA0]>
it would be one element.
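
A word splitter along the lines of <+[\s]-[\xA0]> can be approximated in Python, where \s also matches U+00A0 for text strings. This is a sketch of the idea, not of the Perl 6 rule engine, and the names are made up.

```python
import re

# Whitespace except U+00A0, written as "not (non-whitespace or NBSP)".
BREAKING_SPACE = re.compile(r"[^\S\xa0]+")


def split_words(text):
    """Split on runs of whitespace, treating NBSP as an ordinary character."""
    return [word for word in BREAKING_SPACE.split(text) if word]
```

With this splitter "foo\xa0bar" stays one element, whereas Python's plain "foo\xa0bar".split() breaks it in two. Leading and trailing whitespace are trimmed as a side effect of dropping empty fields.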

Of course, the ZERO WIDTH SPACE is a nasty critter for anyone using
whitespace to separate tokens.  That and maybe thin spaces probably
merit warnings in Perl code where they might cause visual ambiguity.

Larry

Re: nbsp in \s, and <>

2005-04-15 Thread Juerd

Aaron Sherman skribis 2005-04-15 18:20 (-0400):
> > Is there a -like thingy that is always \s+?
> Not sure what that means exactly.

 is \s* or \s+, depending on its surroundings.

> Thankfully, NBSP (U+00A0) is not Unicode whitespace.

Thanks for sharing this information!


Juerd

Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Juerd

chromatic skribis 2005-04-15 15:18 (-0700):
> > Well, after failure it can be cwd() but false without breaking any real
> > code, because normally, you'd never if (cwd) { ... }, simply because
> > there's ALWAYS a cwd.
> Not always -- try removing a directory that's the pwd of another
> process.

Results in EPERM indeed :(


Juerd

Re: nbsp in \s, and <>

2005-04-15 Thread Aaron Sherman

On Fri, 2005-04-15 at 17:44, Juerd wrote:
> Is there a -like thingy that is always \s+?

Not sure what that means exactly.

> Do \s and  match non-breaking whitespace, U+00A0?

As I understood, Perl 6 was going to use the Unicode standard(s) to
determine the whitespacishness of each codepoint. Going to Google, I
find:

http://www.fileformat.info/info/unicode/category/Zs/list.htm

which lists all of the "separator, space" characters.

> How about:
> 
> U+0008  backspace
Character.isWhitespace() No
> U+00A0  no break space (Repeated for overview)
Character.isWhitespace() No
> U+1361  ethiopic wordspace
Character.isWhitespace() No
> U+2000  en quad
Character.isWhitespace() Yes
> U+2001  em quad
Character.isWhitespace() Yes
> U+2002  en space
Character.isWhitespace() Yes
> U+2003  em space
Character.isWhitespace() Yes
> U+2004  three per em space
Character.isWhitespace() Yes
> U+2005  four per em space
Character.isWhitespace() Yes
> U+2006  six per em space
Character.isWhitespace() Yes
> U+2007  figure space
Character.isWhitespace() No
> U+2008  punctuation space
Character.isWhitespace() Yes
> U+2009  thin space 
Character.isWhitespace() Yes
> U+200A  hair space
Character.isWhitespace() Yes
> U+200B  zero width space
Character.isWhitespace() Yes
> U+202F  narrow no break space
Character.isWhitespace() No
> U+205F  medium mathematic space
Character.isWhitespace() Yes
> U+2060  word joiner (What is that, anyway?)
Character.isWhitespace() No
Comments WJ
a zero width non-breaking space (only)
intended for disambiguation of functions for byte order mark
> U+3000  ideographic space
Character.isWhitespace() Yes
> U+FEFF  zero width non-breaking space
Character.isWhitespace() No

> \s is said (in S05) to match any unicode whitespace, but letting it
> match NBSP and then using \s for splitting things is wrong, I think.

Thankfully, NBSP (U+00A0) is not Unicode whitespace.

-- 
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback

Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread chromatic

On Fri, 2005-04-15 at 23:52 +0200, Juerd wrote:

> Well, after failure it can be cwd() but false without breaking any real
> code, because normally, you'd never if (cwd) { ... }, simply because
> there's ALWAYS a cwd.

Not always -- try removing a directory that's the pwd of another
process.

-- c

Heredocs: How equal are bunches of spaces to tabs?

2005-04-15 Thread Juerd

Pasted from pugs/examples/cookbook/01-00introduction.p6:

# XXX - question: How equal are bunches of spaces to tabs?
#   -- I'd say that's a question for perl6lang


Juerd

Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 01:12:46PM -0700, Michael G Schwern wrote:
: Thus spake Larry Wall:
: > Offhand, I guess my main semantic problem with it is that if a chdir
: > fails, you aren't in an undefined location, which the new value of $CWD
: > would seem to indicate.  You're just where you were.  Then the user
: > either has to remember that, or there still has to be some other
: > means of finding out the real location.
: 
: To be clear:  Only the store operation will return undef on failure.  

That doesn't square with the notion that an assignment returns the
actual lvalue:

($new = $old) =~ s/foo/bar/;

: Additional fetches on $CWD will continue to return the cwd.
: 
:   $CWD = '/path/which/exists';
:   $CWD = '/i/do/not/exist' err warn $!;
:   print $CWD;
: 
: This prints /path/which/exists/.

Except that the err should be looking at $CWD, not some other return value
of the assignment.

: > The other problem with it is the fact that people will assign relative
: > paths to it and expect to get the relative path back out instead
: > of the absolute path.
: 
: I honestly never had this problem until I sat down and thought about it. :)
: THEN I got all confused and started to do things like $CWD .= '/subdir';
: instead of simply $CWD = 'subdir';.  But the rule is simple and natural.
: It takes a relative or absolute directory and ALWAYS returns an absolute 
: path.  Lax in what inputs it accepts, strict in what it emits.  This is no
: more to remember than what chdir() and cwd() would do.
: 
: The result from $CWD would simply be a Dir object similar to Ken Williams' 
: Path::Class or Ruby's Dir object.  One of the methods would be .relative.
: 
: I didn't bring up @CWD because I thought it would be too much in one sitting.
: Basically it allows you to do this:
: 
:   pop @CWD;   # chdir ('..');
:   push @CWD, 'dir';   # chdir ('dir');
:   print $CWD[0];  # (File::Spec->splitdir(abs_path()))[0];
:   # ie. What top level directory am I in?
: 
: and all sorts of other operations that would normally involve a lot of
: splitdir'ing.
: 
: And then there's %CWD which I'm toying with being a per-volume chdir like
: you can do on Windows but that may be too much of a questionable thing.

You could multiplex both the array and hash roles into the object
returned by $CWD, much like the $/ pattern match result object can
be subscripted as either $/[1] or $/.  $CWD would itself
behave like a string in string context, but $CWD[] would get you to
the array value, and $CWD{} the hash value for systems that have
more than one current directory.

: > Your assumption there is a bit inaccurate--in P6 you are allowed to
: > temporize (localize) the effects of functions and methods that are
: > prepared to deal with it.  
: 
: Yeah, we were talking about it on #perl6 a bit.  That seems to me the more
: bizarre idea than assigning to something which can fail.  Localizing an
: assignment is easy, there's just one thing to revert.  But function calls can
: do lots of things.  Just how much does it reverse?  I guess if it's used
: sensibly on sharp functions, such as chdir, and the behavior is 
: user-definable it can work but I don't know if the behavior will ever
: be obvious for anything beyond the trivial.

The function reverses whatever its TEMP property's closure knows how
to reverse.  It's up to the function to know what its side effects are
and arrange to undo them.

: FWIW my prompting to write File::chdir was a desire for "local chdir".
: So if "temp chdir" can be made to work that would solve most of the problem.
: 
: If nothing else perhaps chdir() should be eliminated and cwd() simply takes
: an argument to make it a getter/setter.

If you're going to throw away the verb then the noun might as well be
a variable.  But I like verbs for their readability, even if the verb
is "push".  Note that "push" could be made to work with "temp" as well:

temp push $CWD, "subdir" err fail "..."

This would automatically pop $CWD at the end of the dynamic scope.
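
The closest everyday analogue of "temp push" is a scoped chdir that undoes itself. Here is a Python sketch using a context manager. The names pushed_cwd and demo are made up, and this models only the undo-on-scope-exit behaviour, not Perl 6's TEMP property.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def pushed_cwd(subdir):
    """chdir into subdir for the duration of the with block, then go back."""
    previous = os.getcwd()
    os.chdir(subdir)  # raises OSError on failure, before anything has changed
    try:
        yield os.getcwd()
    finally:
        os.chdir(previous)  # the automatic "pop" at the end of the scope


def demo():
    """Enter a fresh subdirectory and report the cwd before, during and after."""
    base = os.path.realpath(tempfile.mkdtemp())
    target = os.path.join(base, "subdir")
    os.mkdir(target)
    before = os.getcwd()
    with pushed_cwd(target) as inside:
        during = os.path.realpath(inside)
    return before, during, os.getcwd(), target
```

The failure case keeps the verb's advantage: a failed chdir raises with the reason attached, and the cwd is left untouched.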

: > However, I agree that it's nice to have an
: > easily interpolatable value.  So I think I'd rather see $CWD always
: > return the current absolute path even after failure
: 
: The problem there is it leaves $CWD without an error mechanism and thus
: becomes an unverifiable operation.  You have to use chdir() if you want to
: error check and $CWD is reduced to a "scripting" feature.

That was my point.  And if you look back at what you wrote, you just
called $CWD an "operation".  It's not--it's a noun.  I like nouns,
but I also like verbs, and unlike in Perl 5 we don't have to rely on
the magical side effects of certain mystical nouns to do localization
any more.

But I don't understand what you mean by a "scripting" feature, or
how getting reduced to one is antithetical to a blissful existence.

: It could throw an exception but then you have to wrap everything in a try
: block.  U

Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Juerd

Michael G Schwern skribis 2005-04-15 13:12 (-0700):
> To be clear:  Only the store operation will return undef on failure.  
> Additional fetches on $CWD will continue to return the cwd.

Still breaks

$ref = \($CWD = $foo);

I'm not sure this breakage matters, but if it breaks one thing, it's
likely to break more than just that one thing, and I wonder how much
attention this has been given.

Hm, but $CWD++ is nice! Especially if after photos9 it goes to photos10,
and not photot0. How does string ++ work in Perl 6, anyway?

> The problem there is it leaves $CWD without an error mechanism and thus
> becomes an unverifiable operation.  You have to use chdir() if you want to
> error check and $CWD is reduced to a "scripting" feature.

Well, after failure it can be cwd() but false without breaking any real
code, because normally, you'd never if (cwd) { ... }, simply because
there's ALWAYS a cwd. If this is done, the thing returned by the STORE
can still be an lvalue and thus be properly reffed.

This would mean you'd use or instead of err, but I don't understand the
point of err meaning "error" together with the introduction of
true-but-false values anyway. Low-prec // should imo just be spelled
dor. But it's too late for that, of course.


Juerd

nbsp in \s, and <>

2005-04-15 Thread Juerd

Is there a -like thingy that is always \s+?

Do \s and  match non-breaking whitespace, U+00A0?

How about:

U+0008  backspace
U+00A0  no break space (Repeated for overview)
U+1361  ethiopic wordspace
U+2000  en quad
U+2001  em quad
U+2002  en space
U+2003  em space
U+2004  three per em space
U+2005  four per em space
U+2006  six per em space
U+2007  figure space
U+2008  punctuation space
U+2009  thin space 
U+200A  hair space
U+200B  zero width space
U+202F  narrow no break space
U+205F  medium mathematic space
U+2060  word joiner (What is that, anyway?)
U+3000  ideographic space
U+FEFF  zero width non-breaking space

\s is said (in S05) to match any unicode whitespace, but letting it
match NBSP and then using \s for splitting things is wrong, I think.
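
To make the concern concrete, here is a small sketch in Python (not Perl 6) of what goes wrong when "any Unicode whitespace" is used for splitting. The set of non-breaking spaces in NON_BREAKING and both function names are assumptions chosen for illustration; nothing here is taken from S05.

```python
import re
import unicodedata

# Assumed set of spaces that should NOT separate words:
# NO-BREAK SPACE, FIGURE SPACE, NARROW NO-BREAK SPACE.
NON_BREAKING = "\u00a0\u2007\u202f"


def split_all_whitespace(text):
    """Split on every whitespace character, NBSP included."""
    return text.split()


def split_breaking_whitespace(text):
    """Split on whitespace, but keep the non-breaking spaces intact."""
    # [^\S...] reads as: whitespace, except the characters listed after \S.
    pattern = "[^\\S" + NON_BREAKING + "]+"
    return [part for part in re.split(pattern, text) if part]


def general_category(codepoint):
    """Unicode general category, e.g. 'Zs' for space separators."""
    return unicodedata.category(chr(codepoint))
```

For "foo\xA0bar baz" the first function returns three elements and the second returns two, which is the difference the question about <<$foo>> is asking about. general_category also shows that several entries in the list above (U+0008, U+200B, U+2060, U+FEFF) are not space separators at all in the Unicode database.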

Are the contents of <> split using ? (Is <<$foo>>, where $foo is
"foo\xA0bar", one or two elements?)


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

Re: Statement modifier scope

2005-04-15 Thread Juerd

Paul Seamons skribis 2005-04-15 13:42 (-0600):
> Each of the declarations my, our and local currently set the value to 
> undefined (unless set = to something).

That's not true.

use strict;
$::foo = 5;
our $foo;
print $foo;  # 5


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Michael G Schwern

Thus spake Larry Wall:
> Offhand, I guess my main semantic problem with it is that if a chdir
> fails, you aren't in an undefined location, which the new value of $CWD
> would seem to indicate.  You're just where you were.  Then the user
> either has to remember that, or there still has to be some other
> means of finding out the real location.

To be clear:  Only the store operation will return undef on failure.  
Additional fetches on $CWD will continue to return the cwd.

$CWD = '/path/which/exists';
$CWD = '/i/do/not/exist' err warn $!;
print $CWD;

This prints /path/which/exists/.


> The other problem with it is the fact that people will assign relative
> paths to it and expect to get the relative path back out instead
> of the absolute path.

I honestly never had this problem until I sat down and thought about it. :)
THEN I got all confused and started to do things like $CWD .= '/subdir';
instead of simply $CWD = 'subdir';.  But the rule is simple and natural.
It takes a relative or absolute directory and ALWAYS returns an absolute 
path.  Lax in what inputs it accepts, strict in what it emits.  This is no
more to remember than what chdir() and cwd() would do.
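
A rough model of that rule, sketched in Python rather than Perl so as not to suggest any particular Perl 6 syntax. The class name Cwd and its store/fetch methods are hypothetical stand-ins for assigning to and reading $CWD; this is not how File::chdir is implemented.

```python
import os


class Cwd:
    """Hypothetical stand-in for the proposed $CWD variable."""

    def store(self, path):
        """Accept a relative or absolute path; return the new absolute
        cwd on success, or None if the directory change failed."""
        try:
            os.chdir(path)
        except OSError:
            return None
        return os.getcwd()

    def fetch(self):
        """Always report the real, absolute cwd, even after a failed store."""
        return os.getcwd()
```

The point of the split is that only store can report failure; fetch never returns anything but the directory the process is actually in.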

The result from $CWD would simply be a Dir object similar to Ken Williams' 
Path::Class or Ruby's Dir object.  One of the methods would be .relative.

I didn't bring up @CWD because I thought it would be too much in one sitting.
Basically it allows you to do this:

pop @CWD;   # chdir ('..');
push @CWD, 'dir';   # chdir ('dir');
print $CWD[0];  # (File::Spec->splitdir(abs_path()))[0];
# ie. What top level directory am I in?

and all sorts of other operations that would normally involve a lot of
splitdir'ing.
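
As a sketch of the idea only (in Python, on POSIX-style paths, without touching the real working directory), the array view amounts to treating the path as a list of components. The helper names components and from_components are invented for this example.

```python
from pathlib import PurePosixPath


def components(cwd):
    """Directory names of an absolute path, without the leading root."""
    return list(PurePosixPath(cwd).parts[1:])


def from_components(parts):
    """Rebuild an absolute path from a list of directory names."""
    return str(PurePosixPath("/", *parts))
```

Popping the last element of components("/usr/local/lib") and rebuilding gives "/usr/local", and appending "share" then gives "/usr/local/share", which mirrors the pop and push lines above.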

And then there's %CWD which I'm toying with being a per-volume chdir like
you can do on Windows but that may be too much of a questionable thing.


> Your assumption there is a bit inaccurate--in P6 you are allowed to
> temporize (localize) the effects of functions and methods that are
> prepared to deal with it.  

Yeah, we were talking about it on #perl6 a bit.  That seems to me the more
bizarre idea than assigning to something which can fail.  Localizing an
assignment is easy, there's just one thing to revert.  But function calls can
do lots of things.  Just how much does it reverse?  I guess if it's used
sensibly on sharp functions, such as chdir, and the behavior is 
user-definable it can work but I don't know if the behavior will ever
be obvious for anything beyond the trivial.

FWIW my prompting to write File::chdir was a desire for "local chdir".
So if "temp chdir" can be made to work that would solve most of the problem.

If nothing else perhaps chdir() should be eliminated and cwd() simply takes
an argument to make it a getter/setter.


> However, I agree that it's nice to have an
> easily interpolatable value.  So I think I'd rather see $CWD always
> return the current absolute path even after failure

The problem there is it leaves $CWD without an error mechanism and thus
becomes an unverifiable operation.  You have to use chdir() if you want to
error check and $CWD is reduced to a "scripting" feature.

It could throw an exception but then you have to wrap everything in a try
block.  Unless Perl 6 is going this route for I/O errors in general I'd
rather not.

I'll give the error mechanism some more thought.


Anyhow, I encourage folks to play with File::chdir and see what they think
of the idea.  I'm fixing up the Windows nits in the tests now.

Re: [pugs] regexp "bug"?

2005-04-15 Thread Nicholas Clark

On Fri, Apr 15, 2005 at 09:34:58AM -0700, Larry Wall wrote:

> It doesn't have to be the default, though.  But there has to be
> some way of allowing illegal characters to be talked about, or
> you can't write programs that talk about them.  It's like saying

Thoughtcrime acceptable. Doubleplusgood.

Nicholas Clark

Re: [RFC] some doubtable MMDs?

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 02:38:36PM +0200, Leopold Toetsch wrote:
: I'm not quite sure, but it seems that some of the MMD functions may 
: better be vtable methods:
: 
: - bitwise_sh[rl]*    shift by anything other than int?
: - bitwise_lsr        is missing generally
: 
: or even just a plain opcode only:
: 
: - logical_{or,and,xor}  return a PMC depending on the boolean value
: 
: What are HLLs expecting of these infix operations?

Perl 6 tends to distinguish these as different operators, though Perl 5
did overload the bitwise ops on both strings and numbers, which newbies
found confusing in ambiguous cases, which is why we changed it.

: OTOH it might be useful that the current get__keyed operations 
: (postcircumfix:[]) become MMD subroutines:
: 
:   Px = Py[Pz]    Pz = String, Int, Key, Slice, ...

At the moment, the Perl 6 optimizer is explicitly allowed to optimize
array indices with the assumption that the subscript is a scalar
(or slice) of integer, or something that converts to integer.  I'd be
interested to know if that policy will actually buy us any performance.
If it always has to go through MMD anyway, maybe it doesn't.  But
array indexing code tends to be pretty hot, so if we can keep it
somewhat optimizable and/or jittable, that'd be nice.

Larry

Re: Statement modifier scope

2005-04-15 Thread Paul Seamons

> I'm imagining it will be different, as I expect temp to not hide the old
> thing. I'm not sure it will.

That is another good question.  I just searched through the S and A's and 
couldn't find if temp will blank it out.  I am thinking it will act like 
local.  Each of the declarations my, our and local currently set the value to 
undefined (unless set = to something).  I imagine that temp and let will 
behave the same.

In which case "local %h;" and "let %h" would allocate a new, empty variable in 
addition to the original variable (which is hidden but still retains its 
contents).

Paul

Various questions

2005-04-15 Thread Philip Taylor

I've been working on a C-to-Parrot compiler (actually an IMC backend
for the LCC compiler), tentatively named Carrot, over the past week. It
can currently do some reasonably useful things, like running the Cola
compiler (with only a very small amount of cheating), but it has raised 
a few queries:

* I can usually handle unsigned numbers by pretending they're signed and 
using 'I' registers, but some things appear to be awkward without new 
ops - in particular, div and cmod, and le/lt/ge/gt comparisons. (As far 
as I can tell, those are the only ones C would need; everything else 
should work fine with the signed variants).

I've added divu/leu/etc ops to math.ops/cmp.ops (and just made them cast 
their operands into UINTVALs) - is that a reasonable thing to do? Would 
they be better in a new .ops file?
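
To spell out why only division and ordering need unsigned variants, here is a sketch in Python that models a 32-bit register. The 32-bit width and the helper names (to_unsigned, to_signed, divu, ltu) are assumptions for illustration; they are not the ops added to math.ops or cmp.ops.

```python
MASK32 = 0xFFFFFFFF  # assumed register width of 32 bits


def to_unsigned(x):
    """Reinterpret a signed 32-bit value as unsigned."""
    return x & MASK32


def to_signed(x):
    """Reinterpret the low 32 bits of x as a signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def divu(a, b):
    """Unsigned division on values stored in signed registers."""
    return to_signed(to_unsigned(a) // to_unsigned(b))


def ltu(a, b):
    """Unsigned less-than on values stored in signed registers."""
    return to_unsigned(a) < to_unsigned(b)


def add32(a, b):
    """Addition gives the same bit pattern for signed and unsigned."""
    return to_signed(a + b)
```

Addition, subtraction and multiplication produce the same low 32 bits either way, so the signed ops can be reused; divu(-2, 2) and ltu(-1, 1) are cases where the signed answer would be wrong.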

* Should there be an 'isatty' op/method? (or is there something else 
that "isatty(fileno(file))" (which Cola's lexer uses) should do, in 
order to return a reasonable answer?)

* Is it possible to merge PBC files together, like load_bytecode but at 
compile-time?

The compiler converts .c to .pbc (via .imc), then the linker just 
creates a program full of load_bytecode, so the actual linking gets done 
at run-time, which isn't very nice when you try moving/deleting one of 
the .pbcs. (And lcc always deletes the .pbcs, since it assumes they're 
temporary files.)

* How efficient are PMC method calls? (And are performance concerns 
documented anywhere, like "op calls are roughly n times faster than 
methods", so compiler-writers could avoid implementing things in stupid 
ways, or is it too early to be doing that?)

I've been using [gs]et_integer_keyed_int on a PMC to allow pointer 
access. Since it reads whole ints, it probably crashes unnecessarily 
when e.g. reading chars at unlucky addresses - but IMC code like "val = 
mem.read_i1(ptr)" feels unpleasantly inefficient, particularly in 
string-processing loops.

Hmm... Should I just accept that C-on-Parrot will always be relatively 
slow, since its concept of memory is slightly incompatible with 
Parrot's, and anybody who wants speed can use a native C compiler, so I 
can stop worrying about it? :-)

Thanks,
--
Philip Taylor
[EMAIL PROTECTED]

Re: Statement modifier scope

2005-04-15 Thread Larry Wall

I would like to get rid of all those implicit scopes.  The only
exception would be that any topicalizing modifier allocates a private
lexical $_ scoped to just that statement.  But dynamic scoping may
happen only at explicit block boundaries.

I can see the argument for the other side, where any "deferred"
code is treated as a kind of closure regardless of whether there are
explicit curlies around it.  That would solve certain problems like
defining the scopes of the lexicals in

$a = $x ?? my $y :: my $z;

or the infamous

my $x = 1 if $y;

to extend only to the subexpressions in which they find themselves.
But it's not what naive users expect, and it's hard to explain, so I
think we should stick with explicit curlies for most of our scoping
needs, even if it means letting certain variables hang around undefined
because their initialization was never executed.

Larry

Re: Statement modifier scope

2005-04-15 Thread Juerd

Paul Seamons skribis 2005-04-15 12:41 (-0600):
> In Perl5
> perl -MData::Dumper -e '%h=qw(a 1 b 2); {local %h; $h{a}="one"; print Dumper 
> \%h} print Dumper \%h;'
> $VAR1 = {
>   'a' => 'one'
> };
> $VAR1 = {
>   'a' => '1',
>   'b' => '2'
> };
> I'm imagining the behavior would be the same with Perl6.  Notice that 'b' is 

I'm imagining it will be different, as I expect temp to not hide the old
thing. I'm not sure it will.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 11:28:31AM -0500, Rod Adams wrote:
: David Wheeler wrote:
: 
: >But the first person to write <[a...]> gets what's comin' to 'em.
: 
: Is that nothing (since '.' lt 'a'), or everything after 'a'?

Might as well make it everything after 'a' for consistency.  One could
also view the last dot as a special version of the ordinary "any" dot,
and read it "a to whatever".

Larry

Re: [pugs] regexp "bug"?

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 05:12:54PM +, [EMAIL PROTECTED] wrote:

: Isn't that what the difference between byte-level and codepoint-level
: access to strings is all about.  If you want to work with values that
: are illegal codepoints then you should be working at the byte-level
: not the codepoint-level, at least by default.

Sure, but there's no guarantee you have access to a lower level,
depending on the interface presented by the object in question, and
you shouldn't probably have to know that anyway, if there's a useful
abstraction level at which "illegal character" means something as
a unit to the higher level.  The fact is that U+FFFF is an illegal
character regardless of the encoding, and I'd like to be able to
talk about it as a character, without having to know whether it's
an illegal UTF-8 byte sequence, or an illegal UTF-16 byte sequence,
or a 256-bit integer stored somewhere that you just aren't allowed
to think about certain values of.

In short, "legal" Unicode strings should probably be viewed as a
constrained subtype of strings, not as a storage type.  I know you've
known Ada from its infancy. :-)  Perl 6 makes the same distinction, and
can presumably get at the unconstrained type for any constrained type.
So if you hand me a Unicode string with arbitrary value restrictions,
there had better be a way to view that string without the arbitrary
restrictions.  You need to be able to determine somehow that types
Even or Odd have a storage class of type Int.
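
One way to picture the constrained-subtype view, sketched in Python rather than Perl 6. The class name LegalUnicode is hypothetical; the noncharacter test follows the Unicode definition (U+FDD0..U+FDEF plus the last two code points of every plane).

```python
def is_noncharacter(cp):
    """True for the code points Unicode designates as noncharacters."""
    return 0xFDD0 <= cp <= 0xFDEF or (
        (cp & 0xFFFE) == 0xFFFE and cp <= 0x10FFFF
    )


class LegalUnicode(str):
    """Constrained subtype: a str that refuses noncharacters."""

    def __new__(cls, text):
        bad = [hex(ord(c)) for c in text if is_noncharacter(ord(c))]
        if bad:
            raise ValueError("noncharacters not allowed: " + ", ".join(bad))
        return super().__new__(cls, text)


def unconstrained(text):
    """View any string, constrained or not, as the plain storage type."""
    return str(text)
```

The plain str plays the role of the unconstrained type: it can hold chr(0xFFFF) and be inspected as characters, while LegalUnicode rejects the same value at construction time.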

Larry

Re: [perl #34984] [PATCH] Fix segfault with const

2005-04-15 Thread Nicholas Clark

On Fri, Apr 15, 2005 at 07:26:56PM +0100, Nick Glencross wrote:

> +// Forbid assigning a string to anything other than a string const
> +// for now

In future, please don't use C99 comments.

(apart from that, I don't have the knowledge to comment on this patch)

Nicholas Clark

Re: Statement modifier scope

2005-04-15 Thread Paul Seamons

On Friday 15 April 2005 12:28 pm, Juerd wrote:
> temp %h{ %other.keys } = %other.values;

Oops missed that - I like that for solving this particular problem.  It does 
even work in Perl5:

perl -MData::Dumper -e '%h=qw(a 1 b 2); {local @h{qw(a b)}=("one","two"); 
print Dumper \%h} print Dumper \%h'
$VAR1 = {
  'a' => 'one',
  'b' => 'two'
};
$VAR1 = {
  'a' => '1',
  'b' => '2'
};

I had never thought to do a hash slice in a local.  That is great!!!

Thank you very much!  Wish I'd known about that three years ago.

But, it still doesn't answer the original question about scoping in the 
looping statement modifiers.

Paul

Re: Truely temporary variables

2005-04-15 Thread Aaron Sherman

On Fri, 2005-04-15 at 13:10, Luke Palmer wrote:
> Aaron Sherman writes:
> > Among the various ways of declaring variables, will Perl 6 have a way to
> > say, "this variable is highly temporary, and may be re-declared within
> > the same scope, or in a nested scope without concern"? I often find
> > myself doing:
> > 
> > my $sql = q{...};
> > ...do some DB stuff...
> > my $sql = q{...};
> > ...do more DB stuff...
> 
> There's a pretty common idiom for this:
> 
> {
> my $sql = q{...};
> # ... do some DB stuff ...
> }
> {
> my $sql = q{...};
> # ... do more DB stuff ...
> }
> 
> You see it in test suites all over the CPANdom.  

You see it all over my code too... it is always possible to simulate
many kinds of trickery that way. For example, if you want to write a
loop with a counter that is visible one statement after the loop
completes, you can say:

{
my int $i;
loop $i=0;...;$i++ {
...
}
do_stuff($i);
}

But isn't:

loop my int $i=0;...;$i++ {
...;
LAST{do_stuff($i)}
}

much cleaner? I think so, if for no other reason than it explicitly says
what it means. That's one of the reasons that LAST is so handy.

So too would my mythical declarator prevent a few steps that are
otherwise quite easy, but cumbersome in the large.

Whatever, though. It was a simple suggestion, and seems to have sparked
FAR more controversy than the small win warrants.

-- 
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback

Re: Statement modifier scope

2005-04-15 Thread Paul Seamons

>
> temp %h;
> %h{ %other.keys } = %other.values;
>
> or even
>
> temp %h{ %other.keys } = %other.values;
>
> should work well already?

Almost - but not quite.

In Perl5
perl -MData::Dumper -e '%h=qw(a 1 b 2); {local %h; $h{a}="one"; print Dumper 
\%h} print Dumper \%h;'
$VAR1 = {
  'a' => 'one'
};
$VAR1 = {
  'a' => '1',
  'b' => '2'
};

I'm imagining the behavior would be the same with Perl6.  Notice that 'b' is 
gone in the first print.  I only want to temporarily modify "some" values 
(the ones from the %other hash).  I don't want the contents of the %h to be 
identical to %other - I already have %other.

So in Perl5 this does work:

perl -MData::Dumper -e '%h=qw(a 1 b 2); {local %h=%h; $h{a}="one"; print 
Dumper \%h} print Dumper \%h;'
$VAR1 = {
  'a' => 'one',
  'b' => '2'
};
$VAR1 = {
  'a' => '1',
  'b' => '2'
};
But this won't work in Perl6 (temp $var = $var doesn't work in Perl6) and 
again it may be fine for small hashes with only a little data - but for a 
huge hash (1000+ keys) it is very inefficient.

This is good discussion - but it isn't the real focus of the original message 
in the thread - the question is about the local (temp) scoping of looping 
statement modifiers in Perl6.

Though, I do appreciate your trying to get my example working as is.

Paul

Re: Statement modifier scope

2005-04-15 Thread Juerd

Paul Seamons skribis 2005-04-15 12:16 (-0600):
> For the given example, your code fits perfectly.  A more common case I have 
> had to deal with is more like this:
> my %h = 
> my %other = ;
> {
>   temp %h{$_} = %other{$_} for %other.keys;

Either

temp %h;
%h{$_} = %other{$_} for %other.keys;

or

temp %h;
%h{ %other.keys } = %other.values;

or even

temp %h{ %other.keys } = %other.values;

should work well already?
 
>   %h.say;
> }

I think it's hard to find an example that can't easily be rewritten as
something that already works. Gather/take solves most.


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

Re: Truely temporary variables

2005-04-15 Thread chromatic

On Fri, 2005-04-15 at 11:21 -0500, Patrick R. Michaud wrote:

> On Fri, Apr 15, 2005 at 09:17:13AM -0700, Larry Wall wrote:

> > Maybe we could define an "ok" operator that suppresses only the
> > *first* warning produced by its argument(s).  Then if you get multiple
> > warnings, you at least get some indication that you've overgeneralized,
> > even if the "wrong" warning comes out.  Or maybe it only suppresses
> > the first warning till you get a second warning, and then it prints both.

> And after the third warning, it sends you to your room with no supper.

Talk about a strict permission system.  If that's the case, I want a
"I'm the human here, darnit!" option to bypass it.

-- c

Re: [perl #34984] [PATCH] Fix segfault with const

2005-04-15 Thread Nick Glencross

Leopold Toetsch via RT wrote:

> I think, we could be a bit more graceful here for I/N mismatch and set
> for the above case the constant val->set to 'N'.

Let me redo that...  I've just sent the wrong attachment which had a 
typo in it ...

[This should really address rare but possible Unicode strings, shouldn't 
it?]

Nick
Index: imcc/symreg.c
===
--- imcc/symreg.c   (revision 7843)
+++ imcc/symreg.c   (working copy)
@@ -307,6 +307,7 @@
 INS(interp, unit, "set_p_pc", "", r, 2, 0, 1);
 return NULL;
 }
+
 /* Makes a new identifier constant with value val */
 SymReg *
 mk_const_ident(Interp *interp,
@@ -314,6 +315,16 @@
 {
 SymReg *r;
 
+// Forbid assigning a string to anything other than a string const
+// for now
+if (t != 'S' && val->set == 'S')
+IMCC_fataly(interp, E_TypeError,
+"bad const initialisation");
+
+// Cast value to const type
+if (t == 'N' || t == 'I')
+val->set = t;
+
 if (global) {
 if (t == 'P') {
 IMCC_fataly(interp, E_SyntaxError,

Re: Truely temporary variables

2005-04-15 Thread Juerd

Brent 'Dax' Royal-Gordon skribis 2005-04-15 11:15 (-0700):
> Anything wrong with:

Yes, moving things around breaks it, as does removing the first. There
is no real dependency on the first $sql and it'd be great if declaration
wouldn't add one.

   temp $sql = q{...};
   my $sql = q{...};
   temp $sql = q{...};


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

Re: Statement modifier scope

2005-04-15 Thread Paul Seamons

On Friday 15 April 2005 11:57 am, Juerd wrote:
> Paul Seamons skribis 2005-04-15 11:50 (-0600):
> > my %h = ;
> > {
> >   temp %h{$_} ++ for %h.keys;
>
> Just make that two lines. Is that so bad?
>
> temp %h;
> %h.values »++;
>

For the given example, your code fits perfectly.  A more common case I have 
had to deal with is more like this:

my %h = 
my %other = ;
{
  temp %h{$_} = %other{$_} for %other.keys;
  %h.say;
}

Ideally that example would print
aone
btwo
c3

It isn't possible any more to do something like
{
  temp %h = (%h, %other);
}
because that second %h is now hidden from scope (I forget which Apocalypse or 
mail thread I saw it in).  Plus for huge hashes it just isn't very efficient.

I'd like to temporarily put the values of one hash into another (without 
wiping out all of the modified hash's values like "temp %h" would do), run 
some code, leave scope and have the modified hash go back to normal.  In 
perl5 I've had to implement that programmatically by saving existing values 
into yet another hash - running the code - putting them back.  It works but 
there are all sorts of issues with defined vs exists.
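
The save/run/restore dance described above looks roughly like this when sketched in Python (a dict standing in for the hash). The name temp_overlay and the _MISSING sentinel are invented for the example; the sentinel is there to keep "key did not exist" apart from "key existed with an undefined value", which is the defined-vs-exists problem.

```python
from contextlib import contextmanager

_MISSING = object()  # marks keys that were absent before the overlay


@contextmanager
def temp_overlay(target, overlay):
    """Temporarily copy overlay's entries into target, then undo it."""
    # Save only the keys we are about to touch, not the whole hash.
    saved = {key: target.get(key, _MISSING) for key in overlay}
    target.update(overlay)
    try:
        yield target
    finally:
        for key, old in saved.items():
            if old is _MISSING:
                target.pop(key, None)   # key did not exist: remove it
            else:
                target[key] = old       # key existed: restore its value
```

Only the keys of the overlay are saved, so the cost scales with the small hash rather than with the 1000+ key one.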

So yes - your code fits the limited example I gave.  But I'd still like the 
other item to work.

Paul

Re: Truely temporary variables

2005-04-15 Thread Brent 'Dax' Royal-Gordon

Aaron Sherman <[EMAIL PROTECTED]> wrote:
> What I'd really like to say is:
>
> throwawaytmpvar $sql = q{...};
> throwawaytmpvar $sql = q{...};

Anything wrong with:

   my $sql = q{...};
   temp $sql = q{...};
   temp $sql = q{...};

(Assuming temp is made to work on lexicals, of course.)

-- 
Brent 'Dax' Royal-Gordon <[EMAIL PROTECTED]>
Perl and Parrot hacker

"I used to have a life, but I liked mail-reading so much better."

Re: Statement modifier scope

2005-04-15 Thread Juerd

Paul Seamons skribis 2005-04-15 11:50 (-0600):
> my %h = ;
> {
>   temp %h{$_} ++ for %h.keys;

Just make that two lines. Is that so bad?

temp %h;
%h.values »++;

>   %h.say; # values are incremented still
> }
> %h.say; # values are back to original values


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

Re: New language: Parrot Common Lisp

2005-04-15 Thread Cory Spencer


>> (If anyone is able to track down aforementioned DOD/GC problems,
>> you'll earn my eternal gratitude.)
>
> Can you please provide a code snippet that exhibits the error.

Just running the program gives me errors on both Linux/x86 and OS X. 
Running with GC disabled works fine.

On OS X with GC enabled:

forge:~/svn/parrot-lisp/trunk$ parrot lisp.pbc
Can't find method '__set_string_native' for object 'LispSymbol'

On OS X with GC disabled:

forge:~/svn/parrot-lisp/trunk$ parrot -G lisp.pbc
->

On Linux with GC enabled:

anvil:~/svn/parrot-lisp/trunk$ parrot lisp.pbc
Can't find method '__set_string_native' for object 'LispSymbol'

On Linux with GC disabled:

anvil:~/svn/parrot-lisp/trunk$ parrot -G lisp.pbc
->

This is on the Parrot checked out of Subversion this morning (revision 
7846).  Which OS/build number were you using?

-c

Statement modifier scope

2005-04-15 Thread Paul Seamons

The following chunks behave the same in Perl 5.6 as in Perl 5.8.  Notice the 
output of "branching" statement modifiers vs. "looping" statement modifiers. 

perl -e '$f=1; {local $f=2; print "$f"} print " - $f\n"'
  # prints 2 - 1

perl -e '$f=1; {local $f=2 if 1; print "$f"} print " - $f\n"'
  # prints 2 - 1

perl -e '$f=1; {local $f=2 unless 0; print "$f"} print " - $f\n"'
  # prints 2 - 1

perl -e '$f=1; {local $f=2 for 1; print "$f"} print " - $f\n"'
  # prints 1 - 1

perl -e '$f=1; {local $f=2 until 1; print "$f"} print " - $f\n"'
  # prints 1 - 1

perl -e '$f=1; {local $f=2 while !$n++; print "$f"} print " - $f\n"'
  # prints 1 - 1

It appears that there is an implicit block around statements with looping 
statement modifiers.  perlsyn does state that the control variables of the 
"for" statement modifier are locally scoped, but doesn't really mention that 
the entire statement is as well.  I'm not sure if this was in the original 
design spec or if it flowed out of the implementation details, but either way 
it seems to represent an inconsistency in the treatment of locality with 
regards to braces (ok I guess there are several in Perl5).

So the question is, what will it be like for Perl6.  It would seem that all of 
the following should hold true because of scoping being tied to the blocks.

pugs -e 'our $f=1; {temp $f=2; print $f}; say " - $f"'
   # should print 2 - 1 (currently prints 2 - 2 - but that is a compiler 
issue)

pugs -e 'our $f=1; {temp $f=2 if 1; print $f}; say " - $f"'
   # should print 2 - 1 (currently dies with parse error)

pugs -e 'our $f=1; {temp $f=2 for 1; print $f}; say " - $f"'
   # hopefully prints 2 - 1 (currently dies with parse error)

As a side note - pugs does work with:

pugs -e 'our $f=1; {$f=2 for 1; print $f}; say " - $f"'
  # prints 2 - 2 (as it should.  It seems that statement modifiers don't 
currently work with declarations - but that is a compiler issue - not a 
language issue.)

I have wanted to do this in Perl5 but couldn't, and would love to be able to 
do it in Perl6:

my %h = ;
{
  temp %h{$_} ++ for %h.keys;
  %h.say; # values are incremented still
}
%h.say; # values are back to original values

Paul

Re: [perl #34984] [PATCH] Fix segfault with const

2005-04-15 Thread Nick Glencross

Leopold Toetsch via RT wrote:

> Nick Glencross <[EMAIL PROTECTED]> wrote:
>
>> This patch fixes a problem which can occur in this example:
>>
>> .sub test
>>    .const float a = 12
>>    print a
>>    print_newline
>> .end
>
> Ah yep.
>
>> +if (t != 'P' && t != val->set)
>> +IMCC_fataly(interp, E_TypeError,
>> +"const types do not match");
>
> I think, we could be a bit more graceful here for I/N mismatch and set
> for the above case the constant val->set to 'N'.

Yes, I was planning to do something a bit more thorough, but fixing the 
immediate segfault was the first challenge.

I've looked over the code a bit more now, and see that the value is 
still stored textually at this point, so setting the type as you've said 
is pretty simple. It's a shame that strings can be in a number of 
different formats, and probably quoted, preventing this from working for 
them too.

Anyhow, here's a new patch for you to review, and perhaps apply...?
Cheers,
Nick
Index: imcc/symreg.c
===
--- imcc/symreg.c   (revision 7843)
+++ imcc/symreg.c   (working copy)
@@ -307,6 +307,7 @@
 INS(interp, unit, "set_p_pc", "", r, 2, 0, 1);
 return NULL;
 }
+
 /* Makes a new identifier constant with value val */
 SymReg *
 mk_const_ident(Interp *interp,
@@ -314,6 +315,16 @@
 {
 SymReg *r;
 
+// Forbid assigning a string to anything other than a string const
+// for now
+if (t != 'S' && val->set == 'S')
+IMCC_fataly(interp, E_TypeError,
+"bad const initialisation");
+
+// Cast value to const type
+if (t == 'S' || t == 'I')
+val->set = t;
+
 if (global) {
 if (t == 'P') {
 IMCC_fataly(interp, E_SyntaxError,

Re: [perl #35000] [PATCH] README.win32 & icu 3.2

2005-04-15 Thread chromatic

On Fri, 2005-04-15 at 05:38 -0700, François PERRAD wrote:

> small mistake in [perl #34986] :
> with ICU 3.2, the library icudata.lib is renamed icudt.lib.

Thanks, applied.

-- c

Re: [pugs] regexp "bug"?

2005-04-15 Thread mark . a . biggar


Isn't that what the difference between byte-level and codepoint-level access to 
strings is all about.  If you want to work with values that are illegal 
codepoints then you should be working at the byte-level not the 
codepoint-level, at least by default.

--
Mark Biggar
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]


> On Fri, Apr 15, 2005 at 12:56:14AM -0700, Mark A. Biggar wrote:
> : Yes, the value 0xFFFF can be stored as either 3 byte UTF-8 string or a 2 
> : byte UCS-2 value, but the Unicode standard specifically says that the 
> : values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should 
> : never appear in a Unicode string.  0xFFFF is reserved for out-of-band 
> : signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are 
> : specifically reserved for out-of-band marking a UCS-2 file as being 
> : either bigendian or littlendian, but are specifically not considered 
> : part of the data.  chr() is currently defined to mean convert an int 
> : value to a Unicode codepoint. That's why I said that chr(65535) should 
> : return an exception, it's an argument error similar to sqrt(-1).
> 
> It has to at least be possible to Think Bad Thoughts in Perl.
> It doesn't have to be the default, though.  But there has to be
> some way of allowing illegal characters to be talked about, or
> you can't write programs that talk about them.  It's like saying
> it's okay to be an executioner as long as you don't kill anyone...
> 
> Larry

Re: Truely temporary variables

2005-04-15 Thread Luke Palmer

Aaron Sherman writes:
> Among the various ways of declaring variables, will Perl 6 have a way to
> say, "this variable is highly temporary, and may be re-declared within
> the same scope, or in a nested scope without concern"? I often find
> myself doing:
> 
>   my $sql = q{...};
>   ...do some DB stuff...
>   my $sql = q{...};
>   ...do more DB stuff...

There's a pretty common idiom for this:

{
my $sql = q{...};
# ... do some DB stuff ...
}
{
my $sql = q{...};
# ... do more DB stuff ...
}

You see it in test suites all over the CPANdom.  

Luke

Re: Truely temporary variables

2005-04-15 Thread Juerd

Rod Adams skribis 2005-04-15 11:53 (-0500):
> Wouldn't some form of trait make more sense:
>my $sql = '...' is ok;

Depends. A unary ok operator would let you pinpoint very easily,
*without* using parens:

ok $fh.print($foo); # no warnings about print (closed fh?)
# but warning about undef $foo remains

$fh.print(ok $foo);  # warn about printing thingies, but not about
 # undef $foo

say $foo, $bar, ok $baz, $quux;  # complain about everything, except
 # what has to do with $baz

my $foo;
ok my $foo = "foo $bar baz";  # warn about $bar, but not the masking
my $foo = ok "foo $bar baz";  # other way around


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

Re: Truely temporary variables

2005-04-15 Thread Rod Adams

Larry Wall wrote:

> On Fri, Apr 15, 2005 at 06:04:32PM +0200, Juerd wrote:
> : No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and
> : ugly). Suggestions?
>
> Maybe we could define an "ok" operator that suppresses only the
> *first* warning produced by its argument(s).  Then if you get multiple
> warnings, you at least get some indication that you've overgeneralized,
> even if the "wrong" warning comes out.  Or maybe it only suppresses
> the first warning till you get a second warning, and then it prints both.

Wouldn't some form of trait make more sense:

   my $sql = '...' is ok;

Only trick would be getting "is ok" to bind to the thing in the 
preceding expression that produces the warning the programmer was 
expecting. Certainly

   {my $sql = '...'} is ok;

gets the point across that warnings are somewhat ignorable for the block, 
but that starts getting to look a lot like

   {my $sql = '...'} CATCH {default};

Except that one is run-time, the other compile-time.

So one could interpret this thread as a cry for a compile-time exception 
handler. I see some interesting uses for this in conjunction with 
C, but I doubt I'm seeing the whole story.

-- Rod Adams

Re: [pugs] regexp "bug"?

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 12:56:14AM -0700, Mark A. Biggar wrote:
: Yes, the value 0xFFFF can be stored as either 3 byte UTF-8 string or a 2 
: byte UCS-2 value, but the Unicode standard specifically says that the 
: values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should 
: never appear in a Unicode string.  0xFFFF is reserved for out-of-band 
: signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are 
: specifically reserved for out-of-band marking a UCS-2 file as being 
: either bigendian or littlendian, but are specifically not considered 
: part of the data.  chr() is currently defined to mean convert an int 
: value to a Unicode codepoint. That's why I said that chr(65535) should 
: return an exception, it's an argument error similar to sqrt(-1).

It has to at least be possible to Think Bad Thoughts in Perl.
It doesn't have to be the default, though.  But there has to be
some way of allowing illegal characters to be talked about, or
you can't write programs that talk about them.  It's like saying
it's okay to be an executioner as long as you don't kill anyone...

Larry

Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Rod Adams

David Wheeler wrote:
> But the first person to write <[a...]> gets what's comin' to 'em.

Is that nothing (since '.' lt 'a'), or everything after 'a'?

-- Rod Adams

Re: Truely temporary variables

2005-04-15 Thread Patrick R. Michaud

On Fri, Apr 15, 2005 at 09:17:13AM -0700, Larry Wall wrote:
> On Fri, Apr 15, 2005 at 06:04:32PM +0200, Juerd wrote:
> : No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and
> : ugly). Suggestions?
> 
> Maybe we could define an "ok" operator that suppresses only the
> *first* warning produced by its argument(s).  Then if you get multiple
> warnings, you at least get some indication that you've overgeneralized,
> even if the "wrong" warning comes out.  Or maybe it only suppresses
> the first warning till you get a second warning, and then it prints both.

And after the third warning, it sends you to your room with no supper.

Pm

Re: Parrot bytecode reentrancy

2005-04-15 Thread Nigel Sandever

15/04/2005 10:35:56, Leopold Toetsch <[EMAIL PROTECTED]> wrote:

>Nigel Sandever <[EMAIL PROTECTED]> wrote:
>
>> When a sub that closes over a variable
>
>>  my $closure = 0;
>>  sub do_something {
>>      return $closure++;
>>  }
>
>> is called from two threads, do the threads share a single closure or
>> each get their own separate closure?
>
>AFAIK: the closure bytecode is shared, 

Great.

>the Closure PMC with the lexical
>pad is distinct. 

I think that makes perfect sense. No implicit sharing.

>But that all isn't implemented yet.
>

Understood. I am being premature in thinking about this. 

But this is where I come unstuck. What would this mean/do when called from 2 
threads?

my $closure :shared = 0;
sub do_something {
    return $closure++;
}

or this:

our $closure :shared = 0;
sub do_something {
    return $closure++;
}

It struck me a while back that there is a contradiction in the idea of a 
shared 'my' variable. 

I want to say lexical, but a var declared with 'our' is in some sense lexical. 

Where I am going is that "shared" implies global. Access can be constrained by 
requiring a lexical declaration using 'our', but 'my' variables should not be 
able to be marked 'shared'.

One nice thing that falls out of that is that no 'my' vars would ever be 
shared, which means they never require semaphore checks. That would mean that a 
non-threaded app running on a multi-threaded build of Parrot need never incur 
a penalty of semaphore checks if it always uses 'my'. *I think*?

In effect, all vars declared 'our' would be implicitly shared, (and would 
require semaphoring), removing the need for a 'shared' attribute. 
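
A rough illustration of the distinction, in Python rather than Perl 6 or
Parrot (so only an analogy): a module-level counter guarded by a lock stands
in for a shared 'our' variable, and a threading.local slot stands in for an
unshared 'my' variable. All names here (shared_count, bump_shared,
bump_private, run) are made up for the sketch.

```python
import threading

shared_count = 0                 # stands in for a shared 'our' variable
shared_lock = threading.Lock()   # the "semaphore check" each shared access pays for
private = threading.local()      # stands in for an unshared 'my' variable


def bump_shared():
    """Increment the shared counter; the lock is required for correctness."""
    global shared_count
    with shared_lock:
        shared_count += 1


def bump_private():
    """Increment this thread's own counter; no lock is needed."""
    private.count = getattr(private, "count", 0) + 1
    return private.count


def worker(n, results, idx):
    last = 0
    for _ in range(n):
        bump_shared()
        last = bump_private()
    results[idx] = last          # each thread writes only its own slot


def run(threads=4, n=1000):
    """Return (shared total, list of per-thread private totals)."""
    global shared_count
    shared_count = 0
    results = [None] * threads
    pool = [threading.Thread(target=worker, args=(n, results, i))
            for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return shared_count, results
```

Only the shared counter ever touches the lock; code that sticks to the
thread-private slot never pays for it, which is the property argued for above.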

In P5, lexicals are already quicker than globals, so any additional penalty 
added to globals because of multithreading will not affect any single-threaded 
code that is striving for ultimate performance, because they would already be 
utilising lexicals. 

Equally, things like filehandles are inherently process-global in scope and 
therefore sharable between threads and require semaphore checks. 

I only throw this into the thought-pot because there seems to me to be a 
natural symmetry between the concept of 'global' and the concept of 'shared'.

I won't argue the case for this, but I thought that if I mention it, it might 
also make some sense to others when the time comes for this stuff to be 
designed and implemented.

>> njs
>
>leo
>

njs


Re: Truely temporary variables

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 06:04:32PM +0200, Juerd wrote:
: No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and
: ugly). Suggestions?

Maybe we could define an "ok" operator that suppresses only the
*first* warning produced by its argument(s).  Then if you get multiple
warnings, you at least get some indication that you've overgeneralized,
even if the "wrong" warning comes out.  Or maybe it only suppresses
the first warning till you get a second warning, and then it prints both.
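
To make the first variant concrete, here is a minimal sketch in Python (not
Perl 6), using the standard warnings module. The name ok() is borrowed from
the proposal; the implementation is only an assumption about how "suppress
just the first warning" might behave.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def ok():
    """Swallow only the first warning raised in the block; re-emit the rest."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")   # record every warning inside the block
        yield
    # Back outside catch_warnings, the caller's own filters apply again.
    for w in caught[1:]:
        warnings.warn_explicit(w.message, w.category, w.filename, w.lineno)
```

Usage would be "with ok(): ..." around the expression expected to warn once.
A second warning from the same block still gets through, which is the
"you've overgeneralized" signal.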

Larry

Re: Truely temporary variables

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 11:45:16AM -0400, Aaron Sherman wrote:
: Among the various ways of declaring variables, will Perl 6 have a way to
: say, "this variable is highly temporary, and may be re-declared within
: the same scope, or in a nested scope without concern"? I often find
: myself doing:
: 
:   my $sql = q{...};
:   ...do some DB stuff...
:   my $sql = q{...};
:   ...do more DB stuff...
: 
: This of course results in re-defining $sql, so I take out the second
: "my", but then at some point I remove the first one, and strict chews me
: out over not declaring $sql, so I make it "my" again.
: 
: This is a cycle I've repeated with dozens of variations on more
: occasions than I care to (could?) count.

And at that point, why not just change it to this?

my $sql;
$sql = q{...};
...do some DB stuff...
$sql = q{...};
...do more DB stuff...

It seems to me that assignment does a pretty good job of clobbering a
variable's value without the need to redeclare the container.  If you
really want to program in a definitional paradigm that requires every
new definition to have a declaration, then you ought to be giving
different definitions different names, seems like, or putting each
of them into its own scope.  Or write yourself a macro.  Or just turn
off the redefinition warning...

It doesn't seem to rise to the level of a new keyword for me.

Larry

Re: Truely temporary variables

2005-04-15 Thread Juerd

Aaron Sherman skribis 2005-04-15 11:45 (-0400):
> What I'd really like to say is:
>   throwawaytmpvar $sql = q{...};
>   throwawaytmpvar $sql = q{...};

I like the idea and propose "a", aliased "an" for this.

> It should probably be illegal to:
>   throwawaytmpvar $sql = q{...};
>   my $sql = q{...}; # Error: temporary became normal lexical
> or for that matter even give it a new type:
>   throwawaytmpvar int $i = 0;
>   throwawaytmpvar str $i = "oops"; # Error: redefinition of type

Giving it a new type should be valid. That is, I think the variable is
more useful if the old one is thrown away and a new one is created. This
can perhaps be optimized by re-using the same thing if it has no
external references anymore.

In fact,

a Str $foo = $foo;

is a nice way to indicate that from now on, you don't care about its
numeric value anymore.

All in all, I think a|an can just be my without warnings and then do
what you want. 

Hm. Funny idea just occurred to me. What if something in ALLCAPS, or
better, just Ucfirst would disable all warnings for just that thing?

my $foo;
say $foo;  # warning about undef $foo
Say $foo;  # no warning

$closed_fh.print(Int($foo));  # just a warning about the closed fh

my $foo;   # warning about new $foo masking first
My $foo;   # no warning

If you think this looks much like PHP's @, you're right. It's not so bad
an idea, actually. The problem with PHP is that everything's a warning
and almost nothing actually dies.

No, Ucfirst it can't be, I think. And ALLCAPS is ugly. @ is taken (and
ugly). Suggestions?


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

Re: Macros [was: Whither "use English"?]

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 12:45:14PM +1200, Sam Vilain wrote:
: Larry Wall wrote:
: > Well, only if you stick to a standard dialect.  As soon as you start
: > defining your own macros, it gets a little trickier.
: 
: Interesting, I hadn't considered that.
: 
: Having a quick browse through some of the discussions about macros, many
: of the macros I saw[*] looked something like they could be conceptualised
: as referring to the part of the AST where they were defined.
: 
: ie, making the AST more of an Abstract Syntax Graph.  And macros like
: 'free' (ie, stack frame and scope-less) subs, with only the complication
: of variable binding.  The ability to have recursive macros would then
: relate to this graph-ness.

That is one variety of macro.

: What are the shortcomings of this view of macros, as 'smart' (symbol
: binding) AST shortcuts?

The biggest problem with smart things is they're harder for not-so-smart
people to understand.

: The ability to know exactly what source corresponds to a given point on
: the AST, as well as knowing straight after parse time (save for string
: eval, of course) what each token in the source stream relates to is one
: thing that I'm aiming to have work with Perldoc.  I'm hoping this will
: assist I18N efforts and other uses like smart editors.

Yes, that's an important quality for many kinds of tools, whether
documentation, debugging, or refactoring.

: By smart editors, I'm talking about something that uses Perl/PPI as its
: grammar parsing engine, and it highlights the code based on where each
: token in the source stream ended up on the AST.  This would work
: completely with source that munges grammars (assuming the grammars are
: working ;).  Then, use cases like performing L10N for display to non-
: English speakers would be 'easy'.  I can think of other side-benefits
: to such "regularity" of the language, such as allowing Programatica-
: style systems for visually identifying 'proof-carrying code' and
: 'testing certificates' (see http://xrl.us/programatica).

Glad you think it's 'easy'.  Maybe you should 'just do it' for us.  :-)

: macros that run at compile time, and insert strings back into the
: document source seem hackish and scary to these sorts of prospects.

We also allow (but discourage) textual substitution macros.  They're
essentially just lexically scoped source filters, and suffer the
same problems as source filters, except for the fact that you can
more easily limit the damage to a small patch of code.  The problem
is that the original patch of text has to be stored in the AST along
with the new chunk of AST generated by the reparse, and it's not at
all clear how a tool should handle that conflict.  It's better to only
parse once whenever possible, and just make sure the original text
remains attached to the appropriate place in the AST.  More basically,
it's usually better to cooperate with the parser than to lie to it.

: But then, one man's hackish and scary is another man's elegant
: simplicity, I guess.
: 
: * - in particular, messages like this:
: - http://xrl.us/fr78
: 
: but this one gives me a hint that there is more to the story... I
: don't grok the intent of 'is parsed'
: - http://xrl.us/fr8a

This is mostly talked about in the relevant Apocalypses, and maybe
the Synopses.  See dev.perl.org for more.

Larry

Truely temporary variables

2005-04-15 Thread Aaron Sherman

Among the various ways of declaring variables, will Perl 6 have a way to
say, "this variable is highly temporary, and may be re-declared within
the same scope, or in a nested scope without concern"? I often find
myself doing:

my $sql = q{...};
...do some DB stuff...
my $sql = q{...};
...do more DB stuff...

This of course results in re-defining $sql, so I take out the second
"my", but then at some point I remove the first one, and strict chews me
out over not declaring $sql, so I make it "my" again.

This is a cycle I've repeated with dozens of variations on more
occasions than I care to (could?) count.

What I'd really like to say is:

throwawaytmpvar $sql = q{...};
throwawaytmpvar $sql = q{...};

without problems. Of course, "throwawaytmpvar" is a bit long, but you
get the idea.

It should probably be illegal to:

throwawaytmpvar $sql = q{...};
my $sql = q{...}; # Error: temporary became normal lexical

or for that matter even give it a new type:

throwawaytmpvar int $i = 0;
throwawaytmpvar str $i = "oops"; # Error: redefinition of type

There might be other assumptions that this implies. For example, it
might be considered always thread-private and might be required to be a
core, unboxed type. These extra assumptions are only worth it if they
enhance the optimization possibilities surrounding such a value.

-- 
Aaron Sherman <[EMAIL PROTECTED]>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback

Re: $*CWD instead of chdir() and cwd()

2005-04-15 Thread Larry Wall

On Fri, Apr 15, 2005 at 03:11:59AM -0700, Michael G Schwern wrote:
: Error handling is simple, a failed chdir returns undef and sets errno.
: 
:   $CWD = $dir err die "Can't chdir to $dir: $!";

Offhand, I guess my main semantic problem with it is that if a chdir
fails, you aren't in an undefined location, which the new value of $CWD
would seem to indicate.  You're just where you were.  Then the user
either has to remember that, or there still has to be some other
means of finding out the real location.

The other problem with it is the fact that people will assign relative
paths to it and expect to get the relative path back out instead
of the absolute path.

: I encourage Perl 6 to adapt $*CWD similar to File::chdir and simply eliminate
: chdir() and cwd().  They're just an unlocalizable store and fetch for global
: data.

Your assumption there is a bit inaccurate--in P6 you are allowed to
temporize (localize) the effects of functions and methods that are
prepared to deal with it.  However, I agree that it's nice to have an
easily interpolatable value.  So I think I'd rather see $CWD always
return the current absolute path even after failure, and

temp chdir($dir) err fail "Can't chdir to $dir: $!";

be made to work as a temporizable function at some point, via the TEMP
mechanism described in A4.
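
For comparison, a minimal Python sketch of the same two properties (an
analogy only, not the Perl 6 TEMP mechanism): a failed chdir leaves you where
you were, and the value you read back is always the absolute path. The name
temp_chdir is hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def temp_chdir(path):
    """Change directory for the duration of the block, then change back."""
    previous = os.getcwd()
    os.chdir(path)            # raises OSError on failure; the cwd is unchanged then
    try:
        yield os.getcwd()     # always the absolute path, even if 'path' was relative
    finally:
        os.chdir(previous)    # restore on normal exit or on exception
```

Because os.chdir raises before anything is changed, there is no "undefined
location" after a failure; the caller is simply still in the old directory.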

Larry

MMD 25 - multiply

2005-04-15 Thread Leopold Toetsch

One more, and my fingers & brain are getting tired of these changes.
If someone wants to continue (and complete it during night here ;-), 
it's a simple job:

1) vtable.tbl
   - change existing signature of next infix operation
   - add inplace variant directly below it
2) imcc/parser_util.c:is_infix()
   - add the compare case for the MMD
3) make realclean; perl Configure.pl ... && make -s
4) fix all compiler errors in classes and dynclasses by looking at 
already converted functions and adding the inplace variants

4a) remove code from dynclasses/py*.pmc, if it's the same as the Parrot 
core base class, or adapt code

5) make test &&
6) svn ci

Thanks,
leo

Re: New language: Parrot Common Lisp

2005-04-15 Thread Chip Salzenberg

According to Cory Spencer:
> I'd like to announce the creation of the Parrot Common Lisp project

Excellent!

>   * It's not a compiler yet, although I've got plans for that down the
> road.

(declare (type PerlString s)) ?  :-)
-- 
Chip Salzenberg- a.k.a. -<[EMAIL PROTECTED]>
 Open Source is not an excuse to write fun code
then leave the actual work to others.

Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Patrick R. Michaud

On Fri, Apr 15, 2005 at 01:01:58PM -, Rafael Garcia-Suarez wrote:
> Aaron Sherman wrote in perl.perl6.language :
> >
> > A silly question: is there a canonical character set from which we
> > extract these ranges? Are we hard-coding Unicode here, or is there some
> > way for the user to specify the character set for ranges?
> 
> Perl 5 forces [a-z] (or [i-j] for that matter) to be a range of
> lowercase alphabetic characters, even on EBCDIC platforms (where it's
> not).

At the moment, PGE (the part that implements the rule engine) is
deferring such questions to Parrot, and otherwise assuming Unicode.
Plus, S02 explicitly indicates that Perl is written in Unicode
and has consistent Unicode semantics, so I think that's what we should
go with.  It's certainly the way the compiler will go, at least
initially.

Pm

[SVN ci] MMD 24 - add converted

2005-04-15 Thread Leopold Toetsch

MMD subroutines "add" are done.

* removed all mathematical functions from Tcl scalars - all is inherited now

I forgot to mention in MMD 23:

* If you have an overridden __add or __subtract function, either defined 
as @MULTI or registered via mmdvtregister, these functions must now 
return the destination PMC. For not yet converted MMD infix operations, 
the return result is ignored, but returning it does no harm either.
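
The "return the destination" convention can be sketched in Python (this is
not Parrot code). The dispatch table, the lookup order, the Box class, and
the rule that a missing destination means "create a new one" are all
assumptions made for illustration; Parrot's real MMD lookup differs.

```python
_mmd_table = {}  # (op name, left type, right type) -> implementation


def mmd_register(op, left_type, right_type, func):
    """Register func as the implementation of op for this pair of types."""
    _mmd_table[(op, left_type, right_type)] = func


def mmd_dispatch(op, left, right, dest=None):
    """Find an implementation by walking both class hierarchies, then call it."""
    for left_type in type(left).__mro__:
        for right_type in type(right).__mro__:
            func = _mmd_table.get((op, left_type, right_type))
            if func is not None:
                return func(left, right, dest)
    raise TypeError(f"no '{op}' for {type(left).__name__}, {type(right).__name__}")


class Box:
    """Hypothetical stand-in for a PMC holding a number."""
    def __init__(self, value=0):
        self.value = value


def add_box_box(left, right, dest):
    if dest is None:          # assumed: no destination given, so make one
        dest = Box()
    dest.value = left.value + right.value
    return dest               # the point above: always hand back the destination


mmd_register("add", Box, Box, add_box_box)
```

Passing dest=left gives the inplace flavour; in both cases the caller uses
whatever the function returns rather than assuming which object was filled in.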

leo

Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Rafael Garcia-Suarez

Aaron Sherman wrote in perl.perl6.language :
>
> A silly question: is there a canonical character set from which we
> extract these ranges? Are we hard-coding Unicode here, or is there some
> way for the user to specify the character set for ranges?

Perl 5 forces [a-z] (or [i-j] for that matter) to be a range of
lowercase alphabetic characters, even on EBCDIC platforms (where it's
not).

[PATCH] Minor spelling & punctuation errors

2005-04-15 Thread Steven Philip Schubiger

I've corrected a few spelling and punctuation errors.
Since I'm not done yet, I'd like to know whether I should 
continue, or whether the general consensus is that it's mostly 
needless nitpicking.

Punctuation has only been corrected where it was already
partly present; where it was totally absent, I left it alone, as
punctuation does not always add to readability.

Steven

--- src/builtin.c   Fri Apr 15 14:24:06 2005
+++ src/builtin.c   Fri Apr 15 13:04:58 2005
@@ -4,7 +4,7 @@
 
 =head1 NAME
 
-src/builtin.c - Bultin Methods
+src/builtin.c - Builtin Methods
 
 =head1 SYNOPSIS
 
--- src/datatypes.c Fri Apr 15 14:24:27 2005
+++ src/datatypes.c Fri Apr 15 14:34:40 2005
@@ -1,6 +1,5 @@
 /*
-Copyright: (c) 2002 Leopold Toetsch <[EMAIL PROTECTED]>
-License:  Artistic/GPL, see README and LICENSES for details
+Copyright: (c) 2002-2004 The Perl Foundation.  All Rights Reserved.
 $Id: datatypes.c,v 1.11 2004/09/08 00:33:58 dan Exp $
 
 =head1 NAME
@@ -10,7 +9,7 @@
 =head1 DESCRIPTION
 
 The functions in this file are used in .ops files to access the C
-and C string constants for Parrot and native data types defined iin
+and C string constants for Parrot and native data types defined in
 F.
 
 =head2 Functions

--- src/debug.c Fri Apr 15 14:24:34 2005
+++ src/debug.c Fri Apr 15 13:30:21 2005
@@ -749,7 +749,7 @@
 PDB_line_t *line;
 long ln,i;
 
-/* If no line number was specified set it at the current line */
+/* If no line number was specified, set it at the current line */
 if (command && *command) {
 ln = atol(command);
 
@@ -944,7 +944,7 @@
 /* PDB_find_breakpoint
  *
  * Find breakpoint number N; returns NULL if the breakpoint doesn't
- * exist or if no breakpoint was specified
+ * exist or if no breakpoint was specified.
  *
  */
 /*
@@ -1470,8 +1470,8 @@
 dest[size++] = 'P';
 goto INTEGER;
 case PARROT_ARG_IC:
-/* If the opcode jumps and this is the last argument
-   means this is a label */
+/* If the opcode jumps and this is the last argument,
+   that means this is a label */
 if ((j == info->arg_count - 1) &&
 (info->jump & PARROT_JUMP_RELATIVE))
 {
@@ -1888,7 +1888,7 @@
 
 =over 4
 
-=item * This should take the line get an instruction, get the opcode for
+=item * This should take the line, get an instruction, get the opcode for
 that instruction and check that is the correct one.
 
 =item * Decide what to do with macros if anything.
@@ -2265,7 +2265,8 @@
 =item C
 
-Description.
+Dumps the buflen, flags, bufused, strlen, offset associated
+with a string and the string itself.
 
 =cut
 
--- src/dod.c   Fri Apr 15 14:24:42 2005
+++ src/dod.c   Fri Apr 15 13:41:18 2005
@@ -97,13 +97,13 @@
 ++arena_base->num_extended_PMCs;
 /*
  * XXX this basically invalidates the high-priority marking
- * of PMCs by putting all PMCs onto the front of the list
+ * of PMCs by putting all PMCs onto the front of the list.
  * The reason for this is the by far better cache locality
- * when aggregates and their contents are marked "together"
+ * when aggregates and their contents are marked "together".
  *
  * To enable high priority marking again we should probably
  * use a second pointer chain, which is, when not empty,
- * processed first
+ * processed first.
  */
 if (tptr || hi_prio) {
 if (PMC_next_for_GC(tptr) == tptr) {
@@ -177,7 +177,7 @@
 if (*dod_flags & (PObj_is_special_PMC_FLAG << nm)) {
 /* All PMCs that need special treatment are handled here.
  * For normal PMCs, we don't touch the PMC memory itself
- * so that caches stay clean
+ * so that caches stay clean.
  */
 #if GC_VERBOSE
 if (PObj_report_TEST(obj)) {
@@ -210,7 +210,7 @@
 PObj_live_SET(obj);
 
 /* if object is a PMC and contains buffers or PMCs, then attach
- * the PMC to the chained mark list
+ * the PMC to the chained mark list.
  */
 if (PObj_is_special_PMC_TEST(obj)) {
 mark_special(interpreter, (PMC*) obj);
@@ -305,7 +305,7 @@
  * but t/library/dumper* fails w/o this marking.
  *
  * It seems that the Class PMC gets DODed - these should
- * get created as constant PMCs
+ * get created as constant PMCs.
  */
 for (i = 1; i < (unsigned int)enum_class_max; i++) {
 VTABLE *vtable;
@@ -404,10 +404,10 @@
  * First phase of mark is finished. Now if we are the owner
  * of a shared pool, we must run the mark phase of other
  * interpreters in our pool, so that live shared PMCs in that
- * interpreter are appended to our mark_ptrs chain
+ * interpreter are appended to our mark_ptrs chain.
  *
  * If there is a count of shared PMCs and we have already seen
- * all these, we could skip th

[perl #35000] [PATCH] README.win32 & icu 3.2

2005-04-15 Thread François

# New Ticket Created by  François PERRAD 
# Please include the string:  [perl #35000]
# in the subject line of all future correspondence about this issue. 
# https://rt.perl.org/rt3/Ticket/Display.html?id=35000 >



small mistake in [perl #34986] :
with ICU 3.2, the library icudata.lib is renamed icudt.lib.

Francois Perrad.

--- README.win32.orig   2005-04-15 11:08:34.0 +0200
+++ README.win32        2005-04-15 11:25:50.0 +0200
@@ -65,7 +65,7 @@
 mkdir C:\usr\lib\data
 set PATH=%PATH%;C:\usr\lib\icu\bin
 cd 
-perl Configure.pl --icushared="C:\usr\lib\icu\lib\icudata.lib 
C:\usr\lib\icu\lib\icuuc.lib" --icuheaders="C:\usr\lib\icu\include" 
--icudatadir="C:\usr\local\icu\data"
+perl Configure.pl --icushared="C:\usr\lib\icu\lib\icudt.lib 
C:\usr\lib\icu\lib\icuuc.lib" --icuheaders="C:\usr\lib\icu\include" 
--icudatadir="C:\usr\local\icu\data"
 
 With MinGW32, use icu-3.2-Win32-msvc6.zip.
 
@@ -112,9 +112,9 @@
 
 With the ActiveState Perl distribution, tell Configure.pl to use gcc :
 
-perl Configure.pl --cc=gcc --icushared="C:\usr\lib\icu\lib\icudata.lib 
C:\usr\lib\icu\lib\icuuc.lib" --icuheaders="C:\usr\lib\icu\include" 
--icudatadir="C:\usr\local\icu\data"
-
-Nota: Use only the ICU binary distribution. 
+perl Configure.pl --cc=gcc --icushared="C:\usr\lib\icu\lib\icudt.lib 
C:\usr\lib\icu\lib\icuuc.lib" --icuheaders="C:\usr\lib\icu\include" 
--icudatadir="C:\usr\local\icu\data"
+or
+perl Configure.pl --cc=gcc --without-icu
 
 =item Intel C++

Re: [perl #34994] [TODO] make useful parts of Parrot config available at runtime

2005-04-15 Thread Steven Philip Schubiger

On 15 Apr, Leopold Toetsch wrote:

: That stuff is all in Perl code under the config dir, e.g:
: 
: $ find config -type f | xargs grep -w intsize

This clarifies some of my unapproved assumptions, although src has
some files containing these keywords too.

: I think we should have:
: 
:INTVAL_t   # type of the INTVAL
:FLOATVAL_t
:INTVAL_size
:int_size   # native c type
: 
: and so on. See also include/parrot/datatypes.h

I will.

: leo 

Steven

Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Matthew Walton


> 
> even sillier question:
> if <[a.z]> matches "a", "." and "z"
> and <[a...]> matches all characters from "a" including (for some
> definition of 'all')
>
> how will be range \x21 .. \x2e written?
> <[!..\.]>? (i.e. "." escaped?)
> 

I was assuming from Larry's mail that <[a...]> would parse as either:

  1) a character class containing the range from 'a' to '.' (what that
     means is a bit mind-bending for a Friday afternoon)
  2) a character class containing 'a' then a range from '.' to... oh, an
     error

Which way might be ambiguous, but could of course be defined in the
grammar. It hadn't occurred to me that ... for the range to infinity would
be allowed or useful here. I suppose it could just mean 'up to the end of
the available codepoints'.

I do love the idea of <[a..f]> type ranges though. It's just what the
three dots mean that's got me confused.
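
One way to pin down reading 1) is a small Python sketch of a class-body
parser. It assumes a greedy left-to-right rule: a character followed by ".."
and an endpoint is a range, a lone "." is a literal, and a backslash escapes
the next character. parse_class is a made-up name, and none of this is what
PGE or S05 actually specifies.

```python
def parse_class(body):
    """Expand a character-class body into the set of characters it matches."""
    chars = set()
    i = 0
    while i < len(body):
        c = body[i]
        if c == "\\" and i + 1 < len(body):   # backslash escapes the next char
            c = body[i + 1]
            i += 1
        # A range: the current char, then '..', then an endpoint.
        if body[i + 1:i + 3] == ".." and i + 3 < len(body):
            j = i + 3
            end = body[j]
            if end == "\\" and j + 1 < len(body):
                end = body[j + 1]
                j += 1
            chars.update(chr(cp) for cp in range(ord(c), ord(end) + 1))
            i = j + 1
        else:
            chars.add(c)
            i += 1
    return chars
```

Under that rule "a.z" is three literals, "a..z" is the 26 letters, "a..." is
the range from 'a' to '.', which is empty because '.' sorts before 'a', and
"!..\." covers \x21 through \x2e.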

Re: <[]> ugly and hard to type

2005-04-15 Thread Patrick R. Michaud

On Fri, Apr 15, 2005 at 02:58:44PM +0200, Juerd wrote:
> Am I the only one who thinks <[a-z]> is ugly and hard to type because of
> the nested brackets? The same goes for <{...}>. The latter can't easily
> be fixed, I think, but the former perhaps can. 

Part of the thinking behind this is that the <[...]> construct
is likely to be less common in p6 rules than [...] was in p5 regular
expressions.  For unicode reasons, one typically should be writing
 instead of <[a-z]> anyway.  

But yes, I understand the difficulty of typing <[...]> on non-US
keyboards.  :-)

> \letter[] could well replace <[]>, and \LETTER[] would then replace
> <-[]>. This is consistent with many other \letters.
> 
> "c" for character is taken
> "r" for range is taken by carriage return
> "a" for any is taken by alarm (bell)
> "l" for list is taken by lcfirst

Actually, \L[...] is gone -- see S05 and A05.  I'm not sure if \a
exists, I haven't seen any reference to it in p6 rules.  (One could
claim that it's carried over from p5, but rules are so far different
from regexes that I'm hesitant to make that assumption.)  We could
certainly declare \a to be something else.

This isn't a vote from me either in favor or against this idea...
I'm just clarifying and making sure the discussion is up-to-date
with the relevant specs.

Pm

<[]> ugly and hard to type

2005-04-15 Thread Juerd

Am I the only one who thinks <[a-z]> is ugly and hard to type because of
the nested brackets? The same goes for <{...}>. The latter can't easily
be fixed, I think, but the former perhaps can. If there are more who
think it needs to, that is. And <{}> is a bit easier to type because all
four are shifted (US QWERTY and US Dvorak), while with <[]> I really
have to think hard about when to press and when to release the shift
key.

\letter[] could well replace <[]>, and \LETTER[] would then replace
<-[]>. This is consistent with many other \letters.

"c" for character is taken
"r" for range is taken by carriage return
"a" for any is taken by alarm (bell)
"l" for list is taken by lcfirst

"m" is available, but I can't think of a mnemonic :)

\m[a..z]  \M[a..z]

And to replace <[a..z]-[aoeui]> (does that construct even exist?),
[ \m[a..z] & \M[aoeui] ]. IMO, that's the only step backwards.

"a" would best communicate its function. Is the beep thing used enough?
(\cG still does that thing if \a is gone.)


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

[RFC] some doubtable MMDs?

2005-04-15 Thread Leopold Toetsch

I'm not quite sure, but it seems that some of the MMD functions may 
better be vtable methods:

- bitwise_sh[rl]*       shift by anything other than int?
- bitwise_lsr           is missing generally

or even just a plain opcode only:

- logical_{or,and,xor}  return a PMC depending on the boolean value

What are HLLs expecting of these infix operations?

OTOH it might be useful that the current get__keyed operations 
(postcircumfix:[]) become MMD subroutines:

  Px = Py[Pz]    Pz = String, Int, Key, Slice, ...

Comments welcome,
leo

Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Braňo Tichý

- Original Message - 
From: "Aaron Sherman" <[EMAIL PROTECTED]>
To: "David Wheeler" <[EMAIL PROTECTED]>
Cc: "Perl6 Language List" 
Sent: Friday, April 15, 2005 2:00 PM
Subject: Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?


> On Thu, 2005-04-14 at 21:32 -0700, David Wheeler wrote:
> > On Apr 14, 2005, at 7:06 PM, Patrick R. Michaud wrote:
> >
> > > So, <[a.z]>  matches "a", ".", and "z",
> > > while   <[a..z]> matches characters "a" through "z" inclusive.
> >
> > I was going to say that that was inconsistent, but since you never need
> > to repeat a letter in a character class, well, I guess it isn't. But
> > the first person to write <[a...]> gets what's comin' to 'em.
>
> A silly question: is there a canonical character set from which we
> extract these ranges? Are we hard-coding Unicode here, or is there some
> way for the user to specify the character set for ranges?
>


even sillier question:
if <[a.z]> matches "a", "." and "z"
and <[a...]> matches all characters from "a" including (for some definition
of 'all')

how will be range \x21 .. \x2e written?
<[!..\.]>? (i.e. "." escaped?)


braňo

Re: [pugs] regexp "bug"?

2005-04-15 Thread hv

"Mark A. Biggar" <[EMAIL PROTECTED]> wrote:
:BÁRTHÁZI András wrote:
:
:> Hi,
:> 
:> This code:
:> 
:> my $a='A';
:> $a ~~ s:perl5:g/A/{chr(65535)}/;
:> say $a.bytes;
:> 
:> Outputs "0". Why?
:> 
:> Bye,
:>   Andras
:> 
:
:\uFFFF is not a legal unicode codepoint.  chr(65535) should raise an 
:exception of some type.  So the above code does seem to show a possible 
:bug. But as that chr(65535) is an undefined char, who knows what the 
:code is actually doing.

In perl5 at least, we support a wider concept of codepoints than the
Unicode consortium. This allows us to use strings for a wider variety
of things than just Unicode text (eg version strings, bit vectors etc).

In perl6 the greatly expanded set of types will presumably allow us
to distinguish actual Unicode data from more arbitrary sequences of
codepoints, and I'd normally expect that the more constrained type
would be a subtype of the less constrained type. In this case that
means I'd expect "Unicode string" to be a subtype of something like
"codepoint sequence".
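
A minimal sketch of that subtype relationship, in Python rather than Perl 6.
The class names CodepointSeq and UnicodeText and the is_legal() rule are
assumptions for illustration; in particular, treating surrogates and
noncharacters as "illegal" follows this thread's usage, not any settled
Perl 6 design.

```python
def is_legal(cp):
    """Assumed rule: in range, not a surrogate, not a Unicode noncharacter."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    if 0xD800 <= cp <= 0xDFFF:                 # surrogate code points
        return False
    if 0xFDD0 <= cp <= 0xFDEF:                 # noncharacter block
        return False
    return (cp & 0xFFFE) != 0xFFFE             # U+xxFFFE / U+xxFFFF in any plane


class CodepointSeq:
    """The less constrained type: any sequence of non-negative integers."""
    def __init__(self, codepoints):
        self.codepoints = list(codepoints)
        if any(cp < 0 for cp in self.codepoints):
            raise ValueError("codepoints must be non-negative")


class UnicodeText(CodepointSeq):
    """The more constrained subtype: every codepoint must pass is_legal()."""
    def __init__(self, codepoints):
        super().__init__(codepoints)
        bad = [cp for cp in self.codepoints if not is_legal(cp)]
        if bad:
            raise ValueError(f"illegal codepoint U+{bad[0]:04X}")
```

With this split, CodepointSeq([65, 0xFFFF]) is accepted while
UnicodeText([65, 0xFFFF]) is rejected, so a program can still talk about the
"bad" value without it passing as Unicode text.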

(In fact it'd probably be useful to have more levels than that - there
are times when you need the Unicode concepts for things like [[:digit:]],
but may be able to get better performance by avoiding the checks for
'legal Unicode codepoint'.)

On the other hand you will probably be able to achieve the things p5
overloads onto strings using packed integer arrays, so maybe this all
represents unnecessary complications. In which case maybe 'relaxed'
variants of Unicode strings aren't needed. We will probably still want
other sorts of strings though, such as ASCII.

Hugo

Re: Test::Expect

2005-04-15 Thread Ricardo SIGNES

* Adrian Howard <[EMAIL PROTECTED]> [2005-04-14T15:37:07]
> On 14 Apr 2005, at 11:36, Leon Brocard wrote:
> >Oh, I forgot to mention to perl-qa that I wrote Test::Expect:
> >  http://search.cpan.org/dist/Test-Expect/
> 
> It's nice. Already used it :-)

Does anyone who has used both Test::Expect and Test::Output feel like
giving a simple comparison?

-- 
rjbs



Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Aaron Sherman

On Thu, 2005-04-14 at 21:32 -0700, David Wheeler wrote:
> On Apr 14, 2005, at 7:06 PM, Patrick R. Michaud wrote:
> 
> > So, <[a.z]>  matches "a", ".", and "z",
> > while   <[a..z]> matches characters "a" through "z" inclusive.
> 
> I was going to say that that was inconsistent, but since you never need 
> to repeat a letter in a character class, well, I guess it isn't. But 
> the first person to write <[a...]> gets what's comin' to 'em.

A silly question: is there a canonical character set from which we
extract these ranges? Are we hard-coding Unicode here, or is there some
way for the user to specify the character set for ranges?

[perl #34999] [TODO] remove more old stuff

2005-04-15 Thread via RT

# New Ticket Created by  Leopold Toetsch 
# Please include the string:  [perl #34999]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org/rt3/Ticket/Display.html?id=34999 >


Some outdated files:

   lib/Parrot/PackFile/*
   lib/Parrot/PackFile.pm
   lib/Parrot/PackFile2.*

what is:

   lib/Parrot/String.pm  old packfile code?
   lib/Parrot/Types.pm   same?
   lib/Parrot/Key.pm     same?

Do we still need:

   lib/Parrot/PMC.pm
   lib/Parrot/Makefile.PL

and what about the

   chartypes

directory, seems to be created in lib/Parrot/Distribution.pm

Already discussed:

   classes/pmc2c.pl   old PMC compiler
   classes/pmcarray.pmc   wrapper for PerlArray

leo

Re: Some PMC's Questions

2005-04-15 Thread Leopold Toetsch

Bloves Mr <[EMAIL PROTECTED]> wrote:
> Hi, folks.
> I am reading the PMC C source code and a document
> (http://www.perl.com/pub/a/2002/01/30/pmcs.html).

Although the text is rather old, it's still remarkably valid.

> Some questions:

> * Has this PMC design changed?

The internal layout of the PMC structure has changed, yes. And it
will likely change again in the future. The internals of vtable calls
and PMC structure data access are now hidden inside macros:

  SELF->data               =>  PMC_data(SELF)
  SELF->cache.int_val      =>  PMC_int_val(SELF)
  $1->vtable->get_bool()   =>  VTABLE_get_bool(INTERP, $1)

and so on. For details you might consult include/parrot/pmc.h.

> * Can anybody offer some advice on learning the PMC C source code and
>   the theory behind PMCs?

Just have a look at existing PMCs in classes. Commonly used core classes
are a good place to start, e.g.:

  classes/integer.pmc            ... the Integer PMC
  classes/resizablepmcarray.pmc  ... standard PMC array

or even

  classes/tqueue.pmc             ... experimental thread-safe queue

> Thanks.

leo

Re: should we change [^a-z] to <-[a..z]> instead of <-[a-z]>?

2005-04-15 Thread Juerd

David Wheeler skribis 2005-04-14 21:32 (-0700):
> I was going to say that that was inconsistent, but since you never need 
> to repeat a letter in a character class, well, I guess it isn't. But 
> the first person to write <[a...]> gets what's comin' to 'em.

Given ASCII, <[\x20...]> would then be everything except control
characters. Handy!

By the way, does ...5 mean -Inf..5? ;)


Juerd
-- 
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html 
http://convolution.nl/gajigu_juerd_n.html

$*CWD instead of chdir() and cwd()

2005-04-15 Thread Michael G Schwern

I was doing some work on Parrot::Test today and was replacing this code
with something more cross platform.

# Run the command in a different directory
my $command = 'some command';
$command= "cd $dir && $command" if $dir;
system($command);

I replaced it with this.

my $orig_dir = cwd;
chdir $dir if $dir;
system $command;
chdir $orig_dir;

Go into some new directory temporarily, run something, go back to the
original.

Hmm.  Set a global to a new value temporarily and then return to the
original value.  Sounds a lot like local.  So why not use it?

{
    local chdir $dir if $dir;
    system $command;
}

But localizing a function call makes no sense, especially if it has side
effects.  Well, the current working directory is just a filepath.  Scalar
data.  Why have a function to change a scalar?  Just change it directly.
Now local() makes perfect sense.

{
    local $CWD = $dir if $dir;
    system $command;
}

And this is exactly what File::chdir does.  $CWD is a tied scalar.
Changing it changes the current working directory.  Reading it tells you
what the current working directory is.  Localizing it allows you to
safely change the cwd temporarily, for example within the scope of a
subroutine.  It eliminates both chdir() and cwd().
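
For readers who do not use Perl, the same scoped-cwd idea can be sketched
as a Python context manager. This is only an illustration of the pattern,
not how File::chdir is implemented; the name local_cwd is made up for the
example.

```python
import contextlib
import os


@contextlib.contextmanager
def local_cwd(new_dir=None):
    """Temporarily change the working directory, restoring it on exit.

    Hypothetical helper: mirrors `local $CWD = $dir if $dir;` in spirit only.
    """
    orig_dir = os.getcwd()
    try:
        if new_dir:
            os.chdir(new_dir)
        yield os.getcwd()
    finally:
        # Runs even if the body raises, like leaving a local() scope.
        os.chdir(orig_dir)
```

Inside "with local_cwd(some_dir):" a command runs in some_dir, and the
original directory is back in force as soon as the block is left.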

Error handling is simple, a failed chdir returns undef and sets errno.

$CWD = $dir err die "Can't chdir to $dir: $!";

I encourage Perl 6 to adopt a $*CWD similar to File::chdir's and simply
eliminate chdir() and cwd().  They're just an unlocalizable store and fetch
for global data.

As a matter of fact, Autrijus is walking me through implementing it in Pugs
right now.

Re: A sketch of the security model

2005-04-15 Thread Shevek

On Thu, 2005-04-14 at 09:11 -0400, Dan Sugalski wrote:
> At 10:03 PM -0400 4/13/05, Michael Walter wrote:

> >  > Each running thread has two sets of privileges -- the active
> >>  privileges and the enableable privileges. Active privs are what's
> >>  actually in force at the moment, and can be dropped at any time. The
> >>  enableable privs are ones that code can turn on. It's possible to
> >>  have an active priv that's not in the enableable set, in which case
> >>  the current running code is allowed to do something but as soon as
> >>  the privilege is dropped it can't be re-enabled.
> >
> >How can dropping a privilege for the duration of a (dynamic) scope be
> >implemented? Does this need to be implemented via a parrot intrinsic,
> >such as:
> >
> >   without_privs(list_of_privs, code_to_be_run_without_these_privs);
> >
> >..or is it possible to do so with the primitives you sketched out above?
> 
> When a priv is dropped it stays dropped until it's reinstated. If 
> code drops a priv that it can't re-enable then the priv is gone. 
> (There are going to be issues with privileges attached to 
> continuations, since this could potentially mean that dropped privs 
> get un-dropped when you invoke a return continuation, though dropping 
> a privilege could ripple up the return continuation chain)

Reinstating privileges when you return is normal, since potentially
malicious code and data has now been removed from the stack.

If you do NOT do it this way, then every piece of code must know the
privileges of every child piece of code it calls (bye-bye virtual base
classes with user implementations).

See http://research.microsoft.com/~adg/Publications/MSR-TR-2001-103.pdf

The ability to explicitly reenable a privilege via an opcode, rather
than via the removal of the malicious party from the computation (by
return) is almost definitely a bad idea. If you protect this opcode
using some security mechanism, you will rapidly find that security
mechanism can supersede the functionality provided by the opcode.

> >  > Additionally, subroutines may be marked as having privileges, which
> >>  means that as long as control is inside the sub the priv in question
> >>  is enabled. This allows for code that has elevated privs, generally
> >>  system-level code.
> >
> >Does the code marking a subroutines must have any other privilege than
> >the one it is marking the subroutine with?
> 
> Dunno, that's something we'll need to work out. It's possible that 
> sub marking needs to be done externally -- that is, it's bytecode 
> metadata or something like that which requires system privileges of 
> some sort to set. (Though there are issues with that) Marking code as 
> privileged is really a system administration task, though we've not 
> really put much thought into administering a parrot system yet.

Actually, what usually happens is that subroutines (etc) are associated
with a responsible party (principal), and privileges are granted to the
principal; thus finding out the privileges of an opcode requires an
extra indirection. This is not a problem.

> >  > ... Non-continuation
> >>  invokables (subs and methods) maintain the current set of privs, plus
> >>  possibly adding the sub-specific privs.
> >Same for closures?
> 
> Yeah, I think so.

No, as before. You cannot execute based only on static privileges - this
is what Unix does, and the Unix model is broken. You need either a stack
inspection or a data inspection model, or a combination of the two. Ask
me if you want formal descriptions or implementation details of these
models.

S.

Re: Parrot bytecode reentrancy

2005-04-15 Thread Leopold Toetsch

Nigel Sandever <[EMAIL PROTECTED]> wrote:

> When a sub that closes over a variable

>   my $closure = 0;
>   sub do_something {
>   return $closure++;
>   }

> is called from two threads, do the threads share a single closure or
> each get their own separate closure?

AFAIK: the closure bytecode is shared, the Closure PMC with the lexical
pad is distinct. But none of that is implemented yet.

> njs

leo

Re: A sketch of the security model

2005-04-15 Thread Shevek

On Wed, 2005-04-13 at 22:03 -0400, Michael Walter wrote:
> Dan,
> 
> On 4/13/05, Dan Sugalski <[EMAIL PROTECTED]> wrote:
> > All security is done on a per-interpreter basis. (really on a
> > per-thread basis, but since we're one-thread per interpreter it's
> > essentially the same thing)
> Just to get me back on track: Does this mean that when you spawn a
> thread, a separate interpreter runs in/manages that thread, or
> something else?
> 
> > Each running thread has two sets of privileges -- the active
> > privileges and the enableable privileges. Active privs are what's
> > actually in force at the moment, and can be dropped at any time. The
> > enableable privs are ones that code can turn on. It's possible to
> > have an active priv that's not in the enableable set, in which case
> > the current running code is allowed to do something but as soon as
> > the privilege is dropped it can't be re-enabled.
> 
> How can dropping a privilege for the duration of a (dynamic) scope be
> implemented? Does this need to be implemented via a parrot intrinsic,
> such as:
> 
>   without_privs(list_of_privs, code_to_be_run_without_these_privs);
> 
> ..or is it possible to do so with the primitives you sketched out above?

This is usually done by creating a function "f(code) { code() }" without
any static privileges in list_of_privs. To evaluate a function g()
without those privileges, evaluate f(g), and the natural mechanisms of
the interpreter will ensure that these privileges are not held during
g().
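
A minimal sketch of that mechanism, assuming a toy stack-inspection model
in which the effective privilege set is the intersection of the static
sets of every function on the call stack. All names here (_stack,
with_static_privs, effective_privs) are invented for the illustration and
are not Parrot APIs.

```python
# Static privilege sets of the calls that are currently active.
_stack = []


def with_static_privs(*privs):
    """Decorator attaching a static privilege set to a function."""
    static = frozenset(privs)

    def decorate(func):
        def wrapper(*args, **kwargs):
            _stack.append(static)
            try:
                return func(*args, **kwargs)
            finally:
                # Returning removes this frame, so the caller's
                # privileges are in force again.
                _stack.pop()
        return wrapper
    return decorate


def effective_privs():
    """Intersect the static sets of all frames currently on the stack."""
    if not _stack:
        return frozenset()
    result = _stack[0]
    for static in _stack[1:]:
        result = result & static
    return result


@with_static_privs("read", "write", "net")
def g():
    return effective_privs()


@with_static_privs("read")  # f lacks "write" and "net"
def f(code):
    return code()


@with_static_privs("read", "write", "net")
def main():
    # First g() directly, then g() wrapped in the low-privilege f().
    return g(), f(g)
```

Calling main() yields the full set for the direct call and only "read"
for f(g); nothing has to be re-enabled explicitly afterwards.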

> > Additionally, subroutines may be marked as having privileges, which
> > means that as long as control is inside the sub the priv in question
> > is enabled. This allows for code that has elevated privs, generally
> > system-level code.
> 
> Does the code marking a subroutines must have any other privilege than
> the one it is marking the subroutine with?
> 
> > ... Non-continuation
> > invokables (subs and methods) maintain the current set of privs, plus
> > possibly adding the sub-specific privs.
> 
> Same for closures?

Closures may also capture a concept of the current context, which is
used when they are evaluated. This is critical in, for example, the case
of system code with higher static privileges returning a closure to a
low privilege object which may evaluate it at any time.

a) The closure must not have any privileges not held by the low
privilege object, so clearly it cannot just hold its static privilege
set, it must capture a current context.

b) If it does wish to have higher privilege (very common), it may grant
(Fournet+Gordon,2003) these privileges in a dynamic scope bounded below
by itself.
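
Points (a) and (b) could be modelled roughly as follows. This is a toy
sketch under my own assumptions, not the Fournet and Gordon formalism:
contexts are plain sets, and the Closure class and its call() method are
hypothetical names.

```python
class Closure:
    """A callable that remembers the privilege context it was created in."""

    def __init__(self, func, captured, static):
        self.func = func
        self.captured = frozenset(captured)  # dynamic context at creation
        self.static = frozenset(static)      # privileges of the defining code

    def call(self, caller_privs, grant=()):
        # (a) never exceed the caller or the captured creation context
        effective = self.captured & frozenset(caller_privs)
        # (b) a grant only succeeds for privileges held statically
        effective = effective | (frozenset(grant) & self.static)
        return self.func(effective)


system_static = {"read", "write"}
low_static = {"read"}

# System code is called by the low-privilege object, so the context at
# creation time is the intersection of both static sets.
captured = system_static & low_static
closure = Closure(lambda privs: sorted(privs), captured, system_static)
```

Here closure.call(low_static) runs with only "read", while
closure.call(low_static, grant={"write"}) also holds "write" for the
duration of that call. Asking for a privilege the system code does not
hold statically has no effect.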

S.

Re: A sketch of the security model

2005-04-15 Thread Shevek

On Thu, 2005-04-14 at 09:51 -0700, Dave Whipp wrote:
> Dan Sugalski wrote:
> 
> > All security is done on a per-interpreter basis. (really on a per-thread 
> > basis, but since we're one-thread per interpreter it's essentially the 
> > same thing)
> ...
> >* Number of open files
> >* IO operations/sec
> >* IO operations total
> ...
> 
> Can an "application" get more resources simply by spawning threads? If 

Well, given that a child thread's dynamic access control context should
include the dynamic context of the parent thread at the point where the
thread was spawned, No.

What I describe is a (provably) correct implementation.

> the answer is "no, parent and child must divide share their quotas" then 
> there is a load balancing problem. If the answer is "yes", then there's 

There is no load balancing problem assuming you are synchronized on the
thread-create point, which is not a major overhead, since that pretty
much has to be a synchronization point in the kernel anyway.

> no real protection at all. A threads-per-second limit isn't an answer 
> here, either (a malicious app could sit around for a few hours, 
> launching threads at a low intensity, until it has enough to bring down 
> the system).
> 
> Is a thread really the right thing to apply these limits to? It seems to 

Limits are applied to privilege sets, not to threads.

> me that there needs to be some sort of token (cf. cash; cf "capability") 
> that an application can obtain/spend/refresh to do these ops. An 

Yes, that's about the same.

> application could share its token(s) with any threads it creates. It 
> could probably even "loan" its token to a backgroud thread that does 
> some operation on behalf of many other threads.

Preferably not. I fear the concept of being able to hand out privileges
to low privilege threads. If the low privilege thread has access to a
(willing) object with static privileges allowing the operation, then
that object should perform the operation on behalf of the thread in a
dynamic context created by a 'grant' operation (See Fournet and Gordon,
2003). If the low privilege thread is made up entirely of low privilege
objects, then it shouldn't have the privilege under any circumstances.

S.

[SVN ci] MMD 23 - convert subtract MMD functions and opcodes

2005-04-15 Thread Leopold Toetsch

Continuing the MMD infix plan, we now have:
1) the subtract MMD functions are converted to the new function signature:

  PMC* subtract(PMC* value, PMC* dest)

If dest isn't NULL, it's set to the result of the operation and the
result is returned. This is the existing behavior. The new "n_sub"
opcode (still TODO) will return a new destination with the result, as
needed by languages like Python or Lisp.

2) There are now distinct infix variants of subtract, with "i_" 
prepended to the function name:

  void i_subtract(PMC *value)

3) during opcode generation, the "sub" opcode is converted according to:

  sub Px, Py, Pz    =>  infix .MMD_SUBTRACT,   Px, Py, Pz
  sub Px, Py        =>  infix .MMD_I_SUBTRACT, Px, Py
  sub Px, Px, Py    =>  infix .MMD_I_SUBTRACT, Px, Py

I'm not quite sure whether the latter is technically correct or useful. It
might cause a problem when operators are overloaded. OTOH it can save a
comparison "if (dest == SELF) ...".

4) Tcl and Python scalars use the inherited subtract MMD of Parrot core
types Integer, Float, Complex, and BigInt. The old (duplicated,
cut'n'pasted) variants of subtract were simply deleted from the Tcl and
Python dynamic classes.

5) for type promotion on Integer overflow, I've changed the bignum 
vtables. We now have:

  PMC* VTABLE_get_bignum(INTERP, SELF)

which returns a new big integer of the appropriate type, e.g. a PyLong.
Along with morph these two functions are enough to preserve the HLLs'
view of types. There is a new test t/dynclass/pyint_26 that shows
correct promotion of PyInt to PyLong.

6) during changing the scalar classes I found a lot of unused functions 
and vtables. E.g.
  - get_bool_keyed*  # unused, unneeded
  - set_bool_keyed*  # same
  - set_number
  - set_string   # no vtable slots, we have assign anyway
This is partially cleaned up now.

7) make test succeeds, this includes t/dynclass/py*.t

  cd languages/tcl
  TEST_PROG_ARGS=-G make test

shows 46/228 failing; with DOD enabled almost all fail.

I don't know yet, what's going on here. It seems that TclParser is the 
culprit. It creates during class_init a lot of strings e.g. "bs_nl", 
which are declared static in that file. But these strings aren't 
anchored anywhere or registered with Parrot's DOD registry.

leo
PS please "make realclean" so that vtable changes are propagated
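
The opcode conversion in point 3 can be pictured with a small sketch. It
only restates the three rewriting rules as a function over operand
names; it is not the actual opcode generator, and rewrite_sub is a
made-up name.

```python
def rewrite_sub(operands):
    """Map the operands of a 'sub' opcode to an 'infix' opcode (sketch)."""
    if len(operands) == 3:
        dest, left, right = operands
        if dest == left:
            # sub Px, Px, Py  =>  in-place variant
            return ("infix", ".MMD_I_SUBTRACT", dest, right)
        # sub Px, Py, Pz  =>  three-operand variant
        return ("infix", ".MMD_SUBTRACT", dest, left, right)
    if len(operands) == 2:
        # sub Px, Py  =>  in-place variant
        dest, right = operands
        return ("infix", ".MMD_I_SUBTRACT", dest, right)
    raise ValueError("sub expects two or three operands")
```

The third rule is the one questioned above: it silently turns a
three-operand form into the in-place variant whenever the destination
and the left operand are the same register.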

Re: New language: Parrot Common Lisp

2005-04-15 Thread Leopold Toetsch

Cory Spencer <[EMAIL PROTECTED]> wrote:

> I'd like to announce the creation of the Parrot Common Lisp project, which
> aims to implement a significant subset of the Common Lisp language.

Wow. I can even do something with it:

$ ../parrot lisp.imc
-> (+ 2 5)
7
-> (list 1 2 3)
(1 . (2 . (3 . NIL)))

Ehem, that's almost all I know about Lisp.

>  Depending on the system (I develop on both x86/Linux and g4/OS X),
>  you'll get a Bus Error, Segmentation Fault or some other random error
>  if you don't disable the GC.

>  (If anyone is able to track down aforementioned DOD/GC problems,
>  you'll earn my eternal gratitude.)

Can you please provide a code snippet that exhibits the error?

> -c

leo

Re: [perl #34994] [TODO] make useful parts of Parrot config available at runtime

2005-04-15 Thread Leopold Toetsch

Steven Philip Schubiger wrote:

[ cc'ed list, so that folks know about takers ]

> On 15 Apr, Leopold Toetsch wrote:
> : 5) along with bringing the config online, some cleanup and renaming
> : wouldn't harm e.g. "iv" vs "opcode_t", "intvalsize" vs "intsize" vs
> : "opcode_t_size" ...
>
> This part seems appealing to me, but bear in mind, I've never tampered
> with the Parrot C sources, although I've been heavily involved in other
> C-based projects (GNU coreutils et al.)

That stuff is all in Perl code under the config dir, e.g.:

  $ find config -type f | xargs grep -w intsize

> And do you have more examples or should I follow my guts?

I think we should have:

  INTVAL_t   # type of the INTVAL
  FLOATVAL_t
  INTVAL_size
  int_size   # native c type

and so on. See also include/parrot/datatypes.h

> Steven

leo

Re: A sketch of the security model

2005-04-15 Thread Shevek

On Wed, 2005-04-13 at 17:51 -0400, Aaron Sherman wrote:
> On Wed, 2005-04-13 at 17:01, Dan Sugalski wrote:
> > So here's what I was thinking of for Parrot's security and quota 
> > model. (Note that none of this is actually *implemented* yet...)
> [...]
> > It's actually pretty straightforward, the hard part being the whole 
> > "don't screw up when implementing" thing, along with designing the 
> > base set of privs. Personally I think taking the VMS priv and quota 
> > system as a base is a good way to go -- it's well-respected and 
> > well-tested, and so far as I know theoretically sound. Unix's priv 
> > model's a lot more primitive, and I don't think it's the one to take. 
> > (We could invent our own, but history shows that people who invent 
> > their own security system invent ones that suck, so that looks like 
> > something worth avoiding)
> 
> VMS at least *is* a priv-based security model, but VMS privs are not
> appropriate for parrot on the whole.

The best known model for privileges (logic of authorisation over) is
that of Oracle, RT, etc, where access over privileges is transitive.
Will find good references on request/when I have more time. Bad
references are available from Ravi Sandhu, but he doesn't handle
transitivity or modification of rights well, if at all.

S.

Re: A sketch of the security model

2005-04-15 Thread Shevek

Someone's pointed this thread out to me, so I'm going to shove an oar in
following a few posts. I've done a fair bit of security work, so feel
free to ask me to explain, justify or provide references for anything.

On Wed, 2005-04-13 at 17:01 -0400, Dan Sugalski wrote:
> All security is done on a per-interpreter basis. (really on a 
> per-thread basis, but since we're one-thread per interpreter it's 
> essentially the same thing)

What you actually mean (or what I believe you _should_ mean) is
per-context, in the lambda-calculus sense of context. See notes below
about continuations.

> QUOTAs are limits on the number of resources or operations that an 
> interpreter an allocate or perform, either in absolute terms (i.e. 
> allocate no more than 10M of memory) or relative terms (i.e. can do 
> only 10 IO operations per second). Quotas are tracked by parrot, and 
> cover:

The ability to manipulate and exceed QUOTAs should be controlled in
dynamic context.

> PRIVILEGEs are permissions to do certain things. Parrot will have a 
> number of privileges it checks before doing dangerous operations, and 
> user code may also assign and check privileges.
> 
> Normally parrot runs with no quotas and no privilege checking. This 
> is the fastest way to run. Code may at any time enable privilege 

Actually, you can do privilege checking in an efficient engine, even
using most of the reflection systems, with almost no overhead. See Java.

> and/or quota checking. Once enabled code must have proper privileges 
> to disable it again.

Typically AllPermission, otherwise you have the ability to perform
privilege escalation.

> Each running thread has two sets of privileges -- the active 
> privileges and the enableable privileges. Active privs are what's 
> actually in force at the moment, and can be dropped at any time. The 
> enableable privs are ones that code can turn on. It's possible to 
> have an active priv that's not in the enableable set, in which case 
> the current running code is allowed to do something but as soon as 
> the privilege is dropped it can't be re-enabled.

Enableable privileges are usually called static privileges and are
usually defined as the privileges held statically by the current object,
or if we read ahead to your next point, subroutine.

> Additionally, subroutines may be marked as having privileges, which 
> means that as long as control is inside the sub the priv in question 
> is enabled. This allows for code that has elevated privs, generally 
> system-level code.

Please no. Privileges should be explicitly granted. You have just
described the Unix SUID model, where as long as control is inside a
root-owned daemon (for daemon, read subroutine), the root privilege is
enabled. This always leads to privilege escalation and is BAD.

What you _should_ mean, according to all prior research, is that "No
code may be inside that routine and still hold a privilege not held by
the routine". In shorter form, "The dynamic (current) privilege set must
not exceed the static privilege set of any routine on the stack". A
slightly different formulation applies for data inspection systems. See
footnote.

> Continuations, when taken, capture the current set of active and 
> enableable privs, and when invoked those privs are put into place. 
> (This is a spot that will require some thought, since there's a 
> potential for privilege leaks which worries me here) Non-continuation 
> invokables (subs and methods) maintain the current set of privs, plus 
> possibly adding the sub-specific privs.

If you perform the above step correctly, then capturing a context and
including it in future access control checks is not hard. Java does this
by capturing a current AccessControlContext when a new ClassLoader is
created in a thread to be used in a different thread. No code loaded by
that ClassLoader IN ANY THREAD may exceed the privileges of the thread
which created the classloader at the time it created it.

> It's actually pretty straightforward, the hard part being the whole 
> "don't screw up when implementing" thing, along with designing the 
> base set of privs. Personally I think taking the VMS priv and quota 
> system as a base is a good way to go -- it's well-respected and 
> well-tested, and so far as I know theoretically sound. Unix's priv 
> model's a lot more primitive, and I don't think it's the one to take. 
> (We could invent our own, but history shows that people who invent 
> their own security system invent ones that suck, so that looks like 
> something worth avoiding)

Better systems to inspect would be Java (stack inspection), Perl5 (data
inspection). Please do not confuse the choice of privilege set and logic
over it (authorisation system) with the mechanism for identifying the
current set of privileges (identification of current principal).

The key difference in security between stack inspection and data
inspection systems for the purposes of parrot is that stack inspection
considers for sec

Re: Parrot/PUGS Hack-a-thon at the Austrian Perl Workshop

2005-04-15 Thread BÁRTHÁZI András

Hi,

> There will be a Parrot/PUGS Hack-a-thon at the Austrian Perl Workshop, which
> takes place on 9th and 10th June in Vienna, Austria.
> Autrijus Tang, Chip Salzenberg and Leo Toetsch will be there. You should be
> there too :-)

I'll be there, too. ;)

Bye,
  Andras

Re: Hyper operator corner case?

2005-04-15 Thread Thomas Sandlaß

John Williams wrote:
> Good point.  Another one is: how does the meta_operator determine the
> "identity value" for user-defined operators?

Does it have to? The definition of the identity value---BTW, I like
the term "neutral value" better because identity also is a relation
between two values---is that $x my_infix_op $neutral == $x.
So the generic implementation that copies surplus elements is correct
with respect to the resulting value. You shouldn't expect the operator
to be called as many times as there are elements in the bigger data
structure, though. It's called only for positions where both structures
have actual values. But that is the same as short-circuiting && and ||.
And somewhat the reverse of autothreading from junctive values.
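
To make the "copy surplus elements" behaviour concrete, here is a small
sketch of such a generic implementation. It is only my reading of the
semantics discussed in this thread, with a made-up function name hyper
and Python lists standing in for arrays.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking a position with no value


def hyper(op, left, right):
    """Apply op pairwise; copy surplus elements of the longer list."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if not left_is_list and not right_is_list:
        return op(left, right)
    # A non-list operand is treated as a scalar and replicated.
    if not left_is_list:
        left = [left] * len(right)
    if not right_is_list:
        right = [right] * len(left)
    result = []
    for a, b in zip_longest(left, right, fillvalue=_MISSING):
        if a is _MISSING:
            result.append(b)  # surplus element copied, op not called
        elif b is _MISSING:
            result.append(a)  # surplus element copied, op not called
        else:
            result.append(op(a, b))
    return result
```

With addition as op, hyper(op, [1, 2, 3], [10]) gives [11, 2, 3] and calls
op exactly once, while hyper(op, [1, 2, 3], 1) replicates the scalar and
gives [2, 3, 4]. No neutral value is ever needed.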

> I believe the fine points fall out like this:
>
>    @a >>+<< 1    # replicate
>    @a >>+<< (1)  # replicate: (1) is still scalar
>    @a >>+<< [1]  # extend: [1] is an array (and will auto-deref)

I think they fall out naturally from typing and dispatch. But note
that the » « operator has three args. I haven't made the &op a dispatch
selector. If the my_infix_op from above needs to handle neutral elements
by itself, just tell the dispatcher by defining
&infix_circumfix_meta_operator:{'»','«'}:(List,List,&my_infix_op:) and
construct the neutral elements when one of the lists runs out of elements.

I hope the syntax I used does what I want to express. Note that in
:(List,List,&my_infix_op:) the first two elements are types while
&my_infix_op is a sub value. In that sense my &op was actually wrong
but it was nice for wording my sentence. So the generic name should read
&infix_circumfix_meta_operator:{'»','«'}:(List,List:Code) or perhaps
&infix_circumfix_meta_operator:{'»','«'}:(List,List:&) if & is considered
as the code sigil. Hmm, then we could also have :(@,@:&) meaning the
same type spec?

BTW, starting from these type specs I come (back) to the suggestion of using
» « for hypering function calls and/or their arguments. Has that been decided?

I'm not sure if specialisation on values is covered by the :() syntax.
E.g. one could implement &infix:<*>:(0,Any) to return 0 without evaluating
the Any term at all! But this needs either lazy evaluation in the functional
paradigm or code morphing 'x() * y()' to '(($t = x()) != 0) ?? $t * y() :: 0'
or some such. On the assembler level this morphing reduces to an additional
check of a register for zero. But I'm not sure if the type system and the
optimizer will be *that* strong in the near future ;)

Regards
--
TSa (Thomas Sandlaß)

Some PMC's Questions

2005-04-15 Thread bloves mr

Hi, folks.
I am reading the PMC C source code and a document
(http://www.perl.com/pub/a/2002/01/30/pmcs.html).

Some questions:

* Has this PMC design changed?
* Can anybody offer some advice on learning the PMC C source code and
  the theory behind PMCs?

Thanks.
/*
  p2p is a protocol or a compiler?
*/

Re: [pugs] regexp "bug"?

2005-04-15 Thread Mark A. Biggar

BÁRTHÁZI András wrote:
> Hi,
>
> >> This code:
> >>
> >> my $a='A';
> >> $a ~~ s:perl5:g/A/{chr(65535)}/;
> >> say $a.bytes;
> >>
> >> Outputs "0". Why?
> >
> > \uFFFF is not a legal Unicode codepoint.  chr(65535) should raise an
> > exception of some type.  So the above code does seem to show a possible
> > bug. But as chr(65535) is an undefined char, who knows what the
> > code is actually doing.
>
> In my opinion (which can be wrong), \uFFFF can be stored as a UTF-8
> character; it should be 0xEF~0xBF~0xBF. If I do it outside the regexp (I
> mean "say chr(65535).bytes"), it works well.
>
> Another "bug" I've found is not related to the regexps, but is still
> a Unicode character one:
>
>   say chr(0x10).bytes;
>
> The answer:
>
>   pugs: encodeUTF8: ord returned a value above 0x10
>
> And if I start to increment $b, I will get:
>
>   pugs: Prelude.chr: bad argument
>
> I don't understand it, as I thought that Unicode characters are in the
> range of 0x-0x7FFF. Is Haskell not supporting the whole set?
>
> There is a Unicode version, called UCS-2, that is just between
> 0x-0x, but it still does not answer the question.
>
> [...]
>
> Meanwhile, I've found this:
>
> http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2175.htm
>
> It can be the answer to my question.

Yes, the value 0xFFFF can be stored as either a 3-byte UTF-8 string or a
2-byte UCS-2 value, but the Unicode standard specifically says that the
values 0xFFFF, 0xFFFE and 0xFEFF are NOT valid codepoints and should
never appear in a Unicode string.  0xFFFF is reserved for out-of-band
signaling (such as the -1 returned by getc()) and 0xFFFE and 0xFEFF are
specifically reserved for out-of-band marking of a UCS-2 file as being
either big-endian or little-endian, but are specifically not considered
part of the data.  chr() is currently defined to mean convert an int
value to a Unicode codepoint. That's why I said that chr(65535) should
raise an exception; it's an argument error similar to sqrt(-1).
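
A rough sketch of a chr() that treats such values as argument errors is
given below. It is written in Python for illustration and is not what
Pugs does. The rules it assumes are: no negative values, nothing above
0x10FFFF, no surrogates (0xD800-0xDFFF), and not 0xFFFE or 0xFFFF. It
deliberately does not reject 0xFEFF, since Unicode assigns that value to
the byte order mark; that is my assumption and goes beyond the message
above. The names is_valid_codepoint and checked_chr are made up.

```python
def is_valid_codepoint(value):
    """Return True if value is acceptable as a Unicode codepoint (sketch)."""
    if value < 0 or value > 0x10FFFF:
        return False  # outside the Unicode code space
    if 0xD800 <= value <= 0xDFFF:
        return False  # surrogate range
    if value in (0xFFFE, 0xFFFF):
        return False  # the two values rejected by the rules assumed above
    return True


def checked_chr(value):
    """chr() that raises instead of returning an invalid character."""
    if not is_valid_codepoint(value):
        raise ValueError("not a valid Unicode codepoint: %#x" % value)
    return chr(value)
```

With this, checked_chr(65535) raises ValueError rather than handing an
undefined character to the regex engine.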

--
[EMAIL PROTECTED]
[EMAIL PROTECTED]

Re: [pugs] regexp "bug"?

2005-04-15 Thread Mark A. Biggar

BÁRTHÁZI András wrote:
> Hi,
>
> This code:
>
> my $a='A';
> $a ~~ s:perl5:g/A/{chr(65535)}/;
> say $a.bytes;
>
> Outputs "0". Why?
>
> Bye,
>   Andras

\uFFFF is not a legal Unicode codepoint.  chr(65535) should raise an
exception of some type.  So the above code does seem to show a possible
bug. But as chr(65535) is an undefined char, who knows what the
code is actually doing.

--
[EMAIL PROTECTED]
[EMAIL PROTECTED]

Re: [pugs] regexp "bug"?

2005-04-15 Thread BÁRTHÁZI András

Hi,

>> my $a='A';
>> $a ~~ s:perl5:g/A/{chr(65535)}/;
>> say $a.bytes;
>>
>> Outputs "0". Why?
>
> \uFFFF is not a legal Unicode codepoint.  chr(65535) should raise an
> exception of some type.  So the above code does seem to show a possible
> bug. But as chr(65535) is an undefined char, who knows what the
> code is actually doing.

It seems that it gives back 0 in the 0xE000-0x range. Do you still
think it's normal?

"Some Unicode code points are invalid and should not be used. [...] It
can't be 0xFFFF or 0xFFFE, it can't be both <= 0xDFFF and >= 0xD800, and
it can't be > 0x10FFFF and it can't be less than 0."

  http://www.elfdata.com/plugin/unicodefaqdata.html

Bye,
  Andras
