Re: Stringification of references and objects.

2002-12-05 Thread Joseph F. Ryan
Brent Dax wrote:


Joseph F. Ryan:
# By default, references should not stringify to anything
# "pretty", they should stringify to something useful for
# debugging.  Heck, even perl5 style should be fine.  Not only

Why?  Isn't the pretty form more generally useful?


I don't think so; I'd find it annoying to have to type more code
in order to specify a more concise form.  If I need to dump a structure,
I'd prefer to do it manually.


# is this handy, but also prevents problems with circular 
# referencing data structures, huge data structures, etc.  
# However, all built-in types should have a .repr() method, 
# which should provide primitive Data::Dumper-ish output
# 
# So:
# 
# $var = [1,2,3];
# print "$var";
# print "\n";
# print "$($var.repr)";
# 
# Might print something like:
# 
# [REF_TO_ARRAY_AT: '0x1245AB']

What's wrong with a Perl 5-esque format for the debugging version?

	Array(0x1245AB)

Personally, I like this format.  It's succinct, informative, and tells
you enough to do identity testing.
 


I like it too, but I thought everyone else hated it :)


# Next, objects:
# 
# Objects should have an AS_STRING method inherited from 
# UNIVERSAL defined as follows:

I'd prefer if we drop the capitals.  str() ought to work fine, IMHO.

# method AS_STRING() {
# return "[CLASS_INSTANCE_OF: '" ~ $self.CLASS() ~ "']";
# }

Once again, what's wrong with:

	method str() {
		#Unnamed invocant means you need $_, right?
		return $_.class() ~ "($_.id())";
	}

(where id() returns a uniquely identifying integer, usually the
address).


Objects aren't references anymore, are they?  So I don't think it is
appropriate for an object to stringify with its id.

Joseph F. Ryan
[EMAIL PROTECTED]




Re: purge: opposite of grep

2002-12-05 Thread Damian Conway
I would suggest that we could get away with a single n-ary built-in.

And I would strongly suggest that C<divvy> isn't the right name for it,
since, apart from being an ugly slang word, "divvy" implies dividing up
equally. The built-in would actually be doing classification of the
elements of the list, so it ought to be called C<classify>.

I would expect that C<classify> would return a list of array references.
So Larry is (of course! ;-) entirely correct in pointing out that it
would require the use of := (not =). As for an error when = is used,
perhaps that ought to be handled by a general "Second and subsequent
lvalue arrays will never be assigned to" error.

The selector block/closure would, naturally, be called in C<int>
context each time, so (again, as Larry pointed out) a boolean
function would naturally classify into two arrays. Though it
might at first be a little counterintuitive to have to write:

	(@false, @true) := classify { $^x > 10 } @nums;

I think it's a small price to pay to avoid tiresome special cases.

Especially since you then get your purge/vrep/antigrep for free:

	(@members) := classify {$_->{'quit'}} @members;

;-)
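
For readers who want to see the n-ary idea run, here is a rough sketch in Python (chosen only because the Perl 6 syntax above is still conjectural). The `classify` and `purge` helpers, their signatures, and the explicit `buckets` count are assumptions made for illustration, not the proposed built-in. The selector's result is coerced to an integer and used as the bucket index, which is why a boolean test yields (false, true) in that order.

```python
def classify(selector, items, buckets):
    """Split items into `buckets` lists, indexed by int(selector(item)).

    Hypothetical helper for illustration; not the proposed built-in.
    """
    result = [[] for _ in range(buckets)]
    for item in items:
        index = int(selector(item))       # False -> 0, True -> 1
        if not 0 <= index < buckets:
            raise IndexError(f"selector returned {index}, expected 0..{buckets - 1}")
        result[index].append(item)
    return result


def purge(test, items):
    """Keep the items for which test is false: bucket 0 of a two-way split."""
    return classify(lambda item: bool(test(item)), items, 2)[0]


nums = [3, 12, 7, 42, 10]
false_items, true_items = classify(lambda x: x > 10, nums, 2)
print(false_items)   # [3, 7, 10]
print(true_items)    # [12, 42]

members = [{"name": "ann", "quit": False}, {"name": "bob", "quit": True}]
print(purge(lambda m: m["quit"], members))   # [{'name': 'ann', 'quit': False}]
```

Taking only the first bucket is the Python counterpart of binding a single array on the left-hand side.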

Damian





Re: Usage of \[oxdb]

2002-12-05 Thread Damian Conway
Michael Lazzaro wrote:


Huh... having a comma-separated list to represent multiple characters.  
I can't think of any problems with that, and it would be marginally 
easier for some sequences...

Unless someone on the design team objects, I'd say let's go for it.

Larry was certainly in favour of it when he wrote A5
(see under http://search.cpan.org/perl6/apo/A05.pod#Backslash_Reform).
Except the separators he suggests are semicolons:

Perl 5          Perl 6
\x0a\x0d        \x[0a;0d]   # CRLF
\x0a\x0d        \c[CR;LF]   # CRLF (conjectural)
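
As a rough illustration of the semicolon-separated form, the following Python sketch expands `\x[...]` lists into the characters they name. The function name and the regular expression are assumptions made for this example; they are not taken from Apocalypse 5, and the named `\c[CR;LF]` form is not covered.

```python
import re

# Matches the bracketed form only, e.g. \x[0a;0d]. The pattern is an
# assumption for illustration, not the Apocalypse 5 grammar.
_HEX_LIST = re.compile(r"\\x\[([0-9A-Fa-f]+(?:;[0-9A-Fa-f]+)*)\]")


def expand_hex_lists(text):
    """Replace each \\x[h1;h2;...] with the characters it names."""
    def replace(match):
        return "".join(chr(int(code, 16)) for code in match.group(1).split(";"))
    return _HEX_LIST.sub(replace, text)


print(repr(expand_hex_lists(r"line one\x[0a;0d]line two")))
# 'line one\n\rline two'
```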

Damian




RE: Stringification of references and objects.

2002-12-05 Thread Brent Dax
Joseph F. Ryan:
# By default, references should not stringify to anything
# "pretty", they should stringify to something useful for
# debugging.  Heck, even perl5 style should be fine.  Not only

Why?  Isn't the pretty form more generally useful?

# is this handy, but also prevents problems with circular 
# referencing data structures, huge data structures, etc.  
# However, all built-in types should have a .repr() method, 
# which should provide primitive Data::Dumper-ish output
# 
# So:
# 
# $var = [1,2,3];
# print "$var";
# print "\n";
# print "$($var.repr)";
# 
# Might print something like:
# 
# [REF_TO_ARRAY_AT: '0x1245AB']

What's wrong with a Perl 5-esque format for the debugging version?

Array(0x1245AB)

Personally, I like this format.  It's succinct, informative, and tells
you enough to do identity testing.

# Next, objects:
# 
# Objects should have an AS_STRING method inherited from 
# UNIVERSAL defined as follows:

I'd prefer if we drop the capitals.  str() ought to work fine, IMHO.

# method AS_STRING() {
# return "[CLASS_INSTANCE_OF: '" ~ $self.CLASS() ~ "']";
# }

Once again, what's wrong with:

    method str() {
        #Unnamed invocant means you need $_, right?
        return $_.class() ~ "($_.id())";
    }

(where id() returns a uniquely identifying integer, usually the
address).

--Brent Dax <[EMAIL PROTECTED]>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

"If you want to propagate an outrageously evil idea, your conclusion
must be brazenly clear, but your proof unintelligible."
--Ayn Rand, explaining how today's philosophies came to be




Stringification of references and objects.

2002-12-05 Thread Joseph F. Ryan
A big issue that still remains with literals is the stringification of
objects and references.  In an effort to get the behaviors hammered
down, here are a few ideas:

First off, references:

By default, references should not stringify to anything "pretty", they
should stringify to something useful for debugging.  Heck, even perl5
style should be fine.  Not only is this handy, but also prevents
problems with circular referencing data structures, huge data
structures, etc.  However, all built-in types should have a .repr()
method, which should provide primitive Data::Dumper-ish output

So:

   $var = [1,2,3];
   print "$var";
   print "\n";
   print "$($var.repr)";

Might print something like:

[REF_TO_ARRAY_AT: '0x1245AB']
[
   '1',
   '2',
   '3'
]



Next, objects:

Objects should have an AS_STRING method inherited from UNIVERSAL
defined as follows:

method AS_STRING() {
   return "[CLASS_INSTANCE_OF: '" ~ $self.CLASS() ~ "']";
}

The AS_STRING method is implicitly called when an object is
interpolated within a string.  The AS_STRING method can be
overloaded within the class if the class's author wants nicer
(classier;) output.

so:

   class Normal {}

   class Special {
       method AS_STRING() {
           return qq['one','two','three'];
       }
   }

   my Normal  $obj1;
   my Special $obj2;

   print $obj1;
   print "\n";
   print $obj2;

Should print:

[CLASS_INSTANCE_OF: 'Normal']
'one','two','three'
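
The same split between a terse default and an overridable string form can be sketched in Python. Everything here is a stand-in chosen for illustration: `Universal` plays the role of UNIVERSAL, `as_string` the role of AS_STRING, and `debug_ref` mimics the Perl 5-esque form discussed in the replies. None of these names are part of any proposal.

```python
class Universal:
    """Stand-in for the proposed UNIVERSAL base class (hypothetical)."""

    def as_string(self):
        return f"[CLASS_INSTANCE_OF: '{type(self).__name__}']"

    def __str__(self):
        # String interpolation calls as_string() implicitly.
        return self.as_string()


class Normal(Universal):
    pass


class Special(Universal):
    def as_string(self):
        # The class author overrides the default for nicer output.
        return "'one','two','three'"


def debug_ref(value):
    """Perl 5-esque debugging form: type name plus identity."""
    return f"{type(value).__name__}(0x{id(value):X})"


print(f"{Normal()}")          # [CLASS_INSTANCE_OF: 'Normal']
print(f"{Special()}")         # 'one','two','three'
print(debug_ref([1, 2, 3]))   # e.g. list(0x7F3A5C0B1D40); the address varies
print(repr([1, 2, 3]))        # [1, 2, 3]
```

The last two lines show the distinction the post draws: a short identity string by default, and a dumper-style form only on request.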




Re: purge: opposite of grep

2002-12-05 Thread Austin Hastings

--- Dave Whipp <[EMAIL PROTECTED]> wrote:
> 
> I think that c<cull> would be an abysmal name: that implies
> "keep the false ones". I'm not sure that there is a synonym
> for "boolean partition" though. Perhaps we need some help
> from a linguist! ;)
> 

What's wrong with split()?

split { f($_) }, $iterator    -or- @array.split { f($_) }

vs.

split /\Q$delim\E/, $string   -or- $string.split( /\Q$delim\E/ )


BTW, since it's possible to say:

my (@even, @odd) = split { $_ % 2 }, 0 .. Inf;

I presume that split will be smart enough to be usefully lazy. So
laziness is probably a contagious property. (If the input is lazy, the
output probably will be, too.)

But what happens with side-effects, or with pathologically ordered
accesses?

That is, iterators tend to get wrapped with a lazy array, which caches
the accesses.

So if the discriminator function caches values of its own, what
happens?

E.g.,

# Side-effects
my (@even, @odd)
= split { is_prime($_) and $last_prime = $_; $_ % 2 }, 0..Inf;

The value of $last_prime is .. ?

# Pathological access:
my (@even, @odd) = ... as above ...

print $#odd;

Does @even (which is going to be cached by the lazy array) just swamp
memory, or what?
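
To make the caching worry concrete, here is a small Python sketch of a lazy two-way split over an unbounded source. `lazy_split` is a hypothetical helper, and returning the internal queues alongside the two streams is done purely so the example can inspect them. Under this particular design the answer to the "pathological access" question is yes: everything belonging to the unread stream piles up in its queue. Side effects in the selector likewise happen only as far as the source has been pulled, so a variable like $last_prime would depend on how much either stream has been consumed.

```python
from collections import deque
from itertools import count, islice


def lazy_split(selector, source):
    """Return (false_items, true_items, queues) over a possibly endless source.

    Items pulled from the source while looking for the next element of one
    stream are queued for the other stream, so a queue grows without bound
    if only one stream is ever consumed.
    """
    source = iter(source)
    queues = (deque(), deque())

    def stream(which):
        while True:
            while not queues[which]:
                try:
                    item = next(source)
                except StopIteration:
                    return
                queues[1 if selector(item) else 0].append(item)
            yield queues[which].popleft()

    return stream(0), stream(1), queues


even, odd, queues = lazy_split(lambda n: n % 2, count(0))
print(list(islice(odd, 5)))   # [1, 3, 5, 7, 9]
print(len(queues[0]))         # 5 (even numbers waiting, unread)
```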


=Austin







Re: purge: opposite of grep

2002-12-05 Thread Miko O'Sullivan
On 5 Dec 2002, Rafael Garcia-Suarez wrote:

> If you want good'ol Unix flavor, call it "vrep". Compare the ed(1) /
> ex(1) / vi(1) commands (where 're' stands for regular expression, of
> course) :
> :g/re/p
> :v/re/p

I like it.  Fits in with our Un*x heritage, and doesn't have any existing
meaning that implies things it doesn't do.

-miko


Miko O'Sullivan
Programmer Analyst
Rescue Mission of Roanoke




Advanced Contexts (was: RE: seperate() and/or Array.cull)

2002-12-05 Thread Michael Lazzaro

On Thursday, December 5, 2002, at 07:53  AM, Austin Hastings wrote:

And in general, without resorting to something hideous like scanf, is
there going to be some more-advanced want() variant that allows saying

@a, $i, $j, @b, %w, $k, @c = scramble(...);


This is a terribly important question, for divvy() and everything else.
Whether or not the description of "context" of a sub/block has the
same robustness as the "arguments" of that sub/block has big
implications for multimethods, among other things.

I would hope that, minimally, the return type is considered part of the 
(multimethod) signature, and that you can test for at least the scalar 
types:

  my int $i = bar(...);       # so these are
  my str $s = bar(...);       # the same sub, but
  my MyClass $o = bar(...);   # different multimethod variants.


In my fantasy world (which I visit quite often), P6 context has 
descriptive capabilities matching those that can be assigned to args.

  (@a,@b)    = foo(...);   # same sub,
  (@a,@b,@c) = foo(...);   # different multimethod variants.

  my int @arr = zap(...);   # ... you get the idea ...
  my num @arr = zap(...);
  my MyClass @arr = zap(...);
  my Array of Array of Hash of MyClass @arr = zap(...);


That last one leads to the obvious question of what C<want> returns in
the case of compound types:

  my Array of Array of Hash of MyClass @arr = zap(...);

  ... then inside zap ...

  want scalar;                             # false
  want list;                               # true (but what spelling?)
  want Array;                              # true (but what spelling?)
  want Array of Array;                     # true
  want Array of Array of Hash;             # true
  want Array of Array of Hash of MyClass;  # true

  want Hash;                               # false
  want MyClass;                            # false
  want Array of int;                       # false

  my $want = want; # 'Array of Array of Hash of MyClass'; (?)
  my @want = want; # qw(Array, Array, Hash, MyClass); (?)


If we have such functionality, then divvy() and other multimethods can
be tailored to DWYM even in quite specific contexts, and a lot of 
things become much easier.

MikeL



Re: purge: opposite of grep

2002-12-05 Thread Dave Whipp

"Miko O'Sullivan" <[EMAIL PROTECTED]> wrote:
> On Thu, 5 Dec 2002, Dave Whipp wrote:
>
> > Only if we apply a bit of magic (2 is a true value). The rule might be:
>
> How about if we just have two different methods: one for boolean and one
> for multiple divvies:
>
>  my(@true, @false) := @array.cull{/some test/};
>
>  my (@a, @b, @c) := @array.divvy{some code}

I think you are correct, but only because of the psychology of
affordances: you wrote "@true, @false", not "@false, @true".
I use the same mental ordering, so I expect it would be a
common bug.

I think that c<cull> would be an abysmal name: that implies
"keep the false ones". I'm not sure that there is a synonym
for "boolean partition" though. Perhaps we need some help
from a linguist! ;)


Dave.





Re: purge: opposite of grep

2002-12-05 Thread Rafael Garcia-Suarez
John Williams wrote in perl.perl6.language :
> 
> While "purge" is cute, it certainly is not obvious what it does.  Of
> course neither is "grep" unless you are an aging unix guru...
> 
> How about something which is at least obvious to someone who knows what
> grep is, such as "vgrep" or "grep:v"?

If you want good'ol Unix flavor, call it "vrep". Compare the ed(1) /
ex(1) / vi(1) commands (where 're' stands for regular expression, of
course) :
:g/re/p
:v/re/p

What would be an idiomatic Perl 6 implementation of such a vrep function ?



Re: purge: opposite of grep

2002-12-05 Thread Miko O'Sullivan
On Thu, 5 Dec 2002, Dave Whipp wrote:

> Only if we apply a bit of magic (2 is a true value). The rule might be:

How about if we just have two different methods: one for boolean and one
for multiple divvies:

 my(@true, @false) := @array.cull{/some test/};

 my (@a, @b, @c) := @array.divvy{some code}



Miko O'Sullivan
Programmer Analyst
Rescue Mission of Roanoke




RE: purge: opposite of grep

2002-12-05 Thread Fisher Mark
> FWIW, I came up with "purge" because my first inclination was to spell
> "grep" backwards: "perg".  :-)

I like "purge", although "except", "exclude", and "omit" all have their
charms.

For partition function, I like "divvy", "carve", "segment" (in that order)
and almost anything other than "separate", which IIRC is one of the most
misspelled words in English.
===
Mark Leighton Fisher          [EMAIL PROTECTED]
Thomson multimedia, Inc.      Indianapolis IN
"we have tamed lightning and used it to teach sand to think"




Re: purge: opposite of grep

2002-12-05 Thread Dave Whipp
"Larry Wall" <[EMAIL PROTECTED]> wrote:
> On Thu, Dec 05, 2002 at 10:09:08AM -0800, Michael Lazzaro wrote:
> : What about "divvy" (or are we already using that for something else?)
> :
> : my(@a,@b) = divvy { ... } @c;
>
> Any such solution must use := rather than =.  I'd go as far as to say
> that divvy should be illegal in a list context.

I'm not sure I understand that: we're assigning here, not binding (aren't
we?).

> Note that if the closure is expected to return a small integer saying
> which array to divvy to, then boolean operators fall out naturally
> because they produce 0 and 1.

Only if we apply a bit of magic (2 is a true value). The rule might be:

If context is a list of arrays, then the coderef is evaluated in
integer context: to map each input value to an integer, which selects
which array to append the input-value onto.

If the size of the context is "list of 2 arrays", then the coderef is
evaluated in Boolean context, and the index determined as
c< $result ?? 1 :: 0 >.

If the context is a single array, then it is assumed to be an
array-of-arrays: and the coderef is evaluated in integer-context.

If the context is a hash, then the coderef is evaluated in scalar
context, and the result used as a hash key: the value is pushed
onto the array, in the hash, identified by the key.
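
These rules can be sketched as a single dispatching function. In the Python below, the `into` argument is an assumption standing in for the assignment context (Python has no equivalent of it): an integer N means "a list of N arrays" and the string "hash" means a hash of arrays. The single-array (array-of-arrays) rule is left out to keep the sketch short.

```python
from collections import defaultdict


def divvy(selector, items, into):
    """Distribute items according to the shape of the destination.

    `into` is a hypothetical stand-in for context: an int N for a list of
    N arrays, or the string "hash" for a hash of arrays.
    """
    if into == "hash":
        groups = defaultdict(list)
        for item in items:
            groups[selector(item)].append(item)   # result used as the key
        return dict(groups)

    buckets = [[] for _ in range(into)]
    for item in items:
        result = selector(item)
        if into == 2:
            index = 1 if result else 0    # Boolean context: $result ?? 1 :: 0
        else:
            index = int(result)           # integer context
        if not 0 <= index < into:
            raise IndexError(f"no array number {index}")
        buckets[index].append(item)
    return buckets


print(divvy(lambda n: n % 3, range(7), 3))     # [[0, 3, 6], [1, 4], [2, 5]]
print(divvy(lambda n: n, [0, 5, 0, 2], 2))     # [[0, 0], [5, 2]]
print(divvy(len, ["a", "bb", "c"], "hash"))    # {1: ['a', 'c'], 2: ['bb']}
```

The second call shows the "bit of magic": with exactly two arrays, 5 and 2 count as true and land in the second array instead of being out-of-range indices.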


One more thing: how do I tell the assignment not to clear the
LHS at the start of the operation?  Can I say:

  my (@a,@b) = divvy { ... } @a1;
  (@a,@b) push= divvy { ... } @a2;


Dave.





Re: purge: opposite of grep

2002-12-05 Thread Larry Wall
On Thu, Dec 05, 2002 at 10:09:08AM -0800, Michael Lazzaro wrote:
: What about "divvy" (or are we already using that for something else?)
: 
: my(@a,@b) = divvy { ... } @c;

Any such solution must use := rather than =.  I'd go as far as to say
that divvy should be illegal in a list context.

Note that if the closure is expected to return a small integer saying
which array to divvy to, then boolean operators fall out naturally
because they produce 0 and 1.

Larry



RE: seperate() and/or Array.cull

2002-12-05 Thread Michael Lazzaro

On Thursday, December 5, 2002, at 10:09  AM, Michael Lazzaro wrote:

What about "divvy" (or are we already using that for something else?)

my(@a,@b) = divvy { ... } @c;

Other possibilities from the ol' thesaurus: C, C, 
C, C.

@$#@%*.  Trying to do too many %#@%@ things at once.  I meant 'divvy' 
instead of 'seperate', not 'purge', obviously (duh).  I like Angel's 
general theorizing, but maybe we base it on C instead of C?



Note that this does not generalize for cases > 2.  If you want to 
split things into, say, three different lists, or five, you have to 
use a 'given', and it gets less pleasant.  Perhaps a C can be a 
derivation of C or C by "dividing the streams", either 
like this:

my(@a,@b,@c,@d) = divvy {
    /foo/ ::
    /bar/ ::
    /zap/ ::
} @source;

or this (?):

   divvy( @source; /foo/ :: /bar/ :: /zap/ ) -> @a, @b, @c, @d;


where C<::> is whatever delimiter we deem appropriate, and an empty 
test is taken as the "otherwise" case.

Just pondering.  Seems like a useful variation on the whole C 
vs. C vs. C theme, though.

MikeL




Re: purge: opposite of grep

2002-12-05 Thread Miko O'Sullivan
On Thu, 5 Dec 2002, Robert Spier wrote:

> -R (who does not see any benefit of 'perg' over grep { ! code } )

My problem with grep { ! code } is the same problem I have with if (!
expression): I've never developed a real trust in operator precedence.
Even looking at your pseudocode example, I itched to "fix" it with grep {!
(code) }.

This may be a weakness on my part, but I like computers to address my
weaknesses: I certainly spend enough time addressing theirs.

-miko



Miko O'Sullivan
Programmer Analyst
Rescue Mission of Roanoke




Re: purge: opposite of grep

2002-12-05 Thread Robert Spier
>How about my original inclination: "perg"?  It just screams out "the
>opposite of grep".

So it greps a list in reverse order?

-R (who does not see any benefit of 'perg' over grep { ! code } )




Re: purge: opposite of grep

2002-12-05 Thread Miko O'Sullivan
On Wed, 4 Dec 2002, John Williams wrote:

> While "purge" is cute, it certainly is not obvious what it does.  Of
> course neither is "grep" unless you are an aging unix guru...
>
> How about something which is at least obvious to someone who knows what
> grep is, such as "vgrep" or "grep:v"?

How about my original inclination: "perg"?  It just screams out "the
opposite of grep".

-miko


Miko O'Sullivan
Programmer Analyst
Rescue Mission of Roanoke




Re: purge: opposite of grep

2002-12-05 Thread Adam D. Lopresto
I like it except for the name, which feels too active to me (ie, if I were to
purge those elements from the array I'd expect the array to be altered, instead
of returning a new array with only those elements).  But I do like the idea.  I
think the name "except" would be pretty nice, though.  Then again, I'm not too
terribly fond of "grep".  If it were named "only", then things might be really
nice.  (Or we could name them "accept" and "except" and be mean :))
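
The difference between the two readings of the name (return a filtered copy, or alter the array itself) is easy to see in a short Python sketch. The member records are invented sample data, and `itertools.filterfalse` is used only as a convenient existing "inverse grep"; nothing here is a proposed interface.

```python
from itertools import filterfalse

members = [
    {"name": "ann", "quit": False},
    {"name": "bob", "quit": True},
    {"name": "cy", "quit": False},
]

# "except"-style: build a new list and leave the original alone.
remaining = list(filterfalse(lambda m: m["quit"], members))
print([m["name"] for m in remaining])   # ['ann', 'cy']
print(len(members))                     # 3

# What the more active name "purge" might suggest: alter the list in place.
members[:] = [m for m in members if not m["quit"]]
print(len(members))                     # 2
```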

> SUMMARY
> 
> Proposal for the "purge" command as the opposite of "grep" in the same way
> that "unless" is the opposite of "if".
> 
> DETAILS
> 
> I've lately been doing a lot of greps in which I want to keep all the
> elements in an array that do *not* match some rule.  For example, suppose
> I have a list of members of a club, and I want to remove (i.e. "purge")
> from the list everybody for whom the "quit" property is true.  With grep
> it's done like this:
> 
>    @members = grep {! $_->{'quit'}} @members;
> 
> Obviously that works well enough, but just like "unless" somehow
> simplifies the logic by removing that leading !, "purge" can simplify the
> array filter:
> 
>    @members = purge {$_->{'quit'}} @members;
> 
> FWIW, I came up with "purge" because my first inclination was to spell
> "grep" backwards: "perg".  :-)
> 
> -miko
> 
> 
> Miko O'Sullivan
> Programmer Analyst
> Rescue Mission of Roanoke
> 
> 
-- 
Adam Lopresto ([EMAIL PROTECTED])
http://cec.wustl.edu/~adam/

I love apathy with a passion.

--Jamin Gray



Re: Usage of \[oxdb] (was Re: String Literals, take 2)

2002-12-05 Thread Larry Wall
On Thu, Dec 05, 2002 at 09:18:21AM -0800, Michael Lazzaro wrote:
: 
: On Thursday, December 5, 2002, at 02:11  AM, James Mastros wrote:
: 
: >On 12/04/2002 3:21 PM, Larry Wall wrote:
: >>\x and \o are then just shortcuts.
: >Can we please also have \0 as a shortcut for \0x0?
: 
: \0 in addition to \x, meaning the same thing?  I think that would get 
: us back to where we were with octal, wouldn't it?  I'm not real keen on 
: leading zero meaning anything, personally...  :-P

\0 still means chr(0).  I don't think there's much conflict with
the new \0x, \0o, \0b, and \0d, since \0 almost always occurs at the
end of a string, if anywhere.

: >>There ain't no such thing as a "wide" character.  \xff is exactly
: >>the same character as \x[ff].
: >Which means that the only way to get a string with a literal 0xFF byte 
: >in it is with qq:u1[\xFF]? (Larry, I don't know that this has been 
: >mentioned before: is that right?)  chr:u1(0xFF) might do it too, but 
: >we're getting ahead of ourselves.
: 
: Hmm... does this matter?  I'm a bit rusty on my Unicode these days, but 
: I was assuming that \xFF and \x00FF always pointed to the same 
: character, and that you in fact _don't_ have the ability to put 
: individual bytes in a string, because Perl is deciding how to place the 
: characters for you (how long they should be, etc.)  So if you wanted 
: more explicit control, you'd use C.

A "byte" string is any string whose characters are all under 256.  It's
up to an interface to coerce this to actual bytes if it needs them.

We'll presumably have something like "use bytes" that turns off all
multi-byte processing, in which case you have to deal with any UTF that
comes in by hand.  But in general it'll be better if the interface coerces
to types like "str8", which is presumably pronounced "straight".

Don't ask me how str16 and str32 are pronounced.  (But generally you should
be using utf16 instead of str16 in any event, unless your interface truly
doesn't know how to deal with surrogates.)  In other words, str16 is
the name of the obsolescent UCS-2, and str32 is the name for UCS-4, which
is more or less the same as UTF-32, except that UTF-32 is not allowed to
use code points above 0x10FFFF (that is, planes above 0x10).

So anyway, we've got all these types:

str8    utf8
str16   utf16
str32   utf32

where the "str" version is essentially just a compact integer array.  One could
alias str8 to "latin1" since the default coercion from Unicode to str8 would
have those semantics.
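
The distinctions among these types can be seen with Python's standard codecs, used here purely as an analogy: "latin-1" stands in for the str8 coercion described above, and "utf-16-be" and "utf-32-be" for utf16 and utf32. The mapping between these codec names and the proposed Perl 6 types is an assumption of this sketch.

```python
text = "\u00ff"                      # LATIN SMALL LETTER Y WITH DIAERESIS
print(text.encode("latin-1"))        # b'\xff'      one byte per character
print(text.encode("utf-8"))          # b'\xc3\xbf'  two bytes in UTF-8

astral = "\U0001D11E"                # MUSICAL SYMBOL G CLEF, above U+FFFF
print(astral.encode("utf-16-be").hex())   # d834dd1e  a surrogate pair
print(astral.encode("utf-32-be").hex())   # 0001d11e  one 32-bit unit

try:
    astral.encode("latin-1")
except UnicodeEncodeError:
    print("not representable as a str8-style byte string")
```

The surrogate pair is the case a UCS-2-only (str16-style) interface cannot handle, which is why utf16 is the safer default.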

It's not clear exactly what the bare "str" type is.  "Str" is obviously
the abstract string type, but "str" probably means the default C string
type for the current architecture/OS/locale/whatever.  In other words,
it might be str8, or it might be utf8.  Let's hope it's utf8, because
that will work forever, give or take an eon.

: >Also, an annoying corner case: is "\0x1ff" eq "\0x[1f]f", or is it eq 
: >"\0x[1ff]"?  What about other bases?  Is "\0x1x" eq "\0x[1]", or is it 
: >eq "\0x[1x]" (IE illegal).  (Now that I put those three questions 
: >together, the only reasonable answer seems to be that the number ends 
: >in the last place it's valid to end if you don't use explicit 
: >brackets.)
: 
: Yeah, my guess is that it's as you say... it goes till it can't goes no 
: more, but never gives an error (well, maybe for "\0xz", where there are 
: zero valid digits?)  But I would suspect that the bracketed form is 
: *strongly* recommended.  At least, that's what I plan on telling 
: people.  :-)

Sounds good to me.  Dwimming is wonderful, but so is dwissing.
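
The "goes until it can't" rule is essentially greedy matching, which a short Python sketch can show. The function name and regular expression are assumptions for illustration and cover hexadecimal only. Leaving an escape with zero valid digits untouched is a choice made here, since the thread leaves open whether that case should be an error.

```python
import re

# Bracketed form first, then the bare form that runs as far as the digits
# are valid. Hex only; the pattern is an illustration, not a grammar.
_ESCAPE = re.compile(r"\\0x(?:\[([0-9A-Fa-f]+)\]|([0-9A-Fa-f]+))")


def expand_0x(text):
    """Expand \\0x[...] and bare \\0x... escapes in text."""
    def replace(match):
        digits = match.group(1) or match.group(2)
        return chr(int(digits, 16))
    return _ESCAPE.sub(replace, text)


print(expand_0x(r"\0x1ff") == "\u01ff")         # True: all three digits are consumed
print(expand_0x(r"\0x[1f]f") == "\x1f" + "f")   # True: brackets end the number early
print(expand_0x(r"\0x1x") == "\x01" + "x")      # True: stops at the first invalid digit
print(expand_0x(r"\0xz"))                       # \0xz  (no valid digits, left untouched)
```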

Larry



Re: purge: opposite of grep

2002-12-05 Thread Michael Lazzaro

On Wednesday, December 4, 2002, at 09:11  PM, John Williams wrote:


On Wed, 4 Dec 2002, Miko O'Sullivan wrote:


FWIW, I came up with "purge" because my first inclination was to spell
"grep" backwards: "perg".  :-)


While "purge" is cute, it certainly is not obvious what it does.  Of
course neither is "grep" unless you are an aging unix guru...


The idea certainly has merit, though. It _is_ a quite common operation.

What about "divvy" (or are we already using that for something else?)

my(@a,@b) = divvy { ... } @c;

Other possibilities from the ol' thesaurus: C, C, C, 
C.



Note that this does not generalize for cases > 2.  If you want to split 
things into, say, three different lists, or five, you have to use a 
'given', and it gets less pleasant.  Perhaps a C can be a 
derivation of C or C by "dividing the streams", either like 
this:

my(@a,@b,@c,@d) = divvy {
    /foo/ ::
    /bar/ ::
    /zap/ ::
} @source;

or this (?):

   divvy( @source; /foo/ :: /bar/ :: /zap/ ) -> @a, @b, @c, @d;


where C<::> is whatever delimiter we deem appropriate, and an empty 
test is taken as the "otherwise" case.

Just pondering.  Seems like a useful variation on the whole C 
vs. C vs. C theme, though.

MikeL



Re: Usage of \[oxdb]

2002-12-05 Thread Michael Lazzaro
On Wednesday, December 4, 2002, at 12:55  PM, David Whipp wrote:

How far can we go with this \c thing? How about:
  print "\c[72, 101, 108, 108, 111]";
will that print "Hello"?


Huh... having a comma-separated list to represent multiple characters.  
I can't think of any problems with that, and it would be marginally 
easier for some sequences...

Unless someone on the design team objects, I'd say let's go for it.

MikeL



Re: Usage of \[oxdb] (was Re: String Literals, take 2)

2002-12-05 Thread Michael Lazzaro

On Thursday, December 5, 2002, at 02:11  AM, James Mastros wrote:


On 12/04/2002 3:21 PM, Larry Wall wrote:

\x and \o are then just shortcuts.

Can we please also have \0 as a shortcut for \0x0?


\0 in addition to \x, meaning the same thing?  I think that would get 
us back to where we were with octal, wouldn't it?  I'm not real keen on 
leading zero meaning anything, personally...  :-P

There ain't no such thing as a "wide" character.  \xff is exactly
the same character as \x[ff].

Which means that the only way to get a string with a literal 0xFF byte 
in it is with qq:u1[\xFF]? (Larry, I don't know that this has been 
mentioned before: is that right?)  chr:u1(0xFF) might do it too, but 
we're getting ahead of ourselves.

Hmm... does this matter?  I'm a bit rusty on my Unicode these days, but 
I was assuming that \xFF and \x00FF always pointed to the same 
character, and that you in fact _don't_ have the ability to put 
individual bytes in a string, because Perl is deciding how to place the 
characters for you (how long they should be, etc.)  So if you wanted 
more explicit control, you'd use C.

Also, an annoying corner case: is "\0x1ff" eq "\0x[1f]f", or is it eq 
"\0x[1ff]"?  What about other bases?  Is "\0x1x" eq "\0x[1]", or is it 
eq "\0x[1x]" (IE illegal).  (Now that I put those three questions 
together, the only reasonable answer seems to be that the number ends 
in the last place it's valid to end if you don't use explicit 
brackets.)

Yeah, my guess is that it's as you say... it goes till it can't goes no 
more, but never gives an error (well, maybe for "\0xz", where there are 
zero valid digits?)  But I would suspect that the bracketed form is 
*strongly* recommended.  At least, that's what I plan on telling 
people.  :-)

Design team: If we're wrong on these, please correct.  :-)

MikeL



How do you return arrays?; was: RE: seperate() and/or Array.cull

2002-12-05 Thread Austin Hastings
In thinking about how to write a "partition" function (or separate, or
whatever you want to call it) it occurs to me that you might want some
sort of reverse-varargs behavior, like

my (@a, @b, @c, @d) = @array.partition { $_ % 4 };

So in this case, partition is supposed to determine, on the fly, how
many classes to return (or return all the classes it makes, and let an
exception take mismatches).

How do we do that? 
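
One way to picture the second option (return every class that turns up and let an exception take mismatches) is the Python sketch below. The `partition` function is hypothetical, and Python's tuple unpacking stands in for list assignment: it raises ValueError when the number of classes does not match the number of targets. It does not address the first option, where the function would need to know how many arrays the caller wants.

```python
def partition(selector, items):
    """Return one list per class, however many classes the selector yields.

    The selector must return non-negative integers. A class that receives
    no items still gets an empty list if a higher-numbered class exists.
    """
    classes = []
    for item in items:
        index = selector(item)
        while len(classes) <= index:     # grow on demand
            classes.append([])
        classes[index].append(item)
    return classes


a, b, c, d = partition(lambda n: n % 4, range(10))
print(a, b, c, d)   # [0, 4, 8] [1, 5, 9] [2, 6] [3, 7]

try:
    a, b, c = partition(lambda n: n % 4, range(10))
except ValueError as error:
    print("mismatch:", error)
```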

And in general, without resorting to something hideous like scanf, is
there going to be some more-advanced want() variant that allows saying

@a, $i, $j, @b, %w, $k, @c = scramble(...);

??

=Austin


--- HellyerP <[EMAIL PROTECTED]> wrote:
> Angel Faus:
> > Maybe the solution is to make it hash-wise:
> > 
> > %hash = @array.sep {
> >     when /^[A-Z]*$/ {'uppercase'}
> >     when /^[a-z]*$/ {'lowercase'}
> >     default         {'mixedcase'}
> > }
> 
> I agree that general partitioning is 'better' than a fixed binary
> proposal,
> but what is gained over the full code except a tiny bit of sugar?
> 
> for (@array)
> {
>     when /^[A-Z]+$/ { push %hash{'uppercase'}, $_ }
>     when /^[a-z]+$/ { push %hash{'lowercase'}, $_ }
>     default         { push %hash{'mixedcase'}, $_ }
> }
> 
> On the other hand, perhaps binary-partitioning is sufficiently common
> to
> warrant Schwern's abbreviated syntax:
> 
> (@switches, @args) = separate /^-/, @ARGV;
> 
> Which in full would be something like:
> 
> for (@ARGV)
> {
>     when /^-/ { push @switches, $_ }
>     default   { push @args, $_ }
> }
> 
> Philip
> 
> 
> Disclaimer
> 
> This communication together with any attachments transmitted with it
> ('this E-mail') is intended only for the use of the addressee and may
> contain information which is privileged and confidential. If the
> reader of this E-mail is not the intended recipient or the employee
> or agent responsible for delivering it to the intended recipient you
> are notified that any use of this E-mail is prohibited. Addressees
> should ensure this E-mail is checked for viruses. The Carphone
> Warehouse Group PLC makes no representations as regards the absence
> of viruses in this E-mail. If you have received this E-mail in error
> please notify our ISe Response Team immediately by telephone on + 44
> (0)20 8896 5828 or via E-mail at [EMAIL PROTECTED] Please then
> immediately destroy this E-mail and any copies of it.
> 
> Please feel free to visit our website: 
> 
> UK
> http://www.carphonewarehouse.com
> 
> Group
> http://www.phonehouse.com
> 
> 




Re: In defense of zero-indexed arrays.

2002-12-05 Thread Austin Hastings
> Explain how having indexes (arrays, substr, etc...) in Perl 6 start
> at 0 will benefit most users.

The languages which do not start their indices at 0 are dead or dying. 

> Do not invoke legacy. 

How about FUD? :-)

=Austin

--- Michael G Schwern <[EMAIL PROTECTED]> wrote:
> I'm going to ask something that's probably going to launch off into a
> long,
> silly thread.  But I'm really curious what the results will be so
> I'll ask
> it anyway.  Think of it as an experiment.
> 
> So here's your essay topic:
> 
> Explain how having indexes (arrays, substr, etc...) in Perl 6 start
> at 0
> will benefit most users.  Do not invoke legacy. [1]
> 
> 
> [1] ie. "because that's how most other languages do it" or "everyone
> is used
> to it by now" are not valid arguments.  Ask any Pascal programmer. :)




RE: seperate() and/or Array.cull

2002-12-05 Thread HellyerP
Angel Faus:
> Maybe the solution is to make it hash-wise:
> 
> %hash = @array.sep {
>     when /^[A-Z]*$/ {'uppercase'}
>     when /^[a-z]*$/ {'lowercase'}
>     default         {'mixedcase'}
> }

I agree that general partitioning is 'better' than a fixed binary proposal,
but what is gained over the full code except a tiny bit of sugar?

for (@array)
{
    when /^[A-Z]+$/ { push %hash{'uppercase'}, $_ }
    when /^[a-z]+$/ { push %hash{'lowercase'}, $_ }
    default         { push %hash{'mixedcase'}, $_ }
}

On the other hand, perhaps binary-partitioning is sufficiently common to
warrant Schwern's abbreviated syntax:

(@switches, @args) = separate /^-/, @ARGV;

Which in full would be something like:

for (@ARGV)
{
    when /^-/ { push @switches, $_ }
    default   { push @args, $_ }
}

Philip






Re: seperate() and/or Array.cull

2002-12-05 Thread Angel Faus

Michael G Schwern wrote: 
> and that's just entirely too much work.  I'd love to be able to do
> it with a grep like thing.
>
>  (@switches, @args) = seperate /^-/, @ARGV;
>
> seperate() simply returns two lists.  One of elements which match,
> one of elements which don't.  I think Perl 6 will allow the above
> syntax to work rather than having to play with array refs.
>

It would be nice to make it work with more than two arrays too.

Something like this:

(@upper, @lower, @mixed) = @array.sep {
when /^[A-Z]*$/ {0}
when /^[a-z]*$/ {1}
default {2}
}

But it looks a bit dangerous, because the following won't work if 
@array has numbers in it:

(@false_members, @true_members) = @array.sep { $_ }; # bad

Maybe the solution is to make it hash-wise:

%hash = @array.sep {
when /^[A-Z]*$/ {'uppercase'}
when /^[a-z]*$/ {'lowercase'}
default {'mixedcase'}
}
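
A rough sketch of the hash-wise idea in Python, given only as an illustration. `classify` and `case_of` are hypothetical names, and the patterns mirror the `*` quantifier used above, so an empty string would count as uppercase.

```python
import re
from collections import defaultdict

def classify(key, items):
    """Group items into a dict of lists, keyed by whatever key(item) returns."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)

def case_of(word):
    # Mirrors the when/default chain: first matching pattern wins.
    if re.fullmatch(r"[A-Z]*", word):
        return "uppercase"
    if re.fullmatch(r"[a-z]*", word):
        return "lowercase"
    return "mixedcase"

groups = classify(case_of, ["FOO", "bar", "Baz", "QUX"])
```

Because the block's result is used as a label rather than as a list position, a block that returns arbitrary values (numbers included) simply produces more keys instead of indexing into the wrong array.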

-angel



RE: seperate() and/or Array.cull

2002-12-05 Thread HellyerP
Aaron Crane:
> However, I don't think it should be called 'seperate'.  I also don't think
> it should be called 'separate', because that word seems to be commonly
> misspelled...

That seems like an excellent argument for calling it 'separate'.  Perhaps it
will be the first of many spelling-improving keywords, enforced by syntax
highlighting of course.  

Philip









Re: seperate() and/or Array.cull

2002-12-05 Thread Aaron Crane
Michael G Schwern writes:
> I'd love to be able to do it with a grep like thing.
> 
>  (@switches, @args) = seperate /^-/, @ARGV;

Yes.  I've written that function in Perl 5, which isn't ideal, because you
have to return array refs, not arrays.

However, I don't think it should be called 'seperate'.  I also don't think
it should be called 'separate', because that word seems to be commonly
misspelled...

It's hard to come up with a good name, though.  Bad ones I've thought of
include:

   grepboth
 - The unpleasant name my Perl 5 implementation has
   split
 - Overloaded meaning -- but we could perhaps get away with scalar-split
   and array-split being different
   characterize
 - Or do I mean 'characterise'?
   partition
   classify
 - These are the two I dislike least

>@switches = @ARGV.cull /^-/;
> 
> Array.cull would remove and return a list of every element in @ARGV which
> matched.

I'm not so fond of that -- I don't think it's as obvious that you're doing a
two-way classification.

-- 
Aaron Crane * GBdirect Ltd.
http://training.gbdirect.co.uk/courses/perl/



Re: In defense of zero-indexed arrays.

2002-12-05 Thread Richard Proctor
On Thu 05 Dec, Michael G Schwern wrote:
> So here's your essay topic:
> 
> Explain how having indexes (arrays, substr, etc...) in Perl 6 start at 0
> will benefit most users.  Do not invoke legacy. [1]
>
> [1] ie. "because that's how most other languages do it" or "everyone is
> used to it by now" are not valid arguments.  Ask any Pascal programmer. :)

Many years ago I was involved with a project where all the software
people referred to the hardware as planes 0 and 1 (it was a duplicated 
system) and the hardware people always used 1 and 2.  To avoid confusion
we settled on using 0 and 2.

Any way of indexing arrays has its proponents.  Perl currently has the
heavily deprecated $[ to allow playing with this base; changing it has 
nasty effects at a distance.

Long long ago some computer languages did base their arrays at 1 rather
than 0.  Hopefully they are dead now - it led to confusion and bad practices.
But that is a legacy argument.

There was an argument when computer languages were close to the hardware,
when to index an array you added the index (multiplied by the size of
the element) to the base of the array to find what you wanted.  This is
probably insignificant and not an issue today.

To conclude, other than a very large legacy argument, there is probably
no strong reason to base arrays at 0 rather than 1.  I would not want to
change.

Richard


-- 
Personal     [EMAIL PROTECTED]  http://www.waveney.org
Telecoms     [EMAIL PROTECTED]  http://www.WaveneyConsulting.com
Web services [EMAIL PROTECTED]  http://www.wavwebs.com
Independent Telecomms Specialist, ATM expert, Web Analyst & Services




Re: In defense of zero-indexed arrays.

2002-12-05 Thread Luke Palmer
> Date: Thu, 5 Dec 2002 02:45:39 -0800
> From: Michael G Schwern <[EMAIL PROTECTED]>
>
> I'm going to ask something that's probably going to launch off into a long,
> silly thread.  But I'm really curious what the results will be so I'll ask
> it anyway.  Think of it as an experiment.
>
> So here's your essay topic:
>
> Explain how having indexes (arrays, substr, etc...) in Perl 6 start at 0
> will benefit most users.  Do not invoke legacy. [1]

Through years of experience:  "Because it's cleaner that way."

from 1:    A                           Z
           ↓                           ↓
        +------+------+------+------+------+
  $x:   | "1"  | "2"  | "3"  | "4"  | "5"  |
        |      |      |      |      |      |
        +------+------+------+------+------+
        ↑      ↑                    ↑      ↑
from 0: a      b                    y      z
They're just different ways of thinking.  If you start from 1, you're
talking about the elements themselves; operations are [i,j]
(inclusive).  If you start from 0, you're talking about the positions
between elements; operations are [i,j) (inclusive, exclusive).

Say you have $x as above, and you wish to partition it into two
strings "12" and "345".  In the "1" paradigm:

$part  = 3;
$first = substr $x, 1, $part-1;
$last  = substr $x, $part, 5;

In the "0":

$part  = 2;
$first = substr $x, 0, $part;
$last  = substr $x, $part, 5;

In the former, you can call $part 2 if you want; it's equally ugly.
I'm having flashbacks to my QBASIC days, where anything that
manipulated arrays seemed to be flooded with +1 and -1 in that way.
They say C has off-by-one errors; they have not tried BASIC.

I know this wasn't a strong argument, but in summary, most algorithms
are more elegant when working with spaces between elements than with
the indices of the elements themselves.  And it only makes sense to
number them from zero then (otherwise you get length+1 as the end,
which doesn't make any sense). 
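
A small sketch of the "positions between elements" view, written in Python only because its slices happen to be zero-based and half-open. It illustrates the argument; it says nothing about how Perl 6 substr will behave.

```python
x = "12345"
part = 2                  # a position between elements, counted from 0

first = x[0:part]         # half-open range [0, part)
last = x[part:len(x)]     # half-open range [part, len)

# The same boundary ends one piece and starts the next,
# and the length of a slice is simply end - start.
middle = x[1:4]
```

No +1 or -1 appears anywhere: `first + last` rebuilds `x`, and `middle` has 4 - 1 = 3 characters.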

Luke




In defense of zero-indexed arrays.

2002-12-05 Thread Michael G Schwern
I'm going to ask something that's probably going to launch off into a long,
silly thread.  But I'm really curious what the results will be so I'll ask
it anyway.  Think of it as an experiment.

So here's your essay topic:

Explain how having indexes (arrays, substr, etc...) in Perl 6 start at 0
will benefit most users.  Do not invoke legacy. [1]


[1] ie. "because that's how most other languages do it" or "everyone is used
to it by now" are not valid arguments.  Ask any Pascal programmer. :)


-- 

Michael G. Schwern   <[EMAIL PROTECTED]>   http://www.pobox.com/~schwern/
Perl Quality Assurance  <[EMAIL PROTECTED]>   Kwalitee Is Job One
Follow me to certain death!
http://www.unamerican.com/



Re: purge: opposite of grep

2002-12-05 Thread Simon Cozens
[EMAIL PROTECTED] (Miko O'Sullivan) writes:
> FWIW, I came up with "purge" because my first inclination was to spell
> "grep" backwards: "perg".  :-)

For reference, Ruby uses .detect and .reject.

-- 
3rd Law of Computing:
Anything that can go wr
fortune: Segmentation violation -- Core dumped



Re: Usage of \[oxdb] (was Re: String Literals, take 2)

2002-12-05 Thread James Mastros
On 12/04/2002 3:21 PM, Larry Wall wrote:

> On Wed, Dec 04, 2002 at 11:38:35AM -0800, Michael Lazzaro wrote:
: We still need to verify whether we can have, in qq strings:
: 
:\033  - octal   (p5; deprecated but allowed in p6?)

> I think it's disallowed.

Thank the many gods ... or One True God, or Larry, or whatever your 
personal preference may be.  ("So have a merry Christmas, Happy Hanukah, 
Kwazy Kwanzaa, a tip-top Tet, and a solemn, dignified Ramadan.")

>    \0o33   - octal
>    \0x1b   - hex 
>    \0d123  - decimal
>    \0b1001 - binary
> \x and \o are then just shortcuts.

Can we please also have \0 as a shortcut for \0x0?

> \c[^H], for instance.  We can overload the \c notation to our heart's
> desire, as long as we don't conflict with its use for named characters:
>
> \c[GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI]

Very Cool.  (BTW, for those that don't follow Unicode, this means that 
everything matching /^[^A-Z ]$/ is fair game for us; Unicode limits 
character names to that to minimize chicken-and-egg problems.  We 
/probably/ shouldn't take anything in /^[A-Za-z ]$/, to allow people to 
say the much more readable "\c[Greek Capital Letter Omega with Pepperoni 
and Pineapple]".)

: There is also the question of what the bracketed format does.  "Wide" 
: chars, e.g. for Unicode, seem appropriate only in hex.  But it would 
: seem useful to allow a bracketed form for the others that prevents 
: ambiguities:
: 
:"\o164" ne "\o{16}4"
:"\d100" ne "\d{10}0"
: 
: Whether that means you can actually specify wide chars in \o, \d, and 
: \b or it's just a disambiguification of the Latin-1 case is open to 
: question.

> There ain't no such thing as a "wide" character.  \xff is exactly
> the same character as \x[ff].

Which means that the only way to get a string with a literal 0xFF byte 
in it is with qq:u1[\xFF]? (Larry, I don't know that this has been 
mentioned before: is that right?)  chr:u1(0xFF) might do it too, but 
we're getting ahead of ourselves.

Also, an annoying corner case: is "\0x1ff" eq "\0x[1f]f", or is it eq 
"\0x[1ff]"?  What about other bases?  Is "\0x1x" eq "\0x[1]", or is it 
eq "\0x[1x]" (IE illegal).  (Now that I put those three questions 
together, the only reasonable answer seems to be that the number ends in 
the last place it's valid to end if you don't use explicit brackets.)
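
To make the "ends in the last place it's valid to end" reading concrete, here is a hypothetical sketch in Python. The function name, the regex, and the error behaviour are all assumptions made for illustration, not anything Larry has specified.

```python
import re

# Either an explicitly bracketed run of hex digits, or the longest
# unbracketed run that can be read (the "maximal munch" reading).
HEX_ESCAPE = re.compile(r"0x(?:\[([0-9A-Fa-f]+)\]|([0-9A-Fa-f]+))")

def read_hex_escape(text):
    """Parse the text after the backslash; return (code point, remainder)."""
    m = HEX_ESCAPE.match(text)
    if m is None:
        raise ValueError(f"not a \\0x escape: {text!r}")
    digits = m.group(1) or m.group(2)
    return int(digits, 16), text[m.end():]
```

Under these assumptions "0x1ff" reads as the single number 0x1ff, "0x[1f]f" as 0x1f followed by a literal f, "0x1x" as 0x1 followed by a literal x, and "0x[1x]" is rejected.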

(BTW, in HTML and XML, numeric character escapes are decimal by default; 
you have to add an x for hex.  In Windows and several other OSes (I 
think; I like to play with Unicode but have little actual use for it), 
ALT-0nnn is spelt in decimal only.  Decimal Unicode ordinals are 
fundamentally flawed (since blocks are always on nice even hex numbers, 
but ugly decimal ones), but useful anyway.)

	-=- James Mastros



For's parallel iteration (was Re: seperate() and/or Array.cull)

2002-12-05 Thread Luke Palmer
> From: "Brent Dax" <[EMAIL PROTECTED]>
> Date: Thu, 5 Dec 2002 00:28:52 -0800
> 
> Michael G Schwern:
> # You can do it with a map without much trouble:
> # 
> # my @indexes = map { /condition/ ? $i++ : () } @stuff;
> 
> Unless I'm mistaken, that won't work, since $i only gets incremented on
> matches.  I think this:
> 
>   my @indexes = map { $i++; /condition/ ? $i : () } @stuff;
> 
> Will work fine, though.

> Or, in the spirit of use-a-foreach-like-a-for (and my favorite WTDI):
> 
>   my @indexes = grep { $stuff[$_] =~ /condition/ } 0..$#stuff;

Fantastic!  One that doesn't use a temporary variable.  We
all write too quickly to catch errors like the one above.  Except
yours seems to be clean.  Anyway---back to relevant topics.

> As you might guess, I'm a (not very vocal) proponent of adding a way to
> get at a foreach's (or map's or grep's) current index.  (Hmm, can this
> be done with XS?  Must research...)

In Perl 5 that would be nice.  In Perl 6 though, it is not necessary:

for zip(@array, 0..Inf) -> $v, $c {
...
}
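
The same pairing of each element with a running counter can be sketched in Python, which is used here purely as an analogy. Python's zip and itertools.count are real, but nothing below is meant to describe Perl 6 semantics.

```python
from itertools import count

array = ["a", "b", "c"]

# zip stops when the shorter input runs out, so the endless counter is safe.
pairs = [(v, c) for v, c in zip(array, count(0))]

# enumerate is the built-in shorthand; note it yields (index, value) instead.
assert list(enumerate(array)) == [(c, v) for v, c in pairs]
```

The infinite range only works because the pairing is lazy and stops with the finite list.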

That parallel iteration is just getting more and more useful (although
this is a particularly ancient case).  We've already been through A4,
but the idea of C<zip> has changed since then (I think).  I don't like
C<zip> as a name for such a thing.

for interleave(@array, 0..Inf) {...}
Too long.
for slice   (@array, 0..Inf) {...}
for collate (@array, 0..Inf) {...}
for parallel(@array, 0..Inf) {...}
for braid   (@array, 0..Inf) {...}
for thread  (@array, 0..Inf) {...}
for weave   (@array, 0..Inf) {...}

The sequential list generator would almost certainly be C<each>, if we
need one at all.

for @array {...}
for each @array: {...}
for each(@array, @barry) {...}
for @array, @barry {...}   # Is this legal?

That last one might iterate through the arrays themselves, not their
elements, which would be useful on the blue moon nights.  ??

Or is it still:

for @array ; 0..Inf -> $v ; $c { ... }

I hope not.

Not to delay the next Apocalypse, or anything <:(  

Luke



RE: seperate() and/or Array.cull

2002-12-05 Thread Brent Dax
Michael G Schwern:
# You can do it with a map without much trouble:
# 
# my @indexes = map { /condition/ ? $i++ : () } @stuff;

Unless I'm mistaken, that won't work, since $i only gets incremented on
matches.  I think this:

my @indexes = map { $i++; /condition/ ? $i : () } @stuff;

Will work fine, though.

Or, in the spirit of use-a-foreach-like-a-for (and my favorite WTDI):

my @indexes = grep { $stuff[$_] =~ /condition/ } 0..$#stuff;
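
As a cross-check on the counter problem, here is a short Python sketch (illustrative only, with made-up sample data) that contrasts filtering over positions with a counter that advances only on a match.

```python
import re

stuff = ["apple", "kiwi", "avocado", "plum"]
condition = re.compile(r"^a")

# Iterate over positions and keep those whose element matches,
# in the spirit of grep { $stuff[$_] =~ /condition/ } 0..$#stuff.
indexes = [i for i in range(len(stuff)) if condition.search(stuff[i])]

# A counter bumped only on a match merely numbers the matches: 0, 1, ...
i = 0
match_counts = []
for s in stuff:
    if condition.search(s):
        match_counts.append(i)
        i += 1
```

Here `indexes` holds the real positions (0 and 2), while `match_counts` holds 0 and 1, which is why the counter has to advance on every element and not only on matches.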

As you might guess, I'm a (not very vocal) proponent of adding a way to
get at a foreach's (or map's or grep's) current index.  (Hmm, can this
be done with XS?  Must research...)

# Its so rare that you work by array index in Perl.  Usually 
# things flow together element by element.  Sort of like how 

Funny, I tend to find myself working by-element fairly often, usually
when I have to remove elements in the middle of a loop--something Perl
doesn't like.  :^(  (Not that I don't understand *why* Perl doesn't
allow it--just that it can be an inconvenience.)

# you rarely handle strings character by character which can 
# severely confuse C programmers.

Well, that's a silly way of working anyway.  ;^)

--Brent Dax <[EMAIL PROTECTED]>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

"If you want to propagate an outrageously evil idea, your conclusion
must be brazenly clear, but your proof unintelligible."
--Ayn Rand, explaining how today's philosophies came to be