Split with negative limits, and other weirdnesses

2008-09-23 Thread Moritz Lenz
Today a patch to rakudo brought up the question of what split() should do
if the $limit argument is either zero or negative.

In Perl 5 a negative limit means unlimited, which we don't have to do
because we have the Whatever star. A limit of 0 is basically ignored.

Here are a few solutions I could think of:
 1) A limit of 0 returns the empty list (you want zero items, you get them)
 2) A limit of 0 fail()s
 3) non-positive $limit arguments are rejected by the signature (Int
where { $_ > 0 })
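
For illustration only, the three options can be modelled in a few lines of Python (chosen because it is easy to run, not because it matches Perl 6). The function name `split_limited` and the `policy` strings are hypothetical, and the treatment of negative limits under options 1 and 2 is an assumption, since the message leaves it open.

```python
from typing import List


def split_limited(text: str, sep: str, limit: int, policy: str = "empty") -> List[str]:
    """Split text on sep into at most `limit` pieces.

    policy picks how a non-positive limit is handled (hypothetical names):
      "empty"  -> option 1: a limit of 0 yields an empty list
      "fail"   -> option 2: a limit of 0 is a runtime failure
      "reject" -> option 3: any non-positive limit is rejected up front
    """
    if policy == "reject" and limit <= 0:
        # Stands in for a signature constraint such as Int where { $_ > 0 }.
        raise TypeError("limit must be a positive integer")
    if limit == 0:
        if policy == "fail":
            raise ValueError("a limit of 0 was requested")
        return []
    if limit < 0:
        # Assumption: options 1 and 2 say nothing about negative limits.
        raise ValueError("negative limit has no defined meaning here")
    # str.split takes the number of cuts, which is one less than the number of pieces.
    return text.split(sep, limit - 1)
```

With the default policy, `split_limited("a,b,c", ",", 2)` gives two pieces with the remainder kept in the last one, and a limit of 0 gives an empty list.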

Any thoughts?
Moritz



-- 
Moritz Lenz
http://moritz.faui2k3.org/ |  http://perl-6.de/


Another split() question - when is there a capture?

2008-09-23 Thread Moritz Lenz
split seems to be a surprisingly tricky beast ;-)

To quote S29:

: As with Perl 5's split, if there is a capture in the pattern it
: is returned in alternation with the split values. Unlike with
: Perl 5, multiple such captures are returned in a single Match object.

Unlike in Perl 5, it is not determined at (pattern) compile time if a
regex has captures.

In Perl 5 the regex m/(foo)?/ always has one capture ($1), in Perl 6 it
has either zero or one ($0 doesn't exist if 'foo' never matched).

So does split return the match objects
  * if the regex has the potential to create at least one capture
or
  * at least one match actually produces a capture
or
  * (weird) only those matches that produce a capture put a match object
into the result list.

I'd go for the first option.
In that case it might be useful to have some sort of introspection, i.e. a
method like $regex.has_capture (which might return either True
(unconditional captures) or False (no captures at all) or True|False
(for my example above)) ;-)
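
As a point of comparison, Python's `re` module behaves like the Perl 5 model described above: the number of groups is fixed when the pattern is compiled, and `re.split` returns the captures in alternation with the split values. The sketch below shows what the first option looks like in practice; `has_capture` is a hypothetical helper named after the method proposed above, not an existing API.

```python
import re


def has_capture(pattern):
    """Hypothetical helper: True if the compiled pattern declares any group."""
    return pattern.groups > 0


optional = re.compile(r",(x)?")  # one group that may or may not take part in a match
plain = re.compile(r",")         # no groups at all

# The group count is known at compile time, whether or not a match uses it.
print(has_capture(optional), has_capture(plain))

# Every match contributes a slot for the group; None marks a match in
# which the group did not take part.
print(re.split(optional, "a,b,xc"))
```

Here the second separator captures `x` while the first captures nothing, yet both leave an entry in the result, which is the "potential to create a capture" rule rather than the "actually produced a capture" rule.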

Cheers,
Moritz



Re: Split with negative limits, and other weirdnesses

2008-09-23 Thread TSa

HaloO,

Moritz Lenz wrote:

In Perl 5 a negative limit means unlimited, which we don't have to do
because we have the Whatever star.


I like the notion of negative numbers as the other end of infinity.
Where infinity here is the length of the split list which can be
infinite if split is called on a file handle. So a negative number
could be the number of splits to skip from the front of the list.
And limits of the form '*-5' would deliver the five last splits.
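
A rough Python model of these two readings, assuming the whole input is available as a string. The name `split_window` and its `skip` and `last` parameters are hypothetical stand-ins for a negative limit and for a limit of the form '*-5'.

```python
def split_window(text, sep, skip=0, last=None):
    """Hypothetical helper modelling the two readings suggested above.

    skip -- number of leading pieces to drop (the "negative limit" reading)
    last -- if given, keep only this many trailing pieces (the '*-5' reading)
    """
    pieces = text.split(sep)
    if last is not None:
        # Needs the complete list, so it cannot work on an endless input.
        return pieces[-last:] if last > 0 else []
    return pieces[skip:]
```

Skipping from the front works on a stream of any length, whereas taking the last five pieces only makes sense once the end of the input is known.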



A limit of 0 is basically ignored.

Here are a few solutions I could think of:
 1) A limit of 0 returns the empty list (you want zero items, you get them)


I think this is a nice degenerate case.


 2) A limit of 0 fail()s


This is a bit too drastic.


 3) non-positive $limit arguments are rejected by the signature (Int
where { $_ > 0 })


I think that documents and enforces the common case best. But I would
include zero and use a name like UInt that has other uses as well. Are
there pragmas that turn signature failures into undef return values?


Regards, TSa.
--

The unavoidable price of reliability is simplicity -- C.A.R. Hoare
Simplicity does not precede complexity, but follows it. -- A.J. Perlis
1 + 2 + 3 + 4 + ... = -1/12  -- Srinivasa Ramanujan


Re: Subroutine parameter with trait and default.

2008-09-23 Thread John M. Dlugosz



PS  Incidentally, it seems silly to have "is rw" but not "is ro".  I keep
writing "is ro".




The synopses say "readonly".  But now that it is possible, I nominate changing 
a hyphen.


I'm not opposed to having it be ro, but wonder why he didn't call it that in 
the first place, so there must be a reason.


It should be possible to alias it in your own scope easily.

--John


Why no is ro? (Re: Subroutine parameter with trait and default.)

2008-09-23 Thread Michael G Schwern
John M. Dlugosz wrote:
 I'm not opposed to having it be ro, but wonder why he didn't call it that
 in the first place, so there must be a reason.

Nobody's perfect?

My other thought is that since parameters are read-only by default it's not
thought you'd have to write it much so clarity wins out over brevity, the flip
side of Huffman encoding.  But that doesn't work out so good for normal
variable declarations.  The verbosity (which a hyphen would only make worse)
discourages const-ing, as they say in C.


 It should be possible to alias it in your own scope easily.

Every time someone replies to a Perl 6 language design nit with "but you can
change the grammar" *I* kill a kitten.

*meowmmmf*


-- 
31. Not allowed to let sock puppets take responsibility for any of my
actions.
-- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
   http://skippyslist.com/list/


Re: Why no is ro? (Re: Subroutine parameter with trait and default.)

2008-09-23 Thread David Green

On 2008-Sep-23, at 2:32 pm, Michael G Schwern wrote:
My other thought is that since parameters are read-only by default it's not
thought you'd have to write it much so clarity wins out over brevity, the flip
side of Huffman encoding.  But that doesn't work out so good for normal
variable declarations.


I'd call it straight Huffman encoding, because clarity is what we  
should be optimising for.  (You read code more than you write it...  
unless you never make any mistakes!)  Happily, brevity often aids  
clarity.  The rest of the time, it should be up to one's editor; any  
editor worth its salt ought to easily auto-complete ro into  
readonly.



-David



Re: Split with negative limits, and other weirdnesses

2008-09-23 Thread David Green

On 2008-Sep-23, at 8:38 am, TSa wrote:

Moritz Lenz wrote:
In Perl 5 a negative limit means unlimited, which we don't have to do
because we have the Whatever star.


I like the notion of negative numbers as the other end of infinity.


I think positive values and zero make sense.  But I don't want to give  
funny meanings to negatives; it would be better to replace the int  
limit with a range instead.  (Maybe it would be OK to accept a single  
Int as short for 1..$i, or *-$i as short for *-$i..*.)


Then again, we could get rid of the limit arg altogether, return  
everything, and take a slice of the result -- assuming it can be lazy  
enough to calculate only what ends up getting sliced out.
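
A minimal sketch of the "return everything lazily and slice" idea, using a Python generator in place of a lazy Perl 6 list. The name `lazy_split` is hypothetical; `itertools.islice` plays the role of the slice, and only the pieces actually requested are computed.

```python
from itertools import islice


def lazy_split(text, sep):
    """Yield the pieces of text separated by sep, one at a time."""
    if not sep:
        raise ValueError("empty separator")
    start = 0
    while True:
        idx = text.find(sep, start)
        if idx == -1:
            # No separator left: the remainder is the final piece.
            yield text[start:]
            return
        yield text[start:idx]
        start = idx + len(sep)


# Take only the first two pieces; the rest of the string is never scanned.
first_two = list(islice(lazy_split("a,b,c,d", ","), 2))

# A range of pieces, here the second and third.
middle = list(islice(lazy_split("a,b,c,d", ","), 1, 3))
```

One difference from a limit argument is worth noting: Perl 5's split with a limit keeps the unsplit remainder in the last piece, whereas a slice simply drops it.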




-David



Re: Why no is ro? (Re: Subroutine parameter with trait and default.)

2008-09-23 Thread Michael G Schwern
David Green wrote:
 On 2008-Sep-23, at 2:32 pm, Michael G Schwern wrote:
 My other thought is that since parameters are read-only by default
 it's not
 thought you'd have to write it much so clarity wins out over brevity,
 the flip
 side of Huffman encoding.  But that doesn't work out so good for normal
 variable declarations.
 
 I'd call it straight Huffman encoding, because clarity is what we should
 be optimising for. (You read code more than you write it... unless you
 never make any mistakes!)  Happily, brevity often aids clarity.  The
 rest of the time, it should be up to one's editor; any editor worth its
 salt ought to easily auto-complete ro into readonly.

Eeep!  The "your IDE should write your verbose code for you" argument!  For
that one, I brine and roast an adorable hamster.  That's just another way of
saying that your language is too verbose for a human to write it without
hanging themselves.  See also Java.

Anyhow, I see where you're going, and I understand the desire for no abbvs.
But man, ro is pretty damn easy to remember. [1]  This is even sillier when
you hold it up against all the magic symbols we're supposed to remember.
[EMAIL PROTECTED], :name<value>, |$arg, $arg!, $arg?, :$arg.

If we expect the user to remember what all that means, I think they can figure
out $thing is ro.  It would be incoherent to take a corner of the language
design and suddenly pretend otherwise.

The mark of a great interface is not that you know what everything is the
first time you encounter it, but when you remember what it is the second time.
 The first time what's important is the user knows where to find instructions
and how to play with the device.  It should have a strong analogy and mesh
clearly with the surrounding devices.  "ro" and "rw" have a strong analogy with
the common read-only and read-write terms and they mesh with each other.  Once
this is known to the user, the second time it will be obvious.

You're only a beginner once and, if everything is done right, for a short time.
 The rest of your career, you're experienced.  Instead of dumbing the language
down for beginners, the trick is to turn beginners into experienced
programmers as quickly and painlessly as possible.

Now I've totally digressed. </rant>


-- 
s7ank: i want to be one of those guys that types s/jjd//.^$ueu*///djsls/sm.
   and it's a perl script that turns dog crap into gold.


Re: Why no is ro? (Re: Subroutine parameter with trait and default.)

2008-09-23 Thread John M. Dlugosz

Michael G Schwern schwern-at-pobox.com |Perl 6| wrote:

It should be possible to alias it in your own scope easily.



Every time someone replies to a Perl 6 language design nit with "but you can
change the grammar" *I* kill a kitten.

*meowmmmf*


  


That would not be a change in the grammar.  Merely deciding for yourself 
which names should be short based on your own usage. 

Since readonly is a class name, the equivalent of a typedef would be 
used.  I think that should be


   my ::ro ::= readonly;

but I have some technical points that still need to be addressed/worked out.

--John


Re: Why no is ro? (Re: Subroutine parameter with trait and default.)

2008-09-23 Thread John M. Dlugosz

Michael G Schwern schwern-at-pobox.com |Perl 6| wrote:

John M. Dlugosz wrote:
  

I'm not opposed to having it be ro, but wonder why he didn't call it that
in the first place, so there must be a reason.



Nobody's perfect?

My other thought is that since parameters are read-only by default it's not
thought you'd have to write it much so clarity wins out over brevity, the flip
side of Huffman encoding.  But that doesn't work out so good for normal
variable declarations.  The verbosity (which a hyphen would only make worse)
discourages const-ing, as they say in C.

  


Perhaps he was thinking that 'constant' would be used there.  But I 
agree, it's not the same thing.  In C++ I often use const for things 
that are in 'auto' scope and initialized in the normal flow sequence.


Anyway, was 'ro' rejected for some good reason, $Larry, or was it simply 
considered better not to waste a short word on a rare use since that's the 
default (for parameters)?


I agree that knowing 'rw', and that being common, if I wanted the other 
one and didn't use it every day, I would =guess= that it should be 
called 'ro' to match.


--John