Re: Self-recognizing programs and regular expressions

2005-03-08 Thread Abigail
On Mon, Mar 07, 2005 at 11:05:15PM +, Ton Hospel wrote:
 Ah, the idea seems salvagable, but less elegant:
 
 this sequence is self matching:
 ^
 ^\^
 ^\^\\\^
 ^\^\\\^\\\\\\\^
 
 
 so an infinite sequence of ^ with 2**n-1 \ after the n-th ^


This is also self matching:

  \A\\A\\\\A\\\\\\\\A...

with 2**(n-1) \s before the n-th A.



Years ago, while I was still writing JAPHs, I was looking for a regex
matching itself - and nothing but itself (regexes that match themselves,
and also other strings are easy and IMO, not interesting), but I never
found one. The search wasn't entirely fruitless, it did lead to:

    my $qr =  qr/^.+?(;).+?\1|;Just another Perl Hacker;|;.+$/;
       $qr =~ s/$qr//g;
    print $qr, "\n";


But that's a far cry from what I wanted to find.



Abigail




Re: Load-Bearing Warnings

2005-02-02 Thread Abigail
On Tue, Sep 14, 2004 at 07:40:37AM +0300, Gaal Yahas wrote:
 
 Anyway, since most systems don't have it either, I almost always put -w
 on the #! line even if my script is bound to run on 5.8, which supports
 the warnings pragma, to exploit the behavior you encountered here. Looks
 like I wasn't the only one.


I'd prefer to put -- there instead.



Abigail




Re: merlyn smeared by Python Johnnies' description of Schwartzian transform

2003-07-16 Thread Abigail
On Wed, Jul 16, 2003 at 07:25:20PM +1000, Andrew Savige wrote:
 Perhaps I am posting this to wrong list, but that has become the norm
 lately. :-)
 
 I noticed on this page:
  http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52234
 titled a guaranteed-stable sort with the decorate-sort-undecorate
 idiom (aka Schwartzian transform)
 
 decorate-sort-undecorate is a general and common idiom that allows
 very flexible and speedy sorting of Python sequences. An auxiliary
 list is first built (the 'decorate' step) where each item is made
 up of all sort-keys (in descending order of significance) of the
 corresponding item of the input sequence (must include all of the
 information in the whole corresponding item, and/or an index to it
 so we can fetch it back [or reconstruct it] in the third step).
 This is then sorted by its builtin sort method without arguments.
 Finally, the desired sorted-list is extracted/reconstructed by
 undecorating the now-sorted auxiliary-list.
 [This idiom is also known as Schwartzian transform by analogy with
 a similar Perl idiom (which, however, implies using map and grep and
 performing the whole sequence inside one single statement)].
 
 
 So these Python Johnnies think the Schwartzian transform uses grep,
 eh? I wish they gave an example.
 
 I may regret this, but I was just wondering if it would be more
 accurate to describe this Python decorate-sort-undecorate (DSU)
 thingy as a Guttman-Rosler transform? I suggest this because they
 state that using the builtin sort method without arguments is
 essential for performance reasons. My understanding is that the GR
 transform always uses the bald sort without arguments while the
 Schwartzian transform always uses a sort block. Is that right?

Yes. A GRT usually has the form:

    @sort = map { ... }
            sort
            map { ... } @unsorted;

with pack/unpack often used in the map blocks, while an ST is usually
structured as:

    @sort = map  {$_ -> [0]}
            sort { ... }
            map  {[$_ => ...]} @unsorted;
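
The ST shape can be sketched with concrete blocks. This is only an
illustration: the sample data and the name st_sort_by_number are made up,
and the sort key is assumed to be the number after the first space.

```perl
use strict;
use warnings;

# Hypothetical helper: sort strings like 'pear 3' by their numeric part.
sub st_sort_by_number {
    return map  { $_->[0] }                       # undecorate: keep the original item
           sort { $a->[1] <=> $b->[1] }           # the sort block acts on the extracted key
           map  { [$_ => (split ' ', $_)[1]] }    # decorate: [item, key]
           @_;
}

print join(', ', st_sort_by_number('pear 3', 'apple 10', 'fig 2')), "\n";
```

The original strings travel through the sort untouched; only the second
element of each pair is compared.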


Abigail


Re: merlyn smeared by Python Johnnies' description of Schwartzian transform

2003-07-16 Thread Abigail
On Wed, Jul 16, 2003 at 01:37:27PM +0200, A. Pagaltzis wrote:
 * Abigail [EMAIL PROTECTED] [2003-07-16 11:33]:
  Yes.  GRT usually have the form:
  
  @sort = map { ... }
  sort
  map { ... } @unsorted;
  
  with often pack/unpack in the map blocks, while an ST is
  usually structured as:
  
  @sort = map  {$_ -> [0]}
          sort { ... }
          map  {[$_ => ...]} @unsorted;
 
 It has been argued that a GRT is just a special form of an ST.

Yeah, I heard that once before as well. Now if articles, posts, or other
texts explaining ST that were written before the introduction of GRT
had a decent number of examples of a block-less sort, I'd agree with you.

But I just go with what most people seem to think, that GRT and ST are
different techniques. And I think they are right. For me, the important
difference is that in ST the data to be sorted is left as is, other data
(on which the sort block acts) is extracted, and this other data is
discarded at the end, while in GRT, the data gets modified, the modified
data gets sorted, and then the data is reassembled again.
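
A GRT in that sense might look like the sketch below. The name
grt_sort_by_number and the sample data are invented for illustration, and
the sketch assumes the keys are non-negative integers below 2**32 so that
a packed big-endian prefix sorts correctly as a plain string.

```perl
use strict;
use warnings;

# Hypothetical helper: same ordering as a numeric sort, but with a bare sort.
sub grt_sort_by_number {
    return map  { substr($_, 4) }                        # reassemble: drop the 4-byte key
           sort                                          # plain string sort, no block
           map  { pack('N', (split ' ', $_)[1]) . $_ }   # modify: prefix a big-endian key
           @_;
}

print join(', ', grt_sort_by_number('pear 3', 'apple 10', 'fig 2')), "\n";
```

Here the data itself is modified before sorting and restored afterwards,
which is the distinction drawn above.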


Abigail


Re: my if?

2003-07-03 Thread Abigail
On Thu, Jul 03, 2003 at 05:51:00AM +0100, Pense, Joachim wrote:
 
 [   { my $staticvar; sub mysub {...} }   versus   sub mysub {static
 $staticvar;}   ]
 
 Abigail and others point that the first version is more flexible than the
 second one, which is true. Reason: the first construct allows subs to share
 the statics. On the other hand, there is a psychological advantage to the
 second version, which would let me prefer it where possible if there was a
 static construct:
 
 I think it is counter-intuitive for many programmers anyway: I am trained to
 consider everything within curly braces as invisible from the outside. So it
 becomes less clear in the first version that mysub is in fact global to the
 program.

I think your training is at fault. Consider that files are a lexical
scope as well, as if there were braces. Were subs not visible outside
of a scope, we wouldn't have modules and objects as we have them now.

 Is there a sensible way of combining the advantages, that is: defining subs
 outside of blocks that share static variables anyway? Whatever I can think
 of would be even worse than version one (more complicated, less intuitive,
 more error-prone), so has anyone else thought of a concept? 


You could always use a package variable - but those would be visible
to other subs as well. But since you don't want to group subs in a
scope anyway...


Abigail


Re: my if?

2003-07-02 Thread Abigail
On Tue, Jul 01, 2003 at 02:37:16PM -0400, Bernie Cosell wrote:
 
 Give you anything isn't really the point -- Perl is filled with 
 multiple ways to do things and the simple argument that you can do 
 something similar using some other mechanism is rarely determinative.

I can't think of a single Perl construct that does only one thing, 
and that can be done with about the same number of keystrokes
by a more general construct.

 Virtually EVERY programmer knows what a simple static variable is -- and 

I doubt that. A lot of programming languages don't know the concept
of static variables. And even in languages that do, it isn't used that
often.  Not every Perl programmer nowadays came by the way of C. Not by
a long shot.



Abigail


Re: my if?

2003-07-02 Thread Abigail
On Wed, Jul 02, 2003 at 07:58:03AM +0100, Pense, Joachim wrote:
 
 Even if you are right and not virtually every programmer knows of the
 concept, in my view it is a concept that anyone who started using it
 probably will not like to miss in the future (well, at least it used to be
 my favorite feature of C classic). It enables you to declare variables at
 the latest possible point (the first usage) which is a programming style
 that is often recommended in the Perl community. I think it is easy to see
 which version looks elegant and which one kludgy:
 
 (Quoted from earlier in the thread, reformatted:)
 
 |   sub x {
 |   static $vbl ;
 |   ...
 |
 
 | {
 | my $vbl;
 | sub x {
 | ...
 | }
 | }


IMO, no doubt the latter looks far more elegant - as that enables your
'static' variable to be shared with more than one function. Something
you can't do with a 'static' declared variable inside a function.
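
A minimal sketch of that sharing, with made-up names (next_id, reset_id
and $counter are not from the thread):

```perl
use strict;
use warnings;

{
    my $counter = 0;    # visible only inside this block, persists between calls

    sub next_id  { return ++$counter }       # both subs close over
    sub reset_id { $counter = 0; return }    # the same $counter
}

print next_id(), "\n";    # first id handed out
print next_id(), "\n";    # second id
reset_id();
print next_id(), "\n";    # counting starts over
```

Both subs are callable from anywhere in the package, while $counter
cannot be reached from outside the block.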



Abigail


Re: my if?

2003-07-02 Thread Abigail
On Wed, Jul 02, 2003 at 10:33:01AM +0100, Pense, Joachim wrote:
 
 Abigail wrote
  
  
  On Wed, Jul 02, 2003 at 07:58:03AM +0100, Pense, Joachim wrote:
 
   that is often recommended in the Perl community. I think it 
  is easy to see
   which version looks elegant and which one kludgy:
   
   (Quoted from earlier in the thread, reformatted:)
   
   |   sub x {
   |   static $vbl ;
   |   ...
   |
   
   | {
   | my $vbl;
   | sub x {
   | ...
   | }
   | }
  
  
  IMO, not doubt the latter looks far more elegant - as that 
  enables your
  'static' variable to be shared with more than one function. Something
  you can't do with a 'static' declared variable inside a function.
  
 
 Mighty != elegant.
 

And neither is not being flexible.


Abigail


Re: my if?

2003-07-02 Thread Abigail
On Wed, Jul 02, 2003 at 11:09:54AM +0100, Pense, Joachim wrote:
 Abigail wrote
 
 
 |   sub x {
 |   static $vbl ;
 |   ...
 |
 
 | {
 | my $vbl;
 | sub x {
 | ...
 | }
 | }


IMO, not doubt the latter looks far more elegant - as that 
enables your
'static' variable to be shared with more than one 
  function. Something
you can't do with a 'static' declared variable inside a function.

   
   Mighty != elegant.
   
  
  And neither is not being flexible.
  
 
 Compare it with conditionals.
 
 You can write
 
 if ($some_condition) {
do_this;
do_that;
do_something_else;
 }
 
 and you can write
 
 do_this if $some_condition;
 
 You need not write
 
 if ($some_condition) {do_this}
 
 The first version is more flexible, the second more elegant in its
 restricted scope. I think it is Perlish to have both available.


But they aren't equivalent.

do_this if $condition;

is more efficient than

if ($condition) {
    do_this;
}


The latter requires Perl to enter and leave a block, while the
former doesn't.



Abigail


Re: my if?

2003-07-02 Thread Abigail
On Wed, Jul 02, 2003 at 01:02:12PM +0100, Pense, Joachim wrote:
 
 
  -Original Message-
  From: Abigail [mailto:[EMAIL PROTECTED]
  Sent: Wednesday, July 02, 2003 1:51 PM
  To: Pense, Joachim
  Cc: [EMAIL PROTECTED]
  Subject: Re: my if?
  
  
  On Wed, Jul 02, 2003 at 11:09:54AM +0100, Pense, Joachim wrote:
   Abigail wrote
   
   
   |   sub x {
   |   static $vbl ;
   |   ...
   |
   
   | {
   | my $vbl;
   | sub x {
   | ...
   | }
   | }
  
  
  IMO, not doubt the latter looks far more elegant - as that 
  enables your
  'static' variable to be shared with more than one 
function. Something
  you can't do with a 'static' declared variable inside 
  a function.
  
 
 Mighty != elegant.
 

And neither is not being flexible.

   
   Compare it with conditionals.
   
   You can write
   
   if ($some_condition) {
  do_this;
  do_that;
  do_something_else;
   }
   
   and you can write
   
   do_this if $some_condition;
   
   You need not write
   
   if ($some_condition) {do_this}
   
   The first version is more flexible, the second more elegant in its
   restricted scope. I think it is Perlish to have both available.
  
  
  But they aren't equivalent.
  
  do_this if $condition;
  
  is more efficient than
  
  if ($condition) {
  do_this;
  }
  
  
  The latter requires Perl to enter and leave a block, while the
  former doesn't.
  
 
 Does anything similar hold for the sub/static example? I assume this would
 be an argument pro static.


Not really, as that will be a file scoped block and entered/left at
most once.


Abigail


Re: Hidden state? Invisible closures?

2003-05-28 Thread Abigail
On Tue, May 27, 2003 at 06:19:05PM +0200, Matthias Bauer wrote:
 Hi everybody,
 why is it that the following piece of code prints
 three _different_ strings (in Perl 5.8)?
 
 --8
 sub bla() {
   for my $i ( reverse 0 .. 5 ) {
 $i = $i + 1;
 print $i;
   }
   print "\n";
 }
 
 bla();
 bla();
 bla();
 --8
 
 It seems as if the array created to hold the
 returned AV of ''reverse 0..5`` is somehow
 re--used in later invocations of blah(). 
 Bug or feature or programming error?

Known bug. You should get a modification of read-only value error,
like you get if you do 'reverse 0, 1, 2, 3, 4, 5'.

Note the differences depending on whether you use .. or a literal list,
and on whether or not you use reverse:


reverse 0 .. 5            654321
                          765432
                          876543

0 .. 5                    123456
                          123456
                          123456

reverse 0, 1, 2, 3, 4, 5  Modification of a read-only value attempted

0, 1, 2, 3, 4, 5          Modification of a read-only value attempted
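
One way to stay clear of all four cases is to leave the aliased loop
variable alone and work on a copy. The sketch below is a rewrite of the
bla() from the question under that assumption; returning the string
instead of printing inside the loop is my change, made so the result is
easy to check.

```perl
use strict;
use warnings;

sub bla {
    my $out = '';
    for my $i (reverse 0 .. 5) {
        my $n = $i + 1;    # modify a copy, not the loop alias
        $out .= $n;
    }
    return $out;
}

print bla(), "\n" for 1 .. 3;
```

Because $i is never assigned to, every call should produce the same
string regardless of how the list was built.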



Abigail


Re: Fun with RegExps

2003-02-18 Thread Abigail
On Mon, Feb 17, 2003 at 07:40:46PM -, Matt Groves wrote:
 
 Hello,
 
 I'm looking for the shortest, cleverest possible solution to this.  When
 changing a password, I need a RegExp which will ensure the following
 criteria :
 
 1. It must be at least 6 characters
 
 2. It must contain at least one lower case letter [a-z]
 
 3. It must contain at least one upper case letter [A-Z]
 
 4. It must contain at least one number [0-9]
 
 5. Optionally, it can cover for accepted non-alphanumeric chars such as
 _, - etc (but not #), and a maximum password length of 14
 characters

I'm not going to claim this is the shortest solution, but this
is very straightforward (and untested):

/^(?=.{6})      # At least 6 characters long.
  (?=.*[a-z])   # Contains a lowercase letter.
  (?=.*[A-Z])   # Contains an uppercase letter.
  (?=.*[0-9])   # Contains a digit.
  (?=.*[-_])    # Contains a dash or an underscore.
  (?!.{15})     # Doesn't contain 15 characters.
/xs;

It's easy to add more requirements.
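
To try the lookaheads against a few candidates, something like the
following sketch could be used. The variable $password_re, the wrapper
password_ok and the sample passwords are mine, not part of the question;
note that, as written, the dash-or-underscore lookahead makes such a
character mandatory.

```perl
use strict;
use warnings;

# The same lookaheads, stored in a variable so they can be reused.
my $password_re = qr/^(?=.{6})      # At least 6 characters long.
                      (?=.*[a-z])   # Contains a lowercase letter.
                      (?=.*[A-Z])   # Contains an uppercase letter.
                      (?=.*[0-9])   # Contains a digit.
                      (?=.*[-_])    # Contains a dash or an underscore.
                      (?!.{15})     # Is not 15 or more characters long.
                   /xs;

# Hypothetical wrapper, for illustration only.
sub password_ok { return $_[0] =~ $password_re ? 1 : 0 }

for my $candidate ('Abc-12', 'abc-12', 'Ab-1', 'Abcdef1234567-', 'Abcdef12345678-') {
    printf "%-16s %s\n", $candidate, password_ok($candidate) ? 'ok' : 'rejected';
}
```

Each extra requirement is one more lookahead line; none of them consumes
any characters, so their order does not matter.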


Abigail



Re: Zip/Postal codes.

2003-01-02 Thread Abigail
On Thu, Jan 02, 2003 at 04:43:01PM +, Adam Rice wrote:
 David Sheldon wrote:
  
m/(^|\W)(([A-Za-z][0-9]|[A-Za-z][0-9]{2}|[A-Za-z][A-HJ-Ya-hj-y][0-9]|[A-Za-z][A-HJ-Ya-hj-y][0-9]{2}|[A-Za-z][0-9][A-Za-z]|[A-Za-z][A-HJ-Ya-hj-y][0-9][A-Za-z])\s+[0-9][ABD-HJLP-UW-Zabd-hjlp-uw-z]{2}|[Gg][iI][Rr]\W+0[aA]{2})(\W|$)/
 
 I think it's worth mentioning that, right or wrong, postcodes are often 
 written without the space in the middle. Abigail didn't mention whether 
 he wanted to just match the canonical form, or match any common form.


I'll probably make it so that people can do things like:

use Regexp::Common;
/$RE{zip}{British}/;                  # Uses ' ' as separator.
/$RE{zip}{British}{-sep => '\s*'}/;   # Uses \s* as separator.
/(?i)$RE{zip}{British}/;              # Case insensitive match.


Abigail



Re: sort numbers

2003-01-02 Thread Abigail
On Wed, Jan 01, 2003 at 01:02:58PM -0800, artist google wrote:
 Hi,
  I have this puzzle.
  Given N numbers, N > 4, you have to sort the numbers.
  The only operation permitted is you can rotate any
 sequential 4 numbers in reverse order, or you can
 rotate the entire list sequentially.
 
 How do u approach this??


I don't think this is possible. Define the number of 'inversions' of
a sequence of numbers as the number of pairs for which the larger precedes
the smaller (this is the number of pairs which have to swap to get to
the final, sorted, configuration). Define the 'parity' of the sequence
as the even- or oddness of the number of inversions.

Consider the sequence: [A B C D E], for some numbers A, B, C, D and E.
There are just three operations possible, giving the sequences
[E D C B A], [D C B A E] and [A E D C B]. Note that none of the operations
changes the parity of the sequence: reversing five elements flips the
order of 10 pairs, reversing four elements flips 6 pairs, and both
numbers are even. Note that [2 1 3 4 5] has
an odd parity, while the parity of [1 2 3 4 5] is even. But if no operation
can change the parity, there's no way of going from [2 1 3 4 5] to
[1 2 3 4 5].
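
The parity claim is easy to check by brute force. The sketch below is
only a check of the argument for five elements; the sub names and the
%operation table are made up for the occasion.

```perl
use strict;
use warnings;

# Count the pairs in which the larger number precedes the smaller one.
sub inversions {
    my @s = @_;
    my $count = 0;
    for my $i (0 .. $#s - 1) {
        for my $j ($i + 1 .. $#s) {
            $count++ if $s[$i] > $s[$j];
        }
    }
    return $count;
}

sub parity { return inversions(@_) % 2 }

# The three operations considered above, for a five-element sequence.
my %operation = (
    'reverse all'        => sub { return reverse @_ },
    'reverse first four' => sub { return (reverse(@_[0 .. 3]), $_[4]) },
    'reverse last four'  => sub { return ($_[0], reverse(@_[1 .. 4])) },
);

my @start = (2, 1, 3, 4, 5);
printf "%-18s [%s] parity %d\n", 'start', "@start", parity(@start);
for my $name (sort keys %operation) {
    my @after = $operation{$name}->(@start);
    printf "%-18s [%s] parity %d\n", $name, "@after", parity(@after);
}
```

Every line of output should report the same parity as the starting
sequence, which differs from the parity of [1 2 3 4 5].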



Abigail



Re: fun with regex

2002-12-12 Thread Abigail
On Thu, Dec 12, 2002 at 12:05:59AM +, Jonathan E. Paton wrote:
 
 It's late, me not thinking much.  Ignore me if I missed the point
 completely.


You did indeed. As I indicated in my post, I don't see the need to
strip out indents. And furthermore, the question was about regexes,
and whether you could do it in a single regex.

I answered the question - your remarks don't have much to do with it.



Abigail



Re: fun with regex

2002-12-12 Thread Abigail
On Thu, Dec 12, 2002 at 12:48:11AM -0500, Jeff 'japhy' Pinyan wrote:
 On Dec 12, Jonathan E. Paton said:
 
 (my $sql = <<'--') =~ s/\A(\s+)(?{$::c = $^N})|^(??{$::c})//gm;
 
 all over SQL related source code then your head is on the block!
 
 While it's cool, and I'D use it, for readability purposes, might I
 suggest:
 
   $sql =~ /(\s*)/ and $sql =~ s/^$1//mg;


The question was how to do it in one regex; the original poster
already knew how to do it with two regexes.



Abigail



Re: fun with regex

2002-12-12 Thread Abigail
On Thu, Dec 12, 2002 at 07:36:42AM +0100, A. Pagaltzis wrote:
 * Jeff 'japhy' Pinyan [EMAIL PROTECTED] [2002-12-12 07:32]:
  While it's cool, and I'D use it, for readability purposes,
  might I suggest:
  
$sql =~ /(\s*)/ and $sql =~ s/^$1//mg;
 
 How about /^(\s*)/ - better safe than sorry.
 
 On second thought, /^(\s*|)/ to avoid warnings.


What kind of warning? Do you know of a string where /^(\s*)/ doesn't
match?  Perhaps a string without a beginning? Or that has -1 space at
the beginning?



Abigail



Re: fun with regex

2002-12-11 Thread Abigail
On Wed, Dec 11, 2002 at 11:33:13AM -0500, Selector, Lev Y wrote:
 Hello,
 
 Here is a regex question
 
 I am using the following construct to indent embedded SQL:
 
    ($sql = <<EOF) =~ s/^\s+SQL: ?//gm;
   SQL: select row_id
   SQL:   from gs_employee_queue g_q
   SQL:  where g_q.MMDDHHMM = '$MMDDhhmm'
   SQL:and g_q.action = 'D'
   SQL:and g_q.status = 'Unprocessed'
 EOF
 
 The benefit is that the SQL is indented like the Perl code around it (the
 program is easier to read) - and at the same time unnecessary indents are
 removed before sending the SQL to the database.

I'll bite. What's the benefit of removing the indents before sending
it to the database? Surely the additional scanning done in the
database is done faster than the regex machine in Perl takes?

 The drawback is that every time when I need to extract the SQL (to run it
 manually in the DB) - I have to remove all the  ' SQL:' tags manually. 
 
 So naturally I want to live without the 'SQL:' label.
 But then if I use just \s+ in the regex - it will remove different number of
 spaces from different lines - thus ruining the SQL layout.

I'll bite again. So what? Your database engine is automated, isn't it?
Or do you have a human fetching items from a closet?

 Question:
   How to construct the regex so that it subtracts the same amount of white
 space from all the line. Namely, it should memorize the space from the first
 line only - and then subtract it from all the lines.
 
 I know how to do this with 2 regexes. But is there a way to do it in the
 same one substitute (similar to how it is done above)?


(my $sql = <<'--') =~ s/\A(\s+)(?{$::c = $^N})|^(??{$::c})//gm;
    select row_id
      from gs_employee_queue g_q
     where g_q.MMDDHHMM = '$MMDDhhmm'
       and g_q.action = 'D'
       and g_q.status = 'Unprocessed'
--

print $sql;

__END__
select row_id
  from gs_employee_queue g_q
 where g_q.MMDDHHMM = '$MMDDhhmm'
   and g_q.action = 'D'
   and g_q.status = 'Unprocessed'


Abigail



Re: Function parameter passing (was: Re: limit the list)

2002-11-20 Thread Abigail
On Wed, Nov 20, 2002 at 11:42:43AM +0100, Bart Lateur wrote:
 On Wed, 20 Nov 2002 04:10:02 -0600, Steven Lembark wrote:
 
 sub commify
 {
  my ( $max, $sep, $end ) = ( shift, shift, shift );
   ...
 }
 
 Wow! Hold it! Am I the only one who finds this absurd? More than one
 shift on the same array in one single expressing, sounds like bad style
 to me. Comments?


Why is that bad style? Many times when people say it's bad style,
it's just a case of beauty is in the eye of the beholder. 

However, sometimes a style is bad because it's error-prone,
confusing, similar to a common idiom but doing something else,
or inefficient. But I don't think any of those applies to this
particular example.

Bart, can you explain why this is bad style? Or is it just your
personal preference?



Abigail



Re: NPL Puzzler for 6 Oct

2002-10-14 Thread Abigail

On Fri, Oct 11, 2002 at 02:54:20PM -0400, Bernie Cosell wrote:
 The NPL puzzle for 6 oct was an interesting little Perl exercise [I'm not 
 sure how to solve it analytically --- I played with it some to little 
 avail --- but it was certainly subject to brute force, and it turned out 
 to be a cute little thing.
 
 Write out the digits from 1-9 in order. Then add some plus (+) signs
 and times (x) signs to the string to make it add up to 2,002. As
 usual in arithmetic, multiplication is done before addition, and you
 don't have to put a sign between every 2 digits. The answer is
 unique.  
 
 What's odd is that my little Perl program found *TWO* solutions, but one 
 is potentially ambiguous [in particular, given those rules, what should 
 the value of a*b*c be?-- it doesn't say whether things should be done 
 left-to-right or right-to-left, so perhaps that could be used to exclude 
 one of the two solutions.

I too get two solutions, and if one is ambiguous due to a * b * c, the
other is too, due to a + b + c. I'm not using base-3 counting, but I'm
using recursion:

#!/usr/bin/perl

use strict;
use warnings;

my $target = @ARGV ? shift : 2002;
my @digits = 1 .. 9;
my @op     = ("", " + ", " * ");

sub doit;

sub doit {
    my ($str, @digits) = @_;

    # No digits left: evaluate the expression and report a hit.
    unless (@digits) {
        my $result = eval $str;
        print "$str == $target\n" if $target == $result;
        return;
    }

    my $digit = shift @digits;

    # Try concatenation, addition and multiplication before the next digit.
    doit "$str$_$digit" => @digits foreach @op;
}

doit @digits;
__END__
1 * 23 + 45 * 6 * 7 + 89 == 2002
1 * 2 + 34 * 56 + 7 + 89 == 2002


Here's the complete set of solutions for this century:

1 * 23 + 45 * 6 * 7 + 89 == 2002
1 * 2 + 34 * 56 + 7 + 89 == 2002
12 + 34 * 56 + 78 + 9 == 2003
12 + 3 * 456 + 7 * 89 == 2003
1 + 23 + 45 * 6 * 7 + 89 == 2003
1 + 2 + 34 * 56 + 7 + 89 == 2003
12 + 34 * 56 + 7 + 89 == 2012
12 * 3 + 45 * 6 * 7 + 89 == 2015
123 + 45 * 6 * 7 + 8 + 9 == 2030
1234 + 5 + 6 + 789 == 2034
1 * 2 * 3 * 4 * 56 + 78 * 9 == 2046
1 + 2 * 3 * 4 * 56 + 78 * 9 == 2047
1234 + 5 * 6 + 789 == 2053
1 * 2 * 34 * 5 * 6 + 7 + 8 + 9 == 2064
1 + 2 * 34 * 5 * 6 + 7 + 8 + 9 == 2065
12 * 34 * 5 + 6 + 7 + 8 + 9 == 2070
1 * 2 + 3 * 456 + 78 * 9 == 2072
1 + 2 + 3 * 456 + 78 * 9 == 2073
1234 + 56 + 789 == 2079
12 + 3 * 456 + 78 * 9 == 2082
123 + 45 * 6 * 7 + 8 * 9 == 2085
1 * 2 + 345 * 6 + 7 + 8 + 9 == 2096
12 * 34 + 5 * 6 * 7 * 8 + 9 == 2097
12 * 3 * 45 + 6 * 78 + 9 == 2097
1 + 2 + 345 * 6 + 7 + 8 + 9 == 2097
12 * 34 * 5 + 6 * 7 + 8 + 9 == 2099


The 2034 and 2079 solutions use only additions and concatenation.



Abigail



Re: ~9M lines of data

2002-10-14 Thread Abigail

On Mon, Oct 14, 2002 at 12:25:03PM -0400, iudicium ferat wrote:
 I am somewhat beating my head against a brick wall here - so I think Hey!
 This sounds like a Fun With Perl project :)
 
 Here is the challenge -
 
 You are presented with a MySQL Schema dump that is less than 9 million rows;
 you should read the data row by row, finding each CREATE TABLE statement,
 and displaying the next ~50 lines INCLUDING this line - do this recursively
 until end of file is reached.


grep -A 50 'CREATE TABLE' file.sql
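
If it has to be Perl, the same idea can be sketched in a few lines. The
name lines_after_match is invented, and unlike grep -A this sketch does
not print "--" separators between groups.

```perl
use strict;
use warnings;

# Hypothetical helper: return each matching line plus up to $after lines
# that follow it.
sub lines_after_match {
    my ($pattern, $after, @lines) = @_;
    my $remaining = 0;
    my @out;
    for my $line (@lines) {
        if ($line =~ $pattern) {
            $remaining = $after;    # (re)start the countdown on every match
            push @out, $line;
        }
        elsif ($remaining > 0) {
            $remaining--;
            push @out, $line;
        }
    }
    return @out;
}

my @dump = ("-- header\n", "CREATE TABLE t (\n", "  id int\n", ");\n", "-- footer\n");
print lines_after_match(qr/CREATE TABLE/, 2, @dump);
```

For a 9 million line file one would read from a filehandle line by line
instead of passing a list, but the countdown logic stays the same.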


Abigail



Re: Maybe-useful subroutine (BETTER!)

2002-07-02 Thread Abigail

On Mon, Jul 01, 2002 at 05:43:52PM +, sara starre wrote:
 
1)  I don't think my code was obfuscated. It certainly wasn't
intended to.
2)  I will never ever be at a Perl conference where they charge
over $1000 entrance fee and hundreds more for tutorials.
I'd be too embarrassed to give a talk.
3)  I gave obfuscation talks (about Japhs) on YAPC::NA::2000 and
YAPC::NA::2001.
 
 
 Abigail

 1. Perhaps not- I like the style and I appreciate you passing it on to me.
 I'm always looking for more perlish idioms. Yours will take some study
 however. I would appreciate a little narrative from you on what:
 
 [@{$_ [1]} [$l .. $#{$_ [1]}, 0 .. $l - 1]]

 is doing? I don't understand some of the syntax such as @{ }[ ] and $#{}?

There isn't much going on here. '$#' followed by a name of an array gives
you the index of the last element. '@' followed by the name of an array,
followed by a list inside '[]' gives you a slice, a list of elements
from the array, indexed by the list found inside the '[]'.

Now, to deal with references, Perl has a rule: wherever you have a
variable (be it simple like $scalar, or complex like $array [index]),
you may replace the name of the variable by a block (delimited by '{}')
whose result is a reference to the appropriate type.

So, @{ }[ ] is just a slice of the array pointed to by the reference inside
the { }, and $#{ } is the index of the last element of the array pointed
to by the reference inside the { }.
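
A small sketch of both constructs side by side, with and without a
reference. The helper names last_index_of and slice_of are made up for
illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers that take an array reference.
sub last_index_of { my ($ref) = @_;       return $#{$ref} }       # $#{ }
sub slice_of      { my ($ref, @idx) = @_; return @{$ref}[@idx] }  # @{ }[ ]

my @letters = ('A' .. 'E');

print "last index by name: $#letters\n";
print "last index by ref:  ", last_index_of(\@letters), "\n";
print "slice by name:      @letters[1, 3]\n";
print "slice by ref:       ", join(' ', slice_of(\@letters, 1, 3)), "\n";

# The rotation idiom is just a slice whose index list wraps around.
print "rotated by 2:       ",
      join(' ', slice_of(\@letters, 2 .. $#letters, 0 .. 1)), "\n";
```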

 2. Yes if my company didn't have a training budget which I apply to pay for
 this conferece I wouldn't either. The costs seem very high, but then it IS
 California. BUt my impression is that O'Reilly sure isn't loosing any $$ at
 it!

Well, I wouldn't go there even if my company were paying for me
(or someone else, for that matter). The reason I won't go isn't that I
would have to pay lots of money, but that they charge everyone that much.
  
As for O'Reilly making money on it, I've heard that they actually lost
money the last time (and maybe even the previous time). That O'Reilly is
making (or trying to make) money out of it is their good right. They are
a commercial company after all, and that's what commercial companies do:
make money. They have employees they need to pay, and I guess they have
stockholders too.

I'm sometimes a bit disappointed that most people don't share my views and
happily pay such high fees (now, if just 1 out of 20 decided to pay the
same amount to the Perl Foundation, wouldn't that be great?), but everyone
is entitled to decide for themselves what to spend their money on.

 3. Rats- maybe next year I'll go to YAPC instead! Love to meet ya.


There might be 3 YAPC's to choose from next year, if Nat manages  
to do a YAPC::Winter.

 

Abigail



Re: Maybe-useful subroutine

2002-07-01 Thread Abigail

On Mon, Jul 01, 2002 at 12:23:46PM +, sara starre wrote:
 sub rotate { unshift @_, splice @_, shift @_; return @_ }
 
 Curious. Here is one of my routines that rotates a vector (an array) either
 foward or reverse, any number of elements. Seems like we had a similar
 approach I just encapuslated with a bunch of logic and control..
 
 
 
 # rotate elements around, arg is n and array
 sub rotate
 {return () unless $_[1];
 die 'APL: rotate called with improper args' unless $_[0] && $_[1];
 my $n=$_[0];
 my @l=@{$_[1]};
 my ($i, $lx) = (0);
 
 for ($i==0; $i<abs $n; $i++)
   {$lx=shift @l if $n>0;
    $lx=pop @l if $n<0;
    push @l,$lx if $n>0;
    unshift @l,$lx if $n<0;
   }

 return \@l;
 }

 
Eeew. That's so horribly inefficient and unPerllike.
 
Try this:

sub rotate {
    return () unless   $_ [1];
    return [] unless @{$_ [1]};
    $_ [0] %= @{$_ [1]};
    [@{$_ [1]} [$_ [0] .. $#{$_ [1]}, 0 .. $_ [0] - 1]]
}
  
  
Abigail



Re: Maybe-useful subroutine

2002-07-01 Thread Abigail

On Mon, Jul 01, 2002 at 02:57:36PM +, sara starre wrote:  
   
 I defintely like your syntax better, and yes I was trying to avoid the loop
 entirely. Unfortunately I can't get your solution to work:

   DB<1>
 main::r2(./x.pl:66):  $_ [0] %= @{$_ [1]};
   DB<1>
 Modification of a read-only value attempted at ./x.pl line 66.
 Debugged program terminated.  Use q to quit or R to restart,
   use O inhibit_exit to avoid stopping after program termination,
   h q, h R or h O to get additional info.
   DB<1>
 
 Seems it doesn't like you trying to modify $_ [0] ? Also will yours work
 with negative rotation like rotate(-3, \@a) ?
 
 also
 
   return [] unless @{$_ [1]};
 
 give a runtime error if $_[1] isn't an array ref- seems like you'd want to
 find another way to trap that error as you're trying to AVOID errors with
 that statement?

 Nice work I'd like to see this function. I hate looping but I couldn't see
 how to avoid it in this case with the negative possiblities plus the fact
 that the rotation parameter can exceed the array length- ie:

 my @a=qw(A B C);
 rotate(-14, \@a);


The $_ [0] %= @{$_ [1]}; was assuming the first argument was an lvalue.
And yes, the rotation value can be negative, or exceed the array size,
that's the whole point of the %!
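
A quick sketch of what % does here. With a positive right operand (and
without 'use integer'), Perl's % never returns a negative number, so any
rotation count maps to a valid index. The name rotation_offset is made up
for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce any rotation count to an index in 0 .. $size - 1.
sub rotation_offset {
    my ($count, $size) = @_;
    return $count % $size;
}

printf "%4d %% %d = %d\n", $_, 3, rotation_offset($_, 3) for -14, -3, 0, 2, 14;
```

So rotating a three-element array by -14 is the same as rotating it by 1.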
 
You might want to try:

#!/usr/bin/perl -w

use strict;
use warnings 'all';


sub rotate {
    return () unless   $_ [1];
    die "Not an array ref" unless "ARRAY" eq ref $_ [1];
    return [] unless @{$_ [1]};
    my $l = $_ [0] % @{$_ [1]};
    [@{$_ [1]} [$l .. $#{$_ [1]}, 0 .. $l]]
}
  
my $array = ['A' .. 'E'];

for my $r (-15 .. 15) {
    printf "Rotate %3d: @{rotate $r, $array}\n", $r;
}
__END__
Rotate -15: A B C D E A
Rotate -14: B C D E A B
Rotate -13: C D E A B C
Rotate -12: D E A B C D
Rotate -11: E A B C D E
Rotate -10: A B C D E A
Rotate  -9: B C D E A B
Rotate  -8: C D E A B C
Rotate  -7: D E A B C D
Rotate  -6: E A B C D E
Rotate  -5: A B C D E A
Rotate  -4: B C D E A B
Rotate  -3: C D E A B C
Rotate  -2: D E A B C D
Rotate  -1: E A B C D E
Rotate   0: A B C D E A
Rotate   1: B C D E A B
Rotate   2: C D E A B C
Rotate   3: D E A B C D
Rotate   4: E A B C D E
Rotate   5: A B C D E A
Rotate   6: B C D E A B
Rotate   7: C D E A B C
Rotate   8: D E A B C D
Rotate   9: E A B C D E
Rotate  10: A B C D E A
Rotate  11: B C D E A B  
Rotate  12: C D E A B C
Rotate  13: D E A B C D
Rotate  14: E A B C D E
Rotate  15: A B C D E A


Abigail



Re: Maybe-useful subroutine (BETTER!)

2002-07-01 Thread Abigail

Unfortunately, I left out a '-1'. Here's the right version:

#!/usr/bin/perl -w

use strict;
use warnings 'all';


sub rotate {
    return () unless   $_ [1];
    die "Not an array ref" unless "ARRAY" eq ref $_ [1];
    return [] unless @{$_ [1]};
    my $l = $_ [0] % @{$_ [1]};
    [@{$_ [1]} [$l .. $#{$_ [1]}, 0 .. $l - 1]]
}
  
my $array = ['A' .. 'E'];

for my $r (-15 .. 15) {
    printf "Rotate %3d: @{rotate $r, $array}\n", $r;
}
__END__
Rotate -15: A B C D E
Rotate -14: B C D E A
Rotate -13: C D E A B
Rotate -12: D E A B C
Rotate -11: E A B C D
Rotate -10: A B C D E
Rotate  -9: B C D E A
Rotate  -8: C D E A B
Rotate  -7: D E A B C
Rotate  -6: E A B C D
Rotate  -5: A B C D E 
Rotate  -4: B C D E A
Rotate  -3: C D E A B
Rotate  -2: D E A B C
Rotate  -1: E A B C D
Rotate   0: A B C D E
Rotate   1: B C D E A
Rotate   2: C D E A B
Rotate   3: D E A B C
Rotate   4: E A B C D
Rotate   5: A B C D E
Rotate   6: B C D E A
Rotate   7: C D E A B
Rotate   8: D E A B C
Rotate   9: E A B C D
Rotate  10: A B C D E
Rotate  11: B C D E A
Rotate  12: C D E A B
Rotate  13: D E A B C  
Rotate  14: E A B C D
Rotate  15: A B C D E


Abigail



Re: Whitespace and Blocks (was RE: Fisher-Yates shuffle)

2002-04-19 Thread abigail

On Fri, Apr 19, 2002 at 01:35:10PM -, Lars Henrik Mathiesen wrote:
  Date: Thu, 18 Apr 2002 23:44:26 +0200
  From: Paul Johnson [EMAIL PROTECTED]
  
  On Thu, Apr 18, 2002 at 07:56:06PM -, Lars Henrik Mathiesen wrote:
   I would like to write
   
     for my $key ( $hash{foo}->keys ) { print $hash{bar}->{$key} }
  
  This, or something similar was hashed out on p5p, h, about 4 years
  ago, give or take four years ;-)  Actually, I think it might have been
  when Chip was Pumpking, which would have made it about five years ago.
 
 I'm not surprised that other people suggested it... and I'm sure that
 there are good arguments against as well.
 
 Hmmm... I just realized that it's possible to get almost the same
 thing by misusing bless:
 
 package hash;
 sub bless { bless shift }
 sub keys { keys %{+shift} }
 sub values { values %{+shift} }
 sub hash { %{+shift} }
 
 package main;
 
 my $href = hash::bless { foo => 1, bar => 2 };
 
 print join ', ', $href->keys;
 print join ', ', $href->values;
 print join ', ', $href->hash;
 
 Tempting --- now if only I could create a method named % or @ ... I'm
 not sure how many optimizations this loses relative to the %{ $href }
 construction, though.


Creating methods with names '%', '@' or whatever isn't the problem.
The problem is calling them.

#!/usr/bin/perl -w

use strict;
package hash;
sub bless { bless shift }

{
    no strict 'refs';
    *{'hash::@'} = sub {keys %{+shift}};
    *{'hash::*'} = sub {values %{+shift}};
    *{'hash::%'} = sub {%{+shift}}
}

package main;

my $href = hash::bless { foo => 1, bar => 2 };

$\ = "\n";
$, = ", ";

print $href -> $_ foreach qw /@ * %/;
__END__


Running this gives:

foo, bar
1, 2
foo, 1, bar, 2


But '$href - %' and also '$href - %' are syntax errors. You
can put a variable on the right side of an arrow, but not a
literal string.
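The point about calling such methods can be tried out with a small sketch. The constructor `new` and the sorted return order are assumptions added here for a predictable demonstration; only the glob assignments and the `$href -> $name` call follow the post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package hash;

# Hypothetical constructor, not part of the original post.
sub new { my ($class, %data) = @_; return bless {%data}, $class }

{
    no strict 'refs';
    # Install methods whose names are not valid identifiers.
    *{'hash::@'} = sub { sort keys %{+shift} };
    *{'hash::%'} = sub {
        my $self = shift;
        map { ($_ => $self->{$_}) } sort keys %$self;
    };
}

package main;

my $href = hash->new(foo => 1, bar => 2);

# A literal '@' cannot follow the arrow, but a scalar holding the name can.
my $name  = '@';
my @keys  = $href->$name;
$name     = '%';
my @pairs = $href->$name;

print "@keys\n";
print "@pairs\n";
```

Sorting the keys only makes the output order predictable; the original subs return keys in hash order.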



Abigail



Re: A better way ?

2002-04-18 Thread abigail

On Wed, Apr 17, 2002 at 02:02:02PM -0400, Bill -Sx- Jones wrote:
 I have the habit of doing:
 
  last if (substr($vFlag, 1, 3) eq 'END');
 $vSub = \&Sneex   if (substr($vFlag, 1, 5) eq 'SNEEX');
 $vSub = \&Admin   if (substr($vFlag, 1, 5) eq 'ADMIN');
 $vSub = \&Reports if (substr($vFlag, 1, 7) eq 'REPORTS');
 $vSub = \&Logs    if (substr($vFlag, 1, 4) eq 'LOGS');
 $vSub = \&Targets if (substr($vFlag, 1, 7) eq 'TARGETS');
 $vSub = \&Usenet  if (substr($vFlag, 1, 6) eq 'USENET');
 
 (substr($_, 0, 1) eq '[') ? next : &$vSub;


my @tags = qw /SNEEX ADMIN REPORTS LOGS TARGETS USENET/;
last if 'END' eq substr $vFlag, 1, 3;
next if '['   eq substr $_, 0, 1;
{
local $" = "|";
no strict 'refs';
   (ucfirst lc $1) -> () if $vFlag =~ /^.(@tags)/;
}
}
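For readers who would rather not switch off strict refs, the same regex trick can drive a hash of code references. This is only a sketch of a variation, not the code from the post: `%handler`, `dispatch` and the stand-in handlers are hypothetical names.

```perl
use strict;
use warnings;

my @tags = qw/SNEEX ADMIN REPORTS LOGS TARGETS USENET/;

# Hypothetical handlers standing in for Sneex(), Admin(), and so on.
my %handler = map {
    my $name = ucfirst lc $_;
    ($_ => sub { "called $name" });
} @tags;

sub dispatch {
    my ($flag) = @_;
    local $" = "|";                      # @tags is joined with | inside the regex
    return unless $flag =~ /^.(@tags)/;  # skip the first character, as in the post
    return $handler{$1}->();
}

print dispatch("[ADMIN]"), "\n";
```

The `local $"` has to be in effect when the pattern is interpolated, which is why it sits inside the sub.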



Abigail



Re: Whitespace and Blocks (was RE: Fisher-Yates shuffle)

2002-04-17 Thread abigail

On Wed, Apr 17, 2002 at 10:23:42AM -0700, David Wheeler wrote:
 On 4/17/02 9:11 AM, [EMAIL PROTECTED] [EMAIL PROTECTED] claimed:
 
  I find it amazing that someone can make a statement like 99% of the
  time, people leave whitespace of the aggregate and the index, just
  based on personal experience.
  
  Based on the code *I* have written in the past 20 years, more than
  99% of the time people do use whitespace between the aggregate and
  the index. ;-)
 
 Must be a European thing. ;-)
 
  Seeing $hash{foo}{bar}{baz} all over,justmakeswewanttoignoreallwhitescape.
  Butthatissohardtoread.
 
 Personally, I find C<$hash{foo}{bar}{baz}> a lot easier to read than
 justmakeswewanttoignoreallwhitescape. The braces break it up nicely.
 However, once I start to see code like that, I start to think it's time for
 a redesign. I don't much care for seeing a hash access go more than two
 layers deep.

I agree. C<$hash{foo}{bar}{baz}> is horribly ugly. Your solution
is a redesign. My solution is to use whitespace. The latter is
far easier than the former.

  Using ()'s doesn't mean the parser suddenly understands %hash {key}
  is an indexing operator, so that's not going to solve much. :-(
 
 That's true. But the trade-off is significant and, IME, totally worthwhile.
 Braces with whitespace in front of them are now always closures. This adds a
 great deal of power and flexibility to the design.

I think the gain is just the option of not having to write () in if.
If () was still mandatory, there would be no ambiguity when a block
is a hash index and when it cannot be - which means it has to be a
closure.

But if some people just
 are lamenting the loss of the whitespace in hash accesses because that's the
 standard that C set long ago, the, to quote Larry,

It's not just C. It's also any language that I've programmed in.
It's also a rather significant break from 14 years of perl.



Abigail



Re: Whitespace and Blocks (was RE: Fisher-Yates shuffle)

2002-04-16 Thread abigail

On Tue, Apr 16, 2002 at 09:03:14AM -0700, David Wheeler wrote:
 On 4/16/02 7:43 AM, Griffith, Shaun [EMAIL PROTECTED] claimed:
 
  So I take it readability is deprecated?
  
  For instance:
  
  $some_nested_hash{very_long_descriptive_key}{another_key}...{last_key}
  
  can no longer be broken into multiple lines, like this?
  
  $some_nested_hash{very_long_descriptive_key}
   {another_key}
   ...
   {last_key}
  
  Or does it depend on whether {some_key} looks like a statement block or
  not?
 
 IIUC, the no whitespace rule only applies before control structure blocks,
 such as if {} and while {}, because you need the space before the opening
 brace.

Eh, no. You got it backwards. It's not that you need a space before
a control block, it is that if you put a space before a {, it *IS*
a control block.

It's the space that determines what follows is a control block.

In other spaces, it's not necessary.

It's not that it isn't necessary. It's forbidden. Unlike in C, awk, 
Java, Pascal, or even in the language of the whitespace, Python. Any
language I can remember programming in in the last 2 decades allows
optional whitespace between the aggregate and the index, without
parsing things differently.

In other spaces, it's not necessary. So while you'll need to do this:
 
   %some_nested_hash{very_long_descriptive_key}
  {another_key}
  ...
  {last_key} = 'foo';

That's a syntax error. You've some stuff in void context, and then
you're assigning something to a block. 

 You'll have to do this for control blocks:

The problem is, because the ()'s are dropped, the parser cannot know
where the control block starts, hence the new rule any opening brace
following whitespace will be a block.

 if %some_nested_hash{very_long_descriptive_key}{another_key}...{last_key} {
 # do stuff
 }
 
 But if you really want to improve legibility, and you find you have a lot of
 this sort of thing in your code, I suggest you make a shortcut:
 
   my %scut = %some_nested_hash{very_long_descriptive_key}{another_key}{one};
 
   if %scut.{last_key} { # Perl 6 hashref syntax.
   # do stuff
   }


Having to *add* code isn't actually making a shortcut, is it? ;-)



Abigail



Re: Fisher-Yates shuffle

2002-04-15 Thread abigail

On Fri, Apr 12, 2002 at 02:14:36PM -0700, Erik Steven Harrison wrote:
  
 --
 
 On Fri, 12 Apr 2002 18:27:11  
  abigail wrote:
 On Fri, Apr 12, 2002 at 04:42:07PM +0100, Piers Cawley wrote:
  [EMAIL PROTECTED] writes:
  
   Why isn't
  
   if %foo {key} {print "Hello 1"}
  
   equivalent with the perl5 syntax:
  
   if (%foo) {key} {print "Hello 1"}
  
   Which keyword is it expecting?
  
  Keyword /els(e|if)/, or end of line, or semicolon. Sorry badly phrased
  on my part. The closing brace of {key} only ends the statement if it
 
 
 As i understand it (Tell me if I'm wrong) This
 
 %hash {key};
 
 will not work, because space between the hashname and the brace was no longer 
allowed. This allows for

Will not work is not correct. %hash {key} is parsed as an expression
followed by a block. Whether that will work depends on the context.
'if' for instance is followed by an expression and a block..., so

if %hash {key}

*will* work, except it does something totally different than one might
expect.

 
 if %hash{key} { ... }
 
 and also
 
 if $scalar { ... }.
 
 The only other white space rule is that white space after the closing brace of a 
closure, when that closure is the last argument of a user defined sub get's treated 
as a semicolon if there is nothing else on that line. This allows custom iterators to 
parse (or appear to parse) like builtins.
 
 myforeach @arry, %hash, $scalar {
...
 } 
 #No semicolon required!
 
 What problems does this seem to cause - I don't see anything wrong. I don't see how 
(except in the case of closure as last argument) how it matters one way or another 
what kind white space appears between tokens. 
 
 What am I missing?


What you are missing is the new white space rule:

\s+{ }

shall always be a block. Hence the difference between
%hash {foo} and %hash{foo}.


Abigail





Re: Fisher-Yates shuffle

2002-04-12 Thread abigail

On Thu, Apr 11, 2002 at 11:23:24AM -0700, William R Ward wrote:
 
 Haven't you heard?  In perl 6, for will replace foreach and the
 C-style for will be called loop.  Larry says that the C-style
 for loop is used far less often than the foreach style.  Sounds
 like it's being deprecated...

I've heard about this mythical perl6 thingy. Wasn't that the project
Larry was going to write apocalypses about first, one for every chapter
of the Camel? With 4 apocalypses in 1.5 years, 20 something chapters in
the Camel, I'm not too focused on perl6.

Let's deal with things that exist. Like perl5. Perl 5.8 certainly isn't
marking 'for' as deprecated, so it won't disappear any sooner than 5.12.
BTW, if Larry is going to rename 'for' to 'loop', then he plans to keep
it around. Doesn't sound deprecated at all to me.



Abigail



Re: Fisher-Yates shuffle

2002-04-12 Thread abigail
++) {
 print $line: array[$line]\n;
 }


You won't hear me advocating a C-style for loop to iterate
over an array.



Abigail



Re: Fisher-Yates shuffle

2002-04-12 Thread abigail

On Fri, Apr 12, 2002 at 03:09:27PM +0300, Ilmari Karonen wrote:
 On Tue, 9 Apr 2002 [EMAIL PROTECTED] wrote:
  On Tue, Apr 09, 2002 at 01:49:01PM +0300, Ilmari Karonen wrote:
   On Fri, 5 Apr 2002 [EMAIL PROTECTED] wrote:

sub shuffle {
for (my  $i = @_;  $i;) {
 my  $j = rand $i --;
 @_ [$i => $j] = @_ [$j => $i]
}
@_;
}
   
   It doesn't, unfortunately, allow for the idiom one expects from a
   copy-and-shuffle function, namely
   
 my @shuffled = shuffle @original;
  
  It allows for:
  
  my @shuffled = shuffle 1 .. 10;
 
 Not very consistently, though:
 
   my @shuffled = shuffle 'A',2,3,4,5,6,7,8,9,10,'J','Q','K';
   Modification of a read-only value attempted at - line 4.

Eew. That looks like a bug in perl somewhere.

shuffle 1 .. 10; # Fine.
shuffle 1, 2, 3, 4, 5, 6, 7, 8, 9, 10;   # Not ok.

  I do think it needs a reference to Knuth [1]. Or to the original publication
  of Fisher and Yates [2].
  
  [1] D. E. Knuth: I<The Art of Computer Programming>, Volume 2,
  Third edition. Section 3.4.2, Algorithm P, pp 145. Reading:
  Addison-Wesley, 1997. ISBN: 0-201-89684-2.
  
  [2] R. A. Fisher and F. Yates: I<Statistical Tables>. London, 1938.
  Example 12.
 
 So, how _does_ one properly mark that up in POD?  There are no real
 footnotes, and embedding the entire reference in the text would mess up
 the already strained flow of the paragraph.  Hmm...  :-(


From Algorithm::Numerical::Shuffle:


=head1 LITERATURE

The algorithm used is discussed by Knuth [3]. It was first published
by Fisher and Yates [2], and later by Durstenfeld [1].

=head1 CAVEAT

Salfi [4] points to a big caveat. If the outcome of a random generator
is solely based on the value of the previous outcome, like a linear
congruential method, then the outcome of a shuffle depends on exactly
three things: the shuffling algorithm, the input and the seed of the
random generator. Hence, for a given list and a given algorithm, the
outcome of the shuffle is purely based on the seed. Many modern computers
have 32 bit random numbers, hence a 32 bit seed. Hence, there are at
most 2^32 possible shuffles of a list, for each of the possible algorithms.
But for a list of n elements, there are n! possible permutations.
Which means that a shuffle of a list of 13 elements will not generate
certain permutations, as 13! > 2^32.

=head1 REFERENCES

=over

=item [1]

R. Durstenfeld: I<CACM>, B<7>, 1964. pp 420.

=item [2] 

R. A. Fisher and F. Yates: I<Statistical Tables>. London, 1938.
Example 12.

=item [3]

D. E. Knuth: I<The Art of Computer Programming>, Volume 2, Third edition.
Section 3.4.2, Algorithm P, pp 145. Reading: Addison-Wesley, 1997.
ISBN: 0-201-89684-2.

=item [4]

R. Salfi: I<COMPSTAT 1974>. Vienna: 1974, pp 28 - 35.

=back
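
The arithmetic behind the caveat quoted above is easy to check. A minimal sketch, assuming a generator whose whole state is a seed of the given number of bits; the sub name is made up for this illustration.

```perl
use strict;
use warnings;

# Smallest list length n for which n! exceeds the number of possible
# seeds (2**$bits), so that some permutations can never be produced.
sub first_incomplete_length {
    my ($bits) = @_;
    my $seeds = 2 ** $bits;
    my ($n, $factorial) = (1, 1);
    while ($factorial <= $seeds) {
        $n++;
        $factorial *= $n;
    }
    return $n;
}

printf "%d-bit seed: lists of %d or more elements\n",
    $_, first_incomplete_length($_) for 32, 64;
```

For 32 bits this gives 13, since 12! = 479001600 still fits below 2^32 = 4294967296 while 13! = 6227020800 does not.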



Abigail



Re: Fisher-Yates shuffle

2002-04-12 Thread abigail

On Fri, Apr 12, 2002 at 04:00:37PM +0100, Piers Cawley wrote:
 X-posting to perl6-language
 [EMAIL PROTECTED] writes:
  As for cleanness, this is my interpretation of how perl6 is going
  to work:
 
  %foo = ();
  if %foo {key} {print "Hello 1"}
  
  %foo = ();
  if %foo{key} {print "Hello 2"}
  
  %foo = ();
  if %foo{key}{print "Hello 3"}
 
  Case 1 will print "Hello 1"; this is a block after the if statement.
 
 No, it will be a syntax error. The first closing brace does not end
 the statement, probably something like Block seen when keyword
 expected. 

Now I am confused. In perl6, we can leave off the parentheses
around a condition, and I hope that it isn't required to have
an 'elsif' or 'else' block.

Why isn't

if %foo {key} {print "Hello 1"}

equivalent with the perl5 syntax:

if (%foo) {key} {print "Hello 1"}

Which keyword is it expecting?

  Case 2 will not print anything. The print is in the 'then' part
 of the if.
 
 Correct.
 
  Case 3 will be a syntax error - an if statement with a condition,
 but not block.
 
 It won't be a syntax error *yet*. If there's a block immediately
 following then that will be treated as the 'then' block. If it's the
 end of file, or a nonblock, then it'll be a syntax error.

Did the code show anything following it? No? Well, then assume
it isn't there. ;-)

Next time I'll show this to someone, I'll add a semicolon.



Abigail



Re: First differing char in two strings

2002-04-11 Thread abigail

On Thu, Apr 11, 2002 at 11:13:44AM -0400, Jeff 'japhy' Pinyan wrote:
 On Apr 11, Jeff 'japhy' Pinyan said:
 
 On Apr 11, Paul Makepeace said:
 
 The task is to find the first differing character given two strings.
 
 You can get a little niftier if you're using Perl 5.6:
 
   ($a^$b)=~/^\0*/&&$+[0]
 
 Duh.  Remove the ^ and change the && to * and you save two chars:
 
   ($a^$b)=~/\0*/*$+[0]
 
 The regex always succeeds -- thus, always returns 1.


Yeah, but does Perl actually guarantee it will evaluate the
left operand of arithmetic operators first? If so, I cannot
find it in the documentation.

Luckily, there's a 1-character length operator that is documented
to first evaluate the left operand, then the right: ,

($a^$b)=~/\0*/,$+[0]

If you actually want to assign the result, you'd have to write it as:

($a^$b)=~/\0*/,$x=$+[0]
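
Spelled out at full length, the trick looks roughly like this. It is a sketch rather than a golf entry: the sub name `first_diff` is made up here, and separate statements are used so that evaluation order is not in question.

```perl
use strict;
use warnings;

# Index of the first character at which two strings differ.
# XORing the strings gives "\0" wherever they agree, so the length of
# the leading run of NULs is the answer; $+[0] is the offset at which
# the last successful match ended.
sub first_diff {
    my ($x, $y) = @_;
    ($x ^ $y) =~ /^\0*/;    # always matches, possibly the empty string
    return $+[0];
}

print first_diff("hello", "help!"), "\n";
```

For identical strings the result is their length, and a shorter string is treated as if padded with NULs, so "abc" and "abcd" first differ at index 3.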



Abigail



Re: First differing char in two strings

2002-04-11 Thread abigail

On Thu, Apr 11, 2002 at 01:43:31PM -0400, Jeff 'japhy' Pinyan wrote:
 On Apr 11, [EMAIL PROTECTED] said:
 
 On Thu, Apr 11, 2002 at 11:13:44AM -0400, Jeff 'japhy' Pinyan wrote:
  On Apr 11, Jeff 'japhy' Pinyan said:
  
($a^$b)=~/\0*/*$+[0]
 
 Yeah, but does Perl actually guarantee it will evaluate the
 left operand of arithmetic operators first? If so, I cannot
 find it in the documentation.
 
 Well, I've never had that problem.  pop() - pop() works as I expect it to,
 evaluating left-to-right.

Well, yes. But how can you be sure this will always be the case?
Perhaps in a next release, it won't. Is it documented to work this
way, or does it just happen to work?

 Luckily, there's a 1-character length operator that is documented
 to first evaluate the left operand, then the right: ,
 
 ($a^$b)=~/\0*/,$+[0]
 
 Ah, but yours cannot be dropped in as an assignment, as mine can.
 
   $VAL = ($a^$b)=~/\0*/*$+[0];
   $VAL = ($a^$b)=~/\0*/,$+[0];

Well, it can be dropped in, you just have to know how to drop it! ;-)

($a^$b)=~/\0*/,$VAL = $+[0];


Abigail



Re: Fisher-Yates shuffle

2002-04-09 Thread abigail

On Tue, Apr 09, 2002 at 01:49:01PM +0300, Ilmari Karonen wrote:
 
 On Fri, 5 Apr 2002 [EMAIL PROTECTED] wrote:
  
  sub shuffle {
  for (my  $i = @_;  $i;) {
   my  $j = rand $i --;
   @_ [$i => $j] = @_ [$j => $i]
  }
  @_;
  }
 
 I could live with that.  To really go into the FAQ, though, it'd need an
 explanatory paragraph or two, about how it works and why.  And I'd also
 insist on using a while loop instead of for, even if it costs a line.

I think a while loop isn't right. You have an iterator with an obvious
starting value, which gets decremented with the same amount in each
iteration. If that isn't screaming 'for', you might as well start
deprecating 'for'.

  Perhaps this is the best of both worlds:
  - It takes a list as argument.
  - It's in-situ.
  - In list context, it returns the shuffled list.
  - In scalar (and hence void) context, it doesn't require linear
additional memory.
 
 It doesn't, unfortunately, allow for the idiom one expects from a
 copy-and-shuffle function, namely
 
   my @shuffled = shuffle @original;
 
 In fact, it's not really copy-and-shuffle at all -- it's an in-place
 shuffle with an optional *trailing* copy operation.  Not very useful,
 really.

It allows for:

my @shuffled = shuffle 1 .. 10;

which, in my opinion, is very intuitive. If the function doesn't
return a list in list context, you would have to write that as

shuffle my @shuffled = 1 .. 10;

which not only is harder to understand, but makes people wonder
why one calls an array '@shuffled' when you assign an ordered
list to it. ;-)

 It does have an advantage over the FAQ solution in that it takes a list
 instead of an arrayref.  I might, in fact, prefer to omit the last
 statement entirely, so that the function does an in-place shuffle of its
 arguments and returns nothing in any context.

It will always return *something* in scalar or list context. You might
as well have it do the right thing in list context. Isn't that what's
Perl all about?  ;-)

 Putting these ideas together, I'd suggest something like this:
 
 =pod
 
 Alternatively, you may use this function, which shuffles its arguments
 in place.  The algorithm is known as a Fisher-Yates shuffle, and can be
 proven to produce a uniform distribution of permutations, provided that
 the random number generator is sufficiently random.
 
 sub shuffle {
 my $i = @_; # length of @_ array
 while ($i) {
  my $j = rand $i--;
  @_[$i, $j] = @_[$j, $i];   # 0 <= int($j) <= $i
 }
 }
 
 # usage examples:
 shuffle @array;
 shuffle $a, $b, $c;
 shuffle my @shuffled = @original;
 
 It may be educational to work out the proof of correctness by yourself
 -- to get you started, consider the part of the array below C<$i> as a
 pool from which elements get picked one at a time at random.  Note that
 it's surprisingly easy to introduce subtle bugs into the algorithm, for
 example by replacing C<$i--> with C<--$i>.  Can you see why?


I do think it needs a reference to Knuth [1]. Or to the original publication
of Fisher and Yates [2].

[1] D. E. Knuth: I<The Art of Computer Programming>, Volume 2,
Third edition. Section 3.4.2, Algorithm P, pp 145. Reading:
Addison-Wesley, 1997. ISBN: 0-201-89684-2.

[2] R. A. Fisher and F. Yates: I<Statistical Tables>. London, 1938.
Example 12.



Abigail



Re: Fisher-Yates shuffle

2002-04-05 Thread abigail

On Fri, Apr 05, 2002 at 06:30:17PM +0300, Ilmari Karonen wrote:
 
 On Fri, 5 Apr 2002 [EMAIL PROTECTED] wrote:
  On Fri, Apr 05, 2002 at 12:17:27AM +0300, Ilmari Karonen wrote:
   The point of taking a reference is that the Fisher-Yates algorithm is an
   in-place shuffle.  If your array happens to be a couple of megabytes in
   size, you start to appreciate this feature.  So, since we have this nice
  
  Ah, well, one could give the same argument for sort and map, but
  no one seems to object to them not performing in situ operations.
 
 Well, map does allow in-place modification.  I seem to recall you taking
 advantage of this feature before, and defending it quite vehemently
 against accusations of obfuscation...  ;-)

I challenge you to find a piece of code of me that uses map{} to
modify the list it's working on.

Don't confuse modifying *ELEMENTS* with modifying an array. Any function
can modify array elements already, if the array is given as argument.
(See below ;-)

  @array = shuffle @array;   # Tada!
 
 That's not an in-place shuffle.  It's a copy-and-shuffle, where the
 final copy just happens to overwrite the original.  It still takes O(n)
 extra memory[1], whereas a real in-place algorithm takes O(1).


Oh, but with a few modifications, you don't need the extra memory.


sub shuffle {
for (my  $i = @_;  $i;) {
 my  $j = rand $i --;
 @_ [$i => $j] = @_ [$j => $i]
}
@_;
}


Perhaps this is the best of both worlds:
- It takes a list as argument.
- It's in-situ.
- In list context, it returns the shuffled list.
- In scalar (and hence void) context, it doesn't require linear
  additional memory.
   

It's also, IMO, a simpler function than the one presented previously.
No references, no return in a weird place, no check for special cases.

(Yes, I realize that it swaps element 0 with itself at the end - a fair
 price for the simplicity)
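
A short usage sketch of a function of this shape, relying on the fact that the elements of @_ are aliases to the caller's arguments. The explicit int and the comma in the slices are cosmetic changes made here; the `@deck` and `@copy` arrays are only illustration.

```perl
use strict;
use warnings;

sub shuffle {
    for (my $i = @_; $i;) {
        my $j = int rand $i--;        # 0 <= $j <= $i after the decrement
        @_[$i, $j] = @_[$j, $i];      # swaps the caller's own elements
    }
    @_;
}

my @deck = (1 .. 10);
shuffle @deck;                        # void context: shuffled in place
my @copy = shuffle @deck;             # list context: also returns the list

print "@deck\n";
```

Because the swap goes through the aliases, passing an array shuffles that array itself; the returned list is an extra, paid for only when the caller asks for it.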


Abigail



Re: Fisher-Yates shuffle

2002-04-04 Thread abigail

On Wed, Apr 03, 2002 at 10:37:08PM -0700, Sean M. Burke wrote:
 I was looking at the implementation of the FY shuffle that I see in 
 Perlfaq4, namely:
 
  sub fy1 {
  my $deck = shift;  # $deck is a reference to an array
  my $i = @$deck;
  while (--$i) {
  my $j = int rand ($i+1);
  @$deck[$i,$j] = @$deck[$j,$i];
  }
  }
 
 And I thought guh, that looks a bit off -- notably, it dies when given an 
 empty array.


Yeah. It's odd. FY is a simple algorithm, which can be copied from many
sources, and yet people seem to be able to screw it up all the time.

The one in Algorithm::Numerical::Shuffle, from which the FAQ entry was
derived, doesn't have a problem with empty arrays though.

I must say however, that requiring that the array is passed as a 
reference seems an oddity to me. If you pass it a list, it should,
IMO, return the shuffled list.

 So I banged out the following, which I'm pretty sure does the same work:
 
 sub fy2 {
 my($deck,$i,$j,$t) = $_[0];
 $i = @$deck || return;
 while(1) {
   $j = int rand($i-- || return);
   $t = $deck->[$i];
   $deck->[$i] = $deck->[$j];
   $deck->[$j] = $t;
 }
 }
 # Thanks to Silmaril for pointing out that the temp var is faster

But this one is butt ugly! '|| return' inside a call to rand() is nice
for obfuscation purposes, but it doesn't make for readable programs. Nor does
it make for a solution that's easily converted to one that shuffles a
given list. Therefore, I think such trickery doesn't belong in the FAQ.

I'd prefer to just fix the error in fy1, and put

   sub fy1 {
   my $deck = shift;  # $deck is a reference to an array
   my $i = @$deck;
   while ($i--) {
   my $j = int rand ($i+1);
   @$deck[$i,$j] = @$deck[$j,$i];
   }
   }

in the FAQ. Speed isn't holy - if you want speed, you would have used
Inline::C or XS anyway.

Besides, it would be better to fix perl such that both cases are
equivalent. ;-)
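
To see that the post-decrement version copes with the empty case, a small self-contained check can help. The sub name and the test data below are chosen for this sketch only.

```perl
use strict;
use warnings;

sub fisher_yates_shuffle {
    my $deck = shift;                 # $deck is a reference to an array
    my $i = @$deck;
    while ($i--) {                    # tests before decrementing: 0 ends the loop
        my $j = int rand($i + 1);
        @$deck[$i, $j] = @$deck[$j, $i];
    }
}

my @empty = ();
fisher_yates_shuffle(\@empty);        # returns at once, nothing to do

my @cards = ('A', 2 .. 10, 'J', 'Q', 'K');
fisher_yates_shuffle(\@cards);
print "@cards\n";
```

With `--$i` the loop test for an empty array sees -1, which is true, so the loop never gets the chance to stop at zero; with `$i--` the test sees 0 first.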

 And besides fixing the bug with empty arrays, it runs twenty-odd percent 
 faster.
 That ratio seems to hold true whether the deck is numbers, strings, or objects

That's not very surprising, because the values of the elements
aren't accessed at all.

 I was quite surprised -- it seems too good to be true.  Am I doing anything 
 obviously wrong?


Can't spot any problems.



Abigail



Re: Why bother with separate lists?

2002-03-19 Thread abigail

On Mon, Mar 18, 2002 at 09:31:34PM -0500, Bernie Cosell wrote:
 For all of the last several golf matches, essentially *all* of the 
 traffic specific to a single tournament has been on a very small number 
 of threads [to their credit, the golfers have stuck to reasonable and 
 consistent 'subject' lines rather than lots of random non-threaded ones]. 
 That means that in almost any mail client, it is just a couple of key 
 strokes to 'kill' a thread or two and make the entire 'tournament' 
 essentially cease to exist [as far as you're concerned].

It would, if all the traffic came at once. But usually, those threads
last for over a week. Most people do read mail more than once a week.

And even if you go through the trouble of setting up a perfect killfile
entry, you still need to download the stuff.

 None of the traffic on the list is all that high [what... ten messages a 
 day, if that much, for a particular tournamet??] so I'd say -- it is all 

It's more than 10 messages a day. Recently, I killed 143 golf related
messages on fwp after not being able to read mail for a few days.

 fun for *some* of us, so keep it on fwp, just be careful with subject 

some being the keyword. I'd say that because it's fun for only *some*
of us, a separate list is in order.

 lines and let folk use their mail clients to 'tune out' what they choose 
 not to want to see [I can only say that that's what I did...at some point 
 the discussion of one of the TPRs got too involved in minutiae I wasn't 
 following so I killed the thread and I don't even know [or care] how much 
 longer it went on..].

Now, if I could place filters at my ISP, I'm all for it. But since that's
out of the question, I don't go for the tune out reasoning.



Abigail



Re: Why bother with separate lists?

2002-03-19 Thread abigail

On Tue, Mar 19, 2002 at 10:27:47PM -0500, Bernie Cosell wrote:
 
 I say keep it as one list and have folk learn to use the machinery their 
 mail clients provide them with; that's why mail clients HAVE that 
 machinery.


With that reasoning, all we need is one mailing list. Not just for Perl,
for everything.



Abigail



Re: regex for html img... tags

2002-03-19 Thread abigail

On Tue, Mar 19, 2002 at 09:43:00PM -0800, Jeremy Zawodny wrote:
 On Wed, Mar 20, 2002 at 06:29:24AM +0100, [EMAIL PROTECTED] wrote:
  On Tue, Mar 19, 2002 at 09:12:30PM -0800, Randal L. Schwartz wrote:
   
   Only vaguely.  I'm a bit embarassed by them, actually.  I think my
   original twistyness has devolved to Obfuperl, and *that* has
   contributed to people thinking that Perl is really inherently
   obfuscated, which undermines what *I* would like to see how Perl is
   perceived in the marketplace.  So it may have backfired.  Maybe
   Obfuperl would have been come about some other way, but I'm sure my
   JAPHs were a contributing factor.
  
  Frankly, I doubt that's true. There's a obfuscated C contest, but
  that doesn't make people think C is inherently obfuscated.
 
 Apples and Oranges--from where I sit at least.
 
 The obfuscated C contest is far less a part of the C culture compared
 with JAPHs, Golf, etc in the Perl culture.  That could have something
 to do with the perception.


No, it's not part of the culture. At least not of the culture of all
the people programming Perl. Only a handful of people write JAPHs,
play golf or intentionally obfuscate Perl programs. Of the many Perl
books on my bookshelves, none spends more than a few lines on Japhs,
golf or obfuscated programs, and most of them don't mention them at all.

The majority of the people programming Perl don't even know anything
about Japhs, golf or obfuscated Perl. Don't consider the inbred circle
of people on the various mailing lists, clpm, #perl, and perlmonks as
the average Perl programmer. They aren't, they are the intimicy. And
what they do on their mailing lists, newsgroups and websites remains
hidden for most of the world.

It's like saying ice-fishing is part of the American culture, just
because a bunch of people in Alaska do that in winter.


Abigail



Re: regex for html img... tags

2002-03-19 Thread abigail

On Tue, Mar 19, 2002 at 09:12:30PM -0800, Randal L. Schwartz wrote:
 
 Only vaguely.  I'm a bit embarassed by them, actually.  I think my
 original twistyness has devolved to Obfuperl, and *that* has
 contributed to people thinking that Perl is really inherently
 obfuscated, which undermines what *I* would like to see how Perl is
 perceived in the marketplace.  So it may have backfired.  Maybe
 Obfuperl would have been come about some other way, but I'm sure my
 JAPHs were a contributing factor.

Frankly, I doubt that's true. There's an obfuscated C contest, but that
doesn't make people think C is inherently obfuscated.

That Perl has the name of being inherently obfuscated has, IMO, other
reasons. It's the design of Perl; it allows you to do things in many ways,
with lots of special cases. That's a double-edged sword: it makes life of
the experienced programmer with self-discipline easier, but it provides
too much rope for most programmers. Perl makes it easy to write programs -
both good *and* bad ones. I don't think Japhs and intentionally obfuscated
Perl contributed more to the idea Perl is inherently obfuscated than
the scores of mediocre programmers who should never have used Perl in
the first place.

To make a (bad?) analogy, I don't think Ferrari will get a reputation
of being an unsafe car if a bunch of people intentionally crash Ferrari
cars into brick walls - but it will get an (undeserved) bad reputation
if a bunch of Sunday drivers crash with them.

I've always said, and I keep on saying so, that Perl is not suited for
most people calling themselves programmers.

 Having said that, I really enjoyed Abigail's presentation a few years
 back at YAPC.

Thanks. 



Abigail



Re: Non-golf fun

2002-03-18 Thread abigail

On Tue, Mar 19, 2002 at 10:46:04AM +1100, [EMAIL PROTECTED] wrote:
 
 In an attempt to stimulate non-golf threads:
 
 .. Has anybody done any really interesting Perl hacks that they are
 proud of?


Yeah, I do actually. Recently I was writing a program that needed
to (shell) source a file, to get some environment variables set,
and use those variables in further calculations.

I could of course have written a shell wrapper that sourced the file,
then started my program. But I decided I just wanted a single program.

To source a shell file, one cannot use 'system' - that would start
a child process, and setting the environment in the child is pointless.
So, I decided to use a trick, a double exec. First I exec a shell
that is sourcing the file with the environment variables, then I
exec the original program - with a special argument to indicate the
variables have been set.

The relevant part of said program follows.


Abigail



my $ENVIRONMENT = "/some/file/somewhere";

if (@ARGV && $ARGV [0] eq '--sourced_environment') {
shift;
}
else {
if (-f $ENVIRONMENT) {  
#
# Now we perform a double exec. The first exec gives us a shell,
# allowing us the source the file with the environment variables.
# Then, from within the shell we re-exec ourself - but with an
# argument that will prevent us from going into infinite recursion.
#
# We cannot do a 'system source $ENVIRONMENT', because
# environment variables are not propagated to the parent.
#
# Note the required trickery to do the appropriate shell quoting
# when passing @ARGV back to ourselves.
#

@ARGV = map {s/'/'"'"'/g; "'$_'"} @ARGV;

exec << "--";
source '$ENVIRONMENT'
exec    $0  --sourced_environment @ARGV;
--
die  "This should never happen.";
}
}
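
The quoting step is the part that is easiest to get wrong, so here it is in isolation. A sketch only: `shell_quote` is a name invented for this example, and it assumes a Bourne-style shell, where nothing is special inside single quotes.

```perl
use strict;
use warnings;

# Wrap each argument in single quotes for a Bourne-style shell.
# An embedded ' is written as '"'"': close the single quotes, emit a
# double-quoted ', then reopen the single quotes.
sub shell_quote {
    my @quoted;
    for my $arg (@_) {
        (my $copy = $arg) =~ s/'/'"'"'/g;
        push @quoted, "'$copy'";
    }
    return @quoted;
}

print join(" ", shell_quote("plain", "two words", "it's")), "\n";
```

Working on a copy leaves the caller's arguments untouched, unlike the map in the program above, which edits @ARGV on purpose.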




Re: regex for html img... tags

2002-03-17 Thread abigail

On Sun, Mar 17, 2002 at 02:27:14PM +, Kim Schulz wrote:
 
 hi
 I have a long string which contains some img.. html tags ala
 img height=67 alt= hspace=0 
 src=C:\DHTMSamp\SAMPLES\Web\images\AddItem.GIF width=72 align=baseline 
 border=0
 I need to do it throughout the whole string which can contain serval img 
 tags. 
 
 I need an regex that can replace everything between src= and
 the imagename with images/ 
 
 I want to make those stupid MS typical direct image links so they use the 
 images on the server insted of those on the client machine. 
 
 Can anyone help me?? 


Which part of the perl FAQ addressing this question did you fail to understand?



Abigail



Re: longest common substring (return of)

2002-03-13 Thread abigail

On Thu, Mar 14, 2002 at 10:38:29AM -0500, Ted Zlatanov wrote:
 I remember a lengthy LCS discussion, and the solutions for the most
 part used the regex engine.  I also looked on CPAN, but couldn't find
 a canonical LCS module.
 
 Has anyone developed an implementation of the well-known bitstring
 algorithm?  Basically you convert your data strings to bitstrings, AND
 the two, and look for the longest match.  Then you rotate the shorter
 data string by one, and repeat the test.  Repeat for the number of
 bits in the shorter data string.  I want to do it, but just wanted to
 check if it had come up before on FWP.
 
 There are even better algorithms to do this.  I found them with
 Google; for instance the Goeman & Clausen 1999 "A New Practical Linear
 Space Algorithm for the LCS Problem" paper looks interesting but is
 beyond what I remember of my college math.  It describes a 
 O(ns + min(mp, (m log(m)) + p(n-p))) time, linear space algorithm and a
 O(ns + min(mp, p(n-p))) time, O(ns) space algorithm, and claims the
 latter to be the fastest known such algorithm (m = size of string A, 
 n = size of string B, n >= m, s = alphabet size, p = length of LCS).
 I'm pretty sure implementations of the Goeman & Clausen algorithms
 haven't been done in Perl yet.


I'd like to point out that the LCS problem from the literature is usually
a harder problem than the LCS problem discussed here.  Here we usually
talk about consecutive common substrings, while the literature drops
the consecutive restriction.
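
To make the distinction concrete, here is a plain dynamic programming sketch for the consecutive variant. It is the textbook quadratic method, not one of the algorithms from the paper mentioned above, and the sub name is made up for this example.

```perl
use strict;
use warnings;

# Longest run of consecutive characters common to both strings.
# $cur[$j] holds the length of the common run ending at position $i of
# $s and position $j of $t. Takes O(m*n) time.
sub longest_common_substring {
    my ($s, $t) = @_;
    my ($best_len, $best_end) = (0, 0);
    my @prev = (0) x (length($t) + 1);
    for my $i (1 .. length $s) {
        my @cur = (0) x (length($t) + 1);
        for my $j (1 .. length $t) {
            next unless substr($s, $i - 1, 1) eq substr($t, $j - 1, 1);
            $cur[$j] = $prev[$j - 1] + 1;
            ($best_len, $best_end) = ($cur[$j], $i) if $cur[$j] > $best_len;
        }
        @prev = @cur;
    }
    return substr($s, $best_end - $best_len, $best_len);
}

print longest_common_substring("abcXdef", "abcYdef"), "\n";
```

For "abcXdef" and "abcYdef" the consecutive answer has length 3, while the longest common subsequence in the literature's sense would be "abcdef".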



Abigail



Re: Sort is lazy?!? (as in Haskell)

2002-03-10 Thread abigail

On Sun, Mar 10, 2002 at 10:16:05PM +, Jonathan E. Paton wrote:
   PS  With a truly lazy sort you'd be able to do...
   
   ($first) = sort { $a wibble $b } big_list;  # in O(n)
  time.
  
  Not necessarily - it would depend on the underlying
  sorting mechanism.
 
 It wouldn't be lazy then, would it?
 
  Shell sort and heap sort for instance don't get the
  first element first.
 
 Huh?  Are you saying that those sorting algorithms
 don't actually sort?

No. My second "first" was a statement of time. Lazy evaluation means that
you don't calculate what you don't need to calculate. But that doesn't mean
that any sort algorithm can find the smallest element in linear time,
just by making it lazy.  A typical heap sort implementation finds the
elements from large to small, taking O (log n) for each next element, 
after an initial O (n) preprocessing time. And you get to the smallest
element only after finding all the other elements.

 I think the idea of Lazy sort is in certain
 circumstances, say for first X highest items, then you
 can switch to a different and faster algorithm.
 
 A lazy sort can be done in n time, easily:
 
 highest;
 foreach element {
if (element > lowest in highest) {
stick in highest, removing lowest in highest;
}
 }
 return sort highest
 
 The Big Oh of that is n.


No, it's not. It's O (n log X).
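
A minimal sketch of where the log X comes from, assuming the "highest" set is kept in a binary min-heap of at most X elements (the sub name top_x and the sample numbers are made up for illustration). Finding and replacing the lowest of the highest costs O(log X) per element in the worst case, so the whole pass is O(n log X) rather than O(n).

```perl
use strict;
use warnings;

# Return the $x largest numbers of @list, largest first.
sub top_x {
    my ($x, @list) = @_;
    return if $x < 1;
    my @heap;    # min-heap: $heap[0] is the lowest of the current highest
    for my $item (@list) {
        if (@heap < $x) {
            # Heap not full yet: append and sift up.
            push @heap, $item;
            my $i = $#heap;
            while ($i > 0) {
                my $parent = int(($i - 1) / 2);
                last if $heap[$parent] <= $heap[$i];
                @heap[$parent, $i] = @heap[$i, $parent];
                $i = $parent;
            }
        }
        elsif ($item > $heap[0]) {
            # Replace the lowest candidate and sift down.
            $heap[0] = $item;
            my $i = 0;
            while (1) {
                my $small = $i;
                for my $child (2 * $i + 1, 2 * $i + 2) {
                    $small = $child
                        if $child < @heap && $heap[$child] < $heap[$small];
                }
                last if $small == $i;
                @heap[$small, $i] = @heap[$i, $small];
                $i = $small;
            }
        }
    }
    return sort { $b <=> $a } @heap;    # final O(X log X) sort
}

print join(" ", top_x(3, 5, 1, 9, 3, 7, 8, 2)), "\n";
```

With X fixed the cost is linear in n, which is probably what the quoted claim had in mind, but X is part of the input.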



Abigail



Re: More Wacky Solutions

2002-01-31 Thread abigail

On Thu, Jan 31, 2002 at 06:28:08PM +1100, [EMAIL PROTECTED] wrote:
 We already have Tall Trees/Toothpicks:
 
 -p $_ x=$|--y|||c~(y|a|||y|e|||y|i|||y|o|||y|u|||y|y||)   57 klem
 -nl ($.|y|||c|y|a|||y|e|||y|i|||y|o|||y|u|||y|y||)1||print 59 bart
 -n 1($.|~y|||c|y|a|||y|e|||y|i|||y|o|||y|u|||y|y||)||print 59 byng
 -ln 1($.|y|o|||y|e|||y|u|||y|a|||y|i|||y|y|||y|||c)||print 59 ivey
 -ln 1($.|y|||c|y|a|||y|e|||y|i|||y|o|||y|u|||y|y||)||print 59 sean
 -p 1($.|~y|||c|y|a|||y|e|||y|i|||y|o|||y|u|||y|y||)y|||cd
 60 byng
 
 and Journey Beyond the Stars:
 
 -p y*a
 ***y*e
 ***y*i
 ***y*o
 ***y*u
 ***y*y
 ***y***c*($.+1)1or$_=$*
 
 To these, I would like to add Ampersands of Time:
 
 -p !($|--~ya~ye~yi~yo~yu~yyyc)ycd
 
 containing 33  characters in a score of 66.
 
 I expect Mexican Wave (based on ~) and Dashing (based on -)
 are also possible.
 
 I wonder what the highest possible ratio is? The best I have come
 up with so far is 0.5238 (33/63) by modifying Matthew's 60-stroke
 solution:

One can get as close to 0.833 as one wants, by writing something like:

$=ss;

and adding as many 's's after the = as needed to obtain the desired
ratio.


Or, by using comments and strings, ratios as close to 1 as wanted can
be achieved.



Abigail



Re: A perverse use of the glob function

2002-01-30 Thread abigail

On Wed, Jan 30, 2002 at 03:55:32PM +0100, Joerg Ziefle wrote:
 
 Not to forget scalar evaluation within strings:
 
${\(foo)}
 
 as in:
 
perl -e 'print "Current time is: ${\(scalar localtime)}\n"'
 
 (the parens could as well have been omitted)
 
 as opposed to the array evaluation
 
@{[foo]}
 
 as in:
 
perl -e 'print "Current time is: @{[scalar localtime]}\n"'
 
 Note the obvious difference to:
 
perl -e 'print "Current time is: @{[localtime]}\n"'


Eh, it's the scalar that makes the scalar evaluation in those
examples. After all @{[scalar localtime]} gives the result
of 'localtime' in scalar context, not list context as you suggest.

And you even need the 'scalar' if you are using ${\(EXPR)}, as
\ doesn't propagate context; it provides list context.


#!/usr/bin/perl

use strict;
use warnings 'all';

sub context {wantarray ? "LIST" : "SCALAR"}

print "${\context}\n";
__END__
LIST


This makes the ${\(EXPR)} not very useful; one could as well use
@{[EXPR]} - which not only saves a keystroke, but is more symmetric.
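
A small sketch putting the four combinations side by side, reusing the wantarray probe from the example above (the variable names are made up for illustration). Both constructs call the sub in list context unless 'scalar' is spelled out.

```perl
use strict;
use warnings;

sub context { wantarray ? "LIST" : "SCALAR" }

my $plain  = "${\context}";            # \ supplies list context
my $forced = "${\scalar context}";     # explicit 'scalar' is needed
my $array  = "@{[context]}";           # [ ] supplies list context too
my $both   = "@{[scalar context]}";    # and needs 'scalar' just the same

print "$plain $forced $array $both\n";
```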



Abigail



Re: A perverse use of the glob function

2002-01-30 Thread abigail

On Wed, Jan 30, 2002 at 04:46:28PM +0100, Joerg Ziefle wrote:
 -- Original Message --
 From: [EMAIL PROTECTED]
 Reply-To: [EMAIL PROTECTED]
 Date: Wed, 30 Jan 2002 16:24:20 +0100
 
 On Wed, Jan 30, 2002 at 03:55:32PM +0100, Joerg Ziefle wrote:
 perl -e 'print "Current time is: @{[scalar localtime]}\n"'
  
  Note the obvious difference to:
  
 perl -e 'print "Current time is: @{[localtime]}\n"'
 
 
 Eh, it's the scalar that makes the scalar evaluation in those
 examples. After all @{[scalar localtime]} gives the result
 of 'localtime' in scalar context, not list context as you suggest.
 
 Ok, got me :)
 The example was not the best and would have better been something along
 
 perl -e 'print "1 + 3 = ${\(1+3)}\n"'

That would not be a very good example, as 1 + 3 is 4 in both scalar
and list context.

 BTW, can you think of a clunkier way to get the name of the current script as
 
 print Call me $${\localtime} darling.\n
 
 (and with some luck, that even fails :)?

I fail to understand what you mean.

 And you even need the 'scalar' if you are using ${\(EXPR)}, as
 \ doesn't propagate context; it provides list context.
 
 But
 
 ${\foo}
 
 (as in your example) is still shorter than
 
 @{[foo]}
 
 (agreed however that the latter looks nicer).


Yeah, but \ has quite a high priority. Higher than most binary operators.
Most expressions you'd need to parenthesize.



Abigail



Re: Better ?

2002-01-28 Thread abigail

On Fri, Jan 25, 2002 at 11:31:37AM -0600, [EMAIL PROTECTED] wrote:
 Greetings, On Friday, January 25, 2002, at 10:56  AM, 
 [EMAIL PROTECTED] wrote:
 
  /[fF][oO][oO]/ better than /foo/i
 
  That'd be Mr. RE Jeffrey Freidl in the Mastering RE/Owl book. 
  The parser
  has to do less backtracking or something.
 
 
 I have to disagree; while I have read the RE/Owl (I hear there is 
 another in the works?) I feel that
 
 /[fF][oO][oO]/ lacks clarity over /foo/i
 
 
 You are correct, its butt-ugly but pg 280 goes over the details.  For the 
 word while (ugly vs. /i) /i is 50 times slower.  While he gets it down to 
 a mere 77% slower and cautions not to get hung up about it, it is worth 
 thinking about if you need the speed, esp. w/ a big RE.


I've heard the claim "50 times slower" often. But, although Benchmark.pm
should make it easy, I've yet to see anyone actually supporting their
claim with any evidence.
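
For anyone who wants to check, a minimal sketch of such a measurement with Benchmark's cmpthese. The test data and sub names are made up for illustration, and the numbers will differ per perl version and per input, so no result is claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up test data: 1000 long lines, half of which contain "FoO".
my @lines = map { my $tail = $_ % 2 ? "FoO" : "bar"; ("x" x 200) . $tail }
            1 .. 1000;

sub count_i     { scalar grep { /foo/i }         @lines }
sub count_class { scalar grep { /[fF][oO][oO]/ } @lines }

# Both variants must agree before their timings mean anything.
die "regexes disagree" unless count_i() == count_class();

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    'foo/i'   => \&count_i,
    'classes' => \&count_class,
});
```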

Cargo-cult programming is bad, but making cargo-cult claims is as bad.



Abigail



Re: match URI and nothing more?

2002-01-21 Thread abigail

On Fri, Jan 18, 2002 at 07:12:09PM +0100, Kim Schulz wrote:
 hi
 does anyone have a regex for checking a text for URI's collect them in
 an array.?? I has to be very precise because some of the textlines looks
 like URI's but aren't (a text talking about dot.com companies ). help
 please :o)


Here's a start. It finds URIs for http, ftp, news, nntp, telnet,
gopher, wais, mailto, file, prospero, ldap (sort of), z39_50,
cid, mid, vemmi, imap and nfs. Not what you'd call complete.

You'd have to remove the newlines. ;-)

Abigail


(?:http://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.
)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)
){3}))(?::(?:\d+))?)(?:/(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F
\d]{2}))|[;:@=])*)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{
2}))|[;:@=])*))*)(?:\?(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{
2}))|[;:@=])*))?)?)|(?:ftp://(?:(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?
:%[a-fA-F\d]{2}))|[;?=])*)(?::(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-
fA-F\d]{2}))|[;?=])*))?@)?(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-
)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?
:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?))(?:/(?:(?:(?:(?:[a-zA-Z\d$\-_.+!
*'(),]|(?:%[a-fA-F\d]{2}))|[?:@=])*)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'()
,]|(?:%[a-fA-F\d]{2}))|[?:@=])*))*)(?:;type=[AIDaid])?)?)|(?:news:(?:
(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;/?:=])+@(?:(?:(
?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[
a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3})))|(?:[a-zA-Z](
?:[a-zA-Z\d]|[_.+-])*)|\*))|(?:nntp://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[
a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d
])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?)/(?:[a-zA-Z](?:[a-zA-Z
\d]|[_.+-])*)(?:/(?:\d+))?)|(?:telnet://(?:(?:(?:(?:(?:[a-zA-Z\d$\-_.+
!*'(),]|(?:%[a-fA-F\d]{2}))|[;?=])*)(?::(?:(?:(?:[a-zA-Z\d$\-_.+!*'()
,]|(?:%[a-fA-F\d]{2}))|[;?=])*))?@)?(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a
-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d]
)?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?))/?)|(?:gopher://(?:(?:
(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:
(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+
))?)(?:/(?:[a-zA-Z\d$\-_.+!*'(),;/?:@=]|(?:%[a-fA-F\d]{2}))(?:(?:(?:[
a-zA-Z\d$\-_.+!*'(),;/?:@=]|(?:%[a-fA-F\d]{2}))*)(?:%09(?:(?:(?:[a-zA
-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;:@=])*)(?:%09(?:(?:[a-zA-Z\d$
\-_.+!*'(),;/?:@=]|(?:%[a-fA-F\d]{2}))*))?)?)?)?)|(?:wais://(?:(?:(?:
(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:
[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?
)/(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*)(?:(?:/(?:(?:[a-zA
-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*)/(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(
?:%[a-fA-F\d]{2}))*))|\?(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]
{2}))|[;:@=])*))?)|(?:mailto:(?:(?:[a-zA-Z\d$\-_.+!*'(),;/?:@=]|(?:%
[a-fA-F\d]{2}))+))|(?:file://(?:(?:(?:(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]
|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:
(?:\d+)(?:\.(?:\d+)){3}))|localhost)?/(?:(?:(?:(?:[a-zA-Z\d$\-_.+!*'()
,]|(?:%[a-fA-F\d]{2}))|[?:@=])*)(?:/(?:(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(
?:%[a-fA-F\d]{2}))|[?:@=])*))*))|(?:prospero://(?:(?:(?:(?:(?:[a-zA-Z
\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:[a-zA-Z\d]|-)
*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?)/(?:(?:(?:(?
:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:@=])*)(?:/(?:(?:(?:[a-
zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:@=])*))*)(?:(?:;(?:(?:(?:[
a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:@])*)=(?:(?:(?:[a-zA-Z\d
$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:@])*)))*)|(?:ldap://(?:(?:(?:(?:
(?:(?:[a-zA-Z\d](?:(?:[a-zA-Z\d]|-)*[a-zA-Z\d])?)\.)*(?:[a-zA-Z](?:(?:
[a-zA-Z\d]|-)*[a-zA-Z\d])?))|(?:(?:\d+)(?:\.(?:\d+)){3}))(?::(?:\d+))?
))?/(?:(?:(?:(?:(?:(?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d])
)|(?:%20))+|(?:OID|oid)\.(?:(?:\d+)(?:\.(?:\d+))*))(?:(?:%0[Aa])?(?:%2
0)*)=(?:(?:%0[Aa])?(?:%20)*))?(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F
\d]{2}))*))(?:(?:(?:%0[Aa])?(?:%20)*)\+(?:(?:%0[Aa])?(?:%20)*)(?:(?:(?
:(?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID
|oid)\.(?:(?:\d+)(?:\.(?:\d+))*))(?:(?:%0[Aa])?(?:%20)*)=(?:(?:%0[Aa])
?(?:%20)*))?(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*)))*)(?:(
?:(?:(?:%0[Aa])?(?:%20)*)(?:[;,])(?:(?:%0[Aa])?(?:%20)*))(?:(?:(?:(?:(
?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID|o
id)\.(?:(?:\d+)(?:\.(?:\d+))*))(?:(?:%0[Aa])?(?:%20)*)=(?:(?:%0[Aa])?(
?:%20)*))?(?:(?:[a-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*))(?:(?:(?:
%0[Aa])?(?:%20)*)\+(?:(?:%0[Aa])?(?:%20)*)(?:(?:(?:(?:(?:[a-zA-Z\d]|%(
?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID|oid)\.(?:(?:\d+)(?:
\.(?:\d+))*))(?:(?:%0[Aa])?(?:%20)*)=(?:(?:%0[Aa])?(?:%20)*))?(?:(?:[a
-zA-Z\d$\-_.+!*'(),]|(?:%[a-fA-F\d]{2}))*)))*))*(?:(?:(?:%0[Aa])?(?:%2
0

Re: ^=~

2002-01-21 Thread abigail

On Sat, Jan 19, 2002 at 10:49:13AM -0500, Yanick wrote:
 On Sat, Jan 19, 2002 at 03:36:59PM +, Simon Cozens wrote:
  Its semi-official name is the hypermatch operator. It's an array
  version of the ordinary =~ match operator. (Which used to be just
  for regular expressions, but is now for all sorts of matches.)
  
  @a = ("foo", "bar", "baz");
  @a ^=~ s/a/e/;
  
  turns @a into "foo", "ber", "bez".
  
  @a ^=~ /a/;
  
  returns (0,1,1). (I think.)
 
   Aaah... Neat. But does it have other advantages
 over map /a/, @a and map s/a/e/, @a than brevity? 


Yes. It's not just ^=~, the ^ prefix can be used for most, if not all,
operators.

 @c = @a ^+ @b;

is far clearer than the map equivalent.



Abigail



Re: ^=~

2002-01-21 Thread abigail

On Sun, Jan 20, 2002 at 11:31:25AM -0600, Matthew Wickline wrote:
 
 
  (well, it /is/ winking at me!)
 
 why not call it a wink? ... or maybe a flirt?


I was thinking of something like winking Siamese (^=^ being a 
Siamese cat).



Abigail



Re: Regex puzzle

2002-01-21 Thread abigail
.*\379[\x00-\xbd].*$)|
 (?:^([\x00-\xbd].*)\xbe.*\380$)|
 (?:^(.*)\xbf.*\381[\x00-\xbe].*$)|
 (?:^([\x00-\xbe].*)\xbf.*\382$)|
 (?:^(.*)\xc0.*\383[\x00-\xbf].*$)|
 (?:^([\x00-\xbf].*)\xc0.*\384$)|
 (?:^(.*)\xc1.*\385[\x00-\xc0].*$)|
 (?:^([\x00-\xc0].*)\xc1.*\386$)|
 (?:^(.*)\xc2.*\387[\x00-\xc1].*$)|
 (?:^([\x00-\xc1].*)\xc2.*\388$)|
 (?:^(.*)\xc3.*\389[\x00-\xc2].*$)|
 (?:^([\x00-\xc2].*)\xc3.*\390$)|
 (?:^(.*)\xc4.*\391[\x00-\xc3].*$)|
 (?:^([\x00-\xc3].*)\xc4.*\392$)|
 (?:^(.*)\xc5.*\393[\x00-\xc4].*$)|
 (?:^([\x00-\xc4].*)\xc5.*\394$)|
 (?:^(.*)\xc6.*\395[\x00-\xc5].*$)|
 (?:^([\x00-\xc5].*)\xc6.*\396$)|
 (?:^(.*)\xc7.*\397[\x00-\xc6].*$)|
 (?:^([\x00-\xc6].*)\xc7.*\398$)|
 (?:^(.*)\xc8.*\399[\x00-\xc7].*$)|
 (?:^([\x00-\xc7].*)\xc8.*\400$)|
 (?:^(.*)\xc9.*\401[\x00-\xc8].*$)|
 (?:^([\x00-\xc8].*)\xc9.*\402$)|
 (?:^(.*)\xca.*\403[\x00-\xc9].*$)|
 (?:^([\x00-\xc9].*)\xca.*\404$)|
 (?:^(.*)\xcb.*\405[\x00-\xca].*$)|
 (?:^([\x00-\xca].*)\xcb.*\406$)|
 (?:^(.*)\xcc.*\407[\x00-\xcb].*$)|
 (?:^([\x00-\xcb].*)\xcc.*\408$)|
 (?:^(.*)\xcd.*\409[\x00-\xcc].*$)|
 (?:^([\x00-\xcc].*)\xcd.*\410$)|
 (?:^(.*)\xce.*\411[\x00-\xcd].*$)|
 (?:^([\x00-\xcd].*)\xce.*\412$)|
 (?:^(.*)\xcf.*\413[\x00-\xce].*$)|
 (?:^([\x00-\xce].*)\xcf.*\414$)|
 (?:^(.*)\xd0.*\415[\x00-\xcf].*$)|
 (?:^([\x00-\xcf].*)\xd0.*\416$)|
 (?:^(.*)\xd1.*\417[\x00-\xd0].*$)|
 (?:^([\x00-\xd0].*)\xd1.*\418$)|
 (?:^(.*)\xd2.*\419[\x00-\xd1].*$)|
 (?:^([\x00-\xd1].*)\xd2.*\420$)|
 (?:^(.*)\xd3.*\421[\x00-\xd2].*$)|
 (?:^([\x00-\xd2].*)\xd3.*\422$)|
 (?:^(.*)\xd4.*\423[\x00-\xd3].*$)|
 (?:^([\x00-\xd3].*)\xd4.*\424$)|
 (?:^(.*)\xd5.*\425[\x00-\xd4].*$)|
 (?:^([\x00-\xd4].*)\xd5.*\426$)|
 (?:^(.*)\xd6.*\427[\x00-\xd5].*$)|
 (?:^([\x00-\xd5].*)\xd6.*\428$)|
 (?:^(.*)\xd7.*\429[\x00-\xd6].*$)|
 (?:^([\x00-\xd6].*)\xd7.*\430$)|
 (?:^(.*)\xd8.*\431[\x00-\xd7].*$)|
 (?:^([\x00-\xd7].*)\xd8.*\432$)|
 (?:^(.*)\xd9.*\433[\x00-\xd8].*$)|
 (?:^([\x00-\xd8].*)\xd9.*\434$)|
 (?:^(.*)\xda.*\435[\x00-\xd9].*$)|
 (?:^([\x00-\xd9].*)\xda.*\436$)|
 (?:^(.*)\xdb.*\437[\x00-\xda].*$)|
 (?:^([\x00-\xda].*)\xdb.*\438$)|
 (?:^(.*)\xdc.*\439[\x00-\xdb].*$)|
 (?:^([\x00-\xdb].*)\xdc.*\440$)|
 (?:^(.*)\xdd.*\441[\x00-\xdc].*$)|
 (?:^([\x00-\xdc].*)\xdd.*\442$)|
 (?:^(.*)\xde.*\443[\x00-\xdd].*$)|
 (?:^([\x00-\xdd].*)\xde.*\444$)|
 (?:^(.*)\xdf.*\445[\x00-\xde].*$)|
 (?:^([\x00-\xde].*)\xdf.*\446$)|
 (?:^(.*)\xe0.*\447[\x00-\xdf].*$)|
 (?:^([\x00-\xdf].*)\xe0.*\448$)|
 (?:^(.*)\xe1.*\449[\x00-\xe0].*$)|
 (?:^([\x00-\xe0].*)\xe1.*\450$)|
 (?:^(.*)\xe2.*\451[\x00-\xe1].*$)|
 (?:^([\x00-\xe1].*)\xe2.*\452$)|
 (?:^(.*)\xe3.*\453[\x00-\xe2].*$)|
 (?:^([\x00-\xe2].*)\xe3.*\454$)|
 (?:^(.*)\xe4.*\455[\x00-\xe3].*$)|
 (?:^([\x00-\xe3].*)\xe4.*\456$)|
 (?:^(.*)\xe5.*\457[\x00-\xe4].*$)|
 (?:^([\x00-\xe4].*)\xe5.*\458$)|
 (?:^(.*)\xe6.*\459[\x00-\xe5].*$)|
 (?:^([\x00-\xe5].*)\xe6.*\460$)|
 (?:^(.*)\xe7.*\461[\x00-\xe6].*$)|
 (?:^([\x00-\xe6].*)\xe7.*\462$)|
 (?:^(.*)\xe8.*\463[\x00-\xe7].*$)|
 (?:^([\x00-\xe7].*)\xe8.*\464$)|
 (?:^(.*)\xe9.*\465[\x00-\xe8].*$)|
 (?:^([\x00-\xe8].*)\xe9.*\466$)|
 (?:^(.*)\xea.*\467[\x00-\xe9].*$)|
 (?:^([\x00-\xe9].*)\xea.*\468$)|
 (?:^(.*)\xeb.*\469[\x00-\xea].*$)|
 (?:^([\x00-\xea].*)\xeb.*\470$)|
 (?:^(.*)\xec.*\471[\x00-\xeb].*$)|
 (?:^([\x00-\xeb].*)\xec.*\472$)|
 (?:^(.*)\xed.*\473[\x00-\xec].*$)|
 (?:^([\x00-\xec].*)\xed.*\474$)|
 (?:^(.*)\xee.*\475[\x00-\xed].*$)|
 (?:^([\x00-\xed].*)\xee.*\476$)|
 (?:^(.*)\xef.*\477[\x00-\xee].*$)|
 (?:^([\x00-\xee].*)\xef.*\478$)|
 (?:^(.*)\xf0.*\479[\x00-\xef].*$)|
 (?:^([\x00-\xef].*)\xf0.*\480$)|
 (?:^(.*)\xf1.*\481[\x00-\xf0].*$)|
 (?:^([\x00-\xf0].*)\xf1.*\482$)|
 (?:^(.*)\xf2.*\483[\x00-\xf1].*$)|
 (?:^([\x00-\xf1].*)\xf2.*\484$)|
 (?:^(.*)\xf3.*\485[\x00-\xf2].*$)|
 (?:^([\x00-\xf2].*)\xf3.*\486$)|
 (?:^(.*)\xf4.*\487[\x00-\xf3].*$)|
 (?:^([\x00-\xf3].*)\xf4.*\488$)|
 (?:^(.*)\xf5.*\489[\x00-\xf4].*$)|
 (?:^([\x00-\xf4].*)\xf5.*\490$)|
 (?:^(.*)\xf6.*\491[\x00-\xf5].*$)|
 (?:^([\x00-\xf5].*)\xf6.*\492$)|
 (?:^(.*)\xf7.*\493[\x00-\xf6].*$)|
 (?:^([\x00-\xf6].*)\xf7.*\494$)|
 (?:^(.*)\xf8.*\495[\x00-\xf7].*$)|
 (?:^([\x00-\xf7].*)\xf8.*\496$)|
 (?:^(.*)\xf9.*\497[\x00-\xf8].*$)|
 (?:^([\x00-\xf8].*)\xf9.*\498$)|
 (?:^(.*)\xfa.*\499[\x00-\xf9].*$)|
 (?:^([\x00-\xf9].*)\xfa.*\500$)|
 (?:^(.*)\xfb.*\501[\x00-\xfa].*$)|
 (?:^([\x00-\xfa].*)\xfb.*\502$)|
 (?:^(.*)\xfc.*\503[\x00-\xfb].*$)|
 (?:^([\x00-\xfb].*)\xfc.*\504$)|
 (?:^(.*)\xfd.*\505[\x00-\xfc].*$)|
 (?:^([\x00-\xfc].*)\xfd.*\506$)|
 (?:^(.*)\xfe.*\507[\x00-\xfd].*$)|
 (?:^([\x00-\xfd].*)\xfe.*\508$)|
 (?:^(.*)\xff.*\509[\x00-\xfe].*$)|
 (?:^([\x00-\xfe].*)\xff.*\510$)/sx


We could factor out the ^ and $ and save a few bytes in the resulting
regex, but that's left as an exercise to the reader.


Abigail



Re: Regex puzzle

2002-01-21 Thread abigail
$)|
 (?:^(.*)\xf6.*\981[\x00-\xf5].*$)|
 (?:^([\x00-\xf5].*)\xf6.*\982$)|
 (?:^((.*)[\x00-\xf5].*)\983*\984\xf6.*\983$)|
 (?:^(.*)\xf7.*\985[\x00-\xf6].*$)|
 (?:^([\x00-\xf6].*)\xf7.*\986$)|
 (?:^((.*)[\x00-\xf6].*)\987*\988\xf7.*\987$)|
 (?:^(.*)\xf8.*\989[\x00-\xf7].*$)|
 (?:^([\x00-\xf7].*)\xf8.*\990$)|
 (?:^((.*)[\x00-\xf7].*)\991*\992\xf8.*\991$)|
 (?:^(.*)\xf9.*\993[\x00-\xf8].*$)|
 (?:^([\x00-\xf8].*)\xf9.*\994$)|
 (?:^((.*)[\x00-\xf8].*)\995*\996\xf9.*\995$)|
 (?:^(.*)\xfa.*\997[\x00-\xf9].*$)|
 (?:^([\x00-\xf9].*)\xfa.*\998$)|
 (?:^((.*)[\x00-\xf9].*)\999*\1000\xfa.*\999$)|
 (?:^(.*)\xfb.*\1001[\x00-\xfa].*$)|
 (?:^([\x00-\xfa].*)\xfb.*\1002$)|
 (?:^((.*)[\x00-\xfa].*)\1003*\1004\xfb.*\1003$)|
 (?:^(.*)\xfc.*\1005[\x00-\xfb].*$)|
 (?:^([\x00-\xfb].*)\xfc.*\1006$)|
 (?:^((.*)[\x00-\xfb].*)\1007*\1008\xfc.*\1007$)|
 (?:^(.*)\xfd.*\1009[\x00-\xfc].*$)|
 (?:^([\x00-\xfc].*)\xfd.*\1010$)|
 (?:^((.*)[\x00-\xfc].*)\1011*\1012\xfd.*\1011$)|
 (?:^(.*)\xfe.*\1013[\x00-\xfd].*$)|
 (?:^([\x00-\xfd].*)\xfe.*\1014$)|
 (?:^((.*)[\x00-\xfd].*)\1015*\1016\xfe.*\1015$)|
 (?:^(.*)\xff.*\1017[\x00-\xfe].*$)|
 (?:^([\x00-\xfe].*)\xff.*\1018$)|
 (?:^((.*)[\x00-\xfe].*)\1019*\1020\xff.*\1019$)/sx



Abigail



Re: ^=~

2002-01-21 Thread abigail

On Mon, Jan 21, 2002 at 10:50:06AM -0500, Bernie Cosell wrote:
 On 21 Jan 2002, at 15:12, Simon Cozens wrote:
 
  On Mon, Jan 21, 2002 at 03:00:57PM +, Robin Houston wrote:
   In an ideal world it would behave the same as
 %a = (%b, %c);
 for my $k (keys %a) {
 $a{$k} += $b{$k} if exists($b{$k}) && exists($c{$k});
 }
  
  It's not impossible that it would end up doing just that.
 
 Is that right? --- that is, add b to a on condition of c, but the actual 
 value from the c hash isn't used at all, and a is *incremented* even 
 though it looks like an assignement [you removed it from your followup, 
 but I think the original was:
 %a = %b ^+ %c
 That's *REALLY unintuitive [at least to me] to have it work with the 
 above semantics.

I think you're missing the first line, which says:

%a = (%b, %c);

which initializes %a to have *all* keys and values from %c. In addition
to what's in %c, %a will also have the key/value pairs from %b that are
not in %c. It's in this first line that %c is being used.

For all the keys $k in %a (which includes all the keys in %c - the for()
line could have been written as 'for my $k (keys %c)' as well), if $k
exists in both %b and %c (which would mean $a {$k} equals $c {$k}), we're
adding the value of $b {$k}. This results in $a {$k} == $b {$k} + $c {$k}.

For all keys that are in either %b or %c, but not both, %a will get the
key with the corresponding value.
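
The same walk-through as a runnable Perl 5 sketch. The sub name hyper_add_hashes and the sample hashes are made up for illustration; this only emulates the outer-join reading discussed here and says nothing about what ^+ will actually do.

```perl
use strict;
use warnings;

# Hypothetical helper: an outer-join reading of  %a = %b ^+ %c.
sub hyper_add_hashes {
    my ($left, $right) = @_;
    my %sum = (%$left, %$right);    # every key; shared keys hold $right's value
    for my $k (keys %sum) {
        # Only keys present on both sides still need the left value added.
        $sum{$k} += $left->{$k}
            if exists $left->{$k} && exists $right->{$k};
    }
    return %sum;
}

my %total = hyper_add_hashes({ x => 1, y => 2 }, { y => 10, z => 20 });
print join(", ", map { "$_=$total{$_}" } sort keys %total), "\n";
```

Keys "x" and "z" keep their single value; only "y", present on both sides, is summed.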



Abigail



Re: ^=~

2002-01-21 Thread abigail

On Mon, Jan 21, 2002 at 12:51:39PM -0500, Bernie Cosell wrote:
 On 21 Jan 2002, at 17:00, Robin Houston wrote:
 
  On Mon, Jan 21, 2002 at 11:38:52AM -0500, Bernie Cosell wrote:
   First, it is an *assignment* [at least to my eye], and so any solution 
   that doesn't begin with undef %a isn't going to have 'join' semantics 
  
  I think you must have missed the first line of my code:
  
  %a = (%b, %c);
 
 You're right.  my apologies, and I agree --- your code does do an outer 
 join.  I got down the wrong path partly because my default intuition 
 [from what I typically want to do when I mess with SQL dbs] is that I was 
 thinking _inner_ join and then misread your code...
 
 Maybe if '^' is going to apply to hashes, there'll have to be two 
 different operators, one for the 'inner' sense of the operation and the 
 other for the 'outer'


What makes you think this problem will only occur with hashes?

Consider:

@a = 1, 2, 3;
@b = 1, 2, 3, 4, 5;
@c = @a ^+ @b;

print scalar @c, "\n";


It should print `3' or `5', depending on one's inner join or
outer join preference.

IMO, the most Perlesque approach would be the outer join style -
missing elements will be filled in by Perl, with an appropriate
value, typically undef (which would become 0 in numerical context).
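
Both readings, sketched in Perl 5 with made-up helper names (hyper_add_outer and hyper_add_inner are hypothetical, as is the choice of 0 for a missing element):

```perl
use strict;
use warnings;

# Outer join style: the result is as long as the longer array.
sub hyper_add_outer {
    my ($left, $right) = @_;
    my $n = @$left > @$right ? @$left : @$right;
    my @sum;
    for my $i (0 .. $n - 1) {
        # A missing element counts as 0, as undef would in numeric context.
        push @sum, ($left->[$i] || 0) + ($right->[$i] || 0);
    }
    return @sum;
}

# Inner join style: the result is as long as the shorter array.
sub hyper_add_inner {
    my ($left, $right) = @_;
    my $n = @$left < @$right ? @$left : @$right;
    return map { $left->[$_] + $right->[$_] } 0 .. $n - 1;
}

my @outer = hyper_add_outer([1, 2, 3], [1, 2, 3, 4, 5]);
my @inner = hyper_add_inner([1, 2, 3], [1, 2, 3, 4, 5]);
print scalar(@outer), " elements: @outer\n";
print scalar(@inner), " elements: @inner\n";
```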


Abigail



Re: Regex puzzle

2002-01-21 Thread abigail

On Mon, Jan 21, 2002 at 05:45:55PM +0100, [EMAIL PROTECTED] wrote:
 On Mon, Jan 21, 2002 at 03:48:58PM +, Robin Houston wrote:
  On Mon, Jan 21, 2002 at 04:32:59PM +0100, [EMAIL PROTECTED] wrote:
   You are right, I had forgotten a case, [...]
   This results in (a slw program):
  
  There's something wrong here. Your regex matches abcb, which isn't
  shrinkable.
 
 
 Duh! I was writing  (...)+  where I should have written (...)\1*.
 
 Here's a corrected version (making for quite a faster regex):


On reflection, it appears my second case is a special case of the third.
This makes for a smaller regex:


 #!/usr/bin/perl 

 use strict;
 use warnings qw /all/;
 
 my @strings;
 my $p;
 my $max = 255;  # Use a higher number for Unicode.
 foreach my $c (1 .. $max) {
 # Strings of the form:  XbYXaZ,  a lt b.
 push @strings => sprintf '(.*)\x%02x.*\%d[\x00-\x%02x].*' =>
   $c, ++ $p, $c - 1;
 # Strings of the form: (XaY)+XbZXaY, a lt b.
 push @strings => sprintf '((.*)[\x00-\x%02x].*)\%d*\%d\x%02x.*\%d' =>
   $c - 1, $p + 1, $p + 2, $c, $p + 1;
 $p += 2;
 }

 my $regex = join "|\n  " => map {"(?:$_)"} @strings;
$regex = "^(?:$regex)\$";

 print "/$regex/sx\n";

 __END__



Abigail



Re: test a password string for correctness

2001-12-17 Thread abigail

On Thu, Dec 13, 2001 at 03:24:14PM +0100, Sven Neuhaus wrote:
 On Thu, Dec 13, 2001 at 03:01:43PM +, Mohit Agarwal wrote:
  On Thu, Dec 13, 2001 at 02:49:05PM +0100, Sven Neuhaus wrote:
   y/A-Za-z/A-Za-z/2y/0-9/0-9/1
   or the shorter
   $a=$_;y/A-Za-z//2y/0-9//1
   that will mungle the password in $_ but keep a good copy in $a.
  
  Why will it mungle the password in $_ ???
 
 It won't - I was confusing it with the behavior of some tr programs.
 So it's
 y/A-Za-z//2y/0-9//1
 
 Too bad you can't write
 
 y/A-Za-z// > y/0-9// > 1


Even if you could, it won't be correct. While abc1234 is a correct
password, it doesn't match y/A-Za-z// > y/0-9//.
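
A sketch of the difference, assuming the checks are comparisons on character counts. The thresholds (more than two letters, more than one digit) and both sub names are assumptions for illustration, not the rule from the thread.

```perl
use strict;
use warnings;

# Two independent tests. tr with an empty replacement only counts and
# leaves the string untouched.
sub password_ok {
    my ($password) = @_;
    my $letters = ($password =~ tr/A-Za-z//);
    my $digits  = ($password =~ tr/0-9//);
    return $letters > 2 && $digits > 1 ? 1 : 0;
}

# What a chained "letters > digits > 1" would mean if it could be written.
sub chained_ok {
    my ($password) = @_;
    my $letters = ($password =~ tr/A-Za-z//);
    my $digits  = ($password =~ tr/0-9//);
    return $letters > $digits && $digits > 1 ? 1 : 0;
}

for my $password (qw(abc1234 abcde12 ab1)) {
    printf "%-8s separate=%d chained=%d\n",
        $password, password_ok($password), chained_ok($password);
}
```

The chained form additionally demands more letters than digits, which abc1234 (three letters, four digits) does not satisfy.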



Abigail



Re: PGA (Perl Golfers' Association)

2001-12-05 Thread abigail

On Tue, Dec 04, 2001 at 08:04:11PM +0100, Philip Newton wrote:
 On Mon, 3 Dec 2001 13:17:11 -0500 (EST), [EMAIL PROTECTED] (Jeff 'Japhy'
 Pinyan) wrote:
 
  I think the golfers among us should pool our efforts (after the rowdy,
  cut-throat competition ends, of course!) to amass a list of common -- or
  would that be uncommon? -- golfing techniques.  While participating in a
  round on PerlMonks.org, I used
  
rand>$f?Y:X
  
  instead of the error-producing
  
rand<$f?X:Y
  
  which must be rewritten as
  
rand()<$f?X:Y
 
 Or similarly
 
 50>length

Yeah, but 'length' can be shortened to 'y===c' (which was dubbed
Abigail's length horror by Larry Rosler).

So, if you want to combine golf with obscurity, write it as:

  50>y>>>c


Abigail



Re: tri-state flags

2001-12-05 Thread abigail

On Wed, Dec 05, 2001 at 05:11:43AM -0500, Michel Lambert wrote:
  defined $flag  !$flag
  So, who does better?
 
 
 defined ($var = $flag)  !$var
 
 $flag is only evaluated once. :)


But Bart said he wanted to test for 0. The test above, and several of
the other proposals don't distinguish between 0 and the empty string.


do {no warnings; not local $_ = $flag and length};   # true for 0 and "0" only
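
A longer but warning-free sketch of the same test, with the cases spelled out. The sub name is_zero_flag is made up for illustration, and it assumes "test for 0" means: defined, not the empty string, and false.

```perl
use strict;
use warnings;

# True only for a defined, non-empty, false value: 0 or "0".
sub is_zero_flag {
    my ($flag) = @_;
    return (defined($flag) && length($flag) && !$flag) ? 1 : 0;
}

for my $flag (undef, "", "0", 0, 1) {
    my $shown = defined($flag) ? "'$flag'" : "undef";
    printf "%-6s %s\n", $shown, is_zero_flag($flag) ? "zero" : "not zero";
}
```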



Abigail



Re: The Santa Claus Golf Apocalypse

2001-12-05 Thread abigail

On Wed, Dec 05, 2001 at 10:05:16AM -0500, Ala Qumsieh wrote:
 
 Bernie writes:
  On 5 Dec 2001, at 14:09, Eugene van der Pijll wrote:
  
   Bernie Cosell schreef op 05 december 2001:
Meta-question: since Perl is content to try to *call* 
  'main::;' is there
some trickery to *DEFINE* such a subroutine?  For example, trying:
   main:: { die; }
gets you what I would have expected in the '..' case: a 
  syntax error for a 
missing subroutine name.
   
   perl -e'*;=sub {1}; print ;'
  
  good heavens.. the actual subroutine name is semi-colon??  So 
  the name isn't 
  missing and isn't null, but is ';'.  I'm not sure that that 
  doesn't make it 
  MORE confusing to me --- Are there other punctuation marks 
  that work in that 
  context??
 
 I was certainly amused when I understood what 11.. meant, but it didn't
 amaze me a single bit. Afterall, Perl defines the global variable $; so
 there already is a symbol table entry for ';', thus you can certainly define
 a subrouting called '::;'.
 
 How someone would think of doing that is another question though :)


Yeah, but you know, one can leave out the final semicolon:

echo Foo | perl -nle '*;=sub{1};print'


Abigail



Re: tri-state flags

2001-12-05 Thread abigail

On Wed, Dec 05, 2001 at 11:44:30AM -0500, Martin 'Kingpin' Thurn wrote:
 How about
 
 (eval{$flag} eq '0')


That triggers a warning if $flag is undefined. If one doesn't care
about warnings, $flag eq 0 would do too. However, it doesn't
return true on "", which turned out to be required.



Abigail



Re: middle line (was Re: Daily Perl FAQ...)

2001-12-03 Thread abigail

On Thu, Nov 29, 2001 at 03:05:41PM -0500, Michael G Schwern wrote:
 On Thu, Nov 29, 2001 at 11:34:24AM -0800, Chris Thorpe wrote:
  
  On Thu, 29 Nov 2001, Yanick wrote:
  
   On Thu, Nov 29, 2001 at 01:34:02PM -0500, Michael G Schwern wrote:
  
   Yesterday, I saw an interesting related exercise.  Write a program that
   reads the lines from a file and outputs the middle line.  The kicker is
   that you can't use arrays.
   
I'll interpret that as O(1) memory, O(n) time.
  
You can't do it in O(1) memory and O(n) time.
 snip
If you elect to use O(1) memory, then you have to use O(1.5*n) time,
  as the already submitted examples do.
  
It seems silly to try to use O(n/2) memory if you aren't allowed to use
  an array.
 
 O(n) == O(1.5*n) == O(n/2).
 
 
 A Crash Course On BigO Notation As Presented By A Guy Who Failed
 Fundamental Data Structures And Algorithms 1 *
 
 O(1) means it runs in a fixed amount of time regardless of how much
  data is to be processed.  A hash is an O(1) algorithm.

Eh, no. A hash gives you O(1) *expected* search/update time, if you've
found an appropriate hash function.

 O(logn) means as you increase the amount of data, the time required
 will increase logarithmicly.  In order to double the run time
 you have to exponentially increase the data.

No, it doesn't mean the time increases logarithmically. It means the time
will increase *at most* logarithmically. A significant difference. 

Here's a more precise definition:

f(n) = O(g(n)) if and only if there are N > 0 and c > 0, such
   that for all n > N, |f(n)| < |c * g(n)|.
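
As a worked instance of this definition (the constants below are just one convenient choice, picked for illustration), take the 1.5*n and n/2 from the quoted text:

```latex
\[
  f(n) = 1.5\,n,\quad g(n) = n,\quad c = 2,\quad N = 1:
  \qquad |f(n)| = 1.5\,n < 2\,n = |c \cdot g(n)| \quad \text{for all } n > N,
\]
so $1.5\,n = O(n)$. In the same way, with $f(n) = n/2$ and $c = 1$,
\[
  |f(n)| = \tfrac{n}{2} < n = |c \cdot g(n)| \quad \text{for all } n > 1,
\]
so $n/2 = O(n)$ as well.
```

Constant factors are absorbed by c, which is why O(n), O(1.5*n) and O(n/2) describe the same set of functions.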


 BigO is an expression of how the algorithm will perform as the data
 size approaches infinity.  It is also refered to as worst-case
 analysis.  As such it's a tad conservative.  There's also best-case
 analysis and average-case analysis, I forget the symbols... theta or
 something.  A basic quicksort is O(n**2), but this only happens if you
 try to quicksort an already sorted array.  It's average case is nlogn.
 Most practical quicksort implementations have safeguards to prevent
 their worst case.

Big-Oh gives upper bounds of functions, little-oh does also, but gives
stronger upper bounds. Big-Omega and little-omega give lower bounds
of functions. A function that's both Big-Oh and Big-Omega of another
function is said to be Big-Theta of said function. (A function cannot
be both little-oh and little-omega of another function - hence there's
no little-theta.) There's also Big-Lambda and little-lambda.



Abigail



Re: World's First JAPH

2001-08-22 Thread Abigail

On Wed, Aug 22, 2001 at 12:00:10PM +0200, Paul Johnson wrote:
 On Wed, Aug 22, 2001 at 11:48:27AM +0200, Newton, Philip wrote:
  Ian Phillipps wrote:
   What *is* the world's shortest JAPH, anyway?
  
  Well, if you're allowed to use modules:
  
  perl -MJ -ej
  
  for suitable values of J.pm and J::j
 
   perl -MJ
 
 for suitable values of J.pm and J::import
 
   perl
 
 for suitable values of PERL5OPT
 
   p
 
 for suitable values of p


And nothing at all for suitable values of /var/spool/cron/crontabs/root.



Abigail



Re: World's First JAPH

2001-08-22 Thread Abigail

On Wed, Aug 22, 2001 at 06:46:43PM -0400, Keith C. Ivey wrote:
 Abigail [EMAIL PROTECTED] wrote:
 
  Being the one who has given several talks about Japhs, I've
  decreed that a Japh uses the following rules:
  
 - It prints "Just another Perl Hacker" with some reasonable
   capitalization, followed by optional punctuation (comma,
   dot) followed by an optional newline.
 
 What is reasonable capitalization?  I can see a case for 
 capitalizing every word, I suppose (though I'd prefer 
 capitalizing only Just and Perl), but what is the rationale 
 for capitalizing Hacker but not another?  Unless you're 
 writing German or 18th-century English or are related to the 
 minister in Yes, (Prime) Minister, it seems that hacker 
 should be lowercase, like any other common noun.

That depends whether you see hacker as a noun or as a name/label
of a group. ;-) Or to be more specific, I consider Perl Hacker
to form a single name.


Abigail



Re: Sorting in-place

2001-07-31 Thread Abigail

On Tue, Jul 31, 2001 at 03:50:56PM +0200,  Marc A. Lehmann  wrote:
 
 however, if it did, then this would be a very common error. how do you
 force void context currently? and list context? it's possible, of course,
 but it's extreme action at-a-distance.

do {{EXPR; last}}

forces void context on EXPR.

do {() = EXPR}

forces list context on EXPR.
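
A quick check of both constructs, using a wantarray probe in place of EXPR (the probe sub and the @seen array are made up for illustration):

```perl
use strict;
use warnings;

my @seen;

# Record the context this sub was called in.
sub context {
    push @seen, wantarray          ? "LIST"
              : defined(wantarray) ? "SCALAR"
              :                      "VOID";
    return;
}

do {{ context(); last }};    # not the last statement of its block: void
do { () = context() };       # list assignment: list context
my $x = context();           # plain scalar assignment, for comparison

print "@seen\n";
```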

 it's similar, yes, and only being able to iterate once over a hash even
 in totally different modules by different authors is an extremely nasty
 thing, even if it occurs rarely in practise (it bite me about three times
 in the last 6 years)

If clashes with iteration is the worst thing that can happen if you share
hashes between different modules by different authors, I will change my
mind about the usefulness of lexical variables.



Abigail



Re: Classic golf game revisited

2001-07-31 Thread Abigail

On Tue, Jul 31, 2001 at 05:42:56PM +1000, [EMAIL PROTECTED] wrote:
 I noticed an interesting golf example at:
 http://www.sysarch.com/perl/golf/example.text
 
 You want to find words with exactly ten non-repeating letters
 such as 'binoculars', 'fishmonger', or 'paintbrush', suitable
 for games or simple encryption of numbers where every decimal
 digit is represented by a letter.
 
 For this game, I created my own words file like this:
  egrep -v [^a-z] /usr/dict/words > words
 to ensure no capital letters and no hyphens, just plain
 lowercase words.
 
 The best score posted on the above web page was:
  perl -pe'$_=""if/(.).*\1/||11-length' words
 So far, I have been able to reduce this by 2 characters.
 Can anyone do better?
 
 BTW, does there exist a set of standard Perl golf rules?
 What about a database of classic Perl golf games?
 
 Contributions/improvements to any of the sections
 below are welcome.
 
 Open
 
 perl -ne'y///c-11|/(.).*\1/||print' words
 perl -pe'$_=""if/(.).*\1/|y///c-11' words
 perl -ne'print if/^.{10}$/&&!/(.).*\1/' words
 perl -pe'$_=""if/^(?!.{10}$)|(.).*\1/' words
 perl -pe's/^(?!.{10}\n).*$|^.*(.).*\1.*$//s' words
 
 No RegExs
 -
 perl -anF// -e'@F{%F=@F}=0;121-@F*keys%F||print' words
 perl -ne'@g{%g=@g=/./gs}=0;121-@g*keys%g||print' words

I'd say that /./s is a regex and hence this entry is disqualified in
the No RegEx category.

 perl -anF// -e'121-@F*keys%{{map{$_,0}@F}}||print' words

Of course, all entries using -F// are fishy:

perl -MO=Deparse -anF// -ce'@F{%F=@F}=0;121-@F*keys%F||print'
LINE: while (defined($_ = <ARGV>)) {
    @F = split(//, $_, 0);
    @F{%F = @F} = 0;
    print $_ unless 121 - @F * keys(%F);
}
-e syntax OK


Here's one without any regexes:

perl -nle'$x=$:=$_;map$$x{chop$:}=1,0..9;keys%$x>9&&10==y///c&&print' words
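
For readers who want the test spelled out rather than golfed, here is
a sketch of the same check without any regexes. The sub name
is_ten_distinct is made up for illustration; it assumes plain
lowercase words, as in the words file described above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $word has exactly ten letters, all different.
sub is_ten_distinct {
    my ($word) = @_;
    return 0 unless length($word) == 10;

    my %seen;
    $seen{ substr($word, $_, 1) } = 1 for 0 .. 9;   # one key per distinct letter

    return keys(%seen) == 10 ? 1 : 0;
}

print "$_\n" for grep { is_ten_distinct($_) }
                 qw(binoculars bookkeeper paintbrush perl);
```

Of the four sample words, only binoculars and paintbrush are printed.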


Abigail



Re: Sorting in-place

2001-07-31 Thread Abigail

On Wed, Aug 01, 2001 at 12:39:50AM +0200, Bart Lateur wrote:
 On Wed, 1 Aug 2001 00:24:50 +0200, Abigail wrote:
 
 There's more than one block in do {{EXPR; last}}.
 
 Argh!
 
 Pretty obfuscated, that is.


Straight from perlsyn.



Abigail



Re: Sorting in-place

2001-07-31 Thread Abigail

On Wed, Aug 01, 2001 at 02:44:12AM +0200,  Marc A. Lehmann  wrote:
 On Wed, Aug 01, 2001 at 02:13:53AM +0200, Abigail [EMAIL PROTECTED] wrote:
  Minor and irrelevant details. The principle, a nest of a bare block
  and a compound block to be able to use loop control is straight from
  perlsyn.pod. Not once, not twice, but three times.
 
 Not once. No matter how often you repeat it, it's not straight from
 the docs, it's not explained clearly. If it were explained clearly,
 why does perlsyn actually say that your example is wrong? Because of
 extraordinary clearness?

Did you stop beating your wife?

Perlsyn doesn't say my example is wrong. 

Anyway, there's no point discussing with you. Seldom have I seen
someone acting so stupid.



Abigail



Re: Sorting in-place

2001-07-26 Thread Abigail

On Wed, Jul 25, 2001 at 07:02:11PM -0500, Craig S.Cottingham wrote:
 
  I guess the questions are: 1. why doesn't this work?  and 2. can it be
  made to work?
 
 2. Yes, sort of.
 
 #!/usr/bin/perl -w
 
 use strict;
 
 my @list = (1, 4, 2, 8, 5, 7, 3, 6, 0, 9);
 
 my @sorted = sort { $a <=> $b } @list;
 
 for (1..@list) {
     my @dummy = sort {
         ($a,$b) = ($b,$a) if (($a <=> $b) > 0);
         0;
     } @list;
 }
 
 local $, = ', ';
 print "Original: @list\n";
 print "Sorted:   @sorted\n";


I believe that it produced the right results on the inputs you tested
it with. But can you prove your trick will always sort the array?

 Hmm. Shuffling the contents of a list?
 
 #!/usr/bin/perl -w
 
 use strict;
 
 my @list = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
 local $, = ', ';
 print "Original: @list\n";
 
 for (1..6) {
     my @dummy = sort {
         ($a,$b) = ($b,$a) if ((rand(2) - 1) > 0);
         0;
     } @list;
 }
 
 print "Shuffled: @list\n";


I doubt very much that this is a fair shuffle. See, for a fair shuffle
each of the @list! possible permutations should have the same chance
of resulting. And, under the assumption that 'rand' is perfect, I don't
see your shuffle doing that.
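
One way to see the problem, assuming each comparison amounts to one
fair coin flip: the quoted code makes a bounded number of flips, so
every permutation ends up with a probability that is a multiple of
1/2**k for some k, and 1/10! is not such a number. For comparison,
a fair shuffle can be sketched with the Fisher-Yates algorithm, which
is a different technique from the sort trick above. The sub name is
made up for illustration.

```perl
use strict;
use warnings;

# Fisher-Yates shuffle: shuffles the array in place via its reference.
# Walk from the last index down, swapping each element with a randomly
# chosen element at or before it.
sub fisher_yates_shuffle {
    my ($array) = @_;
    for (my $i = $#$array; $i > 0; $i--) {
        my $j = int rand($i + 1);               # 0 <= $j <= $i
        @$array[$i, $j] = @$array[$j, $i];
    }
    return $array;
}

my @list = (0 .. 9);
fisher_yates_shuffle(\@list);
print "Shuffled: @list\n";
```

Each position is filled by a uniform choice among the elements not yet
placed, which is what gives every permutation the same chance if
'rand' is perfect.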



Abigail



Re: Sorting in-place (SOLVED? No, further mysteries!)

2001-07-26 Thread Abigail

On Thu, Jul 26, 2001 at 10:47:05AM -0400, Clinton A . Pierce wrote:
 
 If someone can explain *this* to me, I could probably finish a patch
 for in-place sorting in Perl.   But I'm lost as to why (or HOW) @r
 can be unwound like this well after the fact.


This approach is never going to work.

Your idea is based on the assumption that if Perl is going to compare
$a and $b, and they are in the opposite order, Perl wants $a and $b to
be swapped.

That is not true. For instance, in the mergesort that is implemented in
5.7.x, in some passes (including, IIRC, the first pass) monotonic
sequences (be they increasing or decreasing) are determined. If you
then swap pairs, Perl will think the entire array is already
monotonic - and you will have a sort that runs in linear time!
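
To illustrate what "determining monotonic sequences" means, here is a
rough sketch that splits a list into maximal runs. It is not Perl's
actual mergesort code; the sub names are made up, and equal neighbours
are arbitrarily counted as increasing. The point is only that the
comparisons in such a pass are used to find runs, not to decide swaps.

```perl
use strict;
use warnings;

# Hypothetical helper: -1 if the step goes down, 1 otherwise.
sub step_direction {
    my ($prev, $next) = @_;
    return $next < $prev ? -1 : 1;
}

# Split a list of numbers into maximal monotonic runs (array refs).
sub monotonic_runs {
    my @list = @_;
    my @runs;
    my $i = 0;
    while ($i < @list) {
        my $start = $i++;
        if ($i < @list) {
            # The first pair of a run fixes its direction.
            my $direction = step_direction($list[$start], $list[$i]);
            $i++ while $i < @list
                   && step_direction($list[$i - 1], $list[$i]) == $direction;
        }
        push @runs, [ @list[$start .. $i - 1] ];
    }
    return @runs;
}

print join(" | ", map { "@$_" } monotonic_runs(1, 2, 5, 4, 3, 6)), "\n";
```

For the sample list this prints "1 2 5 | 4 3 | 6".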



Abigail



Re: Sorting in-place (SOLVED? No, further mysteries!)

2001-07-26 Thread Abigail

On Thu, Jul 26, 2001 at 08:45:17PM -0400, Bernie Cosell wrote:
 
 A tricky design decision is whether to go with a sort with better 
 average performance in exchange for worse worst-case performance 
 [e.g., quick sort and quickersort, that are O(N^2) worst case, if I 
 remember long-ago comp science classes right, but are *usually* 
 better than nlogn].  Another situation like this is a list-merge sort 
 which is *fantastic* if the data is nearly sorted...


No, they are not usually better than N log N. You can prove that, in
general, a comparison-based sort cannot do better than N log N. Period.
Quicksort is expected (don't use the term 'average' here, that's too
vague) O (N log N), but Theta (N^2) worst case.

Only when you have restricted domains may you be able to sort faster.
For instance, sorting N integers in the range 1 .. U can be done in
O (N + U) time, and N strings from an alphabet of U letters and of
length k can be done in O (N * k + U). (Note that sorting N strings
of length k using traditional techniques takes O (N * log N * k); the
reason you usually don't see the 'k' is that 'k' is often taken
to be a constant. Of course, if we take 'k' to be a constant, I can
sort ASCII strings in O (N) time.)
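
A minimal sketch of the restricted-domain case, assuming integers in
the range 1 .. U. The sub name counting_sort and its calling
convention (upper bound first, then the numbers) are made up for
illustration. No two elements are ever compared with each other,
which is why the N log N bound does not apply.

```perl
use strict;
use warnings;

# Counting sort for integers in 1 .. $max: O (N + U) time.
sub counting_sort {
    my ($max, @numbers) = @_;

    my @count = (0) x ($max + 1);       # one counter per possible value
    $count[$_]++ for @numbers;          # O (N): tally each input

    # O (N + U): emit each value as often as it was seen.
    return map { ($_) x $count[$_] } 1 .. $max;
}

print join(", ", counting_sort(9, 3, 1, 2, 3, 9, 1)), "\n";
```

For the sample input this prints "1, 1, 2, 3, 3, 9".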


Abigail



Re: End of block actions WITHOUT magic lexicals?

2001-07-23 Thread Abigail

On Sun, Jul 22, 2001 at 10:52:44AM -0400, [EMAIL PROTECTED] wrote:
 On Sun, Jul 22, 2001 at 02:19:25AM -0700, Randal L. Schwartz wrote:
  Can't you just make:
  
  sub TODO (\&) {
      local($INSIDE_TODO) = 1;
      TODO: shift->();
  }
 
 That was my original plan, problem is it modifies the results of
 caller() (so things like carp() would be altered).  Since it's a
 general testing function, I'm trying to keep its effects on the
 surrounding code at near 0.
 
 I've actually got this working for skip...
 
     SKIP: {
         skip("A serious lack of foo") unless $have_foo;
         ok( foo() eq 'bar' );
     }
 
     sub skip {
         my($why) = shift;
         _skipped($why);
 
         no warnings;
         last SKIP;
     }
 
 using that horrible desperation feature of last().  So I *know*
 functions can tell what labeled block they're enclosed in, I just
 don't want to dive out to XS to do it.
 
 It's too bad I can't do some combination of last/next/redo/goto and
 eval that will *try* to jump out of the loop but not actually do it.


Well, you can always use caller() to find out the file and line number
you were called from, then open the file, find 'FOO:', use Text::Balanced
to find the matching brace and decide whether you were in or out of
the block.

But, if all you want to do is add a '# TODO' trailer, wouldn't the
following do the trick?


local $trailer = "";

sub ok {
    my $result = shift;
    $count ++;
    print $result ? "ok " : "not ok ", $count, $trailer, "\n";
}

ok (...);

FOO: {
    local $trailer = " # TODO";

    ok (...);
}

ok (...);



Abigail



Re: End of block actions WITHOUT magic lexicals?

2001-07-23 Thread Abigail

On Mon, Jul 23, 2001 at 02:47:24PM -0400, Uri Guttman wrote:
 
 i proposed a similar solution to schwern last night and he shot it down
 as he doesn't want the test coder to do any more work than necessary. i
 said, if he uses the test suite he is signing a contract and if he wants
 the block TODO feature, he must obey the rules. schwern didn't
 agree. later this week i will clobber him with a blunt contract and we
 shall see if he won't change his mind. :)


Here. All the magic in a package, with the user only having to make 
a TODO: {} block.


package Abigail;

use strict;
use Exporter;

use vars qw /@ISA @EXPORT/;

@ISA= qw /Exporter/;
@EXPORT = qw /ok/;

$Abigail::trailer = "";

use Filter::Simple
    sub {s'TODO:\s*\{'TODO:{local $Abigail::trailer = " # TODO";'g;};

# The Magical Mystery Tour boards here.
BEGIN {
    no strict 'refs';
    my $coderef = *{"Abigail::import"}{CODE};
    no warnings 'redefine';
    *{"Abigail::import"} = sub {
        $coderef -> (@_) if "CODE" eq ref $coderef;
        goto &Exporter::import;
    }
}

my $count = 0;

sub ok {
    print $_ [0] ? "ok " : "not ok ", ++ $count, $Abigail::trailer, "\n"
}

1;
__END__


#!/usr/bin/perl -w

use Abigail;
use strict;

ok (1);
ok (0);

TODO: {
    ok (1);
    ok (0);
}

ok (1);
ok (0);

__END__

ok 1
not ok 2
ok 3 # TODO
not ok 4 # TODO
ok 5
not ok 6



Abigail