Re: Capturing alternations (was Re: Hypothetical synonyms)

2002-08-28 Thread Damian Conway

Piers wrote:


> Not exactly DWIM, but how about:
> 
>   my $stuff = /^\s* [ "(.*?)" | (\S+) ] : { $foo := $+ }/;
> 
> Assuming $+ means 'the last capture group matched' as it does now.
>

Or just:

 my $stuff = /^\s* [ "$foo:=(.*?)" | $foo:=(\S+) ]/;

BTW, that doesn't actually *do* the match. It merely puts a reference
to a rule object into $stuff.

Perhaps we all actually meant variants on:

 my $stuff = m/^\s* [ "$0:=(.*?)" | $0:=(\S+) ]/;

???

Damian
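
For readers more at home in Perl 5, the distinction above is roughly the one between qr// and m//. The following is only a sketch of that analogy, assuming a qr// object plays the part of the Perl 6 rule object; the variable names are illustrative and not from the thread.

```perl
use strict;
use warnings;

# qr// compiles the pattern and returns a regex object; nothing is matched yet.
my $rule = qr/^\s*(?:"(.*?)"|(\S+))/;

# The match only happens when the object is applied to a string.
my $line = '  "hello world" trailing';
my $stuff;
if ($line =~ $rule) {
    # Exactly one of the two capture groups took part in the match.
    $stuff = defined $1 ? $1 : $2;
}
print "$stuff\n";
```

Here $rule can be stored, passed around, or interpolated into a larger pattern before any string is ever examined.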







Capturing alternations (was Re: Hypothetical synonyms)

2002-08-28 Thread Trey Harris

In a message dated Thu, 29 Aug 2002, Janek Schleicher writes:

> Aaron Sherman wrote at Wed, 28 Aug 2002 00:34:15 +0200:
>
> > $stuff = (defined($1)?$1:$2) if /^\s*(?:"(.*?)"|(\S+))/;
>
> It gives me the idea of a missing feature:
>
> What really should be expressed is:
>
> my ($stuff) = /^\s*("°.*?"°|\S+)/;
>
> where the ° character would mean,
> "Don't capture the previous element".

Hmm.  One thing that has always bothered me about regexes is capturing
parentheses in alternations.  It seems to me that:

my ($stuff) = /^\s* [ "(.*?)" | (\S+) ]/;

should DWIM somehow, since it's impossible that both parens will capture.
So when the same number of capturing parens appear in each of an
alternation, they should factor out to being a single return value.

Is this possible in the general case?

Trey
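
For what it's worth, Perl 5.10 and later have a "branch reset" group, (?|...), that behaves much like the factoring asked about here: capture numbering restarts in each alternative, so whichever branch matches fills $1. A minimal sketch in Perl 5 (not Perl 6) syntax, with made-up input strings:

```perl
use strict;
use warnings;
use 5.010;    # branch reset (?|...) needs Perl 5.10 or later

# Inside (?|...), each alternative restarts capture numbering,
# so both sets of parentheses fill $1.
my $re = qr/^\s*(?|"(.*?)"|(\S+))/;

for my $input ('  "two words" rest', '  bare rest') {
    my ($stuff) = $input =~ $re;
    say $stuff;
}
```

This removes the need for the defined($1) ? $1 : $2 dance, at least when every alternative has the same number of capturing groups.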




Re: Hypothetical synonyms

2002-08-28 Thread Janek Schleicher

Aaron Sherman wrote at Wed, 28 Aug 2002 00:34:15 +0200:

> $stuff = (defined($1)?$1:$2) if /^\s*(?:"(.*?)"|(\S+))/;

It gives me the idea of a missing feature:

What really should be expressed is:

my ($stuff) = /^\s*("°.*?"°|\S+)/;

where the ° character would mean,
"Don't capture the previous element".

I think that such a meaning of "uncapturing" elements
from a regexp would be really nice,
as it would help to express things directly,
instead of going complicated ways.

The ° character doesn't have any special meaning,
that's why I chose it in the above example.
However, it also symbolizes a little capturing
and as it isn't filled,
it could really symbolize an uncapturing.

I don't know how hard it would be to implement or
whether it has already been discussed.


Greetings,
Janek




Re: rule, rx and sub

2002-08-28 Thread Larry Wall

On Wed, 28 Aug 2002, Sean O'Rourke wrote:
: Being able to specify fixed arguments after a splat looks illegal, or at
: least immoral.  It opens the door to backtracking in argument parsing,
: e.g.:
: 
: sub foo (*@args, &func, *@more_args, $arg, &func) { ... }
: 
: > Saying specifically a list of arrays.  Also, would that list gobble up
: > everything, or would it actually allow that coderef on the end?
: 
: I would expect it to be a syntax error, since the slurp parameter has to
: be the last.

This sort of thing must be done with real parsing rules.  These can return
a list of args as a single @args argument without having to play with splat.

Larry




Re: auto deserialization

2002-08-28 Thread Dan Sugalski

At 5:19 PM -0700 8/28/02, Larry Wall wrote:
>On Thu, 29 Aug 2002, Steffen Mueller wrote:
>: Nicholas Clark wrote:
>: [...]
>: > If the compiler were able to see that my Date $bday = 'June 25, 2002';
>: > is one statement that both types $bday as Date, and then assigns a
>: > constant to it, is it possible to do the conversion of that constant
>: > to a constant $bday object at compile time? (and hence get compile
>: > time checking) Without affecting general run time behaviour.
>:
>: While that may be possible (I can't tell, I gladly take Dan's word for it),
>: it doesn't make much sense IMHO. It means that you can only initialize those
>: objects with constants. That's not a problem for people who know Perl well,
>: but it is going to be one hell of a confusion for anybody learning Perl. I
>: can see people whining on clpm why they can't do "my Dog $rex =
>: sub_returning_string();". Again IMHO, taking Perl's flexibility in *some*
>: cases is much worse than making it Java.
>
>We're not going to define it so they can only initialize with constants.
>That would be silly.  I think Dan is talking about the case where we
>can detect that it is a constant at compile time.  As such, it's just
>constant folding, on the assumption that we also know the constructor
>isn't going to change.

I actually had something a bit more subversive in mind, where the 
assignment operator for the Date class did some magic the same way we 
do now when we do math on strings.

On second thought, that's not a great idea, and I think just passing 
in parameters to the class' initialization method's a better idea, 
otherwise we'll have string auto-converting going on all over the 
place, and that's not a great idea.

-- 
 Dan

--"it's like this"---
Dan Sugalski        even samurai
[EMAIL PROTECTED]   have teddy bears and even
                    teddy bears get drunk
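
The "pass parameters to the class' initialization method" alternative might look like the following in Perl 5. This is only a sketch: the Date class, its new and as_string methods, and the accepted date format are all hypothetical, invented here for illustration.

```perl
use strict;
use warnings;

package Date;    # hypothetical class, not an existing module

my %MONTH_NUMBER;
@MONTH_NUMBER{qw(January February March April May June July
                 August September October November December)} = 1 .. 12;

# The string is handed to the constructor explicitly; no assignment magic.
sub new {
    my ($class, $text) = @_;
    my ($month, $day, $year) = $text =~ /^(\w+)\s+(\d{1,2}),\s*(\d{4})$/
        or die "Unparseable date: $text\n";
    my $number = $MONTH_NUMBER{$month}
        or die "Unknown month: $month\n";
    return bless { year => $year, month => $number, day => $day }, $class;
}

sub as_string {
    my ($self) = @_;
    return sprintf '%04d-%02d-%02d', @{$self}{qw(year month day)};
}

package main;

my $bday = Date->new('June 25, 2002');
printf "%s\n", $bday->as_string;
```

The conversion is visible at the call site, so a string never turns into an object behind the programmer's back.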



Re: rule, rx and sub

2002-08-28 Thread Sean O'Rourke

On Wed, 28 Aug 2002, Luke Palmer wrote:

> > Second, is there a prototype-way to specify the arguments to "for"
> > (specifically, the first un-parenthesized multidimensional array argument)?
> > In other words, is that kind of signature expected to be used often enough
> > to justify not forcing people to explicitly extend the grammar?
>
> If you're talking about parallel iteration, I know what you mean.

Yeah, that's what I was talking about, though IIRC "parallel iteration"
refers to how the data is used.  I may be on crack here, but I think that
stuff before the arrow is just a multidimensional array, like

   my @a = (1, 2; 3, 4)

but, since we're expecting it, the parens are optional.

> I think there's a time for a special case, and that's one of them.

I probably agree here (if mucking with the parser is a straightforward
thing to do).

> If you're talking about the regular syntax:
>
>   for @a, @b -> $x { ... }
>
> Would that be:
>
>   sub rof (array *@ars, &body) {...}
>
> or
>
>   sub rof (*@ars is array, &body) {...}

Being able to specify fixed arguments after a splat looks illegal, or at
least immoral.  It opens the door to backtracking in argument parsing,
e.g.:

sub foo (*@args, &func, *@more_args, $arg, &func) { ... }

> Saying specifically a list of arrays.  Also, would that list gobble up
> everything, or would it actually allow that coderef on the end?

I would expect it to be a syntax error, since the slurp parameter has to
be the last.

/s




Re: rule, rx and sub

2002-08-28 Thread Luke Palmer

> Second, is there a prototype-way to specify the arguments to "for"
> (specifically, the first un-parenthesized multidimensional array argument)?
> In other words, is that kind of signature expected to be used often enough
> to justify not forcing people to explicitly extend the grammar?

If you're talking about parallel iteration, I know what you mean.  I think 
there's a time for a special case, and that's one of them.  But it 
wouldn't be hard to extend that into a signature, I suppose.

If you're talking about the regular syntax:

for @a, @b -> $x { ... }

Would that be:

sub rof (array *@ars, &body) {...}

or

sub rof (*@ars is array, &body) {...}

Saying specifically a list of arrays.  Also, would that list gobble up 
everything, or would it actually allow that coderef on the end?

Luke




Re: Hypothetical synonyms

2002-08-28 Thread Luke Palmer

On Thu, 29 Aug 2002, Steffen Mueller wrote:

> Nicholas Clark wrote:
> > On Thu, Aug 29, 2002 at 12:00:55AM +0300, Markus Laire wrote:
> >> And I'm definitely going to try any future PerlGolf challenges also
> >> in perl6.
> >
> > Is it considered better if perl6 uses more characters than perl5? (ie
> > implying probably less line noise)
> > or less (getting your job done more tersely?)
> 
> From the bit of Perl6 information I've gathered from the Apocalypses, the
> Exegesises (is that really the plural? Sounds horrible.), and my

  Exegeses (like parentheses)

> perl6-language reading, I'd say Perl6 is not only going to be a bit more
> verbose (unless you use the dreaded "use Perl5;" pragma ;) ), but it'll also
> be a Good Thing.

No, not necessarily.  If you do a line-by-line translation, yes.  But the
fact is, Perl 6 will be able to do more in a single line (cleanly) than 
Perl 5.  For instance, hyper-operators.  So, Perl 6 will contain less 
line-noise and more whitespace than Perl 5, but code will end up being 
shorter, too.  You can see that in Exegesis 4 (or 3, not sure), where Damian 
takes Perl5ish Perl6 code, and then writes it back out in idiomatic Perl 
6. You see how much shorter it becomes.

Luke




Re: rule, rx and sub

2002-08-28 Thread Sean O'Rourke

On Wed, 28 Aug 2002, Damian Conway wrote:
> Any subroutine/function like C<if> that has a signature (parameter list)
> that ends in a C<&sub> argument can be parsed without the trailing
> semicolon. So C<if>'s signature is:
>
>   sub if (bool $condition, &block);
>
> So the trailing semicolon isn't required.

Okay, so curlies always make surrounding commas optional (or verboten?),
and make trailing semis optional when no more arguments are expected.
This seems natural, and naturally extended to allow this

$x = { 1 => 2, ... }
$y = $x;

or even this

$x = { ... } $y = $x;

since the parser sees ("$x", "=", "{") and, knowing that it only wants a
single value, takes the closing "}" to be the end of the statement.  This
would let you do ugly things like this:

@xs = (1 { $^x + 2 } 3, 4); # second element is a closure

but most of the time, people would probably write readable code "by
accident".

Also, to follow up in two directions...

First, if "if" can be defined as above, is this a syntactic or a semantic
error (or not an error at all):

if $test { ... }
some_other_thing();
elsif $test2 { ... }  # matching "if" above.

I personally think it would be nifty, and would fit in with the ability to
mix code with whens in a given.  There'd be a bit of extra overhead
involved in tracking whether or not we'd seen a true condition yet in the
current if-sequence, but that's peanuts compared to other overhead.

Second, is there a prototype-way to specify the arguments to "for"
(specifically, the first un-parenthesized multidimensional array argument)?
In other words, is that kind of signature expected to be used often enough
to justify not forcing people to explicitly extend the grammar?

/s




Re: auto deserialization

2002-08-28 Thread Larry Wall

On Thu, 29 Aug 2002, Steffen Mueller wrote:
: Nicholas Clark wrote:
: [...]
: > If the compiler were able to see that my Date $bday = 'June 25, 2002';
: > is one statement that both types $bday as Date, and then assigns a
: > constant to it, is it possible to do the conversion of that constant
: > to a constant $bday object at compile time? (and hence get compile
: > time checking) Without affecting general run time behaviour.
: 
: While that may be possible (I can't tell, I gladly take Dan's word for it),
: it doesn't make much sense IMHO. It means that you can only initialize those
: objects with constants. That's not a problem for people who know Perl well,
: but it is going to be one hell of a confusion for anybody learning Perl. I
: can see people whining on clpm why they can't do "my Dog $rex =
: sub_returning_string();". Again IMHO, taking Perl's flexibility in *some*
: cases is much worse than making it Java.

We're not going to define it so they can only initialize with constants.
That would be silly.  I think Dan is talking about the case where we
can detect that it is a constant at compile time.  As such, it's just
constant folding, on the assumption that we also know the constructor
isn't going to change.

Again, though, assignment to a normal variable is unlikely to invoke
a constructor in any case.

Larry




Re: Hypothetical synonyms

2002-08-28 Thread Sean O'Rourke

On Thu, 29 Aug 2002, Markus Laire wrote:
> (only 32bit numbers, modulo not fully working, no capturing regexps,
> )

Where does modulo break?

/s




RE: rule, rx and sub

2002-08-28 Thread Larry Wall

On Wed, 28 Aug 2002, Thom Boyer wrote:
: Damian Conway wrote:
: > Any subroutine/function like C<if> that has a signature (parameter list)
: > that ends in a C<&sub> argument can be parsed without the trailing
: > semicolon. So C<if>'s signature is:
: > 
: > sub if (bool $condition, &block);
: 
: So what does the signature for C<while> look like? I've been wondering about
: this for a long time, and I've searched the Apocalypses and the
: perl6-language archive for an answer, but I've had no success.
: 
: It seems like C<while>'s signature might be something like one of these:
: 
:   sub while (bool $test, &body);
:   sub while (&test, &body);
: 
: But neither of these really works. 

That's correct.  Maybe something like

  sub while (&test is expr, &body);

But that would be shorthand for something more general--see below.

: The first would imply that the test is evaluated only once (and that once is
: before 'sub while' is even called). That'd be useless.
: 
: The second would allow multiple evaluations of the test condition (since
: it's a closure). But it seems that it would also require the test expression
: to have curly braces around it. And possibly a comma between the test-block
: and the body-block. That'd be ugly.

Maybe we could have something like:

 sub while (&test is rx//, &body);

or some such.  That probably isn't sufficient to pick  out of Perl's
grammar rather than the current lexical scope.

: I can create a hypothetical "bareblock" rule that says:
: 
:   When an argument's declaration contains an ampersand sigil,
:   then you can pass an "expression block" (i.e., a simple 
:   expression w/o surrounding curlies) to that argument.
: 
: Is there such a rule for Perl 6? 

Not at the moment.  It'd be pure obfuscation if people did that where
curlies *are* expected.  I still want the curlies required on an "else",
for instance.

: On the positive side, this would be a reasonable generalization of the Perl
: 5 handling of expressions given to map or grep.

I don't particularly like the old map and grep syntax.

: On the negative side, this
: rule makes it impossible to have such arguments fulfilled by evaluating an
: expression that returns the desired closure (i.e., the expression you type
: as an argument isn't intended to be the block you pass, but rather it is
: intended to generate the block you want to pass).

Well, we could make the same sort of rule that we (eventually) did
for bare blocks--if you want to return a closure in that circumstance
you'd have to use "sub" (or "return", in the case of a bare block).

: In summary: assuming Perl 6 allows user-defined while-ish structures, how
: would it be done?

I think the secret is to allow easy attachment of regex rules to sub
and parameter declarations.  There's little point in re-inventing
regex syntax using declarations.  The whole point of making Perl 6
parse itself with regexes is to make this sort of stuff easy.

Larry




Re: auto deserialization

2002-08-28 Thread Steffen Mueller

Nicholas Clark wrote:
[...]
> If the compiler were able to see that my Date $bday = 'June 25, 2002';
> is one statement that both types $bday as Date, and then assigns a
> constant to it, is it possible to do the conversion of that constant
> to a constant $bday object at compile time? (and hence get compile
> time checking) Without affecting general run time behaviour.

While that may be possible (I can't tell, I gladly take Dan's word for it),
it doesn't make much sense IMHO. It means that you can only initialize those
objects with constants. That's not a problem for people who know Perl well,
but it is going to be one hell of a confusion for anybody learning Perl. I
can see people whining on clpm why they can't do "my Dog $rex =
sub_returning_string();". Again IMHO, taking Perl's flexibility in *some*
cases is much worse than making it Java.

Steffen
--
@n=(544290696690,305106661574,116357),$b=16,@c=' ,JPacehklnorstu'=~
/./g;for$n(@n){map{$h=int$n/$b**$_;$n-=$b**$_*$h;$c[@c]=$h}c(0..9);
push@p,map{$c[$_]}@c[c($b..$#c)];$#c=$b-1}print@p;sub'c{reverse @_}




Re: Hypothetical synonyms

2002-08-28 Thread Steffen Mueller

Nicholas Clark wrote:
> On Thu, Aug 29, 2002 at 12:00:55AM +0300, Markus Laire wrote:
>> And I'm definitely going to try any future PerlGolf challenges also
>> in perl6.
>
> Is it considered better if perl6 uses more characters than perl5? (ie
> implying probably less line noise)
> or less (getting your job done more tersely?)

From the bit of Perl6 information I've gathered from the Apocalypses, the
Exegesises (is that really the plural? Sounds horrible.), and my
perl6-language reading, I'd say Perl6 is not only going to be a bit more
verbose (unless you use the dreaded "use Perl5;" pragma ;) ), but it'll also
be a Good Thing.

Applying that to Perl Golf, however, isn't possible. It doesn't make sense
to ask whether less line noise is better in golf. Anybody who has seen any
of the winning solutions should realize that whoever wrote that either used
some random string generator or tried to create ASCII art from a color
scan of bird droppings.

Maybe I am just a bit frustrated that I had such a hard time understanding
some of the solutions. :)

> It would be interesting to see whether there are classes of problems
> that go in different directions.

I guess over 90 percent of problems will be longer; possibly about 60
percent being significantly longer. (Mainly because of the changes of A5.)

Steffen
--
@n=(544290696690,305106661574,116357),$b=16,@c=' ,JPacehklnorstu'=~
/./g;for$n(@n){map{$h=int$n/$b**$_;$n-=$b**$_*$h;$c[@c]=$h}c(0..9);
push@p,map{$c[$_]}@c[c($b..$#c)];$#c=$b-1}print@p;sub'c{reverse @_}




RE: rule, rx and sub

2002-08-28 Thread David Whipp

Thom Boyer [mailto:[EMAIL PROTECTED]] wrote:
>   sub while (bool $test, &body);
>   sub while (&test, &body);
> 
> But neither of these really works. 
> 
> The first would imply that the test is evaluated only once 
> (and that once is
> before 'sub while' is even called). That'd be useless.

It seems to me that this can be thought of as analogous, in a strange kind
of way, to hyper-operator things. Thus:

 sub while (bool $^test, &body)
 {
   return unless $^test;
   &body;
   redo;
 }

Dave.



RE: rule, rx and sub

2002-08-28 Thread Thom Boyer

Damian Conway wrote:
> Any subroutine/function like C<if> that has a signature (parameter list)
> that ends in a C<&sub> argument can be parsed without the trailing
> semicolon. So C<if>'s signature is:
> 
> sub if (bool $condition, &block);

So what does the signature for C<while> look like? I've been wondering about
this for a long time, and I've searched the Apocalypses and the
perl6-language archive for an answer, but I've had no success.

It seems like C<while>'s signature might be something like one of these:

  sub while (bool $test, &body);
  sub while (&test, &body);

But neither of these really works. 

The first would imply that the test is evaluated only once (and that once is
before 'sub while' is even called). That'd be useless.

The second would allow multiple evaluations of the test condition (since
it's a closure). But it seems that it would also require the test expression
to have curly braces around it. And possibly a comma between the test-block
and the body-block. That'd be ugly.

I can create a hypothetical "bareblock" rule that says:

  When an argument's declaration contains an ampersand sigil,
  then you can pass an "expression block" (i.e., a simple 
  expression w/o surrounding curlies) to that argument.

Is there such a rule for Perl 6? 

On the positive side, this would be a reasonable generalization of the Perl
5 handling of expressions given to map or grep. On the negative side, this
rule makes it impossible to have such arguments fulfilled by evaluating an
expression that returns the desired closure (i.e., the expression you type
as an argument isn't intended to be the block you pass, but rather it is
intended to generate the block you want to pass).

In summary: assuming Perl 6 allows user-defined while-ish structures, how
would it be done?

=thom
   "The rowboat glided gently across the lake, exactly like a bowling ball
wouldn't."
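
The second signature can already be approximated in Perl 5, and doing so shows the ugliness complained about above. A rough sketch, assuming a hypothetical my_while sub (the name and the counter loop are illustrative only):

```perl
use strict;
use warnings;

# The (&&) prototype lets the first argument be a bare block; the second
# closure still needs an explicit "sub" keyword.
sub my_while (&&) {
    my ($test, $body) = @_;
    # The test is a closure, so it is re-evaluated on every iteration.
    while ($test->()) {
        $body->();
    }
    return;
}

my $i = 0;
my @seen;
my_while { $i < 3 } sub { push @seen, $i++ };
print "@seen\n";
```

The test has to be wrapped in curlies so that it can be called repeatedly, which is exactly the cost of the (&test, &body) form.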



Re: Hypothetical synonyms

2002-08-28 Thread Nicholas Clark

On Thu, Aug 29, 2002 at 12:00:55AM +0300, Markus Laire wrote:
> And I'm definitely going to try any future PerlGolf challenges also 
> in perl6.

Is it considered better if perl6 uses more characters than perl5? (ie
implying probably less line noise)
or less (getting your job done more tersely?)

It would be interesting to see whether there are classes of problems that
go in different directions.

Nicholas Clark
-- 
Even better than the real thing:http://nms-cgi.sourceforge.net/



Re: Hypothetical synonyms

2002-08-28 Thread Nicholas Clark

On Tue, Aug 27, 2002 at 08:59:09PM -0400, Uri Guttman wrote:
> > "LW" == Larry Wall <[EMAIL PROTECTED]> writes:
> 
>   LW> On 27 Aug 2002, Uri Guttman wrote: : and quoteline might even
>   LW> default to " for its delim which would make : that line:
>   LW> : 
>   LW> : my ($fields) = /(|\S+)/;
> 
>   LW> That just looks like:
> 
>   LW> my $field = //;

> and it would be nice to have a dictionary of builtin rules. :)

my $data = //;

It would make 1 liners very powerful.

How long before someone writes that and ships it with parrot?

And the $64,000 question - will the perl regexp engine be faster than
calling expat? Or will they be the same (because the regexp compiler has
certain builtin rules that are actually implemented as calls to C code
(unless they are over-ridden))?

Nicholas Clark
-- 
Even better than the real thing:http://nms-cgi.sourceforge.net/



Re: rule, rx and sub

2002-08-28 Thread Damian Conway

Sean O'Rourke wrote:

> I hope this is wrong, because if not, it breaks this:
> 
> if 1 { do something }
> foo $x;
> 
> in weird ways.  Namely, it gets parsed as:
> 
> if(1, sub { do something }, foo($x));
> 
> which comes out as "wrong number of arguments to `if'", which is just
> strange.

Any subroutine/function like C<if> that has a signature (parameter list)
that ends in a C<&sub> argument can be parsed without the trailing
semicolon. So C<if>'s signature is:

sub if (bool $condition, &block);

So the trailing semicolon isn't required.

Likewise I could write my own C<perhaps> subroutine:

sub perhaps (bool $condition, num $probability, &block) {
return unless $condition;
return unless $probability > rand;
$block();
}

and then code:

perhaps $x<$y, 0.25 { print "Happened to be less than\n"}
perhaps $x>$y, 0.50 { print "Happened to be greater than\n"}

without the trailing semicolons.

Damian
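
For comparison, a Perl 5 rendering of the same idea needs an explicit "sub", parentheses and trailing semicolons. This is only a sketch in Perl 5 syntax, not the Perl 6 form above; the probabilities 1.0 and 0.0 are chosen so the outcome is deterministic, and the @log array is illustrative.

```perl
use strict;
use warnings;

sub perhaps {
    my ($condition, $probability, $block) = @_;
    return unless $condition;
    # rand returns a value in [0, 1), so 1.0 always passes and 0.0 never does.
    return unless $probability > rand;
    return $block->();
}

my ($x, $y) = (1, 2);
my @log;
perhaps($x < $y, 1.0, sub { push @log, 'less' });
perhaps($x > $y, 1.0, sub { push @log, 'greater' });
perhaps($x < $y, 0.0, sub { push @log, 'never' });
print "@log\n";
```

The closure is an ordinary final argument here, so the call is a normal statement and the parser gets no say in where it ends.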




Re: Hypothetical synonyms

2002-08-28 Thread Markus Laire

On 28 Aug 2002 at 16:04, Steffen Mueller wrote:

> Piers Cawley wrote:
> > Uri Guttman <[EMAIL PROTECTED]> writes:
> >> ... regex code ...
> >
> > Hmm... is this the first Perl 6 golf post?
> 
> Well, no, for two reasons:
> a) There's whitespace.
> b) The time's not quite ready for Perl6 golf because Larry's the only one
> who would qualify as a referee.

I think that time is just right for starting to golf in perl6. Parrot 
with languages/perl6 already supports a working subset of perl6.

I'm currently trying to get factorial-problem from last Perl Golf 
working in perl6, and it has proven to be quite a challenge... 
(only 32bit numbers, modulo not fully working, no capturing regexps, 
)

And I'm definitely going to try any future PerlGolf challenges also 
in perl6.

-- 
Markus Laire 'malaire' <[EMAIL PROTECTED]>





Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine

On Wed, 28 Aug 2002, Steve Fink wrote:

> On Wed, Aug 28, 2002 at 12:55:44PM -0400, Deven T. Corzine wrote:
> > On Wed, 28 Aug 2002, Dan Sugalski wrote:
> > > At 10:57 AM -0400 8/28/02, Deven T. Corzine wrote:
> > 
> > On the other hand, :, ::, ::: and  don't necessarily need to be a 
> > problem if they can be treated as hints that can be ignored.  If allowing 
> > the normal engine to backtrack despite the hints would change the results, 
> > that might be a problem.  I don't know;  may pose special problems.
> 
> They do change the semantics.
> 
>  (June|Jun) ebug
> 
> matches the string "Junebug", but
> 
>  (June|Jun): ebug
> 
> does not. Similarly,
> 
>  (June|Jun) ebug
>  (June|Jun) :: ebug
>  (June|Jun) ::: ebug
>  (June|Jun)  ebug
> 
> all behave differently when embedded in a larger grammar.

Good point.  Okay, they definitely change the semantics.  Still, could such 
semantics be implemented in a non-backtracking state machine, whether or 
not it's a strict DFA?
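
The Junebug difference can be reproduced in Perl 5 with an atomic group, (?>...), which is a reasonably close analogue of the single-colon case: once the group has matched, the engine may not backtrack into it. A small sketch in Perl 5 syntax, offered as an analogy and not as the Perl 6 semantics themselves:

```perl
use strict;
use warnings;

my $text = 'Junebug';

# Ordinary group: "June" matches, "ebug" then fails, so the engine
# backtracks into the alternation and tries "Jun".
my $backtracking = $text =~ /^(?:June|Jun)ebug/ ? 1 : 0;

# Atomic group: once "June" has matched, the alternation is never
# re-entered, so "Jun" is not tried and the overall match fails.
my $atomic = $text =~ /^(?>June|Jun)ebug/ ? 1 : 0;

print "backtracking: $backtracking, atomic: $atomic\n";
```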

> However, it is very possible that in many (the majority?) of actual
> uses, they may be intended purely as optimizations and so any
> differing behavior is unintentional. It may be worth having a flag
> noting that (maybe combined with a warning "you said this :: isn't
> supposed to change what can match, but it does.")

I think this is like the leftmost matching semantics -- it may exist for 
the sake of implementation efficiency, yet it has semantic consequences as 
well.  In many cases, those semantic differences may be immaterial, yet 
some code relies on it.  Allowing flags to specify that such differences in
semantics are immaterial to your pattern would be helpful.  (Would it make 
sense for one flag to say "don't care" to the semantic differences for BOTH 
leftmost matching and the :/::/:::/etc. operators?)

> > I believe there are many subpatterns which might be beneficial to compile 
> > to a DFA (or DFA-like) form, where runtime performance is important.  For 
> > example, if a pattern is matching dates, a (Jan|Feb|Mar|Apr|...) subpattern
> > would be more efficient to implement as a DFA than with backtracking.  With 
> > a large amount of data to process, that could represent significant savings...
> 
> I agree. I will reply to this in perl6-internals, though.

Yes, the discussion of details belongs there, when it's not infringing on 
issues of language design, as the semantic consequences do...
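
As a concrete picture of the (Jan|Feb|Mar|Apr|...) case, literal alternatives can be compiled into a trie, which is a tree-shaped DFA that reads each character once and never backtracks. The sketch below is an assumption about how such a matcher might look; build_trie and match_at are hypothetical helpers, not part of any engine discussed here.

```perl
use strict;
use warnings;

# Build a trie (a tree-shaped DFA) from a list of literal alternatives.
sub build_trie {
    my @words = @_;
    my %root;
    for my $word (@words) {
        my $node = \%root;
        for my $char (split //, $word) {
            $node->{next}{$char} ||= {};
            $node = $node->{next}{$char};
        }
        $node->{accept} = 1;    # a complete alternative ends here
    }
    return \%root;
}

# Return the length of the longest alternative matching at $pos, or undef.
sub match_at {
    my ($trie, $text, $pos) = @_;
    my $node = $trie;
    my $longest;
    for my $i ($pos .. length($text) - 1) {
        my $char = substr $text, $i, 1;
        my $next = $node->{next} && $node->{next}{$char};
        last unless $next;      # no transition: stop, never back up
        $node = $next;
        $longest = $i - $pos + 1 if $node->{accept};
    }
    return $longest;
}

my $months = build_trie(qw(Jan Feb Mar Apr May Jun June Jul));
for my $input ('June 25', 'Jan 1', 'Xyz') {
    my $len = match_at($months, $input, 0);
    printf "%-8s => %s\n", $input,
        defined $len ? substr($input, 0, $len) : '(no match)';
}
```

Note that this matcher naturally reports the longest alternative ("June" over "Jun"), which is the longest-match behavior of a DFA and not the leftmost behavior of a backtracking engine.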

> > (2) Would simple alternation impede DFA-style optimizations?
> > 
> > Currently, a pattern like (Jun|June) would never match "June" because the 
> > "leftmost" match "Jun" would always take precedence, despite the normal 
> > "longest-match" behavior of regexes in general.  This example could be 
> > implemented in a DFA; would that always be the case?
> 
> You should read Friedl's Mastering Regular Expressions, if you haven't
> already. A POSIX NFA would be required to find the longest match (it
> has to work as if it were a DFA). A "traditional" NFA produces what
> would result from the straightforward backtracking implementation,
> which often gives an answer closer to what the user expects. IMO,
> these are often different, and the DFA would surprise users fairly
> often.

Rather than following a traditional approach to NFA/DFA construction, would 
it be possible to use a modified approach that preserves leftmost matching?
(If so, would it be more expensive or just different?)
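
The leftmost-alternative behavior is easy to see in Perl 5 today. The sketch below shows the alternation order deciding the capture, and one way to approximate longest-match for literal alternatives by sorting them by length; the sorting trick is an illustration, not something proposed in this thread.

```perl
use strict;
use warnings;

my $text = 'June';

# Alternatives are tried left to right, so "Jun" wins even though
# "June" would also match and is longer.
my ($leftmost) = $text =~ /^(Jun|June)/;

# Putting the longer alternative first changes the result.
my ($reordered) = $text =~ /^(June|Jun)/;

# A following anchor forces backtracking into the second alternative.
my ($anchored) = $text =~ /^(Jun|June)$/;

# Longest-match can be approximated for literal alternatives by
# sorting them by descending length before building the pattern.
my @alternatives = ('Jun', 'June');
my $sorted = join '|',
    map { quotemeta($_) } sort { length($b) <=> length($a) } @alternatives;
my ($longest) = $text =~ /^($sorted)/;

print "$leftmost $reordered $anchored $longest\n";
```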

> > Would it be better for the matching of (Jun|June) to be "undefined" and 
> > implementation-dependent?  Or is it best to require "leftmost" semantics?
> 
> I'm voting "leftmost", because I've frequently seen people depend on it.

I was going to agree, until I read your next paragraph.

However, it would be useful to be able to say "don't care" to the semantic 
distinction -- it might even be useful to be able to demand longest-match 
take precedence over leftmost matching, but that would incur an extra cost 
in the normal regex engine...

> I'm not so sure that Larry's suggestion of adding a :dfa flag is
> always the right approach, because I think this might actually be
> something you'd want to set for subsets of a grammar or a single
> expression. I don't think it's useful enough to go as far as proposing
> that || mean "alternate without defining the order of preference", but
> perhaps some  would work. (Or can you embed
> flags in expressions, like perl5's (?imsx:R) thing? Then the :dfa flag
> is of course adequate!)

You know, when you bring up an idea like "||", I start thinking that maybe 
the default should be NOT to have a preference (since it normally doesn't 
matter) and to only guarantee the leftmost short-circuit behavior with "||" 
instead of "|".  That would allow for more implementation flexibility, and 
provide a beautiful parallel with C -- in C, only "||" short-circuits and 
the "|" operator still evaluates all parts.  (Granted, that's because it's 
bitwise, but there's still a nice parallel there.)

For the few cases where someone

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine

On Wed, 28 Aug 2002, Larry Wall wrote:

> That is a worthy consideration, but expressiveness takes precedence
> over it in this case.

I see nothing wrong with expressiveness taking precedence -- I'm only 
saying that it would be best to be cognizant of any restrictions we're 
hardcoding into the design (in case there's a less restrictive means to the 
same ends) and make that design tradeoff knowingly rather than by default.

If we can find general solutions that don't demand a particular style of 
implementation, that's probably an improvement.  There may be unavoidable 
cases, in which case we decide to accept the limitation for expressiveness, 
and that's a perfectly reasonable design choice.

I'd just hate to ignore the issue now, and have someone later say "here's a 
great way it could have been done that would have allowed this improvement 
in the implementation"...

> DFAs are really only good for telling you *whether* and *where* a pattern 
> matches as a whole.  They are relatively useless for telling you *how* a 
> pattern matches.  For instance, a DFA can tell you that you have a valid 
> computer program, but can't hand you back the syntax tree, because it has
> no way to decide between shifting and reducing.  It has to do both
> simultaneously.

Yes and no.  You're right, but see below.

> : It may be that backreferences already demand backtracking.  Or some other 
> : feature might.  I don't know; I haven't thought it through.
> 
> I believe you are correct that backrefs require backtracking.  Maybe some 
> smart person will find a way to trace back through the states by which a 
> DFA matched to retrieve backref info, but that's probably worth a PhD or 
> two.

Well, there are certainly PhD students out there doing new research all the 
time.  Who knows what one will come up with one day?  It would suck if one 
gets a PhD for a super-clever pattern-matching algorithm, and we find that 
we can't use it because of hardcoded assumptions in the language design...

As for backtracing states of the DFA, see below.

> Minimal matching is also difficult in a DFA, I suspect.

Is it?  I'm not sure.  Since the DFA effectively follows all branches of 
the NFA at once, perhaps minimal matching is no more difficult than maximal?

Then again, maybe not. :-)

> : If we must have backtracking, so be it.  But if that's a tradeoff we're 
> : making for more expressive and powerful patterns, we should still at least 
> : make that tradeoff with our eyes open.  And if the tradeoff can be avoided, 
> : that's even better.
> 
> I refer you to http:://history.org/grandmother/teach/egg/suck.  :-)

Um, huh?

> That's a tradeoff I knowingly made in Perl 1.  I saw that awk had a
> DFA implementation, and suffered for it as a useful tool.

I suspect it's not practical to have an all-DFA implementation with nearly 
the power and expressiveness of Perl 4 or Perl 5 regexes, much less Perl 6.

On the other hand, many patterns have subpatterns which might benefit from 
using a DFA as an optimization.  You don't lose the expressiveness here,
if the backtracking NFA is available as well.

I'm just asking that we consider the impact on such optimizations, and see 
if we can leave the door open to reap the benefits without compromising the 
power and expressiveness we all want.  Maybe this just amounts to adding a 
few modifiers to allow semantic variants (like longest-trumps-leftmost), to 
enable optimizations that would otherwise impinge on correctness...

> And it's not just the backrefs.  An NFA can easily incorporate
> strategies such as Boyer-Moore, which actually skips looking at many of
> the characters, or the "scream" algorithm used by study, which can skip
> even more.  All the DFAs I've seen have to look at every character,
> albeit only once.  I suppose it's theoretically possible to compile
> to a Boyer-Moore-ish state machine, but I've never seen it done.

Okay, I confess that I've been saying "DFA" when I don't necessarily mean 
precisely that.  What I really mean is a "non-backtracking state machine" 
of some sort, but I'm calling it a DFA because it would be similar to one 
(to the degree possible) and people know what a DFA is.  I could say NBSM, 
but that seems confusing. :-)

Your objections to the limitations of a DFA are quite correct, of course.  
Modifications would be required to overcome the limits, and then it's no 
longer really a DFA, just like Perl 5's "regular expressions" are no longer 
really "regular expressions" in the mathematical sense.  I'm envisioning a 
state machine of some sort, which has a lot in common with a DFA but isn't 
strictly a DFA anymore.  If you prefer, I'll call it an NBSM, or I'm open 
to better suggestions for a name!

Anyway, to respond to your objections to a DFA:

* While you couldn't hand back a syntax tree from a true DFA, it should be
  possible to create an "NBSM" from a DFA recognizer, modified to record 
  whatever extra information is needed to execute the code

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Steve Fink

On Wed, Aug 28, 2002 at 12:55:44PM -0400, Deven T. Corzine wrote:
> On Wed, 28 Aug 2002, Dan Sugalski wrote:
> > At 10:57 AM -0400 8/28/02, Deven T. Corzine wrote:
> 
> On the other hand, :, ::, ::: and  don't necessarily need to be a 
> problem if they can be treated as hints that can be ignored.  If allowing 
> the normal engine to backtrack despite the hints would change the results, 
> that might be a problem.  I don't know;  may pose special problems.

They do change the semantics.

 (June|Jun) ebug

matches the string "Junebug", but

 (June|Jun): ebug

does not. Similarly,

 (June|Jun) ebug
 (June|Jun) :: ebug
 (June|Jun) ::: ebug
 (June|Jun)  ebug

all behave differently when embedded in a larger grammar.

However, it is very possible that in many (the majority?) of actual
uses, they may be intended purely as optimizations and so any
differing behavior is unintentional. It may be worth having a flag
noting that (maybe combined with a warning "you said this :: isn't
supposed to change what can match, but it does.")
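
The Junebug case above can be approximated in Perl 5, as a rough sketch only. It assumes that the atomic group (?>...) is a fair stand-in for the Perl 6 ":" (it is the nearest Perl 5 construct I know of, not the same operator), and it drops the whitespace, which Perl 5 would treat as literal without /x:

```perl
use strict;
use warnings;

# Plain alternation: "June" is tried first, "ebug" then fails against
# "bug", and the engine backtracks into the group to retry with "Jun".
my $plain  = 'Junebug' =~ /(June|Jun)ebug/   ? 1 : 0;

# Atomic group: once "June" has matched, the group is never re-entered,
# so the "Jun" alternative is not tried at this position.
my $atomic = 'Junebug' =~ /(?>June|Jun)ebug/ ? 1 : 0;

print "plain=$plain atomic=$atomic\n";   # plain=1 atomic=0
```

So the no-backtrack form is not merely a hint here; it turns a match into a failure.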

> > That doesn't mean you can't write one for a specific subset of perl's 
> > regexes, though. A medium-term goal for the regex engine is to note 
> > where a DFA would give correct behaviour and use one, rather than 
> > going through the more expensive generalized regex engine we'd 
> > otherwise use.
> 
> I think this is a more realistic goal, and more or less what I had in mind.
> 
> I believe there are many subpatterns which might be beneficial to compile 
> to a DFA (or DFA-like) form, where runtime performance is important.  For 
> example, if a pattern is matching dates, a (Jan|Feb|Mar|Apr|...) subpattern
> would be more efficient to implement as a DFA than with backtracking.  With 
> a large amount of data to process, that represents significant savings...

I agree. I will reply to this in perl6-internals, though.

> (2) Would simple alternation impede DFA-style optimizations?
> 
> Currently, a pattern like (Jun|June) would never match "June" because the 
> "leftmost" match "Jun" would always take precedence, despite the normal 
> "longest-match" behavior of regexes in general.  This example could be 
> implemented in a DFA; would that always be the case?

You should read Friedl's Mastering Regular Expressions, if you haven't
already. A POSIX NFA would be required to find the longest match (it
has to work as if it were a DFA). A "traditional" NFA produces what
would result from the straightforward backtracking implementation,
which often gives an answer closer to what the user expects. IMO,
these are often different, and the DFA would surprise users fairly
often.
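
To make the difference concrete, here is a small Perl 5 sketch. The "longest" half is only an emulation for literal alternatives (trying each one in turn and keeping the longest hit); it is an assumption about what a POSIX-style engine would report, not a real DFA:

```perl
use strict;
use warnings;

my $text = 'June';

# Traditional backtracking NFA: the first alternative that lets the
# overall match succeed wins, so "Jun" is what gets captured.
my ($leftmost) = $text =~ /^(Jun|June)/;

# Emulated "longest" semantics, for literal alternatives only:
# try each one anchored at the start and keep the longest hit.
my $longest = '';
for my $alt (qw(Jun June)) {
    if ($text =~ /^(\Q$alt\E)/ && length($1) > length($longest)) {
        $longest = $1;
    }
}

print "leftmost=$leftmost longest=$longest\n";   # leftmost=Jun longest=June
```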

> Would it be better for the matching of (Jun|June) to be "undefined" and 
> implementation-dependent?  Or is it best to require "leftmost" semantics?

I'm voting "leftmost", because I've frequently seen people depend on
it. I'm not so sure that Larry's suggestion of adding a :dfa flag is
always the right approach, because I think this might actually be
something you'd want to set for subsets of a grammar or a single
expression. I don't think it's useful enough to go as far as proposing
that || mean "alternate without defining the order of preference", but
perhaps some  would work. (Or can you embed
flags in expressions, like perl5's (?imsx:R) thing? Then the :dfa flag
is of course adequate!)



Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine

On Wed, 28 Aug 2002, Larry Wall wrote:

> : (1) Can we have a ":study" modifier in Perl 6 for patterns?
> : 
> : It could be a no-op if necessary, but it could take the place of Perl 5's 
> : "study" operator and indicate that the programmer WANTS the pattern 
> : optimized for maximum runtime speed, even at the cost of compile time or 
> : memory.  (Hey, how about a ":cram" modifier for extreme optimization? :-)
> 
> Well, "studied" isn't really a property of a pattern--it's a property of a
> string that knows it will have multiple patterns matched against it.  One
> could put a :study on the first pattern, but that's somewhat deceptive.

Oh yeah.  I forgot it applied to the string, not the pattern!  I forgot 
since I never use it! :-)  Still, it could be considered a parallel...

Perhaps a better approach would be to allow the optimization priorities to 
be specified, perhaps even as numerical ranges for relative importance?  
The three obvious dimensions to quantify would be compile time, run-time 
speed, and memory usage.  There are often tradeoffs between these three, and 
allowing the ability for a programmer to specify his/her preferences could 
allow for aggressive optimizations that are normally inappropriate...

Of course, these would be useful not only as modifiers for compiling any 
regexes, but as general pragmas controlling optimizing behavior of the 
entire Perl 6 compiler/optimizer...

I'm not sure if it's good enough to just say "optimize for run-time speed 
at the expense of compile time and memory" (or variations for only having 
one of the two sacrificed) -- or if it's better to have a scale (say, 0-9) 
for how important each dimension is.

For the extreme case where long compile time and large memory usage is 
irrelevant, but extreme run-time speed is a must, the programmer might 
specify optimization priorities of compile=0, memory=0, speed=9.  I'm not 
sure what sort of syntax would be appropriate for such specifications...

> : (2) Would simple alternation impede DFA-style optimizations?
> : 
> : Currently, a pattern like (Jun|June) would never match "June" because the 
> : "leftmost" match "Jun" would always take precedence, despite the normal 
> : "longest-match" behavior of regexes in general.  This example could be 
> : implemented in a DFA; would that always be the case?
> 
> Well, "June" can match if what follows fails to match after "Jun".

True enough.  Couldn't that still be implemented in a DFA?  (Possibly at 
the cost of doubling the size of the DFA for the later part of the regex!)

> : Would it be better for the matching of (Jun|June) to be "undefined" and 
> : implementation-dependent?  Or is it best to require "leftmost" semantics?
> 
> Well, the semantics shouldn't generally wobble around like that, but it'd
> be pretty easy to let them wobble on purpose via pragma (or via :modifier,
> which are really just pragmas that scope to regex groups).

Yeah, it's probably safer not to have that much room for undefined behavior 
since people will just try it and assume that their implementation is the 
universal behavior...

Would there be a good way to say "don't care" about the longest-vs-leftmost 
matching semantics?  Would it be worthwhile to have longest-trumps-leftmost 
as an optional modifier?  (This might be very expensive if implemented in a 
backtracking engine, since it could no longer shortcut alternations...)

Dan suggested ":dfa" for DFA semantics -- is that the best answer, or would 
it be better to define the modifiers in terms of visible behavior rather 
than implementation, if possible?

Deven




Re: need help on perl scripts #1 newuser.pl

2002-08-28 Thread Luke Palmer

This is really the wrong place to be sending this.   This is Perl 5 (or 
maybe even Perl 4, which I don't know) code, and this is a list for 
discussing the design of Perl 6.  A good place to send this would 
probably be [EMAIL PROTECTED]

Good Luck,
Luke

On Wed, 28 Aug 2002, frank crowley wrote:

> #!/usr/local/bin/perl
> $mail_prog = '/usr/lib/sendmail' ;
> # This script was generated automatically by Perl
> Builder(tm): http://www.solutionsoft.com
> 
> # ***ENDAUTOGEN:HEADER*** Do NOT modify this line!! 
> You may enter custom code after this line.
> 
> 
> # ***AUTOGEN:INPUT*** Do NOT modify this line!! Do NOT
> enter custom code in this section.
> 
> &GetFormInput;
> 
> # The intermediate variables below make your script
> more readable
> # but somewhat less efficient since they are not
> really necessary.
> # If you do not want to use these variables, clear the
> # Intermediate Variables checkbox in the Tools |
> Options dialog box, CGI Wizard tab.
> 
> $fmc = $field{'fmc'} ; 
> $name = $field{'name'} ;   
> $email = $field{'email'} ; 
> $address1 = $field{'address1'} ;   
> $address2 = $field{'address2'} ;   
> $city = $field{'city'} ;   
> $state = $field{'state'} ; 
> $zip = $field{'zip'} ; 
> $country = $field{'country'} ; 
> $username = $field{'username'} ;   
> $password = $field{'password'} ;   
> $confpassword = $field{'confpassword'} ;   
> $NewUser = $field{'NewUser'} ; 
> 
> $message = "" ;
> $found_err = "" ;
> 
> # ***ENDAUTOGEN:INPUT*** Do NOT modify this line!! 
> You may enter custom code after this line.
> 
> 
> # ***AUTOGEN:VALIDATE*** Do NOT modify this line!! Do
> NOT enter custom code in this section.
> 
> $errmsg = "\n" ;
> 
> if (length($fmc) < 5) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> if (length($fmc) > 1952935525) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "Please enter a valid email
> address\n" ;
> 
> if ($name !~ /.+\@.+\..+/) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "\n" ;
> 
> if (length($address2) > 112076456) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "\n" ;
> 
> if (length($city) > 1830843236) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "\n" ;
> 
> if (length($state) > 112079212) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "\n" ;
> 
> if (length($zip) > 168650098) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "\n" ;
> 
> if (length($country) > 542395983) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "\n" ;
> 
> if (length($username) > 112001069) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "\n" ;
> 
> if (length($password) > 332) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "\n" ;
> 
> if (length($confpassword) > 50528256) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> 
> $errmsg = "\n" ;
> 
> if (length($NewUser) > 112137472) {
>   $message = $message.$errmsg ;
>   $found_err = 1 ; }
> 
> if ($found_err) {
>   &PrintError; }
> 
> 
> # ***ENDAUTOGEN:VALIDATE*** Do NOT modify this line!! 
> You may enter custom code after this line.
> 
> 
> # ***AUTOGEN:LOGFILE*** Do NOT modify this line!! Do
> NOT enter custom code in this section.
> 
> # ***ENDAUTOGEN:LOGFILE*** Do NOT modify this line!! 
> You may enter custom code after this line.
> 
> 
> # ***AUTOGEN:EMAIL*** Do NOT modify this line!! Do NOT
> enter custom code in this section.
> 
> # ***ENDAUTOGEN:EMAIL*** Do NOT modify this line!! 
> You may enter custom code after this line.
> 
> 
> # ***AUTOGEN:HTML*** Do NOT modify this line!! Do NOT
> enter custom code in this section.
> print "Content-type: text/html\n\n";
> print ''."\n" ;
> print " print '   PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"'."\n"
> ;
> print '
> "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd";>'."\n"
> ;
> print 'http://www.w3.org/1999/xhtml";
> lang="en-US">jovens mtg auction - New
> User'."\n" ;
> print ' href="mailto:joven_20%40yahoo.com"; />'."\n" ;
> print ' bgcolor="#00"> action="http://magicauction.netfirms.com/cgi-bin/newuser.cgi";
> enctype="application/x-www-form-urlencoded">'."\n" ;
> print "\n" ;
> print '   SRC="http://www.mtgauction.com/lazarusMtGlogoHORIZ.GIF";
> '."\n" ;
> print '   ALT="lazarus MtG auction" HEIGHT=69
> WIDTH=519 ALIGN=CENTER>'."\n" ;
> print "  \n" ;
> print ' /> cellspacing="1" border="0"> FACE="Arial, Helvetica">New User Entry
> Form Items
> in bold are'."\n" ;
> print '   mandatory for participation in the
> auction.  align="RIGHT">Name: align="LEFT"> maxlength="40" />  align="RIGHT">E-mail
> address: type="text" name="email"  size="40" maxlength="50"
> />  COLOR=#FF>Address: a

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Dan Sugalski

At 10:36 AM -0700 8/28/02, Larry Wall wrote:
>On Wed, 28 Aug 2002, Deven T. Corzine wrote:
>: I'm not saying we should dump the operators -- if we get more power by
>: assuming a backtracking implementation, maybe that's a worthwhile tradeoff.
>:
>: On the other hand, if we can keep the implementation possibilities more
>: open, that's always a worthwhile goal, even if we're not sure if or when
>: we'll ever take advantage of those possibilities, or if we even could...
>
>That is a worthy consideration, but expressiveness takes precedence
>over it in this case.  DFAs are really only good for telling you
>*whether* and *where* a pattern matches as a whole.  They are
>relatively useless for telling you *how* a pattern matches.
>For instance, a DFA can tell you that you have a valid computer
>program, but can't hand you back the syntax tree, because it has
>no way to decide between shifting and reducing.  It has to do both
>simultaneously.

While true, there are reasonably common cases where you don't care 
about what or where, just whether. For a set of mushed-together 
examples:

while (<>) {
    last if /END_OF_DATA/;
    $line .= $_ if /=$/;
    next unless /$user_entered_string/;
}

Sure, it's a restricted subset of the stuff people do, and that's 
cool. I'd not even want to put in DFA-detecting code in the main 
regex compilation grammar. But in those cases where it is useful, a 
:dfa switch for regexes would be nifty.


(Though *please* don't yet--we've not gotten the current grammar 
fully implemented :)
-- 
 Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk



Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Larry Wall

On Wed, 28 Aug 2002, Deven T. Corzine wrote:
: I'd like to do that, if I can find the time.  It would be interesting to 
: make a small experimental prototype to see if DFA construction could really 
: improve performance over backtracking, but it would probably need to be a 
: very restricted subset of regex operations to test the idea...

That'd be cool.

: However, while I'm still on perl6-language, I have two language issues to 
: discuss first:
: 
: (1) Can we have a ":study" modifier in Perl 6 for patterns?
: 
: It could be a no-op if necessary, but it could take the place of Perl 5's 
: "study" operator and indicate that the programmer WANTS the pattern 
: optimized for maximum runtime speed, even at the cost of compile time or 
: memory.  (Hey, how about a ":cram" modifier for extreme optimization? :-)

Well, "studied" isn't really a property of a pattern--it's a property of a
string that knows it will have multiple patterns matched against it.  One
could put a :study on the first pattern, but that's somewhat deceptive.

: (2) Would simple alternation impede DFA-style optimizations?
: 
: Currently, a pattern like (Jun|June) would never match "June" because the 
: "leftmost" match "Jun" would always take precedence, despite the normal 
: "longest-match" behavior of regexes in general.  This example could be 
: implemented in a DFA; would that always be the case?

Well, "June" can match if what follows fails to match after "Jun".
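
A quick Perl 5 illustration of that point (the sample strings are made up for the example):

```perl
use strict;
use warnings;

# "Jun" is tried first, but the following " \d" cannot match at "e 25",
# so the engine backtracks and the "June" alternative gets its turn.
my ($long)  = 'June 25' =~ /^(Jun|June) \d/;

# Here "Jun" is followed by a space and a digit, so the first
# alternative succeeds and "June" is never tried.
my ($short) = 'Jun 25'  =~ /^(Jun|June) \d/;

print "long=$long short=$short\n";   # long=June short=Jun
```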

: Would it be better for the matching of (Jun|June) to be "undefined" and 
: implementation-dependent?  Or is it best to require "leftmost" semantics?

Well, the semantics shouldn't generally wobble around like that, but it'd
be pretty easy to let them wobble on purpose via pragma (or via :modifier,
which are really just pragmas that scope to regex groups).

Larry




need help in getting the website to acknowledge cgi and perl script when clicking on link to go to new user signup html page, as well as auction.html page

2002-08-28 Thread frank crowley

and for them to interact. 
http://magicauction.netfirms.com/index.html
trying to get the preview auction link to go to
auction.cgi, and the link for new user to go to
newuser.cgi which are both in the cgi-bin


=
frank crowley

__
Do You Yahoo!?
Yahoo! Finance - Get real-time stock quotes
http://finance.yahoo.com



#2 auction.pl

2002-08-28 Thread frank crowley

#!/usr/local/bin/perl
$mail_prog = '/usr/lib/sendmail' ;
# This script was generated automatically by Perl Builder(tm): http://www.solutionsoft.com

# ***ENDAUTOGEN:HEADER*** Do NOT modify this line!! You may enter custom code after this line.


# ***AUTOGEN:INPUT*** Do NOT modify this line!! Do NOT enter custom code in this section.

&GetFormInput;

# The intermediate variables below make your script more readable
# but somewhat less efficient since they are not really necessary.
# If you do not want to use these variables, clear the
# Intermediate Variables checkbox in the Tools | Options dialog box, CGI Wizard tab.

$vcolor = $field{'vcolor'} ; 
$sc = $field{'sc'} ; 
$vsets = $field{'vsets'} ;   
$vlanguages = $field{'vlanguages'} ; 
$vrarities = $field{'vrarities'} ;   
$ChangeView = $field{'ChangeView'} ; 
$White = $field{'White'} ;   
$Blue = $field{'Blue'} ; 
$Black = $field{'Black'} ;   
$Red = $field{'Red'} ;   
$Green = $field{'Green'} ;   
$Gold = $field{'Gold'} ; 
$Artifact = $field{'Artifact'} ; 
$Land = $field{'Land'} ; 
$_cgifields = $field{'.cgifields'} ; 

$message = "" ;
$found_err = "" ;

# ***ENDAUTOGEN:INPUT*** Do NOT modify this line!! You may enter custom code after this line.


# ***AUTOGEN:VALIDATE*** Do NOT modify this line!! Do NOT enter custom code in this section.

$errmsg = "\n" ;

if (length($vlanguages) > 1092690721) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($Green) > 20) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($Gold) > 197379) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($Artifact) > 23) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($Land) > 1162158653) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($_cgifields) > 537529662) {
$message = $message.$errmsg ;
$found_err = 1 ; }

if ($found_err) {
&PrintError; }


# ***ENDAUTOGEN:VALIDATE*** Do NOT modify this line!! You may enter custom code after this line.


# ***AUTOGEN:LOGFILE*** Do NOT modify this line!! Do NOT enter custom code in this section.

# ***ENDAUTOGEN:LOGFILE*** Do NOT modify this line!! You may enter custom code after this line.


# ***AUTOGEN:EMAIL*** Do NOT modify this line!! Do NOT enter custom code in this section.

# ***ENDAUTOGEN:EMAIL*** Do NOT modify this line!! You may enter custom code after this line.


# ***AUTOGEN:HTML*** Do NOT modify this line!! Do NOT enter custom code in this section.

# ***ENDAUTOGEN:HTML*** Do NOT modify this line!!  You may enter custom code after this line.


# ***AUTOGEN:ERRPRINT*** Do NOT modify this line!! Do NOT enter custom code in this section.

sub PrintError { 
print "Content-type: text/html\n\n";
print $message ;

exit 0 ;
return 1 ; 
}

# ***ENDAUTOGEN:ERRPRINT*** Do NOT modify this line!! You may enter custom code after this line.


# ***AUTOGEN:PARSE*** Do NOT modify this line!! Do NOT enter custom code in this section.
sub GetFormInput {

(*fval) = @_ if @_ ;

local ($buf);
if ($ENV{'REQUEST_METHOD'} eq 'POST') {
read(STDIN,$buf,$ENV{'CONTENT_LENGTH'});
}
else {
$buf=$ENV{'QUERY_STRING'};
}
if ($buf eq "") {
return 0 ;
}
else {
@fval=split(/&/,$buf);
foreach $i (0 .. $#fval){
($name,$val)=split (/=/,$fval[$i],2);
$val=~tr/+/ /;
$val=~ s/%(..)/pack("c",hex($1))/ge;
$name=~tr/+/ /;
$name=~ s/%(..)/pack("c",hex($1))/ge;

if (!defined($field{$name})) {
$field{$name}=$val;
}
else {
$field{$name} .= ",$val";

# If you want multi-selects to go into an array, change to:
# $field{$name} .= "\0$val";
}


   }
}
return 1;
}


# ***ENDAUTOGEN:PARSE*** Do NOT modify this line!! You may enter custom code after this line.



=
frank crowley




need help on perl scripts #1 newuser.pl

2002-08-28 Thread frank crowley

#!/usr/local/bin/perl
$mail_prog = '/usr/lib/sendmail' ;
# This script was generated automatically by Perl
Builder(tm): http://www.solutionsoft.com

# ***ENDAUTOGEN:HEADER*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:INPUT*** Do NOT modify this line!! Do NOT
enter custom code in this section.

&GetFormInput;

# The intermediate variables below make your script
more readable
# but somewhat less efficient since they are not
really necessary.
# If you do not want to use these variables, clear the
# Intermediate Variables checkbox in the Tools |
Options dialog box, CGI Wizard tab.

$fmc = $field{'fmc'} ;   
$name = $field{'name'} ; 
$email = $field{'email'} ;   
$address1 = $field{'address1'} ; 
$address2 = $field{'address2'} ; 
$city = $field{'city'} ; 
$state = $field{'state'} ;   
$zip = $field{'zip'} ;   
$country = $field{'country'} ;   
$username = $field{'username'} ; 
$password = $field{'password'} ; 
$confpassword = $field{'confpassword'} ; 
$NewUser = $field{'NewUser'} ;   

$message = "" ;
$found_err = "" ;

# ***ENDAUTOGEN:INPUT*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:VALIDATE*** Do NOT modify this line!! Do
NOT enter custom code in this section.

$errmsg = "\n" ;

if (length($fmc) < 5) {
$message = $message.$errmsg ;
$found_err = 1 ; }

if (length($fmc) > 1952935525) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "Please enter a valid email
address\n" ;

if ($name !~ /.+\@.+\..+/) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($address2) > 112076456) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($city) > 1830843236) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($state) > 112079212) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($zip) > 168650098) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($country) > 542395983) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($username) > 112001069) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($password) > 332) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($confpassword) > 50528256) {
$message = $message.$errmsg ;
$found_err = 1 ; }


$errmsg = "\n" ;

if (length($NewUser) > 112137472) {
$message = $message.$errmsg ;
$found_err = 1 ; }

if ($found_err) {
&PrintError; }


# ***ENDAUTOGEN:VALIDATE*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:LOGFILE*** Do NOT modify this line!! Do
NOT enter custom code in this section.

# ***ENDAUTOGEN:LOGFILE*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:EMAIL*** Do NOT modify this line!! Do NOT
enter custom code in this section.

# ***ENDAUTOGEN:EMAIL*** Do NOT modify this line!! 
You may enter custom code after this line.


# ***AUTOGEN:HTML*** Do NOT modify this line!! Do NOT
enter custom code in this section.
print "Content-type: text/html\n\n";
print ''."\n" ;
print "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd";>'."\n"
;
print 'http://www.w3.org/1999/xhtml";
lang="en-US">jovens mtg auction - New
User'."\n" ;
print 'mailto:joven_20%40yahoo.com"; />'."\n" ;
print 'http://magicauction.netfirms.com/cgi-bin/newuser.cgi";
enctype="application/x-www-form-urlencoded">'."\n" ;
print "\n" ;
print '  http://www.mtgauction.com/lazarusMtGlogoHORIZ.GIF";
'."\n" ;
print '   ALT="lazarus MtG auction" HEIGHT=69
WIDTH=519 ALIGN=CENTER>'."\n" ;
print "  \n" ;
print 'New User Entry
Form Items
in bold are'."\n" ;
print ' mandatory for participation in the
auction. Name: E-mail
address: Address:   City: State: Zip: Country:   Username:Must be
one word and contain letters or numbers
only. Password:Must be
8-10 characters and contain'."\n" ;
print ' letters or numbers only.
Confirm Password:   '."\n" ;
print "\n" ;
print '  '."\n" ;
print " Magic: the Gathering is a trademark
of \n" ;
print '  http://www.wizards.com/";>'."\n"
;
print "Wizards of the Coast\n" ;
print "  .\n" ;
print "   \n" ;
print "   \n" ;
print '   '."\n" ;
print " Web site copyright © 1998-2002 \n" ;
print ' mailto:[EMAIL PROTECTED]";>'."\n" ;
print "   frank j crowley\n" ;
print " . All rights reserved.\n" ;
print "   \n" ;
print " \n" ;
print "\n" ;

# ***ENDAUTOGEN:HTML*** Do NOT modify this line!!  You
may enter custom code after this line.


# ***AUTOGEN:ERRPRINT*** Do NOT modify this line!! Do
NOT enter custom code in this section.

sub PrintError { 
print "Content-type: 

Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Larry Wall

On Wed, 28 Aug 2002, Deven T. Corzine wrote:
: I'm not saying we should dump the operators -- if we get more power by 
: assuming a backtracking implementation, maybe that's a worthwhile tradeoff.
: 
: On the other hand, if we can keep the implementation possibilities more 
: open, that's always a worthwhile goal, even if we're not sure if or when 
: we'll ever take advantage of those possibilities, or if we even could...

That is a worthy consideration, but expressiveness takes precedence
over it in this case.  DFAs are really only good for telling you
*whether* and *where* a pattern matches as a whole.  They are
relatively useless for telling you *how* a pattern matches.
For instance, a DFA can tell you that you have a valid computer
program, but can't hand you back the syntax tree, because it has
no way to decide between shifting and reducing.  It has to do both
simultaneously.

: It seems like backtracking is a Bad Thing, in that it leads to reprocessing 
: data that we've already looked at.  On the other hand, it seems to be a 
: Necessary Evil because of the memory costs of avoiding backtracking, and 
: because we might have to give up valuable features without backtracking.
: 
: It may be that backreferences already demand backtracking.  Or some other 
: feature might.  I don't know; I haven't thought it through.

I believe you are correct that backrefs require backtracking.  Maybe some smart
person will find a way to trace back through the states by which a DFA matched
to retrieve backref info, but that's probably worth a PhD or two.

Minimal matching is also difficult in a DFA, I suspect.

: If we must have backtracking, so be it.  But if that's a tradeoff we're 
: making for more expressive and powerful patterns, we should still at least 
: make that tradeoff with our eyes open.  And if the tradeoff can be avoided, 
: that's even better.

I refer you to http:://history.org/grandmother/teach/egg/suck.  :-)

That's a tradeoff I knowingly made in Perl 1.  I saw that awk had a
DFA implementation, and suffered for it as a useful tool.

And it's not just the backrefs.  An NFA can easily incorporate
strategies such as Boyer-Moore, which actually skips looking at many of
the characters, or the "scream" algorithm used by study, which can skip
even more.  All the DFAs I've seen have to look at every character,
albeit only once.  I suppose it's theoretically possible to compile
to a Boyer-Moore-ish state machine, but I've never seen it done.

Add to that the fact that most real-life patterns don't generally do
much backtracking, because they're written to succeed, not to fail.
This pattern never backtracks, for instance:

my ($num) = /^Items: (\d+)/;
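
For anyone who wants to try that pattern, here is a minimal Perl 5 sketch; the two sample lines are invented for illustration. In list context a failed match returns the empty list, so $num is simply left undefined:

```perl
use strict;
use warnings;

my @nums;
for my $line ('Items: 42', 'Total: 42') {
    # The anchored literal prefix either matches at once or fails at
    # once; \d+ then takes the digits with nothing left to retry.
    my ($num) = $line =~ /^Items: (\d+)/;
    push @nums, $num;
    print "$line -> ", (defined $num ? $num : 'no match'), "\n";
}
```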

I'm not against applying a DFA implementation where it's useful
and practical, but just because it's the "best" in some limited
theoretical framework doesn't cut it for me.  Humans do a great
deal of backtracking in real life, once the limits of their parallel
pattern matching circuits are exceeded.  Even in language we often
have to reparse sentences that are garden pathological.  Why should
computers be exempt? :-)

Larry




Re: auto deserialization

2002-08-28 Thread David Wheeler

On Wednesday, August 28, 2002, at 09:56  AM, Larry Wall wrote:

> my Date $date { 'June 25, 2002' };
>
> Either way, this makes data declarations more like sub declarations
> in syntax, though the semantics of what you do with the final closure
> when are obviously different.  That is, for ordinary data a bare {...}
> is equivalent to "is now", while for a subroutine definition it's more
> like "is on_demand".

I actually rather like that as a sort of compromise. Syntactic sugar, 
good.

I'm assuming, however, that the difference in syntax between the two 
different uses of {...} would be easily identifiable via the assignment 
operator, viz:

   my Date $date { 'June 25, 2002' };

vs.

   my $sub = { ... };

Correct?

Also, this leads me to wonder, is a closure actually a typed object?

   my Closure $sub = { ... };

And if so, does it matter?

> Whatever.  My coffee stream hasn't yet suppressed my stream of 
> consciousness.

I think we're all the better for it! :-)

Regards,

David

-- 
David Wheeler AIM: dwTheory
[EMAIL PROTECTED] ICQ: 15726394
http://david.wheeler.net/  Yahoo!: dew7e
Jabber: [EMAIL PROTECTED]




Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Sean O'Rourke

On Wed, 28 Aug 2002, Deven T. Corzine wrote:
> Would it be better for the matching of (Jun|June) to be "undefined" and
> implementation-dependent?  Or is it best to require "leftmost" semantics?

For an alternation spelled out explicitly in the pattern, it seems like
undefined matching would be confusing.  I regularly order the branches of
regexes assuming they are tried left-to-right.

On the other hand, and on a related note of constrained implementation, do
we require leftmost matching for interpolated arrays of literals (e.g.
"/@x/")?  If, as with hyper-operators, we said the order of evaluation is
undefined, we could use a fast algorithm (Aho-Corasick?) that doesn't
preserve order.
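
A Perl 5 sketch of why the order question matters. It builds the alternation by hand from an array of literals; the longest-first ordering is just one strategy an implementation might pick if order were left undefined, not anything that has been proposed:

```perl
use strict;
use warnings;

my @x = qw(Jun June);

# Alternation built in array order; quotemeta keeps the entries literal.
my $in_order = join '|', map { quotemeta($_) } @x;
my ($first_wins) = 'June' =~ /^($in_order)/;

# Same literals, but reordered longest-first before building the pattern.
my $by_length = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @x;
my ($longest_wins) = 'June' =~ /^($by_length)/;

print "in order: $first_wins, longest first: $longest_wins\n";
# in order: Jun, longest first: June
```

If the two results are allowed to differ, code that relies on array order would break under the faster algorithm.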

/s




Re: auto deserialization

2002-08-28 Thread Dan Sugalski

At 5:29 PM +0100 8/28/02, Nicholas Clark wrote:
>On Wed, Aug 28, 2002 at 12:17:55PM -0400, Dan Sugalski wrote:
>>  At 10:36 AM +0200 8/28/02, [EMAIL PROTECTED] wrote:
>>  >  >> Will there be automatic calling of the deserialization method
>>  >>>  for objects, so that code like this DWIMs...
>>  >
>>  >>>   my Date $bday = 'June 25, 2002';
>>  >
>>  >>  Err... what do you mean it to do?
>>  >
>>  >Wow, this is nice. He means (I think) that this will be translated into
>>  >
>>  >my Date $bday = Date->new('June 25, 2002');
>>
>>  That's really unlikely. More likely what'll happen is:
>>
>> my Date $bday;
>> $bday = 'June 25, 2002';
>>
>>  and it'll be up to $bday's string assignment code to decide what to
>>  do when handed a string that looks like a date.
>
>op wise, how is that different from the original suggestion of
>
> my Date $bday = 'June 25, 2002';

It isn't. It was mostly to stem the followup "eight zillion flavors 
of new" cascade that was sure to follow. :)

>  > That should work OK for a variety of reasons. $bday is strongly typed
>>  since you told perl what type it was in the my declaration. Date can
>>  also override string assignment, thus Doing The Right Thing (pitching
>>  a fit or taking a date) when you assign to it.
>>
>>  I can see downsides to it, though--it means you lose the compile-time
>>  type checking, since just because we're getting the wrong type
>>  doesn't mean it's really an error. OTOH it's not like we have strong
>>  compile-time type checking now...
>
>If the compiler were able to see that my Date $bday = 'June 25, 2002';
>is one statement that both types $bday as Date, and then assigns a constant
>to it, is it possible to do the conversion of that constant to a constant
>$bday object at compile time? (and hence get compile time checking)
>Without affecting general run time behaviour.

That's possible, yes. We could construct the object at compiletime 
and store a real serialized version in the bytecode, and deserialize 
at execution time. We probably will do that, though maybe not for the 
first version of the compilers.
-- 
 Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk



Re: auto deserialization

2002-08-28 Thread Larry Wall

On Wed, 28 Aug 2002, David Wheeler wrote:
: I have to agree with this. Ideally, IMO, there'd be some magic going on 
: behind the scenes (maybe with a pragma?) that automatically typed 
: variables so we wouldn't have to be so redundant, the code would look 
: more like (most) Perl 5 OO stuff, and I'd save my tendonitis. What I 
: mean (ignoring for the moment the even simpler syntax suggested earlier 
: in this thread) is this:
: 
:my $date = Date.new('June 25, 2002');
: 
: Would automatically type C<$date> as a Date object.

Assignment is wrong for conferring compile-time properties, I think.
Maybe something more like:

my Date $date is new('June 25, 2002');

except that this implies the constructor args would be evaluated at
compile time.  We need to suppress that somehow.  We almost need some
kind of topicalization:

my Date $date = .new('June 25, 2002');

but I think that's taking topicalization a bit too far.  The ordinary
way to suppress early evaluation is by defining a closure.  I've argued
before for something like a topicalized closure property:

my Date $date is first { .init 'June 25, 2002' };

though "first" might be too early.  The init should be inline with
the declaration, so maybe it's

my Date $date is now { .init 'June 25, 2002' };

That might be so common that we could make syntactic sugar for it:

my Date $date { .init 'June 25, 2002' };

That's evaluating the closure for a side effect.  Or we could evaluate
it for its return value, factoring the init out into the implementation
of "now", and just get:

my Date $date { 'June 25, 2002' };

Either way, this makes data declarations more like sub declarations
in syntax, though the semantics of what you do with the final closure
when are obviously different.  That is, for ordinary data a bare {...}
is equivalent to "is now", while for a subroutine definition it's more
like "is on_demand".

Whatever.  My coffee stream hasn't yet suppressed my stream of consciousness.

Larry




Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine

On Wed, 28 Aug 2002, Dan Sugalski wrote:

> At 10:57 AM -0400 8/28/02, Deven T. Corzine wrote:
> >Would it be _possible_ to create a non-backtracking implementation of a
> >Perl 6 pattern engine, or does the existence of backtracking-related
> >operators preclude this possibility in advance?
> 
> In general, no of course it's not possible to create a 
> non-backtracking perl regex engine. Far too much of perl's regexes 
> requires backtracking.

Given that Perl 5 regexes are no longer regular (much less Perl 6), I'm 
sure this is probably true.  There may be a regular subset which could be 
implemented without backtracking if problematic features are avoided, but 
surely a complete non-backtracking implementation is beyond reach.

On the other hand, :, ::, ::: and  don't necessarily need to be a 
problem if they can be treated as hints that can be ignored.  If allowing 
the normal engine to backtrack despite the hints would change the results, 
that might be a problem.  I don't know;  may pose special problems.

Even if the new operators can't work without backtracking, maybe it doesn't 
matter, since there's surely a few others inherited from Perl 5 as well...

> That doesn't mean you can't write one for a specific subset of perl's 
> regexes, though. A medium-term goal for the regex engine is to note 
> where a DFA would give correct behaviour and use one, rather than 
> going through the more expensive generalized regex engine we'd 
> otherwise use.

I think this is a more realistic goal, and more or less what I had in mind.

I believe there are many subpatterns which might be beneficial to compile 
to a DFA (or DFA-like) form, where runtime performance is important.  For 
example, if a pattern is matching dates, a (Jan|Feb|Mar|Apr|...) subpattern
would be more efficient to implement as a DFA than with backtracking.  With 
a large amount of data to process, that could represent significant savings...
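
As a rough illustration of the idea, a set of literals can be compiled into a trie, which is a DFA for that finite set, and walked one character at a time with no backtracking. This is a minimal Perl 5 sketch; build_trie and match_at are hypothetical names, and it reports the longest match at a position, which is not what leftmost alternation would return.

```perl
use strict;
use warnings;

# Build a trie: each node is a hash keyed by the next character;
# the empty-string key marks an accepting state.
sub build_trie {
    my @words = @_;
    my %root;
    for my $word (@words) {
        my $node = \%root;
        $node = $node->{$_} ||= {} for split //, $word;
        $node->{''} = 1;
    }
    return \%root;
}

# Walk the trie from offset $pos, reading each character once.
# Returns the longest word that matches there, or undef.
sub match_at {
    my ($trie, $text, $pos) = @_;
    my ($node, $best) = ($trie, undef);
    for my $i ($pos .. length($text) - 1) {
        $node = $node->{ substr($text, $i, 1) } or last;
        $best = substr($text, $pos, $i - $pos + 1) if $node->{''};
    }
    return $best;
}

my $months = build_trie(qw(Jan Feb Mar Apr May Jun June Jul Aug Sep Oct Nov Dec));
print match_at($months, 'June 25, 2002', 0), "\n";   # prints "June"
```

Each input character is examined once, however many alternatives there are. The price is that the "Jun before June" preference of an ordered alternation is lost unless it is encoded separately.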

> If you want to head over to [EMAIL PROTECTED] and pitch in on 
> the regex implementation (it's being worked on now) that'd be great.

I'd like to do that, if I can find the time.  It would be interesting to 
make a small experimental prototype to see if DFA construction could really 
improve performance over backtracking, but it would probably need to be a 
very restricted subset of regex operations to test the idea...

However, while I'm still on perl6-language, I have two language issues to 
discuss first:

(1) Can we have a ":study" modifier in Perl 6 for patterns?

It could be a no-op if necessary, but it could take the place of Perl 5's 
"study" operator and indicate that the programmer WANTS the pattern 
optimized for maximum runtime speed, even at the cost of compile time or 
memory.  (Hey, how about a ":cram" modifier for extreme optimization? :-)

(2) Would simple alternation impede DFA-style optimizations?

Currently, a pattern like (Jun|June) would never match "June" because the 
"leftmost" match "Jun" would always take precedence, despite the normal 
"longest-match" behavior of regexes in general.  This example could be 
implemented in a DFA; would that always be the case?

Would it be better for the matching of (Jun|June) to be "undefined" and 
implementation-dependent?  Or is it best to require "leftmost" semantics?

Deven




Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine

On 28 Aug 2002, Simon Cozens wrote:

> [EMAIL PROTECTED] (Deven T. Corzine) writes:
> > Would it be _possible_ to create a non-backtracking implementation of a 
> > Perl 6 pattern engine
> 
> I don't believe that it is, but not just because of : and friends.
> Why does it matter?

I'm not saying we should dump the operators -- if we get more power by 
assuming a backtracking implementation, maybe that's a worthwhile tradeoff.

On the other hand, if we can keep the implementation possibilities more 
open, that's always a worthwhile goal, even if we're not sure if or when 
we'll ever take advantage of those possibilities, or if we even could...

It seems like backtracking is a Bad Thing, in that it leads to reprocessing 
data that we've already looked at.  On the other hand, it seems to be a 
Necessary Evil because of the memory costs of avoiding backtracking, and 
because we might have to give up valuable features without backtracking.

It may be that backreferences already demand backtracking.  Or some other 
feature might.  I don't know; I haven't thought it through.

If we must have backtracking, so be it.  But if that's a tradeoff we're 
making for more expressive and powerful patterns, we should still at least 
make that tradeoff with our eyes open.  And if the tradeoff can be avoided, 
that's even better.

Deven





Re: auto deserialization

2002-08-28 Thread Nicholas Clark

On Wed, Aug 28, 2002 at 12:17:55PM -0400, Dan Sugalski wrote:
> At 10:36 AM +0200 8/28/02, [EMAIL PROTECTED] wrote:
> >  >> Will there be automatic calling of the deserialization method
> >>>  for objects, so that code like this DWIMs...
> >
> >>>   my Date $bday = 'June 25, 2002';
> >
> >>  Err... what do you mean it to do?
> >
> >Wow, this is nice. He means (I think) that this will be translated into
> >
> >my Date $bday = Date->new('June 25, 2002');
> 
> That's really unlikely. More likely what'll happen is:
> 
>my Date $bday;
>$bday = 'June 25, 2002';
> 
> and it'll be up to $bday's string assignment code to decide what to 
> do when handed a string that looks like a date.

op wise, how is that different from the original suggestion of

my Date $bday = 'June 25, 2002';

?

> That should work OK for a variety of reasons. $bday is strongly typed 
> since you told perl what type it was in the my declaration. Date can 
> also override string assignment, thus Doing The Right Thing (pitching 
> a fit or taking a date) when you assign to it.
> 
> I can see downsides to it, though--it means you lose the compile-time 
> type checking, since just because we're getting the wrong type 
> doesn't mean it's really an error. OTOH it's not like we have strong 
> compile-time type checking now...

If the compiler were able to see that my Date $bday = 'June 25, 2002';
is one statement that both types $bday as Date, and then assigns a constant
to it, is it possible to do the conversion of that constant to a constant
$bday object at compile time? (and hence get compile time checking)
Without affecting general run time behaviour.

Nicholas Clark



Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Dan Sugalski

At 10:57 AM -0400 8/28/02, Deven T. Corzine wrote:
>Would it be _possible_ to create a non-backtracking implementation of a
>Perl 6 pattern engine, or does the existence of backtracking-related
>operators preclude this possibility in advance?

In general, no of course it's not possible to create a 
non-backtracking perl regex engine. Far too much of perl's regexes 
requires backtracking.

That doesn't mean you can't write one for a specific subset of perl's 
regexes, though. A medium-term goal for the regex engine is to note 
where a DFA would give correct behaviour and use one, rather than 
going through the more expensive generalized regex engine we'd 
otherwise use.

If you want to head over to [EMAIL PROTECTED] and pitch in on 
the regex implementation (it's being worked on now) that'd be great.
-- 
 Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk



Re: auto deserialization

2002-08-28 Thread Dan Sugalski

At 10:36 AM +0200 8/28/02, [EMAIL PROTECTED] wrote:
>  >> Will there be automatic calling of the deserialization method
>>>  for objects, so that code like this DWIMs...
>
>>>   my Date $bday = 'June 25, 2002';
>
>>  Err... what do you mean it to do?
>
>Wow, this is nice. He means (I think) that this will be translated into
>
>my Date $bday = Date->new('June 25, 2002');

That's really unlikely. More likely what'll happen is:

   my Date $bday;
   $bday = 'June 25, 2002';

and it'll be up to $bday's string assignment code to decide what to 
do when handed a string that looks like a date.

That should work OK for a variety of reasons. $bday is strongly typed 
since you told perl what type it was in the my declaration. Date can 
also override string assignment, thus Doing The Right Thing (pitching 
a fit or taking a date) when you assign to it.

I can see downsides to it, though--it means you lose the compile-time 
type checking, since just because we're getting the wrong type 
doesn't mean it's really an error. OTOH it's not like we have strong 
compile-time type checking now...
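
Perl 5 cannot overload assignment directly, but a tied scalar gives a rough feel for "string assignment code" that either takes a date or pitches a fit. This is only a sketch under that assumption: DateScalar is a hypothetical class, and it stores an ISO-style string rather than a real Date object.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a variable declared "my Date $bday":
# a tied scalar whose STORE decides what to do with an assigned string.
package DateScalar;

my @NAMES = qw(January February March April May June July
               August September October November December);
my %MONTH;
@MONTH{@NAMES} = 1 .. 12;

sub TIESCALAR { my ($class) = @_; my $value; return bless \$value, $class }
sub FETCH     { my ($self) = @_; return $$self }

sub STORE {
    my ($self, $string) = @_;
    # Take a date or pitch a fit.
    my ($name, $day, $year) = $string =~ /^(\w+) (\d{1,2}), (\d{4})$/
        or die "Not a date: '$string'\n";
    my $month = $MONTH{$name} or die "Unknown month: '$name'\n";
    $$self = sprintf '%04d-%02d-%02d', $year, $month, $day;
}

package main;

tie my $bday, 'DateScalar';
$bday = 'June 25, 2002';          # STORE parses the string
print "$bday\n";                  # prints "2002-06-25"

my $ok = eval { $bday = 'next Tuesday'; 1 };
print "rejected: $@" unless $ok;  # prints "rejected: Not a date: 'next Tuesday'"
```

Note that the rejection only happens at run time, which is the loss of compile-time checking described above.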
-- 
 Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk



Re: auto deserialization

2002-08-28 Thread David Wheeler

On Wednesday, August 28, 2002, at 06:11  AM, [EMAIL PROTECTED] 
wrote:

> Is there some kind of third option?  I have to admit I've always found 
> Java
> commands like "Date bday = new Date('June 25, 2002')" somehow 
> redundant.

I have to agree with this. Ideally, IMO, there'd be some magic going on 
behind the scenes (maybe with a pragma?) that automatically typed 
variables so we wouldn't have to be so redundant, the code would look 
more like (most) Perl 5 OO stuff, and I'd save my tendonitis. What I 
mean (ignoring for the moment the even simpler syntax suggested earlier 
in this thread) is this:

   my $date = Date.new('June 25, 2002');

Would automatically type C<$date> as a Date object.

Thoughts?

Regards,

David

-- 
David Wheeler AIM: dwTheory
[EMAIL PROTECTED] ICQ: 15726394
http://david.wheeler.net/  Yahoo!: dew7e
Jabber: [EMAIL PROTECTED]




Re: auto deserialization

2002-08-28 Thread Erik Steven Harrison


>From:  [EMAIL PROTECTED]
>> Wow, this is nice. He means (I think) that this will be translated into
>> my Date $bday = Date->new('June 25, 2002');

I don't think this is going to work. First off, there 
is no predefined constructor name in Perl. Secondly, 
you can have multiple constructors in the same class. 
And thirdly Date.new (for better or for worse) does 
not have to return a Date object.

Finally, if these problems could be surmounted (ie 
Perl 6 defines an implicit constructor), then we get 
very subtle bugs like this

my Dog $spot = Poodle.new;


$spot is typed to accept Dog subclasses, right? But 
what if Dog.new is typed to accept an object as its 
first argument? Or, worse, has no argument list? Does 
this construct turn into


my Dog $spot = Dog.new( Poodle.new );


or


my $spot is 'Dog';

$spot = new Poodle.new

-Erik





Re: Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Simon Cozens

[EMAIL PROTECTED] (Deven T. Corzine) writes:
> Would it be _possible_ to create a non-backtracking implementation of a 
> Perl 6 pattern engine

I don't believe that it is, but not just because of : and friends.
Why does it matter?

-- 
"Life sucks, but it's better than the alternative."
-- Peter da Silva



Does ::: constrain the pattern engine implementation?

2002-08-28 Thread Deven T. Corzine

I have no objection to pattern operators like ::: in principle, but I do
have a potential concern about them.

Given that the operators are actually defined in terms of "backtracking" 
within the RE engine, does this constrain the implementation such that it 
MUST be a backtracking implementation to behave correctly?

If these operators are purely efficiency optimization hints, that would be 
one thing, but I get the sense that ignoring the "hints" might lead to 
incorrect behavior.  (The  operator might be a special concern.)

Suppose, for the sake of argument, that someone wanted to make a pattern 
engine implementation, compatible with Perl 6 patterns, which was highly 
optimized for speed at the expense of memory, by RE->NFA->DFA construction 
for simultaneous evaluation of multiple alternatives without backtracking.

This might be extremely expensive in memory, but there may be some niche 
applications where run-time speed is paramount, and a pattern is used so 
heavily in such a critical way that the user might be willing to expend 
hundreds of megabytes of RAM to make the patterns execute several times 
faster than normal.  (Obviously, such a tradeoff would be unacceptable in
the general case!)

Would it be _possible_ to create a non-backtracking implementation of a 
Perl 6 pattern engine, or does the existence of backtracking-related 
operators preclude this possibility in advance?

I hope we're not constraining the implementation options by the language 
design, but I'm worried that this might be the case with these operators.

Shouldn't it be an implementation decision whether to use backtracking?

Deven




Re: Hypothetical synonyms

2002-08-28 Thread Steffen Mueller

Piers Cawley wrote:
> Uri Guttman <[EMAIL PROTECTED]> writes:
[...]
>> couldn't that be reduced to:
>>
>> m{^\s* $stuff := [ "(.*?)" | (\S+) ] };
>>
>> the | will only return one of the grabbed chunks and the result of
>> the [] group would be assigned to $stuff.
>
> Hmm... is this the first Perl 6 golf post?

Well, no, for two reasons:
a) There's whitespace.
b) The time's not quite ready for Perl6 golf because Larry's the only one
who would qualify as a referee.

And we all know that's not a recreational task :)

Steffen
--
@n=(544290696690,305106661574,116357),$b=16,@c=' ,JPacehklnorstu'=~
/./g;for$n(@n){map{$h=int$n/$b**$_;$n-=$b**$_*$h;$c[@c]=$h}c(0..9);
push@p,map{$c[$_]}@c[c($b..$#c)];$#c=$b-1}print@p;sub'c{reverse @_}




Re: Hypothetical synonyms

2002-08-28 Thread Trey Harris

In a message dated 28 Aug 2002, Aaron Sherman writes:
> Ok, just to be certain:
>
>   $_ = "0";
>   my $zilch = /0/ || 1;
>
> Is $zilch C<"0"> or 8?

8?  How do you get 8?  You'd get a result object which stringified was "0"
and booleanfied was true.  So here, you'd get a result object vaguely
isomorphic to "0 but true".
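
Something with that behaviour can be approximated in Perl 5 with the overload pragma. This is a minimal sketch, not the Perl 6 result object itself; the class name MatchResult and the helper match are hypothetical.

```perl
use strict;
use warnings;

package MatchResult;   # hypothetical name, not a Perl 6 class
use overload
    '""'     => sub { $_[0]{text} },      # string context: the matched text
    'bool'   => sub { $_[0]{matched} },   # boolean context: did it match?
    fallback => 1;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

package main;

# Hypothetical helper: run a pattern and wrap the outcome in a result object.
sub match {
    my ($text, $re) = @_;
    return $text =~ $re
        ? MatchResult->new(matched => 1,
                           text    => substr($text, $-[0], $+[0] - $-[0]))
        : MatchResult->new(matched => 0, text => '');
}

my $zilch = match('0', qr/0/) || 1;
print "stringifies to '$zilch'\n";   # prints "stringifies to '0'"
print "and is true\n" if $zilch;     # prints "and is true"
```

Because truth is carried by the object rather than by its string value, the `|| 1` never fires on a successful match, even when the matched text is "0".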

> If C<"0">, does it continue to be "true"? What about:
>
>   $_ = "0";
>   my $zilch = /0/ || 1;
>   die "Failed to match zero" unless $zilch;
>
> Is that a bug?

Yes, it's a bug, as I don't see any way to actually die there.  I don't
understand the presence of the C<|| 1> there.  I think you'd just write
C.  If you really truly wanted it to be one if it
failed, but you still wanted the die to work, you'd write:

  $_ = "0";
  my $zilch = /0/ || 1 but false;
  die "Failed to match zero" unless $zilch;

Or, more comprehensibly, just

  $_ = "0";
  my $zilch = /0/
   or die "Failed to match zero";

Trey




Re: Hypothetical synonyms

2002-08-28 Thread Aaron Sherman

On Wed, 2002-08-28 at 03:23, Trey Harris wrote:

> Note--no parens around $field.  We're not "capturing" here, not in the
> Perl 5 sense, anyway.
> 
> When a pattern consisting of only a named rule invocation (possibly
> quantified) matches, it returns the result object, which in boolean
> context returns true, but in string context returns the entire captured
> text from the named rule (so, one hopes that the C rule
> captures only the quoted text, not the quotes surrounding it).

Ok, just to be certain:

$_ = "0";
my $zilch = /0/ || 1;

Is $zilch C<"0"> or 8?

If C<"0">, does it continue to be "true"? What about:

$_ = "0";
my $zilch = /0/ || 1;
die "Failed to match zero" unless $zilch;

Is that a bug?





Re: auto deserialization

2002-08-28 Thread [EMAIL PROTECTED]

From:  [EMAIL PROTECTED]
> Wow, this is nice. He means (I think) that this will be translated into
> my Date $bday = Date->new('June 25, 2002');


I rather like it too, but it hinges on how strictly typing is enforced.  If
typing is strictly enforced then it works because the VM can always know
that since Date isn't a String, it should call the FROM_STRING static
method if such a method is available.

However, it appears that typing won't be so strictly enforced, in which
case the intent becomes ambiguous.  Does the line mean to instantiate Date
using the string, or to just assign the string to $bday and just have the
wrong type?

Is there some kind of third option?  I have to admit I've always found Java
commands like "Date bday = new Date('June 25, 2002')" somehow redundant.

-Miko







Re: auto deserialization

2002-08-28 Thread david


>> Will there be automatic calling of the deserialization method 
>> for objects, so that code like this DWIMs...

>>  my Date $bday = 'June 25, 2002';

> Err... what do you mean it to do?

Wow, this is nice. He means (I think) that this will be translated into

my Date $bday = Date->new('June 25, 2002');

As far as I've understood during my hours of lurking, it has been decided that this 
will not happen, but now is also the first time I am even slightly convinced that it 
is a good idea. 

I think it is really pretty, although it could be argued that:

a) As typing the translated code yourself would be easy, the whole idea would be a 
useless complication.

  or 

b) The translation should only happen if the class defined the method 
NEW_FROM_STRING() or some such, and that method would be used instead of new(). I 
might be afraid that would also tend to bloat the class system, but I think there must 
be a way.

Maybe a class could define the method new_from($obj) which would be called if it 
existed, and whose return value would be what was assigned to the class-hinted 
variable.
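
A rough Perl 5 sketch of that dispatch, assuming a class opts in simply by defining new_from. The helper assign_typed is hypothetical and stands in for assignment to a class-hinted variable; the Date and Plain classes here are toy examples.

```perl
use strict;
use warnings;

package Date;
# Hypothetical conversion hook, named after the suggestion above.
sub new_from {
    my ($class, $obj) = @_;
    return bless { source => "$obj" }, $class;
}

package Plain;   # a class that defines no conversion hook
sub new { return bless {}, shift }

package main;

# Use the hook when the class provides one, otherwise store the value as-is.
sub assign_typed {
    my ($class, $value) = @_;
    return $class->can('new_from') ? $class->new_from($value) : $value;
}

my $bday = assign_typed('Date', 'June 25, 2002');
print ref($bday), " built from '$bday->{source}'\n";

my $other = assign_typed('Plain', 'June 25, 2002');
print "left as a plain string: $other\n";
```

Classes that do not define the hook are unaffected, so the cost to the class system stays at one optional method.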


Is this going to be still-born?


david
--
(unbalanced brackets are really annoying



Re: Hypothetical synonyms

2002-08-28 Thread Trey Harris

In a message dated 27 Aug 2002, Uri Guttman writes:

> > "LW" == Larry Wall <[EMAIL PROTECTED]> writes:
>
>   LW> On 27 Aug 2002, Uri Guttman wrote: : and quoteline might even
>   LW> default to " for its delim which would make : that line:
>   LW> :
>   LW> : my ($fields) = /(|\S+)/;
>
>   LW> That just looks like:
>
>   LW> my $field = //;
>
> where is the grabbing there? if there was more than just shellword would
> you have to () it for a grab? wouldn't that assign a boolean like perl5
> or is the boolean result only returned in a boolean context?

Note--no parens around $field.  We're not "capturing" here, not in the
Perl 5 sense, anyway.

When a pattern consisting of only a named rule invocation (possibly
quantified) matches, it returns the result object, which in boolean
context returns true, but in string context returns the entire captured
text from the named rule (so, one hopes that the C rule
captures only the quoted text, not the quotes surrounding it).

I think this is more generalizable.  I believe that if one matches an
arbitrary rule which does not contain capturing parentheses, it returns
the result object as well, which should contain the entire match (as if
one put parens around the entire thing).  Correct?

So:

   my $vers = _ / 6/;

should cause $vers to contain either "6" or "".  A successful match object
is true in boolean context, so

   my $vers = / \d/;

would cause $vers to be true, even if the digit matched was zero.

Here's an interesting one:

   my $vers = _ / \d/; # Stringify...
   print "yes!" if $vers;  # ... and booleanize

If $vers contained "0", would it still be true?  That is, does the "is
true" property of the result object survive stringification?  It might be
useful if it did.  On the other hand, of course, one can also imagine:

   my $flag = _ (/ <[01]>/
 or die "No debug setting!");
   print "yes!" if $flag;

where one would want the truth value to follow old conventions.  Perhaps
you could write:

   my $flag = /
   [ 0 :: { $0 is false }
   | 1
   ]/;

But then you have no way short of another string comparison for teasing
out the difference between a failed match and a zero match, which is what
we were trying to get away from.

Maybe I'm just making this too complicated

> what happens to $field if no match was found? undef? the old boolean
> false of a null string wouldn't be good as that could be the result of a
> match. i assume undef could never be the result of a match unless some
> included perl code returned undef to the match object. then coder emptor
> would be the rule.

If the pattern doesn't match... will it return the undefined value, or
will it return a false (and stringwise empty) result object?  I could see
it going either way, but a failed pattern result object is fairly useless,
isn't it?

> this is gonna make all the groups that copied perl5 regexes blow their
> lids. just think about all the neat canned regexes that will be
> done. like Regex::Common but even more so. we will need a CPAN just for
> these alone. full blown *ML parsers, email verifiers, formatted data
> extractors, etc.

More and more lately, I've been finding myself getting syntax errors when
I've wishfully put Perl 6 into my code. :-)

Trey