Re: More embedding questions

2007-01-16 Thread chromatic
On Tuesday 16 January 2007 16:56, Isaac Freeman wrote:
> So, for my purposes I need an embedding interface that allows for more
> control of the interpreter, e.g. the ability to inspect/modify
> namespace(s) and eventually control which opcodes are allowed, or
> register callbacks for opcodes (e.g. file access) for security
> purposes.
>
> Tackling the first issue for now, I don't see any interface in either
> embed.h or extend.h which allows for this, and the examples I've seen
> don't show a way of doing this, at least not a way that works with the
> current code base. If such a mechanism exists I'd love to hear about
> it, but lacking that, I would like some feedback on how this stuff
> would be best implemented. I am thinking about writing some functions
> which would allow for more control of the namespaces, something like
> Parrot_get_namespace and Parrot_define_global, etc.
>
> I thought I'd ask here before just diving into it.

That's PDD 21, Namespaces, and it's not completely implemented yet.  I'll 
check in the first part of it in a little bit, as one API change required a 
deprecation cycle.

-- c


repository open for commits

2007-01-16 Thread jerry gay

i never officially closed the repo to commits, but for those of you
awaiting parrot's release, it's now complete. you may commit freely.
thanks for your patience.
~jerry


Parrot 0.4.8 Released

2007-01-16 Thread jerry gay

On behalf of the Parrot team, I'm proud to announce Parrot 0.4.8,
"Eponymous." Parrot (http://parrotcode.org) is a virtual machine aimed
at running all dynamic languages.

You may now grab Parrot 0.4.8 from the CPAN.

Parrot 0.4.8 News:
- Compilers:
  + HLLCompiler: added tracing options, modified api
  + PGE & TGE bugfixes and updates
- PAST:
  + added global and lexical variable support
  + added looping constructs, arrays, hashes
- Languages:
  + Updated PHP ("Plumhead"), Tcl ("ParTcl"),
    forth, perl6, lua, abc, APL, WMLScript, punie
  + ParTcl is passing > 24.9% of Tcl cvs-latest test suite
  + perl6 now supports hashes, arrays, method calls, arity-based
    multisubs, quoted terms, ranges (non-lazy), try blocks, $!
- Design:
  + PDD01 "Overview" - updated
  + PDD22 "I/O" - rewritten and approved
- Test Suite:
  + Converted Perl 5 Regex tests to PIR, with notable speedup
  + Added tests for opcodes, compilers, languages, and coding standards
- Build:
  + Major improvements in test coverage for 'pmc2c.pl'
- Misc:
  + many bugfixes, enhancements, and coding standard updates
  + extended support for non-core platforms including Tru64

If you'd like to develop on Parrot (or help develop Parrot itself), we
recommend that you keep up with the latest and best Parrot code by
using Subversion or SVK to access our source code repository; see
instructions at http://parrotcode.org/source.html.

Thanks to all our contributors for making this possible, and our
sponsors for supporting this project.

Share & Enjoy!
~jerry


More embedding questions

2007-01-16 Thread Isaac Freeman

So, for my purposes I need an embedding interface that allows for more
control of the interpreter, e.g. the ability to inspect/modify
namespace(s) and eventually control which opcodes are allowed, or
register callbacks for opcodes (e.g. file access) for security
purposes.

Tackling the first issue for now, I don't see any interface in either
embed.h or extend.h which allows for this, and the examples I've seen
don't show a way of doing this, at least not a way that works with the
current code base. If such a mechanism exists I'd love to hear about
it, but lacking that, I would like some feedback on how this stuff
would be best implemented. I am thinking about writing some functions
which would allow for more control of the namespaces, something like
Parrot_get_namespace and Parrot_define_global, etc.

I thought I'd ask here before just diving into it.

Thanks,

--
James "Isaac" Freeman


[perl #41280] [PDD] adding methods to subs as objects

2007-01-16 Thread via RT
# New Ticket Created by  Allison Randal 
# Please include the string:  [perl #41280]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt3/Ticket/Display.html?id=41280


 larry's most recent change to S05 (more)
 looks like I'll need the ability to attach methods to subs sooner rather than later
 (subs as real objects isn't that critical, but it's continually coming up)
 (I can continue to implement workarounds)
 patrick: put together a list of what you need sub objects to do
 I'll roll it into the objects PDD
 allison: will do -- the shortlist is that I need to be able to inherit (or compose) from .Sub so that I can create Code, Rule, etc. classes in Perl 6
 the key phrase is: Every regex in Perl 6 is required to be able to
 +return its list of initial constant strings (transitively including the
 +initial constant strings of any initial subrule called by that regex).
 it may be that this will become a method on grammar objects, though


[perl #32667] [PATCH] IMCC - documentation needs updating

2007-01-16 Thread Bram Geron via RT
Attached patch adds new syntax documentation to docs/imcc/syntax.pod and
fixes some typos there. It now also indicates where various flags are
explained.

Is the shorthand syntax for function calls ("($P0, a :slurpy) = foo(3, b
:flat)") clear, or can we better use examples there?

-- bgeron
Index: docs/imcc/syntax.pod
===
--- docs/imcc/syntax.pod	(revision 16662)
+++ docs/imcc/syntax.pod	(working copy)
@@ -132,7 +132,7 @@
 
   set S0, utf8:unicode:"«"
 
-The encoding and charset gets attaced to the string, no further processing
+The encoding and charset gets attached to the string, no further processing
 is done, specifically escape sequences are not honored.
 
 =item numeric constants
@@ -190,11 +190,12 @@
 
 ... all subroutines for language I would use a dynamic lexpad pmc.
 
-=item .sub 
+=item .sub  [: ...]
 
 =item .end
 
-Define a I with the label B.
+Define a I with the label B. See
+L for available flags.
 
 =item .emit
 
@@ -237,8 +238,12 @@
 
 =item .const   = 
 
-Define a named constant of style I and value I.
+=item .globalconst   = 
 
+Define a named constant of style I and value I only for
+this sub or globally. If I denotes a PMC type, I must be
+a string constant.
+
 =item .namespace 
 
 Open a new scope block. This "namespace" is not the same as the
@@ -320,11 +325,52 @@
 
 =back
 
-=head2 Parameter Passing Flags
+=head2 Shortcut directives for PCC call and return
 
+=over 4
+
+=item ([ [: ...], ...]) = ([arg [: ...], ...])
+
+=item  = ([arg [: ...], ...])
+
+=item ([arg [: ...], ...])
+
+=item ."_method"([arg [: ...], ...])
+
+=item ._method([arg [: ...], ...])
+
+Function or method call. These notations are shorthand for a longer
+PCC function call with B<.pcc_*> directives. I can denote a
+global subroutine, a local B or a B.
+
+=item .return ([ [: ...], ...])
+
+Return from the current compilation unit with zero or more values. 
+
+The surrounded parentheses are mandatory. Besides making sequence
+break more conspiscuous, this is necessary to distinguish this syntax
+from other uses of the B<.return> directive that will be probably
+deprecated.
+
+=item .return (args)
+
+=item .return ."somemethod"(args)
+
+=item .return .somemethod(args)
+
+Tail call: call a function or method and return from the sub with the
+function or method call return values.
+
+Internally, the call stack doesn't increase because of a tail call, so
+you can write recursive functions and not have stack overflows.
+
+=back
+
+=head2 Parameter Passing and Getting Flags
+
 See L for a description of
-the meaning of the flag bits bits C, C, C,
-and C, which correspond to the claling convention flags
+the meaning of the flag bits C, C, C,
+and C, which correspond to the calling convention flags
 C<:slurpy>, C<:optional>, C<:opt_flag>, and C<:flat>.
 
 [TODO - once these flag bits are solidified by long-term use, then we
@@ -347,6 +393,12 @@
 
 Translate to B or B.
 
+=item if null  goto 
+
+=item unless null  goto 
+
+Translate to B or B.
+
 =item ifgoto 
 
 The B B<<, <=, ==, != E= E> translate to the PASM opcodes
@@ -420,6 +472,10 @@
 
 B
 
+=item  = null
+
+B>
+
 =back
 
 =head1 SEE ALSO


Re: The S13 "is commutative" trait

2007-01-16 Thread Jonathan Lang

Luke Palmer wrote:

> Seems reasonable.  My generality alarm goes off when I realize that
> you can't specify commutativity for two of the three args, but that's
> fine because it's definitely a cpanable feature.


IIRC, it's possible to embed one signature within another one; if the
embedded signature has two parameters and "is commutative" while the
embedding signature is not commutative and has a third arg, wouldn't
that produce commutativity for two out of the three, as long as
they're adjacent?


> > Does the trait only apply within one region of the arglist, or can I
> > create a 1-arg method that is commutative between the "self" arg and its
> > data arg? (I assume not -- I can't quite work out what that would mean)
>
> That's CPAN's job, I think.


IMHO, "is commutative" should only apply to positional args: named
args have this behavior automatically, and trying to include the
invocant would tend to interfere with the self-contained nature of
classes and roles - that is, it would allow role A to define a method
for role B.

--
Jonathan "Dataweaver" Lang


[svn:perl6-synopsis] r13525 - doc/trunk/design/syn

2007-01-16 Thread larry
Author: larry
Date: Tue Jan 16 13:42:34 2007
New Revision: 13525

Modified:
   doc/trunk/design/syn/S05.pod

Log:
Clarification of relationship of hash keys to constant prefix processing.


Modified: doc/trunk/design/syn/S05.pod
==
--- doc/trunk/design/syn/S05.pod	(original)
+++ doc/trunk/design/syn/S05.pod	Tue Jan 16 13:42:34 2007
@@ -633,8 +633,18 @@
 =item *
 
 An interpolated hash matches the longest possible key of the hash
-as a literal, or fails if no key matches.  (A C<""> key
-will match anywhere, provided no longer key matches.)
+as a literal, or fails if no key matches.  (A C<""> key will match
+anywhere, provided no longer key matches.)
+
+In a context requiring a set of initial constant strings, the keys
+of the hash comprise that set of strings, and any subsequent matching
+performed by the hash values is not considered a part of those strings,
+even if that subsequent match begins by matching more constant string.
+The keys are considered to be canonicalized in the same way as any
+surrounding context, so for instance within a case-insensitive context
+the hash keys must match insensitively also.
+
+Subsequent matching depends on the hash value:
 
 =over 4
 


Re: The S13 "is commutative" trait

2007-01-16 Thread Luke Palmer

On 1/16/07, Dave Whipp <[EMAIL PROTECTED]> wrote:

> Synopsis 13 mentions an "is commutative" trait in its discussion of
> operator overloading syntax:
>
> > Binary operators may be declared as commutative:
> >
> >     multi sub infix:<+> (Us $us, Them $them) is commutative {
> >         myadd($us,$them) }
>
> A few questions:
>
> Is this restricted to only binary operators, or can I tag any
> function/method with the trait? The semantics would be that the current
> seq of ordered args to the function would be treated as a true
> (unordered) set for purposes of matching.


Seems reasonable.  My generality alarm goes off when I realize that
you can't specify commutativity for two of the three args, but that's
fine because it's definitely a cpanable feature.


> Does the fact that a match was obtained by reordering the arguments
> affect the distance metric of MMD?


Well, aside from the fact that there's no "metric" per se, I'd say no.
That is, I think saying:

   multi foo (Bar, Baz) is commutative

Should be equivalent to defining:

   multi foo (Bar, Baz)
   multi foo (Baz, Bar)

And if Bar does Baz then also:

   multi foo (Bar, Bar)

And vice versa (this variant is to remove ambiguity by picking one of
those methods, presumably the original (Bar, Baz) variant).
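
The equivalence described above can be sketched outside Perl 6. The following is a minimal Python illustration, not anything from the synopsis or an implementation of MMD: the `Multi` class, its `add` method, and the `commutative` flag are hypothetical names, dispatch is on exact argument types only, and subtyping (the "Bar does Baz" case) is deliberately ignored.

```python
class Multi:
    """Toy multi sub: dispatches on the exact types of two arguments."""

    def __init__(self):
        self.variants = {}

    def add(self, types, func, commutative=False):
        first, second = types
        # The declared signature is registered first, so it wins any tie.
        self.variants.setdefault((first, second), func)
        if commutative:
            # The swapped variant hands its arguments back in declared order.
            self.variants.setdefault((second, first), lambda x, y: func(y, x))

    def __call__(self, x, y):
        return self.variants[(type(x), type(y))](x, y)


class Bar: pass
class Baz: pass

foo = Multi()
foo.add((Bar, Baz), lambda bar, baz: (type(bar).__name__, type(baz).__name__),
        commutative=True)

print(foo(Bar(), Baz()))
print(foo(Baz(), Bar()))
```

Both calls reach the same body with the arguments in declared order, which is the sense in which one commutative declaration stands in for two plain ones.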


> Will the use of this trait catch errors such as the statement "class
> quaternion does Num" that came up a few days ago on this list
> (multiplication of quaternions isn't commutative; but of Nums is)?


Not unless you have explicitly stated somewhere, somehow, that
quaternion multiplication is definitely not commutative.  Roles often
have laws that go with them, and Perl cannot use those laws to prove
that you follow them.  Instead, the laws should be encoded in a
QuickCheckesque test suite.  So in some way, it will catch those
errors, but not at compile time: at test time.


> Does the trait only apply within one region of the arglist, or can I
> create a 1-arg method that is commutative between the "self" arg and its
> data arg? (I assume not -- I can't quite work out what that would mean)


That's CPAN's job, I think.

Luke


[svn:perl6-synopsis] r13524 - doc/trunk/design/syn

2007-01-16 Thread larry
Author: larry
Date: Tue Jan 16 13:27:58 2007
New Revision: 13524

Modified:
   doc/trunk/design/syn/S05.pod

Log:
As suggested by pmichaud++, also split C<&> into C<&> and C<&&>,
with only the latter guaranteeing order.


Modified: doc/trunk/design/syn/S05.pod
==
--- doc/trunk/design/syn/S05.pod	(original)
+++ doc/trunk/design/syn/S05.pod	Tue Jan 16 13:27:58 2007
@@ -16,7 +16,7 @@
    Date: 24 Jun 2002
    Last Modified: 16 Jan 2007
    Number: 5
-   Version: 42
+   Version: 43
 
 This document summarizes Apocalypse 5, which is about the new regex
 syntax.  We now try to call them I rather than "regular
@@ -79,7 +79,7 @@
 A logical alternation using C<|> then takes two or more of these lists
 and dispatches to the alternative that advertises the longest matching
 prefix, not necessarily to the alternative that comes first lexically.
-(However, in the case of a tie between alternatives, the first earlier
+(However, in the case of a tie between alternatives, the earlier
 alternative does take precedence.)
 
 Initial constants must take into account case sensitivity (or any other
@@ -497,10 +497,20 @@
 
 =item *
 
-The new C<&> metacharacter separates conjunctive terms.  The patterns on
-either side must match with the same beginning and end point.  The
-operator is list associative like C<|>, has higher precedence than C<|>,
-and backtracking makes the right argument vary faster than the left.   
+The new C<&> metacharacter separates conjunctive terms.  The patterns
+on either side must match with the same beginning and end point.
+Note: if you don't want your two terms to end at the same point,
+then you really want to use a lookahead instead.
+
+As with the disjunctions C<|> and C<||>, conjuctions come in both
+C<&> and C<&&> forms.  The C<&> form allows the compiler and/or the
+run-time system to decide which parts to evaluate first, and it is
+erroneous to assume either order happens consistently.  The C<&&>
+form short-circuits, and backtracking makes the right argument vary
+faster than the left.
+
+The C<&> and C<&&> operators are list associative like C<|> and C<||>,
+but have tighter precedence.
 
 =back
 


Re: Major bullet biting on | vs || within regex

2007-01-16 Thread Larry Wall
On Tue, Jan 16, 2007 at 02:05:44PM -0600, Patrick R. Michaud wrote:
: On Tue, Jan 16, 2007 at 10:41:03AM -0800, Larry Wall wrote:
: > Note, in case you don't read synopsis checkins: the previous checkin
: > majorly changes the semantics of | within regex to support required
: > longest-token matching semantics rather than left-to-right matching.
: > This is nearly on the same philosophical level as requiring the
: > tail-recursion optimization.  It will enable us to write parsers
: > more consistently, and it also opens up normal regexes to better
: > optimization via tries and such.  You can now use || for the old |
: > semantics, which is majorly consistent with how | and || work outside
: > of regexen.
: 
: Do we leave C<&> alone (as opposed to introducing a corresponding C<&&>
: operator)?  I can see arguments both ways.

Good question...

I think let's go ahead and put in && as well to guarantee order.
Then & can evaluate in any order it likes, including even interleaved
if it doesn't want one branch to get too far ahead of the other,
or if it can figure out that one branch can falsify earlier or more
often than the other.  Or it could make a dynamic decision which
branch to try first based on past history.

And it could compare prefix sets fore and aft to traverse with a
single trie if it wants to factor out the common prefix, I guess.
Though I suppose that could happen anyway.

But mostly I think we just do it for consistency, and to avoid a FAQ.
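
To make the conjunction rule concrete (both terms must match with the same beginning and end point), here is a rough Python sketch. It is not Perl 6 regex semantics and not from the synopsis: `conj_match` is a hypothetical helper built on Python's `re` module, and it simply reports the longest end position at which both patterns match the same span.

```python
import re

def conj_match(pattern_a, pattern_b, text, start=0):
    """Return the longest end such that both patterns match text[start:end],
    or None if no common span exists."""
    a = re.compile(pattern_a)
    b = re.compile(pattern_b)
    # Try the longest candidate span first, then shrink.
    for end in range(len(text), start - 1, -1):
        # The order of these two checks does not change the result, which
        # is why an unordered conjunction may evaluate either side first.
        if a.fullmatch(text, start, end) and b.fullmatch(text, start, end):
            return end
    return None

print(conj_match(r"\w+", r"[a-f]+", "cafe!"))
print(conj_match(r"\d+", r"[a-z]+", "abc"))
```

Because the two checks here have no side effects, swapping them gives the same answer; an ordered form only matters once evaluation order is observable.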

Larry


Re: Major bullet biting on | vs || within regex

2007-01-16 Thread Patrick R. Michaud
On Tue, Jan 16, 2007 at 10:41:03AM -0800, Larry Wall wrote:
> Note, in case you don't read synopsis checkins: the previous checkin
> majorly changes the semantics of | within regex to support required
> longest-token matching semantics rather than left-to-right matching.
> This is nearly on the same philosophical level as requiring the
> tail-recursion optimization.  It will enable us to write parsers
> more consistently, and it also opens up normal regexes to better
> optimization via tries and such.  You can now use || for the old |
> semantics, which is majorly consistent with how | and || work outside
> of regexen.

Do we leave C<&> alone (as opposed to introducing a corresponding C<&&>
operator)?  I can see arguments both ways.

Pm


The S13 "is commutative" trait

2007-01-16 Thread Dave Whipp
Synopsis 13 mentions an "is commutative" trait in its discussion of
operator overloading syntax:


> Binary operators may be declared as commutative:
>
>     multi sub infix:<+> (Us $us, Them $them) is commutative {
>         myadd($us,$them) }

A few questions:

Is this restricted to only binary operators, or can I tag any
function/method with the trait? The semantics would be that the current
seq of ordered args to the function would be treated as a true
(unordered) set for purposes of matching.


Does the fact that a match was obtained by reordering the arguments 
affect the distance metric of MMD?


Will the use of this trait catch errors such as the statement "class
quaternion does Num" that came up a few days ago on this list
(multiplication of quaternions isn't commutative; but of Nums is)?


Does the trait only apply within one region of the arglist, or can I 
create a 1-arg method that is commutative between the "self" arg and its 
data arg? (I assume not -- I can't quite work out what that would mean)


Major bullet biting on | vs || within regex

2007-01-16 Thread Larry Wall
Note, in case you don't read synopsis checkins: the previous checkin
majorly changes the semantics of | within regex to support required
longest-token matching semantics rather than left-to-right matching.
This is nearly on the same philosophical level as requiring the
tail-recursion optimization.  It will enable us to write parsers
more consistently, and it also opens up normal regexes to better
optimization via tries and such.  You can now use || for the old |
semantics, which is majorly consistent with how | and || work outside
of regexen.
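
The difference between the two alternations can be shown with a small Python sketch. This is only an illustration under simplifying assumptions, not the S05 algorithm: alternatives are plain literal strings, and `first_match` and `longest_token` are hypothetical helper names.

```python
def first_match(alternatives, text):
    """Left-to-right alternation (old `|`, new `||`): first hit wins."""
    for alt in alternatives:
        if text.startswith(alt):
            return alt
    return None

def longest_token(alternatives, text):
    """Longest-token alternation (new `|`): the longest matching literal
    wins; on a tie the earlier alternative is kept."""
    best = None
    for alt in alternatives:
        # Strict '>' keeps the earlier alternative when lengths are equal.
        if text.startswith(alt) and (best is None or len(alt) > len(best)):
            best = alt
    return best

keywords = ["for", "foreach"]
print(first_match(keywords, "foreach $x"))
print(longest_token(keywords, "foreach $x"))
```

With the left-to-right form the author has to order `foreach` before `for` by hand; with longest-token semantics the order of the alternatives only matters for ties.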

Larry


[svn:perl6-synopsis] r13523 - doc/trunk/design/syn

2007-01-16 Thread larry
Author: larry
Date: Tue Jan 16 11:09:42 2007
New Revision: 13523

Modified:
   doc/trunk/design/syn/S05.pod

Log:
Tweak | to provide longest-token instead of short-circuit semantics.
Now use || for old short-circuit semantics!


Modified: doc/trunk/design/syn/S05.pod
==
--- doc/trunk/design/syn/S05.pod	(original)
+++ doc/trunk/design/syn/S05.pod	Tue Jan 16 11:09:42 2007
@@ -14,9 +14,9 @@
    Maintainer: Patrick Michaud <[EMAIL PROTECTED]> and
                Larry Wall <[EMAIL PROTECTED]>
    Date: 24 Jun 2002
-   Last Modified: 23 Dec 2006
+   Last Modified: 16 Jan 2007
    Number: 5
-   Version: 41
+   Version: 42
 
 This document summarizes Apocalypse 5, which is about the new regex
 syntax.  We now try to call them I rather than "regular
@@ -67,6 +67,29 @@
 
 =back
 
+While the syntax of C<|> does not change, the default semantics do
+change slightly.   Instead of representing temporal alternation, C<|>
+now represents logical alternation with longest-token semantics.
+(You may now use C<||> to indicate the old temporal alternation.  That is,
+C<|> and C<||> now work within regex syntax much the same as they
+do outside of regex syntax, where they represent junctional and
+short-circuit OR.)  Every regex in Perl 6 is required to be able to
+return its list of initial constant strings (transitively including the
+initial constant strings of any initial subrule called by that regex).
+A logical alternation using C<|> then takes two or more of these lists
+and dispatches to the alternative that advertises the longest matching
+prefix, not necessarily to the alternative that comes first lexically.
+(However, in the case of a tie between alternatives, the first earlier
+alternative does take precedence.)
+
+Initial constants must take into account case sensitivity (or any other
+canonicalization primitives) and do the right thing even when propagated
+up to rules that don't have the same canonicalization.  That is, they
+must continue to represent the set of matches that the lower rule would
+match.  If and when the optimizer turns such a list of prefixes into,
+say, a trie, the trie must continue to have the appropriate semantics
+for the originating rule.
+
 =head1 Modifiers
 
 =over
@@ -1319,6 +1342,10 @@
 put an explicit C after the alternation to enable backing into
 another alternative if the first pick fails.
 
+The C<::> also has the effect of hiding any constant string on the right
+from "longest token" processing by C<|>.  Only the left side is evaluated
+for initial constancy.
+
 =item *
 
 Backtracking over a triple colon causes the current regex to fail


Re: Numeric Semantics

2007-01-16 Thread TSa

HaloO,

Jonathan Lang wrote:

> Agreed.  My only doubt at this point is which definition should be the
> default.  Do we go with "mathematically elegant" (E) or "industry
> standard" (F, I think)?


I think industry (language) standard is undefined behavior ;)

I'm still waiting for an answer about what Mark fears: which
calculations crossing zero are more difficult with the
Euclidean definition? Any idea?

Regards, TSa.
--


Re: Numeric Semantics

2007-01-16 Thread Jonathan Lang

TSa wrote:

> My list was sorted in decreasing order of importance with the
> F-definition beating the E-definition in popularity. So all I want is
>
>    use Math::DivMod:euclid;
>
> to get the E-definition and a
>
>    use Math::DivMod;
>
> to get them all.  The F-definition being the default when no import is
> done.


You're unlikely to ever need more than one definition at a time; so
put each in its own module and import as needed.  This will produce
both simpler code (you won't need to remember which of nearly a
half-dozen variant spellings of div or mod to use each time in order
to get the appropriate definition) and more readable code (e.g., if
you see "use Math::Modulus::Truncate" in a given lexical scope, you
know that div and mod will be using the truncating definition there).

In the rare instance that you need more than one definition at a time,
import the ones you need and qualify your div and mod calls by the
module names: e.g. 'Math::Modulus::Euclid::div' and
'Math::Modulus::Floor::div'.  The Huffman coding seems appropriate;
and if the length is excessive, it's because the module names' lengths
should be shorter (say, 'Math::ModE' instead of
'Math::Modulus::Euclid').


> A sane definition of div and % is important. A spec that leaves
> it up to the implementation to pick whatever is convenient is bad in
> my eyes.


Agreed.  My only doubt at this point is which definition should be the
default.  Do we go with "mathematically elegant" (E) or "industry
standard" (F, I think)?

--
Jonathan "Dataweaver" Lang


Re: Numeric Semantics

2007-01-16 Thread TSa

HaloO,

Smylers wrote:

> That depends on exactly what you mean by "we" and "need".


Well, with "we" I meant the Perl 6 language list and "need"
is driven by the observation that we can't agree on a single
definition, so picking your personal favorite should be
possible.



> By all means have them available as modules.


That is perfectly fine. We should have / return a Num, div return
an Int and % as the Num modulus. This somewhat leaves mod undefined.
How could we fill in that gap with a useful case? Perhaps we sneak in
Euclidean remainder? But that would not fit the F-definition div. So
we might have the pairs fdiv and % and div and mod in core and the rest
in a module. Hmm, or we drop % or use it for something else. The
simplest solution is to have mod as alias for %. Note that F-definition
and E-definition agree for a divisor greater than zero.
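
The last claim is easy to check numerically. Below is a minimal Python sketch for integers only; it is not Perl 6 and not a proposed API. The function names are illustrative, with F meaning floored, E meaning Euclidean, and a truncating variant included for comparison.

```python
def divmod_floor(a, b):
    """F-definition: the quotient rounds toward negative infinity."""
    q = a // b  # Python's integer division is already floored
    return q, a - b * q

def divmod_trunc(a, b):
    """Truncating definition: the quotient rounds toward zero."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - b * q

def divmod_euclid(a, b):
    """E-definition: the remainder always satisfies 0 <= r < |b|."""
    q, r = divmod(a, b)  # floored result; r takes the sign of b
    if r < 0:
        # Only possible when b < 0: shift into the non-negative range.
        q, r = q + 1, r - b
    return q, r

for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(a, b, divmod_floor(a, b), divmod_trunc(a, b), divmod_euclid(a, b))
```

For every positive divisor the floored and Euclidean columns coincide; they part ways only when the divisor is negative, where the floored remainder follows the sign of the divisor and the Euclidean remainder stays non-negative.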

My list was sorted in decreasing order of importance with the
F-definition beating the E-definition in popularity. So all I want is

   use Math::DivMod:euclid;

to get the E-definition and a

   use Math::DivMod;

to get them all. The F-definition being the default when no import is
done. A sane definition of div and % is important. A spec that leaves
it up to the implementation to pick whatever is convenient is bad in
my eyes.


Regards, TSa.
--