Re: PDD 2, vtables

2001-02-07 Thread Tim Bunce

On Tue, Feb 06, 2001 at 12:28:23PM -0500, Dan Sugalski wrote:
 At 11:26 AM 2/6/2001 +0000, Tim Bunce wrote:
 [First off: I've not really been paying attention so forgive me if I'm
 being dumb here.  And many thanks for helping to drive this forwards.]
 
 On Mon, Feb 05, 2001 at 05:14:44PM -0500, Dan Sugalski wrote:
  
   =head2 Core datatypes
  
   For ease of use, we define the following semi-abstract data types
 
 Probably worth stating upfront that it'll be easy to add new types
 to avoid people arguing for their favorite type to be added here.
 
 I'm not sure it should be--that'd mean extending the vtables in ways they 
 have little room to grow. Adding new perl datatypes is easy, adding new 
 low-level types is harder.

That's pretty much what I meant. I think it's worth saying.

   =item INT
   =item NUM
   =item STR
   =item BOOL
 
 What about references?
 
 Special type of scalar, not dealt with here.

But should be at least mentioned.

   =item UTF-32 string
   =item Native string
   =item Foreign string
 
 I'm a little surprised not to see UTF-8 there, but since I'm also
 confused about what Native string and Foreign string are I'll skip it.
 Except to say that some clarification here may help, and explicitly
 mentioning UTF-8 (even to say it won't be a core type and provide a
 reference to why) would be good.
 
 I didn't put UTF-8 in on purpose, because I'd just as soon not deal with it 
 internally. Variable length character data's a pain in the butt, and if we 
 can avoid having the internals deal with it except as a source that gets 
 converted to UTF-32, that's fine with me.

I agree with Branden that a default 4x memory bloat would not be popular.

 The native and foreign string data types were an attempt to accommodate 
 UTF-8, as well as ASCII and EBCDIC character data. One of the three will 
 likely be the native type, and the rest will be foreign strings. I'm not 
 sure if perl should have only one foreign string type, or if we should have 
 a type tag along with the other bits for strings.

Umm, one way or another I suspect UTF-8 will be in there.

   =item is_same
  
  BOOL   is_same(PMC1, PMC2[, key]);
  
   Returns TRUE if C<PMC1> and C<PMC2> refer to the same value, and FALSE
   otherwise.
 
 I think that needs more clarification, especially where they are of
 different types. Contrast with is_equal() below.
 
 If they're different types they can't be the same. This would be used to 
 check if two references have the same referent, or if two magic variables 
 (database handles, say) pointed to the same thing.

Okay, so say so in the PDD. "refer to the same value" isn't very clear
(the word value is probably the problem).

   =item concatenate
  
  void   concatenate(PMC1, PMC2, PMC3[, key]); ##
  
   Concatenates the strings in C<PMC2> and C<PMC3>, storing the result in
   C<PMC1>.
 
 and insert (ala sv_insert)  etc?
 
 Hadn't considered them. Care to elaborate on the etc?

Er, I haven't looked at sv.c for ages but basically all the kinds of
string manipulations that ended up in there for good reason will
probably need to be in perl6. sv_insert is a good example (and possibly
the only one :-)

   =item logical_or
   =item logical_and
   =item logical_not
 
 Er, why not just use get_bool? The only reason I can think of is to
 support three-value-logic but that would probably be better handled
 via a higher-level overloading kind of mechanism. Either way, clarify.
 
 Well, there's overloading. Plus the potential that a class will do 
 something odd with it--if you || on two custom arrays in list context you 
 might get an array with each pair (left[0] || right [0] and so on) 
 logically or'd.

Okay, don't forget xor then :)

   =item match
  
  void   match(PMC1, PMC2, REGEX[, key]);
  
   Performs a regular expression match on C<PMC2> against the expression
   C<REGEX>, placing the results in C<PMC1>.
 
 Results, plural = container = array or hash. Needs clarifying.
 
 Yep, especially since I'd considered tossing the match destination 
 entirely. (Though that means special variables, and I'm not sure I want to 
 go there) It'll likely just return true or false. I'll rethink it.

A BOOL return would be good. But "placing the results in C<PMC1>" is
also good (assuming 'results' are equiv to $1, $2 etc in perl5).

   =head1 REFERENCES
  
   PDD 3: Perl's Internal Data Types.
 
 Some references to any other vtable based languages would be good.
 (I presume people have looked at some and learnt lessons.)
 
 Alas not. This is pretty much head of zeus stuff, modulo some ego. (Mine's 
 not *that* big...)

:-)

Without studying history we may be doomed to repeat it.

So can anyone point to vtable based language implementations?

Tim.



Re: Magic [Slightly Off-Topic... please point me to documentation]

2001-02-07 Thread Bart Lateur

On Tue, 6 Feb 2001 17:53:17 -0200, Branden wrote:

It appears you're blessing one reference and returning another... like

sub new {
my $key;
my $a = \$key;
my $b = \$key;
bless $a;
return $b;
}

I think the problem is not with the overloading magic, but with the code
snippet...

A recent thread on comp.lang.perl.misc discussed how bless() works with
the reference, but allegedly, it's the underlying thing that gets
blessed, not the reference itself.

my $a = \$x;
my $b = \$x;
bless $a, 'FOO';
print $b;
--
FOO=SCALAR(0x8a652e4)

It sure looks like they're right. Oh, this is perl 5.6.0.

-- 
Bart.



Re: Another approach to vtables

2001-02-07 Thread Edwin Steiner

Edwin Steiner wrote:
 
 Dan Sugalski wrote:
 [snip]
  That's OK, since my example was wrong. (D'oh! Chalk it up to remnants of
  the martian death flu, along with too much blood in my caffeine stream) The
  example
 
$foo{bar} = $baz + $xyzzy[42];
 
  turns into
 
 baz->vtable->add[NATIVE](foo, baz, xyzzy, key);
 
 Why does the bytecode compiler know it should generate NATIVE int addition?
 Are you assuming 'use integer'? (see also next comment)
[snip]

I thought about it once more. Maybe I was confused by the *constant* NATIVE.
Are you suggesting a kind of multiple dispatch (first operand selects
the vtable, second operand selects the slot in the vtable)?

So
$dest = $first + $second
becomes
first->vtable->add[second->vtable->type(second)](dest,first,second,key);
?

or maybe
first->vtable->add[second->vtable->slot_select](dest,first,second,key);
which saves a call by directly reading an integer from the vtable of second.

(BTW, this is also how overloading with respect to the second argument
could be handled (should it be decided on the language level to do that):
There could be a slot like add[ARCANE_MAGIC] selected by
second->vtable->slot_select
which does all kinds of complicated checks and branches without any cost
for the vfunctions in the other slots.)

Such a multiple dispatch seems to me like the only solution which avoids
the following (eg. in Python):
'first + second' becomes
1. call virtual function 'add' on first
2. inside first->add do lots of checks about type of second
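
For illustration, the two-level dispatch asked about above might look like the following C sketch. Everything in it (struct pmc, slot_select as a plain integer field, the SLOT_* names, the last_slot field) is made up for the example and is not taken from the PDD; the point is only that the left operand picks the vtable, the right operand picks the slot, and the add variants themselves contain no type tests.

```c
#include <assert.h>

/* Hypothetical operand kinds; the names are illustrative only. */
enum { SLOT_NATIVE, SLOT_BIGINT, SLOT_COUNT };

struct pmc;
typedef void (*add_fn)(struct pmc *dest, const struct pmc *first,
                       const struct pmc *second);

struct vtable {
    int    slot_select;      /* slot a right operand of this type selects */
    add_fn add[SLOT_COUNT];  /* one add variant per right-operand kind */
};

struct pmc {
    const struct vtable *vtable;
    long value;
    int  last_slot;          /* records which variant ran (demo only) */
};

static void add_rhs_native(struct pmc *dest, const struct pmc *first,
                           const struct pmc *second)
{
    dest->value = first->value + second->value;
    dest->last_slot = SLOT_NATIVE;
}

static void add_rhs_bigint(struct pmc *dest, const struct pmc *first,
                           const struct pmc *second)
{
    /* A real variant would do arbitrary-precision arithmetic here. */
    dest->value = first->value + second->value;
    dest->last_slot = SLOT_BIGINT;
}

static const struct vtable native_vtable = {
    SLOT_NATIVE, { add_rhs_native, add_rhs_bigint }
};
static const struct vtable bigint_vtable = {
    SLOT_BIGINT, { add_rhs_native, add_rhs_bigint }
};

/* first selects the vtable, second selects the slot inside it. */
static void dispatch_add(struct pmc *dest, const struct pmc *first,
                         const struct pmc *second)
{
    int slot = second->vtable->slot_select;  /* plain read, no call */
    assert(slot >= 0 && slot < SLOT_COUNT);
    first->vtable->add[slot](dest, first, second);
}
```

A slot such as add[ARCANE_MAGIC] would simply be one more entry in the array, so its extra checks cost nothing for the other slots.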

-Edwin



Re: vtables: Assignment vs. Aliasing

2001-02-07 Thread Bart Lateur

[CC'ed to language, because I think it's there that it belongs]

On Mon, 5 Feb 2001 15:35:18 -0200, Branden wrote:

There are two possible things that could happen when you say:
$a = $b;
@a = @b;  # or
%a = %b;

These two things are assignment and aliasing.

No way. Although I think aliasing is a great tool, assignment is by
value. Always. (Well, except for referenced things...)

In perl5 terms:
*a = \$b;
*a = \@b;  # or
*a = \%b;

However, typeglobs are said to disappear from Perl6,

I think Larry wants to drop typeglobs themselves, i.e. keeping different
kinds of variables of the same name in one record, but not the
possibilities they offer. Aliasing is likely the most interesting
feature of them all.

...

My preference:

* Alias when assigning to a reference:
\$a = \$b;
\@a = \@b;
\%a = \%b;

I think this is a nice symmetrical syntax.

* Make aliasing the default for = and provide another way of assigning (NO
WAY!!!)

Indeed, no way.

Look, if you'd do the latter, you would not only make Perl effectively a
different language, but you'd also be missing out on one of the great
benefits of aliasing. For example, you pass a reference of a hash to a
sub, so the original hash can be accessed and modified. With the latter
syntax, you can't even do that through an alias. In the former syntax:

foo(\%bar);

sub foo {
my \%hash = shift;  # alias through reference
print $hash{FOO};
}

You can now access the passed hash as a hash, and not through the
slightly awkward syntax of accessing it through a reference:

sub foo2 {
my $hash = shift;
print $hash->{FOO};
}

(You don't think it's that awkward? Try getting a hash slice through a
hash reference. Ugh.)

-- 
Bart.



Re: Rare Salt-Water Camel May Be Separate Species

2001-02-07 Thread H . Merijn Brand

On Wed, 7 Feb 2001 09:17:30 -0500, Joshua N Pritikin wrote:
 http://www.nytimes.com/2001/02/07/science/07reuters-camel.html

Which is of no use if you don't have a subscriber ID (and do not want to have
one) to the NYT, since it is quite useless in Europe ...

-- 
H.Merijn Brand   Amsterdam Perl Mongers (http://www.amsterdam.pm.org/)
using perl-5.005.03, 5.6.0, 5.6.1, 5.7.1  623 on HP-UX 10.20  11.00, AIX 4.2
   AIX 4.3, WinNT 4.0 SP-6a, and Win2000pro often with Tk800.022 /| DBD-Unify
ftp://ftp.funet.fi/pub/languages/perl/CPAN/authors/id/H/HM/HMBRAND/




Re: Rare Salt-Water Camel May Be Separate Species

2001-02-07 Thread Simon Cozens

On Wed, Feb 07, 2001 at 03:33:39PM +0100, H . Merijn Brand wrote:
 On Wed, 7 Feb 2001 09:17:30 -0500, Joshua N Pritikin wrote:
  http://www.nytimes.com/2001/02/07/science/07reuters-camel.html
 
 Which is of no use if you don't have a subscriber ID (and do not want to have
 one) to the NYT, since it is quite useless in Europe ...

http://news.bbc.co.uk/hi/english/sci/tech/newsid_1156000/1156212.stm

-- 
I am familiar with this particular stupid user; it lives inside one's head 
and takes control at unexpected moments.
- Roger Burton West



Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

Some comments about the vtable PDD...

First a general comment. I think we really need to make it clear for
each method, which arg represents the object that is having its method
called (ie which is $self/this so to speak). One way to make this clear
would be to insist that the first arg is always $self,
but failing that, it should be explicitly mentioned for each function.

Also, can I suggest a bit of terminology, which I will use below?
I define an *empty* PMC as a PMC which exists, but does not have a valid
vtable ptr or content (and so whose methods must not be called under
any circumstances). I specifically contrast this with an undefined PMC,
which has a valid vtable pointer that points to a bunch of methods that
mostly call carp("use of undefined value ...").
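
To make the distinction concrete, here is a minimal C sketch of the two states. The struct layout, the function names and the warning counter standing in for carp() are all assumptions for the example, not part of the PDD: an empty PMC has no vtable pointer at all, while an undefined PMC has a perfectly valid vtable whose methods complain.

```c
#include <assert.h>
#include <stddef.h>

struct pmc;

struct vtable {
    long (*get_int)(struct pmc *self);
};

struct pmc {
    const struct vtable *vtable;  /* NULL means the PMC is empty */
    void *data;                   /* private payload, unused for undef */
};

static int undef_warnings = 0;    /* stands in for carp() in this sketch */

/* Method of the hypothetical 'undefined' class: warn, then yield 0. */
static long undef_get_int(struct pmc *self)
{
    (void)self;
    undef_warnings++;             /* "use of undefined value ..." */
    return 0;
}

static const struct vtable undef_vtable = { undef_get_int };

/* An empty PMC must never have a method called on it. */
static int pmc_is_empty(const struct pmc *p)
{
    return p->vtable == NULL;
}

/* Turn an empty PMC into an undefined one. */
static void pmc_make_undef(struct pmc *p)
{
    assert(pmc_is_empty(p));
    p->vtable = &undef_vtable;
    p->data = NULL;
}
```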

Now onwards and upwards


 The C<key> parameter is optional, and if passed it refers to an array
 of key structure pointers.

A mere detail, but would it not be more efficient to just pass them
as extra args, ie add(PMC1, PMC2, PMC3, key1, key2, key3),
rather than having to potentially create and populate a tmp struct
just to call the function???



IV     type(PMC[, subtype]);
 
 Returns the type of the PMC. If the subtype is passed (int, string,
 num) it returns the subtype of the PMC. This is generally a class
 function rather than a variable one, but the PMC is passed in just in
 case. (And so we can have the subtype be a vararg parameter)

I don't understand the subtype bit. If for example we pass an optional
2nd arg of INT (assuming this is some sort of enum?), what does type()
return?



 =item new
 
void   new(PMC[, key]);
 
 Creates a new variable of the appropriate type out of the passed PMC,
 destroying the current contents if there are any. This is a class
 function.

As an aside, am I right in assuming that there will be a function somewhere
(outside the scope of this PDD) that creates new, empty PMCs, which can then
be acted upon by new() to turn them into PMCs of a particular type?

Will PMCs that have just been new()ed have a default null value, eg
0/"" ?  Note that they won't be undefined - at least, I'm assuming that
undefined is handled by a separate class.
Perhaps we need instead a range of new()s that initialise to various
string, numeric etc values?

The above definition of new() implies that it first calls destroy()
to release any previous contents. I think it would be better to define
new() as operating on an empty PMC (so it is the caller's responsibility
to call destroy() first, if necessary).

Actually, I suspect that the whole area of new/clone/destroy etc will need to
be examined carefully in the light of a 'typical' variable lifecycle,
to avoid unnecessary transitions. For example, my $a = 'abc' might
involve $a going from empty -> undef -> empty -> "" -> "abc" or similar,
if we're not careful.


void   clone(PMC1, PMC2[, int flags[, key]]);
 
 Copies C<PMC2> into C<PMC1>. The C<flags> parameter notes whether
 a deep copy should be done. (Possibly other things as well, if someone
 thinks of something reasonable)

One flag that would be very useful is 'destroy', which tells clone() to
destroy PMC2 immediately after the clone operation.
This is because a clone will often be immediately followed by a destroy
of the copied PMC, and delegating the destroy() to clone() allows clone()
the chance to do things more efficiently (eg even when asked to do a deep
copy, it just copies the vtable pointer and payload pointer(s), then scrubs
the old PMC)

I also think that clone should expect PMC1 to be empty - ie it assumes the
caller has already called destroy() if necessary.
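
A rough C sketch of what such a 'destroy' flag could buy. The flag name CLONE_DESTROY, the string-only payload and the return convention are assumptions for the example: with the flag set, clone() just takes over the source's pointers and scrubs the source, and only without it does it pay for a real copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CLONE_DESTROY 0x1  /* hypothetical flag: source dies after the clone */

struct pmc {
    const void *vtable;    /* opaque in this sketch; NULL means empty */
    char *payload;         /* heap-allocated, NUL-terminated string */
};

/* Clone src into an empty dest.
   Returns 0 on success, -1 if the copy could not be allocated. */
static int pmc_clone(struct pmc *dest, struct pmc *src, int flags)
{
    size_t n;
    char *copy;

    assert(dest->vtable == NULL && dest->payload == NULL);
    assert(src->payload != NULL);

    if (flags & CLONE_DESTROY) {
        /* Take over the pointers, then scrub the old PMC: no copying. */
        *dest = *src;
        src->vtable = NULL;
        src->payload = NULL;
        return 0;
    }

    /* Ordinary clone: duplicate the payload. */
    n = strlen(src->payload) + 1;
    copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, src->payload, n);
    dest->vtable = src->vtable;
    dest->payload = copy;
    return 0;
}
```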


I guess we also need an assign() method, to handle

$a = $b and the like.

Note that assign and clone are very different operations (although assign
may well call clone): assign() copies *to* itself, while clone() copies
*from* itself.
Note that if $a is a 'simple' variable, $a->assign($b) will
itself just fall through to $b->clone($a) and let $a be wiped; while
if $a is magic or tied or whatever, then $a->assign($b) will take
a more active role in setting its own value, based on the value of $b.



void   morph(PMC, type[, key]);
 
 Tells the PMC to change itself into a PMC of the specified type.

I don't really see what the difference is between this and new().



void   destroy(PMC[, key]);
 
 Destroys the variable the PMC represents, leaving it undef.

(See also my comments earlier about new/clone/destroy etc).

I think destroy should leave an empty PMC rather than an undef one,
since as I said earlier, I think undef is a class in its own right.



 =item exists (x)
 
BOOL   exists(PMC1[, key]);


Presumably we also need defined(). (where most classes will always return
false, while the 'undefined' class always returns true.)




Re: Rare Salt-Water Camel May Be Separate Species

2001-02-07 Thread H . Merijn Brand

On Wed, 7 Feb 2001 15:05:55 +0000, Simon Cozens wrote:
 On Wed, Feb 07, 2001 at 03:33:39PM +0100, H . Merijn Brand wrote:
  On Wed, 7 Feb 2001 09:17:30 -0500, Joshua N Pritikin wrote:
   http://www.nytimes.com/2001/02/07/science/07reuters-camel.html
  
  Which is of no use if you don't have a subscriber ID (and do not want to have
  one) to the NYT, since it is quite useless in Europe ...
 
 http://news.bbc.co.uk/hi/english/sci/tech/newsid_1156000/1156212.stm

:-)) Thanks





Re: Another approach to vtables

2001-02-07 Thread Branden

Dan Sugalski wrote:
 At 05:41 PM 2/6/2001 -0200, Branden wrote:
   I actually don't see a reason why the vtable entries should be the
 opcodes.
   Is there?
  
   Speed.
  
 
 Actually, I don't see the problem of defining a C function that would do:
 
  void add(SVAR *result, SVAR *lop, SVAR *rop, SVAL *tmp1, SVAL *tmp2)
  {
      /* tmp comes from the temporary file */
      lop->vtable->FETCH(tmp1, lop);
      rop->vtable->FETCH(tmp2, rop);
      lop->vtable->ADD(tmp1, tmp1, tmp2);
      result->vtable->STORE(result, tmp1, tmp2);
  }
 
 And have it be my opcode. Passing the indexes wouldn't be a problem
either.
 Is there any problem here?

 Well, no, but what's the point? If you're calling lop's ADD vtable entry,
 why not have that entry deal with the rest of it, rather than stick things
 into, and later remove them, from temp slots?



Dan,

I see you talk about vtables and opcodes as related things. I really don't
see why you think that's necessary, I'd like to hear why you think it.

As far as I know (and I could be _very_ wrong), the primary objectives of
vtables are:
1. Allowing extensible datatypes to be created by extensions and used in
Perl.
2. Making the implementation of `tie' and `overload' more efficient ('cause
it's very slow in Perl 5).
3. Replacing Perl5's SV*,AV*,HV*,... (I don't know if it should replace or
complement -- ?)
And some secondary objectives are:
4. Allow the use of different string encodings internally.
5. Allow int's and float's to become bigint's and bigfloat's when an
overflow occurs.

Is this right or am I missing something?

The point I want to make, is that vtables are directly related to what tie
and overload are today (or sv_magic, if you want the underlying thing [I
don't know what's underlying in overload case, as it's apparently not
documented]). So I don't really see why opcodes are in the discussion. For
me, at least, tie and overload are related to data, and opcodes with
execution. They are orthogonal things, or at least should be (sure that
doesn't mean they are not related and should be implemented separately, but
they sure can).



Of course I understand some opcodes are related to data manipulation, and
should of course be modelled after vtables, but they surely can be
separated. Before I proposed the code above for `add', taking 3 SVAR's and
doing the same as what you proposed (considering no arrays/hashes). I
thought about it, and I saw that it could be extended to handle 3 PMC's,
instead of SVAR's. If they are SVAL's, they are passed directly to the
vtable, otherwise they are fetched/stored using keys that are passed by
parameters. The add method would be able to determine it by the TYPE vtable
entry (finally found a use for it...). That would actually result in the
exact same opcodes that your approach would.

[[ And I actually see an advantage here. As I expect SVAL's to be used by
temporaries, in a longer expression that involves multiple operations, all
results would be stored in SVAL's, that don't have store/fetch operation
(since it's trivial) and so it's cheaper to use them. ]] -- I actually am
wrong here. I thought having to call store/fetch would be an issue, since it
is for tied things in Perl 5, but here it only costs what the method has to
do, what gives me the hint that in Perl 6, tie will be the fastest thing
ever.



Some more about opcode dispatching. What it has to do is:
1. Fetch an instruction (that would be a byte indicating which operation)
2. Find the function that handles this instruction (that would be lookup in
a table)
3. Call the function (here I refer to the stack handling, of passing and
returning parameters and return addresses...)
4. [ Here goes the instruction computing, made by the fetched function ]
5. Some cleanup (I think nothing really expensive)
6. Loop. Go back to 1 and fetch the next instruction.

Ok. Am I right here, or am I missing something really dumb?
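
The loop enumerated above can be sketched in C roughly as follows. The three-opcode instruction set, the tiny stack machine and all names are invented for the example and say nothing about what the real interpreter will look like; the comments map each line back to the numbered steps.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 16

/* Hypothetical instruction set, one byte per instruction. */
enum { OP_HALT, OP_PUSH1, OP_ADD, OP_COUNT };

struct vm {
    long   stack[STACK_MAX];
    size_t sp;               /* number of values currently on the stack */
};

typedef void (*op_fn)(struct vm *vm);

static void op_push1(struct vm *vm)
{
    assert(vm->sp < STACK_MAX);
    vm->stack[vm->sp++] = 1;
}

static void op_add(struct vm *vm)
{
    long a, b;
    assert(vm->sp >= 2);
    b = vm->stack[--vm->sp];
    a = vm->stack[--vm->sp];
    vm->stack[vm->sp++] = a + b;
}

/* Step 2's lookup table: opcode byte -> handler function. */
static const op_fn op_table[OP_COUNT] = { NULL, op_push1, op_add };

/* Runs code until OP_HALT; returns the top of stack, or 0 if it is empty. */
static long run(const unsigned char *code)
{
    struct vm vm = { {0}, 0 };
    for (;;) {
        unsigned char op = *code++;   /* 1. fetch the instruction      */
        if (op == OP_HALT)
            break;
        assert(op < OP_COUNT);
        op_table[op](&vm);            /* 2.-4. look up, call, compute  */
    }                                 /* 5.-6. nothing to clean; loop  */
    return vm.sp ? vm.stack[vm.sp - 1] : 0;
}
```

The per-instruction overhead in this sketch is one byte load, one table load and one indirect call.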

If it's this way, what is so expensive here that makes 4 instructions so
much slower than 1 if the 4 together make the same thing as the 1? Compared
to what an `add' function would typically do, testing the argument types,
and possibly converting things to bigints, or even a `concat' function
having to allocate memory and copy possibly big blocks of bytes, I don't see
what's the problem with some CPU cycles to push the parameters to a stack...

Of course, there is probably something very dumb I'm missing here. Please
point it to me.

- Branden




Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

 Please see my previous post on the subject. As I pointed there, implementing
 || and && like that breaks short-circuits.
 
 No, it doesn't. Just because you pass in two PMCs doesn't mean that they 
 both need to be evaluated. Though the PDD does need to be clearer about how 
 that happens.

Hmmm, I can't quite see how that trick works. How would the following get
evaluated:

$opened || open(F, ...)




Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 04:02 PM 2/7/2001 +0000, David Mitchell wrote:
  Please see my previous post on the subject. As I pointed there, implementing
  || and && like that breaks short-circuits.
 
  No, it doesn't. Just because you pass in two PMCs doesn't mean that they
  both need to be evaluated. Though the PDD does need to be clearer about 
 how
  that happens.

Hmmm, I can't quite see how that trick works. How would the following get
evaluated:

$opened || open(F, ...)

The second PMC would point to a lazy list, so it wouldn't be evaluated 
unless its value gets fetched.
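
As a sketch of the idea, under the assumption that a "lazy" PMC can be modelled as a deferred function call (the thunk field, the forced flag and fake_open are all invented here), short-circuiting falls out naturally: logical_or receives both PMCs but only fetches the second when the first is false.

```c
#include <assert.h>
#include <stddef.h>

struct pmc {
    int forced;          /* has the value been computed yet? */
    int value;
    int (*thunk)(void);  /* deferred computation, or NULL for a plain value */
};

/* Fetching the value is what forces a lazy PMC. */
static int pmc_get_bool(struct pmc *p)
{
    if (!p->forced && p->thunk != NULL) {
        p->value = p->thunk();
        p->forced = 1;
    }
    return p->value != 0;
}

/* Both operands are passed in, but right is only fetched if left is false. */
static int logical_or(struct pmc *left, struct pmc *right)
{
    if (pmc_get_bool(left))
        return 1;
    return pmc_get_bool(right);
}

/* Stand-in for open(F, ...): counts how often it really runs. */
static int open_calls = 0;

static int fake_open(void)
{
    open_calls++;
    return 1;
}
```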

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Another approach to vtables

2001-02-07 Thread Dan Sugalski

At 11:13 AM 2/7/2001 +0100, Edwin Steiner wrote:
Dan Sugalski wrote:
[snip]
  That's OK, since my example was wrong. (D'oh! Chalk it up to remnants of
  the martian death flu, along with too much blood in my caffeine stream) The
  example
 
$foo{bar} = $baz + $xyzzy[42];
 
  turns into
 
 baz->vtable->add[NATIVE](foo, baz, xyzzy, key);

Why does the bytecode compiler know it should generate NATIVE int addition?
Are you assuming 'use integer'? (see also next comment)

It's in there for clarity. It's likely either been cached somewhere, or 
comes from a call to the type vtable entry for the second parameter.

  with the add routine doing
 
   IV lside = baz->vtable->get_int[NATIVE](baz, key+1);
   IV rside = xyzzy->vtable->get_int[NATIVE](xyzzy, key+2);

This code assumes $xyzzy[42] will fit in an IV. It could
be bigint, bigfloat, ... and already create an overflow when
*converted* to IV, not only when added to lside.

I know this is so because it's the add[NATIVE] implementation, right(?).

Right.

But why does the bytecode compiler (or any other code from parsing
to execution) know that NATIVE will be appropriate?
Where does this assumption about both $baz and $xyzzy[42] come from?

Well, we know about baz because it's baz's vtable being used. We know about 
$xyzzy[42] because we asked it and left that out for clarity. (Or because 
we wanted to treat it explicitly as a native integer type)

   IV sum;
   bigstr *bigsum;
   CHECK_OVERFLOW(bigstr, (sum = lside + rside));
 
   if (OVERFLOW) {
       foo->vtable->set_integer[BIGINT](foo, bigsum, key[0]);
   } else {
       foo->vtable->set_integer[NATIVE](foo, sum, key[0]);
   }
 
  and foo's set_integer storing the passed data in the database as 
 appropriate.
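
CHECK_OVERFLOW is left undefined in the pseudo-code above. One portable way to write such a check in C, offered only as a sketch (the function and type names are invented, and the bigint branch is a stub), is to test against LONG_MAX and LONG_MIN before adding, since signed overflow itself is undefined behaviour in C:

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 (leaving *sum untouched) if a + b would overflow a long;
   otherwise stores the sum and returns 0. */
static int add_would_overflow(long a, long b, long *sum)
{
    if ((b > 0 && a > LONG_MAX - b) || (b < 0 && a < LONG_MIN - b))
        return 1;
    *sum = a + b;
    return 0;
}

enum result_kind { RESULT_NATIVE, RESULT_BIGINT };

struct result {
    enum result_kind kind;
    long native;          /* only meaningful when kind == RESULT_NATIVE */
};

/* Chooses between the NATIVE and BIGINT store paths. */
static struct result checked_add(long lside, long rside)
{
    struct result r = { RESULT_NATIVE, 0 };
    if (add_would_overflow(lside, rside, &r.native))
        r.kind = RESULT_BIGINT;  /* a real version would build a bigint here */
    return r;
}
```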
 
  And if we replace that line for the correspondent set_string operation, I
  don't see the need to have add in a vtable, because I cannot 
 understand how
  it could be implemented differently for another variable aside from foo.
 
  Now, as to this...
 
  What happens if you have an overloaded array on the left side? Or a complex
  number? Or a bigint? The point of having add in the vtable (along with a
  lot of the other stuff that's in there) is so we can have a lot of
  special-purpose code rather than a hunk of general-purpose code. The idea
  is to reduce the number of tests and branches and shrink the size of the
  code we actually execute so we don't blow processor cache or have to clear
  out execution pipelines.

Having the `key' data structure looks like special-p. - general-p.
to me. Is it your idea that the key pointers will simply get passed
along most of the time and in the end there will be few branches
because eg. an array value knows it has to do indexing and a scalar
value will ignore the key pointer?

Yep. (unless we put some meaning to a key for a scalar) Container variables 
can have their vtable entries called with or without keys--passing a key of 
42 to get_integer for an array gets you the integer value of one of the 
array's entries, while calling get_integer with no key gets you the integer 
value of the entire array. (Which is probably how scalar(@array) is going 
to be implemented)
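
A small C sketch of that keyed/unkeyed behaviour for an array. The struct names and the single-index key are assumptions, and returning the element count for the unkeyed case is a guess modelled on what scalar(@array) does in perl 5:

```c
#include <assert.h>
#include <stddef.h>

struct key {
    size_t index;          /* hypothetical key structure: one array index */
};

struct array_pmc {
    const long *items;
    size_t count;
};

/* With a key: the integer value of one element.
   Without a key (NULL): the integer value of the whole array,
   taken here to be its element count. */
static long array_get_integer(const struct array_pmc *a, const struct key *key)
{
    if (key == NULL)
        return (long)a->count;
    assert(key->index < a->count);
    return a->items[key->index];
}
```

A scalar's get_integer would take the same two arguments and simply never look at the key, which is where the "few branches" come from.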

Are there estimates about the relative frequencies of
 $a
vs. $a[] or $a{}
in perl code?

I know chip has some, but I've been unable to get them from him. One of the 
points he made for his Topaz talk is that the majority of the memory a perl 
program of any size takes up is tied to its hash and array usage.

How will non-constant keys be handled? Will there be key data structures
created on the C-stack or in reused buffers? Or will it be like this:
 1. one or more ops calculate the key and fetch the PMC
from the container,
 2. the PMC is passed to other functions with a NULL key entry.

Good question. I don't know yet.


Dan





Re: PDD 2, vtables

2001-02-07 Thread Nicholas Clark

On Wed, Feb 07, 2001 at 04:03:49PM +0000, David Mitchell wrote:
 BTW, should the vtable include all the mutator operators too, ie
 ++, += and so on, on the grounds that an implementation may be able
 to do this more efficiently internally?

++ and -- are already slightly messy in perl5

pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
They know how to increment and decrement integers that don't overflow,
and call routines in sv.c to increment and decrement anything else.

Actually, this nearly provides a divide between values and operators
that has been suggested, with the speed up hack for the common case.

Nicholas Clark



Re: PDD 2, vtables

2001-02-07 Thread Branden

Nicholas Clark wrote:
 ++ and -- are already slightly messy in perl5

 pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
 They know how to increment and decrement integers that don't overflow,
 and call routines in sv.c to increment and decrement anything else.

 Actually, this nearly provides a divide between values and operators
 that has been suggested, with the speed up hack for the common case.

 Nicholas Clark


I guess everything (including get/set, add/sub/mul/...) could have a speed
up hack for the common case. The vtables (or PMC's flags as well) could have
a flag that indicate ``No tying and no overloading here, nothing special,
just another plain old variable''. Every operation would check the `special'
flag of the values it operates, and do the right thing on them. Otherwise,
call the vtable for the generic way of doing it.

I _think_ this would be a great speed up if the program doesn't use much
magic, but perhaps the overhead would be too big and make tying slower than
in Perl 5... something to consider though.
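
Roughly what I have in mind, as a C sketch. The PMC_PLAIN flag, the struct layout and the doubling "magic" are made up for the example: operations test one bit and read the value directly for plain variables, and only go through the vtable otherwise.

```c
#include <assert.h>
#include <stddef.h>

#define PMC_PLAIN 0x1  /* hypothetical flag: no tying, no overloading */

struct pmc;

struct vtable {
    long (*get_int)(struct pmc *self);
};

struct pmc {
    unsigned flags;
    long value;
    const struct vtable *vtable;  /* only consulted when PMC_PLAIN is clear */
};

static int vtable_calls = 0;      /* counts trips through the slow path */

/* Stands in for arbitrary tie/overload behaviour. */
static long tied_get_int(struct pmc *self)
{
    vtable_calls++;
    return self->value * 2;
}

static const struct vtable tied_vtable = { tied_get_int };

/* Fast path for plain variables, vtable call for everything else. */
static long fetch_int(struct pmc *p)
{
    if (p->flags & PMC_PLAIN)
        return p->value;
    return p->vtable->get_int(p);
}

static long add_ints(struct pmc *a, struct pmc *b)
{
    return fetch_int(a) + fetch_int(b);
}
```

The cost is the extra test and branch on every fetch, which is exactly the overhead I'm unsure about for the tied case.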

- Branden




Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

Nicholas Clark mused:
 On Wed, Feb 07, 2001 at 04:03:49PM +0000, David Mitchell wrote:
  BTW, should the vtable include all the mutator operators too, ie
  ++, += and so on, on the grounds that an implementation may be able
  do this more efficiently internally?
 
 ++ and -- are already slightly messy in perl5
 
 pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
 They know how to increment and decrement integers that don't overflow,
 and call routines in sv.c to increment and decrement anything else.
 
 Actually, this nearly provides a divide between values and operators
 that has been suggested, with the speed up hack for the common case.

I'm not sure I follow you. What is the "this" in "this nearly provides a
divide"?

Confused of Sheffield.




Re: PDD 2, vtables

2001-02-07 Thread Nicholas Clark

On Wed, Feb 07, 2001 at 05:19:16PM +0000, David Mitchell wrote:
 Nicholas Clark mused:
  On Wed, Feb 07, 2001 at 04:03:49PM +0000, David Mitchell wrote:
   BTW, should the vtable include all the mutator operators too, ie
   ++, += and so on, on the grounds that an implementation may be able
   do this more efficiently internally?
  
  ++ and -- are already slightly messy in perl5
  
  pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
  They know how to increment and decrement integers that don't overflow,
  and call routines in sv.c to increment and decrement anything else.
  
  Actually, this nearly provides a divide between values and operators
  that has been suggested, with the speed up hack for the common case.
 
 I'm not sure I follow you. What is the "this" in "this nearly provides a
 divide"?

this example.
I think the "nearly" probably should go.
Maybe I should have written "++ and -- in perl5 provide an example of a
(nearly clean) divide between operator and value".

 Confused of Sheffield.

Hmm. Yes. I'm confused too.

Confused of Newcastle



Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 03:09 PM 2/7/2001 +0000, David Mitchell wrote:
Some comments about the vtable PDD...

First a general comment. I think we really need to make it clear for
each method, which arg represents the object that is having its method
called (ie which is $self/this so to speak). One way to make this clear
would be to insist that the first arg is always $self,
but failing that, it should be explicitly mentioned for each function.

The docs are unclear there. I'll patch 'em up.

FWIW, generally the first argument is the destination.

Also, can I suggest a bit of terminology, which I will use below?
I define an *empty* PMC as a PMC which exists, but does not have a valid
vtable ptr or content (and so whose methods must not be called under
any circumstances). I specifically contrast this with an undefined PMC,
which has a valid vtable pointer that points to a bunch of methods that
mostly call carp("use of undefined value ...").

This is an area that needs addressing, that's for sure. I'll deal with that 
as well.

  The C<key> parameter is optional, and if passed it refers to an array
  of key structure pointers.

A mere detail, but would it not be more efficient to just pass them
as extra args, ie add(PMC1, PMC2, PMC3, key1, key2, key3),
rather than having to potentially create and populate a tmp struct
just to call the function???

Well, extra arguments cost. I'd originally had only a single key optionally 
passed in, but that meant potentially two of the three PMCs had to be real, 
rather than keyed into containers. (And thus possibly virtual) Hence the 
key array.

I admit it's not a swell decision--I'm not sure whether the single optional 
parameter is better, or several optional (or required) parameters are 
better. I can see it going either way, and I didn't put all that much 
thought into it.

 IV     type(PMC[, subtype]);
 
  Returns the type of the PMC. If the subtype is passed (int, string,
  num) it returns the subtype of the PMC. This is generally a class
  function rather than a variable one, but the PMC is passed in just in
  case. (And so we can have the subtype be a vararg parameter)

I don't understand the subtype bit. If for example we pass an optional
2nd arg of INT (assuming this is some sort of enum?), what does type()
return?

If you pass in INT, you'll get back NATIVE or BIGINT, depending on which 
one the PMC would rather be. STR would get BINARY, NATIVE, FOREIGN, or UTF_32.

Basically, subtype says "If I asked you what kind of string you were, what 
would you tell me?"

  =item new
 
 void   new(PMC[, key]);
 
  Creates a new variable of the appropriate type out of the passed PMC,
  destroying the current contents if there are any. This is a class
  function.

As an aside, am I right in assuming that there will be a function somewhere
(outside the scope of this PDD) that creates new, empty PMCs, which can then
be acted upon by new() to turn them into PMCs of a particular type?

Probably, yes. More likely, PMCs will be declared nukable unless a "clean 
me up" flag is set, in which case we'd just stomp on what was in there. 
(Since we generally don't care about the contents of a PMC when trashing 
it, with some relatively rare exceptions)

Will PMCs that have just been new()ed have a default null value, eg
0/"" ?  Note that they won't be undefined - at least, I'm assuming that
undefined is handled by a separate class.
Perhaps we need instead a range of new()s that initialise to various
string, numeric etc values?

I'm figuring that'll be handled by the bits that deal with constants, but 
there's no reason that a plain new PMC can't be undef. (Basically set the 
private data pointer to NULL and the vtable pointer to the undef class vtable)
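
To make the empty/undef distinction concrete, here is a rough sketch assuming a PMC is just a vtable pointer plus a private data pointer. The struct layout and the names pmc_is_empty, pmc_make_undef and is_defined are hypothetical, not taken from the PDD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; the PDD does not fix these names. */
typedef struct PMC PMC;

typedef struct VTable {
    const char *class_name;
    int (*is_defined)(const PMC *self);
} VTable;

struct PMC {
    const VTable *vtable; /* NULL means the PMC is "empty" */
    void *data;           /* private data owned by the class */
};

static int undef_is_defined(const PMC *self) { (void)self; return 0; }

/* One shared vtable for every undefined value. */
static const VTable undef_vtable = { "undef", undef_is_defined };

/* An empty PMC exists but has no vtable, so no method may be called on it. */
static int pmc_is_empty(const PMC *pmc) { return pmc->vtable == NULL; }

/* Turn an empty PMC into an undefined one: NULL data, undef class vtable. */
static void pmc_make_undef(PMC *pmc)
{
    assert(pmc_is_empty(pmc));
    pmc->data = NULL;
    pmc->vtable = &undef_vtable;
}
```

After pmc_make_undef() the PMC can safely have methods called on it, which is exactly what is forbidden for an empty one.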

The above definition of new() implies that it first calls destroy()
to release any previous contents. I think it would be better to define
new() as operating on an empty PMC (so it is the caller's responsibility
to call destroy() first, if necessary).

destroy will only be called if the PMC needs it. Most PMCs will just get 
their contents trashed and let the GC clean up after.

Actually, I suspect that the whole area of new/clone/destroy etc will need to
be examined carefully in the light of a 'typical' variable lifecycle,
to avoid unnecessary transitions. For example, my $a = 'abc' might
involve $a going from empty -> undef -> empty -> "" -> "abc" or similar,
if we're not careful.

That's an area for the optimizer. I'd like it to go from empty -> "abc",
assuming we don't skip the empty step.

 void   clone(PMC1, PMC2 [, int flags[, key]]);
 
  Copies C<PMC2> into C<PMC1>. The C<flags> parameter notes whether
  a deep copy should be done. (Possibly other things as well, if someone
  thinks of something reasonable)

One flag that would be very useful is 'destroy', which tells clone() to
destroy PMC2 immediately after the clone operation.

That makes no sense to me. If we're cloning PMC2 then trashing it, why not 
just set 

Re: Another approach to vtables

2001-02-07 Thread Branden


Dan,

I think there is a real problem with your vtable approach. It involves
tying, overloading and assignment. I'm not sure if I really got what you
meant with the PDD, but I'm assuming:
1. PMC's replace SV*.
2. Tying is handled by vtables that implement set_* and get_* entries to do
the magic stuff.
3. Overloading is handled by vtables that implement add/subtract/mul/...
entries to do the magic stuff.
4. There's only one vtable for each class of variables, i.e. all variables
tied to class X share the same vtable, and all objects that have overloading
defined by class Y share the same vtable (this seems obvious, since it's the
vtables that define the behaviour in case of tying/overloading).
5. On a $a = $b assignment, the PMC corresponding to $a after the assignment
is the same one that corresponded to it before. I.e. $a is set to the value of
$b by calling set_* methods of $a, passing some information about $b or $b
itself as parameters, and is not replaced with a newly generated PMC derived
from $b somehow.

The problem I see concerns assignment, like $a = $b. What will be the vtable
of $a?

Suppose $b is tied. The vtable of $a should not be the one of $b, since by
this assignment $a doesn't get tied, it only fetches the value of $b. That
means the vtable of $a should probably be one of the `plain' vtables of
perl, to handle simple datatypes (of course it could be a special one, but
it doesn't matter here).

Now suppose $b is overloaded. For instance, let's say $b is a bigint, with
overloaded +,-,*,... to do the right thing in the case it's operated with
other values. Then, by $a = $b, $a receives the bigint stored in $b, and
should inherit (bad word) or copy $b's behaviour of add/subtract/mul/... .
As I suppose a new vtable wouldn't be created (as this would cause one
vtable per variable on multiple following assignments), I presume the vtable
of $b would have to be copied to $a.

But now we get into a contradiction: if $b is tied, the vtable should not be
copied; but if $b is overloaded, it should be copied. Now what should happen
if $b is both tied and overloaded? That's not impossible, since $b can, for
example, be tied to a random number generator and be programmed to return
bigints of 512 bits (that's actually only to say it's thinkable to have tie
and overload together, I'm not expecting someone would do something like
this...).

Other awkward consequences would happen if $a is tied and $b is overloaded.
$a = $b would make $a have to use $b's add/subtract/mul/..., but using $b's
vtable in its entirety would untie $a, right?

That's actually what made me feel the need for a separation between
store/fetch and add/subtract/mul/... . I've been trying to figure out how
your proposal would fit this situation, but I couldn't find a way...



I actually don't know if my assumptions are wrong, and tying and overloading
would not be handled by set_*/get_* and add/subtract/mul/..., but I actually
can't see another way.

What do you think about it?

- Branden




Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

   ++ and -- are already slightly messy in perl5
   
   pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
   They know how to increment and decrement integers that don't overflow,
   and call routines in sv.c to increment and decrement anything else.
   
   Actually, this nearly provides a divide between values and operators
   that has been suggested, with the speed up hack for the common case.
  
  I'm not sure I follow you. What is the "this" in "this nearly provides a
  divide"?
 
 this example.
 I think the "nearly" probably should go.
 Maybe I should have written "++ and -- in perl5 provide an example of a
 (nearly clean) divide between operator and value"

Well, many of the vtable methods are operator-ish rather than value-ish,
presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
have add(), concatenate() etc. Which leads me back to: I'm not sure
whether you are in favour of, or oppose, += etc being vtable methods.
 
  Confused of Sheffield.
 
 Hmm. Yes. I'm confused too.
 
 Confused of Newcastle

Fancy swapping some cutlery for some Brown Ale? ;-)




Re: PDD 2, vtables

2001-02-07 Thread Branden

Dan Sugalski wrote:
 At 03:09 PM 2/7/2001 +, David Mitchell wrote:
 A mere detail, but would it not be more efficient to just pass them
 as extra args, ie add(PMC1, PMC2, PMC3, key1, key2, key3),
 rather than having to potentially create and populate a tmp struct
 just to call the function???

 Well, extra arguments cost. I'd originally had only a single key optionally
 passed in, but that meant potentially two of the three PMCs had to be real,
 rather than keyed into containers. (And thus possibly virtual) Hence the
 key array.

 I admit it's not a swell decision--I'm not sure whether the single optional
 parameter is better, or several optional (or required) parameters are
 better. I can see it going either way, and I didn't put all that much
 thought into it.



I think filling an array costs at least as much as passing parameters, not
to mention the cost of passing that array as a parameter, also... Unless the
array is constant and can be pre-filled by the compiler (which I think is a
somewhat rare case, considering all the three arguments), or once filled
there are cases it can be reused, I'm not sure it's worth using an array to
save 2 pushes into the stack...
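
For comparison, the two calling conventions under discussion might look roughly like this. It is only a sketch: the toy PMC (four integer slots), the Key struct and the names add_keyarray and add_keyargs are invented for illustration, and a NULL key is assumed to mean "the PMC itself" (slot 0 here).

```c
#include <assert.h>
#include <stddef.h>

typedef struct PMC { long slots[4]; } PMC;   /* toy container */
typedef struct Key { int index; } Key;       /* toy key: an array index */

/* A NULL key addresses the PMC itself, modelled here as slot 0. */
static long *resolve(PMC *pmc, const Key *key)
{
    return &pmc->slots[key ? key->index : 0];
}

/* Variant 1: one optional array of three key pointers (dst, left, right). */
static void add_keyarray(PMC *dst, PMC *l, PMC *r, const Key *const *keys)
{
    const Key *kd = keys ? keys[0] : NULL;
    const Key *kl = keys ? keys[1] : NULL;
    const Key *kr = keys ? keys[2] : NULL;
    *resolve(dst, kd) = *resolve(l, kl) + *resolve(r, kr);
}

/* Variant 2: the keys passed as three extra arguments. */
static void add_keyargs(PMC *dst, PMC *l, PMC *r,
                        const Key *kd, const Key *kl, const Key *kr)
{
    *resolve(dst, kd) = *resolve(l, kl) + *resolve(r, kr);
}
```

In the array variant the caller has to build keys[] before each call unless it can be reused; in the argument variant an unkeyed call still passes three NULLs. Which costs more is the sort of thing only measurement will settle.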

- Branden




Re: PDD 2, vtables

2001-02-07 Thread Nicholas Clark

On Wed, Feb 07, 2001 at 05:54:14PM +, David Mitchell wrote:
 Well, many of the vtable methods are operator-ish rather than value-ish,
 presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
 have add(), concatenate() etc. Which leads me back to: I'm not sure
 whether you are in favour of, or oppose, += etc being vtable methods.

I'm not either. They feel like they should be operators.
But I don't like the thought of going in and out of a lot of generic
routines for

$a = 3;
$a += 2;

when the integer scalar ought to know what the inside of another integer
scalar looks like, and that 2 + 3 doesn't overflow.

Hmm. += isn't another opcode
it's a special case of a = b + c where the PMCs for a and b are the same
thing. And I see no real reason why it can't be part of the + entry.


Nicholas Clark



Re: PDD 2, vtables

2001-02-07 Thread Branden

David Mitchell wrote:

 Well, many of the vtable methods are operator-ish rather than value-ish,
 presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
 have add(), concatenate() etc. Which leads me back to: I'm not sure
 whether you are in favour of, or oppose, += etc being vtable methods.


Oppose. (Actually I'm talking about my idea on vtables, i.e. separate +/-/*
in one vtable and store/fetch in another). My proposal on ++ and -- would be
having the `value'-part of the vtable (the one that handles +/-/*) return a
value corresponding to what would be the value of it after an increment or
decrement. store would be used to actually commit the ++/-- operation. This
would serve both postfix and prefix cases, because in one case the value
before the store would be used, and in the other the one after.
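
A small sketch of this fetch/compute/store split, assuming integer values only. The names Value, Variable, value_succ, var_fetch and var_store are hypothetical; the point is just that prefix and postfix differ only in which value is handed back.

```c
#include <assert.h>

typedef struct Value    { long ival; }     Value;     /* the 'value' part */
typedef struct Variable { Value current; } Variable;  /* the store/fetch part */

/* 'Value' vtable entry: what the value would be after an increment. */
static Value value_succ(Value v)
{
    Value next = { v.ival + 1 };
    return next;
}

static Value var_fetch(const Variable *var)    { return var->current; }
static void  var_store(Variable *var, Value v) { var->current = v; }

/* ++$a : commit the new value, then use the value after the store. */
static Value pre_inc(Variable *var)
{
    Value next = value_succ(var_fetch(var));
    var_store(var, next);
    return next;
}

/* $a++ : commit the new value, but use the value from before the store. */
static Value post_inc(Variable *var)
{
    Value old = var_fetch(var);
    var_store(var, value_succ(old));
    return old;
}
```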

(I was just reminded of the C++ overloading of ++, which uses a dummy parameter
to tell whether it's a pre- or a post-increment. So bad...)

- Branden




Re: Another approach to vtables

2001-02-07 Thread Dan Sugalski

At 01:35 PM 2/7/2001 -0200, Branden wrote:
Dan Sugalski wrote:
  At 05:41 PM 2/6/2001 -0200, Branden wrote:
I actually don't see a reason why the vtable entries should be the opcodes.
Is there?
   
Speed.
   
  
  Actually, I don't see the problem of defining a C function that would do:
  
   void add(SVAR *result, SVAR *lop, SVAR *rop, SVAL *tmp1, SVAL *tmp2)
   {
       /* tmp comes from the temporary file */
       lop->vtable->FETCH(tmp1, lop);
       rop->vtable->FETCH(tmp2, rop);
       lop->vtable->ADD(tmp1, tmp1, tmp2);
       result->vtable->STORE(result, tmp1, tmp2);
   }
  
  And have it be my opcode. Passing the indexes wouldn't be a problem either.
  Is there any problem here?
 
  Well, no, but what's the point? If you're calling lop's ADD vtable entry,
  why not have that entry deal with the rest of it, rather than stick things
  into, and later remove them, from temp slots?
 


Dan,

I see you talk about vtables and opcodes as related things. I really don't
see why you think that's necessary, I'd like to hear why you think it.

They are. Since the only code that will be calling vtable routines will be 
the opcode functions, designing the two to go hand in hand makes sense to me.

As far as I know (and I could be _very_ wrong), the primary objectives of
vtables are:
1. Allowing extensible datatypes to be created by extensions and used in
Perl.

Secondarily, yes.

2. Making the implementation of `tie' and `overload' more efficient ('cause
it's very slow in Perl 5).

No, not at all. This isn't really a consideration as such. (The vtable 
functions as designed are inadequate for most overloading, for example)

3. Replacing Perl5's SV*,AV*,HV*,... (I don't know if it should replace or
complement -- ?)

Not really a goal.

You forgot #0.

Go faster.

The big reason is to make the functions that perl calls smaller with fewer 
branches. We can't avoid branches--code that makes no decisions is 
generally dull--but we want to make as few as we can. Making the vtable 
code targeted means the functions don't have to check much at runtime.
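
As a rough illustration of the "fewer branches" argument, assuming a PMC carries a vtable pointer and that each class supplies its own add routine (all names below are hypothetical): the opcode makes one indirect call, and the routine it lands in already knows the type of its left operand.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PMC PMC;
typedef struct VTable {
    void (*add)(PMC *dst, const PMC *l, const PMC *r);
} VTable;
struct PMC { const VTable *vtable; long ival; double nval; };

static void int_add(PMC *dst, const PMC *l, const PMC *r);
static void num_add(PMC *dst, const PMC *l, const PMC *r);

static const VTable int_vtable = { int_add };
static const VTable num_vtable = { num_add };

/* Targeted routine: the left operand is known to be an integer. For brevity
 * this sketch also assumes the right operand is one and ignores overflow;
 * a real routine would still have to check both. */
static void int_add(PMC *dst, const PMC *l, const PMC *r)
{
    dst->vtable = &int_vtable;
    dst->ival = l->ival + r->ival;
}

/* Same idea for floating-point numbers. */
static void num_add(PMC *dst, const PMC *l, const PMC *r)
{
    dst->vtable = &num_vtable;
    dst->nval = l->nval + r->nval;
}

/* The opcode body: no switch on type, just one indirect call. */
static void op_add(PMC *dst, const PMC *l, const PMC *r)
{
    l->vtable->add(dst, l, r);
}
```

Without vtables, op_add would need a switch or if-chain over every known type before doing any arithmetic.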

And some secondary objectives are:
4. Allow the use of different string encodings internally.

Nope. Happy side-effect.

5. Allow int's and float's to become bigint's and bigfloat's when an
overflow occurs.

Nope, not a reason for vtables. You can do it just fine without them. (Just 
means more code in the opcode functions that do math)

Is this right or am I missing something?

Missing something, as you can see. That's OK, though. The vtable PDD should 
have made all this stuff clear in the preamble text. I've been assuming 
folks have been following along since the beginning. (And possibly know 
about the other stuff in my head (no, not *that* stuff. The other other 
stuff...))

The point I want to make, is that vtables are directly related to what tie
and overload are today (or sv_magic, if you want the underlying thing [I
don't know what's underlying in overload case, as it's apparently not
documented]).

No, they aren't. tie and overloading are a level or three up from this.

So I don't really see why opcodes are in the discussion. For
me, at least, tie and overload are related to data, and opcodes with
execution. They are orthogonal things, or at least should be (sure that
doesn't mean they are not related and should be implemented separately, but
they sure can).

You're thinking at too high a level, or have too many levels jammed 
together into one. This really isn't the place to be doing all of 
overloading, nor all of tying. Also, since the vtable won't be exposed to 
the rest of the world, we don't want to force ties and overloads to use it, 
since it means we'll need to change lots of stuff if we toss vtables for 
some reason. (like, say, their performance turns out to be bad)

Of course I understand some opcodes are related to data manipulation, and
should of course be modelled after vtables, but they surely can be
separated. Before I proposed the code above for `add', taking 3 SVAR's and
doing the same as what you proposed (considering no arrays/hashes). I
thought about it, and I saw that it could be extended to handle 3 PMC's,
instead of SVAR's. If they are SVAL's, they are passed directly to the
vtable, otherwise they are fetched/stored using keys that are passed by
parameters. The add method would be able to determine it by the TYPE vtable
entry (finally found a use for it...). That would actually result in the
exact same opcodes that your approach would.

Well, not exactly. If you're fetching data out of arrays and hashes, the 
fetch_scalar/get_value pair may well end up creating temporary scalars. If 
the source array's declaration is:

   my @foo : int;

Then there aren't any scalars inside of @foo, and fetching them out means 
creating new ones.

That's why the keys are used, so we can avoid that in some cases.
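
A sketch of the difference, assuming `my @foo : int` is stored as a bare array of native integers. IntArray, Scalar, fetch_scalar and add_keyed are made-up names for this illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Storage for something like `my @foo : int`: native integers, no scalars. */
typedef struct IntArray { size_t len; long *items; } IntArray;
typedef struct Scalar   { long ival; } Scalar;

/* Unkeyed path: an element must first be wrapped in a temporary scalar.
 * The caller owns the result and must free() it. */
static Scalar *fetch_scalar(const IntArray *a, size_t key)
{
    Scalar *tmp = malloc(sizeof *tmp);
    if (tmp != NULL)
        tmp->ival = a->items[key];
    return tmp;
}

/* Keyed path: the containers are addressed directly, so no temporaries
 * are created. Bounds checking is omitted for brevity. */
static void add_keyed(IntArray *dst, size_t kd,
                      const IntArray *l, size_t kl,
                      const IntArray *r, size_t kr)
{
    dst->items[kd] = l->items[kl] + r->items[kr];
}
```

Adding two elements through fetch_scalar() costs two allocations plus a store; add_keyed() does the same work with none.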

Some more about opcode dispatching. What it has to do is:
1. Fetch an instruction (that would be a byte 

Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 06:12 PM 2/7/2001 +, Nicholas Clark wrote:
On Wed, Feb 07, 2001 at 05:54:14PM +, David Mitchell wrote:
  Well, many of the vtable methods are operator-ish rather than value-ish,
  presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
  have add(), concatenate() etc. Which leads me back to: I'm not sure
  whether you are in favour of, or oppose, += etc being vtable methods.

I'm not either. They feel like they should be operators.
But I don't like the thought of going in and out of a lot of generic
routines for

$a = 3;
$a += 2;

when the integer scalar ought to know what the inside of another integer
scalar looks like, and that 2 + 3 doesn't overflow.

That particular case would get caught by the optimizer (I'd hope) so it'd 
not be an issue anyway.

Hmm. += isn't another opcode
it's a special case of a = b + c where the PMCs for a and b are the same
thing. And I see no real reason why it can't be part of the + entry.

Whether a special case in the code would get a speedup or not's up in the 
air. (Is the test and branch faster than a generic doing it routine?) I'd 
want to test that and see before I decided.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 04:15 PM 2/7/2001 -0200, Branden wrote:
David Mitchell wrote:
 
  Well, many of the vtable methods are operator-ish rather than value-ish,
  presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
  have add(), concatenate() etc. Which leads me back to: I'm not sure
  whether you are in favour of, or oppose, += etc being vtable methods.
 

Oppose. (Actually I'm talking about my idea on vtables, i.e. separate +/-/*
in one vtable and store/fetch in another). My proposal on ++ and -- would be
having the `value'-part of the vtable (the one that handles +/-/*) return a
value corresponding to what would be the value of it after an increment or
decrement. store would be used to actually commit the ++/-- operation. This
would serve both postfix and prefix cases, because in one case the value
before the store would be used, and in the other the one after.

Splitting the vtable into two pieces, with one piece not tied to a PMC, 
makes some things impossible. Consider this:

   @foo = @bar * @baz;

where all three arrays are really matrix types. In the separate load/store 
and do vtable scheme it means you get the value of @bar and @baz in scalar 
context, and multiply the results. Two operations, and the resultant values 
are sanitized. In the single vtable scheme, we'd execute @bar's multiply
routine, which would be clever enough (because we wrote it that way) to see 
the second parameter's also a matrix, and do matrix math.

Splitting things up also loses information when moving data between the two 
vtables routines. While that's not a big deal generally (as the info lost 
is irrelevant) it forbids some rather interesting side cases.
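
Here is a sketch of the single-vtable case, assuming 2x2 matrices and a multiply entry that inspects its right operand. The layout and names are hypothetical, and only matrix * matrix, matrix * scalar and scalar * scalar are modelled.

```c
#include <assert.h>

typedef struct PMC PMC;
typedef struct VTable {
    void (*multiply)(PMC *dst, const PMC *l, const PMC *r);
} VTable;
/* Matrices use all of m; scalars use only m[0][0]. */
struct PMC { const VTable *vtable; double m[2][2]; };

static void matrix_multiply(PMC *dst, const PMC *l, const PMC *r);
static void scalar_multiply(PMC *dst, const PMC *l, const PMC *r);

static const VTable matrix_vtable = { matrix_multiply };
static const VTable scalar_vtable = { scalar_multiply };

/* Scalar on the left: only scalar * scalar is modelled in this sketch. */
static void scalar_multiply(PMC *dst, const PMC *l, const PMC *r)
{
    dst->vtable = &scalar_vtable;
    dst->m[0][0] = l->m[0][0] * r->m[0][0];
}

/* Matrix on the left: look at the right operand and pick the right math. */
static void matrix_multiply(PMC *dst, const PMC *l, const PMC *r)
{
    double out[2][2];   /* computed separately so dst may alias l or r */
    int i, j;

    for (i = 0; i < 2; i++) {
        for (j = 0; j < 2; j++) {
            if (r->vtable == &matrix_vtable)
                out[i][j] = l->m[i][0] * r->m[0][j] + l->m[i][1] * r->m[1][j];
            else
                out[i][j] = l->m[i][j] * r->m[0][0];   /* scale by a scalar */
        }
    }
    for (i = 0; i < 2; i++)
        for (j = 0; j < 2; j++)
            dst->m[i][j] = out[i][j];
    dst->vtable = &matrix_vtable;
}
```

With separate fetch and operate tables, matrix_multiply would never see that its right operand is a matrix, because both sides would already have been flattened to plain values.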

(I was just reminded of the C++ overloading of ++, which uses a dummy parameter
to tell whether it's a pre- or a post-increment. So bad...)

I'm not sure it's worth having both a preinc and postinc operator, as 
opposed to splitting it into an inc/fetch or fetch/inc pair. (And yes, I 
know, earlier I was arguing that one opcode's better than two. This one's 
rare enough that profiling it would probably be in order...)

Dan





Re: Another approach to vtables

2001-02-07 Thread Branden

Dan Sugalski wrote:

 2. Making the implementation of `tie' and `overload' more efficient ('cause
 it's very slow in Perl 5).

 No, not at all. This isn't really a consideration as such. (The vtable
 functions as designed are inadequate for most overloading, for example)



Well, if it's not tie/overload, I didn't really understand why a vtable
would have to be attached to a variable. I'd really like to see an example
of variables whose vtables would have set_* and get_* different one from
another, and another example of variables whose vtables would have
add/subtract/mul/... different one from another. What happens with vtables
on assignment? (in $a = $b, $a copies its vtable from $b or not?)

And I really don't see why tie/overload couldn't be handled in a level below
the level of the opcodes (in a sense that one opcode calls various methods
of a (potentially) tied/overloaded variable/value).

The example of `my @a :int' really shows your point. I was actually thinking
of current Perl5 syntax as the target, and I really wouldn't know how to deal
with this... (but sure I'll think about it!)

- Branden




Re: Another approach to vtables

2001-02-07 Thread David Mitchell

 Dan,
 
 I think there is a real problem with your vtable approach.

[ etc etc ]

I think there's an important misconception about tieing and overloading
going on here which I will attempt to clear up. (Then Dan and co
can point out that I'm wrong and have just made matters worse ;-)

First off, people should be very clear that there are two *completely*
different types of tieing and overloading possible in Perl 6, which
I shall call Perl-mode and C-mode for convenience.

Perl 5 only supports Perl-mode tieing and overloading - ie, where
the tied or overloaded functions that get called are Perl functions.
This is slow and heavyweight, but it is easy to code (ie you write
a Perl module with a few functions).
In addition, Perl 6 allows C-mode tieing and overloading. This is where
the tied or overloaded functions that get called are C functions.
This is much more low-level, but much faster. It's also much harder to
code. C-mode tieing and overloading is what vtables are. Ie you
can write a custom data type and have control over how its data is
accessed, and how its data is operated on.

Note also that C-mode overloading and Perl-mode overloading are quite
different at the language level: Perl 5 overloading is really the overloading
of *reference* operators. Ie if you have

my ($a, $b);
my ($ra, $rb) = (\$a, \$b);

Then at this point $ra + $rb gives you nonsense (last time I looked, it
returned the sum of the 2 addresses).

Now if you bless $a, $b into some overloaded class, then
$ra + $rb suddenly starts doing what you want - ie you have overloaded
the definition of addition *on references*.
Importantly, $a + $b still doesn't do what you want.

Under Perl 6, with C-mode overloading, $a + $b *will* DWIM.

Interestingly enough, perl-mode tieing and overloading will presumably
be implemented using vtables.

For example, when you call the add() vtable method associated with
a tied variable, that add() method just calls the (Perl) FETCH function
from the relevant module, then just passes control on to the add() vtable
method associated with the PMC returned by FETCH, passing through the
original args.
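
That delegation could be sketched as follows. This is only an illustration: the Perl-level FETCH is modelled by a C function pointer, and the fetch and backing fields, demo_fetch and the fetch_calls counter are all invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PMC PMC;
typedef struct VTable {
    void (*add)(PMC *dst, PMC *l, PMC *r);
} VTable;
struct PMC {
    const VTable *vtable;
    long ival;                 /* payload for plain integers */
    PMC *(*fetch)(PMC *self);  /* stand-in for the Perl-level FETCH */
    PMC *backing;              /* what this toy FETCH returns */
};

static int fetch_calls = 0;    /* lets us observe that FETCH really ran */

static void int_add(PMC *dst, PMC *l, PMC *r);
static void tied_add(PMC *dst, PMC *l, PMC *r);

static const VTable int_vtable  = { int_add };
static const VTable tied_vtable = { tied_add };

static void int_add(PMC *dst, PMC *l, PMC *r)
{
    dst->vtable = &int_vtable;
    dst->ival = l->ival + r->ival;
}

/* add() for a tied variable: run FETCH, then hand control to the add()
 * of whatever PMC came back, passing the other arguments through. */
static void tied_add(PMC *dst, PMC *l, PMC *r)
{
    PMC *value = l->fetch(l);
    value->vtable->add(dst, value, r);
}

static PMC *demo_fetch(PMC *self)
{
    fetch_calls++;
    return self->backing;
}
```

Only tied_add knows that a tie is involved; the caller and int_add are unchanged.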

Thus, the only(-ish) place within the perl src that needs to understand
about Perl-mode ties and overloading is within a handful of vtable classes





Re: Another approach to vtables

2001-02-07 Thread Branden

Branden wrote:

 Well, if it's not tie/overload, I didn't really understand why a vtable
 would have to be attached to a variable. I'd really like to see an example
 of variables whose vtables would have set_* and get_* different one from
 another, and another example of variables whose vtables would have
 add/subtract/mul/... different one from another.


Try to answer that myself:

my @a : int;
@a = (1, 2, 3);

@a would have set_* and get_* different from the same entries in the usual
array of scalars.

@a = @b * @c

when @b and @c are matrices instead of arrays, mul would be different from
the same entries in the usual array of scalars.



 What happens with vtables
 on assignment? (in $a = $b, $a copies its vtable from $b or not?)


Already answered: $a can exchange its PMC for another one, and ties should be
done above that.


 And I really don't see why tie/overload couldn't be handled in a level below
 the level of the opcodes (in a sense that one opcode calls various methods
 of a (potentially) tied/overloaded variable/value).


I think this still would be a good thing, at least in cases scalars are
considered (I'll still think a little about my @a : int...).

- Branden




Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

 I'm not either. They feel like they should be operators.
 But I don't like the thought of going in and out of a lot of generic
 routines for
 
 $a = 3;
 $a += 2;
 
 when the integer scalar ought to know what the inside of another integer
 scalar looks like, and that 2 + 3 doesn't overflow.
 
 That particular case would get caught by the optimizer (I'd hope) so it'd 
 not be an issue anyway.
 
 Hmm. += isn't another opcode
 it's a special case of a = b + c where the PMCs for a and b are the same
 thing. And I see no real reason why it can't be part of the + entry.
 
 Whether a special case in the code would get a speedup or not's up in the 
 air. (Is the test and branch faster than a generic doing it routine?) I'd 
 want to test that and see before I decided.

Are we all clear then, that in perl 6, since the opcodes etc are no longer
allowed to rummage around in the internals of a PMC, it's purely a question
of whether $a += 3 invokes

add($a,$a,3)
or
eqadd($a,3)

and whether $a++ invokes

add($a,$a,1)
or
postinc($a)

etc?

And that this decision is mainly a 'time it and see' decision?
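
In C terms the choice might look like this; a toy sketch assuming integer-only PMCs and ignoring overflow, with add and eqadd as in the message above.

```c
#include <assert.h>

typedef struct PMC { long ival; } PMC;   /* toy integer-only PMC */

/* Generic three-operand form: dst may alias either source. */
static void add(PMC *dst, const PMC *l, const PMC *r)
{
    dst->ival = l->ival + r->ival;
}

/* Dedicated in-place form: one argument fewer, one more vtable entry. */
static void eqadd(PMC *dst, const PMC *r)
{
    dst->ival += r->ival;
}
```

Calling add(&a, &a, &b) and eqadd(&a, &b) leaves a in the same state, so the only open question is which is faster once the cost of the extra vtable slot is counted.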





Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

Dan, before I followup your reply to my list of nits about the PDD,
can I clarify one thing: destruction.

I am assuming that many PMCs will require destruction, eg calling
destroy() on a string PMC will cause the memory used by the string
data to be freed or whatever. Only very simple PMCs (such as integers)
need to do no destruction.

Is this the same as your perception of reality :-) ?

I also gather that PMCs will have a flag saying whether they need destroying,
(eg ints say no, strings say yes), and that calls to destroy() are preceded
by a check on this flag for efficiency?
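
If that reading is right, the check might look something like the sketch below. The flag name PMC_NEEDS_DESTROY, the pmc_release helper and the struct layout are assumptions made for illustration, not anything the PDD specifies.

```c
#include <assert.h>
#include <stdlib.h>

enum { PMC_NEEDS_DESTROY = 1 };   /* hypothetical "clean me up" flag */

typedef struct PMC PMC;
typedef struct VTable {
    void (*destroy)(PMC *self);
} VTable;
struct PMC { const VTable *vtable; unsigned flags; void *data; long ival; };

/* Strings own heap memory, so they need real destruction. */
static void string_destroy(PMC *self)
{
    free(self->data);
    self->data = NULL;
}

static const VTable string_vtable = { string_destroy };
static const VTable int_vtable    = { NULL };   /* nothing to clean up */

/* Make a PMC reusable: call destroy() only when the flag asks for it,
 * otherwise the old contents are simply abandoned. */
static void pmc_release(PMC *pmc)
{
    if (pmc->flags & PMC_NEEDS_DESTROY)
        pmc->vtable->destroy(pmc);
    pmc->vtable = NULL;   /* back to an empty PMC */
    pmc->flags = 0;
}
```

The flag test is a single branch, so integers never pay for an indirect call to a destroy() that would do nothing.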




Re: Another approach to vtables

2001-02-07 Thread Branden

David Mitchell wrote:
 Perl 5 only supports Perl-mode tieing and overloading - ie, where
 the tied or overloaded functions that get called are Perl functions.
 This is slow and heavyweight, but it is easy to code (ie you write
 a Perl module with a few functions).


Actually, I think Perl 5 supports tying at the C level, through sv_magic. I
think overloading is also possible, however I've never seen documentation on
that.


 In addition, Perl 6 allows C-mode tieing and overloading. This is where
 the tied or overloaded functions that get called are C functions.
 This is much more low-level, but much faster. It's also much harder to
 code. C-mode tieing and overloading is what vtables are. Ie you
 can write a custom data type and have control over how its data is
 accessed, and how its data is operated on.
 [...]
 Interestingly enough, perl-mode tieing and overloading will presumably
 be implemented using vtables.
 [...]
 Thus, the only(-ish) place within the perl src that needs to understand
 about Perl-mode ties and overloading is within a handful of vtable classes


This is exactly what I was talking about, but Dan told me that's not what he
wants with vtables. At least I see I'm not the only one with this wrong
idea...


 Note also that C-mode overloading and Perl-mode overloading are quite
 different at the language level: Perl 5 overloading is really the overloading
 of *reference* operators. Ie if you have


Agreed. Don't see why Perl 6 shouldn't be different, e.g. have "foo" with a
+ overloaded to do concatenation.


 Now if you bless $a, $b into some overloaded class, then
 $ra + $rb suddenly starts doing what you want - ie you have overloaded
 the definition of addition *on references*.
 Importantly, $a + $b still doesn't do what you want.

 Under Perl 6, with C-mode overloading, $a + $b *will* DWIM.


But then I don't see how vtables will propagate in an assignment. If $b has
a bigint and I do $a = $b, $a should copy the add/subtract/mul/... entries
of $b's vtable, because without them a bigint doesn't really make sense. On
the other hand, if $a copies $b's vtable, and $b is tied to something, $a
will be tied to the thing $b is tied to, which is wrong, since this is an
assignment and not an aliasing. Copying and not copying is not possible.
Copying partially (only add/subtract/mul/...) is only possible if the
entries are in separate vtables, since there are probably many PMC's
pointing to one vtable (meaning it shouldn't be modified) and creating
per-variable vtables is a bad idea (it's like telling the user that a 32-bit
integer takes 256 bytes (or more) of storage, because of all the vtable
entries).


 For example, when you call the add() vtable method associated with
 a tied variable, that add() method just calls the (Perl) FETCH function
 from the relevant module, then just passes control on to the add() vtable
 method associated with the PMC returned by FETCH, passing through the
 original args.


Tying is clear to me. I only see a problem with overloading on assignment,
that clearly cannot co-exist with tie, as I explained above.

- Branden




Re: Another approach to vtables

2001-02-07 Thread Buddha Buck

At 01:14 PM 02-07-2001 -0500, Dan Sugalski wrote:
At 01:35 PM 2/7/2001 -0200, Branden wrote:
As far as I know (and I could be _very_ wrong), the primary objectives of
vtables are:
1. Allowing extensible datatypes to be created by extensions and used in
Perl.

Secondarily, yes.

2. Making the implementation of `tie' and `overload' more efficient ('cause
it's very slow in Perl 5).

No, not at all. This isn't really a consideration as such. (The vtable 
functions as designed are inadequate for most overloading, for example)

Hmm, I seem to remember vtables were being cited as a cure for lots of ills 
(perhaps combined with other aspects, like "make Perl nearly as fast as C".)

The vtables were implied (or possibly out-right stated) as giving the 
low-level core a more object-oriented structure: as you state below, 
branching and conditionals in the runtime can be eliminated by the values 
knowing how to operate on themselves.

It was also implied (or out-right claimed) that different 
objects/classes/packages/whatever could have class-specific vtables, 
defined at run-time, that would be used to handle the class-specific 
implementation details.  I'm not sure what that could refer to except ties 
and overloading; class-specific methods wouldn't go in the vtable.

There was some discussion that allowing the vtables to refer to functions 
written in perl would be a good idea, as it would allow extensions to be 
written in perl -- which is a good thing.

I had gotten the impression that the perl code-sequence:

   $a = $b + $c;

would generate the same op-code sequence regardless of the type of $a, $b, 
$c, and the vtables would do all the magic behind the scenes, calling tied 
or overloaded versions of the base functions if so defined for $a, $b, or $c.

Now I seem to be hearing that this is not the case, that variable ties and 
overloads are at a much higher level, never touching the vtables.  It now 
seems that the vtables will exist only for built-in types, and be 
inaccessible for user-defined types (unless those types are defined by the 
perl6 equivalent of XS, for example).  This almost seems to be defaulting 
on the promise of vtables I thought was made.





Re: Another approach to vtables

2001-02-07 Thread Branden

Dan Sugalski wrote:
 At 01:35 PM 2/7/2001 -0200, Branden wrote:
 2. Making the implementation of `tie' and `overload' more efficient ('cause
 it's very slow in Perl 5).

 No, not at all. This isn't really a consideration as such. (The vtable
 functions as designed are inadequate for most overloading, for example)

 [...]

 Is this right or am I missing something?

 Missing something, as you can see. That's OK, though. The vtable PDD should
 have made all this stuff clear in the preamble text. I've been assuming
 folks have been following along since the beginning. (And possibly know
 about the other stuff in my head (no, not *that* stuff. The other other
 stuff...))



In http:[EMAIL PROTECTED]/msg00464.html Nick
Ing-Simmons talks about using vtables to implement 'magic hacks' of Perl 5,
which are ties. He says ``(they are even called vtables in the sources)''.

In http:[EMAIL PROTECTED]/msg00494.html Ken
Fox says ``It should be possible to use the dispatch tables to easily
implement overloading because operators should map fairly easily into the
table.''

In the recent
http:[EMAIL PROTECTED]/msg02376.html David
Mitchell talks about `C-mode' tying and overloading, which is the same thing I was
thinking about.

I really haven't been following along since the beginning, but I see I'm not
the only one who got the wrong point.

The conclusion I'm reaching is that we're definitely talking about different
things. You are talking about fast opcode dispatching through clever
dispatch tables. I'm talking about using polymorphism to implement tying and
overloading in the low-level. I guess they are different, right?

What I would like to know:
* are the things we're talking about compatible or not?
* can they be built into just one thing, which gives fast opcode dispatch and
easy tying/overloading?
* can one be implemented using the other?
* what is a vtable after all?
* does all this stuff I wrote even _make_ sense?
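
On the "what is a vtable" question, a minimal C sketch, assuming the usual meaning of the term: a per-type table of function pointers that each value carries a pointer to. Every name here (PMC's fields, VTable, get_int, set_int, op_assign_int, the "clamped" type) is hypothetical and is not taken from PDD 2. It only shows how one opcode body can serve both a plain type and a tie-like type without branching on the type.

```c
/* Hypothetical layout, for illustration only; not the PDD 2 vtable. */
typedef struct PMC PMC;

typedef struct VTable {
    const char *type_name;
    long (*get_int)(const PMC *self);        /* value in integer context */
    void (*set_int)(PMC *self, long value);  /* store an integer */
} VTable;

struct PMC {
    const VTable *vtable;  /* how this value behaves */
    long          ival;    /* payload shared by both toy types below */
};

/* Plain integer: stores and returns the value unchanged. */
static long int_get(const PMC *self) { return self->ival; }
static void int_set(PMC *self, long value) { self->ival = value; }

/* Tie-like integer: same payload, but every store is clamped to 0..100. */
static long clamp_get(const PMC *self) { return self->ival; }
static void clamp_set(PMC *self, long value)
{
    if (value < 0)   value = 0;
    if (value > 100) value = 100;
    self->ival = value;
}

static const VTable int_vtable   = { "int",     int_get,   int_set   };
static const VTable clamp_vtable = { "clamped", clamp_get, clamp_set };

/* The "opcodes": one indirect call each, no switch on the type. */
static void op_assign_int(PMC *dest, long value)
{
    dest->vtable->set_int(dest, value);
}

static long op_get_int(const PMC *src)
{
    return src->vtable->get_int(src);
}
```

In this toy, the fast-dispatch view and the polymorphism view are the same mechanism seen from two sides: the opcode is fast because it makes a single indirect call, and the type is customizable because that call goes through a table the type supplies.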


- Branden




Re: PDD 2, vtables

2001-02-07 Thread Branden

Dan Sugalski wrote:
 Splitting the vtable into two pieces, with one piece not tied to a PMC,
 makes some things impossible. Consider this:

@foo = @bar * @baz;

 where all three arrays are really matrix types.

By the PDD's notion of `key', what would be the `key' of a matrix type?

(I think that's actually a -language question, but) What would $foo[42]
(where @foo is a matrix) compile to?



 In the separate load/store
 and do vtable scheme it means you get the value of @bar and @baz in scalar
 context, and multiply the results. Two operations, and the resultant values
 are sanitized. In the single vtable scheme, we'd execute @bar's multiply
 routine, which would be clever enough (because we wrote it that way) to see
 the second parameter's also a matrix, and do matrix math.
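
The difference between the two schemes described above can be sketched in C. This is an assumption-laden toy, not the PDD's design: the names (VTable, type_id, get_num, multiply, op_multiply, op_multiply_split) are hypothetical, matrices are fixed at 2x2, and "scalar context" is taken to mean the element count, as for a Perl 5 array.

```c
enum { TYPE_NUM, TYPE_MATRIX };  /* hypothetical type tags */

typedef struct PMC PMC;

typedef struct VTable {
    int    type_id;
    double (*get_num)(const PMC *self);   /* value in scalar context */
    void   (*multiply)(const PMC *self, const PMC *rhs, PMC *dest);
} VTable;

struct PMC {
    const VTable *vtable;
    double num;       /* payload for plain numbers */
    double m[2][2];   /* payload for the toy matrix type */
};

/* A matrix in scalar context: its element count. */
static double matrix_get_num(const PMC *self) { (void)self; return 4.0; }

/* The left operand's multiply entry inspects the right operand. */
static void matrix_multiply(const PMC *self, const PMC *rhs, PMC *dest)
{
    double out[2][2];
    int i, j, k;

    if (rhs->vtable->type_id == TYPE_MATRIX) {
        /* Both sides are matrices: real matrix product. */
        for (i = 0; i < 2; i++)
            for (j = 0; j < 2; j++) {
                out[i][j] = 0.0;
                for (k = 0; k < 2; k++)
                    out[i][j] += self->m[i][k] * rhs->m[k][j];
            }
    } else {
        /* Anything else: scale by the right side's numeric value. */
        double s = rhs->vtable->get_num(rhs);
        for (i = 0; i < 2; i++)
            for (j = 0; j < 2; j++)
                out[i][j] = self->m[i][j] * s;
    }

    /* Copy via a temporary so dest may alias either operand. */
    dest->vtable = self->vtable;
    for (i = 0; i < 2; i++)
        for (j = 0; j < 2; j++)
            dest->m[i][j] = out[i][j];
}

static const VTable matrix_vtable =
    { TYPE_MATRIX, matrix_get_num, matrix_multiply };

/* Single vtable scheme: one call, the type decides what "*" means. */
static void op_multiply(const PMC *lhs, const PMC *rhs, PMC *dest)
{
    lhs->vtable->multiply(lhs, rhs, dest);
}

/* Split scheme: fetch both values first, then multiply plain numbers. */
static double op_multiply_split(const PMC *lhs, const PMC *rhs)
{
    return lhs->vtable->get_num(lhs) * rhs->vtable->get_num(rhs);
}
```

With two 2x2 matrices, op_multiply yields a matrix product while op_multiply_split can only yield 4 * 4, since the matrix-ness of both operands is gone by the time the multiplication happens.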


Actually, not necessarily. It depends on what the compiler does... There
could be special entries for array operations, like +/-/*/... . The problem
I see with it is what happens when you do @a = @b. Actually, if @b is a matrix,
does @a = @b make @a a matrix or evaluate @b in list context? What about @a =
(@b) ? What if @a is a tied array? This matrix thing is actually getting
very confusing to me... I think all these proposed additions to the language
should be carefully examined for possible mis-interpretations like these.


- Branden




Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 06:08 PM 2/7/2001 -0200, Branden wrote:
Dan Sugalski wrote:
  Splitting the vtable into two pieces, with one piece not tied to a PMC,
  makes some things impossible. Consider this:
 
 @foo = @bar * @baz;
 
  where all three arrays are really matrix types.

By the PDD's notion of `key', what would be the `key' of a matrix type?

Probably an integer, possibly a list for multidimensional matrices. (And I 
haven't thought about how to handle that--probably force a series of index 
lookups)

(I think that's actually a -language question, but) What would $foo[42]
(where @foo is a matrix) compile to?

Identically to how $foo[42] would if @foo were a plain array.

  In the separate load/store
  and do vtable scheme it means you get the value of @bar and @baz in scalar
  context, and multiply the results. Two operations, and the resultant values
  are sanitized. In the single vtable scheme, we'd execute @bar's multiply
  routine, which would be clever enough (because we wrote it that way) to see
  the second parameter's also a matrix, and do matrix math.
 

Actually, not necessarily. It depends on what the compiler does... There
could be special entries for array operations, like +/-/*/... . The problem
I see with it is what happens when you do @a = @b. Actually, if @b is a matrix,
does @a = @b make @a a matrix or evaluate @b in list context?

That's a language issue. I don't know--I can see it going either way. I'd 
prefer a straight assign and let the assignment vtable entry handle it, but 
I don't know that we'll have that option.

What about @a =
(@b) ?

Good question. I'd like to see it handled the same way as @a=@b, but I'm 
not sure that's going to happen. It's Larry's decision. (Mainly because I'd 
like to see this:

   @a = (@b, @c);

turn into:

   @a = @b;
   push @a, @c;

but I don't know that we'll be able to)

What if @a is a tied array?

What if? Larry's call as to whether it makes @a a copy of the data from @b, 
or creates a new tied thing, or an alias. Probably @a would be a plain copy 
of @b with no magic, but that's not my call.

This matrix thing is actually getting
very confusing to me... I think all these proposed additions to the language
should be carefully examined for possible mis-interpretations like these.

I'm not proposing they go in (well, OK, I am, but I'm not forcing it). What 
I am doing is trying to not preclude the possibility if it's decided that it 
will happen.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Another approach to vtables

2001-02-07 Thread Dan Sugalski

At 02:33 PM 2/7/2001 -0500, Buddha Buck wrote:
At 01:14 PM 02-07-2001 -0500, Dan Sugalski wrote:
At 01:35 PM 2/7/2001 -0200, Branden wrote:
As far as I know (and I could be _very_ wrong), the primary objectives of
vtables are:
1. Allowing extensible datatypes to be created by extensions and used in
Perl.

Secondarily, yes.

2. Making the implementation of `tie' and `overload' more efficient ('cause
it's very slow in Perl 5).

No, not at all. This isn't really a consideration as such. (The vtable 
functions as designed are inadequate for most overloading, for example.)

Hmm, I seem to remember vtables were being cited as a cure for lots of 
ills (perhaps combined with other aspects, like "make Perl nearly as fast 
as C".)

Yep, and they'll wash your windows and do your laundry, too. :)

The vtables were implied (or possibly out-right stated) as giving the 
low-level core a more object-oriented structure: as you state below, 
branching and conditionals in the runtime can be eliminated by the values 
knowing how to operate on themselves.

It was also implied (or out-right claimed) that different 
objects/classes/packages/whatever could have class-specific vtables, 
defined at run-time, that would be used to handle the class-specific 
implementation details.  I'm not sure what that could refer to except ties 
and overloading; class-specific methods wouldn't go in the vtable.

While defining tie and overload functions will affect the vtable generated 
for a package, users generally won't be writing all the vtable functions, 
nor will they be writing directly to the vtable spec.

The current list of functions defined as needed for tying and overloading 
are sufficient, with some glue, to build up a fully functional vtable. I 
really, *really* don't want people skipping the tie or overload interface 
and heading straight for the low-level vtable interface, since they'll be 
really screwed if we change how vtables are defined or work, or scrap them 
altogether.

There was some discussion that allowing the vtables to refer to functions 
written in perl would be a good idea, as it would allow extensions to be 
written in perl -- which is a good thing.

I had gotten the impression that the perl code-sequence:

   $a = $b + $c;

would generate the same op-code sequence regardless of the type of $a, $b, 
$c, and the vtables would do all the magic behind the scenes, calling tied 
or overloaded versions of the base functions if so defined for $a, $b, or $c.

Unfortunately overloading presents certain problems. Generally speaking 
we're OK if a custom variable behaves as if it were a builtin type, but 
when it doesn't the vtable scheme isn't sufficient.

For example, this:

   @foo = 12 * @bar;

will cause problems if @bar is an odd type. We'll be using 12's vtable (the 
integer one) as the source of the functions to do things, while it's 
actually the right-hand side that's messed up.

With some things, + and * particularly, we can probably get by with some 
arithmetic reordering. With - and /, though, we can't since those 
operations aren't commutative. Because of this, unless Larry says "Left 
hand side wins for custom types", we need more support than is in the 
vtable definitions to manage things. That support will be provided by the 
compiler, assuming you write to the tie or overload interface and not 
directly to the vtable interface.
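
A rough C sketch of the kind of glue this implies, under stated assumptions. The "swapped" flag is borrowed from the way Perl 5's overload pragma tells a handler that its operands were reversed; whether perl6 does anything similar is not settled here. All names (Value, VTable, is_custom, op_multiply, op_subtract, the mod-10 type) are hypothetical.

```c
typedef struct Value Value;

typedef struct VTable {
    int  is_custom;   /* 0 for builtin types, 1 for user-defined ones */
    long (*get_int)(const Value *self);
    /* swapped != 0 means the call really represents: other OP self */
    long (*multiply)(const Value *self, const Value *other, int swapped);
    long (*subtract)(const Value *self, const Value *other, int swapped);
} VTable;

struct Value {
    const VTable *vtable;
    long n;
};

static long plain_get(const Value *v) { return v->n; }

/* Builtin integer arithmetic. */
static long int_mul(const Value *self, const Value *other, int swapped)
{
    (void)swapped;  /* multiplication is commutative */
    return self->n * other->vtable->get_int(other);
}

static long int_sub(const Value *self, const Value *other, int swapped)
{
    long o = other->vtable->get_int(other);
    return swapped ? o - self->n : self->n - o;
}

/* Custom type: arithmetic modulo 10, results always in 0..9. */
static long wrap10(long x) { return ((x % 10) + 10) % 10; }

static long mod10_mul(const Value *self, const Value *other, int swapped)
{
    (void)swapped;
    return wrap10(self->n * other->vtable->get_int(other));
}

static long mod10_sub(const Value *self, const Value *other, int swapped)
{
    long o = other->vtable->get_int(other);
    return wrap10(swapped ? o - self->n : self->n - o);
}

static const VTable int_vtable   = { 0, plain_get, int_mul,   int_sub   };
static const VTable mod10_vtable = { 1, plain_get, mod10_mul, mod10_sub };

/* Glue a compiler could emit: if only the right operand is custom,
 * call the right operand's entry and tell it the operands are swapped. */
static long op_multiply(const Value *lhs, const Value *rhs)
{
    if (!lhs->vtable->is_custom && rhs->vtable->is_custom)
        return rhs->vtable->multiply(rhs, lhs, 1);
    return lhs->vtable->multiply(lhs, rhs, 0);
}

static long op_subtract(const Value *lhs, const Value *rhs)
{
    if (!lhs->vtable->is_custom && rhs->vtable->is_custom)
        return rhs->vtable->subtract(rhs, lhs, 1);
    return lhs->vtable->subtract(lhs, rhs, 0);
}
```

Calling the integer entry directly for 3 minus a mod-10 value of 5 gives -2, which is the failure mode described above; routing through op_subtract gives 8, because the custom type's routine runs and knows the operands were swapped.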

Now I seem to be hearing that this is not the case, that variable ties and 
overloads are at a much higher level, never touching the vtables.  It now 
seems that the vtables will exist only for built-in types, and be 
inaccessible for user-defined types (unless those types are defined by the 
perl6 equivalent of XS, for example).  This almost seems to be defaulting 
on the promise of vtables I thought was made.

It will be possible to write full vtables in perl--that's one of the things 
that Larry wants. (He wants to be able to call perl subs like C functions, 
so it pretty much follows)



Dan





RE: PDD 2, vtables [pointers to related documentation]

2001-02-07 Thread Garrett Goebel

From: Tim Bunce [mailto:[EMAIL PROTECTED]]
 
 On Tue, Feb 06, 2001 at 12:28:23PM -0500, Dan Sugalski wrote:
 
  At 11:26 AM 2/6/2001 +, Tim Bunce wrote:
  
   On Mon, Feb 05, 2001 at 05:14:44PM -0500, Dan Sugalski wrote:
   
=head2 Core datatypes
   
For ease of use, we define the following semi-abstract 
data types
  
   Probably worth stating upfront that it'll be easy to add
   new types to avoid people arguing for their favorite type
   to be added here.
  
  I'm not sure it should be--that'd mean extending the 
  vtables in ways they have little room to grow. Adding new
  perl datatypes is easy, adding new low-level types is harder.
 
 That's pretty much what I meant. I think it's worth saying.

Comments like the ones Tim is suggesting are just what someone like
myself needs. A statement of the obvious and the context in which it fits...
for those who haven't a clue and are trying to piece together a conceptual
model of the system. It saves me the conflict of deciding whether to
pester you all with questions or not ask, never know and continue to be my
own little mushroom.
 

=head1 REFERENCES
   
PDD 3: Perl's Internal Data Types.
  
   Some references to any other vtable based languages would
   be good. (I presume people have looked at some and learnt
   lessons.)
  
  Alas not. This is pretty much head of zeus stuff, modulo 
  some ego. (Mine's not *that* big...)
 
 Without studying history we may be doomed to repeat it.
 
 So can anyone point to vtable based language implementations?

Well, I may be one of the least qualified subscribers on this list, but I'm
a pretty good gopher... Some of this relates to languages implementing
vtables as opposed to being implemented with them. Everything I've scanned
so far seems to raise the flag concerning overhead associated...

Title: Portable Inheritance and Polymorphism in C
URL:  http://www.embedded.com/97/fe29712.htm
Abstract: A lower-level view that assumes only a procedural language like C
for embedded developers who want to apply OO without switching to an OO
language

Title: Programming Language Pragmatics
URL:  http://www.amazon.com/exec/obidos/ASIN/1558604421
Abstract: Mentions virtual methods and tables in Sections 10.4-5. It
discusses vtables from a high level in the general context of Eiffel,
Simula, C++, and Ada.

Title: SableVM: A Research Framework for the Efficient
       Execution of Java Bytecode
URL:  http://www.j-meg.com/~egagnon/sable-report_2000-3/
Abstract: SableVM is an open-source virtual machine for Java2, intended as a
research framework for efficient execution of Java bytecode. The framework
is essentially composed of an extensible bytecode interpreter using
state-of-the-art and innovative techniques. Written in the C programming
language, and assuming minimal system dependencies, the interpreter
emphasizes high-level techniques to support efficient execution. In
particular, we introduce new data layouts for classes, virtual tables and
object instances that reduce the cost of interface method calls to that of
normal virtual calls, allow efficient garbage collection and light
synchronization, and make effective use of memory space. 

Title: C++ Producer Guide
URL:  http://www.cse.unsw.edu.au/~patrykz/TenDRA/tcpplus/lib.html#vtable
Abstract: vtable implementation in C++

Title: C++ ABI for IA-64
URL:  http://reality.sgi.com/dehnert_engr/cxx/abi.html
Abstract: vtables layout, etc. is discussed in sections 2.5-2.6 and
scattered throughout. You can find similar information in C++ ABI
documentation for Macintosh, etc.




Re: yydestruct leaks

2001-02-07 Thread Alan Burlison

[EMAIL PROTECTED] wrote:

 Hmm, so this is (kind of) akin to the regcomp fix - it is the "new" stuff
 that is in yyl?val that should be free-d. And it is worse than that
 as yyl?val is just the topmost of the parser state, so if I understand correctly
 it isn't only their current values, but also anything that is in the
 parser's stack (ysave->yyvs) at the time-of-death that needs to go.
 And all of those use the horrid yacc-ish YYSTYPE union, so we don't know
 what they are. Yacc did, it had a table which mapped tokens to types
 which it used to get union assigns right. But byacc does not put that info
 anywhere for run-time to use, so to get it right we would need to
 re-process perly.y and then look at the state stack as we popped it.

Yup - that's about the size of it.

 Yugh.
 
 The way I usually do this is make YYSTYPE an "object" or something
 like a Pascal variant record - which has a tag.

That was my idea - I just couldn't figure out any clean way of capturing
the type information.  If only byacc had a $$type variable as well as $$
etc...

 This would not be easy to fix for perl5.
 The best I can come up with is to make them all OP *, inventing
 special parse-time-only "ops" which can hold ival/pval/gvval values.
 
 Then yydestruct could just free the ops in yylval and yyvs[],
 freeing a gvalOP or pvalOP would do the right thing.
 
 Almost certainly far more than we want to do to the maint branch.

That seems workable, although as you say, far too radical for the maint
branch :-(

 The other way this mess is handled is to use a "slab allocator"
 e.g. GCC used an "obstack" - this allows all the memory allocated
 since one noted a "checkpoint" to be free-d.
 One could fake that idea by making malloc "pluggable" and plugging
 it during parse to build a linked list or some such.

Well, that's kinda what we have with the scope stack, the problem is
that you don't know the type of the thing that needs freeing.

 The down side of that scheme is that auxiliary allocators tend to
 upset Purify-like tools almost as much as memory leaks do.

I've tried 2 approaches to this.  The first is to add "#ifdef PURIFY"
code to pp_ctl.c along the lines of the following:

S_doeval(...)
{
    ...
    /* Flush any existing leaks to the log */
    purify_new_leaks();
    ...
    if (yyparse() == failed) {
        ...
        /* Ignore any leaks */
        purify_clear_leaks();
    }
    ...
}

However I'm still suspicious of this because of the number of leaks that
only appear when S_doeval is somewhere in the stack trace.

The other approach is to postprocess the purify log and ignore anything
that has S_doeval or Perl_pp_entereval in the stack.  That's the
approach I'm currently using, but of course it ignores any real leaks
that coincidentally appear within an eval.  I think I'll try getting rid
of as many leaks as possible under this restricted regime - even with
this restriction, and with the bugs I've already fixed, the test suite
contains 141 memory errors.

The truth of the matter is that I suspect eval and die will always leak
until it is re-architected in perl6 - whenever that might be.

Alan Burlison



Re: yydestruct leaks

2001-02-07 Thread nick

Alan Burlison [EMAIL PROTECTED] writes:
If an eval{} fails because of a syntax error, yydestruct is called on
leaving the eval scope.  Unfortunately it does this:

yyval   = ysave->oldyyval;
yylval  = ysave->oldyylval;

So anything that is in those 2 vars that hasn't made its way into the
parse tree is lost forever.  I know this is an old problem, but I've
been trying to think of a way to fix it.  Has anybody any suggestions? 

Hmm, so this is (kind of) akin to the regcomp fix - it is the "new" stuff
that is in yyl?val that should be free-d. And it is worse than that
as yyl?val is just the topmost of the parser state, so if I understand correctly
it isn't only their current values, but also anything that is in the 
parser's stack (ysave->yyvs) at the time-of-death that needs to go.
And all of those use the horrid yacc-ish YYSTYPE union, so we don't know
what they are. Yacc did, it had a table which mapped tokens to types
which it used to get union assigns right. But byacc does not put that info
anywhere for run-time to use, so to get it right we would need to 
re-process perly.y and then look at the state stack as we popped it.

Yugh.

The way I usually do this is make YYSTYPE an "object" or something
like a Pascal variant record - which has a tag. 
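
A minimal C sketch of that tagged-record idea, to make it concrete. It is not perl's actual YYSTYPE: the type TaggedVal, its two members, and the helper names are all hypothetical, and only the "pval" case owns heap memory. The point is that a destructor can walk a value stack and free the right things because each slot says what it holds.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a tagged YYSTYPE. */
typedef enum { TAG_NONE, TAG_IVAL, TAG_PVAL } ValTag;

typedef struct TaggedVal {
    ValTag tag;
    union {
        long  ival;   /* plain integer, nothing to free */
        char *pval;   /* heap string owned by this slot */
    } u;
} TaggedVal;

static TaggedVal make_ival(long i)
{
    TaggedVal v;
    v.tag = TAG_IVAL;
    v.u.ival = i;
    return v;
}

static TaggedVal make_pval(const char *s)
{
    TaggedVal v;
    size_t len = strlen(s) + 1;
    v.u.pval = malloc(len);
    if (v.u.pval) {
        memcpy(v.u.pval, s, len);
        v.tag = TAG_PVAL;
    } else {
        v.tag = TAG_NONE;   /* allocation failed: slot owns nothing */
    }
    return v;
}

/* Free whatever the slot owns, guided by the tag.
 * Returns 1 if heap memory was released, 0 otherwise. */
static int tagged_free(TaggedVal *v)
{
    int released = 0;
    if (v->tag == TAG_PVAL) {
        free(v->u.pval);
        v->u.pval = NULL;
        released = 1;
    }
    v->tag = TAG_NONE;
    return released;
}

/* What a yydestruct-like cleanup could do with the whole value stack. */
static int destroy_stack(TaggedVal *stack, size_t depth)
{
    int released = 0;
    size_t i;
    for (i = 0; i < depth; i++)
        released += tagged_free(&stack[i]);
    return released;
}
```

Because freed slots are reset to TAG_NONE, running the cleanup twice is harmless, which matters if both the parser and an error handler may try to unwind the same stack.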

This would not be easy to fix for perl5.
The best I can come up with is to make them all OP *, inventing 
special parse-time-only "ops" which can hold ival/pval/gvval values.

Then yydestruct could just free the ops in yylval and yyvs[],
freeing a gvalOP or pvalOP would do the right thing.

Almost certainly far more than we want to do to the maint branch.

The other way this mess is handled is to use a "slab allocator"
e.g. GCC used an "obstack" - this allows all the memory allocated 
since one noted a "checkpoint" to be free-d.
One could fake that idea by making malloc "pluggable" and plugging 
it during parse to build a linked list or some such.
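
A small C sketch of the linked-list variant, assuming nothing about perl's real allocator hooks: Tracker, tracked_malloc, tracker_checkpoint and tracker_release are hypothetical names. Each allocation made through the tracker is recorded newest-first, so everything allocated since a checkpoint can be released in one sweep without knowing its type.

```c
#include <stdlib.h>

/* One record per tracked allocation, newest first. */
typedef struct Node {
    struct Node *next;
    void        *ptr;
} Node;

typedef struct Tracker {
    Node *head;
} Tracker;

typedef Node *Checkpoint;   /* the list head at the time of the checkpoint */

/* Allocate memory and remember it. Returns NULL on failure. */
static void *tracked_malloc(Tracker *t, size_t size)
{
    Node *n = malloc(sizeof *n);
    if (!n)
        return NULL;
    n->ptr = malloc(size);
    if (!n->ptr) {
        free(n);
        return NULL;
    }
    n->next = t->head;
    t->head = n;
    return n->ptr;
}

static Checkpoint tracker_checkpoint(const Tracker *t)
{
    return t->head;
}

/* Free every allocation made since the checkpoint was taken.
 * Passing NULL releases everything. Returns the number freed. */
static int tracker_release(Tracker *t, Checkpoint cp)
{
    int count = 0;
    while (t->head != NULL && t->head != cp) {
        Node *n = t->head;
        t->head = n->next;
        free(n->ptr);
        free(n);
        count++;
    }
    return count;
}
```

In a parser this would mean taking a checkpoint before yyparse and releasing back to it only on failure; on success the records would have to be dropped without freeing, since the parse tree now owns the memory. That extra step is omitted here.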

The down side of that scheme is that auxiliary allocators tend to 
upset Purify-like tools almost as much as memory leaks do.


-- 
Nick Ing-Simmons