newPMC() (was: Re: PDD 2, vtables)

2001-02-21 Thread David Mitchell

Dan Sugalski wrote:
 Grab one via a utility function. getPMC() or something of the sort.

 newPMC() ? ;-)

I think we shouldn't rule out the possibility of having multiple
newPMC() style functions for grabbing PMCs used for different activities
(eg lexicals vs tmps vs guaranteed-to-have-refcount=1 vs whatever),
which may all have different allocation/GC schemes.
Heck, we might even consider having newPMC(N) returning a pointer to
a contiguous chunk of N PMCs; this might then allow us to have good locality
of reference for all the stuff in a scratchpad, for example.
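
As a rough illustration of the newPMC(N) idea, here is a minimal sketch in C. Everything in it is an assumption made for the example: the two-field PMC layout, the Arena type, and the names arena_init, newPMC_block and arena_free. The single-PMC newPMC() is simply a block of one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical PMC layout; the real one was still undecided in this thread. */
typedef struct pmc {
    void *vtbl;
    void *private_data;
} PMC;

/* Hypothetical arena: one slab of PMCs, handed out front to back. */
typedef struct arena {
    PMC   *slab;
    size_t capacity;  /* total PMCs in the slab */
    size_t used;      /* PMCs already handed out */
} Arena;

/* Returns 0 on success, -1 if the slab could not be allocated. */
int arena_init(Arena *a, size_t capacity) {
    a->slab = (PMC *)calloc(capacity, sizeof(PMC));
    a->capacity = a->slab ? capacity : 0;
    a->used = 0;
    return a->slab ? 0 : -1;
}

/* newPMC(N): n adjacent PMCs, or NULL if the arena cannot supply them. */
PMC *newPMC_block(Arena *a, size_t n) {
    assert(a != NULL);
    if (n == 0 || n > a->capacity - a->used)
        return NULL;
    PMC *first = &a->slab[a->used];
    a->used += n;
    return first;
}

/* The single-PMC case is a block of one. */
PMC *newPMC(Arena *a) {
    return newPMC_block(a, 1);
}

void arena_free(Arena *a) {
    free(a->slab);
    a->slab = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

A scratchpad with three lexicals could then ask for newPMC_block(&arena, 3) and index the result as pad[0] to pad[2], which keeps them adjacent in memory. A separate arena per activity (lexicals, temporaries and so on) would allow a different allocation or GC scheme for each.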

I don't think we can have a sensible discussion of this until the GC PDD
has been released, but I thought I'd flag up the idea anyway.




Re: PDD 2, vtables

2001-02-18 Thread Alan Burlison

Dan Sugalski wrote:

 If PMC is a pointer to a structure, "new" will need to allocate memory for a
 new structure, and hence the value of mypmc will have to change.
 
 Nope. PMC structures will be parcelled out from arenas and not malloc'd,
 and they won't be freed and re-malloced much. If we're passing in a PMC
 pointer, we won't be reallocating the memory pointed to--rather we'll be
 reusing it.

So how do you get hold of a PMC from the arena in the first place?

Alan Burlison



Re: PDD 2, vtables

2001-02-18 Thread Dan Sugalski

At 12:45 AM 2/19/2001 +, Alan Burlison wrote:
Dan Sugalski wrote:

  If PMC is a pointer to a structure, "new" will need to allocate memory 
 for a
  new structure, and hence the value of mypmc will have to change.
 
  Nope. PMC structures will be parcelled out from arenas and not malloc'd,
  and they won't be freed and re-malloced much. If we're passing in a PMC
  pointer, we won't be reallocating the memory pointed to--rather we'll be
  reusing it.

So how do you get hold of a PMC from the arena in the first place?

Grab one via a utility function. getPMC() or something of the sort.
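
One way to picture "parcelled out from arenas and reused" is a fixed pool with a free list. This is only a sketch under assumptions: the Arena type, the next_free link, the tiny ARENA_SIZE and the releasePMC() helper are invented here, and nothing in the thread says a free list is the planned scheme.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 4  /* tiny on purpose, so exhaustion is easy to see */

/* Hypothetical PMC with a link used only while the PMC is on the free list. */
typedef struct pmc {
    void       *vtbl;
    void       *private_data;
    struct pmc *next_free;
} PMC;

typedef struct arena {
    PMC  slots[ARENA_SIZE];
    PMC *free_list;
} Arena;

/* Thread every slot onto the free list. */
void arena_init(Arena *a) {
    a->free_list = NULL;
    for (size_t i = 0; i < ARENA_SIZE; i++) {
        a->slots[i].vtbl = NULL;
        a->slots[i].private_data = NULL;
        a->slots[i].next_free = a->free_list;
        a->free_list = &a->slots[i];
    }
}

/* getPMC(): pop a slot off the free list; no malloc involved. */
PMC *getPMC(Arena *a) {
    assert(a != NULL);
    PMC *p = a->free_list;
    if (p != NULL) {
        a->free_list = p->next_free;
        p->next_free = NULL;
    }
    return p;  /* NULL means the arena is exhausted */
}

/* Hand a dead PMC back so the same memory can be reused. */
void releasePMC(Arena *a, PMC *p) {
    p->vtbl = NULL;
    p->private_data = NULL;
    p->next_free = a->free_list;
    a->free_list = p;
}
```

A PMC that is released and then requested again comes back at the same address, which is the "reusing it" behaviour described above.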

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: PDD 2, vtables

2001-02-18 Thread Alan Burlison

Dan Sugalski wrote:

 Grab one via a utility function. getPMC() or something of the sort.

newPMC() ? ;-)

Alan Burlison



Re: PDD 2, vtables

2001-02-18 Thread Dan Sugalski

At 01:13 AM 2/19/2001 +, Alan Burlison wrote:
Dan Sugalski wrote:

  Grab one via a utility function. getPMC() or something of the sort.

newPMC() ? ;-)

Works for me. Though for some reason it brings up visions of the Village 
People, and that's generally a Bad Thing... :)

Dan





Re: PDD 2, vtables

2001-02-18 Thread Simon Cozens

On Sun, Feb 18, 2001 at 09:34:46PM -0500, Dan Sugalski wrote:
 At 01:13 AM 2/19/2001 +, Alan Burlison wrote:
 Dan Sugalski wrote:
 
   Grab one via a utility function. getPMC() or something of the sort.
 
 newPMC() ? ;-)
 
 Works for me.

Slight that-sucks alert: So, if I have to get a new PMC of class X (which is
something that's going to have to happen often) I call
X->vtbl->new(newPMC())
or
X->vtbl->morph(newPMC(),X);
?

-- 
   It's is not, it isn't ain't, and it's it's, not its, if you mean it
   is. If you don't, it's its. Then too, it's hers. It isn't her's. It
   isn't our's either. It's ours, and likewise yours and theirs.
-- Oxford University Press, Edpress News



Re: PDD 2, vtables

2001-02-18 Thread Dan Sugalski

At 02:50 AM 2/19/2001 +, Simon Cozens wrote:
On Sun, Feb 18, 2001 at 09:34:46PM -0500, Dan Sugalski wrote:
  At 01:13 AM 2/19/2001 +, Alan Burlison wrote:
  Dan Sugalski wrote:
  
Grab one via a utility function. getPMC() or something of the sort.
  
  newPMC() ? ;-)
 
  Works for me.

Slight that-sucks alert: So, if I have to get a new PMC of class X (which is
something that's going to have to happen often) I call
 X->vtbl->new(newPMC())
or
 X->vtbl->morph(newPMC(),X);

X->vtbl->new. (Whether you snag a new PMC with newPMC or get handed a 
scratch one by the interpreter depends on circumstance) Morph is used for 
active variables, while new can assume that the PMC it was handed is dead.

I was originally thinking that new should handle the case where the PMC it 
was handed might need cleanup, but at this point I think that's a bad 
thing--it forces important code out to too many places, and if we want at 
least semi-determinate cleanup we ought be doing it someplace else anyway.
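
The split between new and morph might look roughly like the following sketch. The slot names vt_new and vt_morph, the "live" field and the cleanup counter are assumptions made for illustration; PDD 2 as quoted in this thread only gives the signature of new.

```c
#include <assert.h>
#include <stddef.h>

typedef struct pmc PMC;

/* Hypothetical two-entry vtable. */
typedef struct vtable {
    const char *name;
    void (*vt_new)(PMC *pmc);                            /* pmc is assumed dead */
    void (*vt_morph)(PMC *pmc, const struct vtable *to); /* pmc may be live */
} VTABLE;

struct pmc {
    const VTABLE *vtbl;
    long          int_value;
    int           live;  /* hypothetical: set while the PMC holds a value */
};

int cleanups_run = 0;  /* counts cleanup work done by morph, for demonstration */

/* morph: tidy up whatever the active variable held, then hand over to new. */
void generic_morph(PMC *pmc, const VTABLE *to) {
    assert(pmc != NULL && to != NULL);
    if (pmc->live) {
        cleanups_run++;
        pmc->live = 0;
    }
    to->vt_new(pmc);
}

void int_new(PMC *pmc);
void undef_new(PMC *pmc);

const VTABLE int_vtable   = { "int",   int_new,   generic_morph };
const VTABLE undef_vtable = { "undef", undef_new, generic_morph };

/* new: no checks and no cleanup, it just overwrites the PMC. */
void int_new(PMC *pmc) {
    pmc->vtbl = &int_vtable;
    pmc->int_value = 0;
    pmc->live = 1;
}

void undef_new(PMC *pmc) {
    pmc->vtbl = &undef_vtable;
    pmc->int_value = 0;
    pmc->live = 1;
}
```

With this shape, code that has just pulled a fresh PMC from the arena calls vt_new directly and pays nothing for cleanup checks, while code that retypes a variable already in use goes through vt_morph.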

Dan





Re: PDD 2, vtables

2001-02-17 Thread Simon Cozens

On Mon, Feb 05, 2001 at 05:14:44PM -0500, Dan Sugalski wrote:
 =item new
 
void   new(PMC[, key]);
 
 Creates a new variable of the appropriate type out of the passed PMC,
 destroying the current contents if there are any. This is a class
 function.

Can I suggest this becomes
PMC new(PMC[, key]);

It gets really hard to get things started otherwise:

PMC mypmc;
(sviv_vtable.new)(mypmc);
printf("mypmc's class is %s\n", (mypmc->vtbl->name)(mypmc));

If PMC is a pointer to a structure, "new" will need to allocate memory for a
new structure, and hence the value of mypmc will have to change. So I'd rather
say 

PMC mypmc;
mypmc = (sviv_vtable.new)(mypmc); /* Old value of mypmc is ignored */

Unless it's a pointer to a pointer to a structure. Which is possible, but
yuck.

The structure

struct _pmc {
VTABLE* vtbl;
void* private_data;
};

makes it nice and easy to morph one PMC class into another; the "new" entry in
the hypothetical sviv_vtable becomes:

PMC int_new(PMC pmc, ...) {
    if (pmc)
        (pmc->vtbl->destroy)(pmc);
    else
        pmc = (PMC)malloc(sizeof(struct _pmc));  /* size of the struct, not of the pointer */

    pmc->vtbl = &sviv_vtable;
    pmc->private_data = malloc(sizeof(IV));  /* room for one IV */
    return pmc;
}

-- 
If you give a man a fire, he'll be warm for a day. If you set a man on fire, 
he'll be warm for the rest of his life.



Re: PDD 2, vtables

2001-02-17 Thread Simon Cozens

On Sat, Feb 17, 2001 at 02:34:08PM -0500, Dan Sugalski wrote:
 Well, the idea was that the passed in PMC is either reusable, can be 
 trashed, or is an aggregate of some point and we may autoviv the element 
 corresponding to the key.

Right, OK, but how do we create them in the first place?

 Nope. PMC structures will be parcelled out from arenas and not malloc'd, 
 and they won't be freed and re-malloced much.

Oh, phew, good. A bit too much Perl5-think on my part.

 We talked a few months ago about the structure of the base PMC piece. IIRC 
 the general consensus is we need a flags field, a field for the garbage 
 collector, and were going to tack on an integer and float fields as well 
 for speed.

Yuh, that dawned on me just after I sent it. Basically, I've been doing
some quick mock-ups, and wasn't worrying about GC and things just yet.
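
For reference, a base PMC along the lines recalled above (flags, a GC field, plus integer and float fields for speed) could be sketched like this. The field names, the widths and the two flag bits are assumptions, not anything the list had agreed on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vtable;  /* left opaque in this sketch */

#define PMC_HAS_INT 0x1u  /* hypothetical: int_cache holds a valid value */
#define PMC_HAS_NUM 0x2u  /* hypothetical: num_cache holds a valid value */

typedef struct pmc {
    struct vtable *vtbl;
    uint32_t       flags;         /* per-PMC flag bits */
    uint32_t       gc_data;       /* reserved for the garbage collector */
    long           int_cache;     /* integer value kept inline for speed */
    double         num_cache;     /* float value kept inline for speed */
    void          *private_data;  /* everything else */
} PMC;

void pmc_set_int(PMC *pmc, long value) {
    assert(pmc != NULL);
    pmc->int_cache = value;
    pmc->flags |= PMC_HAS_INT;
}

/* Returns 1 and stores the value in *out if an integer is cached, else 0. */
int pmc_cached_int(const PMC *pmc, long *out) {
    if (pmc->flags & PMC_HAS_INT) {
        *out = pmc->int_cache;
        return 1;
    }
    return 0;
}
```

The point of the inline fields is that a common read such as "give me the integer" can be answered from the PMC itself without chasing private_data.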

Simon

-- 
Some people claim that the UNIX learning curve is steep, but at least you
only have to climb it once.



Re: PDD 2, vtables

2001-02-14 Thread David Mitchell

After a week's delay where I've just been too busy,
I thought I'd resurrect the corpse of a thread I was involved in.

First off,

on Wed, 07 Feb 2001 14:37:33, Dan Sugalski [EMAIL PROTECTED] wrote:
 At 07:05 PM 2/7/2001 +, David Mitchell wrote:
 Dan, before I followup your reply to my list of nits about the PDD,
 can I clarify one thing: destruction.
 
 I am assuming that many PMCs will require destruction, eg calling
 destroy() on a string PMC will cause the memory used by the string
 data to be freed or whatever. Only very simple PMCs (such as integers)
 need to do no destruction.
 
 Is this the same as your perception of reality :-) ?
 
 Nope. The things that call destroy will be those things that have some sort 
 of active destruction. Freeing memory doesn't count in this case, since the 
 plan is to have some sort of external garbage collector that handles that.
 
 I also gather that PMCs will have a flag saying whether they need destroying,
 (eg ints say no, strings say yes), and that calls to destroy() are preceded
 by a check on this flag for efficiency?
 
 Yep, though strings will be a no in this case.

Ah, I see. I really must get my head round this GC business (my copy of
the GC book is on order, but alas hasn't yet arrived :-( ).

As a quick aside, If a PMC contains a pointer to a string, is there ever
any need to set that pointer to NULL to allow the GC to reclaim the string?


Given that PMCs will have a flag saying whether they need destroying,
the naive view is that the external API definition for vtables will
say 'you must check the NEED_DESTROY flag, and if true, you must call
the destroy() method'. I'd regard this as an implementation and optimisation
detail which should be hidden from the user. Instead, I'd prefer
that there be a standard set of macros to invoke methods, and that
the macro for destroy() checks the flag before calling the real
method. Then the API docs just say 'you must always call destroy()'.

Similarly, I suggest that PMC flags appear (via the use of macros)
to be identical to vtable methods. So that as implementations change,
whether something can be determined purely by examining a flag or needs
to be done via a full method call, is hidden from the user.

Or to put it another way, PDD 2 should pretend that everything is
a method, and whether some of these get optimised into checks
for flag bits or conditional calls, is an implementation detail
that we can hide from the user by the judicious use of macros or whatever.
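
To make the macro suggestion concrete, here is a minimal sketch. The names PMC_DESTROY, PMC_NEEDS_DESTROY and PMC_NEED_DESTROY, and the one-slot vtable, are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define PMC_NEED_DESTROY 0x1u  /* hypothetical flag bit */

typedef struct pmc PMC;

typedef struct vtable {
    void (*destroy)(PMC *pmc);
} VTABLE;

struct pmc {
    const VTABLE *vtbl;
    unsigned      flags;
    void         *private_data;
};

/* Callers always "call destroy"; the flag test is hidden in the macro. */
#define PMC_DESTROY(pmc)                            \
    do {                                            \
        PMC *pmc_tmp_ = (pmc);                      \
        if (pmc_tmp_->flags & PMC_NEED_DESTROY)     \
            pmc_tmp_->vtbl->destroy(pmc_tmp_);      \
    } while (0)

/* A flag presented as if it were a method, so it could later become one. */
#define PMC_NEEDS_DESTROY(pmc) (((pmc)->flags & PMC_NEED_DESTROY) != 0)

int destroy_calls = 0;  /* for demonstration only */

void filehandle_destroy(PMC *pmc) {
    assert(pmc != NULL);
    destroy_calls++;
    pmc->flags &= ~PMC_NEED_DESTROY;
}

const VTABLE filehandle_vtable = { filehandle_destroy };
const VTABLE int_vtable        = { NULL };  /* never called: ints never set the flag */
```

The API document can then say "always use PMC_DESTROY" and whether a given check is a flag test or a full method call stays an implementation detail.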


   =item new
  
  void   new(PMC[, key]);
  
   Creates a new variable of the appropriate type out of the passed PMC,
   destroying the current contents if there are any. This is a class
   function.
 
 As an aside, am I right in assuming that there will be a function somewhere
 (outside the scope of this PDD) that creates new, empty PMCs, which can then
 be acted upon by new() to turn them into PMCs of a particular type?
 
 Probably, yes. More likely, PMCs will be declared nukable unless a "clean 
 me up" flag is set, in which case we'd just stomp on what was in there. 
 (Since we generally don't care about the contents of a PMC when trashing 
 it, with some relatively rare exceptions)
 
 Will PMCs that have just been new()ed have a default null value, eg
 0/"" ?  Note that they won't be undefined - at least, I'm assuming that
 undefined is handled by a separate class.
 Perhaps we need instead a range of new()s that initialise to various
 string, numeric etc values?
 
 I'm figuring that'll be handled by the bits that deal with constants, but 
 there's no reason that a plain new PMC can't be undef. (Basically set the 
 private data pointer to NULL and the vtable pointer to the undef class vtable)

Hmm, there doesn't seem to be anything related to handling constants in PDD 2.

 
 The above definition of new() implies that it first calls destroy()
 to release any previous contents. I think it would be better to define
 new() as operating on an empty PMC (so it is the caller's responsibility
 to call destroy() first, if necessary).
 
 destroy will only be called if the PMC needs it. Most PMCs will just get 
 their contents trashed and let the GC clean up after.

Yes, but we have to agree whether it's the caller's or callee's responsibility
to determine whether to check for nuke requirements. I think it should
be the caller's responsibility: often the caller will know that the thing
it is passing has already been nuked, so it knows not to waste time
checking.
 
 Actually, I suspect that the whole area of new/clone/destroy etc will need to
 be examined carefully in the light of a 'typical' variable lifecycle,
 to avoid unnecessary transitions. For example, my $a = 'abc' might
 involve $a going from empty -> undef -> empty -> "" -> "abc" or similar,
 if we're not careful.
 
 That's an area for the optimizer. I'd like it to go from empty -> "abc", 
 assuming we don't skip the empty step.

But it's also a task for us to define PDD 2 carefully 

Re: PDD 2, vtables

2001-02-14 Thread Simon Cozens

On Wed, Feb 14, 2001 at 06:37:18PM +, David Mitchell wrote:
 Hmm, there doesn't seem to be anything related to handling constants in PDD 2.

I anticipate constants will be PMCs with a small vtable of "get methods",
possibly with several different types of value (string, numeric, float, etc.)
precomputed at compile time.
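
A constant PMC of that kind might be sketched as below. The ConstData layout, the three get_* slots and the const_42 object are invented for the example; the only part taken from the message is the idea of a small read-only vtable over values computed ahead of time.

```c
#include <assert.h>
#include <stddef.h>

typedef struct pmc PMC;

/* Hypothetical small vtable: "get" methods only, nothing that writes. */
typedef struct vtable {
    long        (*get_int)(const PMC *pmc);
    double      (*get_num)(const PMC *pmc);
    const char *(*get_string)(const PMC *pmc);
} VTABLE;

/* Every representation is worked out once, at compile time. */
typedef struct const_data {
    long        as_int;
    double      as_num;
    const char *as_string;
} ConstData;

struct pmc {
    const VTABLE *vtbl;
    const void   *private_data;
};

long const_get_int(const PMC *pmc) {
    assert(pmc != NULL);
    return ((const ConstData *)pmc->private_data)->as_int;
}

double const_get_num(const PMC *pmc) {
    return ((const ConstData *)pmc->private_data)->as_num;
}

const char *const_get_string(const PMC *pmc) {
    return ((const ConstData *)pmc->private_data)->as_string;
}

const VTABLE const_vtable = { const_get_int, const_get_num, const_get_string };

/* Roughly what a compiler might emit for the literal 42. */
const ConstData forty_two = { 42, 42.0, "42" };
const PMC       const_42  = { &const_vtable, &forty_two };
```

No conversion happens at run time here: each get method just returns the value already stored for that type.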

-- 
We *have* dirty minds. This is not news.
- Kake Pugh



Re: PDD 2, vtables

2001-02-10 Thread Dan Sugalski

At 08:47 AM 2/10/2001 -0200, Branden wrote:
Dan Sugalski wrote:
  The string API should be sufficiently smart to be able to convert data from
  one encoding to another as it's more convenient.
 
  No, the vtable functions for the variables should know how to convert from
  and to perl's preferred string representations, and can do whatever Bizarre
  Magic they care to internally.
 

I don't see why Perl couldn't deal with multiple representations internally.
Conversion could be done on the way in, internally for efficiency on certain
operations, and on the way out, again.

It can, and it will. The question is "which ones". The regex engine will 
almost undoubtedly deal with only fixed-sized characters. Perl itself will 
probably restrict itself to fixed width characters as well. Individual 
variable classes can store data in any form they want. (If someone wants to 
leverage zlib to write a class that compresses its data, I'm fine with that)

  On the other side, for a string that is matched against regexps, it doesn't
  matter much if it has variable character length, since regexps normally read
  all the string anyway, and indexing characters isn't much of a concern.
 
  You underestimate the impact of variable-length data, I think. Regexes
  should go rather faster on fixed-length than variable length data. How much
  so depends on your processor. (I can guarantee that Alphas will run a
  darned sight faster on UTF-32 than UTF-8...)
 

Agreed. Should go faster. But maybe I don't need it that fast!

That's fine. Speed is my #1 priority. Memory usage is secondary. (An 
important secondary, but secondary nonetheless) Which doesn't rule out 
UTF-8, of course--it may turn out that converting things is slower than 
dealing with variable width data, in which case priority #1 wins.

(I really think it shouldn't be so much slower than doing it on an ASCII
string with the same total buffer size, it only would have to fetch another
byte on certain conditions and build the extended character representation,
which isn't hard either.)

You might not think so, but you would be wrong. You have a test and 
potential branch (possibly more--folks with lots of UTF-8 data, which 
includes everyone with a non-latin alphabet) on *every* character. That is 
not cheap on modern processors. Yes, you're pulling in significantly less 
data, which has an impact with UTF-32 (and garbage collection) but I'm not 
sure you'll find it a win.

We can benchmark it and see if my feeling is wrong once we get some code 
and a testing scaffold built.
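
The "test and potential branch on every character" point can be seen in a small sketch that finds the i-th character in each encoding. The function names are made up for the example, and the UTF-8 routine assumes well-formed input; it makes no claim about how perl's string code would actually be written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UTF-32: character i sits at a fixed offset, so no scanning is needed. */
uint32_t utf32_char_at(const uint32_t *s, size_t i) {
    return s[i];
}

/*
 * UTF-8: the byte offset of character i has to be found by walking the
 * buffer and testing every byte. Bytes of the form 10xxxxxx continue the
 * previous character; anything else starts a new one.
 * Returns len if the string has i or fewer characters.
 */
size_t utf8_offset_of(const unsigned char *s, size_t len, size_t i) {
    assert(s != NULL || len == 0);
    for (size_t off = 0; off < len; off++) {
        if ((s[off] & 0xC0) != 0x80) {  /* the per-byte test and branch */
            if (i == 0)
                return off;
            i--;
        }
    }
    return len;
}
```

Whether the smaller memory footprint of UTF-8 makes up for that per-byte test is exactly the kind of question the proposed benchmark would have to answer.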

  It would be nice if the user had some control to this, for example by saying
  "I don't care this string will be used by substr, leave it in UTF-8 since
  it's too big and I don't want to waste memory!", or "This string isn't too
  big, so I should convert it to bloated UTF-32 at once!", or even "use less
  'memory';".
 
  That would be:
 my str $foo : utf8 : fixed;
  or possibly
 use less qw(memory);
 

Probably not my str $foo :utf8 :fixed, since then if I have $bar = $foo it
would convert the string value from $foo to anything else, right?

Might. Larry's not set the rules on what attributes are passed on with 
assignment. If you're really worried, there's no reason not to set 
attributes on $bar either.

  Generally speaking you probably don't want to do this. Odds are if you
  think you know what's going on better than the compiler, you're wrong. (Not
  always, but in a non-trivial number of cases, in my experience)
 

I can't beat the compiler, that's for sure. But I really don't think I want
to read a 100KB file into a variable all at once and end up with 400KB
memory usage only for that file. And I really don't care if `regexps' go
slower on that, I can live with it...

If it's binary data or 8-bit characters, you won't. If it's UTF-8 you might 
see expansion, but how much depends on how many 7-bit characters you have. 
And then only if something actually asks for the data in UTF-32 format.
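
The arithmetic behind that can be worked through in a few lines. A 100KB file of pure 7-bit text holds 102400 characters and would need 4 x 102400 = 409600 bytes (400KB) as UTF-32, a 4x expansion; the same 100KB made up entirely of two-byte UTF-8 characters holds 51200 characters and would need 200KB, a 2x expansion. The helper names below are assumptions, and the code assumes well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Count characters by counting bytes that are not 10xxxxxx continuation bytes. */
size_t utf8_count_chars(const unsigned char *s, size_t len) {
    assert(s != NULL || len == 0);
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* UTF-32 spends four bytes on every character, whatever it cost in UTF-8. */
size_t utf32_bytes_needed(const unsigned char *s, size_t len) {
    return 4 * utf8_count_chars(s, len);
}
```

So the more 7-bit characters the data contains, the closer the expansion gets to the full 4x.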

This has been enough to convince me that there should be UTF-8 as one of 
the base character types for vtables, even if we don't use it in many 
places internally. For stuff that's just read and printed, it'll save 
memory, I think. Hope, at least. (Though it probably means the regex engine 
should deal with variable-width characters, and I'd really rather it didn't)

  And I believe 8-bit ASCII will always be an option, for who doesn't care
  about extended characters and want the best of both worlds on speed and
  memory usage.
 
  8-bit characters in general, yep. (ASCII is really 7-bit) ASCII, EBCDIC, or
  raw byte buffers.
 

That includes Latin-1, Latin-etc. (I believe they're 10 or 12), which are
the same as the ISO-8859-1, ISO-8859-(etc).

Yes. Anything that doesn't require UTF-8.

Dan


Re: PDD 2, vtables

2001-02-09 Thread Simon Cozens

On Thu, Feb 08, 2001 at 09:55:17PM -0600, Jarkko Hietaniemi wrote:
  Umm, one way or another I suspect UTF-8 will be in there.
 
 I suspect so too but very grudgingly.

If we abstract the string handling nicely, it can be added on later or even
separately. (Credo: We don't have to write all of Perl 6 at once.) Yes, UTF8
is a bit of a pig to deal with, but since most of the world's data is still 
7-bit, it's almost certainly worth it.

-- 
Britain has football hooligans, Germany has neo-Nazis, and France has farmers. 
-The Times



Re: PDD 2, vtables

2001-02-08 Thread Edwin Steiner

Dan Sugalski wrote:
 At 04:02 PM 2/7/2001 +, David Mitchell wrote:
   Please see my previous post on the subject. As I pointed out there,
   implementing || and && like that breaks short-circuits.
  
   No, it doesn't. Just because you pass in two PMCs doesn't mean that they
   both need to be evaluated. Though the PDD does need to be clearer about
   how that happens.
 
 Hmmm, I can't quite see how that trick works. How would the following get
 evaluated:
 
 $opened || open(F, ...)
 
 The second PMC would point to a lazy list, so it wouldn't be evaluated 
 unless its value gets fetched.

This implies there will be PMCs containing or referring to an arbitrary
amount of bytecode (with delayed execution). Consider this:

$dest = $i || ${ BLOCK };

I wonder which vtable function $i->logical_or will use to trigger the
delayed evaluation of the right operand. (I assume it will be evaluated
before the assignment.) The only thing $i->logical_or can know about
${ BLOCK } is that it will be something scalar.

Problem:
1. the evaluation of ${ BLOCK } must be done now.
2. the result should not be forced to a certain type, because
   || doesn't do this. (Contrary to arithmetic operators the
   left operand $i has nothing to say in this.)

I don't see any vtable functions in the PDD which could solve this.
Maybe there should be 
get_scalar and 
get_list 
functions, which do not force anything but evaluation in scalar/list 
context.

-Edwin



Re: PDD 2, vtables

2001-02-08 Thread Nicholas Clark

On Wed, Feb 07, 2001 at 01:24:27PM -0500, Dan Sugalski wrote:
 At 06:12 PM 2/7/2001 +, Nicholas Clark wrote:

 But I don't like the thought of going in and out of a lot of generic
 routines for
 
 $a = 3;
 $a += 2;
 
 when the integer scalar ought to know what the inside of another integer
 scalar looks like, and that 2 + 3 doesn't overflow.

 That particular case would get caught by the optimizer (I'd hope) so it'd 
 not be an issue anyway.

Oops. Should have written it as

sub foo {
  $_[0] += 2;
}

and somewhere else

$a = 3; foo ($a);

but I agree - we need to time these things, not talk about them

 Hmm. += isn't another opcode
 it's a special case of a = b + c where the PMCs for a and b are the same
 thing. And I see no real reason why it can't be part of the + entry.
 
 Whether a special case in the code would get a speedup or not's up in the 
 air. (Is the test and branch faster than a generic doing it routine?) I'd 
 want to test that and see before I decided.

Yes, we're trading size [data cache (larger virtual tables on scalars) +
instruction cache (2 routines)] versus branch slowdown.  Or probably
something more complicated than that.

Hence your approach of make it flexible enough and try both.

Nicholas Clark



Re: PDD 2, vtables

2001-02-08 Thread Dan Sugalski

At 12:12 PM 2/8/2001 +, Nicholas Clark wrote:
On Wed, Feb 07, 2001 at 01:24:27PM -0500, Dan Sugalski wrote:
  At 06:12 PM 2/7/2001 +, Nicholas Clark wrote:

  But I don't like the thought of going in and out of a lot of generic
  routines for
  
  $a = 3;
  $a += 2;
  
  when the integer scalar ought to know what the inside of another integer
  scalar looks like, and that 2 + 3 doesn't overflow.
 
  That particular case would get caught by the optimizer (I'd hope) so it'd
  not be an issue anyway.

Oops. Should have written it as

sub foo {
   $_[0] += 2;
}

and somewhere else

$a = 3; foo ($a);

It's examples like these that make me wince at the things we can't do in 
the perl optimizer. (This is a great candidate for inlining, and we 
probably won't be able to.)

but I agree - we need to time these things, not talk about them

Once we hammer out the vtable spec, I want to start writing code for the 
base vtable types. (Plain scalar, plain array, plain hash) I'm hoping this 
won't be hindered by the current licensing terms. (Which are laid out in 
http://archive.develooper.com/perl6-internals%40perl.org/msg01678.html for 
the interested)

  Hmm. += isn't another opcode
  it's a special case of a = b + c where the PMCs for a and b are the same
  thing. And I see no real reason why it can't be part of the + entry.
 
  Whether a special case in the code would get a speedup or not's up in the
  air. (Is the test and branch faster than a generic doing it routine?) I'd
  want to test that and see before I decided.

Yes, we're trading size [data cache (larger virtual tables on scalars) +
instruction cache (2 routines)] versus branch slowdown.  Or probably
something more complicated than that.

And probably completely counter-intuitive on top of it. (Plus the 
performance characteristics will probably be completely opposite on x86 and 
Alpha/SPARC machines. That'd be about right)

Hence your approach of make it flexible enough and try both.

Yep. Time it and try it, I think.
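
The two shapes being weighed could be sketched as follows. The PMCType tag and both function names are assumptions for the example, and overflow is deliberately ignored. Both routes give the same answer; which is faster is the thing that has to be timed.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TYPE_UNDEF, TYPE_INT } PMCType;

/* Hypothetical minimal integer PMC. */
typedef struct pmc {
    PMCType type;
    long    int_value;
} PMC;

/*
 * Option A: a single "+" entry. a = b + c and a += c share it, at the cost
 * of a test and branch to spot the case where dest and left are the same PMC.
 */
void int_add(PMC *dest, const PMC *left, const PMC *right) {
    assert(dest != NULL && left != NULL && right != NULL);
    if (dest == left) {            /* the += case: dest is already an int */
        dest->int_value += right->int_value;
        return;
    }
    dest->type = TYPE_INT;         /* general case: retype dest first */
    dest->int_value = left->int_value + right->int_value;
}

/*
 * Option B: a second vtable entry just for +=. No branch, but a larger
 * vtable and a second routine competing for instruction cache.
 */
void int_add_inplace(PMC *dest, const PMC *right) {
    dest->int_value += right->int_value;
}
```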

Dan





Re: PDD 2, vtables

2001-02-08 Thread Dan Sugalski

At 11:57 AM 2/8/2001 +0100, Edwin Steiner wrote:
Dan Sugalski wrote:
  At 04:02 PM 2/7/2001 +, David Mitchell wrote:
    Please see my previous post on the subject. As I pointed out there,
    implementing || and && like that breaks short-circuits.
   
    No, it doesn't. Just because you pass in two PMCs doesn't mean that they
    both need to be evaluated. Though the PDD does need to be clearer about
    how that happens.
  
  Hmmm, I can't quite see how that trick works. How would the following get
  evaluated:
  
  $opened || open(F, ...)
 
  The second PMC would point to a lazy list, so it wouldn't be evaluated
  unless its value gets fetched.

This implies there will be PMCs containing or referring to an arbitrary
amount of bytecode (with delayed execution). Consider this:

 $dest = $i || ${ BLOCK };

I wonder which vtable function $i->logical_or will use to trigger the
delayed evaluation of the right operand. (I assume it will be evaluated
before the assignment.) The only thing $i->logical_or can know about
${ BLOCK } is that it will be something scalar.

What we can do in that case is treat the right-hand block as an anonymous 
sub, and put a code ref in the list for the righthand side.

Problem:
 1. the evaluation of ${ BLOCK } must be done now.

Must it? Lazy evaluation would lead me to think it shouldn't be done unless 
the left side of the || evaluates to something false.

 2. the result should not be forced to a certain type, because
|| doesn't do this. (Contrary to arithmetic operators the
left operand $i has nothing to say in this.)

Well, the right side of || gets the context of the left side of the 
assignment, if it's in one. But dealing with odd context isn't anything new 
in perl.
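
The code-ref approach can be pictured with a small sketch in which the right operand arrives as a function plus its environment and is only run when needed. The Thunk type, pmc_get_bool and fake_open are all invented for the example; how a real sub PMC would be invoked was still an open question at this point in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Only truthiness matters for this sketch. */
typedef struct pmc {
    long int_value;
} PMC;

/* Hypothetical code ref: a function pointer plus its captured environment. */
typedef struct thunk {
    PMC  *(*call)(void *env);
    void *env;
} Thunk;

int pmc_get_bool(const PMC *p) {
    return p->int_value != 0;
}

/*
 * left || right: the right side is run only when the left side is false,
 * and whatever it yields is passed through without coercing its type.
 */
PMC *logical_or(PMC *left, const Thunk *right) {
    assert(left != NULL && right != NULL);
    if (pmc_get_bool(left))
        return left;
    return right->call(right->env);
}

/* Stand-in for the open(F, ...) call; counts how often it really runs. */
typedef struct open_env {
    int calls;
    PMC result;
} OpenEnv;

PMC *fake_open(void *env) {
    OpenEnv *e = (OpenEnv *)env;
    e->calls++;
    return &e->result;
}
```

When the left operand is true the thunk is never called, which preserves short-circuiting even though both operands were handed to logical_or.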

I don't see any vtable functions in the PDD which could solve this.
Maybe there should be
 get_scalar and
 get_list
functions, which do not force anything but evaluation in scalar/list
context.

Context is set separately, and I'm not sure how to do it yet. I'm 
considering some sort of interpreter variable with the current context, but 
I'm not sure that's the best way.

We will need utility functions to find it out, though.

Dan





Re: PDD 2, vtables

2001-02-08 Thread Edwin Steiner

Dan Sugalski wrote:
 
 At 11:57 AM 2/8/2001 +0100, Edwin Steiner wrote:
 Dan Sugalski wrote:
   At 04:02 PM 2/7/2001 +, David Mitchell wrote:
     Please see my previous post on the subject. As I pointed out there,
     implementing || and && like that breaks short-circuits.
    
     No, it doesn't. Just because you pass in two PMCs doesn't mean that they
     both need to be evaluated. Though the PDD does need to be clearer about
     how that happens.
   
   Hmmm, I can't quite see how that trick works. How would the following get
   evaluated:
   
   $opened || open(F, ...)
  
   The second PMC would point to a lazy list, so it wouldn't be evaluated
   unless its value gets fetched.
 
 This implies there will be PMCs containing or referring to an arbitrary
 amount of bytecode (with delayed execution). Consider this:
 
  $dest = $i || ${ BLOCK };
 
 I wonder which vtable function $i->logical_or will use to trigger the
 delayed evaluation of the right operand. (I assume it will be evaluated
 before the assignment.) The only thing $i->logical_or can know about
 ${ BLOCK } is that it will be something scalar.
 
 What we can do in that case is treat the right-hand block as an anonymous
 sub, and put a code ref in the list for the righthand side.

That would be perfectly fine, I think.
Would the sub (itself, not the reference) be a PMC?
Either way you still need a means to evaluate/call this thing.
I thought there should be virtual functions for this on the
ref PMC or on the sub PMC itself. Is this a misconception?

 
 Problem:
  1. the evaluation of ${ BLOCK } must be done now.
 
 Must it? Lazy evaluation would lead me to think it shouldn't be done unless
 the left side of the || evaluates to something false.

I didn't express myself clearly. I was (unfortunately not explicitly)
referring to the case when the left side evaluates to false, and only
to this case. By "now" I meant after the evaluation of the left side
(to false) and before the assignment.

The problem I was pointing out is triggering the delayed evaluation "now"
(probably by calling a get_* function)
without forcing the result to be one of (integer/number/string/bool).

You suggest making it an anon-sub-ref call, which is fine. But as I stated
above I expected there would be virtual functions for this.
After all the internals of a PMC should only be known to the
virtual functions, right? (eg. Where the ref PMC stores the function/sub
pointer would be internal.)

Your words (PDD 2):
Nothing outside
the core of perl (in fact, nothing outside the data type's vtable
routines) should infer anything about a PMC. (hence the Magic part)

 
  2. the result should not be forced to a certain type, because
 || doesn't do this. (Contrary to arithmetic operators the
 left operand $i has nothing to say in this.)
 
 Well, the right side of || gets the context of the left side of the
 assignment, if it's in one. But dealing with odd context isn't anything new
 in perl.

Exactly. I actually found that what I meant (in this example) is described as
"don't-care scalar context" in the Camel Book.

I was troubled because in the PDD there are only get_* functions
for more specific cases (int,number,string,bool).

 
 I don't see any vtable functions in the PDD which could solve this.
 Maybe there should be
  get_scalar and
  get_list
 functions, which do not force anything but evaluation in scalar/list
 context.
 
 Context is set separately, and I'm not sure how to do it yet. I'm
 considering some sort of interpreter variable with the current context, but
 I'm not sure that's the best way.
 
 We will need utility functions to find it out, though.

-Edwin



Re: PDD 2, vtables

2001-02-07 Thread Tim Bunce

On Tue, Feb 06, 2001 at 12:28:23PM -0500, Dan Sugalski wrote:
 At 11:26 AM 2/6/2001 +, Tim Bunce wrote:
 [First off: I've not really been paying attention so forgive me if I'm
 being dumb here.  And many thanks for helping to drive this forwards.]
 
 On Mon, Feb 05, 2001 at 05:14:44PM -0500, Dan Sugalski wrote:
  
   =head2 Core datatypes
  
   For ease of use, we define the following semi-abstract data types
 
 Probably worth stating upfront that it'll be easy to add new types
 to avoid people arguing for their favorite type to be added here.
 
 I'm not sure it should be--that'd mean extending the vtables in ways they 
 have little room to grow. Adding new perl datatypes is easy, adding new 
 low-level types is harder.

That's pretty much what I meant. I think it's worth saying.

   =item INT
   =item NUM
   =item STR
   =item BOOL
 
 What about references?
 
 Special type of scalar, not dealt with here.

But should be at least mentioned.

   =item UTF-32 string
   =item Native string
   =item Foreign string
 
 I'm a little surprised not to see UTF-8 there, but since I'm also
 confused about what Native string and Foreign string are I'll skip it.
 Except to say that some clarification here may help, and explicitly
 mentioning UTF-8 (even to say it won't be a core type and provide a
 reference to why) would be good.
 
 I didn't put UTF-8 in on purpose, because I'd just as soon not deal with it 
 internally. Variable length character data's a pain in the butt, and if we 
 can avoid having the internals deal with it except as a source that gets 
 converted to UTF-32, that's fine with me.

I agree with Branden that a default 4x memory bloat would not be popular.

 The native and foreign string data types were an attempt to accommodate 
 UTF-8, as well as ASCII and EBCDIC character data. One of the three will 
 likely be the native type, and the rest will be foreign strings. I'm not 
 sure if perl should have only one foreign string type, or if we should have 
 a type tag along with the other bits for strings.

Umm, one way or another I suspect UTF-8 will be in there.

   =item is_same
  
  BOOL   is_same(PMC1, PMC2[, key]);
  
   Returns TRUE if C<PMC1> and C<PMC2> refer to the same value, and FALSE
   otherwise.
 
 I think that needs more clarification, especially where they are of
 different types. Contrast with is_equal() below.
 
 If they're different types they can't be the same. This would be used to 
 check if two references have the same referent, or if two magic variables 
 (database handles, say) pointed to the same thing.

Okay, so say so in the PDD. "refer to the same value" isn't very clear
(the word value is probably the problem).
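
The distinction might be sketched like this: is_same compares identity (same type and same referent), while is_equal compares contents. The PMC layout and both function bodies are assumptions; in particular is_equal here only handles a made-up "reference to a long" type, since its real definition is not quoted in this message.

```c
#include <assert.h>
#include <stddef.h>

typedef struct vtable {
    const char *name;
} VTABLE;

/* Hypothetical PMC that refers to something else. */
typedef struct pmc {
    const VTABLE *vtbl;
    void         *referent;
} PMC;

const VTABLE ref_vtable    = { "ref" };
const VTABLE handle_vtable = { "handle" };

/* Different types can never be the same; otherwise compare referents. */
int is_same(const PMC *a, const PMC *b) {
    assert(a != NULL && b != NULL);
    return a->vtbl == b->vtbl && a->referent == b->referent;
}

/* Illustrative only: treats both referents as longs and compares them. */
int is_equal(const PMC *a, const PMC *b) {
    return *(const long *)a->referent == *(const long *)b->referent;
}
```

Two references to different variables that both hold 5 would be equal but not the same; two references to one variable would be both.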

   =item concatenate
  
  void   concatenate(PMC1, PMC2, PMC3[, key]); ##
  
   Concatenates the strings in C<PMC2> and C<PMC3>, storing the result in
   C<PMC1>.
 
 and insert (ala sv_insert)  etc?
 
 Hadn't considered them. Care to elaborate on the etc?

Er, I haven't looked at sv.c for ages but basically all the kinds of
string manipulations that ended up in there for good reason will
probably need to be in perl6. sv_insert is a good example (and possibly
the only one :-)

   =item logical_or
   =item logical_and
   =item logical_not
 
 Er, why not just use get_bool? The only reason I can think of is to
 support three-value-logic but that would probably be better handled
 via a higher-level overloading kind of mechanism. Either way, clarify.
 
 Well, there's overloading. Plus the potential that a class will do 
 something odd with it--if you || on two custom arrays in list context you 
 might get an array with each pair (left[0] || right[0] and so on)
 logically or'd.

Okay, don't forget xor then :)

   =item match
  
  void   match(PMC1, PMC2, REGEX[, key]);
  
   Performs a regular expression match on C<PMC2> against the expression
   C<REGEX>, placing the results in C<PMC1>.
 
 Results, plural = container = array or hash. Needs clarifying.
 
 Yep, especially since I'd considered tossing the match destination 
 entirely. (Though that means special variables, and I'm not sure I want to 
 go there) It'll likely just return true or false. I'll rethink it.

A BOOL return would be good. But "placing the results in C<PMC1>" is
also good (assuming 'results' are equiv to $1, $2 etc in perl5).

   =head1 REFERENCES
  
   PDD 3: Perl's Internal Data Types.
 
 Some references to any other vtable based languages would be good.
 (I presume people have looked at some and learnt lessons.)
 
 Alas not. This is pretty much head of zeus stuff, modulo some ego. (Mine's 
 not *that* big...)

:-)

Without studying history we may be doomed to repeat it.

So can anyone point to vtable based language implementations?

Tim.



Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

Some comments about the vtable PDD...

First a general comment. I think we really need to make it clear for
each method, which arg represents the object that is having its method
called (ie which is $self/this so to speak). One way to make this clear
would be to insist that the first arg is always $self,
but failing that, it should be explicitly mentioned for each function.

Also, can I suggest a bit of terminology, which I will use below?
I define an *empty* PMC as a PMC which exists, but does not have a valid
vtable ptr or content (and so whose methods must not be called under
any circumstances). I specifically contrast this with an undefined PMC,
which has a valid vtable pointer that points to a bunch of methods that
mostly call carp("use of undefined value ...").
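
To make the empty/undefined distinction concrete, here is a minimal C sketch. Everything in it is an assumption for illustration: the PDD defines no struct layout, and the names PMC, VTABLE, pmc_is_empty, pmc_make_undef and the "warned" out-parameter (standing in for carp) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: a PMC is a vtable pointer plus private data. */
typedef struct PMC PMC;

typedef struct VTABLE {
    const char *class_name;
    long (*get_int)(PMC *self, int *warned);
} VTABLE;

struct PMC {
    const VTABLE *vtable;  /* NULL means "empty": no method may be called */
    void *data;
};

/* The undef class: its methods are callable, but they flag a warning
   (a stand-in for carp("use of undefined value ...")). */
static long undef_get_int(PMC *self, int *warned)
{
    (void)self;
    *warned = 1;
    return 0;
}

static const VTABLE undef_vtable = { "undef", undef_get_int };

static int pmc_is_empty(const PMC *p) { return p->vtable == NULL; }
static int pmc_is_undef(const PMC *p) { return p->vtable == &undef_vtable; }

static void pmc_make_empty(PMC *p) { p->vtable = NULL; p->data = NULL; }
static void pmc_make_undef(PMC *p) { p->vtable = &undef_vtable; p->data = NULL; }

static long pmc_get_int(PMC *p, int *warned)
{
    assert(!pmc_is_empty(p));  /* calling through an empty PMC is a bug */
    return p->vtable->get_int(p, warned);
}
```

With this layout an empty PMC is just raw storage waiting for a class, while an undefined PMC is a fully valid object whose class happens to be "undef".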

Now onwards and upwards


 The C<key> parameter is optional, and if passed it refers to an array
 of key structure pointers.

A mere detail, but would it not be more efficient to just pass them
as extra args, ie add(PMC1, PMC2, PMC3, key1, key2, key3),
rather than having to potentially create and populate a tmp struct
just to call the function???



IV     type(PMC[, subtype]);
 
 Returns the type of the PMC. If the subtype is passed (int, string,
 num) it returns the subtype of the PMC. This is generally a class
 function rather than a variable one, but the PMC is passed in just in
 case. (And so we can have the subtype be a vararg parameter)

I don't understand the subtype bit. If for example we pass an optional
2nd arg of INT (assuming this is some sort of enum?), what does type()
return?



 =item new
 
void   new(PMC[, key]);
 
 Creates a new variable of the appropriate type out of the passed PMC,
 destroying the current contents if there are any. This is a class
 function.

As an aside, am I right in assuming that there will be a function somewhere
(outside the scope of this PDD) that creates new, empty PMCs, which can then
be acted upon by new() to turn them into PMCs of a particular type?

Will PMCs that have just been new()ed have a default null value, eg
0/"" ?  Note that they won't be undefined - at least, I'm assuming that
undefined is handled by a separate class.
Perhaps we need instead a range of new()s that initialise to various
string, numeric etc values?

The above definition of new() implies that it first calls destroy()
to release any previous contents. I think it would be better to define
new() as operating on an empty PMC (so it is the caller's responsibility
to call destroy() first, if necessary).

Actually, I suspect that the whole area of new/clone/destroy etc will need to
be examined carefully in the light of a 'typical' variable lifecycle,
to avoid unnecessary transitions. For example, my $a = 'abc' might
involve $a going from empty -> undef -> empty -> "" -> "abc" or similar,
if we're not careful.


void   clone(PMC1, PMC2[, int flags[, key]]);
 
 Copies C<PMC2> into C<PMC1>. The C<flags> parameter notes whether
 a deep copy should be done. (Possibly other things as well, if someone
 thinks of something reasonable)

One flag that would be very useful is 'destroy', which tells clone() to
destroy PMC2 immediately after the clone operation.
This is because a clone will often be immediately followed by a destroy
of the copied PMC, and delegating the destroy() to clone() allows clone()
the chance to do things more efficiently (eg even when asked to do a deep
copy, it just copies the vtable pointer and payload pointer(s), then scrubs
the old PMC)

I also think that clone should expect PMC1 to be empty - ie it assumes the
caller has already called destroy() if necessary.


I guess we also need an assign() method, to handle

$a = $b and the like.

Note that assign and clone are very different operations (although assign
may well call clone): assign() copies *to* itself, while clone() copies
*from* itself.
Note that if $a is a 'simple' variable, $a->assign($b) will
itself just fall through to $b->clone($a) and let $a be wiped; while
if $a is magic or tied or whatever, then $a->assign($b) will take
a more active role in setting its own value, based on the value of $b.
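
The assign/clone split described above can be sketched in C as follows. This is only an illustration under stated assumptions: the self-first argument order follows the suggestion earlier in this message rather than the PDD's clone(PMC1, PMC2), and the names plain_vtable, tied_vtable, pmc_assign and the stores_seen counter (a stand-in for a tie's STORE hook) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PMC PMC;

typedef struct VTABLE {
    void (*clone)(const PMC *self, PMC *dest);  /* copies *from* self */
    void (*assign)(PMC *self, const PMC *src);  /* copies *to* self   */
} VTABLE;

struct PMC {
    const VTABLE *vtable;
    long ival;
    int stores_seen;  /* stand-in for a tied variable's STORE hook */
};

static void plain_clone(const PMC *self, PMC *dest)
{
    /* The destination is wiped: it takes on the source's class and value. */
    dest->vtable = self->vtable;
    dest->ival = self->ival;
    dest->stores_seen = 0;
}

static void plain_assign(PMC *self, const PMC *src)
{
    /* A simple variable just falls through to the source's clone(). */
    src->vtable->clone(src, self);
}

static void tied_assign(PMC *self, const PMC *src)
{
    /* A magic variable keeps its own vtable and sets its value itself. */
    self->ival = src->ival;
    self->stores_seen++;
}

static const VTABLE plain_vtable = { plain_clone, plain_assign };
static const VTABLE tied_vtable  = { plain_clone, tied_assign };

/* $dest = $src: dispatch on the destination, not the source. */
static void pmc_assign(PMC *dest, const PMC *src)
{
    assert(dest && src && dest->vtable && src->vtable);
    dest->vtable->assign(dest, src);
}
```

The point of the sketch is the dispatch direction: assignment goes through the destination's vtable, so a tied destination can intercept it, while cloning goes through the source's.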



void   morph(PMC, type[, key]);
 
 Tells the PMC to change itself into a PMC of the specified type.

I don't really see what the difference is between this and new().



void   destroy(PMC[, key]);
 
 Destroys the variable the PMC represents, leaving it undef.

(See also my comments earlier about new/clone/destroy etc).

I think destroy should leave an empty PMC rather than an undef one,
since as I said earlier, I think undef is a class in its own right.



 =item exists (x)
 
BOOL   exists(PMC1[, key]);


Presumably we also need defined(). (where most classes will always return
false, while the 'undefined' class always returns true.)




Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

 Please see my previous post on the subject. As I pointed out there, implementing
 || and && like that breaks short-circuits.
 
 No, it doesn't. Just because you pass in two PMCs doesn't mean that they 
 both need to be evaluated. Though the PDD does need to be clearer about how 
 that happens.

Hmmm, I can't quite see how that trick works. How would the following get
evaluated:

$opened || open(F, ...)




Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 04:02 PM 2/7/2001 +, David Mitchell wrote:
  Please see my previous post on the subject. As I pointed out there,
  implementing || and && like that breaks short-circuits.
 
  No, it doesn't. Just because you pass in two PMCs doesn't mean that they
  both need to be evaluated. Though the PDD does need to be clearer about 
 how
  that happens.

Hmmm, I can't quite see how that trick works. How would the following get
evaluated:

$opened || open(F, ...)

The second PMC would point to a lazy list, so it wouldn't be evaluated 
unless its value gets fetched.

Dan
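
One way to picture the deferred right-hand operand is sketched below in C. It swaps in a simple memoized thunk for the lazy list mentioned above, which is an assumption on my part, and every name in it (LazyPMC, lazy_get_bool, logical_or, fake_open) is hypothetical rather than anything from the PDD.

```c
#include <assert.h>
#include <stddef.h>

/* A PMC whose value is computed only when it is first fetched. */
typedef struct LazyPMC {
    int  (*thunk)(void *ctx);  /* deferred computation, e.g. open(F, ...) */
    void *ctx;
    int   forced;              /* has the thunk already run? */
    int   value;               /* cached truth value once forced */
} LazyPMC;

static int lazy_get_bool(LazyPMC *p)
{
    assert(p != NULL);
    if (!p->forced) {
        p->value  = p->thunk(p->ctx) != 0;
        p->forced = 1;
    }
    return p->value;
}

/* logical_or receives both operands, but only fetches the right one
   when the left one is false, so short-circuiting is preserved. */
static int logical_or(int left, LazyPMC *right)
{
    return left ? 1 : lazy_get_bool(right);
}

/* Stand-in for open(F, ...): counts how often it is really called. */
static int fake_open(void *ctx)
{
    int *calls = ctx;
    (*calls)++;
    return 1;
}
```

For $opened || open(F, ...), the call to open sits behind the thunk: if $opened is true the thunk never runs, and if it is false the thunk runs exactly once.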





Re: PDD 2, vtables

2001-02-07 Thread Nicholas Clark

On Wed, Feb 07, 2001 at 04:03:49PM +, David Mitchell wrote:
 BTW, should the vtable include all the mutator operators too, ie
 ++, += and so on, on the grounds that an implementation may be able to
 do this more efficiently internally?

++ and -- are already slightly messy in perl5

pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
They know how to increment and decrement integers that don't overflow,
and call routines in sv.c to increment and decrement anything else.

Actually, this nearly provides a divide between values and operators
that has been suggested, with the speed up hack for the common case.

Nicholas Clark



Re: PDD 2, vtables

2001-02-07 Thread Branden

Nicholas Clark wrote:
 ++ and -- are already slightly messy in perl5

 pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
 They know how to increment and decrement integers that don't overflow,
 and call routines in sv.c to increment and decrement anything else.

 Actually, this nearly provides a divide between values and operators
 that has been suggested, with the speed up hack for the common case.

 Nicholas Clark


I guess everything (including get/set, add/sub/mul/...) could have a speed
up hack for the common case. The vtables (or PMC's flags as well) could have
a flag that indicates ``No tying and no overloading here, nothing special,
just another plain old variable''. Every operation would check the `special'
flag of the values it operates, and do the right thing on them. Otherwise,
call the vtable for the generic way of doing it.

I _think_ this would be a great speed up if the program doesn't use much
magic, but perhaps the overhead would be too big and make tying slower than
in Perl 5... something to consider though.

- Branden
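
A rough C sketch of the fast-path idea above, under several assumptions: the flag name PMC_PLAIN_INT, the op_add entry point and the struct layout are all hypothetical, and integer overflow is ignored to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

typedef struct PMC PMC;

typedef struct VTABLE {
    void (*add)(PMC *dest, const PMC *l, const PMC *r);
} VTABLE;

/* Hypothetical "no tying, no overloading, plain old integer" flag. */
enum { PMC_PLAIN_INT = 1 };

struct PMC {
    const VTABLE *vtable;
    unsigned flags;
    long ival;
};

static int generic_calls = 0;  /* counts trips through the vtable */

static void generic_add(PMC *dest, const PMC *l, const PMC *r)
{
    generic_calls++;
    dest->ival = l->ival + r->ival;
}

static const VTABLE int_vtable = { generic_add };

/* The opcode checks the flag on every operand first. */
static void op_add(PMC *dest, const PMC *l, const PMC *r)
{
    assert(dest && l && r);
    if (dest->flags & l->flags & r->flags & PMC_PLAIN_INT) {
        dest->ival = l->ival + r->ival;  /* fast path: no indirect call */
    } else {
        l->vtable->add(dest, l, r);      /* generic path via the vtable */
    }
}
```

Whether the extra test and branch pays for itself is exactly the open question raised above; the sketch only shows where the check would sit.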




Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

Nicholas Clark mused:
 On Wed, Feb 07, 2001 at 04:03:49PM +, David Mitchell wrote:
  BTW, should the vtable include all the mutator operators too, ie
  ++, += and so on, on the grounds that an implementation may be able to
  do this more efficiently internally?
 
 ++ and -- are already slightly messy in perl5
 
 pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
 They know how to increment and decrement integers that don't overflow,
 and call routines in sv.c to increment and decrement anything else.
 
 Actually, this nearly provides a divide between values and operators
 that has been suggested, with the speed up hack for the common case.

I'm not sure I follow you. What is the "this" in "this nearly provides a
divide"?

Confused of Sheffield.




Re: PDD 2, vtables

2001-02-07 Thread Nicholas Clark

On Wed, Feb 07, 2001 at 05:19:16PM +, David Mitchell wrote:
 Nicholas Clark mused:
  On Wed, Feb 07, 2001 at 04:03:49PM +, David Mitchell wrote:
   BTW, should the vtable include all the mutator operators too, ie
   ++, += and so on, on the grounds that an implementation may be able to
   do this more efficiently internally?
  
  ++ and -- are already slightly messy in perl5
  
  pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
  They know how to increment and decrement integers that don't overflow,
  and call routines in sv.c to increment and decrement anything else.
  
  Actually, this nearly provides a divide between values and operators
  that has been suggested, with the speed up hack for the common case.
 
 I'm not sure I follow you. What is the "this" in "this nearly provides a
 divide"?

this example.
I think the "nearly" probably should go.
Maybe I should have written "++ and -- in perl5 provide an example of a
(nearly clean) divide between operator and value".

 Confused of Sheffield.

Hmm. Yes. I'm confused too.

Confused of Newcastle



Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 03:09 PM 2/7/2001 +, David Mitchell wrote:
Some comments about the vtable PDD...

First a general comment. I think we really need to make it clear for
each method, which arg represents the object that is having its method
called (ie which is $self/this so to speak). One way to make this clear
would be to insist that the first arg is always $self,
but failing that, it should be explicitly mentioned for each function.

The docs are unclear there. I'll patch 'em up.

FWIW, generally the first argument is the destination.

Also, can I suggest a bit of terminology, which I will use below?
I define an *empty* PMC as a PMC which exists, but does not have a valid
vtable ptr or content (and so whose methods must not be called under
any circumstances). I specifically contrast this with an undefined PMC,
which has a valid vtable pointer that points to a bunch of methods that
mostly call carp("use of undefined value ...").

This is an area that needs addressing, that's for sure. I'll deal with that 
as well.

  The C<key> parameter is optional, and if passed it refers to an array
  of key structure pointers.

A mere detail, but would it not be more efficient to just pass them
as extra args, ie add(PMC1, PMC2, PMC3, key1, key2, key3),
rather than having to potentially create and populate a tmp struct
just to call the function???

Well, extra arguments cost. I'd originally had only a single key optionally 
passed in, but that meant potentially two of the three PMCs had to be real, 
rather than keyed into containers. (And thus possibly virtual) Hence the 
key array.

I admit it's not a swell decision--I'm not sure whether the single optional 
parameter is better, or several optional (or required) parameters are 
better. I can see it going either way, and I didn't put all that much 
thought into it.

 IV     type(PMC[, subtype]);
 
  Returns the type of the PMC. If the subtype is passed (int, string,
  num) it returns the subtype of the PMC. This is generally a class
  function rather than a variable one, but the PMC is passed in just in
  case. (And so we can have the subtype be a vararg parameter)

I don't understand the subtype bit. If for example we pass an optional
2nd arg of INT (assuming this is some sort of enum?), what does type()
return?

If you pass in INT, you'll get back NATIVE or BIGINT, depending on which 
one the PMC would rather be. STR would get BINARY, NATIVE, FOREIGN, or UTF_32.

Basically, subtype says "If I asked you what kind of string you were, what 
would you tell me?"

  =item new
 
 void   new(PMC[, key]);
 
  Creates a new variable of the appropriate type out of the passed PMC,
  destroying the current contents if there are any. This is a class
  function.

As an aside, am I right in assuming that there will be a function somewhere
(outside the scope of this PDD) that creates new, empty PMCs, which can then
be acted upon by new() to turn them into PMCs of a particular type?

Probably, yes. More likely, PMCs will be declared nukable unless a "clean 
me up" flag is set, in which case we'd just stomp on what was in there. 
(Since we generally don't care about the contents of a PMC when trashing 
it, with some relatively rare exceptions)

Will PMCs that have just been new()ed have a default null value, eg
0/"" ?  Note that they won't be undefined - at least, I'm assuming that
undefined is handled by a separate class.
Perhaps we need instead a range of new()s that initialise to various
string, numeric etc values?

I'm figuring that'll be handled by the bits that deal with constants, but 
there's no reason that a plain new PMC can't be undef. (Basically set the 
private data pointer to NULL and the vtable pointer to the undef class vtable)

The above definition of new() implies that it first calls destroy()
to release any previous contents. I think it would be better to define
new() as operating on an empty PMC (so it is the caller's responsibility
to call destroy() first, if necessary).

destroy will only be called if the PMC needs it. Most PMCs will just get 
their contents trashed and let the GC clean up after.

Actually, I suspect that the whole area of new/clone/destroy etc will need to
be examined carefully in the light of a 'typical' variable lifecycle,
to avoid unnecessary transitions. For example, my $a = 'abc' might
involve $a going from empty -> undef -> empty -> "" -> "abc" or similar,
if we're not careful.

That's an area for the optimizer. I'd like it to go from empty -> "abc",
assuming we don't skip the empty step.

 void   clone(PMC1, PMC2[, int flags[, key]]);
 
  Copies C<PMC2> into C<PMC1>. The C<flags> parameter notes whether
  a deep copy should be done. (Possibly other things as well, if someone
  thinks of something reasonable)

One flag that would be very useful is 'destroy', which tells clone() to
destroy PMC2 immediately after the clone operation.

That makes no sense to me. If we're cloning PMC2 then trashing it, why not 
just set 

Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

   ++ and -- are already slightly messy in perl5
   
   pp_preinc, pp_postinc, pp_predec and pp_postdec live in with all the ops.
   They know how to increment and decrement integers that don't overflow,
   and call routines in sv.c to increment and decrement anything else.
   
   Actually, this nearly provides a divide between values and operators
   that has been suggested, with the speed up hack for the common case.
  
  I'm not sure I follow you. What is the "this" in "this nearly provides a
  divide"?
 
 this example.
 I think the "nearly" probably should go.
 Maybe I should have written "++ and -- in perl5 provide an example of a
 (nearly clean) divide between operator and value".

Well, many of the vtable methods are operator-ish rather than value-ish,
presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
have add(), concatenate() etc. Which leads me back to: I'm not sure
whether you are in favour of, or oppose, += etc being vtable methods.
 
  Confused of Sheffield.
 
 Hmm. Yes. I'm confused too.
 
 Confused of Newcastle

Fancy swapping some cutlery for some Brown Ale? ;-)




Re: PDD 2, vtables

2001-02-07 Thread Branden

Dan Sugalski wrote:
 At 03:09 PM 2/7/2001 +, David Mitchell wrote:
 A mere detail, but would it not be more efficient to just pass them
 as extra args, ie add(PMC1, PMC2, PMC3, key1, key2, key3),
 rather than having to potentially create and populate a tmp struct
 just to call the function???

 Well, extra arguments cost. I'd originally had only a single key
optionally
 passed in, but that meant potentially two of the three PMCs had to be
real,
 rather than keyed into containers. (And thus possibly virtual) Hence the
 key array.

 I admit it's not a swell decision--I'm not sure whether the single
optional
 parameter is better, or several optional (or required) parameters are
 better. I can see it going either way, and I didn't put all that much
 thought into it.



I think filling an array costs at least as much as passing parameters, not
to mention the cost of passing that array as a parameter, also... Unless the
array is constant and can be pre-filled by the compiler (which I think is a
somewhat rare case, considering all the three arguments), or once filled
there are cases it can be reused, I'm not sure it's worth using an array to
save 2 pushes into the stack...

- Branden




Re: PDD 2, vtables

2001-02-07 Thread Nicholas Clark

On Wed, Feb 07, 2001 at 05:54:14PM +, David Mitchell wrote:
 Well, many of the vtable methods are operator-ish rather than value-ish,
 presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
 have add(), concatenate() etc. Which leads me back to: I'm not sure
 whether you are in favour of, or oppose, += etc being vtable methods.

I'm not either. They feel like they should be operators.
But I don't like the thought of going in and out of a lot of generic
routines for

$a = 3;
$a += 2;

when the integer scalar ought to know what the inside of another integer
scalar looks like, and that 2 + 3 doesn't overflow.

Hmm. += isn't another opcode
it's a special case of a = b + c where the PMCs for a and b are the same
thing. And I see no real reason why it can't be part of the + entry.


Nicholas Clark
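
The "+= is just a = b + c with a and b aliased" idea can be sketched as a pointer comparison inside the add entry. This is a minimal illustration, not the PDD's design: IntPMC, int_add and the in_place_adds counter are hypothetical names, and for plain integers the in-place branch buys nothing; it only marks where a class with expensive payloads (strings, bignums) could avoid a temporary.

```c
#include <assert.h>

typedef struct IntPMC { long ival; } IntPMC;

static int in_place_adds = 0;  /* counts how often the alias case was hit */

/* add(dest, left, right): one vtable entry serving both + and +=. */
static void int_add(IntPMC *dest, const IntPMC *left, const IntPMC *right)
{
    assert(dest && left && right);
    if (dest == left) {
        /* $a += $b: destination and left operand are the same PMC. */
        dest->ival += right->ival;
        in_place_adds++;
    } else {
        /* $c = $a + $b: write into a distinct destination. */
        dest->ival = left->ival + right->ival;
    }
}
```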



Re: PDD 2, vtables

2001-02-07 Thread Branden

David Mitchell wrote:

 Well, many of the vtable methods are operator-ish rather than value-ish,
 presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
 have add(), concatenate() etc. Which leads me back to: I'm not sure
 whether you are in favour of, or oppose, += etc being vtable methods.


Oppose. (Actually I'm talking about my idea on vtables, i.e. separate +/-/*
in one vtable and store/fetch in another). My proposal on ++ and -- would be
having the `value'-part of the vtable (the one that handles +/-/*) return a
value corresponding to what would be the value of it after an increment or
decrement. store would be used to actually commit the ++/-- operation. This
would serve both postfix and prefix cases, because in one case the value
before the store would be used, and in the other the one after.

(I was just reminded of the C++ overloading of ++, that uses a dummy parameter to
tell if it's a pre or a post increment. So bad...)

- Branden
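
The fetch / compute / store split proposed above for ++ might look like this in C. It is a sketch under assumptions: IntVar, var_fetch, var_store and value_succ are hypothetical names for the container half and the value half, and only integers are handled.

```c
#include <assert.h>

typedef struct IntVar { long ival; } IntVar;

/* Container half: fetch and store. */
static long var_fetch(const IntVar *v)   { assert(v); return v->ival; }
static void var_store(IntVar *v, long x) { assert(v); v->ival = x; }

/* Value half: what the value would be after an increment. */
static long value_succ(long x) { return x + 1; }

/* ++$a: the expression yields the value after the store. */
static long pre_inc(IntVar *v)
{
    long next = value_succ(var_fetch(v));
    var_store(v, next);
    return next;
}

/* $a++: the expression yields the value from before the store. */
static long post_inc(IntVar *v)
{
    long old = var_fetch(v);
    var_store(v, value_succ(old));
    return old;
}
```

Both forms use the same three primitives and differ only in which value the caller keeps, so no separate pre and post entries are needed in either table.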




Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 06:12 PM 2/7/2001 +, Nicholas Clark wrote:
On Wed, Feb 07, 2001 at 05:54:14PM +, David Mitchell wrote:
  Well, many of the vtable methods are operator-ish rather than value-ish,
  presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
  have add(), concatenate() etc. Which leads me back to: I'm not sure
  whether you are in favour of, or oppose, += etc being vtable methods.

I'm not either. They feel like they should be operators.
But I don't like the thought of going in and out of a lot of generic
routines for

$a = 3;
$a += 2;

when the integer scalar ought to know what the inside of another integer
scalar looks like, and that 2 + 3 doesn't overflow.

That particular case would get caught by the optimizer (I'd hope) so it'd 
not be an issue anyway.

Hmm. += isn't another opcode
it's a special case of a = b + c where the PMCs for a and b are the same
thing. And I see no real reason why it can't be part of the + entry.

Whether a special case in the code would get a speedup or not's up in the 
air. (Is the test and branch faster than a generic doing it routine?) I'd 
want to test that and see before I decided.

Dan





Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 04:15 PM 2/7/2001 -0200, Branden wrote:
David Mitchell wrote:
 
  Well, many of the vtable methods are operator-ish rather than value-ish,
  presumably on the grounds of efficiency. A pure 'value' vtable wouldn't
  have add(), concatenate() etc. Which leads me back to: I'm not sure
  whether you are in favour of, or oppose, += etc being vtable methods.
 

Oppose. (Actually I'm talking about my idea on vtables, i.e. separate +/-/*
in one vtable and store/fetch in another). My proposal on ++ and -- would be
having the `value'-part of the vtable (the one that handles +/-/*) return a
value corresponding to what would be the value of it after an increment or
decrement. store would be used to actually commit the ++/-- operation. This
would serve both postfix and prefix cases, because in one case the value
before the store would be used, and in the other the one after.

Splitting the vtable into two pieces, with one piece not tied to a PMC, 
makes some things impossible. Consider this:

   @foo = @bar * @baz;

where all three arrays are really matrix types. In the separate load/store 
and do vtable scheme it means you get the value of @bar and @baz in scalar 
context, and multiply the results. Two operations, and the resultant values 
are sanitized. In the single vtable scheme, we'd execute @bar's multiply
routine, which would be clever enough (because we wrote it that way) to see 
the second parameter's also a matrix, and do matrix math.

Splitting things up also loses information when moving data between the two 
vtables routines. While that's not a big deal generally (as the info lost 
is irrelevant) it forbids some rather interesting side cases.

(I was just reminded of the C++ overloading of ++, that uses a dummy parameter to
tell if it's a pre or a post increment. So bad...)

I'm not sure it's worth having both a preinc and postinc operator, as 
opposed to splitting it into an inc/fetch or fetch/inc pair. (And yes, I 
know, earlier I was arguing that one opcode's better than two. This one's 
rare enough that profiling it would probably be in order...)

Dan
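
The single-vtable argument for @foo = @bar * @baz can be illustrated with a small C sketch in which the left operand's multiply routine inspects the right operand's class. All of it is assumed for the example: the fixed 2x2 payload, the PMC layout, and the names matrix_vtable, num_vtable and matrix_multiply are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct PMC PMC;

typedef struct VTABLE {
    void (*multiply)(PMC *dest, const PMC *l, const PMC *r);
} VTABLE;

struct PMC {
    const VTABLE *vtable;
    double m[2][2];  /* payload used by the 2x2 matrix class */
    double scalar;   /* payload used by the plain numeric class */
};

static void matrix_multiply(PMC *dest, const PMC *l, const PMC *r);
static void num_multiply(PMC *dest, const PMC *l, const PMC *r);

static const VTABLE matrix_vtable = { matrix_multiply };
static const VTABLE num_vtable    = { num_multiply };

static void num_multiply(PMC *dest, const PMC *l, const PMC *r)
{
    dest->vtable = &num_vtable;
    dest->scalar = l->scalar * r->scalar;
}

/* The left operand's routine sees the right operand as a whole PMC,
   so it can tell a matrix from a scalar and pick the right math. */
static void matrix_multiply(PMC *dest, const PMC *l, const PMC *r)
{
    double out[2][2];
    int i, j, k;

    assert(dest && l && r);
    for (i = 0; i < 2; i++) {
        for (j = 0; j < 2; j++) {
            if (r->vtable == &matrix_vtable) {
                out[i][j] = 0.0;
                for (k = 0; k < 2; k++)
                    out[i][j] += l->m[i][k] * r->m[k][j];
            } else {
                out[i][j] = l->m[i][j] * r->scalar;
            }
        }
    }
    memcpy(dest->m, out, sizeof out);  /* safe even if dest aliases l or r */
    dest->vtable = &matrix_vtable;
}
```

In a split scheme the multiply routine would only ever receive already-fetched scalar values, so the check on r->vtable, and with it the matrix product, would not be possible.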





Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

 I'm not either. They feel like they should be operators.
 But I don't like the thought of going in and out of a lot of generic
 routines for
 
 $a = 3;
 $a += 2;
 
 when the integer scalar ought to know what the inside of another integer
 scalar looks like, and that 2 + 3 doesn't overflow.
 
 That particular case would get caught by the optimizer (I'd hope) so it'd 
 not be an issue anyway.
 
 Hmm. += isn't another opcode
 it's a special case of a = b + c where the PMCs for a and b are the same
 thing. And I see no real reason why it can't be part of the + entry.
 
 Whether a special case in the code would get a speedup or not's up in the 
 air. (Is the test and branch faster than a generic doing it routine?) I'd 
 want to test that and see before I decided.

Are we all clear then, that in perl 6, since the opcodes etc are no longer
allowed to rummage around in the internals of a PMC, it's purely a question
of whether $a += 3 invokes

add($a,$a,3)
or
eqadd($a,3)

and whether $a++ invokes

add($a,$a,1)
or
postinc($a)

etc?

And that this decision is mainly a 'time it and see' decision?





Re: PDD 2, vtables

2001-02-07 Thread David Mitchell

Dan, before I followup your reply to my list of nits about the PDD,
can I clarify one thing: destruction.

I am assuming that many PMCs will require destruction, eg calling
destroy() on a string PMC will cause the memory used by the string
data to be freed or whatever. Only very simple PMCs (such as integers)
need to do no destruction.

Is this the same as your perception of reality :-) ?

I also gather that PMCs will have a flag saying whether they need destroying,
(eg ints say no, strings say yes), and that calls to destroy() are preceded
by a check on this flag for efficiency?
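
The flag check being asked about might look like the following C sketch. It is illustrative only: the PMC_NEEDS_DESTROY flag, the pmc_release helper and the struct layout are hypothetical, and leaving the PMC empty (NULL vtable) afterwards follows the suggestion made earlier in the thread rather than the PDD's "leaving it undef".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct PMC PMC;

typedef struct VTABLE {
    void (*destroy)(PMC *self);
} VTABLE;

/* Hypothetical per-PMC flag: set by classes that own resources. */
enum { PMC_NEEDS_DESTROY = 1 };

struct PMC {
    const VTABLE *vtable;
    unsigned flags;
    char *str;   /* heap buffer owned by string PMCs */
    long ival;   /* inline payload of integer PMCs */
};

static int destroy_calls = 0;  /* counts real destroy() invocations */

static void string_destroy(PMC *self)
{
    destroy_calls++;
    free(self->str);
    self->str = NULL;
}

static const VTABLE string_vtable = { string_destroy };
static const VTABLE int_vtable    = { NULL };  /* ints never need destroy */

/* Check the flag first so simple PMCs skip the indirect call entirely. */
static void pmc_release(PMC *p)
{
    assert(p != NULL);
    if (p->flags & PMC_NEEDS_DESTROY)
        p->vtable->destroy(p);
    p->vtable = NULL;  /* leave an empty PMC behind */
    p->flags  = 0;
}
```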




Re: PDD 2, vtables

2001-02-07 Thread Branden

Dan Sugalski wrote:
 Splitting the vtable into two pieces, with one piece not tied to a PMC,
 makes some things impossible. Consider this:

@foo = @bar * @baz;

 where all three arrays are really matrix types.

By the PDD's notion of `key', what would be the `key' of a matrix type ?

(I think that's actually a -language question, but) What $foo[42] (where
@foo is matrix) would compile to?



 In the separate load/store
 and do vtable scheme it means you get the value of @bar and @baz in scalar
 context, and multiply the results. Two operations, and the resultant
values
 are sanitized. In the single vtable scheme, we'd execute @bar's multiply
 routine, which would be clever enough (because we wrote it that way) to
see
 the second parameter's also a matrix, and do matrix math.


Actually, not necessarily. It depends on what the compiler does... There
could be special entries for array operations, like +/-/*/... . The problem
I see with it is what happens when you @a = @b. Actually, if @b is a matrix,
@a = @b makes @a a matrix or evaluates @b in list context? What about @a =
(@b) ? What if @a is a tied array? This matrix thing is actually getting
very confusing to me... I think all these proposed additions to the language
should be carefully examined for possible mis-interpretations like these.


- Branden




Re: PDD 2, vtables

2001-02-07 Thread Dan Sugalski

At 06:08 PM 2/7/2001 -0200, Branden wrote:
Dan Sugalski wrote:
  Splitting the vtable into two pieces, with one piece not tied to a PMC,
  makes some things impossible. Consider this:
 
 @foo = @bar * @baz;
 
  where all three arrays are really matrix types.

By the PDD's notion of `key', what would be the `key' of a matrix type ?

Probably an integer, possibly a list for multidimensional matrices. (And I 
haven't thought about how to handle that--probably force a series of index 
lookups)

(I think that's actually a -language question, but) What $foo[42] (where
@foo is matrix) would compile to?

Identically to how $foo[42] would if @foo were a plain array.

  In the separate load/store
  and do vtable scheme it means you get the value of @bar and @baz in scalar
  context, and multiply the results. Two operations, and the resultant
values
  are sanitized. In the single vtable scheme, we'd execute @bar's multiply
  routine, which would be clever enough (because we wrote it that way) to
see
  the second parameter's also a matrix, and do matrix math.
 

Actually, not necessarily. It depends on what the compiler does... There
could be special entries for array operations, like +/-/*/... . The problem
I see with it is what happens when you @a = @b. Actually, if @b is a matrix,
@a = @b makes @a a matrix or evaluates @b in list context?

That's a language issue. I don't know--I can see it going either way. I'd 
prefer a straight assign and let the assignment vtable entry handle it, but 
I don't know that we'll have that option.

What about @a =
(@b) ?

Good question. I'd like to see it handled the same way as @a=@b, but I'm 
not sure that's going to happen. It's Larry's decision. (Mainly because I'd 
like to see this:

   @a = (@b, @c);

turn into:

   @a = @b;
   push @a, @c;

but I don't know that we'll be able to)

What if @a is a tied array?

What if? Larry's call as to whether it makes @a a copy of the data from @b, 
or creates a new tied thing, or an alias. Probably @a would be a plain copy 
of @b with no magic, but that's not my call.

This matrix thing is actually getting
very confusing to me... I think all these proposed additions to the language
should be carefully examined for possible mis-interpretations like these.

I'm not proposing they go in (well, OK, I am, but I'm not forcing it). What 
I am doing is trying to not preclude the possibility if it's decided that it
will happen.

Dan





RE: PDD 2, vtables [pointers to related documentation]

2001-02-07 Thread Garrett Goebel

From: Tim Bunce
 
 On Tue, Feb 06, 2001 at 12:28:23PM -0500, Dan Sugalski wrote:
 
  At 11:26 AM 2/6/2001 +, Tim Bunce wrote:
  
   On Mon, Feb 05, 2001 at 05:14:44PM -0500, Dan Sugalski wrote:
   
=head2 Core datatypes
   
For ease of use, we define the following semi-abstract 
data types
  
   Probably worth stating upfront that it'll be easy to add
   new types to avoid people arguing for their favorite type
   to be added here.
  
  I'm not sure it should be--that'd mean extending the 
  vtables in ways they have little room to grow. Adding new
  perl datatypes is easy, adding new low-level types is harder.
 
 That's pretty much what I meant. I think it's worth saying.

Adding comments like the ones Tim is suggesting is just what someone like
myself needs. A statement of the obvious and the context in which it fits...
for those who haven't a clue and are trying to piece together a conceptual
model of the system. It saves me the conflict of deciding whether or not to
pester you all with questions or not ask, never know and continue to be my
own little mushroom.
 

=head1 REFERENCES
   
PDD 3: Perl's Internal Data Types.
  
   Some references to any other vtable based languages would
   be good. (I presume people have looked at some and learnt
   lessons.)
  
  Alas not. This is pretty much head of zeus stuff, modulo 
  some ego. (Mine's not *that* big...)
 
 Without studying history we may be doomed to repeat it.
 
 So can anyone point to vtable based language implementations?

Well, I may be one of the least qualified subscribers on this list, but I'm
a pretty good gopher... Some of this relates to languages implementing
vtables as opposed to being implemented with them. Everything I've scanned
so far seems to raise the flag concerning overhead associated...

Title:    Portable Inheritance and Polymorphism in C
URL:      http://www.embedded.com/97/fe29712.htm
Abstract: A lower-level view that assumes only a procedural language like C
for embedded developers who want to apply OO without switching to an OO
language

Title:    Programming Language Pragmatics
URL:      http://www.amazon.com/exec/obidos/ASIN/1558604421
Abstract: Mentions virtual methods and tables in Sections 10.4-5. It
discusses vtables from a high level in the general context of Eiffel,
Simula, C++, and Ada.

Title:    SableVM: A Research Framework for the Efficient
          Execution of Java Bytecode
URL:      http://www.j-meg.com/~egagnon/sable-report_2000-3/
Abstract: SableVM is an open-source virtual machine for Java2, intended as a
research framework for efficient execution of Java bytecode. The framework
is essentially composed of an extensible bytecode interpreter using
state-of-the-art and innovative techniques. Written in the C programming
language, and assuming minimal system dependencies, the interpreter
emphasizes high-level techniques to support efficient execution. In
particular, we introduce new data layouts for classes, virtual tables and
object instances that reduce the cost of interface method calls to that of
normal virtual calls, allow efficient garbage collection and light
synchronization, and make effective use of memory space. 

Title:    C++ Producer Guide
URL:      http://www.cse.unsw.edu.au/~patrykz/TenDRA/tcpplus/lib.html#vtable
Abstract: vtable implementation in C++

Title:    C++ ABI for IA-64
URL:      http://reality.sgi.com/dehnert_engr/cxx/abi.html
Abstract: vtables layout, etc. is discussed in sections 2.5-2.6 and
scattered throughout. You can find similar information in C++ ABI
documentation for Macintosh, etc.




Re: PDD 2, vtables

2001-02-06 Thread Tim Bunce

[First off: I've not really been paying attention so forgive me if I'm
being dumb here.  And many thanks for helping to drive this forwards.]

On Mon, Feb 05, 2001 at 05:14:44PM -0500, Dan Sugalski wrote:
 
 =head2 Core datatypes
 
 For ease of use, we define the following semi-abstract data types

Probably worth stating upfront that it'll be easy to add new types
to avoid people arguing for their favorite type to be added here.

 =item INT
 =item NUM
 =item STR
 =item BOOL

What about references?

Arrays and hashes should probably be at least mentioned here.

 =head3 String data types
 
 =item binary buffer

'Binary string'

 =item UTF-32 string
 =item Native string
 =item Foreign string

I'm a little surprised not to see UTF-8 there, but since I'm also
confused about what Native string and Foreign string are I'll skip it.
Except to say that some clarification here may help, and explicitly
mentioning UTF-8 (even to say it won't be a core type and provide a
reference to why) would be good.


 The functions are divided into two broad categories, those that perl
 will use the value of internally (for example the type functions) and
 those that produce or modify a PMC, such as the add function.

So possibly a good idea to explicitly group them that way.

 =head2 Functions in detail
 
 =item type
 
 =item name
 
STR    name(PMC[, key]);
 
 Returns the name of the class the PMC belongs to.

So I'd call it type_name (or maybe class_name as you seem to be using
the words interchangeably. If type != class then clarify somewhere.).

 =item move_to
 
BOOL   move_to(void *, PMC);
 
 Tells the PMC to move its contents to a block of memory starting at
 the passed address. Used by the garbage collector to compact memory,
 this call can return a false value if the move can't be done for some
 reason. The pointer is guaranteed to point to a chunk of memory at
 least as large as that returned by the C<real_size> vtable function.

Shouldn't the PMC be the first arg for consistency?

 =item real_size
 
IV real_size(PMC[, key]);
 
 Returns an integer value that represents the real size of the data
 portion, excluding the vtable, of the PMC.

Contiguous? Sum of parts (allowing for alignment) if it contains
multiple chunks of data?

 =item destroy
 
void   destroy(PMC[, key]);
 
 Destroys the variable the PMC represents, leaving it undef.

Using the word 'variable' here probably isn't a good idea.
Maybe "Destroys the contents of the PMC leaving it undef."

 =item is_same
 
BOOL   is_same(PMC1, PMC2[, key]);
 
 Returns TRUE if C<PMC1> and C<PMC2> refer to the same value, and FALSE
 otherwise.

I think that needs more clarification, especially where they are of
different types. Contrast with is_equal() below.

 =item concatenate
 
void   concatenate(PMC1, PMC2, PMC3[, key]); ##
 
 Concatenates the strings in C<PMC2> and C<PMC3>, storing the result in
 C<PMC1>.

and insert (ala sv_insert)  etc?

 =item is_equal

Contrast with is_same() above.

 =item logical_or
 =item logical_and
 =item logical_not

Er, why not just use get_bool? The only reason I can think of is to
support three-value-logic but that would probably be better handled
via a higher-level overloading kind of mechanism. Either way, clarify.

 =item match
 
void   match(PMC1, PMC2, REGEX[, key]);
 
 Performs a regular expression match on C<PMC2> against the expression
 C<REGEX>, placing the results in C<PMC1>.

Results, plural = container = array or hash. Needs clarifying.

 =item repeat (x)
 
void   repeat(PMC1, PMC2, PMC3[, key]); ##
 
 Performs the following sequence of operations: finds the string value
 from C<PMC2>; finds an integer value I<n> from C<PMC3>; replicates the
 string I<n> times; stores the resulting string in C<PMC1>.

So call it replicate? Could also work for arrays.

 =item nextkey (x)
 
void   nextkey(PMC1, PMC2, start_key[, key]);
 
 Looks up the key C<start_key> in C<PMC2> and then stores the key after
 it in C<PMC1>. If start_key is C<undef>, the first key is returned,
 and C<PMC1> is set to undef if there is no next key.

Containers again.  And I'd call it key_next()

 =item exists (x)

Likewise, key_exists()

 =head1 TODO
 
 The effects of each function on scalar, array, hash, list, and IO
 PMCs needs to be hashed out.

Before that I think a section on containers needs to be added.

 =head1 REFERENCES
 
 PDD 3: Perl's Internal Data Types.

Some references to any other vtable based languages would be good.
(I presume people have looked at some and learnt lessons.)

Tim.



Re: PDD 2, vtables

2001-02-06 Thread Simon Cozens

On Tue, Feb 06, 2001 at 11:26:57AM +, Tim Bunce wrote:
  =item UTF-32 string
  =item Native string
  =item Foreign string
 
 I'm a little surprised not to see UTF-8 there, but since I'm also
 confused about what Native string and Foreign string are I'll skip it.

"Native string encoding" is an abstraction that allows us to say "some
encoding that Perl knows how to deal with, but you don't need to care about".
The idea, if I understand it correctly, is that strings can be treated as
arrays of numbers, and you don't need to know how they're *really* stored.
They might be stored as UTF8 internally; I hope so.

"Foreign string" is the same, but with the implication that some external code
will be needed to tell Perl how it should convert between an array of numbers
and this encoding.

 Using the word 'variable' here probably isn't a good idea.
 Maybe "Destroys the contents of the PMC leaving it undef."

Agreed.

  =item logical_or
  =item logical_and
  =item logical_not
 
 Er, why not just use get_bool?

Overloading.

  =item repeat (x)
  
 void repeat(PMC1, PMC2, PMC3[, key]); ##
  
  Performs the following sequence of operations: finds the string value
  from C<PMC2>; finds an integer value I<n> from C<PMC3>; replicates the
  string I<n> times; stores the resulting string in C<PMC1>.
 
 So call it replicate?

Well, the Perl-space operator is called "repeat", and the Perl 5 operator
is OP_REPEAT, so...

 Before that I think a section on containers needs to be added.

This is basically what the whole "key" thing is about. I'm not sure that
*that* much more needs to be described, (well, something needs to be
described, but not in great detail) other than the fact that operating on a
container PMC-key pair is equivalent to operating on a scalar PMC.

Hence, I can say:
(on a hash)   get_number(hash, "key")
(on an array) get_number(array, elem)
(on a scalar) get_number(scalar)
and not worry about what's going on underneath. 

There ought to be special functions for containers though, yeah.
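
[Editorial sketch] The "container PMC plus key behaves like a scalar PMC" idea above can be illustrated in C roughly as follows. This is only a minimal sketch under stated assumptions: the struct layout, the `VTable` type, the `const void *key` convention and every function name are hypothetical and are not taken from the PDD.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified PMC; none of these names come from the PDD. */
typedef struct PMC PMC;

typedef struct {
    /* key is NULL for scalars; containers use it to select an element. */
    double (*get_number)(const PMC *self, const void *key);
} VTable;

struct PMC {
    const VTable *vtable;
    const void   *data;
};

typedef struct { const double *items; size_t len; } ArrayData;
typedef struct { const char *name; double value; } Pair;
typedef struct { const Pair *pairs; size_t len; } HashData;

/* Scalar: the key is ignored. */
double scalar_get_number(const PMC *self, const void *key) {
    (void)key;
    return *(const double *)self->data;
}

/* Array: the key points at a size_t index; out of range yields 0.0. */
double array_get_number(const PMC *self, const void *key) {
    const ArrayData *a = self->data;
    size_t i;
    assert(key != NULL);
    i = *(const size_t *)key;
    return i < a->len ? a->items[i] : 0.0;
}

/* Hash: the key is a C string; a linear search stands in for real hashing. */
double hash_get_number(const PMC *self, const void *key) {
    const HashData *h = self->data;
    size_t i;
    assert(key != NULL);
    for (i = 0; i < h->len; i++)
        if (strcmp(h->pairs[i].name, (const char *)key) == 0)
            return h->pairs[i].value;
    return 0.0;
}

const VTable scalar_vtable = { scalar_get_number };
const VTable array_vtable  = { array_get_number };
const VTable hash_vtable   = { hash_get_number };

/* One entry point for all three cases: the caller never inspects the class. */
double get_number(const PMC *pmc, const void *key) {
    return pmc->vtable->get_number(pmc, key);
}
```

The caller writes `get_number(hash, "key")`, `get_number(array, &index)` or `get_number(scalar, NULL)` and the vtable decides what the key means.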

-- 
"I find that anthropomorphism really doesn't help me with a place full 
of bugs." -- Megahal (trained on asr), 1998-11-06



Re: PDD 2, vtables

2001-02-06 Thread Dan Sugalski

At 11:32 AM 2/6/2001 -0200, Branden wrote:
Simon Cozens wrote:
=item logical_or
=item logical_and
=item logical_not
  
   Er, why not just use get_bool?
 
  Overloading.
 

Please see my previous post on the subject. As I pointed out there, implementing
|| and && like that breaks short-circuiting.

No, it doesn't. Just because you pass in two PMCs doesn't mean that they 
both need to be evaluated. Though the PDD does need to be clearer about how 
that happens.
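
[Editorial sketch] One way to read the point above: if a PMC's truth value is only computed when its get_bool entry is called, a logical_or that receives both PMCs can still leave the right-hand one untouched. The C below is a hypothetical illustration, not the PDD's design; the `evaluations` counter exists only to make the laziness observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical PMC whose truth value is only "computed" on demand. */
typedef struct PMC PMC;

typedef struct {
    bool (*get_bool)(PMC *self);
} VTable;

struct PMC {
    const VTable *vtable;
    bool value;
    int  evaluations;   /* how many times get_bool has run */
};

bool counting_get_bool(PMC *self) {
    self->evaluations++;
    return self->value;
}

const VTable counting_vtable = { counting_get_bool };

/* Both operands are passed in, but right is only asked for its value
 * when left turns out to be false. */
void logical_or(PMC *dest, PMC *left, PMC *right) {
    if (left->vtable->get_bool(left)) {
        dest->value = true;
        return;                     /* right is never evaluated */
    }
    dest->value = right->vtable->get_bool(right);
}
```

Passing a PMC is therefore not the same as evaluating it; the evaluation happens inside the vtable call.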

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: PDD 2, vtables

2001-02-06 Thread Dan Sugalski

At 11:26 AM 2/6/2001 +, Tim Bunce wrote:
[First off: I've not really been paying attention so forgive me if I'm
being dumb here.  And many thanks for helping to drive this forwards.]

On Mon, Feb 05, 2001 at 05:14:44PM -0500, Dan Sugalski wrote:
 
  =head2 Core datatypes
 
  For ease of use, we define the following semi-abstract data types

Probably worth stating upfront that it'll be easy to add new types
to avoid people arguing for their favorite type to be added here.

I'm not sure it should be--that'd mean extending the vtables in ways they 
have little room to grow. Adding new perl datatypes is easy, adding new 
low-level types is harder.

  =item INT
  =item NUM
  =item STR
  =item BOOL

What about references?

Special type of scalar, not dealt with here.

Arrays and hashes should probably be at least mentioned here.

And lists, yes. Or they need their own PDD with details.

  =head3 String data types
 
  =item binary buffer

'Binary string'

I avoided that on purpose. Label it a string and people think of its 
contents as characters, and they're probably not going to be a good chunk 
of the time. Might not outweigh the consistency issue, though.

  =item UTF-32 string
  =item Native string
  =item Foreign string

I'm a little surprised not to see UTF-8 there, but since I'm also
confused about what Native string and Foreign string are I'll skip it.
Except to say that some clarification here may help, and explicitly
mentioning UTF-8 (even to say it won't be a core type and provide a
reference to why) would be good.

I didn't put UTF-8 in on purpose, because I'd just as soon not deal with it 
internally. Variable length character data's a pain in the butt, and if we 
can avoid having the internals deal with it except as a source that gets 
converted to UTF-32, that's fine with me.

The native and foreign string data types were an attempt to accommodate 
UTF-8, as well as ASCII and EBCDIC character data. One of the three will 
likely be the native type, and the rest will be foreign strings. I'm not 
sure if perl should have only one foreign string type, or if we should have 
a type tag along with the other bits for strings.

  The functions are divided into two broad categories, those that perl
  will use the value of internally (for example the type functions) and
  those that produce or modify a PMC, such as the add function.

So possibly a good idea to explicitly group them that way.

They were, but I see I lost that.

  =head2 Functions in detail
 
  =item type
 
  =item name
 
 STR    name(PMC[, key]);
 
  Returns the name of the class the PMC belongs to.

So I'd call it type_name (or maybe class_name as you seem to be using
the words interchangeably. If type != class then clarify somewhere.).

The interchange is due to sloppy thinking. I'll redo it so that class == 
perl data type, while type == (NUM|STR|BOOL|INT).

  =item move_to
 
 BOOL   move_to(void *, PMC);
 
  Tells the PMC to move its contents to a block of memory starting at
  the passed address. Used by the garbage collector to compact memory,
  this call can return a false value if the move can't be done for some
  reason. The pointer is guaranteed to point to a chunk of memory at
  least as large as that returned by the C<real_size> vtable function.

Shouldn't the PMC be the first arg for consistency?

First arg of the PMC is the destination PMC. We don't have one here.

  =item real_size
 
 IV real_size(PMC[, key]);
 
  Returns an integer value that represents the real size of the data
  portion, excluding the vtable, of the PMC.

Contiguous? Sum of parts (allowing for alignment) if it contains
multiple chunks of data?

Size we'd need to allocate if we were going to move the data. Though 
knowing how much space is currently taken would also be useful, assuming 
they're not the same. (They probably would be within a few bytes, though)
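
[Editorial sketch] The interplay of real_size and move_to during compaction might look roughly like the C below. It is a toy under loud assumptions: the PMC layout, the vtable type and the `compact` helper are hypothetical, the PMCs are assumed to be listed in address order, and all their data is assumed to live inside the one arena being compacted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical PMC whose data is a byte buffer inside an arena. */
typedef struct PMC PMC;

typedef struct {
    size_t (*real_size)(const PMC *self);      /* bytes a move would need */
    bool   (*move_to)(void *dest, PMC *self);  /* false if it cannot move */
} VTable;

struct PMC {
    const VTable *vtable;
    char  *data;
    size_t len;
};

size_t buf_real_size(const PMC *self) { return self->len; }

bool buf_move_to(void *dest, PMC *self) {
    memmove(dest, self->data, self->len);  /* regions may overlap */
    self->data = dest;
    return true;
}

const VTable buf_vtable = { buf_real_size, buf_move_to };

/* Slides each PMC's data toward the front of the arena and returns the
 * number of bytes in use afterwards. Assumes pmcs[] is sorted by address. */
size_t compact(char *arena, PMC **pmcs, size_t count) {
    size_t used = 0;
    size_t i;
    for (i = 0; i < count; i++) {
        PMC *p = pmcs[i];
        size_t need = p->vtable->real_size(p);
        if (p->vtable->move_to(arena + used, p))
            used += need;
        else
            used = (size_t)(p->data - arena) + need;  /* skip past the unmoved block */
    }
    return used;
}
```

The collector asks real_size first so that the destination it hands to move_to is known to be large enough, which matches the guarantee stated in the move_to description.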

  =item destroy
 
 void   destroy(PMC[, key]);
 
  Destroys the variable the PMC represents, leaving it undef.

Using the word 'variable' here probably isn't a good idea.
Maybe "Destroys the contents of the PMC leaving it undef."

Better. Thanks.

  =item is_same
 
 BOOL   is_same(PMC1, PMC2[, key]);
 
  Returns TRUE if C<PMC1> and C<PMC2> refer to the same value, and FALSE
  otherwise.

I think that needs more clarification, especially where they are of
different types. Contrast with is_equal() below.

If they're different types they can't be the same. This would be used to 
check if two references have the same referent, or if two magic variables 
(database handles, say) pointed to the same thing.
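
[Editorial sketch] The distinction between is_same (identity) and is_equal (value) described above could be sketched in C like this. The class tags, the struct fields and the string-only is_equal are assumptions made for illustration, not part of the PDD.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical class tags and PMC layout. */
enum { CLASS_REF = 1, CLASS_STR = 2 };

typedef struct PMC {
    int         class_id;
    const void *referent;  /* used when class_id == CLASS_REF */
    const char *str;       /* used when class_id == CLASS_STR */
} PMC;

/* Identity: different classes are never the same; two references are the
 * same when they share a referent; otherwise only a PMC and itself are. */
bool is_same(const PMC *a, const PMC *b) {
    if (a->class_id != b->class_id)
        return false;
    if (a->class_id == CLASS_REF)
        return a->referent == b->referent;
    return a == b;
}

/* Value equality, restricted to strings in this sketch. */
bool is_equal(const PMC *a, const PMC *b) {
    return a->class_id == CLASS_STR
        && b->class_id == CLASS_STR
        && strcmp(a->str, b->str) == 0;
}
```

Two distinct string PMCs holding "x" are equal but not the same, while two reference PMCs pointing at one target are the same.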

  =item concatenate
 
 void   concatenate(PMC1, PMC2, PMC3[, key]); ##
 
  Concatenates the strings in C<PMC2> and C<PMC3>, storing the result in
  C<PMC1>.

and insert (ala sv_insert)  etc?

Hadn't considered them. Care to elaborate on the etc?

  =item is_equal

Contrast with is_same() above.

  =item logical_or
  =item 

Re: PDD 2, vtables

2001-02-06 Thread Branden

Dan Sugalski wrote:
 At 11:26 AM 2/6/2001 +, Tim Bunce wrote:
 Arrays and hashes should probably be at least mentioned here.

 And lists, yes. Or they need their own PDD with details.


What's the difference between array and list? How is a list currently
(Perl5) implemented? Which operations does it support?



   =item UTF-32 string
   =item Native string
   =item Foreign string
 
 I'm a little surprised not to see UTF-8 there, but since I'm also
 confused about what Native string and Foreign string are I'll skip it.
 Except to say that some clarification here may help, and explicitly
 mentioning UTF-8 (even to say it won't be a core type and provide a
 reference to why) would be good.

 I didn't put UTF-8 in on purpose, because I'd just as soon not deal with
 it internally. Variable length character data's a pain in the butt, and
 if we can avoid having the internals deal with it except as a source that
 gets converted to UTF-32, that's fine with me.


It would bother me a lot to have my strings occupying 4x more memory than
they do now... For me, at least, UTF-32 can be set aside, while UTF-8 is a
necessity nowadays (damn XML!).



   =item match
  
  void   match(PMC1, PMC2, REGEX[, key]);
  
   Performs a regular expression match on C<PMC2> against the expression
   C<REGEX>, placing the results in C<PMC1>.
 
 Results, plural = container = array or hash. Needs clarifying.

 Yep, especially since I'd considered tossing the match destination
 entirely. (Though that means special variables, and I'm not sure I want to
 go there) It'll likely just return true or false. I'll rethink it.


Will Perl 6 still be based on a stack, to pass a list of parameters and
return a list of results to the subs? Or is there any other approach
discussed for it? Is it still undefined? Will it work the same as in Perl 5
or will it take changes? Too soon to talk about it?

- Branden




Re: PDD 2, vtables

2001-02-06 Thread Dan Sugalski

At 05:01 PM 2/6/2001 -0200, Branden wrote:
Dan Sugalski wrote:
  At 11:26 AM 2/6/2001 +, Tim Bunce wrote:
  Arrays and hashes should probably be at least mentioned here.
 
  And lists, yes. Or they need their own PDD with details.
 

What's the difference between array and list? How is a list currently
(Perl5) implemented? Which operations does it support?

I'll leave this as an exercise for the reader. (Which is a polite way to 
say "go find out, grasshopper" :)

=item UTF-32 string
=item Native string
=item Foreign string
  
  I'm a little surprised not to see UTF-8 there, but since I'm also
  confused about what Native string and Foreign string are I'll skip it.
  Except to say that some clarification here may help, and explicitly
  mentioning UTF-8 (even to say it won't be a core type and provide a
  reference to why) would be good.
 
  I didn't put UTF-8 in on purpose, because I'd just as soon not deal with
  it internally. Variable length character data's a pain in the butt, and
  if we can avoid having the internals deal with it except as a source that
  gets converted to UTF-32, that's fine with me.
 

It would bother me a lot to have my strings occupying 4x more memory than
they do now... For me, at least, UTF-32 can be set aside, while UTF-8 is a
necessity nowadays (damn XML!).

It's a speed/space tradeoff. Dealing with the middle of a string with 
variable-length characters either requires an offset array (in which case 
you've just used more memory and time than a fixed-width representation) or 
scanning from the beginning of the array, which can be costly if you need 
to go too far in. (How costly depends on the architecture. Dealing with 
arrays of 8-bit ints is actually slower on the Alpha than the same size 
(element count, not bytecount) array of 32-bit ints. YMMV, though)
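
[Editorial sketch] The tradeoff above can be made concrete in C: with fixed-width UTF-32 the Nth character is a single array lookup, while with UTF-8 (and no offset table) it means scanning from the start. The function names are hypothetical, and the UTF-8 code assumes well-formed input and does no validation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed width: the i-th character is one array access, O(1). */
uint32_t utf32_char_at(const uint32_t *s, size_t i) {
    return s[i];
}

/* Byte length of the UTF-8 sequence that starts with lead byte b.
 * Assumes b really is a lead byte of well-formed UTF-8. */
size_t utf8_seq_len(unsigned char b) {
    if (b < 0x80)           return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Variable width: skip i sequences from the start, then decode, O(i). */
uint32_t utf8_char_at(const unsigned char *s, size_t i) {
    size_t k, n;
    uint32_t cp;

    for (k = 0; k < i; k++)
        s += utf8_seq_len(*s);

    n = utf8_seq_len(*s);
    if (n == 1)
        return s[0];

    cp = s[0] & (0xFFu >> (n + 1));       /* payload bits of the lead byte */
    for (k = 1; k < n; k++)
        cp = (cp << 6) | (s[k] & 0x3Fu);  /* six bits per continuation byte */
    return cp;
}
```

An offset array would make the UTF-8 lookup O(1) again, but at that point the string costs at least as much memory as the fixed-width form, which is the point made in the message.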

=item match
   
   void   match(PMC1, PMC2, REGEX[, key]);
   
Performs a regular expression match on C<PMC2> against the expression
C<REGEX>, placing the results in C<PMC1>.
  
  Results, plural = container = array or hash. Needs clarifying.
 
  Yep, especially since I'd considered tossing the match destination
  entirely. (Though that means special variables, and I'm not sure I want to
  go there) It'll likely just return true or false. I'll rethink it.
 

Will Perl 6 still be based on a stack, to pass a list of parameters and
return a list of results to the subs? Or is there any other approach
discussed for it? Is it still undefined? Will it work the same as in Perl 5
or will it take changes? Too soon to talk about it?

We'll probably pass lists in and out, but it'll have a stack of 
sorts--that's pretty much a requirement for Algol-based languages.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: PDD 2, vtables

2001-02-06 Thread Simon Cozens

On Tue, Feb 06, 2001 at 05:01:38PM -0200, Branden wrote:
 How is a list currently (Perl5) implemented? 

It's a bunch of SVs sitting on the stack, followed by a mark.

 Which operations does it support?

None.
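
[Editorial sketch] As a rough picture of "values on a stack, delimited by a mark", here is a toy model in C. It is not Perl 5's actual implementation: the `Stack` type, the separate mark array and all function names are hypothetical, and plain ints stand in for SVs. It also shows why a list "supports no operations": it is only a span of the stack, not an object with a vtable.

```c
#include <assert.h>
#include <stddef.h>

enum { STACK_MAX = 64, MARK_MAX = 16 };

/* Toy value stack with a side stack of marks. */
typedef struct {
    int    values[STACK_MAX];  /* stand-ins for SV pointers */
    size_t top;
    size_t marks[MARK_MAX];    /* saved positions delimiting lists */
    size_t mark_top;
} Stack;

/* Remember where the next list begins. */
void push_mark(Stack *s) {
    assert(s->mark_top < MARK_MAX);
    s->marks[s->mark_top++] = s->top;
}

void push_value(Stack *s, int v) {
    assert(s->top < STACK_MAX);
    s->values[s->top++] = v;
}

/* Pops the most recent mark. The "list" is whatever sits between that
 * mark and the top of the stack; returns its length and points *items at
 * its first element. The pointer is only valid until the next push. */
size_t pop_list(Stack *s, const int **items) {
    size_t start, n;
    assert(s->mark_top > 0);
    start = s->marks[--s->mark_top];
    n = s->top - start;
    *items = s->values + start;
    s->top = start;
    return n;
}
```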

-- 
Rule the Empire through force.
-- Shogun Tokugawa



Re: PDD 2, vtables

2001-02-06 Thread Alan Burlison

Branden wrote:

 Where can I find how Perl5's stack works (especially about parameter passing
 and returning from subs)?

Oh boy.  What a masochist.

;-)

Alan Burlison