Re: GC: what is better, reuse or avoid cloning?

2001-02-10 Thread Filipe Brandenburger

Buddha M Buck wrote:

   I see two ways of doing this: one is allowing a string value to be shared
   by two or more variables, and the other one not.
 
  Why would you want to share the string value?  Why did you assign the
  value of $foo to $bar if you really wanted to:
 
 $bar = \$foo;

 I think what he's thinking (in C terms) would be more like the following:

 typedef struct { int length; char *s; } string;

 // $foo = "xyzzy";
 string foo; foo.length = 5;
 foo.s = strdup("xyzzy---blank---buffer---space---");

 // $bar = $foo;
 string bar; bar.length = foo.length; bar.s = foo.s;

 // $foo .= "xyzzy";
 strncpy(foo.s+foo.length,"xyzzy",5); foo.length += 5;

 // $foo and $bar share string buffers, but $bar only sees the first 5
 // characters while $foo sees the first 10.

 I don't see that as quite the same as the implicit references or
 type-globs you suggested.

 But it's late, and I might not know what I'm talking about...


That's exactly what I mean.

But actually doing that second $foo .= "xyzzy" thing without allocating a
new string would be problematic, since if I did $bar .= "abccb" after that
in the same way that it's done for $foo, it would overwrite the "xyzzy" in
$foo, right?
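
The hazard described here can be replayed in a few lines of C. This is only a sketch under assumptions: the shared_str type and both function names are made up for illustration and have nothing to do with how perl actually stores strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical string header: a visible length plus a pointer to a buffer
   that several headers may share. */
typedef struct { size_t length; char *s; } shared_str;

/* Append in place. The caller guarantees spare capacity; nothing here
   checks whether another header also points at the same bytes. */
static void append_in_place(shared_str *str, const char *tail, size_t capacity) {
    size_t n = strlen(tail);
    assert(str->length + n <= capacity);
    memcpy(str->s + str->length, tail, n);
    str->length += n;
}

/* Replays the scenario from the thread; returns 1 if the text appended
   through foo was overwritten by the later append through bar. */
static int shared_append_clobbers(void) {
    char buffer[32] = "xyzzy";
    shared_str foo = { 5, buffer };                  /* $foo = "xyzzy"; */
    shared_str bar = foo;                            /* $bar = $foo; shares the buffer */

    append_in_place(&foo, "xyzzy", sizeof buffer);   /* $foo .= "xyzzy"; */
    append_in_place(&bar, "abccb", sizeof buffer);   /* $bar .= "abccb"; */

    /* foo still claims 10 bytes, but bytes 5..9 now hold bar's data. */
    return memcmp(foo.s, "xyzzyabccb", 10) == 0;
}
```

Calling shared_append_clobbers() from a main() returns 1, so sharing the buffer is only safe as long as every append first checks whether it is the sole owner and copies otherwise.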

- Branden






Re: GC: what is better, reuse or avoid cloning?

2001-02-10 Thread Alan Burlison

Branden wrote:

 Any suggestions?

Yes, but none of them polite.

You might do well to study the way perl5 handles these issues.

Alan Burlison



Running Bytecode?

2001-02-10 Thread Vijaya Kumar C


Hai,

How can we run system-independent bytecode?

I need this answer asap.

Beattie said that through his module we can generate system-independent bytecode.

How can I run that code?

Also, how do I implement a compiler?

vijay




Re: Another approach to vtables

2001-02-10 Thread Paolo Molaro

On 02/07/01 Edwin Steiner wrote:
 [snip]
 
 I thought about it once more. Maybe I was confused by the *constant* NATIVE.
 Are you suggesting a kind of multiple dispatch (first operand selects
 the vtable, second operand selects the slot in the vtable)?
 
 So
 $dest = $first + $second
 becomes
 first->vtable->add[second->vtable->type(second)](dest,first,second,key);
 ?
 
 or maybe
 first->vtable->add[second->vtable->slot_select](dest,first,second,key);
 which saves a call by directly reading an integer from the vtable of second.
 
 (BTW, this is also how overloading with respect to the second argument
 could be handled (should it be decided on the language level to do that):
 There could be a slot like add[ARCANE_MAGIC] selected by
 second->vtable->slot_select
 which does all kinds of complicated checks and branches without any cost
 for the vfunctions in the other slots.)
 
 Such a multiple dispatch seems to me like the only solution which avoids
 the following (e.g. in Python):
 'first + second' becomes
   1. call virtual function 'add' on first
   2. inside first->add do lots of checks about type of second

Something like what's done in Python looks sensible to me.
If a vtable add function is also indexed by type, the tables grow
quadratically as other types are added (every type needs an entry for
every other type), and we want to make adding types easy in Perl 6.
Also, it doesn't work if I introduce
my bigint type (the internal int vtable knows nothing about it):

$int = 1;
$bigint = new bigint ("9" x 999);
$res = $bigint + $int; # works, bigint knows about internal int
$res = $int + $bigint; # doesn't work, since the bigint is the second arg

The proposed solution (used in elastiC, for example) is to have the add
method return a value indicating it has performed the addition: if it's
false, we try to add using the add method in the second argument that may 
know better...
In the method, you check the types and perform the work only on the
ones you know about.
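
A minimal sketch of that scheme in C, under stated assumptions: the type tags, the value layout and every function name below are hypothetical, and a plain long stands in for a real bigint. Each add method reports whether it handled the operands, and the dispatcher falls back to the second operand's method when the first declines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type tags and value layout, for illustration only. */
typedef enum { T_INT, T_BIGINT } type_tag;

typedef struct value value;
/* An add method returns true if it performed the addition, false if not. */
typedef bool (*add_fn)(value *dest, const value *a, const value *b);

typedef struct { type_tag tag; add_fn add; } vtable;
struct value { const vtable *vt; long n; };

static bool int_add(value *dest, const value *a, const value *b);
static bool bigint_add(value *dest, const value *a, const value *b);

static const vtable int_vt    = { T_INT,    int_add };
static const vtable bigint_vt = { T_BIGINT, bigint_add };

/* The built-in int only knows about other ints and declines anything else. */
static bool int_add(value *dest, const value *a, const value *b) {
    if (a->vt->tag != T_INT || b->vt->tag != T_INT) return false;
    dest->vt = &int_vt;
    dest->n = a->n + b->n;
    return true;
}

/* bigint knows about itself and about int, in either operand position. */
static bool bigint_add(value *dest, const value *a, const value *b) {
    dest->vt = &bigint_vt;
    dest->n = a->n + b->n;
    return true;
}

/* Try the first operand's method, then the second operand's. */
static bool dispatch_add(value *dest, const value *a, const value *b) {
    assert(dest != NULL && a != NULL && b != NULL);
    if (a->vt->add(dest, a, b)) return true;
    return b->vt->add(dest, a, b);
}
```

With this layout $int + $bigint works even though int_add knows nothing about bigint: int_add returns false and dispatch_add retries with bigint_add. The cost is one extra indirect call in the fallback case, and no table needs an entry per pair of types.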

lupus

-- 
-
[EMAIL PROTECTED] debian/rules
[EMAIL PROTECTED] Monkeys do it better



Re: Running Bytecode?

2001-02-10 Thread Simon Cozens

On Sat, Feb 10, 2001 at 03:14:29AM -0600, Vijaya Kumar C wrote:
 Beattie said that through his module we can generate system-independent bytecode.
 How can i run that code ?

perldoc ByteLoader

 Also How to implement a compiler?

For Perl 6, or just generally? Either way, that's a hell of a question
to answer straight off. I'm not sure what you're getting at. The Perl 5
compiler is implemented by passing the compiled op tree representing a
program to one of the B:: Perl modules that does something with it. I
wouldn't be surprised if Perl 6 did something similar, but less hairy.

-- 
It's much better to have people flaming in the flesh.  -Al Aho



Re: GC: what is better, reuse or avoid cloning?

2001-02-10 Thread Dan Sugalski

At 12:51 AM 2/10/2001 -0200, Branden wrote:
Back to the GC issue, I was wondering something.

Okay, I snipped all of this. After reading it, I'm pretty sure it makes no 
sense at all.

Branden, I'd recommend picking up a copy of _Garbage Collection_ and 
reading it. The ISBN's in the perl reading list. (My copy's in the office 
or I'd dig it out for you)

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: PDD 2, vtables

2001-02-10 Thread Dan Sugalski

At 08:47 AM 2/10/2001 -0200, Branden wrote:
Dan Sugalski wrote:
  The string API should be sufficiently smart to be able to convert data
  from one encoding to another as it's more convenient.
 
  No, the vtable functions for the variables should know how to convert from
  and to perl's preferred string representations, and can do whatever
  Bizarre Magic they care to internally.
 

I don't see why Perl couldn't deal with multiple representations internally.
Conversion could be done on the way in, internally for efficiency on certain
operations, and on the way out, again.

It can, and it will. The question is "which ones". The regex engine will 
almost undoubtedly deal with only fixed-sized characters. Perl itself will 
probably restrict itself to fixed width characters as well. Individual 
variable classes can store data in any form they want. (If someone wants to 
leverage zlib to write a class that compresses its data, I'm fine with that)

  On the other side, for a string that is matched against regexps, it
  doesn't matter much if it has variable character length, since regexps
  normally read all the string anyway, and indexing characters isn't much
  of a concern.
 
  You underestimate the impact of variable-length data, I think. Regexes
  should go rather faster on fixed-length than variable length data. How
  much so depends on your processor. (I can guarantee that Alphas will run
  a darned sight faster on UTF-32 than UTF-8...)
 

Agreed. Should go faster. But maybe I don't need it that fast!

That's fine. Speed is my #1 priority. Memory usage is secondary. (An 
important secondary, but secondary nonetheless) Which doesn't rule out 
UTF-8, of course--it may turn out that converting things is slower than 
dealing with variable width data, in which case priority #1 wins.

(I really think it shouldn't be so much slower than doing it on an ASCII
string with the same total buffer size, it only would have to fetch another
byte on certain conditions and build the extended character representation,
which isn't hard either.)

You might not think so, but you would be wrong. You have a test and 
potential branch (possibly more--folks with lots of UTF-8 data, which 
includes everyone with a non-latin alphabet) on *every* character. That is 
not cheap on modern processors. Yes, you're pulling in significantly less 
data, which has an impact with UTF-32 (and garbage collection) but I'm not 
sure you'll find it a win.

We can benchmark it and see if my feeling is wrong once we get some code 
and a testing scaffold built.
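
To make the per-character cost concrete, here is a rough sketch in C of the two access patterns being compared. The function names are invented for this example, and the UTF-8 decoder assumes well-formed input: it does not validate continuation bytes, overlong forms or surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed width: the i-th character is a single array load, no branches. */
static uint32_t utf32_char_at(const uint32_t *s, size_t i) {
    return s[i];
}

/* Variable width: decode one code point starting at s[*pos] and advance
   *pos past it. Every character costs at least one test on the lead byte. */
static uint32_t utf8_next(const unsigned char *s, size_t *pos) {
    unsigned char b = s[(*pos)++];
    assert((b & 0xC0) != 0x80);            /* must not start on a continuation byte */
    if (b < 0x80) return b;                /* 1-byte (7-bit) character */

    int extra;
    uint32_t cp;
    if ((b & 0xE0) == 0xC0)      { extra = 1; cp = b & 0x1F; }   /* 2-byte form */
    else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; }   /* 3-byte form */
    else                         { extra = 3; cp = b & 0x07; }   /* 4-byte form */

    while (extra-- > 0)
        cp = (cp << 6) | (s[(*pos)++] & 0x3F);
    return cp;
}

/* Reaching character number `index` means decoding everything before it. */
static uint32_t utf8_char_at(const unsigned char *s, size_t index) {
    size_t pos = 0;
    uint32_t cp = 0;
    for (size_t i = 0; i <= index; i++)
        cp = utf8_next(s, &pos);
    return cp;
}
```

The sketch only shows where the tests and branches sit. Whether they outweigh the extra memory traffic of UTF-32 is exactly the thing that needs benchmarking.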

  It would be nice if the user had some control over this, for example by
  saying "I don't care this string will be used by substr, leave it in
  UTF-8 since it's too big and I don't want to waste memory!", or "This
  string isn't too big, so I should convert it to bloated UTF-32 at
  once!", or even "use less 'memory';".
 
  That would be:
 my str $foo : utf8 : fixed;
  or possibly
 use less qw(memory);
 

Probably not my str $foo :utf8 :fixed, since then if I have $bar = $foo it
would convert the string value from $foo to anything else, right?

Might. Larry's not set the rules on what attributes are passed on with 
assignment. If you're really worried, there's no reason not to set 
attributes on $bar either.

  Generally speaking you probably don't want to do this. Odds are if you
  think you know what's going on better than the compiler, you're wrong.
  (Not always, but in a non-trivial number of cases, in my experience)
 

I can't beat the compiler, that's for sure. But I really don't think I want
to read a 100KB file into a variable all at once and end up with 400KB
memory usage only for that file. And I really don't care if `regexps' go
slower on that, I can live with it...

If it's binary data or 8-bit characters, you won't. If it's UTF-8 you might 
see expansion, but how much depends on how many 7-bit characters you have. 
And then only if something actually asks for the data in UTF-32 format.
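
The expansion can be estimated directly from the UTF-8 bytes. The sketch below uses made-up function names and assumes well-formed UTF-8: it counts code points by skipping continuation bytes (those of the form 10xxxxxx) and multiplies by four.

```c
#include <assert.h>
#include <stddef.h>

/* Count code points in a UTF-8 buffer: every byte that is not a
   continuation byte (10xxxxxx) starts a new character. */
static size_t utf8_code_points(const unsigned char *s, size_t nbytes) {
    size_t count = 0;
    for (size_t i = 0; i < nbytes; i++)
        if ((s[i] & 0xC0) != 0x80)
            count++;
    return count;
}

/* Bytes needed to hold the same text as UTF-32: four per code point. */
static size_t utf32_bytes_needed(const unsigned char *s, size_t nbytes) {
    assert(s != NULL || nbytes == 0);
    return 4 * utf8_code_points(s, nbytes);
}
```

So 100KB of pure 7-bit text is the worst case and becomes 400KB, while 100KB made up entirely of three-byte characters holds about a third as many code points and becomes roughly 133KB.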

This has been enough to convince me that there should be UTF-8 as one of 
the base character types for vtables, even if we don't use it in many 
places internally. For stuff that's just read and printed, it'll save 
memory, I think. Hope, at least. (Though it probably means the regex engine 
should deal with variable-width characters, and I'd really rather it didn't)

  And I believe 8-bit ASCII will always be an option, for who doesn't care
  about extended characters and want the best of both worlds on speed and
  memory usage.
 
  8-bit characters in general, yep. (ASCII is really 7-bit) ASCII, EBCDIC,
  or raw byte buffers.
 

That includes Latin-1, Latin-etc. (I believe they're 10 or 12), which are
the same as the ISO-8859-1, ISO-8859-(etc).

Yes. Anything that doesn't require UTF-8.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED]