Re: GC: what is better, reuse or avoid cloning?
Buddha M Buck wrote:

I see two ways of doing this: one is allowing a string value to be shared by two or more variables, and the other one not.

Why would you want to share the string value? Why did you assign the value of $foo to $bar if you really wanted to:

    $bar = \$foo;

I think what he's thinking (in C terms) would be more like the following:

    typedef struct { int length; char *s; } string;

    // $foo = "xyzzy";
    string foo;
    foo.length = 5;
    foo.s = strdup("xyzzy---blank---buffer---space---");

    // $bar = $foo;
    string bar;
    bar.length = foo.length;
    bar.s = foo.s;

    // $foo .= "xyzzy";
    strncpy(foo.s + foo.length, "xyzzy", 5);
    foo.length += 5;

    // $foo and $bar share string buffers, but $bar only sees the first 5
    // characters while $foo sees the first 10.

I don't see that as quite the same as the implicit references or type-globs you suggested. But it's late, and I might not know what I'm talking about...

That's exactly what I mean. But actually doing that second $foo .= "xyzzy" thing without allocating a new string would be problematic, since if I did $bar .= "abccb" after that in the same way that it's done for $foo, it would overwrite the "xyzzy" in $foo, right?

- Branden
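The overwrite Branden asks about can be made concrete with a small C sketch. Everything here is hypothetical and for illustration only: `shared_str`, `append_in_place` and `append_with_copy` are invented names, not anything from perl5 or a proposed perl6 API. The first function appends the way the example above does; the second copies into a fresh buffer before appending, which is one simple way to avoid the clobbering (at the cost of an allocation).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string header: a length, a buffer pointer that may be
   shared with other headers, and the capacity of that buffer. */
typedef struct {
    size_t length;
    char *s;
    size_t capacity;
} shared_str;

/* Append by writing straight into the existing buffer.
   Unsafe when another shared_str points at the same buffer. */
static void append_in_place(shared_str *str, const char *tail) {
    size_t n = strlen(tail);
    assert(str->length + n <= str->capacity); /* caller guarantees room */
    memcpy(str->s + str->length, tail, n);
    str->length += n;
}

/* Append by copying into a fresh buffer first, leaving the old buffer
   untouched. Returns 0 on success, -1 if allocation fails. The caller
   owns the new buffer and must free it. */
static int append_with_copy(shared_str *str, const char *tail) {
    size_t n = strlen(tail);
    size_t cap = (str->length + n) * 2 + 1;
    char *fresh = malloc(cap);
    if (fresh == NULL) {
        return -1;
    }
    memcpy(fresh, str->s, str->length);
    memcpy(fresh + str->length, tail, n);
    str->s = fresh;
    str->length += n;
    str->capacity = cap;
    return 0;
}
```

If foo and bar share one buffer holding "xyzzy", calling `append_in_place(&foo, "xyzzy")` and then `append_in_place(&bar, "abccb")` leaves both seeing "xyzzyabccb", which is exactly the overwrite described. Using `append_with_copy` for the second append leaves foo as "xyzzyxyzzy".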
Re: GC: what is better, reuse or avoid cloning?
Branden wrote: Any suggestions? Yes, but none of them polite. You might do well to study the way perl5 handles these issues. Alan Burlison
Running Bytecode?
Hi,

How can we run system-independent bytecode? I need this answer ASAP. Beatie said that through his module we can generate system-independent bytecode. How can I run that code? Also, how do I implement a compiler?

vijay
Re: Another approach to vtables
On 02/07/01 Edwin Steiner wrote:

[snip]

I thought about it once more. Maybe I was confused by the *constant* NATIVE. Are you suggesting a kind of multiple dispatch (first operand selects the vtable, second operand selects the slot in the vtable)?

So

    $dest = $first + $second

becomes

    first->vtable->add[second->vtable->type(second)](dest,first,second,key);

? or maybe

    first->vtable->add[second->vtable->slot_select](dest,first,second,key);

which saves a call by directly reading an integer from the vtable of second.

(BTW, this is also how overloading with respect to the second argument could be handled (should it be decided on the language level to do that): There could be a slot like add[ARCANE_MAGIC] selected by second->vtable->slot_select which does all kinds of complicated checks and branches without any cost for the vfunctions in the other slots.)

Such a multiple dispatch seems to me like the only solution which avoids the following (eg. in Python): 'first + second' becomes

1. call virtual function 'add' on first
2. inside first->add do lots of checks about type of second

Something like what's done in python looks sensible to me. If a vtable add function is also indexed by type you get exponential growth of the vtable with the addition of other types and we want to make that easy in perl 6. Also, it doesn't work if I introduce my bigint type (the internal int vtable knows nothing about it):

    $int = 1;
    $bigint = new bigint ("9" x 999);
    $res = $bigint + $int;   # works, bigint knows about internal int
    $res = $int + $bigint;   # doesn't work, since the bigint is the second arg

The proposed solution (used in elastic, for example) is to have the add method return a value indicating whether it has performed the addition: if it's false, we try to add using the add method of the second argument, which may know better... In the method, you check the types and perform the work only on the ones you know about.

lupus

--
[EMAIL PROTECTED]   debian/rules
[EMAIL PROTECTED]   Monkeys do it better
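The "return a value saying whether I handled it" scheme lupus describes might look roughly like the C sketch below. All names (`value`, `vtable`, `dispatch_add`, the type tags) are made up for illustration and are not taken from any PDD, and the "bigint" is faked with a plain long so the example stays short. Each add function does the work only for operand types it knows about and returns 0 otherwise; the dispatcher then tries the second operand's add function.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TYPE_INT, TYPE_BIGINT } type_tag;

struct value;

/* An add function returns 1 if it performed the addition, 0 if it
   does not know how to handle this pair of operands. */
typedef int (*add_fn)(struct value *dest, const struct value *a,
                      const struct value *b);

typedef struct vtable {
    type_tag type;
    add_fn add;
} vtable;

typedef struct value {
    const vtable *vt;
    long n; /* stand-in payload; a real bigint would hold more */
} value;

/* The built-in int only knows about int + int. */
static int int_add(value *dest, const value *a, const value *b) {
    if (a->vt->type != TYPE_INT || b->vt->type != TYPE_INT) {
        return 0; /* decline: let someone else try */
    }
    dest->vt = a->vt;
    dest->n = a->n + b->n;
    return 1;
}

/* The add-on bigint knows about itself and about the built-in int,
   in either operand position. The result is always a bigint. */
static int bigint_add(value *dest, const value *a, const value *b) {
    dest->vt = (a->vt->type == TYPE_BIGINT) ? a->vt : b->vt;
    dest->n = a->n + b->n;
    return 1;
}

static const vtable int_vtable = { TYPE_INT, int_add };
static const vtable bigint_vtable = { TYPE_BIGINT, bigint_add };

/* Try the first operand's add; if it declines, try the second's. */
static int dispatch_add(value *dest, const value *a, const value *b) {
    assert(dest != NULL && a != NULL && b != NULL);
    if (a->vt->add(dest, a, b)) {
        return 1;
    }
    if (b->vt != a->vt && b->vt->add(dest, a, b)) {
        return 1;
    }
    return 0; /* neither operand knows how to add this pair */
}
```

With this arrangement `$int + $bigint` works even though the int vtable has never heard of bigint: `int_add` declines, and `dispatch_add` falls through to `bigint_add`. The vtable stays one slot per operation rather than one slot per type pair.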
Re: Running Bytecode?
On Sat, Feb 10, 2001 at 03:14:29AM -0600, Vijaya Kumar C wrote:

Beatie said thro his module we can generate system independent Bytecode. How can i run that code ?

    perldoc ByteLoader

Also How to implement a compiler?

For Perl 6, or just generally? Either way, that's a hell of a question to answer straight off. I'm not sure what you're getting at.

The Perl 5 compiler is implemented by passing the compiled op tree representing a program to one of the B:: Perl modules that does something with it. I wouldn't be surprised if Perl 6 did something similar, but less hairy.

--
It's much better to have people flaming in the flesh. -Al Aho
Re: GC: what is better, reuse or avoid cloning?
At 12:51 AM 2/10/2001 -0200, Branden wrote:

Back to the GC issue, I was wondering something.

Okay, I snipped all of this. After reading it, I'm pretty sure it makes no sense at all. Branden, I'd recommend picking up a copy of _Garbage Collection_ and reading it. The ISBN's in the perl reading list. (My copy's in the office or I'd dig it out for you)

Dan

--"it's like this"---
Dan Sugalski          even samurai
[EMAIL PROTECTED]     have teddy bears and even
                      teddy bears get drunk
Re: PDD 2, vtables
At 08:47 AM 2/10/2001 -0200, Branden wrote:

Dan Sugalski wrote:

The string API should be sufficiently smart to be able to convert data from one encoding to another as it's more convenient.

No, the vtable functions for the variables should know how to convert from and to perl's preferred string representations, and can do whatever Bizarre Magic they care to internally.

I don't see why Perl couldn't deal with multiple representations internally. Conversion could be done on the way in, internally for efficiency on certain operations, and on the way out, again.

It can, and it will. The question is "which ones". The regex engine will almost undoubtedly deal with only fixed-sized characters. Perl itself will probably restrict itself to fixed width characters as well. Individual variable classes can store data in any form they want. (If someone wants to leverage zlib to write a class that compresses its data, I'm fine with that)

On the other side, for a string that is matched against regexps, it doesn't matter much if it has variable character length, since regexps normally read all the string anyway, and indexing characters isn't much of a concern.

You underestimate the impact of variable-length data, I think. Regexes should go rather faster on fixed-length than variable length data. How much so depends on your processor. (I can guarantee that Alphas will run a darned sight faster on UTF-32 than UTF-8...)

Agreed. Should go faster. But maybe I don't need it that fast!

That's fine. Speed is my #1 priority. Memory usage is secondary. (An important secondary, but secondary nonetheless) Which doesn't rule out UTF-8, of course--it may turn out that converting things is slower than dealing with variable width data, in which case priority #1 wins.

(I really think it shouldn't be so much slower than doing it on an ASCII string with the same total buffer size, it only would have to fetch another byte on certain conditions and build the extended character representation, which isn't hard either.)

You might not think so, but you would be wrong. You have a test and potential branch (possibly more--folks with lots of UTF-8 data, which includes everyone with a non-latin alphabet) on *every* character. That is not cheap on modern processors. Yes, you're pulling in significantly less data, which has an impact with UTF-32 (and garbage collection) but I'm not sure you'll find it a win. We can benchmark it and see if my feeling is wrong once we get some code and a testing scaffold built.

It would be nice if the user had some control over this, for example by saying "I don't care this string will be used by substr, leave it in UTF-8 since it's too big and I don't want to waste memory!", or "This string isn't too big, so I should convert it to bloated UTF-32 at once!", or even "use less 'memory';".

That would be:

    my str $foo : utf8 : fixed;

or possibly

    use less qw(memory);

Probably not my str $foo :utf8 :fixed, since then if I have $bar = $foo it would convert the string value from $foo to anything else, right?

Might. Larry's not set the rules on what attributes are passed on with assignment. If you're really worried, there's no reason not to set attributes on $bar either.

Generally speaking you probably don't want to do this. Odds are if you think you know what's going on better than the compiler, you're wrong. (Not always, but in a non-trivial number of cases, in my experience)

I can't beat the compiler, that's for sure. But I really don't think I want to read a 100KB file into a variable all at once and end up with 400KB memory usage only for that file. And I really don't care if `regexps' go slower on that, I can live with it...

If it's binary data or 8-bit characters, you won't.
If it's UTF-8 you might see expansion, but how much depends on how many 7-bit characters you have. And then only if something actually asks for the data in UTF-32 format.

This has been enough to convince me that there should be UTF-8 as one of the base character types for vtables, even if we don't use it in many places internally. For stuff that's just read and printed, it'll save memory, I think. Hope, at least. (Though it probably means the regex engine should deal with variable-width characters, and I'd really rather it didn't)

And I believe 8-bit ASCII will always be an option, for those who don't care about extended characters and want the best of both worlds on speed and memory usage.

8-bit characters in general, yep. (ASCII is really 7-bit) ASCII, EBCDIC, or raw byte buffers.

That includes Latin-1, Latin-etc. (I believe they're 10 or 12), which are the same as the ISO-8859-1, ISO-8859-(etc).

Yes. Anything that doesn't require UTF-8.

Dan

--"it's like this"---
Dan Sugalski even samurai [EMAIL PROTECTED]
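To make the trade-off in this exchange concrete, here is a rough C sketch of what "a test and potential branch on every character" and the 100KB-to-400KB expansion look like. The function names are invented for this example and are not part of any perl API, and the decoder assumes well-formed UTF-8 with no validation at all. It says nothing about which representation is actually faster; as Dan notes, that needs a benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from well-formed UTF-8 starting at p.
   Stores it in *cp and returns the number of bytes consumed.
   Note the chain of tests on the lead byte: this runs per character. */
static size_t utf8_decode(const unsigned char *p, uint32_t *cp) {
    if (p[0] < 0x80) {                      /* 1 byte: 7-bit ASCII */
        *cp = p[0];
        return 1;
    }
    if ((p[0] & 0xE0) == 0xC0) {            /* 2-byte sequence */
        *cp = ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
        return 2;
    }
    if ((p[0] & 0xF0) == 0xE0) {            /* 3-byte sequence */
        *cp = ((uint32_t)(p[0] & 0x0F) << 12)
            | ((uint32_t)(p[1] & 0x3F) << 6)
            | (uint32_t)(p[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(p[0] & 0x07) << 18)   /* 4-byte sequence */
        | ((uint32_t)(p[1] & 0x3F) << 12)
        | ((uint32_t)(p[2] & 0x3F) << 6)
        | (uint32_t)(p[3] & 0x3F);
    return 4;
}

/* Number of code points in len bytes of UTF-8. */
static size_t utf8_count(const unsigned char *s, size_t len) {
    size_t i = 0, count = 0;
    uint32_t cp;
    while (i < len) {
        i += utf8_decode(s + i, &cp);
        count++;
    }
    return count;
}

/* Bytes the same text would occupy as fixed-width UTF-32. */
static size_t utf32_bytes(const unsigned char *s, size_t len) {
    return utf8_count(s, len) * sizeof(uint32_t);
}

/* Fetch the n-th code point (0-based). With UTF-32 this would be a
   single array index; with UTF-8 it is a scan from the start. */
static uint32_t utf8_char_at(const unsigned char *s, size_t len, size_t n) {
    size_t i = 0;
    uint32_t cp = 0;
    for (size_t k = 0; k <= n; k++) {
        assert(i < len); /* n must lie within the string */
        i += utf8_decode(s + i, &cp);
    }
    return cp;
}
```

For pure 7-bit text `utf32_bytes` returns four times the input length, which is Branden's 100KB becoming 400KB. For text made mostly of 3-byte sequences the growth is only about a third, which is the "how much depends on how many 7-bit characters you have" point.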