Re: [DOCS] Updated documentation in src
On Thursday 29 January 2004 18:20, Michael Scott wrote: For those who want to browse: http://homepage.mac.com/michael_scott/Parrot/docs/html/ Mike Thanks, you definitely rock... -- Vishal Vatsa Dept. of Computer Sc. NUI Maynooth
Re: Benchmark Suite
At 6:38 PM -0500 1/25/04, Gordon Henriksen wrote: On Sunday, January 25, 2004, at 06:01 , Matt Fowles wrote: Of late it seems that everybody has been throwing around their own little homegrown benchmarks to support their points. But many people frequently point out that these benchmarks are flawed on one way or another. I suggest that we add a benchmark/ subdirectory and create a canonical suite of benchmarks that exercise things well (and hopefully fully). Then we can all post relative times for runs on this benchmark suite, and we will know exactly what is being tested and how valid it is. Well, there's already examples/benchmarks. If those programs are not at all realistic, then more realistic benchmarks should be added. Would be nice if there were a convenient way to run the lot of them and collect the timing information, though. Sounds like a good plan. I've thrown an item into the todo list :) -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re[2]: Embedding vs. extending interface types
At 8:57 AM +0100 1/25/04, Mattia Barbon wrote: Il Sat, 24 Jan 2004 19:42:20 -0500 Gordon Henriksen [EMAIL PROTECTED] ha scritto: On Saturday, January 24, 2004, at 11:28 , Mattia Barbon wrote: I feel I'm becoming annoying, but: the embedding and extending interfaces are still using different names for Parrot_Interp/Parrot_INTERP. Which one is correct? Mattia, Both are correct. Sort of. :) Parrot_INTERP is an opaque type, which is a technique for improving binary compatibility. In the core, which is I know that. The problem, as you note in your next mail, is: Parrot_Interp already has opacity guards, and _is used as an opaque type in embedding interface_. Now having two parts of the _external_ interface use differently-named opaque types for the same thing seems pointless, if not confusing. And pointless. Let's just rename it to Parrot_Interp everywhere. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
DOD, mutation, and generational collectors
So, while I'm heaping much grumpiness on threads (though I suppose, as I've been out of touch for a bit maybe you've all solved the problem. That'd be nice) I've also been thinking about DOD, as there's a fair amount of overlap in the things that cause problems for threads and the ones that cause problems for generational garbage collectors. For a generational DOD to work we have to have a way to note which generation a structure's in, as well as trap writes so we know when you're referencing an old generation from a new, and vice versa. (Threads have to trap mutations of this stuff, as this is in large part the bits that need synchronizing on mutation) A short list of mutating activities that are of issue are: *) Setting the PMC data pointer *) Resetting the PMC data pointer *) Setting the PMC cache pointer *) Resetting the PMC cache pointer *) Putting a PMC or buffer pointer into a buffer *) Altering a buffer's metadata (size or location) We don't care about most buffer data, just those buffers whose data are pointers to PMCs or other buffers. (Which argues that strings and buffers that contain raw data should be allocated from different pools than buffers that contain pointers to DOD-able data) I think, then, that we'll want to intercept all these activities, which means that we need to have API facilities for them. Before we do that, though, I want to make sure there's nothing I missed from that list. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
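A minimal sketch of what "intercepting" one of the listed mutations could look like, assuming a simplified object header. The names Obj, obj_set_data and remembered_set are hypothetical and are not Parrot's API; the sketch only records old-to-young references, which is the case a generational collector normally has to track.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified header; NOT Parrot's real PMC layout. */
typedef struct Obj {
    unsigned generation;   /* 0 = youngest generation */
    int remembered;        /* already recorded in the remembered set? */
    struct Obj *data;      /* stand-in for the PMC data pointer */
} Obj;

#define REMEMBERED_MAX 64
static Obj *remembered_set[REMEMBERED_MAX];
static size_t remembered_count = 0;

/* Every store to the data pointer goes through this function, so the
 * collector can learn about references from an older generation to a
 * younger one without tracing the whole old generation. */
static void obj_set_data(Obj *owner, Obj *value)
{
    if (value != NULL
        && owner->generation > value->generation
        && !owner->remembered) {
        assert(remembered_count < REMEMBERED_MAX);
        remembered_set[remembered_count++] = owner;
        owner->remembered = 1;
    }
    owner->data = value;
}
```

The same wrapper would be the natural place for a thread-related lock or check, which is the overlap between the two problems described above.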
Re: More on threads
At 10:50 AM -0500 1/24/04, Gordon Henriksen wrote: On Saturday, January 24, 2004, at 09:23 , Leopold Toetsch wrote: Gordon Henriksen [EMAIL PROTECTED] wrote: ... Best example: morph. morph must die. Morph is necessary. But please note: morph changes the vtable of the PMC to point to the new data types table. It has nothing to do with a typed union. The vtable IS the discriminator. I'm referring to this: typedef union UnionVal { struct {/* Buffers structure */ void * bufstart; size_t buflen; } b; struct {/* PMC unionval members */ DPOINTER* _struct_val; /* two ptrs, both are defines */ PMC* _pmc_val; } ptrs; INTVAL int_val; FLOATVAL num_val; struct parrot_string_t * string_val; } UnionVal; So long as the discriminator does not change, the union is type stable. The vtable's not the discriminator there, the flags in the pmc are the discriminator, as they're what indicates that the union's a GCable thing or not. I will admit, though, that looks *very* different than it did when I put that stuff in originally. (It used to be just a union of FLOATVAL, INTVAL, and string pointer...) Still, point taken. That needs to die and it needs to die now. For the moment, lets split it into two pieces, a buffer pointer and an int/float union, so we don't have to guess whether the contents have issues with threads. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
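A rough sketch of the split proposed above, assuming stand-in types: intval_sketch and floatval_sketch replace INTVAL and FLOATVAL, and SplitVal is a hypothetical name, not the layout that was eventually committed. The point is only that the pointer slot and the scalar slot no longer share storage.

```c
#include <assert.h>
#include <stddef.h>

typedef long   intval_sketch;    /* stand-in for INTVAL */
typedef double floatval_sketch;  /* stand-in for FLOATVAL */

/* Hypothetical replacement for the single UnionVal. */
typedef struct SplitVal {
    void *buf_ptr;               /* only ever holds a pointer or NULL */
    union {
        intval_sketch   int_val;
        floatval_sketch num_val;
    } scalar;                    /* only ever holds non-pointer data */
} SplitVal;

/* Put both halves into a known state. */
static void split_val_init(SplitVal *v)
{
    v->buf_ptr = NULL;
    v->scalar.int_val = 0;
}
```

With this layout a reader that follows buf_ptr never sees integer or float bits reinterpreted as an address, at the cost of one extra word per object.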
Threads. Again. Dammit.
Okay, it's obvious that we still have some issues to work out before we hit implementation details. (Hey, it could be worse--this is easy compared to strings...) I think there are some ways we can minimize locking, and I think we have some unpleasant potential issues to deal with in the interaction between strings and threads (I thought we could dodge that, but, well... I was wrong). This needs more thought and more work before we go anywhere. Some of the obvious stuff, like fixing up the cache slot of the PMC, should be done regardless. I also think we need more real-worldish tests for this, so we can see if the problems really are as bad as they seem. That, at least, I think I can help with, since I conveniently happen to have a compiler that targets parrot near-done enough to test some reasonably abusive HLL(ish) code to see what sort of hit we take. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: More on threads
At 1:47 AM + 1/25/04, Pete Lomax wrote: On Sat, 24 Jan 2004 13:59:26 -0500, Gordon Henriksen [EMAIL PROTECTED] wrote: snip It doesn't matter if an int field could read half of a double or v.v.; it won't crash the program. Only pointers matter. snip These rules ensure that dereferencing a pointer will not segfault. In this model, wouldn't catching the segfault Apart from anything else, I don't want to catch segfaults and bus errors in parrot. (Well, OK, that's not true--I *do* want to catch segfaults and bus errors, I just don't think it's feasible, or possible on all platforms) -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
RE: More on threads
Dan Sugalski wrote: Gordon Henriksen wrote: Leopold Toetsch wrote: Gordon Henriksen wrote: ... Best example: morph. morph must die. Morph is necessary. But please note: morph changes the vtable of the PMC to point to the new data types table. It has nothing to do with a typed union. The vtable IS the discriminator. I'm referring to this: typedef union UnionVal { struct {/* Buffers structure */ void * bufstart; size_t buflen; } b; struct {/* PMC unionval members */ DPOINTER* _struct_val; /* two ptrs, both are defines */ PMC* _pmc_val; } ptrs; INTVAL int_val; FLOATVAL num_val; struct parrot_string_t * string_val; } UnionVal; So long as the discriminator does not change, the union is type stable. The vtable's not the discriminator there, the flags in the pmc are the discriminator, as they're what indicates that the union's a GCable thing or not. I will admit, though, that looks *very* different than it did when I put that stuff in originally. (It used to be just a union of FLOATVAL, INTVAL, and string pointer...) Hm. Well, both are a discriminator, then; dispatch to code which presumes the contents of the union is quite frequently done without examining the flags. Maybe use a VTABLE func instead to get certain flags? i.e., INTVAL parrot_string_get_flags(..., PMC *pmc) { return PMC_FLAG_IS_POBJ + ...; } Then, updating the vtable would atomically update the flags as well. Or, hell, put the flags directly in the VTABLE if it's not necessary for them to vary across instances. I have the entire source tree (save src/ tests) scoured of that rat's nest of macros for accessing PMC/PObj fields, but I broke something and haven't had the motivation to track down what in the multi-thousand- line-diff it was, yet. :( Else you'd have the patch already and plenty of mobility in the layout of that struct. Near time to upgrade my poor old G3, methinks; the build cycle kills me when I touch parrot/pobj.h. Do any PMC classes use *both* struct_val *and* pmc_val concurrently? 
I was looking for that, but am afraid I didn't actually notice. -- Gordon Henriksen IT Manager ICLUBcentral Inc. [EMAIL PROTECTED]
Re: More on threads
Dan Sugalski [EMAIL PROTECTED] wrote: [ PObj union ] Still, point taken. That needs to die and it needs to die now. For the moment, lets split it into two pieces, a buffer pointer and an int/float union, so we don't have to guess whether the contents have issues with threads. The Buffer members (bufstart, buflen) of the union are never used for a PMC. Also a PMC can't get converted into a Buffer or vv. These union members are just there for DOD, so that one pobject_lives() (and other functions) can be used for both PMCs and Buffers. That was introduced when uniting Buffers and PMCs. I don't see a problem with that. The problem that Gordon expressed with morph is:
  thread1: PerlInt->vtable->set_string_native (int_val = 3)
  thread1: LOCK()
  thread1: perlscalar->vtable->morph: pmc->vtable is now a PerlString vtable, str_val is invalid
  thread2: read access on pmc - non-locked
  thread2: PerlString->vtable->get_integer
  thread2: STRING *s = pmc->str_val
  thread2: SIGBUS/SEGV on access of s
But that can be solved by first clearing str_val, then changing the vtable. leo
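The ordering suggested above (clear str_val first, then switch the vtable) can be sketched as follows. All types and function names here are hypothetical stand-ins rather than Parrot's real ones, and the sketch is single-threaded: on real hardware the two stores would also need a memory barrier or atomic stores to be seen in this order by another thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Pmc Pmc;
typedef struct VTable { long (*get_integer)(const Pmc *); } VTable;
typedef struct Str { const char *chars; } Str;   /* stand-in for STRING */

struct Pmc {
    const VTable *vtable;
    union { long int_val; Str *str_val; } u;
};

static long int_get_integer(const Pmc *p) { return p->u.int_val; }

static long str_get_integer(const Pmc *p)
{
    const Str *s = p->u.str_val;
    if (s == NULL)              /* morph in progress, or no string yet */
        return 0;
    return strtol(s->chars, NULL, 10);
}

static const VTable int_vtable = { int_get_integer };
static const VTable str_vtable = { str_get_integer };

/* Step 1 makes the slot a valid (NULL) pointer; only then does step 2
 * expose the string vtable to readers. */
static void morph_to_string(Pmc *p)
{
    p->u.str_val = NULL;        /* step 1: clear the slot */
    p->vtable = &str_vtable;    /* step 2: switch the vtable */
}

static void set_string(Pmc *p, Str *s)
{
    if (p->vtable != &str_vtable)
        morph_to_string(p);
    p->u.str_val = s;
}
```

A non-locked reader that still holds the old integer vtable may observe 0 instead of the previous integer during the morph, but it never dereferences stale integer bits as a string pointer.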
RE: More on threads
Leopold Toetsch wrote: Gordon Henriksen wrote: ... in the multi-thousand- line-diff it was, yet. :( Else you'd have the patch already 1) *no* multi-thousands line diffs 2) what is the problem, you like to solve? Er? Extending to the rest of the source tree the huge patch to classes which you already applied. No logic changes; just cleaning those PObj accessor macros up. -- Gordon Henriksen IT Manager ICLUBcentral Inc. [EMAIL PROTECTED]
Re: More on threads
Gordon Henriksen wrote: Er? Extending to the rest of the source tree the huge patch to classes which you already applied. No logic changes; just cleaning those PObj accessor macros up. Ah sorry, that one. Please send in small bunches, a few files changed at once. leo
Re: Start of thread proposal
Dan Sugalski [EMAIL PROTECTED] wrote: At 3:08 PM +0100 1/24/04, Leopold Toetsch wrote: Finalizers and incremental DOD don't play together. The DOD must run to the end to be sure that the object isn't referenced any more. Finalizers and incremental DOD work just fine together. At some point the incremental DOD will figure out that something's dead, just as the stop-the-world DOD will. It just may take a bit longer. I wanted to say: finalizers, meaning destructors of PMCs that need timely destruction, in the case of dead objects at scope exit. leo
Re: More on threads
Gordon Henriksen wrote: Hm. Well, both are a discriminator, then; dispatch to code which presumes the contents of the union is quite frequently done without examining the flags. The flags are *never* consulted for a vtable call in classes/*. DOD does different things if a Buffer or PMC is looked at, but that doesn't matter here. Then, updating the vtable would atomically update the flags as well. Doesn't matter. Or, hell, put the flags directly in the VTABLE if it's not necessary for them to vary across instances. No, flags are mutable and per PMC *not* per class. ... in the multi-thousand-line-diff it was, yet. :( Else you'd have the patch already 1) *no* multi-thousands line diffs 2) what is the problem, you like to solve? Do any PMC classes use *both* struct_val *and* pmc_val concurrently? E.g. iterator.pmc. UnmanagedStruct uses int_val & pmc_val. This is no problem. These PMCs don't morph. leo
RE: More on threads
Leopold Toetsch wrote: Gordon Henriksen wrote: Or, hell, put the flags directly in the VTABLE if it's not necessary for them to vary across instances. No, flags are mutable and per PMC *not* per class. Of course there are flags which must remain per-PMC. I wasn't referring to them. Sorry if that wasn't clear. If a flag is only saying "my VTABLE methods use the UnionVal as {a void*/a PObj*/a PMC*/data}, so GC should trace accordingly," it may be a waste of a per-object flag bit to store those flags with the PMC instance rather than with the PMC class. And if it's with the VTABLE, then it doesn't need to be traced. (But, then, all PObjs don't have VTABLES...) Sidebar: If we're looking at lock-free concurrency, flag updates probably have to be performed with atomic &'s and |'s. BUT: Doesn't apply during GC, since other threads will have to be stalled then. Do any PMC classes use *both* struct_val *and* pmc_val concurrently? E.g. iterator.pmc. UnmanagedStruct uses int_val & pmc_val. This is no problem. These PMCs don't morph. Er, int_val and pmc_val at the same time? That's not quite what the layout provides for: typedef union UnionVal { struct {/* Buffers structure */ void * bufstart; size_t buflen; } b; struct {/* PMC unionval members */ DPOINTER* _struct_val; /* two ptrs, both are defines */ PMC* _pmc_val; } ptrs; INTVAL int_val; FLOATVAL num_val; struct parrot_string_t * string_val; } UnionVal; Says to me: struct_val and pmc_val concurrently -- or -- bufstart and buflen concurrently -- or -- int_val -- or -- num_val -- or -- string_val. I don't know if C provides a guarantee that int_val and ptrs._pmc_val won't overlap just because INTVAL and DPOINTER* fields happen to be the same size. At least one optimizing compiler I know of, MrC/MrC++, would do some struct rearrangement when it felt like it. -- Gordon Henriksen IT Manager ICLUBcentral Inc. [EMAIL PROTECTED]
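The overlap question can be checked mechanically. The sketch below mirrors the shape of the quoted union with stand-in types (INTVAL_T, FLOATVAL_T and void * in place of INTVAL, FLOATVAL, DPOINTER* and PMC*), so it is an illustration rather than Parrot's real header. In standard C every union member starts at offset 0 and struct members keep their declaration order, so int_val always shares bytes with ptrs.struct_val, and it reaches into ptrs.pmc_val only if INTVAL is wider than the offset of pmc_val.

```c
#include <assert.h>
#include <stddef.h>

typedef long   INTVAL_T;    /* stand-in for INTVAL */
typedef double FLOATVAL_T;  /* stand-in for FLOATVAL */

/* Same shape as the quoted UnionVal, with hypothetical names. */
typedef union UnionValSketch {
    struct { void *bufstart; size_t buflen; } b;
    struct { void *struct_val; void *pmc_val; } ptrs;
    INTVAL_T   int_val;
    FLOATVAL_T num_val;
    void      *string_val;
} UnionValSketch;

/* Returns 1 if a store to int_val touches bytes of ptrs.pmc_val. */
static int int_val_overlaps_pmc_val(void)
{
    return sizeof(INTVAL_T) > offsetof(UnionValSketch, ptrs.pmc_val);
}
```

Even where this returns 0, the usage stays fragile: as far as the C standard is concerned, storing to one union member leaves the bytes belonging only to other members with unspecified values, so keeping int_val and pmc_val live at the same time relies on compiler behaviour rather than on a language guarantee.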
Re: More on threads
Gordon Henriksen wrote: Leopold Toetsch wrote: No, flags are mutable and per PMC *not* per class. Of course there are flags which must remain per-PMC. I wasn't referring to them. Sorry if that wasn't clear. If a flag is only saying "my VTABLE methods use the UnionVal as {a void*/a PObj*/a PMC*/data}, so GC should trace accordingly," it may be a waste of a per-object flag bit to store those flags with the PMC instance rather than with the PMC class. All DOD related flags in the fast paths (i.e. for marking scalars) are located in the PMC's arena (when ARENA_DOD_FLAGS is on). This reduces cache misses during DOD to nearly nothing. More DOD related information is in the flags part of the PObj - but accessing that also means cache pollution. Putting flags elsewhere too needs one more indirection and always an access to the PMC memory itself. This doesn't give us any advantage. But again, flags don't matter during setting or getting a PMC's data. Flags aren't used in classes for these purposes. There are very few places in classes where flags are even changed. Morphing scalars and Key PMCs come to my mind. If we're looking at lock-free concurrency, flag updates probably have to be performed with atomic &'s and |'s. Almost all mutating vtable methods will lock the pmc. Er, int_val and pmc_val at the same time? I know :) This isn't the safest thing we have. After your union accessor patches, we can clean that up, and use a notation so that for this case, the two union members really can't overlap. leo
Re: More on threads
Leopold Toetsch [EMAIL PROTECTED] wrote: [ perlscalar morph ] But that can be solved by first clearing str_val, then changing the vtable. Fixed. I currently don't see any more problems related to perlscalars. PerlStrings are unsafe per se, as long as we have the copying GC. They need a lock during reading too. All other perlscalars should be safe now for non-locked reading. Mutating vtables get a lock. leo
Re: [DOCS] Updated documentation in src
Would doxygen be of use here? http://www.doxygen.org/ Here's an example use http://www.speex.org/API/refman/speex__bits_8h.html#a2 Follow the links, including to the annotated source file. Tim. On Thu, Jan 29, 2004 at 07:20:50PM +0100, Michael Scott wrote: I've added inline docs to everything in src (except for malloc.c and malloc-trace.c). At times I wondered whether this was the right thing to do. For example, in mmd.c, where Dan had already created a mmd.pod, I ended up duplicating information. At other times I reckoned that what was needed was an autodoc. Other times the best I could do was rephrase the function name. All issues to address in phase 2. Next I think, for a bit of light relief, I'll do the examples. For those who want to browse: http://homepage.mac.com/michael_scott/Parrot/docs/html/ Mike
Mail troubles
Hopefully this makes it to p6i list this week. It seems with the recent worm activity, some ISPs have locked down mail servers even more. I have replied to several personal emails (WRT Parrot) and they are bouncing for various reasons, one of which is because my new ISP is on the DUL blacklist, and my mail server is getting rejections. (Not to mention my mail has been getting lost on the way to p6i as well) If you aren't getting replies from me (Tim Cory for example), I'm trying to resolve the issue. -Melvin
Re: Some namespace notes
On Thu, Jan 29, 2004 at 09:16:33AM -0800, Jeff Clites wrote: Then the question becomes, What about namespace clashes?, which Tim has already addressed. We could certainly do some sort of language-specific prefixing, as Tim suggested, but it seems that we are then going to trouble to unify, only to immediately de-unify. Certainly, a random Java programmer shouldn't have to worry about naming a class so that it doesn't conflict with any class in any other language in the world--that's silly, especially since this Java programmer may not even know about parrot. I think you missed the part where I said that each language has its own root which is actually below the root of the unified namespace. The namespaces of other languages are reached via a backlink/symlink kind of thing so they appear to be within the namespace of the language being used. it seems natural to instead just use that class name as I would expect it to be written--java.lang.String for Java, for example. In Java you would write java.lang.String, naturally, and in Perl you'd write parrot::java::java.lang.String. As per the example I gave previously. Take another look. Tim.
RE: Re[2]: Embedding vs. extending interface types
Dan Sugalski wrote: And pointless. Let's just rename it to Parrot_Interp everywhere. I've submitted a patch for this already. -- Gordon Henriksen IT Manager ICLUBcentral Inc. [EMAIL PROTECTED]
Re: [DOCS] Updated documentation in src
I haven't ruled out something like that in the long term, but what I'm trying to achieve at the moment is just to see some pod everywhere. This has the merit that I visit every file and ensure that some basic information gets provided for the newbies - my target audience. In a sense I'm following the time-honoured tradition of throwing one away, namely the Getting Started Guide on the wiki. I'm shifting pod from there into the files. At the moment I'm just building a big index.html list and using the default html formatting from Pod-Simple, but this will change soon. I think the trick is to model the project with perl modules so that it's straightforward to extract and compose information. I already have the basis for this, which I'll check in any day now. Mike On 30 Jan 2004, at 19:23, Tim Bunce wrote: Would doxygen be of use here? http://www.doxygen.org/ Here's an example use http://www.speex.org/API/refman/speex__bits_8h.html#a2 Follow the links, including to the annotated source file. Tim. On Thu, Jan 29, 2004 at 07:20:50PM +0100, Michael Scott wrote: I've added inline docs to everything in src (except for malloc.c and malloc-trace.c). At times I wondered whether this was the right thing to do. For example, in mmd.c, where Dan had already created a mmd.pod, I ended up duplicating information. At other times I reckoned that what was needed was an autodoc. Other times the best I could do was rephrase the function name. All issues to address in phase 2. Next I think, for a bit of light relief, I'll do the examples. For those who want to browse: http://homepage.mac.com/michael_scott/Parrot/docs/html/ Mike
Re: Mail troubles
Melvin, We're having some issues at perl.org due to the worm/virus/bounce stuff. Slow delivery is just something we're going to have to deal with for a few more days... but bounces I don't like. If your legitimate email is getting bounced by the perl.org (develooper.com) servers, please let me know (off list) and I'll see what I can do. We've had to implement some ugly things, but they shouldn't be bouncing legit mail. -R At Fri, 30 Jan 2004 13:29:53 -0500, Melvin Smith wrote: Hopefully this makes it to p6i list this week. It seems with the recent worm activity, some ISPs have locked down mail servers even more. I have replied to several personal emails (WRT Parrot) and they are bouncing for various reasons, one of which is because my new ISP is on the DUL blacklist, and my mail server is getting rejections. (Not to mention my mail has been getting lost on the way to p6i as well) If you aren't getting replies from me (Tim Cory for example), I'm trying to resolve the issue. -Melvin
Re: DOD, mutation, and generational collectors
On Thursday, January 29, 2004, at 08:41 , Dan Sugalski wrote: So, while I'm heaping much grumpiness on threads (though I suppose, as I've been out of touch for a bit maybe you've all solved the problem. That'd be nice) I've also been thinking about DOD, as there's a fair amount of overlap in the things that cause problems for threads and the ones that cause problems for generational garbage collectors. Can you elaborate a bit on your concept of generational collection as it applies to parrot? To my ear, generational collection implies all of this (which is a lot, and a lot of it grossly inapplicable to parrot as the runtime now stands): GENERATIONAL GARBAGE COLLECTORS Rationale: Most objects are short-lived, and those which are not are less likely to become garbage. Marking or copying the older ones is statistically wasted effort. A garbage collector which usually traces only new objects will thus tend to perform better than one which traces all of them. Implementation: Most generational collectors are copying collectors. They are generally mutations upon a hemispace collector. The variation is that collection always proceeds from a small, fixed-sized space to a larger, growable space, rather than swapping back and forth between two symmetric spaces. These asymmetric spaces are the so-called generations, and the small one containing the younger objects is dubbed the nursery. Moving an object from the nursery is called tenuring. Thus, the generation of an object is not so much a property of an object as it is a property of the pool from which the object is presently allocated. Some generational collectors have intermediate generations, but most do not. (Parrot might benefit from a younger-than-the-nursery generation for impatient destructors, but then that property would need to be ascertainable at construction time.) The nursery is typically explicitly un-fragmented. All live objects in it are tenured from it when it fills. 
Thus, allocation can be as simple as an atomic pointer increment and filling in an object header. The tenured generation might be a heap. To avoid tracing from the roots, the generational collector requires a cooperative mutator to maintain a table of references from tenured objects to objects within the nursery. This is often forced upon the mutator through the use of VM write barriers. The write barrier function marks the VM page as dirty and then removes write protection for that page. When invoked, the garbage collector will examine all dirtied pages, updating the table of references. A truly cooperative mutator will update a table of dirty pages every time it changes a pointer variable. Usually, the pages are smaller than VM pages, and thus called cards instead. This operation can be as simple as card_table[((int) ptr) >> n] = 1; ptr = obj;.[1] An expensive proposition, at least doubling the memory bandwidth overhead of pointer manipulation. But, since cards are smaller than pages, it can reduce the amount of memory which has to be scanned at collection time. Despite being incremental by nature, a generational collector is not a concurrent collector. That is, it still stops the world, it just doesn't do it for quite as long as a traditional hemispace or mark-sweep collector might. Being a copying collector in the traditional mold has its drawbacks. It requires accurate reference tracing, in particular. A traditional copying collector cannot work with a conservative collector. For instance, it cannot operate on the OS stack (unless the program is written very cooperatively), since the collector cannot ascertain for certain whether a value is a pointer (which should be traced and possibly updated) or data (which should be left alone). But parrot has anchored objects and other sorts of things which are incompatible with this. 
In particular, the fast-path nursery allocator (the very fast allocation being a major benefit of generational collectors) can't work: If an object were anchored in the nursery, it must stay there after GC rather than being tenured; this leaves the nursery fragmented and complicates the very hot allocation code path. Then there's the whole "PMCs don't move" guarantee. And there's also the conservative tracing of system areas. Gordon Henriksen [EMAIL PROTECTED] [1] To be that simple, all allocated memory must be contiguous. So it's not usually that simple. More likely, this: (((alloc_header*) obj) - 1)->pool.card_table[((int) obj) >> n]; obj->ptr = newvalue;
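The card-marking barrier described above can be sketched for the simple contiguous-heap case from the footnote. CARD_SHIFT, HEAP_WORDS, heap, card_index and write_ptr are illustrative names and values chosen for this sketch, not anything from the Parrot sources.

```c
#include <assert.h>
#include <stddef.h>

#define CARD_SHIFT 9        /* 512-byte cards; an arbitrary choice */
#define HEAP_WORDS 8192     /* size of the toy heap, in pointer slots */

/* One contiguous heap of pointer-sized slots, so a card index is just
 * the byte offset into the heap shifted right by CARD_SHIFT. */
static void *heap[HEAP_WORDS];
static unsigned char card_table[(HEAP_WORDS * sizeof(void *)) >> CARD_SHIFT];

/* Map an address inside the heap to the card that covers it. */
static size_t card_index(const void *addr)
{
    size_t offset = (size_t)((const char *)addr - (const char *)heap);
    return offset >> CARD_SHIFT;
}

/* Cooperative write barrier: dirty the card, then store the pointer. */
static void write_ptr(void **slot, void *value)
{
    card_table[card_index(slot)] = 1;
    *slot = value;
}
```

At collection time only the slots covered by dirty cards need scanning for references into the nursery. The per-pool variant in the footnote replaces the single global table with one found through the allocation header.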