Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-15 Thread Leandro Lucarella
Fawzi Mohamed, on April 15 at 14:57, you wrote to me:
> >Well, if it turns out to be a win, I'm sure we could put it into LDC.
> >DMD would be up to Walter.
> 
> and tango will also for sure welcome a new gc implementation.

Well, right now I'm working on a minimal, naive, fully documented GC
implementation, as an exercise mostly, but I think it can be great for
educational / "documentational" purposes. I plan to submit it to
Tango/druntime when it's done.

> Most of the issues, and how to modify things to address them, were already
> discussed. Personally I like a blocked approach (i.e. flag+size) more than a
> full bitmap; in the future one can think of the compiler clustering pointer
> types together to reduce the number of blocks. Subclassing means that you
> will always have some blocks, but it is still probably better than the
> bitmap. I don't like that at the moment typeinfo takes up so much space (at
> least the size of the type).
> To get all the info, offTi aside (which is correct only on LDC as far as I
> know), tango.core.RuntimeTraits could be useful.
> 
> 
> add support for weak pointers (that at the moment are normally stored as
> non pointers), fvbommel had a place for them in his enum values
> 
> at the moment the values in the registers are dumped, but not read back,
> either you change that, or all those values should be pinned (just like
> all union/maybe pointers)
> 
> tango io uses void[] arrays to take advantage of the auto cast, but
> these are not pointers (and the gc knows this because at the moment the
> flags used for an array are the ones used to allocate it the first time).
> 
> during the collection you need to stop the threads (at least in the
> moving gc algorithms, and in the current mark and sweep).  While the
> threads are stopped you have very stringent constraints, basically the
> same constraints as for a signal handler.
> You cannot call any non signal safe function, not even acquire posix locks.
> So try to do the least possible in that phase, and be very careful.

Thanks for all the suggestions, they are very useful.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)



Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-15 Thread Fawzi Mohamed
On 2009-04-13 20:33:53 +0200, Frits van Bommel said:



Leandro Lucarella wrote:

Frits van Bommel, on April 13 at 19:36, you wrote to me:

Leandro Lucarella wrote:

Frits van Bommel, on April 13 at 13:30, you wrote to me:

Or you can pin anything that's referenced from the stack, and move
anything that is only referenced from the heap.

That's more likely to happen, but it requires a compiler change too
(provide type information on allocation). Maybe I wasn't too clear,
I didn't mean to say that a moving collector is impossible, what is
impossible is to make allocation a "pointer bump".
The compiler already passes a TypeInfo on allocations IIRC. And 
TypeInfo can produce an OffsetTypeInfo[], it just happens that DMD and 
GDC don't fill it in for user-defined aggregates, and LDC needs a 
compile-time #define to enable it (because it breaks linking the Tango 
runtime, IIRC).

(For other types, the fact that it returns null is a simple library issue)

Well, this is nice to know (even when it's not used yet, it's better than
nothing). And how can the GC obtain this kind of information?

Well, since the allocation routines should all get a TypeInfo reference
from the compiler, the GC can store the typeinfo for each memory block
somewhere, and later use it. It can then call ti->offTi() which should
return an array of OffsetTypeInfo structs (see object.d[i]). The only
caveat is that those array return values should be statically allocated;
the GC probably won't like an allocation happening during collections...




But right now gc_malloc() doesn't take any TypeInfo argument. I can't see
where I can get the TypeInfo in the first place =/


Ah, you're right. But if you'll look at your nearest lifetime.d[1] 
you'll see that all the allocation routines called by the compiler *do* 
provide a TypeInfo, so apparently it's just not propagated to gc_*. So 
I guess the first thing to do would be to either

   (a) change the signature of gc_{malloc,calloc,extend}()
or
   (b) add something like gc_settype(void*, TypeInfo)...


[1]: Tango name, and presumably druntime as well; I think it's spread 
all over the place for Phobos 1.



I have no idea how efficient this would be, however. My guess would be
not very.

I'm not concerned about efficiency, I'm more concerned about non-trivial
compiler changes.

Well, efficiency is important too.


Sure, and it's really hard to guess how efficient that could be (you
lose some efficiency in some cases but you probably gain a lot in other
cases if most allocations are a pointer bump). What I meant is that I can
test efficiency, to see if this is really viable or not, but it's very
hard for me to change the compiler (and it's even less likely that those
changes would be accepted "upstream", and one of my thesis goals is to
make something useful, that can be easily adopted, not just an academic
curiosity =).


Well, if it turns out to be a win, I'm sure we could put it into LDC. 
DMD would be up to Walter.


and tango will also for sure welcome a new gc implementation.

Most of the issues, and how to modify things to address them, were already
discussed. Personally I like a blocked approach (i.e. flag+size) more
than a full bitmap; in the future one can think of the compiler clustering
pointer types together to reduce the number of blocks. Subclassing
means that you will always have some blocks, but it is still probably
better than the bitmap. I don't like that at the moment typeinfo takes
up so much space (at least the size of the type).
To get all the info, offTi aside (which is correct only on LDC as far
as I know), tango.core.RuntimeTraits could be useful.



add support for weak pointers (that at the moment are normally stored
as non pointers), fvbommel had a place for them in his enum values


at the moment the values in the registers are dumped, but not read
back, either you change that, or all those values should be pinned
(just like all union/maybe pointers)


tango io uses void[] arrays to take advantage of the auto cast, but
these are not pointers (and the gc knows this because at the moment the
flags used for an array are the ones used to allocate it the first time).


during the collection you need to stop the threads (at least in the
moving gc algorithms, and in the current mark and sweep).
While the threads are stopped you have very stringent constraints, 
basically the same constraints as for a signal handler.

You cannot call any non signal safe function, not even acquire posix locks.
So try to do the least possible in that phase, and be very careful.






Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-14 Thread Robert Jacques
On Tue, 14 Apr 2009 11:34:05 -0400, Frits van Bommel wrote:

Robert Jacques wrote:
On Tue, 14 Apr 2009 09:27:09 -0400, Frits van Bommel wrote:

Robert Jacques wrote:
On Tue, 14 Apr 2009 06:04:01 -0400, Frits van Bommel wrote:
Using D2 structs with a moving GC would need some extra bookkeeping  
data anyway, to work out things like their postblit call.
Postblit is only called when generating an actual copy. For example
it is not called on assignment if the source is no longer used. So I
don't see any reason why it should, or why it would be expected that
postblit would run when a struct was moved by the GC.


Oh, I didn't know that. (I haven't done much of anything with D2, I  
mostly stick to D1)

I just presumed they were like C++ copy constructors.

As an aside: I can certainly think of some places where it would be  
useful to have them get called whenever the address changes...
(Though "move constructors" would be even better for most of those  
cases)
 Could you document this use case? (i.e. give some examples as I can't  
think of any)


Any situation in which structs register themselves somewhere for one  
reason or another.


For example, I read that C++'s shared_ptr<> could be implemented by  
having the instances keep a doubly-linked list of themselves instead of  
using an extra heap allocation for the reference count. Such an  
implementation would need to update the pointers in neighboring nodes  
when moved, or insert itself before or after the original when copied.


Umm... aren't stack values not guaranteed to be stable? (i.e. isn't this
like playing Russian roulette with your optimizer?)


Note that shared_ptr<> is not only useful for memory resources, it could  
also be used to e.g. keep a file handle or socket open until all users  
are done with it (and not longer, as you might get with a GC'ed file  
class).
Of course, in this case the more traditional approach with a  
heap-allocated reference (or even storing it in the Monitor structure  
each object has a pointer to) would be just as viable.
But you could also implement weak references in a similar way, to let  
the GC find them in a linked list and allow them to be nulled when  
referred-to objects get collected.


Not all moving GCs (or copying GCs, for that matter) move every single
object all the time, so this hack doesn't work. On the other hand, GC-User
cache interaction would be nice (i.e. telling the GC about user free-lists, etc)



There are probably other use cases...


Well, these are use cases for a language-level move operator, as it
would allow for slightly better performance than a copy and dtor pair in
some cases. (Sorry, I was thinking only about moving GCs when I posted,
for which these aren't use cases)


Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-14 Thread Frits van Bommel

Robert Jacques wrote:
On Tue, 14 Apr 2009 09:27:09 -0400, Frits van Bommel wrote:

Robert Jacques wrote:
On Tue, 14 Apr 2009 06:04:01 -0400, Frits van Bommel wrote:
Using D2 structs with a moving GC would need some extra bookkeeping 
data anyway, to work out things like their postblit call.
Postblit is only called when generating an actual copy. For example
it is not called on assignment if the source is no longer used. So I
don't see any reason why it should, or why it would be expected that
postblit would run when a struct was moved by the GC.


Oh, I didn't know that. (I haven't done much of anything with D2, I 
mostly stick to D1)

I just presumed they were like C++ copy constructors.

As an aside: I can certainly think of some places where it would be 
useful to have them get called whenever the address changes...

(Though "move constructors" would be even better for most of those cases)


Could you document this use case? (i.e. give some examples as I can't 
think of any)


Any situation in which structs register themselves somewhere for one reason or 
another.


For example, I read that C++'s shared_ptr<> could be implemented by having the 
instances keep a doubly-linked list of themselves instead of using an extra heap 
allocation for the reference count. Such an implementation would need to update 
the pointers in neighboring nodes when moved, or insert itself before or after 
the original when copied.
Note that shared_ptr<> is not only useful for memory resources, it could also be 
used to e.g. keep a file handle or socket open until all users are done with it 
(and not longer, as you might get with a GC'ed file class).
Of course, in this case the more traditional approach with a heap-allocated 
reference (or even storing it in the Monitor structure each object has a pointer 
to) would be just as viable.


But you could also implement weak references in a similar way, to let the GC 
find them in a linked list and allow them to be nulled when referred-to objects 
get collected.


There are probably other use cases...


Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-14 Thread Robert Jacques
On Tue, 14 Apr 2009 09:27:09 -0400, Frits van Bommel wrote:

Robert Jacques wrote:
On Tue, 14 Apr 2009 06:04:01 -0400, Frits van Bommel wrote:

Robert Jacques wrote:
it instead. (You'd have to create a fake ClassInfo for structs and  
arrays.) Then the GC only has to track the start of each object (i.e.  
the beginning of a block in the current GC). The advantage is that  
this has 0 storage requirements for objects and on average < 4 bytes  
for structs and arrays (thanks to the coarse block sizes of the  
current GC).


(that'd be < 8 for a 64-bit machine?)
Yes. The key point is that it's a per-item cost which decreases with item
size, as opposed to a fixed 6.25% overhead when using a dense bitmask.


I already mentioned the bitmask overhead could be bounded to  
pointer-size by falling back to a TypeInfo-based solution for memory  
blocks where that overhead would otherwise exceed (or match) the size of  
a pointer.


Sorry, I've been looking at non-freelist based GCs where that
optimization is not available. Also, some limitations associated with a
variable-length page header might be an issue (i.e. a free page with
512B blocks can't be re-purposed as a page with 256B blocks).




Using D2 structs with a moving GC would need some extra bookkeeping  
data anyway, to work out things like their postblit call.
Postblit is only called when generating an actual copy. For example it
is not called on assignment if the source is no longer used. So I don't
see any reason why it should, or why it would be expected that postblit
would run when a struct was moved by the GC.


Oh, I didn't know that. (I haven't done much of anything with D2, I  
mostly stick to D1)

I just presumed they were like C++ copy constructors.

As an aside: I can certainly think of some places where it would be  
useful to have them get called whenever the address changes...

(Though "move constructors" would be even better for most of those cases)


Could you document this use case? (i.e. give some examples as I can't  
think of any)





Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-14 Thread Frits van Bommel

Robert Jacques wrote:
On Tue, 14 Apr 2009 06:04:01 -0400, Frits van Bommel wrote:

Robert Jacques wrote:
it instead. (You'd have to create a fake ClassInfo for structs and 
arrays.) Then the GC only has to track the start of each object (i.e. 
the beginning of a block in the current GC). The advantage is that 
this has 0 storage requirements for objects and on average < 4 bytes 
for structs and arrays (thanks to the coarse block sizes of the 
current GC).


(that'd be < 8 for a 64-bit machine?)


Yes. The key point is that it's a per-item cost which decreases with item
size, as opposed to a fixed 6.25% overhead when using a dense bitmask.


I already mentioned the bitmask overhead could be bounded to pointer-size by 
falling back to a TypeInfo-based solution for memory blocks where that overhead 
would otherwise exceed (or match) the size of a pointer.


Using D2 structs with a moving GC would need some extra bookkeeping 
data anyway, to work out things like their postblit call.


Postblit is only called when generating an actual copy. For example it
is not called on assignment if the source is no longer used. So I don't
see any reason why it should, or why it would be expected that postblit
would run when a struct was moved by the GC.


Oh, I didn't know that. (I haven't done much of anything with D2, I mostly stick 
to D1)

I just presumed they were like C++ copy constructors.

As an aside: I can certainly think of some places where it would be useful to 
have them get called whenever the address changes...

(Though "move constructors" would be even better for most of those cases)

Arrays, by the way, would also need some special handling, since you 
can't return a variable-sized OffsetTypeInfo[] without allocating 
during collections.
(As long as they fit in the limits for the bitfield, that could be 
repeated though -- as long as it's not an array of structs with 
postblits...)
So maybe a .sizeof should somehow be included, and the offsets assumed 
to repeat after that? (as long as enough bytes are left for at least 
one more item)


Actually, I'd assume there'd be an isArray flag in the Class/Type Info, 
which would cause the bitmask to be repeated until the end of the block.


You'd still need to know the size of the bitmask, to know after how many bits to 
repeat it.
But like I said, a ClassInfo would encode the size of the type (as does 
TypeInfo), so any solution based on either of those should do the trick here.
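As a rough illustration of the repeat rule discussed here, the following minimal D sketch assumes a per-element layout with one flag per pointer-sized word plus the element's size. `ElemLayout` and `isPointerAt` are hypothetical names, not part of Tango or druntime, and offsets are assumed to be word-aligned.

```d
/// Hypothetical description of one array element: its size and,
/// for each pointer-sized word inside it, whether that word is a pointer.
struct ElemLayout
{
    size_t elemSize;          // the element's .sizeof, in bytes
    const(bool)[] wordIsPtr;  // one entry per word of a single element
}

/// Decide whether the word at `offset` inside an array block holds a pointer
/// by repeating the element layout every `elemSize` bytes.
bool isPointerAt(ElemLayout layout, size_t blockSize, size_t offset,
                 size_t wordSize = size_t.sizeof)
{
    // Only whole elements are described; trailing bytes that cannot hold
    // one more item are treated as plain data.
    immutable usable = (blockSize / layout.elemSize) * layout.elemSize;
    if (offset >= usable)
        return false;

    // Position of the word inside its element, then look up the flag.
    immutable within = offset % layout.elemSize;
    return layout.wordIsPtr[within / wordSize];
}
```

The sketch only decides how to interpret a word. It does not know how many elements are actually in use, so unused slots would still have to be nulled to avoid false pointers.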


Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-14 Thread Robert Jacques
On Tue, 14 Apr 2009 06:04:01 -0400, Frits van Bommel wrote:

Robert Jacques wrote:
On Mon, 13 Apr 2009 14:54:57 -0400, Frits van Bommel wrote:

[snip]
 An alternative to this is to encode the information in ClassInfo and  
use


It's already there. That's where TypeInfo for classes gets it from :).

it instead. (You'd have to create a fake ClassInfo for structs and  
arrays.) Then the GC only has to track the start of each object (i.e.  
the beginning of a block in the current GC). The advantage is that this  
has 0 storage requirements for objects and on average < 4 bytes for  
structs and arrays (thanks to the coarse block sizes of the current GC).


(that'd be < 8 for a 64-bit machine?)


Yes. The key point is that it's a per-item cost which decreases with item
size, as opposed to a fixed 6.25% overhead when using a dense bitmask.
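The arithmetic behind that comparison can be sketched in D as follows. This is a simplified worst-case model that charges a full pointer-sized word per block for the type reference (the "< 4 bytes on average" figure above relies on slack in coarse block sizes, which is not modelled here). Both function names are hypothetical.

```d
/// Relative overhead of one type-reference word stored in front of a block.
double headerOverhead(size_t blockSize, size_t wordSize)
{
    return cast(double) wordSize / blockSize;
}

/// Relative overhead of a dense bitmask using 2 bits per pointer-sized word.
double bitmaskOverhead(size_t wordSize)
{
    return 2.0 / (8 * wordSize);
}
```

With 4-byte words the bitmask costs 2/32 = 6.25% regardless of block size, while the per-block word costs 25% for a 16-byte block, breaks even at 64 bytes, and keeps shrinking for larger blocks.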


An interesting idea. Indeed, since vtables for objects start with a  
ClassInfo reference, putting a ClassInfo* in front of non-object memory  
blocks should work, if ClassInfo could be generalized to support  
structs, unions, ints, floats, etc...


Using D2 structs with a moving GC would need some extra bookkeeping data  
anyway, to work out things like their postblit call.


Postblit is only called when generating an actual copy. For example it is
not called on assignment if the source is no longer used. So I don't see
any reason why it should, or why it would be expected that postblit would
run when a struct was moved by the GC.
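A small D2 sketch of that behaviour, assuming a struct that counts its own postblit calls (`Tracked` and `make` are made-up names for illustration):

```d
/// Hypothetical struct that counts how often its postblit runs.
struct Tracked
{
    static int postblits;  // thread-local counter of postblit calls
    int value;

    // Postblit: runs after the bitwise copy when a real copy is made.
    this(this) { ++postblits; }
}

/// Returns a temporary; the caller receives it by move, not by copy.
Tracked make(int v)
{
    return Tracked(v);
}
```

Initialising a variable from `make(1)` moves the temporary and leaves the counter untouched, whereas copying an existing variable (`auto b = a;`) increments it once.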


This could be put in the ClassInfo or in the second slot of the fake  
vtable.
(Without the fake classinfo, using a TypeInfo reference instead of the  
bitfield and putting it in there would work too)



Arrays, by the way, would also need some special handling, since you  
can't return a variable-sized OffsetTypeInfo[] without allocating during  
collections.
(As long as they fit in the limits for the bitfield, that could be  
repeated though -- as long as it's not an array of structs with  
postblits...)
So maybe a .sizeof should somehow be included, and the offsets assumed  
to repeat after that? (as long as enough bytes are left for at least one  
more item)


Actually, I'd assume there'd be an isArray flag in the Class/Type Info,  
which would cause the bitmask to be repeated until the end of the block.





Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-14 Thread Frits van Bommel

Robert Jacques wrote:
On Mon, 13 Apr 2009 14:54:57 -0400, Frits van Bommel wrote:

[snip]


An alternative to this is to encode the information in ClassInfo and use 


It's already there. That's where TypeInfo for classes gets it from :).

it instead. (You'd have to create a fake ClassInfo for structs and 
arrays.) Then the GC only has to track the start of each object (i.e. 
the beginning of a block in the current GC). The advantage is that this 
has 0 storage requirements for objects and on average < 4 bytes for 
structs and arrays (thanks to the coarse block sizes of the current GC).


(that'd be < 8 for a 64-bit machine?)

An interesting idea. Indeed, since vtables for objects start with a ClassInfo 
reference, putting a ClassInfo* in front of non-object memory blocks should 
work, if ClassInfo could be generalized to support structs, unions, ints, 
floats, etc...


Using D2 structs with a moving GC would need some extra bookkeeping data anyway, 
to work out things like their postblit call. This could be put in the ClassInfo 
or in the second slot of the fake vtable.
(Without the fake classinfo, using a TypeInfo reference instead of the bitfield 
and putting it in there would work too)



Arrays, by the way, would also need some special handling, since you can't 
return a variable-sized OffsetTypeInfo[] without allocating during collections.
(As long as they fit in the limits for the bitfield, that could be repeated 
though -- as long as it's not an array of structs with postblits...)
So maybe a .sizeof should somehow be included, and the offsets assumed to repeat 
after that? (as long as enough bytes are left for at least one more item)
If we go the fake ClassInfo approach, the ClassInfo.init.length field could be 
used to store this size.
Note that this would likely mean initializing unused parts of memory blocks to 
null[1], since the GC doesn't know how much of them is used and might get false 
pointers otherwise.


All in all, maybe it'd be easier to just go the TypeInfo approach. The extra 
information needed to support non-class types is already conveniently available 
there (type sizes, postblits for D2) and they're already available for all types...

Or maybe the ClassInfo in the vtable could be changed into a TypeInfo? :)


[1]: At least for array blocks. Other blocks likely wouldn't have enough padding 
for extra elements -- unless the extra pointer for non-objects puts the size 
over the limit and the block size is doubled.


Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-13 Thread Robert Jacques
On Mon, 13 Apr 2009 14:54:57 -0400, Frits van Bommel wrote:



Sean Kelly wrote:

Leandro Lucarella wrote:
But right now gc_malloc() doesn't take any TypeInfo argument. I can't see
where I can get the TypeInfo in the first place =/
 The call would have to be modified.  Right now the best you can do is  
pass BlkAttr.NO_SCAN.  And storing a pointer per block could add a good  
bit of bookkeeping overhead for small objects, of course.  Perhaps the  
TypeInfo array could be converted to a bitmap or some such.


Let's see, you'd need 2 bits per pointer-sized block of bytes, to encode  
these possibilities:

a) Yeah, this is a pointer
b) Nope, not a pointer
c) Maybe a pointer (union, void[])
c2) (optional) A (somehow) explicitly pinned pointer (treated identical  
to (c) for GC purposes; needs to be followed during marking, but data  
pointed to can't be moved)

d) (optional, since we have a value left) This is a weak pointer

I'd split these up as such: One bit to indicate that it can be read as a  
pointer (and should thus be followed when marking, for instance) and one  
to indicate it can be written as a pointer (so it can be moved for (a)  
or nulled for (d)). That gives us these values for the two-bit field:

enum PtrBits {
    // Actual values
    JustData      = 0b00,
    MaybePointer  = 0b01,
    PinnedPointer = 0b01,
    WeakPointer   = 0b10,
    Pointer       = 0b11,

    // For '&' tests
    ReadableFlag  = 0b01,
    WritableFlag  = 0b10,
}

Like I said, this would cost 2 bits per pointer-sized chunk, so 1/16 of
the memory block size for 32-bit systems and 1/32 of it for 64-bit
systems. It'd have to be rounded up to a whole number of bytes of
course, and possibly T.alignof if stored at the start of the block.
(Storing it at the end of the block would avoid that)


This could be bounded to one pointer worth of memory per block if the GC  
treats blocks > 16*4 = 64 bytes (on 32-bit systems) or > 32*8 = 256  
bytes (on 64-bit systems) specially by just storing the raw TypeInfo  
reference instead of the bitfield for the memory block.
(Implementer's choice on what to do for  
(size_t.sizeof-1)*4*size_t.sizeof to size_t.sizeof^2 * 4 bytes, where  
the bit-encoded data takes up the same number of bytes as a pointer  
would)


An alternative to this is to encode the information in ClassInfo and use  
it instead. (You'd have to create a fake ClassInfo for structs and  
arrays.) Then the GC only has to track the start of each object (i.e. the  
beginning of a block in the current GC). The advantage is that this has 0  
storage requirements for objects and on average < 4 bytes for structs and  
arrays (thanks to the coarse block sizes of the current GC).


Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-13 Thread Leandro Lucarella
Frits van Bommel, on April 13 at 20:33, you wrote to me:
> >But right now gc_malloc() doesn't take any TypeInfo argument. I can't see
> >where I can get the TypeInfo in the first place =/
> 
> Ah, you're right. But if you'll look at your nearest lifetime.d[1]
> you'll see that all the allocation routines called by the compiler *do*
> provide a TypeInfo, so apparently it's just not propagated to gc_*. So
> I guess the first thing to do would be to either
>   (a) change the signature of gc_{malloc,calloc,extend}()
> or
>   (b) add something like gc_settype(void*, TypeInfo)...

Ok, this is great news! I will certainly experiment with this change to
achieve a more precise heap!

> [1]: Tango name, and presumably druntime as well; I think it's spread
> all over the place for Phobos 1.

Great, I will stick to Tango for now because:
a) It's more likely to accept changes (because of the "stable D1" policy of
   Walter).
b) I'm using LDC, and there is no Phobos support for LDC right now (I
   guess you know that ;)
c) It's more likely to be (forward) compatible with druntime, and thus,
   D2.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)



Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-13 Thread Frits van Bommel

Sean Kelly wrote:

Leandro Lucarella wrote:

But right now gc_malloc() doesn't take any TypeInfo argument. I can't see
where I can get the TypeInfo in the first place =/


The call would have to be modified.  Right now the best you can do is 
pass BlkAttr.NO_SCAN.  And storing a pointer per block could add a good 
bit of bookkeeping overhead for small objects, of course.  Perhaps the 
TypeInfo array could be converted to a bitmap or some such.


Let's see, you'd need 2 bits per pointer-sized block of bytes, to encode these 
possibilities:

a) Yeah, this is a pointer
b) Nope, not a pointer
c) Maybe a pointer (union, void[])
c2) (optional) A (somehow) explicitly pinned pointer (treated identical to (c) 
for GC purposes; needs to be followed during marking, but data pointed to can't 
be moved)

d) (optional, since we have a value left) This is a weak pointer

I'd split these up as such: One bit to indicate that it can be read as a pointer 
(and should thus be followed when marking, for instance) and one to indicate it 
can be written as a pointer (so it can be moved for (a) or nulled for (d)). That 
gives us these values for the two-bit field:

enum PtrBits {
    // Actual values
    JustData      = 0b00,
    MaybePointer  = 0b01,
    PinnedPointer = 0b01,
    WeakPointer   = 0b10,
    Pointer       = 0b11,

    // For '&' tests
    ReadableFlag  = 0b01,
    WritableFlag  = 0b10,
}

Like I said, this would cost 2 bits per pointer-sized chunk, so 1/16 of the
memory block size for 32-bit systems and 1/32 of it for 64-bit systems. It'd
have to be rounded up to a whole number of bytes of course, and possibly
T.alignof if stored at the start of the block. (Storing it at the end of the
block would avoid that)
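One possible way to pack and query these two-bit tags is sketched below in D. It is only an illustration under stated assumptions: the enum members come from the message above, while `bitmapBytes`, `setTag`, `getTag`, `mustFollow` and `mayRewrite` are hypothetical helper names, and tags are packed four to a byte, lowest bits first.

```d
enum PtrBits : ubyte
{
    // Actual values
    JustData      = 0b00,
    MaybePointer  = 0b01,
    PinnedPointer = 0b01, // deliberately the same encoding as MaybePointer
    WeakPointer   = 0b10,
    Pointer       = 0b11,

    // For '&' tests
    ReadableFlag  = 0b01,
    WritableFlag  = 0b10,
}

/// Bytes needed for a block: 2 bits per pointer-sized word, rounded up.
size_t bitmapBytes(size_t blockSize, size_t wordSize = size_t.sizeof)
{
    immutable words = (blockSize + wordSize - 1) / wordSize;
    return (words * 2 + 7) / 8;
}

/// Store the tag for word number `word` (four tags per byte).
void setTag(ubyte[] bitmap, size_t word, PtrBits tag)
{
    immutable shift = cast(uint)((word % 4) * 2);
    bitmap[word / 4] = cast(ubyte)((bitmap[word / 4] & ~(0b11 << shift))
                                   | (tag << shift));
}

/// Read back the tag for word number `word`.
PtrBits getTag(const(ubyte)[] bitmap, size_t word)
{
    immutable shift = cast(uint)((word % 4) * 2);
    return cast(PtrBits)((bitmap[word / 4] >> shift) & 0b11);
}

/// The word can be read as a pointer and must be followed when marking.
bool mustFollow(PtrBits tag) { return (tag & PtrBits.ReadableFlag) != 0; }

/// The word can be written as a pointer (updated on a move, or nulled).
bool mayRewrite(PtrBits tag) { return (tag & PtrBits.WritableFlag) != 0; }
```

For a 64-byte block with 4-byte words this gives 16 tags in 4 bytes, which matches the 1/16 figure. A `WeakPointer` word is writable but not followed, so it can be nulled without keeping its target alive.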


This could be bounded to one pointer worth of memory per block if the GC treats 
blocks > 16*4 = 64 bytes (on 32-bit systems) or > 32*8 = 256 bytes (on 64-bit 
systems) specially by just storing the raw TypeInfo reference instead of the 
bitfield for the memory block.
(Implementer's choice on what to do for (size_t.sizeof-1)*4*size_t.sizeof to 
size_t.sizeof^2 * 4 bytes, where the bit-encoded data takes up the same number 
of bytes as a pointer would)
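The bound described here can be checked with a short D sketch. `Metadata`, `tagBytes`, `chooseMetadata` and `metadataBytes` are hypothetical names, and resolving the tie (bitmap exactly as large as a pointer) in favour of the TypeInfo reference is just one of the two implementer's choices mentioned above.

```d
/// Which kind of per-block metadata a collector might keep (hypothetical).
enum Metadata { Bitmap, TypeInfoRef }

/// Size of the 2-bits-per-word bitmap for a block, rounded up to whole bytes.
size_t tagBytes(size_t blockSize, size_t wordSize)
{
    immutable words = (blockSize + wordSize - 1) / wordSize;
    return (words * 2 + 7) / 8;
}

/// Use the bitmap while it is strictly smaller than a pointer; otherwise
/// store a raw TypeInfo reference (ties go to the reference in this sketch).
Metadata chooseMetadata(size_t blockSize, size_t wordSize = size_t.sizeof)
{
    return tagBytes(blockSize, wordSize) < wordSize
        ? Metadata.Bitmap
        : Metadata.TypeInfoRef;
}

/// Resulting metadata size: never more than one pointer per block.
size_t metadataBytes(size_t blockSize, size_t wordSize = size_t.sizeof)
{
    immutable bits = tagBytes(blockSize, wordSize);
    return bits < wordSize ? bits : wordSize;
}
```

On a 32-bit layout a 48-byte block needs 3 bytes of tags and keeps the bitmap, while blocks of 49 to 64 bytes already need 4 bytes, the same as a pointer. The corresponding range on 64-bit is 225 to 256 bytes.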


Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-13 Thread Frits van Bommel

Leandro Lucarella wrote:

Frits van Bommel, on April 13 at 19:36, you wrote to me:

Leandro Lucarella wrote:

Frits van Bommel, on April 13 at 13:30, you wrote to me:

Or you can pin anything that's referenced from the stack, and move
anything that is only referenced from the heap.

That's more likely to happen, but it requires a compiler change too
(provide type information on allocation). Maybe I wasn't too clear,
I didn't mean to say that a moving collector is impossible, what is
impossible is to make allocation a "pointer bump".
The compiler already passes a TypeInfo on allocations IIRC. And TypeInfo can 
produce an OffsetTypeInfo[], it just happens that DMD and GDC don't fill it in 
for user-defined aggregates, and LDC needs a compile-time #define to enable it 
(because it breaks linking the Tango runtime, IIRC).

(For other types, the fact that it returns null is a simple library issue)

Well, this is nice to know (even when it's not used yet, it's better than
nothing). And how can the GC obtain this kind of information?

Well, since the allocation routines should all get a TypeInfo reference
from the compiler, the GC can store the typeinfo for each memory block
somewhere, and later use it. It can then call ti->offTi() which should
return an array of OffsetTypeInfo structs (see object.d[i]). The only
caveat is that those array return values should be statically allocated;
the GC probably won't like an allocation happening during collections...


But right now gc_malloc() doesn't take any TypeInfo argument. I can't see
where I can get the TypeInfo in the first place =/


Ah, you're right. But if you'll look at your nearest lifetime.d[1] you'll see 
that all the allocation routines called by the compiler *do* provide a TypeInfo, 
so apparently it's just not propagated to gc_*. So I guess the first thing to do 
would be to either

  (a) change the signature of gc_{malloc,calloc,extend}()
or
  (b) add something like gc_settype(void*, TypeInfo)...
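Option (b) could look roughly like the D sketch below: a side table keyed by block address, filled in by a wrapper that allocates and then tags the block. Everything here is an assumption for illustration. `TypeTable`, `setType`, `typeOf` and `allocTyped` do not exist in Tango or druntime, and a real collector could not use a GC-allocated associative array for its own bookkeeping.

```d
import core.memory : GC;

/// Hypothetical side table mapping block start addresses to their TypeInfo.
struct TypeTable
{
    private TypeInfo[void*] types;

    /// Record the type of the block starting at `block`
    /// (the idea behind a gc_settype-style call).
    void setType(void* block, TypeInfo ti)
    {
        types[block] = ti;
    }

    /// Return the recorded type, or null if the block was never tagged.
    TypeInfo typeOf(void* block)
    {
        auto entry = block in types;
        return entry is null ? null : *entry;
    }
}

/// In the spirit of the lifetime.d routines: the caller already has the
/// TypeInfo, so allocate first and then pass the type along.
void* allocTyped(ref TypeTable table, TypeInfo ti)
{
    auto p = GC.malloc(ti.tsize);
    table.setType(p, ti);
    return p;
}
```

Option (a) would instead pass the TypeInfo directly into the allocation call, which avoids the second lookup but changes the existing signatures.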


[1]: Tango name, and presumably druntime as well; I think it's spread all over 
the place for Phobos 1.



I have no idea how efficient this would be, however. My guess would be
not very.

I'm not concerned about efficiency, I'm more concerned about non-trivial
compiler changes.

Well, efficiency is important too.


Sure, and it's really hard to guess how efficient that could be (you
lose some efficiency in some cases but you probably gain a lot in other
cases if most allocations are a pointer bump). What I meant is that I can
test efficiency, to see if this is really viable or not, but it's very
hard for me to change the compiler (and it's even less likely that those
changes would be accepted "upstream", and one of my thesis goals is to
make something useful, that can be easily adopted, not just an academic
curiosity =).


Well, if it turns out to be a win, I'm sure we could put it into LDC. DMD would 
be up to Walter.


Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-13 Thread Sean Kelly

Leandro Lucarella wrote:

Frits van Bommel, on April 13 at 19:36, you wrote to me:

Leandro Lucarella wrote:

Frits van Bommel, on April 13 at 13:30, you wrote to me:

Or you can pin anything that's referenced from the stack, and move
anything that is only referenced from the heap.

That's more likely to happen, but it requires a compiler change too
(provide type information on allocation). Maybe I wasn't too clear,
I didn't mean to say that a moving collector is impossible, what is
impossible is to make allocation a "pointer bump".
The compiler already passes a TypeInfo on allocations IIRC. And TypeInfo can 
produce an OffsetTypeInfo[], it just happens that DMD and GDC don't fill it in 
for user-defined aggregates, and LDC needs a compile-time #define to enable it 
(because it breaks linking the Tango runtime, IIRC).

(For other types, the fact that it returns null is a simple library issue)

Well, this is nice to know (even when it's not used yet, it's better than
nothing). And how can the GC obtain this kind of information?

Well, since the allocation routines should all get a TypeInfo reference
from the compiler, the GC can store the typeinfo for each memory block
somewhere, and later use it. It can then call ti->offTi() which should
return an array of OffsetTypeInfo structs (see object.d[i]). The only
caveat is that those array return values should be statically allocated;
the GC probably won't like an allocation happening during collections...


But right now gc_malloc() doesn't take any TypeInfo argument. I can't see
where I can get the TypeInfo in the first place =/


The call would have to be modified.  Right now the best you can do is 
pass BlkAttr.NO_SCAN.  And storing a pointer per block could add a good 
bit of bookkeeping overhead for small objects, of course.  Perhaps the 
TypeInfo array could be converted to a bitmap or some such.
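
As a rough illustration of the bitmap idea, the sketch below turns a list of pointer offsets into one bit per machine word, so a type shared by many blocks needs only one small bitmap. It is an assumption-laden example: buildPointerBitmap and isPointerWord are hypothetical helpers, and the offsets are passed in by hand, standing in for what an OffsetTypeInfo[] would provide if the compiler filled it in.

```d
/// Build a bitmap with one bit per machine word of a block;
/// a set bit means "this word holds a pointer".
/// pointerOffsets are byte offsets (hypothetical input, see above).
size_t[] buildPointerBitmap(size_t blockSize, const size_t[] pointerOffsets)
{
    enum wordSize = size_t.sizeof;
    enum bitsPerWord = wordSize * 8;
    immutable words = (blockSize + wordSize - 1) / wordSize;
    auto bitmap = new size_t[(words + bitsPerWord - 1) / bitsPerWord];
    foreach (off; pointerOffsets)
    {
        assert(off % wordSize == 0, "pointers are assumed to be word-aligned");
        immutable w = off / wordSize;
        bitmap[w / bitsPerWord] |= (cast(size_t) 1) << (w % bitsPerWord);
    }
    return bitmap;
}

/// Query the bitmap during marking: should this word be followed?
bool isPointerWord(const size_t[] bitmap, size_t wordIndex)
{
    enum bitsPerWord = size_t.sizeof * 8;
    return ((bitmap[wordIndex / bitsPerWord] >> (wordIndex % bitsPerWord)) & 1) != 0;
}

struct Node { size_t key; Node* next; double weight; Node* prev; }

void main()
{
    auto bm = buildPointerBitmap(Node.sizeof,
                                 [Node.next.offsetof, Node.prev.offsetof]);
    assert(isPointerWord(bm, Node.next.offsetof / size_t.sizeof));
    assert(isPointerWord(bm, Node.prev.offsetof / size_t.sizeof));
    assert(!isPointerWord(bm, Node.key.offsetof / size_t.sizeof));
    assert(!isPointerWord(bm, Node.weight.offsetof / size_t.sizeof));
}
```

With a layout like this, the per-block cost is a reference to a shared bitmap (or an index into a table of them) instead of a full TypeInfo walk during marking.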


Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-13 Thread Leandro Lucarella
Frits van Bommel, on April 13 at 19:36, you wrote:
> Leandro Lucarella wrote:
> >Frits van Bommel, on April 13 at 13:30, you wrote:
> Or you can pin anything that's referenced from the stack, and move
> anything that is only referenced from the heap.
> >>>That's more likely to happen, but it requires a compiler change too
> >>>(provide type information on allocation). Maybe I wasn't too clear,
> >>>I didn't mean to say that a moving collector is impossible, what is
> >>>impossible is to make allocation a "pointer bump".
> >>The compiler already passes a TypeInfo on allocations IIRC. And TypeInfo
> >>can produce a TypeInfo[], it just happens that DMD and GDC don't fill it
> >>in for user-defined aggregates, and LDC needs a compile-time #define to
> >>enable it (because it breaks linking the Tango runtime, IIRC).
> >>(For other types, the fact that it returns null is a simple library issue)
> >Well, this is nice to know (even when it's not used yet, it's better than
> >nothing). And how can the GC obtain this kind of information?
> 
> Well, since the allocation routines should all get a TypeInfo reference
> from the compiler, the GC can store the typeinfo for each memory block
> somewhere, and later use it. It can then call ti->offTi() which should
> return an array of OffsetTypeInfo structs (see object.d[i]). The only
> caveat is that those array return values should be statically allocated;
> the GC probably won't like an allocation happening during collections...

But right now gc_malloc() doesn't take any TypeInfo argument. I can't see
where I can get the TypeInfo in the first place =/

> >>I have no idea how efficient this would be, however. My guess would be
> >>not very.
> >I'm not concerned about efficiency, I'm more concerned in non-trivial
> >compiler changes.
> 
> Well, efficiency is important too.

Sure, and it's really hard to guess how efficient it could be (you
lose some efficiency in some cases but you probably gain a lot in other
cases if most allocations are a pointer bump). What I meant is that I can
test efficiency, to see if this is really viable or not, but it's very
hard for me to change the compiler (and it's even harder to get those
changes accepted "upstream", and one of my thesis goals is to
make something useful that can be easily adopted, not just an academic
curiosity =).

> >Anyway I think the important thing here is to at least get a precise heap
> >(I would be nice if one could provide type information for the root set
> >too, I guess).
> >For me there's almost no difference between having a non-precise stack and
> >unions/voids[] or having just non-precise unions/voids[]. You have to
> >support non-movable objects anyway, and I guess the stack is small enough
> >to be a non-problem in practice. I think the cost/benefits ratio of having
> >a precise stack doesn't worth the trouble.
> 
> A precise heap would certainly be a nice starting point, but adding precise
> stack and registers might be a nice improvement over it. Especially for
> things like Tango allocating large stack buffers to avoid heap allocs.
> They're pointerless, but the GC doesn't know that...

A call to gc_addRange() can be done to inform the GC, but of course it
would be really nice if that's not necessary =)

> IIRC there have been some talks on the LLVM mailing list about how to
> emit stack and register maps, so at some point in the future LDC might
> actually support all that...

That's nice. But for now I prefer to target a more general solution (even
though I'm using LDC for the project).

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)



Re: (Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-13 Thread Frits van Bommel

Leandro Lucarella wrote:

Frits van Bommel, on April 13 at 13:30, you wrote:

Or you can pin anything that's referenced from the stack, and move
anything that is only referenced from the heap.

That's more likely to happen, but it requires a compiler change too
(provide type information on allocation). Maybe I wasn't too clear,
I didn't mean to say that a moving collector is impossible, what is
impossible is to make allocation a "pointer bump".
The compiler already passes a TypeInfo on allocations IIRC. And TypeInfo can 
produce a TypeInfo[], it just happens that DMD and GDC don't fill it in for 
user-defined aggregates, and LDC needs a compile-time #define to enable it 
(because it breaks linking the Tango runtime, IIRC).

(For other types, the fact that it returns null is a simple library issue)


Well, this is nice to know (even when it's not used yet, it's better than
nothing). And how can the GC obtain this kind of information?


Well, since the allocation routines should all get a TypeInfo reference from the
compiler, the GC can store the typeinfo for each memory block somewhere, and
later use it. It can then call ti.offTi() which should return an array of
OffsetTypeInfo structs (see object.d[i]). The only caveat is that those array
return values should be statically allocated; the GC probably won't like an
allocation happening during collections...



pointed by that type of fields should not be moved, ever. So, even after
a fresh collection, your heap can be still fragmented. You have to store
information about the "holes" and take care of them. This can be very
light too (in comparison with the actual allocation algorithm), but it can
never be as simple as a "pointer bump" (as requested by David =).

Well, it may technically be possible to move a heap object right before
assignment to a union/void[] or passing to C if the compiler calls
a library function before doing something like that.


Yes, I guess it's technically possible, but again, it needs (AFAIK)
non-trivial compiler changes.


Well, the change to the compiler might not be that big. Detection of unions, 
void[]s and C calls should be pretty simple. The lib routine might be "a bit" 
more complicated...


Though this all assumes the compiler first provides enough information about the 
stack & registers for a moving collector to be feasible -- which would probably 
be a much bigger task.



Then pinned objects could be allocated on a separate part of the heap
that never gets moved (unless no more references in untyped memory are
live, maybe?) and allocations could still be a pointer bump in the
movable part of the heap.


Sure. And what about the non-movable part of the heap ;)
You still have to manage that, you can't simply ignore it. That's what
I meant with this:


So technically, you'll always have to deal with memory fragmentation in
D (I don't think anyone wants to drop unions and void[] =), and it's true
that it can be minimized to almost nothing. But since it's technically
possible, you can never get away from the extra complexity for managing
those rare cases.


Well yeah, you'll still have a non-movable part. Hopefully it'll be much smaller 
than the movable part though. And like I said, allocations can still be pointer 
bumps -- it's the assignments to unions, void[]s and C calls that suffer...




[...]


I have no idea how efficient this would be, however. My guess would be
not very.


I'm not concerned about efficiency, I'm more concerned in non-trivial
compiler changes.


Well, efficiency is important too. This has the potential to trigger what is 
effectively a marking of the entire heap (to find all references to an object 
that needs to be moved *now*) much more often than would otherwise happen. Like 
I said, my guess would be this isn't very efficient.



Anyway I think the important thing here is to at least get a precise heap
(I would be nice if one could provide type information for the root set
too, I guess).

For me there's almost no difference between having a non-precise stack and
unions/voids[] or having just non-precise unions/voids[]. You have to
support non-movable objects anyway, and I guess the stack is small enough
to be a non-problem in practice. I think the cost/benefits ratio of having
a precise stack doesn't worth the trouble.


A precise heap would certainly be a nice starting point, but adding precise 
stack and registers might be a nice improvement over it. Especially for things 
like the Tango allocating large stack buffers to avoid heap allocs. They're 
pointerless, but the GC doesn't know that...



IIRC there have been some talks on the LLVM mailing list about how to emit stack 
and register maps, so at some point in the future LDC might actually support all 
that...


(Semi) precise GC [was: Re: Std Phobos 2 and logging library?]

2009-04-13 Thread Leandro Lucarella
Frits van Bommel, on April 13 at 13:30, you wrote:
> >>Or you can pin anything that's referenced from the stack, and move
> >>anything that is only referenced from the heap.
> >That's more likely to happen, but it requires a compiler change too
> >(provide type information on allocation). Maybe I wasn't too clear,
> >I didn't mean to say that a moving collector is impossible, what is
> >impossible is to make allocation a "pointer bump".
> 
> The compiler already passes a TypeInfo on allocations IIRC. And TypeInfo can 
> produce a TypeInfo[], it just happens that DMD and GDC don't fill it in for 
> user-defined aggregates, and LDC needs a compile-time #define to enable it 
> (because it breaks linking the Tango runtime, IIRC).
> (For other types, this fact it returns null is a simple library issue)

Well, this is nice to know (even when it's not used yet, it's better than
nothing). And how can the GC obtain this kind of information?

> >What I mean is you can be as precise as you want, but as long as union and
> >void[] is there, there always be "might be a pointer" fields, and cells
> 
> Oh, I hadn't read that part yet when I started typing this post :)

=)

> >pointed by that type of fields should not be moved, ever. So, even after
> >a fresh collection, your heap can be still fragmented. You have to store
> >information about the "holes" and take care of them. This can be very
> >light too (in comparison with the actual allocation algorithm), but it can
> >never be as simple as a "pointer bump" (as requested by David =).
> 
> Well, it may technically be possible to move a heap object right before
> assignment to a union/void[] or passing to C if the compiler calls
> a library function before doing something like that.

Yes, I guess it's technically possible, but again, it needs (AFAIK)
non-trivial compiler changes.

> Then pinned objects could be allocated on a separate part of the heap
> that never gets moved (unless no more references in untyped memory are
> live, maybe?) and allocations could still be a pointer bump in the
> movable part of the heap.

Sure. And what about the non-movable part of the heap ;)
You still have to manage that, you can't simply ignore it. That's what
I meant with this:

> >So technically, you'll always have to deal with memory fragmentation in
> >D (I don't think anyone wants to drop unions and void[] =), and it's true
> >that it can be minimized to almost nothing. But since it's technically
> >possible, you can never get away from the extra complexity for managing
> >those rare cases.

[...]

> I have no idea how efficient this would be, however. My guess would be
> not very.

I'm not concerned about efficiency, I'm more concerned about non-trivial
compiler changes.

Anyway I think the important thing here is to at least get a precise heap
(it would be nice if one could provide type information for the root set
too, I guess).

For me there's almost no difference between having a non-precise stack and
unions/void[] or having just non-precise unions/void[]. You have to
support non-movable objects anyway, and I guess the stack is small enough
to be a non-problem in practice. I think the cost/benefit ratio of having
a precise stack isn't worth the trouble.


-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)



Re: Std Phobos 2 and logging library?

2009-04-13 Thread Frits van Bommel

Leandro Lucarella wrote:

Christopher Wright, on April 12 at 17:54, you wrote:

Absolutely.  When writing parallel code to do large scale data mining in D, the
lack of precision and multithreaded allocation are real killers.  My interests
are, in order of importance:

1.  Being able to allocate at least small chunks of memory without locking.
2.  Precise scanning of at least the heap.
3.  Collection w/o stopping the world.
4.  Moving GC so that allocations can be pointer bumps.

3. is my main goal right now. I think 1. can be done using thread-specific
free lists/pools. 2. Is possible too, but bigger changes are needed,
specially in the compiler side (1. and 3. can be completely done in the GC
implementation). 4. is not 100% possible because we can never have a 100%
precise GC, but can be very close if 2. is fixed =)

You can create StackInfo similar to TypeInfo, I suppose, and thus get an
entirely precise GC.


Sure. This is a big (compiler) change, and you probably have to drop
C compatibility (what would you do with C functions' stack frames without
StackInfo? How do you know if a stack frame is from an "untyped"
C function or a "typed" D one? Where do you search for that StackInfo?).
But it's definitely possible in theory.


Actually, it's not possible in D as it stands. Consider:
union U {
    size_t i;
    void* p;
}
There's no way for the GC to know whether an instance of this type is storing a 
pointer or an integer that happens to look like a pointer.
So unless we're dropping support for unions (and void[]s as they exist 
currently), any GC needs to support some things that may either be pointers or 
non-pointers, and (implicitly?) pin allocations accordingly. So stack frames not 
described by a StackInfo instance can just be considered to consist of data that 
may or may not be pointers, just like the union above.
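
To make the ambiguity concrete, here is a small sketch (D2 syntax) showing that the two uses of the union are bit-for-bit indistinguishable, plus the kind of range check a conservative scanner would apply. mightPointInto is a hypothetical helper written for this example, not an existing runtime function.

```d
union U {
    size_t i;
    void* p;
}

/// Conservative test: could this word be a pointer into the given block?
/// Hypothetical helper; a real GC would check against its pool tables.
bool mightPointInto(size_t word, const(void)* base, size_t size)
{
    immutable lo = cast(size_t) base;
    return word >= lo && word < lo + size;
}

void main()
{
    auto obj = new int;
    *obj = 42;

    U asPointer, asInteger;
    asPointer.p = obj;               // a real reference
    asInteger.i = cast(size_t) obj;  // an integer with the same bits

    // Both instances hold identical bits; nothing records which field is live.
    assert(asPointer.i == asInteger.i);

    // So the scanner must treat both as possible references and pin *obj.
    assert(mightPointInto(asPointer.i, obj, int.sizeof));
    assert(mightPointInto(asInteger.i, obj, int.sizeof));
}
```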



Or you can pin anything that's referenced from the stack, and move
anything that is only referenced from the heap.


That's more likely to happen, but it requires a compiler change too
(provide type information on allocation). Maybe I wasn't too clear,
I didn't mean to say that a moving collector is impossible, what is
impossible is to make allocation a "pointer bump".


The compiler already passes a TypeInfo on allocations IIRC. And TypeInfo can 
produce a TypeInfo[], it just happens that DMD and GDC don't fill it in for 
user-defined aggregates, and LDC needs a compile-time #define to enable it 
(because it breaks linking the Tango runtime, IIRC).

(For other types, the fact that it returns null is a simple library issue)


What I mean is you can be as precise as you want, but as long as union and
void[] is there, there always be "might be a pointer" fields, and cells


Oh, I hadn't read that part yet when I started typing this post :)


pointed by that type of fields should not be moved, ever. So, even after
a fresh collection, your heap can be still fragmented. You have to store
information about the "holes" and take care of them. This can be very
light too (in comparison with the actual allocation algorithm), but it can
never be as simple as a "pointer bump" (as requested by David =).


Well, it may technically be possible to move a heap object right before 
assignment to a union/void[] or passing to C if the compiler calls a library 
function before doing something like that. Then pinned objects could be 
allocated on a separate part of the heap that never gets moved (unless no more 
references in untyped memory are live, maybe?) and allocations could still be a 
pointer bump in the movable part of the heap.

I have no idea how efficient this would be, however. My guess would be not very.


So technically, you'll always have to deal with memory fragmentation in
D (I don't think anyone wants to drop unions and void[] =), and it's true
that it can be minimized to almost nothing. But since it's technically
possible, you can never get away from the extra complexity for managing
those rare cases.


Re: Std Phobos 2 and logging library?

2009-04-12 Thread Leandro Lucarella
Christopher Wright, on April 12 at 17:54, you wrote:
> >>Absolutely.  When writing parallel code to do large scale data mining in
> >>D, the lack of precision and multithreaded allocation are real killers.
> >>My interests are, in order of importance:
> >>
> >>1.  Being able to allocate at least small chunks of memory without locking.
> >>2.  Precise scanning of at least the heap.
> >>3.  Collection w/o stopping the world.
> >>4.  Moving GC so that allocations can be pointer bumps.
> >3. is my main goal right now. I think 1. can be done using thread-specific
> >free lists/pools. 2. Is possible too, but bigger changes are needed,
> >specially in the compiler side (1. and 3. can be completely done in the GC
> >implementation). 4. is not 100% possible because we can never have a 100%
> >precise GC, but can be very close if 2. is fixed =)
> 
> You can create StackInfo similar to TypeInfo, I suppose, and thus get an
> entirely precise GC.

Sure. This is a big (compiler) change, and you probably have to drop
C compatibility (what would you do with C functions' stack frames without
StackInfo? How do you know if a stack frame is from an "untyped"
C function or a "typed" D one? Where do you search for that StackInfo?).
But it's definitely possible in theory.

> Or you can pin anything that's referenced from the stack, and move
> anything that is only referenced from the heap.

That's more likely to happen, but it requires a compiler change too
(provide type information on allocation). Maybe I wasn't too clear,
I didn't mean to say that a moving collector is impossible, what is
impossible is to make allocation a "pointer bump".

What I mean is you can be as precise as you want, but as long as union and
void[] are there, there will always be "might be a pointer" fields, and cells
pointed to by that kind of field should not be moved, ever. So, even after
a fresh collection, your heap can still be fragmented. You have to store
information about the "holes" and take care of them. This can be very
light too (in comparison with the actual allocation algorithm), but it can
never be as simple as a "pointer bump" (as requested by David =).

So technically, you'll always have to deal with memory fragmentation in
D (I don't think anyone wants to drop unions and void[] =), and it's true
that it can be minimized to almost nothing. But since it's technically
possible, you can never get away from the extra complexity for managing
those rare cases.
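
To show what "taking care of the holes" could look like, here is a minimal sketch of a bump allocator that has to step over pinned cells left behind by a compacting collection. Everything here (Hole, BumpWithPins, working with plain offsets, no alignment) is a simplifying assumption for illustration, not an existing allocator.

```d
/// A pinned (non-movable) cell occupying offsets [start, end).
struct Hole { size_t start, end; }

/// Bump allocator over a region of `capacity` bytes that skips pinned cells.
struct BumpWithPins
{
    size_t top;       // next candidate offset
    size_t capacity;  // size of the region
    Hole[] pinned;    // sorted by start, non-overlapping

    /// Returns the offset of the new cell, or size_t.max when out of space.
    size_t alloc(size_t size)
    {
        size_t pos = top;
        foreach (h; pinned)
        {
            if (h.end <= pos) continue;       // pinned cell is behind us
            if (pos + size <= h.start) break; // fits before the next pinned cell
            pos = h.end;                      // collision: jump past it
        }
        if (pos + size > capacity) return size_t.max;
        top = pos + size;
        return pos;
    }
}

void main()
{
    auto heap = BumpWithPins(0, 128, [Hole(32, 48), Hole(64, 80)]);
    assert(heap.alloc(24) == 0);            // plain pointer bump
    assert(heap.alloc(24) == 80);           // skipped both pinned cells
    assert(heap.alloc(16) == 104);
    assert(heap.alloc(16) == size_t.max);   // region exhausted
}
```

The extra loop is cheap when pinned cells are rare, but it is still more than a single addition, and the space between offsets 48 and 64 is lost to the second request, which is the residual fragmentation described above.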

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

"PITUFO ENRIQUE" HAS ARRIVED AT THE DANCE HALL
-- Crónica TV


Re: Std Phobos 2 and logging library?

2009-04-12 Thread Brad Roberts
grauzone wrote:
>> You can create StackInfo similar to TypeInfo, I suppose, and thus get
>> an entirely precise GC.
> 
> What about the registers? It isn't that simple.

Not to mention the fun of stack slot reuse, register reuse, etc.  The
stack layout isn't a fixed entity for the entire lifetime of a function
but can shift as execution flows through it.

Later,
Brad


Re: Std Phobos 2 and logging library?

2009-04-12 Thread grauzone
You can create StackInfo similar to TypeInfo, I suppose, and thus get an 
entirely precise GC.


What about the registers? It isn't that simple.


Re: Std Phobos 2 and logging library?

2009-04-12 Thread Christopher Wright

Leandro Lucarella wrote:

dsimcha, on April 11 at 05:21, you wrote:

== Quote from Leandro Lucarella (llu...@gmail.com)'s article

Andrei Alexandrescu, on April 10 at 16:49, you wrote:

And Braddr just made a documentation fix, and Walter only commits
portability stuff and an occasional bug fix now and then, so...
Yes, it really looks like a five-person show =)
I think most work in Phobos now it's done by Andrei, there are other
*collaborators* (the four other you named plus people sending patches), but
it looks like Andrei's show to me. This is not necessarily bad, it's
definitely  better than before, when it was Walter's show, now at least he
can dedicate his efforts in the compiler and language and Phobos is having
a lot more attention.

We'll be very happy to integrate credited contributions from anyone, and
to give dsource.org write access to serious participants. What I think
right now stands in the way of large participation to Phobos is that we
all still learn the ropes of D2; the possibilities are dizzying and we
haven't quite zeroed in on a particular style. Nonetheless, as it's been
noticed I'm always summoning help from this group. So again, if you feel
you want to contribute with ideas and/or code, don't hesitate.

I hope I can come up with something useful with my thesis (improving D's
GC) and I can contribute that. Right now all my energies are focused on
that, and I'm very close to the point to finally start playing with
alternate implementations.
BTW, is there any real interest in adding some more power to the GC
implementator to allow some kind of moving or generational collector?

Absolutely.  When writing parallel code to do large scale data mining in D, the
lack of precision and multithreaded allocation are real killers.  My interests
are, in order of importance:

1.  Being able to allocate at least small chunks of memory without locking.
2.  Precise scanning of at least the heap.
3.  Collection w/o stopping the world.
4.  Moving GC so that allocations can be pointer bumps.


3. is my main goal right now. I think 1. can be done using thread-specific
free lists/pools. 2. Is possible too, but bigger changes are needed,
specially in the compiler side (1. and 3. can be completely done in the GC
implementation). 4. is not 100% possible because we can never have a 100%
precise GC, but can be very close if 2. is fixed =)


You can create StackInfo similar to TypeInfo, I suppose, and thus get an 
entirely precise GC.


Or you can pin anything that's referenced from the stack, and move 
anything that is only referenced from the heap.


Re: Std Phobos 2 and logging library?

2009-04-12 Thread Leandro Lucarella
Denis Koroskin, on April 12 at 21:26, you wrote:
> >I think I'll target D1 for now. The reasons are:
> >* Stability
> >* Free compilers availability (you know what kind of free I'm talking
> >  about =)
> >* Programs availability (I'm trying to gather programs to make a benchmark
> >  suite, without much success unfortunately, only Leonardo Maffi answered
> >  my request for examples[1], and what I need the most are *real* programs)
> >
> >So for know, I'm not considering anything of that. The only thing I'm
> >vaguely considering is thread-specific heaps, to allow lock-free
> >allocation. This has some disadvantages too, so it's low priority for me
> >right now.
> >
> >[1] http://proj.llucax.com.ar/blog/dgc/blog/post/-1382f6a3
> >
> 
> With "thread-local by default" policy, D2 may be *much* more suitable
> for your research, so think twice.

I thought about it more than twice ;-)

That's how I came up with the reasons stated above.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

Y2K
 hmm, nothing major has happend, what an anticlimax
 yeah
 really sucks
 I expected for Australia to sink into the sea or something
 but nn


Re: Std Phobos 2 and logging library?

2009-04-12 Thread Denis Koroskin

On Sun, 12 Apr 2009 21:13:09 +0400, Leandro Lucarella  wrote:


Robert Jacques, on April 11 at 01:05, you wrote:
On Fri, 10 Apr 2009 23:04:16 -0400, Leandro Lucarella wrote:
>I hope I can come up with something useful with my thesis (improving D's
>GC) and I can contribute that. Right now all my energies are focused on
>that, and I'm very close to the point to finally start playing with
>alternate implementations.
>
>BTW, is there any real interest in adding some more power to the GC
>implementator to allow some kind of moving or generational collector?

Yes.

>Here are some good starting points on how to allow better GC support in D:

>http://d.puremagic.com/issues/show_bug.cgi?id=679

I think this should be less a spec issue and more a library issue and
core.memory seems to already have a BlkAttr.NO_MOVE, which covers memory
pinning.


This is just a flag. You need extra information to know when to actually
set that flag, and for that you need some type information. A cell can
be moved when you know everything pointing to it is an actual pointer, so
you can safely overwrite each of those pointers with the new location.


>http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=35426

Well, making the GC type aware/semi-precise (i.e. providing support for
moving/copying collectors) seems like the most important change,


Exactly.


The change to support concurrent GCs affects both performance and code
gen significantly. Also, if D's thread model supports thread-local
heaps, the need for a concurrent GC is vastly reduced (it's only
a benefit to the shared heaps (mutable and immutable), while most
objects would be on the thread-local heaps).


I think I'll target D1 for now. The reasons are:
* Stability
* Availability of free compilers (you know what kind of free I'm talking
  about =)
* Availability of programs (I'm trying to gather programs to make a benchmark
  suite, without much success unfortunately; only Leonardo Maffi answered
  my request for examples[1], and what I need the most are *real* programs)

So for now, I'm not considering any of that. The only thing I'm
vaguely considering is thread-specific heaps, to allow lock-free
allocation. This has some disadvantages too, so it's low priority for me
right now.

[1] http://proj.llucax.com.ar/blog/dgc/blog/post/-1382f6a3



With "thread-local by default" policy, D2 may be *much* more suitable for your 
research, so think twice.



Re: Std Phobos 2 and logging library?

2009-04-12 Thread Leandro Lucarella
dsimcha, on April 11 at 05:21, you wrote:
> == Quote from Leandro Lucarella (llu...@gmail.com)'s article
> > Andrei Alexandrescu, on April 10 at 16:49, you wrote:
> > > >And Braddr just made a documentation fix, and Walter only commits
> > > >portability stuff and an occasional bug fix now and then, so...
> > > >Yes, it really looks like a five-person show =)
> > > >I think most work in Phobos now it's done by Andrei, there are other
> > > >*collaborators* (the four other you named plus people sending
> > > >patches), but it looks like Andrei's show to me. This is not
> > > >necessarily bad, it's definitely better than before, when it was
> > > >Walter's show, now at least he can dedicate his efforts in the
> > > >compiler and language and Phobos is having a lot more attention.
> > >
> > > We'll be very happy to integrate credited contributions from anyone, and
> > > to give dsource.org write access to serious participants. What I think
> > > right now stands in the way of large participation to Phobos is that we
> > > all still learn the ropes of D2; the possibilities are dizzying and we
> > > haven't quite zeroed in on a particular style. Nonetheless, as it's been
> > > noticed I'm always summoning help from this group. So again, if you feel
> > > you want to contribute with ideas and/or code, don't hesitate.
> > I hope I can come up with something useful with my thesis (improving D's
> > GC) and I can contribute that. Right now all my energies are focused on
> > that, and I'm very close to the point to finally start playing with
> > alternate implementations.
> > BTW, is there any real interest in adding some more power to the GC
> > implementator to allow some kind of moving or generational collector?
> 
> Absolutely.  When writing parallel code to do large scale data mining in D,
> the lack of precision and multithreaded allocation are real killers.  My
> interests are, in order of importance:
> 
> 1.  Being able to allocate at least small chunks of memory without locking.
> 2.  Precise scanning of at least the heap.
> 3.  Collection w/o stopping the world.
> 4.  Moving GC so that allocations can be pointer bumps.

3. is my main goal right now. I think 1. can be done using thread-specific
free lists/pools. 2. is possible too, but bigger changes are needed,
especially on the compiler side (1. and 3. can be done completely in the GC
implementation). 4. is not 100% possible because we can never have a 100%
precise GC, but it can come very close if 2. is fixed =)
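
A thread-specific free list for point 1. could look roughly like the sketch below. It is only an illustration under several assumptions: it uses D2, where module-level variables are thread-local by default (in D1 they would be shared and this would need explicit TLS), it takes memory from C malloc instead of a real GC pool, it never returns chunks, and refill, allocSmall and freeSmall are hypothetical names. In a real collector, refill is the only place that would need the global lock.

```d
import core.stdc.stdlib : malloc;
import core.thread : Thread;

enum cellSize = 32;        // one fixed size class, for simplicity
struct Cell { Cell* next; }

Cell* freeList;            // thread-local in D2: each thread has its own list

/// Slow path: carve a fresh chunk into cells (would lock in a real GC).
void refill(size_t count)
{
    auto chunk = cast(ubyte*) malloc(count * cellSize);
    assert(chunk !is null);
    foreach (i; 0 .. count)
    {
        auto c = cast(Cell*) (chunk + i * cellSize);
        c.next = freeList;
        freeList = c;
    }
}

/// Fast path: pop from this thread's list, no locking needed.
void* allocSmall()
{
    if (freeList is null) refill(64);
    auto c = freeList;
    freeList = c.next;
    return c;
}

/// Push a cell back onto this thread's list.
void freeSmall(void* p)
{
    auto c = cast(Cell*) p;
    c.next = freeList;
    freeList = c;
}

void main()
{
    auto a = allocSmall();
    freeSmall(a);
    assert(allocSmall() is a);   // LIFO reuse within the same thread

    auto t = new Thread({
        assert(freeList is null);        // a new thread starts with its own empty list
        assert(allocSmall() !is null);
    });
    t.start();
    t.join();
}
```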

Do you have any example programs that I can use for a benchmark suite?

Thank you.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

I have committed sins, I have done evil, I have been a victim of envy,
selfishness, ambition, lies and frivolity, but I have always been
an Argentine father who wants his son to succeed in life.
-- Ricardo Vaporeso


Re: Std Phobos 2 and logging library?

2009-04-12 Thread Leandro Lucarella
Robert Jacques, on April 11 at 01:05, you wrote:
> On Fri, 10 Apr 2009 23:04:16 -0400, Leandro Lucarella wrote:
> >I hope I can come up with something useful with my thesis (improving D's
> >GC) and I can contribute that. Right now all my energies are focused on
> >that, and I'm very close to the point to finally start playing with
> >alternate implementations.
> >
> >BTW, is there any real interest in adding some more power to the GC
> >implementator to allow some kind of moving or generational collector?
> 
> Yes.
> 
> >Here are some good starting points on how to allow better GC support in D:
> >http://d.puremagic.com/issues/show_bug.cgi?id=679
> 
> I think this should be less a spec issue and more a library issue and
> core.memory seems to already have a BlkAttr.NO_MOVE, which covers memory
> pinning.

This is just a flag. You need extra information to know when to actually
set that flag, and for that you need some type information. A cell can
be moved when you know everything pointing to it is an actual pointer, so
you can safely overwrite each of those pointers with the new location.
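
The rule can be written down as a tiny sketch: a cell is relocated only if every known reference to it is a precise pointer, otherwise it stays pinned. RefKind, Reference and tryMove are hypothetical names for illustration; a real collector would discover the references during marking instead of receiving them as a list.

```d
enum RefKind { precise, ambiguous }

/// One discovered reference to a cell: where it is stored and how sure we are.
struct Reference { void** slot; RefKind kind; }

/// Redirect all references from oldAddr to newAddr, but only if none of
/// them is ambiguous. Returns false (and changes nothing) when pinned.
bool tryMove(Reference[] refs, void* oldAddr, void* newAddr)
{
    foreach (r; refs)
        if (r.kind == RefKind.ambiguous)
            return false;            // might be an integer: must not rewrite it
    foreach (r; refs)
    {
        assert(*r.slot is oldAddr);
        *r.slot = newAddr;           // safe: we know this is a real pointer
    }
    return true;
}

void main()
{
    int oldCell = 1, newCell = 1;

    void* a = &oldCell, b = &oldCell;
    auto refs = [Reference(&a, RefKind.precise), Reference(&b, RefKind.precise)];
    assert(tryMove(refs, &oldCell, &newCell));
    assert(a is &newCell && b is &newCell);

    size_t maybe = cast(size_t) &oldCell;   // e.g. the integer member of a union
    void* c = &oldCell;
    auto mixed = [Reference(&c, RefKind.precise),
                  Reference(cast(void**) &maybe, RefKind.ambiguous)];
    assert(!tryMove(mixed, &oldCell, &newCell));
    assert(c is &oldCell);                  // untouched: the cell stays pinned
}
```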

> >http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=35426
> 
> Well, making the GC type aware/semi-precise (i.e. providing support for
> moving/copying collectors) seems like the most important change,

Exactly.

> The change to support concurrent GCs affects both performance and code
> gen significantly. Also, if D's thread model supports thread-local
> heaps, the need for a concurrent GC is vastly reduced (it's only
> a benefit to the shared heaps (mutable and immutable), while most
> objects would be on the thread-local heaps).

I think I'll target D1 for now. The reasons are:
* Stability
* Availability of free compilers (you know what kind of free I'm talking
  about =)
* Availability of programs (I'm trying to gather programs to make a benchmark
  suite, without much success unfortunately; only Leonardo Maffi answered
  my request for examples[1], and what I need the most are *real* programs)

So for now, I'm not considering any of that. The only thing I'm
vaguely considering is thread-specific heaps, to allow lock-free
allocation. This has some disadvantages too, so it's low priority for me
right now.

[1] http://proj.llucax.com.ar/blog/dgc/blog/post/-1382f6a3

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

325 DAYS LEFT UNTIL SPRING
-- Crónica TV


Re: Std Phobos 2 and logging library?

2009-04-12 Thread Leandro Lucarella
Andrei Alexandrescu, on April 10 at 20:51 you wrote:
> Leandro Lucarella wrote:
> >BTW, is there any real interest in adding some more power to the GC
> >implementator to allow some kind of moving or generational collector?
> 
> That would be awesome!
> 
> >Here are some good starting points on how to allow better GC support in D:
> >http://d.puremagic.com/issues/show_bug.cgi?id=679
> >http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=35426
> >Anyway, if you are interested in my progress, I have a blog[1] where
> >I write almost everything I do related to the subject. The blog it's in
> >Planet D, but Planet D seems to be broken =/
> >[1] http://proj.llucax.com.ar/blog/dgc/blog
> 
> Great, I'll follow. Speaking of readings, here's a paper that I found 
> intriguing. Maybe it could provide inspiration: GC assertions: Using the 
> Garbage 
> Collector to Check Heap Properties (I'd send a link but I can't open the 
> browser; I'm ATM on a Windows machine that has a problem. Long story.)

That looks interesting but it's out of the scope of my thesis. Thanks
anyways.

BTW, here is the link to the paper:
http://www.eecs.tufts.edu/~eaftan/gcassertions-mspc-2008.pdf


-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

careful to all animals (never washing spiders down the plughole),
keep in contact with old friends (enjoy a drink now and then),
will frequently check credit at (moral) bank (hole in the wall),


Re: Std Phobos 2 and logging library?

2009-04-11 Thread Robert Jacques
On Sat, 11 Apr 2009 12:12:07 -0400, Sean Kelly wrote:
> Robert Jacques wrote:
> > On that note, support for per thread GCs in general is another major
> > change. i.e.:
> >
> > GC.malloc!(T)();
> > to
> > Thread.getThis.gc.malloc!(T)(); // Alternatively use thread local storage.
>
> It can remain as GC.malloc().  Exposing a GC handle via thread would
> allow one thread to use another thread's GC.


Yeah, after a night's sleep I realized that GC.malloc() could just wrap  
the underlying implementation.
P.S. The post was meant to illustrate the under-the-hood changes (i.e. not  
public), but didn't come out that way.
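
A minimal sketch of that wrapping idea, assuming D2, where module-level
variables are thread-local by default (in D1 they are shared, so this does
not carry over as written). ThreadHeap, localHeap and MyGC are hypothetical
names for illustration only; none of them are druntime or Tango types, and
the "heap" here just counts allocations.

```d
import core.thread : Thread;

/// Hypothetical per-thread allocator state; not a druntime type.
struct ThreadHeap
{
    size_t allocations; // how many blocks this thread has handed out

    ubyte[] allocate(size_t size)
    {
        ++allocations;
        return new ubyte[](size); // stand-in for a real per-thread heap
    }
}

// Thread-local by default in D2: each thread gets its own copy,
// so touching it needs no lock.
ThreadHeap localHeap;

/// Public facade. Callers never receive a handle to another thread's heap.
struct MyGC
{
    static ubyte[] malloc(size_t size)
    {
        return localHeap.allocate(size);
    }
}

void main()
{
    MyGC.malloc(16);
    MyGC.malloc(32);
    assert(localHeap.allocations == 2);

    size_t seenByWorker;
    auto worker = new Thread({
        MyGC.malloc(8);
        seenByWorker = localHeap.allocations; // the worker's own counter
    });
    worker.start();
    worker.join();

    assert(seenByWorker == 1);          // the worker started with a fresh heap
    assert(localHeap.allocations == 2); // the main thread's heap is untouched
}
```

The public call stays a plain static function, and the per-thread state
stays an implementation detail behind it.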


Re: Std Phobos 2 and logging library?

2009-04-11 Thread Andrei Alexandrescu

Zz wrote:

Andrei Alexandrescu Wrote:


Frank Benoit wrote:

Andrei Alexandrescu wrote:

Zz wrote:

Hi,

Are there any plans for a logging library in Std Phobos 2.0?

Zz

I wanted to add logging support for a while now but am undecided about
the API to use. Log4J is quite popular but quite complicated. There are
a number of simpler APIs out there but I couldn't figure out which is
the best.

If anyone has ideas and/or code to contribute, that would be great.


Andrei

Why not start with the one from tango?
Because it's not my code and every discussion on licensing ends up 
confused. What we can do in Phobos is following e.g. the Log4J API, 
which as far as I understand Tango implements or at least draws 
inspiration from. But then by browsing this group a while ago I figured 
that Tango added a Trace module because some people deemed the logging 
API too complicated.



Why has everything to be
different?

Nobody said it has to be different.


If it really is not important, why do you have to make it
different than tango? Every code that uses tango and phobos, or wants to
support both has to reimplement an intermediate abstraction layer.
Again, I don't *have* to make it different, but I can't *copy* it 
either. There are two other things to consider: (a) Phobos' logging can 
take advantage of D2 features; (b) Phobos' logging should be well 
integrated with the rest of itself, e.g. it may be odd to have one way 
to format things in stdio and an entirely different way in the log, or 
to have the logging infrastructure incompatible with the stream 
infrastructure.


That all being said, I don't see a lot of point in making Phobos' 
logging 100% identical with Tango's. Phobos2 and Tango2 will be usable 
together, so there's no point in the duplication - if you want Tango's 
logging mechanism, you just use it. So there will be no point in 
"supporting both" because both can coexist.



Andrei


It would be good to have one that makes use of D2's features. Have you looked at
"simple-log"? I'm not a Java programmer, but I do know some people who seem to
like it more than Log4J.
Here is the link: https://simple-log.dev.java.net/

Anyway whatever the API looks like one would be welcome.

Zz


That looks interesting, thanks for the pointer.

Andrei


Re: Std Phobos 2 and logging library?

2009-04-11 Thread Zz
Andrei Alexandrescu Wrote:

> Frank Benoit wrote:
> > Andrei Alexandrescu wrote:
> >> Zz wrote:
> >>> Hi,
> >>>
> >>> Are there any plans for a logging library in Std Phobos 2.0?
> >>>
> >>> Zz
> >> I wanted to add logging support for a while now but am undecided about
> >> the API to use. Log4J is quite popular but quite complicated. There are
> >> a number of simpler APIs out there but I couldn't figure out which is
> >> the best.
> >>
> >> If anyone has ideas and/or code to contribute, that would be great.
> >>
> >>
> >> Andrei
> > 
> > Why not start with the one from tango?
> 
> Because it's not my code and every discussion on licensing ends up 
> confused. What we can do in Phobos is following e.g. the Log4J API, 
> which as far as I understand Tango implements or at least draws 
> inspiration from. But then by browsing this group a while ago I figured 
> that Tango added a Trace module because some people deemed the logging 
> API too complicated.
> 
> > Why has everything to be
> > different?
> 
> Nobody said it has to be different.
> 
> > If it really is not important, why do you have to make it
> > different than tango? Every code that uses tango and phobos, or wants to
> > support both has to reimplement an intermediate abstraction layer.
> 
> Again, I don't *have* to make it different, but I can't *copy* it 
> either. There are two other things to consider: (a) Phobos' logging can 
> take advantage of D2 features; (b) Phobos' logging should be well 
> integrated with the rest of itself, e.g. it may be odd to have one way 
> to format things in stdio and an entirely different way in the log, or 
> to have the logging infrastructure incompatible with the stream 
> infrastructure.
> 
> That all being said, I don't see a lot of point in making Phobos' 
> logging 100% identical with Tango's. Phobos2 and Tango2 will be usable 
> together, so there's no point in the duplication - if you want Tango's 
> logging mechanism, you just use it. So there will be no point in 
> "supporting both" because both can coexist.
> 
> 
> Andrei

It would be good to have one that makes use of D2's features. Have you looked at
"simple-log"? I'm not a Java programmer, but I do know some people who seem to
like it more than Log4J.
Here is the link: https://simple-log.dev.java.net/

Anyway whatever the API looks like one would be welcome.

Zz


Re: Std Phobos 2 and logging library?

2009-04-11 Thread Sean Kelly

Robert Jacques wrote:
> On that note, support for per thread GCs in general is another major
> change. i.e.:
>
> GC.malloc!(T)();
> to
> Thread.getThis.gc.malloc!(T)(); // Alternatively use thread local storage.

It can remain as GC.malloc().  Exposing a GC handle via thread would
allow one thread to use another thread's GC.


Re: Std Phobos 2 and logging library?

2009-04-11 Thread Sean Kelly

dsimcha wrote:
> Absolutely.  When writing parallel code to do large scale data mining in D, the
> lack of precision and multithreaded allocation are real killers.  My interests
> are, in order of importance:
>
> 1.  Being able to allocate at least small chunks of memory without locking.

My next big project for Druntime will be to write a GC with per-thread
heaps, but I don't know when that will be.  I've been pretty busy lately.


Re: Std Phobos 2 and logging library?

2009-04-11 Thread Robert Jacques

On Sat, 11 Apr 2009 01:21:04 -0400, dsimcha  wrote:

== Quote from Leandro Lucarella (llu...@gmail.com)'s article

Andrei Alexandrescu, on April 10 at 16:49 you wrote:
> >And Braddr just made a documentation fix, and Walter only commits
> >portability stuff and an occasional bug fix now and then, so...
> >Yes, it really looks like a five-person show =)
> >I think most work in Phobos now it's done by Andrei, there are other
> >*collaborators* (the four other you named plus people sending  
patches), but

> >it looks like Andrei's show to me. This is not necessarily bad, it's
> >definitely  better than before, when it was Walter's show, now at  
least he
> >can dedicate his efforts in the compiler and language and Phobos is  
having

> >a lot more attention.
>
> We'll be very happy to integrate credited contributions from anyone,  
and

> to give dsource.org write access to serious participants. What I think
> right now stands in the way of large participation to Phobos is that  
we

> all still learn the ropes of D2; the possibilities are dizzying and we
> haven't quite zeroed in on a particular style. Nonetheless, as it's  
been
> noticed I'm always summoning help from this group. So again, if you  
feel

> you want to contribute with ideas and/or code, don't hesitate.
I hope I can come up with something useful with my thesis (improving D's
GC) and I can contribute that. Right now all my energies are focused on
that, and I'm very close to the point to finally start playing with
alternate implementations.
BTW, is there any real interest in adding some more power to the GC
implementator to allow some kind of moving or generational collector?


Absolutely.  When writing parallel code to do large scale data mining in  
D, the
lack of precision and multithreaded allocation are real killers.  My  
interests

are, in order of importance:

1.  Being able to allocate at least small chunks of memory without  
locking.


After reading Leandro's blog about the current GC, converting the
free-lists to a lock-free data structure would be a simple (i.e. library
only) way to provide this. Another is to provide per-thread heaps, which I
realized this morning can also be done without changing the compiler.



2.  Precise scanning of at least the heap.
3.  Collection w/o stopping the world.


*Sigh*. A concurrent GC (which is what is generally meant by collection
w/o stopping the world) is actually the wrong choice for you. In
data mining you're generally concerned with throughput. A concurrent
collector is used solely for gaining latency back, and does so by
sacrificing throughput, i.e. the total time your program spends collecting
is increased. A parallel collector is probably what you're looking for,
since it decreases the total collection time (i.e. increases your
throughput). It also reduces the latency on multi-core systems, which is
why you often see synergistic parallel-concurrent collectors. And if you
really want to have your cake (low latency) and eat it too (high
throughput), there are thread-local heaps.



4.  Moving GC so that allocations can be pointer bumps.





Re: Std Phobos 2 and logging library?

2009-04-11 Thread grauzone

dsimcha wrote:
> == Quote from grauzone (n...@example.net)'s article
> > Rioshin an'Harthen wrote:
> > > "Leandro Lucarella" wrote in message
> > > news:20090411030416.ga22...@homero.springfield.home...
> > >> BTW, is there any real interest in adding some more power to the GC
> > >> implementator to allow some kind of moving or generational collector?
> > >
> > > What I mostly want/need from the GC would be determinism. I want to be
> > > able to call delete on a subobject in the destructor of the object being
> > > deleted.
> > >
> > > How many times have I stumbled on this already?
> > Actually, this isn't needed:
> > - if you want to manually free an object, you can add an extra destroy()
> > method
> > - when the object is garbage collected, there's no point in deleting
> > referenced objects, because these are either still alive, or get
> > collected as well
>
> In theory true, but in practice false.  If you have a huge array owned by a
> small class, the huge array can be retained due to false pointers.  Before I
> realized that it's illegal, I used to put delete statements in destructors in
> these kinds of situations and it seemed to work in practice even though it's
> illegal according to the spec, although I never tested it rigorously or really
> thought about how it could break.

Then you should simply use malloc.
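
A minimal sketch of what that could look like, assuming D2 and the C heap
via core.stdc.stdlib. BigBuffer is a hypothetical class invented for this
example. The large block lives outside the GC heap, so it can be released
deterministically, and the destructor touches only C-heap memory rather
than other GC-managed objects. Note that the GC does not scan such a
block, so it must not hold the only reference to a GC object.

```d
import core.stdc.stdlib : free, malloc;

/// Hypothetical small owner of a large buffer kept outside the GC heap.
final class BigBuffer
{
    private double* data;
    private size_t length;

    this(size_t n)
    {
        data = cast(double*) malloc(n * double.sizeof);
        if (data is null)
            throw new Exception("out of memory");
        length = n;
    }

    /// Deterministic release; safe to call more than once.
    void dispose()
    {
        free(data); // free(null) is a no-op
        data = null;
        length = 0;
    }

    /// Only touches C-heap memory, so it does not depend on any other
    /// GC object still being alive when the collector finalizes this one.
    ~this()
    {
        dispose();
    }

    /// View of the buffer as a D slice (empty after dispose()).
    double[] slice()
    {
        return data[0 .. length];
    }
}

void main()
{
    auto buf = new BigBuffer(1_000);
    buf.slice()[] = 1.5;
    assert(buf.slice().length == 1_000);
    assert(buf.slice()[999] == 1.5);

    buf.dispose(); // released right here, not at some later collection
    assert(buf.slice().length == 0);
}
```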


Re: Std Phobos 2 and logging library?

2009-04-11 Thread dsimcha
== Quote from grauzone (n...@example.net)'s article
> Rioshin an'Harthen wrote:
> > "Leandro Lucarella" wrote in message
> > news:20090411030416.ga22...@homero.springfield.home...
> >> BTW, is there any real interest in adding some more power to the GC
> >> implementator to allow some kind of moving or generational collector?
> >
> > What I mostly want/need from the GC would be determinism. I want to be
> > able to call delete on a subobject in the destructor of the object being
> > deleted.
> >
> > How many times have I stumbled on this already?
> Actually, this isn't needed:
> - if you want to manually free an object, you can add an extra destroy()
> method
> - when the object is garbage collected, there's no point in deleting
> referenced objects, because these are either still alive, or get
> collected as well

In theory true, but in practice false.  If you have a huge array owned by a 
small
class, the huge array can be retained due to false pointers.  Before I realized
that it's illegal, I used to put delete statements in destructors in these kinds
of situations and it seemed to work in practice even though it's illegal 
according
to the spec, although I never tested it rigorously or really thought about how 
it
could break.


Re: Std Phobos 2 and logging library?

2009-04-11 Thread Andrei Alexandrescu

grauzone wrote:
> You make no sense. You can look at the Log4J API, but not at Tango's,
> because Phobos should take advantage of D2.0 features?
>
> Yeah, I know, that's not exactly what you said, but come on.


"Shut that cigarette!"

"I'm not smoking and am not a smoker."

"Yeah, I know, but come on."


Andrei


Re: Std Phobos 2 and logging library?

2009-04-11 Thread grauzone
You make no sense. You can look at the Log4J API, but not at Tango's, 
because Phobos should take advantage of D2.0 features?


Yeah, I know, that's not exactly what you said, but come on.


Re: Std Phobos 2 and logging library?

2009-04-11 Thread Andrei Alexandrescu

Frank Benoit wrote:

Andrei Alexandrescu wrote:

Zz wrote:

Hi,

Are there any plans for a logging library in Std Phobos 2.0?

Zz

I wanted to add logging support for a while now but am undecided about
the API to use. Log4J is quite popular but quite complicated. There are
a number of simpler APIs out there but I couldn't figure out which is
the best.

If anyone has ideas and/or code to contribute, that would be great.


Andrei


Why not start with the one from tango?


Because it's not my code and every discussion on licensing ends up 
confused. What we can do in Phobos is following e.g. the Log4J API, 
which as far as I understand Tango implements or at least draws 
inspiration from. But then by browsing this group a while ago I figured 
that Tango added a Trace module because some people deemed the logging 
API too complicated.



Why has everything to be
different?


Nobody said it has to be different.


If it really is not important, why do you have to make it
different than tango? Every code that uses tango and phobos, or wants to
support both has to reimplement an intermediate abstraction layer.


Again, I don't *have* to make it different, but I can't *copy* it 
either. There are two other things to consider: (a) Phobos' logging can 
take advantage of D2 features; (b) Phobos' logging should be well 
integrated with the rest of itself, e.g. it may be odd to have one way 
to format things in stdio and an entirely different way in the log, or 
to have the logging infrastructure incompatible with the stream 
infrastructure.


That all being said, I don't see a lot of point in making Phobos' 
logging 100% identical with Tango's. Phobos2 and Tango2 will be usable 
together, so there's no point in the duplication - if you want Tango's 
logging mechanism, you just use it. So there will be no point in 
"supporting both" because both can coexist.



Andrei


Re: Std Phobos 2 and logging library?

2009-04-11 Thread grauzone

Rioshin an'Harthen wrote:
> "Leandro Lucarella" wrote in message
> news:20090411030416.ga22...@homero.springfield.home...
> > BTW, is there any real interest in adding some more power to the GC
> > implementator to allow some kind of moving or generational collector?
>
> What I mostly want/need from the GC would be determinism. I want to be
> able to call delete on a subobject in the destructor of the object being
> deleted.
>
> How many times have I stumbled on this already?


Actually, this isn't needed:
- if you want to manually free an object, you can add an extra destroy() 
method
- when the object is garbage collected, there's no point in deleting 
referenced objects, because these are either still alive, or get 
collected as well


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Frank Benoit
Andrei Alexandrescu wrote:
> Zz wrote:
>> Hi,
>>
>> Are there any plans for a logging library in Std Phobos 2.0?
>>
>> Zz
> 
> I wanted to add logging support for a while now but am undecided about
> the API to use. Log4J is quite popular but quite complicated. There are
> a number of simpler APIs out there but I couldn't figure out which is
> the best.
> 
> If anyone has ideas and/or code to contribute, that would be great.
> 
> 
> Andrei

Why not start with the one from Tango? Why does everything have to be
different? If it really is not important, why do you have to make it
different from Tango? All code that uses Tango and Phobos, or wants to
support both, has to reimplement an intermediate abstraction layer.




Re: Std Phobos 2 and logging library?

2009-04-10 Thread Rioshin an'Harthen
"Leandro Lucarella" wrote in message
news:20090411030416.ga22...@homero.springfield.home...

BTW, is there any real interest in adding some more power to the GC
implementator to allow some kind of moving or generational collector?


What I mostly want/need from the GC would be determinism. I want to be able 
to call delete on a subobject in the destructor of the object being deleted.


How many times have I stumbled on this already? 



Re: Std Phobos 2 and logging library?

2009-04-10 Thread Don

dsimcha wrote:

== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article

Leandro Lucarella wrote:

Christopher Wright, on April 10 at 16:18 you wrote:

BLS wrote:

Zz wrote:

Hi,

Are there any plans for a logging library in Std Phobos 2.0?

Zz

Why ask. Phobos is a one man show. In other word, Phobos is an ego-lib.
In case that you want something special, ask the tango folks. ( beside,
logging is avail. there for quite a while)
Björn

It's at least a five-person show: Andrei, Walter, Sean Kelly, braddr,
and Don have committed to Phobos svn in the past two weeks. Random other
people have donated code to it.

Granted, Sean probably only concerns himself with druntime
compatibility, and Don is probably mostly concerned with std.math and
related modules.

And Braddr just made a documentation fix, and Walter only commits
portability stuff and an occasional bug fix now and then, so...

Yes, it really looks like a five-person show =)

I think most work in Phobos now it's done by Andrei, there are other
*collaborators* (the four other you named plus people sending patches), but
it looks like Andrei's show to me. This is not necessarily bad, it's
definitely  better than before, when it was Walter's show, now at least he
can dedicate his efforts in the compiler and language and Phobos is having
a lot more attention.

We'll be very happy to integrate credited contributions from anyone, and
to give dsource.org write access to serious participants. What I think
right now stands in the way of large participation to Phobos is that we
all still learn the ropes of D2; the possibilities are dizzying and we
haven't quite zeroed in on a particular style. Nonetheless, as it's been
noticed I'm always summoning help from this group. So again, if you feel
you want to contribute with ideas and/or code, don't hesitate.
Andrei


I think part of the problem (this is not a criticism, just a statement of fact,
as I believe it to have overall been a good thing) is that you've evolved Phobos
so fast lately that no one else can keep up with what the heck is going on.
While you appear to have done a great job on the new Phobos and things appear to
be settling down now, in the interim trying to figure out what was and wasn't
going to be completely turned upside down by ranges and Phobos 2 made
contributing small improvements and new features rather difficult.


I think you're completely right here. There were a couple of compiler 
bugs which were preventing Andrei from checking his stuff in; that made 
it impossible for anyone else to do much.


It's also worth noting that Janice Caron was a very active contributor
to Phobos, before she suddenly disappeared.



I've definitely worked on projects like this before, where I was the lead person
and they were evolving faster than I could keep other people up to date, etc.
This gap can be frustrating, but sometimes it's necessary to allow a project to
evolve freely.


Re: Std Phobos 2 and logging library?

2009-04-10 Thread dsimcha
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article
> Leandro Lucarella wrote:
> > Christopher Wright, on April 10 at 16:18 you wrote:
> >> BLS wrote:
> >>> Zz wrote:
> >>>> Hi,
> >>>>
> >>>> Are there any plans for a logging library in Std Phobos 2.0?
> >>>>
> >>>> Zz
> >>> Why ask. Phobos is a one man show. In other word, Phobos is an ego-lib.
> >>> In case that you want something special, ask the tango folks. ( beside,
> >>> logging is avail. there for quite a while)
> >>> Björn
> >> It's at least a five-person show: Andrei, Walter, Sean Kelly, braddr,
> >> and Don have committed to Phobos svn in the past two weeks. Random other
> >> people have donated code to it.
> >>
> >> Granted, Sean probably only concerns himself with druntime
> >> compatibility, and Don is probably mostly concerned with std.math and
> >> related modules.
> >
> > And Braddr just made a documentation fix, and Walter only commits
> > portability stuff and an occasional bug fix now and then, so...
> >
> > Yes, it really looks like a five-person show =)
> >
> > I think most work in Phobos now it's done by Andrei, there are other
> > *collaborators* (the four other you named plus people sending patches), but
> > it looks like Andrei's show to me. This is not necessarily bad, it's
> > definitely  better than before, when it was Walter's show, now at least he
> > can dedicate his efforts in the compiler and language and Phobos is having
> > a lot more attention.
> We'll be very happy to integrate credited contributions from anyone, and
> to give dsource.org write access to serious participants. What I think
> right now stands in the way of large participation to Phobos is that we
> all still learn the ropes of D2; the possibilities are dizzying and we
> haven't quite zeroed in on a particular style. Nonetheless, as it's been
> noticed I'm always summoning help from this group. So again, if you feel
> you want to contribute with ideas and/or code, don't hesitate.
> Andrei

I think part of the problem (this is not a criticism, just a statement of fact,
as I believe it to have overall been a good thing) is that you've evolved Phobos
so fast lately that no one else can keep up with what the heck is going on.
While you appear to have done a great job on the new Phobos and things appear to
be settling down now, in the interim trying to figure out what was and wasn't
going to be completely turned upside down by ranges and Phobos 2 made
contributing small improvements and new features rather difficult.

I've definitely worked on projects like this before, where I was the lead person
and they were evolving faster than I could keep other people up to date, etc.
This gap can be frustrating, but sometimes it's necessary to allow a project to
evolve freely.


Re: Std Phobos 2 and logging library?

2009-04-10 Thread dsimcha
== Quote from Leandro Lucarella (llu...@gmail.com)'s article
> Andrei Alexandrescu, on April 10 at 16:49 you wrote:
> > >And Braddr just made a documentation fix, and Walter only commits
> > >portability stuff and an occasional bug fix now and then, so...
> > >Yes, it really looks like a five-person show =)
> > >I think most work in Phobos now it's done by Andrei, there are other
> > >*collaborators* (the four other you named plus people sending patches), but
> > >it looks like Andrei's show to me. This is not necessarily bad, it's
> > >definitely  better than before, when it was Walter's show, now at least he
> > >can dedicate his efforts in the compiler and language and Phobos is having
> > >a lot more attention.
> >
> > We'll be very happy to integrate credited contributions from anyone, and
> > to give dsource.org write access to serious participants. What I think
> > right now stands in the way of large participation to Phobos is that we
> > all still learn the ropes of D2; the possibilities are dizzying and we
> > haven't quite zeroed in on a particular style. Nonetheless, as it's been
> > noticed I'm always summoning help from this group. So again, if you feel
> > you want to contribute with ideas and/or code, don't hesitate.
> I hope I can come up with something useful with my thesis (improving D's
> GC) and I can contribute that. Right now all my energies are focused on
> that, and I'm very close to the point to finally start playing with
> alternate implementations.
> BTW, is there any real interest in adding some more power to the GC
> implementator to allow some kind of moving or generational collector?

Absolutely.  When writing parallel code to do large scale data mining in D, the
lack of precision and multithreaded allocation are real killers.  My interests
are, in order of importance:

1.  Being able to allocate at least small chunks of memory without locking.
2.  Precise scanning of at least the heap.
3.  Collection w/o stopping the world.
4.  Moving GC so that allocations can be pointer bumps.



Re: Std Phobos 2 and logging library?

2009-04-10 Thread Robert Jacques
On Fri, 10 Apr 2009 23:04:16 -0400, Leandro Lucarella   
wrote:

I hope I can come up with something useful with my thesis (improving D's
GC) and I can contribute that. Right now all my energies are focused on
that, and I'm very close to the point to finally start playing with
alternate implementations.

BTW, is there any real interest in adding some more power to the GC
implementator to allow some kind of moving or generational collector?


Yes.

Here are some good starting points on how to allow better GC support in  
D:

http://d.puremagic.com/issues/show_bug.cgi?id=679


I think this should be less a spec issue and more a library issue and  
core.memory seems to already have a BlkAttr.NO_MOVE, which covers memory  
pinning.



http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=35426


Well, making the GC type aware/semi-precise (i.e. providing support for  
moving/copying collectors) seems like the most important change, i.e.

static void* malloc(size_t sz, uint ba = 0);
to
static void* malloc(T)(uint ba = 0);
static void* malloc(T:T[])(uint ba = 0);
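
One way such a typed entry point could sit on top of the existing untyped
allocator is sketched below, assuming D2's core.memory and std.traits. The
names blockAttrs, typedAlloc and typedAllocArray are hypothetical and only
show how the block attributes could be derived from T instead of being
passed in by hand; this is not the signature proposed above, and it covers
only the scan/no-scan decision, not full pointer maps.

```d
import core.memory : GC;
import std.traits : hasIndirections;

/// Block attributes implied by T: memory that cannot contain pointers
/// does not need to be scanned.
template blockAttrs(T)
{
    enum uint blockAttrs = hasIndirections!T ? 0 : GC.BlkAttr.NO_SCAN;
}

/// Hypothetical typed front end over the untyped GC.calloc.
T* typedAlloc(T)()
{
    return cast(T*) GC.calloc(T.sizeof, blockAttrs!T);
}

/// Array flavour: count elements of T in a single block.
T[] typedAllocArray(T)(size_t count)
{
    auto p = cast(T*) GC.calloc(T.sizeof * count, blockAttrs!T);
    return p[0 .. count];
}

struct Pixel { ubyte r, g, b; }          // plain data, no pointers
struct Node  { Node* next; int value; }  // contains a pointer

static assert(blockAttrs!Pixel == GC.BlkAttr.NO_SCAN);
static assert(blockAttrs!Node == 0);

void main()
{
    auto node = typedAlloc!Node();
    node.value = 42;
    assert(node.next is null && node.value == 42); // calloc zeroes the block

    auto pixels = typedAllocArray!Pixel(16);
    assert(pixels.length == 16);
    assert(pixels[0].r == 0);
}
```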

The change to support concurrent GCs affects both performance and code
gen significantly. Also, if D's thread model supports thread-local heaps,
the need for a concurrent GC is vastly reduced (it's only a benefit to the
shared heaps (mutable and immutable), while most objects would be on the
thread-local heaps).


On that note, support for per thread GCs in general is another major  
change. i.e.:

GC.malloc!(T)();
to
Thread.getThis.gc.malloc!(T)(); // Alternatively use thread local storage.
Even without a locality guarantee, this allows for concurrent allocation  
and (I think) better D DLL behaviour since you don't end up with two  
separate heaps which don't know about each other.



Anyway, if you are interested in my progress, I have a blog[1] where
I write almost everything I do related to the subject. The blog it's in
Planet D, but Planet D seems to be broken =/

[1] http://proj.llucax.com.ar/blog/dgc/blog


P.S. Thanks for the blog. (I have been following it for a while now)



Re: Std Phobos 2 and logging library?

2009-04-10 Thread Andrei Alexandrescu

Leandro Lucarella wrote:

BTW, is there any real interest in adding some more power to the GC
implementator to allow some kind of moving or generational collector?


That would be awesome!


Here are some good starting points on how to allow better GC support in D:
http://d.puremagic.com/issues/show_bug.cgi?id=679
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=35426

Anyway, if you are interested in my progress, I have a blog[1] where
I write almost everything I do related to the subject. The blog it's in
Planet D, but Planet D seems to be broken =/

[1] http://proj.llucax.com.ar/blog/dgc/blog



Great, I'll follow. Speaking of readings, here's a paper that I found 
intriguing. Maybe it could provide inspiration: GC assertions: Using the 
Garbage Collector to Check Heap Properties (I'd send a link but I can't 
open the browser; I'm ATM on a Windows machine that has a problem. Long 
story.)



Andrei



Re: Std Phobos 2 and logging library?

2009-04-10 Thread Leandro Lucarella
Andrei Alexandrescu, on April 10 at 16:49 you wrote:
> >And Braddr just made a documentation fix, and Walter only commits
> >portability stuff and an occasional bug fix now and then, so...
> >Yes, it really looks like a five-person show =)
> >I think most work in Phobos now it's done by Andrei, there are other
> >*collaborators* (the four other you named plus people sending patches), but
> >it looks like Andrei's show to me. This is not necessarily bad, it's
> >definitely  better than before, when it was Walter's show, now at least he
> >can dedicate his efforts in the compiler and language and Phobos is having
> >a lot more attention.
> 
> We'll be very happy to integrate credited contributions from anyone, and
> to give dsource.org write access to serious participants. What I think
> right now stands in the way of large participation in Phobos is that we
> are all still learning the ropes of D2; the possibilities are dizzying and
> we haven't quite zeroed in on a particular style. Nonetheless, as has been
> noticed, I'm always summoning help from this group. So again, if you feel
> you want to contribute with ideas and/or code, don't hesitate.

I hope I can come up with something useful from my thesis (improving D's
GC) and contribute that. Right now all my energy is focused on that, and
I'm very close to the point where I can finally start playing with
alternative implementations.

BTW, is there any real interest in adding some more power to the GC
implementer to allow some kind of moving or generational collector?

Here are some good starting points on how to allow better GC support in D:
http://d.puremagic.com/issues/show_bug.cgi?id=679
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=35426

Anyway, if you are interested in my progress, I have a blog[1] where
I write about almost everything I do related to the subject. The blog is
on Planet D, but Planet D seems to be broken =/

[1] http://proj.llucax.com.ar/blog/dgc/blog

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

- God the Father made you only to wake up the village and fertilize
  the hens.
- Another divine contradiction... They want me to go out partying with the
  females and then they want me to get up early.
-- Inodoro Pereyra and a rooster


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Andrei Alexandrescu

Leandro Lucarella wrote:

Christopher Wright, on April 10 at 16:18, you wrote to me:

BLS wrote:

Zz wrote:

Hi,

Are there any plans for a logging library in Std Phobos 2.0?

Zz

Why ask? Phobos is a one-man show. In other words, Phobos is an ego-lib.
In case you want something special, ask the Tango folks (besides,
logging has been available there for quite a while).
Björn

It's at least a five-person show: Andrei, Walter, Sean Kelly, braddr,
and Don have committed to Phobos svn in the past two weeks. Random other
people have donated code to it.

Granted, Sean probably only concerns himself with druntime
compatibility, and Don is probably mostly concerned with std.math and
related modules.


And Braddr just made a documentation fix, and Walter only commits
portability stuff and an occasional bug fix now and then, so...

Yes, it really looks like a five-person show =)

I think most of the work in Phobos is now done by Andrei. There are other
*collaborators* (the four others you named plus people sending patches), but
it looks like Andrei's show to me. This is not necessarily bad; it's
definitely better than before, when it was Walter's show. Now at least he
can dedicate his efforts to the compiler and language, and Phobos is getting
a lot more attention.


We'll be very happy to integrate credited contributions from anyone, and
to give dsource.org write access to serious participants. What I think
right now stands in the way of large participation in Phobos is that we
are all still learning the ropes of D2; the possibilities are dizzying and
we haven't quite zeroed in on a particular style. Nonetheless, as has been
noticed, I'm always summoning help from this group. So again, if you feel
you want to contribute with ideas and/or code, don't hesitate.



Andrei


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Leandro Lucarella
Christopher Wright, on April 10 at 16:18, you wrote to me:
> BLS wrote:
> >Zz wrote:
> >>Hi,
> >>
> >>Are there any plans for a logging library in Std Phobos 2.0?
> >>
> >>Zz
> >Why ask? Phobos is a one-man show. In other words, Phobos is an ego-lib.
> >In case you want something special, ask the Tango folks (besides,
> >logging has been available there for quite a while).
> >Björn
> 
> It's at least a five-person show: Andrei, Walter, Sean Kelly, braddr,
> and Don have committed to Phobos svn in the past two weeks. Random other
> people have donated code to it.
> 
> Granted, Sean probably only concerns himself with druntime
> compatibility, and Don is probably mostly concerned with std.math and
> related modules.

And Braddr just made a documentation fix, and Walter only commits
portability stuff and an occasional bug fix now and then, so...

Yes, it really looks like a five-person show =)

I think most of the work in Phobos is now done by Andrei. There are other
*collaborators* (the four others you named plus people sending patches), but
it looks like Andrei's show to me. This is not necessarily bad; it's
definitely better than before, when it was Walter's show. Now at least he
can dedicate his efforts to the compiler and language, and Phobos is getting
a lot more attention.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

Did you see the frightened ones?
Did you hear the falling bombs?
Did you ever wonder why we had to run for shelter when the promise of a
brave new world unfurled beneath a clear blue sky?


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Christopher Wright

BLS wrote:

Zz wrote:

Hi,

Are there any plans for a logging library in Std Phobos 2.0?

Zz
Why ask? Phobos is a one-man show. In other words, Phobos is an ego-lib.
In case you want something special, ask the Tango folks (besides,
logging has been available there for quite a while).


Björn


It's at least a five-person show: Andrei, Walter, Sean Kelly, braddr, 
and Don have committed to Phobos svn in the past two weeks. Random other 
people have donated code to it.


Granted, Sean probably only concerns himself with druntime 
compatibility, and Don is probably mostly concerned with std.math and 
related modules.


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Sean Kelly
== Quote from BLS (windev...@hotmail.de)'s article
> Zz wrote:
> > Hi,
> >
> > Are there any plans for a logging library in Std Phobos 2.0?
> >
> Why ask? Phobos is a one-man show.

Phobos has never been a one-man show.  Either way, you're
guaranteed not to get what you want if you don't even bother
to ask for it.


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Steven Schveighoffer
On Fri, 10 Apr 2009 12:20:46 -0400, Andrei Alexandrescu wrote:



Zz wrote:

Hi,
 Are there any plans for a logging library in Std Phobos 2.0?
 Zz


I've wanted to add logging support for a while now but am undecided about
the API to use. Log4J is quite popular but quite complicated. There are
a number of simpler APIs out there but I couldn't figure out which is
the best.


If anyone has ideas and/or code to contribute, that would be great.


Having experience with Tango's logger, here are the things I like about it:

1. Lazy evaluation.  This is key, because it removes the log4* requirement
to check whether the logger is active before doing some expensive
calculation.  With lazy evaluation, the check moves into the log function.
BTW, this is a *HUGE* potential win for macros (if they are ever
implemented), since you could get rid of the lazy eval.  See my post:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=12431

2. No heap activity.  This keeps the logger from bogging down the program
with memory allocations.

3. Thread safe.

Other than that, Tango's is pretty similar to the log4* varieties.  I think
the general design of the log4* libs is pretty well tested and solid, but
using some nifty features of D that can't be had in other languages makes
it even more useful.  So I'd start with that design and see what can be
improved, similar to how you approached algorithms (start with the STL, see
what D features can be applied to it).
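
To illustrate point 1, here is a minimal sketch of a lazily evaluated log
argument in D. It is not Tango's actual API: the names log, threshold,
evaluations and expensiveReport are made up for this example, and the only
language feature relied on is D's `lazy` parameter storage class.

```d
import std.stdio : writefln;

enum LogLevel { Verbose, Warning, Error }

LogLevel threshold = LogLevel.Warning; // hypothetical global threshold
int evaluations;                       // counts how often the message is built

// `lazy` makes the compiler wrap the argument expression in a delegate,
// so it is evaluated only if and when `msg` is read inside the function.
void log(LogLevel level, lazy string msg) {
    if (level >= threshold)
        writefln("[%s] %s", level, msg);
}

// Stand-in for an expensive computation.
string expensiveReport() {
    ++evaluations;
    return "report built";
}

void main() {
    log(LogLevel.Verbose, expensiveReport()); // below threshold: never evaluated
    assert(evaluations == 0);
    log(LogLevel.Error, expensiveReport());   // evaluated exactly once
    assert(evaluations == 1);
}
```

The caller never has to write an "is this level enabled?" guard; the level
check lives in one place, inside log.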


-Steve


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Leandro Lucarella
Andrei Alexandrescu, on April 10 at 09:20, you wrote to me:
> Zz wrote:
> >Hi,
> >Are there any plans for a logging library in Std Phobos 2.0?
> >Zz
> 
> I've wanted to add logging support for a while now but am undecided about
> the API to use. Log4J is quite popular but quite complicated. There are a
> number of simpler APIs out there but I couldn't figure out which is the best.
> 
> If anyone has ideas and/or code to contribute, that would be great.

I find Python's logging API very convenient and flexible (I think it's
inspired by another library, but I've only used Python's).

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/

GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)

Hope is a friend who lends us illusion.


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Adam D. Ruppe
On Fri, Apr 10, 2009 at 09:20:46AM -0700, Andrei Alexandrescu wrote:
> If anyone has ideas and/or code to contribute, that would be great.

I never understood why they should be complicated. Couldn't you just do
something like this (a rough sketch):

===

import std.datetime : Clock;
import std.stdio : File, stderr;

enum LogLevel { Verbose, Warning, Error }

File logStream;
LogLevel currentLevel;

// We need to open the log file ahead of time; this might be from command
// line args in a real program.
static this() {
    logStream = stderr; // or File("log", "w"); or whatever
    currentLevel = LogLevel.Verbose;
}

// File is reference counted, so a log file opened above is closed
// automatically; no static destructor is needed.

void log(Args...)(LogLevel level, string fmt, Args args) {
    // Print only messages at or above the current threshold.
    if (level >= currentLevel) {
        logStream.writef("%s: ", Clock.currTime());
        logStream.writefln(fmt, args);
    }
}

void fun(bool crap) {
    log(LogLevel.Verbose, "Entering function %s", __FUNCTION__);
    if (crap)
        log(LogLevel.Error, "Crap happened!");
}

===


Does it really need to be much more complex than that?


> 
> Andrei

-- 
Adam D. Ruppe
http://arsdnet.net


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Andrei Alexandrescu

BLS wrote:

Zz wrote:

Hi,

Are there any plans for a logging library in Std Phobos 2.0?

Zz
Why ask? Phobos is a one-man show. In other words, Phobos is an ego-lib.


That's a rather random thing to say, particularly in the wake of the recent
concerted efforts to improve Phobos and to port it to new OSs. Walter,
Don, and I are working actively on Phobos. Sean is helping a lot
with druntime, and only lack of time is preventing him from adding to
Phobos.


Andrei


Re: Std Phobos 2 and logging library?

2009-04-10 Thread Andrei Alexandrescu

Zz wrote:

Hi,

Are there any plans for a logging library in Std Phobos 2.0?

Zz


I've wanted to add logging support for a while now but am undecided about
the API to use. Log4J is quite popular but quite complicated. There are
a number of simpler APIs out there but I couldn't figure out which is
the best.


If anyone has ideas and/or code to contribute, that would be great.


Andrei


Re: Std Phobos 2 and logging library?

2009-04-10 Thread BLS

Zz wrote:

Hi,

Are there any plans for a logging library in Std Phobos 2.0?

Zz
Why ask? Phobos is a one-man show. In other words, Phobos is an ego-lib.
In case you want something special, ask the Tango folks (besides,
logging has been available there for quite a while).


Björn


Std Phobos 2 and logging library?

2009-04-10 Thread Zz
Hi,

Are there any plans for a logging library in Std Phobos 2.0?

Zz