Re: How the GC distinguishes code from data

2011-01-07 Thread %u
> First, you should understand that the GC does not know what data is in a
> memory block.

That is exactly why I was wondering how it figures things out. :)


> Data *allocated* as a void[] (which I highly recommend *not* doing) will be
> conservatively marked as containing pointers.

Ah, all right, that clears things up! Thank you!!


Re: How the GC distinguishes code from data

2011-01-07 Thread Steven Schveighoffer

On Fri, 07 Jan 2011 16:39:20 -0500, %u wrote:

>> None whatsoever.
>
> Huh.. then what about what is said in this link?
> http://d.puremagic.com/issues/show_bug.cgi?id=5326#c1
>
> I was told that void[] could contain references, but that ubyte[] would
> not, and that the GC would need to scan the former but not the latter.
> Is that wrong?


First, you should understand that the GC does not know what data is in a
memory block.  It has no idea that the block is a void[] or a ubyte[] or a
class instance or whatever it is.  All it knows is that it's data.  What
decides whether it scans a block is a bit on the block, NO_SCAN, which says
that the block contains no pointers.  The higher-level runtime routines
(like the ones that create an array) use the TypeInfo to determine
whether to set the NO_SCAN bit or not.


Second, memory that is not part of D's allocation is *not* scanned or  
marked, no matter where it is.  Essentially the mark routine goes like  
this (pseudocode):


foreach(root; roots)
   if(root.hasPointers)  // notice this has nothing to do with type
      foreach(pointer; root)
         if(pointer.pointsAt.GCHeapBlock)
            pointer.heapBlock.mark = true;

do
{
   changesWereMade = false;
   foreach(heapBlock; heap)
      if(heapBlock.mark && heapBlock.hasPointers)  // only reachable blocks are scanned
         foreach(pointer; heapBlock)
            if(pointer.pointsAt.GCHeapBlock && !pointer.heapBlock.mark)
            {
               pointer.heapBlock.mark = true;
               changesWereMade = true;
            }
} while(changesWereMade);

// free memory
foreach(heapBlock; heap)
   if(!heapBlock.mark)
      free(heapBlock);

So you can see that if you allocate memory with, for example, malloc, and
you don't add it as a root, it is neither scanned nor marked.  It does not
participate in the collection cycle at all, no matter what the type of the
data is.
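
To make the mark phase above concrete, here is a small runnable toy model of it. It is only a sketch under loud assumptions: the Block struct, the markReachable function and the idea of using array indices as "addresses" are all made up for illustration and have nothing to do with druntime's real data structures. It shows that an integer that looks like a pointer inside a non-scanned block does not keep anything alive.

```d
import std.stdio : writeln;

// Hypothetical model of a GC heap block; not druntime's real layout.
struct Block
{
    bool hasPointers;  // false plays the role of NO_SCAN
    size_t[] words;    // raw contents; a word < heap.length "looks like" a pointer
    bool mark;
}

// Marks every block reachable from roots and returns the mark bits.
bool[] markReachable(Block[] heap, const size_t[] roots)
{
    size_t[] pending;  // marked blocks whose contents still need scanning

    void visit(size_t word)
    {
        if (word < heap.length && !heap[word].mark)
        {
            heap[word].mark = true;
            pending ~= word;
        }
    }

    foreach (r; roots)
        visit(r);

    while (pending.length)
    {
        immutable i = pending[$ - 1];
        pending = pending[0 .. $ - 1];
        if (!heap[i].hasPointers)
            continue;  // contents are never examined, whatever they look like
        foreach (w; heap[i].words)
            visit(w);
    }

    bool[] marks;
    foreach (ref b; heap)
        marks ~= b.mark;
    return marks;
}

void main()
{
    auto heap = [
        Block(true,  [1]),   // block 0: scanned, points at block 1
        Block(false, [2]),   // block 1: not scanned, holds an integer that looks like a pointer to 2
        Block(true,  null),  // block 2: referenced only from unscanned data
        Block(true,  [0]),   // block 3: unreachable, although it points at block 0
    ];
    writeln(markReachable(heap, [0]));
}
```

With block 0 as the only root, blocks 0 and 1 end up marked and blocks 2 and 3 would be freed. The sketch uses a worklist instead of the repeated heap sweep in the pseudocode; the set of marked blocks is the same.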


Now, you should also realize that just because an array is a void[]  
doesn't necessarily make it marked as containing pointers.  It is quite  
possible to implicitly cast a ubyte[] to a void[], and this does not  
change the NO_SCAN bit in the memory block.  Data *allocated* as a void[]  
(which I highly recommend *not* doing) will be conservatively marked as  
containing pointers.  This is probably where you get the notion that  
void[] contains pointers.
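
A minimal sketch of how one could check this, assuming the GC.query and GC.BlkAttr names from core.memory; isNoScan is a helper invented for this example. It deliberately only looks at a ubyte[], a void*[] and a ubyte[] viewed as void[].

```d
import core.memory : GC;

// Hypothetical helper: true if the GC block containing p has NO_SCAN set.
bool isNoScan(void* p)
{
    return (GC.query(p).attr & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    auto bytes = new ubyte[](64);   // element type cannot hold pointers
    auto ptrs  = new void*[](64);   // element type is a pointer

    assert(isNoScan(bytes.ptr));
    assert(!isNoScan(ptrs.ptr));

    void[] view = bytes;            // implicit conversion, same memory block
    assert(view.ptr is bytes.ptr);
    assert(isNoScan(view.ptr));     // the bit belongs to the block, not to the static type
}
```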


-Steve


Re: How the GC distinguishes code from data

2011-01-07 Thread %u
> None whatsoever.

Huh.. then what about what is said in this link?
http://d.puremagic.com/issues/show_bug.cgi?id=5326#c1

I was told that void[] could contain references, but that ubyte[] would not, and
that the GC would need to scan the former but not the latter. Is that wrong?

Thank you!


Re: How the GC distinguishes code from data

2011-01-07 Thread Simen kjaeraas

%u wrote:

>> You have to add it to the garbage collector's list of roots
>
> But if I need to do that, then what would be the difference between
> void[] and ubyte[]?

None whatsoever. If you want to mark some memory with special bits,
use GC.setAttr in core.memory.
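
For illustration, a short sketch of flipping the NO_SCAN bit by hand on a block obtained from GC.malloc. The hasNoScan helper is made up for this example; GC.malloc, GC.getAttr, GC.setAttr and GC.clrAttr are the core.memory calls assumed here.

```d
import core.memory : GC;

// Hypothetical helper: reads the NO_SCAN bit of the block that starts at p.
bool hasNoScan(void* p)
{
    return (GC.getAttr(p) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    void* p = GC.malloc(64);            // no attributes requested, so the block is scanned
    assert(!hasNoScan(p));

    GC.setAttr(p, GC.BlkAttr.NO_SCAN);  // tell the GC there are no pointers in here
    assert(hasNoScan(p));

    GC.clrAttr(p, GC.BlkAttr.NO_SCAN);  // and back again
    assert(!hasNoScan(p));
}
```

Setting NO_SCAN on a block that does hold the only reference to some other GC object would let that object be collected, so this is something to do only when you know what the block contains.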


--
Simen


Re: How the GC distinguishes code from data

2011-01-07 Thread %u
> Kinda sorta. I haven't had any problems from that. If you allocate very large
> blocks in the garbage collector you may face trouble :-)

Haha okay, thanks. :) (This makes me shiver quite a bit...)


> You have to add it to the garbage collector's list of roots

But if I need to do that, then what would be the difference between void[] and
ubyte[]?


Re: How the GC distinguishes code from data

2011-01-07 Thread Pelle

On 01/07/2011 06:47 PM, %u wrote:

>> It assumes everything on the stack is pointers, at the moment, I believe
>
> Uh-oh... not the answer I wanted to hear, but I was half-expecting this.
> So doesn't that mean that, at the moment, D will leak memory?


Kinda sorta. I haven't had any problems from that. If you allocate very 
large blocks in the garbage collector you may face trouble :-)



>> If it's not on the garbage collected heap, it won't scan it unless you
>> tell it to.
>
> But what if it's a void[] on a non-GC heap? Doesn't the language say that
> needs to be scanned too?


You have to add it to the garbage collector's list of roots, I'm not 
sure what it's named exactly. Note that you only have to do that if 
there actually are pointers to the gc heap there.
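
A hedged sketch of what that can look like, assuming GC.addRange and GC.removeRange from core.memory are the calls in question; the function name keepAliveDemo and the slot layout are invented for this example.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Stores a pointer to a GC object inside malloc'ed memory and registers
// that memory with the GC so the object stays reachable.
int keepAliveDemo()
{
    enum n = 4;
    enum bytes = n * (int*).sizeof;

    auto slots = cast(int**) malloc(bytes);
    assert(slots !is null);
    slots[0 .. n] = null;        // don't let the GC scan uninitialized garbage

    GC.addRange(slots, bytes);   // from now on this buffer is scanned for pointers

    slots[0] = new int;          // the only reference lives in the malloc'ed buffer
    *slots[0] = 42;
    GC.collect();                // the int is reachable through the registered range

    immutable value = *slots[0];

    GC.removeRange(slots);       // unregister before handing the memory back
    free(slots);
    return value;
}

void main()
{
    assert(keepAliveDemo() == 42);
}
```

If the malloc'ed buffer only ever holds plain numbers, or pointers to other non-GC memory, none of this is needed.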


Re: How the GC distinguishes code from data

2011-01-07 Thread %u
> It assumes everything on the stack is pointers, at the moment, I believe

Uh-oh... not the answer I wanted to hear, but I was half-expecting this.
So doesn't that mean that, at the moment, D will leak memory?

> If it's not on the garbage collected heap, it won't scan it unless you
> tell it to.

But what if it's a void[] on a non-GC heap? Doesn't the language say that
needs to be scanned too?


Re: How the GC distinguishes code from data

2011-01-06 Thread Jérôme M. Berger
%u wrote:
> Hi,
> 
> There's a question that's been lurking in the back of my mind ever since I
> learned about D:
> 
> How does the GC distinguish code from data when determining the objects to
> collect? (E.g. void[] from uint[], size_t from void*, etc.?)
> 
> If I have a large uint[], it's practically guaranteed to have data that looks
> like pointers, and that might cause memory leaks. Furthermore, if the GC moves
> things around, it would corrupt my data. How is this handled?
> 
> Thank you!

The GC knows about global variables, the stack, everything that was
allocated through it and everything that you tell it to scan (which
allows using C malloc without seeing an object disappear because the
only remaining pointers are in a malloc'ed buffer). Moreover, for
GC-allocated data (and maybe the globals too), the GC knows that
some data cannot contain pointers and will refrain from scanning it
(it will always assume that anything on the stack or that you tell
it to scan contains pointers).

The GC keeps track internally of the memory where it knows there
are no pointers and the memory where there may be pointers.

Jerome
-- 
mailto:jeber...@free.fr
http://jeberger.free.fr
Jabber: jeber...@jabber.fr





Re: How the GC distinguishes code from data

2011-01-06 Thread Pelle

On 01/06/2011 07:31 AM, %u wrote:

>> If you have allocated a large uint[], most likely it will be flagged
>> NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.
>
> Ah, but the trouble is, no one said that this array has to be in the GC heap! I
> could easily have a void[] and a uint[] that both point to non-GC managed
> memory. Or I might even have a uint[] allocated on the stack! How does the GC
> distinguish these, when there's no "attribute" it can mark? (Or does it?!)


It assumes everything on the stack is pointers, at the moment, I believe.

If it's not on the garbage collected heap, it won't scan it unless you 
tell it to.


Re: How the GC distinguishes code from data

2011-01-05 Thread %u
> If you have allocated a large uint[], most likely it will be flagged
> NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.


Ah, but the trouble is, no one said that this array has to be in the GC heap! I
could easily have a void[] and a uint[] that both point to non-GC managed
memory. Or I might even have a uint[] allocated on the stack! How does the GC
distinguish these, when there's no "attribute" it can mark? (Or does it?!)


Re: How the GC distinguishes code from data

2011-01-05 Thread Steven Schveighoffer
On Wed, 05 Jan 2011 16:56:47 -0500, Simen kjaeraas wrote:

> %u wrote:
>
>> If I have a large uint[], it's practically guaranteed to have data that
>> looks like pointers, and that might cause memory leaks.
>
> If you have allocated a large uint[], most likely it will be flagged
> NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.


There is another problem that I recently ran into.  If you allocate a
large memory block, even one marked as not containing pointers, there is a
reasonable chance that a 'fake' pointer exists somewhere that points *at*
that block, not from it: the larger the block, the more addresses it
covers.  This means that the uint[] may never get collected unless you
manually free it.
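
One way to free such a block by hand, sketched under assumptions: the buffer is taken straight from GC.malloc with NO_SCAN so that the pointer passed to GC.free is the start of the block. The function name fillAndFree is made up for this example.

```d
import core.memory : GC;

// Allocates a large pointer-free buffer from the GC, uses it, and releases
// it explicitly instead of waiting for a collection that may never free it.
uint fillAndFree(size_t count)
{
    void* p = GC.malloc(count * uint.sizeof, GC.BlkAttr.NO_SCAN);
    uint[] data = (cast(uint*) p)[0 .. count];

    data[] = 7;                  // GC.malloc does not initialize the memory
    immutable last = data[$ - 1];

    data = null;                 // nothing may touch the buffer after this point
    GC.free(p);                  // p is the block's base address
    return last;
}

void main()
{
    assert(fillAndFree(4 * 1024 * 1024) == 7);
}
```

The usual caveat for manual freeing applies: any slice or pointer that still refers to the buffer after GC.free is dangling.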



>> Furthermore, if the GC moves
>> things around, it would corrupt my data. How is this handled?
>
> The current GC does not move things. One could write such a GC for D (I
> believe), and in such a case data would be marked NO_MOVE if for whatever
> reason it cannot be moved.


A moving GC cannot exist without precise scanning.  Anything that is  
marked from a conservative block (one that has no pointer map) would not  
be able to move.


-Steve


Re: How the GC distinguishes code from data

2011-01-05 Thread Simen kjaeraas

%u wrote:

> Hi,
>
> There's a question that's been lurking in the back of my mind ever since
> I learned about D:
>
> How does the GC distinguish code from data when determining the objects
> to collect? (E.g. void[] from uint[], size_t from void*, etc.?)


This is hardly the code/data dualism (data can easily hold pointers), but
simply POD/pointers.


> If I have a large uint[], it's practically guaranteed to have data that
> looks like pointers, and that might cause memory leaks.

If you have allocated a large uint[], most likely it will be flagged
NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.



> Furthermore, if the GC moves
> things around, it would corrupt my data. How is this handled?


The current GC does not move things. One could write such a GC for D (I
believe), and in such a case data would be marked NO_MOVE if for whatever
reason it cannot be moved.


--
Simen


How the GC distinguishes code from data

2011-01-05 Thread %u
Hi,

There's a question that's been lurking in the back of my mind ever since I
learned about D:

How does the GC distinguish code from data when determining the objects to
collect? (E.g. void[] from uint[], size_t from void*, etc.?)

If I have a large uint[], it's practically guaranteed to have data that looks
like pointers, and that might cause memory leaks. Furthermore, if the GC moves
things around, it would corrupt my data. How is this handled?

Thank you!