Re: How the GC distinguishes code from data
> First, you should understand that the GC does not know what data is in a
> memory block.

That is exactly why I was wondering how it figures things out. :)

> Data *allocated* as a void[] (which I highly recommend *not* doing) will be
> conservatively marked as containing pointers.

Ah, all right, that clears things up! Thank you!!
Re: How the GC distinguishes code from data
On Fri, 07 Jan 2011 16:39:20 -0500, %u wrote:

>> None whatsoever.
>
> Huh.. then what about what is said in this link?
> http://d.puremagic.com/issues/show_bug.cgi?id=5326#c1
>
> I was told that void[] could contain references, but that ubyte[] would not,
> and that the GC would need to scan the former but not the latter. Is that wrong?

First, you should understand that the GC does not know what data is in a memory block. It has no idea that the block is a void[] or a ubyte[] or a class instance or whatever it is. All it knows is that it's data. What decides whether it scans a block is an attribute bit on the block: NO_SCAN, which, when set, says the block contains no pointers. This bit is set by the higher-level runtime routines (like the ones that create an array), which use the TypeInfo to determine whether to set the NO_SCAN bit or not.

Second, memory that is not part of D's allocation is *not* scanned or marked, no matter where it is. Essentially the mark routine goes like this (pseudocode):

    // mark everything directly reachable from the roots
    foreach(root; roots)
        if(root.hasPointers) // notice this has nothing to do with type
            foreach(pointer; root)
                if(pointer.pointsAt.GCHeapBlock)
                    pointer.heapBlock.mark = true;

    // keep scanning marked blocks until no new block gets marked
    do
    {
        changesWereMade = false;
        foreach(heapBlock; heap)
            if(heapBlock.mark && heapBlock.hasPointers)
                foreach(pointer; heapBlock)
                    if(pointer.pointsAt.GCHeapBlock && !pointer.heapBlock.mark)
                    {
                        pointer.heapBlock.mark = true;
                        changesWereMade = true;
                    }
    } while(changesWereMade);

    // free memory
    foreach(heapBlock; heap)
        if(!heapBlock.mark)
            free(heapBlock);

So essentially, you can see that if you allocated memory with, for example, malloc, and you didn't add it as a root, it's neither scanned nor marked. It does not participate whatsoever in the collection cycle, no matter what the type of the data is.

Now, you should also realize that just because an array is a void[] doesn't necessarily make it marked as containing pointers. It is quite possible to implicitly cast a ubyte[] to a void[], and this does not change the NO_SCAN bit in the memory block.

Data *allocated* as a void[] (which I highly recommend *not* doing) will be conservatively marked as containing pointers. This is probably where you get the notion that void[] contains pointers.

-Steve
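To see the point about the NO_SCAN bit in practice, here is a minimal sketch that inspects the attribute on three arrays. It assumes a reasonably current druntime and uses `GC.addrOf`, `GC.getAttr` and `GC.BlkAttr.NO_SCAN` from `core.memory`; the helper name `isNoScan` is made up for this illustration. The expectation, per the explanation above, is that the block allocated as ubyte[] reports the bit, the one allocated as void[] does not, and a void[] view of the ubyte[] block reports whatever the underlying block does.

```d
import core.memory : GC;

/// Hypothetical helper: true if the GC block containing `p` has NO_SCAN set.
/// GC.addrOf maps a possibly interior pointer to the block's base address,
/// which is what GC.getAttr needs in order to report the attributes.
bool isNoScan(const(void)* p)
{
    void* base = GC.addrOf(cast(void*) p);
    return base !is null && (GC.getAttr(base) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    import std.stdio : writeln;

    ubyte[] bytes = new ubyte[](64); // element type cannot hold pointers
    void[]  raw   = new void[](64);  // allocated as void[]: treated conservatively
    void[]  view  = bytes;           // implicit conversion, same memory block

    writeln("ubyte[] NO_SCAN: ", isNoScan(bytes.ptr));
    writeln("void[]  NO_SCAN: ", isNoScan(raw.ptr));
    writeln("view    NO_SCAN: ", isNoScan(view.ptr));
}
```

The attribute belongs to the memory block, not to the static type of the slice, which is why the view line follows the ubyte[] line.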
Re: How the GC distinguishes code from data
> None whatsoever.

Huh.. then what about what is said in this link?
http://d.puremagic.com/issues/show_bug.cgi?id=5326#c1

I was told that void[] could contain references, but that ubyte[] would not, and that the GC would need to scan the former but not the latter. Is that wrong?

Thank you!
Re: How the GC distinguishes code from data
%u wrote:

>> You have to add it to the garbage collector's list of roots
>
> But if I need to do that, then what would be the difference between void[]
> and ubyte[]?

None whatsoever. If you want to mark some memory with special bits, use GC.setAttr in core.memory.

-- 
Simen
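As a small illustration of setting block attributes by hand, the sketch below allocates a raw block with `GC.malloc`, then toggles NO_SCAN with `GC.setAttr` and `GC.clrAttr`. The helper `hasNoScan` is a name invented for this example; the block size is arbitrary.

```d
import core.memory : GC;

/// Hypothetical helper: reads the NO_SCAN bit of a block given its base pointer.
bool hasNoScan(void* p)
{
    return (GC.getAttr(p) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    // A raw GC block requested with no attributes is scanned for pointers.
    void* p = GC.malloc(256);
    assert(!hasNoScan(p));

    // Tell the collector this block holds no pointers.
    GC.setAttr(p, GC.BlkAttr.NO_SCAN);
    assert(hasNoScan(p));

    // Clearing the bit makes the block scannable again.
    GC.clrAttr(p, GC.BlkAttr.NO_SCAN);
    assert(!hasNoScan(p));

    GC.free(p);
}
```

Setting NO_SCAN on a block that does hold the only reference to other GC memory would let that memory be collected, so the bit should only be set when the contents really are pointer-free.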
Re: How the GC distinguishes code from data
> Kinda sorta. I haven't had any problems from that. If you allocate very large
> blocks in the garbage collector you may face trouble :-)

Haha okay, thanks. :) (This makes me shiver quite a bit...)

> You have to add it to the garbage collector's list of roots

But if I need to do that, then what would be the difference between void[] and ubyte[]?
Re: How the GC distinguishes code from data
On 01/07/2011 06:47 PM, %u wrote:

>> It assumes everything on the stack is pointers, at the moment, I believe
>
> Uh-oh... not the answer I wanted to hear, but I was half-expecting this.
> So doesn't that mean that, at the moment, D will leak memory?

Kinda sorta. I haven't had any problems from that. If you allocate very large blocks in the garbage collector you may face trouble :-)

>> If it's not on the garbage collected heap, it won't scan it unless you tell it to.
>
> But what if it's a void[] on a non-GC heap? Doesn't the language say that
> needs to be scanned too?

You have to add it to the garbage collector's list of roots, I'm not sure what it's named exactly. Note that you only have to do that if there actually are pointers to the GC heap there.
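For reference, a sketch of how that registration can look with `core.memory`: `GC.addRange` tells the collector to scan a region it did not allocate, and `GC.removeRange` undoes it. The helper names `makeTable` and `dropTable` and the table size are assumptions made up for this example.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

/// Hypothetical helper: a malloc'ed table of `slots` int pointers,
/// registered with the GC so that its contents are scanned.
int** makeTable(size_t slots)
{
    auto table = cast(int**) malloc(slots * (int*).sizeof);
    assert(table !is null);
    table[0 .. slots] = null;                  // no stale bits for the GC to misread
    GC.addRange(table, slots * (int*).sizeof); // scan this region during collections
    return table;
}

/// Hypothetical helper: unregister the table before returning it to the C heap.
void dropTable(int** table)
{
    GC.removeRange(table);
    free(table);
}

void main()
{
    auto table = makeTable(4);

    table[0] = new int;   // GC object referenced from malloc'ed memory
    *table[0] = 42;

    GC.collect();         // the registered range keeps the int reachable
    assert(*table[0] == 42);

    dropTable(table);
}
```

`GC.addRoot` also exists, but it registers a single pointer into the GC heap as a root; for a foreign buffer that itself contains pointers, a range is the fitting tool.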
Re: How the GC distinguishes code from data
> It assumes everything on the stack is pointers, at the moment, I believe

Uh-oh... not the answer I wanted to hear, but I was half-expecting this. So doesn't that mean that, at the moment, D will leak memory?

> If it's not on the garbage collected heap, it won't scan it unless you tell it to.

But what if it's a void[] on a non-GC heap? Doesn't the language say that needs to be scanned too?
Re: How the GC distinguishes code from data
%u wrote:
> Hi,
>
> There's a question that's been lurking in the back of my mind ever since I
> learned about D:
>
> How does the GC distinguish code from data when determining the objects to
> collect? (E.g. void[] from uint[], size_t from void*, etc.?)
>
> If I have a large uint[], it's practically guaranteed to have data that looks
> like pointers, and that might cause memory leaks. Furthermore, if the GC moves
> things around, it would corrupt my data. How is this handled?
>
> Thank you!

The GC knows about global variables, the stack, everything that was allocated through it and everything that you tell it to scan (which allows using C malloc without seeing an object disappear because the only remaining pointers are in a malloc'ed buffer).

Moreover, for GC-allocated data (and maybe the globals too), the GC knows that some data cannot contain pointers and will refrain from scanning it (it will always assume that anything on the stack or that you tell it to scan contains pointers).

The GC keeps track internally of the memory where it knows there are no pointers and the memory where there may be pointers.

Jerome
-- 
mailto:jeber...@free.fr
http://jeberger.free.fr
Jabber: jeber...@jabber.fr
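The bookkeeping described above can be modelled with a toy mark phase. This is only a sketch and not how druntime is implemented: the `Block` struct, the use of indices in place of real pointers, and the `mark` function are all assumptions made for illustration.

```d
import std.stdio : writeln;

/// Toy stand-in for a GC heap block. Real blocks hold raw bytes; here the
/// "pointers" are simply indices of other blocks.
struct Block
{
    bool hasPointers;   // false models a NO_SCAN block
    size_t[] refs;      // blocks this one appears to point at
    bool marked;
}

/// Marks every block reachable from `roots`, never looking inside
/// blocks that are flagged as pointer-free.
void mark(Block[] heap, size_t[] roots)
{
    size_t[] work = roots.dup;
    while (work.length)
    {
        immutable i = work[$ - 1];
        work = work[0 .. $ - 1];
        if (heap[i].marked)
            continue;
        heap[i].marked = true;
        if (heap[i].hasPointers)
            work ~= heap[i].refs;
    }
}

void main()
{
    size_t[] none;
    size_t[] roots = [0];
    auto heap = [
        Block(true,  [1]),   // 0: reachable from a root, points at 1
        Block(false, [2]),   // 1: NO_SCAN, so its "pointer" to 2 is ignored
        Block(true,  none),  // 2: only referenced from a NO_SCAN block
        Block(true,  [3]),   // 3: unreachable, points at itself
    ];

    mark(heap, roots);
    foreach (i, b; heap)
        writeln("block ", i, b.marked ? ": kept" : ": freed");
}
```

Block 1 is kept because something points at it, but nothing it contains is followed, which is exactly the distinction between memory known to be pointer-free and memory that may hold pointers.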
Re: How the GC distinguishes code from data
On 01/06/2011 07:31 AM, %u wrote:

>> If you have allocated a large uint[], most likely it will be flagged NO_SCAN,
>> meaning it has no pointers in it, and the GC will ignore it.
>
> Ah, but the trouble is, no one said that this array has to be in the GC heap!
> I could easily have a void[] and a uint[] that both point to non-GC managed
> memory. Or I might even have a uint[] allocated on the stack! How does the GC
> distinguish these, when there's no "attribute" it can mark? (Or does it?!)

It assumes everything on the stack is pointers, at the moment, I believe.

If it's not on the garbage collected heap, it won't scan it unless you tell it to.
Re: How the GC distinguishes code from data
> If you have allocated a large uint[], most likely it will be flagged NO_SCAN,
> meaning it has no pointers in it, and the GC will ignore it.

Ah, but the trouble is, no one said that this array has to be in the GC heap! I could easily have a void[] and a uint[] that both point to non-GC managed memory. Or I might even have a uint[] allocated on the stack! How does the GC distinguish these, when there's no "attribute" it can mark? (Or does it?!)
Re: How the GC distinguishes code from data
On Wed, 05 Jan 2011 16:56:47 -0500, Simen kjaeraas wrote:

> %u wrote:
>> If I have a large uint[], it's practically guaranteed to have data that looks
>> like pointers, and that might cause memory leaks.
>
> If you have allocated a large uint[], most likely it will be flagged NO_SCAN,
> meaning it has no pointers in it, and the GC will ignore it.

There is another problem that I recently ran into. If you allocate a large memory block, even one marked as not containing pointers, there is a medium probability that a 'fake' pointer exists that points *at* that block, not from it. This means that the uint[] may never get collected unless you manually free it.

>> Furthermore, if the GC moves things around, it would corrupt my data. How is
>> this handled?
>
> The current GC does not move things. One could write such a GC for D (I
> believe), and in such a case data would be marked NO_MOVE if for whatever
> reason it cannot be moved.

A moving GC cannot exist without precise scanning. Anything that is marked from a conservative block (one that has no pointer map) would not be able to move.

-Steve
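One way to sidestep the false-pointer problem for large buffers is to allocate them as NO_SCAN and release them explicitly. The sketch below assumes `GC.malloc` and `GC.free` from `core.memory`; the helper names `allocNoScan` and `release` are invented for this example, and the slice must start at the pointer returned by `GC.malloc` for the free to take effect.

```d
import core.memory : GC;

/// Hypothetical helper: a uint[] backed by a GC block flagged as pointer-free.
uint[] allocNoScan(size_t count)
{
    auto p = cast(uint*) GC.malloc(count * uint.sizeof, GC.BlkAttr.NO_SCAN);
    return p[0 .. count];
}

/// Hypothetical helper: frees the block right away instead of waiting for a
/// collection that a stray pointer-like value elsewhere might block.
/// Assumes `data` still starts at the pointer GC.malloc returned.
void release(ref uint[] data)
{
    GC.free(data.ptr);
    data = null;   // don't leave a dangling slice behind
}

void main()
{
    auto data = allocNoScan(16 * 1024 * 1024);
    data[] = 0xDEADBEEF;   // contents are never scanned, whatever they look like

    // ... use data ...

    release(data);
    assert(data is null);
}
```

Explicit freeing gives up the safety net of the collector for that block, so any other slices into it must not be used afterwards.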
Re: How the GC distinguishes code from data
%u wrote:

> Hi,
>
> There's a question that's been lurking in the back of my mind ever since I
> learned about D:
>
> How does the GC distinguish code from data when determining the objects to
> collect? (E.g. void[] from uint[], size_t from void*, etc.?)

This is hardly the code/data dualism (data can easily hold pointers), but simply POD/pointers.

> If I have a large uint[], it's practically guaranteed to have data that looks
> like pointers, and that might cause memory leaks.

If you have allocated a large uint[], most likely it will be flagged NO_SCAN, meaning it has no pointers in it, and the GC will ignore it.

> Furthermore, if the GC moves things around, it would corrupt my data. How is
> this handled?

The current GC does not move things. One could write such a GC for D (I believe), and in such a case data would be marked NO_MOVE if for whatever reason it cannot be moved.

-- 
Simen
How the GC distinguishes code from data
Hi, There's a question that's been lurking in the back of my mind ever since I learned about D: How does the GC distinguish code from data when determining the objects to collect? (E.g. void[] from uint[], size_t from void*, etc.?) If I have a large uint[], it's practically guaranteed to have data that looks like pointers, and that might cause memory leaks. Furthermore, if the GC moves things around, it would corrupt my data. How is this handled? Thank you!