[Pharo-project] garbage collection
This is more of a GC theory question perhaps, but I do have practical problems, so:

I'm trying to figure out how come my headless Pharo 1.3 image, with a fairly recent Cog and running Magma, ran out of Semaphores. I know there is a fixed limit on the number of semaphores Cog can handle. Only Magma and the RFB server have been running on this image for two months, and there wasn't any heavy load whatsoever, so something must be leaking semaphores or, more likely, something must be leaking objects that hold reference(s) to semaphore(s). I am not leaking sockets, btw.

At the moment there are about 1400 instances of the Semaphore class, and around 400 after GC.
So, first question: supposing I haven't run GC manually, could the semaphore limit be hit?

400 is still too high for an idling image, don't you think?

Anyway, what I tried next was to shut down Magma completely and clean the memory by hand. I noticed some MaServerSockets still floating around, so I tracked the pointers and found that all the references are circular!
Question two: does Pharo's GC handle circular references? If yes, how good is it at it?

But then, thinking more about it, the whole running image is one big graph of circular references, and you wouldn't want the GC to clean that :) So how does it really work? That's my naive view anyway.

After writing all this I figure I'm better off taking a fresh new image and loading what I need into it, even if I have to do it every two months. Not saving the image is also an option. Yeah, I think I'm going to do just that. I found my solution, so at least this e-mail served for something. Since I don't have a blog I'll send it anyway.

--
Milan Mimica
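A note on question two: as far as I know, the Squeak/Pharo VM uses a tracing collector, so what counts is whether an object is reachable from the roots, not whether objects point at each other. A cycle that nothing else references is garbage, while the running image survives because it is reachable. A minimal workspace sketch, assuming WeakArray and Smalltalk garbageCollect behave as in a stock image (the variable names are only for illustration):

```smalltalk
| a b probe |
"Build a two-object cycle: a -> b -> a."
a := Array new: 1.
b := Array new: 1.
a at: 1 put: b.
b at: 1 put: a.

"A WeakArray does not keep its elements alive, so it works as a probe."
probe := WeakArray with: a.

"Drop the only strong references that come from outside the cycle."
a := nil.
b := nil.
Smalltalk garbageCollect.

"If unreachable cycles are collected, the weak slot has been cleared."
probe first isNil
```

I would expect this to answer true. If circular references alone kept objects alive, it would answer false.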
Re: [Pharo-project] garbage collection
Milan,

On 14 Feb 2012, at 21:01, Milan Mimica wrote:

> [...] I'm trying to figure out how come my headless Pharo 1.3 image
> with fairly recent Cog running Magma ran out of Semaphores. [...]

There was a semaphore-related problem in 1.3 that was fixed some time ago.
Doing regular image saves was one of the factors involved.
Search the mailing list or issues list for "semaphore".

Sven
Re: [Pharo-project] garbage collection
On 14 February 2012 21:09, Sven Van Caekenberghe wrote:

> There was a semaphore related problem in 1.3 that was fixed some time ago.
> Doing regular image saves was one of the factors involved.
> Search the mailing list or issues list for semaphore.

Nah, leaking one semaphore per image save is not a problem. I don't save regularly. The remark on image saving was that when you save the image, you save all the crap that is running, and it accumulates every time. There is no point saving an image in production if you are persisting data externally. Besides that, I don't remember anything related being fixed.

--
Milan Mimica
http://sparklet.sf.net
Re: [Pharo-project] garbage collection
Let us know, because we should address that, or at least document it and be aware of the problem.

Tx

On Feb 14, 2012, at 9:21 PM, Milan Mimica wrote:

> Nah, leaking one semaphore per image save is not a problem. I don't save
> regularly. [...]
> Besides that I don't remember anything related being fixed.
Re: [Pharo-project] garbage collection
On 14 Feb 2012, at 21:21, Milan Mimica wrote:

> [...] Besides that I don't remember anything related being fixed.

These are not necessarily related, but these are some of them:

http://code.google.com/p/pharo/issues/detail?id=4910
http://code.google.com/p/pharo/issues/detail?id=4768
http://code.google.com/p/pharo/issues/detail?id=4505
Re: [Pharo-project] garbage collection
Just in time.

Before leaving home last Friday, I sketched the things to be checked next week.
One of them was:
- turning the external semaphore table to hold semaphores weakly.

The problem is that currently

    Smalltalk externalSemaphoreTable

is an Array, which means that even if nobody in the rest of the image points to some semaphore in that array, it will not be GCed unless someone cleans up its entry there.

Now, the reasons why certain semaphores are leaked (and not cleaned up in the external semaphore table) could differ: abnormal process termination, forgetting to use #ensure:, etc.
But by making that array a weak array we can make sure that it won't overflow with garbage which no one uses.

And of course, if it overflows because those semaphores are actually used, then it's another story.

On 14 February 2012 22:01, Milan Mimica wrote:
> [...] something must be leaking semaphores - or more likely - something
> must be leaking objects having reference(s) to semaphore(s). [...]

--
Best regards,
Igor Stasenko.
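The difference between a strong and a weak table can be seen on plain arrays, without touching the real external semaphore table. This is only a sketch of the GC behaviour, assuming a stock image; `strong` and `weak` are illustrative names, not part of ExternalSemaphoreTable:

```smalltalk
| strong weak |
"Each array holds the only reference to its semaphore."
strong := Array with: Semaphore new.
weak := WeakArray with: Semaphore new.

Smalltalk garbageCollect.

"The Array entry should still be a Semaphore;
 the WeakArray entry is expected to have been cleared to nil."
{ strong first isNil. weak first isNil }
```

I would expect the first answer to be false and the second true, which is the table behaviour proposed above: entries nobody else references simply disappear.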
Re: [Pharo-project] garbage collection
Ah great! I knew something weird was going on. It will be much easier to detect possible semaphore leakage once the VM itself stops leaking them.

On 14 February 2012 22:38, Igor Stasenko wrote:

> [...] one of them was:
> - turning an external semaphore table to hold semaphores weakly.
> [...]
> But by making that array to be weak array we can make sure that it
> won't overflow with garbage which no-one uses.

--
Milan Mimica
http://sparklet.sf.net
Re: [Pharo-project] garbage collection
On 14 February 2012 23:45, Milan Mimica wrote:
> Ah great! I knew something weird was going on. It will be much easier to
> detect possible semaphore leakage once the VM itself stops leaking them.

The VM cannot leak semaphores, because their lifetime is controlled by the image.
The change I am proposing is in the image, not the VM.

Here is a small snippet of code I wrote to check whether every semaphore registered in the external semaphore table belongs to something, rather than being held only by the table itself:

    | semaphores arr |
    arr := ExternalSemaphoreTable unprotectedExternalObjects.
    "Skip the free (nil) slots of the table."
    semaphores := arr reject: #isNil.
    "Map each semaphore to the objects pointing to it,
     ignoring the table and this temporary collection."
    semaphores collect: [:sema |
        sema -> (sema pointersTo reject: [:ptr |
            ptr == arr or: [ptr == semaphores]])]

Can you try invoking it on your image and look for the entries that have an empty array? An empty array means there are no references to that semaphore except from the semaphore table itself.

--
Best regards,
Igor Stasenko.
Re: [Pharo-project] garbage collection
On 14 February 2012 22:55, Igor Stasenko wrote:

> Can you try invoking it on your image , look for those who has an
> empty array, which will mean that there is no references to it except
> semaphore table itself.

There are 69 such semaphores, out of hundreds in total. Not much :-/
On a normal image there are only a few.

--
Milan Mimica
http://sparklet.sf.net
Re: [Pharo-project] garbage collection
On 15 February 2012 00:54, Milan Mimica wrote:
> There is 69 of such semaphores, out of hundreds in total. Not much :-/
> On a normal image there is only a few.

So, we've got a leak somewhere. Can you double-check that 69 is the number of entries which, when you inspect that list, show up like

    a Semaphore() -> #()

and not the total number of entries?

Oh, OK, let's just change the code:

    | semaphores arr |
    arr := ExternalSemaphoreTable unprotectedExternalObjects.
    semaphores := arr reject: #isNil.
    "Keep only the semaphores that nothing but the table points to."
    semaphores reject: [:sema |
        (sema pointersTo reject: [:ptr |
            ptr == arr or: [ptr == semaphores]]) isEmpty not]

So, normally this code should answer an empty array.
If not, then there's a leak.

--
Best regards,
Igor Stasenko.
Re: [Pharo-project] garbage collection
On 15 February 2012 01:11, Igor Stasenko wrote:
> so, normally this code should answer an empty array.
> If not, then there's leak

Btw, I find it strange that #pointersTo does not report the context of the closure, which apparently holds a strong reference to the semaphore. That is, I would expect the following to yield true:

    | sema |
    sema := Semaphore new.
    sema pointersTo includes: thisContext

--
Best regards,
Igor Stasenko.
Re: [Pharo-project] garbage collection
On 15 February 2012 01:18, Igor Stasenko wrote:
> btw, i found it strange that #pointersTo does not reports a context of
> closure, which apparently holds a strong reference to semaphore [...]

Ah, OK, it seems to be intended by the implementation:

    objectsToAlwaysExclude := {
        thisContext.
        thisContext sender.
        thisContext sender sender.
        objectsToExclude.
    }.

--
Best regards,
Igor Stasenko.
Re: [Pharo-project] garbage collection
    | semaphores arr res |
    arr := ExternalSemaphoreTable unprotectedExternalObjects.
    semaphores := arr reject: #isNil.
    res := semaphores collect: [:sema |
        sema pointersTo reject: [:ptr | ptr == arr or: [ptr == semaphores]]].
    res := res select: [:each | each isEmpty].
    res size.

Yes, that's what I did, I'm sure. All of my images, except a fresh 1.4, answer a number > 0.

On 15 February 2012 00:11, Igor Stasenko wrote:

> So, we got a leak somewhere. Can you double-check [...]
> so, normally this code should answer an empty array.
> If not, then there's leak

--
Milan Mimica
http://sparklet.sf.net
Re: [Pharo-project] garbage collection
So, what I suggest you do is replace Array with WeakArray in

    ExternalSemaphoreTable>>clearExternalObjects

and restart the image, so that the change takes effect.

Then report whether it cures the problem (and doesn't add new ones ;).

On 15 February 2012 01:23, Milan Mimica wrote:
> Yes, that's what I did, I'm sure. All of my images, except a fresh 1.4,
> answer a number >0.

--
Best regards,
Igor Stasenko.
Re: [Pharo-project] garbage collection
... and perhaps we should ask Chris to check the Magma code for a potential leak caused by external semaphores not being unregistered.
Of course, that could be a Pharo bug as well.

On 15 February 2012 02:11, Igor Stasenko wrote:
> So, what i suggest you to do is to replace Array with WeakArray in
>
> ExternalSemaphoreTable>>clearExternalObjects
>
> and restart the image, so change will be put in effect.

--
Best regards,
Igor Stasenko.
Re: [Pharo-project] garbage collection
I sort of doubt Magma uses anything registering external objects but the standard Sockets…
While interrupting the finalization process _could_ be a culprit, I'd rather watch for something like what happened with the InputEventFetcher, where registration/deregistration ends up not being symmetric.

Perhaps Magma tries to reuse existing Socket instances by calls to initialize:/acceptFrom:, for example?

I think changing ExternalSemaphoreTable to a weak structure would be a really bad idea, btw.
All of a sudden, slots which were previously taken by old objects are freed up for insertion of new ones.

There is no guarantee the external user of the semaphore will stop using the index it was given just because the image no longer holds a reference to the object which used to occupy that index.

So if you register a new object in the same slot, in the case of Sockets at least, you could potentially end up responding to signals from both the old external user and the new one.

TL;DR: Any object registered in this table by definition needs explicit cleanup of its external users before it is GC'd. Thus, making the table weak makes no sense.

Cheers,
Henry

On Feb 15, 2012, at 12:11 AM, Igor Stasenko wrote:

> So, we got a leak somewhere. [...]
> so, normally this code should answer an empty array.
> If not, then there's leak
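The symmetric registration described above can be sketched as follows. This is a hedged illustration rather than code from Magma or the base image: it assumes `Smalltalk registerExternalObject:` and `Smalltalk unregisterExternalObject:` are available as in Squeak-derived images of that era, and the steps involving the external user are left as comments because they depend on the plugin in question.

```smalltalk
| sema index |
sema := Semaphore new.

"Register the semaphore; the answered index is what an external
 user (a plugin or primitive) would be given to signal it."
index := Smalltalk registerExternalObject: sema.

[
	"... hand index to the external user and wait on sema here ..."
] ensure: [
	"First make the external user stop using index (resource-specific,
	 not shown), then give the slot back, even if the block above
	 failed or its process was terminated."
	Smalltalk unregisterExternalObject: sema ]
```

The point of #ensure: is that the slot is released on every exit path, so the table never ends up holding a semaphore whose owner is gone.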
Re: [Pharo-project] garbage collection
On 15 February 2012 11:26, Henrik Johansen wrote:
> [...]
> TLDR; Any object registered in this table by definition needs explicit
> cleanup of external users before they are GC'd. Thus, making the table weak
> makes no sense.

Yes, that could happen.
Well, then the only option is to check which code is producing the leaks. And that's not trivial, since you cannot trace references back to a semaphore's original owner. :(

--
Best regards,
Igor Stasenko.
Re: [Pharo-project] garbage collection
Milan, try checking whether the number of instances of MaTimer correlates with the number of instances of Semaphore you are seeing.

- Chris

On Tue, Feb 14, 2012 at 2:01 PM, Milan Mimica wrote:
> [...]
> At the moment there is about 1400 instances of Semaphore class, and around
> 400 after GC.
> [...]
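One way to run that comparison in a workspace, as a minimal sketch: it assumes Magma is loaded (so the MaTimer class exists) and does nothing more than count instances.

```smalltalk
"Collect first, so unreachable instances do not inflate the counts."
Smalltalk garbageCollect.

"Answer both counts side by side for comparison."
{ #MaTimer -> MaTimer allInstances size.
  #Semaphore -> Semaphore allInstances size }
```

Running it a few times while the server is in use should show whether the two numbers grow together.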
Re: [Pharo-project] garbage collection
On Feb 15, 2012, at 5:53 PM, Igor Stasenko wrote:
> Well, then the only option is to check what code producing leaks. And that's not trivial, since you cannot trace references back to semaphore's original owner. :(

It's not such a big task to review the current users; there are about 5 of them in the base image. AFAICT:
1) InputEventFetcher: this has been bug-fixed once already. No leaks as long as it has a single entry in the startup/shutdown lists.
2) Sockets, which, to my knowledge, work OK with the weak finalization.
3) AsyncFile: only done correctly if #close'd at EOL. No weak cleanup is done, it seems; could be changed to work like Sockets.
4) StandardFileStream URL opening: should be modified to use an ensure: block. Only applicable when running in a browser though, so unlikely to be a large issue.
5) NetNameResolver: never unregisters the semaphore, and leaks one whenever the network needs (re)initialization. Snapshot? Image quit? Arbitrary points during execution? Reading the plugin source would tell the real cases, I guess. A probable culprit in this case, I would say.

I opened http://code.google.com/p/pharo/issues/detail?id=5310 , which contains proposed changes related to 3-5.

I also opened 2 additional issues for things I ran into but didn't fix, as they weren't worth the effort or weren't strictly related:
5311: Rewrite AsyncFile to have weak finalization for guaranteed cleanup of external objects
5312: Move StandardFileStream browserRequest protocol functionality to a separate package, and deprecate the protocol itself. The main purpose of the methods in this protocol has nothing to do with FileStreams, but rather performing navigation in a Browser when Pharo is run as a plugin.

Cheers, Henry
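A minimal sketch of the symmetric registration/deregistration that the list above asks for, using an ensure: block as suggested for item 4. It assumes the image answers `Smalltalk registerExternalObject:` and `Smalltalk unregisterExternalObject:`, as Squeak-derived images of this era generally do; the timed wait is only a hypothetical stand-in for whatever primitive would actually signal the semaphore.

```smalltalk
"Sketch only, not the fix proposed in issue 5310.
 Assumption: Smalltalk registerExternalObject: and
 Smalltalk unregisterExternalObject: exist in this image."
| sema index |
sema := Semaphore new.
"Registering stores sema in the external objects table and answers its slot index."
index := Smalltalk registerExternalObject: sema.
[
	"Hypothetical placeholder: a real user would hand index to a primitive
	 here and then wait for that primitive to signal sema."
	sema waitTimeoutMSecs: 100
] ensure: [
	"Runs on normal return, on an error, and on process termination,
	 so the slot is released in every case. The external user must
	 already have stopped using index by this point."
	Smalltalk unregisterExternalObject: sema
]
```

The point of the ensure: block is only that the unregister call cannot be skipped by an early exit; it does nothing to stop an external user that still holds the old index.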
Re: [Pharo-project] garbage collection
When Magma is down there are no instances of MaTimer. There are 3 instances of MaServerSocket though.

On 15 February 2012 18:41, Chris Muller wrote:
> Milan, try checking whether the number of instances of MaTimer correlate to the number of instances of Semaphore you are seeing.

-- Milan Mimica http://sparklet.sf.net
Re: [Pharo-project] garbage collection
Tx for the bug entries!!!
Re: [Pharo-project] garbage collection
Ok. I thought you were trying to figure out whether the majority of the Semaphores you were seeing were due to Magma or not. I have a suspicion that they may be, which might be a concern since I didn't realize there was a Semaphore limit. I was suggesting that, since MaTimer references Mutex references Semaphore, you could do:

Semaphore instanceCount - MaTimer instanceCount "The number of Semaphores allocated due to Magma's use of MaTimer"

- Chris

On Thu, Feb 16, 2012 at 10:57 AM, Milan Mimica wrote:
> When Magma is down there are no instances of MaTimer. There are 3 instances of MaServerSocket though.
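The comparison suggested above can be sketched as a workspace snippet. It assumes Magma is loaded (so that the class MaTimer exists) and that each MaTimer accounts for exactly one Semaphore through its Mutex, which is an inference from the description above rather than something checked against the Magma source.

```smalltalk
"Sketch only. Assumptions: Magma is loaded, and each MaTimer holds
 exactly one Semaphore via its Mutex."
| semaphores timers |
"Collect first, so that only reachable instances are counted."
Smalltalk garbageCollect.
semaphores := Semaphore allInstances size.
timers := MaTimer allInstances size.
Transcript
	show: 'Semaphores: ', semaphores printString; cr;
	show: 'MaTimers: ', timers printString; cr;
	show: 'Semaphores not accounted for by MaTimer: ', (semaphores - timers) printString; cr
```

If the last number stays high while Magma is running, the MaTimer instances are probably not the main source of the Semaphores.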
Re: [Pharo-project] garbage collection
On Feb 17, 2012, at 3:01 AM, Chris Muller wrote:
> Ok. I thought you were trying to figure out whether the majority of the Semaphores you were seeing were due to Magma or not. I have a suspicion that they may be. Which might be a concern since I didn't realize there was a Semaphore limit..

There isn't a limit to Semaphores per se, but there's (in Cog) a limit to the number of Semaphores which can be registered in the ExternalSemaphoreTable (i.e. made available for signalling by primitives). In the base image the five cases I mentioned earlier are the only users; any wrongdoing on Magma's part would be either through something doing custom registering in that table, or "misuse" of base users, like reusing Sockets by sending certain messages, as mentioned in my first mail. As per later mails though, network reinitialization is already one case in the base image where such leaks can occur, so it's not certain Magma is involved at all... Cheers, Henry
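To look at the registered semaphores rather than at all Semaphore instances, something like the following may help. It is a sketch that assumes `Smalltalk externalObjects` answers the array behind the ExternalSemaphoreTable, as in Squeak-derived images; if this image names the accessor differently, substitute it.

```smalltalk
"Sketch only. Assumption: Smalltalk externalObjects answers the array
 of registered external objects, with nil in the free slots."
| table used |
table := Smalltalk externalObjects.
used := table count: [:each | each notNil].
Transcript
	show: 'External object slots in use: ', used printString; cr;
	"The array size is the table's current capacity in the image,
	 not necessarily the VM's limit."
	show: 'Current table size: ', table size printString; cr
```

Comparing the first number before and after an image save, or before and after a network reinitialization, would show whether those events leave an extra slot occupied.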
Re: [Pharo-project] garbage collection
On 17 February 2012 08:15, Henrik Johansen wrote: > > As per later mails though, network reinitialization is already one case in > the base image where such leaks can occur , so it's not certain Magma is > involved at all... > Might be also because of RFB server, but that is less likely. -- Milan Mimica http://sparklet.sf.net
Re: [Pharo-project] garbage collection
On 17 Feb 2012, at 09:16, Milan Mimica wrote:
> Might be also because of RFB server, but that is less likely.

I have the impression that most people who run into trouble with server images are using RFB. On the other hand, those not using RFB will probably restart server images sooner. Anyway, Henrik's list is excellent, thanks! Sven
Re: [Pharo-project] garbage collection
On 17.02.2012 at 09:25, Sven Van Caekenberghe wrote:
> On 17 Feb 2012, at 09:16, Milan Mimica wrote:
>> Might be also because of RFB server, but that is less likely.
>
> I have the impression that most people who run in trouble with server images are using RFB.

Very strange. Shortly after I read your post I opened an unresponsive image myself. In the process stack I saw that Magma was the culprit (I didn't remember that I had it installed in that image). However, the problem was that I had moved my directory and Magma was looking for the old path. I didn't investigate further, but it seems that MagmaRepositoryController>>#open (called on startUp) is able to block the UI thread. In the same image I could still use Seaside although the UI was frozen. Norbert
Re: [Pharo-project] garbage collection
> Very strange. Shortly after I read your post I opened a unresponsive image myself. In the process stack I saw that magma was the culprit (didn't remember I have it installed in that image). However the problem was that I moved my directory and magma was looking for the old path. I didn't investigate further but it seems that the MagmaRepositoryController>>#open (called on startUp) is able to block the UI thread. In the same image I could use seaside still although the UI was frozen.

Hi. Based on your description, it sounds like Magma hit its warning condition: it found a file named "_open" in the repository directory, which indicates that it thinks another process has the repository open. This can occur if the server image did not shut down properly last time.

Before starting the image, see if there is a file called "_open" and, if there is, delete it. The image should then start normally.

This is something that has annoyed me at least a couple of times as well. I'm not sure what the UI process was doing in your case, but Alt+. should allow you to interrupt it and see the stacks. My suspicion is that something in the image wanted a MagmaSession to do something, but since the server was waiting for you to respond to the Warning about "_open", it wouldn't return...

The confusing part for me is that the image startup code runs in the UI process, doesn't it? So why were there two Processes that could get into a deadlock? - Chris
Re: [Pharo-project] garbage collection
On 19.02.2012 at 18:26, Chris Muller wrote:
> Hi. Based on your description, it sounds like Magma hit its warning condition check that it found a file named "_open" in the repository directory, indicating it thinks another process has it open. This can occur if the server image did not shut down properly last time.
>
> Before starting the image, see if there is a file called "_open" and, if there is, delete it. The image should start normally now.

The directory where Magma was looking for it wasn't there anymore. Does Magma look in other directories as well? The magma/ folder was still in the same directory as the image. But I thought Magma was looking for the absolute path, because that's what I can see in the debug log. Anyway, there was a file called _open in that directory. So either Magma didn't find the directory or it was locked by the _open file.

> This is something that has annoyed me at least a couple of times as well. I'm not sure what the UI process was doing in your case, but Alt+. should allow you to interrupt it and see the stacks.. My suspicion is that something in the image wanted a MagmaSession to do something, but since the server was waiting for you to respond to the Warning about "_open" it wouldn't return...

Alt+. did not work. It was not reacting to anything. Would the warning about _open have been a dialog?

> The confusing part for me is, the image startup code runs in the UI process doesn't it? So why were there two Processes to possibly get into a deadlock?

Well, I don't have enough knowledge, but I would say that in Pharo it isn't impossible for more than one thread to be involved. Norbert
Re: [Pharo-project] garbage collection
For your information, Igor and Marcus fixed the interrupt handler in Pharo 1.4, and now you can avoid having the interrupt key (dot) kill the finalization process. Stef
Re: [Pharo-project] garbage collection
On 14 February 2012 23:54, Milan Mimica wrote:
> On 14 February 2012 22:55, Igor Stasenko wrote:
>> semaphores collect: [:sema | sema -> (sema pointersTo reject: [:ptr | ptr == arr or: [ptr == semaphores ] ] ) ]
>>
>> Can you try invoking it on your image , look for those who has an empty array, which will mean that there is no references to it except semaphore table itself.
>
> There is 69 of such semaphores, out of hundreds in total. Not much :-/ On a normal image there is only a few.

Should say that the 69 leaked semaphores are probably there because of the bug that leaks a semaphore on each image save. Still, that should be causing major problems in my case. -- Milan Mimica http://sparklet.sf.net
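For reference, the quoted expression can be made self-contained roughly as follows. The definitions of `arr` and `semaphores` are not shown in the quoted text, so the ones below are assumptions: `arr` is taken to be the external objects array (via `Smalltalk externalObjects`, which is assumed to exist in this image) and `semaphores` the Semaphores found in it.

```smalltalk
"Sketch only. arr and semaphores are reconstructed by assumption;
 the quoted message does not show how they were defined."
| arr semaphores referrers orphans |
arr := Smalltalk externalObjects.
semaphores := arr select: [:each | each isKindOf: Semaphore].
"For each semaphore, collect the objects pointing to it, ignoring
 the table itself and our own temporary collection."
referrers := semaphores collect: [:sema |
	sema -> (sema pointersTo reject: [:ptr |
		ptr == arr or: [ptr == semaphores]])].
"An empty referrer list means that only the table still holds the semaphore."
orphans := referrers select: [:assoc | assoc value isEmpty].
orphans size
```

Each `pointersTo` call scans object memory, so on an image with hundreds of registered semaphores this can take a while. Depending on how `pointersTo` treats execution contexts in a given image, a few extra referrers may show up that are only artifacts of running the snippet.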
Re: [Pharo-project] garbage collection
On 20 February 2012 16:47, Milan Mimica wrote:
> Should say that the 69 leaked semaphores are probably there because of the bug that leaks a semaphore on each image save. Still, that should be causing major problems in my case.

Well, that should not cause a problem unless you keep the image running and running and running (so garbage accumulates). And of course, if you restart an image, the external semaphore table is wiped out. So I think we are seeing a side effect of running an image stably for quite a long time (and this is the positive part), which is uncovering such bugs :) -- Best regards, Igor Stasenko.
Re: [Pharo-project] garbage collection
On 20.02.2012 17:51, "Igor Stasenko" wrote:
> ...
> so i think we having a side effect of running image stably for quite a long duration (and this is positive part) , but which uncovering such bags :)

There are a lot more leaks deep in the core of Pharo that make the machine dog-slow and absolutely unusable for long-term purposes. A resource watchdog that keeps operators informed would be nice to have. :-) Have fun debugging, Guido Stepken
Re: [Pharo-project] garbage collection
On 20 February 2012 16:47, Milan Mimica wrote: > > Still, that should be causing major problems in my case. > I am missing a "not" somewhere here. -- Milan Mimica http://sparklet.sf.net
Re: [Pharo-project] garbage collection
On 20 February 2012 18:30, Guido Stepken wrote:
> On 20.02.2012 17:51, "Igor Stasenko" wrote:
>> so i think we having a side effect of running image stably for quite a long duration (and this is positive part) , but which uncovering such bags :)
>
> There are a lot more of leaks deep in the core of Pharo that make the machine dog-slow and absolutely unusable for long-term purposes.

Guido, try not to repeat yourself. Your initial message was heard, loud and clear. If you want to help diagnose this particular problem and help find a solution, feel free to do so. But please don't turn every topic into your own speech pedestal.

> A resource watchdog would be nice to have, hat keep operators informed. :-)
>
> Have fun debugging, Guido Stepken

-- Best regards, Igor Stasenko.