Re: [Xen-devel] xenstored memory leak
On 13/07/16 16:28, Ian Jackson wrote:
> David Vrabel writes ("Re: [Xen-devel] xenstored memory leak"):
>> On 13/07/16 14:32, Juergen Gross wrote:
>>> On 13/07/16 15:17, David Vrabel wrote:
>>>> The Linux driver already cleans up open transactions and removes watches
>>>> when the file handle is released.
>>>
>>> Hmm, does this work reliably? I observed a memory leak of about 5kB per
>>> create/destroy domain pair with xenstored _and_ with xenstore domain.
>>
>> Don't know.
>>
>> I only took a quick look at the xenbus_file_release() function and it
>> looked like it ought to be doing the right thing...
>
> If it didn't, it would leak with oxenstored too.

This I can't tell. I've got no oxenstore domain running.

Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] xenstored memory leak
David Vrabel writes ("Re: [Xen-devel] xenstored memory leak"):
> On 13/07/16 14:32, Juergen Gross wrote:
> > On 13/07/16 15:17, David Vrabel wrote:
> >> The Linux driver already cleans up open transactions and removes watches
> >> when the file handle is released.
> >
> > Hmm, does this work reliably? I observed a memory leak of about 5kB per
> > create/destroy domain pair with xenstored _and_ with xenstore domain.
>
> Don't know.
>
> I only took a quick look at the xenbus_file_release() function and it
> looked like it ought to be doing the right thing...

If it didn't, it would leak with oxenstored too.

Ian.
Re: [Xen-devel] xenstored memory leak
On Wed, Jul 13, 2016 at 04:09:26PM +0200, Juergen Gross wrote:
> On 13/07/16 15:52, Wei Liu wrote:
> > On Wed, Jul 13, 2016 at 03:25:45PM +0200, Juergen Gross wrote:
> >> On 13/07/16 15:07, Wei Liu wrote:
> >>> On Wed, Jul 13, 2016 at 02:21:38PM +0200, Juergen Gross wrote:
> >>>> On 06/07/16 09:31, Juergen Gross wrote:
> >>>>> While testing some patches for support of ballooning in Mini-OS by using
> >>>>> the xenstore domain I realized that each xl create/destroy pair would
> >>>>> increase memory consumption in Mini-OS by about 5kB. Wondering whether
> >>>>> this is a xenstore domain only effect I did the same test with xenstored
> >>>>> and oxenstored daemons.
> >>>>>
> >>>>> xenstored showed the same behavior, the "referenced" size shown by the
> >>>>> pmap command grew by about 5kB for each create/destroy pair.
> >>>>>
> >>>>> oxenstored seemed to be even worse in the beginning (about 6kB for each
> >>>>> pair), but after about 100 create/destroys the value seemed to be
> >>>>> rather stable.
> >>>>>
> >>>>> Did anyone notice this memory leak before?
> >>>>
> >>>> I think I've found the problem:
> >>>>
> >>>> qemu as the device model is setting up a xenstore watch for each backend
> >>>> type it is supporting. Unfortunately those watches are never removed
> >>>> again. This adds up to the observed memory leak.
> >>>>
> >>>> I'm not sure how oxenstored is avoiding the problem, maybe by testing
> >>>> socket connections to be still alive and so detecting qemu has gone.
> >>>> OTOH this won't help for oxenstored running in another domain than the
> >>>> device model (either due to oxenstore-stubdom, or a driver domain with
> >>>> a qemu based device model).
> >>>
> >>> How unfortunate.
> >>>
> >>> My gut feeling is that xenstored shouldn't have the knowledge to
> >>> associate a watch with a "process". The concept of a process is only
> >>> meaningful to the OS, which wouldn't work on a cross-domain xenstored
> >>> setup.
> >>
> >> Right.
> >>
> >>> Maybe the OS xenbus driver should reap all watches on behalf of the dead
> >>> process. This would also avoid a crashed QEMU leaking resources.
> >>>
> >>> And xenstored should have proper quota support so that a domain can't
> >>> set up excessive numbers of watches.
> >>
> >> This would be dom0 unless you arrange the device model to be accounted
> >> as the domid it is running for. But this is problematic with a xenstore
> >> domain again.
> >
> > The quota could be based on "connection" (ring or socket) and counted
> > as per-connection? Just throwing ideas around, not necessarily saying
> > this is the way to go.
>
> Sure. But with xenstore domain all xenstore access of dom0 is via one
> ring. And how would you want to apply any quota here solving our
> problem with one qemu process in dom0 creating stale watches?

That's a job for the kernel driver. The quota inside xenstored is to
protect itself from abuse from one single connection and punish the bad
actors.

> You could open a new connection for the device model, of course, but
> this would require some xenbus rework.

I wouldn't go down that route unless absolutely necessary because that
seems to require a xenbus protocol extension.

Wei.
Re: [Xen-devel] xenstored memory leak
On 13/07/16 15:52, Wei Liu wrote:
> On Wed, Jul 13, 2016 at 03:25:45PM +0200, Juergen Gross wrote:
>> On 13/07/16 15:07, Wei Liu wrote:
>>> On Wed, Jul 13, 2016 at 02:21:38PM +0200, Juergen Gross wrote:
>>>> On 06/07/16 09:31, Juergen Gross wrote:
>>>>> While testing some patches for support of ballooning in Mini-OS by using
>>>>> the xenstore domain I realized that each xl create/destroy pair would
>>>>> increase memory consumption in Mini-OS by about 5kB. Wondering whether
>>>>> this is a xenstore domain only effect I did the same test with xenstored
>>>>> and oxenstored daemons.
>>>>>
>>>>> xenstored showed the same behavior, the "referenced" size shown by the
>>>>> pmap command grew by about 5kB for each create/destroy pair.
>>>>>
>>>>> oxenstored seemed to be even worse in the beginning (about 6kB for each
>>>>> pair), but after about 100 create/destroys the value seemed to be
>>>>> rather stable.
>>>>>
>>>>> Did anyone notice this memory leak before?
>>>>
>>>> I think I've found the problem:
>>>>
>>>> qemu as the device model is setting up a xenstore watch for each backend
>>>> type it is supporting. Unfortunately those watches are never removed
>>>> again. This adds up to the observed memory leak.
>>>>
>>>> I'm not sure how oxenstored is avoiding the problem, maybe by testing
>>>> socket connections to be still alive and so detecting qemu has gone.
>>>> OTOH this won't help for oxenstored running in another domain than the
>>>> device model (either due to oxenstore-stubdom, or a driver domain with
>>>> a qemu based device model).
>>>
>>> How unfortunate.
>>>
>>> My gut feeling is that xenstored shouldn't have the knowledge to
>>> associate a watch with a "process". The concept of a process is only
>>> meaningful to the OS, which wouldn't work on a cross-domain xenstored
>>> setup.
>>
>> Right.
>>
>>> Maybe the OS xenbus driver should reap all watches on behalf of the dead
>>> process. This would also avoid a crashed QEMU leaking resources.
>>>
>>> And xenstored should have proper quota support so that a domain can't
>>> set up excessive numbers of watches.
>>
>> This would be dom0 unless you arrange the device model to be accounted
>> as the domid it is running for. But this is problematic with a xenstore
>> domain again.
>
> The quota could be based on "connection" (ring or socket) and counted
> as per-connection? Just throwing ideas around, not necessarily saying
> this is the way to go.

Sure. But with xenstore domain all xenstore access of dom0 is via one
ring. And how would you want to apply any quota here solving our
problem with one qemu process in dom0 creating stale watches? You
could open a new connection for the device model, of course, but this
would require some xenbus rework.

Juergen
Re: [Xen-devel] xenstored memory leak
On Wed, Jul 13, 2016 at 03:25:45PM +0200, Juergen Gross wrote:
> On 13/07/16 15:07, Wei Liu wrote:
> > On Wed, Jul 13, 2016 at 02:21:38PM +0200, Juergen Gross wrote:
> >> On 06/07/16 09:31, Juergen Gross wrote:
> >>> While testing some patches for support of ballooning in Mini-OS by using
> >>> the xenstore domain I realized that each xl create/destroy pair would
> >>> increase memory consumption in Mini-OS by about 5kB. Wondering whether
> >>> this is a xenstore domain only effect I did the same test with xenstored
> >>> and oxenstored daemons.
> >>>
> >>> xenstored showed the same behavior, the "referenced" size shown by the
> >>> pmap command grew by about 5kB for each create/destroy pair.
> >>>
> >>> oxenstored seemed to be even worse in the beginning (about 6kB for each
> >>> pair), but after about 100 create/destroys the value seemed to be
> >>> rather stable.
> >>>
> >>> Did anyone notice this memory leak before?
> >>
> >> I think I've found the problem:
> >>
> >> qemu as the device model is setting up a xenstore watch for each backend
> >> type it is supporting. Unfortunately those watches are never removed
> >> again. This adds up to the observed memory leak.
> >>
> >> I'm not sure how oxenstored is avoiding the problem, maybe by testing
> >> socket connections to be still alive and so detecting qemu has gone.
> >> OTOH this won't help for oxenstored running in another domain than the
> >> device model (either due to oxenstore-stubdom, or a driver domain with
> >> a qemu based device model).
> >
> > How unfortunate.
> >
> > My gut feeling is that xenstored shouldn't have the knowledge to
> > associate a watch with a "process". The concept of a process is only
> > meaningful to the OS, which wouldn't work on a cross-domain xenstored
> > setup.
>
> Right.
>
> > Maybe the OS xenbus driver should reap all watches on behalf of the dead
> > process. This would also avoid a crashed QEMU leaking resources.
> >
> > And xenstored should have proper quota support so that a domain can't
> > set up excessive numbers of watches.
>
> This would be dom0 unless you arrange the device model to be accounted
> as the domid it is running for. But this is problematic with a xenstore
> domain again.

The quota could be based on "connection" (ring or socket) and counted
as per-connection? Just throwing ideas around, not necessarily saying
this is the way to go.

Wei.
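Wei's per-connection quota idea can be sketched as a counter attached to each connection (ring or socket) that is checked when a watch is registered and decremented when one is removed. This is purely illustrative: the limit, the structure names, and the use of E2BIG as the error code are assumptions, not existing xenstored behaviour.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-connection watch quota.  The limit value is an
 * assumption chosen for illustration. */
#define MAX_WATCHES_PER_CONN 128

struct conn_quota {
    int nr_watches;             /* watches currently owned by this connection */
};

/* Called when a connection registers a watch; rejects it once the
 * connection has used up its quota. */
static int quota_take_watch(struct conn_quota *q)
{
    if (q->nr_watches >= MAX_WATCHES_PER_CONN)
        return -E2BIG;          /* caller would report the error to the client */
    q->nr_watches++;
    return 0;
}

/* Called when a watch is removed (explicitly or via connection teardown). */
static void quota_drop_watch(struct conn_quota *q)
{
    if (q->nr_watches > 0)
        q->nr_watches--;
}
```

Note that, as discussed above, this only punishes the connection as a whole; with a xenstore domain, all of dom0 shares one ring, so one misbehaving qemu would exhaust the quota for everyone on that connection.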
Re: [Xen-devel] xenstored memory leak
On 13/07/16 15:17, David Vrabel wrote:
> On 13/07/16 14:07, Wei Liu wrote:
>>
>> My gut feeling is that xenstored shouldn't have the knowledge to
>> associate a watch with a "process". The concept of a process is only
>> meaningful to the OS, which wouldn't work on a cross-domain xenstored setup.
>> Maybe the OS xenbus driver should reap all watches on behalf of the dead
>> process. This would also avoid a crashed QEMU leaking resources.
>
> The Linux driver already cleans up open transactions and removes watches
> when the file handle is released.

Hmm, does this work reliably? I observed a memory leak of about 5kB per
create/destroy domain pair with xenstored _and_ with xenstore domain.

Juergen
Re: [Xen-devel] xenstored memory leak
On Wed, Jul 13, 2016 at 02:20:28PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [Xen-devel] xenstored memory leak"):
> > On Wed, Jul 13, 2016 at 02:21:38PM +0200, Juergen Gross wrote:
> > > qemu as the device model is setting up a xenstore watch for each backend
> > > type it is supporting. Unfortunately those watches are never removed
> > > again. This adds up to the observed memory leak.
>
> I think this must be a bug in C xenstored.
>
> > > I'm not sure how oxenstored is avoiding the problem, maybe by testing
> > > socket connections to be still alive and so detecting qemu has gone.
> > > OTOH this won't help for oxenstored running in another domain than the
> > > device model (either due to oxenstore-stubdom, or a driver domain with
> > > a qemu based device model).
> >
> > How unfortunate.
> >
> > My gut feeling is that xenstored shouldn't have the knowledge to
> > associate a watch with a "process".
>
> xenstored needs to associate watches with connections. If a
> connection is terminated, the watches need to be cleaned up, along
> with whatever other things "belong" to that connection (notably
> transactions, and replies in flight).
>
> Here a `connection' might be a socket, or a ring.

Agreed.

> C xenstored does have code which tries to do this. It's a bit
> impenetrable, though, because it's done through destructors provided
> to the reference counting memory allocator (!)
>
> > The concept of a process is only meaningful to the OS, which wouldn't
> > work on a cross-domain xenstored setup. Maybe the OS xenbus driver
> > should reap all watches on behalf of the dead process. This would also
> > avoid a crashed QEMU leaking resources.
>
> The OS xenbus driver needs to mediate everything, so that it can
> direct replies to the right places etc. It needs to (and does)
> maintain a list of watches. When a process closes the xenbus device,
> it destroys the watches by issuing commands to the actual xenstored
> via its ring connection.
>
> I guess that the qemu in this case is making a socket connection to
> xenstored.

Yeah, I suspect that as well.

Wei.
Re: [Xen-devel] xenstored memory leak
On 13/07/16 14:32, Juergen Gross wrote:
> On 13/07/16 15:17, David Vrabel wrote:
>> On 13/07/16 14:07, Wei Liu wrote:
>>>
>>> My gut feeling is that xenstored shouldn't have the knowledge to
>>> associate a watch with a "process". The concept of a process is only
>>> meaningful to the OS, which wouldn't work on a cross-domain xenstored setup.
>>> Maybe the OS xenbus driver should reap all watches on behalf of the dead
>>> process. This would also avoid a crashed QEMU leaking resources.
>>
>> The Linux driver already cleans up open transactions and removes watches
>> when the file handle is released.
>
> Hmm, does this work reliably? I observed a memory leak of about 5kB per
> create/destroy domain pair with xenstored _and_ with xenstore domain.

Don't know.

I only took a quick look at the xenbus_file_release() function and it
looked like it ought to be doing the right thing...

David
Re: [Xen-devel] xenstored memory leak
Juergen Gross writes ("Re: [Xen-devel] xenstored memory leak"):
> On 13/07/16 14:40, Andrew Cooper wrote:
> > On 13/07/16 13:21, Juergen Gross wrote:
> >> I'll post a qemu patch to remove those watches on exit soon.

I don't think this is right. qemu should not explicitly remove watches
when it is exiting. The cleanup should be handled by xenstored directly
(if qemu is connecting via a socket) or by the OS xenbus driver
(otherwise). Each of those entities keeps a list of which of their
clients owns what watches.

> > That is good, but there needs to be further consideration as to how to
> > clean up after a crashed qemu/etc.
>
> Hmm, I see two possibilities:
>
> a) Add a possibility to do xs_unwatch() with a path and a token with
>    wildcards (e.g.: xs_unwatch(xs, "/local/domain/0/backend/qdisk",
>    "be:*:97:*") for removing all qdisk watches for domid 97) and use
>    this when doing the xenstore cleanup from libxl on domain
>    destruction.
>
> b) Instead of watching /local/domain/<domid>/backend/<type> let qemu
>    create /local/domain/<domid>/backend/<type>/<frontend-domid> and
>    watch this. When this directory is being removed by libxl all the
>    watches will be gone automatically.

This is going in entirely the wrong direction.

Ian.
Re: [Xen-devel] xenstored memory leak
On 13/07/16 15:07, Wei Liu wrote:
> On Wed, Jul 13, 2016 at 02:21:38PM +0200, Juergen Gross wrote:
>> On 06/07/16 09:31, Juergen Gross wrote:
>>> While testing some patches for support of ballooning in Mini-OS by using
>>> the xenstore domain I realized that each xl create/destroy pair would
>>> increase memory consumption in Mini-OS by about 5kB. Wondering whether
>>> this is a xenstore domain only effect I did the same test with xenstored
>>> and oxenstored daemons.
>>>
>>> xenstored showed the same behavior, the "referenced" size shown by the
>>> pmap command grew by about 5kB for each create/destroy pair.
>>>
>>> oxenstored seemed to be even worse in the beginning (about 6kB for each
>>> pair), but after about 100 create/destroys the value seemed to be
>>> rather stable.
>>>
>>> Did anyone notice this memory leak before?
>>
>> I think I've found the problem:
>>
>> qemu as the device model is setting up a xenstore watch for each backend
>> type it is supporting. Unfortunately those watches are never removed
>> again. This adds up to the observed memory leak.
>>
>> I'm not sure how oxenstored is avoiding the problem, maybe by testing
>> socket connections to be still alive and so detecting qemu has gone.
>> OTOH this won't help for oxenstored running in another domain than the
>> device model (either due to oxenstore-stubdom, or a driver domain with
>> a qemu based device model).
>
> How unfortunate.
>
> My gut feeling is that xenstored shouldn't have the knowledge to
> associate a watch with a "process". The concept of a process is only
> meaningful to the OS, which wouldn't work on a cross-domain xenstored
> setup.

Right.

> Maybe the OS xenbus driver should reap all watches on behalf of the dead
> process. This would also avoid a crashed QEMU leaking resources.
>
> And xenstored should have proper quota support so that a domain can't
> set up excessive numbers of watches.

This would be dom0 unless you arrange the device model to be accounted
as the domid it is running for. But this is problematic with a xenstore
domain again.

Juergen
Re: [Xen-devel] xenstored memory leak
On 13/07/16 14:40, Andrew Cooper wrote:
> On 13/07/16 13:21, Juergen Gross wrote:
>> On 06/07/16 09:31, Juergen Gross wrote:
>>> While testing some patches for support of ballooning in Mini-OS by using
>>> the xenstore domain I realized that each xl create/destroy pair would
>>> increase memory consumption in Mini-OS by about 5kB. Wondering whether
>>> this is a xenstore domain only effect I did the same test with xenstored
>>> and oxenstored daemons.
>>>
>>> xenstored showed the same behavior, the "referenced" size shown by the
>>> pmap command grew by about 5kB for each create/destroy pair.
>>>
>>> oxenstored seemed to be even worse in the beginning (about 6kB for each
>>> pair), but after about 100 create/destroys the value seemed to be
>>> rather stable.
>>>
>>> Did anyone notice this memory leak before?
>> I think I've found the problem:
>>
>> qemu as the device model is setting up a xenstore watch for each backend
>> type it is supporting. Unfortunately those watches are never removed
>> again. This adds up to the observed memory leak.
>>
>> I'm not sure how oxenstored is avoiding the problem, maybe by testing
>> socket connections to be still alive and so detecting qemu has gone.
>> OTOH this won't help for oxenstored running in another domain than the
>> device model (either due to oxenstore-stubdom, or a driver domain with
>> a qemu based device model).
>
> I do seem to remember something along those lines.
>
>> I'll post a qemu patch to remove those watches on exit soon.
>
> That is good, but there needs to be further consideration as to how to
> clean up after a crashed qemu/etc.

Hmm, I see two possibilities:

a) Add a possibility to do xs_unwatch() with a path and a token with
   wildcards (e.g.: xs_unwatch(xs, "/local/domain/0/backend/qdisk",
   "be:*:97:*") for removing all qdisk watches for domid 97) and use
   this when doing the xenstore cleanup from libxl on domain
   destruction.

b) Instead of watching /local/domain/<domid>/backend/<type> let qemu
   create /local/domain/<domid>/backend/<type>/<frontend-domid> and
   watch this. When this directory is being removed by libxl all the
   watches will be gone automatically.

Juergen
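Option (a) would require xenstored to match its stored watch tokens against a wildcard pattern when handling the extended unwatch request. A minimal sketch of such a matcher follows. It is entirely hypothetical: xenstored has no wildcard unwatch today, the function name is invented, and the token layout `be:<type>:<domid>:<devid>` is inferred from the `"be:*:97:*"` example above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical glob matcher for watch tokens: '*' matches any (possibly
 * empty) run of characters, everything else matches literally.  An
 * extended XS_UNWATCH handler could walk a connection's watch list and
 * drop every watch whose token matches the pattern. */
static bool token_match(const char *pat, const char *s)
{
    if (*pat == '\0')
        return *s == '\0';
    if (*pat == '*') {
        /* Try every possible split point for the '*'. */
        for (;; s++) {
            if (token_match(pat + 1, s))
                return true;
            if (*s == '\0')
                return false;
        }
    }
    return *s == *pat && token_match(pat + 1, s + 1);
}
```

With the token format assumed above, `token_match("be:*:97:*", "be:qdisk:97:51712")` would select all of domain 97's qdisk watches while leaving other domains' tokens alone.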
Re: [Xen-devel] xenstored memory leak
Wei Liu writes ("Re: [Xen-devel] xenstored memory leak"):
> On Wed, Jul 13, 2016 at 02:21:38PM +0200, Juergen Gross wrote:
> > qemu as the device model is setting up a xenstore watch for each backend
> > type it is supporting. Unfortunately those watches are never removed
> > again. This adds up to the observed memory leak.

I think this must be a bug in C xenstored.

> > I'm not sure how oxenstored is avoiding the problem, maybe by testing
> > socket connections to be still alive and so detecting qemu has gone.
> > OTOH this won't help for oxenstored running in another domain than the
> > device model (either due to oxenstore-stubdom, or a driver domain with
> > a qemu based device model).
> >
> > How unfortunate.
> >
> > My gut feeling is that xenstored shouldn't have the knowledge to
> > associate a watch with a "process".

xenstored needs to associate watches with connections. If a
connection is terminated, the watches need to be cleaned up, along
with whatever other things "belong" to that connection (notably
transactions, and replies in flight).

Here a `connection' might be a socket, or a ring.

C xenstored does have code which tries to do this. It's a bit
impenetrable, though, because it's done through destructors provided
to the reference counting memory allocator (!)

> The concept of a process is only meaningful to the OS, which wouldn't
> work on a cross-domain xenstored setup. Maybe the OS xenbus driver
> should reap all watches on behalf of the dead process. This would also
> avoid a crashed QEMU leaking resources.

The OS xenbus driver needs to mediate everything, so that it can
direct replies to the right places etc. It needs to (and does)
maintain a list of watches. When a process closes the xenbus device,
it destroys the watches by issuing commands to the actual xenstored
via its ring connection.

I guess that the qemu in this case is making a socket connection to
xenstored.

Ian.
Re: [Xen-devel] xenstored memory leak
On 13/07/16 14:07, Wei Liu wrote:
>
> My gut feeling is that xenstored shouldn't have the knowledge to
> associate a watch with a "process". The concept of a process is only
> meaningful to the OS, which wouldn't work on a cross-domain xenstored setup.
> Maybe the OS xenbus driver should reap all watches on behalf of the dead
> process. This would also avoid a crashed QEMU leaking resources.

The Linux driver already cleans up open transactions and removes watches
when the file handle is released.

David
Re: [Xen-devel] xenstored memory leak
On Wed, Jul 13, 2016 at 02:21:38PM +0200, Juergen Gross wrote:
> On 06/07/16 09:31, Juergen Gross wrote:
> > While testing some patches for support of ballooning in Mini-OS by using
> > the xenstore domain I realized that each xl create/destroy pair would
> > increase memory consumption in Mini-OS by about 5kB. Wondering whether
> > this is a xenstore domain only effect I did the same test with xenstored
> > and oxenstored daemons.
> >
> > xenstored showed the same behavior, the "referenced" size shown by the
> > pmap command grew by about 5kB for each create/destroy pair.
> >
> > oxenstored seemed to be even worse in the beginning (about 6kB for each
> > pair), but after about 100 create/destroys the value seemed to be
> > rather stable.
> >
> > Did anyone notice this memory leak before?
>
> I think I've found the problem:
>
> qemu as the device model is setting up a xenstore watch for each backend
> type it is supporting. Unfortunately those watches are never removed
> again. This adds up to the observed memory leak.
>
> I'm not sure how oxenstored is avoiding the problem, maybe by testing
> socket connections to be still alive and so detecting qemu has gone.
> OTOH this won't help for oxenstored running in another domain than the
> device model (either due to oxenstore-stubdom, or a driver domain with
> a qemu based device model).

How unfortunate.

My gut feeling is that xenstored shouldn't have the knowledge to
associate a watch with a "process". The concept of a process is only
meaningful to the OS, which wouldn't work on a cross-domain xenstored
setup.

Maybe the OS xenbus driver should reap all watches on behalf of the dead
process. This would also avoid a crashed QEMU leaking resources.

And xenstored should have proper quota support so that a domain can't
set up excessive numbers of watches.

> I'll post a qemu patch to remove those watches on exit soon.
>
> To find the problem I've added some debug aid to xenstored: when
> a special parameter is specified on invocation it will dump its memory
> allocation structure via talloc_report_full() to a file whenever it is
> receiving a SIGUSR1 signal. Anybody interested in this patch?

Wouldn't hurt to post it to the list. If we take it we take it; if we
don't, it would still be useful for people who want a custom debug patch
later.

Wei.
Re: [Xen-devel] xenstored memory leak
On 13/07/16 13:21, Juergen Gross wrote:
> On 06/07/16 09:31, Juergen Gross wrote:
>> While testing some patches for support of ballooning in Mini-OS by using
>> the xenstore domain I realized that each xl create/destroy pair would
>> increase memory consumption in Mini-OS by about 5kB. Wondering whether
>> this is a xenstore domain only effect I did the same test with xenstored
>> and oxenstored daemons.
>>
>> xenstored showed the same behavior, the "referenced" size shown by the
>> pmap command grew by about 5kB for each create/destroy pair.
>>
>> oxenstored seemed to be even worse in the beginning (about 6kB for each
>> pair), but after about 100 create/destroys the value seemed to be
>> rather stable.
>>
>> Did anyone notice this memory leak before?
> I think I've found the problem:
>
> qemu as the device model is setting up a xenstore watch for each backend
> type it is supporting. Unfortunately those watches are never removed
> again. This adds up to the observed memory leak.
>
> I'm not sure how oxenstored is avoiding the problem, maybe by testing
> socket connections to be still alive and so detecting qemu has gone.
> OTOH this won't help for oxenstored running in another domain than the
> device model (either due to oxenstore-stubdom, or a driver domain with
> a qemu based device model).

I do seem to remember something along those lines.

> I'll post a qemu patch to remove those watches on exit soon.

That is good, but there needs to be further consideration as to how to
clean up after a crashed qemu/etc.

> To find the problem I've added some debug aid to xenstored: when
> a special parameter is specified on invocation it will dump its memory
> allocation structure via talloc_report_full() to a file whenever it is
> receiving a SIGUSR1 signal. Anybody interested in this patch?

Sounds useful. +1

~Andrew
Re: [Xen-devel] xenstored memory leak
On 06/07/16 09:31, Juergen Gross wrote:
> While testing some patches for support of ballooning in Mini-OS by using
> the xenstore domain I realized that each xl create/destroy pair would
> increase memory consumption in Mini-OS by about 5kB. Wondering whether
> this is a xenstore domain only effect I did the same test with xenstored
> and oxenstored daemons.
>
> xenstored showed the same behavior, the "referenced" size shown by the
> pmap command grew by about 5kB for each create/destroy pair.
>
> oxenstored seemed to be even worse in the beginning (about 6kB for each
> pair), but after about 100 create/destroys the value seemed to be
> rather stable.
>
> Did anyone notice this memory leak before?

I think I've found the problem:

qemu as the device model is setting up a xenstore watch for each backend
type it is supporting. Unfortunately those watches are never removed
again. This adds up to the observed memory leak.

I'm not sure how oxenstored is avoiding the problem, maybe by testing
socket connections to be still alive and so detecting qemu has gone.
OTOH this won't help for oxenstored running in another domain than the
device model (either due to oxenstore-stubdom, or a driver domain with
a qemu based device model).

I'll post a qemu patch to remove those watches on exit soon.

To find the problem I've added some debug aid to xenstored: when
a special parameter is specified on invocation it will dump its memory
allocation structure via talloc_report_full() to a file whenever it is
receiving a SIGUSR1 signal. Anybody interested in this patch?

Juergen
Re: [Xen-devel] xenstored memory leak
On Wed, Jul 06, 2016 at 09:31:38AM +0200, Juergen Gross wrote:
> While testing some patches for support of ballooning in Mini-OS by using
> the xenstore domain I realized that each xl create/destroy pair would
> increase memory consumption in Mini-OS by about 5kB. Wondering whether
> this is a xenstore domain only effect I did the same test with xenstored
> and oxenstored daemons.
>
> xenstored showed the same behavior, the "referenced" size shown by the
> pmap command grew by about 5kB for each create/destroy pair.
>
> oxenstored seemed to be even worse in the beginning (about 6kB for each
> pair), but after about 100 create/destroys the value seemed to be
> rather stable.
>
> Did anyone notice this memory leak before?

No. But I don't normally look at xenstored memory consumption anyway...

> David, it seems the ocaml based xenstore doesn't have such a problem. How
> mature is the xenstore Mirage-OS variant? Could it be easily integrated
> into the Xen build?
>
> Juergen
Re: [Xen-devel] xenstored memory leak
On 06/07/16 14:55, Juergen Gross wrote:
> On 06/07/16 15:48, Andrew Cooper wrote:
>> On 06/07/16 08:31, Juergen Gross wrote:
>>> While testing some patches for support of ballooning in Mini-OS by using
>>> the xenstore domain I realized that each xl create/destroy pair would
>>> increase memory consumption in Mini-OS by about 5kB. Wondering whether
>>> this is a xenstore domain only effect I did the same test with xenstored
>>> and oxenstored daemons.
>>>
>>> xenstored showed the same behavior, the "referenced" size shown by the
>>> pmap command grew by about 5kB for each create/destroy pair.
>>>
>>> oxenstored seemed to be even worse in the beginning (about 6kB for each
>>> pair), but after about 100 create/destroys the value seemed to be
>>> rather stable.
>> Do you mean that after a while, you see oxenstored not leaking any
>> further memory, even with new domains being created?
> In my test: yes. I did:
>
> while true
> do
>     xl create minios.xl
>     sleep 3
>     xl shutdown minios
>     sleep 2
> done
>
> After about 200 iterations memory usage with oxenstored was stable. I
> stopped the loop after more than 1000 iterations.
>
>> Ocaml is a garbage collected language, so you would expect the process
>> to get larger until the GC decides to kick in.
> Okay. This explains the pattern.
>
>>> Did anyone notice this memory leak before?
>> We have not encountered this in XenServer stress scenarios.
> You are using oxenstored, right? The real leak is in xenstored only.

Correct.

>> (It is entirely possible that this is specific to something xl does
>> which Xapi doesn't.)
> I doubt that. I'm seeing the leak with the C variant of xenstore, both
> as daemon and as stubdom.

Right, and we haven't used C xenstored in the last decade.

~Andrew
Re: [Xen-devel] xenstored memory leak
On 06/07/16 15:48, Andrew Cooper wrote:
> On 06/07/16 08:31, Juergen Gross wrote:
>> While testing some patches for support of ballooning in Mini-OS by using
>> the xenstore domain I realized that each xl create/destroy pair would
>> increase memory consumption in Mini-OS by about 5kB. Wondering whether
>> this is a xenstore domain only effect I did the same test with xenstored
>> and oxenstored daemons.
>>
>> xenstored showed the same behavior, the "referenced" size shown by the
>> pmap command grew by about 5kB for each create/destroy pair.
>>
>> oxenstored seemed to be even worse in the beginning (about 6kB for each
>> pair), but after about 100 create/destroys the value seemed to be
>> rather stable.
>
> Do you mean that after a while, you see oxenstored not leaking any
> further memory, even with new domains being created?

In my test: yes. I did:

while true
do
    xl create minios.xl
    sleep 3
    xl shutdown minios
    sleep 2
done

After about 200 iterations memory usage with oxenstored was stable. I
stopped the loop after more than 1000 iterations.

> Ocaml is a garbage collected language, so you would expect the process
> to get larger until the GC decides to kick in.

Okay. This explains the pattern.

>> Did anyone notice this memory leak before?
>
> We have not encountered this in XenServer stress scenarios.

You are using oxenstored, right? The real leak is in xenstored only.

> (It is entirely possible that this is specific to something xl does
> which Xapi doesn't.)

I doubt that. I'm seeing the leak with the C variant of xenstore, both
as daemon and as stubdom.

Juergen
Re: [Xen-devel] xenstored memory leak
On 06/07/16 08:31, Juergen Gross wrote:
> While testing some patches for support of ballooning in Mini-OS by using
> the xenstore domain I realized that each xl create/destroy pair would
> increase memory consumption in Mini-OS by about 5kB. Wondering whether
> this is a xenstore domain only effect I did the same test with xenstored
> and oxenstored daemons.
>
> xenstored showed the same behavior, the "referenced" size shown by the
> pmap command grew by about 5kB for each create/destroy pair.
>
> oxenstored seemed to be even worse in the beginning (about 6kB for each
> pair), but after about 100 create/destroys the value seemed to be
> rather stable.

Do you mean that after a while, you see oxenstored not leaking any
further memory, even with new domains being created?

Ocaml is a garbage collected language, so you would expect the process
to get larger until the GC decides to kick in.

> Did anyone notice this memory leak before?

We have not encountered this in XenServer stress scenarios.

(It is entirely possible that this is specific to something xl does
which Xapi doesn't.)

~Andrew
[Xen-devel] xenstored memory leak
While testing some patches for support of ballooning in Mini-OS by using
the xenstore domain I realized that each xl create/destroy pair would
increase memory consumption in Mini-OS by about 5kB. Wondering whether
this is a xenstore domain only effect I did the same test with the
xenstored and oxenstored daemons.

xenstored showed the same behavior: the "referenced" size shown by the
pmap command grew by about 5kB for each create/destroy pair.

oxenstored seemed to be even worse in the beginning (about 6kB for each
pair), but after about 100 create/destroys the value seemed to be
rather stable.

Did anyone notice this memory leak before?

David, it seems the ocaml based xenstore doesn't have such a problem. How
mature is the xenstore Mirage-OS variant? Could it be easily integrated
into the Xen build?

Juergen