Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
Just a one-time announcement: besides the git tree at
github.com/hallyn/libresource, there is now also a mailing list at
https://lists.linuxcontainers.org/listinfo/libresource-devel

I don't really intend to be a driving developer on it, but will happily
review, discuss, and help where I can.

-serge

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
Quoting Daniel P. Berrange (berra...@redhat.com):
> I guess where it could become an issue is if $BIGVENDOR wants to bundle
> a copy of the library statically with their app. Some companies are
> (irrationally) paranoid about shipping anything copyleft themselves,
> so Apache could suit that. It's a tradeoff, as it obviously lets them
> embrace & extend rather than forcing them to share improvements they
> make.

Agreed.

https://github.com/hallyn/libresource/pull/2
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Thu, Sep 24, 2015 at 02:41:49PM +, Serge Hallyn wrote:
> Well for starters I created https://github.com/hallyn/libresource . We
> should create a real project for it but it's a start. (I'll create an
> organization if this starts to move)
>
> Actually I suppose the first step would be deciding on a license. Normally
> I default to gplv2, but for this that may not be appropriate. Apache
> license? Can be settled in an issue or pull request for a License file,
> I think.

My personal preference is always LGPLv2+ for libraries, since it allows
use from non-open-source apps but is still copyleft. I know corporates
tend to prefer non-copyleft licenses like Apache these days, but that is
generally for ulterior motives like being able to do dual open/closed
products.

Regards,
Daniel
--
|: http://berrange.com       -o-  http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org        -o-  http://virt-manager.org :|
|: http://autobuild.org      -o-  http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o-  http://live.gnome.org/gtk-vnc :|
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
Quoting Fabio Kung (fabio.k...@gmail.com):
> This all sgtm. A mailing list for design discussions + github issue
> tracker seems to be working well for many open source projects I've
> been tracking lately. Most of them are using Google Groups for their
> mailing lists.

Well, for starters I created https://github.com/hallyn/libresource . We
should create a real project for it, but it's a start. (I'll create an
organization if this starts to move.)

Actually, I suppose the first step would be deciding on a license. Normally
I default to GPLv2, but for this that may not be appropriate. Apache
license? That can be settled in an issue or pull request for a License
file, I think.

-serge
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
Quoting Daniel P. Berrange (berra...@redhat.com):
> My personal preference is always LGPLv2+ for libraries, since it gives
> ability to use from non-open source apps, but is still copyleft. I know
> corporates tend to prefer non-copyleft licenses like Apache these days,
> but that is generally for ulterior motives like being able to do dual
> open/closed products.

I think one of the most important consumers would be procps, and this
wouldn't be an issue for them. Now, one of the reasons we want this is
so that software like databases and big Java apps can check their
real available resources in order to scale. Would this be an issue for
them, or do we think they would just link to or execute commands from
procps?

-serge
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Thu, Sep 24, 2015 at 03:53:24PM +, Serge Hallyn wrote:
> I think one of the most important consumers would be procps, and this
> wouldn't be an issue for them. Now one of the reasons we want this is
> so that software like databases and big java apps can check their
> real available resources to scale - would this be an issue for them,
> or do we think they would just link to or execute commands from
> procps?

I guess where it could become an issue is if $BIGVENDOR wants to bundle
a copy of the library statically with their app. Some companies are
(irrationally) paranoid about shipping anything copyleft themselves,
so Apache could suit that. It's a tradeoff, as it obviously lets them
embrace & extend rather than forcing them to share improvements they
make.

Regards,
Daniel
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Wed, 2015-09-16 at 19:29 +, Serge Hallyn wrote:
> NP - if you can correct our course if we're heading someplace bad for
> libvirt that'll be great. Though I suspect lxc/lxd and libvirt will
> mostly agree.

I can possibly help with the coding... though I'm not too versed in the
low-level things (yet), so don't count on me as one of the main hackers ;)

> Ok, so I could create a project on github, but that doesn't come with
> a m-l. Last I used it, sf was problematic. Any other suggestions for
> where to host a mailing list? Might the github issue tracker suffice?
> We could (as worked quite well for lxd) have a specs/ directory in a
> libresource source tree, and use issues and pull requests to guide the
> api specifications under that directory. Just a thought.

It could be OK to start with the github issue tracker and see whether
a mailing list is really needed. I'm using SF.net for other projects and
I find it's always a pain to use.

--
Cedric
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Wed, Sep 16, 2015 at 12:29 PM, Serge Hallyn wrote:
>
> Ok, so I could create a project on github, but that doesn't come with
> a m-l. Last I used it, sf was problematic. Any other suggestions for
> where to host a mailing list? Might the github issue tracker suffice?
> We could (as worked quite well for lxd) have a specs/ directory in a
> libresource source tree, and use issues and pull requests to guide the
> api specifications under that directory. Just a thought.

This all sgtm. A mailing list for design discussions + github issue
tracker seems to be working well for many open source projects I've
been tracking lately. Most of them are using Google Groups for their
mailing lists.
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
Quoting Fabio Kung (fabio.k...@gmail.com):
> +1, if there's momentum on this I believe I will be able to contribute
> some cycles. Maybe now is the right time?

Might be. Maybe the thing to do is start a project and mailing list
(any objections to github? Do we create a new project for this?), and
see if more than 3 people join :) Then announce it on the containers@ and
cgroup@ mailing lists, and start discussing what a reasonable API would
look like.

-serge
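As one data point for that API discussion, here is a minimal Python sketch of a memory query such a library might offer. Nothing here is taken from libresource itself: the function names are hypothetical, and the sketch assumes the caller has already read /proc/meminfo and the cgroup limit file (`memory.limit_in_bytes` on cgroup v1, `memory.max` on cgroup v2) for its own cgroup. It ignores limits set on parent cgroups.

```python
def parse_meminfo_total(meminfo_text):
    """Return MemTotal from /proc/meminfo text, in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            fields = line.split()
            # Line format: 'MemTotal:  <number> kB' (units of 1024 bytes)
            return int(fields[1]) * 1024
    raise ValueError("MemTotal not found")


def parse_cgroup_limit(limit_text):
    """Parse a cgroup memory limit in bytes; return None when unlimited."""
    value = limit_text.strip()
    if value == "max":  # cgroup v2 spelling of 'no limit'
        return None
    return int(value)


def available_memory(meminfo_text, limit_text):
    """Hypothetical query: memory the caller may use, in bytes."""
    total = parse_meminfo_total(meminfo_text)
    limit = parse_cgroup_limit(limit_text)
    if limit is None:
        return total
    # cgroup v1 reports a very large number when no limit is set,
    # so taking the minimum covers that case as well.
    return min(total, limit)
```

Keeping the parsers separate from file access makes them easy to test and leaves the question of where the cgroup files are mounted to the caller.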
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Wed, Sep 16, 2015 at 03:15:52PM +, Serge Hallyn wrote:
> Might be. Maybe the thing to do is start a project and mailing list
> (any objections to github? Do we create a new project for this?), and
> see if more than 3 people join :) Announce on containers@ and cgroup@
> mailing lists, and start discussing what a reasonable API would look
> like.

FWIW, I would support any such effort, but I'm unlikely to have free
resources to do anything more than watch its mailing list.

Regards,
Daniel
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
Quoting Daniel P. Berrange (berra...@redhat.com):
> FWIW, I would support any such effort, but I'm unlikely to have free
> resources to do anything more than watch its mailing list.

NP - if you can correct our course if we're heading someplace bad for
libvirt that'll be great. Though I suspect lxc/lxd and libvirt will
mostly agree.

Ok, so I could create a project on github, but that doesn't come with
a m-l. Last I used it, sf was problematic. Any other suggestions for
where to host a mailing list? Might the github issue tracker suffice?
We could (as worked quite well for lxd) have a specs/ directory in a
libresource source tree, and use issues and pull requests to guide the
api specifications under that directory. Just a thought.

-serge
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Mon, Sep 7, 2015 at 8:55 AM, Serge Hallyn wrote:
>
> Ah, my memory was failing me, so took a bit of searching, but
>
> http://fabiokung.com/2014/03/13/memory-inside-linux-containers/
>
> I can't find anything called 'libmymem', and in 2014 he said
>
> https://github.com/docker/docker/issues/8427#issuecomment-58255159
>
> so maybe this never went anywhere.

Correct, unfortunately.

> For the same reasons you cited above, and because everyone is rolling
> their own at fuse level, I still think that a libresource and patches
> to proc tools to use them, is the right way to go. We have no shortage
> of sample code for the functions doing the actual work, between libvirt,
> lxc, docker, etc :)
>
> Should we just go ahead and start a libresource github project?

+1, if there's momentum on this I believe I will be able to contribute
some cycles. Maybe now is the right time?
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Thu, Sep 03, 2015 at 11:51:16AM +0200, Cédric Bosdonnat wrote:
> We already have a fuse mount to reflect the cgroup memory restrictions
> in the container. This commit adds the same for the number of available
> CPUs. Only the CPUs listed by virProcessGetAffinity are shown in the
> container's cpuinfo.

So this (re-)raises some interesting / difficult questions that I'm
not sure we have a good answer to.

The main concern is that this is not really a problem specific
to containers; rather, it is related to cgroup resource confinement,
i.e. the cgroup has confined a process(es) to a set of CPUs and the
process is using /proc/cpuinfo to count CPUs, and so gets the wrong
answer. Cgroups are being increasingly widely used in Linux, particularly
since systemd, so pretty much any process has to expect that it can be
confined to a subset of CPUs.

IOW, any application using /proc/cpuinfo to determine "available"
resources is already broken, even when run on bare metal. The same also
applies to the use of /proc/meminfo, which we previously faked via
fuse.

So the question is whether we should invest time trying to fake
/proc/cpuinfo in containers, when any apps we'd be fixing are already
broken on bare metal. Apps might have avoided /proc/cpuinfo and instead
be trying /sys/devices/system/cpu/, which your patch isn't trying to
fake. This is just as broken, because sysfs doesn't reflect cgroup
confinement either.

I think what is ultimately needed for applications is some kind of
libresource.so library that they can use to query what resources
are available in their compute environment, which can intelligently
query cgroups directly, and ignore the legacy /proc & /sys interfaces
for counting memory / cpu availability. I don't think that's something
that libvirt should solve - if anything it could be systemd, or a
standalone project.

So I'm increasingly convinced that LXC should not try to fake out
any /proc & /sys file content, and instead document the limitations.
I'm also thinking that we should kill off our existing meminfo fake
fuse at some point.

The more minor concern I have is around the implementation. AFAIR, the
/proc/cpuinfo file contents are not standardized across architectures,
so I'm concerned whether your parsing code is robust on non-x86 arches.

Regards,
Daniel
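The mismatch between what /proc/cpuinfo lists and what a confined process can use can be seen with a short Python sketch. It is only an illustration under stated assumptions: it assumes Linux, it assumes the x86-style `processor : N` lines (which, as noted, are not standardized across architectures), and the two function names are made up here. It relies on `os.sched_getaffinity(0)`, which reflects cpuset confinement but not CPU bandwidth quotas.

```python
import os


def count_cpuinfo_processors(cpuinfo_text):
    """Count 'processor : N' entries, as a naive /proc/cpuinfo reader would.

    Assumption: x86-style layout. Other architectures format this file
    differently, so this parser is not portable.
    """
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count


def usable_cpus():
    """Return the number of CPUs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: the affinity mask already honours cpuset restrictions.
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity.
    return os.cpu_count() or 1


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            listed = count_cpuinfo_processors(f.read())
    except OSError:
        listed = None  # no /proc/cpuinfo on this system
    print("listed in /proc/cpuinfo:", listed)
    print("usable by this process: ", usable_cpus())
```

Run inside a cpuset-confined cgroup, the two numbers would be expected to differ, which is the situation an application counting CPUs from /proc/cpuinfo gets wrong.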
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
Quoting Daniel P. Berrange (berra...@redhat.com):
> I think what is ultimately needed for applications is some kind of
> libresource.so library that they can use to query what resources

Does anyone remember who it was that announced an effort to this
end a year or two ago, or know what the status of it is?

> are available in their compute environment, which can intelligently
> query cgroups directly, and ignore the legacy /proc & /sys interfaces
> for counting memory / cpu availability. I don't think that's something
> that libvirt should solve - if anything it could be systemd, or a
> standalone project.
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
Quoting Daniel P. Berrange (berra...@redhat.com):
> I don't recall seeing any formal announcement about this, but I have
> had this exact same discussion with Red Hat folks involved with
> Docker and similar higher level container mgmt tools, so perhaps
> someone involved in those efforts is working on it ?

Ah, my memory was failing me, so took a bit of searching, but

http://fabiokung.com/2014/03/13/memory-inside-linux-containers/

I can't find anything called 'libmymem', and in 2014 he said

https://github.com/docker/docker/issues/8427#issuecomment-58255159

so maybe this never went anywhere.

For the same reasons you cited above, and because everyone is rolling
their own at fuse level, I still think that a libresource, plus patches
to the proc tools to use it, is the right way to go. We have no shortage
of sample code for the functions doing the actual work, between libvirt,
lxc, docker, etc :)

Should we just go ahead and start a libresource github project?
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Mon, 2015-09-07 at 15:21 +0200, Cedric Bosdonnat wrote:
> > The more minor concern I have is around the implementation. AFAIR, the
> > /proc/cpuinfo file contents is not standardized across architectures,
> > so I'm concerned whether your parsing code is robust on non-x86 arches.
>
> Hum... I didn't even know that file would change with arch'es.

Take a look at linuxNodeInfoCPUPopulate() in src/nodeinfo.c for
inspiration. Sharing the parsing code would also be nice.

Cheers.

--
Andrea Bolognani
Software Engineer - Virtualization Team
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Mon, Sep 07, 2015 at 03:39:13PM +, Serge Hallyn wrote:
> Quoting Daniel P. Berrange (berra...@redhat.com):
> > On Thu, Sep 03, 2015 at 11:51:16AM +0200, Cédric Bosdonnat wrote:
> > > We already have a fuse mount to reflect the cgroup memory restrictions
> > > in the container. This commit adds the same for the number of available
> > > CPUs. Only the CPUs listed by virProcessGetAffinity are shown in the
> > > container's cpuinfo.
> >
> > So this (re-)raises some interesting / difficult questions that I'm
> > not sure we have a good answer to.
> >
> > The main concern is that actually this is not really a problem specific
> > to containers, rather it is related to cgroup resource confinement.
> > ie the cgroup has confined a process(es) to a set of CPUs and the process
> > is using /proc/cpuinfo to count CPUs and so is wrong. Cgroups are being
> > increasingly widely used in Linux, particularly since systemd, so pretty
> > much any process has to expect that it can be confined to a subset of
> > CPUs.
> >
> > IOW, any application using /proc/cpuinfo to determine "available"
> > resource is already broken, even when run on bare metal. The same also
> > applies to the use of /proc/meminfo, which we previously faked via
> > fuse.
> >
> > So the question is whether we should invest time trying to fake the
> > /proc/cpuinfo in containers, when any apps we'd be fixing are already
> > broken in bare metal. Apps might have avoided /proc/cpuinfo and instead
> > be trying /sys/devices/system/cpu/ which your patch isn't trying to
> > fake. This is just as broken, because sysfs doesn't reflect cgroup
> > confinement either.
> >
> > I think what is ultimately needed for applications is some kind of
> > libresource.so library that they can use to query what resources
>
> Does anyone remember who it was that announced an effort to this
> end a year or two ago, or know what the status of it is?
I don't recall seeing any formal announcement about this, but I have
had this exact same discussion with Red Hat folks involved with
Docker and similar higher level container mgmt tools, so perhaps
someone involved in those efforts is working on it ?

Regards,
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
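The gap between the CPUs listed in /proc/cpuinfo and the CPUs a confined process may actually run on can be shown with a short sketch. It assumes Linux with glibc, and usable_cpu_count is a hypothetical name, not an existing library function. sched_getaffinity() reports the caller's affinity mask, which the kernel keeps inside the cpuset the task is confined to. It does not reflect CPU bandwidth quotas, so it is only part of what a libresource would have to answer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>

/*
 * Hypothetical helper: number of CPUs the calling thread is allowed
 * to run on, or -1 on error. The mask is allocated dynamically and
 * grown on EINVAL so that hosts with more CPUs than CPU_SETSIZE
 * are handled as well.
 */
static int usable_cpu_count(void)
{
    int ncpus = CPU_SETSIZE;

    for (;;) {
        size_t size = CPU_ALLOC_SIZE(ncpus);
        cpu_set_t *set = CPU_ALLOC(ncpus);
        int rc, saved_errno, count;

        if (set == NULL)
            return -1;

        rc = sched_getaffinity(0, size, set);
        saved_errno = errno;    /* keep errno across CPU_FREE() */
        count = (rc == 0) ? CPU_COUNT_S(size, set) : -1;
        CPU_FREE(set);

        if (rc == 0) {
            /* A runnable thread always has at least one CPU in its mask. */
            assert(count >= 1);
            return count;
        }
        if (saved_errno != EINVAL || ncpus >= (1 << 20))
            return -1;
        ncpus *= 2;             /* mask too small for this kernel, retry */
    }
}
```

On an unconfined process this normally matches the number of "processor" entries in /proc/cpuinfo. Under a restricted cpuset or after taskset it is smaller, while /proc/cpuinfo stays the same.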
Re: [libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
On Mon, 2015-09-07 at 13:23 +0100, Daniel P. Berrange wrote:
> On Thu, Sep 03, 2015 at 11:51:16AM +0200, Cédric Bosdonnat wrote:
> > We already have a fuse mount to reflect the cgroup memory restrictions
> > in the container. This commit adds the same for the number of available
> > CPUs. Only the CPUs listed by virProcessGetAffinity are shown in the
> > container's cpuinfo.
>
> So this (re-)raises some interesting / difficult questions that I'm
> not sure we have a good answer to.
>
> The main concern is that actually this is not really a problem specific
> to containers, rather it is related to cgroup resource confinement.
> ie the cgroup has confined a process(es) to a set of CPUs and the process
> is using /proc/cpuinfo to count CPUs and so is wrong. Cgroups are being
> increasingly widely used in Linux, particularly since systemd, so pretty
> much any process has to expect that it can be confined to a subset of
> CPUs.

I agree.

> IOW, any application using /proc/cpuinfo to determine "available"
> resource is already broken, even when run on bare metal. The same also
> applies to the use of /proc/meminfo, which we previously faked via
> fuse.
>
> So the question is whether we should invest time trying to fake the
> /proc/cpuinfo in containers, when any apps we'd be fixing are already
> broken in bare metal. Apps might have avoided /proc/cpuinfo and instead
> be trying /sys/devices/system/cpu/ which your patch isn't trying to
> fake. This is just as broken, because sysfs doesn't reflect cgroup
> confinement either.

I agree /sys/devices/system/cpu should be patched too... but it contains
much more subtle things to handle. At least I don't have a good enough
knowledge of that FS to fake it properly.
> I think what is ultimately needed for applications is some kind of
> libresource.so library that they can use to query what resources
> are available in their compute environment, which can intelligently
> query cgroups directly, and ignore the legacy /proc & /sys interfaces
> for counting memory / cpu availability. I don't think that's something
> that libvirt should solve - if anything it could be systemd, or a
> standalone project.

Ok, then not something that would be available in a reasonable time
frame unless we start it. Do you know if someone in another project is
caring about that problem?

> So I'm increasingly convinced that LXC should not try to fake out
> any /proc & /sys file content, and instead document the limitations.
> I'm also thinking that we should kill off our existing meminfo fake
> fuse at some point.

OK.

> The more minor concern I have is around the implementation. AFAIR, the
> /proc/cpuinfo file contents is not standardized across architectures,
> so I'm concerned whether your parsing code is robust on non-x86 arches.

Hum... I didn't even know that file would change with arch'es.

--
Cedric
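As an illustration of what "query cgroups directly" could involve, the sketch below counts the CPUs in a kernel-style CPU list such as "0-3,8", the text format used by cpuset control files. The helper name count_cpu_list is hypothetical. Locating and reading the right cpuset file for the current process is left out, since the path depends on the cgroup version and mount layout. The parser assumes non-overlapping ranges, which is what the kernel emits.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: count the CPUs named in a list such as
 * "0-3,8" (an optional trailing newline is accepted).
 * Returns -1 on malformed input and 0 for an empty list.
 * Overlapping ranges are not de-duplicated.
 */
static long count_cpu_list(const char *s)
{
    long total = 0;

    assert(s != NULL);

    while (*s != '\0' && *s != '\n') {
        char *end = NULL;
        unsigned long lo, hi;

        if (*s < '0' || *s > '9')
            return -1;
        lo = hi = strtoul(s, &end, 10);

        if (*end == '-') {              /* "lo-hi" range */
            s = end + 1;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtoul(s, &end, 10);
            if (hi < lo)
                return -1;
        }
        total += (long)(hi - lo + 1);

        if (*end == ',')
            end++;                      /* move on to the next item */
        else if (*end != '\0' && *end != '\n')
            return -1;
        s = end;
    }
    return total;
}
```

For example, a cpuset file containing "0-3,8" describes five CPUs, regardless of how many "processor" entries /proc/cpuinfo lists on the host.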
[libvirt] [PATCH v2] lxc: fuse mount for /proc/cpuinfo
We already have a fuse mount to reflect the cgroup memory restrictions
in the container. This commit adds the same for the number of available
CPUs. Only the CPUs listed by virProcessGetAffinity are shown in the
container's cpuinfo.
---
 src/lxc/lxc_container.c |  42
 src/lxc/lxc_fuse.c      | 106
 2 files changed, 133 insertions(+), 15 deletions(-)

diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index a433552..7ae13a8 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -1055,24 +1055,38 @@ static int lxcContainerMountProcFuse(virDomainDefPtr def,
                                      const char *stateDir)
 {
     int ret;
-    char *meminfo_path = NULL;
+    char *src_path = NULL;
+    char *dst_path = NULL;
+    const char *paths[] = {"meminfo", "cpuinfo"};
+    size_t i;
 
-    VIR_DEBUG("Mount /proc/meminfo stateDir=%s", stateDir);
+    for (i = 0; i < 2; i++) {
+        VIR_DEBUG("Mount /proc/%s stateDir=%s", paths[i], stateDir);
+
+        if ((ret = virAsprintf(&src_path,
+                               "/.oldroot/%s/%s.fuse/%s",
+                               stateDir,
+                               def->name,
+                               paths[i])) < 0)
+            return ret;
+
+        if ((ret = virAsprintf(&dst_path,
+                               "/proc/%s",
+                               paths[i])) < 0) {
+            VIR_FREE(src_path);
+            return ret;
+        }
 
-    if ((ret = virAsprintf(&meminfo_path,
-                           "/.oldroot/%s/%s.fuse/meminfo",
-                           stateDir,
-                           def->name)) < 0)
-        return ret;
+        if ((ret = mount(src_path, dst_path,
+                         NULL, MS_BIND, NULL)) < 0) {
+            virReportSystemError(errno,
+                                 _("Failed to mount %s on %s"),
+                                 src_path, dst_path);
+        }
 
-    if ((ret = mount(meminfo_path, "/proc/meminfo",
-                     NULL, MS_BIND, NULL)) < 0) {
-        virReportSystemError(errno,
-                             _("Failed to mount %s on /proc/meminfo"),
-                             meminfo_path);
+        VIR_FREE(src_path);
+        VIR_FREE(dst_path);
     }
-
-    VIR_FREE(meminfo_path);
     return ret;
 }
 #else
diff --git a/src/lxc/lxc_fuse.c b/src/lxc/lxc_fuse.c
index 34a69cc..0d60434 100644
--- a/src/lxc/lxc_fuse.c
+++ b/src/lxc/lxc_fuse.c
@@ -42,6 +42,58 @@
 #if WITH_FUSE
 
 static const char *fuse_meminfo_path = "/meminfo";
+static const char *fuse_cpuinfo_path = "/cpuinfo";
+
+static virBufferPtr lxcProcComputeCpuinfo(void) {
+    FILE *fd = NULL;
+    char *line = NULL;
+    size_t n;
+    bool writeProc = false;
+    virBuffer buffer = VIR_BUFFER_INITIALIZER;
+    virBufferPtr new_cpuinfo = &buffer;
+    pid_t pid;
+    virBitmapPtr cpuAffinity = NULL;
+
+    fd = fopen("/proc/cpuinfo", "r");
+    if (fd == NULL) {
+        virReportSystemError(errno, "%s", _("Cannot open /proc/cpuinfo"));
+        goto error;
+    }
+
+    pid = getpid();
+    if (!(cpuAffinity = virProcessGetAffinity(pid)))
+        goto error;
+
+    while (getline(&line, &n, fd) > 0) {
+        if (STRPREFIX(line, "processor\t:")) {
+            unsigned long cpuid = 0;
+            char *suffix = NULL;
+            if (virStrToLong_ul(line + 12, &suffix, 10, &cpuid) < 0)
+                goto error;
+
+            if (virBitmapGetBit(cpuAffinity, cpuid, &writeProc) < 0)
+                goto error;
+        }
+
+        if (writeProc) {
+            virBufferAdd(new_cpuinfo, line, -1);
+
+            if (virBufferCheckError(new_cpuinfo) < 0)
+                goto error;
+        }
+    }
+
+ cleanup:
+    VIR_FREE(line);
+    VIR_FORCE_FCLOSE(fd);
+    virBitmapFree(cpuAffinity);
+    return new_cpuinfo;
+
+ error:
+    virBufferFreeAndReset(new_cpuinfo);
+    new_cpuinfo = NULL;
+    goto cleanup;
+}
 
 static int lxcProcGetattr(const char *path, struct stat *stbuf)
 {
@@ -50,6 +102,7 @@ static int lxcProcGetattr(const char *path, struct stat *stbuf)
     struct stat sb;
     struct fuse_context *context = fuse_get_context();
     virDomainDefPtr def = (virDomainDefPtr)context->private_data;
+    virBufferPtr cpuinfo = NULL;
 
     memset(stbuf, 0, sizeof(struct stat));
     if (virAsprintf(&mempath, "/proc/%s", path) < 0)
@@ -76,12 +129,36 @@ static int lxcProcGetattr(const char *path, struct stat *stbuf)
         stbuf->st_atime = sb.st_atime;
         stbuf->st_ctime = sb.st_ctime;
         stbuf->st_mtime = sb.st_mtime;
+    } else if (STREQ(path, fuse_cpuinfo_path)) {
+        if (!(cpuinfo = lxcProcComputeCpuinfo())) {
+            res = -EIO;
+            goto cleanup;
+        }
+
+        if (stat(mempath, &sb) < 0) {
+            res = -errno;
+            goto cleanup;
+        }
+
+        stbuf->st_uid = def->idmap.uidmap ? def->idmap.uidmap[0].target : 0;
+        stbuf->st_gid = def->idmap.gidmap ? def->idmap.gidmap[0].target : 0;
+