Re: per-product packaging question

2017-01-30 Thread Vivek Goyal
On Mon, Jan 30, 2017 at 05:00:34PM -0500, Lokesh Mandvekar wrote:
> Hi,
> 
> I'm looking at the per-product packaging doc at
> https://fedoraproject.org/wiki/Packaging:Per-Product_Configuration
> and I see that variants for all products are installed at package install 
> time, with
> the ghost file pointing to the appropriate product variant.
> 
> Just wondering if there's a reason for installing all variants and/or if it's
> worth considering installation of just the particular variant appropriate for
> the system at install time?

We are looking at installing per-product configuration for
docker-storage-setup, and we are ending up with many configuration
files:

/etc/sysconfig/docker-storage-setup-default
/etc/sysconfig/docker-storage-setup-server
/etc/sysconfig/docker-storage-setup-workstation
/etc/sysconfig/docker-storage-setup-atomic
/etc/sysconfig/docker-storage-setup-cloud

This really looks ugly. So the question is: can we just install the default
and the config file specific to that product, and ignore the rest?
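
For illustration, here is a minimal sketch of the kind of scriptlet the
guideline implies, where only the ghost file ends up pointing at one variant.
The VARIANT_ID mapping and file names are assumptions for the example, not
the guideline's actual mechanism:

  # hypothetical %posttrans-style scriptlet (sketch only)
  . /etc/os-release
  case "$VARIANT_ID" in
      server|workstation|cloud) variant="$VARIANT_ID" ;;
      *)                        variant=default ;;
  esac
  # point the %ghost config at the variant shipped for this product
  ln -sf docker-storage-setup-"$variant" /etc/sysconfig/docker-storage-setup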

Vivek


> 
> Thanks,
> -- 
> Lokesh
> Freenode: lsm5
> GPG: 0xC7C3A0DD
> https://keybase.io/lsm5

___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Changing default "docker" storage to Overlay2 in Fedora 26

2017-01-09 Thread Vivek Goyal
On Fri, Jan 06, 2017 at 11:15:35PM +0100, Alec Leamas wrote:
> Hi!
> 
> Is there a local convention that discussions about containers use
> top-posting?
> 
> "Curious"

Nope. I just posted it without thinking about the convention. But if we are
following inline replies on this list, I will follow that from next time
onwards. No issues.

Vivek

> 
> --alec
> 
> On 06/01/17 23:12, Vivek Goyal wrote:
> > There is more conversation on this issue here.
> > 
> > https://pagure.io/atomic-wg/issue/186
> > 
I wish there were a single thread of conversation on this instead of a
separate conversation per product variant.
> > 
> > Thanks
> > Vivek
> > 
> > On Fri, Jan 06, 2017 at 02:05:49PM -0500, Daniel J Walsh wrote:
> > > Upstream docker is moving to overlay2 by default for its storage.  We
> > > plan on following suit.  There are some performance advantages of
> > > overlay2 over devicemapper in memory sharing, which we would like to
> > > take advantage of.   We now have SELinux support for Overlay  file
> > > systems, so the security should be just as good.
> > > 
> > > Note: Overlay is not a POSIX-compliant file system, so there could be
> > > problems with your containers running on overlay, so
> > > we want to make sure it is fairly easy to switch back to devicemapper.
> > > 
> > > Devicemapper out of the box, on Fedora Workstation, currently defaults
> > > to loopback devices for storage, which has a performance penalty, but
> > > this was the only way we were able to get docker to work right away.
> > > Switching to overlay2 will cause the storage to be shared with / and
> > > will eliminate this performance overhead.   This is the way we will ship
> > > Fedora Workstation.
> > > 
> > > On Fedora atomic host and Fedora server we have been storing
> > > devicemapper storage on a separate partition.  We plan on doing the same
> > > thing with overlay2.  This means a separate device will be mounted on
> > > /var/lib/docker.  This will make it easier for someone to switch back to
> > > devicemapper, if overlay2 has problems.
> > > 
> > > Upgraded systems will not be affected.
> > > 
> > > If you want to switch from one storage to another take a look at the
> > > `atomic storage` commands.
> > > 
> > > We will write up release notes to cover this change, along with a blog
> > > post explaining the commands to switch back and forth.
> > ___
> > devel mailing list -- devel@lists.fedoraproject.org
> > To unsubscribe send an email to devel-le...@lists.fedoraproject.org
> > 
> ___
> devel mailing list -- devel@lists.fedoraproject.org
> To unsubscribe send an email to devel-le...@lists.fedoraproject.org
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Changing default "docker" storage to Overlay2 in Fedora 26

2017-01-06 Thread Vivek Goyal
There is more conversation on this issue here.

https://pagure.io/atomic-wg/issue/186

I wish there were a single thread of conversation on this instead of a
separate conversation per product variant.

Thanks
Vivek

On Fri, Jan 06, 2017 at 02:05:49PM -0500, Daniel J Walsh wrote:
> Upstream docker is moving to overlay2 by default for its storage.  We
> plan on following suit.  There are some performance advantages of
> overlay2 over devicemapper in memory sharing, which we would like to
> take advantage of.   We now have SELinux support for Overlay  file
> systems, so the security should be just as good.
> 
> Note: Overlay is not a POSIX-compliant file system, so there could be
> problems with your containers running on overlay, so
> we want to make sure it is fairly easy to switch back to devicemapper.
> 
> Devicemapper out of the box, on Fedora Workstation, currently defaults
> to loopback devices for storage, which has a performance penalty, but
> this was the only way we were able to get docker to work right away. 
> Switching to overlay2 will cause the storage to be shared with / and
> will eliminate this performance overhead.   This is the way we will ship
> Fedora Workstation.
> 
> On Fedora atomic host and Fedora server we have been storing
> devicemapper storage on a separate partition.  We plan on doing the same
> thing with overlay2.  This means a separate device will be mounted on
> /var/lib/docker.  This will make it easier for someone to switch back to
> devicemapper, if overlay2 has problems.
> 
> Upgraded systems will not be affected.
> 
> If you want to switch from one storage to another take a look at the
> `atomic storage` commands.
> 
> We will write up release notes to cover this change, along with a blog
> post explaining the commands to switch back and forth.
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Changing default "docker" storage to Overlay2 in Fedora 26

2017-01-06 Thread Vivek Goyal
On Fri, Jan 06, 2017 at 02:05:49PM -0500, Daniel J Walsh wrote:
> Upstream docker is moving to overlay2 by default for its storage.  We
> plan on following suit.  There are some performance advantages of
> overlay2 over devicemapper in memory sharing, which we would like to
> take advantage of.   We now have SELinux support for Overlay  file
> systems, so the security should be just as good.
> 
> Note: Overlay is not a POSIX-compliant file system, so there could be
> problems with your containers running on overlay, so
> we want to make sure it is fairly easy to switch back to devicemapper.

Yes, the idea is basically that with overlay2 continuously improving upstream,
it might be at a stage where the pros of overlay2 outweigh its cons for a lot
of people. And if it does not, then provide an easy way for users to switch
back to devicemapper.

With overlay2 being the default, it will also provide some data about
how well it works for various workloads.
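
For anyone wanting to try the switch on an existing system, the `atomic
storage` commands mentioned later in this thread are the intended path. A
rough sketch of the flow (subcommand and flag names as I recall them; check
`atomic storage --help` before running, and note that resetting removes
existing images and containers):

  atomic storage reset                      # wipe the current container storage
  atomic storage modify --driver overlay2   # request the overlay2 driver
  systemctl restart docker                  # storage is reconfigured on restart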

Vivek

> On Fedora atomic host and Fedora server we have been storing
> devicemapper storage on a separate partition.  We plan on doing the same
> thing with overlay2.  This means a separate device will be mounted on
> /var/lib/docker.  This will make it easier for someone to switch back to
> devicemapper, if overlay2 has problems.
> 
> Upgraded systems will not be affected.
> 
> If you want to switch from one storage to another take a look at the
> `atomic storage` commands.
> 
> We will write up release notes to cover this change, along with a blog
> post explaining the commands to switch back and forth.
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Modifying container storage for Fedora 26.

2016-11-23 Thread Vivek Goyal
On Wed, Nov 16, 2016 at 02:56:35PM -0500, Colin Walters wrote:
> On Wed, Nov 16, 2016, at 02:49 PM, Stephen Gallagher wrote:
> 
> > Today, Fedora Server relies on whatever is the default for 
> > docker-storage-setup.
> > We just tell Anaconda to reserve up to 15GiB by default for the / partition 
> > and
> > then it puts all remaining free space (on drives selected to be used by
> > Anaconda) into a single logical volume with no partitions.
> > 
> > It's a very easy thing for us to drop a different config file for
> > docker-storage-setup into place for Server. So if that's all we need to do, 
> > let
> > me know and I'll work it up.
> 
> If anyone has time to work on this, I don't see a need for Atomic Host
> and Server to have different partitioning defaults, so if we can just
> merge them, that'd be nice.

Agreed. 

Vivek
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Modifying container storage for Fedora 26.

2016-11-23 Thread Vivek Goyal
On Tue, Nov 22, 2016 at 11:24:04PM -, Josh Berkus wrote:
> Vivek, Dan,
> 
> > - Now when docker uses the overlay2 graph driver, all the images, containers
> >   and associated metadata will be stored outside the root filesystem and
> >   onto /dev/docker-vg/foo logical volume.
> 

[ Please don't remove email ids from the original To/Cc list. My emails
  get filtered into a folder and I don't scan it often. ]

> This is a change from current storage setup?  Right now, containers go in the 
> docker volume, but images do not.  If we can do that, it would be worth 
> having overlay right there.

Right now, with an lvm thin pool, both images and containers get stored in
the thin pool. Some metadata associated with images and containers remains
on the rootfs (in the /var/lib/docker/ directory).

> 
> My one concern for Atomic is ... how do existing users upgrade when we make 
> this change?  Does their devicemapper config still just work when they pull 
> the new tree?

There will be no change to existing devicemapper functionality, and upgrades
should continue to work fine.

This new functionality should kick in only if there is no existing setup
(signalled by the absence of the /etc/sysconfig/docker-storage file).

Otherwise, docker-storage-setup can detect that there is already a
configured driver and will not set up a new one.
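
In shell terms, the guard described here amounts to something like the
following (a sketch only, not the actual docker-storage-setup code; the
file name is the one mentioned above):

  # if docker storage was already configured, leave it alone
  if [ -e /etc/sysconfig/docker-storage ]; then
      echo "docker storage already configured, not touching it"
      exit 0
  fi
  # ...otherwise fall through and set up the new default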

Vivek
> ___
> devel mailing list -- devel@lists.fedoraproject.org
> To unsubscribe send an email to devel-le...@lists.fedoraproject.org
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Modifying container storage for Fedora 26.

2016-11-16 Thread Vivek Goyal
On Wed, Nov 16, 2016 at 03:19:06PM -0500, Stephen Gallagher wrote:
> On 11/16/2016 03:09 PM, Vivek Goyal wrote:
> > On Wed, Nov 16, 2016 at 03:01:06PM -0500, Stephen Gallagher wrote:
> >> On 11/16/2016 02:56 PM, Vivek Goyal wrote:
> >>> On Wed, Nov 16, 2016 at 02:49:25PM -0500, Stephen Gallagher wrote:
> >>>> On 11/16/2016 02:40 PM, Vivek Goyal wrote:
> >>>>> On Wed, Nov 16, 2016 at 02:32:46PM -0500, Daniel J Walsh wrote:
> >>>>>> We would like to change the docker container storage to default to
> >>>>>> Overlayfs2 in Fedora 26.  But we have a problem on Atomic Host and
> >>>>>> Fedora Server distributions.
> >>>>>>
> >>>>>>
> >>>>>> Currently docker-storage-setup defaults to devicemapper and is hard
> >>>>>> coded to set up a thinpool of 40% of the remaining disk.  Otherwise it sets
> >>>>>> up loopback devices on the root file system.   Devicemapper is nice
> >>>>>> since it works with thinpools and can automatically expand the storage
> >>>>>> if the disk space is getting used up. 
> >>>>>>
> >>>>>> Moving to Overlay, we can more easily use the root file system 
> >>>>>> directly,
> >>>>>> which would be fine for Fedora Workstation.  We want to preserve the 
> >>>>>> use
> >>>>>> of the remaining storage for Overlay on AH and Fedora Server,  since
> >>>>>> this would give a user flexibility to switch back to using devicemapper
> >>>>>> if they had problems with the Overlay driver.
> >>>>>
> >>>>> And being able to do so basically involves the following.
> >>>>>
> >>>>> - docker-storage-setup creates a logical volume from free space
> >>>>> - Creates a filesystem on that logical volume
> >>>>> - Mounts that logical volume on the directory which docker is going to
> >>>>>   use.
> >>>>>
> >>>>>   mount /dev/docker-vg/foo /var/lib/docker/
> >>>>>
> >>>>> - Now when docker uses the overlay2 graph driver, all the images, 
> >>>>> containers
> >>>>>   and associated metadata will be stored outside the root filesystem and
> >>>>>   onto /dev/docker-vg/foo logical volume.
> >>>>>
> >>>>>> We can not as easily
> >>>>>> support the expanding disk for Overlay since we will not be using a 
> >>>>>> thinpool.
> >>>>>
> >>>>>>
> >>>>>> We have looked at options to hard code OverlayFS with the defaults,
> >>>>>
> >>>>> If we always mount /var/lib/docker on /dev/vg/foo for the overlay2 driver
> >>>>> this will be a regression w.r.t current behavior. So I would not
> >>>>> recommend changing current behavior. I think this should be an opt-in.
> >>>>> We are working on providing a config knob to select this behavior and
> >>>>> atomic host and fedora server will have to opt-in somehow.
> >>>>>
> >>>>> I think it will be easy for atomic host as they already drop something
> >>>>> in /etc/sysconfig/docker-storage-setup. Not sure how fedora server
> >>>>> variant will do it.
> >>>>>
> >>>>
> >>>>
> >>>> Today, Fedora Server relies on whatever is the default for 
> >>>> docker-storage-setup.
> >>>> We just tell Anaconda to reserve up to 15GiB by default for the / 
> >>>> partition and
> >>>> then it puts all remaining free space (on drives selected to be used by
> >>>> Anaconda) into a single logical volume with no partitions.
> >>>>
> >>>> It's a very easy thing for us to drop a different config file for
> >>>> docker-storage-setup into place for Server. So if that's all we need to 
> >>>> do, let
> >>>> me know and I'll work it up.
> >>>
> >>> Ok, that sounds good. We are working on providing a knob to opt in to the
> >>> new behavior. I think all you will have to drop into the config file is
> >>> something like:
> >>>
> >>> /etc/sysconfig/docker-storage-setup
> >>>
> >>> STORAGE_DRIVER=overlay2
> >>> YET_TO_BE_NAMED_OP

Re: Modifying container storage for Fedora 26.

2016-11-16 Thread Vivek Goyal
On Wed, Nov 16, 2016 at 03:01:06PM -0500, Stephen Gallagher wrote:
> On 11/16/2016 02:56 PM, Vivek Goyal wrote:
> > On Wed, Nov 16, 2016 at 02:49:25PM -0500, Stephen Gallagher wrote:
> >> On 11/16/2016 02:40 PM, Vivek Goyal wrote:
> >>> On Wed, Nov 16, 2016 at 02:32:46PM -0500, Daniel J Walsh wrote:
> >>>> We would like to change the docker container storage to default to
> >>>> Overlayfs2 in Fedora 26.  But we have a problem on Atomic Host and
> >>>> Fedora Server distributions.
> >>>>
> >>>>
> >>>> Currently docker-storage-setup defaults to devicemapper and is hard
> >>>> coded to set up a thinpool of 40% of the remaining disk.  Otherwise it sets
> >>>> up loopback devices on the root file system.   Devicemapper is nice
> >>>> since it works with thinpools and can automatically expand the storage
> >>>> if the disk space is getting used up. 
> >>>>
> >>>> Moving to Overlay, we can more easily use the root file system directly,
> >>>> which would be fine for Fedora Workstation.  We want to preserve the use
> >>>> of the remaining storage for Overlay on AH and Fedora Server,  since
> >>>> this would give a user flexibility to switch back to using devicemapper
> >>>> if they had problems with the Overlay driver.
> >>>
> >>> And being able to do so basically involves the following.
> >>>
> >>> - docker-storage-setup creates a logical volume from free space
> >>> - Creates a filesystem on that logical volume
> >>> - Mounts that logical volume on the directory which docker is going to
> >>>   use.
> >>>
> >>>   mount /dev/docker-vg/foo /var/lib/docker/
> >>>
> >>> - Now when docker uses the overlay2 graph driver, all the images, containers
> >>>   and associated metadata will be stored outside the root filesystem and
> >>>   onto /dev/docker-vg/foo logical volume.
> >>>
> >>>> We can not as easily
> >>>> support the expanding disk for Overlay since we will not be using a 
> >>>> thinpool.
> >>>
> >>>>
> >>>> We have looked at options to hard code OverlayFS with the defaults,
> >>>
> >>> If we always mount /var/lib/docker on /dev/vg/foo for the overlay2 driver
> >>> this will be a regression w.r.t current behavior. So I would not
> >>> recommend changing current behavior. I think this should be an opt-in.
> >>> We are working on providing a config knob to select this behavior and
> >>> atomic host and fedora server will have to opt-in somehow.
> >>>
> >>> I think it will be easy for atomic host as they already drop something
> >>> in /etc/sysconfig/docker-storage-setup. Not sure how fedora server
> >>> variant will do it.
> >>>
> >>
> >>
> >> Today, Fedora Server relies on whatever is the default for 
> >> docker-storage-setup.
> >> We just tell Anaconda to reserve up to 15GiB by default for the / 
> >> partition and
> >> then it puts all remaining free space (on drives selected to be used by
> >> Anaconda) into a single logical volume with no partitions.
> >>
> >> It's a very easy thing for us to drop a different config file for
> >> docker-storage-setup into place for Server. So if that's all we need to 
> >> do, let
> >> me know and I'll work it up.
> > 
> > Ok, that sounds good. We are working on providing a knob to opt in to the
> > new behavior. I think all you will have to drop into the config file is
> > something like:
> > 
> > /etc/sysconfig/docker-storage-setup
> > 
> > STORAGE_DRIVER=overlay2
> > YET_TO_BE_NAMED_OPTION=VAL
> > 
> > So the upstream default will continue to be devicemapper. We will have to
> > modify fedora workstation, fedora server and atomic host infrastructure
> > to opt in to overlay2.
> > 
> 
> Why exactly does this need to be opt-in? Why wouldn't we just change the 
> default
> on Fedora Server to use overlay2 instead of devicemapper?
> 
> I think I'm missing some key part of the problem here.

I mean the default will be devicemapper in the upstream project, and
distributions will have to opt in to overlay2.

And I think one reason is that rhel uses the same git tree and we don't
want to switch to overlay2 by default for rhel yet.

overlay2 as the default will be an experiment on fedora first, and if it
works well, then we can change the default upstream too.

Vivek
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Modifying container storage for Fedora 26.

2016-11-16 Thread Vivek Goyal
On Wed, Nov 16, 2016 at 02:49:25PM -0500, Stephen Gallagher wrote:
> On 11/16/2016 02:40 PM, Vivek Goyal wrote:
> > On Wed, Nov 16, 2016 at 02:32:46PM -0500, Daniel J Walsh wrote:
> >> We would like to change the docker container storage to default to
> >> Overlayfs2 in Fedora 26.  But we have a problem on Atomic Host and
> >> Fedora Server distributions.
> >>
> >>
> >> Currently docker-storage-setup defaults to devicemapper and is hard
> >> coded to set up a thinpool of 40% of the remaining disk.  Otherwise it sets
> >> up loopback devices on the root file system.   Devicemapper is nice
> >> since it works with thinpools and can automatically expand the storage
> >> if the disk space is getting used up. 
> >>
> >> Moving to Overlay, we can more easily use the root file system directly,
> >> which would be fine for Fedora Workstation.  We want to preserve the use
> >> of the remaining storage for Overlay on AH and Fedora Server,  since
> >> this would give a user flexibility to switch back to using devicemapper
> >> if they had problems with the Overlay driver.
> > 
> > And being able to do so basically involves the following.
> > 
> > - docker-storage-setup creates a logical volume from free space
> > - Creates a filesystem on that logical volume
> > - Mounts that logical volume on the directory which docker is going to
> >   use.
> > 
> >   mount /dev/docker-vg/foo /var/lib/docker/
> > 
> > - Now when docker uses the overlay2 graph driver, all the images, containers
> >   and associated metadata will be stored outside the root filesystem and
> >   onto /dev/docker-vg/foo logical volume.
> > 
> >> We can not as easily
> >> support the expanding disk for Overlay since we will not be using a 
> >> thinpool.
> > 
> >>
> >> We have looked at options to hard code OverlayFS with the defaults,
> > 
> > If we always mount /var/lib/docker on /dev/vg/foo for the overlay2 driver
> > this will be a regression w.r.t current behavior. So I would not
> > recommend changing current behavior. I think this should be an opt-in.
> > We are working on providing a config knob to select this behavior and
> > atomic host and fedora server will have to opt-in somehow.
> > 
> > I think it will be easy for atomic host as they already drop something
> > in /etc/sysconfig/docker-storage-setup. Not sure how fedora server
> > variant will do it.
> > 
> 
> 
> Today, Fedora Server relies on whatever is the default for 
> docker-storage-setup.
> We just tell Anaconda to reserve up to 15GiB by default for the / partition 
> and
> then it puts all remaining free space (on drives selected to be used by
> Anaconda) into a single logical volume with no partitions.
> 
> It's a very easy thing for us to drop a different config file for
> docker-storage-setup into place for Server. So if that's all we need to do, 
> let
> me know and I'll work it up.

Ok, that sounds good. We are working on providing a knob to opt in to the
new behavior. I think all you will have to drop into the config file is
something like:

/etc/sysconfig/docker-storage-setup

STORAGE_DRIVER=overlay2
YET_TO_BE_NAMED_OPTION=VAL

So the upstream default will continue to be devicemapper. We will have to
modify fedora workstation, fedora server and atomic host infrastructure
to opt in to overlay2.

Vivek
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Modifying container storage for Fedora 26.

2016-11-16 Thread Vivek Goyal
On Wed, Nov 16, 2016 at 02:32:46PM -0500, Daniel J Walsh wrote:
> We would like to change the docker container storage to default to
> Overlayfs2 in Fedora 26.  But we have a problem on Atomic Host and
> Fedora Server distributions.
> 
> 
> Currently docker-storage-setup defaults to devicemapper and is hard
> coded to set up a thinpool of 40% of the remaining disk.  Otherwise it sets
> up loopback devices on the root file system.   Devicemapper is nice
> since it works with thinpools and can automatically expand the storage
> if the disk space is getting used up. 
> 
> Moving to Overlay, we can more easily use the root file system directly,
> which would be fine for Fedora Workstation.  We want to preserve the use
> of the remaining storage for Overlay on AH and Fedora Server,  since
> this would give a user flexibility to switch back to using devicemapper
> if they had problems with the Overlay driver.

And being able to do so basically involves the following.

- docker-storage-setup creates a logical volume from free space
- Creates a filesystem on that logical volume
- Mounts that logical volume on the directory which docker is going to
  use.

  mount /dev/docker-vg/foo /var/lib/docker/

- Now when docker uses the overlay2 graph driver, all the images, containers
  and associated metadata will be stored outside the root filesystem and
  onto /dev/docker-vg/foo logical volume.
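
In concrete commands, those steps look roughly like the following (names and
the size are illustrative assumptions, not what docker-storage-setup
literally runs):

  # carve a logical volume out of the free space in the volume group
  lvcreate --name docker-lv --extents 40%FREE docker-vg
  # put a filesystem on it and mount it where docker keeps its data
  mkfs.xfs /dev/docker-vg/docker-lv
  mount /dev/docker-vg/docker-lv /var/lib/docker
  # an /etc/fstab entry (or mount unit) would be needed to make this persistent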

> We can not as easily
> support the expanding disk for Overlay since we will not be using a thinpool.

> 
> We have looked at options to hard code OverlayFS with the defaults,

If we always mount /var/lib/docker on /dev/vg/foo for the overlay2 driver,
this will be a regression w.r.t. current behavior. So I would not
recommend changing current behavior. I think this should be an opt-in.
We are working on providing a config knob to select this behavior, and
atomic host and fedora server will have to opt in somehow.

I think it will be easy for atomic host as they already drop something
in /etc/sysconfig/docker-storage-setup. Not sure how the fedora server
variant will do it.

Thanks
Vivek

> or
> we could just drop a /etc/sysconfig/docker-storage-setup that specified
> Overlay and the percentage of remaining space to use for the
> /var/lib/docker device.  But what is the best way to set different
> defaults for AH, Fedora Server and Fedora Workstation?
> 
> We would like to discuss with you what you think is the best way to
> handle this.
___
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org


Re: Query about package versioning

2014-02-20 Thread Vivek Goyal
On Thu, Feb 20, 2014 at 05:39:17PM +0100, Mikolaj Izdebski wrote:
> On 02/20/2014 05:28 PM, Marcin Juszkiewicz wrote:
> > On 20.02.2014 at 17:16, Vivek Goyal wrote:
> > 
> >> So instead of increasing the release number on released branches, why don't
> >> we append an additional number after dist and bump that up in the released
> >> branch? So FC21 releases will look like:
> >>
> >>   kexec-tools-2.0.4-24.fc21.1
> >>   kexec-tools-2.0.4-24.fc21.2
> >>   ...
> >>   ...
> >>   kexec-tools-2.0.4-24.fc21.10
> >>
> >> That way we clearly know that FC21 was forked off master release .24. And
> >> upgradability of the package will be maintained without any chance of older
> >> release versions getting ahead of newer release versions.
> > 
> > %dist should be at the end.
> > 
> > So rather kexec-tools-2.0.4-23.X.fc21 where X means x-th revision of
> > fc21 package after distribution release.
> 
> That won't work if both branches already have the same release number
> and you need to bump release in older branch.  That can happen for
> example if you were fast-forwarding commits from f21 to f20 and at some
> point you need to add a bugfix only for f20.  Adding .1 after dist-tag
> will work in this case.

What is fast-forwarding commits from f21 to f20? I guess you are saying
there are a bunch of commits in the master branch and you want to now apply
those commits to the f20 branch too?

If yes, one can simply do another release on the master branch if there is
a need to commit a patch in f20 only.

Say master is at kexec-tools-2.0.4-23.0.fc21 and has a bunch more commits
on top.

FC21 forks off and has kexec-tools-2.0.4-23.0.fc21, and a patch needs to be
applied to FC21 only. Then one can do another release on master to avoid
any kind of upgradability conflicts.

master will be kexec-tools-2.0.4-24.0.fc22
FC21  will be kexec-tools-2.0.4-23.1.fc21

So I don't see why the above would not work.

IOW, it is better to use an extra field for versioning of released
branches to avoid any kind of conflicts with master, instead of
overloading the same release field for all the branches.

Not sure why more packages do not follow this extra-field approach. I am
trying to find out if anything is fundamentally wrong if we try to pursue
this scheme in the kexec-tools package.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct

Re: Query about package versioning

2014-02-20 Thread Vivek Goyal
On Thu, Feb 20, 2014 at 05:28:02PM +0100, Marcin Juszkiewicz wrote:
> On 20.02.2014 at 17:16, Vivek Goyal wrote:
> 
> > So instead of increasing the release number on released branches, why don't
> > we append an additional number after dist and bump that up in the released
> > branch? So FC21 releases will look like:
> > 
> >   kexec-tools-2.0.4-24.fc21.1
> >   kexec-tools-2.0.4-24.fc21.2
> >   ...
> >   ...
> >   kexec-tools-2.0.4-24.fc21.10
> > 
> > That way we clearly know that FC21 was forked off master release .24. And
> > upgradability of the package will be maintained without any chance of older
> > release versions getting ahead of newer release versions.
> 

> %dist should be at the end.

[ Can you please keep me on the "To" list? I don't want to scan the mailing
  list to figure out whether somebody responded to my mail or not. ]

Why should %dist be at the end? The html page I referenced previously mentions
that one can use an x.%{dist}.y kind of release number in select cases.

> 
> So rather kexec-tools-2.0.4-23.X.fc21 where X means x-th revision of
> fc21 package after distribution release.

I think this will work too, as 23 gets frozen in time and master and later
releases will always be higher. And that would not break upgradability.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct

Query about package versioning

2014-02-20 Thread Vivek Goyal
Hi All,

We are trying to sort out how to manage kexec-tools package version and
release numbers in fedora across various branches, hence this query.

I quickly went through the following.

https://fedoraproject.org/wiki/Packaging:NamingGuidelines#Package_Naming_and_Versioning_Guidelines

So far we do the following.

- In the master branch, keep on doing development and keep on bumping the
  release at regular intervals.

  kexec-tools-2.0.4-1.fc21
  kexec-tools-2.0.4-2.fc21
  kexec-tools-2.0.4-3.fc21
  ...
  ...
  kexec-tools-2.0.4-23.fc21

Let's say now FC21 forks off and a new branch is created. We keep on
bumping the release number in the FC21 branch also.

  kexec-tools-2.0.4-24.fc21
  kexec-tools-2.0.4-25.fc21
  ...
  ...
  kexec-tools-2.0.4-30.fc21

And master will continue. 
  kexec-tools-2.0.4-24.fc22
  kexec-tools-2.0.4-25.fc22
  ...
  ...
  kexec-tools-2.0.4-30.fc22

Are we doing the right thing here?

I have a few problems with the above model.

- By bumping the release version, we kind of lose the information about what
  our base was at the time the branch was forked.

- Release numbers of a released branch can get ahead of the master branch,
  depending on how many releases were done on master and how many were done
  on released branches.

So instead of increasing the release number on released branches, why don't
we append an additional number after dist and bump that up in the released
branch? So FC21 releases will look like:

  kexec-tools-2.0.4-24.fc21.1
  kexec-tools-2.0.4-24.fc21.2
  ...
  ...
  kexec-tools-2.0.4-24.fc21.10

That way we clearly know that FC21 was forked off master release .24. And
upgradability of the package will be maintained without any chance of older
release versions getting ahead of newer release versions.
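
One can sanity-check the ordering with rpmdev-vercmp from rpmdevtools
(invocation from memory; it accepts EVR strings):

  rpmdev-vercmp 2.0.4-24.fc21.1 2.0.4-24.fc22
  # 2.0.4-24.fc22 should compare as newer, so a later master/f22 build
  # still upgrades cleanly over the f21 update built with the extra digit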

Thanks
Vivek




-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-22 Thread Vivek Goyal
On Fri, Jul 19, 2013 at 06:08:48PM +0200, Florian Weimer wrote:

[..]
> Have you considered a non-cryptographic solution, like a physical
> presence check to (temporarily) disable Secure Boot so that the
> kexec restriction no longer applies?  This could be a fallback
> option if the original plan turns out to be too brittle/complex.

I think kyle has a patch which will allow disabling the secureboot
restriction if one is on the console. I will have to look into the details
and see how I can make use of it in the kexec code to relax signature
restrictions if the user is on the physical console.

[CC kyle]

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: How to create a new mailing list at lists.fedoraproject.org

2013-07-19 Thread Vivek Goyal
On Fri, Jul 19, 2013 at 07:22:55AM -0700, T.C. Hollingsworth wrote:
> Hi!
> 
> On Jul 19, 2013 7:12 AM, "Vivek Goyal"  wrote:
> >
> > Hi,
> >
> > I want to create a new mailing list for kexec/kdump-related discussions
> > in fedora. How do I go about it?
> >
> > I tried to create one here but it asks for the list creator's password.
> > I don't have any such password.
> >
> > So who is authorized to create the list, and can he/she create one for
> > me?
> 
> File a ticket against the "Mailing Lists" component in Fedora
> Infrastructure's trac instance:
> https://fedorahosted.org/fedora-infrastructure/

Thanks. Created one just now.

https://fedorahosted.org/fedora-infrastructure/ticket/3900

Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-19 Thread Vivek Goyal
On Thu, Jul 18, 2013 at 08:51:36PM +0200, Miloslav Trmač wrote:
> On Thu, Jul 11, 2013 at 1:40 PM, Jaroslav Reznik  wrote:
> > = Proposed System Wide Change: Enable kdump on secureboot machines =
> > https://fedoraproject.org/wiki/Changes/Kdump_with_secureboot
> 
> > == Detailed description ==
> > /sbin/kexec prepares a binary blob, called purgatory. This code runs at
> > priviliged level between kernel transition. With secureboot enabled, no
> > unsigned code should run at privilige level 0, hence kexec/kdump is 
> > currently
> > disabled if secureboot is enabled.
> >
> > One proposed way to solve the problem is to sign the /sbin/kexec utility. And
> > upon successful signature verification, allow it to load the kernel, initramfs,
> > and binary blob. /sbin/kexec will verify signatures of the kernel being loaded
> > before it asks the running kernel to load it.
> 
> For someone like me unfamiliar with kdump architecture, wouldn't it be
> possible to generate all relevant blobs (kdump kernel/initrd, ...) at
> kernel build time and sign them using essentially the existing module
> signing mechanism, and let the _kernel_ do all signature verification?
>  Then /sbin/kexec wouldn't have to be trusted at all.
> 
> > === Build and ship ima-evm-utils package ===
> > /sbin/kexec will be signed by evmctl. This utility will put an xattr
> > security.ima on /sbin/kexec file and kernel will leverage IMA 
> > infrastructure in
> > kernel to verify signature of /sbin/kexec upon execution.
> 
> (My motivation for the above question is that I view IMA (and any
> approach based on verifying only a pre-specified subset of files) as
> rather suspect, and that dm-verity makes much more sense to me for
> enforcing a "trusted base OS".

IIUC, dm-verity will just ensure that what was loaded from disk matches the
signature. It does not enforce any restrictions after that. That is, it does
not make sure that once signatures are verified, nothing unsigned can change
the address space.

So: disable ptrace() from unsigned processes, run the executable locked in
memory, etc.

IMA does not enforce this either. But it can possibly be enhanced to put
some metadata in the signature which says the file needs to be executed
locked in memory.

And IMA will allow me to do it per file, so we don't have to sign
everything which is on disk.

So for this use case, where we want to sign only one thing, there is
no need to make everything signed.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

How to create a new mailing list at lists.fedoraproject.org

2013-07-19 Thread Vivek Goyal
Hi,

I want to create a new mailing list for kexec/kdump-related discussions
in fedora. How do I go about it?

I tried to create one here but it asks for the list creator's password.
I don't have any such password.

So who is authorized to create the list, and can he/she create one for
me?

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: SSD cache

2013-07-15 Thread Vivek Goyal
On Mon, Jul 15, 2013 at 02:47:30PM +0300, Mihamina Rakotomandimby wrote:
> On 2013-07-15 12:56, Jaroslav Reznik wrote:
> >= Proposed System Wide Change: SSD cache =
> >https://fedoraproject.org/wiki/Changes/SSD_cache
> 
> One thing I would recommend would be to correctly detect SSD:
> ATM, I installed my Fedora over an SSD and it did not adjust the
> mount settings nor suggest an appropriate setting for the SSD.
> If we can at least detect SSD and suggest (just suggest) a setting
> it would be great.
> 

What are the right default settings for an SSD? From an IO scheduler
perspective, CFQ already reads the rotational flag and changes its behavior.
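
That flag is visible in sysfs; for example (the device name is just an
illustration):

  # 0 means non-rotational (SSD), 1 means rotational
  cat /sys/block/sda/queue/rotational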

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-12 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 04:46:42PM -0600, Stephen John Smoogen wrote:

[..]
> > Anyway, the USB case is interesting. I have to admit I have never tried
> > dumping to USB disk either. But in theory it should work.
> >
> >
> I tried USB direct dump and USB ext3. kdump said it could see the USB disk
> in the logs and then nothing would get written.

Ok, I just took a laptop (lenovo T61, yes it is old), installed F19,
and tried kdump (echo c > /proc/sysrq-trigger) in the following 3
configurations:

- Save to local disk (root, unencrypted).
- Save to a usb flash drive (4GB, ext4 file system)
- Save dump over ssh

All 3 worked for me. 

The display does get reset, but that happens very late and we don't see any
of the kernel messages. I see just the dracut and kdump messages.

If USB did not work for you, you can try passing rd.debug on the command line
(edit /etc/sysconfig/kdump) and also set "default shell" in
/etc/kdump.conf. Then, after failing to save the dump, you should be put in a
shell. You can look around for the usb device. Also, the debug output should
tell us where we are.
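
Concretely, the two tweaks I mean look something like this (the sysconfig
variable name is from memory; check the comments in the file on your
system):

  # /etc/kdump.conf -- drop into a shell if saving the vmcore fails
  default shell

  # /etc/sysconfig/kdump -- append rd.debug to the kdump kernel command line
  KDUMP_COMMANDLINE_APPEND="rd.debug"
  # then restart the kdump service so the kdump initramfs is rebuilt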

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-12 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 02:09:54PM -0700, Adam Williamson wrote:
> On Thu, 2013-07-11 at 14:13 -0400, Vivek Goyal wrote:
> 
> > I think this is a wrong impression. Kdump should work in Fedora. For a
> > long time I got the feedback that fedora users don't care about kdump
> > working. But I think kdump is an important debugging facility and is
> > very useful for enterprise distro. So at times it should be useful for
> > fedora users too.
> > 
> > So now I am trying to make sure things work well in Fedora.
> 
> FWIW, dedoimedo (a notoriously tough distro reviewer) mentioned to me in
> private correspondence that he's been hitting a kernel crash with F19
> with one of his test systems and he'd want to use kdump to try and
> diagnose it, only it doesn't work on live images, he says. Is that
> accurate? If so, can it be made to work on live images?

I have never tested kdump on live images. What's the problem he is facing?

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-11 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 08:22:17PM +0100, Matthew Garrett wrote:
> On Thu, Jul 11, 2013 at 03:10:07PM -0400, Vivek Goyal wrote:
> 
> > We will need a serial console to debug kdump issues. I am not expert
> > enough to figure out how to reset the graphical console without going
> > through the BIOS. Is there any reliable way to do that?
> 
> Make sure the kdump kernel has graphics drivers - they should be able to 
> reconfigure the device.

Ok, including graphics drivers in the initramfs should be doable. But this will
still not display messages on the console if early failures happen during the
transition to the second kernel and the drivers are not loaded yet.

> Or just pass the framebuffer offset, size, 
> stride and pixel format to the kdump kernel and have it treat it as an 
> unaccelerated linear framebuffer.

Ok, I will look into this. Thanks for the ideas though.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-11 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 12:42:16PM -0600, Stephen John Smoogen wrote:

[..]
> > > Issues I ran into was:
> > >
> > > 1) kdump needs to write to an unencrypted disk space. I tried a USB disk
> > > and various other places but the best ability I got was reinstalling the
> > > laptop and making a /var/crash partition.
> >
> > Is your root encrypted? USB should have worked. Otherwise try dumping
> > to NFS partition. Or ssh the dump out to a different machine. All of
> > these should work.
> >
> >
> The USB was the ones I tried but couldn't get to work correctly.  NFS and
> SSH were not going to work because the problem is with RHEL-5 talking over
> the bridge and my laptop has wireless.

[ I am ccing the devel list again, so that if people have ideas about how
  to get a serial console on a laptop, that will help. ]

What do you mean by "NFS and SSH were not going to work because the problem
is with RHEL-5 talking over the bridge"?

I have never tested kdump with wireless, as I always tried to make these
things work on servers and always assumed ethernet connectivity is there.

Anyway, the USB case is interesting. I have to admit I have never tried
dumping to a USB disk either. But in theory it should work.

Right now it does not work with encrypted disks. Given the fact that
dumping to the root disk is easiest on a laptop, I think it is reasonable
to try to make it work with encrypted disks.

With encrypted disks we don't know where to get the password. I think
we probably can just wait for the user to enter the password. But wait,
we have plenty of issues with display reset in the kdump environment. Kdump
might be working in the background while the display might be frozen or
showing garbage. That's why I always use a serial console for any kind
of debugging.

So until and unless we figure out a way to solve the display reset issues,
we can't expect a user to enter a password at a prompt, and supporting
encrypted disks is hard.

Anyway, as you said, in your case mounting an unencrypted disk/partition
and dumping to that partition is easiest.

> 
> 
> > > 2) kdump didn't seem to dump for anything than the forced dump in the
> > > instruction manual.
> >
> > You mean dump did not trigger after panic or it did not complete after
> > panic?
> >
> > If kdump kernel is loaded, and panic happens or oops happens and
> > panic_on_oops is set, we should transition in to second kernel and capture
> > dump.
> >
> >
> This did not happen. The system froze completely.

We need to have a serial console to debug things here. Without a console we
have no idea where things might have gone wrong.

> Power cycle was required
> and nothing was in /var/crash. This could be a problem with my setup but it
> was pretty much stock what the fedora web pages said to do. The
> system-config-kdump application didn't work when I tried it so I went to
> fedora-kernel and got the "we don't expect it to work, please try a rawhide
> kernel and see if it oopses" which it did.
> 
> 
> If it did not work, there must have been some kernel issue. Please open
> > bugs for these issues.
> >
> >
> OK what are you wanting to look for in a bug. At the moment I would just be
> opening the unhelpful bug of:
> 
> laptop freezes. no kdump is found.

We will need a serial console to debug kdump issues. I am not expert
enough to figure out how to reset the graphical console without going
through the BIOS. Is there any reliable way to do that?

Once we have the console going, then we can try different things like
enabling debugging messages in purgatory, enabling early printk, etc.,
to figure out where things failed.

Now, laptops don't have a serial port (most new ones). Are there any
usb-based gadgets which can help here? I don't know.

> 
> Which could be a setup issue on my part or a bunch of other stuff. Since
> the bug is still possible to trigger with Fedora 19+RHEL-5 guest, I can go
> through the steps again to see what needs to be done. I just need to know
> what they are.

So you are running RHEL-5 as a guest on an F19 host and trying to take a
dump of the host?

I think we need to solve the issue of how to get a serial console working
on a laptop to debug this issue.

Anybody, any ideas?

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-11 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 02:13:05PM -0400, Vivek Goyal wrote:

[..]

> how do we get kdump to be more useful?
> 
> I think testing and bug reporting will help. I would love to have kdump
> enabled by default in Fedora. But it eats around 128MB of memory by
> default, which just sits there unused, so a lot of people might not like
> it.

Right now there is no fedora-specific kdump list to discuss issues. If
there is enough interest, we can create a list just to discuss fedora
kdump issues and fixes.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-11 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 11:58:56AM -0600, Stephen John Smoogen wrote:
> On 11 July 2013 05:40, Jaroslav Reznik  wrote:
> 
> > = Proposed System Wide Change: Enable kdump on secureboot machines =
> > https://fedoraproject.org/wiki/Changes/Kdump_with_secureboot
> >
> > Change owner(s): Vivek Goyal 
> >
> > Currently kexec/kdump is disabled on machines with secureboot enabled. This
> > feature aims to enable kexec/kdump on such machines.
> >
> > == Detailed description ==
> > /sbin/kexec prepares a binary blob, called purgatory. This code runs at
> > privileged level during the kernel transition. With secureboot enabled, no
> > unsigned code should run at privilege level 0, hence kexec/kdump is
> > currently
> > disabled if secureboot is enabled.
> >
> >
> My question is "Does kdump work even without secureboot?"

Yes, it works. The kdump user space bits rotted for a while in Fedora and
did not get enough care and attention. But things have changed now.
We first put everything in Fedora and test it before we try to pull
the same bits into rhel.

> In trying to debug a crashing bug with older kernels and F19 I enabled kdump
> to try and
> get a crashing image for the developers to work. I ran into a ton of issues
> and when asking on #fedora-kernel was given the strong impression that
> kdump was not expected to work by the various kernel developers.

We have been testing F18 and F19 in our virtual machines and things work
for us.

I think this is a wrong impression. Kdump should work in Fedora. For a
long time I got the feedback that fedora users don't care about kdump
working. But I think kdump is an important debugging facility and is
very useful for an enterprise distro, so at times it should be useful for
fedora users too.

So now I am trying to make sure things work well in Fedora.

> 
> Issues I ran into was:
> 
> 1) kdump needs to write to an unencrypted disk space. I tried a USB disk
> and various other places but the best ability I got was reinstalling the
> laptop and making a /var/crash partition.

Is your root encrypted? USB should have worked. Otherwise try dumping
to NFS partition. Or ssh the dump out to a different machine. All of 
these should work.

> 2) kdump didn't seem to dump for anything than the forced dump in the
> instruction manual.

You mean the dump did not trigger after the panic, or it did not complete
after the panic?

If the kdump kernel is loaded, and a panic happens, or an oops happens and
panic_on_oops is set, we should transition into the second kernel and capture
the dump.
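
Two quick checks on the running system (standard kernel interfaces, shown
here just as an illustration):

  # 1 if a crash (kdump) kernel is currently loaded
  cat /sys/kernel/kexec_crash_loaded
  # make an oops escalate to a panic so kdump is triggered
  sysctl -w kernel.panic_on_oops=1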

If it did not work, there must have been some kernel issue. Please open
bugs for these issues.

> 
> In the end the kernel developers got me a kernel with enough oops detection
> to help find the problems but how do we get kdump to be more useful?

I think testing and bug reporting will help. I would love to have kdump
enabled by default in Fedora. But it eats around 128MB of memory by
default, which just sits there unused, so a lot of people might not like
it.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-11 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 11:45:34AM -0400, Steve Grubb wrote:
> On Thursday, July 11, 2013 10:33:05 AM Vivek Goyal wrote:
> > Secondly, there are disagreements upstream w.r.t how locking down
> > executable should happen. IMA folks want some functionality behind
> > security hooks (as opposed to what I have done). So I am expecting
> > that once patches do get merged upstream, they might be in little
> > different shape altogether.
> 
> I don't know if the average person has played with IMA. It hashes all files 
> being accessed depending on its policy. This is CPU intensive and will cause 
> the system fans to run faster and the system uses more power. It also runs 
> slower because of all the time spent hashing files. I reported this to 
> upstream 
> IMA developers a while back. I doubt anything has changed.

This overhead shows up only if one loads an IMA policy to do so. In my case
I have exported some appraisal functions from the IMA code so these can be
called directly by other kernel components. And I call these functions
from the elf loader code.

That way, in a regular configuration no hashing of all the files will
take place. Executables will be hashed only if they are signed and
only if the user has asked to run the executable locked down in memory. (I
have created a way so that one can put additional info in the security.ima
attribute to run the executable locked down in memory.)

So we just need to enable IMA, but for regular users I am not
expecting any significant overhead to show up. It will show up
only if users choose to load some IMA policy on the system.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-11 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 05:19:58PM +0200, Florian Weimer wrote:
> On 07/11/2013 04:33 PM, Vivek Goyal wrote:
> 
> >>I don't think it would make sense to add more and more
> >>Fedora-specific patches which implement security functionality.  I
> >>don't want Fedora to become the next Android.
> >
> >I don't see those patches going upstream in near term. First of all
> >base secureboot patches need to go upstream. And at this point of
> >time upstream does not seem to be eager to do anything more in the kernel
> >for secureboot.
> 
> The IMA stuff looks pretty independent of Secure Boot to me.  It
> seems upstream picked up some of it in 2.6.30.
> 

It is, but it implements stuff which is needed to meet TCB requirements.
The current implementation is nowhere near meeting the secureboot requirements.

For example, executables are not locked down in memory. That means that
after signature verification, if the executable gets swapped out, it can
be tweaked by root.

Any direct on-disk changes to a file are not detected. So modify
the disk blocks after signature verification and IMA will not
notice it.

Without secureboot we don't have any requirement where we need
the capability to extend the root of trust to user space applications. Current
IMA is not sufficient to meet the secureboot requirements, so pushing
more patches in because they are required to meet secureboot
requirements is hard. (Because a lot of people don't buy into the
theory of locking down the kernel in secureboot mode.)

So there is very little direct dependency on secureboot. It is
more a question of why you need these enhancements, and the answer is:
because secureboot will enforce that. Well, secureboot locking is not
even in the upstream kernel, and kdump is not even broken upstream, so why
are you bothering to push something in now?


> >Secondly, there are disagreements upstream w.r.t how locking down
> >executable should happen. IMA folks want some functionality behind
> >security hooks (as opposed to what I have done). So I am expecting
> >that once patches do get merged upstream, they might be in little
> >different shape altogether.
> 
> The important question is whether we can drop our own patches and
> switch to whatever upstream does when the time comes.

I think we should be able to do that. We are expecting only /sbin/kexec
to use this functionality, and if things change we can change /sbin/kexec
and the signing process, as we control everything.

I think the important thing will be to emphasize that other applications
should not try to latch on to the new keyctl() option to verify the
signature of a user space buffer, or the IMA functionality to put extra
data in the signature which locks down an executable in memory. These
are not stable interfaces and might disappear in the next fedora
release.

> 
> >I think now we can not back out. Merging secureboot patches without
> >these being upstream broke other subsystems (kdump, systemtap,..).
> 
> Sure.  But systemtap (and things like PCI passthrough) are
> fundamentally incompatible with our approach to Secure Boot.  There
> are conceptual challenges like irreversible software updates, too.
> At a certain point, we will have added so much inconvenience that
> only very few users will use this feature.  Upstream is not totally
> crazy in rejecting Fedora's restrictive approach.
> 

Is it time to revisit the decision of locking down the kernel on secureboot
machines, given the fact that upstream does not seem to like the idea?

What are other distributions planning to do? Are they carrying additional
patches, or have they decided to go the upstream way of not locking down
the kernel?

If others are going the upstream way, then is it worth it for fedora to
continue to be different while, in the process, we continue to add more
out-of-tree patches to make things like kdump work with secureboot?

> The proposed change goes in the right direction (more user control),
> but we're still missing things on the restrictive side (dbx updates,
> Fedora-local revocation checking, and others I'm not aware of).  And
> as long as Fedora and the more lenient distributions are under the
> same trust root, Fedora users get pretty close to zero additional
> security benefit, despite all the effort we put into this.
> 
> >Yes. I am going to use IMA for signature verification. These signatures
> >formats are very close to what is used for module signing (PKCS1.5 with
> >some metadata).
> >
> >For trust chain, we will still use secureboot trust chain and trust
> >an executable only if it has been signed by a key in .system_keyring.
> 
> Okay, that's a dependency on the Secure Boot patchset, but not
> Secure Boot as a technology.  Good to know.
> 
> >I have not written the code to check blacklist yet but I plan to do
> >that later.

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-11 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 10:53:42AM -0400, Bill Nottingham wrote:
> Jaroslav Reznik (jrez...@redhat.com) said: 
> > = Proposed System Wide Change: Enable kdump on secureboot machines =
> > https://fedoraproject.org/wiki/Changes/Kdump_with_secureboot
> > 
> > Change owner(s): Vivek Goyal 
> > 
> > Currently kexec/kdump is disabled on machines with secureboot enabled. This 
> > feature aims to enable kexec/kdump on such machines.
> 
> As a minor language nit, I would change the feature title itself to 'Allow
> kdump...' ; a quick reading of the title otherwise might imply that it would
> be enabled OOTB on such machines. Or maybe I'm being overly anal.

I changed the feature title to "Allow kdump on secureboot machines".

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: F20 System Wide Change: Enable kdump on secureboot machines

2013-07-11 Thread Vivek Goyal
On Thu, Jul 11, 2013 at 03:57:38PM +0200, Florian Weimer wrote:
> On 07/11/2013 01:40 PM, Jaroslav Reznik wrote:
> >=== Build and ship ima-evm-utils package ===
> >/sbin/kexec will be signed by evmctl. This utility will put an xattr
> >security.ima on /sbin/kexec file and kernel will leverage IMA infrastructure 
> >in
> >kernel to verify signature of /sbin/kexec upon execution.
> >
> >* There is a bz (807476) open for inclusion of this package since a long
> >time. Not sure what it is stuck on though.
> >
> >* There are some patches which are not upstream yet (like locking down the
> >executable in memory) which we need to carry in this package till the
> >patches get upstream.
> 
> Is there a chance this (and the other patches mentioned below)
> actually makes it in the kernel?  Are at least the VM changes part
> of upstream already?
> 
> I don't think it would make sense to add more and more
> Fedora-specific patches which implement security functionality.  I
> don't want Fedora to become the next Android.

I don't see those patches going upstream in the near term. First of all,
the base secureboot patches need to go upstream, and at this point
upstream does not seem eager to do anything more in the kernel for
secureboot.

Secondly, there are disagreements upstream w.r.t. how locking down
executables should happen. The IMA folks want some of the functionality
behind security hooks (as opposed to what I have done). So I am expecting
that once the patches do get merged upstream, they might look a little
different altogether.

I think we cannot back out now. Merging the secureboot patches before they
were upstream broke other subsystems (kdump, systemtap, ...). So either we
should now allow other non-upstream patches in to make these subsystems
work again, or we should remove the kernel lockdown functionality in
secureboot mode altogether (as upstream does not have it).

> 
> >=== Kernel Changes ===
> >Kernel needs to carry additional patches to do verify elf binary signature.
> >* There are patches to extend keyctl() so that user space can use it to 
> >verify
> >signature of a user buffer (vmlinuz in this case).
> >* These patches are not upstream, so these need to be carried in fedora till
> >patches get upstream.
> >* Kernel need to be signed using evmctl and detached signature need to be
> >generated. These signatures need to be installed on vmlinuz upon kernel rpm
> >installation in security.ima xattr.
> 
> Does this mean your implementation of signature checking will be
> completely independent of UEFI Secure Boot (unless you decide to use
> that to obtain the trust root)?

Yes. I am going to use IMA for signature verification. These signature
formats are very close to what is used for module signing (PKCS#1 v1.5
with some metadata).

For the trust chain, we will still use the secureboot trust chain and
trust an executable only if it has been signed by a key in .system_keyring.
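
To make the mechanics concrete, the signing step boils down to roughly the
following sketch (the key path is a placeholder, and the exact evmctl
options may differ between ima-evm-utils versions):

  # Rough sketch only; key path is a placeholder.
  import os
  import subprocess

  target = "/sbin/kexec"                 # binary to be signed
  privkey = "/path/to/ima-signing.key"   # placeholder private key

  # evmctl hashes the file, signs the hash and stores the result in the
  # file's security.ima extended attribute.
  subprocess.run(["evmctl", "ima_sign", "--key", privkey, target], check=True)

  # The kernel's IMA appraisal later reads this xattr on exec and checks
  # it against a key on the trusted keyring.
  sig = os.getxattr(target, "security.ima")
  print("security.ima is %d bytes" % len(sig))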

> 
> >=== Signing Key Management ===
> >Yet to be figured out. There are couple of ideas on table.
> >
> >* Embed few keys in kernel and one of these keys will be used to sign
> >/sbin/kexec. In case of a key is revoked, use a new key from set of embedded
> >keys.
> 
> How do you intend to handle revocation?

I have not written the code to check the blacklist yet, but I plan to do
that later.

If a key is revoked, I am expecting that we will request an update from
M$, and also push out a new version of /sbin/kexec signed with a new key.

We will need to have this new key signed with either the Red Hat
certificate or by M$ so that it can be loaded into an already running
kernel. That will make sure old /sbin/kexec instances don't run while new
instances signed with the new key do.

> 
> >* Ship a PE/COFF wrapped key in kexec-tools package. This PE/COFF binary
> >should be signed by appropriate authority so it can be loaded in system
> >keyring.
> 
> Who is the appropriate authority?

I am not sure yet. I was thinking: can we embed a Fedora/Red Hat
certificate in the kernel (like shim does) and use that as a CA key to
sign the /sbin/kexec key?

Signing a key with a CA key will require that key to be loaded
automatically on every reboot. I am not sure how that can be handled.
Maybe rpm packages can drop those keys in some directory and a system
service scans for keys and loads them on every reboot?
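
As a rough sketch of that idea (the drop-in directory is hypothetical, and
because .system_keyring is not writable from userspace, the user keyring
is used below purely as a stand-in to show the mechanics):

  import glob
  import subprocess

  KEY_DIR = "/etc/keys/kexec"   # hypothetical drop-in directory

  for cert in sorted(glob.glob(KEY_DIR + "/*.der")):
      with open(cert, "rb") as f:
          payload = f.read()
      # keyctl padd reads the key payload from stdin and adds it to the
      # given keyring as an asymmetric key.
      subprocess.run(["keyctl", "padd", "asymmetric", "", "@u"],
                     input=payload, check=True)
      print("loaded %s" % cert)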

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

What are the guidelines for F17 patches after beta stage

2012-04-19 Thread Vivek Goyal
Hi,

Dave Young is maintaining the kexec-tools package in Fedora. We are fixing
quite a few bugs and adding new features too (forward porting lots of
stuff from the rhel6 mkdumprd).

We are not very sure what we should commit to the F17 branch and what
should go on the master branch for F18. What are the guidelines here?

Thanks
Vivek

-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: /var/crash/* disappear after reboot

2012-04-11 Thread Vivek Goyal
On Wed, Apr 11, 2012 at 09:33:39AM -0400, Vivek Goyal wrote:
> On Wed, Apr 11, 2012 at 09:20:07AM +0200, Jiri Moskovcak wrote:
> 
> [..]
> > >I think vmcore should not be moved by ABRT. At max they can create
> > >a soft link to vmcore present in /var/crash.
> > >
> > 
> > - yes, we're going to teach it to use links
> > (https://fedorahosted.org/abrt/ticket/448)
> 
> Thanks. Just that you will need to use soft links instead of hard links.
> Not sure how you would deal with the stale soft link issue if the admin
> decides to move or delete the vmcore file.

I have opened an F17 bug to track this issue.

https://bugzilla.redhat.com/show_bug.cgi?id=811733

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: /var/crash/* disappear after reboot

2012-04-11 Thread Vivek Goyal
On Wed, Apr 11, 2012 at 09:20:07AM +0200, Jiri Moskovcak wrote:

[..]
> >I think vmcore should not be moved by ABRT. At max they can create
> >a soft link to vmcore present in /var/crash.
> >
> 
> - yes, we're going to teach it to use links
> (https://fedorahosted.org/abrt/ticket/448)

Thanks. Just that you will need to use soft links instead of hard links.
Not sure how you would deal with the stale soft link issue if the admin
decides to move or delete the vmcore file.

[..]
> Sorry for the troubles, we will fix it with next update (either fix
> the abrt vmcore plugin or disable the service if we wouldn't make it
> for F17)

Sounds good. Thanks for looking into it.

Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: /var/crash/* disappear after reboot

2012-04-10 Thread Vivek Goyal
On Tue, Apr 10, 2012 at 11:17:37AM +0200, Michal Toman wrote:
> On 2012/10/04 09:48, Cong Wang  wrote:
> >On 04/09/2012 05:32 PM, Dave Young wrote:
> >>On 04/09/2012 04:55 PM, Nikola Pajkovsky wrote:
> >>
> >>From kdump side of view, the vmcore should be there instead of being
> >>deleted, It's the default behaviour. Could the abrt keep them?
> >>
> >>I can not think out why it will delete them if user/customer intend to
> >>capture and save them vmcore there.
> >>
> >
> >Me neither. vmcore should be stored for further kernel debugging.
> 
> From kdump side of view nothing changes. It really writes the vmcore
> to /var/crash.
> 
> The "disappearing" happens after reboot, when abrt-vmcore service
> processes it. The vmcore itself is not deleted, but moved to
> /var/spool/abrt/vmcore-{whatever}/vmcore so that it can be processed
> like any other ABRT problem. You can still access it there, nothing
> is lost. In addition, ABRT provides tools to automatically install
> appropriate kernel debuginfo, extract the oops message and report it
> to Bugzilla.
> 
> If you still want the old behavior, you can simply disable the
> abrt-vmcore service and ABRT will not touch the vmcore at all.
> 

I think the vmcore should not be moved by ABRT. At most they can create
a soft link to the vmcore present in /var/crash.

I am surprised that abrt guys did not even communicate this decision
to kdump folks and just went ahead and decided to automatically move
vmcore.

Historically in RHEL the kernel vmcore has always been present in
/var/crash by default. Kdump allows the user to change the location and
save it either in a different directory, a different filesystem, or on a
different machine over the network. So first of all, assuming that after a
system crash the vmcore is present in /var/crash is broken.

Secondly, the user might have mounted a separate disk on /var/crash which
has sufficient space to store vmcores. Trying to move the vmcore to
/var/spool/ might fail due to lack of space.

Thirdly, it breaks the existing behavior. So abrt maintainers, please
change this behavior. Don't break things by default. I think hardcoding
the logic to look into /var/crash/ for vmcores, or creating a soft link,
should work for you. Even if that does not work for whatever reason,
please disable the abrt-vmcore service by default. This is a completely
unexpected change of behavior.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: new cg-manager gui tool for managin cgroups

2011-08-02 Thread Vivek Goyal
On Thu, Jul 21, 2011 at 06:36:55PM +0200, Lennart Poettering wrote:
> On Thu, 21.07.11 11:28, Vivek Goyal (vgo...@redhat.com) wrote:
> 
> > > It is already possible for different applications to use cgroups
> > > without stepping on each other, and without requiring every app
> > > to communicate with each other.
> > > 
> > > As an example, when it starts libvirt will look at what cgroup
> > > it has been placed in, and create the VM cgroups below this point.
> > > So systemd can put libvirtd in an arbitrary location and set an
> > > overall limits for the virtualization service, and it will cap
> > > all VMs. No direct communication between systemd & libvirt is
> > > required.
> > > 
> > > If applications similarly take care to honour the location in
> > > which they were started, rather than just creating stuff directly
> > > in the root cgroup, they too will interoperate nicely.
> > > 
> > > This is one of the nice aspects to the cgroups hierarchy, and
> > > why having tools/daemons which try to arbitrarily re-arrange
> > > cgroups systemwide are not very desirable IMHO.
> > 
> > This will work as long as somebody has done the top level setup and
> > planning. For example, if somebody is running bunch of virtual machines
> > and hosting some native applications and services also on the machine,
> > then he might decide that all the virt machines can only use 8 out of
> > 10 cpus and keep 2 cpus free for native services.
> > 
> > In that case an admin ought to be able to do this top level planning
> > before handing out control of sub-hierarchies to respective applications.
> > Does systemd allow that as of today?
> 
> Right now, systemd only offers you to place services in the cgroups of
> your choice, it will not configure any settings on those cgroups. (This
> is very likely to change soon though as there is a patch pending that
> makes a number of popular controls available as native config options in
> systemd.)
> 
> For the controllers like "cpuset" or the RT part of "cpu" where you
> assign resources from a limited pool we currently have no solution at
> all to deal with conflicts. Neither in libcgroup and friends, not in
> systemd, not in libvirt.

It is not just "cpuset" or the RT part of "cpu". This resource question
applies to simple things like cpu shares or blkio controller weights. For
example, one notion people seem to have is being able to view the division
of system resources in terms of percentages. Currently we don't have any
way to do that, and if we want to achieve it then one needs an overall
view of the hierarchy to be able to tell whether a certain group has got a
certain % of something or not. If there is a separate manager for each
part of the hierarchy, it is hard to do so.

So if we want to offer more sophisticated features to the admin, the
design becomes somewhat complex, and I am not sure whether it is worth it
or not.

Also, there is a question of what kind of interface should be exposed to a
user/admin when it comes to allocating resources to cgroups. Saying "give
a virtual machine/group a cpu weight of 512" does not mean much. If one
wants to translate this number into a certain %, one needs the global
view.
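
As a toy illustration of why a bare weight only means something relative
to its siblings (numbers made up):

  # The effective share of a group is its weight divided by the sum of
  # the weights of all busy siblings under the same parent.
  def effective_share(weights):
      total = sum(weights.values())
      return dict((name, 100.0 * w / total) for name, w in weights.items())

  print(effective_share({"vm1": 512, "vm2": 1024, "services": 512}))
  # -> vm1 25%, vm2 50%, services 25%, but only while all three are busy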

Similarly, absolute max limits, like those offered by controllers such as
blkio and cpu, might not make much sense if the parent has been throttled
to an even smaller limit.

All this raises the question of what the UI/command-line design should
look like for configuring cgroups/limits on various things like
users/services/virtual machines. Right now libvirt seems to allow
specifying the name of the guest domain and some cgroup parameters (cpu
shares, blkio weight, etc.) for that domain. Again, in a hierarchy,
specifying that does not mean anything in the absolute system picture
unless somebody has an overall view of the system.

This also raises the interesting question of how the cgroup interfaces of
other UIs in the system should evolve.

So I have lots of questions/concerns but do not have good answers.
Hopefully this discussion can lead to some of the answers.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: new cg-manager gui tool for managin cgroups

2011-07-21 Thread Vivek Goyal
On Thu, Jul 21, 2011 at 06:17:03PM +0200, Lennart Poettering wrote:
> On Thu, 21.07.11 15:52, Daniel P. Berrange (berra...@redhat.com) wrote:
> 
> > > I think its a question of do we want to make users go to a bunch of
> > > different front end tools, which don't communicate with each other to
> > > configure the system? I think it makes sense to have libvirt or
> > > virt-manager and systemd front-end be able to configure cgroups, but I
> > > think it would be also nice if they could know when the step on each
> > > other. I think it would also be nice if there was a way to help better
> > > understand how the various system components are making use of cgroups
> > > and interacting. I liked to see an integrated desktop approach - not one
> > > where separate components aren't communicating with each other. 
> > 
> > It is already possible for different applications to use cgroups
> > without stepping on each other, and without requiring every app
> > to communicate with each other.
> > 
> > As an example, when it starts libvirt will look at what cgroup
> > it has been placed in, and create the VM cgroups below this point.
> > So systemd can put libvirtd in an arbitrary location and set an
> > overall limits for the virtualization service, and it will cap
> > all VMs. No direct communication between systemd & libvirt is
> > required.
> 
> systemd (when run as the user) does exactly the same thing btw. It will
> discover the group it is urnning in, and will create all its groups
> beneath that.
> 
> In fact, right now the cgroup hierarchy is not virtualized. To make sure
> systemd works fine in a container we employ the very same logic here: if
> the container manager started systemd in specific cgroup, then system
> will do all its stuff below that, even if it is PID 1.

What does the cgroup hierarchy look like in the case of containers? I
thought libvirtd would do all the container management, and libvirtd is
one of the services started by systemd.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: new cg-manager gui tool for managin cgroups

2011-07-21 Thread Vivek Goyal
On Thu, Jul 21, 2011 at 04:15:46PM -0400, Jason Baron wrote:
> On Thu, Jul 21, 2011 at 05:53:13PM +0200, Lennart Poettering wrote:
> > On Thu, 21.07.11 16:36, Daniel P. Berrange (berra...@redhat.com) wrote:
> > 
> > > IIRC, you can already set cgroups configuration in the service's
> > > systemd unit file, using something like:
> > > 
> > >  ControlGroup="cpu:/foo/bar mem:/wizz"
> > > 
> > > though I can't find the manpage right now.
> > 
> > Yes, this is supported in systemd (though without the "").
> > 
> > It's (tersely) documented in systemd.exec(5).
> > 
> > My guess is that we'll cover about 90% or so of the usecases of cgroups
> > if we make it easy to assign cgroup-based limits to VMs, to system
> > services and to users, like we can do with libvirt and systemd. There's
> > still a big chunk of the 10% who want more complex setups with more
> > arbitrary groupings, but I have my doubts the folks doing that are the
> > ones who need a UI for that.
> > 
> 
> Ok, if systemd is planning to add all these knobs and buttons to cover all of
> these cgroups use cases, then yes, we probably don't need another management
> layer nor the UI. I guess we could worry about the more complex setups when
> we better understand what they are, once systemd has added more cgroup
> support. (I only recently understood that systemd was adding all this
> cgroups support). In fact, at that point we probably don't need
> libcgroup either.

So the model seems to be that there are various components which control
their own children. 

So at top level is systemd which controls users, services and libvirtd
and will provide interfaces/options to be able to configure cgroups
and resources for its children.

Similarly, libvirtd will provide interfaces/options to configure
cgroups/resources for its children (primarily VMs, and possibly containers
at some point).

And down the line, if another significant component comes along, it does
the same thing.

So every component defines its own syntax and interfaces for configuring
cgroups, and there is no global control. If somebody wants to manage the
system remotely, they have to decide what they want to control and then
use the API offered by the respective manager (systemd, libvirt, xyz)?

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: new cg-manager gui tool for managin cgroups

2011-07-21 Thread Vivek Goyal
On Thu, Jul 21, 2011 at 03:52:03PM +0100, Daniel P. Berrange wrote:
> On Thu, Jul 21, 2011 at 10:36:23AM -0400, Jason Baron wrote:
> > Hi,
> > 
> > On Wed, Jul 20, 2011 at 07:01:30PM -0400, Matthias Clasen wrote:
> > > On Wed, 2011-07-20 at 15:20 -0400, Jason Baron wrote:
> > > > Hi,
> > > > 
> > > > I've been working on a new gui tool for managing and monitoring 
> > > > cgroups, called
> > > > 'cg-manager'. I'm hoping to get people interested in contributing to 
> > > > this
> > > > project, as well as to add to the conversation about how cgroups should
> > > > be configured and incorporated into distros.
> > > > 
> > > 
> > > As a high-level comment, I don't think 'cgroup management' is a very
> > > compelling rationale for an end-user graphical tool.
> > > 
> > > For most people it will be much better to expose cgroup information in
> > > the normal process monitor. For people who want to use the specific
> > > cgroup functionality of systemd, it will be better to have that
> > > functionality available in a new service management frontend.
> > 
> > I've thought that displaying at least the cgroup that a process is part of 
> > would
> > be nice in the system monitor as well.
> > 
> > I think its a question of do we want to make users go to a bunch of
> > different front end tools, which don't communicate with each other to
> > configure the system? I think it makes sense to have libvirt or
> > virt-manager and systemd front-end be able to configure cgroups, but I
> > think it would be also nice if they could know when the step on each
> > other. I think it would also be nice if there was a way to help better
> > understand how the various system components are making use of cgroups
> > and interacting. I liked to see an integrated desktop approach - not one
> > where separate components aren't communicating with each other. 
> 
> It is already possible for different applications to use cgroups
> without stepping on each other, and without requiring every app
> to communicate with each other.
> 
> As an example, when it starts libvirt will look at what cgroup
> it has been placed in, and create the VM cgroups below this point.
> So systemd can put libvirtd in an arbitrary location and set an
> overall limits for the virtualization service, and it will cap
> all VMs. No direct communication between systemd & libvirt is
> required.
> 
> If applications similarly take care to honour the location in
> which they were started, rather than just creating stuff directly
> in the root cgroup, they too will interoperate nicely.
> 
> This is one of the nice aspects to the cgroups hierarchy, and
> why having tools/daemons which try to arbitrarily re-arrange
> cgroups systemwide are not very desirable IMHO.

This will work as long as somebody has done the top-level setup and
planning. For example, if somebody is running a bunch of virtual machines
and also hosting some native applications and services on the machine,
they might decide that all the virtual machines together can only use 8
out of 10 cpus, keeping 2 cpus free for native services.

In that case an admin ought to be able to do this top-level planning
before handing out control of sub-hierarchies to the respective
applications. Does systemd allow that as of today?

Secondly, not all applications are going to do their own cgroup
management. So we still need a common system-wide tool which can do the
monitoring to figure out problems, and also be able to do at least the
top-level resource planning before it allows applications to do their own
planning within their top-level group.

To allow that, I think systemd should either provide native configuration
capability or build on top of existing libcgroup constructs like cgconfig
and cgrules.conf to decide how an admin has planned the resource
management and in which cgroups services have to be launched, IMHO.
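
For illustration, the kind of top-level planning I mean would look roughly
like this, assuming cgroup v1 with the cpuset controller mounted at
/sys/fs/cgroup/cpuset (the group name and cpu ranges are made up):

  import os

  CPUSET = "/sys/fs/cgroup/cpuset"          # cgroup v1 mount point
  group = os.path.join(CPUSET, "virt")      # made-up top-level group

  os.makedirs(group, exist_ok=True)

  def write(path, value):
      with open(path, "w") as f:
          f.write(value)

  # Everything placed below "virt" is restricted to cpus 0-7, leaving
  # cpus 8-9 for native services.  cpuset.mems must also be set before
  # any task can be attached to the group.
  write(os.path.join(group, "cpuset.cpus"), "0-7")
  write(os.path.join(group, "cpuset.mems"), "0")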

> 
> > > The only role I could see for this kind of dedicated cgroup UI would be
> > > as a cgroup debugging aid, but is that really worth the effort,
> > > considering most cgroup developers probably prefer to use cmdline tools
> > > for the that purpose ?
> > > 
> > > 
> > 
> > The reason I started looking at this was b/c there were requests to be
> > able to use a GUI to configure cgroups. Correct me if I'm wrong, but  the 
> > answer
> > is go to the virt-manager gui, then the systemd front end, and then hand 
> > edit
> > cgrules.conf for custom rules. And then hope you don't start services in
> > the wrong order.
> 
> My point is that 'configure cgroups' is not really a task users would
> need to. Going to virt-manager GUI, then systemd GUI, and so on is not
> likely to be a problem in the real world usage, because the users tasks
> do not require that they go through touch every single cgroup on the
> system at once.
> 
> People who are using virtualization, will already be using virt-manager
> to configure their VMs, so of course they expect to be able to control
> the VMs's resource utilization from there, rather tha

Re: new cg-manager gui tool for managin cgroups

2011-07-21 Thread Vivek Goyal
On Thu, Jul 21, 2011 at 10:20:54AM -0400, Jason Baron wrote:

[..]
> > Quite frankly, I think cgrulesd is a really bad idea, since it applies
> > control group limits after a process is already running. This is
> > necessarily racy (and adds quite a burden too, since you ask for
> > notifications on each exec()). I'd claim that cgrulesd is broken by
> > design and cannot be fixed.
> 
> I'm not going to claim that cgrulesd is perfect, but in the case where
> you have untrusted users, you can start their login session in a
> cgroup, and they can't break out of it. I agree it can be racy in the
> case where you want to then further limit that user at run-time (fork
> vs. re-assignment race). Another point, is that the current situation
> can be no worse then the current unconstrained (no cgroup) case,
> especially when you take into account the fact that system services or
> 'trusted services' are going to be properly assigned. Perhaps, the
> authors of cgrulesd can further comment on this issue... 

Agreed that cgrulesd reacts after the event and can be racy. It is a
best-effort kind of situation. A more foolproof way is to launch the task
in the right cgroup to begin with, and that can be done with various other
available mechanisms:

- a pam plugin to put users in the right cgroup upon login
- the cgexec command line tool to launch tasks in the right cgroup
- applications making use of the libcgroup API to launch/fork tasks in
  the desired cgroup

If none of the above is being used, then cgrulesengd works in the
background on a best-effort basis to enforce the rules, and can easily be
turned off if need be.
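
For illustration, "launch the task in the right cgroup to begin with"
(what cgexec does) boils down to roughly this sketch (cgroup v1 paths,
made-up group name):

  import os
  import sys

  def run_in_cgroup(controller, group, argv):
      # cgroup v1 layout; the group is assumed to exist already.
      tasks = "/sys/fs/cgroup/%s/%s/tasks" % (controller, group)
      pid = os.fork()
      if pid == 0:
          # Child: move itself into the target cgroup, then exec, so the
          # new program starts life already inside the group.
          with open(tasks, "w") as f:
              f.write(str(os.getpid()))
          os.execvp(argv[0], argv)
      os.waitpid(pid, 0)

  if __name__ == "__main__":
      run_in_cgroup("cpu", "important", sys.argv[1:] or ["/bin/true"])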

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: new cg-manager gui tool for managin cgroups

2011-07-20 Thread Vivek Goyal
On Wed, Jul 20, 2011 at 11:07:14PM +0200, Lennart Poettering wrote:
> On Wed, 20.07.11 16:42, Vivek Goyal (vgo...@redhat.com) wrote:
> 
> > 
> > On Wed, Jul 20, 2011 at 10:28:32PM +0200, Lennart Poettering wrote:
> > 
> > [..]
> > > 
> > > > Right now, the gui assumes that the various hierarchies are mounted 
> > > > separately,
> > > > but that the cpu and cpuacct are co-mounted. Its my understanding that 
> > > > this
> > > > is consistent with how systemd is doing things. So that's great.
> > > 
> > > In F15 we mount all controllers enabled in the kernel separately. In F16
> > > we'll optionally mount some of them together, and we'll probably ship a
> > > default of mounting cpu and cpuacct together if they are both enabled.
> > 
> > Last time we talked about possibility of co-mounting memory and IO at some
> > point of time and you said it is a bad idea from applications programming
> > point of view.  Has that changed now?
> 
> Well, no, but yes.
> 
> After discussing this Dhaval the scheme we came up with is to add
> symlinks to /sys/fs/cgroup/ so that even when some controllers are
> mounted together they are still available at the the separate
> directories. Example, if we mount cpu+cpuacct together things will look
> like this:
> 
> /sys/fs/cgroup/cpu+cpuacct is the joint mount point.
> 
> /sys/fs/cgroup/cpu → /sys/fs/cgroup/cpu+cpuacct is a symlink.
> 
> /sys/fs/cgroup/cpuacct → /sys/fs/cgroup/cpu+cpuacct is a symlink.
> 
> That way to most applications it will be very easy to support this: they
> can simply assume that the controller "foobar" is available under
> /sys/fs/cgroup/foobar, and that's it.

I guess this will be reasonable. Just that applications need to handle the
case where the directory they are about to create might already be
present.
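
Roughly, what I expect applications would have to do is something like
this sketch (controller and group names are made up):

  import os

  def ensure_group(controller, group):
      # Follow the per-controller symlink to the real (possibly co-mounted)
      # hierarchy, and tolerate the group already existing because another
      # co-mounted controller's view created it.
      base = os.path.realpath("/sys/fs/cgroup/%s" % controller)
      path = os.path.join(base, group)
      os.makedirs(path, exist_ok=True)
      return path

  print(ensure_group("cpu", "myapp"))
  print(ensure_group("cpuacct", "myapp"))   # same directory if co-mounted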

So down the road we should be able to co-mount memory and IO together,
with additional symlinks?

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: new cg-manager gui tool for managin cgroups

2011-07-20 Thread Vivek Goyal
On Wed, Jul 20, 2011 at 10:28:32PM +0200, Lennart Poettering wrote:

[..]
> systemd is and will always have to maintain its own hierarchy
> independently of everybody else.

In the presentation today, you mentioned that you would like to create
cgroups for users by default in the cpu hierarchy (once the RT time
allocation issue is resolved). I am wondering what happens if an admin
wants to change the policy a bit, say give higher cpu shares to a specific
user.

Generally one should be able to do this with the help of a GUI tool: show
the system view and allow changing the parameters (which are persistent
across reboots).

How is one supposed to do that? It looks like part of the control lies
with systemd (as it is the one that creates the user group under cpu) and
part of the control lies with the GUI tool/cgconfig.

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: new cg-manager gui tool for managin cgroups

2011-07-20 Thread Vivek Goyal
On Wed, Jul 20, 2011 at 10:28:32PM +0200, Lennart Poettering wrote:

[..]
> 
> > Right now, the gui assumes that the various hierarchies are mounted 
> > separately,
> > but that the cpu and cpuacct are co-mounted. Its my understanding that this
> > is consistent with how systemd is doing things. So that's great.
> 
> In F15 we mount all controllers enabled in the kernel separately. In F16
> we'll optionally mount some of them together, and we'll probably ship a
> default of mounting cpu and cpuacct together if they are both enabled.

Last time we talked about the possibility of co-mounting memory and IO at
some point, and you said it was a bad idea from an application programming
point of view. Has that changed now?

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: New feature for Fedora 16: new mkdumprd

2011-07-14 Thread Vivek Goyal
On Thu, Jul 14, 2011 at 06:35:05AM -0400, Josh Boyer wrote:
> On Wed, Jul 13, 2011 at 11:08 PM, Américo Wang  
> wrote:
> > On Wed, Jul 13, 2011 at 9:52 PM, Tomasz Torcz  wrote:
> >> On Wed, Jul 13, 2011 at 03:50:07PM +0200, Tomasz Torcz wrote:
> >>> > > Would any of you kindly help to review my proposed feature
> >>> > > for Fedora 16?
> >>> > >
> >>> > > https://fedoraproject.org/wiki/Features/NewMkdumprd
> >>> >
> >>>   Feature proposal dealing was yesterday?
> >>
> >>  deadline, of course.
> >>
> >
> > Am I too late? I made the page before the deadline, just sent
> > it out a little late. :-/
> 
> The deadline is for FESCo to review it by, so yes you're a little late.
> 
> However, you can still do the work and get it into F16.  The only
> thing you won't get is some possible marketing as a feature.  You can
> still get it mentioned in the release notes though.

That's fine. This feature is not introducing any new functionality as
such; it is the old kdump feature. We are just changing the way we
generate the initrd/initramfs for the kdump kernel. We thought it would be
a better idea to use dracut to generate the initramfs for kdump too, and
drop "mkdumprd".

Thanks
Vivek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Fwd: Re: [ANNOUNCE] bfq disk I/O scheduler

2010-08-04 Thread Vivek Goyal
My response got rejected by the list. Now I have subscribed to the list.
Hopefully it will go through this time.
Vivek

- Forwarded message from Vivek Goyal  -

Date: Wed, 4 Aug 2010 15:18:45 -0400
From: Vivek Goyal 
To: Paolo Valente 
Cc: Development discussions related to Fedora ,
Jeff Moyer , Ric Wheeler 
Subject: Re: [ANNOUNCE] bfq disk I/O scheduler
User-Agent: Mutt/1.5.20 (2009-12-10)
Message-ID: <20100804191845.gd11...@redhat.com>

On Wed, Aug 04, 2010 at 02:44:21PM -0400, Ric Wheeler wrote:
> On 08/04/2010 01:58 PM, Paolo Valente wrote:
> >Hi,
> >I have been working for a few years (with Fabio Checconi) on a disk
> >scheduler providing definitely lower latencies than cfq, as well as a
> >higher throughput with most of the test workloads we used (or the same
> >throughput as cfq with the other workloads). We named this scheduler
> >bfq (budget fair queueing). I hope this is the right list for announcing
> >this work.
> >
> >One of the things we measured in our tests is the cold-cache execution
> >time of a command as, e.g., "bash -c exit", "xterm /bin/true" or
> >"konsole -e /bin/true", while the disk was also accessed by different
> >combinations of sequential, or random, readers and/or
> >writers. Depending on which of these background workloads was used,
> >these execution times were five to nine times lower with bfq under
> >2.6.32. Under 2.6.35 they were instead from six to fourteen times
> >lower. The highest price paid for these lower latencies was a 20% loss
> >of aggregated disk throughput for konsole in case of background
> >workloads made only of sequential requests (due to the fact that bfq
> >of course privileges, more than cfq, the seeky IO needed to load
> >konsole and its dependencies). In contrast, with shorter commands, as
> >bash or xterm, bfq also provided up to 30% higher aggregated
> >throughput.
> >
> >We saw from 15% to 30% higher aggregated throughput also in our
> >only-aggregated-throughput tests. You can find in [1] all the details
> >on our tests and on other nice features of bfq, such as the fact that
> >it perfectly distributes the disk throughput as desired, independently
> >of disk physical parameters like, e.g., ZBR. in [1] you can also find
> >a detailed description of bfq and a short report on the maturity level
> >of the code (TODO list), plus all the scripts used for the tests.
> >
> >The results I mentioned so far have been achieved with the last
> >version of bfq, released about two months ago as patchsets for 2.6.33
> >or 2.6.34. From a few days a patchset for 2.6.35 is available too, as
> >well as a backport to 2.6.32. The latter has been prepared by Mauro
> >Andreolini, who also helped me a lot with debugging. All these patches
> >can be found here [2]. Mauro also built a binary kernel package for
> >current lucid, and hosted it into a PPA, which can be found here [3].
> >
> >A few days after being released, this version of bfq has been
> >introduced as the default disk scheduler in the Zen Kernel. It has
> >been adopted as the default disk scheduler in Gentoo Linux too. I
> >also recorded downloads from users with other distributions, as, e.g.,
> >Ubuntu and ArchLinux. As of now we received only positive feedbacks
> >from the users.
> >
> >Paolo
> >
> >[1] http://algo.ing.unimo.it/people/paolo/disk_sched/
> >[2] http://algo.ing.unimo.it/people/paolo/disk_sched/sources.php
> >[3] Ubuntu PPA: ppa:mauro-andreolini/ubuntu-kernel-bfq
> >
> 
> Hi Paolo,
> 
> Have you tried to post this to the upstream developers of CFQ and IO 
> schedulers?
> 

Hi Paolo,

Last time when BFQ was discussed on lkml, Jens did not want one more IO
scheduler and wanted CFQ to be fixed. So are you not convinced that CFQ
can be fixed to achieve similar results as BFQ?

IIRC, BFQ had two main things.

- B-WF2Q+ algorithm
- Budget allocation in terms of sectors and not time.

I am not sure that in practice we really need the kind of fairness and
accuracy B-WF2Q+ tries to provide. The reason is that WF2Q+ assumes
continuous IO traffic from queues, and on desktop systems I have seen that
most of the things we care about do a small amount of random read/write
and go away. Only sequential readers keep the IO queue busy for longer
intervals.

IOW, I thought that one can fix the latency issues just by reducing the
time slice length and by giving preference to random readers/writers.
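
A rough back-of-the-envelope for that argument (numbers made up, ignoring
idling and preemption details):

  # With N busy queues and a slice of S ms, a newly arriving seeky reader
  # can wait on the order of (N - 1) * S ms before it gets serviced again.
  def worst_case_wait_ms(busy_queues, slice_ms):
      return (busy_queues - 1) * slice_ms

  print(worst_case_wait_ms(5, 100))   # 400 ms with a long slice
  print(worst_case_wait_ms(5, 20))    # 80 ms with a much shorter slice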

I am wondering what different design philosophy BFQ brings that cannot be
employed in CFQ to fix these things.

Thanks
Vivek

- End forwarded message -
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel