from:"Chris Wright"

[Qemu-devel] Virtualization DevRoom at FOSDEM 2013

2012-11-16 Thread Chris Wright

Following on the heels of a successful KVM Forum and oVirt Workshop,
FOSDEM will be hosting a Virtualization DevRoom in February.  If you've
been to FOSDEM before, you know this is about developers and code, not
products.

Presentation proposals are due by December 16th 2012.

The full details are here:

 http://osvc.v2.cs.unibo.it/index.php/Main_Page

With the relevant topics being:

"Topics covered will include, but not limited to:
 - machine virtualization (e.g. KVM, Xen, VirtualBox,...)
 - network virtualization (e.g. openvstack, vale, vde, Open vSwitch,...)
 - process level virtualization, flexible kernels (e.g. rump anykernel,
   view-os, ...)
 - virt management (e.g. ganeti, libvirt, ovirt, XCP, ...)"

thanks,
-chris

Re: [Qemu-devel] [PATCH] Add nvram to default boot device list

2012-10-11 Thread Chris Wright

* Alexander Graf (ag...@suse.de) wrote:
> On 12.10.2012, at 02:28, David Gibson wrote:
> > On Fri, Oct 12, 2012 at 02:03:00AM +0200, Alexander Graf wrote:
> >> On 12.10.2012, at 00:59, David Gibson wrote:
> >>> On Thu, Oct 11, 2012 at 07:34:42AM +0530, Avik Sil wrote:
> >>>> This patch adds nvram specified boot device into qemu default
> >>>> boot_devices list. This helps firmware to boot from nvram specified
> >>>> boot device if no -boot option is specified.
> >>> 
> >>> I really don't think this is a good idea, it extends an already
> >>> deprecated mechanism in a fuzzy way and requires careful checking to
> >>> see if it could break anything.  On all platforms the boot sequence
> >>> should be:
> >>>   if bootindex is specified:
> >>>       boot according to bootindex
> >>>   else if -boot is specified:
> >>>       boot according to -boot sequence
> >>>   else:
> >>>       use platform firmware default sequence
> >>> 
> >>> The last will of course vary by platform, and could depend on platform
> >>> details like the contents of NVRAM.  Your original idea of making it
> >>> clear to the guest when -boot has been specified (as opposed to when
> >>> it contains its default value) was the right one, and this "x" in
> >>> -boot is going the wrong direction.
> >> 
> >> Given that this is a fundamental direction for a bunch of machines,
> >> how about we talk about it on the weekly QEMU call?
> > 
> > Uh, is this a call I know about?
> 
> I would hope so. Chris / Juan, who is in charge of the phone numbers these
> days?

Added David to the invite which contains the call details (very
unfriendly time for .au I'm afraid, 14:00 UTC).
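
The precedence David lays out in the quoted text can be sketched in a few
lines of Python. This is only an illustration of the ordering, not QEMU code:
the function name resolve_boot_order and its three arguments are hypothetical.
It also shows why the "-boot was not given" case has to be distinguishable
from "-boot was given with its default value".

```python
from typing import Optional, Sequence


def resolve_boot_order(bootindex_devices: Sequence[str],
                       boot_option: Optional[str],
                       firmware_default: Sequence[str]) -> list:
    """Return the boot order using the precedence described in the thread.

    bootindex_devices: devices carrying an explicit bootindex, already
        sorted by that index (empty if none was specified).
    boot_option: the -boot string, or None when -boot was not given at all.
    firmware_default: whatever the platform firmware would pick on its own
        (for example from NVRAM contents).
    """
    if bootindex_devices:
        return list(bootindex_devices)
    if boot_option is not None:
        # One letter per device class, e.g. "cd" is hard disk, then CD-ROM.
        return list(boot_option)
    return list(firmware_default)
```

With boot_option=None the firmware default (possibly NVRAM driven) wins, so
no extra marker inside the -boot string is needed in this model.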

[Qemu-devel] KVM Forum 2012 Call For Participation

2012-08-02 Thread Chris Wright

For some reason I'm not seeing this on the qemu list, so here's a fwd

----- Forwarded message from KVM Forum 2012 Program Committee -----

Date: Fri, 27 Jul 2012 16:31:45 -0700
From: KVM Forum 2012 Program Committee 
To: k...@vger.kernel.org, libvir-l...@redhat.com, qemu-devel@nongnu.org,
virtualizat...@lists.linux-foundation.org
Cc: kvm-forum-2012...@redhat.com
Subject: KVM Forum 2012 Call For Participation

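=================================================================
KVM Forum 2012: Call For Participation
November 7-9, 2012 - Hotel Fira Palace - Barcelona, Spain

(All submissions must be received before midnight Aug 31st, 2012)
=================================================================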

KVM is an industry leading open source hypervisor that provides
an ideal platform for datacenter virtualization, virtual desktop
infrastructure, and cloud computing.  Once again, it's time to bring
together the community of developers and users that define the KVM
ecosystem for our annual technical conference.  We will discuss the
current state of affairs and plan for the future of KVM, its surrounding
infrastructure, and management tools.  We are also excited to announce
the oVirt Workshop will run in parallel with the KVM Forum, bringing in
a community focused on enterprise datacenter virtualization management
built on KVM.  For topics which overlap we will have shared sessions.
So mark your calendar and join us in advancing KVM.

http://events.linuxfoundation.org/events/kvm-forum/

Once again we are colocated with The Linux Foundation's LinuxCon.
Based on feedback from last year, this time it's LinuxCon Europe!
KVM Forum attendees will be able to attend oVirt Workshop sessions and
are eligible to attend LinuxCon Europe for a discounted rate.

http://events.linuxfoundation.org/events/kvm-forum/register

We invite you to lead part of the discussion by submitting a speaking
proposal for KVM Forum 2012.

http://events.linuxfoundation.org/cfp

Suggested topics:

 KVM
 - Scaling and performance
 - Nested virtualization
 - I/O improvements
 - PCI device assignment
 - Driver domains
 - Time keeping
 - Resource management (cpu, memory, i/o)
 - Memory management (page sharing, swapping, huge pages, etc)
 - VEPA, VN-Link, vswitch
 - Security
 - Architecture ports

 QEMU
 - Device model improvements
 - New devices and chipsets
 - Scaling and performance
 - Desktop virtualization
 - Spice
 - Increasing robustness and hardening
 - Security model
 - Management interfaces
 - QMP protocol and implementation
 - Image formats
 - Firmware (SeaBIOS, OVMF, UEFI, etc)
 - Live migration
 - Live snapshots and merging
 - Fault tolerance, high availability, continuous backup
 - Real-time guest support

 Virtio
 - Speeding up existing devices
 - Alternatives
 - Virtio on non-Linux or non-virtualized

 Management infrastructure
 - oVirt (shared track w/ oVirt Workshop)
 - Libvirt
 - KVM autotest
 - OpenStack
 - Network virtualization management
 - Enterprise storage management

 Cloud computing
 - Scalable storage
 - Virtual networking
 - Security
 - Provisioning

SUBMISSION REQUIREMENTS

Abstracts due: Aug 31st, 2012
Notification: Sep 14th, 2012

Please submit a short abstract (~150 words) describing your presentation
proposal.  In your submission please note how long your talk will take.
Slots vary in length up to 45 minutes.  Also include in your proposal
the proposal type -- one of:

- technical talk
- end-user talk
- birds of a feather (BOF) session

Submit your proposal here:

http://events.linuxfoundation.org/cfp

You will receive a notification whether or not your presentation proposal
was accepted by Sep 14th.

END-USER COLLABORATION

One of the big challenges as developers is to know what, where and how
people actually use our software.  We will reserve a few slots for end
users talking about their deployment challenges and achievements.

If you are using KVM in production you are encouraged to submit a speaking
proposal.  Simply mark it as an end-user collaboration proposal.  As an
end user, this is a unique opportunity to get your input to developers.

BOF SESSION

We will reserve some slots in the evening after the main conference
tracks, for birds of a feather (BOF) sessions. These sessions will be
less formal than presentation tracks and targeted for people who would
like to discuss specific issues with other developers and/or users.
If you are interested in getting developers and/or users together to
discuss a specific problem, please submit a BOF proposal.

LIGHTNING TALKS

In addition to submitted talks we will also have some room for lightning
talks. These are short (5 minute) discussions to highlight new work or
ideas that aren't complete enough to warrant a full presentation slot.
Lightning talk submissions and scheduling will be handled on-site at
KVM Forum.

HOTEL / TRAVEL

The KVM Forum 2012 will be held in Barcelona, Spain at the Hotel Fira Palace.

http://events.linuxfoundation.org/events/kvm-forum

Re: [Qemu-devel] QEMU hacking session/day at KVM Forum 2012?

2012-07-30 Thread Chris Wright

* Anthony Liguori (aligu...@us.ibm.com) wrote:
> Peter Maydell  writes:
> > Last year at KVM Forum, in addition to the scheduled talks we also
> > had an informal hacking session on one of the following days, since
> > we were colocated with LinuxCon NA and most people were still around
> > afterwards.
> >
> > I thought this was really useful and I think it would be good if we
> > could arrange something similar this year. This year we're colo'd
> > with LinuxCon Europe (which is 5th-7th November with KVM Forum being
> > 7th-9th), so I guess that one of the two days beforehand might
> > be usable.
> 
> Do you think we can get a room at some point before/after the main
> conference for a hackathon?  Would be interesting to try and combine it
> with an oVirt hackathon too and get everyone in the same room.

We have one extra room (moderate capacity) that can be used Thur/Fri
ad-hoc.  I think we can get that space earlier in the week as well.

thanks,
-chris

Re: [Qemu-devel] QEMU was not selected for Google Summer of Code this year

2012-03-17 Thread Chris Wright

* Natalia Portillo (clau...@claunia.com) wrote:
> QEMU hosted on Haiku would be interesting.

The fun of Haiku
especially when it is
hosting QEMU

Re: [Qemu-devel] [RFC] Next gen kvm api

2012-02-07 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> On 02/07/2012 07:18 AM, Avi Kivity wrote:
> >On 02/07/2012 02:51 PM, Anthony Liguori wrote:
> >>On 02/07/2012 06:40 AM, Avi Kivity wrote:
> >>>On 02/07/2012 02:28 PM, Anthony Liguori wrote:
> 
> >It's a potential source of exploits
> >(from bugs in KVM or in hardware). I can see people wanting to be
> >selective with access because of that.
> 
> As is true of the rest of the kernel.
> 
> If you want finer grain access control, that's exactly why we have things
> like LSM and SELinux. You can add the appropriate LSM hooks into the KVM
> infrastructure and setup default SELinux policies appropriately.
> >>>
> >>>LSMs protect objects, not syscalls. There isn't an object to protect here
> >>>(except the fake /dev/kvm object).
> >>
> >>A VM can be an object.
> >
> >Not really, it's not accessible in a namespace. How would you label it?

A VM, vcpu, etc are all objects.  The labelling can be implicit based on
the security context of the process creating the object.  You could create
simplistic rules such as a process may have the ability KVM__VM_CREATE
(this is roughly analogous to the PROC__EXECMEM policy control that
allows some processes to create executable writable memory mappings, or
SHM__CREATE for a process that can create a shared memory segment).
Adding some label mgmt to the object (add ->security and some callbacks to
do ->alloc/init/free), and then checks on the object itself would allow
for finer grained protection.  If there was any VM lookup (although the
original example explicitly ties a process to a vm and a thread to a
vcpu) the finer grained check would certainly be useful to verify that
the process can access the VM.
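
A rough model of the two checks described above, written in Python for
brevity. Everything here is hypothetical: the class names, the policy
representation, and the permission strings are made up for illustration and
do not correspond to existing LSM hooks or SELinux permissions. It only shows
the shape of the idea: a coarse "may create" check, a label inherited from
the creating process, and a finer grained check on the object.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

# Hypothetical policy: (subject label, target, permission) triples.
Policy = Set[Tuple[str, str, str]]


@dataclass
class Process:
    label: str                       # security context of the task


@dataclass
class VM:
    security: Optional[str] = None   # plays the role of a ->security field


def vm_create(proc: Process, policy: Policy) -> VM:
    # Coarse check: may this process create VMs at all?
    if (proc.label, "kvm", "vm_create") not in policy:
        raise PermissionError("vm_create denied for " + proc.label)
    # Implicit labelling: the new object inherits the creator's context.
    return VM(security=proc.label)


def vm_access(proc: Process, vm: VM, policy: Policy) -> bool:
    # Finer grained check on the object itself, e.g. after a VM lookup.
    return (proc.label, vm.security, "vm_access") in policy
```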

> Labels can originate from userspace, IIUC, so I think it's possible for QEMU
> (or whatever the userspace is) to set the label for the VM while it's
> creating it. I think this is how most of the labeling for X and things of
> that nature works.

For X, the policy enforcement is done in the X server.  There is
assistance from the kernel for doing policy server queries (can foo do
bar?), but it's up to the X server to actually care enough to ask and
then fail a request that doesn't comply.  I'm not sure that's the model
here.

thanks,
-chris

Re: [Qemu-devel] [RFC] QEMU Code Audit Team

2012-01-06 Thread Chris Wright

* Anthony Liguori (aligu...@us.ibm.com) wrote:
> 2) Two people walk through a particular piece of code and
> independently flag anything that looks like a potential security
> issue.

Auditing is always helpful, but won't ever get full coverage.  qtest +
fuzz is another great way to identify problems.  Also improving any
annotations to help static analysis tools is useful.  And both of those
are development efforts rather than code review.  Trouble with code
review is that security bugs can be subtle and easy to miss.

> I'd want to focus initially on the common PC devices.   The list
> isn't all that large and a review like this should only take a few
> hours to complete each step.

I definitely agree on the initial scope.

thanks,
-chris

Re: [Qemu-devel] [RFC] QEMU Code Audit Team

2012-01-06 Thread Chris Wright

* Corey Bryant (cor...@linux.vnet.ibm.com) wrote:
> Count me in for step 2.  A good approach may be to run a static
> analysis tool against the code, followed by a manual scan of the
> code for common vulnerabilities that static analysis can't find.

Good idea.  Folks are already running things like Coverity.  The false
positive rate is high enough that it's a lot to wade through at first
(so extra eyes could be quite helpful here).  Perhaps the people who
are involved in this could share some of their findings.

thanks,
-chris

Re: [Qemu-devel] KVM Call Agenda for 12/6 (Tuesday) @ 10am US/Eastern

2011-12-05 Thread Chris Wright

* Chris Wright (chr...@redhat.com) wrote:
> * Anthony Liguori (aligu...@us.ibm.com) wrote:
> > 1. A short introduction to each of the guest agents, what guests they
> > support, and what verbs they support.
> 
> I think we did this once before w/ Matahari.  Can we please capture
> these things in email before the call, so people actually have time
> to ponder the details.
> 
> > 2. A short description of key requirements from each party (oVirt, libvirt,
> > QEMU) for a guest agent
> 
> Same here...call this the abstract/intro of the above detailed list of
> verbs and guest support, and send it by Friday this week.
> 
> I know there's plenty of details buried in the current thread and old
> discussions of Matahari.  But that's just it...buried...

It's past Friday.  Barak's links are all we have so far...

thanks,
-chris

Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding

2011-11-30 Thread Chris Wright

* Peter Zijlstra (a.p.zijls...@chello.nl) wrote:
> On Wed, 2011-11-30 at 21:52 +0530, Dipankar Sarma wrote:
> > 
> > Also, if at all topology changes due to migration or host kernel decisions,
> > we can make use of something like VPHN (virtual processor home node)
> > capability on Power systems to have guest kernel update its topology
> > knowledge. You can refer to that in
> > arch/powerpc/mm/numa.c. 
> 
> I think that fail^Wfeature of PPC is terminally broken. You simply
> cannot change the topology after the fact. 

Agreed, there's too many things that consult topology once and never
look back.

Re: [Qemu-devel] KVM Call Agenda for 12/6 (Tuesday) @ 10am US/Eastern

2011-11-30 Thread Chris Wright

* Anthony Liguori (aligu...@us.ibm.com) wrote:
> Hi,
> 
> I'd like to propose that we discuss guest agent convergence in our next KVM
> call.  I've CC'd folks from oVirt and libvirt to join the discussion.
> 
> I think we should probably attempt to have some structure to the discussion.
> I would suggest:
> 
> 1. A short introduction to each of the guest agents, what guests they
> support, and what verbs they support.

I think we did this once before w/ Matahari.  Can we please capture
these things in email before the call, so people actually have time
to ponder the details.

> 2. A short description of key requirements from each party (oVirt, libvirt,
> QEMU) for a guest agent

Same here...call this the abstract/intro of the above detailed list of
verbs and guest support, and send it by Friday this week.

I know there's plenty of details buried in the current thread and old
discussions of Matahari.  But that's just it...buried...

> 3. An open discussion about possible ways to collaborate/converge.

That should really help facilitate this item ;)

thanks,
-chris

Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding

2011-11-21 Thread Chris Wright

* Peter Zijlstra (a.p.zijls...@chello.nl) wrote:
> On Mon, 2011-11-21 at 21:30 +0530, Bharata B Rao wrote:
> > 
> > In the original post of this mail thread, I proposed a way to export
> > guest RAM ranges (Guest Physical Address-GPA) and their corresponding
> > host virtual mappings (Host Virtual Address-HVA) from QEMU (via QEMU
> > monitor).
> > The idea was to use this GPA to HVA mappings from tools like libvirt to bind
> > specific parts of the guest RAM to different host nodes. This needed an
> > extension to existing mbind() to allow binding memory of a process(QEMU)
> > from a different process(libvirt). This was needed since we wanted to do
> > all this from libvirt.
> > 
> > Hence I was coming from that background when I asked for extending
> > ms_mbind() to take a tid parameter. If QEMU community thinks that NUMA
> > binding should all be done from outside of QEMU, it is needed, otherwise
> > what you have should be sufficient. 
> 
> That's just retarded, and no you won't get such extentions. Poking at
> another process's virtual address space is just daft. Esp. if there's no
> actual reason for it.

Need to separate the binding vs the policy mgmt.  The policy mgmt could
still be done outside, whereas the binding could still be done from w/in
QEMU.  A simple monitor interface to rebalance vcpu memory allocations
to different nodes could very well schedule vcpu thread work in QEMU.

So, I agree, even if there is some external policy mgmt, it could still
easily work w/ QEMU to use Peter's proposed interface.
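
To make the split concrete, here is a small Python sketch under stated
assumptions. The function name rebalance, the placement dictionary, and
bind_fn are all hypothetical; bind_fn merely stands in for whatever kernel
interface ends up being used and is not a real API. The point is only that
the placement (policy) can come from outside while the bind call itself runs
inside the process that owns the memory.

```python
from typing import Callable, Dict, List, Tuple

# (start, length) of memory backing one guest NUMA node, as seen from
# inside the owning process. Values used with this are made up.
Range = Tuple[int, int]


def rebalance(guest_node_ranges: Dict[int, List[Range]],
              placement: Dict[int, int],
              bind_fn: Callable[[int, int, int], None]) -> int:
    """Apply an externally chosen placement from inside the process.

    placement maps guest node -> host node; this is the policy and may
    come from a management tool. bind_fn(start, length, host_node) is the
    binding and is called by the process that owns the memory.
    Returns the number of ranges that were bound.
    """
    bound = 0
    for guest_node, host_node in sorted(placement.items()):
        for start, length in guest_node_ranges.get(guest_node, []):
            bind_fn(start, length, host_node)
            bound += 1
    return bound
```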

thanks,
-chris

Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding

2011-11-08 Thread Chris Wright

* Alexander Graf (ag...@suse.de) wrote:
> On 29.10.2011, at 20:45, Bharata B Rao wrote:
> > As guests become NUMA aware, it becomes important for the guests to
> > have correct NUMA policies when they run on NUMA aware hosts.
> > Currently limited support for NUMA binding is available via libvirt
> > where it is possible to apply a NUMA policy to the guest as a whole.
> > However multinode guests would benefit if guest memory belonging to
> > different guest nodes are mapped appropriately to different host NUMA nodes.
> > 
> > To achieve this we would need QEMU to expose information about
> > guest RAM ranges (Guest Physical Address - GPA) and their host virtual
> > address mappings (Host Virtual Address - HVA). Using GPA and HVA, any
> > external tool like libvirt would be able to divide the guest RAM as per
> > the guest NUMA node geometry and bind guest memory nodes to corresponding
> > host memory nodes using HVA. This needs both QEMU (and libvirt) changes
> > as well as changes in the kernel.
> 
> Ok, let's take a step back here. You are basically growing libvirt into a 
> memory resource manager that know how much memory is available on which nodes 
> and how these nodes would possibly fit into the host's memory layout.
> 
> Shouldn't that be the kernel's job? It seems to me that architecturally the 
> kernel is the place I would want my memory resource controls to be in.

I think that both Peter and Andrea are looking at this.  Before we commit
an API to QEMU that has a different semantic than a possible new kernel
interface (that perhaps QEMU could use directly to inform kernel of the
binding/relationship between vcpu thread and its memory at VM startup)
it would be useful to see what these guys are working on...

thanks,
-chris

Re: [Qemu-devel] Memory API code review

2011-09-14 Thread Chris Wright

* Avi Kivity (a...@redhat.com) wrote:
> I would like to carry out an online code review of the memory API so that
> more people are familiar with the internals, and perhaps even to catch some
> bugs or deficiency.  I'd like to use the next kvm conference call slot for
> this (Tuesday 1400 UTC) since many people already have it reserved in the
> schedule.
> 
> It would be great if people from the wider qemu community be present, rather
> than the usual "x86 is everything" crowd (+Jan) that usually participates in
> the kvm weekly call.
> 
> Juan, Chris, can we dedicate next week's call to this?

Yup, sounds like a good idea.

Re: [Qemu-devel] kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Chris Wright

* Aaron Fabbri (aafab...@cisco.com) wrote:
> On 8/26/11 12:35 PM, "Chris Wright"  wrote:
> > * Aaron Fabbri (aafab...@cisco.com) wrote:
> >> Each process will open vfio devices on the fly, and they need to be able to
> >> share IOMMU resources.
> > 
> > How do you share IOMMU resources w/ multiple processes, are the processes
> > sharing memory?
> 
> Sorry, bad wording.  I share IOMMU domains *within* each process.

Ah, got it.  Thanks.

> E.g. If one process has 3 devices and another has 10, I can get by with two
> iommu domains (and can share buffers among devices within each process).
> 
> If I ever need to share devices across processes, the shared memory case
> might be interesting.
> 
> > 
> >> So I need the ability to dynamically bring up devices and assign them to a
> >> group.  The number of actual devices and how they map to iommu domains is
> >> not known ahead of time.  We have a single piece of silicon that can expose
> >> hundreds of pci devices.
> > 
> > This does not seem fundamentally different from the KVM use case.
> > 
> > We have 2 kinds of groupings.
> > 
> > 1) low-level system or topology grouping
> >
> >    Some may have multiple devices in a single group
> >
> >    * the PCIe-PCI bridge example
> >    * the POWER partitionable endpoint
> >
> >    Many will not
> >
> >    * singleton group, e.g. typical x86 PCIe function (majority of
> >      assigned devices)
> >
> >    Not sure it makes sense to have these administratively defined as
> >    opposed to system defined.
> >
> > 2) logical grouping
> >
> >    * multiple low-level groups (singleton or otherwise) attached to same
> >      process, allowing things like single set of io page tables where
> >      applicable.
> >
> >    These are nominally administratively defined.  In the KVM case, there
> >    is likely a privileged task (i.e. libvirtd) involved w/ making the
> >    device available to the guest and can do things like group merging.
> >    In your userspace case, perhaps it should be directly exposed.
> 
> Yes.  In essence, I'd rather not have to run any other admin processes.
> Doing things programmatically, on the fly, from each process, is the
> cleanest model right now.

I don't see an issue w/ this.  As long as it cannot add devices to the
system defined groups, it's not a privileged operation.  So we still
need the iommu domain concept exposed in some form to logically put
groups into a single iommu domain (if desired).  In fact, I believe Alex
covered this in his most recent recap:

  ...The group fd will provide interfaces for enumerating the devices
  in the group, returning a file descriptor for each device in the group
  (the "device fd"), binding groups together, and returning a file
  descriptor for iommu operations (the "iommu fd").

thanks,
-chris

Re: [Qemu-devel] kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Chris Wright

* Aaron Fabbri (aafab...@cisco.com) wrote:
> On 8/26/11 7:07 AM, "Alexander Graf"  wrote:
> > Forget the KVM case for a moment and think of a user space device driver.
> > I as a user am not root. But I as a user when having access to /dev/vfioX
> > want to be able to access the device and manage it - and only it. The
> > admin of that box needs to set it up properly for me to be able to access
> > it.
> > 
> > So having two steps is really the correct way to go:
> > 
> >   * create VFIO group
> >   * use VFIO group
> > 
> > because the two are done by completely different users.
> 
> This is not the case for my userspace drivers using VFIO today.
> 
> Each process will open vfio devices on the fly, and they need to be able to
> share IOMMU resources.

How do you share IOMMU resources w/ multiple processes, are the processes
sharing memory?

> So I need the ability to dynamically bring up devices and assign them to a
> group.  The number of actual devices and how they map to iommu domains is
> not known ahead of time.  We have a single piece of silicon that can expose
> hundreds of pci devices.

This does not seem fundamentally different from the KVM use case.

We have 2 kinds of groupings.

1) low-level system or topology grouping

   Some may have multiple devices in a single group

   * the PCIe-PCI bridge example
   * the POWER partitionable endpoint

   Many will not

   * singleton group, e.g. typical x86 PCIe function (majority of
     assigned devices)

   Not sure it makes sense to have these administratively defined as
   opposed to system defined.

2) logical grouping

   * multiple low-level groups (singleton or otherwise) attached to same
     process, allowing things like single set of io page tables where
     applicable.

   These are nominally administratively defined.  In the KVM case, there
   is likely a privileged task (i.e. libvirtd) involved w/ making the
   device available to the guest and can do things like group merging.
   In your userspace case, perhaps it should be directly exposed.
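
The two kinds of grouping above can be modelled roughly as follows. This is
a toy Python model, not the VFIO interface: the group ids, the PCI addresses,
and the merge_groups helper are invented for illustration. It captures one
property only, namely that the low-level groups are fixed by the system while
the logical grouping just combines whole low-level groups.

```python
from typing import Dict, FrozenSet, List, Set

# System-defined, low-level groups: membership is fixed by the topology
# and cannot be changed by the user. Names and addresses are made up.
LOW_LEVEL_GROUPS: Dict[str, FrozenSet[str]] = {
    "g0": frozenset({"0000:01:00.0"}),                  # singleton PCIe function
    "g1": frozenset({"0000:02:00.0", "0000:02:01.0"}),  # behind a PCIe-PCI bridge
    "g2": frozenset({"0000:03:00.0"}),                  # another singleton
}


def merge_groups(group_ids: List[str]) -> Set[str]:
    """Logically merge whole low-level groups into one shared domain.

    Only complete groups can be merged; there is deliberately no way to
    add or remove a single device from a system-defined group.
    """
    domain: Set[str] = set()
    for gid in group_ids:
        domain |= LOW_LEVEL_GROUPS[gid]
    return domain
```

In this model a process holding g0 and g2 could share one set of io page
tables across both devices by merging the two singleton groups.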

> In my case, the only administrative task would be to give my processes/users
> access to the vfio groups (which are initially singletons), and the
> application actually opens them and needs the ability to merge groups
> together to conserve IOMMU resources (assuming we're not going to expose
> uiommu).

I agree, we definitely need to expose _some_ way to do this.

thanks,
-chris

Re: [Qemu-devel] [PATCH] os-posix: set groups properly for -runas

2011-07-12 Thread Chris Wright

* Chris Wright (chr...@sous-sol.org) wrote:
> * Stefan Hajnoczi (stefa...@linux.vnet.ibm.com) wrote:
> > @@ -199,6 +200,11 @@ static void change_process_uid(void)
> >          fprintf(stderr, "Failed to setgid(%d)\n", user_pwd->pw_gid);
> >          exit(1);
> >      }
> > +    if (initgroups(user_pwd->pw_name, user_pwd->pw_gid) < 0) {
> > +        fprintf(stderr, "Failed to initgroups(\"%s\", %d)\n",
> > +                user_pwd->pw_name, user_pwd->pw_gid);
> > +        exit(1);
> > +    }
> 
> Does initgroups need access to /etc/group?  How does this combine w/
> -chroot?

Tested this on Linux, and w/out /etc/group it simply fails to add any
supplementary groups (doesn't fail completely, just fails safely).
Appears similar from solaris manpages.

Given that...

Acked-by: Chris Wright
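
For readers who want to see the call order outside of the diff, here is a
minimal Python sketch of the same sequence (setgid, initgroups, setuid). It
is an illustration, not the QEMU code: the os_mod and pwd_mod parameters are
an assumption added so the order can be exercised without root. The standard
library calls os.setgid, os.initgroups, os.setuid and pwd.getpwnam are real
and Unix only.

```python
import os
import pwd


def change_process_uid(username, os_mod=os, pwd_mod=pwd):
    """Drop from root to `username`, including supplementary groups.

    os_mod and pwd_mod exist only so the sequence can be tested with
    stand-ins; by default the real modules are used.
    """
    user = pwd_mod.getpwnam(username)
    os_mod.setgid(user.pw_gid)
    # Replace the caller's supplementary groups with the target user's.
    # Per the test described above, without /etc/group (e.g. in a chroot)
    # this adds no supplementary groups rather than failing outright.
    os_mod.initgroups(user.pw_name, user.pw_gid)
    # Must come last: once the uid is dropped, the group calls above
    # would no longer be permitted.
    os_mod.setuid(user.pw_uid)
```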

[Qemu-devel] [Bug 807893] Re: [PATCH] os-posix: set groups properly for -runas

2011-07-12 Thread Chris Wright

* Stefan Hajnoczi (stefa...@linux.vnet.ibm.com) wrote:
> @@ -199,6 +200,11 @@ static void change_process_uid(void)
>          fprintf(stderr, "Failed to setgid(%d)\n", user_pwd->pw_gid);
>          exit(1);
>      }
> +    if (initgroups(user_pwd->pw_name, user_pwd->pw_gid) < 0) {
> +        fprintf(stderr, "Failed to initgroups(\"%s\", %d)\n",
> +                user_pwd->pw_name, user_pwd->pw_gid);
> +        exit(1);
> +    }

Does initgroups need access to /etc/group?  How does this combine w/
-chroot?

Added bonus...this will fail when the initial user is not privileged
_and_ is the same user as -runas user (probably not what a user intended,
but would've worked before).  Something like:

[doh@laptop qemu]$ qemu -runas doh

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/807893

Title:
  qemu privilege escalation

Status in QEMU:
  Confirmed

Bug description:
  If qemu is started as root, with -runas, the extra groups are not
  dropped correctly

  /proc/`pidof qemu`/status
  ..
  Uid:100 100 100 100
  Gid:100 100 100 100
  FDSize: 32
  Groups: 0 1 2 3 4 6 10 11 26 27 
  ...

  The fix is to add initgroups() or setgroups(1, [gid]) where
  appropriate to os-posix.c.

  The extra gid's allow read or write access to other files (such as
  /dev etc).

  Emulating the qemu code:

  # python
  ...
  >>> import os
  >>> os.setgid(100)
  >>> os.setuid(100)
  >>> os.execve("/bin/sh", [ "/bin/sh" ], os.environ)
  sh-4.1$ xxd /dev/sda | head -n2
  000: eb48 9000        .H..
  010:          
  sh-4.1$ ls -l /dev/sda
  brw-rw---- 1 root disk 8, 0 Jul  8 11:54 /dev/sda
  sh-4.1$ id
  uid=100(qemu00) gid=100(users) groups=100(users),0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),26(tape),27(video)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/807893/+subscriptions
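
The symptom in the bug description (root's supplementary groups surviving
-runas) can be checked mechanically from the "Groups:" line of
/proc/<pid>/status. The helper below is a small sketch for that purpose; the
name leaked_groups is made up, and it only parses text that the caller has
already read from the status file.

```python
def leaked_groups(status_text, expected_gids):
    """Return supplementary gids found in a /proc/<pid>/status dump
    that are not in expected_gids, sorted ascending."""
    for line in status_text.splitlines():
        if line.startswith("Groups:"):
            gids = {int(tok) for tok in line.split(":", 1)[1].split()}
            return sorted(gids - set(expected_gids))
    # No Groups line at all: nothing to report.
    return []
```

For the status output quoted in the bug, calling it with expected_gids={100}
would report every gid on the Groups line, which is exactly the leak.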

[Qemu-devel] [Bug 807893] Re: qemu privilege escalation

2011-07-12 Thread Chris Wright

Requesting CVE.  Tools like libvirt deprivilege themselves before
launching qemu as an unprivileged user (no use of -runas), so aren't
vulnerable.


[Qemu-devel] [Bug 807893] Re: qemu privilege escalation

2011-07-12 Thread Chris Wright

This bug is being tracked as CVE-2011-2527

** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2011-2527


Re: [Qemu-devel] [PATCHv2] vhost: fix double free on device stop

2011-06-21 Thread Chris Wright

* Michael S. Tsirkin (m...@redhat.com) wrote:
> vhost dev stop failed to clear the log field.
> Typically not an issue as dev start overwrites this field,
> but if logging gets disabled before the following start,
> it doesn't so this causes a double free.
> 
> Signed-off-by: Michael S. Tsirkin 

Acked-by: Chris Wright 

thanks,
-chris

[Qemu-devel] KVM call minutes for June 21

2011-06-21 Thread Chris Wright

concerns about backwards compat
- https://bugzilla.redhat.com/show_bug.cgi?id=689672
  - f13 host can no longer run f14 guest after qemu update
- this particular bug is older f13 which includes patched qemu...
- could be useful to fingerprint the guest (lspci, etc)
  - sounds simple enough, need someone who's inclined to do it

state of image streaming/block copy
- live block copy and image streaming overlap
  - attempting to unify
- some confusion over next steps
- need to clarify differing requirements (shared storage vs. generic storage)
- stefan to summarize solution proposal on list/wiki

guest agent api current verbs and future roadmap?
- pretty happy w/ current verbs, future intention to keep it simple,
  high-level
- should be working on windows guests

[Qemu-devel] KVM call agenda for June 7

2011-06-06 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] KVM call minutes for Apr 26

2011-04-26 Thread Chris Wright

Tools for resource accounting the virtual machines.
- Luis Castro was not on the call

Status of glib tree - next steps?
- full conversion done in tree
- still targeting 0.15

status of QCFG
- code generator rewritten to be more generic and useful
- merge core infrastructure first
  - to not block other work waiting on full conversion
- still need to complete full conversion

qemu-kvm merge
- status
  - review and merge/feedback pending from Avi on current outstanding patches
  - still have some 60 patches
- break them into a few smaller series
- next steps, specifically:
  - upstreaming in-kernel irqchip support
  - MSI/MSI-X (cleanup and make mergeable)
  - this is a decent amount of work, Jan is solo...anyone want to help?
- need to be careful of regressions
- add tests to avi's autotest run (e.g., cpu hotplug)
  - cpu hotplug test initiated from host side
  - online needs some cooperation in linux
  - still unclear on what's supported, windows apparently only supports online

autotest
- had autotest test day, feedback coming on list
- some issues with getting set up
- having basic common config could be useful

KVM Forum reminder
- send in your proposals

Re: [Qemu-devel] [PATCH] vfio: Add an ioctl to reset the device

2011-04-19 Thread Chris Wright

* Alex Williamson (alex.william...@redhat.com) wrote:
> On Tue, 2011-04-19 at 15:26 -0700, Chris Wright wrote:
> > * Alex Williamson (alex.william...@redhat.com) wrote:
> > > On Tue, 2011-04-19 at 15:07 -0700, Chris Wright wrote:
> > > > * Alex Williamson (alex.william...@redhat.com) wrote:
> > > > > When using VFIO to assign a device to a guest, we want to make sure
> > > > > the device is quiesced on VM reset to stop all DMA within the guest
> > > > > mapped memory.  Add an ioctl which just calls pci_reset_function()
> > > > > and returns whether it succeeds.
> > > > 
> > > > Shouldn't there be a reset when binding/unbinding vfio to/from a pci
> > > > device?
> > > 
> > > There's already one when the /dev/vfioX file is opened, we should add
> > > another on release, and probably add the same PCI save state store/load
> > > that I'm proposing for KVM across those.  Thanks,
> > 
> > Hmm, I looked and didn't see it, hence the question.
> 
> vfio_open() -> pci_reset_function()
> https://github.com/pugs/vfio-linux-2.6/blob/vfio/drivers/vfio/vfio_main.c

Got it, thanks Alex.

Re: [Qemu-devel] [PATCH] vfio: Add an ioctl to reset the device

2011-04-19 Thread Chris Wright

* Alex Williamson (alex.william...@redhat.com) wrote:
> On Tue, 2011-04-19 at 15:07 -0700, Chris Wright wrote:
> > * Alex Williamson (alex.william...@redhat.com) wrote:
> > > When using VFIO to assign a device to a guest, we want to make sure
> > > the device is quiesced on VM reset to stop all DMA within the guest
> > > mapped memory.  Add an ioctl which just calls pci_reset_function()
> > > and returns whether it succeeds.
> > 
> > Shouldn't there be a reset when binding/unbinding vfio to/from a pci
> > device?
> 
> There's already one when the /dev/vfioX file is opened, we should add
> another on release, and probably add the same PCI save state store/load
> that I'm proposing for KVM across those.  Thanks,

Hmm, I looked and didn't see it, hence the question.

Re: [Qemu-devel] [PATCH] vfio: Add an ioctl to reset the device

2011-04-19 Thread Chris Wright

* Alex Williamson (alex.william...@redhat.com) wrote:
> When using VFIO to assign a device to a guest, we want to make sure
> the device is quiesced on VM reset to stop all DMA within the guest
> mapped memory.  Add an ioctl which just calls pci_reset_function()
> and returns whether it succeeds.

Shouldn't there be a reset when binding/unbinding vfio to/from a pci
device?

Re: [Qemu-devel] [PATCH] vfio: Add an ioctl to reset the device

2011-04-19 Thread Chris Wright

* Randy Dunlap (rdun...@xenotime.net) wrote:
> I can't find include/linux/vfio.h in linux-next or mainline git, but
> ioctls need to be documented in Documentation/ioctl/ioctl-number.txt

It is in the full patchset: https://github.com/pugs/vfio-linux-2.6

[Qemu-devel] KVM call minutes for Apr 5

2011-04-05 Thread Chris Wright

KVM Forum
- save the date is out, cfp will follow later this week
- abstracts due in 6wks, 2wk review period, notifications by end of May

Improving process to scale project
- Trivial patch bot
- Sub-maintainership

Trivial patch monkeys^Wteam
- small/simple patches posted can fall through the cracks (esp. for
  areas that aren't well maintained)
- patches should be simple, easy to review (
- aiming to gather a team, so that the position can rotate
- patch submitter can rest assured
- Stefan and possibly Mike Roth are volunteering to get this started
- Cc: qemu-triv...@nongnu.org to send patches to the Trivial patch monkey
- details here:
  
  http://wiki.qemu.org/Contribute/TrivialPatches

Sub-maintainership
- have MAINTAINERS file
  - need to add git tree URLs
  - needs another pass to make sure there are no missing subsystems
  - make it clearer how maintained the subsystems are
- adding a wiki page to show how to become a subsystem maintainer
  - one valuable step...write testing around the subsystem
- means you've had to learn the subsystem (builds expertise)
- allows for regression testing the subsystem (esp. validating new patches)
- sub-maintainers sometimes disappear
  - can add another maintainer
  - actively poke the maintainer when patches are languishing
  - if you're going to be away, be sure to let list or backup know
- systematic patch tracking would help, patchwork doesn't quite cut it
- who receives pull request
  - list + blue swirl/aurelien for tcg, anthony picking up plenty of
    other bits
- infrastructure subsystems (qdev, migration, etc..)
  - big invasive changes done externally, effective flag day for full merge
  - subsystem localized change (e.g. vmstate fix for a specific device)
    maintainers can work it out, be sure to have both
- facilitating patch review and hopefully improving subsystem over time

kvm-autotest
- roadmap...refactor to centralize testing (handle the xen-autotest split off)
- internally at RH, lmr and cleber maintain autotest server to test
  branches (testing qemu.git daily)
  - have good automation for installs and testing
- seems more QA focused than developers
  - plenty of benefit for developers, so lack of developer use partly
    cultural/visibility...
  - kvm-autotest team always looking for feedback to improve for
    developer use case
- kvm-autotest day to have folks use it, write test, give feedback?
  - startup cost is/was steep, the day might be too much handholding
  - install-fest? (to get it installed and up and running)
- buildbot or autotest for testing patches to verify building and working
- one goal is to reduce mailing list load (patch resubmission because
  they haven't handled basic cases that buildbot or autotest would have
  caught)
- fedora-virt test day coming up on April 14th.  lucas will be on hand and
  we can piggy back on that to include kvm-autotest install and virt testing
- kvm autotest run before qemu pull request and post merge to track
  regressions, more frequent testing helps developers see breakage
  quickly
  - qemu.git daily testing already, only the "sanity" test subset 
- run more comprehensive "stable" set of tests on weekends
- one issue is the large number of known failures, need to make these
  easier to identify (and fix the failures one way or another)
- create database and verify (regressions) against that
  - red/yellow/green (yellow shows area was already broken)
- autotest can be run against server, not just on laptop
- how to do remote client display testing (e.g. spice client)
  - dogtail and LDTP
  - graphics could be tested w/ screenshot compares
- WHQL testing automated as well

Re: [Qemu-devel] KVM call minutes for Mar 15

2011-03-15 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> On 03/15/2011 09:53 AM, Chris Wright wrote:
> > QAPI

> >- c library implementation is critical to have unit tests and test
> >   driven development
> >   - thread safe?
> > - no shared state, no statics.
> > - threading model requires lock for the qmp session
> >   - licensiing?
> > - LGPL
> >   - forwards/backwards compat?
> > - designed with that in mind see wiki:
> >
> >   http://wiki.qemu.org/Features/QAPI
> 
> One neat feature of libqmp is that once libvirt has a better QMP
> passthrough interface, we can create a QmpSession that uses libvirt.
> 
> It would look something like:
> 
> QmpSession *libqmp_session_new_libvirt(virDomainPtr dom);

Looks like you mean this?

        -> request QmpSession ->
client                            libvirt
        <- return QmpSession  <-

client -> QmpSession -> QMP -> QEMU

So bypassing libvirt completely to actually use the session?

Currently, it's more like:

client -> QemuMonitorCommand -> libvirt -> QMP -> QEMU

> The QmpSession returned by this call can then be used with all of
> the libqmp interfaces.  This means we can still exercise our test
> suite with a guest launched through libvirt.  It also should make
> the libvirt pass through interface a bit easier to consume by third
> parties.

This sounds like it's something libvirt folks should be involved with.
At the very least, this mode is there now and considered basically
unstable/experimental/developer use:

 "Qemu monitor command '%s' executed; libvirt results may be unpredictable!"

So likely some concern about making it easier to use, esp. assuming
that third parties above are mgmt apps, not just developers.

thanks,
-chris

[Qemu-devel] KVM call minutes for Mar 15

2011-03-15 Thread Chris Wright

QAPI -- http://wiki.qemu.org/Features/QAPI
- please review!
- Anthony would like to see feedback and plans to commit in a week
  (assuming agreement and no major issues in review)
- some concern about the maintainability of code generation
  - but still nothing concrete on the list, need to review and discuss
    on the list
- some concern that implementation details may change the wire protocol
  - introduces a new mechanism for new signals (mask by default and
    enabled explicitly)
  - disagreement over when/how to introduce new extensions
- libvirt feedback?
  - no protocol level changes
- old and new versions are testable with test suite and proves this
- c library implementation is critical to have unit tests and test
  driven development
  - thread safe?
    - no shared state, no statics.
    - threading model requires lock for the qmp session
  - licensing?
    - LGPL
  - forwards/backwards compat?
    - designed with that in mind see wiki:

      http://wiki.qemu.org/Features/QAPI

QCFG -- http://wiki.qemu.org/Features/QCFG
- command line args translation to objects is complex and buggy
- schema + code generator to formalize this
- formally describe each command line option and generate code
  to build and validate objects
- provides systematic way to document command line options
- automatically 
- device_add does multiple conversions to go from qmp to qemuopts to
  objects
- move to basic c structures, and autogenerated marshalling code
- no plan to do this work soon, late in 0.15 cycle
  - same as qapi, fork a tree, do mass conversion and merge for 0.16 cycle
- qmp server mode to take all configuration commands before actually
  starting the guest
- can provide a config file 
- qdev...
  - could just bridge to setting and getting qdev properties
  - OR get to point where device objects go directly to qdev device init
- why not move command line to qmp instead of new schema?
  - single schema
- considerations for -M (didn't capture all of these)
- for all the details:
  
  http://wiki.qemu.org/Features/QCFG
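
To make the "schema describes each command line option" point concrete,
here is a small sketch. It validates at run time instead of generating
code, and the schema layout, the option names and parse_option are all
assumptions for illustration, not the QCFG design from the wiki.

```python
# Hypothetical schema: option name -> {field: (type, required)}
SCHEMA = {
    "drive": {"file": (str, True), "cache": (str, False)},
    "smp": {"cpus": (int, True)},
}


def parse_option(name, arg, schema=SCHEMA):
    """Turn 'key=val,key=val' into a typed, validated dict."""
    fields = schema[name]
    obj = {}
    for item in arg.split(","):
        key, _, val = item.partition("=")
        if key not in fields:
            raise ValueError(f"unknown field {key!r} for -{name}")
        typ, _required = fields[key]
        obj[key] = typ(val)  # int("x") raises ValueError as well
    for key, (_typ, required) in fields.items():
        if required and key not in obj:
            raise ValueError(f"missing field {key!r} for -{name}")
    return obj
```

For example, parse_option("smp", "cpus=4") yields {"cpus": 4}, while a
-drive without file= is rejected before any device code runs.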

Merging big changes
- in the past, evolving in tree has not worked well, leaving partial
  conversions
- QAPI/QCFG method of doing changes in external tree hopes to set new precedent
  - preserve patch/review on list
  - do full conversion
  - provide strong testing to show it works

Kemari merge plans
- just needs some ACKs
- Juan, Anthony, anybody else who is familiar with migration to review?

switch from gpxe to ipxe
- possible 0.15 release w/ ipxe (Alex looking into it)
- Michael Brown been helpful in fixing bugs, so compat
- Alex will send out mail soon on the details
- ipxe releases?  not yet, there are plans for it, should be coming RSN
- Stefan volunteers to help test

[Qemu-devel] Re: [PATCH] Fix performance regression in qemu_get_ram_ptr

2011-03-10 Thread Chris Wright

* Vincent Palatin (vpala...@chromium.org) wrote:
> When the commit f471a17e9d869df3c6573f7ec02c4725676d6f3a converted the
> ram_blocks structure to QLIST, it also removed the conditional check before
> switching the current block at the beginning of the list.

Nice catch.

> In the common use case where ram_blocks has a few blocks with only one
> frequently accessed (the main RAM), this has a performance impact as it
> performs the useless list operations on each call (which are on a really
> hot path).
> 
> On my machine emulation (ARM on amd64), this patch reduces the
> percentage of CPU time spent in qemu_get_ram_ptr from 6.3% to 2.1% in the
> profiling of a full boot.

Hopefully this is back on par with before the QLIST switchover.

> Signed-off-by: Vincent Palatin 

Acked-by: Chris Wright 

> ---
>  exec.c |7 +--
>  1 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index d611100..81f08b7 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2957,8 +2957,11 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
>  
>  QLIST_FOREACH(block, &ram_list.blocks, next) {
>  if (addr - block->offset < block->length) {
> -QLIST_REMOVE(block, next);
> -QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +/* Move this entry to to start of the list.  */
> +if (block != QLIST_FIRST(&ram_list.blocks)) {
> +QLIST_REMOVE(block, next);
> +QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +}

Pretty close to self-documenting code now.  Not sure if it's subtle enough
to warrant change to the comment like:
 
  /* Move block to head of list if it's not there already */

thanks,
-chris
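
The same move-to-front lookup, as a Python sketch for anyone who wants to
play with the behaviour outside of exec.c. The name find_block and the
(offset, length) tuples are assumptions for illustration. The explicit
0 <= check stands in for the unsigned arithmetic the C code relies on.

```python
def find_block(blocks, addr):
    """Return the block containing addr, keeping the hot block at the head.

    blocks: list of (offset, length) tuples, most recently used first.
    """
    for i, (offset, length) in enumerate(blocks):
        if 0 <= addr - offset < length:
            # Move block to head of list if it's not there already
            if i != 0:
                blocks.insert(0, blocks.pop(i))
            return blocks[0]
    raise LookupError(f"bad ram offset {addr:#x}")
```

In the common case the hit is at index 0 and the list is left untouched,
which is the whole point of the patch.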

[Qemu-devel] KVM call minutes for Mar 8

2011-03-08 Thread Chris Wright

QAPI merge plans
- should be 100% back compat
- qmp moved over
- hmp moved over
- 1st pass, core infrastructure (includes test framework)
- 2nd pass, command conversion
- 3rd pass, more controversial bits
- adds dependencies: glib and python
- some testing based on kvm-unit-test micro-os instance (e.g. added a balloon
  and run commands against it to test)
  - add more functionality here? (kvm autotest is slow, above is quick)
- will hit some point where full functionality is needed
  - have a mini linux to do this (lags where driver updates are part of test)
- generated code can obfuscate the debugging process
  - code generator has some ugly corners (python writing C...)
  - but generated code should be debuggable, readable, etc.
- some grumbling regarding glib dependency
  - reducing NIH and relying on external functionality is solid way to
    grow qemu as a project

Read wiki here and review closely:

  http://wiki.qemu.org/Features/QAPI

virt-agent
- json string converted to command (and vice versa)
- add to qmp schema - allows generated marshalling code to sanity check in/out
- problem with qmp not being bi-directional (rpc - in, events - out)
  - posted events allow migration to save and send unposted events
- any issues with guest agent interface extensibility
  - will add command to return schema
  - can add (optional) parameters to commands
- make libqmp a shared object for 0.16 (too much going on for 0.15)
- can terminate in qemu (e.g. vnc server internally qmp client to interact
  with guest cut 'n paste) or externally proxying to/from endpoint
- possibly revisit dynamic schema in future
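
A rough sketch of the "json string converted to command (and vice versa)"
item with a schema check on the way in. The command names and the schema
layout are made up for illustration. The "execute"/"arguments" envelope
follows the QMP wire format, but this is not the virt-agent code.

```python
import json

# Hypothetical schema: command -> {argument: (type, optional)}
COMMANDS = {
    "ping": {},
    "read-file": {"path": (str, False), "count": (int, True)},
}


def encode_command(name, **arguments):
    """Marshal a command and its arguments into a JSON string."""
    return json.dumps({"execute": name, "arguments": arguments},
                      sort_keys=True)


def decode_command(text, schema=COMMANDS):
    """Unmarshal a JSON string and sanity check it against the schema."""
    msg = json.loads(text)
    name = msg["execute"]
    arguments = msg.get("arguments", {})
    if name not in schema:
        raise ValueError(f"unknown command {name!r}")
    spec = schema[name]
    for key, value in arguments.items():
        if key not in spec:
            raise ValueError(f"unknown argument {key!r}")
        expected, _optional = spec[key]
        if not isinstance(value, expected):
            raise ValueError(f"bad type for argument {key!r}")
    for key, (_expected, optional) in spec.items():
        if not optional and key not in arguments:
            raise ValueError(f"missing argument {key!r}")
    return name, arguments
```

Adding a new optional argument to the schema leaves older senders valid,
which is the extensibility property discussed above.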

glib, main loop, events
- (context was setfd changes from amit)
- iothread work is more critical to do first and get merged
- glib work starting just in qapi

iothread merge?
- progressing slowly, marcelo working on it
- have found regressions (signal handling code) (ifdef'd away for now)

[Qemu-devel] KVM call minutes for Feb 22

2011-02-22 Thread Chris Wright

0.14 recap
- keeping schedule on wiki was helpful
- changelog was helpful
- testing (could use even more emphasis, could be improved)
- -rc cycles
  - -rc2 and final release just hours

0.15
- tentative date July 1st
- qapi
- qed features
- virtagent?
  - depends on whether to terminate in qemu vs external
- terminating w/in qemu is close to feature complete
- using QMP (kinda, QObject -> JSON marshalling, still use HTTP)
- QMP is not bi-directional XMLRPC, one way with event posting
- XMLRPC + server logic add to the basic QEMU side attack surface
  - splitting out to external process
- state associated with guest in external process complicates live migration
  - e.g. handling in-process command in server
  - guest client reconnects during migration
  - can virtagent features be stateless 
- Avi's favorite Lua based extension language coming RSN ;)
  - let's use copy and paste as a concrete example
- usecase to help define the requirements and expose
  architectural
- Jes will do this, make concrete counter proposal to hosting
  virtagent server in qemu
  - splitting QEMU into more modular components is a large architectural
    step, but better step

Block format acceptance
- qcow3 wiki starting

GSoC projects
- only 3 so far, mentoring organization applications Feb 28th
  - can update app 
- please add your thoughts here so that we can have a successful
- Luiz will send out a note as more explicit reminder

gpxe vs ipxe
- gpxe still stagnant
- ipxe accepting patches (e.g. igbvf)
- perhaps switch in 0.15 (Alex take a look)

[Qemu-devel] KVM call minutes for Feb 15

2011-02-15 Thread Chris Wright

QAPI and QMP
- Anthony adding a new wiki page to describe all of this
- specified in formal schema using JSON
  - includes documentation in javadoc-like syntax
  - can generate api (possibly protocol) docs
  - documenting each command and expected errors
- creates marshalling functions and C interfaces
- can generate C library
  - facilitates unit tests/regression tests
- new and old code both exist in Anthony's tree
  - allows unit tests to run on both to verify
  - will remove old and force a flag day on merging in for 0.15
- still need to convert human monitor commands
  - goal to convert all of human monitor to QMP
- events?
  - still not consumable from internal use
  - model signals and slots
- similar to notifier lists, but can pass arbitrary data
- client connects to signal via QMP
  - how to extend?
- optional parameters (ABI bump)
  - no way to know if client is aware of and consuming the optional
    parameters
- add new events
  - client required to register for new events when they know about
    them, server can generate different logic based on client's
    capability
- first release may not include shared library (lack of libconf/autotool)
  - could 
- QMP session in default well-known location
  - allows iteration of all running QMP sessions
  - per-user directory to handle user-level isolation
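
The signals and slots model mentioned under "events?" can be sketched as
below: like a notifier list, but each emit can carry arbitrary data. The
class name Signal and its methods are assumptions for illustration, not
the interface in Anthony's tree.

```python
class Signal:
    """Notifier list that passes arbitrary data to connected slots."""

    def __init__(self):
        self._slots = {}
        self._next_handle = 0

    def connect(self, slot):
        """Register a callable; returns a handle for disconnect()."""
        handle = self._next_handle
        self._next_handle += 1
        self._slots[handle] = slot
        return handle

    def disconnect(self, handle):
        self._slots.pop(handle, None)

    def emit(self, *args, **kwargs):
        # Copy the values so a slot may disconnect itself while running.
        for slot in list(self._slots.values()):
            slot(*args, **kwargs)
```

A client that never connects sees nothing, so a newly added signal cannot
surprise an old client. That matches the idea that clients register for
new events once they know about them.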

qdev future
- have an object model, but can't do polymorphism (i.e. bus level)
- could use more oop style, use GObject, use C++...no great ideas
- no major qdev plans for 0.15
- would be useful to have the ability to do device level unit testing
  - cleaner device model, better encapsulation
  - this is both the device side interfaces, but also interfaces back to qemu
  - ability to do something like a virtual PCI bus to be a test harness
    to interact with a device
  - back to the GObject, oop, C++ questions?
- IDL based code generation to generate VMState in effort to make
  migration more verifiable
- VMState
  - need to focus on serialized guest visible state
- start with all state and remove obviously internal only state
- start with only guest visible state (structure separation)
  - verifiable
- need a qdev tree maintainer?
- some disagreement on exactly how much 
- qdev autodoc patches? (posted and ack'd multiple times)

bad patches committed that are not on list
- please inform of specific incidents, this should not be happening

SeaBIOS update?
- w/out we will have features that can't be used 
- need a release..
  - 0.15 will need good planning and dates and communication with Kevin

0.14-rc2 tagged please review for any missing patches, 0.14.0 likely
tagged late today

revisit new -> old migration
- Amit offers virtio-serial patches and some legwork
- tabled discussion to list, possibly next week's call

[Qemu-devel] KVM call agenda for Feb 15

2011-02-14 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] KVM call minutes for Feb 8

2011-02-08 Thread Chris Wright

Automated builds and testing
- found broken 32-bit
- luiz suggested running against maintainer trees
- daniel gollub offered to take on maintenance
- integration with kvm-autotest?
  - lucas, daniel, stefan...
  - testing each git commit is probably overkill and too expensive
  - current autotest run (each 48-hours to batch it up)
  - stefan currently running once a day, autotest run is 3 hours, so
    daily should work
- need an integration tree to run build test on?
  - probably still too early

QEMU testing
- kvm unit tests
  - small standalone kernel that exercises paths that have shown bugs
    http://git.kernel.org/?p=virt/kvm/kvm-unit-tests.git;a=summary
- Michael Roth recently sent RFC for qtest
  (http://www.mail-archive.com/qemu-devel@nongnu.org/msg54191.html)
  - test module (->init(), ->run()) which runs in place of vcpu threads to
    set up a test framework to do targeted testing, for example, of devices
  - normal C code, access to qemu internal functions
  - not just functional device testing, but can also do fuzz testing
  - looking for feedback/users/test developers/etc
- PPC (just kernel + initrd to boot, and verify boots are identical)
  - full install in many cases is too long, and can trigger other issues
    (alex had examples of emulation being slow enough that login screen
    times out)
- tcg basic testing to verify qemu-kvm patch isn't breaking tcg

Cross version migration (new->old version migration thread)
- downstreams want this, support this upstream?
- versions vs. subsections (subsections should allow this to work)
- (as usual) more vmstate conversion needed
- qdev/vmstate both examples of partially completed work that need more
  attention
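
Why subsections should allow new->old migration where plain version bumps
do not, as a sketch: the optional section is only put on the wire when the
device actually uses the newer state. The field names, the dict-based
"stream" and the save/load helpers are assumptions for illustration, not
the vmstate code.

```python
def save(state):
    """Serialize device state; emit the subsection only when it is needed."""
    out = {"base": {"irq": state["irq"]}, "subsections": {}}
    # Hypothetical newer field: only sent when it differs from its
    # reset value, so an unused feature costs nothing on the wire.
    if state.get("msi_enabled", False):
        out["subsections"]["msi"] = {"msi_enabled": True}
    return out


def load(stream, known_subsections):
    """Load state on a destination that knows only some subsections."""
    for name in stream["subsections"]:
        if name not in known_subsections:
            raise ValueError(f"unknown subsection {name!r}")
    state = dict(stream["base"])
    for sub in stream["subsections"].values():
        state.update(sub)
    return state
```

An old destination (known_subsections empty) accepts the stream as long as
the guest never enabled the new feature, and fails loudly otherwise.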

[Qemu-devel] KVM call agenda for Feb 8

2011-02-07 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] KVM call minutes for Feb 1

2011-02-01 Thread Chris Wright

KVM upstream merge: status, plans, coordination
- Jan has a git tree, consolidating
- qemu-kvm io threading is still an issue
- Anthony wants to just merge
  - concerns with non-x86 arch and merge
  - concerns with big-bang patch merge and following stability
- post 0.14 conversion to glib mainloop, non-upstreamed qemu-kvm will be
  a problem if it's not there by then
- testing and nuances are still an issue (e.g. stefan berger's mmio read issue)
- qemu-kvm still evolving, needs to get sync'd or it will keep diverging
- 2 implementations of main init, cpu init, Jan has merged them into one
  - qemu-kvm-x86.c file that's only a few hundred lines
- review as one patch to see the fundamental difference

QMP support status for 0.14
- declare QMP fully supported
  - caveats: specific errors aren't guaranteed yet (primarily documentation)
  - human monitor passthrough command is best effort
- device tree structure is not reliable, use name not path
- will send out patch to update qmp-commands.hx to document this (and Cc
  libvirt)
- schema file (json subset which is python) and code generator to
  generate code with C structures, also generates client library for
  test cases (can test against new and old qmp server to verify hasn't
  changed)
  - HMP implemented in terms of QMP only
  - at the end should have a test framework to test all commands
  - glib/gtest framework
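
A sketch of the "json subset which is python" trick: if the schema sticks
to dicts, lists, strings and numbers (no true/false/null), Python can
parse it as literals and a generator can walk the result. The schema shape,
the type mapping and both function names are assumptions for illustration,
not the generator in Anthony's tree.

```python
import ast

# Hypothetical mapping from schema type names to C types.
C_TYPES = {"int": "int64_t", "bool": "bool"}


def load_schema(text):
    """Parse blank-line separated schema entries as Python literals."""
    entries = []
    for chunk in text.split("\n\n"):
        chunk = chunk.strip()
        if chunk:
            # literal_eval accepts only literals, so no code is executed.
            entries.append(ast.literal_eval(chunk))
    return entries


def gen_struct(entry):
    """Emit a C struct definition for one schema entry."""
    lines = [f"typedef struct {entry['type']} {{"]
    for field, typ in entry["data"].items():
        lines.append(f"    {C_TYPES[typ]} {field};")
    lines.append(f"}} {entry['type']};")
    return "\n".join(lines)
```

The same parsed entries could feed the marshalling code and the client
library, which is what lets one schema drive both sides of the test.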

0.14 stable fork today
already posted 0.14 patches?
- will pick up all those patches before forking, fork at the end of the day
- will grab latest SeaBIOS and vgabios

SeaBIOS update for 0.14 (AHCI boot capable version)
- need to check if (and why) AHCI is disabled by default 
  - assuming no fundamental issues, could be enabled and become an
    experimental new 0.14 feature

Summer of code 2011
- http://wiki.qemu.org/Google_Summer_of_Code_2011
- update wiki page with project ideas (let Anthony or Luiz know if you
  want to be a mentor)
- application is due at end of the month
- mentors...be prepared that projects may take longer than just the
  summer of code to complete
- join #qemu-gsoc on OFTC for gsoc discussions

Going to FOSDEM?  agraf will be there...

[Qemu-devel] KVM call agenda for Jan 25

2011-01-24 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] Re: KVM call agenda for Jan 18

2011-01-18 Thread Chris Wright

* Chris Wright (chr...@redhat.com) wrote:
> Please send in any agenda items you are interested in covering.

No agenda, this week's call is cancelled.

thanks,
-chris

[Qemu-devel] KVM call agenda for Jan 18

2011-01-17 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] KVM call minutes for Jan 11

2011-01-11 Thread Chris Wright

KVM Forum 2011
- expand the scope? yes, continue up the stack
- how long?  2 days (maybe 2 1/2 - 3 space permitting)
- where?  Vancouver with LinuxCon

Spice guest agent:
- virt agent, matahari, spice agent...what is in spice agent?
- spice char device
  - mouse, copy 'n paste, screen resolution change
- could be generic (at least input and copy/paste)
  - send protocol details of what is being sent
- need to look at how difficult it is to split it out from spice
  (how to split out in qemu vs. libspice)
- goal to converge on common framework
- more discussion on char device vs. protocol
  - eg. mouse_set breaks if mouse channel is part pv and part spice specific
- Alon will send link to protocol and try to propose new interfaces

migration and block devices:
- need to invalidate data after first read on target,
  because it can be stale
- close + reopen is what was done for NFS
- iscsi: can issue ioctl(BLKFLSBUF) to flush, but it's CAP_SYS_ADMIN only
- O_DIRECT to avoid cache (concerns that it's not guaranteed)
- agree change the default (cache=none for 

qemu patch queue is long:
- slow to return from break
- patience and more patch review will help make sure things are applied
  and don't fall through cracks

[Qemu-devel] Re: KVM call agenda for Dec 21

2010-12-21 Thread Chris Wright

* Chris Wright (chr...@redhat.com) wrote:
> Please send in any agenda items you are interested in covering.

No agenda, today's call is cancelled.

Also, given people's holiday and vacation schedules, next week's call is
cancelled.  Talk again after the New Year.

thanks,
-chris

[Qemu-devel] KVM call agenda for Dec 21

2010-12-20 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

Re: [Qemu-devel] KVM call agenda for Dec 14

2010-12-14 Thread Chris Wright

* Jes Sorensen (jes.soren...@redhat.com) wrote:
> Any chance you could fix your cronjob to send out the CFA a day earlier?
> 15 hrs before is a bit short notice.

Sure.

[Qemu-devel] Re: KVM call agenda for Dec 14

2010-12-14 Thread Chris Wright

* Chris Wright (chr...@redhat.com) wrote:
> Please send in any agenda items you are interested in covering.

No agenda, today's call is cancelled.

[Qemu-devel] KVM call agenda for Dec 14

2010-12-13 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

Re: [Qemu-devel] KVM call agenda for Dec 7

2010-12-07 Thread Chris Wright

* Jes Sorensen (jes.soren...@redhat.com) wrote:
> On 12/07/10 00:51, Chris Wright wrote:
> > Please send in any agenda items you are interested in covering.
> > 
> > thanks,
> > -chris
> > 
> 
> No agenda, no replies
> 
> Call canceled I presume?

Indeed, next week, then pick up next year...

[Qemu-devel] KVM call agenda for Dec 7

2010-12-06 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-01 Thread Chris Wright

* Peter Zijlstra (a.p.zijls...@chello.nl) wrote:
> On Wed, 2010-12-01 at 09:17 -0800, Chris Wright wrote:
> > Directed yield and fairness don't mix well either. You can end up
> > feeding the other tasks more time than you'll ever get back.
> 
> If the directed yield is always to another task in your cgroup then
> inter-guest scheduling fairness should be maintained. 
> 
> Yes, but not the inter-vcpu fairness.

That same vcpu doesn't get fair scheduling if it spends its entire
timeslice spinning on a lock held by a de-scheduled vcpu.

[Qemu-devel] Re: [PATCH] qemu-kvm: response to SIGUSR1 to start/stop a VCPU (v2)

2010-12-01 Thread Chris Wright

* Peter Zijlstra (a.p.zijls...@chello.nl) wrote:
> On Wed, 2010-12-01 at 21:42 +0530, Srivatsa Vaddagiri wrote:
> 
> > Not if yield() remembers what timeslice was given up and adds that back when
> > thread is finally ready to run. Figure below illustrates this idea:
> > 
> > 
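> > [timeline diagram garbled in the archive: on p0 the slices run in the
> > order A0/4 C0/4 D0/4 A0/4 C0/4 D0/4 A0/4 C0/4 D0/4 A0/4; at each
> > contention point L, B0 gives up the cpu early: B0/2[2], B0/0[6],
> > B0/0[10], and finally B0/14[0]]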
> > 
> >  
> > where,
> > p0  -> physical cpu0
> > L   -> denotes period of lock contention
> > A0/4-> means vcpu A0 (of guest A) ran for 4 ms
> > B0/2[6] -> means vcpu B0 (of guest B) ran for 2 ms (and has given up
> >6ms worth of its timeslice so far). In reality, we should
> >not see too much of "given up" timeslice for a vcpu.
> 
> /me fails to parse
> 
> > > >Regarding directed yield, do we have any reliable mechanism to find 
> > > >target of
> > > >directed yield in this (unmodified/non-paravirtualized guest) case? IOW 
> > > >how do
> > > >we determine the vcpu thread to which cycles need to be yielded upon 
> > > >contention?
> > > 
> > > My idea was to yield to a random starved vcpu of the same guest.
> > > There are several cases to consider:
> > > 
> > > - we hit the right vcpu; lock is released, party.
> > > - we hit some vcpu that is doing unrelated work.  yielding thread
> > > doesn't make progress, but we're not wasting cpu time.
> > > - we hit another waiter for the same lock.  it will also PLE exit
> > > and trigger a directed yield.  this increases the cost of directed
> > > yield by a factor of count_of_runnable_but_not_running_vcpus, which
> > > could be large, but not disasterously so (i.e. don't run a 64-vcpu
> > > guest on a uniprocessor host with this)
> > > 
> > > >>  So if you were to test something similar running with a 20% vcpu
> > > >>  cap, I'm sure you'd run into similar issues.  It may show with fewer
> > > >>  vcpus (I've only tested 64).
> > > >>
> > > >>  >Are you assuming the existence of a directed yield and the
> > > >>  >specific concern is what happens when a directed yield happens
> > > >>  >after a PLE and the target of the yield has been capped?
> > > >>
> > > >>  Yes.  My concern is that we will see the same kind of problems
> > > >>  directed yield was designed to fix, but without allowing directed
> > > >>  yield to fix them.  Directed yield was designed to fix lock holder
> > > >>  preemption under contention,
> > > >
> > > >For modified guests, something like [2] seems to be the best approach to 
> > > >fix
> > > >lock-holder preemption (LHP) problem, which does not require any sort of
> > > >directed yield support. Essentially upon contention, a vcpu registers 
> > > >its lock
> > > >of interest and goes to sleep (via hypercall) waiting for lock-owner to 
> > > >wake it
> > > >up (again via another hypercall).
> > > 
> > > Right.
> > 
> > We don't have these hypercalls for KVM atm, which I am working on now.
> > 
> > > >For unmodified guests, IMHO a plain yield (or slightly enhanced yield 
> > > >[1])
> > > >should fix the LHP problem.
> > > 
> > > A plain yield (ignoring no-opiness on Linux) will penalize the
> > > running guest wrt other guests.  We need to maintain fairness.
> > 
> > Agreed on the need to maintain fairness.
> 
> Directed yield and fairness don't mix well either. You can end up
> feeding the other tasks more time than you'll ever get back.

If the directed yield is always to another task in your cgroup then
inter-guest scheduling fairness should be maintained.

> > > >Fyi, Xen folks also seem to be avoiding a directed yield for some of the 
> > > >same
> > > >reasons [3].
> > > 
> > > I think that fails for unmodified guests, where you don't know when
> > > the lock is released and so you don't have a wake_up notification.
> > > You lost a large timeslice and you can't gain it back, whereas with
> > > pv the wakeup means you only lose as much time as the lock was held.
> > > 
> > > >Given this line of thinking, hard-limiting guests (either in user-space 
> > > >or
> > > >kernel-space, latter being what I prefer) should not have adverse 
> > > >interactions
> > > >with LHP-related solutions.
> > > 
> > > If you hard-limit a vcpu that holds a lock, any waiting vcpus are
> > > also halted.
> > 
> > This can happen in normal case when lock-holders are preempted as well. So
> > not a new problem that hard-limits is introducing!
> 
> No, but hard limits make it _much_ worse.
> 
> > >  With directed yield you can let the lock holder make
> > > some progress at the expense of another vcpu.  A regular yield()
> > > will simply stall the waiter.
> > 
> > Agreed. Do you see any problems with slightly enhanced version of yield
> > described above (rather than directed yield)? It has some advantage over 
> > directed yield in t

[Qemu-devel] KVM call minutes for Nov 30

2010-12-01 Thread Chris Wright

2011 KVM Conference
- together with LF event like LinuxCon Vancouver BC (Aug), KS Prague (Nov)
- wider audience
  - include qemu (tcg)
  - include libvirt
  - include xen

0.14.0 release plan
- could push things out, mainly want to keep on track for

infrastructure changes (irc channel migration, git tree migration)
- savannah down
- git.qemu.org was mirror, will start pushing there
- when savannah is back up, will become mirror (so git users should
  still work)
- plan on moving #qemu to OFTC

nested VMX
- no progress, future plans are unclear

qemu users forum in grenoble
- worth having someone there
- goal to get embedded forks to push changes back to qemu

migration with large memory
- switching to 50ms cap likely to cause regression in terms of vcpu runtime
- 50ms qemu mutex contention, brief period of mutex access
  - this has the effect of speeding up migration but giving too little vcpu
access to qemu mutex (network connections could terminate, for example)
- only fixes to this are to use bw limit or not holding qemu mutex during
  migration
- run Anthony's test load and discuss on list

[Qemu-devel] KVM call minutes for Nov 23

2010-11-23 Thread Chris Wright

qcow2 performance roadmap
- What can be done to achieve near-raw image format performance?
  - some discussion points from Kevin on list
http://lists.nongnu.org/archive/html/qemu-devel/2010-11/msg02126.html
  - please follow up on the list
- some perf numbers (latest upstream qcow2 compared with qed)
  - qed is fully async, added unconditional flush to model qcow2
  - http://wiki.qemu.org/Qcow2/PerformanceRoadmap 
  - qcow2 not scaling as well
- metadata handling still quite sync
- sequential reads not scaling at all (a
- only serialization point is two accesses to same block and need to
  allocate
- template based backing file is common (esp. in cloud)
- perf data suggests that data/table format dictates performance ceiling
  - barriers off on underlying fs, cache=writethrough
  - raw backing file (sparse) grows with basic tools like cp
  - suggestion: qed == qcow2 v3
- wouldn't support encryption and compression (Kevin won't do this)

usb-ccid
- concern about external library implementation
  - hard to add device features, enhancements, live migration protocol changes
- external library
- will resend patch to 

vcpu hard limits
- will continue discussion on list

0.14 (release date, bug day, -rc planning, etc)
- aiming for dec 15th
- will send note out after call with release schedule

0.13.x
- will connect with jforbes regarding -stable maintenance

gPXE vs. iPXE
- ipxe is new fork
- ipxe looking more active (including original gpxe developers)
- which is a better choice?
  - iPXE more active, gPXE stalled
  - some concern about where the community sits (gPXE has irc, bug
reports, etc)
  - some concern about boot delay with iPXE
- qemu not updating roms that frequently, next time we need to update,
  can evaluate
- syslinux still using gPXE

[Qemu-devel] Re: [PATCH] qemu-kvm: introduce cpu_start/cpu_stop commands

2010-11-22 Thread Chris Wright

* Anthony Liguori (aligu...@linux.vnet.ibm.com) wrote:
> On 11/22/2010 05:04 PM, Chris Wright wrote:
> >* Anthony Liguori (aligu...@us.ibm.com) wrote:
> >>qemu-kvm vcpu threads don't respond to SIGSTOP/SIGCONT.  Instead of
> >>teaching them to respond to these signals, introduce monitor commands
> >>that stop and start individual vcpus.
> >In the past SIGSTOP has introduced time skew.  Have you verified this
> >isn't an issue?
> 
> Time skew is a big topic.  Are you talking about TSC drift,
> pit/rtc/hpet drift, etc?

Sorry to be vague, but it's been long enough that I don't recall
the details.  The guest kernel's clocksource affected how timekeeping
progressed across STOP/CONT (it was probably missing qemu-based timer ticks).
While this is not the same, it made me wonder if you'd tested against that.

> It's certainly going to stress periodic interrupt catch up code.

Heh, call it a feature for autotest ;)

thanks,
-chris

[Qemu-devel] Re: [PATCH] qemu-kvm: introduce cpu_start/cpu_stop commands

2010-11-22 Thread Chris Wright

* Anthony Liguori (aligu...@us.ibm.com) wrote:
> qemu-kvm vcpu threads don't respond to SIGSTOP/SIGCONT.  Instead of teaching
> them to respond to these signals, introduce monitor commands that stop and
> start individual vcpus.

In the past SIGSTOP has introduced time skew.  Have you verified this
isn't an issue?

thanks,
-chris

Re: [Qemu-devel] KVM call agenda for Nov 23

2010-11-22 Thread Chris Wright

* Juan Quintela (quint...@redhat.com) wrote:
> Please send in any agenda items you are interested in covering.

usb-ccid

[Qemu-devel] KVM call agenda for Nov 16

2010-11-15 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] Re: [PATCH v3] virtio-9p: fix build on !CONFIG_UTIMENSAT

2010-11-14 Thread Chris Wright

* Hidetoshi Seto (seto.hideto...@jp.fujitsu.com) wrote:
> This patch introduces a fallback mechanism for old systems that do not
> support utimensat().  This fixes a build failure with the following warnings:
> 
> hw/virtio-9p-local.c: In function 'local_utimensat':
> hw/virtio-9p-local.c:479: warning: implicit declaration of function 'utimensat'
> hw/virtio-9p-local.c:479: warning: nested extern declaration of 'utimensat'
> 
> and:
> 
> hw/virtio-9p.c: In function 'v9fs_setattr_post_chmod':
> hw/virtio-9p.c:1410: error: 'UTIME_NOW' undeclared (first use in this function)
> hw/virtio-9p.c:1410: error: (Each undeclared identifier is reported only once
> hw/virtio-9p.c:1410: error: for each function it appears in.)
> hw/virtio-9p.c:1413: error: 'UTIME_OMIT' undeclared (first use in this function)
> hw/virtio-9p.c: In function 'v9fs_wstat_post_chmod':
> hw/virtio-9p.c:2905: error: 'UTIME_OMIT' undeclared (first use in this function)
> 
> v3:
>   - Use better alternative handling for UTIME_NOW/OMIT
>   - Move qemu_utimensat() to cutils.c
> V2:
>   - Introduce qemu_utimensat()
> 
> Signed-off-by: Hidetoshi Seto 

Looks good to me (no strong opinion on the cutils vs oslib-posix that
Jes mentioned).

Acked-by: Chris Wright

[Qemu-devel] Re: [PATCH] virtio-9p: fix build on !CONFIG_UTIMENSAT v2

2010-11-13 Thread Chris Wright

* Hidetoshi Seto (seto.hideto...@jp.fujitsu.com) wrote:
> +    /*
> +     * Fallback: use utimes() instead of utimensat().
> +     * See commit 74bc02b2d2272dc88fb98d43e631eb154717f517 for known problem.
> +     */
> +    struct timeval tv[2];
> +    int i;
> +
> +    for (i = 0; i < 2; i++) {
> +        if (times[i].tv_nsec == UTIME_OMIT || times[i].tv_nsec == UTIME_NOW) {
> +            tv[i].tv_sec = 0;
> +            tv[i].tv_usec = 0;

I don't think this is accurate in either case.  It will set the
atime, mtime, or both to 0.

For UTIME_NOW (in both) we'd simply pass NULL to utimes(2).  For
UTIME_OMIT (in both) we'd simply skip the call to utimes(2) altogether.

The harder part is a mixed mode (i.e. the truncate fix mentioned in the
above commit).  I think the only way to handle UTIME_NOW in one is to
call gettimeofday (or clock_gettime for better resolution) to find out
what current time is.  And for UTIME_OMIT call stat to find out what the
current setting is and reset to that value.  Both of those cases can
possibly zero out the extra precision (providing only seconds
resolution).
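
A rough sketch of the handling described above, assuming a Linux host and
path-based access only.  utimes_fallback() is a made-up name, not the function
from the patch, and the UTIME_* fallback defines are only there for headers
that predate them:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <time.h>

/* Values used by Linux; defined here only for headers that predate them. */
#ifndef UTIME_NOW
#define UTIME_NOW  ((1l << 30) - 1l)
#endif
#ifndef UTIME_OMIT
#define UTIME_OMIT ((1l << 30) - 2l)
#endif

/* Hypothetical helper: emulate utimensat(AT_FDCWD, path, times, 0) via utimes(2). */
static int utimes_fallback(const char *path, const struct timespec times[2])
{
    struct timeval tv[2], now;
    struct stat st;
    int i;

    /* NULL or UTIME_NOW in both slots: utimes(path, NULL) sets both to now. */
    if (!times ||
        (times[0].tv_nsec == UTIME_NOW && times[1].tv_nsec == UTIME_NOW)) {
        return utimes(path, NULL);
    }
    /* UTIME_OMIT in both slots: nothing to change, so skip the call. */
    if (times[0].tv_nsec == UTIME_OMIT && times[1].tv_nsec == UTIME_OMIT) {
        return 0;
    }
    /* Remaining cases (explicit values and/or a mix): fetch the current
     * time and the file's current timestamps, then fill in each slot. */
    if (gettimeofday(&now, NULL) < 0 || stat(path, &st) < 0) {
        return -1;
    }
    for (i = 0; i < 2; i++) {
        if (times[i].tv_nsec == UTIME_NOW) {
            tv[i] = now;
        } else if (times[i].tv_nsec == UTIME_OMIT) {
            /* Re-apply the existing value; sub-second precision is dropped. */
            tv[i].tv_sec = (i == 0) ? st.st_atime : st.st_mtime;
            tv[i].tv_usec = 0;
        } else {
            tv[i].tv_sec = times[i].tv_sec;
            tv[i].tv_usec = times[i].tv_nsec / 1000;   /* ns -> us */
        }
    }
    return utimes(path, tv);
}
```

The stat-then-utimes sequence in the mixed case is not atomic, so a
concurrent update to the omitted timestamp could be overwritten.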

thanks,
-chris

[Qemu-devel] KVM call minutes for Nov 9

2010-11-09 Thread Chris Wright

linux plumbers
- qemu talks, including xen folks efforts to get patches upstream
  - considering virtio
  - considering seabios
  - even some xenner interest
- seabios presentation
- uefi
  - still needs CSM support (lot of work) to be the only BIOS
  - otherwise need legacy BIOS and UEFI and users choose
  - who's interested?
- host numa support (guest placement and home noding)
  - static pinning fine for benchmarking
  - -mempath
  - migrate_pages(2) can be called from userspace to push pages around
  - like to push this framework into the kernel
  - wiki page for planning coming RSN
- ways to improve qemu ...
  - C++, RFC patches to show value of language object support and device model
  - like to get to devices modular enough to have unit tests, etc.
  - any security interest in using a stronger typed language?

kernel summit
- showed up in many talks (mainstream part of the kernel)
- KVM for ARM hallways discussion
- more developers interested in joining the project (need to prepare for this)

USB 2.0 ehci support
- anyone looking at this?
  - Jes has it on his todo list (albeit not near the top priority)
- should send out a note summarizing status of that tree (Jan will send note)
  - is USB passthrough fully working with this?
- USB 3.0 coming, plan for that (maybe go straight there)

openfirmware tree
- problem: stable/unique names for devices (like UUID generated at dev creation)
- addressing device externally from QEMU is still hard
- also can't touch the user defined namespace

Re: [Qemu-devel] qemu-kvm build issue on RHEL5.1

2010-11-04 Thread Chris Wright

* Hidetoshi Seto (seto.hideto...@jp.fujitsu.com) wrote:
> (2010/10/14 4:11), Blue Swirl wrote:
> > On Wed, Oct 13, 2010 at 8:00 AM, Hidetoshi Seto wrote:
> >> (Add CC to k...@vger)
> >>
> >> (2010/10/12 10:52), Hao, Xudong wrote:
> >>> Hi,
> >>> Currently the qemu-kvm build fails on RHEL5 with gcc 4.1.2; the build passes
> >>> on Fedora11 with gcc 4.4.1. Can anybody look at this on a RHEL5 system?
> >>>
> >>> Gcc: 4.1.2
> >>> system: RHEL5.1
> >>> qemu-kvm: 85566812a4f8cae721fea0224e05a7e75c08c5dd
> >>>
> >>> ...
> >>>   LINK  qemu-img
> >>>   LINK  qemu-io
> >>>   CC    libhw64/virtio-9p-local.o
> >>> cc1: warnings being treated as errors
> >>> /home/source/qemu-kvm/hw/virtio-9p-local.c: In function 'local_utimensat':
> >>> /home/source/qemu-kvm/hw/virtio-9p-local.c:479: warning: implicit declaration of function 'utimensat'
> >>> /home/source/qemu-kvm/hw/virtio-9p-local.c:479: warning: nested extern declaration of 'utimensat'
> >>> make[1]: *** [virtio-9p-local.o] Error 1
> >>> make: *** [subdir-libhw64] Error 2
> >>>
> >>>
> >>> Best Regards,
> >>> Xudong Hao
> >>
> >> It seems that this issue is caused by the old glibc.
> >> Though I don't know well about virtio-9p and suppose there
> >> should be better fix, I confirmed that following change
> >> removed the warnings.
> > 
> > But then the system call will be made blindly without checking if the
> > kernel supports utimensat(). At the minimum, there should be a sane
> > response to ENOSYS error.
> 
> Yes. But I'm not sure how virtio-9p should behave if there is
> no utimensat.  I think it would be better to fix this warning first,
> so that people using RHEL5 can resume contributing to qemu-kvm,
> and then treat the remaining issue as a virtio-9p specific problem so
> that virtio-9p specialists can discuss a fix without
> bothering other developers.

One way to work around this is to simply not install libattr-devel
(effectively disabling virtio-9p).

But I agree with Blue Swirl, need a better fallback plan.  A qemu local
implementation of something like qemu_utimensat() that simply uses
glibc/kernel interface when available and falls back to using utimes()
makes sense to me.  Then the worst case is loss of resolution from ns to
us.
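
Roughly what that fallback could look like, as a sketch only.  It assumes the
headers already declare utimensat() (so it covers the runtime ENOSYS case Blue
Swirl raised, not the build failure on old glibc), and it ignores the special
UTIME_NOW/UTIME_OMIT values.  utimensat_compat() and ts_to_tv() are
placeholder names:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <time.h>

/* Convert a timespec to a timeval, truncating nanoseconds to microseconds. */
static struct timeval ts_to_tv(struct timespec ts)
{
    struct timeval tv;

    tv.tv_sec = ts.tv_sec;
    tv.tv_usec = ts.tv_nsec / 1000;
    return tv;
}

/* Placeholder wrapper: prefer utimensat(2), fall back to utimes(2) on ENOSYS. */
static int utimensat_compat(const char *path, const struct timespec times[2])
{
    struct timeval tv[2];
    int ret = utimensat(AT_FDCWD, path, times, 0);

    if (ret == 0 || errno != ENOSYS) {
        return ret;             /* success, or a real error to report */
    }
    if (!times) {
        return utimes(path, NULL);
    }
    tv[0] = ts_to_tv(times[0]);
    tv[1] = ts_to_tv(times[1]);
    return utimes(path, tv);    /* worst case: ns resolution becomes us */
}
```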

thanks,
-chris

[Qemu-devel] KVM call minutes for Oct 26

2010-10-26 Thread Chris Wright

Guest Agents
- need to get to guest userspace for many actions
- virtio for userspace
- host backend needs to terminate in QEMU
- portable across guest OS's
- virt-agent
  - bi-directional RPC (XML-RPC just since it's easy)
  - cmd: shutdown, reboot, dmesg, execute command, read/write file
- query guest type
- Matahari
  - consolidate agent proliferation
  - w/ or w/out networking (virtio-serial is fine)
  - may or may not have access to host
  - using amqp
- single transport
- use of qpid (C++, not C friendly)
  - put qemu bits in library and wrap (similar to libvirt, netcf, etc)
- e.g. shutdown, where does it live?
  - ACPI shutdown can trigger dbus, dialog, etc.
  - can already do ACPI, agent is for direct "shutdown -h now"
  - is there an async notification on shutdown (know it's been successful)?
  - perhaps another library like libsysconfig
- openvmtools
  - useful as reference, GPL (requires copyright assignment)
  - uses PIO (or socket)

0.13.1
- vmmouse is broken
- assertion failure in block layer
  - just qemu-img: https://bugzilla.redhat.com/show_bug.cgi?id=646538
- patch posted, thanks Stefan
- hotplug fixes
- fix for seabios SCI level triggered (broken host initiated powerdown on
  FreeBSD)
  - regression <-- any regression needs to be considered seriously
  - was planning to move to 0.6.1 (vs. latest git snapshot)
  - kevin indicated ok with stable/tagging/branch for seabios

bootindex patch series
- qdev name vs specific name
- fine for seabios interface
- migration needs stable address too
  - worth holding up series for this?
  - one more try, then fallback to plan b (new callback)

migration issues
- keep using network after VM has stopped
- sent hacky patch for virtio-net, but need a generic sol'n
- virtio-block flushes after stop
- need a similar stop/flush for other devices
- unclear how anything is running w/out getting back to main loop
  - happening after migration completes

networking interfaces
- old vlan vs new netdev...be nice to finish off and simplify internal
  interfaces

[Qemu-devel] Re: KVM call agenda for Oct 26

2010-10-25 Thread Chris Wright

* Juan Quintela (quint...@redhat.com) wrote:
> Please send in any agenda items you are interested in covering.

Guest agents

Re: [Qemu-devel] KVM call minutes for Oct 19

2010-10-22 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> On 10/22/2010 01:20 PM, Chris Wright wrote:
> >I'm not sure about that.  That same new shiny Fedora 21 QEMU has no idea
> >what the right OS specific command to run in guest is.  Granted, it's
> >not likely that "reboot" or "shutdown -r now" are likely to change for
> >Linux guests, do we assume cygwin for Windows guests?
> 
> No, but I'll wave my hands and say that I'm sure Windows has some
> appropriate mechanism to do the same thing (like PowerShell).

OK (bleh), but it's still specific to the guest OS.

> >   Really seems to
> >make more sense to have a stable ABI and negotiate version.
> 
> I guess the point is: we can always teach QEMU about how to work
> around older guests.  We (usually) can't control the software that's
> present on the guest itself.

I don't understand why we'd work around an older guest if the host <->
guest interface is stable.  Sure it can be extended, but old interfaces
should keep on Just Working (TM).

> The more logic we have in QEMU, the less we have to change the
> software in the guest which means the more likely things will work.

Maybe you're saying the advantage of injecting the raw commands into
the guest is that a host rev will automagically give an old guest new
functionality?

> >Also, from the point of view of a cloud where a VM agent is awfully
> >close to provider having backdoor into VM...a freeform vm_system()
> >doesn't seem like it'd be real popular.
> 
> This is the best (irrational) argument against this practice.
> Obviously, there's no real security concern here, but the end-user
> view may be troubling.

Heh, cloud + security == irrational fear, basic axiom

> That said, VMware has an interface for exactly this, so at least it's an
> established practice.

OK, what about other bits of API?  I recall seeing things like cut'n
paste, reboot, ballooning, time, few bits that spice would care about...
Are you thinking that as well, or all in terms of read/write/exec?

thanks,
-chris

Re: [Qemu-devel] KVM call minutes for Oct 19

2010-10-22 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> On 10/22/2010 12:29 PM, Chris Wright wrote:
> >* Anthony Liguori (anth...@codemonkey.ws) wrote:
> >>The first step is just identifying what interfaces we need in a
> >>guest agent.  So far, I think we can get away with a very small
> >>number of interfaces (mainly read/write files, execute command).
> >Could you elaborate here?  I can't imagine you mean:
> >
> >vm_system(target_vm, "shutdown -r now")
> >
> >But from other post, it does seem you want complexity in the host side
> >not guest side of agent.
> >
> >Seems vm_reboot(target_vm) as the RPC makes more sense with the guest
> >side implementing that in whatever guest-specific appropriate way.
> 
> What I really want is a vm_system API that a guest agent MUST
> implement and then APIs like vm_reboot that a guest agent MAY
> implement.
> 
> In my mind, the guest agent lives in the distros even though it's
> built from QEMU source tree.  We don't install it ourselves.
> 
> That means we might have a new funky fresh version of Fedora 21
> version of QEMU but running an old Fedora 14 guest with a really
> back-level guest agent.
> 
> Having very low level APIs with logic that primarily lives in QEMU
> gives us the ability to support new features in QEMU with older
> guests.

I'm not sure about that.  That same new shiny Fedora 21 QEMU has no idea
what the right OS specific command to run in guest is.  Granted, it's
not likely that "reboot" or "shutdown -r now" are likely to change for
Linux guests, do we assume cygwin for Windows guests?  Really seems to
make more sense to have a stable ABI and negotiate version.

Also, from the point of view of a cloud where a VM agent is awfully
close to provider having backdoor into VM...a freeform vm_system()
doesn't seem like it'd be real popular.

thanks,
-chris

Re: [Qemu-devel] KVM call minutes for Oct 19

2010-10-22 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> The first step is just identifying what interfaces we need in a
> guest agent.  So far, I think we can get away with a very small
> number of interfaces (mainly read/write files, execute command).

Could you elaborate here?  I can't imagine you mean:

vm_system(target_vm, "shutdown -r now")

But from other post, it does seem you want complexity in the host side
not guest side of agent.

Seems vm_reboot(target_vm) as the RPC makes more sense with the guest
side implementing that in whatever guest-specific appropriate way.
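
To make the vm_reboot() idea concrete, here is a minimal sketch of the guest
side: the host sends a named command and the agent maps it to whatever is
appropriate for that guest.  Everything here (agent_dispatch, the command
table, the status codes) is hypothetical and not taken from any posted agent:

```c
#include <stddef.h>
#include <string.h>

enum { AGENT_OK = 0, AGENT_ENOTSUP = -1 };

typedef int (*agent_handler)(void);

struct agent_cmd {
    const char *name;        /* RPC name as sent by the host */
    agent_handler handler;   /* guest-specific implementation */
};

/* Guest-specific implementation.  A real agent would call into the OS here
 * (e.g. run "shutdown -r now" on Linux); this stub only reports success. */
static int guest_reboot(void)
{
    return AGENT_OK;
}

/* Commands this particular guest agent implements. */
static const struct agent_cmd agent_cmds[] = {
    { "vm_reboot", guest_reboot },
};

/* Look up the command by name; unknown commands are reported, not guessed. */
static int agent_dispatch(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(agent_cmds) / sizeof(agent_cmds[0]); i++) {
        if (strcmp(agent_cmds[i].name, name) == 0) {
            return agent_cmds[i].handler();
        }
    }
    return AGENT_ENOTSUP;
}
```

The host never needs to know the guest-specific command line; it only has to
cope with an explicit "not supported" reply from an older agent.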

thanks,
-chris

Re: [Qemu-devel] KVM call minutes for Oct 19

2010-10-21 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> So there's no doubt in my mind that if you need a way to inventory
> physical and virtual systems, something like Matahari becomes a very
> appealing option to do that.
> 
> But that's not the problem space I'm trying to tackle.
> 
> An example of the problem I'm trying to tackle is guest reboot.

Matahari already has shutdown and reboot methods.

Inventory, reboot, filesystem freeze, cut'n paste, etc.. all are
communicating between host and guest.  Main point is to consolidate
effort to keep from having some sprawl of agents (which agent do I
install to do reboot?).

thanks,
-chris

[Qemu-devel] KVM Forum 2010: videos online [was Re: KVM Forum 2010: presentations online]

2010-10-19 Thread Chris Wright

* Chris Wright (chr...@redhat.com) wrote:
> We were also able to video the speakers, and will send a note when the
> videos are available.
> (and thanks again to Andrew Cathrow for making this happen)

I don't think a note went out yet.  The videos are available as well.

thanks,
-chris

[Qemu-devel] KVM call minutes for Oct 19

2010-10-19 Thread Chris Wright

0.13.X -stable
- Anthony will send note to qemu-devel on this
- move 0.13.X -stable to a separate tree
- driven independently of main qemu tree
- challenge is always in the porting and testing of backported fixes
- looking for volunteers

0.14
- would like to do this before end of the year
- 0.13 forked off a while back (~July), 
- 0.14 features
  - QMP stabilized
- 0.13.0 -> 0.14 QMP
- hard attempt not to break compatibility
- new commands, rework, async, human monitor passthrough
- goal getting to libvirt not needing human monitor at all
- QMP KVM autotest test suite submitted
- in-kernel apic, tpr patching still outstanding
- QED coroutine concurrency

Live snapshots
- merge snapshot?
  - already supported, question about mgmt of snapshot chain
- integrate with fsfreeze (and windows alternative)

Guest Agent
- have one coming RSN (poke Anthony for details)
- works over legacy or virtio serial
- simple RPC mechanism between host/guest
- allows host initiated reboot, for example
- can be a place to do host driven snapshot too
- Matahari?
  - can deal w/ block/net/UUID/cluster/etc
  - too heavyweight?
  - be sure to coordinate

threadlets and concurrency model (for virtfs)
- prior discussions
  - 1st model: http://www.mail-archive.com/qemu-devel@nongnu.org/msg43838.html
  - 2nd model: http://www.mail-archive.com/qemu-devel@nongnu.org/msg43921.html
  - threadlets: http://www.mail-archive.com/qemu-devel@nongnu.org/msg43842.html
- please review and continue discussion on the list
- concurrency model questions
  - async state machine is easiest to merge right now
  - future work could push it to cooperative coroutines

usb-ccid (aka external device modules such as vtpm)
- isolating device specific interface from qemu device internals is hard
- usb-ccid description...(go read the patches) 
- technical complexity with external emulation
  - version skew
  - live migration
  - same complexity as full plug-in

[Qemu-devel] Re: KVM call agenda for Oct 19

2010-10-18 Thread Chris Wright

* Juan Quintela (quint...@redhat.com) wrote:
> 
> Please send in any agenda items you are interested in covering.

- 0.13.X -stable handoff
- 0.14 planning
- threadlet work
- virtfs proposals

Re: [Qemu-devel] [patch uq/master 0/8] port qemu-kvm's MCE support

2010-10-05 Thread Chris Wright

* Andreas Färber (andreas.faer...@web.de) wrote:
> Am 04.10.2010 um 20:54 schrieb Marcelo Tosatti:
> 
> I assume something went wrong with your cover letter here. It
> would've been nice to see MCE spelled out or summarized for those of
> us that don't speak x86.

It would help.  The acronym is Machine Check Exception.  The patchset
should allow (on newer Intel x86 hw with a newer linux kernel) a class of
memory errors delivered to the host OS as MCEs to be propagated to the
guest OS.  Without the patchset, the qemu process associated with the
memory where the error took place would be killed.  With the patchset,
qemu can propagate the error into the guest and allow the guest to kill
only the process within the guest that is associated with the memory error.

[Qemu-devel] Re: KVM call agenda for Oct 5

2010-10-05 Thread Chris Wright

* Chris Wright (chr...@redhat.com) wrote:
> Please send in any agenda items you are interested in covering.

No agenda, call cancelled.

thanks,
-chris

[Qemu-devel] KVM call agenda for Oct 5

2010-10-04 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] KVM call agenda for Sept 28

2010-09-27 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] KVM call minutes for Sept 21

2010-09-21 Thread Chris Wright

Nested VMX
- looking for forward progress and better collaboration between the
  Intel and IBM teams
- needs more review (not a new issue)
- use cases
- work todo
  - merge baseline patch
- looks pretty good
- review is finding mostly small things at this point
- need some correctness verification (both review from Intel and testing)
  - need a test suite
- test suite harness will help here
  - a few dozen nested SVM tests are there, can follow for nested VMX
  - nested EPT
  - optimize (reduce vmreads and vmwrites)
- has long term maintan

Hotplug
- command...guest may or may not respond
- guest can't be trusted to be direct part of request/response loop
- solve at QMP level
- human monitor issues (multiple successive commands to complete a
  single unplug)
  - should be a GUI interface design decision, human monitor is not a
good design point
- digression into GUI interface

Drive caching
- need to formalize the meanings in terms of data integrity guarantees
- guest write cache (does it directly reflect the host write cache?)
  - live migration, underlying block dev changes, so need to decouple the two
- O_DIRECT + O_DSYNC
  - O_DSYNC needed based on whether disk cache is available
  - also issues with sparse files (e.g. O_DIRECT to unallocated extent)
  - how to manage w/out needing to flush every write, slow
- perhaps start with O_DIRECT on raw, non-sparse files only?
- backend needs to open backing store matching the guest's disk cache state
- O_DIRECT itself has inconsistent integrity guarantees
  - works well with fully allocated file, dependent on disk cache disable
(or fs specific flushing)
- filesystem specific warnings (ext4 w/ barriers on, btrfs)
- need to be able to open w/ O_DSYNC depending on guest's write cache mode
- make write cache visible to guest (need a knob for this)
- qemu default is cache=writethrough, do we need to revisit that?
- just present user with option whether or not to use host page cache
- allow guest OS to choose disk write cache setting
  - set up host backend accordingly
- be nice to preserve write cache settings over boot (outgrowing cmos storage)
- maybe some host fs-level optimization possible
  - e.g. O_DSYNC to allocated O_DIRECT extent becomes no-op
- conclusion
  - one direct user tunable, "use host page cache or not"
  - one guest OS tunable, "enable disk cache"
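
One possible reading of the two tunables in the conclusion, sketched as a
mapping onto open(2) flags on a Linux host.  The mapping is an assumption for
illustration (the minutes themselves note that O_DIRECT alone has inconsistent
integrity guarantees), and backend_open_flags() is a made-up name:

```c
#define _GNU_SOURCE          /* O_DIRECT is a GNU/Linux extension */
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper: derive backing-store open flags from
 *  - the user tunable "use host page cache or not", and
 *  - the guest OS tunable "enable disk (write) cache".
 */
static int backend_open_flags(bool use_host_page_cache,
                              bool guest_write_cache_enabled)
{
    int flags = O_RDWR;

    if (!use_host_page_cache) {
        flags |= O_DIRECT;   /* bypass the host page cache */
    }
    if (!guest_write_cache_enabled) {
        flags |= O_DSYNC;    /* each write must be stable before completing */
    }
    return flags;
}
```

With the guest write cache enabled, the guest is expected to issue explicit
flushes, so O_DSYNC is left off and not every write has to be flushed.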

[Qemu-devel] KVM call agenda for Sept 21

2010-09-20 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

Re: [Qemu-devel] Re: ACPI error when mapping a 2GB BAR w/ 4GB of RAM

2010-09-17 Thread Chris Wright

* Cam Macdonell (c...@cs.ualberta.ca) wrote:
> On Fri, Sep 17, 2010 at 2:52 PM, Cam Macdonell wrote:
> > On Fri, Sep 17, 2010 at 2:04 PM, Chris Wright wrote:
> >> * Cam Macdonell (c...@cs.ualberta.ca) wrote:
> >>> After fixing the resource_size_t return value with
> >>> pci_resource_alignment, I see one other strange behaviour only when
> >>> using 4GB of RAM and a 2GB BAR.  I haven't found any other combination
> >>> of RAM/BAR size that triggers this bug.  I am using 2.6.36-rc3.
> >>>
> >>> ACPI Error: The DSDT has been corrupted or replaced - old, new headers
> >>> below (20100702/tbutils-372)
> >>> ACPI: DSDT (null) 01F15 (v01   BXPC   BXDSDT 0001 INTL 20090123)
> >>> ACPI:      (null) 0 (v00                       )
> >>> ACPI Error: Please send DMI info to linux-a...@vger.kernel.org
> >>> If system does not work as expected, please boot with acpi=copy_dsdt
> >>> (20100702/tbutils-378)
> >>> ACPI: PCI Interrupt Link [LNKC] disabled and referenced, BIOS bug
> >>> ACPI Exception: AE_AML_INVALID_RESOURCE_TYPE, Evaluating _CRS
> >>> (20100702/pci_link-283)
> >>> ACPI: Unable to set IRQ for PCI Interrupt Link [LNKC]. Try pci=noacpi
> >>> or acpi=off
> >>> virtio-pci :00:03.0: PCI INT A: no GSI - using ISA IRQ 11
> >>> Non-volatile memory driver v1.3
> >>> Linux agpgart interface v0.103
> >>> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> >>
> >> IIRC,  the pci hole is only 1.5G in the BIOS, can you verify that
> >> seabios is doing the right thing?
> >
> > I'm not sure what the right thing for seabios to do is.  Here is the
> > seabios output related to the device.
> >
> > PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
> > region 0: 0xf204
> > init smm
> > init boot device ordering
> >
> > 
> >
> > Attempting to init PCI bdf 00:04.0 (vd 1af4:1110)
> > Attempting to map option rom on dev 00:04.0
> > Option rom sizing returned 0 0
> > Checking rom 0x000c9800 (sig aa55 size 17)
> > Checking rom 0x000cc000 (sig aa55 size 2)
> > Checking rom 0x000c9000 (sig aa55 size 4)
> > Checking rom 0x000c9800 (sig aa55 size 17)
> > Checking rom 0x000cc000 (sig aa55 size 2)
> > Mapping hd drive 0x000fdb50 to 0
> > Running option rom at c980:0003
> > Running option rom at cc00:0003
> > pmm_malloc zone=0x000f515c handle= size=36 align=10
> > ret=0x000fdaf0 (detail=0x7ffefca0)
> > ebda moved from 9fc00 to 9f400
> > pmm_malloc zone=0x000f5154 handle= size=2048 align=10
> > ret=0x0009f800 (detail=0x7ffefb40)
> > finalize PMM
> > malloc finalize
> >
> > when using a BAR of 2GB or less, there is an additional write to the
> > PCI space of the device, which may be from the bios
> >
> > pci_write_config: (val) 0x -> 0x18 (addr)
> > pci_read_config: (val) 0x8004 <- 0x18 (addr)
> > pci_write_config: (val) 0x4 -> 0x18 (addr)
> > pci_write_config: (val) 0x3 -> 0x4 (addr)
> > pci_read_config: (val) 0x0 <- 0x1c (addr)
> > pci_write_config: (val) 0x -> 0x1c (addr)
> > IVSHMEM: guest pci addr = , guest h/w addr =
> > 4312137728, size = 8000
> >
> > so is it succeeding with smaller sizes (> 2GB) because it fits in the
> > bios' pci hole?
> 
> sorry that should be "< 2GB".

It seems most likely...< 2GB also means <= 1GB (which would fit in
the hole).  Although, I have to admit, I'm not sure how seabios handles
the hole nowadays.
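
The arithmetic behind that guess, as a sketch.  It assumes the hole sits
directly below 4GB, is otherwise empty, and that a BAR must be naturally
aligned to its (power-of-two) size; the 1.5G figure is only my recollection
from above, and bar_fits() is an illustrative helper, not seabios code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Can a naturally aligned BAR of bar_size be placed in [hole_start, hole_end)? */
static bool bar_fits(uint64_t hole_start, uint64_t hole_end, uint64_t bar_size)
{
    uint64_t base = align_up(hole_start, bar_size);

    return base + bar_size <= hole_end;
}
```

With a 1.5G hole (0xA0000000 up to 0x100000000), a 1GB BAR can sit at
0xC0000000, but a 2GB BAR would need to start at 0x80000000, which is below
the hole, so it does not fit.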

What about 2GB with 32bit BAR?

thanks,
-chris

[Qemu-devel] Re: ACPI error when mapping a 2GB BAR w/ 4GB of RAM

2010-09-17 Thread Chris Wright

* Cam Macdonell (c...@cs.ualberta.ca) wrote:
> After fixing the resource_size_t return value with
> pci_resource_alignment, I see one other strange behaviour only when
> using 4GB of RAM and a 2GB BAR.  I haven't found any other combination
> of RAM/BAR size that triggers this bug.  I am using 2.6.36-rc3.
> 
> ACPI Error: The DSDT has been corrupted or replaced - old, new headers
> below (20100702/tbutils-372)
> ACPI: DSDT (null) 01F15 (v01   BXPC   BXDSDT 0001 INTL 20090123)
> ACPI:  (null) 0 (v00   )
> ACPI Error: Please send DMI info to linux-a...@vger.kernel.org
> If system does not work as expected, please boot with acpi=copy_dsdt
> (20100702/tbutils-378)
> ACPI: PCI Interrupt Link [LNKC] disabled and referenced, BIOS bug
> ACPI Exception: AE_AML_INVALID_RESOURCE_TYPE, Evaluating _CRS
> (20100702/pci_link-283)
> ACPI: Unable to set IRQ for PCI Interrupt Link [LNKC]. Try pci=noacpi
> or acpi=off
> virtio-pci :00:03.0: PCI INT A: no GSI - using ISA IRQ 11
> Non-volatile memory driver v1.3
> Linux agpgart interface v0.103
> Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled

IIRC, the pci hole is only 1.5G in the BIOS, can you verify that
seabios is doing the right thing?

thanks,
-chris
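
As a rough illustration of why a 2GB BAR and a 1GB BAR behave differently
here: PCI BARs are power-of-two sized and naturally aligned, so a BAR only
fits if a size-aligned address exists inside the hole. The sketch below
assumes, purely for illustration, that the 1.5G hole sits directly below
4GB (0xA0000000 to 0x100000000); the real seabios layout would need to be
checked. bar_fits_in_hole() is a hypothetical helper, not a seabios or
kernel function.

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not from seabios or the kernel).
 * Returns true and stores the lowest usable base address in *base when a
 * naturally aligned BAR of `size` bytes fits inside [hole_start, hole_end).
 */
bool bar_fits_in_hole(uint64_t hole_start, uint64_t hole_end,
                      uint64_t size, uint64_t *base)
{
    /* BAR sizes are powers of two. */
    if (size == 0 || (size & (size - 1)) != 0)
        return false;

    /* Round hole_start up to the next multiple of size. */
    uint64_t aligned = (hole_start + size - 1) & ~(size - 1);

    /* Reject wrap-around and addresses past the end of the hole. */
    if (aligned < hole_start || aligned >= hole_end)
        return false;

    /* The whole BAR must end inside the hole. */
    if (hole_end - aligned < size)
        return false;

    *base = aligned;
    return true;
}
```

Under that assumed layout a 1GB BAR lands at 0xC0000000, while a 2GB BAR
has no aligned address inside the hole and so would need a 64bit BAR
placed above 4GB.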

[Qemu-devel] KVM call minutes for Sept 14

2010-09-14 Thread Chris Wright

0.13
- if all goes well...tomorrow

stable tree
- please look at -stable to see what is missing (bugfixes)
  - esp. regressions from 0.12
- looking for dedicated stable maintainer/release manager
  - pick this discussion up next week

qed/qcow2
- increase concurrency, performance
- threading vs state machine
- avi doesn't like qed reliance on fsck
  - undermines value of error checking (errors become normal)
  - prefer preallocation and fsck just checks for leaked blocks
- just load and validate metadata
- options for correctness are
  - fsync at every data allocation
  - leak data blocks
  - scan
- qed is pure state machine
  - state on stack, control flow vs function call
- common need to separate handle requests concurrently, issue async i/o
- most disk formats have similar metadata and methods
  - lookup cluster, read/write data
  - qed could be a path to cleaning up other formats (reusing)
- need an incremental way to improve qcow2 performance
  - threading doesn't seem to be the way to achieve this (incrementally)
- coroutines vs. traditional threads discussion
  - parallel (and locking) vs few well-defined preemption points
- plan for qed...attempt to merge in 0.14
  - online fsck support is all that's missing
  - add bdrv check callback, look for new patch series over the next week
- back to list with discussion...

[Qemu-devel] KVM call agenda for Sept 14

2010-09-13 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] [PATCH] pci: fix pci_resource_alignment prototype

2010-09-07 Thread Chris Wright

From: Cam Macdonell 

* Cam Macdonell (c...@cs.ualberta.ca) wrote:
> It seems it was the alignment value being passed back from
> pci_resource_alignment().  The return type is an int, which was
> causing a value of 2GB to be sign extended to 0xffffffff80000000.
> Changing the return type to resource_size_t allows BAR values >= 2GB
> to be successfully assigned.

> -static inline int pci_resource_alignment(struct pci_dev *dev,
> +static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
>  struct resource *res)

Yes, that's my mistake.  Thanks for debugging the issue Cam.
This fixes the prototype for both pci_resource_alignment() and
pci_sriov_resource_alignment().

Patch started as debugging effort from Cam Macdonell.

Cc: Cam Macdonell 
Cc: Avi Kivity 
Cc: Jesse Barnes 
[chrisw: add iov bits]
Signed-off-by: Chris Wright 
---
 drivers/pci/iov.c |2 +-
 drivers/pci/pci.h |5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index ce6a366..553d8ee 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -608,7 +608,7 @@ int pci_iov_resource_bar(struct pci_dev *dev, int resno,
  * the VF BAR size multiplied by the number of VFs.  The alignment
  * is just the VF BAR size.
  */
-int pci_sriov_resource_alignment(struct pci_dev *dev, int resno)
+resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno)
 {
struct resource tmp;
enum pci_bar_type type;
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 679c39d..5d0aeb1 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -262,7 +262,8 @@ extern int pci_iov_init(struct pci_dev *dev);
 extern void pci_iov_release(struct pci_dev *dev);
 extern int pci_iov_resource_bar(struct pci_dev *dev, int resno,
enum pci_bar_type *type);
-extern int pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
+extern resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev,
+   int resno);
 extern void pci_restore_iov_state(struct pci_dev *dev);
 extern int pci_iov_bus_range(struct pci_bus *bus);
 
@@ -318,7 +319,7 @@ static inline int pci_ats_enabled(struct pci_dev *dev)
 }
 #endif /* CONFIG_PCI_IOV */
 
-static inline int pci_resource_alignment(struct pci_dev *dev,
+static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
 struct resource *res)
 {
 #ifdef CONFIG_PCI_IOV
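
The sign extension fixed by the patch above can be reproduced outside the
kernel. The following is a minimal sketch, not kernel code: the typedef is
a stand-in for a 64-bit resource_size_t, and the function names are made
up for the illustration. Converting an out-of-range value to int is
implementation-defined in C; the comments describe what common
two's-complement compilers such as gcc and clang do.

```c
#include <stdint.h>

/* Stand-in for the kernel type, assuming 64-bit resources. */
typedef uint64_t resource_size_t;

/* Old prototype shape: the alignment is squeezed through an int. */
int alignment_as_int(resource_size_t align)
{
    /* 0x80000000 does not fit in a 32-bit int; it becomes negative. */
    return (int)align;
}

/* New prototype shape: the alignment keeps its full width. */
resource_size_t alignment_as_resource_size(resource_size_t align)
{
    return align;
}

/* What a caller storing the old return value in a 64-bit variable sees. */
resource_size_t widen_old_return(resource_size_t align)
{
    /* The negative int is sign extended when widened to 64 bits. */
    return (resource_size_t)alignment_as_int(align);
}
```

With a 2GB alignment the old path yields 0xffffffff80000000, while a 1GB
alignment passes through unchanged, which matches the observation that
only BARs of 2GB and up failed to be assigned.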

[Qemu-devel] KVM call minutes for Sept 7

2010-09-07 Thread Chris Wright

0.13 schedule
- RSN
- rc1 uploaded, tagged in git (and tag should actually be there now)
- announcement once it propagates
- 0.13.0 should be 1 week after rc1 announcement
- please check rc1 for any missing critical patches

qed
- concession that qcow2 is complicated and hard to get right
- it's much more efficient than qcow2
- not had data integrity testing, but simple enough design to
  rationalize the format and meta-data updates
- formal spec planned...documented on wiki http://wiki.qemu.org/Features/QED
  - design doc written first, code written to design doc
- defragmentation supportable and important (not done now)
- defragmented image should be as fast as raw
- concern about splitting install base (doubles qa effort, etc)
  - should be possible to do an in-place qcow2->qed update
  - even live update could be doable
- what about vmdk or vhd?
  - controlled externally
  - specification license implications are unclear
  - too close to NIH?
- qed and async model could put pressure to improve other formats and
  push code out of qed to core
- another interest for qed...streaming images (fault in image extents
  via network)
  - want to design this as starting from mgmt interface discussion

Re: [Qemu-devel] [PATCH 1/5] virtio-net: Make tx_timer timeout configurable

2010-08-31 Thread Chris Wright

* Alex Williamson (alex.william...@redhat.com) wrote:
> On Tue, 2010-08-31 at 11:00 -0700, Chris Wright wrote:
> > * Alex Williamson (alex.william...@redhat.com) wrote:
> > > diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> > > index 075f72d..9ef29f0 100644
> > > --- a/hw/virtio-net.c
> > > +++ b/hw/virtio-net.c
> > > @@ -36,6 +36,7 @@ typedef struct VirtIONet
> > >  VirtQueue *ctrl_vq;
> > >  NICState *nic;
> > >  QEMUTimer *tx_timer;
> > > +uint32_t tx_timeout;
> > >  int tx_timer_active;
> > >  uint32_t has_vnet_hdr;
> > >  uint8_t has_ufo;
> > > @@ -702,7 +703,7 @@ static void virtio_net_handle_tx(VirtIODevice *vdev, 
> > > VirtQueue *vq)
> > >  virtio_net_flush_tx(n, vq);
> > >  } else {
> > >  qemu_mod_timer(n->tx_timer,
> > > -   qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
> > > +   qemu_get_clock(vm_clock) + n->tx_timeout);
> > >  n->tx_timer_active = 1;
> > >  virtio_queue_set_notification(vq, 0);
> > >  }
> > > @@ -842,7 +843,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, 
> > > int version_id)
> > >  
> > >  if (n->tx_timer_active) {
> > >  qemu_mod_timer(n->tx_timer,
> > > -   qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
> > > +   qemu_get_clock(vm_clock) + n->tx_timeout);
> > 
> > I think I'm missing where this is stored?  Looks like migration
> > would revert a changed tx_timeout back to 150us.
> 
> It's not stored, it can be instantiated on the migration target any way
> you please and we can migrate between different values or even different
> TX mitigation strategies.  If a non-default value is used on the source
> and you want to maintain the same behavior, the target needs to be
> started the same way.

heh, IOW, I did miss how it's stored...on cmdline ;)

thanks,
-chris
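
To make the exchange above concrete, here is a small sketch of a device
whose timeout is configuration rather than migrated state. It is not the
virtio-net code: struct net_state, struct net_wire, net_save() and
net_load() are hypothetical stand-ins, and the time values are in
arbitrary units.

```c
#include <stdint.h>

/* Simplified stand-in for the device state; not the real VirtIONet. */
struct net_state {
    uint32_t tx_timeout;      /* set at startup from the command line */
    int tx_timer_active;      /* runtime state, travels with migration */
};

/* Hypothetical wire format: only the fields that are migrated. */
struct net_wire {
    int32_t tx_timer_active;
};

/* Source side: tx_timeout is deliberately not written out. */
void net_save(const struct net_state *src, struct net_wire *out)
{
    out->tx_timer_active = src->tx_timer_active;
}

/*
 * Target side: dst->tx_timeout was already set when the target was
 * started. Returns the deadline to re-arm the timer with, or 0 if the
 * timer was not active on the source.
 */
uint64_t net_load(struct net_state *dst, const struct net_wire *in,
                  uint64_t now)
{
    dst->tx_timer_active = in->tx_timer_active;
    if (dst->tx_timer_active)
        return now + dst->tx_timeout;
    return 0;
}
```

A target started with a different timeout than the source simply re-arms
the timer with its own value, so keeping the same behavior means starting
both sides with the same setting.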

Re: [Qemu-devel] [PATCH 1/5] virtio-net: Make tx_timer timeout configurable

2010-08-31 Thread Chris Wright

* Alex Williamson (alex.william...@redhat.com) wrote:
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index 075f72d..9ef29f0 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -36,6 +36,7 @@ typedef struct VirtIONet
>  VirtQueue *ctrl_vq;
>  NICState *nic;
>  QEMUTimer *tx_timer;
> +uint32_t tx_timeout;
>  int tx_timer_active;
>  uint32_t has_vnet_hdr;
>  uint8_t has_ufo;
> @@ -702,7 +703,7 @@ static void virtio_net_handle_tx(VirtIODevice *vdev, 
> VirtQueue *vq)
>  virtio_net_flush_tx(n, vq);
>  } else {
>  qemu_mod_timer(n->tx_timer,
> -   qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
> +   qemu_get_clock(vm_clock) + n->tx_timeout);
>  n->tx_timer_active = 1;
>  virtio_queue_set_notification(vq, 0);
>  }
> @@ -842,7 +843,7 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int 
> version_id)
>  
>  if (n->tx_timer_active) {
>  qemu_mod_timer(n->tx_timer,
> -   qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
> +   qemu_get_clock(vm_clock) + n->tx_timeout);

I think I'm missing where this is stored?  Looks like migration
would revert a changed tx_timeout back to 150us.

thanks,
-chris

[Qemu-devel] KVM call minutes for August 31

2010-08-31 Thread Chris Wright

QMP/QPI
- declare what's in 0.13 supported
  - means reasonable effort to avoid breaking something (deprecation is
possible)
- things that are left, shallow patch or human monitor conversion
- how to move forward?
  - need to improve interfaces (no argument there)
  - internal vs. external interfaces
  - QMP == external, stable, extensible, discoverable, fwd/back compat
  - internal == C, data structures, changeable, non-stable
  - redefine internal interfaces and work up?
- this addresses concern that most changes are in monitor.c
- and addresses the concern that we aren't defining proper
  extensible top level interfaces
  - work top down from external?
- map to internal details...fix internals when external is
  hard/impossible based on internals
- need to focus on QMP command addition in the face of internal details
  - no disagreement there
- decouple monitor and QMP
- sane interfaces (proposal for migration from Anthony)
- error issues...

0.13 schedule
- rc1 tagged locally and under test, once testing completes, upload,
  then one week to fix any outstanding issues
  - will push tag later today and upload, announce once propagates (24hrs-ish)

qemu-kvm integration
- not getting a lot of cycles
- nothing major that anthony won't pull
  - extboot still
- performance
  - i/o threading model (merge both and fix in-tree)
- in-kernel apic
- device assignment (vfio against qemu tree)
- disable ia64
- avi will look at doing the pull request

[Qemu-devel] KVM call cancelled [was: Re: KVM call agenda for August 24]

2010-08-24 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> On 08/23/2010 05:30 PM, Chris Wright wrote:
> >Please send in any agenda items you are interested in covering.
> 
> There are quite a few important discussions on the list but I think
> they should stay on the list right now.
> 
> So it sounds like we don't have an agenda for today.

No agenda, so this week's call is cancelled.

thanks,
-chris

[Qemu-devel] KVM call agenda for August 24

2010-08-23 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] KVM call cancelled [was: KVM call agenda for August 17]

2010-08-17 Thread Chris Wright

* Chris Wright (chr...@redhat.com) wrote:
> Please send in any agenda items you are interested in covering.

Today's call is cancelled.

[Qemu-devel] KVM call agenda for August 17

2010-08-16 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] Re: KVM Forum 2010: presentations online

2010-08-16 Thread Chris Wright

* Dor Laor (dl...@redhat.com) wrote:
> On 08/17/2010 12:50 AM, Chris Wright wrote:
> >KVM Forum 2010 was quite a success, many thanks to all who participated!
> >
> >For those who couldn't attend, the presentations are available online now:
> >(thanks to Andrew Cathrow for pushing them all up)
> >
> >http://www.linux-kvm.org/page/KVM_Forum_2010#Presentations
> 
> I Beat you in a second ;-)

Assuming accurate local clocks...6 seconds even ;)

[Qemu-devel] KVM Forum 2010: presentations online

2010-08-16 Thread Chris Wright

KVM Forum 2010 was quite a success, many thanks to all who participated!

For those who couldn't attend, the presentations are available online now:
(thanks to Andrew Cathrow for pushing them all up)

http://www.linux-kvm.org/page/KVM_Forum_2010#Presentations

We were also able to video the speakers, and will send a note when the
videos are available.
(and thanks again to Andrew Cathrow for making this happen)

thanks,
-chris

Re: [Qemu-devel] Re: [PATCH] Introduce a -libvirt-caps flag as a stop-gap

2010-07-27 Thread Chris Wright

* Anthony Liguori (anth...@codemonkey.ws) wrote:
> Here are the possible things we can do:
> 
> 1) merge -libvirt-caps as an intermediate solution, stop caring
> about -help changes, when full caps are introduced, stop updating
> -libvirt-caps
> 
> 2) don't merge -libvirt-caps, stop caring about -help changes, put
> everything on getting full caps merged by 0.14
> 
> 3) don't merge -libvirt-caps, care about making -help changes, use
> -help as the caps mechanism until full caps get merged

3.5) same as 3) + add test case to qemu to test that -help parser from
libvirt isn't busted.

> We can't do (3).  I'm going to revert the -help changes for 0.13 so
> that old versions of libvirt work but not for master.

I suspect that if the breakage is seen, it'd be easy to fix.  Adding new
help items won't be the problem, just the subtle changes to the existing
output.  Suck?  Yes.  Workable until full caps?  Think so.

thanks,
-chris

Re: [Qemu-devel] Re: KVM call agenda for July 27

2010-07-27 Thread Chris Wright

* Avi Kivity (a...@redhat.com) wrote:
>  On 07/27/2010 07:29 PM, Chris Wright wrote:
> >
> >>QEMU stderr+out is already recorded in /var/lib/libvirt/qemu/$GUESTNAME.log
> >>along with the env variables and argv used to spawn it. Or did you mean
> >>provide an API + virsh command /virt-manager UI for accessing the logs ?
> >I read that to mean...propagate stderr from qemu to be right in front of
> >the user.
> 
> Yes, that's what I meant.
> 
> >So that's output from virsh or in virt-manager.  Trouble is,
> >that's only useful (at best) when starting a guest.  Perhaps some
> >virt-manager thing (an exclamation point to show there's errors in the
> >log and a way to read them), and a virsh utility to match (although
> >that'd require the user to actually poll the interface, at which point
> >they can just as easily just look at the log).
> 
> If things work there's  no reason for the user to go look at the
> logs.  An exclamation point invites clicking.
> 
> Even better would be an ABRT plugin, so if something goes
> (marginally) wrong, the siren pops up and you're invited to report
> the bug.

Despite some of the ABRT growing pains, ABRT plugin seems like a good
idea.  I don't know enough of the plugins to know if that requires
formatted output or just grepping for some known regexps.

thanks,
-chris

Re: [Qemu-devel] Re: KVM call agenda for July 27

2010-07-27 Thread Chris Wright

* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Tue, Jul 27, 2010 at 07:17:06PM +0300, Avi Kivity wrote:
> >  On 07/27/2010 06:28 PM, Anthony Liguori wrote:
> > >
> > >If we add docs/deprecated-features.txt, schedule removal for at least 
> > >1 year in the future, and put a warning in the code that prints 
> > >whenever raw is probed, I think I could warm up to this.
> > >
> > >Since libvirt should be insulating users from this today, I think the 
> > >fall out might not be terrible.
> > 
> > On a related note, we should ask libvirt to make qemu stderr output 
> > available to its users, or perhaps an ABRT plugin to report such 
> > messages from libvirt's logs.
> 
> QEMU stderr+out is already recorded in /var/lib/libvirt/qemu/$GUESTNAME.log
> along with the env variables and argv used to spawn it. Or did you mean 
> provide an API + virsh command /virt-manager UI for accessing the logs ?

I read that to mean...propagate stderr from qemu to be right in front of
the user.  So that's output from virsh or in virt-manager.  Trouble is,
that's only useful (at best) when starting a guest.  Perhaps some
virt-manager thing (an exclamation point to show there's errors in the
log and a way to read them), and a virsh utility to match (although
that'd require the user to actually poll the interface, at which point
they can just as easily just look at the log).

thanks,
-chris

[Qemu-devel] KVM call minutes for July 27

2010-07-27 Thread Chris Wright

0.13
- -rc0 tagged, propagating now
- no more features, bug fix only
- although a few things, like shared memory device, are still feasible

qemu64 cpu model
- currently model 2
  - this cpu simply does not exist at all in the real world
- with model 13 or higher, 32-bit windows will enable MSI/MSI-X support
  - anyone aware of issues with simply bumping the model?
- must retain compatibility with -M
- cpu models fully configurable in config files
  - should move default to config files
- raises a couple questions
  - does "qemu64" need to have a single stable definition?
  - does default cpu make sense?
- also the physical models are broken (Conroe, Penryn, etc..)
  - these are simply broken and need to change
- can create versions of base model (qemu64-v1/0.13/whatever)

probed_raw
- 79368c81 closed security hole
- qraw addressing theoretical issue and has too much magic
- any further discussion, list is ---> that way

qemu -help parsing
- libvirt current issues
  - qemu -help/-version was changed and broke libvirt (fixed 0.8.2)
  - libvirt improperly parses cache= (fixed 0.8.3)
- reverting version string change: f75ca1ae
- apply bruce's cache -help patch
- no further significant -help changes
- libvirt uses version only (and maintains the version meaning for
  downstreams)
- eventually capabilities fixes this
- minimal "info capabilities" that's usable tomorrow?
  - becomes a supported interface, deprecating it will be complicated
  - unclear if it buys anything, -help parsing working now, the interim
sol'n would be thrown away
- could isolate libvirt -help parser and toss it into qemu as a test
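
The last item could look roughly like the sketch below: a check that the
-help text still contains the tokens a consumer depends on. This is not
libvirt's parser, which is more involved; help_mentions() and
first_missing_token() are hypothetical names, and any help text fed to
them in a test would have to be captured from a real qemu binary.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical probe: does the help text mention `token` verbatim? */
bool help_mentions(const char *help_text, const char *token)
{
    return help_text != NULL && token != NULL &&
           strstr(help_text, token) != NULL;
}

/*
 * Hypothetical regression check: returns the index of the first token
 * that is missing from the help text, or -1 if all are present.
 */
int first_missing_token(const char *help_text,
                        const char *const tokens[], int ntokens)
{
    for (int i = 0; i < ntokens; i++) {
        if (!help_mentions(help_text, tokens[i]))
            return i;
    }
    return -1;
}
```

A check of this kind catches subtle changes to existing output, while
newly added help items pass through unnoticed.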

[Qemu-devel] KVM call agenda for July 27

2010-07-26 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

[Qemu-devel] KVM call minutes for July 20

2010-07-20 Thread Chris Wright

0.12.stable
- start w/ git tree + pull requests
- release process is separate from commit access
- justin will put up a tree for pull requests
- there's current backlog, what about that?
- anthony's concern with -stable is the testing (upstream tree gets more
  testing than -stable)
- 0.12.5?
  - planning to do next w/ 0.13 release
  - aurelien may cut a release
  - justin will do some sanity testing, most patches are in fedora anyway

0.13
- rc RSN (hopefully this week, top priority for anthony)

kvm testsuite
- was planning to clean up and contribute to qemu
- now thinking perhaps just split it out to its own repo
  - not really qemu code, not really kvm code, not cross compile, etc..
  - could use std serial device
  - could use vga (needs mmio space)
- would like to add nested svm and (more important) nested vmx
  - small bit to copy l1 to l2 state, to make guest nested
  - need framework, can then require nested patches come w/ regression tests
- current testsuite failing on qemu (shows softmmu issues, any takers?)

fw_cfg issues
- mostly on list
- concerns about dma interface (too close to use case specific hack)
- rep could be optimized in general
  - each byte == function call
- possibly pull in 4k (instead of 1k) on each exit
- bar for changes should be no new interfaces

[Qemu-devel] KVM call agenda for July 20

2010-07-19 Thread Chris Wright

Please send in any agenda items you are interested in covering.

thanks,
-chris

Re: [Qemu-devel] Re: [RFC PATCH 4/7] ide: IOMMU support

2010-07-15 Thread Chris Wright

* Avi Kivity (a...@redhat.com) wrote:
> On 07/15/2010 08:17 PM, Chris Wright wrote:
> >
> >>For emulated device, it seems like we can ignore ATS completely, no?
> >Not if you want to emulate an ATS capable device ;)
> 
> What I meant was that the whole request translation, invalidate, dma
> using a translated address thing is invisible to software.  We can
> emulate an ATS capable device by going through the iommu every time.

Well, I don't see any reason to completely ignore it.  It'd be really
useful for testing (I'd use it that way).  Esp to verify the
invalidation of the device IOTLBs.

But I think it's not a difficult thing to emulate once we have a proper
api encapsulating a device's dma request.

thanks,
-chris
