On Tue, Feb 09, 2010 at 08:38:56AM +0200, Avi Kivity wrote:
> On 02/09/2010 12:41 AM, Marcelo Tosatti wrote:
> >On Thu, Feb 04, 2010 at 11:46:25PM +0200, Avi Kivity wrote:
> >>On 02/04/2010 11:36 PM, Marcelo Tosatti wrote:
> >>>On Thu, Feb 04, 2010 at 09:16:47PM +0200, Avi Kivity wrote:
> >>>>On 01/28/2010 09:03 PM, Marcelo Tosatti wrote:
> >>>>>A vcpu can be stopped after handling IO in userspace,
> >>>>>but before returning to kernel to finish processing.
> >>>>>
> >>>>Is this strictly needed?  If we teach qemu to migrate before
> >>>>executing the pio request, I think we'll be all right?  should work
> >>>>at least for IN/INS, not sure about OUT/OUTS.
> >>>It would be nice (instead of more state to keep track of between
> >>>kernel/user) but the drawbacks i see are:
> >>>
> >>>You'd have to add a limitation so that any IN which was processed
> >>>by device emulation has to re-entry kernel to complete it (so it
> >>>complicates vcpu stop in userspace).
> >>>
> >>You could fix that by moving the IN emulation to before guest entry.
> >>It complicates the vcpu loop a bit, but is backwards compatible and
> >>all that.
> >Under such scheme, to avoid a stream of IN's from temporarily blocking
> >vcpu stop capability, you'd have to requeue a signal to stop the vcpu
> >(so the next IN in the stream is not executed, but complete_pio does).
> >
> >Or not process the stop signal in the first place (new state for main
> >loop, "pending pio/mmio").
> 
> Why?  you would handle stops exactly the same way:
> 
> vcpu_loop:
>      while running:
>          process_last_in()
>          run_vcpu()
>          handle_exit_except_in()
> 
> An IN that is stopped would simply be unprocessed, and the next
> entry, if at a new host, will simply re-execute it.

It's not so simple.

The kernel advances RIP before exiting to userspace with EXIT_IO (for
IN). So simply skipping an IN exit is not possible.

In the case of an IN, you have to make sure kernel re-entry is performed
(to complete the operation). This is what complicates vcpu stop (you
need a new state which says "do not stop vcpu, re-enter kernel first").

And then you must re-raise the stop signal before entering the kernel.
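
To make the ordering concrete, here is a rough toy model of that loop
in Python. It is only a sketch: ToyVcpu, toy_kernel_enter and vcpu_loop
are made-up names, not KVM or QEMU interfaces, and every guest
instruction is assumed to be an IN. The stop request arrives after
userspace has handled the second IN, so the loop defers the stop,
re-enters the kernel with the signal pending to complete the IN, and
only then stops.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyVcpu:
    """Hypothetical stand-in for vcpu state; not a real KVM structure."""
    rip: int = 0
    rax: int = 0
    in_data: Optional[int] = None  # device result not yet written to RAX


def toy_kernel_enter(vcpu: ToyVcpu, signal_pending: bool) -> str:
    """Toy kernel entry: complete a pending IN, then maybe run the guest."""
    if vcpu.in_data is not None:
        # Completion step: only now does the guest see the IN result.
        vcpu.rax = vcpu.in_data
        vcpu.in_data = None
    if signal_pending:
        # The re-raised stop signal keeps the next IN from executing.
        return "interrupted"
    # Next guest instruction is an IN: RIP is advanced before the exit.
    vcpu.rip += 1
    return "io_in"


def vcpu_loop(vcpu: ToyVcpu, device_values: List[int], stop_at_exit: int) -> List[str]:
    """Userspace loop that honors a stop only when no IN is half-finished."""
    trace: List[str] = []
    values = iter(device_values)
    stop_requested = False
    exits = 0
    while True:
        if stop_requested and vcpu.in_data is None:
            trace.append("stopped")
            return trace
        # "do not stop vcpu, re-enter kernel first", with the stop re-raised
        reason = toy_kernel_enter(vcpu, signal_pending=stop_requested)
        trace.append(reason)
        if reason == "io_in":
            exits += 1
            vcpu.in_data = next(values)  # device emulation in userspace
            if exits == stop_at_exit:
                stop_requested = True  # stop arrives after IO was handled


vcpu = ToyVcpu()
trace = vcpu_loop(vcpu, [0x11, 0x22, 0x33], stop_at_exit=2)
```

In this run the vcpu stops with RIP past the second IN and RAX holding
that IN's result, so nothing is left half-done at the stop point.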

Does that make sense?

> >Or even just copy the result from QEMU device to RAX in userspace, which
> >is somewhat nasty since you'd have either userspace or kernel finishing
> >the op.
> 
> Definitely bad.
> 
> >For REP OUTS larger than page size, the current position is held in RCX,
> >but complete_pio uses vcpu->arch.pio.cur_count and count to hold the
> >position. So you either make it possible to writeback vcpu->arch.pio
> >to the kernel, or wait for the operation to finish (with similar
> >complications regarding signal processing).
> 
> RCX is always consistent, no?  So if we migrate in the middle of REP
> OUTS, the operation will restart at the correct place?

On second thought, yeah, the state held in vcpu->arch.pio will be
reinstantiated on the destination with updated values from RCX.
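
A toy illustration of why that works, assuming RSI and RCX are updated
after every chunk. CHUNK, rep_outs_step and run_rep_outs are
hypothetical names for this sketch, not kernel code. Only the two
register values are carried to the "destination", and the output
continues at the right byte.

```python
from typing import Optional, Tuple

CHUNK = 4  # toy stand-in for the page-sized limit on a single exit


def rep_outs_step(memory: bytes, rsi: int, rcx: int) -> Tuple[bytes, int, int]:
    """Emit one chunk and return (data, new_rsi, new_rcx)."""
    n = min(rcx, CHUNK)
    return memory[rsi:rsi + n], rsi + n, rcx - n


def run_rep_outs(memory: bytes, rsi: int, rcx: int,
                 max_steps: Optional[int] = None) -> Tuple[bytes, int, int]:
    """Run REP OUTS from the register state alone, optionally stopping early."""
    out = []
    steps = 0
    while rcx > 0 and (max_steps is None or steps < max_steps):
        data, rsi, rcx = rep_outs_step(memory, rsi, rcx)
        out.append(data)
        steps += 1
    return b"".join(out), rsi, rcx


memory = bytes(range(10))
# Source host: one chunk is written, then the vcpu is migrated.
first, rsi, rcx = run_rep_outs(memory, rsi=0, rcx=10, max_steps=1)
# Destination host: only RSI and RCX were carried over.
rest, end_rsi, end_rcx = run_rep_outs(memory, rsi, rcx)
```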

> >As i see it, the benefit of backward compatibility is not worthwhile
> >compared to the complications introduced to vcpu loop processing (and
> >potential for damaging vcpu stop ->  vcpu stopped latency).
> >
> >Are you certain its worth avoiding the restore ioctl for pio/mmio?
> 
> First, let's see if it's feasible or not.  If it's feasible, it's
> probably just a matter of rearranging things to get userspace sane.
> A small price to pay for backward compatibility.
> 
> 
> -- 
> I have a truly marvellous patch that fixes the bug which this
> signature is too narrow to contain.