Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gleb Natapov
On Tue, Aug 03, 2010 at 12:13:06PM +0100, Richard W.M. Jones wrote:
> 
> qemu compiled from today's git.  Using the following command line:
> 
> $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios \
> -drive file=/dev/null,if=virtio \
> -enable-kvm \
> -nodefaults \
> -nographic \
> -serial stdio \
> -m 500 \
> -no-reboot \
> -no-hpet \
> -net user,vlan=0,net=169.254.0.0/16 \
> -net nic,model=ne2k_pci,vlan=0 \
> -kernel /tmp/libguestfsEyAMut/kernel \
> -initrd /tmp/libguestfsEyAMut/initrd \
> -append 'panic=1 console=ttyS0 udevtimeout=300 noapic acpi=off 
> printk.time=1 cgroup_disable=memory selinux=0 
> guestfs_vmchannel=tcp:169.254.2.2:35007 guestfs_verbose=1 TERM=xterm-color '
> 
> With kernel 2.6.35 [*], this takes about 1 min 20 s before the guest
> starts.
> 
> If I revert back to kernel 2.6.34, it's pretty quick as usual.
> 
> strace is not very informative.  It's in a loop doing select and
> reading/writing from some file descriptors, including the signalfd and
> two pipe fds.
> 
> Anyone seen anything like this?
> 
I assume your initrd is huge. In newer kernels ins/outs are much slower than
they were. They are much more correct, too. It shouldn't be 1 min 20 sec for a
100M initrd, though it can take 20-30 sec. This belongs on the kvm list, BTW.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 02:33:02PM +0300, Gleb Natapov wrote:
> On Tue, Aug 03, 2010 at 12:13:06PM +0100, Richard W.M. Jones wrote:
> > 
> > qemu compiled from today's git.  Using the following command line:
> > 
> > $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios \
> > -drive file=/dev/null,if=virtio \
> > -enable-kvm \
> > -nodefaults \
> > -nographic \
> > -serial stdio \
> > -m 500 \
> > -no-reboot \
> > -no-hpet \
> > -net user,vlan=0,net=169.254.0.0/16 \
> > -net nic,model=ne2k_pci,vlan=0 \
> > -kernel /tmp/libguestfsEyAMut/kernel \
> > -initrd /tmp/libguestfsEyAMut/initrd \
> > -append 'panic=1 console=ttyS0 udevtimeout=300 noapic acpi=off 
> > printk.time=1 cgroup_disable=memory selinux=0 
> > guestfs_vmchannel=tcp:169.254.2.2:35007 guestfs_verbose=1 TERM=xterm-color '
> > 
> > With kernel 2.6.35 [*], this takes about 1 min 20 s before the guest
> > starts.
> > 
> > If I revert back to kernel 2.6.34, it's pretty quick as usual.
> > 
> > strace is not very informative.  It's in a loop doing select and
> > reading/writing from some file descriptors, including the signalfd and
> > two pipe fds.
> > 
> > Anyone seen anything like this?
> > 
> I assume your initrd is huge.

It's ~110MB, yes.

> In newer kernels ins/outs are much slower than they were. They are
> much more correct, too. It shouldn't be 1 min 20 sec for a 100M initrd,
> though it can take 20-30 sec. This belongs on the kvm list, BTW.

I can't see anything about this in the kernel changelog.  Can you
point me to the commit or the key phrase to look for?

Also, what's the point of making in/out "more correct" when we know
we're talking to qemu (eg. from the CPUID) and we know it already
worked fine before with qemu?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into Xen guests.
http://et.redhat.com/~rjones/virt-p2v



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gleb Natapov
On Tue, Aug 03, 2010 at 01:10:00PM +0100, Richard W.M. Jones wrote:
> On Tue, Aug 03, 2010 at 02:33:02PM +0300, Gleb Natapov wrote:
> > On Tue, Aug 03, 2010 at 12:13:06PM +0100, Richard W.M. Jones wrote:
> > > 
> > > qemu compiled from today's git.  Using the following command line:
> > > 
> > > $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios \
> > > -drive file=/dev/null,if=virtio \
> > > -enable-kvm \
> > > -nodefaults \
> > > -nographic \
> > > -serial stdio \
> > > -m 500 \
> > > -no-reboot \
> > > -no-hpet \
> > > -net user,vlan=0,net=169.254.0.0/16 \
> > > -net nic,model=ne2k_pci,vlan=0 \
> > > -kernel /tmp/libguestfsEyAMut/kernel \
> > > -initrd /tmp/libguestfsEyAMut/initrd \
> > > -append 'panic=1 console=ttyS0 udevtimeout=300 noapic acpi=off 
> > > printk.time=1 cgroup_disable=memory selinux=0 
> > > guestfs_vmchannel=tcp:169.254.2.2:35007 guestfs_verbose=1 
> > > TERM=xterm-color '
> > > 
> > > With kernel 2.6.35 [*], this takes about 1 min 20 s before the guest
> > > starts.
> > > 
> > > If I revert back to kernel 2.6.34, it's pretty quick as usual.
> > > 
> > > strace is not very informative.  It's in a loop doing select and
> > > reading/writing from some file descriptors, including the signalfd and
> > > two pipe fds.
> > > 
> > > Anyone seen anything like this?
> > > 
> > I assume your initrd is huge.
> 
> It's ~110MB, yes.
> 
> In newer kernels ins/outs are much slower than they were. They are
> much more correct, too. It shouldn't be 1 min 20 sec for a 100M initrd,
> though it can take 20-30 sec. This belongs on the kvm list, BTW.
> 
> I can't see anything about this in the kernel changelog.  Can you
> point me to the commit or the key phrase to look for?
> 
7972995b0c346de76

> Also, what's the point of making in/out "more correct" when we know
> we're talking to qemu (eg. from the CPUID) and we know it already
> worked fine before with qemu?
> 
Qemu has nothing to do with that. ins/outs didn't work correctly in some
situations. They didn't work at all if the destination/source memory was
MMIO (didn't work as in hung the vcpu, IIRC, and that is a security risk).
The direction flag wasn't handled at all (if it was set, the instruction
injected #GP into the guest). They didn't check whether the memory they
write to is shadowed, in which case special action should be taken. They
didn't deliver events during long string operations. Maybe more.
Unfortunately, adding all that makes emulation much slower.  I have already
implemented some speedups, and more are possible, but we will not be able
to get back to the previous string I/O speed, which was our upper limit.
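For illustration, the direction-flag issue mentioned above boils down to string instructions having to advance or retreat the index register on every iteration. A toy sketch of REP OUTSB semantics (purely illustrative; this is not KVM's actual emulator code):

```python
def rep_outsb(mem, rsi, rcx, df, port_write):
    """Toy REP OUTSB: write RCX bytes from mem starting at RSI to a port.

    RSI advances by +1 when DF=0 and by -1 when DF=1; correct emulation
    must honor this rather than, say, injecting #GP when DF is set.
    """
    step = -1 if df else 1
    while rcx:
        port_write(mem[rsi])
        rsi += step
        rcx -= 1
    return rsi, rcx
```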

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 03:37:14PM +0300, Gleb Natapov wrote:
> On Tue, Aug 03, 2010 at 01:10:00PM +0100, Richard W.M. Jones wrote:
> > I can't see anything about this in the kernel changelog.  Can you
> > point me to the commit or the key phrase to look for?
> > 
> 7972995b0c346de76

Thanks - I see.

> > Also, what's the point of making in/out "more correct" when we know
> > we're talking to qemu (eg. from the CPUID) and we know it already
> > worked fine before with qemu?
> > 
> Qemu has nothing to do with that. ins/outs didn't work correctly in some
> situations. They didn't work at all if the destination/source memory was
> MMIO (didn't work as in hung the vcpu, IIRC, and that is a security risk).
> The direction flag wasn't handled at all (if it was set, the instruction
> injected #GP into the guest). They didn't check whether the memory they
> write to is shadowed, in which case special action should be taken. They
> didn't deliver events during long string operations. Maybe more.
> Unfortunately, adding all that makes emulation much slower.  I have already
> implemented some speedups, and more are possible, but we will not be able
> to get back to the previous string I/O speed, which was our upper limit.

Thanks for the explanation.  I'll repost my "DMA"-like fw-cfg patch
once I've rebased it and done some more testing.  This huge regression
for a common operation (implementing -initrd) needs to be solved
without using inb/rep ins.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 03:48 PM, Richard W.M. Jones wrote:


Thanks for the explanation.  I'll repost my "DMA"-like fw-cfg patch
once I've rebased it and done some more testing.  This huge regression
for a common operation (implementing -initrd) needs to be solved
without using inb/rep ins.


Adding more interfaces is easy but a problem in the long term.  We'll 
optimize it as much as we can.  Meanwhile, why are you loading huge 
initrds?  Use a cdrom instead (it will also be faster since the guest 
doesn't need to unpack it).


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 04:19:39PM +0300, Avi Kivity wrote:
>  On 08/03/2010 03:48 PM, Richard W.M. Jones wrote:
> >
> >Thanks for the explanation.  I'll repost my "DMA"-like fw-cfg patch
> >once I've rebased it and done some more testing.  This huge regression
> >for a common operation (implementing -initrd) needs to be solved
> >without using inb/rep ins.
> 
> Adding more interfaces is easy but a problem in the long term.
> We'll optimize it as much as we can.  Meanwhile, why are you loading
> huge initrds?  Use a cdrom instead (it will also be faster since the
> guest doesn't need to unpack it).

Because it involves rewriting the entire appliance building process,
and we don't necessarily know if it'll be faster after we've done
that.

Look: currently we create the initrd on the fly in 700ms.  We've no
reason to believe that creating a CD-ROM on the fly wouldn't take
around the same time.  After all, both processes involve reading all
the host files from disk and writing a temporary file.

You have to create these things on the fly, because we don't actually
ship an appliance to end users, just a tiny (< 1 MB) skeleton.  You
can't ship a massive statically linked appliance to end users because
it's just unmanageable (think: security; updates; bandwidth).

Loading the initrd currently takes 115ms (or could do, if a sensible
50 line patch was permitted).

So the only possible saving would be the 115ms load time of the
initrd.  In theory the CD-ROM device could be detected in 0 time.

Total saving: 115ms.

But will it be any faster, since after spending 115ms, everything runs
from memory, versus being loaded from the CD?

Let's face the fact that qemu has suffered from an enormous
regression.  From some hundreds of milliseconds up to over a minute,
in the space of 6 months of development.  For a very simple operation:
loading a file into memory.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 05:05 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 04:19:39PM +0300, Avi Kivity wrote:

  On 08/03/2010 03:48 PM, Richard W.M. Jones wrote:

Thanks for the explanation.  I'll repost my "DMA"-like fw-cfg patch
once I've rebased it and done some more testing.  This huge regression
for a common operation (implementing -initrd) needs to be solved
without using inb/rep ins.

Adding more interfaces is easy but a problem in the long term.
We'll optimize it as much as we can.  Meanwhile, why are you loading
huge initrds?  Use a cdrom instead (it will also be faster since the
guest doesn't need to unpack it).

Because it involves rewriting the entire appliance building process,
and we don't necessarily know if it'll be faster after we've done
that.

Look: currently we create the initrd on the fly in 700ms.  We've no
reason to believe that creating a CD-ROM on the fly wouldn't take
around the same time.  After all, both processes involve reading all
the host files from disk and writing a temporary file.


The time will only continue to grow as you add features and as the 
distro bloats naturally.


Much better to create it once and only update it if some dependent file 
changes (basically the current on-the-fly code + save a list of file 
timestamps).
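The timestamp-manifest idea could look something like this (a minimal illustration, not libguestfs code; the manifest filename and JSON format are invented for the sketch):

```python
import json
import os

def needs_rebuild(manifest_path, files):
    """Return True if the cached appliance must be rebuilt.

    manifest_path stores the mtimes recorded at the last build;
    `files` is the list of host files the appliance depends on.
    """
    try:
        with open(manifest_path) as f:
            old = json.load(f)
    except (OSError, ValueError):
        return True  # no cache yet, or the manifest is unreadable
    current = {p: os.stat(p).st_mtime for p in files}
    return current != old

def save_manifest(manifest_path, files):
    """Record the current mtimes of the dependent files."""
    with open(manifest_path, "w") as f:
        json.dump({p: os.stat(p).st_mtime for p in files}, f)
```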


Alternatively, pass through the host filesystem.


You have to create these things on the fly, because we don't actually
ship an appliance to end users, just a tiny (<  1 MB) skeleton.  You
can't ship a massive statically linked appliance to end users because
it's just unmanageable (think: security; updates; bandwidth).


Shipping it is indeed out of the question.  But on-the-fly creation is 
not the only alternative.



Loading the initrd currently takes 115ms (or could do, if a sensible
50 line patch was permitted).

So the only possible saving would be the 115ms load time of the
initrd.  In theory the CD-ROM device could be detected in 0 time.

Total saving: 115ms.


815 ms by my arithmetic.  You also save 3*N-2*P memory where N is the 
size of your initrd and P is the actual amount used by the guest.



But will it be any faster, since after spending 115ms, everything runs
from memory, versus being loaded from the CD?

Let's face the fact that qemu has suffered from an enormous
regression.  From some hundreds of milliseconds up to over a minute,
in the space of 6 months of development.


It wasn't qemu, but kvm.  And it didn't take six months, just a few 
commits.  Those aren't going back, they're a lot more important than 
some libguestfs problem which shouldn't have been coded differently in 
the first place.



For a very simple operation:
loading a file into memory.


Loading a file into memory is plenty fast if you use the standard 
interfaces.  -kernel -initrd is a specialized interface.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 05:38:25PM +0300, Avi Kivity wrote:
> The time will only continue to grow as you add features and as the
> distro bloats naturally.
> 
> Much better to create it once and only update it if some dependent
> file changes (basically the current on-the-fly code + save a list of
> file timestamps).

This applies to both cases, the initrd could also be saved, so:

> >Total saving: 115ms.
> 
> 815 ms by my arithmetic.

no, not true, 115ms.

> You also save 3*N-2*P memory where N is the size of your initrd and
> P is the actual amount used by the guest.

Can you explain this?

> Loading a file into memory is plenty fast if you use the standard
> interfaces.  -kernel -initrd is a specialized interface.

Why bother with any command line options at all?  After all, they keep
changing and causing problems for qemu's users ...  Apparently we're
all doing stuff "wrong", in ways that are never explained by the
developers.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 05:53 PM, Richard W.M. Jones wrote:



Total saving: 115ms.

815 ms by my arithmetic.

no, not true, 115ms.


If you bypass creating the initrd/cdrom (700 ms) and loading it (115ms) 
you save 815ms.



You also save 3*N-2*P memory where N is the size of your initrd and
P is the actual amount used by the guest.

Can you explain this?


(assuming ahead-of-time image generation)

initrd:
  qemu reads image (host pagecache): N
  qemu stores image in RAM: N
  guest copies image to its RAM: N
  guest faults working set (no XIP): P
  total: 3N+P

initramfs:
  qemu reads image (host pagecache): N
  qemu stores image: N
  guest copies image: N
  guest extracts image (XIP): N
  total: 4N

cdrom:
  guest faults working set: P
  kernel faults working set: P
  total: 2P

difference: 3N-P or 4N-2P depending on model
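Plugging in Rich's ~110 MB initrd, the accounting above works out as follows (a back-of-the-envelope check; P, the working set the guest actually touches, is a guessed figure):

```python
N = 110  # initrd size in MB
P = 40   # guest working set in MB (illustrative guess)

initrd_total    = 3 * N + P  # qemu reads + stores, guest copies, then faults P
initramfs_total = 4 * N      # as above, plus extracting the image (XIP)
cdrom_total     = 2 * P      # guest and kernel each fault only the working set

assert initrd_total - cdrom_total == 3 * N - P         # 290 MB here
assert initramfs_total - cdrom_total == 4 * N - 2 * P  # 360 MB here
```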



Loading a file into memory is plenty fast if you use the standard
interfaces.  -kernel -initrd is a specialized interface.

Why bother with any command line options at all?  After all, they keep
changing and causing problems for qemu's users ...  Apparently we're
all doing stuff "wrong", in ways that are never explained by the
developers.


That's a real problem.  It's hard to explain the intent behind 
something, especially when it's obvious to the author and not so obvious 
to the user.  However making everything do everything under all 
circumstances has its costs.


-kernel and -initrd is a developer's interface intended to make life 
easier for users who use qemu to develop kernels.  It was not intended 
as a high performance DMA engine.  Neither was the firmware 
_configuration_ interface.  That is what virtio and, to a lesser extent, 
IDE were written to do.  You'll get much better results from them.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 07:10:18PM +0300, Avi Kivity wrote:
> -kernel and -initrd is a developer's interface intended to make life
> easier for users who use qemu to develop kernels.  It was not
> intended as a high performance DMA engine.  Neither was the firmware
> _configuration_ interface.  That is what virtio and, to a lesser
> extent, IDE were written to do.  You'll get much better results
> from them.

Firmware configuration replaced something which was already working
really fast -- preloading the images into memory -- with something
which worked slower, and has just recently got _way_ more slow.

This is a regression.  Plain and simple.

I have posted a small patch which makes this 650x faster without
appreciable complication.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 09:53 AM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 05:38:25PM +0300, Avi Kivity wrote:

The time will only continue to grow as you add features and as the
distro bloats naturally.

Much better to create it once and only update it if some dependent
file changes (basically the current on-the-fly code + save a list of
file timestamps).

This applies to both cases, the initrd could also be saved, so:

Total saving: 115ms.

815 ms by my arithmetic.

no, not true, 115ms.

You also save 3*N-2*P memory where N is the size of your initrd and
P is the actual amount used by the guest.

Can you explain this?

Loading a file into memory is plenty fast if you use the standard
interfaces.  -kernel -initrd is a specialized interface.

Why bother with any command line options at all?  After all, they keep
changing and causing problems for qemu's users ...  Apparently we're
all doing stuff "wrong", in ways that are never explained by the
developers.


Let's be fair.  I think we've all agreed to adjust the fw_cfg interface 
to implement DMA.  The only requirement was that the DMA operation not 
be triggered from a single port I/O but rather based on a polling 
operation which better fits the way real hardware works.


Is this a regression?  Probably.  But performance regressions that 
result from correctness fixes don't get reverted.  We have to find an 
approach to improve performance without impacting correctness.


That said, the general view of -kernel/-append is that these are 
developer options and we don't really look at it as a performance 
critical interface.  We could do a better job of communicating this to 
users but that's true of most of the features we support.


Regards,

Anthony Liguori


Rich.





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 11:39:43AM -0500, Anthony Liguori wrote:
> Let's be fair.  I think we've all agreed to adjust the fw_cfg
> interface to implement DMA.  The only requirement was that the DMA
> operation not be triggered from a single port I/O but rather based
> on a polling operation which better fits the way real hardware
> works.

The patch I posted requires that the caller poll a register, so
hopefully this requirement is satisfied.

The other requirement was that the interface be discoverable, which is
also something in the latest version of the patch that I just posted.
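The shape of such a polled handshake can be modeled in a few lines (a toy state machine only; the register names and semantics below are invented and are not those of the actual patch):

```python
class ToyFwCfgDma:
    """Toy model of a polled DMA-style transfer: the guest programs a
    destination and length, then polls a status register until the
    device marks the copy complete, instead of the transfer completing
    synchronously on a single port write."""
    IDLE, BUSY, DONE = 0, 1, 2

    def __init__(self, blob):
        self.blob = blob          # data the device would copy (e.g. the initrd)
        self.status = self.IDLE
        self.dest = None

    def write_request(self, guest_mem, offset):
        """Guest writes the transfer request; device goes busy."""
        self.status = self.BUSY
        self.dest = (guest_mem, offset)

    def poll(self):
        """Guest reads the status register.  In a real device the copy
        proceeds asynchronously; here one poll completes it, which is
        enough to show the protocol shape."""
        if self.status == self.BUSY:
            mem, off = self.dest
            mem[off:off + len(self.blob)] = self.blob
            self.status = self.DONE
        return self.status

# Guest-side usage: program the transfer, then spin on the status register.
mem = bytearray(32)
dev = ToyFwCfgDma(b"initrd-bytes")
dev.write_request(mem, 4)
while dev.poll() != ToyFwCfgDma.DONE:
    pass
```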

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into Xen guests.
http://et.redhat.com/~rjones/virt-p2v



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:28 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 07:10:18PM +0300, Avi Kivity wrote:

-kernel and -initrd is a developer's interface intended to make life
easier for users who use qemu to develop kernels.  It was not
intended as a high performance DMA engine.  Neither was the firmware
_configuration_ interface.  That is what virtio and, to a lesser
extent, IDE were written to do.  You'll get much better results
from them.

Firmware configuration replaced something which was already working
really fast -- preloading the images into memory -- with something
which worked slower, and has just recently got _way_ more slow.

This is a regression.  Plain and simple.


It's only a regression if there was any intent to make this a 
performant interface.  Otherwise any change can be interpreted as a 
regression.  Even "binary doesn't hash to the exact same signature" is a 
regression.



I have posted a small patch which makes this 650x faster without
appreciable complication.


It doesn't appear to support live migration, or hiding the feature for 
-M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the kernel 
and virtio support demand loading of any image size you'd want to use.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 11:44 AM, Avi Kivity wrote:

 On 08/03/2010 07:28 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 07:10:18PM +0300, Avi Kivity wrote:

-kernel and -initrd is a developer's interface intended to make life
easier for users who use qemu to develop kernels.  It was not
intended as a high performance DMA engine.  Neither was the firmware
_configuration_ interface.  That is what virtio and, to a lesser
extent, IDE were written to do.  You'll get much better results
from them.

Firmware configuration replaced something which was already working
really fast -- preloading the images into memory -- with something
which worked slower, and has just recently got _way_ more slow.

This is a regression.  Plain and simple.


It's only a regression if there was any intent to make this a 
performant interface.  Otherwise any change can be interpreted as a 
regression.  Even "binary doesn't hash to the exact same signature" is a 
regression.



I have posted a small patch which makes this 650x faster without
appreciable complication.


It doesn't appear to support live migration, or hiding the feature for 
-M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the kernel 
and virtio support demand loading of any image size you'd want to use.


firmware is totally broken with respect to -M older FWIW.

Regards,

Anthony Liguori





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:44 PM, Avi Kivity wrote:


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the kernel 
and virtio support demand loading of any image size you'd want to use.




Even better would be to use virtio-9p.  You don't even need an image in 
this case.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 07:48:17PM +0300, Avi Kivity wrote:
>  On 08/03/2010 07:44 PM, Avi Kivity wrote:
> >
> >It's not a good path to follow.  Tomorrow we'll need to load 300MB
> >initrds and we'll have to rework this yet again.  Meanwhile the
> >kernel and virtio support demand loading of any image size you'd
> >want to use.
> >
> 
> Even better would be to use virtio-9p.  You don't even need an image
> in this case.

We don't want to expose the whole host filesystem, just selected
files, and we want to use our own configuration files (basically
that's what is in the skeleton part that we do ship).

Of course, if we can use virtio-9p, then excellent.  Is there good
documentation about virtio-9p?  What I can find is fragmentary or
based on reading qemu -help ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:53 PM, Anthony Liguori wrote:

On 08/03/2010 11:50 AM, Avi Kivity wrote:

 On 08/03/2010 07:46 PM, Anthony Liguori wrote:
It doesn't appear to support live migration, or hiding the feature 
for -M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the 
kernel and virtio support demand loading of any image size you'd 
want to use.



firmware is totally broken with respect to -M older FWIW.



Well, then this is adding to the brokenness.

fwcfg dma is going to have exactly one user, libguestfs.  Much better 
to have libguestfs move to some other interface and improve our 
users-to-interfaces ratio.


You mean, only one class of users cares about the performance of 
loading an initrd.  However, you've also argued in other threads how 
important it is not to break libvirt even if it means we have to do 
silly things (like change help text).


So... why is it that libguestfs has to change itself and yet we should 
bend over backwards so libvirt doesn't have to change itself?


libvirt is a major user that is widely deployed, and would be completely 
broken if we change -help.  Changing -help is of no consequence to us.
libguestfs is a (pardon me) minor user that is not widely used, and 
would suffer a performance regression, not total breakage, unless we add 
a fw-dma interface.  Adding the interface is of consequence to us: we 
have to implement live migration and backwards compatibility, and 
support this new interface for a long while.


In an ideal world we wouldn't tolerate any regression.  The world is not 
ideal, so we prioritize.


The -help change scores very high on benefit/cost.  fw-dma, much lower.

Note that in both cases the long term solution is for the user to move to 
another interface (cap reporting, virtio), so adding an interface which 
would only be abandoned later by its only user drops the benefit/cost 
ratio even further.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 11:50 AM, Avi Kivity wrote:

 On 08/03/2010 07:46 PM, Anthony Liguori wrote:
It doesn't appear to support live migration, or hiding the feature 
for -M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the 
kernel and virtio support demand loading of any image size you'd 
want to use.



firmware is totally broken with respect to -M older FWIW.



Well, then this is adding to the brokenness.

fwcfg dma is going to have exactly one user, libguestfs.  Much better 
to have libguestfs move to some other interface and improve our 
users-to-interfaces ratio.


BTW, the brokenness is that regardless of -M older, we always use the 
newest firmware.  Because we always use the newest firmware, fwcfg is not a 
backwards compatible interface.


Migration totally screws this up.  While we migrate roms (and correctly 
now thanks to Alex's patches), we size the allocation based on the 
newest firmware size.  That means if we ever decreased the size of a 
rom, we'd see total failure (even if we had a compatible fwcfg interface).


Regards,

Anthony Liguori




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 07:44:49PM +0300, Avi Kivity wrote:
>  On 08/03/2010 07:28 PM, Richard W.M. Jones wrote:
> >I have posted a small patch which makes this 650x faster without
> >appreciable complication.
> 
> It doesn't appear to support live migration, or hiding the feature
> for -M older.

AFAICT live migration should still work (even assuming someone live
migrates a domain during early boot, which seems pretty unlikely ...)
Maybe you mean live migration of the dma_* global variables?  I can
fix that.

> It's not a good path to follow.  Tomorrow we'll need to load 300MB
> initrds and we'll have to rework this yet again.

Not a very good straw man ...  The patch would take ~300ms instead
of ~115ms, versus something like 2 mins 40 seconds with the current
method.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 08:00 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 07:48:17PM +0300, Avi Kivity wrote:

  On 08/03/2010 07:44 PM, Avi Kivity wrote:

It's not a good path to follow.  Tomorrow we'll need to load 300MB
initrds and we'll have to rework this yet again.  Meanwhile the
kernel and virtio support demand loading of any image size you'd
want to use.


Even better would be to use virtio-9p.  You don't even need an image
in this case.

We don't want to expose the whole host filesystem, just selected
files, and we want to use our own configuration files (basically
that's what is in the skeleton part that we do ship).


True.  The guest might landmine its disks with something that the 
libguestfs kernel would step on and be exploited.


You might hardlink the needed files into a private directory tree.
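That could be as simple as the following (a sketch; it assumes the selected files and the private tree live on the same filesystem, since hardlinks cannot cross filesystems, and that basenames don't collide):

```python
import os

def link_into_private_tree(files, private_dir):
    """Hardlink each needed host file into a private directory, so that
    only the selected files are exposed (e.g. over a filesystem
    passthrough) rather than the whole host filesystem."""
    os.makedirs(private_dir, exist_ok=True)
    for path in files:
        dst = os.path.join(private_dir, os.path.basename(path))
        if os.path.exists(dst):
            os.unlink(dst)   # refresh a stale link from a previous run
        os.link(path, dst)   # same inode as the source, no data copied
```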


Of course, if we can use virtio-9p, then excellent.  Is there good
documentation about virtio-9p?  What I can find is fragmentary or
based on reading qemu -help ...


Not to my knowledge.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:56 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 07:44:49PM +0300, Avi Kivity wrote:

  On 08/03/2010 07:28 PM, Richard W.M. Jones wrote:

I have posted a small patch which makes this 650x faster without
appreciable complication.

It doesn't appear to support live migration, or hiding the feature
for -M older.

AFAICT live migration should still work (even assuming someone live
migrates a domain during early boot, which seems pretty unlikely ...)


Live migration is sometimes performed automatically by management tools, 
which have no idea (nor do they care) what the guest is doing.



Maybe you mean live migration of the dma_* global variables?  I can
fix that.


Yes.


It's not a good path to follow.  Tomorrow we'll need to load 300MB
initrds and we'll have to rework this yet again.

Not a very good straw man ...  The patch would take ~300ms instead
of ~115ms, versus something like 2 mins 40 seconds with the current
method.



It's still 300ms extra time, with a 900MB footprint.

btw, a DMA interface which blocks the guest and/or qemu for 115ms is not 
something we want to introduce to qemu.  DMA is hard; doing something 
simple means it won't work very well.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 11:50 AM, Avi Kivity wrote:

 On 08/03/2010 07:46 PM, Anthony Liguori wrote:
It doesn't appear to support live migration, or hiding the feature 
for -M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the 
kernel and virtio support demand loading of any image size you'd 
want to use.



firmware is totally broken with respect to -M older FWIW.



Well, then this is adding to the brokenness.

fwcfg dma is going to have exactly one user, libguestfs.  Much better 
to have libguestfs move to some other interface and improve our 
users-to-interfaces ratio.


You mean, only one class of users cares about the performance of loading 
an initrd.  However, you've also argued in other threads how important 
it is not to break libvirt even if it means we have to do silly things 
(like change help text).


So... why is it that libguestfs has to change itself and yet we should 
bend over backwards so libvirt doesn't have to change itself?


Regards,

Anthony Liguori





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:46 PM, Anthony Liguori wrote:
It doesn't appear to support live migration, or hiding the feature 
for -M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the 
kernel and virtio support demand loading of any image size you'd want 
to use.



firmware is totally broken with respect to -M older FWIW.



Well, then this is adding to the brokenness.

fwcfg dma is going to have exactly one user, libguestfs.  Much better to 
have libguestfs move to some other interface and improve our 
users-to-interfaces ratio.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 12:01 PM, Avi Kivity wrote:
You mean, only one class of users cares about the performance of 
loading an initrd.  However, you've also argued in other threads how 
important it is not to break libvirt even if it means we have to do 
silly things (like change help text).


So... why is it that libguestfs has to change itself and yet we 
should bend over backwards so libvirt doesn't have to change itself?



libvirt is a major user that is widely deployed, and would be 
completely broken if we change -help.  Changing -help is of no 
consequence to us.
libguestfs is a (pardon me) minor user that is not widely used, and 
would suffer a performance regression, not total breakage, unless we 
add a fw-dma interface.  Adding the interface is of consequence to us: 
we have to implement live migration and backwards compatibility, and 
support this new interface for a long while.


I certainly buy the argument about making changes of little consequence 
to us vs. ones that we have to be concerned about long term.


However, I don't think we can objectively differentiate between a 
"major" and "minor" user.  Generally speaking, I would rather that we 
not take the position of "you are a minor user therefore we're not going 
to accommodate you".


Regards,

Anthony Liguori



In an ideal world we wouldn't tolerate any regression.  The world is 
not ideal, so we prioritize.


the -help change scores very high on benefit/cost.  fw-dma, much lower.

Note in both cases the long term solution is for the user to move to 
another interface (cap reporting, virtio), so adding an interface 
which would only be abandoned later by its only user drops the 
benefit/cost ratio even further.







Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 08:42 PM, Anthony Liguori wrote:
However, I don't think we can objectively differentiate between a 
"major" and "minor" user.  Generally speaking, I would rather that we 
not take the position of "you are a minor user therefore we're not 
going to accommodate you".


Again it's a matter of practicalities.  We have written virtio drivers 
for Windows and Linux, but not for FreeDOS or NetWare.  To speed up 
Windows XP we have (in qemu-kvm) kvm-tpr-opt.c, which is a gross breach of 
decency; would we go to the same lengths to speed up Haiku?  I suggest 
that we would not.


libvirt and Windows XP did not win "major user" status by making large 
anonymous donations to qemu developers.  They did so by having lots of 
users.  Those users are our end users, and we should be focusing our 
efforts in a way that maximizes the gain for as large a number of those 
end users as we can.


Not breaking libvirt will be unknowingly appreciated by a large number 
of users, every day.  Not slowing down libguestfs, by a much smaller 
number for a much shorter time.  If it were just a matter of changing 
the help text I wouldn't mind at all, but introducing an undocumented 
migration-unsafe broken-dma interface isn't something I'm happy to do.


btw, gaining back some of the speed that we lost _is_ something I want 
to do, since it doesn't break or add any interfaces, and would be a gain 
not just for libguestfs, but also for Windows installs (which use string 
pio extensively).  Richard, can you test kvm.git master?  it already 
contains one fix and we plan to add more.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 08:58:10PM +0300, Avi Kivity wrote:
> Richard, can you test kvm.git
> master?  it already contains one fix and we plan to add more.

Yup, I will ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 12:58 PM, Avi Kivity wrote:

 On 08/03/2010 08:42 PM, Anthony Liguori wrote:
However, I don't think we can objectively differentiate between a 
"major" and "minor" user.  Generally speaking, I would rather that we 
not take the position of "you are a minor user therefore we're not 
going to accommodate you".


Again it's a matter of practicalities.  We have written virtio 
drivers for Windows and Linux, but not for FreeDOS or NetWare.  To 
speed up Windows XP we have (in qemu-kvm) kvm-tpr-opt.c, which is a 
gross breach of decency; would we go to the same lengths to speed up 
Haiku?  I suggest that we would not.


tpr-opt optimizes a legitimate dependence on the x86 architecture that 
Windows has.  While the implementation may be grossly indecent, it 
certainly fits the overall mission of what we're trying to do in qemu 
and kvm which is emulate an architecture.


You've invested a lot of time and effort into it because it's important 
to you (or more specifically, your employer).  That's because Windows is 
important to you.


If someone as adept and committed as you were heavily invested in Haiku and 
was willing to implement something equivalent to tpr-opt and also 
willing to do all of the work of maintaining it, then rejecting such a 
patch would be a mistake.


If Richard is willing to do the work to make -kernel perform faster in 
such a way that it fits into the overall mission of what we're building, 
then I see no reason to reject it.  The criteria for evaluating a patch 
should only depend on how it affects other areas of qemu and whether it 
impacts overall usability.


As a side note, we ought to do a better job of removing features that 
have created a burden on other areas of qemu that aren't actively being 
maintained.  That's a different discussion though.


Regards,

Anthony Liguori



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 09:26 PM, Anthony Liguori wrote:

On 08/03/2010 12:58 PM, Avi Kivity wrote:

 On 08/03/2010 08:42 PM, Anthony Liguori wrote:
However, I don't think we can objectively differentiate between a 
"major" and "minor" user.  Generally speaking, I would rather that 
we not take the position of "you are a minor user therefore we're 
not going to accommodate you".


Again it's a matter of practicalities.  We have written virtio 
drivers for Windows and Linux, but not for FreeDOS or NetWare.  To 
speed up Windows XP we have (in qemu-kvm) kvm-tpr-opt.c, which is a 
gross breach of decency; would we go to the same lengths to speed up 
Haiku?  I suggest that we would not.


tpr-opt optimizes a legitimate dependence on the x86 architecture that 
Windows has.  While the implementation may be grossly indecent, it 
certainly fits the overall mission of what we're trying to do in qemu 
and kvm which is emulate an architecture.


You've invested a lot of time and effort into it because it's 
important to you (or more specifically, your employer).  That's 
because Windows is important to you.


Correct.



If someone as adept and committed as you were heavily invested in Haiku 
and was willing to implement something equivalent to tpr-opt and also 
willing to do all of the work of maintaining it, then rejecting such a 
patch would be a mistake.


libguestfs does not depend on an x86 architectural feature.  
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We should 
discourage people from depending on this interface for production use.




If Richard is willing to do the work to make -kernel perform faster in 
such a way that it fits into the overall mission of what we're 
building, then I see no reason to reject it.  The criteria for 
evaluating a patch should only depend on how it affects other areas of 
qemu and whether it impacts overall usability.


That's true, but extending fwcfg doesn't fit into the overall picture 
well.  We have well defined interfaces for pushing data into a guest: 
virtio-serial (dma upload), virtio-blk (adds demand paging), and 
virtio-p9fs (no image needed).  Adapting libguestfs to use one of these 
is a better move than adding yet another interface.


A better (though still inaccurate) analogy would be if the developers 
of a guest OS came up with a virtual bus for devices and were willing to 
do the work to make this bus perform better.  Would we accept this new 
work or would we point them at our existing bus (pci) instead?


Really, the bar on new interfaces (both to guest and host) should be 
high, much higher than it is now.  Interfaces should be well documented, 
future proof, migration safe, and orthogonal to existing interfaces.  
While the first three points could be improved with some effort, adding 
a new dma interface is not going to be orthogonal to virtio.  And 
frankly, libguestfs is better off switching to one of the other 
interfaces.  Slurping huge initrds isn't the right way to do this.


As a side note, we ought to do a better job of removing features that 
have created a burden on other areas of qemu that aren't actively 
being maintained.  That's a different discussion though.


Sure, we need something like Linux' 
Documentation/feature-removal-schedule.txt for people to ignore.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 09:43 PM, Avi Kivity wrote:
Really, the bar on new interfaces (both to guest and host) should be 
high, much higher than it is now.  Interfaces should be well 
documented, future proof, migration safe, and orthogonal to existing 
interfaces.  While the first three points could be improved with some 
effort, adding a new dma interface is not going to be orthogonal to 
virtio.  And frankly, libguestfs is better off switching to one of the 
other interfaces.  Slurping huge initrds isn't the right way to do this.


btw, precedent should play no role here.  Just because an older 
interface wasn't documented or migration safe or unit-tested doesn't 
mean new ones get off the hook.


It does help to have a framework in place that we can point people at, 
for example I added a skeleton Documentation/kvm/api.txt and some unit 
tests and then made contributors fill them in for new features.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 01:43 PM, Avi Kivity wrote:


If Richard is willing to do the work to make -kernel perform faster 
in such a way that it fits into the overall mission of what we're 
building, then I see no reason to reject it.  The criteria for 
evaluating a patch should only depend on how it affects other areas 
of qemu and whether it impacts overall usability.


That's true, but extending fwcfg doesn't fit into the overall picture 
well.  We have well defined interfaces for pushing data into a guest: 
virtio-serial (dma upload), virtio-blk (adds demand paging), and 
virtio-p9fs (no image needed).  Adapting libguestfs to use one of 
these is a better move than adding yet another interface.


On real hardware, there's an awful lot of interaction between the 
firmware and the platform.  It's a pretty rich interface.  On IBM 
systems, we actually extend that all the way down to userspace via a 
virtual USB RNDIS driver that you can use IPMI over.


A better (though still inaccurate) analogy would be if the 
developers of a guest OS came up with a virtual bus for devices and 
were willing to do the work to make this bus perform better.  Would we 
accept this new work or would we point them at our existing bus (pci) 
instead?


Doesn't this precisely describe virtio-s390?



Really, the bar on new interfaces (both to guest and host) should be 
high, much higher than it is now.  Interfaces should be well 
documented, future proof, migration safe, and orthogonal to existing 
interfaces.


Okay, but this is a bigger discussion that I'm very eager to have.  But 
we shouldn't explicitly apply new policies to random patches without 
clearly stating the policy up front.


Regards,

Anthony Liguori

  While the first three points could be improved with some effort, 
adding a new dma interface is not going to be orthogonal to virtio.  
And frankly, libguestfs is better off switching to one of the other 
interfaces.  Slurping huge initrds isn't the right way to do this.


As a side note, we ought to do a better job of removing features that 
have created a burden on other areas of qemu that aren't actively 
being maintained.  That's a different discussion though.


Sure, we need something like Linux' 
Documentation/feature-removal-schedule.txt for people to ignore.







Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 09:55 PM, Anthony Liguori wrote:

On 08/03/2010 01:43 PM, Avi Kivity wrote:


If Richard is willing to do the work to make -kernel perform faster 
in such a way that it fits into the overall mission of what we're 
building, then I see no reason to reject it.  The criteria for 
evaluating a patch should only depend on how it affects other areas 
of qemu and whether it impacts overall usability.


That's true, but extending fwcfg doesn't fit into the overall picture 
well.  We have well defined interfaces for pushing data into a guest: 
virtio-serial (dma upload), virtio-blk (adds demand paging), and 
virtio-p9fs (no image needed).  Adapting libguestfs to use one of 
these is a better move than adding yet another interface.


On real hardware, there's an awful lot of interaction between the 
firmware and the platform.  It's a pretty rich interface.  On IBM 
systems, we actually extend that all the way down to userspace via a 
virtual USB RNDIS driver that you can use IPMI over.


That is fine and we'll do pv interfaces when we have to.  That's fw_cfg, 
that's virtio.  But let's not do more than we have to.




A better (though still inaccurate) analogy would be if the 
developers of a guest OS came up with a virtual bus for devices and 
were willing to do the work to make this bus perform better.  Would 
we accept this new work or would we point them at our existing bus 
(pci) instead?


Doesn't this precisely describe virtio-s390?


As I understood it, s390 had good reasons not to use their native 
interfaces.  On x86 we have no good reason not to use pci and no good 
reason not to use virtio for dma.




Really, the bar on new interfaces (both to guest and host) should be 
high, much higher than it is now.  Interfaces should be well 
documented, future proof, migration safe, and orthogonal to existing 
interfaces.


Okay, but this is a bigger discussion that I'm very eager to have.  
But we shouldn't explicitly apply new policies to random patches 
without clearly stating the policy up front.




Migration safety has been part of the criteria for a while.  Future 
proofness less so.  Documentation was usually completely missing but I 
see no reason not to insist on it now, better late than never.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gleb Natapov
On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
> >
> >If Richard is willing to do the work to make -kernel perform
> >faster in such a way that it fits into the overall mission of what
> >we're building, then I see no reason to reject it.  The criteria
> >for evaluating a patch should only depend on how it affects other
> >areas of qemu and whether it impacts overall usability.
> 
> That's true, but extending fwcfg doesn't fit into the overall
> picture well.  We have well defined interfaces for pushing data into
> a guest: virtio-serial (dma upload), virtio-blk (adds demand
> paging), and virtio-p9fs (no image needed).  Adapting libguestfs to
> use one of these is a better move than adding yet another interface.
> 
+1. I already proposed that. Nobody objects to a fast
communication channel between guest and host. In fact we have one:
virtio-serial. Of course it is much easier to hack dma semantics into
the fw_cfg interface than to add virtio-serial to seabios, but that
doesn't make it right. Does virtio-serial have to be exposed as PCI to a
guest, or can we expose it as an ISA device too, in case someone wants to
use the -kernel option but does not want to see an additional PCI device
in the guest?

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 10:05 PM, Gleb Natapov wrote:



That's true, but extending fwcfg doesn't fit into the overall
picture well.  We have well defined interfaces for pushing data into
a guest: virtio-serial (dma upload), virtio-blk (adds demand
paging), and virtio-p9fs (no image needed).  Adapting libguestfs to
use one of these is a better move than adding yet another interface.


+1. I already proposed that. Nobody objects to a fast
communication channel between guest and host. In fact we have one:
virtio-serial. Of course it is much easier to hack dma semantics into
the fw_cfg interface than to add virtio-serial to seabios, but that
doesn't make it right. Does virtio-serial have to be exposed as PCI to a
guest, or can we expose it as an ISA device too, in case someone wants to
use the -kernel option but does not want to see an additional PCI device
in the guest?


No need for virtio-serial in firmware.  We can have a small initrd slurp 
a larger filesystem via virtio-serial, or mount a virtio-blk or 
virtio-p9fs, or boot the whole thing from a virtio-blk image and avoid 
-kernel -initrd completely.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
> libguestfs does not depend on an x86 architectural feature.
> qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
> should discourage people from depending on this interface for
> production use.

I really don't get this whole thing where we must slavishly
emulate an exact PC ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 02:05 PM, Gleb Natapov wrote:

On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
   

If Richard is willing to do the work to make -kernel perform
faster in such a way that it fits into the overall mission of what
we're building, then I see no reason to reject it.  The criteria
for evaluating a patch should only depend on how it affects other
areas of qemu and whether it impacts overall usability.
   

That's true, but extending fwcfg doesn't fit into the overall
picture well.  We have well defined interfaces for pushing data into
a guest: virtio-serial (dma upload), virtio-blk (adds demand
paging), and virtio-p9fs (no image needed).  Adapting libguestfs to
use one of these is a better move than adding yet another interface.

 

+1. I already proposed that. Nobody objects to a fast
communication channel between guest and host. In fact we have one:
virtio-serial. Of course it is much easier to hack dma semantics into
the fw_cfg interface than to add virtio-serial to seabios, but that
doesn't make it right. Does virtio-serial have to be exposed as PCI to a
guest, or can we expose it as an ISA device too, in case someone wants to
use the -kernel option but does not want to see an additional PCI device
in the guest?
   


fw_cfg has to be available pretty early on so relying on a PCI device 
isn't reasonable.  Having dual interfaces seems wasteful.


We're already doing bulk data transfer over fw_cfg as we need to do it 
to transfer roms and potentially a boot splash.  Even outside of loading 
an initrd, the performance is going to start to matter with a large 
number of devices.


Regards,

Anthony Liguori


--
Gleb.
   





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gleb Natapov
On Tue, Aug 03, 2010 at 08:13:46PM +0100, Richard W.M. Jones wrote:
> On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
> > libguestfs does not depend on an x86 architectural feature.
> > qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
> > should discourage people from depending on this interface for
> > production use.
> 
> I really don't get this whole thing where we must slavishly
> emulate an exact PC ...
> 
Maybe because you don't have to deal with the consequences of not doing so?

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 02:13 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
   

libguestfs does not depend on an x86 architectural feature.
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
should discourage people from depending on this interface for
production use.
 

I really don't get this whole thing where we must slavishly
emulate an exact PC ...
   


History has shown that when we deviate, we usually get it wrong and it 
becomes very painful to fix.


Regards,

Anthony Liguori


Rich.

   





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 10:13 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:

libguestfs does not depend on an x86 architectural feature.
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
should discourage people from depending on this interface for
production use.

I really don't get this whole thing where we must slavishly
emulate an exact PC ...


This has two motivations:

- documented interfaces: we suck at documentation.  We seldom document.  
Even when we do document something, the documentation is often 
inaccurate, misleading, and incomplete.  While an "exact PC" 
unfortunately doesn't exist, it's a lot closer to reality than, say, an 
"exact Linux syscall interface".  If we adopt an existing interface, we 
already have the documentation, and if there's a conflict between the 
documentation and our implementation, it's clear who wins (well, not 
always).


- preexisting guests: if we design a new interface, we get to update all 
guests; and there are many of them.  Whereas an "exact PC" will be seen 
by the guest vendors as well who will then add whatever support is 
necessary.


Obviously we break this when we have to, but when we don't have to, we shouldn't.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 10:15 PM, Anthony Liguori wrote:


fw_cfg has to be available pretty early on so relying on a PCI device 
isn't reasonable.  Having dual interfaces seems wasteful.


Agree.



We're already doing bulk data transfer over fw_cfg as we need to do it 
to transfer roms and potentially a boot splash. 


Why do we need to transfer roms?  These are devices on the memory bus or 
pci bus, it just needs to be there at the right address.  Boot splash 
should just be another rom as it would be on a real system.


Even outside of loading an initrd, the performance is going to start 
to matter with a large number of devices.


I don't really see why.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gleb Natapov
On Tue, Aug 03, 2010 at 02:15:05PM -0500, Anthony Liguori wrote:
> On 08/03/2010 02:05 PM, Gleb Natapov wrote:
> >On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
> >>>If Richard is willing to do the work to make -kernel perform
> >>>faster in such a way that it fits into the overall mission of what
> >>>we're building, then I see no reason to reject it.  The criteria
> >>>for evaluating a patch should only depend on how it affects other
> >>>areas of qemu and whether it impacts overall usability.
> >>That's true, but extending fwcfg doesn't fit into the overall
> >>picture well.  We have well defined interfaces for pushing data into
> >>a guest: virtio-serial (dma upload), virtio-blk (adds demand
> >>paging), and virtio-p9fs (no image needed).  Adapting libguestfs to
> >>use one of these is a better move than adding yet another interface.
> >>
> >+1. I already proposed that. Nobody objects to a fast
> >communication channel between guest and host. In fact we have one:
> >virtio-serial. Of course it is much easier to hack dma semantics into
> >the fw_cfg interface than to add virtio-serial to seabios, but that
> >doesn't make it right. Does virtio-serial have to be exposed as PCI to a
> >guest, or can we expose it as an ISA device too, in case someone wants to
> >use the -kernel option but does not want to see an additional PCI device
> >in the guest?
> 
> fw_cfg has to be available pretty early on so relying on a PCI
> device isn't reasonable.  Having dual interfaces seems wasteful.
> 
fw_cfg wasn't meant to be used for bulk transfers (seabios doesn't even
use string pio to access it, which makes load time 50 times slower than
what Richard reports). It was meant to be easy to use at very early
stages of booting. The kernel/initrd are loaded at a very late stage of
booting, at which point PCI is fully initialized.
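[Editorial sketch: the arithmetic behind these slowdowns is that with emulated port I/O each access costs a fixed round trip through KVM, so transfer time is dominated by the number of accesses, not the number of bytes. A back-of-the-envelope model follows; the ~2 µs per-exit cost is an assumed ballpark figure, not a measurement from the thread.]

```python
def pio_overhead_seconds(total_bytes, bytes_per_access, exit_cost_ns=2000):
    """Estimated total exit overhead for moving total_bytes via port I/O."""
    accesses = -(-total_bytes // bytes_per_access)  # ceiling division
    return accesses * exit_cost_ns / 1e9

initrd = 100 << 20  # a 100 MB initrd, as in the original report

# One inb() per byte gives minutes of exit overhead alone; a string
# instruction moving 4 KiB per exit brings it well under a second.
print(pio_overhead_seconds(initrd, 1), pio_overhead_seconds(initrd, 4096))
```

This is why per-byte fw_cfg access is catastrophic for large initrds while the same data moved in big chunks (string PIO, or any DMA-like interface) is cheap.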

> We're already doing bulk data transfer over fw_cfg as we need to do
> it to transfer roms and potentially a boot splash.  Even outside of
> loading an initrd, the performance is going to start to matter with
> a large number of devices.
> 
Most ROMs are loaded from ROM PCI BARs, so this leaves us with the boot
splash, but the boot splash image should be relatively small, and if the
user wants it he does not care about boot time anyway, since the BIOS
needs to pause to show the boot splash.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 02:24 PM, Avi Kivity wrote:

 On 08/03/2010 10:15 PM, Anthony Liguori wrote:


fw_cfg has to be available pretty early on so relying on a PCI device 
isn't reasonable.  Having dual interfaces seems wasteful.


Agree.



We're already doing bulk data transfer over fw_cfg as we need to do 
it to transfer roms and potentially a boot splash. 


Why do we need to transfer roms?  These are devices on the memory bus 
or pci bus, it just needs to be there at the right address.


Not quite.  The BIOS owns the option ROM space.  The way it works on 
bare metal is that the PCI ROM BAR gets mapped to some location in 
physical memory by the BIOS, the BIOS executes the initialization 
vector, and after initialization, the ROM will reorganize itself into 
something smaller.  It's nice and clean.


But ISA is not nearly as clean.  Ultimately, to make this mix work in a 
reasonable way, we have to provide a side channel interface to SeaBIOS 
such that we can deliver ROMs outside of PCI and still let SeaBIOS 
decide how ROMs get organized.


It's additionally complicated by the fact that we didn't support PCI ROM 
BAR until recently, so to maintain compatibility with -M older, we have 
to use a side channel to lay out option roms.


Regards,

Anthony Liguori


  Boot splash should just be another rom as it would be on a real system.

Even outside of loading an initrd, the performance is going to start 
to matter with a large number of devices.


I don't really see why.






Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 10:38 PM, Anthony Liguori wrote:
Why do we need to transfer roms?  These are devices on the memory bus 
or pci bus, it just needs to be there at the right address.



Not quite.  The BIOS owns the option ROM space.  The way it works on 
bare metal is that the PCI ROM BAR gets mapped to some location in 
physical memory by the BIOS, the BIOS executes the initialization 
vector, and after initialization, the ROM will reorganize itself into 
something smaller.  It's nice and clean.


But ISA is not nearly as clean. 


So far so good.

Ultimately, to make this mix work in a reasonable way, we have to 
provide a side channel interface to SeaBIOS such that we can deliver 
ROMs outside of PCI and still let SeaBIOS decide how ROMs get organized.


I don't follow.  Why do we need this side channel?  What would a real 
ISA machine do?  Are there actually enough ISA devices for there to be a 
problem?




It's additionally complicated by the fact that we didn't support PCI 
ROM BAR until recently so to maintain compatibility with -M older, we 
have to use a side channel to lay out option roms.


Again I don't follow.  We can just lay out the ROMs in memory like we 
did in the past?


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 02:41 PM, Avi Kivity wrote:

 On 08/03/2010 10:38 PM, Anthony Liguori wrote:
Why do we need to transfer roms?  These are devices on the memory 
bus or pci bus, it just needs to be there at the right address.



Not quite.  The BIOS owns the option ROM space.  The way it works on 
bare metal is that the PCI ROM BAR gets mapped to some location in 
physical memory by the BIOS, the BIOS executes the initialization 
vector, and after initialization, the ROM will reorganize itself into 
something smaller.  It's nice and clean.


But ISA is not nearly as clean. 


So far so good.

Ultimately, to make this mix work in a reasonable way, we have to 
provide a side channel interface to SeaBIOS such that we can deliver 
ROMs outside of PCI and still let SeaBIOS decide how ROMs get organized.


I don't follow.  Why do we need this side channel?  What would a real 
ISA machine do?


It depends on the ISA machine.  In the worst case, there's a DIP switch 
on the card and if you've got a conflict between two cards, you start 
flipping DIP switches.  It's pure awesomeness.  No, I don't want to 
emulate DIP switches :-)



  Are there actually enough ISA devices for there to be a problem?


No, but -M older has the same problem.



It's additionally complicated by the fact that we didn't support PCI 
ROM BAR until recently so to maintain compatibility with -M older, we 
have to use a side channel to lay out option roms.


Again I don't follow.  We can just lay out the ROMs in memory like we 
did in the past?


Because only one component can own the option ROM space.  Either that's 
SeaBIOS and we need a side channel or it's QEMU and we can't use PMM.


I guess that's the real issue here.  Previously we used etherboot which 
was well under 32k.  We only loaded roms we needed.  Now we use gPXE 
which is much bigger and if you don't use PMM, then you run out of 
option rom space very quickly.


Previously, we loaded option ROMs on demand when a user used -boot n but 
that was a giant hack and wasn't like bare metal at all.  It involved 
x86-isms in vl.c.  Now we always load ROMs so PMM is very important.


Regards,

Anthony Liguori



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 10:22:22PM +0300, Avi Kivity wrote:
>  On 08/03/2010 10:13 PM, Richard W.M. Jones wrote:
> >On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
> >>libguestfs does not depend on an x86 architectural feature.
> >>qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
> >>should discourage people from depending on this interface for
> >>production use.
> >I really don't get this whole thing where we must slavishly
> >emulate an exact PC ...
> 
> This has two motivations:
> 
> - documented interfaces: we suck at documentation.  We seldom
> document.  Even when we do document something, the documentation is
> often inaccurate, misleading, and incomplete.  While an "exact PC"
> unfortunately doesn't exist, it's a lot closer to reality than, say,
> an "exact Linux syscall interface".  If we adopt an existing
> interface, we already have the documentation, and if there's a
> conflict between the documentation and our implementation, it's
> clear who wins (well, not always).
> 
> - preexisting guests: if we design a new interface, we get to update
> all guests; and there are many of them.  Whereas an "exact PC" will
> be seen by the guest vendors as well who will then add whatever
> support is necessary.

On the other hand we end up with stuff like only being able to add 29
virtio-blk devices to a single guest.  As best as I can tell, this
comes from PCI, and this limit required a bunch of hacks when
implementing virt-df.

These are reasonable motivations, but I think they are partially about
us:

We could document things better and make things future-proof.  I'm
surprised by how lacking the doc requirements are for qemu (compared
to, hmm, libguestfs for example).

We could demand that OSes write device drivers for more qemu devices
-- already OS vendors write thousands of device drivers for all sorts
of obscure devices, so this isn't really much of a demand for them.
In fact, they're already doing it.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 03:00 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 10:22:22PM +0300, Avi Kivity wrote:
   

  On 08/03/2010 10:13 PM, Richard W.M. Jones wrote:
 

On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
   

libguestfs does not depend on an x86 architectural feature.
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
should discourage people from depending on this interface for
production use.
 

I really don't get this whole thing where we must slavishly
emulate an exact PC ...
   

This has two motivations:

- documented interfaces: we suck at documentation.  We seldom
document.  Even when we do document something, the documentation is
often inaccurate, misleading, and incomplete.  While an "exact PC"
unfortunately doesn't exist, it's a lot closer to reality than, say,
an "exact Linux syscall interface".  If we adopt an existing
interface, we already have the documentation, and if there's a
conflict between the documentation and our implementation, it's
clear who wins (well, not always).

- preexisting guests: if we design a new interface, we get to update
all guests; and there are many of them.  Whereas an "exact PC" will
be seen by the guest vendors as well who will then add whatever
support is necessary.
 

On the other hand we end up with stuff like only being able to add 29
virtio-blk devices to a single guest.  As best as I can tell, this
comes from PCI


No, this comes from us being too clever for our own good and not 
following the way hardware does it.


All modern systems keep disks on their own dedicated bus.  In 
virtio-blk, we have a 1-1 relationship between disks and PCI devices.  
That's a perfect example of what happens when we try to "improve" things.



, and this limit required a bunch of hacks when
implementing virt-df.

These are reasonable motivations, but I think they are partially about
us:

We could document things better and make things future-proof.  I'm
surprised by how lacking the doc requirements are for qemu (compared
to, hmm, libguestfs for example).
   


We enjoy complaining about our lack of documentation more than we like 
actually writing documentation.



We could demand that OSes write device drivers for more qemu devices
-- already OS vendors write thousands of device drivers for all sorts
of obscure devices, so this isn't really much of a demand for them.
In fact, they're already doing it.
   


So far, MS hasn't quite gotten the clue yet that they should write 
device drivers for qemu :-)  In fact, no one has.


Regards,

Anthony Liguori


Rich.

   





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Paolo Bonzini

On 08/03/2010 10:49 PM, Anthony Liguori wrote:

On the other hand we end up with stuff like only being able to add 29
virtio-blk devices to a single guest.  As best as I can tell, this
comes from PCI


No, this comes from us being too clever for our own good and not
following the way hardware does it.

All modern systems keep disks on their own dedicated bus.  In
virtio-blk, we have a 1-1 relationship between disks and PCI devices.
That's a perfect example of what happens when we try to "improve" things.


Comparing (from personal experience) the complexity of the Windows 
drivers for Xen and virtio shows that it's not a bad idea at all.


Paolo



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gerd Hoffmann

  Hi,


We're already doing bulk data transfer over fw_cfg as we need to do it
to transfer roms and potentially a boot splash.


Why do we need to transfer roms? These are devices on the memory bus or
pci bus, it just needs to be there at the right address.


Indeed.  We do that in most cases.  The exceptions are:

  (1) -M somethingold.  PCI devices don't have a pci rom bar then by
  default because they didn't not have one in older qemu versions,
  so we need some other way to pass the option rom to seabios.
  (2) vgabios.bin.  vgabios needs patches to make loading via pci rom
  bar work (vgabios-cirrus.bin works fine already).  I have patches
  in the queue to do that.
  (3) roms not associated with a PCI device:  multiboot, extboot,
  -option-rom command line switch, vgabios for -M isapc.

The default configuration (qemu $diskimage) loads two roms: 
vgabios-cirrus.bin and e1000.bin.  Both are loaded via pci rom bar and 
not via fw_cfg.


cheers,
  Gerd




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gerd Hoffmann

  Hi,


Again I don't follow. We can just lay out the ROMs in memory like we did
in the past?


Well.  We have some size issues then.  PCI ROMs are loaded by the BIOS 
in a way that only a small fraction is actually resident in the small 
0xd0000 -> 0xe0000 area.  That doesn't work if qemu tries to simply copy 
the whole thing there like old versions did.  With the size of the gPXE 
roms this matters in real life.


cheers,
  Gerd




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 04:13 PM, Paolo Bonzini wrote:

On 08/03/2010 10:49 PM, Anthony Liguori wrote:

On the other hand we end up with stuff like only being able to add 29
virtio-blk devices to a single guest.  As best as I can tell, this
comes from PCI


No, this comes from us being too clever for our own good and not
following the way hardware does it.

All modern systems keep disks on their own dedicated bus.  In
virtio-blk, we have a 1-1 relationship between disks and PCI devices.
That's a perfect example of what happens when we try to "improve" 
things.


Comparing (from personal experience) the complexity of the Windows 
drivers for Xen and virtio shows that it's not a bad idea at all.


Not quite sure what you're suggesting, but I could have been clearer.  
Instead of having virtio-blk where a virtio disk has a 1-1 mapping to a 
PCI device, we probably should have just done virtio-scsi.


Since most OSes have a SCSI-centric block layer, it would have resulted 
in much simpler drivers and we could support more than 1 disk per PCI 
slot.  I had thought Christoph was working on such a device at some 
point in time...


Regards,

Anthony Liguori



Paolo





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 10:24:41PM +0300, Avi Kivity wrote:
> Why do we need to transfer roms?  These are devices on the memory
> bus or pci bus, it just needs to be there at the right address.
> Boot splash should just be another rom as it would be on a real
> system.

Just like the initrd?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://et.redhat.com/~rjones/libguestfs/
See what it can do: http://et.redhat.com/~rjones/libguestfs/recipes.html



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Jamie Lokier
Richard W.M. Jones wrote:
> We could demand that OSes write device drivers for more qemu devices
> -- already OS vendors write thousands of device drivers for all sorts
> of obscure devices, so this isn't really much of a demand for them.
> In fact, they're already doing it.

Result: Most OSes not working with qemu?

Actually we seem to be going that way.  Recent qemus don't work with
older versions of Windows any more, so we have to use different
versions of qemu for different guests.

-- Jamie



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 10:47 PM, Anthony Liguori wrote:

On 08/03/2010 02:41 PM, Avi Kivity wrote:

 On 08/03/2010 10:38 PM, Anthony Liguori wrote:
Why do we need to transfer roms?  These are devices on the memory 
bus or pci bus, it just needs to be there at the right address.



Not quite.  The BIOS owns the option ROM space.  The way it works on 
bare metal is that the PCI ROM BAR gets mapped to some location in 
physical memory by the BIOS, the BIOS executes the initialization 
vector, and after initialization, the ROM will reorganize itself 
into something smaller.  It's nice and clean.


But ISA is not nearly as clean. 


So far so good.

Ultimately, to make this mix work in a reasonable way, we have to 
provide a side channel interface to SeaBIOS such that we can deliver 
ROMs outside of PCI and still let SeaBIOS decide how ROMs get 
organized.


I don't follow.  Why do we need this side channel?  What would a real 
ISA machine do?


It depends on the ISA machine.  In the worst case, there's a DIP 
switch on the card and if you've got a conflict between two cards, you 
start flipping DIP switches.  It's pure awesomeness.  No, I don't want 
to emulate DIP switches :-)


How else do you set the IRQ line and I/O port base address?

 static ISADeviceInfo ne2000_isa_info = {
 .qdev.name  = "ne2k_isa",
 .qdev.size  = sizeof(ISANE2000State),
 .init   = isa_ne2000_initfn,
 .qdev.props = (Property[]) {
 DEFINE_PROP_HEX32("iobase", ISANE2000State, iobase, 0x300),
 DEFINE_PROP_UINT32("irq",   ISANE2000State, isairq, 9),
+  DEFINE_PROP_HEX32("rombase", ISANE2000State, isarombase, 0xe8000),
 DEFINE_NIC_PROPERTIES(ISANE2000State, ne2000.c),
 DEFINE_PROP_END_OF_LIST(),
 },
 };


we already are emulating DIP switches...




  Are there actually enough ISA devices for there to be a problem?


No, but -M older has the same problem.


So we use the same solution we did for -M older.  We didn't have fwcfg dma 
back then.






It's additionally complicated by the fact that we didn't support PCI 
ROM BAR until recently so to maintain compatibility with -M older, 
we have to use a side channel to lay out option roms.


Again I don't follow.  We can just lay out the ROMs in memory like we 
did in the past?


Because only one component can own the option ROM space.  Either 
that's SeaBIOS and we need a side channel or it's QEMU and we can't 
use PMM.


I guess that's the real issue here.  Previously we used etherboot 
which was well under 32k.  We only loaded roms we needed.  Now we use 
gPXE which is much bigger and if you don't use PMM, then you run out 
of option rom space very quickly.


A true -M older would use the older ROMs for full compatibility.



Previously, we loaded option ROMs on demand when a user used -boot n 
but that was a giant hack and wasn't like bare metal at all.  It 
involved x86-isms in vl.c.  Now we always load ROMs so PMM is very 
important.


Though it's a hack, we can load ROMs via the existing fwcfg interface; 
no need for an extension.  Richard is seeing problems loading 100MB 
initrds, not 64KB ROMs.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/04/2010 12:20 AM, Gerd Hoffmann wrote:

  Hi,


We're already doing bulk data transfer over fw_cfg as we need to do it
to transfer roms and potentially a boot splash.


Why do we need to transfer roms? These are devices on the memory bus or
pci bus, it just needs to be there at the right address.


Indeed.  We do that in most cases.  The exceptions are:

  (1) -M somethingold.  PCI devices don't have a pci rom bar then by
  default because they didn't have one in older qemu versions,
  so we need some other way to pass the option rom to seabios.


What did we do back then?  before we had the fwcfg interface?


  (2) vgabios.bin.  vgabios needs patches to make loading via pci rom
  bar work (vgabios-cirrus.bin works fine already).  I have patches
  in the queue to do that.


So not an issue.


  (3) roms not associated with a PCI device:  multiboot, extboot,
  -option-rom command line switch, vgabios for -M isapc.


We could lay those out in high memory (4GB-512MB) and have the bios copy 
them from there.  I believe that's what real hardware does - the flash 
chip is mapped there (the reset vector is at 4GB-16) and shadowed at the 
end of the 1MB 8086 range.



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/04/2010 01:06 AM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 10:24:41PM +0300, Avi Kivity wrote:

Why do we need to transfer roms?  These are devices on the memory
bus or pci bus, it just needs to be there at the right address.
Boot splash should just be another rom as it would be on a real
system.

Just like the initrd?


There isn't enough address space for a 100MB initrd in ROM.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 11:49 PM, Anthony Liguori wrote:



We could demand that OSes write device drivers for more qemu devices
-- already OS vendors write thousands of device drivers for all sorts
of obscure devices, so this isn't really much of a demand for them.
In fact, they're already doing it.


So far, MS hasn't quite gotten the clue yet that they should write 
device drivers for qemu :-) 


To be fair, we haven't actually demanded that they do.


In fact, no one has.


Strangely, the reverse has happened - I think virtualbox has written 
virtio device models for their VMM.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gerd Hoffmann

  Hi,


(1) -M somethingold. PCI devices don't have a pci rom bar then by
default because they didn't have one in older qemu versions,
so we need some other way to pass the option rom to seabios.


What did we do back then? before we had the fwcfg interface?


Have qemu instead of bochs/seabios manage the vgabios/optionrom area 
(0xc8000 -> 0xe) and copy the roms to memory.  Which implies the 
whole rom has to sit there as PMM can't be used then.



(3) roms not associated with a PCI device: multiboot, extboot,
-option-rom command line switch, vgabios for -M isapc.


We could lay those out in high memory (4GB-512MB) and have the bios copy
them from there.


Yea, we could.  But it is pointless IMHO.

$ ls -l *.bin
-rwxrwxr-x. 1 kraxel kraxel 1536 Jul 15 15:51 extboot.bin*
-rwxrwxr-x. 1 kraxel kraxel 1024 Jul 15 15:51 linuxboot.bin*
-rwxrwxr-x. 1 kraxel kraxel 1024 Jul 15 15:51 multiboot.bin*
-rwxrwxr-x. 1 kraxel kraxel 8960 Jul 15 15:51 vapic.bin*

Those are the ones we can't load via pci rom bar.  Look how small they are.

cheers,
  Gerd




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Paolo Bonzini

On 08/03/2010 11:34 PM, Anthony Liguori wrote:


Comparing (from personal experience) the complexity of the Windows
drivers for Xen and virtio shows that it's not a bad idea at all.


Not quite sure what you're suggesting, but I could have been clearer.
Instead of having virtio-blk where a virtio disk has a 1-1 mapping to a
PCI device, we probably should have just done virtio-scsi.


If you did virtio-scsi you might have as well ditched virtio-pci 
altogether and provide a single PCI device just like Xen does.  Just 
make your network device also speak SCSI (which is actually in the 
spec...), and the same for serial devices.


But now your driver has to implement its own hot-plug/hot-unplug 
mechanism rather than deferring it to the PCI subsystem of the OS (like 
Xen), greatly adding to the complication.  In fact, a SCSI controller's 
firmware has a lot of other communication channels with the driver 
besides SCSI commands, and all this would be mapped into additional 
complexity on both the host side and the guest side.  Yet another 
reminder of Xen.


Despite the shortcomings, I think virtio-pci is the best example of 
balancing PV-specific aspects (do not make things too complicated) and 
"real world" aspects (do not invent new buses and the like).


Paolo



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Avi Kivity

 On 08/04/2010 10:56 AM, Gerd Hoffmann wrote:

  Hi,


(1) -M somethingold. PCI devices don't have a pci rom bar then by
default because they didn't have one in older qemu versions,
so we need some other way to pass the option rom to seabios.


What did we do back then? before we had the fwcfg interface?


Have qemu instead of bochs/seabios manage the vgabios/optionrom area 
(0xc8000 -> 0xe0000) and copy the roms to memory.  Which implies the 
whole rom has to sit there as PMM can't be used then.


Do we actually need PMM for isapc?  Did PMM exist before pci?




(3) roms not associated with a PCI device: multiboot, extboot,
-option-rom command line switch, vgabios for -M isapc.


We could lay those out in high memory (4GB-512MB) and have the bios copy
them from there.


Yea, we could.  But it is pointless IMHO.

$ ls -l *.bin
-rwxrwxr-x. 1 kraxel kraxel 1536 Jul 15 15:51 extboot.bin*
-rwxrwxr-x. 1 kraxel kraxel 1024 Jul 15 15:51 linuxboot.bin*
-rwxrwxr-x. 1 kraxel kraxel 1024 Jul 15 15:51 multiboot.bin*
-rwxrwxr-x. 1 kraxel kraxel 8960 Jul 15 15:51 vapic.bin*

Those are the ones we can't load via pci rom bar.  Look how small they 
are.


So they can just sit there?  I'm confused, either there is enough 
address space and we don't need to play games, or there isn't and we do.


For playing games, there are three options:
- existing fwcfg
- fwcfg+dma
- put roms in 4GB-2MB (or whatever we decide the flash size is) and have 
the BIOS copy them


Existing fwcfg is the least amount of work and probably satisfactory for 
isapc.  fwcfg+dma is IMO going off a tangent.  High memory flash is the 
most hardware-like solution, pretty easy from a qemu point of view but 
requires more work.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Avi Kivity

 On 08/04/2010 10:57 AM, Paolo Bonzini wrote:

On 08/03/2010 11:34 PM, Anthony Liguori wrote:


Comparing (from personal experience) the complexity of the Windows
drivers for Xen and virtio shows that it's not a bad idea at all.


Not quite sure what you're suggesting, but I could have been clearer.
Instead of having virtio-blk where a virtio disk has a 1-1 mapping to a
PCI device, we probably should have just done virtio-scsi.


If you did virtio-scsi you might have as well ditched virtio-pci 
altogether and provide a single PCI device just like Xen does.  Just 
make your network device also speak SCSI (which is actually in the 
spec...), and the same for serial devices.


But now your driver has to implement its own hot-plug/hot-unplug 
mechanism rather than deferring it to the PCI subsystem of the OS 
(like Xen), greatly adding to the complication.  In fact, a SCSI 
controller's firmware has a lot of other communication channels with 
the driver besides SCSI commands, and all this would be mapped into 
additional complexity on both the host side and the guest side.  Yet 
another reminder of Xen.


Despite the shortcomings, I think virtio-pci is the best example of 
balancing PV-specific aspects (do not make things too complicated) and 
"real world" aspects (do not invent new buses and the like).


Making virtio-blk a controller doesn't involve much difficulty.  We add 
LUN to all requests, and send a configuration interrupt (which we 
already have) when a LUN is added or removed.  Add some config space for 
discovering available LUNs.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Avi Kivity

 On 08/03/2010 10:13 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:

libguestfs does not depend on an x86 architectural feature.
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
should discourage people from depending on this interface for
production use.

I really don't get this whole thing where we must slavishly
emulate an exact PC ...


An additional point in favour is that we have a method of resolving 
design arguments.  No need to think, we have the spec in front of us.  
The arguments then devolve into interpretation of the spec.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 11:17:28AM +0300, Avi Kivity wrote:
>  On 08/04/2010 10:56 AM, Gerd Hoffmann wrote:
> >  Hi,
> >
> >>>(1) -M somethingold. PCI devices don't have a pci rom bar then by
> >>>default because they didn't have one in older qemu versions,
> >>>so we need some other way to pass the option rom to seabios.
> >>
> >>What did we do back then? before we had the fwcfg interface?
> >
> >Have qemu instead of bochs/seabios manage the vgabios/optionrom
> >area (0xc8000 -> 0xe0000) and copy the roms to memory.  Which
> >implies the whole rom has to sit there as PMM can't be used then.
> 
> Do we actually need PMM for isapc?  Did PMM exist before pci?
> 
> >
> >>>(3) roms not associated with a PCI device: multiboot, extboot,
> >>>-option-rom command line switch, vgabios for -M isapc.
> >>
> >>We could lay those out in high memory (4GB-512MB) and have the bios copy
> >>them from there.
> >
> >Yea, we could.  But it is pointless IMHO.
> >
> >$ ls -l *.bin
> >-rwxrwxr-x. 1 kraxel kraxel 1536 Jul 15 15:51 extboot.bin*
> >-rwxrwxr-x. 1 kraxel kraxel 1024 Jul 15 15:51 linuxboot.bin*
> >-rwxrwxr-x. 1 kraxel kraxel 1024 Jul 15 15:51 multiboot.bin*
> >-rwxrwxr-x. 1 kraxel kraxel 8960 Jul 15 15:51 vapic.bin*
> >
> >Those are the ones we can't load via pci rom bar.  Look how small
> >they are.
> 
> So they can just sit there?  I'm confused, either there is enough
> address space and we don't need to play games, or there isn't and we
> do.
> 
> For playing games, there are three options:
> - existing fwcfg
> - fwcfg+dma
> - put roms in 4GB-2MB (or whatever we decide the flash size is) and
> have the BIOS copy them
> 
> Existing fwcfg is the least amount of work and probably satisfactory
> for isapc.  fwcfg+dma is IMO going off a tangent.  High memory flash
> is the most hardware-like solution, pretty easy from a qemu point of
> view but requires more work.
> 
We can do an interface like this: the guest enumerates the available roms
using fwcfg, tells the host to map a rom into a guest-specified IOMEM
region, copies the rom out of that region, and tells the host to unmap it.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gerd Hoffmann

On 08/04/10 10:17, Avi Kivity wrote:

On 08/04/2010 10:56 AM, Gerd Hoffmann wrote:

Hi,


(1) -M somethingold. PCI devices don't have a pci rom bar then by
default because they didn't have one in older qemu versions,
so we need some other way to pass the option rom to seabios.


What did we do back then? before we had the fwcfg interface?


Have qemu instead of bochs/seabios manage the vgabios/optionrom area
(0xc8000 -> 0xe0000) and copy the roms to memory. Which implies the
whole rom has to sit there as PMM can't be used then.


Do we actually need PMM for isapc? Did PMM exist before pci?


I don't know.


(3) roms not associated with a PCI device: multiboot, extboot,
-option-rom command line switch, vgabios for -M isapc.


We could lay those out in high memory (4GB-512MB) and have the bios copy
them from there.


Yea, we could. But it is pointless IMHO.

$ ls -l *.bin
-rwxrwxr-x. 1 kraxel kraxel 1536 Jul 15 15:51 extboot.bin*
-rwxrwxr-x. 1 kraxel kraxel 1024 Jul 15 15:51 linuxboot.bin*
-rwxrwxr-x. 1 kraxel kraxel 1024 Jul 15 15:51 multiboot.bin*
-rwxrwxr-x. 1 kraxel kraxel 8960 Jul 15 15:51 vapic.bin*

Those are the ones we can't load via pci rom bar. Look how small they are.


So they can just sit there? I'm confused, either there is enough address
space and we don't need to play games, or there isn't and we do.


Well.  Looks like I should be a bit more verbose.

The old (qemu 0.11) way was to have qemu load roms to memory and 
bochsbios/seabios scan the memory area for option rom signatures to find 
them.  All option roms have to fit in there then, completely:


  vgabios(~40k)
  etherboot rom  (~32k)
  extboot rom(~1k)

The new way is to have seabios load roms to memory:

  vgabios (~40k)
  gPXE rom header (~2k IIRC)
  extboot rom (~1k)

Thanks to SeaBIOS loading the roms only a small part of the gPXE rom has 
to live in the option rom area, everything else is stored somewhere else 
in high memory (using PMM, don't ask me how this works in detail).  gPXE 
roms are ~56k in size (e1000 even 72k), so they would fill up the option 
rom area pretty quickly if we loaded them the old way without PMM.


Another advantage of seabios loading the roms is that parts of the 
0xe0000 segment can be used then.  Seabios size is just a bit more than 
64k, so most of the 0xe0000 -> 0xf0000 area isn't actually used by seabios.


seabios has two ways get the roms:  (1) fw_cfg and (2) pci rom bar.  The 
ones listed above are the ones which have to go through fw_cfg.  There 
are more roms which have to fit into the option rom space (vgabios, one 
gPXE per nic), but these don't depend on fw_cfg.



For playing games, there are three options:
- existing fwcfg


Given the sizes, that is good and fast enough for the roms IMO.

Kernel+initrd is another story though.  We are talking about megabytes 
not kilobytes then.  Standard fedora initramfs is ~14M on x86_64.


cheers,
  Gerd




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 10:24:28AM +0100, Richard W.M. Jones wrote:
> On Wed, Aug 04, 2010 at 08:54:35AM +0300, Avi Kivity wrote:
> >  On 08/04/2010 01:06 AM, Richard W.M. Jones wrote:
> > >On Tue, Aug 03, 2010 at 10:24:41PM +0300, Avi Kivity wrote:
> > >>Why do we need to transfer roms?  These are devices on the memory
> > >>bus or pci bus, it just needs to be there at the right address.
> > >>Boot splash should just be another rom as it would be on a real
> > >>system.
> > >Just like the initrd?
> > 
> > There isn't enough address space for a 100MB initrd in ROM.
> 
> Because of limits of the original PC, sure, where you had to fit
> everything in 0xa-0xf or whatever it was.
> 
> But this isn't a real PC.
> 
In what way is it not?

> You can map the read-only memory anywhere you want.
> 
You can't. Guests expect certain memory layouts.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Richard W.M. Jones
On Wed, Aug 04, 2010 at 08:54:35AM +0300, Avi Kivity wrote:
>  On 08/04/2010 01:06 AM, Richard W.M. Jones wrote:
> >On Tue, Aug 03, 2010 at 10:24:41PM +0300, Avi Kivity wrote:
> >>Why do we need to transfer roms?  These are devices on the memory
> >>bus or pci bus, it just needs to be there at the right address.
> >>Boot splash should just be another rom as it would be on a real
> >>system.
> >Just like the initrd?
> 
> There isn't enough address space for a 100MB initrd in ROM.

Because of limits of the original PC, sure, where you had to fit
everything in 0xa0000-0xfffff or whatever it was.

But this isn't a real PC.

You can map the read-only memory anywhere you want.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://et.redhat.com/~rjones/libguestfs/
See what it can do: http://et.redhat.com/~rjones/libguestfs/recipes.html



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Avi Kivity

 On 08/04/2010 12:24 PM, Richard W.M. Jones wrote:


> > > Just like the initrd?
> >
> > There isn't enough address space for a 100MB initrd in ROM.
>
> Because of limits of the original PC, sure, where you had to fit
> everything in 0xa0000-0xfffff or whatever it was.
>
> But this isn't a real PC.
>
> You can map the read-only memory anywhere you want.


I wasn't talking about the 1MB limit, rather the 4GB limit.  Of that, 
3-3.5GB are reserved for RAM, 0.5-1GB for PCI.  Putting large amounts of 
ROM in that space will cost us PCI space.


100 MB initrds are a bad idea for multiple reasons.  Demand paging is 
there for a reason.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Richard W.M. Jones
On Wed, Aug 04, 2010 at 12:52:23PM +0300, Avi Kivity wrote:
>  On 08/04/2010 12:24 PM, Richard W.M. Jones wrote:
> >>>
> >>>Just like the initrd?
> >>There isn't enough address space for a 100MB initrd in ROM.
> >Because of limits of the original PC, sure, where you had to fit
> >everything in 0xa-0xf or whatever it was.
> >
> >But this isn't a real PC.
> >
> >You can map the read-only memory anywhere you want.
> 
> I wasn't talking about the 1MB limit, rather the 4GB limit.  Of
> that, 3-3.5GB are reserved for RAM, 0.5-1GB for PCI.  Putting large
> amounts of ROM in that space will cost us PCI space.

I'm only allocating 500MB of RAM, so there's easily enough space to
put a large ROM, with tons of room for growth (of both RAM and ROM).
Yes, even real hardware has done this.  The Weitek math copro mapped
itself in at physical address 0xc0000000 (a 32 MB window, IIRC).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Avi Kivity

 On 08/04/2010 02:33 PM, Richard W.M. Jones wrote:


> I'm only allocating 500MB of RAM, so there's easily enough space to
> put a large ROM, with tons of room for growth (of both RAM and ROM).
> Yes, even real hardware has done this.  The Weitek math copro mapped
> itself in at physical address 0xc0000000 (a 32 MB window, IIRC).


I'm sure it will work for your use case, but it becomes a feature that 
only works if you have a guest with a small amount of memory and few pci 
devices.  With a larger guest it fails.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 12:33:18PM +0100, Richard W.M. Jones wrote:
> On Wed, Aug 04, 2010 at 12:52:23PM +0300, Avi Kivity wrote:
> >  On 08/04/2010 12:24 PM, Richard W.M. Jones wrote:
> > >>>
> > >>>Just like the initrd?
> > >>There isn't enough address space for a 100MB initrd in ROM.
> > >Because of limits of the original PC, sure, where you had to fit
> > >everything in 0xa-0xf or whatever it was.
> > >
> > >But this isn't a real PC.
> > >
> > >You can map the read-only memory anywhere you want.
> > 
> > I wasn't talking about the 1MB limit, rather the 4GB limit.  Of
> > that, 3-3.5GB are reserved for RAM, 0.5-1GB for PCI.  Putting large
> > amounts of ROM in that space will cost us PCI space.
> 
> I'm only allocating 500MB of RAM, so there's easily enough space to
> put a large ROM, with tons of room for growth (of both RAM and ROM).
> Yes, even real hardware has done this.  The Weitek math copro mapped
> itself in at physical memory addresses c000 (a 32 MB window IIRC).
> 
0xc0000000 is 3G. This is where the PCI window usually starts (configurable
in the chipset). I don't see anything unusual in this particular HW.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 02:57 AM, Paolo Bonzini wrote:

On 08/03/2010 11:34 PM, Anthony Liguori wrote:


Comparing (from personal experience) the complexity of the Windows
drivers for Xen and virtio shows that it's not a bad idea at all.


Not quite sure what you're suggesting, but I could have been clearer.
Instead of having virtio-blk where a virtio disk has a 1-1 mapping to a
PCI device, we probably should have just done virtio-scsi.


If you did virtio-scsi you might have as well ditched virtio-pci 
altogether and provide a single PCI device just like Xen does.  Just 
make your network device also speak SCSI (which is actually in the 
spec...), and the same for serial devices.


But now your driver that has to implement its own hot-plug/hot-unplug 
mechanism rather than deferring it to the PCI subsystem of the OS 
(like Xen), greatly adding to the complication.  In fact, a SCSI 
controller's firmware has a lot of other communication channels with 
the driver besides SCSI commands, and all this would be mapped into 
additional complexity on both the host side and the guest side.  Yet 
another reminder of Xen.


Despite the shortcomings, I think virtio-pci is the best example of 
balancing PV-specific aspects (do not make things too complicated) and 
"real world" aspects (do not invent new buses and the like).


So how do we enable support for more than 20 disks?  I think a 
virtio-scsi is inevitable..


Regards,

Anthony Liguori







Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 04:24 AM, Richard W.M. Jones wrote:

> On Wed, Aug 04, 2010 at 08:54:35AM +0300, Avi Kivity wrote:
> > On 08/04/2010 01:06 AM, Richard W.M. Jones wrote:
> > > On Tue, Aug 03, 2010 at 10:24:41PM +0300, Avi Kivity wrote:
> > > > Why do we need to transfer roms?  These are devices on the memory
> > > > bus or pci bus, it just needs to be there at the right address.
> > > > Boot splash should just be another rom as it would be on a real
> > > > system.
> > >
> > > Just like the initrd?
> >
> > There isn't enough address space for a 100MB initrd in ROM.
>
> Because of limits of the original PC, sure, where you had to fit
> everything in 0xa0000-0xfffff or whatever it was.
>
> But this isn't a real PC.
>
> You can map the read-only memory anywhere you want.

It's not that simple.  Option roms are initialized in 16-bit mode so the 
physical address space is limited.  The address mappings have very well 
defined semantics.


Regards,

Anthony Liguori


> Rich.





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 03:17 AM, Avi Kivity wrote:

For playing games, there are three options:
- existing fwcfg
- fwcfg+dma
- put roms in 4GB-2MB (or whatever we decide the flash size is) and 
have the BIOS copy them


Existing fwcfg is the least amount of work and probably satisfactory 
for isapc.  fwcfg+dma is IMO going off a tangent.  High memory flash 
is the most hardware-like solution, pretty easy from a qemu point of 
view but requires more work.


The only trouble I see is that high memory isn't always available.  If 
it's a 32-bit PC and you've exhausted RAM space, then you're only left 
with the PCI hole, and it's not clear to me if you can really pull out 
100MB of space there as an option ROM without breaking something.


Regards,

Anthony Liguori





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> On 08/04/2010 03:17 AM, Avi Kivity wrote:
> >For playing games, there are three options:
> >- existing fwcfg
> >- fwcfg+dma
> >- put roms in 4GB-2MB (or whatever we decide the flash size is)
> >and have the BIOS copy them
> >
> >Existing fwcfg is the least amount of work and probably
> >satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> >High memory flash is the most hardware-like solution, pretty easy
> >from a qemu point of view but requires more work.
> 
> The only trouble I see is that high memory isn't always available.
> If it's a 32-bit PC and you've exhausted RAM space, then you're only
> left with the PCI hole and it's not clear to me if you can really
> pull out 100mb of space there as an option ROM without breaking
> something.
> 
We can map it on demand. Guest tells qemu to map rom "A" to address X by
writing into some io port. Guest copies rom. Guest tells qemu to unmap
it. Better than a DMA interface IMHO.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 08:07 AM, Gleb Natapov wrote:

> On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> > On 08/04/2010 03:17 AM, Avi Kivity wrote:
> > > For playing games, there are three options:
> > > - existing fwcfg
> > > - fwcfg+dma
> > > - put roms in 4GB-2MB (or whatever we decide the flash size is)
> > > and have the BIOS copy them
> > >
> > > Existing fwcfg is the least amount of work and probably
> > > satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> > > High memory flash is the most hardware-like solution, pretty easy
> > > from a qemu point of view but requires more work.
> >
> > The only trouble I see is that high memory isn't always available.
> > If it's a 32-bit PC and you've exhausted RAM space, then you're only
> > left with the PCI hole and it's not clear to me if you can really
> > pull out 100MB of space there as an option ROM without breaking
> > something.
>
> We can map it on demand. Guest tells qemu to map rom "A" to address X by
> writing into some io port. Guest copies rom. Guest tells qemu to unmap
> it. Better than a DMA interface IMHO.


That's what I thought too, but in a 32-bit guest using ~3.5GB of RAM, 
where can you safely get 100MB of memory to fully map the ROM?  If you're 
going to map chunks at a time, you are basically doing DMA.

And what's the upper limit on ROM size that we impose?  100MB is already 
ridiculously large.


Regards,

Anthony Liguori







Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Richard W.M. Jones

On Wed, Aug 04, 2010 at 04:07:09PM +0300, Gleb Natapov wrote:
> On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> > On 08/04/2010 03:17 AM, Avi Kivity wrote:
> > >For playing games, there are three options:
> > >- existing fwcfg
> > >- fwcfg+dma
> > >- put roms in 4GB-2MB (or whatever we decide the flash size is)
> > >and have the BIOS copy them
> > >
> > >Existing fwcfg is the least amount of work and probably
> > >satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> > >High memory flash is the most hardware-like solution, pretty easy
> > >from a qemu point of view but requires more work.
> > 
> > The only trouble I see is that high memory isn't always available.
> > If it's a 32-bit PC and you've exhausted RAM space, then you're only
> > left with the PCI hole and it's not clear to me if you can really
> > pull out 100mb of space there as an option ROM without breaking
> > something.
> > 
> We can map it on demand. Guest tells qemu to map rom "A" to address X by
> writing into some io port. Guest copies rom. Guest tells qemu to unmap
> it. Better then DMA interface IMHO.

I think this is a fine idea.  Do you want me to try to implement
something like this?  (I'm on holiday this week and next week at
the KVM Forum, so it won't be for a while ...)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Richard W.M. Jones
On Wed, Aug 04, 2010 at 08:15:04AM -0500, Anthony Liguori wrote:
> On 08/04/2010 08:07 AM, Gleb Natapov wrote:
> >On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> >>On 08/04/2010 03:17 AM, Avi Kivity wrote:
> >>>For playing games, there are three options:
> >>>- existing fwcfg
> >>>- fwcfg+dma
> >>>- put roms in 4GB-2MB (or whatever we decide the flash size is)
> >>>and have the BIOS copy them
> >>>
> >>>Existing fwcfg is the least amount of work and probably
> >>>satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> >>>High memory flash is the most hardware-like solution, pretty easy
> >>>from a qemu point of view but requires more work.
> >>
> >>The only trouble I see is that high memory isn't always available.
> >>If it's a 32-bit PC and you've exhausted RAM space, then you're only
> >>left with the PCI hole and it's not clear to me if you can really
> >>pull out 100mb of space there as an option ROM without breaking
> >>something.
> >>
> >We can map it on demand. Guest tells qemu to map rom "A" to address X by
> >writing into some io port. Guest copies rom. Guest tells qemu to unmap
> >it. Better then DMA interface IMHO.
> 
> That's what I thought too, but in a 32-bit guest using ~3.5GB of
> RAM, where can you safely get 100MB of memory to full map the ROM?
> If you're going to map chunks at a time, you are basically doing
> DMA.

It's boot time, so you can just map it over some existing RAM surely?
Linuxboot.bin can work out where to map it so that it won't be in any
memory that is either in use or the target of the copy.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://et.redhat.com/~rjones/libguestfs/
See what it can do: http://et.redhat.com/~rjones/libguestfs/recipes.html



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 02:24:08PM +0100, Richard W.M. Jones wrote:
> On Wed, Aug 04, 2010 at 08:15:04AM -0500, Anthony Liguori wrote:
> > On 08/04/2010 08:07 AM, Gleb Natapov wrote:
> > >On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> > >>On 08/04/2010 03:17 AM, Avi Kivity wrote:
> > >>>For playing games, there are three options:
> > >>>- existing fwcfg
> > >>>- fwcfg+dma
> > >>>- put roms in 4GB-2MB (or whatever we decide the flash size is)
> > >>>and have the BIOS copy them
> > >>>
> > >>>Existing fwcfg is the least amount of work and probably
> > >>>satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> > >>>High memory flash is the most hardware-like solution, pretty easy
> > >>>from a qemu point of view but requires more work.
> > >>
> > >>The only trouble I see is that high memory isn't always available.
> > >>If it's a 32-bit PC and you've exhausted RAM space, then you're only
> > >>left with the PCI hole and it's not clear to me if you can really
> > >>pull out 100mb of space there as an option ROM without breaking
> > >>something.
> > >>
> > >We can map it on demand. Guest tells qemu to map rom "A" to address X by
> > >writing into some io port. Guest copies rom. Guest tells qemu to unmap
> > >it. Better then DMA interface IMHO.
> > 
> > That's what I thought too, but in a 32-bit guest using ~3.5GB of
> > RAM, where can you safely get 100MB of memory to full map the ROM?
> > If you're going to map chunks at a time, you are basically doing
> > DMA.
> 
> It's boot time, so you can just map it over some existing RAM surely?
Not with current qemu. This is broken now.

> Linuxboot.bin can work out where to map it so it won't be in any
> memory either being used or the target for the copy.
> 

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 02:22:29PM +0100, Richard W.M. Jones wrote:
> 
> On Wed, Aug 04, 2010 at 04:07:09PM +0300, Gleb Natapov wrote:
> > On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> > > On 08/04/2010 03:17 AM, Avi Kivity wrote:
> > > >For playing games, there are three options:
> > > >- existing fwcfg
> > > >- fwcfg+dma
> > > >- put roms in 4GB-2MB (or whatever we decide the flash size is)
> > > >and have the BIOS copy them
> > > >
> > > >Existing fwcfg is the least amount of work and probably
> > > >satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> > > >High memory flash is the most hardware-like solution, pretty easy
> > > >from a qemu point of view but requires more work.
> > > 
> > > The only trouble I see is that high memory isn't always available.
> > > If it's a 32-bit PC and you've exhausted RAM space, then you're only
> > > left with the PCI hole and it's not clear to me if you can really
> > > pull out 100mb of space there as an option ROM without breaking
> > > something.
> > > 
> > We can map it on demand. Guest tells qemu to map rom "A" to address X by
> > writing into some io port. Guest copies rom. Guest tells qemu to unmap
> > it. Better then DMA interface IMHO.
> 
> I think this is a fine idea.  Do you want me to try to implement
> something like this?  (I'm on holiday this week and next week at
> the KVM Forum, so it won't be for a while ...)
> 
I wouldn't do that without agreement in principle from Avi and Anthony :)

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 08:15:04AM -0500, Anthony Liguori wrote:
> On 08/04/2010 08:07 AM, Gleb Natapov wrote:
> >On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> >>On 08/04/2010 03:17 AM, Avi Kivity wrote:
> >>>For playing games, there are three options:
> >>>- existing fwcfg
> >>>- fwcfg+dma
> >>>- put roms in 4GB-2MB (or whatever we decide the flash size is)
> >>>and have the BIOS copy them
> >>>
> >>>Existing fwcfg is the least amount of work and probably
> >>>satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> >>>High memory flash is the most hardware-like solution, pretty easy
> >>>from a qemu point of view but requires more work.
> >>
> >>The only trouble I see is that high memory isn't always available.
> >>If it's a 32-bit PC and you've exhausted RAM space, then you're only
> >>left with the PCI hole and it's not clear to me if you can really
> >>pull out 100mb of space there as an option ROM without breaking
> >>something.
> >>
> >We can map it on demand. Guest tells qemu to map rom "A" to address X by
> >writing into some io port. Guest copies rom. Guest tells qemu to unmap
> >it. Better then DMA interface IMHO.
> 
> That's what I thought too, but in a 32-bit guest using ~3.5GB of
> RAM, where can you safely get 100MB of memory to full map the ROM?
> If you're going to map chunks at a time, you are basically doing
> DMA.
> 
This is not like DMA even if done in chunks, and chunks can be pretty
big. The code that deals with copying may temporarily unmap some pci
devices to have more space there.

> And what's the upper limit on ROM size that we impose?  100MB is
> already at the ridiculously large size.
> 
Agree. We have two solutions:
1. Avoid the problem
2. Fix the problem.

Both are fine with me and I prefer 1, but if we are going with 2 I
prefer something sane.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 08:34 AM, Gleb Natapov wrote:

> On Wed, Aug 04, 2010 at 08:15:04AM -0500, Anthony Liguori wrote:
> > On 08/04/2010 08:07 AM, Gleb Natapov wrote:
> > > On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> > > > On 08/04/2010 03:17 AM, Avi Kivity wrote:
> > > > > For playing games, there are three options:
> > > > > - existing fwcfg
> > > > > - fwcfg+dma
> > > > > - put roms in 4GB-2MB (or whatever we decide the flash size is)
> > > > > and have the BIOS copy them
> > > > >
> > > > > Existing fwcfg is the least amount of work and probably
> > > > > satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> > > > > High memory flash is the most hardware-like solution, pretty easy
> > > > > from a qemu point of view but requires more work.
> > > >
> > > > The only trouble I see is that high memory isn't always available.
> > > > If it's a 32-bit PC and you've exhausted RAM space, then you're only
> > > > left with the PCI hole and it's not clear to me if you can really
> > > > pull out 100MB of space there as an option ROM without breaking
> > > > something.
> > >
> > > We can map it on demand. Guest tells qemu to map rom "A" to address X by
> > > writing into some io port. Guest copies rom. Guest tells qemu to unmap
> > > it. Better than a DMA interface IMHO.
> >
> > That's what I thought too, but in a 32-bit guest using ~3.5GB of
> > RAM, where can you safely get 100MB of memory to fully map the ROM?
> > If you're going to map chunks at a time, you are basically doing
> > DMA.
>
> This is not like DMA even if done in chunks, and chunks can be pretty
> big. The code that deals with copying may temporarily unmap some pci
> devices to have more space there.


That's a bit complicated because SeaBIOS is managing the PCI devices 
whereas the kernel code is running as an option rom.  I don't know the 
BIOS PCI interfaces well so I don't know how doable this is.


Maybe we're just being too fancy here.

We could rewrite -kernel/-append/-initrd to just generate a floppy image 
in RAM, and just boot from floppy.


Regards,

Anthony Liguori

   

> > And what's the upper limit on ROM size that we impose?  100MB is
> > already ridiculously large.
>
> Agree. We have two solutions:
> 1. Avoid the problem
> 2. Fix the problem.
>
> Both are fine with me and I prefer 1, but if we are going with 2 I
> prefer something sane.
>
> --
> Gleb.





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 08:52:44AM -0500, Anthony Liguori wrote:
> On 08/04/2010 08:34 AM, Gleb Natapov wrote:
> >On Wed, Aug 04, 2010 at 08:15:04AM -0500, Anthony Liguori wrote:
> >>On 08/04/2010 08:07 AM, Gleb Natapov wrote:
> >>>On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> On 08/04/2010 03:17 AM, Avi Kivity wrote:
> >For playing games, there are three options:
> >- existing fwcfg
> >- fwcfg+dma
> >- put roms in 4GB-2MB (or whatever we decide the flash size is)
> >and have the BIOS copy them
> >
> >Existing fwcfg is the least amount of work and probably
> >satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> >High memory flash is the most hardware-like solution, pretty easy
> >from a qemu point of view but requires more work.
> 
> The only trouble I see is that high memory isn't always available.
> If it's a 32-bit PC and you've exhausted RAM space, then you're only
> left with the PCI hole and it's not clear to me if you can really
> pull out 100mb of space there as an option ROM without breaking
> something.
> 
> >>>We can map it on demand. Guest tells qemu to map rom "A" to address X by
> >>>writing into some io port. Guest copies rom. Guest tells qemu to unmap
> >>>it. Better then DMA interface IMHO.
> >>That's what I thought too, but in a 32-bit guest using ~3.5GB of
> >>RAM, where can you safely get 100MB of memory to full map the ROM?
> >>If you're going to map chunks at a time, you are basically doing
> >>DMA.
> >>
> >This is not like DMA event if done in chunks and chunks can be pretty
> >big. The code that dials with copying may temporary unmap some pci
> >devices to have more space there.
> 
> That's a bit complicated because SeaBIOS is managing the PCI devices
> whereas the kernel code is running as an option rom.  I don't know
> the BIOS PCI interfaces well so I don't know how doable this is.
> 
Unmapping a device and mapping it at the same place is easy. Enumerating
pci devices from multiboot.bin looks like unneeded churn though.

> Maybe we're just being too fancy here.
> 
> We could rewrite -kernel/-append/-initrd to just generate a floppy
> image in RAM, and just boot from floppy.
> 
Maybe. Can a floppy be 100M?

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 09:00 AM, Gleb Natapov wrote:

> On Wed, Aug 04, 2010 at 08:52:44AM -0500, Anthony Liguori wrote:
> > On 08/04/2010 08:34 AM, Gleb Natapov wrote:
> > > On Wed, Aug 04, 2010 at 08:15:04AM -0500, Anthony Liguori wrote:
> > > > On 08/04/2010 08:07 AM, Gleb Natapov wrote:
> > > > > On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> > > > > > On 08/04/2010 03:17 AM, Avi Kivity wrote:
> > > > > > > For playing games, there are three options:
> > > > > > > - existing fwcfg
> > > > > > > - fwcfg+dma
> > > > > > > - put roms in 4GB-2MB (or whatever we decide the flash size is)
> > > > > > > and have the BIOS copy them
> > > > > > >
> > > > > > > Existing fwcfg is the least amount of work and probably
> > > > > > > satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> > > > > > > High memory flash is the most hardware-like solution, pretty easy
> > > > > > > from a qemu point of view but requires more work.
> > > > > >
> > > > > > The only trouble I see is that high memory isn't always available.
> > > > > > If it's a 32-bit PC and you've exhausted RAM space, then you're only
> > > > > > left with the PCI hole and it's not clear to me if you can really
> > > > > > pull out 100MB of space there as an option ROM without breaking
> > > > > > something.
> > > > >
> > > > > We can map it on demand. Guest tells qemu to map rom "A" to address X by
> > > > > writing into some io port. Guest copies rom. Guest tells qemu to unmap
> > > > > it. Better than a DMA interface IMHO.
> > > >
> > > > That's what I thought too, but in a 32-bit guest using ~3.5GB of
> > > > RAM, where can you safely get 100MB of memory to fully map the ROM?
> > > > If you're going to map chunks at a time, you are basically doing
> > > > DMA.
> > >
> > > This is not like DMA even if done in chunks, and chunks can be pretty
> > > big. The code that deals with copying may temporarily unmap some pci
> > > devices to have more space there.
> >
> > That's a bit complicated because SeaBIOS is managing the PCI devices
> > whereas the kernel code is running as an option rom.  I don't know
> > the BIOS PCI interfaces well so I don't know how doable this is.
>
> Unmapping a device and mapping it at the same place is easy. Enumerating
> pci devices from multiboot.bin looks like unneeded churn though.
>
> > Maybe we're just being too fancy here.
> >
> > We could rewrite -kernel/-append/-initrd to just generate a floppy
> > image in RAM, and just boot from floppy.
>
> Maybe. Can a floppy be 100M?


No, I forgot just how small they are.  R/O usb mass storage device?  
CDROM?  I'm beginning to think that loading such a large initrd through 
fwcfg is simply a dead end.


Regards,

Anthony Liguori







Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 08:26 AM, Gleb Natapov wrote:

> On Wed, Aug 04, 2010 at 02:24:08PM +0100, Richard W.M. Jones wrote:
> > On Wed, Aug 04, 2010 at 08:15:04AM -0500, Anthony Liguori wrote:
> > > On 08/04/2010 08:07 AM, Gleb Natapov wrote:
> > > > On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> > > > > On 08/04/2010 03:17 AM, Avi Kivity wrote:
> > > > > > For playing games, there are three options:
> > > > > > - existing fwcfg
> > > > > > - fwcfg+dma
> > > > > > - put roms in 4GB-2MB (or whatever we decide the flash size is)
> > > > > > and have the BIOS copy them
> > > > > >
> > > > > > Existing fwcfg is the least amount of work and probably
> > > > > > satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> > > > > > High memory flash is the most hardware-like solution, pretty easy
> > > > > > from a qemu point of view but requires more work.
> > > > >
> > > > > The only trouble I see is that high memory isn't always available.
> > > > > If it's a 32-bit PC and you've exhausted RAM space, then you're only
> > > > > left with the PCI hole and it's not clear to me if you can really
> > > > > pull out 100MB of space there as an option ROM without breaking
> > > > > something.
> > > >
> > > > We can map it on demand. Guest tells qemu to map rom "A" to address X by
> > > > writing into some io port. Guest copies rom. Guest tells qemu to unmap
> > > > it. Better than a DMA interface IMHO.
> > >
> > > That's what I thought too, but in a 32-bit guest using ~3.5GB of
> > > RAM, where can you safely get 100MB of memory to fully map the ROM?
> > > If you're going to map chunks at a time, you are basically doing
> > > DMA.
> >
> > It's boot time, so you can just map it over some existing RAM surely?
>
> Not with current qemu. This is broken now.


But even if it weren't, it could potentially create havoc.  I think we 
currently believe that the northbridge likely never forwards RAM accesses 
to a device, so this doesn't fit how hardware would work.

More importantly, BIOSes and ROMs do very funny things with RAM.  It's 
not unusual for a ROM to muck with the e820 map to allocate RAM for 
itself, which means there's always the chance that we're going to walk 
over RAM being used for something else.


Regards,

Anthony Liguori


> > Linuxboot.bin can work out where to map it so it won't be in any
> > memory either being used or the target for the copy.
>
> --
> Gleb.





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Paolo Bonzini

On 08/04/2010 04:00 PM, Gleb Natapov wrote:

> > Maybe we're just being too fancy here.
> >
> > We could rewrite -kernel/-append/-initrd to just generate a floppy
> > image in RAM, and just boot from floppy.
>
> Maybe. Can a floppy be 100M?


Well, in theory you can have 16384 bytes/sector, 256 tracks, 255 
sectors, 2 heads... that makes 2^(14+8+8+1) = 2 GB. :)  Not sure the 
BIOS would read such a beast, or SYSLINUX.


By the way, if libguestfs insists on an initrd rather than a CDROM 
image, it could do something in between and make an ISO image with 
ISOLINUX and the required kernel/initrd pair.


(By the way, a network installation image for a typical distribution has 
a 120M initrd, so it's not just libguestfs.  It is very useful to pass 
the network installation images directly to qemu via -kernel/-initrd).


Paolo



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 09:14:01AM -0500, Anthony Liguori wrote:
> >Unmapping device and mapping it at the same place is easy. Enumerating
> >pci devices from multiboot.bin looks like unneeded churn though.
> >
> >>Maybe we're just being too fancy here.
> >>
> >>We could rewrite -kernel/-append/-initrd to just generate a floppy
> >>image in RAM, and just boot from floppy.
> >>
> >May be. Can floppy be 100M?
> 
> No, I forgot just how small they are.  R/O usb mass storage device?
> CDROM?  I'm beginning thing that loading such a large initrd through
> fwcfg is simply a dead end.
> 
Well, libguestfs can use a CDROM by itself to begin with.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 09:22:22AM -0500, Anthony Liguori wrote:
> On 08/04/2010 08:26 AM, Gleb Natapov wrote:
> >On Wed, Aug 04, 2010 at 02:24:08PM +0100, Richard W.M. Jones wrote:
> >>On Wed, Aug 04, 2010 at 08:15:04AM -0500, Anthony Liguori wrote:
> >>>On 08/04/2010 08:07 AM, Gleb Natapov wrote:
> On Wed, Aug 04, 2010 at 08:04:09AM -0500, Anthony Liguori wrote:
> >On 08/04/2010 03:17 AM, Avi Kivity wrote:
> >>For playing games, there are three options:
> >>- existing fwcfg
> >>- fwcfg+dma
> >>- put roms in 4GB-2MB (or whatever we decide the flash size is)
> >>and have the BIOS copy them
> >>
> >>Existing fwcfg is the least amount of work and probably
> >>satisfactory for isapc.  fwcfg+dma is IMO going off a tangent.
> >>High memory flash is the most hardware-like solution, pretty easy
> >>from a qemu point of view but requires more work.
> >
> >The only trouble I see is that high memory isn't always available.
> >If it's a 32-bit PC and you've exhausted RAM space, then you're only
> >left with the PCI hole and it's not clear to me if you can really
> >pull out 100mb of space there as an option ROM without breaking
> >something.
> >
> We can map it on demand. Guest tells qemu to map rom "A" to address X by
> writing into some io port. Guest copies rom. Guest tells qemu to unmap
> it. Better then DMA interface IMHO.
> >>>That's what I thought too, but in a 32-bit guest using ~3.5GB of
> >>>RAM, where can you safely get 100MB of memory to full map the ROM?
> >>>If you're going to map chunks at a time, you are basically doing
> >>>DMA.
> >>It's boot time, so you can just map it over some existing RAM surely?
> >Not with current qemu. This  is broken now.
> 
> But even if it wasn't it can potentially create havoc.  I think we
> currently believe that the northbridge likely never forwards RAM
> access to a device so this doesn't fit how hardware would work.
> 
Good point.

> More importantly, BIOSes and ROMs do very funny things with RAM.
> It's not unusual for a ROM to muck with the e820 map to allocate RAM
> for itself which means there's always the chance that we're going to
> walk over RAM being used for something else.
> 
A ROM does not muck with the e820 map. It uses PMM to allocate memory, and
the memory it gets is marked as reserved in the e820 map.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 09:22 AM, Paolo Bonzini wrote:

On 08/04/2010 04:00 PM, Gleb Natapov wrote:

Maybe we're just being too fancy here.

We could rewrite -kernel/-append/-initrd to just generate a floppy
image in RAM, and just boot from floppy.


May be. Can floppy be 100M?


Well, in theory you can have 16384 bytes/sector, 256 tracks, 255 
sectors, 2 heads... that makes 2^(14+8+8+1) = 2 GB. :)  Not sure the 
BIOS would read such a beast, or SYSLINUX.


By the way, if libguestfs insists for an initrd rather than a CDROM 
image, it could do something in between and make an ISO image with 
ISOLINUX and the required kernel/initrd pair.


(By the way, a network installation image for a typical distribution 
has a 120M initrd, so it's not just libguestfs.  It is very useful to 
pass the network installation images directly to qemu via 
-kernel/-initrd).


We could make -kernel an awful lot smarter, but unless we've got someone 
just itching to write 16-bit option rom code, I think our best bet is to 
try to leverage a standard bootloader and expose a disk containing the 
kernel/initrd.


Otherwise, we just stick with what we have and deal with the performance 
as is.


Regards,

Anthony Liguori



Paolo





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 09:38 AM, Gleb Natapov wrote:


But even if it wasn't it can potentially create havoc.  I think we
currently believe that the northbridge likely never forwards RAM
access to a device so this doesn't fit how hardware would work.

 

Good point.

   

More importantly, BIOSes and ROMs do very funny things with RAM.
It's not unusual for a ROM to muck with the e820 map to allocate RAM
for itself which means there's always the chance that we're going to
walk over RAM being used for something else.

 

ROM does not muck with the e820. It uses PMM to allocate memory and the
memory it gets is marked as reserved in e820 map.
   


PMM allocations are only valid during the init function's execution.  
Its intention is to enable the use of scratch memory to decompress or 
otherwise modify the ROM to shrink its size.


If a ROM needs memory after the init function, it needs to use the 
traditional tricks to allocate long-term memory, and the most popular one 
is modifying the e820 tables.


See src/arch/i386/firmware/pcbios/e820mangler.S in gPXE.

Regards,

Anthony Liguori


--
Gleb.
   





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread David S. Ahern


On 08/03/10 12:43, Avi Kivity wrote:
> libguestfs does not depend on an x86 architectural feature. 
> qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We should
> discourage people from depending on this interface for production use.

That is a feature of qemu - and an important one to me as well. Why
should it be discouraged? You end up at the same place -- a running
kernel and in-ram filesystem; why require going through a bootloader
just because the hardware case needs it?

David



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 09:51 AM, David S. Ahern wrote:


On 08/03/10 12:43, Avi Kivity wrote:
   

libguestfs does not depend on an x86 architectural feature.
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We should
discourage people from depending on this interface for production use.
 

That is a feature of qemu - and an important one to me as well. Why
should it be discouraged? You end up at the same place -- a running
kernel and in-ram filesystem; why require going through a bootloader
just because the hardware case needs it?
   


It's smoke and mirrors.  We're still providing a boot loader; it's just a 
little tiny one that we've written solely for this purpose.


And it works fine for production use.  The question is whether we ought 
to be aggressively optimizing it for large initrd sizes.  To be honest, 
after a lot of discussion of possibilities, I've come to the conclusion 
that it's just not worth it.


There are better ways, like using string I/O and optimizing the PIO path 
in the kernel.  That should cut down the 1 s slowdown with a 100MB 
initrd by a bit.  But honestly, shaving a couple hundred ms further off 
the initrd load is just not worth it using the current model.


If this is important to someone, we ought to look at refactoring the 
loader completely to be disk based which is a higher performance interface.


Regards,

Anthony Liguori


David
   





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 09:50:55AM -0500, Anthony Liguori wrote:
> On 08/04/2010 09:38 AM, Gleb Natapov wrote:
> >>
> >>But even if it wasn't it can potentially create havoc.  I think we
> >>currently believe that the northbridge likely never forwards RAM
> >>access to a device so this doesn't fit how hardware would work.
> >>
> >Good point.
> >
> >>More importantly, BIOSes and ROMs do very funny things with RAM.
> >>It's not unusual for a ROM to muck with the e820 map to allocate RAM
> >>for itself which means there's always the chance that we're going to
> >>walk over RAM being used for something else.
> >>
> >ROM does not muck with the e820. It uses PMM to allocate memory and the
> >memory it gets is marked as reserved in e820 map.
> 
> PMM allocations are only valid during the init function's execution.
> It's intention is to enable the use of scratch memory to decompress
> or otherwise modify the ROM to shrink its size.
> 
Hm, maybe. I read the seabios code differently, but maybe I misread it.
 
> If a ROM needs memory after the init function, it needs to use the
> traditional tricks to allocate long term memory and the most popular
> one is modifying the e820 tables.
> 
e820 has no in-memory format,

> See src/arch/i386/firmware/pcbios/e820mangler.S in gPXE.
so this ugly code intercepts int15 and mangles the result. OMG. How can
this even work if more than two ROMs want to do that?

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Anthony Liguori

On 08/04/2010 10:01 AM, Gleb Natapov wrote:


Hm, may be. I read seabios code differently, but may be I misread it.
   


The BIOS Boot Specification spells it all out pretty clearly.


If a ROM needs memory after the init function, it needs to use the
traditional tricks to allocate long term memory and the most popular
one is modifying the e820 tables.

 

e820 has no in memory format,
   


Indeed.


See src/arch/i386/firmware/pcbios/e820mangler.S in gPXE.
 

so this ugly code intercepts int15 and mangle result. OMG. How this can
even work if more then two ROMs want to do that?
   


You have to save the old handlers and invoke them.  Where do you save 
the old handlers?  There are tricks you can do by trying to use some 
unused vectors and also by temporarily using the stack.


But basically, yeah, I'm amazed every time I see a PC boot that it all 
actually works :-)


Regards,

Anthony Liguori


--
Gleb.
   





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 10:07:24AM -0500, Anthony Liguori wrote:
> On 08/04/2010 10:01 AM, Gleb Natapov wrote:
> >
> >Hm, may be. I read seabios code differently, but may be I misread it.
> 
> The BIOS Boot Specification spells it all out pretty clearly.
> 
I have the spec. Isn't this enough to be an expert? Or do you mean I
should read it too?

> >>If a ROM needs memory after the init function, it needs to use the
> >>traditional tricks to allocate long term memory and the most popular
> >>one is modifying the e820 tables.
> >>
> >e820 has no in memory format,
> 
> Indeed.
> 
> >>See src/arch/i386/firmware/pcbios/e820mangler.S in gPXE.
> >so this ugly code intercepts int15 and mangle result. OMG. How this can
> >even work if more then two ROMs want to do that?
> 
> You have to save the old handlers and invoke them.  Where do you
> save the old handlers?  There's tricks you can do by trying to use
> some unused vectors and also temporarily using the stack.
> 
> But basically, yeah, I'm amazed every time I see a PC boot that it
> all actually works :-)
> 
Heh.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 09:57:17AM -0500, Anthony Liguori wrote:
> On 08/04/2010 09:51 AM, David S. Ahern wrote:
> >
> >On 08/03/10 12:43, Avi Kivity wrote:
> >>libguestfs does not depend on an x86 architectural feature.
> >>qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We should
> >>discourage people from depending on this interface for production use.
> >That is a feature of qemu - and an important one to me as well. Why
> >should it be discouraged? You end up at the same place -- a running
> >kernel and in-ram filesystem; why require going through a bootloader
> >just because the hardware case needs it?
> 
> It's smoke and mirrors.  We're still providing a boot loader it's
> just a little tiny one that we've written soley for this purpose.
> 
> And it works fine for production use.  The question is whether we
> ought to be aggressively optimizing it for large initrd sizes.  To
> be honest, after a lot of discussion of possibilities, I've come to
> the conclusion that it's just not worth it.
> 
> There are better ways like using string I/O and optimizing the PIO
> path in the kernel.  That should cut down the 1s slow down with a
> 100MB initrd by a bit.  But honestly, shaving a couple hundred ms
> further off the initrd load is just not worth it using the current
> model.
> 
The slowdown is not 1s any more. String PIO emulation had many bugs
that were fixed in 2.6.35. I measured how much time it takes to load 100M
via the fw_cfg interface on an older kernel and on 2.6.35. On older
kernels, on my machine, it took ~2-3 seconds; on 2.6.35 it took 26s. Some
optimizations that were already committed bring it down to 20s. I have a
code prototype that brings it to 11s. I don't see how we can get below
that, and surely not back to ~2-3s.

> If this is important to someone, we ought to look at refactoring the
> loader completely to be disk based which is a higher performance
> interface.
> 
> Regards,
> 
> Anthony Liguori
> 
> >David
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Alexander Graf

On 04.08.2010, at 17:25, Gleb Natapov wrote:

> On Wed, Aug 04, 2010 at 09:57:17AM -0500, Anthony Liguori wrote:
>> On 08/04/2010 09:51 AM, David S. Ahern wrote:
>>> 
>>> On 08/03/10 12:43, Avi Kivity wrote:
 libguestfs does not depend on an x86 architectural feature.
 qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We should
 discourage people from depending on this interface for production use.
>>> That is a feature of qemu - and an important one to me as well. Why
>>> should it be discouraged? You end up at the same place -- a running
>>> kernel and in-ram filesystem; why require going through a bootloader
>>> just because the hardware case needs it?
>> 
>> It's smoke and mirrors.  We're still providing a boot loader it's
>> just a little tiny one that we've written soley for this purpose.
>> 
>> And it works fine for production use.  The question is whether we
>> ought to be aggressively optimizing it for large initrd sizes.  To
>> be honest, after a lot of discussion of possibilities, I've come to
>> the conclusion that it's just not worth it.
>> 
>> There are better ways like using string I/O and optimizing the PIO
>> path in the kernel.  That should cut down the 1s slow down with a
>> 100MB initrd by a bit.  But honestly, shaving a couple hundred ms
>> further off the initrd load is just not worth it using the current
>> model.
>> 
> The slow down is not 1s any more. String PIO emulation had many bugs
> that were fixed in 2.6.35. I verified how much time it took to load 100M
> via fw_cfg interface on older kernel and on 2.6.35. On older kernels on
> my machine it took ~2-3 second on 2.6.35 it took 26s. Some optimizations
> that was already committed make it 20s. I have some code prototype that
> makes it 11s. I don't see how we can get below that, surely not back to
> ~2-3sec.

What exactly is the reason for the slowdown? It can't be only boundary and 
permission checks, right?


Alex




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 05:31:12PM +0200, Alexander Graf wrote:
> 
> On 04.08.2010, at 17:25, Gleb Natapov wrote:
> 
> > On Wed, Aug 04, 2010 at 09:57:17AM -0500, Anthony Liguori wrote:
> >> On 08/04/2010 09:51 AM, David S. Ahern wrote:
> >>> 
> >>> On 08/03/10 12:43, Avi Kivity wrote:
>  libguestfs does not depend on an x86 architectural feature.
>  qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We should
>  discourage people from depending on this interface for production use.
> >>> That is a feature of qemu - and an important one to me as well. Why
> >>> should it be discouraged? You end up at the same place -- a running
> >>> kernel and in-ram filesystem; why require going through a bootloader
> >>> just because the hardware case needs it?
> >> 
> >> It's smoke and mirrors.  We're still providing a boot loader it's
> >> just a little tiny one that we've written soley for this purpose.
> >> 
> >> And it works fine for production use.  The question is whether we
> >> ought to be aggressively optimizing it for large initrd sizes.  To
> >> be honest, after a lot of discussion of possibilities, I've come to
> >> the conclusion that it's just not worth it.
> >> 
> >> There are better ways like using string I/O and optimizing the PIO
> >> path in the kernel.  That should cut down the 1s slow down with a
> >> 100MB initrd by a bit.  But honestly, shaving a couple hundred ms
> >> further off the initrd load is just not worth it using the current
> >> model.
> >> 
> > The slow down is not 1s any more. String PIO emulation had many bugs
> > that were fixed in 2.6.35. I verified how much time it took to load 100M
> > via fw_cfg interface on older kernel and on 2.6.35. On older kernels on
> > my machine it took ~2-3 second on 2.6.35 it took 26s. Some optimizations
> > that was already committed make it 20s. I have some code prototype that
> > makes it 11s. I don't see how we can get below that, surely not back to
> > ~2-3sec.
> 
> What exactly is the reason for the slowdown? It can't be only boundary and 
> permission checks, right?
> 
> 
The big part of the slowdown right now is that the write into memory is
done for each byte. That means for each byte we call kvm_write_guest() and
kvm_mmu_pte_write(). The second call is needed in case the memory the
instruction is writing to is shadowed. Previously we didn't check for
that at all. This can be mitigated by introducing a write cache to do
combined writes into memory, and by unshadowing the page if there is more
than one write into it. This optimization saves ~10 secs. Currently string
emulation re-enters the guest from time to time to check whether event
injection is needed, and reads from userspace are done in 1K chunks, not
4K like before; but when I made the reads 4K and disabled guest reentry I
didn't see any speed improvement worth talking about.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Alexander Graf

On 04.08.2010, at 17:48, Gleb Natapov wrote:

> On Wed, Aug 04, 2010 at 05:31:12PM +0200, Alexander Graf wrote:
>> 
>> On 04.08.2010, at 17:25, Gleb Natapov wrote:
>> 
>>> On Wed, Aug 04, 2010 at 09:57:17AM -0500, Anthony Liguori wrote:
 On 08/04/2010 09:51 AM, David S. Ahern wrote:
> 
> On 08/03/10 12:43, Avi Kivity wrote:
>> libguestfs does not depend on an x86 architectural feature.
>> qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We should
>> discourage people from depending on this interface for production use.
> That is a feature of qemu - and an important one to me as well. Why
> should it be discouraged? You end up at the same place -- a running
> kernel and in-ram filesystem; why require going through a bootloader
> just because the hardware case needs it?
 
 It's smoke and mirrors.  We're still providing a boot loader it's
 just a little tiny one that we've written soley for this purpose.
 
 And it works fine for production use.  The question is whether we
 ought to be aggressively optimizing it for large initrd sizes.  To
 be honest, after a lot of discussion of possibilities, I've come to
 the conclusion that it's just not worth it.
 
 There are better ways like using string I/O and optimizing the PIO
 path in the kernel.  That should cut down the 1s slow down with a
 100MB initrd by a bit.  But honestly, shaving a couple hundred ms
 further off the initrd load is just not worth it using the current
 model.
 
>>> The slow down is not 1s any more. String PIO emulation had many bugs
>>> that were fixed in 2.6.35. I verified how much time it took to load 100M
>>> via fw_cfg interface on older kernel and on 2.6.35. On older kernels on
>>> my machine it took ~2-3 second on 2.6.35 it took 26s. Some optimizations
>>> that was already committed make it 20s. I have some code prototype that
>>> makes it 11s. I don't see how we can get below that, surely not back to
>>> ~2-3sec.
>> 
>> What exactly is the reason for the slowdown? It can't be only boundary and 
>> permission checks, right?
>> 
>> 
> The big part of slowdown right now is that write into memory is done
> for each byte. It means for each byte we call kvm_write_guest() and
> kvm_mmu_pte_write(). The second call is needed in case memory, instruction
> is trying to write to, is shadowed. Previously we didn't checked for
> that at all. This can be mitigated by introducing write cache and do
> combined writes into the memory and unshadow the page if there is more
> then one write into it. This optimization saves ~10secs. Currently string

Ok, so you tackled that bit already.

> emulation enter guest from time to time to check if event injection is
> needed and read from userspace is done in 1K chunks, not 4K like it was,
> but when I made reads to be 4K and disabled guest reentry I haven't seen
> any speed improvements worth talking about.

So what are we wasting those 10 seconds on then? Does perf tell you anything 
useful?


Alex




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Gleb Natapov
On Wed, Aug 04, 2010 at 05:59:40PM +0200, Alexander Graf wrote:
> 
> On 04.08.2010, at 17:48, Gleb Natapov wrote:
> 
> > On Wed, Aug 04, 2010 at 05:31:12PM +0200, Alexander Graf wrote:
> >> 
> >> On 04.08.2010, at 17:25, Gleb Natapov wrote:
> >> 
> >>> On Wed, Aug 04, 2010 at 09:57:17AM -0500, Anthony Liguori wrote:
>  On 08/04/2010 09:51 AM, David S. Ahern wrote:
> > 
> > On 08/03/10 12:43, Avi Kivity wrote:
> >> libguestfs does not depend on an x86 architectural feature.
> >> qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We 
> >> should
> >> discourage people from depending on this interface for production use.
> > That is a feature of qemu - and an important one to me as well. Why
> > should it be discouraged? You end up at the same place -- a running
> > kernel and in-ram filesystem; why require going through a bootloader
> > just because the hardware case needs it?
>  
>  It's smoke and mirrors.  We're still providing a boot loader it's
>  just a little tiny one that we've written soley for this purpose.
>  
>  And it works fine for production use.  The question is whether we
>  ought to be aggressively optimizing it for large initrd sizes.  To
>  be honest, after a lot of discussion of possibilities, I've come to
>  the conclusion that it's just not worth it.
>  
>  There are better ways like using string I/O and optimizing the PIO
>  path in the kernel.  That should cut down the 1s slow down with a
>  100MB initrd by a bit.  But honestly, shaving a couple hundred ms
>  further off the initrd load is just not worth it using the current
>  model.
>  
> >>> The slow down is not 1s any more. String PIO emulation had many bugs
> >>> that were fixed in 2.6.35. I verified how much time it took to load 100M
> >>> via fw_cfg interface on older kernel and on 2.6.35. On older kernels on
> >>> my machine it took ~2-3 second on 2.6.35 it took 26s. Some optimizations
> >>> that was already committed make it 20s. I have some code prototype that
> >>> makes it 11s. I don't see how we can get below that, surely not back to
> >>> ~2-3sec.
> >> 
> >> What exactly is the reason for the slowdown? It can't be only boundary and 
> >> permission checks, right?
> >> 
> >> 
> > The big part of slowdown right now is that write into memory is done
> > for each byte. It means for each byte we call kvm_write_guest() and
> > kvm_mmu_pte_write(). The second call is needed in case memory, instruction
> > is trying to write to, is shadowed. Previously we didn't checked for
> > that at all. This can be mitigated by introducing write cache and do
> > combined writes into the memory and unshadow the page if there is more
> > then one write into it. This optimization saves ~10secs. Currently string
> 
> Ok, so you tackled that bit already.
> 
> > emulation enter guest from time to time to check if event injection is
> > needed and read from userspace is done in 1K chunks, not 4K like it was,
> > but when I made reads to be 4K and disabled guest reentry I haven't seen
> > any speed improvements worth talking about.
> 
> So what are we wasting those 10 seconds on then? Does perf tell you anything 
> useful?
> 
Not 10, but 7-8 seconds.

After applying the cache fix, nothing definite as far as I remember (I ran
it last almost 2 weeks ago; I need to rerun). The code always goes through
the emulator now and checks the direction flag to update SI/DI accordingly.
The emulator is a big switch, and it calls various callbacks that may also
slow things down.

--
Gleb.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Avi Kivity

 On 08/04/2010 04:04 PM, Anthony Liguori wrote:

On 08/04/2010 03:17 AM, Avi Kivity wrote:

For playing games, there are three options:
- existing fwcfg
- fwcfg+dma
- put roms in 4GB-2MB (or whatever we decide the flash size is) and 
have the BIOS copy them


Existing fwcfg is the least amount of work and probably satisfactory 
for isapc.  fwcfg+dma is IMO going off a tangent.  High memory flash 
is the most hardware-like solution, pretty easy from a qemu point of 
view but requires more work.


The only trouble I see is that high memory isn't always available.  If 
it's a 32-bit PC and you've exhausted RAM space, then you're only left 
with the PCI hole and it's not clear to me if you can really pull out 
100mb of space there as an option ROM without breaking something.




100MB is out of the question, certainly.  I'm talking about your isapc 
problem, not about a cdrom replacement.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Avi Kivity

 On 08/04/2010 04:24 PM, Richard W.M. Jones wrote:


It's boot time, so you can just map it over some existing RAM surely?
Linuxboot.bin can work out where to map it so it won't be in any
memory either being used or the target for the copy.


There's no such thing as boot time from the host's point of view.  There 
are interfaces and they should work whatever the guest is doing right now.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-04 Thread Avi Kivity

 On 08/04/2010 04:52 PM, Anthony Liguori wrote:



This is not like DMA even if done in chunks, and chunks can be pretty
big. The code that deals with the copying may temporarily unmap some PCI
devices to have more space there.



That's a bit complicated because SeaBIOS is managing the PCI devices 
whereas the kernel code is running as an option rom.  I don't know the 
BIOS PCI interfaces well so I don't know how doable this is.


Maybe we're just being too fancy here.

We could rewrite -kernel/-append/-initrd to just generate a floppy 
image in RAM, and just boot from floppy.


How could this work?  The RAM belongs to SeaBIOS immediately after 
reset; it would just scribble over it.  Or worse, not scribble on it 
until some date in the future.


-kernel data has to find its way to memory after the bios gives control 
to some optionrom.  An alternative would be to embed knowledge of 
-kernel in seabios, but I don't think it's a good one.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



