[questions] savevm|loadvm

2010-03-29 Thread Wenhao Xu
Hi, all,
   I am working on switching QEMU from KVM mode to pure emulation mode
dynamically.
   Intuitively, if a snapshot created with savevm in KVM mode could be loaded
with loadvm in QEMU emulator mode, the switch could make use of that. I tried
it, but it does not work.  Any idea how to fix it?
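For reference, the workflow being attempted looks roughly like this. A sketch only: the image path and snapshot name are made up, and the flags are those of qemu-kvm 0.12-era binaries.

```python
def qemu_cmd(image, kvm=True, loadvm=None):
    """Build a QEMU command line; kvm=False runs the same image under
    pure emulation (TCG). loadvm restores a snapshot taken with savevm."""
    cmd = ["qemu-system-x86_64", "-hda", image]
    if kvm:
        cmd.append("-enable-kvm")
    if loadvm:
        cmd += ["-loadvm", loadvm]
    return cmd

# 1. boot under KVM, then snapshot from the monitor: (qemu) savevm pre-switch
print(" ".join(qemu_cmd("guest.qcow2", kvm=True)))
# 2. restart in emulation mode, trying to restore the snapshot taken under KVM
print(" ".join(qemu_cmd("guest.qcow2", kvm=False, loadvm="pre-switch")))
```

As the question notes, step 2 fails in practice: the CPU/device state saved under KVM is not guaranteed to be loadable by the TCG-mode machine.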
    Thanks for the help.

regards,
Wenhao

--
~_~
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ kvm-Bugs-2933400 ] virtio-blk io errors / data corruption on raw drives > 1 TB

2010-03-29 Thread SourceForge.net
Bugs item #2933400, was opened at 2010-01-16 09:35
Message generated for change (Comment added) made by jyellick
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2933400&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 9
Private: No
Submitted By: MaSc82 (masc82)
Assigned to: Avi Kivity (avik)
Summary: virtio-blk io errors / data corruption on raw drives > 1 TB

Initial Comment:
When attaching raw drives > 1 TB, buffer I/O errors are very likely to occur
and filesystems get corrupted. Processes (such as mkfs.ext4) may freeze
completely when filesystems are created on the guest.

Here's a typical log excerpt when using mkfs.ext4 on a 1.5 TB drive on a Ubuntu 
9.10 guest:
(kern.log)
Jan 15 20:40:44 q kernel: [  677.076602] Buffer I/O error on device vde, 
logical block 366283764
Jan 15 20:40:44 q kernel: [  677.076607] Buffer I/O error on device vde, 
logical block 366283765
Jan 15 20:40:44 q kernel: [  677.076611] Buffer I/O error on device vde, 
logical block 366283766
Jan 15 20:40:44 q kernel: [  677.076616] Buffer I/O error on device vde, 
logical block 366283767
Jan 15 20:40:44 q kernel: [  677.076621] Buffer I/O error on device vde, 
logical block 366283768
Jan 15 20:40:44 q kernel: [  677.076626] Buffer I/O error on device vde, 
logical block 366283769
(messages)
Jan 15 20:40:44 q kernel: [  677.076534] lost page write due to I/O error on vde
Jan 15 20:40:44 q kernel: [  677.076541] lost page write due to I/O error on vde
Jan 15 20:40:44 q kernel: [  677.076546] lost page write due to I/O error on vde
Jan 15 20:40:44 q kernel: [  677.076599] lost page write due to I/O error on vde
Jan 15 20:40:44 q kernel: [  677.076604] lost page write due to I/O error on vde
Jan 15 20:40:44 q kernel: [  677.076609] lost page write due to I/O error on vde
Jan 15 20:40:44 q kernel: [  677.076613] lost page write due to I/O error on vde
Jan 15 20:40:44 q kernel: [  677.076618] lost page write due to I/O error on vde
Jan 15 20:40:44 q kernel: [  677.076623] lost page write due to I/O error on vde
Jan 15 20:40:44 q kernel: [  677.076628] lost page write due to I/O error on vde
Jan 15 20:45:55 q Backgrounding to notify hosts...
(The following entries will repeat infinitely, mkfs.ext4 will not exit and 
cannot be killed)
Jan 15 20:49:27 q kernel: [ 1200.520096] mkfs.ext4 D  0 
 1839   1709 0x
Jan 15 20:49:27 q kernel: [ 1200.520101]  88004e157cb8 0082 
88004e157c58 00015880
Jan 15 20:49:27 q kernel: [ 1200.520115]  88004ef6c7c0 00015880 
00015880 00015880
Jan 15 20:49:27 q kernel: [ 1200.520118]  00015880 88004ef6c7c0 
00015880 00015880
Jan 15 20:49:27 q kernel: [ 1200.520123] Call Trace:
Jan 15 20:49:27 q kernel: [ 1200.520157]  [] ? 
sync_page+0x0/0x50
Jan 15 20:49:27 q kernel: [ 1200.520178]  [] 
io_schedule+0x28/0x40
Jan 15 20:49:27 q kernel: [ 1200.520182]  [] 
sync_page+0x3d/0x50
Jan 15 20:49:27 q kernel: [ 1200.520185]  [] 
__wait_on_bit+0x57/0x80
Jan 15 20:49:27 q kernel: [ 1200.520192]  [] 
wait_on_page_bit+0x6e/0x80
Jan 15 20:49:27 q kernel: [ 1200.520205]  [] ? 
wake_bit_function+0x0/0x40
Jan 15 20:49:27 q kernel: [ 1200.520210]  [] ? 
pagevec_lookup_tag+0x20/0x30
Jan 15 20:49:27 q kernel: [ 1200.520213]  [] 
wait_on_page_writeback_range+0xf5/0x190
Jan 15 20:49:27 q kernel: [ 1200.520217]  [] 
filemap_fdatawait+0x27/0x30
Jan 15 20:49:27 q kernel: [ 1200.520220]  [] 
filemap_write_and_wait+0x44/0x50
Jan 15 20:49:27 q kernel: [ 1200.520235]  [] 
__sync_blockdev+0x1f/0x40
Jan 15 20:49:27 q kernel: [ 1200.520239]  [] 
sync_blockdev+0xe/0x10
Jan 15 20:49:27 q kernel: [ 1200.520241]  [] 
block_fsync+0x1a/0x20
Jan 15 20:49:27 q kernel: [ 1200.520249]  [] 
vfs_fsync+0x86/0xf0
Jan 15 20:49:27 q kernel: [ 1200.520252]  [] 
do_fsync+0x39/0x60
Jan 15 20:49:27 q kernel: [ 1200.520255]  [] 
sys_fsync+0xb/0x10
Jan 15 20:49:27 q kernel: [ 1200.520271]  [] 
system_call_fastpath+0x16/0x1b

In my case I had switched to virtio at one point, but the behaviour didn't 
show until there was > 1 TB of data on the filesystem. Very dangerous.

Tested using 2 different SATA controllers, 1.5 TB lvm/mdraid, single 1.5 TB 
drive and 2 TB lvm/mdraid.
The behaviour does not occur with if=scsi or if=ide.

#2914397 might be related: 
https://sourceforge.net/tracker/?func=detail&aid=2914397&group_id=180599&atid=893831
This blog post might also relate: 
http://www.neuhalfen.name/2009/08/05/OpenSolaris_KVM_and_large_IDE_drives/

CPU: Intel Xeon E5430
KVM: qemu-kvm-0.12.1.2
Kernel:  2.6.32.2, x86_64
Guest OS: Verified to occur on guests Ubuntu Linux 9.10 (64-bit) and Gentoo 
Linux (64-bit)
Commandline (atm using ide instead of virtio for large drives as a workaround): 
 qemu-system-x86_64 -S -M pc-0.11 -enab
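The > 1 TB threshold is suggestive of sector numbers outgrowing a 32-bit integer somewhere in the stack. The arithmetic below is purely illustrative of why that boundary is special, not a confirmed root cause of this bug:

```python
SECTOR = 512

def max_lba(drive_bytes):
    """Highest logical block address on a drive with 512-byte sectors."""
    return drive_bytes // SECTOR - 1

# A 1.5 TB drive, as in the report: its top LBAs no longer fit in a
# signed 32-bit int, so any code path storing one in an int32 truncates.
print(max_lba(int(1.5e12)) > 2**31 - 1)   # True
# A drive just under 1 TiB stays within signed 32-bit range.
print(max_lba(2**40 - 4096) > 2**31 - 1)  # False
```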

KVM call agenda for Mar 30

2010-03-29 Thread Chris Wright
- vhost-blk

Please send in any additional agenda items you are interested in covering.

thanks,
-chris


[PATCH] Update sar command and handle OSError error.

2010-03-29 Thread Feng Yang
This patch do following things:
1. Update sar command in start function in /profilers/sar/sar.py,
because when i manual run '/usr/bin/sar -o %s %d 0' command, help
message is show up. Sames count number could not be 0, so use default
count.
2. Put os.kill in sar.py into try block to avoid traceback.
Sometimes it tried to kill an already terminated process which can
cause a traceback.

Signed-off-by: Feng Yang 
---
 client/profilers/sar/sar.py |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/client/profilers/sar/sar.py b/client/profilers/sar/sar.py
index fbe0639..e10156f 100644
--- a/client/profilers/sar/sar.py
+++ b/client/profilers/sar/sar.py
@@ -21,14 +21,17 @@ class sar(profiler.profiler):
         logfile = open(os.path.join(test.profdir, "sar"), 'w')
         # Save the sar data as binary, convert to text after the test.
         raw = os.path.join(test.profdir, "sar.raw")
-        cmd = "/usr/bin/sar -o %s %d 0" % (raw, self.interval)
+        cmd = "/usr/bin/sar -o %s %d " % (raw, self.interval)
         p = subprocess.Popen(cmd, shell=True, stdout=logfile, \
                              stderr=subprocess.STDOUT)
         self.pid = p.pid


     def stop(self, test):
-        os.kill(self.pid, 15)
+        try:
+            os.kill(self.pid, 15)
+        except OSError:
+            pass


     def report(self, test):
-- 
1.5.5.6
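The stop() fix boils down to tolerating "no such process" from a child that already exited. A standalone sketch of the same pattern (the helper name is mine, not autotest's):

```python
import os
import signal
import subprocess

def safe_kill(pid, sig=signal.SIGTERM):
    """Signal a process, ignoring the race where it has already exited."""
    try:
        os.kill(pid, sig)
        return True
    except OSError:  # ESRCH: process is already gone
        return False

p = subprocess.Popen(["true"])
p.wait()                          # child has exited and been reaped
print(safe_kill(p.pid))           # False: os.kill raised OSError internally
print(safe_kill(os.getpid(), 0))  # True: signal 0 just checks existence
```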



Re: [Autotest] [Autotest PATCH] KVM-test: Add a subtest 'qemu_img'

2010-03-29 Thread Yolkfull Chow
On Wed, Mar 17, 2010 at 10:38:58AM -0300, Lucas Meneghel Rodrigues wrote:
> Copying Michael on the message.
> 
> Hi Yolkfull, I have reviewed this patch and I have some comments to
> make on it, similar to the ones I made on an earlier version of it:
> 
> One of the things that I noticed is that this patch doesn't work very
> well out of the box:
> 
> [...@freedom kvm]$ ./scan_results.py
> Test  Status  Seconds 
> Info
>   --  --- 
> 
> (Result file: ../../results/default/status)
> smp2.Fedora.11.64.qemu_img.check  GOOD47  
> completed successfully
> smp2.Fedora.11.64.qemu_img.create GOOD44  
> completed successfully
> smp2.Fedora.11.64.qemu_img.convert.to_qcow2   FAIL45  
> Image
> converted failed; Command: /usr/bin/qemu-img convert -f qcow2 -O qcow2
> /tmp/kvm_autotest_root/images/fc11-64.qcow2
> /tmp/kvm_autotest_root/images/fc11-64.qcow2.converted_qcow2;Output is:
> qemu-img: Could not open '/tmp/kvm_autotest_root/images/fc11-64.qcow2'
> smp2.Fedora.11.64.qemu_img.convert.to_raw FAIL46  
> Image
> converted failed; Command: /usr/bin/qemu-img convert -f qcow2 -O raw
> /tmp/kvm_autotest_root/images/fc11-64.qcow2
> /tmp/kvm_autotest_root/images/fc11-64.qcow2.converted_raw;Output is:
> qemu-img: Could not open '/tmp/kvm_autotest_root/images/fc11-64.qcow2'
> smp2.Fedora.11.64.qemu_img.snapshot   FAIL44  
> Create
> snapshot failed via command: /usr/bin/qemu-img snapshot -c snapshot0
> /tmp/kvm_autotest_root/images/fc11-64.qcow2;Output is: qemu-img: Could
> not open '/tmp/kvm_autotest_root/images/fc11-64.qcow2'
> smp2.Fedora.11.64.qemu_img.commit GOOD44  
> completed successfully
> smp2.Fedora.11.64.qemu_img.info   FAIL44  
> Unhandled
> str: Unhandled TypeError: argument of type 'NoneType' is not iterable
> smp2.Fedora.11.64.qemu_img.rebase TEST_NA 43  
> Current
> kvm user space version does not support 'rebase' subcommand
>   GOOD412 
> 
> We need to fix that before upstream inclusion.

Hi Lucas, did you run the test on a Fedora box or something else? I ran this
test on my Fedora 13 box several times and it worked fine:

# ./scan_results.py 
TestStatus  Seconds 
Info
--  --- 

(Result file: ../../results/default/status)
smp2.RHEL.5.4.i386.qemu_img.check   GOOD132 
completed successfully
smp2.RHEL.5.4.i386.qemu_img.create  GOOD144 
completed successfully
smp2.RHEL.5.4.i386.qemu_img.convert.to_qcow2GOOD251 
completed successfully
smp2.RHEL.5.4.i386.qemu_img.convert.to_raw  GOOD245 
completed successfully
smp2.RHEL.5.4.i386.qemu_img.snapshotGOOD140 
completed successfully
smp2.RHEL.5.4.i386.qemu_img.commit  GOOD146 
completed successfully
smp2.RHEL.5.4.i386.qemu_img.infoGOOD133 
completed successfully
smp2.RHEL.5.4.i386.qemu_img.rebase  TEST_NA 137 
Current kvm user space version does not support 'rebase' subcommand
GOOD1392
[r...@afu kvm]# 

It is odd that some cases failed for you...
Please test again based on the new patch I will send later.

> 
> Also, one thing that I've noticed is that this test doesn't depend of
> any other variants, so we don't need to repeat it to every combination
> of guest and qemu command line options. Michael, does it occur to you
> a way to get this test out of the variants block, so it gets executed
> only once per job and not every combination of guest and other qemu
> options?

Lucas and Michael, maybe we could add a parameter, say 'ignore_vm_config =
yes', to the config file to let a test ignore all configuration combinations.
Another (ugly) method is to add the following block to the config file:

---
qemu_img:
only ide
only qcow2
only up
only ...
(use 'only' to filter all configurations combination)
---

But I don't think that's a good idea. What do you think?
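For context, the subcommands the test exercises map onto command lines like those in the failure output above. A sketch only; the binary path and image names are placeholders, not the test's actual parameters:

```python
def qemu_img_cmd(subcommand, image, fmt="qcow2", out_fmt=None, snapshot=None,
                 binary="/usr/bin/qemu-img"):
    """Build a qemu-img command line for a few of the tested subcommands."""
    if subcommand == "convert":
        out = "%s.converted_%s" % (image, out_fmt)
        return [binary, "convert", "-f", fmt, "-O", out_fmt, image, out]
    if subcommand == "snapshot":
        return [binary, "snapshot", "-c", snapshot, image]
    return [binary, subcommand, image]  # check, info

print(" ".join(qemu_img_cmd("convert", "fc11-64.qcow2", out_fmt="raw")))
```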

> 
> On Fri, Jan 29, 2010 at 4:00 AM, Yolkfull Chow  wrote:
> > This is designed to test all subcommands of 'qemu-img' however
> > so far 'commit' is not implemented.
> >
> > * For 'check' subcommand test, it will 'dd' to create a file with specified
> > size and see whether it's supported to be checked. Then convert it to be
> > supported formats ('qcow2' and 'raw' so far) to see wheth

Re: Setting nx bit in virtual CPU

2010-03-29 Thread Chris Wright
* Richard Simpson (rs1...@huskydog.org.uk) wrote:
> So, is there any way of having the nx bit and the benefits of KVM
> acceleration.

WFM here (both current git tree and 0.12.3) with either -cpu host or -cpu
qemu64.  The code definitely does what you'd expect in both of those cases.
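Inside the guest, the result is visible as the nx flag in /proc/cpuinfo. A tiny checker; the sample text is fabricated for illustration:

```python
def has_cpu_flag(cpuinfo_text, flag):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists flag."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            if flag in line.split(":", 1)[1].split():
                return True
    return False

SAMPLE = "processor\t: 0\nflags\t\t: fpu vme pae nx lm\n"
print(has_cpu_flag(SAMPLE, "nx"))  # True when NX is exposed to the guest
```

On a live guest, pass `open("/proc/cpuinfo").read()` instead of SAMPLE.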

thanks,
-chris


Re: PCI passthrough resource remapping

2010-03-29 Thread Chris Wright
* Kenni Lund (ke...@kelu.dk) wrote:
> Client dmesg: http://pastebin.com/uNG4QK5j
> Host dmesg: http://pastebin.com/jZu3WKZW
> 
> I just verified it and I do get the call trace in the host (which
> disables IRQ 19, used by the PCI USB card), exactly at the same second

It looks like IRQ 19 is shared between the ehci controller and the
ivtv tuner.  What do you see in /proc/interrupts on the host (before
you unbind and after you bind to pci stub)?

The host dmesg shows other possible devices sharing that interrupt:

$ grep "IRQ 19" jZu3WKZW
   482. ahci :00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19
   578. ehci_hcd :02:01.3: PCI INT A -> GSI 19 (level, low) -> IRQ 19
   623. uhci_hcd :00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
   795. ivtv :03:09.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
   814. IRQ 19/ivtv1: IRQF_DISABLED is not guaranteed on shared IRQs
   955. pci-stub :02:01.3: PCI INT A -> GSI 19 (level, low) -> IRQ 19
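Checking /proc/interrupts for sharers can be scripted; the sample text below is illustrative, modeled on the devices above rather than copied from the host:

```python
SAMPLE = """\
           CPU0       CPU1
 18:        100          0   IO-APIC-fasteoi   ohci_hcd:usb2
 19:      51471         42   IO-APIC-fasteoi   ehci_hcd:usb1, ivtv
"""

def irq_devices(interrupts_text, irq):
    """List the handlers sharing one IRQ line in /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "%d:" % irq:
            rest = [f for f in fields[1:] if not f.isdigit()]  # drop counters
            return [d.rstrip(",") for d in rest[1:]]           # drop chip name
    return []

print(irq_devices(SAMPLE, 19))  # ['ehci_hcd:usb1', 'ivtv']
```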

thanks,
-chris


[PATCH] Web frontend database: Expand test and subdir fields from tko_tests v2

2010-03-29 Thread Lucas Meneghel Rodrigues
Tests with large tags, such as the ones that can appear on the kvm test
can have the fields 'test' and 'subdir' too large. This patch raises
the length of such fields on the autotest_web database to 300.

Changes from v1:

 * Realized that tko/migrations/019_widen_test_name_field.py expands the
original value that test has from 30 to 60, so changed the downgrade
sql statement accordingly.

Signed-off-by: Lucas Meneghel Rodrigues 
---
 .../migrations/052_expand_test_subdir_fields.py|9 +
 frontend/tko/models.py |4 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)
 create mode 100644 frontend/migrations/052_expand_test_subdir_fields.py

diff --git a/frontend/migrations/052_expand_test_subdir_fields.py 
b/frontend/migrations/052_expand_test_subdir_fields.py
new file mode 100644
index 000..9f2732c
--- /dev/null
+++ b/frontend/migrations/052_expand_test_subdir_fields.py
@@ -0,0 +1,9 @@
+UP_SQL = """
+ALTER TABLE tko_tests MODIFY test varchar(300) default NULL;
+ALTER TABLE tko_tests MODIFY subdir varchar(300) default NULL;
+"""
+
+DOWN_SQL = """
+ALTER TABLE tko_tests MODIFY test varchar(60) default NULL;
+ALTER TABLE tko_tests MODIFY subdir varchar(60) default NULL;
+"""
\ No newline at end of file
diff --git a/frontend/tko/models.py b/frontend/tko/models.py
index 7348b07..e0215a6 100644
--- a/frontend/tko/models.py
+++ b/frontend/tko/models.py
@@ -169,8 +169,8 @@ class Test(dbmodels.Model, model_logic.ModelExtensions,
                model_logic.ModelWithAttributes):
     test_idx = dbmodels.AutoField(primary_key=True)
     job = dbmodels.ForeignKey(Job, db_column='job_idx')
-    test = dbmodels.CharField(max_length=90)
-    subdir = dbmodels.CharField(blank=True, max_length=180)
+    test = dbmodels.CharField(max_length=300)
+    subdir = dbmodels.CharField(blank=True, max_length=300)
     kernel = dbmodels.ForeignKey(Kernel, db_column='kernel_idx')
     status = dbmodels.ForeignKey(Status, db_column='status')
     reason = dbmodels.CharField(blank=True, max_length=3072)
-- 
1.6.6.1



Re: PCI passthrough resource remapping

2010-03-29 Thread Kenni Lund
2010/3/30 Chris Wright :
> * Alexander Graf (ag...@suse.de) wrote:
>> On 30.03.2010, at 01:00, Kenni Lund wrote:
>>
>> > 2010/3/29 Alexander Graf :
>> >>
>> >> On 29.03.2010, at 19:23, Kenni Lund wrote:
>> >>
>> > 2010/1/9 Alexander Graf :
>> >>
>> >> On 09.01.2010, at 03:45, Ryan C. Underwood wrote:
>> >>
>> >>>
>> >>> I have a multifunction PCI device that I'd like to pass through to 
>> >>> KVM.
>> >>> In order to do that, I'm reading that the PCI memory region must be
>> >>> 4K-page
>> >>> aligned and the PCI memory resources itself must also be exact 
>> >>> multiples
>> >>> of 4K pages.
>> >>>
>> >>> I have added the following on my kernel command line:
>> >>> reassign_resources 
>> >>> reassigndev=08:09.0,08:09.1,08:09.2,08:09.3,08:09.4
>> >>>
>> >>> But I don't know if it has any effect.  The resources are still not
>> >>> sized in 4K pages.  Also, this seems to screw up the last device.
>> >>
>> >> I submitted a patch to qemu-kvm recently that got rid of that 
>> >> limitation.
>> >> Please try out if the current git head works for you.
>> >>
>> >> Alex--
>> >
>> > I just upgraded to kernel 2.6.32.10 with qemu-kvm  0.12.3 and I still
>> > get the following error when trying to pass through a dedicated PCI
>> > USB card:
>> >
>> > "Unable to assign device: PCI region 0 at address 0xe9403000 has size
>> > 0x100,  which is not a multiple of 4K
>> > Error initializing device pci-assign"
>> >
>> > Didn't the above patch make it into qemu-kvm? I don't know why, but I
>> > was under the impression that this was fixed when I upgraded to
>> > qemu-kvm 0.12.3.
>> >
>>  It's only in qemu-kvm.git. Maybe it should go into qemu-kvm-0.12.4 if 
>>  there
>>  is one
>> >>>
>> >>> That would be highly appreciated... with the current USB support in
>> >>> QEMU, PCI passthrough is the only way to get USB 2.0 support. I've
>> >>> bought two dedicated PCI USB cards for this, but none of them works
>> >>> due to the above limitation.
>> >>>
>> >>> Perhaps a developer can comment on this? Are there any plans on
>> >>> including this patch in the stable releases in the near future?
>> >>
>> >> Please first try out to build the current git snapshot of qemu-kvm. If it 
>> >> works properly for you then I agree that we should take this into 
>> >> 0.12-stable.
>> >>
>> >> I wrote the support for a card that still didn't work even with this 
>> >> patch. So having someone say it makes things work for him is definitely a 
>> >> must :-).
>> >
>> > Sure, I have compiled the current git snapshot and performed some
>> > tests...It's at least mostly working, so I'm a bit unsure if this is a
>> > bug related to this or to something else.
>>
>> Chris, any idea on this? Looks like something's going wrong with function 
>> assignment.
>
> Hmm, one thing that sticks out to me is the debug port.  Kenni, can you
> post full dmesg on both host and guest, nothing is obviously broken (and
> in fact the guest should never "see" the debug port).
>

Uploaded here:
Client dmesg: http://pastebin.com/uNG4QK5j
Host dmesg: http://pastebin.com/jZu3WKZW

I just verified it and I do get the call trace in the host (which
disables IRQ 19, used by the PCI USB card), exactly at the same second
I ask the DVB-T tuner to view a channel in the guest.

Thanks..

Best Regards
Kenni Lund


[PATCH] Web frontend database: Expand test and subdir fields from tko_tests

2010-03-29 Thread Lucas Meneghel Rodrigues
Tests with large tags, such as the ones that can appear on the kvm test
can have the fields 'test' and 'subdir' too large. This patch raises
the length of such fields on the autotest_web database to 300.

Signed-off-by: Lucas Meneghel Rodrigues 
---
 .../migrations/052_expand_test_subdir_fields.py|9 +
 frontend/tko/models.py |4 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)
 create mode 100644 frontend/migrations/052_expand_test_subdir_fields.py

diff --git a/frontend/migrations/052_expand_test_subdir_fields.py 
b/frontend/migrations/052_expand_test_subdir_fields.py
new file mode 100644
index 000..d0f7a8b
--- /dev/null
+++ b/frontend/migrations/052_expand_test_subdir_fields.py
@@ -0,0 +1,9 @@
+UP_SQL = """
+ALTER TABLE tko_tests MODIFY test varchar(300) default NULL;
+ALTER TABLE tko_tests MODIFY subdir varchar(300) default NULL;
+"""
+
+DOWN_SQL = """
+ALTER TABLE tko_tests MODIFY test varchar(30) default NULL;
+ALTER TABLE tko_tests MODIFY subdir varchar(60) default NULL;
+"""
\ No newline at end of file
diff --git a/frontend/tko/models.py b/frontend/tko/models.py
index 7348b07..e0215a6 100644
--- a/frontend/tko/models.py
+++ b/frontend/tko/models.py
@@ -169,8 +169,8 @@ class Test(dbmodels.Model, model_logic.ModelExtensions,
                model_logic.ModelWithAttributes):
     test_idx = dbmodels.AutoField(primary_key=True)
     job = dbmodels.ForeignKey(Job, db_column='job_idx')
-    test = dbmodels.CharField(max_length=90)
-    subdir = dbmodels.CharField(blank=True, max_length=180)
+    test = dbmodels.CharField(max_length=300)
+    subdir = dbmodels.CharField(blank=True, max_length=300)
     kernel = dbmodels.ForeignKey(Kernel, db_column='kernel_idx')
     status = dbmodels.ForeignKey(Status, db_column='status')
     reason = dbmodels.CharField(blank=True, max_length=3072)
-- 
1.6.6.1



Re: [RFC] vhost-blk implementation

2010-03-29 Thread Chris Wright
* Badari Pulavarty (pbad...@us.ibm.com) wrote:
> On Mon, 2010-03-29 at 23:37 +0300, Avi Kivity wrote:
> > On 03/29/2010 09:20 PM, Chris Wright wrote:
> > > Your io wait time is twice as long and your throughput is about half.
> > > I think the qmeu block submission does an extra attempt at merging
> > > requests.  Does blktrace tell you anything interesting?
> 
> Yes. I see that in my test case (2M writes) - QEMU is picking up 512K
> requests from the virtio ring and merging them back into 2M before
> submitting them. 
> 
> Unfortunately, I can't do that quite as easily in vhost-blk. QEMU
> re-creates the iovecs for the merged IO. I have to come up with
> a scheme to do this :(

Is close cooperator logic kicking in at all?  Alternatively, using same
io_context.

> > It does.  I suggest using fio O_DIRECT random access patterns to avoid 
> > such issues.
> 
> Well, I am not trying to come up with a test case where vhost-blk
> performs better than virtio-blk. I am trying to understand where
> and why vhost-blk performs worse than virtio-blk.

It would just level the playing field.  Alternatively, commenting out
merging in qemu to further validate it's the source, something as simple
as this in block.c:

-num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb);
+//num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb);


Although, from above, sounds like you've already verified this is the
difference.  IIRC, the sync write path is at odds w/ cfq elevator merging
(so cache=none would have trouble, but you were doing cache=writeback,
right?), but if you can hack your worker threads to share an io_context,
like you had done CLONE_IO,  you might get some merging back (again,
just a hack to pinpoint merging as the culprit).
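The merging under discussion is just coalescing of back-to-back requests before submission. A toy sketch of the idea (not QEMU's actual multiwrite_merge, which also rebuilds iovecs):

```python
def merge_contiguous(reqs):
    """Coalesce (offset, length) requests that are back-to-back on disk."""
    merged = []
    for off, length in sorted(reqs):
        if merged and merged[-1][0] + merged[-1][1] == off:
            # extend the previous request instead of submitting a new one
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((off, length))
    return merged

# four contiguous 512 KiB writes coalesce into one 2 MiB request
K512 = 512 * 1024
print(merge_contiguous([(i * K512, K512) for i in range(4)]))  # [(0, 2097152)]
```

This is the work qemu does in userspace and vhost-blk currently skips, which matches the throughput gap being investigated.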

thanks,
-chris


Re: PCI passthrough resource remapping

2010-03-29 Thread Chris Wright
* Alexander Graf (ag...@suse.de) wrote:
> On 30.03.2010, at 01:00, Kenni Lund wrote:
> 
> > 2010/3/29 Alexander Graf :
> >> 
> >> On 29.03.2010, at 19:23, Kenni Lund wrote:
> >> 
> > 2010/1/9 Alexander Graf :
> >> 
> >> On 09.01.2010, at 03:45, Ryan C. Underwood wrote:
> >> 
> >>> 
> >>> I have a multifunction PCI device that I'd like to pass through to 
> >>> KVM.
> >>> In order to do that, I'm reading that the PCI memory region must be
> >>> 4K-page
> >>> aligned and the PCI memory resources itself must also be exact 
> >>> multiples
> >>> of 4K pages.
> >>> 
> >>> I have added the following on my kernel command line:
> >>> reassign_resources reassigndev=08:09.0,08:09.1,08:09.2,08:09.3,08:09.4
> >>> 
> >>> But I don't know if it has any effect.  The resources are still not
> >>> sized in 4K pages.  Also, this seems to screw up the last device.
> >> 
> >> I submitted a patch to qemu-kvm recently that got rid of that 
> >> limitation.
> >> Please try out if the current git head works for you.
> >> 
> >> Alex--
> > 
> > I just upgraded to kernel 2.6.32.10 with qemu-kvm  0.12.3 and I still
> > get the following error when trying to pass through a dedicated PCI
> > USB card:
> > 
> > "Unable to assign device: PCI region 0 at address 0xe9403000 has size
> > 0x100,  which is not a multiple of 4K
> > Error initializing device pci-assign"
> > 
> > Didn't the above patch make it into qemu-kvm? I don't know why, but I
> > was under the impression that this was fixed when I upgraded to
> > qemu-kvm 0.12.3.
> > 
>  It's only in qemu-kvm.git. Maybe it should go into qemu-kvm-0.12.4 if 
>  there
>  is one
> >>> 
> >>> That would be highly appreciated... with the current USB support in
> >>> QEMU, PCI passthrough is the only way to get USB 2.0 support. I've
> >>> bought two dedicated PCI USB cards for this, but none of them works
> >>> due to the above limitation.
> >>> 
> >>> Perhaps a developer can comment on this? Are there any plans on
> >>> including this patch in the stable releases in the near future?
> >> 
> >> Please first try out to build the current git snapshot of qemu-kvm. If it 
> >> works properly for you then I agree that we should take this into 
> >> 0.12-stable.
> >> 
> >> I wrote the support for a card that still didn't work even with this 
> >> patch. So having someone say it makes things work for him is definitely a 
> >> must :-).
> > 
> > Sure, I have compiled the current git snapshot and performed some
> > tests...It's at least mostly working, so I'm a bit unsure if this is a
> > bug related to this or to something else.
> 
> Chris, any idea on this? Looks like something's going wrong with function 
> assignment.

Hmm, one thing that sticks out to me is the debug port.  Kenni, can you
post full dmesg on both host and guest, nothing is obviously broken (and
in fact the guest should never "see" the debug port).

thanks,
-chris


Re: PCI passthrough resource remapping

2010-03-29 Thread Alexander Graf

On 30.03.2010, at 01:00, Kenni Lund wrote:

> 2010/3/29 Alexander Graf :
>> 
>> On 29.03.2010, at 19:23, Kenni Lund wrote:
>> 
> 2010/1/9 Alexander Graf :
>> 
>> On 09.01.2010, at 03:45, Ryan C. Underwood wrote:
>> 
>>> 
>>> I have a multifunction PCI device that I'd like to pass through to KVM.
>>> In order to do that, I'm reading that the PCI memory region must be
>>> 4K-page
>>> aligned and the PCI memory resources itself must also be exact multiples
>>> of 4K pages.
>>> 
>>> I have added the following on my kernel command line:
>>> reassign_resources reassigndev=08:09.0,08:09.1,08:09.2,08:09.3,08:09.4
>>> 
>>> But I don't know if it has any effect.  The resources are still not
>>> sized in 4K pages.  Also, this seems to screw up the last device.
>> 
>> I submitted a patch to qemu-kvm recently that got rid of that limitation.
>> Please try out if the current git head works for you.
>> 
>> Alex--
> 
> I just upgraded to kernel 2.6.32.10 with qemu-kvm  0.12.3 and I still
> get the following error when trying to pass through a dedicated PCI
> USB card:
> 
> "Unable to assign device: PCI region 0 at address 0xe9403000 has size
> 0x100,  which is not a multiple of 4K
> Error initializing device pci-assign"
> 
> Didn't the above patch make it into qemu-kvm? I don't know why, but I
> was under the impression that this was fixed when I upgraded to
> qemu-kvm 0.12.3.
> 
 It's only in qemu-kvm.git. Maybe it should go into qemu-kvm-0.12.4 if there
 is one
>>> 
>>> That would be highly appreciated... with the current USB support in
>>> QEMU, PCI passthrough is the only way to get USB 2.0 support. I've
>>> bought two dedicated PCI USB cards for this, but none of them works
>>> due to the above limitation.
>>> 
>>> Perhaps a developer can comment on this? Are there any plans on
>>> including this patch in the stable releases in the near future?
>> 
>> Please first try out to build the current git snapshot of qemu-kvm. If it 
>> works properly for you then I agree that we should take this into 
>> 0.12-stable.
>> 
>> I wrote the support for a card that still didn't work even with this patch. 
>> So having someone say it makes things work for him is definitely a must :-).
> 
> Sure, I have compiled the current git snapshot and performed some
> tests...It's at least mostly working, so I'm a bit unsure if this is a
> bug related to this or to something else.

Chris, any idea on this? Looks like something's going wrong with function 
assignment.


Alex


Re: PCI passthrough resource remapping

2010-03-29 Thread Kenni Lund
2010/3/29 Alexander Graf :
>
> On 29.03.2010, at 19:23, Kenni Lund wrote:
>
 2010/1/9 Alexander Graf :
>
> On 09.01.2010, at 03:45, Ryan C. Underwood wrote:
>
>>
>> I have a multifunction PCI device that I'd like to pass through to KVM.
>> In order to do that, I'm reading that the PCI memory region must be
>> 4K-page
>> aligned and the PCI memory resources itself must also be exact multiples
>> of 4K pages.
>>
>> I have added the following on my kernel command line:
>> reassign_resources reassigndev=08:09.0,08:09.1,08:09.2,08:09.3,08:09.4
>>
>> But I don't know if it has any effect.  The resources are still not
>> sized in 4K pages.  Also, this seems to screw up the last device.
>
> I submitted a patch to qemu-kvm recently that got rid of that limitation.
> Please try out if the current git head works for you.
>
> Alex--

 I just upgraded to kernel 2.6.32.10 with qemu-kvm  0.12.3 and I still
 get the following error when trying to pass through a dedicated PCI
 USB card:

 "Unable to assign device: PCI region 0 at address 0xe9403000 has size
 0x100,  which is not a multiple of 4K
 Error initializing device pci-assign"

 Didn't the above patch make it into qemu-kvm? I don't know why, but I
 was under the impression that this was fixed when I upgraded to
 qemu-kvm 0.12.3.

>>> It's only in qemu-kvm.git. Maybe it should go into qemu-kvm-0.12.4 if there
>>> is one
>>
>> That would be highly appreciated... with the current USB support in
>> QEMU, PCI passthrough is the only way to get USB 2.0 support. I've
>> bought two dedicated PCI USB cards for this, but none of them works
>> due to the above limitation.
>>
>> Perhaps a developer can comment on this? Are there any plans on
>> including this patch in the stable releases in the near future?
>
> Please first try out to build the current git snapshot of qemu-kvm. If it 
> works properly for you then I agree that we should take this into 0.12-stable.
>
> I wrote the support for a card that still didn't work even with this patch. 
> So having someone say it makes things work for him is definitely a must :-).

Sure, I have compiled the current git snapshot and performed some
tests...It's at least mostly working, so I'm a bit unsure if this is a
bug related to this or to something else.

Here's my test results on trying to passthrough a PCI USB card (I've
copy-pasted the text below into http://pastebin.com/8RJE36wG in case
formatting is lost below):



qemu-kvm complete command line:

qemu-kvm -usbdevice tablet -net
nic,macaddr=52:54:00:00:00:01,model=virtio -net tap,ifname=tap0 -vnc
:1 -smp 2 -m 2048 -cdrom
/data/server/Linux/mythbuntu-9.10-desktop-amd64.iso -drive
file=/data/virtualization/01_Mythbuntu.img,if=virtio,boot=on -boot c
-localtime -daemonize -pcidevice host=02:01.0 -pcidevice host=02:01.1
-pcidevice host=02:01.2 -pcidevice host=02:01.3 -monitor
unix:/var/run/kvm/01.socket,server,nowait -k da


uname -a on host

Linux mediaserver 2.6.32-ARCH #1 SMP PREEMPT Fri Mar 26 02:03:53 CET
2010 x86_64 Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz GenuineIntel
GNU/Linux

Exact kernel version is 2.6.32.10.



lspci -v on host, only for the USB PCI card:

02:01.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
(prog-if 10 [OHCI])
Subsystem: ALi Corporation ASRock 939Dual-SATA2 Motherboard
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 18
Memory at e940 (32-bit, non-prefetchable) [size=4K]
Capabilities: [60] Power Management version 2
Kernel driver in use: pci-stub
Kernel modules: ohci-hcd

02:01.1 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
(prog-if 10 [OHCI])
Subsystem: ALi Corporation ASRock 939Dual-SATA2 Motherboard
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 16
Memory at e9401000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [60] Power Management version 2
Kernel driver in use: pci-stub
Kernel modules: ohci-hcd

02:01.2 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
(prog-if 10 [OHCI])
Subsystem: ALi Corporation ASRock 939Dual-SATA2 Motherboard
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 20
Memory at e9402000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [60] Power Management version 2
Kernel driver in use: pci-stub
Kernel modules: ohci-hcd

02:01.3 USB Controller: ALi Corporation USB 2.0 Controller (rev 01)
(prog-if 20 [EHCI])
Subsystem: Device 2020:
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 19
Memory at e9403000 (32-bit, non-prefetchable) [size=256]
Capabilities: [50] Power Management version 2
Capabilities: [58] Debug port: BAR=1 offset=0090
Kernel driver in use: pci-stub
Kernel modules: ehci-hcd
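The error quoted earlier ("PCI region 0 at address 0xe9403000 has size 0x100, which is not a multiple of 4K") corresponds to the 256-byte EHCI BAR in the lspci output above. The failing condition itself is simple; here is a hedged Python sketch of it (the constant and function names are illustrative, not qemu-kvm's actual code):

```python
PAGE_SIZE = 4096  # host page size assumed by the device-assignment code

def directly_mappable(bar_size):
    """A BAR can be handed straight to the guest via mmap() only if it
    spans whole host pages; sub-page BARs trip the 'not a multiple of
    4K' check and need the slow-path fix from the patch under test."""
    return bar_size % PAGE_SIZE == 0
```

The three OHCI functions (size=4K) pass this test; the EHCI function (size=256) fails it, matching the report.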
-

Re: [RFC] vhost-blk implementation

2010-03-29 Thread Badari Pulavarty
On Mon, 2010-03-29 at 23:37 +0300, Avi Kivity wrote:
> On 03/29/2010 09:20 PM, Chris Wright wrote:
> > * Badari Pulavarty (pbad...@us.ibm.com) wrote:
> >
> >> I modified my vhost-blk implementation to offload work to
> >> workqueues instead of doing it synchronously. In fact, I tried
> >> to spread the work across all the CPUs. But to my surprise,
> >> this did not improve the performance compared to virtio-blk.
> >>
> >> I see vhost-blk taking more interrupts and context switches
> >> compared to virtio-blk. What is virtio-blk doing that I
> >> am not able to do from vhost-blk?
> >>  
> > Your io wait time is twice as long and your throughput is about half.
> > I think the qemu block submission does an extra attempt at merging
> > requests.  Does blktrace tell you anything interesting?
> >

Yes. I see that in my testcase (2M writes) - QEMU is picking up 512K
requests from the virtio ring and merging them back into 2M before
submitting them.

Unfortunately, I can't do that quite as easily in vhost-blk. QEMU
re-creates the iovecs for the merged IO. I have to come up with
a scheme to do this :(
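The merge described above (512K requests re-coalesced into 2M before submission) can be modelled in a few lines. This is an illustrative sketch; the request shape is hypothetical, not QEMU's or vhost-blk's actual structures:

```python
SECTOR_SIZE = 512

def try_merge(a, b):
    """Merge request b into a if b starts exactly where a ends.

    Requests are modelled as dicts with a starting 'sector' and a list
    of iovec-like buffers; merging is just concatenating the buffer
    lists - which is exactly the re-created-iovec work that is awkward
    to reproduce in the kernel-side vhost-blk path.
    """
    a_end = a['sector'] + sum(len(v) for v in a['iov']) // SECTOR_SIZE
    if b['sector'] != a_end:
        return False        # not contiguous, leave both requests alone
    a['iov'].extend(b['iov'])
    return True
```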

> It does.  I suggest using fio O_DIRECT random access patterns to avoid 
> such issues.

Well, I am not trying to come up with a test case where vhost-blk
performs better than virtio-blk. I am trying to understand where
and why vhost-blk performs worse than virtio-blk.


Thanks,
Badari


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Setting nx bit in virtual CPU

2010-03-29 Thread Richard Simpson
Hello,

Summary: How can I have a virtual CPU with the nx bit set whilst
enjoying KVM acceleration?

My Host - AMD Athlon(tm) 64 Processor 3200+ running Gentoo
My VM - KVM running hardened Gentoo
My KVM version - 0.12.3
My Task - Implement restricted secure VM to handle services exposed to
internet.
My Command - kvm -hda /dev/mapper/vols-andrew -kernel ./bzImage -append
root=/dev/hda2 -cpu host -runas xxx -net nic -net user -m 256 -k en-gb
-vnc :1 -monitor stdio

In order to maximise the security of my VM, I have enabled PaX which is
supposed to prevent various address space attacks.  Sadly, when I run
'paxtest' it reports that my VM is still vulnerable.  I have concluded
that the problem is most likely caused by the virtual CPU not having the
nx bit set.

Flags in virtual CPU: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall mmxext fxsr_opt
lm rep_good pni cx16 lahf_lm

Flags in host CPU: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt
rdtscp lm 3dnowext 3dnow rep_good nopl pni cx16 lahf_lm svm extapic
cr8_legacy

As you can see, despite using the '-cpu host' command, several host
flags, including nx, are missing in the VM.  Setting '-cpu host,+nx'
doesn't make any difference.
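The two flag lines above can be diffed mechanically to see exactly what KVM masked; a small Python sketch (flag strings transcribed from this message):

```python
def masked_flags(host_flags, guest_flags):
    """Return the CPU flags the host advertises but the guest can't see."""
    return sorted(set(host_flags.split()) - set(guest_flags.split()))

guest = ("fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov "
         "pat pse36 clflush mmx fxsr sse sse2 syscall mmxext fxsr_opt "
         "lm rep_good pni cx16 lahf_lm")
host = ("fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov "
        "pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt "
        "rdtscp lm 3dnowext 3dnow rep_good nopl pni cx16 lahf_lm svm "
        "extapic cr8_legacy")
```

Running `masked_flags(host, guest)` shows nx among the masked flags, along with svm, rdtscp and the 3dnow extensions.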

If however, I remove the '-cpu host' option and add the '-no-kvm' option
the virtual CPU has the nx flag and paxtest reports that my VM is
secure.  Of course the down side is that everything runs much slower.

Confusingly, the following page about tuning KVM
(http://www.linux-kvm.org/page/Tuning_KVM) lists the flags for the
default qemu64 cpu and nx is clearly included.  But, when I set '-cpu
qemu64' I get a model name of QEMU Virtual CPU, but no sign of an nx bit.

So, is there any way of having the nx bit and the benefits of KVM
acceleration?

Thank you.


Re: [PATCH v3 1/1] Shared memory uio_pci driver

2010-03-29 Thread Avi Kivity

On 03/28/2010 10:48 PM, Cam Macdonell wrote:

On Sat, Mar 27, 2010 at 11:48 AM, Avi Kivity  wrote:

On 03/26/2010 07:14 PM, Cam Macdonell wrote:

I'm not familiar with the uio internals, but for the interface, an
ioctl() on the fd to assign an eventfd to an MSI vector.  Similar to
ioeventfd, but instead of mapping a doorbell to an eventfd, it maps a
real MSI to an eventfd.


uio will never support ioctls.

Why not?

Perhaps I spoke too strongly, but it was rejected before

http://thread.gmane.org/gmane.linux.kernel/756481

With a compelling case perhaps it could be added.


Ah, the usual "ioctls are ugly, go away".

It could be done via sysfs:

  $ cat /sys/.../msix/max-interrupts
  256
  $ echo 4 > /sys/.../msix/allocate
  $ # subdirectories 0 1 2 3 magically appear
  $ # bind fd 13 to msix
  $ echo 13 > /sys/.../msix/2/bind-fd
  $ # from now on, msix interrupt 2 will call eventfd_signal() on fd 13

Call me old fashioned, but I prefer ioctls.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: [RFC] vhost-blk implementation

2010-03-29 Thread Avi Kivity

On 03/29/2010 09:20 PM, Chris Wright wrote:

* Badari Pulavarty (pbad...@us.ibm.com) wrote:

I modified my vhost-blk implementation to offload work to
workqueues instead of doing it synchronously. In fact, I tried
to spread the work across all the CPUs. But to my surprise,
this did not improve the performance compared to virtio-blk.

I see vhost-blk taking more interrupts and context switches
compared to virtio-blk. What is virtio-blk doing that I
am not able to do from vhost-blk?

Your io wait time is twice as long and your throughput is about half.
I think the qemu block submission does an extra attempt at merging
requests.  Does blktrace tell you anything interesting?

It does.  I suggest using fio O_DIRECT random access patterns to avoid 
such issues.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: PCI passthrough resource remapping

2010-03-29 Thread Alexander Graf

On 29.03.2010, at 19:23, Kenni Lund wrote:

>>> 2010/1/9 Alexander Graf :
 
 On 09.01.2010, at 03:45, Ryan C. Underwood wrote:
 
> 
> I have a multifunction PCI device that I'd like to pass through to KVM.
> In order to do that, I'm reading that the PCI memory region must be
> 4K-page
> aligned and the PCI memory resources itself must also be exact multiples
> of 4K pages.
> 
> I have added the following on my kernel command line:
> reassign_resources reassigndev=08:09.0,08:09.1,08:09.2,08:09.3,08:09.4
> 
> But I don't know if it has any effect.  The resources are still not
> sized in 4K pages.  Also, this seems to screw up the last device.
 
 I submitted a patch to qemu-kvm recently that got rid of that limitation.
 Please try out if the current git head works for you.
 
 Alex--
>>> 
>>> I just upgraded to kernel 2.6.32.10 with qemu-kvm  0.12.3 and I still
>>> get the following error when trying to pass through a dedicated PCI
>>> USB card:
>>> 
>>> "Unable to assign device: PCI region 0 at address 0xe9403000 has size
>>> 0x100,  which is not a multiple of 4K
>>> Error initializing device pci-assign"
>>> 
>>> Didn't the above patch make it into qemu-kvm? I don't know why, but I
>>> was under the impression that this was fixed when I upgraded to
>>> qemu-kvm 0.12.3.
>>> 
>> It's only in qemu-kvm.git. Maybe it should go into qemu-kvm-0.12.4 if there
>> is one
> 
> That would be highly appreciated... with the current USB support in
> QEMU, PCI passthrough is the only way to get USB 2.0 support. I've
> bought two dedicated PCI USB cards for this, but none of them works
> due to the above limitation.
> 
> Perhaps a developer can comment on this? Are there any plans on
> including this patch in the stable releases in the near future?

Please first try out to build the current git snapshot of qemu-kvm. If it works 
properly for you then I agree that we should take this into 0.12-stable.

I wrote the support for a card that still didn't work even with this patch. So 
having someone say it makes things work for him is definitely a must :-).


Alex


Re: [RFC] vhost-blk implementation

2010-03-29 Thread Chris Wright
* Badari Pulavarty (pbad...@us.ibm.com) wrote:
> I modified my vhost-blk implementation to offload work to
> workqueues instead of doing it synchronously. In fact, I tried
> to spread the work across all the CPUs. But to my surprise,
> this did not improve the performance compared to virtio-blk.
> 
> I see vhost-blk taking more interrupts and context switches
> compared to virtio-blk. What is virtio-blk doing that I
> am not able to do from vhost-blk?

Your io wait time is twice as long and your throughput is about half.
I think the qemu block submission does an extra attempt at merging
requests.  Does blktrace tell you anything interesting?

> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo    in    cs us sy id wa st
>  3  1   8920  56076  20760 5603556    0  104   196 79826 17164 13912  0  5 65 30  0
>  2  4   9488  57216  20744 5605616    0  114   195 81120 17397 13824  0  5 65 30  0
>  2  2  10028  68476  20728 5594764    0  108   206 80318 17162 13845  0  5 65 30  0
>  0  4  10560  70856  20708 5593088    0  106   205 82363 17402 13904  0  5 65 30  0
>  1  3  10948  80380  20672 5584452    0   78   178 79714 17113 13875  0  5 66 29  0
> 
> qemu virtio-blk:
> 
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi     bo    in   cs us sy id wa st
>  0  1  14124  57456   5144 4924060    0    0   139 142546 11287 9312  1  4 80 15  0
>  0  2  14124  56736   5148 4927396    0    0   146 142968 11283 9248  1  4 80 15  0
>  0  1  14124  56712   5384 4927020    0    0    74 150738 11182 9327  1  4 80 16  0
>  1  1  14124  55496   5392 4927904    0    0     2 159902 11172 9401  1  3 79 17  0
>  0  1  14124  55968   5408 4927232    0    0     0 159202 11212 9325  1  3 80 16  0


Re: PCI passthrough resource remapping

2010-03-29 Thread Kenni Lund
>> 2010/1/9 Alexander Graf :
>>>
>>> On 09.01.2010, at 03:45, Ryan C. Underwood wrote:
>>>

 I have a multifunction PCI device that I'd like to pass through to KVM.
 In order to do that, I'm reading that the PCI memory region must be
 4K-page
 aligned and the PCI memory resources itself must also be exact multiples
 of 4K pages.

 I have added the following on my kernel command line:
 reassign_resources reassigndev=08:09.0,08:09.1,08:09.2,08:09.3,08:09.4

 But I don't know if it has any effect.  The resources are still not
 sized in 4K pages.  Also, this seems to screw up the last device.
>>>
>>> I submitted a patch to qemu-kvm recently that got rid of that limitation.
>>> Please try out if the current git head works for you.
>>>
>>> Alex--
>>
>> I just upgraded to kernel 2.6.32.10 with qemu-kvm  0.12.3 and I still
>> get the following error when trying to pass through a dedicated PCI
>> USB card:
>>
>> "Unable to assign device: PCI region 0 at address 0xe9403000 has size
>> 0x100,  which is not a multiple of 4K
>> Error initializing device pci-assign"
>>
>> Didn't the above patch make it into qemu-kvm? I don't know why, but I
>> was under the impression that this was fixed when I upgraded to
>> qemu-kvm 0.12.3.
>>
> It's only in qemu-kvm.git. Maybe it should go into qemu-kvm-0.12.4 if there
> is one

That would be highly appreciated... with the current USB support in
QEMU, PCI passthrough is the only way to get USB 2.0 support. I've
bought two dedicated PCI USB cards for this, but none of them works
due to the above limitation.

Perhaps a developer can comment on this? Are there any plans on
including this patch in the stable releases in the near future?

Thanks :)

Best Regards
Kenni Lund


Re: MSI-X not enabled for ixgbe device-passthrough

2010-03-29 Thread Alexander Graf


On 29.03.2010, at 18:46, Chris Wright  wrote:


> * Hannes Reinecke (h...@suse.de) wrote:
>> Ah. So I'll have to shout at Alex Graf.
>>
>> No problems there :-)
>
> I like that, when in doubt, shout at Alex ;-)

Yep, whenever in doubt, just shout at me :).


Alex






Re: MSI-X not enabled for ixgbe device-passthrough

2010-03-29 Thread Chris Wright
* Hannes Reinecke (h...@suse.de) wrote:
> Ah. So I'll have to shout at Alex Graf.
> 
> No problems there :-)

I like that, when in doubt, shout at Alex ;-)

thanks,
-chris


Re: [RFC] vhost-blk implementation

2010-03-29 Thread Badari Pulavarty
Hi Christoph,

I am wondering if you can provide your thoughts here...

I modified my vhost-blk implementation to offload work to
workqueues instead of doing it synchronously. In fact, I tried
to spread the work across all the CPUs. But to my surprise,
this did not improve the performance compared to virtio-blk.

I see vhost-blk taking more interrupts and context switches
compared to virtio-blk. What is virtio-blk doing that I
am not able to do from vhost-blk?

Thanks,
Badari


vhost-blk

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo    in    cs us sy id wa st
 3  1   8920  56076  20760 5603556    0  104   196 79826 17164 13912  0  5 65 30  0
 2  4   9488  57216  20744 5605616    0  114   195 81120 17397 13824  0  5 65 30  0
 2  2  10028  68476  20728 5594764    0  108   206 80318 17162 13845  0  5 65 30  0
 0  4  10560  70856  20708 5593088    0  106   205 82363 17402 13904  0  5 65 30  0
 1  3  10948  80380  20672 5584452    0   78   178 79714 17113 13875  0  5 66 29  0

qemu virtio-blk:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi     bo    in   cs us sy id wa st
 0  1  14124  57456   5144 4924060    0    0   139 142546 11287 9312  1  4 80 15  0
 0  2  14124  56736   5148 4927396    0    0   146 142968 11283 9248  1  4 80 15  0
 0  1  14124  56712   5384 4927020    0    0    74 150738 11182 9327  1  4 80 16  0
 1  1  14124  55496   5392 4927904    0    0     2 159902 11172 9401  1  3 79 17  0
 0  1  14124  55968   5408 4927232    0    0     0 159202 11212 9325  1  3 80 16  0

---
 drivers/vhost/blk.c |  310 
 1 file changed, 310 insertions(+)

Index: net-next/drivers/vhost/blk.c
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ net-next/drivers/vhost/blk.c	2010-03-25 20:06:57.484054770 -0400
@@ -0,0 +1,310 @@
+ /*
+  * virtio-block server in host kernel.
+  * Inspired by vhost-net and shamelessly ripped code from it :)
+  */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "vhost.h"
+
+#define VHOST_BLK_VQ_MAX 1
+
+#if 0
+#define myprintk(fmt, ...) printk(pr_fmt(fmt), ##__VA_ARGS__)
+#else
+#define myprintk(fmt, ...)
+#endif
+
+struct vhost_blk {
+   struct vhost_dev dev;
+   struct vhost_virtqueue vqs[VHOST_BLK_VQ_MAX];
+   struct vhost_poll poll[VHOST_BLK_VQ_MAX];
+};
+
+struct vhost_blk_io {
+   struct work_struct work;
+   struct vhost_blk *blk;
+   struct file *file;
+   int head;
+   uint32_t type;
+   uint64_t sector;
+   struct iovec *iov;
+   int nvecs;
+};
+
+static struct workqueue_struct *vblk_workqueue;
+
+static void handle_io_work(struct work_struct *work)
+{
+   struct vhost_blk_io *vbio;
+   struct vhost_virtqueue *vq;
+   struct vhost_blk *blk;
+   int i, ret = 0;
+   loff_t pos;
+   uint8_t status = 0;
+
+   vbio = container_of(work, struct vhost_blk_io, work);
+   blk = vbio->blk;
+   vq = &blk->dev.vqs[0];
+   pos = vbio->sector << 9; /* sector is in 512-byte units */
+
+   use_mm(blk->dev.mm);
+
+   if (vbio->type & VIRTIO_BLK_T_FLUSH)  {
+   ret = vfs_fsync(vbio->file, vbio->file->f_path.dentry, 1);
+   } else if (vbio->type & VIRTIO_BLK_T_OUT) {
+   ret = vfs_writev(vbio->file, vbio->iov, vbio->nvecs, &pos);
+   } else {
+   ret = vfs_readv(vbio->file, vbio->iov, vbio->nvecs, &pos);
+   }
+
+   status = (ret < 0) ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
+   /* copy_to_user() returns the number of bytes NOT copied (never < 0) */
+   if (copy_to_user(vbio->iov[vbio->nvecs].iov_base, &status, sizeof status)) {
+   printk("copy to user failed\n");
+   vhost_discard_vq_desc(vq);
+   unuse_mm(blk->dev.mm);
+   return;
+   }
+   mutex_lock(&vq->mutex);
+   vhost_add_used_and_signal(&blk->dev, vq, vbio->head, ret);
+   mutex_unlock(&vq->mutex);
+   unuse_mm(blk->dev.mm);
+   kfree(vbio);
+}
+
+static int cpu = 0;
+static int handoff_io(struct vhost_blk *blk, int head,
+   uint32_t type, uint64_t sector,
+   struct iovec *iov, int nvecs)
+{
+   struct vhost_virtqueue *vq = &blk->dev.vqs[0];
+   struct vhost_blk_io *vbio;
+
+   vbio = kmalloc(sizeof(struct vhost_blk_io), GFP_KERNEL);
+   if (!vbio)
+   return -ENOMEM;
+
+   INIT_WORK(&vbio->work, handle_io_work);
+   vbio->blk = blk;
+   vbio->file = vq->private_data;
+   vbio->head = head;
+   vbio->type = type;
+   vbio->sector = sector;
+   vbio->iov = iov;
+   vbio->nvecs = nvecs;
+
+   cpu = cpumask_next(cpu, cpu_online_mask);
+   if (cpu >= nr_cpu_ids)
+   cpu = cpumask_first(cpu_online_mask);
+   

-netdev & PXEboot

2010-03-29 Thread Michael Tokarev
I sent this email yesterday but it never came back to
me and is not shown in the archives.  So I am resending it.

I noticed that with the -netdev syntax (introduced in 0.12)
it is not possible to perform network booting anymore:

$ kvm -net tap,ifname=tap-kvm,id=net0 \
 -device virtio-net-pci,id=net0 -boot n
Cannot boot from non-existent NIC

On the other hand,

$ kvm -net tap,ifname=tap-kvm \
  -net nic,model=virtio -boot n

boots ok.

I also noticed that it's not possible to use
boot=on parameter for network booting at all:

$ kvm -net tap,ifname=tap-kvm  -net nic,model=virtio,boot=on
option "boot" is not valid for net
$ kvm -net tap,ifname=tap-kvm,id=net0 -device virtio-net-pci,id=net0,boot=on
property "virtio-net-pci.boot" not found
can't set property "boot" to "on" for "virtio-net-pci"

What's the way to PXE-boot when using the new
-netdev syntax?

Thanks!

/mjt


Re: [PATCH] KVM test: Put os.kill in kvm_stat into try block to avoid traceback

2010-03-29 Thread Lucas Meneghel Rodrigues
On Mon, 2010-03-29 at 17:56 +0800, Yolkfull Chow wrote:
> Sometimes it tries to kill an already terminated process, which causes
> a traceback. This patch fixes the problem.

Thanks, applied!

> Signed-off-by: Yolkfull Chow 
> ---
>  client/profilers/kvm_stat/kvm_stat.py |5 -
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/client/profilers/kvm_stat/kvm_stat.py 
> b/client/profilers/kvm_stat/kvm_stat.py
> index 7568a03..59d6ff6 100644
> --- a/client/profilers/kvm_stat/kvm_stat.py
> +++ b/client/profilers/kvm_stat/kvm_stat.py
> @@ -51,7 +51,10 @@ class kvm_stat(profiler.profiler):
>  
>  @param test: Autotest test on which this profiler will operate on.
>  """
> -os.kill(self.pid, 15)
> +try:
> +os.kill(self.pid, 15)
> +except OSError:
> +pass
>  
> 
>  def report(self, test):




Re: Clocksource tsc unstable (delta = -4398046474878 ns)

2010-03-29 Thread Athanasius
On Sun, Mar 28, 2010 at 01:46:35PM +0200, Sebastian Hetze wrote:
> this message appeared in the KVM guest kern.log last night:
> 
> Mar 27 22:35:30 guest kernel: [260041.559462] Clocksource tsc unstable (delta 
> = -4398046474878 ns)
> 
> The guest is running a 2.6.31-20-generic-pae ubuntu kernel with
> hrtimer-tune-hrtimer_interrupt-hang-logic.patch applied.
> 
> If I understand things correct, in kernel/time/clocksource.c
> clocksource_watchdog() checks all the
> /sys/devices/system/clocksource/clocksource0/available_clocksource
> every 0.5sec for an delta of more than 0.0625s. So the tsc must have
> changed more than one hour within two subsequent calls of
> clocksource_watchdog. No event in the host nor anything in the
> guest gives reasonable cause for this step.
> 
> However, the number 4398046474878 is only 36226 ns away from
> 4*1024*1024*1024*1024

  I didn't see any such messages but I've had a recent experience with
the time on one KVM guest leaping *forwards* approx. 5 and 2.5 hours in
two separate incidents.  Eerily, the exact jumps, as best I can tell from
the logs, are 17592 and 8796 seconds, give or take a second or two.  If
you look at these as nanoseconds then that's 'exactly' 2^44 and 2^43
nanoseconds.
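The arithmetic behind those two observations, and the 2^42 delta from the original report, checks out:

```python
# jump sizes observed in the logs, in seconds
jumps = [17592, 8796]

# 2^44 ns and 2^43 ns, rounded to whole seconds, match the two jumps
assert [round(2 ** 44 / 1e9), round(2 ** 43 / 1e9)] == jumps

# and the clocksource delta from the earlier report sits just
# 36226 ns (~36 us) below 4*1024^4 = 2^42 ns
assert 4 * 1024 ** 4 - 4398046474878 == 36226
```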
  What I've done that seems to have avoided this happening again is to drop
the KVM_CLOCK kernel option from the KVM guests' kernel.

  This is with a Debian squeeze (testing) KVM host running 2.6.33 from
vanilla sources and my own config.  The guests are Debian lenny
(stable) and were also running a 2.6.33 kernel from vanilla sources and
my own (different, to match the virtual hardware in a KVM guest) config.
Both systems/kernels are 64 bit.  The base machine is a Dell R210 with
an Intel Xeon X3450 quad-core CPU, with the hyper-threading enabled to
give 8 visible CPUs in Linux.  This only happened on one of the two
guests, the much busier one (it does shell accounts, email, IMAP/POP3, a
small news server and NFS serves web pages to the other guest which only
runs apache2 and nagios3).
  It took around 2-3 days to see the problem both times.  Without
KVM_CLOCK it's been up and stable for well over a week now.  Without
KVM_CLOCK the only clocksource is acpi_pm and thus that is being used.
I didn't test forcing that with a boot-time parameter and KVM_CLOCK
still enabled.

  Given turning KVM_CLOCK off fixed my problem and the problem repeating
itself causes all manner of trouble given how busy the machine is I'm
not really willing to test alternative fixes.

-- 
- Athanasius = Athanasius(at)miggy.org / http://www.miggy.org/
  Finger athan(at)fysh.org for PGP key
   "And it's me who is my enemy. Me who beats me up.
Me who makes the monsters. Me who strips my confidence." Paula Cole - ME




[PATCH] KVM test: Put os.kill in kvm_stat into try block to avoid traceback

2010-03-29 Thread Yolkfull Chow
Sometimes it tries to kill an already terminated process, which causes
a traceback. This patch fixes the problem.

Signed-off-by: Yolkfull Chow 
---
 client/profilers/kvm_stat/kvm_stat.py |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/client/profilers/kvm_stat/kvm_stat.py 
b/client/profilers/kvm_stat/kvm_stat.py
index 7568a03..59d6ff6 100644
--- a/client/profilers/kvm_stat/kvm_stat.py
+++ b/client/profilers/kvm_stat/kvm_stat.py
@@ -51,7 +51,10 @@ class kvm_stat(profiler.profiler):
 
 @param test: Autotest test on which this profiler will operate on.
 """
-os.kill(self.pid, 15)
+try:
+os.kill(self.pid, 15)
+except OSError:
+pass
 
 
 def report(self, test):
-- 
1.7.0.1
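The race the patch guards against is easy to demonstrate standalone; a sketch of the same pattern (the helper name is mine, not autotest's):

```python
import os
import signal
import subprocess

def safe_kill(pid, sig=signal.SIGTERM):
    """Send sig to pid, tolerating the race where it already exited."""
    try:
        os.kill(pid, sig)
        return True
    except OSError:       # ESRCH: no such process - it already terminated
        return False
```

os.kill() on an already-reaped PID raises OSError (ESRCH), which is exactly the traceback the profiler hit.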
