Re: large amount of NMI_INTERRUPT disgrade winxp VM performance much.

2011-08-10 Thread Avi Kivity

On 08/11/2011 08:41 AM, Tian, Kevin wrote:

>   qemu-system-x86-4454  [004]   549.958172: kvm_exit:
>  reason EXCEPTION_NMI rip 0x8051d5e1
>   qemu-system-x86-4454  [004]   549.958172: kvm_page_fault:
>  address c8f8a000 error_code b

error_code 'b' means the page fault is caused by a write access in kernel
space, but related page entry has reserved bit set. This is usually used by
OS for some special tricks, e.g. to handle page swaps. You may check related
setting in win guest.



In this case it was kvm setting the reserved bit.

Looking at your other trace, that server is probably an AMD with the NPT 
feature, while your first one is an Intel without the equivalent (EPT).  
For this type of workload, you need EPT or NPT.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: large amount of NMI_INTERRUPT disgrade winxp VM performance much.

2011-08-10 Thread ya su
To clear the problem from guest settings, I run the same winxp image
on another server with the same kernel/qemu-kvm/command. the copy is
fast. so I think this problem relates only with some kind of host's
special hardware. the fast server's trace-cmd output as the following:

 qemu-system-x86-7681  [001] 20054.604841: kvm_entry:vcpu 0
 qemu-system-x86-7681  [001] 20054.604842: kvm_exit:
reason UNKNOWN rip 0x806e7d33
 qemu-system-x86-7681  [001] 20054.604842: kvm_page_fault:
address fee000b0 error_code 6
 qemu-system-x86-7681  [001] 20054.604843: kvm_mmio: mmio
write len 4 gpa 0xfee000b0 val 0x0
 qemu-system-x86-7681  [001] 20054.604843: kvm_apic:
apic_write APIC_EOI = 0x0
 qemu-system-x86-7681  [001] 20054.604844: kvm_entry:vcpu 0
 qemu-system-x86-7681  [001] 20054.604917: kvm_exit:
reason UNKNOWN rip 0xbff63b14
 qemu-system-x86-7681  [001] 20054.604917: kvm_page_fault:
address b8040 error_code 4
 qemu-system-x86-7681  [001] 20054.604920: kvm_mmio: mmio
unsatisfied-read len 1 gpa 0xb8040 val 0x0
 qemu-system-x86-7681  [001] 20054.604923: kvm_mmio: mmio
read len 1 gpa 0xb8040 val 0x0
 qemu-system-x86-7681  [001] 20054.604924: kvm_mmio: mmio
write len 1 gpa 0xb8040 val 0x0
 qemu-system-x86-7681  [001] 20054.604925: kvm_entry:vcpu 0
 qemu-system-x86-7681  [001] 20054.604926: kvm_exit:
reason UNKNOWN rip 0xbff63b1a
 qemu-system-x86-7681  [001] 20054.604927: kvm_page_fault:
address b801a error_code 6
 qemu-system-x86-7681  [001] 20054.604928: kvm_mmio: mmio
write len 1 gpa 0xb801a val 0xd
 qemu-system-x86-7681  [001] 20054.604928: kvm_entry:vcpu 0
 qemu-system-x86-7681  [001] 20054.604929: kvm_exit:
reason UNKNOWN rip 0xbff63b23
 qemu-system-x86-7681  [001] 20054.604929: kvm_page_fault:
address b8014 error_code 6
 qemu-system-x86-7681  [001] 20054.604930: kvm_mmio: mmio
write len 4 gpa 0xb8014 val 0x15f900

   According to Tian's  suggest, the NMI is produced by guest's write
to reservd page, Is there any way to find out why the slow-copy server
reserve the memory page?

   I checked the server's memory, there is large free space, no swap is used.

   and I have tested with server with  kernel 2.6.39, the problem remains.


Regards.

Suya.

2011/8/11 Tian, Kevin :
>> From: ya su
>> Sent: Thursday, August 11, 2011 11:57 AM
>>
>>  When I run winxp guest on one server, copy one file about 4G will
>> take a time of 40-50 min; if I run a FC14 guest, it will take about
>> 2-3 min;
>>
>>  I copy and run the winxp image on another server, it works well, take
>> about 3min.
>>
>>  I run trace-cmd while copying files, the main difference of  the two
>> outputs is that: the slow one's output have many NMI_INTERRUPT
>> vm_exit, while the fast output has no such vm_exit. both of the two
>> servers have NMI enabled default. the slow one's output as the
>> following:
>>  qemu-system-x86-4454  [004]   549.958147: kvm_entry:
>> vcpu 0
>>  qemu-system-x86-4454  [004]   549.958172: kvm_exit:
>> reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958172: kvm_page_fault:
>> address c8f8a000 error_code b
>>  qemu-system-x86-4454  [004]   549.958177: kvm_entry:
>> vcpu 0
>>  qemu-system-x86-4454  [004]   549.958202: kvm_exit:
>> reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958204: kvm_page_fault:
>> address c8f8b000 error_code b
>>  qemu-system-x86-4454  [004]   549.958209: kvm_entry:
>> vcpu 0
>>  qemu-system-x86-4454  [004]   549.958234: kvm_exit:
>> reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958234: kvm_page_fault:
>> address c8f8c000 error_code b
>>  qemu-system-x86-4454  [004]   549.958239: kvm_entry:
>> vcpu 0
>>  qemu-system-x86-4454  [004]   549.958264: kvm_exit:
>> reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958264: kvm_page_fault:
>> address c8f8d000 error_code b
>>  qemu-system-x86-4454  [004]   549.958267: kvm_entry:
>> vcpu 0
>>  qemu-system-x86-4454  [004]   549.958292: kvm_exit:
>> reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958294: kvm_page_fault:
>> address c8f8e000 error_code b
>>  qemu-system-x86-4454  [004]   549.958299: kvm_entry:
>> vcpu 0
>>  qemu-system-x86-4454  [004]   549.958324: kvm_exit:
>> reason EXCEPTION_NMI rip 0x8051d5e1
>>  qemu-system-x86-4454  [004]   549.958324: kvm_page_fault:
>> address c8f8f000 error_code b
>>  qemu-system-x86-4454  [004]   549.958329: kvm_entry:
>> vcpu 0
>>  qemu-system-x86-4454  [004]   549.958447: kvm_exit:
>> reason EXTERNAL_INTERRUPT rip 0x80547ac8
>>  qemu-system-x86-4454  [004]   549.958450: kvm_entry:
>> vcpu 0
>>  qemu-system-x86-4454  [004]   549.958461: kvm_exit:
>> reason CR_ACCESS rip 0x8054428c
>>  qemu-system-x86-4454  [004]   549.958461: kvm_cr:
>> cr_write 0 = 0x80010031
>>  qemu-system-x86-4454  [004]   549.958541: kvm_entry:
>> vcpu 0
>>  qemu-system-x86-4454  [004]   549.958573: kvm_exit:
>> reason

Fix refcounting in hugetlbfs quota handling

2011-08-10 Thread David Gibson
Linus, please apply

hugetlbfs tracks the current usage of hugepages per hugetlbfs
mountpoint.  To correctly track this when hugepages are released, it
must find the right hugetlbfs super_block from the struct page
available in free_huge_page().

It does this by storing a pointer to the hugepage's struct
address_space in the page_private data.  The hugetlb_{get,put}_quota
functions go from this mapping to the inode and thence to the
super_block.

However, this usage is buggy, because nothing ensures that the
address_space is not freed before all the hugepages that belonged to
it are.  In practice that will usually be the case, but if extra page
references have been taken by e.g. drivers or kvm doing
get_user_pages() then the file, inode and address space may be
destroyed before all the pages.

In addition, the quota functions use the mapping only to get the inode
then the super_block.  However, most of the callers already have the
inode anyway and have to get the mapping from there.

This patch, therefore, stores a pointer to the inode instead of the
address_space in the page private data for hugepages.  More
importantly it correctly adjusts the reference count on the inodes
when they're added to the page private data.  This ensures that the
inode (and therefore the super block) will not be freed before we use
it from free_huge_page.

Signed-off-by: David Gibson 

Index: working-2.6/fs/hugetlbfs/inode.c
===
--- working-2.6.orig/fs/hugetlbfs/inode.c   2011-08-10 16:45:47.864758406 
+1000
+++ working-2.6/fs/hugetlbfs/inode.c2011-08-10 17:22:21.899638039 +1000
@@ -874,10 +874,10 @@ out_free:
return -ENOMEM;
 }
 
-int hugetlb_get_quota(struct address_space *mapping, long delta)
+int hugetlb_get_quota(struct inode *inode, long delta)
 {
int ret = 0;
-   struct hugetlbfs_sb_info *sbinfo = HUGETLBFS_SB(mapping->host->i_sb);
+   struct hugetlbfs_sb_info *sbinfo = HUGETLBFS_SB(inode->i_sb);
 
if (sbinfo->free_blocks > -1) {
spin_lock(&sbinfo->stat_lock);
@@ -891,9 +891,9 @@ int hugetlb_get_quota(struct address_spa
return ret;
 }
 
-void hugetlb_put_quota(struct address_space *mapping, long delta)
+void hugetlb_put_quota(struct inode *inode, long delta)
 {
-   struct hugetlbfs_sb_info *sbinfo = HUGETLBFS_SB(mapping->host->i_sb);
+   struct hugetlbfs_sb_info *sbinfo = HUGETLBFS_SB(inode->i_sb);
 
if (sbinfo->free_blocks > -1) {
spin_lock(&sbinfo->stat_lock);
Index: working-2.6/include/linux/hugetlb.h
===
--- working-2.6.orig/include/linux/hugetlb.h2011-08-10 16:58:27.952527484 
+1000
+++ working-2.6/include/linux/hugetlb.h 2011-08-10 17:22:08.723572707 +1000
@@ -171,8 +171,8 @@ extern const struct file_operations huge
 extern const struct vm_operations_struct hugetlb_vm_ops;
 struct file *hugetlb_file_setup(const char *name, size_t size, vm_flags_t acct,
struct user_struct **user, int creat_flags);
-int hugetlb_get_quota(struct address_space *mapping, long delta);
-void hugetlb_put_quota(struct address_space *mapping, long delta);
+int hugetlb_get_quota(struct inode *inode, long delta);
+void hugetlb_put_quota(struct inode *inode, long delta);
 
 static inline int is_file_hugepages(struct file *file)
 {
Index: working-2.6/mm/hugetlb.c
===
--- working-2.6.orig/mm/hugetlb.c   2011-08-10 16:44:12.212284092 +1000
+++ working-2.6/mm/hugetlb.c2011-08-10 17:21:49.603477888 +1000
@@ -533,10 +533,12 @@ static void free_huge_page(struct page *
 */
struct hstate *h = page_hstate(page);
int nid = page_to_nid(page);
-   struct address_space *mapping;
+   struct inode *inode;
 
-   mapping = (struct address_space *) page_private(page);
+   inode = (struct inode *) page_private(page);
set_page_private(page, 0);
+   iput(inode);
+
page->mapping = NULL;
BUG_ON(page_count(page));
BUG_ON(page_mapcount(page));
@@ -551,8 +553,8 @@ static void free_huge_page(struct page *
enqueue_huge_page(h, page);
}
spin_unlock(&hugetlb_lock);
-   if (mapping)
-   hugetlb_put_quota(mapping, 1);
+   if (inode)
+   hugetlb_put_quota(inode, 1);
 }
 
 static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
@@ -1035,7 +1037,7 @@ static struct page *alloc_huge_page(stru
if (chg < 0)
return ERR_PTR(-VM_FAULT_OOM);
if (chg)
-   if (hugetlb_get_quota(inode->i_mapping, chg))
+   if (hugetlb_get_quota(inode, chg))
return ERR_PTR(-VM_FAULT_SIGBUS);
 
spin_lock(&hugetlb_lock);
@@ -1045,12 +1047,12 @@ static struct page *alloc_huge_page(stru
if (!page) {
page = alloc_buddy_

Re: kvm linux guest hanging for minutes at a time

2011-08-10 Thread Avi Kivity

On 08/09/2011 06:33 PM, Nick wrote:

Hi,

Just joined this list, looking for leads to solve a similar-sounding problem
(guest processes hanging for seconds or minutes when host IO load is high).
I'll say more in a separate email, but I caught the end of this thread and
wanted to ask about kvm-clock.

Naively I'd have thought that using the wrong clock would not actually *cause*
hangs like this. Or is that what you're implying?




Using the wrong clock easily causes hangs.  The system schedules a 
wakeup in 3 ms, wrong clock causes it to wakeup in 3 years, you get a 
hang for (3 years - 3 ms).


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Permit -mem-path without sync mmu

2011-08-10 Thread David Gibson
On Wed, Aug 10, 2011 at 12:01:04PM +0300, Avi Kivity wrote:
> On 08/10/2011 08:10 AM, David Gibson wrote:
> >On Mon, Aug 08, 2011 at 11:24:09AM +0300, Avi Kivity wrote:
> >>  On 08/08/2011 09:03 AM, David Gibson wrote:
[snip]
> >This would seem to be a genuine bug in the hugepage code, which has
> >just been hidden by SYNC_MMU.  It should be quite easy to fix - the
> >mapping is only stored in the struct page to get to the hugetlbfs
> >superblock, so we could just store a direct superblock pointer
> >instead, and bump it's refcount when we put that in the page private
> >pointer.
> >
> >But then I'm not sure how qemu would detect that it's on a kernel
> >where the bug is fixed and allow -mem-path to be used again.  Any
> >ideas?
> 
> If it's just a kernel bug, the fix belongs in the kernel, not in qemu.

Obviously.

> We used to have KVM_CAPs to declare this sort of thing
> (KVM_CAP_HUGETLBFS_WORKS_EVEN_WITHOUT_SYNC_MMU) but I don't think it
> was a good idea.

I tend to agree - especially since there's nothing actually kvm
specific about this bug.  AFAICT a driver which did gup on hugepages
could trigger the bug equally well.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v4 23/39] lsi53c895a: convert to memory API

2011-08-10 Thread Avi Kivity

On 08/10/2011 10:28 PM, Gerhard Wiesinger wrote:

Hello Avi,

qemu/qemu-kvm doesn't boot any more with LSI option rom (-option-rom 
8xx_64.rom). Guess it comes from changes of the memory API. Also 
latest git version from seabios.


Sorry, no further details currently available.



Thanks for your report.  I was able to reproduce the problem and will 
send out patches shortly.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] kvm tools: check negative value of num_pages

2011-08-10 Thread walimis
On Thu, Aug 11, 2011 at 08:37:01AM +0300, Sasha Levin wrote:
>On Thu, 2011-08-11 at 13:07 +0800, Liming Wang wrote:
>> If num_pages is negative, balloon will make kernel crash with
>> "out of memory". So we check this value to avoid it to be negative.
>> 
>> Signed-off-by: Liming Wang 
>> ---
>>  tools/kvm/virtio/balloon.c |7 ++-
>>  1 files changed, 6 insertions(+), 1 deletions(-)
>> 
>> diff --git a/tools/kvm/virtio/balloon.c b/tools/kvm/virtio/balloon.c
>> index 854d04b..0223ee4 100644
>> --- a/tools/kvm/virtio/balloon.c
>> +++ b/tools/kvm/virtio/balloon.c
>> @@ -222,8 +222,13 @@ static void handle_sigmem(int sig)
>>  {
>>  if (sig == SIGKVMADDMEM)
>>  bdev.config.num_pages += 256;
>> -else
>> +else {
>>  bdev.config.num_pages -= 256;
>> +if ((s32)bdev.config.num_pages < 0){
>
>imo it's worth doing this check before the decrement instead of casting
>to signed here.
You mean that check whether num_pages less than 256 or equal to 0?

>
>you also need to wrap the 'if ()' with parenthesis if you add them to
>the 'else' case.
Sorry, I'm not clear what that does mean?

walimis
>
>> +bdev.config.num_pages = 0;
>> +return;
>> +}
>> +}
>>  
>>  /* Notify that the configuration space has changed */
>>  bdev.isr = VIRTIO_PCI_ISR_CONFIG;
>
>-- 
>
>Sasha.
>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: large amount of NMI_INTERRUPT disgrade winxp VM performance much.

2011-08-10 Thread Tian, Kevin
> From: ya su
> Sent: Thursday, August 11, 2011 11:57 AM
> 
>  When I run winxp guest on one server, copy one file about 4G will
> take a time of 40-50 min; if I run a FC14 guest, it will take about
> 2-3 min;
> 
>  I copy and run the winxp image on another server, it works well, take
> about 3min.
> 
>  I run trace-cmd while copying files, the main difference of  the two
> outputs is that: the slow one's output have many NMI_INTERRUPT
> vm_exit, while the fast output has no such vm_exit. both of the two
> servers have NMI enabled default. the slow one's output as the
> following:
>  qemu-system-x86-4454  [004]   549.958147: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958172: kvm_exit:
> reason EXCEPTION_NMI rip 0x8051d5e1
>  qemu-system-x86-4454  [004]   549.958172: kvm_page_fault:
> address c8f8a000 error_code b
>  qemu-system-x86-4454  [004]   549.958177: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958202: kvm_exit:
> reason EXCEPTION_NMI rip 0x8051d5e1
>  qemu-system-x86-4454  [004]   549.958204: kvm_page_fault:
> address c8f8b000 error_code b
>  qemu-system-x86-4454  [004]   549.958209: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958234: kvm_exit:
> reason EXCEPTION_NMI rip 0x8051d5e1
>  qemu-system-x86-4454  [004]   549.958234: kvm_page_fault:
> address c8f8c000 error_code b
>  qemu-system-x86-4454  [004]   549.958239: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958264: kvm_exit:
> reason EXCEPTION_NMI rip 0x8051d5e1
>  qemu-system-x86-4454  [004]   549.958264: kvm_page_fault:
> address c8f8d000 error_code b
>  qemu-system-x86-4454  [004]   549.958267: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958292: kvm_exit:
> reason EXCEPTION_NMI rip 0x8051d5e1
>  qemu-system-x86-4454  [004]   549.958294: kvm_page_fault:
> address c8f8e000 error_code b
>  qemu-system-x86-4454  [004]   549.958299: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958324: kvm_exit:
> reason EXCEPTION_NMI rip 0x8051d5e1
>  qemu-system-x86-4454  [004]   549.958324: kvm_page_fault:
> address c8f8f000 error_code b
>  qemu-system-x86-4454  [004]   549.958329: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958447: kvm_exit:
> reason EXTERNAL_INTERRUPT rip 0x80547ac8
>  qemu-system-x86-4454  [004]   549.958450: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958461: kvm_exit:
> reason CR_ACCESS rip 0x8054428c
>  qemu-system-x86-4454  [004]   549.958461: kvm_cr:
> cr_write 0 = 0x80010031
>  qemu-system-x86-4454  [004]   549.958541: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958573: kvm_exit:
> reason CR_ACCESS rip 0x80546beb
>  qemu-system-x86-4454  [004]   549.958575: kvm_cr:
> cr_write 0 = 0x8001003b
>  qemu-system-x86-4454  [004]   549.958585: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958610: kvm_exit:
> reason CR_ACCESS rip 0x80546b6c
>  qemu-system-x86-4454  [004]   549.958610: kvm_cr:
> cr_write 3 = 0x6e00020
>  qemu-system-x86-4454  [004]   549.958621: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958645: kvm_exit:
> reason EXCEPTION_NMI rip 0x8051d7f4
>  qemu-system-x86-4454  [004]   549.958645: kvm_page_fault:
> address c0648200 error_code 3
>  qemu-system-x86-4454  [004]   549.958653: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958725: kvm_exit:
> reason EXCEPTION_NMI rip 0x8050a26a
>  qemu-system-x86-4454  [004]   549.958726: kvm_page_fault:
> address c0796994 error_code 3
>  qemu-system-x86-4454  [004]   549.958738: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958750: kvm_exit:
> reason IO_INSTRUCTION rip 0x806edad0
>  qemu-system-x86-4454  [004]   549.958750: kvm_pio:
> pio_write at 0xc050 size 2 count 1
>  qemu-system-x86-4454  [004]   549.958838: kvm_entry:
> vcpu 0
>  qemu-system-x86-4454  [004]   549.958844: kvm_exit:
> reason APIC_ACCESS rip 0x806e7b85
>  qemu-system-x86-4454  [004]   549.958852: kvm_apic:
> apic_read APIC_ICR = 0x40041
>  qemu-system-x86-4454  [004]   549.958855: kvm_mmio:
> mmio
> read len 4 gpa 0xfee00300 val 0x40041
>  qemu-system-x86-4454  [004]   549.958857: kvm_mmio:
> mmio
> write len 4 gpa 0xfee00300 val 0x40041
>  qemu-system-x86-4454  [004]   549.958858: kvm_apic:
> apic_write APIC_ICR = 0x40041
>  qemu-system-x86-4454  [004]   549.958860: kvm_apic_ipi: dst 1
> vec 65 (Fixed|physical|de-assert|edge|self)
>  qemu-system-x86-4454  [004]   549.958860: kvm_apic_accept_irq:
> apicid 0 vec 65 (Fixed|edge)
> 
>  Even I disable the NMI when I boot the kernel with nmi_watchdog=0,
> the trace-cmd output still show there are many NMI_INTERRUPT. I find
> that in /proc/interrupts, the amount of NMI is 0. Does this mean that
> NMI is produced in winxp guest OS, or this setting can not hinder kvm
> to catch NMI interrupt?
> 
>   I think the difference between FC14 and winxp is that: fc14 process
> the NMI interrupt correctly, but winxp can not, is this right?
> 
>   I run qemu-kvm with version 0.14.0, kernel version 2.6.32-131.6.4. I
> change kvm

Re: [PATCH 2/3] kvm tools: check negative value of num_pages

2011-08-10 Thread Sasha Levin
On Thu, 2011-08-11 at 13:07 +0800, Liming Wang wrote:
> If num_pages is negative, balloon will make kernel crash with
> "out of memory". So we check this value to avoid it to be negative.
> 
> Signed-off-by: Liming Wang 
> ---
>  tools/kvm/virtio/balloon.c |7 ++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
> 
> diff --git a/tools/kvm/virtio/balloon.c b/tools/kvm/virtio/balloon.c
> index 854d04b..0223ee4 100644
> --- a/tools/kvm/virtio/balloon.c
> +++ b/tools/kvm/virtio/balloon.c
> @@ -222,8 +222,13 @@ static void handle_sigmem(int sig)
>  {
>   if (sig == SIGKVMADDMEM)
>   bdev.config.num_pages += 256;
> - else
> + else {
>   bdev.config.num_pages -= 256;
> + if ((s32)bdev.config.num_pages < 0){

imo it's worth doing this check before the decrement instead of casting
to signed here.

you also need to wrap the 'if ()' with parenthesis if you add them to
the 'else' case.

> + bdev.config.num_pages = 0;
> + return;
> + }
> + }
>  
>   /* Notify that the configuration space has changed */
>   bdev.isr = VIRTIO_PCI_ISR_CONFIG;

-- 

Sasha.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 2/4] block: add the block queue support

2011-08-10 Thread Zhi Yong Wu
On Wed, Aug 10, 2011 at 5:37 PM, Stefan Hajnoczi
 wrote:
> On Wed, Aug 10, 2011 at 01:54:33PM +0800, Zhi Yong Wu wrote:
>> On Tue, Aug 9, 2011 at 8:49 PM, Stefan Hajnoczi  wrote:
>> > On Tue, Aug 9, 2011 at 5:17 AM, Zhi Yong Wu  
>> > wrote:
>> >> +BlockDriverAIOCB *qemu_block_queue_enqueue(BlockQueue *queue,
>> >> +                        BlockDriverState *bs,
>> >> +                        BlockRequestHandler *handler,
>> >> +                        int64_t sector_num,
>> >> +                        QEMUIOVector *qiov,
>> >> +                        int nb_sectors,
>> >> +                        BlockDriverCompletionFunc *cb,
>> >> +                        void *opaque)
>> >> +{
>> >> +    BlockIORequest *request;
>> >> +    BlockDriverAIOCB *acb;
>> >> +
>> >> +    request = qemu_malloc(sizeof(BlockIORequest));
>> >> +    request->bs = bs;
>> >> +    request->handler = handler;
>> >> +    request->sector_num = sector_num;
>> >> +    request->qiov = qiov;
>> >> +    request->nb_sectors = nb_sectors;
>> >> +    request->cb = cb;
>> >> +    request->opaque = opaque;
>> >> +
>> >> +    QTAILQ_INSERT_TAIL(&queue->requests, request, entry);
>> >
>> > It would be simpler to define BlockQueueAIOCB and using it as our acb
>> > instead of managing an extra BlockIORequest structure.  That way you
>> > don't need to worry about extra mallocs and frees.
>> Sorry, i don't get what you mean. how to define it? Can you elaborate?
>
> BlockDriverAIOCB is designed to be embedded inside a bigger struct.  For
> example, QEDAIOCB is a larger struct that contains BlockDriverAIOCB as
> its first field:
>
> typedef struct QEDAIOCB {
>    BlockDriverAIOCB common;
>    ...
> } QEDAIOCB;
>
> And the QED AIOPool contains the size of QEDAIOCB so that qemu_aio_get() can
> allocate the full QEDAIOCB struct:
>
> static AIOPool qed_aio_pool = {
>    .aiocb_size         = sizeof(QEDAIOCB),
>    .cancel             = qed_aio_cancel,
> };
>
> This allows QED to store per-request state in QEDAIOCB for the lifetime of a
> request:
>
> QEDAIOCB *acb = qemu_aio_get(&qed_aio_pool, bs, cb, opaque);
>
> acb->is_write = is_write;
> acb->finished = NULL;
> acb->qiov = qiov;
> ...
>
> I suggest creating a BlockQueueAIOCB that contains the fields from
> BlockIORequest (which is no longer needed as a separate struct):
>
> typedef struct BlockQueueAIOCB {
>    BlockDriverAIOCB common;
>    BlockRequestHandler *handler;
>    int64_t sector_num;
>    QEMUIOVector *qiov;
>    int nb_sectors;
>    QTAILQ_ENTRY(BlockQueueAIOCB) entry; /* pending request queue */
> } BlockQueueAIOCB;
Its name is a bit confusing. One ACB is inserted into block queue.
>
> Now you can drop the malloc and simply qemu_aio_get() a new
> BlockQueueAIOCB.
>
> Stefan
>



-- 
Regards,

Zhi Yong Wu
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v5 1/4] block: add the command line support

2011-08-10 Thread Stefan Hajnoczi
On Thu, Aug 11, 2011 at 12:44:11PM +0800, Zhi Yong Wu wrote:
> On Wed, Aug 10, 2011 at 5:27 PM, Stefan Hajnoczi
>  wrote:
> > On Wed, Aug 10, 2011 at 01:20:22PM +0800, Zhi Yong Wu wrote:
> >> On Tue, Aug 9, 2011 at 8:25 PM, Stefan Hajnoczi  wrote:
> >> > On Tue, Aug 9, 2011 at 5:17 AM, Zhi Yong Wu  
> >> > wrote:
> >> >> Signed-off-by: Zhi Yong Wu 
> >> >> ---
> >> >>  Makefile.objs   |    2 +-
> >> >>  blockdev.c      |   39 +++
> >> >>  qemu-config.c   |   24 
> >> >>  qemu-option.c   |   17 +
> >> >>  qemu-option.h   |    1 +
> >> >>  qemu-options.hx |    1 +
> >> >>  6 files changed, 83 insertions(+), 1 deletions(-)
> >> >>
> >> >> diff --git a/Makefile.objs b/Makefile.objs
> >> >> index 9f99ed4..06f2033 100644
> >> >> --- a/Makefile.objs
> >> >> +++ b/Makefile.objs
> >> >> @@ -23,7 +23,7 @@ block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o 
> >> >> cloop.o dmg.o bochs.o vpc.o vv
> >> >>  block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o 
> >> >> qcow2-snapshot.o qcow2-cache.o
> >> >>  block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o 
> >> >> qed-cluster.o
> >> >>  block-nested-y += qed-check.o
> >> >> -block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
> >> >> +block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o 
> >> >> blk-queue.o
> >> >
> >> > This does not build:
> >> >  LINK  qemu-ga
> >> > gcc: error: block/blk-queue.o: No such file or directory
> >> >
> >> > This Makefile.objs change should be in the commit that adds blk-queue.c.
> >> >
> >> > Each patch in a series should compile cleanly and can only depend on
> >> > previous patches.  This is important so that git-bisect(1) can be
> >> > used, it only works if every commit builds a working program.  It also
> >> > makes patch review easier when the patch series builds up logically.
> >> It seems that it will take a bit much time if we strictly stage the
> >> hunks into each corresponding patch.:)
> >> OK, i will.
> >
> > Some people like using Stacked Git to manage patch series:
> > http://www.procode.org/stgit/
> Let me try.
> >
> > I typically just use git rebase -i and git add -i manually to clean up
> > patch series.
> >
> > It also becomes easier once you plan to write patches that follow these
> > guidelines.
> OK
> >
> >> >> +    /* disk io throttling */
> >> >> +    iol_flag = qemu_opt_io_limits_enable_flag(opts, iol_opts);
> >> >> +    if (iol_flag) {
> >> >> +        memset(&io_limits, 0, sizeof(BlockIOLimit));
> >> >> +
> >> >> +        io_limits.bps[BLOCK_IO_LIMIT_TOTAL]  =
> >> >> +                           qemu_opt_get_number(opts, "bps", 0);
> >> >> +        io_limits.bps[BLOCK_IO_LIMIT_READ]   =
> >> >> +                           qemu_opt_get_number(opts, "bps_rd", 0);
> >> >> +        io_limits.bps[BLOCK_IO_LIMIT_WRITE]  =
> >> >> +                           qemu_opt_get_number(opts, "bps_wr", 0);
> >> >> +        io_limits.iops[BLOCK_IO_LIMIT_TOTAL] =
> >> >> +                           qemu_opt_get_number(opts, "iops", 0);
> >> >> +        io_limits.iops[BLOCK_IO_LIMIT_READ]  =
> >> >> +                           qemu_opt_get_number(opts, "iops_rd", 0);
> >> >> +        io_limits.iops[BLOCK_IO_LIMIT_WRITE] =
> >> >> +                           qemu_opt_get_number(opts, "iops_wr", 0);
> >> >> +
> >> >> +        if (((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] != 0)
> >> >> +            && ((io_limits.bps[BLOCK_IO_LIMIT_READ] != 0)
> >> >> +            || (io_limits.bps[BLOCK_IO_LIMIT_WRITE] != 0)))
> >> >> +            || ((io_limits.iops[BLOCK_IO_LIMIT_TOTAL] != 0)
> >> >> +            && ((io_limits.iops[BLOCK_IO_LIMIT_READ] != 0)
> >> >> +            || (io_limits.iops[BLOCK_IO_LIMIT_WRITE] != 0 {
> >> >> +            error_report("bps(iops) and bps_rd/bps_wr(iops_rd/iops_wr) 
> >> >> \
> >> >> +                                        cannot be used at the same 
> >> >> time");
> >> >> +            return NULL;
> >> >> +        }
> >> >> +    }
> >> >> +
> >> >>     on_write_error = BLOCK_ERR_STOP_ENOSPC;
> >> >>     if ((buf = qemu_opt_get(opts, "werror")) != NULL) {
> >> >>         if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && 
> >> >> type != IF_NONE) {
> >> >> @@ -483,6 +517,11 @@ DriveInfo *drive_init(QemuOpts *opts, int 
> >> >> default_to_scsi)
> >> >>
> >> >>     bdrv_set_on_error(dinfo->bdrv, on_read_error, on_write_error);
> >> >>
> >> >> +    /* disk I/O throttling */
> >> >> +    if (iol_flag) {
> >> >> +        bdrv_set_io_limits(dinfo->bdrv, &io_limits);
> >> >> +    }
> >> >
> >> > iol_flag and qemu_opt_io_limits_enable_flag() are not necessary.  If
> >> > no limits were set then all fields will be 0 (unlimited).
> >> Are they not necessary here? why? qemu_opt_io_limits_enable_flag is
> >> used to determine if io_limits is enabled.
> >> If yes, iol_flag will be set to ONE. So i think that they are necessay 
> >> here.
> >
> > There are two possible cases: the

[PATCH 3/3] kvm tools: enable keyboard press repeat for sdl

2011-08-10 Thread Liming Wang
Set keyboard repeat rate to enable keyboard press repeat.

It means that don't repeat the key value every 50 milliseconds
until 200 milliseconds later when the key is pressed.

Signed-off-by: Liming Wang 
---
 tools/kvm/ui/sdl.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/kvm/ui/sdl.c b/tools/kvm/ui/sdl.c
index 088cd29..6320ce7 100644
--- a/tools/kvm/ui/sdl.c
+++ b/tools/kvm/ui/sdl.c
@@ -99,6 +99,8 @@ static void *sdl__thread(void *p)
if (!screen)
die("Unable to set SDL video mode");
 
+   SDL_EnableKeyRepeat(200, 50);
+
for (;;) {
SDL_BlitSurface(guest_screen, NULL, screen, NULL);
SDL_Flip(screen);
-- 
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] kvm tools: add signal placeholder to avoid balloon crash

2011-08-10 Thread Liming Wang
If "kvm run" without balloon option, use "kvm balloon" may
crash kvm. Add signal placeholder to avoid this.

Signed-off-by: Liming Wang 
---
 tools/kvm/builtin-run.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/tools/kvm/builtin-run.c b/tools/kvm/builtin-run.c
index fa5de27..d725834 100644
--- a/tools/kvm/builtin-run.c
+++ b/tools/kvm/builtin-run.c
@@ -280,6 +280,10 @@ static void handle_sigstop(int sig)
kvm_cpu__reboot();
 }
 
+static void handle_empty(int sig)
+{
+}
+
 static void *kvm_cpu_thread(void *arg)
 {
current_kvm_cpu = arg;
@@ -469,6 +473,9 @@ int kvm_cmd_run(int argc, const char **argv, const char 
*prefix)
signal(SIGUSR2, handle_sigusr2);
signal(SIGKVMSTOP, handle_sigstop);
signal(SIGKVMRESUME, handle_sigusr2);
+   /* signal placeholder if not enable balloon */
+   signal(SIGKVMADDMEM, handle_empty);
+   signal(SIGKVMDELMEM, handle_empty);
 
nr_online_cpus = sysconf(_SC_NPROCESSORS_ONLN);
 
-- 
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] kvm tools: check negative value of num_pages

2011-08-10 Thread Liming Wang
If num_pages is negative, balloon will make kernel crash with
"out of memory". So we check this value to avoid it to be negative.

Signed-off-by: Liming Wang 
---
 tools/kvm/virtio/balloon.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/tools/kvm/virtio/balloon.c b/tools/kvm/virtio/balloon.c
index 854d04b..0223ee4 100644
--- a/tools/kvm/virtio/balloon.c
+++ b/tools/kvm/virtio/balloon.c
@@ -222,8 +222,13 @@ static void handle_sigmem(int sig)
 {
if (sig == SIGKVMADDMEM)
bdev.config.num_pages += 256;
-   else
+   else {
bdev.config.num_pages -= 256;
+   if ((s32)bdev.config.num_pages < 0){
+   bdev.config.num_pages = 0;
+   return;
+   }
+   }
 
/* Notify that the configuration space has changed */
bdev.isr = VIRTIO_PCI_ISR_CONFIG;
-- 
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 2/4] block: add the block queue support

2011-08-10 Thread Zhi Yong Wu
On Wed, Aug 10, 2011 at 5:37 PM, Stefan Hajnoczi
 wrote:
> On Wed, Aug 10, 2011 at 01:54:33PM +0800, Zhi Yong Wu wrote:
>> On Tue, Aug 9, 2011 at 8:49 PM, Stefan Hajnoczi  wrote:
>> > On Tue, Aug 9, 2011 at 5:17 AM, Zhi Yong Wu  
>> > wrote:
>> >> +BlockDriverAIOCB *qemu_block_queue_enqueue(BlockQueue *queue,
>> >> +                        BlockDriverState *bs,
>> >> +                        BlockRequestHandler *handler,
>> >> +                        int64_t sector_num,
>> >> +                        QEMUIOVector *qiov,
>> >> +                        int nb_sectors,
>> >> +                        BlockDriverCompletionFunc *cb,
>> >> +                        void *opaque)
>> >> +{
>> >> +    BlockIORequest *request;
>> >> +    BlockDriverAIOCB *acb;
>> >> +
>> >> +    request = qemu_malloc(sizeof(BlockIORequest));
>> >> +    request->bs = bs;
>> >> +    request->handler = handler;
>> >> +    request->sector_num = sector_num;
>> >> +    request->qiov = qiov;
>> >> +    request->nb_sectors = nb_sectors;
>> >> +    request->cb = cb;
>> >> +    request->opaque = opaque;
>> >> +
>> >> +    QTAILQ_INSERT_TAIL(&queue->requests, request, entry);
>> >
>> > It would be simpler to define BlockQueueAIOCB and using it as our acb
>> > instead of managing an extra BlockIORequest structure.  That way you
>> > don't need to worry about extra mallocs and frees.
>> Sorry, i don't get what you mean. how to define it? Can you elaborate?
>
> BlockDriverAIOCB is designed to be embedded inside a bigger struct.  For
> example, QEDAIOCB is a larger struct that contains BlockDriverAIOCB as
> its first field:
>
> typedef struct QEDAIOCB {
>    BlockDriverAIOCB common;
>    ...
> } QEDAIOCB;
>
> And the QED AIOPool contains the size of QEDAIOCB so that qemu_aio_get() can
> allocate the full QEDAIOCB struct:
>
> static AIOPool qed_aio_pool = {
>    .aiocb_size         = sizeof(QEDAIOCB),
>    .cancel             = qed_aio_cancel,
> };
>
> This allows QED to store per-request state in QEDAIOCB for the lifetime of a
> request:
>
> QEDAIOCB *acb = qemu_aio_get(&qed_aio_pool, bs, cb, opaque);
>
> acb->is_write = is_write;
> acb->finished = NULL;
> acb->qiov = qiov;
> ...
>
> I suggest creating a BlockQueueAIOCB that contains the fields from
> BlockIORequest (which is no longer needed as a separate struct):
>
> typedef struct BlockQueueAIOCB {
>    BlockDriverAIOCB common;
>    BlockRequestHandler *handler;
>    int64_t sector_num;
>    QEMUIOVector *qiov;
>    int nb_sectors;
>    QTAILQ_ENTRY(BlockQueueAIOCB) entry; /* pending request queue */
> } BlockQueueAIOCB;
>
> Now you can drop the malloc and simply qemu_aio_get() a new
> BlockQueueAIOCB.
Yesterday, i had actually done as this:). thanks.
>
> Stefan
>



-- 
Regards,

Zhi Yong Wu
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v5 1/4] block: add the command line support

2011-08-10 Thread Zhi Yong Wu
On Wed, Aug 10, 2011 at 5:27 PM, Stefan Hajnoczi
 wrote:
> On Wed, Aug 10, 2011 at 01:20:22PM +0800, Zhi Yong Wu wrote:
>> On Tue, Aug 9, 2011 at 8:25 PM, Stefan Hajnoczi  wrote:
>> > On Tue, Aug 9, 2011 at 5:17 AM, Zhi Yong Wu  
>> > wrote:
>> >> Signed-off-by: Zhi Yong Wu 
>> >> ---
>> >>  Makefile.objs   |    2 +-
>> >>  blockdev.c      |   39 +++
>> >>  qemu-config.c   |   24 
>> >>  qemu-option.c   |   17 +
>> >>  qemu-option.h   |    1 +
>> >>  qemu-options.hx |    1 +
>> >>  6 files changed, 83 insertions(+), 1 deletions(-)
>> >>
>> >> diff --git a/Makefile.objs b/Makefile.objs
>> >> index 9f99ed4..06f2033 100644
>> >> --- a/Makefile.objs
>> >> +++ b/Makefile.objs
>> >> @@ -23,7 +23,7 @@ block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o 
>> >> cloop.o dmg.o bochs.o vpc.o vv
>> >>  block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o 
>> >> qcow2-snapshot.o qcow2-cache.o
>> >>  block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o 
>> >> qed-cluster.o
>> >>  block-nested-y += qed-check.o
>> >> -block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
>> >> +block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o 
>> >> blk-queue.o
>> >
>> > This does not build:
>> >  LINK  qemu-ga
>> > gcc: error: block/blk-queue.o: No such file or directory
>> >
>> > This Makefile.objs change should be in the commit that adds blk-queue.c.
>> >
>> > Each patch in a series should compile cleanly and can only depend on
>> > previous patches.  This is important so that git-bisect(1) can be
>> > used, it only works if every commit builds a working program.  It also
>> > makes patch review easier when the patch series builds up logically.
>> It seems that it will take a bit much time if we strictly stage the
>> hunks into each corresponding patch.:)
>> OK, i will.
>
> Some people like using Stacked Git to manage patch series:
> http://www.procode.org/stgit/
Let me try.
>
> I typically just use git rebase -i and git add -i manually to clean up
> patch series.
>
> It also becomes easier once you plan to write patches that follow these
> guidelines.
OK
>
>> >> +    /* disk io throttling */
>> >> +    iol_flag = qemu_opt_io_limits_enable_flag(opts, iol_opts);
>> >> +    if (iol_flag) {
>> >> +        memset(&io_limits, 0, sizeof(BlockIOLimit));
>> >> +
>> >> +        io_limits.bps[BLOCK_IO_LIMIT_TOTAL]  =
>> >> +                           qemu_opt_get_number(opts, "bps", 0);
>> >> +        io_limits.bps[BLOCK_IO_LIMIT_READ]   =
>> >> +                           qemu_opt_get_number(opts, "bps_rd", 0);
>> >> +        io_limits.bps[BLOCK_IO_LIMIT_WRITE]  =
>> >> +                           qemu_opt_get_number(opts, "bps_wr", 0);
>> >> +        io_limits.iops[BLOCK_IO_LIMIT_TOTAL] =
>> >> +                           qemu_opt_get_number(opts, "iops", 0);
>> >> +        io_limits.iops[BLOCK_IO_LIMIT_READ]  =
>> >> +                           qemu_opt_get_number(opts, "iops_rd", 0);
>> >> +        io_limits.iops[BLOCK_IO_LIMIT_WRITE] =
>> >> +                           qemu_opt_get_number(opts, "iops_wr", 0);
>> >> +
>> >> +        if (((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] != 0)
>> >> +            && ((io_limits.bps[BLOCK_IO_LIMIT_READ] != 0)
>> >> +            || (io_limits.bps[BLOCK_IO_LIMIT_WRITE] != 0)))
>> >> +            || ((io_limits.iops[BLOCK_IO_LIMIT_TOTAL] != 0)
>> >> +            && ((io_limits.iops[BLOCK_IO_LIMIT_READ] != 0)
>> >> +            || (io_limits.iops[BLOCK_IO_LIMIT_WRITE] != 0 {
>> >> +            error_report("bps(iops) and bps_rd/bps_wr(iops_rd/iops_wr) \
>> >> +                                        cannot be used at the same 
>> >> time");
>> >> +            return NULL;
>> >> +        }
>> >> +    }
>> >> +
>> >>     on_write_error = BLOCK_ERR_STOP_ENOSPC;
>> >>     if ((buf = qemu_opt_get(opts, "werror")) != NULL) {
>> >>         if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && 
>> >> type != IF_NONE) {
>> >> @@ -483,6 +517,11 @@ DriveInfo *drive_init(QemuOpts *opts, int 
>> >> default_to_scsi)
>> >>
>> >>     bdrv_set_on_error(dinfo->bdrv, on_read_error, on_write_error);
>> >>
>> >> +    /* disk I/O throttling */
>> >> +    if (iol_flag) {
>> >> +        bdrv_set_io_limits(dinfo->bdrv, &io_limits);
>> >> +    }
>> >
>> > iol_flag and qemu_opt_io_limits_enable_flag() are not necessary.  If
>> > no limits were set then all fields will be 0 (unlimited).
>> Are they not necessary here? why? qemu_opt_io_limits_enable_flag is
>> used to determine if io_limits is enabled.
>> If yes, iol_flag will be set to ONE. So i think that they are necessay here.
>
> There are two possible cases: the user does not set any options or the
> user sets at least one option.  In both cases io_limits will be
> initialized correctly, here is why:
>
> When an option is not specified by the user the value will be 0, which
> means "unlimited".  bdrv_set_io_limits() calls
> bdrv_io_limits_enabled(b

large amount of NMI_INTERRUPT disgrade winxp VM performance much.

2011-08-10 Thread ya su
 When I run winxp guest on one server, copy one file about 4G will
take a time of 40-50 min; if I run a FC14 guest, it will take about
2-3 min;

 I copy and run the winxp image on another server, it works well, take
about 3min.

 I run trace-cmd while copying files, the main difference of  the two
outputs is that: the slow one's output have many NMI_INTERRUPT
vm_exit, while the fast output has no such vm_exit. both of the two
servers have NMI enabled default. the slow one's output as the
following:
 qemu-system-x86-4454  [004]   549.958147: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958172: kvm_exit:
reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958172: kvm_page_fault:
address c8f8a000 error_code b
 qemu-system-x86-4454  [004]   549.958177: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958202: kvm_exit:
reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958204: kvm_page_fault:
address c8f8b000 error_code b
 qemu-system-x86-4454  [004]   549.958209: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958234: kvm_exit:
reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958234: kvm_page_fault:
address c8f8c000 error_code b
 qemu-system-x86-4454  [004]   549.958239: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958264: kvm_exit:
reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958264: kvm_page_fault:
address c8f8d000 error_code b
 qemu-system-x86-4454  [004]   549.958267: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958292: kvm_exit:
reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958294: kvm_page_fault:
address c8f8e000 error_code b
 qemu-system-x86-4454  [004]   549.958299: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958324: kvm_exit:
reason EXCEPTION_NMI rip 0x8051d5e1
 qemu-system-x86-4454  [004]   549.958324: kvm_page_fault:
address c8f8f000 error_code b
 qemu-system-x86-4454  [004]   549.958329: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958447: kvm_exit:
reason EXTERNAL_INTERRUPT rip 0x80547ac8
 qemu-system-x86-4454  [004]   549.958450: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958461: kvm_exit:
reason CR_ACCESS rip 0x8054428c
 qemu-system-x86-4454  [004]   549.958461: kvm_cr:
cr_write 0 = 0x80010031
 qemu-system-x86-4454  [004]   549.958541: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958573: kvm_exit:
reason CR_ACCESS rip 0x80546beb
 qemu-system-x86-4454  [004]   549.958575: kvm_cr:
cr_write 0 = 0x8001003b
 qemu-system-x86-4454  [004]   549.958585: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958610: kvm_exit:
reason CR_ACCESS rip 0x80546b6c
 qemu-system-x86-4454  [004]   549.958610: kvm_cr:
cr_write 3 = 0x6e00020
 qemu-system-x86-4454  [004]   549.958621: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958645: kvm_exit:
reason EXCEPTION_NMI rip 0x8051d7f4
 qemu-system-x86-4454  [004]   549.958645: kvm_page_fault:
address c0648200 error_code 3
 qemu-system-x86-4454  [004]   549.958653: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958725: kvm_exit:
reason EXCEPTION_NMI rip 0x8050a26a
 qemu-system-x86-4454  [004]   549.958726: kvm_page_fault:
address c0796994 error_code 3
 qemu-system-x86-4454  [004]   549.958738: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958750: kvm_exit:
reason IO_INSTRUCTION rip 0x806edad0
 qemu-system-x86-4454  [004]   549.958750: kvm_pio:
pio_write at 0xc050 size 2 count 1
 qemu-system-x86-4454  [004]   549.958838: kvm_entry:vcpu 0
 qemu-system-x86-4454  [004]   549.958844: kvm_exit:
reason APIC_ACCESS rip 0x806e7b85
 qemu-system-x86-4454  [004]   549.958852: kvm_apic:
apic_read APIC_ICR = 0x40041
 qemu-system-x86-4454  [004]   549.958855: kvm_mmio: mmio
read len 4 gpa 0xfee00300 val 0x40041
 qemu-system-x86-4454  [004]   549.958857: kvm_mmio: mmio
write len 4 gpa 0xfee00300 val 0x40041
 qemu-system-x86-4454  [004]   549.958858: kvm_apic:
apic_write APIC_ICR = 0x40041
 qemu-system-x86-4454  [004]   549.958860: kvm_apic_ipi: dst 1
vec 65 (Fixed|physical|de-assert|edge|self)
 qemu-system-x86-4454  [004]   549.958860: kvm_apic_accept_irq:
apicid 0 vec 65 (Fixed|edge)

 Even I disable the NMI when I boot the kernel with nmi_watchdog=0,
the trace-cmd output still show there are many NMI_INTERRUPT. I find
that in /proc/interrupts, the amount of NMI is 0. Does this mean that
NMI is produced in winxp guest OS, or this setting can not hinder kvm
to catch NMI interrupt?

  I think the difference between FC14 and winxp is that: fc14 process
the NMI interrupt correctly, but winxp can not, is this right?

  I run qemu-kvm with version 0.14.0, kernel version 2.6.32-131.6.4. I
change kvm-kmod to 2.6.32-27, it produce the same result.

  Any suggestions? thanks.

Regards.

Suya.
--
To unsubscribe from th

Re: [PATCH] kvm tools: Print only serial output to the terminal

2011-08-10 Thread walimis
On Wed, Aug 10, 2011 at 07:45:27PM +0300, Pekka Enberg wrote:
>> On 08/10/2011 09:34 PM, Pekka Enberg wrote:
>>> On Wed, 10 Aug 2011, Sasha Levin wrote:
 This patch changes the serial device to print only auxiliary output to
 the
 terminal.

 Doing so prevents printing output which the guest kernel never
 intended us
 to print and by printing it we wrote junk to the users terminal.

 Signed-off-by: Sasha Levin 
>>>
>>> This doesn't seem to work for me. As soon as we switch to userspace, I
>>> don't get any output from serial console.
>
>On Wed, Aug 10, 2011 at 7:39 PM, Asias He  wrote:
>> This works for. Pekka, did you enable something like:
>>
>> ? T0:23:respawn:/sbin/getty -L ttyS0 9600 vt100
>>
>> in guest's /etc/inittab.
>
>No I didn't but that's not really a solution to the problem. We can't
>expect people to configure guest userspace for something as basic as
>console.
My suggestion is that we don't need to fix this problem, because it's
not a real issue of kvm, but of the driver of kernel.

walimis
>
> Pekka
>--
>To unsubscribe from this list: send the line "unsubscribe kvm" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH]vhost-blk: In-kernel accelerator for virtio block device

2011-08-10 Thread Liu Yuan

On 08/11/2011 11:01 AM, Liu Yuan wrote:



It looks like the patch wouldn't work for testing multiple devices.

vhost_blk_open() does
+   used_info_cachep = KMEM_CACHE(used_info, SLAB_HWCACHE_ALIGN |
SLAB_PANIC);



This is weird. how do you open multiple device?I just opened the 
device with following command:


-drive file=/dev/sda6,if=virtio,cache=none,aio=native -drive 
file=~/data0.img,if=virtio,cache=none,aio=native -drive 
file=~/data1.img,if=virtio,cache=none,aio=native


And I didn't meet any problem.

this would tell qemu to open three devices, and pass three FDs to 
three instances of vhost_blk module.

So KMEM_CACHE() is okay in vhost_blk_open().



Oh, you are right. KMEM_CACHE() is in the wrong place. it is three 
instances vhost worker threads created. Hmmm, but I didn't meet any 
problem when opening it and running it. So strange. I'll go to figure it 
out.



When opening second device, we get panic since used_info_cachep is
already created. Just to make progress I moved this call to
vhost_blk_init().

I don't see any host panics now. With single block device (dd),
it seems to work fine. But when I start testing multiple block
devices I quickly run into hangs in the guest. I see following
messages in the guest from virtio_ring.c:

virtio_blk virtio2: requests: id 0 is not a head !
virtio_blk virtio1: requests: id 0 is not a head !
virtio_blk virtio4: requests: id 1 is not a head !
virtio_blk virtio3: requests: id 39 is not a head !

Thanks,
Badari




vq->data[] is initialized by guest virtio-blk driver and vhost_blk is 
unware of it. it looks like used ID passed
over by vhost_blk to guest virtio_blk is wrong, but, it should not 
happen. :|


And I can't reproduce this on my laptop. :(


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH]vhost-blk: In-kernel accelerator for virtio block device

2011-08-10 Thread Liu Yuan



It looks like the patch wouldn't work for testing multiple devices.

vhost_blk_open() does
+   used_info_cachep = KMEM_CACHE(used_info, SLAB_HWCACHE_ALIGN |
SLAB_PANIC);



This is weird. how do you open multiple device?I just opened the device 
with following command:


-drive file=/dev/sda6,if=virtio,cache=none,aio=native -drive 
file=~/data0.img,if=virtio,cache=none,aio=native -drive 
file=~/data1.img,if=virtio,cache=none,aio=native


And I didn't meet any problem.

this would tell qemu to open three devices, and pass three FDs to three 
instances of vhost_blk module.

So KMEM_CACHE() is okay in vhost_blk_open().


When opening second device, we get panic since used_info_cachep is
already created. Just to make progress I moved this call to
vhost_blk_init().

I don't see any host panics now. With single block device (dd),
it seems to work fine. But when I start testing multiple block
devices I quickly run into hangs in the guest. I see following
messages in the guest from virtio_ring.c:

virtio_blk virtio2: requests: id 0 is not a head !
virtio_blk virtio1: requests: id 0 is not a head !
virtio_blk virtio4: requests: id 1 is not a head !
virtio_blk virtio3: requests: id 39 is not a head !

Thanks,
Badari




vq->data[] is initialized by guest virtio-blk driver and vhost_blk is 
unware of it. it looks like used ID passed
over by vhost_blk to guest virtio_blk is wrong, but, it should not 
happen. :|


And I can't reproduce this on my laptop. :(


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] postcopy livemigration proposal

2011-08-10 Thread Isaku Yamahata
On Wed, Aug 10, 2011 at 04:55:32PM +0300, Avi Kivity wrote:
> On 08/09/2011 05:33 AM, Isaku Yamahata wrote:
>> On Mon, Aug 08, 2011 at 03:38:54PM +0300, Avi Kivity wrote:
>> >  On 08/08/2011 06:24 AM, Isaku Yamahata wrote:
>> >>  This mail is on "Yabusame: Postcopy Live Migration for Qemu/KVM"
>> >>  on which we'll give a talk at KVM-forum.
>> >>  The purpose of this mail is to letting developers know it in advance
>> >>  so that we can get better feedback on its design/implementation approach
>> >>  early before our starting to implement it.
>> >
>> >  Interesting; what is the impact of increased latency on memory reads?
>>
>> Many people has already discussed it much in another thread. :-)
>> That's much more than I expected.
>
> Can you point me to the discussion?

I misunderstood of your question.
Please refer to the papers which includes the evaluation results including
network latency. It discusses about it in details.
And the presentation that we will give at the KVM forum also includes
some results.


>> >>  There are several design points.
>> >> - who takes care of pulling page contents.
>> >>   an independent daemon vs a thread in qemu
>> >>   The daemon approach is preferable because an independent daemon 
>> >> would
>> >>   easy for debug postcopy memory mechanism without qemu.
>> >>   If required, it wouldn't be difficult to convert a daemon into
>> >>   a thread in qemu
>> >
>> >  Isn't this equivalent to touching each page in sequence?
>>
>> No. I don't get your point of this question.
>
> If you have a qemu thread that does
>
>for (each guest page)
>sum += *(char *)page;
>
> doesn't that effectively pull all pages from the source node?
>
> (but maybe I'm assuming that the kernel takes care of things and this  
> isn't the case?)

Now I see your point. Right, it doesn't matter who starts the access
to guest RAM.
My point is, after the page fault, someone has to resolve the fault
by sending the request for the page to the migration source.

I think, daemon or thread isn't a big issue anyway.
If nbd with swap device is used, its IO request may be sent to the source
directly.


>> >> - hooking guest RAM access
>> >>   Introduce a character device to handle page fault.
>> >>   When page fault occurs, it queues page request up to user space 
>> >> daemon
>> >>   at the destination. And the daemon pulls page contents from the 
>> >> source
>> >>   and serves it into the character device. Then the page fault is 
>> >> resovlved.
>> >
>> >  This doesn't play well with host swapping, transparent hugepages, or
>> >  ksm, does it?
>>
>> No. At least it wouldn't be so difficult to fix it, I haven't looked ksm,
>> thp so closely though.
>> Although the vma is backed by the device, the populated page is
>> anonymous. (by MMAP_PRIVATE or the deriver returning anonymous page)
>> So swapping, thp, ksm should work.
>
> I'm not 100% sure, but I think that thp and ksm need the vma to be  
> anonymous, not just the page.

Yes, they seems to check if not only the page is anonymous, but also the vma.
I'd like to hear from Andrea before digging into the code deeply.


>> >  It would need to be a special kind of swap device since we only want to
>> >  swap in, and never out, to that device.  We'd also need a special way of
>> >  telling the kernel that memory comes from that device.  In that it's
>> >  similar your second option.
>> >
>> >  Maybe we should use a backing file (using nbd) and have a madvise() call
>> >  that converts the vma to anonymous memory once the migration is finished.
>>
>> With whichever options, I'd like to convert the vma into anonymous area
>> after the migration completes somehow. i.e. nulling vma->vm_ops.
>> (The pages are already anonymous.)
>>
>> It seems troublesome involving complicated races/lockings. So I'm not sure
>> it's worthwhile.
>
> Andrea, what's your take on this?

I'd also like to hear from those who are familiar with ksm/thp.

If it is possible to convert the vma into anonymous, swap device or
backed by device/file wouldn't matter in respect to ksm and thp.
Acquiring mmap_sem suffices?

thanks,
-- 
yamahata
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] client.virt.kvm_monitor: Throw less cryptic exceptions

2011-08-10 Thread Lucas Meneghel Rodrigues
Sometimes, when the kvm monitor code tries to verify if
there is data available on the monitor socket, a socket.error
might be thrown, leading to somewhat cryptic error messages,
such as:

 raise error(EBADF, 'Bad file descriptor')
 error: [Errno 9] Bad file descriptor

So, wrap the select operation on a try block and raise a
more comprehensive MonitorSocketError, along with the
original exception.

Also, turning the error into a MonitorError makes KVM
autotest to not blow when trying to get a screendump
from the VM during postprocessing.

Signed-off-by: Lucas Meneghel Rodrigues 
---
 client/virt/kvm_monitor.py |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/client/virt/kvm_monitor.py b/client/virt/kvm_monitor.py
index c96f062..7e6a055 100644
--- a/client/virt/kvm_monitor.py
+++ b/client/virt/kvm_monitor.py
@@ -120,7 +120,10 @@ class Monitor:
 
 def _data_available(self, timeout=0):
 timeout = max(0, timeout)
-return bool(select.select([self._socket], [], [], timeout)[0])
+try:
+return bool(select.select([self._socket], [], [], timeout)[0])
+except socket.error, e:
+raise MonitorSocketError("Verifying data on monitor socket", e)
 
 
 def _recvall(self):
-- 
1.7.6

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] KVM test: Omit debug printing from iozone_windows

2011-08-10 Thread Lucas Meneghel Rodrigues
Signed-off-by: Lucas Meneghel Rodrigues 
---
 client/virt/tests/iozone_windows.py |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/client/virt/tests/iozone_windows.py 
b/client/virt/tests/iozone_windows.py
index 4046106..b4779c6 100644
--- a/client/virt/tests/iozone_windows.py
+++ b/client/virt/tests/iozone_windows.py
@@ -27,7 +27,7 @@ def run_iozone_windows(test, params, env):
 c = params.get("iozone_cmd")
 t = int(params.get("iozone_timeout"))
 logging.info("Running IOzone command on guest, timeout %ss", t)
-results = session.cmd_output(cmd=c, timeout=t, print_func=logging.debug)
+results = session.cmd_output(cmd=c, timeout=t)
 utils.open_write_close(results_path, results)
 
 # Postprocess the results using the IOzone postprocessing module
-- 
1.7.6

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] virt.virt_test_utils: Fix migration message

2011-08-10 Thread Lucas Meneghel Rodrigues
Signed-off-by: Lucas Meneghel Rodrigues 
---
 client/virt/virt_test_utils.py |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/client/virt/virt_test_utils.py b/client/virt/virt_test_utils.py
index 26407bc..cf718c4 100644
--- a/client/virt/virt_test_utils.py
+++ b/client/virt/virt_test_utils.py
@@ -575,7 +575,7 @@ def run_autotest(vm, session, control_path, timeout, 
outputdir, params):
 bg.start()
 
 while bg.is_alive():
-logging.info("Tests is not ended, start a round of"
+logging.info("Autotest job did not end, start a round of "
  "migration")
 vm.migrate(timeout=mig_timeout, protocol=mig_protocol)
 else:
-- 
1.7.6

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM on RHEL 6.1

2011-08-10 Thread Gonzalo Servat
Bad form to reply to my own messages! Anyway, it looks like I'm not
the only one having this issue:

   http://www.spinics.net/lists/kvm/msg53924.html

TJ, did you ever find out what the problem was? You had exactly the
same problem I'm having.

Regards,
Gonzalo

On Wed, Aug 10, 2011 at 10:23 PM, Gonzalo Servat
 wrote:
> Hi All,
>
> I have been running KVM on RHEL 5.x successfully, and I'm keen to
> upgrade to RHEL6.1 so I'm currently giving it a try. The host appears
> to work fine up until the point that I start a VM that was previously
> running fine on 5.x. At this point, the system becomes extremely laggy
> (noticeable delay between key presses and when they appear via SSH).
>
> I did notice that dstat starts reporting "missed X ticks" as the VM
> starts up and the system becomes sluggish.
>
> I have been monitoring the system via dstat/top, and load average,
> memory and CPU look fine. I did however notice the number of
> interrupts and context switches is a lot higher in RHEL6.1 than it was
> on RHEL5.x.
>
> I have not seen anything interesting in /var/log/messages.
>
> Any ideas? Happy to provide whatever info you like.
>
> Hardware: IBM x3850 X5 / 128G RAM
> Kernel:  2.6.32-131.6.1.el6
> QEMU-KVM: 0.12.1.2-2.160.el6_1.6
>
> Many thanks in advance.
>
> Gonzalo
>



-- 
Gonzalo Servat
Systems Administrator, Sirca Ltd
Level 9, 80 Clarence St, Sydney NSW 2000
PO Box H58 Australia Square NSW 1215
Tel: +61 2 9236 9161
Fax: +61 2 9231 5988
www.sirca.org.au
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] powerpc/kvm: fix build errors with older toolchains

2011-08-10 Thread Alexander Graf

On 10.08.2011, at 20:56, Benjamin Herrenschmidt wrote:

> On Wed, 2011-08-10 at 16:58 +0200, Alexander Graf wrote:
>> On 08/03/2011 08:55 PM, Nishanth Aravamudan wrote:
>>> On a box with gcc 4.3.2, I see errors like:
>>> 
>>> arch/powerpc/kvm/book3s_hv_rmhandlers.S:1254: Error: Unrecognized opcode: 
>>> stxvd2x
>>> arch/powerpc/kvm/book3s_hv_rmhandlers.S:1316: Error: Unrecognized opcode: 
>>> lxvd2x
>> 
>> Paul, mind to ack?
> 
> I merged it already :-) It was trivial & annoying enough.

Alright then, I won't touch it :)


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH]vhost-blk: In-kernel accelerator for virtio block device

2011-08-10 Thread Badari Pulavarty
On Wed, 2011-08-10 at 10:19 +0800, Liu Yuan wrote:
> On 08/09/2011 01:16 AM, Badari Pulavarty wrote:
> > On 8/8/2011 12:31 AM, Liu Yuan wrote:
> >> On 08/08/2011 01:04 PM, Badari Pulavarty wrote:
> >>> On 8/7/2011 6:35 PM, Liu Yuan wrote:
>  On 08/06/2011 02:02 AM, Badari Pulavarty wrote:
> > On 8/5/2011 4:04 AM, Liu Yuan wrote:
> >> On 08/05/2011 05:58 AM, Badari Pulavarty wrote:
> >>> Hi Liu Yuan,
> >>>
> >>> I started testing your patches. I applied your kernel patch to 3.0
> >>> and applied QEMU to latest git.
> >>>
> >>> I passed 6 blockdevices from the host to guest (4 vcpu, 4GB RAM).
> >>> I ran simple "dd" read tests from the guest on all block devices
> >>> (with various blocksizes, iflag=direct).
> >>>
> >>> Unfortunately, system doesn't stay up. I immediately get into
> >>> panic on the host. I didn't get time to debug the problem. 
> >>> Wondering
> >>> if you have seen this issue before and/or you have new patchset
> >>> to try ?
> >>>
> >>> Let me know.
> >>>
> >>> Thanks,
> >>> Badari
> >>>
> >>
> >> Okay, it is actually a bug pointed out by MST on the other 
> >> thread, that it needs a mutex for completion thread.
> >>
> >> Now would you please this attachment?This patch only applies to 
> >> kernel part, on top of v1 kernel patch.
> >>
> >> This patch mainly moves completion thread into vhost thread as a 
> >> function. As a result, both requests submitting and completion 
> >> signalling is in the same thread.
> >>
> >> Yuan
> >
> > Unfortunately, "dd" tests (4 out of 6) in the guest hung. I see 
> > following messages
> >
> > virtio_blk virtio2: requests: id 0 is not a head !
> > virtio_blk virtio3: requests: id 1 is not a head !
> > virtio_blk virtio5: requests: id 1 is not a head !
> > virtio_blk virtio1: requests: id 1 is not a head !
> >
> > I still see host panics. I will collect the host panic and see if 
> > its still same or not.
> >
> > Thanks,
> > Badari
> >
> >
>  Would you please show me how to reproduce it step by step? I tried 
>  dd with two block device attached, but didn't get hung nor panic.
> 
>  Yuan
> >>>
> >>> I did 6 "dd"s on 6 block devices..
> >>>
> >>> dd if=/dev/vdb of=/dev/null bs=1M iflag=direct &
> >>> dd if=/dev/vdc of=/dev/null bs=1M iflag=direct &
> >>> dd if=/dev/vdd of=/dev/null bs=1M iflag=direct &
> >>> dd if=/dev/vde of=/dev/null bs=1M iflag=direct &
> >>> dd if=/dev/vdf of=/dev/null bs=1M iflag=direct &
> >>> dd if=/dev/vdg of=/dev/null bs=1M iflag=direct &
> >>>
> >>> I can reproduce the problem with in 3 minutes :(
> >>>
> >>> Thanks,
> >>> Badari
> >>>
> >>>
> >> Ah...I made an embarrassing mistake that I tried to 'free()' an 
> >> kmem_cache object.
> >>
> >> Would you please revert the vblk-for-kernel-2 patch and apply the new 
> >> one attached in this letter?
> >>
> > Hmm.. My version of the code seems to have kzalloc() for used_info. I 
> > don't have a version
> > that is using kmem_cache_alloc(). Would it be possible for you to send 
> > out complete patch
> > (with all the fixes applied) for me to try ? This will avoid all the 
> > confusion ..
> >
> > Thanks,
> > Badari
> >
>
> Okay, please apply the attached patch to the vanilla kernel. :)


It looks like the patch wouldn't work for testing multiple devices.

vhost_blk_open() does
+   used_info_cachep = KMEM_CACHE(used_info, SLAB_HWCACHE_ALIGN |
SLAB_PANIC);

When opening second device, we get panic since used_info_cachep is
already created. Just to make progress I moved this call to
vhost_blk_init().

I don't see any host panics now. With single block device (dd),
it seems to work fine. But when I start testing multiple block
devices I quickly run into hangs in the guest. I see following
messages in the guest from virtio_ring.c:

virtio_blk virtio2: requests: id 0 is not a head !
virtio_blk virtio1: requests: id 0 is not a head !
virtio_blk virtio4: requests: id 1 is not a head !
virtio_blk virtio3: requests: id 39 is not a head !

Thanks,
Badari



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v4 23/39] lsi53c895a: convert to memory API

2011-08-10 Thread Gerhard Wiesinger

Hello Avi,

qemu/qemu-kvm doesn't boot any more with LSI option rom (-option-rom 
8xx_64.rom). Guess it comes from changes of the memory API. Also latest 
git version from seabios.


Sorry, no further details currently available.

Ciao,
Gerhard

--
http://www.wiesinger.com/


On Mon, 8 Aug 2011, Avi Kivity wrote:


An optimization that fast-pathed DMA reads from the SCRIPTS memory
was removed int the process.  Likely it breaks with iommus anyway.

Reviewed-by: Richard Henderson 
Reviewed-by: Anthony Liguori 
Signed-off-by: Avi Kivity 
---
hw/lsi53c895a.c |  258 ---
1 files changed, 56 insertions(+), 202 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index e9904c4..0ab8c78 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -185,9 +185,9 @@ typedef struct lsi_request {

typedef struct {
PCIDevice dev;
-int mmio_io_addr;
-int ram_io_addr;
-uint32_t script_ram_base;
+MemoryRegion mmio_io;
+MemoryRegion ram_io;
+MemoryRegion io_io;

int carry; /* ??? Should this be an a visible register somewhere?  */
int status;
@@ -391,10 +391,9 @@ static inline uint32_t read_dword(LSIState *s, uint32_t 
addr)
{
uint32_t buf;

-/* Optimize reading from SCRIPTS RAM.  */
-if ((addr & 0xe000) == s->script_ram_base) {
-return s->script_ram[(addr & 0x1fff) >> 2];
-}
+/* XXX: an optimization here used to fast-path the read from scripts
+ * memory.  But that bypasses any iommu.
+ */
cpu_physical_memory_read(addr, (uint8_t *)&buf, 4);
return cpu_to_le32(buf);
}
@@ -1899,232 +1898,90 @@ static void lsi_reg_writeb(LSIState *s, int offset, 
uint8_t val)
#undef CASE_SET_REG32
}

-static void lsi_mmio_writeb(void *opaque, target_phys_addr_t addr, uint32_t 
val)
+static void lsi_mmio_write(void *opaque, target_phys_addr_t addr,
+   uint64_t val, unsigned size)
{
LSIState *s = opaque;

lsi_reg_writeb(s, addr & 0xff, val);
}

-static void lsi_mmio_writew(void *opaque, target_phys_addr_t addr, uint32_t 
val)
-{
-LSIState *s = opaque;
-
-addr &= 0xff;
-lsi_reg_writeb(s, addr, val & 0xff);
-lsi_reg_writeb(s, addr + 1, (val >> 8) & 0xff);
-}
-
-static void lsi_mmio_writel(void *opaque, target_phys_addr_t addr, uint32_t 
val)
-{
-LSIState *s = opaque;
-
-addr &= 0xff;
-lsi_reg_writeb(s, addr, val & 0xff);
-lsi_reg_writeb(s, addr + 1, (val >> 8) & 0xff);
-lsi_reg_writeb(s, addr + 2, (val >> 16) & 0xff);
-lsi_reg_writeb(s, addr + 3, (val >> 24) & 0xff);
-}
-
-static uint32_t lsi_mmio_readb(void *opaque, target_phys_addr_t addr)
+static uint64_t lsi_mmio_read(void *opaque, target_phys_addr_t addr,
+  unsigned size)
{
LSIState *s = opaque;

return lsi_reg_readb(s, addr & 0xff);
}

-static uint32_t lsi_mmio_readw(void *opaque, target_phys_addr_t addr)
-{
-LSIState *s = opaque;
-uint32_t val;
-
-addr &= 0xff;
-val = lsi_reg_readb(s, addr);
-val |= lsi_reg_readb(s, addr + 1) << 8;
-return val;
-}
-
-static uint32_t lsi_mmio_readl(void *opaque, target_phys_addr_t addr)
-{
-LSIState *s = opaque;
-uint32_t val;
-addr &= 0xff;
-val = lsi_reg_readb(s, addr);
-val |= lsi_reg_readb(s, addr + 1) << 8;
-val |= lsi_reg_readb(s, addr + 2) << 16;
-val |= lsi_reg_readb(s, addr + 3) << 24;
-return val;
-}
-
-static CPUReadMemoryFunc * const lsi_mmio_readfn[3] = {
-lsi_mmio_readb,
-lsi_mmio_readw,
-lsi_mmio_readl,
-};
-
-static CPUWriteMemoryFunc * const lsi_mmio_writefn[3] = {
-lsi_mmio_writeb,
-lsi_mmio_writew,
-lsi_mmio_writel,
+static const MemoryRegionOps lsi_mmio_ops = {
+.read = lsi_mmio_read,
+.write = lsi_mmio_write,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.impl = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
};

-static void lsi_ram_writeb(void *opaque, target_phys_addr_t addr, uint32_t val)
+static void lsi_ram_write(void *opaque, target_phys_addr_t addr,
+  uint64_t val, unsigned size)
{
LSIState *s = opaque;
uint32_t newval;
+uint32_t mask;
int shift;

-addr &= 0x1fff;
newval = s->script_ram[addr >> 2];
shift = (addr & 3) * 8;
-newval &= ~(0xff << shift);
+mask = ((uint64_t)1 << (size * 8)) - 1;
+newval &= ~(mask << shift);
newval |= val << shift;
s->script_ram[addr >> 2] = newval;
}

-static void lsi_ram_writew(void *opaque, target_phys_addr_t addr, uint32_t val)
-{
-LSIState *s = opaque;
-uint32_t newval;
-
-addr &= 0x1fff;
-newval = s->script_ram[addr >> 2];
-if (addr & 2) {
-newval = (newval & 0x) | (val << 16);
-} else {
-newval = (newval & 0x) | val;
-}
-s->script_ram[addr >> 2] = newval;
-}
-
-
-static void lsi_ram_writel(void *opaque, target_phys_addr_t addr, uint32_t val)
-{
-LSIState *s = opaque;
-
-addr &= 0x1fff;
-s->script_ram

Re: [PATCH] powerpc/kvm: fix build errors with older toolchains

2011-08-10 Thread Benjamin Herrenschmidt
On Wed, 2011-08-10 at 16:58 +0200, Alexander Graf wrote:
> On 08/03/2011 08:55 PM, Nishanth Aravamudan wrote:
> > On a box with gcc 4.3.2, I see errors like:
> >
> > arch/powerpc/kvm/book3s_hv_rmhandlers.S:1254: Error: Unrecognized opcode: 
> > stxvd2x
> > arch/powerpc/kvm/book3s_hv_rmhandlers.S:1316: Error: Unrecognized opcode: 
> > lxvd2x
> 
> Paul, mind to ack?

I merged it already :-) It was trivial & annoying enough.

Cheers,
Ben.

> 
> Alex
> 
> > Signed-off-by: Nishanth Aravamudan
> > ---
> >   arch/powerpc/kvm/book3s_hv_rmhandlers.S |4 ++--
> >   1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > index 6dd3358..de29501 100644
> > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > @@ -1251,7 +1251,7 @@ BEGIN_FTR_SECTION
> > reg = 0
> > .rept   32
> > li  r6,reg*16+VCPU_VSRS
> > -   stxvd2x reg,r6,r3
> > +   STXVD2X(reg,r6,r3)
> > reg = reg + 1
> > .endr
> >   FTR_SECTION_ELSE
> > @@ -1313,7 +1313,7 @@ BEGIN_FTR_SECTION
> > reg = 0
> > .rept   32
> > li  r7,reg*16+VCPU_VSRS
> > -   lxvd2x  reg,r7,r4
> > +   LXVD2X(reg,r7,r4)
> > reg = reg + 1
> > .endr
> >   FTR_SECTION_ELSE
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm tools: Stop guest if SDL window is closed

2011-08-10 Thread Pekka Enberg
Send SIGKVMSTOP signal to the process if the SDL window is closed. This fixes
an annoying problem where closing the window makes the guest seem like it froze
although it's really running in the background but the UI is unusable.

Cc: Asias He 
Cc: Cyrill Gorcunov 
Cc: Ingo Molnar 
Cc: Prasad Joshi 
Cc: Sasha Levin 
Signed-off-by: Pekka Enberg 
---
 tools/kvm/ui/sdl.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/tools/kvm/ui/sdl.c b/tools/kvm/ui/sdl.c
index aa3eff9..088cd29 100644
--- a/tools/kvm/ui/sdl.c
+++ b/tools/kvm/ui/sdl.c
@@ -3,9 +3,11 @@
 #include "kvm/framebuffer.h"
 #include "kvm/i8042.h"
 #include "kvm/util.h"
+#include "kvm/kvm.h"
 
 #include 
 #include 
+#include 
 
 #define FRAME_RATE 25
 
@@ -127,6 +129,8 @@ static void *sdl__thread(void *p)
SDL_Delay(1000 / FRAME_RATE);
}
 exit:
+   kill(0, SIGKVMSTOP);
+
return NULL;
 }
 
-- 
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH 21/24] isa: add isa_address_space()

2011-08-10 Thread Avi Kivity

On 08/10/2011 07:26 PM, Richard Henderson wrote:

On 08/10/2011 09:24 AM, Richard Henderson wrote:
>  Of course, as far as I can see, this variable is only used by
>  the VGA devices.  Surely we can arrange to pass down some address
>  space during setup of the VGA?

... Which seems to be what you've done in patch 23.

So what's the point of this patch?




Some more about this:

We have a few dual (or even tripe) interface devices, of which vga is an 
example.  This requires a common interface when we add a memory region 
to a device's bus.  Let's enumerate the options:


1. just ignore the parent bus

this is cpu_register_physical_memory()

2. pass the right MemoryRegion to the device

this series

3. Integrate qdev and the memory API

introduce qdev_add_memory(), have vga.c call that.

4. Have the specialized device constructor (isa-vga.c's vga_initfn() 
pass down a callback to the generaized device (vga_init_vbe()) that 
wraps the bus-specific memory registration function:


static void isa_vga_add_region(void *opaque, ...)
{
 ISAVGAState *d = opaque;

 isa_add_region(&d->dev, ...);
}

vga_initfn()
{
...
vga_init_vbe(s, isa_vga_add_region, d);
...
}

1 and 2 are layering violations in that they don't allow bookkeeping by 
the specific bus.  3 allows it to some extent but I still feel it is 
wrong.  4 is correct in this sense but takes some work.


It may be that no bookkeeping is actually required and we can do 3, but 
that will have to wait until qdev/memory integration.  If not we'll 
switch to 4.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm tools: Improve 'kvm version' output

2011-08-10 Thread Pekka Enberg
This patch improves 'kvm version' output as follows as suggested by Ingo
Molnar:

  $ ./kvm version
  kvm tool 3.0.rc5.763.ga6c998

Cc: Ingo Molnar 
Signed-off-by: Pekka Enberg 
---
 tools/kvm/builtin-version.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/kvm/builtin-version.c b/tools/kvm/builtin-version.c
index 5b3b598..b8bb859 100644
--- a/tools/kvm/builtin-version.c
+++ b/tools/kvm/builtin-version.c
@@ -9,7 +9,7 @@
 
 int kvm_cmd_version(int argc, const char **argv, const char *prefix)
 {
-   printf("%s\n", KVMTOOLS_VERSION);
+   printf("kvm tool %s\n", KVMTOOLS_VERSION);
 
return 0;
 }
-- 
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm tool: Fix builtin command usage printouts

2011-08-10 Thread Pekka Enberg
This patch fixes builtin commands to use usage_with_options() instead of die().
The latter prefixes messages with "Fatal" which makes the usage text ugly.

Cc: Asias He 
Cc: Cyrill Gorcunov 
Cc: Ingo Molnar 
Cc: Prasad Joshi 
Cc: Sasha Levin 
Signed-off-by: Pekka Enberg 
---
 tools/kvm/builtin-balloon.c |   12 +++-
 tools/kvm/builtin-debug.c   |   12 +++-
 tools/kvm/builtin-pause.c   |   12 +++-
 tools/kvm/builtin-resume.c  |   12 +++-
 tools/kvm/builtin-stop.c|   12 +++-
 5 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/tools/kvm/builtin-balloon.c b/tools/kvm/builtin-balloon.c
index ce076ba..907a56a 100644
--- a/tools/kvm/builtin-balloon.c
+++ b/tools/kvm/builtin-balloon.c
@@ -5,8 +5,18 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
+static const char * const balloon_usage[] = {
+   "kvm balloon {inflate|deflate}  ",
+   NULL
+};
+
+static const struct option balloon_options[] = {
+   OPT_END()
+};
+
 int kvm_cmd_balloon(int argc, const char **argv, const char *prefix)
 {
int pid;
@@ -14,7 +24,7 @@ int kvm_cmd_balloon(int argc, const char **argv, const char 
*prefix)
int inflate = 0;
 
if (argc != 3)
-   die("Usage: kvm balloon [inflate/deflate] [size in MiB] 
[instance name]\n");
+   usage_with_options(balloon_usage, balloon_options);
 
pid = kvm__get_pid_by_instance(argv[2]);
if (pid < 0)
diff --git a/tools/kvm/builtin-debug.c b/tools/kvm/builtin-debug.c
index 4cf3b7c..adb0b54 100644
--- a/tools/kvm/builtin-debug.c
+++ b/tools/kvm/builtin-debug.c
@@ -2,11 +2,21 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 
+static const char * const debug_usage[] = {
+   "kvm debug ",
+   NULL
+};
+
+static const struct option debug_options[] = {
+   OPT_END()
+};
+
 static int do_debug(const char *name, int pid)
 {
return kill(pid, SIGQUIT);
@@ -17,7 +27,7 @@ int kvm_cmd_debug(int argc, const char **argv, const char 
*prefix)
int pid;
 
if (argc != 1)
-   die("Usage: kvm debug [instance name]\n");
+   usage_with_options(debug_usage, debug_options);
 
if (strcmp(argv[0], "all") == 0) {
return kvm__enumerate_instances(do_debug);
diff --git a/tools/kvm/builtin-pause.c b/tools/kvm/builtin-pause.c
index f5805e6..7ac793c 100644
--- a/tools/kvm/builtin-pause.c
+++ b/tools/kvm/builtin-pause.c
@@ -6,6 +6,16 @@
 #include 
 #include 
 #include 
+#include 
+
+static const char * const pause_usage[] = {
+   "kvm pause ",
+   NULL
+};
+
+static const struct option pause_options[] = {
+   OPT_END()
+};
 
 static int do_pause(const char *name, int pid)
 {
@@ -17,7 +27,7 @@ int kvm_cmd_pause(int argc, const char **argv, const char 
*prefix)
int pid;
 
if (argc != 1)
-   die("Usage: kvm pause [instance name]\n");
+   usage_with_options(pause_usage, pause_options);
 
if (strcmp(argv[0], "all") == 0) {
return kvm__enumerate_instances(do_pause);
diff --git a/tools/kvm/builtin-resume.c b/tools/kvm/builtin-resume.c
index a9bf6c5..3b08d3f 100644
--- a/tools/kvm/builtin-resume.c
+++ b/tools/kvm/builtin-resume.c
@@ -6,6 +6,16 @@
 #include 
 #include 
 #include 
+#include 
+
+static const char * const resume_usage[] = {
+   "kvm resume ",
+   NULL
+};
+
+static const struct option resume_options[] = {
+   OPT_END()
+};
 
 static int do_resume(const char *name, int pid)
 {
@@ -17,7 +27,7 @@ int kvm_cmd_resume(int argc, const char **argv, const char 
*prefix)
int pid;
 
if (argc != 1)
-   die("Usage: kvm resume [instance name]\n");
+   usage_with_options(resume_usage, resume_options);
 
if (strcmp(argv[0], "all") == 0) {
return kvm__enumerate_instances(do_resume);
diff --git a/tools/kvm/builtin-stop.c b/tools/kvm/builtin-stop.c
index 46be393..efbf979 100644
--- a/tools/kvm/builtin-stop.c
+++ b/tools/kvm/builtin-stop.c
@@ -2,11 +2,21 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 
+static const char * const stop_usage[] = {
+   "kvm stop ",
+   NULL
+};
+
+static const struct option stop_options[] = {
+   OPT_END()
+};
+
 static int do_stop(const char *name, int pid)
 {
return kill(pid, SIGKVMSTOP);
@@ -17,7 +27,7 @@ int kvm_cmd_stop(int argc, const char **argv, const char 
*prefix)
int pid;
 
if (argc != 1)
-   die("Usage: kvm stop [instance name]\n");
+   usage_with_options(stop_usage, stop_options);
 
if (strcmp(argv[0], "all") == 0) {
return kvm__enumerate_instances(do_stop);
-- 
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm tools: Improve 'kvm list' output

2011-08-10 Thread Pekka Enberg
This patch improves 'kvm list' output to look more like 'ps':

  $ ./kvm list
PID GUEST
   2820 guest-2820

Cc: Ingo Molnar 
Cc: Sasha Levin 
Signed-off-by: Pekka Enberg 
---
 tools/kvm/builtin-list.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/tools/kvm/builtin-list.c b/tools/kvm/builtin-list.c
index 43f37f2..34cc03b 100644
--- a/tools/kvm/builtin-list.c
+++ b/tools/kvm/builtin-list.c
@@ -25,7 +25,7 @@ static int print_guest(const char *name, int pid)
if (strncmp(comm, PROCESS_NAME, strlen(PROCESS_NAME)))
goto cleanup;
 
-   printf("%s (PID: %d)\n", name, pid);
+   printf("%5d %s\n", pid, name);
 
free(comm);
 
@@ -45,5 +45,7 @@ cleanup:
 
 int kvm_cmd_list(int argc, const char **argv, const char *prefix)
 {
+   printf("  PID GUEST\n");
+
return kvm__enumerate_instances(print_guest);
 }
-- 
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Use correct size for VESA memory BAR

2011-08-10 Thread David Evensky

I don't know if there were any other drivers for this patch, but it
along with another patch (maybe integrated elsewhere for 32bit BAR vs
8bit) certainly helped me out a lot.  These patches fixed ioremap
errors I was seeing (I had a 16MB PCI memory region, but it appeared
to be only 256 bytes in size; the kernel complained bitterly about
that on ioremap). It also was an issue of expected vs unexpected
output from lspci -vvv.

I'm working on my out-of-tree PCI driver to see if it can become in
tree. I have more cleanup to do, and seeing how close I can come to
the target coding standards.

\dae

On Wed, Aug 10, 2011 at 09:13:35AM +0300, Pekka Enberg wrote:
> On 8/9/11 7:46 PM, Avi Kivity wrote:
> >On 08/09/2011 06:39 PM, Ingo Molnar wrote:
> >>* Sasha Levin  wrote:
> >>
> >>>  This patch makes BAR 1 16k, instead of BAR0 - which is the PIO bar.
> >>>
> >>
> >>This changelog is missing some key information:
> >>
> >>  - how did you find the bug (by chance via code review or did you see
> >>some actual badness?)
> >>
> >>  - what practical effect (if any) did you see from this patch?
> >>
> >>  - what practical effect (if any) do you expect others to see
> >>from this patch?
> >>
> >>I suspect this patch is only for completeness/correctness but has no
> >>practical effect - but that's a guess.
> >>
> >
> >My guess would be that seabios tried to lay out the BARs and had
> >trouble fitting a 16k pio bar in the small PCI pio region.
> 
> Sasha? IIRC this fixed some issue with David's out-of-tree PCI driver?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 40872] windows XP Guest node crashes using kvm-86

2011-08-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=40872


Avi Kivity  changed:

   What|Removed |Added

 CC||a...@redhat.com
 AssignedTo|virtualization_kvm@kernel-b |a...@redhat.com
   |ugs.osdl.org|
 Status|NEW |RESOLVED
 Resolution||OBSOLETE




--- Comment #1 from Avi Kivity   2011-08-10 16:45:55 ---
kvm-86 is no longer supported.  Please try a recent kernel like Linux 3.0.  If
you're using RHEL, use RHEL's KVM and report bugs through Red Hat bugzilla.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Print only serial output to the terminal

2011-08-10 Thread Pekka Enberg
> On 08/10/2011 09:34 PM, Pekka Enberg wrote:
>> On Wed, 10 Aug 2011, Sasha Levin wrote:
>>> This patch changes the serial device to print only auxiliary output to
>>> the
>>> terminal.
>>>
>>> Doing so prevents printing output which the guest kernel never
>>> intended us
>>> to print and by printing it we wrote junk to the users terminal.
>>>
>>> Signed-off-by: Sasha Levin 
>>
>> This doesn't seem to work for me. As soon as we switch to userspace, I
>> don't get any output from serial console.

On Wed, Aug 10, 2011 at 7:39 PM, Asias He  wrote:
> This works for. Pekka, did you enable something like:
>
>   T0:23:respawn:/sbin/getty -L ttyS0 9600 vt100
>
> in guest's /etc/inittab.

No I didn't but that's not really a solution to the problem. We can't
expect people to configure guest userspace for something as basic as
console.

 Pekka
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 40872] New: windows XP Guest node crashes using kvm-86

2011-08-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=40872

   Summary: windows XP Guest node  crashes using kvm-86
   Product: Virtualization
   Version: unspecified
Kernel Version: 2.6.18-194.11.3.el5  x86_64
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: high
  Priority: P1
 Component: kvm
AssignedTo: virtualization_...@kernel-bugs.osdl.org
ReportedBy: msal...@msn.com
Regression: No


Hello,
I am having windows xp sp3 crashes when using the following:

the host is red hat 5 with 2.6.18-194.11.3.el5 64 bit
model name  : Intel(R) Xeon(R) CPU   E5540  @ 2.53GH

the host is running kvm-86 for kernel modules and qemu-kvm-0.14.0 for
userspace.
the guest is windows xp sp3.

the problem: sometimes when compiling using visual studio compiler, the guest
will crash with system error , and when we check the mini dump for that system
error it points to visual studio.


the host server will show the following error in dmesg:
printk: 1 messages suppressed.
kvm: 29097: cpu0 unhandled wrmsr: 0x198 data 0
kvm: 29097: cpu0 kvm_set_msr_common: MSR_IA32_MCG_STATUS 0x0, nop


Please advice!

Thanks,
Majo

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Print only serial output to the terminal

2011-08-10 Thread Asias He
On 08/10/2011 09:34 PM, Pekka Enberg wrote:
> On Wed, 10 Aug 2011, Sasha Levin wrote:
>> This patch changes the serial device to print only auxiliary output to
>> the
>> terminal.
>>
>> Doing so prevents printing output which the guest kernel never
>> intended us
>> to print and by printing it we wrote junk to the users terminal.
>>
>> Signed-off-by: Sasha Levin 
> 
> This doesn't seem to work for me. As soon as we switch to userspace, I
> don't get any output from serial console.

This works for. Pekka, did you enable something like:

   T0:23:respawn:/sbin/getty -L ttyS0 9600 vt100

in guest's /etc/inittab.

-- 
Best Regards,
Asias He
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH 21/24] isa: add isa_address_space()

2011-08-10 Thread Avi Kivity

On 08/10/2011 07:35 PM, Avi Kivity wrote:

On 08/10/2011 07:26 PM, Richard Henderson wrote:

On 08/10/2011 09:24 AM, Richard Henderson wrote:
>  Of course, as far as I can see, this variable is only used by
>  the VGA devices.  Surely we can arrange to pass down some address
>  space during setup of the VGA?

... Which seems to be what you've done in patch 23.

So what's the point of this patch?




Okay, so I'm learning a lot and also applying that learning... a good 
thing except that I've forgotten all about it.  Had a similar case 
with omap_gpmc.




Well, actually reading patch 23, this one doesn't make much sense.

I'd like to let it in though.  If I spend time polishing patchsets I'll 
never get rid of ram_addr_t.


The plan is:
- convert devices
- clean up bus APIs

If I do all of the at once, it will end up in tears.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH 21/24] isa: add isa_address_space()

2011-08-10 Thread Avi Kivity

On 08/10/2011 07:26 PM, Richard Henderson wrote:

On 08/10/2011 09:24 AM, Richard Henderson wrote:
>  Of course, as far as I can see, this variable is only used by
>  the VGA devices.  Surely we can arrange to pass down some address
>  space during setup of the VGA?

... Which seems to be what you've done in patch 23.

So what's the point of this patch?




Okay, so I'm learning a lot and also applying that learning... a good 
thing except that I've forgotten all about it.  Had a similar case with 
omap_gpmc.


Again, there's so much work here that the quality is bound to be less 
than perfect.  The plan is to have a pretty good end result though 
(fully hierarchical - no globals).


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH 21/24] isa: add isa_address_space()

2011-08-10 Thread Avi Kivity

On 08/10/2011 07:24 PM, Richard Henderson wrote:

>  @@ -202,4 +203,9 @@ static char *isabus_get_fw_dev_path(DeviceState *dev)
>   return strdup(path);
>   }
>
>  +MemoryRegion *isa_address_space(ISADevice *dev)
>  +{
>  +return get_system_memory();
>  +}
>  +

This does not help get rid of isa_mem_base, as far as I can see.
All this is going to do is the equivalent of isa_mem_base == 0.


It abstracts away the problem.  We can now have multiple ISA buses, and 
each can have its own base, we just need to change the implementation.



Did you have a plan beyond this?


I'll probably add two MemoryRegion *s to isabus_bridge_init().  The plan 
is to eliminate all get_system_memory() calls.




The simplest replacement, as far as I can see, is to replace one
global variable with another:

   MemoryRegion *isa_address_space;

Define that this variable *must* be set by any platform that
wants to use ISA.  It can be set to the address_space_mem
parameter (i.e. the copy of get_system_memory() that the
platforms receive as a parameter).

For Alpha, I can set this to the sub-region that I pass off to
the PCI subsystem.

For those other non-x86 platforms that currently set isa_mem_base
to a non-zero value, they either re-use an otherwise existing
memory region or create a new sub-region properly placed.

Of course, as far as I can see, this variable is only used by
the VGA devices.  Surely we can arrange to pass down some address
space during setup of the VGA?



You mean pass VGA's memory regions to ISA, instead of vice versa?  Yes, 
that's the right thing to do, and is how PCI and sysbus do it.


Because of the huge amount of devices that need to be converted, I'm 
giving myself some leeway in the conversion process and introducing some 
shortcuts.  I don't see a away to do a perfect conversion in a 
reasonable time frame, and in any case I'm learning a lot during the 
conversion (except that I call it "collecting requirements").


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH 21/24] isa: add isa_address_space()

2011-08-10 Thread Richard Henderson
On 08/10/2011 09:24 AM, Richard Henderson wrote:
> Of course, as far as I can see, this variable is only used by
> the VGA devices.  Surely we can arrange to pass down some address
> space during setup of the VGA?

... Which seems to be what you've done in patch 23.

So what's the point of this patch?


r~
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH 21/24] isa: add isa_address_space()

2011-08-10 Thread Richard Henderson
On 08/08/2011 10:07 AM, Avi Kivity wrote:
> A helper that returns the address space used by ISA devices.  Useful
> for getting rid of isa_mem_base, multiple ISA buses, or ISA buses behind
> bridges.
> 
> Signed-off-by: Avi Kivity 
> ---
>  hw/isa-bus.c |6 ++
>  hw/isa.h |1 +
>  2 files changed, 7 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/isa-bus.c b/hw/isa-bus.c
> index 2765543..1cb497f 100644
> --- a/hw/isa-bus.c
> +++ b/hw/isa-bus.c
> @@ -20,6 +20,7 @@
>  #include "monitor.h"
>  #include "sysbus.h"
>  #include "isa.h"
> +#include "exec-memory.h"
>  
>  struct ISABus {
>  BusState qbus;
> @@ -202,4 +203,9 @@ static char *isabus_get_fw_dev_path(DeviceState *dev)
>  return strdup(path);
>  }
>  
> +MemoryRegion *isa_address_space(ISADevice *dev)
> +{
> +return get_system_memory();
> +}
> +

This does not help get rid of isa_mem_base, as far as I can see.
All this is going to do is the equivalent of isa_mem_base == 0.

Did you have a plan beyond this?

The simplest replacement, as far as I can see, is to replace one
global variable with another:

  MemoryRegion *isa_address_space;

Define that this variable *must* be set by any platform that 
wants to use ISA.  It can be set to the address_space_mem 
parameter (i.e. the copy of get_system_memory() that the
platforms receive as a parameter).

For Alpha, I can set this to the sub-region that I pass off to
the PCI subsystem.

For those other non-x86 platforms that currently set isa_mem_base
to a non-zero value, they either re-use an otherwise existing
memory region or create a new sub-region properly placed.

Of course, as far as I can see, this variable is only used by
the VGA devices.  Surely we can arrange to pass down some address
space during setup of the VGA?



r~
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] staging: zcache: support multiple clients, prep for KVM and RAMster

2011-08-10 Thread Seth Jennings
On 08/10/2011 10:08 AM, Dan Magenheimer wrote:
>> From: Seth Jennings [mailto:sjenn...@linux.vnet.ibm.com]
>> Subject: Re: [PATCH v2] staging: zcache: support multiple clients, prep for 
>> KVM and RAMster
>>
>>> This crash is hit every time a high memory page is swapped out.
>>>
>>> I have no solution right now other that to revert this patch and
>>> restore the original signatures.
> 
> Hi Seth --
> 
> Thanks for your testing.  I haven't done much testing on 32-bit.
> 
>> Sorry for the noise, but I noticed right after I sent this that
>> the tmem layer doesn't DO anything with the data parameter. So
>> a possible solution is to just pass the page pointer instead of
>> the virtual address.  After all, pointers are pointers.
> 
> Yes, this looks like a good patch.
> 
>> --- a/drivers/staging/zcache/zcache.c
>> +++ b/drivers/staging/zcache/zcache.c
>> @@ -1153,7 +1153,7 @@ static void *zcache_pampd_create(char *data, size_t 
>> size,
>> size_t clen;
>> int ret;
>> unsigned long count;
>> -   struct page *page = virt_to_page(data);
>> +   struct page *page = (struct page *)(data);
>> struct zcache_client *cli = pool->client;
>> uint16_t client_id = get_client_id_from_client(cli);
>> unsigned long zv_mean_zsize;
>> @@ -1220,7 +1220,7 @@ static int zcache_pampd_get_data(char *data, size_t 
>> *bufsi
>> int ret = 0;
>>
>> BUG_ON(is_ephemeral(pool));
>> -   zv_decompress(virt_to_page(data), pampd);
>> +   zv_decompress((struct page *)(data), pampd);
>> return ret;
>>  }
>>
>> @@ -1532,7 +1532,7 @@ static int zcache_put_page(int cli_id, int pool_id, 
>> struct
>> goto out;
>> if (!zcache_freeze && zcache_do_preload(pool) == 0) {
>> /* preload does preempt_disable on success */
>> -   ret = tmem_put(pool, oidp, index, page_address(page),
>> +   ret = tmem_put(pool, oidp, index, (char *)(page),
>> PAGE_SIZE, 0, is_ephemeral(pool));
>> if (ret < 0) {
>> if (is_ephemeral(pool))
>> @@ -1565,7 +1565,7 @@ static int zcache_get_page(int cli_id, int pool_id, 
>> struct
>> pool = zcache_get_pool_by_id(cli_id, pool_id);
>> if (likely(pool != NULL)) {
>> if (atomic_read(&pool->obj_count) > 0)
>> -   ret = tmem_get(pool, oidp, index, page_address(page),
>> +   ret = tmem_get(pool, oidp, index, (char *)(page),
>> &size, 0, is_ephemeral(pool));
>> zcache_put_pool(pool);
>> }
>>
>> I tested this and it works.
>>
>> Dan, does this mess anything else up?
> 
> Acked-by: Dan Magenheimer 
> 
>>> What was the rationale for the signature changes?
>>> Seth
> 
> The change on the tmem side allows tmem to handle pre-compressed pages,
> which is useful to RAMster and possibly for KVM.  The new "raw"
> parameter identifies that case, but for zcache "raw" is always zero so
> your solution looks fine.
> 
> Seth, could you submit an "official" patch (i.e. proper subject field,
> signed-off-by) and I will ack that and ask GregKH to queue it up for
> a 3.1-rc?

Will do. I'm actually about to send out a set of 3 patches for zcache.
There is a Makefile issue, and a 32-bit link-time issue I have found
as well.  Hopefully, I'll send it out today.

> 
> Subject something like: staging: zcache: fix highmem crash on 32-bit
> 
> Thanks,
> Dan
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
> Don't email:  em...@kvack.org 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Nested TSC handling

2011-08-10 Thread Avi Kivity

On 08/02/2011 03:53 PM, Nadav Har'El wrote:

The following are patches I propose for fixing the bug discovered by Bandan
Das and discussed in the "Nested VMX - L1 hangs on running L2" thread.

The first patch should fix the originally-reported bug, as explained in the
aforementioned thread: A new x86_op read_l1_tsc() is called L1's TSC is
needed, instead of assuming that calling kvm_read_msr() will do that
(because this has to return L2's TSC when a nested guest is running).

The second and third patches fix relatively-unimportant corner cases in
nested VMX and nested SVM TSC handling.

I'd appreciate it if the people who noticed this bug can verify that these
patches indeed solve it for them.



Thanks, applied.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Print only serial output to the terminal

2011-08-10 Thread Pekka Enberg

On Wed, 10 Aug 2011, Sasha Levin wrote:

Pekka, could you check whats the MCR value when your guest is writing
from usermode?


MCR: b
MCR: b
MCR: b
MCR: b
MCR: b
MCR: b
MCR: b
MCR: b
MCR: b
MCR: b

  # KVM session terminated.

diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
index 1199264..44d9439 100644
--- a/tools/kvm/hw/serial.c
+++ b/tools/kvm/hw/serial.c
@@ -213,8 +213,10 @@ static bool serial8250_out(struct ioport *ioport, struct 
kvm *kvm, u16 port, voi
case UART_TX: {
char *addr = data;

-   if (!(dev->mcr & UART_MCR_LOOP))
+   if (!(dev->mcr & (UART_MCR_LOOP | UART_MCR_OUT2)))
term_putc(CONSOLE_8250, addr, size * count);
+   else
+   printf("MCR: %x\n",  dev->mcr);

dev->iir = UART_IIR_NO_INT;
break;
--
1.7.0.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2] staging: zcache: support multiple clients, prep for KVM and RAMster

2011-08-10 Thread Dan Magenheimer
> From: Seth Jennings [mailto:sjenn...@linux.vnet.ibm.com]
> Subject: Re: [PATCH v2] staging: zcache: support multiple clients, prep for 
> KVM and RAMster
> 
> > This crash is hit every time a high memory page is swapped out.
> >
> > I have no solution right now other that to revert this patch and
> > restore the original signatures.

Hi Seth --

Thanks for your testing.  I haven't done much testing on 32-bit.

> Sorry for the noise, but I noticed right after I sent this that
> the tmem layer doesn't DO anything with the data parameter. So
> a possible solution is to just pass the page pointer instead of
> the virtual address.  After all, pointers are pointers.

Yes, this looks like a good patch.

> --- a/drivers/staging/zcache/zcache.c
> +++ b/drivers/staging/zcache/zcache.c
> @@ -1153,7 +1153,7 @@ static void *zcache_pampd_create(char *data, size_t 
> size,
> size_t clen;
> int ret;
> unsigned long count;
> -   struct page *page = virt_to_page(data);
> +   struct page *page = (struct page *)(data);
> struct zcache_client *cli = pool->client;
> uint16_t client_id = get_client_id_from_client(cli);
> unsigned long zv_mean_zsize;
> @@ -1220,7 +1220,7 @@ static int zcache_pampd_get_data(char *data, size_t 
> *bufsi
> int ret = 0;
> 
> BUG_ON(is_ephemeral(pool));
> -   zv_decompress(virt_to_page(data), pampd);
> +   zv_decompress((struct page *)(data), pampd);
> return ret;
>  }
> 
> @@ -1532,7 +1532,7 @@ static int zcache_put_page(int cli_id, int pool_id, 
> struct
> goto out;
> if (!zcache_freeze && zcache_do_preload(pool) == 0) {
> /* preload does preempt_disable on success */
> -   ret = tmem_put(pool, oidp, index, page_address(page),
> +   ret = tmem_put(pool, oidp, index, (char *)(page),
> PAGE_SIZE, 0, is_ephemeral(pool));
> if (ret < 0) {
> if (is_ephemeral(pool))
> @@ -1565,7 +1565,7 @@ static int zcache_get_page(int cli_id, int pool_id, 
> struct
> pool = zcache_get_pool_by_id(cli_id, pool_id);
> if (likely(pool != NULL)) {
> if (atomic_read(&pool->obj_count) > 0)
> -   ret = tmem_get(pool, oidp, index, page_address(page),
> +   ret = tmem_get(pool, oidp, index, (char *)(page),
> &size, 0, is_ephemeral(pool));
> zcache_put_pool(pool);
> }
> 
> I tested this and it works.
> 
> Dan, does this mess anything else up?

Acked-by: Dan Magenheimer 

> > What was the rationale for the signature changes?
> > Seth

The change on the tmem side allows tmem to handle pre-compressed pages,
which is useful to RAMster and possibly for KVM.  The new "raw"
parameter identifies that case, but for zcache "raw" is always zero so
your solution looks fine.

Seth, could you submit an "official" patch (i.e. proper subject field,
signed-off-by) and I will ack that and ask GregKH to queue it up for
a 3.1-rc?

Subject something like: staging: zcache: fix highmem crash on 32-bit

Thanks,
Dan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Memory API feature page

2011-08-10 Thread Avi Kivity
While the memory API is not a feature, it deserves a feature page, so I 
can put the TODO list there:


  http://wiki.qemu.org/Features/Memory_API

Please consider helping out.  Device conversions are easy, if boring, 
and there's no end to them.  There is a lot of more interesting work on 
the core itself, but I'd like to defer it until conversion is complete.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Print only serial output to the terminal

2011-08-10 Thread Sasha Levin
On Wed, 2011-08-10 at 21:43 +0800, walimis wrote:
> On Wed, Aug 10, 2011 at 04:34:53PM +0300, Pekka Enberg wrote:
> >On Wed, 10 Aug 2011, Sasha Levin wrote:
> >>This patch changes the serial device to print only auxiliary output to the
> >>terminal.
> >>
> >>Doing so prevents printing output which the guest kernel never intended us
> >>to print and by printing it we wrote junk to the users terminal.
> >>
> >>Signed-off-by: Sasha Levin 
> >
> >This doesn't seem to work for me. As soon as we switch to userspace,
> >I don't get any output from serial console.
> It works for me.

Same here.

Pekka, could you check whats the MCR value when your guest is writing
from usermode?

-- 

Sasha.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kvm linux guest hanging for minutes at a time

2011-08-10 Thread Philipp Hahn
Hi,

> > > On August 9, 2011, Avi Kivity wrote:
> > > >  Yikes.  Please trace such a hang according to
> > > >  http://www.linux-kvm.org/page/Tracing.

I have observed the same problem multiple times now and a college of mine 
also. He's able to reproduce this problem every time. For me it manifests on 
the serial console hanging after reboot, for him it is after the reboot when 
a process does a sleep().
The problem occurrs with clocksource=kvm-clock.
After the reboot tsc is disabled with the message "Clocksource tsc unstable 
(delta = 137303562 ns)" and is no longer shown in available_clocksource.
When booting with clocksouce=hpet, the system does not hang.

Our trace-cmd data is available from 

including the trace.dat. show-task-states_portmap.txt and 
show-task-states_syslogd.txt contain the captures output of Alt-SysRq-t.

Both host and guest are running an 2.6.32.x amd64 kernel, qemu is 0.14.1. The 
system is currently not in production and can be used for experiments.

Sincerely
Philipp Hahn

PS: This issue is tracked in out German bugtracker at 

-- 
Philipp Hahn   Open Source Software Engineer  h...@univention.de
Univention GmbHLinux for Your Businessfon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen fax: +49 421 22 232-99
   http://www.univention.de/


signature.asc
Description: This is a digitally signed message part.


Re: [PATCH] powerpc/kvm: fix build errors with older toolchains

2011-08-10 Thread Alexander Graf

On 08/03/2011 08:55 PM, Nishanth Aravamudan wrote:

On a box with gcc 4.3.2, I see errors like:

arch/powerpc/kvm/book3s_hv_rmhandlers.S:1254: Error: Unrecognized opcode: 
stxvd2x
arch/powerpc/kvm/book3s_hv_rmhandlers.S:1316: Error: Unrecognized opcode: lxvd2x


Paul, mind to ack?


Alex


Signed-off-by: Nishanth Aravamudan
---
  arch/powerpc/kvm/book3s_hv_rmhandlers.S |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6dd3358..de29501 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1251,7 +1251,7 @@ BEGIN_FTR_SECTION
reg = 0
.rept   32
li  r6,reg*16+VCPU_VSRS
-   stxvd2x reg,r6,r3
+   STXVD2X(reg,r6,r3)
reg = reg + 1
.endr
  FTR_SECTION_ELSE
@@ -1313,7 +1313,7 @@ BEGIN_FTR_SECTION
reg = 0
.rept   32
li  r7,reg*16+VCPU_VSRS
-   lxvd2x  reg,r7,r4
+   LXVD2X(reg,r7,r4)
reg = reg + 1
.endr
  FTR_SECTION_ELSE


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] staging: zcache: support multiple clients, prep for KVM and RAMster

2011-08-10 Thread Seth Jennings
On 08/10/2011 09:21 AM, Seth Jennings wrote:
> On 06/30/2011 02:01 PM, Dan Magenheimer wrote:
>> Hi Greg --
>>
>> I think this patch is now ready for staging-next and for merging when
>> the 3.1 window opens.  Please let me know if you need any logistics
>> done differently.
>>
>> Thanks,
>> Dan
>>
>> ===
>>
>>> From: Dan Magenheimer 
>> Subject: staging: zcache: support multiple clients, prep for KVM and RAMster
>>
>> This is version 2 of an update to zcache, incorporating feedback from the 
>> list.
>> This patch adds support to the in-kernel transcendent memory ("tmem") code
>> and the zcache driver for multiple clients, which will be needed for both
>> RAMster and KVM support.  It also adds additional tmem callbacks to support
>> RAMster and corresponding no-op stubs in the zcache driver.  In v2, I've
>> also taken the liberty of adding some additional sysfs variables to
>> both surface information and allow policy control.  Those experimenting
>> with zcache should find them useful.
>>
>> Signed-off-by: Dan Magenheimer 
>>
>> [v2: konrad.w...@oracle.com: fix bools, add check for NULL, fix a comment]
>> [v2: sjenn...@linux.vnet.ibm.com: add info/tunables for poor compression]
>> [v2: marcuskl...@googlemail.com: add tunable for max persistent pages]
>> Cc: Nitin Gupta 
>> Cc: linux...@kvack.org
>> Cc: kvm@vger.kernel.org
>>
>> ===
>>
>> Diffstat:
>>  drivers/staging/zcache/tmem.c|  100 +++-
>>  drivers/staging/zcache/tmem.h|   23 
>>  drivers/staging/zcache/zcache.c  |  512 +
>>  3 files changed, 520 insertions(+), 115 deletions(-)
>>
>>
> 
>> @@ -901,48 +1144,59 @@ static unsigned long zcache_curr_pers_pa
>>  /* forward reference */
>>  static int zcache_compress(struct page *from, void **out_va, size_t 
>> *out_len);
>>  
>> -static void *zcache_pampd_create(struct tmem_pool *pool, struct tmem_oid 
>> *oid,
>> - uint32_t index, struct page *page)
>> +static void *zcache_pampd_create(char *data, size_t size, bool raw, int eph,
>> +struct tmem_pool *pool, struct tmem_oid *oid,
>> + uint32_t index)
>>  {
>>  void *pampd = NULL, *cdata;
>>  size_t clen;
>>  int ret;
>> -bool ephemeral = is_ephemeral(pool);
>>  unsigned long count;
>> +struct page *page = virt_to_page(data);
> 
> With zcache_put_page() modified to pass page_address(page) instead of the 
> actual page structure, in combination with the function signature changes 
> to tmem_put() and zcache_pampd_create(), zcache_pampd_create() tries to 
> (re)derive the page structure from the virtual address.  However, if the 
> original page is a high memory page (or any unmapped page), this 
> virt_to_page() fails because the page_address() in zcache_put_page()
> returned NULL.
> 
> With CONFIG_DEBUG_VIRTUAL set, the BUG message is this:
> ==
> [  101.347711] [ cut here ]
> [  101.348030] kernel BUG at arch/x86/mm/physaddr.c:51!
> [  101.348030] invalid opcode:  [#1] DEBUG_PAGEALLOC
> [  101.348030] Modules linked in:
> [  101.348030] 
> [  101.348030] Pid: 20, comm: kswapd0 Not tainted 3.1.0-rc1+ #229 Bochs Bochs
> [  101.348030] EIP: 0060:[] EFLAGS: 00010013 CPU: 0
> [  101.348030] EIP is at __phys_addr+0x1a/0x50
> [  101.348030] EAX:  EBX: f5e64000 ECX:  EDX: 1000
> [  101.348030] ESI: f6ffd000 EDI:  EBP: f6353c10 ESP: f6353c10
> [  101.348030]  DS: 007b ES: 007b FS:  GS:  SS: 0068
> [  101.348030] Process kswapd0 (pid: 20, ti=f6352000 task=f69b9c20 
> task.ti=f6352000)
> [  101.348030] Stack:
> [  101.348030]  f6353c60 c12e4114  0001  c12e5682 
>  0046
> [  101.348030]   f5e65668 f5e65658 f5e65654 f580f000 f6353c60 
> c13904f5 
> [  101.348030]  f5e65654 f5e64000 f5eab000  f6353cb0 c12e5713 
>  f5e64000
> [  101.348030] Call Trace:
> [  101.348030]  [] zcache_pampd_create+0x14/0x6a0
> [  101.348030]  [] ? tmem_put+0x52/0x3f0
> [  101.348030]  [] ? _raw_spin_lock+0x45/0x50
> [  101.348030]  [] tmem_put+0xe3/0x3f0
> [  101.348030]  [] ? page_address+0xb8/0xe0
> [  101.348030]  [] zcache_frontswap_put_page+0x1eb/0x2e0
> [  101.348030]  [] __frontswap_put_page+0x6b/0x110
> [  101.348030]  [] swap_writepage+0x8f/0xf0
> [  101.348030]  [] shmem_writepage+0x1a7/0x1d0
> [  101.348030]  [] shrink_page_list+0x3f7/0x7c0
> [  101.348030]  [] shrink_inactive_list+0x12d/0x360
> [  101.348030]  [] shrink_zone+0x3e8/0x530
> [  101.348030]  [] kswapd+0x552/0x940
> [  101.348030]  [] ? wake_up_bit+0x30/0x30
> [  101.348030]  [] ? shrink_zone+0x530/0x530
> [  101.348030]  [] kthread+0x74/0x80
> [  101.348030]  [] ? __init_kthread_worker+0x60/0x60
> [  101.348030]  [] kernel_thread_helper+0x6/0xd
> [  101.348030] Code: 00 c0 ff 81 eb 00 20 00 00 39 d8 72 cd eb ae 66 90 55 3d 
> ff ff ff bf 89 e5 76 10 80 3d 10 88 53 c1 00 75 09 05 00 00 00 40 5d c3 <0f> 
> 0b 8b 15 b0 6a 9e c1 81

Re: [RFC] virtio: Support releasing lock during kick

2011-08-10 Thread Stefan Hajnoczi
On Wed, Aug 10, 2011 at 09:18:01AM -0400, Christoph Hellwig wrote:
> Any progress on these patches?

Khoa ran ffsb benchmarks on his rig and we didn't see any benefit.  I
have not started investigating yet, been working on other things.

It will be necessary to compare against the old patches which did show
an improvment last year when we ran them.  Then we'll know if I broke
something in the new patches.

If you like I can send the latest patches for you to take a look.  I
will not get a chance to play with this until after LinuxCon.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] staging: zcache: support multiple clients, prep for KVM and RAMster

2011-08-10 Thread Seth Jennings
On 06/30/2011 02:01 PM, Dan Magenheimer wrote:
> Hi Greg --
> 
> I think this patch is now ready for staging-next and for merging when
> the 3.1 window opens.  Please let me know if you need any logistics
> done differently.
> 
> Thanks,
> Dan
> 
> ===
> 
>> From: Dan Magenheimer 
> Subject: staging: zcache: support multiple clients, prep for KVM and RAMster
> 
> This is version 2 of an update to zcache, incorporating feedback from the 
> list.
> This patch adds support to the in-kernel transcendent memory ("tmem") code
> and the zcache driver for multiple clients, which will be needed for both
> RAMster and KVM support.  It also adds additional tmem callbacks to support
> RAMster and corresponding no-op stubs in the zcache driver.  In v2, I've
> also taken the liberty of adding some additional sysfs variables to
> both surface information and allow policy control.  Those experimenting
> with zcache should find them useful.
> 
> Signed-off-by: Dan Magenheimer 
> 
> [v2: konrad.w...@oracle.com: fix bools, add check for NULL, fix a comment]
> [v2: sjenn...@linux.vnet.ibm.com: add info/tunables for poor compression]
> [v2: marcuskl...@googlemail.com: add tunable for max persistent pages]
> Cc: Nitin Gupta 
> Cc: linux...@kvack.org
> Cc: kvm@vger.kernel.org
> 
> ===
> 
> Diffstat:
>  drivers/staging/zcache/tmem.c|  100 +++-
>  drivers/staging/zcache/tmem.h|   23 
>  drivers/staging/zcache/zcache.c  |  512 +
>  3 files changed, 520 insertions(+), 115 deletions(-)
> 
> 

> @@ -901,48 +1144,59 @@ static unsigned long zcache_curr_pers_pa
>  /* forward reference */
>  static int zcache_compress(struct page *from, void **out_va, size_t 
> *out_len);
>  
> -static void *zcache_pampd_create(struct tmem_pool *pool, struct tmem_oid 
> *oid,
> -  uint32_t index, struct page *page)
> +static void *zcache_pampd_create(char *data, size_t size, bool raw, int eph,
> + struct tmem_pool *pool, struct tmem_oid *oid,
> +  uint32_t index)
>  {
>   void *pampd = NULL, *cdata;
>   size_t clen;
>   int ret;
> - bool ephemeral = is_ephemeral(pool);
>   unsigned long count;
> + struct page *page = virt_to_page(data);

With zcache_put_page() modified to pass page_address(page) instead of the 
actual page structure, in combination with the function signature changes 
to tmem_put() and zcache_pampd_create(), zcache_pampd_create() tries to 
(re)derive the page structure from the virtual address.  However, if the 
original page is a high memory page (or any unmapped page), this 
virt_to_page() fails because the page_address() in zcache_put_page()
returned NULL.

With CONFIG_DEBUG_VIRTUAL set, the BUG message is this:
==
[  101.347711] [ cut here ]
[  101.348030] kernel BUG at arch/x86/mm/physaddr.c:51!
[  101.348030] invalid opcode:  [#1] DEBUG_PAGEALLOC
[  101.348030] Modules linked in:
[  101.348030] 
[  101.348030] Pid: 20, comm: kswapd0 Not tainted 3.1.0-rc1+ #229 Bochs Bochs
[  101.348030] EIP: 0060:[] EFLAGS: 00010013 CPU: 0
[  101.348030] EIP is at __phys_addr+0x1a/0x50
[  101.348030] EAX:  EBX: f5e64000 ECX:  EDX: 1000
[  101.348030] ESI: f6ffd000 EDI:  EBP: f6353c10 ESP: f6353c10
[  101.348030]  DS: 007b ES: 007b FS:  GS:  SS: 0068
[  101.348030] Process kswapd0 (pid: 20, ti=f6352000 task=f69b9c20 
task.ti=f6352000)
[  101.348030] Stack:
[  101.348030]  f6353c60 c12e4114  0001  c12e5682  
0046
[  101.348030]   f5e65668 f5e65658 f5e65654 f580f000 f6353c60 c13904f5 

[  101.348030]  f5e65654 f5e64000 f5eab000  f6353cb0 c12e5713  
f5e64000
[  101.348030] Call Trace:
[  101.348030]  [] zcache_pampd_create+0x14/0x6a0
[  101.348030]  [] ? tmem_put+0x52/0x3f0
[  101.348030]  [] ? _raw_spin_lock+0x45/0x50
[  101.348030]  [] tmem_put+0xe3/0x3f0
[  101.348030]  [] ? page_address+0xb8/0xe0
[  101.348030]  [] zcache_frontswap_put_page+0x1eb/0x2e0
[  101.348030]  [] __frontswap_put_page+0x6b/0x110
[  101.348030]  [] swap_writepage+0x8f/0xf0
[  101.348030]  [] shmem_writepage+0x1a7/0x1d0
[  101.348030]  [] shrink_page_list+0x3f7/0x7c0
[  101.348030]  [] shrink_inactive_list+0x12d/0x360
[  101.348030]  [] shrink_zone+0x3e8/0x530
[  101.348030]  [] kswapd+0x552/0x940
[  101.348030]  [] ? wake_up_bit+0x30/0x30
[  101.348030]  [] ? shrink_zone+0x530/0x530
[  101.348030]  [] kthread+0x74/0x80
[  101.348030]  [] ? __init_kthread_worker+0x60/0x60
[  101.348030]  [] kernel_thread_helper+0x6/0xd
[  101.348030] Code: 00 c0 ff 81 eb 00 20 00 00 39 d8 72 cd eb ae 66 90 55 3d 
ff ff ff bf 89 e5 76 10 80 3d 10 88 53 c1 00 75 09 05 00 00 00 40 5d c3 <0f> 0b 
8b 15 b0 6a 9e c1 81 c2 00 00 80 00 39 d0 72 e7 8b 15 a8 
[  101.348030] EIP: [] __phys_addr+0x1a/0x50 SS:ESP 0068:f6353c10
==

This crash is hit every time a high memory page is swapped out.

I have no solu

Re: [PATCH] kvm tools: Print only serial output to the terminal

2011-08-10 Thread walimis
On Wed, Aug 10, 2011 at 04:34:53PM +0300, Pekka Enberg wrote:
>On Wed, 10 Aug 2011, Sasha Levin wrote:
>>This patch changes the serial device to print only auxiliary output to the
>>terminal.
>>
>>Doing so prevents printing output which the guest kernel never intended us
>>to print and by printing it we wrote junk to the users terminal.
>>
>>Signed-off-by: Sasha Levin 
>
>This doesn't seem to work for me. As soon as we switch to userspace,
>I don't get any output from serial console.
It works for me.

walimis
>
>>---
>>tools/kvm/hw/serial.c |2 +-
>>1 files changed, 1 insertions(+), 1 deletions(-)
>>
>>diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
>>index 1199264..0393d3d 100644
>>--- a/tools/kvm/hw/serial.c
>>+++ b/tools/kvm/hw/serial.c
>>@@ -213,7 +213,7 @@ static bool serial8250_out(struct ioport *ioport, struct 
>>kvm *kvm, u16 port, voi
>>  case UART_TX: {
>>  char *addr = data;
>>
>>- if (!(dev->mcr & UART_MCR_LOOP))
>>+ if (!(dev->mcr & (UART_MCR_LOOP | UART_MCR_OUT2)))
>>  term_putc(CONSOLE_8250, addr, size * count);
>>
>>  dev->iir= UART_IIR_NO_INT;
>>-- 
>>1.7.6
>>
>>
>--
>To unsubscribe from this list: send the line "unsubscribe kvm" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] postcopy livemigration proposal

2011-08-10 Thread Avi Kivity

On 08/09/2011 05:33 AM, Isaku Yamahata wrote:

On Mon, Aug 08, 2011 at 03:38:54PM +0300, Avi Kivity wrote:
>  On 08/08/2011 06:24 AM, Isaku Yamahata wrote:
>>  This mail is on "Yabusame: Postcopy Live Migration for Qemu/KVM"
>>  on which we'll give a talk at KVM-forum.
>>  The purpose of this mail is to letting developers know it in advance
>>  so that we can get better feedback on its design/implementation approach
>>  early before our starting to implement it.
>
>  Interesting; what is the impact of increased latency on memory reads?

Many people has already discussed it much in another thread. :-)
That's much more than I expected.


Can you point me to the discussion?



>>  There are several design points.
>> - who takes care of pulling page contents.
>>   an independent daemon vs a thread in qemu
>>   The daemon approach is preferable because an independent daemon would
>>   easy for debug postcopy memory mechanism without qemu.
>>   If required, it wouldn't be difficult to convert a daemon into
>>   a thread in qemu
>
>  Isn't this equivalent to touching each page in sequence?

No. I don't get your point of this question.


If you have a qemu thread that does

   for (each guest page)
   sum += *(char *)page;

doesn't that effectively pull all pages from the source node?

(but maybe I'm assuming that the kernel takes care of things and this 
isn't the case?)



>>
>> - hooking guest RAM access
>>   Introduce a character device to handle page fault.
>>   When page fault occurs, it queues page request up to user space daemon
>>   at the destination. And the daemon pulls page contents from the source
>>   and serves it into the character device. Then the page fault is 
resovlved.
>
>  This doesn't play well with host swapping, transparent hugepages, or
>  ksm, does it?

No. At least it wouldn't be so difficult to fix it, I haven't looked ksm,
thp so closely though.
Although the vma is backed by the device, the populated page is
anonymous. (by MMAP_PRIVATE or the deriver returning anonymous page)
So swapping, thp, ksm should work.


I'm not 100% sure, but I think that thp and ksm need the vma to be 
anonymous, not just the page.



>
>  It would need to be a special kind of swap device since we only want to
>  swap in, and never out, to that device.  We'd also need a special way of
>  telling the kernel that memory comes from that device.  In that it's
>  similar your second option.
>
>  Maybe we should use a backing file (using nbd) and have a madvise() call
>  that converts the vma to anonymous memory once the migration is finished.

With whichever options, I'd like to convert the vma into anonymous area
after the migration completes somehow. i.e. nulling vma->vm_ops.
(The pages are already anonymous.)

It seems troublesome involving complicated races/lockings. So I'm not sure
it's worthwhile.


Andrea, what's your take on this?

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] nVMX: Document 'nested' parameter

2011-08-10 Thread Avi Kivity

On 08/09/2011 02:28 PM, Sasha Levin wrote:

Add documentation of the new 'nested' parameter to
'Documentation/kernel-parameters.txt'.


Applied, thanks.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Fix SMEP issues

2011-08-10 Thread Avi Kivity

On 08/09/2011 01:14 PM, Yang, Wei Y wrote:

This patch fix kvm-unit-tests hanging and incorrect PT_ACCESSED_MASK
bit set in the case of SMEP fault.



Applied, thanks.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 40752] BUG: unable to handle kernel paging request at bffffc20

2011-08-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=40752





--- Comment #3 from Avi Kivity   2011-08-10 13:39:24 ---
On 08/09/2011 09:16 PM, bugzilla-dae...@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=40752
>
>
>
>
>
>

Xiao, any idea?

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Print only serial output to the terminal

2011-08-10 Thread Pekka Enberg

On Wed, 10 Aug 2011, Sasha Levin wrote:

This patch changes the serial device to print only auxiliary output to the
terminal.

Doing so prevents printing output which the guest kernel never intended us
to print and by printing it we wrote junk to the users terminal.

Signed-off-by: Sasha Levin 


This doesn't seem to work for me. As soon as we switch to userspace, I 
don't get any output from serial console.



---
tools/kvm/hw/serial.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
index 1199264..0393d3d 100644
--- a/tools/kvm/hw/serial.c
+++ b/tools/kvm/hw/serial.c
@@ -213,7 +213,7 @@ static bool serial8250_out(struct ioport *ioport, struct 
kvm *kvm, u16 port, voi
case UART_TX: {
char *addr = data;

-   if (!(dev->mcr & UART_MCR_LOOP))
+   if (!(dev->mcr & (UART_MCR_LOOP | UART_MCR_OUT2)))
term_putc(CONSOLE_8250, addr, size * count);

dev->iir = UART_IIR_NO_INT;
--
1.7.6



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2011-08-10 Thread Christoph Hellwig
Any progress on these patches?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] Fix TSC MSR read in nested SVM

2011-08-10 Thread Joerg Roedel
On Wed, Aug 10, 2011 at 03:35:28PM +0300, Nadav Har'El wrote:
> On Sun, Aug 07, 2011, Joerg Roedel wrote about "Re: [PATCH 3/3] Fix TSC MSR 
> read in nested SVM":
> > > Joerg, can you review and ack Nadav's SVM patch? TIA
> > 
> > Tested-by: Joerg Roedel 
> > Acked-by: Joerg Roedel 
> > 
> > Reviewed and tested (didn't apply cleanly, but was easy to fix that up).
> > Looks all good so far.
> 
> Hi guys, I'm a bit confused how we want to proceed from here.
> 
> The patches which I sent appear to fix the original bug (as confirmed by
> the two people who reported it), but Zachary warned that it would break
> migration of nested SVM while L2 is running. I asked whether migration
> works at all while L2 is running (without exiting to L1 first) - and
> Marcelo suggested that it doesn't.

Migration doesn't work today when L2 is running, so don't worry about it
for now.

Regards,

Joerg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] Fix TSC MSR read in nested SVM

2011-08-10 Thread Nadav Har'El
On Sun, Aug 07, 2011, Joerg Roedel wrote about "Re: [PATCH 3/3] Fix TSC MSR 
read in nested SVM":
> > Joerg, can you review and ack Nadav's SVM patch? TIA
> 
> Tested-by: Joerg Roedel 
> Acked-by: Joerg Roedel 
> 
> Reviewed and tested (didn't apply cleanly, but was easy to fix that up).
> Looks all good so far.

Hi guys, I'm a bit confused how we want to proceed from here.

The patches which I sent appear to fix the original bug (as confirmed by
the two people who reported it), but Zachary warned that it would break
migration of nested SVM while L2 is running. I asked whether migration
works at all while L2 is running (without exiting to L1 first) - and
Marcelo suggested that it doesn't.

If the problem Zachary pointed to is considered serious, I proposed a second
option - to leave the code to *wrongly* return the L1 TSC MSR always, and
check (and warn) in situations where this value is wrongly returned to the
guest, but this will leave qemu to always read the TSC MSR from L1, even when
L2 is running. While I proposed this as a second option, I don't think I
can recommend it.

So what's the verdict? :-)

Thanks,
Nadav.

-- 
Nadav Har'El|   Wednesday, Aug 10 2011, 10 Av 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |It's fortunate I have bad luck - without
http://nadav.harel.org.il   |it I would have no luck at all!
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/10] KVM: PPC: Add sanity checking to vcpu_run

2011-08-10 Thread Alexander Graf
There are multiple features in PowerPC KVM that can now be enabled
depending on the user's wishes. Some of the combinations don't make
sense or don't work though.

So this patch adds a way to check if the executing environment would
actually be able to run the guest properly. It also adds sanity
checks if PVR is set (should always be true given the current code
flow), if PAPR is only used with book3s_64 where it works and that
HV KVM is only used in PAPR mode.

Signed-off-by: Alexander Graf 
---
 arch/powerpc/include/asm/kvm.h  |5 +
 arch/powerpc/include/asm/kvm_host.h |2 ++
 arch/powerpc/include/asm/kvm_ppc.h  |1 +
 arch/powerpc/kvm/44x.c  |2 ++
 arch/powerpc/kvm/book3s_hv.c|8 
 arch/powerpc/kvm/book3s_pr.c|   10 ++
 arch/powerpc/kvm/booke.c|   10 +-
 arch/powerpc/kvm/e500.c |2 ++
 arch/powerpc/kvm/powerpc.c  |   28 
 9 files changed, 67 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index a6a253e..08fe69e 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -284,6 +284,11 @@ struct kvm_guest_debug_arch {
 #define KVM_INTERRUPT_UNSET-2U
 #define KVM_INTERRUPT_SET_LEVEL-3U
 
+#define KVM_CPU_4401
+#define KVM_CPU_E500V2 2
+#define KVM_CPU_3S_32  3
+#define KVM_CPU_3S_64  4
+
 /* for KVM_CAP_SPAPR_TCE */
 struct kvm_create_spapr_tce {
__u64 liobn;
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index e681302..2b8284f 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -390,6 +390,8 @@ struct kvm_vcpu_arch {
u8 osi_needed;
u8 osi_enabled;
u8 papr_enabled;
+   u8 sane;
+   u8 cpu_type;
u8 hcall_needed;
 
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index d121f49..46efd1a 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -66,6 +66,7 @@ extern int kvmppc_emulate_instruction(struct kvm_run *run,
 extern int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu);
 extern void kvmppc_emulate_dec(struct kvm_vcpu *vcpu);
 extern u32 kvmppc_get_dec(struct kvm_vcpu *vcpu, u64 tb);
+extern int kvmppc_sanity_check(struct kvm_vcpu *vcpu);
 
 /* Core-specific hooks */
 
diff --git a/arch/powerpc/kvm/44x.c b/arch/powerpc/kvm/44x.c
index da3a122..ca1f88b 100644
--- a/arch/powerpc/kvm/44x.c
+++ b/arch/powerpc/kvm/44x.c
@@ -78,6 +78,8 @@ int kvmppc_core_vcpu_setup(struct kvm_vcpu *vcpu)
for (i = 0; i < ARRAY_SIZE(vcpu_44x->shadow_refs); i++)
vcpu_44x->shadow_refs[i].gtlb_index = -1;
 
+   vcpu->arch.cpu_type = KVM_CPU_440;
+
return 0;
 }
 
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index cc0d7f1..830e07a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -510,6 +510,9 @@ struct kvm_vcpu *kvmppc_core_vcpu_create(struct kvm *kvm, 
unsigned int id)
spin_unlock(&vcore->lock);
vcpu->arch.vcore = vcore;
 
+   vcpu->arch.cpu_type = KVM_CPU_3S_64;
+   kvmppc_sanity_check(vcpu);
+
return vcpu;
 
 free_vcpu:
@@ -800,6 +803,11 @@ int kvmppc_vcpu_run(struct kvm_run *run, struct kvm_vcpu 
*vcpu)
 {
int r;
 
+   if (!vcpu->arch.sane) {
+   kvm_run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+   return -EINVAL;
+   }
+
do {
r = kvmppc_run_vcpu(run, vcpu);
 
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 48558f6..6e3488b 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -153,6 +153,7 @@ void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr)
if (!to_book3s(vcpu)->hior_sregs)
to_book3s(vcpu)->hior = 0xfff0;
to_book3s(vcpu)->msr_mask = 0xULL;
+   vcpu->arch.cpu_type = KVM_CPU_3S_64;
} else
 #endif
{
@@ -160,8 +161,11 @@ void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr)
if (!to_book3s(vcpu)->hior_sregs)
to_book3s(vcpu)->hior = 0;
to_book3s(vcpu)->msr_mask = 0xULL;
+   vcpu->arch.cpu_type = KVM_CPU_3S_32;
}
 
+   kvmppc_sanity_check(vcpu);
+
/* If we are in hypervisor level on 970, we can tell the CPU to
 * treat DCBZ as 32 bytes store */
vcpu->arch.hflags &= ~BOOK3S_HFLAG_DCBZ32;
@@ -938,6 +942,12 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
 #endif
ulong ext_msr;
 
+   /* Check if we can run the vcpu at all */
+   if (!vcpu->arch.sane) {
+   kvm_run->exit_reason = KVM_EXIT_INT

[PATCH] device-assignment: convert to memory API

2011-08-10 Thread Avi Kivity
Signed-off-by: Avi Kivity 
---
 hw/device-assignment.c |  158 ++--
 hw/device-assignment.h |6 +-
 2 files changed, 75 insertions(+), 89 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 59d9b92..942d3af 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -42,6 +42,8 @@
 #include 
 #include "sysemu.h"
 
+#define MSIX_PAGE_SIZE 0x1000
+
 /* From linux/ioport.h */
 #define IORESOURCE_IO   0x0100  /* Resource type */
 #define IORESOURCE_MEM  0x0200
@@ -97,7 +99,7 @@ static uint32_t assigned_dev_ioport_rw(AssignedDevRegion 
*dev_region,
uint32_t addr, int len, uint32_t *val)
 {
 uint32_t ret = 0;
-uint32_t offset = addr - dev_region->e_physbase;
+uint32_t offset = addr;
 int fd = dev_region->region->resource_fd;
 
 if (fd >= 0) {
@@ -252,33 +254,25 @@ static void slow_bar_writel(void *opaque, 
target_phys_addr_t addr, uint32_t val)
 *out = val;
 }
 
-static CPUWriteMemoryFunc * const slow_bar_write[] = {
-&slow_bar_writeb,
-&slow_bar_writew,
-&slow_bar_writel
-};
-
-static CPUReadMemoryFunc * const slow_bar_read[] = {
-&slow_bar_readb,
-&slow_bar_readw,
-&slow_bar_readl
+static const MemoryRegionOps slow_bar_ops = {
+.old_mmio = {
+.read = { slow_bar_readb, slow_bar_readw, slow_bar_readl, },
+.write = { slow_bar_writeb, slow_bar_writew, slow_bar_writel, },
+},
+.endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static void assigned_dev_iomem_map(PCIDevice *pci_dev, int region_num,
-   pcibus_t e_phys, pcibus_t e_size, int type)
+static void assigned_dev_iomem_setup(PCIDevice *pci_dev, int region_num,
+ pcibus_t e_size)
 {
 AssignedDevice *r_dev = DO_UPCAST(AssignedDevice, dev, pci_dev);
 AssignedDevRegion *region = &r_dev->v_addrs[region_num];
 PCIRegion *real_region = &r_dev->real_device.regions[region_num];
 
-DEBUG("e_phys=%08" FMT_PCIBUS " r_virt=%p type=%d len=%08" FMT_PCIBUS " 
region_num=%d \n",
-  e_phys, region->u.r_virtbase, type, e_size, region_num);
-
-region->e_physbase = e_phys;
-region->e_size = e_size;
-
 if (e_size > 0) {
-cpu_register_physical_memory(e_phys, e_size, region->memory_index);
+memory_region_init(®ion->container, "assigned-dev-container",
+   e_size);
+memory_region_add_subregion(®ion->container, 0, 
®ion->real_iomem);
 
 /* deal with MSI-X MMIO page */
 if (real_region->base_addr <= r_dev->msix_table_addr &&
@@ -286,27 +280,39 @@ static void assigned_dev_iomem_map(PCIDevice *pci_dev, 
int region_num,
 r_dev->msix_table_addr) {
 int offset = r_dev->msix_table_addr - real_region->base_addr;
 
-cpu_register_physical_memory(e_phys + offset,
-TARGET_PAGE_SIZE, r_dev->mmio_index);
+memory_region_add_subregion_overlap(®ion->container,
+offset,
+&r_dev->mmio,
+1);
 }
 }
 }
 
-static void assigned_dev_ioport_map(PCIDevice *pci_dev, int region_num,
-pcibus_t addr, pcibus_t size, int type)
+static const MemoryRegionPortio assigned_dev_old_portio[] = {
+{ 0x1, 1, .read = assigned_dev_ioport_readb, },
+{ 0x1, 2, .read = assigned_dev_ioport_readw, },
+{ 0x1, 4, .read = assigned_dev_ioport_readl, },
+{ 0x1, 1, .write = assigned_dev_ioport_writeb, },
+{ 0x1, 2, .write = assigned_dev_ioport_writew, },
+{ 0x1, 4, .write = assigned_dev_ioport_writel, },
+PORTIO_END_OF_LIST()
+};
+
+static const MemoryRegionOps assigned_dev_ioport_ops = {
+.old_portio = assigned_dev_old_portio,
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void assigned_dev_ioport_setup(PCIDevice *pci_dev, int region_num,
+  pcibus_t size)
 {
 AssignedDevice *r_dev = DO_UPCAST(AssignedDevice, dev, pci_dev);
 AssignedDevRegion *region = &r_dev->v_addrs[region_num];
-int first_map = (region->e_size == 0);
 int r;
 
-region->e_physbase = addr;
 region->e_size = size;
 
-DEBUG("e_phys=0x%" FMT_PCIBUS " r_baseport=%x type=0x%x len=%" FMT_PCIBUS 
" region_num=%d \n",
-  addr, region->u.r_baseport, type, size, region_num);
-
-if (first_map && region->region->resource_fd < 0) {
+if (region->region->resource_fd < 0) {
 r = kvm_add_ioport_region(region->u.r_baseport, region->r_size,
   pci_dev->qdev.hotplugged);
 if (r < 0) {
@@ -314,19 +320,11 @@ static void assigned_dev_ioport_map(PCIDevice *pci_dev, 
int region_num,
 __func__);
 }
 }
-
-register_ioport_read(addr, size, 1, assig

Re: [PATCH 10/10] KVM: PPC: Enable the PAPR CAP for Book3S

2011-08-10 Thread Alexander Graf

On 08/10/2011 02:26 PM, Paul Mackerras wrote:

On Wed, Aug 10, 2011 at 09:59:41AM +0200, Alexander Graf wrote:

Am 10.08.2011 um 06:42 schrieb Paul Mackerras:


On Tue, Aug 09, 2011 at 06:31:48PM +0200, Alexander Graf wrote:


Now that Book3S PV mode can also run PAPR guests, we can add a PAPR cap and
enable it for all Book3S targets. Enabling that CAP switches KVM into PAPR
mode.

Don't we want to enable it only for 64-bit hosts?  Trying to run a
PAPR guest on a 32-bit Book 3S host won't work very well, unless I am
missing something...

I agree that it doesn't make sense, but if anything we should
restrict it to 64-bit _guests_. you can also run 32-bit guests on
64-bit hosts.

I had a look in PAPR and I didn't find anything that says the
processor has to be 64-bit, so I guess a 32-bit PAPR guest is possible
in theory.  However, I don't think there are currently any 32-bit PAPR
operating systems that would use hcalls.


That's what I figured :). The code flow we're affecting here is pretty 
much generic.



And so far, we don't have a single interface setting PVR and PAPR
mode at the same time, so you could still enable PAPR with a 64-bit
guest CPU and then switch to a 32-bit CPU.

It'd be a nightmare to check all configurations on every setter function.

Unless...

We could introduce a sanity check function that gets executed every
time we change PVR or enable PAPR. That could set a variable in the
vcpu struct to indicate that the config is ok. We could then check
that on vcpu_run.

It's probably not worth worrying about it.


Too late, already implemented it :). It really does make sense to have 
some sort of checking here - even if it only means that our hypercall 
implementation can't handle it yet or that we didn't test it. And we can 
put the "HV KVM can only run PAPR" check in there as well.



Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ANNOUNCE] qemu-kvm-0.15.0

2011-08-10 Thread Nadav Har'El
On Wed, Aug 10, 2011, Thomas Mittelstaedt wrote about "Re: [ANNOUNCE] 
qemu-kvm-0.15.0":
> Grabbed the version, created a disk image of 5G and tried to load and
> iso via 
> /usr/local/bin/qemu  -hda ~/virtualdisk.img -cdrom
> ~/Downloads/oneiric-desktop-i386-alpha3-lubuntu.iso -boot d -m 1024
> 
> and it is very, very slow. Tried the same with the installed 0.11

Sounds like maybe you are using Qemu without KVM?
Is running qemu with the "--enable-kvm" option better?

"--enable-kvm" used to be the default in qemu-kvm, but maybe this changed
(I haven't been following the discussion in this area).

-- 
Nadav Har'El|   Wednesday, Aug 10 2011, 10 Av 5771
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Always keep your words soft and sweet,
http://nadav.harel.org.il   |just in case you have to eat them.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] KVM: PPC: Enable the PAPR CAP for Book3S

2011-08-10 Thread Paul Mackerras
On Wed, Aug 10, 2011 at 09:59:41AM +0200, Alexander Graf wrote:
> 
> Am 10.08.2011 um 06:42 schrieb Paul Mackerras :
> 
> > On Tue, Aug 09, 2011 at 06:31:48PM +0200, Alexander Graf wrote:
> > 
> >> Now that Book3S PV mode can also run PAPR guests, we can add a PAPR cap and
> >> enable it for all Book3S targets. Enabling that CAP switches KVM into PAPR
> >> mode.
> > 
> > Don't we want to enable it only for 64-bit hosts?  Trying to run a
> > PAPR guest on a 32-bit Book 3S host won't work very well, unless I am
> > missing something...
> 
> I agree that it doesn't make sense, but if anything we should
> restrict it to 64-bit _guests_. you can also run 32-bit guests on
> 64-bit hosts.

I had a look in PAPR and I didn't find anything that says the
processor has to be 64-bit, so I guess a 32-bit PAPR guest is possible
in theory.  However, I don't think there are currently any 32-bit PAPR
operating systems that would use hcalls.

> And so far, we don't have a single interface setting PVR and PAPR
> mode at the same time, so you could still enable PAPR with a 64-bit
> guest CPU and then switch to a 32-bit CPU.
> 
> It'd be a nightmare to check all configurations on every setter function.
> 
> Unless...
> 
> We could introduce a sanity check function that gets executed every
> time we change PVR or enable PAPR. That could set a variable in the
> vcpu struct to indicate that the config is ok. We could then check
> that on vcpu_run.

It's probably not worth worrying about it.

The rest of the series looks very nice.

Regards,
Paul.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


KVM on RHEL 6.1

2011-08-10 Thread Gonzalo Servat
Hi All,

I have been running KVM on RHEL 5.x successfully, and I'm keen to
upgrade to RHEL6.1 so I'm currently giving it a try. The host appears
to work fine up until the point that I start a VM that was previously
running fine on 5.x. At this point, the system becomes extremely laggy
(noticeable delay between key presses and when they appear via SSH).

I did notice that dstat starts reporting "missed X ticks" as the VM
starts up and the system becomes sluggish.

I have been monitoring the system via dstat/top, and load average,
memory and CPU look fine. I did however notice the number of
interrupts and context switches is a lot higher in RHEL6.1 than it was
on RHEL5.x.

I have not seen anything interesting in /var/log/messages.

Any ideas? Happy to provide whatever info you like.

Hardware: IBM x3850 X5 / 128G RAM
Kernel:  2.6.32-131.6.1.el6
QEMU-KVM: 0.12.1.2-2.160.el6_1.6

Many thanks in advance.

Gonzalo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 3/4] block: add block timer and block throttling algorithm

2011-08-10 Thread Stefan Hajnoczi
On Wed, Aug 10, 2011 at 7:57 AM, Zhi Yong Wu  wrote:
> On Tue, Aug 9, 2011 at 11:19 PM, Stefan Hajnoczi  wrote:
>> On Tue, Aug 9, 2011 at 5:17 AM, Zhi Yong Wu  wrote:
>>> Note:
>>>      1.) When bps/iops limits are specified to a small value such as 511 
>>> bytes/s, this VM will hang up. We are considering how to handle this 
>>> senario.
>>
>> If an I/O request is larger than the limit itself then I think we
>> should let it through with a warning that the limit is too low.  This
> If it will print a waring, it seems to be not a nice way, the
> throtting function will take no effect on this scenario.

Setting the limit below the max request size is a misconfiguration.
It's a problem that the user needs to be aware of and fix, so a
warning is appropriate.  Unfortunately the max request size is not
easy to determine from the block layer and may vary between guest
OSes, that's why we need to do something at runtime.

We cannot leave this case unhandled because it results in the guest
timing out I/O without any information about the problem - that makes
it hard to troubleshoot for the user.  Any other ideas on how to
handle this case?

>>> @@ -90,6 +100,68 @@ int is_windows_drive(const char *filename)
>>>  }
>>>  #endif
>>>
>>> +/* throttling disk I/O limits */
>>> +void bdrv_io_limits_disable(BlockDriverState *bs)
>>> +{
>>> +    bs->io_limits_enabled = false;
>>> +    bs->req_from_queue    = false;
>>> +
>>> +    if (bs->block_queue) {
>>> +        qemu_block_queue_flush(bs->block_queue);
>>> +        qemu_del_block_queue(bs->block_queue);
>>
>> When you fix the acb lifecycle in block-queue.c this will no longer be
>> safe.  Requests that are dispatched still have acbs that belong to
> When the request is dispatched, if it is successfully serviced, our
> acb will been removed.

Yes, that is how the patches work right now.  But the acb lifecycle
that I explained in response to the other patch changes this.

>> this queue.  It is simplest to keep the queue for the lifetime of the
>> BlockDriverState - it's just a list so it doesn't take up much memory.
> I more prefer to removing this queue, even though it is simple and
> harmless to let it exist.

If the existing acbs do not touch their BlockQueue after this point in
the code, then it will be safe to delete the queue since it will no
longer be referenced.  If you can guarantee this then it's fine to
keep this code.

>
>>
>>> +    }
>>> +
>>> +    if (bs->block_timer) {
>>> +        qemu_del_timer(bs->block_timer);
>>> +        qemu_free_timer(bs->block_timer);
>>
>> To prevent double frees:
>>
>> bs->block_timer = NULL;
> Can qemu_free_timer() not ensure that it is NULL? This is not necessary.:)

No, qemu_free_timer() is a function.  qemu_free_timer() cannot change
the pointer variable bs->block_timer because in C functions take
arguments by value.

After qemu_free_timer() bs->block_timer will still hold the old
address of the (freed) timer.  Then when bdrv_close() is called it
does qemu_free_timer() again because bs->block_timer is not NULL.  It
is illegal to free() a pointer twice and it can cause a crash.

>
>>
>>> +    }
>>> +
>>> +    bs->slice_start[0]   = 0;
>>> +    bs->slice_start[1]   = 0;
>>> +
>>> +    bs->slice_end[0]     = 0;
>>> +    bs->slice_end[1]     = 0;
>>> +}
>>> +
>>> +static void bdrv_block_timer(void *opaque)
>>> +{
>>> +    BlockDriverState *bs = opaque;
>>> +    BlockQueue *queue = bs->block_queue;
>>> +
>>> +    qemu_block_queue_flush(queue);
>>> +}
>>> +
>>> +void bdrv_io_limits_enable(BlockDriverState *bs)
>>> +{
>>> +    bs->req_from_queue = false;
>>> +
>>> +    bs->block_queue    = qemu_new_block_queue();
>>> +    bs->block_timer    = qemu_new_timer_ns(vm_clock, bdrv_block_timer, bs);
>>> +
>>> +    bs->slice_start[BLOCK_IO_LIMIT_READ]  = qemu_get_clock_ns(vm_clock);
>>> +    bs->slice_start[BLOCK_IO_LIMIT_WRITE] = qemu_get_clock_ns(vm_clock);
>>> +
>>> +    bs->slice_end[BLOCK_IO_LIMIT_READ]    =
>>> +                      qemu_get_clock_ns(vm_clock) + BLOCK_IO_SLICE_TIME;
>>> +    bs->slice_end[BLOCK_IO_LIMIT_WRITE]   =
>>> +                      qemu_get_clock_ns(vm_clock) + BLOCK_IO_SLICE_TIME;
>>
>> The slice times could just be initialized to 0 here.  When
>> bdrv_exceed_io_limits() is called it will start a new slice.
> The result should be same as current method even though they are set to ZERO.

Yes, the result is the same except we only create new slices in one place.

>>> +            }
>>> +
>>> +            return true;
>>> +        }
>>> +    }
>>> +
>>> +    elapsed_time  = real_time - bs->slice_start[is_write];
>>> +    elapsed_time  /= (BLOCK_IO_SLICE_TIME * 10.0);
>>
>> Why * 10.0?
> elapsed_time currently is in ns, and it need to be translated into a
> floating value in minutes.

I think you mean seconds instead of minutes.  Then the conversion has
nothing to do with BLOCK_IO_SLICE_TIME, it's just nanoseconds to
seconds.  It would be clearer to drop BLOCK_IO_SLICE_TIME from the
expression and divide by a n

Re: [PATCH] kvm tools: Print only serial output to the terminal

2011-08-10 Thread Pekka Enberg
On Wed, Aug 10, 2011 at 12:52 PM, Sasha Levin  wrote:
> This patch changes the serial device to print only auxiliary output to the
> terminal.
>
> Doing so prevents printing output which the guest kernel never intended us
> to print and by printing it we wrote junk to the users terminal.
>
> Signed-off-by: Sasha Levin 
> ---
>  tools/kvm/hw/serial.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
> index 1199264..0393d3d 100644
> --- a/tools/kvm/hw/serial.c
> +++ b/tools/kvm/hw/serial.c
> @@ -213,7 +213,7 @@ static bool serial8250_out(struct ioport *ioport, struct 
> kvm *kvm, u16 port, voi
>                case UART_TX: {
>                        char *addr = data;
>
> -                       if (!(dev->mcr & UART_MCR_LOOP))
> +                       if (!(dev->mcr & (UART_MCR_LOOP | UART_MCR_OUT2)))
>                                term_putc(CONSOLE_8250, addr, size * count);
>
>                        dev->iir                = UART_IIR_NO_INT;
> --
> 1.7.6

Aah, you can actually see the magic 0xff byte in
drivers/tty/serial/8250.c::autoconfig_irq(). Thanks Sasha!

Pekka
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm tools: Print only serial output to the terminal

2011-08-10 Thread Sasha Levin
This patch changes the serial device to print only auxiliary output to the
terminal.

Doing so prevents printing output which the guest kernel never intended us
to print and by printing it we wrote junk to the users terminal.

Signed-off-by: Sasha Levin 
---
 tools/kvm/hw/serial.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
index 1199264..0393d3d 100644
--- a/tools/kvm/hw/serial.c
+++ b/tools/kvm/hw/serial.c
@@ -213,7 +213,7 @@ static bool serial8250_out(struct ioport *ioport, struct 
kvm *kvm, u16 port, voi
case UART_TX: {
char *addr = data;
 
-   if (!(dev->mcr & UART_MCR_LOOP))
+   if (!(dev->mcr & (UART_MCR_LOOP | UART_MCR_OUT2)))
term_putc(CONSOLE_8250, addr, size * count);
 
dev->iir= UART_IIR_NO_INT;
-- 
1.7.6

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 0/4] The intro of QEMU block I/O throttling

2011-08-10 Thread Stefan Hajnoczi
On Wed, Aug 10, 2011 at 01:09:27PM +0800, Zhi Yong Wu wrote:
> On Tue, Aug 9, 2011 at 8:08 PM, Stefan Hajnoczi  wrote:
> > On Tue, Aug 9, 2011 at 5:17 AM, Zhi Yong Wu  
> > wrote:
> >>  Makefile.objs     |    2 +-
> >>  block.c           |  347 
> >> +++--
> >>  block.h           |    6 +-
> >>  block/blk-queue.c |  141 ++
> >>  block/blk-queue.h |   73 +++
> >>  block_int.h       |   30 +
> >>  blockdev.c        |  108 +
> >>  blockdev.h        |    2 +
> >>  hmp-commands.hx   |   15 +++
> >>  qemu-config.c     |   24 
> >>  qemu-option.c     |   17 +++
> >>  qemu-option.h     |    1 +
> >>  qemu-options.hx   |    1 +
> >>  qerror.c          |    4 +
> >>  qerror.h          |    3 +
> >>  qmp-commands.hx   |   52 -
> >>  16 files changed, 813 insertions(+), 13 deletions(-)
> >>  create mode 100644 block/blk-queue.c
> >>  create mode 100644 block/blk-queue.h
> >
> > This series is against qemu-kvm.git, please send block patches against
> > qemu.git/master in the future.  There is no qemu-kvm.git-specific code
> > here so just working against qemu.git and emailing qemu-devel is best.
> Do you know how to rebase it from qemu-kvm.git to qemu.git directly?

This is what I do:

# Export patches for all commits I have made since master
$ cd qemu-kvm
$ git format-patch master..

# Import patches into qemu.git
$ cd qemu && git checkout -b io_limits master
$ git am ~/qemu-kvm/*.patch

git-am(1) will ask you to resolve any conflicts.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 2/4] block: add the block queue support

2011-08-10 Thread Stefan Hajnoczi
On Wed, Aug 10, 2011 at 01:54:33PM +0800, Zhi Yong Wu wrote:
> On Tue, Aug 9, 2011 at 8:49 PM, Stefan Hajnoczi  wrote:
> > On Tue, Aug 9, 2011 at 5:17 AM, Zhi Yong Wu  
> > wrote:
> >> +BlockDriverAIOCB *qemu_block_queue_enqueue(BlockQueue *queue,
> >> +                        BlockDriverState *bs,
> >> +                        BlockRequestHandler *handler,
> >> +                        int64_t sector_num,
> >> +                        QEMUIOVector *qiov,
> >> +                        int nb_sectors,
> >> +                        BlockDriverCompletionFunc *cb,
> >> +                        void *opaque)
> >> +{
> >> +    BlockIORequest *request;
> >> +    BlockDriverAIOCB *acb;
> >> +
> >> +    request = qemu_malloc(sizeof(BlockIORequest));
> >> +    request->bs = bs;
> >> +    request->handler = handler;
> >> +    request->sector_num = sector_num;
> >> +    request->qiov = qiov;
> >> +    request->nb_sectors = nb_sectors;
> >> +    request->cb = cb;
> >> +    request->opaque = opaque;
> >> +
> >> +    QTAILQ_INSERT_TAIL(&queue->requests, request, entry);
> >
> > It would be simpler to define BlockQueueAIOCB and using it as our acb
> > instead of managing an extra BlockIORequest structure.  That way you
> > don't need to worry about extra mallocs and frees.
> Sorry, i don't get what you mean. how to define it? Can you elaborate?

BlockDriverAIOCB is designed to be embedded inside a bigger struct.  For
example, QEDAIOCB is a larger struct that contains BlockDriverAIOCB as
its first field:

typedef struct QEDAIOCB {
BlockDriverAIOCB common;
...
} QEDAIOCB;

And the QED AIOPool contains the size of QEDAIOCB so that qemu_aio_get() can
allocate the full QEDAIOCB struct:

static AIOPool qed_aio_pool = {
.aiocb_size = sizeof(QEDAIOCB),
.cancel = qed_aio_cancel,
};

This allows QED to store per-request state in QEDAIOCB for the lifetime of a
request:

QEDAIOCB *acb = qemu_aio_get(&qed_aio_pool, bs, cb, opaque);

acb->is_write = is_write;
acb->finished = NULL;
acb->qiov = qiov;
...

I suggest creating a BlockQueueAIOCB that contains the fields from
BlockIORequest (which is no longer needed as a separate struct):

typedef struct BlockQueueAIOCB {
BlockDriverAIOCB common;
BlockRequestHandler *handler;
int64_t sector_num;
QEMUIOVector *qiov;
int nb_sectors;
QTAILQ_ENTRY(BlockQueueAIOCB) entry; /* pending request queue */
} BlockQueueAIOCB;

Now you can drop the malloc and simply qemu_aio_get() a new
BlockQueueAIOCB.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] kvm tools: handle failure of command

2011-08-10 Thread Asias He
On 08/10/2011 02:49 PM, Pekka Enberg wrote:
> On Wed, Aug 10, 2011 at 9:32 AM, walimis  wrote:
>> I used to use tap to enable network for guest os, so that I can mount
>> nfs rootfs. meanwhile, I can access the guest os from host os through the
>> network. For kvm tools, I don't know how to do that through userspace
>> networking. Maybe we need a brief explanation in README.
> 
> Right. I've never used that. Asias is that supported?
> 

I have not tried nfs with our user mode network. But I think walimis can
do it with

sudo ./kvm --network tap

We need sudo because we need CAP_NET_ADMIN to open /dev/net/tun.

By default, the ip address in host side is:

192.168.33.1/24

In guest side, you need to config a ip address for our guest:

$ sudo ifconfig eth0 192.168.33.15
$ sudo route add default gw 192.168.33.1
$ ping 192.168.33.1

That's it.

If you want to access the real world in guest, try NAT or Bridge the tap
device.



-- 
Best Regards,
Asias He
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v5 1/4] block: add the command line support

2011-08-10 Thread Stefan Hajnoczi
On Wed, Aug 10, 2011 at 01:20:22PM +0800, Zhi Yong Wu wrote:
> On Tue, Aug 9, 2011 at 8:25 PM, Stefan Hajnoczi  wrote:
> > On Tue, Aug 9, 2011 at 5:17 AM, Zhi Yong Wu  
> > wrote:
> >> Signed-off-by: Zhi Yong Wu 
> >> ---
> >>  Makefile.objs   |    2 +-
> >>  blockdev.c      |   39 +++
> >>  qemu-config.c   |   24 
> >>  qemu-option.c   |   17 +
> >>  qemu-option.h   |    1 +
> >>  qemu-options.hx |    1 +
> >>  6 files changed, 83 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/Makefile.objs b/Makefile.objs
> >> index 9f99ed4..06f2033 100644
> >> --- a/Makefile.objs
> >> +++ b/Makefile.objs
> >> @@ -23,7 +23,7 @@ block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o 
> >> cloop.o dmg.o bochs.o vpc.o vv
> >>  block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o 
> >> qcow2-snapshot.o qcow2-cache.o
> >>  block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o 
> >> qed-cluster.o
> >>  block-nested-y += qed-check.o
> >> -block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
> >> +block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o 
> >> blk-queue.o
> >
> > This does not build:
> >  LINK  qemu-ga
> > gcc: error: block/blk-queue.o: No such file or directory
> >
> > This Makefile.objs change should be in the commit that adds blk-queue.c.
> >
> > Each patch in a series should compile cleanly and can only depend on
> > previous patches.  This is important so that git-bisect(1) can be
> > used, it only works if every commit builds a working program.  It also
> > makes patch review easier when the patch series builds up logically.
> It seems that it will take a bit much time if we strictly stage the
> hunks into each corresponding patch.:)
> OK, i will.

Some people like using Stacked Git to manage patch series:
http://www.procode.org/stgit/

I typically just use git rebase -i and git add -i manually to clean up
patch series.

It also becomes easier once you plan to write patches that follow these
guidelines.

> >> +    /* disk io throttling */
> >> +    iol_flag = qemu_opt_io_limits_enable_flag(opts, iol_opts);
> >> +    if (iol_flag) {
> >> +        memset(&io_limits, 0, sizeof(BlockIOLimit));
> >> +
> >> +        io_limits.bps[BLOCK_IO_LIMIT_TOTAL]  =
> >> +                           qemu_opt_get_number(opts, "bps", 0);
> >> +        io_limits.bps[BLOCK_IO_LIMIT_READ]   =
> >> +                           qemu_opt_get_number(opts, "bps_rd", 0);
> >> +        io_limits.bps[BLOCK_IO_LIMIT_WRITE]  =
> >> +                           qemu_opt_get_number(opts, "bps_wr", 0);
> >> +        io_limits.iops[BLOCK_IO_LIMIT_TOTAL] =
> >> +                           qemu_opt_get_number(opts, "iops", 0);
> >> +        io_limits.iops[BLOCK_IO_LIMIT_READ]  =
> >> +                           qemu_opt_get_number(opts, "iops_rd", 0);
> >> +        io_limits.iops[BLOCK_IO_LIMIT_WRITE] =
> >> +                           qemu_opt_get_number(opts, "iops_wr", 0);
> >> +
> >> +        if (((io_limits.bps[BLOCK_IO_LIMIT_TOTAL] != 0)
> >> +            && ((io_limits.bps[BLOCK_IO_LIMIT_READ] != 0)
> >> +            || (io_limits.bps[BLOCK_IO_LIMIT_WRITE] != 0)))
> >> +            || ((io_limits.iops[BLOCK_IO_LIMIT_TOTAL] != 0)
> >> +            && ((io_limits.iops[BLOCK_IO_LIMIT_READ] != 0)
> >> +            || (io_limits.iops[BLOCK_IO_LIMIT_WRITE] != 0 {
> >> +            error_report("bps(iops) and bps_rd/bps_wr(iops_rd/iops_wr) \
> >> +                                        cannot be used at the same time");
> >> +            return NULL;
> >> +        }
> >> +    }
> >> +
> >>     on_write_error = BLOCK_ERR_STOP_ENOSPC;
> >>     if ((buf = qemu_opt_get(opts, "werror")) != NULL) {
> >>         if (type != IF_IDE && type != IF_SCSI && type != IF_VIRTIO && type 
> >> != IF_NONE) {
> >> @@ -483,6 +517,11 @@ DriveInfo *drive_init(QemuOpts *opts, int 
> >> default_to_scsi)
> >>
> >>     bdrv_set_on_error(dinfo->bdrv, on_read_error, on_write_error);
> >>
> >> +    /* disk I/O throttling */
> >> +    if (iol_flag) {
> >> +        bdrv_set_io_limits(dinfo->bdrv, &io_limits);
> >> +    }
> >
> > iol_flag and qemu_opt_io_limits_enable_flag() are not necessary.  If
> > no limits were set then all fields will be 0 (unlimited).
> Are they not necessary here? why? qemu_opt_io_limits_enable_flag is
> used to determine if io_limits is enabled.
> If yes, iol_flag will be set to ONE. So i think that they are necessay here.

There are two possible cases: the user does not set any options or the
user sets at least one option.  In both cases io_limits will be
initialized correctly, here is why:

When an option is not specified by the user the value will be 0, which
means "unlimited".  bdrv_set_io_limits() calls
bdrv_io_limits_enabled(bs) to check whether any I/O limit is non-zero.
This means the iol_flag check is already being done by
bdrv_set_io_limits() and there is no need to duplicate it.  The iol_flag
code can be eliminated and

Re: [PATCH] kvm tools: Sanitize output characters in serial console

2011-08-10 Thread Pekka Enberg
On Wed, Aug 10, 2011 at 11:31 AM, Asias He  wrote:
>> I think the reason is not important here. I use qemu to test
>> and it also outputs that extra characters. They seem to be outputted by
>> driver or kernel of guest os.
>>
>> It's not sane to change the output of guest os. Suppose if we just
>> want to see what "cat /dev/urandom" prints, then only
>> allow ascii characters to be outputted? I think that's not what we expect.
>
> This is a good reason why we should not only allow ascii to be printed.
>
> Anyway, this is really annoying. Maybe it is a serial driver bug?

Could be. It's probably best to track down where the byte is being
sent in the driver.

Pekka
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH] Permit -mem-path without sync mmu

2011-08-10 Thread Avi Kivity

On 08/10/2011 08:10 AM, David Gibson wrote:

On Mon, Aug 08, 2011 at 11:24:09AM +0300, Avi Kivity wrote:
>  On 08/08/2011 09:03 AM, David Gibson wrote:
>  >Second, if userspace qemu passing hugepages to kvm can cause (host)
>  >kernel memory corruption, that is clearly a host kernel bug.  So am I
>  >correct in thinking this is basically just a safety feature if qemu is
>  >run on a buggy kernel.
>
>  Seems so, yes.  2.6.2[456] are exploitable.  We only found out after
>  these were all released.
>
>  >Presumably this bug was corrected at some
>  >point?  Is the presence of the SYNC_MMU feature just being used as a
>  >proxy for "is this kernel recent enough to have the corruption bug
>  >fixed"?
>
>  SYNC_MMU actually fixes the bug.

Ah, so SYNC_MMU fixed the bug on x86, and all the other archs without
SYNC_MMU were left with a serious memory corruption bug, under a
userspace bandaid.  Thanks for that.


Unfortunately it's all too easy to ignore non-x86.

It may be considered that not implementing SYNC_MMU is a bug in itself, 
as it allows userspace to pin arbitrary amounts of user memory.  At 
least on x86 we had shrinkers that kill off shadow page tables under 
memory pressure, unpinning memory, but I don't see it on ppc.



As I understand the bug that causes the problem, it's because removing
all the hugepage VMAs from userspace will cause the inode (and
therefore address_space) for the hugepage file to be freed, but not
the pages (because another ref is held by kvm).  Then when kvm
releases the pages, the address_space will be touched after free from
free_huge_page().

This would seem to be a genuine bug in the hugepage code, which has
just been hidden by SYNC_MMU.  It should be quite easy to fix - the
mapping is only stored in the struct page to get to the hugetlbfs
superblock, so we could just store a direct superblock pointer
instead, and bump it's refcount when we put that in the page private
pointer.

But then I'm not sure how qemu would detect that it's on a kernel
where the bug is fixed and allow -mem-path to be used again.  Any
ideas?


If it's just a kernel bug, the fix belongs in the kernel, not in qemu.

We used to have KVM_CAPs to declare this sort of thing 
(KVM_CAP_HUGETLBFS_WORKS_EVEN_WITHOUT_SYNC_MMU) but I don't think it was 
a good idea.


--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Sanitize output characters in serial console

2011-08-10 Thread Asias He
On 08/10/2011 02:53 PM, walimis wrote:
> On Wed, Aug 10, 2011 at 02:15:45PM +0800, Asias He wrote:
>> On 08/10/2011 01:55 PM, Pekka Enberg wrote:
>>> On Wed, Aug 10, 2011 at 8:43 AM, Asias He  wrote:
 On 08/10/2011 01:30 PM, Pekka Enberg wrote:
> On 8/10/11 2:56 AM, Asias He wrote:
>> This patch fixes strange characters in serial console.
>>
>> Before:
>>
>> [0.448000] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>> �[0.695000] serial8250: ttyS0 at I/O 0x3f8 (irq = 0) is a 16550A
>> �[0.942000] serial8250: ttyS1 at I/O 0x2f8 (irq = 0) is a 16550A
>> �[1.189000] serial8250: ttyS2 at I/O 0x3e8 (irq = 0) is a 16550A
>> [1.194000] Non-volatile memory driver v1.3
>>
>> After:
>>
>> [0.541000] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>> [0.788000] serial8250: ttyS0 at I/O 0x3f8 (irq = 0) is a 16550A
>> [1.041000] serial8250: ttyS1 at I/O 0x2f8 (irq = 0) is a 16550A
>> [1.294000] serial8250: ttyS2 at I/O 0x3e8 (irq = 0) is a 16550A
>> [1.309000] Non-volatile memory driver v1.3
>>
>> Signed-off-by: Asias He
>> ---
>>   tools/kvm/term.c |7 +--
>>   1 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/kvm/term.c b/tools/kvm/term.c
>> index 2a3e1f0..85b41e7 100644
>> --- a/tools/kvm/term.c
>> +++ b/tools/kvm/term.c
>> @@ -5,6 +5,7 @@
>>   #include
>>   #include
>>   #include
>> +#include
>>
>>   #include "kvm/read-write.h"
>>   #include "kvm/term.h"
>> @@ -57,8 +58,10 @@ int term_putc(int who, char *addr, int cnt)
>>   if (who != active_console)
>>   return -1;
>>
>> -while (cnt--)
>> -fprintf(stdout, "%c", *addr++);
>> +while (cnt--) {
>> +if (isascii(*addr))
>
> Do things like backspace still work with this?

 Sure. Have a try ;-)

 http://en.wikipedia.org/wiki/ASCII
>>>
>>> OK, cool. Do we know what the extra characters are and why the guest
>>> is sending them?
> I think the reason is not important here. I use qemu to test
> and it also outputs that extra characters. They seem to be outputted by
> driver or kernel of guest os.
> 
> It's not sane to change the output of guest os. Suppose if we just 
> want to see what "cat /dev/urandom" prints, then only 
> allow ascii characters to be outputted? I think that's not what we expect.


This is a good reason why we should not only allow ascii to be printed.

Anyway, this is really annoying. Maybe it is a serial driver bug?

> 
> walimis
>>>
>>
>>
>> This tells us
>>
>>while (cnt--) {
>>   if (isascii(*addr))
>>   fprintf(stdout, "%c", *addr++);
>>   else
>>   fprintf(stdout, "\n---%x---\n", *addr++);
>>}
>>
>> the extra chars is 0xff.
>>
>> The reason is not understood.
>>
>> -- 
>> Best Regards,
>> Asias He
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
Best Regards,
Asias He
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] kvm tools: remove unused options

2011-08-10 Thread Ingo Molnar

* Pekka Enberg  wrote:

> On Tue, 9 Aug 2011, Ingo Molnar wrote:
> >>kvm "--help" and "--version" are not implemented, so remove them to
> >>avoid ambiguous.
> >
> >would be nice to implement them: 'kvm --help' should probably map to
> >'kvm help', and 'kvm --version' should output something similar to
> >what 'perf --version' does:
> >
> >$ perf --version
> >perf version 3.1.rc1.1356.g57e4aa
> >
> >which is derived from the base kernel sha1.
> 
> We already have that :-)
> 
> $ ./kvm version
> 3.0.rc5.755.ge8c28

Yeah - but --version is also a common thing to type. So in perf we 
support both (Git does the same):

 $ perf version
 perf version 3.1.rc1.1356.g57e4aa
 $ perf --version
 perf version 3.1.rc1.1356.g57e4aa
 $ git version
 git version 1.7.6
 $ git --version
 git version 1.7.6
 $

Btw., it might also be better to output:

 kvm version 3.0.1614.geb2e2c

Instead of the current:

 3.0.1614.geb2e2c

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] KVM: PPC: Enable the PAPR CAP for Book3S

2011-08-10 Thread Alexander Graf

Am 10.08.2011 um 06:42 schrieb Paul Mackerras :

> On Tue, Aug 09, 2011 at 06:31:48PM +0200, Alexander Graf wrote:
> 
>> Now that Book3S PV mode can also run PAPR guests, we can add a PAPR cap and
>> enable it for all Book3S targets. Enabling that CAP switches KVM into PAPR
>> mode.
> 
> Don't we want to enable it only for 64-bit hosts?  Trying to run a
> PAPR guest on a 32-bit Book 3S host won't work very well, unless I am
> missing something...

I agree that it doesn't make sense, but if anything we should restrict it to 
64-bit _guests_. you can also run 32-bit guests on 64-bit hosts.

And so far, we don't have a single interface setting PVR and PAPR mode at the 
same time, so you could still enable PAPR with a 64-bit guest CPU and then 
switch to a 32-bit CPU.

It'd be a nightmare to check all configurations on every setter function.

Unless...

We could introduce a sanity check function that gets executed every time we 
change PVR or enable PAPR. That could set a variable in the vcpu struct to 
indicate that the config is ok. We could then check that on vcpu_run.


Alex

> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Sanitize output characters in serial console

2011-08-10 Thread Amos Kong
On Wed, Aug 10, 2011 at 2:53 PM, walimis  wrote:
> On Wed, Aug 10, 2011 at 02:15:45PM +0800, Asias He wrote:
>>On 08/10/2011 01:55 PM, Pekka Enberg wrote:
>>> On Wed, Aug 10, 2011 at 8:43 AM, Asias He  wrote:
 On 08/10/2011 01:30 PM, Pekka Enberg wrote:
> On 8/10/11 2:56 AM, Asias He wrote:
>> This patch fixes strange characters in serial console.

...
>>> OK, cool. Do we know what the extra characters are and why the guest
>>> is sending them?
> I think the reason is not important here. I use qemu to test
> and it also outputs that extra characters. They seem to be outputted by
> driver or kernel of guest os.
>
> It's not sane to change the output of guest os. Suppose if we just
> want to see what "cat /dev/urandom" prints, then only
> allow ascii characters to be outputted? I think that's not what we expect.

So we don't need this patch, otherwise, the possible error would be hidden.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: qemu-kvm 0.15 usb problem

2011-08-10 Thread Gerd Hoffmann

On 08/09/11 21:30, Michael wrote:

Hi All,
After installed 0.15 ( and 0.15rc1) guest VM failed at some point with
USB attached with error:
qemu-system-x86_64: /tmp/qemu-kvm-0.15.0/hw/usb.c:336:
usb_packet_complete: Assertion `p->owner != ((void *)0)' failed.


Looks like a bug.


host intel,linux-3.0 ( debian squeeze x64 )
start with "qemu-system-x86_64 -m 512 -smp 2 /BIG/Virtual/debian6.base
-usb -usbdevice host:067b:2303"


What kind of device is this?

cheers,
  Gerd

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Sanitize output characters in serial console

2011-08-10 Thread walimis
On Wed, Aug 10, 2011 at 02:15:45PM +0800, Asias He wrote:
>On 08/10/2011 01:55 PM, Pekka Enberg wrote:
>> On Wed, Aug 10, 2011 at 8:43 AM, Asias He  wrote:
>>> On 08/10/2011 01:30 PM, Pekka Enberg wrote:
 On 8/10/11 2:56 AM, Asias He wrote:
> This patch fixes strange characters in serial console.
>
> Before:
>
> [0.448000] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> �[0.695000] serial8250: ttyS0 at I/O 0x3f8 (irq = 0) is a 16550A
> �[0.942000] serial8250: ttyS1 at I/O 0x2f8 (irq = 0) is a 16550A
> �[1.189000] serial8250: ttyS2 at I/O 0x3e8 (irq = 0) is a 16550A
> [1.194000] Non-volatile memory driver v1.3
>
> After:
>
> [0.541000] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [0.788000] serial8250: ttyS0 at I/O 0x3f8 (irq = 0) is a 16550A
> [1.041000] serial8250: ttyS1 at I/O 0x2f8 (irq = 0) is a 16550A
> [1.294000] serial8250: ttyS2 at I/O 0x3e8 (irq = 0) is a 16550A
> [1.309000] Non-volatile memory driver v1.3
>
> Signed-off-by: Asias He
> ---
>   tools/kvm/term.c |7 +--
>   1 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/tools/kvm/term.c b/tools/kvm/term.c
> index 2a3e1f0..85b41e7 100644
> --- a/tools/kvm/term.c
> +++ b/tools/kvm/term.c
> @@ -5,6 +5,7 @@
>   #include
>   #include
>   #include
> +#include
>
>   #include "kvm/read-write.h"
>   #include "kvm/term.h"
> @@ -57,8 +58,10 @@ int term_putc(int who, char *addr, int cnt)
>   if (who != active_console)
>   return -1;
>
> -while (cnt--)
> -fprintf(stdout, "%c", *addr++);
> +while (cnt--) {
> +if (isascii(*addr))

 Do things like backspace still work with this?
>>>
>>> Sure. Have a try ;-)
>>>
>>> http://en.wikipedia.org/wiki/ASCII
>> 
>> OK, cool. Do we know what the extra characters are and why the guest
>> is sending them?
I think the reason is not important here. I use qemu to test
and it also outputs that extra characters. They seem to be outputted by
driver or kernel of guest os.

It's not sane to change the output of guest os. Suppose if we just 
want to see what "cat /dev/urandom" prints, then only 
allow ascii characters to be outputted? I think that's not what we expect.

walimis
>> 
>
>
>This tells us
>
>while (cnt--) {
>   if (isascii(*addr))
>   fprintf(stdout, "%c", *addr++);
>   else
>   fprintf(stdout, "\n---%x---\n", *addr++);
>}
>
>the extra chars is 0xff.
>
>The reason is not understood.
>
>-- 
>Best Regards,
>Asias He
>--
>To unsubscribe from this list: send the line "unsubscribe kvm" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm tools: Sanitize output characters in serial console

2011-08-10 Thread Sasha Levin
On Wed, 2011-08-10 at 14:15 +0800, Asias He wrote:
> On 08/10/2011 01:55 PM, Pekka Enberg wrote:
> > On Wed, Aug 10, 2011 at 8:43 AM, Asias He  wrote:
> >> On 08/10/2011 01:30 PM, Pekka Enberg wrote:
> >>> On 8/10/11 2:56 AM, Asias He wrote:
>  This patch fixes strange characters in serial console.
> 
>  Before:
> 
>  [0.448000] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>  �[0.695000] serial8250: ttyS0 at I/O 0x3f8 (irq = 0) is a 16550A
>  �[0.942000] serial8250: ttyS1 at I/O 0x2f8 (irq = 0) is a 16550A
>  �[1.189000] serial8250: ttyS2 at I/O 0x3e8 (irq = 0) is a 16550A
>  [1.194000] Non-volatile memory driver v1.3
> 
>  After:
> 
>  [0.541000] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>  [0.788000] serial8250: ttyS0 at I/O 0x3f8 (irq = 0) is a 16550A
>  [1.041000] serial8250: ttyS1 at I/O 0x2f8 (irq = 0) is a 16550A
>  [1.294000] serial8250: ttyS2 at I/O 0x3e8 (irq = 0) is a 16550A
>  [1.309000] Non-volatile memory driver v1.3
> 
>  Signed-off-by: Asias He
>  ---
>    tools/kvm/term.c |7 +--
>    1 files changed, 5 insertions(+), 2 deletions(-)
> 
>  diff --git a/tools/kvm/term.c b/tools/kvm/term.c
>  index 2a3e1f0..85b41e7 100644
>  --- a/tools/kvm/term.c
>  +++ b/tools/kvm/term.c
>  @@ -5,6 +5,7 @@
>    #include
>    #include
>    #include
>  +#include
> 
>    #include "kvm/read-write.h"
>    #include "kvm/term.h"
>  @@ -57,8 +58,10 @@ int term_putc(int who, char *addr, int cnt)
>    if (who != active_console)
>    return -1;
> 
>  -while (cnt--)
>  -fprintf(stdout, "%c", *addr++);
>  +while (cnt--) {
>  +if (isascii(*addr))
> >>>
> >>> Do things like backspace still work with this?
> >>
> >> Sure. Have a try ;-)
> >>
> >> http://en.wikipedia.org/wiki/ASCII
> > 
> > OK, cool. Do we know what the extra characters are and why the guest
> > is sending them?
> > 
> 
> 
> This tells us
> 
> while (cnt--) {
>if (isascii(*addr))
>fprintf(stdout, "%c", *addr++);
>else
>fprintf(stdout, "\n---%x---\n", *addr++);
> }
> 
> the extra chars is 0xff.
> 
> The reason is not understood.
> 

They are being printed when the serial device is probed. Can we
mistakenly interpret the probe as a print request?

-- 

Sasha.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html