Important Courses

2010-08-03 Thread new association2010
  American Research Center
  A.R.C.

Tel: (00202) 25082727 - 2508884183
Fax: (00202) 25082727
E-mail: new.a...@ymail.com
Mobile: 0020 - 162494849


To: The University Vice-President for Library Affairs

The center is pleased to offer applied scientific studies on the
following topics:

Engineering Design

(Development of computer-based design systems - creative solving of
engineering problems - imaginative design and manufacturing.)

Engineering Manufacturing

(Design and manufacturing in support of development - use of
computer-aided manufacturing in industry - manufacturing mechanisms and
rapid-response solutions for product development.)

Engineering Sciences

(Engineering design of factories and machines that reduces noise by
measuring it and identifying its causes - development of flexible
machine design in factories - preventing wear of machines and pumps in
factories.)

Medical Engineering

(Improving the design of hospitals and their equipment while reducing
carbon dioxide emissions - development of artificial joint manufacturing
in medicine - biological mechanisms of joint implantation.)

Nuclear Engineering

(Confirming structural integrity and assessing risk - improving the
efficiency of the petrochemical, gas and petroleum industries -
acquiring specialized data and technology for subsea petroleum
exploration.)

Energy Industries

(Power generation from carbon waste by pyrolysis - clean energy
technology from solar power - power generation using landfill gas.)

For inquiries, please contact us by phone or e-mail.
General Manager, Research Sector


Re: KVM Processor cache size

2010-08-03 Thread Dor Laor

On 08/03/2010 02:36 AM, Anthony Liguori wrote:

On 08/02/2010 05:42 PM, Andre Przywara wrote:

Anthony Liguori wrote:

On 08/02/2010 08:49 AM, Ulrich Drepper wrote:

glibc uses the cache size information returned by cpuid to perform
optimizations. For instance, copy operations which would pollute too
much of the cache because they are large will use non-temporal
instructions. There are real performance benefits.
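Ulrich's point can be checked from inside a guest: on Linux, the cache geometry glibc derived (from CPUID or sysfs) is visible through sysconf, and inside a KVM guest it reflects whatever the hypervisor exposed. A minimal sketch, assuming a Linux guest with a glibc-based Python build:

```python
import os

def reported_cache_sizes():
    """Query the cache sizes the C library derived from CPUID/sysfs.

    Inside a KVM guest these reflect whatever topology the hypervisor
    chose to expose, which is what glibc tunes its non-temporal-copy
    threshold against.
    """
    sizes = {}
    for name in ("SC_LEVEL1_DCACHE_SIZE",
                 "SC_LEVEL2_CACHE_SIZE",
                 "SC_LEVEL3_CACHE_SIZE"):
        try:
            sizes[name] = os.sysconf(name)
        except (ValueError, OSError):
            sizes[name] = -1  # not reported on this platform
    return sizes

print(reported_cache_sizes())
```

If these values change under a guest across live migration, copies tuned for the old sizes keep running with the wrong strategy.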


I imagine that there would be real performance problems from doing
live migration with -cpu host too if we don't guarantee these values
remain stable across migration...

Again, -cpu host is not meant to be migrated.


Then it needs to prevent migration from happening. Otherwise, it's a bug
waiting to happen.


There are other virtualization use cases than cloud-like server
virtualization. Sometimes users don't care about migration (or even
the live version), but want full CPU exposure for performance reasons
(think of virtualizing Windows on a Linux desktop).
I agree that -cpu host and migration should be addressed, but only to
a certain degree. And missing migration experience should not be a
road blocker for -cpu host.


When we can reasonably prevent it, we should prevent users from shooting
themselves in the foot. Honestly, I think -cpu host is exactly what you
would want to use in a cloud. A lot of private clouds and even public
clouds are largely based on homogenous hardware.


There are two good solutions for that:
a. keep adding newer -cpu definition like the Penryn, Nehalem,
   Opteron_gx, so newer models will be abstracted as similar to the
   physical properties
b. Use strict flag with -cpu host and pass the info with the live
   migration protocol.
   Our live migration protocol can do a better job of validating the
   cmdline and the current set of devices/hw on the src/dst, and fail
   migration if there is a diff. Today we rely on libvirt for that;
   another mechanism will surely help, especially for -cpu host.
   The goodie is that there won't be a need to wait for the non-live
   migration part, and more cpu cycles will be saved.
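Option (b) amounts to shipping the source CPU description with the migration stream and refusing to complete migration when the destination cannot satisfy it. A toy sketch of that validation, with made-up field names (the real protocol carries much more state):

```python
def validate_migration(src_cpu, dst_cpu):
    """Toy model of option (b): fail migration when the destination
    cannot provide every CPU property the source guest started with.
    Field names are illustrative, not the real migration protocol.
    """
    missing = set(src_cpu["flags"]) - set(dst_cpu["flags"])
    if missing:
        raise ValueError("migration blocked, dst lacks: %s" % sorted(missing))
    if src_cpu["model"] != dst_cpu["model"]:
        raise ValueError("migration blocked: model mismatch")

src = {"model": "Nehalem", "flags": ["sse4_2", "popcnt"]}
dst = {"model": "Nehalem", "flags": ["sse4_2", "popcnt", "aes"]}
validate_migration(src, dst)  # extra destination features are fine
```

The asymmetry is deliberate: the destination may offer more features than the source, but never fewer.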



I actually think the case where you want to migrate between heterogenous
hardware is grossly overstated.

Regards,

Anthony Liguori



Regards,
Andre.



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html




Re: bad O_DIRECT read and write performance with small block sizes with virtio

2010-08-03 Thread Dor Laor

On 08/02/2010 11:50 PM, Stefan Hajnoczi wrote:

On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguori anth...@codemonkey.ws wrote:

On 08/02/2010 12:15 PM, John Leach wrote:


Hi,

I've come across a problem with read and write disk IO performance when
using O_DIRECT from within a kvm guest.  With O_DIRECT, reads and writes
are much slower with smaller block sizes.  Depending on the block size
used, I've seen 10 times slower.

For example, with an 8k block size, reading directly from /dev/vdb
without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s.

As a comparison, reading in O_DIRECT mode in 8k blocks directly from the
backend device on the host gives 2.3 GB/s.  Reading in O_DIRECT mode
from a xen guest on the same hardware manages 263 MB/s.



Stefan has a few fixes for this behavior that help a lot.  One of them
(avoiding memset) is already upstream but not in 0.12.x.

The other two are not done yet but should be on the ML in the next couple
weeks.  They involve using ioeventfd for notification and unlocking the
block queue lock while doing a kick notification.


Thanks for mentioning those patches.  The ioeventfd patch will be sent
this week, I'm checking that migration works correctly and then need
to check that vhost-net still works.


Writing is affected in the same way, and exhibits the same behaviour
with O_SYNC too.

Watching with vmstat on the host, I see the same number of blocks being
read, but about 14 times the number of context switches in O_DIRECT mode
(4500 cs vs. 63000 cs) and a little more cpu usage.

The device I'm writing to is a device-mapper zero device that generates
zeros on read and throws away writes, you can set it up
at /dev/mapper/zero like this:

echo 0 21474836480 zero | dmsetup create zero

My libvirt config for the disk is:

<disk type='block' device='disk'>
   <driver cache='none'/>
   <source dev='/dev/mapper/zero'/>
   <target dev='vdb' bus='virtio'/>
   <address type='pci' domain='0x' bus='0x00' slot='0x06'
function='0x0'/>
</disk>

which translates to the kvm arg:

-device
virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
-drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none


Using aio=native and changing the I/O scheduler on the host to deadline
should help as well.




I'm testing with dd:

dd if=/dev/vdb of=/dev/null bs=8k iflag=direct

As a side note, as you increase the block size read performance in
O_DIRECT mode starts to overtake non O_DIRECT mode reads (from about
150k block size). By 550k block size I'm seeing 1 GB/s reads with
O_DIRECT and 770 MB/s without.
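The crossover John describes is what a fixed per-request cost predicts: each O_DIRECT request pays a roughly constant overhead (guest/host notifications, context switches), so small blocks are overhead-dominated while large blocks approach device bandwidth. A toy model, with illustrative numbers rather than measurements:

```python
def throughput_mb_s(block_kb, per_req_overhead_us, bw_mb_s):
    """MB/s for a stream of block_kb requests when each request pays a
    fixed overhead (exits, notifications) plus transfer time.
    The parameters below are illustrative, not measured values."""
    xfer_us = (block_kb / 1024.0) / bw_mb_s * 1e6
    return (block_kb / 1024.0) / ((per_req_overhead_us + xfer_us) / 1e6)

# A heavy per-request cost (say ~100us on the O_DIRECT path) dominates
# small blocks but barely matters at large ones:
for bs in (8, 64, 512):
    print(bs, round(throughput_mb_s(bs, 100, 2300), 1))
```

With these made-up constants an 8k block lands in the same ballpark as the 79 MB/s John reports, which is at least consistent with a per-request-overhead explanation.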


Can you take QEMU out of the picture and run the same test on the host:

dd if=/dev/vdb of=/dev/null bs=8k iflag=direct
vs
dd if=/dev/vdb of=/dev/null bs=8k

This isn't quite the same because QEMU will use a helper thread doing
preadv.  I'm not sure what syscall dd will use.

It should be close enough to determine whether QEMU and device
emulation are involved at all though, or whether these differences are
due to the host kernel code path down to the device mapper zero device
being different for normal vs O_DIRECT.

Stefan




Re: [PATCH] kvm cleanup: Introduce sibling_pte and do cleanup for reverse map and parent_pte

2010-08-03 Thread Avi Kivity

 On 08/03/2010 05:30 AM, Lai Jiangshan wrote:

This patch is just a big cleanup. it reduces 220 lines of code.

It introduces sibling_pte array for tracking identical sptes, so the
identical sptes can be linked as a single linked list by their
corresponding sibling_pte. A reverse map or a parent_pte points at
the head of this single linked list. So we can do cleanup for
reverse map and parent_pte VERY LARGELY.

BAD:
   If most rmap have only one entry or most sp have only one parent,
   this patch may use more memory than before.


That is the case with NPT and EPT.  Each page has exactly one spte 
(except a few vga pages), and each sp has exactly one parent_pte (except 
the root pages).



GOOD:
   1) Reduces a lot of code. The functions in the hot path become
  very simple and terrifically fast.
   2) rmap_next(): O(N) -> O(1). Traversing an rmap: O(N*N) -> O(N)


The existing rmap_next() is not O(N), it's O(RMAP_EXT), which is 4.  The 
data structure was chosen over a simple linked list to avoid extra cache 
misses.



   3) Remove the ugly interlayer: struct kvm_rmap_desc, struct kvm_pte_chain


kvm_rmap_desc and kvm_pte_chain are indeed ugly, but they do save a lot 
of memory and cache misses.



   4) We don't need to allocate any thing when we change the mappings.
  So we can avoid allocation when we have held kvm mmu spin lock.
  (this feature is very helpful in future).
   5) better readability.


I agree the new code is more readable.  Unfortunately it uses more 
memory and is likely to be slower.  You add a cache miss for every spte, 
while kvm_rmap_desc amortizes the cache miss among 4 sptes, and special 
cases 1 spte to have no cache misses (or extra memory requirements).
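Avi's objection can be made concrete by counting node hops: kvm_rmap_desc packs RMAP_EXT (4) sptes per node, so walking N sptes touches roughly N/4 cache lines, while a per-spte linked list touches one line per element. A rough Python sketch of just that accounting (not the kernel data structures):

```python
class ChunkedList:
    """Sketch of the kvm_rmap_desc idea: up to RMAP_EXT (4) entries
    share one node, so one 'cache line' serves four sptes."""
    RMAP_EXT = 4

    def __init__(self):
        self.nodes = []  # each node holds up to RMAP_EXT entries

    def add(self, spte):
        if self.nodes and len(self.nodes[-1]) < self.RMAP_EXT:
            self.nodes[-1].append(spte)
        else:
            self.nodes.append([spte])

    def walk_line_touches(self):
        # one node visited == one cache line touched
        return len(self.nodes)

def linked_list_line_touches(n):
    # a per-spte linked list touches one line per element
    return n

c = ChunkedList()
for spte in range(16):
    c.add(spte)
print(c.walk_line_touches(), linked_list_line_touches(16))  # prints: 4 16
```

The 1-spte special case Avi mentions is even better in the real code: the rmap head stores the single spte directly, so no desc node (and no extra line) is needed at all.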


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH RFC 3/4] Paravirtualized spinlock implementation for KVM guests

2010-08-03 Thread Avi Kivity

 On 08/02/2010 06:20 PM, Jeremy Fitzhardinge wrote:

 On 08/02/2010 01:48 AM, Avi Kivity wrote:

 On 07/26/2010 09:15 AM, Srivatsa Vaddagiri wrote:
Paravirtual spinlock implementation for KVM guests, based heavily on 
Xen guest's

spinlock implementation.


+
+static struct spinlock_stats
+{
+u64 taken;
+u32 taken_slow;
+
+u64 released;
+
+#define HISTO_BUCKETS30
+u32 histo_spin_total[HISTO_BUCKETS+1];
+u32 histo_spin_spinning[HISTO_BUCKETS+1];
+u32 histo_spin_blocked[HISTO_BUCKETS+1];
+
+u64 time_total;
+u64 time_spinning;
+u64 time_blocked;
+} spinlock_stats;
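For reference, histo arrays like the ones above are usually filled by bucketing each measured duration; log2 buckets with an overflow slot are one plausible scheme (a guess for illustration, not necessarily what this patch does):

```python
HISTO_BUCKETS = 30

def bucket(delta_ns, histo):
    """File one measured spin duration into a log2 histogram, with the
    last slot catching overflow. Illustrative bucketing only; the
    patch's actual scheme may differ."""
    idx = max(0, delta_ns.bit_length() - 1)  # floor(log2(delta)), 0 for 0
    histo[min(idx, HISTO_BUCKETS)] += 1

histo = [0] * (HISTO_BUCKETS + 1)
for d in (1, 2, 1000, 10**9, 2**40):
    bucket(d, histo)
assert histo[HISTO_BUCKETS] == 1  # 2**40 ns overflows into the last bucket
```

The appeal of Avi's tracepoint suggestion is that raw (lock, timestamp) events let userspace pick the bucketing after the fact instead of baking one scheme into the kernel.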


Could these be replaced by tracepoints when starting to spin/stopping 
spinning etc?  Then userspace can reconstruct the histogram as well 
as see which locks are involved and what call paths.


Unfortunately not; the tracing code uses spinlocks.

(TBH I haven't actually tried, but I did give the code an eyeball to 
this end.)


Hm.  The tracing code already uses a specialized lock (arch_spinlock_t), 
perhaps we can make this lock avoid the tracing?


It's really sad, btw, there's all those nice lockless ring buffers and 
then a spinlock for ftrace_vbprintk(), instead of a per-cpu buffer.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: kvm IPC

2010-08-03 Thread Amit Shah
On (Thu) Jul 29 2010 [16:17:48], Nirmal Guhan wrote:
 Hi,
 
 I run Fedora 12 and guest is also Fedora 12. I use br0/tap0 for
 networking and communicate between host-guest using socket.  I do
 see some references to virtio, pci based ipc and inter-vm shared
 memory but they are not current. My question is : Is there a better
 IPC mechanism for host-guest and inter-VM communication and, if so,
 could you provide me with pointers?

There's virtio-serial, which is a channel between a guest and the host.
You can short-circuit two host-side chardevs to get inter-VM channels as
well.

See 

https://fedoraproject.org/wiki/Features/VirtioSerial

for more info.

This is only available from F13, though.

Amit


Re: [PATCH 1/2] KVM: SVM: Check for nested vmrun intercept before emulating vmrun

2010-08-03 Thread Avi Kivity

 On 08/02/2010 11:33 PM, Joerg Roedel wrote:

On Mon, Aug 02, 2010 at 06:18:09PM +0300, Avi Kivity wrote:

  On 08/02/2010 05:46 PM, Joerg Roedel wrote:

This patch lets the nested vmrun fail if the L1 hypervisor
has not intercepted vmrun. This fixes the vmrun intercept
check unit test.
+
   static bool nested_svm_vmrun(struct vcpu_svm *svm)
   {
struct vmcb *nested_vmcb;
@@ -2029,6 +2037,17 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
if (!nested_vmcb)
return false;

+   if (!nested_vmcb_checks(nested_vmcb)) {
+   nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
+   nested_vmcb->control.exit_code_hi = 0;
+   nested_vmcb->control.exit_info_1  = 0;
+   nested_vmcb->control.exit_info_2  = 0;
+
+   nested_svm_unmap(page);
+
+   return false;
+   }
+

Don't you have to transfer an injected event to exitintinfo?

APM2 seems to be quiet about this.


Well, my copy says:

The VMRUN instruction then checks the guest state just loaded. If an
illegal state has been loaded, the processor exits back to the host
(see "#VMEXIT" on page 374).


This matches illegal state and #VMEXIT but doesn't match guest state.


I just tried it out and event_inj
still contains the event after a failed vmrun on real hardware. This
makes sense because this is no real vmexit because the vm was never
entered.


Okay; will apply the patches.

--
error compiling committee.c: too many arguments to function



[GIT PULL] KVM updates for the 2.6.36 merge window

2010-08-03 Thread Avi Kivity

 Linus, please pull from

  git://git.kernel.org/pub/scm/virt/kvm/kvm.git kvm-updates/2.6.36

to receive the KVM updates for the 2.6.36 cycle.  No major features: 
mostly improved mmu and emulator correctness, some performance 
improvements, and support for guest XSAVE and AVX.


Alex Williamson (1):
  KVM: remove CAP_SYS_RAWIO requirement from kvm_vm_ioctl_assign_irq

Alexander Graf (5):
  KVM: PPC: Remove obsolete kvmppc_mmu_find_pte
  KVM: PPC: Use kernel hash function
  KVM: PPC: Make BAT only guest segments work
  KVM: PPC: Add generic hpte management functions
  KVM: PPC: Make use of hash based Shadow MMU

Andi Kleen (2):
  KVM: Fix KVM_SET_SIGNAL_MASK with arg == NULL
  KVM: Fix unused but set warnings

Andrea Arcangeli (1):
  KVM: MMU: fix mmu notifier invalidate handler for huge spte

Andreas Schwab (1):
  KVM: PPC: elide struct thread_struct instances from stack

Asias He (1):
  KVM: PPC: fix uninitialized variable warning in 
kvm_ppc_core_deliver_interrupts


Avi Kivity (50):
  KVM: VMX: Simplify vmx_get_nmi_mask()
  KVM: kvm_pdptr_read() may sleep
  KVM: VMX: Avoid writing HOST_CR0 every entry
  KVM: Get rid of KVM_REQ_KICK
  KVM: Document KVM_SET_IDENTITY_MAP ioctl
  KVM: Document KVM_SET_BOOT_CPU_ID
  KVM: MMU: Fix free memory accounting race in mmu_alloc_roots()
  KVM: move vcpu locking to dispatcher for generic vcpu ioctls
  KVM: x86: Lock arch specific vcpu ioctls centrally
  KVM: s390: Centrally lock arch specific vcpu ioctls
  KVM: PPC: Centralize locking of arch specific vcpu ioctls
  KVM: Consolidate arch specific vcpu ioctl locking
  KVM: Update Red Hat copyrights
  KVM: MMU: Allow spte.w=1 for gpte.w=0 and cr0.wp=0 only in shadow 
mode

  KVM: MMU: Document cr0.wp emulation
  KVM: MMU: Document large pages
  KVM: VMX: Fix incorrect rcu deref in rmode_tss_base()
  KVM: Fix mov cr0 #GP at wrong instruction
  KVM: Fix mov cr4 #GP at wrong instruction
  KVM: Fix mov cr3 #GP at wrong instruction
  KVM: Fix xsave and xcr save/restore memory leak
  KVM: Consolidate load/save temporary buffer allocation and freeing
  KVM: Remove memory alias support
  KVM: Remove kernel-allocated memory regions
  KVM: i8259: reduce excessive abstraction for pic_irq_request()
  KVM: i8259: simplify pic_irq_request() calling sequence
  KVM: Add mini-API for vcpu->requests
  KVM: Reduce atomic operations on vcpu->requests
  KVM: Keep slot ID in memory slot structure
  KVM: Prevent internal slots from being COWed
  KVM: Simplify vcpu_enter_guest() mmu reload logic slightly
  KVM: Document KVM specific review items
  KVM: MMU: Introduce drop_spte()
  KVM: MMU: Move accessed/dirty bit checks from rmap_remove() to 
drop_spte()

  KVM: MMU: Atomically check for accessed bit when dropping an spte
  KVM: MMU: Don't drop accessed bit while updating an spte
  KVM: MMU: Only indicate a fetch fault in page fault error code if 
nx is enabled

  KVM: MMU: Keep going on permission error
  KVM: Expose MCE control MSRs to userspace
  KVM: Document MCE banks non-exposure via KVM_GET_MSR_INDEX_LIST
  KVM: MMU: Add link_shadow_page() helper
  KVM: MMU: Use __set_spte to link shadow pages
  KVM: MMU: Add drop_large_spte() helper
  KVM: MMU: Add validate_direct_spte() helper
  KVM: MMU: Add gpte_valid() helper
  KVM: MMU: Simplify spte fetch() function
  KVM: MMU: Validate all gptes during fetch, not just those used 
for new pages

  KVM: MMU: Eliminate redundant temporaries in FNAME(fetch)
  KVM: Document KVM_GET_SUPPORTED_CPUID2 ioctl
  KVM: VMX: Fix host GDT.LIMIT corruption

Chris Lalancette (4):
  KVM: x86: Introduce a workqueue to deliver PIT timer interrupts
  KVM: x86: Allow any LAPIC to accept PIC interrupts
  KVM: x86: In DM_LOWEST, only deliver interrupts to vcpus with 
enabled LAPIC's

  KVM: Search the LAPIC's for one that will accept a PIC interrupt

Christian Borntraeger (2):
  KVM: s390: Fix build failure due to centralized vcpu locking patches
  KVM: s390: Don't exit SIE on SIGP sense running

Denis Kirjanov (1):
  KVM: PPC: fix build warning in kvm_arch_vcpu_ioctl_run

Dexuan Cui (1):
  KVM: VMX: Enable XSAVE/XRSTOR for guest

Dongxiao Xu (4):
  KVM: VMX: Define new functions to wrapper direct call of asm code
  KVM: VMX: Some minor changes to code structure
  KVM: VMX: VMCLEAR/VMPTRLD usage changes
  KVM: VMX: VMXON/VMXOFF usage changes

Glauber Costa (1):
  KVM: Add Documentation/kvm/msr.txt

Gleb Natapov (32):
  KVM: x86 emulator: introduce read cache
  KVM: x86 emulator: fix Move r/m16 to segment register decoding
  KVM: x86 emulator: cleanup xchg emulation
  KVM: x86 emulator: cleanup nop emulation
  KVM: x86 emulator: handle far address source operand
  KVM: x86 emulator: add (set|get)_dr 

RE: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.

2010-08-03 Thread Xin, Xiaohui
-Original Message-
From: Shirley Ma [mailto:mashi...@us.ibm.com]
Sent: Friday, July 30, 2010 6:31 AM
To: Xin, Xiaohui
Cc: net...@vger.kernel.org; kvm@vger.kernel.org; linux-ker...@vger.kernel.org;
m...@redhat.com; mi...@elte.hu; da...@davemloft.net; 
herb...@gondor.apana.org.au;
jd...@linux.intel.com
Subject: Re: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.

Hello Xiaohui,

On Thu, 2010-07-29 at 19:14 +0800, xiaohui@intel.com wrote:
 The idea is simple, just to pin the guest VM user space and then
 let host NIC driver has the chance to directly DMA to it.
 The patches are based on vhost-net backend driver. We add a device
 which provides proto_ops as sendmsg/recvmsg to vhost-net to
 send/recv directly to/from the NIC driver. KVM guest who use the
 vhost-net backend may bind any ethX interface in the host side to
 get copyless data transfer thru guest virtio-net frontend.

Since vhost-net already supports macvtap/tun backends, do you think
whether it's better to implement zero copy in macvtap/tun than inducing
a new media passthrough device here?

 Our goal is to improve the bandwidth and reduce the CPU usage.
 Exact performance data will be provided later.

I did some vhost performance measurement over 10Gb ixgbe, and found that
in order to get consistent BW results, netperf/netserver, qemu, vhost
threads smp affinities are required.

Looking forward to these results for small message size comparison. For
large message size 10Gb ixgbe BW already reached by doing vhost smp
affinity w/i offloading support, we will see how much CPU utilization it
can be reduced.

Please provide latency results as well. I did some experimental on
macvtap zero copy sendmsg, what I have found that get_user_pages latency
pretty high.

Could you share your performance results (including BW and latency) on
vhost-net, and how you obtained them (your configuration, especially
the affinity settings)?

Thanks
Xiaohui

Thanks
Shirley






Re: [PATCH] kvm cleanup: Introduce sibling_pte and do cleanup for reverse map and parent_pte

2010-08-03 Thread Lai Jiangshan
On 08/03/2010 02:51 PM, Avi Kivity wrote:
  On 08/03/2010 05:30 AM, Lai Jiangshan wrote:
 This patch is just a big cleanup. it reduces 220 lines of code.

 It introduces sibling_pte array for tracking identical sptes, so the
 identical sptes can be linked as a single linked list by their
 corresponding sibling_pte. A reverse map or a parent_pte points at
 the head of this single linked list. So we can do cleanup for
 reverse map and parent_pte VERY LARGELY.

 BAD:
If most rmap have only one entry or most sp have only one parent,
this patch may use more memory than before.
 
 That is the case with NPT and EPT.  Each page has exactly one spte
 (except a few vga pages), and each sp has exactly one parent_pte (except
 the root pages).
 
 GOOD:
 1) Reduces a lot of code. The functions in the hot path become
    very simple and terrifically fast.
 2) rmap_next(): O(N) -> O(1). Traversing an rmap: O(N*N) -> O(N)
 
 The existing rmap_next() is not O(N), it's O(RMAP_EXT), which is 4.  The
 data structure was chosen over a simple linked list to avoid extra cache
 misses.
 
3) Remove the ugly interlayer: struct kvm_rmap_desc, struct
 kvm_pte_chain
 
 kvm_rmap_desc and kvm_pte_chain are indeed ugly, but they do save a lot
 of memory and cache misses.
 
4) We don't need to allocate any thing when we change the mappings.
   So we can avoid allocation when we have held kvm mmu spin lock.
   (this feature is very helpful in future).
5) better readability.
 
 I agree the new code is more readable.  Unfortunately it uses more
 memory and is likely to be slower.  You add a cache miss for every spte,
 while kvm_rmap_desc amortizes the cache miss among 4 sptes, and special
 cases 1 spte to have no cache misses (or extra memory requirements).
 

You are right, please omit this patch

thanks, lai.


RE: Alt SeaBIOS SSDT cpu hotplug

2010-08-03 Thread Liu, Jinsong
Zheng, Shaohui wrote:
 In our experience, Windows 2008 Datacenter is the only version to
 support CPU hotplug, and we did not find any official announcement for
 other Windows versions, so we tested Windows 2008 Datacenter only.
 
 Thanks for Kevin pointing out it, we will try windows7 hotplug
 feature. 
 
 Thanks & Regards,
 Shaohui
 
 
 -Original Message-
 From: Kevin O'Connor [mailto:ke...@koconnor.net]
 Sent: Tuesday, August 03, 2010 1:27 AM
 To: Avi Kivity
 Cc: Alexander Graf; Liu, Jinsong; seab...@seabios.org;
 kvm@vger.kernel.org; Jiang, Yunhong; Li, Xin; Zheng, Shaohui; Zhang,
 Jianwu; You, Yongkang  
 Subject: Re: Alt SeaBIOS SSDT cpu hotplug
 
 On Mon, Aug 02, 2010 at 07:13:34PM +0300, Avi Kivity wrote:
  On 08/02/2010 06:55 PM, Kevin O'Connor wrote:
 On Mon, Aug 02, 2010 at 10:12:31AM +0200, Alexander Graf wrote:
 On 02.08.2010, at 07:49, Kevin O'Connor wrote:
 On Mon, Aug 02, 2010 at 10:41:39AM +0800, Liu, Jinsong wrote:
 It seems the Windows acpi interpreter is significantly different
 from the Linux one.  The only guess I have is that Windows
 doesn't like one of the ASL constructs even though they all look
 valid.  I'd try to debug this by commenting out parts of the ASL
 until I narrowed down the parts causing the problem. 
 Unfortunately, I don't have Windows 2008 to do this directly. 
 
 Any other ideas?
 Just grab yourself a free copy of the Hyper-V server 2008:
 
 http://arstechnica.com/microsoft/news/2009/08/microsoft-hyper-v-server-2008-r2-arrives-for-free.ars
 I downloaded and installed it, but I can't reproduce the crash.  It
 seems like a really stripped down version of Windows, so I can't
 tell if it actually worked or not either.
 
 I thought only the Datacenter edition supported cpu hotplug.
 
 I just tried an old Win 7 Ultimate beta (build 7100) I had on my HD.
 It looks like it supports cpu hotplug.  However, I don't see any
 failures - it seems to work fine.  (After running cpu_set 1 online,
 the event pops up in the system event log as a UserPnP event, and the
 CPU appears in the system devices list.)
 
 -Kevin

Kevin,

I just tested your new patch with Windows 2008 Datacenter on my platform; it
works OK! We can hot-add new CPUs and they appear in Device Manager.
(BTW, yesterday I tested your new patch with a Linux 2.6.32 HVM guest; it
works fine, we can add-remove-add-remove... CPUs.)
Sorry for making you spend more time on this; it was our fault.

Thanks,
Jinsong



[ANNOUNCE] kvm-unit-tests.git

2010-08-03 Thread Avi Kivity
 The kvm unit tests, previously found in qemu-kvm.git's kvm/test/ 
directory, have been moved to their own repository, kvm-unit-tests.git.


The repository URL is 
git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git; more 
information can be found in 
http://git.kernel.org/?p=virt/kvm/kvm-unit-tests.git;a=summary.


Due to file moves before the migration, history was not migrated.  
Please use qemu-kvm.git for historical information.


--
error compiling committee.c: too many arguments to function



Re: [PATCH] KVM test: Subtest unittest: append extra_params to qemu cmdline

2010-08-03 Thread Avi Kivity

 On 08/02/2010 08:29 PM, Lucas Meneghel Rodrigues wrote:

The extra_param config option on qemu-kvm's unittest config
file wasn't being honored due to a silly mistake on the latest
version of the unittest patchset (forgot to add the extra_params
to the params dictionary). This patch fixes the problem.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
  client/tests/kvm/tests/unittest.py |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/tests/unittest.py 
b/client/tests/kvm/tests/unittest.py
index 8be1f27..ad95720 100644
--- a/client/tests/kvm/tests/unittest.py
+++ b/client/tests/kvm/tests/unittest.py
@@ -75,6 +75,7 @@ def run_unittest(test, params, env):
  extra_params = None
  if parser.has_option(t, 'extra_params'):
  extra_params = parser.get(t, 'extra_params')
+params['extra_params'] += ' %s' % extra_params


Not quite:


08/03 13:57:04 DEBUG|kvm_vm:0637| Running qemu command:
/root/autotest/client/tests/kvm/qemu -name 'vm1' -monitor 
unix:'/tmp/monitor-humanmonitor1-20100803-135522-SqL2',server,nowait 
-serial unix:'/tmp/serial-20100803-135522-SqL2',server,nowait -m 512 
-kernel '/root/autotest/client/tests/kvm/unittests/svm.flat' -vnc :0 
-chardev file,id=testlog,path=/tmp/testlog-20100803-135522-SqL2 -device 
testdev,chardev=testlog  -S -cpu qemu64,-svm -cpu qemu64,+x2apic 
-enable-nesting -cpu qemu64,+svm



Looks like the += is a little excessive.

--
error compiling committee.c: too many arguments to function



Re: [PATCH] KVM test: Subtest unittest: append extra_params to qemu cmdline

2010-08-03 Thread Avi Kivity

 On 08/03/2010 02:25 PM, Avi Kivity wrote:

 On 08/02/2010 08:29 PM, Lucas Meneghel Rodrigues wrote:

The extra_param config option on qemu-kvm's unittest config
file wasn't being honored due to a silly mistake on the latest
version of the unittest patchset (forgot to add the extra_params
to the params dictionary). This patch fixes the problem.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
  client/tests/kvm/tests/unittest.py |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/tests/unittest.py 
b/client/tests/kvm/tests/unittest.py

index 8be1f27..ad95720 100644
--- a/client/tests/kvm/tests/unittest.py
+++ b/client/tests/kvm/tests/unittest.py
@@ -75,6 +75,7 @@ def run_unittest(test, params, env):
  extra_params = None
  if parser.has_option(t, 'extra_params'):
  extra_params = parser.get(t, 'extra_params')
+params['extra_params'] += ' %s' % extra_params


Not quite:


08/03 13:57:04 DEBUG|kvm_vm:0637| Running qemu command:
/root/autotest/client/tests/kvm/qemu -name 'vm1' -monitor 
unix:'/tmp/monitor-humanmonitor1-20100803-135522-SqL2',server,nowait 
-serial unix:'/tmp/serial-20100803-135522-SqL2',server,nowait -m 512 
-kernel '/root/autotest/client/tests/kvm/unittests/svm.flat' -vnc :0 
-chardev file,id=testlog,path=/tmp/testlog-20100803-135522-SqL2 
-device testdev,chardev=testlog  -S -cpu qemu64,-svm -cpu 
qemu64,+x2apic -enable-nesting -cpu qemu64,+svm



Looks like the += is a little excessive.



It also leaks to other tests, screwing them up.  So I think you might 
need to keep the += (so you inherit global settings) but undo it after 
the unit test completes.
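Avi's suggestion, sketched in Python: keep the += so the global extra_params is inherited, but restore the original value when the unit test finishes so nothing leaks into later tests. run_with_extra_params and its arguments are hypothetical names for illustration, not the autotest API:

```python
def run_with_extra_params(params, extra_params, run):
    """Inherit the global extra_params via +=, but restore the original
    value once the unit test finishes, so the per-test addition cannot
    leak into subsequent tests. 'run' stands in for the unittest body."""
    saved = params['extra_params']
    try:
        if extra_params:
            params['extra_params'] += ' %s' % extra_params
        return run(params)
    finally:
        params['extra_params'] = saved

params = {'extra_params': '-S -cpu qemu64,-svm'}
run_with_extra_params(params, '-enable-nesting', lambda p: None)
assert params['extra_params'] == '-S -cpu qemu64,-svm'  # no leak
```

The try/finally guarantees the restore even when the test raises, which is the part a bare += followed by a manual undo would miss.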


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 02:33:02PM +0300, Gleb Natapov wrote:
 On Tue, Aug 03, 2010 at 12:13:06PM +0100, Richard W.M. Jones wrote:
  
  qemu compiled from today's git.  Using the following command line:
  
  $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios \
  -drive file=/dev/null,if=virtio \
  -enable-kvm \
  -nodefaults \
  -nographic \
  -serial stdio \
  -m 500 \
  -no-reboot \
  -no-hpet \
  -net user,vlan=0,net=169.254.0.0/16 \
  -net nic,model=ne2k_pci,vlan=0 \
  -kernel /tmp/libguestfsEyAMut/kernel \
  -initrd /tmp/libguestfsEyAMut/initrd \
  -append 'panic=1 console=ttyS0 udevtimeout=300 noapic acpi=off 
  printk.time=1 cgroup_disable=memory selinux=0 
  guestfs_vmchannel=tcp:169.254.2.2:35007 guestfs_verbose=1 TERM=xterm-color '
  
  With kernel 2.6.35 [*], this takes about 1 min 20 s before the guest
  starts.
  
  If I revert back to kernel 2.6.34, it's pretty quick as usual.
  
  strace is not very informative.  It's in a loop doing select and
  reading/writing from some file descriptors, including the signalfd and
  two pipe fds.
  
  Anyone seen anything like this?
  
 I assume your initrd is huge.

It's ~110MB, yes.

 In newer kernels, ins/outs are much slower than they were. They are
 much more correct too. It shouldn't be 1 min 20 sec for a 100M initrd,
 though, but it can take 20-30 sec. This belongs on the kvm list, BTW.

I can't see anything about this in the kernel changelog.  Can you
point me to the commit or the key phrase to look for?

Also, what's the point of making in/out more correct when we know we're
talking to qemu (e.g. from the CPUID) and we know it already worked
fine before with qemu?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into Xen guests.
http://et.redhat.com/~rjones/virt-p2v


Re: [PATCH 9/24] Implement VMCLEAR

2010-08-03 Thread Nadav Har'El
On Tue, Jul 06, 2010, Dong, Eddie wrote about RE: [PATCH 9/24] Implement 
VMCLEAR:
 Nadav Har'El wrote:
  This patch implements the VMCLEAR instruction.
...
 SDM implements alignment check, range check and reserve bit check and may 
 generate VMfail(VMCLEAR with invalid physical address).
 As well as addr != VMXON pointer check
 Missed?

Right. I will add some of the missing checks - e.g., currently if the given
address is not page-aligned, I chop off the last bits and pretend that it
is, which can cause problems (although not for correctly-written hypervisors).

About the missing addr != VMXON pointer, as I explained in a comment in the
code (handle_vmon()), this was a deliberate omission: the current
implementation doesn't store anything in the VMXON page (and I see no reason
why this will change in the future), so the VMXON emulation (handle_vmon())
doesn't even bother to save the pointer it is given, and VMCLEAR and VMPTRLD
don't check that the address they are given are different from this pointer,
since there is no real cause for concern even if it is.

I can quite easily add the missing code to save the vmxon pointer and check
it on vmclear/vmptrld, but frankly, wouldn't it be rather pointless?
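For reference, the operand checks under discussion (page alignment, physical-address range, and the comparison against the VMXON pointer) can be sketched as follows. This is an illustrative model of the SDM's rules only, not the actual kvm code; the function name and constants are assumptions:

```python
PAGE_MASK = (1 << 12) - 1  # VMCS region must be 4 KiB aligned

def vmclear_operand_valid(addr, maxphyaddr_bits, vmxon_ptr):
    # Illustrative sketch of the SDM checks for VMCLEAR's operand.
    if addr & PAGE_MASK:          # not page-aligned -> VMfail
        return False
    if addr >> maxphyaddr_bits:   # beyond MAXPHYADDR -> VMfail
        return False
    if addr == vmxon_ptr:         # equals the VMXON pointer -> VMfail
        return False
    return True
```

Any address failing one of these checks would produce VMfail(VMCLEAR with invalid physical address) rather than being silently truncated.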

 The SDM has a formal definition of VMsucceed. Clearing CF/ZF only is not
 sufficient, as SDM 2B 5.2 mentions.
 Any special concern here?
 
 BTW, should we define formal VMfail() and VMsucceed() APIs for easier
 understanding and mapping to the SDM?

This is a good idea, and I'll do that.

-- 
Nadav Har'El| Tuesday, Aug  3 2010, 23 Av 5770
n...@math.technion.ac.il |-
Phone +972-523-790466, ICQ 13349191 |Sign in zoo: Do not feed the animals. If
http://nadav.harel.org.il   |you have food give it to the guard on duty


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gleb Natapov
On Tue, Aug 03, 2010 at 01:10:00PM +0100, Richard W.M. Jones wrote:
 On Tue, Aug 03, 2010 at 02:33:02PM +0300, Gleb Natapov wrote:
  On Tue, Aug 03, 2010 at 12:13:06PM +0100, Richard W.M. Jones wrote:
   
   qemu compiled from today's git.  Using the following command line:
   
   $qemudir/x86_64-softmmu/qemu-system-x86_64 -L $qemudir/pc-bios \
   -drive file=/dev/null,if=virtio \
   -enable-kvm \
   -nodefaults \
   -nographic \
   -serial stdio \
   -m 500 \
   -no-reboot \
   -no-hpet \
   -net user,vlan=0,net=169.254.0.0/16 \
   -net nic,model=ne2k_pci,vlan=0 \
   -kernel /tmp/libguestfsEyAMut/kernel \
   -initrd /tmp/libguestfsEyAMut/initrd \
   -append 'panic=1 console=ttyS0 udevtimeout=300 noapic acpi=off 
   printk.time=1 cgroup_disable=memory selinux=0 
   guestfs_vmchannel=tcp:169.254.2.2:35007 guestfs_verbose=1 
   TERM=xterm-color '
   
   With kernel 2.6.35 [*], this takes about 1 min 20 s before the guest
   starts.
   
   If I revert back to kernel 2.6.34, it's pretty quick as usual.
   
   strace is not very informative.  It's in a loop doing select and
   reading/writing from some file descriptors, including the signalfd and
   two pipe fds.
   
   Anyone seen anything like this?
   
  I assume your initrd is huge.
 
 It's ~110MB, yes.
 
  In newer kernels ins/outs are much slower than they were. They are
  much more correct too. It shouldn't be 1 min 20 sec for a 100M initrd
  though, but it can take 20-30 sec. This belongs on the kvm list, BTW.
 
 I can't see anything about this in the kernel changelog.  Can you
 point me to the commit or the key phrase to look for?
 
7972995b0c346de76

 Also, what's the point of making in/out more correct when we
 know we're talking to qemu (e.g. from the CPUID) and we know it already
 worked fine before with qemu?
 
Qemu has nothing to do with that. ins/outs didn't work correctly in
some situations. They didn't work at all if the destination/source memory
was MMIO (didn't work as in hung the vcpu, IIRC, and this is a security
risk). The direction flag wasn't handled at all (if it was set, the
instruction injected #GP into the guest). It didn't check that the memory
it writes to is shadowed, in which case special action should be taken.
It didn't deliver events during long string operations. Maybe more.
Unfortunately adding all that makes emulation much slower.  I already
implemented some speedups, and more is possible, but we will not be able
to get back to the previous string io speed, which was our upper limit.
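A rough way to see why this hurts a large -initrd load: each emulated chunk of a rep ins costs at least one VM exit. A back-of-envelope sketch (the chunk size and the one-exit-per-chunk assumption are both hypothetical):

```python
def io_exits(total_bytes, bytes_per_exit):
    # Each emulated chunk of a rep ins/outs traps to the hypervisor at
    # least once; ceiling division gives the minimum exit count.
    return -(-total_bytes // bytes_per_exit)
```

At, say, 1 KiB per emulated chunk, a 110 MB initrd costs on the order of 10^5 exits, each of which now goes through the slower-but-correct emulator.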

--
Gleb.


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 03:37:14PM +0300, Gleb Natapov wrote:
 On Tue, Aug 03, 2010 at 01:10:00PM +0100, Richard W.M. Jones wrote:
  I can't see anything about this in the kernel changelog.  Can you
  point me to the commit or the key phrase to look for?
  
 7972995b0c346de76

Thanks - I see.

  Also, what's the point of making in/out more correct when we
  know we're talking to qemu (e.g. from the CPUID) and we know it already
  worked fine before with qemu?
  
 Qemu has nothing to do with that. ins/outs didn't work correctly in
 some situations. They didn't work at all if the destination/source memory
 was MMIO (didn't work as in hung the vcpu, IIRC, and this is a security
 risk). The direction flag wasn't handled at all (if it was set, the
 instruction injected #GP into the guest). It didn't check that the memory
 it writes to is shadowed, in which case special action should be taken.
 It didn't deliver events during long string operations. Maybe more.
 Unfortunately adding all that makes emulation much slower.  I already
 implemented some speedups, and more is possible, but we will not be able
 to get back to the previous string io speed, which was our upper limit.

Thanks for the explanation.  I'll repost my DMA-like fw-cfg patch
once I've rebased it and done some more testing.  This huge regression
for a common operation (implementing -initrd) needs to be solved
without using inb/rep ins.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora


Re: [Qemu-devel] KVM call agenda for August 3

2010-08-03 Thread Luiz Capitulino
On Tue, 03 Aug 2010 01:46:01 +0200
Juan Quintela quint...@redhat.com wrote:

 
 Please send in any agenda items you are interested in covering.

- 0.13

Let's keep remembering Anthony ;-)

 
 thanks,
 
 Juan
 



Re: [Qemu-devel] KVM call agenda for August 3

2010-08-03 Thread Avi Kivity

 On 08/03/2010 04:01 PM, Luiz Capitulino wrote:

On Tue, 03 Aug 2010 01:46:01 +0200
Juan Quintelaquint...@redhat.com  wrote:


Please send in any agenda items you are interested in covering.

- 0.13


More specifically, 0.13-rc0.  Tagged but not announced?  I'd like to 
announce it so people can start testing it.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 03:48 PM, Richard W.M. Jones wrote:


Thanks for the explanation.  I'll repost my DMA-like fw-cfg patch
once I've rebased it and done some more testing.  This huge regression
for a common operation (implementing -initrd) needs to be solved
without using inb/rep ins.


Adding more interfaces is easy but a problem in the long term.  We'll 
optimize it as much as we can.  Meanwhile, why are you loading huge 
initrds?  Use a cdrom instead (it will also be faster since the guest 
doesn't need to unpack it).


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] KVM call agenda for August 3

2010-08-03 Thread Anthony Liguori

On 08/03/2010 08:16 AM, Avi Kivity wrote:

 On 08/03/2010 04:01 PM, Luiz Capitulino wrote:

On Tue, 03 Aug 2010 01:46:01 +0200
Juan Quintelaquint...@redhat.com  wrote:


Please send in any agenda items you are interested in covering.

- 0.13


More specifically, 0.13-rc0.  Tagged but not announced?  I'd like to 
announce it so people can start testing it.


That's the normal process.  0.13.0-rc0 is just a git snapshot and as 
such, everyone has been testing it already.  0.13.0-rc1 is due to be 
tagged later today and that's the first one that's useful to test 
separately.


Regards,

Anthony Liguori


[PATCH] KVM test: Unittest subtest: Avoid leak of extra_params

2010-08-03 Thread Lucas Meneghel Rodrigues
This is the sequel of the previous fix on the unittest
subtest: As we're running on a loop through the
unittest list, the original extra_params need to be
restored at the end of each test, so previously
set extra_params don't leak to other unittests.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/tests/unittest.py |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/tests/unittest.py 
b/client/tests/kvm/tests/unittest.py
index ad95720..c52637a 100644
--- a/client/tests/kvm/tests/unittest.py
+++ b/client/tests/kvm/tests/unittest.py
@@ -46,6 +46,8 @@ def run_unittest(test, params, env):
 
 timeout = int(params.get('unittest_timeout', 600))
 
+extra_params_original = params['extra_params']
+
 for t in test_list:
 logging.info('Running %s', t)
 
@@ -111,5 +113,8 @@ def run_unittest(test, params, env):
 except NameError, IOError:
 logging.error("Not possible to collect logs")
 
+# Restore the extra params so other tests can run normally
+params['extra_params'] = extra_params_original
+
 if nfail != 0:
      raise error.TestFail("Unit tests failed: %s" % " ".join(tests_failed))
-- 
1.7.1.1
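The save/restore in this patch can be made robust against exceptions with try/finally; a minimal sketch of the pattern (generic names, not the actual kvm-autotest API):

```python
def run_with_restore(params, test_list, run_one):
    # Save the global setting so per-test modifications don't leak
    # into the following unittests, as the patch above intends.
    original = params['extra_params']
    try:
        for t in test_list:
            params['extra_params'] = original + ' ' + t  # per-test tweak
            run_one(t, params)
    finally:
        # Restored even if a test raises.
        params['extra_params'] = original
```

With a plain assignment at the end of the loop body, an exception raised mid-test would skip the restore; try/finally closes that hole.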



Re: [Qemu-devel] KVM call agenda for August 3

2010-08-03 Thread Avi Kivity

 On 08/03/2010 04:31 PM, Anthony Liguori wrote:

On 08/03/2010 08:16 AM, Avi Kivity wrote:

 On 08/03/2010 04:01 PM, Luiz Capitulino wrote:

On Tue, 03 Aug 2010 01:46:01 +0200
Juan Quintelaquint...@redhat.com  wrote:


Please send in any agenda items you are interested in covering.

- 0.13


More specifically, 0.13-rc0.  Tagged but not announced?  I'd like to 
announce it so people can start testing it.


That's the normal process.  0.13.0-rc0 is just a git snapshot and as 
such, everyone has been testing it already.  0.13.0-rc1 is due to be 
tagged later today and that's the first one that's useful to test 
separately.


I meant users.  Many users avoid git and test tarballs which come from 
an announcement instead.  Same for distros, things like rawhide can 
package an -rc0.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 04:19:39PM +0300, Avi Kivity wrote:
  On 08/03/2010 03:48 PM, Richard W.M. Jones wrote:
 
 Thanks for the explanation.  I'll repost my DMA-like fw-cfg patch
 once I've rebased it and done some more testing.  This huge regression
 for a common operation (implementing -initrd) needs to be solved
 without using inb/rep ins.
 
 Adding more interfaces is easy but a problem in the long term.
 We'll optimize it as much as we can.  Meanwhile, why are you loading
 huge initrds?  Use a cdrom instead (it will also be faster since the
 guest doesn't need to unpack it).

Because it involves rewriting the entire appliance building process,
and we don't necessarily know if it'll be faster after we've done
that.

Look: currently we create the initrd on the fly in 700ms.  We've no
reason to believe that creating a CD-ROM on the fly wouldn't take
around the same time.  After all, both processes involve reading all
the host files from disk and writing a temporary file.

You have to create these things on the fly, because we don't actually
ship an appliance to end users, just a tiny (< 1 MB) skeleton.  You
can't ship a massive statically linked appliance to end users because
it's just unmanageable (think: security; updates; bandwidth).

Loading the initrd currently takes 115ms (or could do, if a sensible
50 line patch was permitted).

So the only possible saving would be the 115ms load time of the
initrd.  In theory the CD-ROM device could be detected in 0 time.

Total saving: 115ms.

But will it be any faster, since after spending 115ms, everything runs
from memory, versus being loaded from the CD?

Let's face the fact that qemu has suffered from an enormous
regression.  From some hundreds of milliseconds up to over a minute,
in the space of 6 months of development.  For a very simple operation:
loading a file into memory.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top


Re: [Qemu-devel] KVM call agenda for August 3

2010-08-03 Thread Anthony Liguori

On 08/03/2010 08:49 AM, Avi Kivity wrote:

 On 08/03/2010 04:31 PM, Anthony Liguori wrote:

On 08/03/2010 08:16 AM, Avi Kivity wrote:

 On 08/03/2010 04:01 PM, Luiz Capitulino wrote:

On Tue, 03 Aug 2010 01:46:01 +0200
Juan Quintelaquint...@redhat.com  wrote:


Please send in any agenda items you are interested in covering.

- 0.13


More specifically, 0.13-rc0.  Tagged but not announced?  I'd like to 
announce it so people can start testing it.


That's the normal process.  0.13.0-rc0 is just a git snapshot and as 
such, everyone has been testing it already.  0.13.0-rc1 is due to be 
tagged later today and that's the first one that's useful to test 
separately.


I meant users.  Many users avoid git and test tarballs which come from 
an announcement instead.  Same for distros, things like rawhide can 
package an -rc0.


-rc0 is available in rawhide FWIW.

Regards,

Anthony Liguori



Re: [Qemu-devel] KVM call agenda for August 3

2010-08-03 Thread Avi Kivity

 On 08/03/2010 05:25 PM, Anthony Liguori wrote:
I meant users.  Many users avoid git and test tarballs which come 
from an announcement instead.  Same for distros, things like rawhide 
can package an -rc0.



-rc0 is available in rawhide FWIW.



Cool.

--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 05:05 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 04:19:39PM +0300, Avi Kivity wrote:

  On 08/03/2010 03:48 PM, Richard W.M. Jones wrote:

Thanks for the explanation.  I'll repost my DMA-like fw-cfg patch
once I've rebased it and done some more testing.  This huge regression
for a common operation (implementing -initrd) needs to be solved
without using inb/rep ins.

Adding more interfaces is easy but a problem in the long term.
We'll optimize it as much as we can.  Meanwhile, why are you loading
huge initrds?  Use a cdrom instead (it will also be faster since the
guest doesn't need to unpack it).

Because it involves rewriting the entire appliance building process,
and we don't necessarily know if it'll be faster after we've done
that.

Look: currently we create the initrd on the fly in 700ms.  We've no
reason to believe that creating a CD-ROM on the fly wouldn't take
around the same time.  After all, both processes involve reading all
the host files from disk and writing a temporary file.


The time will only continue to grow as you add features and as the 
distro bloats naturally.


Much better to create it once and only update it if some dependent file 
changes (basically the current on-the-fly code + save a list of file 
timestamps).
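The update-only-when-stale idea can be sketched with a timestamp check; this is a hypothetical helper illustrating the suggestion, not libguestfs code:

```python
import os

def appliance_is_stale(appliance, dependencies):
    # Rebuild if the cached appliance is missing or any dependent file
    # is newer than it (the "save a list of file timestamps" idea).
    if not os.path.exists(appliance):
        return True
    built = os.path.getmtime(appliance)
    return any(os.path.getmtime(d) > built for d in dependencies)
```

The ~700 ms on-the-fly build then runs only on the first boot or after a package update, instead of on every launch.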


Alternatively, pass through the host filesystem.


You have to create these things on the fly, because we don't actually
ship an appliance to end users, just a tiny (< 1 MB) skeleton.  You
can't ship a massive statically linked appliance to end users because
it's just unmanageable (think: security; updates; bandwidth).


Shipping it is indeed out of the question.  But on-the-fly creation is 
not the only alternative.



Loading the initrd currently takes 115ms (or could do, if a sensible
50 line patch was permitted).

So the only possible saving would be the 115ms load time of the
initrd.  In theory the CD-ROM device could be detected in 0 time.

Total saving: 115ms.


815 ms by my arithmetic.  You also save 3*N-2*P memory where N is the 
size of your initrd and P is the actual amount used by the guest.



But will it be any faster, since after spending 115ms, everything runs
from memory, versus being loaded from the CD?

Let's face the fact that qemu has suffered from an enormous
regression.  From some hundreds of milliseconds up to over a minute,
in the space of 6 months of development.


It wasn't qemu, but kvm.  And it didn't take six months, just a few 
commits.  Those aren't going back, they're a lot more important than 
some libguestfs problem which should have been coded differently in
the first place.



For a very simple operation:
loading a file into memory.


Loading a file into memory is plenty fast if you use the standard 
interfaces.  -kernel -initrd is a specialized interface.


--
error compiling committee.c: too many arguments to function



Re: bad O_DIRECT read and write performance with small block sizes with virtio

2010-08-03 Thread John Leach
On Mon, 2010-08-02 at 21:50 +0100, Stefan Hajnoczi wrote:
 On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguori anth...@codemonkey.ws wrote:
  On 08/02/2010 12:15 PM, John Leach wrote:
 
  Hi,
 
  I've come across a problem with read and write disk IO performance when
  using O_DIRECT from within a kvm guest.  With O_DIRECT, reads and writes
  are much slower with smaller block sizes.  Depending on the block size
  used, I've seen 10 times slower.
 
  For example, with an 8k block size, reading directly from /dev/vdb
  without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s.
 
  As a comparison, reading in O_DIRECT mode in 8k blocks directly from the
  backend device on the host gives 2.3 GB/s.  Reading in O_DIRECT mode
  from a xen guest on the same hardware manages 263 MB/s.
 
 
  Stefan has a few fixes for this behavior that help a lot.  One of them
  (avoiding memset) is already upstream but not in 0.12.x.

Anthony, that patch is already applied in the RHEL6 package I've been
testing with - I've just manually confirmed that.  Thanks though.

 
  The other two are not done yet but should be on the ML in the next couple
  weeks.  They involve using ioeventfd for notification and unlocking the
  block queue lock while doing a kick notification.
 
 Thanks for mentioning those patches.  The ioeventfd patch will be sent
 this week, I'm checking that migration works correctly and then need
 to check that vhost-net still works.

I'll give them a test as soon as I can get hold of them, thanks Stefan!

  Writing is affected in the same way, and exhibits the same behaviour
  with O_SYNC too.
 
  Watching with vmstat on the host, I see the same number of blocks being
  read, but about 14 times the number of context switches in O_DIRECT mode
  (4500 cs vs. 63000 cs) and a little more cpu usage.
 
  The device I'm writing to is a device-mapper zero device that generates
  zeros on read and throws away writes, you can set it up
  at /dev/mapper/zero like this:
 
  echo "0 21474836480 zero" | dmsetup create zero
 
  My libvirt config for the disk is:
 
  <disk type='block' device='disk'>
    <driver cache='none'/>
    <source dev='/dev/mapper/zero'/>
    <target dev='vdb' bus='virtio'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x06'
             function='0x0'/>
  </disk>
 
  which translates to the kvm arg:
 
  -device
  virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
  -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none
 
  I'm testing with dd:
 
  dd if=/dev/vdb of=/dev/null bs=8k iflag=direct
 
  As a side note, as you increase the block size read performance in
  O_DIRECT mode starts to overtake non O_DIRECT mode reads (from about
  150k block size). By 550k block size I'm seeing 1 GB/s reads with
  O_DIRECT and 770 MB/s without.
 
 Can you take QEMU out of the picture and run the same test on the host:
 
 dd if=/dev/vdb of=/dev/null bs=8k iflag=direct
 vs
 dd if=/dev/vdb of=/dev/null bs=8k
 
 This isn't quite the same because QEMU will use a helper thread doing
 preadv.  I'm not sure what syscall dd will use.
 
 It should be close enough to determine whether QEMU and device
 emulation are involved at all though, or whether these differences are
 due to the host kernel code path down to the device mapper zero device
 being different for normal vs O_DIRECT.


dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 iflag=direct
8192000000 bytes (8.2 GB) copied, 3.46529 s, 2.4 GB/s

dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000
8192000000 bytes (8.2 GB) copied, 5.5741 s, 1.5 GB/s

dd is just using read.
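Stefan notes above that QEMU's helper thread uses preadv while dd issues plain read calls. A minimal sketch of preadv's scatter semantics on a regular file (hypothetical helper; Linux, Python 3.7+):

```python
import os

def scatter_read(path, sizes, offset=0):
    # Fill several buffers from one contiguous file region with a single
    # preadv(2) call, the vectored-read pattern QEMU's aio helper uses.
    buffers = [bytearray(n) for n in sizes]
    fd = os.open(path, os.O_RDONLY)
    try:
        nread = os.preadv(fd, buffers, offset)
    finally:
        os.close(fd)
    return nread, [bytes(b) for b in buffers]
```

The syscall count is the same as one read, so the host-side difference between the two tools should come from the O_DIRECT code path rather than the read/preadv distinction.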

Thanks,

John.






Re: bad O_DIRECT read and write performance with small block sizes with virtio

2010-08-03 Thread Avi Kivity

 On 08/03/2010 05:40 PM, John Leach wrote:


dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 iflag=direct
8192000000 bytes (8.2 GB) copied, 3.46529 s, 2.4 GB/s

dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000
8192000000 bytes (8.2 GB) copied, 5.5741 s, 1.5 GB/s

dd is just using read.



What's /dev/mapper/zero?  A real volume or a zero target?

--
error compiling committee.c: too many arguments to function



Re: 2.6.35 hangs on early boot in KVM

2010-08-03 Thread Avi Kivity

 On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote:

I have basically built 2.6.35 with make oldconfig from a working 2.6.34.
The latter works fine in kvm while 2.6.35 hangs very early. I see nothing after
grub (have early printk and verbose bootup enabled), just a blinking VGA
cursor and CPU at 100%.



Please copy kvm@vger.kernel.org on kvm issues.


CONFIG_PRINTK_TIME=y



Try disabling this as a workaround.

--
error compiling committee.c: too many arguments to function



Re: bad O_DIRECT read and write performance with small block sizes with virtio

2010-08-03 Thread John Leach
On Tue, 2010-08-03 at 09:35 +0300, Dor Laor wrote:
 On 08/02/2010 11:50 PM, Stefan Hajnoczi wrote:
  On Mon, Aug 2, 2010 at 6:46 PM, Anthony Liguorianth...@codemonkey.ws  
  wrote:
  On 08/02/2010 12:15 PM, John Leach wrote:
 
  Hi,
 
  I've come across a problem with read and write disk IO performance when
  using O_DIRECT from within a kvm guest.  With O_DIRECT, reads and writes
  are much slower with smaller block sizes.  Depending on the block size
  used, I've seen 10 times slower.
 
  For example, with an 8k block size, reading directly from /dev/vdb
  without O_DIRECT I see 750 MB/s, but with O_DIRECT I see 79 MB/s.
 
  As a comparison, reading in O_DIRECT mode in 8k blocks directly from the
  backend device on the host gives 2.3 GB/s.  Reading in O_DIRECT mode
  from a xen guest on the same hardware manages 263 MB/s.
 
 
  Stefan has a few fixes for this behavior that help a lot.  One of them
  (avoiding memset) is already upstream but not in 0.12.x.
 
  The other two are not done yet but should be on the ML in the next couple
  weeks.  They involve using ioeventfd for notification and unlocking the
  block queue lock while doing a kick notification.
 
  Thanks for mentioning those patches.  The ioeventfd patch will be sent
  this week, I'm checking that migration works correctly and then need
  to check that vhost-net still works.
 
  Writing is affected in the same way, and exhibits the same behaviour
  with O_SYNC too.
 
  Watching with vmstat on the host, I see the same number of blocks being
  read, but about 14 times the number of context switches in O_DIRECT mode
  (4500 cs vs. 63000 cs) and a little more cpu usage.
 
  The device I'm writing to is a device-mapper zero device that generates
  zeros on read and throws away writes, you can set it up
  at /dev/mapper/zero like this:
 
  echo "0 21474836480 zero" | dmsetup create zero
 
  My libvirt config for the disk is:
 
  <disk type='block' device='disk'>
    <driver cache='none'/>
    <source dev='/dev/mapper/zero'/>
    <target dev='vdb' bus='virtio'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x06'
             function='0x0'/>
  </disk>
 
  which translates to the kvm arg:
 
  -device
  virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
  -drive file=/dev/mapper/zero,if=none,id=drive-virtio-disk1,cache=none
 
 aio=native and change the io scheduler on the host to deadline should 
 help as well.

No improvement in this case (I was already using deadline on the host,
and just tested with aio=native). Tried with a real disk backend too,
still no improvement.

I'll try with and without once I get Stefan's other patches too though.

Thanks,

John.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 05:38:25PM +0300, Avi Kivity wrote:
 The time will only continue to grow as you add features and as the
 distro bloats naturally.
 
 Much better to create it once and only update it if some dependent
 file changes (basically the current on-the-fly code + save a list of
 file timestamps).

This applies to both cases; the initrd could also be saved, so:

 Total saving: 115ms.
 
 815 ms by my arithmetic.

no, not true, 115ms.

 You also save 3*N-2*P memory where N is the size of your initrd and
 P is the actual amount used by the guest.

Can you explain this?

 Loading a file into memory is plenty fast if you use the standard
 interfaces.  -kernel -initrd is a specialized interface.

Why bother with any command line options at all?  After all, they keep
changing and causing problems for qemu's users ...  Apparently we're
all doing stuff wrong, in ways that are never explained by the
developers.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora


Re: bad O_DIRECT read and write performance with small block sizes with virtio

2010-08-03 Thread John Leach
On Tue, 2010-08-03 at 17:44 +0300, Avi Kivity wrote:
 On 08/03/2010 05:40 PM, John Leach wrote:
 
  dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 iflag=direct
  8192000000 bytes (8.2 GB) copied, 3.46529 s, 2.4 GB/s
 
  dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000
  8192000000 bytes (8.2 GB) copied, 5.5741 s, 1.5 GB/s
 
  dd is just using read.
 
 
 What's /dev/mapper/zero?  A real volume or a zero target?
 

zero target:

echo "0 21474836480 zero" | dmsetup create zero

The same performance penalty occurs when using real disks though, I just
moved to a zero target to rule out the variables of spinning metal and
raid controller caches.

John.



Re: 2.6.35 hangs on early boot in KVM

2010-08-03 Thread Tvrtko Ursulin
On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote:
   On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote:
  I have basically built 2.6.35 with make oldconfig from a working 2.6.34.
  Latter works fine in kvm while 2.6.35 hangs very early. I see nothing
  after grub (have early printk and verbose bootup enabled), just a
  blinking VGA cursor and CPU at 100%.

 Please copy kvm@vger.kernel.org on kvm issues.

  CONFIG_PRINTK_TIME=y

 Try disabling this as a workaround.

I am in the middle of a bisect run with five builds left to go, currently I
have:

bad 537b60d17894b7c19a6060feae40299d7109d6e7
good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65

Tvrtko

Sophos Plc, The Pentagon, Abingdon Science Park, Abingdon, OX14 3YP, United 
Kingdom.
Company Reg No 2096520. VAT Reg No GB 348 3873 20.


Re: 2.6.35 hangs on early boot in KVM

2010-08-03 Thread Tvrtko Ursulin
On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote:
 On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote:
On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote:
   I have basically built 2.6.35 with make oldconfig from a working
   2.6.34. Latter works fine in kvm while 2.6.35 hangs very early. I see
   nothing after grub (have early printk and verbose bootup enabled),
   just a blinking VGA cursor and CPU at 100%.
 
  Please copy kvm@vger.kernel.org on kvm issues.
 
   CONFIG_PRINTK_TIME=y
 
  Try disabling this as a workaround.

 I am in the middle of a bisect run with five builds left to go, currently I
 have:

 bad 537b60d17894b7c19a6060feae40299d7109d6e7
 good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65

Bisect is looking good, narrowed it to ten revisions, but I am not sure I
will make it to the end today:

bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea
good 41d59102e146a4423a490b8eca68a5860af4fe1c

One interesting warning spotted:

include/config/auto.conf:555:warning: symbol value '-fcall-saved-ecx
-fcall-saved-edx' invalid for ARCH_HWEIGHT_CFLAGS



Re: 2.6.35 hangs on early boot in KVM

2010-08-03 Thread Tvrtko Ursulin

On Tuesday 03 Aug 2010 16:17:20 Tvrtko Ursulin wrote:
 On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote:
  On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote:
 On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote:
I have basically built 2.6.35 with make oldconfig from a working
2.6.34. Latter works fine in kvm while 2.6.35 hangs very early. I see
nothing after grub (have early printk and verbose bootup enabled),
just a blinking VGA cursor and CPU at 100%.
  
   Please copy kvm@vger.kernel.org on kvm issues.
  
CONFIG_PRINTK_TIME=y
  
   Try disabling this as a workaround.
 
  I am in the middle of a bisect run with five builds left to go, currently
  I have:
 
  bad 537b60d17894b7c19a6060feae40299d7109d6e7
  good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65

 Bisect is looking good, narrowed it to ten revisions, but I am not sure I
 will make it to the end today:

 bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea
 good 41d59102e146a4423a490b8eca68a5860af4fe1c

 One interesting warning spotted:

 include/config/auto.conf:555:warning: symbol value '-fcall-saved-ecx
 -fcall-saved-edx' invalid for ARCH_HWEIGHT_CFLAGS

Copying Peter and Borislav, guys please look at the above warning. I am
bisecting a non-bootable 2.6.35 under KVM and while I am not there yet, it is
close to the hweight commit (cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea) and I
spotted this warning.

Tvrtko




[PATCH 10/11] uio: do not use PCI resources before pci_enable_device()

2010-08-03 Thread Kulikov Vasiliy
IRQ and resource[] may not have correct values until
after PCI hotplug setup occurs at pci_enable_device() time.

The semantic match that finds this problem is as follows:

// <smpl>
@@
identifier x;
identifier request ~= "pci_request.*\|pci_resource.*";
@@

(
* x->irq
|
* x->resource
|
* request(x, ...)
)
 ...
*pci_enable_device(x)
// </smpl>

Signed-off-by: Kulikov Vasiliy sego...@gmail.com
---
 drivers/uio/uio_pci_generic.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
index 85c9884..fc22e1e 100644
--- a/drivers/uio/uio_pci_generic.c
+++ b/drivers/uio/uio_pci_generic.c
@@ -128,12 +128,6 @@ static int __devinit probe(struct pci_dev *pdev,
struct uio_pci_generic_dev *gdev;
int err;
 
-	if (!pdev->irq) {
-		dev_warn(&pdev->dev, "No IRQ assigned to device: "
-			 "no support for interrupts?\n");
-		return -ENODEV;
-	}
-
 	err = pci_enable_device(pdev);
 	if (err) {
 		dev_err(&pdev->dev, "%s: pci_enable_device failed: %d\n",
@@ -141,6 +135,13 @@ static int __devinit probe(struct pci_dev *pdev,
 		return err;
 	}
 
+	if (!pdev->irq) {
+		dev_warn(&pdev->dev, "No IRQ assigned to device: "
+			 "no support for interrupts?\n");
+		pci_disable_device(pdev);
+		return -ENODEV;
+	}
+
 	err = verify_pci_2_3(pdev);
 	if (err)
 		goto err_verify;
-- 
1.7.0.4
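The semantic match in the changelog above is a Coccinelle (SmPL) script whose quoting the archive has mangled. A minimal way to save and apply it — assuming Coccinelle's `spatch` is installed and with a file name made up for illustration — would be:

```shell
#!/bin/sh
# Restore the semantic patch from the changelog to a .cocci file.
cat > pci-enable-before-use.cocci <<'EOF'
// <smpl>
@@
identifier x;
identifier request ~= "pci_request.*\|pci_resource.*";
@@

(
* x->irq
|
* x->resource
|
* request(x, ...)
)
 ...
*pci_enable_device(x)
// </smpl>
EOF

# Report matches in the uio driver (needs coccinelle; shown for illustration):
#   spatch --sp-file pci-enable-before-use.cocci drivers/uio/

# The starred lines mark the expressions flagged when they occur
# before the pci_enable_device() call:
grep -c '^\*' pci-enable-before-use.cocci    # prints 4
```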



Re: [PATCH 10/11] uio: do not use PCI resources before pci_enable_device()

2010-08-03 Thread Michael S. Tsirkin
On Tue, Aug 03, 2010 at 07:44:23PM +0400, Kulikov Vasiliy wrote:
 IRQ and resource[] may not have correct values until
 after PCI hotplug setup occurs at pci_enable_device() time.
 
 The semantic match that finds this problem is as follows:
 
 // <smpl>
 @@
 identifier x;
 identifier request ~= "pci_request.*\|pci_resource.*";
 @@
 
 (
 * x->irq
 |
 * x->resource
 |
 * request(x, ...)
 )
  ...
 *pci_enable_device(x)
 // </smpl>
 
 Signed-off-by: Kulikov Vasiliy sego...@gmail.com

Looks sane.
Acked-by: Michael S. Tsirkin m...@redhat.com

 ---
  drivers/uio/uio_pci_generic.c |   13 +++--
  1 files changed, 7 insertions(+), 6 deletions(-)
 
 diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
 index 85c9884..fc22e1e 100644
 --- a/drivers/uio/uio_pci_generic.c
 +++ b/drivers/uio/uio_pci_generic.c
 @@ -128,12 +128,6 @@ static int __devinit probe(struct pci_dev *pdev,
   struct uio_pci_generic_dev *gdev;
   int err;
  
 -	if (!pdev->irq) {
 -		dev_warn(&pdev->dev, "No IRQ assigned to device: "
 -			 "no support for interrupts?\n");
 -		return -ENODEV;
 -	}
 -
  	err = pci_enable_device(pdev);
  	if (err) {
  		dev_err(&pdev->dev, "%s: pci_enable_device failed: %d\n",
 @@ -141,6 +135,13 @@ static int __devinit probe(struct pci_dev *pdev,
  		return err;
  	}
  
 +	if (!pdev->irq) {
 +		dev_warn(&pdev->dev, "No IRQ assigned to device: "
 +			 "no support for interrupts?\n");
 +		pci_disable_device(pdev);
 +		return -ENODEV;
 +	}
 +
  	err = verify_pci_2_3(pdev);
  	if (err)
  		goto err_verify;
 -- 
 1.7.0.4


Re: 2.6.35 hangs on early boot in KVM

2010-08-03 Thread Borislav Petkov
From: Tvrtko Ursulin tvrtko.ursu...@sophos.com
Date: Tue, Aug 03, 2010 at 11:31:02AM -0400

  One interesting warning spotted:
 
  include/config/auto.conf:555:warning: symbol value '-fcall-saved-ecx
  -fcall-saved-edx' invalid for ARCH_HWEIGHT_CFLAGS
 
 Copying Peter and Borislav, guys please look at the above warning. I am
 bisecting a non-bootable 2.6.35 under KVM and while I am not there yet, it is
 close to the hweight commit (cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea) and I
 spotted this warning.

That's because you're at a bisection point before the hweight patch but
your .config already contains the ARCH_HWEIGHT_CFLAGS variable because
of the previous bisection point which contained the hweight patch.

I think this can be safely ignored.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632


RE: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.

2010-08-03 Thread Shirley Ma
Hello Xiaohui,

On Tue, 2010-08-03 at 16:48 +0800, Xin, Xiaohui wrote:
 May you share me with your performance results (including BW and
 latency)on 
 vhost-net and how you get them(your configuration and especially with
 the affinity 
 settings)? 

My macvtap zero copy is incomplete, I am testing sendmsg only now. The
initial performance is not good especially for latency (zero copy vs.
copy). I am still working on it to find out why and how to improve.
That's the reason I am eager to know your performance results and how
much performance gain you have seen.

Since your patch has completed. I would try your patch here for
performance. If you have some performance results to share here that
would be great.

Thanks
Shirley



Re: 2.6.35 hangs on early boot in KVM

2010-08-03 Thread Tvrtko Ursulin
On Tuesday 03 Aug 2010 16:17:20 Tvrtko Ursulin wrote:
 On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote:
  On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote:
 On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote:
I have basically built 2.6.35 with make oldconfig from a working
2.6.34. Latter works fine in kvm while 2.6.35 hangs very early. I see
nothing after grub (have early printk and verbose bootup enabled),
just a blinking VGA cursor and CPU at 100%.
  
   Please copy kvm@vger.kernel.org on kvm issues.
  
CONFIG_PRINTK_TIME=y
  
   Try disabling this as a workaround.
 
  I am in the middle of a bisect run with five builds left to go, currently
  I have:
 
  bad 537b60d17894b7c19a6060feae40299d7109d6e7
  good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65

 Bisect is looking good, narrowed it to ten revisions, but I am not sure I
 will make it to the end today:

 bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea
 good 41d59102e146a4423a490b8eca68a5860af4fe1c

Bisect points the finger at "x86, ioapic: In mpparse use mp_register_ioapic"
(cf7500c0ea133d66f8449d86392d83f840102632), so I am copying Eric. No idea
whether this commit is solely to blame or it is a combined interaction with
KVM, but I am sure you guys will know.

If you want me to test something else please shout.

Tvrtko



Re: 2.6.35 hangs on early boot in KVM

2010-08-03 Thread Tvrtko Ursulin
On Tuesday 03 Aug 2010 16:49:01 Borislav Petkov wrote:
 From: Tvrtko Ursulin tvrtko.ursu...@sophos.com
 Date: Tue, Aug 03, 2010 at 11:31:02AM -0400

   One interesting warning spotted:
  
   include/config/auto.conf:555:warning: symbol value '-fcall-saved-ecx
   -fcall-saved-edx' invalid for ARCH_HWEIGHT_CFLAGS
 
  Copying Peter and Borislav, guys please look at the above warning. I am
  bisecting a non-bootable 2.6.35 under KVM and while I am not there yet,
  it is close to the hweight commit
  (cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea) and I spotted this warning.

 That's because you're at a bisection point before the hweight patch but
 your .config already contains the ARCH_HWEIGHT_CFLAGS variable because
 of the previous bisection point which contained the hweight patch.

 I think this can be safely ignored.

Yep, bisect pointed to another commit so I continued another part of this
thread. Thanks for the explanation!

Tvrtko



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 05:53 PM, Richard W.M. Jones wrote:



Total saving: 115ms.

815 ms by my arithmetic.

no, not true, 115ms.


If you bypass creating the initrd/cdrom (700 ms) and loading it (115ms) 
you save 815ms.



You also save 3*N-2*P memory where N is the size of your initrd and
P is the actual amount used by the guest.

Can you explain this?


(assuming ahead-of-time image generation)

initrd:
  qemu reads image (host pagecache): N
  qemu stores image in RAM: N
  guest copies image to its RAM: N
  guest faults working set (no XIP): P
  total: 3N+P

initramfs:
  qemu reads image (host pagecache): N
  qemu stores image: N
  guest copies image: N
  guest extracts image (XIP): N
  total: 4N

cdrom:
  guest faults working set: P
  kernel faults working set: P
  total: 2P

difference: 3N-P or 4N-2P depending on model
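The accounting above reduces to simple arithmetic; as a sanity check, here it is with made-up numbers (a 100 MB image N and a 20 MB guest working set P, both purely illustrative):

```shell
#!/bin/sh
# Sanity-check the initrd/initramfs/cdrom memory accounting (values in MB).
N=100   # image size
P=20    # working set the guest actually faults in

initrd=$((3 * N + P))    # host pagecache + qemu copy + guest copy + working set
initramfs=$((4 * N))     # as above, but extraction adds a fourth full copy
cdrom=$((2 * P))         # guest and kernel each fault only the working set

echo "initrd:    ${initrd} MB"      # 320
echo "initramfs: ${initramfs} MB"   # 400
echo "cdrom:     ${cdrom} MB"       # 40
echo "saved vs initrd:    $((initrd - cdrom)) MB (= 3N - P)"      # 280
echo "saved vs initramfs: $((initramfs - cdrom)) MB (= 4N - 2P)"  # 360
```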



Loading a file into memory is plenty fast if you use the standard
interfaces.  -kernel -initrd is a specialized interface.

Why bother with any command line options at all?  After all, they keep
changing and causing problems for qemu's users ...  Apparently we're
all doing stuff wrong, in ways that are never explained by the
developers.


That's a real problem.  It's hard to explain the intent behind 
something, especially when it's obvious to the author and not so obvious 
to the user.  However making everything do everything under all 
circumstances has its costs.


-kernel and -initrd is a developer's interface intended to make life 
easier for users that use qemu to develop kernels.  It was not intended 
as a high performance DMA engine.  Neither was the firmware 
_configuration_ interface.  That is what virtio and to a lesser extent 
IDE was written to perform.  You'll get much better results from them.


--
error compiling committee.c: too many arguments to function



Re: bad O_DIRECT read and write performance with small block sizes with virtio

2010-08-03 Thread Avi Kivity

 On 08/03/2010 05:57 PM, John Leach wrote:

On Tue, 2010-08-03 at 17:44 +0300, Avi Kivity wrote:

On 08/03/2010 05:40 PM, John Leach wrote:

dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000 iflag=direct
8192000000 bytes (8.2 GB) copied, 3.46529 s, 2.4 GB/s

dd if=/dev/mapper/zero of=/dev/null bs=8k count=1000000
8192000000 bytes (8.2 GB) copied, 5.5741 s, 1.5 GB/s

dd is just using read.


What's /dev/mapper/zero?  A real volume or a zero target?


zero target:

echo 0 21474836480 zero | dmsetup create zero

The same performance penalty occurs when using real disks though, I just
moved to a zero target to rule out the variables of spinning metal and
raid controller caches.


Don't, it's confusing things.  I'd expect dd to be slower with 
iflag=direct since the kernel can't do readahead and instead must 
round-trip to the controller.  With a zero target it's faster since it 
doesn't have to round-trip and instead avoids a copy.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 07:10:18PM +0300, Avi Kivity wrote:
 -kernel and -initrd is a developer's interface intended to make life
 easier for users that use qemu to develop kernels.  It was not
 intended as a high performance DMA engine.  Neither was the firmware
 _configuration_ interface.  That is what virtio and to a lesser
 extent IDE was written to perform.  You'll get much better results
 from them.

Firmware configuration replaced something which was already working
really fast -- preloading the images into memory -- with something
which worked slower, and has just recently got _way_ more slow.

This is a regression.  Plain and simple.

I have posted a small patch which makes this 650x faster without
appreciable complication.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top


Re: [PATCH 00/27] KVM PPC PV framework v3

2010-08-03 Thread Scott Wood
On Sun, 1 Aug 2010 22:21:37 +0200
Alexander Graf ag...@suse.de wrote:

 
 On 01.08.2010, at 16:02, Avi Kivity wrote:
 
  Looks reasonable.  Since it's fair to say I understand nothing about 
  powerpc, I'd like someone who does to review it and ack, please, with an 
  emphasis on the interfaces.
 
 Sounds good. Preferably someone with access to the ePAPR spec :).

The ePAPR-relevant stuff in patches 7, 16, and 17 looks reasonable.
Did I miss any ePAPR-relevant stuff in the other patches?

-Scott



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 09:53 AM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 05:38:25PM +0300, Avi Kivity wrote:
   

The time will only continue to grow as you add features and as the
distro bloats naturally.

Much better to create it once and only update it if some dependent
file changes (basically the current on-the-fly code + save a list of
file timestamps).
 

This applies to both cases, the initrd could also be saved, so:

   

Total saving: 115ms.
   

815 ms by my arithmetic.
 

no, not true, 115ms.

   

You also save 3*N-2*P memory where N is the size of your initrd and
P is the actual amount used by the guest.
 

Can you explain this?

   

Loading a file into memory is plenty fast if you use the standard
interfaces.  -kernel -initrd is a specialized interface.
 

Why bother with any command line options at all?  After all, they keep
changing and causing problems for qemu's users ...  Apparently we're
all doing stuff wrong, in ways that are never explained by the
developers.
   


Let's be fair.  I think we've all agreed to adjust the fw_cfg interface 
to implement DMA.  The only requirement was that the DMA operation not 
be triggered from a single port I/O but rather based on a polling 
operation which better fits the way real hardware works.


Is this a regression?  Probably.  But performance regressions that 
result from correctness fixes don't get reverted.  We have to find an 
approach to improve performance without impacting correctness.


That said, the general view of -kernel/-append is that these are 
developer options and we don't really look at it as a performance 
critical interface.  We could do a better job of communicating this to 
users but that's true of most of the features we support.


Regards,

Anthony Liguori


Rich.

   




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:28 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 07:10:18PM +0300, Avi Kivity wrote:

-kernel and -initrd is a developer's interface intended to make life
easier for users that use qemu to develop kernels.  It was not
intended as a high performance DMA engine.  Neither was the firmware
_configuration_ interface.  That is what virtio and to a lesser
extent IDE was written to perform.  You'll get much better results
from them.

Firmware configuration replaced something which was already working
really fast -- preloading the images into memory -- with something
which worked slower, and has just recently got _way_ more slow.

This is a regression.  Plain and simple.


It's only a regression if there was any intent at making this a 
performant interface.  Otherwise any change can be interpreted as a 
regression.  Even "binary doesn't hash to exact same signature" is a 
regression.



I have posted a small patch which makes this 650x faster without
appreciable complication.


It doesn't appear to support live migration, or hiding the feature for 
-M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the kernel 
and virtio support demand loading of any image size you'd want to use.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 11:44 AM, Avi Kivity wrote:

 On 08/03/2010 07:28 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 07:10:18PM +0300, Avi Kivity wrote:

-kernel and -initrd is a developer's interface intended to make life
easier for users that use qemu to develop kernels.  It was not
intended as a high performance DMA engine.  Neither was the firmware
_configuration_ interface.  That is what virtio and to a lesser
extent IDE was written to perform.  You'll get much better results
from them.

Firmware configuration replaced something which was already working
really fast -- preloading the images into memory -- with something
which worked slower, and has just recently got _way_ more slow.

This is a regression.  Plain and simple.


It's only a regression if there was any intent at making this a 
performant interface.  Otherwise any change can be interpreted as a 
regression.  Even "binary doesn't hash to exact same signature" is a 
regression.



I have posted a small patch which makes this 650x faster without
appreciable complication.


It doesn't appear to support live migration, or hiding the feature for 
-M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the kernel 
and virtio support demand loading of any image size you'd want to use.


firmware is totally broken with respect to -M older FWIW.

Regards,

Anthony Liguori




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:44 PM, Avi Kivity wrote:


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the kernel 
and virtio support demand loading of any image size you'd want to use.




Even better would be to use virtio-9p.  You don't even need an image in 
this case.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:46 PM, Anthony Liguori wrote:
It doesn't appear to support live migration, or hiding the feature 
for -M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the 
kernel and virtio support demand loading of any image size you'd want 
to use.



firmware is totally broken with respect to -M older FWIW.



Well, then this is adding to the brokenness.

fwcfg dma is going to have exactly one user, libguestfs.  Much better to 
have libguestfs move to some other interface and improve our 
users-to-interfaces ratio.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 11:50 AM, Avi Kivity wrote:

 On 08/03/2010 07:46 PM, Anthony Liguori wrote:
It doesn't appear to support live migration, or hiding the feature 
for -M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the 
kernel and virtio support demand loading of any image size you'd 
want to use.



firmware is totally broken with respect to -M older FWIW.



Well, then this is adding to the brokenness.

fwcfg dma is going to have exactly one user, libguestfs.  Much better 
to have libguestfs move to some other interface and improve our 
users-to-interfaces ratio.


You mean, only one class of users cares about the performance of loading 
an initrd.  However, you've also argued in other threads how important 
it is not to break libvirt even if it means we have to do silly things 
(like change help text).


So... why is it that libguestfs has to change itself and yet we should 
bend over backwards so libvirt doesn't have to change itself?


Regards,

Anthony Liguori




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:53 PM, Anthony Liguori wrote:

On 08/03/2010 11:50 AM, Avi Kivity wrote:

 On 08/03/2010 07:46 PM, Anthony Liguori wrote:
It doesn't appear to support live migration, or hiding the feature 
for -M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the 
kernel and virtio support demand loading of any image size you'd 
want to use.



firmware is totally broken with respect to -M older FWIW.



Well, then this is adding to the brokenness.

fwcfg dma is going to have exactly one user, libguestfs.  Much better 
to have libguestfs move to some other interface and improve our 
users-to-interfaces ratio.


You mean, only one class of users cares about the performance of 
loading an initrd.  However, you've also argued in other threads how 
important it is not to break libvirt even if it means we have to do 
silly things (like change help text).


So... why is it that libguestfs has to change itself and yet we should 
bend over backwards so libvirt doesn't have to change itself?


libvirt is a major user that is widely deployed, and would be completely 
broken if we change -help.  Changing -help is of no consequence to us.
libguestfs is a (pardon me) minor user that is not widely used, and 
would suffer a performance regression, not total breakage, unless we add 
a fw-dma interface.  Adding the interface is of consequence to us: we 
have to implement live migration and backwards compatibility, and 
support this new interface for a long while.


In an ideal world we wouldn't tolerate any regression.  The world is not 
ideal, so we prioritize.


the -help change scores very high on benefit/cost.  fw-dma, much lower.

Note in both cases the long term solution is for the user to move to 
another interface (cap reporting, virtio), so adding an interface which 
would only be abandoned later by its only user drops the benefit/cost 
ratio even further.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 07:56 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 07:44:49PM +0300, Avi Kivity wrote:

  On 08/03/2010 07:28 PM, Richard W.M. Jones wrote:

I have posted a small patch which makes this 650x faster without
appreciable complication.

It doesn't appear to support live migration, or hiding the feature
for -M older.

AFAICT live migration should still work (even assuming someone live
migrates a domain during early boot, which seems pretty unlikely ...)


Live migration is sometimes performed automatically by management tools, 
which have no idea (nor do they care) what the guest is doing.



Maybe you mean live migration of the dma_* global variables?  I can
fix that.


Yes.


It's not a good path to follow.  Tomorrow we'll need to load 300MB
initrds and we'll have to rework this yet again.

Not a very good straw man ...  The patch would take ~300ms instead
of ~115ms, versus something like 2 mins 40 seconds with the current
method.



It's still 300ms extra time, with a 900MB footprint.

btw, a DMA interface which blocks the guest and/or qemu for 115ms is not 
something we want to introduce to qemu.  dma is hard, doing something 
simple means it won't work very well.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 11:50 AM, Avi Kivity wrote:

 On 08/03/2010 07:46 PM, Anthony Liguori wrote:
It doesn't appear to support live migration, or hiding the feature 
for -M older.


It's not a good path to follow.  Tomorrow we'll need to load 300MB 
initrds and we'll have to rework this yet again.  Meanwhile the 
kernel and virtio support demand loading of any image size you'd 
want to use.



firmware is totally broken with respect to -M older FWIW.



Well, then this is adding to the brokenness.

fwcfg dma is going to have exactly one user, libguestfs.  Much better 
to have libguestfs move to some other interface and improve our 
users-to-interfaces ratio.


BTW, the brokenness is that regardless of -M older, we always use the 
newest firmware.  Because we always use the newest firmware, fwcfg is not a 
backwards compatible interface.


Migration totally screws this up.  While we migrate roms (and correctly 
now thanks to Alex's patches), we size the allocation based on the 
newest firmware size.  That means if we ever decreased the size of a 
rom, we'd see total failure (even if we had a compatible fwcfg interface).


Regards,

Anthony Liguori



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 07:44:49PM +0300, Avi Kivity wrote:
  On 08/03/2010 07:28 PM, Richard W.M. Jones wrote:
 I have posted a small patch which makes this 650x faster without
 appreciable complication.
 
 It doesn't appear to support live migration, or hiding the feature
 for -M older.

AFAICT live migration should still work (even assuming someone live
migrates a domain during early boot, which seems pretty unlikely ...)
Maybe you mean live migration of the dma_* global variables?  I can
fix that.

 It's not a good path to follow.  Tomorrow we'll need to load 300MB
 initrds and we'll have to rework this yet again.

Not a very good straw man ...  The patch would take ~300ms instead
of ~115ms, versus something like 2 mins 40 seconds with the current
method.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top


Re: [PATCH 10/11] uio: do not use PCI resources before pci_enable_device()

2010-08-03 Thread Hans J. Koch
On Tue, Aug 03, 2010 at 07:44:23PM +0400, Kulikov Vasiliy wrote:
 IRQ and resource[] may not have correct values until
 after PCI hotplug setup occurs at pci_enable_device() time.
 
 The semantic match that finds this problem is as follows:
 
// <smpl>
@@
identifier x;
identifier request ~= "pci_request.*|pci_resource.*";
@@

(
* x->irq
|
* x->resource
|
* request(x, ...)
)
 ...
*pci_enable_device(x)
// </smpl>
 
 Signed-off-by: Kulikov Vasiliy sego...@gmail.com

Looks alright to me, thanks!

Signed-off-by: Hans J. Koch h...@linutronix.de

 ---
  drivers/uio/uio_pci_generic.c |   13 +++--
  1 files changed, 7 insertions(+), 6 deletions(-)
 
diff --git a/drivers/uio/uio_pci_generic.c b/drivers/uio/uio_pci_generic.c
index 85c9884..fc22e1e 100644
--- a/drivers/uio/uio_pci_generic.c
+++ b/drivers/uio/uio_pci_generic.c
@@ -128,12 +128,6 @@ static int __devinit probe(struct pci_dev *pdev,
 	struct uio_pci_generic_dev *gdev;
 	int err;
 
-	if (!pdev->irq) {
-		dev_warn(&pdev->dev, "No IRQ assigned to device: "
-			 "no support for interrupts?\n");
-		return -ENODEV;
-	}
-
 	err = pci_enable_device(pdev);
 	if (err) {
 		dev_err(&pdev->dev, "%s: pci_enable_device failed: %d\n",
@@ -141,6 +135,13 @@ static int __devinit probe(struct pci_dev *pdev,
 		return err;
 	}
 
+	if (!pdev->irq) {
+		dev_warn(&pdev->dev, "No IRQ assigned to device: "
+			 "no support for interrupts?\n");
+		pci_disable_device(pdev);
+		return -ENODEV;
+	}
+
 	err = verify_pci_2_3(pdev);
 	if (err)
 		goto err_verify;
-- 
1.7.0.4


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 12:01 PM, Avi Kivity wrote:
You mean, only one class of users cares about the performance of 
loading an initrd.  However, you've also argued in other threads how 
important it is not to break libvirt even if it means we have to do 
silly things (like change help text).


So... why is it that libguestfs has to change itself and yet we 
should bend over backwards so libvirt doesn't have to change itself?



libvirt is a major user that is widely deployed, and would be 
completely broken if we change -help.  Changing -help is of no 
consequence to us.
libguestfs is a (pardon me) minor user that is not widely used, and 
would suffer a performance regression, not total breakage, unless we 
add a fw-dma interface.  Adding the interface is of consequence to us: 
we have to implement live migration and backwards compatibility, and 
support this new interface for a long while.


I certainly buy the argument about making changes of little consequence 
to us vs. ones that we have to be concerned about long term.


However, I don't think we can objectively differentiate between a 
major and minor user.  Generally speaking, I would rather that we 
not take the position of you are a minor user therefore we're not going 
to accommodate you.


Regards,

Anthony Liguori



In an ideal world we wouldn't tolerate any regression.  The world is 
not ideal, so we prioritize.


the -help change scores very high on benefit/cost.  fw-dma, much lower.

Note in both cases the long term solution is for the user to move to 
another interface (cap reporting, virtio), so adding an interface 
which would only be abandoned later by its only user drops the 
benefit/cost ratio even further.






Re: [PATCH RFC 3/4] Paravirtualized spinlock implementation for KVM guests

2010-08-03 Thread Jeremy Fitzhardinge

 On 08/02/2010 11:59 PM, Avi Kivity wrote:

 On 08/02/2010 06:20 PM, Jeremy Fitzhardinge wrote:

 On 08/02/2010 01:48 AM, Avi Kivity wrote:

 On 07/26/2010 09:15 AM, Srivatsa Vaddagiri wrote:
Paravirtual spinlock implementation for KVM guests, based heavily 
on Xen guest's

spinlock implementation.


+
+static struct spinlock_stats
+{
+u64 taken;
+u32 taken_slow;
+
+u64 released;
+
+#define HISTO_BUCKETS 30
+u32 histo_spin_total[HISTO_BUCKETS+1];
+u32 histo_spin_spinning[HISTO_BUCKETS+1];
+u32 histo_spin_blocked[HISTO_BUCKETS+1];
+
+u64 time_total;
+u64 time_spinning;
+u64 time_blocked;
+} spinlock_stats;


Could these be replaced by tracepoints when starting to 
spin/stopping spinning etc?  Then userspace can reconstruct the 
histogram as well as see which locks are involved and what call paths.
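A userspace reconstruction of that histogram from raw tracepoint events could look like the sketch below. The log2 bucketing mirrors what the Xen-derived stats appear to do; the (start, end) event format is an assumption for illustration:

```python
import math

HISTO_BUCKETS = 30  # same bucket count as the patch's spinlock_stats

def bucket(delta_ns):
    # log2 bucketing with a clamp at the last bucket
    if delta_ns <= 0:
        return 0
    return min(int(math.log2(delta_ns)), HISTO_BUCKETS)

def histogram(events):
    # events: iterable of (start_ns, end_ns) pairs, e.g. parsed from
    # spin-start/spin-stop tracepoint records
    histo = [0] * (HISTO_BUCKETS + 1)
    for start, end in events:
        histo[bucket(end - start)] += 1
    return histo

h = histogram([(0, 100), (0, 1000), (0, 2 ** 40)])
print(h[6], h[9], h[30])  # → 1 1 1
```

Since tracepoint records also carry the lock address and call stack, the same pass can split the histogram per lock or per call path, which the in-kernel counters cannot.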


Unfortunately not; the tracing code uses spinlocks.

(TBH I haven't actually tried, but I did give the code an eyeball to 
this end.)


Hm.  The tracing code already uses a specialized lock 
(arch_spinlock_t), perhaps we can make this lock avoid the tracing?


That's not really a specialized lock; that's just the naked 
architecture-provided spinlock implementation, without all the lockdep, 
etc, etc stuff layered on top.  All these changes are at a lower level, 
so giving tracing its own type of spinlock amounts to making the 
architectures provide two complete spinlock implementations.  We could 
make tracing use, for example, an rwlock so long as we promise not to 
put tracing in the rwlock implementation - but that's hardly elegant.


It's really sad, btw, there's all those nice lockless ring buffers and 
then a spinlock for ftrace_vbprintk(), instead of a per-cpu buffer.


Sad indeed.

J



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 08:42 PM, Anthony Liguori wrote:
However, I don't think we can objectively differentiate between a 
major and minor user.  Generally speaking, I would rather that we 
not take the position of you are a minor user therefore we're not 
going to accommodate you.


Again it's a matter of practicalities.  We have written virtio drivers 
for Windows and Linux, but not for FreeDOS or NetWare.  To speed up 
Windows XP we have (in qemu-kvm) kvm-tpr-opt.c that is a gross breach of 
decency, would we go to the same lengths to speed up Haiku?  I suggest 
that we would not.


libvirt and Windows XP did not win major user status by making large 
anonymous donations to qemu developers.  They did so by having lots of 
users.  Those users are our end users, and we should be focusing our 
efforts in a way that maximizes the gain for as large a number of those 
end users as we can.


Not breaking libvirt will be unknowingly appreciated by a large number 
of users, every day.  Not slowing down libguestfs, by a much smaller 
number for a much shorter time.  If it were just a matter of changing 
the help text I wouldn't mind at all, but introducing an undocumented 
migration-unsafe broken-dma interface isn't something I'm happy to do.


btw, gaining back some of the speed that we lost _is_ something I want 
to do, since it doesn't break or add any interfaces, and would be a gain 
not just for libguestfs, but also for Windows installs (which use string 
pio extensively).  Richard, can you test kvm.git master?  it already 
contains one fix and we plan to add more.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 08:58:10PM +0300, Avi Kivity wrote:
 Richard, can you test kvm.git
 master?  it already contains one fix and we plan to add more.

Yup, I will ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 12:58 PM, Avi Kivity wrote:

 On 08/03/2010 08:42 PM, Anthony Liguori wrote:
However, I don't think we can objectively differentiate between a 
major and minor user.  Generally speaking, I would rather that we 
not take the position of you are a minor user therefore we're not 
going to accommodate you.


Again it's a matter of practicalities.  We have written virtio 
drivers for Windows and Linux, but not for FreeDOS or NetWare.  To 
speed up Windows XP we have (in qemu-kvm) kvm-tpr-opt.c that is a 
gross breach of decency, would we go to the same lengths to speed up 
Haiku?  I suggest that we would not.


tpr-opt optimizes a legitimate dependence on the x86 architecture that 
Windows has.  While the implementation may be grossly indecent, it 
certainly fits the overall mission of what we're trying to do in qemu 
and kvm which is emulate an architecture.


You've invested a lot of time and effort into it because it's important 
to you (or more specifically, your employer).  That's because Windows is 
important to you.


If someone as adept and committed as you were heavily invested in Haiku and 
was willing to implement something equivalent to tpr-opt and also 
willing to do all of the work of maintaining it, then rejecting such a 
patch would be a mistake.


If Richard is willing to do the work to make -kernel perform faster in 
such a way that it fits into the overall mission of what we're building, 
then I see no reason to reject it.  The criteria for evaluating a patch 
should only depend on how it affects other areas of qemu and whether it 
impacts overall usability.


As a side note, we ought to do a better job of removing features that 
have created a burden on other areas of qemu that aren't actively being 
maintained.  That's a different discussion though.


Regards,

Anthony Liguori


Re: kvm IPC

2010-08-03 Thread Nirmal Guhan
Thanks. Will this work if the guest and host are a different combo - for
instance ubuntu/debian or fedora/ubuntu? In other words, is there
anything generic other than using sockets? I'm OK with using PCI to
communicate too if that can improve performance. Any pointers would be
helpful.

--Nirmal

On Tue, Aug 3, 2010 at 1:05 AM, Amit Shah amit.s...@redhat.com wrote:
 On (Thu) Jul 29 2010 [16:17:48], Nirmal Guhan wrote:
 Hi,

 I run Fedora 12 and guest is also Fedora 12. I use br0/tap0 for
 networking and communicate between host-guest using socket.  I do
 see some references to virtio, pci based ipc and inter-vm shared
 memory but they are not current. My question is : Is there a better
 IPC mechanism for host-guest and inter-VM communication and if so
 could you provide me with pointers?

 There's virtio-serial, which is a channel between a guest and the host.
 You can short-circuit two host-side chardevs to get inter-VM channels as
 well.

 See

 https://fedoraproject.org/wiki/Features/VirtioSerial

 for more info.

 This is only available from F13, though.
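As a concrete sketch of what this looks like on the command line (the channel and socket names here are invented for illustration, and exact option spellings vary across QEMU versions):

```shell
# Host side: a virtio-serial port backed by a UNIX socket chardev
# (the usual disk/network options are omitted here).
qemu-kvm \
  -device virtio-serial-pci \
  -chardev socket,id=ch0,path=/tmp/demo.sock,server,nowait \
  -device virtserialport,chardev=ch0,name=org.example.channel.0

# Guest side (F13 and later): the port appears as
#   /dev/virtio-ports/org.example.channel.0
# and can be read and written like a pipe.
```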

                Amit



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 09:26 PM, Anthony Liguori wrote:

On 08/03/2010 12:58 PM, Avi Kivity wrote:

 On 08/03/2010 08:42 PM, Anthony Liguori wrote:
However, I don't think we can objectively differentiate between a 
major and minor user.  Generally speaking, I would rather that 
we not take the position of you are a minor user therefore we're 
not going to accommodate you.


Again it's a matter of practicalities.  We have written virtio 
drivers for Windows and Linux, but not for FreeDOS or NetWare.  To 
speed up Windows XP we have (in qemu-kvm) kvm-tpr-opt.c that is a 
gross breach of decency, would we go to the same lengths to speed up 
Haiku?  I suggest that we would not.


tpr-opt optimizes a legitimate dependence on the x86 architecture that 
Windows has.  While the implementation may be grossly indecent, it 
certainly fits the overall mission of what we're trying to do in qemu 
and kvm which is emulate an architecture.


You've invested a lot of time and effort into it because it's 
important to you (or more specifically, your employer).  That's 
because Windows is important to you.


Correct.



If someone as adept and committed as you were heavily invested in Haiku 
and was willing to implement something equivalent to tpr-opt and also 
willing to do all of the work of maintaining it, then rejecting such a 
patch would be a mistake.


libguestfs does not depend on an x86 architectural feature.  
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We should 
discourage people from depending on this interface for production use.




If Richard is willing to do the work to make -kernel perform faster in 
such a way that it fits into the overall mission of what we're 
building, then I see no reason to reject it.  The criteria for 
evaluating a patch should only depend on how it affects other areas of 
qemu and whether it impacts overall usability.


That's true, but extending fwcfg doesn't fit into the overall picture 
well.  We have well defined interfaces for pushing data into a guest: 
virtio-serial (dma upload), virtio-blk (adds demand paging), and 
virtio-p9fs (no image needed).  Adapting libguestfs to use one of these 
is a better move than adding yet another interface.


A better (though still inaccurate) analogy would be if the developers 
of a guest OS came up with a virtual bus for devices and were willing to 
do the work to make this bus perform better.  Would we accept this new 
work or would we point them at our existing bus (pci) instead?


Really, the bar on new interfaces (both to guest and host) should be 
high, much higher than it is now.  Interfaces should be well documented, 
future proof, migration safe, and orthogonal to existing interfaces.  
While the first three points could be improved with some effort, adding 
a new dma interface is not going to be orthogonal to virtio.  And 
frankly, libguestfs is better off switching to one of the other 
interfaces.  Slurping huge initrds isn't the right way to do this.


As a side note, we ought to do a better job of removing features that 
have created a burden on other areas of qemu that aren't actively 
being maintained.  That's a different discussion though.


Sure, we need something like Linux' 
Documentation/feature-removal-schedule.txt for people to ignore.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 09:43 PM, Avi Kivity wrote:
Really, the bar on new interfaces (both to guest and host) should be 
high, much higher than it is now.  Interfaces should be well 
documented, future proof, migration safe, and orthogonal to existing 
interfaces.  While the first three points could be improved with some 
effort, adding a new dma interface is not going to be orthogonal to 
virtio.  And frankly, libguestfs is better off switching to one of the 
other interfaces.  Slurping huge initrds isn't the right way to do this.


btw, precedent should play no role here.  Just because an older 
interface wasn't documented or migration-safe or unit-tested doesn't 
mean new ones get off the hook.


It does help to have a framework in place that we can point people at, 
for example I added a skeleton Documentation/kvm/api.txt and some unit 
tests and then made contributors fill them in for new features.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 01:43 PM, Avi Kivity wrote:


If Richard is willing to do the work to make -kernel perform faster 
in such a way that it fits into the overall mission of what we're 
building, then I see no reason to reject it.  The criteria for 
evaluating a patch should only depend on how it affects other areas 
of qemu and whether it impacts overall usability.


That's true, but extending fwcfg doesn't fit into the overall picture 
well.  We have well defined interfaces for pushing data into a guest: 
virtio-serial (dma upload), virtio-blk (adds demand paging), and 
virtio-p9fs (no image needed).  Adapting libguestfs to use one of 
these is a better move than adding yet another interface.


On real hardware, there's an awful lot of interaction between the 
firmware and the platform.  It's a pretty rich interface.  On IBM 
systems, we actually extend that all the way down to userspace via a 
virtual USB RNDIS driver that you can use IPMI over.


A better (though still inaccurate) analogy would be if the 
developers of a guest OS came up with a virtual bus for devices and 
were willing to do the work to make this bus perform better.  Would we 
accept this new work or would we point them at our existing bus (pci) 
instead?


Doesn't this precisely describe virtio-s390?



Really, the bar on new interfaces (both to guest and host) should be 
high, much higher than it is now.  Interfaces should be well 
documented, future proof, migration safe, and orthogonal to existing 
interfaces.


Okay, but this is a bigger discussion that I'm very eager to have.  But 
we shouldn't explicitly apply new policies to random patches without 
clearly stating the policy up front.


Regards,

Anthony Liguori

  While the first three points could be improved with some effort, 
adding a new dma interface is not going to be orthogonal to virtio.  
And frankly, libguestfs is better off switching to one of the other 
interfaces.  Slurping huge initrds isn't the right way to do this.


As a side note, we ought to do a better job of removing features that 
have created a burden on other areas of qemu that aren't actively 
being maintained.  That's a different discussion though.


Sure, we need something like Linux' 
Documentation/feature-removal-schedule.txt for people to ignore.






Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 09:55 PM, Anthony Liguori wrote:

On 08/03/2010 01:43 PM, Avi Kivity wrote:


If Richard is willing to do the work to make -kernel perform faster 
in such a way that it fits into the overall mission of what we're 
building, then I see no reason to reject it.  The criteria for 
evaluating a patch should only depend on how it affects other areas 
of qemu and whether it impacts overall usability.


That's true, but extending fwcfg doesn't fit into the overall picture 
well.  We have well defined interfaces for pushing data into a guest: 
virtio-serial (dma upload), virtio-blk (adds demand paging), and 
virtio-p9fs (no image needed).  Adapting libguestfs to use one of 
these is a better move than adding yet another interface.


On real hardware, there's an awful lot of interaction between the 
firmware and the platform.  It's a pretty rich interface.  On IBM 
systems, we actually extend that all the way down to userspace via a 
virtual USB RNDIS driver that you can use IPMI over.


That is fine and we'll do pv interfaces when we have to.  That's fwcfg, 
that's virtio.  But let's not do more than we have to.




A better (though still inaccurate) analogy would be if the 
developers of a guest OS came up with a virtual bus for devices and 
were willing to do the work to make this bus perform better.  Would 
we accept this new work or would we point them at our existing bus 
(pci) instead?


Doesn't this precisely describe virtio-s390?


As I understood it, s390 had good reasons not to use their native 
interfaces.  On x86 we have no good reason not to use pci and no good 
reason not to use virtio for dma.




Really, the bar on new interfaces (both to guest and host) should be 
high, much higher than it is now.  Interfaces should be well 
documented, future proof, migration safe, and orthogonal to existing 
interfaces.


Okay, but this is a bigger discussion that I'm very eager to have.  
But we shouldn't explicitly apply new policies to random patches 
without clearly stating the policy up front.




Migration safety has been part of the criteria for a while.  
Future-proofing less so.  Documentation was usually completely missing, but I 
see no reason not to insist on it now; better late than never.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gleb Natapov
On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
 
 If Richard is willing to do the work to make -kernel perform
 faster in such a way that it fits into the overall mission of what
 we're building, then I see no reason to reject it.  The criteria
 for evaluating a patch should only depend on how it affects other
 areas of qemu and whether it impacts overall usability.
 
 That's true, but extending fwcfg doesn't fit into the overall
 picture well.  We have well defined interfaces for pushing data into
 a guest: virtio-serial (dma upload), virtio-blk (adds demand
 paging), and virtio-p9fs (no image needed).  Adapting libguestfs to
 use one of these is a better move than adding yet another interface.
 
+1. I already proposed that. Nobody objects to a fast
communication channel between guest and host. In fact we have one:
virtio-serial. Of course it is much easier to hack DMA semantics into the
fw_cfg interface than to add virtio-serial to SeaBIOS, but that doesn't make
it right. Does virtio-serial have to be exposed as PCI to a guest, or can
we expose it as an ISA device too, in case someone wants to use the -kernel
option but does not want to see an additional PCI device in the guest?

--
Gleb.


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 10:05 PM, Gleb Natapov wrote:



That's true, but extending fwcfg doesn't fit into the overall
picture well.  We have well defined interfaces for pushing data into
a guest: virtio-serial (dma upload), virtio-blk (adds demand
paging), and virtio-p9fs (no image needed).  Adapting libguestfs to
use one of these is a better move than adding yet another interface.


+1. I already proposed that. Nobody objects to a fast
communication channel between guest and host. In fact we have one:
virtio-serial. Of course it is much easier to hack DMA semantics into the
fw_cfg interface than to add virtio-serial to SeaBIOS, but that doesn't make
it right. Does virtio-serial have to be exposed as PCI to a guest, or can
we expose it as an ISA device too, in case someone wants to use the -kernel
option but does not want to see an additional PCI device in the guest?


No need for virtio-serial in firmware.  We can have a small initrd slurp 
a larger filesystem via virtio-serial, or mount a virtio-blk or 
virtio-p9fs, or boot the whole thing from a virtio-blk image and avoid 
-kernel -initrd completely.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
 libguestfs does not depend on an x86 architectural feature.
 qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
 should discourage people from depending on this interface for
 production use.

I really don't get this whole thing where we must slavishly
emulate an exact PC ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 02:05 PM, Gleb Natapov wrote:

On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
   

If Richard is willing to do the work to make -kernel perform
faster in such a way that it fits into the overall mission of what
we're building, then I see no reason to reject it.  The criteria
for evaluating a patch should only depend on how it affects other
areas of qemu and whether it impacts overall usability.
   

That's true, but extending fwcfg doesn't fit into the overall
picture well.  We have well defined interfaces for pushing data into
a guest: virtio-serial (dma upload), virtio-blk (adds demand
paging), and virtio-p9fs (no image needed).  Adapting libguestfs to
use one of these is a better move than adding yet another interface.

 

+1. I already proposed that. Nobody objects to a fast
communication channel between guest and host. In fact we have one:
virtio-serial. Of course it is much easier to hack DMA semantics into the
fw_cfg interface than to add virtio-serial to SeaBIOS, but that doesn't make
it right. Does virtio-serial have to be exposed as PCI to a guest, or can
we expose it as an ISA device too, in case someone wants to use the -kernel
option but does not want to see an additional PCI device in the guest?
   


fw_cfg has to be available pretty early on so relying on a PCI device 
isn't reasonable.  Having dual interfaces seems wasteful.


We're already doing bulk data transfer over fw_cfg as we need to do it 
to transfer roms and potentially a boot splash.  Even outside of loading 
an initrd, the performance is going to start to matter with a large 
number of devices.


Regards,

Anthony Liguori


--
Gleb.
   




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gleb Natapov
On Tue, Aug 03, 2010 at 08:13:46PM +0100, Richard W.M. Jones wrote:
 On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
  libguestfs does not depend on an x86 architectural feature.
  qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
  should discourage people from depending on this interface for
  production use.
 
 I really don't get this whole thing where we must slavishly
 emulate an exact PC ...
 
Maybe because you don't have to deal with the consequences of not doing so?

--
Gleb.


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 02:13 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
   

libguestfs does not depend on an x86 architectural feature.
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
should discourage people from depending on this interface for
production use.
 

I really don't get this whole thing where we must slavishly
emulate an exact PC ...
   


History has shown that when we deviate, we usually get it wrong and it 
becomes very painful to fix.


Regards,

Anthony Liguori


Rich.

   




Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 10:13 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:

libguestfs does not depend on an x86 architectural feature.
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
should discourage people from depending on this interface for
production use.

I really don't get this whole thing where we must slavishly
emulate an exact PC ...


This has two motivations:

- documented interfaces: we suck at documentation.  We seldom document.  
Even when we do document something, the documentation is often 
inaccurate, misleading, and incomplete.  While an exact PC 
unfortunately doesn't exist, it's a lot closer to reality than, say, an 
exact Linux syscall interface.  If we adopt an existing interface, we 
already have the documentation, and if there's a conflict between the 
documentation and our implementation, it's clear who wins (well, not 
always).


- preexisting guests: if we design a new interface, we get to update all 
guests; and there are many of them.  Whereas an exact PC will be seen 
by the guest vendors as well who will then add whatever support is 
necessary.


Obviously we break this when we have to, but when we don't have to, we shouldn't.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 10:15 PM, Anthony Liguori wrote:


fw_cfg has to be available pretty early on so relying on a PCI device 
isn't reasonable.  Having dual interfaces seems wasteful.


Agree.



We're already doing bulk data transfer over fw_cfg as we need to do it 
to transfer roms and potentially a boot splash. 


Why do we need to transfer roms?  These are devices on the memory bus or 
pci bus, it just needs to be there at the right address.  Boot splash 
should just be another rom as it would be on a real system.


Even outside of loading an initrd, the performance is going to start 
to matter with a large number of devices.


I don't really see why.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gleb Natapov
On Tue, Aug 03, 2010 at 02:15:05PM -0500, Anthony Liguori wrote:
 On 08/03/2010 02:05 PM, Gleb Natapov wrote:
 On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
 If Richard is willing to do the work to make -kernel perform
 faster in such a way that it fits into the overall mission of what
 we're building, then I see no reason to reject it.  The criteria
 for evaluating a patch should only depend on how it affects other
 areas of qemu and whether it impacts overall usability.
 That's true, but extending fwcfg doesn't fit into the overall
 picture well.  We have well defined interfaces for pushing data into
 a guest: virtio-serial (dma upload), virtio-blk (adds demand
 paging), and virtio-p9fs (no image needed).  Adapting libguestfs to
 use one of these is a better move than adding yet another interface.
 
 +1. I already proposed that. Nobody objects against fast fast
 communication channel between guest and host. In fact we have one:
 virtio-serial. Of course it is much easier to hack dma semantic into
 fw_cfg interface than add virtio-serial to seabios, but it doesn't make
 it right. Does virtio-serial has to be exposed as PCI to a guest or can
 we expose it as ISA device too in case someone want to use -kernel option
 but do not see additional PCI device in a guest?
 
 fw_cfg has to be available pretty early on so relying on a PCI
 device isn't reasonable.  Having dual interfaces seems wasteful.
 
fw_cfg wasn't meant to be used for bulk transfers (seabios doesn't even
use string pio to access it, which makes load time 50 times slower than
what Richard reports). It was meant to be easy to use at very early
stages of booting. Kernel/initrd are loaded at a very late stage of
booting, at which point PCI is fully initialized.

 We're already doing bulk data transfer over fw_cfg as we need to do
 it to transfer roms and potentially a boot splash.  Even outside of
 loading an initrd, the performance is going to start to matter with
 a large number of devices.
 
Most roms are loaded from PCI ROM BARs, so this leaves us with boot
splash, but a boot splash image should be relatively small, and if the
user wants one, boot time is already not a concern, since the BIOS needs
to pause to show the boot splash anyway.

--
Gleb.


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 02:24 PM, Avi Kivity wrote:

 On 08/03/2010 10:15 PM, Anthony Liguori wrote:


fw_cfg has to be available pretty early on so relying on a PCI device 
isn't reasonable.  Having dual interfaces seems wasteful.


Agree.



We're already doing bulk data transfer over fw_cfg as we need to do 
it to transfer roms and potentially a boot splash. 


Why do we need to transfer roms?  These are devices on the memory bus 
or pci bus, it just needs to be there at the right address.


Not quite.  The BIOS owns the option ROM space.  The way it works on 
bare metal is that the PCI ROM BAR gets mapped to some location in 
physical memory by the BIOS, the BIOS executes the initialization 
vector, and after initialization, the ROM will reorganize itself into 
something smaller.  It's nice and clean.


But ISA is not nearly as clean.  Ultimately, to make this mix work in a 
reasonable way, we have to provide a side channel interface to SeaBIOS 
such that we can deliver ROMs outside of PCI and still let SeaBIOS 
decide how ROMs get organized.


It's additionally complicated by the fact that we didn't support PCI ROM 
BAR until recently so to maintain compatibility with -M older, we have 
to use a side channel to lay out option roms.


Regards,

Anthony Liguori


  Boot splash should just be another rom as it would be on a real system.

Even outside of loading an initrd, the performance is going to start 
to matter with a large number of devices.


I don't really see why.





Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Avi Kivity

 On 08/03/2010 10:38 PM, Anthony Liguori wrote:
Why do we need to transfer roms?  These are devices on the memory bus 
or pci bus, it just needs to be there at the right address.



Not quite.  The BIOS owns the option ROM space.  The way it works on 
bare metal is that the PCI ROM BAR gets mapped to some location in 
physical memory by the BIOS, the BIOS executes the initialization 
vector, and after initialization, the ROM will reorganize itself into 
something smaller.  It's nice and clean.


But ISA is not nearly as clean. 


So far so good.

Ultimately, to make this mix work in a reasonable way, we have to 
provide a side channel interface to SeaBIOS such that we can deliver 
ROMs outside of PCI and still let SeaBIOS decide how ROMs get organized.


I don't follow.  Why do we need this side channel?  What would a real 
ISA machine do?  Are there actually enough ISA devices for there to be a 
problem?




It's additionally complicated by the fact that we didn't support PCI 
ROM BAR until recently so to maintain compatibility with -M older, we 
have to use a side channel to lay out option roms.


Again I don't follow.  We can just lay out the ROMs in memory like we 
did in the past?


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 02:41 PM, Avi Kivity wrote:

 On 08/03/2010 10:38 PM, Anthony Liguori wrote:
Why do we need to transfer roms?  These are devices on the memory 
bus or pci bus, it just needs to be there at the right address.



Not quite.  The BIOS owns the option ROM space.  The way it works on 
bare metal is that the PCI ROM BAR gets mapped to some location in 
physical memory by the BIOS, the BIOS executes the initialization 
vector, and after initialization, the ROM will reorganize itself into 
something smaller.  It's nice and clean.


But ISA is not nearly as clean. 


So far so good.

Ultimately, to make this mix work in a reasonable way, we have to 
provide a side channel interface to SeaBIOS such that we can deliver 
ROMs outside of PCI and still let SeaBIOS decide how ROMs get organized.


I don't follow.  Why do we need this side channel?  What would a real 
ISA machine do?


It depends on the ISA machine.  In the worst case, there's a DIP switch 
on the card and if you've got a conflict between two cards, you start 
flipping DIP switches.  It's pure awesomeness.  No, I don't want to 
emulate DIP switches :-)



  Are there actually enough ISA devices for there to be a problem?


No, but -M older has the same problem.



It's additionally complicated by the fact that we didn't support PCI 
ROM BAR until recently so to maintain compatibility with -M older, we 
have to use a side channel to lay out option roms.


Again I don't follow.  We can just lay out the ROMs in memory like we 
did in the past?


Because only one component can own the option ROM space.  Either that's 
SeaBIOS and we need a side channel or it's QEMU and we can't use PMM.


I guess that's the real issue here.  Previously we used etherboot which 
was well under 32k.  We only loaded roms we needed.  Now we use gPXE 
which is much bigger and if you don't use PMM, then you run out of 
option rom space very quickly.


Previously, we loaded option ROMs on demand when a user used -boot n but 
that was a giant hack and wasn't like bare metal at all.  It involved 
x86-isms in vl.c.  Now we always load ROMs so PMM is very important.


Regards,

Anthony Liguori


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 10:22:22PM +0300, Avi Kivity wrote:
  On 08/03/2010 10:13 PM, Richard W.M. Jones wrote:
 On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
 libguestfs does not depend on an x86 architectural feature.
 qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
 should discourage people from depending on this interface for
 production use.
 I really don't get this whole thing where we must slavishly
 emulate an exact PC ...
 
 This has two motivations:
 
 - documented interfaces: we suck at documentation.  We seldom
 document.  Even when we do document something, the documentation is
 often inaccurate, misleading, and incomplete.  While an exact PC
 unfortunately doesn't exist, it's a lot closer to reality than, say,
 an exact Linux syscall interface.  If we adopt an existing
 interface, we already have the documentation, and if there's a
 conflict between the documentation and our implementation, it's
 clear who wins (well, not always).
 
 - preexisting guests: if we design a new interface, we get to update
 all guests; and there are many of them.  Whereas an exact PC will
 be seen by the guest vendors as well who will then add whatever
 support is necessary.

On the other hand we end up with stuff like only being able to add 29
virtio-blk devices to a single guest.  As best as I can tell, this
comes from PCI, and this limit required a bunch of hacks when
implementing virt-df.

These are reasonable motivations, but I think they are partially about
us:

We could document things better and make things future-proof.  I'm
surprised by how lacking the doc requirements are for qemu (compared
to, hmm, libguestfs for example).

We could demand that OSes write device drivers for more qemu devices
-- already OS vendors write thousands of device drivers for all sorts
of obscure devices, so this isn't really much of a demand for them.
In fact, they're already doing it.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/


Re: [Qemu-devel] [PATCH] ceph/rbd block driver for qemu-kvm (v4)

2010-08-03 Thread Christian Brunner
On Tue, Aug 03, 2010 at 12:37:18AM +0400, malc wrote:
 
 There are whitespace issues in this patch.

Thanks for looking at the patch. Here is an updated patch that
should fix the whitespace issues:

This is a block driver for the distributed file system Ceph
(http://ceph.newdream.net/). This driver uses librados (which
is part of the Ceph server) for direct access to the Ceph object
store and is running entirely in userspace.

It now has (read only) snapshot support and passes all relevant
qemu-iotests.

To compile the driver you need at least ceph 0.21.

Additional information is available on the Ceph-Wiki:

http://ceph.newdream.net/wiki/Kvm-rbd

The patch is based on git://repo.or.cz/qemu/kevin.git block

Signed-off-by: Christian Brunner c...@muc.de

---
 Makefile.objs |1 +
 block/rbd.c   |  907 +
 block/rbd_types.h |   71 +
 configure |   31 ++
 4 files changed, 1010 insertions(+), 0 deletions(-)
 create mode 100644 block/rbd.c
 create mode 100644 block/rbd_types.h

diff --git a/Makefile.objs b/Makefile.objs
index 4a1eaa1..bf45142 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -18,6 +18,7 @@ block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o
 block-nested-$(CONFIG_WIN32) += raw-win32.o
 block-nested-$(CONFIG_POSIX) += raw-posix.o
 block-nested-$(CONFIG_CURL) += curl.o
+block-nested-$(CONFIG_RBD) += rbd.o
 
 block-obj-y +=  $(addprefix block/, $(block-nested-y))
 
diff --git a/block/rbd.c b/block/rbd.c
new file mode 100644
index 000..0e6b2a5
--- /dev/null
+++ b/block/rbd.c
@@ -0,0 +1,907 @@
+/*
+ * QEMU Block driver for RADOS (Ceph)
+ *
+ * Copyright (C) 2010 Christian Brunner c...@muc.de
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "qemu-error.h"
+#include <sys/types.h>
+#include <stdbool.h>
+
+#include "qemu-common.h"
+
+#include "rbd_types.h"
+#include "module.h"
+#include "block_int.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <rados/librados.h>
+
+#include <signal.h>
+
+
+int eventfd(unsigned int initval, int flags);
+
+
+/*
+ * When specifying the image filename use:
+ *
+ * rbd:poolname/devicename
+ *
+ * poolname must be the name of an existing rados pool
+ *
+ * devicename is the basename for all objects used to
+ * emulate the raw device.
+ *
+ * Metadata information (image size, ...) is stored in an
+ * object with the name devicename.rbd.
+ *
+ * The raw device is split into 4MB sized objects by default.
+ * The sequencenumber is encoded in a 12 byte long hex-string,
+ * and is attached to the devicename, separated by a dot.
+ * e.g. devicename.1234567890ab
+ *
+ */
+
+#define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)
+
+typedef struct RBDAIOCB {
+BlockDriverAIOCB common;
+QEMUBH *bh;
+int ret;
+QEMUIOVector *qiov;
+char *bounce;
+int write;
+int64_t sector_num;
+int aiocnt;
+int error;
+struct BDRVRBDState *s;
+} RBDAIOCB;
+
+typedef struct RADOSCB {
+int rcbid;
+RBDAIOCB *acb;
+int done;
+int64_t segsize;
+char *buf;
+} RADOSCB;
+
+typedef struct BDRVRBDState {
+int efd;
+rados_pool_t pool;
+rados_pool_t header_pool;
+char name[RBD_MAX_OBJ_NAME_SIZE];
+char block_name[RBD_MAX_BLOCK_NAME_SIZE];
+uint64_t size;
+uint64_t objsize;
+int qemu_aio_count;
+int read_only;
+} BDRVRBDState;
+
+typedef struct rbd_obj_header_ondisk RbdHeader1;
+
+static int rbd_parsename(const char *filename, char *pool, char **snap,
+ char *name)
+{
+const char *rbdname;
+char *p;
+int l;
+
+if (!strstart(filename, "rbd:", &rbdname)) {
+return -EINVAL;
+}
+
+pstrcpy(pool, 2 * RBD_MAX_SEG_NAME_SIZE, rbdname);
+p = strchr(pool, '/');
+if (p == NULL) {
+return -EINVAL;
+}
+
+*p = '\0';
+
+l = strlen(pool);
+if (l >= RBD_MAX_SEG_NAME_SIZE) {
+error_report("pool name too long");
+return -EINVAL;
+} else if (l <= 0) {
+error_report("pool name too short");
+return -EINVAL;
+}
+
+l = strlen(++p);
+if (l >= RBD_MAX_OBJ_NAME_SIZE) {
+error_report("object name too long");
+return -EINVAL;
+} else if (l <= 0) {
+error_report("object name too short");
+return -EINVAL;
+}
+}
+
+strcpy(name, p);
+
+*snap = strchr(name, '@');
+if (*snap) {
+*(*snap) = '\0';
+(*snap)++;
+if (!*snap) *snap = NULL;
+}
+
+return l;
+}
+
+static int create_tmap_op(uint8_t op, const char *name, char **tmap_desc)
+{
+uint32_t len = strlen(name);
+/* total_len = encoding op + name + empty buffer */
+uint32_t total_len = 1 + (sizeof(uint32_t) + len) + sizeof(uint32_t);
+char *desc = NULL;
+
+desc = qemu_malloc(total_len);
+
+*tmap_desc = desc;
+
+*desc = op;
+desc++;
+memcpy(desc, &len, sizeof(len));
+desc += sizeof(len);
+

Re: 2.6.35 hangs on early boot in KVM

2010-08-03 Thread Eric W. Biederman
Tvrtko Ursulin tvrtko.ursu...@sophos.com writes:

 On Tuesday 03 Aug 2010 16:17:20 Tvrtko Ursulin wrote:
 On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote:
  On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote:
 On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote:
I have basically built 2.6.35 with make oldconfig from a working
2.6.34. Latter works fine in kvm while 2.6.35 hangs very early. I see
nothing after grub (have early printk and verbose bootup enabled),
just a blinking VGA cursor and CPU at 100%.
  
   Please copy kvm@vger.kernel.org on kvm issues.
  
CONFIG_PRINTK_TIME=y
  
   Try disabling this as a workaround.
 
  I am in the middle of a bisect run with five builds left to go, currently
  I have:
 
  bad 537b60d17894b7c19a6060feae40299d7109d6e7
  good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65

 Bisect is looking good, narrowed it to ten revisions, but I am not sure I'll
 make it to the end today:

 bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea
 good 41d59102e146a4423a490b8eca68a5860af4fe1c

 Bisect points the finger to x86, ioapic: In mpparse use mp_register_ioapic
 (cf7500c0ea133d66f8449d86392d83f840102632), so I am copying Eric. No idea
 whether this commit is solely to blame or it is a combined interaction with
 KVM, but I am sure you guys will know.

 If you want me to test something else please shout.

Interesting.  This is the second report I have heard of no VGA output
and a hang early in boot, that was bisected to this commit.  Since I
could not reproduce it I was hoping it was a fluke with a single piece
of hardware, but it appears not.

There was in fact an off by one bug in that commit, but if that had
been the issue 2.6.35 would have booted ok.  There was nothing in that
commit that should have prevented early output, and in fact I can boot
with a very similar configuration. So I am trying to figure out what
pieces are interacting to cause this failure mode to happen.

What version of kvm are you running on your host (in case that matters)?

I want to reproduce this myself so I can start guessing what weird
interactions are going on.

Eric


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 03:00 PM, Richard W.M. Jones wrote:

On Tue, Aug 03, 2010 at 10:22:22PM +0300, Avi Kivity wrote:
   

  On 08/03/2010 10:13 PM, Richard W.M. Jones wrote:
 

On Tue, Aug 03, 2010 at 09:43:39PM +0300, Avi Kivity wrote:
   

libguestfs does not depend on an x86 architectural feature.
qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We
should discourage people from depending on this interface for
production use.
 

I really don't get this whole thing where we must slavishly
emulate an exact PC ...
   

This has two motivations:

- documented interfaces: we suck at documentation.  We seldom
document.  Even when we do document something, the documentation is
often inaccurate, misleading, and incomplete.  While an exact PC
unfortunately doesn't exist, it's a lot closer to reality than, say,
an exact Linux syscall interface.  If we adopt an existing
interface, we already have the documentation, and if there's a
conflict between the documentation and our implementation, it's
clear who wins (well, not always).

- preexisting guests: if we design a new interface, we get to update
all guests; and there are many of them.  Whereas an exact PC will
be seen by the guest vendors as well who will then add whatever
support is necessary.
 

On the other hand we end up with stuff like only being able to add 29
virtio-blk devices to a single guest.  As best as I can tell, this
comes from PCI


No, this comes from us being too clever for our own good and not 
following the way hardware does it.


All modern systems keep disks on their own dedicated bus.  In 
virtio-blk, we have a 1-1 relationship between disks and PCI devices.  
That's a perfect example of what happens when we try to improve things.



, and this limit required a bunch of hacks when
implementing virt-df.

These are reasonable motivations, but I think they are partially about
us:

We could document things better and make things future-proof.  I'm
surprised by how lacking the doc requirements are for qemu (compared
to, hmm, libguestfs for example).
   


We enjoy complaining about our lack of documentation more than we like 
actually writing documentation.



We could demand that OSes write device drivers for more qemu devices
-- already OS vendors write thousands of device drivers for all sorts
of obscure devices, so this isn't really much of a demand for them.
In fact, they're already doing it.
   


So far, MS hasn't quite gotten the clue yet that they should write 
device drivers for qemu :-)  In fact, noone has.


Regards,

Anthony Liguori


Rich.

   




Re: 2.6.35 hangs on early boot in KVM

2010-08-03 Thread Yinghai Lu
On Tue, Aug 3, 2010 at 8:59 AM, Tvrtko Ursulin
tvrtko.ursu...@sophos.com wrote:
 On Tuesday 03 Aug 2010 16:17:20 Tvrtko Ursulin wrote:
 On Tuesday 03 Aug 2010 15:57:03 Tvrtko Ursulin wrote:
  On Tuesday 03 Aug 2010 15:51:08 Avi Kivity wrote:
     On 08/03/2010 12:28 PM, Tvrtko Ursulin wrote:
I have basically built 2.6.35 with make oldconfig from a working
2.6.34. Latter works fine in kvm while 2.6.35 hangs very early. I see
nothing after grub (have early printk and verbose bootup enabled),
just a blinking VGA cursor and CPU at 100%.
  
   Please copy kvm@vger.kernel.org on kvm issues.
  
CONFIG_PRINTK_TIME=y
  
   Try disabling this as a workaround.
 
  I am in the middle of a bisect run with five builds left to go, currently
  I have:
 
  bad 537b60d17894b7c19a6060feae40299d7109d6e7
  good 93c9d7f60c0cb7715890b1f9e159da6f4d1f5a65

 Bisect is looking good, narrowed it to ten revisions, but I am not sure I'll
 make it to the end today:

 bad cb41838bbc4403f7270a94b93a9a0d9fc9c2e7ea
 good 41d59102e146a4423a490b8eca68a5860af4fe1c

 Bisect points the finger to x86, ioapic: In mpparse use mp_register_ioapic
 (cf7500c0ea133d66f8449d86392d83f840102632), so I am copying Eric. No idea
 whether this commit is solely to blame or it is a combined interaction with
 KVM, but I am sure you guys will know.

 If you want me to test something else please shout.


please try attached patch, to see if it help.

Yinghai
[PATCH] x86: check if apic/pin is shared with legacy one

fix system that external device that have io apic on apic0/pin(0-15)

also
for the io apic out of order system:
<6>ACPI: IOAPIC (id[0x10] address[0xfecff000] gsi_base[0])
<6>IOAPIC[0]: apic_id 16, version 0, address 0xfecff000, GSI 0-2
<6>ACPI: IOAPIC (id[0x0f] address[0xfec0] gsi_base[3])
<6>IOAPIC[1]: apic_id 15, version 0, address 0xfec0, GSI 3-38
<6>ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[39])
<6>IOAPIC[2]: apic_id 14, version 0, address 0xfec01000, GSI 39-74
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 1 global_irq 4 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 5 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 3 global_irq 6 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 4 global_irq 7 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 6 global_irq 9 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 7 global_irq 10 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 11 low edge)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 12 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 12 global_irq 15 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 13 global_irq 16 dfl dfl)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 17 low edge)
<6>ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 18 dfl dfl)

after this patch will get

apic0, pin0, GSI 0: irq 0+75
apic0, pin1, GSI 1: irq 1+75
apic0, pin2, GSI 2: irq 2
apic1, pin0, GSI 3: irq 3+75
apic1, pin5, GSI 8: irq 8+75
apic1, pin10,GSI 13: irq 13+75
apic1, pin11,GSI 14: irq 14+75

because mp_config_acpi_legacy_irqs will put apic0, pin2, irq2 in mp_irqs...
so pin_2_irq_legacy will report 2.
irq_to_gsi will still report 2. so it is right.
gsi_to_irq will report 2.

for 0, 1, 3, 8, 13, 14: still right

Signed-off-by: Yinghai Lu ying...@kernel.org

---
 arch/x86/kernel/apic/io_apic.c |   31 ---
 1 file changed, 28 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
===
--- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
+++ linux-2.6/arch/x86/kernel/apic/io_apic.c
@@ -1013,6 +1013,28 @@ static inline int irq_trigger(int idx)
 	return MPBIOS_trigger(idx);
 }
 
+static int pin_2_irq_legacy(int apic, int pin)
+{
+	int i;
+
+	for (i = 0; i < mp_irq_entries; i++) {
+		int bus = mp_irqs[i].srcbus;
+
+		if (!test_bit(bus, mp_bus_not_pci))
+			continue;
+
+		if (mp_ioapics[apic].apicid != mp_irqs[i].dstapic)
+			continue;
+
+		if (mp_irqs[i].dstirq != pin)
+			continue;
+
+		return mp_irqs[i].srcbusirq;
+	}
+
+	return -1;
+}
+
 static int pin_2_irq(int idx, int apic, int pin)
 {
 	int irq;
@@ -1029,10 +1051,13 @@ static int pin_2_irq(int idx, int apic,
 	} else {
 		u32 gsi = mp_gsi_routing[apic].gsi_base + pin;
 
-		if (gsi >= NR_IRQS_LEGACY)
+		if (gsi >= NR_IRQS_LEGACY) {
 			irq = gsi;
-		else
-			irq = gsi_top + gsi;
+		} else {
+			irq = pin_2_irq_legacy(apic, pin);
+			if (irq < 0)
+				irq = gsi_top + gsi;
+		}
 	}
 
 #ifdef CONFIG_X86_32


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Paolo Bonzini

On 08/03/2010 10:49 PM, Anthony Liguori wrote:

On the other hand we end up with stuff like only being able to add 29
virtio-blk devices to a single guest.  As best as I can tell, this
comes from PCI


No, this comes from us being too clever for our own good and not
following the way hardware does it.

All modern systems keep disks on their own dedicated bus.  In
virtio-blk, we have a 1-1 relationship between disks and PCI devices.
That's a perfect example of what happens when we try to improve things.


Comparing (from personal experience) the complexity of the Windows 
drivers for Xen and virtio shows that it's not a bad idea at all.


Paolo


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gerd Hoffmann

  Hi,


We're already doing bulk data transfer over fw_cfg as we need to do it
to transfer roms and potentially a boot splash.


Why do we need to transfer roms? These are devices on the memory bus or
pci bus, it just needs to be there at the right address.


Indeed.  We do that in most cases.  The exceptions are:

  (1) -M somethingold.  PCI devices don't have a pci rom bar then by
  default because they didn't have one in older qemu versions,
  so we need some other way to pass the option rom to seabios.
  (2) vgabios.bin.  vgabios needs patches to make loading via pci rom
  bar work (vgabios-cirrus.bin works fine already).  I have patches
  in the queue to do that.
  (3) roms not associated with a PCI device:  multiboot, extboot,
  -option-rom command line switch, vgabios for -M isapc.

The default configuration (qemu $diskimage) loads two roms: 
vgabios-cirrus.bin and e1000.bin.  Both are loaded via pci rom bar and 
not via fw_cfg.


cheers,
  Gerd



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Gerd Hoffmann

  Hi,


Again I don't follow. We can just lay out the ROMs in memory like we did
in the past?


Well.  We have some size issues then.  PCI ROMS are loaded by the BIOS 
in a way that only a small fraction is actually resident in the small 
0xd - 0xe area.  That doesn't work if qemu tries to simply copy 
the whole thing there like old versions did.  With the size of the gPXE 
roms this matters in real life.


cheers,
  Gerd



Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Anthony Liguori

On 08/03/2010 04:13 PM, Paolo Bonzini wrote:

On 08/03/2010 10:49 PM, Anthony Liguori wrote:

On the other hand we end up with stuff like only being able to add 29
virtio-blk devices to a single guest.  As best as I can tell, this
comes from PCI


No, this comes from us being too clever for our own good and not
following the way hardware does it.

All modern systems keep disks on their own dedicated bus.  In
virtio-blk, we have a 1-1 relationship between disks and PCI devices.
That's a perfect example of what happens when we try to improve 
things.


Comparing (from personal experience) the complexity of the Windows 
drivers for Xen and virtio shows that it's not a bad idea at all.


Not quite sure what you're suggesting, but I could have been clearer.  
Instead of having virtio-blk where a virtio disk has a 1-1 mapping to a 
PCI device, we probably should have just done virtio-scsi.


Since most OSes have a SCSI-centric block layer, it would have resulted 
in much simpler drivers and we could support more than 1 disk per PCI 
slot.  I had thought Christoph was working on such a device at some 
point in time...


Regards,

Anthony Liguori



Paolo




performance with libvirt and kvm

2010-08-03 Thread Nirmal Guhan
Hi,

I am seeing a performance degradation while using libvirt to start my
vm (kvm). vm is fedora 12 and host is also fedora 12, both with
2.6.32.10-90.fc12.i686. Here are the statistics from iperf :

From VM: [  3]  0.0-30.0 sec   199 MBytes  55.7 Mbits/sec

From host : [  3]  0.0-30.0 sec   331 MBytes  92.6 Mbits/sec

libvirt command as seen from ps output :

/usr/bin/qemu-kvm -S -M pc-0.11 -enable-kvm -m 512 -smp 1 -name
f12kvm1 -uuid 9300bfe2-2b9c-d9f0-3b03-9c7fe9934393 -monitor
unix:/var/lib/libvirt/qemu/f12kvm1.monitor,server,nowait -boot c
-drive file=/var/lib/libvirt/f12.img,if=ide,bus=0,unit=0,boot=on,format=raw
-drive if=ide,media=cdrom,bus=1,unit=0,format=raw -net
nic,macaddr=52:54:00:51:7c:39,vlan=0,model=virtio,name=net0 -net
tap,fd=21,vlan=0,name=hostnet0 -serial pty -parallel none -usb -vnc
127.0.0.1:0 -k en-us -vga cirrus -balloon virtio


If I start a similar vm using qemu-kvm directly, the performance
matches with the host.
qemu kvm : [  3]  0.0-30.0 sec   329 MBytes  91.9 Mbits/sec
TCP window size is 64K for all the cases.

Command used : qemu-kvm vdisk.img -m 512 -net
nic,model=virtio,macaddr=$macaddress -net tap,script=/etc/qemu-ifup

Any clues?

Thanks,
Nirmal


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Richard W.M. Jones
On Tue, Aug 03, 2010 at 10:24:41PM +0300, Avi Kivity wrote:
> Why do we need to transfer roms?  These are devices on the memory
> bus or pci bus, it just needs to be there at the right address.
> Boot splash should just be another rom as it would be on a real
> system.

Just like the initrd?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://et.redhat.com/~rjones/libguestfs/
See what it can do: http://et.redhat.com/~rjones/libguestfs/recipes.html


Re: Alt SeaBIOS SSDT cpu hotplug

2010-08-03 Thread Kevin O'Connor
On Tue, Aug 03, 2010 at 05:00:49PM +0800, Liu, Jinsong wrote:
> I just tested your new patch with Windows 2008 DataCenter on my
> platform, and it works OK! We can hot-add new cpus and they appear in
> Device Manager.  (BTW, yesterday I tested your new patch with a linux
> 2.6.32 hvm guest; it works fine, we can add-remove-add-remove... cpus)
> Sorry for making you spend more time. It's our fault.

Thanks.

I'll go ahead and commit it then.  I have one incremental patch (see
below) which I will also commit.

-Kevin


--- ssdt-proc.dsl   2010-08-03 18:45:12.0 -0400
+++ src/ssdt-proc.dsl   2010-08-03 18:45:17.0 -0400
@@ -44,7 +44,7 @@
             Return(CPST(ID))
         }
         Method (_EJ0, 1, NotSerialized) {
-            Return(CPEJ(ID, Arg0))
+            CPEJ(ID, Arg0)
         }
     }
 }


Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?

2010-08-03 Thread Jamie Lokier
Richard W.M. Jones wrote:
> We could demand that OSes write device drivers for more qemu devices
> -- already OS vendors write thousands of device drivers for all sorts
> of obscure devices, so this isn't really much of a demand for them.
> In fact, they're already doing it.

Result: Most OSes not working with qemu?

Actually we seem to be going that way.  Recent qemus don't work with
older versions of Windows any more, so we have to use different
versions of qemu for different guests.

-- Jamie


buildbot failure in qemu-kvm on disable_kvm_x86_64_debian_5_0

2010-08-03 Thread qemu-kvm
The Buildbot has detected a new failure of disable_kvm_x86_64_debian_5_0 on 
qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_x86_64_debian_5_0/builds/497

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_1

Build Reason: The Nightly scheduler named 'nightly_disable_kvm' triggered this 
build
Build Source Stamp: [branch master] HEAD
Blamelist: 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



buildbot failure in qemu-kvm on disable_kvm_i386_debian_5_0

2010-08-03 Thread qemu-kvm
The Buildbot has detected a new failure of disable_kvm_i386_debian_5_0 on 
qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_i386_debian_5_0/builds/498

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_2

Build Reason: The Nightly scheduler named 'nightly_disable_kvm' triggered this 
build
Build Source Stamp: [branch master] HEAD
Blamelist: 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



buildbot failure in qemu-kvm on disable_kvm_x86_64_out_of_tree

2010-08-03 Thread qemu-kvm
The Buildbot has detected a new failure of disable_kvm_x86_64_out_of_tree on 
qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_x86_64_out_of_tree/builds/446

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_1

Build Reason: The Nightly scheduler named 'nightly_disable_kvm' triggered this 
build
Build Source Stamp: [branch master] HEAD
Blamelist: 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



buildbot failure in qemu-kvm on disable_kvm_i386_out_of_tree

2010-08-03 Thread qemu-kvm
The Buildbot has detected a new failure of disable_kvm_i386_out_of_tree on 
qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/disable_kvm_i386_out_of_tree/builds/446

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_2

Build Reason: The Nightly scheduler named 'nightly_disable_kvm' triggered this 
build
Build Source Stamp: [branch master] HEAD
Blamelist: 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



RE: [RFC PATCH v8 00/16] Provide a zero-copy method on KVM virtio-net.

2010-08-03 Thread Dong, Eddie
Arnd Bergmann wrote:
> On Friday 30 July 2010 17:51:52 Shirley Ma wrote:
>> On Fri, 2010-07-30 at 16:53 +0800, Xin, Xiaohui wrote:
>>>> Since vhost-net already supports macvtap/tun backends, do you think
>>>> whether it's better to implement zero copy in macvtap/tun than
>>>> inducing a new media passthrough device here?
>>>
>>> I'm not sure if there will be more duplicated code in the kernel.
>>
>> I think it should be less duplicated code in the kernel if we use
>> macvtap to support what the media passthrough driver does here. Since
>> macvtap already supports the virtio_net header and offloading, the
>> only missing func is zero copy. Also QEMU supports macvtap, we just
>> need to add a zero copy flag in the option.
>
> Yes, I fully agree and that was one of the intended directions for
> macvtap to start with. Thank you so much for following up on that,
> I've long been planning to work on macvtap zero-copy myself but it's
> now lower on my priorities, so it's good to hear that you made
> progress on it, even if there are still performance issues.

But zero-copy is a generic Linux feature that other VMMs can use as well,
if their BE service drivers want to incorporate it.  If we can make the mp
device VMM-agnostic (it may not be yet in the current patch), that will
help Linux more.


Thx, Eddie


[PATCH 1/2] x86: Allow accessing IDT via emulator ops

2010-08-03 Thread Mohammed Gamal
The patch adds a new member get_idt() to x86_emulate_ops.
It also adds a function to fetch the IDT for use by the emulator.

This is needed for real-mode interrupt injection and the emulation of int
instructions.

Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
---
 arch/x86/include/asm/kvm_emulate.h |1 +
 arch/x86/kvm/x86.c |6 ++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index cbebf1d..f22e5da 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -139,6 +139,7 @@ struct x86_emulate_ops {
void (*set_segment_selector)(u16 sel, int seg, struct kvm_vcpu *vcpu);
	unsigned long (*get_cached_segment_base)(int seg, struct kvm_vcpu *vcpu);
void (*get_gdt)(struct desc_ptr *dt, struct kvm_vcpu *vcpu);
+   void (*get_idt)(struct desc_ptr *dt, struct kvm_vcpu *vcpu);
ulong (*get_cr)(int cr, struct kvm_vcpu *vcpu);
int (*set_cr)(int cr, ulong val, struct kvm_vcpu *vcpu);
int (*cpl)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e7e3b50..416aa0e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3790,6 +3790,11 @@ static void emulator_get_gdt(struct desc_ptr *dt, struct kvm_vcpu *vcpu)
 	kvm_x86_ops->get_gdt(vcpu, dt);
 }
 
+static void emulator_get_idt(struct desc_ptr *dt, struct kvm_vcpu *vcpu)
+{
+	kvm_x86_ops->get_idt(vcpu, dt);
+}
+
 static unsigned long emulator_get_cached_segment_base(int seg,
  struct kvm_vcpu *vcpu)
 {
@@ -3883,6 +3888,7 @@ static struct x86_emulate_ops emulate_ops = {
.set_segment_selector = emulator_set_segment_selector,
.get_cached_segment_base = emulator_get_cached_segment_base,
.get_gdt = emulator_get_gdt,
+   .get_idt = emulator_get_idt,
.get_cr  = emulator_get_cr,
.set_cr  = emulator_set_cr,
.cpl = emulator_get_cpl,
-- 
1.7.0.4



[PATCH v6 1/3] export __get_user_pages_fast() function

2010-08-03 Thread Xiao Guangrong
This function is used by KVM to pin process's page in the atomic context.

Define the 'weak' function to avoid other architecture not support it

Acked-by: Nick Piggin npig...@suse.de
Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com
---
 mm/util.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/mm/util.c b/mm/util.c
index f5712e8..4f0d32b 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -250,6 +250,19 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
 }
 #endif
 
+/*
+ * Like get_user_pages_fast() except it's IRQ-safe in that it won't fall
+ * back to the regular GUP.
+ * If the architecture does not support this function, simply return with
+ * no page pinned.
+ */
+int __attribute__((weak)) __get_user_pages_fast(unsigned long start,
+				int nr_pages, int write, struct page **pages)
+{
+   return 0;
+}
+EXPORT_SYMBOL_GPL(__get_user_pages_fast);
+
 /**
  * get_user_pages_fast() - pin user pages in memory
  * @start: starting user address
-- 
1.6.1.2



  1   2   >