Re: [kvm-devel] kvm-45 problems

2007-10-12 Thread Avi Kivity
Zhao, Yunfeng wrote:
 Hi, Avi
 With the latest kvm commits, the host soft lockup caused by SMP Linux
 guests can no longer be reproduced on my machine.
 Have you fixed it?

   

Not intentionally...

Maybe the rmap fix is somehow responsible.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now  http://get.splunk.com/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] kernel device reset support

2007-10-12 Thread Avi Kivity
Dong, Eddie wrote:
 Current VP wake up logic thru INIT/SIPI doesn't support this when
 irqchip in kernel. 


   
 Doesn't this code imply that waiting for SIPI is supported?
 

 It is supported to wake up a VCPU in the kernel, but we can't wake up the
 VCPU at user level since irqchip_in_kernel is TRUE here. vcpu->mp_state
 isn't exported to user level.

   

We never sleep in user level if irqchip_in_kernel().  So the thread will
eventually go back to kernel mode.


 You can put a goto to the top of the loop to redo the mmu reload.  In
 any case you need to do that because you don't want to execute
 the reset
 code with interrupts and preemption disabled.
 

 A goto across functions? It is too aggressive and bad code style IMO.
 The vcpu->request check is in __vcpu_run, while entering the blocked
 state is in its parent function kvm_vcpu_ioctl_run.

   

goto the label 'again' in __vcpu_run(), which has the call to
kvm_mmu_reload().
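
[Editor's sketch, not from the thread] The "goto again" suggestion can be pictured with a small user-space model. Everything here (the toy_ names, the REQ_RESET bit, the counters) is hypothetical and only illustrates the control flow: a pending request is handled, then execution jumps back to the label so the mmu reload is redone before guest entry. It is not the actual __vcpu_run() code.

```c
#include <assert.h>

/* Hypothetical request bit; the real request flags are not shown in the thread. */
enum { TOY_REQ_RESET = 1u << 0 };

struct toy_vcpu {
	unsigned requests;	/* pending request bits */
	int mmu_reloads;	/* how often the (toy) mmu reload ran */
	int resets;		/* how often the (toy) reset ran */
	int guest_entries;	/* how often we "entered the guest" */
};

static void toy_mmu_reload(struct toy_vcpu *v) { v->mmu_reloads++; }
static void toy_reset(struct toy_vcpu *v) { v->resets++; }

/* Model of a run loop with an 'again' label at the top. */
static int toy_vcpu_run(struct toy_vcpu *v)
{
	assert(v != 0);
again:
	toy_mmu_reload(v);

	/* In kernel code, interrupts and preemption would be disabled about here. */
	if (v->requests & TOY_REQ_RESET) {
		v->requests &= ~TOY_REQ_RESET;
		/* Kernel code would re-enable them before running the reset. */
		toy_reset(v);
		goto again;	/* redo the reload before entering the guest */
	}

	v->guest_entries++;
	return 0;
}
```

With a reset pending on entry, the model reloads twice, resets once, and enters the guest once.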

 But if you want, we can return a special value,
 say REQUEST_INTERNAL_LOOP, to
 kvm_vcpu_ioctl_run and let kvm_vcpu_ioctl_run use special logic
 to do the goto within the function if it sees the special return value
 REQUEST_INTERNAL_LOOP. But is it cleaner?

 Also we will add more kernel-to-user EXIT reasons, such as a RESET request
 when the kernel detects a guest triple fault etc.

   

There is already a triple fault exit code.

 The VCPU may be executing in kernel still, which may modify kernel
 device state. E.g. A VCPU may be doing PIO emulating.


   
 In that case we will wait when taking kvm->lock.
 

 The lock doesn't help. A lock can only prevent two or more modifications
 at the same time. But what we care about is that no other VCPU can
 modify device state after the BSP does the device reset.
 Those are different semantics.

 Maybe you are still arguing that it is the AP that does the RESET ops.
 Let us go to the next discussion first.

   

We first halt all vcpus, then take the lock.  So:

- other processors won't start after the device reset because they are
halted
- we won't do the reset concurrently with other processors because of
the lock
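
[Editor's sketch, not from the thread] The ordering described above (halt every vcpu first, then take the lock and reset) can be modelled in a few lines. This is a single-threaded toy: the lock is a plain flag standing in for kvm->lock, and all toy_ names, the vcpu count, and the choice to let vcpu 0 (the BSP) run again after reset are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_VCPUS 4

struct toy_vm {
	bool lock_held;			/* stand-in for kvm->lock */
	bool halted[TOY_NR_VCPUS];	/* halted vcpus do not touch devices */
	int device_state;		/* stand-in for in-kernel device state */
};

static void toy_lock(struct toy_vm *vm)
{
	assert(!vm->lock_held);
	vm->lock_held = true;
}

static void toy_unlock(struct toy_vm *vm)
{
	assert(vm->lock_held);
	vm->lock_held = false;
}

/* A vcpu emulating e.g. PIO; returns true if the write was performed. */
static bool toy_vcpu_write_device(struct toy_vm *vm, int id, int val)
{
	if (vm->halted[id])
		return false;	/* halted: cannot modify state after reset */
	toy_lock(vm);
	vm->device_state = val;
	toy_unlock(vm);
	return true;
}

/* Reset: halt all vcpus first, then reset devices under the lock. */
static void toy_vm_reset(struct toy_vm *vm)
{
	for (int i = 0; i < TOY_NR_VCPUS; i++)
		vm->halted[i] = true;

	toy_lock(vm);
	vm->device_state = 0;
	toy_unlock(vm);

	vm->halted[0] = false;	/* assumption: only the BSP runs after reset */
}
```

The halt covers the "nobody modifies state after the reset" requirement, and the lock covers the "nobody modifies state during the reset" requirement.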

 If the BSP resets the kernel devices before the VCPU modifies the
 device state, we are in trouble.

 No, VCPU0 (BSP) is the current VCPU (though you don't have the current
 vcpu parameter explicitly), as mentioned in my previous mail and
 as a prerequisite of the user level change. Please refer to my answer
 above in this mail. 


   
 We can't rely on user space not to cause host kernel corruption.
 

 ???

 Even an AP trigger RESET, it just sets a reset_request flag in user
 level.
 It is another VCPU who will execute RESET operation.

 It seems the argument is about who should do the RESET operation,
 say RST_CPU: the BSP only, or an AP too. For me, since after RESET
 only the BSP can execute, and the thread executing
 qemu_system_reset will continue executing
 (after the kernel RESET) per the current Qemu code, what we can do is:

   1: RST_CPU = BSP. Then the BSP does qemu_system_reset, or
   2: RST_CPU = AP, say RAP, does qemu_system_reset; user level then
 needs to block RAP after qemu_system_reset and wake up the BSP to take over.
   A point here: we can't block RAP in case 2 at kernel RESET time, since
 the kernel RESET may not be the last step of qemu_system_reset. It may go
 to the kernel again.

   If we go with #1, it is just a 1 line change as in my previous mail.
   If we go with #2, we have to add a new ABI for the AP to enter the
 kernel and wait for the INIT/SIPI/SIPI state, otherwise a normal
 INIT/SIPI/SIPI couldn't wake it up.

   I see much complication in #2, while #1 has the same functionality but
 is simple.

   

My view is:
- vcpu threads never sleep in userspace.  they will always eventually
end up in the kernel so we can stop or restart them there
- reset is a platform API so it can't be dependent on which vcpu thread
executes it (if any; it may be executed from an unrelated thread,
remember we plan to separate the qemu signal handling code into a
separate thread)
- we already have a way to send messages to other vcpus

So it seems to me everything is in place to make it fairly simple.

I'll try writing a patch that does what I mean and post it.  Either I'll
convince you that in-kernel is simpler, or I'll convince myself that it
is harder.


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




Re: [kvm-devel] RFC/patch portability: split kvm_vcpu_ioctl v1

2007-10-12 Thread Avi Kivity
Carsten Otte wrote:
 Avi Kivity wrote:
 Applied, thanks.  I renamed kvm_vcpu_load() and kvm_vcpu_put() back
 to vcpu_load() and vcpu_put() in order to keep the patch small and
 simple, and because I'm emotionally attached to the original names.
 Oh, I think I had a very good reason for renaming it: it's no longer
 static, and thus part of the kernel's global namespace in case kvm is
 built-in. As far as I know, modules are expected to prefix any symbol
 they use with their module name.
 I am sorry for the emotional part of it, I tend to stick to old names
 too once I got used to them.
 In case you decide you want kvm_vcpu_load/put again, let me know so
 that I can supply a patch on top of git that renames it.

I agree 100%, I'm just using the 'keep the patch dead simple' excuse to
delay the change.  We can have a 'add kvm_ prefix' patch round later. 
Let's complete the separation first.

There are some bigger offenders too, like set_crX(), which don't even
start with the magic V and are exported to modules.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




Re: [kvm-devel] [RFC] Expose infrastructure for unpinning guest memory

2007-10-12 Thread Avi Kivity
Anthony Liguori wrote:
 Now that we have userspace memory allocation, I wanted to play with 
 ballooning.
 The idea is that when a guest balloons down, we simply unpin the underlying
 physical memory and the host kernel may or may not swap it.  To reclaim
 ballooned memory, the guest can just start using it and we'll pin it on 
 demand.

 The following patch is a stab at providing the right infrastructure for 
 pinning
 and automatic repinning.  I don't have a lot of comfort in the MMU code so I
 thought I'd get some feedback before going much further.

 gpa_to_hpa is a little awkward to hook, but it seems like the right place in 
 the
 code.  I'm most uncertain about the SMP safety of the unpinning.  Presumably,
 I have to hold the kvm lock around the mmu_unshadow and page_cache release to
 ensure that another VCPU doesn't fault the page back in after mmu_unshadow?

   

Once we have true swapping capabilities (which imply ability for the
kernel to remove a page from the shadow page tables) you can unpin by
calling munmap() or madvise(MADV_REMOVE) on the pages to be unpinned.

Other than that the approach seems right.
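
[Editor's sketch, not from the thread] For readers unfamiliar with the call mentioned above, this is roughly what releasing a range with madvise(MADV_REMOVE) looks like from user space on Linux. The helper names are hypothetical, the mapping is shared anonymous memory (MADV_REMOVE needs a shared, shmem-style backing), and nothing here involves KVM or shadow page tables; whether the call succeeds depends on the kernel and the backing store.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of shared anonymous memory; returns NULL on failure. */
static unsigned char *toy_guest_ram_alloc(size_t len)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/*
 * Give the backing pages of [ram + off, ram + off + len) back to the host.
 * off and len must be page aligned. Returns 0 on success, -1 on error
 * (errno is set by madvise).
 */
static int toy_balloon_release(unsigned char *ram, size_t off, size_t len)
{
	assert(ram != NULL);
	return madvise(ram + off, len, MADV_REMOVE);
}
```

After a successful release, touching the range again simply faults in fresh zero-filled pages, which matches the "guest can just start using it" idea in the quoted proposal.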

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




Re: [kvm-devel] [Patch] [0/3] Patches to support new architectures.

2007-10-12 Thread Zhang, Xiantao
Avi Kivity wrote:
 Zhang, Xiantao wrote:
 
 x86 will continue to use kvm_x86_ops for that purposes.  But other
 archs should not. 
 
 x86 will use both mechanisms: first, linkage will select the x86
 function, and then kvm_x86_ops will be used to select the
 implementation dependent code.  The two levels are very different as
 kvm_x86_ops is very low level and x86 specific.
 
 Hi Avi,
  Maybe linkage is a better choice. But if we need to maintain two
 different implementations for different archs, it may introduce
 unnecessary effort. In addition, I can't figure out any
 disadvantages with function pointers; moreover, they make the source
 uniform for all architectures, though that is not strictly necessary.
 
 
 Linkage is more efficient (though I don't think we'll be able to
 measure the difference) and is also the traditional way of doing
 things in Linux. 
 
 I don't see why it causes extra effort.  Can you explain?

I originally meant that we would have to wrap all functions related to
kvm_x86_ops. But it seems this doesn't introduce
extra maintenance effort if other architectures implement these functions
directly.  Good method!

Thanks  
Xiantao



[kvm-devel] [ kvm-Bugs-1812043 ] Cannot boot 32bit smp RHEL5.1 guest

2007-10-12 Thread SourceForge.net
Bugs item #1812043, was opened at 2007-10-12 14:41
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1812043&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: yunfeng (yunfeng)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cannot boot 32bit smp RHEL5.1 guest

Initial Comment:
A 32bit smp RHEL5.1 guest cannot boot on either a 32bit host or a 64bit host.
With -no-kvm-irqchip it also fails.
But if it is booted as UP or with -no-kvm, there is no problem.




Re: [kvm-devel] [Patch] [0/3] Patches to support new architectures.

2007-10-12 Thread Avi Kivity
Zhang, Xiantao wrote:

 x86 will continue to use kvm_x86_ops for that purposes.  But other
 archs should not.

 x86 will use both mechanisms: first, linkage will select the x86
 function, and then kvm_x86_ops will be used to select the
 implementation dependent code.  The two levels are very different as
 kvm_x86_ops is very low level and x86 specific.
 
 Hi Avi,
  Maybe linkage is a better choice. But if we need to maintain two
 different implementations for different archs, it may introduce
 unnecessary effort.
 In addition, I can't figure out any disadvantages with function
 pointers; moreover, they make the source uniform for all architectures,
 though that is not strictly necessary. 
   

Linkage is more efficient (though I don't think we'll be able to measure
the difference) and is also the traditional way of doing things in Linux.

I don't see why it causes extra effort.  Can you explain?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




[kvm-devel] [ kvm-Bugs-1812050 ] segfault while booting 64bit linux with 4GB mem

2007-10-12 Thread SourceForge.net
Bugs item #1812050, was opened at 2007-10-12 15:00
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1812050&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: yunfeng (yunfeng)
Assigned to: Nobody/Anonymous (nobody)
Summary: segfault while booting 64bit linux with 4GB mem

Initial Comment:
A segmentation fault happens while booting a 64bit linux guest with 4GB mem.
Here is the error message:
qemu-system-x86[8052]: segfault at 2b9d3be19000 rip 003d35876d40 rsp 
7fff6d586ce8 error 4


The guest is installed with RHEL4U3, the kernel is 2.6.9-34.EL
The host machine is a harwitch/paxville (16LPs), mem is 8GB.

Here is the command line:
qemu-system-x86_64 . -m 4096 -net nic,macaddr=00:16:3e:17:fa:66,model=rtl8139 
-net tap,script=/etc/kvm/qemu-ifup -hda 
/share/xvs/var/tmp-img_CPL_MEM_05_1192172400_1



Re: [kvm-devel] kernel device reset support

2007-10-12 Thread Dong, Eddie

 I'll try writing a patch that does what I mean and post it.
 Either I'll
 convince you that in-kernel is simpler, or I'll convince myself that
 it is harder. 
 
OK, let us see which one is simple.

BTW, you have swapped to N+1 SMP model in this discussion which
is not there yet. And this is the difference.

For me the N+1 model means the device emulation will go to the N+1 thread
while all CPU execution will still be in its own thread. It looks like
you are proposing something different.

Eddie



[kvm-devel] [ kvm-Bugs-1812072 ] Cannot boot 64bit Vista

2007-10-12 Thread SourceForge.net
Bugs item #1812072, was opened at 2007-10-12 15:36
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1812072&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: yunfeng (yunfeng)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cannot boot 64bit Vista

Initial Comment:
I cannot boot 64bit UP Vista.
It turns into a black window after booting for a while.
The issue doesn't exist if -no-kvm-irqchip is added.

Here is the command:
 qemu-system-x86_64 . -m 1024 -net nic,macaddr=00:16:3e:3f:02:d6,model=rtl8139 
-net tap,script=/etc/kvm/qemu-ifup -hda 
/share/xvs/var/tmp-img_gbp17_1192173849_1



Re: [kvm-devel] [ANNOUNCE] kvm-46 release

2007-10-12 Thread Gerd Hoffmann
Avi Kivity wrote:
 We've now switched to allocating guest memory in userspace rather than
 in the kernel.

Hmm, a quick glimpse over kvmctl.h doesn't show an obvious way how to
use that.  If I want to back vm memory with a file mapping, how can I do
that?

cheers,
  Gerd




Re: [kvm-devel] RFC/patch portability: split kvm_vcpu_ioctl v1

2007-10-12 Thread Carsten Otte
Avi Kivity wrote:
 I agree 100%, I'm just using the keep the patch dead simple excuse to
 delay the change.  We can have a 'add kvm_ prefix' patch round later. 
 Let's complete the separation first.
Okay, fine with me.

 There are some bigger offenders too, like set_crX(), which don't even
 start with the magic V and are exported to modules.
Boo.
Stop name space pollution, save the planet!
Name space pollution causes global warmth!

We should clean that up one day.



[kvm-devel] Test for KVM, kernel 33aaf..., userspace, 803145...

2007-10-12 Thread Zhao, Yunfeng
Hi,
Here is the summary of current KVM quality, kernel
33aafecf3f106ba6aa8847dfdae033a73e5d1b50, userspace
80314525755d89861c6709c5ba6104fbe34a64ea.
4 old issues have been fixed this week, and 3 new issues have been
found.
In total, 8 major issues still exist.

Four fixed old issues:
1. 64bit linux with 2.6.9 kernel cannot get ip
https://sourceforge.net/tracker/index.php?func=detail&aid=1802580&group_id=180599&atid=893831
2. soft lockup while running SMP linux guest with 4vpus
https://sourceforge.net/tracker/index.php?func=detail&aid=1804597&group_id=180599&atid=893831
3. Creating multiple guests may cause host to hang
https://sourceforge.net/tracker/index.php?func=detail&aid=1741312&group_id=180599&atid=893831
4. Fails to install x86 Vista on 64bit host
https://sourceforge.net/tracker/index.php?func=detail&aid=1805007&group_id=180599&atid=893831

Issue List:
One Network issue:
1. 64bit guest with 2.6.16 kernel crashes when start up nic
-no-kvm-irqchip has the same issue.
https://sourceforge.net/tracker/index.php?func=detail&aid=1804990&group_id=180599&atid=893831

Four Windows issues:
2. Cannot boot 64bit Vista
-no-kvm-irqchip does not have the issue.
https://sourceforge.net/tracker/index.php?func=detail&aid=1812072&group_id=180599&atid=893831
3. windows xp with acpi hal fails to reboot
-no-kvm-irqchip has the same issue.
https://sourceforge.net/tracker/index.php?func=detail&aid=1805016&group_id=180599&atid=893831
4. 64bit xpsp2 installer crashed when rebooting
With -no-kvm-irqchip, a blue screen happens on reboot.
https://sourceforge.net/tracker/index.php?func=detail&aid=1804990&group_id=180599&atid=893831
5. xpsp2 with 2vpus may fail to boot
-no-kvm-irqchip has the same issue.
https://sourceforge.net/tracker/index.php?func=detail&aid=1805017&group_id=180599&atid=893831

Three Linux guest issues:
6. segfault while booting 64bit linux with 4GB mem
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1812050&group_id=180599
7. Cannot boot 32bit smp RHEL5.1 guest
Only -no-kvm can boot it.
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1812043&group_id=180599
8. Some ltp cases fail on KVM guests
https://sourceforge.net/tracker/index.php?func=detail&aid=1741316&group_id=180599&atid=893831

thanks
Yunfeng



Re: [kvm-devel] [Patch] [0/3] Patches to support new architectures.

2007-10-12 Thread Carsten Otte
Zhang, Xiantao wrote:
 I originally meant that we would have to wrap all functions related to
 kvm_x86_ops. But it seems this doesn't introduce
 extra maintenance effort if other architectures implement these functions
 directly.  Good method!
That was my idea at first too, until Hollis has beaten me up on this. 
Most archs don't have the split like vmx/svm do, and the compiler 
automagically inlines the wrapper for x86 which gets optimized away 
completely.
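
[Editor's sketch, not from the thread] The two levels discussed in this thread can be illustrated with a toy: the arch entry point is an ordinary function chosen at link time, and only the x86-style implementation forwards through an ops table to pick between two backends. All toy_ names and return values are made up for the illustration; the real kvm_x86_ops layout is not shown in the thread.

```c
#include <assert.h>

struct toy_vcpu {
	int id;
};

/* x86-like second level: a table of implementation-specific callbacks. */
struct toy_x86_ops {
	int (*vcpu_run)(struct toy_vcpu *vcpu);
};

static int toy_vmx_run(struct toy_vcpu *vcpu) { return 100 + vcpu->id; }
static int toy_svm_run(struct toy_vcpu *vcpu) { return 200 + vcpu->id; }

static const struct toy_x86_ops toy_vmx_ops = { .vcpu_run = toy_vmx_run };
static const struct toy_x86_ops toy_svm_ops = { .vcpu_run = toy_svm_run };

/* Selected once at module load time in this model. */
static const struct toy_x86_ops *toy_x86_ops = &toy_vmx_ops;

/*
 * First level: the arch entry point. Generic code calls this by name and
 * the linker binds it to the arch's definition. On x86 it is a thin
 * wrapper that a compiler can inline into the caller; an arch without a
 * vmx/svm-style split would implement the body directly instead.
 */
static inline int toy_arch_vcpu_run(struct toy_vcpu *vcpu)
{
	assert(toy_x86_ops && toy_x86_ops->vcpu_run);
	return toy_x86_ops->vcpu_run(vcpu);
}
```

Generic code only ever sees toy_arch_vcpu_run(); the function-pointer table stays an x86 detail.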



Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Christian Ehrhardt
Zhang, Xiantao wrote:
 --- /dev/null
 +++ b/drivers/kvm/kvm_arch.h
[...]
 +struct kvm_arch_vcpu{
 + 
 + u64 host_tsc;
 + 
 + unsigned long regs[NR_VCPU_REGS]; /* for rsp: vcpu_load_rsp_rip() */
 + unsigned long rip;  /* needs vcpu_load_rsp_rip() */
 +
 + unsigned long cr0;
 + unsigned long cr2;
 + unsigned long cr3;
 + unsigned long cr4;
 + unsigned long cr8;
 + u64 pdptrs[4]; /* pae */
 + u64 shadow_efer;
 + u64 apic_base;
 + struct kvm_lapic *apic;/* kernel irqchip context */
 +
 + u64 ia32_misc_enable_msr;
 +
 + 
 + struct i387_fxsave_struct host_fx_image;
 + struct i387_fxsave_struct guest_fx_image;
 + int fpu_active;
 + int guest_fpu_loaded;
 +
 + gva_t mmio_fault_cr2;
 + 
 + struct {
 + int active;
 + u8 save_iopl;
 + struct kvm_save_segment {
 + u16 selector;
 + unsigned long base;
 + u32 limit;
 + u32 ar;
 + } tr, es, ds, fs, gs;
 + } rmode;
[...]

As far as I can see without applying it, that split is ok for powerpc. I had a 
similar approach in my local patch queue too.
Minor differences in which elements of the structs are arch dependent or not 
can be changed in small patches later ;-)

But the file name kvm_arch.h confuses me a bit - I assume you had the coming 
asm split in mind, where every architecture can define its own asm/kvm_arch.h.
Since we don't have that asm structure for kvm yet, the changes you made to 
kvm_arch.h may be better located in x86.h for now.
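
[Editor's sketch, not from the thread] The split under review can be pictured as a generic vcpu structure that embeds an arch-specific one. How the actual patch links the two structures is not visible in the quoted excerpt, so the embedding by value, the member name "arch", and the reduced field lists below are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Would live in a per-arch header; a few x86 fields from the quoted patch. */
struct toy_kvm_arch_vcpu {
	unsigned long cr0;
	unsigned long cr3;
	uint64_t shadow_efer;
};

/* Generic part, shared by all architectures. */
struct toy_kvm_vcpu {
	int vcpu_id;
	int cpu;
	struct toy_kvm_arch_vcpu arch;	/* assumption: embedded by value */
};

/* Generic init touches generic fields and only zeroes the arch part. */
static void toy_vcpu_init(struct toy_kvm_vcpu *vcpu, int id)
{
	assert(vcpu != NULL);
	memset(vcpu, 0, sizeof(*vcpu));
	vcpu->vcpu_id = id;
	vcpu->cpu = -1;		/* not loaded on any host cpu yet */
}
```

x86 code would then reach its state as vcpu->arch.cr3, while generic code never names an x86 register.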

-- 

Grüsse / regards, 
Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
+49 7031/16-3385
[EMAIL PROTECTED]
[EMAIL PROTECTED]

IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Johann Weihen 
Geschäftsführung: Herbert Kircher 
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Zhang, Xiantao
Christian Ehrhardt wrote:
 Zhang, Xiantao wrote:
 --- /dev/null
 +++ b/drivers/kvm/kvm_arch.h
 [...]
 +struct kvm_arch_vcpu{
 +
 +u64 host_tsc;
 +
 +unsigned long regs[NR_VCPU_REGS]; /* for rsp: vcpu_load_rsp_rip() */
 +unsigned long rip;  /* needs vcpu_load_rsp_rip() */
 +
 +unsigned long cr0;
 +unsigned long cr2;
 +unsigned long cr3;
 +unsigned long cr4;
 +unsigned long cr8;
 +u64 pdptrs[4]; /* pae */
 +u64 shadow_efer;
 +u64 apic_base;
 +struct kvm_lapic *apic;/* kernel irqchip context */
 +
 +u64 ia32_misc_enable_msr;
 +
 +
 +struct i387_fxsave_struct host_fx_image;
 +struct i387_fxsave_struct guest_fx_image;
 +int fpu_active;
 +int guest_fpu_loaded;
 +
 +gva_t mmio_fault_cr2;
 +
 +struct {
 +int active;
 +u8 save_iopl;
 +struct kvm_save_segment {
 +u16 selector;
 +unsigned long base;
 +u32 limit;
 +u32 ar;
 +} tr, es, ds, fs, gs;
 +} rmode;
 [...]
 
 As far as I can see without applying it, that split is ok for
 powerpc. I had a similar approach in my local patch queue too. 
 Minor differences in which elements of the structs are arch dependent
 or not can be changed in small patches later ;-) 

 But the file kvm_arch.h name confuses me a bit - I assume you had the
 coming asm split in mind where every architecture can define it's
 asm/kvm_arch.h. Since we don't have that asm structure for kvm yet,
 the changes you made to kvm_arch.h may be better located at the x86.h
 atm.   
According to our previous discussion, we proposed a source layout which
contains an include directory under drivers/kvm/ to hold header files
for all archs, and kvm_arch.h will finally go into
drivers/kvm/include/kvm-x86/ (linked as kvm when compiling). So every
architecture can define its own kvm_arch.h, and the build
will choose it per ARCH at compile time. But for now, we can just put
it here until another real new arch is in.  Then we can remove x86.h,
since it is not so common for all archs.  :)  
BTW, header files should be managed with a uniform method, because
possible archs, such as IA64, may need many of them. 
Thanks 
Xiantao




Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Carsten Otte
Zhang, Xiantao wrote:
 Thank you, I will resend it :)
I do greatly appreciate it. We'll do this together, please do also 
pick on my patches whenever you see something that doesn't fit what 
you need for ia64.

thanks,
Carsten



[kvm-devel] [PATCH] Add some \n in ioapic_debug()

2007-10-12 Thread Laurent Vivier
Add new-line at end of debug strings.

Signed-off-by: Laurent Vivier [EMAIL PROTECTED]
---
 drivers/kvm/ioapic.c |   25 ++++++++++++++-----------
 1 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/kvm/ioapic.c b/drivers/kvm/ioapic.c
index 3b69541..1a5e59a 100644
--- a/drivers/kvm/ioapic.c
+++ b/drivers/kvm/ioapic.c
@@ -40,8 +40,11 @@
 #include <asm/apicdef.h>
 #include <asm/io_apic.h>
 #include "irq.h"
-/* #define ioapic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg) */
+#if 0
+#define ioapic_debug(fmt,arg...) printk(KERN_WARNING fmt,##arg)
+#else
 #define ioapic_debug(fmt, arg...)
+#endif
 static void ioapic_deliver(struct kvm_ioapic *vioapic, int irq);
 
 static unsigned long ioapic_read_indirect(struct kvm_ioapic *ioapic,
@@ -113,7 +116,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
 	default:
 		index = (ioapic->ioregsel - 0x10) >> 1;
 
-		ioapic_debug("change redir index %x val %x", index, val);
+		ioapic_debug("change redir index %x val %x\n", index, val);
 		if (index >= IOAPIC_NUM_PINS)
 			return;
 		if (ioapic->ioregsel & 1) {
@@ -134,7 +137,7 @@ static void ioapic_inj_irq(struct kvm_ioapic *ioapic,
 			   struct kvm_lapic *target,
 			   u8 vector, u8 trig_mode, u8 delivery_mode)
 {
-	ioapic_debug("irq %d trig %d deliv %d", vector, trig_mode,
+	ioapic_debug("irq %d trig %d deliv %d\n", vector, trig_mode,
 		     delivery_mode);
 
 	ASSERT((delivery_mode == dest_Fixed) ||
@@ -151,7 +154,7 @@ static u32 ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
 	struct kvm *kvm = ioapic->kvm;
 	struct kvm_vcpu *vcpu;
 
-	ioapic_debug("dest %d dest_mode %d", dest, dest_mode);
+	ioapic_debug("dest %d dest_mode %d\n", dest, dest_mode);
 
 	if (dest_mode == 0) {	/* Physical mode. */
 		if (dest == 0xFF) {	/* Broadcast. */
@@ -179,7 +182,7 @@ static u32 ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
 				    kvm_apic_match_logical_addr(vcpu->apic, dest))
 					mask |= 1 << vcpu->vcpu_id;
 			}
-	ioapic_debug("mask %x", mask);
+	ioapic_debug("mask %x\n", mask);
 	return mask;
 }
 
@@ -196,12 +199,12 @@ static void ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
 	int vcpu_id;
 
 	ioapic_debug("dest=%x dest_mode=%x delivery_mode=%x "
-		     "vector=%x trig_mode=%x",
+		     "vector=%x trig_mode=%x\n",
 		     dest, dest_mode, delivery_mode, vector, trig_mode);
 
 	deliver_bitmask = ioapic_get_delivery_bitmask(ioapic, dest, dest_mode);
 	if (!deliver_bitmask) {
-		ioapic_debug("no target on destination");
+		ioapic_debug("no target on destination\n");
 		return;
 	}
 
@@ -214,7 +217,7 @@ static void ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
 					       trig_mode, delivery_mode);
 		else
 			ioapic_debug("null round robin: "
-				     "mask=%x vector=%x delivery_mode=%x",
+				     "mask=%x vector=%x delivery_mode=%x\n",
 				     deliver_bitmask, vector, dest_LowestPrio);
 		break;
 	case dest_Fixed:
@@ -304,7 +307,7 @@ static void ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len,
 	struct kvm_ioapic *ioapic = (struct kvm_ioapic *)this->private;
 	u32 result;
 
-	ioapic_debug("addr %lx", (unsigned long)addr);
+	ioapic_debug("addr %lx\n", (unsigned long)addr);
 	ASSERT(!(addr & 0xf));	/* check alignment */
 
 	addr &= 0xff;
@@ -341,8 +344,8 @@ static void ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
 	struct kvm_ioapic *ioapic = (struct kvm_ioapic *)this->private;
 	u32 data;
 
-	ioapic_debug("ioapic_mmio_write addr=%lx len=%d val=%p\n",
-		     addr, len, val);
+	ioapic_debug("ioapic_mmio_write addr=%p len=%d val=%p\n",
+		     (void*)addr, len, val);
 	ASSERT(!(addr & 0xf));	/* check alignment */
 	if (len == 4 || len == 8)
 		data = *(u32 *) val;
-- 
1.5.2.4
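
[Editor's sketch, not part of the patch] The "#if 0 ... #else ... #endif" arrangement in the first hunk is a common way to keep a debug macro that compiles to nothing by default. A user-space analogue is shown below; the toy_ names, the TOY_IOAPIC_DEBUG switch, and the use of fprintf in place of printk are assumptions, and "##__VA_ARGS__" is a GNU extension accepted by gcc and clang.

```c
#include <assert.h>
#include <stdio.h>

/* Set to 1 (e.g. with -DTOY_IOAPIC_DEBUG=1) to enable the messages. */
#ifndef TOY_IOAPIC_DEBUG
#define TOY_IOAPIC_DEBUG 0
#endif

#if TOY_IOAPIC_DEBUG
#define toy_ioapic_debug(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__)
#else
#define toy_ioapic_debug(fmt, ...) do { } while (0)
#endif

/*
 * Toy delivery mask: dest 0xFF means broadcast to all vcpus, anything
 * else selects a single vcpu by number.
 */
static unsigned toy_delivery_mask(unsigned dest, unsigned nr_vcpus)
{
	unsigned mask;

	assert(nr_vcpus > 0 && nr_vcpus < 32);
	assert(dest == 0xFF || dest < nr_vcpus);

	mask = (dest == 0xFF) ? (1u << nr_vcpus) - 1 : 1u << dest;
	/* Each message ends in "\n" so consecutive messages stay on separate lines. */
	toy_ioapic_debug("mask %x\n", mask);
	return mask;
}
```

With the switch off, the macro expands to an empty statement and its arguments are never evaluated, so debug calls cost nothing in normal builds.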




Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Zhang, Xiantao
Carsten Otte wrote:
 Zhang, Xiantao wrote:
 Thank you, I will resend it :)
 I do greatly appreciate it. We'll do this together, please do also
 pick on my patches whenever you see something that doesn't fit what
 you need for ia64.
Sure:)
Xiantao

 thanks,
 Carsten



Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Zhang, Xiantao

  int vcpu_id;
  struct mutex mutex;
  int   cpu;
 -u64 host_tsc;
  struct kvm_run *run;
  int interrupt_window_open;
 I am not sure if this is the right thing for all archs. We have
 various forms of interrupts (I/O, external etc) which can all be
 masked separately. I think interrupt_window_open should go to arch.

Thank you, I will resend it :)

  int guest_mode;
  unsigned long requests;
  unsigned long irq_summary; /* bit vector: 1 per word in
 irq_pending */
 We don't have irq. This works completely differently for us, thus this
 needs to go to arch.
 
  DECLARE_BITMAP(irq_pending, KVM_NR_INTERRUPTS);
 Same here.
 
  #define VCPU_MP_STATE_RUNNABLE  0
  #define VCPU_MP_STATE_UNINITIALIZED 1
  #define VCPU_MP_STATE_INIT_RECEIVED 2
 @@ -339,7 +329,6 @@ struct kvm_vcpu {
  #define VCPU_MP_STATE_HALTED4
  int mp_state;
  int sipi_vector;
 This one is arch dependent and should go to arch.
 
 -u64 ia32_misc_enable_msr;
 
  struct kvm_mmu mmu;
 
 @@ -354,10 +343,6 @@ struct kvm_vcpu {
 
  struct kvm_guest_debug guest_debug;
 
 -struct i387_fxsave_struct host_fx_image;
 -struct i387_fxsave_struct guest_fx_image;
 -int fpu_active;
 -int guest_fpu_loaded;
 I think guest_fpu_loaded should be generic. Don't you want to use the
 lazy fpu restore with preempt notification too?
 
 
  int mmio_needed;
  int mmio_read_completed;
 This is arch dependent, we don't have CONFIG_MMIO.
 
 @@ -365,7 +350,6 @@ struct kvm_vcpu {
  int mmio_size;
  unsigned char mmio_data[8];
  gpa_t mmio_phys_addr;
 -gva_t mmio_fault_cr2;
  struct kvm_pio_request pio;
  void *pio_data;
 All above are arch dependent.
 
 diff --git a/drivers/kvm/kvm_arch.h b/drivers/kvm/kvm_arch.h
 new file mode 100644
 index 000..fe73d3d
 --- /dev/null
 +++ b/drivers/kvm/kvm_arch.h
 @@ -0,0 +1,65 @@
 +#ifndef __KVM_ARCH_H
 +#define __KVM_ARCH_H
 This should go to x86.h, no new header please.
 
 +struct kvm_arch_vcpu{
 +
 +u64 host_tsc;
 +
 +unsigned long regs[NR_VCPU_REGS]; /* for rsp:
 vcpu_load_rsp_rip() */
 +unsigned long rip;  /* needs vcpu_load_rsp_rip() */
 +
 +unsigned long cr0;
 +unsigned long cr2;
 +unsigned long cr3;
 +unsigned long cr4;
 +unsigned long cr8;
 +u64 pdptrs[4]; /* pae */
 +u64 shadow_efer;
 +u64 apic_base;
 +struct kvm_lapic *apic;/* kernel irqchip context */
 +
 +u64 ia32_misc_enable_msr;
 +
 +
 +struct i387_fxsave_struct host_fx_image;
 +struct i387_fxsave_struct guest_fx_image;
 +int fpu_active;
 +int guest_fpu_loaded;
 +
 +gva_t mmio_fault_cr2;
 +
 +struct {
 +int active;
 +u8 save_iopl;
 +struct kvm_save_segment {
 +u16 selector;
 +unsigned long base;
 +u32 limit;
 +u32 ar;
 +} tr, es, ds, fs, gs;
 +} rmode;
 +
 +int cpuid_nent;
 +struct kvm_cpuid_entry cpuid_entries[KVM_MAX_CPUID_ENTRIES];
 +
 +/* emulate context */
 +
 +struct x86_emulate_ctxt emulate_ctxt;
 +};
 +
 +#endif
 Very nice. The only thing that shouldn't be here is fpu_active as far
 as I can tell.

Since some archs don't need to care about the FPU, I put it under arch. If
most archs need it, maybe we can move it to the top level. Just a tradeoff. :)
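
The tradeoff can be pictured with a much smaller structure than the one in the patch. The following is only a sketch under stated assumptions: the field selection is trimmed down, and the helper vcpu_fpu_is_active() is a hypothetical name that does not exist in the patch. It shows generic code keeping the common fields while x86 code reaches its own state through the embedded arch member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down x86 part; the real patch carries many more fields. */
struct kvm_arch_vcpu {
    uint64_t host_tsc;
    unsigned long cr0;
    int fpu_active;     /* kept arch-specific here, as in the patch under review */
};

/* Generic part shared by all architectures (illustrative subset only). */
struct kvm_vcpu {
    int vcpu_id;
    int cpu;
    struct kvm_arch_vcpu arch;  /* arch-specific state embedded by value */
};

/* Hypothetical helper: arch code reaches its fields through the 'arch' member. */
static int vcpu_fpu_is_active(const struct kvm_vcpu *vcpu)
{
    assert(vcpu != NULL);
    return vcpu->arch.fpu_active != 0;
}
```

If fpu_active later turned out to be needed by most archs, moving it up would only change the access path from vcpu->arch.fpu_active to vcpu->fpu_active.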

 I like this split overall, per architecture vcpu data structures are
 an important step and clearly the right way to go.
 

 with kind regards,
 Carsten



Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Carsten Otte
Zhang, Xiantao wrote:
From 12457e0fb85ef32f1a1f808be294bebe8d22667c Mon Sep 17 00:00:00 2001
 From: Zhang xiantao [EMAIL PROTECTED]
 Date: Fri, 12 Oct 2007 13:29:30 +0800
 Subject: [PATCH] Split kvm_vcpu to support new archs. Define a new sub
 field
 kvm_arch_vcpu to hold arch-specific sections.
 
 I am not sure whether data fields related to the mmu should be put under
 kvm_arch_vcpu or not, because
 the IA64 side doesn't need them, and only needs the kvm module to allocate
 memory for guests.
We don't need them on s390 either, and neither does ppc. I think we should 
consider Avi's ingenious softmmu to be x86 specific. Therefore, those 
fields should go to the x86 part afaics.

 diff --git a/drivers/kvm/ioapic.c b/drivers/kvm/ioapic.c
 index 3b69541..b149c07 100644
 --- a/drivers/kvm/ioapic.c
 +++ b/drivers/kvm/ioapic.c
 @@ -156,7 +156,7 @@ static u32 ioapic_get_delivery_bitmask(struct
 kvm_ioapic *ioapic, u8 dest,
   if (dest_mode == 0) {   /* Physical mode. */
   if (dest == 0xFF) { /* Broadcast. */
   for (i = 0; i < KVM_MAX_VCPUS; ++i)
 - if (kvm->vcpus[i] &&
 kvm->vcpus[i]->apic)
 + if (kvm->vcpus[i] &&
 kvm->vcpus[i]->arch.apic)
   mask |= 1 << i;
   return mask;
   }
Your mail client wraps lines, thus the patch is not applicable when 
taken from an email. Try using mutt or evolution for sending patches. 
In evolution, select preformatted mode and paste the patch in.

 diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
 index 4a52d6e..eaa28c8 100644
 --- a/drivers/kvm/kvm.h
 +++ b/drivers/kvm/kvm.h
 @@ -307,31 +307,21 @@ struct kvm_io_device *kvm_io_bus_find_dev(struct
 kvm_io_bus *bus, gpa_t addr);
  void kvm_io_bus_register_dev(struct kvm_io_bus *bus,
struct kvm_io_device *dev);
 
 +
 +#include "kvm_arch.h"
 +
This should be x86.h for now, and later on be moved to 
include/asm-x86/to-be-named.h

  struct kvm_vcpu {
   struct kvm *kvm;
   struct preempt_notifier preempt_notifier;
   int vcpu_id;
   struct mutex mutex;
   int   cpu;
 - u64 host_tsc;
   struct kvm_run *run;
   int interrupt_window_open;
I am not sure if this is the right thing for all archs. We have 
various forms of interrupts (I/O, external etc) which can all be 
masked separately. I think interrupt_window_open should go to arch.

   int guest_mode;
   unsigned long requests;
   unsigned long irq_summary; /* bit vector: 1 per word in
 irq_pending */
We don't have irq. This works completely differently for us, thus this 
needs to go to arch.

   DECLARE_BITMAP(irq_pending, KVM_NR_INTERRUPTS);
Same here.

  #define VCPU_MP_STATE_RUNNABLE  0
  #define VCPU_MP_STATE_UNINITIALIZED 1
  #define VCPU_MP_STATE_INIT_RECEIVED 2
 @@ -339,7 +329,6 @@ struct kvm_vcpu {
  #define VCPU_MP_STATE_HALTED4
   int mp_state;
   int sipi_vector;
This one is arch dependent and should go to arch.

 - u64 ia32_misc_enable_msr;
 
   struct kvm_mmu mmu;
 
 @@ -354,10 +343,6 @@ struct kvm_vcpu {
 
   struct kvm_guest_debug guest_debug;
 
 - struct i387_fxsave_struct host_fx_image;
 - struct i387_fxsave_struct guest_fx_image;
 - int fpu_active;
 - int guest_fpu_loaded;
I think guest_fpu_loaded should be generic. Don't you want to use the 
lazy fpu restore with preempt notification too?

 
   int mmio_needed;
   int mmio_read_completed;
This is arch dependent, we don't have CONFIG_MMIO.

 @@ -365,7 +350,6 @@ struct kvm_vcpu {
   int mmio_size;
   unsigned char mmio_data[8];
   gpa_t mmio_phys_addr;
 - gva_t mmio_fault_cr2;
   struct kvm_pio_request pio;
   void *pio_data;
All above are arch dependent.

 diff --git a/drivers/kvm/kvm_arch.h b/drivers/kvm/kvm_arch.h
 new file mode 100644
 index 000..fe73d3d
 --- /dev/null
 +++ b/drivers/kvm/kvm_arch.h
 @@ -0,0 +1,65 @@
 +#ifndef __KVM_ARCH_H
 +#define __KVM_ARCH_H
This should go to x86.h, no new header please.

 +struct kvm_arch_vcpu{
 + 
 + u64 host_tsc;
 + 
 + unsigned long regs[NR_VCPU_REGS]; /* for rsp:
 vcpu_load_rsp_rip() */
 + unsigned long rip;  /* needs vcpu_load_rsp_rip() */
 +
 + unsigned long cr0;
 + unsigned long cr2;
 + unsigned long cr3;
 + unsigned long cr4;
 + unsigned long cr8;
 + u64 pdptrs[4]; /* pae */
 + u64 shadow_efer;
 + u64 apic_base;
 + struct kvm_lapic *apic;/* kernel irqchip context */
 +
 + u64 ia32_misc_enable_msr;
 +
 + 
 + struct i387_fxsave_struct host_fx_image;
 + struct i387_fxsave_struct guest_fx_image;
 + int fpu_active;
 + int guest_fpu_loaded;
 +
 + gva_t mmio_fault_cr2;
 + 
 + struct {
 + int active;
 + u8 save_iopl;
 + struct kvm_save_segment {
 + u16 selector;
 + unsigned long base;
 + u32 limit;
 +  

Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Carsten Otte
Zhang, Xiantao wrote:
 According to our previous discussion, we proposed a source layout which
 contains an include directory to hold header files for all archs
 under drivers/kvm/, and kvm_arch.h will finally go into
 drivers/kvm/include/kvm-x86/ (linked as kvm at compile time). 
Right. The thing is, I've started a new header for this purpose 
yesterday. And this should be in the _same_ header, no matter where 
it'll end up. It is the x86 specific header file, currently named 
drivers/kvm/x86.h, which needs to be renamed/moved in the future.

 So, every
 architecture can define its own kvm_arch.h, and the build
 will choose it per ARCH at compile time. But for now, we can just put
 it here until another real new arch is in.  Then, we can remove x86.h,
 since it is not common to all archs.  :)  
 BTW, header files should be managed in a uniform way, because
 possible archs, such as IA64, may need many of them. 
That's fine with me. But prior to that we'll need to split x86 so that 
it can be relocated in its arch directory different from the common 
kvm location. And until we're there, we use x86.h as a place to store 
x86 specific header content.

so long,
Carsten



[kvm-devel] RFC/patch portability: split kvm_vm_ioctl

2007-10-12 Thread Carsten Otte
This patch splits kvm_vm_ioctl into architecture independent parts, and
x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. 

Common ioctls for all architectures are:
KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG

I'd really like to see more commonalities, but all others did not fit
our needs. I would love to keep KVM_GET_DIRTY_LOG common, so that the
ingenious migration code does not need to care too much about different
architectures.

x86 specific ioctls are:
KVM_SET_MEMORY_REGION, KVM_SET_USER_MEMORY_REGION,
KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP,
KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP
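
The shape of the split can be sketched in a few lines. This is only an illustration under stated assumptions: the command numbers are made up, and vm_ioctl() and arch_vm_ioctl() are simplified stand-ins for kvm_vm_ioctl and kvm_arch_vm_ioctl that take no file pointer or argument. The common handler deals with the shared commands and hands everything it does not know to the architecture code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command numbers, for illustration only; the real values
 * come from the KVM ioctl definitions. */
enum { CMD_CREATE_VCPU = 1, CMD_GET_DIRTY_LOG = 2, CMD_CREATE_IRQCHIP = 3 };

/* Stand-in for the architecture-specific handler (x86.c in the patch). */
static long arch_vm_ioctl(unsigned int cmd)
{
    switch (cmd) {
    case CMD_CREATE_IRQCHIP:
        return 0;           /* handled by the architecture */
    default:
        return -EINVAL;     /* unknown to both common and arch code */
    }
}

/* Common entry point: handle the shared commands, defer everything else. */
static long vm_ioctl(unsigned int cmd)
{
    switch (cmd) {
    case CMD_CREATE_VCPU:
    case CMD_GET_DIRTY_LOG:
        return 0;           /* common to all architectures */
    default:
        return arch_vm_ioctl(cmd);
    }
}
```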

While the pic/apic related functions are obviously x86 specific, some
other ioctls seem to be common at a first glance.
KVM_SET_(USER)_MEMORY_REGION for example. We've got a totally different
address layout on s390: we cannot support multiple slots, and a user
memory range always equals the guest physical memory [guest_phys + vm
specific offset = host user address]. We neither have nor need dedicated
vmas for the guest memory, we just use what the memory management has in
stock. This is true because we reuse the page table for user and guest
mode.
Looks to me like the s390 might have a lot in common with a future AMD
nested page table implementation. If AMD choose to reuse the page table
too, we might share the same ioctl to set up guest addressing with them.

signed-off-by: Carsten Otte [EMAIL PROTECTED]
reviewed-by: Christian Borntraeger [EMAIL PROTECTED]
reviewed-by: Christian Ehrhardt [EMAIL PROTECTED]
---
Index: kvm/drivers/kvm/kvm.h
===
--- kvm.orig/drivers/kvm/kvm.h  2007-10-12 13:38:59.0 +0200
+++ kvm/drivers/kvm/kvm.h   2007-10-12 14:22:40.0 +0200
@@ -661,6 +661,9 @@
unsigned int ioctl, unsigned long arg);
 long kvm_arch_vcpu_ioctl(struct file *filp,
 unsigned int ioctl, unsigned long arg);
+long kvm_arch_vm_ioctl(struct file *filp,
+   unsigned int ioctl, unsigned long arg);
+void kvm_arch_destroy_vm(struct kvm *kvm);
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu);
 
Index: kvm/drivers/kvm/kvm_main.c
===
--- kvm.orig/drivers/kvm/kvm_main.c 2007-10-12 13:38:59.0 +0200
+++ kvm/drivers/kvm/kvm_main.c  2007-10-12 13:57:30.0 +0200
@@ -40,7 +40,6 @@
 #include <linux/anon_inodes.h>
 #include <linux/profile.h>
 #include <linux/kvm_para.h>
-#include <linux/pagemap.h>
 
 #include <asm/processor.h>
 #include <asm/msr.h>
@@ -319,61 +318,6 @@
return kvm;
 }
 
-static void kvm_free_userspace_physmem(struct kvm_memory_slot *free)
-{
-   int i;
-
-   for (i = 0; i < free->npages; ++i) {
-       if (free->phys_mem[i]) {
-           if (!PageReserved(free->phys_mem[i]))
-               SetPageDirty(free->phys_mem[i]);
-           page_cache_release(free->phys_mem[i]);
-       }
-   }
-}
-
-static void kvm_free_kernel_physmem(struct kvm_memory_slot *free)
-{
-   int i;
-
-   for (i = 0; i < free->npages; ++i)
-       if (free->phys_mem[i])
-           __free_page(free->phys_mem[i]);
-}
-
-/*
- * Free any memory in @free but not in @dont.
- */
-static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
-                 struct kvm_memory_slot *dont)
-{
-   if (!dont || free->phys_mem != dont->phys_mem)
-       if (free->phys_mem) {
-           if (free->user_alloc)
-               kvm_free_userspace_physmem(free);
-           else
-               kvm_free_kernel_physmem(free);
-           vfree(free->phys_mem);
-       }
-   if (!dont || free->rmap != dont->rmap)
-       vfree(free->rmap);
-
-   if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
-       vfree(free->dirty_bitmap);
-
-   free->phys_mem = NULL;
-   free->npages = 0;
-   free->dirty_bitmap = NULL;
-}
-
-static void kvm_free_physmem(struct kvm *kvm)
-{
-   int i;
-
-   for (i = 0; i < kvm->nmemslots; ++i)
-       kvm_free_physmem_slot(&kvm->memslots[i], NULL);
-}
-
 static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
 {
int i;
@@ -421,7 +365,7 @@
    kfree(kvm->vpic);
    kfree(kvm->vioapic);
kvm_free_vcpus(kvm);
-   kvm_free_physmem(kvm);
+   kvm_arch_destroy_vm(kvm);
kfree(kvm);
 }
 
@@ -686,183 +630,6 @@
 EXPORT_SYMBOL_GPL(fx_init);
 
 /*
- * Allocate some memory and give it an address in the guest physical address
- * space.
- *
- * Discontiguous memory is allowed, mostly for framebuffers.
- */
-static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
- struct
- kvm_userspace_memory_region *mem,
-  

Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl

2007-10-12 Thread Anthony Liguori
Carsten Otte wrote:
 This patch splits kvm_vm_ioctl into architecture independent parts, and
 x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. 

 Common ioctls for all architectures are:
 KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG

 I'd really like to see more commonalities, but all others did not fit
 our needs. I would love to keep KVM_GET_DIRTY_LOG common, so that the
 ingenious migration code does not need to care too much about different
 architectures.

 x86 specific ioctls are:
 KVM_SET_MEMORY_REGION, KVM_SET_USER_MEMORY_REGION,
 KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP,
 KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP

 While the pic/apic related functions are obviously x86 specific, some
 other ioctls seem to be common at a first glance.
 KVM_SET_(USER)_MEMORY_REGION for example. We've got a totally different
 address layout on s390: we cannot support multiple slots, and a user
 memory range always equals the guest physical memory [guest_phys + vm
 specific offset = host user address]. We neither have nor need dedicated
 vmas for the guest memory, we just use what the memory management has in
 stock. This is true because we reuse the page table for user and guest
 mode.
   

You still need to tell the kernel about vm specific offset right?  So 
doesn't KVM_SET_USER_MEMORY_REGION for you just become that?  There's 
nothing wrong with s390 not supporting multiple memory slots, but 
there's no reason the ioctl interface can't be the same.
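
To make the point concrete, here is a minimal sketch of what a single-slot description would have to carry for the s390 rule quoted above [guest_phys + vm specific offset = host user address]. The struct and the function gpa_to_hva() are hypothetical names for this example only; this is not the layout of struct kvm_userspace_memory_region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one guest memory range (illustration only). */
struct single_slot {
    uint64_t guest_phys_base;   /* first guest physical address covered */
    uint64_t size;              /* length of the range in bytes */
    uint64_t user_offset;       /* the "vm specific offset" from the thread */
};

/* Stores the host user address in *hva and returns 0,
 * or returns -1 if gpa lies outside the slot. */
static int gpa_to_hva(const struct single_slot *s, uint64_t gpa, uint64_t *hva)
{
    assert(s != NULL && hva != NULL);
    if (gpa < s->guest_phys_base || gpa - s->guest_phys_base >= s->size)
        return -1;
    *hva = gpa + s->user_offset;
    return 0;
}
```

With only one slot, the lookup degenerates to a bounds check plus an addition, which is the case an architecture without multiple memory slots would need.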

Regards,

Anthony Liguori

 Looks to me like the s390 might have a lot in common with a future AMD
 nested page table implementation. If AMD choose to reuse the page table
 too, we might share the same ioctl to set up guest addressing with them.

 signed-off-by: Carsten Otte [EMAIL PROTECTED]
 reviewed-by: Christian Borntraeger [EMAIL PROTECTED]
 reviewed-by: Christian Ehrhardt [EMAIL PROTECTED]
 ---
   




Re: [kvm-devel] [RFC] Expose infrastructure for unpinning guest memory

2007-10-12 Thread Anthony Liguori
Avi Kivity wrote:
 Anthony Liguori wrote:
   
 Now that we have userspace memory allocation, I wanted to play with 
 ballooning.
 The idea is that when a guest balloons down, we simply unpin the underlying
 physical memory and the host kernel may or may not swap it.  To reclaim
 ballooned memory, the guest can just start using it and we'll pin it on 
 demand.

 The following patch is a stab at providing the right infrastructure for 
 pinning
 and automatic repinning.  I don't have a lot of comfort in the MMU code so I
 thought I'd get some feedback before going much further.

 gpa_to_hpa is a little awkward to hook, but it seems like the right place in 
 the
 code.  I'm most uncertain about the SMP safety of the unpinning.  Presumably,
 I have to hold the kvm lock around the mmu_unshadow and page_cache release to
 ensure that another VCPU doesn't fault the page back in after mmu_unshadow?

   
 

 Once we have true swapping capabilities (which imply the ability for the
 kernel to remove a page from the shadow page tables) you can unpin by
 calling munmap() or madvise(MADV_REMOVE) on the pages to be unpinned.
   

So does MADV_REMOVE remove the backing page but still allow for memory 
to be faulted in?  That is, after calling MADV_REMOVE, there's no 
guarantee that the contents of a given VA range will remain the same (but 
it won't SEGV the app if it accesses that memory)?

If so, I think that would be the right way to treat it.  That allows for 
two types of hints for the guest to provide: 1) I won't access this 
memory for a very long time (so it's a good candidate to swap out) and 
2) I won't access this memory and don't care about its contents.
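
For the second kind of hint, the behaviour of madvise(MADV_DONTNEED) on private anonymous memory under Linux is a useful reference point: the contents are dropped, and a later access faults in a zero-filled page instead of raising SIGSEGV. The sketch below is an assumption-laden illustration, not the proposed mechanism. It deliberately uses MADV_DONTNEED rather than MADV_REMOVE, the helper name discard_and_reread() is made up, and whether MADV_REMOVE gives the same guarantees for guest memory is exactly the open question above.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: fill a private anonymous mapping, discard it with
 * MADV_DONTNEED, then read the first byte back.
 * Returns that byte, or -1 if mmap/madvise failed. */
static int discard_and_reread(size_t len)
{
    unsigned char *p = (unsigned char *)mmap(NULL, len, PROT_READ | PROT_WRITE,
                                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int first;

    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);                   /* make the pages resident and dirty */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    first = p[0];                           /* access faults in a fresh page */
    munmap(p, len);
    return first;
}
```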

Regards,

Anthony Liguori

 Other than that the approach seems right.

   




Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl

2007-10-12 Thread Carsten Otte
On Friday, 12.10.2007, at 15:37 +0200, Arnd Bergmann wrote:
 I assume the contents are ok, since you're just moving code around, but
 please
 write this 'Signed-off-by' and 'Reviewed-by' (capital letters), and
 include a diffstat for any patch that doesn't fit on a few pages
 of mail client screen space.
The intent of an RFC is, in general, to review a patch, not to pick on
formalities.

Signed-off-by: Carsten Otte [EMAIL PROTECTED]
Reviewed-by: Christian Borntraeger [EMAIL PROTECTED]
Reviewed-by: Christian Ehrhardt [EMAIL PROTECTED]
---
 kvm.h  |3 
 kvm_main.c |  460 ---
 x86.c  |  472 +
 3 files changed, 478 insertions(+), 457 deletions(-)
Index: kvm/drivers/kvm/kvm.h
===
--- kvm.orig/drivers/kvm/kvm.h  2007-10-12 13:38:59.0 +0200
+++ kvm/drivers/kvm/kvm.h   2007-10-12 14:22:40.0 +0200
@@ -661,6 +661,9 @@
unsigned int ioctl, unsigned long arg);
 long kvm_arch_vcpu_ioctl(struct file *filp,
 unsigned int ioctl, unsigned long arg);
+long kvm_arch_vm_ioctl(struct file *filp,
+   unsigned int ioctl, unsigned long arg);
+void kvm_arch_destroy_vm(struct kvm *kvm);
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu);
 
Index: kvm/drivers/kvm/kvm_main.c
===
--- kvm.orig/drivers/kvm/kvm_main.c 2007-10-12 13:38:59.0 +0200
+++ kvm/drivers/kvm/kvm_main.c  2007-10-12 13:57:30.0 +0200
@@ -40,7 +40,6 @@
 #include <linux/anon_inodes.h>
 #include <linux/profile.h>
 #include <linux/kvm_para.h>
-#include <linux/pagemap.h>
 
 #include <asm/processor.h>
 #include <asm/msr.h>
@@ -319,61 +318,6 @@
return kvm;
 }
 
-static void kvm_free_userspace_physmem(struct kvm_memory_slot *free)
-{
-   int i;
-
-   for (i = 0; i < free->npages; ++i) {
-       if (free->phys_mem[i]) {
-           if (!PageReserved(free->phys_mem[i]))
-               SetPageDirty(free->phys_mem[i]);
-           page_cache_release(free->phys_mem[i]);
-       }
-   }
-}
-
-static void kvm_free_kernel_physmem(struct kvm_memory_slot *free)
-{
-   int i;
-
-   for (i = 0; i < free->npages; ++i)
-       if (free->phys_mem[i])
-           __free_page(free->phys_mem[i]);
-}
-
-/*
- * Free any memory in @free but not in @dont.
- */
-static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
-                 struct kvm_memory_slot *dont)
-{
-   if (!dont || free->phys_mem != dont->phys_mem)
-       if (free->phys_mem) {
-           if (free->user_alloc)
-               kvm_free_userspace_physmem(free);
-           else
-               kvm_free_kernel_physmem(free);
-           vfree(free->phys_mem);
-       }
-   if (!dont || free->rmap != dont->rmap)
-       vfree(free->rmap);
-
-   if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
-       vfree(free->dirty_bitmap);
-
-   free->phys_mem = NULL;
-   free->npages = 0;
-   free->dirty_bitmap = NULL;
-}
-
-static void kvm_free_physmem(struct kvm *kvm)
-{
-   int i;
-
-   for (i = 0; i < kvm->nmemslots; ++i)
-       kvm_free_physmem_slot(&kvm->memslots[i], NULL);
-}
-
 static void free_pio_guest_pages(struct kvm_vcpu *vcpu)
 {
int i;
@@ -421,7 +365,7 @@
    kfree(kvm->vpic);
    kfree(kvm->vioapic);
kvm_free_vcpus(kvm);
-   kvm_free_physmem(kvm);
+   kvm_arch_destroy_vm(kvm);
kfree(kvm);
 }
 
@@ -686,183 +630,6 @@
 EXPORT_SYMBOL_GPL(fx_init);
 
 /*
- * Allocate some memory and give it an address in the guest physical address
- * space.
- *
- * Discontiguous memory is allowed, mostly for framebuffers.
- */
-static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
- struct
- kvm_userspace_memory_region *mem,
- int user_alloc)
-{
-   int r;
-   gfn_t base_gfn;
-   unsigned long npages;
-   unsigned long i;
-   struct kvm_memory_slot *memslot;
-   struct kvm_memory_slot old, new;
-
-   r = -EINVAL;
-   /* General sanity checks */
-   if (mem->memory_size & (PAGE_SIZE - 1))
-       goto out;
-   if (mem->guest_phys_addr & (PAGE_SIZE - 1))
-       goto out;
-   if (mem->slot >= KVM_MEMORY_SLOTS)
-       goto out;
-   if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr)
-       goto out;
-
-   memslot = &kvm->memslots[mem->slot];
-   base_gfn = mem->guest_phys_addr >> PAGE_SHIFT;
-   npages = mem->memory_size >> PAGE_SHIFT;
-
-   if (!npages)
-  

Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl

2007-10-12 Thread Arnd Bergmann
On Friday 12 October 2007, Carsten Otte wrote:
 This patch splits kvm_vm_ioctl into architecture independent parts, and
 x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c. 
 

I assume the contents are ok, since you're just moving code around, but
please

 signed-off-by: Carsten Otte [EMAIL PROTECTED]
 reviewed-by: Christian Borntraeger [EMAIL PROTECTED]
 reviewed-by: Christian Ehrhardt [EMAIL PROTECTED]

write this 'Signed-off-by' and 'Reviewed-by' (capital letters), and
include a diffstat for any patch that doesn't fit on a few pages
of mail client screen space.

Arnd 



Re: [kvm-devel] [PATCH][Resend] Split kvm_vcpu to support new archs.

2007-10-12 Thread Carsten Otte
Zhang, Xiantao wrote:
 diff --git a/drivers/kvm/ioapic.c b/drivers/kvm/ioapic.c
 index 3b69541..df67292 100644
 --- a/drivers/kvm/ioapic.c
 +++ b/drivers/kvm/ioapic.c
 @@ -156,7 +156,7 @@ static u32 ioapic_get_delivery_bitmask(struct
 kvm_ioapic *ioapic, u8 dest,
   if (dest_mode == 0) {   /* Physical mode. */
   if (dest == 0xFF) { /* Broadcast. */
   for (i = 0; i < KVM_MAX_VCPUS; ++i)
 - if (kvm->vcpus[i] &&
 kvm->vcpus[i]->apic)
 + if (kvm->vcpus[i] &&
 kvm->vcpus[i]->arch.apic)
   mask |= 1 << i;
   return mask;
   }
Your mail client still wraps here, the patch is not applicable.

  struct kvm_vcpu {
   struct kvm *kvm;
   struct preempt_notifier preempt_notifier;
   int vcpu_id;
   struct mutex mutex;
   int   cpu;
 - u64 host_tsc;
   struct kvm_run *run;
   int interrupt_window_open;
This one should go to arch.
   int guest_mode;
   unsigned long requests;
   unsigned long irq_summary; /* bit vector: 1 per word in
 irq_pending */
   DECLARE_BITMAP(irq_pending, KVM_NR_INTERRUPTS);
Both irq related ones too please.

   int mmio_needed;
   int mmio_read_completed;
Not all architectures have mmio, please put this into arch specific part.

Other than that, the patch looks fine to me.





Re: [kvm-devel] Working on an entry-level project

2007-10-12 Thread Cam Macdonell
Dor Laor wrote:
 Cam Macdonell wrote:

 It's a simple test: when there are keyboard/mouse/display changes, keep 
 the refresh rate high. When there are no changes, start decreasing the rate 
 until a minimum is
 reached. The performance benefit should also be checked, since if it is 
 minimal there's no use for this optimization.
 Related to that, what is the status of VMGL's 
 (http://www.cs.toronto.edu/~andreslc/xen-gl/) integration with KVM or 
 QEMU?  Has anyone tried it?  I've found some pages that refer to QEMU 
 and VMGL but nothing definitive.

 Go ahead, they claim it can work with qemu. Try first with qemu since 
 it is the repository to contribute the code to.
 KVM will inherit it from qemu.

Is there a way to test with QEMU that is not painfully slow?

Cam



Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Carsten Otte
Zhang, Xiantao wrote:
 So, every
 architecture can define its own kvm_arch.h, and the
 build will choose it per ARCH at compile time. But for now, we
 can just put it here until another real new arch is in.  Then, we can
 remove x86.h, since it is not common to all archs.  :)
 BTW, header files should be managed in a uniform way, because
 possible archs, such as IA64, may need many of them.
 That's fine with me. But prior to that we'll need to split x86 so that
 it can be relocated in its arch directory different from the common
 kvm location. And until we're there, we use x86.h as a place to store
 x86 specific header content.
 
 OK, I will change it to x86.h, but we had also named it
 kvm_arch.h because kvm.h will include it. 
Which kvm.h? The one in include/linux or the one in drivers/kvm?



Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Zhang, Xiantao
Carsten Otte wrote:
 Zhang, Xiantao wrote:
 According to our previous discussion, we proposed a source layout
 which contains an include directory to hold header files for all
 archs under drivers/kvm/, and kvm_arch.h will finally go into
 drivers/kvm/include/kvm-x86/ (linked as kvm at compile time).
 Right. The thing is, I've started a new header for this purpose
 yesterday. And this should be in the _same_ header, no matter where
 it'll end up. It is the x86 specific header file, currently named
 drivers/kvm/x86.h, which needs to be renamed/moved in the future.

Agreed. A future rename or remove operation will be needed.

 So, every
 architecture can define its own kvm_arch.h, and the
 build will choose it per ARCH at compile time. But for now, we
 can just put it here until another real new arch is in.  Then, we can
 remove x86.h, since it is not common to all archs.  :)
 BTW, header files should be managed in a uniform way, because
 possible archs, such as IA64, may need many of them.
 That's fine with me. But prior to that we'll need to split x86 so that
 it can be relocated in its arch directory different from the common
 kvm location. And until we're there, we use x86.h as a place to store
 x86 specific header content.

OK, I will change it to x86.h, but we had also named it
kvm_arch.h because kvm.h will include it. 

 so long,
 Carsten



Re: [kvm-devel] Working on an entry-level project

2007-10-12 Thread Dor Laor
Cam Macdonell wrote:


 Dor Laor wrote:
 Cam Macdonell wrote:

 You may choose the interactivity improvements in 
 http://kvm.qumranet.com/kvmwiki/TODO
 Dor

 Thanks Dor, I'll look into it.  Beyond the description, can you 
 elaborate on the problem with frame rate during interactivity?  Is there 
 a simple test that reveals the problem?

It's a simple test: when there are keyboard/mouse/display changes, keep 
the refresh rate high. When there are no changes, start decreasing the rate 
until a minimum is
reached. The performance benefit should also be checked, since if it is 
minimal there's no use for this optimization.
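
The policy above could be sketched roughly as follows. This is only an illustration: the function name next_refresh_interval(), the doubling back-off and the 30 ms / 3000 ms bounds are assumptions made for the example, not values taken from qemu.

```c
#include <assert.h>

/* Hypothetical bounds on the refresh interval, in milliseconds. */
#define REFRESH_MIN_MS 30      /* fastest refresh, used while there is activity */
#define REFRESH_MAX_MS 3000    /* slowest refresh, reached after a long idle time */

/* Returns the next refresh interval.
 * activity != 0 means keyboard/mouse/display changes were seen since the
 * last refresh; in that case the interval snaps back to the minimum.
 * Otherwise the interval doubles (the rate halves) up to the maximum. */
static int next_refresh_interval(int current_ms, int activity)
{
    long next;

    if (activity)
        return REFRESH_MIN_MS;

    next = (long)current_ms * 2;
    if (next > REFRESH_MAX_MS)
        next = REFRESH_MAX_MS;
    if (next < REFRESH_MIN_MS)
        next = REFRESH_MIN_MS;
    return (int)next;
}
```

Measuring the benefit would then come down to comparing host CPU usage of an idle guest with and without the back-off.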
 Related to that, what is the status of VMGL's 
 (http://www.cs.toronto.edu/~andreslc/xen-gl/) integration with KVM or 
 QEMU?  Has anyone tried it?  I've found some pages that refer to QEMU 
 and VMGL but nothing definitive.

Go ahead, they claim it can work with qemu. Try first with qemu since 
it is the repository to contribute the code to.
KVM will inherit it from qemu.
 Cam





Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Hollis Blanchard
On Fri, 2007-10-12 at 13:08 -0300, Glauber de Oliveira Costa wrote:
 +config KVM_CLOCK
 +   bool "KVM paravirtualized clock"
 +   depends on PARAVIRT && GENERIC_CLOCKEVENTS
 +   help
 + Turning on this option will allow you to run a paravirtualized clock
 + when running over the KVM hypervisor. Instead of relying on a PIT
 + (or probably other) emulation by the underlying device model, the host
 + provides the guest with timing infrastructure, as time of day, and
 + timer expiration.

I must have missed earlier discussion on this topic, so I'm left
wondering... what's the point? What's wrong with PIT (et al) emulation?

-- 
Hollis Blanchard
IBM Linux Technology Center




Re: [kvm-devel] [Patch][RFC]Split kvm_vcpu to support new archs.

2007-10-12 Thread Zhang, Xiantao
Carsten Otte wrote:
 Zhang, Xiantao wrote:
 So, every
 architecture can defines its own kvm_arch.h for their arch, and
 compile will choose it per ARCH when compile time. But for now, we
 can just put it here before another real new arch in.  Then, we can
 remove x86.h, since it is not so common for all archs.  :)
 BTW, header files should be managed with a uniform method, because
 possible archs, such as IA64, maybe need many ones.
 That's fine with me. But prior to that we'll need to split x86 so
 that it can be relocated in its arch directory different from the
 common kvm location. And until we're there, we use x86.h as a place
 to store x86 specific header content.
 
 OK, I will change it to x86.h, but we also renamed it to such
 kvm_arch.h, because kvm.h will includes it.
 Which kvm.h? The one in include/linux or the one in drivers/kvm?

I mean drivers/kvm/kvm.h 

Xiantao 



[kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Glauber de Oliveira Costa
Hi,

Attached is a first draft of a paravirt timer implementation for
KVM. It is inspired by Anthony's last patch on the subject, but not that
much based on it.

I'm not using hypercalls to get the current time, but rather,
registering an address that will get timer updates once in a while.
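As a rough sketch of how a guest could turn such periodic updates into a current time: take the last published snapshot and extrapolate with the TSC, using the same fixed-point shift as KVM_SCALE in the patch. The struct layout and function names below are assumptions for illustration and are not the patch's struct kvm_hv_clock.

```c
#include <stdint.h>

#define SCALE_SHIFT 22  /* same shift as KVM_SCALE in the patch */

/* Hypothetical snapshot the host would publish into the registered area. */
struct clock_snapshot {
	uint64_t host_ns;   /* host time in ns when the snapshot was taken */
	uint64_t last_tsc;  /* TSC value at that same instant */
	uint32_t tsc_mult;  /* ns per TSC tick, scaled by 2^SCALE_SHIFT */
};

/* Multiplier for a given TSC frequency: (1e9 << shift) / tsc_hz. */
static uint32_t tsc_mult_for_hz(uint64_t tsc_hz)
{
	return (uint32_t)(((uint64_t)1000000000 << SCALE_SHIFT) / tsc_hz);
}

/* Extrapolate "now" from the last snapshot and the current TSC reading. */
static uint64_t snapshot_to_ns(const struct clock_snapshot *s, uint64_t tsc_now)
{
	uint64_t delta = tsc_now - s->last_tsc;

	/* Note: delta * tsc_mult can overflow 64 bits for very large deltas. */
	return s->host_ns + ((delta * s->tsc_mult) >> SCALE_SHIFT);
}
```

With a 1 GHz TSC the multiplier is exactly 2^22, so one tick maps to one nanosecond; the more often the host refreshes the snapshot, the smaller the delta and the lower the overflow risk.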

Also, it includes a clockevent oneshot implementation (which is the
main point of this patch), which will allow us to do interesting things like
dynticks.

It's still not working on SMP, and I'm currently not sure why (ok,
ok, if you actually read the patch, the reason will become obvious: it
only delivers the interrupt on vector 0x20; I'm already further along with
it, and this patch is just a snapshot).

My next TODOs with it are:
* Get SMP working
* Try something for stolen time, as in Jeremy's last suggestion for Anthony's patch
* Measure the time it takes for a hypercall, and subtract this time
when calculating the expiry time for the timer event.
* Testing and fixing bugs: I'm sure they exist!

Meanwhile, all your suggestions are welcome.

-- 
Glauber de Oliveira Costa.
Free as in Freedom
http://glommer.net

The less confident you are, the more serious you have to act.
diff --git a/arch/i386/Kconfig b/arch/i386/Kconfig
index 97b64d7..622e4d2 100644
--- a/arch/i386/Kconfig
+++ b/arch/i386/Kconfig
@@ -236,6 +236,15 @@ config VMI
 	  (it could be used by other hypervisors in theory too, but is not
 	  at the moment), by linking the kernel to a GPL-ed ROM module
 	  provided by the hypervisor.
+config KVM_CLOCK
+	bool "KVM paravirtualized clock"
+	depends on PARAVIRT && GENERIC_CLOCKEVENTS
+	help
+	  Turning on this option will allow you to run a paravirtualized clock
+	  when running over the KVM hypervisor. Instead of relying on a PIT
+	  (or probably other) emulation by the underlying device model, the host
+	  provides the guest with timing infrastructure, as time of day, and
+	  timer expiration.
 
 config ACPI_SRAT
 	bool
diff --git a/arch/i386/kernel/Makefile b/arch/i386/kernel/Makefile
index 9d33b00..90c5dc4 100644
--- a/arch/i386/kernel/Makefile
+++ b/arch/i386/kernel/Makefile
@@ -42,6 +42,7 @@ obj-$(CONFIG_K8_NB)		+= k8.o
 obj-$(CONFIG_MGEODE_LX)		+= geode.o
 
 obj-$(CONFIG_VMI)		+= vmi.o vmiclock.o
+obj-$(CONFIG_KVM_CLOCK)		+= kvmclock.o
 obj-$(CONFIG_PARAVIRT)		+= paravirt.o
 obj-y+= pcspeaker.o
 
diff --git a/arch/i386/kernel/kvmclock.c b/arch/i386/kernel/kvmclock.c
new file mode 100644
index 000..8c4df5d
--- /dev/null
+++ b/arch/i386/kernel/kvmclock.c
@@ -0,0 +1,222 @@
+#include <linux/clocksource.h>
+#include <linux/clockchips.h>
+#include <linux/interrupt.h>
+#include <linux/kvm_para.h>
+#include <asm/arch_hooks.h>
+#include <asm/i8253.h>
+
+#include <mach_ipi.h>
+#include <irq_vectors.h>
+
+#define KVM_SCALE 22
+
+static int no_kvmclock = 0;
+extern struct clock_event_device *global_clock_event;
+
+static int parse_no_kvmclock(char *arg)
+{
+	no_kvmclock = 1;
+	return 0;
+}
+early_param("no-kvmclock", parse_no_kvmclock);
+
+/* The hypervisor will put information about time periodically here */
+struct kvm_hv_clock hv_clock;
+
+/*
+ * The wallclock is the time of day when we booted. Since then, some time may
+ * have elapsed since the hypervisor wrote the data. So we try to account for
+ * that. Even if the tsc is not accurate, it gives us a more accurate timing
+ * than not adjusting at all
+ */
+unsigned long kvm_get_wallclock(void)
+{
+	unsigned long wallclock;
+	unsigned long long now;
+	wallclock  = hv_clock.wc.tv_sec;
+
+	rdtscll(now);
+
+	now -= hv_clock.last_tsc;
+	now = (now * hv_clock.tsc_mult) >> KVM_SCALE;
+	now += hv_clock.wc.tv_nsec;
+	do_div(now, NSEC_PER_SEC);
+	return wallclock;
+}
+
+int kvm_set_wallclock(unsigned long now)
+{
+	return 0;
+}
+
+/*
+ * This is our read_clock function. The host puts an tsc timestamp each time
+ * it updates a new time, and then we can use it to derive a slightly more
+ * precise notion of elapsed time, converted to nanoseconds.
+ *
+ * If the platform provides a stable tsc, we just use it, and there is no need
+ * for the host to update anything.
+ */
+static cycle_t kvm_clock_read(void) {
+
+	u64 delta, last_tsc;
+	struct timespec *now;
+
+	if (hv_clock.stable_tsc) {
+		rdtscll(last_tsc);
+		return last_tsc;
+	}
+
+	do {
+		last_tsc = hv_clock.last_tsc;
+		rmb();
+		now = hv_clock.now;
+		rmb();
+	} while (hv_clock.last_tsc != last_tsc);
+
+	delta = native_read_tsc() - last_tsc;
+	delta = (delta * hv_clock.tsc_mult) >> KVM_SCALE;
+
+	return (cycle_t)now->tv_sec * NSEC_PER_SEC + now->tv_nsec + delta;
+}
+
+static void kvm_timer_set_mode(enum clock_event_mode mode,
+struct clock_event_device *evt)
+{
+	WARN_ON(!irqs_disabled());
+
+	switch (mode) {
+	case CLOCK_EVT_MODE_ONESHOT:
+		/* this is what we want */
+		break;
+	case CLOCK_EVT_MODE_RESUME:
+		break;
+	case CLOCK_EVT_MODE_PERIODIC:
+		WARN_ON(1);
+		break;
+	case CLOCK_EVT_MODE_UNUSED:
+	case CLOCK_EVT_MODE_SHUTDOWN:
+		kvm_hypercall0(KVM_HCALL_STOP_ONESHOT);
+		break;
+	default:
+		break;
+	}
+}
+
+/*
+ * Programming the next event is 

Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Jeremy Fitzhardinge
Glauber de Oliveira Costa wrote:
 My next TODOs with it are:
 * Get SMP working
 * Try something for stolen time, as jeremy's last suggestion for anthony's 
 patch
 * Measure the time it takes for a hypercall, and subtract this time
 for calculating the expiry time for the timer event.
   

I don't think there's much point in trying to do stuff like this.  The
guest can be preempted at any time, so there's an arbitrary amount of
time between deciding to set a timeout, and the time the timeout
actually happens.

In theory you can mitigate this by using an absolute rather than
relative timeout value, but in practice I don't think it makes much
difference.

 +
 +/*
 + * This is our read_clock function. The host puts an tsc timestamp each time
 + * it updates a new time, and then we can use it to derive a slightly more
 + * precise notion of elapsed time, converted to nanoseconds.
 + *
 + * If the platform provides a stable tsc, we just use it, and there is no 
 need
 + * for the host to update anything.


How would you deal with suspend/resume/migrate?  Also, do you assume
that stable_tsc also means synchronized tsc on an SMP host?

 + */
 +static cycle_t kvm_clock_read(void) {
 +
 + u64 delta, last_tsc;
 + struct timespec *now;
 +
 + if (hv_clock.stable_tsc) {
 + rdtscll(last_tsc);
 + return last_tsc;
   

So this returns a tsc here? 

 + }
 +
 + do {
 + last_tsc = hv_clock.last_tsc;
 + rmb();
 + now = hv_clock.now;
   

Shouldn't this be taking a copy of now, rather than a pointer to it? 
Otherwise what's the point of this loop?

 + rmb();
 + } while (hv_clock.last_tsc != last_tsc);
   

This won't be an atomic compare on 32-bit; it could get confused by
seeing a half-updated tsc value.
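One common way to address both points (copying the data and avoiding a torn 64-bit compare) is a version counter that the writer bumps before and after each update, so the reader only ever compares a 32-bit value. The sketch below is an assumption about how that could look; the struct, field names and barrier macro are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical shared layout: the writer increments version before and
 * after an update, so an odd value means "update in progress".
 */
struct shared_time {
	volatile uint32_t version;
	uint64_t last_tsc;
	uint64_t now_ns;
};

struct time_copy {
	uint64_t last_tsc;
	uint64_t now_ns;
};

/* Stand-in for rmb(); uses a GCC/Clang builtin full barrier. */
#define read_barrier() __sync_synchronize()

static struct time_copy read_shared_time(const struct shared_time *st)
{
	struct time_copy c;
	uint32_t v;

	do {
		v = st->version;            /* 32-bit load, atomic on 32-bit x86 */
		read_barrier();
		c.last_tsc = st->last_tsc;  /* copy the data, not a pointer to it */
		c.now_ns = st->now_ns;
		read_barrier();
	} while ((v & 1) || v != st->version);

	return c;
}
```

The 64-bit fields may still be read in two halves, but a torn read is then detected by the version check and simply retried.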

 +
 + delta = native_read_tsc() - last_tsc;
 + delta = (delta * hv_clock.tsc_mult) >> KVM_SCALE;
 +
 + return (cycle_t)now->tv_sec * NSEC_PER_SEC + now->tv_nsec + delta;
   

--- But returns ns here?

 +}
 +
 +static void kvm_timer_set_mode(enum clock_event_mode mode,
 + struct clock_event_device *evt)
 +{
 + WARN_ON(!irqs_disabled());
 +
 + switch (mode) {
 + case CLOCK_EVT_MODE_ONESHOT:
 + /* this is what we want */
 + break;
 + case CLOCK_EVT_MODE_RESUME:
 + break;
 + case CLOCK_EVT_MODE_PERIODIC:
 + WARN_ON(1);
 + break;
 + case CLOCK_EVT_MODE_UNUSED:
 + case CLOCK_EVT_MODE_SHUTDOWN:
 + kvm_hypercall0(KVM_HCALL_STOP_ONESHOT);
 + break;
 + default:
 + break;
 + }
 +}
 +
 +/*
 + * Programming the next event is just a matter of asking the host
 + * to generate us an interrupt when the time expires. We pass the
 + * delta on, and hypervisor will do all remaining tricks. For a more
 + * precise timing, we can just subtract the time spent by the hypercall
   

Not worthwhile.  It would be better to make the hypercall take an
absolute time, and pass it now+delta.  At least then if you get
preempted past the timeout period you can return -ETIME, and the clock
subsystem will know what to do.
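A minimal sketch of that suggestion, with stand-ins for the guest clock and the hypercall (both hypothetical; the real hypercall in the patch takes a relative delta):

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the guest clock and the host side. */
static uint64_t fake_now_ns;          /* what a clock read would return */
static uint64_t programmed_deadline;  /* last deadline accepted by the "host" */

static uint64_t read_clock_ns(void)
{
	return fake_now_ns;
}

/* Pretend hypercall taking an absolute expiry; fails if it already passed. */
static int hcall_set_alarm_abs(uint64_t deadline_ns)
{
	if (deadline_ns <= read_clock_ns())
		return -ETIME;
	programmed_deadline = deadline_ns;
	return 0;
}

/* Convert the relative delta once, then pass an absolute deadline down. */
static int set_next_event_abs(uint64_t now_ns, uint64_t delta_ns)
{
	return hcall_set_alarm_abs(now_ns + delta_ns);
}
```

If the guest is preempted between computing now+delta and the hypercall, the host sees a deadline in the past and the caller gets -ETIME instead of a silently late timer.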

 + */
 +static int kvm_timer_next_event(unsigned long delta,
 + struct clock_event_device *evt)
 +{
 + WARN_ON(evt->mode != CLOCK_EVT_MODE_ONESHOT);
 + kvm_hypercall1(KVM_HCALL_SET_ALARM, delta);
 + return 0;
 +}
 +
 diff --git a/arch/i386/kernel/setup.c b/arch/i386/kernel/setup.c
 index d474cd6..fd758f9 100644
 --- a/arch/i386/kernel/setup.c
 +++ b/arch/i386/kernel/setup.c
 @@ -46,6 +46,7 @@
  #include <linux/crash_dump.h>
  #include <linux/dmi.h>
  #include <linux/pfn.h>
 +#include <linux/kvm_para.h>
  
  #include <video/edid.h>
  
 @@ -579,6 +580,9 @@ void __init setup_arch(char **cmdline_p)
   vmi_init();
  #endif
  
 +#ifdef CONFIG_KVM_CLOCK
 + kvmclock_init();
 +#endif
   

Why is this necessary?  Can't you hook one of the existing pvops?


J



Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Anthony Liguori
Glauber de Oliveira Costa wrote:
 Hi,

 Attached is a first draft to a paravirt implementation for a timer to
 KVM. It is inspired in anthony's last patch about it, but not that
 much based on it.

 I'm not using hypercalls to get the current time, but rather,
 registering an address that will get timer updates once in a while.

 Also, it includes a clockevent oneshot implementation (which is the
 very thing of this patch), that will allow us interest things like
 dynticks.

 It's still not yet working on SMP, and I'm currently not sure why (ok,
 ok, if you actually read the patch, it will become obvious the why: it
 only delivers interrupt for vector 0x20, but I'm further with it, this
 patch is just a snapshot)

 My next TODOs with it are:
 * Get SMP working
 * Try something for stolen time, as jeremy's last suggestion for anthony's 
 patch
 * Measure the time it takes for a hypercall, and subtract this time
 for calculating the expiry time for the timer event.
 * Testing and fixing bugs: I'm sure they exist!

 Meanwhile, all your suggestions are welcome.

   

<snip>

 +
 +void __init kvmclock_init(void)
 +{
 +
 + unsigned long shared_page = (unsigned long)&hv_clock;
 + /*
 +  * If we can't use the paravirt clock, just go with
 +  * the usual timekeeping
 +  */
 + if (!kvm_para_available() || no_kvmclock)
 + return;
   

You should also check kvm_para_has_feature() and define a feature flag 
for the clock.
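Something along these lines, where the feature bit number and helper names are made up for the example (a real flag would have to be defined alongside the other KVM paravirt features):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit number for the paravirt clock. */
#define PV_FEATURE_CLOCK 0

/* Stand-in for testing one bit in the feature word the host advertises. */
static bool pv_has_feature(uint32_t features, unsigned int bit)
{
	return (features >> bit) & 1u;
}

/* Decide whether to install the paravirt clock hooks at all. */
static bool should_use_pv_clock(bool para_available, uint32_t features,
				bool disabled_on_cmdline)
{
	if (!para_available || disabled_on_cmdline)
		return false;
	return pv_has_feature(features, PV_FEATURE_CLOCK);
}
```

That way a guest on an older host without the clock falls back to the usual timekeeping instead of issuing a hypercall the host does not know.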

 + if (kvm_hypercall1(KVM_HCALL_SET_SHARED_PAGE, shared_page))
 + return;
 +
 + paravirt_ops.get_wallclock = kvm_get_wallclock;
 + paravirt_ops.set_wallclock = kvm_set_wallclock;
 + paravirt_ops.sched_clock = kvm_sched_clock;
 + paravirt_ops.time_init = kvm_time_init;
 + /*
 +  * If we let the normal APIC initialization code run, they will
 +  * override our event handler, relying that the APIC will deliver
 +  * the interrupts in the LOCAL_TIMER_VECTOR. The easy solution is
 +  * keep the PIT running until then
 +  */
 + paravirt_ops.setup_boot_clock = kvm_disable_pit;
 +}


 diff --git a/drivers/kvm/irq.c b/drivers/kvm/irq.c
 index 0f663fe..7baf798 100644
 --- a/drivers/kvm/irq.c
 +++ b/drivers/kvm/irq.c
 @@ -32,6 +32,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
  {
   struct kvm_pic *s;
  
 + if (v->timer_vector != -1)
 + return 1;
   if (kvm_apic_has_interrupt(v) == -1) {  /* LAPIC */
   if (kvm_apic_accept_pic_intr(v)) {
   s = pic_irqchip(v->kvm);/* PIC */
 @@ -43,6 +45,12 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
  }
  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
  
 +static int kvm_get_pvclock_interrupt(struct kvm_vcpu *v)
 +{
 + int ret = v->timer_vector;
 + v->timer_vector = -1;
 + return  ret;
 +}
  /*
   * Read pending interrupt vector and intack.
   */
 @@ -51,7 +59,9 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
   struct kvm_pic *s;
   int vector;
  
 - vector = kvm_get_apic_interrupt(v); /* APIC */
 + vector = kvm_get_pvclock_interrupt(v);
 + if (vector == -1)
 + vector = kvm_get_apic_interrupt(v); /* APIC */
   

It might be better to just rely on the in-kernel APIC to inject an 
interrupt for the clock (via kvm_pic_set_irq()).

Regards,

Anthony Liguori

   if (vector == -1) {
   if (kvm_apic_accept_pic_intr(v)) {
   s = pic_irqchip(v->kvm)



Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Anthony Liguori
Hollis Blanchard wrote:
 On Fri, 2007-10-12 at 13:08 -0300, Glauber de Oliveira Costa wrote:
   
 +config KVM_CLOCK
 +   bool "KVM paravirtualized clock"
 +   depends on PARAVIRT && GENERIC_CLOCKEVENTS
 +   help
 + Turning on this option will allow you to run a paravirtualized clock
 + when running over the KVM hypervisor. Instead of relying on a PIT
 + (or probably other) emulation by the underlying device model, the host
 + provides the guest with timing infrastructure, as time of day, and
 + timer expiration.
 

 I must have missed earlier discussion on this topic, so I'm left
 wondering... what's the point? What's wrong with PIT (et al) emulation?
   

There are three separate reasons, that I know of, to have a PV timer.

1) The PIT is periodic.  A PV timer can offer a one-shot timer, which 
enables dynticks.

2) The TSC would have to be used as a clocksource.  You don't know its 
frequency, which is the first problem with using the TSC, and some systems 
have a TSC that changes frequency.  A PV time source gives you a more 
stable clocksource (although, as in glommer's patch, when the TSC can be 
used, it's better to use it).

3) A PV clock can support stolen time calculation, a concept that doesn't 
really exist with emulation.
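To make the third point concrete, here is a rough sketch of how stolen time could be derived if the host exposed, per vCPU, how long the vCPU actually ran. The struct and field names are hypothetical, and the sketch assumes the vCPU was runnable for the whole interval (idle time is not separated out).

```c
#include <stdint.h>

/* Hypothetical per-vCPU counters the host would expose to the guest. */
struct vcpu_runtime {
	uint64_t wall_ns;  /* host wall-clock time at the sample */
	uint64_t run_ns;   /* cumulative time the vCPU actually executed */
};

/*
 * Time between two samples during which the vCPU was not running.
 * Assumes the vCPU wanted to run for the entire interval.
 */
static uint64_t stolen_ns(const struct vcpu_runtime *prev,
			  const struct vcpu_runtime *cur)
{
	uint64_t elapsed = cur->wall_ns - prev->wall_ns;
	uint64_t ran = cur->run_ns - prev->run_ns;

	return elapsed > ran ? elapsed - ran : 0;
}
```

An emulated PIT or TSC has no channel for this kind of information, which is why it needs a paravirtual interface.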

Regards,

Anthony Liguori



[kvm-devel] Windows XP PFN_LIST_CORRUPT error during install.

2007-10-12 Thread John Clemens
Not much detail here but I'll post all I can. 

KVM-46 (from tarball, using kvm-46 modules), ubuntu gutsy
2.6.22-14-generic amd64, Turion X2 with SVM,  1GB total memory on
laptop.

Was in the middle of a Windows XP Pro SP2 install using this command
line:  

sudo ~/kvm46/bin/qemu-system-x86_64 -m 368 -boot c -cdrom winxp.iso -hda
winxp-pro-work.qcow2 -vnc :1 -net user -net
nic,mac=00:11:22:33:44:55:66,model=rtl8139 -localtime 

While installing, I went to do other work, which involved starting
another VM running a linux boot CD with 256MB of ram.  Was running both
side by side without problem, other than the fact my host X-windows was
sluggish due to the lack of total ram available (1G-368M-256M).  When
finished with the other VM, I shut it down, and a few seconds later the
windows install which was almost complete in the first VM threw a
bluescreen with a 'PFN_LIST_CORRUPT' error.  Note that this was in the
second stage of the install where windows had already rebooted into the
install environment on the hard disk and was 'registering components' or
something. 

Nothing in the kernel log on the host that I can see, anywhere else I
can look? 

Unrelated, but earlier I tried a windows install with '-clock dynticks'
and windows installation would eventually fail with an
IRQL__NOT_LESS_THAN_OR_EQUAL  (or something like that.. I'm working from
memory).  Removing the 'clock' option made it work fine. Not sure if
it's supposed to work yet or not, but it didn't for me. 

john.c




Re: [kvm-devel] OpenBSD 4.1 failes with kvm-45

2007-10-12 Thread Oliver Kowalke
Hi,
I've applied the patch; now the kernel detects:
cpu0: Intel Pentium Pro, II or III...
cpu0: FPU,V86,...
But I get in the next line:
kernel: page fault trap, code=0
Stopped at trap+0x16f: testb $0x3,0x38(%ecx)
I don't know what that means.
Hopefully you can help.
kind regards, Oliver

On Thursday, 11 October 2007 09:57:19 you wrote:
 On Wed, Oct 10, 2007 at 07:38:55PM +0200, Oliver Kowalke wrote:
  Will this patch be included into the new kvm version (46)?

 No. Hopefully it will be included in a near future version of kvm.

  hmm - I get an error:
  patch -p1 < ./qemu.patch
 
  patching file qemu/hw/pc.c
  patch:  malformed patch at line 6:                       DisplayState
  *ds, const char **fd_filename, int snapshot,
 
  regards, Oliver

 I might be wrong, but this sounds like a patch mangled by the mail
 system, since here it applies cleanly and quietly against both kvm-45
 and kvm-46. I'll try to resend as attachment.

 Regards,

 Dan.





Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Anthony Liguori
Hollis Blanchard wrote:
 On Fri, 2007-10-12 at 15:02 -0500, Anthony Liguori wrote:
   
 Hollis Blanchard wrote:
 
 On Fri, 2007-10-12 at 13:08 -0300, Glauber de Oliveira Costa wrote:
   
   
 +config KVM_CLOCK
 +   bool "KVM paravirtualized clock"
 +   depends on PARAVIRT && GENERIC_CLOCKEVENTS
 +   help
 + Turning on this option will allow you to run a paravirtualized clock
 + when running over the KVM hypervisor. Instead of relying on a PIT
 + (or probably other) emulation by the underlying device model, the host
 + provides the guest with timing infrastructure, as time of day, and
 + timer expiration.
 
 
 I must have missed earlier discussion on this topic, so I'm left
 wondering... what's the point? What's wrong with PIT (et al) emulation?
   
   
 There are three separate reasons, that I know of, to have a PV timer.

 1) the PIT is periodic.  a PV timer can offer a one shot timer which 
 enables dynticks.
 

 Obviously people have figured out how to do dynticks on real x86
 hardware, so I don't accept this reason. :)
   

Using more advanced timers like the HPET.

 2) the TSC would have to be used as a clocksource.  You don't know the 
 frequency which is the first problem with using the TSC but some systems 
 have a TSC that changes frequencies.  A PV time source gives you more 
 stable clocksource (although as in glommer's patch, when the TSC can be 
 used, it's better to use it).
 

 As I understand it, the TSC is based on CPU frequency, which changes
 with power management. Architectural bug.

 However, PV time still doesn't help here:
   * The TSC is _user_ accessible, so PV time support in the guest
 kernel doesn't solve the problem.
   * It looks like external agents can perform out-of-kernel
 frequency scaling on x86 (at least I see options for it on IBM
 blades). So there must already exist some mechanism for a kernel
 to be informed that the TSC frequency has been changed.
   

I don't know if that is scaled transparently to the host OS or just at 
boot time.  Keep in mind too, modern Intel processors have fixed 
frequency TSCs so it's possible that that's only an option for those 
processors.

Regards,

Anthony Liguori

 3) a PV clock can support stolen time calculation which there really 
 isn't a concept of with emulation.
 

 This is true, and I know other platforms support this functionality. I
 think it's mostly useful for process time accounting. Is that actually
 supported in this patch?

   




Re: [kvm-devel] [RFC] Paravirt timer for KVM

2007-10-12 Thread Hollis Blanchard
On Fri, 2007-10-12 at 15:02 -0500, Anthony Liguori wrote:
 Hollis Blanchard wrote:
  On Fri, 2007-10-12 at 13:08 -0300, Glauber de Oliveira Costa wrote:

  +config KVM_CLOCK
  +   bool "KVM paravirtualized clock"
  +   depends on PARAVIRT && GENERIC_CLOCKEVENTS
  +   help
  + Turning on this option will allow you to run a paravirtualized clock
  + when running over the KVM hypervisor. Instead of relying on a PIT
  + (or probably other) emulation by the underlying device model, the host
  + provides the guest with timing infrastructure, as time of day, and
  + timer expiration.
  
 
  I must have missed earlier discussion on this topic, so I'm left
  wondering... what's the point? What's wrong with PIT (et al) emulation?

 
 There are three separate reasons, that I know of, to have a PV timer.
 
 1) the PIT is periodic.  a PV timer can offer a one shot timer which 
 enables dynticks.

Obviously people have figured out how to do dynticks on real x86
hardware, so I don't accept this reason. :)

 2) the TSC would have to be used as a clocksource.  You don't know the 
 frequency which is the first problem with using the TSC but some systems 
 have a TSC that changes frequencies.  A PV time source gives you more 
 stable clocksource (although as in glommer's patch, when the TSC can be 
 used, it's better to use it).

As I understand it, the TSC is based on CPU frequency, which changes
with power management. Architectural bug.

However, PV time still doesn't help here:
  * The TSC is _user_ accessible, so PV time support in the guest
kernel doesn't solve the problem.
  * It looks like external agents can perform out-of-kernel
frequency scaling on x86 (at least I see options for it on IBM
blades). So there must already exist some mechanism for a kernel
to be informed that the TSC frequency has been changed.

 3) a PV clock can support stolen time calculation which there really 
 isn't a concept of with emulation.

This is true, and I know other platforms support this functionality. I
think it's mostly useful for process time accounting. Is that actually
supported in this patch?

-- 
Hollis Blanchard
IBM Linux Technology Center




[kvm-devel] [PATCH 0/4] Swapping

2007-10-12 Thread Izik Eidus
These patches allow guest memory that is not shadowed to be swapped out.

To make it most effective you should run with -kvm-shadow-memory 1 (which 
will make your machine slow).
With -kvm-shadow-memory 1, a 3 GB guest can shrink to just 32 MB 
on the physical host!

When not using -kvm-shadow-memory, I saw a 4100 MB machine get as 
low as 168 MB on the physical host (not as bad as I thought it would be, 
and surely not as bad as it can be with 41 MB of shadow pages :))


It seems to be very stable; it didn't crash on me once, and I was able 
to run:
two 3 GB Windows XP guests + one 5 GB Linux guest

and
two 4.1 GB Windows XP guests and two 2 GB Windows XP guests.

A few things to note:
Ignore for now the ugly messages in dmesg; they are due to the fact that 
gfn_to_page tries to sleep while local interrupts are disabled (we have to 
split some emulator functions so they won't do it).

I also saw an issue with the new rmap on the Fedora 7 live CD: for some 
reason, in nonpaging mode rmap_remove gets called about 50 times 
less than it needs to be.
It doesn't happen with other Linux guests; I need to check this... (for now 
it means you might have about 200k of memory leak for each Fedora 7 live 
CD you are running).

Also note that kvm now loads much faster, because no memset over all the 
memory is needed (because gfn_to_page gets called at run time).

(Avi and Dor, note that this patch includes a small fix to a bug in the 
patch that I sent you.)



[kvm-devel] [PATCH 1/4] Swapping

2007-10-12 Thread Izik Eidus

This makes the rmap keep reverse mappings for all present shadow pages.
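For readers unfamiliar with the rmap encoding, the following standalone sketch shows the kind of iteration the new rmap_next() helper in this patch performs: the head word is either empty, a single spte pointer, or (low bit set) a pointer to a chain of descriptors. It is a simplified illustration with hypothetical type names, not the kernel code itself.

```c
#include <stddef.h>
#include <stdint.h>

#define RMAP_EXT 4

struct rmap_desc {
	uint64_t *sptes[RMAP_EXT];
	struct rmap_desc *more;
};

/*
 * Head word encoding (as in the patch):
 *   0             : no mappings
 *   low bit clear : the word is a single uint64_t * (one spte)
 *   low bit set   : word & ~1 is a struct rmap_desc * chain
 * Pass prev == NULL to get the first spte, or the previously returned
 * spte to get the one after it. Returns NULL when the list is exhausted.
 */
static uint64_t *rmap_next_sketch(const uintptr_t *rmapp, const uint64_t *prev)
{
	struct rmap_desc *desc;
	const uint64_t *last = NULL;
	int i;

	if (!*rmapp)
		return NULL;
	if (!(*rmapp & 1))
		return prev ? NULL : (uint64_t *)*rmapp;

	desc = (struct rmap_desc *)(*rmapp & ~(uintptr_t)1);
	for (; desc; desc = desc->more) {
		for (i = 0; i < RMAP_EXT && desc->sptes[i]; i++) {
			if (last == prev)
				return desc->sptes[i];
			last = desc->sptes[i];
		}
	}
	return NULL;
}
```

Because the walk restarts from the head on every call, iterating a long chain this way is quadratic; that seems acceptable for short chains but is worth keeping in mind.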
From 3b5821a55836f82f987b878982cbc6fc8336371f Mon Sep 17 00:00:00 2001
From: Izik Eidus [EMAIL PROTECTED](none)
Date: Sat, 13 Oct 2007 01:47:44 +0200
Subject: [PATCH] modify the rmap so it will hold reverse mapping to all present
shadow pages

Signed-off-by: Izik Eidus [EMAIL PROTECTED]
---
 drivers/kvm/mmu.c |   52 
 1 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index f52604a..cfbeec8 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -211,8 +211,8 @@ static int is_io_pte(unsigned long pte)
 
 static int is_rmap_pte(u64 pte)
 {
-	return (pte & (PT_WRITABLE_MASK | PT_PRESENT_MASK))
-		== (PT_WRITABLE_MASK | PT_PRESENT_MASK);
+	return pte != shadow_trap_nonpresent_pte
+		&& pte != shadow_notrap_nonpresent_pte;
 }
 
 static void set_shadow_pte(u64 *sptep, u64 spte)
@@ -456,29 +456,51 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 	}
 }
 
-static void rmap_write_protect(struct kvm *kvm, u64 gfn)
+static u64 *rmap_next(struct kvm *kvm, unsigned long *rmapp, u64 *spte)
 {
 	struct kvm_rmap_desc *desc;
+	struct kvm_rmap_desc *prev_desc;
+	u64 *prev_spte;
+	int i;
+
+	if (!*rmapp)
+		return NULL;
+	else if (!(*rmapp & 1)) {
+		if (!spte)
+			return (u64 *)*rmapp;
+		return NULL;
+	}
+	desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
+	prev_desc = NULL;
+	prev_spte = NULL;
+	while (desc) {
+		for (i = 0; i < RMAP_EXT && desc->shadow_ptes[i]; ++i) {
+			if (prev_spte == spte)
+				return desc->shadow_ptes[i];
+			prev_spte = desc->shadow_ptes[i];
+		}
+		desc = desc->more;
+	}
+	return NULL;
+}
+
+static void rmap_write_protect(struct kvm *kvm, u64 gfn)
+{
 	unsigned long *rmapp;
 	u64 *spte;
 
 	gfn = unalias_gfn(kvm, gfn);
 	rmapp = gfn_to_rmap(kvm, gfn);
 
-	while (*rmapp) {
-		if (!(*rmapp & 1))
-			spte = (u64 *)*rmapp;
-		else {
-			desc = (struct kvm_rmap_desc *)(*rmapp & ~1ul);
-			spte = desc->shadow_ptes[0];
-		}
+	spte = rmap_next(kvm, rmapp, NULL);
+	while (spte) {
 		BUG_ON(!spte);
 		BUG_ON(!(*spte & PT_PRESENT_MASK));
-		BUG_ON(!(*spte & PT_WRITABLE_MASK));
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
-		rmap_remove(kvm, spte);
-		set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
+		if (is_writeble_pte(*spte))
+			set_shadow_pte(spte, *spte & ~PT_WRITABLE_MASK);
 		kvm_flush_remote_tlbs(kvm);
+		spte = rmap_next(kvm, rmapp, spte);
 	}
 }
 
@@ -1399,10 +1421,8 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 		pt = page->spt;
 		for (i = 0; i < PT64_ENT_PER_PAGE; ++i)
 			/* avoid RMW */
-			if (pt[i] & PT_WRITABLE_MASK) {
-				rmap_remove(kvm, &pt[i]);
+			if (pt[i] & PT_WRITABLE_MASK)
 				pt[i] &= ~PT_WRITABLE_MASK;
-			}
 	}
 }
 
-- 
1.5.2.4



[kvm-devel] [PATCH 2/4] Swapping

2007-10-12 Thread Izik Eidus
This patch makes gfn_to_page an always-safe function (it returns bad_page in 
case there is no such page in the guest).
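The pattern is roughly the following: instead of returning NULL, the lookup returns the address of one dedicated sentinel object, and callers test for it with a helper. The sketch uses hypothetical names and a plain array in place of memory slots; it only illustrates the idea behind bad_page and is_error_page().

```c
#include <stdbool.h>
#include <stddef.h>

struct page_sketch {
	int id;
};

/* One dedicated sentinel object; its address means "no such guest page". */
static struct page_sketch bad_page_sentinel;

static bool is_error_page_sketch(const struct page_sketch *p)
{
	return p == &bad_page_sentinel;
}

/* The lookup never returns NULL: out-of-range frames map to the sentinel. */
static struct page_sketch *lookup_page(struct page_sketch *pages,
				       size_t npages, size_t gfn)
{
	if (gfn >= npages)
		return &bad_page_sentinel;
	return &pages[gfn];
}
```

A caller that forgets the check then operates on a harmless dummy page rather than dereferencing NULL.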
From 51a8851a2805f5b61d3fbe506ab317ecb677c3da Mon Sep 17 00:00:00 2001
From: Izik Eidus [EMAIL PROTECTED](none)
Date: Sat, 13 Oct 2007 02:01:54 +0200
Subject: [PATCH] change gfn_to_page to be safe always function.

Signed-off-by: Izik Eidus [EMAIL PROTECTED]
---
 drivers/kvm/kvm.h |3 ++-
 drivers/kvm/kvm_main.c|   26 ++
 drivers/kvm/mmu.c |   16 +---
 drivers/kvm/paging_tmpl.h |   11 +++
 4 files changed, 24 insertions(+), 32 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index 4a52d6e..a155c2b 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -565,8 +565,9 @@ static inline int is_error_hpa(hpa_t hpa) { return hpa & HPA_MSB; }
 hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva);
 struct page *gva_to_page(struct kvm_vcpu *vcpu, gva_t gva);
 
-extern hpa_t bad_page_address;
+extern struct page *bad_page;
 
+int is_error_page(struct page *page);
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn);
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index a0f8366..bfa201c 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1012,6 +1012,12 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
 	return r;
 }
 
+int is_error_page(struct page *page)
+{
+	return page == bad_page;
+}
+EXPORT_SYMBOL_GPL(is_error_page);
+
 gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
 {
 	int i;
@@ -1053,7 +1059,7 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 	gfn = unalias_gfn(kvm, gfn);
 	slot = __gfn_to_memslot(kvm, gfn);
 	if (!slot)
-		return NULL;
+		return bad_page;
 	return slot->phys_mem[gfn - slot->base_gfn];
 }
 EXPORT_SYMBOL_GPL(gfn_to_page);
@@ -1073,7 +1079,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (!page)
+	if (is_error_page(page))
 		return -EFAULT;
 	page_virt = kmap_atomic(page, KM_USER0);
 
@@ -1111,7 +1117,7 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (!page)
+	if (is_error_page(page))
 		return -EFAULT;
 	page_virt = kmap_atomic(page, KM_USER0);
 
@@ -1149,7 +1155,7 @@ int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (!page)
+	if (is_error_page(page))
 		return -EFAULT;
 	page_virt = kmap_atomic(page, KM_USER0);
 
@@ -3075,7 +3081,7 @@ static struct page *kvm_vm_nopage(struct vm_area_struct *vma,
 
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
 	page = gfn_to_page(kvm, pgoff);
-	if (!page)
+	if (is_error_page(page))
 		return NOPAGE_SIGBUS;
 	get_page(page);
 	if (type != NULL)
@@ -3390,7 +3396,7 @@ static struct sys_device kvm_sysdev = {
 	.cls = &kvm_sysdev_class,
 };
 
-hpa_t bad_page_address;
+struct page *bad_page;
 
 static inline
 struct kvm_vcpu *preempt_notifier_to_vcpu(struct preempt_notifier *pn)
@@ -3519,7 +3525,6 @@ EXPORT_SYMBOL_GPL(kvm_exit_x86);
 
 static __init int kvm_init(void)
 {
-	static struct page *bad_page;
 	int r;
 
 	r = kvm_mmu_module_init();
@@ -3530,16 +3535,13 @@ static __init int kvm_init(void)
 
 	kvm_arch_init();
 
-	bad_page = alloc_page(GFP_KERNEL);
+	bad_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 
 	if (bad_page == NULL) {
 		r = -ENOMEM;
 		goto out;
 	}
 
-	bad_page_address = page_to_pfn(bad_page) << PAGE_SHIFT;
-	memset(__va(bad_page_address), 0, PAGE_SIZE);
-
 	return 0;
 
 out:
@@ -3552,7 +3554,7 @@ out4:
 static __exit void kvm_exit(void)
 {
 	kvm_exit_debug();
-	__free_page(pfn_to_page(bad_page_address >> PAGE_SHIFT));
+	__free_page(bad_page);
 	kvm_mmu_module_exit();
 }
 
diff --git a/drivers/kvm/mmu.c b/drivers/kvm/mmu.c
index cfbeec8..e6a9b4a 100644
--- a/drivers/kvm/mmu.c
+++ b/drivers/kvm/mmu.c
@@ -850,23 +850,17 @@ static void page_header_update_slot(struct kvm *kvm, void *pte, gpa_t gpa)
 	__set_bit(slot, page_head-slot_bitmap);
 }
 
-hpa_t safe_gpa_to_hpa(struct kvm *kvm, gpa_t gpa)
-{
-	hpa_t hpa = gpa_to_hpa(kvm, gpa);
-
-	return is_error_hpa(hpa) ? bad_page_address | (gpa & ~PAGE_MASK): hpa;
-}
-
 hpa_t gpa_to_hpa(struct kvm *kvm, gpa_t gpa)
 {
 	struct page *page;
+	hpa_t hpa;
 
 	ASSERT((gpa & HPA_ERR_MASK) == 0);
 	page = gfn_to_page(kvm, gpa >> PAGE_SHIFT);
-	if (!page)
-		return gpa | HPA_ERR_MASK;
-	return ((hpa_t)page_to_pfn(page) << PAGE_SHIFT)
-		| (gpa & (PAGE_SIZE-1));
+	hpa = ((hpa_t)page_to_pfn(page) << PAGE_SHIFT) | (gpa & (PAGE_SIZE-1));
+	if (is_error_page(page))
+		return hpa | HPA_ERR_MASK;
+	return hpa;
 }
 
 hpa_t gva_to_hpa(struct kvm_vcpu *vcpu, gva_t gva)
diff --git a/drivers/kvm/paging_tmpl.h b/drivers/kvm/paging_tmpl.h
index a9e687b..58fd35a 100644
--- 
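
The idea behind this patch, returning a sentinel page instead of NULL so that callers never dereference a null pointer, can be illustrated in plain userspace C. This is only a minimal sketch, not the kernel code: struct toy_page, struct toy_slot and toy_gfn_to_page() are hypothetical stand-ins, and only the names bad_page and is_error_page follow the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct toy_page {
	int unused;
};

/* One statically allocated sentinel that represents "no such page". */
static struct toy_page bad_page_storage;
static struct toy_page *const bad_page = &bad_page_storage;

static int is_error_page(const struct toy_page *page)
{
	return page == bad_page;
}

/* Hypothetical memory slot: a contiguous run of guest frame numbers. */
struct toy_slot {
	unsigned long base_gfn;
	unsigned long npages;
	struct toy_page *pages;
};

/* Never returns NULL: lookups outside the slot yield the sentinel. */
static struct toy_page *toy_gfn_to_page(struct toy_slot *slot, unsigned long gfn)
{
	if (slot == NULL || gfn < slot->base_gfn ||
	    gfn - slot->base_gfn >= slot->npages)
		return bad_page;
	return &slot->pages[gfn - slot->base_gfn];
}
```

Because the sentinel is a real object, a caller that forgets the is_error_page() check still touches valid memory rather than a null pointer, which is the sense in which the lookup is "always safe".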

[kvm-devel] [PATCH 3/4] Swapping

2007-10-12 Thread Izik Eidus

This patch makes the guest's non-shadowed pages swappable.

From 8e25e215b8ed95ca4ff51cbfcf5bdc438bb799f4 Mon Sep 17 00:00:00 2001
From: Izik Eidus [EMAIL PROTECTED](none)
Date: Sat, 13 Oct 2007 04:03:28 +0200
Subject: [PATCH] make the guest non shadowed memory swappable

Signed-off-by: Izik Eidus [EMAIL PROTECTED]
---
 drivers/kvm/kvm.h |1 +
 drivers/kvm/kvm_main.c|   66 +---
 drivers/kvm/mmu.c |   13 -
 drivers/kvm/paging_tmpl.h |   23 +--
 4 files changed, 70 insertions(+), 33 deletions(-)

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index a155c2b..2e83fa7 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -409,6 +409,7 @@ struct kvm_memory_slot {
 	unsigned long *rmap;
 	unsigned long *dirty_bitmap;
 	int user_alloc; /* user allocated memory */
+	unsigned long userspace_addr;
 };
 
 struct kvm {
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index bfa201c..0dce93c 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -321,15 +321,6 @@ static struct kvm *kvm_create_vm(void)
 
 static void kvm_free_userspace_physmem(struct kvm_memory_slot *free)
 {
-	int i;
-
-	for (i = 0; i < free->npages; ++i) {
-		if (free->phys_mem[i]) {
-			if (!PageReserved(free->phys_mem[i]))
-				SetPageDirty(free->phys_mem[i]);
-			page_cache_release(free->phys_mem[i]);
-		}
-	}
 }
 
 static void kvm_free_kernel_physmem(struct kvm_memory_slot *free)
@@ -771,19 +762,8 @@ static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
 		memset(new.phys_mem, 0, npages * sizeof(struct page *));
 		memset(new.rmap, 0, npages * sizeof(*new.rmap));
 		if (user_alloc) {
-			unsigned long pages_num;
-
 			new.user_alloc = 1;
-			down_read(&current->mm->mmap_sem);
-
-			pages_num = get_user_pages(current, current->mm,
-						   mem->userspace_addr,
-						   npages, 1, 0, new.phys_mem,
-						   NULL);
-
-			up_read(&current->mm->mmap_sem);
-			if (pages_num != npages)
-				goto out_unlock;
+			new.userspace_addr = mem->userspace_addr;
 		} else {
 			for (i = 0; i < npages; ++i) {
 				new.phys_mem[i] = alloc_page(GFP_HIGHUSER
@@ -1058,8 +1038,27 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 
 	gfn = unalias_gfn(kvm, gfn);
 	slot = __gfn_to_memslot(kvm, gfn);
-	if (!slot)
+	if (!slot) {
+		get_page(bad_page);
 		return bad_page;
+	}
+	if (slot->user_alloc) {
+		struct page *page[1];
+		int npages;
+
+		down_read(&current->mm->mmap_sem);
+		npages = get_user_pages(current, current->mm,
+					slot->userspace_addr
+					+ (gfn - slot->base_gfn) * PAGE_SIZE, 1,
+					1, 0, page, NULL);
+		up_read(&current->mm->mmap_sem);
+		if (npages != 1) {
+			get_page(bad_page);
+			return bad_page;
+		}
+		return page[0];
+	}
+	get_page(slot->phys_mem[gfn - slot->base_gfn]);
 	return slot->phys_mem[gfn - slot->base_gfn];
 }
 EXPORT_SYMBOL_GPL(gfn_to_page);
@@ -1079,13 +1078,16 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		put_page(page);
 		return -EFAULT;
+	}
 	page_virt = kmap_atomic(page, KM_USER0);
 
 	memcpy(data, page_virt + offset, len);
 
 	kunmap_atomic(page_virt, KM_USER0);
+	put_page(page);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_page);
@@ -1117,14 +1119,17 @@ int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		put_page(page);
 		return -EFAULT;
+	}
 	page_virt = kmap_atomic(page, KM_USER0);
 
 	memcpy(page_virt + offset, data, len);
 
 	kunmap_atomic(page_virt, KM_USER0);
 	mark_page_dirty(kvm, gfn);
+	put_page(page);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_write_guest_page);
@@ -1155,13 +1160,16 @@ int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)
 	struct page *page;
 
 	page = gfn_to_page(kvm, gfn);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		put_page(page);
 		return -EFAULT;
+	}
 	page_virt = kmap_atomic(page, KM_USER0);
 
 	memset(page_virt + offset, 0, len);
 
 	kunmap_atomic(page_virt, KM_USER0);
+	put_page(page);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_clear_guest_page);
@@ -2090,13 +2098,12 @@ int kvm_emulate_pio_string(struct kvm_vcpu *vcpu, struct kvm_run *run, int in,
 	for (i = 0; i < nr_pages; ++i) {
 		mutex_lock(&vcpu->kvm->lock);
 		page = gva_to_page(vcpu, address + i * PAGE_SIZE);
-		if (page)
-			get_page(page);
 		vcpu->pio.guest_pages[i] = page;
 		mutex_unlock(&vcpu->kvm->lock);
 		if (!page) {
 			inject_gp(vcpu);
 			free_pio_guest_pages(vcpu);
 			return 1;
 		}
 	}
@@ -3081,9 +3088,10 @@ static struct page *kvm_vm_nopage(struct vm_area_struct *vma,
 
 	pgoff = ((address - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
 	page = gfn_to_page(kvm, pgoff);
-	if (is_error_page(page))
+	if (is_error_page(page)) {
+		put_page(page);
 		return NOPAGE_SIGBUS;
-	get_page(page);
+	}
 	if (type != NULL)
 		*type = 
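
With this patch the page lookup hands back a page with a reference already taken, including when it returns bad_page, so every caller has to drop that reference on both the error path and the success path. The following userspace sketch illustrates that discipline under stated assumptions: all toy_* names are hypothetical, and the plain integer counter only imitates what get_page()/put_page() do in the kernel.

```c
#include <stddef.h>

#define TOY_PAGE_BYTES 16

/* Hypothetical page with a plain integer reference count. */
struct toy_page {
	int refcount;
	unsigned char data[TOY_PAGE_BYTES];
};

/* Sentinel page; it starts with one reference owned by the "module". */
static struct toy_page toy_bad_page = { 1, { 0 } };

static void toy_get_page(struct toy_page *page) { page->refcount++; }
static void toy_put_page(struct toy_page *page) { page->refcount--; }

/* Always returns a page with one extra reference held, even on failure. */
static struct toy_page *toy_lookup_page(struct toy_page *pages, size_t npages,
					size_t index)
{
	struct toy_page *page = index < npages ? &pages[index] : &toy_bad_page;

	toy_get_page(page);
	return page;
}

/* Reads one byte; returns 0 on success and -1 on a bad page or offset. */
static int toy_read_byte(struct toy_page *pages, size_t npages, size_t index,
			 size_t offset, unsigned char *out)
{
	struct toy_page *page = toy_lookup_page(pages, npages, index);
	int ret = -1;

	if (page != &toy_bad_page && offset < TOY_PAGE_BYTES) {
		*out = page->data[offset];
		ret = 0;
	}
	toy_put_page(page);	/* dropped on every path, as in the patch */
	return ret;
}
```

Taking the reference on the sentinel as well keeps the caller's rule uniform: whatever the lookup returned, release it exactly once.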

[kvm-devel] [PATCH 4/4] Swapping

2007-10-12 Thread Izik Eidus
This patch just removes the memset from kvmctl, so the VM will load much
faster now.
From dc0164113041c2f2bf22fc066ca99b9b8531d627 Mon Sep 17 00:00:00 2001
From: Izik Eidus [EMAIL PROTECTED](none)
Date: Sat, 13 Oct 2007 02:56:25 +0200
Subject: [PATCH] now that gfn_to_page get called at run time, we dont have to do memset on the memory.
(it is now much faster to load VM with alot of memory)

Signed-off-by: Izik Eidus [EMAIL PROTECTED]
---
 user/kvmctl.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/user/kvmctl.c b/user/kvmctl.c
index 0604f2f..ff2014e 100644
--- a/user/kvmctl.c
+++ b/user/kvmctl.c
@@ -391,7 +391,6 @@ int kvm_alloc_userspace_memory(kvm_context_t kvm, unsigned long memory,
 
 
 	low_memory.userspace_addr = (unsigned long)*vm_mem;
-	memset((unsigned long *)low_memory.userspace_addr, 0, low_memory.memory_size);
 	/* 640K should be enough. */
 	r = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &low_memory);
 	if (r == -1) {
@@ -406,7 +405,6 @@ int kvm_alloc_userspace_memory(kvm_context_t kvm, unsigned long memory,
 			return -1;
 		}
 		extended_memory.userspace_addr = (unsigned long)(*vm_mem + exmem);
-		memset((unsigned long *)extended_memory.userspace_addr, 0, extended_memory.memory_size);
 		r = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &extended_memory);
 		if (r == -1) {
 			fprintf(stderr, "kvm_create_memory_region: %m\n");
@@ -422,7 +420,6 @@ int kvm_alloc_userspace_memory(kvm_context_t kvm, unsigned long memory,
 			return -1;
 		}
 		above_4g_memory.userspace_addr = (unsigned long)(*vm_mem + 0x1);
-		memset((unsigned long *)above_4g_memory.userspace_addr, 0, above_4g_memory.memory_size);
 		r = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &above_4g_memory);
 		if (r == -1) {
 			fprintf(stderr, "kvm_create_memory_region: %m\n");
-- 
1.5.2.4
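
One way to see why dropping the memset speeds up loading: assuming the guest memory comes from an anonymous mmap() (the allocation itself is not shown in these hunks, so this is an assumption), Linux already hands out zero-filled pages on first touch, and a memset over the whole region forces every page to be allocated up front. The helper name alloc_guest_memory() below is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve `size` bytes of anonymous, private memory.
 * The kernel backs such a mapping lazily with zero-filled pages, so no
 * explicit memset() is needed to observe zeros.
 */
static void *alloc_guest_memory(size_t size)
{
	void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return mem == MAP_FAILED ? NULL : mem;
}
```

Reading the first and last byte of such a region returns 0 without any explicit initialisation, and only the pages actually touched get backed by physical memory.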



Re: [kvm-devel] Test for KVM, kernel 33aaf..., userspace, 803145...

2007-10-12 Thread Izik Eidus
Zhao, Yunfeng wrote:

 Three Linux guest issues:
 6. segfault while booting 64bit linux with 4GB mem
 https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1812050&group_id=180599
   

did it happen to you before kvm-46?


Re: [kvm-devel] Test for KVM, kernel 33aaf..., userspace, 803145...

2007-10-12 Thread Zhao, Yunfeng
Yes, it also happened before kvm-46.
A guest with 1.5GB of memory does not have the problem.

-Original Message-
From: Izik Eidus [mailto:[EMAIL PROTECTED]
Sent: October 13, 2007 10:24
To: Zhao, Yunfeng
Cc: kvm-devel
Subject: Re: [kvm-devel] Test for KVM, kernel 33aaf..., userspace, 803145...

Zhao, Yunfeng wrote:

 Three Linux guest issues:
 6. segfault while booting 64bit linux with 4GB mem
 https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1812050&group_id=180599


did it happen to you before kvm-46?


Re: [kvm-devel] RFC/patch portability: split kvm_vm_ioctl

2007-10-12 Thread Zhang, Xiantao
Carsten Otte wrote:
 This patch splits kvm_vm_ioctl into architecture independent parts, and
 x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c.
 
 Common ioctls for all architectures are:
 KVM_CREATE_VCPU, KVM_GET_DIRTY_LOG
 
 I'd really like to see more commonalities, but all others did not fit
 our needs. I would love to keep KVM_GET_DIRTY_LOG common, so that the
 ingenious migration code does not need to care too much about
 different architectures.

 x86 specific ioctls are:
 KVM_SET_MEMORY_REGION, KVM_SET_USER_MEMORY_REGION,
 KVM_GET/SET_NR_MMU_PAGES, KVM_SET_MEMORY_ALIAS, KVM_CREATE_IRQCHIP,
 KVM_CREATE_IRQ_LINE, KVM_GET/SET_IRQCHIP

I don't see why KVM_SET_MEMORY_REGION and KVM_SET_USER_MEMORY_REGION
are not made common, although I have read the reasons you listed. I
think they should work for most archs, even if they are not a good fit
for s390. If we make them arch-specific, we have to duplicate them many
times in the KVM code.

One suggestion: maybe we can comment out the current memory allocation
logic in userspace for s390, and let s390 use your approach to get its
memory.



 While the pic/apic related functions are obviously x86 specific, some
 other ioctls seem to be common at first glance.
 KVM_SET_(USER)_MEMORY_REGION for example. We've got a totally different
 address layout on s390: we cannot support multiple slots, and a user
 memory range always equals the guest physical memory [guest_phys + vm
 specific offset = host user address]. We neither have nor need dedicated
 vmas for the guest memory, we just use what the memory management has
 in stock. This is true because we reuse the page table for user and
 guest mode.
 Looks to me like the s390 might have a lot in common with a future AMD
 nested page table implementation. If AMD chooses to reuse the page
 table too, we might share the same ioctl to set up guest addressing
 with them.
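
The split under discussion, a common handler that falls through to an architecture handler for everything it does not recognise, can be sketched as follows. This is an illustration only: the TOY_* command numbers and both function names are hypothetical, and the return values 0 and 1 merely mark which layer handled the request.

```c
#include <errno.h>

/* Hypothetical command numbers; the real ioctl encodings are not used here. */
enum {
	TOY_CREATE_VCPU = 1,
	TOY_GET_DIRTY_LOG,
	TOY_SET_MEMORY_REGION,
	TOY_CREATE_IRQCHIP
};

/* Architecture-specific part: returns 1 for the commands it owns. */
static long toy_arch_vm_ioctl(unsigned int cmd)
{
	switch (cmd) {
	case TOY_SET_MEMORY_REGION:
	case TOY_CREATE_IRQCHIP:
		return 1;
	default:
		return -ENOTTY;	/* unknown to both layers */
	}
}

/* Common part: handles shared commands, delegates everything else. */
static long toy_vm_ioctl(unsigned int cmd)
{
	switch (cmd) {
	case TOY_CREATE_VCPU:
	case TOY_GET_DIRTY_LOG:
		return 0;
	default:
		return toy_arch_vm_ioctl(cmd);
	}
}
```

Moving a command such as the memory-region ioctls from the arch switch into the common switch is then a local change, which is the trade-off being debated in this thread.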







Re: [kvm-devel] [PATCH][Resend] Split kvm_vcpu to support new archs.

2007-10-12 Thread Zhang, Xiantao
Carsten Otte wrote:
 Zhang, Xiantao wrote:
 diff --git a/drivers/kvm/ioapic.c b/drivers/kvm/ioapic.c
 index 3b69541..df67292 100644
 --- a/drivers/kvm/ioapic.c
 +++ b/drivers/kvm/ioapic.c
 @@ -156,7 +156,7 @@ static u32 ioapic_get_delivery_bitmask(struct
  kvm_ioapic *ioapic, u8 dest, if (dest_mode == 0) {  /*
Physical
  mode. */ if (dest == 0xFF) {/* Broadcast. */
  for (i = 0; i < KVM_MAX_VCPUS; ++i)
 -if (kvm->vcpus[i] &&
 kvm->vcpus[i]->apic)
 +if (kvm->vcpus[i] &&
 kvm->vcpus[i]->arch.apic)
  mask |= 1 << i;
  return mask;
  }
 Your mail client still wraps here, the patch is not applicable.

Maybe something is wrong with my mail client; I will check it next time.
Thanks

  struct kvm_vcpu {
  struct kvm *kvm;
  struct preempt_notifier preempt_notifier;
  int vcpu_id;
  struct mutex mutex;
  int   cpu;
 -u64 host_tsc;
  struct kvm_run *run;
  int interrupt_window_open;
 This one should go to arch.
  int guest_mode;
  unsigned long requests;
  unsigned long irq_summary; /* bit vector: 1 per word in
  irq_pending */ DECLARE_BITMAP(irq_pending, KVM_NR_INTERRUPTS);
 Both irq related ones too please.

I don't understand this one. Doesn't s390 need userspace to inject
interrupts into the kvm module, or does it use some other approach?
If it does, we had better follow the existing KVM infrastructure;
otherwise it may introduce unnecessary changes for most archs.
Please don't forget that we are in the KVM world :)

  int mmio_needed;
  int mmio_read_completed;
 Not all architectures have mmio, please put this into arch specific
 part. 

OK.

 Other than that, the patch looks fine to me.
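
The layout being negotiated here, common fields in the vcpu structure and everything architecture-specific behind an embedded arch member, looks roughly like the sketch below. The arch.apic access path follows the quoted diff; all type names and the particular selection of fields are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for the local APIC state. */
struct toy_lapic {
	unsigned int divide_count;
};

/* Fields that only make sense on one architecture live here. */
struct toy_vcpu_arch {
	struct toy_lapic *apic;
	int mmio_needed;
	int mmio_read_completed;
};

/* Fields every architecture needs stay in the common structure. */
struct toy_vcpu {
	int vcpu_id;
	int cpu;
	struct toy_vcpu_arch arch;
};

/* Common code reaches arch state only through the embedded member. */
static int toy_vcpu_has_apic(const struct toy_vcpu *vcpu)
{
	return vcpu != NULL && vcpu->arch.apic != NULL;
}
```

Each disagreement in the thread (interrupt_window_open, the irq bitmaps, the mmio flags) is a question of which of the two structures a given field belongs in.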



[kvm-devel] APIC_TMCCT register read bug

2007-10-12 Thread Kevin Pedretti
Hi,

While booting a non-Linux OS under kvm-46, I noticed that reading
APIC_TMCCT before initializing APIC_TDCR to something other than its
boot time value would lead to a host kernel divide by zero exception.
It's due to apic->timer.divide_count being set to 0 at boot... it should
be set to 2 since APIC_TDCR=0 means 'divide count by 2'.  The last hunk
of the attached patch results in apic->timer.divide_count being set to 2
and eliminates the oops.
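
For reference, the mapping from the divide configuration register to a divisor can be sketched as below. The encoding (bits 0, 1 and 3 select the divisor, with 0 meaning divide by 2 and all three set meaning divide by 1) is the architectural one; the function name tdcr_to_divide_count() is hypothetical and this is not claimed to be the body of update_divide_count().

```c
#include <stdint.h>

/*
 * Decode an APIC timer divide configuration value into the divisor.
 * Bits 0, 1 and 3 form a 3-bit selector; selector 0 means divide by 2,
 * each step doubles the divisor, and selector 7 wraps around to 1.
 */
static uint32_t tdcr_to_divide_count(uint32_t tdcr)
{
	uint32_t low = tdcr & 0xf;
	uint32_t sel = ((low & 0x3) | ((low & 0x8) >> 1)) + 1;

	return 1u << (sel & 0x7);
}
```

With the register at its reset value of 0 this yields 2, never 0, which is why initialising the cached divisor through the decode step avoids the division by zero described above.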

The other changes to apic_get_tmcct() are intended to clean it up a bit,
although completely untested other than to verify 0 is returned for a
read of APIC_TMCCT at boot.  'apic' should not be used before the
ASSERT() and using u32 for counter_passed makes it fairly easy to
overflow.
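
A standalone sketch of the current-count arithmetic, using a 64-bit tick counter and a modulo instead of the subtraction loop, might look like this. It is an illustration under stated assumptions rather than the attached patch: the function and parameter names are hypothetical, and a periodic timer that has elapsed an exact multiple of its initial count is modelled as freshly reloaded.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the current count of a down-counting timer.
 * initial:       programmed initial count (0 means the timer is stopped)
 * elapsed_ns:    nanoseconds since the timer was last (re)started
 * bus_cycle_ns:  nanoseconds per bus cycle
 * divide_count:  divisor applied to the bus clock
 */
static uint32_t timer_current_count(uint32_t initial, uint64_t elapsed_ns,
				    uint32_t bus_cycle_ns,
				    uint32_t divide_count, bool periodic)
{
	uint64_t ticks;

	/* Guard the division: a zero divisor must never reach it. */
	if (initial == 0 || bus_cycle_ns == 0 || divide_count == 0)
		return 0;

	/* 64-bit tick count, so long elapsed times do not overflow. */
	ticks = elapsed_ns / ((uint64_t)bus_cycle_ns * divide_count);

	if (ticks < initial)
		return initial - (uint32_t)ticks;
	if (!periodic)
		return 0;	/* one-shot timers stick at 0 */

	/* Periodic timers reload; only the remainder of the period counts. */
	return initial - (uint32_t)(ticks % initial);
}
```

For example, an initial count of 100 with 250 ticks elapsed gives 0 in one-shot mode and 50 in periodic mode.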

Kevin
--- kvm-46.orig/kernel/lapic.c  2007-10-10 02:06:36.0 -0600
+++ kvm-46.fix/kernel/lapic.c   2007-10-12 22:50:01.0 -0600
@@ -487,12 +487,19 @@
 
 static u32 apic_get_tmcct(struct kvm_lapic *apic)
 {
-	u32 counter_passed;
-	ktime_t passed, now = apic->timer.dev.base->get_time();
-	u32 tmcct = apic_get_reg(apic, APIC_TMICT);
+	u64 counter_passed;
+	ktime_t passed, now;
+	u32 tmcct;
 
 	ASSERT(apic != NULL);
 
+	now = apic->timer.dev.base->get_time();
+	tmcct = apic_get_reg(apic, APIC_TMICT);
+
+	/* if initial count is 0, current count should also be 0 */
+	if (tmcct == 0)
+		return 0;
+
 	if (unlikely(ktime_to_ns(now) <=
 		ktime_to_ns(apic->timer.last_update))) {
 		/* Wrap around */
@@ -507,15 +514,24 @@
 
 	counter_passed = div64_64(ktime_to_ns(passed),
 				  (APIC_BUS_CYCLE_NS * apic->timer.divide_count));
-	tmcct -= counter_passed;
 
-	if (tmcct <= 0) {
-		if (unlikely(!apic_lvtt_period(apic)))
+	if (counter_passed > tmcct) {
+		if (unlikely(!apic_lvtt_period(apic))) {
+			/* one-shot timers stick at 0 until reset */
 			tmcct = 0;
-		else
-			do {
-				tmcct += apic_get_reg(apic, APIC_TMICT);
-			} while (tmcct <= 0);
+		} else {
+			/*
+			 * periodic timers reset to APIC_TMICT when they
+			 * hit 0. The while loop simulates this happening N
+			 * times. (counter_passed %= tmcct) would also work,
+			 * but might be slower or not work on 32-bit??
+			 */
+			while (counter_passed > tmcct)
+				counter_passed -= tmcct;
+			tmcct -= counter_passed;
+		}
+	} else {
+		tmcct -= counter_passed;
 	}
 
 	return tmcct;
@@ -844,7 +860,7 @@
 		apic_set_reg(apic, APIC_ISR + 0x10 * i, 0);
 		apic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
 	}
-	apic->timer.divide_count = 0;
+	update_divide_count(apic);
 	atomic_set(&apic->timer.pending, 0);
 	if (vcpu->vcpu_id == 0)
 		vcpu->apic_base |= MSR_IA32_APICBASE_BSP;