date:20080307

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

Re: [kvm-devel] [PATCH 2/2] [PATCH] disable kvm clock unless addr's LSB is set.

2008-03-07 Thread Yang, Sheng

On Thursday 06 March 2008 22:45:44 Glauber Costa wrote:
> Use LSB of the address passed through the msr to enable/disable
> the clock. Setting it to 1 enables it, setting it to 0 disables it.
>
> As the guest data structures are aligned anyway, this
> won't be a problem, as this bit is free.
>
> Guest is changed accordingly
>
> Signed-off-by: Glauber Costa <[EMAIL PROTECTED]>
> ---
>  arch/x86/kernel/kvmclock.c |    2 +-
>  arch/x86/kvm/x86.c         |    9 ++++++++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 7c481a3..f654a12 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -125,7 +125,7 @@ static int kvm_register_clock(void)
>  {
>  	int cpu = smp_processor_id();
>  	int low, high;
> -	low = (int)__pa(&per_cpu(hv_clock, cpu));
> +	low = (int)__pa(&per_cpu(hv_clock, cpu)) | 1;
>  	high = ((u64)__pa(&per_cpu(hv_clock, cpu)) >> 32);
>
>  	return native_write_msr_safe(MSR_KVM_SYSTEM_TIME, low, high);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 6abd784..64beff6 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -591,8 +591,15 @@ int kvm_set_msr_common(struct kvm_vcpu *
>   if (vcpu->arch.time_page)
>   kvm_release_page_dirty(vcpu->arch.time_page);
>
> + /* we verify if the enable bit is set... */
> + if (!(data & 1)) {
> + vcpu->arch.time = NULL;

This line makes the compiler complain...

Thanks
Yang, Sheng
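
For readers following the thread, here is a minimal user-space sketch of the encoding the quoted patch description lays out. It assumes only what that description states: the guest structure is aligned, so bit 0 of its address is free, and that bit carries the enable flag. The names `KVMCLOCK_ENABLE_BIT`, `kvmclock_encode`, `kvmclock_enabled` and `kvmclock_addr` are hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: bit 0 of the value written to the MSR is the enable flag. */
#define KVMCLOCK_ENABLE_BIT UINT64_C(1)

/* Guest side: combine an aligned physical address with the enable flag. */
static inline uint64_t kvmclock_encode(uint64_t aligned_pa, bool enable)
{
	return enable ? (aligned_pa | KVMCLOCK_ENABLE_BIT)
		      : (aligned_pa & ~KVMCLOCK_ENABLE_BIT);
}

/* Host side: test the flag before touching any guest memory. */
static inline bool kvmclock_enabled(uint64_t msr_value)
{
	return (msr_value & KVMCLOCK_ENABLE_BIT) != 0;
}

/* Host side: recover the address by masking the flag off. */
static inline uint64_t kvmclock_addr(uint64_t msr_value)
{
	return msr_value & ~KVMCLOCK_ENABLE_BIT;
}
```

Under this scheme a written value of 0 decodes as "disabled", since its low bit is clear.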


Re: [kvm-devel] headersinstall of kvm.h does not work

2008-03-07 Thread David Woodhouse

On Fri, 2008-03-07 at 13:26 +0100, Christian Borntraeger wrote:
> +unifdef-$(CONFIG_HAVE_KVM) += kvm.h
>  unifdef-y += llc.h
>  unifdef-y += loop.h
> snip--
> 
> This patch does not work. Kbuild (scripts/Makefile.headersinst) does
> not check the config file, so kvm.h is never installed.
> 
> Sam is there an easy way to allow constructs like
> "unifdef-$(CONFIG_FOO)"?

That might be justifiable for HAVE_xxx which is a constant for any given
architecture -- but I predict that as soon as we add such a facility, it
would get abused for stuff which really _is_ configurable. And that
would make me sad.

-- 
dwmw2


Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter

On Fri, 7 Mar 2008, Andrea Arcangeli wrote:

> This is a replacement for the previously posted 3/4, one of the pieces
> to allow the mmu notifier methods to sleep.

Looks good. That is what we talked about last week. What guarantees now 
that we see the cacheline referenced after the cacheline that 
contains the pointer that was changed? hlist_for_each does a 
rcu_dereference with implied memory barrier? So it's like EMM?
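
The ordering question above can be illustrated with a small user-space analogue written with C11 atomics. This is only a sketch, not the kernel's implementation: `publish` and `read_payload` are hypothetical names, the release store stands in for rcu_assign_pointer, and the acquire load is used as a stronger stand-in for the dependency ordering that rcu_dereference relies on.

```c
#include <stdatomic.h>
#include <stddef.h>

struct notifier {
	int payload;	/* in the real case this sits in a different cacheline */
};

/* Pointer that readers follow; plays the role of the list head. */
static _Atomic(struct notifier *) head = NULL;

/*
 * Writer: initialise the object first, then publish it with release
 * ordering so the initialisation cannot be reordered after the store.
 */
static void publish(struct notifier *n, int payload)
{
	n->payload = payload;
	atomic_store_explicit(&head, n, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store, so a reader
 * that observes the new pointer also observes the payload behind it.
 */
static int read_payload(int fallback)
{
	struct notifier *n = atomic_load_explicit(&head, memory_order_acquire);

	return n ? n->payload : fallback;
}
```

The point of the pairing is that the reader never needs a separate barrier between loading the pointer and reading through it.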


Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter

On Fri, 7 Mar 2008, Andrea Arcangeli wrote:

> PS. this problem I pointed out of _end possibly called before _begin
> is the same for #v9 and EMM V1 as far as I can tell.

Hmmm.. We could just push that on the driver saying that it has to 
tolerate it. Otherwise how can we solve this?


Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter

On Fri, 7 Mar 2008, Andrea Arcangeli wrote:

> In the meantime I've also been thinking that we could need the
> write_seqlock in mmu_notifier_register, to know when to restart the
> loop if somebody does a mmu_notifier_register;
> synchronize_rcu(). Otherwise there's no way to be sure the mmu
> notifier will start firing immediately after synchronize_rcu. I'm
> unsure if it's acceptable that in-progress mmu notifier invocations,
> don't need to notice the fact that somebody did mmu_notifier_register;
> synchronize_rcu. If they don't need to notice, then we can just drop
> unregister and all rcu_read_lock()s instead of adding write_seqlock to
> the register operation.

This is all getting into some very complicated issues.

> Overall my effort is to try to avoid expanding the list walk with
> explicit memory barriers like in EMM while trying to be equally
> efficient.

The smp_rmb is such a big problem? You have seqlock, rcu etc all in there 
as well. I doubt that this is more efficient.

> Another issue is that the _begin/_end logic doesn't provide any
> guarantee that the _begin will start firing before _end, if a kernel
> module is loaded while another cpu is already running inside some
> munmap operation etc.. The KVM usage of mmu notifier has no problem
> with that detail, but KVM doesn't use _begin at all, I wonder if
> others would have problems. This is a kind of a separate problem, but
> quite related to the question if the notifiers must be guaranteed to
> start firing immediately after mmu_notifier_unregister;synchronize_rcu
> or not, that's why I mentioned it here.

Ahh. Yes that is an interesting issue. If a device driver cannot handle 
this then _begin must prohibit module loading. That means not allowing 
stop_machine_run I guess which should not be that difficult.


Re: [kvm-devel] [PATCH] 4/4 i_mmap_lock spinlock2rwsem (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter

On Fri, 7 Mar 2008, Andrea Arcangeli wrote:

> I didn't look into this but it shows how it would be risky to make
> this change in .25. It's a bit strange that the bugcheck triggers

Yes this was never intended for .25. I think we need to split this into a 
couple of patches. One needs to get rid of the spinlock dropping, then one 
that deals with the read concurrency issues and finally one that converts 
the spinlock. Thanks for looking at it.


Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter

On Fri, 7 Mar 2008, Andrea Arcangeli wrote:

> This combines the non-sleep-capable RCU locking of #v9 with a seqlock
> so the mmu notifier fast path will require zero cacheline
> writes/bouncing while still providing mmu_notifier_unregister and
> allowing to schedule inside the mmu notifier methods. If we drop
> mmu_notifier_unregister we can as well drop all seqlock and
> rcu_read_lock()s. But this locking scheme combination is sexy enough
> and 100% scalable (the mmu_notifier_list cacheline will be preloaded
> anyway and that will most certainly include the sequence number value
> in l1 for free even in Christoph's NUMA systems) so IMHO it worth to
> keep mmu_notifier_unregister.

Well, it adds lots of processing. Not sure if it's really worth it. Seems 
that this scheme cannot work since the existence of the structure passed 
to the callbacks is not guaranteed since the RCU locks are not held. You 
need some kind of a refcount to give the existence guarantee.

> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -20,7 +20,9 @@ void __mmu_notifier_release(struct mm_st
>  void __mmu_notifier_release(struct mm_struct *mm)
>  {
>  	struct mmu_notifier *mn;
> +	unsigned seq;
>  
> +	seq = read_seqbegin(&mm->mmu_notifier_lock);
>  	while (unlikely(!hlist_empty(&mm->mmu_notifier_list))) {
>  		mn = hlist_entry(mm->mmu_notifier_list.first,
>  				 struct mmu_notifier,
> @@ -28,6 +30,7 @@ void __mmu_notifier_release(struct mm_st
>  		hlist_del(&mn->hlist);
>  		if (mn->ops->release)
>  			mn->ops->release(mn, mm);
> +		BUG_ON(read_seqretry(&mm->mmu_notifier_lock, seq));
>  	}
>  }

So this is only for sanity checking? The BUG_ON detects concurrent 
operations that should not happen? Need a comment here.
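
For readers unfamiliar with the read_seqbegin/read_seqretry pairing used in the quoted hunk, the following user-space sketch shows the versioning idea with C11 atomics. It is not the kernel's seqlock: the `seqcount` type and the `seq_*` names are hypothetical, writers are assumed to be serialised by some other lock, and the memory orderings are a conservative approximation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a sequence counter.
 * Even value: no writer active. Odd value: a write is in progress.
 */
struct seqcount {
	atomic_uint sequence;
};

static unsigned seq_read_begin(struct seqcount *s)
{
	unsigned seq;

	/* Wait until no writer is in the middle of an update. */
	do {
		seq = atomic_load_explicit(&s->sequence, memory_order_acquire);
	} while (seq & 1);
	return seq;
}

/* True if a writer ran since seq_read_begin(); the caller must then retry. */
static bool seq_read_retry(struct seqcount *s, unsigned start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence, memory_order_relaxed) != start;
}

/* Writer side: bump to odd before the update, back to even after it. */
static void seq_write_begin(struct seqcount *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_acq_rel);
}

static void seq_write_end(struct seqcount *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_release);
}
```

A reader that expects no concurrent writer can treat a true result from `seq_read_retry` as a bug rather than as a reason to loop, which appears to be how the BUG_ON in the hunk above uses it.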

> @@ -42,11 +45,19 @@ int __mmu_notifier_clear_flush_young(str
>  	struct mmu_notifier *mn;
>  	struct hlist_node *n;
>  	int young = 0;
> +	unsigned seq;
>  
>  	rcu_read_lock();
> +restart:
> +	seq = read_seqbegin(&mm->mmu_notifier_lock);
>  	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
> -		if (mn->ops->clear_flush_young)
> +		if (mn->ops->clear_flush_young) {
> +			rcu_read_unlock();
>  			young |= mn->ops->clear_flush_young(mn, mm, address);
> +			rcu_read_lock();
> +		}
> +		if (read_seqretry(&mm->mmu_notifier_lock, seq))
> +			goto restart;

Great innovative idea of the seqlock for versioning checks.

>  	}
>  	rcu_read_unlock();
>  

Well that gets pretty sophisticated here. If you drop the rcu lock then 
the entity pointed to by mn can go away right? So how can you pass that 
structure to clear_flush_young? What is guaranteeing the existence of the 
structure?
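
One common way to provide such an existence guarantee is a reference count taken before the read-side lock is dropped. The sketch below shows the idea in user-space C11; it is an illustration only, not something proposed in the patch, and `notifier_obj`, `obj_get` and `obj_put` are hypothetical names. The `released` flag stands in for the actual free.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with an embedded reference count. */
struct notifier_obj {
	atomic_int refs;	/* starts at 1 for the list's own reference */
	bool released;		/* set by the last put; stands in for kfree() */
};

static void obj_init(struct notifier_obj *o)
{
	atomic_init(&o->refs, 1);
	o->released = false;
}

/*
 * Take a reference while the object is still known to exist.
 * Fails if the count has already dropped to zero.
 */
static bool obj_get(struct notifier_obj *o)
{
	int old = atomic_load_explicit(&o->refs, memory_order_relaxed);

	while (old > 0) {
		/* On failure the CAS reloads 'old' and the loop re-checks it. */
		if (atomic_compare_exchange_weak_explicit(&o->refs, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

/* Drop a reference; whoever drops the last one releases the object. */
static void obj_put(struct notifier_obj *o)
{
	if (atomic_fetch_sub_explicit(&o->refs, 1, memory_order_acq_rel) == 1)
		o->released = true;
}
```

With this shape, a walker would call `obj_get` before dropping the lock, invoke the callback, and call `obj_put` afterwards, so an unregister in between only drops the list's reference.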

> @@ -58,11 +69,19 @@ void __mmu_notifier_invalidate_page(stru
>  {
>  	struct mmu_notifier *mn;
>  	struct hlist_node *n;
> +	unsigned seq;
>  
>  	rcu_read_lock();
> +restart:
> +	seq = read_seqbegin(&mm->mmu_notifier_lock);
>  	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
> -		if (mn->ops->invalidate_page)
> +		if (mn->ops->invalidate_page) {
> +			rcu_read_unlock();
>  			mn->ops->invalidate_page(mn, mm, address);

Ditto structure can vanish since no existence guarantee exists.


Re: [kvm-devel] [PATCH] 2/4 move all invalidate_page outside of PT lock (#v9 was 1/4)

2008-03-07 Thread Christoph Lameter

On Fri, 7 Mar 2008, Andrea Arcangeli wrote:

> This below simple patch invalidates the "invalidate_page" part, the
> next patch will invalidate the RCU part, and btw in a way that doesn't
> forbid unregistering the mmu notifiers at runtime (like your brand new
> EMM does).

Sounds good.

> The reason I keep this incremental (unlike your EMM that does
> everything all at the same time mixed in a single patch) is to
> decrease the non obviously safe mangling over mm/* during .25. The
> below patch is simple, but not as obviously safe as
> s/ptep_clear_flush/ptep_clear_flush_notify/.

There was never a chance to merge for .25. Let's drop that and focus on 
a solution that is good for all.

>  #endif /* _LINUX_MMU_NOTIFIER_H */
> diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
> --- a/mm/filemap_xip.c
> +++ b/mm/filemap_xip.c
> @@ -194,11 +194,13 @@ __xip_unmap (struct address_space * mapp
>  		if (pte) {
>  			/* Nuke the page table entry. */
>  			flush_cache_page(vma, address, pte_pfn(*pte));
> -			pteval = ptep_clear_flush_notify(vma, address, pte);
> +			pteval = ptep_clear_flush(vma, address, pte);
>  			page_remove_rmap(page, vma);
>  			dec_mm_counter(mm, file_rss);
>  			BUG_ON(pte_dirty(pteval));
>  			pte_unmap_unlock(pte, ptl);
> +			/* must invalidate_page _before_ freeing the page */
> +			mmu_notifier_invalidate_page(mm, address);
>  			page_cache_release(page);
>  		}
>  	}

Ok but we still hold the i_mmap_lock here.


> @@ -834,6 +846,8 @@ static void try_to_unmap_cluster(unsigne
>  	if (!pmd_present(*pmd))
>  		return;
>  
> +	start = address;
> +	mmu_notifier_invalidate_range_begin(mm, start, end);

Hmmm.. Okay, you are going for range invalidate here like EMM, but there are 
still some invalidate_page() calls left.



Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Andrea Arcangeli

On Fri, Mar 07, 2008 at 07:45:52PM +0100, Andrea Arcangeli wrote:
> On Fri, Mar 07, 2008 at 07:01:35PM +0100, Peter Zijlstra wrote:
> > The reason Christoph can do without RCU is because he doesn't allow
> > unregister, and as soon as you drop that you'll end up with something
> 
> Not sure I follow; what do you mean "he doesn't allow"? We'll also
> have to rip unregister regardless after you pointed out the ->release
> won't be called after calling my mmu_notifier_unregister in 3/4. If
> you figured out how to retain mmu_notifier_unregister I'm not seeing
> it anymore.

Given that I no longer see any other (buggy ;) way to retain
mmu_notifier_unregister, I did as EMM does and dropped the
unregister function.

To me it looks like this will be enough, and as efficient as the
expanded version in EMM that does not use the high-level hlist_rcu
macros. If you can see any pitfall, let me know! Thanks a lot for the
help.

--
This is a replacement for the previously posted 3/4, one of the pieces
to allow the mmu notifier methods to sleep.

Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -70,17 +70,6 @@ static inline int mm_has_notifiers(struc
  */
 extern void mmu_notifier_register(struct mmu_notifier *mn,
  struct mm_struct *mm);
-/*
- * Must hold the mmap_sem for write.
- *
- * RCU is used to traverse the list. A quiescent period needs to pass
- * before the "struct mmu_notifier" can be freed. Alternatively it
- * can be synchronously freed inside ->release when the list can't
- * change anymore and nobody could possibly walk it.
- */
-extern void mmu_notifier_unregister(struct mmu_notifier *mn,
-   struct mm_struct *mm);
-
 extern void __mmu_notifier_release(struct mm_struct *mm);
 extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
  unsigned long address);
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -43,12 +43,10 @@ int __mmu_notifier_clear_flush_young(str
 	struct hlist_node *n;
 	int young = 0;
 
-	rcu_read_lock();
 	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
 		if (mn->ops->clear_flush_young)
 			young |= mn->ops->clear_flush_young(mn, mm, address);
 	}
-	rcu_read_unlock();
 
 	return young;
 }
@@ -59,12 +57,10 @@ void __mmu_notifier_invalidate_page(stru
 	struct mmu_notifier *mn;
 	struct hlist_node *n;
 
-	rcu_read_lock();
 	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
 		if (mn->ops->invalidate_page)
 			mn->ops->invalidate_page(mn, mm, address);
 	}
-	rcu_read_unlock();
 }
 
 void __mmu_notifier_invalidate_range_begin(struct mm_struct *mm,
@@ -73,12 +69,10 @@ void __mmu_notifier_invalidate_range_beg
 	struct mmu_notifier *mn;
 	struct hlist_node *n;
 
-	rcu_read_lock();
 	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
 		if (mn->ops->invalidate_range_begin)
 			mn->ops->invalidate_range_begin(mn, mm, start, end);
 	}
-	rcu_read_unlock();
 }
 
 void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
@@ -87,12 +81,10 @@ void __mmu_notifier_invalidate_range_end
 	struct mmu_notifier *mn;
 	struct hlist_node *n;
 
-	rcu_read_lock();
 	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
 		if (mn->ops->invalidate_range_end)
 			mn->ops->invalidate_range_end(mn, mm, start, end);
 	}
-	rcu_read_unlock();
 }
 
 /*
@@ -106,9 +98,3 @@ void mmu_notifier_register(struct mmu_no
 	hlist_add_head_rcu(&mn->hlist, &mm->mmu_notifier_list);
 }
 EXPORT_SYMBOL_GPL(mmu_notifier_register);
-
-void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm)
-{
-	hlist_del_rcu(&mn->hlist);
-}
-EXPORT_SYMBOL_GPL(mmu_notifier_unregister);


Re: [kvm-devel] [PATCH 1/2] [PATCH] allow machine_crash_shutdown to be replaced

2008-03-07 Thread Glauber Costa

Avi Kivity wrote:
> Glauber Costa wrote:
>> This patch allows machine_crash_shutdown to
>> be replaced, just like any of the other functions
>> in machine_ops
>>
>>   
> er, against what tree is this?  doesn't apply to kvm.git.
> 
It'd be kvm.git with the machine_ops non-static functions patch.
However, as it turned out, we only used one of the functions instead of 
all of them. If Ingo prefers, we can revert that patch and come up with 
a new one that just exposes the function we're actually using. I can 
then route it through kvm.git entirely, instead of x86.
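
The kind of replacement being discussed can be pictured as a table of function pointers whose entries are swapped for wrappers that do one extra step and then call the default. The sketch below is a cut-down user-space illustration under that assumption: `machine_ops_sketch`, the `*_sketch` hooks, `install_overrides` and the simplified `void (*)(void)` signatures are all hypothetical and are not the kernel's definitions.

```c
/* Hypothetical, cut-down stand-in for a machine_ops-style table. */
struct machine_ops_sketch {
	void (*shutdown)(void);
	void (*crash_shutdown)(void);
};

static int clock_enabled = 1;	/* illustrative state only */
static int native_calls;	/* counts calls into the default handlers */

static void native_shutdown(void)       { native_calls++; }
static void native_crash_shutdown(void) { native_calls++; }

static struct machine_ops_sketch ops = {
	.shutdown       = native_shutdown,
	.crash_shutdown = native_crash_shutdown,
};

/* Replacement hooks: do the extra step, then fall through to the default. */
static void kvm_shutdown_sketch(void)
{
	clock_enabled = 0;
	native_shutdown();
}

static void kvm_crash_shutdown_sketch(void)
{
	clock_enabled = 0;
	native_crash_shutdown();
}

/* Install the overrides; this needs the default handlers to be visible. */
static void install_overrides(void)
{
	ops.shutdown = kvm_shutdown_sketch;
	ops.crash_shutdown = kvm_crash_shutdown_sketch;
}
```

The wrappers can only chain to the defaults if those are not static to another file, which is why exposing just the function actually used is enough.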


Re: [kvm-devel] loop in copy_user_generic_string

2008-03-07 Thread Zdenek Kabelac

2008/3/5, Zdenek Kabelac <[EMAIL PROTECTED]>:
> 2008/3/5, Avi Kivity <[EMAIL PROTECTED]>:
>
> > Andi Kleen wrote:
>  >  > Avi Kivity <[EMAIL PROTECTED]> writes:
>  >  >
>  >  >> Most likely movs emulation is broken for long counts.  Please post a
>  >  >> disassembly of copy_user_generic_string to make sure we're looking at
>  >  >> the same code.
>  >  >>
>  >  >
>  >  > Be careful -- this code is patched at runtime and what you
>  >  > see in the vmlinux is not necessarily the same that is executed
>  >  >
>  >  >
>  >
>  >
>  > If the disassembled instruction isn't marked as an alternative in the
>  >  source, then it can't be patched, right?
>  >

Hello

Any progress on this? It looks like I hit this bug quite often when I test
device-mapper code.

Should I test something special?

I'm also seeing some problems with nfs - not yet tracked down, but I'd
like to get the bugs fixed one after another. Also, qemu-kvm coredumped
about two times - unfortunately it's compiled without debug info, so the
traceback was not really useful for making a report...

Zdenek


Re: [kvm-devel] [PATCH 1/2] [PATCH] allow machine_crash_shutdown to be replaced

2008-03-07 Thread Avi Kivity

Glauber Costa wrote:
> This patch allows machine_crash_shutdown to
> be replaced, just like any of the other functions
> in machine_ops
>
>   
er, against what tree is this?  doesn't apply to kvm.git.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


Re: [kvm-devel] Notifier for Externally Mapped Memory (EMM) V1

2008-03-07 Thread Andrea Arcangeli

On Wed, Mar 05, 2008 at 04:22:11PM -0800, Christoph Lameter wrote:
> + if (e->callback) {
> + x = e->callback(e, mm, op, start, end);
> + if (x)
> + return x;
[..]
> +
> + if (emm_notify(mm, emm_referenced, address, address + PAGE_SIZE))
> + referenced++;

This still has the same aging bug as in the RFC version.


Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Andrea Arcangeli

On Fri, Mar 07, 2008 at 07:01:35PM +0100, Peter Zijlstra wrote:
> The reason Christoph can do without RCU is because he doesn't allow
> unregister, and as soon as you drop that you'll end up with something

Not sure I follow; what do you mean "he doesn't allow"? We'll also
have to rip unregister regardless after you pointed out the ->release
won't be called after calling my mmu_notifier_unregister in 3/4. If
you figured out how to retain mmu_notifier_unregister I'm not seeing
it anymore.

> Curious problem indeed. Would it make sense to require registering these
> MMU notifiers when the process is still single threaded along with the
> requirement that they can never be removed again from a running process?

I'm afraid that won't help much (even if the mmu notifier users could
cope with that restriction, as KVM can) because the VM will run
concurrently on another CPU even though the task is single threaded. See
2/4 in try_to_unmap_cluster: _start/end are not only invoked in the
context of the current task.

PS. this problem I pointed out of _end possibly called before _begin
is the same for #v9 and EMM V1 as far as I can tell.


Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Andrea Arcangeli

On Fri, Mar 07, 2008 at 05:52:42PM +0100, Peter Zijlstra wrote:
> hlist_del_rcu(&mn->hlist)
> 
> > +   rcu_read_unlock();
> 
> kfree(mn);
> 
> > young |= mn->ops->clear_flush_young(mn, mm, address);
> 
> *BANG*

My objective was to allow mmu_notifier_register/unregister to be
called with the same mmu notifier object, I didn't mean the object
could have been freed until ->release is called. However you reminded
me that after unregistering ->release won't be called so unregister
isn't very useful and I doubt we can keep it ;).

In the meantime I've also been thinking that we could need the
write_seqlock in mmu_notifier_register, to know when to restart the
loop if somebody does a mmu_notifier_register;
synchronize_rcu(). Otherwise there's no way to be sure the mmu
notifier will start firing immediately after synchronize_rcu. I'm
unsure if it's acceptable that in-progress mmu notifier invocations,
don't need to notice the fact that somebody did mmu_notifier_register;
synchronize_rcu. If they don't need to notice, then we can just drop
unregister and all rcu_read_lock()s instead of adding write_seqlock to
the register operation.

Overall my effort is to try to avoid expanding the list walk with
explicit memory barriers like in EMM while trying to be equally
efficient.

Another issue is that the _begin/_end logic doesn't provide any
guarantee that the _begin will start firing before _end, if a kernel
module is loaded while another cpu is already running inside some
munmap operation etc.. The KVM usage of mmu notifier has no problem
with that detail, but KVM doesn't use _begin at all, I wonder if
others would have problems. This is a kind of a separate problem, but
quite related to the question if the notifiers must be guaranteed to
start firing immediately after mmu_notifier_unregister;synchronize_rcu
or not, that's why I mentioned it here.

Once I get comments on the suggested direction for these details, I'll
quickly repost a replacement patch for 3/4.

Thanks Peter!



Re: [kvm-devel] can KVM use all cpu cores in a guest

2008-03-07 Thread Alexey Eremenko

KVM fully uses all 4 CPUs.

1. Make sure KVM is actually in use (not plain QEMU emulation).
2. The guest workload needs to be multi-threaded.
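
As a quick check from inside the guest, a small program can report how many processors the guest OS sees online. This is a sketch that assumes a Linux guest with glibc, where `_SC_NPROCESSORS_ONLN` is available; `online_cpus` is a hypothetical helper name.

```c
#include <unistd.h>

/*
 * Number of processors currently online as seen by this OS instance.
 * Returns -1 if the value cannot be determined.
 */
static long online_cpus(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}
```

If a guest started with -smp 4 reports 4 here, the virtual CPUs are present, and host load then depends on whether the workload actually runs that many threads.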

-- 
-Alexey Eremenko "Technologov"


[kvm-devel] can KVM use all cpu cores in a guest

2008-03-07 Thread Martin Maurer

Hi all,

As far as I can see, I can configure 4 CPUs in a guest (using the -smp option),
but this seems to be only virtual, meaning the process on the host only uses
one physical CPU.

Background: I want to run just one guest, using the full CPU power of a host
(quad-core). Right now I only get 25% CPU load on the host.

Will this be possible with future versions?

Thanks,

Best Regards,

Martin


Re: [kvm-devel] [PATCH 0/6] Latest in kernel PIT patch

2008-03-07 Thread Avi Kivity

Yang, Sheng wrote:
> Hi
>
> Here is the latest in kernel PIT patch. Not much change from last edition. 
>
> One known issue: on a 2.6.9 PAE guest (e.g. RHEL4), you need the "clock=pit"
> kernel parameter to get the correct time. That's because the kernel is too
> active to "fix the lost interrupt" when PIT interrupts are pending... We may
> find a more elegant way to deal with it later.
>   

Thanks, all applied.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


Re: [kvm-devel] [PATCH] disable clock before rebooting.

2008-03-07 Thread Glauber Costa

Avi Kivity wrote:
> Glauber Costa wrote:
>> as for kexec, it uses precisely the shutdown function, doesn't it?
>>
>> Or is it crash_shutdown?
>> Humm, /me looks, and I think it's the latter, right?
>>
> 
> Only on crash-triggered kexecs.  It can also happen via sys_reboot().  
> Which, it appears, goes through machine_shutdown().
> 
yeah, it's already addressed in my new patches


Re: [kvm-devel] headersinstall of kvm.h does not work

2008-03-07 Thread Avi Kivity

Christian Borntraeger wrote:
> Am Freitag, 7. März 2008 schrieb Avi Kivity:
>   
>> As I'm about to disappear for a week, consider a patch to remove the 
>> config dependency and add asm-*/kvm.h pre-acked for mainline.  Maybe the 
>> presence of those empty asm-*/kvm.h files will encourage further kvm 
>> ports to *.
>> 
>
> Something like the following for all architectures?
>
>   

Yes.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


Re: [kvm-devel] [PATCH] disable clock before rebooting.

2008-03-07 Thread Avi Kivity

Glauber Costa wrote:
> as for kexec, it uses precisely the shutdown function, doesn't it?
>
> Or is it crash_shutdown?
> Humm, /me looks, and I think it's the latter, right?
>

Only on crash-triggered kexecs.  It can also happen via sys_reboot().  
Which, it appears, goes through machine_shutdown().

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


Re: [kvm-devel] [PATCH] disable clock before rebooting.

2008-03-07 Thread Avi Kivity

Glauber Costa wrote:
>>
>> Why not go all the way and to _restart the same way?
>>
> Because it takes a parameter, and handling it in the same macro would make
> my beautiful macros ugly.
> Using another macro just to pass the argument didn't seem justifiable to
> me, since there was only one of its kind.

Yes, of course.  My mistake.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


[kvm-devel] [PATCH 2/2] [PATCH] disable clock before rebooting.

2008-03-07 Thread Glauber Costa

This patch writes 0 to the system time MSR (what really matters is that
the LSB is cleared) before shutting down the machine for kexec.

Without it, a random memory location can be written to when the guest
comes back.

It overrides two machine_ops functions: shutdown, used in the path of
kernel_kexec() (sys.c), and crash_shutdown, used in the path of
crash_kexec() (kexec.c).

Signed-off-by: Glauber Costa <[EMAIL PROTECTED]>
---
 arch/x86/kernel/kvmclock.c |   23 +++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index f654a12..8b838d9 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -21,6 +21,7 @@ #include 
 #include 
 #include 
 #include 
+#include 
 
 #define KVM_SCALE 22
 
@@ -142,6 +143,26 @@ static void kvm_setup_secondary_clock(vo
setup_secondary_APIC_clock();
 }
 
+/*
+ * After the clock is registered, the host will keep writing to the
+ * registered memory location. If the guest happens to shutdown, this memory
+ * won't be valid. In cases like kexec, in which you install a new kernel, this
+ * means a random memory location will be kept being written. So before any
+ * kind of shutdown from our side, we unregister the clock by writing anything
+ * that does not have the 'enable' bit set in the msr
+ */
+static void kvm_crash_shutdown(struct pt_regs *regs)
+{
+   native_write_msr_safe(MSR_KVM_SYSTEM_TIME, 0, 0);
+   native_machine_crash_shutdown(regs);
+}
+
+static void kvm_shutdown(void)
+{
+   native_write_msr_safe(MSR_KVM_SYSTEM_TIME, 0, 0);
+   native_machine_shutdown();
+}
+
 void __init kvmclock_init(void)
 {
if (!kvm_para_available())
@@ -154,6 +175,8 @@ void __init kvmclock_init(void)
pv_time_ops.set_wallclock = kvm_set_wallclock;
pv_time_ops.sched_clock = kvm_clock_read;
pv_apic_ops.setup_secondary_clock = kvm_setup_secondary_clock;
+   machine_ops.shutdown  = kvm_shutdown;
+   machine_ops.crash_shutdown  = kvm_crash_shutdown;
clocksource_register(&kvm_clock);
}
 }
-- 
1.4.2
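
The failure mode this patch guards against can be modelled in user space. The sketch below is only an illustration under stated assumptions: it assumes the MSR value is a guest address with the enable flag in bit 0 (the patch only says that the LSB matters), and every name in it (model_guest, model_host_tick, model_kexec) is hypothetical rather than kernel or KVM API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MEM_SIZE   256u
#define MODEL_ENABLE_BIT 1ull   /* assumed: bit 0 of the MSR value enables the clock */

/* Hypothetical stand-in for one guest: its memory plus the last MSR value. */
struct model_guest {
    uint64_t system_time_msr;      /* last value the guest wrote to the MSR */
    uint64_t host_clock;           /* value the "host" publishes */
    uint8_t  mem[MODEL_MEM_SIZE];  /* guest memory */
};

/* Guest side: what native_write_msr_safe(MSR_KVM_SYSTEM_TIME, ...) stands for. */
void model_write_system_time_msr(struct model_guest *g, uint64_t value)
{
    g->system_time_msr = value;
}

/* Host side: publish the clock only while the enable bit is set.
 * Returns 1 if guest memory was written, 0 otherwise. */
int model_host_tick(struct model_guest *g)
{
    uint64_t addr;

    g->host_clock++;
    if (!(g->system_time_msr & MODEL_ENABLE_BIT))
        return 0;
    addr = g->system_time_msr & ~MODEL_ENABLE_BIT;
    assert(addr + sizeof(g->host_clock) <= MODEL_MEM_SIZE);
    memcpy(&g->mem[addr], &g->host_clock, sizeof(g->host_clock));
    return 1;
}

/* Model a kexec: the new kernel reuses all memory, then the host ticks once.
 * Returns nonzero if the new kernel's memory was overwritten. */
int model_kexec(struct model_guest *g, int disable_clock_first)
{
    size_t i;

    if (disable_clock_first)
        model_write_system_time_msr(g, 0);   /* what the patch adds */
    memset(g->mem, 0, sizeof(g->mem));       /* memory now belongs to the new kernel */
    model_host_tick(g);
    for (i = 0; i < sizeof(g->mem); i++)
        if (g->mem[i] != 0)
            return 1;
    return 0;
}
```

Calling model_kexec() with disable_clock_first set to 0 reports corruption once a clock has been registered, while calling it with 1 does not, which is the whole point of clearing the MSR before handing memory to another kernel.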



[kvm-devel] [PATCH 0/2] prevent memory corruption across reboots

2008-03-07 Thread Glauber Costa

Avi,

I narrowed the kexec paths that require overriding machine_ops down to two.
So here's a simpler version, which will probably be a better fit.

thanks




[kvm-devel] [PATCH 1/2] [PATCH] allow machine_crash_shutdown to be replaced

2008-03-07 Thread Glauber Costa

This patch allows machine_crash_shutdown to
be replaced, just like any of the other functions
in machine_ops.

Signed-off-by: Glauber Costa <[EMAIL PROTECTED]>
---
 arch/x86/kernel/crash.c  |3 ++-
 arch/x86/kernel/reboot.c |7 ++-
 include/asm-x86/reboot.h |1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 9a5fa0a..d262306 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -25,6 +25,7 @@ #include 
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_32
 #include 
@@ -121,7 +122,7 @@ static void nmi_shootdown_cpus(void)
 }
 #endif
 
-void machine_crash_shutdown(struct pt_regs *regs)
+void native_machine_crash_shutdown(struct pt_regs *regs)
 {
/* This function is only called after the system
 * has panicked or is otherwise in a critical state.
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 8b577f1..fa30d4e 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -447,7 +447,8 @@ struct machine_ops machine_ops = {
.shutdown = native_machine_shutdown,
.emergency_restart = native_machine_emergency_restart,
.restart = native_machine_restart,
-   .halt = native_machine_halt
+   .halt = native_machine_halt,
+   .crash_shutdown = native_machine_crash_shutdown,
 };
 
 void machine_power_off(void)
@@ -475,3 +476,7 @@ void machine_halt(void)
machine_ops.halt();
 }
 
+void machine_crash_shutdown(struct pt_regs *regs)
+{
+   machine_ops.crash_shutdown(regs);
+}
diff --git a/include/asm-x86/reboot.h b/include/asm-x86/reboot.h
index 2ea857c..53dcc12 100644
--- a/include/asm-x86/reboot.h
+++ b/include/asm-x86/reboot.h
@@ -22,5 +22,6 @@ void native_machine_shutdown(void);
 void native_machine_restart(char *__unused);
 void native_machine_halt(void);
 void native_machine_power_off(void);
+void native_machine_crash_shutdown(struct pt_regs *regs);
 
 #endif /* _ASM_REBOOT_H */
-- 
1.4.2
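
The indirection introduced here can be sketched outside the kernel. This is a minimal user-space model, not the kernel code: the model_ prefix, the struct model_regs type and the call counters are all hypothetical, and only the shape (a native default, an ops-table slot, and a thin dispatching wrapper) follows the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs. */
struct model_regs {
    unsigned long ip;
};

/* Hypothetical ops table with a single replaceable slot. */
struct model_machine_ops {
    void (*crash_shutdown)(struct model_regs *regs);
};

int model_native_crash_calls;   /* how often the native path ran */
int model_hook_calls;           /* how often the override ran */

/* The former machine_crash_shutdown(), renamed so it can become a default. */
void model_native_crash_shutdown(struct model_regs *regs)
{
    (void)regs;
    model_native_crash_calls++;
}

struct model_machine_ops model_machine_ops = {
    .crash_shutdown = model_native_crash_shutdown,
};

/* Callers keep using this name; it now only dispatches through the table. */
void model_machine_crash_shutdown(struct model_regs *regs)
{
    model_machine_ops.crash_shutdown(regs);
}

/* An override, as a paravirtualized guest might install: do extra work,
 * then fall through to the native implementation. */
void model_guest_crash_shutdown(struct model_regs *regs)
{
    model_hook_calls++;
    model_native_crash_shutdown(regs);
}
```

Assigning model_guest_crash_shutdown to model_machine_ops.crash_shutdown is all an override needs to do; existing callers of the wrapper are unaffected.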



[kvm-devel] [PATCH] 4/4 i_mmap_lock spinlock2rwsem (#v9 was 1/4)

2008-03-07 Thread Andrea Arcangeli

This is a rediff of Christoph's plain i_mmap_lock2rwsem patch on top
of #v9 1/4 + 2/4 + 3/4 (hence this is called 4/4). This is mostly to
show that after 3/4, any patch that plugs into the EMM patchset will
plug nicely on top of my MMU notifier patchset too.

The patch triggers a bug check here in modprobe:

BUG_ON(mm->nr_ptes > (FIRST_USER_ADDRESS+PMD_SIZE-1)>>PMD_SHIFT);

kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 252k freed
[ cut here ]
kernel BUG at mm/mmap.c:2063!
invalid opcode:  [1] SMP
CPU 0
Modules linked in:
Pid: 1123, comm: modprobe.sh Not tainted 2.6.25-rc3 #22
RIP: 0010:[]  [] exit_mmap+0xef/0xfa
RSP: :81003c79bed8  EFLAGS: 00010206
RAX:  RBX: 810001004840 RCX: 81003c79bee0
RDX:  RSI: 81003c5e8918 RDI: 81003d8048c0
RBP:  R08: 0008 R09: 810002c00040
R10: 0002 R11: 810001009180 R12: 81003c57b800
R13:  R14: 005f0db0 R15: 7fff3f2af234
FS:  7f283714b6f0() GS:80694000() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 00458f40 CR3: 00201000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process modprobe.sh (pid: 1123, threadinfo 81003c79a000, task 81003cf9ca50)
Stack:  0091 810001004840 81003c57b800 81003c57b880
  8022f7bf 0001 0001
 81003cf9ca50 802349b6 0292 80354c63
Call Trace:
 [] mmput+0x30/0x9d
 [] do_exit+0x223/0x66c
 [] __up_read+0x13/0x8a
 [] do_group_exit+0x6f/0x8a
 [] system_call_after_swapgs+0x7b/0x80


Code: 7b 18 e8 4a 5c 00 00 c7 43 08 00 00 00 00 eb 0b 48 89 ef e8 d1 fe ff ff 
48 89 c5 48 85 ed 75 f0 49 modprobe.sh[1114]: segfault at 0 ip 7f998d2e972b sp 
7fff959d8ed0 error 4 in libc-2.6.1.so[7f998d27b000+136000]


I didn't look into this, but it shows how risky it would be to make
this change in .25. It's a bit strange that the bug check triggers,
given that I have preempt disabled (I mean CONFIG_PREEMPT_VOLUNTARY=y;
nobody should turn off that config option), so even if code depended on
the implicit preempt_disable in spin_lock, no race should happen. The
down_read sections at first glance didn't seem capable of altering
nr_ptes, but I didn't look seriously into the above. I rediffed it
just to be 100% on par with EMM's sleep capabilities (while
retaining more features and cleaner code, I hope).

--
From: Christoph Lameter <[EMAIL PROTECTED]>
Subject: Conversion of i_mmap_lock to semaphore

Not there but the system boots and is usable. Complains about atomic
contexts because the tlb functions use a get_cpu() and thus disable preempt.

Not sure yet what to do about the cond_resched_lock stuff etc.


Convert i_mmap_lock to i_mmap_sem

The conversion to a rwsemaphore allows callbacks during rmap traversal
for files in a non atomic context. A rw style lock allows concurrent
walking of the reverse map.

Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>

---
 arch/x86/mm/hugetlbpage.c |4 ++--
 fs/hugetlbfs/inode.c  |4 ++--
 fs/inode.c|2 +-
 include/linux/fs.h|2 +-
 include/linux/mm.h|2 +-
 kernel/fork.c |4 ++--
 mm/filemap.c  |8 
 mm/filemap_xip.c  |4 ++--
 mm/fremap.c   |4 ++--
 mm/hugetlb.c  |   11 +--
 mm/memory.c   |   28 
 mm/migrate.c  |4 ++--
 mm/mmap.c |   16 
 mm/mremap.c   |4 ++--
 mm/rmap.c |   20 +---
 15 files changed, 51 insertions(+), 66 deletions(-)

diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -69,7 +69,7 @@ static void huge_pmd_share(struct mm_str
if (!vma_shareable(vma, addr))
return;
 
-   spin_lock(&mapping->i_mmap_lock);
+   down_read(&mapping->i_mmap_sem);
vma_prio_tree_foreach(svma, &iter, &mapping->i_mmap, idx, idx) {
if (svma == vma)
continue;
@@ -94,7 +94,7 @@ static void huge_pmd_share(struct mm_str
put_page(virt_to_page(spte));
spin_unlock(&mm->page_table_lock);
 out:
-   spin_unlock(&mapping->i_mmap_lock);
+   up_read(&mapping->i_mmap_sem);
 }
 
 /*
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -454,10 +454,10 @@ static int hugetlb_vmtruncate(struct ino
pgoff = offset >> PAGE_SHIFT;
 
i_size_write(inode, offset);
-   s

Re: [kvm-devel] [PATCH] 3/4 combine RCU with seqlock to allow mmu notifier methods to sleep (#v9 was 1/4)

2008-03-07 Thread Andrea Arcangeli

This combines the non-sleep-capable RCU locking of #v9 with a seqlock
so the mmu notifier fast path will require zero cacheline
writes/bouncing while still providing mmu_notifier_unregister and
allowing the mmu notifier methods to schedule. If we drop
mmu_notifier_unregister we can also drop all the seqlock and
rcu_read_lock() calls. But this combined locking scheme is sexy enough
and 100% scalable (the mmu_notifier_list cacheline will be preloaded
anyway, and that will most certainly include the sequence number value
in L1 for free, even on Christoph's NUMA systems), so IMHO it is worth
keeping mmu_notifier_unregister.

Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -230,6 +231,7 @@ struct mm_struct {
 #endif
 #ifdef CONFIG_MMU_NOTIFIER
struct hlist_head mmu_notifier_list;
+   seqlock_t mmu_notifier_lock;
 #endif
 };
 
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -130,6 +130,7 @@ static inline void mmu_notifier_mm_init(
 static inline void mmu_notifier_mm_init(struct mm_struct *mm)
 {
INIT_HLIST_HEAD(&mm->mmu_notifier_list);
+   seqlock_init(&mm->mmu_notifier_lock);
 }
 
 
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -20,7 +20,9 @@ void __mmu_notifier_release(struct mm_st
 void __mmu_notifier_release(struct mm_struct *mm)
 {
struct mmu_notifier *mn;
+   unsigned seq;
 
+   seq = read_seqbegin(&mm->mmu_notifier_lock);
while (unlikely(!hlist_empty(&mm->mmu_notifier_list))) {
mn = hlist_entry(mm->mmu_notifier_list.first,
 struct mmu_notifier,
@@ -28,6 +30,7 @@ void __mmu_notifier_release(struct mm_st
hlist_del(&mn->hlist);
if (mn->ops->release)
mn->ops->release(mn, mm);
+   BUG_ON(read_seqretry(&mm->mmu_notifier_lock, seq));
}
 }
 
@@ -42,11 +45,19 @@ int __mmu_notifier_clear_flush_young(str
struct mmu_notifier *mn;
struct hlist_node *n;
int young = 0;
+   unsigned seq;
 
rcu_read_lock();
+restart:
+   seq = read_seqbegin(&mm->mmu_notifier_lock);
hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
-   if (mn->ops->clear_flush_young)
+   if (mn->ops->clear_flush_young) {
+   rcu_read_unlock();
young |= mn->ops->clear_flush_young(mn, mm, address);
+   rcu_read_lock();
+   }
+   if (read_seqretry(&mm->mmu_notifier_lock, seq))
+   goto restart;
}
rcu_read_unlock();
 
@@ -58,11 +69,19 @@ void __mmu_notifier_invalidate_page(stru
 {
struct mmu_notifier *mn;
struct hlist_node *n;
+   unsigned seq;
 
rcu_read_lock();
+restart:
+   seq = read_seqbegin(&mm->mmu_notifier_lock);
hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
-   if (mn->ops->invalidate_page)
+   if (mn->ops->invalidate_page) {
+   rcu_read_unlock();
mn->ops->invalidate_page(mn, mm, address);
+   rcu_read_lock();
+   }
+   if (read_seqretry(&mm->mmu_notifier_lock, seq))
+   goto restart;
}
rcu_read_unlock();
 }
@@ -72,11 +91,19 @@ void __mmu_notifier_invalidate_range_beg
 {
struct mmu_notifier *mn;
struct hlist_node *n;
+   unsigned seq;
 
rcu_read_lock();
+restart:
+   seq = read_seqbegin(&mm->mmu_notifier_lock);
hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
-   if (mn->ops->invalidate_range_begin)
+   if (mn->ops->invalidate_range_begin) {
+   rcu_read_unlock();
mn->ops->invalidate_range_begin(mn, mm, start, end);
+   rcu_read_lock();
+   }
+   if (read_seqretry(&mm->mmu_notifier_lock, seq))
+   goto restart;
}
rcu_read_unlock();
 }
@@ -86,11 +113,19 @@ void __mmu_notifier_invalidate_range_end
 {
struct mmu_notifier *mn;
struct hlist_node *n;
+   unsigned seq;
 
rcu_read_lock();
+restart:
+   seq = read_seqbegin(&mm->mmu_notifier_lock);
hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier_list, hlist) {
-   if (mn->ops->invalidate_range_end)
+   if (mn->ops->invalidate_range_end) {
+   rcu_read_unlock();
mn->ops->invalidate_range_end(mn, mm, start, end);
+   rcu_read_lock();
+
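
The restart logic in the hunks above can be illustrated with a small single-threaded model. It is a sketch under loud assumptions: it uses C11 atomics instead of the kernel seqlock, leaves RCU out entirely, stores the notifiers in a fixed array rather than an hlist, and all model_ names are hypothetical. What it does show is the technique the patch relies on: sample the sequence number, run a callback, and restart the walk if the list changed in the meantime.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_MAX_NOTIFIERS 4

struct model_notifier_list;
typedef void (*model_notifier_fn)(struct model_notifier_list *list,
                                  unsigned long address);

struct model_notifier_list {
    atomic_uint seq;                          /* even: stable, odd: being modified */
    model_notifier_fn fn[MODEL_MAX_NOTIFIERS];
    size_t count;                             /* registered notifiers */
    unsigned calls;                           /* callback invocations so far */
    unsigned restarts;                        /* walks restarted so far */
};

/* Reader side: wait for an even (stable) sequence number and return it. */
unsigned model_read_seqbegin(struct model_notifier_list *list)
{
    unsigned seq;

    do {
        seq = atomic_load(&list->seq);
    } while (seq & 1u);
    return seq;
}

/* Reader side: nonzero if a writer ran since model_read_seqbegin(). */
int model_read_seqretry(struct model_notifier_list *list, unsigned seq)
{
    return atomic_load(&list->seq) != seq;
}

/* Writer side: drop the last notifier inside a write section. */
void model_unregister_last(struct model_notifier_list *list)
{
    assert(list->count > 0);
    atomic_fetch_add(&list->seq, 1);          /* odd: modification in progress */
    list->count--;
    list->fn[list->count] = NULL;
    atomic_fetch_add(&list->seq, 1);          /* even again: list is stable */
}

/* Walk the list; restart from the top whenever a callback changed it. */
void model_invalidate_page(struct model_notifier_list *list,
                           unsigned long address)
{
    unsigned seq;
    size_t i;

restart:
    seq = model_read_seqbegin(list);
    for (i = 0; i < list->count; i++) {
        list->calls++;
        list->fn[i](list, address);
        if (model_read_seqretry(list, seq)) {
            list->restarts++;
            goto restart;
        }
    }
}

/* Callback that unregisters the last notifier while more than one is left. */
void model_cb_unregister_once(struct model_notifier_list *list,
                              unsigned long address)
{
    (void)address;
    if (list->count > 1)
        model_unregister_last(list);
}

void model_cb_noop(struct model_notifier_list *list, unsigned long address)
{
    (void)list;
    (void)address;
}

/* Set up two notifiers and run one invalidation over them. */
void model_demo(struct model_notifier_list *list)
{
    size_t i;

    atomic_init(&list->seq, 0);
    for (i = 0; i < MODEL_MAX_NOTIFIERS; i++)
        list->fn[i] = NULL;
    list->fn[0] = model_cb_unregister_once;
    list->fn[1] = model_cb_noop;
    list->count = 2;
    list->calls = 0;
    list->restarts = 0;
    model_invalidate_page(list, 0x1000ul);
}
```

One consequence visible in the model is that a restart calls already-visited notifiers a second time, so the methods have to tolerate repeated invalidations of the same address.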

[kvm-devel] [PATCH] 2/4 move all invalidate_page outside of PT lock (#v9 was 1/4)

2008-03-07 Thread Andrea Arcangeli

On Tue, Mar 04, 2008 at 02:35:21PM -0800, Christoph Lameter wrote:
> It is the atomic dead end that we want to avoid. And your patch is exactly 
> that. Both the invalidate_page and the RCU locks us into this.

I preferred to answer with code to avoid any possible misunderstanding
(I thought I had already explained this in words, and I obviously failed
miserably if you ended up writing such an erratic, weird claim as the
one above ;).

The simple patch below invalidates the "invalidate_page" part; the
next patch will invalidate the RCU part, and, by the way, in a way that
doesn't forbid unregistering the mmu notifiers at runtime (as your brand
new EMM does).

This is incremental to my #v9. I still ask Andrew/Linus to merge the
#v9 patch I posted a few days ago into .25, so that KVM/GRU will be 100%
covered in an optimal way in all respects and with maximum flexibility
for future changes of the API (to allow for future methods that may take
more than start,end; this was pointed out once by both me and Avi). My
#v9 is zero risk for .25 and it is surely worth merging now.

Then in .26 we'll modify the semantics of the API to be blocking,
starting with the patch below. This is a kernel _internal_ API, and
we aren't distributions that have to respect kabi here, but even if we
were, making methods sleepable is a 100% backwards compatible
semantic change, so there's no possible reason to defer the #v9
merging. The changes in .26 will be transparent to any user (even
though they don't need to be), and even if we turn out to be totally
wrong about .26 requiring a minor change of API, everything will be
perfectly fine. Nothing of this is visible to userland, so we can change
it at any time as we wish.

The reason I keep this incremental (unlike your EMM, which does
everything at the same time, mixed in a single patch) is to
decrease the non-obviously-safe mangling of mm/* during .25. The
patch below is simple, but not as obviously safe as
s/ptep_clear_flush/ptep_clear_flush_notify/.

Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]>

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -134,27 +134,6 @@ static inline void mmu_notifier_mm_init(

-#define ptep_clear_flush_notify(__vma, __address, __ptep)  \
-({ \
-   pte_t __pte;\
-   struct vm_area_struct *___vma = __vma;  \
-   unsigned long ___address = __address;   \
-   __pte = ptep_clear_flush(___vma, ___address, __ptep);   \
-   mmu_notifier_invalidate_page(___vma->vm_mm, ___address);\
-   __pte;  \
-})
-
-#define ptep_clear_flush_young_notify(__vma, __address, __ptep)    \
-({ \
-   int __young;\
-   struct vm_area_struct *___vma = __vma;  \
-   unsigned long ___address = __address;   \
-   __young = ptep_clear_flush_young(___vma, ___address, __ptep);   \
-   __young |= mmu_notifier_clear_flush_young(___vma->vm_mm,\
- ___address);  \
-   __young;\
-})
-
 #else /* CONFIG_MMU_NOTIFIER */

 static inline void mmu_notifier_release(struct mm_struct *mm)
@@ -186,9 +165,6 @@ static inline void mmu_notifier_mm_init(
 {
 }

-#define ptep_clear_flush_young_notify ptep_clear_flush_young
-#define ptep_clear_flush_notify ptep_clear_flush
-
 #endif /* CONFIG_MMU_NOTIFIER */

 #endif /* _LINUX_MMU_NOTIFIER_H */
diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
--- a/mm/filemap_xip.c
+++ b/mm/filemap_xip.c
@@ -194,11 +194,13 @@ __xip_unmap (struct address_space * mapp
if (pte) {
/* Nuke the page table entry. */
flush_cache_page(vma, address, pte_pfn(*pte));
-   pteval = ptep_clear_flush_notify(vma, address, pte);
+   pteval = ptep_clear_flush(vma, address, pte);
page_remove_rmap(page, vma);
dec_mm_counter(mm, file_rss);
BUG_ON(pte_dirty(pteval));
pte_unmap_unlock(pte, ptl);
+   /* must invalidate_page _before_ freeing the page */
+   mmu_notifier_invalidate_page(mm, address);
page_cache_release(page);
}
}
diff --git a/mm/memory.c b/mm/memory.c
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1626,9 +1626,10 @@ static int do_wp_page(struct mm_struct *
 */
page_table = pte_offset_map_lock(mm, pmd, address,
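
The ordering that the __xip_unmap hunk establishes (clear the pte under the page table lock, drop the lock, notify, and only then release the page) can be checked with a tiny state model. This is a hedged sketch, not kernel code: the struct, its flags and the model_ functions are hypothetical, and the "secondary mapping" merely stands in for whatever a notifier user, such as a shadow page table, still holds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state of one page while it is being unmapped. */
struct model_unmap_state {
    int pt_lock_held;         /* page table lock currently taken */
    int pte_present;          /* primary mapping still installed */
    int secondary_mapping;    /* mapping held by a notifier user */
    int page_freed;
    int notified_under_lock;  /* violation: callback ran in atomic context */
    int freed_while_mapped;   /* violation: page freed with a mapping left */
};

/* Stand-in for mmu_notifier_invalidate_page(): tears down the secondary
 * mapping and records whether it was called with the lock held. */
void model_notify_invalidate_page(struct model_unmap_state *s)
{
    if (s->pt_lock_held)
        s->notified_under_lock = 1;
    s->secondary_mapping = 0;
}

/* Stand-in for page_cache_release(): records a use-after-free hazard. */
void model_release_page(struct model_unmap_state *s)
{
    if (s->pte_present || s->secondary_mapping)
        s->freed_while_mapped = 1;
    s->page_freed = 1;
}

/* The order used by the patched __xip_unmap(). */
void model_unmap(struct model_unmap_state *s)
{
    s->pt_lock_held = 1;               /* take the page table lock */
    s->pte_present = 0;                /* ptep_clear_flush() */
    s->pt_lock_held = 0;               /* pte_unmap_unlock() */
    model_notify_invalidate_page(s);   /* no lock held, so it may sleep */
    model_release_page(s);             /* only now can the page go away */
}
```

Swapping the last two calls in model_unmap() sets freed_while_mapped, and moving the notification above the unlock sets notified_under_lock, which are the two mistakes the reordering is meant to rule out.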

Re: [kvm-devel] KVM and WinPE failure

2008-03-07 Thread Ryan Harper

* [EMAIL PROTECTED] <[EMAIL PROTECTED]> [2008-03-06 18:03]:
> On Thu, Mar 6, 2008 at 6:19 PM, Ryan Harper <[EMAIL PROTECTED]> wrote:
> 
> >
> > (dethklok) kvm-63 % ./kvm_stat
> > Please mount debugfs ('mount -t debugfs debugfs /sys/kernel/debug')
> > and ensure the kvm modules are loaded
> > (dethklok) kvm-61 % sudo mount -t debugfs debugfs /sys/kernel/debug
> >
> > then run ./kvm_stat again
> >
> >
> [EMAIL PROTECTED] ~ $ lsmod | fgrep kvm
> kvm_intel  34272  1
> kvm   138120  1 kvm_intel
> [EMAIL PROTECTED] ~ $ mount | fgrep debug
> debugfs on /sys/kernel/debug type debugfs (rw)
> [EMAIL PROTECTED] ~ $ kvm_stat
> Please mount debugfs ('mount -t debugfs debugfs /sys/kernel/debug')
> and ensure the kvm modules are loaded
> 
> No joy. Suggestions?

What's the error then?  You should see the stats with all zeros until
you start your guest.  Something like:

kvm statistics

 efer_reload                  0          0
 exits                        0          0
 fpu_reload                   0          0
 halt_exits                   0          0
 halt_wakeup                  0          0
 host_state_reload            0          0
 insn_emulation               0          0
 insn_emulation_fail          0          0
 invlpg                       0          0
 io_exits                     0          0
 irq_exits                    0          0
 irq_window                   0          0
 mmio_exits                   0          0
 mmu_cache_miss               0          0
 mmu_flooded                  0          0
 mmu_pde_zapped               0          0
 mmu_pte_updated              0          0
 mmu_pte_write                0          0
 mmu_recycled                 0          0
 mmu_shadow_zapped            0          0
 pf_fixed                     0          0
 pf_guest                     0          0
 remote_tlb_flush             0          0
 request_irq                  0          0
 signal_exits                 0          0
 tlb_flush                    0          0

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
[EMAIL PROTECTED]
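
When the modules are loaded and debugfs is mounted but the tool still complains, one thing that can be ruled out is whether the invoking user can actually see the statistics directory. The helper below is a minimal sketch for that check. It assumes the counters live under /sys/kernel/debug/kvm and that kvm_stat looks there, which the thread does not confirm; check_stats_dir() and its enum are hypothetical names, while stat() and access() are the standard POSIX calls.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

enum stats_dir_status {
    STATS_DIR_OK,          /* exists, is a directory, readable and searchable */
    STATS_DIR_MISSING,     /* no such path (module not loaded, fs not mounted) */
    STATS_DIR_NOT_DIR,     /* path exists but is not a directory */
    STATS_DIR_NO_ACCESS    /* exists, but this user may not read it */
};

/* Classify why a statistics directory may be unusable for the caller. */
enum stats_dir_status check_stats_dir(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return errno == EACCES ? STATS_DIR_NO_ACCESS : STATS_DIR_MISSING;
    if (!S_ISDIR(st.st_mode))
        return STATS_DIR_NOT_DIR;
    if (access(path, R_OK | X_OK) != 0)
        return STATS_DIR_NO_ACCESS;
    return STATS_DIR_OK;
}
```

Running check_stats_dir("/sys/kernel/debug/kvm") once as the normal user and once as root would show whether the problem is a missing directory or just permissions.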


Re: [kvm-devel] [PATCH] disable clock before rebooting.

2008-03-07 Thread Glauber Costa

Avi Kivity wrote:
> Glauber Costa wrote:
>> This patch writes 0 (actually, what really matters is that the
>> LSB is cleared) to the system time msr before rebooting/shutting down
>> the machine.
>>
>> Without it, we can have a random memory location being written
>> when the guest comes back
>>  if (!kvm_para_available())
>> @@ -154,6 +181,11 @@ void __init kvmclock_init(void)
>>  pv_time_ops.set_wallclock = kvm_set_wallclock;
>>  pv_time_ops.sched_clock = kvm_clock_read;
>>  pv_apic_ops.setup_secondary_clock = kvm_setup_secondary_clock;
>> +machine_ops.emergency_restart = kvm_emergency_restart;
>> +machine_ops.shutdown  = kvm_shutdown;
>> +machine_ops.restart  = kvm_restart;
>> +machine_ops.halt  = kvm_halt;
>> +machine_ops.power_off  = kvm_power_off;
>>  clocksource_register(&kvm_clock);
>>  }
>>  }
>>   
> 
> Oh, I think that these are all unnecessary.  You need to stop the clock 
> only if the memory it uses will be reused.  Halt, shutdown and poweroff 
> clearly don't.  Resets need to go through the host anyway, since they 
> can be invoked without the guest knowing about it.

power off, agreed.
halt, it doesn't really do anything anyway in reboot.c, and is here just 
for "future completeness".

> The only case I can think of where we need to stop the clock is kexec.
> 
as for kexec, it uses precisely the shutdown function, doesn't it?

Or is it crash_shutdown?
Humm, /me looks, and I think it's the latter, right?

If this is true and resets go through the host, then we don't even
need the header exports. Ingo'll be happy.


Re: [kvm-devel] [PATCH] disable clock before rebooting.

2008-03-07 Thread Glauber Costa

Avi Kivity wrote:
> Glauber Costa wrote:
>> This patch writes 0 (actually, what really matters is that the
>> LSB is cleared) to the system time msr before rebooting/shutting down
>> the machine.
>>
>> Without it, we can have a random memory location being written
>> when the guest comes back
>>
>> Signed-off-by: Glauber Costa <[EMAIL PROTECTED]>
>> ---
>>  arch/x86/kernel/kvmclock.c |   32 
>>  1 files changed, 32 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>> index f654a12..5c9ff8d 100644
>> --- a/arch/x86/kernel/kvmclock.c
>> +++ b/arch/x86/kernel/kvmclock.c
>> @@ -21,6 +21,7 @@ #include 
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  
>>  #define KVM_SCALE 22
>>  
>> @@ -142,6 +143,32 @@ static void kvm_setup_secondary_clock(vo
>>  setup_secondary_APIC_clock();
>>  }
>>  
>> +/*
>> + * After the clock is registered, the host will keep writing to the
>> + * registered memory location. If the guest happens to shutdown, or restart,
>> + * this memory won't be valid. In cases like kexec, in which you install a new kernel,
>> + * this will mean a random memory location will be kept being written. So before
>> + * any kind of shutdown from our side, we unregister the clock by writing anything
>> + * that does not have the 'enable' bit set in the msr
>> + */
>> +static void kvm_restart(char *unused) {
>>   
> 
> This looks like a struct, with the { sitting there on the end.
my bad.

>> +native_write_msr_safe(MSR_KVM_SYSTEM_TIME, 0, 0);
>> +native_machine_restart(unused);
>> +}
>> +
>> +/* Forgive me dear lord, for my laziness */
>> +#define kvm_reboot_fn(x) \
>> +static void kvm_##x(void) { \
>> +native_write_msr_safe(MSR_KVM_SYSTEM_TIME, 0, 0); \
>> +native_machine_##x(); \
>> +}
>> +
>> +kvm_reboot_fn(emergency_restart)
>> +kvm_reboot_fn(shutdown)
>> +kvm_reboot_fn(halt)
>> +kvm_reboot_fn(power_off)
>> +#undef kvm_reboot_fn
>> +
>>   
> 
> Why not go all the way and to _restart the same way?
> 
Because it takes a parameter, and handling it in the same macro would make
my beautiful macros ugly.
Using another macro just to pass the argument didn't seem justifiable to me,
since there was only one of its kind.
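
For comparison, here is roughly what a single macro covering both the void and the one-argument case could look like. It is a user-space sketch, not a proposal for the patch: the model_ functions and counters are hypothetical stand-ins for the MSR write and the native_machine_*() calls, and the extra parameter-list and argument-list arguments are exactly the added noise being discussed.

```c
#include <assert.h>
#include <string.h>

int model_clock_disabled;          /* stand-in for the MSR write having happened */
int model_native_calls;            /* how many native handlers ran */
const char *model_last_restart_cmd;

void model_disable_clock(void)
{
    model_clock_disabled = 1;
}

void model_native_shutdown(void)
{
    model_native_calls++;
}

void model_native_restart(char *cmd)
{
    model_native_calls++;
    model_last_restart_cmd = cmd;
}

/* One macro for every wrapper: the caller spells out the parameter list
 * and the matching argument list, each wrapped in parentheses. */
#define MODEL_REBOOT_FN(name, params, args)  \
void model_kvm_##name params                 \
{                                            \
    model_disable_clock();                   \
    model_native_##name args;                \
}

MODEL_REBOOT_FN(shutdown, (void), ())
MODEL_REBOOT_FN(restart, (char *cmd), (cmd))
#undef MODEL_REBOOT_FN
```

Every use site now carries two extra parenthesized lists, which supports the point made in the reply: for a single odd function, writing it out by hand is easier to read.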


Re: [kvm-devel] headersinstall of kvm.h does not work

2008-03-07 Thread Christian Borntraeger

Am Freitag, 7. März 2008 schrieb Avi Kivity:
> As I'm about to disappear for a week, consider a patch to remove the 
> config dependency and add asm-*/kvm.h pre-acked for mainline.  Maybe the 
> presence of those empty asm-*/kvm.h files will encourage further kvm 
> ports to *.

Something like the following for all architectures?

--- linux-2.6.orig/include/asm-alpha/Kbuild
+++ linux-2.6/include/asm-alpha/Kbuild
@@ -1,6 +1,7 @@
 include include/asm-generic/Kbuild.asm

 header-y += gentrap.h
+header-y += kvm.h
 header-y += regdef.h
 header-y += pal.h
 header-y += reg.h
Index: linux-2.6/include/asm-alpha/kvm.h
===
--- /dev/null
+++ linux-2.6/include/asm-alpha/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_ALPHA_H
+#define __LINUX_KVM_ALPHA_H
+
+/* alpha does not support KVM */
+
+#endif


Re: [kvm-devel] headersinstall of kvm.h does not work

2008-03-07 Thread Avi Kivity

Christian Borntraeger wrote:
> Hello Avi,
>
> in commit fb56dbb31c4738a3918db81fd24da732ce3b4ae6 you changed 
> include/linux/Kbuild:
> snip
> KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM
> Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
> includes, not existing.  Rather than add a zillion <asm/kvm.h>s, export
> kvm.h only if the arch actually supports it.
> [...]
>  unifdef-y += keyboard.h
> -unifdef-y += kvm.h
> +unifdef-$(CONFIG_HAVE_KVM) += kvm.h
>  unifdef-y += llc.h
>  unifdef-y += loop.h
> snip--
>
> This patch does not work. Kbuild (scripts/Makefile.headersinst) does not 
> check the config file, so kvm.h is never installed.
>
> Sam is there an easy way to allow constructs like "unifdef-$(CONFIG_FOO)"?
>   

I think this cleverness has caused too much trouble already, and adding 
asm-*/kvm.h would have been better.

As I'm about to disappear for a week, consider a patch to remove the 
config dependency and add asm-*/kvm.h pre-acked for mainline.  Maybe the 
presence of those empty asm-*/kvm.h files will encourage further kvm 
ports to *.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


[kvm-devel] [PATCH 4/6] KVM: Add save/restore supporting of in kernel PIT

2008-03-07 Thread Yang, Sheng

From a16353c6fd51d5057d748198ba24d272be73d86b Mon Sep 17 00:00:00 2001
From: Sheng Yang <[EMAIL PROTECTED]>
Date: Tue, 4 Mar 2008 00:50:59 +0800
Subject: [PATCH] KVM: Add save/restore supporting of in kernel PIT


Signed-off-by: Sheng Yang <[EMAIL PROTECTED]>
---
 arch/x86/kvm/i8254.c  |7 +++
 arch/x86/kvm/i8254.h  |1 +
 arch/x86/kvm/x86.c|   48 
 include/asm-x86/kvm.h |   21 +
 include/linux/kvm.h   |2 ++
 5 files changed, 79 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 1031901..7776f50 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -286,6 +286,13 @@ static void pit_load_count(struct kvm *kvm, int channel, u32 val)
}
 }

+void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val)
+{
+   mutex_lock(&kvm->arch.vpit->pit_state.lock);
+   pit_load_count(kvm, channel, val);
+   mutex_unlock(&kvm->arch.vpit->pit_state.lock);
+}
+
 static void pit_ioport_write(struct kvm_io_device *this,
 gpa_t addr, int len, const void *data)
 {
diff --git a/arch/x86/kvm/i8254.h b/arch/x86/kvm/i8254.h
index 38184d5..586bbf0 100644
--- a/arch/x86/kvm/i8254.h
+++ b/arch/x86/kvm/i8254.h
@@ -54,6 +54,7 @@ struct kvm_pit {

 void kvm_inject_pit_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_pit_timer_intr_post(struct kvm_vcpu *vcpu, int vec);
+void kvm_pit_load_count(struct kvm *kvm, int channel, u32 val);
 struct kvm_pit *kvm_create_pit(struct kvm *kvm);
 void kvm_free_pit(struct kvm *kvm);

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 421b2b5..5339ab1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1501,6 +1501,23 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
return r;
 }

+static int kvm_vm_ioctl_get_pit(struct kvm *kvm, struct kvm_pit_state *ps)
+{
+   int r = 0;
+
+   memcpy(ps, &kvm->arch.vpit->pit_state, sizeof(struct kvm_pit_state));
+   return r;
+}
+
+static int kvm_vm_ioctl_set_pit(struct kvm *kvm, struct kvm_pit_state *ps)
+{
+   int r = 0;
+
+   memcpy(&kvm->arch.vpit->pit_state, ps, sizeof(struct kvm_pit_state));
+   kvm_pit_load_count(kvm, 0, ps->channels[0].count);
+   return r;
+}
+
 /*
  * Get (and clear) the dirty memory log for a memory slot.
  */
@@ -1654,6 +1671,37 @@ long kvm_arch_vm_ioctl(struct file *filp,
r = 0;
break;
}
+   case KVM_GET_PIT: {
+   struct kvm_pit_state ps;
+   r = -EFAULT;
+   if (copy_from_user(&ps, argp, sizeof ps))
+   goto out;
+   r = -ENXIO;
+   if (!kvm->arch.vpit)
+   goto out;
+   r = kvm_vm_ioctl_get_pit(kvm, &ps);
+   if (r)
+   goto out;
+   r = -EFAULT;
+   if (copy_to_user(argp, &ps, sizeof ps))
+   goto out;
+   r = 0;
+   break;
+   }
+   case KVM_SET_PIT: {
+   struct kvm_pit_state ps;
+   r = -EFAULT;
+   if (copy_from_user(&ps, argp, sizeof ps))
+   goto out;
+   r = -ENXIO;
+   if (!kvm->arch.vpit)
+   goto out;
+   r = kvm_vm_ioctl_set_pit(kvm, &ps);
+   if (r)
+   goto out;
+   r = 0;
+   break;
+   }
default:
;
}
diff --git a/include/asm-x86/kvm.h b/include/asm-x86/kvm.h
index 7a71120..12b4b25 100644
--- a/include/asm-x86/kvm.h
+++ b/include/asm-x86/kvm.h
@@ -188,4 +188,25 @@ struct kvm_cpuid2 {
struct kvm_cpuid_entry2 entries[0];
 };

+/* for KVM_GET_PIT and KVM_SET_PIT */
+struct kvm_pit_channel_state {
+   __u32 count; /* can be 65536 */
+   __u16 latched_count;
+   __u8 count_latched;
+   __u8 status_latched;
+   __u8 status;
+   __u8 read_state;
+   __u8 write_state;
+   __u8 write_latch;
+   __u8 rw_mode;
+   __u8 mode;
+   __u8 bcd;
+   __u8 gate;
+   __s64 count_load_time;
+};
+
+struct kvm_pit_state {
+   struct kvm_pit_channel_state channels[3];
+};
+
 #endif
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index cefa9a2..a2f3274 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -260,6 +260,8 @@ struct kvm_vapic_addr {
 #define KVM_GET_IRQCHIP  _IOWR(KVMIO, 0x62, struct kvm_irqchip)
 #define KVM_SET_IRQCHIP  _IOR(KVMIO,  0x63, struct kvm_irqchip)
 #define KVM_CREATE_PIT   _IO(KVMIO,  0x64)
+#define KVM_GET_PIT  _IOWR(KVMIO, 0x65, struct kvm_pit_state)
+#define KVM_SET_PIT  _IOR(KVMIO,  0x66, struct kvm_pit_state)

 /*
  * ioctls for vcpu fds
--
debian.1.5.3.7.1-dirty
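
The "can be 65536" comment on the count field explains why it is a __u32 even though the 8254 counter is 16 bits wide: a programmed value of 0 stands for the full 65536 ticks. The helper below is a small sketch of that conversion and of the resulting period. It assumes binary (non-BCD) counting and the conventional 1193182 Hz PIT input clock; the function and macro names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PIT_FREQ_HZ 1193182ull   /* conventional PC PIT input clock */

/* In binary mode a programmed value of 0 means 65536 ticks, which does not
 * fit in 16 bits; hence the wider field in the saved state. */
uint32_t model_pit_effective_count(uint16_t programmed)
{
    return programmed == 0 ? 65536u : (uint32_t)programmed;
}

/* Period of one full countdown in nanoseconds, rounded down. */
uint64_t model_pit_period_ns(uint32_t count)
{
    assert(count >= 1 && count <= 65536u);
    return (uint64_t)count * 1000000000ull / MODEL_PIT_FREQ_HZ;
}
```

With these assumptions the maximum count gives a period of just under 55 ms, the familiar default timer tick of a PC.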


[kvm-devel] [PATCH 1/6] KVM: In kernel PIT model

2008-03-07 Thread Yang, Sheng

From 19bc8000b0ed1c2021ddb509a3d923e1cd8d53ec Mon Sep 17 00:00:00 2001
From: Sheng Yang <[EMAIL PROTECTED]>
Date: Mon, 28 Jan 2008 05:10:22 +0800
Subject: [PATCH] KVM: In kernel PIT model

This patch moves the PIT from userspace into the kernel, which greatly
improves timer accuracy.

Signed-off-by: Sheng Yang <[EMAIL PROTECTED]>
---
 arch/x86/kvm/Makefile  |3 +-
 arch/x86/kvm/i8254.c   |  585
 arch/x86/kvm/i8254.h   |   60 +
 arch/x86/kvm/irq.c |3 +
 arch/x86/kvm/x86.c |9 +
 include/asm-x86/kvm_host.h |1 +
 include/linux/kvm.h|2 +
 7 files changed, 662 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kvm/i8254.c
 create mode 100644 arch/x86/kvm/i8254.h

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index ffdd0b3..4d0c22e 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -6,7 +6,8 @@ common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o)

 EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm

-kvm-objs := $(common-objs) x86.o mmu.o x86_emulate.o i8259.o irq.o lapic.o
+kvm-objs := $(common-objs) x86.o mmu.o x86_emulate.o i8259.o irq.o lapic.o \
+   i8254.o
 obj-$(CONFIG_KVM) += kvm.o
 kvm-intel-objs = vmx.o
 obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
new file mode 100644
index 000..1031901
--- /dev/null
+++ b/arch/x86/kvm/i8254.c
@@ -0,0 +1,585 @@
+/*
+ * 8253/8254 interval timer emulation
+ *
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ * Copyright (c) 2006 Intel Corporation
+ * Copyright (c) 2007 Keir Fraser, XenSource Inc
+ * Copyright (c) 2008 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ * Authors:
+ *   Sheng Yang <[EMAIL PROTECTED]>
+ *   Based on QEMU and Xen.
+ */
+
+#include 
+
+#include "irq.h"
+#include "i8254.h"
+
+#ifndef CONFIG_X86_64
+#define mod_64(x, y) ((x) - (y) * div64_64(x, y))
+#else
+#define mod_64(x, y) ((x) % (y))
+#endif
+
+#define RW_STATE_LSB 1
+#define RW_STATE_MSB 2
+#define RW_STATE_WORD0 3
+#define RW_STATE_WORD1 4
+
+/* Compute with 96 bit intermediate result: (a*b)/c */
+static u64 muldiv64(u64 a, u32 b, u32 c)
+{
+   union {
+   u64 ll;
+   struct {
+   u32 low, high;
+   } l;
+   } u, res;
+   u64 rl, rh;
+
+   u.ll = a;
+   rl = (u64)u.l.low * (u64)b;
+   rh = (u64)u.l.high * (u64)b;
+   rh += (rl >> 32);
+   res.l.high = div64_64(rh, c);
+   res.l.low = div64_64(((mod_64(rh, c) << 32) + (rl & 0xffffffff)), c);
+   return res.ll;
+}
+
+static void pit_set_gate(struct kvm *kvm, int channel, u32 val)
+{
+   struct kvm_kpit_channel_state *c =
+   &kvm->arch.vpit->pit_state.channels[channel];
+
+   WARN_ON(!mutex_is_locked(&kvm->arch.vpit->pit_state.lock));
+
+   switch (c->mode) {
+   default:
+   case 0:
+   case 4:
+   /* XXX: just disable/enable counting */
+   break;
+   case 1:
+   case 2:
+   case 3:
+   case 5:
+   /* Restart counting on rising edge. */
+   if (c->gate < val)
+   c->count_load_time = ktime_get();
+   break;
+   }
+
+   c->gate = val;
+}
+
+int pit_get_gate(struct kvm *kvm, int channel)
+{
+   WARN_ON(!mutex_is_locked(&kvm->arch.vpit->pit_state.lock));
+
+   return kvm->arch.vpit->pit_state.channels[channel].gate;
+}
+
+static int pit_get_count(struct kvm *kvm, int channel)
+{
+   struct kvm_kpit_channel_state *c =
+   &kvm->arch.vpit->pit_state.channels[channel];
+   s64 d, t;
+   int counter;
+
+   WARN_ON(!mutex_is_locked(&kvm->arch.vpit->pit_state.lock));
+
+   t = ktime_to_ns(ktime_sub(ktime_get(), c->count_load_time));
+   d = muldiv64(t, KVM_PIT_FREQ, NSEC_PER_SEC);
+
+   switch (c->mode) {
+   case 0:
+
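
The elapsed-time-to-ticks conversion used by pit_get_count() above can be tried out in userspace. What follows is a sketch under stated assumptions, not the patch's code. It uses plain C division in place of div64_64()/mod_64() and shifts and masks in place of the union. PIT_FREQ_HZ is assumed to be 1193182, the nominal 8254 input clock; the real KVM_PIT_FREQ lives in i8254.h, which is not shown here. ns_to_pit_ticks() is a name made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: nominal 8254 input clock in Hz; stands in for KVM_PIT_FREQ. */
#define PIT_FREQ_HZ  1193182u
#define NSEC_PER_SEC 1000000000u

/*
 * Compute (a * b) / c using a 96-bit intermediate product, so that
 * a * b may exceed 64 bits. The final result is assumed to fit in 64 bits.
 */
static uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
{
	uint64_t rl, rh, high, low;

	assert(c != 0);

	rl = (a & 0xffffffffu) * (uint64_t)b;  /* low word of a times b */
	rh = (a >> 32) * (uint64_t)b;          /* high word of a times b */
	rh += rl >> 32;                        /* carry from the low product */

	/* Long division: divide the upper part first, then remainder + low word. */
	high = rh / c;
	low = (((rh % c) << 32) + (rl & 0xffffffffu)) / c;

	return (high << 32) | low;
}

/* Hypothetical helper: nanoseconds since the count was loaded -> PIT ticks. */
static uint64_t ns_to_pit_ticks(uint64_t ns)
{
	return muldiv64(ns, PIT_FREQ_HZ, NSEC_PER_SEC);
}
```

One second of elapsed time maps to exactly PIT_FREQ_HZ ticks. The split into rl and rh is what keeps long uptimes from overflowing, since a nanosecond count multiplied by the frequency can exceed 64 bits.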

[kvm-devel] [PATCH 6/6] kvm: qemu: Add save/restore support for in kernel PIT

2008-03-07 Thread Yang, Sheng

From 1af4bc979495e9e51b67635d5a9890c559e31078 Mon Sep 17 00:00:00 2001
From: Sheng Yang <[EMAIL PROTECTED]>
Date: Fri, 7 Mar 2008 19:13:06 +0800
Subject: [PATCH] kvm: qemu: Add save/restore support for in kernel PIT


Signed-off-by: Sheng Yang <[EMAIL PROTECTED]>
---
 qemu/hw/i8254.c |   73 +++
 1 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/i8254.c b/qemu/hw/i8254.c
index 9e18ebc..e215f8b 100644
--- a/qemu/hw/i8254.c
+++ b/qemu/hw/i8254.c
@@ -414,12 +414,78 @@ static void pit_irq_timer(void *opaque)
 pit_irq_timer_update(s, s->next_transition_time);
 }

+#ifdef KVM_CAP_PIT
+
+static void kvm_kernel_pit_save_to_user(PITState *s)
+{
+struct kvm_pit_state pit;
+struct kvm_pit_channel_state *c;
+struct PITChannelState *sc;
+int i;
+
+kvm_get_pit(kvm_context, &pit);
+
+for (i = 0; i < 3; i++) {
+   c = &pit.channels[i];
+   sc = &s->channels[i];
+   sc->count = c->count;
+   sc->latched_count = c->latched_count;
+   sc->count_latched = c->count_latched;
+   sc->status_latched = c->status_latched;
+   sc->status = c->status;
+   sc->read_state = c->read_state;
+   sc->write_state = c->write_state;
+   sc->write_latch = c->write_latch;
+   sc->rw_mode = c->rw_mode;
+   sc->mode = c->mode;
+   sc->bcd = c->bcd;
+   sc->gate = c->gate;
+   sc->count_load_time = c->count_load_time;
+}
+}
+
+static void kvm_kernel_pit_load_from_user(PITState *s)
+{
+struct kvm_pit_state pit;
+struct kvm_pit_channel_state *c;
+struct PITChannelState *sc;
+int i;
+
+for (i = 0; i < 3; i++) {
+   c = &pit.channels[i];
+   sc = &s->channels[i];
+   c->count = sc->count;
+   c->latched_count = sc->latched_count;
+   c->count_latched = sc->count_latched;
+   c->status_latched = sc->status_latched;
+   c->status = sc->status;
+   c->read_state = sc->read_state;
+   c->write_state = sc->write_state;
+   c->write_latch = sc->write_latch;
+   c->rw_mode = sc->rw_mode;
+   c->mode = sc->mode;
+   c->bcd = sc->bcd;
+   c->gate = sc->gate;
+   c->count_load_time = sc->count_load_time;
+}
+
+kvm_set_pit(kvm_context, &pit);
+}
+
+#endif
+
 static void pit_save(QEMUFile *f, void *opaque)
 {
 PITState *pit = opaque;
 PITChannelState *s;
 int i;

+#ifdef KVM_CAP_PIT
+if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+kvm_kernel_pit_save_to_user(pit);
+}
+#endif
+
 for(i = 0; i < 3; i++) {
 s = &pit->channels[i];
 qemu_put_be32(f, s->count);
@@ -471,6 +537,13 @@ static int pit_load(QEMUFile *f, void *opaque, int version_id)
 qemu_get_timer(f, s->irq_timer);
 }
 }
+
+#ifdef KVM_CAP_PIT
+if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
+kvm_kernel_pit_load_from_user(pit);
+}
+#endif
+
 return 0;
 }

--
debian.1.5.3.7.1-dirty


[kvm-devel] [PATCH 3/6] kvm: qemu: Add option for enable/disable in kernel PIT

2008-03-07 Thread Yang, Sheng

From 98543bb3c3821e5bc9003bb91d7d0c755394ffac Mon Sep 17 00:00:00 2001
From: Sheng Yang <[EMAIL PROTECTED]>
Date: Fri, 7 Mar 2008 14:24:32 +0800
Subject: [PATCH] kvm: qemu: Add option for enable/disable in kernel PIT


Signed-off-by: Sheng Yang <[EMAIL PROTECTED]>
---
 libkvm/libkvm-x86.c |9 +
 qemu/hw/i8254.c |   12 
 qemu/qemu-kvm.c |4 
 qemu/qemu-kvm.h |2 ++
 qemu/vl.c   |   11 ++-
 5 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index b3a241e..d19d17f 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -630,3 +630,12 @@ int kvm_disable_tpr_access_reporting(kvm_context_t kvm, int vcpu)
 }

 #endif
+
+int kvm_pit_in_kernel(kvm_context_t kvm)
+{
+#ifdef KVM_CAP_PIT
+   return kvm->pit_in_kernel;
+#else
+   return 0;
+#endif
+}
diff --git a/qemu/hw/i8254.c b/qemu/hw/i8254.c
index c281680..9e18ebc 100644
--- a/qemu/hw/i8254.c
+++ b/qemu/hw/i8254.c
@@ -26,6 +26,8 @@
 #include "isa.h"
 #include "qemu-timer.h"

+#include "qemu-kvm.h"
+
 //#define DEBUG_PIT

 #define RW_STATE_LSB 1
@@ -491,10 +493,12 @@ PITState *pit_init(int base, qemu_irq irq)
 PITState *pit = &pit_state;
 PITChannelState *s;

-s = &pit->channels[0];
-/* the timer 0 is connected to an IRQ */
-s->irq_timer = qemu_new_timer(vm_clock, pit_irq_timer, s);
-s->irq = irq;
+if (!kvm_enabled() || !qemu_kvm_pit_in_kernel()) {
+   s = &pit->channels[0];
+   /* the timer 0 is connected to an IRQ */
+   s->irq_timer = qemu_new_timer(vm_clock, pit_irq_timer, s);
+   s->irq = irq;
+}

 register_savevm("i8254", base, 1, pit_save, pit_load, pit);

diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index 45fddd3..4d59b39 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -10,6 +10,7 @@

 int kvm_allowed = 1;
 int kvm_irqchip = 1;
+int kvm_pit = 1;

 #include 
 #include "hw/hw.h"
@@ -544,6 +545,9 @@ int kvm_qemu_create_context(void)
 if (!kvm_irqchip) {
 kvm_disable_irqchip_creation(kvm_context);
 }
+if (!kvm_pit) {
+kvm_disable_pit_creation(kvm_context);
+}
 if (kvm_create(kvm_context, phys_ram_size, (void**)&phys_ram_base) < 0) {
kvm_qemu_destroy();
return -1;
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
index 8e45f30..ff9c86e 100644
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -84,9 +84,11 @@ extern kvm_context_t kvm_context;

 #define kvm_enabled() (kvm_allowed)
 #define qemu_kvm_irqchip_in_kernel() kvm_irqchip_in_kernel(kvm_context)
+#define qemu_kvm_pit_in_kernel() kvm_pit_in_kernel(kvm_context)
 #else
 #define kvm_enabled() (0)
 #define qemu_kvm_irqchip_in_kernel() (0)
+#define qemu_kvm_pit_in_kernel() (0)
 #endif

 #endif
diff --git a/qemu/vl.c b/qemu/vl.c
index 4762cb0..21c9b53 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -8097,6 +8097,7 @@ static void help(int exitcode)
   "-no-kvm disable KVM hardware virtualization\n"
 #endif
   "-no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC\n"
+  "-no-kvm-pit disable KVM kernel mode PIT\n"
 #endif
 #ifdef TARGET_I386
 "-std-vga        simulate a standard VGA card with VESA Bochs Extensions\n"
@@ -8219,6 +8220,7 @@ enum {
 QEMU_OPTION_curses,
 QEMU_OPTION_no_kvm,
 QEMU_OPTION_no_kvm_irqchip,
+QEMU_OPTION_no_kvm_pit,
 QEMU_OPTION_no_reboot,
 QEMU_OPTION_show_cursor,
 QEMU_OPTION_daemonize,
@@ -8305,6 +8307,7 @@ const QEMUOption qemu_options[] = {
 { "no-kvm", 0, QEMU_OPTION_no_kvm },
 #endif
 { "no-kvm-irqchip", 0, QEMU_OPTION_no_kvm_irqchip },
+{ "no-kvm-pit", 0, QEMU_OPTION_no_kvm_pit },
 #endif
 #if defined(TARGET_PPC) || defined(TARGET_SPARC)
 { "g", 1, QEMU_OPTION_g },
@@ -9238,8 +9241,14 @@ int main(int argc, char **argv)
kvm_allowed = 0;
break;
case QEMU_OPTION_no_kvm_irqchip: {
-   extern int kvm_irqchip;
+   extern int kvm_irqchip, kvm_pit;
kvm_irqchip = 0;
+   kvm_pit = 0;
+   break;
+   }
+   case QEMU_OPTION_no_kvm_pit: {
+   extern int kvm_pit;
+   kvm_pit = 0;
break;
}
 #endif
--
debian.1.5.3.7.1-dirty


[kvm-devel] [PATCH 5/6] kvm: libkvm: Add interface for PIT save/restore supporting

2008-03-07 Thread Yang, Sheng

From 6ff60b78c3280505d84d0dc2619e95a087b88458 Mon Sep 17 00:00:00 2001
From: Sheng Yang <[EMAIL PROTECTED]>
Date: Fri, 7 Mar 2008 19:03:52 +0800
Subject: [PATCH] kvm: libkvm: Add interface for PIT save/restore supporting


Signed-off-by: Sheng Yang <[EMAIL PROTECTED]>
---
 libkvm/libkvm-x86.c |   30 ++
 libkvm/libkvm.h |   26 ++
 2 files changed, 56 insertions(+), 0 deletions(-)

diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index d19d17f..d1809fd 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -361,6 +361,36 @@ int kvm_set_lapic(kvm_context_t kvm, int vcpu, struct kvm_lapic_state *s)

 #endif

+#ifdef KVM_CAP_PIT
+
+int kvm_get_pit(kvm_context_t kvm, struct kvm_pit_state *s)
+{
+   int r;
+   if (!kvm->pit_in_kernel)
+   return 0;
+   r = ioctl(kvm->vm_fd, KVM_GET_PIT, s);
+   if (r == -1) {
+   r = -errno;
+   perror("kvm_get_pit");
+   }
+   return r;
+}
+
+int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s)
+{
+   int r;
+   if (!kvm->pit_in_kernel)
+   return 0;
+   r = ioctl(kvm->vm_fd, KVM_SET_PIT, s);
+   if (r == -1) {
+   r = -errno;
+   perror("kvm_set_pit");
+   }
+   return r;
+}
+
+#endif
+
 void kvm_show_code(kvm_context_t kvm, int vcpu)
 {
 #define CR0_PE_MASK (1ULL<<0)
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index 395434c..4b40d2c 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -525,8 +525,34 @@ int kvm_get_lapic(kvm_context_t kvm, int vcpu, struct kvm_lapic_state *s);
  * \param s Local apic state of the specific virtual CPU
  */
 int kvm_set_lapic(kvm_context_t kvm, int vcpu, struct kvm_lapic_state *s);
+
+#endif
+
 #endif

+#ifdef KVM_CAP_PIT
+
+/*!
+ * \brief Get in kernel PIT of the virtual domain
+ *
+ * Save the PIT state.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param s PIT state of the virtual domain
+ */
+int kvm_get_pit(kvm_context_t kvm, struct kvm_pit_state *s);
+
+/*!
+ * \brief Set in kernel PIT of the virtual domain
+ *
+ * Restore the PIT state.
+ * The timer is retriggered once the state has been restored.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param s PIT state of the virtual domain
+ */
+int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s);
+
 #endif

 #ifdef KVM_CAP_VAPIC
--
debian.1.5.3.7.1-dirty


[kvm-devel] [PATCH 2/6] kvm: libkvm: Add Supporting for in-kernel PIT model

2008-03-07 Thread Yang, Sheng

From 0e5d4fad7a6a917232d89afb58b960f6951990de Mon Sep 17 00:00:00 2001
From: Sheng Yang <[EMAIL PROTECTED]>
Date: Sun, 27 Jan 2008 11:25:29 +0800
Subject: [PATCH] kvm: libkvm: Add Supporting for in-kernel PIT model


Signed-off-by: Sheng Yang <[EMAIL PROTECTED]>
---
 kernel/Kbuild   |2 +-
 libkvm/kvm-common.h |4 
 libkvm/libkvm-x86.c |   25 +
 libkvm/libkvm.c |5 +
 4 files changed, 35 insertions(+), 1 deletions(-)

diff --git a/kernel/Kbuild b/kernel/Kbuild
index ed02f5a..014cc17 100644
--- a/kernel/Kbuild
+++ b/kernel/Kbuild
@@ -1,7 +1,7 @@
 EXTRA_CFLAGS := -I$(src)/include -include $(src)/external-module-compat.h
 obj-m := kvm.o kvm-intel.o kvm-amd.o
 kvm-objs := kvm_main.o x86.o mmu.o x86_emulate.o anon_inodes.o irq.o i8259.o \
-lapic.o ioapic.o preempt.o
+lapic.o ioapic.o preempt.o i8254.o
 kvm-intel-objs := vmx.o vmx-debug.o
 kvm-amd-objs := svm.o

diff --git a/libkvm/kvm-common.h b/libkvm/kvm-common.h
index d4df1a4..b8a88ee 100644
--- a/libkvm/kvm-common.h
+++ b/libkvm/kvm-common.h
@@ -47,6 +47,10 @@ struct kvm_context {
int no_irqchip_creation;
/// in-kernel irqchip status
int irqchip_in_kernel;
+   /// do not create in-kernel pit if set
+   int no_pit_creation;
+   /// in-kernel pit status
+   int pit_in_kernel;
 };

 void init_slots(void);
diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index 4bd0e2f..b3a241e 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -144,6 +144,27 @@ int kvm_arch_create_default_phys_mem(kvm_context_t kvm,
return 0;
 }

+int kvm_create_pit(kvm_context_t kvm)
+{
+#ifdef KVM_CAP_PIT
+   int r;
+
+   kvm->pit_in_kernel = 0;
+   if (!kvm->no_pit_creation) {
+   r = ioctl(kvm->fd, KVM_CHECK_EXTENSION, KVM_CAP_PIT);
+   if (r > 0) {
+   r = ioctl(kvm->vm_fd, KVM_CREATE_PIT);
+   if (r >= 0)
+   kvm->pit_in_kernel = 1;
+   else {
+   printf("Create kernel PIT failed\n");
+   return r;
+   }
+   }
+   }
+#endif
+   return 0;
+}

 int kvm_arch_create(kvm_context_t kvm, unsigned long phys_mem_bytes,
void **vm_mem)
@@ -154,6 +175,10 @@ int kvm_arch_create(kvm_context_t kvm, unsigned long phys_mem_bytes,
if (r < 0)
return r;

+   r = kvm_create_pit(kvm);
+   if (r < 0)
+   return r;
+
return 0;
 }

diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 966501c..a7cc0e6 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -271,6 +271,11 @@ void kvm_disable_irqchip_creation(kvm_context_t kvm)
kvm->no_irqchip_creation = 1;
 }

+void kvm_disable_pit_creation(kvm_context_t kvm)
+{
+   kvm->no_pit_creation = 1;
+}
+
 int kvm_create_vcpu(kvm_context_t kvm, int slot)
 {
long mmap_size;
--
debian.1.5.3.7.1-dirty


[kvm-devel] [PATCH 0/6] Latest in kernel PIT patch

2008-03-07 Thread Yang, Sheng

Hi

Here is the latest in-kernel PIT patch set. Not much has changed since the
last version.

One known issue: on a 2.6.9 PAE guest (e.g. RHEL4), you need the "clock=pit"
kernel parameter to get the correct time. That's because the guest kernel is
too aggressive about "fixing lost interrupts" while PIT interrupts are still
pending. We may find a more elegant way to deal with it later.

Thanks
Yang, Sheng

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

[kvm-devel] headersinstall of kvm.h does not work

2008-03-07 Thread Christian Borntraeger

Hello Avi,

in commit fb56dbb31c4738a3918db81fd24da732ce3b4ae6 you changed 
include/linux/Kbuild:
snip
KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM
Currently, make headers_check barfs due to , which 
includes, not existing.  Rather than add a zillion s, export 
kvm.h only if the arch actually supports it.
[...]
 unifdef-y += keyboard.h
-unifdef-y += kvm.h
+unifdef-$(CONFIG_HAVE_KVM) += kvm.h
 unifdef-y += llc.h
 unifdef-y += loop.h
snip--

This patch does not work. Kbuild (scripts/Makefile.headersinst) does not 
check the config file, so kvm.h is never installed.

Sam, is there an easy way to allow constructs like "unifdef-$(CONFIG_FOO)"?

Thanks

Christian



Re: [kvm-devel] [PATCH 1/6] KVM: In kernel pit model

2008-03-07 Thread Avi Kivity

Yang, Sheng wrote:
> On Friday 07 March 2008 16:53:40 Avi Kivity wrote:
>   
>> Yang, Sheng wrote:
>> 
>>> Found more complex for KVM. Xen pulled pm timer down to kernel part, and
>>> used the guest TSC as source. So only adjust TSC is OK for it. But we are
>>> still using pm timer in QEmu, which using host time as source. So even we
>>> pull back TSC, the problem still exists, for 2.6.9 prefer to pm timer by
>>> default
>>>   
>> Interesting.  I guess we should pull the pm timer into the kernel as
>> well.  Timing is too tricky for userspace.
>> 
>
> ... Should we suggest using "clock=pit" on pae 2.6.9 at first?
>
>   

While it is hardly a lovely solution (things should work out of the box) 
it is reasonable as a temporary measure.

Can you repost your patchset?  If you're quick I can apply it today, 
otherwise it will have to wait until next week.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


Re: [kvm-devel] [PATCH] disable clock before rebooting.

2008-03-07 Thread Avi Kivity

Glauber Costa wrote:
> This patch writes 0 (actually, what really matters is that the
> LSB is cleared) to the system time msr before rebooting/shutting down
> the machine.
>
> Without it, we can have a random memory location being written
> when the guest comes back
>   if (!kvm_para_available())
> @@ -154,6 +181,11 @@ void __init kvmclock_init(void)
>   pv_time_ops.set_wallclock = kvm_set_wallclock;
>   pv_time_ops.sched_clock = kvm_clock_read;
>   pv_apic_ops.setup_secondary_clock = kvm_setup_secondary_clock;
> + machine_ops.emergency_restart = kvm_emergency_restart;
> + machine_ops.shutdown  = kvm_shutdown;
> + machine_ops.restart  = kvm_restart;
> + machine_ops.halt  = kvm_halt;
> + machine_ops.power_off  = kvm_power_off;
>   clocksource_register(&kvm_clock);
>   }
>  }
>   

Oh, I think that these are all unnecessary.  You need to stop the clock 
only if the memory it uses will be reused.  Halt, shutdown and poweroff 
clearly don't.  Resets need to go through the host anyway, since they 
can be invoked without the guest knowing about it.

The only case I can think of where we need to stop the clock is kexec.
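
The argument above can be written down as a tiny userspace model. This is a sketch only: every name in it is made up for illustration and none of it is kernel code. The one fact it takes from the quoted patch description is that bit 0 of the value written to the system time MSR is what matters, with a cleared bit meaning the host stops updating the area. The alignment assert and the classification of transitions are assumptions that follow the reasoning in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per the quoted description, only the LSB of the MSR value matters. */
#define SKETCH_CLOCK_ENABLE_BIT 1ULL

/* Hypothetical list of ways a guest can stop running its current kernel. */
enum sketch_transition {
	SKETCH_HALT,
	SKETCH_SHUTDOWN,
	SKETCH_POWER_OFF,
	SKETCH_HOST_RESET,  /* reset handled by the host, guest may not know */
	SKETCH_KEXEC        /* new kernel reuses the same guest memory */
};

/* Build the value a guest would write to register or unregister the clock. */
static uint64_t sketch_msr_value(uint64_t area_addr, bool enable)
{
	/* Assumption: the area address leaves bit 0 free for the flag. */
	assert((area_addr & SKETCH_CLOCK_ENABLE_BIT) == 0);
	return enable ? (area_addr | SKETCH_CLOCK_ENABLE_BIT) : area_addr;
}

/* The host keeps writing time data only while the enable bit is set. */
static bool sketch_host_keeps_writing(uint64_t msr_value)
{
	return (msr_value & SKETCH_CLOCK_ENABLE_BIT) != 0;
}

/*
 * The guest only has to stop the clock itself when the registered memory
 * will be reused while the VM keeps running, which here means kexec.
 */
static bool sketch_guest_must_stop_clock(enum sketch_transition t)
{
	return t == SKETCH_KEXEC;
}
```

Writing 0 is simply the easiest value with the enable bit clear, which is why the patch uses it.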

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


Re: [kvm-devel] [PATCH 1/6] KVM: In kernel pit model

2008-03-07 Thread Yang, Sheng

On Friday 07 March 2008 16:53:40 Avi Kivity wrote:
> Yang, Sheng wrote:
> > Found more complex for KVM. Xen pulled pm timer down to kernel part, and
> > used the guest TSC as source. So only adjust TSC is OK for it. But we are
> > still using pm timer in QEmu, which using host time as source. So even we
> > pull back TSC, the problem still exists, for 2.6.9 prefer to pm timer by
> > default
>
> Interesting.  I guess we should pull the pm timer into the kernel as
> well.  Timing is too tricky for userspace.

... Should we suggest using "clock=pit" on pae 2.6.9 at first?

-- 
Thanks
Yang, Sheng


Re: [kvm-devel] [PATCH] disable clock before rebooting.

2008-03-07 Thread Avi Kivity

Glauber Costa wrote:
> This patch writes 0 (actually, what really matters is that the
> LSB is cleared) to the system time msr before rebooting/shutting down
> the machine.
>
> Without it, we can have a random memory location being written
> when the guest comes back
>
> Signed-off-by: Glauber Costa <[EMAIL PROTECTED]>
> ---
>  arch/x86/kernel/kvmclock.c |   32 
>  1 files changed, 32 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index f654a12..5c9ff8d 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -21,6 +21,7 @@ #include 
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define KVM_SCALE 22
>  
> @@ -142,6 +143,32 @@ static void kvm_setup_secondary_clock(vo
>   setup_secondary_APIC_clock();
>  }
>  
> +/*
> + * After the clock is registered, the host will keep writing to the
> + * registered memory location. If the guest happens to shutdown, or restart,
> + * this memory won't be valid. In cases like kexec, in which you install a new kernel,
> + * this will mean a random memory location will be kept being written. So before
> + * any kind of shutdown from our side, we unregister the clock by writting anything
> + * that does not have the 'enable' bit set in the msr
> + */ 
> +static void kvm_restart(char *unused) {
>   

This looks like a struct, with the { sitting there on the end.

> + native_write_msr_safe(MSR_KVM_SYSTEM_TIME, 0, 0);
> + native_machine_restart(unused);
> +}
> +
> +/* Forgive me dear lord, for my laziness */
> +#define kvm_reboot_fn(x) \
> +static void kvm_##x(void) { \
> + native_write_msr_safe(MSR_KVM_SYSTEM_TIME, 0, 0); \
> + native_machine_##x(); \
> +}
> +
> +kvm_reboot_fn(emergency_restart)
> +kvm_reboot_fn(shutdown)
> +kvm_reboot_fn(halt)
> +kvm_reboot_fn(power_off)
> +#undef kvm_reboot_fn
> +
>   

Why not go all the way and do _restart the same way?

-- 
Any sufficiently difficult bug is indistinguishable from a feature.



Re: [kvm-devel] [PATCH 0/2] provide disable clock functionality.

2008-03-07 Thread Avi Kivity

Glauber Costa wrote:
> Avi,
>
> Hope this is better
>   

Applied, thanks.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


Re: [kvm-devel] [PATCH] Mark kobjects as unitialized

2008-03-07 Thread Avi Kivity

Greg KH wrote:
> and is on my TODO list, slowly getting
> closer to the top...
>
>   

Strange.  On my TODO list, things slowly get pushed to the bottom.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


Re: [kvm-devel] [PATCH 1/6] KVM: In kernel pit model

2008-03-07 Thread Avi Kivity

Yang, Sheng wrote:
> It turns out to be more complex for KVM. Xen pulled the PM timer down into the
> kernel and uses the guest TSC as its source, so adjusting only the TSC is enough
> there. But we still use the PM timer in QEMU, which uses host time as its source.
> So even if we pull back the TSC, the problem still exists, because 2.6.9 prefers
> the PM timer by default.

Interesting.  I guess we should pull the pm timer into the kernel as 
well.  Timing is too tricky for userspace.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.


Re: [kvm-devel] [PATCH 1/6] KVM: In kernel pit model

2008-03-07 Thread Yang, Sheng

On Thursday 06 March 2008 17:41:03 Yang, Sheng wrote:
> On Thursday 06 March 2008 16:43:18 Yang, Sheng wrote:
> > On Thursday 06 March 2008 16:06:51 Avi Kivity wrote:
> > > Yang, Sheng wrote:
> > > > > Here is the updated patch. I kept 0xff because I think it is easy
> > > > > to understand. :)
> > >
> > > Any news on the regression with older Linux guests?  That's the only
> > > > thing keeping me from applying the patchset.
> >
> > > Not much. PIT interrupt injection is fine, at 1000 per second. I just
> > > found that two clock sources in the guest have problems: the PM timer and
> > > the TSC. Both seem to be due to compensation for lost ticks. Removing
> > > something like "jiffies_64 += lost - 1" makes the time correct.
> > >
> > > And I think it's an existing bug. As you see, in most conditions,
> > > userspace PIT + in-kernel irqchip results in time flowing slowly, due to
> > > lost interrupts. But RHEL4 runs even faster than the host...
> > >
> > > I will investigate further.
>
> > I got some clues.
> >
> > It seems that for a kernel that actively compensates for lost interrupts,
> > when some PIT interrupts are pending, the TSC or other clocksource gets the
> > wrong impression that some PIT interrupts were lost, and then uses its own
> > counter to adjust the jiffies. That is where the problem occurs.
> >
> > Xen adjusts the TSC to fit this mode. It pulls the TSC backward to get the
> > correct value when injecting a PIT interrupt. But this causes trouble on
> > some Windows versions. Then it got the "time mode" concept...

It turns out to be more complex for KVM. Xen pulled the PM timer down into the
kernel and uses the guest TSC as its source, so adjusting only the TSC is enough
there. But we still use the PM timer in QEMU, which uses host time as its source.
So even if we pull back the TSC, the problem still exists, because 2.6.9 prefers
the PM timer by default.
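
To make the double counting concrete, here is a toy userspace model, not kernel code. It assumes a 1000 Hz guest tick, a reference clock that follows host time (as the PM timer in QEMU does), and a host that delivers PIT interrupts late in bursts but never drops any. The names guest_clock, guest_tick and simulate are made up for this sketch, and the bookkeeping is simplified compared with what a 2.6.9 guest does.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_US 1000u  /* assumed 1000 Hz guest tick */

struct guest_clock {
    uint64_t jiffies;      /* guest's tick count */
    uint64_t last_ref_us;  /* reference clock reading at the previous tick */
};

/*
 * Toy tick handler: if the reference clock says two or more tick periods
 * passed since the last tick, assume ticks were lost and add "lost - 1"
 * before the normal increment.
 */
static void guest_tick(struct guest_clock *g, uint64_t ref_now_us, int compensate)
{
    uint64_t lost = (ref_now_us - g->last_ref_us) / TICK_US;

    if (compensate && lost >= 2)
        g->jiffies += lost - 1;
    g->jiffies += 1;
    g->last_ref_us = ref_now_us;
}

/*
 * Run for total_ms of host time. Every `burst` milliseconds the host injects
 * the `burst` pending PIT interrupts back to back (late, but none dropped).
 * Returns the guest's jiffies at the end.
 */
static uint64_t simulate(unsigned total_ms, unsigned burst, int compensate)
{
    struct guest_clock g = { 0, 0 };
    unsigned t, i;

    assert(burst > 0 && total_ms % burst == 0);
    for (t = burst; t <= total_ms; t += burst)
        for (i = 0; i < burst; i++)
            guest_tick(&g, (uint64_t)t * TICK_US, compensate);
    return g.jiffies;
}
```

In this model simulate(1000, 1, 1) returns 1000, while simulate(1000, 4, 1) returns 1750: the first interrupt of each burst compensates for three "lost" ticks, and then the three pending interrupts arrive anyway. With compensation off, simulate(1000, 4, 0) returns 1000 again, which is consistent with the observation that removing the "jiffies_64 += lost - 1" adjustment makes the time correct.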

-- 
Thanks
Yang, Sheng

