Re: [Xen-devel] what's inside hypercall page?

2016-03-02 Thread Juergen Gross
On 03/03/16 08:09, quizyjones wrote:
> What I want to do is predict how many instructions a hypercall entry of
> the hypercall page (not the hypercall handler) executes before it finishes.
> Take HYPERVISOR_iret as an example: it executes precisely five
> instructions, then calls the hypercall handler, and since it doesn't return
> it just finishes. But other hypercall entries expect a return value from
> the syscall and might be interrupted during execution, just as you said
> about do_sched_op and do_xen_version, so it is hard to predict. Is there
> any solution to this problem? Just like with the syscall table, how can I
> predict how many instructions a syscall entry executes before it actually
> goes to the handler, so that I can accurately set traps on the syscall?

Just look at the code? It's not magic, just x86 instructions. In case of
interrupts the interrupted code is normally resumed where the interrupt
occurred. So the number of instructions executed in the hypercall page per
hypercall is just the number of instructions you can see for that specific
hypercall. The only exception is the case where the call into the hypervisor
itself (syscall) may be repeated on request of the hypervisor (do_multicall).
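
A minimal sketch of that counting, assuming the 64-bit PV stub layout
(push %rcx; push %r11; mov $nr,%eax; syscall; pop %r11; pop %rcx; ret) and an
iret variant with an extra push %rax and no return path. The byte layouts,
HYPERCALL_STUB_SIZE, NR_IRET and all helper names are assumptions for
illustration and should be checked against the hypercall page of the Xen
version in use. The decoder only knows the handful of opcodes listed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HYPERCALL_STUB_SIZE 32  /* assumed size of one hypercall page entry */
#define NR_IRET 23u             /* assumed value of __HYPERVISOR_iret */

/* Store a 32-bit immediate in little-endian order, as x86 expects. */
static void put_imm32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Assumed generic stub: push %rcx; push %r11; mov $nr,%eax; syscall;
 * pop %r11; pop %rcx; ret. The rest of the entry is padded with int3. */
static void build_generic_stub(uint8_t *buf, uint32_t nr)
{
    static const uint8_t tail[] = { 0x0f, 0x05, 0x41, 0x5b, 0x59, 0xc3 };

    memset(buf, 0xcc, HYPERCALL_STUB_SIZE);
    buf[0] = 0x51;                  /* push %rcx */
    buf[1] = 0x41; buf[2] = 0x53;   /* push %r11 */
    buf[3] = 0xb8;                  /* mov $imm32,%eax */
    put_imm32(buf + 4, nr);
    memcpy(buf + 8, tail, sizeof(tail));
}

/* Assumed iret stub: push %rcx; push %r11; push %rax; mov $nr,%eax; syscall. */
static void build_iret_stub(uint8_t *buf)
{
    memset(buf, 0xcc, HYPERCALL_STUB_SIZE);
    buf[0] = 0x51;                  /* push %rcx */
    buf[1] = 0x41; buf[2] = 0x53;   /* push %r11 */
    buf[3] = 0x50;                  /* push %rax */
    buf[4] = 0xb8;                  /* mov $imm32,%eax */
    put_imm32(buf + 5, NR_IRET);
    buf[9] = 0x0f; buf[10] = 0x05;  /* syscall */
}

/* Count instructions from the start of a stub up to and including the first
 * syscall. Returns -1 on any opcode this toy decoder does not know. */
static int count_until_syscall(const uint8_t *p, size_t len)
{
    size_t i = 0;
    int n = 0;

    assert(p != NULL);
    while (i < len) {
        uint8_t op = p[i];

        if (op == 0x50 || op == 0x51 || op == 0x59) {
            i += 1;                         /* push %rax, push %rcx, pop %rcx */
        } else if (op == 0x41 && i + 1 < len &&
                   (p[i + 1] == 0x53 || p[i + 1] == 0x5b)) {
            i += 2;                         /* push %r11, pop %r11 */
        } else if (op == 0xb8 && i + 4 < len) {
            i += 5;                         /* mov $imm32,%eax */
        } else if (op == 0x0f && i + 1 < len && p[i + 1] == 0x05) {
            return n + 1;                   /* syscall itself */
        } else {
            return -1;
        }
        n++;
    }
    return -1;
}
```

Under these assumptions the generic stub reaches the syscall on its fourth
instruction and the iret stub on its fifth, which matches the "five
instructions" observed for HYPERVISOR_iret.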

Juergen

BTW: Please don't top-post.

> 
>> Subject: Re: [Xen-devel] what's inside hypercall page?
>> To: quizy_jo...@outlook.com; xen-de...@lists.xenproject.org
>> From: jgr...@suse.com
>> Date: Thu, 3 Mar 2016 06:28:06 +0100
>>
>> On 03/03/16 01:56, quizyjones wrote:
>> >> do_sched_op is self explaining: it is used for scheduling of the vcpu.
>> >> A vcpu going to idle is using this hypercall. So any interrupt waking
>> >> the vcpu up will seem to occur very near to the hypercall.
>> >
>> >> do_xen_version is often used as a very fast way to execute the check
>> >> for pending events in the hypervisor (kind of polling).
>> >
>> >> do_multicall might run for a long time. So the hypervisor returns to
>> >> the caller from time to time setting IP to the hypercall. The caller
>> >> has the chance to react to interrupts and will then continue the
>> >> hypercall.
>> >>
>> >>
>> >> HTH, Juergen
>> >
>> >
>> > Thanks for the reply. Does that mean we cannot predict when
>> > these two hypercalls will finish? I want to set up an interval to monitor
>> > the instructions (monitoring once per hypercall), so as to reduce the
>> > performance cost. This requires accurately predicting instruction
>> > execution so as to avoid missing hypercalls. Is that possible? The main
>> > problem is the execution of syscall (0x050f): as each hypercall behaves
>> > differently, how can I predict where it will go after the syscall returns?
>>
>> You can't predict how long a hypercall will run, as this depends on multiple
>> factors, like the overall load of the host, values of parameters, ...
>>
>> A hypercall is by its nature much more complicated than e.g. a simple
>> arithmetic operation.
>>
>> What exactly do you want to achieve?
>>
>>
>> Juergen
>>


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] what's inside hypercall page?

2016-03-02 Thread quizyjones
What I want to do is predict how many instructions a hypercall entry of the
hypercall page (not the hypercall handler) executes before it finishes. Take
HYPERVISOR_iret as an example: it executes precisely five instructions, then
calls the hypercall handler, and since it doesn't return it just finishes. But
other hypercall entries expect a return value from the syscall and might be
interrupted during execution, just as you said about do_sched_op and
do_xen_version, so it is hard to predict. Is there any solution to this
problem? Just like with the syscall table, how can I predict how many
instructions a syscall entry executes before it actually goes to the handler,
so that I can accurately set traps on the syscall?

> Subject: Re: [Xen-devel] what's inside hypercall page?
> To: quizy_jo...@outlook.com; xen-de...@lists.xenproject.org
> From: jgr...@suse.com
> Date: Thu, 3 Mar 2016 06:28:06 +0100
> 
> On 03/03/16 01:56, quizyjones wrote:
> >> do_sched_op is self explaining: it is used for scheduling of the vcpu.
> >> A vcpu going to idle is using this hypercall. So any interrupt waking
> >> the vcpu up will seem to occur very near to the hypercall.
> > 
> >> do_xen_version is often used as a very fast way to execute the check
> >> for pending events in the hypervisor (kind of polling).
> > 
> >> do_multicall might run for a long time. So the hypervisor returns to
> >> the caller from time to time setting IP to the hypercall. The caller
> >> has the chance to react to interrupts and will then continue the
> >> hypercall.
> >>
> >>
> >> HTH, Juergen
> > 
> > 
> > Thanks for the reply. Does that mean we cannot predict when
> > these two hypercalls will finish? I want to set up an interval to monitor
> > the instructions (monitoring once per hypercall), so as to reduce the
> > performance cost. This requires accurately predicting instruction
> > execution so as to avoid missing hypercalls. Is that possible? The main
> > problem is the execution of syscall (0x050f): as each hypercall behaves
> > differently, how can I predict where it will go after the syscall returns?
> 
> You can't predict how long a hypercall will run, as this depends on multiple
> factors, like the overall load of the host, values of parameters, ...
> 
> A hypercall is by its nature much more complicated than e.g. a simple
> arithmetic operation.
> 
> What exactly do you want to achieve?
> 
> 
> Juergen
> 


[Xen-devel] [linux-linus test] 85020: regressions - trouble: blocked/broken/fail/pass

2016-03-02 Thread osstest service owner
flight 85020 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/85020/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt-xsm    3 host-install(3)                 broken REGR. vs. 59254
 build-i386-rumpuserxen          6 xen-build                       fail   REGR. vs. 59254
 build-amd64-rumpuserxen         6 xen-build                       fail   REGR. vs. 59254
 test-amd64-amd64-xl            15 guest-localmigrate              fail   REGR. vs. 59254
 test-amd64-amd64-xl-credit2    15 guest-localmigrate              fail   REGR. vs. 59254
 test-amd64-i386-xl             15 guest-localmigrate              fail   REGR. vs. 59254
 test-amd64-amd64-xl-xsm        15 guest-localmigrate              fail   REGR. vs. 59254
 test-amd64-i386-xl-xsm         15 guest-localmigrate              fail   REGR. vs. 59254
 test-amd64-amd64-xl-multivcpu  15 guest-localmigrate              fail   REGR. vs. 59254
 test-amd64-amd64-pair          22 guest-migrate/dst_host/src_host fail   REGR. vs. 59254
 test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeat       fail   REGR. vs. 59254
 test-armhf-armhf-xl            15 guest-start/debian.repeat       fail   REGR. vs. 59254
 test-armhf-armhf-xl-xsm        15 guest-start/debian.repeat       fail   REGR. vs. 59254
 test-armhf-armhf-xl-multivcpu  15 guest-start/debian.repeat       fail   REGR. vs. 59254
 test-amd64-i386-pair           22 guest-migrate/dst_host/src_host fail   REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds             17 guest-localmigrate/x10          fail REGR. vs. 59254
 test-armhf-armhf-xl-rtds             11 guest-start                     fail REGR. vs. 59254
 test-amd64-i386-libvirt-pair         22 guest-migrate/dst_host/src_host fail baseline untested
 test-amd64-amd64-libvirt-pair        21 guest-migrate/src_host/dst_host fail baseline untested
 test-armhf-armhf-xl-vhd               9 debian-di-install               fail baseline untested
 test-amd64-amd64-libvirt             15 guest-saverestore.2             fail blocked in 59254
 test-amd64-amd64-libvirt-xsm         15 guest-saverestore.2             fail blocked in 59254
 test-amd64-i386-libvirt              15 guest-saverestore.2             fail blocked in 59254
 test-amd64-i386-libvirt-xsm          15 guest-saverestore.2             fail blocked in 59254
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop                      fail like 59254
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop                      fail like 59254
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop                      fail like 59254
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop                      fail like 59254

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64   1 build-check(1)            blocked n/a
 test-amd64-i386-rumpuserxen-i386     1 build-check(1)            blocked n/a
 test-amd64-amd64-libvirt            12 migrate-support-check     fail    never pass
 test-amd64-amd64-libvirt-xsm        12 migrate-support-check     fail    never pass
 test-amd64-amd64-xl-pvh-amd         11 guest-start               fail    never pass
 test-amd64-amd64-xl-pvh-intel       14 guest-saverestore         fail    never pass
 test-amd64-i386-libvirt             12 migrate-support-check     fail    never pass
 test-armhf-armhf-libvirt            14 guest-saverestore         fail    never pass
 test-armhf-armhf-libvirt            12 migrate-support-check     fail    never pass
 test-armhf-armhf-libvirt-qcow2      11 migrate-support-check     fail    never pass
 test-armhf-armhf-libvirt-qcow2      13 guest-saverestore         fail    never pass
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1               fail    never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check     fail    never pass
 test-amd64-amd64-qemuu-nested-amd   13 xen-boot/l1               fail    never pass
 test-armhf-armhf-xl-cubietruck      12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-cubietruck      13 saverestore-support-check fail    never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check     fail    never pass
 test-amd64-i386-libvirt-xsm         12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl                 12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl                 13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-xsm             13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-xsm             12 migrate-support-check     fail    never pass
 test-amd64-amd64-libvirt-vhd        11 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-multivcpu       13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-multivcpu       12 migrate-support-check     fail    never pass
 test-armhf-armhf-libvirt-raw        13 guest-saverestore         fail    never pass
 test-armhf-armhf-libvirt-raw        11 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-credit2         13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-credit2         12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-arndale         12 migrate-support-check     fail    never pass

[Xen-devel] Questions about XenRT

2016-03-02 Thread Sunguodong
Hi James,



I found a PPT file named "XenRT-XenSource's Xen testing infrastructure" written
by you. I have a few questions and hope you can help me.

I am looking for a better test tool for Xen, so I recently set up XenRT in a
Debian VM following this page:
http://wiki.xenproject.org/wiki/Getting_Started_with_XenRT .

I downloaded xenrt.tgz and tests.tgz from
http://wiki.xenproject.org/wiki/Category:XenRT .

But I don't have a XenServer host, so the test cases (e.g.
xenserver.tc.host.TC6859) won't pass.

What I want to know is:

1)   Does XenRT support an open-source Xen host? (I'm using libvirt, which
calls the libxenlight APIs.)

2)   Is there any detailed documentation that explains how to configure and
use this tool?

3)   Are there any other tools (like OSSTest) that are normally used to test
Xen, including tools that provide a set of unit and functional tests which any
Xen contributor can run to validate changes before committing a patch?


Looking forward to your reply, thank you!


Regards,
Jason


Re: [Xen-devel] [PATCH v2 2/3] xen: add hypercall option to temporarily pin a vcpu

2016-03-02 Thread Juergen Gross
On 02/03/16 18:21, Anshul Makkar wrote:
> Hi,
> 
> 
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of George 
> Dunlap
> Sent: 01 March 2016 15:53
> To: Juergen Gross ; xen-devel@lists.xen.org
> Cc: Wei Liu ; Stefano Stabellini 
> ; George Dunlap ; 
> Andrew Cooper ; Dario Faggioli 
> ; Ian Jackson ; David 
> Vrabel ; jbeul...@suse.com
> Subject: Re: [Xen-devel] [PATCH v2 2/3] xen: add hypercall option to 
> temporarily pin a vcpu
> 
> On 01/03/16 09:02, Juergen Gross wrote:
>> Some hardware (e.g. Dell studio 1555 laptops) require SMIs to be 
>> called on physical cpu 0 only. Linux drivers like dcdbas or i8k try to 
>> achieve this by pinning the running thread to cpu 0, but in Dom0 this 
>> is not enough: the vcpu must be pinned to physical cpu 0 via Xen, too.
>>
>> Add a stable hypercall option SCHEDOP_pin_temp to the sched_op 
>> hypercall to achieve this. It is taking a physical cpu number as 
>> parameter. If pinning is possible (the calling domain has the 
>> privilege to make the call and the cpu is available in the domain's
>> cpupool) the calling vcpu is pinned to the specified cpu. The old cpu 
>> affinity is saved. To undo the temporary pinning a cpu -1 is 
>> specified. This will restore the original cpu affinity for the vcpu.
>>
>> Signed-off-by: Juergen Gross 
>> ---
>> V2: - limit operation to hardware domain as suggested by Jan Beulich
>> - some style issues corrected as requested by Jan Beulich
>> - use fixed width types in interface as requested by Jan Beulich
>> - add compat layer checking as requested by Jan Beulich
>> ---
>>  xen/common/compat/schedule.c |  4 ++
>>  xen/common/schedule.c        | 92 +---
>>  xen/include/public/sched.h   | 17 
>>  xen/include/xlat.lst         |  1 +
>>  4 files changed, 109 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/common/compat/schedule.c b/xen/common/compat/schedule.c
>> index 812c550..73b0f01 100644
>> --- a/xen/common/compat/schedule.c
>> +++ b/xen/common/compat/schedule.c
>> @@ -10,6 +10,10 @@
>>  
>>  #define do_sched_op compat_sched_op
>>  
>> +#define xen_sched_pin_temp sched_pin_temp
>> +CHECK_sched_pin_temp;
>> +#undef xen_sched_pin_temp
>> +
>>  #define xen_sched_shutdown sched_shutdown
>>  CHECK_sched_shutdown;
>>  #undef xen_sched_shutdown
>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> index b0d4b18..653f852 100644
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -271,6 +271,12 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
>>      struct scheduler *old_ops;
>>      void *old_domdata;
>>  
>> +    for_each_vcpu ( d, v )
>> +    {
>> +        if ( v->affinity_broken )
>> +            return -EBUSY;
>> +    }
>> +
>>      domdata = SCHED_OP(c->sched, alloc_domdata, d);
>>      if ( domdata == NULL )
>>          return -ENOMEM;
>> @@ -669,6 +675,14 @@ int cpu_disable_scheduler(unsigned int cpu)
>>              if ( cpumask_empty(&online_affinity) &&
>>                   cpumask_test_cpu(cpu, v->cpu_hard_affinity) )
>>              {
>> +                if ( v->affinity_broken )
>> +                {
>> +                    /* The vcpu is temporarily pinned, can't move it. */
>> +                    vcpu_schedule_unlock_irqrestore(lock, flags, v);
>> +                    ret = -EBUSY;
>> +                    break;
>> +                }
> 
> Does this mean that if the user closes the laptop lid while one of these 
> drivers has vcpu0 pinned, that Xen will crash (see 
> xen/arch/x86/smpboot.c:__cpu_disable())?  Or is it the OS's job to make sure 
> that all temporary pins are removed before suspending?
> 
> Also -- have you actually tested the "cpupool move while pinned"
> functionality to make sure it actually works?  There's a weird bit in
> cpupool_unassign_cpu_helper() where after calling cpu_disable_scheduler(cpu), 
> it unconditionally sets the cpu bit in the cpupool_free_cpus mask, even if it 
> returns an error.  That can't be right, even for the existing -EAGAIN case, 
> can it?
> 
> I see that you have a loop to retry this call several times in the next 
> patch; but what if it fails every time -- what state is the system in?
> 
> And, in general, what happens if the device driver gets mixed up and forgets 
> to unpin the vcpu?  Is the only recourse to reboot your host (or deal with 
> the fact that you can't reconfigure your cpupools)?
> 
>  -George
> 
> Sorry, lost the original thread so replying at the top of mail chain.
> 
> +static XSM_INLINE int xsm_schedop_pin_temp(XSM_DEFAULT_VOID) 
> +{ 
> + XSM_ASSERT_ACTION(XSM_PRIV); 
> + return xsm_default_action(action, current->domain, NULL); 
> +}
> 
> Is the intention to restrict the hypercall usage to dom0 only?

To be more precise: to the hardware domain (the patch snippet you are
referencing was part of V1 of the series; it no longer exists in V2).
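
A toy model of the semantics described in the commit message quoted above
(pin the calling vcpu to one physical cpu, save the old affinity, cpu -1
restores it, hardware domain only). The struct, the function names, the
bitmask representation and every error code other than the -EBUSY visible in
the quoted hunks are assumptions made for illustration; this is not the
patch's code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 8   /* hypothetical number of physical cpus */

struct toy_vcpu {
    uint32_t hard_affinity;    /* bitmask of allowed physical cpus */
    uint32_t saved_affinity;   /* affinity to restore when unpinning */
    bool affinity_broken;      /* true while temporarily pinned */
    bool is_hardware_domain;   /* only the hardware domain may pin */
};

/* cpu >= 0 pins the vcpu to that cpu, cpu == -1 undoes the pinning. */
static int toy_pin_temp(struct toy_vcpu *v, uint32_t pool_cpus, int32_t cpu)
{
    if (!v->is_hardware_domain)
        return -EPERM;                      /* assumed error code */

    if (cpu == -1) {
        if (v->affinity_broken) {
            v->hard_affinity = v->saved_affinity;
            v->affinity_broken = false;
        }
        return 0;
    }

    /* The target cpu must exist and be part of the domain's cpupool. */
    if (cpu < 0 || cpu >= TOY_NR_CPUS || !(pool_cpus & (1u << cpu)))
        return -EINVAL;                     /* assumed error code */

    if (v->affinity_broken)
        return -EBUSY;                      /* assumed: no nested pinning */

    v->saved_affinity = v->hard_affinity;
    v->hard_affinity = 1u << cpu;
    v->affinity_broken = true;
    return 0;
}

/* Mirrors the quoted sched_move_domain() check: refuse while any vcpu of
 * the domain is temporarily pinned. */
static int toy_move_domain(const struct toy_vcpu *vcpus, int n)
{
    for (int i = 0; i < n; i++)
        if (vcpus[i].affinity_broken)
            return -EBUSY;
    return 0;
}
```

In this model a forgotten unpin leaves toy_move_domain() failing with -EBUSY
until toy_pin_temp(v, pool, -1) is called, which is the situation George's
questions above are about.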


Juergen


Re: [Xen-devel] what's inside hypercall page?

2016-03-02 Thread Juergen Gross
On 03/03/16 01:56, quizyjones wrote:
>> do_sched_op is self explaining: it is used for scheduling of the vcpu.
>> A vcpu going to idle is using this hypercall. So any interrupt waking
>> the vcpu up will seem to occur very near to the hypercall.
> 
>> do_xen_version is often used as a very fast way to execute the check
>> for pending events in the hypervisor (kind of polling).
> 
>> do_multicall might run for a long time. So the hypervisor returns to
>> the caller from time to time setting IP to the hypercall. The caller
>> has the chance to react to interrupts and will then continue the
>> hypercall.
>>
>>
>> HTH, Juergen
> 
> 
> Thanks for the reply. Does that mean we cannot predict when
> these two hypercalls will finish? I want to set up an interval to monitor
> the instructions (monitoring once per hypercall), so as to reduce the
> performance cost. This requires accurately predicting instruction
> execution so as to avoid missing hypercalls. Is that possible? The main
> problem is the execution of syscall (0x050f): as each hypercall behaves
> differently, how can I predict where it will go after the syscall returns?

You can't predict how long a hypercall will run, as this depends on multiple
factors, like the overall load of the host, values of parameters, ...

A hypercall is by its nature much more complicated than e.g. a simple
arithmetic operation.
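
One way to see why the count is not fixed is the continuation case described
for do_multicall above: the hypervisor returns early with IP set back to the
hypercall, so the same syscall instruction runs again. A toy model of that,
with a hypothetical per-entry work budget standing in for whatever makes the
hypervisor preempt; none of these names exist in Xen.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a batched hypercall. All names are hypothetical. */
struct toy_batch {
    size_t total;   /* entries requested by the guest */
    size_t done;    /* entries completed so far */
};

/* One entry into the "hypervisor": handle at most `budget` entries.
 * Returns 1 if the guest has to execute the syscall again. */
static int toy_multicall_step(struct toy_batch *b, size_t budget)
{
    assert(budget > 0);

    size_t left = b->total - b->done;

    b->done += left < budget ? left : budget;
    return b->done < b->total;
}

/* How many times the syscall instruction in the stub ends up executing. */
static unsigned toy_syscalls_executed(size_t total, size_t budget)
{
    struct toy_batch b = { total, 0 };
    unsigned n = 1;

    while (toy_multicall_step(&b, budget))
        n++;
    return n;
}
```

With a budget of 4, a batch of 10 entries executes the syscall three times
and a batch of 3 only once, so the number depends on the parameters and on
when the hypervisor decides to preempt.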

What exactly do you want to achieve?


Juergen




Re: [Xen-devel] [PATCH V15 4/6] libxl: add pvusb API

2016-03-02 Thread Chun Yan Liu


>>> On 3/3/2016 at 02:32 AM, in message <56d731b1.60...@citrix.com>, George Dunlap wrote:
> On 01/03/16 08:09, Chunyan Liu wrote: 
> > Add pvusb APIs, including: 
> >  - attach/detach (create/destroy) virtual usb controller. 
> >  - attach/detach usb device 
> >  - list usb controller and usb devices 
> >  - some other helper functions 
> >  
> > Signed-off-by: Simon Cao  
> > Signed-off-by: George Dunlap  
> > Signed-off-by: Chunyan Liu  
> > --- 
> > Changes: 
> >   reorder usbdev_remove to following three steps: 
> >   1. Unassign all interfaces from usbback, stopping and returning an 
> >  error as soon as one attempt fails 
> >   2. Remove the pvusb xenstore nodes, stopping and returning an error 
> >  if it fails 
> >   3. Attempt to re-assign all interfaces to the original drivers, 
> >  stopping and returning an error as soon as one attempt fails. 
>  
> Thanks, Chunyan!  One minor comment about these changes... 
>  
> > +static int usbdev_rebind(libxl__gc *gc, const char *busid)
> > +{
> > +    char **intfs = NULL;
> > +    char *usbdev_encode = NULL;
> > +    char *path = NULL;
> > +    int i, num = 0;
> > +    int rc;
> > +
> > +    rc = usbdev_get_all_interfaces(gc, busid, &intfs, &num);
> > +    if (rc) goto out;
> > +
> > +    usbdev_encode = usb_interface_xenstore_encode(gc, busid);
> > +
> > +    for (i = 0; i < num; i++) {
> > +        char *intf = intfs[i];
> > +        char *usbintf_encode = NULL;
> > +        const char *drvpath;
> > +
> > +        /* rebind USB interface to its original driver */
> > +        usbintf_encode = usb_interface_xenstore_encode(gc, intf);
> > +        path = GCSPRINTF(USBBACK_INFO_PATH "/%s/%s/driver_path",
> > +                         usbdev_encode, usbintf_encode);
> > +        rc = libxl__xs_read_checked(gc, XBT_NULL, path, &drvpath);
> > +        if (rc) goto out;
> > +
> > +        if (drvpath) {
> > +            rc = bind_usbintf(gc, intf, drvpath);
> > +            if (rc) {
> > +                LOGE(ERROR, "Couldn't rebind %s to %s", intf, drvpath);
> > +                goto out;
> > +            }
> > +        }
> > +    }
> > +
> > +    path = GCSPRINTF(USBBACK_INFO_PATH "/%s", usbdev_encode);
> > +    rc = libxl__xs_rm_checked(gc, XBT_NULL, path);
> > +
> > +out:
>  
> So it looks like if one of the re-binds fails, then it stops where it is 
> and leaves the USBBACK re-bind info in xenstore.  In that case it's not 
> clear to me how that information would ever be removed. 
>  
> I think until such time as we have a command to re-attempt the re-bind, 
>  if there's an error in the actual rebind, it should just break out of 
> the for loop, and remove the re-bind nodes, and document a way to let 
> the user try to clean things up. 

According to the last discussion about how to handle a rebind failure, Ian
seemed to prefer adding an xl command to deal with the rebind in the future.
For that, I think the driver_path info should be kept in xenstore. Without
that need, I agree it's best to remove the xenstore nodes. So, keep or remove?

[Ian's idea from the last discussion]
[start]
The only wrinkle is that the obvious implementation reads the old
bindings from xenstore between 1 and 2, deletes the information from
xenstore in 2, and uses that information in step 3, which is cheating
(and leads to the sysfs-wrangling-required state).  But that is quite
easy to avoid:
  - record the old driver bindings somewhere in xenstore (keyed by
the physical host device, not the guest domain)
  - provide a libxl call and corresponding xl command which attempts
to rebind

But, FAOD, I do not want to block this patch until such a thing is
implemented.  I think it is sufficient to document the existence of
the awkward state, and the likely workarounds.
[end]
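
To make the "keep or remove" question concrete, here is a toy model of the
rebind loop with both behaviours behind a flag. toy_bind() and the state
struct are hypothetical stand-ins for bind_usbintf() and the xenstore
removal; nothing here is libxl code, and the flag does not exist in the
patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the host state touched by the rebind. */
struct toy_state {
    int fail_at;          /* index of the interface whose rebind fails, -1 = none */
    int rebound;          /* interfaces successfully rebound so far */
    bool info_in_store;   /* true while the driver_path info is in xenstore */
};

/* Pretend to rebind interface idx to its original driver. */
static int toy_bind(struct toy_state *s, int idx)
{
    if (idx == s->fail_at)
        return -1;
    s->rebound++;
    return 0;
}

/* keep_info_on_failure == true:  leave the info for a later retry command.
 * keep_info_on_failure == false: always remove it, as suggested in review. */
static int toy_rebind(struct toy_state *s, int num, bool keep_info_on_failure)
{
    int rc = 0;

    for (int i = 0; i < num; i++) {
        rc = toy_bind(s, i);
        if (rc)
            break;                  /* stop at the first failing interface */
    }

    if (!rc || !keep_info_on_failure)
        s->info_in_store = false;   /* remove the re-bind nodes */

    return rc;
}
```

Either way the first error is still returned to the caller. The only
difference is whether the saved driver paths survive for a future xl command
to retry with.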

>  
> > +static int do_usbdev_remove(libxl__gc *gc, uint32_t domid,
> > +                            libxl_device_usbdev *usbdev)
> > +{
> > +    int rc;
> > +    char *busid;
> > +    libxl_device_usbctrl usbctrl;
> > +    libxl_usbctrlinfo usbctrlinfo;
> > +
> > +    libxl_device_usbctrl_init(&usbctrl);
> > +    libxl_usbctrlinfo_init(&usbctrlinfo);
> > +    usbctrl.devid = usbdev->ctrl;
> > +
> > +    rc = libxl_device_usbctrl_getinfo(CTX, domid, &usbctrl, &usbctrlinfo);
> > +    if (rc) goto out;
> > +
> > +    switch (usbctrlinfo.type) {
> > +    case LIBXL_USBCTRL_TYPE_PV:
> > +        busid = usbdev_busid_from_ctrlport(gc, domid, usbdev);
> > +        if (!busid) {
> > +            rc = ERROR_FAIL;
> > +            goto out;
> > +        }
> > +
> > +        rc = usbback_dev_unassign(gc, busid);
> > +        if (rc) goto out;
> > +
> > +        rc = libxl__device_usbdev_remove_xenstore(gc, domid, usbdev);
> > +        if (rc) goto out;
> > +
> > +        rc = usbdev_rebind(gc, busid);
> > +        if (rc) goto out;
>  
> I think we need a comment here saying why we're doing things in this 
> order.  Maybe: 
>  
> 

[Xen-devel] [xen-4.6-testing test] 85017: tolerable trouble: broken/fail/pass - PUSHED

2016-03-02 Thread osstest service owner
flight 85017 xen-4.6-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/85017/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-libvirt-vhd        3 host-install(3)           broken pass in 84924
 test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail in 84924 pass in 85017
 test-armhf-armhf-xl-multivcpu      15 guest-start/debian.repeat fail in 84924 pass in 85017
 test-amd64-i386-rumpuserxen-i386   15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail in 84924 pass in 85017
 test-armhf-armhf-xl-rtds           11 guest-start               fail in 84924 pass in 85017
 test-armhf-armhf-xl-credit2         6 xen-boot                  fail pass in 84924
 test-armhf-armhf-xl                16 guest-start.2             fail pass in 84924

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  9 debian-hvm-install        fail like 83674
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop                fail like 83820
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop                fail like 83820
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop                fail like 83820
 test-armhf-armhf-xl-rtds             15 guest-start/debian.repeat fail like 83820

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-vhd      11 migrate-support-check     fail in 84924 never pass
 test-armhf-armhf-xl-credit2       13 saverestore-support-check fail in 84924 never pass
 test-armhf-armhf-xl-credit2       12 migrate-support-check     fail in 84924 never pass
 test-amd64-amd64-xl-pvh-intel     11 guest-start               fail never pass
 test-amd64-amd64-xl-pvh-amd       11 guest-start               fail never pass
 test-armhf-armhf-libvirt          14 guest-saverestore         fail never pass
 test-armhf-armhf-libvirt          12 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-raw      13 guest-saverestore         fail never pass
 test-armhf-armhf-libvirt-raw      11 migrate-support-check     fail never pass
 test-amd64-i386-libvirt-xsm       12 migrate-support-check     fail never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-libvirt           12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale       12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale       13 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-xsm      12 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt          12 migrate-support-check     fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check     fail never pass
 test-armhf-armhf-xl-multivcpu     13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu     12 migrate-support-check     fail never pass
 test-armhf-armhf-xl               12 migrate-support-check     fail never pass
 test-armhf-armhf-xl               13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-xsm           13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-xsm           12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck    12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck    13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-xsm      12 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-xsm      14 guest-saverestore         fail never pass
 test-armhf-armhf-libvirt-qcow2    11 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-qcow2    13 guest-saverestore         fail never pass
 test-armhf-armhf-xl-rtds          13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds          12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-vhd           11 migrate-support-check     fail never pass
 test-armhf-armhf-xl-vhd           12 saverestore-support-check fail never pass

version targeted for testing:
 xen  842e19d951c04c99c27a0fa2bca3d1e677a3
baseline version:
 xen  046e5d0218a0600f9a21fd3b5a5ccfbaaf4357b6

Last test of basis    83820  2016-02-24 03:57:12 Z    7 days
Testing same since    84924  2016-03-01 13:42:37 Z    1 days    2 attempts


People who touched revisions under test:
  Ian Campbell 
  Ian Jackson 
  Olaf Hering 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass 

Re: [Xen-devel] [PATCH V15 4/6] libxl: add pvusb API

2016-03-02 Thread Chun Yan Liu


>>> On 3/3/2016 at 02:46 AM, in message <56d7350f.7010...@citrix.com>, George Dunlap wrote:
> On 02/03/16 18:32, George Dunlap wrote: 
> > On 01/03/16 08:09, Chunyan Liu wrote: 
> >> Add pvusb APIs, including: 
> >>  - attach/detach (create/destroy) virtual usb controller. 
> >>  - attach/detach usb device 
> >>  - list usb controller and usb devices 
> >>  - some other helper functions 
> >> 
> >> Signed-off-by: Simon Cao  
> >> Signed-off-by: George Dunlap  
> >> Signed-off-by: Chunyan Liu  
> >> --- 
> >> Changes: 
> >>   reorder usbdev_remove to following three steps: 
> >>   1. Unassign all interfaces from usbback, stopping and returning an 
> >>  error as soon as one attempt fails 
> >>   2. Remove the pvusb xenstore nodes, stopping and returning an error 
> >>  if it fails 
> >>   3. Attempt to re-assign all interfaces to the original drivers, 
> >>  stopping and returning an error as soon as one attempt fails. 
> >  
> > Thanks, Chunyan!  One minor comment about these changes... 
> >  
> >> +static int usbdev_rebind(libxl__gc *gc, const char *busid)
> >> +{
> >> +    char **intfs = NULL;
> >> +    char *usbdev_encode = NULL;
> >> +    char *path = NULL;
> >> +    int i, num = 0;
> >> +    int rc;
> >> +
> >> +    rc = usbdev_get_all_interfaces(gc, busid, &intfs, &num);
> >> +    if (rc) goto out;
> >> +
> >> +    usbdev_encode = usb_interface_xenstore_encode(gc, busid);
> >> +
> >> +    for (i = 0; i < num; i++) {
> >> +        char *intf = intfs[i];
> >> +        char *usbintf_encode = NULL;
> >> +        const char *drvpath;
> >> +
> >> +        /* rebind USB interface to its original driver */
> >> +        usbintf_encode = usb_interface_xenstore_encode(gc, intf);
> >> +        path = GCSPRINTF(USBBACK_INFO_PATH "/%s/%s/driver_path",
> >> +                         usbdev_encode, usbintf_encode);
> >> +        rc = libxl__xs_read_checked(gc, XBT_NULL, path, &drvpath);
> >> +        if (rc) goto out;
> >> +
> >> +        if (drvpath) {
> >> +            rc = bind_usbintf(gc, intf, drvpath);
> >> +            if (rc) {
> >> +                LOGE(ERROR, "Couldn't rebind %s to %s", intf, drvpath);
> >> +                goto out;
> >> +            }
> >> +        }
> >> +    }
> >> +
> >> +    path = GCSPRINTF(USBBACK_INFO_PATH "/%s", usbdev_encode);
> >> +    rc = libxl__xs_rm_checked(gc, XBT_NULL, path);
> >> +
> >> +out:
> >  
> > So it looks like if one of the re-binds fails, then it stops where it is 
> > and leaves the USBBACK re-bind info in xenstore.  In that case it's not 
> > clear to me how that information would ever be removed. 
> >  
> > I think until such time as we have a command to re-attempt the re-bind, 
> >  if there's an error in the actual rebind, it should just break out of 
> > the for loop, and remove the re-bind nodes, and document a way to let 
> > the user try to clean things up. 
> >  
> >> +static int do_usbdev_remove(libxl__gc *gc, uint32_t domid,
> >> +                            libxl_device_usbdev *usbdev)
> >> +{
> >> +    int rc;
> >> +    char *busid;
> >> +    libxl_device_usbctrl usbctrl;
> >> +    libxl_usbctrlinfo usbctrlinfo;
> >> +
> >> +    libxl_device_usbctrl_init(&usbctrl);
> >> +    libxl_usbctrlinfo_init(&usbctrlinfo);
> >> +    usbctrl.devid = usbdev->ctrl;
> >> +
> >> +    rc = libxl_device_usbctrl_getinfo(CTX, domid, &usbctrl, &usbctrlinfo);
> >> +    if (rc) goto out;
> >> +
> >> +    switch (usbctrlinfo.type) {
> >> +    case LIBXL_USBCTRL_TYPE_PV:
> >> +        busid = usbdev_busid_from_ctrlport(gc, domid, usbdev);
> >> +        if (!busid) {
> >> +            rc = ERROR_FAIL;
> >> +            goto out;
> >> +        }
> >> +
> >> +        rc = usbback_dev_unassign(gc, busid);
> >> +        if (rc) goto out;
> >> +
> >> +        rc = libxl__device_usbdev_remove_xenstore(gc, domid, usbdev);
> >> +        if (rc) goto out;
> >> +
> >> +        rc = usbdev_rebind(gc, busid);
> >> +        if (rc) goto out;
> >  
> > I think we need a comment here saying why we're doing things in this 
> > order.  Maybe: 
> >  
> > "Things are done in this order to balance simplicity with robustness in 
> > the case of failure: 
> > * We unbind all interfaces before rebinding any interfaces, so that we 
> > never get into a situation where some interfaces are assigned to usbback 
> > and some are assigned to the original drivers. 
> > * We also unbind the interfaces before removing the pvusb xenstore 
> > nodes, so that if the unbind fails in the middle, the device still shows 
> > up in xl usb-list, and the user can re-try removing it." 
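
The ordering argument in the proposed comment can be checked with a toy
model that injects a failure at each of the three steps and looks at the
state left behind. The struct, the enum and toy_remove() are hypothetical
and only mirror the order of the calls in do_usbdev_remove(); they are not
libxl code.

```c
#include <stdbool.h>

/* Which step of the removal is made to fail in this toy model. */
enum toy_fail { FAIL_NONE, FAIL_UNBIND, FAIL_RM_XENSTORE, FAIL_REBIND };

/* Hypothetical view of one USB device during removal. */
struct toy_dev {
    bool bound_to_usbback;    /* interfaces still assigned to usbback */
    bool listed;              /* pvusb xenstore nodes present (usb-list) */
    bool bound_to_original;   /* interfaces back on their original drivers */
};

/* Same order as the quoted code: unbind, remove xenstore nodes, rebind. */
static int toy_remove(struct toy_dev *d, enum toy_fail fail)
{
    if (fail == FAIL_UNBIND)
        return -1;              /* device still listed, removal can be retried */
    d->bound_to_usbback = false;

    if (fail == FAIL_RM_XENSTORE)
        return -1;
    d->listed = false;

    if (fail == FAIL_REBIND)
        return -1;              /* gone from the guest, but not rebound on the host */
    d->bound_to_original = true;

    return 0;
}
```

In this model a failed unbind leaves the device listed so the user can retry,
and at no point is the device bound to usbback and to its original drivers
at the same time.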
>  
> Sorry, just looked through the rest of the series, and there's one more 
> thing. 
>  
> Neither here nor in the man page do we explain what to do if something 
> goes wrong with the detach.  I think the best thing to do is probably to 
> make the logged error messages more helpful. 
>  
> What about something like this: 
>  
> * On fai

[Xen-devel] [libvirt test] 85019: tolerable FAIL - PUSHED

2016-03-02 Thread osstest service owner
flight 85019 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/85019/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check        fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore            fail never pass
 test-amd64-i386-libvirt        12 migrate-support-check        fail never pass
 test-amd64-i386-libvirt-xsm    12 migrate-support-check        fail never pass
 test-amd64-amd64-libvirt       12 migrate-support-check        fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm   12 migrate-support-check        fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd   11 migrate-support-check        fail never pass
 test-armhf-armhf-libvirt       14 guest-saverestore            fail never pass
 test-armhf-armhf-libvirt       12 migrate-support-check        fail never pass
 test-armhf-armhf-libvirt-xsm   12 migrate-support-check        fail never pass
 test-armhf-armhf-libvirt-xsm   14 guest-saverestore            fail never pass
 test-armhf-armhf-libvirt-raw   13 guest-saverestore            fail never pass
 test-armhf-armhf-libvirt-raw   11 migrate-support-check        fail never pass

version targeted for testing:
 libvirt  95aa1017951e410b6e1ebbc685034ac4cc49c6fb
baseline version:
 libvirt  33fb8ff185846a7b4974105d2c9400690a6f95cf

Last test of basis    84468  2016-02-29 04:24:05 Z    2 days
Testing same since    85019  2016-03-02 04:25:57 Z    0 days    1 attempts


People who touched revisions under test:
  Alexander Burluka 
  Daniel Veillard 
  Henning Schild 
  Jason J. Herne 
  Jiri Denemark 
  John Ferlan 
  Marc-André Lureau 
  Martin Kletzander 
  Michal Privoznik 
  Nikolay Shirokovskiy 
  Pavel Hrdina 
  Peter Krempa 
  Shanzhi Yu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt fail
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pair  pass
 test-amd64-i386-libvirt-pair pass
 test-armhf-armhf-libvirt-qcow2   fail
 test-armhf-armhf-libvirt-raw fail
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=libvirt
+ revision=95aa1017951e410b6e1ebbc685034ac4cc49c6fb
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/os

Re: [Xen-devel] [PATCH v10 31/31] cmdline switches and config vars to control colo-proxy

2016-03-02 Thread Wen Congyang
On 03/02/2016 11:05 PM, Wei Liu wrote:
> On Mon, Feb 22, 2016 at 10:52:35AM +0800, Wen Congyang wrote:
>> Add cmdline switches to 'xl migrate-receive' command to specify
>> a domain-specific hotplug script to setup COLO proxy.
>>
>> Add a new config var 'colo.default.agentscript' to xl.conf, that
>> allows the user to override the default global script used to
>> setup COLO proxy.
>>
>> Signed-off-by: Yang Hongyang 
>> Signed-off-by: Wen Congyang 
>> ---
>>  docs/man/xl.conf.pod.5  |  6 ++
>>  docs/man/xl.pod.1   |  1 -
>>  tools/libxl/libxl.c |  6 ++
>>  tools/libxl/libxl_create.c  | 14 --
>>  tools/libxl/libxl_types.idl |  1 +
>>  tools/libxl/xl.c|  3 +++
>>  tools/libxl/xl.h|  1 +
>>  tools/libxl/xl_cmdimpl.c| 47 
>> ++---
>>  8 files changed, 65 insertions(+), 14 deletions(-)
>>
>> diff --git a/docs/man/xl.conf.pod.5 b/docs/man/xl.conf.pod.5
>> index 8ae19bb..8f7fd28 100644
>> --- a/docs/man/xl.conf.pod.5
>> +++ b/docs/man/xl.conf.pod.5
>> @@ -111,6 +111,12 @@ Configures the default script used by Remus to setup 
>> network buffering.
>>  
>>  Default: C
>>  
>> +=item B
>> +
>> +Configures the default script used by COLO to setup colo-proxy.
>> +
>> +Default: C
>> +
>>  =item B
>>  
>>  Configures the default output format used by xl when printing "machine
>> diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
>> index 4f1901d..edeafcf 100644
>> --- a/docs/man/xl.pod.1
>> +++ b/docs/man/xl.pod.1
>> @@ -454,7 +454,6 @@ N.B: Remus support in xl is still in experimental 
>> (proof-of-concept) phase.
>>   Disk replication support is limited to DRBD disks.
>>  
>>   COLO support in xl is still in experimental (proof-of-concept) phase.
>> - There is no support for network at the moment.
> 
> 
> Same here, missing documentation on how to use the new parameters (if
> any). Please provide adequate documentation otherwise we can't
> meaningfully review the rest of this patch.

OK, will fix it in the next version.

Thanks
Wen Congyang

> 
> Wei.
> 
> 
> .
> 




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 24/31] Support colo mode for qemu disk

2016-03-02 Thread Wen Congyang
On 03/02/2016 11:04 PM, Wei Liu wrote:
> On Mon, Feb 22, 2016 at 10:52:28AM +0800, Wen Congyang wrote:
>> Usage: disk = 
>> ['...,colo,colo-host=xxx,colo-port=xxx,colo-export=xxx,active-disk=xxx,hidden-disk=xxx...']
>> For QEMU block replication details:
>> http://wiki.qemu.org/Features/BlockReplication
>>
>> Signed-off-by: Wen Congyang 
>> Signed-off-by: Yang Hongyang 
>> ---
>>  docs/man/xl.pod.1   |   2 +-
>>  docs/misc/xl-disk-configuration.txt |  50 ++
>>  tools/libxl/libxl.c |  62 +++-
>>  tools/libxl/libxl_create.c  |  25 -
>>  tools/libxl/libxl_device.c  |  54 +++
>>  tools/libxl/libxl_dm.c  | 184 
>> ++--
>>  tools/libxl/libxl_types.idl |   7 ++
>>  tools/libxl/libxlu_disk_l.l |   7 ++
>>  8 files changed, 382 insertions(+), 9 deletions(-)
>>
>> diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
>> index 1c6dd87..4f1901d 100644
>> --- a/docs/man/xl.pod.1
>> +++ b/docs/man/xl.pod.1
>> @@ -454,7 +454,7 @@ N.B: Remus support in xl is still in experimental 
>> (proof-of-concept) phase.
>>   Disk replication support is limited to DRBD disks.
>>  
>>   COLO support in xl is still in experimental (proof-of-concept) phase.
>> - There is no support for network or disk at the moment.
>> + There is no support for network at the moment.
> 
> You need some documentation here for the syntax, otherwise users have no clue
> how to configure disk replication support. I also won't be able to
> meaningfully review this patch without a reference.

OK. will fix it in the next version.

> 
>>  
>>  B
>>  
>> diff --git a/docs/misc/xl-disk-configuration.txt 
>> b/docs/misc/xl-disk-configuration.txt
>> index 29f6ddb..6f23c2d 100644
>> --- a/docs/misc/xl-disk-configuration.txt
>> +++ b/docs/misc/xl-disk-configuration.txt
>> @@ -234,6 +234,56 @@ were intentionally created non-sparse to avoid 
>> fragmentation of the
>>  file.
>>  
>>  
> 
> Some nitpicking about the format below.
> 
>> +===
>> +COLO PARAMETERS
>> +===
>> +
>> +
>> +colo
>> +
>> +
>> +Enable COLO HA for disk. For better understanding block replication on
>> +QEMU, please refer to:
>> +http://wiki.qemu.org/Features/BlockReplication
>> +
>> +
>> +colo-host
>> +-
> 
> Blank line here please.
> 
>> +Description:   Secondary host's address
>> +Mandatory: Yes when COLO enabled
>> +
>> +
>> +colo-port
>> +-
> 
> Ditto.
> 
>> +Description:   Secondary port
>> +   We will run a nbd server on secondary host,
>> +   and the nbd server will listen this port.
>> +Mandatory: Yes when COLO enabled
>> +
>> +
>> +colo-export
>> +-
> 
> Here as well. And some more "-"s to match "colo-export".
> 
>> +Description:   We will run a nbd server on secondary host,
>> +   exportname is the nbd server's disk export name.
>> +Mandatory: Yes when COLO enabled
>> +
>> +
>> +active-disk
>> +---
>> +
>> +Description:   This is used by secondary. Secondary guest's write
>> +   will be buffered in this disk.
>> +Mandatory: Yes when COLO enabled
>> +
>> +
>> +hidden-disk
>> +---
>> +
>> +Description:   This is used by secondary. It buffers the original
>> +   content that is modified by the primary VM.
>> +Mandatory: Yes when COLO enabled
>> +
>> +
> 
> The rest of the patch is mainly for manipulating QEMU parameters. I've
> skipped it for now.

If you want to know how QEMU block replication works, you can see:
http://wiki.qemu.org/Features/BlockReplication

> 
>>  
>>  DEPRECATED PARAMETERS, PREFIXES AND SYNTAXES
>>  
>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>> index 12df81a..f691628 100644
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -2309,6 +2309,8 @@ int libxl__device_disk_setdefault(libxl__gc *gc, 
>> libxl_device_disk *disk)
>>  int rc;
>>  
>>  libxl_defbool_setdefault(&disk->discard_enable, !!disk->readwrite);
>> +libxl_defbool_setdefault(&disk->colo_enable, false);
>> +libxl_defbool_setdefault(&disk->colo_restore_enable, false);
>>  
>>  rc = libxl__resolve_domid(gc, disk->backend_domname, 
>> &disk->backend_domid);
>>  if (rc < 0) return rc;
>> @@ -2507,6 +2509,18 @@ static void device_disk_add(libxl__egc *egc, uint32_t 
>> domid,
>>  flexarray_append(back, "params");
>>  flexarray_append(back, GCSPRINTF("%s:%s",
>>
>> libxl__device_disk_string_of_format(disk->format), disk->pdev_path));
>> +if (libxl_defbool_val(disk->colo_enable)) {
>> +flexarray_append(back, "colo-host");
>> +flexarray_append(back, libxl__

Re: [Xen-devel] [PATCH v10 22/31] implement the cmdline for COLO

2016-03-02 Thread Wen Congyang
On 03/02/2016 11:03 PM, Wei Liu wrote:
> On Mon, Feb 22, 2016 at 10:52:26AM +0800, Wen Congyang wrote:
> [...]
>> +if (libxl_defbool_val(info->colo)) {
>> +if (libxl_defbool_val(info->compression)) {
> 
> This can be simplified as
> 
>if (libxl_defbool_val(xxx) && libxl_defbool_val(yyy))

OK. will fix it in the next version.
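A minimal sketch of the simplification suggested above, assuming a tri-state boolean that behaves like `libxl_defbool` (positive is true, negative is false, zero is unset). The `sketch_defbool` type and both function names are hypothetical stand-ins, not the libxl API.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for libxl_defbool: >0 true, <0 false, 0 unset (an assumption for this sketch). */
typedef struct { int val; } sketch_defbool;

static bool sketch_defbool_val(sketch_defbool db)
{
    assert(db.val != 0);   /* the value must have been defaulted first */
    return db.val > 0;
}

/* Returns 0 if the combination is allowed, -1 if compression is requested in COLO mode. */
static int sketch_check_colo(sketch_defbool colo, sketch_defbool compression)
{
    /* One combined test replaces the two nested ifs. */
    if (sketch_defbool_val(colo) && sketch_defbool_val(compression))
        return -1;
    return 0;
}
```

Because `&&` short-circuits, the second value is only inspected when the first is true, so the combined form evaluates the same calls as the nested version.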

> 
>> +LOG(ERROR, "cannot use memory checkpoint compression in COLO 
>> mode");
>> +rc = ERROR_FAIL;
>> +goto out;
>> +}
>> +}
>> +
>>  if (!libxl_defbool_val(info->allow_unsafe) &&
>>  (libxl_defbool_val(info->blackhole) ||
>>   !libxl_defbool_val(info->netbuf) ||
>> @@ -876,7 +892,10 @@ int libxl_domain_remus_start(libxl_ctx *ctx, 
>> libxl_domain_remus_info *info,
>>  dss->live = 1;
>>  dss->debug = 0;
>>  dss->remus = info;
>> -dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_REMUS;
>> +if (libxl_defbool_val(info->colo))
>> +dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_COLO;
>> +else
>> +dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_REMUS;
>>  
>>  assert(info);
>>  
>> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
>> index df7268b..0dc7220 100644
>> --- a/tools/libxl/xl_cmdimpl.c
>> +++ b/tools/libxl/xl_cmdimpl.c
>> @@ -4440,6 +4440,8 @@ static void migrate_receive(int debug, int daemonize, 
>> int monitor,
>>  char rc_buf;
>>  char *migration_domname;
>>  struct domain_create dom_info;
>> +const char *ha = checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO ?
>> + "COLO" : "Remus";
>>  
>>  signal(SIGPIPE, SIG_IGN);
>>  /* if we get SIGPIPE we'd rather just have it as an error */
>> @@ -4460,6 +4462,9 @@ static void migrate_receive(int debug, int daemonize, 
>> int monitor,
>>  dom_info.send_back_fd = send_fd;
>>  dom_info.migration_domname_r = &migration_domname;
>>  dom_info.checkpointed_stream = checkpointed;
>> +if (checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO)
>> +/* COLO uses stdout to send control message to master */
>> +dom_info.quiet = 1;
>>  
> 
> It seems that dom_info->quiet affects stderr, not stdout. See the only
> place that checks this in xl_cmdimpl.c.
> 
>>  rc = create_domain(&dom_info);
>>  if (rc < 0) {
>> @@ -4472,11 +4477,12 @@ static void migrate_receive(int debug, int 
>> daemonize, int monitor,
>>  
>>  switch (checkpointed) {
>>  case LIBXL_CHECKPOINTED_STREAM_REMUS:
>> +case LIBXL_CHECKPOINTED_STREAM_COLO:
>>  /* If we are here, it means that the sender (primary) has crashed.
>>   * TODO: Split-Brain Check.
>>   */
>> -fprintf(stderr, "migration target: Remus Failover for domain %u\n",
>> -domid);
>> +fprintf(stderr, "migration target: %s Failover for domain %u\n",
>> +ha, domid);
>>  
>>  /*
>>   * If domain renaming fails, lets just continue (as we need the 
>> domain
>> @@ -4492,16 +4498,20 @@ static void migrate_receive(int debug, int 
>> daemonize, int monitor,
>>  rc = libxl_domain_rename(ctx, domid, migration_domname,
>>   common_domname);
>>  if (rc)
>> -fprintf(stderr, "migration target (Remus): "
>> +fprintf(stderr, "migration target (%s): "
>>  "Failed to rename domain from %s to %s:%d\n",
>> -migration_domname, common_domname, rc);
>> +ha, migration_domname, common_domname, rc);
>>  }
>>  
>> +if (checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO)
>> +/* The guest is running after failover in COLO mode */
>> +exit(rc ? -ERROR_FAIL: 0);
>> +
>>  rc = libxl_domain_unpause(ctx, domid);
>>  if (rc)
>> -fprintf(stderr, "migration target (Remus): "
>> +fprintf(stderr, "migration target (%s): "
>>  "Failed to unpause domain %s (id: %u):%d\n",
>> -common_domname, domid, rc);
>> +ha, common_domname, domid, rc);
>>  
>>  exit(rc ? -ERROR_FAIL: 0);
>>  default:
>> @@ -4649,7 +4659,7 @@ int main_migrate_receive(int argc, char **argv)
>>  libxl_checkpointed_stream checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
>>  int opt;
>>  
>> -SWITCH_FOREACH_OPT(opt, "Fedr", NULL, "migrate-receive", 0) {
>> +SWITCH_FOREACH_OPT(opt, "Fedrc", NULL, "migrate-receive", 0) {
>>  case 'F':
>>  daemonize = 0;
>>  break;
>> @@ -4663,6 +4673,9 @@ int main_migrate_receive(int argc, char **argv)
>>  case 'r':
>>  checkpointed = LIBXL_CHECKPOINTED_STREAM_REMUS;
>>  break;
>> +case 'c':
>> +checkpointed = LIBXL_CHECKPOINTED_STREAM_COLO;
>> +break;
>>  }
>>  
>>  if (argc-optind != 0) {
>> @@ -8032,11 +8045,8 @@ int main_remus(int argc, char **

Re: [Xen-devel] [PATCH v10 10/31] tools/libxl: add back channel support to write stream

2016-03-02 Thread Wen Congyang
On 03/02/2016 11:02 PM, Wei Liu wrote:
> On Fri, Feb 26, 2016 at 10:11:27AM +0800, Wen Congyang wrote:
>> On 02/25/2016 11:54 PM, Wei Liu wrote:
>>> On Mon, Feb 22, 2016 at 10:52:14AM +0800, Wen Congyang wrote:
>>>> Add back channel support to write stream. If the write stream is
>>>> a back channel stream, this means the write stream is used by
>>>> Secondary to send some records back.
>>>>
>>>> Signed-off-by: Yang Hongyang 
>>>> Signed-off-by: Wen Congyang 
>>>> ---
>>>>  tools/libxl/libxl_dom_save.c |  1 +
>>>>  tools/libxl/libxl_internal.h |  1 +
>>>>  tools/libxl/libxl_stream_write.c | 26 --
>>>>  3 files changed, 22 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
>>>> index 72b61c7..18946c5 100644
>>>> --- a/tools/libxl/libxl_dom_save.c
>>>> +++ b/tools/libxl/libxl_dom_save.c
>>>> @@ -404,6 +404,7 @@ void libxl__domain_save(libxl__egc *egc, 
>>>> libxl__domain_save_state *dss)
>>>>  dss->sws.ao  = dss->ao;
>>>>  dss->sws.dss = dss;
>>>>  dss->sws.fd  = dss->fd;
>>>> +dss->sws.back_channel = false;
>>>>  dss->sws.completion_callback = stream_done;
>>>>  
>>>>  libxl__stream_write_start(egc, &dss->sws);
>>>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>>>> index 3d3e8e8..e02e554 100644
>>>> --- a/tools/libxl/libxl_internal.h
>>>> +++ b/tools/libxl/libxl_internal.h
>>>> @@ -3044,6 +3044,7 @@ struct libxl__stream_write_state {
>>>>  libxl__ao *ao;
>>>>  libxl__domain_save_state *dss;
>>>>  int fd;
>>>> +bool back_channel;
>>>>  void (*completion_callback)(libxl__egc *egc,
>>>>  libxl__stream_write_state *sws,
>>>>  int rc);
>>>> diff --git a/tools/libxl/libxl_stream_write.c 
>>>> b/tools/libxl/libxl_stream_write.c
>>>> index f6ea55d..5379126 100644
>>>> --- a/tools/libxl/libxl_stream_write.c
>>>> +++ b/tools/libxl/libxl_stream_write.c
>>>> @@ -49,6 +49,13 @@
>>>>   *  - if (hvm)
>>>>   *  - Emulator context record
>>>>   *  - Checkpoint end record
>>>> + *
>>>> + * For back channel stream:
>>>> + * - libxl__stream_write_start()
>>>> + *- Set up the stream to running state
>>>> + *
>>>> + * - Add a new API to write the record. When the record is written
>>>> + *   out, call stream->checkpoint_callback() to return.
>>>
>>> What does this mean? Which new API?
>>
>> The next patch introduces this API. The commit is very old.
>>
>> I think I can merge these two patches into one patch.
>>
> 
> Please reference the actual function / API.
> 
>>>
>>>>   */
>>>>  
>>>>  /* Success/error/cleanup handling. */
>>>> @@ -225,6 +232,15 @@ void libxl__stream_write_start(libxl__egc *egc,
>>>>  
>>>>  stream->running = true;
>>>>  
>>>> +dc->ao= ao;
>>>> +dc->readfd= -1;
>>>> +dc->copywhat  = "save v2 stream";
>>>> +dc->writefd   = stream->fd;
>>>> +dc->maxsz = -1;
>>>> +
>>>> +if (stream->back_channel)
>>>> +return;
>>>> +
>>>
>>> There seems to be very subtle change of behaviour.
>>>
>>> Before this patch, the dc->* are not set until the emulator check is
>>> done. With this patch, it is possible in the normal case some of the
>>> fields are initialised in the error path.
>>>
>>> I think this is OK given the callbacks in the upper layer and in
>>> the writer don't rely on those fields to clean up. Just one thing to
>>> note.
>>>
>>> That said, I suggest you move all initialisation of dc->* in one place.
>>
>> OK, I will do it.
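One way to read the suggestion above is to gather the stream-independent `dc->*` assignments in a single helper that runs before the back-channel early return. The sketch below uses trimmed-down, hypothetical structures (`sketch_dc`, `sketch_stream`) rather than the real libxl datacopier and stream state, so it only illustrates the shape of the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the datacopier and stream state. */
struct sketch_dc {
    int readfd, writefd;
    long maxsz;
    const char *copywhat;
    const char *writewhat;
};

struct sketch_stream {
    int fd;
    bool back_channel;
    bool running;
    struct sketch_dc dc;
};

/* All fields that do not depend on the kind of stream are set in one place. */
static void sketch_dc_init(struct sketch_stream *s)
{
    struct sketch_dc *dc = &s->dc;

    dc->readfd = -1;
    dc->writefd = s->fd;
    dc->maxsz = -1;
    dc->copywhat = "save v2 stream";
    dc->writewhat = NULL;
}

static void sketch_stream_start(struct sketch_stream *s)
{
    assert(!s->running);
    s->running = true;
    sketch_dc_init(s);

    if (s->back_channel)
        return;                      /* back channel: no stream header is written */

    s->dc.writewhat = "stream header";
}
```

With this layout both the normal and the back-channel path leave the common fields in the same state, which avoids the partially initialised case noted in the review.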
>>
>>>
>>>>  if (dss->type == LIBXL_DOMAIN_TYPE_HVM) {
>>>>  stream->device_model_version =
>>>>  libxl__device_model_version_running(gc, dss->domid);
>>>> @@ -249,12 +265,7 @@ void libxl__stream_write_start(libxl__egc *egc,
>>>>  stream->emu_sub_hdr.index = 0;
>>>>  }
>>>>  
>>>> -dc->ao= ao;
>>>> -dc->readfd= -1;
>>>>  dc->writewhat = "stream header";
>>>> -dc->copywhat  = "save v2 stream";
>>>> -dc->writefd   = stream->fd;
>>>> -dc->maxsz = -1;
>>>>  dc->callback  = stream_header_done;
>>>>  
>>>>  rc = libxl__datacopier_start(dc);
>>>> @@ -279,6 +290,7 @@ void libxl__stream_write_start_checkpoint(libxl__egc 
>>>> *egc,
>>>>  {
>>>>  assert(stream->running);
>>>>  assert(!stream->in_checkpoint);
>>>> +assert(!stream->back_channel);
>>>>  stream->in_checkpoint = true;
>>>>  
>>>>  write_emulator_xenstore_record(egc, stream);
>>>> @@ -590,7 +602,9 @@ static void stream_done(libxl__egc *egc,
>>>>  libxl__carefd_close(stream->emu_carefd);
>>>>  free(stream->emu_body);
>>>>  
>>>> -check_all_finished(egc, stream, rc);
>>>> +if (!stream->back_channel)
>>>> +/* back channel stream doesn't have save helper */
>>>> +check_all_finished(egc, stream, rc);
>>>
>>> Though it doesn't have helper, do you not need to check if the back
>>> channel stream itself is OK? The comment itself 

Re: [Xen-devel] [PATCH v2 1/2] x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled

2016-03-02 Thread Luis R. Rodriguez
On Wed, Mar 02, 2016 at 04:33:06PM -0800, Andy Lutomirski wrote:
> On Tue, Mar 1, 2016 at 4:15 PM, Luis R. Rodriguez  wrote:
> > Ingo, your feedback appreciated at the end here, regarding quirks.
> >
> > On Tue, Mar 01, 2016 at 09:00:53AM -0500, Boris Ostrovsky wrote:
> >> On 02/29/2016 06:50 PM, Andy Lutomirski wrote:
> >> >diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> >> >index 91ddae732a36..c6ef4da8e4f4 100644
> >> >--- a/arch/x86/kernel/cpu/common.c
> >> >+++ b/arch/x86/kernel/cpu/common.c
> >> >@@ -979,6 +979,31 @@ static void identify_cpu(struct cpuinfo_x86 *c)
> >
> > Note: Andy's change is on identify_cpu() modification here at the end.
> >
> >> >  #ifdef CONFIG_NUMA
> >> > numa_add_cpu(smp_processor_id());
> >> >  #endif
> >> >+
> >> >+/*
> >> >+ * ESPFIX is a strange bug.  All real CPUs have it.  Paravirt
> >> >+ * systems that run Linux at CPL > 0 may or may not have the
> >> >+ * issue, but, even if they have the issue, there's absolutely
> >> >+ * nothing we can do about it because we can't use the real IRET
> >> >+ * instruction.
> >> >+ *
> >> >+ * NB: For the time being, only 32-bit kernels support
> >> >+ * X86_BUG_ESPFIX as such.  64-bit kernels directly choose
> >> >+ * whether to apply espfix using paravirt hooks.  If any
> >> >+ * non-paravirt system ever shows up that does *not* have the
> >> >+ * ESPFIX issue, we can change this.
> >> >+ */
> >> >+#ifdef CONFIG_X86_32
> >> >+#ifdef CONFIG_PARAVIRT
> >> >+do {
> >> >+extern void native_iret(void);
> >> >+if (pv_cpu_ops.iret == native_iret)
> >> >+set_cpu_bug(c, X86_BUG_ESPFIX);
> >> >+} while (0);
> >> >+#else
> >> >+set_cpu_bug(c, X86_BUG_ESPFIX);
> >> >+#endif
> >> >+#endif
> >> >  }
> >> >  /*
> >>
> >> Alternatively, PV guests can clear X86_BUG_ESPFIX in their init
> >> code. E.g in .set_cpu_features op, just like we do for
> >> X86_BUG_SYSRET_SS_ATTRS
> >
> > Andy's proposal works out of identify_cpu() and that covers both boot
> > processor and secondary CPUs. The summary is as follows, tracing back in
> > time from left to right.
> >
> > --- identify_boot_cpu() --- check_bugs() --- start_kernel()
> >/
> > identify_cpu()<
> >\
> > --- identify_secondary_cpu() --- cpu_up() --- smp_init()
> > --- kernel_init_freeable() --- kernel_init()
> > --- rest_init() --- start_kernel()
> >
> >
> > set_cpu_features() is called from both: init_hypervisor_platform()
> > during setup_arch() and identify_cpu(). Since it'll be called on
> > check_bugs() already on identify_boot_cpu() though I think the
> > call from init_hypervisor_platform() seems redundant ?
> >
> > We assume we just call:
> >
> > setup_arch() --> init_hypervisor_platform() --> 
> > init_hypervisor(&boot_cpu_data)
> >
> > But the above map on identify_cpu() also shows we call:
> >
> > start_kernel --> check_bugs() --> identify_boot_cpu() -->
> > identify_cpu() --> init_hypervisor() --> set_cpu_features()
> >
> >
> > void init_hypervisor(struct cpuinfo_x86 *c)
> > {
> > if (x86_hyper && x86_hyper->set_cpu_features)
> > x86_hyper->set_cpu_features(c);
> > }
> >
> > Anyway, since we're consolidating quirks, and since it turns out the other
> > quirks are being shifted away from paravirt_enabled() out into another 
> > struct
> > x86_platform_ops CPU specific quirk, I wonder why not just also replace this
> > set_cpu_features() thing as a struct x86_platform_ops quirk CPU callback.
> >
> >> (although this may require adding struct
> >> hypervisor_x86 for lguests. Which I think they should have anyway).
> >
> > lguest already uses x86_platform, and setting up a per CPU quirk would
> > be rather trivial.
> >
> > CPU feature / CPU quirk...
> >
> > I've stashed the other quirks into a x86_early_init_platform_quirks(),
> > this was to have all quirks handled in one place. We handle differences
> > with subarch there. vmware has no subarch though, and it uses its own
> > set_cpu_features(). We have a few options I  can think of:
> >
> >  1) keep this on set_cpu_features() and modify lguest to add a struct 
> > hypervisor_x86
> > as boris suggests
> >
> >  2) move set_cpu_features() as a platform feature / quirk callback and
> > call it on identify_cpu()
> >
> >  3) Just identify each quirk on struct x86_platform, with a set of defaults
> > set. Then identify_cpu() enables a platform  callback to override
> > defaults, and finally then a shared quirk call is issued to
> > set the different set_cpu_features() or clear them.
> >
> 
> I think this is severely overcomplicating the issue.
> 
> The issue is that IRET is a pile of shit.  It may be quirky, but it
> affects *everything*.
> 
> On x86_64, the kernel assumes that the "iret" implementation works
> around the quirk.  xen_iret doesn't, and th

Re: [Xen-devel] what's inside hypercall page?

2016-03-02 Thread quizyjones
> do_sched_op is self explaining: it is used for scheduling of the vcpu.
> A vcpu going to idle is using this hypercall. So any interrupt waking
> the vcpu up will seem to occur very near to the hypercall.

> do_xen_version is often used as a very fast way to execute the check
> for pending events in the hypervisor (kind of polling).

> do_multicall might run for a long time. So the hypervisor returns to
> the caller from time to time setting IP to the hypercall. The caller
> has the chance to react to interrupts and will then continue the
> hypercall.
> 
> 
> HTH, Juergen


Thanks for the reply. Does that mean we cannot predict when these two 
hypercalls will finish? I want to set up an interval to monitor the instructions 
(one monitoring pass per hypercall), so as to reduce the performance cost. This 
requires accurately predicting how the instructions execute, so that no 
hypercall is missed. Is that possible? The main problem is the execution of 
syscall (0x050f): since each hypercall behaves differently, how can I predict 
where execution will go after the syscall returns?
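As a rough illustration of locating the syscall inside one hypercall page entry: the bytes 0x0f 0x05 are the x86-64 SYSCALL opcode, which read as a little-endian 16-bit word give the 0x050f mentioned above. The helper below is a hypothetical sketch (the name `sketch_find_syscall` and the idea of passing a copy of one entry are assumptions); it is a naive byte scan and not a disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the byte offset of the first SYSCALL opcode (0x0f 0x05) in a stub,
 * or -1 if none is found. This is a naive byte scan, not a disassembler, so
 * it can be fooled by immediates that happen to contain the same bytes. */
static long sketch_find_syscall(const uint8_t *stub, size_t len)
{
    assert(stub != NULL);
    for (size_t i = 0; i + 1 < len; i++)
        if (stub[i] == 0x0f && stub[i + 1] == 0x05)
            return (long)i;
    return -1;
}
```

Whatever follows the returned offset in the same entry is what runs after the hypervisor returns, so reading those bytes for each entry is one way to see where execution continues.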


Re: [Xen-devel] [PATCH v2 0/2] x86/entry/32: Get rid of paravirt_enabled in ESPFIX

2016-03-02 Thread Andy Lutomirski
On Mar 1, 2016 2:46 PM, "Borislav Petkov"  wrote:
>
> On Mon, Feb 29, 2016 at 03:50:18PM -0800, Andy Lutomirski wrote:
> > Borislav, if you're okay with this (ab)use of the cpufeatures stuff
>
> Because of X86_BUG_ESPFIX? Why abuse?

Because I'm mixing paravirt and cpufeatures a bit oddly.

>
> --
> Regards/Gruss,
> Boris.
>
> ECO tip #101: Trim your mails when you reply.



Re: [Xen-devel] [PATCH v2 1/2] x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled

2016-03-02 Thread Andy Lutomirski
On Tue, Mar 1, 2016 at 4:15 PM, Luis R. Rodriguez  wrote:
> Ingo, your feedback appreciated at the end here, regarding quirks.
>
> On Tue, Mar 01, 2016 at 09:00:53AM -0500, Boris Ostrovsky wrote:
>> On 02/29/2016 06:50 PM, Andy Lutomirski wrote:
>> >diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
>> >index 91ddae732a36..c6ef4da8e4f4 100644
>> >--- a/arch/x86/kernel/cpu/common.c
>> >+++ b/arch/x86/kernel/cpu/common.c
>> >@@ -979,6 +979,31 @@ static void identify_cpu(struct cpuinfo_x86 *c)
>
> Note: Andy's change is on identify_cpu() modification here at the end.
>
>> >  #ifdef CONFIG_NUMA
>> > numa_add_cpu(smp_processor_id());
>> >  #endif
>> >+
>> >+/*
>> >+ * ESPFIX is a strange bug.  All real CPUs have it.  Paravirt
>> >+ * systems that run Linux at CPL > 0 may or may not have the
>> >+ * issue, but, even if they have the issue, there's absolutely
>> >+ * nothing we can do about it because we can't use the real IRET
>> >+ * instruction.
>> >+ *
>> >+ * NB: For the time being, only 32-bit kernels support
>> >+ * X86_BUG_ESPFIX as such.  64-bit kernels directly choose
>> >+ * whether to apply espfix using paravirt hooks.  If any
>> >+ * non-paravirt system ever shows up that does *not* have the
>> >+ * ESPFIX issue, we can change this.
>> >+ */
>> >+#ifdef CONFIG_X86_32
>> >+#ifdef CONFIG_PARAVIRT
>> >+do {
>> >+extern void native_iret(void);
>> >+if (pv_cpu_ops.iret == native_iret)
>> >+set_cpu_bug(c, X86_BUG_ESPFIX);
>> >+} while (0);
>> >+#else
>> >+set_cpu_bug(c, X86_BUG_ESPFIX);
>> >+#endif
>> >+#endif
>> >  }
>> >  /*
>>
>> Alternatively, PV guests can clear X86_BUG_ESPFIX in their init
>> code. E.g in .set_cpu_features op, just like we do for
>> X86_BUG_SYSRET_SS_ATTRS
>
> Andy's proposal works out of identify_cpu() and that covers both boot
> processor and secondary CPUs. The summary is as follows, tracing back in
> time from left to right.
>
> --- identify_boot_cpu() --- check_bugs() --- start_kernel()
>/
> identify_cpu()<
>\
> --- identify_secondary_cpu() --- cpu_up() --- smp_init()
> --- kernel_init_freeable() --- kernel_init()
> --- rest_init() --- start_kernel()
>
>
> set_cpu_features() is called from both: init_hypervisor_platform()
> during setup_arch() and identify_cpu(). Since it'll be called on
> check_bugs() already on identify_boot_cpu() though I think the
> call from init_hypervisor_platform() seems redundant ?
>
> We assume we just call:
>
> setup_arch() --> init_hypervisor_platform() --> 
> init_hypervisor(&boot_cpu_data)
>
> But the above map on identify_cpu() also shows we call:
>
> start_kernel --> check_bugs() --> identify_boot_cpu() -->
> identify_cpu() --> init_hypervisor() --> set_cpu_features()
>
>
> void init_hypervisor(struct cpuinfo_x86 *c)
> {
> if (x86_hyper && x86_hyper->set_cpu_features)
> x86_hyper->set_cpu_features(c);
> }
>
> Anyway, since we're consolidating quirks, and since it turns out the other
> quirks are being shifted away from paravirt_enabled() out into another struct
> x86_platform_ops CPU specific quirk, I wonder why not just also replace this
> set_cpu_features() thing as a struct x86_platform_ops quirk CPU callback.
>
>> (although this may require adding struct
>> hypervisor_x86 for lguests. Which I think they should have anyway).
>
> lguest already uses x86_platform, and setting up a per CPU quirk would
> be rather trivial.
>
> CPU feature / CPU quirk...
>
> I've stashed the other quirks into a x86_early_init_platform_quirks(),
> this was to have all quirks handled in one place. We handle differences
> with subarch there. vmware has no subarch though, and it uses its own
> set_cpu_features(). We have a few options I  can think of:
>
>  1) keep this on set_cpu_features() and modify lguest to add a struct 
> hypervisor_x86
> as boris suggests
>
>  2) move set_cpu_features() as a platform feature / quirk callback and
> call it on identify_cpu()
>
>  3) Just identify each quirk on struct x86_platform, with a set of defaults
> set. Then identify_cpu() enables a platform  callback to override
> defaults, and finally then a shared quirk call is issued to
> set the different set_cpu_features() or clear them.
>

I think this is severely overcomplicating the issue.

The issue is that IRET is a pile of shit.  It may be quirky, but it
affects *everything*.

On x86_64, the kernel assumes that the "iret" implementation works
around the quirk.  xen_iret doesn't, and that's Xen's problem.

On x86_32, it's inconvenient for native_iret to directly work around
the quirk.  Instead, some other asm code in the exit path sets up the
workaround under the assumption that native_iret is just plain IRET.
It's the responsibility of other IRET implementations to have their
own implementat

[Xen-devel] [xen-unstable baseline-only test] 44208: regressions - FAIL

2016-03-02 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 44208 xen-unstable real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/44208/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-xsm  15 guest-start/debian.repeat fail REGR. vs. 44200
 test-armhf-armhf-libvirt-qcow2  9 debian-di-install   fail REGR. vs. 44200

Regressions which are regarded as allowable (not blocking):
 build-amd64-rumpuserxen   6 xen-build    fail blocked in 44200
 build-i386-rumpuserxen    6 xen-build    fail blocked in 44200
 test-amd64-amd64-xl-credit2 19 guest-start/debian.repeat fail blocked in 44200
 test-amd64-amd64-xl-xsm 19 guest-start/debian.repeat fail blocked in 44200
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop    fail blocked in 44200
 test-amd64-amd64-qemuu-nested-intel 16 debian-hvm-install/l1/l2 fail blocked in 44200

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check    fail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore    fail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestore    fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check    fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check    fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check    fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check    fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-midway   13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-midway   12 migrate-support-check    fail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check    fail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check    fail  never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check    fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check    fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check    fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check    fail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-check    fail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-check    fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check    fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestore    fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check    fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check    fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 xen  986d9fc3bbf8a6d9d088ca22d1422bd5de249396
baseline version:
 xen  42391c613d42248d82f1b04c523d48bf141b75dc

Last test of basis    44200  2016-03-01 03:57:40 Z    1 days
Testing same since    44208  2016-03-02 13:50:31 Z    0 days    1 attempts


People who touched revisions under test:
  Boris Ostrovsky 
  Corneliu ZUZU 
  Dario Faggioli 
  Doug Goldstein 
  George Dunlap 
  George Dunlap 
  Haozhong Zhang 
  Ian Campbell 
  Ian Jackson 
  Jan Beulich 
  Parth Dixit 
  Razvan Cojocaru 
  Shannon Zhao 
  Stefano Stabellini 
  Tamas K Lengyel 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt 

Re: [Xen-devel] Prototype Code Review Dashboards (input required)

2016-03-02 Thread Daniel Izquierdo

On 01/03/16 18:04, Lars Kurth wrote:

Daniel, Jesus,

I am going to break my comments down into different sections to make this more 
consumable. Let's focus on the A1-A3 use-cases in this mail.

First I wanted to start off with some questions about definitions, as I am 
seeing some discrepancies in some of the data shown and am trying to understand 
exactly what the data means. After that I will look at the individual sections.

General comments on the Xen-A1.A2.A3 dashboard
- I played with some filters and noticed some oddities, e.g. if I filter on "merged: 
0" all views change as expected
- If I filter on "merged: 1", a lot of widgets show no data. Is this showing 
that there is an issue with the data somewhere?
- I see similar issues with other filters, e.g. 'emailtype: "patch"'


In order to bring some context to the dataset, ElasticSearch was 
initially used for parsing Apache logs. That means that data should be 
formatted as 'a row = an event'.


In this dataset there are several kinds of event, distinguished by the 
field 'EmailType': 'patchserie', 'patch', 'comment' and 'flag'. Depending 
on the 'EmailType', each of the columns may or may not carry meaning.


This structure uses 'EmailType' as the central key, and the rest of the 
columns add detail to it. For instance, the post_ack_comment field only 
makes sense for EmailType:comment.


Coming back to the comments:

There are fields that apply only to specific types of events. The 
'merge' field applies only to patches. merge:1 keeps the patches that 
were merged (the rest of the information is removed, as it is not 
marked as merged). If we filter by merge:0, we get everything else 
(flags included).


Thus, using the filter merge:1 leads to having info only related to 
'patches' in this case.


This panel also shows information about types other than 'patch'. If you 
filter by an 'emailtype' such as 'patch', you are looking only at patch 
data, and this will display both the merged and the unmerged ones.
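
To make the filter behaviour described above concrete, here is a minimal C sketch of the 'one row = one event' layout. The struct, the enum names and the choice to store merged = 0 on rows where the field does not apply are assumptions made for illustration; they are not the dashboard's actual schema.

```c
#include <stddef.h>

/* Hypothetical event types, mirroring the 'EmailType' values named above. */
enum email_type { PATCHSERIE, PATCH, COMMENT, FLAG };

/* One row = one event. 'merged' only carries meaning for PATCH rows;
 * by assumption every other row simply stores 0 in it. */
struct event {
    enum email_type type;
    int merged;
};

/* Count the rows of a given type that survive a "merged: <want>" filter. */
size_t count_after_merged_filter(const struct event *rows, size_t n,
                                 enum email_type type, int want)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (rows[i].type == type && rows[i].merged == want)
            hits++;
    return hits;
}
```

With rows like these, a "merged: 1" filter leaves only merged patches, so any widget built on comments or flags goes empty, while "merged: 0" keeps unmerged patches together with every other event type.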


To improve this, we can either create a panel per type of analysis (one 
panel for patches, one for comments, etc.), or we can add the 'merge' 
field to any flag, patchserie, patch and comment whose patch was merged 
at some point. The latter may sound a bit odd, as a 'merged' status does 
not really apply to a flag (Reviewed-by), for instance.



On 1 Mar 2016, at 13:53, Lars Kurth  wrote:

Case of Study A.1: Identify top reviewers (for both individuals and companies)
--

Goal: Highlight review contributions - ability to use the data to "reward" review 
contributions and encourage more "review" contributions

The widgets in question are:
- Evolution 'Reviewed by-flag' (no patchseries, no patches)
- What is the difference to Evolution of patches
- Top People/Domains Reviewing patches

Q1: Are these the reviewed-by flags?


They are only the Reviewed-by flags.


Q2: What is the scope? Do the numbers count
- the # files someone reviewed
- the # patches someone reviewed
- the # series someone reviewed


The number counts the reviews accomplished by a developer or by a 
domain. A review is accomplished when the flag 'reviewed-by' is 
detected in an email replying to a patch.


If a developer reviews several patches or several versions of the same 
patch, each of those is counted as a different review.
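
As a rough illustration of the counting rule just described, the following C sketch scans reply bodies for a Reviewed-by tag and counts one review per tagged reply. The `reply` struct, the function names and the line-start matching rule are assumptions for the example, not the dashboard's real parser.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Return 1 if any line of the reply body starts with a Reviewed-by tag. */
int has_reviewed_by(const char *body)
{
    static const char tag[] = "Reviewed-by:";
    const char *line = body;

    while (line && *line) {
        if (strncasecmp(line, tag, sizeof(tag) - 1) == 0)
            return 1;
        line = strchr(line, '\n');   /* advance to the next line, if any */
        if (line)
            line++;
    }
    return 0;
}

/* Hypothetical reply record: who sent it and what the body says. */
struct reply {
    const char *sender;
    const char *body;
};

/* Every tagged reply counts, so tagging v1 and v2 of a patch gives two reviews. */
size_t count_reviews(const struct reply *replies, size_t n, const char *who)
{
    size_t reviews = 0;

    for (size_t i = 0; i < n; i++)
        if (strcmp(replies[i].sender, who) == 0 &&
            has_reviewed_by(replies[i].body))
            reviews++;
    return reviews;
}
```

Because the tag must start a line, a quoted "> Reviewed-by:" in someone else's reply is not counted in this sketch.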




If a reviewer is solely defined by the reviewed-by tags, the data does not 
provide a correct picture.

This is how this works so far.


It may be better to use the following definition (although, others may disagree)
A reviewer is someone who did one of the following for a patch or series:
- Added a reviewed-by flag
- Added an acked-by flag (maintainers tend to use acked-by)
- Made a comment, but is NOT the author
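
The proposed definition above can be written as a small predicate. This is only a sketch: the `activity` struct and its field names are hypothetical, and it reads the "NOT the author" condition as applying to comments only, which is how the list is worded.

```c
#include <stdbool.h>

/* Hypothetical summary of one person's activity on one patch or series. */
struct activity {
    bool is_author;
    bool gave_reviewed_by;
    bool gave_acked_by;
    bool commented;
};

/* Proposed definition: a tag always counts; a comment counts only
 * when it comes from someone other than the author. */
bool is_reviewer(const struct activity *a)
{
    return a->gave_reviewed_by || a->gave_acked_by ||
           (a->commented && !a->is_author);
}
```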


We can update that definition. Do we want to have extra discussion in 
this respect?



Related to that use-case are also the following widgets
- Evolution of Comments Activity
- Top People/Domains Commenting (which also contain post-ACK comments and are 
thus also related to A.3)
- Evolution of Email activity

Q3: Again, the scope isn't quite clear


This is the number of comments replying to a patch. A comment is defined 
as an email reply to a patch.



Q4: The figures are higher than those in "People/Domains Reviewing patches". 
Are comments on people's own patches included (these would be replies to the comments of 
others)


I should check the last question. I'd say that we're including them, as 
they are 'comments' on a patch. You can indeed comment on your own patches 
:). But we can deal with this if it does not make sense.





Possible places where this could be added : a separate table which is not time 
based, but can be filtered by time
Possible metrics: number of review comments by person, number of patches/patch 
series a person is actively commenting on, number of

[Xen-devel] [xen-4.3-testing test] 85001: regressions - trouble: blocked/broken/fail/pass

2016-03-02 Thread osstest service owner
flight 85001 xen-4.3-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/85001/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf   3 host-install(3) broken REGR. vs. 83004
 build-armhf-pvops 3 host-install(3) broken REGR. vs. 83004
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 83004
 test-amd64-amd64-xl-qemut-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 83004
 test-amd64-i386-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 83004
 test-amd64-i386-xl-qemut-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 83004

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 83004
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 83004
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 83004

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install  fail never pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail never pass
 build-i386-rumpuserxen    6 xen-build    fail   never pass
 build-amd64-rumpuserxen   6 xen-build    fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check    fail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check    fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check    fail   never pass
 test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/check    fail never pass

version targeted for testing:
 xen  404e83e055cb419efccbcb0c5c89476307a9ae46
baseline version:
 xen  ccc7adf9cff5d5f93720afcc1d0f7227d50feab2

Last test of basis    83004  2016-02-18 14:47:44 Z   13 days
Testing same since    84923  2016-03-01 13:41:07 Z    1 days    2 attempts


People who touched revisions under test:
  Ian Campbell 
  Ian Jackson 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  broken  
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  blocked 
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  broken  
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  blocked 
 test-amd64-i386-xl   pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  fail
 test-amd64-i386-xl-qemut-debianhvm-amd64 fail
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  fail
 test-amd64-i386-xl-qemuu-debianhvm-amd64 fail
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail
 test-amd64-amd64-rumpuserxen-amd64   blocked 
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail

[Xen-devel] [linux-mingo-tip-master test] 85018: regressions - FAIL

2016-03-02 Thread osstest service owner
flight 85018 linux-mingo-tip-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/85018/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-rumpuserxen    6 xen-build fail REGR. vs. 60684
 build-amd64-rumpuserxen   6 xen-build fail REGR. vs. 60684
 build-amd64-pvops 5 kernel-build  fail REGR. vs. 60684
 build-i386-pvops  5 kernel-build  fail REGR. vs. 60684

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl    1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm    1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-raw    1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-ch

[Xen-devel] [linux-4.1 test] 84995: regressions - FAIL

2016-03-02 Thread osstest service owner
flight 84995 linux-4.1 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/84995/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumpuserxen   6 xen-build fail REGR. vs. 66399
 build-i386-rumpuserxen6 xen-build fail REGR. vs. 66399
 test-armhf-armhf-xl-xsm  15 guest-start/debian.repeat fail REGR. vs. 66399
 test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeat fail REGR. vs. 66399
 test-armhf-armhf-xl-credit2  15 guest-start/debian.repeat fail REGR. vs. 66399
 test-armhf-armhf-xl  15 guest-start/debian.repeat fail REGR. vs. 66399
 test-armhf-armhf-xl-multivcpu 16 guest-start.2   fail in 82991 REGR. vs. 66399

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start    fail in 82991 pass in 84906
 test-armhf-armhf-xl-cubietruck 11 guest-start  fail in 82991 pass in 84995
 test-armhf-armhf-xl-multivcpu 15 guest-start/debian.repeat fail in 84906 pass in 82991
 test-amd64-amd64-xl  17 guest-localmigrate/x10 fail in 84906 pass in 84995
 test-armhf-armhf-xl-multivcpu 11 guest-start    fail pass in 84906
 test-armhf-armhf-xl-rtds  6 xen-boot    fail pass in 84906

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeat fail in 84906 like 66399
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 66399
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 66399
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 66399
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 66399
 test-armhf-armhf-xl-vhd   9 debian-di-install    fail   like 66399

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail in 84906 never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail in 84906 never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail in 84906 never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail in 84906 never pass
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestore    fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check    fail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore    fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check    fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check    fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check    fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check    fail never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check    fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check    fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check    fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check    fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-libvirt  12 migrate-support-check    fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check    fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check    fail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestore    fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check    fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check    fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check    fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check    fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore    fail never pass
 test-armhf-armhf-libvirt-raw  9 debian-di-install    fail   never pass

version targeted for testing:
 linux    83fdace666f72dbfc4a7681a04e3689b61dae3b9
baseline version:
 linux    07cc49f66973f49a391c91bf4b158fa0f2562ca8

Last test of basis    66399  2015-12-15 18:20:39 Z   78 days
Failing since         78925  2016-01-24 13:50:39 Z   38 days   39 attempts
Testing same since    82845  2016-02-16 14:18:38 Z   15 days   16 attempts

--

Re: [Xen-devel] [PATCH RFC 0/8] x86/hvm, libxl: HVM SMT topology support

2016-03-02 Thread Andrew Cooper
On 02/03/16 19:18, Joao Martins wrote:
>
> On 02/25/2016 05:21 PM, Andrew Cooper wrote:
>> On 22/02/16 21:02, Joao Martins wrote:
>>> Hey!
>>>
>>> This series is a follow-up on the thread about the performance
>>> of hard-pinned HVM guests. Here we propose allowing libxl to
>>> change what the CPU topology looks like for the HVM guest, which can
>>> favor certain workloads as depicted by Elena on this thread [0].
>>> It shows around 22-23% gain on I/O-bound workloads having the guest
>>> vCPUs hard pinned to the pCPUs with a matching core+thread.
>>>
>>> This series is divided as follows:
>>> * Patch 1 : Sets initial apicid to be the vcpuid as opposed
>>> to vcpuid * 2 for each core;
>>> * Patch 2 : Whitespace cleanup
>>> * Patch 3 : Adds new leaves to describe Intel/AMD cache
>>> topology. Though it's only internal to libxl;
>>> * Patch 4 : Internal call to set per package CPUID values.
>>> * Patch 5 - 8 : Interfaces for xl and libxl for setting topology.
>>>
>>> I couldn't quite figure out which user interface was better so I
>>> included both our "smt" option and full description of the topology
>>> i.e. "sockets", "cores", "threads" option same as the "-smp"
>>> option on QEMU. Note that the latter could also be used on
>>> libvirt since topology is described in their XML configs.
>>>
>>> It's also an RFC as AMD support isn't implemented yet.
>>>
>>> Any comments are appreciated!
>> Hey.  Sorry I am late getting to this - I am currently swamped.  Some
>> general observations.
> Hey Andrew, Thanks for the pointers!
>
>> The cpuid policy code in Xen was never re-thought through after
>> multi-vcpu guests were introduced, which means they have no
>> understanding of per-package, per-core and per-thread values.
>>
>> As part of my further cpuid work, I will need to fix this.  I was
>> planning to fix it by requiring full cpu topology information to be
>> passed as part of the domaincreate or max_vcpus hypercall (I have not
>> chosen which yet).  This would include cores-per-package,
>> threads-per-core etc, and allow Xen to correctly fill in the per-core
>> cpuid values in leaves 4, 0xB and 0x80000008.
> FWIW CPU topology on domaincreate sounds nice. Or would max_vcpus hypercall
> serve other purposes too? (CPU hotplug, migration)

With cpu hotplug, a guest is still limited at max_vcpus, and this
hypercall is the second action during domain creation.

With migration, an empty domain must already be created for the contents
of the stream to be inserted into.  At a minimum, this is createdomain
and max_vcpus, usually with a max_mem to avoid it getting arbitrarily large.

One (mis)feature I want to fix is that currently, the cpuid policy is
regenerated by the toolstack on the destination of the migration, after
the cpu state has been reloaded in Xen.  This causes a chicken-and-egg
problem when checking the validity of guest state, such as %cr4,
against the guest cpuid policy.

I wish to fix this by putting the domain cpuid policy at the head of the
migration stream, which allows the receiving side to first verify that
the domain's cpuid policy is compatible with the host, and then verify
all further migration state against the policy.
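
The ordering described above (policy first, then everything else checked against it) can be sketched in C as follows. This is an illustration under stated assumptions, not Xen code: the `cpuid_policy` struct, `FEATURE_WORDS`, and both function names are hypothetical, and a real policy holds far more than a feature bitmap.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FEATURE_WORDS 4   /* hypothetical size, for illustration only */

/* Hypothetical, heavily simplified policy: just a bitmap of feature words. */
struct cpuid_policy {
    uint32_t features[FEATURE_WORDS];
};

/* Step 1: the guest policy must not advertise anything the host lacks. */
bool policy_compatible(const struct cpuid_policy *guest,
                       const struct cpuid_policy *host)
{
    for (size_t i = 0; i < FEATURE_WORDS; i++)
        if (guest->features[i] & ~host->features[i])
            return false;
    return true;
}

/* Step 2: later records are checked against the already-loaded policy.
 * 'allowed_cr4' stands in for a mask derived from that policy. */
bool cr4_valid(uint64_t cr4, uint64_t allowed_cr4)
{
    return (cr4 & ~allowed_cr4) == 0;
}
```

A receiver following this scheme would reject the stream as soon as step 1 fails, before any vcpu state has been loaded.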

Even with this, there will be a chicken and egg situation when it comes
to specifying topology.  The best that we can do is let the toolstack
recreate it from scratch (from what is hopefully the same domain
configuration at a higher level), then verify consistency when the
policy is loaded.
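
For reference, the link between a topology description and APIC IDs that this thread keeps returning to can be sketched as below. It follows the field-width scheme used for CPUID leaf 0xB style enumeration (thread bits lowest, then core bits, then package); whether Xen would adopt exactly this layout is not stated in the thread, and the function names are made up for the example. Counts are assumed to be small (well below 2^31).

```c
#include <stdint.h>

/* Number of bits needed to hold values 0..count-1 (0 when count <= 1). */
static unsigned int field_width(unsigned int count)
{
    unsigned int bits = 0;

    while ((1u << bits) < count)
        bits++;
    return bits;
}

/* Compose an APIC ID from (package, core, thread) for a given topology. */
uint32_t topo_apic_id(unsigned int threads_per_core,
                      unsigned int cores_per_package,
                      unsigned int package, unsigned int core,
                      unsigned int thread)
{
    unsigned int smt_bits = field_width(threads_per_core);
    unsigned int core_bits = field_width(cores_per_package);

    return ((uint32_t)package << (smt_bits + core_bits)) |
           ((uint32_t)core << smt_bits) | thread;
}
```

Note that a non-power-of-two count (for example three cores per package) leaves gaps in the ID space, which is one reason APIC IDs cannot be treated as a plain vcpu index.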

>
>> In particular, I am concerned about giving the toolstack the ability to
>> blindly control the APIC IDs.  Their layout is very closely linked to
>> topology, and in particular to the HTT flag.
>>
>> Overall, I want to avoid any possibility of generating APIC layouts
>> (including the emulated IOAPIC with HVM guests) which don't conform to
>> the appropriate AMD/Intel manuals.
> I see, so overall having Xen control the topology would be a better
> approach than "mangling" the APICIDs in the cpuid policy as I am
> proposing. One good thing about Xen handling the topology bits would be
> for Intel CPUs with CPUID faulting support, where PV guests could also
> see the topology info. And given that the word 10 of hw_caps won't be
> exposed (as per your CPUID), handling the PV case on cpuid policy
> wouldn't be as clean.

Which word do you mean here?  Even before my series, Xen only had 9
words in hw_cap.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v5] libxl: handle failure of xc_version() in libxl_get_version_info()

2016-03-02 Thread Harmandeep Kaur
Check the return value of xc_version() and return NULL if it
fails. libxl_get_version_info() can also return NULL now.

Group all calls to xc_version(), so that data is copied into the
various info fields only if all calls to xc_version() succeed.
This eliminates cases in which only partial information
is updated.

Callers of the function libxl_get_version_info() are already
prepared to deal with a NULL return value.

Coverity ID 1351217

Signed-off-by: Harmandeep Kaur 
Reviewed-by: Dario Faggioli 
---
v2: Change local variable rc to r. Removes xen_version.
Better readability of blocks of code. Group all calls to
xc_version(), so that data is copied into the various info
fields only if all calls to xc_version() are error-free.

v3: Group all calls to xc_version(), so that data is copied into
the various info fields only if all calls to xc_version() work.

v4: Improve suboptimal subject. Readds xen_version to suit
re-arrangement.

v5: Change datatype of 'r' from long to int.
---
 tools/libxl/libxl.c | 33 ++---
 1 file changed, 18 insertions(+), 15 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 4cdc169..d7bc836 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -5203,41 +5203,44 @@ const libxl_version_info* libxl_get_version_info(libxl_ctx *ctx)
 xen_commandline_t xen_commandline;
 } u;
 long xen_version;
+int r = 0;
 libxl_version_info *info = &ctx->version_info;
 
 if (info->xen_version_extra != NULL)
 goto out;
 
 xen_version = xc_version(ctx->xch, XENVER_version, NULL);
+if (xen_version < 0) goto out;
+r = xc_version(ctx->xch, XENVER_extraversion, &u.xen_extra);
+if (r < 0) goto out;
+r = xc_version(ctx->xch, XENVER_compile_info, &u.xen_cc);
+if (r < 0) goto out;
+r = xc_version(ctx->xch, XENVER_capabilities, &u.xen_caps);
+if (r < 0) goto out;
+r = xc_version(ctx->xch, XENVER_changeset, &u.xen_chgset);
+if (r < 0) goto out;
+r = xc_version(ctx->xch, XENVER_platform_parameters, &u.p_parms);
+if (r < 0) goto out;
+r = info->pagesize = xc_version(ctx->xch, XENVER_pagesize, NULL);
+if (r < 0) goto out;
+r = xc_version(ctx->xch, XENVER_commandline, &u.xen_commandline);
+if (r < 0) goto out;
+
 info->xen_version_major = xen_version >> 16;
 info->xen_version_minor = xen_version & 0xFF;
-
-xc_version(ctx->xch, XENVER_extraversion, &u.xen_extra);
 info->xen_version_extra = libxl__strdup(NOGC, u.xen_extra);
-
-xc_version(ctx->xch, XENVER_compile_info, &u.xen_cc);
 info->compiler = libxl__strdup(NOGC, u.xen_cc.compiler);
 info->compile_by = libxl__strdup(NOGC, u.xen_cc.compile_by);
 info->compile_domain = libxl__strdup(NOGC, u.xen_cc.compile_domain);
 info->compile_date = libxl__strdup(NOGC, u.xen_cc.compile_date);
-
-xc_version(ctx->xch, XENVER_capabilities, &u.xen_caps);
 info->capabilities = libxl__strdup(NOGC, u.xen_caps);
-
-xc_version(ctx->xch, XENVER_changeset, &u.xen_chgset);
 info->changeset = libxl__strdup(NOGC, u.xen_chgset);
-
-xc_version(ctx->xch, XENVER_platform_parameters, &u.p_parms);
 info->virt_start = u.p_parms.virt_start;
-
-info->pagesize = xc_version(ctx->xch, XENVER_pagesize, NULL);
-
-xc_version(ctx->xch, XENVER_commandline, &u.xen_commandline);
 info->commandline = libxl__strdup(NOGC, u.xen_commandline);
 
  out:
 GC_FREE;
-return info;
+return r < 0 ? NULL : info;
 }
 
 libxl_vcpuinfo *libxl_list_vcpu(libxl_ctx *ctx, uint32_t domid,
-- 
2.5.0
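
The "query everything first, copy only when every call succeeded" pattern that the commit message describes can be shown in isolation. This is a standalone sketch, not libxl code: `query_fn`, `version_info`, `fill_version_info` and the two stub queries are hypothetical stand-ins, and each failure returns NULL directly instead of going through a shared status variable.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a query that can fail: fills 'buf' and
 * returns 0, or returns a negative value on error. */
typedef int (*query_fn)(char *buf, size_t len);

struct version_info {
    char *extra;
    char *changeset;
};

/* Heap-allocate a copy of 's'; returns NULL if allocation fails. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    if (copy)
        memcpy(copy, s, n);
    return copy;
}

/* Run every query first; touch 'info' only after all of them succeeded.
 * Assumes 'info' has not been filled before (nothing is freed here). */
const struct version_info *fill_version_info(struct version_info *info,
                                             query_fn get_extra,
                                             query_fn get_changeset)
{
    char extra[64], changeset[64];
    char *e, *c;

    if (get_extra(extra, sizeof(extra)) < 0 ||
        get_changeset(changeset, sizeof(changeset)) < 0)
        return NULL;

    e = dup_string(extra);
    c = dup_string(changeset);
    if (!e || !c) {
        free(e);
        free(c);
        return NULL;
    }

    info->extra = e;        /* commit phase: no failure points past here */
    info->changeset = c;
    return info;
}

/* Stub queries used to exercise both paths. */
int query_ok(char *buf, size_t len)
{
    snprintf(buf, len, "%s", "example");
    return 0;
}

int query_fail(char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return -1;
}
```

Calling `fill_version_info(&info, query_ok, query_fail)` returns NULL and leaves both fields of `info` untouched, which is the "no partial update" property the patch is after.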



Re: [Xen-devel] [Question] PARSEC benchmark has smaller execution time in VM than in native?

2016-03-02 Thread Meng Xu
On Tue, Mar 1, 2016 at 4:51 PM, Sander Eikelenboom  wrote:
>
> Tuesday, March 1, 2016, 9:39:25 PM, you wrote:
>
>> On Tue, Mar 01, 2016 at 02:52:14PM -0500, Meng Xu wrote:
>>> Hi Elena,
>>>
>>> Thank you very much for sharing this! :-)
>>>
>>> On Tue, Mar 1, 2016 at 1:20 PM, Elena Ufimtseva
>>>  wrote:
>>> >
>>> > On Tue, Mar 01, 2016 at 08:48:30AM -0500, Meng Xu wrote:
>>> > > On Mon, Feb 29, 2016 at 12:59 PM, Konrad Rzeszutek Wilk
>>> > >  wrote:
>>> > > >> > Hey!
>>> > > >> >
>>> > > >> > CC-ing Elena.
>>> > > >>
>>> > > >> I think you forgot you cc.ed her..
>>> > > >> Anyway, let's cc. her now... :-)
>>> > > >>
>>> > > >> >
>>> > > >> >> We are comparing execution time in a native machine environment
>>> > > >> >> and in a Xen virtualization environment using the PARSEC
>>> > > >> >> benchmark [1].
>>> > > >> >>
>>> > > >> >> In the virtualization environment, we run a domU with three VCPUs,
>>> > > >> >> each of them pinned to a core; we pin the dom0 to another core
>>> > > >> >> that is not used by the domU.
>>> > > >> >>
>>> > > >> >> Inside Linux, both in the domU and in the native environment, we
>>> > > >> >> used cpuset to isolate a core (or VCPU) for the system processes
>>> > > >> >> and to isolate a core for the benchmark processes. We also
>>> > > >> >> configured the Linux boot command line with the isolcpus= option
>>> > > >> >> to isolate the benchmark core from other unnecessary processes.
>>> > > >> >
>>> > > >> > You may want to just offline them and also boot the machine
>>> > > >> > with NUMA disabled.
>>> > > >>
>>> > > >> Right, the machine is booted up with NUMA disabled.
>>> > > >> We will offline the unnecessary cores then.
>>> > > >>
>>> > > >> >
>>> > > >> >>
>>> > > >> >> We expect that execution time of benchmarks in xen virtualization
>>> > > >> >> environment is larger than the execution time in native machine
>>> > > >> >> environment. However, the evaluation gave us an opposite result.
>>> > > >> >>
>>> > > >> >> Below is the evaluation data for the canneal and streamcluster
>>> > > >> >> benchmarks:
>>> > > >> >>
>>> > > >> >> Benchmark: canneal, input=simlarge, conf=gcc-serial
>>> > > >> >> Native: 6.387s
>>> > > >> >> Virtualization: 5.890s
>>> > > >> >>
>>> > > >> >> Benchmark: streamcluster, input=simlarge, conf=gcc-serial
>>> > > >> >> Native: 5.276s
>>> > > >> >> Virtualization: 5.240s
>>> > > >> >>
>>> > > >> >> Is there anything wrong with our evaluation that lead to the 
>>> > > >> >> abnormal
>>> > > >> >> performance results?
>>> > > >> >
>>> > > >> > Nothing is wrong. Virtualization is naturally faster than 
>>> > > >> > baremetal!
>>> > > >> >
>>> > > >> > :-)
>>> > > >> >
>>> > > >> > No clue sadly.
>>> > > >>
>>> > > >> Ah-ha. This is really surprising to me Why will it speed up the
>>> > > >> system by adding one more layer? Unless the virtualization disabled
>>> > > >> some services that occur in native and interfere with the benchmark.
>>> > > >>
>>> > > >> If virtualization is faster than baremetal by nature, why we can see
>>> > > >> that some experiment shows that virtualization introduces overhead?
>>> > > >
>>> > > > Elena told me that there were some weird regression in Linux 4.1 - 
>>> > > > where
>>> > > > CPU burning workloads were _slower_ on baremetal than as guests.
>>> > >
>>> > > Hi Elena,
>>> > > Would you mind sharing with us some of your experience of how you
>>> > > found the real reason? Did you use some tool or some methodology to
>>> > > pin down the reason (i.e,  CPU burning workloads in native is _slower_
>>> > > on baremetal than as guests)?
>>> > >
>>> >
>>> > Hi Meng
>>> >
>>> > Yes, sure!
>>> >
>>> > While working on performance tests for smt-exposing patches from Joao
>>> > I run CPU bound workload in HVM guest and using same kernel in baremetal
>>> > run same test.
>>> > While testing cpu-bound workload on baremetal linux (4.1.0-rc2)
>>> > I found that the time to complete the same test is few times more that
>>> > as it takes for the same under HVM guest.
>>> > I have tried tests where kernel threads pinned to cores and without 
>>> > pinning.
>>> > The execution times are most of the times take as twice longer, sometimes 
>>> > 4
>>> > times longer that HVM case.
>>> >
>>> > Interesting is not only that it takes sometimes 3-4 times more
>>> > than HVM guest, but also that test with bound threads (to cores) takes 
>>> > almost
>>> > 3 times longer
>>> > to execute than running same cpu-bound test under HVM (in all
>>> > configurations).
>>>
>>>
>>> wow~ I didn't expect the native performance can be so "bad" ;-)
>
>> Yes, quite a surprise :)
>>>
>>> >
>>> >
>>> > I run each test 5 times and here are the execution times (seconds):
>>> >
>>> > -
>>> > baremetal   |
>>> > thread_bind | thread unbind | HVM pinned to cores
>>> > --- |-

Re: [Xen-devel] [Question] PARSEC benchmark has smaller execution time in VM than in native?

2016-03-02 Thread Meng Xu
Hi Elena,


On Tue, Mar 1, 2016 at 3:39 PM, Elena Ufimtseva
 wrote:
> On Tue, Mar 01, 2016 at 02:52:14PM -0500, Meng Xu wrote:
>> Hi Elena,
>>
>> Thank you very much for sharing this! :-)
>>
>> On Tue, Mar 1, 2016 at 1:20 PM, Elena Ufimtseva
>>  wrote:
>> >
>> > On Tue, Mar 01, 2016 at 08:48:30AM -0500, Meng Xu wrote:
>> > > On Mon, Feb 29, 2016 at 12:59 PM, Konrad Rzeszutek Wilk
>> > >  wrote:
>> > > >> > Hey!
>> > > >> >
>> > > >> > CC-ing Elena.
>> > > >>
>> > > >> I think you forgot you cc.ed her..
>> > > >> Anyway, let's cc. her now... :-)
>> > > >>
>> > > >> >
>> > > >> >> We are measuring the execution time between native machine 
>> > > >> >> environment
>> > > >> >> and xen virtualization environment using PARSEC Benchmark [1].
>> > > >> >>
>> > > >> >> In virtualization environment, we run a domU with three VCPUs, each 
>> > > >> >> of
>> > > >> >> them pinned to a core; we pin the dom0 to another core that is not
>> > > >> >> used by the domU.
>> > > >> >>
>> > > >> >> Inside the Linux in domU in virtualization environment and in 
>> > > >> >> native
>> > > >> >> environment,  We used the cpuset to isolate a core (or VCPU) for 
>> > > >> >> the
>> > > >> >> system processors and to isolate a core for the benchmark 
>> > > >> >> processes.
>> > > >> >> We also configured the Linux boot command line with isolcpus= 
>> > > >> >> option to
>> > > >> >> isolate the core for benchmark from other unnecessary processes.
>> > > >> >
>> > > >> > You may want to just offline them and also boot the machine with 
>> > > >> > NUMA
>> > > >> > disabled.
>> > > >>
>> > > >> Right, the machine is booted up with NUMA disabled.
>> > > >> We will offline the unnecessary cores then.
>> > > >>
>> > > >> >
>> > > >> >>
>> > > >> >> We expect that execution time of benchmarks in xen virtualization
>> > > >> >> environment is larger than the execution time in native machine
>> > > >> >> environment. However, the evaluation gave us an opposite result.
>> > > >> >>
>> > > >> >> Below is the evaluation data for the canneal and streamcluster 
>> > > >> >> benchmarks:
>> > > >> >>
>> > > >> >> Benchmark: canneal, input=simlarge, conf=gcc-serial
>> > > >> >> Native: 6.387s
>> > > >> >> Virtualization: 5.890s
>> > > >> >>
>> > > >> >> Benchmark: streamcluster, input=simlarge, conf=gcc-serial
>> > > >> >> Native: 5.276s
>> > > >> >> Virtualization: 5.240s
>> > > >> >>
>> > > >> >> Is there anything wrong with our evaluation that lead to the 
>> > > >> >> abnormal
>> > > >> >> performance results?
>> > > >> >
>> > > >> > Nothing is wrong. Virtualization is naturally faster than baremetal!
>> > > >> >
>> > > >> > :-)
>> > > >> >
>> > > >> > No clue sadly.
>> > > >>
>> > > >> Ah-ha. This is really surprising to me Why will it speed up the
>> > > >> system by adding one more layer? Unless the virtualization disabled
>> > > >> some services that occur in native and interfere with the benchmark.
>> > > >>
>> > > >> If virtualization is faster than baremetal by nature, why we can see
>> > > >> that some experiment shows that virtualization introduces overhead?
>> > > >
>> > > > Elena told me that there were some weird regression in Linux 4.1 - 
>> > > > where
>> > > > CPU burning workloads were _slower_ on baremetal than as guests.
>> > >
>> > > Hi Elena,
>> > > Would you mind sharing with us some of your experience of how you
>> > > found the real reason? Did you use some tool or some methodology to
>> > > pin down the reason (i.e,  CPU burning workloads in native is _slower_
>> > > on baremetal than as guests)?
>> > >
>> >
>> > Hi Meng
>> >
>> > Yes, sure!
>> >
>> > While working on performance tests for smt-exposing patches from Joao
>> > I run CPU bound workload in HVM guest and using same kernel in baremetal
>> > run same test.
>> > While testing cpu-bound workload on baremetal linux (4.1.0-rc2)
>> > I found that the time to complete the same test is few times more that
>> > as it takes for the same under HVM guest.
>> > I have tried tests where kernel threads pinned to cores and without 
>> > pinning.
>> > The execution times are most of the times take as twice longer, sometimes 4
>> > times longer that HVM case.
>> >
>> > Interesting is not only that it takes sometimes 3-4 times more
>> > than HVM guest, but also that test with bound threads (to cores) takes 
>> > almost
>> > 3 times longer
>> > to execute than running same cpu-bound test under HVM (in all
>> > configurations).
>>
>>
>> wow~ I didn't expect the native performance can be so "bad" ;-)
>
> Yes, quite a surprise :)
>>
>> >
>> >
>> > I run each test 5 times and here are the execution times (seconds):
>> >
>> > ------------------------------|--------------------
>> >            baremetal          |
>> > thread_bind | thread unbind   | HVM pinned to cores
>> > ------------|-----------------|--------------------
>> >      74     |       83        |         28
>> >      74     |       88        |         28
>> >      74     |       38        |         28
>> >      74     |       73

Re: [Xen-devel] [PATCH v3 01/11] x86/boot: enumerate documentation for the x86 hardware_subarch

2016-03-02 Thread Luis R. Rodriguez
On Wed, Mar 02, 2016 at 01:43:42AM +0100, Luis R. Rodriguez wrote:
> On Wed, Feb 24, 2016 at 09:32:59AM +0100, Ingo Molnar wrote:
> There's only one problem with this strategy that I can think of so far, which
> differs from my original approach, and which is partly why I actually started
> looking at this stuff:
> 
>   it doesn't help us pro-actively vet that each early boot sequence
>   thrown at the x86 path will work on all required subarchs
> 
> The quirks stuff / proactive solution should perhaps be considered orthogonal.
> It just so happened that I was able to address some quirks with what I was

Since it is orthogonal, I'll simply work on the paravirt_enabled() removal
separately, as it's possible. Since clarity on the semantics will be needed
for other work I'm doing (a proactive solution to avoid issues on early boot),
and since the proposed alternative still uses subarch for the quirks as
you recommended, I'll at least still push for a documentation update on subarch
use as well for now.

After this, and after I sort out a simple link table upstream, I can start
focusing more on the proactive solution once again. That should help keep
things separate and make it clearer what I'm trying to achieve later.

  Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 0/8] x86/hvm, libxl: HVM SMT topology support

2016-03-02 Thread Joao Martins


On 02/25/2016 05:21 PM, Andrew Cooper wrote:
> On 22/02/16 21:02, Joao Martins wrote:
>> Hey!
>>
>> This series are a follow-up on the thread about the performance
>> of hard-pinned HVM guests. Here we propose allowing libxl to
>> change how the CPU topology looks like for the HVM guest, which can 
>> favor certain workloads as depicted by Elena on this thread [0]. 
>> It shows around 22-23% gain on io bound workloads having the guest
>> vCPUs hard pinned to the pCPUs with a matching core+thread.
>>
>> This series is divided as following:
>> * Patch 1 : Sets initial apicid to be the vcpuid as opposed
>> to vcpuid * 2 for each core;
>> * Patch 2 : Whitespace cleanup
>> * Patch 3 : Adds new leafs to describe Intel/AMD cache
>> topology. Though it's only internal to libxl;
>> * Patch 4 : Internal call to set per package CPUID values.
>> * Patch 5 - 8 : Interfaces for xl and libxl for setting topology.
>>
>> I couldn't quite figure out which user interface was better so I
>> included both our "smt" option and full description of the topology
>> i.e. "sockets", "cores", "threads" option same as the "-smp"
>> option on QEMU. Note that the latter could also be used on
>> libvirt since topology is described in their XML configs.
>>
>> It's also an RFC as AMD support isn't implemented yet.
>>
>> Any comments are appreciated!
> 
> Hey.  Sorry I am late getting to this - I am currently swamped.  Some
> general observations.
Hey Andrew, Thanks for the pointers!

> 
> The cpuid policy code in Xen was never re-thought through after
> multi-vcpu guests were introduced, which means they have no
> understanding of per-package, per-core and per-thread values.
> 
> As part of my further cpuid work, I will need to fix this.  I was
> planning to fix it by requiring full cpu topology information to be
> passed as part of the domaincreate or max_vcpus hypercall  (not chosen
> which yet).  This would include cores-per-package, threads-per-core etc,
> and allow Xen to correctly fill in the per-core cpuid values in leaves
> 4, 0xB and 0x80000008.
FWIW CPU topology on domaincreate sounds nice. Or would max_vcpus hypercall
serve other purposes too? (CPU hotplug, migration)

> 
> In particular, I am concerned about giving the toolstack the ability to
> blindly control the APIC IDs.  Their layout is very closely linked to
> topology, and in particular to the HTT flag.
> 
> Overall, I want to avoid any possibility of generating APIC layouts
> (including the emulated IOAPIC with HVM guests) which don't conform to
> the appropriate AMD/Intel manuals.
I see, so overall having Xen control the topology would be a better approach than
"mangling" the APICIDs in the cpuid policy as I am proposing. One good thing
about Xen handling the topology bits is that on Intel CPUs with CPUID faulting
support PV guests could also see the topology info. And given that
word 10 of hw_caps won't be exposed (as per your CPUID), handling the PV case in
the cpuid policy wouldn't be as clean.

Joao


Re: [Xen-devel] [PATCH RFC 7/8] libxl: introduce topology fields

2016-03-02 Thread Joao Martins
On 02/25/2016 04:29 PM, Wei Liu wrote:
> On Mon, Feb 22, 2016 at 09:02:13PM +0000, Joao Martins wrote:
>> Currently there is "smt" option that changes from a flat core topology
>> to a core+thread topology. This patch adds more expressive options for
>> describing the topology as seen by the guest i.e. sockets, cores and
>> threads to adjust cpu topology as seen by the guest.
>>
>> Signed-off-by: Joao Martins 
>> ---
>> CC: Ian Jackson 
>> CC: Stefano Stabellini 
>> CC: Ian Campbell 
>> CC: Wei Liu 
>> ---
>>  tools/libxl/libxl_dom.c     | 18 ++++++++++++------
>>  tools/libxl/libxl_types.idl |  4 ++++
>>  2 files changed, 16 insertions(+), 6 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
>> index ff9356d..1e6d9ab 100644
>> --- a/tools/libxl/libxl_dom.c
>> +++ b/tools/libxl/libxl_dom.c
>> @@ -507,14 +507,20 @@ int libxl__build_post(libxl__gc *gc, uint32_t domid,
>>      }
>>  
>>      libxl_cpuid_apply_policy(ctx, domid);
>> -    if (info->type == LIBXL_DOMAIN_TYPE_HVM
>> -        && libxl_defbool_val(info->smt)) {
>> +    if (info->type == LIBXL_DOMAIN_TYPE_HVM) {
>>  
>> -        uint32_t threads = 0;
>> +        uint32_t threads = 0, cores = 0;
>> +
>> +        if (libxl_defbool_val(info->smt)
>> +            && !libxl__count_threads_per_core(gc, &threads))
>> +            cores = info->max_vcpus / threads;
>> +        else if (info->topology.cores) {
>> +            cores = info->topology.cores;
>> +            threads = info->topology.threads;
>> +        }
>>  
>> -        if (!libxl__count_threads_per_core(gc, &threads))
>> -            libxl__cpuid_set_topology(ctx, domid,
>> -                                      info->max_vcpus / threads, threads);
>> +        if (cores && threads)
>> +            libxl__cpuid_set_topology(ctx, domid, cores, threads);
>>      }
>>  
>>      if (info->cpuid != NULL)
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index fa4725a..caba626 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -422,6 +422,10 @@ libxl_domain_build_info = Struct("domain_build_info",[
>>  ("vcpu_hard_affinity", Array(libxl_bitmap, "num_vcpu_hard_affinity")),
>>  ("vcpu_soft_affinity", Array(libxl_bitmap, "num_vcpu_soft_affinity")),
>>  ("smt", libxl_defbool),
>> +("topology",Struct(None, [("sockets", integer),
>> +  ("cores",   integer),
>> +  ("threads", integer),
>> +  ])),
> 
> Maybe having topology is a good enough sophisticated configuration
> interface? (See my previous email)
Yeah. That's what I think too, though I proposed both to see what you folks
would prefer. For other potential users of libxl (libvirt) the topology
field(s) are also a better fit. BTW, the reason for the smt parameter was mostly
for the default case, where it's a single flat topology.

>>  ("numa_placement",  libxl_defbool),
>>  ("tsc_mode",libxl_tsc_mode),
>>  ("max_memkb",   MemKB),
>> -- 
>> 2.1.4
>>


Re: [Xen-devel] [PATCH RFC 4/8] libxl: cpuid: add guest topology support

2016-03-02 Thread Joao Martins


On 02/25/2016 04:29 PM, Wei Liu wrote:
> On Mon, Feb 22, 2016 at 09:02:10PM +0000, Joao Martins wrote:
>> Introduce internal cpuid routine for setting the topology
>> as seen by the guest. The topology is made based on
>> leaf 1 and leaf 4 for Intel, more specifically setting:
>>
>> Number of logical processors:
>>  proccount  (CPUID.1:EBX[16:24])
>>
>> Number of physical cores - 1:
>>  procpkg    (CPUID.(4,0):EAX[26:32])
>>
>> cache core count - 1:
>>  proccountX (CPUID.(4,Y):EAX[14:26])
>>
>>  given that X is l1d, l1i, l2 or l3
>>  and Y the corresponding subleaf [0-3]
>>
>> Signed-off-by: Joao Martins 
>> ---
>> CC: Ian Jackson 
>> CC: Stefano Stabellini 
>> CC: Ian Campbell 
>> CC: Wei Liu 
>> ---
>>  tools/libxl/libxl_cpuid.c    | 38 ++++++++++++++++++++++++++++++++++++++
>>  tools/libxl/libxl_internal.h |  2 ++
>>  2 files changed, 40 insertions(+)
>>
>> diff --git a/tools/libxl/libxl_cpuid.c b/tools/libxl/libxl_cpuid.c
>> index deb81d2..e220566 100644
>> --- a/tools/libxl/libxl_cpuid.c
>> +++ b/tools/libxl/libxl_cpuid.c
>> @@ -352,6 +352,44 @@ void libxl_cpuid_set(libxl_ctx *ctx, uint32_t domid,
>>   (const char**)(cpuid[i].policy), cpuid_res);
>>  }
>>  
>> +static int libxl_cpuid_parse_list(libxl_cpuid_policy_list *topo,
>> +  char **keys, int *vals, size_t sz)
> 
> Just call it cpuid_parse_list is fine.
> 
OK.
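
For anyone following along, the fields named in the commit message can be
sketched as plain bit manipulation. This is only an illustration, not the
libxl policy-string code: set_field() and the three setters are made-up
helpers, and the leaf 4 fields are taken from EAX, which is where the Intel
SDM places them.

```c
#include <assert.h>
#include <stdint.h>

/* Replace `width` bits of `reg` starting at bit `shift` with `val`. */
static uint32_t set_field(uint32_t reg, unsigned shift, unsigned width,
                          uint32_t val)
{
    assert(width > 0 && width < 32 && shift + width <= 32);
    uint32_t mask = ((UINT32_C(1) << width) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* proccount: logical processors per package, CPUID.1:EBX[23:16]. */
static uint32_t leaf1_ebx_set_proccount(uint32_t ebx, uint32_t nr_logical)
{
    return set_field(ebx, 16, 8, nr_logical);
}

/* procpkg: cores per package minus one, CPUID.(4,0):EAX[31:26]. */
static uint32_t leaf4_eax_set_procpkg(uint32_t eax, uint32_t nr_cores)
{
    assert(nr_cores > 0);
    return set_field(eax, 26, 6, nr_cores - 1);
}

/* Logical processors sharing a cache minus one, CPUID.(4,Y):EAX[25:14]. */
static uint32_t leaf4_eax_set_cache_sharing(uint32_t eax, uint32_t nr_sharing)
{
    assert(nr_sharing > 0);
    return set_field(eax, 14, 12, nr_sharing - 1);
}
```

The "minus one" encoding of the two leaf 4 fields is the part that is easy to
get wrong, which is why it lives inside the setters here.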

> Wei.
> 


Re: [Xen-devel] [PATCH RFC 2/8] libxl: remove whitespace on libxl_types.idl

2016-03-02 Thread Joao Martins


On 02/25/2016 04:28 PM, Wei Liu wrote:
> On Mon, Feb 22, 2016 at 09:02:08PM +0000, Joao Martins wrote:
>> Signed-off-by: Joao Martins 
> 
> Acked-by: Wei Liu 
> 
Thanks!

>> ---
>> CC: Ian Jackson 
>> CC: Stefano Stabellini 
>> CC: Ian Campbell 
>> CC: Wei Liu 
>> ---
>>  tools/libxl/libxl_types.idl | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index 9ad7eba..f04279e 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -436,7 +436,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>>  ("blkdev_start",string),
>>  
>>  ("vnuma_nodes", Array(libxl_vnode_info, "num_vnuma_nodes")),
>> -
>> +
>>  ("device_model_version", libxl_device_model_version),
>>  ("device_model_stubdomain", libxl_defbool),
>>  # if you set device_model you must set device_model_version too
>> @@ -497,10 +497,10 @@ libxl_domain_build_info = Struct("domain_build_info",[
>> ("keymap",   string),
>> ("sdl",  libxl_sdl_info),
>> ("spice",
>> libxl_spice_info),
>> -   
>> +
>> ("gfx_passthru", libxl_defbool),
>> ("gfx_passthru_kind", 
>> libxl_gfx_passthru_kind),
>> -   
>> +
>> ("serial",   string),
>> ("boot", string),
>> ("usb",  libxl_defbool),
>> -- 
>> 2.1.4
>>


Re: [Xen-devel] [PATCH RFC 1/8] x86/hvm: set initial apicid to vcpu_id

2016-03-02 Thread Joao Martins
On 02/25/2016 05:03 PM, Jan Beulich wrote:
 On 22.02.16 at 22:02,  wrote:
>> Currently the initial_apicid is set vcpu_id * 2 which makes it difficult
>> for the toolstack to manage how is the topology seen by the guest.
>> Instead of forcing procpkg and proccount to be VCPUID * 2, instead we
>> set it to max vcpuid on proccount to max_vcpu_id + 1 (logical number of
>> logical cores) and procpkg to max_vcpu_id (max cores minus 1)
> 
> I'm afraid it takes more than this to explain why the change is
> needed or at least desirable.
Apologies for my clumsiness in the commit message. I should have explained
properly why we need this for this series in the first place.

Currently we use vcpu_id * 2 as the initial_apicid, and double the values of
leaves 1 and 4 (proccount and procpkg), which means we will
address 8 LAPIC IDs (though only 4 will be used). An example topology and the
algorithm follow to facilitate discussion:

# Maximum logical addressable IDs (logical processors in a package)
proccount = CPUID.1:EBX[23:16]

# Maximum core addressable IDs - 1 (maximum cores in a package - 1)
procpkg = CPUID.(4,0):EAX[31:26]

# MSB (Calculate most significant bit)
SMT_Mask_width = MSB(proccount / (procpkg + 1))
Core_Mask_width = MSB(procpkg + 1)
CoreSMT_Mask_width = SMT_Mask_width + Core_Mask_width
Pkg_Mask = ~0 << CoreSMT_Mask_width

SMT_ID = APICID & ((1 << SMT_Mask_width) - 1)
Core_ID = (APICID >> SMT_Mask_width) & ((1 << Core_Mask_width) - 1)
Pkg_ID = (APICID & Pkg_Mask) >> CoreSMT_Mask_width

So as it is right now, the topology on a 4 vcpu HVM guest looks like:

=> topology(proccount = 16, procpkg = 7) # current
APICID=0 SMT_ID=0 CORE_ID=0 PKG_ID=0 # VCPU 0
APICID=1 SMT_ID=1 CORE_ID=0 PKG_ID=0
APICID=2 SMT_ID=0 CORE_ID=1 PKG_ID=0 # VCPU 1
APICID=3 SMT_ID=1 CORE_ID=1 PKG_ID=0
APICID=4 SMT_ID=0 CORE_ID=2 PKG_ID=0 # VCPU 2
APICID=5 SMT_ID=1 CORE_ID=2 PKG_ID=0
APICID=6 SMT_ID=0 CORE_ID=3 PKG_ID=0 # VCPU 3
APICID=7 SMT_ID=1 CORE_ID=3 PKG_ID=0
[...]
APICID=14 SMT_ID=0 CORE_ID=7 PKG_ID=0
APICID=15 SMT_ID=1 CORE_ID=7 PKG_ID=0

As you know, the APICID encodes the SMT, core and package IDs. The problem with
having only even APICIDs (0, 2, 4, ... N) is that we can't describe SMT siblings
in the topology. Turning the APICID space into (0, 1, 2 .. N), as this
patch proposes, means we can now calculate all possibilities for both
kinds of topology. Note that this is a prerequisite patch, so that a later
patch in this series can set proccount and procpkg to expose some cores
as SMT siblings.

=> topology(proccount = 4, procpkg = 3) # with this patch
APICID=0 SMT_ID=0 CORE_ID=0 PKG_ID=0
APICID=1 SMT_ID=0 CORE_ID=1 PKG_ID=0
APICID=2 SMT_ID=0 CORE_ID=2 PKG_ID=0
APICID=3 SMT_ID=0 CORE_ID=3 PKG_ID=0
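
In case it helps to reproduce the two listings above, here is a small sketch of
the decomposition in C. The names (struct apic_topo, mask_width(),
decode_apicid()) are mine, and mask_width() rounds up to the next power of two
instead of taking the MSB, which gives the same result for the power-of-two
counts used in this mail.

```c
#include <assert.h>
#include <stdint.h>

struct apic_topo {
    uint32_t smt_id;
    uint32_t core_id;
    uint32_t pkg_id;
};

/* Smallest w such that (1 << w) >= n; n must be in 1 .. 2^31. */
static unsigned mask_width(uint32_t n)
{
    assert(n > 0 && n <= (UINT32_C(1) << 31));
    unsigned w = 0;
    while ((UINT32_C(1) << w) < n)
        w++;
    return w;
}

/*
 * Split an APIC ID into SMT, core and package IDs.
 *
 * proccount - CPUID.1:EBX[23:16], logical processors per package
 * procpkg   - CPUID.(4,0):EAX[31:26], cores per package minus one
 */
static struct apic_topo decode_apicid(uint32_t apicid,
                                      uint32_t proccount, uint32_t procpkg)
{
    uint32_t cores = procpkg + 1;
    assert(proccount >= cores);

    unsigned smt_w = mask_width(proccount / cores);
    unsigned core_w = mask_width(cores);
    unsigned coresmt_w = smt_w + core_w;

    struct apic_topo t;
    t.smt_id = apicid & ((UINT32_C(1) << smt_w) - 1);
    t.core_id = (apicid >> smt_w) & ((UINT32_C(1) << core_w) - 1);
    /* Everything above the core+SMT bits selects the package. */
    t.pkg_id = apicid >> coresmt_w;
    return t;
}
```

With (proccount = 16, procpkg = 7) APICID 5 decodes to SMT_ID=1 CORE_ID=2
PKG_ID=0, and with (proccount = 4, procpkg = 3) APICID 3 decodes to SMT_ID=0
CORE_ID=3 PKG_ID=0, matching the two tables.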

x2APIC isn't addressed here for this RFC but it has the same issue (and
consequently exposure of FEATURE_XTOPOLOGY, CPUID.(EAX=0xB, ECX=N)). One
difference is that the SMT,Core,Pkg mask widths are fetched from each subleaf
directly as opposed to a calculation between procpkg and proccount.

> In particular I'd like to suggest that
> you do some archeology to understand why things are the way
> they are.
Digging in the history and threads, this behavior seems to be introduced by
commit c21d85b ("[HVM] Change VCPU->LAPIC_ID mapping so that VCPU0 has ID0")
where the main issue looked like a conflict between VCPU 0 LAPICID and IOAPIC
ID. Previous commits (a41ba62, facdf41) made IOAPIC on 0 and vLAPIC on 1..N but
it broke on old kernels (for the lack of LAPIC 0), so it ended up having a
vLAPIC ID space with 0, 2, 4, 6 and assign vIOAPICID = 1. This way all of
{L,IO}APICs have unique IDs - this thread
(http://lists.xen.org/archives/html/xen-devel/2008-09/msg00986.html) seems to
mention something along these lines too.
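
To spell out what the uniqueness concern amounts to, here is a toy sketch of
the two vCPU to LAPIC ID mappings checked against an I/O APIC ID. It is not Xen
code: the function names are invented, and it assumes the vIOAPIC ID stays at 1
as described above, which this series may or may not keep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIOAPIC_ID 1u  /* I/O APIC ID used alongside the even LAPIC IDs */

/* Current mapping: LAPIC IDs 0, 2, 4, ... leave ID 1 free for the I/O APIC. */
static uint32_t lapic_id_even(uint32_t vcpu_id)
{
    return vcpu_id * 2;
}

/* Mapping proposed by this patch: LAPIC ID equals the vCPU ID. */
static uint32_t lapic_id_flat(uint32_t vcpu_id)
{
    return vcpu_id;
}

/* True if any vCPU's LAPIC ID equals ioapic_id under the given mapping. */
static bool lapic_ioapic_clash(uint32_t (*map)(uint32_t), uint32_t nr_vcpus,
                               uint32_t ioapic_id)
{
    assert(map);
    for (uint32_t v = 0; v < nr_vcpus; v++)
        if (map(v) == ioapic_id)
            return true;
    return false;
}
```

Under that assumption the even mapping never clashes, while the flat mapping
clashes as soon as the guest has a second vCPU, so whether a clash matters is
exactly the question below.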

But the manuals aren't exactly clear on this ID uniqueness between LAPICs and
I/O APICs on more recent processors. Any light on this would be great.

Intel 82093AA (IOAPIC) datasheet [0] says the following:

"This register contains the 4-bit APIC ID. The ID serves as a physical name of
the IOAPIC. All APIC devices using the APIC bus should have a unique APIC ID."

Though looking at the Intel SDM Volume 3, Chapter 10.1, Figures 10-2 and 10-3,
the APIC bus seems to be used only up to P6 family processors (Figure 10-2),
where it is indeed shared between the I/O APIC and LAPIC. For their successors
(Pentium 4 and later) that is no longer the case (Figure 10-3).

My Broadwell machine in fact has conflicting APIC IDs between the IOAPIC and LAPIC
in its MADT table. The same seems to be true for SeaBIOS (commit
e39b938 ("report real I/O APIC ID (0) on MADT and MP-table (v3)")) and QEMU,
though that alone wouldn't justify doing this on Xen.

[0] http://www.intel.com/design/chipsets/datashts/29056601.pdf

>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -4633,7 +4633,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, 
>> unsigned int *ebx,
>>  case 0x1:
>>  /* Fix up VLAPIC details. */
>> 

Re: [Xen-devel] [PATCH V15 4/6] libxl: add pvusb API

2016-03-02 Thread George Dunlap
On 02/03/16 18:32, George Dunlap wrote:
> On 01/03/16 08:09, Chunyan Liu wrote:
>> Add pvusb APIs, including:
>>  - attach/detach (create/destroy) virtual usb controller.
>>  - attach/detach usb device
>>  - list usb controller and usb devices
>>  - some other helper functions
>>
>> Signed-off-by: Simon Cao 
>> Signed-off-by: George Dunlap 
>> Signed-off-by: Chunyan Liu 
>> ---
>> Changes:
>>   reorder usbdev_remove to following three steps:
>>   1. Unassign all interfaces from usbback, stopping and returning an
>>  error as soon as one attempt fails
>>   2. Remove the pvusb xenstore nodes, stopping and returning an error
>>  if it fails
>>   3. Attempt to re-assign all interfaces to the original drivers,
>>  stopping and returning an error as soon as one attempt fails.
> 
> Thanks, Chunyan!  One minor comment about these changes...
> 
> >> +static int usbdev_rebind(libxl__gc *gc, const char *busid)
> >> +{
> >> +    char **intfs = NULL;
> >> +    char *usbdev_encode = NULL;
> >> +    char *path = NULL;
> >> +    int i, num = 0;
> >> +    int rc;
> >> +
> >> +    rc = usbdev_get_all_interfaces(gc, busid, &intfs, &num);
> >> +    if (rc) goto out;
> >> +
> >> +    usbdev_encode = usb_interface_xenstore_encode(gc, busid);
> >> +
> >> +    for (i = 0; i < num; i++) {
> >> +        char *intf = intfs[i];
> >> +        char *usbintf_encode = NULL;
> >> +        const char *drvpath;
> >> +
> >> +        /* rebind USB interface to its original driver */
> >> +        usbintf_encode = usb_interface_xenstore_encode(gc, intf);
> >> +        path = GCSPRINTF(USBBACK_INFO_PATH "/%s/%s/driver_path",
> >> +                         usbdev_encode, usbintf_encode);
> >> +        rc = libxl__xs_read_checked(gc, XBT_NULL, path, &drvpath);
> >> +        if (rc) goto out;
> >> +
> >> +        if (drvpath) {
> >> +            rc = bind_usbintf(gc, intf, drvpath);
> >> +            if (rc) {
> >> +                LOGE(ERROR, "Couldn't rebind %s to %s", intf, drvpath);
> >> +                goto out;
> >> +            }
> >> +        }
> >> +    }
> >> +
> >> +    path = GCSPRINTF(USBBACK_INFO_PATH "/%s", usbdev_encode);
> >> +    rc = libxl__xs_rm_checked(gc, XBT_NULL, path);
> >> +
> >> +out:
> 
> So it looks like if one of the re-binds fails, then it stops where it is
> and leaves the USBBACK re-bind info in xenstore.  In that case it's not
> clear to me how that information would ever be removed.
> 
> I think until such time as we have a command to re-attempt the re-bind,
>  if there's an error in the actual rebind, it should just break out of
> the for loop, and remove the re-bind nodes, and document a way to let
> the user try to clean things up.
> 
> >> +static int do_usbdev_remove(libxl__gc *gc, uint32_t domid,
> >> +                            libxl_device_usbdev *usbdev)
> >> +{
> >> +    int rc;
> >> +    char *busid;
> >> +    libxl_device_usbctrl usbctrl;
> >> +    libxl_usbctrlinfo usbctrlinfo;
> >> +
> >> +    libxl_device_usbctrl_init(&usbctrl);
> >> +    libxl_usbctrlinfo_init(&usbctrlinfo);
> >> +    usbctrl.devid = usbdev->ctrl;
> >> +
> >> +    rc = libxl_device_usbctrl_getinfo(CTX, domid, &usbctrl, &usbctrlinfo);
> >> +    if (rc) goto out;
> >> +
> >> +    switch (usbctrlinfo.type) {
> >> +    case LIBXL_USBCTRL_TYPE_PV:
> >> +        busid = usbdev_busid_from_ctrlport(gc, domid, usbdev);
> >> +        if (!busid) {
> >> +            rc = ERROR_FAIL;
> >> +            goto out;
> >> +        }
> >> +
> >> +        rc = usbback_dev_unassign(gc, busid);
> >> +        if (rc) goto out;
> >> +
> >> +        rc = libxl__device_usbdev_remove_xenstore(gc, domid, usbdev);
> >> +        if (rc) goto out;
> >> +
> >> +        rc = usbdev_rebind(gc, busid);
> >> +        if (rc) goto out;
> 
> I think we need a comment here saying why we're doing things in this
> order.  Maybe:
> 
> "Things are done in this order to balance simplicity with robustness in
> the case of failure:
> * We unbind all interfaces before rebinding any interfaces, so that we
> never get into a situation where some interfaces are assigned to usbback
> and some are assigned to the original drivers.
> * We also unbind the interfaces before removing the pvusb xenstore
> nodes, so that if the unbind fails in the middle, the device still shows
> up in xl usb-list, and the user can re-try removing it."

Sorry, just looked through the rest of the series, and there's one more
thing.

Neither here nor in the man page do we explain what to do if something
goes wrong with the detach.  I think the best thing to do is probably to
make the logged error messages more helpful.

What about something like this:

* On failure to unbind: "Error removing device from guest.  Try running
usbdev-detach again."

* On failure to rebind: "USB device removed from guest, but couldn't
re-bind to domain 0.  Try removing and re-inserting the USB device or
reloading the driver modules."
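
Just to sketch how the two messages could hang off the failure point (purely
illustrative: enum detach_stage and detach_hint() are invented names, not
anything in libxl):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stages of a usbdev detach; not libxl identifiers. */
enum detach_stage {
    DETACH_OK,
    DETACH_UNBIND_FAILED,
    DETACH_REBIND_FAILED,
};

/* Return the user-facing hint for the stage at which the detach failed. */
static const char *detach_hint(enum detach_stage stage)
{
    switch (stage) {
    case DETACH_OK:
        return "";
    case DETACH_UNBIND_FAILED:
        return "Error removing device from guest."
               " Try running usbdev-detach again.";
    case DETACH_REBIND_FAILED:
        return "USB device removed from guest, but couldn't re-bind to"
               " domain 0. Try removing and re-inserting the USB device"
               " or reloading the driver modules.";
    default:
        assert(!"unknown detach stage");
        return "";
    }
}
```

In the real code the strings would presumably go straight into the LOG calls
at the two failure sites instead of through a lookup like this.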

What do you think?

 -George


Re: [Xen-devel] [PATCH V15 4/6] libxl: add pvusb API

2016-03-02 Thread George Dunlap
On 01/03/16 08:09, Chunyan Liu wrote:
> Add pvusb APIs, including:
>  - attach/detach (create/destroy) virtual usb controller.
>  - attach/detach usb device
>  - list usb controller and usb devices
>  - some other helper functions
> 
> Signed-off-by: Simon Cao 
> Signed-off-by: George Dunlap 
> Signed-off-by: Chunyan Liu 
> ---
> Changes:
>   reorder usbdev_remove to following three steps:
>   1. Unassign all interfaces from usbback, stopping and returning an
>  error as soon as one attempt fails
>   2. Remove the pvusb xenstore nodes, stopping and returning an error
>  if it fails
>   3. Attempt to re-assign all interfaces to the original drivers,
>  stopping and returning an error as soon as one attempt fails.

Thanks, Chunyan!  One minor comment about these changes...

> +static int usbdev_rebind(libxl__gc *gc, const char *busid)
> +{
> +    char **intfs = NULL;
> +    char *usbdev_encode = NULL;
> +    char *path = NULL;
> +    int i, num = 0;
> +    int rc;
> +
> +    rc = usbdev_get_all_interfaces(gc, busid, &intfs, &num);
> +    if (rc) goto out;
> +
> +    usbdev_encode = usb_interface_xenstore_encode(gc, busid);
> +
> +    for (i = 0; i < num; i++) {
> +        char *intf = intfs[i];
> +        char *usbintf_encode = NULL;
> +        const char *drvpath;
> +
> +        /* rebind USB interface to its original driver */
> +        usbintf_encode = usb_interface_xenstore_encode(gc, intf);
> +        path = GCSPRINTF(USBBACK_INFO_PATH "/%s/%s/driver_path",
> +                         usbdev_encode, usbintf_encode);
> +        rc = libxl__xs_read_checked(gc, XBT_NULL, path, &drvpath);
> +        if (rc) goto out;
> +
> +        if (drvpath) {
> +            rc = bind_usbintf(gc, intf, drvpath);
> +            if (rc) {
> +                LOGE(ERROR, "Couldn't rebind %s to %s", intf, drvpath);
> +                goto out;
> +            }
> +        }
> +    }
> +
> +    path = GCSPRINTF(USBBACK_INFO_PATH "/%s", usbdev_encode);
> +    rc = libxl__xs_rm_checked(gc, XBT_NULL, path);
> +
> +out:

So it looks like if one of the re-binds fails, then it stops where it is
and leaves the USBBACK re-bind info in xenstore.  In that case it's not
clear to me how that information would ever be removed.

I think until such time as we have a command to re-attempt the re-bind,
if there's an error in the actual rebind, it should just break out of
the for loop and remove the re-bind nodes, and we should document a way
for the user to try to clean things up.
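
Roughly the control flow I have in mind, as a standalone sketch rather than a
patch. The callback types, rebind_all() and the demo_* stubs are all made up
for illustration; in libxl the callbacks would be bind_usbintf() and the
xenstore removal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks standing in for bind_usbintf() and the
 * xenstore removal; they return 0 on success, non-zero on error. */
typedef int (*rebind_fn)(const char *intf);
typedef int (*cleanup_fn)(void);

static int rebind_all(const char *const *intfs, size_t num,
                      rebind_fn rebind, cleanup_fn remove_rebind_nodes)
{
    assert(rebind && remove_rebind_nodes);
    int rc = 0;

    for (size_t i = 0; i < num; i++) {
        rc = rebind(intfs[i]);
        if (rc)
            break;          /* stop rebinding, but fall through to cleanup */
    }

    /* Always drop the saved rebind info so it cannot go stale. */
    int rm_rc = remove_rebind_nodes();

    /* Report the first error seen. */
    return rc ? rc : rm_rc;
}

/* --- demo stubs, for illustration only --- */
static int demo_rebind_calls, demo_cleanup_calls;

/* Succeeds on the first interface, fails on the second. */
static int demo_rebind_fail_second(const char *intf)
{
    (void)intf;
    return ++demo_rebind_calls == 2 ? -1 : 0;
}

static int demo_cleanup(void)
{
    demo_cleanup_calls++;
    return 0;
}
```

The point is only that the cleanup runs on both the success and the failure
path, while the caller still gets the rebind error back.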

> +static int do_usbdev_remove(libxl__gc *gc, uint32_t domid,
> +                            libxl_device_usbdev *usbdev)
> +{
> +    int rc;
> +    char *busid;
> +    libxl_device_usbctrl usbctrl;
> +    libxl_usbctrlinfo usbctrlinfo;
> +
> +    libxl_device_usbctrl_init(&usbctrl);
> +    libxl_usbctrlinfo_init(&usbctrlinfo);
> +    usbctrl.devid = usbdev->ctrl;
> +
> +    rc = libxl_device_usbctrl_getinfo(CTX, domid, &usbctrl, &usbctrlinfo);
> +    if (rc) goto out;
> +
> +    switch (usbctrlinfo.type) {
> +    case LIBXL_USBCTRL_TYPE_PV:
> +        busid = usbdev_busid_from_ctrlport(gc, domid, usbdev);
> +        if (!busid) {
> +            rc = ERROR_FAIL;
> +            goto out;
> +        }
> +
> +        rc = usbback_dev_unassign(gc, busid);
> +        if (rc) goto out;
> +
> +        rc = libxl__device_usbdev_remove_xenstore(gc, domid, usbdev);
> +        if (rc) goto out;
> +
> +        rc = usbdev_rebind(gc, busid);
> +        if (rc) goto out;

I think we need a comment here saying why we're doing things in this
order.  Maybe:

"Things are done in this order to balance simplicity with robustness in
the case of failure:
* We unbind all interfaces before rebinding any interfaces, so that we
never get into a situation where some interfaces are assigned to usbback
and some are assigned to the original drivers.
* We also unbind the interfaces before removing the pvusb xenstore
nodes, so that if the unbind fails in the middle, the device still shows
up in xl usb-list, and the user can re-try removing it."

Other than that, I gave this patch a moderately thorough review again
today, and I think everything else looks good to me.

 -George



Re: [Xen-devel] [PATCH] AMD, maintainers: Remove myself from list

2016-03-02 Thread Boris Ostrovsky

On 03/02/2016 06:42 AM, Aravind Gopalakrishnan wrote:

I will not be looking at AMD related Xen code now.
So, removing myself.

Signed-off-by: Aravind Gopalakrishnan 


With regrets

Acked-by: Boris Ostrovsky 



---
  MAINTAINERS | 2 --
  1 file changed, 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 932b05c..7aacfd6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -107,14 +107,12 @@ F:xen/include/acpi/
  
  AMD IOMMU

  M:Suravee Suthikulpanit 
-M: Aravind Gopalakrishnan 
  S:Maintained
  F:xen/drivers/passthrough/amd/
  
  AMD SVM

  M:Boris Ostrovsky 
  M:Suravee Suthikulpanit 
-M: Aravind Gopalakrishnan 
  S:Supported
  F:xen/arch/x86/hvm/svm/
  F:xen/arch/x86/cpu/vpmu_amd.c





Re: [Xen-devel] [PATCH 0/9] xl: convert exit codes related to domain subcommands to EXIT_[SUCCESS|FAILURE]

2016-03-02 Thread Dario Faggioli
On Wed, 2016-02-24 at 18:23 +0530, Harmandeep Kaur wrote:
> *main_foo() is treated somewhat as a regular main(), it is changed to
> return EXIT_SUCCESS or EXIT_FAILURE.
> 
Ok, I think I've looked at all the patches now. Good work. :-)

There were a few issues and mistakes, but mostly I think the various
functions could have been grouped better in the various patches.

I tried to point that out when I thought it was the case, and to
provide suggestions... let me know if there is something you did not
understand or need clarifications.

Looking forward to v2. :-)

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 2/9] xl: Improve return and exit codes of restore and save related functions.

2016-03-02 Thread Dario Faggioli
On Wed, 2016-02-24 at 18:23 +0530, Harmandeep Kaur wrote:
> Signed-off-by: Harmandeep Kaur 
> ---
>  tools/libxl/xl_cmdimpl.c | 40 
> 
>  1 file changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 8f5a2f4..e5bb41f 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -2708,11 +2708,11 @@ static uint32_t create_domain(struct
> domain_create *dom_info)
>  restore_fd = open(restore_file, O_RDONLY);
>  if (restore_fd == -1) {
>  fprintf(stderr, "Can't open restore file: %s\n",
> strerror(errno));
> -return ERROR_INVAL;
> +return -1;
>  }
Ah, so here it is create_domain(). Mmm... no, I think it would be best
to have it changed in the other patch where I mentioned it, together
with the other domain creation related functions.

That being said, the way in which the function is changed looks ok to
me. Only one comment about this hunk:

> @@ -3091,9 +3091,9 @@ out:
>   * already happened in the parent.
>   */
>  if ( daemonize && !need_daemon )
> -exit(ret);
> +exit(EXIT_SUCCESS);
>  
> -return ret;
> +return ret < 0 ? -1 : 0;
>
The ret<0 part was there because libxl error codes were used... now
that we're not using them any longer, can we just initialize ret to 0,
change it to -1 on error (like you're doing) and, here, just return it?
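
Something along these lines, as a rough sketch only. `create_domain_sketch()` and `do_step()` are made-up stand-ins, not the real xl functions:

```c
/* Hypothetical stand-in for one step of domain creation. */
static int do_step(int fail) { return fail ? -1 : 0; }

/* Internal helper: 0 on success, -1 on error, no libxl error codes. */
int create_domain_sketch(int fail_first, int fail_second)
{
    int ret = 0;                 /* assume success until a step fails */

    if (do_step(fail_first)) {
        ret = -1;
        goto out;
    }
    if (do_step(fail_second)) {
        ret = -1;
        goto out;
    }

out:
    /* Common cleanup would go here; ret is already 0 or -1. */
    return ret;
}
```

With ret only ever holding 0 or -1, the `ret < 0 ? -1 : 0` conversion at the end becomes unnecessary.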

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





[Xen-devel] [xen-unstable-smoke test] 85080: tolerable all pass - PUSHED

2016-03-02 Thread osstest service owner
flight 85080 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/85080/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  7ba900efe5f526c941b1ca055e5347947bb7eb4b
baseline version:
 xen  3f19ca9ad0b66c57c91921dc8a695634eee0c679

Last test of basis    84976  2016-03-01 19:02:28 Z    0 days
Testing same since    85080  2016-03-02 16:00:54 Z    0 days    1 attempts


People who touched revisions under test:
  Yang Hongyang 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=7ba900efe5f526c941b1ca055e5347947bb7eb4b
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 7ba900efe5f526c941b1ca055e5347947bb7eb4b
+ branch=xen-unstable-smoke
+ revision=7ba900efe5f526c941b1ca055e5347947bb7eb4b
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-unstable-coverity
+ '[' x7ba900efe5f526c941b1ca055e5347947bb7eb4b = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCacheProxy"} or die $!;
'
+++ local cache=git://cache:9419/
+++ '[' xgit://cache:9419/ '!=' x ']'
+++ echo 
'git://cache:9419/https:/

Re: [Xen-devel] [PATCH 3/9] xl: Improve return and exit codes of migrate related functions.

2016-03-02 Thread Dario Faggioli
On Wed, 2016-02-24 at 18:23 +0530, Harmandeep Kaur wrote:
> @@ -50,7 +50,7 @@
>  else if (chk_errnoval > 0)
> {\
>  fprintf(stderr,"xl: fatal error: %s:%d: %s:
> %s\n",  \
>  __FILE__,__LINE__, strerror(chk_errnoval),
> #call);  \
> -exit(-
> ERROR_FAIL);  \
> +exit(EXIT_FAILURE); 
> \
>  }   
> \
>  })
>  
Right below this, there are two more macros, CHK_SYSCALL and MUST,
which also need "fixing"
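
For illustration, a MUST-style macro converted in the same way could look roughly like the following. This is a sketch under assumptions: `MUST_SKETCH`, `always_ok()` and `must_demo()` are invented names, and the real xl macros differ in detail.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical MUST-style macro: abort the program with EXIT_FAILURE
 * (instead of a negated libxl error code) when the call fails. */
#define MUST_SKETCH(call)                                              \
    do {                                                               \
        int must_rc_ = (call);                                         \
        if (must_rc_ < 0) {                                            \
            fprintf(stderr, "xl: fatal error: %s:%d, rc=%d: %s\n",     \
                    __FILE__, __LINE__, must_rc_, #call);              \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

/* Made-up callee that always succeeds, so the demo never exits. */
static int always_ok(void) { return 0; }

int must_demo(void)
{
    MUST_SKETCH(always_ok());
    return 0;
}
```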

> @@ -4152,7 +4152,7 @@ static pid_t create_migration_child(const char
> *rune, int *send_fd,
>  pid_t child;
>  
>  if (!rune || !send_fd || !recv_fd)
> -return -1;
> +return EXIT_FAILURE;
> 
Err.. no, create_migration_child() is an internal function, so it is ok
for it to return -1/0, isn't it?
>  
>  MUST( libxl_pipe(ctx, sendpipe) );
>  MUST( libxl_pipe(ctx, recvpipe) );
> @@ -4166,7 +4166,7 @@ static pid_t create_migration_child(const char
> *rune, int *send_fd,
>  close(recvpipe[0]); close(recvpipe[1]);
>  execlp("sh","sh","-c",rune,(char*)0);
>  perror("failed to exec sh");
> -exit(-1);
> +exit(EXIT_FAILURE);
>  }
Of course, in this specific case, since it's an exit() and not a
'return', it is ok to convert this to EXIT_FAILURE, like you're doing.

> @@ -4189,16 +4189,16 @@ static int migrate_read_fixedmessage(int fd,
> const void *msg, int msgsz,
>  
>  stream = rune ? "migration receiver stream" : "migration
> stream";
>  rc = libxl_read_exactly(ctx, fd, buf, msgsz, stream, what);
> -if (rc) return ERROR_FAIL;
> +if (rc) return EXIT_FAILURE;
>  
Internal too, so ok changing it, but to -1/0.
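
In other words, the split I have in mind is roughly the following (a sketch with invented names; `read_fixedmessage_sketch()` and `main_migrate_sketch()` only mimic the shape of the real functions):

```c
#include <stdlib.h>

/* Internal helper: 0 on success, -1 on failure, never an EXIT_* value. */
static int read_fixedmessage_sketch(int simulate_error)
{
    return simulate_error ? -1 : 0;
}

/* main_*()-style entry point: the only place where the internal
 * result is translated into a process exit code. */
int main_migrate_sketch(int simulate_error)
{
    if (read_fixedmessage_sketch(simulate_error) < 0)
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}
```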

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 6/9] xl: Improve return and exit codes of main_list() and main_vm_list() related functions.

2016-03-02 Thread Dario Faggioli
On Wed, 2016-02-24 at 18:23 +0530, Harmandeep Kaur wrote:
> Signed-off-by: Harmandeep Kaur 
>
This patch again looks fine, but I'll wait for next version to provide
Review-by-s, to double check the new function breakup in the various
patches.

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH 7/9] xl: Improve return and exit codes of main_create(), main_config_update(), main_sharing(), main_rename() and related functions.

2016-03-02 Thread Dario Faggioli
On Wed, 2016-02-24 at 18:23 +0530, Harmandeep Kaur wrote:
> Signed-off-by: Harmandeep Kaur 
>
I don't recall if I said this already, but main_sharing() does not
belong here.
 
> @@ -5095,11 +5095,11 @@ int main_create(int argc, char **argv)
>  rc = create_domain(&dom_info);
>  if (rc < 0) {
>  free(dom_info.extra_config);
> -return -rc;
> +return EXIT_FAILURE;
>  }
As far as I can see, create_domain() mostly returns libxl error codes.
I think you should convert that one as well (in 0/-1, as it's internal,
and doing it in this patch would be ok).

The rest of the patch looks ok to me.

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH v2 2/3] xen: add hypercall option to temporarily pin a vcpu

2016-03-02 Thread Anshul Makkar
Hi,


-Original Message-
From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of George 
Dunlap
Sent: 01 March 2016 15:53
To: Juergen Gross ; xen-devel@lists.xen.org
Cc: Wei Liu ; Stefano Stabellini 
; George Dunlap ; 
Andrew Cooper ; Dario Faggioli 
; Ian Jackson ; David Vrabel 
; jbeul...@suse.com
Subject: Re: [Xen-devel] [PATCH v2 2/3] xen: add hypercall option to 
temporarily pin a vcpu

On 01/03/16 09:02, Juergen Gross wrote:
> Some hardware (e.g. Dell studio 1555 laptops) require SMIs to be 
> called on physical cpu 0 only. Linux drivers like dcdbas or i8k try to 
> achieve this by pinning the running thread to cpu 0, but in Dom0 this 
> is not enough: the vcpu must be pinned to physical cpu 0 via Xen, too.
> 
> Add a stable hypercall option SCHEDOP_pin_temp to the sched_op 
> hypercall to achieve this. It is taking a physical cpu number as 
> parameter. If pinning is possible (the calling domain has the 
> privilege to make the call and the cpu is available in the domain's
> cpupool) the calling vcpu is pinned to the specified cpu. The old cpu 
> affinity is saved. To undo the temporary pinning a cpu -1 is 
> specified. This will restore the original cpu affinity for the vcpu.
> 
> Signed-off-by: Juergen Gross 
> ---
> V2: - limit operation to hardware domain as suggested by Jan Beulich
> - some style issues corrected as requested by Jan Beulich
> - use fixed width types in interface as requested by Jan Beulich
> - add compat layer checking as requested by Jan Beulich
> ---
>  xen/common/compat/schedule.c |  4 ++
>  xen/common/schedule.c| 92 
> +---
>  xen/include/public/sched.h   | 17 
>  xen/include/xlat.lst |  1 +
>  4 files changed, 109 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/common/compat/schedule.c b/xen/common/compat/schedule.c
> index 812c550..73b0f01 100644
> --- a/xen/common/compat/schedule.c
> +++ b/xen/common/compat/schedule.c
> @@ -10,6 +10,10 @@
>  
>  #define do_sched_op compat_sched_op
>  
> +#define xen_sched_pin_temp sched_pin_temp
> +CHECK_sched_pin_temp;
> +#undef xen_sched_pin_temp
> +
>  #define xen_sched_shutdown sched_shutdown
>  CHECK_sched_shutdown;
>  #undef xen_sched_shutdown
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index b0d4b18..653f852 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -271,6 +271,12 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
>  struct scheduler *old_ops;
>  void *old_domdata;
>  
> +for_each_vcpu ( d, v )
> +{
> +if ( v->affinity_broken )
> +return -EBUSY;
> +}
> +
>  domdata = SCHED_OP(c->sched, alloc_domdata, d);
>  if ( domdata == NULL )
>  return -ENOMEM;
> @@ -669,6 +675,14 @@ int cpu_disable_scheduler(unsigned int cpu)
>  if ( cpumask_empty(&online_affinity) &&
>   cpumask_test_cpu(cpu, v->cpu_hard_affinity) )
>  {
> +if ( v->affinity_broken )
> +{
> +/* The vcpu is temporarily pinned, can't move it. */
> +vcpu_schedule_unlock_irqrestore(lock, flags, v);
> +ret = -EBUSY;
> +break;
> +}

Does this mean that if the user closes the laptop lid while one of these 
drivers has vcpu0 pinned, that Xen will crash (see 
xen/arch/x86/smpboot.c:__cpu_disable())?  Or is it the OS's job to make sure 
that all temporary pins are removed before suspending?

Also -- have you actually tested the "cpupool move while pinned"
functionality to make sure it actually works?  There's a weird bit in
cpupool_unassign_cpu_helper() where after calling cpu_disable_scheduler(cpu), 
it unconditionally sets the cpu bit in the cpupool_free_cpus mask, even if it 
returns an error.  That can't be right, even for the existing -EAGAIN case, can 
it?

I see that you have a loop to retry this call several times in the next patch; 
but what if it fails every time -- what state is the system in?

And, in general, what happens if the device driver gets mixed up and forgets to 
unpin the vcpu?  Is the only recourse to reboot your host (or deal with the 
fact that you can't reconfigure your cpupools)?

 -George

Sorry, lost the original thread so replying at the top of mail chain.

+static XSM_INLINE int xsm_schedop_pin_temp(XSM_DEFAULT_VOID) 
+{ 
+ XSM_ASSERT_ACTION(XSM_PRIV); 
+ return xsm_default_action(action, current->domain, NULL); 
+}

Is the intention to restrict the hypercall usage to dom0 only?

Anshul Makkar



[Xen-devel] Changes to xenbits login (removing ssh password authentication) - please reply by Friday March 4th

2016-03-02 Thread Lars Kurth
Hi all,

due to the denyhosts package having been removed from Jessie, we are planning 
to disable SSH password authentication from xenbits. The majority of people who 
have xenbits accounts do use SSH public-key authentication, but there may be 
some people who don't. 

I added people who I could identify from their logins and who have been active 
in the community recently into the BCC list. 

If you do use password authentication and not SSH public-key authentication, 
please reply to this mail. We may need to install your SSH key on xenbits.

Best Regards
Lars


Re: [Xen-devel] [PATCH v2 2/3] xen: add hypercall option to temporarily pin a vcpu

2016-03-02 Thread Juergen Gross
On 02/03/16 17:03, Dario Faggioli wrote:
> On Wed, 2016-03-02 at 16:34 +0100, Juergen Gross wrote:
>> On 02/03/16 10:27, Dario Faggioli wrote:
>>>  
>>> However, an xl flag is easier to add, easier to document and easier
>>> and
>>> more natural to find, from the point of view of an user that really
>>> needs it. And perhaps it could turn out useful for other situations
>>> in
>>> future. So, I guess I'd say:
>>>  - yes, let's add that
>>>  - let's do it as a "force flag" of `xl vcpu-pin'.
>> Which raises the question: how to do that on the libxl level?
>>
> Ah, right.
> 
>> a) expand libxl_set_vcpuaffinity() with another parameter (is this
>> even
>>possible? I could do some ifdeffery, but the API would change...)
>>
>> b) add a libxl_set_vcpuaffinity_force() variant
>>
>> c) imply the force flag by specifying both hard and soft maps as NULL
>>(it _is_ basically just that: keep both affinity sets), implying
>> that
>>it makes no sense to specify any affinities with the -f flag
>> (which
>>renders the "force" meaning rather strange, would be more a
>> "restore"
>>now).
>>
> Eheh, tools' maintainers' call. My preference would be b).
> 
> I don't like a), mostly because that would mean everyone will need to
> specify a parameter that it is really only necessary in special cases.
> 
> I could live with c), but it indeed makes the semantic too convoluted
> for my taste.
> 
> I guess, however, that even if going for b), we need to decide whether
> to require a cpumask or not, and what to do if one passes NULL. Maybe
> we can have a cpumask parameter and,
>  - if it is not NULL, force affinity to that,
>  - if it is NULL, just 'restore';
> what do you think?

I would just let the force flag restore the old setting (thus clearing
the affinity_broken flag) and then apply the normal affinity settings.

> Actually, at Xen level, the override only acts on hard affinity...
> should libxl take only one cpumask (for hard affinity only), or both
> hard and soft?

Just as the user is specifying: 0, 1 or 2.

> I'd say just one for hard is enough, unless we want to make space for a
> potential future situation where we will want to break and restore soft
> affinity as well...

The force flag would be just an add-on. That's rather easy in the
hypervisor and in the tools.
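
A very simplified model of what I mean, just to pin down the semantics. Everything here is an assumption made for the sketch: `struct vcpu_model`, `pin_temp()` and `set_affinity_force()` are invented, affinity is a plain 64-bit mask, and a second temporary pin while one is active is simply rejected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a vcpu's hard affinity, one bit per physical cpu. */
struct vcpu_model {
    uint64_t hard_affinity;
    uint64_t saved_affinity;  /* valid only while affinity_broken is set */
    int affinity_broken;      /* 1 while a temporary pin is active */
};

/* Temporarily pin to one cpu; cpu < 0 undoes the pin. Returns 0 or -1. */
int pin_temp(struct vcpu_model *v, int cpu)
{
    assert(v);

    if (cpu < 0) {
        if (v->affinity_broken) {
            v->hard_affinity = v->saved_affinity;
            v->affinity_broken = 0;
        }
        return 0;
    }

    if (cpu >= 64 || v->affinity_broken)
        return -1;

    v->saved_affinity = v->hard_affinity;
    v->hard_affinity = UINT64_C(1) << cpu;
    v->affinity_broken = 1;
    return 0;
}

/* "Force" variant: restore the old setting first (clearing
 * affinity_broken), then apply the requested affinity as usual. */
int set_affinity_force(struct vcpu_model *v, uint64_t new_hard)
{
    assert(v);

    if (new_hard == 0)
        return -1;

    pin_temp(v, -1);
    v->hard_affinity = new_hard;
    return 0;
}
```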


Juergen




Re: [Xen-devel] Zero-sized reads from XenBus block

2016-03-02 Thread Boris Ostrovsky

On 03/02/2016 11:35 AM, Roger Pau Monné wrote:

El 2/3/16 a les 17:13, Wei Liu ha escrit:

CC Linux kernel and FreeBSD maintainers.

On Wed, Mar 02, 2016 at 12:29:26AM +0300, Sergei Lebedev wrote:

Hi list,

I’m not sure if this is the expected behaviour, but it seems zero-sized reads 
from /dev/xen/xenbus block. Here’s sample code in Python

 import os
 
 fd = os.open("/dev/xen/xenbus", os.O_RDWR)

 os.read(fd, 0)  # Blocks.

The issue is not language-specific, similar code in C blocks as well.
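
For reference, a C version of the same experiment might look like the sketch below. `zero_length_read()` is a helper name made up for this mail; it only uses plain POSIX open()/read()/close().

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open `path` read-write and issue a zero-length read on it.
 * Returns whatever read() returned, or -1 if the open failed. */
long zero_length_read(const char *path)
{
    char dummy;
    ssize_t n;
    int fd;

    assert(path);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    n = read(fd, &dummy, 0);   /* zero bytes requested */
    close(fd);

    return (long)n;
}
```

On an ordinary device such as /dev/null this returns 0 right away; calling it with "/dev/xen/xenbus" is the case that blocks for me.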

I've tested your code on FreeBSD (after replacing /dev/xen/xenbus with
/dev/xen/xenstore), and it doesn't block there. AFAICT this is because
0-size reads never get to the device "read" routine on FreeBSD, or else
it would block.


This is how xenbus driver is designed --- it always blocks until 
something is written there.


It should indeed return zero right away but I wonder whether someone 
might count on current implementation (in the toolstack or elsewhere). 
Based on FreeBSD behavior I'd think this shouldn't be the case.



-boris



Re: [Xen-devel] [PATCH 5/9] xl: Improve return and exit codes of main_pause(), main_unpause(), main_destroy() and main_shutdown_or_reboot() related functions.

2016-03-02 Thread Dario Faggioli
On Wed, 2016-02-24 at 18:23 +0530, Harmandeep Kaur wrote:
> Signed-off-by: Harmandeep Kaur 
>
Apart from the subject that, as said already, should be more generic,
and from (at least) one long line, this patch looks fine to me.

I'd provide my Reviewed-by, but I asked to move some hunks from patch
4/9 to here, so that would not apply to next version anyway.

In any case, as said, as far as these modifications go, they're fine.

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





[Xen-devel] [GRUB2 PATCH v3 4/4] multiboot2: Add support for relocatable images

2016-03-02 Thread Daniel Kiper
Currently the multiboot2 protocol loads an image exactly at the address
specified in the ELF or multiboot2 header. This solution works quite well on
legacy BIOS platforms. It is possible because memory regions are placed at
predictable addresses (though I was not able to find any spec which says that
this is a strong requirement, so it looks like it is just goodwill on the part
of hardware designers). However, EFI platforms are more volatile. Even if the
required memory regions live at specific addresses, they are sometimes simply
not free (e.g. used by boot/runtime services on Dell PowerEdge R820 and
OVMF). This means that you are not able to simply set up the final image
destination at build time. You have to provide a method to relocate the image
contents to the real load address, which is usually different from the load
address specified in the ELF and multiboot2 headers.

This patch provides all needed machinery to do self relocation in image code.
First of all GRUB2 reads min_addr (min. load addr), max_addr (max. load addr),
align (required image alignment), preference (it says which memory regions are
preferred by image, e.g. none, low, high) from multiboot_header_tag_relocatable
header tag contained in binary. Later loader tries to fulfill request (not only
that one) and if it succeeds then it informs image about real load address via
multiboot_tag_base_addr tag. At this stage GRUB2 role is finished. Starting
from now executable must cope with relocations itself using whole static
and dynamic knowledge provided by boot loader.
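
To illustrate the kind of decision the loader has to make, here is a rough, self-contained sketch. It is not the GRUB2 code: `struct reloc_request`, `enum load_pref` and `pick_load_addr()` are names invented for this example, only one free memory region is considered, and max_addr is assumed to be an exclusive upper bound on the end of the image.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the four values read from the header tag. */
enum load_pref { PREF_NONE = 0, PREF_LOW = 1, PREF_HIGH = 2 };

struct reloc_request {
    uint64_t min_addr;    /* lowest acceptable load address */
    uint64_t max_addr;    /* image must end at or below this address */
    uint64_t align;       /* required alignment, in bytes */
    enum load_pref pref;  /* none, low or high */
};

static uint64_t align_up(uint64_t x, uint64_t a)   { return (x + a - 1) / a * a; }
static uint64_t align_down(uint64_t x, uint64_t a) { return x / a * a; }

/* Pick a load address for an image of `size` bytes inside the free
 * region [free_start, free_end). Returns 0 and fills *out on success,
 * -1 if the request cannot be satisfied. */
int pick_load_addr(const struct reloc_request *req,
                   uint64_t free_start, uint64_t free_end,
                   uint64_t size, uint64_t *out)
{
    uint64_t lo, hi, addr;

    assert(req && out);

    lo = req->min_addr > free_start ? req->min_addr : free_start;
    hi = req->max_addr < free_end ? req->max_addr : free_end;

    if (size == 0 || req->align == 0 || hi < lo || hi - lo < size)
        return -1;

    if (req->pref == PREF_HIGH) {
        /* Highest aligned address that still leaves room for the image. */
        addr = align_down(hi - size, req->align);
        if (addr < lo)
            return -1;
    } else {
        /* PREF_NONE and PREF_LOW: lowest aligned address in range. */
        addr = align_up(lo, req->align);
        if (addr > hi - size)
            return -1;
    }

    *out = addr;
    return 0;
}
```

The address chosen this way is what would then be reported back to the image, which has to apply its own relocations from there.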

This patch does not provide functionality which could do relocations using
ELF relocation data. However, I was asked by Konrad Rzeszutek Wilk and Vladimir
'phcoder' Serbinenko to investigate that. It looks like the relevant machinery
could be added to the existing code (including this patch) without huge effort.
Additionally, ELF relocation could live in parallel with the self relocation
provided by this patch. However, during research I realized that first of all
we should establish the details of how an ELF relocatable image should look and
how it should be built, at least in order to build proper test/example files.

As far as I can see, the multiboot2 protocol is able to consume ET_EXEC and
ET_DYN ELF files. Potentially we can use the ET_DYN file type. It can be built
with the gcc/ld -pie option. However, it contains a lot of unneeded stuff (e.g.
INTERP, DYNAMIC, GNU_EH_FRAME program headers) and it could be quite difficult
to drop it (Hmmm... Is it possible to build it properly with a custom ld
script?). So, I have checked the ET_EXEC file type. Sadly, in this case the
linker by default resolves all local symbol relocations and removes relocation
related sections. Fortunately, it is possible to leave them as is with the
simple -q/--emit-relocs ld option. However, the output file is quite fragile
and any operation on it should be done with great care (e.g. strip should be
called with the --strip-unneeded option). So, this solution is not perfect
either. It means that maybe we should look for a better solution. However, I
think that we should not use any custom tools and should focus on the
functionality provided by the compiler and binutils. In this context ld scripts
look quite promising, but maybe you have better solutions. So, what do you
think about that?
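
When experimenting with the two build variants it helps to check which ELF type a given file really is. A small sketch of such a check follows; `elf_kind()` is an invented helper, it relies on the Linux <elf.h> header, and it assumes the file has the same byte order as the host.

```c
#include <assert.h>
#include <elf.h>
#include <string.h>

/* Classify a buffer holding the start of a file by its ELF e_type.
 * e_type directly follows e_ident in both 32-bit and 64-bit headers. */
const char *elf_kind(const unsigned char *buf, size_t len)
{
    Elf32_Half type;

    assert(buf);

    if (len < EI_NIDENT + sizeof(type) || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return "not ELF";

    memcpy(&type, buf + EI_NIDENT, sizeof(type));

    switch (type) {
    case ET_EXEC:
        return "ET_EXEC";
    case ET_DYN:
        return "ET_DYN";
    default:
        return "other";
    }
}
```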

This patch was tested with a Xen image which uses that functionality. However,
this Xen feature is still under development and a new patchset will be released
in about 3-4 weeks.

Signed-off-by: Daniel Kiper 
---
v3 - suggestions/fixes:
   - reduce number of casts
 (suggested by Konrad Rzeszutek Wilk),
   - remove unneeded space at the end of line
 (suggested by Konrad Rzeszutek Wilk),
   - improve commit message
 (suggested by Konrad Rzeszutek Wilk).
---
 grub-core/loader/i386/multiboot_mbi.c |6 ++-
 grub-core/loader/multiboot.c  |   12 --
 grub-core/loader/multiboot_elfxx.c|   28 ++
 grub-core/loader/multiboot_mbi2.c |   65 ++---
 include/grub/multiboot.h  |4 +-
 include/multiboot2.h  |   24 
 6 files changed, 120 insertions(+), 19 deletions(-)

diff --git a/grub-core/loader/i386/multiboot_mbi.c b/grub-core/loader/i386/multiboot_mbi.c
index f60b702..4fc83ed 100644
--- a/grub-core/loader/i386/multiboot_mbi.c
+++ b/grub-core/loader/i386/multiboot_mbi.c
@@ -72,7 +72,8 @@ load_kernel (grub_file_t file, const char *filename,
   grub_err_t err;
   if (grub_multiboot_quirks & GRUB_MULTIBOOT_QUIRK_BAD_KLUDGE)
 {
-  err = grub_multiboot_load_elf (file, filename, buffer);
+  err = grub_multiboot_load_elf (file, filename, buffer, 0, 0, 0, 0,
+GRUB_RELOCATOR_PREFERENCE_NONE, NULL, 0);
   if (err == GRUB_ERR_NONE) {
return GRUB_ERR_NONE;
   }
@@ -121,7 +122,8 @@ load_kernel (grub_file_t file, const char *filename,
   return GRUB_ERR_NONE;
 }
 
-  return grub_multiboot_load_elf (file, filename, buffer);
+  return grub_multiboot_load_elf (file, filename, buf

[Xen-devel] [GRUB2 PATCH v3 1/4] i386/relocator: Add grub_relocator64_efi relocator

2016-03-02 Thread Daniel Kiper
Add grub_relocator64_efi relocator. It will be used on EFI 64-bit platforms
when multiboot2 compatible image requests MULTIBOOT_TAG_TYPE_EFI_BS. Relocator
will set lower parts of %rax and %rbx accordingly to multiboot2 specification.
On the other hand processor mode, just before jumping into loaded image, will
be set accordingly to Unified Extensible Firmware Interface Specification,
Version 2.4 Errata B, section 2.3.4, x64 Platforms, boot services. This way
loaded image will be able to use EFI boot services without any issues.

If idea is accepted I will prepare grub_relocator32_efi relocator too.

Signed-off-by: Daniel Kiper 
---
v3 - suggestions/fixes:
   - reuse grub-core/lib/i386/relocator64.S code
 instead of creating separate assembly file
 (suggested by Vladimir 'phcoder' Serbinenko),
   - grub_multiboot_boot() cleanup
 (suggested by Vladimir 'phcoder' Serbinenko),
   - reuse multiboot_header_tag_entry_address struct instead
 of creating new one for EFI 64-bit entry point
 (suggested by Vladimir 'phcoder' Serbinenko).
---
 grub-core/lib/i386/relocator.c|   48 ++
 grub-core/lib/i386/relocator64.S  |3 +++
 grub-core/loader/multiboot.c  |   51 +
 grub-core/loader/multiboot_mbi2.c |   19 +++---
 include/grub/i386/multiboot.h |   11 
 include/grub/i386/relocator.h |   21 +++
 include/multiboot2.h  |1 +
 7 files changed, 145 insertions(+), 9 deletions(-)

diff --git a/grub-core/lib/i386/relocator.c b/grub-core/lib/i386/relocator.c
index 71dd4f0..2b0c260 100644
--- a/grub-core/lib/i386/relocator.c
+++ b/grub-core/lib/i386/relocator.c
@@ -69,6 +69,13 @@ extern grub_uint64_t grub_relocator64_rsi;
 extern grub_addr_t grub_relocator64_cr3;
 extern struct grub_i386_idt grub_relocator16_idt;
 
+#ifdef GRUB_MACHINE_EFI
+#ifdef __x86_64__
+extern grub_uint8_t grub_relocator64_efi_start;
+extern grub_uint8_t grub_relocator64_efi_end;
+#endif
+#endif
+
 #define RELOCATOR_SIZEOF(x)	(&grub_relocator##x##_end - &grub_relocator##x##_start)
 
 grub_err_t
@@ -214,3 +221,44 @@ grub_relocator64_boot (struct grub_relocator *rel,
   /* Not reached.  */
   return GRUB_ERR_NONE;
 }
+
+#ifdef GRUB_MACHINE_EFI
+#ifdef __x86_64__
+grub_err_t
+grub_relocator64_efi_boot (struct grub_relocator *rel,
+  struct grub_relocator64_efi_state state)
+{
+  grub_err_t err;
+  void *relst;
+  grub_relocator_chunk_t ch;
+
+  err = grub_relocator_alloc_chunk_align (rel, &ch, 0,
+ 0x4000 - RELOCATOR_SIZEOF (64_efi),
+ RELOCATOR_SIZEOF (64_efi), 16,
+ GRUB_RELOCATOR_PREFERENCE_NONE, 1);
+  if (err)
+return err;
+
+  /* Do not touch %rsp! It points to EFI created stack. */
+  grub_relocator64_rax = state.rax;
+  grub_relocator64_rbx = state.rbx;
+  grub_relocator64_rcx = state.rcx;
+  grub_relocator64_rdx = state.rdx;
+  grub_relocator64_rip = state.rip;
+  grub_relocator64_rsi = state.rsi;
+
+  grub_memmove (get_virtual_current_address (ch), &grub_relocator64_efi_start,
+   RELOCATOR_SIZEOF (64_efi));
+
+  err = grub_relocator_prepare_relocs (rel, get_physical_target_address (ch),
+  &relst, NULL);
+  if (err)
+return err;
+
+  ((void (*) (void)) relst) ();
+
+  /* Not reached.  */
+  return GRUB_ERR_NONE;
+}
+#endif
+#endif
diff --git a/grub-core/lib/i386/relocator64.S b/grub-core/lib/i386/relocator64.S
index e4648d8..7a06b16 100644
--- a/grub-core/lib/i386/relocator64.S
+++ b/grub-core/lib/i386/relocator64.S
@@ -73,6 +73,7 @@ VARIABLE(grub_relocator64_rsp)
 
movq%rax, %rsp
 
+VARIABLE(grub_relocator64_efi_start)
/* mov imm64, %rax */
.byte   0x48
.byte   0xb8
@@ -120,6 +121,8 @@ LOCAL(jump_addr):
 VARIABLE(grub_relocator64_rip)
.quad   0
 
+VARIABLE(grub_relocator64_efi_end)
+
 #ifndef __x86_64__
.p2align4
 LOCAL(gdt):
diff --git a/grub-core/loader/multiboot.c b/grub-core/loader/multiboot.c
index 73aa0aa..18038fd 100644
--- a/grub-core/loader/multiboot.c
+++ b/grub-core/loader/multiboot.c
@@ -118,6 +118,48 @@ grub_multiboot_set_video_mode (void)
   return err;
 }
 
+#ifdef GRUB_MACHINE_EFI
+#ifdef __x86_64__
+#define grub_relocator_efi_boot		grub_relocator64_efi_boot
+#define grub_relocator_efi_state   grub_relocator64_efi_state
+#endif
+#endif
+
+#ifdef grub_relocator_efi_boot
+static void
+efi_boot (struct grub_relocator *rel,
+ grub_uint32_t target)
+{
+  struct grub_relocator_efi_state state_efi = MULTIBOOT_EFI_INITIAL_STATE;
+
+  state_efi.MULTIBOOT_EFI_ENTRY_REGISTER = grub_multiboot_payload_eip;
+  state_efi.MULTIBOOT_EFI_MBI_REGISTER = target;
+
+  grub_relocator_efi_boot (rel, state_efi);
+}
+#else
+#define grub_efi_is_finished   1
+static void
+efi_boot (struct grub_relocator *rel __attribute__ (

[Xen-devel] [GRUB2 PATCH v3 0/4] multiboot2: Add two extensions

2016-03-02 Thread Daniel Kiper
Hi,

This patch series:
  - enables EFI boot services usage in loaded images
by multiboot2 protocol,
  - add support for multiboot2 protocol compatible
relocatable images.

Earlier versions of this patch series are extensively tested
and used internally at least in Oracle. It should be mentioned
that this release does not change any functionality introduced
by earlier releases. It just takes into account comments posted
by various people.

Hmmm... Ugh... Cough... Is it possible to get this stuff
into 2.02 train?

Daniel

 grub-core/lib/i386/relocator.c|   48 +++
 grub-core/lib/i386/relocator64.S  |3 ++
 grub-core/loader/i386/multiboot_mbi.c |6 ++-
 grub-core/loader/multiboot.c  |   63 
 grub-core/loader/multiboot_elfxx.c|   28 ---
 grub-core/loader/multiboot_mbi2.c |  205 +-
 include/grub/i386/multiboot.h |   11 +
 include/grub/i386/relocator.h |   21 
 include/grub/multiboot.h  |4 +-
 include/multiboot2.h  |   41 
 10 files changed, 357 insertions(+), 73 deletions(-)

Daniel Kiper (4):
  i386/relocator: Add grub_relocator64_efi relocator
  multiboot2: Add tags used to pass ImageHandle to loaded image
  multiboot2: Do not pass memory maps to image if EFI boot services are enabled
  multiboot2: Add support for relocatable images


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [GRUB2 PATCH v3 2/4] multiboot2: Add tags used to pass ImageHandle to loaded image

2016-03-02 Thread Daniel Kiper
Add tags used to pass ImageHandle to the loaded image if requested.
The ImageHandle is needed by at least the ExitBootServices() function.

Signed-off-by: Daniel Kiper 
---
v3 - suggestions/fixes:
   - mbi EFI related stuff size calculation
 should depend on target architecture
 (suggested by Konrad Rzeszutek Wilk),
   - use plain type instead of pointer
 dereference as sizeof() argument
 (suggested by Konrad Rzeszutek Wilk),
   - improve commit message
 (suggested by Konrad Rzeszutek Wilk).
---
 grub-core/loader/multiboot_mbi2.c |   50 ++---
 include/multiboot2.h  |   16 
 2 files changed, 57 insertions(+), 9 deletions(-)

diff --git a/grub-core/loader/multiboot_mbi2.c b/grub-core/loader/multiboot_mbi2.c
index a3dca90..7591edc 100644
--- a/grub-core/loader/multiboot_mbi2.c
+++ b/grub-core/loader/multiboot_mbi2.c
@@ -172,6 +172,8 @@ grub_multiboot_load (grub_file_t file, const char *filename)
  case MULTIBOOT_TAG_TYPE_NETWORK:
  case MULTIBOOT_TAG_TYPE_EFI_MMAP:
  case MULTIBOOT_TAG_TYPE_EFI_BS:
+ case MULTIBOOT_TAG_TYPE_EFI32_IH:
+ case MULTIBOOT_TAG_TYPE_EFI64_IH:
break;
 
  default:
@@ -407,16 +409,22 @@ grub_multiboot_get_mbi_size (void)
 + grub_get_multiboot_mmap_count ()
 * sizeof (struct multiboot_mmap_entry)), MULTIBOOT_TAG_ALIGN)
 + ALIGN_UP (sizeof (struct multiboot_tag_framebuffer), MULTIBOOT_TAG_ALIGN)
+#ifdef GRUB_MACHINE_EFI
+#ifdef __i386__
 + ALIGN_UP (sizeof (struct multiboot_tag_efi32), MULTIBOOT_TAG_ALIGN)
++ ALIGN_UP (sizeof (struct multiboot_tag_efi32_ih), MULTIBOOT_TAG_ALIGN)
+#endif
+#ifdef __x86_64__
 + ALIGN_UP (sizeof (struct multiboot_tag_efi64), MULTIBOOT_TAG_ALIGN)
++ ALIGN_UP (sizeof (struct multiboot_tag_efi64_ih), MULTIBOOT_TAG_ALIGN)
+#endif
++ ALIGN_UP (sizeof (struct multiboot_tag_efi_mmap)
+   + efi_mmap_size, MULTIBOOT_TAG_ALIGN)
+#endif
 + ALIGN_UP (sizeof (struct multiboot_tag_old_acpi)
+ sizeof (struct grub_acpi_rsdp_v10), MULTIBOOT_TAG_ALIGN)
 + acpiv2_size ()
 + net_size ()
-#ifdef GRUB_MACHINE_EFI
-+ ALIGN_UP (sizeof (struct multiboot_tag_efi_mmap)
-   + efi_mmap_size, MULTIBOOT_TAG_ALIGN)
-#endif
 + sizeof (struct multiboot_tag_vbe) + MULTIBOOT_TAG_ALIGN - 1
 + sizeof (struct multiboot_tag_apm) + MULTIBOOT_TAG_ALIGN - 1;
 }
@@ -907,11 +915,35 @@ grub_multiboot_make_mbi (grub_uint32_t *target)
 
   if (keep_bs)
 {
-  struct multiboot_tag *tag = (struct multiboot_tag *) ptrorig;
-  tag->type = MULTIBOOT_TAG_TYPE_EFI_BS;
-  tag->size = sizeof (struct multiboot_tag);
-  ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
-   / sizeof (grub_properly_aligned_t);
+  {
+   struct multiboot_tag *tag = (struct multiboot_tag *) ptrorig;
+   tag->type = MULTIBOOT_TAG_TYPE_EFI_BS;
+   tag->size = sizeof (struct multiboot_tag);
+   ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+ / sizeof (grub_properly_aligned_t);
+  }
+
+#ifdef __i386__
+  {
+   struct multiboot_tag_efi32_ih *tag = (struct multiboot_tag_efi32_ih *) ptrorig;
+   tag->type = MULTIBOOT_TAG_TYPE_EFI32_IH;
+   tag->size = sizeof (struct multiboot_tag_efi32_ih);
+   tag->pointer = (grub_addr_t) grub_efi_image_handle;
+   ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+ / sizeof (grub_properly_aligned_t);
+  }
+#endif
+
+#ifdef __x86_64__
+  {
+   struct multiboot_tag_efi64_ih *tag = (struct multiboot_tag_efi64_ih *) ptrorig;
+   tag->type = MULTIBOOT_TAG_TYPE_EFI64_IH;
+   tag->size = sizeof (struct multiboot_tag_efi64_ih);
+   tag->pointer = (grub_addr_t) grub_efi_image_handle;
+   ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+ / sizeof (grub_properly_aligned_t);
+  }
+#endif
 }
 #endif
 
diff --git a/include/multiboot2.h b/include/multiboot2.h
index d96aa40..36a174f 100644
--- a/include/multiboot2.h
+++ b/include/multiboot2.h
@@ -60,6 +60,8 @@
 #define MULTIBOOT_TAG_TYPE_NETWORK   16
 #define MULTIBOOT_TAG_TYPE_EFI_MMAP  17
 #define MULTIBOOT_TAG_TYPE_EFI_BS18
+#define MULTIBOOT_TAG_TYPE_EFI32_IH  19
+#define MULTIBOOT_TAG_TYPE_EFI64_IH  20
 
 #define MULTIBOOT_HEADER_TAG_END  0
 #define MULTIBOOT_HEADER_TAG_INFORMATION_REQUEST  1
@@ -371,6 +373,20 @@ struct multiboot_tag_efi_mmap
   multiboot_uint8_t efi_mmap[0];
 }; 
 
+struct multiboot_tag_efi32_ih
+{
+  multiboot_uint32_t type;
+  multiboot_uint32_t size;
+  multiboot_uint32_t pointer;
+};
+
+struct multiboot_tag_efi64_ih
+{
+  multiboot_uint32_t type;
+  multiboot_uint32_t size;
+  multiboot_uint64_t pointer;
+};
+
 #endif /* ! ASM_FILE */
 
 #endif /* ! MULTIBOOT_HEADER */
-- 
1.7.10.4
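
For illustration only, here is a minimal sketch of how a loaded image might
locate the new 64-bit ImageHandle tag. It assumes the usual multiboot2
information layout (an 8-byte header holding total_size and reserved,
followed by 8-byte-aligned tags and terminated by a tag of type 0). The names
mb2_tag_header and mb2_find_efi64_ih are hypothetical; they are not part of
GRUB or of this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MULTIBOOT_TAG_ALIGN          8
#define MULTIBOOT_TAG_TYPE_END       0
#define MULTIBOOT_TAG_TYPE_EFI64_IH  20   /* value added by this patch */

/* Hypothetical name: the common header every tag starts with. */
struct mb2_tag_header {
    uint32_t type;
    uint32_t size;
};

/*
 * Walk the tag list in the multiboot2 information area and copy out the
 * 64-bit EFI image handle. Returns 1 if the tag was found, 0 otherwise.
 * memcpy is used for all loads so the buffer needs no special alignment.
 */
static int mb2_find_efi64_ih(const uint8_t *mbi, uint64_t *handle)
{
    uint32_t total_size;
    size_t off = 8;   /* skip total_size and reserved */

    memcpy(&total_size, mbi, sizeof(total_size));

    while (off + sizeof(struct mb2_tag_header) <= total_size) {
        struct mb2_tag_header hdr;

        memcpy(&hdr, mbi + off, sizeof(hdr));

        if (hdr.type == MULTIBOOT_TAG_TYPE_END)
            break;
        if (hdr.size < sizeof(hdr) || hdr.size > total_size - off)
            break;    /* malformed tag, stop walking */

        if (hdr.type == MULTIBOOT_TAG_TYPE_EFI64_IH &&
            hdr.size >= sizeof(hdr) + sizeof(uint64_t)) {
            /* The pointer field follows type and size directly. */
            memcpy(handle, mbi + off + sizeof(hdr), sizeof(*handle));
            return 1;
        }

        /* Tags are padded so that the next one starts 8-byte aligned. */
        off += (hdr.size + MULTIBOOT_TAG_ALIGN - 1)
               & ~(size_t)(MULTIBOOT_TAG_ALIGN - 1);
    }
    return 0;
}
```

An image would call this with the MBI address it received at entry and then
hand the result to ExitBootServices(). The 32-bit variant would differ only
in the tag type (19) and the width of the pointer field.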



[Xen-devel] [GRUB2 PATCH v3 3/4] multiboot2: Do not pass memory maps to image if EFI boot services are enabled

2016-03-02 Thread Daniel Kiper
Do not pass memory maps to the image if it asked for EFI boot services.
The main reason for not providing the maps is that they will likely be
invalid: we do a few allocations after filling them, e.g. for relocator
needs. Usually we do not care, as we would already have finished boot
services. If we keep boot services, it is easier not to provide the maps.
However, if the image needs memory maps and the bootloader does not provide
them, it should get them itself just before the ExitBootServices() call.

Signed-off-by: Daniel Kiper 
Reviewed-by: Konrad Rzeszutek Wilk 
---
v3 - suggestions/fixes:
   - improve commit message
 (suggested by Konrad Rzeszutek Wilk and Vladimir 'phcoder' Serbinenko).
---
 grub-core/loader/multiboot_mbi2.c |   71 ++---
 1 file changed, 35 insertions(+), 36 deletions(-)

diff --git a/grub-core/loader/multiboot_mbi2.c b/grub-core/loader/multiboot_mbi2.c
index 7591edc..ce68f48 100644
--- a/grub-core/loader/multiboot_mbi2.c
+++ b/grub-core/loader/multiboot_mbi2.c
@@ -390,7 +390,7 @@ static grub_size_t
 grub_multiboot_get_mbi_size (void)
 {
 #ifdef GRUB_MACHINE_EFI
-  if (!efi_mmap_size)
+  if (!keep_bs && !efi_mmap_size)
 find_efi_mmap_size ();
 #endif
   return 2 * sizeof (grub_uint32_t) + sizeof (struct multiboot_tag)
@@ -759,12 +759,13 @@ grub_multiboot_make_mbi (grub_uint32_t *target)
   }
   }
 
-  {
-struct multiboot_tag_mmap *tag = (struct multiboot_tag_mmap *) ptrorig;
-grub_fill_multiboot_mmap (tag);
-ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
-  / sizeof (grub_properly_aligned_t);
-  }
+  if (!keep_bs)
+{
+  struct multiboot_tag_mmap *tag = (struct multiboot_tag_mmap *) ptrorig;
+  grub_fill_multiboot_mmap (tag);
+  ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+   / sizeof (grub_properly_aligned_t);
+}
 
   {
 struct multiboot_tag_elf_sections *tag
@@ -780,18 +781,19 @@ grub_multiboot_make_mbi (grub_uint32_t *target)
   / sizeof (grub_properly_aligned_t);
   }
 
-  {
-struct multiboot_tag_basic_meminfo *tag
-  = (struct multiboot_tag_basic_meminfo *) ptrorig;
-tag->type = MULTIBOOT_TAG_TYPE_BASIC_MEMINFO;
-tag->size = sizeof (struct multiboot_tag_basic_meminfo); 
+  if (!keep_bs)
+{
+  struct multiboot_tag_basic_meminfo *tag
+   = (struct multiboot_tag_basic_meminfo *) ptrorig;
+  tag->type = MULTIBOOT_TAG_TYPE_BASIC_MEMINFO;
+  tag->size = sizeof (struct multiboot_tag_basic_meminfo);
 
-/* Convert from bytes to kilobytes.  */
-tag->mem_lower = grub_mmap_get_lower () / 1024;
-tag->mem_upper = grub_mmap_get_upper () / 1024;
-ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
-   / sizeof (grub_properly_aligned_t);
-  }
+  /* Convert from bytes to kilobytes.  */
+  tag->mem_lower = grub_mmap_get_lower () / 1024;
+  tag->mem_upper = grub_mmap_get_upper () / 1024;
+  ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+   / sizeof (grub_properly_aligned_t);
+}
 
   {
 struct grub_net_network_level_interface *net;
@@ -890,27 +892,24 @@ grub_multiboot_make_mbi (grub_uint32_t *target)
 grub_efi_uintn_t efi_desc_size;
 grub_efi_uint32_t efi_desc_version;
 
-tag->type = MULTIBOOT_TAG_TYPE_EFI_MMAP;
-tag->size = sizeof (*tag) + efi_mmap_size;
-
 if (!keep_bs)
-  err = grub_efi_finish_boot_services (&efi_mmap_size, tag->efi_mmap, NULL,
-  &efi_desc_size, &efi_desc_version);
-else
   {
-   if (grub_efi_get_memory_map (&efi_mmap_size, (void *) tag->efi_mmap,
-NULL,
-&efi_desc_size, &efi_desc_version) <= 0)
- err = grub_error (GRUB_ERR_IO, "couldn't retrieve memory map");
+   tag->type = MULTIBOOT_TAG_TYPE_EFI_MMAP;
+   tag->size = sizeof (*tag) + efi_mmap_size;
+
+   err = grub_efi_finish_boot_services (&efi_mmap_size, tag->efi_mmap, NULL,
+&efi_desc_size, &efi_desc_version);
+
+   if (err)
+ return err;
+
+   tag->descr_size = efi_desc_size;
+   tag->descr_vers = efi_desc_version;
+   tag->size = sizeof (*tag) + efi_mmap_size;
+
+   ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+ / sizeof (grub_properly_aligned_t);
   }
-if (err)
-  return err;
-tag->descr_size = efi_desc_size;
-tag->descr_vers = efi_desc_version;
-tag->size = sizeof (*tag) + efi_mmap_size;
-
-ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
-  / sizeof (grub_properly_aligned_t);
   }
 
   if (keep_bs)
-- 
1.7.10.4




Re: [Xen-devel] [PATCH v5] libelf: rewrite symtab/strtab loading

2016-03-02 Thread Jan Beulich
>>> On 01.03.16 at 12:59,  wrote:
> Changes since v4:
>  - Add a define that contains the number of sections.
>  - Improve the comment to describe the memory layout.
>  - Check that the sh_link field is 0 < sh_link < e_shnum.
>  - Simplify some of the logic, since the SYMTAB section is already
>discovered by elf_init and it's handler stored in elf->sym_tab.

Well, this was a nice idea, but ...

> @@ -164,101 +169,248 @@ void elf_parse_bsdsyms(struct elf_binary *elf, 
> uint64_t pstart)
>  sz = sizeof(uint32_t);
>  
>  /* Space for the elf and elf section headers */
> -sz += (elf_uval(elf, elf->ehdr, e_ehsize) +
> -   elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize));
> +sz += elf_uval(elf, elf->ehdr, e_ehsize) +
> +  ELF_BSDSYM_SECTIONS * elf_uval(elf, elf->ehdr, e_shentsize);
>  sz = elf_round_up(elf, sz);
>  
> +
>  /* Space for the symbol and string tables. */
> -for ( i = 0; i < elf_shdr_count(elf); i++ )
> +sh_link = elf_uval(elf, elf->sym_tab, sh_link);
> +if ( sh_link == SHN_UNDEF || sh_link >= elf_shdr_count(elf) )

... this check then really ought to be moved there (as I now see
in the previous version you likely simply copied what was there).

Everything else looks fine to me now.
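
As a small sketch of the bounds check from the changelog
("0 < sh_link < e_shnum"), independent of where it ends up living: the
helper name sh_link_is_valid is hypothetical and is not a libelf function.

```c
#include <stdbool.h>
#include <stdint.h>

#ifndef SHN_UNDEF
#define SHN_UNDEF 0   /* ELF: undefined section index */
#endif

/*
 * A section's sh_link is usable as an index into the section header
 * table only if it is not SHN_UNDEF and is below the section count.
 */
static bool sh_link_is_valid(uint32_t sh_link, uint32_t shnum)
{
    return sh_link != SHN_UNDEF && sh_link < shnum;
}
```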

Jan




Re: [Xen-devel] Zero-sized reads from XenBus block

2016-03-02 Thread Roger Pau Monné
On 2/3/16 at 17:13, Wei Liu wrote:
> CC Linux kernel and FreeBSD maintainers.
> 
> On Wed, Mar 02, 2016 at 12:29:26AM +0300, Sergei Lebedev wrote:
>> Hi list,
>>
>> I’m not sure if this is the expected behaviour, but it seems zero-sized 
>> reads from /dev/xen/xenbus block. Here’s sample code in Python
>>
>> import os
>> 
>> fd = os.open("/dev/xen/xenbus", os.O_RDWR)
>> os.read(fd, 0)  # Blocks.
>>
>> The issue is not language-specific, similar code in C blocks as well.

I've tested your code on FreeBSD (after replacing /dev/xen/xenbus with
/dev/xen/xenstore), and it doesn't block there. AFAICT this is because
0-size reads never get to the device "read" routine on FreeBSD, or else
it would block.
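
For anyone wanting to try the same thing from C, a minimal sketch of the
zero-length read follows. The helper name zero_length_read is made up for
this example. Whether the call returns immediately depends on whether the
request reaches the device's read routine at all, as described above.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Issue a read of zero bytes on an already open descriptor and return
 * whatever read(2) returns. The buffer is never written to.
 */
static ssize_t zero_length_read(int fd)
{
    char dummy;

    return read(fd, &dummy, 0);
}
```

To reproduce the report, open the device with
open("/dev/xen/xenbus", O_RDWR) (or /dev/xen/xenstore on FreeBSD) and pass
the descriptor to zero_length_read().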

Roger.




[Xen-devel] [PATCH] AMD, maintainers: Remove myself from list

2016-03-02 Thread Aravind Gopalakrishnan
I will not be looking at AMD related Xen code now.
So, removing myself.

Signed-off-by: Aravind Gopalakrishnan 
---
 MAINTAINERS | 2 --
 1 file changed, 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 932b05c..7aacfd6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -107,14 +107,12 @@ F:xen/include/acpi/
 
 AMD IOMMU
 M: Suravee Suthikulpanit 
-M: Aravind Gopalakrishnan 
 S: Maintained
 F: xen/drivers/passthrough/amd/
 
 AMD SVM
 M: Boris Ostrovsky 
 M: Suravee Suthikulpanit 
-M: Aravind Gopalakrishnan 
 S: Supported
 F: xen/arch/x86/hvm/svm/
 F: xen/arch/x86/cpu/vpmu_amd.c
-- 
2.7.0




Re: [Xen-devel] Zero-sized reads from XenBus block

2016-03-02 Thread Wei Liu
CC Linux kernel and FreeBSD maintainers.

On Wed, Mar 02, 2016 at 12:29:26AM +0300, Sergei Lebedev wrote:
> Hi list,
> 
> I’m not sure if this is the expected behaviour, but it seems zero-sized reads 
> from /dev/xen/xenbus block. Here’s sample code in Python
> 
> import os
> 
> fd = os.open("/dev/xen/xenbus", os.O_RDWR)
> os.read(fd, 0)  # Blocks.
> 
> The issue is not language-specific, similar code in C blocks as well.
> 
> Regards,
> Sergei



[Xen-devel] [ovmf test] 84960: regressions - FAIL

2016-03-02 Thread osstest service owner
flight 84960 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/84960/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 65543
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 65543

version targeted for testing:
 ovmf 5f67844615baa56fd9e67597a00adcc6b7387ef9
baseline version:
 ovmf 5ac96e3a28dd26eabee421919f67fa7c443a47f1

Last test of basis    65543  2015-12-08 08:45:15 Z   85 days
Failing since         65593  2015-12-08 23:44:51 Z   84 days   89 attempts
Testing same since    84960  2016-03-01 16:57:24 Z    0 days    1 attempts


People who touched revisions under test:
  "Samer El-Haj-Mahmoud" 
  "Yao, Jiewen" 
  Alcantara, Paulo 
  Anbazhagan Baraneedharan 
  Andrew Fish 
  Ard Biesheuvel 
  Arthur Crippa Burigo 
  Cecil Sheng 
  Chao Zhang 
  Charles Duffy 
  Cinnamon Shia 
  Cohen, Eugene 
  Dandan Bi 
  Daocheng Bu 
  Daryl McDaniel 
  edk2 dev 
  edk2-devel 
  Eric Dong 
  Eric Dong 
  Eugene Cohen 
  Evan Lloyd 
  Feng Tian 
  Fu Siyuan 
  Hao Wu 
  Haojian Zhuang 
  Hess Chen 
  Heyi Guo 
  Jaben Carsey 
  Jeff Fan 
  Jiaxin Wu 
  jiewen yao 
  Jim Dailey 
  Jordan Justen 
  Karyne Mayer 
  Larry Hauch 
  Laszlo Ersek 
  Leahy, Leroy P 
  Lee Leahy 
  Leekha Shaveta 
  Leif Lindholm 
  Liming Gao 
  Mark Rutland 
  Marvin Haeuser 
  Michael Kinney 
  Michael LeMay 
  Michael Thomas 
  Ni, Ruiyu 
  Paolo Bonzini 
  Paulo Alcantara 
  Paulo Alcantara Cavalcanti 
  Qin Long 
  Qiu Shumin 
  Rodrigo Dias Correa 
  Ruiyu Ni 
  Ryan Harkin 
  Samer El-Haj-Mahmoud 
  Samer El-Haj-Mahmoud 
  Star Zeng 
  Supreeth Venkatesh 
  Tapan Shah 
  Tian, Feng 
  Vladislav Vovchenko 
  Yao Jiewen 
  Yao, Jiewen 
  Ye Ting 
  Yonghong Zhu 
  Zhang Lubo 
  Zhang, Chao B 
  Zhangfei Gao 

jobs:
 build-amd64-xsm                                              pass
 build-i386-xsm                                               pass
 build-amd64                                                  pass
 build-i386                                                   pass
 build-amd64-libvirt                                          pass
 build-i386-libvirt                                           pass
 build-amd64-pvops                                            pass
 build-i386-pvops                                             pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         fail
 test-amd64-i386-xl-qemuu-ovmf-amd64                          fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 10562 lines long.)



Re: [Xen-devel] [PATCH v2 2/3] xen: add hypercall option to temporarily pin a vcpu

2016-03-02 Thread Dario Faggioli
On Wed, 2016-03-02 at 16:34 +0100, Juergen Gross wrote:
> On 02/03/16 10:27, Dario Faggioli wrote:
> > 
> > However, an xl flag is easier to add, easier to document and easier
> > and
> > more natural to find, from the point of view of an user that really
> > needs it. And perhaps it could turn out useful for other situations
> > in
> > future. So, I guess I'd say:
> >  - yes, let's add that
> >  - let's do it as a "force flag" of `xl vcpu-pin'.
> Which raises the question: how to do that on the libxl level?
> 
Ah, right.

> a) expand libxl_set_vcpuaffinity() with another parameter (is this
> even
>    possible? I could do some ifdeffery, but the API would change...)
> 
> b) add a libxl_set_vcpuaffinity_force() variant
> 
> c) imply the force flag by specifying both hard and soft maps as NULL
>    (it _is_ basically just that: keep both affinity sets), implying
> that
>    it makes no sense to specify any affinities with the -f flag
> (which
>    renders the "force" meaning rather strange, would be more a
> "restore"
>    now).
> 
Eheh, tools' maintainers' call. My preference would be b).

I don't like a), mostly because that would mean everyone will need to
specify a parameter that it is really only necessary in special cases.

I could live with c), but it indeed makes the semantic too convoluted
for my taste.

I guess, however, that even if going for b), we need to decide whether
to require a cpumask or not, and what to do if one passes NULL. Maybe
we can have a cpumask parameter and,
 - if it is not NULL, force affinity to that,
 - if it is NULL, just 'restore';
what do you think?

Actually, at Xen level, the override only acts on hard affinity...
should libxl take only one cpumask (for hard affinity only), or both
hard and soft?
I'd say just one for hard is enough, unless we want to make space for a
potential future situation where we will want to break and restore soft
affinity as well...
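
To make the proposed semantics for b) concrete, here is a toy model. All
names and types are invented for this sketch and none of them are libxl
API. It assumes only hard affinity is overridden, and that the original
affinity is saved on the first override only, which is my assumption rather
than something decided in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in, NOT a libxl type: a cpumask is a 64-bit bitmap here. */
typedef uint64_t toy_cpumask;

struct toy_vcpu {
    toy_cpumask hard;        /* current hard affinity */
    toy_cpumask saved_hard;  /* hard affinity before the override */
    bool overridden;         /* true while an override is active */
};

/*
 * Hypothetical "force" variant:
 *  - mask != NULL: remember the current hard affinity (first override
 *    only) and force the vcpu onto *mask;
 *  - mask == NULL: restore the remembered hard affinity, if any.
 * Returns 0 on success, -1 when asked to force an empty mask.
 */
static int toy_set_vcpuaffinity_force(struct toy_vcpu *v,
                                      const toy_cpumask *mask)
{
    if (mask != NULL) {
        if (*mask == 0)
            return -1;
        if (!v->overridden) {
            v->saved_hard = v->hard;
            v->overridden = true;
        }
        v->hard = *mask;
        return 0;
    }

    if (v->overridden) {
        v->hard = v->saved_hard;
        v->overridden = false;
    }
    return 0;
}
```

With a single cpumask parameter the caller never has to think about soft
affinity, which matches what the override does at the Xen level.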

Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)





Re: [Xen-devel] [PATCH v4 10/11] xen: modify page table construction

2016-03-02 Thread Daniel Kiper
On Wed, Mar 02, 2016 at 04:43:07PM +0100, Juergen Gross wrote:
> On 02/03/16 10:12, Daniel Kiper wrote:
> > On Mon, Feb 29, 2016 at 01:19:27PM +0100, Juergen Gross wrote:
> >> On 29/02/16 10:13, Juergen Gross wrote:
> >>> On 25/02/16 19:33, Andrei Borzenkov wrote:
>  22.02.2016 16:14, Juergen Gross wrote:
> > On 22/02/16 13:48, Daniel Kiper wrote:
> >> On Mon, Feb 22, 2016 at 01:30:30PM +0100, Juergen Gross wrote:
> >>> On 22/02/16 13:18, Daniel Kiper wrote:
>  On Mon, Feb 22, 2016 at 10:29:04AM +0100, Juergen Gross wrote:
> > On 22/02/16 10:17, Daniel Kiper wrote:
> >> On Mon, Feb 22, 2016 at 07:03:18AM +0100, Juergen Gross wrote:
> >>> diff --git a/grub-core/lib/xen/relocator.c 
> >>> b/grub-core/lib/xen/relocator.c
> >>> index 8f427d3..a05b253 100644
> >>> --- a/grub-core/lib/xen/relocator.c
> >>> +++ b/grub-core/lib/xen/relocator.c
> >>> @@ -29,6 +29,11 @@
> >>>
> >>>  typedef grub_addr_t grub_xen_reg_t;
> >>>
> >>> +struct grub_relocator_xen_paging_area {
> >>> +  grub_xen_reg_t start;
> >>> +  grub_xen_reg_t size;
> >>> +};
> >>> +
> >>
> >> ... this should have GRUB_PACKED because compiler may
> >> add padding to align size member.
> >
> > Why would the compiler add padding to a structure containing two 
> > items
> > of the same type? I don't think the C standard would allow this.
> >
> > grub_xen_reg_t is either unsigned (32 bit) or unsigned long (64 
> > bit).
> > There is no way this could require any padding.
> 
>  You are right but we should add this here just in case.
> >>>
> >>> Sorry, I don't think this makes any sense. The C standard is very 
> >>> clear
> >>> in this case: a type requiring a special alignment has always a length
> >>> being a multiple of that alignment. Otherwise arrays wouldn't work.
> >>
> >> Sorry, I am not sure what do you mean by that.
> >
> > The size of any C type (no matter whether it is an integral type like
> > "int" or a structure) has always the same alignment restriction as the
> > type itself. So a type requiring 8 byte alignment will always have a
> > size of a multiple of 8 bytes. This is mandatory for arrays to work, as
> > otherwise either the elements wouldn't be placed consecutively in memory
> > or the alignment restrictions wouldn't be obeyed for all elements.
> >
> 
>  I too not follow how it is relevant to this case. We talk about internal
>  padding between structure members, not between array elements.
> 
> > For our case it means that two structure elements of the same type will
> > never require a padding between them, thus the annotation with "packed"
> > can't serve any purpose.
> >
> 
>  Well, I am not aware of any requirement. Compiler may add arbitrary
>  padding between structure elements; it is only prohibited to add padding
>  at the beginning. Sure, it would be unusual, but never say "never" ...
>  also should Xen ever be ported to architecture where types are not
>  self-aligned it will become an issue.
> >>>
> >>> So you are telling me that _all_ interfaces between e.g. Linux, grub2,
> >>> Xen and all wire protocols not attributed with "packed" are just wrong?
> >>>
> >>> Sorry, I don't think this is true.
> >>
> >> Okay, just found a reference: The x86 ABI states:
> >>
> >> Aggregates and Unions
> >> -
> >> Structures and unions assume the alignment of their most strictly
> >> aligned component. Each member is assigned to the lowest available
> >> offset with the appropriate alignment. The size of any object is always
> >> a multiple of the object's alignment.
> >>
> >> I don't think any x86 C-compiler will violate the x86 ABI.
> >
> > You just cited only part of paragraph. Here is full paragraph:
> >
> > [...]
> >
> > Aggregates and Unions
> >
> > Structures and unions assume the alignment of their most strictly aligned 
> > component.
> > Each member is assigned to the lowest available offset with the appropriate
> > alignment. The size of any object is always a multiple of the object's 
> > alignment.
> > An array uses the same alignment as its elements, except that a local or 
> > global
> > array variable of length at least 16 bytes or a C99 variable-length array 
> > variable
> > always has alignment of at least 16 bytes.
> > Structure and union objects can require padding to meet size and alignment
> > constraints. The contents of any padding is undefined.
> >
> > [...]
> >
> > Well, this is a bit hard to understand, so, please look here
> > http://www.catb.org/esr/structure-packing/#_structure_alignment_and_padding
> > what can happen if struct has members with different sizes and you do
> > not use packed attribute.
> >
> > Luckily you use
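
The padding question in this thread can be checked with offsetof(). The
sketch below is only an illustration for the x86 ABIs quoted above. reg_t
is a stand-in for grub_xen_reg_t, the struct names are made up, and
__attribute__((packed)) is the GCC/Clang extension that GRUB_PACKED is
assumed to expand to.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for grub_xen_reg_t (assumption: pointer-sized unsigned). */
typedef uintptr_t reg_t;

/* Same shape as the struct under discussion: two members of one type. */
struct paging_area {
    reg_t start;
    reg_t size;
};

/* Members of different sizes: the compiler usually pads after 'tag'. */
struct mixed {
    uint8_t  tag;
    uint64_t value;
};

/* Packed variant of the same layout (GCC/Clang extension). */
struct mixed_packed {
    uint8_t  tag;
    uint64_t value;
} __attribute__((packed));

/* Bytes of padding between the two members of each struct. */
static size_t paging_area_padding(void)
{
    return offsetof(struct paging_area, size) - sizeof(reg_t);
}

static size_t mixed_padding(void)
{
    return offsetof(struct mixed, value) - sizeof(uint8_t);
}

static size_t mixed_packed_padding(void)
{
    return offsetof(struct mixed_packed, value) - sizeof(uint8_t);
}
```

On x86 the first function returns 0 with or without a packed attribute,
the second returns a non-zero value, and the third returns 0. That is the
distinction both sides of the argument are describing.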

Re: [Xen-devel] [BUG] xs.watch and xs.unwatch are unreliable

2016-03-02 Thread Wei Liu
I've CC'ed some people who might have an idea whether they are relying
on this behaviour. I doubt that but let's better be sure...

On Tue, Mar 01, 2016 at 11:17:54PM +0300, Sergei Lebedev wrote:
> Hi list,
> 
> I’ve initially wanted to report another inconsistency in ``xen.lowlevel.xs`` 
> documentation, but this time the issue is more subtle.
> 
> Both ``xs.watch`` and ``xs.unwatch`` accept two arguments: a path to watch 
> and a token. According to the documentation, the second argument must be a 
> string, which makes sense, since the token should be sent directly to 
> XenStore. Instead of doing the simple thing (transmitting the token as-is), 
> the implementation makes a new token from *the pointer* to the token object 
> and sends it instead:
> 
> PyObject *token;
> char token_str[MAX_STRLEN(unsigned long) + 1];
> 
> snprintf(token_str, sizeof(token_str), "%li", (unsigned long)token);
> 
> This does work for simple cases, e.g. if a token is a string literal 
> 
> >>> from xen.lowlevel.xs import xs
> >>> h = xs()
> >>> h.watch("@introduceDomain", "token")
> >>> h.unwatch("@introduceDomain", "token")
> 
> or a small number
> 
> >>> h.watch("@introduceDomain", 42)
> >>> h.unwatch("@introduceDomain", 42)
> 
> But in the general case this is broken
> 
> >>> h.watch("@introduceDomain", 10)
> >>> h.unwatch("@introduceDomain", 10)
> Traceback (most recent call last):
>   File "", line 1, in 
> xen.lowlevel.xs.Error: (2, 'No such file or directory')
> 
> Here’s another example with a string token
> 
> >>> token1 = str(10)
> >>> token2 = str(10)
> >>> token1 == token2
> True
> >>> h.watch("@introduceDomain", token1)
> >>> h.unwatch("@introduceDomain", token2)
> Traceback (most recent call last):
>   File "", line 1, in 
> xen.lowlevel.xs.Error: (2, 'No such file or directory')
> 
> I’m not sure what would be the best way to handle this as there might be 
> existing code relying on this undocumented behaviour. What do you think?
> 

In any case this looks like a real bug.

The fix (as you said) is to transmit the token. I don't think your
proposed fix would break existing users though.
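
To spell out why the pointer-derived token misbehaves, here is a
standalone C sketch. It does not use the Python C API and both function
names are invented for the example: one derives the wire token from the
object's address, as the quoted snippet does, and the other from the
string contents.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Token derived from the object's address (the current behaviour). */
static void token_from_pointer(const void *obj, char *out, size_t len)
{
    snprintf(out, len, "%" PRIuPTR, (uintptr_t)obj);
}

/* Token taken from the string contents (the proposed behaviour). */
static void token_from_contents(const char *s, char *out, size_t len)
{
    snprintf(out, len, "%s", s);
}
```

Two distinct objects holding the same text, such as two separate results
of str(10), yield different tokens with the first function and identical
tokens with the second. The literal examples only work because the
interpreter happens to reuse a single object for them.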

Wei.


> Regards,
> Sergei
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel



Re: [Xen-devel] [PATCH v4 13/24] arm/acpi: Map all other tables for Dom0

2016-03-02 Thread Jan Beulich
>>> On 02.03.16 at 16:00,  wrote:
> On Wed, 2 Mar 2016, Shannon Zhao wrote:
>> On 2016-03-02 01:01, Stefano Stabellini wrote:
>> > On Tue, 1 Mar 2016, Stefano Stabellini wrote:
>> >> > On Tue, 1 Mar 2016, Shannon Zhao wrote:
>> >>> > > On 2016/2/29 22:15, Stefano Stabellini wrote:
>>  > > > On Sun, 28 Feb 2016, Shannon Zhao wrote:
>> >> > > >> > From: Shannon Zhao 
>> >> > > >> > 
>> >> > > >> > Map all other tables to Dom0 using 1:1 mappings.
>> >> > > >> > 
>> >> > > >> > Signed-off-by: Shannon Zhao 
>> >> > > >> > ---
>> >> > > >> > v4: fix commit message
>> >> > > >> > ---
>> >> > > >> >  xen/arch/arm/domain_build.c | 26 ++
>> >> > > >> >  1 file changed, 26 insertions(+)
>> >> > > >> > 
>> >> > > >> > diff --git a/xen/arch/arm/domain_build.c 
>> >> > > >> > b/xen/arch/arm/domain_build.c
>> >> > > >> > index 64e48ae..6ad420c 100644
>> >> > > >> > --- a/xen/arch/arm/domain_build.c
>> >> > > >> > +++ b/xen/arch/arm/domain_build.c
>> >> > > >> > @@ -1357,6 +1357,30 @@ static int prepare_dtb(struct domain 
>> >> > > >> > *d, struct 
> kernel_info *kinfo)
>> >> > > >> >  }
>> >> > > >> >  
>> >> > > >> >  #ifdef CONFIG_ACPI
>> >> > > >> > +static void acpi_map_other_tables(struct domain *d)
>> >> > > >> > +{
>> >> > > >> > +int i;
>> >> > > >> > +unsigned long res;
>> >> > > >> > +u64 addr, size;
>> >> > > >> > +
>> >> > > >> > +/* Map all other tables to Dom0 using 1:1 mappings. */
>> >> > > >> > +for( i = 0; i < acpi_gbl_root_table_list.count; i++ )
>> >> > > >> > +{
>> >> > > >> > +addr = acpi_gbl_root_table_list.tables[i].address;
>> >> > > >> > +size = acpi_gbl_root_table_list.tables[i].length;
>> >> > > >> > +res = map_regions(d,
>> >> > > >> > +  paddr_to_pfn(addr & PAGE_MASK),
>> >> > > >> > +  DIV_ROUND_UP(size, PAGE_SIZE),
>> >> > > >> > +  paddr_to_pfn(addr & PAGE_MASK));
>> >> > > >> > +if ( res )
>> >> > > >> > +{
>> >> > > >> > + panic(XENLOG_ERR "Unable to map 0x%"PRIx64
>> >> > > >> > +   " - 0x%"PRIx64" in domain \n",
>> >> > > >> > +   addr & PAGE_MASK, PAGE_ALIGN(addr + 
>> >> > > >> > size) - 1);
>> >> > > >> > +}
>> >> > > >> > +}
>> >> > > >> > +}
>>  > > > The problem with this function is that it is mapping all other 
>>  > > > tables to
>>  > > > the guest, including the unmodified FADT and MADT. This way dom0 
>>  > > > is
>>  > > > going to find two different versions of the FADT and MADT, isn't 
>>  > > > that
>>  > > > right?
>>  > > >  
>> >>> > > We've replaced the entries of XSDT table with new value. That means 
>> >>> > > XSDT
>> >>> > > points to new table. Guest will not see the old ones.
>> >> > 
>> >> > All right. Of course it would be best to avoid mapping the original FADT
>> >> > and MADT at all, but given that they are not likely to be page aligned,
>> >> > it wouldn't be easy to do.
>> >> > 
>> >> > Reviewed-by: Stefano Stabellini 
>> >
>> > However I have one more question: given that map_regions maps the memory
>> > read-only to Dom0, isn't it possible that one or more of the DSDT
>> > functions could not work as expected? I would imagine that the ACPI
>> > bytecode is allowed to change its own memory, right?
>> > 
>> I'm not sure about this. But it seems that Xen or Linux always map these
>> tables to its memory.
> 
> It's not mapping pages in general the problem. The potential issue comes
> from the pages being mapped read-only. If an AML function in the DSDT
> needs to write something to memory, I imagine that the function would
> fail when called from Dom0.
> 
> I think we need to map them read-write, which is safe, even for the
> original FADT and MADT, because by the time Dom0 gets to see them, Xen
> won't parse them anymore (Xen completes parsing ACPI tables, before
> booting Dom0).
> 
> So this patch is fine, but
> http://marc.info/?l=xen-devel&m=145665887528175 needs to be changed to
> use p2m_access_rw instead of p2m_access_r.

Yes, I agree, r/w mappings ought to be fine here as long as only
Dom0 gets them.

Jan



[Xen-devel] [PATCH] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNKNOWN

2016-03-02 Thread Roger Pau Monne
And use it as the default value for the VGA kind. This allows libxl to set
it to the default value later on when the domain type is known. For HVM
guests the default value is LIBXL_VGA_INTERFACE_TYPE_CIRRUS while for
HVMlite the default value is LIBXL_VGA_INTERFACE_TYPE_NONE.

Signed-off-by: Roger Pau Monné 
---
Cc: Ian Jackson 
Cc: Ian Campbell 
Cc: Wei Liu 
---
Changes since v4:
 - Return an error when trying to use a VGA card without a device model.
 - Drop Wei's Ack due to the above change.

Changes since v3:
 - s/UNDEF/UNKNOWN/.
 - Add a LIBXL_HAVE_VGA_INTERFACE_TYPE_UNKNOWN.
---
 tools/libxl/libxl.h | 10 ++
 tools/libxl/libxl_create.c  | 13 +++--
 tools/libxl/libxl_dm.c  |  6 ++
 tools/libxl/libxl_types.idl |  3 ++-
 4 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index f9e3ef5..584f8ec 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -895,6 +895,16 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
 ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
 #endif
 
+/*
+ * LIBXL_HAVE_VGA_INTERFACE_TYPE_UNKNOWN
+ *
+ * In the case that LIBXL_HAVE_VGA_INTERFACE_TYPE_UNKNOWN is set the
+ * libxl_vga_interface_type enumeration type contains a
+ * LIBXL_VGA_INTERFACE_TYPE_UNKNOWN identifier. This is used to signal
+ * that a libxl_vga_interface_type type has not been initialized yet.
+ */
+#define LIBXL_HAVE_VGA_INTERFACE_TYPE_UNKNOWN 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index f1028bc..9fdf29c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -222,8 +222,12 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 if (b_info->u.hvm.mmio_hole_memkb == LIBXL_MEMKB_DEFAULT)
 b_info->u.hvm.mmio_hole_memkb = 0;
 
-if (!b_info->u.hvm.vga.kind)
-b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
+if (b_info->u.hvm.vga.kind == LIBXL_VGA_INTERFACE_TYPE_UNKNOWN) {
+if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)
+b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
+else
+b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
+}
 
 if (!b_info->u.hvm.hdtype)
 b_info->u.hvm.hdtype = LIBXL_HDTYPE_IDE;
@@ -257,6 +261,11 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 }
 break;
 case LIBXL_DEVICE_MODEL_VERSION_NONE:
+if (b_info->u.hvm.vga.kind != LIBXL_VGA_INTERFACE_TYPE_NONE) {
+LOG(ERROR,
+"guests without a device model cannot have an emulated video card");
+return ERROR_INVAL;
+}
 b_info->video_memkb = 0;
 break;
 case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 4aca38e..5e59199 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -531,6 +531,9 @@ static int libxl__build_device_model_args_old(libxl__gc *gc,
 break;
 case LIBXL_VGA_INTERFACE_TYPE_QXL:
 break;
+default:
+LOG(ERROR, "Invalid emulated video card specified");
+return ERROR_INVAL;
 }
 
 if (b_info->u.hvm.boot) {
@@ -970,6 +973,9 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
 GCSPRINTF("qxl-vga,vram_size_mb=%"PRIu64",ram_size_mb=%"PRIu64,
 (b_info->video_memkb/2/1024), (b_info->video_memkb/2/1024) ) );
 break;
+default:
+LOG(ERROR, "Invalid emulated video card specified");
+return ERROR_INVAL;
 }
 
 if (b_info->u.hvm.boot) {
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 632c009..67bbd86 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -204,11 +204,12 @@ libxl_shutdown_reason = Enumeration("shutdown_reason", [
 ], init_val = "LIBXL_SHUTDOWN_REASON_UNKNOWN")
 
 libxl_vga_interface_type = Enumeration("vga_interface_type", [
+(0, "UNKNOWN"),
 (1, "CIRRUS"),
 (2, "STD"),
 (3, "NONE"),
 (4, "QXL"),
-], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS")
+], init_val = "LIBXL_VGA_INTERFACE_TYPE_UNKNOWN")
 
 libxl_vendor_device = Enumeration("vendor_device", [
 (0, "NONE"),
-- 
2.5.4 (Apple Git-61)
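
The defaulting rule described in the commit message can be sketched outside of libxl as follows. This is only an illustration under stated assumptions: the enum names (vga_kind, dm_version) and the helper resolve_vga_kind() are hypothetical stand-ins, not libxl identifiers; the numeric values mirror the libxl_types.idl hunk above, and -1 stands in for ERROR_INVAL.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins mirroring the values in the libxl_types.idl
 * hunk; real code would include <libxl.h> instead. */
enum vga_kind {
    VGA_UNKNOWN = 0,
    VGA_CIRRUS  = 1,
    VGA_STD     = 2,
    VGA_NONE    = 3,
    VGA_QXL     = 4,
};

/* Hypothetical, reduced device model version enum. */
enum dm_version { DM_NONE, DM_QEMU_XEN };

/* Resolve the VGA kind once the device model is known.
 * Returns 0 on success, -1 if an emulated card is requested without
 * a device model (the patch returns ERROR_INVAL in that case). */
static int resolve_vga_kind(enum dm_version dm, enum vga_kind *kind)
{
    /* An uninitialized kind gets a default that depends on the guest. */
    if (*kind == VGA_UNKNOWN)
        *kind = (dm == DM_NONE) ? VGA_NONE : VGA_CIRRUS;

    if (dm == DM_NONE && *kind != VGA_NONE) {
        fprintf(stderr, "guests without a device model cannot have "
                        "an emulated video card\n");
        return -1;
    }
    return 0;
}
```

The point of the UNKNOWN value is visible here: with CIRRUS as the initial value there would be no way to tell "the user asked for Cirrus" apart from "nothing was set", so a guest without a device model could not be given NONE by default.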


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 10/11] xen: modify page table construction

2016-03-02 Thread Juergen Gross
On 02/03/16 10:12, Daniel Kiper wrote:
> On Mon, Feb 29, 2016 at 01:19:27PM +0100, Juergen Gross wrote:
>> On 29/02/16 10:13, Juergen Gross wrote:
>>> On 25/02/16 19:33, Andrei Borzenkov wrote:
 22.02.2016 16:14, Juergen Gross wrote:
> On 22/02/16 13:48, Daniel Kiper wrote:
>> On Mon, Feb 22, 2016 at 01:30:30PM +0100, Juergen Gross wrote:
>>> On 22/02/16 13:18, Daniel Kiper wrote:
 On Mon, Feb 22, 2016 at 10:29:04AM +0100, Juergen Gross wrote:
> On 22/02/16 10:17, Daniel Kiper wrote:
>> On Mon, Feb 22, 2016 at 07:03:18AM +0100, Juergen Gross wrote:
>>> diff --git a/grub-core/lib/xen/relocator.c 
>>> b/grub-core/lib/xen/relocator.c
>>> index 8f427d3..a05b253 100644
>>> --- a/grub-core/lib/xen/relocator.c
>>> +++ b/grub-core/lib/xen/relocator.c
>>> @@ -29,6 +29,11 @@
>>>
>>>  typedef grub_addr_t grub_xen_reg_t;
>>>
>>> +struct grub_relocator_xen_paging_area {
>>> +  grub_xen_reg_t start;
>>> +  grub_xen_reg_t size;
>>> +};
>>> +
>>
>> ... this should have GRUB_PACKED because compiler may
>> add padding to align size member.
>
> Why would the compiler add padding to a structure containing two items
> of the same type? I don't think the C standard would allow this.
>
> grub_xen_reg_t is either unsigned (32 bit) or unsigned long (64 bit).
> There is no way this could require any padding.

 You are right but we should add this here just in case.
>>>
>>> Sorry, I don't think this makes any sense. The C standard is very clear
>>> in this case: a type requiring a special alignment has always a length
>>> being a multiple of that alignment. Otherwise arrays wouldn't work.
>>
>> Sorry, I am not sure what you mean by that.
>
> The size of any C type (no matter whether it is an integral type like
> "int" or a structure) has always the same alignment restriction as the
> type itself. So a type requiring 8 byte alignment will always have a
> size of a multiple of 8 bytes. This is mandatory for arrays to work, as
> otherwise either the elements wouldn't be placed consecutively in memory
> or the alignment restrictions wouldn't be obeyed for all elements.
>

 I too do not follow how it is relevant to this case. We are talking about
 internal padding between structure members, not between array elements.

> For our case it means that two structure elements of the same type will
> never require a padding between them, thus the annotation with "packed"
> can't serve any purpose.
>

 Well, I am not aware of any such requirement. The compiler may add arbitrary
 padding between structure elements; it is only prohibited from adding padding
 at the beginning. Sure, it would be unusual, but never say "never" ...
 Also, should Xen ever be ported to an architecture where types are not
 self-aligned, it will become an issue.
>>>
>>> So you are telling me that _all_ interfaces between e.g. Linux, grub2,
>>> Xen and all wire protocols not attributed with "packed" are just wrong?
>>>
>>> Sorry, I don't think this is true.
>>
>> Okay, just found a reference: The x86 ABI states:
>>
>> Aggregates and Unions
>> -
>> Structures and unions assume the alignment of their most strictly
>> aligned component. Each member is assigned to the lowest available
>> offset with the appropriate alignment. The size of any object is always
>> a multiple of the object's alignment.
>>
>> I don't think any x86 C-compiler will violate the x86 ABI.
> 
> You cited only part of the paragraph. Here is the full paragraph:
> 
> [...]
> 
> Aggregates and Unions
> 
> Structures and unions assume the alignment of their most strictly aligned 
> component.
> Each member is assigned to the lowest available offset with the appropriate
> alignment. The size of any object is always a multiple of the object's
> alignment.
> An array uses the same alignment as its elements, except that a local or 
> global
> array variable of length at least 16 bytes or a C99 variable-length array 
> variable
> always has alignment of at least 16 bytes.
> Structure and union objects can require padding to meet size and alignment
> constraints. The contents of any padding is undefined.
> 
> [...]
> 
> Well, this is a bit hard to understand, so please look here:
> http://www.catb.org/esr/structure-packing/#_structure_alignment_and_padding
> It shows what can happen if a struct has members of different sizes and you
> do not use the packed attribute.
> 
> Luckily you use struct members with the same sizes, so, everything works.

This wasn't luck, it was on purpose. ;-)
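
The layout question above can be checked at compile time instead of argued. The following is a minimal sketch, assuming a 64-bit build where the register type is 64 bits wide; reg_t, paging_area and paging_area_mixed are hypothetical stand-ins and not the grub definitions. It demonstrates what common ABIs do; it does not settle what the C standard itself guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for grub_xen_reg_t on a 64-bit build (an assumption). */
typedef uint64_t reg_t;

/* Two members of the same type, as in the patch under discussion. */
struct paging_area {
    reg_t start;
    reg_t size;
};

/* Hypothetical variant with a narrower member in front: on typical
 * ABIs the compiler pads after 'flags' so 'start' stays aligned. */
struct paging_area_mixed {
    uint8_t flags;
    reg_t   start;
    reg_t   size;
};

/* The build fails if the compiler ever pads the same-type struct. */
_Static_assert(offsetof(struct paging_area, size) == sizeof(reg_t),
               "same-type members are laid out back to back");
_Static_assert(sizeof(struct paging_area) == 2 * sizeof(reg_t),
               "no tail padding");

/* Number of padding bytes in the mixed struct (ABI dependent). */
static size_t mixed_padding_bytes(void)
{
    return sizeof(struct paging_area_mixed)
           - (sizeof(uint8_t) + 2 * sizeof(reg_t));
}
```

A static assertion of this kind is an alternative to a packed attribute when the layout is part of an interface: it costs nothing at run time and breaks the build as soon as somebody adds a member that introduces padding.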

> However, if you/somebody will try to change grub_relocator_xen_paging_area
> layout and add a member with different size in the middle or the beginning
> of struct then suddenly everything will stop workin

Re: [Xen-devel] [PATCH v2 2/3] xen: add hypercall option to temporarily pin a vcpu

2016-03-02 Thread Juergen Gross
On 02/03/16 10:27, Dario Faggioli wrote:
> On Wed, 2016-03-02 at 08:14 +0100, Juergen Gross wrote:
>> On 01/03/16 16:52, George Dunlap wrote:
>>>  
>>>
>>> Also -- have you actually tested the "cpupool move while pinned"
>>> functionality to make sure it actually works?  There's a weird bit
>>> in
>>> cpupool_unassign_cpu_helper() where after calling
>>> cpu_disable_scheduler(cpu), it unconditionally sets the cpu bit in
>>> the
>>> cpupool_free_cpus mask, even if it returns an error.  That can't be
>>> right, even for the existing -EAGAIN case, can it?
>> That should be no problem. Such a failure can be repaired easily by
>> adding the cpu to the cpupool again. 
>>
> And there's not much else one can do, I would say. When we are in
> cpu_disable_scheduler(), coming from
> cpupool_unassign_cpu()-->cpupool_unassign_cpu_helper() we're already halfway
> through removing the cpu from the pool (e.g., we already cleared the
> relevant bit from the cpupool's cpu_valid mask).
> 
> And we don't actually want to revert that, as doing so would allow the
> scheduler to start again moving vcpus to that cpu (and the following
> attempts will risk failing with EAGAIN again :-D).
> 
> FWIW, I've also found that part rather weird for quite some time... But
> it does indeed make sense, IMO.
> 
>> Adding a comment seems to be a
>> good idea. :-)
>>
> Yep. Should we also add an error message for the user to be able to see
> it, even if she can't read the comment in the source code? (Not
> necessarily right there, if that would make it trigger too much... just
> in a place where it can be seen in the case the user actually needs to
> do something).
> 
>> What is wrong and even worse, schedule_cpu_switch() returning an
>> error
>> will leak domlist_read_lock. 
>>
> Indeed, good catch. :-)
> 
>>> And, in general, what happens if the device driver gets mixed up
>>> and
>>> forgets to unpin the vcpu?  Is the only recourse to reboot your
>>> host (or
>>> deal with the fact that you can't reconfigure your cpupools)?
>> Unless we add a "forced" option to "xl vcpu-pin", yes.
>>
> Which would be fine to have, IMO. I'm not sure if it would better be an
> `xl vcpu-pin' flag, or a separate utility (as Jan is also saying).
> 
> A separate utility would fit better the "emergency nature" of the
> thing, avoiding having to clobber xl for that (as this will be the
> only, pretty uncommon, case where such flag would be needed).
> 
> However, an xl flag is easier to add, easier to document and easier and
> more natural to find, from the point of view of a user that really
> needs it. And perhaps it could turn out useful for other situations in
> future. So, I guess I'd say:
>  - yes, let's add that
>  - let's do it as a "force flag" of `xl vcpu-pin'.

Which raises the question: how to do that on the libxl level?

a) expand libxl_set_vcpuaffinity() with another parameter (is this even
   possible? I could do some ifdeffery, but the API would change...)

b) add a libxl_set_vcpuaffinity_force() variant

c) imply the force flag by specifying both hard and soft maps as NULL
   (it _is_ basically just that: keep both affinity sets), implying that
   it makes no sense to specify any affinities with the -f flag (which
   renders the "force" meaning rather strange, would be more a "restore"
   now).
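
To make option c) concrete, here is a rough sketch of its semantics. Everything in it is hypothetical: cpumap, struct vcpu_affinity and set_vcpuaffinity() are illustration-only names, not the libxl types or the real libxl_set_vcpuaffinity() signature, and rejecting a normal update while a temporary pin is active is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU bitmap; not the libxl type. */
typedef struct { uint64_t bits; } cpumap;

/* Hypothetical per-vcpu state for this sketch. */
struct vcpu_affinity {
    cpumap hard;
    cpumap soft;
    int    temp_pinned;   /* set while a temporary pin is active */
};

/* Option c): NULL for both maps means "keep the stored affinities
 * and only drop the temporary pin". Returns 0 on success, -1 if a
 * normal update is attempted while the vcpu is temporarily pinned
 * (assumed behaviour). */
static int set_vcpuaffinity(struct vcpu_affinity *v,
                            const cpumap *hard, const cpumap *soft)
{
    if (!hard && !soft) {
        v->temp_pinned = 0;   /* "restore" rather than "force" */
        return 0;
    }
    if (v->temp_pinned)
        return -1;
    if (hard)
        v->hard = *hard;
    if (soft)
        v->soft = *soft;
    return 0;
}
```

The sketch shows why the flag would read more like "restore" than "force": the call carries no new affinity at all, it only clears the temporary state. Options a) and b) would instead keep the maps and add an explicit flag or a second entry point.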


Juergen

> 
> Regards,
> Dario
> 




Re: [Xen-devel] [PATCH] Remus: update email address in MAINTAINERS file

2016-03-02 Thread Ian Jackson
Yang Hongyang writes ("[PATCH] Remus: update email address in MAINTAINERS 
file"):
> From: Yang Hongyang 
...
>  REMUS
>  M:   Shriram Rajagopalan 
> -M:   Yang Hongyang 
> +M:   Yang Hongyang 
>  S:   Maintained

Committed-by: Ian Jackson 

Thanks.  I guess you intend to continue as maintainer of this code
then ?

Ian.



[Xen-devel] [qemu-mainline test] 84935: tolerable FAIL - PUSHED

2016-03-02 Thread osstest service owner
flight 84935 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/84935/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 84523
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 84523

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd       11 guest-start               fail   never pass
 test-amd64-amd64-xl-pvh-intel     11 guest-start               fail   never pass
 test-armhf-armhf-libvirt          14 guest-saverestore         fail   never pass
 test-armhf-armhf-libvirt          12 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-xsm      12 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-xsm      14 guest-saverestore         fail   never pass
 test-armhf-armhf-libvirt-qcow2    11 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-qcow2    13 guest-saverestore         fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail   never pass
 test-amd64-i386-libvirt-xsm       12 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt           12 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-xsm      12 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt          12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-credit2       13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2       12 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-vhd      11 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-multivcpu     13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu     12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-arndale       12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-arndale       13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck    12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-cubietruck    13 saverestore-support-check fail   never pass
 test-armhf-armhf-libvirt-raw      13 guest-saverestore         fail   never pass
 test-armhf-armhf-libvirt-raw      11 migrate-support-check     fail   never pass
 test-armhf-armhf-xl               12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl               13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm           13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm           12 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check    fail   never pass
 test-armhf-armhf-xl-vhd           11 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-vhd           12 saverestore-support-check fail   never pass

version targeted for testing:
 qemuu                9c279bec754a84c790b70674a5a224379c8dcda2
baseline version:
 qemuu                35227e6a09a274b1496bfe16cbe2008e85fbeb5a

Last test of basis    84523  2016-02-29 11:54:01 Z    2 days
Failing since         84614  2016-03-01 04:17:02 Z    1 days    2 attempts
Testing same since    84935  2016-03-01 14:15:01 Z    1 days    1 attempts


People who touched revisions under test:
  Cornelia Huck 
  Daniel P. Berrange 
  David Hildenbrand 
  Eduardo Habkost 
  Fam Zheng 
  Gerd Hoffmann 
  Hitoshi Mitake 
  Jeff Cody 
  John Snow 
  Laszlo Ersek 
  Max Reitz 
  Michal Privoznik 
  Paolo Bonzini 
  Peter Lieven 
  Peter Maydell 
  Sascha Silbe 
  Thomas Huth 
  Vasiliy Tolstov 
  Wei Yang 
  Yi Min Zhao 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-x

Re: [Xen-devel] [PATCH v9 1/6] libxl: Export libxl__device_nextid for internal use

2016-03-02 Thread Wei Liu
On Tue, Feb 23, 2016 at 11:26:56AM +, Olaf Hering wrote:
> Signed-off-by: Olaf Hering 
> Cc: Ian Jackson 
> Cc: Stefano Stabellini 
> Cc: Ian Campbell 
> Cc: Wei Liu 

Assuming this is going to be used in later patches:

Acked-by: Wei Liu 

> ---
>  tools/libxl/libxl.c  | 2 +-
>  tools/libxl/libxl_internal.h | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 2bde0f5..0b16618 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -2031,7 +2031,7 @@ out:
>  }
>  
>  /* common function to get next device id */
> -static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
> +int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
>  {
>  char *dompath, **l;
>  unsigned int nb;
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index b194e65..46f3e3e 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -1178,6 +1178,7 @@ _hidden int libxl__init_console_from_channel(libxl__gc *gc,
>   libxl__device_console *console,
>   int dev_num,
>   libxl_device_channel *channel);
> +_hidden int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device);
>  
>  /*
>   * For each aggregate type which can be used as an input we provide:



Re: [Xen-devel] Xen 4.7 Development Update

2016-03-02 Thread Wei Liu
On Wed, Mar 02, 2016 at 02:43:57PM +, George Dunlap wrote:
> On Wed, Mar 2, 2016 at 1:32 PM, Jan Beulich  wrote:
>  On 02.03.16 at 12:38,  wrote:
> >> On Mon, Feb 29, 2016 at 11:17 AM, Wei Liu  wrote:
> >>> *  Improve ioreq server performance
> >>>   -  Yu Zhang
> >>>   -  Paul Durrant
> >>
> >> If this means "use RB trees for rangesets", I think this is already in.
> >
> > No, it's not. There was no point in committing that one without
> > the patch actually needing it.
> 
> Oh, right -- my mistake then.
> 
> But my real purpose in commenting was that "improve ioreq server
> performance" isn't really an accurate description; a better one might
> be, "allow ioreq server interface to support XenGT", which is what is
> really wanted.
> 

I will update that item to the new name. 

Wei.

>  -George



Re: [Xen-devel] [PATCH v10 24/31] Support colo mode for qemu disk

2016-03-02 Thread Wei Liu
On Mon, Feb 22, 2016 at 10:52:28AM +0800, Wen Congyang wrote:
> Usage: disk = 
> ['...,colo,colo-host=xxx,colo-port=xxx,colo-export=xxx,active-disk=xxx,hidden-disk=xxx...']
> For QEMU block replication details:
> http://wiki.qemu.org/Features/BlockReplication
> 
> Signed-off-by: Wen Congyang 
> Signed-off-by: Yang Hongyang 
> ---
>  docs/man/xl.pod.1   |   2 +-
>  docs/misc/xl-disk-configuration.txt |  50 ++
>  tools/libxl/libxl.c |  62 +++-
>  tools/libxl/libxl_create.c  |  25 -
>  tools/libxl/libxl_device.c  |  54 +++
>  tools/libxl/libxl_dm.c  | 184 ++--
>  tools/libxl/libxl_types.idl |   7 ++
>  tools/libxl/libxlu_disk_l.l |   7 ++
>  8 files changed, 382 insertions(+), 9 deletions(-)
> 
> diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
> index 1c6dd87..4f1901d 100644
> --- a/docs/man/xl.pod.1
> +++ b/docs/man/xl.pod.1
> @@ -454,7 +454,7 @@ N.B: Remus support in xl is still in experimental 
> (proof-of-concept) phase.
>   Disk replication support is limited to DRBD disks.
>  
>   COLO support in xl is still in experimental (proof-of-concept) phase.
> - There is no support for network or disk at the moment.
> + There is no support for network at the moment.

You need some documentation here for the syntax; otherwise users have no clue
how to configure disk replication support. I also won't be able to
meaningfully review this patch without a reference.

>  
>  B
>  
> diff --git a/docs/misc/xl-disk-configuration.txt 
> b/docs/misc/xl-disk-configuration.txt
> index 29f6ddb..6f23c2d 100644
> --- a/docs/misc/xl-disk-configuration.txt
> +++ b/docs/misc/xl-disk-configuration.txt
> @@ -234,6 +234,56 @@ were intentionally created non-sparse to avoid 
> fragmentation of the
>  file.
>  
>  

Some nitpicking about the format below.

> +===
> +COLO PARAMETERS
> +===
> +
> +
> +colo
> +
> +
> +Enable COLO HA for disk. For better understanding block replication on
> +QEMU, please refer to:
> +http://wiki.qemu.org/Features/BlockReplication
> +
> +
> +colo-host
> +-

Blank line here please.

> +Description:   Secondary host's address
> +Mandatory: Yes when COLO enabled
> +
> +
> +colo-port
> +-

Ditto.

> +Description:   Secondary port
> +   We will run a nbd server on secondary host,
> +   and the nbd server will listen this port.
> +Mandatory: Yes when COLO enabled
> +
> +
> +colo-export
> +-

Here as well. And some more "-"s to match "colo-export".

> +Description:   We will run a nbd server on secondary host,
> +   exportname is the nbd server's disk export name.
> +Mandatory: Yes when COLO enabled
> +
> +
> +active-disk
> +---
> +
> +Description:   This is used by secondary. Secondary guest's write
> +   will be buffered in this disk.
> +Mandatory: Yes when COLO enabled
> +
> +
> +hidden-disk
> +---
> +
> +Description:   This is used by secondary. It buffers the original
> +   content that is modified by the primary VM.
> +Mandatory: Yes when COLO enabled
> +
> +

The rest of the patch is mainly for manipulating QEMU parameters. I've
skipped it for now.

>  
>  DEPRECATED PARAMETERS, PREFIXES AND SYNTAXES
>  
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 12df81a..f691628 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -2309,6 +2309,8 @@ int libxl__device_disk_setdefault(libxl__gc *gc, 
> libxl_device_disk *disk)
>  int rc;
>  
>  libxl_defbool_setdefault(&disk->discard_enable, !!disk->readwrite);
> +libxl_defbool_setdefault(&disk->colo_enable, false);
> +libxl_defbool_setdefault(&disk->colo_restore_enable, false);
>  
>  rc = libxl__resolve_domid(gc, disk->backend_domname, 
> &disk->backend_domid);
>  if (rc < 0) return rc;
> @@ -2507,6 +2509,18 @@ static void device_disk_add(libxl__egc *egc, uint32_t 
> domid,
>  flexarray_append(back, "params");
>  flexarray_append(back, GCSPRINTF("%s:%s",
>
> libxl__device_disk_string_of_format(disk->format), disk->pdev_path));
> +if (libxl_defbool_val(disk->colo_enable)) {
> +flexarray_append(back, "colo-host");
> +flexarray_append(back, libxl__sprintf(gc, "%s", 
> disk->colo_host));
> +flexarray_append(back, "colo-port");
> +flexarray_append(back, libxl__sprintf(gc, "%s", 
> disk->colo_port));
> +flexarray_append(back, "colo-export");
> +flexarray_append(back, libxl__sprintf(gc, "%s", 
> disk->colo_export));
> +

Re: [Xen-devel] [PATCH v10 31/31] cmdline switches and config vars to control colo-proxy

2016-03-02 Thread Wei Liu
On Mon, Feb 22, 2016 at 10:52:35AM +0800, Wen Congyang wrote:
> Add cmdline switches to 'xl migrate-receive' command to specify
> a domain-specific hotplug script to setup COLO proxy.
> 
> Add a new config var 'colo.default.agentscript' to xl.conf, that
> allows the user to override the default global script used to
> setup COLO proxy.
> 
> Signed-off-by: Yang Hongyang 
> Signed-off-by: Wen Congyang 
> ---
>  docs/man/xl.conf.pod.5  |  6 ++
>  docs/man/xl.pod.1   |  1 -
>  tools/libxl/libxl.c |  6 ++
>  tools/libxl/libxl_create.c  | 14 --
>  tools/libxl/libxl_types.idl |  1 +
>  tools/libxl/xl.c|  3 +++
>  tools/libxl/xl.h|  1 +
>  tools/libxl/xl_cmdimpl.c| 47 
> ++---
>  8 files changed, 65 insertions(+), 14 deletions(-)
> 
> diff --git a/docs/man/xl.conf.pod.5 b/docs/man/xl.conf.pod.5
> index 8ae19bb..8f7fd28 100644
> --- a/docs/man/xl.conf.pod.5
> +++ b/docs/man/xl.conf.pod.5
> @@ -111,6 +111,12 @@ Configures the default script used by Remus to setup 
> network buffering.
>  
>  Default: C
>  
> +=item B
> +
> +Configures the default script used by COLO to setup colo-proxy.
> +
> +Default: C
> +
>  =item B
>  
>  Configures the default output format used by xl when printing "machine
> diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
> index 4f1901d..edeafcf 100644
> --- a/docs/man/xl.pod.1
> +++ b/docs/man/xl.pod.1
> @@ -454,7 +454,6 @@ N.B: Remus support in xl is still in experimental 
> (proof-of-concept) phase.
>   Disk replication support is limited to DRBD disks.
>  
>   COLO support in xl is still in experimental (proof-of-concept) phase.
> - There is no support for network at the moment.


Same here: documentation on how to use the new parameters (if any) is
missing. Please provide adequate documentation; otherwise we can't
meaningfully review the rest of this patch.

Wei.



Re: [Xen-devel] [PATCH v10 28/31] COLO nic: implement COLO nic subkind

2016-03-02 Thread Wei Liu
On Mon, Feb 22, 2016 at 10:52:32AM +0800, Wen Congyang wrote:
> implement COLO nic subkind.
> 
> Signed-off-by: Yang Hongyang 
> Signed-off-by: Wen Congyang 
> ---
>  tools/hotplug/Linux/Makefile |   1 +
>  tools/hotplug/Linux/colo-proxy-setup | 135 +++
>  tools/libxl/Makefile |   1 +
>  tools/libxl/libxl_colo_nic.c | 321 
> +++
>  tools/libxl/libxl_internal.h |   5 +
>  tools/libxl/libxl_types.idl  |   1 +

I skipped this patch because it looks mostly internal to COLO.

Wei.



Re: [Xen-devel] [PATCH v10 26/31] COLO proxy: implement setup/teardown of COLO proxy module

2016-03-02 Thread Wei Liu
On Mon, Feb 22, 2016 at 10:52:30AM +0800, Wen Congyang wrote:
> setup/teardown of COLO proxy module.
> we use netlink to communicate with proxy module.
> About colo-proxy module:
> https://lkml.org/lkml/2015/6/18/32
> How to use:
> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> 
> Signed-off-by: Yang Hongyang 
> Signed-off-by: Wen Congyang 

I'm tempted to just ack this patch as well.

Wei.



Re: [Xen-devel] [PATCH v10 25/31] COLO: use qemu block replication

2016-03-02 Thread Wei Liu
On Mon, Feb 22, 2016 at 10:52:29AM +0800, Wen Congyang wrote:
> Use qemu block replication as our block replication solution.
> Note that guest must be paused before starting COLO, otherwise,
> the disk won't be consistent between primary and secondary.
> 
> Signed-off-by: Wen Congyang 
> Signed-off-by: Yang Hongyang 
> ---
>  tools/libxl/Makefile |   1 +
>  tools/libxl/libxl_colo_qdisk.c   | 226 
> +++
>  tools/libxl/libxl_colo_restore.c |  42 +++-
>  tools/libxl/libxl_colo_save.c|  54 +-
>  tools/libxl/libxl_internal.h |  13 +++

All the changes look internal to COLO and are trying to manipulate QEMU
or whatnot. I don't think I can provide very detailed feedback. I am
tempted to just ack this patch.

Wei.



Re: [Xen-devel] [PATCH v10 27/31] COLO proxy: preresume, postresume and checkpoint

2016-03-02 Thread Wei Liu
On Mon, Feb 22, 2016 at 10:52:31AM +0800, Wen Congyang wrote:
> preresume, postresume and checkpoint
> 
> Signed-off-by: Yang Hongyang 
> Signed-off-by: Wen Congyang 

Same as last patch...

Wei.



Re: [Xen-devel] [PATCH v10 23/31] COLO: introduce new API to prepare/start/do/get_error/stop replication

2016-03-02 Thread Wei Liu
On Mon, Feb 22, 2016 at 10:52:27AM +0800, Wen Congyang wrote:
> We will use qemu block replication, and qemu provides some qmp commands
> to prepare replication, start replication, get replication error, and
> stop replication. Introduce new API to execute these qmp commands.
> 
> Signed-off-by: Wen Congyang 

Acked-by: Wei Liu 



Re: [Xen-devel] [PATCH v10 22/31] implement the cmdline for COLO

2016-03-02 Thread Wei Liu
On Mon, Feb 22, 2016 at 10:52:26AM +0800, Wen Congyang wrote:
[...]
> +if (libxl_defbool_val(info->colo)) {
> +if (libxl_defbool_val(info->compression)) {

This can be simplified as

   if (libxl_defbool_val(xxx) && libxl_defbool_val(yyy))
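
Spelled out, the suggested simplification folds the two nested tests into one condition. A minimal sketch, with plain bools standing in for the two libxl_defbool_val() results; struct remus_opts and check_colo_opts() are hypothetical names, and -1 stands in for the ERROR_FAIL path of the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two libxl_defbool_val() results. */
struct remus_opts {
    bool colo;
    bool compression;
};

/* Returns 0 if the options are compatible, -1 otherwise
 * (the patch sets rc = ERROR_FAIL and jumps to 'out'). */
static int check_colo_opts(const struct remus_opts *o)
{
    /* One combined condition instead of two nested ifs. */
    if (o->colo && o->compression)
        return -1;   /* checkpoint compression is unusable in COLO mode */
    return 0;
}
```

Because && short-circuits, the second value is only evaluated when the first is true, so the behaviour is the same as with the nested form.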

> +LOG(ERROR, "cannot use memory checkpoint compression in COLO 
> mode");
> +rc = ERROR_FAIL;
> +goto out;
> +}
> +}
> +
>  if (!libxl_defbool_val(info->allow_unsafe) &&
>  (libxl_defbool_val(info->blackhole) ||
>   !libxl_defbool_val(info->netbuf) ||
> @@ -876,7 +892,10 @@ int libxl_domain_remus_start(libxl_ctx *ctx, 
> libxl_domain_remus_info *info,
>  dss->live = 1;
>  dss->debug = 0;
>  dss->remus = info;
> -dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_REMUS;
> +if (libxl_defbool_val(info->colo))
> +dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_COLO;
> +else
> +dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_REMUS;
>  
>  assert(info);
>  
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index df7268b..0dc7220 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -4440,6 +4440,8 @@ static void migrate_receive(int debug, int daemonize, 
> int monitor,
>  char rc_buf;
>  char *migration_domname;
>  struct domain_create dom_info;
> +const char *ha = checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO ?
> + "COLO" : "Remus";
>  
>  signal(SIGPIPE, SIG_IGN);
>  /* if we get SIGPIPE we'd rather just have it as an error */
> @@ -4460,6 +4462,9 @@ static void migrate_receive(int debug, int daemonize, 
> int monitor,
>  dom_info.send_back_fd = send_fd;
>  dom_info.migration_domname_r = &migration_domname;
>  dom_info.checkpointed_stream = checkpointed;
> +if (checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO)
> +/* COLO uses stdout to send control message to master */
> +dom_info.quiet = 1;
>  

It seems that dom_info->quiet affects stderr, not stdout. See the only
place that checks this in xl_cmdimpl.c.

>  rc = create_domain(&dom_info);
>  if (rc < 0) {
> @@ -4472,11 +4477,12 @@ static void migrate_receive(int debug, int daemonize, 
> int monitor,
>  
>  switch (checkpointed) {
>  case LIBXL_CHECKPOINTED_STREAM_REMUS:
> +case LIBXL_CHECKPOINTED_STREAM_COLO:
>  /* If we are here, it means that the sender (primary) has crashed.
>   * TODO: Split-Brain Check.
>   */
> -fprintf(stderr, "migration target: Remus Failover for domain %u\n",
> -domid);
> +fprintf(stderr, "migration target: %s Failover for domain %u\n",
> +ha, domid);
>  
>  /*
>   * If domain renaming fails, lets just continue (as we need the 
> domain
> @@ -4492,16 +4498,20 @@ static void migrate_receive(int debug, int daemonize, 
> int monitor,
>  rc = libxl_domain_rename(ctx, domid, migration_domname,
>   common_domname);
>  if (rc)
> -fprintf(stderr, "migration target (Remus): "
> +fprintf(stderr, "migration target (%s): "
>  "Failed to rename domain from %s to %s:%d\n",
> -migration_domname, common_domname, rc);
> +ha, migration_domname, common_domname, rc);
>  }
>  
> +if (checkpointed == LIBXL_CHECKPOINTED_STREAM_COLO)
> +/* The guest is running after failover in COLO mode */
> +exit(rc ? -ERROR_FAIL: 0);
> +
>  rc = libxl_domain_unpause(ctx, domid);
>  if (rc)
> -fprintf(stderr, "migration target (Remus): "
> +fprintf(stderr, "migration target (%s): "
>  "Failed to unpause domain %s (id: %u):%d\n",
> -common_domname, domid, rc);
> +ha, common_domname, domid, rc);
>  
>  exit(rc ? -ERROR_FAIL: 0);
>  default:
> @@ -4649,7 +4659,7 @@ int main_migrate_receive(int argc, char **argv)
>  libxl_checkpointed_stream checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
>  int opt;
>  
> -SWITCH_FOREACH_OPT(opt, "Fedr", NULL, "migrate-receive", 0) {
> +SWITCH_FOREACH_OPT(opt, "Fedrc", NULL, "migrate-receive", 0) {
>  case 'F':
>  daemonize = 0;
>  break;
> @@ -4663,6 +4673,9 @@ int main_migrate_receive(int argc, char **argv)
>  case 'r':
>  checkpointed = LIBXL_CHECKPOINTED_STREAM_REMUS;
>  break;
> +case 'c':
> +checkpointed = LIBXL_CHECKPOINTED_STREAM_COLO;
> +break;
>  }
>  
>  if (argc-optind != 0) {
> @@ -8032,11 +8045,8 @@ int main_remus(int argc, char **argv)
>  int config_len;
>  
>  memset(&r_info, 0, sizeof(libxl_domain_remus_info));
> -/* Defaults */
> -r_info.interval = 200;
> -libxl_defbool_setdefault(&r_info.blackhole, false);
>  
> -  

Re: [Xen-devel] [PATCH v10 12/31] tools/libxl: add back channel support to read stream

2016-03-02 Thread Wei Liu
On Fri, Feb 26, 2016 at 10:16:43AM +0800, Wen Congyang wrote:
[...]
> > 
> > Even if it doesn't have restore helper, check_all_finished also checks
> > if the stream and the conversion helper are still in use.  The
> > explanation in the comment doesn't seem to justify this change.
> 
> In stream_done(), stream->running is set to false, so 
> libxl__stream_read_in_use()
> returns false.
> 
> Back channel stream doesn't support legacy stream, so there is no conversion 
> helper.
> 
> I will update the comments in the next version.
> 

Yes, please do. Say clearly why it is not needed.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 10/31] tools/libxl: add back channel support to write stream

2016-03-02 Thread Wei Liu
On Fri, Feb 26, 2016 at 10:11:27AM +0800, Wen Congyang wrote:
> On 02/25/2016 11:54 PM, Wei Liu wrote:
> > On Mon, Feb 22, 2016 at 10:52:14AM +0800, Wen Congyang wrote:
> >> Add back channel support to write stream. If the write stream is
> >> a back channel stream, this means the write stream is used by
> >> Secondary to send some records back.
> >>
> >> Signed-off-by: Yang Hongyang 
> >> Signed-off-by: Wen Congyang 
> >> ---
> >>  tools/libxl/libxl_dom_save.c |  1 +
> >>  tools/libxl/libxl_internal.h |  1 +
> >>  tools/libxl/libxl_stream_write.c | 26 --
> >>  3 files changed, 22 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
> >> index 72b61c7..18946c5 100644
> >> --- a/tools/libxl/libxl_dom_save.c
> >> +++ b/tools/libxl/libxl_dom_save.c
> >> @@ -404,6 +404,7 @@ void libxl__domain_save(libxl__egc *egc, 
> >> libxl__domain_save_state *dss)
> >>  dss->sws.ao  = dss->ao;
> >>  dss->sws.dss = dss;
> >>  dss->sws.fd  = dss->fd;
> >> +dss->sws.back_channel = false;
> >>  dss->sws.completion_callback = stream_done;
> >>  
> >>  libxl__stream_write_start(egc, &dss->sws);
> >> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> >> index 3d3e8e8..e02e554 100644
> >> --- a/tools/libxl/libxl_internal.h
> >> +++ b/tools/libxl/libxl_internal.h
> >> @@ -3044,6 +3044,7 @@ struct libxl__stream_write_state {
> >>  libxl__ao *ao;
> >>  libxl__domain_save_state *dss;
> >>  int fd;
> >> +bool back_channel;
> >>  void (*completion_callback)(libxl__egc *egc,
> >>  libxl__stream_write_state *sws,
> >>  int rc);
> >> diff --git a/tools/libxl/libxl_stream_write.c 
> >> b/tools/libxl/libxl_stream_write.c
> >> index f6ea55d..5379126 100644
> >> --- a/tools/libxl/libxl_stream_write.c
> >> +++ b/tools/libxl/libxl_stream_write.c
> >> @@ -49,6 +49,13 @@
> >>   *  - if (hvm)
> >>   *  - Emulator context record
> >>   *  - Checkpoint end record
> >> + *
> >> + * For back channel stream:
> >> + * - libxl__stream_write_start()
> >> + *- Set up the stream to running state
> >> + *
> >> + * - Add a new API to write the record. When the record is written
> >> + *   out, call stream->checkpoint_callback() to return.
> > 
> > What does this mean? Which new API?
> 
> > The next patch introduces this API. The commit is very old.
> 
> I think I can merge these two patches into one patch.
> 

Please reference the actual function / API.

> > 
> >>   */
> >>  
> >>  /* Success/error/cleanup handling. */
> >> @@ -225,6 +232,15 @@ void libxl__stream_write_start(libxl__egc *egc,
> >>  
> >>  stream->running = true;
> >>  
> >> +dc->ao= ao;
> >> +dc->readfd= -1;
> >> +dc->copywhat  = "save v2 stream";
> >> +dc->writefd   = stream->fd;
> >> +dc->maxsz = -1;
> >> +
> >> +if (stream->back_channel)
> >> +return;
> >> +
> > 
> > There seems to be very subtle change of behaviour.
> > 
> > Before this patch, the dc->* are not set until the emulator check is
> > > done. With this patch, it is possible in the normal case some of the
> > fields are initialised in the error path.
> > 
> > I think this is OK given the callbacks in the upper layer and in
> > the writer don't rely on those fields to clean up. Just one thing to
> > note.
> > 
> > That said, I suggest you move all initialisation of dc->* in one place.
> 
> OK, I will do it.
> 
> > 
> >>  if (dss->type == LIBXL_DOMAIN_TYPE_HVM) {
> >>  stream->device_model_version =
> >>  libxl__device_model_version_running(gc, dss->domid);
> >> @@ -249,12 +265,7 @@ void libxl__stream_write_start(libxl__egc *egc,
> >>  stream->emu_sub_hdr.index = 0;
> >>  }
> >>  
> >> -dc->ao= ao;
> >> -dc->readfd= -1;
> >>  dc->writewhat = "stream header";
> >> -dc->copywhat  = "save v2 stream";
> >> -dc->writefd   = stream->fd;
> >> -dc->maxsz = -1;
> >>  dc->callback  = stream_header_done;
> >>  
> >>  rc = libxl__datacopier_start(dc);
> >> @@ -279,6 +290,7 @@ void libxl__stream_write_start_checkpoint(libxl__egc 
> >> *egc,
> >>  {
> >>  assert(stream->running);
> >>  assert(!stream->in_checkpoint);
> >> +assert(!stream->back_channel);
> >>  stream->in_checkpoint = true;
> >>  
> >>  write_emulator_xenstore_record(egc, stream);
> >> @@ -590,7 +602,9 @@ static void stream_done(libxl__egc *egc,
> >>  libxl__carefd_close(stream->emu_carefd);
> >>  free(stream->emu_body);
> >>  
> >> -check_all_finished(egc, stream, rc);
> >> +if (!stream->back_channel)
> >> +/* back channel stream doesn't have save helper */
> >> +check_all_finished(egc, stream, rc);
> > 
> > Though it doesn't have helper, do you not need to check if the back
> > channel stream itself is OK? The comment itself doesn't seem to justify
> > this change.
> 

Re: [Xen-devel] [PATCH v4 13/24] arm/acpi: Map all other tables for Dom0

2016-03-02 Thread Stefano Stabellini
On Wed, 2 Mar 2016, Shannon Zhao wrote:
> On 2016年03月02日 01:01, Stefano Stabellini wrote:
> > On Tue, 1 Mar 2016, Stefano Stabellini wrote:
> >> > On Tue, 1 Mar 2016, Shannon Zhao wrote:
> >>> > > On 2016/2/29 22:15, Stefano Stabellini wrote:
> >>>> > > > On Sun, 28 Feb 2016, Shannon Zhao wrote:
> >> > > >> > From: Shannon Zhao 
> >> > > >> > 
> >> > > >> > Map all other tables to Dom0 using 1:1 mappings.
> >> > > >> > 
> >> > > >> > Signed-off-by: Shannon Zhao 
> >> > > >> > ---
> >> > > >> > v4: fix commit message
> >> > > >> > ---
> >> > > >> >  xen/arch/arm/domain_build.c | 26 ++
> >> > > >> >  1 file changed, 26 insertions(+)
> >> > > >> > 
> >> > > >> > diff --git a/xen/arch/arm/domain_build.c 
> >> > > >> > b/xen/arch/arm/domain_build.c
> >> > > >> > index 64e48ae..6ad420c 100644
> >> > > >> > --- a/xen/arch/arm/domain_build.c
> >> > > >> > +++ b/xen/arch/arm/domain_build.c
> >> > > >> > @@ -1357,6 +1357,30 @@ static int prepare_dtb(struct domain 
> >> > > >> > *d, struct kernel_info *kinfo)
> >> > > >> >  }
> >> > > >> >  
> >> > > >> >  #ifdef CONFIG_ACPI
> >> > > >> > +static void acpi_map_other_tables(struct domain *d)
> >> > > >> > +{
> >> > > >> > +int i;
> >> > > >> > +unsigned long res;
> >> > > >> > +u64 addr, size;
> >> > > >> > +
> >> > > >> > +/* Map all other tables to Dom0 using 1:1 mappings. */
> >> > > >> > +for( i = 0; i < acpi_gbl_root_table_list.count; i++ )
> >> > > >> > +{
> >> > > >> > +addr = acpi_gbl_root_table_list.tables[i].address;
> >> > > >> > +size = acpi_gbl_root_table_list.tables[i].length;
> >> > > >> > +res = map_regions(d,
> >> > > >> > +  paddr_to_pfn(addr & PAGE_MASK),
> >> > > >> > +  DIV_ROUND_UP(size, PAGE_SIZE),
> >> > > >> > +  paddr_to_pfn(addr & PAGE_MASK));
> >> > > >> > +if ( res )
> >> > > >> > +{
> >> > > >> > + panic(XENLOG_ERR "Unable to map 0x%"PRIx64
> >> > > >> > +   " - 0x%"PRIx64" in domain \n",
> >> > > >> > +   addr & PAGE_MASK, PAGE_ALIGN(addr + size) 
> >> > > >> > - 1);
> >> > > >> > +}
> >> > > >> > +}
> >> > > >> > +}
> >>>> > > > The problem with this function is that it is mapping all other
> >>>> > > > tables to the guest, including the unmodified FADT and MADT. This
> >>>> > > > way dom0 is going to find two different versions of the FADT and
> >>>> > > > MADT, isn't that right?
> >>>> > > >
> >>> > > We've replaced the entries of XSDT table with new value. That means 
> >>> > > XSDT
> >>> > > points to new table. Guest will not see the old ones.
> >> > 
> >> > All right. Of course it would be best to avoid mapping the original FADT
> >> > and MADT at all, but given that they are not likely to be page aligned,
> >> > it wouldn't be easy to do.
> >> > 
> >> > Reviewed-by: Stefano Stabellini 
> >
> > However I have one more question: given that map_regions maps the memory
> > read-only to Dom0, isn't it possible that one or more of the DSDT
> > functions could not work as expected? I would imagine that the ACPI
> > bytecode is allowed to change its own memory, right?
> > 
> I'm not sure about this. But it seems that Xen or Linux always map these
> tables to its memory.

Mapping the pages is not, in general, the problem. The potential issue comes
from the pages being mapped read-only. If an AML function in the DSDT
needs to write something to memory, I imagine that the function would
fail when called from Dom0.

I think we need to map them read-write, which is safe, even for the
original FADT and MADT, because by the time Dom0 gets to see them, Xen
won't parse them anymore (Xen finishes parsing the ACPI tables before
booting Dom0).

So this patch is fine, but
http://marc.info/?l=xen-devel&m=145665887528175 needs to be changed to
use p2m_access_rw instead of p2m_access_r.
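
A side note on the arithmetic in the quoted acpi_map_other_tables(): since
the tables are unlikely to be page aligned, it can help to compare the page
count obtained by rounding up the length alone with the number of pages the
byte range actually touches. The sketch below is only an illustration; a
PAGE_SHIFT of 12 and both helper names are assumptions, not code from the
patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                        /* assumed 4 KiB pages */
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/* Page count if only the length is rounded up, as DIV_ROUND_UP(size, PAGE_SIZE) does. */
uint64_t pages_by_length(uint64_t size)
{
    return (size + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Pages actually touched by the byte range [addr, addr + size), for size > 0. */
uint64_t pages_spanned(uint64_t addr, uint64_t size)
{
    assert(size > 0);

    uint64_t first = addr >> PAGE_SHIFT;
    uint64_t last  = (addr + size - 1) >> PAGE_SHIFT;

    return last - first + 1;
}
```

With addr = 0x1ff0 and size = 0x20 the first helper gives 1 page while the
range touches 2, so the two counts only agree when the table does not
straddle a page boundary.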


Re: [Xen-devel] what's inside hypercall page?

2016-03-02 Thread Juergen Gross
On 02/03/16 15:29, quizyjones wrote:
> After step by step monitoring, I get the following statistics about
> hypercall entries:
> 
> count | hypercall # | instruction offsets executed (hex, relative to hypercall entry)
>7755 24: 0 1 3 8 a c d
>6374 23: 0 1 3 4 9
>3281 25: 0 1 3 8 a c d
>2979 13: 0 1 3 8 a c d
>2475 17: 0 1 3 8
>2253 17: a c d
> 749 3: 0 1 3 8 a c d
> 655 23: 0 1 3 4 9 0 1 3 4 9
> 640 29: 0 1 3 8
> 636 29: a c d
> 445 23: 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9
> 433 23: 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9
> 414 24: 0 1 3 8 a c d 0 1 3 8 a c d
> 274 13: *0 1 3 8 8 a c d*
> 129 17: d
> 125 17: a c
> 112 29: a c d 0 1 3 8
> 112 17: c d
> 105 17: a
>  73 24: 0 1 3 8 a c d 0 1 3 8 a c d 0 1 3 8 a c d
>  67 17: 0
>  59 17: 8 a c d
>  54 17: 0 1 3
>  53 17: 0 1
>  50 17: 1 3 8 a c d
>  46 17: 3 8 a c d
>  21 3: 0 1 3 8 a c d 0 1 3 8 a c d
>   8 33: 0 1 3 8 a c d
>   7 17: 1 3
>   6 13: 0 1 3 8 8 8 a c d
>   5 29: d
>   5 23: 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9
>   4 29: a c
>   4 17: 3
>   3 17: 8 a
>   3 17: 8
>   3 17: 3 8
>   3 17: 1 3 8 a c
>   3 17: 1
>   2 29: 0 1 3 8 a c d
>   2 17: 3 8 a
>   2 17: 1 3 8 a
>   2 17: 1 3 8
>   1 29: c
>   1 29: a
>   1 29: 3 8 a c d
>   1 29: 1 3 8 a c d
>   1 29: 0 1
>   1 29: 0
>   1 17: 3 8 a c
> 
> From the above we can see that hypercalls #17 and #29 are very irregular,
> with various combinations occurring. Other hypercalls basically follow the
> sequence "0 1 3 8 a c d", which matches the content written by the
> hypercall_page_initialise function. HYPERVISOR_iret is a special case, as
> explained in that function, but it also follows its own sequence, "0 1
> 3 4 9". So why are #17 (do_xen_version) and #29 (do_sched_op) so

do_sched_op is self-explanatory: it is used for scheduling of the vcpu.
A vcpu going idle uses this hypercall. So any interrupt waking
the vcpu up will seem to occur very near to the hypercall.

do_xen_version is often used as a very fast way to execute the check
for pending events in the hypervisor (kind of polling).
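
For reference, the offset sequence "0 1 3 8 a c d" quoted above is just the
list of instruction start addresses in an ordinary 64-bit PV stub. The
sketch below rebuilds one such 32-byte entry in a plain buffer, using the
opcodes as I recall them from hypercall_page_initialise_ring3_kernel();
build_stub() and STUB_SIZE are made-up names for illustration, so please
check the bytes against your own tree.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STUB_SIZE 32   /* one hypercall page entry */

/* Illustrative helper (not a Xen function): fill one ordinary stub. */
void build_stub(uint8_t *p, uint32_t nr)
{
    memset(p, 0xcc, STUB_SIZE);             /* pad with int3 */
    p[0]  = 0x51;                           /* 0x0: push %rcx     */
    p[1]  = 0x41; p[2]  = 0x53;             /* 0x1: push %r11     */
    p[3]  = 0xb8;                           /* 0x3: mov $nr,%eax  */
    p[4]  = (uint8_t)(nr & 0xff);           /*      imm32, little endian */
    p[5]  = (uint8_t)((nr >> 8) & 0xff);
    p[6]  = (uint8_t)((nr >> 16) & 0xff);
    p[7]  = (uint8_t)((nr >> 24) & 0xff);
    p[8]  = 0x0f; p[9]  = 0x05;             /* 0x8: syscall       */
    p[10] = 0x41; p[11] = 0x5b;             /* 0xa: pop %r11      */
    p[12] = 0x59;                           /* 0xc: pop %rcx      */
    p[13] = 0xc3;                           /* 0xd: ret           */
}

/* Instruction start offsets inside the stub built above. */
const uint8_t stub_insn_offsets[] = { 0x0, 0x1, 0x3, 0x8, 0xa, 0xc, 0xd };
```

The HYPERVISOR_iret entry differs in pushing %rax before the mov and in
having nothing after the syscall, which gives "0 1 3 4 9".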

> irregular? They seem to be easily interrupted at any point in the
> hypercall entry. Besides, there are also some anomalies for
> #13 (do_multicall), shown in bold.

do_multicall might run for a long time. So the hypervisor returns to
the caller from time to time setting IP to the hypercall. The caller
has the chance to react to interrupts and will then continue the
hypercall.
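
The continuation behaviour described above can be modelled in a few lines.
This is a toy sketch, not Xen code: the struct, the budget parameter and
both function names are invented here, and the real logic (built around
hypercall_preempt_check() and hypercall_create_continuation(), as far as I
know) is not reproduced. It only shows why the syscall at offset 8 can be
executed more than once for a single multicall.

```c
#include <assert.h>
#include <stddef.h>

enum { MC_DONE = 0, MC_CONTINUE = 1 };

/* Progress of one batched call, kept across re-entries. */
struct multicall_state {
    size_t done;
    size_t total;
};

/* Hypervisor side: handle at most `budget` entries, then hand control back. */
int multicall_step(struct multicall_state *st, size_t budget)
{
    size_t n = 0;

    while (st->done < st->total && n < budget) {
        st->done++;          /* stands in for executing one batched call */
        n++;
    }
    return st->done < st->total ? MC_CONTINUE : MC_DONE;
}

/* Guest side: returns how many times the syscall instruction was executed. */
size_t run_multicall(size_t total, size_t budget)
{
    struct multicall_state st = { 0, total };
    size_t entries = 1;

    assert(budget > 0);
    while (multicall_step(&st, budget) == MC_CONTINUE)
        entries++;           /* IP was set back to the syscall, so it runs again */
    return entries;
}
```

A batch of 10 calls with a budget of 4 enters the hypervisor three times,
which in the trace would show up as offset 8 repeated, as in the
"0 1 3 8 8 8 a c d" rows.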


HTH, Juergen



Re: [Xen-devel] [RFC PATCH] xen-block: introduces extra request to pass-through SCSI commands

2016-03-02 Thread Ian Jackson
Bob Liu writes ("Re: [RFC PATCH] xen-block: introduces extra request to 
pass-through SCSI commands"):
> On 03/02/2016 07:40 PM, Ian Jackson wrote:
> > I can't see how that could cause anything but pain.  In many cases
> > "the underlying SCSI storage target" wouldn't be well defined.  Even
> > if it was, these side channel SCSI commands are likely to Go Wrong in
> > exciting ways.
> > 
> > What SCSI commands do you want to send ?
> 
> * INQUIRY

... but why ?

> * PERSISTENT RESERVE IN
> * PERSISTENT RESERVE OUT
> 
> This is for Failover Clusters in Windows; I'm not sure whether more
> commands are required.  I couldn't find a list of required SCSI commands
> in the failover documentation.

So you want to be able to reserve the volume against concurrent
access ?  If you're using LVM, such a reservation should apply to the
LVM LV, not to the underlying physical storage device, clearly.  So I
think LIO [1] + PVSCSI might be what you want.

Ian.

[1] http://linux-iscsi.org/wiki/LIO



Re: [Xen-devel] Xen 4.7 Development Update

2016-03-02 Thread George Dunlap
On Wed, Mar 2, 2016 at 1:32 PM, Jan Beulich  wrote:
 On 02.03.16 at 12:38,  wrote:
>> On Mon, Feb 29, 2016 at 11:17 AM, Wei Liu  wrote:
>>> *  Improve ioreq server performance
>>>   -  Yu Zhang
>>>   -  Paul Durrant
>>
>> If this means "use RB trees for rangesets", I think this is already in.
>
> No, it's not. There was no point in committing that one without
> the patch actually needing it.

Oh, right -- my mistake then.

But my real purpose in commenting was that "improve ioreq server
performance" isn't really an accurate description; a better one might
be, "allow ioreq server interface to support XenGT", which is what is
really wanted.

 -George



Re: [Xen-devel] Xen 4.7 Development Update

2016-03-02 Thread Xu, Quan
On February 29, 2016 at 7:17pm,  wrote:

> *  VT-d asynchronous flush issue
>   -  Quan Xu

V6 has been sent out. Thanks.

Quan



[Xen-devel] [PATCH v6 2/5] IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error

2016-03-02 Thread Quan Xu
Signed-off-by: Quan Xu 
---
 xen/arch/x86/mm/p2m-ept.c |   2 +-
 xen/drivers/passthrough/amd/iommu_init.c  |  12 ++-
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   2 +-
 xen/drivers/passthrough/arm/smmu.c|  10 ++-
 xen/drivers/passthrough/iommu.c   |  17 ++--
 xen/drivers/passthrough/vtd/extern.h  |   2 +-
 xen/drivers/passthrough/vtd/iommu.c   | 120 ++
 xen/drivers/passthrough/vtd/quirks.c  |  26 +++---
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   2 +-
 xen/include/asm-x86/iommu.h   |   2 +-
 xen/include/xen/iommu.h   |   6 +-
 11 files changed, 133 insertions(+), 68 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index d31b9af..2247972 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -829,7 +829,7 @@ out:
  need_modify_vtd_table )
 {
 if ( iommu_hap_pt_share )
-iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
+rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, 
vtd_pte_present);
 else
 {
 if ( iommu_flags )
diff --git a/xen/drivers/passthrough/amd/iommu_init.c 
b/xen/drivers/passthrough/amd/iommu_init.c
index d90a2d2..5635650 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1340,12 +1340,14 @@ static void invalidate_all_devices(void)
 iterate_ivrs_mappings(_invalidate_all_devices);
 }
 
-void amd_iommu_suspend(void)
+int amd_iommu_suspend(void)
 {
 struct amd_iommu *iommu;
 
 for_each_amd_iommu ( iommu )
 disable_iommu(iommu);
+
+return 0;
 }
 
 void amd_iommu_resume(void)
@@ -1369,3 +1371,11 @@ void amd_iommu_resume(void)
 invalidate_all_domain_pages();
 }
 }
+
+void amd_iommu_crash_shutdown(void)
+{
+struct amd_iommu *iommu;
+
+for_each_amd_iommu ( iommu )
+disable_iommu(iommu);
+}
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c 
b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index c1c0b6b..8d9f358 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -628,6 +628,6 @@ const struct iommu_ops amd_iommu_ops = {
 .suspend = amd_iommu_suspend,
 .resume = amd_iommu_resume,
 .share_p2m = amd_iommu_share_p2m,
-.crash_shutdown = amd_iommu_suspend,
+.crash_shutdown = amd_iommu_crash_shutdown,
 .dump_p2m_table = amd_dump_p2m_table,
 };
diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index bb08827..96bb568 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2544,7 +2544,7 @@ static int force_stage = 2;
  */
 static u32 platform_features = ARM_SMMU_FEAT_COHERENT_WALK;
 
-static void arm_smmu_iotlb_flush_all(struct domain *d)
+static int arm_smmu_iotlb_flush_all(struct domain *d)
 {
struct arm_smmu_xen_domain *smmu_domain = 
domain_hvm_iommu(d)->arch.priv;
struct iommu_domain *cfg;
@@ -2561,13 +2561,15 @@ static void arm_smmu_iotlb_flush_all(struct domain *d)
arm_smmu_tlb_inv_context(cfg->priv);
}
spin_unlock(&smmu_domain->lock);
+
+return 0;
 }
 
-static void arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn,
- unsigned int page_count)
+static int arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn,
+unsigned int page_count)
 {
 /* ARM SMMU v1 doesn't have flush by VMA and VMID */
-arm_smmu_iotlb_flush_all(d);
+return arm_smmu_iotlb_flush_all(d);
 }
 
 static struct iommu_domain *arm_smmu_get_domain(struct domain *d,
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index daff00c..fff60e9 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -171,7 +171,11 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
  ((page->u.inuse.type_info & PGT_type_mask)
   == PGT_writable_page) )
 mapping |= IOMMUF_writable;
-hd->platform_ops->map_page(d, gfn, mfn, mapping);
+if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) )
+printk(XENLOG_G_ERR
+   "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n",
+   gfn, mfn);
+
 if ( !(i++ & 0xf) )
 process_pending_softirqs();
 }
@@ -273,9 +277,7 @@ int iommu_iotlb_flush(struct domain *d, unsigned long gfn, 
unsigned int page_cou
 if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush 
)
 return 0;
 
-hd->platform_ops->iotlb_flush(d, gfn, page_count);
-
-return 0;
+return hd->platform_ops->iotlb_flush(d, gfn, page_count);
 }
 
 int iommu_iotlb_flush_all(struct domain *d)
@@ -285,9 +287,7 @@ int iommu_iotlb_flush_all(struct

Re: [Xen-devel] Xen 4.7 Development Update

2016-03-02 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 02 March 2016 14:21
> To: Paul Durrant
> Cc: Wei Liu; xen-devel; George Dunlap
> Subject: RE: [Xen-devel] Xen 4.7 Development Update
> 
> >>> On 02.03.16 at 15:07,  wrote:
> >>  -Original Message-
> >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Jan
> >> Beulich
> >> Sent: 02 March 2016 13:33
> >> To: George Dunlap
> >> Cc: xen-devel; Wei Liu
> >> Subject: Re: [Xen-devel] Xen 4.7 Development Update
> >>
> >> >>> On 02.03.16 at 12:38,  wrote:
> >> > On Mon, Feb 29, 2016 at 11:17 AM, Wei Liu  wrote:
> >> >> *  Improve ioreq server performance
> >> >>   -  Yu Zhang
> >> >>   -  Paul Durrant
> >> >
> >> > If this means "use RB trees for rangesets", I think this is already in.
> >>
> >> No, it's not. There was no point in committing that one without
> >> the patch actually needing it.
> >>
> >
> > Using RB trees vs. a linear walk is still an improvement so I see no harm in
> > committing it.
> 
> But the individual nodes (and hence the overall resource use)
> grow, and whether that is worth the presumably tiny win on
> lookups I'm not at all certain.
> 

Ok. Your call, but I would have thought the increased resource use was small
enough not to worry about; it must be on the order of kilobytes, if that.
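
A rough way to put a number on that, as a sketch only: the layouts below are
assumptions modelled on the usual list_head and rb_node shapes (two pointers
versus a parent/colour word plus two pointers), and the range count used in
the note is made up, not measured.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed link layouts, not copied from the Xen headers. */
struct demo_list_links { void *next, *prev; };
struct demo_rb_links   { unsigned long parent_color; void *right, *left; };

/* A range [s, e] tracked on a linked list or in an RB tree. */
struct demo_range_list { struct demo_list_links links; unsigned long s, e; };
struct demo_range_rb   { struct demo_rb_links links;   unsigned long s, e; };

/* Extra memory needed when nr_ranges nodes switch from list to RB tree. */
size_t demo_extra_bytes(size_t nr_ranges)
{
    return nr_ranges *
           (sizeof(struct demo_range_rb) - sizeof(struct demo_range_list));
}
```

On a typical LP64 build the difference is one word (8 bytes) per range, so
for example 1024 ranges would cost 8 KiB more.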

  Paul

> Jan




[Xen-devel] [PATCH v6 3/5] IOMMU: Make the pcidevs_lock a recursive one

2016-03-02 Thread Quan Xu
Signed-off-by: Quan Xu 
---
 xen/arch/x86/domctl.c   |  8 +--
 xen/arch/x86/hvm/vmsi.c |  4 +-
 xen/arch/x86/irq.c  |  8 +--
 xen/arch/x86/msi.c  | 16 ++---
 xen/arch/x86/pci.c  |  4 +-
 xen/arch/x86/physdev.c  | 16 ++---
 xen/common/sysctl.c |  4 +-
 xen/drivers/passthrough/amd/iommu_init.c|  9 ++-
 xen/drivers/passthrough/amd/iommu_map.c |  2 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  4 +-
 xen/drivers/passthrough/pci.c   | 93 -
 xen/drivers/passthrough/vtd/iommu.c | 14 ++---
 xen/drivers/video/vga.c |  4 +-
 xen/include/xen/pci.h   |  4 +-
 14 files changed, 102 insertions(+), 88 deletions(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index bf62a88..21cc161 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -427,9 +427,9 @@ long arch_do_domctl(
 ret = -ESRCH;
 if ( iommu_enabled )
 {
-spin_lock(&pcidevs_lock);
+pcidevs_lock();
 ret = pt_irq_create_bind(d, bind);
-spin_unlock(&pcidevs_lock);
+pcidevs_unlock();
 }
 if ( ret < 0 )
 printk(XENLOG_G_ERR "pt_irq_create_bind failed (%ld) for dom%d\n",
@@ -452,9 +452,9 @@ long arch_do_domctl(
 
 if ( iommu_enabled )
 {
-spin_lock(&pcidevs_lock);
+pcidevs_lock();
 ret = pt_irq_destroy_bind(d, bind);
-spin_unlock(&pcidevs_lock);
+pcidevs_unlock();
 }
 if ( ret < 0 )
 printk(XENLOG_G_ERR "pt_irq_destroy_bind failed (%ld) for dom%d\n",
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index ac838a9..8e0817b 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -388,7 +388,7 @@ int msixtbl_pt_register(struct domain *d, struct pirq 
*pirq, uint64_t gtable)
 struct msixtbl_entry *entry, *new_entry;
 int r = -EINVAL;
 
-ASSERT(spin_is_locked(&pcidevs_lock));
+ASSERT(pcidevs_is_locked());
 ASSERT(spin_is_locked(&d->event_lock));
 
 /*
@@ -443,7 +443,7 @@ void msixtbl_pt_unregister(struct domain *d, struct pirq 
*pirq)
 struct pci_dev *pdev;
 struct msixtbl_entry *entry;
 
-ASSERT(spin_is_locked(&pcidevs_lock));
+ASSERT(pcidevs_is_locked());
 ASSERT(spin_is_locked(&d->event_lock));
 
 irq_desc = pirq_spin_lock_irq_desc(pirq, NULL);
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index bf2e822..68bdf19 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1955,7 +1955,7 @@ int map_domain_pirq(
 struct pci_dev *pdev;
 unsigned int nr = 0;
 
-ASSERT(spin_is_locked(&pcidevs_lock));
+ASSERT(pcidevs_is_locked());
 
 ret = -ENODEV;
 if ( !cpu_has_apic )
@@ -2100,7 +2100,7 @@ int unmap_domain_pirq(struct domain *d, int pirq)
 if ( (pirq < 0) || (pirq >= d->nr_pirqs) )
 return -EINVAL;
 
-ASSERT(spin_is_locked(&pcidevs_lock));
+ASSERT(pcidevs_is_locked());
 ASSERT(spin_is_locked(&d->event_lock));
 
 info = pirq_info(d, pirq);
@@ -2226,7 +2226,7 @@ void free_domain_pirqs(struct domain *d)
 {
 int i;
 
-spin_lock(&pcidevs_lock);
+pcidevs_lock();
 spin_lock(&d->event_lock);
 
 for ( i = 0; i < d->nr_pirqs; i++ )
@@ -2234,7 +2234,7 @@ void free_domain_pirqs(struct domain *d)
 unmap_domain_pirq(d, i);
 
 spin_unlock(&d->event_lock);
-spin_unlock(&pcidevs_lock);
+pcidevs_unlock();
 }
 
 static void dump_irqs(unsigned char key)
diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index 3dbb84d..6e5e33e 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -694,7 +694,7 @@ static int msi_capability_init(struct pci_dev *dev,
 u8 slot = PCI_SLOT(dev->devfn);
 u8 func = PCI_FUNC(dev->devfn);
 
-ASSERT(spin_is_locked(&pcidevs_lock));
+ASSERT(pcidevs_is_locked());
 pos = pci_find_cap_offset(seg, bus, slot, func, PCI_CAP_ID_MSI);
 if ( !pos )
 return -ENODEV;
@@ -852,7 +852,7 @@ static int msix_capability_init(struct pci_dev *dev,
 u8 func = PCI_FUNC(dev->devfn);
 bool_t maskall = msix->host_maskall;
 
-ASSERT(spin_is_locked(&pcidevs_lock));
+ASSERT(pcidevs_is_locked());
 
 control = pci_conf_read16(seg, bus, slot, func, msix_control_reg(pos));
 /*
@@ -1042,7 +1042,7 @@ static int __pci_enable_msi(struct msi_info *msi, struct 
msi_desc **desc)
 struct pci_dev *pdev;
 struct msi_desc *old_desc;
 
-ASSERT(spin_is_locked(&pcidevs_lock));
+ASSERT(pcidevs_is_locked());
 pdev = pci_get_pdev(msi->seg, msi->bus, msi->devfn);
 if ( !pdev )
 return -ENODEV;
@@ -1103,7 +1103,7 @@ static int __pci_enable_msix(struct msi_info *msi, struct 
msi_desc **desc)
 u8 func = PCI_FUNC(msi->devfn);
 struct msi_de

[Xen-devel] [PATCH v6 1/5] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error

2016-03-02 Thread Quan Xu
Signed-off-by: Quan Xu 
---
 xen/arch/x86/acpi/power.c | 14 +-
 xen/arch/x86/mm.c | 13 -
 xen/arch/x86/mm/p2m-ept.c | 10 +-
 xen/arch/x86/mm/p2m-pt.c  | 12 ++--
 xen/common/grant_table.c  |  5 +++--
 xen/common/memory.c   |  5 +++--
 xen/drivers/passthrough/iommu.c   | 16 +++-
 xen/drivers/passthrough/vtd/x86/vtd.c |  7 +--
 xen/drivers/passthrough/x86/iommu.c   |  6 +-
 xen/include/xen/iommu.h   |  6 +++---
 10 files changed, 70 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index f41f0de..ed1173c 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
 
 static int device_power_down(void)
 {
+int err;
+
 console_suspend();
 
 time_suspend();
@@ -53,11 +55,21 @@ static int device_power_down(void)
 
 ioapic_suspend();
 
-iommu_suspend();
+err = iommu_suspend();
+if ( err )
+goto iommu_suspend_error;
 
 lapic_suspend();
 
 return 0;
+
+iommu_suspend_error:
+ioapic_resume();
+i8259A_resume();
+time_resume();
+console_resume();
+
+return err;
 }
 
 static void device_power_up(void)
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 202ff76..54d8fce 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2332,7 +2332,7 @@ static int __get_page_type(struct page_info *page, 
unsigned long type,
int preemptible)
 {
 unsigned long nx, x, y = page->u.inuse.type_info;
-int rc = 0;
+int rc = 0, ret = 0;
 
 ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
 
@@ -2443,11 +2443,11 @@ static int __get_page_type(struct page_info *page, 
unsigned long type,
 if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
 {
 if ( (x & PGT_type_mask) == PGT_writable_page )
-iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
+ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
 else if ( type == PGT_writable_page )
-iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
-   page_to_mfn(page),
-   IOMMUF_readable|IOMMUF_writable);
+ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
+ page_to_mfn(page),
+ IOMMUF_readable|IOMMUF_writable);
 }
 }
 
@@ -2464,6 +2464,9 @@ static int __get_page_type(struct page_info *page, 
unsigned long type,
 if ( (x & PGT_partial) && !(nx & PGT_partial) )
 put_page(page);
 
+if ( !rc )
+rc = ret;
+
 return rc;
 }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 9860c6c..d31b9af 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -834,7 +834,15 @@ out:
 {
 if ( iommu_flags )
 for ( i = 0; i < (1 << order); i++ )
-iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+{
+rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, 
iommu_flags);
+if ( rc )
+{
+while ( i-- > 0 )
+iommu_unmap_page(d, gfn + i);
+break;
+}
+}
 else
 for ( i = 0; i < (1 << order); i++ )
 iommu_unmap_page(d, gfn + i);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 709920a..690fffc 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -675,8 +675,16 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long 
gfn, mfn_t mfn,
 }
 else if ( iommu_pte_flags )
 for ( i = 0; i < (1UL << page_order); i++ )
-iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
-   iommu_pte_flags);
+{
+rc = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
+iommu_pte_flags);
+if ( rc )
+{
+while ( i-- > 0 )
+iommu_unmap_page(p2m->domain, gfn + i);
+break;
+}
+}
 else
 for ( i = 0; i < (1UL << page_order); i++ )
 iommu_unmap_page(p2m->domain, gfn + i);
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 2b449d5..f7dd731 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -919,8 +919,9 @@ __gnttab_map_grant_ref(
 {
 nr_gets++;
 (void)get_page(pg, rd);
-if ( !(op->flags & GNTMAP_readonly) )
-get_page_type(p

[Xen-devel] [PATCH v6 0/5] VT-d Device-TLB flush issue

2016-03-02 Thread Quan Xu
This patch set fixes the current timeout concern and also allows limited ATS support:

1. Check VT-d Device-TLB flush error.
   This patch set checks for every kind of error, all the way up the call trees
   of the VT-d Device-TLB flush.

2. Make the pcidevs_lock a recursive one.

3. Reduce the spin timeout to 1ms, which can be changed at boot time with
   'vtd_qi_timeout'.
   For example:
   multiboot /boot/xen.gz ats=1 vtd_qi_timeout=100

4. Fix the VT-d Device-TLB flush timeout issue.
   If a Device-TLB flush times out, we hide the target ATS device and crash
   the domain owning this ATS device.
   If the impacted domain is the hardware domain, we just print a warning.
   The hidden device can no longer be assigned to any domain.
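
Items 3 and 4 boil down to a bounded poll. The sketch below shows that shape
under stated assumptions: the clock and completion check are passed in as
callbacks so the loop can be exercised without hardware, struct fake_dev and
all function names are hypothetical, and -1 stands in for whatever error the
real code returns. MILLISECS() is redefined locally to mirror the Xen macro
of the same name (milliseconds to nanoseconds).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MILLISECS(ms) ((uint64_t)(ms) * 1000000ULL)   /* ms -> ns */

typedef uint64_t (*clock_fn)(void *ctx);   /* current time in ns */
typedef bool (*done_fn)(void *ctx);        /* has the flush completed? */

/* Spin until done() reports completion or timeout_ms elapses.
 * Returns 0 on completion, -1 on timeout. */
int wait_for_flush(clock_fn now, done_fn done, void *ctx,
                   unsigned int timeout_ms)
{
    uint64_t deadline = now(ctx) + MILLISECS(timeout_ms);

    for (;;) {
        if (done(ctx))
            return 0;
        if (now(ctx) >= deadline)
            return -1;
    }
}

/* Test scaffolding: a fake device whose clock advances 0.1 ms per read. */
struct fake_dev {
    uint64_t t_ns;
    uint64_t completes_at_ns;
};

uint64_t fake_now(void *ctx)
{
    struct fake_dev *d = ctx;

    d->t_ns += 100000;
    return d->t_ns;
}

bool fake_done(void *ctx)
{
    struct fake_dev *d = ctx;

    return d->t_ns >= d->completes_at_ns;
}
```

A caller would treat the timeout return as the trigger for hiding the
device, rather than spinning on; a device that needs 5 ms fails with the
1 ms default but passes with vtd_qi_timeout=100.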



 * DMAR_OPERATION_TIMEOUT should also be chopped down to a low number of
   milliseconds.
   As Kevin Tian mentioned in 'Revisit VT-d asynchronous flush issue', we also
   confirmed with the hardware team that 1ms is large enough for an IOMMU
   internal flush. So I can change DMAR_OPERATION_TIMEOUT from 1000 ms to 1 ms.

   IOMMU_WAIT_OP() is only for VT-d register reads/writes, and it also contains
   a panic. We need further discussion on whether and how to remove this panic
   in the next patch set.

 * The coming patch set will fix IOTLB/Context/IETC flush timeout.

--Changes in v6:

#patch 1/2
   * Make a reasonable attempt at splitting things, adjusting top level
     functions first and then working down to the leaf ones.
   * Remove some pointless initializers.
   * Log the error but don't return an error value for hardware_domain init
     and crashed system shutdown.
   * When populating the IOMMU page table for a domU, try to tear down the
     IOMMU page table on an IOMMU IOTLB flush error.
   * When the flush_iotlb_qi() return value is positive, all we need is:
     - call iommu_flush_write_buffer() only when rc > 0
     - return zero from this function when rc is positive, or rc = 0 after
       calling iommu_flush_write_buffer().
   * Fix v4 unaddressed issue:
     http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg01555.html
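
Two of the bullets above can be sketched in isolation: unwinding a partially
completed range of mappings when one of them fails, and folding a positive
(non-error) flush result back to zero after the write-buffer flush. Everything
below is a stand-in written for illustration; struct demo_table, the demo_*
functions and the -1 error value are hypothetical and not taken from the
series.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PAGES 16

/* Hypothetical stand-in for an IOMMU page table. */
struct demo_table {
    bool mapped[DEMO_PAGES];
    unsigned long fail_at;       /* gfn whose mapping is made to fail */
};

int demo_map(struct demo_table *t, unsigned long gfn)
{
    if (gfn >= DEMO_PAGES || gfn == t->fail_at)
        return -1;               /* stands in for a real error code */
    t->mapped[gfn] = true;
    return 0;
}

void demo_unmap(struct demo_table *t, unsigned long gfn)
{
    if (gfn < DEMO_PAGES)
        t->mapped[gfn] = false;
}

/* Map [gfn, gfn + count); on failure undo the pages already mapped. */
int demo_map_range(struct demo_table *t, unsigned long gfn,
                   unsigned long count)
{
    unsigned long i;
    int rc = 0;

    for (i = 0; i < count; i++) {
        rc = demo_map(t, gfn + i);
        if (rc) {
            while (i-- > 0)      /* walks back over i-1 .. 0 */
                demo_unmap(t, gfn + i);
            break;
        }
    }
    return rc;
}

unsigned long demo_count_mapped(const struct demo_table *t)
{
    unsigned long i, n = 0;

    for (i = 0; i < DEMO_PAGES; i++)
        n += t->mapped[i];
    return n;
}

/* Positive rc: do the write-buffer flush (counted here) and report success.
 * Zero and negative values pass through unchanged. */
int demo_finish_flush(int rc, unsigned int *write_buffer_flushes)
{
    if (rc > 0) {
        (*write_buffer_flushes)++;
        rc = 0;
    }
    return rc;
}
```

The point of the unwind loop is that a caller seeing an error can assume
nothing from the failed request is left mapped.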


#patch 3
   * A new patch: make the pcidevs_lock a recursive one (removes the v4
     pcidevs_lock related patches).
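
For readers unfamiliar with the idea, a recursive lock lets the current
holder take the lock again without deadlocking, and only releases it when the
outermost unlock is reached. The sketch below uses C11 atomics and an
explicit CPU id argument; it is a toy model with invented names (struct
rec_lock, rec_lock_*), not the implementation behind pcidevs_lock(),
pcidevs_unlock() and pcidevs_is_locked() in patch 3.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER 0xffffffffu

struct rec_lock {
    atomic_bool held;        /* the underlying non-recursive spinlock */
    atomic_uint owner;       /* CPU id of the holder, or NO_OWNER */
    unsigned int depth;      /* nesting count, only touched by the holder */
};

#define REC_LOCK_INIT { false, NO_OWNER, 0 }

void rec_lock_acquire(struct rec_lock *l, unsigned int cpu)
{
    if (atomic_load(&l->owner) != cpu) {
        /* Not the holder: spin on the plain lock, then record ownership. */
        while (atomic_exchange_explicit(&l->held, true, memory_order_acquire))
            ;
        atomic_store(&l->owner, cpu);
    }
    l->depth++;
}

void rec_lock_release(struct rec_lock *l, unsigned int cpu)
{
    assert(atomic_load(&l->owner) == cpu && l->depth > 0);
    if (--l->depth == 0) {
        /* Outermost unlock: drop ownership, then the lock itself. */
        atomic_store(&l->owner, NO_OWNER);
        atomic_store_explicit(&l->held, false, memory_order_release);
    }
}

bool rec_lock_is_locked(struct rec_lock *l)
{
    return atomic_load(&l->held);
}
```

This is why call sites such as ASSERT(pcidevs_is_locked()) keep working when
a function holding the lock calls into another function that takes it again.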

#patch 4
   * Add an entry in docs/misc/xen-command-line.markdown _alphabetically_.
   * Add a __must_check annotation on the function queue_invalidate_wait().

#patch 5
   * Add stray blanks inside the parentheses.
   * Don't iterate over pdev-s without holding that lock, and hold pcidevs_lock
     for pdev-s list.
   * Print SBDF in canonical :bb:dd.f form.
   * Handle 'ret'/'rc' variables in the same function, and remove the pointless
     rc.

Quan Xu (5):
  IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error
  IOMMU/MMU: Adjust low level functions for VT-d Device-TLB flush error
  IOMMU: Make the pcidevs_lock a recursive one
  VT-d: Reduce spin timeout to 1ms, which can be boot-time changed
  VT-d: Fix vt-d Device-TLB flush timeout issue

 docs/misc/xen-command-line.markdown           |   7 ++
 xen/arch/x86/acpi/power.c                     |  14 ++-
 xen/arch/x86/domctl.c                         |   8 +-
 xen/arch/x86/hvm/vmsi.c                       |   4 +-
 xen/arch/x86/irq.c                            |   8 +-
 xen/arch/x86/mm.c                             |  13 ++-
 xen/arch/x86/mm/p2m-ept.c                     |  12 ++-
 xen/arch/x86/mm/p2m-pt.c                      |  12 ++-
 xen/arch/x86/msi.c                            |  16 +--
 xen/arch/x86/pci.c                            |   4 +-
 xen/arch/x86/physdev.c                        |  16 +--
 xen/common/grant_table.c                      |   5 +-
 xen/common/memory.c                           |   5 +-
 xen/common/sysctl.c                           |   4 +-
 xen/drivers/passthrough/amd/iommu_init.c      |  21 ++--
 xen/drivers/passthrough/amd/iommu_map.c       |   2 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   6 +-
 xen/drivers/passthrough/arm/smmu.c            |  10 +-
 xen/drivers/passthrough/iommu.c               |  25 +++--
 xen/drivers/passthrough/pci.c                 |  99 ++-
 xen/drivers/passthrough/vtd/extern.h          |   4 +-
 xen/drivers/passthrough/vtd/iommu.c           | 134 +-
 xen/drivers/passthrough/vtd/qinval.c          |  80 +--
 xen/drivers/passthrough/vtd/quirks.c          |  26 +++--
 xen/drivers/passthrough/vtd/x86/ats.c         |  12 +++
 xen/drivers/passthrough/vtd/x86/vtd.c         |   7 +-
 xen/drivers/passthrough/x86/iommu.c           |   6 +-
 xen/drivers/video/vga.c                       |   4 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   2 +-
 xen/include/asm-x86/iommu.h                   |   2 +-
 xen/include/xen/iommu.h                       |  12 +--
 xen/include/xen/pci.h                         |   5 +-
 32 files changed, 399 insertions(+), 186 deletions(-)

-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-

[Xen-devel] [PATCH v6 4/5] VT-d: Reduce spin timeout to 1ms, which can be boot-time changed

2016-03-02 Thread Quan Xu
Signed-off-by: Quan Xu 
---
 docs/misc/xen-command-line.markdown  |  7 +++
 xen/drivers/passthrough/vtd/qinval.c | 15 +--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index a565c1b..1f5a111 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1466,6 +1466,13 @@ Note that if **watchdog** option is also specified vpmu 
will be turned off.
 As the BTS virtualisation is not 100% safe and because of the nehalem quirk
 don't use the vpmu flag on production systems with Intel cpus!
 
+### vtd\_qi\_timeout (VT-d)
+> `= <integer>`
+
+> Default: `1`
+
+Specify the timeout of the VT-d Queued Invalidation in milliseconds.
+
 ### watchdog
 > `= force | `
 
diff --git a/xen/drivers/passthrough/vtd/qinval.c 
b/xen/drivers/passthrough/vtd/qinval.c
index b81b0bd..882b9f4 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -28,6 +28,11 @@
 #include "vtd.h"
 #include "extern.h"
 
+static unsigned int __read_mostly vtd_qi_timeout = 1;
+integer_param("vtd_qi_timeout", vtd_qi_timeout);
+
+#define IOMMU_QI_TIMEOUT (vtd_qi_timeout * MILLISECS(1))
+
 static void print_qi_regs(struct iommu *iommu)
 {
     u64 val;
@@ -130,6 +135,10 @@ static void queue_invalidate_iotlb(struct iommu *iommu,
     spin_unlock_irqrestore(&iommu->register_lock, flags);
 }
 
+/*
+ * NB. We must check all kinds of error and all the way up the
+ * call trees.
+ */
 static int queue_invalidate_wait(struct iommu *iommu,
     u8 iflag, u8 sw, u8 fn)
 {
@@ -167,10 +176,12 @@ static int queue_invalidate_wait(struct iommu *iommu,
         start_time = NOW();
         while ( poll_slot != QINVAL_STAT_DONE )
         {
-            if ( NOW() > (start_time + DMAR_OPERATION_TIMEOUT) )
+            if ( NOW() > (start_time + IOMMU_QI_TIMEOUT) )
             {
                 print_qi_regs(iommu);
-                panic("queue invalidate wait descriptor was not executed");
+                printk(XENLOG_WARNING VTDPREFIX
+                       "Queue invalidate wait descriptor was timeout.\n");
+                return -ETIMEDOUT;
             }
             cpu_relax();
         }
-- 
1.9.1




[Xen-devel] [PATCH v6 5/5] VT-d: Fix vt-d Device-TLB flush timeout issue

2016-03-02 Thread Quan Xu
If a Device-TLB flush times out, hide the target ATS device and crash
the domain owning it. If the impacted domain is the hardware domain,
only a warning is issued. A hidden device can no longer be assigned
to any domain.

Signed-off-by: Quan Xu 
---
 xen/drivers/passthrough/pci.c         |  6 ++--
 xen/drivers/passthrough/vtd/extern.h  |  2 ++
 xen/drivers/passthrough/vtd/qinval.c  | 65 ---
 xen/drivers/passthrough/vtd/x86/ats.c | 12 +++
 xen/include/xen/pci.h                 |  1 +
 5 files changed, 78 insertions(+), 8 deletions(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index d7e94e1..53b382a 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -414,7 +414,7 @@ static void free_pdev(struct pci_seg *pseg, struct pci_dev 
*pdev)
 xfree(pdev);
 }
 
-static void _pci_hide_device(struct pci_dev *pdev)
+void pci_hide_existing_device(struct pci_dev *pdev)
 {
 if ( pdev->domain )
 return;
@@ -431,7 +431,7 @@ int __init pci_hide_device(int bus, int devfn)
 pdev = alloc_pdev(get_pseg(0), bus, devfn);
 if ( pdev )
 {
-_pci_hide_device(pdev);
+pci_hide_existing_device(pdev);
 rc = 0;
 }
 pcidevs_unlock();
@@ -461,7 +461,7 @@ int __init pci_ro_device(int seg, int bus, int devfn)
 }
 
 __set_bit(PCI_BDF2(bus, devfn), pseg->ro_map);
-_pci_hide_device(pdev);
+pci_hide_existing_device(pdev);
 
 return 0;
 }
diff --git a/xen/drivers/passthrough/vtd/extern.h 
b/xen/drivers/passthrough/vtd/extern.h
index 507c2e9..dac1e5a 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -58,6 +58,8 @@ int ats_device(const struct pci_dev *, const struct 
acpi_drhd_unit *);
 
 int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
  u64 addr, unsigned int size_order, u64 type);
+int dev_invalidate_iotlb_sync(struct iommu *iommu, u16 did,
+  u16 seg, u8 bus, u8 devfn);
 
 int qinval_device_iotlb(struct iommu *iommu,
 u32 max_invs_pend, u16 sid, u16 size, u64 addr);
diff --git a/xen/drivers/passthrough/vtd/qinval.c 
b/xen/drivers/passthrough/vtd/qinval.c
index 882b9f4..8ff2d94 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -233,6 +233,57 @@ int qinval_device_iotlb(struct iommu *iommu,
 return 0;
 }
 
+static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
+                                         u16 seg, u8 bus, u8 devfn)
+{
+    struct domain *d = NULL;
+    struct pci_dev *pdev;
+
+    if ( test_bit(did, iommu->domid_bitmap) )
+        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
+
+    if ( d == NULL )
+        return;
+
+    pcidevs_lock();
+    for_each_pdev(d, pdev)
+    {
+        if ( ( pdev->seg == seg ) &&
+             ( pdev->bus == bus ) &&
+             ( pdev->devfn == devfn ) )
+        {
+            ASSERT ( pdev->domain );
+            list_del(&pdev->domain_list);
+            pdev->domain = NULL;
+            pci_hide_existing_device(pdev);
+            break;
+        }
+    }
+
+    pcidevs_unlock();
+
+    if ( !is_hardware_domain(d) )
+        domain_crash(d);
+
+    rcu_unlock_domain(d);
+}
+
+int dev_invalidate_iotlb_sync(struct iommu *iommu, u16 did,
+                              u16 seg, u8 bus, u8 devfn)
+{
+    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
+    int rc = 0;
+
+    if ( qi_ctrl->qinval_maddr )
+    {
+        rc = queue_invalidate_wait(iommu, 0, 1, 1);
+        if ( rc == -ETIMEDOUT )
+            dev_invalidate_iotlb_timeout(iommu, did, seg, bus, devfn);
+    }
+
+    return rc;
+}
+
+
 static void queue_invalidate_iec(struct iommu *iommu, u8 granu, u8 im, u16 
iidx)
 {
 unsigned long flags;
@@ -342,8 +393,6 @@ static int flush_iotlb_qi(
 
 if ( qi_ctrl->qinval_maddr != 0 )
 {
-int rc;
-
 /* use queued invalidation */
 if (cap_write_drain(iommu->cap))
 dw = 1;
@@ -353,11 +402,17 @@ static int flush_iotlb_qi(
 queue_invalidate_iotlb(iommu,
type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
dw, did, size_order, 0, addr);
+
+/*
+ * Before Device-TLB invalidation we need to synchronize
+ * invalidation completions with hardware.
+ */
+ret = invalidate_sync(iommu);
+if ( ret )
+ return ret;
+
 if ( flush_dev_iotlb )
 ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
-rc = invalidate_sync(iommu);
-if ( !ret )
-ret = rc;
 }
 return ret;
 }
diff --git a/xen/drivers/passthrough/vtd/x86/ats.c 
b/xen/drivers/passthrough/vtd/x86/ats.c
index 7c797f6..a79a95a 100644
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -162,6 +162,18 @@ int 

Re: [Xen-devel] what's inside hypercall page?

2016-03-02 Thread quizyjones
After step by step monitoring, I get the following statistics about hypercall 
entries:
numbers | hypercalls | executed bytes (offset to hypercall entry)
   7755  24: 0 1 3 8 a c d
   6374  23: 0 1 3 4 9
   3281  25: 0 1 3 8 a c d
   2979  13: 0 1 3 8 a c d
   2475  17: 0 1 3 8
   2253  17: a c d
    749   3: 0 1 3 8 a c d
    655  23: 0 1 3 4 9 0 1 3 4 9
    640  29: 0 1 3 8
    636  29: a c d
    445  23: 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9
    433  23: 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9
    414  24: 0 1 3 8 a c d 0 1 3 8 a c d
    274  13: 0 1 3 8 8 a c d
    129  17: d
    125  17: a c
    112  29: a c d 0 1 3 8
    112  17: c d
    105  17: a
     73  24: 0 1 3 8 a c d 0 1 3 8 a c d 0 1 3 8 a c d
     67  17: 0
     59  17: 8 a c d
     54  17: 0 1 3
     53  17: 0 1
     50  17: 1 3 8 a c d
     46  17: 3 8 a c d
     21   3: 0 1 3 8 a c d 0 1 3 8 a c d
      8  33: 0 1 3 8 a c d
      7  17: 1 3
      6  13: 0 1 3 8 8 8 a c d
      5  29: d
      5  23: 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9 0 1 3 4 9
      4  29: a c
      4  17: 3
      3  17: 8 a
      3  17: 8
      3  17: 3 8
      3  17: 1 3 8 a c
      3  17: 1
      2  29: 0 1 3 8 a c d
      2  17: 3 8 a
      2  17: 1 3 8 a
      2  17: 1 3 8
      1  29: c
      1  29: a
      1  29: 3 8 a c d
      1  29: 1 3 8 a c d
      1  29: 0 1
      1  29: 0
      1  17: 3 8 a c
From the above we can see that hypercalls #17 and #29 are very irregular, with
many different combinations occurring. The other hypercalls basically follow
the sequence "0 1 3 8 a c d", which matches the content written by the
hypercall_page_initialise function. HYPERVISOR_iret is a special one, as
explained in the function, but it also follows its own sequence of "0 1 3 4 9".
So why do #17 (do_xen_version) and #29 (do_sched_op) behave irregularly? They
seem to be easily interrupted at any point in the hypercall entry. Besides,
there are also some anomalies for #13 (do_multicall), shown in bold.
From: quizy_jo...@outlook.com
To: xen-de...@lists.xenproject.org
Date: Wed, 2 Mar 2016 12:44:16 +
Subject: Re: [Xen-devel] what's inside hypercall page?




For the following hypercall page initialisation code, where does execution
jump to at syscall? How can I predict at which step "pop %r11" executes? Is it
the fifth instruction/step? I need the order to set up hooks precisely to
monitor hypercalls.

static void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
{
    char *p;
    int i;

    /* Fill in all the transfer points with template machine code. */
    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
    {
        if ( i == __HYPERVISOR_iret )
            continue;

        p = (char *)(hypercall_page + (i * 32));
        *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
        *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
        *(u8  *)(p+ 3) = 0xb8;    /* mov  $<i>,%eax */
        *(u32 *)(p+ 4) = i;
        *(u16 *)(p+ 8) = 0x050f;  /* syscall */
        *(u16 *)(p+10) = 0x5b41;  /* pop  %r11 */
        *(u8  *)(p+12) = 0x59;    /* pop  %rcx */
        *(u8  *)(p+13) = 0xc3;    /* ret */
    }

    /*
     * HYPERVISOR_iret is special because it doesn't return and expects a
     * special stack frame. Guests jump at this transfer point instead of
     * calling it.
     */
    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
    *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
    *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
    *(u8  *)(p+ 3) = 0x50;    /* push %rax */
    *(u8  *)(p+ 4) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
    *(u32 *)(p+ 5) = __HYPERVISOR_iret;
    *(u16 *)(p+ 9) = 0x050f;  /* syscall */
}


Re: [Xen-devel] Xen 4.7 Development Update

2016-03-02 Thread Jan Beulich
>>> On 02.03.16 at 15:07,  wrote:
>>  -Original Message-
>> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Jan
>> Beulich
>> Sent: 02 March 2016 13:33
>> To: George Dunlap
>> Cc: xen-devel; Wei Liu
>> Subject: Re: [Xen-devel] Xen 4.7 Development Update
>> 
>> >>> On 02.03.16 at 12:38,  wrote:
>> > On Mon, Feb 29, 2016 at 11:17 AM, Wei Liu  wrote:
>> >> *  Improve ioreq server performance
>> >>   -  Yu Zhang
>> >>   -  Paul Durrant
>> >
>> > If this means "use RB trees for rangesets", I think this is already in.
>> 
>> No, it's not. There was no point in committing that one without
>> the patch actually needing it.
>> 
> 
> Using RB trees vs. a linear walk is still an improvement so I see no harm in 
> committing it.

But the individual nodes (and hence the overall resource use)
grow, and whether that is worth the presumably tiny win on
lookups I'm not at all certain.

Jan




Re: [Xen-devel] Xen 4.7 Development Update

2016-03-02 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: 02 March 2016 13:33
> To: George Dunlap
> Cc: xen-devel; Wei Liu
> Subject: Re: [Xen-devel] Xen 4.7 Development Update
> 
> >>> On 02.03.16 at 12:38,  wrote:
> > On Mon, Feb 29, 2016 at 11:17 AM, Wei Liu  wrote:
> >> *  Improve ioreq server performance
> >>   -  Yu Zhang
> >>   -  Paul Durrant
> >
> > If this means "use RB trees for rangesets", I think this is already in.
> 
> No, it's not. There was no point in committing that one without
> the patch actually needing it.
> 

Using RB trees vs. a linear walk is still an improvement so I see no harm in 
committing it.

  Paul

> Jan


Re: [Xen-devel] [PATCH v5 for Xen 4.7 1/4] xen: enable per-VCPU parameter settings for RTDS scheduler

2016-03-02 Thread George Dunlap
On 02/03/16 13:36, Jan Beulich wrote:
>>>> On 01.03.16 at 18:58,  wrote:
>> On Tue, Feb 9, 2016 at 12:17 PM, Dario Faggioli
>>  wrote:
>>> On Thu, 2016-02-04 at 16:50 -0600, Chong Li wrote:
>>>> --- a/xen/common/sched_rt.c
>>>> +++ b/xen/common/sched_rt.c
>>
>>>
>>>> +for ( index = op->u.v.vcpu_index; index < op->u.v.nr_vcpus;
>>>> index++ )
>>>> +{
>>>> +spin_lock_irqsave(&prv->lock, flags);
>>>> +if ( copy_from_guest_offset(&local_sched,
>>>> +  op->u.v.vcpus, index, 1) )
>>>> +{
>>>> +rc = -EFAULT;
>>>> +spin_unlock_irqrestore(&prv->lock, flags);
>>>> +break;
>>>> +}
>>>> +if ( local_sched.vcpuid >= d->max_vcpus ||
>>>> +  d->vcpu[local_sched.vcpuid] == NULL )
>>>> +{
>>>> +rc = -EINVAL;
>>>> +spin_unlock_irqrestore(&prv->lock, flags);
>>>> +break;
>>>> +}
>>>> +svc = rt_vcpu(d->vcpu[local_sched.vcpuid]);
>>>> +period = MICROSECS(local_sched.s.rtds.period);
>>>> +budget = MICROSECS(local_sched.s.rtds.budget);
>>>> +if ( period > RTDS_MAX_PERIOD || budget <
>>>> RTDS_MIN_BUDGET ||
>>>> +  budget > period )
>>>>
>>> Isn't checking against RTDS_MIN_PERIOD missing?
>>
>> Because RTDS_MIN_PERIOD==RTDS_MIN_BUDGET, by checking budget <
>> RTDS_MIN_BUDGET and budget > period, the checking against
>> RTDS_MIN_PERIOD is already covered.
> 
> If you make code dependent upon such value matches, the
> dependency should be documented and enforced to be
> noticed if broken by a BUILD_BUG_ON().

To expand upon this:

Code changes.  At the moment RTDS_MIN_PERIOD == RTDS_MIN_BUDGET, but the
very fact that you have two different macros implies to anyone coming
along later that you can change one.  If someone does change one but not
the other, then that will create a bug in the program which will be very
difficult to detect.  It is likely not to be noticed during patch review
(since it probably won't change the code you're now introducing), and it
may not even be noticed in follow-up testing for some time.

After you've been bitten several times by this sort of bug, you learn to
be paranoid about this sort of thing (which is why Dario noticed it).

Two ways to proceed:

1. Don't assume RTDS_MIN_PERIOD == RTDS_MIN_BUDGET here, and add the
extra check Dario mentioned.

2. Assume RTDS_MIN_PERIOD == RTDS_MIN_BUDGET, and add something to the
code which will break the build if this is ever false (as Jan suggested).

 -George




Re: [Xen-devel] [PATCH v5 for Xen 4.7 1/4] xen: enable per-VCPU parameter settings for RTDS scheduler

2016-03-02 Thread Jan Beulich
>>> On 01.03.16 at 18:58,  wrote:
> On Tue, Feb 9, 2016 at 12:17 PM, Dario Faggioli
>  wrote:
>> On Thu, 2016-02-04 at 16:50 -0600, Chong Li wrote:
>>> --- a/xen/common/sched_rt.c
>>> +++ b/xen/common/sched_rt.c
> 
>>
>>> +for ( index = op->u.v.vcpu_index; index < op->u.v.nr_vcpus;
>>> index++ )
>>> +{
>>> +spin_lock_irqsave(&prv->lock, flags);
>>> +if ( copy_from_guest_offset(&local_sched,
>>> +  op->u.v.vcpus, index, 1) )
>>> +{
>>> +rc = -EFAULT;
>>> +spin_unlock_irqrestore(&prv->lock, flags);
>>> +break;
>>> +}
>>> +if ( local_sched.vcpuid >= d->max_vcpus ||
>>> +  d->vcpu[local_sched.vcpuid] == NULL )
>>> +{
>>> +rc = -EINVAL;
>>> +spin_unlock_irqrestore(&prv->lock, flags);
>>> +break;
>>> +}
>>> +svc = rt_vcpu(d->vcpu[local_sched.vcpuid]);
>>> +period = MICROSECS(local_sched.s.rtds.period);
>>> +budget = MICROSECS(local_sched.s.rtds.budget);
>>> +if ( period > RTDS_MAX_PERIOD || budget <
>>> RTDS_MIN_BUDGET ||
>>> +  budget > period )
>>>
>> Isn't checking against RTDS_MIN_PERIOD missing?
> 
> Because RTDS_MIN_PERIOD==RTDS_MIN_BUDGET, by checking budget <
> RTDS_MIN_BUDGET and budget > period, the checking against
> RTDS_MIN_PERIOD is already covered.

If you make code dependent upon such value matches, the
dependency should be documented and enforced to be
noticed if broken by a BUILD_BUG_ON().

Jan




Re: [Xen-devel] Xen 4.7 Development Update

2016-03-02 Thread Jan Beulich
>>> On 02.03.16 at 12:38,  wrote:
> On Mon, Feb 29, 2016 at 11:17 AM, Wei Liu  wrote:
>> *  Improve ioreq server performance
>>   -  Yu Zhang
>>   -  Paul Durrant
> 
> If this means "use RB trees for rangesets", I think this is already in.

No, it's not. There was no point in committing that one without
the patch actually needing it.

Jan




[Xen-devel] [xen-unstable test] 84928: tolerable FAIL - PUSHED

2016-03-02 Thread osstest service owner
flight 84928 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/84928/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 build-i386-rumpuserxen                6 xen-build                 fail like 84518
 build-amd64-rumpuserxen               6 xen-build                 fail like 84518
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop                fail like 84518
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop                fail like 84518
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop                fail like 84518
 test-armhf-armhf-xl-rtds             15 guest-start/debian.repeat fail like 84518

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)            blocked n/a
 test-amd64-i386-rumpuserxen-i386    1 build-check(1)            blocked n/a
 test-amd64-amd64-xl-pvh-amd        11 guest-start               fail    never pass
 test-amd64-amd64-xl-pvh-intel      11 guest-start               fail    never pass
 test-armhf-armhf-libvirt-xsm       12 migrate-support-check     fail    never pass
 test-armhf-armhf-libvirt-xsm       14 guest-saverestore         fail    never pass
 test-amd64-i386-libvirt            12 migrate-support-check     fail    never pass
 test-amd64-amd64-qemuu-nested-amd  16 debian-hvm-install/l1/l2  fail    never pass
 test-armhf-armhf-libvirt-raw       13 guest-saverestore         fail    never pass
 test-armhf-armhf-libvirt-raw       11 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-arndale        12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-arndale        13 saverestore-support-check fail    never pass
 test-amd64-amd64-libvirt-xsm       12 migrate-support-check     fail    never pass
 test-amd64-i386-libvirt-xsm        12 migrate-support-check     fail    never pass
 test-amd64-amd64-libvirt-vhd       11 migrate-support-check     fail    never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check
                                                                 fail    never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check
                                                                 fail    never pass
 test-armhf-armhf-xl-xsm            13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-xsm            12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-credit2        13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-credit2        12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-cubietruck     12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-cubietruck     13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl                12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl                13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-rtds           13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-rtds           12 migrate-support-check     fail    never pass
 test-armhf-armhf-libvirt           14 guest-saverestore         fail    never pass
 test-armhf-armhf-libvirt           12 migrate-support-check     fail    never pass
 test-armhf-armhf-libvirt-qcow2     11 migrate-support-check     fail    never pass
 test-armhf-armhf-libvirt-qcow2     13 guest-saverestore         fail    never pass
 test-amd64-amd64-libvirt           12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-multivcpu      13 saverestore-support-check fail    never pass
 test-armhf-armhf-xl-multivcpu      12 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-vhd            11 migrate-support-check     fail    never pass
 test-armhf-armhf-xl-vhd            12 saverestore-support-check fail    never pass

version targeted for testing:
 xen  986d9fc3bbf8a6d9d088ca22d1422bd5de249396
baseline version:
 xen  42391c613d42248d82f1b04c523d48bf141b75dc

Last test of basis    84518  2016-02-29 11:32:28 Z    2 days
Testing same since    84610  2016-03-01 03:26:37 Z    1 days    2 attempts


People who touched revisions under test:
  Boris Ostrovsky 
  Corneliu ZUZU 
  Dario Faggioli 
  Doug Goldstein 
  George Dunlap 
  George Dunlap 
  Haozhong Zhang 
  Ian Campbell 
  Ian Jackson 
  Jan Beulich 
  Parth Dixit 
  Razvan Cojocaru 
  Shannon Zhao 
  Stefano Stabellini 
  Tamas K Lengyel 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern   

Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

2016-03-02 Thread Jan Beulich
>>> On 02.03.16 at 08:14,  wrote:
> It means NVDIMM is very possibly mapped in page granularity, and
> hypervisor needs per-page data structures like page_info (rather than the
> range set style nvdimm_pages) to manage those mappings.
> 
> Then we will face the problem that the potentially huge number of
> per-page data structures may not fit in the normal ram. Linux kernel
> developers came across the same problem, and their solution is to
> reserve an area of NVDIMM and put the page structures in the reserved
> area (https://lwn.net/Articles/672457/). I think we may take the similar
> solution:
> (1) Dom0 Linux kernel reserves an area on each NVDIMM for Xen usage
> (besides the one used by Linux kernel itself) and reports the address
> and size to Xen hypervisor.
> 
> Reasons to choose Linux kernel to make the reservation include:
> (a) only Dom0 Linux kernel has the NVDIMM driver,
> (b) make it flexible for Dom0 Linux kernel to handle all
> reservations (for itself and Xen).
> 
> (2) Then Xen hypervisor builds the page structures for NVDIMM pages and
> stores them in above reserved areas.

Another argument against this being primarily Dom0-managed,
I would say. Furthermore - why would Dom0 waste space
creating per-page control structures for regions which are
meant to be handed to guests anyway?

Jan




Re: [Xen-devel] [PATCH 4/4] hvmloader: Use xen/errno.h rather than the host systems errno.h

2016-03-02 Thread Doug Goldstein
On 3/1/16 12:57 PM, Andrew Cooper wrote:
> hvmloader is unhosted, and shouldn't use the system errno.h.  It already has
> to use Xen's errno.h for other hypercalls.  The use of public/io/xs_wire.h
> requires the use of un-prefixed errno values.
> 
> This fixes the build on stricter toolchains where requesting -fno-builtin does
> reduce the include path as much as it can.
> 
> Reported-by: Doug Goldstein 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Doug Goldstein 

> ---
> CC: Jan Beulich 
> CC: Ian Campbell 
> CC: Ian Jackson 
> CC: Wei Liu 
> CC: Doug Goldstein 
> 
> v3:
>  * Split single patch multiple fixes
> v2:
>  * Fix compilation.  I am not sure how v1 compiled, but I did definitely check
>it before posting.
> ---
>  tools/firmware/hvmloader/util.h   | 9 +
>  tools/firmware/hvmloader/vnuma.c  | 3 +--
>  tools/firmware/hvmloader/xenbus.c | 1 -
>  3 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/firmware/hvmloader/util.h b/tools/firmware/hvmloader/util.h
> index 132d915..3126817 100644
> --- a/tools/firmware/hvmloader/util.h
> +++ b/tools/firmware/hvmloader/util.h
> @@ -9,6 +9,15 @@
>  #include 
>  #include "e820.h"
>  
> +/* Request un-prefixed values from errno.h. */
> +#define XEN_ERRNO(name, value) name = value,
> +enum {
> +#include 
> +};
> +
> +/* Cause xs_wire.h to give us xsd_errors[]. */
> +#define EINVAL EINVAL
> +
>  #define __STR(...) #__VA_ARGS__
>  #define STR(...) __STR(__VA_ARGS__)
>  
> diff --git a/tools/firmware/hvmloader/vnuma.c 
> b/tools/firmware/hvmloader/vnuma.c
> index 4121cc6..85c1a79 100644
> --- a/tools/firmware/hvmloader/vnuma.c
> +++ b/tools/firmware/hvmloader/vnuma.c
> @@ -28,7 +28,6 @@
>  #include "util.h"
>  #include "hypercall.h"
>  #include "vnuma.h"
> -#include 
>  
>  unsigned int nr_vnodes, nr_vmemranges;
>  unsigned int *vcpu_to_vnode, *vdistance;
> @@ -40,7 +39,7 @@ void init_vnuma_info(void)
>  struct xen_vnuma_topology_info vnuma_topo = { .domid = DOMID_SELF };
>  
>  rc = hypercall_memory_op(XENMEM_get_vnumainfo, &vnuma_topo);
> -if ( rc != -XEN_ENOBUFS )
> +if ( rc != -ENOBUFS )
>  return;
>  
>  ASSERT(vnuma_topo.nr_vcpus == hvm_info->nr_vcpus);
> diff --git a/tools/firmware/hvmloader/xenbus.c 
> b/tools/firmware/hvmloader/xenbus.c
> index d0ed993..448157d 100644
> --- a/tools/firmware/hvmloader/xenbus.c
> +++ b/tools/firmware/hvmloader/xenbus.c
> @@ -27,7 +27,6 @@
>  
>  #include "util.h"
>  #include "hypercall.h"
> -#include 
>  #include 
>  #include 
>  #include 
> 


-- 
Doug Goldstein





Re: [Xen-devel] what's inside hypercall page?

2016-03-02 Thread quizyjones
For the following hypercall page initialisation code, where does execution
jump to at syscall? How can I predict at which step "pop %r11" executes? Is it
the fifth instruction/step? I need the order to set up hooks precisely to
monitor hypercalls.

static void hypercall_page_initialise_ring3_kernel(void *hypercall_page)
{
    char *p;
    int i;

    /* Fill in all the transfer points with template machine code. */
    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
    {
        if ( i == __HYPERVISOR_iret )
            continue;

        p = (char *)(hypercall_page + (i * 32));
        *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
        *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
        *(u8  *)(p+ 3) = 0xb8;    /* mov  $<i>,%eax */
        *(u32 *)(p+ 4) = i;
        *(u16 *)(p+ 8) = 0x050f;  /* syscall */
        *(u16 *)(p+10) = 0x5b41;  /* pop  %r11 */
        *(u8  *)(p+12) = 0x59;    /* pop  %rcx */
        *(u8  *)(p+13) = 0xc3;    /* ret */
    }

    /*
     * HYPERVISOR_iret is special because it doesn't return and expects a
     * special stack frame. Guests jump at this transfer point instead of
     * calling it.
     */
    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
    *(u8  *)(p+ 0) = 0x51;    /* push %rcx */
    *(u16 *)(p+ 1) = 0x5341;  /* push %r11 */
    *(u8  *)(p+ 3) = 0x50;    /* push %rax */
    *(u8  *)(p+ 4) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
    *(u32 *)(p+ 5) = __HYPERVISOR_iret;
    *(u16 *)(p+ 9) = 0x050f;  /* syscall */
}

From: quizy_jo...@outlook.com
To: xen-de...@lists.xenproject.org
Date: Wed, 2 Mar 2016 03:50:55 +
Subject: [Xen-devel] what's inside hypercall page?




I've got the hypercall_page_initialise function as follows. As each hypercall
page entry is 32 bytes and the initialise function only assigns values to the
first 8 bytes, is the remaining space empty or initialised afterwards?

static void hypercall_page_initialise_ring1_kernel(void *hypercall_page)
{
    char *p;
    int i;

    /* Fill in all the transfer points with template machine code. */
    for ( i = 0; i < (PAGE_SIZE / 32); i++ )
    {
        if ( i == __HYPERVISOR_iret )
            continue;

        p = (char *)(hypercall_page + (i * 32));
        *(u8  *)(p+ 0) = 0xb8;    /* mov  $<i>,%eax */
        *(u32 *)(p+ 1) = i;
        *(u16 *)(p+ 5) = (HYPERCALL_VECTOR << 8) | 0xcd; /* int  $xx */ //0x82cd
        *(u8  *)(p+ 7) = 0xc3;    /* ret */
    }

    /*
     * HYPERVISOR_iret is special because it doesn't return and expects a
     * special stack frame. Guests jump at this transfer point instead of
     * calling it.
     */
    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
    *(u8  *)(p+ 0) = 0x50;    /* push %eax */
    *(u8  *)(p+ 1) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
    *(u32 *)(p+ 2) = __HYPERVISOR_iret;
    *(u16 *)(p+ 6) = (HYPERCALL_VECTOR << 8) | 0xcd; /* int  $xx */ //0x82cd
}
