date:20160323

[Xen-devel] [qemu-mainline test] 86891: regressions - FAIL

2016-03-23 Thread osstest service owner

flight 86891 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/86891/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-i386-qemuu-rhel6hvm-intel  9 redhat-install fail REGR. vs. 86454
 test-amd64-i386-freebsd10-i386 10 guest-start fail REGR. vs. 86454
 test-amd64-i386-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-i386-qemuu-rhel6hvm-amd  9 redhat-install fail REGR. vs. 86454
 test-amd64-i386-freebsd10-amd64 10 guest-start fail REGR. vs. 86454
 test-amd64-amd64-qemuu-nested-amd  9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-amd64-qemuu-nested-intel  9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 86454
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 9 debian-hvm-install fail REGR. vs. 86454

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-winxpsp3  9 windows-install   fail pass in 86813

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeat fail blocked in 86454
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 86454
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 86454

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt 14 guest-saverestore fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore fail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestore fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-check fail   never pass

version targeted for testing:
 qemuu 4829e0378dfb91d55af9dfd741bd09e8f2c4f91a
baseline version:
 qemuu d1f8764099022bc1173f2413331b26d4ff609a0c

Last test of basis   86454  2016-03-17 06:01:30 Z    6 days
Failing since        86547  2016-03-18 07:12:41 Z    5 days    5 attempts
Testing same since   86628  2016-03-19 04:51:17 Z    5 days    4 attempts


People who touched revisions under test:
  Alberto Garcia 
  Daniel P. Berrange 
  David Gibson 
  Eduardo Habkost

[Xen-devel] [ovmf test] 87061: regressions - FAIL

2016-03-23 Thread osstest service owner

flight 87061 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/87061/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 65543
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 65543

version targeted for testing:
 ovmf 36e8e6992d0cd43891e584b24c556e6abc62b6ec
baseline version:
 ovmf 5ac96e3a28dd26eabee421919f67fa7c443a47f1

Last test of basis   65543  2015-12-08 08:45:15 Z  106 days
Failing since        65593  2015-12-08 23:44:51 Z  106 days  118 attempts
Testing same since   87061  2016-03-23 15:56:14 Z    0 days    1 attempts


People who touched revisions under test:
  "Samer El-Haj-Mahmoud" 
  "Wu, Hao A" 
  "Yao, Jiewen" 
  Alcantara, Paulo 
  Anbazhagan Baraneedharan 
  Andrew Fish 
  Ard Biesheuvel 
  Arthur Crippa Burigo 
  Cecil Sheng 
  Chao Zhang 
  Chao Zhang
  Charles Duffy 
  Cinnamon Shia 
  Cohen, Eugene 
  Dandan Bi 
  Daocheng Bu 
  Daryl McDaniel 
  David Woodhouse 
  Derek Lin 
  edk2 dev 
  edk2-devel 
  Eric Dong 
  Eric Dong 
  Eugene Cohen 
  Evan Lloyd 
  Feng Tian 
  Fu Siyuan 
  Gabriel Somlo 
  Gary Ching-Pang Lin 
  Gary Lin 
  Ghazi Belaam 
  Hao Wu 
  Haojian Zhuang 
  Hess Chen 
  Heyi Guo 
  Jaben Carsey 
  Jeff Fan 
  Jiaxin Wu 
  jiewen yao 
  Jim Dailey 
  jim_dai...@dell.com 
  Jordan Justen 
  Karyne Mayer 
  Larry Hauch 
  Laszlo Ersek 
  Leahy, Leroy P
  Leahy, Leroy P 
  Lee Leahy 
  Leekha Shaveta 
  Leif Lindholm 
  Liming Gao 
  Mark Rutland 
  Marvin Haeuser 
  Michael Kinney 
  Michael LeMay 
  Michael Thomas 
  Michał Zegan 
  Ni, Ruiyu 
  Paolo Bonzini 
  Paulo Alcantara 
  Paulo Alcantara Cavalcanti 
  Peter Kirmeier 
  Qin Long 
  Qiu Shumin 
  Rodrigo Dias Correa 
  Ruiyu Ni 
  Ryan Harkin 
  Samer El-Haj-Mahmoud 
  Samer El-Haj-Mahmoud 
  Star Zeng 
  Supreeth Venkatesh 
  Tapan Shah 
  Tian, Feng 
  Vladislav Vovchenko 
  Yao Jiewen 
  Yao, Jiewen 
  Ye Ting 
  Yonghong Zhu 
  Zhang Lubo 
  Zhang, Chao B 
  Zhang, Lubo 
  Zhangfei Gao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at

Re: [Xen-devel] Severe guest disk corruption with device_model_stubdomain_override=1...

2016-03-23 Thread Sarah Newman

On 03/23/2016 02:46 PM, Sarah Newman wrote:
> On 03/22/2016 11:03 PM, Sarah Newman wrote:
>> And nested xen.
>>
>> CPU: AMD Opteron 2352
>> Outer configuration: Xen4CentOS 6 xen 4.6.1-2.el6, linux 
>> 3.18.25-18.el6.x86_64
>> Inner configuration: Xen4CentOS 6 xen 4.6.1-2.el6, linux 
>> 3.18.25-19.el6.x86_64
>> Inner xen command line:  cpuinfo loglvl=all guest_loglvl=error 
>> dom0_mem=512M,max:512M com1=115200,8n1 console=com1 dom0_max_vcpus=1 
>> dom0_vcpus_pin=true
>> Inner linux command line: ro root=LABEL=DISK rootflags=barrier=0 
>> swiotlb=32768 console=hvc0
> 
>> xen_platform_pci seems to be ignored with 
>> device_model_stubdomain_override=1. So I don't think I can test what happens 
>> with the 3.18.25-19.el6.x86_64
>> kernel, no nested xen, and non-paravirtual block devices.
> 
> The patch submitted in 
> http://lists.xenproject.org/archives/html/xen-devel/2016-03/msg03080.html 
> appears to fix the issue.

FYI, I also had to run "ethtool -K -emu tx off" or tcp did not work 
for intra-host communications (off-host worked OK.) I'm not sure if
that's a known issue or not.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [linux-linus test] 86882: regressions - FAIL

2016-03-23 Thread osstest service owner

flight 86882 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/86882/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-rumpuserxen    6 xen-build fail REGR. vs. 59254
 build-amd64-rumpuserxen   6 xen-build fail REGR. vs. 59254
 test-amd64-amd64-xl-credit2  15 guest-localmigrate fail REGR. vs. 59254
 test-amd64-amd64-xl  15 guest-localmigrate fail REGR. vs. 59254
 test-amd64-i386-xl-xsm   15 guest-localmigrate fail REGR. vs. 59254
 test-amd64-i386-xl   15 guest-localmigrate fail REGR. vs. 59254
 test-amd64-amd64-xl-multivcpu 17 guest-localmigrate/x10   fail REGR. vs. 59254
 test-amd64-amd64-pair  21 guest-migrate/src_host/dst_host fail REGR. vs. 59254
 test-armhf-armhf-xl   6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-xsm   6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-cubietruck  6 xen-boot fail REGR. vs. 59254
 test-amd64-i386-pair   22 guest-migrate/dst_host/src_host fail REGR. vs. 59254
 test-amd64-amd64-xl-xsm  15 guest-localmigrate fail REGR. vs. 59254
 test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR. vs. 59254
 test-armhf-armhf-xl-credit2   6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 14 guest-saverestore fail REGR. vs. 59254
 test-armhf-armhf-xl-rtds  6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-vhd   6 xen-boot fail baseline untested
 test-amd64-amd64-libvirt-pair 22 guest-migrate/dst_host/src_host fail baseline untested
 test-amd64-i386-libvirt-pair 22 guest-migrate/dst_host/src_host fail baseline untested
 test-amd64-amd64-libvirt 15 guest-saverestore.2  fail blocked in 59254
 test-amd64-amd64-libvirt-xsm 15 guest-saverestore.2  fail blocked in 59254
 test-amd64-i386-libvirt  15 guest-saverestore.2  fail blocked in 59254
 test-amd64-i386-libvirt-xsm  15 guest-saverestore.2  fail blocked in 59254
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 59254
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 59254
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 59254

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestore fail  never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestore fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1   fail never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestore fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore fail   never pass

version targeted for testing:
 linux 968f3e374faf41e5e6049399eb7302777a09a1e8
baseline version:
 linux 45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis   59254  2015-07-09 04:20:48 Z  258 days
Failing since        59348  2015-07-10 04:24:05 Z  257 days  185 attempts
Testing same since   86882  2016-03-22 04:24:52 Z    1 days    1 attempts


4762 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm

Re: [Xen-devel] rcu_sched self-detected stall on CPU on kernel 4.4.5/6 in PV DomU

2016-03-23 Thread Steven Haigh


Just wanted to give a bit of a poke about this.

Currently running kernel 4.4.6 in a PV DomU and still occasionally 
getting hangs.


Also stumbled across this that may be related:
https://lkml.org/lkml/2016/2/4/724

My latest hang shows:
[339844.594001] INFO: rcu_sched self-detected stall on CPU
[339844.594001] 1-...: (287557828 ticks this GP) idle=4cb/141/0 softirq=1340383/1340384 fqs=95372371
[339844.594001]  (t=287566692 jiffies g=999283 c=999282 q=1725381)

[339844.594001] Task dump for CPU 1:
[339844.594001] find    R  running task    0  2840   2834 0x0088
[339844.594001]  818d0c00 88007fd03c58 810a625f 0001
[339844.594001]  818d0c00 88007fd03c70 810a8699 0002
[339844.594001]  88007fd03ca0 810d0e5a 88007fd170c0 818d0c00

[339844.594001] Call Trace:
[339844.594001]    [] sched_show_task+0xaf/0x110
[339844.594001]  [] dump_cpu_task+0x39/0x40
[339844.594001]  [] rcu_dump_cpu_stacks+0x8a/0xc0
[339844.594001]  [] rcu_check_callbacks+0x424/0x7a0
[339844.594001]  [] ? account_system_time+0x81/0x110
[339844.594001]  [] ? account_process_tick+0x61/0x160
[339844.594001]  [] ? tick_sched_do_timer+0x30/0x30
[339844.594001]  [] update_process_times+0x39/0x60
[339844.594001]  [] tick_sched_handle.isra.15+0x36/0x50
[339844.594001]  [] tick_sched_timer+0x3d/0x70
[339844.594001]  [] __hrtimer_run_queues+0xf2/0x250
[339844.594001]  [] hrtimer_interrupt+0xa8/0x190
[339844.594001]  [] xen_timer_interrupt+0x2e/0x140
[339844.594001]  [] handle_irq_event_percpu+0x55/0x1e0
[339844.594001]  [] handle_percpu_irq+0x3a/0x50
[339844.594001]  [] generic_handle_irq+0x22/0x30
[339844.594001]  [] __evtchn_fifo_handle_events+0x15f/0x180
[339844.594001]  [] evtchn_fifo_handle_events+0x10/0x20
[339844.594001]  [] __xen_evtchn_do_upcall+0x43/0x80
[339844.594001]  [] xen_evtchn_do_upcall+0x30/0x50
[339844.594001]  [] xen_hvm_callback_vector+0x82/0x90
[339844.594001]    [] ? _raw_spin_lock+0x10/0x3


On 2016-03-19 08:46, Steven Haigh wrote:

On 19/03/2016 8:40 AM, Steven Haigh wrote:

Hi all,

So I'd just like to give this a prod. I'm still getting DomU's randomly
go to 100% CPU usage using kernel 4.4.6 now. It seems running 4.4.6 as
the DomU does not induce these problems.


Sorry - slight correction. Running 4.4.6 as the Dom0 kernel doesn't show
these errors. Only in the DomU.



Latest crash message from today:
INFO: rcu_sched self-detected stall on CPU
0-...: (20869552 ticks this GP) idle=9c9/141/0
softirq=1440865/1440865 fqs=15068
 (t=20874993 jiffies g=1354899 c=1354898 q=798)
rcu_sched kthread starved for 20829030 jiffies! g1354899 c1354898 f0x0
s3 ->state=0x0
Task dump for CPU 0:
kworker/u4:1    R  running task        0  5853      2 0x0088
Workqueue: writeback wb_workfn (flush-202:0)
 818d0c00 88007fc03c58 810a625f 
 818d0c00 88007fc03c70 810a8699 0001
 88007fc03ca0 810d0e5a 88007fc170c0 818d0c00
Call Trace:
   [] sched_show_task+0xaf/0x110
 [] dump_cpu_task+0x39/0x40
 [] rcu_dump_cpu_stacks+0x8a/0xc0
 [] rcu_check_callbacks+0x424/0x7a0
 [] ? account_system_time+0x81/0x110
 [] ? account_process_tick+0x61/0x160
 [] ? tick_sched_do_timer+0x30/0x30
 [] update_process_times+0x39/0x60
 [] tick_sched_handle.isra.15+0x36/0x50
 [] tick_sched_timer+0x3d/0x70
 [] __hrtimer_run_queues+0xf2/0x250
 [] hrtimer_interrupt+0xa8/0x190
 [] xen_timer_interrupt+0x2e/0x140
 [] handle_irq_event_percpu+0x55/0x1e0
 [] handle_percpu_irq+0x3a/0x50
 [] generic_handle_irq+0x22/0x30
 [] __evtchn_fifo_handle_events+0x15f/0x180
 [] evtchn_fifo_handle_events+0x10/0x20
 [] __xen_evtchn_do_upcall+0x43/0x80
 [] xen_evtchn_do_upcall+0x30/0x50
 [] xen_hvm_callback_vector+0x82/0x90
   [] ? queued_spin_lock_slowpath+0x22/0x170
 [] _raw_spin_lock+0x20/0x30
 [] writeback_sb_inodes+0x124/0x560
 [] ? _raw_spin_unlock_irqrestore+0x16/0x20
 [] __writeback_inodes_wb+0x86/0xc0
 [] wb_writeback+0x1d6/0x2d0
 [] wb_workfn+0x284/0x3e0
 [] process_one_work+0x151/0x400
 [] worker_thread+0x11a/0x460
 [] ? __schedule+0x2bf/0x880
 [] ? rescuer_thread+0x2f0/0x2f0
 [] kthread+0xc9/0xe0
 [] ? kthread_park+0x60/0x60
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_park+0x60/0x60

This repeats over and over causing 100% CPU usage - eventually on all
vcpus assigned to the DomU and the only recovery is 'xl destroy'.

I'm currently running Xen 4.6.1 on this system - with kernel 4.4.6 in
both the DomU and Dom0.

On 17/03/2016 8:39 AM, Steven Haigh wrote:

Hi all,

I've noticed the following problem that ends up with a non-responsive PV
DomU using kernel 4.4.5 under heavy disk IO:

INFO: rcu_sched self-detected stall on CPU
0-...: (6759098 ticks this GP) idle=cb3/141/0
softirq=3244615/3244615 fqs=4
 (t=6762321 jiffies g=2275626 c=2275625 q=54)
rcu_sched kthread starved for 6762309 jiffies! g2275626

Re: [Xen-devel] [PATCH v4 11/34] xsplice: Design document

2016-03-23 Thread Konrad Rzeszutek Wilk

On Wed, Mar 23, 2016 at 05:18:39AM -0600, Jan Beulich wrote:
> >>> On 15.03.16 at 18:56,  wrote:
> > +### XEN_SYSCTL_XSPLICE_LIST (2)
> > +
> > +Retrieve an array of abbreviated status and names of payloads that are loaded in the
> > +hypervisor.
> > +
> > +The caller provides:
> > +
> > + * `version`. Initially (on first hypercall) *MUST* be zero.
> > + * `idx` index iterator. On first call *MUST* be zero, subsequent calls varies.
> > + * `nr` the max number of entries to populate.
> > + * `pad` - *MUST* be zero.
> > + * `status` virtual address of where to write `struct xen_xsplice_status`
> > +   structures. Caller *MUST* allocate up to `nr` of them.
> > + * `name` - virtual address of where to write the unique name of the payload.
> > +   Caller *MUST* allocate up to `nr` of them. Each *MUST* be of
> > +   **XEN_XSPLICE_NAME_SIZE** size.
> > + * `len` - virtual address of where to write the length of each unique name
> > +   of the payload. Caller *MUST* allocate up to `nr` of them. Each *MUST* be
> > +   of sizeof(uint32_t) (4 bytes).
> > +
> > +If the hypercall returns an positive number, it is the number (upto `nr`
> > +provided to the hypercall) of the payloads returned, along with `nr` updated
> > +with the number of remaining payloads, `version` updated (it may be the same
> > +across hypercalls - if it varies the data is stale and further calls could
> > +fail). The `status`, `name`, and `len`' are updated at their designed index
> > +value (`idx`) with the returned value of data.
> > +
> > +If the hypercall returns -XEN_E2BIG the `nr` is too big and should be
> > +lowered.
> > +
> > +If the hypercall returns an zero value there are no more payloads.
> > +
> > +Note that due to the asynchronous nature of hypercalls the control domain might
> > +have added or removed a number of payloads making this information stale. It is
> > +the responsibility of the toolstack to use the `version` field to check
> > +between each invocation. if the version differs it should discard the stale
> > +data and start from scratch. It is OK for the toolstack to use the new
> > +`version` field.
> > +
> > +The `struct xen_xsplice_status` structure contains an status of payload which includes:
> > +
> > + * `status` - indicates the current status of the payload:
> > +   * *XSPLICE_STATUS_CHECKED*  (1) loaded and the ELF payload safety checks passed.
> > +   * *XSPLICE_STATUS_APPLIED* (2) loaded, checked, and applied.
> > +   *  No other value is possible.
> > + * `rc` - -XEN_EXX type errors encountered while performing the last
> > +   XSPLICE_ACTION_* operation. The normal values can be zero or -XEN_EAGAIN which
> > +   respectively mean: success or operation in progress. Other values
> > +   imply an error occurred. If there is an error in `rc`, `status` will **NOT**
> > +   have changed.
> > +
> > +The structure is as follow:
> > +
> > +
> > +struct xen_sysctl_xsplice_list {
> > +    uint32_t version;   /* IN/OUT: Initially *MUST* be zero.
> > +                           On subsequent calls reuse value.
> > +                           If varies between calls, we are
> > +                         * getting stale data. */
> > +    uint32_t idx;       /* IN/OUT: Index into array. */
> > +    uint32_t nr;        /* IN: How many status, names, and len
> > +                           should fill out.
> > +                           OUT: How many payloads left. */
> 
> I think there's an ambiguity left in both the description above and
> the comments here: With idx required to be zero upon first
> invocation (which I'm not clear why that is), which parts of the
> three arrays get filled when idx is non-zero: [0, idx) or [nr, nr + idx)?

Here is the new updated design. Hopefully it is more clear?
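
To make the iteration protocol discussed above concrete, here is a rough sketch of how a toolstack might walk the list in batches. Everything in it is an assumption made for illustration: `mock_xsplice_list`, `struct list_op`, the payload names and the version number are hypothetical stand-ins, not the real hypercall interface. It assumes the arrays are filled from slot 0 on every call (one possible answer to the question about which part of the arrays gets written) and that a changed `version` means the caller should start over.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define NAME_SIZE 128   /* stand-in for XEN_XSPLICE_NAME_SIZE */

/* Hypothetical "hypervisor" state, used only by this sketch. */
static const char *mock_payloads[] = { "xen_fix_1", "xen_fix_2", "xen_fix_3" };
static const uint32_t mock_count = 3;
static const uint32_t mock_version = 7;

struct list_op {
    uint32_t version;           /* IN/OUT */
    uint32_t idx;               /* IN: first payload to report */
    uint32_t nr;                /* IN: array capacity, OUT: payloads left */
    char (*name)[NAME_SIZE];    /* OUT: names, written from slot 0 */
    uint32_t *len;              /* OUT: name lengths, written from slot 0 */
};

/* Mock LIST operation: reports payloads idx .. idx+n-1 in slots [0, n). */
static int mock_xsplice_list(struct list_op *op)
{
    uint32_t n = 0;

    if ( op->nr > 1024 )
        return -E2BIG;
    if ( op->idx > mock_count )
        return -EINVAL;

    while ( n < op->nr && op->idx + n < mock_count )
    {
        const char *src = mock_payloads[op->idx + n];
        size_t l = strlen(src) + 1;     /* length includes the NUL */

        memcpy(op->name[n], src, l);
        op->len[n] = (uint32_t)l;
        n++;
    }
    op->nr = mock_count - (op->idx + n);    /* payloads still to fetch */
    op->version = mock_version;
    return (int)n;
}

/* Fetch every payload in batches; returns how many were seen in total. */
static unsigned int list_all(uint32_t batch)
{
    char names[8][NAME_SIZE];
    uint32_t lens[8];
    struct list_op op;
    uint32_t version = 0, idx = 0;
    unsigned int total = 0;
    int done;

    assert(batch > 0 && batch <= 8);

    for ( ;; )
    {
        op.version = version;
        op.idx = idx;
        op.nr = batch;
        op.name = names;
        op.len = lens;

        done = mock_xsplice_list(&op);
        if ( done <= 0 )
            break;              /* zero: nothing left; negative: error */

        if ( idx != 0 && op.version != version )
        {
            /* The list changed under us: discard and start from scratch. */
            version = 0;
            idx = 0;
            total = 0;
            continue;
        }

        version = op.version;
        idx += (uint32_t)done;
        total += (unsigned int)done;
        if ( op.nr == 0 )
            break;
    }
    return total;
}
```

Calling `list_all()` with a batch size smaller than the number of payloads exercises the multi-call path; a real caller would consume `names` and `lens` after each successful call rather than just counting.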

From ccd6f3521241ec56158e58bf9e26388b573469b3 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Mon, 14 Sep 2015 09:05:11 -0400
Subject: [PATCH] xsplice: Design document

A mechanism is required to binarily patch the running hypervisor with new
opcodes that have come about primarily due to security updates.

This document describes the design of the API that would allow us to
upload binary patches to the hypervisor.

This document has been shaped by the input from:
  Martin Pohlack 
  Jan Beulich 

Thank you!

Input-from: Martin Pohlack 
Input-from: Jan Beulich 
Signed-off-by: Konrad Rzeszutek Wilk 
Signed-off-by: Ross Lagerwall 

---
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Keir Fraser 
Cc: Tim Deegan 

v1-2:

Re: [Xen-devel] [PATCH v4 12/34] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op

2016-03-23 Thread Konrad Rzeszutek Wilk

On Wed, Mar 23, 2016 at 07:51:29AM -0600, Jan Beulich wrote:
> >>> On 15.03.16 at 18:56,  wrote:
> > --- a/xen/common/Kconfig
> > +++ b/xen/common/Kconfig
> > @@ -168,4 +168,15 @@ config SCHED_DEFAULT
> >  
> >  endmenu
> >  
> > +# Enable/Disable xsplice support
> > +config XSPLICE
> > +   bool "xSplice live patching support"
> > +   default y
> 
> Isn't it a little early in the series to default this to on?

I am ambitious!
> 
> And then of course the EXPERT question comes up again. No
> matter that IanC is no longer around to help with the
> argumentation, the point he has been making about too many
> flavors ending up in the wild continues to apply.

'too many flavors'? As in different versions of Xen with or without
these options enabled? 

.. snip..
> 
> > +static int find_payload(const xen_xsplice_name_t *name, struct payload **f)
> 
..snip..
> > +return -EFAULT;
> > +
> > +spin_lock_recursive(_lock);
> 
> Why do you need a recursive lock here? I think something like this
> should be reasoned about in the commit message.

The earlier version used an extra parameter (locked) to differentiate
whether to take a lock or not as the caller could have taken it.

Andrew didn't like it particularly and asked it to be recursive
so that we don't by accident mess up the locking.

.. snip..
> > +static int xsplice_upload(xen_sysctl_xsplice_upload_t *upload)

.. snip..
> > + out:
> > +vfree(raw_data);
> 
> By here you allocated and filled raw_data. And now you
> unconditionally free it. What is that good for?

Nothing. It was added as a placeholder - as the patch
titled "xsplice: Implement payload loading" is actually doing
useful things. I've moved the operations around raw_data into that
patch.

> > +static int xsplice_list(xen_sysctl_xsplice_list_t *list)
> > +{
> > +xen_xsplice_status_t status;
> > +struct payload *data;
> > +unsigned int idx = 0, i = 0;
> > +int rc = 0;
> > +
> > +if ( list->nr > 1024 )
> > +return -E2BIG;
> > +
> > +if ( list->pad != 0 )
> > +return -EINVAL;
> > +
> > +if ( !guest_handle_okay(list->status, sizeof(status) * list->nr) ||
> > + !guest_handle_okay(list->name, XEN_XSPLICE_NAME_SIZE * list->nr) ||
> > + !guest_handle_okay(list->len, sizeof(uint32_t) * list->nr) )
> 
> guest_handle_okay() already takes into account the element size,
> i.e. it's only the middle one which needs to do any multiplication.
> 
> > +return -EINVAL;
> > +
> > +spin_lock_recursive(_lock);
> > +if ( list->idx > payload_cnt || !list->nr )
> 
> The list->nr check could move up outside the locked region (e.g.
> merge with the pad field check).

I reworked this a bit. I made it so that if list->nr is 0 we would
populate list->nr=payload_count, list->version=payload_version.


> 
> > +{
> > +spin_unlock_recursive(_lock);
> > +return -EINVAL;
> > +}
> > +
> > +list_for_each_entry( data, _list, list )
> 
> Aren't you lacking a list->version check prior to entering this loop
> (which would then mean you don't need to store it below, but only
> on the error path from that check)?

No. The toolstack has no idea of what the right version is on the
first invocation. Which is OK since it gets fresh data (it is
its first invocation).

On subsequent invocations we gleefully populate up to
min(payload_cnt, ->nr) of data even if the version the toolstack
provided is different. The toolstack will have to decide to throw away
the data and retry the hypercall; or print it out as is.

> 
> > +{
> > +uint32_t len;
> > +
> > +if ( list->idx > i++ )
> > +continue;
> > +
> > +status.state = data->state;
> > +status.rc = data->rc;
> > +len = strlen(data->name);
> > +
> > +/* N.B. 'idx' != 'i'. */
> > +if ( __copy_to_guest_offset(list->name, idx * XEN_XSPLICE_NAME_SIZE,
> > +data->name, len) ||
> > + __copy_to_guest_offset(list->len, idx, &len, 1) ||
> 
> You're not copying the NUL terminator here, which makes the result
> more cumbersome to consume by the caller. Perhaps
> XEN_XSPLICE_NAME_SIZE should remain to be 128 (other than
> suggested above), but be specified to include the terminator?

Yes. Fixed that. It also needed a minor change in:
"libxc: Implementation of XEN_XSPLICE_op in libxc" to account for
strlen. (+1 to its result)
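
The terminator fix can be illustrated with a small standalone sketch. The helper `copy_name` and the `NAME_SIZE` constant are hypothetical stand-ins (the real code copies into a guest buffer, which is not modelled here); the point is only that the reported length is strlen plus one, so the consumer always receives a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NAME_SIZE 128   /* stand-in for XEN_XSPLICE_NAME_SIZE */

/*
 * Copy `name` into a fixed-size slot, terminator included.
 * Returns the number of bytes written (strlen + 1), or 0 if the name
 * plus its terminator does not fit in NAME_SIZE bytes.
 */
static uint32_t copy_name(char *slot, const char *name)
{
    size_t len;

    assert(slot != NULL && name != NULL);

    len = strlen(name) + 1;     /* +1 accounts for the NUL */
    if ( len > NAME_SIZE )
        return 0;

    memcpy(slot, name, len);
    return (uint32_t)len;
}
```

With this convention the size limit is specified to include the terminator, so the longest usable name is NAME_SIZE - 1 characters.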


Here is the newly minted patch with your suggestions hopefully
implemented to your liking!

From 40f0e9fdb50935d4d3df608950313051a28f12b9 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Mon, 25 Jan 2016 10:51:22 -0500
Subject: [PATCH] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op

The implementation does not actually do any patching.

It just adds the framework for doing the hypercalls,
keeping track of ELF payloads, and the basic operations:
 - query which payloads exist,
 - query for

Re: [Xen-devel] [PATCH v4 07/34] arm/x86: Use struct virtual_region to do bug, symbol, and (x86) exception tables

2016-03-23 Thread Konrad Rzeszutek Wilk

> >> > +#ifdef CONFIG_X86
> >> > +#include 
> >> > +#endif
> >> 
> >> Why?
> > 
> > Otherwise the compilation will fail on ARM as they do not have exceptions
> > (and no asm/uaccess.h file)
> 
> Well, the question was for the #include, not the #ifdef.

Ah, yes. And with the 'ex' being pointers it matters not.

> 
> > --- a/xen/common/symbols.c
> > +++ b/xen/common/symbols.c
> > @@ -17,6 +17,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  
> > @@ -97,8 +98,7 @@ static unsigned int get_symbol_offset(unsigned long pos)
> >  
> >  bool_t is_active_kernel_text(unsigned long addr)
> >  {
> > -return (is_kernel_text(addr) ||
> > -(system_state < SYS_STATE_active && is_kernel_inittext(addr)));
> > +return !!search_virtual_regions(addr);
> 
> search_virtual_regions() doesn't sound like it would be looking for
> text addresses only.

I am not sure what would be a better name - as it
(search_virtual_regions) is used by three other callers?

search_for_addr? 

.. snip..
> > +static void __unregister_virtual_region(struct virtual_region *r)
> 
> > +{
> > +unsigned long flags;
> > +
> > +spin_lock_irqsave(_region_lock, flags);
> > +list_del_rcu(>list);
> > +spin_unlock_irqrestore(_region_lock, flags);
> > +/*
> > + * We do not need to invoke call_rcu.
> > + *
> > + * This is due to the fact that on the deletion we have made sure
> > + * to use spinlocks (to guard against somebody else calling
> > + * unregister_virtual_region) and list_deletion spiced with an memory
> > + * barrier - which will flush out the cache lines in other CPUs.
> 
> I don't think barriers do any kind of cache flushing on remote CPUs
> (not even on the local one).

I am not sure what I had been thinking. The only thing it does is a
memory barrier.

.. snip..

I believe I've addressed the review comments you had:
From 4a2690ba815db7edd5fe075c8d7e9e2ac62a0020 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Thu, 10 Mar 2016 16:35:50 -0500
Subject: [PATCH] arm/x86: Use struct virtual_region to do bug, symbol, and
 (x86) exception tables lookup.

During execution of the hypervisor we have two regions of
executable code - stext -> _etext, and _sinittext -> _einitext.

The latter is not needed after bootup.

We also have various built-in macros and functions to search
in between those two swaths depending on the state of the system.

That is either for bug_frames, exceptions (x86) or symbol
names for the instruction.

With xSplice in the picture - we need a mechanism for new payloads
to be searched as well for all of this.

Originally we had extra 'if (xsplice)...' but that gets
a bit tiring and does not hook up nicely.

This 'struct virtual_region' and virtual_region_list provide a
mechanism to search for the bug_frames, exception table,
and symbol names entries without having various calls in
other sub-components in the system.

Code which wishes to participate in bug_frames and exception table
entries search has to only use two public APIs:
 - register_virtual_region
 - unregister_virtual_region

to let the core code know.

If the ->lookup_symbol is not set then the default internal symbol lookup
mechanism is used.
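
As a rough illustration of the register/unregister/search idea described above, the following standalone sketch keeps regions on a plain singly linked list. It is deliberately simplified and not the patch's implementation: `struct region` and the three functions are hypothetical names, and there is no locking or RCU, both of which the real code relies on.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct virtual_region: covers [start, end). */
struct region {
    unsigned long start, end;
    struct region *next;
};

static struct region *region_list;

/* Add a region to the front of the list. */
static void register_region(struct region *r)
{
    assert(r->start < r->end);
    r->next = region_list;
    region_list = r;
}

/* Unlink a region; does nothing if it was never registered. */
static void unregister_region(struct region *r)
{
    struct region **pp;

    for ( pp = &region_list; *pp; pp = &(*pp)->next )
    {
        if ( *pp == r )
        {
            *pp = r->next;
            r->next = NULL;
            return;
        }
    }
}

/* Return the region containing addr, or NULL if no region covers it. */
static const struct region *search_regions(unsigned long addr)
{
    const struct region *r;

    for ( r = region_list; r; r = r->next )
        if ( addr >= r->start && addr < r->end )
            return r;

    return NULL;
}
```

A caller such as an address-is-text check then reduces to testing whether `search_regions(addr)` returns non-NULL, and dropping the init region after boot is a single unregister call.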

Suggested-by: Andrew Cooper 
Signed-off-by: Konrad Rzeszutek Wilk 

---
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Keir Fraser 
Cc: Jan Beulich 
Cc: Andrew Cooper 

v4: New patch.
v5:
 - Rename to virtual_region.
 - Ditch the 'skip' function.
 - Remove the _stext.
 - Use RCU lists.
 - Add a search function.
 - Remove extern, add rcu_read_lock. remove __ from name.
---
---
 xen/arch/arm/setup.c |   4 +
 xen/arch/arm/traps.c |  39 ++
 xen/arch/x86/extable.c   |  12 ++-
 xen/arch/x86/setup.c |   6 ++
 xen/arch/x86/traps.c |  40 ++
 xen/common/Makefile  |   1 +
 xen/common/symbols.c |  11 ++-
 xen/common/virtual_region.c  | 160 +++
 xen/include/xen/symbols.h|   9 +++
 xen/include/xen/virtual_region.h |  48 
 10 files changed, 293 insertions(+), 37 deletions(-)
 create mode 100644 xen/common/virtual_region.c
 create mode 100644 xen/include/xen/virtual_region.h

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 6d205a9..09ff1ea 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -860,6 +861,9 @@ void __init start_xen(unsigned long boot_phys_offset,
 
 system_state = SYS_STATE_active;
 
+/* Must be done past setting system_state. */
+unregister_init_virtual_region();
+
 domain_unpause_by_systemcontroller(dom0);
 
 /* Switch on to the dynamically

Re: [Xen-devel] [PATCH v4 04/34] HYPERCALL_version_op. New hypercall mirroring XENVER_ but sane.

2016-03-23 Thread Konrad Rzeszutek Wilk

. fixed all of those ..
> > --- a/xen/xsm/flask/hooks.c
> > +++ b/xen/xsm/flask/hooks.c
> > @@ -1658,6 +1658,40 @@ static int flask_xen_version (uint32_t op)
> >  }
> >  }
> >  
> > +static int flask_version_op (uint32_t op)
> > +{
> > +u32 dsid = domain_sid(current->domain);
> > +
> > +switch ( op )
> > +{
> > +case XEN_VERSION_version:
> > +case XEN_VERSION_platform_parameters:
> > +case XEN_VERSION_get_features:
> > +/* These MUST always be accessible to any guest by default. */
> > +return 0;
> 
> Perhaps these would better be taken care of in xsm_version_op()?

It would be the oddball one.
All of the xsm_**() in the header file (include/xsm/xsm.h) call the function
pointers.

> (That consideration then also applies to the other patch of course.)

Here is the updated patch:

From 10ecad7469a5ba12895418aa2d035def852654e7 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk 
Date: Tue, 22 Mar 2016 16:53:19 -0400
Subject: [PATCH] HYPERCALL_version_op. New hypercall mirroring XENVER_ but
 sane.

This hypercall mirrors the XENVER_ in that it has similar functionality.
However it is designed differently:
 - No compat layer. The data structures are the same size on 32
   as on 64-bit.
 - The hypercall accepts three arguments - the command, pointer to
   a buffer, and the length of the buffer.
 - Each sub-op can be "probed" for size: if the buffer is NULL, it
   returns the size of the buffer that will be needed.
 - Subops can complete even if the buffer is too small - truncated
   data will be filled in and the hypercall will return -ENOBUFS.
 - VERSION_commandline, VERSION_changeset are privileged.
 - There is no XENVER_compile_info equivalent.
 - The hypercall can return -EPERM and toolstack/OSes are expected
   to deal with it. However there are three subops: XEN_VERSION_version,
   XEN_VERSION_platform_parameters and XEN_VERSION_get_features
   that will always return a value as guests cannot survive without them.
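
The probe and truncation rules listed above can be modelled in a few lines
of C. This is a sketch of the contract only, not the code in
xen/common/kernel.c: version_subop_copy() is a hypothetical helper, and
returning the number of bytes copied on success is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of one sub-op copying 'src' (src_len bytes) to the
 * caller's buffer:
 *  - buf == NULL: probe, return the size that will be needed;
 *  - len < src_len: fill in what fits, return -ENOBUFS;
 *  - otherwise: copy everything, return the bytes copied (assumption).
 */
static long version_subop_copy(void *buf, size_t len,
                               const void *src, size_t src_len)
{
    if ( buf == NULL )
        return (long)src_len;

    if ( len < src_len )
    {
        memcpy(buf, src, len);     /* truncated data is still filled in */
        return -ENOBUFS;
    }

    memcpy(buf, src, src_len);
    return (long)src_len;
}
```

A caller would typically probe with a NULL buffer first, allocate the
returned size, and then repeat the call, treating -ENOBUFS as "data was
truncated" rather than as a hard failure.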

While combining some of the common code between XENVER_ and VERSION_,
take the liberty of moving pae_extended_cr3 into the x86 area.

Suggested-by: Andrew Cooper 
Signed-off-by: Konrad Rzeszutek Wilk 
Acked-by: Daniel De Graaf  [XSM bits]

---
Cc: Daniel De Graaf 
Cc: Ian Jackson 
Cc: Stefano Stabellini 
Cc: Wei Liu 
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: Keir Fraser 
Cc: Jan Beulich 
Cc: Andrew Cooper 

v1-v3: Was not part of the series.
v4: New posting.
v5: Remove memset and use {}. Tweak copy_to_guest and capabilities_info,
add ASSERT(sz) per Andrew's review. Add cached=1 back in.
Per Jan, s/VERSION_OP/VERSION/, squash size check with do_version_op,
update the comments. Dropped Andrew's Review-by. Ate newlines.
Added initcall to guard against garbage being set in cached data.
Folded code populating cache in __init. s/char/char[]/ in public.h
---
---
 tools/flask/policy/policy/modules/xen/xen.te |   7 +-
 xen/arch/arm/traps.c |   1 +
 xen/arch/x86/hvm/hvm.c   |   1 +
 xen/arch/x86/x86_64/compat/entry.S   |   2 +
 xen/arch/x86/x86_64/entry.S  |   2 +
 xen/common/compat/kernel.c   |   2 +
 xen/common/kernel.c  | 213 ++-
 xen/include/public/arch-arm.h|   2 +
 xen/include/public/version.h |  70 -
 xen/include/public/xen.h |   1 +
 xen/include/xen/hypercall.h  |   4 +
 xen/include/xsm/dummy.h  |  21 +++
 xen/include/xsm/xsm.h|   6 +
 xen/xsm/dummy.c  |   1 +
 xen/xsm/flask/hooks.c|  35 +
 xen/xsm/flask/policy/access_vectors  |  21 ++-
 16 files changed, 346 insertions(+), 43 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index e174e48..7e69ce9 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -74,11 +74,12 @@ allow dom0_t xen_t:xen2 {
 get_symbol
 };
 
-# Allow dom0 to use all XENVER_ subops that have checks.
+# Allow dom0 to use all XENVER_ subops and VERSION subops that have checks.
 # Note that dom0 is part of domain_type so this has duplicates.
 allow dom0_t xen_t:version {
 xen_extraversion xen_compile_info xen_capabilities
 xen_changeset xen_pagesize xen_guest_handle xen_commandline
+extraversion capabilities changeset pagesize guest_handle commandline
 };
 
 allow dom0_t xen_t:mmu memorymap;
@@ -145,10 +146,12 @@ if (guest_writeconsole) {
 # pmu_ctrl is for)
 allow

[Xen-devel] [linux-4.1 test] 87031: regressions - FAIL

2016-03-23 Thread osstest service owner

flight 87031 linux-4.1 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/87031/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumpuserxen         6 xen-build                 fail REGR. vs. 66399
 build-i386-rumpuserxen          6 xen-build                 fail REGR. vs. 66399
 test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeat fail REGR. vs. 66399

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-credit2   15 guest-start/debian.repeat fail in 86830 pass in 87031
 test-armhf-armhf-xl           15 guest-start/debian.repeat fail in 86830 pass in 87031
 test-armhf-armhf-xl-multivcpu 15 guest-start/debian.repeat fail pass in 86654
 test-armhf-armhf-xl-rtds      11 guest-start               fail pass in 86830
 test-armhf-armhf-xl-xsm       15 guest-start/debian.repeat fail pass in 86830

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds             15 guest-start/debian.repeat fail in 86830 like 66399
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop                fail like 66399
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop                fail like 66399
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop                fail like 66399
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop                fail like 66399
 test-armhf-armhf-xl-vhd               9 debian-di-install         fail like 66399

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386    1 build-check(1)            blocked n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)            blocked n/a
 test-armhf-armhf-xl-rtds           13 saverestore-support-check fail in 86830 never pass
 test-armhf-armhf-xl-rtds           12 migrate-support-check     fail in 86830 never pass
 test-amd64-amd64-xl-pvh-intel      14 guest-saverestore         fail never pass
 test-amd64-amd64-xl-pvh-amd        11 guest-start               fail never pass
 test-amd64-amd64-libvirt           12 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-qcow2     11 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-qcow2     13 guest-saverestore         fail never pass
 test-armhf-armhf-libvirt-xsm       12 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-xsm       14 guest-saverestore         fail never pass
 test-amd64-amd64-libvirt-xsm       12 migrate-support-check     fail never pass
 test-amd64-i386-libvirt            12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck     12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck     13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2        13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2        12 migrate-support-check     fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl                12 migrate-support-check     fail never pass
 test-armhf-armhf-xl                13 saverestore-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd  16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd       11 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt           14 guest-saverestore         fail never pass
 test-armhf-armhf-libvirt           12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale        12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale        13 saverestore-support-check fail never pass
 test-amd64-i386-libvirt-xsm        12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-xsm            13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-xsm            12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-multivcpu      13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu      12 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-raw        9 debian-di-install         fail never pass

version targeted for testing:
 linux                7f30737678023b5becaf0e2e012665f71b886a7d
baseline version:
 linux                07cc49f66973f49a391c91bf4b158fa0f2562ca8

Last test of basis    66399  2015-12-15 18:20:39 Z   99 days
Failing since         78925  2016-01-24 13:50:39 Z   59 days   64 attempts
Testing same since    86587  2016-03-18 16:11:01 Z    5 days    6 attempts


494 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf

Re: [Xen-devel] [ovmf test] 87014: regressions - FAIL

2016-03-23 Thread Jim Fehlig

On 03/23/2016 09:27 AM, osstest service owner wrote:
> flight 87014 ovmf real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/87014/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 65543
>  test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 65543

I've been testing Anthony's series to 'Load BIOS via toolstack instead of been
embedded in hvmloader' [0] in an attempt to use the distro ovmf instead of
"ovmf-xen". openSUSE Factory's ovmf package [1] is updated regularly with an
upstream git snapshot and I noticed VMs did not boot when using it. I bisected
and found ovmf commit 7b0a1ead to be the culprit:

commit 7b0a1ead7d2efa7f9eae4c2b254ff154d9c5f74f
Author: Ruiyu Ni 
Date:   Wed Feb 17 18:06:36 2016 +0800

MdeModuelPkg/PciBus: Return AddrTranslationOffset in GetBarAttributes

Some platform doesn't use CPU(HOST)/Device 1:1 mapping for PCI Bus.
But PCI IO doesn't have interface to tell caller (device driver)
whether the address returned by GetBarAttributes() is HOST address
or device address.
UEFI Spec 2.6 addresses this issue by clarifying the address returned
is HOST address and caller can use AddrTranslationOffset to calculate
the device address.

I'm sure it's biting this test too.

BTW, any pointers on getting debug info from
hvmloader+ovmf+qemu+other-components-involved? With the broken ovmf, I got
nothing on the VM serial port, an empty, black vfb, nothing beyond "Invoking
OVMF..." in xl dmesg, and no errors in the qemu log file. Beyond bisecting, I
wasn't sure how to debug :-). I tried adding

device_model_args=[ "-debugcon file:debug.log", "-global 
isa-debugcon.iobase=0x402"]

to the VM config, but qemu failed to start with "-debugcon file:debug.log:
invalid option"

Regards,
Jim

[0] http://lists.xenproject.org/archives/html/xen-devel/2016-03/msg01626.html
[1] https://build.opensuse.org/package/show/Virtualization/ovmf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Severe guest disk corruption with device_model_stubdomain_override=1...

2016-03-23 Thread Sarah Newman

On 03/22/2016 11:03 PM, Sarah Newman wrote:
> And nested xen.
> 
> CPU: AMD Opteron 2352
> Outer configuration: Xen4CentOS 6 xen 4.6.1-2.el6, linux 3.18.25-18.el6.x86_64
> Inner configuration: Xen4CentOS 6 xen 4.6.1-2.el6, linux 3.18.25-19.el6.x86_64
> Inner xen command line:  cpuinfo loglvl=all guest_loglvl=error 
> dom0_mem=512M,max:512M com1=115200,8n1 console=com1 dom0_max_vcpus=1 
> dom0_vcpus_pin=true
> Inner linux command line: ro root=LABEL=DISK rootflags=barrier=0 
> swiotlb=32768 console=hvc0

> xen_platform_pci seems to be ignored with device_model_stubdomain_override=1. 
> So I don't think I can test what happens with the 3.18.25-19.el6.x86_64
> kernel, no nested xen, and non-paravirtual block devices.

The patch submitted in 
http://lists.xenproject.org/archives/html/xen-devel/2016-03/msg03080.html 
appears to fix the issue.

--Sarah



[Xen-devel] [PATCH] Mini-OS: netfront: fix off-by-one error introduced in 7c8f3483

2016-03-23 Thread Sarah Newman

7c8f3483 introduced a break within a loop in netfront.c such that
cons and nr_consumed were no longer always being incremented. The
offset at cons will be processed multiple times with the break in
place.

Remove the break and re-add the "!some" check to the loop condition
for HAVE_LIBC.
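
The underlying C behaviour can be shown in isolation. The two helpers
below are a toy model with illustrative names, not code from netfront.c:
a 'break' leaves a for loop before its increment expression runs, whereas
a flag tested in the loop condition lets the increment happen first.

```c
#include <assert.h>

/* Buggy shape: 'break' exits before 'nr_consumed++, cons++' is evaluated. */
static unsigned int consume_one_with_break(unsigned int cons, unsigned int rp,
                                           unsigned int *nr_consumed)
{
    for ( ; cons != rp; (*nr_consumed)++, cons++ )
    {
        /* ... handle the response at index 'cons' ... */
        break;
    }
    return cons;   /* still points at the response just handled */
}

/* Fixed shape: the stop flag is checked in the loop condition instead. */
static unsigned int consume_one_with_flag(unsigned int cons, unsigned int rp,
                                          unsigned int *nr_consumed)
{
    int some = 0;

    for ( ; (cons != rp) && !some; (*nr_consumed)++, cons++ )
    {
        /* ... handle the response at index 'cons' ... */
        some = 1;
    }
    return cons;   /* advanced past the response just handled */
}
```

With the first helper the caller sees the same index again on the next
pass, which is the repeated processing described above.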

Signed-off-by: Sarah Newman 
---
 netfront.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/netfront.c b/netfront.c
index 0eca5b5..557e8c4 100644
--- a/netfront.c
+++ b/netfront.c
@@ -108,8 +108,10 @@ moretodo:
 
 #ifdef HAVE_LIBC
 some = 0;
-#endif
+for (cons = dev->rx.rsp_cons; (cons != rp) && !some; nr_consumed++, cons++)
+#else
 for (cons = dev->rx.rsp_cons; cons != rp; nr_consumed++, cons++)
+#endif
 {
 struct net_buffer* buf;
 unsigned char* page;
@@ -135,7 +137,6 @@ moretodo:
memcpy(dev->data, page+rx->offset, len);
dev->rlen = len;
some = 1;
-break;
} else
 #endif
dev->netif_rx(page+rx->offset,rx->status);
-- 
1.9.1



Re: [Xen-devel] [PATCH v2 3/6] x86/mtrr: Fix Xorg crashes in Qemu sessions

2016-03-23 Thread Toshi Kani

On Wed, 2016-03-23 at 09:53 -0600, Toshi Kani wrote:
> On Wed, 2016-03-23 at 09:44 +0100, Borislav Petkov wrote:
> > On Tue, Mar 22, 2016 at 03:53:30PM -0600, Toshi Kani wrote:
> > > Yes. I had to remove this number since checkpatch complained that I
> > > needed to quote the whole patch title again.  I will ignore this
> > > checkpatch error and add this commit number here.
> > 
> > Actually, checkpatch is right. We do quote the commit IDs *together*
> > with their names so that the reader knows which commit the text is
> > talking about.
> 
> OK, I will use [1] to refer to this patch.  This patch is fully quoted at
> the top of this changelog, and it'd be verbose to repeat this full quote
> every time I refer to it...

I ended up using "the above-mentioned patch" in v3.

Thanks,
-Toshi


[Xen-devel] [xen-unstable test] 87027: regressions - FAIL

2016-03-23 Thread osstest service owner

flight 87027 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/87027/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-win7-amd64  15 guest-localmigrate/x10 fail REGR. vs. 86491
 test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail in 86747 REGR. vs. 86491

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-win7-amd64  12 guest-saverestore         fail in 86747 pass in 87027
 test-armhf-armhf-xl-credit2          15 guest-start/debian.repeat fail in 86747 pass in 87027
 test-amd64-i386-xl-qemuu-win7-amd64  13 guest-localmigrate        fail in 86901 pass in 87027
 test-armhf-armhf-xl-multivcpu         6 xen-boot                  fail in 86901 pass in 87027
 test-amd64-amd64-xl-qemut-win7-amd64 12 guest-saverestore         fail pass in 86747
 test-amd64-i386-xl-qemut-win7-amd64  12 guest-saverestore         fail pass in 86901

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds             15 guest-start/debian.repeat fail in 86747 blocked in 86491
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop                fail in 86901 like 86491
 build-amd64-rumpuserxen               6 xen-build                 fail like 86491
 build-i386-rumpuserxen                6 xen-build                 fail like 86491
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop                fail like 86491

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386    1 build-check(1)            blocked n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)            blocked n/a
 test-amd64-amd64-xl-pvh-amd        11 guest-start               fail never pass
 test-amd64-amd64-xl-pvh-intel      11 guest-start               fail never pass
 test-armhf-armhf-libvirt-xsm       12 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-xsm       14 guest-saverestore         fail never pass
 test-amd64-amd64-qemuu-nested-amd  16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-i386-libvirt-xsm        12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale        12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-arndale        13 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-xsm       12 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt           12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck     12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-cubietruck     13 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu      13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu      12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-xsm            13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-xsm            12 migrate-support-check     fail never pass
 test-amd64-amd64-libvirt-vhd       11 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt           14 guest-saverestore         fail never pass
 test-armhf-armhf-libvirt           12 migrate-support-check     fail never pass
 test-amd64-i386-libvirt            12 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-qcow2     11 migrate-support-check     fail never pass
 test-armhf-armhf-libvirt-qcow2     13 guest-saverestore         fail never pass
 test-armhf-armhf-xl                12 migrate-support-check     fail never pass
 test-armhf-armhf-xl                13 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw       13 guest-saverestore         fail never pass
 test-armhf-armhf-libvirt-raw       11 migrate-support-check     fail never pass
 test-armhf-armhf-xl-rtds           13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds           12 migrate-support-check     fail never pass
 test-armhf-armhf-xl-credit2        13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit2        12 migrate-support-check     fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd            11 migrate-support-check     fail never pass
 test-armhf-armhf-xl-vhd            12 saverestore-support-check fail never pass

version targeted for testing:
 xen  829e03ca0ef757350546df8546a6575ca3d0e8da
baseline version:
 xen  a6f2cdb633bf519244a16674031b8034b581ba7f

Last test of basis    86491  2016-03-17 15:24:59 Z    6 days
Failing since         86560  2016-03-18 10:56:34 Z    5 days    6 attempts
Testing same since    86645  2016-03-19 12:55:56 Z    4 days    5 attempts


People who touched revisions under test:
  Andrew Cooper 
  Chunyan Liu 
  Dagaen Golomb 
  David Vrabel

[Xen-devel] [PATCH v3 5/7] x86/mtrr: Fix PAT init handling when MTRR is disabled

2016-03-23 Thread Toshi Kani

get_mtrr_state() calls pat_init() on BSP even if MTRR is disabled.
This results in calling pat_init() on BSP only since APs do not call
pat_init() when MTRR is disabled.  This inconsistency between BSP
and APs leads to undefined behavior.

Make BSP's calling condition to pat_init() consistent with AP's,
mtrr_ap_init() and mtrr_aps_init().

Signed-off-by: Toshi Kani 
Cc: Borislav Petkov 
Cc: Luis R. Rodriguez 
Cc: Juergen Gross 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
---
 arch/x86/kernel/cpu/mtrr/generic.c |   24 ++--
 arch/x86/kernel/cpu/mtrr/main.c|3 +++
 arch/x86/kernel/cpu/mtrr/mtrr.h|1 +
 3 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index fcbcb2f..a9d2e54 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -444,11 +444,24 @@ static void __init print_mtrr_state(void)
pr_debug("TOM2: %016llx aka %lldM\n", mtrr_tom2, mtrr_tom2>>20);
 }
 
+/* PAT setup for BP. We need to go through sync steps here */
+void __init mtrr_bp_pat_init(void)
+{
+   unsigned long flags;
+
+   local_irq_save(flags);
+   prepare_set();
+
+   pat_init();
+
+   post_set();
+   local_irq_restore(flags);
+}
+
 /* Grab all of the MTRR state for this CPU into *state */
 bool __init get_mtrr_state(void)
 {
struct mtrr_var_range *vrs;
-   unsigned long flags;
unsigned lo, dummy;
unsigned int i;
 
@@ -481,15 +494,6 @@ bool __init get_mtrr_state(void)
 
mtrr_state_set = 1;
 
-   /* PAT setup for BP. We need to go through sync steps here */
-   local_irq_save(flags);
-   prepare_set();
-
-   pat_init();
-
-   post_set();
-   local_irq_restore(flags);
-
return !!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED);
 }
 
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 8b1947b..7d393ec 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -752,6 +752,9 @@ void __init mtrr_bp_init(void)
/* BIOS may override */
__mtrr_enabled = get_mtrr_state();
 
+   if (mtrr_enabled())
+   mtrr_bp_pat_init();
+
if (mtrr_cleanup(phys_addr)) {
changed_by_mtrr_cleanup = 1;
mtrr_if->set_all();
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h
index 951884d..6c7ced0 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.h
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.h
@@ -52,6 +52,7 @@ void set_mtrr_prepare_save(struct set_mtrr_context *ctxt);
 void fill_mtrr_var_range(unsigned int index,
u32 base_lo, u32 base_hi, u32 mask_lo, u32 mask_hi);
 bool get_mtrr_state(void);
+void mtrr_bp_pat_init(void);
 
 extern void set_mtrr_ops(const struct mtrr_ops *ops);
 


[Xen-devel] [PATCH v3 1/7] x86/mm/pat: Add support of non-default PAT MSR setting

2016-03-23 Thread Toshi Kani

In preparation for fixing a regression caused by 'commit 9cd25aac1f44
("x86/mm/pat: Emulate PAT when it is disabled")', PAT needs to
support the case where the PAT MSR is initialized with a non-default
value.

When pat_init() is called and PAT is disabled, it initializes
PAT table with the BIOS default value. Xen, however, sets PAT MSR
with a non-default value to enable WC. This causes inconsistency
between PAT table and PAT MSR when PAT is set to disable on Xen.

Change pat_init() to handle the PAT disable cases properly.  Add
init_cache_modes() to handle two cases when PAT is set to disable.
 1. CPU supports PAT: Set PAT table to be consistent with PAT MSR.
 2. CPU does not support PAT: Set PAT table to be consistent with
PWT and PCD bits in a PTE.
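
The two cases above amount to a small selection rule, sketched here in
user-space C. pat_for_disabled() and emulated_pat() are hypothetical
helpers, and 'msr' stands in for the value rdmsrl(MSR_IA32_CR_PAT, ...)
would return; the memory-type encodings are the architectural ones.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural PAT memory-type encodings. */
enum { PAT_UC = 0, PAT_WC = 1, PAT_WT = 4, PAT_WP = 5, PAT_WB = 6,
       PAT_UC_MINUS = 7 };

/* Place memory type 'type' into PAT entry 'idx' (one byte per entry). */
#define PAT(idx, type) ((uint64_t)(type) << ((idx) * 8))

/* Case 2: table that corresponds to the PWT and PCD bits in a PTE. */
static uint64_t emulated_pat(void)
{
    return PAT(0, PAT_WB) | PAT(1, PAT_WT) |
           PAT(2, PAT_UC_MINUS) | PAT(3, PAT_UC) |
           PAT(4, PAT_WB) | PAT(5, PAT_WT) |
           PAT(6, PAT_UC_MINUS) | PAT(7, PAT_UC);
}

/*
 * Hypothetical helper: pick the value the PAT table should mirror when
 * PAT is set to disable. Case 1 uses the MSR contents; an MSR value of 0
 * is treated as invalid and falls through to case 2.
 */
static uint64_t pat_for_disabled(int cpu_has_pat, uint64_t msr)
{
    uint64_t pat = cpu_has_pat ? msr : 0;

    return pat ? pat : emulated_pat();
}
```

On Xen the first case would return whatever non-default value the
hypervisor programmed, so the table and the MSR stay consistent.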

Note, __init_cache_modes(), renamed from pat_init_cache_modes(),
will be changed to a static function in a later patch.

Signed-off-by: Toshi Kani 
Cc: Borislav Petkov 
Cc: Luis R. Rodriguez 
Cc: Juergen Gross 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
---
 arch/x86/include/asm/pat.h |2 +
 arch/x86/mm/pat.c  |   73 
 arch/x86/xen/enlighten.c   |2 +
 3 files changed, 55 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h
index ca6c228..97ea55b 100644
--- a/arch/x86/include/asm/pat.h
+++ b/arch/x86/include/asm/pat.h
@@ -6,7 +6,7 @@
 
 bool pat_enabled(void);
 extern void pat_init(void);
-void pat_init_cache_modes(u64);
+void __init_cache_modes(u64);
 
 extern int reserve_memtype(u64 start, u64 end,
enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 04e2e71..1da55a5 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -181,7 +181,7 @@ static enum page_cache_mode pat_get_cache_mode(unsigned pat_val, char *msg)
  * configuration.
  * Using lower indices is preferred, so we start with highest index.
  */
-void pat_init_cache_modes(u64 pat)
+void __init_cache_modes(u64 pat)
 {
enum page_cache_mode cache;
char pat_msg[33];
@@ -207,9 +207,6 @@ static void pat_bsp_init(u64 pat)
return;
}
 
-   if (!pat_enabled())
-   goto done;
-
rdmsrl(MSR_IA32_CR_PAT, tmp_pat);
if (!tmp_pat) {
pat_disable("PAT MSR is 0, disabled.");
@@ -218,15 +215,11 @@ static void pat_bsp_init(u64 pat)
 
wrmsrl(MSR_IA32_CR_PAT, pat);
 
-done:
-   pat_init_cache_modes(pat);
+   __init_cache_modes(pat);
 }
 
 static void pat_ap_init(u64 pat)
 {
-   if (!pat_enabled())
-   return;
-
if (!cpu_has_pat) {
/*
 * If this happens we are on a secondary CPU, but switched to
@@ -238,18 +231,32 @@ static void pat_ap_init(u64 pat)
wrmsrl(MSR_IA32_CR_PAT, pat);
 }
 
-void pat_init(void)
+static void init_cache_modes(void)
 {
-   u64 pat;
-   struct cpuinfo_x86 *c = &boot_cpu_data;
+   u64 pat = 0;
+   static int init_cm_done;
 
-   if (!pat_enabled()) {
+   if (init_cm_done)
+   return;
+
+   if (boot_cpu_has(X86_FEATURE_PAT)) {
+   /*
+* CPU supports PAT. Set PAT table to be consistent with
+* PAT MSR. This case supports "nopat" boot option, and
+* virtual machine environments which support PAT without
+* MTRRs. In specific, Xen has unique setup to PAT MSR.
+*
+* If PAT MSR returns 0, it is considered invalid and emulates
+* as No PAT.
+*/
+   rdmsrl(MSR_IA32_CR_PAT, pat);
+   }
+
+   if (!pat) {
/*
 * No PAT. Emulate the PAT table that corresponds to the two
-* cache bits, PWT (Write Through) and PCD (Cache Disable). This
-* setup is the same as the BIOS default setup when the system
-* has PAT but the "nopat" boot option has been specified. This
-* emulated PAT table is used when MSR_IA32_CR_PAT returns 0.
+* cache bits, PWT (Write Through) and PCD (Cache Disable).
+* This setup is also the same as the BIOS default setup.
 *
 * PTE encoding:
 *
@@ -266,10 +273,36 @@ void pat_init(void)
 */
pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
  PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
+   }
+
+   __init_cache_modes(pat);
+
+   init_cm_done = 1;
+}
+
+/**
+ * pat_init - Initialize PAT MSR and PAT table
+ *
+ * This function initializes PAT MSR and PAT table with an OS-defined value
+ * to enable additional cache attributes, WC and WT.
+ *
+ * This function must be

[Xen-devel] [PATCH v3 7/7] x86/pat: Document PAT initialization

2016-03-23 Thread Toshi Kani

Update PAT documentation to describe how PAT is initialized under
various configurations.

Signed-off-by: Toshi Kani 
Cc: Borislav Petkov 
Cc: Luis R. Rodriguez 
Cc: Juergen Gross 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
---
 Documentation/x86/pat.txt |   32 
 1 file changed, 32 insertions(+)

diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt
index 54944c7..8ccc0fc 100644
--- a/Documentation/x86/pat.txt
+++ b/Documentation/x86/pat.txt
@@ -196,3 +196,35 @@ Another, more verbose way of getting PAT related debug messages is with
 "debugpat" boot parameter. With this parameter, various debug messages are
 printed to dmesg log.
 
+PAT Initialization
+------------------
+
+The following table describes how PAT is initialized under various
+configurations. PAT MSR must be updated by Linux in order to support WC
+and WT attributes. Otherwise, the PAT MSR has the value programmed in it
+by the firmware. Note, Xen enables WC attribute in the PAT MSR for guests.
+
+ MTRR PAT   Call Sequence               PAT State  PAT MSR
+ =========================================================
+ E    E     MTRR -> PAT init            Enabled    OS
+ E    D     MTRR -> PAT init            Disabled    -
+ D    E     MTRR -> PAT disable         Disabled   BIOS
+ D    D     MTRR -> PAT disable         Disabled    -
+ -    np/E  PAT  -> PAT disable         Disabled   BIOS
+ -    np/D  PAT  -> PAT disable         Disabled    -
+ E    !P/E  MTRR -> PAT init            Disabled   BIOS
+ D    !P/E  MTRR -> PAT disable         Disabled   BIOS
+ !M   !P/E  MTRR stub -> PAT disable    Disabled   BIOS
+
+ Legend
+ ------------------------------------------------
+ E         Feature enabled in CPU
+ D         Feature disabled/unsupported in CPU
+ np        "nopat" boot option specified
+ !P        CONFIG_X86_PAT option unset
+ !M        CONFIG_MTRR option unset
+ Enabled   PAT state set to enable
+ Disabled  PAT state set to disable
+ OS        PAT initializes PAT MSR with OS setting
+ BIOS      PAT keeps PAT MSR with BIOS setting
+


[Xen-devel] [PATCH v3 3/7] x86/mm/pat: Replace cpu_has_pat with boot_cpu_has

2016-03-23 Thread Toshi Kani

Borislav Petkov wrote:
> Please use on init paths boot_cpu_has(X86_FEATURE_PAT) and on fast
> paths static_cpu_has(X86_FEATURE_PAT). No more of that cpu_has_XXX
> ugliness.

Replace the use of cpu_has_pat on init paths with boot_cpu_has().

Suggested-by: Borislav Petkov 
Signed-off-by: Toshi Kani 
Cc: Borislav Petkov 
Cc: Luis R. Rodriguez 
Cc: Juergen Gross 
Cc: Robert Elliott 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
---
 arch/x86/mm/pat.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 3c08a27..3aea1ab 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -213,7 +213,7 @@ static void pat_bsp_init(u64 pat)
 {
u64 tmp_pat;
 
-   if (!cpu_has_pat) {
+   if (!boot_cpu_has(X86_FEATURE_PAT)) {
pat_disable("PAT not supported by CPU.");
return;
}
@@ -231,7 +231,7 @@ static void pat_bsp_init(u64 pat)
 
 static void pat_ap_init(u64 pat)
 {
-   if (!cpu_has_pat) {
+   if (!boot_cpu_has(X86_FEATURE_PAT)) {
/*
 * If this happens we are on a secondary CPU, but switched to
 * PAT on the boot CPU. We have no way to undo PAT.


[Xen-devel] [PATCH v3 4/6] x86/mtrr: Fix Xorg crashes in Qemu sessions

2016-03-23 Thread Toshi Kani

A Xorg failure on qemu32 was reported as a regression [1] caused by
'commit 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is disabled")'.
This patch fixes this regression.

Negative effects of this regression were the following two failures [2]
in Xorg on QEMU with QEMU CPU model "qemu32" (-cpu qemu32), which were
triggered by the fact that its virtual CPU does not support MTRRs.

 #1. copy_process() failed in the check in reserve_pfn_range()

copy_process
 copy_mm
  dup_mm
   dup_mmap
copy_page_range
 track_pfn_copy
  reserve_pfn_range

 A WC map request was tracked as WC in memtype, which set a PTE as
 UC (pgprot) per __cachemode2pte_tbl[].  This led to this error in
 reserve_pfn_range() called from track_pfn_copy(), which obtained
 a pgprot from a PTE.  It converts pgprot to page_cache_mode, which
 does not necessarily result in the original page_cache_mode since
 __cachemode2pte_tbl[] redirects multiple types to UC.

 #2. error path in copy_process() then hit WARN_ON_ONCE in
 untrack_pfn().

 x86/PAT: Xorg:509 map pfn expected mapping type uncached-
 minus for [mem 0xfd00-0xfdff], got write-combining
  Call Trace:
 dump_stack
 warn_slowpath_common
 ? untrack_pfn
 ? untrack_pfn
 warn_slowpath_null
 untrack_pfn
 ? __kunmap_atomic
 unmap_single_vma
 ? pagevec_move_tail_fn
 unmap_vmas
 exit_mmap
 mmput
 copy_process.part.47
 _do_fork
 SyS_clone
 do_syscall_32_irqs_on
 entry_INT80_32

These negative effects are caused by two separate bugs, and they
can be addressed in separate patches.  Fixing the pat_init() issue
described below addresses the root cause, and prevents Xorg from
hitting these cases.
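
Failure #1 comes down to a many-to-one table not being invertible, which
can be shown with a toy model. The tables below are simplified
assumptions for illustration and are not copies of __cachemode2pte_tbl[]
or its reverse table; all names are hypothetical.

```c
#include <assert.h>

enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC, CM_WT, CM_NUM };

/* Toy PTE cache-control bits. */
#define PTE_PWT 0x1u
#define PTE_PCD 0x2u

/*
 * Forward table for a setup without a usable PAT: only two PTE bits are
 * available, so WC has no encoding of its own and is redirected.
 */
static const unsigned int mode2pte[CM_NUM] = {
    [CM_WB]       = 0,
    [CM_WC]       = PTE_PCD,               /* redirected (assumption) */
    [CM_UC_MINUS] = PTE_PCD,
    [CM_UC]       = PTE_PCD | PTE_PWT,
    [CM_WT]       = PTE_PWT,
};

/* Reverse table: each bit pattern can name only one cache mode. */
static const enum cache_mode pte2mode[4] = {
    [0]                 = CM_WB,
    [PTE_PWT]           = CM_WT,
    [PTE_PCD]           = CM_UC_MINUS,
    [PTE_PCD | PTE_PWT] = CM_UC,
};

/* Convert a requested mode to PTE bits and back again. */
static enum cache_mode round_trip(enum cache_mode requested)
{
    return pte2mode[mode2pte[requested] & (PTE_PCD | PTE_PWT)];
}

/* Non-zero when the mode recovered from the PTE matches the request. */
static int survives_round_trip(enum cache_mode requested)
{
    return round_trip(requested) == requested;
}
```

In this model a WC request comes back as uncached-minus, so a check that
compares the tracked type with the type recovered from the PTE fails.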

When the CPU does not support MTRRs, MTRR does not call pat_init(),
which leaves PAT enabled without initializing PAT.  This pat_init()
issue is a long-standing issue, but manifested as issue #1 (and then
hit issue #2) with the above-mentioned commit because the memtype
now tracks cache attribute with 'page_cache_mode'.

This pat_init() issue existed before the commit, but we used pgprot
in memtype.  Hence, we did not have issue #1 before.  But a WC request
resulted in WT in effect because WC pgprot is actually WT when PAT
is not initialized.  This is not how it was designed to work.  When
PAT is set to disable properly, WC is converted to UC.  The use of
WT can result in a system crash if the target range does not support
WT.  Fortunately, nobody ran into such an issue before.

To fix this pat_init() issue, PAT code has been enhanced to provide
pat_disable() interface.  Call this interface when MTRRs are disabled.
By setting PAT to disable properly, PAT bypasses the memtype check,
and avoids issue #1.

[1]: https://lkml.org/lkml/2016/3/3/828
[2]: https://lkml.org/lkml/2016/3/4/775
Signed-off-by: Toshi Kani 
Cc: Borislav Petkov 
Cc: Luis R. Rodriguez 
Cc: Juergen Gross 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
---
 arch/x86/include/asm/mtrr.h |6 +-
 arch/x86/kernel/cpu/mtrr/main.c |   10 +-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index b94f6f6..dbff145 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -24,6 +24,7 @@
 #define _ASM_X86_MTRR_H
 
 #include 
+#include 
 
 
 /*
@@ -83,9 +84,12 @@ static inline int mtrr_trim_uncached_memory(unsigned long end_pfn)
 static inline void mtrr_centaur_report_mcr(int mcr, u32 lo, u32 hi)
 {
 }
+static inline void mtrr_bp_init(void)
+{
+   pat_disable("MTRRs disabled, skipping PAT initialization too.");
+}
 
 #define mtrr_ap_init() do {} while (0)
-#define mtrr_bp_init() do {} while (0)
 #define set_mtrr_aps_delayed_init() do {} while (0)
 #define mtrr_aps_init() do {} while (0)
 #define mtrr_bp_restore() do {} while (0)
diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index 10f8d47..8b1947b 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -759,8 +759,16 @@ void __init mtrr_bp_init(void)
}
}
 
-   if (!mtrr_enabled())
+   if (!mtrr_enabled()) {
pr_info("MTRR: Disabled\n");
+
+   /*
+* PAT initialization relies on MTRR's rendezvous handler.
+* Skip PAT init until the handler can initialize both
+* features independently.
+*/
+   pat_disable("MTRRs disabled, skipping PAT initialization too.");
+   }
 }
 
 void mtrr_ap_init(void)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v3 6/7] x86/xen, pat: Remove PAT table init code from Xen

2016-03-23 Thread Toshi Kani

Xen supports PAT without MTRRs for its guests.  In order to
enable WC attribute, it was necessary for xen_start_kernel()
to call pat_init_cache_modes() to update PAT table before
starting guest kernel.

Now that the kernel initializes PAT table to the BIOS handoff
state when MTRR is disabled, this Xen-specific PAT init code
is no longer necessary.  Delete it from xen_start_kernel().

Also change __init_cache_modes() to a static function since
PAT table should not be tweaked by other modules.
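
For readers who want to check what a given PAT MSR value means, here is a
minimal userspace sketch of the decoding.  The memory-type encodings follow
the Intel SDM; the helper names are made up for illustration and are not the
kernel's, and the value in the usage note is the architectural power-on
default, which is only assumed to be what a typical "BIOS handoff state"
looks like.

```c
#include <stdint.h>
#include <stdio.h>

/* Memory-type encoding of one PAT entry (Intel SDM); 2 and 3 are reserved. */
static const char *pat_type_name(unsigned int type)
{
    switch (type) {
    case 0: return "UC";
    case 1: return "WC";
    case 4: return "WT";
    case 5: return "WP";
    case 6: return "WB";
    case 7: return "UC-";
    default: return "reserved";
    }
}

/* Entry i occupies the low 3 bits of byte i of the 64-bit MSR value. */
static unsigned int pat_entry(uint64_t pat, unsigned int i)
{
    return (unsigned int)((pat >> (i * 8)) & 0x7);
}

/* Hypothetical helper: print all eight entries, lowest index first. */
static void pat_dump(uint64_t pat)
{
    for (unsigned int i = 0; i < 8; i++)
        printf("PA%u=%s%s", i, pat_type_name(pat_entry(pat, i)),
               i == 7 ? "\n" : " ");
}
```

Calling pat_dump(0x0007040600070406ULL) lists WB, WT, UC- and UC for entries
0-3, repeated for entries 4-7; there is no WC entry in that layout.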

Signed-off-by: Toshi Kani 
Acked-by: Juergen Gross 
Cc: Konrad Rzeszutek Wilk 
Cc: Borislav Petkov 
Cc: Luis R. Rodriguez 
Cc: Juergen Gross 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
---
 arch/x86/include/asm/pat.h |1 -
 arch/x86/mm/pat.c  |2 +-
 arch/x86/xen/enlighten.c   |9 -
 3 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h
index 0ad356c..0b1ff4c 100644
--- a/arch/x86/include/asm/pat.h
+++ b/arch/x86/include/asm/pat.h
@@ -7,7 +7,6 @@
 bool pat_enabled(void);
 void pat_disable(const char *reason);
 extern void pat_init(void);
-void __init_cache_modes(u64);
 
 extern int reserve_memtype(u64 start, u64 end,
enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 3aea1ab..9db6915 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -192,7 +192,7 @@ static enum page_cache_mode pat_get_cache_mode(unsigned pat_val, char *msg)
  * configuration.
  * Using lower indices is preferred, so we start with highest index.
  */
-void __init_cache_modes(u64 pat)
+static void __init_cache_modes(u64 pat)
 {
enum page_cache_mode cache;
char pat_msg[33];
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index f4296b6..d5f172d 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -75,7 +75,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #ifdef CONFIG_ACPI
@@ -1511,7 +1510,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
 {
struct physdev_set_iopl set_iopl;
unsigned long initrd_start = 0;
-   u64 pat;
int rc;
 
if (!xen_start_info)
@@ -1618,13 +1616,6 @@ asmlinkage __visible void __init xen_start_kernel(void)
   xen_start_info->nr_pages);
xen_reserve_special_pages();
 
-   /*
-* Modify the cache mode translation tables to match Xen's PAT
-* configuration.
-*/
-   rdmsrl(MSR_IA32_CR_PAT, pat);
-   __init_cache_modes(pat);
-
/* keep using Xen gdt for now; no urgent need to change it */
 
 #ifdef CONFIG_X86_32


[Xen-devel] [PATCH v3 0/7] Enhance PAT init to fix Xorg crashes

2016-03-23 Thread Toshi Kani

A Xorg failure on qemu32 was reported as a regression [1] caused by
'commit 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is disabled")'.
This patch-set fixes the regression.

Negative effects of this regression were two failures [2] in Xorg on
QEMU with QEMU CPU model "qemu32" (-cpu qemu32), which were triggered
by the fact that its virtual CPU does not support MTRR.
 #1. copy_process() failed in the check in reserve_pfn_range()
 #2. error path in copy_process() then hit WARN_ON_ONCE in
 untrack_pfn().

These negative effects are caused by two separate bugs, but they can be
addressed in separate patches.  This patch-set addresses the root cause,
a long-standing PAT initialization issue.

Please see the changelog in patch 4/7 for the details of the issue.

- Patch 1-2 make necessary enhancement to PAT for the fix without
  breaking Xen.
- Patch 3 is cleanup.
- Patch 4 fixes the regression.
- Patch 5 fixes an MTRR issue related with PAT init.
- Patch 6 removes PAT init code from Xen.
- Patch 7 adds PAT init to documentation.

[1]: https://lkml.org/lkml/2016/3/3/828
[2]: https://lkml.org/lkml/2016/3/4/775

I'd appreciate if someone can test this patch-set on Xen to verify that
there is no change in "x86/PAT: Configuration [0-7] .." message in dmesg.

---
v3:
 - Change a new func name to init_cache_modes(). (Borislav Petkov)
 - Add check with __pat_enabled, and use WARN_ONCE() for a bug check
   in pat_disable(). (Borislav Petkov)
 - Update changelog, comments, and doc per review. (Borislav Petkov)

v2:
 - Divide patch-set into a single change. (Borislav Petkov)
 - Xen's case must be handled properly. (Luis R. Rodriguez)
 - Change changelog and title to describe the issue. (Ingo Molnar)
 - Update an error message. (Robert Elliott, Borislav Petkov)

---
Toshi Kani (7):
 1/7 x86/mm/pat: Add support of non-default PAT MSR setting
 2/7 x86/mm/pat: Add pat_disable() interface
 3/7 x86/mm/pat: Replace cpu_has_pat with boot_cpu_has
 4/7 x86/mtrr: Fix Xorg crashes in Qemu sessions
 5/7 x86/mtrr: Fix PAT init handling when MTRR is disabled
 6/7 x86/xen,pat: Remove PAT table init code from Xen
 7/7 x86/pat: Document PAT initialization

---
 Documentation/x86/pat.txt  | 32 ++
 arch/x86/include/asm/mtrr.h|  6 ++-
 arch/x86/include/asm/pat.h |  2 +-
 arch/x86/kernel/cpu/mtrr/generic.c | 24 +-
 arch/x86/kernel/cpu/mtrr/main.c| 13 +-
 arch/x86/kernel/cpu/mtrr/mtrr.h|  1 +
 arch/x86/mm/pat.c  | 90 --
 arch/x86/xen/enlighten.c   |  9 
 8 files changed, 132 insertions(+), 45 deletions(-)


[Xen-devel] [PATCH v3 2/7] x86/mm/pat: Add pat_disable() interface

2016-03-23 Thread Toshi Kani

In preparation for fixing a regression caused by 'commit 9cd25aac1f44
("x86/mm/pat: Emulate PAT when it is disabled")', PAT needs to
provide an interface that prevents the OS from initializing the
PAT MSR.

PAT MSR initialization must be done on all CPUs using the specific
sequence of operations defined in Intel SDM.  This requires MTRRs
to be enabled since pat_init() is called as part of MTRR init
from mtrr_rendezvous_handler().

Make pat_disable() the interface that prevents the OS from
initializing the PAT MSR.  MTRR will call this interface when it
cannot provide the SDM-defined sequence to initialize PAT.

This also ensures that pat_disable(), when called from pat_bsp_init(),
sets the PAT table properly when the CPU does not support PAT.
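
The intended calling rules can be modelled in userspace roughly as below.
This is only a sketch: the names ending in _model, the pat_on flag and the
init counter are illustrative stand-ins, not kernel symbols, and the real
function uses WARN_ONCE() and pr_info() rather than stdio.

```c
#include <stdbool.h>
#include <stdio.h>

static bool pat_on = true;        /* stands in for __pat_enabled */
static bool boot_cpu_done;        /* set once the boot CPU finished PAT init */
static int cache_mode_inits;      /* counts calls to the table setup */

/* Stand-in for init_cache_modes(): only records that it ran. */
static void init_cache_modes_model(void)
{
    cache_mode_inits++;
}

/* Returns true only when this call actually disabled PAT. */
static bool pat_disable_model(const char *reason)
{
    if (!pat_on)
        return false;             /* already disabled: nothing to do */

    if (boot_cpu_done) {
        fprintf(stderr, "PAT cannot be disabled after initialization\n");
        return false;             /* too late: leave PAT enabled */
    }

    pat_on = false;
    printf("x86/PAT: %s\n", reason);
    init_cache_modes_model();     /* set up the table for the no-PAT case */
    return true;
}
```

A second call is a no-op, so a caller does not need to know whether another
path has already disabled PAT.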

Signed-off-by: Toshi Kani 
Cc: Borislav Petkov 
Cc: Luis R. Rodriguez 
Cc: Juergen Gross 
Cc: Robert Elliott 
Cc: Ingo Molnar 
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
---
 arch/x86/include/asm/pat.h |1 +
 arch/x86/mm/pat.c  |   13 -
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h
index 97ea55b..0ad356c 100644
--- a/arch/x86/include/asm/pat.h
+++ b/arch/x86/include/asm/pat.h
@@ -5,6 +5,7 @@
 #include 
 
 bool pat_enabled(void);
+void pat_disable(const char *reason);
 extern void pat_init(void);
 void __init_cache_modes(u64);
 
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 1da55a5..3c08a27 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -40,11 +40,22 @@
 static bool boot_cpu_done;
 
 static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);
+static void init_cache_modes(void);
 
-static inline void pat_disable(const char *reason)
+void pat_disable(const char *reason)
 {
+   if (!__pat_enabled)
+   return;
+
+   if (boot_cpu_done) {
+   WARN_ONCE(1, "x86/PAT: PAT cannot be disabled after initialization\n");
+   return;
+   }
+
__pat_enabled = 0;
pr_info("x86/PAT: %s\n", reason);
+
+   init_cache_modes();
 }
 
 static int __init nopat(char *str)


Re: [Xen-devel] [PATCH v4 11/34] xsplice: Design document

2016-03-23 Thread Konrad Rzeszutek Wilk

> diff --git a/docs/misc/xsplice.markdown b/docs/misc/xsplice.markdown
> index 6aa5a27..8252e6c 100644
> --- a/docs/misc/xsplice.markdown
> +++ b/docs/misc/xsplice.markdown
> @@ -487,7 +487,9 @@ hypervisor.
>  The caller provides:
>  
>   * `version`. Initially (on first hypercall) *MUST* be zero.
> - * `idx` index iterator. On first call *MUST* be zero, subsequent calls 
> varies.
> + * `idx` index iterator. The index into the hypervisor's payload count. It is
> +recommended that on first invocation zero be used so that `nr` (which the
> +hypervisor will update with the remaining payload count) be provided.
>   * `nr` the max number of entries to populate.
>   * `pad` - *MUST* be zero.
>   * `status` virtual address of where to write `struct xen_xsplice_status`
> @@ -538,9 +540,9 @@ struct xen_sysctl_xsplice_list {
> On subsequent calls reuse 
> value.  
> If varies between calls, we 
> are  
>   * getting stale data. */  
> -uint32_t idx;   /* IN/OUT: Index into array. */  
> +uint32_t idx;   /* IN: Index into array. */  
>  uint32_t nr;/* IN: How many status, names, 
> and len  
> -   should fill out.  
> +   should be filled out.  
> OUT: How many payloads left. 
> */  
>  uint32_t pad;   /* IN: Must be zero. */  
>  XEN_GUEST_HANDLE_64(xen_xsplice_status_t) status;  /* OUT. Must have 
> enough  
> > 
> > Jan
> > 

And it occurred to me that we can do a probe call similar to XEN_VERSION.

That is, fill 'nr' with zero; ->names, ->status, ->list, etc. can then be NULL.
Then 'nr' will be filled back with the number of payloads.
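
From the toolstack side, the probe idea could look roughly like the sketch
below.  Everything here is hypothetical: struct list_req is a trimmed-down
stand-in for the sysctl structure, mock_xsplice_list() replaces the real
hypercall, and the OUT value of 'nr' is assumed to count the payloads from
'idx' onward.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for the list sysctl arguments. */
struct list_req {
    uint32_t idx;      /* IN: first payload index to report */
    uint32_t nr;       /* IN: capacity of status[]; OUT: payloads from idx on */
    uint32_t *status;  /* OUT: may be NULL when nr == 0 (probe) */
};

#define MOCK_PAYLOADS 3u

/* Mock hypercall: nr == 0 is a probe, otherwise fill up to nr entries. */
static int mock_xsplice_list(struct list_req *r)
{
    uint32_t copied = 0;

    if (r->nr == 0) {              /* probe: buffers are not touched */
        r->nr = MOCK_PAYLOADS;
        return 0;
    }
    if (r->status == NULL || r->idx >= MOCK_PAYLOADS)
        return -1;

    while (copied < r->nr && r->idx + copied < MOCK_PAYLOADS) {
        r->status[copied] = 1;     /* pretend every payload is CHECKED */
        copied++;
    }
    r->nr = MOCK_PAYLOADS - r->idx;
    return (int)copied;
}

/* Probe for the count first, then size the buffer and fetch everything. */
static int fetch_all(uint32_t **out, uint32_t *count)
{
    struct list_req probe = { .idx = 0, .nr = 0, .status = NULL };
    struct list_req req;
    int rc = mock_xsplice_list(&probe);

    *out = NULL;
    *count = 0;
    if (rc < 0)
        return rc;
    if (probe.nr == 0)
        return 0;                  /* nothing loaded */

    *out = calloc(probe.nr, sizeof(**out));
    if (*out == NULL)
        return -1;

    *count = probe.nr;
    req = (struct list_req){ .idx = 0, .nr = probe.nr, .status = *out };
    rc = mock_xsplice_list(&req);
    if (rc < 0) {
        free(*out);
        *out = NULL;
        *count = 0;
    }
    return rc;
}
```

A real caller would still have to compare 'version' between the probe and
the fetch, since payloads can be added or removed in between.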


Re: [Xen-devel] [PATCH v4 11/34] xsplice: Design document

2016-03-23 Thread Konrad Rzeszutek Wilk

On Wed, Mar 23, 2016 at 05:18:39AM -0600, Jan Beulich wrote:
> >>> On 15.03.16 at 18:56,  wrote:
> > +### XEN_SYSCTL_XSPLICE_LIST (2)
> > +
> > +Retrieve an array of abbreviated status and names of payloads that are 
> > loaded in the
> > +hypervisor.
> > +
> > +The caller provides:
> > +
> > + * `version`. Initially (on first hypercall) *MUST* be zero.
> > + * `idx` index iterator. On first call *MUST* be zero, subsequent calls 
> > varies.
> > + * `nr` the max number of entries to populate.
> > + * `pad` - *MUST* be zero.
> > + * `status` virtual address of where to write `struct xen_xsplice_status`
> > +   structures. Caller *MUST* allocate up to `nr` of them.
> > + * `name` - virtual address of where to write the unique name of the 
> > payload.
> > +   Caller *MUST* allocate up to `nr` of them. Each *MUST* be of
> > +   **XEN_XSPLICE_NAME_SIZE** size.
> > + * `len` - virtual address of where to write the length of each unique name
> > +   of the payload. Caller *MUST* allocate up to `nr` of them. Each *MUST* 
> > be
> > +   of sizeof(uint32_t) (4 bytes).
> > +
> > +If the hypercall returns an positive number, it is the number (upto `nr`
> > +provided to the hypercall) of the payloads returned, along with `nr` 
> > updated
> > +with the number of remaining payloads, `version` updated (it may be the 
> > same
> > +across hypercalls - if it varies the data is stale and further calls could
> > +fail). The `status`, `name`, and `len`' are updated at their designed index
> > +value (`idx`) with the returned value of data.
> > +
> > +If the hypercall returns -XEN_E2BIG the `nr` is too big and should be
> > +lowered.
> > +
> > +If the hypercall returns an zero value there are no more payloads.
> > +
> > +Note that due to the asynchronous nature of hypercalls the control domain 
> > might
> > +have added or removed a number of payloads making this information stale. 
> > It is
> > +the responsibility of the toolstack to use the `version` field to check
> > +between each invocation. if the version differs it should discard the stale
> > +data and start from scratch. It is OK for the toolstack to use the new
> > +`version` field.
> > +
> > +The `struct xen_xsplice_status` structure contains an status of payload 
> > which includes:
> > +
> > + * `status` - indicates the current status of the payload:
> > +   * *XSPLICE_STATUS_CHECKED*  (1) loaded and the ELF payload safety 
> > checks passed.
> > +   * *XSPLICE_STATUS_APPLIED* (2) loaded, checked, and applied.
> > +   *  No other value is possible.
> > + * `rc` - -XEN_EXX type errors encountered while performing the last
> > +   XSPLICE_ACTION_* operation. The normal values can be zero or 
> > -XEN_EAGAIN which
> > +   respectively mean: success or operation in progress. Other values
> > +   imply an error occurred. If there is an error in `rc`, `status` will 
> > **NOT**
> > +   have changed.
> > +
> > +The structure is as follow:
> > +
> > +
> > +struct xen_sysctl_xsplice_list {  
> > +uint32_t version;   /* IN/OUT: Initially *MUST* be 
> > zero.  
> > +   On subsequent calls reuse 
> > value.  
> > +   If varies between calls, we 
> > are  
> > + * getting stale data. */  
> > +uint32_t idx;   /* IN/OUT: Index into array. 
> > */ 
> > +uint32_t nr;/* IN: How many status, names, 
> > and len  
> > +   should fill out.  
> > +   OUT: How many payloads 
> > left. */  
> 
> I think there's an ambiguity left in both the description above and
> the comments here: With idx required to be zero upon first
> invocation (which I'm not clear why that is), which parts of the

That is actually a stale design choice. Initially the "How many payloads left"
was going to be stamped in 'idx'. But it is now in 'nr'.

The value can be arbitrary, although on first invocation it should be 0;
otherwise you won't get 'nr' telling you how many payloads are left, unless
your 'idx' falls below the number of payloads.

As in, say we have 20 payloads.
If the first hypercall for 'idx' has 30, then the hypercall will return -EINVAL.
If the first hypercall 'idx' has 19, then the hypercall will populate
->name,->len,->status, ->version and write ->nr with 1.
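
The two cases above can be written down as a small model.  This is a sketch
of the semantics as described in this mail, not the hypervisor code:
list_model() is a made-up name, the OUT value of 'nr' is taken to be the
number of payloads from 'idx' onward (which matches the "idx 19 gives nr 1"
case), and treating idx equal to the payload count as -EINVAL is an
assumption.

```c
#include <errno.h>
#include <stdint.h>

/*
 * total: number of loaded payloads.
 * idx:   IN, index of the first payload the caller wants.
 * nr:    IN, how many entries the caller can take;
 *        OUT, payloads from idx onward (assumed meaning of "remaining").
 * Returns the number of entries that would be populated, or -EINVAL.
 */
static int list_model(uint32_t total, uint32_t idx, uint32_t *nr)
{
    uint32_t avail, copied;

    if (idx != 0 && idx >= total)
        return -EINVAL;            /* e.g. idx 30 with 20 payloads */

    avail = total - idx;
    copied = (*nr < avail) ? *nr : avail;
    *nr = avail;
    return (int)copied;
}
```

With 20 payloads, idx 19 and room for 8 entries, the model populates one
entry and writes 1 back into 'nr'.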

> three arrays get filled when idx is non-zero: [0, idx) or [nr, nr + idx)?

I am going to assume that you are filling the two /*IN*/ entries, so ->idx
and ->nr.

[0, idx]:

If there is data and the amount of payloads is greater than idx (0), and there
are no hypercall preemptions, then:

->nr = remaining amount
->version = version value
->name[0..idx]
->len[0..idx]
->status[0..idx]


[nr, nr + idx]:

If there is data and the amount of payloads is less than nr, then -EINVAL
is returned.

If there is data and the

Re: [Xen-devel] [PATCH 09/16] xen: sched: close potential races when switching scheduler to CPUs

2016-03-23 Thread George Dunlap

On 18/03/16 19:05, Dario Faggioli wrote:
> by using the sched_switch hook that we have introduced in
> the various schedulers.
> 
> The key is to let the actual switch of scheduler and the
> remapping of the scheduler lock for the CPU (if necessary)
> happen together (in the same critical section) protected
> (at least) by the old scheduler lock for the CPU.
> 
> This also means that, in Credit2 and RTDS, we can get rid
> of the code that was doing the scheduler lock remapping
> in csched2_free_pdata() and rt_free_pdata(), and of their
> triggering ASSERT-s.
> 
> Signed-off-by: Dario Faggioli 

Similar to my comment before -- in my own tree I squashed patches 6-9
into a single commit and found it much easier to review. :-)

One important question...

> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 1adc0e2..29582a6 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1617,7 +1617,6 @@ void __init scheduler_init(void)
>  int schedule_cpu_switch(unsigned int cpu, struct cpupool *c)
>  {
>  struct vcpu *idle;
> -spinlock_t *lock;
>  void *ppriv, *ppriv_old, *vpriv, *vpriv_old;
>  struct scheduler *old_ops = per_cpu(scheduler, cpu);
>  struct scheduler *new_ops = (c == NULL) ?  : c->sched;
> @@ -1640,11 +1639,21 @@ int schedule_cpu_switch(unsigned int cpu, struct 
> cpupool *c)
>  if ( old_ops == new_ops )
>  goto out;
>  
> +/*
> + * To setup the cpu for the new scheduler we need:
> + *  - a valid instance of per-CPU scheduler specific data, as it is
> + *allocated by SCHED_OP(alloc_pdata). Note that we do not want to
> + *initialize it yet (i.e., we are not calling SCHED_OP(init_pdata)).
> + *That will be done by the target scheduler, in 
> SCHED_OP(switch_sched),
> + *in proper ordering and with locking.
> + *  - a valid instance of per-vCPU scheduler specific data, for the idle
> + *vCPU of cpu. That is what the target scheduler will use for the
> + *sched_priv field of the per-vCPU info of the idle domain.
> + */
>  idle = idle_vcpu[cpu];
>  ppriv = SCHED_OP(new_ops, alloc_pdata, cpu);
>  if ( IS_ERR(ppriv) )
>  return PTR_ERR(ppriv);
> -SCHED_OP(new_ops, init_pdata, ppriv, cpu);
>  vpriv = SCHED_OP(new_ops, alloc_vdata, idle, idle->domain->sched_priv);
>  if ( vpriv == NULL )
>  {
> @@ -1652,17 +1661,20 @@ int schedule_cpu_switch(unsigned int cpu, struct 
> cpupool *c)
>  return -ENOMEM;
>  }
>  
> -lock = pcpu_schedule_lock_irq(cpu);
> -
>  SCHED_OP(old_ops, tick_suspend, cpu);
> +
> +/*
> + * The actual switch, including (if necessary) the rerouting of the
> + * scheduler lock to whatever new_ops prefers,  needs to happen in one
> + * critical section, protected by old_ops' lock, or races are possible.
> + * Since each scheduler has its own contraints and locking scheme, do
> + * that inside specific scheduler code, rather than here.
> + */
>  vpriv_old = idle->sched_priv;
> -idle->sched_priv = vpriv;
> -per_cpu(scheduler, cpu) = new_ops;
>  ppriv_old = per_cpu(schedule_data, cpu).sched_priv;
> -per_cpu(schedule_data, cpu).sched_priv = ppriv;
> -SCHED_OP(new_ops, tick_resume, cpu);
> +SCHED_OP(new_ops, switch_sched, cpu, ppriv, vpriv);

Is it really safe to read sched_priv without the lock held?
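
One way to sidestep the question is to read the old value inside the same
critical section that installs the new one.  A minimal sketch of that
pattern follows; it is not Xen code.  The spinlock is a toy built on C11
atomic_flag, struct cpu_sched and switch_sched_model() are hypothetical
names, and it assumes nobody else reroutes sd->lock concurrently (real code
would have to re-check the lock pointer after acquiring it).

```c
#include <stdatomic.h>
#include <stddef.h>

/* Toy spinlock, only for the purpose of this sketch. */
struct spin {
    atomic_flag held;
};

static void spin_lock(struct spin *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;                          /* busy-wait until the flag is clear */
}

static void spin_unlock(struct spin *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical per-CPU scheduler state; the lock pointer can be rerouted. */
struct cpu_sched {
    struct spin *lock;
    void *sched_priv;
};

/*
 * Install new private data and a new lock in one critical section,
 * protected by the old lock.  The old private data is read only while
 * the old lock is held, and is returned so the caller can free it.
 */
static void *switch_sched_model(struct cpu_sched *sd, void *new_priv,
                                struct spin *new_lock)
{
    struct spin *old_lock = sd->lock;
    void *old_priv;

    spin_lock(old_lock);
    old_priv = sd->sched_priv;
    sd->sched_priv = new_priv;
    sd->lock = new_lock;
    spin_unlock(old_lock);

    return old_priv;
}
```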

 -George


Re: [Xen-devel] Interested to participate in Outreachy Program

2016-03-23 Thread Doug Goldstein

On 3/23/16 11:34 AM, sabiya kazi wrote:
> Hi Doug,
> Can you have a look at patch and let me know if everything
> is correct, I think things are good.
> 
> I would also like to have a word with you for deciding timeline for
> project. Meantime, I have started reading stuff  about rust language.
> 
> 
> Regards,
> -Sabiya
> 
> 

Inlining the patch since it was sent as an attachment..

> diff --git a/tools/Makefile b/tools/Makefile
> index 3f45fb9..1c2fb79 100644
> --- a/tools/Makefile
> +++ b/tools/Makefile
> @@ -1,4 +1,4 @@
> -XEN_ROOT = $(CURDIR)/..
> + XEN_ROOT = $(CURDIR)/..
>  include $(XEN_ROOT)/tools/Rules.mk

drop this change

>  
>  SUBDIRS-y :=
> diff --git a/tools/console/client/main.c b/tools/console/client/main.c
> index d006fdc..199432c 100644
> --- a/tools/console/client/main.c
> +++ b/tools/console/client/main.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #ifdef __sun__
>  #include 
>  #endif
> @@ -45,10 +46,12 @@
>  
>  #define ESCAPE_CHARACTER 0x1d
>  
> +# define CONTROL(c) ((c) ^ 0x40)

not really necessary as a define. this can just go into the function
with a comment

> +
>  static volatile sig_atomic_t received_signal = 0;
>  static char lockfile[sizeof (XEN_LOCK_DIR "/xenconsole.") + 8] = { 0 };
>  static int lockfd = -1;
> -
> +static char escapechar = ESCAPE_CHARACTER;
>  static void sighandler(int signum)
>  {
>   received_signal = 1;
> @@ -214,7 +217,7 @@ static int console_loop(int fd, struct xs_handle *xs, 
> char *pty_path,
>   char msg[60];
>  
>   len = read(STDIN_FILENO, msg, sizeof(msg));
> - if (len == 1 && msg[0] == ESCAPE_CHARACTER) {
> + if (len == 1 && msg[0] == escapechar) {
>   return 0;
>   } 
>  
> @@ -318,6 +321,14 @@ static void console_unlock(void)
>   }
>  }
>  
> +char getEscapeChar(const char *s)
> +{
> +if (*s == '^')
> +return CONTROL(toupper(s[1]));

This has the possibility to crash. Not really sure why we would want
this at all tbh. The valid range of escape characters should be "a-z A-Z
@ [ \ ] ^ _" so really we should only ever deal with 1 character and
that character needs to be in the range of:

if ( (s[0] >= 'a' && s[0] <= 'z') || (s[0] >= '@' && s[0] <= '_') )
    escapechar = s[0];
else
    tell_the_user_they_did_wrong();
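
Spelled out as a self-contained helper, that check might look like the
sketch below.  parse_escape_char() is a hypothetical name, and converting
the accepted character to its control code with "^ 0x40" is an assumption
taken from the CONTROL() macro in the quoted patch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Accept exactly one character in a-z or '@'..'_' (which spans
 * A-Z @ [ \ ] ^ _) and store the matching control character in *out.
 * Returns false, leaving *out untouched, for anything else.
 */
static bool parse_escape_char(const char *s, char *out)
{
    char c;

    if (s == NULL || s[0] == '\0' || s[1] != '\0')
        return false;              /* must be a single character */

    c = s[0];
    if (c >= 'a' && c <= 'z')
        c = (char)(c - 'a' + 'A'); /* fold to upper case */
    else if (c < '@' || c > '_')
        return false;

    *out = (char)(c ^ 0x40);       /* Ctrl-<c>, e.g. ']' gives 0x1d */
    return true;
}
```

Passing "]" yields 0x1d, which is the existing ESCAPE_CHARACTER default.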

> +
> +return *s;
> +}
> +
>  int main(int argc, char **argv)
>  {
>   struct termios attr;
> @@ -329,6 +340,7 @@ int main(int argc, char **argv)
>   struct option lopt[] = {
>   { "type", 1, 0, 't' },
>   { "num", 1, 0, 'n' },
> + { "escapechar", 1, 0, 'n' },

the last field should be 'e'
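
For comparison, a minimal standalone sketch of an option table where the
long name maps to its own short letter.  find_escape_arg() is a made-up
helper and the short-option string "ht:n:e:" is an assumption; only the
option names come from the patch.

```c
#include <getopt.h>
#include <stddef.h>

/* Returns the --escapechar / -e argument, or NULL if absent or on error. */
static const char *find_escape_arg(int argc, char **argv)
{
    static const struct option lopt[] = {
        { "type",       required_argument, NULL, 't' },
        { "num",        required_argument, NULL, 'n' },
        { "escapechar", required_argument, NULL, 'e' },
        { "help",       no_argument,       NULL, 'h' },
        { NULL, 0, NULL, 0 },
    };
    const char *escape = NULL;
    int ch;

    optind = 1;                    /* restart scanning for repeated calls */
    opterr = 0;                    /* keep getopt quiet in this sketch */

    while ((ch = getopt_long(argc, argv, "ht:n:e:", lopt, NULL)) != -1) {
        if (ch == 'e')
            escape = optarg;
        else if (ch == '?')
            return NULL;           /* unknown option or missing argument */
    }
    return escape;
}
```

With 'n' in the last field, as in the patch, --escapechar would be handled
by the "num" case and the 'e' case would never be reached from the long
option.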

>   { "help",0, 0, 'h' },
>   { 0 },
>  
> @@ -363,6 +375,11 @@ int main(int argc, char **argv)
>   exit(EINVAL);
>   }
>   break;
> + case 'e' :
> + escapechar = getEscapeChar(optarg);
> +break;

white space is off here

> +
> +
>   default:
>   fprintf(stderr, "Invalid argument\n");
>   fprintf(stderr, "Try `%s --help' for more 
> information.\n", 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 4cdc169..86ee670 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -1715,14 +1715,16 @@ static void domain_destroy_domid_cb(libxl__egc *egc,
>  }
>  
>  int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int cons_num,
> -   libxl_console_type type)
> +   libxl_console_type type, char escapechar)
>  {
> +
> +
>  GC_INIT(ctx);
>  char *p = GCSPRINTF("%s/xenconsole", libxl__private_bindir_path());
>  char *domid_s = GCSPRINTF("%d", domid);
>  char *cons_num_s = GCSPRINTF("%d", cons_num);
>  char *cons_type_s;
> -
> +char *cons_escape_char = GCSPRINTF("%c", escapechar); 
>  switch (type) {
>  case LIBXL_CONSOLE_TYPE_PV:
>  cons_type_s = "pv";
> @@ -1734,13 +1736,17 @@ int libxl_console_exec(libxl_ctx *ctx, uint32_t 
> domid, int cons_num,
>  goto out;
>  }
>  
> -execl(p, p, domid_s, "--num", cons_num_s, "--type", cons_type_s, (void 
> *)NULL);
> +   if(cons_escape_char == NULL)

this won't ever be true because of the GCSPRINTF() above. I think you
mean to check if escapechar == 0
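
Something along these lines would avoid the two near-identical execl()
calls as well.  This is only a sketch: build_console_argv() and its
parameters are hypothetical, and it just builds the argument vector without
calling exec.

```c
#include <stddef.h>

/*
 * Fill argv[] for xenconsole, appending --escapechar only when
 * escapechar is non-zero.  esc_buf must hold 2 bytes and stay valid for
 * as long as argv is used.  Returns the number of entries written, not
 * counting the NULL terminator (at most 8, so argv needs 9 slots).
 */
static size_t build_console_argv(const char *argv[10], const char *path,
                                 const char *domid_s, const char *num_s,
                                 const char *type_s, char escapechar,
                                 char esc_buf[2])
{
    size_t n = 0;

    argv[n++] = path;
    argv[n++] = domid_s;
    argv[n++] = "--num";
    argv[n++] = num_s;
    argv[n++] = "--type";
    argv[n++] = type_s;

    if (escapechar != 0) {         /* test the char, not the formatted string */
        esc_buf[0] = escapechar;
        esc_buf[1] = '\0';
        argv[n++] = "--escapechar";
        argv[n++] = esc_buf;
    }

    argv[n] = NULL;
    return n;
}
```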

> +execl(p, p, domid_s, "--num", cons_num_s, "--type", 
> cons_type_s,(void *)NULL);
> +   else
> +execl(p, p, domid_s, "--num", cons_num_s, "--type", cons_type_s, 
> "--escapechar", cons_escape_char, (void *)NULL);
>  
>  out:
>  GC_FREE;
>  return ERROR_FAIL;
>  }
>  
> +

unnecessary white space change

>  int libxl_console_get_tty(libxl_ctx *ctx, uint32_t domid, int cons_num,
>libxl_console_type type, char **path)
>  {
> @@ -1823,7 +1829,7 @@ out:
>  return rc;
>  }
>  
>

Re: [Xen-devel] [PATCH] tools: fix xen-detect to correctly identify domU type

2016-03-23 Thread Juergen Gross

On 23/03/16 12:25, Andrew Cooper wrote:
> On 23/03/16 11:18, David Vrabel wrote:
>> On 23/03/16 11:12, Andrew Cooper wrote:
>>> On 23/03/16 10:59, David Vrabel wrote:
>>>> On 23/03/16 10:55, Andrew Cooper wrote:
>>>>> On 23/03/16 10:52, Juergen Gross wrote:
>>>>>> On 23/03/16 11:32, David Vrabel wrote:
>>>>>>> On 23/03/16 10:25, Jan Beulich wrote:
>>>>>>>>>>> On 23.03.16 at 11:14,  wrote:
>>>>>>>>> 7. Report type according to features found (this is a little bit
>>>>>>>>>    ugly: we have to rely on the current hypervisor implementation
>>>>>>>>>    regarding the bits set for the different guest types).
>>>>>>>> Well, in some of the cases feature flags only make sense for one
>>>>>>>> kind of guest, so if such a flag is set it could be used as positive
>>>>>>>> indication (while it being clear may then still mean nothing).
>>>>>>>>
>>>>>>>>> Would it make sense to add another file to /sys/hypervisor/properties?
>>>>>>>>> Something like guest_type, containing "pv", "hvm" or "pvh"? If existing
>>>>>>>>> this could be used to report the guest type.
>>>>>>>> That would seem a good idea to me. What do others, namely
>>>>>>>> Linux maintainers, think?
>>>>>>> What's the use case for user space knowing if it's in a PV or HVM domain?
>>>>>> The first thing coming to my mind would be diagnostic tools.
>>>>> Having the admin able to tell for informational purposes is useful.
>> This is useful because...?
>
> Independently verifying that the guest is as expected?
>
>>
>>>>> They can find out by looking at the top of `dmesg`, but a hypervisor
>>>>> sysfs node is cleaner than requiring the admin to know every printk()
>>>>> variant that Xen puts out.
>>>>>
>>>>> That is it however.  It specifically shouldn't be used for any other
>>>>> decisions, as it isn't relevant.
>>>> I think it should be the toolstack that presents this information.
>>>>
>>>> I don't think we should add a new kernel ABI for this.
>>> A toolstack is not present in a domU.
>> So?  The guest admin doesn't need to be in the guest itself to get this
>> information -- it's right there is the xl configuration for the guest.
> 
> guest admin != host admin, and had better not have access to dom0.

David, do you agree on adding another /sys file? Or do you still think
this is not a good idea? In case you don't like it, do you have a better
alternative?


Juergen



Re: [Xen-devel] [PATCH] tools/python/xc: fix tmem_control parameter parsing

2016-03-23 Thread Konrad Rzeszutek Wilk

On Wed, Mar 23, 2016 at 01:45:37PM -0400, Zhigang Wang wrote:
> There should be 6 instead of 7 arguments now for tmem_control().
.. which was done in commit 54a51b1766fd433b95e63834eb15d4b1f70271de
"tmem: Remove xc_tmem_control mystical arg3"

but it missed the removal of an 'i'.

> 
> Signed-off-by: Zhigang Wang 

Acked-by: Konrad Rzeszutek Wilk 

> ---
>  tools/python/xen/lowlevel/xc/xc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/python/xen/lowlevel/xc/xc.c 
> b/tools/python/xen/lowlevel/xc/xc.c
> index c40a4e9..ff714d7 100644
> --- a/tools/python/xen/lowlevel/xc/xc.c
> +++ b/tools/python/xen/lowlevel/xc/xc.c
> @@ -1620,7 +1620,7 @@ static PyObject *pyxc_tmem_control(XcObject *self,
>  
>  static char *kwd_list[] = { "pool_id", "subop", "cli_id", "arg1", 
> "arg2", "buf", NULL };
>  
> -if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iis", kwd_list,
> +if ( !PyArg_ParseTupleAndKeywords(args, kwds, "is", kwd_list,
>  _id, , _id, , , ) )
>  return NULL;
>  
> -- 
> 2.5.5
> 
> 


Re: [Xen-devel] ARMv8: New board bring up hangs in kernel start?

2016-03-23 Thread Konrad Rzeszutek Wilk

On Wed, Mar 23, 2016 at 06:24:40PM +0100, Dirk Behme wrote:
> Hi,

Hey,

CC-ing the ARM MAINTAINERs.

> 
> trying to bring up Xen on a new ARMv8 64-bit Cortex A57 eval board, I get
> [1] and then its hanging there.
> 
> I'd guess that it hangs due to missing timer interrupt, maybe missing
> interrupts at all?
> 
> Any hints how to debug this? Or where to look?
> 
> It might be possible that the board's firmware (arm-trusted-firmware based)
> doesn't configure anything correctly. Firmware is running at EL3, Xen at
> EL2. The same kernel is running fine without Xen.
> 
> Using a JTAG debugger I've put breakpoints into xen/arch/arm/time.c
> timer_interrupt() & vtimer_interrupt() but these don't seem to be called at
> all (?)
> 
> Best regards
> 
> Dirk
> 
> [1]
> 
> - UART enabled -
> - CPU  booting -
> - Current EL 0008 -
> - Xen starting at EL2 -
> - Zero BSS -
> - Setting up control registers -
> - Turning on paging -
> - Ready -
> (XEN) Checking for initrd in /chosen
> (XEN) RAM: 4800 - 7fff
> (XEN)
> (XEN) MODULE[0]: 4800 - 480058a2 Device Tree
> (XEN) MODULE[1]: 4820 - 48c0 Kernel
> (XEN)
> (XEN) Command line: console=dtuart dom0_mem=512M loglvl=all
> (XEN) Placing Xen at 0x7fe0-0x8000
> (XEN) Update BOOTMOD_XEN from 4900-49112e01 =>
> 7fe0-7ff12e01
> (XEN) Domain heap initialised
> (XEN) Platform: ARMv8 Cortex A57 64-bit eval board
> (XEN) Taking dtuart configuration from /chosen/stdout-path
> (XEN) Looking for dtuart at "/soc/serial@e6e88000", options ""
>  Xen 4.7-unstable
> (XEN) Xen version 4.7-unstable (dirk@build) (aarch64-poky-linux-gcc (Linaro
> GCC 4.9-2015.03) 4.9.3 20150311 (prerelease)) debug=y Mon Mar 21 09:15:03
> CET 2016
> (XEN) Latest ChangeSet: Tue Feb 9 09:37:15 2016 +0100 git:b0a2893
> (XEN) Processor: 411fd073: "ARM Limited", variant: 0x1, part 0xd07, rev 0x3
> (XEN) 64-bit Execution:
> (XEN)   Processor Features:  
> (XEN) Exception Levels: EL3:64+32 EL2:64+32 EL1:64+32 EL0:64+32
> (XEN) Extensions: FloatingPoint AdvancedSIMD
> (XEN)   Debug Features: 10305106 
> (XEN)   Auxiliary Features:  
> (XEN)   Memory Model Features: 1124 
> (XEN)   ISA Features:  00011120 
> (XEN) 32-bit Execution:
> (XEN)   Processor Features: 0131:00011011
> (XEN) Instruction Sets: AArch32 A32 Thumb Thumb-2 Jazelle
> (XEN) Extensions: GenericTimer Security
> (XEN)   Debug Features: 03010066
> (XEN)   Auxiliary Features: 
> (XEN)   Memory Model Features: 10201105 4000 0126 02102211
> (XEN)  ISA Features: 02101110 13112111 21232042 01112131 00011142 00011121
> (XEN) Using PSCI-1.0 for SMP bringup
> (XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 16660 KHz
> (XEN) GICv2 initialization:
> (XEN) gic_dist_addr=f101
> (XEN) gic_cpu_addr=f102
> (XEN) gic_hyp_addr=f104
> (XEN) gic_vcpu_addr=f106
> (XEN) gic_maintenance_irq=25
> (XEN) GICv2: 512 lines, 8 cpus, secure (IID 0200043b).
> (XEN) Using scheduler: SMP Credit Scheduler (credit)
> (XEN) Allocated console ring of 16 KiB.
> (XEN) Brought up 1 CPUs
> (XEN) P2M: 44-bit IPA with 44-bit PA
> (XEN) P2M: 4 levels with order-0 root, VTCR 0x80043594
> (XEN) I/O virtualisation disabled
> (XEN) *** LOADING DOMAIN 0 ***
> (XEN) Loading kernel from boot module @ 4820
> (XEN) Allocating 1:1 mappings totalling 512MB for dom0:
> (XEN) BANK[0] 0x005000-0x007000 (512MB)
> (XEN) Grant table range: 0x007fe0-0x007fe5c000
> (XEN) Loading zImage from 4820 to
> 5008-50a8
> (XEN) Allocating PPI 16 for event channel interrupt
> (XEN) Loading dom0 DTB to 0x5800-0x5800568a
> (XEN) Scrubbing Free RAM on 1 nodes using 1 CPUs
> (XEN) ...done.
> (XEN) Initial low memory virq threshold set at 0x4000 pages.
> (XEN) Std. Loglevel: All
> (XEN) Guest Loglevel: All
> (XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to
> Xen)
> (XEN) Freed 288kB init memory.
> Booting Linux on physical CPU 0x0
> Linux version 4.4.0+ (dirk@build) (gcc version 4.9.3 20150311 (prerelease)
> (Linaro GCC 4.9-2015.03) ) #1 SMP PREEMPT Mon Mar 21 09:12:13 CET 2016
> Boot CPU: AArch64 Processor [411fd073]
> debug: ignoring loglevel setting.
> efi: Getting EFI parameters from FDT:
> efi: UEFI not found.
> cma: Reserved 16 MiB at 0x6f00
> On node 0 totalpages: 131072
>   DMA zone: 2048 pages used for memmap
>   DMA zone: 0 pages reserved
>   DMA zone: 131072 pages, LIFO batch:31
> psci: probing for conduit method from DT.
> psci: PSCIv0.2 detected in firmware.
> psci: Using standard PSCI v0.2 function IDs
> psci: Trusted OS migration not required
> Xen 4.7 support found
> PERCPU:

Re: [Xen-devel] Interested to participate in Outreachy Program

2016-03-23 Thread Doug Goldstein

On 3/23/16 11:34 AM, sabiya kazi wrote:
> Hi Doug,
> Can you have a look at patch and let me know if everything
> is correct, I think things are good.
> 
> I would also like to have a word with you for deciding timeline for
> project. Meantime, I have started reading stuff  about rust language.
> 
> 
> Regards,
> -Sabiya
> 
> 

Sabiya,

I'll take a look but you will definitely want to take a look at the
contributing guidelines [1] and resubmit the patch following the
guidelines so that we can include it.

[1] http://www.xenproject.org/help/contribution-guidelines.html

-- 
Doug Goldstein



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] Outreachy bite-sized tasks

2016-03-23 Thread Dario Faggioli

Hey,

please, do not use HTML for emails to this list.

On Wed, 2016-03-23 at 17:38 +0100, Paulina Szubarczyk wrote:
> Hi, 
> 
> Thank you for the proposed tasks. I would like to work on the second
> one, 
> fixing the return codes in xl.
> 
Since I've done (and mentored) some similar activity before, if you go for
this, feel free to ask me questions and/or to Cc me on the patches as
well. :-)

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




Re: [Xen-devel] [PATCH v2] xen/arm64: correctly emulate the {w, x}zr registers

2016-03-23 Thread Julien Grall


Hi Stefano,

On 22/02/16 17:38, Stefano Stabellini wrote:
> On Fri, 15 Jan 2016, Ian Campbell wrote:
>
> I read the patch and looks good to me. You can add my
>
> Reviewed-by: Stefano Stabellini 
>
> I have a couple of minor comments, which you can ignore or address as
> you commit the patch.


This patch fell through the cracks. Some compilers may generate MMIO 
access with *zr, so we need this patch in Xen 4.7 (and potentially 
backport it).


Are you fine if the patch is not respun?

Regards,

--
Julien Grall


Re: [Xen-devel] [PATCH 05/16] xen: sched: move pCPU initialization in an helper

2016-03-23 Thread George Dunlap

On 23/03/16 17:51, George Dunlap wrote:
> On 18/03/16 19:04, Dario Faggioli wrote:
>> That will turn out useful in following patches, where such
>> code will need to be called more than just once. Create an
>> helper now, and move the code there, to avoid mixing code
>> motion and functional changes later.
>>
>> In Credit2, some style cleanup is also done.
>>
>> No functional change intended.
>>
>> Signed-off-by: Dario Faggioli 
>> ---
>> Cc: George Dunlap 
>> ---
>>  xen/common/sched_credit.c  |   22 +-
>>  xen/common/sched_credit2.c |   26 --
>>  2 files changed, 29 insertions(+), 19 deletions(-)
>>
>> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
>> index d4a0f5e..4488d7c 100644
>> --- a/xen/common/sched_credit.c
>> +++ b/xen/common/sched_credit.c
>> @@ -542,16 +542,11 @@ csched_alloc_pdata(const struct scheduler *ops, int 
>> cpu)
>>  }
>>  
>>  static void
>> -csched_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
>> +init_pdata(struct csched_private *prv, struct csched_pcpu *spc, int cpu)
>>  {
>> -struct csched_private *prv = CSCHED_PRIV(ops);
>> -struct csched_pcpu * const spc = pdata;
>> -unsigned long flags;
>> -
>> -/* cpu data needs to be allocated, but STILL uninitialized */
>> -ASSERT(spc && spc->runq.next == spc->runq.prev && spc->runq.next == 
>> NULL);
>> -
>> -spin_lock_irqsave(&prv->lock, flags);
>> +ASSERT(spin_is_locked(&prv->lock));
>> +/* cpu data needs to be allocated, but STILL uninitialized. */
>> +ASSERT(spc && spc->runq.next == NULL && spc->runq.prev == NULL);
> 
> Actually, Juergen, looks like Dario already agrees with us. ;-)
> 
> Obviously this should be updated in the previous patch instead.
> 
> With that done:
> 
> Reviewed-by: George Dunlap 

Oops -- somehow still have Juergen's Fujitsu address in my addressbook...

 -G


Re: [Xen-devel] [PATCH 05/16] xen: sched: move pCPU initialization in an helper

2016-03-23 Thread George Dunlap

On 18/03/16 19:04, Dario Faggioli wrote:
> That will turn out useful in following patches, where such
> code will need to be called more than just once. Create an
> helper now, and move the code there, to avoid mixing code
> motion and functional changes later.
> 
> In Credit2, some style cleanup is also done.
> 
> No functional change intended.
> 
> Signed-off-by: Dario Faggioli 
> ---
> Cc: George Dunlap 
> ---
>  xen/common/sched_credit.c  |   22 +-
>  xen/common/sched_credit2.c |   26 --
>  2 files changed, 29 insertions(+), 19 deletions(-)
> 
> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
> index d4a0f5e..4488d7c 100644
> --- a/xen/common/sched_credit.c
> +++ b/xen/common/sched_credit.c
> @@ -542,16 +542,11 @@ csched_alloc_pdata(const struct scheduler *ops, int cpu)
>  }
>  
>  static void
> -csched_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
> +init_pdata(struct csched_private *prv, struct csched_pcpu *spc, int cpu)
>  {
> -struct csched_private *prv = CSCHED_PRIV(ops);
> -struct csched_pcpu * const spc = pdata;
> -unsigned long flags;
> -
> -/* cpu data needs to be allocated, but STILL uninitialized */
> -ASSERT(spc && spc->runq.next == spc->runq.prev && spc->runq.next == 
> NULL);
> -
> -spin_lock_irqsave(&prv->lock, flags);
> +ASSERT(spin_is_locked(&prv->lock));
> +/* cpu data needs to be allocated, but STILL uninitialized. */
> +ASSERT(spc && spc->runq.next == NULL && spc->runq.prev == NULL);

Actually, Juergen, looks like Dario already agrees with us. ;-)

Obviously this should be updated in the previous patch instead.

With that done:

Reviewed-by: George Dunlap 



[Xen-devel] [PATCH] tools/python/xc: fix tmem_control parameter parsing

2016-03-23 Thread Zhigang Wang

There should be 6 instead of 7 arguments now for tmem_control().

Signed-off-by: Zhigang Wang 
---
 tools/python/xen/lowlevel/xc/xc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/python/xen/lowlevel/xc/xc.c 
b/tools/python/xen/lowlevel/xc/xc.c
index c40a4e9..ff714d7 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -1620,7 +1620,7 @@ static PyObject *pyxc_tmem_control(XcObject *self,
 
 static char *kwd_list[] = { "pool_id", "subop", "cli_id", "arg1", "arg2", 
"buf", NULL };
 
-if ( !PyArg_ParseTupleAndKeywords(args, kwds, "iis", kwd_list,
+if ( !PyArg_ParseTupleAndKeywords(args, kwds, "is", kwd_list,
 &pool_id, &subop, &cli_id, &arg1, &arg2, &buf) )
 return NULL;
 
-- 
2.5.5



Re: [Xen-devel] [PATCH 04/16] xen: sched: implement .init_pdata in all schedulers

2016-03-23 Thread George Dunlap

On 22/03/16 08:03, Juergen Gross wrote:
> On 18/03/16 20:04, Dario Faggioli wrote:
>> by borrowing some of the code of .alloc_pdata, i.e.,
>> the bits that perform initializations, leaving only
>> actual allocations in there, when any, which is the
>> case for Credit1 and RTDS.
>>
>> On the other hand, in Credit2, since we don't really
>> need any per-pCPU data allocation, everything that was
>> being done in .alloc_pdata, is now done in .init_pdata.
>> And the fact that now .alloc_pdata can be left undefined,
>> allows us to just get rid of it.
>>
>> Still for Credit2, the fact that .init_pdata is called
>> during CPU_STARTING (rather than CPU_UP_PREPARE) kills
>> the need for the scheduler to setup a similar callback
>> itself, simplifying the code.
>>
>> And thanks to such simplification, it is now also ok to
>> move some of the logic meant at double checking that a
>> cpu was (or was not) initialized, into ASSERTS (rather
>> than an if() and a BUG_ON).
>>
>> Signed-off-by: Dario Faggioli 
>> ---
>> Cc: George Dunlap 
>> Cc: Meng Xu 
>> Cc: Juergen Gross 
>> ---
>>  xen/common/sched_credit.c  |   20 +---
>>  xen/common/sched_credit2.c |   72 
>> +++-
>>  xen/common/sched_rt.c  |9 --
>>  3 files changed, 26 insertions(+), 75 deletions(-)
>>
>> diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
>> index 288749f..d4a0f5e 100644
>> --- a/xen/common/sched_credit.c
>> +++ b/xen/common/sched_credit.c
>> @@ -526,8 +526,6 @@ static void *
>>  csched_alloc_pdata(const struct scheduler *ops, int cpu)
>>  {
>>  struct csched_pcpu *spc;
>> -struct csched_private *prv = CSCHED_PRIV(ops);
>> -unsigned long flags;
>>  
>>  /* Allocate per-PCPU info */
>>  spc = xzalloc(struct csched_pcpu);
>> @@ -540,6 +538,19 @@ csched_alloc_pdata(const struct scheduler *ops, int cpu)
>>  return ERR_PTR(-ENOMEM);
>>  }
>>  
>> +return spc;
>> +}
>> +
>> +static void
>> +csched_init_pdata(const struct scheduler *ops, void *pdata, int cpu)
>> +{
>> +struct csched_private *prv = CSCHED_PRIV(ops);
>> +struct csched_pcpu * const spc = pdata;
>> +unsigned long flags;
>> +
>> +/* cpu data needs to be allocated, but STILL uninitialized */
>> +ASSERT(spc && spc->runq.next == spc->runq.prev && spc->runq.next == 
>> NULL);
> 
> This looks weird. I'd prefer:
> 
> ASSERT(spc && spc->runq.next == NULL && spc->runq.prev == NULL);

I prefer Juergen's suggestion too.  I wouldn't say it's worth respinning
over, but since you have to make adjustments to the previous patch
anyway, you might as well change this while you're at it.
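
For reference, the two spellings of the check are logically equivalent; the suggested one is simply easier to read. A minimal sketch that verifies this over NULL and non-NULL pointer combinations follows. The struct demo_list type and the function names are illustrative assumptions, not Xen code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the runq list head used in the ASSERT. */
struct demo_list {
    struct demo_list *next, *prev;
};

/* Form used in the patch as posted. */
static bool uninit_original(const struct demo_list *l)
{
    return l->next == l->prev && l->next == NULL;
}

/* Form suggested in the review. */
static bool uninit_suggested(const struct demo_list *l)
{
    return l->next == NULL && l->prev == NULL;
}

/* Returns true if both forms agree for every combination of test pointers. */
static bool forms_agree(void)
{
    struct demo_list a, b;
    struct demo_list *vals[] = { NULL, &a, &b };

    for ( size_t i = 0; i < 3; i++ )
        for ( size_t j = 0; j < 3; j++ )
        {
            struct demo_list l = { vals[i], vals[j] };

            if ( uninit_original(&l) != uninit_suggested(&l) )
                return false;
        }

    return true;
}
```

Both predicates are true only when next and prev are both NULL, so the change is purely cosmetic.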

With that change:

Reviewed-by: George Dunlap 



[Xen-devel] Building tools for ARM (WAS Re: help)

2016-03-23 Thread Julien Grall


(CC xen-user, BCC xen-devel)

On 23/03/16 10:23, Marwa Hamza wrote:

> hello


Hello,

Please use a more meaningful subject. Also, the question is not related 
to development, so the thread has been moved to xen-users.



> I'm trying to learn more about the Xen hypervisor. I installed Xen on my
> host with Alpine as a domU, and now I'm trying to build Xen from source
> with a Linux dom0 for an ARM board. I'm a little confused about building
> Xen from source. Here's what I did.
>
> I built Xen from source:
>
> git clone git://xenbits.xen.org/xen.git
>
> make dist-xen XEN_TARGET_ARCH=arm32 CROSS_COMPILE=arm-linux-gnueabihf- CONFIG_EARLY_PRINTK=omap5432
>
> Then I downloaded the Linux kernel:
>
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>
> I configured and compiled it successfully.
>
> On my SD card I have u-boot.img, MLO, zimage, xenuimage and the Ubuntu
> file system. It worked fine after some problems. Now I'm trying to
> install Linux as a domU.
>
> When I run xl list, the output is "no command found". It looks like I
> need to install Xen, but I don't know how. I'm really confused: where
> should I install it, and how? Can anybody help me?


You will need to compile and install the tools on the board (see [1]).

Regards,

[1] http://wiki.xenproject.org/wiki/Compiling_Xen_From_Source

--
Julien Grall


Re: [Xen-devel] [PATCH 03/16] xen: sched: make implementing .alloc_pdata optional

2016-03-23 Thread George Dunlap

On 18/03/16 19:04, Dario Faggioli wrote:
> The .alloc_pdata scheduler hook must, before this change,
> be implemented by all schedulers --even those ones that
> don't need to allocate anything.
> 
> Make it possible to just use the SCHED_OP(), like for
> the other hooks, by using ERR_PTR() and IS_ERR() for
> error reporting. This:
>  - makes NULL a variant of success;
>  - allows for errors other than ENOMEM to be properly
>communicated (if ever necessary).
> 
> This, in turn, means that schedulers not needing to
> allocate any per-pCPU data, can avoid implementing the
> hook. In fact, the artificial implementation of
> .alloc_pdata in the ARINC653 is removed (and, while there,
> nuke .free_pdata too, as it is equally useless).
> 
> Signed-off-by: Dario Faggioli 

With the xfree issue Juergen pointed out fixed, this looks good to me.

 -George
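
For readers unfamiliar with the idiom this patch relies on, here is a minimal sketch of ERR_PTR()/IS_ERR()/PTR_ERR()-style error reporting, in which NULL counts as success and error codes are encoded in the pointer value. It is modelled on the well-known Linux-style helpers rather than copied from xen/err.h; MAX_ERRNO and demo_alloc_pdata() are assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed limit: the top MAX_ERRNO addresses are reserved for error codes. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True only for pointers in the reserved error range; NULL is not an error. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/*
 * Hypothetical allocation hook: mode 0 means "nothing to allocate" (returns
 * NULL, which is a success), mode 1 performs a real allocation, and any
 * other value simulates an out-of-memory failure.
 */
static void *demo_alloc_pdata(int mode)
{
    if ( mode == 0 )
        return NULL;

    if ( mode == 1 )
    {
        void *p = calloc(1, 16);

        return p ? p : ERR_PTR(-ENOMEM);
    }

    return ERR_PTR(-ENOMEM);
}
```

A caller checks IS_ERR() on the result instead of comparing against NULL, which is what lets a scheduler with no per-pCPU data skip the hook entirely.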



Re: [Xen-devel] [PATCH 03/16] xen: sched: make implementing .alloc_pdata optional

2016-03-23 Thread George Dunlap

On 21/03/16 14:22, Jan Beulich wrote:
 On 18.03.16 at 20:04,  wrote:
>> --- a/xen/include/xen/sched-if.h
>> +++ b/xen/include/xen/sched-if.h
>> @@ -9,6 +9,7 @@
>>  #define __XEN_SCHED_IF_H__
>>  
>>  #include 
>> +#include <xen/err.h>
>>  
>>  /* A global pointer to the initial cpupool (POOL0). */
>>  extern struct cpupool *cpupool0;
> 
> There is no visible use in this header of what err.h defines - why
> does it get included all of a sudden?

I'm guessing it's so that all the files that use the scheduler interface
automatically get IS_ERR and PTR_ERR without having to include xen/err.h
directly.

But of course that means files like sched_arinc653.c and sched_credit2.c
end up including xen/err.h even though they don't use those macros.
Would you prefer the other files include it directly instead?

 -George


Re: [Xen-devel] [PATCH 02/16] xen: sched: add .init_pdata hook to the scheduler interface

2016-03-23 Thread George Dunlap

On 18/03/16 19:04, Dario Faggioli wrote:
> with the purpose of decoupling the allocation phase and
> the initialization one, for per-pCPU data of the schedulers.
> 
> This makes it possible to perform the initialization later
> in the pCPU bringup/assignment process, when more information
> (for instance, the host CPU topology) is available. This,
> for now, is important only for Credit2, but it can well be
> useful to other schedulers.
> 
> Signed-off-by: Dario Faggioli 

Reviewed-by: George Dunlap 

Just one note -- I found this patch harder to review than necessary, I
think, because it implemented the callback but nobody was using it.  I
had to keep switching back and forth between the patches to find out
what was going on.  I personally would have folded patches 2 and 4 together.

(Just to be clear, no action necessary.)
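
As a rough illustration of the split described in the commit message, the sketch below separates allocation (done early) from initialization (done once the CPU is starting) and treats both hooks as optional. All names here (struct demo_sched, demo_cpu_prepare(), demo_cpu_starting()) are hypothetical; Xen's real SCHED_OP() macro and scheduler interface differ in detail.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-pCPU scheduler data. */
struct demo_pcpu {
    int cpu;
    int initialised;
};

/* Hypothetical scheduler interface; either hook may be left NULL. */
struct demo_sched {
    void *(*alloc_pdata)(int cpu);
    void (*init_pdata)(void *pdata, int cpu);
};

/* Allocation only: returns zeroed, not yet initialised data. */
static void *demo_alloc(int cpu)
{
    (void)cpu;
    return calloc(1, sizeof(struct demo_pcpu));
}

/* Initialisation only: may rely on information available later in bringup. */
static void demo_init(void *pdata, int cpu)
{
    struct demo_pcpu *spc = pdata;

    spc->cpu = cpu;
    spc->initialised = 1;
}

/* Phase 1 (think CPU_UP_PREPARE): allocate if the scheduler wants to. */
static void *demo_cpu_prepare(const struct demo_sched *s, int cpu)
{
    return s->alloc_pdata ? s->alloc_pdata(cpu) : NULL;
}

/* Phase 2 (think CPU_STARTING): initialise if the scheduler wants to. */
static void demo_cpu_starting(const struct demo_sched *s, void *pdata, int cpu)
{
    if ( s->init_pdata )
        s->init_pdata(pdata, cpu);
}
```

A scheduler that needs no per-pCPU data simply leaves both hooks NULL and both phases become no-ops.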

> ---
> Cc: George Dunlap 
> Cc: Juergen Gross 
> ---
> Changes from v1:
>  * in schedule_cpu_switch(), call to init_pdata() moved up,
>close to the call to alloc_pdata() (for consistency with
>other call sites) and prototype slightly changed.
> ---
> During v1 review, it was agreed to add ASSERTS() and comments
> to clarify the use of schedule_cpu_switch(). This can't be
> found here, but only because it has happened in another patch.
> ---
>  xen/common/schedule.c  |7 +++
>  xen/include/xen/sched-if.h |1 +
>  2 files changed, 8 insertions(+)
> 
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index e57b659..0627eb5 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1517,10 +1517,15 @@ static int cpu_schedule_callback(
>  struct notifier_block *nfb, unsigned long action, void *hcpu)
>  {
>  unsigned int cpu = (unsigned long)hcpu;
> +struct scheduler *sched = per_cpu(scheduler, cpu);
> +struct schedule_data *sd = &per_cpu(schedule_data, cpu);
>  int rc = 0;
>  
>  switch ( action )
>  {
> +case CPU_STARTING:
> +SCHED_OP(sched, init_pdata, sd->sched_priv, cpu);
> +break;
>  case CPU_UP_PREPARE:
>  rc = cpu_schedule_up(cpu);
>  break;
> @@ -1597,6 +1602,7 @@ void __init scheduler_init(void)
>  if ( ops.alloc_pdata &&
>   !(this_cpu(schedule_data).sched_priv = ops.alloc_pdata(&ops, 0)) )
>  BUG();
> +SCHED_OP(&ops, init_pdata, this_cpu(schedule_data).sched_priv, 0);
>  }
>  
>  /*
> @@ -1640,6 +1646,7 @@ int schedule_cpu_switch(unsigned int cpu, struct 
> cpupool *c)
>  ppriv = SCHED_OP(new_ops, alloc_pdata, cpu);
>  if ( ppriv == NULL )
>  return -ENOMEM;
> +SCHED_OP(new_ops, init_pdata, ppriv, cpu);
>  vpriv = SCHED_OP(new_ops, alloc_vdata, idle, idle->domain->sched_priv);
>  if ( vpriv == NULL )
>  {
> diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
> index 825f1ad..70c08c6 100644
> --- a/xen/include/xen/sched-if.h
> +++ b/xen/include/xen/sched-if.h
> @@ -133,6 +133,7 @@ struct scheduler {
>  void *);
>  void (*free_pdata) (const struct scheduler *, void *, int);
>  void *   (*alloc_pdata)(const struct scheduler *, int);
> +void (*init_pdata) (const struct scheduler *, void *, int);
>  void (*free_domdata)   (const struct scheduler *, void *);
>  void *   (*alloc_domdata)  (const struct scheduler *, struct domain 
> *);
>  
> 



[Xen-devel] ARMv8: New board bring up hangs in kernel start?

2016-03-23 Thread Dirk Behme


Hi,

Trying to bring up Xen on a new ARMv8 64-bit Cortex A57 eval board, I 
get [1] and then it hangs there.


I'd guess that it hangs due to a missing timer interrupt, or maybe no 
interrupts at all?


Any hints how to debug this? Or where to look?

It is possible that the board's firmware (arm-trusted-firmware 
based) doesn't configure something correctly. Firmware is running at 
EL3, Xen at EL2. The same kernel is running fine without Xen.


Using a JTAG debugger I've put breakpoints into xen/arch/arm/time.c 
timer_interrupt() & vtimer_interrupt() but these don't seem to be 
called at all (?)


Best regards

Dirk

[1]

- UART enabled -
- CPU  booting -
- Current EL 0008 -
- Xen starting at EL2 -
- Zero BSS -
- Setting up control registers -
- Turning on paging -
- Ready -
(XEN) Checking for initrd in /chosen
(XEN) RAM: 4800 - 7fff
(XEN)
(XEN) MODULE[0]: 4800 - 480058a2 Device Tree
(XEN) MODULE[1]: 4820 - 48c0 Kernel
(XEN)
(XEN) Command line: console=dtuart dom0_mem=512M loglvl=all
(XEN) Placing Xen at 0x7fe0-0x8000
(XEN) Update BOOTMOD_XEN from 4900-49112e01 => 
7fe0-7ff12e01

(XEN) Domain heap initialised
(XEN) Platform: ARMv8 Cortex A57 64-bit eval board
(XEN) Taking dtuart configuration from /chosen/stdout-path
(XEN) Looking for dtuart at "/soc/serial@e6e88000", options ""
 Xen 4.7-unstable
(XEN) Xen version 4.7-unstable (dirk@build) (aarch64-poky-linux-gcc 
(Linaro GCC 4.9-2015.03) 4.9.3 20150311 (prerelease)) debug=y Mon Mar 
21 09:15:03 CET 2016

(XEN) Latest ChangeSet: Tue Feb 9 09:37:15 2016 +0100 git:b0a2893
(XEN) Processor: 411fd073: "ARM Limited", variant: 0x1, part 0xd07, 
rev 0x3

(XEN) 64-bit Execution:
(XEN)   Processor Features:  
(XEN) Exception Levels: EL3:64+32 EL2:64+32 EL1:64+32 EL0:64+32
(XEN) Extensions: FloatingPoint AdvancedSIMD
(XEN)   Debug Features: 10305106 
(XEN)   Auxiliary Features:  
(XEN)   Memory Model Features: 1124 
(XEN)   ISA Features:  00011120 
(XEN) 32-bit Execution:
(XEN)   Processor Features: 0131:00011011
(XEN) Instruction Sets: AArch32 A32 Thumb Thumb-2 Jazelle
(XEN) Extensions: GenericTimer Security
(XEN)   Debug Features: 03010066
(XEN)   Auxiliary Features: 
(XEN)   Memory Model Features: 10201105 4000 0126 02102211
(XEN)  ISA Features: 02101110 13112111 21232042 01112131 00011142 00011121
(XEN) Using PSCI-1.0 for SMP bringup
(XEN) Generic Timer IRQ: phys=30 hyp=26 virt=27 Freq: 16660 KHz
(XEN) GICv2 initialization:
(XEN) gic_dist_addr=f101
(XEN) gic_cpu_addr=f102
(XEN) gic_hyp_addr=f104
(XEN) gic_vcpu_addr=f106
(XEN) gic_maintenance_irq=25
(XEN) GICv2: 512 lines, 8 cpus, secure (IID 0200043b).
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Allocated console ring of 16 KiB.
(XEN) Brought up 1 CPUs
(XEN) P2M: 44-bit IPA with 44-bit PA
(XEN) P2M: 4 levels with order-0 root, VTCR 0x80043594
(XEN) I/O virtualisation disabled
(XEN) *** LOADING DOMAIN 0 ***
(XEN) Loading kernel from boot module @ 4820
(XEN) Allocating 1:1 mappings totalling 512MB for dom0:
(XEN) BANK[0] 0x005000-0x007000 (512MB)
(XEN) Grant table range: 0x007fe0-0x007fe5c000
(XEN) Loading zImage from 4820 to 
5008-50a8

(XEN) Allocating PPI 16 for event channel interrupt
(XEN) Loading dom0 DTB to 0x5800-0x5800568a
(XEN) Scrubbing Free RAM on 1 nodes using 1 CPUs
(XEN) ...done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch 
input to Xen)

(XEN) Freed 288kB init memory.
Booting Linux on physical CPU 0x0
Linux version 4.4.0+ (dirk@build) (gcc version 4.9.3 20150311 
(prerelease) (Linaro GCC 4.9-2015.03) ) #1 SMP PREEMPT Mon Mar 21 
09:12:13 CET 2016

Boot CPU: AArch64 Processor [411fd073]
debug: ignoring loglevel setting.
efi: Getting EFI parameters from FDT:
efi: UEFI not found.
cma: Reserved 16 MiB at 0x6f00
On node 0 totalpages: 131072
  DMA zone: 2048 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 131072 pages, LIFO batch:31
psci: probing for conduit method from DT.
psci: PSCIv0.2 detected in firmware.
psci: Using standard PSCI v0.2 function IDs
psci: Trusted OS migration not required
Xen 4.7 support found
PERCPU: Embedded 20 pages/cpu @ffc01efc9000 s42112 r8192 d31616 u81920
pcpu-alloc: s42112 r8192 d31616 u81920 alloc=20*4096
pcpu-alloc: [0] 0
Detected PIPT I-cache on CPU0
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 
129024

Kernel command line: console=hvc0 ignore_loglevel
PID hash

Re: [Xen-devel] Call for nominations for new Hypervisor subproject maintainers and committers

2016-03-23 Thread Lars Kurth

Hi everyone,

I just wanted to let you know that this hasn't dropped off my radar. We do have 
a shortlist of people now, and I will need to reach out to people who were 
nominated (and may not know they were). However due to Easter, the time-frame 
may slip by a week.


> On 24 Feb 2016, at 18:51, Lars Kurth  wrote:
> 
> Dear Community members,
> 
> I wanted to inform you that both Keir Fraser and Tim Deegan, have 
> formally stepped down in their roles as committers from the Hypervisor 
> team. In addition, you may have seen that Ian Campbell recently 
> transferred maintainer-ship for many components to other community 
> members (see http://bit.ly/1RnM8JP). This means that Ian will take a much 
> less active role within the project in the future. 
> 
> First and foremost, the remaining committers and the Xen Project 
> Advisory Board would like to thank Keir, Tim and Ian for serving the
> Xen Project community and their manifold and diverse contributions.
> 
> Given, that as a project, we have found it difficult to promote 
> contributors to maintainer and committer roles in the past, the remaining 
> group of committers felt that we should use a more formal appointment 
> process to successions and succession planning and have asked me to 
> organise this process. Taking a longer term view, the committers also 
> felt that we should not restrict the appointment process to replacing
> committer positions only, but to consider additional committer positions 
> based on merit and to also include new maintainer nominations.
> 
> Thus, to fill these positions, we are soliciting nominations. To nominate
> yourself or someone else within the community, please send e-mail to 
> appointme...@xenproject.org with one of the following subject lines: 
> - "Maintainer Nomination of [name]"
> - "Committer Nomination of [name]"
> 
> Nominees will of course be asked, privately, whether they would be 
> willing to serve, if they have been nominated by someone else.
> 
> Please provide contact details (at least the full name and e-mail address 
> of nominee) in the body of the e-mail and describe why the nominee would 
> be a good fit as a maintainer and/or committer. The body of the nomination 
> should list technical knowledge that is needed to be a maintainer and/or 
> committer and highlight core areas of expertise. In addition, we are also 
> interested in specific instances, where the nominee showed communication 
> and open source leadership qualities. 
> 
> For example:
> * Was able to help resolve disagreements, both technical and non-
>  technical, which you were a party to or observer of.
> * Was able to contribute to improve quality and architectural consistency 
>  across several components within the Hypervisor
> * Has been involved in coordinating the activities of several community 
>  members 
> * Has led or driven technical initiatives or larger scale feature 
>  development within the community 
> * Has mentored and encouraged newcomers to the community
> * Has represented the project or aspects of it (e.g. via talks, blog 
>  posts, ...)
> * Has shown other communication and open source leadership qualities
> 
> Being a maintainer and/or committer does require a time commitment. 
> Nominees should be able to follow e-mail discussions on xen-devel@ on an 
> ongoing basis and respond within a couple of days so that discussions 
> progress. Committers should ideally be able to spend a minimum of 4-5 
> days working on the project per month. For maintainers, the time 
> requirement is likely less. 
> 
> We anticipate starting our selection process according to the following 
> rough time-table. 
> 
> Today:Public call for nominations for new committers and maintainers 
>  (self-nominations and 3rd-party nominations both welcome)
> 
> March 11: Closing date for nominations
> 

We are here and now have a short-list. 

> up to
> March 30: We email non-self-nominated nominees in private to ask them 
>  to confirm whether they are willing to act as nominees. 
>  We will also discuss with all nominees, time commitment and 
>  other possible questions (from both nominees and existing
>  committers), related to the proposed nominations.

I will start this after Easter Monday.

> March 30: We conduct a formal vote to ratify nominations
> 
> April 6:  We publish the new maintainers and committers
>  (we may do this earlier)

I expect that this will slip by a week, as some people will be on vacation due 
to Easter.

Best Regards
Lars

[Xen-devel] x86/vMSI-X emulation issue

2016-03-23 Thread Jan Beulich

All,

so I've just learned that Windows (at least some versions and some
of their code paths) use REP MOVSD to read/write the MSI-X table.
The way at least msixtbl_write() works is not compatible with this
(msixtbl_read() also seems affected, albeit to a lesser degree), and
apparently it just worked by accident until the XSA-120 and 128-131
and follow-up changes - most notably commit ad28e42bd1 ("x86/MSI:
track host and guest masking separately"), as without the call to
guest_mask_msi_irq() interrupts won't ever get unmasked.

The problem with emulating REP MOVSD is that msixtbl_write()
intentionally returns X86EMUL_UNHANDLEABLE on all writes to
words 0, 1, and 2. When in the process of emulating multiple
writes, we therefore hand the entire batch of 3 or 4 writes to qemu,
and the hypervisor doesn't get to see any other than the initial
iteration.

Now I see a couple of possible solutions, but none of them look
really neat, hence I'm seeking a second opinion (including, of
course, further alternative ideas):

1) Introduce another X86EMUL_* like status that's not really to be
used by the emulator itself, but only by the two vMSI-X functions
to indicate to their caller that prior to forwarding the request it
should be chopped to a single repetition.

2) Do aforementioned chopping automatically on seeing
X86EMUL_UNHANDLEABLE, on the basis that the .check
handler had indicated that the full range was acceptable. That
would at once cover other similarly undesirable cases like the
vLAPIC code returning this error. However, any stdvga like
emulated device would clearly not want such to happen, and
would instead prefer the entire batch to get forwarded in one
go (stdvga itself sits on a different path). Otoh, with the
devices we have currently, this would seem to be the least
intrusive solution.

3) Have emulation backends provide some kind of (static) flag
indicating which forwarding behavior they would like.

4) Expose the full ioreq to the emulation backends, so they can
fiddle with the request to their liking.
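
To make option 2 concrete, here is a rough sketch of chopping a repeated access into single repetitions whenever the internal handler declines it, so that later repetitions still reach the hypervisor's handler. Everything in it is a simplified assumption: enum demo_rc stands in for the X86EMUL_* codes, struct demo_ioreq for the real ioreq, and demo_msixtbl_write() only models "decline words 0, 1 and 2 of a 16-byte table entry".

```c
#include <stdint.h>

/* Hypothetical status codes standing in for the X86EMUL_* values. */
enum demo_rc { DEMO_OKAY, DEMO_UNHANDLEABLE };

/* Hypothetical, heavily simplified stand-in for a repeated I/O request. */
struct demo_ioreq {
    uint64_t addr;   /* address of the first access */
    uint32_t size;   /* bytes per access */
    uint32_t count;  /* number of repetitions */
};

/* Models a handler that declines writes to words 0-2 of a 16-byte entry. */
static enum demo_rc demo_msixtbl_write(uint64_t addr)
{
    return ((addr & 15) < 12) ? DEMO_UNHANDLEABLE : DEMO_OKAY;
}

/*
 * Process the batch one repetition at a time. A declined repetition is
 * forwarded on its own (counted in *forwarded) instead of taking the rest
 * of the batch with it. Returns the number of repetitions handled
 * internally.
 */
static unsigned int demo_process_rep(struct demo_ioreq req,
                                     unsigned int *forwarded)
{
    unsigned int internal = 0;

    *forwarded = 0;
    while ( req.count > 0 )
    {
        if ( demo_msixtbl_write(req.addr) == DEMO_UNHANDLEABLE )
            ++*forwarded;   /* chop: forward exactly one repetition */
        else
            ++internal;     /* the internal handler saw this one */

        req.addr += req.size;
        req.count--;
    }

    return internal;
}
```

With a four-dword write covering one table entry, three repetitions are forwarded individually and the fourth (the vector control word) is seen by the internal handler, which is the behaviour the current whole-batch forwarding loses.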

Thanks, Jan



[Xen-devel] [PATCH v4 25/26] tools/libxc: Use featuresets rather than guesswork

2016-03-23 Thread Andrew Cooper

It is conceptually wrong to base a VM's featureset on the features visible to
the toolstack which happens to construct it.

Instead, the featureset used is either an explicit one passed by the
toolstack, or the default which Xen believes it can give to the guest.

Collect all the feature manipulation into a single function which adjusts the
featureset, and perform deep dependency removal.
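
As an illustration of what "deep dependency removal" means, the sketch below treats a featureset as an array of 32-bit words and, for every feature that is absent, also clears everything that transitively depends on it. The feature indices, DEMO_NR_WORDS and demo_deep_deps() are hypothetical; only the bitmaskof()/featureword_of() macros mirror the ones in the diff, and the real code obtains the dependency masks from the autogenerated data instead.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_WORDS 2   /* assumed featureset length for this sketch */

#define bitmaskof(idx)      (1u << ((idx) & 31))
#define featureword_of(idx) ((idx) >> 5)

/* Hypothetical feature indices: B and C both depend on A. */
enum { DEMO_FEAT_A = 3, DEMO_FEAT_B = 35, DEMO_FEAT_C = 40 };

static int demo_test_feature(unsigned int idx, const uint32_t *fs)
{
    return !!(fs[featureword_of(idx)] & bitmaskof(idx));
}

/*
 * Hypothetical lookup: returns a mask of every feature that transitively
 * depends on idx, or NULL if nothing depends on it.
 */
static const uint32_t *demo_deep_deps(unsigned int idx)
{
    static const uint32_t deps_of_a[DEMO_NR_WORDS] = {
        0, (1u << (DEMO_FEAT_B & 31)) | (1u << (DEMO_FEAT_C & 31)),
    };

    return idx == DEMO_FEAT_A ? deps_of_a : NULL;
}

/* For each absent feature that has dependants, strip the dependants too. */
static void demo_sanitise(uint32_t *fs)
{
    for ( unsigned int idx = 0; idx < DEMO_NR_WORDS * 32; idx++ )
    {
        const uint32_t *dfs;

        if ( demo_test_feature(idx, fs) || !(dfs = demo_deep_deps(idx)) )
            continue;

        for ( size_t w = 0; w < DEMO_NR_WORDS; w++ )
            fs[w] &= ~dfs[w];
    }
}
```

Because the dependency masks are already transitive, a single pass over the features is enough; no fixed-point iteration is needed.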

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

v2:
 * Join several related patches together.
v3:
 * Correctly adjust HTT/CMP_LEGACY in the policy.  PV guests see host details,
   so get the host features.  HVM guests have their vcpu topology presented in
   an HTT compatible manner (even if it ends up reporting 1 cpu), so have
   CMP_LEGACY unconditionally cleared.
---
 tools/libxc/xc_cpuid_x86.c | 356 +
 1 file changed, 137 insertions(+), 219 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index a92f5e4..fc7e20a 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -21,7 +21,9 @@
 
 #include 
 #include 
+#include 
 #include "xc_private.h"
+#include "xc_bitops.h"
 #include 
 
 enum {
@@ -31,12 +33,14 @@ enum {
 #include "_xc_cpuid_autogen.h"
 
 #define bitmaskof(idx)  (1u << ((idx) & 31))
-#define clear_bit(idx, dst) ((dst) &= ~bitmaskof(idx))
-#define set_bit(idx, dst)   ((dst) |=  bitmaskof(idx))
+#define featureword_of(idx) ((idx) >> 5)
+#define clear_feature(idx, dst) ((dst) &= ~bitmaskof(idx))
+#define set_feature(idx, dst)   ((dst) |=  bitmaskof(idx))
 
 #define DEF_MAX_BASE 0x000du
 #define DEF_MAX_INTELEXT  0x8008u
 #define DEF_MAX_AMDEXT0x801cu
+#define COMMON_1D CPUID_COMMON_1D_FEATURES
 
 int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps)
 {
@@ -322,37 +326,6 @@ static void amd_xc_cpuid_policy(xc_interface *xch,
 regs[0] = DEF_MAX_AMDEXT;
 break;
 
-case 0x8001: {
-if ( !info->pae )
-clear_bit(X86_FEATURE_PAE, regs[3]);
-
-/* Filter all other features according to a whitelist. */
-regs[2] &= (bitmaskof(X86_FEATURE_LAHF_LM) |
-bitmaskof(X86_FEATURE_CMP_LEGACY) |
-(info->nestedhvm ? bitmaskof(X86_FEATURE_SVM) : 0) |
-bitmaskof(X86_FEATURE_CR8_LEGACY) |
-bitmaskof(X86_FEATURE_ABM) |
-bitmaskof(X86_FEATURE_SSE4A) |
-bitmaskof(X86_FEATURE_MISALIGNSSE) |
-bitmaskof(X86_FEATURE_3DNOWPREFETCH) |
-bitmaskof(X86_FEATURE_OSVW) |
-bitmaskof(X86_FEATURE_XOP) |
-bitmaskof(X86_FEATURE_LWP) |
-bitmaskof(X86_FEATURE_FMA4) |
-bitmaskof(X86_FEATURE_TBM) |
-bitmaskof(X86_FEATURE_DBEXT));
-regs[3] &= (0x0183f3ff | /* features shared with 0x0001:EDX */
-bitmaskof(X86_FEATURE_NX) |
-bitmaskof(X86_FEATURE_LM) |
-bitmaskof(X86_FEATURE_PAGE1GB) |
-bitmaskof(X86_FEATURE_SYSCALL) |
-bitmaskof(X86_FEATURE_MMXEXT) |
-bitmaskof(X86_FEATURE_FFXSR) |
-bitmaskof(X86_FEATURE_3DNOW) |
-bitmaskof(X86_FEATURE_3DNOWEXT));
-break;
-}
-
 case 0x8008:
 /*
  * ECX[15:12] is ApicIdCoreSize: ECX[7:0] is NumberOfCores (minus one).
@@ -399,12 +372,6 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
 {
 switch ( input[0] )
 {
-case 0x0001:
-/* ECX[5] is availability of VMX */
-if ( info->nestedhvm )
-set_bit(X86_FEATURE_VMX, regs[2]);
-break;
-
 case 0x0004:
 /*
  * EAX[31:26] is Maximum Cores Per Package (minus one).
@@ -420,19 +387,6 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
 regs[0] = DEF_MAX_INTELEXT;
 break;
 
-case 0x8001: {
-/* Only a few features are advertised in Intel's 0x8001. */
-regs[2] &= (bitmaskof(X86_FEATURE_LAHF_LM) |
-bitmaskof(X86_FEATURE_3DNOWPREFETCH) |
-bitmaskof(X86_FEATURE_ABM));
-regs[3] &= (bitmaskof(X86_FEATURE_NX) |
-bitmaskof(X86_FEATURE_LM) |
-bitmaskof(X86_FEATURE_PAGE1GB) |
-bitmaskof(X86_FEATURE_SYSCALL) |
-bitmaskof(X86_FEATURE_RDTSCP));
-break;
-}
-
 case 0x8005:
 regs[0] = regs[1] = regs[2] = 0;
 break;
@@ -444,10 +398,6 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
 }
 }
 
-#define XSAVEOPT (1 << 0)
-#define XSAVEC  (1 << 1)
-#define XGETBV1 (1 << 2)
-#define XSAVES  (1 << 3)
 /* Configure extended state enumeration leaves

[Xen-devel] [PATCH v4 22/26] tools/libxc: Expose the automatically generated cpu featuremask information

2016-03-23 Thread Andrew Cooper

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

New in v2
---
 tools/libxc/Makefile  |  9 ++
 tools/libxc/include/xenctrl.h | 14 
 tools/libxc/xc_cpuid_x86.c| 75 +++
 3 files changed, 98 insertions(+)

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 608404f..ef02c9d 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -145,6 +145,15 @@ $(eval $(genpath-target))
 
 xc_private.h: _paths.h
 
+ifeq ($(CONFIG_X86),y)
+
+_xc_cpuid_autogen.h: $(XEN_ROOT)/xen/include/public/arch-x86/cpufeatureset.h $(XEN_ROOT)/xen/tools/gen-cpuid.py
+   $(PYTHON) $(XEN_ROOT)/xen/tools/gen-cpuid.py -i $^ -o $@.new
+   $(call move-if-changed,$@.new,$@)
+
+build: _xc_cpuid_autogen.h
+endif
+
 $(CTRL_LIB_OBJS) $(GUEST_LIB_OBJS) \
 $(CTRL_PIC_OBJS) $(GUEST_PIC_OBJS): xc_private.h
 
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index c136aa8..66acbd1 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2533,6 +2533,20 @@ int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
 int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps);
 int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
   uint32_t *nr_features, uint32_t *featureset);
+
+uint32_t xc_get_cpu_featureset_size(void);
+
+enum xc_static_cpu_featuremask {
+XC_FEATUREMASK_KNOWN,
+XC_FEATUREMASK_SPECIAL,
+XC_FEATUREMASK_PV,
+XC_FEATUREMASK_HVM_SHADOW,
+XC_FEATUREMASK_HVM_HAP,
+XC_FEATUREMASK_DEEP_FEATURES,
+};
+const uint32_t *xc_get_static_cpu_featuremask(enum xc_static_cpu_featuremask);
+const uint32_t *xc_get_feature_deep_deps(uint32_t feature);
+
 #endif
 
 /* Compat shims */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index d3674db..0cffb36 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -28,6 +28,7 @@ enum {
 #define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
 #include 
 };
+#include "_xc_cpuid_autogen.h"
 
 #define bitmaskof(idx)  (1u << ((idx) & 31))
 #define clear_bit(idx, dst) ((dst) &= ~bitmaskof(idx))
@@ -78,6 +79,80 @@ int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
 return ret;
 }
 
+uint32_t xc_get_cpu_featureset_size(void)
+{
+return FEATURESET_NR_ENTRIES;
+}
+
+const uint32_t *xc_get_static_cpu_featuremask(
+enum xc_static_cpu_featuremask mask)
+{
+static const uint32_t known[FEATURESET_NR_ENTRIES] = INIT_KNOWN_FEATURES,
+special[FEATURESET_NR_ENTRIES] = INIT_SPECIAL_FEATURES,
+pv[FEATURESET_NR_ENTRIES] = INIT_PV_FEATURES,
+hvm_shadow[FEATURESET_NR_ENTRIES] = INIT_HVM_SHADOW_FEATURES,
+hvm_hap[FEATURESET_NR_ENTRIES] = INIT_HVM_HAP_FEATURES,
+deep_features[FEATURESET_NR_ENTRIES] = INIT_DEEP_FEATURES;
+
+XC_BUILD_BUG_ON(ARRAY_SIZE(known) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(special) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(pv) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(hvm_shadow) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(hvm_hap) != FEATURESET_NR_ENTRIES);
+XC_BUILD_BUG_ON(ARRAY_SIZE(deep_features) != FEATURESET_NR_ENTRIES);
+
+switch ( mask )
+{
+case XC_FEATUREMASK_KNOWN:
+return known;
+
+case XC_FEATUREMASK_SPECIAL:
+return special;
+
+case XC_FEATUREMASK_PV:
+return pv;
+
+case XC_FEATUREMASK_HVM_SHADOW:
+return hvm_shadow;
+
+case XC_FEATUREMASK_HVM_HAP:
+return hvm_hap;
+
+case XC_FEATUREMASK_DEEP_FEATURES:
+return deep_features;
+
+default:
+return NULL;
+}
+}
+
+const uint32_t *xc_get_feature_deep_deps(uint32_t feature)
+{
+static const struct {
+uint32_t feature;
+uint32_t fs[FEATURESET_NR_ENTRIES];
+} deep_deps[] = INIT_DEEP_DEPS;
+
+unsigned int start = 0, end = ARRAY_SIZE(deep_deps);
+
+XC_BUILD_BUG_ON(ARRAY_SIZE(deep_deps) != NR_DEEP_DEPS);
+
+/* deep_deps[] is sorted.  Perform a binary search. */
+while ( start < end )
+{
+unsigned int mid = start + ((end - start) / 2);
+
+if ( deep_deps[mid].feature > feature )
+end = mid;
+else if ( deep_deps[mid].feature < feature )
+start = mid + 1;
+else
+return deep_deps[mid].fs;
+}
+
+return NULL;
+}
+
 struct cpuid_domain_info
 {
 enum
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v4 13/26] x86/cpu: Sysctl and common infrastructure for levelling context switching

2016-03-23 Thread Andrew Cooper

A toolstack needs to know how much control Xen has over the visible cpuid
values in PV guests.  Provide an explicit mechanism to query what Xen is
capable of.

This interface will currently report no capabilities.  This change is
scaffolding for future patches, which will introduce detection and switching
logic, after which the interface will report hardware capabilities correctly.
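
As a rough illustration of how a toolstack might consume such a capability bitmap, here is a minimal sketch in C.  The EX_* names and their bit positions are hypothetical and chosen only for this example; they are not taken from the public header.  The rule that faulting implies full control is likewise an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define EX_LEVELCAP_FAULTING  (1u << 0)
#define EX_LEVELCAP_ECX       (1u << 1)
#define EX_LEVELCAP_EDX       (1u << 2)
#define EX_LEVELCAP_EXTD_ECX  (1u << 3)
#define EX_LEVELCAP_EXTD_EDX  (1u << 4)

/*
 * Return true if every capability in 'wanted' is reported in 'caps'.
 *
 * Assumption: when faulting is available, every CPUID instruction can be
 * intercepted, so all registers are treated as controllable.
 */
bool ex_can_level(uint32_t caps, uint32_t wanted)
{
    if ( caps & EX_LEVELCAP_FAULTING )
        return true;

    return (caps & wanted) == wanted;
}
```

With a caps value of 0, which is what the interface reports at this point in the series, ex_can_level() is false for any non-empty request.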

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * s/cpumasks/cpuidmasks/
v3:
 * Reintroduce XEN_SYSCTL_get_levelling_caps (requested by Joao for some
   development he has planned).
 * Rename to XEN_SYSCTL_get_cpu_levelling_caps, and rename the constants to
   match the Xen command line options.
v4:
 * Move declarations from processor.h to cpuid.h
 * API corrections for XEN_SYSCTL_get_levelling_caps
---
 xen/arch/x86/cpu/common.c|  6 ++
 xen/arch/x86/sysctl.c|  6 ++
 xen/include/asm-x86/cpufeature.h |  1 +
 xen/include/asm-x86/cpuid.h  | 32 
 xen/include/public/sysctl.h  | 23 +++
 5 files changed, 68 insertions(+)

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index b5c023f..7ef75b0 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -36,6 +36,12 @@ integer_param("cpuid_mask_ext_ecx", opt_cpuid_mask_ext_ecx);
 unsigned int opt_cpuid_mask_ext_edx = ~0u;
 integer_param("cpuid_mask_ext_edx", opt_cpuid_mask_ext_edx);
 
+unsigned int __initdata expected_levelling_cap;
+unsigned int __read_mostly levelling_caps;
+
+DEFINE_PER_CPU(struct cpuidmasks, cpuidmasks);
+struct cpuidmasks __read_mostly cpuidmask_defaults;
+
 const struct cpu_dev *__read_mostly cpu_devs[X86_VENDOR_NUM] = {};
 
 unsigned int paddr_bits __read_mostly = 36;
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 58cbd70..f68cbec 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -190,6 +190,12 @@ long arch_do_sysctl(
 }
 break;
 
+case XEN_SYSCTL_get_cpu_levelling_caps:
+sysctl->u.cpu_levelling_caps.caps = levelling_caps;
+if ( __copy_field_to_guest(u_sysctl, sysctl, u.cpu_levelling_caps.caps) )
+ret = -EFAULT;
+break;
+
 default:
 ret = -ENOSYS;
 break;
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index e29b024..84d3220 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -81,6 +81,7 @@
 #define cpu_has_xsavec boot_cpu_has(X86_FEATURE_XSAVEC)
 #define cpu_has_xgetbv1 boot_cpu_has(X86_FEATURE_XGETBV1)
 #define cpu_has_xsaves boot_cpu_has(X86_FEATURE_XSAVES)
+#define cpu_has_hypervisor boot_cpu_has(X86_FEATURE_HYPERVISOR)
 
 enum _cache_type {
 CACHE_TYPE_NULL = 0,
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 4725672..9a21c25 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 #define FSCAPINTS FEATURESET_NR_ENTRIES
 
@@ -18,6 +19,7 @@
 
 #ifndef __ASSEMBLY__
 #include 
+#include 
 
 extern const uint32_t known_features[FSCAPINTS];
 extern const uint32_t special_features[FSCAPINTS];
@@ -31,6 +33,36 @@ void calculate_featuresets(void);
 
 const uint32_t *lookup_deep_deps(uint32_t feature);
 
+/*
+ * Expected levelling capabilities (given cpuid vendor/family information),
+ * and levelling capabilities actually available (given MSR probing).
+ */
+#define LCAP_faulting XEN_SYSCTL_CPU_LEVELCAP_faulting
+#define LCAP_1cd  (XEN_SYSCTL_CPU_LEVELCAP_ecx |\
+   XEN_SYSCTL_CPU_LEVELCAP_edx)
+#define LCAP_e1cd (XEN_SYSCTL_CPU_LEVELCAP_extd_ecx |   \
+   XEN_SYSCTL_CPU_LEVELCAP_extd_edx)
+#define LCAP_Da1  XEN_SYSCTL_CPU_LEVELCAP_xsave_eax
+#define LCAP_6c   XEN_SYSCTL_CPU_LEVELCAP_thermal_ecx
+#define LCAP_7ab0 (XEN_SYSCTL_CPU_LEVELCAP_l7s0_eax |   \
+   XEN_SYSCTL_CPU_LEVELCAP_l7s0_ebx)
+extern unsigned int expected_levelling_cap, levelling_caps;
+
+struct cpuidmasks
+{
+uint64_t _1cd;
+uint64_t e1cd;
+uint64_t Da1;
+uint64_t _6c;
+uint64_t _7ab0;
+};
+
+/* Per CPU shadows of masking MSR values, for lazy context switching. */
+DECLARE_PER_CPU(struct cpuidmasks, cpuidmasks);
+
+/* Default masking MSR values, calculated at boot. */
+extern struct cpuidmasks cpuidmask_defaults;
+
 #endif /* __ASSEMBLY__ */
 #endif /* !__X86_CPUID_H__ */
 
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 96680eb..1ab16db 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -766,6 +766,27 @@ struct xen_sysctl_tmem_op {
 typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
 
+/*
+ * XEN_SYSCTL_get_cpu_levelling_caps (x86 specific)
+ *
+ * Return hardware capabilities concerning masking or

[Xen-devel] [PATCH v4 26/26] tools/libxc: Calculate xstate cpuid leaf from guest information

2016-03-23 Thread Andrew Cooper

It is unsafe to generate the guest's xstate leaves from host information, as it
prevents the differences between hosts from being hidden.

In addition, some further improvements and corrections:
 - don't discard the known flags in sub-leaves 2..63 ECX
 - zap sub-leaves beyond 62
 - zap all bits in leaf 1, EBX/ECX.  No XSS features are currently supported.
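
The core idea, deriving the xstate mask from what the guest is allowed to see rather than from the host, can be sketched in isolation as follows.  This is a simplified stand-alone model, not the libxc code: struct ex_guest_features and ex_guest_xfeature_mask() are hypothetical names, and the guest's featureset is reduced to four booleans.

```c
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits (same bit positions as in the patch). */
#define EX_XCR0_X87    (1ULL <<  0)
#define EX_XCR0_SSE    (1ULL <<  1)
#define EX_XCR0_AVX    (1ULL <<  2)
#define EX_XCR0_BNDREG (1ULL <<  3)
#define EX_XCR0_BNDCSR (1ULL <<  4)
#define EX_XCR0_LWP    (1ULL << 62)

/* Hypothetical, reduced view of a guest's featureset. */
struct ex_guest_features {
    bool xsave, avx, mpx, lwp;
};

/* Build the guest-visible xfeature mask from guest features only. */
uint64_t ex_guest_xfeature_mask(const struct ex_guest_features *g,
                                uint64_t host_mask)
{
    uint64_t mask;

    /* No xsave on the host, or not offered to the guest: nothing to report. */
    if ( host_mask == 0 || !g->xsave )
        return 0;

    mask = EX_XCR0_X87 | EX_XCR0_SSE;

    if ( g->avx )
        mask |= EX_XCR0_AVX;
    if ( g->mpx )
        mask |= EX_XCR0_BNDREG | EX_XCR0_BNDCSR;
    if ( g->lwp )
        mask |= EX_XCR0_LWP;

    /* Clamp to what the host can actually provide. */
    return mask & host_mask;
}
```

Two hosts with different xstate support then yield the same mask for a guest whose features are levelled to their common subset.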

Signed-off-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
---
CC: Wei Liu 
CC: Ian Jackson 

v3:
 * Reintroduce MPX adjustment (this series has been in development since
   before the introduction of MPX upstream, and it got lost in a rebase)
v4:
 * Fold further improvements from Jan
---
 tools/libxc/xc_cpuid_x86.c | 71 +-
 1 file changed, 57 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index fc7e20a..cf1f6b7 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -398,54 +398,97 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
 }
 }
 
+/* XSTATE bits in XCR0. */
+#define X86_XCR0_X87    (1ULL <<  0)
+#define X86_XCR0_SSE    (1ULL <<  1)
+#define X86_XCR0_AVX    (1ULL <<  2)
+#define X86_XCR0_BNDREG (1ULL <<  3)
+#define X86_XCR0_BNDCSR (1ULL <<  4)
+#define X86_XCR0_LWP    (1ULL << 62)
+
+#define X86_XSS_MASK    (0) /* No XSS states supported yet. */
+
+/* Per-component subleaf flags. */
+#define XSTATE_XSS  (1ULL <<  0)
+#define XSTATE_ALIGN64  (1ULL <<  1)
+
 /* Configure extended state enumeration leaves (0x0000000D for xsave) */
 static void xc_cpuid_config_xsave(xc_interface *xch,
   const struct cpuid_domain_info *info,
   const unsigned int *input, unsigned int *regs)
 {
-if ( info->xfeature_mask == 0 )
+uint64_t guest_xfeature_mask;
+
+if ( info->xfeature_mask == 0 ||
+ !test_bit(X86_FEATURE_XSAVE, info->featureset) )
 {
 regs[0] = regs[1] = regs[2] = regs[3] = 0;
 return;
 }
 
+guest_xfeature_mask = X86_XCR0_SSE | X86_XCR0_X87;
+
+if ( test_bit(X86_FEATURE_AVX, info->featureset) )
+guest_xfeature_mask |= X86_XCR0_AVX;
+
+if ( test_bit(X86_FEATURE_MPX, info->featureset) )
+guest_xfeature_mask |= X86_XCR0_BNDREG | X86_XCR0_BNDCSR;
+
+if ( test_bit(X86_FEATURE_LWP, info->featureset) )
+guest_xfeature_mask |= X86_XCR0_LWP;
+
+/*
+ * Clamp to host mask.  Should be no-op, as guest_xfeature_mask should not
+ * be able to be calculated as larger than info->xfeature_mask.
+ *
+ * TODO - see about making this a harder error.
+ */
+guest_xfeature_mask &= info->xfeature_mask;
+
 switch ( input[1] )
 {
-case 0: 
+case 0:
 /* EAX: low 32bits of xfeature_enabled_mask */
-regs[0] = info->xfeature_mask & 0xFFFFFFFF;
+regs[0] = guest_xfeature_mask;
 /* EDX: high 32bits of xfeature_enabled_mask */
-regs[3] = (info->xfeature_mask >> 32) & 0xFFFFFFFF;
+regs[3] = guest_xfeature_mask >> 32;
 /* ECX: max size required by all HW features */
 {
 unsigned int _input[2] = {0xd, 0x0}, _regs[4];
 regs[2] = 0;
-for ( _input[1] = 2; _input[1] < 64; _input[1]++ )
+for ( _input[1] = 2; _input[1] <= 62; _input[1]++ )
 {
 cpuid(_input, _regs);
 if ( (_regs[0] + _regs[1]) > regs[2] )
 regs[2] = _regs[0] + _regs[1];
 }
 }
-/* EBX: max size required by enabled features. 
- * This register contains a dynamic value, which varies when a guest 
- * enables or disables XSTATE features (via xsetbv). The default size 
- * after reset is 576. */ 
+/* EBX: max size required by enabled features.
+ * This register contains a dynamic value, which varies when a guest
+ * enables or disables XSTATE features (via xsetbv). The default size
+ * after reset is 576. */
 regs[1] = 512 + 64; /* FP/SSE + XSAVE.HEADER */
 break;
+
 case 1: /* leaf 1 */
 regs[0] = info->featureset[featureword_of(X86_FEATURE_XSAVEOPT)];
-regs[2] &= info->xfeature_mask;
-regs[3] = 0;
+regs[2] = guest_xfeature_mask & X86_XSS_MASK;
+regs[3] = (guest_xfeature_mask >> 32) & X86_XSS_MASK;
 break;
-case 2 ... 63: /* sub-leaves */
-if ( !(info->xfeature_mask & (1ULL << input[1])) )
+
+case 2 ... 62: /* per-component sub-leaves */
+if ( !(guest_xfeature_mask & (1ULL << input[1])) )
 {
 regs[0] = regs[1] = regs[2] = regs[3] = 0;
 break;
 }
 /* Don't touch EAX, EBX. Also cleanup ECX and EDX */
-regs[2] = regs[3] = 0;
+regs[2] &= XSTATE_XSS | XSTATE_ALIGN64;
+regs[3] = 0;
+break;
+
+default:
+

[Xen-devel] [PATCH v4 11/26] xen/x86: Improvements to in-hypervisor cpuid sanity checks

2016-03-23 Thread Andrew Cooper

Currently, {pv,hvm}_cpuid() has a large quantity of essentially-static logic
for modifying the features visible to a guest.  A lot of this can be subsumed
by {pv,hvm}_featuremask, which identify the features available on this
hardware which could be given to a PV or HVM guest.

This is a step in the direction of full per-domain cpuid policies, but lots
more development is needed for that.  As a result, the static checks are
simplified, but the dynamic checks need to remain for now.

As a side effect, some of the logic for special features can be improved.
OSXSAVE and OSPKE will be automatically cleared because of being absent in the
featuremask.  This allows the fast-forward logic to be simpler.

In addition, there are some corrections to the existing logic:

 * Hiding PSE36 out of PAE mode is architecturally wrong.  It turns out that
   it was a bugfix for running HyperV under Xen, which wanted to see PSE36
   even after choosing to use PAE paging.  PSE36 is not supported by shadow
   paging, so is hidden from non-HAP guests, but is still visible for HAP
   guests.
 * Changing the visibility of RDTSCP based on host TSC stability or virtual
   TSC mode is bogus, so dropped.
 * When emulating Intel to a guest, the common features in e1d should be
   cleared.
 * The APIC bit in e1d (on non-Intel) is also a fast-forward from the
   APIC_BASE MSR.

As a small improvement, use compiler-visible &'s and |'s, rather than
{clear,set}_bit().
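
The split between static masking and dynamic fast-forwarding can be illustrated with a small stand-alone sketch.  The function name and parameters below are hypothetical; only the two bit positions in CPUID leaf 1 ECX (XSAVE is bit 26, OSXSAVE is bit 27) are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_1C_XSAVE   (1u << 26)  /* CPUID.1:ECX.XSAVE */
#define EX_1C_OSXSAVE (1u << 27)  /* CPUID.1:ECX.OSXSAVE */

/*
 * Compute the guest-visible leaf 1 ECX value.
 *
 * 'featuremask_1c' is assumed not to contain OSXSAVE, so the static AND
 * always clears it.  The bit is then fast-forwarded from the guest's own
 * CR4.OSXSAVE setting.
 */
uint32_t ex_leaf1_ecx(uint32_t toolstack_ecx, uint32_t featuremask_1c,
                      bool guest_cr4_osxsave)
{
    /* Static check: a plain, compiler-visible AND. */
    uint32_t ecx = toolstack_ecx & featuremask_1c;

    /* Dynamic check: reflect the guest's current CR4 state. */
    if ( guest_cr4_osxsave )
        ecx |= EX_1C_OSXSAVE;

    return ecx;
}
```

Because the mask already clears OSXSAVE, no else branch is needed to clear it again.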

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * Reinstate some of the dynamic checks for now.  Future development work will
   instate a complete per-domain policy.
 * Fix OSXSAVE handling for PV guests.
v3:
 * Better handling of the cross-vendor case.
 * Improvements to the handling of special features.
 * Correct PSE36 to being a HAP-only feature.
 * Yet more OSXSAVE fixes for PV guests.
v4:
 * Leak PSE36 into shadow guests to fix buggy versions of Hyper-V.
 * Leak MTRR into the hardware domain to fix Xenolinux dom0.
 * Change cross-vendor 1D disabling logic.
 * Avoid reading arch.pv_vcpu for PVH guests.
---
 xen/arch/x86/hvm/hvm.c | 125 ++---
 xen/arch/x86/traps.c   | 209 -
 2 files changed, 216 insertions(+), 118 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 80d59ff..6593bb1 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -71,6 +71,7 @@
 #include 
 #include 
 #include 
+#include 
 
 bool_t __read_mostly hvm_enabled;
 
@@ -4668,62 +4669,71 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
 /* Fix up VLAPIC details. */
 *ebx &= 0x00FFFFFFu;
 *ebx |= (v->vcpu_id * 2) << 24;
+
+*ecx &= hvm_featureset[FEATURESET_1c];
+*edx &= hvm_featureset[FEATURESET_1d];
+
+/* APIC exposed to guests, but Fast-forward MSR_APIC_BASE.EN back in. */
 if ( vlapic_hw_disabled(vcpu_vlapic(v)) )
-__clear_bit(X86_FEATURE_APIC & 31, edx);
+*edx &= ~cpufeat_bit(X86_FEATURE_APIC);
 
-/* Fix up OSXSAVE. */
-if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) &&
- (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) )
+/* OSXSAVE cleared by hvm_featureset.  Fast-forward CR4 back in. */
+if ( v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE )
 *ecx |= cpufeat_mask(X86_FEATURE_OSXSAVE);
-else
-*ecx &= ~cpufeat_mask(X86_FEATURE_OSXSAVE);
 
-/* Don't expose PCID to non-hap hvm. */
+/* Don't expose HAP-only features to non-hap guests. */
 if ( !hap_enabled(d) )
+{
 *ecx &= ~cpufeat_mask(X86_FEATURE_PCID);
 
-/* Only provide PSE36 when guest runs in 32bit PAE or in long mode */
-if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
-*edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+/*
+ * PSE36 is not supported in shadow mode.  This bit should be
+ * unilaterally cleared.
+ *
+ * However, an unspecified version of Hyper-V from 2011 refuses
+ * to start as the "cpu does not provide required hw features" if
+ * it can't see PSE36.
+ *
+ * As a workaround, leak the toolstack-provided PSE36 value into a
+ * shadow guest if the guest is already using PAE paging (and
+ * won't care about reverting back to PSE paging).  Otherwise,
+ * nobble it, so a 32bit guest doesn't get the impression that it
+ * could try to use PSE36 paging.
+ */
+if ( !(hvm_pae_enabled(v) || hvm_long_mode_enabled(v)) )
+*edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
+}
 break;
+
 case 0x7:
 if ( count == 0 )
 {
-if ( !cpu_has_smep )
-*ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
-
-

[Xen-devel] [PATCH v4 12/26] x86/cpu: Move set_cpumask() calls into c_early_init()

2016-03-23 Thread Andrew Cooper

Before c/s 44e24f8567 "x86: don't call generic_identify() redundantly", the
commandline-provided masks would take effect in Xen's view of the features.

As the masks got applied after the query for features, the redundant call to
generic_identify() would clobber the pre-masking feature information with the
post-masking information.

Move the set_cpumask() calls into c_early_init() so their effects take place
before the main query for features in generic_identify().

The cpuid_mask_* command line parameters now limit the entire system, a
feature XenServer was relying on for testing purposes.  Subsequent changes
will cause the mask MSRs to be context switched per-domain, removing the need
to use the command line parameters for heterogeneous levelling purposes.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
---
 xen/arch/x86/cpu/amd.c   |  8 ++--
 xen/arch/x86/cpu/intel.c | 34 +-
 2 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 47a38c6..5516777 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -407,6 +407,11 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
  c->cpu_core_id);
 }
 
+static void early_init_amd(struct cpuinfo_x86 *c)
+{
+   set_cpuidmask(c);
+}
+
 static void init_amd(struct cpuinfo_x86 *c)
 {
u32 l, h;
@@ -595,14 +600,13 @@ static void init_amd(struct cpuinfo_x86 *c)
if ((smp_processor_id() == 1) && !cpu_has(c, X86_FEATURE_ITSC))
disable_c1_ramping();
 
-   set_cpuidmask(c);
-
check_syscfg_dram_mod_en();
 }
 
 static const struct cpu_dev amd_cpu_dev = {
.c_vendor   = "AMD",
.c_ident= { "AuthenticAMD" },
+   .c_early_init   = early_init_amd,
.c_init = init_amd,
 };
 
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index bdf89f6..ad22375 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -189,6 +189,23 @@ static void early_init_intel(struct cpuinfo_x86 *c)
if (boot_cpu_data.x86 == 0xF && boot_cpu_data.x86_model == 3 &&
(boot_cpu_data.x86_mask == 3 || boot_cpu_data.x86_mask == 4))
paddr_bits = 36;
+
+   if (c == &boot_cpu_data && c->x86 == 6) {
+   if (probe_intel_cpuid_faulting())
+   __set_bit(X86_FEATURE_CPUID_FAULTING,
+ c->x86_capability);
+   } else if (boot_cpu_has(X86_FEATURE_CPUID_FAULTING)) {
+   BUG_ON(!probe_intel_cpuid_faulting());
+   __set_bit(X86_FEATURE_CPUID_FAULTING, c->x86_capability);
+   }
+
+   if (!cpu_has_cpuid_faulting)
+   set_cpuidmask(c);
+   else if ((c == &boot_cpu_data) &&
+(~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
+   opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
+   opt_cpuid_mask_xsave_eax)))
+   printk("No CPUID feature masking support available\n");
 }
 
 /*
@@ -258,23 +275,6 @@ static void init_intel(struct cpuinfo_x86 *c)
detect_ht(c);
}
 
-   if (c == &boot_cpu_data && c->x86 == 6) {
-   if (probe_intel_cpuid_faulting())
-   __set_bit(X86_FEATURE_CPUID_FAULTING,
- c->x86_capability);
-   } else if (boot_cpu_has(X86_FEATURE_CPUID_FAULTING)) {
-   BUG_ON(!probe_intel_cpuid_faulting());
-   __set_bit(X86_FEATURE_CPUID_FAULTING, c->x86_capability);
-   }
-
-   if (!cpu_has_cpuid_faulting)
-   set_cpuidmask(c);
-   else if ((c == &boot_cpu_data) &&
-(~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
-   opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
-   opt_cpuid_mask_xsave_eax)))
-   printk("No CPUID feature masking support available\n");
-
/* Work around errata */
Intel_errata_workarounds(c);
 
-- 
2.1.4


[Xen-devel] [PATCH v4 10/26] xen/x86: Improve disabling of features which have dependencies

2016-03-23 Thread Andrew Cooper

APIC and XSAVE have dependent features, which also need disabling if Xen
chooses to disable a feature.

Use setup_clear_cpu_cap() rather than clear_bit(), as it takes care of
dependent features as well.
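
To make the idea concrete, here is a minimal sketch of clearing a feature together with its dependents, using a sorted dependency table and a binary search.  It is a self-contained model rather than Xen's implementation: the EX_* feature numbers, the two-word capability array and the table contents are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_NR_WORDS 2  /* hypothetical: two 32-bit feature words */

/* Hypothetical feature numbers (word index * 32 + bit). */
enum { EX_XSAVE = 0, EX_AVX = 1, EX_FMA = 2, EX_AVX2 = 33 };

struct ex_deep_dep {
    uint32_t feature;
    uint32_t deps[EX_NR_WORDS];  /* every feature depending on 'feature' */
};

/* Sorted by 'feature'.  Each entry lists transitive dependents. */
static const struct ex_deep_dep ex_deps[] = {
    { EX_XSAVE, { (1u << EX_AVX) | (1u << EX_FMA), 1u << (EX_AVX2 % 32) } },
    { EX_AVX,   { 1u << EX_FMA,                    1u << (EX_AVX2 % 32) } },
};

/* Binary search; returns NULL if the feature has no dependents. */
const uint32_t *ex_lookup_deps(uint32_t feature)
{
    size_t start = 0, end = sizeof(ex_deps) / sizeof(ex_deps[0]);

    while ( start < end )
    {
        size_t mid = start + (end - start) / 2;

        if ( ex_deps[mid].feature > feature )
            end = mid;
        else if ( ex_deps[mid].feature < feature )
            start = mid + 1;
        else
            return ex_deps[mid].deps;
    }

    return NULL;
}

/* Clear 'feature' (which must be below EX_NR_WORDS * 32) and dependents. */
void ex_clear_cap(uint32_t caps[EX_NR_WORDS], uint32_t feature)
{
    const uint32_t *deps = ex_lookup_deps(feature);
    unsigned int i;

    caps[feature / 32] &= ~(1u << (feature % 32));

    if ( deps )
        for ( i = 0; i < EX_NR_WORDS; i++ )
            caps[i] &= ~deps[i];
}
```

Clearing only the single bit would leave AVX, FMA and AVX2 advertised without the XSAVE support they rely on, which is the situation this patch avoids.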

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2: Move boolean_param() adjacent to use_xsave in xstate_init()
---
 xen/arch/x86/apic.c   |  2 +-
 xen/arch/x86/cpu/common.c | 12 +++-
 xen/arch/x86/xstate.c |  6 +-
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index b9601ad..8df5bd3 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1349,7 +1349,7 @@ void pmu_apic_interrupt(struct cpu_user_regs *regs)
 int __init APIC_init_uniprocessor (void)
 {
 if (enable_local_apic < 0)
-__clear_bit(X86_FEATURE_APIC, boot_cpu_data.x86_capability);
+setup_clear_cpu_cap(X86_FEATURE_APIC);
 
 if (!smp_found_config && !cpu_has_apic) {
 skip_ioapic_setup = 1;
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 0942b44..b5c023f 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -16,9 +16,6 @@
 
 #include "cpu.h"
 
-static bool_t use_xsave = 1;
-boolean_param("xsave", use_xsave);
-
 bool_t opt_arat = 1;
 boolean_param("arat", opt_arat);
 
@@ -341,12 +338,6 @@ void identify_cpu(struct cpuinfo_x86 *c)
if (this_cpu->c_init)
this_cpu->c_init(c);
 
-/* Initialize xsave/xrstor features */
-   if ( !use_xsave )
-   __clear_bit(X86_FEATURE_XSAVE, boot_cpu_data.x86_capability);
-
-   if ( cpu_has_xsave )
-   xstate_init(c);
 
if ( !opt_pku )
setup_clear_cpu_cap(X86_FEATURE_PKU);
@@ -370,6 +361,9 @@ void identify_cpu(struct cpuinfo_x86 *c)
 
/* Now the feature flags better reflect actual CPU features! */
 
+   if ( cpu_has_xsave )
+   xstate_init(c);
+
 #ifdef NOISY_CAPS
printk(KERN_DEBUG "CPU: After all inits, caps:");
for (i = 0; i < NCAPINTS; i++)
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index f649405..5060704 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -502,11 +502,15 @@ unsigned int xstate_ctxt_size(u64 xcr0)
 /* Collect the information of processor's extended state */
 void xstate_init(struct cpuinfo_x86 *c)
 {
+static bool_t __initdata use_xsave = 1;
+boolean_param("xsave", use_xsave);
+
 bool_t bsp = c == &boot_cpu_data;
 u32 eax, ebx, ecx, edx;
 u64 feature_mask;
 
-if ( boot_cpu_data.cpuid_level < XSTATE_CPUID )
+if ( (bsp && !use_xsave) ||
+ boot_cpu_data.cpuid_level < XSTATE_CPUID )
 {
 BUG_ON(!bsp);
 setup_clear_cpu_cap(X86_FEATURE_XSAVE);
-- 
2.1.4


[Xen-devel] [PATCH v4 19/26] xen+tools: Export maximum host and guest cpu featuresets via SYSCTL

2016-03-23 Thread Andrew Cooper

And provide stubs for toolstack use.

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
Acked-by: David Scott 
Acked-by: Jan Beulich 
---
CC: Tim Deegan 

v2:
 * Rebased to use libxencall
 * Improve hypercall documentation
v3:
 * Provide libxc implementation for XEN_SYSCTL_get_cpu_levelling_caps as well.
v4:
 * More const.
---
 tools/libxc/include/xenctrl.h   |  4 +++
 tools/libxc/xc_cpuid_x86.c  | 41 +
 tools/ocaml/libs/xc/xenctrl.ml  |  3 +++
 tools/ocaml/libs/xc/xenctrl.mli |  4 +++
 tools/ocaml/libs/xc/xenctrl_stubs.c | 35 +
 xen/arch/x86/sysctl.c   | 51 +
 xen/include/public/sysctl.h | 27 
 7 files changed, 165 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 150d727..c136aa8 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2529,6 +2529,10 @@ int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
 int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
uint32_t *cos_max, uint32_t *cbm_len,
bool *cdp_enabled);
+
+int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps);
+int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
+  uint32_t *nr_features, uint32_t *featureset);
 #endif
 
 /* Compat shims */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 733add4..5780397 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -33,6 +33,47 @@
 #define DEF_MAX_INTELEXT  0x80000008u
 #define DEF_MAX_AMDEXT    0x8000001cu
 
+int xc_get_cpu_levelling_caps(xc_interface *xch, uint32_t *caps)
+{
+DECLARE_SYSCTL;
+int ret;
+
+sysctl.cmd = XEN_SYSCTL_get_cpu_levelling_caps;
+ret = do_sysctl(xch, &sysctl);
+
+if ( !ret )
+*caps = sysctl.u.cpu_levelling_caps.caps;
+
+return ret;
+}
+
+int xc_get_cpu_featureset(xc_interface *xch, uint32_t index,
+  uint32_t *nr_features, uint32_t *featureset)
+{
+DECLARE_SYSCTL;
+DECLARE_HYPERCALL_BOUNCE(featureset,
+ *nr_features * sizeof(*featureset),
+ XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+int ret;
+
+if ( xc_hypercall_bounce_pre(xch, featureset) )
+return -1;
+
+sysctl.cmd = XEN_SYSCTL_get_cpu_featureset;
+sysctl.u.cpu_featureset.index = index;
+sysctl.u.cpu_featureset.nr_features = *nr_features;
+set_xen_guest_handle(sysctl.u.cpu_featureset.features, featureset);
+
+ret = do_sysctl(xch, &sysctl);
+
+xc_hypercall_bounce_post(xch, featureset);
+
+if ( !ret )
+*nr_features = sysctl.u.cpu_featureset.nr_features;
+
+return ret;
+}
+
 struct cpuid_domain_info
 {
 enum
diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index 58a53a1..75006e7 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -242,6 +242,9 @@ external version_changeset: handle -> string = "stub_xc_version_changeset"
 external version_capabilities: handle -> string =
   "stub_xc_version_capabilities"
 
+type featureset_index = Featureset_raw | Featureset_host | Featureset_pv | Featureset_hvm
+external get_cpu_featureset : handle -> featureset_index -> int64 array = "stub_xc_get_cpu_featureset"
+
 external watchdog : handle -> int -> int32 -> int
   = "stub_xc_watchdog"
 
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index 16443df..720e4b2 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -147,6 +147,10 @@ external version_compile_info : handle -> compile_info
 external version_changeset : handle -> string = "stub_xc_version_changeset"
 external version_capabilities : handle -> string
   = "stub_xc_version_capabilities"
+
+type featureset_index = Featureset_raw | Featureset_host | Featureset_pv | Featureset_hvm
+external get_cpu_featureset : handle -> featureset_index -> int64 array = "stub_xc_get_cpu_featureset"
+
 type core_magic = Magic_hvm | Magic_pv
 type core_header = {
   xch_magic : core_magic;
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 74928e9..e7adf37 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -1214,6 +1214,41 @@ CAMLprim value stub_xc_domain_deassign_device(value xch, value domid, value desc
CAMLreturn(Val_unit);
 }
 
+CAMLprim value stub_xc_get_cpu_featureset(value xch, value idx)
+{
+   CAMLparam2(xch, idx);
+   CAMLlocal1(bitmap_val);
+
+   /* Safe, because of the global ocaml lock. */
+   static uint32_t fs_len;
+
+   if (fs_len == 0)
+   {
+   int ret = xc_get_cpu_featureset(_H(xch), 0, &fs_len,

[Xen-devel] [PATCH v4 23/26] tools: Utility for dealing with featuresets

2016-03-23 Thread Andrew Cooper

It is able to report the current featuresets, both the static masks and
dynamic featuresets from Xen, or to decode an arbitrary featureset into
`/proc/cpuinfo` style strings.
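
The decoding step can be sketched on its own as below.  This is an illustrative stand-alone version, not the utility's code: ex_decode_word() and ex_names_1d are hypothetical names, the table covers only the first eight bits of leaf 1 EDX, and unnamed bits are printed as "REZ".

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Partial name table for CPUID leaf 1 EDX; unlisted entries are NULL. */
static const char *const ex_names_1d[32] = {
    [0] = "fpu", [1] = "vme", [2] = "de",  [3] = "pse",
    [4] = "tsc", [5] = "msr", [6] = "pae", [7] = "mce",
};

/*
 * Write the names of all bits set in 'word' into 'buf', separated by
 * single spaces.  Returns the number of characters written, excluding
 * the terminating NUL.  Decoding stops early if 'buf' fills up, in which
 * case the final name may be cut short.
 */
size_t ex_decode_word(uint32_t word, const char *const names[32],
                      char *buf, size_t len)
{
    size_t used = 0;
    unsigned int bit;

    if ( len == 0 )
        return 0;
    buf[0] = '\0';

    for ( bit = 0; bit < 32; bit++ )
    {
        const char *name;
        int n;

        if ( !(word & (1u << bit)) )
            continue;

        name = names[bit] ? names[bit] : "REZ";
        n = snprintf(buf + used, len - used, "%s%s", used ? " " : "", name);
        if ( n < 0 || (size_t)n >= len - used )
            break;
        used += (size_t)n;
    }

    return used;
}
```

For example, a word of 0x51 (bits 0, 4 and 6) decodes to "fpu tsc pae" with this table.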

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

v2: No linking hackery
---
 .gitignore |   1 +
 tools/misc/Makefile|   4 +
 tools/misc/xen-cpuid.c | 394 +
 3 files changed, 399 insertions(+)
 create mode 100644 tools/misc/xen-cpuid.c

diff --git a/.gitignore b/.gitignore
index b40453e..20ffa2d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -179,6 +179,7 @@ tools/misc/cpuperf/cpuperf-perfcntr
 tools/misc/cpuperf/cpuperf-xen
 tools/misc/xc_shadow
 tools/misc/xen_cpuperf
+tools/misc/xen-cpuid
 tools/misc/xen-detect
 tools/misc/xen-tmem-list-parse
 tools/misc/xenperf
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index a2ef0ec..a94dad9 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -10,6 +10,7 @@ CFLAGS += $(CFLAGS_xeninclude)
 CFLAGS += $(CFLAGS_libxenstore)
 
 # Everything to be installed in regular bin/
+INSTALL_BIN-$(CONFIG_X86)  += xen-cpuid
 INSTALL_BIN-$(CONFIG_X86)  += xen-detect
 INSTALL_BIN+= xencons
 INSTALL_BIN+= xencov_split
@@ -68,6 +69,9 @@ clean:
 .PHONY: distclean
 distclean: clean
 
+xen-cpuid: xen-cpuid.o
+   $(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(APPEND_LDFLAGS)
+
 xen-hvmctx: xen-hvmctx.o
$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
new file mode 100644
index 0000000..608c488
--- /dev/null
+++ b/tools/misc/xen-cpuid.c
@@ -0,0 +1,394 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define ARRAY_SIZE(a) (sizeof a / sizeof *a)
+static uint32_t nr_features;
+
+static const char *str_1d[32] =
+{
+[ 0] = "fpu",  [ 1] = "vme",
+[ 2] = "de",   [ 3] = "pse",
+[ 4] = "tsc",  [ 5] = "msr",
+[ 6] = "pae",  [ 7] = "mce",
+[ 8] = "cx8",  [ 9] = "apic",
+[10] = "REZ",  [11] = "sysenter",
+[12] = "mtrr", [13] = "pge",
+[14] = "mca",  [15] = "cmov",
+[16] = "pat",  [17] = "pse36",
+[18] = "psn",  [19] = "clflush",
+[20] = "REZ",  [21] = "ds",
+[22] = "acpi", [23] = "mmx",
+[24] = "fxsr", [25] = "sse",
+[26] = "sse2", [27] = "ss",
+[28] = "htt",  [29] = "tm",
+[30] = "ia64", [31] = "pbe",
+};
+
+static const char *str_1c[32] =
+{
+[ 0] = "sse3",[ 1] = "pclmulqdq",
+[ 2] = "dtes64",  [ 3] = "monitor",
+[ 4] = "ds-cpl",  [ 5] = "vmx",
+[ 6] = "smx", [ 7] = "est",
+[ 8] = "tm2", [ 9] = "ssse3",
+[10] = "cntx-id", [11] = "sdgb",
+[12] = "fma", [13] = "cx16",
+[14] = "xtpr",[15] = "pdcm",
+[16] = "REZ", [17] = "pcid",
+[18] = "dca", [19] = "sse41",
+[20] = "sse42",   [21] = "x2apic",
+[22] = "movebe",  [23] = "popcnt",
+[24] = "tsc-dl",  [25] = "aesni",
+[26] = "xsave",   [27] = "osxsave",
+[28] = "avx", [29] = "f16c",
+[30] = "rdrnd",   [31] = "hyper",
+};
+
+static const char *str_e1d[32] =
+{
+[ 0] = "fpu",[ 1] = "vme",
+[ 2] = "de", [ 3] = "pse",
+[ 4] = "tsc",[ 5] = "msr",
+[ 6] = "pae",[ 7] = "mce",
+[ 8] = "cx8",[ 9] = "apic",
+[10] = "REZ",[11] = "syscall",
+[12] = "mtrr",   [13] = "pge",
+[14] = "mca",[15] = "cmov",
+[16] = "fcmov",  [17] = "pse36",
+[18] = "REZ",[19] = "mp",
+[20] = "nx", [21] = "REZ",
+[22] = "mmx+",   [23] = "mmx",
+[24] = "fxsr",   [25] = "fxsr+",
+[26] = "pg1g",   [27] = "rdtscp",
+[28] = "REZ",[29] = "lm",
+[30] = "3dnow+", [31] = "3dnow",
+};
+
+static const char *str_e1c[32] =
+{
+[ 0] = "lahf_lm",[ 1] = "cmp",
+[ 2] = "svm",[ 3] = "extapic",
+[ 4] = "cr8d",   [ 5] = "lzcnt",
+[ 6] = "sse4a",  [ 7] = "msse",
+[ 8] = "3dnowpf",[ 9] = "osvw",
+[10] = "ibs",[11] = "xop",
+[12] = "skinit", [13] = "wdt",
+[14] = "REZ",[15] = "lwp",
+[16] = "fma4",   [17] = "tce",
+[18] = "REZ",[19] = "nodeid",
+[20] = "REZ",[21] = "tbm",
+[22] = "topoext",[23] = "perfctr_core",
+[24] = "perfctr_nb", [25] = "REZ",
+[26] = "dbx",[27] = "perftsc",
+[28] = "pcx_l2i",[29] = "monitorx",
+
+[30 ... 31] = "REZ",
+};
+
+static const char *str_7b0[32] =
+{
+[ 0] = "fsgsbase", [ 1] = "tsc-adj",
+[ 2] = "sgx",  [ 3] = "bmi1",
+[ 4] = "hle",  [ 5] = "avx2",
+[ 6] = "REZ",  [ 7] = "smep",
+[ 8] = "bmi2", [ 9] = "erms",
+[10] = "invpcid",  [11] = "rtm",
+[12] = "pqm",  [13] = "depfpp",
+[14] = "mpx",  [15] = "pqe",
+[16] = "avx512f",  [17] = "avx512dq",
+[18] = "rdseed",   [19] = "adx",
+[20] =

[Xen-devel] [PATCH v4 15/26] x86/cpu: Rework Intel masking/faulting setup

2016-03-23 Thread Andrew Cooper

This patch is best reviewed as its end result rather than as a diff, as it
rewrites almost all of the setup.

On the BSP, cpuid information is used to evaluate the potential available set
of masking MSRs, and they are unconditionally probed, filling in the
availability information and hardware defaults.  A side effect of this is that
probe_intel_cpuid_faulting() can move to being __init.

The command line parameters are then combined with the hardware defaults to
further restrict the Xen default masking level.  Each cpu is then context
switched into the default levelling state.
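
As a rough illustration of the second step, the sketch below narrows a probed
hardware default with command-line masks.  It assumes the Intel layout used
elsewhere in this series (edx in the upper 32 bits, ecx in the lower 32 bits);
combine_basic_mask() and its parameters are hypothetical names, not functions
from this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: narrow a probed hardware default using the
 * command-line masks.  Layout assumed: edx in bits 63:32, ecx in bits 31:0.
 */
static uint64_t combine_basic_mask(uint64_t hw_default,
                                   uint32_t opt_ecx, uint32_t opt_edx)
{
    uint64_t opt = ((uint64_t)opt_edx << 32) | opt_ecx;

    /* AND can only clear feature bits, never set ones the hardware lacks. */
    return hw_default & opt;
}
```

Because the combination is a plain AND, a command-line mask of all ones
leaves the hardware default untouched.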

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2:
 * Style fixes.
 * Provide extra information if opt_cpu_info.
 * Extra comment indicating the expected use of intel_ctxt_switch_levelling().
v3:
 * Style fixes.
 * Avoid printing the cpumask defaults if faulting is available.
---
 xen/arch/x86/cpu/intel.c | 234 ++-
 1 file changed, 149 insertions(+), 85 deletions(-)

diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index ad22375..b2666a8 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -18,11 +18,18 @@
 
 #define select_idle_routine(x) ((void)0)
 
-static unsigned int probe_intel_cpuid_faulting(void)
+static bool_t __init probe_intel_cpuid_faulting(void)
 {
uint64_t x;
-   return !rdmsr_safe(MSR_INTEL_PLATFORM_INFO, x) &&
-   (x & MSR_PLATFORM_INFO_CPUID_FAULTING);
+
+   if (rdmsr_safe(MSR_INTEL_PLATFORM_INFO, x) ||
+   !(x & MSR_PLATFORM_INFO_CPUID_FAULTING))
+   return 0;
+
+   expected_levelling_cap |= LCAP_faulting;
+   levelling_caps |=  LCAP_faulting;
+   __set_bit(X86_FEATURE_CPUID_FAULTING, boot_cpu_data.x86_capability);
+   return 1;
 }
 
 static DEFINE_PER_CPU(bool_t, cpuid_faulting_enabled);
@@ -44,36 +51,40 @@ void set_cpuid_faulting(bool_t enable)
 }
 
 /*
- * opt_cpuid_mask_ecx/edx: cpuid.1[ecx, edx] feature mask.
- * For example, E8400[Intel Core 2 Duo Processor series] ecx = 0x0008E3FD,
- * edx = 0xBFEBFBFF when executing CPUID.EAX = 1 normally. If you want to
- * 'rev down' to E8400, you can set these values in these Xen boot parameters.
+ * Set caps in expected_levelling_cap, probe a specific masking MSR, and set
+ * caps in levelling_caps if it is found, or clobber the MSR index if missing.
+ * If preset, reads the default value into msr_val.
  */
-static void set_cpuidmask(const struct cpuinfo_x86 *c)
+static uint64_t __init _probe_mask_msr(unsigned int *msr, uint64_t caps)
 {
-   static unsigned int msr_basic, msr_ext, msr_xsave;
-   static enum { not_parsed, no_mask, set_mask } status;
-   u64 msr_val;
+   uint64_t val = 0;
 
-   if (status == no_mask)
-   return;
+   expected_levelling_cap |= caps;
 
-   if (status == set_mask)
-   goto setmask;
+   if (rdmsr_safe(*msr, val) || wrmsr_safe(*msr, val))
+   *msr = 0;
+   else
+   levelling_caps |= caps;
 
-   ASSERT((status == not_parsed) && (c == &boot_cpu_data));
-   status = no_mask;
+   return val;
+}
 
-   if (!~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx &
-  opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx &
-  opt_cpuid_mask_xsave_eax))
-   return;
+/* Indices of the masking MSRs, or 0 if unavailable. */
+static unsigned int __read_mostly msr_basic, __read_mostly msr_ext,
+   __read_mostly msr_xsave;
+
+/*
+ * Probe for the existance of the expected masking MSRs.  They might easily
+ * not be available if Xen is running virtualised.
+ */
+static void __init probe_masking_msrs(void)
+{
+   const struct cpuinfo_x86 *c = &boot_cpu_data;
+   unsigned int exp_msr_basic, exp_msr_ext, exp_msr_xsave;
 
/* Only family 6 supports this feature. */
-   if (c->x86 != 6) {
-   printk("No CPUID feature masking support available\n");
+   if (c->x86 != 6)
return;
-   }
 
switch (c->x86_model) {
case 0x17: /* Yorkfield, Wolfdale, Penryn, Harpertown(DP) */
@@ -100,59 +111,121 @@ static void set_cpuidmask(const struct cpuinfo_x86 *c)
break;
}
 
-   status = set_mask;
+   exp_msr_basic = msr_basic;
+   exp_msr_ext   = msr_ext;
+   exp_msr_xsave = msr_xsave;
 
-   if (~(opt_cpuid_mask_ecx & opt_cpuid_mask_edx)) {
-   if (msr_basic)
-   printk("Writing CPUID feature mask ecx:edx -> %08x:%08x\n",
-  opt_cpuid_mask_ecx, opt_cpuid_mask_edx);
-   else
-   printk("No CPUID feature mask available\n");
-   }
-   else
-   msr_basic = 0;
-
-   if (~(opt_cpuid_mask_ext_ecx & opt_cpuid_mask_ext_edx)) {
-   if (msr_ext)
-   printk("Writing CPUID extended feature mask ecx:edx -> %08x:%08x\n",
-

[Xen-devel] [PATCH v4 17/26] x86/pv: Provide custom cpumasks for PV domains

2016-03-23 Thread Andrew Cooper

And use them in preference to cpumask_defaults on context switch.  HVM domains
must not be masked (to avoid interfering with cpuid calls within the guest),
so always lazily context switch to the host default.
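
The selection rule can be sketched in isolation as follows.  This is a
minimal model under stated assumptions: struct masks, struct dom and
pick_masks() are hypothetical stand-ins for the Xen structures, reduced to
the fields the decision depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the real Xen structures. */
struct masks { uint64_t _1cd; };
struct dom   { bool is_pv; const struct masks *pv_masks; };

static const struct masks host_defaults = { ._1cd = ~0ULL };

/*
 * Pick the masks to load for the next domain: a PV domain's private copy if
 * it has one, otherwise the host default (idle, HVM, or no copy allocated).
 */
static const struct masks *pick_masks(const struct dom *nextd)
{
    if (nextd && nextd->is_pv && nextd->pv_masks)
        return nextd->pv_masks;

    return &host_defaults;
}
```

Passing NULL models a switch to no particular domain, and falls back to the
host default in the same way as an HVM domain does.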

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2:
 * s/cpumasks/cpuidmasks/
 * Use structure assignment
 * Fix error path in arch_domain_create()
v3:
 * Indentation fixes.
 * Only allocate PV cpuidmasks if the host has cpumasks to use.
---
 xen/arch/x86/cpu/amd.c   |  4 +++-
 xen/arch/x86/cpu/intel.c |  5 -
 xen/arch/x86/domain.c| 14 ++
 xen/include/asm-x86/domain.h |  2 ++
 4 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 484d4b0..8cb04f0 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -206,7 +206,9 @@ static void __init noinline probe_masking_msrs(void)
 static void amd_ctxt_switch_levelling(const struct domain *nextd)
 {
    struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
-   const struct cpuidmasks *masks = &cpuidmask_defaults;
+   const struct cpuidmasks *masks =
+   (nextd && is_pv_domain(nextd) && nextd->arch.pv_domain.cpuidmasks)
+   ? nextd->arch.pv_domain.cpuidmasks : &cpuidmask_defaults;
 
 #define LAZY(cap, msr, field)  \
({  \
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index 71b1199..00a9987 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -154,13 +154,16 @@ static void __init probe_masking_msrs(void)
 static void intel_ctxt_switch_levelling(const struct domain *nextd)
 {
    struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
-   const struct cpuidmasks *masks = &cpuidmask_defaults;
+   const struct cpuidmasks *masks;
 
if (cpu_has_cpuid_faulting) {
set_cpuid_faulting(nextd && is_pv_domain(nextd));
return;
}
 
+   masks = (nextd && is_pv_domain(nextd) && nextd->arch.pv_domain.cpuidmasks)
+   ? nextd->arch.pv_domain.cpuidmasks : &cpuidmask_defaults;
+
 #define LAZY(msr, field)   \
({  \
if (unlikely(these_masks->field != masks->field) && \
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index abc7194..d0d9773 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -577,6 +577,14 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 goto fail;
 clear_page(d->arch.pv_domain.gdt_ldt_l1tab);
 
+if ( levelling_caps & ~LCAP_faulting )
+{
+d->arch.pv_domain.cpuidmasks = xmalloc(struct cpuidmasks);
+if ( !d->arch.pv_domain.cpuidmasks )
+goto fail;
+*d->arch.pv_domain.cpuidmasks = cpuidmask_defaults;
+}
+
 rc = create_perdomain_mapping(d, GDT_LDT_VIRT_START,
   GDT_LDT_MBYTES << (20 - PAGE_SHIFT),
   NULL, NULL);
@@ -672,7 +680,10 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 paging_final_teardown(d);
 free_perdomain_mappings(d);
 if ( is_pv_domain(d) )
+{
+xfree(d->arch.pv_domain.cpuidmasks);
 free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
+}
 psr_domain_free(d);
 return rc;
 }
@@ -692,7 +703,10 @@ void arch_domain_destroy(struct domain *d)
 
 free_perdomain_mappings(d);
 if ( is_pv_domain(d) )
+{
 free_xenheap_page(d->arch.pv_domain.gdt_ldt_l1tab);
+xfree(d->arch.pv_domain.cpuidmasks);
+}
 
 free_xenheap_page(d->shared_info);
 cleanup_domain_irq_mapping(d);
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index de60def..90f021f 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -252,6 +252,8 @@ struct pv_domain
 
 /* map_domain_page() mapping cache. */
 struct mapcache_domain mapcache;
+
+struct cpuidmasks *cpuidmasks;
 };
 
 struct monitor_write_data {
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v4 21/26] tools/libxc: Use public/featureset.h for cpuid policy generation

2016-03-23 Thread Andrew Cooper

Rather than having a different local copy of some of the feature
definitions.

Modify the xc_cpuid_x86.c cpumask helpers to appropriately truncate the
new values.

As some of the features have been renamed in the public API, similar renames
are made here.

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

v3:
 * Adjust naming to match Xen.
---
 tools/libxc/xc_cpufeature.h | 151 
 tools/libxc/xc_cpuid_x86.c  |  37 ++-
 2 files changed, 20 insertions(+), 168 deletions(-)
 delete mode 100644 tools/libxc/xc_cpufeature.h

diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
deleted file mode 100644
index 01dbeec..000
--- a/tools/libxc/xc_cpufeature.h
+++ /dev/null
@@ -1,151 +0,0 @@
-/*
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation;
- * version 2.1 of the License.
- *
- * This library is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * Lesser General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public
- * License along with this library; If not, see .
- */
-
-#ifndef __LIBXC_CPUFEATURE_H
-#define __LIBXC_CPUFEATURE_H
-
-/* Intel-defined CPU features, CPUID level 0x0001 (edx) */
-#define X86_FEATURE_FPU         0 /* Onboard FPU */
-#define X86_FEATURE_VME         1 /* Virtual Mode Extensions */
-#define X86_FEATURE_DE          2 /* Debugging Extensions */
-#define X86_FEATURE_PSE         3 /* Page Size Extensions */
-#define X86_FEATURE_TSC         4 /* Time Stamp Counter */
-#define X86_FEATURE_MSR         5 /* Model-Specific Registers, RDMSR, WRMSR */
-#define X86_FEATURE_PAE         6 /* Physical Address Extensions */
-#define X86_FEATURE_MCE         7 /* Machine Check Architecture */
-#define X86_FEATURE_CX8         8 /* CMPXCHG8 instruction */
-#define X86_FEATURE_APIC        9 /* Onboard APIC */
-#define X86_FEATURE_SEP        11 /* SYSENTER/SYSEXIT */
-#define X86_FEATURE_MTRR       12 /* Memory Type Range Registers */
-#define X86_FEATURE_PGE        13 /* Page Global Enable */
-#define X86_FEATURE_MCA        14 /* Machine Check Architecture */
-#define X86_FEATURE_CMOV       15 /* CMOV instruction */
-#define X86_FEATURE_PAT        16 /* Page Attribute Table */
-#define X86_FEATURE_PSE36      17 /* 36-bit PSEs */
-#define X86_FEATURE_PN         18 /* Processor serial number */
-#define X86_FEATURE_CLFLSH     19 /* Supports the CLFLUSH instruction */
-#define X86_FEATURE_DS         21 /* Debug Store */
-#define X86_FEATURE_ACPI       22 /* ACPI via MSR */
-#define X86_FEATURE_MMX        23 /* Multimedia Extensions */
-#define X86_FEATURE_FXSR       24 /* FXSAVE and FXRSTOR instructions */
-#define X86_FEATURE_XMM        25 /* Streaming SIMD Extensions */
-#define X86_FEATURE_XMM2       26 /* Streaming SIMD Extensions-2 */
-#define X86_FEATURE_SELFSNOOP  27 /* CPU self snoop */
-#define X86_FEATURE_HT         28 /* Hyper-Threading */
-#define X86_FEATURE_ACC        29 /* Automatic clock control */
-#define X86_FEATURE_IA64       30 /* IA-64 processor */
-#define X86_FEATURE_PBE        31 /* Pending Break Enable */
-
-/* AMD-defined CPU features, CPUID level 0x80000001 */
-/* Don't duplicate feature flags which are redundant with Intel! */
-#define X86_FEATURE_SYSCALL    11 /* SYSCALL/SYSRET */
-#define X86_FEATURE_MP         19 /* MP Capable. */
-#define X86_FEATURE_NX         20 /* Execute Disable */
-#define X86_FEATURE_MMXEXT     22 /* AMD MMX extensions */
-#define X86_FEATURE_FFXSR      25 /* FFXSR instruction optimizations */
-#define X86_FEATURE_PAGE1GB    26 /* 1Gb large page support */
-#define X86_FEATURE_RDTSCP     27 /* RDTSCP */
-#define X86_FEATURE_LM         29 /* Long Mode (x86-64) */
-#define X86_FEATURE_3DNOWEXT   30 /* AMD 3DNow! extensions */
-#define X86_FEATURE_3DNOW      31 /* 3DNow! */
-
-/* Intel-defined CPU features, CPUID level 0x0001 (ecx) */
-#define X86_FEATURE_XMM3        0 /* Streaming SIMD Extensions-3 */
-#define X86_FEATURE_PCLMULQDQ   1 /* Carry-less multiplication */
-#define X86_FEATURE_DTES64      2 /* 64-bit Debug Store */
-#define X86_FEATURE_MWAIT       3 /* Monitor/Mwait support */
-#define X86_FEATURE_DSCPL       4 /* CPL Qualified Debug Store */
-#define X86_FEATURE_VMXE        5 /* Virtual Machine Extensions */
-#define X86_FEATURE_SMXE        6 /* Safer Mode Extensions */
-#define X86_FEATURE_EST         7 /* Enhanced SpeedStep */
-#define X86_FEATURE_TM2         8 /* Thermal Monitor 2 */
-#define X86_FEATURE_SSSE3       9 /* Supplemental Streaming SIMD Exts-3 */
-#define

[Xen-devel] [PATCH v4 16/26] x86/cpu: Context switch cpuid masks and faulting state in context_switch()

2016-03-23 Thread Andrew Cooper

A single ctxt_switch_levelling() function pointer is provided
(defaulting to an empty nop), which is overridden in the appropriate
$VENDOR_init_levelling().

set_cpuid_faulting() is made private and included within
intel_ctxt_switch_levelling().

One functional change is that the faulting configuration is no longer special
cased for dom0.  There was never any need to, and it will cause dom0 to
observe the same information through native and enlightened cpuid.
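
The hook pattern described above can be modelled outside Xen as below.  This
is only a sketch: vendor_init_levelling(), vendor_ctxt_switch_levelling() and
the nr_switches counter are hypothetical stand-ins for the per-vendor code,
and struct domain is left opaque.

```c
#include <assert.h>
#include <stddef.h>

struct domain;                      /* opaque for the purposes of the sketch */

static unsigned int nr_switches;    /* hypothetical: counts vendor hook calls */

static void default_ctxt_switch_levelling(const struct domain *nextd)
{
    (void)nextd;                    /* Nop */
}

static void vendor_ctxt_switch_levelling(const struct domain *nextd)
{
    (void)nextd;
    nr_switches++;                  /* real code would reprogram MSRs here */
}

/* Single hook called on every context switch; starts out as the nop. */
static void (*ctxt_switch_levelling)(const struct domain *nextd) =
    default_ctxt_switch_levelling;

/* Only install the vendor hook if some levelling capability was found. */
static void vendor_init_levelling(unsigned int levelling_caps)
{
    if (levelling_caps)
        ctxt_switch_levelling = vendor_ctxt_switch_levelling;
}
```

Callers never need to test for vendor or capability: with no levelling
support the pointer stays at the nop, and calling it with NULL is the
"return to host default" request used on the crash path.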

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v3:
 * Don't leave cpuid masking/faulting active for the kexec kernel.
v2:
 * Style fixes
 * ASSERT() that faulting is available in set_cpuid_faulting()
---
 xen/arch/x86/cpu/amd.c  |  3 +++
 xen/arch/x86/cpu/common.c   |  7 +++
 xen/arch/x86/cpu/intel.c| 20 +++-
 xen/arch/x86/crash.c|  3 +++
 xen/arch/x86/domain.c   |  4 +---
 xen/include/asm-x86/processor.h |  2 +-
 6 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 0e1c8b9..484d4b0 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -326,6 +326,9 @@ static void __init noinline amd_init_levelling(void)
   (uint32_t)cpuidmask_defaults._7ab0,
   (uint32_t)cpuidmask_defaults._6c);
}
+
+   if (levelling_caps)
+   ctxt_switch_levelling = amd_ctxt_switch_levelling;
 }
 
 /*
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 7ef75b0..fe6eab4 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -88,6 +88,13 @@ static const struct cpu_dev default_cpu = {
 };
 static const struct cpu_dev *this_cpu = &default_cpu;
 
+static void default_ctxt_switch_levelling(const struct domain *nextd)
+{
+   /* Nop */
+}
+void (* __read_mostly ctxt_switch_levelling)(const struct domain *nextd) =
+   default_ctxt_switch_levelling;
+
 bool_t opt_cpu_info;
 boolean_param("cpuinfo", opt_cpu_info);
 
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index b2666a8..71b1199 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -32,13 +32,15 @@ static bool_t __init probe_intel_cpuid_faulting(void)
return 1;
 }
 
-static DEFINE_PER_CPU(bool_t, cpuid_faulting_enabled);
-void set_cpuid_faulting(bool_t enable)
+static void set_cpuid_faulting(bool_t enable)
 {
+   static DEFINE_PER_CPU(bool_t, cpuid_faulting_enabled);
+   bool_t *this_enabled = &this_cpu(cpuid_faulting_enabled);
uint32_t hi, lo;
 
-   if (!cpu_has_cpuid_faulting ||
-   this_cpu(cpuid_faulting_enabled) == enable )
+   ASSERT(cpu_has_cpuid_faulting);
+
+   if (*this_enabled == enable)
return;
 
rdmsr(MSR_INTEL_MISC_FEATURES_ENABLES, lo, hi);
@@ -47,7 +49,7 @@ void set_cpuid_faulting(bool_t enable)
lo |= MSR_MISC_FEATURES_CPUID_FAULTING;
wrmsr(MSR_INTEL_MISC_FEATURES_ENABLES, lo, hi);
 
-   this_cpu(cpuid_faulting_enabled) = enable;
+   *this_enabled = enable;
 }
 
 /*
@@ -154,6 +156,11 @@ static void intel_ctxt_switch_levelling(const struct domain *nextd)
    struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
    const struct cpuidmasks *masks = &cpuidmask_defaults;
 
+   if (cpu_has_cpuid_faulting) {
+   set_cpuid_faulting(nextd && is_pv_domain(nextd));
+   return;
+   }
+
 #define LAZY(msr, field)   \
({  \
if (unlikely(these_masks->field != masks->field) && \
@@ -227,6 +234,9 @@ static void __init noinline intel_init_levelling(void)
   (uint32_t)cpuidmask_defaults.e1cd,
   (uint32_t)cpuidmask_defaults.Da1);
}
+
+   if (levelling_caps)
+   ctxt_switch_levelling = intel_ctxt_switch_levelling;
 }
 
 static void early_init_intel(struct cpuinfo_x86 *c)
diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c
index 888a214..f28f527 100644
--- a/xen/arch/x86/crash.c
+++ b/xen/arch/x86/crash.c
@@ -189,6 +189,9 @@ void machine_crash_shutdown(void)
 
 nmi_shootdown_cpus();
 
+/* Reset CPUID masking and faulting to the host's default. */
+ctxt_switch_levelling(NULL);
+
 info = kexec_crash_save_info();
 info->xen_phys_start = xen_phys_start;
 info->dom0_pfn_to_mfn_frame_list_list =
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 6ec7554..abc7194 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2088,9 +2088,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
 load_segments(next);
 }
 
-set_cpuid_faulting(is_pv_domain(nextd) &&
-   !is_control_domain(nextd) &&
-   !is_hardware_domain(nextd));
+

[Xen-devel] [PATCH v4 24/26] tools/libxc: Wire a featureset through to cpuid policy logic

2016-03-23 Thread Andrew Cooper

Later changes will cause the cpuid generation logic to seed its information
from a featureset.  This patch adds the infrastructure to specify a
featureset, and will obtain the appropriate default from Xen if omitted.
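
One way to picture the "caller-supplied featureset" path is the standalone
sketch below.  copy_featureset() is a hypothetical helper, not the libxc
function, and it rests on an assumption about the intended check: a caller
may pass more words than the host knows about, provided the extra words are
all zero, and a shorter set is zero-extended.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy a caller-supplied featureset of nr words into a
 * freshly allocated array of host_nr words.  Returns 0 on success, or a
 * negative errno value.
 */
static int copy_featureset(uint32_t **out, unsigned int host_nr,
                           const uint32_t *fs, unsigned int nr)
{
    unsigned int i, n = nr < host_nr ? nr : host_nr;
    uint32_t *copy;

    /* Words beyond what the host understands must not carry set bits. */
    for (i = host_nr; i < nr; ++i)
        if (fs[i] != 0)
            return -EOPNOTSUPP;

    copy = calloc(host_nr, sizeof(*copy));   /* zero-extends short input */
    if (!copy)
        return -ENOMEM;

    if (n)
        memcpy(copy, fs, n * sizeof(*copy));

    *out = copy;
    return 0;
}
```

The caller owns the returned array and releases it with free(), mirroring
the role of free_cpuid_domain_info() in the patch.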

Signed-off-by: Andrew Cooper 
Acked-by: Wei Liu 
---
CC: Ian Jackson 

v2:
 * Modify existing call rather than introducing a new one.
 * Fix up in-tree callsites.
---
 tools/libxc/include/xenctrl.h   |  4 ++-
 tools/libxc/xc_cpuid_x86.c  | 69 -
 tools/libxl/libxl_cpuid.c   |  2 +-
 tools/ocaml/libs/xc/xenctrl_stubs.c |  2 +-
 tools/python/xen/lowlevel/xc/xc.c   |  2 +-
 5 files changed, 66 insertions(+), 13 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 66acbd1..872fd08 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1896,7 +1896,9 @@ int xc_cpuid_set(xc_interface *xch,
  const char **config,
  char **config_transformed);
 int xc_cpuid_apply_policy(xc_interface *xch,
-  domid_t domid);
+  domid_t domid,
+  uint32_t *featureset,
+  unsigned int nr_features);
 void xc_cpuid_to_str(const unsigned int *regs,
  char **strs); /* some strs[] may be NULL if ENOMEM */
 int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 0cffb36..a92f5e4 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -166,6 +166,9 @@ struct cpuid_domain_info
 bool pvh;
 uint64_t xfeature_mask;
 
+uint32_t *featureset;
+unsigned int nr_features;
+
 /* PV-only information. */
 bool pv64;
 
@@ -197,11 +200,14 @@ static void cpuid(const unsigned int *input, unsigned int *regs)
 }
 
 static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
- struct cpuid_domain_info *info)
+ struct cpuid_domain_info *info,
+ uint32_t *featureset,
+ unsigned int nr_features)
 {
 struct xen_domctl domctl = {};
 xc_dominfo_t di;
 unsigned int in[2] = { 0, ~0U }, regs[4];
+unsigned int i, host_nr_features = xc_get_cpu_featureset_size();
 int rc;
 
 cpuid(in, regs);
@@ -223,6 +229,23 @@ static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
 info->hvm = di.hvm;
 info->pvh = di.pvh;
 
+info->featureset = calloc(host_nr_features, sizeof(*info->featureset));
+if ( !info->featureset )
+return -ENOMEM;
+
+info->nr_features = host_nr_features;
+
+if ( featureset )
+{
+memcpy(info->featureset, featureset,
+   min(host_nr_features, nr_features) * sizeof(*info->featureset));
+
+/* Check for truncated set bits. */
+for ( i = nr_features; i < host_nr_features; ++i )
+if ( featureset[i] != 0 )
+return -EOPNOTSUPP;
+}
+
 /* Get xstate information. */
 domctl.cmd = XEN_DOMCTL_getvcpuextstate;
 domctl.domain = domid;
@@ -247,6 +270,14 @@ static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
 return rc;
 
 info->nestedhvm = !!val;
+
+if ( !featureset )
+{
+rc = xc_get_cpu_featureset(xch, XEN_SYSCTL_cpu_featureset_hvm,
+   &host_nr_features, info->featureset);
+if ( rc )
+return rc;
+}
 }
 else
 {
@@ -257,11 +288,24 @@ static int get_cpuid_domain_info(xc_interface *xch, domid_t domid,
 return rc;
 
 info->pv64 = (width == 8);
+
+if ( !featureset )
+{
+rc = xc_get_cpu_featureset(xch, XEN_SYSCTL_cpu_featureset_pv,
+   &host_nr_features, info->featureset);
+if ( rc )
+return rc;
+}
 }
 
 return 0;
 }
 
+static void free_cpuid_domain_info(struct cpuid_domain_info *info)
+{
+free(info->featureset);
+}
+
 static void amd_xc_cpuid_policy(xc_interface *xch,
 const struct cpuid_domain_info *info,
 const unsigned int *input, unsigned int *regs)
@@ -789,16 +833,18 @@ void xc_cpuid_to_str(const unsigned int *regs, char **strs)
 }
 }
 
-int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid)
+int xc_cpuid_apply_policy(xc_interface *xch, domid_t domid,
+  uint32_t *featureset,
+  unsigned int nr_features)
 {
 struct cpuid_domain_info info = {};
 unsigned int input[2] = { 0, 0 }, regs[4];
 unsigned int base_max, ext_max;
 int rc;
 
-rc = get_cpuid_domain_info(xch, domid, &info);
+rc = get_cpuid_domain_info(xch, domid, &info,

[Xen-devel] [PATCH v4 18/26] x86/domctl: Update PV domain cpumasks when setting cpuid policy

2016-03-23 Thread Andrew Cooper

This allows PV domains with different featuresets to observe different values
from a native cpuid instruction, on supporting hardware.

It is important to leak the host view of HTT and CMP_LEGACY through to guests,
even though they could be hidden.  These flags affect how to interpret other
cpuid leaves which are not maskable.
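
For the Intel case, the idea can be sketched as follows.  intel_1cd_mask() is
a hypothetical helper, and the host_x2apic/host_htt parameters stand in for
the host feature checks; the bit positions are the architectural ones
(X2APIC is CPUID.1:ecx bit 21, HTT is CPUID.1:edx bit 28), and the MSR layout
assumed is edx in the upper half, ecx in the lower half.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ECX_X2APIC (1u << 21)   /* CPUID.1:ecx bit 21 */
#define EDX_HTT    (1u << 28)   /* CPUID.1:edx bit 28 */

/*
 * Hypothetical helper: build the leaf 1 mask for a PV guest from the
 * toolstack-requested ecx/edx, forcing the host's X2APIC and HTT values
 * through, then clamp against the host default (an AND mask).
 */
static uint64_t intel_1cd_mask(uint64_t host_default,
                               uint32_t ecx, uint32_t edx,
                               bool host_x2apic, bool host_htt)
{
    if (host_x2apic)
        ecx |= ECX_X2APIC;
    if (host_htt)
        edx |= EDX_HTT;

    return host_default & (((uint64_t)edx << 32) | ecx);
}
```

Forcing the two bits before the AND means a guest policy can never hide them
when the host has them, while the host default still prevents them from
appearing on hardware that lacks them.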

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * Use switch() rather than if/elseif chain
 * Clamp to static PV featuremask
v3:
 * Only set a shadow cpumask if it is available in hardware.  This causes
   fewer branches in the context switch.
 * Fix interaction between fastforward bits and override MSR.
 * Fix up the cross-vendor case.
 * Fix the host view of HTT/CMP_LEGACY.
v4:
 * More comments explaining the masking MSRs behaviour.
 * s/CPU/CPUID/
 * Leak host X2APIC.
---
 xen/arch/x86/domctl.c| 138 +++
 xen/include/asm-x86/cpufeature.h |   1 +
 2 files changed, 139 insertions(+)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index b7c7f42..403bae8 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static int gdbsx_guest_mem_io(domid_t domid, struct xen_domctl_gdbsx_memio *iop)
 {
@@ -87,6 +88,143 @@ static void update_domain_cpuid_info(struct domain *d,
 d->arch.x86_model = (ctl->eax >> 4) & 0xf;
 if ( d->arch.x86 >= 0x6 )
 d->arch.x86_model |= (ctl->eax >> 12) & 0xf0;
+
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_1cd) == LCAP_1cd) )
+{
+uint64_t mask = cpuidmask_defaults._1cd;
+uint32_t ecx = ctl->ecx & pv_featureset[FEATURESET_1c];
+uint32_t edx = ctl->edx & pv_featureset[FEATURESET_1d];
+
+/*
+ * Must expose hosts HTT and X2APIC value so a guest using native
+ * CPUID can correctly interpret other leaves which cannot be
+ * masked.
+ */
+if ( cpu_has_x2apic )
+ecx |= cpufeat_mask(X86_FEATURE_X2APIC);
+if ( cpu_has_htt )
+edx |= cpufeat_mask(X86_FEATURE_HTT);
+
+switch ( boot_cpu_data.x86_vendor )
+{
+case X86_VENDOR_INTEL:
+/*
+ * Intel masking MSRs are documented as AND masks.
+ * Experimentally, they are applied before OSXSAVE and APIC
+ * are fast-forwarded from real hardware state.
+ */
+mask &= ((uint64_t)edx << 32) | ecx;
+break;
+
+case X86_VENDOR_AMD:
+mask &= ((uint64_t)ecx << 32) | edx;
+
+/*
+ * AMD masking MSRs are documented as overrides.
+ * Experimentally, fast-forwarding of the OSXSAVE and APIC
+ * bits from real hardware state only occurs if the MSR has
+ * the respective bits set.
+ */
+if ( ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
+ecx = cpufeat_mask(X86_FEATURE_OSXSAVE);
+else
+ecx = 0;
+edx = cpufeat_mask(X86_FEATURE_APIC);
+
+mask |= ((uint64_t)ecx << 32) | edx;
+break;
+}
+
+d->arch.pv_domain.cpuidmasks->_1cd = mask;
+}
+break;
+
+case 6:
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_6c) == LCAP_6c) )
+{
+uint64_t mask = cpuidmask_defaults._6c;
+
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+mask &= (~0ULL << 32) | ctl->ecx;
+
+d->arch.pv_domain.cpuidmasks->_6c = mask;
+}
+break;
+
+case 7:
+if ( ctl->input[1] != 0 )
+break;
+
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_7ab0) == LCAP_7ab0) )
+{
+uint64_t mask = cpuidmask_defaults._7ab0;
+uint32_t eax = ctl->eax;
+uint32_t ebx = ctl->ebx & pv_featureset[FEATURESET_7b0];
+
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+mask &= ((uint64_t)eax << 32) | ebx;
+
+d->arch.pv_domain.cpuidmasks->_7ab0 = mask;
+}
+break;
+
+case 0xd:
+if ( ctl->input[1] != 1 )
+break;
+
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_Da1) == LCAP_Da1) )
+{
+uint64_t mask = cpuidmask_defaults.Da1;
+uint32_t eax = ctl->eax & pv_featureset[FEATURESET_Da1];
+
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+mask &= (~0ULL << 32) | eax;
+
+d->arch.pv_domain.cpuidmasks->Da1 = mask;
+}
+break;
+
+case 0x80000001:
+if ( is_pv_domain(d) && ((levelling_caps & LCAP_e1cd) == LCAP_e1cd) )
+{
+uint64_t mask = cpuidmask_defaults.e1cd;
+

[Xen-devel] [PATCH v4 20/26] tools/libxc: Modify bitmap operations to take void pointers

2016-03-23 Thread Andrew Cooper

The type of the pointer to a bitmap is not interesting; it does not affect the
representation of the block of bits being pointed to.

Make the libxc functions consistent with those in Xen, so they can work just
as well with 'unsigned int *' based bitmaps.

As part of doing so, change the implementation to be in terms of char rather
than unsigned long.  This fixes alignment concerns with ARM.
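
A byte-granular bitmap of this kind can be sketched as below.  The names
(bitmap_bytes(), bit_test(), bit_set(), bit_clear()) are hypothetical, chosen
to avoid clashing with the libxc helpers, and the sketch uses unsigned char
rather than plain char, which is an assumption made here to keep the shifts
free of sign concerns.

```c
#include <assert.h>

/* Bytes needed to hold nr_bits bits, rounding up. */
static int bitmap_bytes(int nr_bits)
{
    return (nr_bits + 7) / 8;
}

/* All accessors index by byte, so any pointer alignment is acceptable. */
static int bit_test(int nr, const void *bmap)
{
    const unsigned char *b = bmap;

    return (b[nr / 8] >> (nr % 8)) & 1;
}

static void bit_set(int nr, void *bmap)
{
    unsigned char *b = bmap;

    b[nr / 8] |= (unsigned char)(1u << (nr % 8));
}

static void bit_clear(int nr, void *bmap)
{
    unsigned char *b = bmap;

    b[nr / 8] &= (unsigned char)~(1u << (nr % 8));
}
```

Because the helpers take void pointers, the same calls work on an array of
unsigned int or unsigned long without casts at the call site.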

Signed-off-by: Andrew Cooper 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Stefano Stabellini 
CC: Julien Grall 

v2:
 * New
v3:
 * Implement in terms of char rather than unsigned long to fix alignment
   issues for ARM.
v4:
 * Fix erroneous calculation in bitmap_size()
---
 tools/libxc/xc_bitops.h | 37 -
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/tools/libxc/xc_bitops.h b/tools/libxc/xc_bitops.h
index cd749f4..3e7a544 100644
--- a/tools/libxc/xc_bitops.h
+++ b/tools/libxc/xc_bitops.h
@@ -6,70 +6,73 @@
 #include 
 #include 
 
+/* Needed by several includees, but no longer used for bitops. */
 #define BITS_PER_LONG (sizeof(unsigned long) * 8)
 #define ORDER_LONG (sizeof(unsigned long) == 4 ? 5 : 6)
 
-#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr)/BITS_PER_LONG]
-#define BITMAP_SHIFT(_nr) ((_nr) % BITS_PER_LONG)
+#define BITMAP_ENTRY(_nr,_bmap) ((_bmap))[(_nr) / 8]
+#define BITMAP_SHIFT(_nr) ((_nr) % 8)
 
 /* calculate required space for number of longs needed to hold nr_bits */
 static inline int bitmap_size(int nr_bits)
 {
-int nr_long, nr_bytes;
-nr_long = (nr_bits + BITS_PER_LONG - 1) >> ORDER_LONG;
-nr_bytes = nr_long * sizeof(unsigned long);
-return nr_bytes;
+return (nr_bits + 7) / 8;
 }
 
-static inline unsigned long *bitmap_alloc(int nr_bits)
+static inline void *bitmap_alloc(int nr_bits)
 {
 return calloc(1, bitmap_size(nr_bits));
 }
 
-static inline void bitmap_set(unsigned long *addr, int nr_bits)
+static inline void bitmap_set(void *addr, int nr_bits)
 {
 memset(addr, 0xff, bitmap_size(nr_bits));
 }
 
-static inline void bitmap_clear(unsigned long *addr, int nr_bits)
+static inline void bitmap_clear(void *addr, int nr_bits)
 {
 memset(addr, 0, bitmap_size(nr_bits));
 }
 
-static inline int test_bit(int nr, unsigned long *addr)
+static inline int test_bit(int nr, const void *_addr)
 {
+const char *addr = _addr;
 return (BITMAP_ENTRY(nr, addr) >> BITMAP_SHIFT(nr)) & 1;
 }
 
-static inline void clear_bit(int nr, unsigned long *addr)
+static inline void clear_bit(int nr, void *_addr)
 {
+char *addr = _addr;
 BITMAP_ENTRY(nr, addr) &= ~(1UL << BITMAP_SHIFT(nr));
 }
 
-static inline void set_bit(int nr, unsigned long *addr)
+static inline void set_bit(int nr, void *_addr)
 {
+char *addr = _addr;
 BITMAP_ENTRY(nr, addr) |= (1UL << BITMAP_SHIFT(nr));
 }
 
-static inline int test_and_clear_bit(int nr, unsigned long *addr)
+static inline int test_and_clear_bit(int nr, void *addr)
 {
 int oldbit = test_bit(nr, addr);
 clear_bit(nr, addr);
 return oldbit;
 }
 
-static inline int test_and_set_bit(int nr, unsigned long *addr)
+static inline int test_and_set_bit(int nr, void *addr)
 {
 int oldbit = test_bit(nr, addr);
 set_bit(nr, addr);
 return oldbit;
 }
 
-static inline void bitmap_or(unsigned long *dst, const unsigned long *other,
+static inline void bitmap_or(void *_dst, const void *_other,
  int nr_bits)
 {
-int i, nr_longs = (bitmap_size(nr_bits) / sizeof(unsigned long));
-for ( i = 0; i < nr_longs; ++i )
+char *dst = _dst;
+const char *other = _other;
+int i;
+for ( i = 0; i < bitmap_size(nr_bits); ++i )
 dst[i] |= other[i];
 }
 
-- 
2.1.4


[Xen-devel] [PATCH v4 14/26] x86/cpu: Rework AMD masking MSR setup

2016-03-23 Thread Andrew Cooper

This patch is best reviewed as its end result rather than as a diff, as it
rewrites almost all of the setup.

On the BSP, cpuid information is used to evaluate the potential available set
of masking MSRs, and they are unconditionally probed, filling in the
availability information and hardware defaults.

The command line parameters are then combined with the hardware defaults to
further restrict the Xen default masking level.  Each cpu is then context
switched into the default levelling state.
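
The probe bookkeeping can be modelled in user space as below.  Everything
here is a sketch under stated assumptions: fake_rdmsr_safe() stands in for a
fault-tolerant MSR read, the MSR indices and CAP_* bits are made up, and
missing_caps() computes "expected but not found", which is how the mismatch
is understood here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CAP_BASIC 0x1u              /* hypothetical capability bits */
#define CAP_EXT   0x2u

static unsigned int expected_caps, found_caps;

/* Stand-in for a safe MSR read: only the made-up index 0x100 "exists". */
static bool fake_rdmsr_safe(unsigned int msr, uint64_t *val)
{
    if (msr != 0x100)
        return false;

    *val = 0x0123456789abcdefULL;
    return true;
}

/* Record the expectation, probe, and return the hardware default (or 0). */
static uint64_t probe_mask_msr(unsigned int msr, unsigned int caps)
{
    uint64_t val = 0;

    expected_caps |= caps;          /* what cpuid suggests should be present */
    if (fake_rdmsr_safe(msr, &val))
        found_caps |= caps;         /* what actually responded */

    return val;
}

/* Capabilities that were expected but did not respond to the probe. */
static unsigned int missing_caps(void)
{
    return expected_caps & ~found_caps;
}
```

Keeping the expected and found sets separate is what allows a warning to be
printed only on bare metal, where a mismatch is surprising, and suppressed
when running virtualised.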

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v2:
 * Provide extra information if opt_cpu_info
 * Extra comment indicating the expected use of amd_ctxt_switch_levelling()
v3:
 * Fix the interaction of the fast-forward bits with the override MSRs.
 * Style fixups.
---
 xen/arch/x86/cpu/amd.c | 276 -
 1 file changed, 179 insertions(+), 97 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 5516777..0e1c8b9 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -80,6 +80,13 @@ static inline int wrmsr_amd_safe(unsigned int msr, unsigned int lo,
return err;
 }
 
+static void wrmsr_amd(unsigned int msr, uint64_t val)
+{
+   asm volatile("wrmsr" ::
+"c" (msr), "a" ((uint32_t)val),
+"d" (val >> 32), "D" (0x9c5a203a));
+}
+
 static const struct cpuidmask {
uint16_t fam;
char rev[2];
@@ -126,126 +133,198 @@ static const struct cpuidmask *__init noinline get_cpuidmask(const char *opt)
 }
 
 /*
+ * Sets caps in expected_levelling_cap, probes for the specified mask MSR, and
+ * set caps in levelling_caps if it is found.  Processors prior to Fam 10h
+ * required a 32-bit password for masking MSRs.  Returns the default value.
+ */
+static uint64_t __init _probe_mask_msr(unsigned int msr, uint64_t caps)
+{
+   unsigned int hi, lo;
+
+   expected_levelling_cap |= caps;
+
+   if ((rdmsr_amd_safe(msr, &lo, &hi) == 0) &&
+   (wrmsr_amd_safe(msr, lo, hi) == 0))
+   levelling_caps |= caps;
+
+   return ((uint64_t)hi << 32) | lo;
+}
+
+/*
+ * Probe for the existance of the expected masking MSRs.  They might easily
+ * not be available if Xen is running virtualised.
+ */
+static void __init noinline probe_masking_msrs(void)
+{
+   const struct cpuinfo_x86 *c = &boot_cpu_data;
+
+   /*
+* First, work out which masking MSRs we should have, based on
+* revision and cpuid.
+*/
+
+   /* Fam11 doesn't support masking at all. */
+   if (c->x86 == 0x11)
+   return;
+
+   cpuidmask_defaults._1cd =
+   _probe_mask_msr(MSR_K8_FEATURE_MASK, LCAP_1cd);
+   cpuidmask_defaults.e1cd =
+   _probe_mask_msr(MSR_K8_EXT_FEATURE_MASK, LCAP_e1cd);
+
+   if (c->cpuid_level >= 7)
+   cpuidmask_defaults._7ab0 =
+   _probe_mask_msr(MSR_AMD_L7S0_FEATURE_MASK, LCAP_7ab0);
+
+   if (c->x86 == 0x15 && c->cpuid_level >= 6 && cpuid_ecx(6))
+   cpuidmask_defaults._6c =
+   _probe_mask_msr(MSR_AMD_THRM_FEATURE_MASK, LCAP_6c);
+
+   /*
+* Don't bother warning about a mismatch if virtualised.  These MSRs
+* are not architectural and almost never virtualised.
+*/
+   if ((expected_levelling_cap == levelling_caps) ||
+   cpu_has_hypervisor)
+   return;
+
+   printk(XENLOG_WARNING "Mismatch between expected (%#x) "
+  "and real (%#x) levelling caps: missing %#x\n",
+  expected_levelling_cap, levelling_caps,
+  (expected_levelling_cap ^ levelling_caps) & levelling_caps);
+   printk(XENLOG_WARNING "Fam %#x, model %#x level %#x\n",
+  c->x86, c->x86_model, c->cpuid_level);
+   printk(XENLOG_WARNING
+  "If not running virtualised, please report a bug\n");
+}
+
+/*
+ * Context switch levelling state to the next domain.  A parameter of NULL is
+ * used to context switch to the default host state, and is used by the BSP/AP
+ * startup code.
+ */
+static void amd_ctxt_switch_levelling(const struct domain *nextd)
+{
+   struct cpuidmasks *these_masks = &this_cpu(cpuidmasks);
+   const struct cpuidmasks *masks = &cpuidmask_defaults;
+
+#define LAZY(cap, msr, field)  \
+   ({  \
+   if (unlikely(these_masks->field != masks->field) && \
+   ((levelling_caps & cap) == cap))\
+   {   \
+   wrmsr_amd(msr, masks->field);   \
+   these_masks->field = masks->field;  \
+   }   \
+   })
+
+   LAZY(LCAP_1cd,  MSR_K8_FEATURE_MASK,

[Xen-devel] [PATCH v4 09/26] xen/x86: Clear dependent features when clearing a cpu cap

2016-03-23 Thread Andrew Cooper

When clearing a cpu cap, clear all dependent features.  This avoids having a
featureset with intermediate features disabled, but leaf features enabled.
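
The behaviour can be sketched in Python with sets standing in for the bitmaps. This is an illustration under assumptions: the dependency table below is hypothetical and merely plays the role of lookup_deep_deps(); it is not Xen's actual data.

```python
# Illustrative only: the dependency table is an assumption, not Xen's real table.
DEEP_DEPS = {
    "XSAVE": {"XSAVEOPT", "AVX", "AVX2"},  # hypothetical transitive dependents
    "AVX": {"AVX2"},
}


def clear_cap(cap, enabled, cleared):
    """Clear cap and everything depending on it; return False if already cleared."""
    if cap in cleared:           # mirrors the early return on __test_and_set_bit()
        return False
    cleared.add(cap)
    enabled.discard(cap)
    deps = DEEP_DEPS.get(cap)
    if deps:
        cleared |= deps          # in-place update of the caller's set
        enabled -= deps
    return True


enabled = {"FPU", "XSAVE", "XSAVEOPT", "AVX", "AVX2"}
cleared = set()
first = clear_cap("XSAVE", enabled, cleared)
second = clear_cap("XSAVE", enabled, cleared)   # no-op: already cleared
```

Recording the dependents in the cleared set as well means they stay cleared if the capabilities are recalculated later.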

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
v3:
 * Style fixes.  Use __test_and_set_bit()
---
 xen/arch/x86/cpu/common.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index d302272..0942b44 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -53,8 +53,22 @@ static unsigned int cleared_caps[NCAPINTS];
 
 void __init setup_clear_cpu_cap(unsigned int cap)
 {
+   const uint32_t *dfs;
+   unsigned int i;
+
+   if (__test_and_set_bit(cap, cleared_caps))
+   return;
+
__clear_bit(cap, boot_cpu_data.x86_capability);
-   __set_bit(cap, cleared_caps);
+   dfs = lookup_deep_deps(cap);
+
+   if (!dfs)
+   return;
+
+   for (i = 0; i < FSCAPINTS; ++i) {
+   cleared_caps[i] |= dfs[i];
+   boot_cpu_data.x86_capability[i] &= ~dfs[i];
+   }
 }
 
 static void default_init(struct cpuinfo_x86 * c)
-- 
2.1.4



Re: [Xen-devel] Outreachy bite-sized tasks

2016-03-23 Thread Paulina Szubarczyk

Hi,

Thank you for the proposed tasks. I would like to work on the second one,
fixing the return codes in xl.

Regards,
Paulina Szubarczyk

On 23 March 2016 at 16:32, Roger Pau Monné  wrote:

> Hello,
>
> First of all, thanks for your interest in the Xen Project, and for wanting
> to participate in Outreachy.
>
> Both of you have expressed interest in the "QEMU xen-blkback performance
> analysis and improvements" Outreachy project, and AFAIK both of you still
> need to perform your initial contribution.
>
> I've found a couple of small tasks that you can perform and should allow
> you to complete your initial contribution to the project:
>
>  - The first one is related to xenalyze, and it consists of creating a
>    header file that can be shared by both the Xen kernel and xenalyze.
>    You will need to move the TRC_ defines found in sched_credit.c and
>    sched_credit2.c to a header that's shared with xenalyze and then
>    replace the usage of TRC_SCHED_CLASS_EVT with the defines in the header
>    file [0].
>
>  - The second one consists of fixing the return codes of certain xl
>    commands. There are commands in xl that will return 0 (SUCCESS) even
>    when failing, which makes it very hard to write scripts that make use
>    of xl. A list of those commands can be found in [1], together with some
>    preliminary patches. Please note that those patches have comments that
>    you will need to address, and that you will also need to preserve the
>    original authorship of the patches alongside your own.
>
> I encourage you to read the wiki page about sending patches to xen-devel
> [2], it should guide you through your first steps on using git and
> creating suitable patches.
>
> Also, please note that this is an open source project, so you will need to
> coordinate in order to figure out which task each of you is going to take, in
> order to avoid clashes or duplication of efforts.
>
> If you have further questions, either reply to this thread (keeping
> the xen-devel mailing list on the Cc), or feel free to start another one
> if you think it's more suited.
>
> Roger.
>
> [0]
> http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg02624.html
> [1]
> http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02246.html
> [2] http://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches
>

[Xen-devel] [PATCH v4 05/26] xen/x86: Annotate special features

2016-03-23 Thread Andrew Cooper

Some bits in a featureset are not simply an indication of new functionality,
and require special handling.

APIC, OSXSAVE and OSPKE are fast-forwards of other pieces of state;
IA32_APIC_BASE.EN, CR4.OSXSAVE and CR4.OSPKE.  Xen will take care of filling
these appropriately at runtime.

FDP_EXCP_ONLY and NO_FPU_SEL are bits indicating reduced functionality in the
x87 pipeline.  The effects of these cannot be hidden from the guest, so the
host values will always be provided.

HTT, X2APIC and CMP_LEGACY indicate how to interpret other cpuid leaves.  In
most cases, the toolstack value will be used (with the expectation that these
flags will match the other provided topology information).  However, with cpuid
masking, the host values are presented, as masking cannot influence what the
guest sees in the dependent leaves.

HYPERVISOR is unconditionally set in the PV ABI, but follows the toolstack
setting for HVM guests.

Signed-off-by: Andrew Cooper 
Acked-by: Jan Beulich 
Reviewed-by: Konrad Rzeszutek Wilk 
---
v3:
 * Essentially new.  Replaces "Store antifeatures inverted in a featureset"
v4:
 * Include X2APIC and HYPERVISOR as special bits.
---
 xen/arch/x86/cpuid.c|  2 ++
 xen/include/asm-x86/cpuid.h |  1 +
 xen/include/public/arch-x86/cpufeatureset.h | 30 -
 xen/tools/gen-cpuid.py  | 17 +++-
 4 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 05cd646..77e008a 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -3,10 +3,12 @@
 #include 
 
 const uint32_t known_features[] = INIT_KNOWN_FEATURES;
+const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
 
 static void __init __maybe_unused build_assertions(void)
 {
 BUILD_BUG_ON(ARRAY_SIZE(known_features) != FSCAPINTS);
+BUILD_BUG_ON(ARRAY_SIZE(special_features) != FSCAPINTS);
 }
 
 /*
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index b72d88f..0ecf357 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -10,6 +10,7 @@
 #include 
 
 extern const uint32_t known_features[FSCAPINTS];
+extern const uint32_t special_features[FSCAPINTS];
 
 #endif /* __ASSEMBLY__ */
 #endif /* !__X86_CPUID_H__ */
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 5da37eb..8308972 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -71,6 +71,18 @@ enum {
  * CPUID instruction, but this is not preclude other sources of information.
  */
 
+/*
+ * Attribute syntax:
+ *
+ * Attributes for a particular feature are provided as characters before the
+ * first space in the comment immediately following the feature value.
+ *
+ * Special: '!'
+ *   This bit has special properties and is not a straight indication of a
+ *   piece of new functionality.  Xen will handle these differently,
+ *   and may override toolstack settings completely.
+ */
+
 /* Intel-defined CPU features, CPUID level 0x00000001.edx, word 0 */
 XEN_CPUFEATURE(FPU,           0*32+ 0) /*   Onboard FPU */
 XEN_CPUFEATURE(VME,           0*32+ 1) /*   Virtual Mode Extensions */
@@ -81,7 +93,7 @@ XEN_CPUFEATURE(MSR,           0*32+ 5) /*   Model-Specific Registers, RDMSR, WRM
 XEN_CPUFEATURE(PAE,           0*32+ 6) /*   Physical Address Extensions */
 XEN_CPUFEATURE(MCE,           0*32+ 7) /*   Machine Check Architecture */
 XEN_CPUFEATURE(CX8,           0*32+ 8) /*   CMPXCHG8 instruction */
-XEN_CPUFEATURE(APIC,          0*32+ 9) /*   Onboard APIC */
+XEN_CPUFEATURE(APIC,          0*32+ 9) /*!  Onboard APIC */
 XEN_CPUFEATURE(SEP,           0*32+11) /*   SYSENTER/SYSEXIT */
 XEN_CPUFEATURE(MTRR,          0*32+12) /*   Memory Type Range Registers */
 XEN_CPUFEATURE(PGE,           0*32+13) /*   Page Global Enable */
@@ -96,7 +108,7 @@ XEN_CPUFEATURE(MMX,           0*32+23) /*   Multimedia Extensions */
 XEN_CPUFEATURE(FXSR,          0*32+24) /*   FXSAVE and FXRSTOR instructions */
 XEN_CPUFEATURE(SSE,           0*32+25) /*   Streaming SIMD Extensions */
 XEN_CPUFEATURE(SSE2,          0*32+26) /*   Streaming SIMD Extensions-2 */
-XEN_CPUFEATURE(HTT,           0*32+28) /*   Hyper-Threading Technology */
+XEN_CPUFEATURE(HTT,           0*32+28) /*!  Hyper-Threading Technology */
 XEN_CPUFEATURE(TM1,           0*32+29) /*   Thermal Monitor 1 */
 XEN_CPUFEATURE(PBE,           0*32+31) /*   Pending Break Enable */
 
@@ -119,17 +131,17 @@ XEN_CPUFEATURE(PCID,          1*32+17) /*   Process Context ID */
 XEN_CPUFEATURE(DCA,           1*32+18) /*   Direct Cache Access */
 XEN_CPUFEATURE(SSE4_1,        1*32+19) /*   Streaming SIMD Extensions 4.1 */
 XEN_CPUFEATURE(SSE4_2,        1*32+20) /*   Streaming SIMD Extensions 4.2 */
-XEN_CPUFEATURE(X2APIC,        1*32+21) /*   Extended xAPIC */
+XEN_CPUFEATURE(X2APIC,

[Xen-devel] [PATCH v4 06/26] xen/x86: Annotate VM applicability in featureset

2016-03-23 Thread Andrew Cooper

Use attributes to specify whether a feature is applicable to be exposed to:
 1) All guests
 2) HVM guests
 3) HVM HAP guests
and, via absence of an attribute, to no guests.

There is no current need for other categories (e.g. PV-only features), and
such categories should not be introduced if possible.  These categories follow
from the fact that, with increased hardware support, a guest gets more
features to use.

These settings are derived from the existing code in {pv,hvm}_cpuid(), and
xc_cpuid_x86.c.  One notable exception is EXTAPIC which was previously
erroneously exposed to guests.  PV guests don't get to use the APIC and the
HVM APIC emulation doesn't support extended space.
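
The nesting of the three categories can be sketched as follows. This is a minimal Python illustration, not the code in gen-cpuid.py: the regex and the classify() helper are assumptions, and the PAGE1GB sample line with an 'H' attribute is hypothetical, added only to exercise that category.

```python
import re

# Sketch only: this regex and helper are assumptions, not gen-cpuid.py's code.
# Attributes are the characters before the first space in the trailing comment.
LINE = re.compile(r"^XEN_CPUFEATURE\((\w+),\s*[^)]*\)\s*/\*(\S*)\s")


def classify(lines):
    """Return (pv, hvm_shadow, hvm_hap) name sets from annotated feature lines."""
    pv, shadow, hap = set(), set(), set()
    for line in lines:
        m = LINE.match(line)
        if m is None:
            continue
        name, attrs = m.groups()
        if "A" in attrs:                       # all guests, PV included
            pv.add(name)
        if "A" in attrs or "S" in attrs:       # every HVM guest
            shadow.add(name)
        if any(c in attrs for c in "ASH"):     # HVM HAP guests get the widest set
            hap.add(name)
    return pv, shadow, hap


sample = [
    "XEN_CPUFEATURE(FPU,   0*32+ 0) /*A  Onboard FPU */",
    "XEN_CPUFEATURE(VME,   0*32+ 1) /*S  Virtual Mode Extensions */",
    "XEN_CPUFEATURE(APIC,  0*32+ 9) /*!A Onboard APIC */",
    "XEN_CPUFEATURE(DS,    0*32+21) /*   Debug Store */",
    "XEN_CPUFEATURE(PAGE1GB, 2*32+26) /*H  hypothetical HAP-only sample */",
]
pv, shadow, hap = classify(sample)
```

A feature with no applicability attribute, such as DS above, ends up in none of the three sets.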

Signed-off-by: Andrew Cooper 
Reviewed-by: Konrad Rzeszutek Wilk 
---
CC: Jan Beulich 

v2:
 * Annotate features using a magic comment and autogeneration.
v3:
 * Rebase over the new namespacing changes.
 * Expand commit message.
 * Correct PSE36 to being a HAP-only feature.
v4:
 * Re-break PSE36.
 * Hide LWP from PV guests.
---
 xen/include/public/arch-x86/cpufeatureset.h | 187 ++--
 xen/tools/gen-cpuid.py  |  30 -
 2 files changed, 125 insertions(+), 92 deletions(-)

diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 8308972..75dd2ac 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -81,135 +81,140 @@ enum {
  *   This bit has special properties and is not a straight indication of a
  *   piece of new functionality.  Xen will handle these differently,
  *   and may override toolstack settings completely.
+ *
+ * Applicability to guests: 'A', 'S' or 'H'
+ *   'A' = All guests.
+ *   'S' = All HVM guests (not PV guests).
+ *   'H' = HVM HAP guests (not PV or HVM Shadow guests).
  */
 
 /* Intel-defined CPU features, CPUID level 0x00000001.edx, word 0 */
-XEN_CPUFEATURE(FPU,           0*32+ 0) /*   Onboard FPU */
-XEN_CPUFEATURE(VME,           0*32+ 1) /*   Virtual Mode Extensions */
-XEN_CPUFEATURE(DE,            0*32+ 2) /*   Debugging Extensions */
-XEN_CPUFEATURE(PSE,           0*32+ 3) /*   Page Size Extensions */
-XEN_CPUFEATURE(TSC,           0*32+ 4) /*   Time Stamp Counter */
-XEN_CPUFEATURE(MSR,           0*32+ 5) /*   Model-Specific Registers, RDMSR, WRMSR */
-XEN_CPUFEATURE(PAE,           0*32+ 6) /*   Physical Address Extensions */
-XEN_CPUFEATURE(MCE,           0*32+ 7) /*   Machine Check Architecture */
-XEN_CPUFEATURE(CX8,           0*32+ 8) /*   CMPXCHG8 instruction */
-XEN_CPUFEATURE(APIC,          0*32+ 9) /*!  Onboard APIC */
-XEN_CPUFEATURE(SEP,           0*32+11) /*   SYSENTER/SYSEXIT */
-XEN_CPUFEATURE(MTRR,          0*32+12) /*   Memory Type Range Registers */
-XEN_CPUFEATURE(PGE,           0*32+13) /*   Page Global Enable */
-XEN_CPUFEATURE(MCA,           0*32+14) /*   Machine Check Architecture */
-XEN_CPUFEATURE(CMOV,          0*32+15) /*   CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
-XEN_CPUFEATURE(PAT,           0*32+16) /*   Page Attribute Table */
-XEN_CPUFEATURE(PSE36,         0*32+17) /*   36-bit PSEs */
-XEN_CPUFEATURE(CLFLUSH,       0*32+19) /*   CLFLUSH instruction */
+XEN_CPUFEATURE(FPU,           0*32+ 0) /*A  Onboard FPU */
+XEN_CPUFEATURE(VME,           0*32+ 1) /*S  Virtual Mode Extensions */
+XEN_CPUFEATURE(DE,            0*32+ 2) /*A  Debugging Extensions */
+XEN_CPUFEATURE(PSE,           0*32+ 3) /*S  Page Size Extensions */
+XEN_CPUFEATURE(TSC,           0*32+ 4) /*A  Time Stamp Counter */
+XEN_CPUFEATURE(MSR,           0*32+ 5) /*A  Model-Specific Registers, RDMSR, WRMSR */
+XEN_CPUFEATURE(PAE,           0*32+ 6) /*A  Physical Address Extensions */
+XEN_CPUFEATURE(MCE,           0*32+ 7) /*A  Machine Check Architecture */
+XEN_CPUFEATURE(CX8,           0*32+ 8) /*A  CMPXCHG8 instruction */
+XEN_CPUFEATURE(APIC,          0*32+ 9) /*!A Onboard APIC */
+XEN_CPUFEATURE(SEP,           0*32+11) /*A  SYSENTER/SYSEXIT */
+XEN_CPUFEATURE(MTRR,          0*32+12) /*S  Memory Type Range Registers */
+XEN_CPUFEATURE(PGE,           0*32+13) /*S  Page Global Enable */
+XEN_CPUFEATURE(MCA,           0*32+14) /*A  Machine Check Architecture */
+XEN_CPUFEATURE(CMOV,          0*32+15) /*A  CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
+XEN_CPUFEATURE(PAT,           0*32+16) /*A  Page Attribute Table */
+XEN_CPUFEATURE(PSE36,         0*32+17) /*S  36-bit PSEs */
+XEN_CPUFEATURE(CLFLUSH,       0*32+19) /*A  CLFLUSH instruction */
 XEN_CPUFEATURE(DS,            0*32+21) /*   Debug Store */
-XEN_CPUFEATURE(ACPI,          0*32+22) /*   ACPI via MSR */
-XEN_CPUFEATURE(MMX,           0*32+23) /*   Multimedia Extensions */
-XEN_CPUFEATURE(FXSR,          0*32+24) /*   FXSAVE and FXRSTOR instructions */
-XEN_CPUFEATURE(SSE,           0*32+25) /*   Streaming SIMD Extensions */
-XEN_CPUFEATURE(SSE2,          0*32+26) /*   Streaming SIMD Extensions-2

[Xen-devel] [PATCH v4 02/26] xen/x86: Script to automatically process featureset information

2016-03-23 Thread Andrew Cooper

This script consumes include/public/arch-x86/cpufeatureset.h and generates a
single include/asm-x86/cpuid-autogen.h containing all the processed
information.

It currently generates just FEATURESET_NR_ENTRIES.  Future changes will
generate more information.
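
To make the size calculation concrete, the following Python 3 sketch derives the number of 32-bit words from the highest bit index and packs a set of bit indices into uint32_t literals. It is adapted from the script in this series (which targets Python 2); the nr_entries() helper name and the small sample dictionary are assumptions, although the bit positions used are the ones quoted in cpufeatureset.h.

```python
def nr_entries(bit_indices):
    """Number of 32-bit words needed to hold the highest bit index."""
    return (max(bit_indices) >> 5) + 1


def featureset_to_uint32s(fs, nr):
    """Pack bit indices into nr C-style uint32_t literals (Python 3 port)."""
    bitmap = 0
    for f in fs:
        bitmap |= 1 << f

    words = []
    while bitmap:
        words.append(bitmap & 0xFFFFFFFF)   # peel off the low 32 bits
        bitmap >>= 32

    assert len(words) <= nr
    words.extend([0] * (nr - len(words)))   # pad with empty words
    return ["0x%08xU" % w for w in words]


# Sample subset; word*32+bit values as listed in cpufeatureset.h.
features = {
    "FPU": 0 * 32 + 0,
    "APIC": 0 * 32 + 9,
    "PBE": 0 * 32 + 31,
    "X2APIC": 1 * 32 + 21,
}
n = nr_entries(features.values())
words = featureset_to_uint32s(features.values(), n)
```

Because each feature is numbered as word*32+bit, shifting the largest index right by five yields the index of the last word in use.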

Signed-off-by: Andrew Cooper 
Acked-by: Jan Beulich 
---
v2:
 * New
v3:
 * Rebased over the new namespacing in cpufeatureset.h
v4:
 * Speeling fixes
---
 .gitignore   |   1 +
 xen/include/Makefile |  10 ++
 xen/include/asm-x86/cpufeature.h |   3 +-
 xen/tools/gen-cpuid.py   | 197 +++
 4 files changed, 210 insertions(+), 1 deletion(-)
 create mode 100755 xen/tools/gen-cpuid.py

diff --git a/.gitignore b/.gitignore
index 91f690c..b40453e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -252,6 +252,7 @@ xen/include/headers.chk
 xen/include/headers++.chk
 xen/include/asm
 xen/include/asm-*/asm-offsets.h
+xen/include/asm-x86/cpuid-autogen.h
 xen/include/compat/*
 xen/include/config/
 xen/include/generated/
diff --git a/xen/include/Makefile b/xen/include/Makefile
index 9c8188b..268bc9d 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -117,5 +117,15 @@ headers++.chk: $(PUBLIC_HEADERS) Makefile
 
 endif
 
+ifeq ($(XEN_TARGET_ARCH),x86_64)
+
+$(BASEDIR)/include/asm-x86/cpuid-autogen.h: $(BASEDIR)/include/public/arch-x86/cpufeatureset.h $(BASEDIR)/tools/gen-cpuid.py FORCE
+   $(PYTHON) $(BASEDIR)/tools/gen-cpuid.py -i $^ -o $@.new
+   $(call move-if-changed,$@.new,$@)
+
+all: $(BASEDIR)/include/asm-x86/cpuid-autogen.h
+endif
+
 clean::
rm -rf compat headers.chk headers++.chk
+   rm -f $(BASEDIR)/include/asm-x86/cpuid-autogen.h
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index a044616..bcda09b 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -11,8 +11,9 @@
 
 #include 
 #include 
+#include 
 
-#define FSCAPINTS 9
+#define FSCAPINTS FEATURESET_NR_ENTRIES
 #define NCAPINTS (FSCAPINTS + 1) /* N 32-bit words worth of info */
 
 /* Other features, Xen-defined mapping. */
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
new file mode 100755
index 0000000..c6bd98d
--- /dev/null
+++ b/xen/tools/gen-cpuid.py
@@ -0,0 +1,197 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+import sys, os, re
+
+class Fail(Exception):
+    pass
+
+class State(object):
+
+    def __init__(self, input, output):
+
+        self.source = input
+        self.input  = open_file_or_fd(input, "r", 2)
+        self.output = open_file_or_fd(output, "w", 2)
+
+        # State parsed from input
+        self.names = {} # Name => value mapping
+
+        # State calculated
+        self.nr_entries = 0 # Number of words in a featureset
+
+def parse_definitions(state):
+    """
+    Parse featureset information from @param f and mutate the global
+    namespace with symbols
+    """
+    feat_regex = re.compile(
+        r"^XEN_CPUFEATURE\(([A-Z0-9_]+),"
+        "\s+([\s\d]+\*[\s\d]+\+[\s\d]+)\).*$")
+
+    this = sys.modules[__name__]
+
+    for l in state.input.readlines():
+        # Short circuit the regex...
+        if not l.startswith("XEN_CPUFEATURE("):
+            continue
+
+        res = feat_regex.match(l)
+
+        if res is None:
+            raise Fail("Failed to interpret '%s'" % (l.strip(), ))
+
+        name = res.groups()[0]
+        val = eval(res.groups()[1]) # Regex confines this to a very simple expression
+
+        if hasattr(this, name):
+            raise Fail("Duplicate symbol %s" % (name,))
+
+        if val in state.names:
+            raise Fail("Aliased value between %s and %s" %
+                       (name, state.names[val]))
+
+        # Mutate the current namespace to insert a feature literal with its
+        # bit index.  Prepend an underscore if the name starts with a digit.
+        if name[0] in "0123456789":
+            this_name = "_" + name
+        else:
+            this_name = name
+        setattr(this, this_name, val)
+
+        # Construct a reverse mapping of value to name
+        state.names[val] = name
+
+    if len(state.names) == 0:
+        raise Fail("No features found")
+
+def featureset_to_uint32s(fs, nr):
+    """ Represent a featureset as a list of C-compatible uint32_t's """
+
+    bitmap = 0L
+    for f in fs:
+        bitmap |= 1L << f
+
+    words = []
+    while bitmap:
+        words.append(bitmap & ((1L << 32) - 1))
+        bitmap >>= 32
+
+    assert len(words) <= nr
+
+    if len(words) < nr:
+        words.extend([0] * (nr - len(words)))
+
+    return [ "0x%08xU" % x for x in words ]
+
+def format_uint32s(words, indent):
+    """ Format a list of uint32_t's suitable for a macro definition """
+    spaces = " " * indent
+    return spaces + (", \\\n" + spaces).join(words) + ", \\"
+
+
+def crunch_numbers(state):
+
+    # Size of bitmaps
+    state.nr_entries = nr_entries =

[Xen-devel] [PATCH v4 01/26] xen/public: Export cpu featureset information in the public API

2016-03-23 Thread Andrew Cooper

For the featureset to be a useful object, it needs a stable interpretation, a
property which is missing from the current hw_caps interface.

Additionally, introduce TSC_ADJUST, FDP_EXCP_ONLY, SHA, PREFETCHWT1, ITSC, EFRO
and CLZERO which will be used by later changes.

To maintain compilation, FSCAPINTS is currently hardcoded at 9.  Future
changes will change this to being dynamically generated.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Tim Deegan 

v2:
 * Rebase over upstream changes
 * Collect all feature introductions from later in the series
 * Restrict API to Xen and toolstack
v3:
 * Allow the constants to be in a namespace of the includer's choosing.
 * Add FDP_EXCP_ONLY
v4:
 * Magic blocks in new file.
 * Remove default ASM support.
 * Renumber the synthetic values from 0.
---
 xen/include/asm-x86/cpufeature.h| 152 ++-
 xen/include/asm-x86/cpufeatureset.h |  32 
 xen/include/public/arch-x86/cpufeatureset.h | 228 
 3 files changed, 273 insertions(+), 139 deletions(-)
 create mode 100644 xen/include/asm-x86/cpufeatureset.h
 create mode 100644 xen/include/public/arch-x86/cpufeatureset.h

diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 1bac562..a044616 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -10,148 +10,22 @@
 #endif
 
 #include 
+#include 
 
-#define NCAPINTS   9   /* N 32-bit words worth of info */
+#define FSCAPINTS 9
+#define NCAPINTS (FSCAPINTS + 1) /* N 32-bit words worth of info */
 
-/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
-#define X86_FEATURE_FPU       (0*32+ 0) /* Onboard FPU */
-#define X86_FEATURE_VME       (0*32+ 1) /* Virtual Mode Extensions */
-#define X86_FEATURE_DE        (0*32+ 2) /* Debugging Extensions */
-#define X86_FEATURE_PSE       (0*32+ 3) /* Page Size Extensions */
-#define X86_FEATURE_TSC       (0*32+ 4) /* Time Stamp Counter */
-#define X86_FEATURE_MSR       (0*32+ 5) /* Model-Specific Registers, RDMSR, WRMSR */
-#define X86_FEATURE_PAE       (0*32+ 6) /* Physical Address Extensions */
-#define X86_FEATURE_MCE       (0*32+ 7) /* Machine Check Architecture */
-#define X86_FEATURE_CX8       (0*32+ 8) /* CMPXCHG8 instruction */
-#define X86_FEATURE_APIC      (0*32+ 9) /* Onboard APIC */
-#define X86_FEATURE_SEP       (0*32+11) /* SYSENTER/SYSEXIT */
-#define X86_FEATURE_MTRR      (0*32+12) /* Memory Type Range Registers */
-#define X86_FEATURE_PGE       (0*32+13) /* Page Global Enable */
-#define X86_FEATURE_MCA       (0*32+14) /* Machine Check Architecture */
-#define X86_FEATURE_CMOV      (0*32+15) /* CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
-#define X86_FEATURE_PAT       (0*32+16) /* Page Attribute Table */
-#define X86_FEATURE_PSE36     (0*32+17) /* 36-bit PSEs */
-#define X86_FEATURE_CLFLUSH   (0*32+19) /* Supports the CLFLUSH instruction */
-#define X86_FEATURE_DS        (0*32+21) /* Debug Store */
-#define X86_FEATURE_ACPI      (0*32+22) /* ACPI via MSR */
-#define X86_FEATURE_MMX       (0*32+23) /* Multimedia Extensions */
-#define X86_FEATURE_FXSR      (0*32+24) /* FXSAVE and FXRSTOR instructions (fast save and restore */
-                                        /* of FPU context), and CR4.OSFXSR available */
-#define X86_FEATURE_SSE       (0*32+25) /* Streaming SIMD Extensions */
-#define X86_FEATURE_SSE2      (0*32+26) /* Streaming SIMD Extensions-2 */
-#define X86_FEATURE_HTT       (0*32+28) /* Hyper-Threading Technology */
-#define X86_FEATURE_TM1       (0*32+29) /* Thermal Monitor 1 */
-#define X86_FEATURE_PBE       (0*32+31) /* Pending Break Enable */
-
-/* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
-/* Don't duplicate feature flags which are redundant with Intel! */
-#define X86_FEATURE_SYSCALL   (1*32+11) /* SYSCALL/SYSRET */
-#define X86_FEATURE_NX        (1*32+20) /* Execute Disable */
-#define X86_FEATURE_MMXEXT    (1*32+22) /* AMD MMX extensions */
-#define X86_FEATURE_FFXSR     (1*32+25) /* FFXSR instruction optimizations */
-#define X86_FEATURE_PAGE1GB   (1*32+26) /* 1Gb large page support */
-#define X86_FEATURE_RDTSCP    (1*32+27) /* RDTSCP */
-#define X86_FEATURE_LM        (1*32+29) /* Long Mode (x86-64) */
-#define X86_FEATURE_3DNOWEXT  (1*32+30) /* AMD 3DNow! extensions */
-#define X86_FEATURE_3DNOW     (1*32+31) /* 3DNow! */
-
-/* Intel-defined CPU features, CPUID level 0x0000000D:1 (eax), word 2 */
-#define X86_FEATURE_XSAVEOPT  (2*32+ 0) /* XSAVEOPT instruction. */
-#define X86_FEATURE_XSAVEC    (2*32+ 1) /* XSAVEC/XRSTORC instructions. */
-#define X86_FEATURE_XGETBV1   (2*32+ 2) /* XGETBV with %ecx=1. */
-#define X86_FEATURE_XSAVES    (2*32+ 3) /* XSAVES/XRSTORS instructions.

[Xen-devel] [PATCH v4 07/26] xen/x86: Calculate maximum host and guest featuresets

2016-03-23 Thread Andrew Cooper

All of this information will be used by the toolstack to make informed
levelling decisions for VMs, and by Xen to sanity check toolstack-provided
information.

Signed-off-by: Andrew Cooper 
Reviewed-by: Konrad Rzeszutek Wilk 
---
CC: Jan Beulich 

v3:
 * Move as much as possible into .init.
 * Fix the handling of the shared bits for the cross-vendor case.
 * Fix extended check.
v4:
 * Fix copy error in calculate_hvm_featureset()
---
 xen/arch/x86/cpuid.c| 162 
 xen/arch/x86/setup.c|   3 +
 xen/include/asm-x86/cpuid.h |  17 +
 3 files changed, 182 insertions(+)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 77e008a..41439f8 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -1,14 +1,176 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 const uint32_t known_features[] = INIT_KNOWN_FEATURES;
 const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
 
+static const uint32_t __initconst pv_featuremask[] = INIT_PV_FEATURES;
+static const uint32_t __initconst hvm_shadow_featuremask[] = INIT_HVM_SHADOW_FEATURES;
+static const uint32_t __initconst hvm_hap_featuremask[] = INIT_HVM_HAP_FEATURES;
+
+uint32_t __read_mostly raw_featureset[FSCAPINTS];
+uint32_t __read_mostly pv_featureset[FSCAPINTS];
+uint32_t __read_mostly hvm_featureset[FSCAPINTS];
+
+static void __init sanitise_featureset(uint32_t *fs)
+{
+unsigned int i;
+
+for ( i = 0; i < FSCAPINTS; ++i )
+{
+/* Clamp to known mask. */
+fs[i] &= known_features[i];
+}
+
+/*
+ * Sort out shared bits.  We are constructing a featureset which needs to
+ * be applicable to a cross-vendor case.  Intel strictly clears the common
+ * bits in e1d, while AMD strictly duplicates them.
+ *
+ * We duplicate them here to be compatible with AMD while on Intel, and
+ * rely on logic closer to the guest to make the featureset stricter if
+ * emulating Intel.
+ */
+fs[FEATURESET_e1d] = ((fs[FEATURESET_1d]  &  CPUID_COMMON_1D_FEATURES) |
+  (fs[FEATURESET_e1d] & ~CPUID_COMMON_1D_FEATURES));
+}
+
+static void __init calculate_raw_featureset(void)
+{
+    unsigned int max, tmp;
+
+    max = cpuid_eax(0);
+
+    if ( max >= 1 )
+        cpuid(0x1, &tmp, &tmp,
+              &raw_featureset[FEATURESET_1c],
+              &raw_featureset[FEATURESET_1d]);
+    if ( max >= 7 )
+        cpuid_count(0x7, 0, &tmp,
+                    &raw_featureset[FEATURESET_7b0],
+                    &raw_featureset[FEATURESET_7c0],
+                    &tmp);
+    if ( max >= 0xd )
+        cpuid_count(0xd, 1,
+                    &raw_featureset[FEATURESET_Da1],
+                    &tmp, &tmp, &tmp);
+
+    max = cpuid_eax(0x80000000);
+    if ( (max >> 16) != 0x8000 )
+        return;
+
+    if ( max >= 0x80000001 )
+        cpuid(0x80000001, &tmp, &tmp,
+              &raw_featureset[FEATURESET_e1c],
+              &raw_featureset[FEATURESET_e1d]);
+    if ( max >= 0x80000007 )
+        cpuid(0x80000007, &tmp, &tmp, &tmp,
+              &raw_featureset[FEATURESET_e7d]);
+    if ( max >= 0x80000008 )
+        cpuid(0x80000008, &tmp,
+              &raw_featureset[FEATURESET_e8b],
+              &tmp, &tmp);
+}
+
+static void __init calculate_pv_featureset(void)
+{
+unsigned int i;
+
+for ( i = 0; i < FSCAPINTS; ++i )
+pv_featureset[i] = host_featureset[i] & pv_featuremask[i];
+
+/* Unconditionally claim to be able to set the hypervisor bit. */
+__set_bit(X86_FEATURE_HYPERVISOR, pv_featureset);
+
+/*
+ * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
+ * affect how to interpret topology information in other cpuid leaves.
+ */
+__set_bit(X86_FEATURE_HTT, pv_featureset);
+__set_bit(X86_FEATURE_X2APIC, pv_featureset);
+__set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
+
+sanitise_featureset(pv_featureset);
+}
+
+static void __init calculate_hvm_featureset(void)
+{
+unsigned int i;
+const uint32_t *hvm_featuremask;
+
+if ( !hvm_enabled )
+return;
+
+hvm_featuremask = hvm_funcs.hap_supported ?
+hvm_hap_featuremask : hvm_shadow_featuremask;
+
+for ( i = 0; i < FSCAPINTS; ++i )
+hvm_featureset[i] = host_featureset[i] & hvm_featuremask[i];
+
+/* Unconditionally claim to be able to set the hypervisor bit. */
+__set_bit(X86_FEATURE_HYPERVISOR, hvm_featureset);
+
+/*
+ * Allow the toolstack to set HTT, X2APIC and CMP_LEGACY.  These bits
+ * affect how to interpret topology information in other cpuid leaves.
+ */
+__set_bit(X86_FEATURE_HTT, hvm_featureset);
+__set_bit(X86_FEATURE_X2APIC, hvm_featureset);
+__set_bit(X86_FEATURE_CMP_LEGACY, hvm_featureset);
+
+/*
+ * Xen can provide an APIC emulation to HVM guests even if the host's APIC
+ * isn't enabled.
+ */
+__set_bit(X86_FEATURE_APIC, hvm_featureset);
+
+/*
+ * On AMD, PV guests are entirely

[Xen-devel] [PATCH v4 04/26] xen/x86: Mask out unknown features from Xen's capabilities

2016-03-23 Thread Andrew Cooper

If Xen doesn't know about a feature, it is unsafe for use and should be
deliberately hidden from Xen's capabilities.

This doesn't make a practical difference yet, but will make a difference
later when the guest featuresets are seeded from the host featureset.
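
The two masking passes added to identify_cpu() can be modelled as below. This Python sketch is illustrative only: the apply_masks() name and the sample values are assumptions. It does reflect one detail from the patch, namely that known_features covers FSCAPINTS words while the capability array has one extra Xen-defined word, which is only subject to cleared_caps.

```python
MASK32 = 0xFFFFFFFF


def apply_masks(caps, known, cleared):
    """Drop unknown feature bits, then drop explicitly cleared bits."""
    out = list(caps)
    for i in range(len(known)):          # FSCAPINTS words: clamp to known features
        out[i] &= known[i]
    for i in range(len(out)):            # NCAPINTS words: apply cleared caps
        out[i] &= ~cleared[i] & MASK32
    return out


caps = [0xFFFFFFFF, 0x0000000F]      # one featureset word plus one synthetic word
known = [0x0000FFFF]                 # assumed known-feature mask
cleared = [0x00000001, 0x00000008]   # assumed cleared capabilities
result = apply_masks(caps, known, cleared)
```

The synthetic word keeps every bit except the one named in the cleared mask, because no known-feature clamp applies to it.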

Signed-off-by: Andrew Cooper 
Acked-by: Jan Beulich 
Reviewed-by: Konrad Rzeszutek Wilk 
---
v2:
 * Reduced substantially from v1, by using the autogenerated information.
v3:
 * Drop redundant braces.
---
 xen/arch/x86/Makefile|  1 +
 xen/arch/x86/cpu/common.c|  2 ++
 xen/arch/x86/cpuid.c | 20 
 xen/include/asm-x86/cpufeature.h |  4 +---
 xen/include/asm-x86/cpuid.h  | 25 +
 xen/tools/gen-cpuid.py   | 24 
 6 files changed, 73 insertions(+), 3 deletions(-)
 create mode 100644 xen/arch/x86/cpuid.c
 create mode 100644 xen/include/asm-x86/cpuid.h

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 1bcb08b..729065b 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -12,6 +12,7 @@ obj-y += bitops.o
 obj-bin-y += bzimage.init.o
 obj-bin-y += clear_page.o
 obj-bin-y += copy_page.o
+obj-y += cpuid.o
 obj-y += compat.o x86_64/compat.o
 obj-$(CONFIG_KEXEC) += crash.o
 obj-y += debug.o
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 1a278b1..d302272 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -341,6 +341,8 @@ void identify_cpu(struct cpuinfo_x86 *c)
 * The vendor-specific functions might have changed features.  Now
 * we do "generic changes."
 */
+   for (i = 0; i < FSCAPINTS; ++i)
+   c->x86_capability[i] &= known_features[i];
 
for (i = 0 ; i < NCAPINTS ; ++i)
c->x86_capability[i] &= ~cleared_caps[i];
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
new file mode 100644
index 0000000..05cd646
--- /dev/null
+++ b/xen/arch/x86/cpuid.c
@@ -0,0 +1,20 @@
+#include 
+#include 
+#include 
+
+const uint32_t known_features[] = INIT_KNOWN_FEATURES;
+
+static void __init __maybe_unused build_assertions(void)
+{
+BUILD_BUG_ON(ARRAY_SIZE(known_features) != FSCAPINTS);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index bcda09b..e29b024 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -10,10 +10,8 @@
 #endif
 
 #include 
-#include 
-#include 
+#include 
 
-#define FSCAPINTS FEATURESET_NR_ENTRIES
 #define NCAPINTS (FSCAPINTS + 1) /* N 32-bit words worth of info */
 
 /* Other features, Xen-defined mapping. */
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
new file mode 100644
index 000..b72d88f
--- /dev/null
+++ b/xen/include/asm-x86/cpuid.h
@@ -0,0 +1,25 @@
+#ifndef __X86_CPUID_H__
+#define __X86_CPUID_H__
+
+#include 
+#include 
+
+#define FSCAPINTS FEATURESET_NR_ENTRIES
+
+#ifndef __ASSEMBLY__
+#include 
+
+extern const uint32_t known_features[FSCAPINTS];
+
+#endif /* __ASSEMBLY__ */
+#endif /* !__X86_CPUID_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index c6bd98d..44b4c98 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -19,6 +19,8 @@ class State(object):
 
 # State calculated
 self.nr_entries = 0 # Number of words in a featureset
+self.common_1d = 0 # Common features between 1d and e1d
+self.known = [] # All known features
 
 def parse_definitions(state):
 """
@@ -95,6 +97,22 @@ def crunch_numbers(state):
 # Size of bitmaps
 state.nr_entries = nr_entries = (max(state.names.keys()) >> 5) + 1
 
+# Features common between 1d and e1d.
+common_1d = (FPU, VME, DE, PSE, TSC, MSR, PAE, MCE, CX8, APIC,
+ MTRR, PGE, MCA, CMOV, PAT, PSE36, MMX, FXSR)
+
+# All known features.  Duplicate the common features in e1d
+e1d_base = SYSCALL & ~31
+state.known = featureset_to_uint32s(
+state.names.keys() + [ e1d_base + (x % 32) for x in common_1d ],
+nr_entries)
+
+# Fold common back into names
+for f in common_1d:
+state.names[e1d_base + (f % 32)] = "E1D_" + state.names[f]
+
+state.common_1d = featureset_to_uint32s(common_1d, 1)[0]
+
 
 def write_results(state):
 state.output.write(
@@ -109,7 +127,13 @@ def write_results(state):
 state.output.write(
 """
 #define FEATURESET_NR_ENTRIES %sU
+
+#define CPUID_COMMON_1D_FEATURES %s
+
+#define INIT_KNOWN_FEATURES { \\\n%s\n}
 """ % (state.nr_entries,
+   state.common_1d,
+   format_uint32s(state.known, 4),
))
 
 state.output.write(
-- 
2.1.4

[Xen-devel] [PATCH v4 03/26] xen/x86: Collect more cpuid feature leaves

2016-03-23 Thread Andrew Cooper

New words are:
 * 0x8007.edx - Contains Invariant TSC
 * 0x8008.ebx - Newly used for AMD Zen processors

In addition, replace some open-coded ITSC and EFRO manipulation.
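
As a rough illustration of what the replaced open-coded tests amount to, the Python sketch below models a featureset as a list of 32-bit words, where a feature number selects word `n >> 5` and bit `n & 31`. The helper names mirror `cpufeat_word()`, `cpufeat_mask()` and `cpu_has()` as used in the diff, but their bodies, the word index chosen for ITSC and the 9-word array are assumptions made for the example rather than something taken from the patch.

```python
# Sketch only: a featureset modelled as a list of 32-bit words.
# Bit 8 comes from the patch ("EDX bit 8"); word 7 is an assumed numbering.
ITSC = 7 * 32 + 8

def cpufeat_word(feature):
    """Index of the 32-bit word that holds this feature."""
    return feature >> 5

def cpufeat_mask(feature):
    """Single-bit mask of this feature within its word."""
    return 1 << (feature & 31)

def cpu_has(caps, feature):
    """True if the feature's bit is set in the capability words."""
    return bool(caps[cpufeat_word(feature)] & cpufeat_mask(feature))

caps = [0] * 9                       # hypothetical 9-word featureset
caps[cpufeat_word(ITSC)] = 1 << 8    # as if the leaf's EDX had bit 8 set
```

With the leaf stored once in the capability array, each `cpuid_edx(...) & (1<<8)` test becomes a lookup such as `cpu_has(caps, ITSC)`.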

Signed-off-by: Andrew Cooper 
Acked-by: Jan Beulich 
Reviewed-by: Konrad Rzeszutek Wilk 
---
v2:
 * Rely on ordering of generic_identify() to simplify init_amd()
 * Remove opencoded EFRO manipulation as well
---
 xen/arch/x86/cpu/amd.c| 21 +++--
 xen/arch/x86/cpu/common.c |  6 ++
 xen/arch/x86/cpu/intel.c  |  2 +-
 xen/arch/x86/domain.c |  2 +-
 4 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index a4bef21..47a38c6 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -294,21 +294,6 @@ int cpu_has_amd_erratum(const struct cpuinfo_x86 *cpu, int osvw_id, ...)
return 0;
 }
 
-/* Can this system suffer from TSC drift due to C1 clock ramping? */
-static int c1_ramping_may_cause_clock_drift(struct cpuinfo_x86 *c) 
-{ 
-   if (cpuid_edx(0x8007) & (1<<8)) {
-   /*
-* CPUID.AdvPowerMgmtInfo.TscInvariant
-* EDX bit 8, 8000_0007
-* Invariant TSC on 8th Gen or newer, use it
-* (assume all cores have invariant TSC)
-*/
-   return 0;
-   }
-   return 1;
-}
-
 /*
  * Disable C1-Clock ramping if enabled in PMM7.CpuLowPwrEnh on 8th-generation
  * cores only. Assume BIOS has setup all Northbridges equivalently.
@@ -475,7 +460,7 @@ static void init_amd(struct cpuinfo_x86 *c)
}
 
if (c->extended_cpuid_level >= 0x8007) {
-   if (cpuid_edx(0x8007) & (1<<8)) {
+   if (cpu_has(c, X86_FEATURE_ITSC)) {
__set_bit(X86_FEATURE_CONSTANT_TSC, c->x86_capability);
__set_bit(X86_FEATURE_NONSTOP_TSC, c->x86_capability);
if (c->x86 != 0x11)
@@ -600,14 +585,14 @@ static void init_amd(struct cpuinfo_x86 *c)
wrmsrl(MSR_K7_PERFCTR3, 0);
}
 
-   if (cpuid_edx(0x8007) & (1 << 10)) {
+   if (cpu_has(c, X86_FEATURE_EFRO)) {
rdmsr(MSR_K7_HWCR, l, h);
l |= (1 << 27); /* Enable read-only APERF/MPERF bit */
wrmsr(MSR_K7_HWCR, l, h);
}
 
/* Prevent TSC drift in non single-processor, single-core platforms. */
-   if ((smp_processor_id() == 1) && c1_ramping_may_cause_clock_drift(c))
+   if ((smp_processor_id() == 1) && !cpu_has(c, X86_FEATURE_ITSC))
disable_c1_ramping();
 
set_cpuidmask(c);
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 8b94c1b..1a278b1 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -269,6 +269,12 @@ static void generic_identify(struct cpuinfo_x86 *c)
 
if (c->extended_cpuid_level >= 0x8004)
get_model_name(c); /* Default name */
+   if (c->extended_cpuid_level >= 0x8007)
+   c->x86_capability[cpufeat_word(X86_FEATURE_ITSC)]
+   = cpuid_edx(0x8007);
+   if (c->extended_cpuid_level >= 0x8008)
+   c->x86_capability[cpufeat_word(X86_FEATURE_CLZERO)]
+   = cpuid_ebx(0x8008);
 
/* Intel-defined flags: level 0x0007 */
if ( c->cpuid_level >= 0x0007 )
diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
index d4f574b..bdf89f6 100644
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -281,7 +281,7 @@ static void init_intel(struct cpuinfo_x86 *c)
if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
(c->x86 == 0x6 && c->x86_model >= 0x0e))
__set_bit(X86_FEATURE_CONSTANT_TSC, c->x86_capability);
-   if (cpuid_edx(0x8007) & (1u<<8)) {
+   if (cpu_has(c, X86_FEATURE_ITSC)) {
__set_bit(X86_FEATURE_CONSTANT_TSC, c->x86_capability);
__set_bit(X86_FEATURE_NONSTOP_TSC, c->x86_capability);
__set_bit(X86_FEATURE_TSC_RELIABLE, c->x86_capability);
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a33f975..6ec7554 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2614,7 +2614,7 @@ void domain_cpuid(
  */
 if ( (input == 0x8007) && /* Advanced Power Management */
  !d->disable_migrate && !d->arch.vtsc )
-*edx &= ~(1u<<8); /* TSC Invariant */
+*edx &= ~cpufeat_mask(X86_FEATURE_ITSC);
 
 return;
 }
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v4 08/26] xen/x86: Generate deep dependencies of features

2016-03-23 Thread Andrew Cooper

Some features depend on other features.  Working out and maintaining the exact
dependency tree is complicated, so it is expressed in the automatic generation
script.

At runtime, Xen needs to disable all features which are dependent on a
feature being disabled.  Because of the flattening performed at compile time,
runtime can use a single mask to disable all eventual features.
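
The flattening idea can be sketched in Python as follows. This is a minimal model, not the code in gen-cpuid.py: the features A to D, their bit numbers and the helper names `flatten()` and `sanitise()` are invented for the example, and a single Python integer stands in for the array of 32-bit words.

```python
# Sketch only: flatten direct dependencies into one mask per feature.
# Feature names and bit numbers are invented for the example.
A, B, C, D = 0, 1, 2, 3

# direct[f] lists the features that depend directly on f.
direct = {A: [B], B: [C, D]}

def flatten(direct):
    """Map each feature to a bitmask of all its transitive dependents."""
    deep = {}
    for feat in direct:
        seen = set()
        todo = list(direct[feat])
        while todo:
            f = todo.pop()
            if f not in seen:
                seen.add(f)
                todo.extend(direct.get(f, []))
        deep[feat] = sum(1 << f for f in seen)
    return deep

def sanitise(fs, deep):
    """Clear every feature whose direct or indirect prerequisite is missing."""
    for feat, mask in deep.items():
        if not fs & (1 << feat):
            fs &= ~mask
    return fs

deep = flatten(direct)
```

Because each mask already contains the indirect dependents, one pass over the disabled features is enough; no repeated walking of the tree is needed at runtime.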

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 

v2:
 * New.
v3:
 * Vastly more research and comments.
v4:
 * Expand commit message.
 * More tweaks to the dependency tree.
 * Avoid for_each_set_bit() walking off the end of disabled_features[].
   Expanding disabled_features[] turns out to be far more simple than
   attempting to opencode for_each_set_bit()
---
 xen/arch/x86/cpuid.c|  56 +
 xen/include/asm-x86/cpuid.h |   2 +
 xen/tools/gen-cpuid.py  | 143 +++-
 3 files changed, 200 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 41439f8..e1e0e44 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -11,6 +11,7 @@ const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
 static const uint32_t __initconst pv_featuremask[] = INIT_PV_FEATURES;
 static const uint32_t __initconst hvm_shadow_featuremask[] = INIT_HVM_SHADOW_FEATURES;
 static const uint32_t __initconst hvm_hap_featuremask[] = INIT_HVM_HAP_FEATURES;
+static const uint32_t __initconst deep_features[] = INIT_DEEP_FEATURES;
 
 uint32_t __read_mostly raw_featureset[FSCAPINTS];
 uint32_t __read_mostly pv_featureset[FSCAPINTS];
@@ -18,12 +19,36 @@ uint32_t __read_mostly hvm_featureset[FSCAPINTS];
 
 static void __init sanitise_featureset(uint32_t *fs)
 {
+/* for_each_set_bit() uses unsigned longs.  Extend with zeroes. */
+uint32_t disabled_features[
+ROUNDUP(FSCAPINTS, sizeof(unsigned long)/sizeof(uint32_t))] = {};
 unsigned int i;
 
 for ( i = 0; i < FSCAPINTS; ++i )
 {
 /* Clamp to known mask. */
 fs[i] &= known_features[i];
+
+/*
+ * Identify which features with deep dependencies have been
+ * disabled.
+ */
+disabled_features[i] = ~fs[i] & deep_features[i];
+}
+
+for_each_set_bit(i, (void *)disabled_features,
+ sizeof(disabled_features) * 8)
+{
+const uint32_t *dfs = lookup_deep_deps(i);
+unsigned int j;
+
+ASSERT(dfs); /* deep_features[] should guarantee this. */
+
+for ( j = 0; j < FSCAPINTS; ++j )
+{
+fs[j] &= ~dfs[j];
+disabled_features[j] &= ~dfs[j];
+}
 }
 
 /*
@@ -164,6 +189,36 @@ void __init calculate_featuresets(void)
 calculate_hvm_featureset();
 }
 
+const uint32_t * __init lookup_deep_deps(uint32_t feature)
+{
+static const struct {
+uint32_t feature;
+uint32_t fs[FSCAPINTS];
+} deep_deps[] __initconst = INIT_DEEP_DEPS;
+unsigned int start = 0, end = ARRAY_SIZE(deep_deps);
+
+BUILD_BUG_ON(ARRAY_SIZE(deep_deps) != NR_DEEP_DEPS);
+
+/* Fast early exit. */
+if ( !test_bit(feature, deep_features) )
+return NULL;
+
+/* deep_deps[] is sorted.  Perform a binary search. */
+while ( start < end )
+{
+unsigned int mid = start + ((end - start) / 2);
+
+if ( deep_deps[mid].feature > feature )
+end = mid;
+else if ( deep_deps[mid].feature < feature )
+start = mid + 1;
+else
+return deep_deps[mid].fs;
+}
+
+return NULL;
+}
+
 static void __init __maybe_unused build_assertions(void)
 {
 BUILD_BUG_ON(ARRAY_SIZE(known_features) != FSCAPINTS);
@@ -171,6 +226,7 @@ static void __init __maybe_unused build_assertions(void)
 BUILD_BUG_ON(ARRAY_SIZE(pv_featuremask) != FSCAPINTS);
 BUILD_BUG_ON(ARRAY_SIZE(hvm_shadow_featuremask) != FSCAPINTS);
 BUILD_BUG_ON(ARRAY_SIZE(hvm_hap_featuremask) != FSCAPINTS);
+BUILD_BUG_ON(ARRAY_SIZE(deep_features) != FSCAPINTS);
 }
 
 /*
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 5041bcd..4725672 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -29,6 +29,8 @@ extern uint32_t hvm_featureset[FSCAPINTS];
 
 void calculate_featuresets(void);
 
+const uint32_t *lookup_deep_deps(uint32_t feature);
+
 #endif /* __ASSEMBLY__ */
 #endif /* !__X86_CPUID_H__ */
 
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index 4fd603d..1cec5d8 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -144,6 +144,131 @@ def crunch_numbers(state):
 state.hvm_shadow = featureset_to_uint32s(state.raw_hvm_shadow, nr_entries)
 state.hvm_hap = featureset_to_uint32s(state.raw_hvm_hap, nr_entries)
 
+#
+# Feature dependency information.
+#
+# !!! WARNING !!!
+#
+# A lot of this information is derived from the written text of vendors
+# software

[Xen-devel] [PATCH v4 00/26] x86: Improvements to cpuid handling for guests

2016-03-23 Thread Andrew Cooper

This series is available in git form at:
  http://xenbits.xen.org/git-http/people/andrewcoop/xen.git levelling-v4

There are no major changes from v3.  There were minor adjustments to the
feature dependency tree, OSXSAVE/OSPKE handling for PV guests and collection
of Acks/Reviews.

Most patches do now have Acks/Reviews.  The remaining patches are #1 (Rest),
#6-8,11-13,18 (x86), #20 (ARM), 26 (Toolstack).

The current cpuid code, both in the hypervisor and toolstack, has grown
organically for a very long time, and is flawed in many ways.  This series
focuses specifically on fixing the bits pertaining to the visible
features, and I will be fixing other areas in future work (e.g. per-core,
per-package values, auditing of incoming migration values, etc.)

These changes alter the workflow of cpuid handling as follows:

Xen boots and evaluates its current capabilities.  It uses this information to
calculate the maximum featuresets it can provide to guests, and provides this
information for toolstack consumption.  A toolstack may then calculate a safe
set of features (taking into account migratability), and set a guest's cpuid
policy.  Xen then takes care of context switching the levelling state.
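
One plausible way for a toolstack to derive such a safe set is to intersect the maximum featuresets reported by every host a guest might migrate to. The Python sketch below assumes exactly that; the series does not prescribe this algorithm, and the function name `level()` and the host values are invented for illustration.

```python
# Sketch only: featuresets are lists of 32-bit words, one list per host.
def level(featuresets):
    """AND the featuresets word by word, keeping features common to all hosts."""
    common = list(featuresets[0])
    for fs in featuresets[1:]:
        common = [a & b for a, b in zip(common, fs)]
    return common

host_a = [0b1111, 0b0101]   # invented values
host_b = [0b1011, 0b0111]
policy = level([host_a, host_b])
```

A real toolstack would presumably also have to respect feature dependencies and any user-requested overrides before handing the result to Xen as the guest's policy.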

In particular, this means that PV guests may have different levels while
running on the same host, an option which was not previously available.

Andrew Cooper (26):
  xen/public: Export cpu featureset information in the public API
  xen/x86: Script to automatically process featureset information
  xen/x86: Collect more cpuid feature leaves
  xen/x86: Mask out unknown features from Xen's capabilities
  xen/x86: Annotate special features
  xen/x86: Annotate VM applicability in featureset
  xen/x86: Calculate maximum host and guest featuresets
  xen/x86: Generate deep dependencies of features
  xen/x86: Clear dependent features when clearing a cpu cap
  xen/x86: Improve disabling of features which have dependencies
  xen/x86: Improvements to in-hypervisor cpuid sanity checks
  x86/cpu: Move set_cpumask() calls into c_early_init()
  x86/cpu: Sysctl and common infrastructure for levelling context
switching
  x86/cpu: Rework AMD masking MSR setup
  x86/cpu: Rework Intel masking/faulting setup
  x86/cpu: Context switch cpuid masks and faulting state in
context_switch()
  x86/pv: Provide custom cpumasks for PV domains
  x86/domctl: Update PV domain cpumasks when setting cpuid policy
  xen+tools: Export maximum host and guest cpu featuresets via SYSCTL
  tools/libxc: Modify bitmap operations to take void pointers
  tools/libxc: Use public/featureset.h for cpuid policy generation
  tools/libxc: Expose the automatically generated cpu featuremask
information
  tools: Utility for dealing with featuresets
  tools/libxc: Wire a featureset through to cpuid policy logic
  tools/libxc: Use featuresets rather than guesswork
  tools/libxc: Calculate xstate cpuid leaf from guest information

 .gitignore  |   2 +
 tools/libxc/Makefile|   9 +
 tools/libxc/include/xenctrl.h   |  22 +-
 tools/libxc/xc_bitops.h |  37 +-
 tools/libxc/xc_cpufeature.h | 151 ---
 tools/libxc/xc_cpuid_x86.c  | 621 +---
 tools/libxl/libxl_cpuid.c   |   2 +-
 tools/misc/Makefile |   4 +
 tools/misc/xen-cpuid.c  | 394 ++
 tools/ocaml/libs/xc/xenctrl.ml  |   3 +
 tools/ocaml/libs/xc/xenctrl.mli |   4 +
 tools/ocaml/libs/xc/xenctrl_stubs.c |  37 +-
 tools/python/xen/lowlevel/xc/xc.c   |   2 +-
 xen/arch/x86/Makefile   |   1 +
 xen/arch/x86/apic.c |   2 +-
 xen/arch/x86/cpu/amd.c  | 308 --
 xen/arch/x86/cpu/common.c   |  49 ++-
 xen/arch/x86/cpu/intel.c| 263 +++-
 xen/arch/x86/cpuid.c| 240 +++
 xen/arch/x86/crash.c|   3 +
 xen/arch/x86/domain.c   |  20 +-
 xen/arch/x86/domctl.c   | 138 +++
 xen/arch/x86/hvm/hvm.c  | 125 --
 xen/arch/x86/setup.c|   3 +
 xen/arch/x86/sysctl.c   |  57 +++
 xen/arch/x86/traps.c| 209 ++
 xen/arch/x86/xstate.c   |   6 +-
 xen/include/Makefile|  10 +
 xen/include/asm-x86/cpufeature.h| 153 +--
 xen/include/asm-x86/cpufeatureset.h |  32 ++
 xen/include/asm-x86/cpuid.h |  77 
 xen/include/asm-x86/domain.h|   2 +
 xen/include/asm-x86/processor.h |   2 +-
 xen/include/public/arch-x86/cpufeatureset.h | 245 +++
 xen/include/public/sysctl.h |  50 +++
 xen/tools/gen-cpuid.py  | 405 ++
 36

Re: [Xen-devel] Interested to participate in Outreachy Program

2016-03-23 Thread sabiya kazi

Hi Doug,
Can you have a look at the patch and let me know if everything
is correct? I think things are good.

I would also like to have a word with you to decide on a timeline for the
project. In the meantime, I have started reading about the Rust language.


Regards,
-Sabiya
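
For readers skimming the diff that follows, its `getEscapeChar()` helper parses caret notation: `^]` means the control character obtained by XOR-ing the upper-cased letter with 0x40. A minimal Python sketch of the same idea, assuming a non-empty argument (the name `parse_escape` is hypothetical, and unlike the C version a lone `^` is returned unchanged):

```python
def parse_escape(spec):
    """Turn '^]' style caret notation into a control character, else the first char."""
    if spec.startswith("^") and len(spec) > 1:
        # XOR with 0x40 maps ']' (0x5d) to 0x1d and 'A' (0x41) to 0x01.
        return chr(ord(spec[1].upper()) ^ 0x40)
    return spec[0]
```

For example, `parse_escape("^]")` yields the character 0x1d, which matches the existing `ESCAPE_CHARACTER` define in main.c.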

diff --git a/tools/Makefile b/tools/Makefile
index 3f45fb9..1c2fb79 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -1,4 +1,4 @@
-XEN_ROOT = $(CURDIR)/..
+	XEN_ROOT = $(CURDIR)/..
 include $(XEN_ROOT)/tools/Rules.mk
 
 SUBDIRS-y :=
diff --git a/tools/console/client/main.c b/tools/console/client/main.c
index d006fdc..199432c 100644
--- a/tools/console/client/main.c
+++ b/tools/console/client/main.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifdef __sun__
 #include 
 #endif
@@ -45,10 +46,12 @@
 
 #define ESCAPE_CHARACTER 0x1d
 
+# define CONTROL(c) ((c) ^ 0x40)
+
 static volatile sig_atomic_t received_signal = 0;
 static char lockfile[sizeof (XEN_LOCK_DIR "/xenconsole.") + 8] = { 0 };
 static int lockfd = -1;
-
+static char escapechar = ESCAPE_CHARACTER;
 static void sighandler(int signum)
 {
 	received_signal = 1;
@@ -214,7 +217,7 @@ static int console_loop(int fd, struct xs_handle *xs, char *pty_path,
 			char msg[60];
 
 			len = read(STDIN_FILENO, msg, sizeof(msg));
-			if (len == 1 && msg[0] == ESCAPE_CHARACTER) {
+			if (len == 1 && msg[0] == escapechar) {
 return 0;
 			} 
 
@@ -318,6 +321,14 @@ static void console_unlock(void)
 	}
 }
 
+char getEscapeChar(const char *s)
+{
+if (*s == '^')
+return CONTROL(toupper(s[1]));
+
+return *s;
+}
+
 int main(int argc, char **argv)
 {
 	struct termios attr;
@@ -329,6 +340,7 @@ int main(int argc, char **argv)
 	struct option lopt[] = {
 		{ "type", 1, 0, 't' },
 		{ "num", 1, 0, 'n' },
+		{ "escapechar", 1, 0, 'e' },
 		{ "help",0, 0, 'h' },
 		{ 0 },
 
@@ -363,6 +375,11 @@ int main(int argc, char **argv)
 exit(EINVAL);
 			}
 			break;
+		case 'e' :
+		escapechar = getEscapeChar(optarg);
+break;
+
+
 		default:
 			fprintf(stderr, "Invalid argument\n");
 			fprintf(stderr, "Try `%s --help' for more information.\n", 
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 4cdc169..86ee670 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1715,14 +1715,16 @@ static void domain_destroy_domid_cb(libxl__egc *egc,
 }
 
 int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int cons_num,
-   libxl_console_type type)
+   libxl_console_type type, char escapechar)
 {
+
+
 GC_INIT(ctx);
 char *p = GCSPRINTF("%s/xenconsole", libxl__private_bindir_path());
 char *domid_s = GCSPRINTF("%d", domid);
 char *cons_num_s = GCSPRINTF("%d", cons_num);
 char *cons_type_s;
-
+char *cons_escape_char = GCSPRINTF("%c", escapechar); 
 switch (type) {
 case LIBXL_CONSOLE_TYPE_PV:
 cons_type_s = "pv";
@@ -1734,13 +1736,17 @@ int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int cons_num,
 goto out;
 }
 
-execl(p, p, domid_s, "--num", cons_num_s, "--type", cons_type_s, (void *)NULL);
+   if(cons_escape_char == NULL)
+execl(p, p, domid_s, "--num", cons_num_s, "--type", cons_type_s,(void *)NULL);
+   else
+execl(p, p, domid_s, "--num", cons_num_s, "--type", cons_type_s, "--escapechar", cons_escape_char, (void *)NULL);
 
 out:
 GC_FREE;
 return ERROR_FAIL;
 }
 
+
 int libxl_console_get_tty(libxl_ctx *ctx, uint32_t domid, int cons_num,
   libxl_console_type type, char **path)
 {
@@ -1823,7 +1829,7 @@ out:
 return rc;
 }
 
-int libxl_primary_console_exec(libxl_ctx *ctx, uint32_t domid_vm)
+int libxl_primary_console_exec(libxl_ctx *ctx, uint32_t domid_vm, char escapechar)
 {
 uint32_t domid;
 int cons_num;
@@ -1832,7 +1838,7 @@ int libxl_primary_console_exec(libxl_ctx *ctx, uint32_t domid_vm)
 
     rc = libxl__primary_console_find(ctx, domid_vm, &domid, &cons_num, &type);
 if ( rc ) return rc;
-return libxl_console_exec(ctx, domid, cons_num, type);
+return libxl_console_exec(ctx, domid, cons_num, type, escapechar);
 }
 
 int libxl_primary_console_get_tty(libxl_ctx *ctx, uint32_t domid_vm,
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index f9e3ef5..4ac8cfd 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -218,6 +218,12 @@
 #define LIBXL_HAVE_SOFT_RESET 1
 
 /*
+ If the user does not specify an escape character, the default
+ escape character is ^] (0x1d).
+ */
+
+#define CTRL_CLOSE_BRACKET '\x1d'
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
@@ -1317,15 +1323,26 @@ int libxl_wait_for_free_memory(libxl_ctx *ctx, uint32_t domid, uint32_t memory_k
 int libxl_wait_for_memory_target(libxl_ctx *ctx, uint32_t domid, int wait_secs);
 
 int libxl_vncviewer_exec(libxl_ctx *ctx, uint32_t domid, int autopass);
-int libxl_console_exec(libxl_ctx *ctx, uint32_t domid, int cons_num, libxl_console_type type);
+
+

Re: [Xen-devel] [PATCH] blkif.h: document scsi/0x12/0x node

2016-03-23 Thread Paul Durrant

> -Original Message-
[snip]
> > > >
> > > > For this part, there is ioctl() interface for all block device.
> > > > Looking at virtio-blk in KVM world, it can accept almost all SCSI
> commands
> > > also in ioctl() even they already have virtio-scsi.
> > > > But that's another story.
> > > >
> > >
> > > So this means that you would then need to add a bunch of new request
> > > types
> > > to the PV block protocol in order to make use of this new exported
> > > information?
> > >
> >
> > No, why do you think that? The info is in xenstore so why does the blkif
> protocol need to be involved at all?
> 
> Sorry, I'm just trying to figure out how is this going to be used.
> 
> Isn't this information going to have some impact on how the user uses
> the block device? If not, why are we exporting it then?
> 

I assume that the user will want to get this information from blkfront via 
ioctl (as Bob suggests), but blkfront can just pull it straight from xenstore. 
No need for any communication with blkback.
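
A minimal sketch of that lookup, in Python for brevity: the frontend builds the node name from the VPD page code, reads the string and base64-decodes it. The dictionary standing in for xenstore, the function name `read_vpd_page()`, the two-digit hex formatting of the page code and the sample payload are all assumptions made for the example.

```python
import base64

def read_vpd_page(xenstore, page_code):
    """Return the decoded VPD bytes for a page code, or None if the node is absent."""
    # Assumed key layout: "scsi/0x12/0x" followed by the page code in hex.
    key = "scsi/0x12/0x%02X" % page_code
    value = xenstore.get(key)
    if value is None:
        return None   # the frontend would report a failure to its OS
    return base64.b64decode(value)

# Stand-in for the frontend's view of xenstore; contents invented.
fake_store = {
    "scsi/0x12/0x80": base64.b64encode(b"\x00\x80\x00\x04SN01").decode(),
}
```

Nothing in this path touches the ring shared with blkback; it is a plain read of a node the backend wrote earlier.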

  Paul

> Roger.

Re: [Xen-devel] [PATCH] blkif.h: document scsi/0x12/0x node

2016-03-23 Thread Roger Pau Monné

On Wed, 23 Mar 2016, Paul Durrant wrote:
> > -Original Message-
> [snip]
> > > > >
> > > > > For this part, there is ioctl() interface for all block device.
> > > > > Looking at virtio-blk in KVM world, it can accept almost all SCSI
> > commands
> > > > also in ioctl() even they already have virtio-scsi.
> > > > > But that's another story.
> > > > >
> > > >
> > > > So this means that you would then need to add a bunch of new request
> > > > types
> > > > to the PV block protocol in order to make use of this new exported
> > > > information?
> > > >
> > >
> > > No, why do you think that? The info is in xenstore so why does the blkif
> > protocol need to be involved at all?
> > 
> > Sorry, I'm just trying to figure out how is this going to be used.
> > 
> > Isn't this information going to have some impact on how the user uses
> > the block device? If not, why are we exporting it then?
> > 
> 
> I assume that the user will want to get this information from blkfront via 
> ioctl (as Bob suggests), but blkfront can just pull it straight from 
> xenstore. No need for any communication with blkback.

Yes, I understand this. What I want to know is what impact this 
information is going to have on how the user uses the PV block device.

Is the information found in this magic page going to be used to try to 
send requests of a specific size in order to take advantage of some hw 
features?

And then my other question is, in order to take advantage of this 
information, will we need to add new PV block request types that 
encapsulate SCSI commands?

TBH, ATM I don't see why this is useful at all.

Roger.


Re: [Xen-devel] [PATCH] blkif.h: document scsi/0x12/0x node

2016-03-23 Thread Roger Pau Monné

On Wed, 23 Mar 2016, Paul Durrant wrote:
> > -Original Message-
> > From: Roger Pau Monné [mailto:roger@citrix.com]
> > Sent: 23 March 2016 14:54
> > To: Bob Liu
> > Cc: Roger Pau Monne; xen-devel@lists.xen.org; Paul Durrant; Ian Jackson;
> > konrad.w...@oracle.com; jgr...@suse.com; annie...@oracle.com; David
> > Vrabel
> > Subject: Re: [PATCH] blkif.h: document scsi/0x12/0x node
> > 
> > On Wed, 23 Mar 2016, Bob Liu wrote:
> > > On 03/23/2016 08:33 PM, Roger Pau Monné wrote:
> > > > On Wed, 23 Mar 2016, Bob Liu wrote:
> > > >
> > > >> This patch documents a xenstore node which is used by XENVBD
> > Windows PV
> > > >> driver.
> > > >>
> > > >> The use case is that XenServer may have OEM specific storage backends
> > and
> > > >> there is requirement to run OEM software in guest which relied on VPD
> > > >> information supplied by the storages.
> > > >> Adding a node to xenstore is the easiest way to get this VPD 
> > > >> information
> > from
> > > >> the backend into guest where XENVBD Windows PV driver can get
> > INQUIRY VPD data
> > > >> from this node and return to OEM software.
> > > >>
> > > >> Signed-off-by: Bob Liu 
> > > >> ---
> > > >>  xen/include/public/io/blkif.h |   24 
> > > >>  1 file changed, 24 insertions(+)
> > > >>
> > > >> diff --git a/xen/include/public/io/blkif.h 
> > > >> b/xen/include/public/io/blkif.h
> > > >> index 99f0326..afbcbff 100644
> > > >> --- a/xen/include/public/io/blkif.h
> > > >> +++ b/xen/include/public/io/blkif.h
> > > >> @@ -182,6 +182,30 @@
> > > >>   *  backend driver paired with a LIFO queue in the frontend will
> > > >>   *  allow us to have better performance in this scenario.
> > > >>   *
> > > >> + * scsi/0x12/0x
> > > >> + *Values: base64 encoded string
> > > >> + *
> > > >> + *This optional node contains SCSI INQUIRY VPD information.
> > > >> + * is the hexadecimal representation of the VPD page
> > code.
> > > >> + *Currently only XENVBD Windows PV driver is using this node.
> > > >> + *
> > > >> + *A frontend e.g XENVBD Windows PV driver which represents
> > a Xen VBD to
> > > >> + *its containing operating system as a (virtual) SCSI target may
> > return the
> > > >> + *specified data in response to INQUIRY commands from its
> > containing OS.
> > > >> + *
> > > >> + *A frontend which supports this feature must return the
> > backend-specified
> > > >> + *data for every INQUIRY command with the EVPD bit set.
> > > >> + *For EVPD=1 INQUIRY commands where the corresponding
> > xenstore node
> > > >> + *does not exist, the frontend must report (to its containing
> > OS) an
> > > >> + *appropriate failure condition.
> > > >> + *
> > > >> + *A frontend which does not support this feature just disregard
> > these
> > > >> + *xenstore nodes.
> > > >> + *
> > > >> + *The data of this string node is base64 encoded. Base64 is a
> > group of
> > > >> + *similar binary-to-text encoding schemes that represent
> > binary data in an
> > > >> + *ASCII string format by translating it into a radix-64
> > representation.
> > > >> + *
> > > >
> > > > I'm sorry, but I need to raise similar concerns as the ones expressed by
> > > > other people.
> > > >
> > > > I understand that those pages that you plan to export to the guest
> > contain
> > > > some kind of hardware specific information, but how is the guest going 
> > > > to
> > > > make use of this?
> > > >
> > > > It can only interact with a Xen virtual block device, and there you can
> > > > only send read, write, flush and discard requests. Even the block size 
> > > > is
> > > > hardcoded to 512b by the protocol, so I'm not sure how are you going to
> > > > use this information.
> > > >
> > >
> > > For this part, there is ioctl() interface for all block device.
> > > Looking at virtio-blk in KVM world, it can accept almost all SCSI commands
> > also in ioctl() even they already have virtio-scsi.
> > > But that's another story.
> > >
> > 
> > So this means that you would then need to add a bunch of new request
> > types
> > to the PV block protocol in order to make use of this new exported
> > information?
> >
> 
> No, why do you think that? The info is in xenstore so why does the blkif 
> protocol need to be involved at all?

Sorry, I'm just trying to figure out how is this going to be used.

Isn't this information going to have some impact on how the user uses 
the block device? If not, why are we exporting it then?

Roger.

Re: [Xen-devel] [PATCH 01/16] xen: sched: fix locking when allocating an RTDS pCPU

2016-03-23 Thread George Dunlap

On 18/03/16 19:04, Dario Faggioli wrote:
> as doing that include changing the scheduler lock
> mapping for the pCPU itself, and the correct way
> of doing that is:
>  - take the lock that the pCPU is using right now
>(which may be the lock of another scheduler);
>  - change the mapping of the lock to the RTDS one;
>  - release the lock (the one that has actually been
>taken!)
> 
> Signed-off-by: Dario Faggioli 

Reviewed-by: George Dunlap 
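
The three steps quoted above can be modelled in a few lines of Python. This is only a sketch of the ordering: `threading.Lock` stands in for a spinlock, the dictionary stands in for the per-CPU `schedule_lock` pointer, and interrupt handling and any re-checking done by the real lock helpers are left out. All names here are hypothetical.

```python
import threading

# Stand-ins: per-CPU lock pointers and the scheduler-wide lock.
schedule_lock = {0: threading.Lock()}   # cpu -> lock currently guarding it
rtds_lock = threading.Lock()            # hypothetical RTDS global lock

def switch_to_rtds(cpu):
    """Remap cpu's scheduler lock, releasing the lock that was actually taken."""
    old_lock = schedule_lock[cpu]
    old_lock.acquire()                  # take the lock the pCPU uses right now
    schedule_lock[cpu] = rtds_lock      # change the mapping
    old_lock.release()                  # release old_lock, not schedule_lock[cpu]
    return old_lock

old = switch_to_rtds(0)
```

Releasing through the remapped pointer instead would try to unlock a lock that was never taken, which is the mistake the patch comment warns about.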

> ---
> Cc: Meng Xu 
> Cc: George Dunlap 
> Cc: Tianyang Chen 
> ---
>  xen/common/sched_rt.c |9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index c896a6f..d98bfb6 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -653,11 +653,16 @@ static void *
>  rt_alloc_pdata(const struct scheduler *ops, int cpu)
>  {
>  struct rt_private *prv = rt_priv(ops);
> +spinlock_t *old_lock;
>  unsigned long flags;
>  
> -spin_lock_irqsave(&prv->lock, flags);
> +/* Move the scheduler lock to our global runqueue lock.  */
> +old_lock = pcpu_schedule_lock_irqsave(cpu, &flags);
> +
>  per_cpu(schedule_data, cpu).schedule_lock = &prv->lock;
> -spin_unlock_irqrestore(&prv->lock, flags);
> +
> +/* _Not_ pcpu_schedule_unlock(): per_cpu().schedule_lock changed! */
> +spin_unlock_irqrestore(old_lock, flags);
>  
>  if ( !alloc_cpumask_var(&_cpumask_scratch[cpu]) )
>  return NULL;
> 



[Xen-devel] Outreachy bite-sized tasks

2016-03-23 Thread Roger Pau Monné

Hello,

First of all, thanks for your interest in the Xen Project, and for wanting 
to participate in Outreachy.

Both of you have expressed interest in the "QEMU xen-blkback performance 
analysis and improvements" Outreachy project, and AFAIK both of you still 
need to perform your initial contribution.

I've found a couple of small tasks that you can perform and should allow 
you to complete your initial contribution to the project:

 - The first one is related to xenalyze, and it consists of creating a 
   header file that can be shared by both the Xen kernel and xenalyze.
   You will need to move the TRC_ defines found in sched_credit.c and 
   sched_credit2.c to a header that's shared with xenalyze and then 
   replace the usage of TRC_SCHED_CLASS_EVT with the defines in the header 
   file [0].

 - The second one consists of fixing the return codes of certain xl 
   commands. There are commands in xl that will return 0 (SUCCESS) even 
   when failing, which makes it very hard to write scripts that make use 
   of xl. A list of those commands can be found in [1], together with some 
   preliminary patches. Please note that those patches have comments that 
   you will need to address, and that you will also need to preserve the 
   original authorship of the patches plus yours.
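
To see why the second task matters for scripting, consider a wrapper that decides whether to continue based purely on a command's exit status. The Python sketch below uses small Python subprocesses as stand-ins for a succeeding and a failing command; they are not real xl invocations, and `run_step()` is a hypothetical helper.

```python
import subprocess
import sys

def run_step(argv):
    """Run a command and return True only if it exited with status 0."""
    return subprocess.run(argv).returncode == 0

# Stand-ins for a succeeding and a failing command.
ok_cmd = [sys.executable, "-c", "raise SystemExit(0)"]
bad_cmd = [sys.executable, "-c", "raise SystemExit(1)"]
```

If a failing command returned 0, `run_step()` would report success and the script would carry on as if nothing had gone wrong.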

I encourage you to read the wiki page about sending patches to xen-devel 
[2], it should guide you through your first steps on using git and 
creating suitable patches.

Also, please note that this is an open source project, so you will need to 
coordinate in order to figure out which task each of you is going to take, in 
order to avoid clashes or duplication of efforts.

If you have further questions, either reply to this thread (keeping 
the xen-devel mailing list on the Cc), or feel free to start another one 
if you think it's more suited.

Roger.

[0] http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg02624.html
[1] http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02246.html
[2] http://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches


[Xen-devel] [ovmf test] 87014: regressions - FAIL

2016-03-23 Thread osstest service owner

flight 87014 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/87014/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 65543
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 65543

version targeted for testing:
 ovmf 6a9bc80154dd8771bc6f76b9b6c7579753e86e50
baseline version:
 ovmf 5ac96e3a28dd26eabee421919f67fa7c443a47f1

Last test of basis    65543  2015-12-08 08:45:15 Z  106 days
Failing since         65593  2015-12-08 23:44:51 Z  105 days  117 attempts
Testing same since    87014  2016-03-23 07:35:56 Z    0 days    1 attempts


People who touched revisions under test:
  "Samer El-Haj-Mahmoud" 
  "Wu, Hao A" 
  "Yao, Jiewen" 
  Alcantara, Paulo 
  Anbazhagan Baraneedharan 
  Andrew Fish 
  Ard Biesheuvel 
  Arthur Crippa Burigo 
  Cecil Sheng 
  Chao Zhang 
  Chao Zhang
  Charles Duffy 
  Cinnamon Shia 
  Cohen, Eugene 
  Dandan Bi 
  Daocheng Bu 
  Daryl McDaniel 
  David Woodhouse 
  Derek Lin 
  edk2 dev 
  edk2-devel 
  Eric Dong 
  Eric Dong 
  Eugene Cohen 
  Evan Lloyd 
  Feng Tian 
  Fu Siyuan 
  Gabriel Somlo 
  Gary Ching-Pang Lin 
  Gary Lin 
  Ghazi Belaam 
  Hao Wu 
  Haojian Zhuang 
  Hess Chen 
  Heyi Guo 
  Jaben Carsey 
  Jeff Fan 
  Jiaxin Wu 
  jiewen yao 
  Jim Dailey 
  jim_dai...@dell.com 
  Jordan Justen 
  Karyne Mayer 
  Larry Hauch 
  Laszlo Ersek 
  Leahy, Leroy P
  Leahy, Leroy P 
  Lee Leahy 
  Leekha Shaveta 
  Leif Lindholm 
  Liming Gao 
  Mark Rutland 
  Marvin Haeuser 
  Michael Kinney 
  Michael LeMay 
  Michael Thomas 
  Michał Zegan 
  Ni, Ruiyu 
  Paolo Bonzini 
  Paulo Alcantara 
  Paulo Alcantara Cavalcanti 
  Peter Kirmeier 
  Qin Long 
  Qiu Shumin 
  Rodrigo Dias Correa 
  Ruiyu Ni 
  Ryan Harkin 
  Samer El-Haj-Mahmoud 
  Samer El-Haj-Mahmoud 
  Star Zeng 
  Supreeth Venkatesh 
  Tapan Shah 
  Tian, Feng 
  Vladislav Vovchenko 
  Yao Jiewen 
  Yao, Jiewen 
  Ye Ting 
  Yonghong Zhu 
  Zhang Lubo 
  Zhang, Chao B 
  Zhang, Lubo 
  Zhangfei Gao 

jobs:
 build-amd64-xsm                         pass
 build-i386-xsm                          pass
 build-amd64                             pass
 build-i386                              pass
 build-amd64-libvirt                     pass
 build-i386-libvirt                      pass
 build-amd64-pvops                       pass
 build-i386-pvops                        pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64    fail
 test-amd64-i386-xl-qemuu-ovmf-amd64     fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at

Re: [Xen-devel] [PATCH v2 3/6] x86/mtrr: Fix Xorg crashes in Qemu sessions

2016-03-23 Thread Toshi Kani

On Wed, 2016-03-23 at 09:44 +0100, Borislav Petkov wrote:
> On Tue, Mar 22, 2016 at 03:53:30PM -0600, Toshi Kani wrote:
> > Yes. I had to remove this number since checkpatch complained that I
> > needed to quote the whole patch tile again.  I will ignore this
> > checkpatch error and add this commit number here.
> 
> Actually, checkpatch is right. We do quote the commit IDs *together*
> with their names so that the reader knows which commit the text is
> talking about.

OK, I will use [1] to refer to this patch.  This patch is fully quoted at the
top of this changelog, and it'd be verbose to repeat this full quote every
time I refer to it...

Thanks,
-Toshi


Re: [Xen-devel] [PATCH] blkif.h: document scsi/0x12/0x node

2016-03-23 Thread Paul Durrant

> -Original Message-
> From: Roger Pau Monné [mailto:roger@citrix.com]
> Sent: 23 March 2016 14:54
> To: Bob Liu
> Cc: Roger Pau Monne; xen-devel@lists.xen.org; Paul Durrant; Ian Jackson;
> konrad.w...@oracle.com; jgr...@suse.com; annie...@oracle.com; David
> Vrabel
> Subject: Re: [PATCH] blkif.h: document scsi/0x12/0x node
> 
> On Wed, 23 Mar 2016, Bob Liu wrote:
> > On 03/23/2016 08:33 PM, Roger Pau Monné wrote:
> > > On Wed, 23 Mar 2016, Bob Liu wrote:
> > >
> > >> This patch documents a xenstore node which is used by XENVBD
> Windows PV
> > >> driver.
> > >>
> > >> The use case is that XenServer may have OEM specific storage backends
> and
> > >> there is requirement to run OEM software in guest which relied on VPD
> > >> information supplied by the storages.
> > >> Adding a node to xenstore is the easiest way to get this VPD information
> from
> > >> the backend into guest where XENVBD Windows PV driver can get
> INQUIRY VPD data
> > >> from this node and return to OEM software.
> > >>
> > >> Signed-off-by: Bob Liu 
> > >> ---
> > >>  xen/include/public/io/blkif.h |   24 
> > >>  1 file changed, 24 insertions(+)
> > >>
> > >> diff --git a/xen/include/public/io/blkif.h 
> > >> b/xen/include/public/io/blkif.h
> > >> index 99f0326..afbcbff 100644
> > >> --- a/xen/include/public/io/blkif.h
> > >> +++ b/xen/include/public/io/blkif.h
> > >> @@ -182,6 +182,30 @@
> > >>   *  backend driver paired with a LIFO queue in the frontend will
> > >>   *  allow us to have better performance in this scenario.
> > >>   *
> > >> + * scsi/0x12/0x
> > >> + *  Values: base64 encoded string
> > >> + *
> > >> + *  This optional node contains SCSI INQUIRY VPD information.
> > >> + *   is the hexadecimal representation of the VPD page
> code.
> > >> + *  Currently only XENVBD Windows PV driver is using this node.
> > >> + *
> > >> + *  A frontend e.g XENVBD Windows PV driver which represents
> a Xen VBD to
> > >> + *  its containing operating system as a (virtual) SCSI target may
> return the
> > >> + *  specified data in response to INQUIRY commands from its
> containing OS.
> > >> + *
> > >> + *  A frontend which supports this feature must return the
> backend-specified
> > >> + *  data for every INQUIRY command with the EVPD bit set.
> > >> + *  For EVPD=1 INQUIRY commands where the corresponding
> xenstore node
> > >> + *  does not exist, the frontend must report (to its containing
> OS) an
> > >> + *  appropriate failure condition.
> > >> + *
> > >> + *  A frontend which does not support this feature just disregard
> these
> > >> + *  xenstore nodes.
> > >> + *
> > >> + *  The data of this string node is base64 encoded. Base64 is a
> group of
> > >> + *  similar binary-to-text encoding schemes that represent
> binary data in an
> > >> + *  ASCII string format by translating it into a radix-64
> representation.
> > >> + *
> > >
> > > I'm sorry, but I need to raise similar concerns as the ones expressed by
> > > other people.
> > >
> > > I understand that those pages that you plan to export to the guest
> contain
> > > some kind of hardware specific information, but how is the guest going to
> > > make use of this?
> > >
> > > It can only interact with a Xen virtual block device, and there you can
> > > only send read, write, flush and discard requests. Even the block size is
> > > hardcoded to 512b by the protocol, so I'm not sure how are you going to
> > > use this information.
> > >
> >
> > For this part, there is ioctl() interface for all block device.
> > Looking at virtio-blk in KVM world, it can accept almost all SCSI commands
> also in ioctl() even they already have virtio-scsi.
> > But that's another story.
> >
> 
> So this means that you would then need to add a bunch of new request
> types
> to the PV block protocol in order to make use of this new exported
> information?
>

No, why do you think that? The info is in xenstore so why does the blkif 
protocol need to be involved at all?

  Paul
 
> What are those commands going to be? How are they going to be passed to
> the underlying storage?
> 
> Roger.

Re: [Xen-devel] [PATCH v2 2/6] x86/mm/pat: Add pat_disable() interface

2016-03-23 Thread Toshi Kani

On Wed, 2016-03-23 at 09:51 +0100, Borislav Petkov wrote:
> On Tue, Mar 22, 2016 at 03:40:45PM -0600, Toshi Kani wrote:
> > Will change to "Prevent the OS from initializing the PAT MSR".
> > 
> > I wanted to clarify that "disable" does not mean to disable PAT MSR.
> 
> How do you "disable PAT MSR" ?

We can't, but I thought not everyone knows how it works...

> I think you're overdocumenting this. pat_disable() is as clear as day
> what it does. It doesn't need any commenting...

Right, maybe I am just paranoid.  I will remove the comment as you
suggested.

> > I've run checkpatch.pl and thought it was OK to have this warning
> > (instead of a >80 warning) since the error message part was not split.
> >  The "attempting" part is for debugging and its string is passed from
> > the caller. 
> 
> We always put the quoted strings on a single line for easier grepping.
> Forget the 80-cols rule.

OK.

Thanks,
-Toshi


Re: [Xen-devel] [PATCH v2 1/6] x86/mm/pat: Change PAT to support non-default PAT MSR

2016-03-23 Thread Toshi Kani

On Wed, 2016-03-23 at 09:43 +0100, Borislav Petkov wrote:
> On Tue, Mar 22, 2016 at 12:35:19PM -0600, Toshi Kani wrote:
> > Right.  Will change to "Add support of non-default PAT MSR setting at
> > handoff".
> 
> Please remove this "handoff" notion from the text. Every hw register is
> being handed off to the OS once the kernel takes over so there's no need
> to make it special here.

Will do.

> > I'd like to make it clear that this function does not set PAT MSR,
> > unlike what pat_init() does.  When CPU supports PAT, it keeps PAT MSR
> > in whatever the setting at handoff, and initializes PAT table to match
> > with this setting.
> > 
> > I am open to a better name, but I am afraid that setup_pat() can be
> > confusing as if it sets PAT MSR.
> 
> So call it init_cache_modes() and rename the current
> pat_init_cache_modes() to __init_cache_modes() to denote it is a lower
> level helper of the init_cache_modes() one. The init_cache_modes() one
> deals with the higher level figuring out of whether PAT is enabled and
> if not, preparing the attr bits for emulation. In the end, it calls
> __init_cache_modes(). All nice and easy.

Good idea!  Will do.

Thanks,
-Toshi


Re: [Xen-devel] [PATCH] blkif.h: document scsi/0x12/0x node

2016-03-23 Thread Roger Pau Monné

On Wed, 23 Mar 2016, Bob Liu wrote:
> On 03/23/2016 08:33 PM, Roger Pau Monné wrote:
> > On Wed, 23 Mar 2016, Bob Liu wrote:
> > 
> >> This patch documents a xenstore node which is used by XENVBD Windows PV
> >> driver.
> >>
> >> The use case is that XenServer may have OEM specific storage backends and
> >> there is requirement to run OEM software in guest which relied on VPD
> >> information supplied by the storages.
> >> Adding a node to xenstore is the easiest way to get this VPD information 
> >> from
> >> the backend into guest where XENVBD Windows PV driver can get INQUIRY VPD 
> >> data
> >> from this node and return to OEM software.
> >>
> >> Signed-off-by: Bob Liu 
> >> ---
> >>  xen/include/public/io/blkif.h |   24 
> >>  1 file changed, 24 insertions(+)
> >>
> >> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
> >> index 99f0326..afbcbff 100644
> >> --- a/xen/include/public/io/blkif.h
> >> +++ b/xen/include/public/io/blkif.h
> >> @@ -182,6 +182,30 @@
> >>   *  backend driver paired with a LIFO queue in the frontend will
> >>   *  allow us to have better performance in this scenario.
> >>   *
> >> + * scsi/0x12/0x
> >> + *Values: base64 encoded string
> >> + *
> >> + *This optional node contains SCSI INQUIRY VPD information.
> >> + * is the hexadecimal representation of the VPD page code.
> >> + *Currently only XENVBD Windows PV driver is using this node.
> >> + *
> >> + *A frontend e.g XENVBD Windows PV driver which represents a Xen 
> >> VBD to
> >> + *its containing operating system as a (virtual) SCSI target may 
> >> return the
> >> + *specified data in response to INQUIRY commands from its 
> >> containing OS.
> >> + *
> >> + *A frontend which supports this feature must return the 
> >> backend-specified
> >> + *data for every INQUIRY command with the EVPD bit set.
> >> + *For EVPD=1 INQUIRY commands where the corresponding xenstore 
> >> node
> >> + *does not exist, the frontend must report (to its containing OS) 
> >> an
> >> + *appropriate failure condition.
> >> + *
> >> + *A frontend which does not support this feature just disregard 
> >> these
> >> + *xenstore nodes.
> >> + *
> >> + *The data of this string node is base64 encoded. Base64 is a 
> >> group of
> >> + *similar binary-to-text encoding schemes that represent binary 
> >> data in an
> >> + *ASCII string format by translating it into a radix-64 
> >> representation.
> >> + *
> > 
> > I'm sorry, but I need to raise similar concerns as the ones expressed by 
> > other people.
> > 
> > I understand that those pages that you plan to export to the guest contain 
> > some kind of hardware specific information, but how is the guest going to 
> > make use of this?
> > 
> > It can only interact with a Xen virtual block device, and there you can 
> > only send read, write, flush and discard requests. Even the block size is 
> > hardcoded to 512b by the protocol, so I'm not sure how are you going to 
> > use this information.
> > 
> 
> For this part, there is ioctl() interface for all block device.
> Looking at virtio-blk in KVM world, it can accept almost all SCSI commands 
> also in ioctl() even they already have virtio-scsi.
> But that's another story.
> 

So this means that you would then need to add a bunch of new request types 
to the PV block protocol in order to make use of this new exported 
information?

What are those commands going to be? How are they going to be passed to 
the underlying storage?

Roger.

[Xen-devel] [xen-unstable-smoke test] 87036: tolerable all pass - PUSHED

2016-03-23 Thread osstest service owner

flight 87036 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/87036/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt     12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          12 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          13 saverestore-support-check    fail   never pass

version targeted for testing:
 xen  0fa4c7f25dd0756fcc96de0824deb7763a7fa739
baseline version:
 xen  829e03ca0ef757350546df8546a6575ca3d0e8da

Last test of basis    86586  2016-03-18 16:03:06 Z    4 days
Testing same since    87036  2016-03-23 12:02:18 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Daniel De Graaf 
  Doug Goldstein 
  Ian Campbell 
  Jan Beulich 
  Julien Grall 
  Konrad Rzeszutek Wilk 
  Olaf Hering 
  Wei Liu 

jobs:
 build-amd64                                 pass
 build-armhf                                 pass
 build-amd64-libvirt                         pass
 build-amd64-pvops                           pass
 build-armhf-pvops                           pass
 test-armhf-armhf-xl                         pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386    pass
 test-amd64-amd64-libvirt                    pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-smoke
+ revision=0fa4c7f25dd0756fcc96de0824deb7763a7fa739
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 
0fa4c7f25dd0756fcc96de0824deb7763a7fa739
+ branch=xen-unstable-smoke
+ revision=0fa4c7f25dd0756fcc96de0824deb7763a7fa739
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-smoke
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-smoke
+ prevxenbranch=xen-4.6-testing
+ '[' x0fa4c7f25dd0756fcc96de0824deb7763a7fa739 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local

Re: [Xen-devel] [PATCH] blkif.h: document scsi/0x12/0x node

2016-03-23 Thread Ian Jackson

Paul Durrant writes ("RE: [PATCH] blkif.h: document scsi/0x12/0x node"):
> > From: Bob Liu [mailto:bob@oracle.com]
> > Sent: 23 March 2016 11:48
> > To: xen-devel@lists.xen.org
> > Cc: Paul Durrant; Ian Jackson; konrad.w...@oracle.com; jgr...@suse.com;
> > Roger Pau Monne; annie...@oracle.com; David Vrabel; Bob Liu
> > Subject: [PATCH] blkif.h: document scsi/0x12/0x node

Nacked-by: Ian Jackson 

I'm sorry to say that the need for this protocol extension, and its
proper semantics, have yet to be established.

I don't think it was appropriate to repost the patch in this form
while the conversation on those topics is ongoing.  It would have been
appropriate if the patch contained answers, or at least forward
movement, in that conversation.  But I regret to say that this version
of this patch does not do that.

> > + * The data of this string node is base64 encoded. Base64 is a group of
> > + * similar binary-to-text encoding schemes that represent binary data in
> > an
> > + * ASCII string format by translating it into a radix-64 representation.
> 
> Do we need to explain what base64 is?

This seems to be a response to my complaint

  I would like the base64 encoding to be specified much more explicitly.
  Just `base64 formatted' is too vague.

By that I meant that the specific exact base64 format (of which there
are of course several dialects each with various variants) has to be
specified, probably by a reference to a suitable external
specification.

I did not mean that a general dictionary definition of base64 should
be cut-and-pasted from Wikipedia into this specification document.

Thanks,
Ian.
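
As an illustration of what pinning down the format could look like, the following C sketch implements one specific dialect: the standard alphabet from RFC 4648 section 4, with `=` padding and no line breaks. Choosing this dialect is an assumption made for the example only; the thread has not settled on any particular specification, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* RFC 4648 section 4 alphabet: 'A'-'Z', 'a'-'z', '0'-'9', '+', '/'. */
static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Encode len bytes from src as padded base64 without line breaks.
 * Returns a NUL-terminated string that the caller must free(), or NULL
 * if the allocation fails.
 */
char *base64_encode_rfc4648(const unsigned char *src, size_t len)
{
    size_t out_len = 4 * ((len + 2) / 3);
    size_t i, o = 0;
    char *out;

    assert(src != NULL || len == 0);

    out = malloc(out_len + 1);
    if (!out)
        return NULL;

    /* Full 3-byte groups become 4 output characters each. */
    for (i = 0; i + 2 < len; i += 3) {
        out[o++] = b64_alphabet[src[i] >> 2];
        out[o++] = b64_alphabet[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
        out[o++] = b64_alphabet[((src[i + 1] & 0x0f) << 2) | (src[i + 2] >> 6)];
        out[o++] = b64_alphabet[src[i + 2] & 0x3f];
    }

    /* A trailing group of 1 or 2 bytes is padded with '=' to 4 characters. */
    if (len - i == 1) {
        out[o++] = b64_alphabet[src[i] >> 2];
        out[o++] = b64_alphabet[(src[i] & 0x03) << 4];
        out[o++] = '=';
        out[o++] = '=';
    } else if (len - i == 2) {
        out[o++] = b64_alphabet[src[i] >> 2];
        out[o++] = b64_alphabet[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
        out[o++] = b64_alphabet[(src[i + 1] & 0x0f) << 2];
        out[o++] = '=';
    }

    out[o] = '\0';
    return out;
}
```

Other dialects differ in exactly the details fixed here (alphabet, padding, line wrapping), which is why a reference to one external specification removes the ambiguity.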


Re: [Xen-devel] [PATCH 2/5] x86/time: implement tsc as clocksource

2016-03-23 Thread Jan Beulich

>>> On 23.03.16 at 13:05,  wrote:
> On 03/23/2016 07:28 AM, Jan Beulich wrote:
>> Sure - it seems quite obvious that all boot time available CPUs
>> should be checked.
> Cool, so I will go with moving init_xen_time right after all CPUs are up but
> before initcalls are invoked.

I think your other alternative seemed more reasonable; I wouldn't
be very happy to see init_xen_time() moved around.

Jan



Re: [Xen-devel] [PATCH v4 12/34] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op

2016-03-23 Thread Jan Beulich

>>> On 15.03.16 at 18:56,  wrote:
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -168,4 +168,15 @@ config SCHED_DEFAULT
>  
>  endmenu
>  
> +# Enable/Disable xsplice support
> +config XSPLICE
> + bool "xSplice live patching support"
> + default y

Isn't it a little early in the series to default this to on?

And then of course the EXPERT question comes up again. No
matter that IanC is no longer around to help with the
argumentation, the point he has been making about too many
flavors ending up in the wild continues to apply.

> @@ -460,6 +461,12 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
> u_sysctl)
>  ret = tmem_control(&op->u.tmem_op);
>  break;
>  
> +case XEN_SYSCTL_xsplice_op:
> +ret = xsplice_op(&op->u.xsplice);
> +if ( ret != -ENOSYS )
> +copyback = 1;
> +break;

Why is ENOSYS special here, but not e.g. EOPNOTSUPP?

> +struct payload {
> +uint32_t state;  /* One of the XSPLICE_STATE_*. */
> +int32_t rc;  /* 0 or -XEN_EXX. */
> +struct list_head list;   /* Linked to 'payload_list'. */
> +char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */

Could I talk you into reducing XEN_XSPLICE_NAME_SIZE to 127,
to avoid needless padding in places like this one?

> +static int verify_name(const xen_xsplice_name_t *name)
> +{
> +if ( name->size == 0 || name->size > XEN_XSPLICE_NAME_SIZE )
> +return -EINVAL;
> +
> +if ( name->pad[0] || name->pad[1] || name->pad[2] )

I'd like to ask for consistency here: Either always use == 0 / != 0,
or always omit the latter and use ! in place of the former.

> +static int verify_payload(const xen_sysctl_xsplice_upload_t *upload)
> +{
> +if ( verify_name(&upload->name) )
> +return -EINVAL;
> +
> +if ( upload->size == 0 )
> +return -EINVAL;
> +
> +if ( !guest_handle_okay(upload->payload, upload->size) )

Careful here - upload->size is uint64_t, yet array_access_ok() makes
assumptions on not too large a size getting passed. I.e. I think you
want to apply an upper bound to the size right here - for example, it
can't reasonably be bigger than XEN_VIRT_END - XEN_VIRT_START
if I remember correctly how you intend to place those payloads.
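
The upper bound suggested above could look roughly like the following C sketch. `PAYLOAD_MAX_SIZE` is a hypothetical constant standing in for `XEN_VIRT_END - XEN_VIRT_START`, and `payload_alloc_size` is an illustrative helper, not a function from the patch. The point is only that a guest-supplied `uint64_t` is range-checked before anything with narrower assumptions consumes it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap (64 MiB); the real limit would come from the
 * size of the region the payloads are placed in. */
#define PAYLOAD_MAX_SIZE ((uint64_t)64 << 20)

/*
 * Validate a guest-supplied 64-bit size and, only if it is within
 * bounds, narrow it to size_t for later allocation and copying.
 * Returns 0 on success or -EINVAL if the size is zero or too large.
 */
int payload_alloc_size(uint64_t guest_size, size_t *out)
{
    assert(out != NULL);

    if (guest_size == 0 || guest_size > PAYLOAD_MAX_SIZE)
        return -EINVAL;

    *out = (size_t)guest_size;   /* safe: bounded by PAYLOAD_MAX_SIZE */
    return 0;
}
```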

> +static int find_payload(const xen_xsplice_name_t *name, struct payload **f)

Perhaps neater to use the xen/err.h constructs here instead
of indirection?

> +{
> +struct payload *data;
> +XEN_GUEST_HANDLE_PARAM(char) str;
> +char n[XEN_XSPLICE_NAME_SIZE + 1] = { 0 };

This pointlessly zeroes the entire array. Just set str[name->size]
to zero after the copy-in.
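
A minimal sketch of the suggested alternative, in plain C with `memcpy` standing in for the guest copy: the local buffer is left uninitialised and only the byte just past the copied name is set to zero. `NAME_SIZE` and `copy_name` are hypothetical names used for illustration; `NAME_SIZE` stands in for `XEN_XSPLICE_NAME_SIZE`.

```c
#include <assert.h>
#include <string.h>

#define NAME_SIZE 128   /* hypothetical stand-in for XEN_XSPLICE_NAME_SIZE */

/*
 * Copy size bytes of a name into dst (which must hold NAME_SIZE + 1
 * bytes) and terminate it, without zeroing the whole buffer first.
 */
void copy_name(char *dst, const char *src, size_t size)
{
    assert(size <= NAME_SIZE);   /* the caller has already verified the size */

    memcpy(dst, src, size);
    dst[size] = '\0';            /* single store instead of a full memset */
}
```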

> +int rc = -EINVAL;

Pointless initializer.

> +rc = verify_name(name);
> +if ( rc )
> +return rc;
> +
> +str = guest_handle_cast(name->name, char);

Why do you need a cast here?

> +if ( copy_from_guest(n, str, name->size) )

You validated the address range already, so __copy_from_guest()
will be just fine and more efficient.

> +return -EFAULT;
> +
> +spin_lock_recursive(&payload_lock);

Why do you need a recursive lock here? I think something like this
should be reasoned about in the commit message.

> +/*
> + * We MUST be holding the payload_lock spinlock.
> + */

Single line comment (but kind of redundant with ...

> +static void free_payload(struct payload *data)
> +{
> +ASSERT(spin_is_locked(&payload_lock));

... this anyway).

> +static int xsplice_upload(xen_sysctl_xsplice_upload_t *upload)
> +{
> +struct payload *data = NULL;

Pointless initializer.

> +void *raw_data = NULL;
> +int rc;
> +
> +rc = verify_payload(upload);
> +if ( rc )
> +return rc;
> +
> +rc = find_payload(&upload->name, &data);
> +if ( rc == 0 /* Found. */ )
> +return -EEXIST;
> +
> +if ( rc != -ENOENT )
> +return rc;
> +
> +data = xzalloc(struct payload);
> +if ( !data )
> +return -ENOMEM;
> +
> +rc = -EFAULT;
> +if ( copy_from_guest(data->name, upload->name.name, upload->name.size) )

__copy_from_guest()

> +goto out;
> +
> +rc = -ENOMEM;
> +raw_data = vzalloc(upload->size);

vmalloc()

> +if ( !raw_data )
> +goto out;
> +
> +rc = -EFAULT;
> +if ( copy_from_guest(raw_data, upload->payload, upload->size) )

__copy_from_guest()

> +goto out;
> +
> +data->state = XSPLICE_STATE_CHECKED;
> +data->rc = 0;

This is redundant with the xzalloc() above.

> +INIT_LIST_HEAD(&data->list);
> +
> +spin_lock_recursive(&payload_lock);
> +list_add_tail(&data->list, &payload_list);
> +payload_cnt++;
> +payload_version++;
> +spin_unlock_recursive(&payload_lock);
> +
> + out:
> +vfree(raw_data);

By here you allocated and filled raw_data. And now you
unconditionally free it. What is that good for?

> +if ( rc )
> +{
> +xfree(data);
> +}

The use of braces here is inconsistent with all of the rest of this
function.

> +static int

Re: [Xen-devel] [PATCH] blkif.h: document scsi/0x12/0x node

2016-03-23 Thread Bob Liu


On 03/23/2016 08:33 PM, Roger Pau Monné wrote:
> On Wed, 23 Mar 2016, Bob Liu wrote:
> 
>> This patch documents a xenstore node which is used by XENVBD Windows PV
>> driver.
>>
>> The use case is that XenServer may have OEM specific storage backends and
>> there is requirement to run OEM software in guest which relied on VPD
>> information supplied by the storages.
>> Adding a node to xenstore is the easiest way to get this VPD information from
>> the backend into guest where XENVBD Windows PV driver can get INQUIRY VPD 
>> data
>> from this node and return to OEM software.
>>
>> Signed-off-by: Bob Liu 
>> ---
>>  xen/include/public/io/blkif.h |   24 
>>  1 file changed, 24 insertions(+)
>>
>> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
>> index 99f0326..afbcbff 100644
>> --- a/xen/include/public/io/blkif.h
>> +++ b/xen/include/public/io/blkif.h
>> @@ -182,6 +182,30 @@
>>   *  backend driver paired with a LIFO queue in the frontend will
>>   *  allow us to have better performance in this scenario.
>>   *
>> + * scsi/0x12/0x
>> + *  Values: base64 encoded string
>> + *
>> + *  This optional node contains SCSI INQUIRY VPD information.
>> + *   is the hexadecimal representation of the VPD page code.
>> + *  Currently only XENVBD Windows PV driver is using this node.
>> + *
>> + *  A frontend e.g XENVBD Windows PV driver which represents a Xen VBD to
>> + *  its containing operating system as a (virtual) SCSI target may return 
>> the
>> + *  specified data in response to INQUIRY commands from its containing OS.
>> + *
>> + *  A frontend which supports this feature must return the backend-specified
>> + *  data for every INQUIRY command with the EVPD bit set.
>> + *  For EVPD=1 INQUIRY commands where the corresponding xenstore node
>> + *  does not exist, the frontend must report (to its containing OS) an
>> + *  appropriate failure condition.
>> + *
>> + *  A frontend which does not support this feature just disregard these
>> + *  xenstore nodes.
>> + *
>> + *  The data of this string node is base64 encoded. Base64 is a group of
>> + *  similar binary-to-text encoding schemes that represent binary data in an
>> + *  ASCII string format by translating it into a radix-64 representation.
>> + *
> 
> I'm sorry, but I need to raise similar concerns as the ones expressed by 
> other people.
> 
> I understand that those pages that you plan to export to the guest contain 
> some kind of hardware specific information, but how is the guest going to 
> make use of this?
> 
> It can only interact with a Xen virtual block device, and there you can 
> only send read, write, flush and discard requests. Even the block size is 
> hardcoded to 512b by the protocol, so I'm not sure how are you going to 
> use this information.
> 

For this part, there is an ioctl() interface for all block devices.
Looking at virtio-blk in the KVM world, it can also accept almost all SCSI 
commands through ioctl(), even though they already have virtio-scsi.
But that's another story.

Thanks,
Bob

> Also, the fact that's implemented in some drivers in some OS isn't an 
> argument in order to have them added. FreeBSD had for a very long time a 
> set of custom extensions, that where never added to blkif.h simply because 
> they were broken and unneeded, so the solution was to remove them from the 
> implementation, and the same could happen here IMHO.
> 
> Roger.
> 


Re: [Xen-devel] [PATCH] tools: fix xen-detect to correctly identify domU type

2016-03-23 Thread Boris Ostrovsky


On 03/23/2016 07:33 AM, Juergen Gross wrote:

On 23/03/16 11:55, Andrew Cooper wrote:

On 23/03/16 10:52, Juergen Gross wrote:

On 23/03/16 11:32, David Vrabel wrote:

On 23/03/16 10:25, Jan Beulich wrote:

On 23.03.16 at 11:14,  wrote:

7. Report type according to features found (this is a little bit
ugly: we have to rely on the current hypervisor implementation
regarding the bits set for the different guest types).

Well, in some of the cases feature flags only make sense for one
kind of guest, so if such a flag is set it could be used as positive
indication (while it being clear may then still mean nothing).


Would it make sense to add another file to /sys/hypervisor/properties?
Something like guest_type, containing "pv", "hvm" or "pvh"? If existing
this could be used to report the guest type.

That would seem a good idea to me. What do others, namely
Linux maintainers, think?

What's the use case for user space knowing if it's in a PV or HVM domain?

The first thing coming to my mind would be diagnostic tools.

Having the admin able to tell for informational purposes is useful.
They can find out by looking at the top of `dmesg`, but a hypervisor
sysfs node is cleaner than requiring the admin to know every printk()
variant that Xen puts out.

Especially on a long running guest this information might be not
available in case of trouble.


What about dmidecode?

Unprivileged PV guests will return nothing:

[root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]# dmidecode
# dmidecode 2.11
# No SMBIOS nor DMI entry point found, sorry.
[root@dhcp-burlington7-2nd-B-east-10-152-55-140 ~]#


HVM guests will say something like:

System Information
Manufacturer: Xen
Product Name: HVM domU

and dom0 will report actual info.

-boris
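
The behaviour described above could be turned into a rough classification helper, sketched here in C. It assumes the caller has already obtained the DMI manufacturer and product strings by some other means (for example by parsing `dmidecode` output) and passes empty strings when no DMI entry point was found. The enum and function names are hypothetical; the string values are the ones quoted in this message.

```c
#include <assert.h>
#include <string.h>

enum guest_hint {
    GUEST_HINT_NO_DMI,   /* no SMBIOS/DMI data, as seen in an unprivileged PV guest */
    GUEST_HINT_HVM,      /* DMI reports manufacturer "Xen", product "HVM domU" */
    GUEST_HINT_OTHER     /* anything else, such as dom0 reporting the real hardware */
};

/*
 * Classify a system from its DMI strings. Both arguments must be
 * non-NULL; pass "" for both when no DMI information is available.
 */
enum guest_hint classify_from_dmi(const char *manufacturer, const char *product)
{
    assert(manufacturer != NULL && product != NULL);

    if (manufacturer[0] == '\0' && product[0] == '\0')
        return GUEST_HINT_NO_DMI;

    if (strcmp(manufacturer, "Xen") == 0 && strcmp(product, "HVM domU") == 0)
        return GUEST_HINT_HVM;

    return GUEST_HINT_OTHER;
}
```

This is only a heuristic: missing DMI data is consistent with a PV guest but does not prove it, which is part of why a dedicated sysfs node was proposed earlier in the thread.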


Re: [Xen-devel] [PATCH 3/3] xenalyze: handle RTDS scheduler events

2016-03-23 Thread George Dunlap

On 12/03/16 11:34, Dario Faggioli wrote:
> so the trace will show properly decoded info,
> rather than just a bunch of hex codes.
> 
> Signed-off-by: Dario Faggioli 
> Reviewed-by: Konrad Rzeszutek Wilk 

Acked-by: George Dunlap 

> ---
> Cc: George Dunlap 
> Cc: Meng Xu 
> Cc: Tianyang Chen 
> Cc: Ian Jackson 
> Cc: Wei Liu 
> ---
> Changes from v2:
>  * use 64 bits ints for time values (now that the scheduler
>does that too), as suggested during review.
> 
> Changes from v1:
>  * '} * r =' turned into '} *r =', as requested
>during review.
> ---
>  tools/xentrace/xenalyze.c |   52 
> +
>  1 file changed, 52 insertions(+)
> 
> diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
> index 353bed7..b949986 100644
> --- a/tools/xentrace/xenalyze.c
> +++ b/tools/xentrace/xenalyze.c
> @@ -7823,6 +7823,58 @@ void sched_process(struct pcpu_info *p)
> r->rq_avgload, r->b_avgload);
>  }
>  break;
> +/* RTDS (TRC_RTDS_xxx) */
> +case TRC_SCHED_CLASS_EVT(RTDS, 1): /* TICKLE   */
> +if(opt.dump_all) {
> +struct {
> +unsigned int cpu:16;
> +} *r = (typeof(r))ri->d;
> +
> +printf(" %s rtds:runq_tickle cpu %u\n",
> +   ri->dump_header, r->cpu);
> +}
> +break;
> +case TRC_SCHED_CLASS_EVT(RTDS, 2): /* RUNQ_PICK*/
> +if(opt.dump_all) {
> +struct {
> +unsigned int vcpuid:16, domid:16;
> +uint64_t cur_dl, cur_bg;
> +} __attribute__((packed)) *r = (typeof(r))ri->d;
> +
> +printf(" %s rtds:runq_pick d%uv%u, deadline = %"PRIu64", "
> +   "budget = %"PRIu64"\n", ri->dump_header,
> +   r->domid, r->vcpuid, r->cur_dl, r->cur_bg);
> +}
> +break;
> +case TRC_SCHED_CLASS_EVT(RTDS, 3): /* BUDGET_BURN  */
> +if(opt.dump_all) {
> +struct {
> +unsigned int vcpuid:16, domid:16;
> +uint64_t cur_bg;
> +int delta;
> +} __attribute__((packed)) *r = (typeof(r))ri->d;
> +
> +printf(" %s rtds:burn_budget d%uv%u, budget = %"PRIu64", "
> +   "delta = %d\n", ri->dump_header, r->domid,
> +   r->vcpuid, r->cur_bg, r->delta);
> +}
> +break;
> +case TRC_SCHED_CLASS_EVT(RTDS, 4): /* BUDGET_REPLENISH */
> +if(opt.dump_all) {
> +struct {
> +unsigned int vcpuid:16, domid:16;
> +uint64_t cur_dl, cur_bg;
> +} __attribute__((packed)) *r = (typeof(r))ri->d;
> +
> +printf(" %s rtds:repl_budget d%uv%u, deadline = %"PRIu64", "
> +   "budget = %"PRIu64"\n", ri->dump_header,
> +   r->domid, r->vcpuid, r->cur_dl, r->cur_bg);
> +}
> +break;
> +case TRC_SCHED_CLASS_EVT(RTDS, 5): /* SCHED_TASKLET*/
> +if(opt.dump_all)
> +printf(" %s rtds:sched_tasklet\n", ri->dump_header);
> +break;
>  default:
>  process_generic(ri);
>  }
> 

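For readers following the record layouts above: the patch overlays packed structs with bitfields on the raw trace words, which depends on the compiler's bitfield ordering. A minimal sketch of the same RUNQ_PICK decoding with explicit shifts is shown below. It assumes the record arrives as five 32-bit words, that vcpuid sits in the low 16 bits of the first word (the usual GCC layout on little-endian x86), and that each 64-bit value is stored low word first. The names rtds_runq_pick and decode_runq_pick are made up for this illustration and are not part of xenalyze.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of an RTDS RUNQ_PICK record. */
struct rtds_runq_pick {
    uint16_t vcpuid, domid;
    uint64_t cur_dl, cur_bg;
};

/* Join two 32-bit trace words into one 64-bit value (low word first). */
static uint64_t join64(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Decode five raw trace words without relying on bitfield layout.
 * Assumed layout: d[0] = domid:16 | vcpuid:16, d[1..2] = deadline,
 * d[3..4] = budget.
 */
static struct rtds_runq_pick decode_runq_pick(const uint32_t d[5])
{
    struct rtds_runq_pick r;

    assert(d != NULL);
    r.vcpuid = (uint16_t)(d[0] & 0xffff);
    r.domid  = (uint16_t)(d[0] >> 16);
    r.cur_dl = join64(d[1], d[2]);
    r.cur_bg = join64(d[3], d[4]);
    return r;
}
```

The same approach would carry over to BUDGET_BURN and BUDGET_REPLENISH, which differ only in the trailing fields.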


Re: [Xen-devel] [PATCH 1/3] xenalyze: handle DOM0 operaions events

2016-03-23 Thread George Dunlap

On 12/03/16 11:33, Dario Faggioli wrote:
> (i.e., domain creation and destruction) so the
> trace will show properly decoded info, rather
> than just a bunch of hex codes.
> ---

For some reason git won't apply your 'v2', complaining: 'corrupt patch
at line 14'.

But re the content (i.e., this patch with the SoB and the title fixed)

Acked-by: George Dunlap 

Sorry it took so long to get to this.


> Cc: George Dunlap 
> Cc: Ian Jackson 
> Cc: Wei Liu 
> Cc: Konrad Rzeszutek Wilk 
> ---
> Changes from v1:
>  * new patch in the series.
> ---
>  tools/xentrace/xenalyze.c |   26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
> index d4a5b0c..353bed7 100644
> --- a/tools/xentrace/xenalyze.c
> +++ b/tools/xentrace/xenalyze.c
> @@ -8388,6 +8388,30 @@ void hw_process(struct pcpu_info *p)
>  }
>  
>  }
> +
> +#define TRC_DOM0_SUB_DOMOPS 1
> +void dom0_process(struct pcpu_info *p)
> +{
> +struct record_info *ri = &p->ri;
> +
> +switch(ri->evt.sub)
> +{
> +case TRC_DOM0_SUB_DOMOPS:
> +if(opt.dump_all) {
> +struct {
> +unsigned int domid;
> +} *r = (typeof(r))ri->d;
> +
> +printf(" %s %s domain d%u\n", ri->dump_header,
> +   ri->event == TRC_DOM0_DOM_ADD ? "creating" : "destroying",
> +   r->domid);
> +}
> +break;
> +default:
> +process_generic(&p->ri);
> +}
> +}
> +
>  /*  Base - */
>  void dump_generic(FILE * f, struct record_info *ri)
>  {
> @@ -9224,6 +9248,8 @@ void process_record(struct pcpu_info *p) {
>  hw_process(p);
>  break;
>  case TRC_DOM0OP_MAIN:
> +dom0_process(p);
> +break;
>  default:
>  process_generic(ri);
>  }
> 



Re: [Xen-devel] [PATCH] blkif.h: document scsi/0x12/0x node

2016-03-23 Thread Roger Pau Monné

On Wed, 23 Mar 2016, Bob Liu wrote:

> This patch documents a xenstore node which is used by XENVBD Windows PV
> driver.
> 
> The use case is that XenServer may have OEM specific storage backends and
> there is requirement to run OEM software in guest which relied on VPD
> information supplied by the storages.
> Adding a node to xenstore is the easiest way to get this VPD information from
> the backend into guest where XENVBD Windows PV driver can get INQUIRY VPD data
> from this node and return to OEM software.
> 
> Signed-off-by: Bob Liu 
> ---
>  xen/include/public/io/blkif.h |   24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
> index 99f0326..afbcbff 100644
> --- a/xen/include/public/io/blkif.h
> +++ b/xen/include/public/io/blkif.h
> @@ -182,6 +182,30 @@
>   *  backend driver paired with a LIFO queue in the frontend will
>   *  allow us to have better performance in this scenario.
>   *
> + * scsi/0x12/0x
> + *   Values: base64 encoded string
> + *
> + *   This optional node contains SCSI INQUIRY VPD information.
> + *is the hexadecimal representation of the VPD page code.
> + *   Currently only XENVBD Windows PV driver is using this node.
> + *
> + *   A frontend e.g XENVBD Windows PV driver which represents a Xen VBD to
> + *   its containing operating system as a (virtual) SCSI target may return 
> the
> + *   specified data in response to INQUIRY commands from its containing OS.
> + *
> + *   A frontend which supports this feature must return the backend-specified
> + *   data for every INQUIRY command with the EVPD bit set.
> + *   For EVPD=1 INQUIRY commands where the corresponding xenstore node
> + *   does not exist, the frontend must report (to its containing OS) an
> + *   appropriate failure condition.
> + *
> + *   A frontend which does not support this feature just disregard these
> + *   xenstore nodes.
> + *
> + *   The data of this string node is base64 encoded. Base64 is a group of
> + *   similar binary-to-text encoding schemes that represent binary data in an
> + *   ASCII string format by translating it into a radix-64 representation.
> + *

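As an illustration of what a frontend would have to do with such a node, here is a minimal sketch of decoding a base64 string (standard alphabet, with '=' padding) into raw bytes. It is not taken from XENVBD or any Xen code; the function names b64_value and b64_decode are invented for this example, and reading the string from xenstore is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map one base64 character to its 6-bit value, or -1 if invalid. */
static int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/*
 * Decode a NUL-terminated, padded base64 string into out[].
 * Returns the number of bytes written, or -1 on malformed input
 * or if out_cap is too small.
 */
static long b64_decode(const char *in, uint8_t *out, size_t out_cap)
{
    size_t len, n = 0, i;
    uint32_t acc = 0;
    int bits = 0;

    assert(in != NULL && out != NULL);
    len = strlen(in);
    if (len % 4 != 0)
        return -1;

    for (i = 0; i < len; i++) {
        int v;

        if (in[i] == '=') {
            /* Padding may only occupy the last one or two positions. */
            if (i + 2 < len)
                return -1;
            if (i + 1 < len && in[i + 1] != '=')
                return -1;
            break;
        }
        v = b64_value(in[i]);
        if (v < 0)
            return -1;
        acc = (acc << 6) | (uint32_t)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (n >= out_cap)
                return -1;
            out[n++] = (uint8_t)(acc >> bits);
        }
    }
    return (long)n;
}
```

A frontend would call something like this on the node's value and hand the resulting bytes back as the INQUIRY response payload.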
I'm sorry, but I need to raise similar concerns as the ones expressed by 
other people.

I understand that those pages that you plan to export to the guest contain 
some kind of hardware specific information, but how is the guest going to 
make use of this?

It can only interact with a Xen virtual block device, and there you can 
only send read, write, flush and discard requests. Even the block size is 
hardcoded to 512b by the protocol, so I'm not sure how you are going to 
use this information.

Also, the fact that it's implemented in some drivers in some OS isn't an 
argument for adding them. FreeBSD had for a very long time a 
set of custom extensions that were never added to blkif.h simply because 
they were broken and unneeded, so the solution was to remove them from the 
implementation, and the same could happen here IMHO.

Roger.


[Xen-devel] [PATCH v3 0/5] libxl: add support for qemu base pvusb backend

2016-03-23 Thread Juergen Gross

This patch series is meant to be applied on top of Chunyan's series
to support pvusb in libxl.

It is adding support for an alternative pvusb backend "qusb" via qemu.

Changes in V3:
- added new patches 3 and 4 to at least report failure in case no device
  model is running when adding devices to a domain requiring a dm.

Changes in V2:
- patch 1: Return false if libxl__get_domid() fails as requested by
  George Dunlap
- Swapped patches 2 and 3 as former patch 2 has been questioned to make
  sense for 4.7. This will remove an obstacle for former patch 3 to go in.

Juergen Gross (5):
  libxl: make libxl__need_xenpv_qemu() operate on domain config
  libxl: add new pvusb backend "qusb" provided by qemu
  libxl: add service function to check whether device model is running
  libxl: check for dynamic device model start required
  libxl: add domain config parameter to force start of qemu

 docs/man/xl.cfg.pod.5|  17 +-
 tools/libxl/libxl.c  |  16 +-
 tools/libxl/libxl_create.c   |  10 +---
 tools/libxl/libxl_device.c   |   3 +-
 tools/libxl/libxl_dm.c   |  88 +++-
 tools/libxl/libxl_internal.h |  10 ++--
 tools/libxl/libxl_pci.c  |   3 +
 tools/libxl/libxl_pvusb.c| 108 +++
 tools/libxl/libxl_types.idl  |   2 +
 tools/libxl/libxl_types_internal.idl |   1 +
 tools/libxl/xl_cmdimpl.c |   3 +
 11 files changed, 181 insertions(+), 80 deletions(-)

-- 
2.6.2



[Xen-devel] [PATCH v3 3/5] libxl: add service function to check whether device model is running

2016-03-23 Thread Juergen Gross

Add an internal service function to check for a running device model.
This can be used later when adding devices to a domain requiring a
device model for either printing an error message or starting the
device model in case it is not already running.

Signed-off-by: Juergen Gross 
---
 tools/libxl/libxl.c|  4 +---
 tools/libxl/libxl_dm.c | 10 ++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 3471c4c..dcd0951 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1532,7 +1532,6 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 libxl_ctx *ctx = CTX;
 uint32_t domid = dis->domid;
 char *dom_path;
-char *pid;
 int rc, dm_present;
 
 libxl__ev_child_init(&dis->destroyer);
@@ -1555,8 +1554,7 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 }
 /* fall through */
 case LIBXL_DOMAIN_TYPE_PV:
-pid = libxl__xs_read(gc, XBT_NULL, 
GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
-dm_present = (pid != NULL);
+dm_present = libxl__dm_active(gc, domid);
 break;
 case LIBXL_DOMAIN_TYPE_INVALID:
 rc = ERROR_FAIL;
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 361e584..bffb8f8 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2150,6 +2150,16 @@ out:
 return ret;
 }
 
+int libxl__dm_active(libxl__gc *gc, uint32_t domid)
+{
+char *pid, *path;
+
+path = GCSPRINTF("/local/domain/%d/image/device-model-pid", domid);
+pid = libxl__xs_read(gc, XBT_NULL, path);
+
+return pid != NULL;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
2.6.2

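The check in libxl__dm_active() boils down to "does the device-model-pid key exist for this domain". A standalone sketch of that idea, without libxl or xenstore, might look as follows. The helper names dm_pid_path and dm_active_from_value are hypothetical, the xenstore read itself is not shown, and the sketch formats the domid with %u where the patch uses %d.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the xenstore key that holds the device model pid.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int dm_pid_path(uint32_t domid, char *buf, size_t cap)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, cap, "/local/domain/%u/image/device-model-pid",
                 (unsigned int)domid);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/*
 * pid_value is whatever a xenstore read of that key returned,
 * NULL when the key does not exist. Mirrors "return pid != NULL".
 */
static int dm_active_from_value(const char *pid_value)
{
    return pid_value != NULL;
}
```

Note that, like the patch, this only tests for the presence of the key and does not verify that the recorded process is still alive.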


[Xen-devel] [PATCH v3 1/5] libxl: make libxl__need_xenpv_qemu() operate on domain config

2016-03-23 Thread Juergen Gross

libxl__need_xenpv_qemu() is called with configuration data for console,
vfbs, disks and channels today in order to evaluate the need for
starting a device model for a pv domain.

The console data is local to the caller and setup in a way to never
require a device model. All other data is taken from the domain config
structure.

In order to support other device backends via qemu change the interface
of libxl__need_xenpv_qemu() to take the domain config structure as
input instead of the single device arrays.

Signed-off-by: Juergen Gross 
---
V2: Return false if libxl__get_domid() fails as requested by George Dunlap
---
 tools/libxl/libxl_create.c   |  9 +--
 tools/libxl/libxl_dm.c   | 57 
 tools/libxl/libxl_internal.h |  5 +---
 3 files changed, 17 insertions(+), 54 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 61b5c01..0e2b0a0 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1304,7 +1304,6 @@ static void domcreate_launch_dm(libxl__egc *egc, 
libxl__multidev *multidev,
 }
 case LIBXL_DOMAIN_TYPE_PV:
 {
-int need_qemu = 0;
 libxl__device_console console;
 libxl__device device;
 
@@ -1314,17 +1313,11 @@ static void domcreate_launch_dm(libxl__egc *egc, 
libxl__multidev *multidev,
 }
 
 init_console_info(gc, &console, 0);
-
-need_qemu = libxl__need_xenpv_qemu(gc, 1, &console,
-d_config->num_vfbs, d_config->vfbs,
-d_config->num_disks, &d_config->disks[0],
-d_config->num_channels, &d_config->channels[0]);
-
 console.backend_domid = state->console_domid;
 libxl__device_console_add(gc, domid, &console, state, &device);
 libxl__device_console_dispose(&console);
 
-if (need_qemu) {
+if (libxl__need_xenpv_qemu(gc, d_config)) {
 dcs->dmss.dm.guest_domid = domid;
 libxl__spawn_local_dm(egc, &dcs->dmss.dm);
 return;
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 4aca38e..897f3f9 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2107,61 +2107,34 @@ int libxl__destroy_device_model(libxl__gc *gc, uint32_t 
domid)
 GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
 }
 
-int libxl__need_xenpv_qemu(libxl__gc *gc,
-int nr_consoles, libxl__device_console *consoles,
-int nr_vfbs, libxl_device_vfb *vfbs,
-int nr_disks, libxl_device_disk *disks,
-int nr_channels, libxl_device_channel *channels)
+int libxl__need_xenpv_qemu(libxl__gc *gc, libxl_domain_config *d_config)
 {
 int i, ret = 0;
 uint32_t domid;
 
-/*
- * qemu is required in order to support 2 or more consoles. So switch all
- * backends to qemu if this is the case
- */
-if (nr_consoles > 1) {
-for (i = 0; i < nr_consoles; i++)
-consoles[i].consback = LIBXL__CONSOLE_BACKEND_IOEMU;
-ret = 1;
+if (libxl__get_domid(gc, &domid))
 goto out;
-}
 
-for (i = 0; i < nr_consoles; i++) {
-if (consoles[i].consback == LIBXL__CONSOLE_BACKEND_IOEMU) {
-ret = 1;
-goto out;
-}
-}
-
-if (nr_vfbs > 0) {
+if (d_config->num_vfbs > 0) {
 ret = 1;
 goto out;
 }
 
-if (nr_disks > 0) {
-ret = libxl__get_domid(gc, &domid);
-if (ret) goto out;
-for (i = 0; i < nr_disks; i++) {
-if (disks[i].backend == LIBXL_DISK_BACKEND_QDISK &&
-disks[i].backend_domid == domid) {
-ret = 1;
-goto out;
-}
+for (i = 0; i < d_config->num_disks; i++) {
+if (d_config->disks[i].backend == LIBXL_DISK_BACKEND_QDISK &&
+d_config->disks[i].backend_domid == domid) {
+ret = 1;
+goto out;
 }
 }
 
-if (nr_channels > 0) {
-ret = libxl__get_domid(gc, &domid);
-if (ret) goto out;
-for (i = 0; i < nr_channels; i++) {
-if (channels[i].backend_domid == domid) {
-/* xenconsoled is limited to the first console only.
-   Until this restriction is removed we must use qemu for
-   secondary consoles which includes all channels. */
-ret = 1;
-goto out;
-}
+for (i = 0; i < d_config->num_channels; i++) {
+if (d_config->channels[i].backend_domid == domid) {
+/* xenconsoled is limited to the first console only.
+   Until this restriction is removed we must use qemu for
+   secondary consoles which includes all channels. */
+ret = 1;
+goto out;
 }
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 345a764..fc7bdab 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1616,10 +1616,7 @@ _hidden int

[Xen-devel] [PATCH v3 5/5] libxl: add domain config parameter to force start of qemu

2016-03-23 Thread Juergen Gross

Today the device model (qemu) is started for a pv domain only in case
a device requiring qemu is specified in the domain configuration
(qdisk, vfb, channel). If there is no such device the device model
isn't started and hence it is impossible to add such a device to the
domain later.

Add a domain configuration parameter to specify the device model is
to be started in any case. This will enable adding devices with a
qemu based backend later.

While the optimal solution would be to start the device model
automatically when needed this would require some major rework of
libxl at multiple places.

Signed-off-by: Juergen Gross 
---
 docs/man/xl.cfg.pod.5   | 6 ++
 tools/libxl/libxl_create.c  | 1 +
 tools/libxl/libxl_dm.c  | 5 +
 tools/libxl/libxl_types.idl | 1 +
 tools/libxl/xl_cmdimpl.c| 3 +++
 5 files changed, 16 insertions(+)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index a4cc1b3..a3611a6 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1956,6 +1956,12 @@ xen-qemudepriv-domid$domid or xen-qemudepriv-shared or 
root.
 Please note that running QEMU as non-root causes migration and PCI
 passthrough not to work properly.
 
+=item

[Xen-devel] [PATCH v3 4/5] libxl: check for dynamic device model start required

2016-03-23 Thread Juergen Gross

Add a service routine checking whether a device model must be started
after adding a device to a domain.

Signed-off-by: Juergen Gross 
---
 tools/libxl/libxl.c  | 12 
 tools/libxl/libxl_dm.c   | 14 ++
 tools/libxl/libxl_internal.h |  4 
 tools/libxl/libxl_pci.c  |  3 +++
 tools/libxl/libxl_pvusb.c|  6 ++
 5 files changed, 39 insertions(+)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index dcd0951..2b4e36f 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2084,6 +2084,9 @@ void libxl__device_vtpm_add(libxl__egc *egc, uint32_t 
domid,
 if (rc) goto out;
 
 DEVICE_ADD(vtpm, vtpms, domid, &vtpm_saved, COMPARE_DEVID, &d_config);
+
+rc = libxl__dm_check_start(gc, &d_config, domid);
+if (rc) goto out;
 }
 
 for (;;) {
@@ -2388,6 +2391,9 @@ static void device_disk_add(libxl__egc *egc, uint32_t 
domid,
 if (rc) goto out;
 
 DEVICE_ADD(disk, disks, domid, &disk_saved, COMPARE_DISK, &d_config);
+
+rc = libxl__dm_check_start(gc, &d_config, domid);
+if (rc) goto out;
 }
 
 for (;;) {
@@ -2928,6 +2934,9 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, 
libxl_device_disk *disk,
 
 DEVICE_ADD(disk, disks, domid, &disk_saved, COMPARE_DISK, &d_config);
 
+rc = libxl__dm_check_start(gc, &d_config, domid);
+if (rc) goto out;
+
 if (dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
 rc = libxl__qmp_insert_cdrom(gc, domid, disk);
 if (rc) goto out;
@@ -3354,6 +3363,9 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t 
domid,
 if (rc) goto out;
 
 DEVICE_ADD(nic, nics, domid, &nic_saved, COMPARE_DEVID, &d_config);
+
+rc = libxl__dm_check_start(gc, &d_config, domid);
+if (rc) goto out;
 }
 
 for (;;) {
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index bffb8f8..78c46674 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2160,6 +2160,20 @@ int libxl__dm_active(libxl__gc *gc, uint32_t domid)
 return pid != NULL;
 }
 
+int libxl__dm_check_start(libxl__gc *gc, libxl_domain_config *d_config,
+  uint32_t domid)
+{
+if (libxl__dm_active(gc, domid))
+return 0;
+
+if (!libxl__need_xenpv_qemu(gc, d_config))
+return 0;
+
+LOG(ERROR, "device model required but not running");
+
+return ERROR_FAIL;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 2db8b1b..9708a46 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1618,6 +1618,10 @@ _hidden const char *libxl__domain_device_model(libxl__gc 
*gc,
 const libxl_domain_build_info *info);
 _hidden int libxl__need_xenpv_qemu(libxl__gc *gc,
libxl_domain_config *d_config);
+_hidden int libxl__dm_active(libxl__gc *gc, uint32_t domid);
+_hidden int libxl__dm_check_start(libxl__gc *gc,
+  libxl_domain_config *d_config,
+  uint32_t domid);
 
 /*
  * This function will fix reserved device memory conflict
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index dc10cb7..300fd4d 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -169,6 +169,9 @@ static int libxl__device_pci_add_xenstore(libxl__gc *gc, 
uint32_t domid, libxl_d
 
 DEVICE_ADD(pci, pcidevs, domid, &pcidev_saved, COMPARE_PCI, &d_config);
 
+rc = libxl__dm_check_start(gc, &d_config, domid);
+if (rc) goto out;
+
 for (;;) {
 rc = libxl__xs_transaction_start(gc, );
 if (rc) goto out;
diff --git a/tools/libxl/libxl_pvusb.c b/tools/libxl/libxl_pvusb.c
index 7200ead..976e4c7 100644
--- a/tools/libxl/libxl_pvusb.c
+++ b/tools/libxl/libxl_pvusb.c
@@ -139,6 +139,9 @@ static int libxl__device_usbctrl_add_xenstore(libxl__gc 
*gc, uint32_t domid,
 
 DEVICE_ADD(usbctrl, usbctrls, domid, &usbctrl_saved,
COMPARE_USBCTRL, &d_config);
+
+rc = libxl__dm_check_start(gc, &d_config, domid);
+if (rc) goto out;
 }
 
 for (;;) {
@@ -955,6 +958,9 @@ static int libxl__device_usbdev_add_xenstore(libxl__gc *gc, 
uint32_t domid,
 
 DEVICE_ADD(usbdev, usbdevs, domid, &usbdev_saved,
COMPARE_USB, &d_config);
+
+rc = libxl__dm_check_start(gc, &d_config, domid);
+if (rc) goto out;
 }
 
 for (;;) {
-- 
2.6.2

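The decision made by libxl__dm_check_start() in this patch can be summarized as a three-way check: a running device model is always fine, a configuration that needs no device model is fine, and anything else is an error. The sketch below restates that logic on a simplified, hypothetical configuration struct. sketch_config, sketch_need_qemu, SKETCH_OK and SKETCH_ERROR_FAIL are stand-ins invented for this illustration and do not correspond to libxl types or error values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in return codes; the real code returns 0 or ERROR_FAIL. */
enum { SKETCH_OK = 0, SKETCH_ERROR_FAIL = -1 };

/* Hypothetical, heavily reduced view of a domain configuration. */
struct sketch_config {
    int num_vfbs;            /* virtual framebuffers */
    int num_local_qdisks;    /* qdisk backends served by this domain */
    int num_local_channels;  /* channels served by this domain */
};

/* Simplified stand-in for libxl__need_xenpv_qemu(). */
static bool sketch_need_qemu(const struct sketch_config *c)
{
    assert(c != NULL);
    return c->num_vfbs > 0 ||
           c->num_local_qdisks > 0 ||
           c->num_local_channels > 0;
}

/* Same control flow as libxl__dm_check_start() in the patch. */
static int sketch_dm_check_start(bool dm_running,
                                 const struct sketch_config *c)
{
    if (dm_running)
        return SKETCH_OK;           /* device model already there */
    if (!sketch_need_qemu(c))
        return SKETCH_OK;           /* nothing requires a device model */
    return SKETCH_ERROR_FAIL;       /* required but not running */
}
```

With this shape, adding a qdisk to a PV domain that was started without a device model fails cleanly instead of leaving a device with no backend.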


[Xen-devel] [PATCH v3 2/5] libxl: add new pvusb backend "qusb" provided by qemu

2016-03-23 Thread Juergen Gross

Add a new pvusb backend type "qusb" which is provided by qemu. It can
be selected either by specifying the type directly in the configuration
or it is selected automatically by libxl in case there is no "usbback"
driver loaded.

Signed-off-by: Juergen Gross 
---
 docs/man/xl.cfg.pod.5|  11 +++-
 tools/libxl/libxl_device.c   |   3 +-
 tools/libxl/libxl_dm.c   |   8 +++
 tools/libxl/libxl_internal.h |   1 +
 tools/libxl/libxl_pvusb.c| 102 +++
 tools/libxl/libxl_types.idl  |   1 +
 tools/libxl/libxl_types_internal.idl |   1 +
 7 files changed, 101 insertions(+), 26 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index ec739cc..a4cc1b3 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -737,8 +737,15 @@ Possible Bs are:
 
 =item

Re: [Xen-devel] [PATCH] blkif.h: document scsi/0x12/0x node

2016-03-23 Thread Paul Durrant

> -Original Message-
> From: Bob Liu [mailto:bob@oracle.com]
> Sent: 23 March 2016 11:48
> To: xen-devel@lists.xen.org
> Cc: Paul Durrant; Ian Jackson; konrad.w...@oracle.com; jgr...@suse.com;
> Roger Pau Monne; annie...@oracle.com; David Vrabel; Bob Liu
> Subject: [PATCH] blkif.h: document scsi/0x12/0x node
> 
> This patch documents a xenstore node which is used by XENVBD Windows
> PV
> driver.
> 
> The use case is that XenServer may have OEM specific storage backends and
> there is requirement to run OEM software in guest which relied on VPD
> information supplied by the storages.
> Adding a node to xenstore is the easiest way to get this VPD information
> from
> the backend into guest where XENVBD Windows PV driver can get INQUIRY
> VPD data
> from this node and return to OEM software.
> 
> Signed-off-by: Bob Liu 
> ---
>  xen/include/public/io/blkif.h |   24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
> index 99f0326..afbcbff 100644
> --- a/xen/include/public/io/blkif.h
> +++ b/xen/include/public/io/blkif.h
> @@ -182,6 +182,30 @@
>   *  backend driver paired with a LIFO queue in the frontend will
>   *  allow us to have better performance in this scenario.
>   *
> + * scsi/0x12/0x
> + *   Values: base64 encoded string
> + *
> + *   This optional node contains SCSI INQUIRY VPD information.
> + *is the hexadecimal representation of the VPD page code.
> + *   Currently only XENVBD Windows PV driver is using this node.
> + *
> + *   A frontend e.g XENVBD Windows PV driver which represents a Xen
> VBD to
> + *   its containing operating system as a (virtual) SCSI target may return
> the

s/target/LUN

> + *   specified data in response to INQUIRY commands from its containing
> OS.

I think we can safely say "containing OS" is Windows.

> + *
> + *   A frontend which supports this feature must return the backend-
> specified
> + *   data for every INQUIRY command with the EVPD bit set.
> + *   For EVPD=1 INQUIRY commands where the corresponding xenstore
> node
> + *   does not exist, the frontend must report (to its containing OS) an
> + *   appropriate failure condition.

Not necessarily. Page 0x80 is compulsory in the T10 SPC spec so it has to be 
synthesized in the absence of data in xenstore.

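To make the point about synthesizing page 0x80 concrete, here is a rough sketch of building a Unit Serial Number VPD page when xenstore supplies nothing. The layout used (byte 0 peripheral qualifier and device type, byte 1 page code 0x80, byte 2 reserved, byte 3 page length, then the ASCII serial number) follows my reading of SPC and should be checked against the spec before use. The function name build_vpd_page_80 and the serial string in the usage note are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Build a Unit Serial Number VPD page (page code 0x80) in out[].
 * Returns the total page size in bytes, or -1 if the serial is too
 * long for the one-byte length field or the buffer is too small.
 */
static long build_vpd_page_80(const char *serial, uint8_t *out, size_t cap)
{
    size_t len;

    assert(serial != NULL && out != NULL);
    len = strlen(serial);
    if (len > 255 || cap < len + 4)
        return -1;

    out[0] = 0x00;          /* qualifier 0, direct-access block device */
    out[1] = 0x80;          /* page code: Unit Serial Number */
    out[2] = 0x00;          /* reserved */
    out[3] = (uint8_t)len;  /* page length: bytes following the header */
    memcpy(out + 4, serial, len);
    return (long)(len + 4);
}
```

For instance, calling it with the placeholder serial "XEN0001" yields an 11-byte page whose length field is 7.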
> + *
> + *   A frontend which does not support this feature just disregard these
> + *   xenstore nodes.
> + *
> + *   The data of this string node is base64 encoded. Base64 is a group of
> + *   similar binary-to-text encoding schemes that represent binary data in
> an
> + *   ASCII string format by translating it into a radix-64 representation.

Do we need to explain what base64 is?

  Paul

> + *
>   *--- Request Transport Parameters 
> 
>   *
>   * max-ring-page-order
> --
> 1.7.10.4


