[Xen-devel] [xen-unstable test] 131483: regressions - FAIL

2018-12-21 Thread osstest service owner
flight 131483 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131483/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 131423

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 131423
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 131423
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 131423
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 131423
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 131423
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 131423
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 131423
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 131423
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 131423
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 xen  7183e86a29c3fe15078eb0b8c11d3e556c22effa
baseline version:
 xen  3fd3fda9c26fc3c4f77250f795ed7ff9d38e2ec6

Last test of basis   131423  2018-12-18 02:18:18 Z4 days
Failing since131444  2018-12-19 13:39:27 Z2 days2 attempts
Testing same since   131483  2018-12-21 04:26:02 Z1 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Andrii Anisov 
  Brian Woods 
  Edgar E. Iglesias 
  Ian Jackson 
  Jan Beulich 
  Julien Grall 
  Matthew Daley 
  Razvan Cojocaru 
  Stefano Stabellini 
  Stefano Stabellini 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-prev   

[Xen-devel] [freebsd-master test] 131488: all pass - PUSHED

2018-12-21 Thread osstest service owner
flight 131488 freebsd-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131488/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 freebsd  de7f82e0773386e0171af4a2fdcb2141fd1354cc
baseline version:
 freebsd  204d7bf6cddd87478f9c1a6bb55f482d87cf2eaa

Last test of basis   131440  2018-12-19 09:19:49 Z2 days
Testing same since   131488  2018-12-21 09:20:53 Z0 days1 attempts


People who touched revisions under test:
  bcran 
  bde 
  cem 
  dab 
  emaste 
  imp 
  markj 
  mckusick 
  mjg 
  mw 
  np 
  rmacklem 
  tuexen 
  yuripv 

jobs:
 build-amd64-freebsd-againpass
 build-amd64-freebsd  pass
 build-amd64-xen-freebsd  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/freebsd.git
   204d7bf6cdd..de7f82e0773  de7f82e0773386e0171af4a2fdcb2141fd1354cc -> 
tested/master

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-3.18 test] 131479: regressions - FAIL

2018-12-21 Thread osstest service owner
flight 131479 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131479/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-examine   8 reboot   fail REGR. vs. 128858
 test-amd64-i386-xl-raw7 xen-boot fail REGR. vs. 128858
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-boot fail REGR. vs. 
128858
 test-amd64-i386-rumprun-i386  7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-xl-multivcpu  7 xen-bootfail REGR. vs. 128858
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-boot fail REGR. 
vs. 128858
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-boot fail REGR. 
vs. 128858
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-boot fail REGR. vs. 128858
 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 128858
 test-amd64-i386-pair 11 xen-boot/dst_hostfail REGR. vs. 128858
 test-amd64-i386-xl7 xen-boot fail REGR. vs. 128858
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-boot  fail REGR. vs. 128858
 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 128858
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 128858
 test-amd64-amd64-xl-xsm   7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-xl-qemuu-ovmf-amd64  7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-rumprun-amd64  7 xen-boot   fail REGR. vs. 128858
 test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 128858
 test-amd64-amd64-xl-pvshim7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-i386-pvgrub  7 xen-boot fail REGR. vs. 128858
 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host   fail REGR. vs. 128858
 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host   fail REGR. vs. 128858
 test-amd64-amd64-pair11 xen-boot/dst_hostfail REGR. vs. 128858
 test-amd64-i386-freebsd10-amd64  7 xen-boot  fail REGR. vs. 128858
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot fail REGR. vs. 128858

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-cubietruck 16 guest-start/debian.repeat fail in 131420 
pass in 131479
 test-armhf-armhf-xl-credit1   6 xen-install  fail in 131442 pass in 131479
 test-amd64-amd64-qemuu-nested-intel  7 xen-bootfail pass in 131420
 test-amd64-i386-freebsd10-i386  7 xen-boot fail pass in 131420
 test-armhf-armhf-xl-multivcpu 16 guest-start/debian.repeat fail pass in 131442

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-arndale  16 guest-start/debian.repeatfail  like 128807
 test-armhf-armhf-libvirt-raw 10 debian-di-installfail  like 128841
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 128858
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 128858
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 128858
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 128858
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 128858
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 
fail like 128858
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install 
fail never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 

[Xen-devel] [ovmf test] 131490: regressions - FAIL

2018-12-21 Thread osstest service owner
flight 131490 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131490/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i3866 xen-buildfail REGR. vs. 129475
 build-i386-xsm6 xen-buildfail REGR. vs. 129475
 build-amd64-xsm   6 xen-buildfail REGR. vs. 129475
 build-amd64   6 xen-buildfail REGR. vs. 129475

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a

version targeted for testing:
 ovmf a18f784cfdbe17855ec4376e80db927e1a81aaca
baseline version:
 ovmf 5ae3184d8c59f7bbb84bad482df6b8020ba58188

Last test of basis   129475  2018-11-05 21:13:11 Z   46 days
Failing since129526  2018-11-06 20:49:26 Z   45 days  163 attempts
Testing same since   131490  2018-12-21 10:01:12 Z0 days1 attempts


People who touched revisions under test:
  Achin Gupta 
  Ard Biesheuvel 
  Bob Feng 
  bob.c.f...@intel.com 
  BobCF 
  Chasel Chiu 
  Chasel, Chiu 
  Chen A Chen 
  Dandan Bi 
  David Wei 
  Derek Lin 
  Eric Dong 
  Feng, Bob C 
  Fu Siyuan 
  Gary Lin 
  Hao Wu 
  Jaben Carsey 
  Jeff Brasen 
  Jian J Wang 
  Jiaxin Wu 
  Jiewen Yao 
  Laszlo Ersek 
  Leif Lindholm 
  Liming Gao 
  Liu Yu 
  Marc Zyngier 
  Marcin Wojtas 
  Mike Maslenkin 
  Ming Huang 
  Pedroa Liu 
  Ruiyu Ni 
  shenglei 
  Shenglei Zhang 
  Star Zeng 
  Sughosh Ganu 
  Sumit Garg 
  Sun, Zailiang 
  Thomas Abraham 
  Ting Ye 
  Tomasz Michalec 
  Vijayenthiran Subramaniam 
  Vladimir Olovyannikov 
  Wang BinX A 
  Wu Jiaxin 
  Ye Ting 
  Yonghong Zhu 
  yuchenlin 
  Zailiang Sun 
  Zhang, Chao B 
  Zhao, ZhiqiangX 
  Zhiju.Fan 
  zhijufan 
  ZhiqiangX Zhao 
  zwei4 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 4306 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-linus test] 131475: regressions - FAIL

2018-12-21 Thread osstest service owner
flight 131475 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131475/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 7 xen-boot fail REGR. vs. 
125898
 test-amd64-amd64-rumprun-amd64  7 xen-boot   fail REGR. vs. 125898
 test-amd64-amd64-xl-qcow2 7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-xl-multivcpu  7 xen-bootfail REGR. vs. 125898
 test-amd64-amd64-pygrub   7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-libvirt-vhd  7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-xl-qemuu-win7-amd64  7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-examine   8 reboot   fail REGR. vs. 125898
 test-amd64-amd64-xl-pvhv2-intel  7 xen-boot  fail REGR. vs. 125898
 test-amd64-amd64-libvirt-pair 10 xen-boot/src_host   fail REGR. vs. 125898
 test-amd64-amd64-libvirt-pair 11 xen-boot/dst_host   fail REGR. vs. 125898
 test-amd64-amd64-pair10 xen-boot/src_hostfail REGR. vs. 125898
 test-amd64-amd64-pair11 xen-boot/dst_hostfail REGR. vs. 125898
 test-amd64-i386-xl-qemut-win10-i386  7 xen-boot  fail REGR. vs. 125898
 test-amd64-i386-xl-shadow 7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-rumprun-i386  7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-xl7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-pair 10 xen-boot/src_hostfail REGR. vs. 125898
 test-amd64-i386-pair 11 xen-boot/dst_hostfail REGR. vs. 125898
 test-amd64-i386-xl-xsm7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-libvirt   7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-libvirt-pair 10 xen-boot/src_hostfail REGR. vs. 125898
 test-amd64-i386-libvirt-xsm   7 xen-boot fail REGR. vs. 125898
 test-amd64-i386-libvirt-pair 11 xen-boot/dst_hostfail REGR. vs. 125898
 test-amd64-amd64-libvirt-xsm  7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-examine  8 reboot   fail REGR. vs. 125898
 test-amd64-i386-qemut-rhel6hvm-amd 12 guest-start/redhat.repeat fail REGR. vs. 
125898
 test-armhf-armhf-xl   7 xen-boot fail REGR. vs. 125898
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 
125898

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  7 xen-boot fail REGR. vs. 125898

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-credit1   7 xen-bootfail baseline untested
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 125898
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 125898
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 125898
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 125898
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 125898
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 125898
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 125898
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass

[Xen-devel] [xen-unstable-smoke test] 131502: tolerable all pass - PUSHED

2018-12-21 Thread osstest service owner
flight 131502 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131502/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  9d357cbaf74f0c1dc85a16498dac6d819743ce38
baseline version:
 xen  d3ce6380c1f71f9575115750e4ed707911529345

Last test of basis   131500  2018-12-21 18:00:50 Z0 days
Testing same since   131502  2018-12-21 22:01:26 Z0 days1 attempts


People who touched revisions under test:
  George Dunlap 
  Ian Jackson 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   d3ce6380c1..9d357cbaf7  9d357cbaf74f0c1dc85a16498dac6d819743ce38 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 13/25] argo: implement the register op

2018-12-21 Thread Christopher Clark
On Fri, Dec 21, 2018 at 12:53 AM Jan Beulich  wrote:
>
> >>> On 21.12.18 at 09:16,  wrote:
> > On Thu, Dec 20, 2018 at 11:28 PM Jan Beulich  wrote:
> >>
> >> >>> On 21.12.18 at 02:25,  wrote:
> >> > On Thu, Dec 20, 2018 at 12:29 AM Jan Beulich  wrote:
> >> >>
> >> >> >>> On 20.12.18 at 06:29,  wrote:
> >> >> > On Wed, Dec 12, 2018 at 1:48 AM Jan Beulich  wrote:
> >> >> >>
> >> >> >> > +static int
> >> >> >> > +argo_find_ring_mfns(struct domain *d, struct argo_ring_info 
> >> >> >> > *ring_info,
> >> >> >> > +uint32_t npage, 
> >> >> >> > XEN_GUEST_HANDLE_PARAM(argo_pfn_t) pfn_hnd,
> >> >> >> > +uint32_t len)
> >> >> >> > +{
> >> >> >> > +int i;
> >> >> >> > +int ret = 0;
> >> >> >> > +
> >> >> >> > +if ( (npage << PAGE_SHIFT) < len )
> >> >> >> > +return -EINVAL;
> >> >> >> > +
> >> >> >> > +if ( ring_info->mfns )
> >> >> >> > +{
> >> >> >> > +/*
> >> >> >> > + * Ring already existed. Check if it's the same ring,
> >> >> >> > + * i.e. same number of pages and all translated gpfns 
> >> >> >> > still
> >> >> >> > + * translating to the same mfns
> >> >> >> > + */
> >> >> >>
> >> >> >> This comment makes me wonder whether the translations are
> >> >> >> permitted to change at other times. If so I'm not sure what
> >> >> >> value verification here has. If not, this probably would want to
> >> >> >> be debugging-only code.
> >> >> >
> >> >> > My understanding is that the gfn->mfn translation is not necessarily 
> >> >> > stable
> >> >> > across entry and exit from host power state S4, suspend to disk.
>
> Note this ^^^ (and see below).
>
> >> Now I'm afraid there's some confusion here: Originally you've
> >> said "host".
> >>
> >> >> How would that be? It's not stable across guest migration (or
> >> >> its non-live save/restore equivalent),
> >> >
> >> > Right, that's clear.
> >> >
> >> >> but how would things change across S3?
> >> >
> >> > I don't think that they do change in that case.
> >> >
> >> > From studying the code involved above, a related item: the guest runs 
> >> > the same
> >> > suspend and resume kernel code before entering into/exiting from either 
> >> > guest
> >> > S3 or S4, so the guest kernel resume code needs to re-register the 
> >> > rings, to
> >> > cover the case where it is coming up in an environment where they were 
> >> > dropped
> >> > - so that's what it does.
> >> >
> >> > This relates to the code section above: if guest entry to S3 is aborted 
> >> > at the
> >> > final step (eg. error or platform refuses, eg. maybe a physical device
> >> > interaction with passthrough) then the hypervisor has not torn down the 
> >> > rings,
> >> > the guest remains running within the same domain, and the guest resume 
> >> > logic
> >> > runs, which runs through re-registration for all its rings. The check in 
> >> > the
> >> > logic above allows the existing ring mappings within the hypervisor to be
> >> > preserved.
> >>
> >> Yet now you suddenly talk about guest S3.
> >
> > Well, the context is that you did just ask about S3, without
> > specifying host or guest.
>
> I'm sorry to be picky, but no, I don't think I did. You did expicitly
> say "host", making me further think only about that case.

OK, apologies for the confusing direction of the reply. It was not intended
to be so.

> > That logic aims to make ring registration idempotent, to avoid the
> > teardown of established mappings of the ring pages in the case where
> > doing so isn't needed.
>
> You treat complexity in the kernel for complexity in the hypervisor.

(s/treat/trade/ ?) OK, that is a fair concern, yes.

> I'm not sure this is appropriate, as I can't judge how much more
> difficult it would be for the guest to look after itself. But let's look
> at both cases again:
> - For guest S3, afaik, the domain doesn't change, and hence
>   memory assignment remains the same. No re-registration
>   necessary then afaict.
> - For guest S4, aiui, the domain gets destroyed and a new one
>   built upon resume. Re-registration would be needed, but due
>   to the domain re-construction no leftovers ought to exist in
>   Xen.

I agree.

> Hence to me it would seem more natural to have the guest deal
> with the situation, and have no extra logic for this in Xen. You
> want the guest to re-register anyway, yet simply avoiding to
> do so in the S3 case ought to be a single (or very few)
> conditional(s), i.e. not a whole lot of complexity.

OK. That looks doable. thanks.

Christopher

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 5/5] pvcalls-back: set -ENOTCONN in pvcalls_conn_back_read

2018-12-21 Thread Stefano Stabellini
When a connection is closing we receive on pvcalls_sk_state_change
notification. Instead of setting the connection as closed immediately
(-ENOTCONN), let's read one more time from it: pvcalls_conn_back_read
will set the connection as closed when necessary.

That way, we avoid races between pvcalls_sk_state_change and
pvcalls_back_ioworker.

Signed-off-by: Stefano Stabellini 
---
 drivers/xen/pvcalls-back.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
index 2e5d845..71b6287 100644
--- a/drivers/xen/pvcalls-back.c
+++ b/drivers/xen/pvcalls-back.c
@@ -160,9 +160,10 @@ static void pvcalls_conn_back_read(void *opaque)
 
/* write the data, then modify the indexes */
virt_wmb();
-   if (ret < 0)
+   if (ret < 0) {
+   atomic_set(>read, 0);
intf->in_error = ret;
-   else
+   } else
intf->in_prod = prod + ret;
/* update the indexes, then notify the other end */
virt_wmb();
@@ -288,7 +289,7 @@ static void pvcalls_sk_state_change(struct sock *sock)
return;
 
intf = map->ring;
-   intf->in_error = -ENOTCONN;
+   atomic_inc(>read);
notify_remote_via_irq(map->irq);
 }
 
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 4/5] pvcalls-front: don't return error when the ring is full

2018-12-21 Thread Stefano Stabellini
When the ring is full, size == array_size. It is not an error condition,
so simply return 0 instead of an error.

Signed-off-by: Stefano Stabellini 
---
 drivers/xen/pvcalls-front.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 0158858..1a893a1 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -475,8 +475,10 @@ static int __write_ring(struct pvcalls_data_intf *intf,
virt_mb();
 
size = pvcalls_queued(prod, cons, array_size);
-   if (size >= array_size)
+   if (size > array_size)
return -EINVAL;
+   if (size == array_size)
+   return 0;
if (len > array_size - size)
len = array_size - size;
 
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 3/5] pvcalls-front: properly allocate sk

2018-12-21 Thread Stefano Stabellini
Don't use kzalloc: it ends up leaving sk->sk_prot not properly
initialized. Use sk_alloc instead and define our own trivial struct
proto.

Signed-off-by: Stefano Stabellini 
---
 drivers/xen/pvcalls-front.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 4f3d664..0158858 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -31,6 +31,12 @@
 #define PVCALLS_NR_RSP_PER_RING __CONST_RING_SIZE(xen_pvcalls, XEN_PAGE_SIZE)
 #define PVCALLS_FRONT_MAX_SPIN 5000
 
+static struct proto pvcalls_proto = {
+   .name   = "PVCalls",
+   .owner  = THIS_MODULE,
+   .obj_size = sizeof(struct sock),
+};
+
 struct pvcalls_bedata {
struct xen_pvcalls_front_ring ring;
grant_ref_t ref;
@@ -837,7 +843,7 @@ int pvcalls_front_accept(struct socket *sock, struct socket 
*newsock, int flags)
 
 received:
map2->sock = newsock;
-   newsock->sk = kzalloc(sizeof(*newsock->sk), GFP_KERNEL);
+   newsock->sk = sk_alloc(sock_net(sock->sk), PF_INET, GFP_KERNEL, 
_proto, false);
if (!newsock->sk) {
bedata->rsp[req_id].req_id = PVCALLS_INVALID_ID;
map->passive.inflight_req_id = PVCALLS_INVALID_ID;
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 0/5] pvcalls fixes

2018-12-21 Thread Stefano Stabellini
Hi all,

This series is a collection of small pvcalls fixes.

Cheers,

Stefano


Stefano Stabellini (5):
  pvcalls-front: read all data before closing the connection
  pvcalls-front: don't try to free unallocated rings
  pvcalls-front: properly allocate sk
  pvcalls-front: don't return error when the ring is full
  pvcalls-back: set -ENOTCONN in pvcalls_conn_back_read

 drivers/xen/pvcalls-back.c  |  7 ---
 drivers/xen/pvcalls-front.c | 20 +---
 2 files changed, 17 insertions(+), 10 deletions(-)

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 1/5] pvcalls-front: read all data before closing the connection

2018-12-21 Thread Stefano Stabellini
When a connection is closing in_error is set to ENOTCONN. There could
still be outstanding data on the ring left by the backend. Before
closing the connection on the frontend side, drain the ring.

Signed-off-by: Stefano Stabellini 
---
 drivers/xen/pvcalls-front.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 77224d8..e5d95aa 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -560,15 +560,13 @@ static int __read_ring(struct pvcalls_data_intf *intf,
error = intf->in_error;
/* get pointers before reading from the ring */
virt_rmb();
-   if (error < 0)
-   return error;
 
size = pvcalls_queued(prod, cons, array_size);
masked_prod = pvcalls_mask(prod, array_size);
masked_cons = pvcalls_mask(cons, array_size);
 
if (size == 0)
-   return 0;
+   return error ?: size;
 
if (len > size)
len = size;
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 13/25] argo: implement the register op

2018-12-21 Thread Christopher Clark
On Thu, Dec 20, 2018 at 4:52 AM Roger Pau Monné  wrote:
>
> On Wed, Dec 19, 2018 at 09:41:59PM -0800, Christopher Clark wrote:
> > On Wed, Dec 12, 2018 at 8:48 AM Roger Pau Monné  
> > wrote:
> > >
> > > On Fri, Nov 30, 2018 at 05:32:52PM -0800, Christopher Clark wrote:
> > > > +static inline uint16_t
> > > > +argo_hash_fn(const struct argo_ring_id *id)
> > >
> > > No need for the argo_ prefix for static functions, this is already an
> > > argo specific file.
> >
> > Although the compiler could live without the prefix, I'm finding it helpful 
> > to
> > very easily determine that functions being used are not defined elsewhere
> > within Xen; so I've left the prefix as is for version two of this series.
>
> Why do you care whether they are defined elsewhere in Xen? The scope
> of static functions is limited to the translation unit anyway.

ok, I'll remove the prefixes - you're right that I shouldn't care
whether they are defined elsewhere in Xen, and Jan's points about the
string table expansion and serial line bandwidth are true - I had not
considered those. Would adding a note to describe this reasoning to
the CODING_STYLE document be welcome?

> > > > +#else
> > > > +*mfn = p2m_lookup(d, _gfn(pfn), );
> > > > +#endif
> > > > +
> > > > +if ( !mfn_valid(*mfn) )
> > > > +ret = -EINVAL;
> > > > +#ifdef CONFIG_X86
> > > > +else if ( p2m_is_paging(p2mt) || (p2mt == p2m_ram_logdirty) )
> > > > +ret = -EAGAIN;
> > > > +#endif
> > > > +else if ( (p2mt != p2m_ram_rw) ||
> > > > +  !get_page_and_type(mfn_to_page(*mfn), d, 
> > > > PGT_writable_page) )
> > > > +ret = -EINVAL;
> > > > +
> > > > +#ifdef CONFIG_X86
> > > > +put_gfn(d, pfn);
> > >
> > > If you do this put_gfn here, by the time you check that the gfn -> mfn
> > > matches your expectations the guest might have somehow changed the gfn
> > > -> mfn mapping already (for example by ballooning down memory?)
> >
> > If the guest does that, I think it only harms itself. If for some reason
> > a memory access is denied, then the op would just fail. I don't think
> > there's a more serious consequence to be worried about.
>
> Then I wonder why you need such check in any case if the code can
> handle such cases, the more than the check itself is racy.

OK, so at the root of the question here is: does it matter what the p2m
type of the memory is at these points:

1) when the gfn is translated to mfn, at the time of ring registration

2) when the hypervisor writes into guest memory:
- where the tx_ptr index is initialized in the register op
- where ringbuf data is written in sendv
- where ring description data is written in notify

or is having PGT_writable_page type and ownership by the domain
sufficient?

For 1), I think there's some use in saying no to a guest that has
supplied a region that appears misconfigured.

For 2), input would be appreciated. It currently works under the
assumption that a p2m type check is unnecessary, which is why the
put_gfn is where it is.

For further background context, here's my understanding of this section:

When the guest invokes the hypercall operation to register a ring, it
identifies the memory that it owns and wants the hypervisor to use by
supplying an array of gfns (or in v2, addresses which are shifted to
extract their gfns).

The hypervisor translates from gfns to mfns, using the translation that
exists at that time, and then refers to that memory internally by mfn
from there on out. This find_ring_mfn function is where the gfn->mfn
translation happens.  (The variable name does needs renaming from pfn,
as you noted - thanks.)

To do the translation from gfn to mfn, (on x86) it's using
get_gfn_unshare. That's doing three things:
* returns the mfn, if there is one.
* returns the p2m type of that memory.
* acquires a reference to that gfn, which needs to be dropped at some
  point.

The p2m type type check on the gfn in find_ring_mfn at that time is
possibly conservative, rejecting more types than perhaps it needs to,
but the type that it accepts (p2m_ram_rw) is sane. It is a validation of
the p2m type at that instant, intended to detect if the guest has
supplied memory to the ring register op that does not make sense for it
to use as a ring, as indicated by the current p2m type, and if so, fail
early, or indicate that a retry later is needed.

Then the get_page_and_type call is where the memory identified by the
mfn that was just obtained, gets locked to PGT_writable_page type, and
ownership fixed to its current owner domain, by adding to its reference
count.

Then the gfn reference count is dropped with the put_gfn call. This
means that the guest can elect to change the p2m type afterwards, if it
wants; (any change needs to be consistent with its domain ownership and
PGT_writable_page type though -- not sure if that constrains possible types).

That memory can have guest-supplied data written into it, either by the
domain owning the page itself, or in 

[Xen-devel] [PATCH 2/5] pvcalls-front: don't try to free unallocated rings

2018-12-21 Thread Stefano Stabellini
inflight_req_id is 0 when initialized. If inflight_req_id is 0, there is
no accept_map to free. Fix the check in pvcalls_front_release.

Signed-off-by: Stefano Stabellini 
---
 drivers/xen/pvcalls-front.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index e5d95aa..4f3d664 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -1030,8 +1030,8 @@ int pvcalls_front_release(struct socket *sock)
spin_lock(>socket_lock);
list_del(>list);
spin_unlock(>socket_lock);
-   if (READ_ONCE(map->passive.inflight_req_id) !=
-   PVCALLS_INVALID_ID) {
+   if (READ_ONCE(map->passive.inflight_req_id) != 
PVCALLS_INVALID_ID &&
+   READ_ONCE(map->passive.inflight_req_id) != 0) {
pvcalls_front_free_map(bedata,
   map->passive.accept_map);
}
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable-smoke test] 131500: tolerable all pass - PUSHED

2018-12-21 Thread osstest service owner
flight 131500 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/131500/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  d3ce6380c1f71f9575115750e4ed707911529345
baseline version:
 xen  af80e1d9c79e3bcb392775f311d20eec54b3389b

Last test of basis   131491  2018-12-21 11:01:16 Z0 days
Testing same since   131500  2018-12-21 18:00:50 Z0 days1 attempts


People who touched revisions under test:
  Benjamin Sanda 
  Julien Grall 
  Stefano Stabellini 
  Wei Liu 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   af80e1d9c7..d3ce6380c1  d3ce6380c1f71f9575115750e4ed707911529345 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [RFC v2 3/4] pvh: Add x86/HVM direct boot ABI header file

2018-12-21 Thread Liam Merwick
From: Liam Merwick 

The x86/HVM direct boot ABI permits Qemu to be able to boot directly
into the uncompressed Linux kernel binary with minimal firmware involvement.

https://xenbits.xen.org/docs/unstable/misc/pvh.html

This commit adds the header file that defines the start_info struct
that needs to be populated in order to use this ABI.

The canonical version of start_info.h is in the Xen codebase.
(like QEMU, the Linux kernel uses a copy as well).

Signed-off-by: Liam Merwick 
Reviewed-by: Konrad Rzeszutek Wilk 
---
 include/hw/xen/start_info.h | 146 
 1 file changed, 146 insertions(+)
 create mode 100644 include/hw/xen/start_info.h

diff --git a/include/hw/xen/start_info.h b/include/hw/xen/start_info.h
new file mode 100644
index ..348779eb10cd
--- /dev/null
+++ b/include/hw/xen/start_info.h
@@ -0,0 +1,146 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2016, Citrix Systems, Inc.
+ */
+
+#ifndef __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__
+#define __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__
+
+/*
+ * Start of day structure passed to PVH guests and to HVM guests in %ebx.
+ *
+ * NOTE: nothing will be loaded at physical address 0, so a 0 value in any
+ * of the address fields should be treated as not present.
+ *
+ *  0 ++
+ *| magic  | Contains the magic value XEN_HVM_START_MAGIC_VALUE
+ *|| ("xEn3" with the 0x80 bit of the "E" set).
+ *  4 ++
+ *| version| Version of this structure. Current version is 1. New
+ *|| versions are guaranteed to be backwards-compatible.
+ *  8 ++
+ *| flags  | SIF_xxx flags.
+ * 12 ++
+ *| nr_modules | Number of modules passed to the kernel.
+ * 16 ++
+ *| modlist_paddr  | Physical address of an array of modules
+ *|| (layout of the structure below).
+ * 24 ++
+ *| cmdline_paddr  | Physical address of the command line,
+ *|| a zero-terminated ASCII string.
+ * 32 ++
+ *| rsdp_paddr | Physical address of the RSDP ACPI data structure.
+ * 40 ++
+ *| memmap_paddr   | Physical address of the (optional) memory map. Only
+ *|| present in version 1 and newer of the structure.
+ * 48 ++
+ *| memmap_entries | Number of entries in the memory map table. Only
+ *|| present in version 1 and newer of the structure.
+ *|| Zero if there is no memory map being provided.
+ * 52 ++
+ *| reserved   | Version 1 and newer only.
+ * 56 ++
+ *
+ * The layout of each entry in the module structure is the following:
+ *
+ *  0 ++
+ *| paddr  | Physical address of the module.
+ *  8 ++
+ *| size   | Size of the module in bytes.
+ * 16 ++
+ *| cmdline_paddr  | Physical address of the command line,
+ *|| a zero-terminated ASCII string.
+ * 24 ++
+ *| reserved   |
+ * 32 ++
+ *
+ * The layout of each entry in the memory map table is as follows:
+ *
+ *  0 ++
+ *| addr   | Base address
+ *  8 ++
+ *| size   | Size of mapping in bytes
+ * 16 ++
+ *| type   | Type of mapping as defined between the hypervisor
+ *|| and guest it's starting. E820_TYPE_xxx, for example.
+ * 20 +|
+ *| reserved   |
+ * 24 ++
+ *
+ * The address and sizes are always a 64bit little endian unsigned integer.
+ *
+ * NB: Xen on x86 will always try to place all the data below the 4GiB
+ * boundary.
+ *
+ * Version numbers of the hvm_start_info structure have evolved 

[Xen-devel] [RFC v2 4/4] pvh: Boot uncompressed kernel using direct boot ABI

2018-12-21 Thread Liam Merwick
These changes (along with corresponding Linux kernel and qboot changes)
enable a guest to be booted using the x86/HVM direct boot ABI.

This commit adds a load_elfboot() routine to pass the size and
location of the kernel entry point to qboot (which will fill in
the start_info struct information needed to to boot the guest).
Having loaded the ELF binary, load_linux() will run qboot
which continues the boot.

The address for the kernel entry point is read from an ELF Note
in the uncompressed kernel binary by a helper routine passed
to load_elf().

Co-developed-by: George Kennedy 
Signed-off-by: George Kennedy 
Signed-off-by: Liam Merwick 
---
 hw/i386/pc.c  | 136 +-
 include/elf.h |  10 +
 2 files changed, 145 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 115bc2825ce4..6d44a14da44d 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -54,6 +54,7 @@
 #include "sysemu/qtest.h"
 #include "kvm_i386.h"
 #include "hw/xen/xen.h"
+#include "hw/xen/start_info.h"
 #include "ui/qemu-spice.h"
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
@@ -109,6 +110,9 @@ static struct e820_entry *e820_table;
 static unsigned e820_entries;
 struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 
+/* Physical Address of PVH entry point read from kernel ELF NOTE */
+static size_t pvh_start_addr;
+
 void gsi_handler(void *opaque, int n, int level)
 {
 GSIState *s = opaque;
@@ -834,6 +838,109 @@ struct setup_data {
 uint8_t data[0];
 } __attribute__((packed));
 
+
+/*
+ * The entry point into the kernel for PVH boot is different from
+ * the native entry point.  The PVH entry is defined by the x86/HVM
+ * direct boot ABI and is available in an ELFNOTE in the kernel binary.
+ *
+ * This function is passed to load_elf() when it is called from
+ * load_elfboot() which then additionally checks for an ELF Note of
+ * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
+ * parse the PVH entry address from the ELF Note.
+ *
+ * Due to trickery in elf_opts.h, load_elf() is actually available as
+ * load_elf32() or load_elf64() and this routine needs to be able
+ * to deal with being called as 32 or 64 bit.
+ *
+ * The address of the PVH entry point is saved to the 'pvh_start_addr'
+ * global variable.  (although the entry point is 32-bit, the kernel
+ * binary can be either 32-bit or 64-bit).
+ */
+static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
+{
+size_t *elf_note_data_addr;
+
+/* Check if ELF Note header passed in is valid */
+if (arg1 == NULL) {
+return 0;
+}
+
+if (is64) {
+struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
+uint64_t nhdr_size64 = sizeof(struct elf64_note);
+uint64_t phdr_align = *(uint64_t *)arg2;
+uint64_t nhdr_namesz = nhdr64->n_namesz;
+
+elf_note_data_addr =
+((void *)nhdr64) + nhdr_size64 +
+QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
+} else {
+struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
+uint32_t nhdr_size32 = sizeof(struct elf32_note);
+uint32_t phdr_align = *(uint32_t *)arg2;
+uint32_t nhdr_namesz = nhdr32->n_namesz;
+
+elf_note_data_addr =
+((void *)nhdr32) + nhdr_size32 +
+QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
+}
+
+pvh_start_addr = *elf_note_data_addr;
+
+return pvh_start_addr;
+}
+
+static bool load_elfboot(const char *kernel_filename,
+   int kernel_file_size,
+   uint8_t *header,
+   size_t pvh_xen_start_addr,
+   FWCfgState *fw_cfg)
+{
+uint32_t flags = 0;
+uint32_t mh_load_addr = 0;
+uint32_t elf_kernel_size = 0;
+uint64_t elf_entry;
+uint64_t elf_low, elf_high;
+int kernel_size;
+
+if (ldl_p(header) != 0x464c457f) {
+return false; /* no elfboot */
+}
+
+bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
+flags = elf_is64 ?
+((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
+
+if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
+error_report("elfboot unsupported flags = %x", flags);
+exit(1);
+}
+
+uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
+kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
+   NULL, _note_type, _entry,
+   _low, _high, 0, I386_ELF_MACHINE,
+   0, 0);
+
+if (kernel_size < 0) {
+error_report("Error while loading elf kernel");
+exit(1);
+}
+mh_load_addr = elf_low;
+elf_kernel_size = elf_high - elf_low;
+
+if (pvh_start_addr == 0) {
+error_report("Error loading uncompressed kernel without PVH ELF Note");
+exit(1);
+}
+fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
+fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
+

[Xen-devel] [RFC v2 2/4] elf-ops.h: Add get_elf_note_type()

2018-12-21 Thread Liam Merwick
Introduce a routine which, given a pointer to a range of ELF Notes,
searches through them looking for a note matching the type specified
and returns a pointer to the matching ELF note.

Signed-off-by: Liam Merwick 
---
 include/hw/elf_ops.h | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
index 37d20a3800c1..ffbdfbe9c2d8 100644
--- a/include/hw/elf_ops.h
+++ b/include/hw/elf_ops.h
@@ -265,6 +265,49 @@ fail:
 return ret;
 }
 
+/* Given 'nhdr', a pointer to a range of ELF Notes, search through them
+ * for a note matching type 'elf_note_type' and return a pointer to
+ * the matching ELF note.
+ */
+static struct elf_note *glue(get_elf_note_type, SZ)(struct elf_note *nhdr,
+elf_word note_size,
+elf_word phdr_align,
+elf_word elf_note_type)
+{
+elf_word nhdr_size = sizeof(struct elf_note);
+elf_word elf_note_entry_offset = 0;
+elf_word note_type;
+elf_word nhdr_namesz;
+elf_word nhdr_descsz;
+
+if (nhdr == NULL) {
+return NULL;
+}
+
+note_type = nhdr->n_type;
+while (note_type != elf_note_type) {
+nhdr_namesz = nhdr->n_namesz;
+nhdr_descsz = nhdr->n_descsz;
+
+elf_note_entry_offset = nhdr_size +
+QEMU_ALIGN_UP(nhdr_namesz, phdr_align) +
+QEMU_ALIGN_UP(nhdr_descsz, phdr_align);
+
+/* If the offset calculated in this iteration exceeds the
+* supplied size, we are done and no matching note was found.
+*/
+if (elf_note_entry_offset > note_size) {
+return NULL;
+}
+
+/* skip to the next ELF Note entry */
+nhdr = (void *)nhdr + elf_note_entry_offset;
+note_type = nhdr->n_type;
+}
+
+return nhdr;
+}
+
 static int glue(load_elf, SZ)(const char *name, int fd,
   uint64_t (*elf_note_fn)(void *, void *, bool),
   uint64_t (*translate_fn)(void *, uint64_t),
@@ -512,6 +555,13 @@ static int glue(load_elf, SZ)(const char *name, int fd,
 }
 }
 
+   /* Search the ELF notes to find one with a type matching the
+* value passed in via 'translate_opaque'
+*/
+nhdr = (struct elf_note *)data;
+   assert(translate_opaque != NULL);
+nhdr = glue(get_elf_note_type, SZ)(nhdr, file_size, ph->p_align,
+   *(uint64_t *)translate_opaque);
 if (nhdr != NULL) {
 bool is64 =
 sizeof(struct elf_note) == sizeof(struct elf64_note);
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [Qemu-devel] [RFC 1/3] pvh: Add x86/HVM direct boot ABI header file

2018-12-21 Thread Liam Merwick

On 11/12/2018 14:57, Liam Merwick wrote:

On 11/12/2018 14:01, Stefan Hajnoczi wrote:

On Wed, Dec 05, 2018 at 10:37:24PM +, Liam Merwick wrote:

From: Liam Merwick 

The x86/HVM direct boot ABI permits Qemu to be able to boot directly
into the uncompressed Linux kernel binary without the need to run 
firmware.


https://xenbits.xen.org/docs/unstable/misc/pvh.html

This commit adds the header file that defines the start_info struct
that needs to be populated in order to use this ABI.

Signed-off-by: Maran Wilson 
Signed-off-by: Liam Merwick 
Reviewed-by: Konrad Rzeszutek Wilk 
---
  include/hw/xen/start_info.h | 146 


  1 file changed, 146 insertions(+)
  create mode 100644 include/hw/xen/start_info.h


Does it make sense to bring in Linux
include/xen/interface/hvm/start_info.h via QEMU's
include/standard-headers/?

QEMU has a script in scripts/update-linux-header.sh for syncing Linux
headers into include/standard-headers/.  This makes it easy to keep
Linux header files up-to-date.  We basically treat files in
include/standard-headers/ as auto-generated.

If you define start_info.h yourself without using
include/standard-headers/, then it won't be synced with Linux.



That does seem better.  I will make that change.


When attempting to implement this, I found the canonical copy of this 
header file is actually in Xen and the Linux copy is kept in sync with 
that.  Also, 'make headers_install' doesn't install those Xen headers.


Instead I updated the commit comment to mention the canonical copy 
location.  This file isn't expected to change much so I think keeping it 
in sync in future shouldn't be onerous.


Regards,
Liam

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [RFC v2 0/4] QEMU changes to do PVH boot

2018-12-21 Thread Liam Merwick
For certain applications it is desirable to rapidly boot a KVM virtual
machine. In cases where legacy hardware and software support within the
guest is not needed, QEMU should be able to boot directly into the
uncompressed Linux kernel binary with minimal firmware involvement.

There already exists an ABI to allow this for Xen PVH guests and the ABI
is supported by Linux and FreeBSD:

   https://xenbits.xen.org/docs/unstable/misc/pvh.html

Details on the Linux changes (v9 staged for 4.21): 
https://lkml.org/lkml/2018/12/14/1330
qboot pull request: https://github.com/bonzini/qboot/pull/17 

This patch series provides QEMU support to read the ELF header of an
uncompressed kernel binary and get the 32-bit PVH kernel entry point
from an ELF Note.  In load_linux() a call is made to load_elfboot()
so see if the header matches that of an uncompressed kernel binary (ELF)
and if so, loads the binary and determines the kernel entry address
from an ELF Note in the binary.  Then qboot does futher initialisation
of the guest (e820, etc.) and jumps to the kernel entry address and
boots the guest.

changes v1 -> v2
- Based on feedback from Stefan Hajnoczi
- The reading of the PVH entry point is now done in a single pass during
  elf_load() which results in Patch2 in v1 being split into Patches 1&2 in v2
  and considerably reworked.
- Patch1 adds a new optional function pointer to parse the ELF note type
  (the type is passed in via the existing translate_opaque arg - the
  function already had 11 args so I didn't want to add more than one new arg).
- Patch2 adds a function to elf_ops.h to find an ELF note
  matching a specific type 
- Patch3 just has a line added to the commit message to state that the Xen
  repo is the canonical location
- Patch4 (that does the PVH boot) is mainly equivalent to Patch3 in v1
  just minor load_elfboot() changes and the addition of a
  read_pvh_start_addr() helper function for load_elf()


Usіng the method/scripts documented by the NEMU team at

   https://github.com/intel/nemu/wiki/Measuring-Boot-Latency
   https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00200.html

below are some timings measured (vmlinux and bzImage from the same build)
Time to get to kernel start is almost halved (95ṁs -> 48ms)

QEMU + qboot + vmlinux (PVH + 4.20-rc4)
 qemu_init_end: 41.550521
 fw_start: 41.667139 (+0.116618)
 fw_do_boot: 47.448495 (+5.781356)
 linux_startup_64: 47.720785 (+0.27229)
 linux_start_kernel: 48.399541 (+0.678756)
 linux_start_user: 296.952056 (+248.552515)

QEMU + qboot + bzImage:
 qemu_init_end: 29.209276
 fw_start: 29.317342 (+0.108066)
 linux_start_boot: 36.679362 (+7.36202)
 linux_startup_64: 94.531349 (+57.851987)
 linux_start_kernel: 94.900913 (+0.369564)
 linux_start_user: 401.060971 (+306.160058)

QEMU + bzImage:
 qemu_init_end: 30.424430
 linux_startup_64: 893.770334 (+863.345904)
 linux_start_kernel: 894.17049 (+0.400156)
 linux_start_user: 1208.679768 (+314.509278)


Liam Merwick (4):
  elf: Add optional function ptr to load_elf() to parse ELF notes
  elf-ops.h: Add get_elf_note_type()
  pvh: Add x86/HVM direct boot ABI header file
  pvh: Boot uncompressed kernel using direct boot ABI

 hw/alpha/dp264.c   |   4 +-
 hw/arm/armv7m.c|   3 +-
 hw/arm/boot.c  |   2 +-
 hw/core/generic-loader.c   |   2 +-
 hw/core/loader.c   |  24 ---
 hw/cris/boot.c |   3 +-
 hw/hppa/machine.c  |   6 +-
 hw/i386/multiboot.c|   2 +-
 hw/i386/pc.c   | 131 +++-
 hw/lm32/lm32_boards.c  |   6 +-
 hw/lm32/milkymist.c|   3 +-
 hw/m68k/an5206.c   |   2 +-
 hw/m68k/mcf5208.c  |   2 +-
 hw/microblaze/boot.c   |   7 +-
 hw/mips/mips_fulong2e.c|   5 +-
 hw/mips/mips_malta.c   |   5 +-
 hw/mips/mips_mipssim.c |   5 +-
 hw/mips/mips_r4k.c |   5 +-
 hw/moxie/moxiesim.c|   2 +-
 hw/nios2/boot.c|   7 +-
 hw/openrisc/openrisc_sim.c |   2 +-
 hw/pci-host/prep.c |   2 +-
 hw/ppc/e500.c  |   3 +-
 hw/ppc/mac_newworld.c  |   5 +-
 hw/ppc/mac_oldworld.c  |   5 +-
 hw/ppc/ppc440_bamboo.c |   2 +-
 hw/ppc/sam460ex.c  |   3 +-
 hw/ppc/spapr.c |   7 +-
 hw/ppc/virtex_ml507.c  |   2 +-
 hw/riscv/sifive_e.c|   2 +-
 hw/riscv/sifive_u.c|   2 +-
 hw/riscv/spike.c   |   2 +-
 hw/riscv/virt.c|   2 +-
 hw/s390x/ipl.c |   9 ++-
 hw/sparc/leon3.c   |   3 +-
 hw/sparc/sun4m.c   |   6 +-
 hw/sparc64/sun4u.c |   4 +-
 hw/tricore/tricore_testboard.c |   2 +-
 hw/xtensa/sim.c|  12 ++--
 hw/xtensa/xtfpga.c |   2 +-
 include/elf.h  |  10 +++
 include/hw/elf_ops.h   |  72 
 include/hw/loader.h|   

Re: [Xen-devel] [RFC 2/3] pc: Read PVH entry point from ELF note in kernel binary

2018-12-21 Thread Liam Merwick

Thanks Stefan for the review - comments inline.

On 11/12/2018 14:17, Stefan Hajnoczi wrote:

On Wed, Dec 05, 2018 at 10:37:25PM +, Liam Merwick wrote:

From: Liam Merwick 

Add support to read the PVH Entry address from an ELF note in the
uncompressed kernel binary (as defined by the x86/HVM direct boot ABI).
This 32-bit entry point will be used by QEMU to load the kernel in the
guest and jump into the kernel entry point.

For now, a call to this function is added in pc_memory_init() to read the
address - a future patch will use the entry point.

Signed-off-by: Liam Merwick 
---
  hw/i386/pc.c  | 272 +-
  include/elf.h |  10 +++
  2 files changed, 281 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f095725dbab2..056aa46d99b9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -109,6 +109,9 @@ static struct e820_entry *e820_table;
  static unsigned e820_entries;
  struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
  
+/* Physical Address of PVH entry point read from kernel ELF NOTE */

+static size_t pvh_start_addr;
+
  void gsi_handler(void *opaque, int n, int level)
  {
  GSIState *s = opaque;
@@ -834,6 +837,267 @@ struct setup_data {
  uint8_t data[0];
  } __attribute__((packed));
  
+/*

+ * Search through the ELF Notes for an entry with the given
+ * ELF Note type
+ */
+static void *get_elf_note_type(void *ehdr, void *phdr, bool elf_is64,
+size_t elf_note_type)


Generic ELF code.  Can you put it in hw/core/loader.c?



I've added a modified/slimmed down version to include/hw/elf_ops.h 
(which now handles 32 and 64 bit as you mention below).  I've put this 
in a separate commit.






+{
+void *nhdr = NULL;
+size_t nhdr_size = elf_is64 ? sizeof(Elf64_Nhdr) : sizeof(Elf32_Nhdr);
+size_t elf_note_entry_sz = 0;
+size_t phdr_off;
+size_t phdr_align;
+size_t phdr_memsz;
+size_t nhdr_namesz;
+size_t nhdr_descsz;
+size_t note_type;


The macro tricks used by hw/core/loader.c are nasty, but I think they
get the types right.  Here the Elf64 on 32-bit host case is definitely
broken due to using size_t.  Perhaps 64-on-32 isn't supported, but
getting the types right is worth discussing.


+
+phdr_off = elf_is64 ?
+((Elf64_Phdr *)phdr)->p_offset : ((Elf32_Phdr *)phdr)->p_offset;
+phdr_align = elf_is64 ?
+((Elf64_Phdr *)phdr)->p_align : ((Elf32_Phdr *)phdr)->p_align;
+phdr_memsz = elf_is64 ?
+((Elf64_Phdr *)phdr)->p_memsz : ((Elf32_Phdr *)phdr)->p_memsz;
+
+nhdr = ehdr + phdr_off;


The ELF file is untrusted.  All inputs must be validated.  phdr_off
could be an bogus/malicious value.



Most of the parsing of the ELF binary goes away due to moving to parse 
during elf_load() - more info below.






+note_type = elf_is64 ?
+((Elf64_Nhdr *)nhdr)->n_type : ((Elf32_Nhdr *)nhdr)->n_type;
+nhdr_namesz = elf_is64 ?
+((Elf64_Nhdr *)nhdr)->n_namesz : ((Elf32_Nhdr *)nhdr)->n_namesz;
+nhdr_descsz = elf_is64 ?
+((Elf64_Nhdr *)nhdr)->n_descsz : ((Elf32_Nhdr *)nhdr)->n_descsz;
+
+while (note_type != elf_note_type) {
+elf_note_entry_sz = nhdr_size +
+QEMU_ALIGN_UP(nhdr_namesz, phdr_align) +
+QEMU_ALIGN_UP(nhdr_descsz, phdr_align);
+
+/*
+ * Verify that we haven't exceeded the end of the ELF Note section.
+ * If we have, then there is no note of the given type present
+ * in the ELF Notes.
+ */
+if (phdr_off + phdr_memsz < ((nhdr - ehdr) + elf_note_entry_sz)) {
+error_report("Note type (0x%lx) not found in ELF Note section",
+elf_note_type);
+return NULL;
+}
+
+/* skip to the next ELF Note entry */
+nhdr += elf_note_entry_sz;
+note_type = elf_is64 ?
+((Elf64_Nhdr *)nhdr)->n_type : ((Elf32_Nhdr *)nhdr)->n_type;
+nhdr_namesz = elf_is64 ?
+((Elf64_Nhdr *)nhdr)->n_namesz : ((Elf32_Nhdr *)nhdr)->n_namesz;
+nhdr_descsz = elf_is64 ?
+((Elf64_Nhdr *)nhdr)->n_descsz : ((Elf32_Nhdr *)nhdr)->n_descsz;
+}
+
+return nhdr;
+}
+
+/*
+ * The entry point into the kernel for PVH boot is different from
+ * the native entry point.  The PVH entry is defined by the x86/HVM
+ * direct boot ABI and is available in an ELFNOTE in the kernel binary.
+ * This function reads the ELF headers of the binary specified on the
+ * command line by -kernel (path contained in 'filename') and discovers
+ * the PVH entry address from the appropriate ELF Note.
+ *
+ * The address of the PVH entry point is saved to the 'pvh_start_addr'
+ * global variable. The ELF class of the binary is returned via 'elfclass'
+ * (although the entry point is 32-bit, the kernel binary can be either
+ * 32-bit or 64-bit).
+ */
+static bool read_pvh_start_addr_elf_note(const char *filename,
+unsigned char *elfclass)
+{


Can this be integrated into ELF loading?  

[Xen-devel] [RFC v2 1/4] elf: Add optional function ptr to load_elf() to parse ELF notes

2018-12-21 Thread Liam Merwick
This patch adds an optional function pointer, 'elf_note_fn', to
load_elf() which causes load_elf() to additionally parse any
ELF program headers of type PT_NOTE and check to see if the ELF
Note is of the type specified by the 'translate_opaque' arg.
If a matching ELF Note is found then the specfied function pointer
is called to process the ELF note.

Passing a NULL function pointer results in ELF Notes being skipped.

The first consumer of this functionality is the PVHboot support
which needs to read the XEN_ELFNOTE_PHYS32_ENTRY ELF Note while
loading the uncompressed kernel binary in order to discover the
boot entry address for the x86/HVM direct boot ABI.

Signed-off-by: Liam Merwick 
---
 hw/alpha/dp264.c   |  4 ++--
 hw/arm/armv7m.c|  3 ++-
 hw/arm/boot.c  |  2 +-
 hw/core/generic-loader.c   |  2 +-
 hw/core/loader.c   | 24 
 hw/cris/boot.c |  3 ++-
 hw/hppa/machine.c  |  6 +++---
 hw/i386/multiboot.c|  2 +-
 hw/lm32/lm32_boards.c  |  6 --
 hw/lm32/milkymist.c|  3 ++-
 hw/m68k/an5206.c   |  2 +-
 hw/m68k/mcf5208.c  |  2 +-
 hw/microblaze/boot.c   |  7 ---
 hw/mips/mips_fulong2e.c|  5 +++--
 hw/mips/mips_malta.c   |  5 +++--
 hw/mips/mips_mipssim.c |  5 +++--
 hw/mips/mips_r4k.c |  5 +++--
 hw/moxie/moxiesim.c|  2 +-
 hw/nios2/boot.c|  7 ---
 hw/openrisc/openrisc_sim.c |  2 +-
 hw/pci-host/prep.c |  2 +-
 hw/ppc/e500.c  |  3 ++-
 hw/ppc/mac_newworld.c  |  5 +++--
 hw/ppc/mac_oldworld.c  |  5 +++--
 hw/ppc/ppc440_bamboo.c |  2 +-
 hw/ppc/sam460ex.c  |  3 ++-
 hw/ppc/spapr.c |  7 ---
 hw/ppc/virtex_ml507.c  |  2 +-
 hw/riscv/sifive_e.c|  2 +-
 hw/riscv/sifive_u.c|  2 +-
 hw/riscv/spike.c   |  2 +-
 hw/riscv/virt.c|  2 +-
 hw/s390x/ipl.c |  9 ++---
 hw/sparc/leon3.c   |  3 ++-
 hw/sparc/sun4m.c   |  6 --
 hw/sparc64/sun4u.c |  4 ++--
 hw/tricore/tricore_testboard.c |  2 +-
 hw/xtensa/sim.c| 12 
 hw/xtensa/xtfpga.c |  2 +-
 include/hw/elf_ops.h   | 23 +++
 include/hw/loader.h|  9 -
 41 files changed, 134 insertions(+), 70 deletions(-)

diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
index dd62f2a4050c..0347eb897c8a 100644
--- a/hw/alpha/dp264.c
+++ b/hw/alpha/dp264.c
@@ -114,7 +114,7 @@ static void clipper_init(MachineState *machine)
 error_report("no palcode provided");
 exit(1);
 }
-size = load_elf(palcode_filename, cpu_alpha_superpage_to_phys,
+size = load_elf(palcode_filename, NULL, cpu_alpha_superpage_to_phys,
 NULL, _entry, _low, _high,
 0, EM_ALPHA, 0, 0);
 if (size < 0) {
@@ -133,7 +133,7 @@ static void clipper_init(MachineState *machine)
 if (kernel_filename) {
 uint64_t param_offset;
 
-size = load_elf(kernel_filename, cpu_alpha_superpage_to_phys,
+size = load_elf(kernel_filename, NULL, cpu_alpha_superpage_to_phys,
 NULL, _entry, _low, _high,
 0, EM_ALPHA, 0, 0);
 if (size < 0) {
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index 4bf9131b81e4..a4d528537eb4 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -298,7 +298,8 @@ void armv7m_load_kernel(ARMCPU *cpu, const char 
*kernel_filename, int mem_size)
 as = cpu_get_address_space(cs, asidx);
 
 if (kernel_filename) {
-image_size = load_elf_as(kernel_filename, NULL, NULL, , ,
+image_size = load_elf_as(kernel_filename, NULL, NULL, NULL,
+ , ,
  NULL, big_endian, EM_ARM, 1, 0, as);
 if (image_size < 0) {
 image_size = load_image_targphys_as(kernel_filename, 0,
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 94fce128028c..2b59379be6af 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -884,7 +884,7 @@ static int64_t arm_load_elf(struct arm_boot_info *info, 
uint64_t *pentry,
 }
 }
 
-ret = load_elf_as(info->kernel_filename, NULL, NULL,
+ret = load_elf_as(info->kernel_filename, NULL, NULL, NULL,
   pentry, lowaddr, highaddr, big_endian, elf_machine,
   1, data_swab, as);
 if (ret <= 0) {
diff --git a/hw/core/generic-loader.c b/hw/core/generic-loader.c
index fbae05fb3b64..3695dd439cd0 100644
--- a/hw/core/generic-loader.c
+++ b/hw/core/generic-loader.c
@@ -136,7 +136,7 @@ static void generic_loader_realize(DeviceState *dev, Error 
**errp)
 AddressSpace *as = s->cpu ? s->cpu->as :  NULL;
 
 if (!s->force_raw) {
-size = load_elf_as(s->file, NULL, NULL, , NULL, 

[Xen-devel] [PATCH for-4.12 V13 1/5] x86/p2m: allocate logdirty_ranges for altp2ms

2018-12-21 Thread Razvan Cojocaru
For now, only do allocation/deallocation; keeping them in sync
will be done in subsequent patches.

Logdirty synchronization will only be done for active altp2ms;
so allocate logdirty rangesets (copying the host logdirty
rangeset) when an altp2m is activated, and free it when
deactivated.

Signed-off-by: Razvan Cojocaru 
Tested-by: Tamas K Lengyel 
Reviewed-by: George Dunlap 

---
CC: George Dunlap 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Wei Liu 
CC: "Roger Pau Monné" 

---
Changes since V12:
 - Added George's Reviewed-by.
---
 xen/arch/x86/mm/p2m.c | 46 +++---
 1 file changed, 39 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fea4497..96a6d3e 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2265,6 +2265,40 @@ void p2m_flush_altp2m(struct domain *d)
 altp2m_list_unlock(d);
 }
 
+static int p2m_activate_altp2m(struct domain *d, unsigned int idx)
+{
+struct p2m_domain *hostp2m, *p2m;
+int rc;
+
+ASSERT(idx < MAX_ALTP2M);
+
+p2m = d->arch.altp2m_p2m[idx];
+hostp2m = p2m_get_hostp2m(d);
+
+p2m_lock(p2m);
+
+rc = p2m_init_logdirty(p2m);
+
+if ( rc )
+goto out;
+
+/* The following is really just a rangeset copy. */
+rc = rangeset_merge(p2m->logdirty_ranges, hostp2m->logdirty_ranges);
+
+if ( rc )
+{
+p2m_free_logdirty(p2m);
+goto out;
+}
+
+p2m_init_altp2m_ept(d, idx);
+
+ out:
+p2m_unlock(p2m);
+
+return rc;
+}
+
 int p2m_init_altp2m_by_id(struct domain *d, unsigned int idx)
 {
 int rc = -EINVAL;
@@ -2275,10 +2309,7 @@ int p2m_init_altp2m_by_id(struct domain *d, unsigned int 
idx)
 altp2m_list_lock(d);
 
 if ( d->arch.altp2m_eptp[idx] == mfn_x(INVALID_MFN) )
-{
-p2m_init_altp2m_ept(d, idx);
-rc = 0;
-}
+rc = p2m_activate_altp2m(d, idx);
 
 altp2m_list_unlock(d);
 return rc;
@@ -2296,9 +2327,10 @@ int p2m_init_next_altp2m(struct domain *d, uint16_t *idx)
 if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) )
 continue;
 
-p2m_init_altp2m_ept(d, i);
-*idx = i;
-rc = 0;
+rc = p2m_activate_altp2m(d, i);
+
+if ( !rc )
+*idx = i;
 
 break;
 }
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 V13 0/5] Fix VGA logdirty related display freezes with altp2m

2018-12-21 Thread Razvan Cojocaru
This series aims to prevent the display from freezing when
enabling altp2m and switching to a new view (and assorted problems
when resizing the display).

The series introduces p2m_{init,free}_logdirty(), allocates a new
logdirty rangeset for each new altp2m, and  propagates (under lock)
changes to all p2ms.

[PATCH for-4.12 V13 1/5] x86/p2m: allocate logdirty_ranges for altp2ms
[PATCH for-4.12 V13 2/5] x86/p2m: refactor p2m_reset_altp2m()
[PATCH for-4.12 V13 3/5] x86/altp2m: fix display frozen when switching to a new 
view early
[PATCH for-4.12 V13 4/5] p2m: Always use hostp2m when clipping rangesets
[PATCH for-4.12 V13 5/5] p2m: change_type_range: Only invalidate mapped gfns


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 V13 4/5] p2m: Always use hostp2m when clipping rangesets

2018-12-21 Thread Razvan Cojocaru
The logdirty rangesets of the altp2ms need to be kept in sync with the
hostp2m. This means when iterating through the altp2ms, we need to
use the host p2m to clip the rangeset, not the indiviual altp2m's
value.

This change also:

- Documents that the end is non-inclusive

- Calculates an "inclusive" value for the end once, rather than
  open-coding the modification, and (worse) back-modifying updates so
  that the calculation ends up correct

- Clarifies the logic deciding whether to call
  change_entry_type_global() or change_entry_type_range()

- Handles the case where start >= hostp2m->max_mapped_pfn

Signed-off-by: George Dunlap 
Signed-off-by: Razvan Cojocaru 
Tested-by: Tamas K Lengyel 

---
CC: George Dunlap 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Wei Liu 
CC: "Roger Pau Monné" 

---
Changes since V12:
 - Corrected the clipping warning placement.
---
 xen/arch/x86/mm/p2m.c | 48 
 1 file changed, 32 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index d145850..2af50af 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1002,30 +1002,46 @@ int p2m_change_type_one(struct domain *d, unsigned long 
gfn_l,
 return rc;
 }
 
-/* Modify the p2m type of a range of gfns from ot to nt. */
+/* Modify the p2m type of [start, end_exclusive) from ot to nt. */
 static void change_type_range(struct p2m_domain *p2m,
-  unsigned long start, unsigned long end,
+  unsigned long start, unsigned long end_exclusive,
   p2m_type_t ot, p2m_type_t nt)
 {
-unsigned long gfn = start;
 struct domain *d = p2m->domain;
+const unsigned long host_max_pfn = p2m_get_hostp2m(d)->max_mapped_pfn;
+unsigned long end = end_exclusive - 1;
 int rc = 0;
 
-if ( unlikely(end > p2m->max_mapped_pfn) )
+/*
+ * Always clip the rangeset down to the host p2m. This is probably not
+ * the right behavior. This should be revisited later, but for now post a
+ * warning.
+ */
+if ( unlikely(end > host_max_pfn) )
 {
-if ( !gfn )
-{
-p2m->change_entry_type_global(p2m, ot, nt);
-gfn = end;
-}
-end = p2m->max_mapped_pfn + 1;
+printk(XENLOG_G_WARNING "Dom%d logdirty rangeset clipped to 
max_mapped_pfn\n",
+   d->domain_id);
+end = host_max_pfn;
 }
-if ( gfn < end )
-rc = p2m->change_entry_type_range(p2m, ot, nt, gfn, end - 1);
+
+/* If the requested range is out of scope, return doing nothing. */
+if ( start > end )
+return;
+
+/*
+ * If all valid gfns are in the invalidation range, just do a
+ * global type change. Otherwise, invalidate only the range we
+ * need.
+ */
+if ( !start && end >= p2m->max_mapped_pfn )
+p2m->change_entry_type_global(p2m, ot, nt);
+else
+rc = p2m->change_entry_type_range(p2m, ot, nt, start, end);
+
 if ( rc )
 {
-printk(XENLOG_G_ERR "Error %d changing Dom%d GFNs [%lx,%lx] from %d to 
%d\n",
-   rc, d->domain_id, start, end - 1, ot, nt);
+printk(XENLOG_G_ERR "Error %d changing Dom%d GFNs [%lx,%lx) from %d to 
%d\n",
+   rc, d->domain_id, start, end_exclusive, ot, nt);
 domain_crash(d);
 }
 
@@ -1033,11 +1049,11 @@ static void change_type_range(struct p2m_domain *p2m,
 {
 case p2m_ram_rw:
 if ( ot == p2m_ram_logdirty )
-rc = rangeset_remove_range(p2m->logdirty_ranges, start, end - 1);
+rc = rangeset_remove_range(p2m->logdirty_ranges, start, end);
 break;
 case p2m_ram_logdirty:
 if ( ot == p2m_ram_rw )
-rc = rangeset_add_range(p2m->logdirty_ranges, start, end - 1);
+rc = rangeset_add_range(p2m->logdirty_ranges, start, end);
 break;
 default:
 break;
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 V13 3/5] x86/altp2m: fix display frozen when switching to a new view early

2018-12-21 Thread Razvan Cojocaru
When an new altp2m view is created very early on guest boot, the
display will freeze (although the guest will run normally). This
may also happen on resizing the display. The reason is the way
Xen currently (mis)handles logdirty VGA: it intentionally
misconfigures VGA pages so that they will fault.

The problem is that it only does this in the host p2m. Once we
switch to a new altp2m, the misconfigured entries will no longer
fault, so the display will not be updated.

This patch:
* updates ept_handle_misconfig() to use the active altp2m instead
  of the hostp2m;
* modifies p2m_change_entry_type_global(),
  p2m_memory_type_changed(), p2m_change_type_range() and
  p2m_finish_type_change() to propagate their changes to all
  valid altp2ms.

With the introduction of altp2m fields in p2m_memory_type_changed()
the whole function has been put under CONFIG_HVM.

Suggested-by: George Dunlap 
Signed-off-by: Razvan Cojocaru 
Tested-by: Tamas K Lengyel 
Acked-by: George Dunlap 

---
CC: Jun Nakajima 
CC: Kevin Tian 
CC: George Dunlap 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Wei Liu 
CC: "Roger Pau Monné" 

---
Changes since V12:
 - None.
---
 xen/arch/x86/mm/p2m-ept.c |   9 ++-
 xen/arch/x86/mm/p2m-pt.c  |   8 +++
 xen/arch/x86/mm/p2m.c | 169 ++
 xen/include/asm-x86/p2m.h |   6 +-
 4 files changed, 158 insertions(+), 34 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 6e4e375..00fb82d 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -657,6 +657,9 @@ bool_t ept_handle_misconfig(uint64_t gpa)
 bool_t spurious;
 int rc;
 
+if ( altp2m_active(curr->domain) )
+p2m = p2m_get_altp2m(curr);
+
 p2m_lock(p2m);
 
 spurious = curr->arch.hvm.vmx.ept_spurious_misconfig;
@@ -1416,9 +1419,13 @@ void p2m_init_altp2m_ept(struct domain *d, unsigned int 
i)
 struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
 struct ept_data *ept;
 
+p2m->default_access = hostp2m->default_access;
+p2m->domain = hostp2m->domain;
+
+p2m->global_logdirty = hostp2m->global_logdirty;
 p2m->ept.ad = hostp2m->ept.ad;
 p2m->min_remapped_gfn = gfn_x(INVALID_GFN);
-p2m->max_remapped_gfn = 0;
+p2m->max_mapped_pfn = p2m->max_remapped_gfn = 0;
 ept = >ept;
 ept->mfn = pagetable_get_pfn(p2m_get_pagetable(p2m));
 d->arch.altp2m_eptp[i] = ept->eptp;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 17a6b61..b5c19df 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -464,6 +465,13 @@ int p2m_pt_handle_deferred_changes(uint64_t gpa)
 struct p2m_domain *p2m = p2m_get_hostp2m(current->domain);
 int rc;
 
+/*
+ * Should altp2m ever be enabled for NPT / shadow use, this code
+ * should be updated to make use of the active altp2m, like
+ * ept_handle_misconfig().
+ */
+ASSERT(!altp2m_active(current->domain));
+
 p2m_lock(p2m);
 rc = do_recalc(p2m, PFN_DOWN(gpa));
 p2m_unlock(p2m);
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 7c6aae7..d145850 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -277,7 +277,6 @@ int p2m_init(struct domain *d)
 int p2m_is_logdirty_range(struct p2m_domain *p2m, unsigned long start,
   unsigned long end)
 {
-ASSERT(p2m_is_hostp2m(p2m));
 if ( p2m->global_logdirty ||
  rangeset_contains_range(p2m->logdirty_ranges, start, end) )
 return 1;
@@ -286,31 +285,79 @@ int p2m_is_logdirty_range(struct p2m_domain *p2m, 
unsigned long start,
 return 0;
 }
 
+static void change_entry_type_global(struct p2m_domain *p2m,
+ p2m_type_t ot, p2m_type_t nt)
+{
+p2m->change_entry_type_global(p2m, ot, nt);
+p2m->global_logdirty = (nt == p2m_ram_logdirty);
+}
+
 void p2m_change_entry_type_global(struct domain *d,
   p2m_type_t ot, p2m_type_t nt)
 {
-struct p2m_domain *p2m = p2m_get_hostp2m(d);
+struct p2m_domain *hostp2m = p2m_get_hostp2m(d);
 
 ASSERT(ot != nt);
 ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
 
-p2m_lock(p2m);
-p2m->change_entry_type_global(p2m, ot, nt);
-p2m->global_logdirty = (nt == p2m_ram_logdirty);
-p2m_unlock(p2m);
+p2m_lock(hostp2m);
+
+change_entry_type_global(hostp2m, ot, nt);
+
+#ifdef CONFIG_HVM
+if ( unlikely(altp2m_active(d)) )
+{
+unsigned int i;
+
+for ( i = 0; i < MAX_ALTP2M; i++ )
+if ( d->arch.altp2m_eptp[i] != mfn_x(INVALID_MFN) )
+{
+struct p2m_domain *altp2m = d->arch.altp2m_p2m[i];
+
+p2m_lock(altp2m);
+change_entry_type_global(altp2m, ot, nt);
+p2m_unlock(altp2m);
+}
+}
+#endif
+
+p2m_unlock(hostp2m);
+}
+
+#ifdef CONFIG_HVM
+/* 

[Xen-devel] [PATCH for-4.12 V13 2/5] x86/p2m: refactor p2m_reset_altp2m()

2018-12-21 Thread Razvan Cojocaru
Refactor p2m_reset_altp2m() so that it can be used to remove
redundant codepaths, fixing the locking while we're at it.

The previous code now replaced by p2m_reset_altp2m(d, i,
ALTP2M_DEACTIVATE) calls did not set p2m->min_remapped_gfn
and p2m->max_remapped_gfn because in those cases the altp2m
idx was disabled; so before getting used again,
p2m_init_altp2m_ept() would get called, which resets them.
Always setting them in p2m_reset_altp2m(), while redundant,
is preferable to an extra conditional.

Signed-off-by: Razvan Cojocaru 
Tested-by: Tamas K Lengyel 
Reviewed-by: George Dunlap 

---
CC: George Dunlap 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Wei Liu 
CC: "Roger Pau Monné" 

---
Changes since V12:
 - Added George's Reviewed-by.
---
 xen/arch/x86/mm/p2m.c | 57 ++-
 1 file changed, 34 insertions(+), 23 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 96a6d3e..7c6aae7 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2247,6 +2247,36 @@ bool_t p2m_altp2m_lazy_copy(struct vcpu *v, paddr_t gpa,
 return 1;
 }
 
+enum altp2m_reset_type {
+ALTP2M_RESET,
+ALTP2M_DEACTIVATE
+};
+
+static void p2m_reset_altp2m(struct domain *d, unsigned int idx,
+ enum altp2m_reset_type reset_type)
+{
+struct p2m_domain *p2m;
+
+ASSERT(idx < MAX_ALTP2M);
+p2m = d->arch.altp2m_p2m[idx];
+
+p2m_lock(p2m);
+
+p2m_flush_table_locked(p2m);
+
+if ( reset_type == ALTP2M_DEACTIVATE )
+p2m_free_logdirty(p2m);
+
+/* Uninit and reinit ept to force TLB shootdown */
+ept_p2m_uninit(p2m);
+ept_p2m_init(p2m);
+
+p2m->min_remapped_gfn = gfn_x(INVALID_GFN);
+p2m->max_remapped_gfn = 0;
+
+p2m_unlock(p2m);
+}
+
 void p2m_flush_altp2m(struct domain *d)
 {
 unsigned int i;
@@ -2255,10 +2285,7 @@ void p2m_flush_altp2m(struct domain *d)
 
 for ( i = 0; i < MAX_ALTP2M; i++ )
 {
-p2m_flush_table(d->arch.altp2m_p2m[i]);
-/* Uninit and reinit ept to force TLB shootdown */
-ept_p2m_uninit(d->arch.altp2m_p2m[i]);
-ept_p2m_init(d->arch.altp2m_p2m[i]);
+p2m_reset_altp2m(d, i, ALTP2M_DEACTIVATE);
 d->arch.altp2m_eptp[i] = mfn_x(INVALID_MFN);
 }
 
@@ -2357,10 +2384,7 @@ int p2m_destroy_altp2m_by_id(struct domain *d, unsigned 
int idx)
 
 if ( !_atomic_read(p2m->active_vcpus) )
 {
-p2m_flush_table(d->arch.altp2m_p2m[idx]);
-/* Uninit and reinit ept to force TLB shootdown */
-ept_p2m_uninit(d->arch.altp2m_p2m[idx]);
-ept_p2m_init(d->arch.altp2m_p2m[idx]);
+p2m_reset_altp2m(d, idx, ALTP2M_DEACTIVATE);
 d->arch.altp2m_eptp[idx] = mfn_x(INVALID_MFN);
 rc = 0;
 }
@@ -2485,16 +2509,6 @@ int p2m_change_altp2m_gfn(struct domain *d, unsigned int 
idx,
 return rc;
 }
 
-static void p2m_reset_altp2m(struct p2m_domain *p2m)
-{
-p2m_flush_table(p2m);
-/* Uninit and reinit ept to force TLB shootdown */
-ept_p2m_uninit(p2m);
-ept_p2m_init(p2m);
-p2m->min_remapped_gfn = gfn_x(INVALID_GFN);
-p2m->max_remapped_gfn = 0;
-}
-
 int p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
 mfn_t mfn, unsigned int page_order,
 p2m_type_t p2mt, p2m_access_t p2ma)
@@ -2528,7 +2542,7 @@ int p2m_altp2m_propagate_change(struct domain *d, gfn_t 
gfn,
 {
 if ( !reset_count++ )
 {
-p2m_reset_altp2m(p2m);
+p2m_reset_altp2m(d, i, ALTP2M_RESET);
 last_reset_idx = i;
 }
 else
@@ -2542,10 +2556,7 @@ int p2m_altp2m_propagate_change(struct domain *d, gfn_t 
gfn,
  d->arch.altp2m_eptp[i] == mfn_x(INVALID_MFN) )
 continue;
 
-p2m = d->arch.altp2m_p2m[i];
-p2m_lock(p2m);
-p2m_reset_altp2m(p2m);
-p2m_unlock(p2m);
+p2m_reset_altp2m(d, i, ALTP2M_RESET);
 }
 
 ret = 0;
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 V13 5/5] p2m: change_type_range: Only invalidate mapped gfns

2018-12-21 Thread Razvan Cojocaru
change_type_range() invalidates gfn ranges to lazily change the type
of a range of gfns, and also modifies the logdirty rangesets of that
p2m. At the moment, it clips both down by the hostp2m.

While this will result in correct behavior, it's not entirely efficient,
since invalidated entries outside that range will, on fault, simply be
modified back to "empty" before faulting normally again.

Separate out the calculation of the two ranges. Keep using the
hostp2m's max_mapped_pfn to clip the logdirty ranges, but use the
current p2m's max_mapped_pfn to further clip the invalidation range
for alternate p2ms.

Signed-off-by: George Dunlap 
Signed-off-by: Razvan Cojocaru 
Reviewed-by: Jan Beulich 
Tested-by: Tamas K Lengyel 
Acked-by: George Dunlap 

---
CC: George Dunlap 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Wei Liu 
CC: "Roger Pau Monné" 

---
Changes since V12:
 - Updated to apply on top of the modified patch 4.
---
 xen/arch/x86/mm/p2m.c | 58 +--
 1 file changed, 42 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 2af50af..6380bc0 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1007,42 +1007,68 @@ static void change_type_range(struct p2m_domain *p2m,
   unsigned long start, unsigned long end_exclusive,
   p2m_type_t ot, p2m_type_t nt)
 {
+unsigned long invalidate_start, invalidate_end;
 struct domain *d = p2m->domain;
 const unsigned long host_max_pfn = p2m_get_hostp2m(d)->max_mapped_pfn;
 unsigned long end = end_exclusive - 1;
+const unsigned long max_pfn = p2m->max_mapped_pfn;
 int rc = 0;
 
 /*
- * Always clip the rangeset down to the host p2m. This is probably not
- * the right behavior. This should be revisited later, but for now post a
- * warning.
+ * If we have an altp2m, the logdirty rangeset range needs to
+ * match that of the hostp2m, but for efficiency, we want to clip
+ * down the the invalidation range according to the mapped values
+ * in the altp2m. Keep track of and clip the ranges separately.
+ */
+invalidate_start = start;
+invalidate_end   = end;
+
+/*
+ * Clip down to the host p2m. This is probably not the right behavior.
+ * This should be revisited later, but for now post a warning.
  */
 if ( unlikely(end > host_max_pfn) )
 {
 printk(XENLOG_G_WARNING "Dom%d logdirty rangeset clipped to 
max_mapped_pfn\n",
d->domain_id);
-end = host_max_pfn;
+end = invalidate_end = host_max_pfn;
 }
 
 /* If the requested range is out of scope, return doing nothing. */
 if ( start > end )
 return;
 
+if ( p2m_is_altp2m(p2m) )
+invalidate_end = min(invalidate_end, max_pfn);
+
 /*
- * If all valid gfns are in the invalidation range, just do a
- * global type change. Otherwise, invalidate only the range we
- * need.
+ * If the p2m is empty, or the range is outside the currently
+ * mapped range, no need to do the invalidation; just update the
+ * rangeset.
  */
-if ( !start && end >= p2m->max_mapped_pfn )
-p2m->change_entry_type_global(p2m, ot, nt);
-else
-rc = p2m->change_entry_type_range(p2m, ot, nt, start, end);
-
-if ( rc )
+if ( invalidate_start < invalidate_end )
 {
-printk(XENLOG_G_ERR "Error %d changing Dom%d GFNs [%lx,%lx) from %d to 
%d\n",
-   rc, d->domain_id, start, end_exclusive, ot, nt);
-domain_crash(d);
+/*
+ * If all valid gfns are in the invalidation range, just do a
+ * global type change. Otherwise, invalidate only the range
+ * we need.
+ *
+ * NB that invalidate_end can't logically be >max_pfn at this
+ * point. If this changes, the == will need to be changed to
+ * >=.
+ */
+ASSERT(invalidate_end <= max_pfn);
+if ( !invalidate_start && invalidate_end == max_pfn)
+p2m->change_entry_type_global(p2m, ot, nt);
+else
+rc = p2m->change_entry_type_range(p2m, ot, nt,
+  invalidate_start, 
invalidate_end);
+if ( rc )
+{
+printk(XENLOG_G_ERR "Error %d changing Dom%d GFNs [%lx,%lx] from 
%d to %d\n",
+   rc, d->domain_id, invalidate_start, invalidate_end, ot, nt);
+domain_crash(d);
+}
 }
 
 switch ( nt )
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] 4.10.1 Xen crash and reboot

2018-12-21 Thread Andy Smith
Hello,

And again today:

(XEN) [ Xen-4.10.3-pre  x86_64  debug=n   Not tainted ]
(XEN) CPU:4
(XEN) RIP:e008:[] 
guest_4.o#sh_page_fault__guest_4+0x70b/0x2060
(XEN) RFLAGS: 00010203   CONTEXT: hypervisor (d61v1)
(XEN) rax: 00c422641dd0   rbx: 832005c49000   rcx: 81c0e060
(XEN) rdx:    rsi: 832005c49000   rdi: 00c422641dd0
(XEN) rbp: 81c0e0601880   rsp: 83207e607c38   r8:  0310
(XEN) r9:     r10:    r11: 
(XEN) r12: 83207e607ef8   r13: 00f9cea7   r14: 
(XEN) r15: 830079592000   cr0: 80050033   cr4: 00372660
(XEN) cr3: 001ffab1a001   cr2: 81c0e0601880
(XEN) fsb: 7f89c67fc700   gsb:    gss: 88007f30
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) Xen code around  
(guest_4.o#sh_page_fault__guest_4+0x70b/0x2060):
(XEN)  49 c1 e8 1e 4a 8d 2c c1 <48> 8b 4d 00 f6 c1 01 0f 84 f8 06 00 00 48 c1 e1
(XEN) Xen stack trace from rsp=83207e607c38:
(XEN)830f68748208 00c422641dd0 832005c49600 00f9cea7
(XEN)832005c49660 832005c49000 83207e607d70 83207e607d20
(XEN)0c422641 0090 82d0805802c0 000205c49000
(XEN)0008 0880 0898 82d0805802c0
(XEN)01fd58a1 01ffab1a 0208 0041
(XEN)800f9cea7825 01ff82d0 000d 82d0
(XEN)832005c49000 0001000d 83207e607fff 83207e607d20
(XEN)00a1 00c422641dd0 000f86569067 000f86544067
(XEN)000f68748067 800f9cea7925 00f87171 00f86569
(XEN)00f86544 00f68748 0005 
(XEN)82e03ff56340 832005c49000 00057ff0 
(XEN)83207e607e18 830079592000 832005c49000 83207e607ef8
(XEN)00c422641dd0 82d08034e4b0  82d080349e20
(XEN) 83207e607fff 830079592000 82d08034e5ae
(XEN)82d080354913 82d080354907 82d080354913 82d080354907
(XEN)82d080354913 82d080354907 82d080354913 82d080354907
(XEN)82d080354913 82d080354907 82d080354913 82d080354907
(XEN)82d080354913 83207e607ef8 830079592000 00c422641dd0
(XEN)832005c49000 0004  82d0802a1842
(XEN)82d080354913 82d080354907 82d080354913 82d080354907
(XEN) Xen call trace:
(XEN)[] guest_4.o#sh_page_fault__guest_4+0x70b/0x2060
(XEN)[] do_iret+0/0x1a0
(XEN)[] toggle_guest_pt+0x30/0x160
(XEN)[] do_iret+0xfe/0x1a0
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] handle_exception+0x8f/0xf9
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] handle_exception+0x8f/0xf9
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] handle_exception+0x8f/0xf9
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] handle_exception+0x8f/0xf9
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] handle_exception+0x8f/0xf9
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] handle_exception+0x8f/0xf9
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] do_page_fault+0x1a2/0x4e0
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] handle_exception+0x8f/0xf9
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] handle_exception+0x8f/0xf9
(XEN)[] handle_exception+0x9b/0xf9
(XEN)[] x86_64/entry.S#handle_exception_saved+0x68/0x94
(XEN)
(XEN) Pagetable walk from 81c0e0601880:
(XEN)  L4[0x103] = 801ffab1a063 
(XEN)  L3[0x103] = 801ffab1a063 
(XEN)  L2[0x103] = 801ffab1a063 
(XEN)  L1[0x001] =  
(XEN)
(XEN) 
(XEN) Panic on CPU 4:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=]
(XEN) Faulting linear address: 81c0e0601880
(XEN) 
(XEN)
(XEN) Reboot in five seconds...
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.

Host has now rebooted into a hypervisor with pcid=0 command line.

I note that:

(XEN) RFLAGS: 00010203   CONTEXT: hypervisor (d61v1)

and the previous incident (below):

(XEN) RFLAGS: 00010246   CONTEXT: hypervisor (d31v1)

These are the same guest.

Is it worth me moving this guest to a test host without pcid=0 to
see if it crashes it, meanwhile keeping production hosts with
pcid=0? And then putting pcid=0 on the test host to see if it
survives longer?

This will take quite a long time to gain confidence of, since the
incidents are about 2 weeks apart each time.

Thanks,
Andy

On Mon, Dec 10, 2018 at 03:58:41PM +, Andy Smith wrote:
> Hi,
> 
> Up front information:
> 
> Today one of my Xen hosts crashed with this logging on the serial:
> 
> (XEN) [ Xen-4.10.1  x86_64  debug=n   Not 

[Xen-devel] [PATCH v2] gic-vgic: skip irqs locking in gic_restore_pending_irqs()

2018-12-21 Thread Andrii Anisov
From: Andrii Anisov 

This function is called under IRQs disabled already, so drop additional
flags save and restore.

Signed-off-by: Andrii Anisov 

---

This is a half of an RFC patch [1] which relies on the already
existing code.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03293.html

Changes in v2:
Added an ASSERT() to verify the function should be called with interrupts
disabled.

---
 xen/arch/arm/gic-vgic.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 48922f5..f9098c8 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -279,11 +279,12 @@ static void gic_restore_pending_irqs(struct vcpu *v)
 int lr = 0;
 struct pending_irq *p, *t, *p_r;
 struct list_head *inflight_r;
-unsigned long flags;
 unsigned int nr_lrs = gic_get_nr_lrs();
 int lrs = nr_lrs;
 
-spin_lock_irqsave(>arch.vgic.lock, flags);
+ASSERT(!local_irq_is_enabled());
+
+spin_lock(>arch.vgic.lock);
 
 if ( list_empty(>arch.vgic.lr_pending) )
 goto out;
@@ -327,7 +328,7 @@ found:
 }
 
 out:
-spin_unlock_irqrestore(>arch.vgic.lock, flags);
+spin_unlock(>arch.vgic.lock);
 }
 
 void gic_clear_pending_irqs(struct vcpu *v)
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 3/3] x86/mm-locks: apply a bias to lock levels for control domain

2018-12-21 Thread George Dunlap


> On Dec 21, 2018, at 9:41 AM, Roger Pau Monne  wrote:
> 
> paging_log_dirty_op function takes mm locks from a subject domain and
> then attempts to perform copy to operations against the caller domain
> in order to copy the result of the hypercall into the caller provided
> buffer.
> 
> This works fine when the caller is a non-paging domain, but triggers a
> lock order panic when the caller is a paging domain due to the fact
> that at the point where the copy to operation is performed the subject
> domain paging lock is locked, and the copy operation requires
> locking the caller p2m lock which has a lower level.
> 
> Fix this limitation by adding a bias to the level of control domain mm
> locks, so that the lower control domain mm lock always has a level
> greater than the higher unprivileged domain lock level. This allows
> locking the subject domain mm locks and then locking the control
> domain mm locks, while keeping the same lock ordering and the changes
> mostly confined to mm-locks.h.
> 
> Note that so far only this flow (locking a subject domain locks and
> then the control domain ones) has been identified, but not all
> possible code paths have been inspected. Hence this solution attempts
> to be a non-intrusive fix for the problem at hand, without discarding
> further changes in the future if other valid code paths are found that
> require more complex lock level ordering.
> 
> Signed-off-by: Roger Pau Monné 

Reviewed-by: George Dunlap 

…but given the nature of the change, I’d like to see it pass at least one 
ad-hoc osstest run before it gets checked in.  (That probably means you’ll have 
to chase someone else to do that.)

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V12 4/5] p2m: Always use hostp2m when clipping rangesets

2018-12-21 Thread Razvan Cojocaru
On 12/21/18 8:24 PM, George Dunlap wrote:
> 
>> On Dec 20, 2018, at 5:38 PM, Razvan Cojocaru  
>> wrote:
>>
>> The logdirty rangesets of the altp2ms need to be kept in sync with the
>> hostp2m. This means when iterating through the altp2ms, we need to
>> use the host p2m to clip the rangeset, not the indiviual altp2m's
>> value.
>>
>> This change also:
>>
>> - Documents that the end is non-inclusive
>>
>> - Calculates an "inclusive" value for the end once, rather than
>>  open-coding the modification, and (worse) back-modifying updates so
>>  that the calculation ends up correct
>>
>> - Clarifies the logic deciding whether to call
>>  change_entry_type_global() or change_entry_type_range()
>>
>> - Handles the case where start >= hostp2m->max_mapped_pfn
>>
>> Signed-off-by: George Dunlap 
>> Signed-off-by: Razvan Cojocaru 
>> Tested-by: Tamas K Lengyel 
>>
>> ---
>> CC: George Dunlap 
>> CC: Jan Beulich 
>> CC: Andrew Cooper 
>> CC: Wei Liu 
>> CC: "Roger Pau Monné" 
>>
>> ---
>> Changes since V11:
>> - Added end_exclusive, to avoid modifying function arguments.
>> - Moved the if ( start >= host_max_pfn ) warning below the code
>>   doing the clipping.
> 
> I’m afraid you misunderstood what I meant.
> 
> Why would we want to give a warning only if start >= host_max_pfn?  And if 
> the *start* was above host_max_pfn, why would we tell people we were 
> “clipping” the rangeset, when in fact we drop it entirely in that case?
> 
> Rather, I meant this:
> 
> —
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index f552fd5199..2af50af2bd 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1018,11 +1018,11 @@ static void change_type_range(struct p2m_domain *p2m,
>   * warning.
>   */
>  if ( unlikely(end > host_max_pfn) )
> -end = host_max_pfn;
> -
> -if ( start >= host_max_pfn )
> +{
>  printk(XENLOG_G_WARNING "Dom%d logdirty rangeset clipped to 
> max_mapped_pfn\n",
> d->domain_id);
> +end = host_max_pfn;
> +}
> 
>  /* If the requested range is out of scope, return doing nothing. */
>  if ( start > end )
> —
> 
> Everything else looks good.

Apologies, I'll resend.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 11/11] dm_depriv: Mark `UID cleanup` as completed

2018-12-21 Thread George Dunlap
Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
CC: Ian Jackson 
CC: Wei Liu 
---
 docs/designs/qemu-deprivilege.md | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/designs/qemu-deprivilege.md b/docs/designs/qemu-deprivilege.md
index f7444a434d..81a5f5c05d 100644
--- a/docs/designs/qemu-deprivilege.md
+++ b/docs/designs/qemu-deprivilege.md
@@ -128,26 +128,6 @@ are specified; this does not apply to QEMU running as a 
Xen DM.
 
 '''Tested''': Not tested
 
-# Restrictions / improvements still to do
-
-This lists potential restrictions still to do.  It is meant to be
-listed in order of ease of implementation, with low-hanging fruit
-first.
-
-### Further RLIMITs
-
-RLIMIT_AS limits the total amount of memory; but this includes the
-virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
-fiddles with this; it would be straightforward to make it *set* the
-rlimit to what it thinks a sensible limit is.
-
-RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
-threads for some devices, so this would require some thought.
-
-Other things that would take some cleverness / changes to QEMU to
-utilize due to ordering constrants:
- - RLIMIT_NOFILES (after all necessary files are opened)
-
 ### libxl UID cleanup
 
 '''Description''': Domain IDs are reused, and thus restricted UIDs are
@@ -223,6 +203,26 @@ Since this will kill all other `reaper_uid` processes as 
well, we must
 either allocate a separate `reaper_uid` per domain, or use locking to
 ensure that only one killing process is active at a time.
 
+# Restrictions / improvements still to do
+
+This lists potential restrictions still to do.  It is meant to be
+listed in order of ease of implementation, with low-hanging fruit
+first.
+
+### Further RLIMITs
+
+RLIMIT_AS limits the total amount of memory; but this includes the
+virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
+fiddles with this; it would be straightforward to make it *set* the
+rlimit to what it thinks a sensible limit is.
+
+RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
+threads for some devices, so this would require some thought.
+
+Other things that would take some cleverness / changes to QEMU to
+utilize due to ordering constrants:
+ - RLIMIT_NOFILES (after all necessary files are opened)
+
 ## libxl: Treat QMP connection as untrusted
 
 '''Description''': Currently libxl talks with QEMU via QMP; but its
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] gic-vgic: skip irqs locking in gic_restore_pending_irqs()

2018-12-21 Thread Julien Grall
Hi,

Sorry for the formatting.

Please address my comments from the previous version.

Cheers,


On Fri, 21 Dec 2018, 18:18 Andrii Anisov,  wrote:

> From: Andrii Anisov 
>
> This function is called under IRQs disabled already, so drop additional
> flags save and restore.
>
> Signed-off-by: Andrii Anisov 
>
> ---
>
> This is a half of an RFC patch [1] which relies on the already
> existing code.
>
> [1]
> https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03293.html
>
> ---
>  xen/arch/arm/gic-vgic.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
> index 48922f5..34179c0 100644
> --- a/xen/arch/arm/gic-vgic.c
> +++ b/xen/arch/arm/gic-vgic.c
> @@ -279,11 +279,10 @@ static void gic_restore_pending_irqs(struct vcpu *v)
>  int lr = 0;
>  struct pending_irq *p, *t, *p_r;
>  struct list_head *inflight_r;
> -unsigned long flags;
>  unsigned int nr_lrs = gic_get_nr_lrs();
>  int lrs = nr_lrs;
>
> -spin_lock_irqsave(>arch.vgic.lock, flags);
> +spin_lock(>arch.vgic.lock);
>
>  if ( list_empty(>arch.vgic.lr_pending) )
>  goto out;
> @@ -327,7 +326,7 @@ found:
>  }
>
>  out:
> -spin_unlock_irqrestore(>arch.vgic.lock, flags);
> +spin_unlock(>arch.vgic.lock);
>  }
>
>  void gic_clear_pending_irqs(struct vcpu *v)
> --
> 2.7.4
>
>
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 08/11] libxl: Kill QEMU by uid when possible

2018-12-21 Thread Ian Jackson
George Dunlap writes ("[PATCH v4 08/11] libxl: Kill QEMU by uid when possible"):
> The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
> one specific domain ID.  This domain ID will probably eventually be
> used again.  It is therefore necessary to make absolutely sure that a
> rogue QEMU process cannot hang around after its domain has exited.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V12 4/5] p2m: Always use hostp2m when clipping rangesets

2018-12-21 Thread George Dunlap

> On Dec 20, 2018, at 5:38 PM, Razvan Cojocaru  
> wrote:
> 
> The logdirty rangesets of the altp2ms need to be kept in sync with the
> hostp2m. This means when iterating through the altp2ms, we need to
> use the host p2m to clip the rangeset, not the indiviual altp2m's
> value.
> 
> This change also:
> 
> - Documents that the end is non-inclusive
> 
> - Calculates an "inclusive" value for the end once, rather than
>  open-coding the modification, and (worse) back-modifying updates so
>  that the calculation ends up correct
> 
> - Clarifies the logic deciding whether to call
>  change_entry_type_global() or change_entry_type_range()
> 
> - Handles the case where start >= hostp2m->max_mapped_pfn
> 
> Signed-off-by: George Dunlap 
> Signed-off-by: Razvan Cojocaru 
> Tested-by: Tamas K Lengyel 
> 
> ---
> CC: George Dunlap 
> CC: Jan Beulich 
> CC: Andrew Cooper 
> CC: Wei Liu 
> CC: "Roger Pau Monné" 
> 
> ---
> Changes since V11:
> - Added end_exclusive, to avoid modifying function arguments.
> - Moved the if ( start >= host_max_pfn ) warning below the code
>   doing the clipping.

I’m afraid you misunderstood what I meant.

Why would we want to give a warning only if start >= host_max_pfn?  And if the 
*start* was above host_max_pfn, why would we tell people we were “clipping” the 
rangeset, when in fact we drop it entirely in that case?

Rather, I meant this:

—
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index f552fd5199..2af50af2bd 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1018,11 +1018,11 @@ static void change_type_range(struct p2m_domain *p2m,
  * warning.
  */
 if ( unlikely(end > host_max_pfn) )
-end = host_max_pfn;
-
-if ( start >= host_max_pfn )
+{
 printk(XENLOG_G_WARNING "Dom%d logdirty rangeset clipped to 
max_mapped_pfn\n",
d->domain_id);
+end = host_max_pfn;
+}

 /* If the requested range is out of scope, return doing nothing. */
 if ( start > end )
—

Everything else looks good.

 -George
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 10/11] libxl: Introduce specific username to be used as a reaper

2018-12-21 Thread George Dunlap
Untrusted device models must be killed by uid rather than by pid for
safety.  To do this reliably, we need another uid, not used for any
other purpose, from which to make the kill system call.

When using xen-qemuuser-range-base, we can repurpose
xen-qemuuser-range-base itself as a UID from which to kill other
devicemodel uids (since domain ID 0 should never have a device model
associated with it).

However, we'd like people to be able to use the device_model_user
feature without also defining xen-qemuuser-range-base (which requires
the ability to 'reserve' 32k+ user IDs).

To that end, introduce the xen-qemuuser-reaper id.  When killing by
UID, first look for and use that ID if available; then fall back to
xen-qemuuser-range-base.

Document the new call in docs/features/qemu-deprivilege.pandoc.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v3:
- New in this version

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 docs/features/qemu-deprivilege.pandoc |  9 ++
 tools/libxl/libxl_dm.c| 40 +--
 tools/libxl/libxl_internal.h  |  1 +
 3 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/docs/features/qemu-deprivilege.pandoc 
b/docs/features/qemu-deprivilege.pandoc
index ce21a60ef7..eb05981a83 100644
--- a/docs/features/qemu-deprivilege.pandoc
+++ b/docs/features/qemu-deprivilege.pandoc
@@ -77,12 +77,21 @@ And then in your config file, the following line:
 
 device_model_user="xen-qemuuser-c6-01"
 
+If you use this method, you should also allocate one "reaper" user to
+be used for killing device models:
+
+adduser --system --no-create-home --group xen-qemuuser-reaper
+
 NOTE: It is important when using `device_model_user` that EACH VM HAVE
 A SEPARATE UID, and that none of these UIDs map to root.  xl will
 throw an error a uid maps to zero, but not if multiple VMs have the
 same uid.  Multiple VMs with the same device model uid will cause
 problems.
 
+It is also important that `xen-qemuuser-reaper` not have any processes
+associated with it, as they will be destroyed when deprivileged qemu
+processes are destroyed.
+
 ## Domain config changes
 
 The core domain config change is to add the following line to the
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 922fa70f11..f5322e3f45 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -268,24 +268,37 @@ static int libxl__get_reaper_uid(libxl__gc *gc, uid_t 
*reaper_uid)
 struct passwd *user_base, user_pwbuf;
 int rc;
 
+rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_REAPER,
+ _pwbuf, _base);
+/* 
+ * Either there was an error, or we found a suitable user; stop
+ * looking
+ */
+if (rc || user_base)
+goto out;
+
 rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  _pwbuf, _base);
-if (rc)
-return rc;
+if (rc || user_base)
+goto out;
 
-if (!user_base) {
-LOG(WARN, "Couldn't find uid for reaper process");
-return ERROR_INVAL;
-}
-
-if (user_base->pw_uid == 0) {
-LOG(ERROR, "UID for reaper process maps to root!");
-return ERROR_INVAL;
+LOG(WARN, "Couldn't find uid for reaper process");
+rc = ERROR_INVAL;
+
+ out:
+/* First check to see if the discovered user maps to root */
+if (!rc) {
+if (user_base->pw_uid == 0) {
+LOG(ERROR, "UID for reaper process maps to root!");
+rc = ERROR_INVAL;
+}
 }
 
-*reaper_uid = user_base->pw_uid;
+/* If everything is OK, set reaper_uid as appropriate */
+if (!rc)
+*reaper_uid = user_base->pw_uid;
 
-return 0;
+return rc;
 }
 
 const char *libxl__domain_device_model(libxl__gc *gc,
@@ -2908,9 +2921,6 @@ static int 
get_reaper_lock_and_uid(libxl__destroy_devicemodel_state *ddms,
 
 /*
  * Get reaper_uid.  If we can't find such a uid, return an error.
- *
- * FIXME: This means that domain destruction will fail if
- * device_model_user is set but QEMU_USER_RANGE_BASE doesn't exist.
  */
 return libxl__get_reaper_uid(gc, reaper_uid);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 8323c7924d..67f3f6ac76 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4419,6 +4419,7 @@ _hidden int libxl__read_sysfs_file_contents(libxl__gc *gc,
 #define LIBXL_QEMU_USER_PREFIX "xen-qemuuser"
 #define LIBXL_QEMU_USER_SHARED LIBXL_QEMU_USER_PREFIX"-shared"
 #define LIBXL_QEMU_USER_RANGE_BASE LIBXL_QEMU_USER_PREFIX"-range-base"
+#define LIBXL_QEMU_USER_REAPER LIBXL_QEMU_USER_PREFIX"-reaper"
 
 static inline bool libxl__acpi_defbool_val(const libxl_domain_build_info 
*b_info)
 {
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 02/11] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN)

2018-12-21 Thread George Dunlap
QEMU_USER_BASE allows a user to specify the UID to use when running
the devicemodel for a specific domain number.  Unfortunately, this is
not really practical: It requires nearly 32,000 entries in
/etc/passwd.  QEMU_USER_RANGE_BASE is much more practical.

Remove support for QEMU_USER_BASE.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
NB that I've chosen not to update the xl.cfg man page at this time; it
needs a lot of other updates as well, which would be easier to do all
at once at the end.

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c   | 16 
 tools/libxl/libxl_internal.h |  1 -
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index bbcbc94b6c..6024d4b7b8 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -138,13 +138,6 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 return 0;
 }
 
-user = GCSPRINTF("%s%d", LIBXL_QEMU_USER_BASE, guest_domid);
-ret = userlookup_helper_getpwnam(gc, user, _pwbuf, 0);
-if (ret < 0)
-return ret;
-if (ret > 0)
-goto end_search;
-
 ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  _pwbuf, _base);
 if (ret < 0)
@@ -174,15 +167,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 if (ret < 0)
 return ret;
 if (ret > 0) {
-LOGD(WARN, guest_domid, "Could not find user %s%d, falling back to %s",
- LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED);
+LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
+ LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
 goto end_search;
 }
 
 LOGD(ERROR, guest_domid,
- "Could not find user %s%d or %s or range base pseudo-user %s, cannot 
restrict",
- LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED,
- LIBXL_QEMU_USER_RANGE_BASE);
+ "Could not find user %s or range base pseudo-user %s, cannot 
restrict",
+ LIBXL_QEMU_USER_SHARED, LIBXL_QEMU_USER_RANGE_BASE);
 return ERROR_INVAL;
 
 end_search:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index c4a43bd0b7..b147f3803c 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4387,7 +4387,6 @@ _hidden int libxl__read_sysfs_file_contents(libxl__gc *gc,
 int *datalen_r);
 
 #define LIBXL_QEMU_USER_PREFIX "xen-qemuuser"
-#define LIBXL_QEMU_USER_BASE   LIBXL_QEMU_USER_PREFIX"-domid"
 #define LIBXL_QEMU_USER_SHARED LIBXL_QEMU_USER_PREFIX"-shared"
 #define LIBXL_QEMU_USER_RANGE_BASE LIBXL_QEMU_USER_PREFIX"-range-base"
 
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 03/11] libxl: Clean up userlookup_helper_getpw* helper

2018-12-21 Thread George Dunlap
Bring conventions more in line with libxl__xs_read_checked():
- If found, return 0 and set pointer to non-NULL
- If not found, return 0 and set pointer to NULL
- On error, return libxl-style error number.

Update documentation to match.

Use CODING_STYLE compliant `r` rather than `ret`.

On error, log the error code before returning instead of discarding
it.

Now that it only returns 0 or errno, update caller error checks to be
`if (ret)` rather than `if (ret < 0)`.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v3:
- Used more idiomatic `if (ret)` rather than `if (ret < 0)`

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 35 ---
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 6024d4b7b8..67204b94c2 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -72,7 +72,13 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
  *  userlookup_helper_getpwuid(libxl__gc*, uid_t uid,
  * struct passwd **pwd_r);
  *
- *  returns 1 if the user was found, 0 if it was not, -1 on error
+ *  If the user is found, return 0 and set *pwd_r to the appropriat
+ *  value.
+ *
+ *  If the user is not found but there are no errors, return 0
+ *  and set *pwd_r to NULL.
+ *
+ *  On error, return a libxl-style error code.
  */
 #define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF) \
 static int userlookup_helper_##NAME(libxl__gc *gc,  \
@@ -83,7 +89,7 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
 struct STRUCTNAME *resultp = NULL;  \
 char *buf = NULL;   \
 long buf_size;  \
-int ret;\
+int r;  \
 \
 buf_size = sysconf(SYSCONF);\
 if (buf_size < 0) { \
@@ -95,17 +101,16 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
 \
 while (1) { \
 buf = libxl__realloc(gc, buf, buf_size);\
-ret = NAME##_r(spec, resultbuf, buf, buf_size, );   \
-if (ret == ERANGE) {\
+r = NAME##_r(spec, resultbuf, buf, buf_size, ); \
+if (r == ERANGE) {  \
 buf_size += 128;\
 continue;   \
 }   \
-if (ret != 0)   \
+if (r != 0) {   \
+LOGEV(ERROR, r, "Looking up username/uid with " #NAME); \
 return ERROR_FAIL;  \
-if (resultp != NULL) {  \
-if (out) *out = resultp;\
-return 1;   \
 }   \
+*out = resultp; \
 return 0;   \
 }   \
 }
@@ -140,16 +145,16 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 
 ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  _pwbuf, _base);
-if (ret < 0)
+if (ret)
 return ret;
-if (ret > 0) {
+if (user_base) {
 struct passwd *user_clash, user_clash_pwbuf;
 uid_t intended_uid = user_base->pw_uid + guest_domid;
 ret = userlookup_helper_getpwuid(gc, intended_uid,
  _clash_pwbuf, _clash);
-if (ret < 0)
+if (ret)
 return ret;
-if (ret > 0) {
+if (user_clash) {
 LOGD(ERROR, guest_domid,
  "wanted to use uid %ld (%s + %d) but that is user %s !",
  (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
@@ -163,10 +168,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 }
 
 user = LIBXL_QEMU_USER_SHARED;
-ret = userlookup_helper_getpwnam(gc, user, _pwbuf, 0);
-if (ret < 0)
+ret = userlookup_helper_getpwnam(gc, user, _pwbuf, 

[Xen-devel] [PATCH v4 05/11] libxl: Do root checks once in libxl__domain_get_device_model_uid

2018-12-21 Thread George Dunlap
At the moment, we check for equivalence to literal "root" before
deciding whether to add the `runas` command-line option to QEMU.  This
is unsatisfactory for several reasons.

First, just because the string doesn't match "root" doesn't mean the
final uid won't end up being zero; in particular, the range_base
calculations may end up producing "0:NNN", which would be root in any
case.

Secondly, it's almost certainly a configuration error if the resulting
uid ends up to be zero; rather than silently do what was specified but
probably not intended, throw an error.

To fix this, check for root once in
libxl__domain_get_device_model_uid.  If the result is root, return an
error; if appropriate, set `runas`.

After that, assume that the presence of state->dm_runas implies that a
`runas` argument should be constructed.

One side effect of this is to check whether device_model_user exists
before passing it to qemu, resulting in better error reporting.

While we're here:
- Refactor the function to use the "goto out" idiom
- Use 'rc' rather than 'ret', in line with CODING_STYLE

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
Let the record show that I personally think the `goto out` pattern
here would be improved by having a "root check" label.

v3:
- Update to use more idiomatic `if (rc)`
- Make "inputs" to goto more clear
- Initialize intended_uid to -1 sentinel
- Document expectations when jumping to 'out'
- Don't return directly after non-trivial initial checks
- Always explicitly set rc to 0 when jumping to out, even if we just
  checked that it was zero a few lines earlier
- Work around one-goto limitation by having multiple 'rc' checks.
- Return generic ERROR_INVAL rather than in accurate ERROR_DEVICE_EXISTS

v2:
- Refactor to use `out` rather than multiple labels
- Only check for root once
- Use 'out' rather than direct returns for errors (only use direct returns
  for early `succeed-without-setting-runas` paths)
- Use `rc` rather than `ret` to more closely align with CODING_STYLE
- Fill out comments about the cases we're handling
- Return ERROR_DEVICE_EXISTS rather than ERROR_FAIL if there's another
  username that maps to our calculated uid
- Report an error if the specified device_model_user doesn't exist

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 107 -
 1 file changed, 85 insertions(+), 22 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 67204b94c2..d73bbb6b06 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -126,65 +126,128 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 const libxl_domain_build_info *b_info = >guest_config->b_info;
 
 struct passwd *user_base, user_pwbuf;
-int ret;
+int rc;
 char *user;
+uid_t intended_uid = -1;
 
 /* Only qemu-upstream can run as a different uid */
 if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
 return 0;
 
+/*
+ * From this point onward, all paths should go through the `out`
+ * label.  The invariants should be:
+ * - rc may be 0, or an error code.
+ * - if rc is an error code, user and intended_uid are ignored.
+ * - if rc is 0, user may be set or not set.
+ * - if user is set, then intended_uid must be set to a UID matching
+ *   the username `user`.  This will be checked for root (0).
+ */
+
+/*
+ * If device_model_user is present, set `-runas` even if
+ * dm_restrict isn't in use
+ */
 user = b_info->device_model_user;
-if (user)
-goto end_search;
+if (user) {
+rc = userlookup_helper_getpwnam(gc, user, _pwbuf, _base);
+if (rc)
+goto out;
+
+if (!user_base) {
+LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
+ user);
+rc = ERROR_INVAL;
+goto out;
+}
+
+intended_uid = user_base->pw_uid;
+rc = 0;
+goto out;
+}
 
+/*
+ * If dm_restrict isn't set, and we don't have a specified user, don't
+ * bother setting a `-runas` parameter.
+ */
 if (!libxl_defbool_val(b_info->dm_restrict)) {
 LOGD(DEBUG, guest_domid,
  "dm_restrict disabled, starting QEMU as root");
-return 0;
+user = NULL; /* Should already be null, but just in case */
+goto out;
 }
 
-ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+/*
+ * dm_restrict is set, but device_model_user isn't set; look for
+ * QEMU_USER_BASE_RANGE
+ */
+rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  _pwbuf, _base);
-if (ret)
-return ret;
+if (rc)
+goto out;
 if (user_base) {
 struct passwd *user_clash, user_clash_pwbuf;
-uid_t intended_uid = user_base->pw_uid + guest_domid;
-ret = 

[Xen-devel] [PATCH v4 07/11] libxl: Make killing of device model asynchronous

2018-12-21 Thread George Dunlap
Or at least, give it an asynchronous interface so that we can make it
actually asynchronous in subsequent patches.

Create state structures and callback function signatures.  Add the
state structure to libxl__destroy_domid_state.  Break
libxl__destroy_domid down into two functions.

No functional change intended.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v2:
- Note that libxl__devicemodel_destroy_cb may be called reentrantly

NB that I retain the comment before libxl__destroy_device_model, in
spite of the fact that it looks "pointless", to separate it logically
from the previous prototype.

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c   | 11 +++---
 tools/libxl/libxl_domain.c   | 40 
 tools/libxl/libxl_internal.h | 20 --
 3 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 450433452d..ca59df33fe 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2696,19 +2696,24 @@ out:
 return rc;
 }
 
-int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
+void libxl__destroy_device_model(libxl__egc *egc,
+ libxl__destroy_devicemodel_state *ddms)
 {
+STATE_AO_GC(ddms->ao);
 int rc;
+int domid = ddms->domid;
 char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
+
 if (!xs_rm(CTX->xsh, XBT_NULL, path))
 LOGD(ERROR, domid, "xs_rm failed for %s", path);
+
 /* We should try to destroy the device model anyway. */
 rc = kill_device_model(gc,
   GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
-
+
 libxl__qmp_cleanup(gc, domid);
 
-return rc;
+ddms->callback(egc, ddms, rc);
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index d46b97dedf..0ce1ba1327 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1008,6 +1008,10 @@ static void destroy_finish_check(libxl__egc *egc,
 }
 
 /* Callbacks for libxl__destroy_domid */
+static void dm_destroy_cb(libxl__egc *egc,
+  libxl__destroy_devicemodel_state *ddms,
+  int rc);
+
 static void devices_destroy_cb(libxl__egc *egc,
libxl__devices_remove_state *drs,
int rc);
@@ -1066,16 +1070,18 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 if (rc < 0) {
 LOGEVD(ERROR, rc, domid, "xc_domain_pause failed");
 }
+
 if (dm_present) {
-if (libxl__destroy_device_model(gc, domid) < 0)
-LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+dis->ddms.ao = ao;
+dis->ddms.domid = domid;
+dis->ddms.callback = dm_destroy_cb;
+
+libxl__destroy_device_model(egc, >ddms);
+return;
+} else {
+dm_destroy_cb(egc, >ddms, 0);
+return;
 }
-dis->drs.ao = ao;
-dis->drs.domid = domid;
-dis->drs.callback = devices_destroy_cb;
-dis->drs.force = 1;
-libxl__devices_destroy(egc, >drs);
-return;
 
 out:
 assert(rc);
@@ -1083,6 +1089,24 @@ out:
 return;
 }
 
+static void dm_destroy_cb(libxl__egc *egc,
+  libxl__destroy_devicemodel_state *ddms,
+  int rc)
+{
+libxl__destroy_domid_state *dis = CONTAINER_OF(ddms, *dis, ddms);
+STATE_AO_GC(dis->ao);
+uint32_t domid = dis->domid;
+
+if (rc < 0)
+LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+
+dis->drs.ao = ao;
+dis->drs.domid = domid;
+dis->drs.callback = devices_destroy_cb;
+dis->drs.force = 1;
+libxl__devices_destroy(egc, >drs);
+}
+
 static void devices_destroy_cb(libxl__egc *egc,
libxl__devices_remove_state *drs,
int rc)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b147f3803c..f9e0bf6578 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1705,8 +1705,6 @@ _hidden int 
libxl__wait_for_device_model_deprecated(libxl__gc *gc,
   void *userdata),
 void *check_callback_userdata);
 
-_hidden int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid);
-
 _hidden const libxl_vnc_info *libxl__dm_vnc(const libxl_domain_config *g_cfg);
 
 _hidden char *libxl__abs_path(libxl__gc *gc, const char *s, const char *path);
@@ -3672,6 +3670,7 @@ extern const struct libxl_device_type *device_type_tbl[];
 
 typedef struct libxl__domain_destroy_state libxl__domain_destroy_state;
 typedef struct libxl__destroy_domid_state libxl__destroy_domid_state;
+typedef struct libxl__destroy_devicemodel_state 
libxl__destroy_devicemodel_state;
 typedef struct libxl__devices_remove_state 

[Xen-devel] [PATCH v4 04/11] dm_depriv: Describe expected usage of device_model_user parameter

2018-12-21 Thread George Dunlap
A number of subsequent patches rely on as-yet undefined behavior for
what the `device_model_user` parameter does.  Rather than implement it
incorrectly (or randomly), or remove the feature, describe an expected
usage for the feature.  Further patches will make decisions based on
this expected usage.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v3:
- Fix a minor typo

v2:
- Remove stale comment about device_model_user not being ready

RFC: As we'll see in a later patch, this implementation is still
incomplete: we need a `reaper` uid from which to kill uids.

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 docs/features/qemu-deprivilege.pandoc | 17 +
 tools/libxl/libxl_types.idl   |  1 -
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/features/qemu-deprivilege.pandoc 
b/docs/features/qemu-deprivilege.pandoc
index f941525189..ce21a60ef7 100644
--- a/docs/features/qemu-deprivilege.pandoc
+++ b/docs/features/qemu-deprivilege.pandoc
@@ -66,6 +66,23 @@ this, create a user named `xen-qemuuser-shared`; for example:
 
 adduser --no-create-home --system xen-qemuuser-shared
 
+A final way to set up a separate process for qemus is to allocate one
+UID per VM, and set the UID in the domain config file with the
+`device_model_user` argument.  For example, suppose you have a VM
+named `c6-01`.  You might do the following:
+
+adduser --system --no-create-home --group xen-qemuuser-c6-01
+
+And then in your config file, the following line:
+
+device_model_user="xen-qemuuser-c6-01"
+
+NOTE: It is important when using `device_model_user` that EACH VM HAVE
+A SEPARATE UID, and that none of these UIDs map to root.  xl will
+throw an error a uid maps to zero, but not if multiple VMs have the
+same uid.  Multiple VMs with the same device model uid will cause
+problems.
+
 ## Domain config changes
 
 The core domain config change is to add the following line to the
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 51cf06a3a2..141c46e42a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -495,7 +495,6 @@ libxl_domain_build_info = Struct("domain_build_info",[
 ("device_model", string),
 ("device_model_ssidref", uint32),
 ("device_model_ssid_label", string),
-# device_model_user is not ready for use yet
 ("device_model_user", string),
 
 # extra parameters pass directly to qemu, NULL terminated
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 09/11] libxl: Kill QEMU with "reaper" ruid

2018-12-21 Thread George Dunlap
Using kill(-1) to killing an untrusted dm process with the real uid
equal to the dm_uid isn't guaranteed to succeed: the process in
question may be able to kill the reaper process after the setresuid()
and before the kill().

Instead, set the real uid to the QEMU user for domain 0
(QEMU_USER_RANGE_BASE + 0).  The reaper process will still be able to
kill the dm process, but not vice versa.

This, in turn, requires locking to make sure that only one reaper
process is using that uid at a time; otherwise one reaper process may
kill the other reaper process.

Create a lockfile in RUNDIR/dm-reaper-lock, and grab the lock before
executing kill.

In the event that we can't get the lock for some reason, go ahead with
the kill using dm_uid for both real and effective UIDs.  This isn't
guaranteed to work, but it's no worse than not trying to kill the
process at all.

NB that this effectively requires admins using device_model_user to
also define xen_qemuuser_range_base; this will be addressed in
subsequent patches.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v4:
- Add a comment to the pot of get_reaper_lock_and_uid() mentioning
  that it must be called from a subprocess.

v3:
- Have libxl__get_reaper_uid return an error if a suitable uid is not found.
- Explicitly set reaper_uid to dm_uid if get_reaper_lock_and_uid()
  returns an error, rather than relying on reaper_uid being unchanged
  on failure.
- Open the lockfile 0644 rather than 0666.
- Remove bogus comment.

v2:
- Port over previous changes
- libxl__get_reaper_uid() won't set errno, use LOG rather than LOGE.
- Accumulate error and return for all failures
- Move flock() outside of the condition.  Also fix EINTR check (check
  errno rather than return value).
- Add a comment explaining why we return an error even if the kill()
  succeeds
- Move locking to a separate function to minimize gotos
- Refactor libxl__get_reaper_id to take a pointer for reaper_uid;
  return only success/failure.  Also return EINVAL if reaper_uid would
  resolve to 0.
- Handle "reaper_uid not found" specially; note issue with
  device_model_user feature
- Assert that final reaper_uid != 0

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 122 +
 1 file changed, 112 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index f2a21cf744..922fa70f11 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -259,6 +259,35 @@ out:
 return rc;
 }
 
+/*
+ * Look up "reaper UID".  If present and non-root, returns 0 and sets
+ * reaper_uid.  Otherwise returns libxl-style error.
+ */
+static int libxl__get_reaper_uid(libxl__gc *gc, uid_t *reaper_uid)
+{
+struct passwd *user_base, user_pwbuf;
+int rc;
+
+rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+ _pwbuf, _base);
+if (rc)
+return rc;
+
+if (!user_base) {
+LOG(WARN, "Couldn't find uid for reaper process");
+return ERROR_INVAL;
+}
+
+if (user_base->pw_uid == 0) {
+LOG(ERROR, "UID for reaper process maps to root!");
+return ERROR_INVAL;
+}
+
+*reaper_uid = user_base->pw_uid;
+
+return 0;
+}
+
 const char *libxl__domain_device_model(libxl__gc *gc,
const libxl_domain_build_info *info)
 {
@@ -2834,37 +2863,110 @@ out:
 return;
 }
 
+/* 
+ * Note that this attempts to grab a file lock, so must be called from
+ * a sub-process.
+ */
+static int get_reaper_lock_and_uid(libxl__destroy_devicemodel_state *ddms,
+   uid_t *reaper_uid)
+{
+STATE_AO_GC(ddms->ao);
+int domid = ddms->domid;
+int r;
+const char * lockfile;
+int fd;
+
+/* Try to lock the "reaper uid" */
+lockfile = GCSPRINTF("%s/dm-reaper-lock", libxl__run_dir_path());
+
+/*
+ * NB that since we've just forked, we can't have any
+ * threads; so we don't need the libxl__carefd
+ * infrastructure here.
+ */
+fd = open(lockfile, O_RDWR|O_CREAT, 0644);
+if (fd < 0) {
+LOGED(ERROR, domid,
+  "unexpected error while trying to open lockfile %s, errno=%d",
+  lockfile, errno);
+return ERROR_FAIL;
+}
+
+/* Try to lock the file, retrying on EINTR */
+for (;;) {
+r = flock(fd, LOCK_EX);
+if (!r)
+break;
+if (errno != EINTR) {
+/* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
+LOGED(ERROR, domid,
+  "unexpected error while trying to lock %s, fd=%d, errno=%d",
+  lockfile, fd, errno);
+return ERROR_FAIL;
+}
+}
+
+/*
+ * Get reaper_uid.  If we can't find such a uid, return an error.
+ *
+ * FIXME: This means that domain destruction will fail if
+ * device_model_user is set but QEMU_USER_RANGE_BASE doesn't exist.
+ */
+

[Xen-devel] [PATCH v4 06/11] libxl: Move qmp cleanup into devicemodel destroy function

2018-12-21 Thread George Dunlap
Removing the qmp connection is logically part of the device model
destruction; having the caller destroy it is a mild layering
violation.

Move libxl__qmp_cleanup() into libxl__destroy_device_model().  This
will make it easier when we make devicemodel destruction asynchronous.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 9 +++--
 tools/libxl/libxl_domain.c | 2 --
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index d73bbb6b06..450433452d 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2698,12 +2698,17 @@ out:
 
 int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
 {
+int rc;
 char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
 if (!xs_rm(CTX->xsh, XBT_NULL, path))
 LOGD(ERROR, domid, "xs_rm failed for %s", path);
 /* We should try to destroy the device model anyway. */
-return kill_device_model(gc,
-GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+rc = kill_device_model(gc,
+  GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+
+libxl__qmp_cleanup(gc, domid);
+
+return rc;
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index 3377bba994..d46b97dedf 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1069,8 +1069,6 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 if (dm_present) {
 if (libxl__destroy_device_model(gc, domid) < 0)
 LOGD(ERROR, domid, "libxl__destroy_device_model failed");
-
-libxl__qmp_cleanup(gc, domid);
 }
 dis->drs.ao = ao;
 dis->drs.domid = domid;
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 08/11] libxl: Kill QEMU by uid when possible

2018-12-21 Thread George Dunlap
The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
one specific domain ID.  This domain ID will probably eventually be
used again.  It is therefore necessary to make absolutely sure that a
rogue QEMU process cannot hang around after its domain has exited.

Killing QEMU by pid is insufficient in this situation, because QEMU
may be able to fork() to escape killing.  It is surprisingly tricky to
kill a process which can call fork() without races; the only reliable
way is to use kill(-1) to kill all processes with a given uid.

We can use this method only when we're sure that there's only one QEMU
instance per uid.  Add a dm_uid into the domain_build_state struct,
and set it in libxl__domain_get_device_model_uid() when it's safe to
kill by UID.  Store this in xenstore next to device-model-pid.

On domain destroy, check to see if device-model-uid is present in
xenstore.  If so, fork off a reaper process, setuid to that uid, and
do kill(-9) to kill all uids of that type.  Otherwise, carry on
destroying by pid.

While we're here, make libxl__destroy_device_model() consistently:
 1. Return an error when anything fails
 2. But continue to do as much clean-up as possible

NOTE that this is not yet completely safe: with ruid == dm_uid, the
device model may be able to kill(-9) the 'reaper' process before the
reaper process can kill it.  Further patches will address this.

Signed-off-by: George Dunlap 
---
v4:
- Add missing goto out
- Check that WEXITSTATUS() <= 125 before using it to set rc
- Remove stray blank line

v3:
- rename dm_uid to dm_kill_uid to indicate that it's only used when
  it's suitable to be used for kiling.
- rename xenstore node to device-model-kill-uid for the same reason
- Add a comment explaining what dm_runas and dm_kill_uid are for.
- Minor tweak to comment for clarification
- Add an `rc = 0;` before an `out:` label
- Make the accumulator macro generic and put in libxl-internal.h
- Fix up child error code return
- Account for undefined behavior on failure of libxl__xs_read_checked.

v2:
- Rebase on top of previous "goto out" refactoring
- Rather than introducing a `uid` string, Introduce a boolean,
  "kill_by_uid"; and do the GCSPRINTF() once if that is set.
- Fix typo "starting"
- Always call kill_device_model_uid_cb(); only call
  libxl__qmp_cleanup() from there
- Refactor libxl__destroy_device_model() to follow "goto out on error"
  pattern
- Retain and report errors even when we continue trying to clean up
- Report errors removing DM xenstore directory (except -ENOENT)
- Report errors reading device-model-uid
- Put "kill by uid" child logic in a separate function
- Refactor "kill by uid" to follow "goto out on error" pattern
- Change "kill by uid" to return libxl-style error, rather than errno
- Document the intention of when to return errors
- Assert that dm_uid != 0
- Log what the reaper process setresuid'd to

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 tools/libxl/libxl_dm.c   | 214 +--
 tools/libxl/libxl_internal.h |  16 ++-
 2 files changed, 220 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index ca59df33fe..f2a21cf744 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -129,6 +129,7 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
 int rc;
 char *user;
 uid_t intended_uid = -1;
+bool kill_by_uid;
 
 /* Only qemu-upstream can run as a different uid */
 if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
@@ -141,7 +142,8 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
  * - if rc is an error code, user and intended_uid are ignored.
  * - if rc is 0, user may be set or not set.
  * - if user is set, then intended_uid must be set to a UID matching
- *   the username `user`.  This will be checked for root (0).
+ *   the username `user`, and kill_by_uid must be set to the appropriate
+ *   value.  intended_uid will be checked for root (0).
  */
 
 /*
@@ -162,6 +164,7 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
 }
 
 intended_uid = user_base->pw_uid;
+kill_by_uid = true;
 rc = 0;
 goto out;
 }
@@ -205,12 +208,15 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 LOGD(DEBUG, guest_domid, "using uid %ld", (long)intended_uid);
 user = GCSPRINTF("%ld:%ld", (long)intended_uid,
  (long)user_base->pw_gid);
+kill_by_uid = true;
 rc = 0;
 goto out;
 }
 
 /*
- * We couldn't find QEMU_USER_BASE_RANGE; look for QEMU_USER_SHARED
+ * We couldn't find QEMU_USER_BASE_RANGE; look for
+ * QEMU_USER_SHARED.  NB for QEMU_USER_SHARED, all QEMU will run
+ * as the same UID, we can't kill by uid; therefore don't set uid.
  */
 user = LIBXL_QEMU_USER_SHARED;
 rc = userlookup_helper_getpwnam(gc, user, 

[Xen-devel] [PATCH] gic-vgic: skip irqs locking in gic_restore_pending_irqs()

2018-12-21 Thread Andrii Anisov
From: Andrii Anisov 

This function is called under IRQs disabled already, so drop additional
flags save and restore.

Signed-off-by: Andrii Anisov 

---

This is a half of an RFC patch [1] which relies on the already
existing code.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03293.html

---
 xen/arch/arm/gic-vgic.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 48922f5..34179c0 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -279,11 +279,10 @@ static void gic_restore_pending_irqs(struct vcpu *v)
 int lr = 0;
 struct pending_irq *p, *t, *p_r;
 struct list_head *inflight_r;
-unsigned long flags;
 unsigned int nr_lrs = gic_get_nr_lrs();
 int lrs = nr_lrs;
 
-spin_lock_irqsave(>arch.vgic.lock, flags);
+spin_lock(>arch.vgic.lock);
 
 if ( list_empty(>arch.vgic.lr_pending) )
 goto out;
@@ -327,7 +326,7 @@ found:
 }
 
 out:
-spin_unlock_irqrestore(>arch.vgic.lock, flags);
+spin_unlock(>arch.vgic.lock);
 }
 
 void gic_clear_pending_irqs(struct vcpu *v)
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 11/11] dm_depriv: Mark `UID cleanup` as completed

2018-12-21 Thread Ian Jackson
George Dunlap writes ("[PATCH v3 11/11] dm_depriv: Mark `UID cleanup` as 
completed"):
> Signed-off-by: George Dunlap 

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V12 2/5] x86/p2m: refactor p2m_reset_altp2m()

2018-12-21 Thread George Dunlap


> On Dec 20, 2018, at 5:38 PM, Razvan Cojocaru  
> wrote:
> 
> Refactor p2m_reset_altp2m() so that it can be used to remove
> redundant codepaths, fixing the locking while we're at it.
> 
> The previous code now replaced by p2m_reset_altp2m(d, i,
> ALTP2M_DEACTIVATE) calls did not set p2m->min_remapped_gfn
> and p2m->max_remapped_gfn because in those cases the altp2m
> idx was disabled; so before getting used again,
> p2m_init_altp2m_ept() would get called, which resets them.
> Always setting them in p2m_reset_altp2m(), while redundant,
> is preferable to an extra conditional.
> 
> Signed-off-by: Razvan Cojocaru 
> Tested-by: Tamas K Lengyel 

Reviewed-by: George Dunlap 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 10/11] libxl: Introduce specific username to be used as a reaper

2018-12-21 Thread Ian Jackson
George Dunlap writes ("[PATCH v3 10/11] libxl: Introduce specific username to 
be used as a reaper"):
> Untrusted device models must be killed by uid rather than by pid for
> safety.  To do this reliably, we need another uid, not used for any
> other purpose, from which to make the kill system call.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] Live migrate with Linux >= 4.13 domU causes kernel time jumps and TCP connection stalls.

2018-12-21 Thread Hans van Kranenburg
Hi,

We've been tracking down a live migration bug during the last three days
here at work, and here's what we found so far.

1. Xen version and dom0 linux kernel version don't matter.
2. DomU kernel is >= Linux 4.13.

When using live migrate to another dom0, this often happens:

[   37.511305] Freezing user space processes ... (elapsed 0.001 seconds)
done.
[   37.513316] OOM killer disabled.
[   37.513323] Freezing remaining freezable tasks ... (elapsed 0.001
seconds) done.
[   37.514837] suspending xenstore...
[   37.515142] xen:grant_table: Grant tables using version 1 layout
[18446744002.593711] OOM killer enabled.
[18446744002.593726] Restarting tasks ... done.
[18446744002.604527] Setting capacity to 6291456

As a side effect, all open TCP connections stall, because the timestamp
counters of packets sent to the outside world are affected:

https://syrinx.knorrie.org/~knorrie/tmp/tcp-stall.png

"The problem seems to occur after the domU is resumed. The first packet
(#90) has wrong timestamp value (far from the past), marked red in the
image. Green is the normal sequence of timestamps from the server
(domU), acknowledged by the client. Once client receives the packet from
the past, it attempts re-sending everything from the start. As the
timestamp never reaches normal value, the client goes crazy thinking
that the server has not received anything, keeping the loop on. But they
just exist in different ages."

--- >8 ---

Ad 1. We reproduced this on different kinds of HP dl360 g7/8/9 gear,
both with Xen 4.11 / Linux 4.19.9 dom0 kernel and with Xen 4.4 / Linux
3.16 as dom0 kernel.

Ad 2. This was narrowed down by just grabbing old debian kernel images
from https://snapshot.debian.org/binary/?cat=l and trying them.

OK   linux-image-4.12.0-2-amd64_4.12.13-1_amd64.deb
FAIL linux-image-4.13.0-rc5-amd64_4.13~rc5-1~exp1_amd64.deb
FAIL linux-image-4.13.0-trunk-amd64_4.13.1-1~exp1_amd64.deb
FAIL linux-image-4.13.0-1-amd64_4.13.4-1_amd64.deb
FAIL linux-image-4.13.0-1-amd64_4.13.13-1_amd64.deb
FAIL linux-image-4.14.0-3-amd64_4.14.17-1_amd64.deb
FAIL linux-image-4.15.0-3-amd64_4.15.17-1_amd64.deb
FAIL linux-image-4.16.0-2-amd64_4.16.16-2_amd64.deb
FAIL ... everything up to 4.19.9 here

So, there seems to be a change introduced in 4.13 that makes this
behaviour appear. We didn't start compiling old kernels yet to be able
to bisect it further.

--- >8 ---

For the rest of the info, I'm focussing on a test environment for
reproduction, which is 4x identical HP DL360G7, named sirius, gamma,
omega and flopsy.

It's running the 4.11 packages from Debian, rebuilt for Stretch:
4.11.1~pre.20180911.5acdd26fdc+dfsg-5~bpo9+1
https://salsa.debian.org/xen-team/debian-xen/commits/stretch-backports

Dom0 kernel is 4.19.9 from Debian, rebuilt for Stretch:
https://salsa.debian.org/knorrie-guest/linux/commits/debian/4.19.9-1_mxbp9+1

xen_commandline : placeholder dom0_max_vcpus=1-4 dom0_mem=4G,max:4G
com2=115200,8n1 console=com2,vga noreboot xpti=no-dom0,domu smt=off

vendor_id   : GenuineIntel
cpu family  : 6
model   : 44
model name  : Intel(R) Xeon(R) CPU   X5675  @ 3.07GHz
stepping: 2
microcode   : 0x1f
cpu MHz : 3066.727

--- >8 ---

There are some interesting additional patterns:

1. consistent success / failure paths.

After rebooting all 4 physical servers, starting a domU with 4.19 kernel
and then live migrating it, it might first time fail, or it might
succeed. However, from the first time it fails, the specific direction
of movement keeps showing the failure every single time this combination
is used. Same goes for successful live migrate. E.g.:

sirius -> flopsy OK
sirius -> gamma OK
flopsy -> gamma OK
flopsy -> sirius OK
gamma -> flopsy FAIL
gamma -> sirius FAIL
omega -> flopsy FAIL

After rebooting all of the servers again, and restarting the whole test
procedure, the combinations and results change, but are again consistent
as soon as we start live migrating and seeing results.

2. TCP connections only hang when opened while "timestamp value in dmesg
is low", followed with a "time is 18 gazillion" situation. When opening
a TCP connection to the domU while it's at 18 gazillion seconds uptime,
the TCP connection keeps working all the time after subsequent live
migrations, even when it jumps up and down, following the OK and FAIL paths.

3. Since this is related to time and clocks, the last thing today we
tried was, instead of using default settings, put "clocksource=tsc
tsc=stable:socket" on the xen command line and "clocksource=tsc" on the
domU linux kernel line. What we observed after doing this, is that the
failure happens less often, but still happens. Everything else applies.

--- >8 ---

Additional question:

It's 2018, should we have these "clocksource=tsc tsc=stable:socket" on
Xen and "clocksource=tsc" anyways now, for Xen 4.11 and Linux 4.19
domUs? All our hardware has 'TscInvariant = true'.

Related: 

Re: [Xen-devel] [PATCH v3 09/11] libxl: Kill QEMU with "reaper" ruid

2018-12-21 Thread Ian Jackson
George Dunlap writes ("[PATCH v3 09/11] libxl: Kill QEMU with "reaper" ruid"):
> Using kill(-1) to killing an untrusted dm process with the real uid
> equal to the dm_uid isn't guaranteed to succeed: the process in
> question may be able to kill the reaper process after the setresuid()
> and before the kill().
...
> +/*
> + * Look up "reaper UID".  If present and non-root, returns 0 and sets
> + * reaper_uid.  Otherwise returns libxl-style error.
> + */

Might be worth mentioning that this function expects to be called in a
subprocess, for a variety of reasons including because it takes a lock.

Nevertheless,

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 08/11] libxl: Kill QEMU by uid when possible

2018-12-21 Thread Ian Jackson
George Dunlap writes ("[PATCH v3 08/11] libxl: Kill QEMU by uid when possible"):
> The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
> one specific domain ID.  This domain ID will probably eventually be
> used again.  It is therefore necessary to make absolutely sure that a
> rogue QEMU process cannot hang around after its domain has exited.
...
> +static int kill_device_model_uid_child(libxl__destroy_devicemodel_state 
> *ddms,

> +/*
> + * And kill everyone but me.
> + *
> + * NB that it's not clear from either POSIX or the Linux man page
> + * that ESRCH would be returned with a pid value of -1, but it
> + * doesn't hurt to check.
> + */
> +r = kill(-1, 9);
> +if (r && errno != ESRCH) {
> +LOGED(ERROR, domid, "kill(-1,9)");
> +rc = ERROR_FAIL;
> +}

Missing `goto out', there.

> +
> +rc = 0;
> +
> +out:
> +return rc;
> +}

> +static void kill_device_model_uid_cb(libxl__egc *egc,
> +libxl__ev_child *destroyer,
> +pid_t pid, int status)
> +{
> +libxl__destroy_devicemodel_state *ddms = CONTAINER_OF(destroyer, *ddms, 
> destroyer);
> +STATE_AO_GC(ddms->ao);
> +
> +if (status) {
> +int rc = ERROR_FAIL;
> +
> +if (WIFEXITED(status))
> +rc = -WEXITSTATUS(status);

But WEXITSTATUS might be something weird.  See my mail about possible
system-invented exit statuses.  I suggest you tolerate only the status
values you intend to generate, 1..125.  (And 0 for success.)

> +/* 
> + * A macro to help retain the first failure in "do as much as you can"
> + * situations.  Note the hard-coded use of the variable name `rc`.
> + */
> +#define ACCUMULATE_RC(rc_acc) ((rc_acc) = (rc_acc) ?: rc)
> +

Good, although you have trailing whitespace in the new blank line.


Everything else LGTM.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH V12 1/5] x86/p2m: allocate logdirty_ranges for altp2ms

2018-12-21 Thread George Dunlap


> On Dec 20, 2018, at 5:38 PM, Razvan Cojocaru  
> wrote:
> 
> For now, only do allocation/deallocation; keeping them in sync
> will be done in subsequent patches.
> 
> Logdirty synchronization will only be done for active altp2ms;
> so allocate logdirty rangesets (copying the host logdirty
> rangeset) when an altp2m is activated, and free it when
> deactivated.
> 
> Signed-off-by: Razvan Cojocaru 
> Tested-by: Tamas K Lengyel 

Reviewed-by: George Dunlap 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 7/8] xenalyze: Build for Both ARM and x86 Platforms

2018-12-21 Thread Stefano Stabellini
On Fri, 21 Dec 2018, Julien Grall wrote:
> Hi,
> 
> On 21/12/2018 16:26, Julien Grall wrote:
> > From: Benjamin Sanda 
> > 
> > Modified to provide building of the xenalyze binary for both ARM and
> > x86 platforms. The xenalyze binary is now built as part of the BIN
> > list for both platforms.
> > 
> > Signed-off-by: Benjamin Sanda 
> > Signed-off-by: Julien Grall 
> > Acked-by: Wei Liu 
> > Acked-by: Stefano Stabellini 
> 
> I made a typo in the address e-mail. Stefano, could you fix it on commit?

Yes, fixed on commit. First 7 patches have been committed.


> > ---
> >  Changes in v3:
> >  - Add Stefano's acked-by
> > 
> >  Changes in v2:
> >  - Add Wei's acked-by
> > ---
> >   tools/xentrace/Makefile | 3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/tools/xentrace/Makefile b/tools/xentrace/Makefile
> > index 0bad942bdf..9fb7fc96e7 100644
> > --- a/tools/xentrace/Makefile
> > +++ b/tools/xentrace/Makefile
> > @@ -9,8 +9,7 @@ LDLIBS += $(LDLIBS_libxenevtchn)
> >   LDLIBS += $(LDLIBS_libxenctrl)
> >   LDLIBS += $(ARGP_LDFLAGS)
> >   -BIN-$(CONFIG_X86) = xenalyze
> > -BIN  = $(BIN-y)
> > +BIN  = xenalyze
> >   SBIN = xentrace xentrace_setsize
> >   LIBBIN   = xenctx
> >   SCRIPTS  = xentrace_format
> > 
> 
> -- 
> Julien Grall
> 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 05/11] libxl: Do root checks once in libxl__domain_get_device_model_uid

2018-12-21 Thread Ian Jackson
George Dunlap writes ("[PATCH v3 05/11] libxl: Do root checks once in 
libxl__domain_get_device_model_uid"):
> At the moment, we check for equivalence to literal "root" before
> deciding whether to add the `runas` command-line option to QEMU.  This
> is unsatisfactory for several reasons.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] gic: drop interrupts enabling on interrupts processing

2018-12-21 Thread Andrii Anisov
From: Andrii Anisov 

This reduces the number of context switches in case we have coming guest
interrupts from different sources at a high rate. What is likely for
multimedia use-cases.
Having irqs unlocked here makes us go through trap path again in case we
have a new guest interrupt arrived (even with the same priority, after
`desc->handler->end(desc)` in `do_IRQ()`), what is just a processor
cycles wasting. We will catch them all in the `gic_interrupt() function
loop anyway. And the guest irqs arrival prioritization is meaningless
here, it is only effective at guest's level.

Signed-off-by: Andrii Anisov 

---

This is the patch [1] from RFC series, with extended commit message.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03286.html

---
 xen/arch/arm/gic.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 6cc7dec..9f5cd95 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -386,10 +386,8 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
 
 if ( likely(irq >= 16 && irq < 1020) )
 {
-local_irq_enable();
 isb();
 do_IRQ(regs, irq, is_fiq);
-local_irq_disable();
 }
 else if ( is_lpi(irq) )
 {
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 4/8] xen/arm: Make get_page_from_gfn working with DOMID_XEN

2018-12-21 Thread Andrii Anisov



On 21.12.18 19:27, Julien Grall wrote:


Is it a wildcard tag? :)


No, just a wrong copy-paste.
Must be:

Reviewed-by: Andrii Anisov 

;)

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 3/8] xen/arm: Add support for read-only foreign mappings

2018-12-21 Thread Andrii Anisov


Reviewed-by: Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 3/8] xen/arm: Add support for read-only foreign mappings

2018-12-21 Thread Julien Grall

Hi Andrii,

On 21/12/2018 17:27, Andrii Anisov wrote:



On 21.12.18 18:26, Julien Grall wrote:

Currently, foreign mappings can only be read-write. A follow-up patch will
extend foreign mapping for Xen backend memory (via XEN_DOMID), some of
that memory should only be read accessible for the mapping domain.

Introduce a new p2m_type to cater read-only foreign mappings. For now,
the decision between the two foreign mapping type is based on the type
of the guest page mapped.

Signed-off-by: Julien Grall 


Andrii Anisov 


Another wildcard tag? :)

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 4/8] xen/arm: Make get_page_from_gfn working with DOMID_XEN

2018-12-21 Thread Julien Grall

Hi Andrii,

On 21/12/2018 17:26, Andrii Anisov wrote:



On 21.12.18 18:26, Julien Grall wrote:

DOMID_XEN is used to share pages beloging to the hypervisor
(e.g trace buffers). Unlike other domains, DOMID_XEN is a non-auto
translated domain and therefore does not have a P2M.

This patch adds a special case for DOMID_XEN in get_page_from_gfn. We
may want to provide "non-auto translated helpers" in the future if we
see more case.

Signed-off-by: Julien Grall 


Andrii Anisov 


Is it a wildcard tag? :)

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 3/8] xen/arm: Add support for read-only foreign mappings

2018-12-21 Thread Andrii Anisov



On 21.12.18 18:26, Julien Grall wrote:

Currently, foreign mappings can only be read-write. A follow-up patch will
extend foreign mapping for Xen backend memory (via XEN_DOMID), some of
that memory should only be read accessible for the mapping domain.

Introduce a new p2m_type to cater read-only foreign mappings. For now,
the decision between the two foreign mapping type is based on the type
of the guest page mapped.

Signed-off-by: Julien Grall 


Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 4/8] xen/arm: Make get_page_from_gfn working with DOMID_XEN

2018-12-21 Thread Andrii Anisov



On 21.12.18 18:26, Julien Grall wrote:

DOMID_XEN is used to share pages beloging to the hypervisor
(e.g trace buffers). Unlike other domains, DOMID_XEN is a non-auto
translated domain and therefore does not have a P2M.

This patch adds a special case for DOMID_XEN in get_page_from_gfn. We
may want to provide "non-auto translated helpers" in the future if we
see more case.

Signed-off-by: Julien Grall 


Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 0/2] misc safety certification fixes

2018-12-21 Thread Stefano Stabellini
On Fri, 21 Dec 2018, Julien Grall wrote:
> Hi,
> 
> On 21/12/2018 09:27, Jan Beulich wrote:
> > > > > On 20.12.18 at 18:26,  wrote:
> > > On 11/12/18 11:06 PM, Stefano Stabellini wrote:
> > > Discussing with Stefano today, he is aiming to get this series for Xen
> > > 4.12. I will be away until the x86/common code freeze.
> > > 
> > > I agree with him that I will waive my ack if it gets reviewed by any
> > > committers.
> > 
> > Well, discussion on patch 2 was abandoned rather than finished
> > afaict, which means Stefano either lost interest or is meaning to
> > submit v5 with the comments addressed.
> 
> He is planning to send a new version while I am on holidays. My request
> applies for any new version of this providing it gets reviewed by any
> committers.

That's right: I haven't had the bandwidth to continue the discussion,
but I intend to resume work on it on the 1st week of Jan. The idea is
that even if Julien will be on holidays, if I manage to make enough
progress and get your reviewed-by the series could go in anyway before
the freeze, if that's OK for you. 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 01/11] libxl: Move dm user determination logic into a helper function

2018-12-21 Thread George Dunlap
To reliably kill an untrusted devicemodel, we need to know not only
its pid, but its uid.  In preparation for this, move the userid
determination logic into a helper function.

Create a new field, `dm_runas`, in libxl__domain_build_state to store
the value during domain creation.

This change also removes unnecessary duplication of the argument
construction code.

While here, clean up some minor CODING_STYLE infractions (space
between * and variable name).

No functional change intended.

While here, delete some trailing whitespace.

Signed-off-by: George Dunlap 
Acked-by: Wei Liu 
Acked-by: Ian Jackson 
---
v2:
- Remove unnecessary space between * and dm_runas
- Additional code clean-up
- Delete trailing whitespace

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 tools/libxl/libxl_dm.c   | 260 +++
 tools/libxl/libxl_internal.h |  22 +--
 2 files changed, 151 insertions(+), 131 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 5698fe8af3..bbcbc94b6c 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -65,6 +65,131 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
 return logfile_w;
 }
 
+/*
+ *  userlookup_helper_getpwnam(libxl__gc*, const char *user,
+ * struct passwd **pwd_r);
+ *
+ *  userlookup_helper_getpwuid(libxl__gc*, uid_t uid,
+ * struct passwd **pwd_r);
+ *
+ *  returns 1 if the user was found, 0 if it was not, -1 on error
+ */
+#define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF) \
+static int userlookup_helper_##NAME(libxl__gc *gc,  \
+SPEC_TYPE spec, \
+struct STRUCTNAME *resultbuf,   \
+struct STRUCTNAME **out)\
+{   \
+struct STRUCTNAME *resultp = NULL;  \
+char *buf = NULL;   \
+long buf_size;  \
+int ret;\
+\
+buf_size = sysconf(SYSCONF);\
+if (buf_size < 0) { \
+buf_size = 2048;\
+LOG(DEBUG,  \
+"sysconf failed, setting the initial buffer size to %ld",   \
+buf_size);  \
+}   \
+\
+while (1) { \
+buf = libxl__realloc(gc, buf, buf_size);\
+ret = NAME##_r(spec, resultbuf, buf, buf_size, );   \
+if (ret == ERANGE) {\
+buf_size += 128;\
+continue;   \
+}   \
+if (ret != 0)   \
+return ERROR_FAIL;  \
+if (resultp != NULL) {  \
+if (out) *out = resultp;\
+return 1;   \
+}   \
+return 0;   \
+}   \
+}
+
+DEFINE_USERLOOKUP_HELPER(getpwnam, const char*, passwd, _SC_GETPW_R_SIZE_MAX);
+DEFINE_USERLOOKUP_HELPER(getpwuid, uid_t,   passwd, _SC_GETPW_R_SIZE_MAX);
+
+static int libxl__domain_get_device_model_uid(libxl__gc *gc,
+  libxl__dm_spawn_state *dmss)
+{
+int guest_domid = dmss->guest_domid;
+libxl__domain_build_state *const state = dmss->build_state;
+const libxl_domain_build_info *b_info = >guest_config->b_info;
+
+struct passwd *user_base, user_pwbuf;
+int ret;
+char *user;
+
+/* Only qemu-upstream can run as a different uid */
+if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
+return 0;
+
+user = b_info->device_model_user;
+if (user)
+goto end_search;
+
+if (!libxl_defbool_val(b_info->dm_restrict)) {
+LOGD(DEBUG, guest_domid,
+ "dm_restrict disabled, 

[Xen-devel] [PATCH v3 10/11] libxl: Introduce specific username to be used as a reaper

2018-12-21 Thread George Dunlap
Untrusted device models must be killed by uid rather than by pid for
safety.  To do this reliably, we need another uid, not used for any
other purpose, from which to make the kill system call.

When using xen-qemuuser-range-base, we can repurpose
xen-qemuuser-range-base itself as a UID from which to kill other
devicemodel uids (since domain ID 0 should never have a device model
associated with it).

However, we'd like people to be able to use the device_model_user
feature without also defining xen-qemuuser-range-base (which requires
the ability to 'reserve' 32k+ user IDs).

To that end, introduce the xen-qemuuser-reaper id.  When killing by
UID, first look for and use that ID if available; then fall back to
xen-qemuuser-range-base.

Document the new call in docs/features/qemu-deprivilege.pandoc.

Signed-off-by: George Dunlap 
---
v3:
- New in this version

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 docs/features/qemu-deprivilege.pandoc |  9 ++
 tools/libxl/libxl_dm.c| 40 +--
 tools/libxl/libxl_internal.h  |  1 +
 3 files changed, 35 insertions(+), 15 deletions(-)

diff --git a/docs/features/qemu-deprivilege.pandoc 
b/docs/features/qemu-deprivilege.pandoc
index ce21a60ef7..eb05981a83 100644
--- a/docs/features/qemu-deprivilege.pandoc
+++ b/docs/features/qemu-deprivilege.pandoc
@@ -77,12 +77,21 @@ And then in your config file, the following line:
 
 device_model_user="xen-qemuuser-c6-01"
 
+If you use this method, you should also allocate one "reaper" user to
+be used for killing device models:
+
+adduser --system --no-create-home --group xen-qemuuser-reaper
+
 NOTE: It is important when using `device_model_user` that EACH VM HAVE
 A SEPARATE UID, and that none of these UIDs map to root.  xl will
 throw an error a uid maps to zero, but not if multiple VMs have the
 same uid.  Multiple VMs with the same device model uid will cause
 problems.
 
+It is also important that `xen-qemuuser-reaper` not have any processes
+associated with it, as they will be destroyed when deprivileged qemu
+processes are destroyed.
+
 ## Domain config changes
 
 The core domain config change is to add the following line to the
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index f7c4e5eb3b..731d7f3c2c 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -268,24 +268,37 @@ static int libxl__get_reaper_uid(libxl__gc *gc, uid_t 
*reaper_uid)
 struct passwd *user_base, user_pwbuf;
 int rc;
 
+rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_REAPER,
+ _pwbuf, _base);
+/* 
+ * Either there was an error, or we found a suitable user; stop
+ * looking
+ */
+if (rc || user_base)
+goto out;
+
 rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  _pwbuf, _base);
-if (rc)
-return rc;
+if (rc || user_base)
+goto out;
 
-if (!user_base) {
-LOG(WARN, "Couldn't find uid for reaper process");
-return ERROR_INVAL;
-}
-
-if (user_base->pw_uid == 0) {
-LOG(ERROR, "UID for reaper process maps to root!");
-return ERROR_INVAL;
+LOG(WARN, "Couldn't find uid for reaper process");
+rc = ERROR_INVAL;
+
+ out:
+/* First check to see if the discovered user maps to root */
+if (!rc) {
+if (user_base->pw_uid == 0) {
+LOG(ERROR, "UID for reaper process maps to root!");
+rc = ERROR_INVAL;
+}
 }
 
-*reaper_uid = user_base->pw_uid;
+/* If everything is OK, set reaper_uid as appropriate */
+if (!rc)
+*reaper_uid = user_base->pw_uid;
 
-return 0;
+return rc;
 }
 
 const char *libxl__domain_device_model(libxl__gc *gc,
@@ -2904,9 +2917,6 @@ static int 
get_reaper_lock_and_uid(libxl__destroy_devicemodel_state *ddms,
 
 /*
  * Get reaper_uid.  If we can't find such a uid, return an error.
- *
- * FIXME: This means that domain destruction will fail if
- * device_model_user is set but QEMU_USER_RANGE_BASE doesn't exist.
  */
 return libxl__get_reaper_uid(gc, reaper_uid);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e3619daf3a..f588e63599 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4419,6 +4419,7 @@ _hidden int libxl__read_sysfs_file_contents(libxl__gc *gc,
 #define LIBXL_QEMU_USER_PREFIX "xen-qemuuser"
 #define LIBXL_QEMU_USER_SHARED LIBXL_QEMU_USER_PREFIX"-shared"
 #define LIBXL_QEMU_USER_RANGE_BASE LIBXL_QEMU_USER_PREFIX"-range-base"
+#define LIBXL_QEMU_USER_REAPER LIBXL_QEMU_USER_PREFIX"-reaper"
 
 static inline bool libxl__acpi_defbool_val(const libxl_domain_build_info 
*b_info)
 {
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 03/11] libxl: Clean up userlookup_helper_getpw* helper

2018-12-21 Thread George Dunlap
Bring conventions more in line with libxl__xs_read_checked():
- If found, return 0 and set pointer to non-NULL
- If not found, return 0 and set pointer to NULL
- On error, return libxl-style error number.

Update documentation to match.

Use CODING_STYLE compliant `r` rather than `ret`.

On error, log the error code before returning instead of discarding
it.

Now that it only returns 0 or errno, update caller error checks to be
`if (ret)` rather than `if (ret < 0)`.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v3:
- Used more idiomatic `if (ret)` rather than `if (ret < 0)`

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 35 ---
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 6024d4b7b8..67204b94c2 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -72,7 +72,13 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
  *  userlookup_helper_getpwuid(libxl__gc*, uid_t uid,
  * struct passwd **pwd_r);
  *
- *  returns 1 if the user was found, 0 if it was not, -1 on error
+ *  If the user is found, return 0 and set *pwd_r to the appropriat
+ *  value.
+ *
+ *  If the user is not found but there are no errors, return 0
+ *  and set *pwd_r to NULL.
+ *
+ *  On error, return a libxl-style error code.
  */
 #define DEFINE_USERLOOKUP_HELPER(NAME,SPEC_TYPE,STRUCTNAME,SYSCONF) \
 static int userlookup_helper_##NAME(libxl__gc *gc,  \
@@ -83,7 +89,7 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
 struct STRUCTNAME *resultp = NULL;  \
 char *buf = NULL;   \
 long buf_size;  \
-int ret;\
+int r;  \
 \
 buf_size = sysconf(SYSCONF);\
 if (buf_size < 0) { \
@@ -95,17 +101,16 @@ static int libxl__create_qemu_logfile(libxl__gc *gc, char 
*name)
 \
 while (1) { \
 buf = libxl__realloc(gc, buf, buf_size);\
-ret = NAME##_r(spec, resultbuf, buf, buf_size, );   \
-if (ret == ERANGE) {\
+r = NAME##_r(spec, resultbuf, buf, buf_size, ); \
+if (r == ERANGE) {  \
 buf_size += 128;\
 continue;   \
 }   \
-if (ret != 0)   \
+if (r != 0) {   \
+LOGEV(ERROR, r, "Looking up username/uid with " #NAME); \
 return ERROR_FAIL;  \
-if (resultp != NULL) {  \
-if (out) *out = resultp;\
-return 1;   \
 }   \
+*out = resultp; \
 return 0;   \
 }   \
 }
@@ -140,16 +145,16 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 
 ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  _pwbuf, _base);
-if (ret < 0)
+if (ret)
 return ret;
-if (ret > 0) {
+if (user_base) {
 struct passwd *user_clash, user_clash_pwbuf;
 uid_t intended_uid = user_base->pw_uid + guest_domid;
 ret = userlookup_helper_getpwuid(gc, intended_uid,
  _clash_pwbuf, _clash);
-if (ret < 0)
+if (ret)
 return ret;
-if (ret > 0) {
+if (user_clash) {
 LOGD(ERROR, guest_domid,
  "wanted to use uid %ld (%s + %d) but that is user %s !",
  (long)intended_uid, LIBXL_QEMU_USER_RANGE_BASE,
@@ -163,10 +168,10 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 }
 
 user = LIBXL_QEMU_USER_SHARED;
-ret = userlookup_helper_getpwnam(gc, user, _pwbuf, 0);
-if (ret < 0)
+ret = userlookup_helper_getpwnam(gc, user, _pwbuf, 

[Xen-devel] [PATCH v3 05/11] libxl: Do root checks once in libxl__domain_get_device_model_uid

2018-12-21 Thread George Dunlap
At the moment, we check for equivalence to literal "root" before
deciding whether to add the `runas` command-line option to QEMU.  This
is unsatisfactory for several reasons.

First, just because the string doesn't match "root" doesn't mean the
final uid won't end up being zero; in particular, the range_base
calculations may end up producing "0:NNN", which would be root in any
case.

Secondly, it's almost certainly a configuration error if the resulting
uid ends up to be zero; rather than silently do what was specified but
probably not intended, throw an error.

To fix this, check for root once in
libxl__domain_get_device_model_uid.  If the result is root, return an
error; if appropriate, set `runas`.

After that, assume that the presence of state->dm_runas implies that a
`runas` argument should be constructed.

One side effect of this is to check whether device_model_user exists
before passing it to qemu, resulting in better error reporting.

While we're here:
- Refactor the function to use the "goto out" idiom
- Use 'rc' rather than 'ret', in line with CODING_STYLE

Signed-off-by: George Dunlap 
---
Let the record show that I personally think the `goto out` pattern
here would be improved by having a "root check" label.

v3:
- Update to use more idiomatic `if (rc)`
- Make "inputs" to goto more clear
- Initialize intended_uid to -1 sentinel
- Document expectations when jumping to 'out'
- Don't return directly after non-trivial initial checks
- Always explicitly set rc to 0 when jumping to out, even if we just
  checked that it was zero a few lines earlier
- Work around one-goto limitation by having multiple 'rc' checks.
- Return generic ERROR_INVAL rather than in accurate ERROR_DEVICE_EXISTS

v2:
- Refactor to use `out` rather than multiple labels
- Only check for root once
- Use 'out' rather than direct returns for errors (only use direct returns
  for early `succeed-without-setting-runas` paths)
- Use `rc` rather than `ret` to more closely align with CODING_STYLE
- Fill out comments about the cases we're handling
- Return ERROR_DEVICE_EXISTS rather than ERROR_FAIL if there's another
  username that maps to our calculated uid
- Report an error if the specified device_model_user doesn't exist

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 107 -
 1 file changed, 85 insertions(+), 22 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 67204b94c2..d73bbb6b06 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -126,65 +126,128 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 const libxl_domain_build_info *b_info = >guest_config->b_info;
 
 struct passwd *user_base, user_pwbuf;
-int ret;
+int rc;
 char *user;
+uid_t intended_uid = -1;
 
 /* Only qemu-upstream can run as a different uid */
 if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
 return 0;
 
+/*
+ * From this point onward, all paths should go through the `out`
+ * label.  The invariants should be:
+ * - rc may be 0, or an error code.
+ * - if rc is an error code, user and intended_uid are ignored.
+ * - if rc is 0, user may be set or not set.
+ * - if user is set, then intended_uid must be set to a UID matching
+ *   the username `user`.  This will be checked for root (0).
+ */
+
+/*
+ * If device_model_user is present, set `-runas` even if
+ * dm_restrict isn't in use
+ */
 user = b_info->device_model_user;
-if (user)
-goto end_search;
+if (user) {
+rc = userlookup_helper_getpwnam(gc, user, _pwbuf, _base);
+if (rc)
+goto out;
+
+if (!user_base) {
+LOGD(ERROR, guest_domid, "Couldn't find device_model_user %s",
+ user);
+rc = ERROR_INVAL;
+goto out;
+}
+
+intended_uid = user_base->pw_uid;
+rc = 0;
+goto out;
+}
 
+/*
+ * If dm_restrict isn't set, and we don't have a specified user, don't
+ * bother setting a `-runas` parameter.
+ */
 if (!libxl_defbool_val(b_info->dm_restrict)) {
 LOGD(DEBUG, guest_domid,
  "dm_restrict disabled, starting QEMU as root");
-return 0;
+user = NULL; /* Should already be null, but just in case */
+goto out;
 }
 
-ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+/*
+ * dm_restrict is set, but device_model_user isn't set; look for
+ * QEMU_USER_BASE_RANGE
+ */
+rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  _pwbuf, _base);
-if (ret)
-return ret;
+if (rc)
+goto out;
 if (user_base) {
 struct passwd *user_clash, user_clash_pwbuf;
-uid_t intended_uid = user_base->pw_uid + guest_domid;
-ret = userlookup_helper_getpwuid(gc, intended_uid,
+
+  

[Xen-devel] [PATCH v3 11/11] dm_depriv: Mark `UID cleanup` as completed

2018-12-21 Thread George Dunlap
Signed-off-by: George Dunlap 
---
CC: Ian Jackson 
CC: Wei Liu 
---
 docs/designs/qemu-deprivilege.md | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/designs/qemu-deprivilege.md b/docs/designs/qemu-deprivilege.md
index f7444a434d..81a5f5c05d 100644
--- a/docs/designs/qemu-deprivilege.md
+++ b/docs/designs/qemu-deprivilege.md
@@ -128,26 +128,6 @@ are specified; this does not apply to QEMU running as a 
Xen DM.
 
 '''Tested''': Not tested
 
-# Restrictions / improvements still to do
-
-This lists potential restrictions still to do.  It is meant to be
-listed in order of ease of implementation, with low-hanging fruit
-first.
-
-### Further RLIMITs
-
-RLIMIT_AS limits the total amount of memory; but this includes the
-virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
-fiddles with this; it would be straightforward to make it *set* the
-rlimit to what it thinks a sensible limit is.
-
-RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
-threads for some devices, so this would require some thought.
-
-Other things that would take some cleverness / changes to QEMU to
-utilize due to ordering constrants:
- - RLIMIT_NOFILES (after all necessary files are opened)
-
 ### libxl UID cleanup
 
 '''Description''': Domain IDs are reused, and thus restricted UIDs are
@@ -223,6 +203,26 @@ Since this will kill all other `reaper_uid` processes as 
well, we must
 either allocate a separate `reaper_uid` per domain, or use locking to
 ensure that only one killing process is active at a time.
 
+# Restrictions / improvements still to do
+
+This lists potential restrictions still to do.  It is meant to be
+listed in order of ease of implementation, with low-hanging fruit
+first.
+
+### Further RLIMITs
+
+RLIMIT_AS limits the total amount of memory; but this includes the
+virtual memory which QEMU uses as a mapcache.  xen-mapcache.c already
+fiddles with this; it would be straightforward to make it *set* the
+rlimit to what it thinks a sensible limit is.
+
+RLIMIT_NPROC limits total number of processes or threads.  QEMU uses
+threads for some devices, so this would require some thought.
+
+Other things that would take some cleverness / changes to QEMU to
+utilize due to ordering constrants:
+ - RLIMIT_NOFILES (after all necessary files are opened)
+
 ## libxl: Treat QMP connection as untrusted
 
 '''Description''': Currently libxl talks with QEMU via QMP; but its
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 06/11] libxl: Move qmp cleanup into devicemodel destroy function

2018-12-21 Thread George Dunlap
Removing the qmp connection is logically part of the device model
destruction; having the caller destroy it is a mild layering
violation.

Move libxl__qmp_cleanup() into libxl__destroy_device_model().  This
will make it easier when we make devicemodel destruction asynchronous.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 9 +++--
 tools/libxl/libxl_domain.c | 2 --
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index d73bbb6b06..450433452d 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2698,12 +2698,17 @@ out:
 
 int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
 {
+int rc;
 char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
 if (!xs_rm(CTX->xsh, XBT_NULL, path))
 LOGD(ERROR, domid, "xs_rm failed for %s", path);
 /* We should try to destroy the device model anyway. */
-return kill_device_model(gc,
-GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+rc = kill_device_model(gc,
+  GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
+
+libxl__qmp_cleanup(gc, domid);
+
+return rc;
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index 3377bba994..d46b97dedf 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1069,8 +1069,6 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 if (dm_present) {
 if (libxl__destroy_device_model(gc, domid) < 0)
 LOGD(ERROR, domid, "libxl__destroy_device_model failed");
-
-libxl__qmp_cleanup(gc, domid);
 }
 dis->drs.ao = ao;
 dis->drs.domid = domid;
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 08/11] libxl: Kill QEMU by uid when possible

2018-12-21 Thread George Dunlap
The privcmd fd that a dm_restrict'ed QEMU has gives it permission to
one specific domain ID.  This domain ID will probably eventually be
used again.  It is therefore necessary to make absolutely sure that a
rogue QEMU process cannot hang around after its domain has exited.

Killing QEMU by pid is insufficient in this situation, because QEMU
may be able to fork() to escape killing.  It is surprisingly tricky to
kill a process which can call fork() without races; the only reliable
way is to use kill(-1) to kill all processes with a given uid.

We can use this method only when we're sure that there's only one QEMU
instance per uid.  Add a dm_uid into the domain_build_state struct,
and set it in libxl__domain_get_device_model_uid() when it's safe to
kill by UID.  Store this in xenstore next to device-model-pid.

On domain destroy, check to see if device-model-uid is present in
xenstore.  If so, fork off a reaper process, setuid to that uid, and
do kill(-9) to kill all uids of that type.  Otherwise, carry on
destroying by pid.

While we're here, make libxl__destroy_device_model() consistently:
 1. Return an error when anything fails
 2. But continue to do as much clean-up as possible

NOTE that this is not yet completely safe: with ruid == dm_uid, the
device model may be able to kill(-9) the 'reaper' process before the
reaper process can kill it.  Further patches will address this.

Signed-off-by: George Dunlap 
---
v3:
- rename dm_uid to dm_kill_uid to indicate that it's only used when
  it's suitable to be used for kiling.
- rename xenstore node to device-model-kill-uid for the same reason
- Add a comment explaining what dm_runas and dm_kill_uid are for.
- Minor tweak to comment for clarification
- Add an `rc = 0;` before an `out:` label
- Make the accumulator macro generic and put in libxl-internal.h
- Fix up child error code return
- Account for undefined behavior on failure of libxl__xs_read_checked.

v2:
- Rebase on top of previous "goto out" refactoring
- Rather than introducing a `uid` string, Introduce a boolean,
  "kill_by_uid"; and do the GCSPRINTF() once if that is set.
- Fix typo "starting"
- Always call kill_device_model_uid_cb(); only call
  libxl__qmp_cleanup() from there
- Refactor libxl__destroy_device_model() to follow "goto out on error"
  pattern
- Retain and report errors even when we continue trying to clean up
- Report errors removing DM xenstore directory (except -ENOENT)
- Report errors reading device-model-uid
- Put "kill by uid" child logic in a separate function
- Refactor "kill by uid" to follow "goto out on error" pattern
- Change "kill by uid" to return libxl-style error, rather than errno
- Document the intention of when to return errors
- Assert that dm_uid != 0
- Log what the reaper process setresuid'd to

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 tools/libxl/libxl_dm.c   | 213 +--
 tools/libxl/libxl_internal.h |  16 ++-
 2 files changed, 219 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index ca59df33fe..d265f24864 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -129,6 +129,7 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
 int rc;
 char *user;
 uid_t intended_uid = -1;
+bool kill_by_uid;
 
 /* Only qemu-upstream can run as a different uid */
 if (b_info->device_model_version != LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN)
@@ -141,7 +142,8 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
  * - if rc is an error code, user and intended_uid are ignored.
  * - if rc is 0, user may be set or not set.
  * - if user is set, then intended_uid must be set to a UID matching
- *   the username `user`.  This will be checked for root (0).
+ *   the username `user`, and kill_by_uid must be set to the appropriate
+ *   value.  intended_uid will be checked for root (0).
  */
 
 /*
@@ -162,6 +164,7 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
 }
 
 intended_uid = user_base->pw_uid;
+kill_by_uid = true;
 rc = 0;
 goto out;
 }
@@ -205,12 +208,15 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 LOGD(DEBUG, guest_domid, "using uid %ld", (long)intended_uid);
 user = GCSPRINTF("%ld:%ld", (long)intended_uid,
  (long)user_base->pw_gid);
+kill_by_uid = true;
 rc = 0;
 goto out;
 }
 
 /*
- * We couldn't find QEMU_USER_BASE_RANGE; look for QEMU_USER_SHARED
+ * We couldn't find QEMU_USER_BASE_RANGE; look for
+ * QEMU_USER_SHARED.  NB for QEMU_USER_SHARED, all QEMU will run
+ * as the same UID, we can't kill by uid; therefore don't set uid.
  */
 user = LIBXL_QEMU_USER_SHARED;
 rc = userlookup_helper_getpwnam(gc, user, _pwbuf, _base);
@@ -220,6 +226,7 @@ static int libxl__domain_get_device_model_uid(libxl__gc *gc,
 

[Xen-devel] [PATCH v3 09/11] libxl: Kill QEMU with "reaper" ruid

2018-12-21 Thread George Dunlap
Using kill(-1) to killing an untrusted dm process with the real uid
equal to the dm_uid isn't guaranteed to succeed: the process in
question may be able to kill the reaper process after the setresuid()
and before the kill().

Instead, set the real uid to the QEMU user for domain 0
(QEMU_USER_RANGE_BASE + 0).  The reaper process will still be able to
kill the dm process, but not vice versa.

This, in turn, requires locking to make sure that only one reaper
process is using that uid at a time; otherwise one reaper process may
kill the other reaper process.

Create a lockfile in RUNDIR/dm-reaper-lock, and grab the lock before
executing kill.

In the event that we can't get the lock for some reason, go ahead with
the kill using dm_uid for both real and effective UIDs.  This isn't
guaranteed to work, but it's no worse than not trying to kill the
process at all.

NB that this effectively requires admins using device_model_user to
also define xen_qemuuser_range_base; this will be addressed in
subsequent patches.

Signed-off-by: George Dunlap 
---
v3:
- Have libxl__get_reaper_uid return an error if a suitable uid is not found.
- Explicitly set reaper_uid to dm_uid if get_reaper_lock_and_uid()
  returns an error, rather than relying on reaper_uid being unchanged
  on failure.
- Open the lockfile 0644 rather than 0666.
- Remove bogus comment.

v2:
- Port over previous changes
- libxl__get_reaper_uid() won't set errno, use LOG rather than LOGE.
- Accumulate error and return for all failures
- Move flock() outside of the condition.  Also fix EINTR check (check
  errno rather than return value).
- Add a comment explaining why we return an error even if the kill()
  succeeds
- Move locking to a separate function to minimize gotos
- Refactor libxl__get_reaper_id to take a pointer for reaper_uid;
  return only success/failure.  Also return EINVAL if reaper_uid would
  resolve to 0.
- Handle "reaper_uid not found" specially; note issue with
  device_model_user feature
- Assert that final reaper_uid != 0

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c | 118 +
 1 file changed, 108 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index d265f24864..f7c4e5eb3b 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -259,6 +259,35 @@ out:
 return rc;
 }
 
+/*
+ * Look up "reaper UID".  If present and non-root, returns 0 and sets
+ * reaper_uid.  Otherwise returns libxl-style error.
+ */
+static int libxl__get_reaper_uid(libxl__gc *gc, uid_t *reaper_uid)
+{
+struct passwd *user_base, user_pwbuf;
+int rc;
+
+rc = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
+ _pwbuf, _base);
+if (rc)
+return rc;
+
+if (!user_base) {
+LOG(WARN, "Couldn't find uid for reaper process");
+return ERROR_INVAL;
+}
+
+if (user_base->pw_uid == 0) {
+LOG(ERROR, "UID for reaper process maps to root!");
+return ERROR_INVAL;
+}
+
+*reaper_uid = user_base->pw_uid;
+
+return 0;
+}
+
 const char *libxl__domain_device_model(libxl__gc *gc,
const libxl_domain_build_info *info)
 {
@@ -2834,37 +2863,106 @@ out:
 return;
 }
 
+static int get_reaper_lock_and_uid(libxl__destroy_devicemodel_state *ddms,
+   uid_t *reaper_uid)
+{
+STATE_AO_GC(ddms->ao);
+int domid = ddms->domid;
+int r;
+const char * lockfile;
+int fd;
+
+/* Try to lock the "reaper uid" */
+lockfile = GCSPRINTF("%s/dm-reaper-lock", libxl__run_dir_path());
+
+/*
+ * NB that since we've just forked, we can't have any
+ * threads; so we don't need the libxl__carefd
+ * infrastructure here.
+ */
+fd = open(lockfile, O_RDWR|O_CREAT, 0644);
+if (fd < 0) {
+LOGED(ERROR, domid,
+  "unexpected error while trying to open lockfile %s, errno=%d",
+  lockfile, errno);
+return ERROR_FAIL;
+}
+
+/* Try to lock the file, retrying on EINTR */
+for (;;) {
+r = flock(fd, LOCK_EX);
+if (!r)
+break;
+if (errno != EINTR) {
+/* All other errno: EBADF, EINVAL, ENOLCK, EWOULDBLOCK */
+LOGED(ERROR, domid,
+  "unexpected error while trying to lock %s, fd=%d, errno=%d",
+  lockfile, fd, errno);
+return ERROR_FAIL;
+}
+}
+
+/*
+ * Get reaper_uid.  If we can't find such a uid, return an error.
+ *
+ * FIXME: This means that domain destruction will fail if
+ * device_model_user is set but QEMU_USER_RANGE_BASE doesn't exist.
+ */
+return libxl__get_reaper_uid(gc, reaper_uid);
+}
+
+
 /*
  * Destroy all processes of the given uid by setresuid to the
  * specified uid and kill(-1).  NB this MUST BE CALLED FROM A SEPARATE
- * PROCESS from the normal libxl process.  Returns 

[Xen-devel] [PATCH v3 02/11] libxl: Get rid of support for QEMU_USER_BASE (xen-qemuuser-domidNN)

2018-12-21 Thread George Dunlap
QEMU_USER_BASE allows a user to specify the UID to use when running
the devicemodel for a specific domain number.  Unfortunately, this is
not really practical: It requires nearly 32,000 entries in
/etc/passwd.  QEMU_USER_RANGE_BASE is much more practical.

Remove support for QEMU_USER_BASE.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
NB that I've chosen not to update the xl.cfg man page at this time; it
needs a lot of other updates as well, which would be easier to do all
at once at the end.

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c   | 16 
 tools/libxl/libxl_internal.h |  1 -
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index bbcbc94b6c..6024d4b7b8 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -138,13 +138,6 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 return 0;
 }
 
-user = GCSPRINTF("%s%d", LIBXL_QEMU_USER_BASE, guest_domid);
-ret = userlookup_helper_getpwnam(gc, user, _pwbuf, 0);
-if (ret < 0)
-return ret;
-if (ret > 0)
-goto end_search;
-
 ret = userlookup_helper_getpwnam(gc, LIBXL_QEMU_USER_RANGE_BASE,
  _pwbuf, _base);
 if (ret < 0)
@@ -174,15 +167,14 @@ static int libxl__domain_get_device_model_uid(libxl__gc 
*gc,
 if (ret < 0)
 return ret;
 if (ret > 0) {
-LOGD(WARN, guest_domid, "Could not find user %s%d, falling back to %s",
- LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED);
+LOGD(WARN, guest_domid, "Could not find user %s, falling back to %s",
+ LIBXL_QEMU_USER_RANGE_BASE, LIBXL_QEMU_USER_SHARED);
 goto end_search;
 }
 
 LOGD(ERROR, guest_domid,
- "Could not find user %s%d or %s or range base pseudo-user %s, cannot 
restrict",
- LIBXL_QEMU_USER_BASE, guest_domid, LIBXL_QEMU_USER_SHARED,
- LIBXL_QEMU_USER_RANGE_BASE);
+ "Could not find user %s or range base pseudo-user %s, cannot 
restrict",
+ LIBXL_QEMU_USER_SHARED, LIBXL_QEMU_USER_RANGE_BASE);
 return ERROR_INVAL;
 
 end_search:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index c4a43bd0b7..b147f3803c 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4387,7 +4387,6 @@ _hidden int libxl__read_sysfs_file_contents(libxl__gc *gc,
 int *datalen_r);
 
 #define LIBXL_QEMU_USER_PREFIX "xen-qemuuser"
-#define LIBXL_QEMU_USER_BASE   LIBXL_QEMU_USER_PREFIX"-domid"
 #define LIBXL_QEMU_USER_SHARED LIBXL_QEMU_USER_PREFIX"-shared"
 #define LIBXL_QEMU_USER_RANGE_BASE LIBXL_QEMU_USER_PREFIX"-range-base"
 
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 07/11] libxl: Make killing of device model asynchronous

2018-12-21 Thread George Dunlap
Or at least, give it an asynchronous interface so that we can make it
actually asynchronous in subsequent patches.

Create state structures and callback function signatures.  Add the
state structure to libxl__destroy_domid_state.  Break
libxl__destroy_domid down into two functions.

No functional change intended.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v2:
- Note that libxl__devicemodel_destroy_cb may be called reentrantly

NB that I retain the comment before libxl__destroy_device_model, in
spite of the fact that it looks "pointless", to separate it logically
from the previous prototype.

CC: Ian Jackson 
CC: Wei Liu 
---
 tools/libxl/libxl_dm.c   | 11 +++---
 tools/libxl/libxl_domain.c   | 40 
 tools/libxl/libxl_internal.h | 20 --
 3 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 450433452d..ca59df33fe 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2696,19 +2696,24 @@ out:
 return rc;
 }
 
-int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid)
+void libxl__destroy_device_model(libxl__egc *egc,
+ libxl__destroy_devicemodel_state *ddms)
 {
+STATE_AO_GC(ddms->ao);
 int rc;
+int domid = ddms->domid;
 char *path = DEVICE_MODEL_XS_PATH(gc, LIBXL_TOOLSTACK_DOMID, domid, "");
+
 if (!xs_rm(CTX->xsh, XBT_NULL, path))
 LOGD(ERROR, domid, "xs_rm failed for %s", path);
+
 /* We should try to destroy the device model anyway. */
 rc = kill_device_model(gc,
   GCSPRINTF("/local/domain/%d/image/device-model-pid", domid));
-
+
 libxl__qmp_cleanup(gc, domid);
 
-return rc;
+ddms->callback(egc, ddms, rc);
 }
 
 /* Return 0 if no dm needed, 1 if needed and <0 if error. */
diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index d46b97dedf..0ce1ba1327 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1008,6 +1008,10 @@ static void destroy_finish_check(libxl__egc *egc,
 }
 
 /* Callbacks for libxl__destroy_domid */
+static void dm_destroy_cb(libxl__egc *egc,
+  libxl__destroy_devicemodel_state *ddms,
+  int rc);
+
 static void devices_destroy_cb(libxl__egc *egc,
libxl__devices_remove_state *drs,
int rc);
@@ -1066,16 +1070,18 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
 if (rc < 0) {
 LOGEVD(ERROR, rc, domid, "xc_domain_pause failed");
 }
+
 if (dm_present) {
-if (libxl__destroy_device_model(gc, domid) < 0)
-LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+dis->ddms.ao = ao;
+dis->ddms.domid = domid;
+dis->ddms.callback = dm_destroy_cb;
+
+libxl__destroy_device_model(egc, >ddms);
+return;
+} else {
+dm_destroy_cb(egc, >ddms, 0);
+return;
 }
-dis->drs.ao = ao;
-dis->drs.domid = domid;
-dis->drs.callback = devices_destroy_cb;
-dis->drs.force = 1;
-libxl__devices_destroy(egc, >drs);
-return;
 
 out:
 assert(rc);
@@ -1083,6 +1089,24 @@ out:
 return;
 }
 
+static void dm_destroy_cb(libxl__egc *egc,
+  libxl__destroy_devicemodel_state *ddms,
+  int rc)
+{
+libxl__destroy_domid_state *dis = CONTAINER_OF(ddms, *dis, ddms);
+STATE_AO_GC(dis->ao);
+uint32_t domid = dis->domid;
+
+if (rc < 0)
+LOGD(ERROR, domid, "libxl__destroy_device_model failed");
+
+dis->drs.ao = ao;
+dis->drs.domid = domid;
+dis->drs.callback = devices_destroy_cb;
+dis->drs.force = 1;
+libxl__devices_destroy(egc, >drs);
+}
+
 static void devices_destroy_cb(libxl__egc *egc,
libxl__devices_remove_state *drs,
int rc)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b147f3803c..f9e0bf6578 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1705,8 +1705,6 @@ _hidden int 
libxl__wait_for_device_model_deprecated(libxl__gc *gc,
   void *userdata),
 void *check_callback_userdata);
 
-_hidden int libxl__destroy_device_model(libxl__gc *gc, uint32_t domid);
-
 _hidden const libxl_vnc_info *libxl__dm_vnc(const libxl_domain_config *g_cfg);
 
 _hidden char *libxl__abs_path(libxl__gc *gc, const char *s, const char *path);
@@ -3672,6 +3670,7 @@ extern const struct libxl_device_type *device_type_tbl[];
 
 typedef struct libxl__domain_destroy_state libxl__domain_destroy_state;
 typedef struct libxl__destroy_domid_state libxl__destroy_domid_state;
+typedef struct libxl__destroy_devicemodel_state 
libxl__destroy_devicemodel_state;
 typedef struct libxl__devices_remove_state 

[Xen-devel] [PATCH v3 04/11] dm_depriv: Describe expected usage of device_model_user parameter

2018-12-21 Thread George Dunlap
A number of subsequent patches rely on as-yet undefined behavior for
what the `device_model_user` parameter does.  Rather than implement it
incorrectly (or randomly), or remove the feature, describe an expected
usage for the feature.  Further patches will make decisions based on
this expected usage.

Signed-off-by: George Dunlap 
Acked-by: Ian Jackson 
---
v3:
- Fix a minor typo

v2:
- Remove stale comment about device_model_user not being ready

RFC: As we'll see in a later patch, this implementation is still
incomplete: we need a `reaper` uid from which to kill uids.

CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony Perard 
---
 docs/features/qemu-deprivilege.pandoc | 17 +
 tools/libxl/libxl_types.idl   |  1 -
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/docs/features/qemu-deprivilege.pandoc 
b/docs/features/qemu-deprivilege.pandoc
index f941525189..ce21a60ef7 100644
--- a/docs/features/qemu-deprivilege.pandoc
+++ b/docs/features/qemu-deprivilege.pandoc
@@ -66,6 +66,23 @@ this, create a user named `xen-qemuuser-shared`; for example:
 
 adduser --no-create-home --system xen-qemuuser-shared
 
+A final way to set up a separate process for qemus is to allocate one
+UID per VM, and set the UID in the domain config file with the
+`device_model_user` argument.  For example, suppose you have a VM
+named `c6-01`.  You might do the following:
+
+adduser --system --no-create-home --group xen-qemuuser-c6-01
+
+And then in your config file, the following line:
+
+device_model_user="xen-qemuuser-c6-01"
+
+NOTE: It is important when using `device_model_user` that EACH VM HAVE
+A SEPARATE UID, and that none of these UIDs map to root.  xl will
+throw an error a uid maps to zero, but not if multiple VMs have the
+same uid.  Multiple VMs with the same device model uid will cause
+problems.
+
 ## Domain config changes
 
 The core domain config change is to add the following line to the
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 51cf06a3a2..141c46e42a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -495,7 +495,6 @@ libxl_domain_build_info = Struct("domain_build_info",[
 ("device_model", string),
 ("device_model_ssidref", uint32),
 ("device_model_ssid_label", string),
-# device_model_user is not ready for use yet
 ("device_model_user", string),
 
 # extra parameters pass directly to qemu, NULL terminated
-- 
2.19.2


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Andrew Cooper
On 21/12/2018 16:21, Julien Grall wrote:
>>> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
>>> index 3304921991..1efbc071c5 100644
>>> --- a/xen/include/asm-x86/p2m.h
>>> +++ b/xen/include/asm-x86/p2m.h
>>> @@ -491,18 +491,21 @@ struct page_info *p2m_get_page_from_gfn(struct
>>> p2m_domain *p2m, gfn_t gfn,
>>>   p2m_query_t q);
>>>     static inline struct page_info *get_page_from_gfn(
>>> -    struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
>>> +    struct domain *d, gfn_t gfn, p2m_type_t *t, p2m_query_t q)
>>>   {
>>>   struct page_info *page;
>>> +    mfn_t mfn;
>>>     if ( paging_mode_translate(d) )
>>> -    return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn),
>>> t, NULL, q);
>>> +    return p2m_get_page_from_gfn(p2m_get_hostp2m(d), gfn, t,
>>> NULL, q);
>>>     /* Non-translated guests see 1-1 RAM / MMIO mappings
>>> everywhere */
>>>   if ( t )
>>>   *t = likely(d != dom_io) ? p2m_ram_rw : p2m_mmio_direct;
>>> -    page = mfn_to_page(_mfn(gfn));
>>> -    return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL;
>>> +
>>> +    mfn = _mfn(gfn_x(gfn));
>>> +    page = mfn_to_page(mfn);
>>> +    return mfn_valid(mfn) && get_page(page, d) ? page : NULL;
>>
>> This unfortunately propagates some bad behaviour, because it is not safe
>> to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds.  (In practice
>> it works because mfn_to_page() is just pointer arithmetic.)
>>
>> Pleas can you express this as:
>>
>> return (mfn_valid(mfn) &&
>>  (page = mfn_to_page(mfn), get_page(page, d))) ? page : NULL;
>>
>> which at least gets the order of operations in the correct order from
>> C's point of view.
>>
>> Alternatively, and perhaps easier to follow:
>>
>> if ( !mfn_valid(mfn) )
>>  return NULL;
>>
>> page = mfn_to_page(mfn);
>>
>> return get_page(page, d) ? page : NULL;
>
> I am happy to fix that. However, shouldn't this be handled in a
> separate patch? After all, the code is not worst than it currently is.

I don't think its worthy of a separate patch.  You're touching the code
anyway, so might as well do it all in one go.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v3 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Julien Grall



On 21/12/2018 16:26, Julien Grall wrote:

No functional change intended.

Only reasonable clean-ups are done in this patch. The rest will use _gfn
for the time being.

Signed-off-by: Julien Grall 
Acked-by: Jan Beulich 


Please ignore this patch. I need to rebase it on Andrew's patch.

Cheers,



---
 Changes in v3:
 - Add Jan's acked-by

 Changes in v2:
 - Remove >> PAGE_SHIFT in svm code
 - Fix typo in the e-mail address
 - Small NITs
---
  xen/arch/arm/guestcopy.c |  2 +-
  xen/arch/arm/mm.c|  2 +-
  xen/arch/x86/cpu/vpmu.c  |  2 +-
  xen/arch/x86/domain.c| 12 ++--
  xen/arch/x86/domctl.c|  6 +++---
  xen/arch/x86/hvm/dm.c|  2 +-
  xen/arch/x86/hvm/domain.c|  2 +-
  xen/arch/x86/hvm/hvm.c   |  9 +
  xen/arch/x86/hvm/svm/svm.c   |  8 
  xen/arch/x86/hvm/viridian/time.c |  8 
  xen/arch/x86/hvm/viridian/viridian.c | 16 
  xen/arch/x86/hvm/vmx/vmx.c   |  4 ++--
  xen/arch/x86/hvm/vmx/vvmx.c  | 12 ++--
  xen/arch/x86/mm.c| 24 ++--
  xen/arch/x86/mm/p2m.c|  2 +-
  xen/arch/x86/mm/shadow/hvm.c |  6 +++---
  xen/arch/x86/physdev.c   |  3 ++-
  xen/arch/x86/pv/descriptor-tables.c  |  4 ++--
  xen/arch/x86/pv/emul-priv-op.c   |  6 +++---
  xen/arch/x86/pv/mm.c |  2 +-
  xen/arch/x86/traps.c | 11 ++-
  xen/common/domain.c  |  2 +-
  xen/common/event_fifo.c  | 12 ++--
  xen/common/memory.c  |  4 ++--
  xen/common/tmem_xen.c|  2 +-
  xen/include/asm-arm/p2m.h|  6 +++---
  xen/include/asm-x86/p2m.h| 11 +++
  27 files changed, 95 insertions(+), 85 deletions(-)

diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 7a0f3e9d5f..55892062bb 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -37,7 +37,7 @@ static struct page_info *translate_get_page(copy_info_t info, 
uint64_t addr,
  return get_page_from_gva(info.gva.v, addr,
   write ? GV2M_WRITE : GV2M_READ);
  
-page = get_page_from_gfn(info.gpa.d, paddr_to_pfn(addr), , P2M_ALLOC);

+page = get_page_from_gfn(info.gpa.d, gaddr_to_gfn(addr), , P2M_ALLOC);
  
  if ( !page )

  return NULL;
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 01ae20..340a1d1548 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1252,7 +1252,7 @@ int xenmem_add_to_physmap_one(
  
  /* Take reference to the foreign domain page.

   * Reference will be released in XENMEM_remove_from_physmap */
-page = get_page_from_gfn(od, idx, , P2M_ALLOC);
+page = get_page_from_gfn(od, _gfn(idx), , P2M_ALLOC);
  if ( !page )
  {
  put_pg_owner(od);
diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 8a4f753eae..4d8f153031 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -607,7 +607,7 @@ static int pvpmu_init(struct domain *d, xen_pmu_params_t 
*params)
  struct vcpu *v;
  struct vpmu_struct *vpmu;
  struct page_info *page;
-uint64_t gfn = params->val;
+gfn_t gfn = _gfn(params->val);
  
  if ( (params->vcpu >= d->max_vcpus) || (d->vcpu[params->vcpu] == NULL) )

  return -EINVAL;
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253ff..b462a8513b 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -827,7 +827,7 @@ int arch_set_info_guest(
  unsigned long flags;
  bool compat;
  #ifdef CONFIG_PV
-unsigned long cr3_gfn;
+gfn_t cr3_gfn;
  struct page_info *cr3_page;
  unsigned long cr4;
  int rc = 0;
@@ -1091,9 +1091,9 @@ int arch_set_info_guest(
  set_bit(_VPF_in_reset, >pause_flags);
  
  if ( !compat )

-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]);
+cr3_gfn = _gfn(xen_cr3_to_pfn(c.nat->ctrlreg[3]));
  else
-cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]);
+cr3_gfn = _gfn(compat_cr3_to_pfn(c.cmp->ctrlreg[3]));
  cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
  
  if ( !cr3_page )

@@ -1122,7 +1122,7 @@ int arch_set_info_guest(
  case 0:
  if ( !compat && !VM_ASSIST(d, m2p_strict) &&
   !paging_mode_refcounts(d) )
-fill_ro_mpt(_mfn(cr3_gfn));
+fill_ro_mpt(_mfn(gfn_x(cr3_gfn)));
  break;
  default:
  if ( cr3_page == current->arch.old_guest_table )
@@ -1137,7 +1137,7 @@ int arch_set_info_guest(
  v->arch.guest_table = pagetable_from_page(cr3_page);
  if ( c.nat->ctrlreg[1] )
  {
-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]);
+cr3_gfn = 

Re: [Xen-devel] [PATCH for-4.12 v3 7/8] xenalyze: Build for Both ARM and x86 Platforms

2018-12-21 Thread Julien Grall

Hi,

On 21/12/2018 16:26, Julien Grall wrote:

From: Benjamin Sanda 

Modified to provide building of the xenalyze binary for both ARM and
x86 platforms. The xenalyze binary is now built as part of the BIN
list for both platforms.

Signed-off-by: Benjamin Sanda 
Signed-off-by: Julien Grall 
Acked-by: Wei Liu 
Acked-by: Stefano Stabellini 


I made a typo in the address e-mail. Stefano, could you fix it on commit?

Cheers,



---
 Changes in v3:
 - Add Stefano's acked-by

 Changes in v2:
 - Add Wei's acked-by
---
  tools/xentrace/Makefile | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/xentrace/Makefile b/tools/xentrace/Makefile
index 0bad942bdf..9fb7fc96e7 100644
--- a/tools/xentrace/Makefile
+++ b/tools/xentrace/Makefile
@@ -9,8 +9,7 @@ LDLIBS += $(LDLIBS_libxenevtchn)
  LDLIBS += $(LDLIBS_libxenctrl)
  LDLIBS += $(ARGP_LDFLAGS)
  
-BIN-$(CONFIG_X86) = xenalyze

-BIN  = $(BIN-y)
+BIN  = xenalyze
  SBIN = xentrace xentrace_setsize
  LIBBIN   = xenctx
  SCRIPTS  = xentrace_format



--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 5/8] xen/arm: Allow a privileged domain to map foreign page from DOMID_XEN

2018-12-21 Thread Julien Grall
For auto-translated domain, the only way to map a page to itself is the
using the foreign map API. The current code does not allow mapping page from
special page (such as DOMID_XEN).

As xentrace buffers are shared using DOMID_XEN, it is not possible to use
tracing for Arm.

This could be solved by using the helper get_pg_owner(). This helper will
be able to get a reference on DOMID_XEN and therefore allow mapping for
privileged domain.

This patch replace the call to rcu_lock_domain_by_any_id() with
get_pg_owner(). For consistency, all the call to rcu_unlock_domain are
replaced by put_pg_owner().

Signed-off-by: Julien grall 
Reviewed-by: Andrii Anisov 
Reviewed-by: Stefano Stabellini 

---
Changes in v3:
- Add Stefano's reviewed-by
- Fix typoes

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/mm.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 3bf11eec4f..01ae20 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1233,20 +1233,20 @@ int xenmem_add_to_physmap_one(
 struct domain *od;
 p2m_type_t p2mt;
 
-od = rcu_lock_domain_by_any_id(extra.foreign_domid);
+od = get_pg_owner(extra.foreign_domid);
 if ( od == NULL )
 return -ESRCH;
 
 if ( od == d )
 {
-rcu_unlock_domain(od);
+put_pg_owner(od);
 return -EINVAL;
 }
 
 rc = xsm_map_gmfn_foreign(XSM_TARGET, d, od);
 if ( rc )
 {
-rcu_unlock_domain(od);
+put_pg_owner(od);
 return rc;
 }
 
@@ -1255,7 +1255,7 @@ int xenmem_add_to_physmap_one(
 page = get_page_from_gfn(od, idx, , P2M_ALLOC);
 if ( !page )
 {
-rcu_unlock_domain(od);
+put_pg_owner(od);
 return -EINVAL;
 }
 
@@ -1264,13 +1264,13 @@ int xenmem_add_to_physmap_one(
 else
 {
 put_page(page);
-rcu_unlock_domain(od);
+put_pg_owner(od);
 return -EINVAL;
 }
 
 mfn = page_to_mfn(page);
 
-rcu_unlock_domain(od);
+put_pg_owner(od);
 break;
 }
 case XENMAPSPACE_dev_mmio:
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 6/8] xen/arm: Initialize trace buffer

2018-12-21 Thread Julien Grall
From: Benjamin Sanda 

Now that we allow a privileged domain to map tracing buffer, initialize
them so a user can effectively trace Xen.

Signed-off-by: Benjamin Sanda 
[julien: rework commit message]
Signed-off-by: Julien Grall 
Reviewed-by: Andrii Anisov 
Acked-by: Stefano Stabellini 

---
Changes in v3:
- Add Stefano's acked-by

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/setup.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index fb923cdf67..444857a967 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -899,6 +900,8 @@ void __init start_xen(unsigned long boot_phys_offset,
 
 heap_init_late();
 
+init_trace_bufs();
+
 init_constructors();
 
 console_endboot();
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 3/8] xen/arm: Add support for read-only foreign mappings

2018-12-21 Thread Julien Grall
Currently, foreign mappings can only be read-write. A follow-up patch will
extend foreign mapping for Xen backend memory (via XEN_DOMID), some of
that memory should only be read accessible for the mapping domain.

Introduce a new p2m_type to cater read-only foreign mappings. For now,
the decision between the two foreign mapping type is based on the type
of the guest page mapped.

Signed-off-by: Julien Grall 

---

Cc: Andrii Anisov 

Changes in v3:
- Remove Andrii's reviewed-by
- Move out the XEN_DOMID code in a separate patch
- Make the new addition future-proof

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/mm.c | 5 +++--
 xen/arch/arm/p2m.c| 1 +
 xen/include/asm-arm/p2m.h | 9 +++--
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 7193d83b44..3bf11eec4f 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1259,7 +1259,9 @@ int xenmem_add_to_physmap_one(
 return -EINVAL;
 }
 
-if ( !p2m_is_ram(p2mt) )
+if ( p2m_is_ram(p2mt) )
+t = (p2mt == p2m_ram_rw) ? p2m_map_foreign_rw : p2m_map_foreign_ro;
+else
 {
 put_page(page);
 rcu_unlock_domain(od);
@@ -1267,7 +1269,6 @@ int xenmem_add_to_physmap_one(
 }
 
 mfn = page_to_mfn(page);
-t = p2m_map_foreign_rw;
 
 rcu_unlock_domain(od);
 break;
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 89279fb590..1e7c91e39a 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -478,6 +478,7 @@ static void p2m_set_permission(lpae_t *e, p2m_type_t t, 
p2m_access_t a)
 break;
 
 case p2m_iommu_map_ro:
+case p2m_map_foreign_ro:
 case p2m_grant_map_ro:
 case p2m_invalid:
 e->p2m.xn = 1;
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index a1aef7b793..a03a033a05 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -116,6 +116,7 @@ typedef enum {
 p2m_mmio_direct_nc, /* Read/write mapping of genuine MMIO area 
non-cacheable */
 p2m_mmio_direct_c,  /* Read/write mapping of genuine MMIO area cacheable */
 p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */
+p2m_map_foreign_ro, /* Read-only RAM pages from foreign domain */
 p2m_grant_map_rw,   /* Read/write grant mapping */
 p2m_grant_map_ro,   /* Read-only grant mapping */
 /* The types below are only used to decide the page attribute in the P2M */
@@ -135,12 +136,16 @@ typedef enum {
 #define P2M_GRANT_TYPES (p2m_to_mask(p2m_grant_map_rw) |  \
  p2m_to_mask(p2m_grant_map_ro))
 
+/* Foreign mappings types */
+#define P2M_FOREIGN_TYPES (p2m_to_mask(p2m_map_foreign_rw) | \
+   p2m_to_mask(p2m_map_foreign_ro))
+
 /* Useful predicates */
 #define p2m_is_ram(_t) (p2m_to_mask(_t) & P2M_RAM_TYPES)
-#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign_rw))
+#define p2m_is_foreign(_t) (p2m_to_mask(_t) & P2M_FOREIGN_TYPES)
 #define p2m_is_any_ram(_t) (p2m_to_mask(_t) &   \
 (P2M_RAM_TYPES | P2M_GRANT_TYPES |  \
- p2m_to_mask(p2m_map_foreign_rw)))
+ P2M_FOREIGN_TYPES))
 
 /* All common type definitions should live ahead of this inclusion. */
 #ifdef _XEN_P2M_COMMON_H
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Julien Grall
No functional change intended.

Only reasonable clean-ups are done in this patch. The rest will use _gfn
for the time being.

Signed-off-by: Julien Grall 
Acked-by: Jan Beulich 

---
Changes in v3:
- Add Jan's acked-by

Changes in v2:
- Remove >> PAGE_SHIFT in svm code
- Fix typo in the e-mail address
- Small NITs
---
 xen/arch/arm/guestcopy.c |  2 +-
 xen/arch/arm/mm.c|  2 +-
 xen/arch/x86/cpu/vpmu.c  |  2 +-
 xen/arch/x86/domain.c| 12 ++--
 xen/arch/x86/domctl.c|  6 +++---
 xen/arch/x86/hvm/dm.c|  2 +-
 xen/arch/x86/hvm/domain.c|  2 +-
 xen/arch/x86/hvm/hvm.c   |  9 +
 xen/arch/x86/hvm/svm/svm.c   |  8 
 xen/arch/x86/hvm/viridian/time.c |  8 
 xen/arch/x86/hvm/viridian/viridian.c | 16 
 xen/arch/x86/hvm/vmx/vmx.c   |  4 ++--
 xen/arch/x86/hvm/vmx/vvmx.c  | 12 ++--
 xen/arch/x86/mm.c| 24 ++--
 xen/arch/x86/mm/p2m.c|  2 +-
 xen/arch/x86/mm/shadow/hvm.c |  6 +++---
 xen/arch/x86/physdev.c   |  3 ++-
 xen/arch/x86/pv/descriptor-tables.c  |  4 ++--
 xen/arch/x86/pv/emul-priv-op.c   |  6 +++---
 xen/arch/x86/pv/mm.c |  2 +-
 xen/arch/x86/traps.c | 11 ++-
 xen/common/domain.c  |  2 +-
 xen/common/event_fifo.c  | 12 ++--
 xen/common/memory.c  |  4 ++--
 xen/common/tmem_xen.c|  2 +-
 xen/include/asm-arm/p2m.h|  6 +++---
 xen/include/asm-x86/p2m.h| 11 +++
 27 files changed, 95 insertions(+), 85 deletions(-)

diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index 7a0f3e9d5f..55892062bb 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -37,7 +37,7 @@ static struct page_info *translate_get_page(copy_info_t info, 
uint64_t addr,
 return get_page_from_gva(info.gva.v, addr,
  write ? GV2M_WRITE : GV2M_READ);
 
-page = get_page_from_gfn(info.gpa.d, paddr_to_pfn(addr), , P2M_ALLOC);
+page = get_page_from_gfn(info.gpa.d, gaddr_to_gfn(addr), , P2M_ALLOC);
 
 if ( !page )
 return NULL;
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 01ae20..340a1d1548 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1252,7 +1252,7 @@ int xenmem_add_to_physmap_one(
 
 /* Take reference to the foreign domain page.
  * Reference will be released in XENMEM_remove_from_physmap */
-page = get_page_from_gfn(od, idx, , P2M_ALLOC);
+page = get_page_from_gfn(od, _gfn(idx), , P2M_ALLOC);
 if ( !page )
 {
 put_pg_owner(od);
diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 8a4f753eae..4d8f153031 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -607,7 +607,7 @@ static int pvpmu_init(struct domain *d, xen_pmu_params_t 
*params)
 struct vcpu *v;
 struct vpmu_struct *vpmu;
 struct page_info *page;
-uint64_t gfn = params->val;
+gfn_t gfn = _gfn(params->val);
 
 if ( (params->vcpu >= d->max_vcpus) || (d->vcpu[params->vcpu] == NULL) )
 return -EINVAL;
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253ff..b462a8513b 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -827,7 +827,7 @@ int arch_set_info_guest(
 unsigned long flags;
 bool compat;
 #ifdef CONFIG_PV
-unsigned long cr3_gfn;
+gfn_t cr3_gfn;
 struct page_info *cr3_page;
 unsigned long cr4;
 int rc = 0;
@@ -1091,9 +1091,9 @@ int arch_set_info_guest(
 set_bit(_VPF_in_reset, >pause_flags);
 
 if ( !compat )
-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]);
+cr3_gfn = _gfn(xen_cr3_to_pfn(c.nat->ctrlreg[3]));
 else
-cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]);
+cr3_gfn = _gfn(compat_cr3_to_pfn(c.cmp->ctrlreg[3]));
 cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
 
 if ( !cr3_page )
@@ -1122,7 +1122,7 @@ int arch_set_info_guest(
 case 0:
 if ( !compat && !VM_ASSIST(d, m2p_strict) &&
  !paging_mode_refcounts(d) )
-fill_ro_mpt(_mfn(cr3_gfn));
+fill_ro_mpt(_mfn(gfn_x(cr3_gfn)));
 break;
 default:
 if ( cr3_page == current->arch.old_guest_table )
@@ -1137,7 +1137,7 @@ int arch_set_info_guest(
 v->arch.guest_table = pagetable_from_page(cr3_page);
 if ( c.nat->ctrlreg[1] )
 {
-cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]);
+cr3_gfn = _gfn(xen_cr3_to_pfn(c.nat->ctrlreg[1]));
 cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
 
 if ( !cr3_page )
@@ -1162,7 +1162,7 @@ int arch_set_info_guest(
 

[Xen-devel] [PATCH for-4.12 v3 2/8] xen/arm: Rename p2m_map_foreign to p2m_map_foreign_rw

2018-12-21 Thread Julien Grall
A follow-up patch will introduce another type of foreign mapping. Rename
the type to make clear it is only used for read-write mapping.

No functional changes intended.

Signed-off-by: Julien Grall 
Reviewed-by: Andrii Anisov 
Acked-by: Stefano Stabellini 

---
Changes in v3
- Add Stefano's acked-by

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/mm.c | 2 +-
 xen/arch/arm/p2m.c| 2 +-
 xen/include/asm-arm/p2m.h | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index d96a6655ee..7193d83b44 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1267,7 +1267,7 @@ int xenmem_add_to_physmap_one(
 }
 
 mfn = page_to_mfn(page);
-t = p2m_map_foreign;
+t = p2m_map_foreign_rw;
 
 rcu_unlock_domain(od);
 break;
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 7ae5b29699..89279fb590 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -468,7 +468,7 @@ static void p2m_set_permission(lpae_t *e, p2m_type_t t, 
p2m_access_t a)
 break;
 
 case p2m_iommu_map_rw:
-case p2m_map_foreign:
+case p2m_map_foreign_rw:
 case p2m_grant_map_rw:
 case p2m_mmio_direct_dev:
 case p2m_mmio_direct_nc:
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 4db8e8709d..a1aef7b793 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -115,7 +115,7 @@ typedef enum {
 p2m_mmio_direct_dev,/* Read/write mapping of genuine Device MMIO area */
 p2m_mmio_direct_nc, /* Read/write mapping of genuine MMIO area 
non-cacheable */
 p2m_mmio_direct_c,  /* Read/write mapping of genuine MMIO area cacheable */
-p2m_map_foreign,/* Ram pages from foreign domain */
+p2m_map_foreign_rw, /* Read/write RAM pages from foreign domain */
 p2m_grant_map_rw,   /* Read/write grant mapping */
 p2m_grant_map_ro,   /* Read-only grant mapping */
 /* The types below are only used to decide the page attribute in the P2M */
@@ -137,10 +137,10 @@ typedef enum {
 
 /* Useful predicates */
 #define p2m_is_ram(_t) (p2m_to_mask(_t) & P2M_RAM_TYPES)
-#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign))
+#define p2m_is_foreign(_t) (p2m_to_mask(_t) & p2m_to_mask(p2m_map_foreign_rw))
 #define p2m_is_any_ram(_t) (p2m_to_mask(_t) &   \
 (P2M_RAM_TYPES | P2M_GRANT_TYPES |  \
- p2m_to_mask(p2m_map_foreign)))
+ p2m_to_mask(p2m_map_foreign_rw)))
 
 /* All common type definitions should live ahead of this inclusion. */
 #ifdef _XEN_P2M_COMMON_H
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 7/8] xenalyze: Build for Both ARM and x86 Platforms

2018-12-21 Thread Julien Grall
From: Benjamin Sanda 

Modified to provide building of the xenalyze binary for both ARM and
x86 platforms. The xenalyze binary is now built as part of the BIN
list for both platforms.

Signed-off-by: Benjamin Sanda 
Signed-off-by: Julien Grall 
Acked-by: Wei Liu 
Acked-by: Stefano Stabellini 

---
Changes in v3:
- Add Stefano's acked-by

Changes in v2:
- Add Wei's acked-by
---
 tools/xentrace/Makefile | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/xentrace/Makefile b/tools/xentrace/Makefile
index 0bad942bdf..9fb7fc96e7 100644
--- a/tools/xentrace/Makefile
+++ b/tools/xentrace/Makefile
@@ -9,8 +9,7 @@ LDLIBS += $(LDLIBS_libxenevtchn)
 LDLIBS += $(LDLIBS_libxenctrl)
 LDLIBS += $(ARGP_LDFLAGS)
 
-BIN-$(CONFIG_X86) = xenalyze
-BIN  = $(BIN-y)
+BIN  = xenalyze
 SBIN = xentrace xentrace_setsize
 LIBBIN   = xenctx
 SCRIPTS  = xentrace_format
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 1/8] xen/arm: p2m: Introduce p2m_get_page_from_gfn

2018-12-21 Thread Julien Grall
In a follow-up patch, we will need to handle get_page_from_gfn
differently for DOMID_XEN. To keep the code simple move the current
content in a new separate helper p2m_get_page_from_gfn.

Note the new helper is not anymore a static inline function as the helper
is quite complex.

Finally, take the opportunity to use typesafe gfn as the change is
minor.

Signed-off-by: Julien Grall 
Reviewed-by: Andrii Anisov 
Reviewed-by: Stefano Stabellini 

---
Changes in v3:
- Fix typeos
- Add Stefano's reviewed-by
- Fix coding style

Changes in v2:
- Add Andrii's reviewed-by
---
 xen/arch/arm/p2m.c| 33 +
 xen/include/asm-arm/p2m.h | 33 -
 2 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 2b5e43f50a..7ae5b29699 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -406,6 +406,39 @@ mfn_t p2m_lookup(struct domain *d, gfn_t gfn, p2m_type_t 
*t)
 return mfn;
 }
 
+struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn,
+p2m_type_t *t)
+{
+struct page_info *page;
+p2m_type_t p2mt;
+mfn_t mfn = p2m_lookup(d, gfn, );
+
+if ( t )
+*t = p2mt;
+
+if ( !p2m_is_any_ram(p2mt) )
+return NULL;
+
+if ( !mfn_valid(mfn) )
+return NULL;
+
+page = mfn_to_page(mfn);
+
+/*
+ * get_page won't work on foreign mapping because the page doesn't
+ * belong to the current domain.
+ */
+if ( p2m_is_foreign(p2mt) )
+{
+struct domain *fdom = page_get_owner_and_reference(page);
+ASSERT(fdom != NULL);
+ASSERT(fdom != d);
+return page;
+}
+
+return get_page(page, d) ? page : NULL;
+}
+
 int guest_physmap_mark_populate_on_demand(struct domain *d,
   unsigned long gfn,
   unsigned int order)
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 01cd3ee4b5..4db8e8709d 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -289,38 +289,13 @@ typedef unsigned int p2m_query_t;
 #define P2M_ALLOC(1u<<0)   /* Populate PoD and paged-out entries */
 #define P2M_UNSHARE  (1u<<1)   /* Break CoW sharing */
 
+struct page_info *p2m_get_page_from_gfn(struct domain *d, gfn_t gfn,
+p2m_type_t *t);
+
 static inline struct page_info *get_page_from_gfn(
 struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
 {
-struct page_info *page;
-p2m_type_t p2mt;
-mfn_t mfn = p2m_lookup(d, _gfn(gfn), );
-
-if (t)
-*t = p2mt;
-
-if ( !p2m_is_any_ram(p2mt) )
-return NULL;
-
-if ( !mfn_valid(mfn) )
-return NULL;
-page = mfn_to_page(mfn);
-
-/*
- * get_page won't work on foreign mapping because the page doesn't
- * belong to the current domain.
- */
-if ( p2m_is_foreign(p2mt) )
-{
-struct domain *fdom = page_get_owner_and_reference(page);
-ASSERT(fdom != NULL);
-ASSERT(fdom != d);
-return page;
-}
-
-if ( !get_page(page, d) )
-return NULL;
-return page;
+return p2m_get_page_from_gfn(d, _gfn(gfn), t);
 }
 
 int get_page_type(struct page_info *page, unsigned long type);
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 0/8] xen/arm: Add xentrace support

2018-12-21 Thread Julien Grall
Hi all,

This patch series is a rework of the series sent by Benjamin Sanda in April
2016 [1]. It finally adds support for xentrace/xenanalyze on Arm.

Cheers,

[1] https://lists.xenproject.org/archives/html/xen-devel/2016-04/msg00464.html

*** BLURB HERE ***

Benjamin Sanda (2):
  xen/arm: Initialize trace buffer
  xenalyze: Build for Both ARM and x86 Platforms

Julien Grall (6):
  xen/arm: p2m: Introduce p2m_get_page_from_gfn
  xen/arm: Rename p2m_map_foreign to p2m_map_foreign_rw
  xen/arm: Add support for read-only foreign mappings
  xen/arm: Make get_page_from_gfn working with DOMID_XEN
  xen/arm: Allow a privileged domain to map foreign page from DOMID_XEN
  xen: Switch parameter in get_page_from_gfn to use typesafe gfn

 tools/xentrace/Makefile  |  3 +-
 xen/arch/arm/guestcopy.c |  2 +-
 xen/arch/arm/mm.c| 19 ++--
 xen/arch/arm/p2m.c   | 36 ++-
 xen/arch/arm/setup.c |  3 ++
 xen/arch/x86/cpu/vpmu.c  |  2 +-
 xen/arch/x86/domain.c| 12 
 xen/arch/x86/domctl.c|  6 ++--
 xen/arch/x86/hvm/dm.c|  2 +-
 xen/arch/x86/hvm/domain.c|  2 +-
 xen/arch/x86/hvm/hvm.c   |  9 +++---
 xen/arch/x86/hvm/svm/svm.c   |  8 ++---
 xen/arch/x86/hvm/viridian/time.c |  8 ++---
 xen/arch/x86/hvm/viridian/viridian.c | 16 +-
 xen/arch/x86/hvm/vmx/vmx.c   |  4 +--
 xen/arch/x86/hvm/vmx/vvmx.c  | 12 
 xen/arch/x86/mm.c| 24 ---
 xen/arch/x86/mm/p2m.c|  2 +-
 xen/arch/x86/mm/shadow/hvm.c |  6 ++--
 xen/arch/x86/physdev.c   |  3 +-
 xen/arch/x86/pv/descriptor-tables.c  |  4 +--
 xen/arch/x86/pv/emul-priv-op.c   |  6 ++--
 xen/arch/x86/pv/mm.c |  2 +-
 xen/arch/x86/traps.c | 11 +++
 xen/common/domain.c  |  2 +-
 xen/common/event_fifo.c  | 12 
 xen/common/memory.c  |  4 +--
 xen/common/tmem_xen.c|  2 +-
 xen/include/asm-arm/p2m.h| 57 +---
 xen/include/asm-x86/p2m.h| 11 ---
 30 files changed, 174 insertions(+), 116 deletions(-)

-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH for-4.12 v3 4/8] xen/arm: Make get_page_from_gfn working with DOMID_XEN

2018-12-21 Thread Julien Grall
DOMID_XEN is used to share pages beloging to the hypervisor
(e.g trace buffers). Unlike other domains, DOMID_XEN is a non-auto
translated domain and therefore does not have a P2M.

This patch adds a special case for DOMID_XEN in get_page_from_gfn. We
may want to provide "non-auto translated helpers" in the future if we
see more case.

Signed-off-by: Julien Grall 

---
Changes in v3:
- Split from "xen/arm: Add support for read-only foreign
mappings"
- Use likely rather than unlikely
- Fix typoes
---
 xen/include/asm-arm/p2m.h | 33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index a03a033a05..041dea827c 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -300,7 +300,38 @@ struct page_info *p2m_get_page_from_gfn(struct domain *d, 
gfn_t gfn,
 static inline struct page_info *get_page_from_gfn(
 struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
 {
-return p2m_get_page_from_gfn(d, _gfn(gfn), t);
+mfn_t mfn;
+p2m_type_t _t;
+struct page_info *page;
+
+/*
+ * Special case for DOMID_XEN as it is the only domain so far that is
+ * not auto-translated.
+ */
+if ( likely(d != dom_xen) )
+return p2m_get_page_from_gfn(d, _gfn(gfn), t);
+
+if ( !t )
+t = &_t;
+
+*t = p2m_invalid;
+
+/*
+ * DOMID_XEN sees 1-1 RAM. The p2m_type is based on the type of the
+ * page.
+ */
+mfn = _mfn(gfn);
+page = mfn_to_page(mfn);
+
+if ( !mfn_valid(mfn) || !get_page(page, d) )
+return NULL;
+
+if ( page->u.inuse.type_info & PGT_writable_page )
+*t = p2m_ram_rw;
+else
+*t = p2m_ram_ro;
+
+return page;
 }
 
 int get_page_type(struct page_info *page, unsigned long type);
-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 03/15] x86/cpu/vpmu: Add Hygon Dhyana support for vPMU

2018-12-21 Thread Pu Wen
On 2018/12/21 21:34, Boris Ostrovsky wrote:
> On 12/21/18 5:02 AM, Pu Wen wrote:
>> On 2018/12/20 22:25, Boris Ostrovsky wrote:
>> ...
 diff --git a/xen/arch/x86/cpu/vpmu_amd.c b/xen/arch/x86/cpu/vpmu_amd.c
 index 5efc39b..e9f0a5c 100644
 --- a/xen/arch/x86/cpu/vpmu_amd.c
 +++ b/xen/arch/x86/cpu/vpmu_amd.c
 @@ -554,6 +554,8 @@ int __init amd_vpmu_init(void)
case 0x12:
case 0x14:
case 0x16:
 +case 0x17:
 +case 0x18:
>>>
>>> This also enables VPMU support for Zen which goes beyond what the
>>> commit message claims to do.
>> Sorry for the not clear commit message. Will add modification description
>> in the commit message and make the changes complete.
>>
>> On the other hand, since current Xen vPMU still not support Zen. so in
>> this patch we enable 0x17 support. If this modification is not preferred,
>> will remove AMD Xen 0x17 support in next version.
> 
> Enabling 0x17 should be fine, I just thought commit message should be
> explicit about that.

OK, will explicit describe the enabling of 0x17 in the commit message in next
version patch set. Thanks for the suggestion.

>>> Also, why are you choosing to use legacy MSRs (and you did the same in
>>> Linux)? Doesn't Zen (which you are saying is similar to Hygon) support
>>> c001_020X bank?
>> In Linux, the Xen PMU driver use the default branch cases, which also use
>> the legacy MSRs way. So we choose to follow legacy MSRs here in Dhyana
>> cases.
>>
>> Since both of Zen and Dhyana support C001_020X MSRs. If use the C001_020X
>> is preferred, we will try to modify the related codes and create a patch.
> 
> 
> I don't have a Zen box available right now but from what I can see 0x17
> counters are compatible with 0x15 so I think switching to C001_020X
> should work. And looks like you are using those in Linux (non-Xen part) too.
>> Also the Linux Xen PMU driver may need to be updated to use these MSRs.
> 
> Yes, although Linux part is used only by PV guests.

Yes, I have tested the MSRs of the 0x15 ones by booting a Dom0 PV guest.
It works even when the Linux Xen PMU driver use the legacy MSRs. I'll test the
PV guest by using C001_020X in the Linux Xen part tomorrow.

Thx.

-- 
Regards,
Pu Wen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 v2 8/8] xen: Switch parameter in get_page_from_gfn to use typesafe gfn

2018-12-21 Thread Julien Grall

Hi Andrew,

On 21/12/2018 14:14, Andrew Cooper wrote:

On 20/12/2018 19:23, Julien Grall wrote:

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 32dc4253ff..b462a8513b 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -827,7 +827,7 @@ int arch_set_info_guest(
  unsigned long flags;
  bool compat;
  #ifdef CONFIG_PV
-unsigned long cr3_gfn;
+gfn_t cr3_gfn;


I've sent an alternative patch which this patch should be rebased over,
at which point all modifications to arch_set_info_guest() should
hopefully disappear.


The rest of the series should be merged by end of today (Code freeze for Xen 
Arm). So I will resend this patch separately after my holidays.





diff --git a/xen/arch/x86/hvm/domain.c b/xen/arch/x86/hvm/domain.c
index 5d5a746a25..73d2da8441 100644
--- a/xen/arch/x86/hvm/domain.c
+++ b/xen/arch/x86/hvm/domain.c
@@ -297,7 +297,7 @@ int arch_set_info_hvm_guest(struct vcpu *v, const 
vcpu_hvm_context_t *ctx)
  {
  /* Shadow-mode CR3 change. Check PDBR and update refcounts. */
  struct page_info *page = get_page_from_gfn(v->domain,
- v->arch.hvm.guest_cr[3] >> PAGE_SHIFT,
+ gaddr_to_gfn(v->arch.hvm.guest_cr[3]),
   NULL, P2M_ALLOC);


Can you re-indent while modifying this please?


Sure.




diff --git a/xen/arch/x86/hvm/viridian/time.c b/xen/arch/x86/hvm/viridian/time.c
index 840a82b457..a718434456 100644
--- a/xen/arch/x86/hvm/viridian/time.c
+++ b/xen/arch/x86/hvm/viridian/time.c
@@ -38,16 +38,16 @@ static void dump_reference_tsc(const struct domain *d)
  
  static void update_reference_tsc(struct domain *d, bool initialize)

  {
-unsigned long gmfn = d->arch.hvm.viridian.reference_tsc.fields.pfn;
-struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
+gfn_t gfn = _gfn(d->arch.hvm.viridian.reference_tsc.fields.pfn);
+struct page_info *page = get_page_from_gfn(d, gfn, NULL, P2M_ALLOC);
  HV_REFERENCE_TSC_PAGE *p;
  
  if ( !page || !get_page_type(page, PGT_writable_page) )

  {
  if ( page )
  put_page(page);
-gdprintk(XENLOG_WARNING, "Bad GMFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n",
- gmfn, mfn_x(page ? page_to_mfn(page) : INVALID_MFN));
+gdprintk(XENLOG_WARNING, "Bad GFN %#"PRI_gfn" (MFN %#"PRI_mfn")\n",


The canonical format for gfns and mfns are just %"PRI_*, without the #

Do you mind fixing this seeing as you're changing the string anyway?


Sure.




diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 3304921991..1efbc071c5 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -491,18 +491,21 @@ struct page_info *p2m_get_page_from_gfn(struct p2m_domain 
*p2m, gfn_t gfn,
  p2m_query_t q);
  
  static inline struct page_info *get_page_from_gfn(

-struct domain *d, unsigned long gfn, p2m_type_t *t, p2m_query_t q)
+struct domain *d, gfn_t gfn, p2m_type_t *t, p2m_query_t q)
  {
  struct page_info *page;
+mfn_t mfn;
  
  if ( paging_mode_translate(d) )

-return p2m_get_page_from_gfn(p2m_get_hostp2m(d), _gfn(gfn), t, NULL, 
q);
+return p2m_get_page_from_gfn(p2m_get_hostp2m(d), gfn, t, NULL, q);
  
  /* Non-translated guests see 1-1 RAM / MMIO mappings everywhere */

  if ( t )
  *t = likely(d != dom_io) ? p2m_ram_rw : p2m_mmio_direct;
-page = mfn_to_page(_mfn(gfn));
-return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL;
+
+mfn = _mfn(gfn_x(gfn));
+page = mfn_to_page(mfn);
+return mfn_valid(mfn) && get_page(page, d) ? page : NULL;


This unfortunately propagates some bad behaviour, because it is not safe
to use mfn_to_page(mfn); before mfn_valid(mfn) succeeds.  (In practice
it works because mfn_to_page() is just pointer arithmetic.)

Pleas can you express this as:

return (mfn_valid(mfn) &&
     (page = mfn_to_page(mfn), get_page(page, d))) ? page : NULL;

which at least gets the order of operations in the correct order from
C's point of view.

Alternatively, and perhaps easier to follow:

if ( !mfn_valid(mfn) )
     return NULL;

page = mfn_to_page(mfn);

return get_page(page, d) ? page : NULL;


I am happy to fix that. However, shouldn't this be handled in a separate patch? 
After all, the code is not worst than it currently is.


Cheers,



~Andrew



--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 14/14] libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 14/14] libxl: Re-implement 
domain_suspend_device_model using libxl__ev_qmp"):
> The re-implementation is done because we want to be able to send the
> file description that QEMU can use to save its state. When QEMU is
> restricted, it would not be able to write to a path.
> 
> This replace both libxl__qmp_stop() and libxl__qmp_save().
> 
> qmp_qemu_check_version() was only used by libxl__qmp_save(), so it is
> replace by a version using libxl__ev_qmp instead.
> 
> Coding style fixed in libxl__domain_suspend_device_model() for the
> return value.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 00/14] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_*

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 00/14] libxl: Enable save/restore/migration 
of a restricted QEMU + libxl__ev_qmp_*"):
> Patch series available in this git branch:
> https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git 
> br.libxl-ev-qmp-v

Thanks.  Sorry for being so slow to review this.  It is very nearly
ready and I hope it will make 4.12.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 13/14] libxl: Change libxl__domain_suspend_device_model() to be async

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 13/14] libxl: Change 
libxl__domain_suspend_device_model() to be async"):
> This create an extra step for the two call sites of the function.
> 
> libxl__domain_suspend_device_model() in this patch gets an extra error
> variable (there is ret and rc), but ret goes away in the next patch.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 1/2] xen/dom0: Improve documentation for dom0= and dom0-iommu=

2018-12-21 Thread Roger Pau Monné
On Fri, Dec 21, 2018 at 01:13:25PM +, Andrew Cooper wrote:
> On 21/12/2018 12:08, Roger Pau Monné wrote:
> > On Thu, Dec 20, 2018 at 11:40:51PM +, Andrew Cooper wrote:
> >> Update to the latest metadata style, and expand each of the clauses with 
> >> more
> >> information, including applicable CONFIG_* options.
> >>
> >> Drop the redundant comment beside parse_dom0_param(), to avoid it getting 
> >> out
> >> of sync with the main documentation.
> >>
> >> Signed-off-by: Andrew Cooper 
> > Thanks! A couple of fixes below, because the original text is actually
> > wrong...
> 
> TBH, that is my default assumption every time I do work like this :)

In this case it's my fault :(, because I changed the code and forgot
about the docs.

> >
> >> ---
> >> CC: Jan Beulich 
> >> CC: Wei Liu 
> >> CC: Roger Pau Monné 
> >> CC: Stefano Stabellini 
> >> CC: Julien Grall 
> >>
> >> Please double check for correctness.  The text matches my
> >> understanding/reading of the code, but some of it is rather subtle going.
> >>
> >> It occurs to me that:
> >>
> >>  * The choice of dom0 boot mode should in part be derived from the 
> >> available
> >>CONFIG_* options, and ELF notes advertised in the dom0 kernel.
> > This is indeed doable, but would require parsing the dom0 kernel
> > before building the domain.
> 
> I don't see anything wrong with parsing the ELF headers ahead of
> building the domain.  From the overall boot time, its just an
> order-of-operations issue.

Oh yes, I didn't mean my comment to sound like criticism. I agree
there should be no issues in parsing the ELF earlier, or if there are
issues they should be fixed.

> >
> >>  * AMD probably needs to gain an `ivmd=` to mirror `rmrr=` on the Intel 
> >> side,
> >>because we know there are other errors in the IVRS table.
> > Yes, albeit using rmrr is quite cumbersome because it's mostly a
> > trial-and-error process until there are no more iommu faults (unless
> > you can get the correct rmrr command for your hardware somewhere).
> >
> >>  * Neither of map-{inclusive,reserved} should be active by default, even on
> >>Intel hardware, and we should (wherever possible) have quirks like we 
> >> have
> >>for all other firmware screwups.  Requiring the user to diagnose/work
> >>around firmware problems like this is quite rude.
> > That would indeed be nice, but I think there are too many vendor
> > firmware versions to be able to correctly identify such quirks, the
> > more that vendors don't even list missing RMRR as erratum.
> 
> I don't agree.  We already have quirks based on DMI (at the moment,
> mainly for reboot overrides), and the vast majority of the offending
> cases are the BMC shared mailbox, which will be in a fixed per-platform
> location.

IIRC I've only found a single box that worked without map-reserved,
and that's my NUC which has firmware from Intel. And even in that case
the USB ports weren't fully working.

I guess such quirks could be applied based on the chipset version then
if Xen realizes the firmware is either wrong or missing obvious RMRR
regions?

> I don't expect we'll ever find and fix all quirks, but where we do find
> suitable ones, we should put them into the boot code.

Sadly I agree. What I'm worried about is turning the default
map-{inclusive/reserved} to off, that's likely to make dom0 unable to
boot on a huge amount of hardware.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 12/14] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 12/14] libxl_qmp: Store advertised QEMU 
version in libxl__ev_qmp"):
> This will be used in a later patch.

Acked-by: Ian Jackson 

> +o = libxl__json_map_get("QMP", resp, JSON_MAP);
> +o = libxl__json_map_get("version", o, JSON_MAP);
> +o = libxl__json_map_get("qemu", o, JSON_MAP);
> +#define GRAB_VERSION(level) do { \
> +ev->qemu_version.level = libxl__json_object_get_integer( \
> +libxl__json_map_get(#level, o, JSON_INTEGER)); \
> +} while (0)
> +GRAB_VERSION(major);
> +GRAB_VERSION(minor);
> +GRAB_VERSION(micro);

Earlier I wrote:

   I would prefer the indentation to be such that the statement inside
   the macro is indented like the ones outside.

Ie like this:

  +#define GRAB_VERSION(level) do { \
  +ev->qemu_version.level = libxl__json_object_get_integer( \
  +libxl__json_map_get(#level, o, JSON_INTEGER)); \
  +} while (0)
  +GRAB_VERSION(major);

But up to you.  My ack stands either way.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 11/14] libxl: QEMU startup sync based on QMP

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 11/14] libxl: QEMU startup sync based on 
QMP"):
> This is only activated when dm_restrict=1, as explained in a previous
> patch "libxl_dm: Pre-open QMP socket for QEMU"
> 
> Signed-off-by: Anthony PERARD 
> Reviewed-by: Roger Pau Monné 

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 09/14] libxl_dm: Pre-open QMP socket for QEMU

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 09/14] libxl_dm: Pre-open QMP socket for 
QEMU"):
> This patch moves the creation of the QMP unix socket from QEMU to libxl.
> But libxl doesn't rely on this yet.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 08/14] libxl: Add init/dispose of for libxl__domain_build_state

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 08/14] libxl: Add init/dispose of for 
libxl__domain_build_state"):
> These two new functions libxl__domain_build_state_{init,dispose} should
> be called every time a new libxl__domain_build_state comes to existance.
> 
> There seems to be two of them, one with the domain creation machinery,
> and one in the stub_dm_spawn.

Acked-by: Ian Jackson 

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 06/14] libxl_qmp: Implementation of libxl__ev_qmp_*

2018-12-21 Thread Ian Jackson
Ian Jackson writes ("Re: [PATCH v7 06/14] libxl_qmp: Implementation of 
libxl__ev_qmp_*"):
> I have only fairly minor comments now.  The biggest one remaining is
> about the use of EGC_GC which I think probably wants to become
> STATE_AO_GC throughout.  See below...

I realise that I should state explicitly that libxl__ev_qmp_callback
must still get an egc.

The egc should be passed through all the layers, but its gc should
generally not be used.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v7 07/14] libxl_exec: Add libxl__spawn_initiate_failure

2018-12-21 Thread Ian Jackson
Anthony PERARD writes ("[PATCH v7 07/14] libxl_exec: Add 
libxl__spawn_initiate_failure"):
> This function can be used by user of libxl__spawn_* when they setup a
> notification other than xenstore. The parent can already report success
> via libxl__spawn_initiate_detach(), this new function can be used for
> failure instead of waiting for the timeout.

Acked-by: Ian Jackson 


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/dom0: Add a dom0-iommu=none option

2018-12-21 Thread Roger Pau Monné
On Fri, Dec 21, 2018 at 01:01:22PM +, Andrew Cooper wrote:
> On 21/12/2018 12:44, Roger Pau Monné wrote:
> > On Thu, Dec 20, 2018 at 11:40:52PM +, Andrew Cooper wrote:
> >> For development purposes, it is very convenient to boot Xen as a PVH guest,
> >> with an XTF PV or PVH "dom0".  The edit-compile-go cycle is a matter of
> >> seconds, and you can resonably insert printk() debugging in places which 
> >> which
> >> would be completely infeasible when booting fully-fledged guests.
> >>
> >> However, the PVH dom0 path insists on having a working IOMMU, which doesn't
> >> exist when virtualised as a PVH guest, and isn't necessary for XTF anyway.
> >>
> >> Introduce a developer mode to skip the IOMMU requirement.
> > This looks very similar to the current 'passthrough' option, maybe it
> > would be enough to allow PVH dom0 to use the passthrough option
> > provided a warning is added to
> > arch_iommu_check_autotranslated_hwdom?
> 
> I considered that, but "dom0-iommu=passthrough" isn't an accurate
> description of what is going on.  Frankly, its not correct for PV either.

And what about turning dom0-iommu into a boolean option itself, so
that you can do "dom0-iommu=false"?

I think that's more similar to other Xen command line options that
have a boolean value and/or a list of sub-options.

> TBH, dom0-iommu=none is better for both.  How about I introduce that as
> the new option, and leave passthrough as a legacy alias?

That sounds good, I would like to avoid (if possible) the
proliferation of new iommu_hwdom_* variables, because we already have
a bunch and the functionality introduced by the 'none' option looks
very similar to what 'passthrough' aims to achieve.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH for-4.12 6/8] xen/arm: Implement workaround for Cortex-A76 erratum 1165522

2018-12-21 Thread Andrii Anisov



On 28.11.18 18:49, Julien Grall wrote:

Early version of Cortex-A76 can end-up with corrupt TLBs if they
speculate an AT instruction while the S1/S2 system registers are in an
inconsistent state.

This can happen during guest context switch and when invalidating the
TLBs for other than the current VMID.

The workaround implemented in Xen will:
 - Use an empty stage-2 with a reserved VMID while context switching
 between 2 guests
 - Use an empty stage-2 with the VMID where TLBs need to be flushed

Signed-off-by: Julien Grall 


Reviewed-by: Andrii Anisov 

--
Sincerely,
Andrii Anisov.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  1   2   >