Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-07 Thread Jan Beulich
On 07.12.2021 20:11, Julien Grall wrote:
> 
> 
> On 07/12/2021 08:37, Michal Orzel wrote:
>> Hi Julien,
> 
> Hi,
> 
>> On 06.12.2021 16:29, Julien Grall wrote:
>>> Hi,
>>>
>>> On 06/12/2021 14:20, Michal Orzel wrote:
 to hypervisor when switching to AArch32 state.

>> I will change to "from AArch32 state".
 According to section D1.20.2 of Arm Arm(DDI 0487A.j):
 "If the general-purpose register was accessible from AArch32 state the
 upper 32 bits either become zero, or hold the value that the same
 architectural register held before any AArch32 execution.
 The choice between these two options is IMPLEMENTATIONDEFINED"
>>>
>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>
>> Ok.

 Currently Xen does not ensure that the top 32 bits are zeroed and this
 needs to be fixed.
>>>
>>> Can you outline why this is a problem and why we need to protect? IIRC, the 
>>> main concern is Xen may misinterpret what the guest requested but we are 
>>> not concerned about Xen using wrong value.
>>>
>> I would say:
>> "
>> The reason why this is a problem is that there are places in Xen where we 
>> assume that top 32bits are zero for AArch32 guests.
>> If they are not, this can lead to misinterpretation of Xen regarding what 
>> the guest requested.
>> For example hypercalls returning an error encoded in a signed long like 
>> do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>> if the command passed as the first argument was clobbered,
>> "

 Fix this bug by zeroing the upper 32 bits of these registers on an
 entry to hypervisor when switching to AArch32 state.

 Set default value of parameter compat of macro entry to 0 (AArch64 mode
 as we are on 64-bit hypervisor) to avoid checking if parameter is blank
 when not passed.
>>>
>>> Which error do you see otherwise? Is it a compilation error?
>>>
>> Yes, this is a compilation error. The errors appear at each line when 
>> "entry" is called without passing value for "compat".
>> So basically in all the places where entry is called with hyp=1.
>> When taking the current patch and removing default value for compat you will 
>> get:
>> ```
>> entry.S:254: Error: ".endif" without ".if"
>> entry.S:258: Error: symbol `.if' is already defined
>> entry.S:258: Error: ".endif" without ".if"
>> entry.S:262: Error: symbol `.if' is already defined
>> entry.S:262: Error: ".endif" without ".if"
>> entry.S:266: Error: symbol `.if' is already defined
>> entry.S:266: Error: ".endif" without ".if"
>> entry.S:278: Error: symbol `.if' is already defined
>> entry.S:278: Error: ".endif" without ".if"
>> entry.S:292: Error: symbol `.if' is already defined
>> entry.S:292: Error: ".endif" without ".if"
>> entry.S:317: Error: symbol `.if' is already defined
>> entry.S:317: Error: ".endif" without ".if"
>> ```
> 
> Thanks for input. I am concerned with your suggested approach (or using 
> .if 0\compat as suggested by Jan) because they allow the caller to not 
> properly specify compat when hyp=0. The risk here is we may generate the 
> wrong entry.
> 
> compat should need to be specified when hyp=1 as we will always run in 
> aarch64 mode. So could we protect this code with hyp=0?

Since my suggestion was only to avoid the need for specifying a default
for the parameter (which you didn't seem to be happy about), it would
then merely extend to

.if !0\hyp && 0\compat

or something along those lines.

Jan




[PATCH] MAINTAINERS: widen Anthony's area

2021-12-07 Thread Jan Beulich
As was briefly discussed on the December Community Call, I'd like to
propose to widen Anthony's maintainership to all of tools/. This then
means that the special LIBXENLIGHT entry can go away.

Signed-off-by: Jan Beulich 
---
Note that we're still looking for a 2nd maintainer there, considering
that Wei's time is rather limited.

--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -375,9 +375,11 @@
 
 LIBS
 M: Wei Liu 
+M: Anthony PERARD 
 R: Juergen Gross 
 S: Supported
 F: tools/include/libxenvchan.h
+F: tools/include/libxl*.h
 F: tools/include/xencall.h
 F: tools/include/xenctrl*.h
 F: tools/include/xendevicemodel.h
@@ -393,15 +395,6 @@
 F: tools/include/xentoollog.h
 F: tools/libs/
 
-LIBXENLIGHT
-M: Wei Liu 
-M: Anthony PERARD 
-S: Supported
-F: tools/include/libxl*.h
-F: tools/libs/light/
-F: tools/libs/util/
-F: tools/xl/
-
 LIVEPATCH
 M: Konrad Rzeszutek Wilk 
 M: Ross Lagerwall 
@@ -514,6 +507,7 @@
 
 TOOLSTACK
 M: Wei Liu 
+M: Anthony PERARD 
 S: Supported
 F: autogen.sh
 F: config/*.in




[linux-linus test] 167222: regressions - FAIL

2021-12-07 Thread osstest service owner
flight 167222 linux-linus real [real]
flight 167227 linux-linus real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/167222/
http://logs.test-lab.xenproject.org/osstest/logs/167227/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-credit1  23 guest-start.2fail REGR. vs. 167179

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-examine  4 memdisk-try-append   fail REGR. vs. 167179

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167179
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167179
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167179
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167179
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167179
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167179
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167179
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167179
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass

version targeted for testing:
 linuxf80ef9e49fdfbfbc4197711230098b90e6b05a7e
baseline version:
 linux0fcfb00b28c0b7884635dacf38e46d60bf3d4eb1

Last test of basis   167179  2021-12-06 04:55:24 Z2 days
Testing same since   167222  2021-12-06 19:43:33 Z1 days1 attempts


People who touched revisions under 

[qemu-mainline test] 167221: tolerable FAIL - PUSHED

2021-12-07 Thread osstest service owner
flight 167221 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167221/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167121
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167121
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167121
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167121
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167121
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167121
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167121
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167121
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass

version targeted for testing:
 qemuu2f8eb086732ad1875003101f5324f01c47d7408c
baseline version:
 qemuu99fc08366b06282614daeda989d2fde6ab8a707f

Last test of basis   167121  2021-12-04 23:14:54 Z3 days
Testing same since   167221  2021-12-06 19:38:13 Z1 days1 attempts


People who touched revisions under test:
  Alex Bennée 
  Jiaxun Yang 
  Philippe 

Re: Xen on LFS

2021-12-07 Thread Sai Kiran Kumar Reddy
Hi,

I have posted my query in xen-users mailing list one week ago. I was not
able to get any response from the community. Could you please look into it
and help me out here? I am trying to install xen(from source code), on
LInux from Scratch build system. I just want to know the list of xorg
packages, to be installed for Xen.

Thanks in advance for your time and support.

Regards,
Sai Kiran.

On Tue, Nov 30, 2021 at 12:58 PM Sai Kiran Kumar Reddy 
wrote:

> Ok, thanks for the reply. Will do that.
>
> On Tue, Nov 30, 2021 at 12:52 PM Jan Beulich  wrote:
>
>> On 30.11.2021 07:50, Sai Kiran Kumar Reddy wrote:
>> > I am Sai Kiran. I am currently working on installing xen on Linux From
>> > Scratch(LFS) system. One of the dependencies of xen is "xorg" package.
>> This
>> > package is present in Beyond
>> >  Linux
>> From
>> > Scratch(BLFS) <
>> https://www.linuxfromscratch.org/blfs/view/svn/x/xorg7.html>
>> > manual. But, there are a lot of packages to be installed. I am not sure
>> if
>> > all these packages are required for Xen. Also, is xorg a must, to build
>> and
>> > install xen?
>> >
>> > Kindly help me out here. Thanks in advance, for the support.
>>
>> Thanks for your interest, but I'm afraid your question isn't fitting
>> xen-devel.
>> Please raise it on xen-users.
>>
>> Jan
>>
>>


[libvirt test] 167224: regressions - FAIL

2021-12-07 Thread osstest service owner
flight 167224 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167224/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  31e08a365df4158794c45a36eed04621b7e4165e
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  516 days
Failing since151818  2020-07-11 04:18:52 Z  515 days  496 attempts
Testing same since   167224  2021-12-07 04:21:24 Z1 days1 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Andika Triwidada 
  Andrea Bolognani 
  Ani Sinha 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Didik Supriadi 
  dinglimin 
  Dmitrii Shcherbakov 
  Dmytro Linkin 
  Eiichi Tsukata 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  Franck Ridel 
  Gavi Teitz 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Hela Basa 
  Helmut Grohne 
  Hiroki Narukawa 
  Ian Wienand 
  Ioanna Alifieraki 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  jason lee 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jinsheng Zhang 
  Jiri Denemark 
  Joachim Falk 
  John Ferlan 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Koichi Murase 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Lei Yang 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Luke Yue 
  Luyao Zhong 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Markus Schade 
  Martin Kletzander 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Chevsky 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nikolay Shirokovskiy 
  Olaf Hering 
  Olesya Gerasimenko 
  Or Ozeri 
  Orion Poplawski 
  Pany 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peng Liang 
  Peter Krempa 
  Pino Toscano 
  Pino Toscano 
  Piotr Drąg 
  Prathamesh Chavan 
  Praveen K Paladugu 
  Richard W.M. Jones 
  Ricky Tigg 
  Robin Lee 
  Roman Bogorodskiy 
  Roman Bolshakov 
  Ryan Gahagan 
  Ryan Schmidt 
  Sam Hartman 
  Scott Shambarger 
  Sebastian Mitterle 
  SeongHyun Jo 
  Shalini Chellathurai Saroja 
  Shaojun Yang 
  Shi Lei 
  simmon 
  Simon Chopin 
  Simon Gaiser 
  Simon Rowe 
  Stefan Bader 
  Stefan Berger 
  Stefan Berger 
  Stefan Hajnoczi 
  Stefan Hajnoczi 
  Szymon Scholz 
  Thomas Huth 
  

Re: Aarch64 stand-alone application for Xen

2021-12-07 Thread Mathieu Poirier
Hi Bertrand,

On Fri, 26 Nov 2021 at 03:32, Bertrand Marquis  wrote:
>
> Hi Mathieu,
>
> > On 25 Nov 2021, at 22:59, Mathieu Poirier  
> > wrote:
> >
> > Good day,
> >
> > I am in the process of adding support for aarch64 to the xen-sys
> > crate[1].  The crate currently supports x86_64 and includes a
> > stand-alone "oxerun" application that can be used to validate
> > hypercalls.  My goal is to provide the same functionality on arm64.  I
> > am looking for a stand-alone aarch64 example, something like an "hello
> > world" to help me with the assembler startup code.
>
> We are working on porting XTF to arm64 and already have something running.
> I think it could be a good starting point for you:
> https://github.com/orzelmichal/xtf/tree/arm-devel

Quick one - have you been able to get the "test-arm-64le-example"
application to run?  So far Xen gives me the following error:

(XEN) 
(XEN) Panic on CPU 0:
(XEN) Unable to copy the kernel in the hwdom memory
(XEN) 

I wanted to check with you before starting to dig into it.

Thanks,
Mathieu

>
> Regards
> Bertrand
>
> >
> > Many thanks for the consideration,
> > Mathieu
> >
> > [1]. https://crates.io/crates/xen-sys
> >
>



Re: [PATCH V3 4/6] xen/unpopulated-alloc: Add mechanism to use Xen resource

2021-12-07 Thread Stefano Stabellini
On Thu, 25 Nov 2021, Oleksandr wrote:
> > > Please note the following:
> > > for V3 arch_xen_unpopulated_init() was moved to init() as was agreed
> > > and gained __init specifier. So the target_resource is initialized there.
> > > 
> > > With current patch series applied if CONFIG_XEN_UNPOPULATED_ALLOC
> > > is enabled:
> > > 
> > > 1. On Arm, under normal circumstances, the xen_alloc_unpopulated_pages()
> > > won't be called “before” arch_xen_unpopulated_init(). It will only be
> > > called "before" when either ACPI is in use or something wrong happened
> > > with DT (and we failed to read xen_grant_frames), so we fallback to
> > > xen_xlate_map_ballooned_pages() in arm/xen/enlighten.c:xen_guest_init(),
> > > please see "arm/xen: Switch to use gnttab_setup_auto_xlat_frames() for DT"
> > > for details. But in that case, I think, it doesn't matter much whether
> > > xen_alloc_unpopulated_pages() is called "before" of "after"
> > > target_resource
> > > initialization, as we don't have extended regions in place the
> > > target_resource
> > > will remain invalid even after initialization, so
> > > xen_alloc_ballooned_pages()
> > > will be used in both scenarios.
> > > 
> > > 2. On x86, I am not quite sure which modes use unpopulated-alloc (PVH?),
> > > but it looks like xen_alloc_unpopulated_pages() can (and will) be called
> > > “before” arch_xen_unpopulated_init().
> > > At least, I see that xen_xlate_map_ballooned_pages() is called in
> > > x86/xen/grant-table.c:xen_pvh_gnttab_setup(). According to the initcall
> > > levels for both xen_pvh_gnttab_setup() and init() I expect the former
> > > to be called earlier.
> > > If it is true, the sentence in the commit description which mentions
> > > that “behaviour on x86 is not changed” is not precise. I don’t think
> > > it would be correct to fallback to xen_alloc_ballooned_pages() just
> > > because we haven’t initialized target_resource yet (on x86 it is just
> > > assigning it iomem_resource), at least this doesn't look like an expected
> > > behaviour and unlikely would be welcome.
> > > 
> > > I am wondering whether it would be better to move
> > > arch_xen_unpopulated_init()
> > > to a dedicated init() marked with an appropriate initcall level
> > > (early_initcall?)
> > > to make sure it will always be called *before*
> > > xen_xlate_map_ballooned_pages().
> > > What do you think?
> 
>    ... here (#2). Or I really missed something and there wouldn't be an issue?

Yes, I see your point. Yeah, it makes sense to make sure that
drivers/xen/unpopulated-alloc.c:init is executed before
xen_pvh_gnttab_setup.

If we move it to early_initcall, then we end up running it before
xen_guest_init on ARM. But that might be fine: it looks like it should
work OK and would also allow us to execute xen_xlate_map_ballooned_pages
with target_resource already set.

So I'd say go for it :)

Re: [patch V2 29/36] PCI/MSI: Simplify pci_irq_get_affinity()

2021-12-07 Thread Thomas Gleixner
Cedric,

On Tue, Dec 07 2021 at 18:42, Cédric Le Goater wrote:
>
> This is breaking nvme on pseries but it's probably one of the previous
> patches. I haven't figured out what's wrong yet. Here is the oops FYI.

Hrm.

> [   32.494562] WARNING: CPU: 26 PID: 658 at kernel/irq/chip.c:210 
> irq_startup+0x1c0/0x1e0

This complains about a manual enable_irq() on a managed interrupt.

> [   32.494575] Modules linked in: ibmvscsi ibmveth scsi_transport_srp bnx2x 
> ipr libata xhci_pci xhci_hcd nvme xts vmx_crypto nvme_core mdio t10_pi 
> libcrc32c dm_mirror dm_region_hash dm_log dm_mod
> [   32.494601] CPU: 26 PID: 658 Comm: kworker/26:1H Not tainted 
> 5.16.0-rc4-clg+ #54
> [   32.494607] Workqueue: kblockd blk_mq_timeout_work
> [   32.494681] NIP [c0206f00] irq_startup+0x1c0/0x1e0
> [   32.494686] LR [c0206df0] irq_startup+0xb0/0x1e0
> [   32.494690] Call Trace:
> [   32.494692] [c018050f38b0] [c0206df0] irq_startup+0xb0/0x1e0 
> (unreliable)
> [   32.494699] [c018050f38f0] [c020155c] __enable_irq+0x9c/0xb0
> [   32.494705] [c018050f3950] [c02015d0] enable_irq+0x60/0xc0
> [   32.494710] [c018050f39d0] [c00814a54ae8] 
> nvme_poll_irqdisable+0x80/0xc0 [nvme]
> [   32.494719] [c018050f3a00] [c00814a55824] nvme_timeout+0x18c/0x420 
> [nvme]
> [   32.494726] [c018050f3ae0] [c076e1b8] 
> blk_mq_check_expired+0xa8/0x130
> [   32.494732] [c018050f3b10] [c07793e8] bt_iter+0xd8/0x120
> [   32.494737] [c018050f3b60] [c077a34c] 
> blk_mq_queue_tag_busy_iter+0x25c/0x3f0
> [   32.494742] [c018050f3c20] [c076ffa4] 
> blk_mq_timeout_work+0x84/0x1a0
> [   32.494747] [c018050f3c70] [c0182a78] 
> process_one_work+0x2a8/0x5a0

Confused. I diffed against v1, but could not spot anything except that
properties issue which you found too.

Thanks,

tglx




Re: [patch V2 19/31] PCI: hv: Rework MSI handling

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:51:33PM +0100, Thomas Gleixner wrote:
> Replace the about to vanish iterators and make use of the filtering. Take
> the descriptor lock around the iterators.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/controller/pci-hyperv.c |   15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -3445,18 +3445,23 @@ static int hv_pci_suspend(struct hv_devi
>  
>  static int hv_pci_restore_msi_msg(struct pci_dev *pdev, void *arg)
>  {
> - struct msi_desc *entry;
>   struct irq_data *irq_data;
> + struct msi_desc *entry;
> + int ret = 0;
>  
> - for_each_pci_msi_entry(entry, pdev) {
> + msi_lock_descs(>dev);
> + msi_for_each_desc(entry, >dev, MSI_DESC_ASSOCIATED) {
>   irq_data = irq_get_irq_data(entry->irq);
> - if (WARN_ON_ONCE(!irq_data))
> - return -EINVAL;
> + if (WARN_ON_ONCE(!irq_data)) {
> + ret = -EINVAL;
> + break;
> + }
>  
>   hv_compose_msi_msg(irq_data, >msg);
>   }
> + msi_unlock_descs(>dev);
>  
> - return 0;
> + return ret;
>  }
>  
>  /*
> 



Re: [patch V2 10/31] PCI/MSI: Use msi_on_each_desc()

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:51:18PM +0100, Thomas Gleixner wrote:
> Use the new iterator functions which pave the way for dynamically extending
> MSI-X vectors.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/irqdomain.c |4 ++--
>  drivers/pci/msi/legacy.c|   19 ---
>  drivers/pci/msi/msi.c   |   30 ++
>  3 files changed, 24 insertions(+), 29 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -83,7 +83,7 @@ static int pci_msi_domain_check_cap(stru
>   struct msi_domain_info *info,
>   struct device *dev)
>  {
> - struct msi_desc *desc = first_pci_msi_entry(to_pci_dev(dev));
> + struct msi_desc *desc = msi_first_desc(dev, MSI_DESC_ALL);
>  
>   /* Special handling to support __pci_enable_msi_range() */
>   if (pci_msi_desc_is_multi_msi(desc) &&
> @@ -98,7 +98,7 @@ static int pci_msi_domain_check_cap(stru
>   unsigned int idx = 0;
>  
>   /* Check for gaps in the entry indices */
> - for_each_msi_entry(desc, dev) {
> + msi_for_each_desc(desc, dev, MSI_DESC_ALL) {
>   if (desc->msi_index != idx++)
>   return -ENOTSUPP;
>   }
> --- a/drivers/pci/msi/legacy.c
> +++ b/drivers/pci/msi/legacy.c
> @@ -28,7 +28,7 @@ int __weak arch_setup_msi_irqs(struct pc
>   if (type == PCI_CAP_ID_MSI && nvec > 1)
>   return 1;
>  
> - for_each_pci_msi_entry(desc, dev) {
> + msi_for_each_desc(desc, >dev, MSI_DESC_NOTASSOCIATED) {
>   ret = arch_setup_msi_irq(dev, desc);
>   if (ret)
>   return ret < 0 ? ret : -ENOSPC;
> @@ -42,27 +42,24 @@ void __weak arch_teardown_msi_irqs(struc
>   struct msi_desc *desc;
>   int i;
>  
> - for_each_pci_msi_entry(desc, dev) {
> - if (desc->irq) {
> - for (i = 0; i < desc->nvec_used; i++)
> - arch_teardown_msi_irq(desc->irq + i);
> - }
> + msi_for_each_desc(desc, >dev, MSI_DESC_ASSOCIATED) {
> + for (i = 0; i < desc->nvec_used; i++)
> + arch_teardown_msi_irq(desc->irq + i);
>   }
>  }
>  
>  static int pci_msi_setup_check_result(struct pci_dev *dev, int type, int ret)
>  {
> - struct msi_desc *entry;
> + struct msi_desc *desc;
>   int avail = 0;
>  
>   if (type != PCI_CAP_ID_MSIX || ret >= 0)
>   return ret;
>  
>   /* Scan the MSI descriptors for successfully allocated ones. */
> - for_each_pci_msi_entry(entry, dev) {
> - if (entry->irq != 0)
> - avail++;
> - }
> + msi_for_each_desc(desc, >dev, MSI_DESC_ASSOCIATED)
> + avail++;
> +
>   return avail ? avail : ret;
>  }
>  
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -299,7 +299,6 @@ static void __pci_restore_msix_state(str
>  
>   if (!dev->msix_enabled)
>   return;
> - BUG_ON(list_empty(dev_to_msi_list(>dev)));
>  
>   /* route the table */
>   pci_intx_for_msi(dev, 0);
> @@ -309,7 +308,7 @@ static void __pci_restore_msix_state(str
>   write_msg = arch_restore_msi_irqs(dev);
>  
>   msi_lock_descs(>dev);
> - for_each_pci_msi_entry(entry, dev) {
> + msi_for_each_desc(entry, >dev, MSI_DESC_ALL) {
>   if (write_msg)
>   __pci_write_msi_msg(entry, >msg);
>   pci_msix_write_vector_ctrl(entry, entry->pci.msix_ctrl);
> @@ -378,14 +377,14 @@ static int msi_verify_entries(struct pci
>   if (!dev->no_64bit_msi)
>   return 0;
>  
> - for_each_pci_msi_entry(entry, dev) {
> + msi_for_each_desc(entry, >dev, MSI_DESC_ALL) {
>   if (entry->msg.address_hi) {
>   pci_err(dev, "arch assigned 64-bit MSI address %#x%08x 
> but device only supports 32 bits\n",
>   entry->msg.address_hi, entry->msg.address_lo);
> - return -EIO;
> + break;
>   }
>   }
> - return 0;
> + return !entry ? 0 : -EIO;
>  }
>  
>  /**
> @@ -418,7 +417,7 @@ static int msi_capability_init(struct pc
>   goto unlock;
>  
>   /* All MSIs are unmasked by default; mask them all */
> - entry = first_pci_msi_entry(dev);
> + entry = msi_first_desc(>dev, MSI_DESC_ALL);
>   pci_msi_mask(entry, msi_multi_mask(entry));
>  
>   /* Configure MSI capability structure */
> @@ -508,11 +507,11 @@ static int msix_setup_msi_descs(struct p
>  
>  static void msix_update_entries(struct pci_dev *dev, struct msix_entry 
> *entries)
>  {
> - struct msi_desc *entry;
> + struct msi_desc *desc;
>  
>   if (entries) {
> - for_each_pci_msi_entry(entry, dev) {
> -  

Re: [patch V2 09/31] PCI/MSI: Let core code free MSI descriptors

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:51:16PM +0100, Thomas Gleixner wrote:
> Set the domain info flag which tells the core code to free the MSI
> descriptors from msi_domain_free_irqs() and add an explicit call to the
> core function into the legacy code.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/irqdomain.c |3 ++-
>  drivers/pci/msi/legacy.c|1 +
>  drivers/pci/msi/msi.c   |   14 --
>  3 files changed, 3 insertions(+), 15 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -171,7 +171,8 @@ struct irq_domain *pci_msi_create_irq_do
>   if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
>   pci_msi_domain_update_chip_ops(info);
>  
> - info->flags |= MSI_FLAG_ACTIVATE_EARLY | MSI_FLAG_DEV_SYSFS;
> + info->flags |= MSI_FLAG_ACTIVATE_EARLY | MSI_FLAG_DEV_SYSFS |
> +MSI_FLAG_FREE_MSI_DESCS;
>   if (IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE))
>   info->flags |= MSI_FLAG_MUST_REACTIVATE;
>  
> --- a/drivers/pci/msi/legacy.c
> +++ b/drivers/pci/msi/legacy.c
> @@ -80,4 +80,5 @@ void pci_msi_legacy_teardown_msi_irqs(st
>  {
>   msi_device_destroy_sysfs(>dev);
>   arch_teardown_msi_irqs(dev);
> + msi_free_msi_descs(>dev);
>  }
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -224,22 +224,8 @@ EXPORT_SYMBOL_GPL(pci_write_msi_msg);
>  
>  static void free_msi_irqs(struct pci_dev *dev)
>  {
> - struct list_head *msi_list = dev_to_msi_list(>dev);
> - struct msi_desc *entry, *tmp;
> - int i;
> -
> - for_each_pci_msi_entry(entry, dev)
> - if (entry->irq)
> - for (i = 0; i < entry->nvec_used; i++)
> - BUG_ON(irq_has_action(entry->irq + i));
> -
>   pci_msi_teardown_msi_irqs(dev);
>  
> - list_for_each_entry_safe(entry, tmp, msi_list, list) {
> - list_del(>list);
> - free_msi_entry(entry);
> - }
> -
>   if (dev->msix_base) {
>   iounmap(dev->msix_base);
>   dev->msix_base = NULL;
> 



Re: [patch V2 08/31] PCI/MSI: Use msi_add_msi_desc()

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:51:15PM +0100, Thomas Gleixner wrote:
> Simplify the allocation of MSI descriptors by using msi_add_msi_desc()
> which moves the storage handling to core code and prepares for dynamic
> extension of the MSI-X vector space.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/msi.c |  122 
> --
>  1 file changed, 59 insertions(+), 63 deletions(-)
> 
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -340,45 +340,51 @@ void pci_restore_msi_state(struct pci_de
>  }
>  EXPORT_SYMBOL_GPL(pci_restore_msi_state);
>  
> -static struct msi_desc *
> -msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity_desc 
> *masks)
> +static int msi_setup_msi_desc(struct pci_dev *dev, int nvec,
> +   struct irq_affinity_desc *masks)
>  {
> - struct msi_desc *entry;
> + struct msi_desc desc;
>   unsigned long prop;
>   u16 control;
> + int ret;
>  
>   /* MSI Entry Initialization */
> - entry = alloc_msi_entry(>dev, nvec, masks);
> - if (!entry)
> - return NULL;
> + memset(, 0, sizeof(desc));
>  
>   pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, );
>   /* Lies, damned lies, and MSIs */
>   if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
>   control |= PCI_MSI_FLAGS_MASKBIT;
> + /* Respect XEN's mask disabling */
> + if (pci_msi_ignore_mask)
> + control &= ~PCI_MSI_FLAGS_MASKBIT;
>  
> - entry->pci.msi_attrib.is_64 = !!(control & PCI_MSI_FLAGS_64BIT);
> - entry->pci.msi_attrib.can_mask  = !pci_msi_ignore_mask &&
> -   !!(control & PCI_MSI_FLAGS_MASKBIT);
> - entry->pci.msi_attrib.default_irq = dev->irq;
> - entry->pci.msi_attrib.multi_cap = (control & PCI_MSI_FLAGS_QMASK) >> 1;
> - entry->pci.msi_attrib.multiple  = ilog2(__roundup_pow_of_two(nvec));
> + desc.nvec_used  = nvec;
> + desc.pci.msi_attrib.is_64   = !!(control & PCI_MSI_FLAGS_64BIT);
> + desc.pci.msi_attrib.can_mask= !!(control & PCI_MSI_FLAGS_MASKBIT);
> + desc.pci.msi_attrib.default_irq = dev->irq;
> + desc.pci.msi_attrib.multi_cap   = (control & PCI_MSI_FLAGS_QMASK) >> 1;
> + desc.pci.msi_attrib.multiple= ilog2(__roundup_pow_of_two(nvec));
> + desc.affinity   = masks;
>  
>   if (control & PCI_MSI_FLAGS_64BIT)
> - entry->pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_64;
> + desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_64;
>   else
> - entry->pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_32;
> + desc.pci.mask_pos = dev->msi_cap + PCI_MSI_MASK_32;
>  
>   /* Save the initial mask status */
> - if (entry->pci.msi_attrib.can_mask)
> - pci_read_config_dword(dev, entry->pci.mask_pos, 
> >pci.msi_mask);
> + if (desc.pci.msi_attrib.can_mask)
> + pci_read_config_dword(dev, desc.pci.mask_pos, 
> _mask);
>  
> - prop = MSI_PROP_PCI_MSI;
> - if (entry->pci.msi_attrib.is_64)
> - prop |= MSI_PROP_64BIT;
> - msi_device_set_properties(>dev, prop);
> + ret = msi_add_msi_desc(>dev, );
> + if (!ret) {
> + prop = MSI_PROP_PCI_MSI;
> + if (desc.pci.msi_attrib.is_64)
> + prop |= MSI_PROP_64BIT;
> + msi_device_set_properties(>dev, prop);
> + }
>  
> - return entry;
> + return ret;
>  }
>  
>  static int msi_verify_entries(struct pci_dev *dev)
> @@ -423,17 +429,14 @@ static int msi_capability_init(struct pc
>   masks = irq_create_affinity_masks(nvec, affd);
>  
>   msi_lock_descs(>dev);
> - entry = msi_setup_entry(dev, nvec, masks);
> - if (!entry) {
> - ret = -ENOMEM;
> + ret = msi_setup_msi_desc(dev, nvec, masks);
> + if (ret)
>   goto unlock;
> - }
>  
>   /* All MSIs are unmasked by default; mask them all */
> + entry = first_pci_msi_entry(dev);
>   pci_msi_mask(entry, msi_multi_mask(entry));
>  
> - list_add_tail(>list, dev_to_msi_list(>dev));
> -
>   /* Configure MSI capability structure */
>   ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI);
>   if (ret)
> @@ -482,49 +485,40 @@ static void __iomem *msix_map_region(str
>   return ioremap(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
>  }
>  
> -static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
> -   struct msix_entry *entries, int nvec,
> -   struct irq_affinity_desc *masks)
> +static int msix_setup_msi_descs(struct pci_dev *dev, void __iomem *base,
> + struct msix_entry *entries, int nvec,
> + struct irq_affinity_desc *masks)
>  {
> - int i, vec_count = pci_msix_vec_count(dev);
> + int ret = 0, i, vec_count = 

Re: [patch V2 15/23] PCI/MSI: Move code into a separate directory

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:47PM +0100, Thomas Gleixner wrote:
> msi.c is getting larger and really could do with a splitup. Move it into
> it's own directory to prepare for that.

s/it's/its/

> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  Documentation/driver-api/pci/pci.rst |2 
>  drivers/pci/Makefile |3 
>  drivers/pci/msi.c| 1532 
> ---
>  drivers/pci/msi/Makefile |4 
>  drivers/pci/msi/msi.c| 1532 
> +++
>  5 files changed, 1539 insertions(+), 1534 deletions(-)
> 
> --- a/Documentation/driver-api/pci/pci.rst
> +++ b/Documentation/driver-api/pci/pci.rst
> @@ -13,7 +13,7 @@ PCI Support Library
>  .. kernel-doc:: drivers/pci/search.c
> :export:
>  
> -.. kernel-doc:: drivers/pci/msi.c
> +.. kernel-doc:: drivers/pci/msi/msi.c
> :export:
>  
>  .. kernel-doc:: drivers/pci/bus.c
> --- a/drivers/pci/Makefile
> +++ b/drivers/pci/Makefile
> @@ -5,8 +5,9 @@
>  obj-$(CONFIG_PCI)+= access.o bus.o probe.o host-bridge.o \
>  remove.o pci.o pci-driver.o search.o \
>  pci-sysfs.o rom.o setup-res.o irq.o vpd.o \
> -setup-bus.o vc.o mmap.o setup-irq.o msi.o
> +setup-bus.o vc.o mmap.o setup-irq.o
>  
> +obj-$(CONFIG_PCI)+= msi/
>  obj-$(CONFIG_PCI)+= pcie/
>  
>  ifdef CONFIG_PCI
> --- a/drivers/pci/msi.c
> +++ /dev/null
> @@ -1,1532 +0,0 @@
> -// SPDX-License-Identifier: GPL-2.0
> -/*
> - * PCI Message Signaled Interrupt (MSI)
> - *
> - * Copyright (C) 2003-2004 Intel
> - * Copyright (C) Tom Long Nguyen (tom.l.ngu...@intel.com)
> - * Copyright (C) 2016 Christoph Hellwig.
> - */
> -
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -
> -#include "pci.h"
> -
> -#ifdef CONFIG_PCI_MSI
> -
> -static int pci_msi_enable = 1;
> -int pci_msi_ignore_mask;
> -
> -#define msix_table_size(flags)   ((flags & PCI_MSIX_FLAGS_QSIZE) + 1)
> -
> -#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> -static int pci_msi_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> -{
> - struct irq_domain *domain;
> -
> - domain = dev_get_msi_domain(>dev);
> - if (domain && irq_domain_is_hierarchy(domain))
> - return msi_domain_alloc_irqs(domain, >dev, nvec);
> -
> - return arch_setup_msi_irqs(dev, nvec, type);
> -}
> -
> -static void pci_msi_teardown_msi_irqs(struct pci_dev *dev)
> -{
> - struct irq_domain *domain;
> -
> - domain = dev_get_msi_domain(>dev);
> - if (domain && irq_domain_is_hierarchy(domain))
> - msi_domain_free_irqs(domain, >dev);
> - else
> - arch_teardown_msi_irqs(dev);
> -}
> -#else
> -#define pci_msi_setup_msi_irqs   arch_setup_msi_irqs
> -#define pci_msi_teardown_msi_irqsarch_teardown_msi_irqs
> -#endif
> -
> -#ifdef CONFIG_PCI_MSI_ARCH_FALLBACKS
> -/* Arch hooks */
> -int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
> -{
> - return -EINVAL;
> -}
> -
> -void __weak arch_teardown_msi_irq(unsigned int irq)
> -{
> -}
> -
> -int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> -{
> - struct msi_desc *entry;
> - int ret;
> -
> - /*
> -  * If an architecture wants to support multiple MSI, it needs to
> -  * override arch_setup_msi_irqs()
> -  */
> - if (type == PCI_CAP_ID_MSI && nvec > 1)
> - return 1;
> -
> - for_each_pci_msi_entry(entry, dev) {
> - ret = arch_setup_msi_irq(dev, entry);
> - if (ret < 0)
> - return ret;
> - if (ret > 0)
> - return -ENOSPC;
> - }
> -
> - return 0;
> -}
> -
> -void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
> -{
> - int i;
> - struct msi_desc *entry;
> -
> - for_each_pci_msi_entry(entry, dev)
> - if (entry->irq)
> - for (i = 0; i < entry->nvec_used; i++)
> - arch_teardown_msi_irq(entry->irq + i);
> -}
> -#endif /* CONFIG_PCI_MSI_ARCH_FALLBACKS */
> -
> -/*
> - * PCI 2.3 does not specify mask bits for each MSI interrupt.  Attempting to
> - * mask all MSI interrupts by clearing the MSI enable bit does not work
> - * reliably as devices without an INTx disable bit will then generate a
> - * level IRQ which will never be cleared.
> - */
> -static inline __attribute_const__ u32 msi_multi_mask(struct msi_desc *desc)
> -{
> - /* Don't shift by >= width of type */
> - if (desc->pci.msi_attrib.multi_cap >= 5)
> - return 0x;
> - return (1 << (1 << desc->pci.msi_attrib.multi_cap)) - 1;
> -}
> -
> -static noinline void pci_msi_update_mask(struct msi_desc *desc, u32 clear, 
> u32 set)
> -{
> - 

Re: [patch V2 07/31] PCI/MSI: Protect MSI operations

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:51:13PM +0100, Thomas Gleixner wrote:
> To prepare for dynamic extension of MSI-X vectors, protect the MSI
> operations for MSI and MSI-X. This requires to move the invocation of
> irq_create_affinity_masks() out of the descriptor lock section to avoid
> reverse lock ordering vs. CPU hotplug lock as some callers of the PCI/MSI
> allocation interfaces already hold it.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/irqdomain.c |4 -
>  drivers/pci/msi/msi.c   |  120 
> ++--
>  2 files changed, 73 insertions(+), 51 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -14,7 +14,7 @@ int pci_msi_setup_msi_irqs(struct pci_de
>  
>   domain = dev_get_msi_domain(>dev);
>   if (domain && irq_domain_is_hierarchy(domain))
> - return msi_domain_alloc_irqs(domain, >dev, nvec);
> + return msi_domain_alloc_irqs_descs_locked(domain, >dev, 
> nvec);
>  
>   return pci_msi_legacy_setup_msi_irqs(dev, nvec, type);
>  }
> @@ -25,7 +25,7 @@ void pci_msi_teardown_msi_irqs(struct pc
>  
>   domain = dev_get_msi_domain(>dev);
>   if (domain && irq_domain_is_hierarchy(domain))
> - msi_domain_free_irqs(domain, >dev);
> + msi_domain_free_irqs_descs_locked(domain, >dev);
>   else
>   pci_msi_legacy_teardown_msi_irqs(dev);
>  }
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -322,11 +322,13 @@ static void __pci_restore_msix_state(str
>  
>   write_msg = arch_restore_msi_irqs(dev);
>  
> + msi_lock_descs(>dev);
>   for_each_pci_msi_entry(entry, dev) {
>   if (write_msg)
>   __pci_write_msi_msg(entry, >msg);
>   pci_msix_write_vector_ctrl(entry, entry->pci.msix_ctrl);
>   }
> + msi_unlock_descs(>dev);
>  
>   pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
>  }
> @@ -339,20 +341,16 @@ void pci_restore_msi_state(struct pci_de
>  EXPORT_SYMBOL_GPL(pci_restore_msi_state);
>  
>  static struct msi_desc *
> -msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
> +msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity_desc 
> *masks)
>  {
> - struct irq_affinity_desc *masks = NULL;
>   struct msi_desc *entry;
>   unsigned long prop;
>   u16 control;
>  
> - if (affd)
> - masks = irq_create_affinity_masks(nvec, affd);
> -
>   /* MSI Entry Initialization */
>   entry = alloc_msi_entry(>dev, nvec, masks);
>   if (!entry)
> - goto out;
> + return NULL;
>  
>   pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, );
>   /* Lies, damned lies, and MSIs */
> @@ -379,8 +377,7 @@ msi_setup_entry(struct pci_dev *dev, int
>   if (entry->pci.msi_attrib.is_64)
>   prop |= MSI_PROP_64BIT;
>   msi_device_set_properties(>dev, prop);
> -out:
> - kfree(masks);
> +
>   return entry;
>  }
>  
> @@ -416,14 +413,21 @@ static int msi_verify_entries(struct pci
>  static int msi_capability_init(struct pci_dev *dev, int nvec,
>  struct irq_affinity *affd)
>  {
> + struct irq_affinity_desc *masks = NULL;
>   struct msi_desc *entry;
>   int ret;
>  
>   pci_msi_set_enable(dev, 0); /* Disable MSI during set up */
>  
> - entry = msi_setup_entry(dev, nvec, affd);
> - if (!entry)
> - return -ENOMEM;
> + if (affd)
> + masks = irq_create_affinity_masks(nvec, affd);
> +
> + msi_lock_descs(>dev);
> + entry = msi_setup_entry(dev, nvec, masks);
> + if (!entry) {
> + ret = -ENOMEM;
> + goto unlock;
> + }
>  
>   /* All MSIs are unmasked by default; mask them all */
>   pci_msi_mask(entry, msi_multi_mask(entry));
> @@ -446,11 +450,14 @@ static int msi_capability_init(struct pc
>  
>   pcibios_free_irq(dev);
>   dev->irq = entry->irq;
> - return 0;
> + goto unlock;
>  
>  err:
>   pci_msi_unmask(entry, msi_multi_mask(entry));
>   free_msi_irqs(dev);
> +unlock:
> + msi_unlock_descs(>dev);
> + kfree(masks);
>   return ret;
>  }
>  
> @@ -477,23 +484,18 @@ static void __iomem *msix_map_region(str
>  
>  static int msix_setup_entries(struct pci_dev *dev, void __iomem *base,
> struct msix_entry *entries, int nvec,
> -   struct irq_affinity *affd)
> +   struct irq_affinity_desc *masks)
>  {
> - struct irq_affinity_desc *curmsk, *masks = NULL;
> + int i, vec_count = pci_msix_vec_count(dev);
> + struct irq_affinity_desc *curmsk;
>   struct msi_desc *entry;
>   void __iomem *addr;
> - int ret, i;
> - int vec_count = pci_msix_vec_count(dev);
> -
> - if (affd)
> - masks = irq_create_affinity_masks(nvec, affd);
>  
>   for (i = 0, curmsk = 

Re: [patch V2 16/23] PCI/MSI: Split out CONFIG_PCI_MSI independent part

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:49PM +0100, Thomas Gleixner wrote:
> These functions are required even when CONFIG_PCI_MSI is not set. Move them
> to their own file.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/Makefile |3 ++-
>  drivers/pci/msi/msi.c|   39 ---
>  drivers/pci/msi/pcidev_msi.c |   43 
> +++
>  3 files changed, 45 insertions(+), 40 deletions(-)
> 
> --- a/drivers/pci/msi/Makefile
> +++ b/drivers/pci/msi/Makefile
> @@ -1,4 +1,5 @@
>  # SPDX-License-Identifier: GPL-2.0
>  #
>  # Makefile for the PCI/MSI
> -obj-$(CONFIG_PCI)+= msi.o
> +obj-$(CONFIG_PCI)+= pcidev_msi.o
> +obj-$(CONFIG_PCI_MSI)+= msi.o
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -18,8 +18,6 @@
>  
>  #include "../pci.h"
>  
> -#ifdef CONFIG_PCI_MSI
> -
>  static int pci_msi_enable = 1;
>  int pci_msi_ignore_mask;
>  
> @@ -1493,40 +1491,3 @@ bool pci_dev_has_special_msi_domain(stru
>  }
>  
>  #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
> -#endif /* CONFIG_PCI_MSI */
> -
> -void pci_msi_init(struct pci_dev *dev)
> -{
> - u16 ctrl;
> -
> - /*
> -  * Disable the MSI hardware to avoid screaming interrupts
> -  * during boot.  This is the power on reset default so
> -  * usually this should be a noop.
> -  */
> - dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
> - if (!dev->msi_cap)
> - return;
> -
> - pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, );
> - if (ctrl & PCI_MSI_FLAGS_ENABLE)
> - pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS,
> -   ctrl & ~PCI_MSI_FLAGS_ENABLE);
> -
> - if (!(ctrl & PCI_MSI_FLAGS_64BIT))
> - dev->no_64bit_msi = 1;
> -}
> -
> -void pci_msix_init(struct pci_dev *dev)
> -{
> - u16 ctrl;
> -
> - dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> - if (!dev->msix_cap)
> - return;
> -
> - pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, );
> - if (ctrl & PCI_MSIX_FLAGS_ENABLE)
> - pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS,
> -   ctrl & ~PCI_MSIX_FLAGS_ENABLE);
> -}
> --- /dev/null
> +++ b/drivers/pci/msi/pcidev_msi.c
> @@ -0,0 +1,43 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * MSI[X} related functions which are available unconditionally.
> + */
> +#include "../pci.h"
> +
> +/*
> + * Disable the MSI[X] hardware to avoid screaming interrupts during boot.
> + * This is the power on reset default so usually this should be a noop.
> + */
> +
> +void pci_msi_init(struct pci_dev *dev)
> +{
> + u16 ctrl;
> +
> + dev->msi_cap = pci_find_capability(dev, PCI_CAP_ID_MSI);
> + if (!dev->msi_cap)
> + return;
> +
> + pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, );
> + if (ctrl & PCI_MSI_FLAGS_ENABLE) {
> + pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS,
> +   ctrl & ~PCI_MSI_FLAGS_ENABLE);
> + }
> +
> + if (!(ctrl & PCI_MSI_FLAGS_64BIT))
> + dev->no_64bit_msi = 1;
> +}
> +
> +void pci_msix_init(struct pci_dev *dev)
> +{
> + u16 ctrl;
> +
> + dev->msix_cap = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> + if (!dev->msix_cap)
> + return;
> +
> + pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, );
> + if (ctrl & PCI_MSIX_FLAGS_ENABLE) {
> + pci_write_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS,
> +   ctrl & ~PCI_MSIX_FLAGS_ENABLE);
> + }
> +}
> 



Re: [patch V2 22/23] genirq/msi: Handle PCI/MSI allocation fail in core code

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:59PM +0100, Thomas Gleixner wrote:
> Get rid of yet another irqdomain callback and let the core code return the
> already available information of how many descriptors could be allocated.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas# PCI

> ---
>  drivers/pci/msi/irqdomain.c |   13 -
>  include/linux/msi.h |5 +
>  kernel/irq/msi.c|   29 +
>  3 files changed, 26 insertions(+), 21 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -95,16 +95,6 @@ static int pci_msi_domain_check_cap(stru
>   return 0;
>  }
>  
> -static int pci_msi_domain_handle_error(struct irq_domain *domain,
> -struct msi_desc *desc, int error)
> -{
> - /* Special handling to support __pci_enable_msi_range() */
> - if (pci_msi_desc_is_multi_msi(desc) && error == -ENOSPC)
> - return 1;
> -
> - return error;
> -}
> -
>  static void pci_msi_domain_set_desc(msi_alloc_info_t *arg,
>   struct msi_desc *desc)
>  {
> @@ -115,7 +105,6 @@ static void pci_msi_domain_set_desc(msi_
>  static struct msi_domain_ops pci_msi_domain_ops_default = {
>   .set_desc   = pci_msi_domain_set_desc,
>   .msi_check  = pci_msi_domain_check_cap,
> - .handle_error   = pci_msi_domain_handle_error,
>  };
>  
>  static void pci_msi_domain_update_dom_ops(struct msi_domain_info *info)
> @@ -129,8 +118,6 @@ static void pci_msi_domain_update_dom_op
>   ops->set_desc = pci_msi_domain_set_desc;
>   if (ops->msi_check == NULL)
>   ops->msi_check = pci_msi_domain_check_cap;
> - if (ops->handle_error == NULL)
> - ops->handle_error = pci_msi_domain_handle_error;
>   }
>  }
>  
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -286,7 +286,6 @@ struct msi_domain_info;
>   * @msi_check:   Callback for verification of the 
> domain/info/dev data
>   * @msi_prepare: Prepare the allocation of the interrupts in the domain
>   * @set_desc:Set the msi descriptor for an interrupt
> - * @handle_error:Optional error handler if the allocation fails
>   * @domain_alloc_irqs:   Optional function to override the default 
> allocation
>   *   function.
>   * @domain_free_irqs:Optional function to override the default free
> @@ -295,7 +294,7 @@ struct msi_domain_info;
>   * @get_hwirq, @msi_init and @msi_free are callbacks used by the underlying
>   * irqdomain.
>   *
> - * @msi_check, @msi_prepare, @handle_error and @set_desc are callbacks used 
> by
> + * @msi_check, @msi_prepare and @set_desc are callbacks used by
>   * msi_domain_alloc/free_irqs().
>   *
>   * @domain_alloc_irqs, @domain_free_irqs can be used to override the
> @@ -332,8 +331,6 @@ struct msi_domain_ops {
>  msi_alloc_info_t *arg);
>   void(*set_desc)(msi_alloc_info_t *arg,
>   struct msi_desc *desc);
> - int (*handle_error)(struct irq_domain *domain,
> - struct msi_desc *desc, int error);
>   int (*domain_alloc_irqs)(struct irq_domain *domain,
>struct device *dev, int nvec);
>   void(*domain_free_irqs)(struct irq_domain *domain,
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -538,6 +538,27 @@ static bool msi_check_reservation_mode(s
>   return desc->pci.msi_attrib.is_msix || desc->pci.msi_attrib.can_mask;
>  }
>  
> +static int msi_handle_pci_fail(struct irq_domain *domain, struct msi_desc 
> *desc,
> +int allocated)
> +{
> + switch(domain->bus_token) {
> + case DOMAIN_BUS_PCI_MSI:
> + case DOMAIN_BUS_VMD_MSI:
> + if (IS_ENABLED(CONFIG_PCI_MSI))
> + break;
> + fallthrough;
> + default:
> + return -ENOSPC;
> + }
> +
> + /* Let a failed PCI multi MSI allocation retry */
> + if (desc->nvec_used > 1)
> + return 1;
> +
> + /* If there was a successful allocation let the caller know */
> + return allocated ? allocated : -ENOSPC;
> +}
> +
>  int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
>   int nvec)
>  {
> @@ -546,6 +567,7 @@ int __msi_domain_alloc_irqs(struct irq_d
>   struct irq_data *irq_data;
>   struct msi_desc *desc;
>   msi_alloc_info_t arg = { };
> + int allocated = 0;
>   int i, ret, virq;
>   bool can_reserve;
>  
> @@ -560,16 +582,15 @@ int __msi_domain_alloc_irqs(struct irq_d
>  dev_to_node(dev), , false,
>  

Re: [patch V2 28/36] PCI/MSI: Use __msi_get_virq() in pci_get_vector()

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:39:41PM +0100, Thomas Gleixner wrote:
> Use msi_get_vector() and handle the return value to be compatible.
> 
> No functional change intended.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Bjorn Helgaas 

> ---
> V2: Handle the INTx case directly instead of trying to be overly smart - Marc
> ---
>  drivers/pci/msi/msi.c |   25 +
>  1 file changed, 5 insertions(+), 20 deletions(-)
> 
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -1032,28 +1032,13 @@ EXPORT_SYMBOL(pci_free_irq_vectors);
>   */
>  int pci_irq_vector(struct pci_dev *dev, unsigned int nr)
>  {
> - if (dev->msix_enabled) {
> - struct msi_desc *entry;
> + unsigned int irq;
>  
> - for_each_pci_msi_entry(entry, dev) {
> - if (entry->msi_index == nr)
> - return entry->irq;
> - }
> - WARN_ON_ONCE(1);
> - return -EINVAL;
> - }
> + if (!dev->msi_enabled && !dev->msix_enabled)
> + return !nr ? dev->irq : -EINVAL;
>  
> - if (dev->msi_enabled) {
> - struct msi_desc *entry = first_pci_msi_entry(dev);
> -
> - if (WARN_ON_ONCE(nr >= entry->nvec_used))
> - return -EINVAL;
> - } else {
> - if (WARN_ON_ONCE(nr > 0))
> - return -EINVAL;
> - }
> -
> - return dev->irq + nr;
> + irq = msi_get_virq(>dev, nr);
> + return irq ? irq : -EINVAL;
>  }
>  EXPORT_SYMBOL(pci_irq_vector);
>  
> 



Re: [patch V2 25/36] PCI/MSI: Provide MSI_FLAG_MSIX_CONTIGUOUS

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:39:36PM +0100, Thomas Gleixner wrote:
> Provide a domain info flag which makes the core code check for a contiguous
> MSI-X index on allocation. That's simpler than checking it at some other
> domain callback in architecture code.
> 
> Signed-off-by: Thomas Gleixner 
> Reviewed-by: Greg Kroah-Hartman 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/irqdomain.c |   16 ++--
>  include/linux/msi.h |2 ++
>  2 files changed, 16 insertions(+), 2 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -89,9 +89,21 @@ static int pci_msi_domain_check_cap(stru
>   if (pci_msi_desc_is_multi_msi(desc) &&
>   !(info->flags & MSI_FLAG_MULTI_PCI_MSI))
>   return 1;
> - else if (desc->pci.msi_attrib.is_msix && !(info->flags & 
> MSI_FLAG_PCI_MSIX))
> - return -ENOTSUPP;
>  
> + if (desc->pci.msi_attrib.is_msix) {
> + if (!(info->flags & MSI_FLAG_PCI_MSIX))
> + return -ENOTSUPP;
> +
> + if (info->flags & MSI_FLAG_MSIX_CONTIGUOUS) {
> + unsigned int idx = 0;
> +
> + /* Check for gaps in the entry indices */
> + for_each_msi_entry(desc, dev) {
> + if (desc->msi_index != idx++)
> + return -ENOTSUPP;
> + }
> + }
> + }
>   return 0;
>  }
>  
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -376,6 +376,8 @@ enum {
>   MSI_FLAG_LEVEL_CAPABLE  = (1 << 6),
>   /* Populate sysfs on alloc() and destroy it on free() */
>   MSI_FLAG_DEV_SYSFS  = (1 << 7),
> + /* MSI-X entries must be contiguous */
> + MSI_FLAG_MSIX_CONTIGUOUS= (1 << 8),
>  };
>  
>  int msi_domain_set_affinity(struct irq_data *data, const struct cpumask 
> *mask,
> 



Re: [patch V2 19/36] PCI/MSI: Store properties in device::msi::data

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:39:26PM +0100, Thomas Gleixner wrote:
> Store the properties which are interesting for various places so the MSI
> descriptor fiddling can be removed.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Bjorn Helgaas 

> ---
> V2: Use the setter function
> ---
>  drivers/pci/msi/msi.c |8 
>  1 file changed, 8 insertions(+)
> 
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -244,6 +244,8 @@ static void free_msi_irqs(struct pci_dev
>   iounmap(dev->msix_base);
>   dev->msix_base = NULL;
>   }
> +
> + msi_device_set_properties(>dev, 0);
>  }
>  
>  static void pci_intx_for_msi(struct pci_dev *dev, int enable)
> @@ -341,6 +343,7 @@ msi_setup_entry(struct pci_dev *dev, int
>  {
>   struct irq_affinity_desc *masks = NULL;
>   struct msi_desc *entry;
> + unsigned long prop;
>   u16 control;
>  
>   if (affd)
> @@ -372,6 +375,10 @@ msi_setup_entry(struct pci_dev *dev, int
>   if (entry->pci.msi_attrib.can_mask)
>   pci_read_config_dword(dev, entry->pci.mask_pos, 
> >pci.msi_mask);
>  
> + prop = MSI_PROP_PCI_MSI;
> + if (entry->pci.msi_attrib.is_64)
> + prop |= MSI_PROP_64BIT;
> + msi_device_set_properties(>dev, prop);
>  out:
>   kfree(masks);
>   return entry;
> @@ -514,6 +521,7 @@ static int msix_setup_entries(struct pci
>   if (masks)
>   curmsk++;
>   }
> + msi_device_set_properties(>dev, MSI_PROP_PCI_MSIX | 
> MSI_PROP_64BIT);
>   ret = 0;
>  out:
>   kfree(masks);
> 



Re: [patch V2 17/36] PCI/MSI: Use msi_desc::msi_index

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:39:23PM +0100, Thomas Gleixner wrote:
> The usage of msi_desc::pci::entry_nr is confusing at best. It's the index
> into the MSI[X] descriptor table.
> 
> Use msi_desc::msi_index which is shared between all MSI incarnations
> instead of having a PCI specific storage for no value.
> 
> Signed-off-by: Thomas Gleixner 
> Reviewed-by: Greg Kroah-Hartman 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  arch/powerpc/platforms/pseries/msi.c |4 ++--
>  arch/x86/pci/xen.c   |2 +-
>  drivers/pci/msi/irqdomain.c  |2 +-
>  drivers/pci/msi/msi.c|   20 
>  drivers/pci/xen-pcifront.c   |2 +-
>  include/linux/msi.h  |2 --
>  6 files changed, 13 insertions(+), 19 deletions(-)
> 
> --- a/arch/powerpc/platforms/pseries/msi.c
> +++ b/arch/powerpc/platforms/pseries/msi.c
> @@ -332,7 +332,7 @@ static int check_msix_entries(struct pci
>  
>   expected = 0;
>   for_each_pci_msi_entry(entry, pdev) {
> - if (entry->pci.msi_attrib.entry_nr != expected) {
> + if (entry->msi_index != expected) {
>   pr_debug("rtas_msi: bad MSI-X entries.\n");
>   return -EINVAL;
>   }
> @@ -580,7 +580,7 @@ static int pseries_irq_domain_alloc(stru
>   int hwirq;
>   int i, ret;
>  
> - hwirq = rtas_query_irq_number(pci_get_pdn(pdev), 
> desc->pci.msi_attrib.entry_nr);
> + hwirq = rtas_query_irq_number(pci_get_pdn(pdev), desc->msi_index);
>   if (hwirq < 0) {
>   dev_err(>dev, "Failed to query HW IRQ: %d\n", hwirq);
>   return hwirq;
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -306,7 +306,7 @@ static int xen_initdom_setup_msi_irqs(st
>   return -EINVAL;
>  
>   map_irq.table_base = pci_resource_start(dev, bir);
> - map_irq.entry_nr = msidesc->pci.msi_attrib.entry_nr;
> + map_irq.entry_nr = msidesc->msi_index;
>   }
>  
>   ret = -EINVAL;
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -57,7 +57,7 @@ static irq_hw_number_t pci_msi_domain_ca
>  {
>   struct pci_dev *dev = msi_desc_to_pci_dev(desc);
>  
> - return (irq_hw_number_t)desc->pci.msi_attrib.entry_nr |
> + return (irq_hw_number_t)desc->msi_index |
>   pci_dev_id(dev) << 11 |
>   (pci_domain_nr(dev->bus) & 0x) << 27;
>  }
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -44,7 +44,7 @@ static inline void pci_msi_unmask(struct
>  
>  static inline void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
>  {
> - return desc->pci.mask_base + desc->pci.msi_attrib.entry_nr * 
> PCI_MSIX_ENTRY_SIZE;
> + return desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE;
>  }
>  
>  /*
> @@ -356,13 +356,10 @@ msi_setup_entry(struct pci_dev *dev, int
>   if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING)
>   control |= PCI_MSI_FLAGS_MASKBIT;
>  
> - entry->pci.msi_attrib.is_msix   = 0;
> - entry->pci.msi_attrib.is_64 = !!(control & 
> PCI_MSI_FLAGS_64BIT);
> - entry->pci.msi_attrib.is_virtual= 0;
> - entry->pci.msi_attrib.entry_nr  = 0;
> + entry->pci.msi_attrib.is_64 = !!(control & PCI_MSI_FLAGS_64BIT);
>   entry->pci.msi_attrib.can_mask  = !pci_msi_ignore_mask &&
> !!(control & PCI_MSI_FLAGS_MASKBIT);
> - entry->pci.msi_attrib.default_irq   = dev->irq; /* Save IOAPIC 
> IRQ */
> + entry->pci.msi_attrib.default_irq = dev->irq;
>   entry->pci.msi_attrib.multi_cap = (control & PCI_MSI_FLAGS_QMASK) >> 1;
>   entry->pci.msi_attrib.multiple  = ilog2(__roundup_pow_of_two(nvec));
>  
> @@ -496,12 +493,11 @@ static int msix_setup_entries(struct pci
>   entry->pci.msi_attrib.is_64 = 1;
>  
>   if (entries)
> - entry->pci.msi_attrib.entry_nr = entries[i].entry;
> + entry->msi_index = entries[i].entry;
>   else
> - entry->pci.msi_attrib.entry_nr = i;
> + entry->msi_index = i;
>  
> - entry->pci.msi_attrib.is_virtual =
> - entry->pci.msi_attrib.entry_nr >= vec_count;
> + entry->pci.msi_attrib.is_virtual = entry->msi_index >= 
> vec_count;
>  
>   entry->pci.msi_attrib.can_mask  = !pci_msi_ignore_mask &&
> 
> !entry->pci.msi_attrib.is_virtual;
> @@ -1034,7 +1030,7 @@ int pci_irq_vector(struct pci_dev *dev,
>   struct msi_desc *entry;
>  
>   for_each_pci_msi_entry(entry, dev) {
> - if (entry->pci.msi_attrib.entry_nr == nr)
> + if (entry->msi_index == nr)
>   return entry->irq;
>  

Re: [patch V2 08/36] PCI/MSI: Let the irq code handle sysfs groups

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:39:09PM +0100, Thomas Gleixner wrote:
> Set the domain info flag which makes the core code handle sysfs groups and
> put an explicit invocation into the legacy code.
> 
> Signed-off-by: Thomas Gleixner 
> Reviewed-by: Greg Kroah-Hartman 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/irqdomain.c |2 +-
>  drivers/pci/msi/legacy.c|6 +-
>  drivers/pci/msi/msi.c   |   23 ---
>  include/linux/pci.h |1 -
>  4 files changed, 6 insertions(+), 26 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -159,7 +159,7 @@ struct irq_domain *pci_msi_create_irq_do
>   if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
>   pci_msi_domain_update_chip_ops(info);
>  
> - info->flags |= MSI_FLAG_ACTIVATE_EARLY;
> + info->flags |= MSI_FLAG_ACTIVATE_EARLY | MSI_FLAG_DEV_SYSFS;
>   if (IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE))
>   info->flags |= MSI_FLAG_MUST_REACTIVATE;
>  
> --- a/drivers/pci/msi/legacy.c
> +++ b/drivers/pci/msi/legacy.c
> @@ -70,10 +70,14 @@ int pci_msi_legacy_setup_msi_irqs(struct
>  {
>   int ret = arch_setup_msi_irqs(dev, nvec, type);
>  
> - return pci_msi_setup_check_result(dev, type, ret);
> + ret = pci_msi_setup_check_result(dev, type, ret);
> + if (!ret)
> + ret = msi_device_populate_sysfs(>dev);
> + return ret;
>  }
>  
>  void pci_msi_legacy_teardown_msi_irqs(struct pci_dev *dev)
>  {
> + msi_device_destroy_sysfs(>dev);
>   arch_teardown_msi_irqs(dev);
>  }
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -233,11 +233,6 @@ static void free_msi_irqs(struct pci_dev
>   for (i = 0; i < entry->nvec_used; i++)
>   BUG_ON(irq_has_action(entry->irq + i));
>  
> - if (dev->msi_irq_groups) {
> - msi_destroy_sysfs(>dev, dev->msi_irq_groups);
> - dev->msi_irq_groups = NULL;
> - }
> -
>   pci_msi_teardown_msi_irqs(dev);
>  
>   list_for_each_entry_safe(entry, tmp, msi_list, list) {
> @@ -417,7 +412,6 @@ static int msi_verify_entries(struct pci
>  static int msi_capability_init(struct pci_dev *dev, int nvec,
>  struct irq_affinity *affd)
>  {
> - const struct attribute_group **groups;
>   struct msi_desc *entry;
>   int ret;
>  
> @@ -441,14 +435,6 @@ static int msi_capability_init(struct pc
>   if (ret)
>   goto err;
>  
> - groups = msi_populate_sysfs(>dev);
> - if (IS_ERR(groups)) {
> - ret = PTR_ERR(groups);
> - goto err;
> - }
> -
> - dev->msi_irq_groups = groups;
> -
>   /* Set MSI enabled bits */
>   pci_intx_for_msi(dev, 0);
>   pci_msi_set_enable(dev, 1);
> @@ -576,7 +562,6 @@ static void msix_mask_all(void __iomem *
>  static int msix_capability_init(struct pci_dev *dev, struct msix_entry 
> *entries,
>   int nvec, struct irq_affinity *affd)
>  {
> - const struct attribute_group **groups;
>   void __iomem *base;
>   int ret, tsize;
>   u16 control;
> @@ -618,14 +603,6 @@ static int msix_capability_init(struct p
>  
>   msix_update_entries(dev, entries);
>  
> - groups = msi_populate_sysfs(>dev);
> - if (IS_ERR(groups)) {
> - ret = PTR_ERR(groups);
> - goto out_free;
> - }
> -
> - dev->msi_irq_groups = groups;
> -
>   /* Set MSI-X enabled bits and unmask the function */
>   pci_intx_for_msi(dev, 0);
>   dev->msix_enabled = 1;
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -475,7 +475,6 @@ struct pci_dev {
>  #ifdef CONFIG_PCI_MSI
>   void __iomem*msix_base;
>   raw_spinlock_t  msi_lock;
> - const struct attribute_group **msi_irq_groups;
>  #endif
>   struct pci_vpd  vpd;
>  #ifdef CONFIG_PCIE_DPC
> 



Re: [patch V2 03/36] PCI/MSI: Allocate MSI device data on first use

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:39:00PM +0100, Thomas Gleixner wrote:
> Allocate MSI device data on first use, i.e. when a PCI driver invokes one
> of the PCI/MSI enablement functions.
> 
> Signed-off-by: Thomas Gleixner 
> Reviewed-by: Greg Kroah-Hartman 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/msi.c |   20 +++-
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -889,10 +889,12 @@ static int __pci_enable_msi_range(struct
>  /* deprecated, don't use */
>  int pci_enable_msi(struct pci_dev *dev)
>  {
> - int rc = __pci_enable_msi_range(dev, 1, 1, NULL);
> - if (rc < 0)
> - return rc;
> - return 0;
> + int rc = msi_setup_device_data(>dev);
> +
> + if (!rc)
> + rc = __pci_enable_msi_range(dev, 1, 1, NULL);
> +
> + return rc < 0 ? rc : 0;
>  }
>  EXPORT_SYMBOL(pci_enable_msi);
>  
> @@ -947,7 +949,11 @@ static int __pci_enable_msix_range(struc
>  int pci_enable_msix_range(struct pci_dev *dev, struct msix_entry *entries,
>   int minvec, int maxvec)
>  {
> - return __pci_enable_msix_range(dev, entries, minvec, maxvec, NULL, 0);
> + int ret = msi_setup_device_data(>dev);
> +
> + if (!ret)
> + ret = __pci_enable_msix_range(dev, entries, minvec, maxvec, 
> NULL, 0);
> + return ret;
>  }
>  EXPORT_SYMBOL(pci_enable_msix_range);
>  
> @@ -974,8 +980,12 @@ int pci_alloc_irq_vectors_affinity(struc
>  struct irq_affinity *affd)
>  {
>   struct irq_affinity msi_default_affd = {0};
> + int ret = msi_setup_device_data(>dev);
>   int nvecs = -ENOSPC;
>  
> + if (ret)
> + return ret;
> +
>   if (flags & PCI_IRQ_AFFINITY) {
>   if (!affd)
>   affd = _default_affd;
> 



Re: [patch V2 23/23] PCI/MSI: Move descriptor counting on allocation fail to the legacy code

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:28:00PM +0100, Thomas Gleixner wrote:
> The irqdomain code already returns the information. Move the loop to the
> legacy code.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/legacy.c |   20 +++-
>  drivers/pci/msi/msi.c|   19 +--
>  2 files changed, 20 insertions(+), 19 deletions(-)
> 
> --- a/drivers/pci/msi/legacy.c
> +++ b/drivers/pci/msi/legacy.c
> @@ -50,9 +50,27 @@ void __weak arch_teardown_msi_irqs(struc
>   }
>  }
>  
> +static int pci_msi_setup_check_result(struct pci_dev *dev, int type, int ret)
> +{
> + struct msi_desc *entry;
> + int avail = 0;
> +
> + if (type != PCI_CAP_ID_MSIX || ret >= 0)
> + return ret;
> +
> + /* Scan the MSI descriptors for successfully allocated ones. */
> + for_each_pci_msi_entry(entry, dev) {
> + if (entry->irq != 0)
> + avail++;
> + }
> + return avail ? avail : ret;
> +}
> +
>  int pci_msi_legacy_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>  {
> - return arch_setup_msi_irqs(dev, nvec, type);
> + int ret = arch_setup_msi_irqs(dev, nvec, type);
> +
> + return pci_msi_setup_check_result(dev, type, ret);
>  }
>  
>  void pci_msi_legacy_teardown_msi_irqs(struct pci_dev *dev)
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -609,7 +609,7 @@ static int msix_capability_init(struct p
>  
>   ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
>   if (ret)
> - goto out_avail;
> + goto out_free;
>  
>   /* Check if all MSI entries honor device restrictions */
>   ret = msi_verify_entries(dev);
> @@ -634,23 +634,6 @@ static int msix_capability_init(struct p
>   pcibios_free_irq(dev);
>   return 0;
>  
> -out_avail:
> - if (ret < 0) {
> - /*
> -  * If we had some success, report the number of IRQs
> -  * we succeeded in setting up.
> -  */
> - struct msi_desc *entry;
> - int avail = 0;
> -
> - for_each_pci_msi_entry(entry, dev) {
> - if (entry->irq != 0)
> - avail++;
> - }
> - if (avail != 0)
> - ret = avail;
> - }
> -
>  out_free:
>   free_msi_irqs(dev);
>  
> 



Re: [patch V2 21/23] PCI/MSI: Make pci_msi_domain_check_cap() static

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:57PM +0100, Thomas Gleixner wrote:
> No users outside of that file.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/irqdomain.c |5 +++--
>  include/linux/msi.h |2 --
>  2 files changed, 3 insertions(+), 4 deletions(-)
> 
> --- a/drivers/pci/msi/irqdomain.c
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -79,8 +79,9 @@ static inline bool pci_msi_desc_is_multi
>   *  1 if Multi MSI is requested, but the domain does not support it
>   *  -ENOTSUPP otherwise
>   */
> -int pci_msi_domain_check_cap(struct irq_domain *domain,
> -  struct msi_domain_info *info, struct device *dev)
> +static int pci_msi_domain_check_cap(struct irq_domain *domain,
> + struct msi_domain_info *info,
> + struct device *dev)
>  {
>   struct msi_desc *desc = first_pci_msi_entry(to_pci_dev(dev));
>  
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -439,8 +439,6 @@ void *platform_msi_get_host_data(struct
>  struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
>struct msi_domain_info *info,
>struct irq_domain *parent);
> -int pci_msi_domain_check_cap(struct irq_domain *domain,
> -  struct msi_domain_info *info, struct device *dev);
>  u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev 
> *pdev);
>  struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev);
>  bool pci_dev_has_special_msi_domain(struct pci_dev *pdev);
> 



Re: [patch V2 20/23] PCI/MSI: Move msi_lock to struct pci_dev

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:56PM +0100, Thomas Gleixner wrote:
> It's only required for PCI/MSI. So no point in having it in every struct
> device.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Bjorn Helgaas 

> ---
> V2: New patch
> ---
>  drivers/base/core.c|1 -
>  drivers/pci/msi/msi.c  |2 +-
>  drivers/pci/probe.c|4 +++-
>  include/linux/device.h |2 --
>  include/linux/pci.h|1 +
>  5 files changed, 5 insertions(+), 5 deletions(-)
> 
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2875,7 +2875,6 @@ void device_initialize(struct device *de
>   device_pm_init(dev);
>   set_dev_node(dev, NUMA_NO_NODE);
>  #ifdef CONFIG_GENERIC_MSI_IRQ
> - raw_spin_lock_init(>msi_lock);
>   INIT_LIST_HEAD(>msi_list);
>  #endif
>   INIT_LIST_HEAD(>links.consumers);
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -18,7 +18,7 @@ int pci_msi_ignore_mask;
>  
>  static noinline void pci_msi_update_mask(struct msi_desc *desc, u32 clear, 
> u32 set)
>  {
> - raw_spinlock_t *lock = >dev->msi_lock;
> + raw_spinlock_t *lock = _pci_dev(desc->dev)->msi_lock;
>   unsigned long flags;
>  
>   if (!desc->pci.msi_attrib.can_mask)
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2311,7 +2311,9 @@ struct pci_dev *pci_alloc_dev(struct pci
>   INIT_LIST_HEAD(>bus_list);
>   dev->dev.type = _dev_type;
>   dev->bus = pci_bus_get(bus);
> -
> +#ifdef CONFIG_PCI_MSI
> + raw_spin_lock_init(>msi_lock);
> +#endif
>   return dev;
>  }
>  EXPORT_SYMBOL(pci_alloc_dev);
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -407,7 +407,6 @@ struct dev_links_info {
>   * @em_pd:   device's energy model performance domain
>   * @pins:For device pin management.
>   *   See Documentation/driver-api/pin-control.rst for details.
> - * @msi_lock:Lock to protect MSI mask cache and mask register
>   * @msi_list:Hosts MSI descriptors
>   * @msi_domain: The generic MSI domain this device is using.
>   * @numa_node:   NUMA node this device is close to.
> @@ -508,7 +507,6 @@ struct device {
>   struct dev_pin_info *pins;
>  #endif
>  #ifdef CONFIG_GENERIC_MSI_IRQ
> - raw_spinlock_t  msi_lock;
>   struct list_headmsi_list;
>  #endif
>  #ifdef CONFIG_DMA_OPS
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -474,6 +474,7 @@ struct pci_dev {
>  #endif
>  #ifdef CONFIG_PCI_MSI
>   void __iomem*msix_base;
> + raw_spinlock_t  msi_lock;
>   const struct attribute_group **msi_irq_groups;
>  #endif
>   struct pci_vpd  vpd;
> 



Re: [patch V2 19/23] PCI/MSI: Sanitize MSIX table map handling

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:54PM +0100, Thomas Gleixner wrote:
> Unmapping the MSIX base mapping in the loops which allocate/free MSI
> desciptors is daft and in the way of allowing runtime expansion of MSI-X
> descriptors.

s/MSIX/MSI-X/ (subject and first use in commit log)
s/desciptors/descriptors/

> Store the mapping in struct pci_dev and free it after freeing the MSI-X
> descriptors.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/msi.c |   18 --
>  include/linux/pci.h   |1 +
>  2 files changed, 9 insertions(+), 10 deletions(-)
> 
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -241,14 +241,14 @@ static void free_msi_irqs(struct pci_dev
>   pci_msi_teardown_msi_irqs(dev);
>  
>   list_for_each_entry_safe(entry, tmp, msi_list, list) {
> - if (entry->pci.msi_attrib.is_msix) {
> - if (list_is_last(>list, msi_list))
> - iounmap(entry->pci.mask_base);
> - }
> -
>   list_del(>list);
>   free_msi_entry(entry);
>   }
> +
> + if (dev->msix_base) {
> + iounmap(dev->msix_base);
> + dev->msix_base = NULL;
> + }
>  }
>  
>  static void pci_intx_for_msi(struct pci_dev *dev, int enable)
> @@ -501,10 +501,6 @@ static int msix_setup_entries(struct pci
>   for (i = 0, curmsk = masks; i < nvec; i++) {
>   entry = alloc_msi_entry(>dev, 1, curmsk);
>   if (!entry) {
> - if (!i)
> - iounmap(base);
> - else
> - free_msi_irqs(dev);
>   /* No enough memory. Don't try again */
>   ret = -ENOMEM;
>   goto out;
> @@ -602,12 +598,14 @@ static int msix_capability_init(struct p
>   goto out_disable;
>   }
>  
> + dev->msix_base = base;
> +
>   /* Ensure that all table entries are masked. */
>   msix_mask_all(base, tsize);
>  
>   ret = msix_setup_entries(dev, base, entries, nvec, affd);
>   if (ret)
> - goto out_disable;
> + goto out_free;
>  
>   ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
>   if (ret)
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -473,6 +473,7 @@ struct pci_dev {
>   u8  ptm_granularity;
>  #endif
>  #ifdef CONFIG_PCI_MSI
> + void __iomem*msix_base;
>   const struct attribute_group **msi_irq_groups;
>  #endif
>   struct pci_vpd  vpd;
> 



Re: [patch V2 18/23] PCI/MSI: Split out irqdomain code

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:52PM +0100, Thomas Gleixner wrote:
> Move the irqdomain specific code into it's own file.

s/it's/its/

> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi/Makefile|1 
>  drivers/pci/msi/irqdomain.c |  279 ++
>  drivers/pci/msi/legacy.c|   13 +
>  drivers/pci/msi/msi.c   |  319 
> +---
>  drivers/pci/msi/msi.h   |   39 +
>  include/linux/msi.h |   11 -
>  6 files changed, 340 insertions(+), 322 deletions(-)
> 
> --- a/drivers/pci/msi/Makefile
> +++ b/drivers/pci/msi/Makefile
> @@ -3,4 +3,5 @@
>  # Makefile for the PCI/MSI
>  obj-$(CONFIG_PCI)+= pcidev_msi.o
>  obj-$(CONFIG_PCI_MSI)+= msi.o
> +obj-$(CONFIG_PCI_MSI_IRQ_DOMAIN) += irqdomain.o
>  obj-$(CONFIG_PCI_MSI_ARCH_FALLBACKS) += legacy.o
> --- /dev/null
> +++ b/drivers/pci/msi/irqdomain.c
> @@ -0,0 +1,279 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * PCI Message Signaled Interrupt (MSI) - irqdomain support
> + */
> +#include 
> +#include 
> +#include 
> +
> +#include "msi.h"
> +
> +int pci_msi_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> +{
> + struct irq_domain *domain;
> +
> + domain = dev_get_msi_domain(>dev);
> + if (domain && irq_domain_is_hierarchy(domain))
> + return msi_domain_alloc_irqs(domain, >dev, nvec);
> +
> + return pci_msi_legacy_setup_msi_irqs(dev, nvec, type);
> +}
> +
> +void pci_msi_teardown_msi_irqs(struct pci_dev *dev)
> +{
> + struct irq_domain *domain;
> +
> + domain = dev_get_msi_domain(>dev);
> + if (domain && irq_domain_is_hierarchy(domain))
> + msi_domain_free_irqs(domain, >dev);
> + else
> + pci_msi_legacy_teardown_msi_irqs(dev);
> +}
> +
> +/**
> + * pci_msi_domain_write_msg - Helper to write MSI message to PCI config space
> + * @irq_data:Pointer to interrupt data of the MSI interrupt
> + * @msg: Pointer to the message
> + */
> +static void pci_msi_domain_write_msg(struct irq_data *irq_data, struct 
> msi_msg *msg)
> +{
> + struct msi_desc *desc = irq_data_get_msi_desc(irq_data);
> +
> + /*
> +  * For MSI-X desc->irq is always equal to irq_data->irq. For
> +  * MSI only the first interrupt of MULTI MSI passes the test.
> +  */
> + if (desc->irq == irq_data->irq)
> + __pci_write_msi_msg(desc, msg);
> +}
> +
> +/**
> + * pci_msi_domain_calc_hwirq - Generate a unique ID for an MSI source
> + * @desc:Pointer to the MSI descriptor
> + *
> + * The ID number is only used within the irqdomain.
> + */
> +static irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc)
> +{
> + struct pci_dev *dev = msi_desc_to_pci_dev(desc);
> +
> + return (irq_hw_number_t)desc->pci.msi_attrib.entry_nr |
> + pci_dev_id(dev) << 11 |
> + (pci_domain_nr(dev->bus) & 0x) << 27;
> +}
> +
> +static inline bool pci_msi_desc_is_multi_msi(struct msi_desc *desc)
> +{
> + return !desc->pci.msi_attrib.is_msix && desc->nvec_used > 1;
> +}
> +
> +/**
> + * pci_msi_domain_check_cap - Verify that @domain supports the capabilities
> + * for @dev
> + * @domain:  The interrupt domain to check
> + * @info:The domain info for verification
> + * @dev: The device to check
> + *
> + * Returns:
> + *  0 if the functionality is supported
> + *  1 if Multi MSI is requested, but the domain does not support it
> + *  -ENOTSUPP otherwise
> + */
> +int pci_msi_domain_check_cap(struct irq_domain *domain,
> +  struct msi_domain_info *info, struct device *dev)
> +{
> + struct msi_desc *desc = first_pci_msi_entry(to_pci_dev(dev));
> +
> + /* Special handling to support __pci_enable_msi_range() */
> + if (pci_msi_desc_is_multi_msi(desc) &&
> + !(info->flags & MSI_FLAG_MULTI_PCI_MSI))
> + return 1;
> + else if (desc->pci.msi_attrib.is_msix && !(info->flags & 
> MSI_FLAG_PCI_MSIX))
> + return -ENOTSUPP;
> +
> + return 0;
> +}
> +
> +static int pci_msi_domain_handle_error(struct irq_domain *domain,
> +struct msi_desc *desc, int error)
> +{
> + /* Special handling to support __pci_enable_msi_range() */
> + if (pci_msi_desc_is_multi_msi(desc) && error == -ENOSPC)
> + return 1;
> +
> + return error;
> +}
> +
> +static void pci_msi_domain_set_desc(msi_alloc_info_t *arg,
> + struct msi_desc *desc)
> +{
> + arg->desc = desc;
> + arg->hwirq = pci_msi_domain_calc_hwirq(desc);
> +}
> +
> +static struct msi_domain_ops pci_msi_domain_ops_default = {
> + .set_desc   = pci_msi_domain_set_desc,
> + .msi_check  = pci_msi_domain_check_cap,
> + .handle_error   = pci_msi_domain_handle_error,
> +};
> +
> +static void 

Re: [patch V2 17/23] PCI/MSI: Split out !IRQDOMAIN code

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:51PM +0100, Thomas Gleixner wrote:
> Split out the non irqdomain code into its own file.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
> V2: Add proper includes and fix variable name - Cedric
> ---
>  drivers/pci/msi/Makefile |5 ++--
>  drivers/pci/msi/legacy.c |   52 
> +++
>  drivers/pci/msi/msi.c|   46 -
>  3 files changed, 55 insertions(+), 48 deletions(-)
> 
> --- a/drivers/pci/msi/Makefile
> +++ b/drivers/pci/msi/Makefile
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0
>  #
>  # Makefile for the PCI/MSI
> -obj-$(CONFIG_PCI)+= pcidev_msi.o
> -obj-$(CONFIG_PCI_MSI)+= msi.o
> +obj-$(CONFIG_PCI)+= pcidev_msi.o
> +obj-$(CONFIG_PCI_MSI)+= msi.o
> +obj-$(CONFIG_PCI_MSI_ARCH_FALLBACKS) += legacy.o
> --- /dev/null
> +++ b/drivers/pci/msi/legacy.c
> @@ -0,0 +1,52 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * PCI Message Signaled Interrupt (MSI).
> + *
> + * Legacy architecture specific setup and teardown mechanism.
> + */
> +#include 
> +#include 
> +
> +/* Arch hooks */
> +int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
> +{
> + return -EINVAL;
> +}
> +
> +void __weak arch_teardown_msi_irq(unsigned int irq)
> +{
> +}
> +
> +int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> +{
> + struct msi_desc *desc;
> + int ret;
> +
> + /*
> +  * If an architecture wants to support multiple MSI, it needs to
> +  * override arch_setup_msi_irqs()
> +  */
> + if (type == PCI_CAP_ID_MSI && nvec > 1)
> + return 1;
> +
> + for_each_pci_msi_entry(desc, dev) {
> + ret = arch_setup_msi_irq(dev, desc);
> + if (ret)
> + return ret < 0 ? ret : -ENOSPC;
> + }
> +
> + return 0;
> +}
> +
> +void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
> +{
> + struct msi_desc *desc;
> + int i;
> +
> + for_each_pci_msi_entry(desc, dev) {
> + if (desc->irq) {
> + for (i = 0; i < desc->nvec_used; i++)
> + arch_teardown_msi_irq(desc->irq + i);
> + }
> + }
> +}
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -50,52 +50,6 @@ static void pci_msi_teardown_msi_irqs(st
>  #define pci_msi_teardown_msi_irqsarch_teardown_msi_irqs
>  #endif
>  
> -#ifdef CONFIG_PCI_MSI_ARCH_FALLBACKS
> -/* Arch hooks */
> -int __weak arch_setup_msi_irq(struct pci_dev *dev, struct msi_desc *desc)
> -{
> - return -EINVAL;
> -}
> -
> -void __weak arch_teardown_msi_irq(unsigned int irq)
> -{
> -}
> -
> -int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
> -{
> - struct msi_desc *entry;
> - int ret;
> -
> - /*
> -  * If an architecture wants to support multiple MSI, it needs to
> -  * override arch_setup_msi_irqs()
> -  */
> - if (type == PCI_CAP_ID_MSI && nvec > 1)
> - return 1;
> -
> - for_each_pci_msi_entry(entry, dev) {
> - ret = arch_setup_msi_irq(dev, entry);
> - if (ret < 0)
> - return ret;
> - if (ret > 0)
> - return -ENOSPC;
> - }
> -
> - return 0;
> -}
> -
> -void __weak arch_teardown_msi_irqs(struct pci_dev *dev)
> -{
> - int i;
> - struct msi_desc *entry;
> -
> - for_each_pci_msi_entry(entry, dev)
> - if (entry->irq)
> - for (i = 0; i < entry->nvec_used; i++)
> - arch_teardown_msi_irq(entry->irq + i);
> -}
> -#endif /* CONFIG_PCI_MSI_ARCH_FALLBACKS */
> -
>  /*
>   * PCI 2.3 does not specify mask bits for each MSI interrupt.  Attempting to
>   * mask all MSI interrupts by clearing the MSI enable bit does not work
> 



Re: [patch V2 14/23] PCI/MSI: Make msix_update_entries() smarter

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:46PM +0100, Thomas Gleixner wrote:
> No need to walk the descriptors and check for each one whether the entries
> pointer function argument is NULL. Do it once.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -642,8 +642,8 @@ static void msix_update_entries(struct p
>  {
>   struct msi_desc *entry;
>  
> - for_each_pci_msi_entry(entry, dev) {
> - if (entries) {
> + if (entries) {
> + for_each_pci_msi_entry(entry, dev) {
>   entries->vector = entry->irq;
>   entries++;
>   }
> 



Re: [patch V2 13/23] PCI/MSI: Cleanup include zoo

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:44PM +0100, Thomas Gleixner wrote:
> Get rid of the pile of unneeded includes which accumulated over time.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

Nice, thanks!

> ---
> V2: Address build fail on powerpc - Cedric
> ---
>  drivers/pci/msi.c |   16 
>  1 file changed, 4 insertions(+), 12 deletions(-)
> 
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -7,22 +7,14 @@
>   * Copyright (C) 2016 Christoph Hellwig.
>   */
>  
> +#include 
>  #include 
> -#include 
> -#include 
> -#include 
>  #include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> +#include 
>  #include 
> +#include 
>  #include 
> +#include 
>  
>  #include "pci.h"
>  
> 



Re: [patch V2 12/23] PCI/MSI: Make arch_restore_msi_irqs() less horrible.

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:42PM +0100, Thomas Gleixner wrote:
> Make arch_restore_msi_irqs() return a boolean which indicates whether the
> core code should restore the MSI message or not. Get rid of the indirection
> in x86.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 
> Cc: x...@kernel.org
> Cc: xen-devel@lists.xenproject.org
> Cc: Christian Borntraeger 
> Cc: Heiko Carstens 

Acked-by: Bjorn Helgaas# PCI

> ---
>  arch/s390/pci/pci_irq.c   |4 +-
>  arch/x86/include/asm/x86_init.h   |6 ---
>  arch/x86/include/asm/xen/hypervisor.h |8 +
>  arch/x86/kernel/apic/msi.c|6 +++
>  arch/x86/kernel/x86_init.c|   12 ---
>  arch/x86/pci/xen.c|   13 
>  drivers/pci/msi.c |   54 
> +++---
>  include/linux/msi.h   |7 +---
>  8 files changed, 45 insertions(+), 65 deletions(-)
> 
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -387,13 +387,13 @@ void arch_teardown_msi_irqs(struct pci_d
>   airq_iv_free(zpci_ibv[0], zdev->msi_first_bit, 
> zdev->msi_nr_irqs);
>  }
>  
> -void arch_restore_msi_irqs(struct pci_dev *pdev)
> +bool arch_restore_msi_irqs(struct pci_dev *pdev)
>  {
>   struct zpci_dev *zdev = to_zpci(pdev);
>  
>   if (!zdev->irqs_registered)
>   zpci_set_irq(zdev);
> - default_restore_msi_irqs(pdev);
> + return true;
>  }
>  
>  static struct airq_struct zpci_airq = {
> --- a/arch/x86/include/asm/x86_init.h
> +++ b/arch/x86/include/asm/x86_init.h
> @@ -289,12 +289,6 @@ struct x86_platform_ops {
>   struct x86_hyper_runtime hyper;
>  };
>  
> -struct pci_dev;
> -
> -struct x86_msi_ops {
> - void (*restore_msi_irqs)(struct pci_dev *dev);
> -};
> -
>  struct x86_apic_ops {
>   unsigned int(*io_apic_read)   (unsigned int apic, unsigned int reg);
>   void(*restore)(void);
> --- a/arch/x86/include/asm/xen/hypervisor.h
> +++ b/arch/x86/include/asm/xen/hypervisor.h
> @@ -57,6 +57,14 @@ static inline bool __init xen_x2apic_par
>  }
>  #endif
>  
> +struct pci_dev;
> +
> +#ifdef CONFIG_XEN_DOM0
> +bool xen_initdom_restore_msi(struct pci_dev *dev);
> +#else
> +static inline bool xen_initdom_restore_msi(struct pci_dev *dev) { return 
> true; }
> +#endif
> +
>  #ifdef CONFIG_HOTPLUG_CPU
>  void xen_arch_register_cpu(int num);
>  void xen_arch_unregister_cpu(int num);
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  struct irq_domain *x86_pci_msi_default_domain __ro_after_init;
>  
> @@ -345,3 +346,8 @@ void dmar_free_hwirq(int irq)
>   irq_domain_free_irqs(irq, 1);
>  }
>  #endif
> +
> +bool arch_restore_msi_irqs(struct pci_dev *dev)
> +{
> + return xen_initdom_restore_msi(dev);
> +}
> --- a/arch/x86/kernel/x86_init.c
> +++ b/arch/x86/kernel/x86_init.c
> @@ -145,18 +145,6 @@ struct x86_platform_ops x86_platform __r
>  
>  EXPORT_SYMBOL_GPL(x86_platform);
>  
> -#if defined(CONFIG_PCI_MSI)
> -struct x86_msi_ops x86_msi __ro_after_init = {
> - .restore_msi_irqs   = default_restore_msi_irqs,
> -};
> -
> -/* MSI arch specific hooks */
> -void arch_restore_msi_irqs(struct pci_dev *dev)
> -{
> - x86_msi.restore_msi_irqs(dev);
> -}
> -#endif
> -
>  struct x86_apic_ops x86_apic_ops __ro_after_init = {
>   .io_apic_read   = native_io_apic_read,
>   .restore= native_restore_boot_irq_mode,
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -351,10 +351,13 @@ static int xen_initdom_setup_msi_irqs(st
>   return ret;
>  }
>  
> -static void xen_initdom_restore_msi_irqs(struct pci_dev *dev)
> +bool xen_initdom_restore_msi(struct pci_dev *dev)
>  {
>   int ret = 0;
>  
> + if (!xen_initial_domain())
> + return true;
> +
>   if (pci_seg_supported) {
>   struct physdev_pci_device restore_ext;
>  
> @@ -375,10 +378,10 @@ static void xen_initdom_restore_msi_irqs
>   ret = HYPERVISOR_physdev_op(PHYSDEVOP_restore_msi, );
>   WARN(ret && ret != -ENOSYS, "restore_msi -> %d\n", ret);
>   }
> + return false;
>  }
>  #else /* CONFIG_XEN_PV_DOM0 */
>  #define xen_initdom_setup_msi_irqs   NULL
> -#define xen_initdom_restore_msi_irqs NULL
>  #endif /* !CONFIG_XEN_PV_DOM0 */
>  
>  static void xen_teardown_msi_irqs(struct pci_dev *dev)
> @@ -466,12 +469,10 @@ static __init struct irq_domain *xen_cre
>  static __init void xen_setup_pci_msi(void)
>  {
>   if (xen_pv_domain()) {
> - if (xen_initial_domain()) {
> + if (xen_initial_domain())
>   xen_msi_ops.setup_msi_irqs = xen_initdom_setup_msi_irqs;
> - x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
> - } else {
> + else
>   xen_msi_ops.setup_msi_irqs = 

Re: [patch V2 08/23] PCI/sysfs: Use pci_irq_vector()

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:36PM +0100, Thomas Gleixner wrote:
> instead of fiddling with msi descriptors.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

s/msi/MSI/ above if you have a chance.  Nice cleanup, thanks!

> ---
>  drivers/pci/pci-sysfs.c |7 ++-
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -62,11 +62,8 @@ static ssize_t irq_show(struct device *d
>* For MSI, show the first MSI IRQ; for all other cases including
>* MSI-X, show the legacy INTx IRQ.
>*/
> - if (pdev->msi_enabled) {
> - struct msi_desc *desc = first_pci_msi_entry(pdev);
> -
> - return sysfs_emit(buf, "%u\n", desc->irq);
> - }
> + if (pdev->msi_enabled)
> + return sysfs_emit(buf, "%u\n", pci_irq_vector(pdev, 0));
>  #endif
>  
>   return sysfs_emit(buf, "%u\n", pdev->irq);
> 



Re: [patch V2 07/23] PCI/MSI: Remove msi_desc_to_pci_sysdata()

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:34PM +0100, Thomas Gleixner wrote:
> Last user is gone long ago.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/msi.c   |8 
>  include/linux/msi.h |5 -
>  2 files changed, 13 deletions(-)
> 
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1267,14 +1267,6 @@ struct pci_dev *msi_desc_to_pci_dev(stru
>  }
>  EXPORT_SYMBOL(msi_desc_to_pci_dev);
>  
> -void *msi_desc_to_pci_sysdata(struct msi_desc *desc)
> -{
> - struct pci_dev *dev = msi_desc_to_pci_dev(desc);
> -
> - return dev->bus->sysdata;
> -}
> -EXPORT_SYMBOL_GPL(msi_desc_to_pci_sysdata);
> -
>  #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
>  /**
>   * pci_msi_domain_write_msg - Helper to write MSI message to PCI config space
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -218,13 +218,8 @@ static inline void msi_desc_set_iommu_co
>   for_each_msi_entry((desc), &(pdev)->dev)
>  
>  struct pci_dev *msi_desc_to_pci_dev(struct msi_desc *desc);
> -void *msi_desc_to_pci_sysdata(struct msi_desc *desc);
>  void pci_write_msi_msg(unsigned int irq, struct msi_msg *msg);
>  #else /* CONFIG_PCI_MSI */
> -static inline void *msi_desc_to_pci_sysdata(struct msi_desc *desc)
> -{
> - return NULL;
> -}
>  static inline void pci_write_msi_msg(unsigned int irq, struct msi_msg *msg)
>  {
>  }
> 



Re: [patch V2 06/23] PCI/MSI: Make pci_msi_domain_write_msg() static

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:33PM +0100, Thomas Gleixner wrote:
> There is no point to have this function public as it is set by the PCI core
> anyway when a PCI/MSI irqdomain is created.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas# PCI

> ---
>  drivers/irqchip/irq-gic-v2m.c|1 -
>  drivers/irqchip/irq-gic-v3-its-pci-msi.c |1 -
>  drivers/irqchip/irq-gic-v3-mbi.c |1 -
>  drivers/pci/msi.c|2 +-
>  include/linux/msi.h  |1 -
>  5 files changed, 1 insertion(+), 5 deletions(-)
> 
> --- a/drivers/irqchip/irq-gic-v2m.c
> +++ b/drivers/irqchip/irq-gic-v2m.c
> @@ -88,7 +88,6 @@ static struct irq_chip gicv2m_msi_irq_ch
>   .irq_mask   = gicv2m_mask_msi_irq,
>   .irq_unmask = gicv2m_unmask_msi_irq,
>   .irq_eoi= irq_chip_eoi_parent,
> - .irq_write_msi_msg  = pci_msi_domain_write_msg,
>  };
>  
>  static struct msi_domain_info gicv2m_msi_domain_info = {
> --- a/drivers/irqchip/irq-gic-v3-its-pci-msi.c
> +++ b/drivers/irqchip/irq-gic-v3-its-pci-msi.c
> @@ -28,7 +28,6 @@ static struct irq_chip its_msi_irq_chip
>   .irq_unmask = its_unmask_msi_irq,
>   .irq_mask   = its_mask_msi_irq,
>   .irq_eoi= irq_chip_eoi_parent,
> - .irq_write_msi_msg  = pci_msi_domain_write_msg,
>  };
>  
>  static int its_pci_msi_vec_count(struct pci_dev *pdev, void *data)
> --- a/drivers/irqchip/irq-gic-v3-mbi.c
> +++ b/drivers/irqchip/irq-gic-v3-mbi.c
> @@ -171,7 +171,6 @@ static struct irq_chip mbi_msi_irq_chip
>   .irq_unmask = mbi_unmask_msi_irq,
>   .irq_eoi= irq_chip_eoi_parent,
>   .irq_compose_msi_msg= mbi_compose_msi_msg,
> - .irq_write_msi_msg  = pci_msi_domain_write_msg,
>  };
>  
>  static struct msi_domain_info mbi_msi_domain_info = {
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1281,7 +1281,7 @@ EXPORT_SYMBOL_GPL(msi_desc_to_pci_sysdat
>   * @irq_data:Pointer to interrupt data of the MSI interrupt
>   * @msg: Pointer to the message
>   */
> -void pci_msi_domain_write_msg(struct irq_data *irq_data, struct msi_msg *msg)
> +static void pci_msi_domain_write_msg(struct irq_data *irq_data, struct 
> msi_msg *msg)
>  {
>   struct msi_desc *desc = irq_data_get_msi_desc(irq_data);
>  
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -455,7 +455,6 @@ void *platform_msi_get_host_data(struct
>  #endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
>  
>  #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> -void pci_msi_domain_write_msg(struct irq_data *irq_data, struct msi_msg 
> *msg);
>  struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode,
>struct msi_domain_info *info,
>struct irq_domain *parent);
> 



Re: [patch V2 02/23] PCI/MSI: Fix pci_irq_vector()/pci_irq_get_affinity()

2021-12-07 Thread Bjorn Helgaas
On Mon, Dec 06, 2021 at 11:27:26PM +0100, Thomas Gleixner wrote:
> pci_irq_vector() and pci_irq_get_affinity() use the list position to find the
> MSI-X descriptor at a given index. That's correct for the normal case where
> the entry number is the same as the list position.
> 
> But it's wrong for cases where MSI-X was allocated with an entries array
> describing sparse entry numbers into the hardware message descriptor
> table. That's inconsistent at best.
> 
> Make it always check the entry number because that's what the zero base
> index really means. This change won't break existing users which use a
> sparse entries array for allocation because these users retrieve the Linux
> interrupt number from the entries array after allocation and none of them
> uses pci_irq_vector() or pci_irq_get_affinity().
> 
> Fixes: aff171641d18 ("PCI: Provide sensible IRQ vector alloc/free routines")
> Signed-off-by: Thomas Gleixner 
> Tested-by: Juergen Gross 
> Reviewed-by: Jason Gunthorpe 

Acked-by: Bjorn Helgaas 

> ---
> V2: Fix typo in subject - Jason
> ---
>  drivers/pci/msi.c |   26 ++
>  1 file changed, 18 insertions(+), 8 deletions(-)
> 
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -1187,19 +1187,24 @@ EXPORT_SYMBOL(pci_free_irq_vectors);
>  
>  /**
>   * pci_irq_vector - return Linux IRQ number of a device vector
> - * @dev: PCI device to operate on
> - * @nr: device-relative interrupt vector index (0-based).
> + * @dev: PCI device to operate on
> + * @nr:  Interrupt vector index (0-based)
> + *
> + * @nr has the following meanings depending on the interrupt mode:
> + *   MSI-X:  The index in the MSI-X vector table
> + *   MSI:The index of the enabled MSI vectors
> + *   INTx:   Must be 0
> + *
> + * Return: The Linux interrupt number or -EINVAl if @nr is out of range.
>   */
>  int pci_irq_vector(struct pci_dev *dev, unsigned int nr)
>  {
>   if (dev->msix_enabled) {
>   struct msi_desc *entry;
> - int i = 0;
>  
>   for_each_pci_msi_entry(entry, dev) {
> - if (i == nr)
> + if (entry->msi_attrib.entry_nr == nr)
>   return entry->irq;
> - i++;
>   }
>   WARN_ON_ONCE(1);
>   return -EINVAL;
> @@ -1223,17 +1228,22 @@ EXPORT_SYMBOL(pci_irq_vector);
>   * pci_irq_get_affinity - return the affinity of a particular MSI vector
>   * @dev: PCI device to operate on
>   * @nr:  device-relative interrupt vector index (0-based).
> + *
> + * @nr has the following meanings depending on the interrupt mode:
> + *   MSI-X:  The index in the MSI-X vector table
> + *   MSI:The index of the enabled MSI vectors
> + *   INTx:   Must be 0
> + *
> + * Return: A cpumask pointer or NULL if @nr is out of range
>   */
>  const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
>  {
>   if (dev->msix_enabled) {
>   struct msi_desc *entry;
> - int i = 0;
>  
>   for_each_pci_msi_entry(entry, dev) {
> - if (i == nr)
> + if (entry->msi_attrib.entry_nr == nr)
>   return >affinity->mask;
> - i++;
>   }
>   WARN_ON_ONCE(1);
>   return NULL;
> 



Re: [patch V2 01/23] powerpc/4xx: Remove MSI support which never worked

2021-12-07 Thread Thomas Gleixner
Cedric,

On Tue, Dec 07 2021 at 16:50, Cédric Le Goater wrote:
> On 12/7/21 12:36, Michael Ellerman wrote:
>> 
>> This patch should drop those selects I guess. Can you send an
>> incremental diff for Thomas to squash in?
>
> Sure.
>
>> Removing all the tendrils in various device tree files will probably
>> require some archaeology, and it should be perfectly safe to leave those
>> in the tree with the driver gone. So I think we can do that as a
>> subsequent patch, rather than in this series.
>
> Here are the changes. Compiled tested with ppc40x and ppc44x defconfigs.

< Lots of patch skipped />
> @@ -141,7 +138,6 @@ config REDWOOD
>   select FORCE_PCI
>   select PPC4xx_PCI_EXPRESS
>   select PCI_MSI
> - select PPC4xx_MSI
>   help
> This option enables support for the AMCC PPC460SX Redwood board.

While that is incremental it certainly is worth a patch on it's
own. Could you add a proper changelog and an SOB please?

Thanks,

tglx



[xen-4.16-testing test] 167218: tolerable FAIL - PUSHED

2021-12-07 Thread osstest service owner
flight 167218 xen-4.16-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167218/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 166959
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 166959
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 166959
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 166959
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 166959
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 166959
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 166959
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 166959
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 166959
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 166959
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 166959
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 166959
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  2dcea9c94c59cf2a9c7ee824a573f7c1b864b5d5
baseline version:
 xen  

Re: [PATCH] xen/arm: Add Kconfig parameter for memory banks number

2021-12-07 Thread Julien Grall

Hi Bertrand and Luca,

On 07/12/2021 11:09, Bertrand Marquis wrote:

On 7 Dec 2021, at 10:52, Luca Fancellu  wrote:




On 6 Dec 2021, at 17:05, Julien Grall  wrote:

Hi Luca,

On 06/12/2021 15:37, Luca Fancellu wrote:

Currently the maximum number of memory banks is fixed to
128, but on some new platforms that have a large amount
of memory, this value is not enough




Hi Julien,


Can you provide some information on the setup? Is it using UEFI?


Yes it is a platform with 32gb of ram, I’ve built Xen with ACPI support and 
it’s starting using UEFI+ACPI.

Thanks! That makes more sense now. Although...






and prevents Xen
from booting.


AFAIK, the restriction should only prevent Xen to use all the memory. If that's 
not the case, then this should be fixed.


The code that it’s failing is this, inside efi_process_memory_map_bootinfo(…) 
in the arch/arm/efi/efi-boot.h:

#ifdef CONFIG_ACPI
else if ( desc_ptr->Type == EfiACPIReclaimMemory )
{
if ( !meminfo_add_bank(, desc_ptr) )
{
PrintStr(L"Error: All " __stringify(NR_MEM_BANKS)
  " acpi meminfo mem banks exhausted.\r\n");
return EFI_LOAD_ERROR;
}
}
#endif


... I was expecting bootinfo.mem to be filled rather than bootinfo.acpi:

if ( !meminfo_add_bank(, desc_ptr) )
{
PrintStr(L"Warning: All " __stringify(NR_MEM_BANKS)
  " bootinfo mem banks exhausted.\r\n");
break;
}

It is actually quite surprising that you end up with more than 128 
regions here.







Create a Kconfig parameter to set the value, by default
128.


I think Xen should be able to boot on any platform with the default 
configuration. So the value should at least be bumped.


Signed-off-by: Luca Fancellu 
---
xen/arch/arm/Kconfig| 8 
xen/include/asm-arm/setup.h | 2 +-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index ecfa6822e4d3..805e3c417e89 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -25,6 +25,14 @@ menu "Architecture Features"
   source "arch/Kconfig"
+config MEM_BANKS
+   int "Maximum number of memory banks."
+   default "128"
+   help
+ Controls the build-time size memory bank array.
+ It is the upper bound of the number of logical entities describing
+ the memory.


NR_MEM_BANKS is going to be used by multiple internal structure in Xen (e.g. 
static memory, reserved memory, normal memory). So how could an admin decide 
the correct value?

In particular for UEFI, we are at the mercy of the firmware that can expose any 
kind of memory map (that's why we had to increase the original number of banks).

So maybe it is time for us to move out from a static array and re-think how we 
discover the memory.

That this is probably going to take some time to get it properly, so
I would be OK with bumping the value + a config gated UNSUPPORTED.



Looking at what we have, the memory is actually fragmented by ACPI but a long 
term solution could be to make the code a little bit more smart and try to 
merge together adjacent banks.


That could work, but I think we could get rid of bootinfo.acpi completely.

If I am not mistaken, bootinfo.acpi is only used by Xen to figure out 
how to create the EFI memory map for dom0. So this could be replaced 
with a walk of the UEFI memory map.


Looking at the code, we have already some boiler plate for x86 to pass 
the UEFI memory map from the stub to Xen. They would need need to be 
adjusted for Arm to:


   * Enable ebmalloc() (see TODOs in common/efi/ebmalloc.c)
   * Switch efi_arch_allocate_mmap_buffer() to use ebmalloc()
   * Adjust the pointers (see end of the efi_exit_boot()). I think we 
would want to pass the physical and let the actual Xen to translate to a 
virtual address




I would suggest to just increase the existing define to 256 to fix the current 
issue (which might be encountered by anybody using ACPI) and put a comment in 
the code for now with a TODO explaining why we currently need such a high value 
and what should be done to fix this.


I am fine with simply bumping the value (the actual array doesn't take a 
lot of space and will be freed after boot).


Long term, I think it would be better if we start to reduce the number 
of bootinfo.mem* and removing any hardcoded size.


When using UEFI, we could replace with the UEFI memory map. For non-UEFI 
DT boot, we would still need to create the memory map by hand to avoid 
walking the DT every time.


Cheers,

--
Julien Grall



Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-07 Thread Julien Grall

Hi Jan,

On 07/12/2021 09:55, Jan Beulich wrote:

--- a/xen/arch/arm/arm64/entry.S
+++ b/xen/arch/arm/arm64/entry.S
@@ -109,8 +109,16 @@
    * If 0, we rely on the on x0/x1 to have been saved at the correct
    * position on the stack before.
    */
-    .macro  entry, hyp, compat, save_x0_x1=1
+    .macro  entry, hyp, compat=0, save_x0_x1=1
   sub sp, sp, #(UREGS_SPSR_el1 - UREGS_LR) /* CPSR, PC, SP, LR */
+
+    /* Zero the upper 32 bits of the registers when switching to AArch32 */
+    .if \compat == 1  /* AArch32 mode */
+    .irp 
nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
+    mov w\nr, w\nr
+    .endr
+    .endif


So Jan mentioned, the x0/x1 may have already been saved. So you may need to 
fetch them from the stack and then clobber the top 32-bit.


So I would do the following:
-fetch x0/x1 from the stack
-clobber them
-store them again on the stack

/*
  * Zero the upper 32 bits of the gp registers when switching
  * from AArch32.
  */
.if \compat == 1  /* AArch32 mode */

/* x0/x1 have already been saved so fetch them to zero top 32 bits */
.if \save_x0_x1 == 0
ldp x0, x1, [sp], #(UREGS_kernel_sizeof - UREGS_X0)
.endif

.irp 
nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
mov w\nr, w\nr
.endr

.if \save_x0_x1 == 0
stp x0, x1, [sp, #-(UREGS_kernel_sizeof - UREGS_X0)]
.endif

.endif


Wouldn't it be more efficient to store 32 bits of zero each into the
high halves of the respective stack slots? Afaict same code size, but
less memory / cache traffic.


It would indeed be more efficient.


Plus it would avoid the latent issue of
a user of the macro actually expecting the two registers to retain
their values across the macro invocation.


While this is not explicitely written, a caller cannot expect the 
registers to be preserved by this macro.




I'm also puzzled by the two different memory addressing forms, but I
can easily see that I may be lacking enough Arm knowledge there.


I agree this is quite puzzling. The first one would update 'sp' after 
loading the register but I don't quite understand why it is necessary.


Assuming the this is happening before the instruction 'sub sp, sp, ...', 
then we should only need to load/store from sp with an offset (i.e. the 
second form).


Cheers,

--
Julien Grall



Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-07 Thread Julien Grall




On 07/12/2021 08:37, Michal Orzel wrote:

Hi Julien,


Hi,


On 06.12.2021 16:29, Julien Grall wrote:

Hi,

On 06/12/2021 14:20, Michal Orzel wrote:

to hypervisor when switching to AArch32 state.


I will change to "from AArch32 state".

According to section D1.20.2 of Arm Arm(DDI 0487A.j):
"If the general-purpose register was accessible from AArch32 state the
upper 32 bits either become zero, or hold the value that the same
architectural register held before any AArch32 execution.
The choice between these two options is IMPLEMENTATIONDEFINED"


Typo: Missing space between IMPLEMENTATION and DEFINED.


Ok.


Currently Xen does not ensure that the top 32 bits are zeroed and this
needs to be fixed.


Can you outline why this is a problem and why we need to protect? IIRC, the 
main concern is Xen may misinterpret what the guest requested but we are not 
concerned about Xen using wrong value.


I would say:
"
The reason why this is a problem is that there are places in Xen where we 
assume that top 32bits are zero for AArch32 guests.
If they are not, this can lead to misinterpretation of Xen regarding what the 
guest requested.
For example hypercalls returning an error encoded in a signed long like 
do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
if the command passed as the first argument was clobbered,
"


Fix this bug by zeroing the upper 32 bits of these registers on an
entry to hypervisor when switching to AArch32 state.

Set default value of parameter compat of macro entry to 0 (AArch64 mode
as we are on 64-bit hypervisor) to avoid checking if parameter is blank
when not passed.


Which error do you see otherwise? Is it a compilation error?


Yes, this is a compilation error. The errors appear at each line when "entry" is called 
without passing value for "compat".
So basically in all the places where entry is called with hyp=1.
When taking the current patch and removing default value for compat you will 
get:
```
entry.S:254: Error: ".endif" without ".if"
entry.S:258: Error: symbol `.if' is already defined
entry.S:258: Error: ".endif" without ".if"
entry.S:262: Error: symbol `.if' is already defined
entry.S:262: Error: ".endif" without ".if"
entry.S:266: Error: symbol `.if' is already defined
entry.S:266: Error: ".endif" without ".if"
entry.S:278: Error: symbol `.if' is already defined
entry.S:278: Error: ".endif" without ".if"
entry.S:292: Error: symbol `.if' is already defined
entry.S:292: Error: ".endif" without ".if"
entry.S:317: Error: symbol `.if' is already defined
entry.S:317: Error: ".endif" without ".if"
```


Thanks for input. I am concerned with your suggested approach (or using 
.if 0\compat as suggested by Jan) because they allow the caller to not 
properly specify compat when hyp=0. The risk here is we may generate the 
wrong entry.


compat should need to be specified when hyp=1 as we will always run in 
aarch64 mode. So could we protect this code with hyp=0?


Cheers,

--
Julien Grall



Re: [PATCH] xen/arm: Do not include in the image functions...

2021-12-07 Thread Julien Grall

Hi,

On 07/12/2021 07:10, Michal Orzel wrote:

On 06.12.2021 17:40, Julien Grall wrote:



On 06/12/2021 15:00, Michal Orzel wrote:

Hi Julien,


Hi Michal,


On 06.12.2021 15:39, Julien Grall wrote:

Hi Michal,

On 06/12/2021 14:19, Michal Orzel wrote:

vtimer_update_irqs, vtimer_update_irq and vcpu_update_evtchn_irq if
CONFIG_NEW_VGIC is not set.

enter_hypervisor_from_guest is protecting calls to these functions
with CONFIG_NEW_VGIC but their definitions and declarations are not > 
protected. This means that we are including them in the image even
though we are not making use of them. Fix that.


While I agree, they are only used by the new vGIC, the implementation of the 
functions are not. So I don't think they should be protected by CONFIG_NEW_VGIC.

Actually, I am not convinced they should be protected. But I guess you did that 
for a reason. Would you be able to clarify what is your reason?


  From what I know + what the commit introducing these fucntions states 
(b9db96f71a74), the current vGIC does not handle level-triggered vIRQs.
The functionality of these functions is only related to new VGIC implementation 
which can handle level triggered vIRQs.


This is a known error in the vGIC implementation which should be resolved 
before this leads to a disaster.


I just thought that if this error is present for such a long time, there are no 
plans to make current vgic handle level type irqs.


The error has been present for such a long time because no-one yet 
volunteered to properly fix it. That said, the new vGIC has been an 
attempt to resolve it but the effort kind of died.


It would be nice to revive the effort.

Cheers,

--
Julien Grall



Re: [patch V2 29/36] PCI/MSI: Simplify pci_irq_get_affinity()

2021-12-07 Thread Cédric Le Goater

Thomas,

On 12/6/21 23:39, Thomas Gleixner wrote:

Replace open coded MSI descriptor chasing and use the proper accessor
functions instead.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Greg Kroah-Hartman 
Reviewed-by: Jason Gunthorpe 
---
  drivers/pci/msi/msi.c |   26 ++
  1 file changed, 10 insertions(+), 16 deletions(-)

--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -1056,26 +1056,20 @@ EXPORT_SYMBOL(pci_irq_vector);
   */
  const struct cpumask *pci_irq_get_affinity(struct pci_dev *dev, int nr)
  {
-   if (dev->msix_enabled) {
-   struct msi_desc *entry;
+   int irq = pci_irq_vector(dev, nr);
+   struct msi_desc *desc;
  
-		for_each_pci_msi_entry(entry, dev) {

-   if (entry->msi_index == nr)
-   return >affinity->mask;
-   }
-   WARN_ON_ONCE(1);
+   if (WARN_ON_ONCE(irq <= 0))
return NULL;
-   } else if (dev->msi_enabled) {
-   struct msi_desc *entry = first_pci_msi_entry(dev);
  
-		if (WARN_ON_ONCE(!entry || !entry->affinity ||

-nr >= entry->nvec_used))
-   return NULL;
-
-   return >affinity[nr].mask;
-   } else {
+   desc = irq_get_msi_desc(irq);
+   /* Non-MSI does not have the information handy */
+   if (!desc)
return cpu_possible_mask;
-   }
+
+   if (WARN_ON_ONCE(!desc->affinity))
+   return NULL;
+   return >affinity[nr].mask;
  }
  EXPORT_SYMBOL(pci_irq_get_affinity);


This is breaking nvme on pseries but it's probably one of the previous
patches. I haven't figured out what's wrong yet. Here is the oops FYI.

Thanks,

C.

[   32.494536] [ cut here ]
[   32.494562] WARNING: CPU: 26 PID: 658 at kernel/irq/chip.c:210 
irq_startup+0x1c0/0x1e0
[   32.494575] Modules linked in: ibmvscsi ibmveth scsi_transport_srp bnx2x ipr 
libata xhci_pci xhci_hcd nvme xts vmx_crypto nvme_core mdio t10_pi libcrc32c 
dm_mirror dm_region_hash dm_log dm_mod
[   32.494601] CPU: 26 PID: 658 Comm: kworker/26:1H Not tainted 5.16.0-rc4-clg+ 
#54
[   32.494607] Workqueue: kblockd blk_mq_timeout_work
[   32.494615] NIP:  c0206f00 LR: c0206df0 CTR: c0201570
[   32.494619] REGS: c018050f3610 TRAP: 0700   Not tainted  
(5.16.0-rc4-clg+)
[   32.494624] MSR:  8282b033   CR: 
44002288  XER: 
[   32.494636] CFAR: c0206e0c IRQMASK: 1
[   32.494636] GPR00: c0206df0 c018050f38b0 c1ca2900 
0800
[   32.494636] GPR04: c1ce21b8 0800 0800 

[   32.494636] GPR08:  0200  
fffd
[   32.494636] GPR12:  c01fff7c5880 c018f488 
c0012faaba40
[   32.494636] GPR16:    
0001
[   32.494636] GPR20:  c018050f3c40 c076e110 
c0013ac23678
[   32.494636] GPR24: 007f 0100 0001 
c01805b08000
[   32.494636] GPR28: c00139b8cc18 0001 0001 
c00139b8cc00
[   32.494681] NIP [c0206f00] irq_startup+0x1c0/0x1e0
[   32.494686] LR [c0206df0] irq_startup+0xb0/0x1e0
[   32.494690] Call Trace:
[   32.494692] [c018050f38b0] [c0206df0] irq_startup+0xb0/0x1e0 
(unreliable)
[   32.494699] [c018050f38f0] [c020155c] __enable_irq+0x9c/0xb0
[   32.494705] [c018050f3950] [c02015d0] enable_irq+0x60/0xc0
[   32.494710] [c018050f39d0] [c00814a54ae8] 
nvme_poll_irqdisable+0x80/0xc0 [nvme]
[   32.494719] [c018050f3a00] [c00814a55824] nvme_timeout+0x18c/0x420 
[nvme]
[   32.494726] [c018050f3ae0] [c076e1b8] 
blk_mq_check_expired+0xa8/0x130
[   32.494732] [c018050f3b10] [c07793e8] bt_iter+0xd8/0x120
[   32.494737] [c018050f3b60] [c077a34c] 
blk_mq_queue_tag_busy_iter+0x25c/0x3f0
[   32.494742] [c018050f3c20] [c076ffa4] 
blk_mq_timeout_work+0x84/0x1a0
[   32.494747] [c018050f3c70] [c0182a78] 
process_one_work+0x2a8/0x5a0
[   32.494754] [c018050f3d10] [c0183468] worker_thread+0xa8/0x610
[   32.494759] [c018050f3da0] [c018f634] kthread+0x1b4/0x1c0
[   32.494764] [c018050f3e10] [c000cd64] 
ret_from_kernel_thread+0x5c/0x64
[   32.494769] Instruction dump:
[   32.494773] 6000 0b03 38a0 7f84e378 7fc3f378 4bff9a55 6000 
7fe3fb78
[   32.494781] 4bfffd79 eb810020 7c7e1b78 4bfffe94 <0fe0> 6000 6000 
6042
[   32.494788] ---[ end trace 2a27b87f2b3e7a1f ]---
[   32.494798] nvme nvme0: I/O 192 QID 128 timeout, aborting
[   32.584562] nvme nvme0: Abort status: 0x0
[   62.574526] nvme nvme0: I/O 200 QID 128 timeout, aborting
[   62.574587]  nvme0n1: p1




[xen-4.15-testing test] 167217: tolerable FAIL - PUSHED

2021-12-07 Thread osstest service owner
flight 167217 xen-4.15-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167217/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds18 guest-start/debian.repeat fail REGR. vs. 166387

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 166387
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 166387
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 166387
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 166387
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 166387
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 166387
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 166387
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 166387
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 166387
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 166387
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 166387
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 166387
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass


Re: [patch V2 01/23] powerpc/4xx: Remove MSI support which never worked

2021-12-07 Thread Cédric Le Goater

On 12/7/21 12:36, Michael Ellerman wrote:

Cédric Le Goater  writes:

Hello Thomas,

On 12/6/21 23:27, Thomas Gleixner wrote:

This code is broken since day one. ppc4xx_setup_msi_irqs() has the
following gems:

   1) The handling of the result of msi_bitmap_alloc_hwirqs() is completely
  broken:
  
  When the result is greater than or equal 0 (bitmap allocation

  successful) then the loop terminates and the function returns 0
  (success) despite not having installed an interrupt.

  When the result is less than 0 (bitmap allocation fails), it prints an
  error message and continues to "work" with that error code which would
  eventually end up in the MSI message data.

   2) On every invocation the file global pp4xx_msi::msi_virqs bitmap is
  allocated thereby leaking the previous one.

IOW, this has never worked and for more than 10 years nobody cared. Remove
the gunk.

Fixes: 3fb7933850fa ("powerpc/4xx: Adding PCIe MSI support")


Shouldn't we remove all of it ? including the updates in the device trees
and the Kconfig changes under :

arch/powerpc/platforms/44x/Kconfig: select PPC4xx_MSI
arch/powerpc/platforms/44x/Kconfig: select PPC4xx_MSI
arch/powerpc/platforms/44x/Kconfig: select PPC4xx_MSI
arch/powerpc/platforms/44x/Kconfig: select PPC4xx_MSI
arch/powerpc/platforms/40x/Kconfig: select PPC4xx_MSI


This patch should drop those selects I guess. Can you send an
incremental diff for Thomas to squash in?


Sure.


Removing all the tendrils in various device tree files will probably
require some archaeology, and it should be perfectly safe to leave those
in the tree with the driver gone. So I think we can do that as a
subsequent patch, rather than in this series.


Here are the changes. Compiled tested with ppc40x and ppc44x defconfigs.

Thanks,

C.

diff --git a/arch/powerpc/boot/dts/bluestone.dts 
b/arch/powerpc/boot/dts/bluestone.dts
index aa1ae94cd776..6971595319c1 100644
--- a/arch/powerpc/boot/dts/bluestone.dts
+++ b/arch/powerpc/boot/dts/bluestone.dts
@@ -366,30 +366,5 @@ PCIE0: pcie@d {
0x0 0x0 0x0 0x3  0xe 0x4 /* swizzled int C 
*/
0x0 0x0 0x0 0x4  0xf 0x4 /* swizzled int D 
*/>;
};
-
-   MSI: ppc4xx-msi@C1000 {
-   compatible = "amcc,ppc4xx-msi", "ppc4xx-msi";
-   reg = < 0xC 0x1000 0x100
-   0xC 0x1000 0x100>;
-   sdr-base = <0x36C>;
-   msi-data = <0x4440>;
-   msi-mask = <0xffe0>;
-   interrupts =<0 1 2 3 4 5 6 7>;
-   interrupt-parent = <>;
-   #interrupt-cells = <1>;
-   #address-cells = <0>;
-   #size-cells = <0>;
-   msi-available-ranges = <0x0 0x100>;
-   interrupt-map = <
-   0  0x18 1
-   1  0x19 1
-   2  0x1A 1
-   3  0x1B 1
-   4  0x1C 1
-   5  0x1D 1
-   6  0x1E 1
-   7  0x1F 1
-   >;
-   };
};
 };
diff --git a/arch/powerpc/boot/dts/canyonlands.dts 
b/arch/powerpc/boot/dts/canyonlands.dts
index c5fbb08e0a6e..5db1bff6b23d 100644
--- a/arch/powerpc/boot/dts/canyonlands.dts
+++ b/arch/powerpc/boot/dts/canyonlands.dts
@@ -544,23 +544,5 @@ PCIE1: pcie@d2000 {
0x0 0x0 0x0 0x3  0x12 0x4 /* swizzled int 
C */
0x0 0x0 0x0 0x4  0x13 0x4 /* swizzled int D 
*/>;
};
-
-   MSI: ppc4xx-msi@C1000 {
-   compatible = "amcc,ppc4xx-msi", "ppc4xx-msi";
-   reg = < 0xC 0x1000 0x100>;
-   sdr-base = <0x36C>;
-   msi-data = <0x>;
-   msi-mask = <0x>;
-   interrupt-count = <3>;
-   interrupts = <0 1 2 3>;
-   interrupt-parent = <>;
-   #interrupt-cells = <1>;
-   #address-cells = <0>;
-   #size-cells = <0>;
-   interrupt-map = <0  0x18 1
-   1  0x19 1
-   2  0x1A 1
-   3  0x1B 1>;
-   };
};
 };
diff --git a/arch/powerpc/boot/dts/katmai.dts b/arch/powerpc/boot/dts/katmai.dts
index a8f353229fb7..4262b2bbd6de 100644
--- a/arch/powerpc/boot/dts/katmai.dts
+++ b/arch/powerpc/boot/dts/katmai.dts
@@ -442,24 +442,6 @@ PCIE2: pcie@d4000 {
0x0 0x0 0x0 0x4  0xb 0x4 /* swizzled int D 
*/>;
};

Re: [RFC 1/1] xen/arm: set iommu property for IOMMU-protected devices

2021-12-07 Thread Sergiy Kibrik

hi Stefano, Julien,

On 11/10/21 11:12 пп, Stefano Stabellini wrote:

On Mon, 8 Nov 2021, Julien Grall wrote:

[..]

A few years ago, I attempted to disable the swiotlb when Xen configured the
IOMMU for the device (see [1]). Did you have a chance to go through the
thread? In particular, I think Ian Campbell suggestion about creating an IOMMU
binding is quite interesting.

Stefano, what do you think?


Yes I think it is a good idea. In fact, thinking more about it, it is
really the best option. Regardless of the implementation (swiotlb or
whatever) the device tree description is likely to look similar to the
description of an IOMMU because it is the common pattern shared by all
controllers (reset, power, clocks, etc.) so it makes sense to re-use it.

- there is one controller node (the "IOMMU")
- there is one property under each device node that is protected,
   pointing to the controller with a phandle and optional parameters (in
   the case of IOMMUs it is called "iommus")



Code in arch_setup_dma_ops() always forces swiotlb for dom0 regardless 
of any prior IOMMU configuration for the device. So if we are to re-use 
IOMMU bindings and implement kind of dummy iommu (that merely does 
direct allocation and mapping) we'll have to check whether device is 
protected anyway, e.g.:


  diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
  index 49f566ad9acb..6ddef3233095 100644
  --- a/arch/arm/xen/enlighten.c
  +++ b/arch/arm/xen/enlighten.c
  @@ -425,6 +425,10 @@ static int __init xen_pm_init(void)
   }
   late_initcall(xen_pm_init);

  +bool xen_is_device_protected(struct device *dev) {
  + return dev->dma_ops == _xen_iommu_ops;
  +}

   /* empty stubs */
   void xen_arch_pre_suspend(void) { }


Have I got it right?

 - Sergiy



Re: xen: linker symbol mess, and freeing errors

2021-12-07 Thread Jan Beulich
On 03.12.2021 14:34, Andrew Cooper wrote:
> Hello,
> 
> Following the __ro_after_init work, I tried to complete a few pieces of
> cleanup that I'd accrued, and everything has unravelled.
> 
> On x86, the __2M_* symbols haven't really been 2M aligned since their
> introduction, and the utter mess that was _stext starting at 1M has long
> since been cleared up.  Dropping the 2M prefix reveals that we have both
> __init_{start,begin} and identifying that lead to discovering that
> 
>     /* Destroy Xen's mappings, and reuse the pages. */
>     if ( using_2M_mapping() )
>     {
>     start = (unsigned long)&__2M_init_start,
>     end   = (unsigned long)&__2M_init_end;
>     }
>     else
>     {
>     start = (unsigned long)&__init_begin;
>     end   = (unsigned long)&__init_end;
>     }
> 
> is a tautology that nothing is capable of optimising.

Interesting. I would assume it wasn't always that way, but clearly it
is now.

> So I set about trying to simply both x86 and ARM down to a single sets
> of bounding variables, with a requirement that these would be expected
> to be common across all architectures.
> 
> I'm intending to use __$FOO_{start,end} because we're semi-consistent on
> this already, and get rid of the ones such as _{s,e}$FOO because they're
> unnecessarily obscure, and complicated to read for a compound foo.
> 
> At this point (as I haven't really started yet), I could be persuaded on
> a different naming scheme if anyone has any strong views.

Imo that scheme is fine. _{s,e}$FOO have always seemed risky in terms
of name clashes / confusion to me, but I've assumed we use them for
being pretty standard and hence recognized by certain tools.

> But that's only the start of the fun.  The is_kernel() predicate is
> broken (or at least problematic) because it covers the init section. 

I'd say problematic. We may want to have is_active_kernel()
paralleling other is_active_...(); whether a need for is_kernel()
would then remain is to be seen.

Jan




Re: Call for agenda items for December 7th Community Call @ 1500 UTC

2021-12-07 Thread Julien Grall




On 02/12/2021 20:24, Ashley Weltz wrote:

Hi everyone,


Hi Ahsley,



Our next meeting is on Tuesday, December 7th at 1500 UTC.


The calendar invitation is for 4PM UTC. Can you clarify which time is it?

Cheers,

--
Julien Grall



Re: [XEN PATCH 55/57] tools/xenstore: introduce Makefile.common to be used by stubdom

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

Also change stubdom to depends on Makefile.common.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH 54/57] tools/libs: create Makefile.common to be used by stubdom build system

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

This new "Makefile.common" is intended to be used by both tools/ and
stubdom/ build system without stubdom needed to use tools/ build
system.

It should contain the necessary list of objects and CFLAGS needed to
build a static library.

Change stubdom/ to check Makefile.common, for the linkfarm.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH v2 1/7] xz: add fall-through comments to a switch statement

2021-12-07 Thread Jan Beulich
On 06.12.2021 14:30, Jan Beulich wrote:
> From: Lasse Collin 
> 
> It's good style. I was also told that GCC 7 is more strict and might
> give a warning when such comments are missing.
> 
> Suggested-by: Andrei Borzenkov 
> Signed-off-by: Lasse Collin 
> Signed-off-by: Jiri Kosina 
> [Linux commit: 5a244f48ecbbd03a11eb84819c5c599db81823ee]
> Signed-off-by: Jan Beulich 
> Acked-by: Julien Grall 

Because of the ongoing discussion on the respective v1 sub-thread I
have dropped this ack. It was suggested that I may have misunderstood
and it was actually withdrawn.

Jan




Re: [XEN PATCH 49/57] libs/toolcore: don't install xentoolcore_internal.h anymore

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

With "xentoolcore_internal.h" been in LIBHEADER, it was installed. But
its dependency "_xentoolcore_list.h" wasn't installed so the header
couldn't be used anyway.

This patch also mean that the rule "headers.chk" doesn't check it
anymore as well.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH 48/57] libs/stat: Fix and rework perl-binding build

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

For PERL_FLAGS, use make's shell rather than a backquote.

Rather than relying on the VCS to create an empty directory for us,
we can create one before generating the *.c file for the bindings.

Make use of generic variable names to build a shared library from a
source file: CFLAGS, LDFLAGS, and LDLIBS.

To build a shared library, we need to build the source file with
"-fPIC", which was drop by 6d0ec05390 (tools: split libxenstat into
new tools/libs/stat directory).

The source file generated by swig seems to be missing many prototype for
many functions, so we need "-Wno-missing-prototypes" in order to
build it. Also, one of the prototype is deemed malformed, so we also
need "-Wno-strict-prototypes".

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH 47/57] libs/stat: Fix and rework python-bindings build

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

Fix the dependency on the library, $(SHLIB) variable doesn't exist
anymore.

Rework dependency on the include file, we can let `swig` generate the
dependency for us with the use of "-M*" flags.

The xenstat.h file has moved so we need to fix the include location.

Rather than relaying on the VCS to create an empty directory for us,
we can create one before generating the *.c file for the bindings.

Make use of generic variable names to build a shared library from a
source file: CFLAGS, LDFLAGS, and LDLIBS.

Fix python's specific *flags by using python-config, and add them to
generic flags variables: CFLAGS, LDLIBS.

To build a shared library, we need to build the source file with
"-fPIC", which was drop by 6d0ec05390 (tools: split libxenstat into
new tools/libs/stat directory).

The source file generated by swig seems to be missing a prototype for
the "init" function, so we need "-Wno-missing-prototypes" in order to
build it.

Add some targets to .PHONY.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH 3/3] x86/boot: Don't double-copy the stack during physical relocation

2021-12-07 Thread Jan Beulich
On 07.12.2021 11:53, Andrew Cooper wrote:
> cpu0_stack is contained within .data, which means the memcpy() already takes a
> snapshot at the start of the critical region.
> 
> Later, when we switch to the relocated Xen, we do end up losing updates to the
> local variables,

You word this as if that was the case already before your change, ...

> but that's fine because the only variables we've modified go
> out of scope after the printk().  Use this properly to avoid copying the whole
> stack (32k) a second time.

... not the least because of "Use this properly ...". But aiui this
is only a result of your change, in that you no longer "re-sync the
stack" (as the comment says that you remove).

I can appreciate that copying 32k is too much. Yet I do think that
we're putting ourselves at risk if we rely on local variables going
out of scope which have been updated. Even more so with no comment
next to their declarations making clear that it is imperative for
them to not move to outer scopes. (Seeing e.g. another "i" and "j"
in the outer scope, one might actually be inclined to do so. For
"j" this would, afaict, even work, while "i" collides with the outer
one no matter what.)

You mentioning the printk() - did you consider someone, perhaps just
for debugging purposes, adding uses there of some of these local
variables?

We could limit the copying to just the primary stack(s), for
example. Otoh I'm not convinced trying to save 20k or even 32k of
copied memory is worth it in this boot-time code. There are larger
gains to be had elsewhere - see e.g. my "x86: memcpy() / memset()
(non-)ERMS flavors plus fallout" and "IOMMU: superpage support when
not sharing pagetables" which continue to be pending.

> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1183,6 +1183,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  l3_pgentry_t *pl3e;
>  l2_pgentry_t *pl2e;
>  int i, j, k;
> +unsigned long tmp;
>  
>  /* Select relocation address. */
>  xen_phys_start = end - reloc_size;
> @@ -1193,7 +1194,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>   * Perform relocation to new physical address.
>   * Before doing so we must sync static/global data with main 
> memory
>   * with a barrier(). After this we must *not* modify 
> static/global
> - * data until after we have switched to the relocated pagetables!
> + * data, or locals that need to survive, until after we have
> + * switched to the relocated pagetables!
>   */
>  barrier();
>  memcpy(__va(__pa(_start)), _start, _end - _start);
> @@ -1239,18 +1241,21 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) + 
> xen_phys_start);
>  }
>  
> -/* Re-sync the stack and then switch to relocated pagetables. */
> +/*
> + * Switch to relocated pagetables.  This also discards updates to
> + * any local variables since the memmove() call above, but that's
> + * fine because we don't use any of them again.
> + */
>  asm volatile (
> -"rep movsq; " /* re-sync the stack */
> -"movq %%cr4,%%rsi ; "
> -"andb $0x7f,%%sil ; "
> -"movq %%rsi,%%cr4 ; " /* CR4.PGE == 0 */
> -"movq %[pg],%%cr3 ; " /* CR3 == new pagetables */
> -"orb $0x80,%%sil  ; "
> -"movq %%rsi,%%cr4   " /* CR4.PGE == 1 */
> -: "=" (i), "=" (i), "=" (i) /* All outputs discarded. 
> */
> -:  [pg] "r" (__pa(idle_pg_table)), "0" (cpu0_stack),
> -   "1" (__va(__pa(cpu0_stack))), "2" (STACK_SIZE / 8)
> +"mov%%cr4, %[cr4]\n\t"
> +"andb   $~%c[pge], %b[cr4]\n\t"

Why not simply %[pge] (name subject to improvement) with the value
below becoming ~X86_CR4_PGE (and the constraint becoming "K")?

> +"mov%[cr4], %%cr4\n\t" /* CR4.PGE = 0 */
> +"mov%[cr3], %%cr3\n\t" /* CR3 = new pagetables */
> +"orb$%c[pge], %b[cr4]\n\t"

Oh, I see - the constant gets reused here.

Jan

> +"mov%[cr4], %%cr4\n\t" /* CR4.PGE = 1 */
> +: [cr4] "=" (tmp) /* Could be "r", but "a" makes better 
> asm */
> +: [cr3] "r" (__pa(idle_pg_table)),
> +  [pge] "i" (X86_CR4_PGE)
>  : "memory" );
>  
>  printk("New Xen image base address: %#lx\n", xen_phys_start);
> 




Re: [PATCH 1/7] xz: add fall-through comments to a switch statement

2021-12-07 Thread Jan Beulich
On 07.12.2021 13:00, Ian Jackson wrote:
> Jan Beulich writes ("Re: [PATCH 1/7] xz: add fall-through comments to a 
> switch statement"):
>> I'm unwilling only as long as I don't understand the need for them. As
>> indicated, while I appreciate your "make verification easier for
>> reviewers", I assign that at least no higher priority than my desire
>> to leave out inapplicable data.
> 
> Are we still talking about Signed-off-by ?
> 
> I explained the purpose of Signed-off-by.  It reflects the chain of
> custody.

And I accepted that (without claiming to truly understand things) as
far as the desire to prove all tags goes. Hence the v2 submission. To
me, as can be seen there, that chain of custody includes the original
patch submission, but not the flow through Linux subsystem trees.

>  It is true that s-o-b is often added by people with minimal
> contribution to the content; I don't think that is relevant.
> 
> If the Xen patch was derived from Linux commit XYZ (whether
> automatically or manually) then even those minimal S-o-b are
> applicable.

I'd like to adjust that to "If the Xen patch was claimed to have
been derived from Linux commit XYZ ..." I don't think I've made such
a claim anywhere.

>> I'd be happy for anyone else to start over. I would even ack such a
>> submission myself. But as long as I'm recorded with S-o-b, I'm afraid
>> I'm not going to accept re-addition of the tags for no good (as per my
>> personal view) reason. Otherwise, based on experience, the example of
>> this series could, in the future, be used to tell me that on an earlier
>> occasion I (seemingly) did things differently.
> 
> S-o-b does not indicate complete approval of the content.  It attests
> precisely to the statements in the DCO.  There is nothing irregular
> about putting your S-o-b on something you don't entirely agree with.

Isn't it up to me where I do put my S-o-b under, and where I'd rather
not?

> Stepping back:
> 
> In a collaborative project, we must all respect our peers and the
> community's decision.  That can mean actively progressing patches that
> we personally disagree with, but which the community has decided ought
> to be applied. [1]

The latter part aside (as I don't think I'm standing in the way of
getting these changes in), I don't see any "community decision" here.
What to do (or not) in cases like pulling in changes from elsewhere
is simply not documented anywhere. Hence all I could have gone from
are past examples. (I don't dare to guess what the correct thing to
do was if a change was to be taken from a project not using the S-o-b
model.)

>> As said earlier, if submissions in this form are going to be nak-ed
>> going forward, and if good reasons (see above) will not be provided
>> (and hence leeway will not be granted to the submitter) to support this,
>> then someone else will need to start looking after imports which may be
>> relevant to us.
> 
> The problem is simply one of verification.  So far there does not seem
> to be a positive ack[1] for this patch.  Neither I nor Julien are
> nacking it.

But, as Julien calls it, "open objections" are effectively preventing
the thing from going in. Otherwise Luca's R-b would be enough for the
whole lot to go in.

I did submit v2 with his ack on patch 1, in the belief that a) he didn't
actively withdraw it and b) I did address his concerns there. Earlier
today, seeing the ongoing discussion, I did drop it here (and I'll try
to remember to also reply to this effect to the patch itself).

> AIUI Julien does not want to ack it without being able to check that
> all the appropriate s-o-b (and perhaps other tags) are present.  I
> think that as the submitter it is really best if you make that easy.
> 
> 
> We think the obvious way to make that easy for a reviewer is to just
> copy the tags.  But an alternative would be to include, in the commit
> message, full details of where the patch came from in (including urls
> to mailing list articles) in such a way that a reviewer can see easily
> that all the necessary s-o-b are present.

This alternative is what v2 does: Already in v1 there were links
present to the original submissions, where available. I didn't think
there was any extra wording needed next to the links. v2 merely syncs
the tags with the patch submissions.

> [1] Of course very rarely there might be matters of conscience, where
> we have protested but our protest has been overruled by our peers.
> But that does not seem to be the case here since you are not giving a
> nak.
> 
> Ian.
> 
> [1] Julien did give one earlier but then wrote "actually" and started
> this subthread, so I think Julien has withdrawn his ack.

Well, okay, I didn't read it that way, especially since I did address
his concern there in v2. But as said, I've meanwhile dropped his ack
from there.

Jan




Re: [XEN PATCH 46/57] libs/store: use of -iquote instead of -I

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH 45/57] libs/guest: rework CFLAGS

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

Remove '-Werror -Wmissing-progress -I./include $(CFLAGS_xeninclude)',
those flags are already added via "libs.mk".

Flag "-I." isn't needed, we just need to fix the #include of
"xg_core.h" as this header isn't expected to be installed.

Make use of "-iquote" instead of '-I' for double-quote included
headers.

Also, regroup the CFLAGS into the same place in the makefile.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[PATCH v3 5/6] virtio: use ->handle_output() instead of ->handle_aio_output()

2021-12-07 Thread Stefan Hajnoczi
The difference between ->handle_output() and ->handle_aio_output() was
that ->handle_aio_output() returned a bool return value indicating
progress. This was needed by the old polling API but now that the bool
return value is gone, the two functions can be unified.

Signed-off-by: Stefan Hajnoczi 
---
 hw/virtio/virtio.c | 33 +++--
 1 file changed, 3 insertions(+), 30 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index c042be3935..a97a406d3c 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -125,7 +125,6 @@ struct VirtQueue
 
 uint16_t vector;
 VirtIOHandleOutput handle_output;
-VirtIOHandleOutput handle_aio_output;
 VirtIODevice *vdev;
 EventNotifier guest_notifier;
 EventNotifier host_notifier;
@@ -2300,20 +2299,6 @@ void virtio_queue_set_align(VirtIODevice *vdev, int n, 
int align)
 }
 }
 
-static void virtio_queue_notify_aio_vq(VirtQueue *vq)
-{
-if (vq->vring.desc && vq->handle_aio_output) {
-VirtIODevice *vdev = vq->vdev;
-
-trace_virtio_queue_notify(vdev, vq - vdev->vq, vq);
-vq->handle_aio_output(vdev, vq);
-
-if (unlikely(vdev->start_on_kick)) {
-virtio_set_started(vdev, true);
-}
-}
-}
-
 static void virtio_queue_notify_vq(VirtQueue *vq)
 {
 if (vq->vring.desc && vq->handle_output) {
@@ -2392,7 +2377,6 @@ VirtQueue *virtio_add_queue(VirtIODevice *vdev, int 
queue_size,
 vdev->vq[i].vring.num_default = queue_size;
 vdev->vq[i].vring.align = VIRTIO_PCI_VRING_ALIGN;
 vdev->vq[i].handle_output = handle_output;
-vdev->vq[i].handle_aio_output = NULL;
 vdev->vq[i].used_elems = g_malloc0(sizeof(VirtQueueElement) *
queue_size);
 
@@ -2404,7 +2388,6 @@ void virtio_delete_queue(VirtQueue *vq)
 vq->vring.num = 0;
 vq->vring.num_default = 0;
 vq->handle_output = NULL;
-vq->handle_aio_output = NULL;
 g_free(vq->used_elems);
 vq->used_elems = NULL;
 virtio_virtqueue_reset_region_cache(vq);
@@ -3509,14 +3492,6 @@ EventNotifier *virtio_queue_get_guest_notifier(VirtQueue 
*vq)
 return >guest_notifier;
 }
 
-static void virtio_queue_host_notifier_aio_read(EventNotifier *n)
-{
-VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
-if (event_notifier_test_and_clear(n)) {
-virtio_queue_notify_aio_vq(vq);
-}
-}
-
 static void virtio_queue_host_notifier_aio_poll_begin(EventNotifier *n)
 {
 VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
@@ -3536,7 +3511,7 @@ static void 
virtio_queue_host_notifier_aio_poll_ready(EventNotifier *n)
 {
 VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
 
-virtio_queue_notify_aio_vq(vq);
+virtio_queue_notify_vq(vq);
 }
 
 static void virtio_queue_host_notifier_aio_poll_end(EventNotifier *n)
@@ -3551,9 +3526,8 @@ void virtio_queue_aio_set_host_notifier_handler(VirtQueue 
*vq, AioContext *ctx,
 VirtIOHandleOutput handle_output)
 {
 if (handle_output) {
-vq->handle_aio_output = handle_output;
 aio_set_event_notifier(ctx, >host_notifier, true,
-   virtio_queue_host_notifier_aio_read,
+   virtio_queue_host_notifier_read,
virtio_queue_host_notifier_aio_poll,
virtio_queue_host_notifier_aio_poll_ready);
 aio_set_event_notifier_poll(ctx, >host_notifier,
@@ -3563,8 +3537,7 @@ void virtio_queue_aio_set_host_notifier_handler(VirtQueue 
*vq, AioContext *ctx,
 aio_set_event_notifier(ctx, >host_notifier, true, NULL, NULL, 
NULL);
 /* Test and clear notifier before after disabling event,
  * in case poll callback didn't have time to run. */
-virtio_queue_host_notifier_aio_read(>host_notifier);
-vq->handle_aio_output = NULL;
+virtio_queue_host_notifier_read(>host_notifier);
 }
 }
 
-- 
2.33.1




[PATCH v3 6/6] virtio: unify dataplane and non-dataplane ->handle_output()

2021-12-07 Thread Stefan Hajnoczi
Now that virtio-blk and virtio-scsi are ready, get rid of
the handle_aio_output() callback. It's no longer needed.

Signed-off-by: Stefan Hajnoczi 
---
 include/hw/virtio/virtio.h  |  4 +--
 hw/block/dataplane/virtio-blk.c | 16 ++
 hw/scsi/virtio-scsi-dataplane.c | 54 -
 hw/virtio/virtio.c  | 32 +--
 4 files changed, 26 insertions(+), 80 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index b90095628f..f095637058 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -316,8 +316,8 @@ bool virtio_device_ioeventfd_enabled(VirtIODevice *vdev);
 EventNotifier *virtio_queue_get_host_notifier(VirtQueue *vq);
 void virtio_queue_set_host_notifier_enabled(VirtQueue *vq, bool enabled);
 void virtio_queue_host_notifier_read(EventNotifier *n);
-void virtio_queue_aio_set_host_notifier_handler(VirtQueue *vq, AioContext *ctx,
-VirtIOHandleOutput handle_output);
+void virtio_queue_aio_attach_host_notifier(VirtQueue *vq, AioContext *ctx);
+void virtio_queue_aio_detach_host_notifier(VirtQueue *vq, AioContext *ctx);
 VirtQueue *virtio_vector_first_queue(VirtIODevice *vdev, uint16_t vector);
 VirtQueue *virtio_vector_next_queue(VirtQueue *vq);
 
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index a2fa407b98..49276e46f2 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -154,17 +154,6 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 g_free(s);
 }
 
-static void virtio_blk_data_plane_handle_output(VirtIODevice *vdev,
-VirtQueue *vq)
-{
-VirtIOBlock *s = (VirtIOBlock *)vdev;
-
-assert(s->dataplane);
-assert(s->dataplane_started);
-
-virtio_blk_handle_vq(s, vq);
-}
-
 /* Context: QEMU global mutex held */
 int virtio_blk_data_plane_start(VirtIODevice *vdev)
 {
@@ -258,8 +247,7 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
 for (i = 0; i < nvqs; i++) {
 VirtQueue *vq = virtio_get_queue(s->vdev, i);
 
-virtio_queue_aio_set_host_notifier_handler(vq, s->ctx,
-virtio_blk_data_plane_handle_output);
+virtio_queue_aio_attach_host_notifier(vq, s->ctx);
 }
 aio_context_release(s->ctx);
 return 0;
@@ -302,7 +290,7 @@ static void virtio_blk_data_plane_stop_bh(void *opaque)
 for (i = 0; i < s->conf->num_queues; i++) {
 VirtQueue *vq = virtio_get_queue(s->vdev, i);
 
-virtio_queue_aio_set_host_notifier_handler(vq, s->ctx, NULL);
+virtio_queue_aio_detach_host_notifier(vq, s->ctx);
 }
 }
 
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 76137de67f..29575cbaf6 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -49,45 +49,6 @@ void virtio_scsi_dataplane_setup(VirtIOSCSI *s, Error **errp)
 }
 }
 
-static void virtio_scsi_data_plane_handle_cmd(VirtIODevice *vdev,
-  VirtQueue *vq)
-{
-VirtIOSCSI *s = VIRTIO_SCSI(vdev);
-
-virtio_scsi_acquire(s);
-if (!s->dataplane_fenced) {
-assert(s->ctx && s->dataplane_started);
-virtio_scsi_handle_cmd_vq(s, vq);
-}
-virtio_scsi_release(s);
-}
-
-static void virtio_scsi_data_plane_handle_ctrl(VirtIODevice *vdev,
-   VirtQueue *vq)
-{
-VirtIOSCSI *s = VIRTIO_SCSI(vdev);
-
-virtio_scsi_acquire(s);
-if (!s->dataplane_fenced) {
-assert(s->ctx && s->dataplane_started);
-virtio_scsi_handle_ctrl_vq(s, vq);
-}
-virtio_scsi_release(s);
-}
-
-static void virtio_scsi_data_plane_handle_event(VirtIODevice *vdev,
-VirtQueue *vq)
-{
-VirtIOSCSI *s = VIRTIO_SCSI(vdev);
-
-virtio_scsi_acquire(s);
-if (!s->dataplane_fenced) {
-assert(s->ctx && s->dataplane_started);
-virtio_scsi_handle_event_vq(s, vq);
-}
-virtio_scsi_release(s);
-}
-
 static int virtio_scsi_set_host_notifier(VirtIOSCSI *s, VirtQueue *vq, int n)
 {
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
@@ -112,10 +73,10 @@ static void virtio_scsi_dataplane_stop_bh(void *opaque)
 VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
 int i;
 
-virtio_queue_aio_set_host_notifier_handler(vs->ctrl_vq, s->ctx, NULL);
-virtio_queue_aio_set_host_notifier_handler(vs->event_vq, s->ctx, NULL);
+virtio_queue_aio_detach_host_notifier(vs->ctrl_vq, s->ctx);
+virtio_queue_aio_detach_host_notifier(vs->event_vq, s->ctx);
 for (i = 0; i < vs->conf.num_queues; i++) {
-virtio_queue_aio_set_host_notifier_handler(vs->cmd_vqs[i], s->ctx, 
NULL);
+virtio_queue_aio_detach_host_notifier(vs->cmd_vqs[i], s->ctx);
 }
 }
 
@@ -176,14 +137,11 @@ int virtio_scsi_dataplane_start(VirtIODevice *vdev)
 memory_region_transaction_commit();
 
 

Re: [XEN PATCH 44/57] libs/guest: rename ELF_OBJS to LIBELF_OBJS

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

It seems a better name. Latter, we will introduce LIBX86_OBJS to


s/Latter/Later/


collect lib/x86/* objects.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[PATCH v3 4/6] virtio-scsi: prepare virtio_scsi_handle_cmd for dataplane

2021-12-07 Thread Stefan Hajnoczi
Prepare virtio_scsi_handle_cmd() to be used by both dataplane and
non-dataplane by making the condition for starting ioeventfd more
specific. This way it won't trigger when dataplane has already been
started.

Signed-off-by: Stefan Hajnoczi 
---
 hw/scsi/virtio-scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 51fd09522a..34a968ecfb 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -720,7 +720,7 @@ static void virtio_scsi_handle_cmd(VirtIODevice *vdev, 
VirtQueue *vq)
 /* use non-QOM casts in the data path */
 VirtIOSCSI *s = (VirtIOSCSI *)vdev;
 
-if (s->ctx) {
+if (s->ctx && !s->dataplane_started) {
 virtio_device_start_ioeventfd(vdev);
 if (!s->dataplane_fenced) {
 return;
-- 
2.33.1




[PATCH v3 1/6] aio-posix: split poll check from ready handler

2021-12-07 Thread Stefan Hajnoczi
Adaptive polling measures the execution time of the polling check plus
handlers called when a polled event becomes ready. Handlers can take a
significant amount of time, making it look like polling was running for
a long time when in fact the event handler was running for a long time.

For example, on Linux the io_submit(2) syscall invoked when a virtio-blk
device's virtqueue becomes ready can take 10s of microseconds. This
can exceed the default polling interval (32 microseconds) and cause
adaptive polling to stop polling.

By excluding the handler's execution time from the polling check we make
the adaptive polling calculation more accurate. As a result, the event
loop now stays in polling mode where previously it would have fallen
back to file descriptor monitoring.

The following data was collected with virtio-blk num-queues=2
event_idx=off using an IOThread. Before:

168k IOPS, IOThread syscalls:

  9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, 
nr: 16, iocbpp: 0x7fcb9f937db0)= 16
  9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, 
count: 8) = 8
  9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, 
count: 8) = 8
  9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 
4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3
  9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, 
count: 512)= 8
  9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, 
count: 512)= 8
  9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, 
count: 512)= 8
  9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, 
nr: 32, iocbpp: 0x7fca7d0cebe0)= 32

174k IOPS (+3.6%), IOThread syscalls:

  9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, 
nr: 32, iocbpp: 0x7fd0cdd62be0)= 32
  9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, 
count: 8) = 8
  9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, 
count: 8) = 8
  9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, 
nr: 32, iocbpp: 0x7fd0d0388b50)= 32

Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because
the IOThread stays in polling mode instead of falling back to file
descriptor monitoring.

As usual, polling is not implemented on Windows so this patch ignores
the new io_poll_read() callback in aio-win32.c.

Signed-off-by: Stefan Hajnoczi 
---
 include/block/aio.h  |  4 +-
 util/aio-posix.h |  1 +
 block/curl.c | 11 ++---
 block/export/fuse.c  |  4 +-
 block/io_uring.c | 19 +
 block/iscsi.c|  4 +-
 block/linux-aio.c| 16 +---
 block/nfs.c  |  6 +--
 block/nvme.c | 51 +++
 block/ssh.c  |  4 +-
 block/win32-aio.c|  4 +-
 hw/virtio/virtio.c   | 16 +---
 hw/xen/xen-bus.c |  6 +--
 io/channel-command.c |  6 ++-
 io/channel-file.c|  3 +-
 io/channel-socket.c  |  3 +-
 migration/rdma.c |  8 ++--
 tests/unit/test-aio.c|  4 +-
 util/aio-posix.c | 89 ++--
 util/aio-win32.c |  4 +-
 util/async.c | 10 -
 util/main-loop.c |  4 +-
 util/qemu-coroutine-io.c |  5 ++-
 util/vhost-user-server.c | 11 ++---
 24 files changed, 191 insertions(+), 102 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 47fbe9d81f..5634173b12 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -469,6 +469,7 @@ void aio_set_fd_handler(AioContext *ctx,
 IOHandler *io_read,
 IOHandler *io_write,
 AioPollFn *io_poll,
+IOHandler *io_poll_ready,
 void *opaque);
 
 /* Set polling begin/end callbacks for a file descriptor that has already been
@@ -490,7 +491,8 @@ void aio_set_event_notifier(AioContext *ctx,
 EventNotifier *notifier,
 bool is_external,
 EventNotifierHandler *io_read,
-AioPollFn *io_poll);
+AioPollFn *io_poll,
+EventNotifierHandler *io_poll_ready);
 
 /* Set polling begin/end callbacks for an event notifier that has already been
  * registered with aio_set_event_notifier.  Do nothing if the event notifier is
diff --git a/util/aio-posix.h b/util/aio-posix.h
index c80c04506a..7f2c37a684 100644
--- a/util/aio-posix.h
+++ b/util/aio-posix.h
@@ -24,6 +24,7 @@ struct AioHandler {
 IOHandler *io_read;
 IOHandler *io_write;
 AioPollFn *io_poll;
+

[PATCH v3 3/6] virtio-blk: drop unused virtio_blk_handle_vq() return value

2021-12-07 Thread Stefan Hajnoczi
The return value of virtio_blk_handle_vq() is no longer used. Get rid of
it. This is a step towards unifying the dataplane and non-dataplane
virtqueue handler functions.

Prepare virtio_blk_handle_output() to be used by both dataplane and
non-dataplane by making the condition for starting ioeventfd more
specific. This way it won't trigger when dataplane has already been
started.

Signed-off-by: Stefan Hajnoczi 
---
 include/hw/virtio/virtio-blk.h |  2 +-
 hw/block/virtio-blk.c  | 14 +++---
 2 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 29655a406d..d311c57cca 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -90,7 +90,7 @@ typedef struct MultiReqBuffer {
 bool is_write;
 } MultiReqBuffer;
 
-bool virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq);
+void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq);
 void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh);
 
 #endif
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f139cd7cc9..82676cdd01 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -767,12 +767,11 @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, 
MultiReqBuffer *mrb)
 return 0;
 }
 
-bool virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
+void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
 {
 VirtIOBlockReq *req;
 MultiReqBuffer mrb = {};
 bool suppress_notifications = virtio_queue_get_notification(vq);
-bool progress = false;
 
 aio_context_acquire(blk_get_aio_context(s->blk));
 blk_io_plug(s->blk);
@@ -783,7 +782,6 @@ bool virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
 }
 
 while ((req = virtio_blk_get_request(s, vq))) {
-progress = true;
 if (virtio_blk_handle_request(req, )) {
 virtqueue_detach_element(req->vq, >elem, 0);
 virtio_blk_free_request(req);
@@ -802,19 +800,13 @@ bool virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
 
 blk_io_unplug(s->blk);
 aio_context_release(blk_get_aio_context(s->blk));
-return progress;
-}
-
-static void virtio_blk_handle_output_do(VirtIOBlock *s, VirtQueue *vq)
-{
-virtio_blk_handle_vq(s, vq);
 }
 
 static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
 VirtIOBlock *s = (VirtIOBlock *)vdev;
 
-if (s->dataplane) {
+if (s->dataplane && !s->dataplane_started) {
 /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
  * dataplane here instead of waiting for .set_status().
  */
@@ -823,7 +815,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, 
VirtQueue *vq)
 return;
 }
 }
-virtio_blk_handle_output_do(s, vq);
+virtio_blk_handle_vq(s, vq);
 }
 
 void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh)
-- 
2.33.1




[PATCH v3 2/6] virtio: get rid of VirtIOHandleAIOOutput

2021-12-07 Thread Stefan Hajnoczi
The virtqueue host notifier API
virtio_queue_aio_set_host_notifier_handler() polls the virtqueue for new
buffers. AioContext previously required a bool progress return value
indicating whether an event was handled or not. This is no longer
necessary because the AioContext polling API has been split into a poll
check function and an event handler function. The event handler is only
run when we know there is work to do, so it doesn't return bool.

The VirtIOHandleAIOOutput function signature is now the same as
VirtIOHandleOutput. Get rid of the bool return value.

Further simplifications will be made for virtio-blk and virtio-scsi in
the next patch.

Signed-off-by: Stefan Hajnoczi 
---
 include/hw/virtio/virtio.h  |  3 +--
 hw/block/dataplane/virtio-blk.c |  4 ++--
 hw/scsi/virtio-scsi-dataplane.c | 18 ++
 hw/virtio/virtio.c  | 12 
 4 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 8bab9cfb75..b90095628f 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -175,7 +175,6 @@ void virtio_error(VirtIODevice *vdev, const char *fmt, ...) 
GCC_FMT_ATTR(2, 3);
 void virtio_device_set_child_bus_name(VirtIODevice *vdev, char *bus_name);
 
 typedef void (*VirtIOHandleOutput)(VirtIODevice *, VirtQueue *);
-typedef bool (*VirtIOHandleAIOOutput)(VirtIODevice *, VirtQueue *);
 
 VirtQueue *virtio_add_queue(VirtIODevice *vdev, int queue_size,
 VirtIOHandleOutput handle_output);
@@ -318,7 +317,7 @@ EventNotifier *virtio_queue_get_host_notifier(VirtQueue 
*vq);
 void virtio_queue_set_host_notifier_enabled(VirtQueue *vq, bool enabled);
 void virtio_queue_host_notifier_read(EventNotifier *n);
 void virtio_queue_aio_set_host_notifier_handler(VirtQueue *vq, AioContext *ctx,
-VirtIOHandleAIOOutput 
handle_output);
+VirtIOHandleOutput handle_output);
 VirtQueue *virtio_vector_first_queue(VirtIODevice *vdev, uint16_t vector);
 VirtQueue *virtio_vector_next_queue(VirtQueue *vq);
 
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index ee5a5352dc..a2fa407b98 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -154,7 +154,7 @@ void virtio_blk_data_plane_destroy(VirtIOBlockDataPlane *s)
 g_free(s);
 }
 
-static bool virtio_blk_data_plane_handle_output(VirtIODevice *vdev,
+static void virtio_blk_data_plane_handle_output(VirtIODevice *vdev,
 VirtQueue *vq)
 {
 VirtIOBlock *s = (VirtIOBlock *)vdev;
@@ -162,7 +162,7 @@ static bool 
virtio_blk_data_plane_handle_output(VirtIODevice *vdev,
 assert(s->dataplane);
 assert(s->dataplane_started);
 
-return virtio_blk_handle_vq(s, vq);
+virtio_blk_handle_vq(s, vq);
 }
 
 /* Context: QEMU global mutex held */
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 18eb824c97..76137de67f 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -49,49 +49,43 @@ void virtio_scsi_dataplane_setup(VirtIOSCSI *s, Error 
**errp)
 }
 }
 
-static bool virtio_scsi_data_plane_handle_cmd(VirtIODevice *vdev,
+static void virtio_scsi_data_plane_handle_cmd(VirtIODevice *vdev,
   VirtQueue *vq)
 {
-bool progress = false;
 VirtIOSCSI *s = VIRTIO_SCSI(vdev);
 
 virtio_scsi_acquire(s);
 if (!s->dataplane_fenced) {
 assert(s->ctx && s->dataplane_started);
-progress = virtio_scsi_handle_cmd_vq(s, vq);
+virtio_scsi_handle_cmd_vq(s, vq);
 }
 virtio_scsi_release(s);
-return progress;
 }
 
-static bool virtio_scsi_data_plane_handle_ctrl(VirtIODevice *vdev,
+static void virtio_scsi_data_plane_handle_ctrl(VirtIODevice *vdev,
VirtQueue *vq)
 {
-bool progress = false;
 VirtIOSCSI *s = VIRTIO_SCSI(vdev);
 
 virtio_scsi_acquire(s);
 if (!s->dataplane_fenced) {
 assert(s->ctx && s->dataplane_started);
-progress = virtio_scsi_handle_ctrl_vq(s, vq);
+virtio_scsi_handle_ctrl_vq(s, vq);
 }
 virtio_scsi_release(s);
-return progress;
 }
 
-static bool virtio_scsi_data_plane_handle_event(VirtIODevice *vdev,
+static void virtio_scsi_data_plane_handle_event(VirtIODevice *vdev,
 VirtQueue *vq)
 {
-bool progress = false;
 VirtIOSCSI *s = VIRTIO_SCSI(vdev);
 
 virtio_scsi_acquire(s);
 if (!s->dataplane_fenced) {
 assert(s->ctx && s->dataplane_started);
-progress = virtio_scsi_handle_event_vq(s, vq);
+virtio_scsi_handle_event_vq(s, vq);
 }
 virtio_scsi_release(s);
-return progress;
 }
 
 static int virtio_scsi_set_host_notifier(VirtIOSCSI *s, VirtQueue *vq, int n)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 

Re: [XEN PATCH 43/57] libs: Rename $(SRCS-y) to $(OBJS-y)

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

The only thing done thing done with $(SRCS-y) is to replace ".c" by
".o". It is more useful to collect which object we want to build as
make will figure out how to build it and from which source file.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[PATCH v3 0/6] aio-posix: split poll check from ready handler

2021-12-07 Thread Stefan Hajnoczi
v3:
- Fixed FUSE export aio_set_fd_handler() call that I missed and double-checked
  for any other missing call sites using Coccinelle [Rich]
v2:
- Cleaned up unused return values in nvme and virtio-blk [Stefano]
- Documented try_poll_mode() ready_list argument [Stefano]
- Unified virtio-blk/scsi dataplane and non-dataplane virtqueue handlers 
[Stefano]

The first patch improves AioContext's adaptive polling execution time
measurement. This can result in better performance because the algorithm makes
better decisions about when to poll versus when to fall back to file descriptor
monitoring.

The remaining patches unify the virtio-blk and virtio-scsi dataplane and
non-dataplane virtqueue handlers. This became possible because the dataplane
handler function now has the same function signature as the non-dataplane
handler function. Stefano Garzarella prompted me to make this refactoring.

Stefan Hajnoczi (6):
  aio-posix: split poll check from ready handler
  virtio: get rid of VirtIOHandleAIOOutput
  virtio-blk: drop unused virtio_blk_handle_vq() return value
  virtio-scsi: prepare virtio_scsi_handle_cmd for dataplane
  virtio: use ->handle_output() instead of ->handle_aio_output()
  virtio: unify dataplane and non-dataplane ->handle_output()

 include/block/aio.h |  4 +-
 include/hw/virtio/virtio-blk.h  |  2 +-
 include/hw/virtio/virtio.h  |  5 +-
 util/aio-posix.h|  1 +
 block/curl.c| 11 ++--
 block/export/fuse.c |  4 +-
 block/io_uring.c| 19 ---
 block/iscsi.c   |  4 +-
 block/linux-aio.c   | 16 +++---
 block/nfs.c |  6 +--
 block/nvme.c| 51 ---
 block/ssh.c |  4 +-
 block/win32-aio.c   |  4 +-
 hw/block/dataplane/virtio-blk.c | 16 +-
 hw/block/virtio-blk.c   | 14 ++
 hw/scsi/virtio-scsi-dataplane.c | 60 +++---
 hw/scsi/virtio-scsi.c   |  2 +-
 hw/virtio/virtio.c  | 73 +--
 hw/xen/xen-bus.c|  6 +--
 io/channel-command.c|  6 ++-
 io/channel-file.c   |  3 +-
 io/channel-socket.c |  3 +-
 migration/rdma.c|  8 +--
 tests/unit/test-aio.c   |  4 +-
 util/aio-posix.c| 89 +
 util/aio-win32.c|  4 +-
 util/async.c| 10 +++-
 util/main-loop.c|  4 +-
 util/qemu-coroutine-io.c|  5 +-
 util/vhost-user-server.c| 11 ++--
 30 files changed, 219 insertions(+), 230 deletions(-)

-- 
2.33.1





Re: [XEN PATCH 42/57] libs,tools/include: Clean "clean" targets

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

There is no need for an extra "cleanlocal" target, we can use
double-colon rules instead.

Generated headers are now in tools/include/, so remove those file
there.

Remove -f flag as it's already in $(RM).

libs.mk:
   - don't try to remove "*.rpm" anymore.

libs/light:
   - "_paths.*.tmp" isn't created anymore.
   - clean "libxenlight_test.so" and "libxl_test_*.opic".

libs/util:
   - fix clean of version-script file.

include/xen-foreign:
   - remove __pycache__

Signed-off-by: Anthony PERARD 


Same remark regarding double-colon rules as for last patch.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH 41/57] libs: Remove need for *installlocal targets

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

There is no need for an extra "installlocal" target, we can use
double-colon rules instead.


Doesn't that mean that with the ultimate goal of including all
Makefiles of the tree that all "install" and "uninstall" rules in the
tree will have to be double-colon rules? Citing from the make manual:

  When a target appears in multiple rules, all the rules must be the
  same type: all ordinary, or all double-colon.



"install-headers" in "libs/store" was introduced for the same reason
that "installlocal" exist, so it is replaced as well.

Signed-off-by: Anthony PERARD 


With above remark I don't see how tools/libs/stat/Makefile can be left
unmodified, as it includes libs.mk and it contains an "intall:" rule.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [patch V2 18/36] genirq/msi: Add msi_device_data::properties

2021-12-07 Thread Thomas Gleixner
On Tue, Dec 07 2021 at 13:47, Thomas Gleixner wrote:
> On Tue, Dec 07 2021 at 10:04, Cédric Le Goater wrote:
>>> +/**
>>> + * msi_device_set_properties - Set device specific MSI properties
>>> + * @dev:   Pointer to the device which is queried
>>> + * @prop:  Properties to set
>>> + */
>>> +void msi_device_set_properties(struct device *dev, unsigned long prop)
>>> +{
>>> +   if (WARN_ON_ONCE(!dev->msi.data))
>>> +   return ;
>>> +   dev->msi.data->properties = 0;
>> It would work better if the prop variable was used instead of 0.
>>
>> With that fixed,
>
> Indeed. Copy & pasta w/o brain usage ...

I've pushed out an incremental fix on top. Will be folded back.

 git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git 
msi-v2-part-3-1

Thanks,

tglx



Re: [patch V2 18/36] genirq/msi: Add msi_device_data::properties

2021-12-07 Thread Thomas Gleixner
On Tue, Dec 07 2021 at 10:04, Cédric Le Goater wrote:
>> +/**
>> + * msi_device_set_properties - Set device specific MSI properties
>> + * @dev:Pointer to the device which is queried
>> + * @prop:   Properties to set
>> + */
>> +void msi_device_set_properties(struct device *dev, unsigned long prop)
>> +{
>> +if (WARN_ON_ONCE(!dev->msi.data))
>> +return ;
>> +dev->msi.data->properties = 0;
> It would work better if the prop variable was used instead of 0.
>
> With that fixed,

Indeed. Copy & pasta w/o brain usage ...



Re: [patch V2 29/31] genirq/msi: Add abuse prevention comment to msi header

2021-12-07 Thread Thomas Gleixner
On Tue, Dec 07 2021 at 09:21, Greg Kroah-Hartman wrote:
> On Mon, Dec 06, 2021 at 11:51:49PM +0100, Thomas Gleixner wrote:
>>  #include 
>>  #include 
>>  #include 
>> 
> Ah, to be young and idealistic and hope that kernel developers read
> comments in header files :)

Hope dies last.

> You might want to add this to the driver-api kernel doc build?

When I do the next round of refactoring, I'm going to move the functions
which are available for drivers into a separate header file.

Thanks,

tglx



Re: [XEN PATCH 40/57] libs: rename LDUSELIBS to LDLIBS and use it instead of APPEND_LDFLAGS

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

LDLIBS is more appropriate and intended to be used to add library
dependencies. APPEND_LDFLAGS wasn't intended to be changed by the
build system.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH 39/57] libs: Remove both "libs" and "build" target

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

"libs" is odd and has been introduced without a reason by c7d3afbb44.
Instead, only use "all".

Also remove "build" target as "all" is more appropriate and nothing is
using "build" in libs/ in the xen.git repo.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH 2/3] x86/boot: Drop move_memory() and use memcpy() directly

2021-12-07 Thread Jan Beulich
On 07.12.2021 11:53, Andrew Cooper wrote:
> @@ -1243,7 +1196,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>   * data until after we have switched to the relocated pagetables!
>   */
>  barrier();
> -move_memory(e, XEN_IMG_OFFSET, _end - _start, 1);
> +memcpy(__va(__pa(_start)), _start, _end - _start);
>  
>  /* Walk idle_pg_table, relocating non-leaf entries. */
>  pl4e = __va(__pa(idle_pg_table));
> @@ -1300,8 +1253,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> "1" (__va(__pa(cpu0_stack))), "2" (STACK_SIZE / 8)
>  : "memory" );
>  
> -bootstrap_map(NULL);
> -
>  printk("New Xen image base address: %#lx\n", xen_phys_start);
>  }

This bootstrap_map() must have been dead code already before, except
for the "keep" argument above needlessly having got passed as 1? Afaict
passing 1 was pointless without using the function's return value.

> @@ -1325,9 +1276,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>   (headroom ||
>((end - size) >> PAGE_SHIFT) > mod[j].mod_start) )
>  {
> -move_memory(end - size + headroom,
> -(uint64_t)mod[j].mod_start << PAGE_SHIFT,
> -mod[j].mod_end, 0);
> +memcpy(__va(end - size + headroom),
> +   __va((uint64_t)mod[j].mod_start << PAGE_SHIFT),
> +   mod[j].mod_end);

I'm not convinced this can be memcpy() - consider_modules() specifically
allows for the current module's source and destination areas to overlap.
See also the comment ahead of its invocation a few lines up from here.

I'm also not convinced we have the source range (fully) direct-mapped at
this point. Only full superpages have been mapped so far, and only those
for the current or higher address E820 entries (plus of course the pre-
built mappings of the space below 1Gb [PREBUILT_MAP_LIMIT]) located
below 4Gb.

As to the 2nd argument - if this can indeed be converted in the first
place, may I suggest to also switch to using pfn_to_paddr()?

Jan




Re: [PATCH 1/7] xz: add fall-through comments to a switch statement

2021-12-07 Thread Ian Jackson
Jan Beulich writes ("Re: [PATCH 1/7] xz: add fall-through comments to a switch 
statement"):
> I'm unwilling only as long as I don't understand the need for them. As
> indicated, while I appreciate your "make verification easier for
> reviewers", I assign that at least no higher priority than my desire
> to leave out inapplicable data.

Are we still talking about Signed-off-by ?

I explained the purpose of Signed-off-by.  It reflects the chain of
custody.  It is true that s-o-b is often added by people with minimal
contribution to the content; I don't think that is relevant.

If the Xen patch was derived from Linux commit XYZ (whether
automatically or manually) then even those minimal S-o-b are
applicable.

> I'd be happy for anyone else to start over. I would even ack such a
> submission myself. But as long as I'm recorded with S-o-b, I'm afraid
> I'm not going to accept re-addition of the tags for no good (as per my
> personal view) reason. Otherwise, based on experience, the example of
> this series could, in the future, be used to tell me that on an earlier
> occasion I (seemingly) did things differently.

S-o-b does not indicate complete approval of the content.  It attests
precisely to the statements in the DCO.  There is nothing irregular
about putting your S-o-b on something you don't entirely agree with.


Stepping back:

In a collaborative project, we must all respect our peers and the
community's decision.  That can mean actively progressing patches that
we personally disagree with, but which the community has decided ought
to be applied. [1]

I appreciate that can be less fun.  But it comes with the territory of
being a leading member of the community.


> As said earlier, if submissions in this form are going to be nak-ed
> going forward, and if good reasons (see above) will not be provided
> (and hence leeway will not be granted to the submitter) to support this,
> then someone else will need to start looking after imports which may be
> relevant to us.

The problem is simply one of verification.  So far there does not seem
to be a positive ack[1] for this patch.  Neither I nor Julien are
nacking it.

AIUI Julien does not want to ack it without being able to check that
all the appropriate s-o-b (and perhaps other tags) are present.  I
think that as the submitter it is really best if you make that easy.


We think the obvious way to make that easy for a reviewer is to just
copy the tags.  But an alternative would be to include, in the commit
message, full details of where the patch came from in (including urls
to mailing list articles) in such a way that a reviewer can see easily
that all the necessary s-o-b are present.


[1] Of course very rarely there might be matters of conscience, where
we have protested but our protest has been overruled by our peers.
But that does not seem to be the case here since you are not giving a
nak.

Ian.

[1] Julien did give one earlier but then wrote "actually" and started
this subthread, so I think Julien has withdrawn his ack.



Re: [PATCH v7 6/7] xen/arm: process pending vPCI map/unmap operations

2021-12-07 Thread Oleksandr Andrushchenko
Hi, Julien!

On 03.12.21 18:10, Durrant, Paul wrote:
> On 23/11/2021 23:59, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko 
>>
>> vPCI may map and unmap PCI device memory (BARs) being passed through which
>> may take a lot of time. For this those operations may be deferred to be
>> performed later, so that they can be safely preempted.
>>
>> Currently this deferred processing is happening in common IOREQ code
>> which doesn't seem to be the right place for x86 and is even more
>> doubtful because IOREQ may not be enabled for Arm at all.
>> So, for Arm the pending vPCI work may have no chance to be executed
>> if the processing is left as is in the common IOREQ code only.
>> For that reason make vPCI processing happen in arch specific code.
>>
>> Please be aware that there are a few outstanding TODOs affecting this
>> code path, see xen/drivers/vpci/header.c:map_range and
>> xen/drivers/vpci/header.c:vpci_process_pending.
>>
>> Signed-off-by: Oleksandr Andrushchenko 
>> [x86 part]
>> Acked-by: Jan Beulich 
>> Reviewed-by: Julien Grall 
>
> Reviewed-by: Paul Durrant 
>
Do we need anything else for this patch?

Thank you,
Oleksandr

Re: [XEN PATCH v8 11/47] build: fix enforce unique symbols for recent clang version

2021-12-07 Thread Anthony PERARD
On Tue, Dec 07, 2021 at 11:23:26AM +0100, Jan Beulich wrote:
> On 25.11.2021 14:39, Anthony PERARD wrote:
> > clang 6.0 and newer behave like gcc in regards for the FILE symbol, so
> > only the filename rather than the full path to the source file.
> > 
> > clang 3.8.1-24 (in our debian:stretch container) and 3.5.0-10
> > (in our debian:jessie container) do store the full path to the source
> > file in the FILE symbol.
> > 
> > This means that we also need to check clang version to figure out
> > which command we need to use to redefine symbol.
> > 
> > I don't know which version of clang change behavior, we will guess
> > 4.0.
> 
> When I did this earlier work, it was clang5 that I used. Which would seem
> to mean the change in behavior was in version 6.

Thanks, I forgot this fact. I'll make change in the patch, and replace
the last paragraph of the patch description and add something link this
instead:

Also, based on commit 81ecb38b83 ("build: provide option to
disambiguate symbol names"), which were using clang 5, the change of
behavior likely happened in clang 6.0.

Thanks,

-- 
Anthony PERARD



Re: [PATCH 1/3] x86/boot: Drop pte_update_limit from physical relocation logic

2021-12-07 Thread Jan Beulich
On 07.12.2021 11:53, Andrew Cooper wrote:
> This check has existed in one form or another since c/s 369bafdb1c1 "xen: Big
> changes to x86 start-of-day" in 2007.  Even then, AFAICT, it wasn't necessary
> because there was a clean split between the statically created entries (L3 and
> higher) and the dynamically created ones (L2 and lower).
> 
> Without dissecting the myriad changes over the past 14 years, I'm pretty
> certain Xen only booted by accident when l2_xenmap[0] was handled specially
> and skipped the pte_update_limit check which would have left it corrupt.
> 
> Nevertheless, as of right now, all non-leaf PTEs (the first loop), and all 2M
> xenmap leaf mappings (the second loop) need relocating.  They are no unused
> mappings in the range, no mappings which will be encountered multiple times,
> and it is unlikely that such mappings would be introduced.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

However, as to the description and ...

> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1230,7 +1230,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  l3_pgentry_t *pl3e;
>  l2_pgentry_t *pl2e;
>  int i, j, k;
> -unsigned long pte_update_limit;
>  
>  /* Select relocation address. */
>  xen_phys_start = end - reloc_size;
> @@ -1238,14 +1237,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  bootsym(trampoline_xen_phys_start) = xen_phys_start;
>  
>  /*
> - * No PTEs pointing above this address are candidates for 
> relocation.
> - * Due to possibility of partial overlap of the end of source 
> image
> - * and the beginning of region for destination image some PTEs 
> may
> - * point to addresses in range [e, e + XEN_IMG_OFFSET).
> - */
> -pte_update_limit = PFN_DOWN(e);

... considering the comment here: Isn't it 0d31d1680868 ("x86/setup: do
not relocate Xen over current Xen image placement") where need for this
disappeared? Afaict the non-overlap of source and destination is the
crucial factor here, yet your description doesn't mention this aspect at
all. I'd therefore like to ask for an adjustment there.

Jan




Re: [patch V2 01/23] powerpc/4xx: Remove MSI support which never worked

2021-12-07 Thread Michael Ellerman
Cédric Le Goater  writes:
> Hello Thomas,
>
> On 12/6/21 23:27, Thomas Gleixner wrote:
>> This code is broken since day one. ppc4xx_setup_msi_irqs() has the
>> following gems:
>> 
>>   1) The handling of the result of msi_bitmap_alloc_hwirqs() is completely
>>  broken:
>>  
>>  When the result is greater than or equal 0 (bitmap allocation
>>  successful) then the loop terminates and the function returns 0
>>  (success) despite not having installed an interrupt.
>> 
>>  When the result is less than 0 (bitmap allocation fails), it prints an
>>  error message and continues to "work" with that error code which would
>>  eventually end up in the MSI message data.
>> 
>>   2) On every invocation the file global pp4xx_msi::msi_virqs bitmap is
>>  allocated thereby leaking the previous one.
>> 
>> IOW, this has never worked and for more than 10 years nobody cared. Remove
>> the gunk.
>> 
>> Fixes: 3fb7933850fa ("powerpc/4xx: Adding PCIe MSI support")
>
> Shouldn't we remove all of it ? including the updates in the device trees
> and the Kconfig changes under :
>
> arch/powerpc/platforms/44x/Kconfig:   select PPC4xx_MSI
> arch/powerpc/platforms/44x/Kconfig:   select PPC4xx_MSI
> arch/powerpc/platforms/44x/Kconfig:   select PPC4xx_MSI
> arch/powerpc/platforms/44x/Kconfig:   select PPC4xx_MSI
> arch/powerpc/platforms/40x/Kconfig:   select PPC4xx_MSI

This patch should drop those selects I guess. Can you send an
incremental diff for Thomas to squash in?

Removing all the tendrils in various device tree files will probably
require some archaeology, and it should be perfectly safe to leave those
in the tree with the driver gone. So I think we can do that as a
subsequent patch, rather than in this series.

cheers



Re: [XEN PATCH v8 04/47] build: set XEN_BUILD_EFI earlier

2021-12-07 Thread Jan Beulich
On 07.12.2021 12:04, Anthony PERARD wrote:
> On Thu, Dec 02, 2021 at 03:06:54PM +0100, Jan Beulich wrote:
>> On 25.11.2021 14:39, Anthony PERARD wrote:
>>> +export XEN_BUILD_EFI XEN_BUILD_PE MKRELOC
>>> +export EFI_LDFLAGS
>>> +endif
>>
>> Exporting MKRELOC in particular isn't very nice. I wonder whether there
>> wouldn't be a way to keep it local to xen/Makefile.
> 
> I don't think that's possible. The value of MKRELOC depends on a call
> with OBJDUMP which depends on call with LD which depends on a call with
> CC. And the call with CC is the one I'm trying to move.

Like suggested for another variable elsewhere, besides moving the definition
(which I agree looks difficult to achieve) there's also the option of passing
it on the command line to (presumably) just the single sub-make which
actually means to consume it. It's only the final linking step where it's
needed afaict.

Jan




Re: [XEN PATCH v8 07/47] build: set ALL_OBJS to main Makefile; move prelink.o to main Makefile

2021-12-07 Thread Anthony PERARD
On Mon, Dec 06, 2021 at 05:52:51PM +0100, Jan Beulich wrote:
> On 25.11.2021 14:39, Anthony PERARD wrote:
> 
> Nit: In the title, do you mean "set ALL_OBJS in main Makefile; ..."?
> 
> > --- a/xen/Makefile
> > +++ b/xen/Makefile
> > @@ -285,8 +285,21 @@ CFLAGS += -flto
> >  LDFLAGS-$(CONFIG_CC_IS_CLANG) += -plugin LLVMgold.so
> >  endif
> >  
> > +# Note that link order matters!
> > +ALL_OBJS-y:= common/built_in.o
> > +ALL_OBJS-y+= drivers/built_in.o
> > +ALL_OBJS-y+= lib/built_in.o
> > +ALL_OBJS-y+= xsm/built_in.o
> > +ALL_OBJS-y+= arch/$(TARGET_ARCH)/built_in.o
> > +ALL_OBJS-$(CONFIG_CRYPTO) += crypto/built_in.o
> > +
> > +ALL_LIBS-y:= lib/lib.a
> > +
> >  include $(BASEDIR)/arch/$(TARGET_ARCH)/arch.mk
> >  
> > +export ALL_OBJS := $(ALL_OBJS-y)
> > +export ALL_LIBS := $(ALL_LIBS-y)
> 
> Who's the consumer of these exports? I ask because I don't consider the
> names very suitable for exporting, and hence I'd prefer to see their
> scope limited. If e.g. it's only a single make invocation where they
> need propagating, doing so on the command line might be better.

There seems to be only one consumer, "build.mk", and only the last
$(MAKE) call in the recipe "$(TARGET)". So, it's probably fine to set
both on the command line instead of using export. I'll have a look.

> > --- a/xen/arch/arm/Rules.mk
> > +++ b/xen/arch/arm/Rules.mk
> > @@ -0,0 +1,5 @@
> > +# head.o is built by descending into arch/arm/$(TARGET_SUBARCH), depends 
> > on the
> > +# part of $(ALL_OBJS) that will eventually recurse into $(TARGET_SUBARCH)/ 
> > and
> > +# build head.o
> > +arch/arm/$(TARGET_SUBARCH)/head.o: arch/arm/built_in.o
> > +arch/arm/$(TARGET_SUBARCH)/head.o: ;
> 
> Can't this be a single line:
> 
> arch/arm/$(TARGET_SUBARCH)/head.o: arch/arm/built_in.o ;

Sure.

> > @@ -235,7 +218,7 @@ $(TARGET).efi: FORCE
> > echo '$(if $(filter y,$(XEN_BUILD_EFI)),xen.efi generation,EFI support) 
> > disabled'
> >  endif
> >  
> > -efi/buildid.o efi/relocs-dummy.o: $(BASEDIR)/arch/x86/efi/built_in.o
> > +# These should already have been rebuilt when building the prerequisite of 
> > "prelink.o"
> >  efi/buildid.o efi/relocs-dummy.o: ;
> 
> If the comment is true in all cases, do they really still need an empty
> rule?

Yes. Those two targets are unfortunately a prerequisite of "xen.efi", so
make will look for a rule to make them, and would use %.o:%.c without
this explicit rule.

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH 36/57] tools/xenstore: Cleanup makefile

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

Regroup *FLAGS together, use $(LDLIBS).

Remove $(LDLIBS_xenstored) which was the wrong name name as it doesn't
decribe how to link to a potential libxenstored.so, instead add the
value to $(LDLIBS) of xenstored.

Add SYSTEMD_LIBS into $(LDLIBS) instead of $(LDFLAGS).

Remove the "-I." from $(CFLAGS), it shouldn't be needed.

Removed $(CFLAGS-y) and $(LDFLAGS-y). $(CFLAGS-y) is already included
in $(CFLAGS) and both aren't used anyway.

Rename ALL_TARGETS to TARGETS.
Only add programmes we want to build in $(TARGETS), not phony-targets
(replace "clients").

Store all `xenstored` objs into $(XENSTORED_OBJS-y).

Replace one $< by $^ even if there's only one dependency,
(xenstore-control).

clean: "init-xenstore-domain" isn't built here, so stop trying to
remove it, remove $(TARGETS).

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [XEN PATCH v8 16/47] xen/tools/kconfig: fix build with -Wdeclaration-after-statement

2021-12-07 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> We are going to start building kconfig with HOSTCFLAGS from Config.mk,
> it has the flag "-Wdeclaration-after-statement".
> 
> Signed-off-by: Anthony PERARD 

Reviewed-by: Jan Beulich 




[xen-4.14-testing test] 167216: tolerable FAIL - PUSHED

2021-12-07 Thread osstest service owner
flight 167216 xen-4.14-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167216/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 166348
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 166348
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 166348
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 166348
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 166348
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 166348
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 166348
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 166348
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 166348
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 166348
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 166348
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 166348
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  c4cf5388652e8434652e30c73aa79635b4253675
baseline version:
 xen  

Re: [XEN PATCH v8 12/47] build: build everything from the root dir, use obj=$subdir

2021-12-07 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> A subdirectory is now built by setting "$(obj)" instead of changing
> directory. "$(obj)" should always be set when using "Rules.mk" and
> thus a shortcut "$(build)" is introduced and should be used.
> 
> A new variable "$(need-builtin)" is introduce. It is to be used
> whenever a "built_in.o" is wanted from a subdirectory. "built_in.o"
> isn't the main target anymore, and thus only needs to depends on the
> objects that should be part of "built_in.o".
> 
> Introduce $(srctree) and $(objtree) to replace $(BASEDIR) in cases a
> relative path is better, and $(abs_srctree) and $(abs_objtree) which
> have an absolute path.
> 
> DEPS is updated as the existing macro to deal with it doesn't know
> about $(obj).
> 
> There's some changes in "Rules.mk" which in addition to deal with
> "$(obj)" also make it's looks more like "Makefile.build" from Linux
> v5.12.
> 
> test/Makefile doesn't need special handling in order to build
> everything under test/, Rules.mk will visit test/livepatch via
> $(subdir-y), thus "tests" "all" and "build" target are removed.
> "subtree-force-update" target isn't useful so it is removed as well.
> 
> test/livepatch/Makefile doesn't need default target anymore, Rules.mk
> will build everything in $(extra-y) and thus all *.livepatch.
> 
> Adjust cloc recipe: dependency files generated by CC will now have the
> full path to the source file, so we don't need to prepend the
> subdirectory. This fix some issue with source not been parsed by cloc
> before. Also source from tools/kconfig would be listed with changes in
> this patch so adjust the find command to stop listing the "tools"
> directory and thus kconfig. With a default build of Xen on X86, they
> are a few new files parsed by cloc:
> arch/x86/x86_64/compat/mm.c
> arch/x86/x86_64/mm.c
> common/compat/domain.c
> common/compat/memory.c
> common/compat/xlat.c
> 
> Signed-off-by: Anthony PERARD 
> Acked-by: Bob Eshleman 

Reviewed-by: Jan Beulich 
with one nit and a remark:

> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -22,6 +22,15 @@ export CHECKPOLICY ?= checkpolicy
>  export BASEDIR := $(CURDIR)
>  export XEN_ROOT := $(BASEDIR)/..
>  
> +abs_objtree := $(CURDIR)
> +abs_srctree := $(CURDIR)

Nit: In line with e.g. obj-y I think these would better be abs-srctree and
abs-objtree.

> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -80,6 +80,9 @@ endif
>  extra-y += asm-macros.i
>  extra-y += xen.lds
>  
> +# Allows usercopy.c to include itself
> +$(obj)/usercopy.o: CFLAGS-y += -iquote .
> +
>  ifneq ($(CONFIG_HVM),y)
>  $(obj)/x86_emulate.o: CFLAGS-y += -Wno-unused-label
>  endif
> @@ -129,13 +132,13 @@ $(TARGET)-syms: $(BASEDIR)/prelink.o $(obj)/xen.lds
>   $(NM) -pa --format=sysv $(@D)/.$(@F).0 \
>   | $(BASEDIR)/tools/symbols $(all_symbols) --sysv --sort \
>   >$(@D)/.$(@F).0.S
> - $(MAKE) -f $(BASEDIR)/Rules.mk efi-y= $(@D)/.$(@F).0.o
> + $(MAKE) $(build)=$(@D) efi-y= $(@D)/.$(@F).0.o

Hmm, hasn't the efi-y= become unnecessary already by patch 6?

Jan




Re: [PATCH] xen/arm: Add Kconfig parameter for memory banks number

2021-12-07 Thread Bertrand Marquis
Hi,

> On 7 Dec 2021, at 10:52, Luca Fancellu  wrote:
> 
> 
> 
>> On 6 Dec 2021, at 17:05, Julien Grall  wrote:
>> 
>> Hi Luca,
>> 
>> On 06/12/2021 15:37, Luca Fancellu wrote:
>>> Currently the maximum number of memory banks is fixed to
>>> 128, but on some new platforms that have a large amount
>>> of memory, this value is not enough 
>> 
> 
> Hi Julien,
> 
>> Can you provide some information on the setup? Is it using UEFI?
> 
> Yes it is a platform with 32gb of ram, I’ve built Xen with ACPI support and 
> it’s starting using UEFI+ACPI.
> 
>> 
>>> and prevents Xen
>>> from booting.
>> 
>> AFAIK, the restriction should only prevent Xen to use all the memory. If 
>> that's not the case, then this should be fixed.
> 
> The code that it’s failing is this, inside efi_process_memory_map_bootinfo(…) 
> in the arch/arm/efi/efi-boot.h:
> 
> #ifdef CONFIG_ACPI
>else if ( desc_ptr->Type == EfiACPIReclaimMemory )
>{
>if ( !meminfo_add_bank(, desc_ptr) )
>{
>PrintStr(L"Error: All " __stringify(NR_MEM_BANKS)
>  " acpi meminfo mem banks exhausted.\r\n");
>return EFI_LOAD_ERROR;
>}
>}
> #endif
> 
>> 
>>> Create a Kconfig parameter to set the value, by default
>>> 128.
>> 
>> I think Xen should be able to boot on any platform with the default 
>> configuration. So the value should at least be bumped.
>> 
>>> Signed-off-by: Luca Fancellu 
>>> ---
>>> xen/arch/arm/Kconfig| 8 
>>> xen/include/asm-arm/setup.h | 2 +-
>>> 2 files changed, 9 insertions(+), 1 deletion(-)
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index ecfa6822e4d3..805e3c417e89 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -25,6 +25,14 @@ menu "Architecture Features"
>>>   source "arch/Kconfig"
>>> +config MEM_BANKS
>>> +   int "Maximum number of memory banks."
>>> +   default "128"
>>> +   help
>>> + Controls the build-time size memory bank array.
>>> + It is the upper bound of the number of logical entities describing
>>> + the memory.
>> 
>> NR_MEM_BANKS is going to be used by multiple internal structure in Xen (e.g. 
>> static memory, reserved memory, normal memory). So how could an admin decide 
>> the correct value?
>> 
>> In particular for UEFI, we are at the mercy of the firmware that can expose 
>> any kind of memory map (that's why we had to increase the original number of 
>> banks).
>> 
>> So maybe it is time for us to move out from a static array and re-think how 
>> we discover the memory.
>> 
>> That this is probably going to take some time to get it properly, so
>> I would be OK with bumping the value + a config gated UNSUPPORTED.


Looking at what we have, the memory is actually fragmented by ACPI but a long 
term solution could be to make the code a little bit more smart and try to 
merge together adjacent banks.

I would suggest to just increase the existing define to 256 to fix the current 
issue (which might be encountered by anybody using ACPI) and put a comment in 
the code for now with a TODO explaining why we currently need such a high value 
and what should be done to fix this.

Cheers
Bertrand


> 
> I can do that.
> 
> Cheers,
> Luca
> 
>> 
>>> +
>>> config ACPI
>>> bool "ACPI (Advanced Configuration and Power Interface) Support 
>>> (UNSUPPORTED)" if UNSUPPORTED
>>> depends on ARM_64
>>> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
>>> index 95da0b7ab9cd..785a8fe81450 100644
>>> --- a/xen/include/asm-arm/setup.h
>>> +++ b/xen/include/asm-arm/setup.h
>>> @@ -6,7 +6,7 @@
>>> #define MIN_FDT_ALIGN 8
>>> #define MAX_FDT_SIZE SZ_2M
>>> -#define NR_MEM_BANKS 128
>>> +#define NR_MEM_BANKS CONFIG_MEM_BANKS
>>>   #define MAX_MODULES 32 /* Current maximum useful modules */
>>> 
>> 
>> Cheers,
>> 
>> -- 
>> Julien Grall



Re: [XEN PATCH v8 04/47] build: set XEN_BUILD_EFI earlier

2021-12-07 Thread Anthony PERARD
On Thu, Dec 02, 2021 at 03:06:54PM +0100, Jan Beulich wrote:
> On 25.11.2021 14:39, Anthony PERARD wrote:
> > +efi-check-o := arch/x86/efi/check.o
> 
> How about making this
> 
> efi-check := arch/x86/efi/check
> 
> That way you wouldn't need to replace the extension in a number of places,
> but simply append the respective one in every place using this.

This change sound fine. I guess it will make reading the code a bit
easier.

> > +export XEN_BUILD_EFI XEN_BUILD_PE MKRELOC
> > +export EFI_LDFLAGS
> > +endif
> 
> Exporting MKRELOC in particular isn't very nice. I wonder whether there
> wouldn't be a way to keep it local to xen/Makefile.

I don't think that's possible. The value of MKRELOC depends on a call
with OBJDUMP which depends on call with LD which depends on a call with
CC. And the call with CC is the one I'm trying to move.

Unless there is a better way to build *.efi, we need to know whether to
use `mkreloc` or not.

I could rename it XEN_MKRELOC. Or if there is another name that could
make sense, that would be fine too, like XEN_BUILD_EFI_NEED_RELOC which
could be a boolean.

Thanks,

-- 
Anthony PERARD



Re: [XEN PATCH 20/57] tools/Rules.mk: introduce FORCE target

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

And replace the one defined in libs.mk.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[PATCH 3/3] x86/boot: Don't double-copy the stack during physical relocation

2021-12-07 Thread Andrew Cooper
cpu0_stack is contained within .data, which means the memcpy() already takes a
snapshot at the start of the critical region.

Later, when we switch to the relocated Xen, we do end up losing updates to the
local variables, but that's fine because the only variables we've modified go
out of scope after the printk().  Use this properly to avoid copying the whole
stack (32k) a second time.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

Ever so slightly RFC, as it has only had light testing, but I'm confident in
the reasoning.
---
 xen/arch/x86/setup.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index a6ff450daab7..c04c68a09b47 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1183,6 +1183,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 l3_pgentry_t *pl3e;
 l2_pgentry_t *pl2e;
 int i, j, k;
+unsigned long tmp;
 
 /* Select relocation address. */
 xen_phys_start = end - reloc_size;
@@ -1193,7 +1194,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
  * Perform relocation to new physical address.
  * Before doing so we must sync static/global data with main memory
  * with a barrier(). After this we must *not* modify static/global
- * data until after we have switched to the relocated pagetables!
+ * data, or locals that need to survive, until after we have
+ * switched to the relocated pagetables!
  */
 barrier();
 memcpy(__va(__pa(_start)), _start, _end - _start);
@@ -1239,18 +1241,21 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) + 
xen_phys_start);
 }
 
-/* Re-sync the stack and then switch to relocated pagetables. */
+/*
+ * Switch to relocated pagetables.  This also discards updates to
+ * any local variables since the memmove() call above, but that's
+ * fine because we don't use any of them again.
+ */
 asm volatile (
-"rep movsq; " /* re-sync the stack */
-"movq %%cr4,%%rsi ; "
-"andb $0x7f,%%sil ; "
-"movq %%rsi,%%cr4 ; " /* CR4.PGE == 0 */
-"movq %[pg],%%cr3 ; " /* CR3 == new pagetables */
-"orb $0x80,%%sil  ; "
-"movq %%rsi,%%cr4   " /* CR4.PGE == 1 */
-: "=" (i), "=" (i), "=" (i) /* All outputs discarded. */
-:  [pg] "r" (__pa(idle_pg_table)), "0" (cpu0_stack),
-   "1" (__va(__pa(cpu0_stack))), "2" (STACK_SIZE / 8)
+"mov%%cr4, %[cr4]\n\t"
+"andb   $~%c[pge], %b[cr4]\n\t"
+"mov%[cr4], %%cr4\n\t" /* CR4.PGE = 0 */
+"mov%[cr3], %%cr3\n\t" /* CR3 = new pagetables */
+"orb$%c[pge], %b[cr4]\n\t"
+"mov%[cr4], %%cr4\n\t" /* CR4.PGE = 1 */
+: [cr4] "=" (tmp) /* Could be "r", but "a" makes better asm 
*/
+: [cr3] "r" (__pa(idle_pg_table)),
+  [pge] "i" (X86_CR4_PGE)
 : "memory" );
 
 printk("New Xen image base address: %#lx\n", xen_phys_start);
-- 
2.11.0




[PATCH 0/3] x86/boot: Cleanup

2021-12-07 Thread Andrew Cooper
More cleanup, following on from the __ro_after_init work.

Andrew Cooper (3):
  x86/boot: Drop pte_update_limit from physical relocation logic
  x86/boot: Drop move_memory() and use memcpy() directly
  x86/boot: Don't double-copy the stack during physical relocation

 xen/arch/x86/setup.c | 105 ---
 1 file changed, 25 insertions(+), 80 deletions(-)

-- 
2.11.0




[PATCH 2/3] x86/boot: Drop move_memory() and use memcpy() directly

2021-12-07 Thread Andrew Cooper
The way move_memory() sets up the virtual mappings means that there are always
two non-overlapping regions.  The virtual layout means that memmove()'s
forward/backwards check doesn't do what the caller intends, as the check ought
to be performed in physical space rather than virtual.

Luckily both callers already provide non-overlapping mappings, so this bug
doesn't manifest, and we can move to memcpy() to avoid a backwards copy.
Backwards rep movs's are typically far slower than forwards copies.

Furthermore, both callers already have suitable directmap mappings.  There is
no need to spend time managing early boot mappings, or chunking the copy
through them.

For the main Xen relocation, we can read out of the virtual mapping that we're
executing on, and write directly into the directmap.  In fact, this removes
one dependency on Xen being "at 0" (the XEN_IMG_OFFSET passed as src) for
relocation to occur.

For the module relocation, just transcribe the move_memory() call into an
equivalent memcpy().

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/setup.c | 58 +---
 1 file changed, 5 insertions(+), 53 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 0492856292cf..a6ff450daab7 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -413,53 +413,6 @@ void *__init bootstrap_map(const module_t *mod)
 return ret;
 }
 
-static void *__init move_memory(
-uint64_t dst, uint64_t src, unsigned int size, bool keep)
-{
-unsigned int blksz = BOOTSTRAP_MAP_LIMIT - BOOTSTRAP_MAP_BASE;
-unsigned int mask = (1L << L2_PAGETABLE_SHIFT) - 1;
-
-if ( src + size > BOOTSTRAP_MAP_BASE )
-blksz >>= 1;
-
-while ( size )
-{
-module_t mod;
-unsigned int soffs = src & mask;
-unsigned int doffs = dst & mask;
-unsigned int sz;
-void *d, *s;
-
-mod.mod_start = (src - soffs) >> PAGE_SHIFT;
-mod.mod_end = soffs + size;
-if ( mod.mod_end > blksz )
-mod.mod_end = blksz;
-sz = mod.mod_end - soffs;
-s = bootstrap_map();
-
-mod.mod_start = (dst - doffs) >> PAGE_SHIFT;
-mod.mod_end = doffs + size;
-if ( mod.mod_end > blksz )
-mod.mod_end = blksz;
-if ( sz > mod.mod_end - doffs )
-sz = mod.mod_end - doffs;
-d = bootstrap_map();
-
-memmove(d + doffs, s + soffs, sz);
-
-dst += sz;
-src += sz;
-size -= sz;
-
-if ( keep )
-return size ? NULL : d + doffs;
-
-bootstrap_map(NULL);
-}
-
-return NULL;
-}
-
 #undef BOOTSTRAP_MAP_LIMIT
 
 static uint64_t __init consider_modules(
@@ -1243,7 +1196,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
  * data until after we have switched to the relocated pagetables!
  */
 barrier();
-move_memory(e, XEN_IMG_OFFSET, _end - _start, 1);
+memcpy(__va(__pa(_start)), _start, _end - _start);
 
 /* Walk idle_pg_table, relocating non-leaf entries. */
 pl4e = __va(__pa(idle_pg_table));
@@ -1300,8 +1253,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
"1" (__va(__pa(cpu0_stack))), "2" (STACK_SIZE / 8)
 : "memory" );
 
-bootstrap_map(NULL);
-
 printk("New Xen image base address: %#lx\n", xen_phys_start);
 }
 
@@ -1325,9 +1276,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
  (headroom ||
   ((end - size) >> PAGE_SHIFT) > mod[j].mod_start) )
 {
-move_memory(end - size + headroom,
-(uint64_t)mod[j].mod_start << PAGE_SHIFT,
-mod[j].mod_end, 0);
+memcpy(__va(end - size + headroom),
+   __va((uint64_t)mod[j].mod_start << PAGE_SHIFT),
+   mod[j].mod_end);
+
 mod[j].mod_start = (end - size) >> PAGE_SHIFT;
 mod[j].mod_end += headroom;
 mod[j].reserved = 1;
-- 
2.11.0




[PATCH 1/3] x86/boot: Drop pte_update_limit from physical relocation logic

2021-12-07 Thread Andrew Cooper
This check has existed in one form or another since c/s 369bafdb1c1 "xen: Big
changes to x86 start-of-day" in 2007.  Even then, AFAICT, it wasn't necessary
because there was a clean split between the statically created entries (L3 and
higher) and the dynamically created ones (L2 and lower).

Without dissecting the myriad changes over the past 14 years, I'm pretty
certain Xen only booted by accident when l2_xenmap[0] was handled specially
and skipped the pte_update_limit check which would have left it corrupt.

Nevertheless, as of right now, all non-leaf PTEs (the first loop), and all 2M
xenmap leaf mappings (the second loop) need relocating.  They are no unused
mappings in the range, no mappings which will be encountered multiple times,
and it is unlikely that such mappings would be introduced.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/setup.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index c8641c227d9a..0492856292cf 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1230,7 +1230,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 l3_pgentry_t *pl3e;
 l2_pgentry_t *pl2e;
 int i, j, k;
-unsigned long pte_update_limit;
 
 /* Select relocation address. */
 xen_phys_start = end - reloc_size;
@@ -1238,14 +1237,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 bootsym(trampoline_xen_phys_start) = xen_phys_start;
 
 /*
- * No PTEs pointing above this address are candidates for 
relocation.
- * Due to possibility of partial overlap of the end of source image
- * and the beginning of region for destination image some PTEs may
- * point to addresses in range [e, e + XEN_IMG_OFFSET).
- */
-pte_update_limit = PFN_DOWN(e);
-
-/*
  * Perform relocation to new physical address.
  * Before doing so we must sync static/global data with main memory
  * with a barrier(). After this we must *not* modify static/global
@@ -1267,8 +1258,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 {
 /* Not present, 1GB mapping, or already relocated? */
 if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) ||
- (l3e_get_flags(*pl3e) & _PAGE_PSE) ||
- (l3e_get_pfn(*pl3e) >= pte_update_limit) )
+ (l3e_get_flags(*pl3e) & _PAGE_PSE) )
 continue;
 *pl3e = l3e_from_intpte(l3e_get_intpte(*pl3e) +
 xen_phys_start);
@@ -1277,8 +1267,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 {
 /* Not present, PSE, or already relocated? */
 if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ||
- (l2e_get_flags(*pl2e) & _PAGE_PSE) ||
- (l2e_get_pfn(*pl2e) >= pte_update_limit) )
+ (l2e_get_flags(*pl2e) & _PAGE_PSE) )
 continue;
 *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) +
 xen_phys_start);
@@ -1291,8 +1280,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++, pl2e++ )
 {
 if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ||
- !(l2e_get_flags(*pl2e) & _PAGE_PSE) ||
- (l2e_get_pfn(*pl2e) >= pte_update_limit) )
+ !(l2e_get_flags(*pl2e) & _PAGE_PSE) )
 continue;
 
 *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) + 
xen_phys_start);
-- 
2.11.0




Re: [XEN PATCH 18/57] tools: Use config.h from autoconf instead of "buildmakevars2header"

2021-12-07 Thread Juergen Gross

On 07.12.21 11:49, Anthony PERARD wrote:

On Tue, Dec 07, 2021 at 11:08:55AM +0100, Juergen Gross wrote:

On 06.12.21 18:02, Anthony PERARD wrote:

This avoid the need to generate the _paths.h header when the
information is from autoconf anyway.

They are no more users of the "buildmakevars2header" macro, so it can
be removed from "Config.mk".

Also removed the extra "-f" flag where "$(RM)" is used (xl/Makefile).

Signed-off-by: Anthony PERARD 


The changes for configure should be done in the respective configure.ac
files and then configure can be generated via a call of "autoconf".


This is what this patch does, via changes in "m4/paths.m4" ;-).
All the respective "configure.ac" have "m4_include([m4/paths.m4])", so
nearly any modification to "paths.m4" will change the generated
"configure" script.


Oh, I just looked at the modified files and didn't see any *.ac ones.

I was already wondering how you could miss that, especially as you did
it correctly in the next patch. :-)

You can then add my:

Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] xen/arm: Add Kconfig parameter for memory banks number

2021-12-07 Thread Luca Fancellu



> On 6 Dec 2021, at 17:05, Julien Grall  wrote:
> 
> Hi Luca,
> 
> On 06/12/2021 15:37, Luca Fancellu wrote:
>> Currently the maximum number of memory banks is fixed to
>> 128, but on some new platforms that have a large amount
>> of memory, this value is not enough 
> 

Hi Julien,

> Can you provide some information on the setup? Is it using UEFI?

Yes it is a platform with 32gb of ram, I’ve built Xen with ACPI support and 
it’s starting using UEFI+ACPI.

> 
>> and prevents Xen
>> from booting.
> 
> AFAIK, the restriction should only prevent Xen to use all the memory. If 
> that's not the case, then this should be fixed.

The code that it’s failing is this, inside efi_process_memory_map_bootinfo(…) 
in the arch/arm/efi/efi-boot.h:

#ifdef CONFIG_ACPI
else if ( desc_ptr->Type == EfiACPIReclaimMemory )
{
if ( !meminfo_add_bank(, desc_ptr) )
{
PrintStr(L"Error: All " __stringify(NR_MEM_BANKS)
  " acpi meminfo mem banks exhausted.\r\n");
return EFI_LOAD_ERROR;
}
}
#endif

> 
>> Create a Kconfig parameter to set the value, by default
>> 128.
> 
> I think Xen should be able to boot on any platform with the default 
> configuration. So the value should at least be bumped.
> 
>> Signed-off-by: Luca Fancellu 
>> ---
>>  xen/arch/arm/Kconfig| 8 
>>  xen/include/asm-arm/setup.h | 2 +-
>>  2 files changed, 9 insertions(+), 1 deletion(-)
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index ecfa6822e4d3..805e3c417e89 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -25,6 +25,14 @@ menu "Architecture Features"
>>source "arch/Kconfig"
>>  +config MEM_BANKS
>> +int "Maximum number of memory banks."
>> +default "128"
>> +help
>> +  Controls the build-time size memory bank array.
>> +  It is the upper bound of the number of logical entities describing
>> +  the memory.
> 
> NR_MEM_BANKS is going to be used by multiple internal structure in Xen (e.g. 
> static memory, reserved memory, normal memory). So how could an admin decide 
> the correct value?
> 
> In particular for UEFI, we are at the mercy of the firmware that can expose 
> any kind of memory map (that's why we had to increase the original number of 
> banks).
> 
> So maybe it is time for us to move out from a static array and re-think how 
> we discover the memory.
> 
> That this is probably going to take some time to get it properly, so
> I would be OK with bumping the value + a config gated UNSUPPORTED.

I can do that.

Cheers,
Luca

> 
>> +
>>  config ACPI
>>  bool "ACPI (Advanced Configuration and Power Interface) Support 
>> (UNSUPPORTED)" if UNSUPPORTED
>>  depends on ARM_64
>> diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
>> index 95da0b7ab9cd..785a8fe81450 100644
>> --- a/xen/include/asm-arm/setup.h
>> +++ b/xen/include/asm-arm/setup.h
>> @@ -6,7 +6,7 @@
>>  #define MIN_FDT_ALIGN 8
>>  #define MAX_FDT_SIZE SZ_2M
>>  -#define NR_MEM_BANKS 128
>> +#define NR_MEM_BANKS CONFIG_MEM_BANKS
>>#define MAX_MODULES 32 /* Current maximum useful modules */
>>  
> 
> Cheers,
> 
> -- 
> Julien Grall




Re: [XEN PATCH 18/57] tools: Use config.h from autoconf instead of "buildmakevars2header"

2021-12-07 Thread Anthony PERARD
On Tue, Dec 07, 2021 at 11:08:55AM +0100, Juergen Gross wrote:
> On 06.12.21 18:02, Anthony PERARD wrote:
> > This avoid the need to generate the _paths.h header when the
> > information is from autoconf anyway.
> > 
> > They are no more users of the "buildmakevars2header" macro, so it can
> > be removed from "Config.mk".
> > 
> > Also removed the extra "-f" flag where "$(RM)" is used (xl/Makefile).
> > 
> > Signed-off-by: Anthony PERARD 
> 
> The changes for configure should be done in the respective configure.ac
> files and then configure can be generated via a call of "autoconf".

This is what this patch does, via changes in "m4/paths.m4" ;-).
All the respective "configure.ac" have "m4_include([m4/paths.m4])", so
nearly any modification to "paths.m4" will change the generated
"configure" script.

Cheers,

-- 
Anthony PERARD



Re: [XEN PATCH v8 11/47] build: fix enforce unique symbols for recent clang version

2021-12-07 Thread Jan Beulich
On 25.11.2021 14:39, Anthony PERARD wrote:
> clang 6.0 and newer behave like gcc in regards for the FILE symbol, so
> only the filename rather than the full path to the source file.
> 
> clang 3.8.1-24 (in our debian:stretch container) and 3.5.0-10
> (in our debian:jessie container) do store the full path to the source
> file in the FILE symbol.
> 
> This means that we also need to check clang version to figure out
> which command we need to use to redefine symbol.
> 
> I don't know which version of clang change behavior, we will guess
> 4.0.

When I did this earlier work, it was clang5 that I used. Which would seem
to mean the change in behavior was in version 6.

Jan




Re: [XEN PATCH 19/57] tools/configure.ac: Create ZLIB_LIBS and ZLIB_CFLAGS

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

Use both ZLIB_CFLAGS and ZLIB_LIBS instead of cherry-picking flags
from a single "ZLIB" variable.

Signed-off-by: Anthony PERARD 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH 1/7] xz: add fall-through comments to a switch statement

2021-12-07 Thread Jan Beulich
On 07.12.2021 10:59, Julien Grall wrote:
> On 07/12/2021 09:11, Jan Beulich wrote:
>> On 06.12.2021 17:21, Julien Grall wrote:
>>> I still have have no way to verify
>>> what you did is correct.
>>>
>>> For instance, the tags in patch #2 are:
>>>
>>> Link: http://lkml.kernel.org/r/20191104185107.3b633...@tukaani.org
>>> Reported-by: Yu Sun 
>>> Signed-off-by: Lasse Collin 
>>> Acked-by: Daniel Walker 
>>> [Linux commit: 8e20ba2e53fc6198cbfbcc700e9f884157052a8d]
>>>
>>> The tags in the Linux commit are:
>>>
>>> Signed-off-by: Lasse Collin 
>>> Reported-by: Yu Sun 
>>> Acked-by: Daniel Walker 
>>> Cc: "Yixia Si (yisi)" 
>>> Signed-off-by: Andrew Morton 
>>> Signed-off-by: Linus Torvalds 
>>>
>>> * The first two matches the original e-mails
>>> * I couldn't find the 3rd on the ML.
>>
>> See e.g.
>>
>> https://yhbt.net/lore/all/20191108202754.GG18744@zorba/t/
>>
>> (Andrew Morton's reply at the bottom) for where it originates.
> 
> Ok... So this is taken from a different aggregator. I will have to brush 
> by search engine skill then.

To be fair, I went hunting for it only when writing the earlier reply.

>>> * The Cc could be ignored
>>> * The signed-off-by are I guess what you call "mechanical"
>>
>> I would generally retain Reviewed-by when our code is still quite
>> similar to Linux'es. Acked-by are on the edge of being useful, but as
>> you can see I did err on the side of keeping it. As said in a number
>> of places elsewhere, for what I call mechanically added tags I am yet
>> to be told of their value (or even need) in our tree.
> 
> I think the question is how difficult to do you want to make to the 
> other reviewers? I appreciate other (including myself) may have ignored 
> the tags in the past. But now that I know you do it as a manual process, 
> it makes me a lot more nervous to simply ack such patch without any check.
> 
> You seem to be unwilling to simply copy/paste them.

I'm unwilling only as long as I don't understand the need for them. As
indicated, while I appreciate your "make verification easier for
reviewers", I assign that at least no higher priority than my desire
to leave out inapplicable data.

> So for this series, would you be happy if someone else do it for you?

I'd be happy for anyone else to start over. I would even ack such a
submission myself. But as long as I'm recorded with S-o-b, I'm afraid
I'm not going to accept re-addition of the tags for no good (as per my
personal view) reason. Otherwise, based on experience, the example of
this series could, in the future, be used to tell me that on an earlier
occasion I (seemingly) did things differently.

As said earlier, if submissions in this form are going to be nak-ed
going forward, and if good reasons (see above) will not be provided
(and hence leeway will not be granted to the submitter) to support this,
then someone else will need to start looking after imports which may be
relevant to us.

Jan




Re: [XEN PATCH 18/57] tools: Use config.h from autoconf instead of "buildmakevars2header"

2021-12-07 Thread Juergen Gross

On 06.12.21 18:02, Anthony PERARD wrote:

This avoid the need to generate the _paths.h header when the
information is from autoconf anyway.

They are no more users of the "buildmakevars2header" macro, so it can
be removed from "Config.mk".

Also removed the extra "-f" flag where "$(RM)" is used (xl/Makefile).

Signed-off-by: Anthony PERARD 


The changes for configure should be done in the respective configure.ac
files and then configure can be generated via a call of "autoconf".


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


  1   2   >