On Wed, 24 Jul 2019 11:08:58 +0200 Laurent Dufour <lduf...@linux.vnet.ibm.com> wrote:
> On 23/07/2019 at 18:13, Vaibhav Jain wrote:
> > This doc patch provides an initial description of the HCall op-codes
> > that are used by Linux kernel running as a guest operating
> > system (LPAR) on top of PowerVM or any other sPAPR compliant
> > hyper-visor (e.g qemu).
> >
> > Apart from documenting the HCalls the doc-patch also provides a
> > rudimentary overview of how Hcalls are implemented inside the Linux
> > kernel and how information flows between kernel and PowerVM/KVM.
>
> Hi Vaibhav,
>
> That's a good idea to introduce such a documentation.
>
> > Signed-off-by: Vaibhav Jain <vaib...@linux.ibm.com>
> > ---
> > Change-log:
> >
> > v5
> > * First patch in this patchset.
> > ---
> >  Documentation/powerpc/hcalls.txt | 140 +++++++++++++++++++++++++++++++
> >  1 file changed, 140 insertions(+)
> >  create mode 100644 Documentation/powerpc/hcalls.txt
> >
> > diff --git a/Documentation/powerpc/hcalls.txt b/Documentation/powerpc/hcalls.txt
> > new file mode 100644
> > index 000000000000..cc9dd872cecd
> > --- /dev/null
> > +++ b/Documentation/powerpc/hcalls.txt
> > @@ -0,0 +1,140 @@
> > +Hyper-visor Call Op-codes (HCALLS)
> > +====================================
> > +
> > +Overview
> > +=========
> > +
> > +Virtualization on PPC64 arch is based on the PAPR specification[1] which
> > +describes run-time environment for a guest operating system and how it should
> > +interact with the hyper-visor for privileged operations. Currently there are two
> > +PAPR compliant hypervisors (PHYP):
> > +
> > +IBM PowerVM: IBM's proprietary hyper-visor that supports AIX, IBM-i and Linux as
> > +             supported guests (termed as Logical Partitions or LPARS).
> > +
> > +Qemu/KVM: Supports PPC64 linux guests running on a PPC64 linux host.
> > +
> > +On PPC64 arch a virtualized guest kernel runs in a non-privileged mode (HV=0).
> > +Hence to perform a privileged operations the guest issues a Hyper-visor
> > +Call (HCALL) with necessary input operands.
> > +PHYP after performing the privilege
> > +operation returns a status code and output operands back to the guest.
> > +
> > +HCALL ABI
> > +=========
> > +The ABI specification for a HCall between guest os kernel and PHYP is
> > +described in [1]. The Opcode for Hcall is set in R3 and subsequent in-arguments
> > +for the Hcall are provided in registers R4-R12. On return from 'HVCS'
> > +instruction the status code of HCall is available in R3 an the output parameters
> > +are returned in registers R4-R12.
>
> Would it be good to mention that values passed through the memory must be
> stored in Big Endian format?
>
> > +Powerpc arch code provides convenient wrappers named plpar_hcall_xxx defined in
> > +header 'hvcall.h' to issue HCalls from the linux kernel running as guest.
> > +
> > +
> > +DRC & DRC Indexes
> > +=================
> > +
> > +                  PAPR                Guest
> > +  DR1          Hypervisor              OS
> > +  +--+        +----------+         +---------+
> > +  |  |<------>|          |         |  User   |
> > +  +--+  DRC1  |          |   DRC   |  Space  |
> > +              |          |  Index  +---------+
> > +  DR2         |          |         |         |
> > +  +--+        |          |<------->| Kernel  |
> > +  |  |<------>|          |  HCall  |         |
> > +  +--+  DRC2  +----------+         +---------+
> > +
> > +PHYP terms shared hardware resources like PCI devices, NVDimms etc available for
> > +use by LPARs as Dynamic Resource (DR). When a DR is allocated to an LPAR, PHYP
> > +creates a data-structure called Dynamic Resource Connector (DRC) to manage LPAR
> > +access. An LPAR refers to a DRC via an opaque 32-bit number called DRC-Index.
> > +The DRC-index value is provided to the LPAR via device-tree where its present
> > +as an attribute in the device tree node associated with the DR.
>
> Should you use the term 'Hypervisor' instead of 'PHYP', which usually
> designates only the proprietary one?
>
> Thanks,
> Laurent.
>
> > +
> > +HCALL Op-codes
> > +==============
> > +
> > +Below is a partial of of HCALLs that are supported by PHYP. For the
                        ^^ list?
Thanks

Michal

> > +corresponding opcode values please look into the header
> > +'arch/powerpc/include/asm/hvcall.h' :
> > +
> > +* H_SCM_READ_METADATA:
> > +  Input: drcIndex, offset, buffer-address, numBytesToRead
> > +  Out: None
> > +  Description:
> > +  Given a DRC Index of an NVDimm, read N-bytes from the the meta data area
> > +  associated with it, at a specified offset and copy it to provided buffer.
> > +  The metadata area stores configuration information such as label information,
> > +  bad-blocks etc. The metadata area is located out-of-band of NVDimm storage
> > +  area hence a separate access semantics is provided.
> > +
> > +* H_SCM_WRITE_METADATA:
> > +  Input: drcIndex, offset, data, numBytesToWrite
> > +  Out: None
> > +  Description:
> > +  Given a DRC Index of an NVDimm, write N-bytes from provided buffer at the
> > +  given offset to the the meta data area associated with the NVDimm.
> > +
> > +
> > +* H_SCM_BIND_MEM:
> > +  Input: drcIndex, startingScmBlockIndex, numScmBlocksToBind, targetAddress
> > +  Out: guestMappedAddress, numScmBlockBound
> > +  Description:
> > +  Given a DRC-Index of an NVDimm, maps the SCM (Storage Class Memory) blocks to
> > +  continuous logical addresses in guest physical address space. The HCALL
> > +  arguments can be used to map partial range of SCM blocks instead of entire
> > +  NVDimm range to the LPAR.
> > +
> > +* H_SCM_UNBIND_MEM:
> > +  Input: drcIndex, startingScmLogicalMemoryAddress, numScmBlocksToUnbind
> > +  Out: numScmBlocksUnbound
> > +  Description:
> > +  Given a DRC-Index of an NVDimm, unmap one or more the SCM blocks from guest
> > +  physical address space. The HCALL can fail if the Guest has an active PTE
> > +  entry to the SCM block being unbinded.
> > +
> > +* H_SCM_QUERY_BLOCK_MEM_BINDING:
> > +  Input: drcIndex, scmBlockIndex
> > +  Out: Guest-Physical-Address
> > +  Description:
> > +  Given a DRC-Index and an SCM Block index return the guest physical address to
> > +  which the SCM block is mapped to.
> > +
> > +* H_SCM_QUERY_LOGICAL_MEM_BINDING:
> > +  Input: Guest-Physical-Address
> > +  Out: drcIndex, scmBlockIndex
> > +  Description:
> > +  Given a guest physical address return which DRC Index and SCM block is mapped
> > +  to that address.
> > +
> > +* H_SCM_UNBIND_ALL:
> > +  Input: scmTargetScope, drcIndex
> > +  Out: None
> > +  Description:
> > +  Depending on the Target scope unmap all scm blocks belonging to all NVDimms
> > +  or all scm blocks belonging to a single NVDimm identified by its drcIndex
> > +  from the LPAR memory.
> > +
> > +* H_SCM_HEALTH:
> > +  Input: drcIndex
> > +  Output: health-bitmap, health-bit-valid-bitmap
> > +  Description:
> > +  Given a DRC Index return the info on predictive failure and over all health of
> > +  the NVDimm. The asserted bits in the health-bitmap indicate a single predictive
> > +  failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
> > +  valid.
> > +
> > +
> > +* H_SCM_PERFORMANCE_STATS:
> > +  Input: drcIndex, resultBuffer Addr
> > +  Out: None
> > +  Description:
> > +  Given a DRC Index collect the performance statistics for NVDimm and copy them
> > +  to the resultBuffer.
> > +
> > +
> > +References
> > +==========
> > +[1]: "Linux on Power Architecture Platform Reference"
> > +     https://members.openpowerfoundation.org/document/dl/469
> >
>
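
As an aside for readers of the quoted ABI section, the register convention it describes (opcode in R3, in-arguments in R4-R12) and the H_SCM_HEALTH bitmap pair can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions, not kernel code: `struct hcall_regs`, `hcall_marshal` and `scm_health_effective` are hypothetical names invented for illustration, and masking the health-bitmap with the valid-bitmap is one plausible reading of the description rather than something the patch spells out. The real entry points in the kernel are the plpar_hcall_xxx wrappers the patch mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the register window R3..R12 used by an HCall. */
#define HCALL_NREGS    10	/* R3 through R12 inclusive */
#define HCALL_MAX_ARGS  9	/* R4 through R12 carry the in-arguments */

struct hcall_regs {
	uint64_t r[HCALL_NREGS];	/* r[0] models R3, r[9] models R12 */
};

/*
 * Place the opcode in R3 and up to nine input operands in R4..R12.
 * Unused argument registers are zeroed so the model is deterministic.
 */
void hcall_marshal(struct hcall_regs *regs, uint64_t opcode,
		   const uint64_t *args, size_t nargs)
{
	size_t i;

	assert(nargs <= HCALL_MAX_ARGS);
	regs->r[0] = opcode;
	for (i = 0; i < HCALL_MAX_ARGS; i++)
		regs->r[1 + i] = (i < nargs) ? args[i] : 0;
}

/*
 * Assumed interpretation of H_SCM_HEALTH outputs: keep only the health
 * bits that the valid bitmap marks as meaningful.
 */
uint64_t scm_health_effective(uint64_t health_bitmap, uint64_t valid_bitmap)
{
	return health_bitmap & valid_bitmap;
}
```

A caller would fill `struct hcall_regs` with a placeholder opcode and, say, a drcIndex as the first argument, then read the status back from `r[0]` after the (here unmodelled) hypervisor transition. The actual opcode values live in 'arch/powerpc/include/asm/hvcall.h' and are deliberately not reproduced here.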