Hi--

On 12/1/20 2:34 PM, mgr...@linux.intel.com wrote:
> From: mark gross <mgr...@linux.intel.com>
>
>
> Reviewed-by: Mark Gross <mgr...@linux.intel.com>
> Signed-off-by: Mark Gross <mgr...@linux.intel.com>
My reading of submitting-patches.rst seems to indicate that the Reviewer
and the Submitter should not be the same person. Are you sure that you
reviewed it?

> ---
>  Documentation/index.rst                  |   3 +-
>  Documentation/vpu/index.rst              |  16 ++
>  Documentation/vpu/vpu-stack-overview.rst | 267 +++++++++++++++++++++++
>  3 files changed, 285 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/vpu/index.rst
>  create mode 100644 Documentation/vpu/vpu-stack-overview.rst
>
> diff --git a/Documentation/index.rst b/Documentation/index.rst
> index 57719744774c..0a2cc0204e8f 100644
> --- a/Documentation/index.rst
> +++ b/Documentation/index.rst
> @@ -1,4 +1,4 @@
> -.. SPDX-License-Identifier: GPL-2.0
> +.. SPDX-License-Identifier: GPL-2.0-only

That looks both inappropriate for this patch and incorrect AFAICT.

>
>
>  .. The Linux Kernel documentation master file, created by
> @@ -137,6 +137,7 @@ needed).
>     misc-devices/index
>     scheduler/index
>     mhi/index
> +   vpu/index
>
>  Architecture-agnostic documentation
>  -----------------------------------
> diff --git a/Documentation/vpu/index.rst b/Documentation/vpu/index.rst
> new file mode 100644
> index 000000000000..7e290e048910
> --- /dev/null
> +++ b/Documentation/vpu/index.rst
> @@ -0,0 +1,16 @@
> +.. SPDX-License-Identifier: GPL-2.0-only

license-rules.rst says:

	For 'GNU General Public License (GPL) version 2 only' use:
		SPDX-License-Identifier: GPL-2.0

> +
> +============================================
> +Vision Processor Unit Documentation
> +============================================
> +
> +This documentation contains information for the Intel VPU stack.
> +
> +.. class:: toc-title
> +
> +   Table of contents
> +
> +.. toctree::
> +   :maxdepth: 2
> +
> +   vpu-stack-overview
> diff --git a/Documentation/vpu/vpu-stack-overview.rst b/Documentation/vpu/vpu-stack-overview.rst
> new file mode 100644
> index 000000000000..53c06a7d9a52
> --- /dev/null
> +++ b/Documentation/vpu/vpu-stack-overview.rst
> @@ -0,0 +1,267 @@
> +.. SPDX-License-Identifier: GPL-2.0-only

Nope.

> +
> +======================
> +Intel VPU architecture
> +======================
> +
> +Overview
> +========
> +
> +The Intel Movidius acquisition has developed a Vision Processing Unit (VPU)
> +roadmap of products starting with Keem Bay (KMB). The HW configurations the

s/HW/hardware/

> +VPU can support include:
> +
> +1. Standalone smart camera that does local CV processing in camera

Tell us what CV is before using it.

> +2. Standalone appliance or SBC device connected to a network and tethered

Tell us what SBC is before using it. (yeah, I know)

> +   cameras doing local CV processing
> +3. Embedded in a USB dongle or M.2 as an CV accelerator.
> +4. Multiple VPU enabled SOC's on a PCIE card as a CV accelerator in a larger IA

                                     PCIe (?)

> +   box or server.
> +
> +Keem Bay is the first instance of this family of products. This document
> +provides an architectural overview of the SW stack supporting the VPU enabled

s/SW/software/

> +products.
> +
> +Keem Bay (KMB) is a Computer Vision AI processing SoC based on ARM A53 CPU that
> +provides Edge neural network acceleration (inference) and includes a Vision
> +Processing Unit (VPU) hardware. The ARM CPU SubSystem (CPUSS) interfaces
> +locally to the VPU and enables integration/interfacing with a remote host over
> +PCIe or USB or Ethernet interfaces. The interface between the CPUSS and the VPU
> +is implemented with HW FIFOs (Control) and coherent memory mapping (Data) such
> +that zero copy processing can happen within the VPU.
> +
> +The KMB can be used in all 4 of the above classes of designs.
> +
> +We refer to the 'local host' as being the ARM part of the SoC, while the
> +'remote host' as the IA system hosting the KMB device(s). The KMB SoC boots
> +from an eMMC via uBoot and ARM Linux compatible device tree interface with an
> +expectation to fully boot within hundreds of milliseconds. There is also
> +support for downloading the kernel and root file system image from a remote
> +host.
> +
> +The eMMC can be updated with standard mender update process.

                                        Mender

> +See https://github.com/mendersoftware/mender
> +
> +The VPU is started and controlled from the A53 local host. Its firmware image
> +is loaded using the drive FW helper KAPI's.

                          s/FW/firmware/

> +
> +The VPU IP FW payload consists of a SPARC ISA RTEMS bootloader and/or
> +application binary.
> +
> +The interface allowing  (remote or local) host clients to access VPU IP

                        ^^ drop one space

> +capabilities is realized through an abstracted programming model, which
> +provides Remote Proxy APIs for a host CPU application to dynamically create and
> +execute CV and NN workloads on the VPU. All frameworks exposed through

Tell us what NN is.

> +programming model’s APIs are contained in the pre-compiled standard firmware
> +image.
> +
> +There is a significant SW stack built up to support KMB and the use cases. The
> +rest of this documentation provides an overview of the components of the stack.
> +
> +Keem Bay IPC
> +============
> +
> +Directly interfaces with the KMB HW FIFOs to provide zero copy processing from
> +the VPU. It implements the lowest level protocol for interacting with the VPU.
> +
> +The Keem Bay IPC mechanism is based on shared memory and hardware FIFOs,

                                                                     FIFOs.

> +specifically there are:

   Specifically

> +
> +* Two 128-entry HW FIFOs, one for the CPU and one for the VPU.
> +* Two shared memory regions, used as memory pool for allocating IPC buffers

end with a period since the previous line did that.

> +
> +An IPC channel is a software abstraction allowing communication multiplexing,
> +so that multiple applications / users can concurrently communicated to the VPU.

                                                          communicate

> +IPC channels area conceptually similar to socket ports.
> +
> +There is a total of 1024 channels, each one identified by a channel ID, ranging

         are

> +from 0 to 1023.
> +
> +Channels are divided in two categories:
> +
> +* High-Speed (HS) channels, having IDs in the 0-9 range.
> +* General-Purpose (GP) channels, having IDs in the 10-1023 range.
> +
> +HS channels have higher priority over GP channels and can be used by
> +applications requiring higher throughput or lower latency.
> +
> +Since all the channels share the same HW resources (i.e., the HW FIFOs and the
> +IPC memory pools), the Keem Bay IPC driver uses software queues to give a
> +higher priority to HS channels.
> +
> +The driver supports a build-time configurable number of communication channels
> +defined in a so called Channel Mapping Table.

                so-called

> +
> +An IPC channel is full duplex: a pending operation from a certain channel does
> +not block other operations on the same channel, regardless of their operation
> +mode (blocking or non-blocking).
> +
> +Operation mode is individually selectable for each channel, per operation
> +direction (read or write). All operations for that direction comply to
> +selection.
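(Side note, not a requested change: for readers skimming this overview,
the HS/GP channel split described above amounts to something like the
sketch below. The constants and helper names are mine for illustration;
they are not from this patch.)

	/* Illustrative only: hypothetical names, not the driver's. */
	#include <linux/types.h>

	#define KMB_IPC_NUM_CHANNELS	1024	/* channel IDs 0-1023 */
	#define KMB_IPC_HS_CHAN_MAX	9	/* IDs 0-9 are High-Speed */

	/*
	 * GP channels (IDs 10-1023) share the two HW FIFOs and the IPC
	 * memory pools with HS channels, so the driver is described as
	 * using software queues to serve HS traffic first.
	 */
	static inline bool kmb_ipc_chan_is_hs(u16 chan_id)
	{
		return chan_id <= KMB_IPC_HS_CHAN_MAX;
	}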
> +
> +Keem Bay-VPU-IPC
> +================
> +
> +This is the MMIO driver of the VPU IP block inside the SOC. It is a control
> +driver mapping IPC channel communication to Xlink virtual channels.
> +
> +This driver provides the following functionality to other drivers in the
> +communication stack:
> +
> +* VPU IP execution control (firmware load, start, reset)
> +* VPU IP event notifications (device connected, device disconnected, WDT event)
> +* VPU IP device status query (OFF, BUSY, READY, ERROR, RECOVERY)
> +* Communication via the IPC protocol (wrapping the Keem Bay IPC driver and
> +  exposing it to higher level Xlink layer)
> +
> +In addition to the above, the driver exposes SoC information (like stepping,
> +device ID, etc.) to user-space via sysfs.
> +
> +This driver depends on the 'Keem Bay IPC' driver, which enables the Keem Bay
> +IPC communication protocol.
> +
> +The driver uses the Firmware API to load the VPU firmware from user-space.
> +
> +Xlink-IPC
> +=========
> +This component is implementing the IPC specific Xlink protocol. It maps channel

        component implements    IPC-specific

> +IDs to HW FIFO entries, using the Keem Bay VPU IPC driver.
> +
> +Some of the main functions this driver provides:
> +
> +* establishing a connection with an IPC device
> +* obtaining a list with the available devices
> +* obtaining the status for a device
> +* booting a device
> +* resetting a device
> +* opening and closing channels
> +* issuing read and write operations
> +
> +Xlink-core
> +==========
> +
> +This component implements an abstracted set of control and communication APIs
> +based on channel identification. It is intended to support VPU technology both
> +at SoC level as well as at IP level, over multiple interfaces.
> +
> +It provides symmetrical services, where the producer and the consumer have
> +the same privileges.
> +
> +Xlink driver has the ability to abstract several types of communication
> +channels underneath, allowing the usage of different interfaces with the same
> +function calls.
> +
> +Xlink services are available to both kernel and user space clients and include:
> +
> +* interface abstract control and communication API
> +* multi device support
> +* concurrent communication across 4096 communication channels (from 0 to
> +  0xFFF), with customizable properties
> +* full duplex channels with multiprocess and multithread support
> +* channel IDs can be mapped to desired physical interface (PCIE, USB, ETH, IPC)

                                                             PCIe

> +  via a Channel Mapping Table
> +* asynchronous fast pass through mode: remote host data packets are directly

                     passthrough

> +  dispatched using interrupt systems running on local host to IPC calls for low
> +  overhead
> +* channel handshaking mechanism for peer to peer communication, without the
> +  need of static channel preallocation
> +* channel resource management
> +* asynchronous data and device notifications to subscribers
> +
> +Xlink transports: PCIe, USB, ETH, IPCXLink-PCIe

                                     IPC,

> +
> +XLink-PCIE

   PCIe

> +==========
> +This is an endpoint driver that is mapping Xlink channel IDs to PCIE channels.

                            that maps                            PCIe

> +
> +This component ensures (remote)host-to-(local)host communication, and VPU IP
> +communication via an asynchronous pass through mode, where PCIE data loads are

                                    passthrough             PCIe

> +directly dispatched to Xlink-IPC.
> +
> +The component builds and advertises Device IDs that can are used by local host

                                                that are used

> +application in case of multi device scenarios.
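(Another aside: since Xlink-core is the piece most readers will program
against, here is roughly what "same function calls over any transport"
means in practice. Everything below is a hypothetical sketch; these are
not the function names or signatures from this patch series.)

	#include <stddef.h>

	#define DEMO_CHAN_ID 0x400	/* any ID in the 0x000-0xFFF range */

	/* Hypothetical Xlink-like primitives, declared only to make the
	 * sketch self-contained.
	 */
	int demo_xlink_open_channel(int dev, int chan_id);
	int demo_xlink_write(int dev, int chan_id, const void *buf, size_t len);
	int demo_xlink_close_channel(int dev, int chan_id);

	/* Open a channel, push one buffer, close. The point is that the
	 * same calls work whether the Channel Mapping Table routes this
	 * ID over PCIe, USB, Ethernet or IPC.
	 */
	int demo_send(int dev, const void *buf, size_t len)
	{
		int rc = demo_xlink_open_channel(dev, DEMO_CHAN_ID);

		if (rc)
			return rc;
		rc = demo_xlink_write(dev, DEMO_CHAN_ID, buf, len);
		demo_xlink_close_channel(dev, DEMO_CHAN_ID);
		return rc;
	}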
> +
> +XLink-USB
> +==========
> +This is an endpoint driver that is mapping Xlink channel IDs to bidirectional

                            that maps

> +USB endpoints and supports CDC USB class protocol. More than one Xlink channels

                                                                         channel

> +can be mapped to a single USB endpoint.
> +
> +This component ensures host-to-host communication, and, as well, asynchronous
> +pass through communication, where USB transfer packets are directly dispatched

   passthrough

> +to Xlink-IPC.
> +
> +The component builds and advertises Device IDs that can are used by local host

                                                that are used

> +application in case of multi device scenarios.
> +
> +XLink-ETH
> +=========
> +
> +This is an endpoint driver that is mapping Xlink channel IDs to Ethernet

                            that maps

> +sockets.
> +
> +This component ensures host-to-host communication, and, as well, asynchronous
> +pass through communication, where Ethernet data loads are directly dispatched to

   passthrough

> +Xlink-IPC.
> +
> +The component builds and advertises Device IDs that can are used by local host

                                                that are used

> +application in case of multi device scenarios.
> +
> +Assorted drivers that depend on this stack:
> +
> +Xlink-SMB
> +=========
> +The Intel Edge.AI Computer Vision platforms have to be monitored using platform
> +devices like sensors, fan controller, IO expander etc. Some of these devices
> +are memory mapped and some are i2c based. Either of these devices are not

                                  I2C-based. None of these devices is

> +directly accessible to the host.
> +
> +The host here refers to the server to which the vision accelerators are
> +connected over PCIe Interface. The Host needs to do a consolidated action based
> +on the parameters of platform devices. In general, most of the standard devices
> +(includes sensors, fan controller, IO expander etc) are I2C/SMBus based and are

                                                              SMBus-based

> +used to provide the status of the accelerator. Standard drivers for these
> +devices are available based on i2c/smbus APIs.

                                  I2C/SMBus

> +
> +Instead of changing the sensor drivers to adapt to PCIe interface, a generic
> +i2c adapter "xlink-smbus" which underneath uses xlink as physical medium is

   I2C                                             Xlink

> +used. With xlink-smbus, the drivers for the platform devices doesn't need to

                                                              don't

> +undergo any interface change.
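(One more aside: the "generic adapter" idea above is the standard
i2c_algorithm trick, roughly as sketched below. The xlink forwarding
call and all the demo_* names are made up for illustration; only the
adapter boilerplate is the normal kernel I2C API.)

	#include <linux/i2c.h>
	#include <linux/module.h>

	/* Hypothetical: marshal the SMBus transaction into an Xlink
	 * message bound for the accelerator.
	 */
	int demo_xlink_forward_smbus(u16 addr, char read_write, u8 command,
				     int size, union i2c_smbus_data *data);

	static s32 demo_smbus_xfer(struct i2c_adapter *adap, u16 addr,
				   unsigned short flags, char read_write,
				   u8 command, int size,
				   union i2c_smbus_data *data)
	{
		/* No real bus here: tunnel the transaction over Xlink. */
		return demo_xlink_forward_smbus(addr, read_write, command,
						size, data);
	}

	static u32 demo_func(struct i2c_adapter *adap)
	{
		return I2C_FUNC_SMBUS_BYTE_DATA | I2C_FUNC_SMBUS_WORD_DATA;
	}

	static const struct i2c_algorithm demo_algo = {
		.smbus_xfer    = demo_smbus_xfer,
		.functionality = demo_func,
	};

	static struct i2c_adapter demo_adapter = {
		.owner = THIS_MODULE,
		.algo  = &demo_algo,
		.name  = "demo-xlink-smbus",
	};

	/* i2c_add_adapter(&demo_adapter) would then let unmodified
	 * sensor drivers bind as if a local SMBus controller were
	 * present.
	 */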
> +
> +TSEN
> +====
> +
> +Thermal sensor driver for exporting thermal events to the local Arm64 host as
> +well as to the remote X86 host if in the PCIe add in CV accelerator

                                              add-in

> +configuration.
> +
> +The driver receiving the junction temperature from different heating points

           receives

> +inside the SOC. The driver will receive the temperature on SMBUS connection and

                                                             SMBus

> +forward over xlink-smb when in a remote host configuration.
> +
> +In Keem Bay, the four thermal junction temperature points are, Media Subsystem

                                                              ^ no comma

> +(mss), NN subsystem (nce), Compute subsystem (cse) and SOC(Maximum of mss, nce

                                                    and SOC (maximum of mss, nce

> +and cse)

       cse).

> +
> +HDDL
> +====
> +
> +- Exports details of temperature sensor, current sensor and fan controller
> +  present in Intel Edge.AI Computer Vision platforms to IA host.
> +- Enable Time sync of Intel Edge.AI Computer Vision platform with IA host.
> +- Handles device connect and disconnect events.
> +- Receives slave address from the IA host for memory mapped thermal sensors
> +  present in SoC (Documentation/hwmon/intel_tsens_sensors.rst).
> +- Registers i2c slave device for slaves present in Intel Edge.AI Computer

            I2C

> +  Vision platform
> +
> +
> +VPUMGR (VPU Manager)
> +====================
> +
> +Bridges firmware on VPU side and applications on CPU user-space, it assists
> +firmware on VPU side serving multiple user space application processes on CPU
> +side concurrently while also performing necessary data buffer management on
> +behalf of VPU IP.

-- 
~Randy