On 03/16/15 15:15, Gabriel L. Somlo wrote:
> This document covers the generic portions of fw_cfg as well as
> the x86/x86-64 architecture specific components.
> 
> Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com>
> Signed-off-by: Gabriel Somlo <so...@cmu.edu>
> ---
> 
> This is a resubmission of Jordan's patch from back when:
>   http://lists.gnu.org/archive/html/qemu-devel/2011-04/msg00238.html
> My own signed-off-by may or may not belong in the commit log, not
> quite 100% sure what the etiquette is. Please apply with or without it :)
> 
> Thanks,
>   Gabriel
> 
>  docs/specs/fw_cfg.txt | 167 
> ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 167 insertions(+)
>  create mode 100644 docs/specs/fw_cfg.txt
> 
> diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
> new file mode 100644
> index 0000000..7d156b7
> --- /dev/null
> +++ b/docs/specs/fw_cfg.txt
> @@ -0,0 +1,167 @@
> +
> += Hardware Interface =
> +
> +== Index Port ==
> +* Two bytes in width (guest native endianness)

Hmmm. I'm not sure if Jordan's patch should be taken separately, then
updated in a separate patch, or if an updated version should be posted
as one squashed patch.

In any case, I'd like to see the term "selector register" mentioned here
(as a synonym, or even replacing "index port"). We have
"Documentation/devicetree/bindings/arm/fw-cfg.txt" in the kernel tree
now, and wording should be consistent.

Second, the selector register's endianness is not guest native *in
general*. The commit message says "generic portions of fw_cfg as well as
the x86/x86-64 architecture specific components", so it's not easy to
decide which this line belongs to.

Namely, on x86 the selector register is mapped as a 16-bit wide, little
endian ioport (see fw_cfg_comb_mem_ops and fw_cfg_io_realize()), and in
that sense "guest native endianness" is correct.

However, on arm and ppc (and whatever else uses fw_cfg_init_mem*()), the
selector register is memory mapped, and it takes the keys in big endian
representation (see fw_cfg_ctl_mem_ops and fw_cfg_mem_realize()). For
the arm target, I think big endian cannot be called "guest native
endianness".

> +* Write only
> +* Can be an I/O port and/or Memory-Mapped I/O

Okay, if you mention MMIO, then the endianness claim is definitely wrong.

> +* Location is platform dependent
> +
> +A write to this port sets the index of a firmware configuration item
> +which can subsequently be accessed at the data port.
> +
> +Setting the index port will cause the data offset to be set to zero.
> +The data offset impacts which data is accessed via the data port, and
> +is explained below.
> +
> +Bit15 of the index value indicates if the configuration setting is
> +architecture specific.  If bit15 of the index is 0, then the item is
> +a generic configuration item.  If bit15 of the index is 1, then the
> +item is specific to a particular architecture.  (In other words,
> +generic configuration items are accessed when the index is between
> +0x0000-0x7fff, and architecture specific configuration items are
> +accessed when the index is between 0x8000-0xffff.)
> +
> +Bit14 of the index value indicates if the configuration setting is
> +being written.  If bit14 of the index is 0, then the item is only
> +being read, and all write access to the data port will be completely
> +ignored.  If bit14 of the index is 1, then the item's data can be
> +written to by writing to the data port.  (In other words,
> +configuration write mode is enabled when the index is between
> +0x4000-0x7fff or 0xc000-0xffff.)
> +
> +== Data Port ==
> +* One byte in width

Not necessarily so, any longer. The memory mapped register can be 1, 2,
4 or 8 bytes wide (see fw_cfg_init_mem_wide(), data_width), with the
choice depending on board code. Then the guest can access the data
register with all of the above widths that do not exceed the maximum
width, but always only at offset #0.

And, somewhat non-intuitively, the data register has no endianness per
se -- the DEVICE_BIG_ENDIAN you see in fw_cfg_data_mem_ops is a pure
technicality -- because it does not transfer numerical values; it is
*string*-preserving.

Once the guest has read an fw_cfg blob, with a sequence of data register
accesses of various allowed widths, *then* it can interpret the blob as
necessary. (For example, (a) the "initrd blob" is an opaque blob, (b)
the "initrd size blob" is a number in little endian encoding, (c) the
"fw_cfg file directory blob" is an array of structures prefixed with the
number of entries, and both that entry count and the numerical fields in
the entry struct have big endian encoding.) Blob transfer and blob
interpretation are separate.

> +* Read + Write
> +* Can be an I/O port and/or Memory-Mapped I/O
> +* Location is platform dependent

I think we should spell out that in the ioport mapping, the two ports
overlap. (They cannot be mapped separately.)

> +
> +The data port allows for access to an array of bytes for each firmware
> +configuration data item.  This item is selected by a write to the
> +index port.
> +
> +Initially following a write to the index port, the data offset will
> +be set to zero.  Each successful read or write to the data port will
> +cause the data offset to increment by one byte.

(In the mmio case, the offset is incremented by the access width, 1 / 2
/ 4 / 8.)

> There is only one
> +data offset value, and it will be incremented by accesses to any of
> +the I/O or MMIO data ports.

I don't understand "any of". There is one data register only.

Hm, yes, I do understand the text; it was originally written when the
API that board code could use had not yet strictly separated the ioport
mapping from the mmio mapping.

>  A write access will not increment the
> +data offset if the selected index did not have bit14 set.
> +
> +Each firmware configuration item has a maximum length of data
> +associated with the item.  After the data offset has passed the
> +end of this maximum data length, then any reads will return a data
> +value of 0x00, and all writes will be ignored.
> +
> +A read of the data port will return the current byte of the firmware
> +configuration item.

In the mmio case, an N-byte wide read of the data register will return
the next N bytes (as a substring of the blob being read) in increasing
address order, like with memcpy().

> +
> +A write of the data port will set the current byte of the firmware
> +configuration item.  A write access will not impact the firmware
> +configuration data if the selected index did not have bit14 set.
> +
> +== x86, x86-64 Ports ==
> +
> +I/O  Index Port: 0x510
> +I/O  Data  Port: 0x511
> +MMIO Index Port: N/A
> +MMIO Data  Port: N/A

Again, these are now mutually exclusive, as provided by the
qemu-internal API. (Of course insane board code could circumvent that,
but it won't.)

> +
> += Firmware Configuration Items =
> +
> +== Ranges ==
> +
> +There are up to 0x4000 generic firmware configuration items, and up to
> +0x4000 architecturally specific firmware configuration items.
> +
> +Index Port Range  Range Usage
> +----------------  -----------
> +0x0000 - 0x3fff   Generic Items (0x0000 - 0x3fff) (Read-only)
> +0x4000 - 0x7fff   Generic Items (0x0000 - 0x3fff) (Read+Write)
> +0x8000 - 0xbfff   Architecture Specific Items (0x0000 - 0x3fff) (Read-only)
> +0xc000 - 0xffff   Architecture Specific Items (0x0000 - 0x3fff) (Read+Write)
> +
> +== Data Items Format ==
> +
> +uint8_t : 8-bit unsigned integer
> +uint16_t: 16-bit unsigned integer
> +uint32_t: 32-bit unsigned integer
> +uint64_t: 64-bit unsigned integer
> +n bytes : byte array of length n
> +array   : byte array of a format specific size
> +string  : null byte terminated ascii string
> +
> +All integer data is accessed with the least significant byte first and
> +then proceeding to more significant bytes on subsequent accesses.
> +

This section needs to be reworked. As I indicated above, blob
interpretation is specific to the individual selector key.

It *is* true that the following internal helper functions:
- fw_cfg_add_i16
- fw_cfg_add_i32
- fw_cfg_add_i64

imbue the blob in question with little endian encoding (they call
cpu_to_leXX internally, before calling fw_cfg_add_bytes()). However, the
term "All integer data" is simply too generic. Pedantically, it should say

  All fw_cfg blobs added with the fw_cfg_add_iXX() functions carry
  integer values of the indicated size, with little-endian internal
  encoding.

A bunch of other "integer data" exists, as part of different kinds of
blobs, and their encodings are arbitrary.

For example, the FWCfgFiles structure (the fw_cfg file directory blob)
that I mentioned above is exposed to the guest directly, and the
following fields are integers, and have big endian encoding (see
fw_cfg_add_file_callback()):

- FWCfgFiles.count
- FWCfgFiles.f[*].size
- FWCfgFiles.f[*].select

> +== Generic Items ==
> +
> +Index  Data Type  Data Meaning
> +-----  ---------  ------------
> +0x00   4 bytes    Signature - 'Q', 'E', 'M', 'U'
> +0x01   uint32_t   ID
> +0x02   16 bytes   System UUID
> +0x03   uint64_t   RAM Size
> +0x04   uint16_t   0 indicates graphics mode, otherwise non-graphics mode
> +0x05   uint16_t   The number of SMP CPUs
> +0x06   uint16_t   Machine ID
> +0x07   uint32_t   Kernel Address
> +0x08   uint32_t   Kernel Size
> +0x09   string     Kernel command line
> +0x0a   uint32_t   Initrd Address
> +0x0b   uint32_t   Initrd Size
> +0x0c   uint16_t   Boot Device
> +0x0d   array      NUMA Data
> +0x0e   uint16_t   Boot Menu
> +0x0f   uint16_t   The maximum number of CPUs (hotpluggable)
> +0x10   uint32_t   Kernel Entry
> +0x11   array      Kernel Data
> +0x12   array      Initrd Data
> +0x13   uint32_t   Command Line Address
> +0x14   uint32_t   Command Line Size
> +0x15   string     Command Line Data
> +0x16   uint32_t   Setup Address
> +0x17   uint32_t   Setup Size
> +0x18   array      Setup Data
> +0x19   array      File Directory

For all of the uintXX_t blobs, their encoding should be pointed out.
(Assuming they are all added with the fw_cfg_add_iXX() functions, they
can be advertised as little endian.)

"string" should be explained as \0-terminated array.

Also, selector key 0x01 ("ID") is now considered an "interface revision
/ feature bitmap". (With value 0 ATM.)

> +
> +=== File Directory Structure ===
> +
> +Note: Integers in the file directory structure (index 0x19) are stored
> +in big-endian format regardless of the host or guest endianness.  This
> +means that the first byte read of the integer is its most significant
> +byte.

I like to separate the transfer of a blob from the interpretation of a
blob more than the above. It's okay with the 1-byte wide ioport, but
with the wide memory-mapped data register, it very easily confuses
people (including me, obviously). So, when talking about a blob's
interpretation, please avoid words like "read" and "write". Transfer is
complete at that point.

> +
> +The structure begins with a uint32_t in big-endian format.
> +This number indicates the number of files that are available.
> +
> +Offset  Data Type     Data Meaning
> +------  ------------  ------------
> +0x00    uint32_t(be)  File count
> +0x04    array         Array of file instance structures.  The total length
> +                      of this array is 64-bytes * file_count.
> +
> +As shown above, following the initial file count uint32_t,
> +there is an array of structures.  Each structure is 64-bytes
> +in size.  The number of instances of this structure in the
> +array is given by the initial uint32_t data value read.
> +Each structure within the array has this format:
> +
> +Offset  Data Type     Data Meaning
> +------  ------------  ------------
> +0x00    uint32_t(be)  File size
> +0x04    uint16_t(be)  Firmware configuration entry index for file data

(selector value or selector key would be the preferred term -- they are
not inherently superior, it's just what I used in the kernel documentation.)

> +0x06    2 bytes       2 reserved bytes
> +0x08    56 bytes      File name as a null terminated ascii string

"NUL-terminated".

> +
> +== x86, x86-64 Architecture Items ==
> +
> +As architecture specific items, these items should be accessed
> +starting at 0x8000 for reading or 0xc000 for reading and writing.
> +
> +Index  Data Type  Data Meaning
> +-----  ---------  ------------
> +0x00   array      ACPI Tables Data
> +0x01   array      SMBIOS Data
> +0x02   uint8_t    IRQ0 Override
> +0x03   array      E820 Table
> +0x04   array      HPET Data

Urgh. No clue about the last three, but I certainly wouldn't advertise
the first two. I think we have much better interfaces now for ACPI and
SMBIOS than the legacy ones above. (And they are exposed as fw_cfg
files, not with fixed selector keys.)

Thanks
Laszlo

Reply via email to