Hello all,

I've been reviewing Mynewt's manufacturing image (mfgimage) support.  I
think there are some shortcomings that we need to address.  I'll
attempt to explain the problems that I see and propose some fixes.
This is going to be a long email, so brace yourselves for a thrilling
read.

### CURRENT IMPLEMENTATION

First, allow me to summarize what a Mynewt mfgimage is:

An mfgimage is the set of binaries that get written to a device at
manufacturing time.  Unlike a Mynewt target, which corresponds to a
single executable image, an mfgimage represents the entire flash
contents.  Typically, an mfgimage consists of:

    * 1 boot loader.
    * 1 or 2 Mynewt images.
    * Extra configuration (e.g., a pre-populated sys/config region).

In addition, each mfgimage contains a manufacturing meta region (MMR).
The MMR consists of read-only data that resides in flash for the
lifetime of the device.  There is currently support for two MMR TLV
types:

    * Hash of mfgimage
    * Flash map

The manufacturing hash indicates which manufacturing image a device
was built with.  A management system may need this information to
determine which images a device can be upgraded to, for example.  A
Mynewt device exposes its manufacturing hash via the `id/mfghash`
config setting.

Since the MMR is not intended to be modified or erased, it is placed in
an unmodifiable part of flash: the end of the boot loader area.

So far, it probably sounds like an mfgimage comprises only a single
binary file.  However, an mfgimage is actually a *set* of binaries -
one binary per flash component.  If a device has internal flash as well
as an external SPI flash, for example, then its mfgimage would consist
of two binary files.

An mfgimage is defined in an `mfg.yml` file.  Below is an example
`mfg.yml` file:

    # Sample `mfg.yml` file:
    mfg.bootloader: 'targets/boot-nrf52dk'

    mfg.images:
        - 'targets/slinky-nrf52dk'
        - 'targets/bleprph-nrf52dk'

    mfg.raw:
        - offset: 0x00080000
          file: mfgdata.raw

        - offset: 0x00090000
          file: file-system.raw

### SHORTCOMINGS

The current mfgimage implementation makes a big assumption: all flash
components get programmed at the same time, more or less.  A single mfg
definition is used to create the full set of binaries.  It is not
possible to change only one binary in the set without creating an
entirely new mfgimage.

This assumption does not align with reality.  Often, the various flash
components are programmed by different parties without coordination.
For example, the internal flash is typically programmed by the MCU
vendor before the chips are shipped, while the external flash might be
programmed on the factory floor.  What if, during production, a defect
in the external flash contents is discovered?  Simply sending a patched
binary to the factory violates one of our requirements: we need to be
able to determine exactly how each device in the field was
manufactured.  Currently, the hash in a device's MMR tells us which
mfgimage it was built with, but the MMR is in the boot loader area in
internal flash, the part that the MCU vendor programs!  With the
current system, a new mfgimage needs to be created and built, the new
internal binary sent to the MCU vendor, and the new external binary
sent to the factory.  And this all needs to be done seamlessly - the
factory needs to switch to the new external binary only for MCUs
containing the new internal binary.  Suffice it to say, this is not a
practical solution.

What we really need is a way to update the external flash contents
without a corresponding change to the internal binary.  The MCU vendor
could continue flashing chips with the original binary they were
provided, and the factory could immediately switch over to their new
binary without needing to coordinate.  We need this to be possible
while sticking to one of our original requirements: it must be possible
to determine exactly how each device was programmed during
manufacturing.

### PROPOSAL

# SUMMARY

The short version is:

    * An mfgimage spans one, and only one, flash component.
    * A device can be built with multiple mfgimages.
    * Each mfgimage has its own MMR.
    * A device's manufacturing identity (mfgid) is a *list* of all
      mfgimages the device was built with.

In the example above, there would be two mfgimages: "internal" and
"external".  The internal binary would be sent to the MCU vendor; the
external binary would be sent to the factory.  When a defect in the
external binary is discovered, a new external mfgimage would be
produced (with a new hash) and sent to the factory.  Devices
manufactured after this point would use the new external binary, and
continue to use the original internal binary.  Since a device's mfgid
is the full list of mfgimages it was built with, devices built after
the fix would be distinguishable from those affected by the defect.
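
To make the benefit concrete, here is a rough sketch of how a management
system might pick out the devices built with a defective mfgimage once the
mfgid is a list.  The function name, the device serials, and the short hash
placeholders are all hypothetical; real entries would be full hashes.

```python
def affected_devices(devices, bad_hash):
    """Return the serials of devices whose mfgid contains `bad_hash`.

    `devices` maps a device serial to its mfgid, i.e. the list of
    mfgimage hashes the device was built with (hypothetical data model).
    """
    return sorted(serial for serial, mfgid in devices.items()
                  if bad_hash in mfgid)


# Placeholder hashes: one internal mfgimage, two external revisions.
fleet = {
    "dev-001": ["int-v1", "ext-v1"],  # built before the external fix
    "dev-002": ["int-v1", "ext-v2"],  # built after the external fix
}
print(affected_devices(fleet, "ext-v1"))
```

Both devices share the same internal hash, so only the external entry in
the list tells them apart.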

# MFG HASH

Each mfgimage has its own MMR containing a hash.

The MMR at the end of the boot loader area (now called the "boot MMR")
must be present. The boot MMR indicates the flash locations of other
MMRs via the `mmr` TLV type.

At startup, the firmware reads the boot MMR (as before). Next, it reads
any additional MMRs indicated by `mmr` TLVs. An `mmr` TLV contains two
fields:

    * flash device ID
    * flash offset

The referenced MMR can be found at the specified location.
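
The lookup sequence could be modeled roughly as below.  This is only a
sketch in Python, not firmware code: the TLV type codes, the
one-byte-type/one-byte-length header, the 0xFF terminator, and the
encoding of the `mmr` TLV (1-byte device ID, 4-byte little-endian offset)
are all assumptions made for illustration.  Each flash device is modeled
as a `bytes` object.

```python
import struct

# Hypothetical TLV type codes; the real values are not specified here.
TLV_HASH = 0x01
TLV_FLASH_MAP = 0x02
TLV_MMR_REF = 0x04
TLV_END = 0xFF  # erased flash byte, treated as the end of an MMR


def parse_tlvs(flash, offset):
    """Yield (type, value) pairs from the TLV run starting at `offset`."""
    while offset + 2 <= len(flash) and flash[offset] != TLV_END:
        tlv_type, length = flash[offset], flash[offset + 1]
        yield tlv_type, flash[offset + 2:offset + 2 + length]
        offset += 2 + length


def collect_hashes(flashes, boot_dev, boot_off):
    """Read the boot MMR first, then every MMR it references, in order."""
    hashes, refs = [], []
    for tlv_type, value in parse_tlvs(flashes[boot_dev], boot_off):
        if tlv_type == TLV_HASH:
            hashes.append(value)
        elif tlv_type == TLV_MMR_REF:
            # Assumed encoding: device ID (1 byte), offset (4 bytes, LE).
            refs.append(struct.unpack("<BI", value))
    for dev, off in refs:
        for tlv_type, value in parse_tlvs(flashes[dev], off):
            if tlv_type == TLV_HASH:
                hashes.append(value)
    return hashes
```

In this sketch only the boot MMR is scanned for `mmr` TLVs, which matches
the description above: the boot MMR is the single index of all other MMRs.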

After all MMRs have been read, the firmware populates the `id/mfghash`
setting with a colon-separated list of hashes. By reading and parsing
this setting, a client can derive the full list of mfgimages that the
device was built with.
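
As a sketch of both sides of this exchange, the helpers below build and
parse such a value.  The function names are hypothetical, and rendering
each hash as lowercase hex with the boot mfgimage's hash first is an
assumption; the proposal only fixes the colon separator.

```python
def format_mfghash(hashes):
    """Join raw hashes into a colon-separated `id/mfghash` value."""
    return ":".join(h.hex() for h in hashes)


def parse_mfghash(setting):
    """Split an `id/mfghash` value back into a list of raw hashes."""
    return [bytes.fromhex(part) for part in setting.split(":") if part]


# Two shortened placeholder hashes; real ones would be full sha256 digests.
value = format_mfghash([bytes.fromhex("a1b2"), bytes.fromhex("c3d4")])
print(value)
print(parse_mfghash(value))
```

A device with a single mfgimage would produce a value with no colon, so
existing clients that expect one hash would keep working in that case.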

There are a few annoying implications:

    1. A flash sector for each extra MMR must be reserved in the BSP's
       flash map.
    2. The boot mfgimage has to know the location of the other MMRs.
       If an MMR is moved to a different flash offset, a new boot
       mfgimage with the updated offset must be built.

# SAMPLE `MFG.YML` FILES

Below are sample `mfg.yml` files for hypothetical "internal" and
"external" mfgimages.

    # internal/mfg.yml
    mfg.targets:
        # Include the boot loader only.
        -
            name: 'targets/boot-nrf52dk'
            area: FLASH_AREA_BOOTLOADER
            offset: 0

    mfg.meta:
        # The MMR is placed at the end of the boot loader area.
        area: FLASH_AREA_BOOTLOADER
        offset: end

        # Include a TLV containing the sha256 of the mfg image.
        hash: true

        # Point to MMRs of other mfgimages so that the firmware knows
        # where to find them.
        mmrs:
            -
                area: FLASH_AREA_EXT_MMR
                offset: 0


    # external/mfg.yml
    mfg.targets:
        # Include two images.
        -
            name: 'targets/slinky-nrf52dk'
            area: FLASH_AREA_IMAGE_0
            offset: 0
        -
            name: 'targets/bleprph-nrf52dk'
            area: FLASH_AREA_IMAGE_1
            offset: 0

    mfg.meta:
        # The MMR is placed at the start of the "external MMR" flash
        # area.
        area: FLASH_AREA_EXT_MMR
        offset: 0

        # Include a TLV containing the sha256 of the mfg image.
        hash: true

        # Include a flash map TLV.
        flash_map: true
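
Because the boot mfgimage has to know where the other MMRs live, a build
tool could cross-check the two definitions.  The sketch below is
hypothetical and not part of any existing tool: it takes the `mfg.yml`
contents as already-parsed Python dicts that mirror the samples above,
and reports MMR locations that the boot mfgimage does not point to.

```python
def missing_mmr_refs(boot_mfg, other_mfgs):
    """Return MMR locations in `other_mfgs` that `boot_mfg` does not reference."""
    meta = boot_mfg["mfg.meta"]
    declared = {(m["area"], m["offset"]) for m in meta.get("mmrs", [])}
    missing = []
    for mfg in other_mfgs:
        loc = (mfg["mfg.meta"]["area"], mfg["mfg.meta"]["offset"])
        if loc not in declared:
            missing.append(loc)
    return missing


# Dicts mirroring the `mfg.meta` sections of the two sample files.
internal = {"mfg.meta": {"area": "FLASH_AREA_BOOTLOADER", "offset": "end",
                         "hash": True,
                         "mmrs": [{"area": "FLASH_AREA_EXT_MMR", "offset": 0}]}}
external = {"mfg.meta": {"area": "FLASH_AREA_EXT_MMR", "offset": 0,
                         "hash": True, "flash_map": True}}
print(missing_mmr_refs(internal, [external]))
```

If the external MMR were moved to another offset without rebuilding the
boot mfgimage, the check would return that new location.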

All comments welcome.

Thanks,
Chris
