Hello all, I've been reviewing Mynewt's manufacturing image (mfgimage) support. I think there are some shortcomings that we need to address. I'll attempt to explain the problems that I see and propose some fixes. This is going to be a long email, so brace yourselves for a thrilling read.

### CURRENT IMPLEMENTATION

First, allow me to summarize what a Mynewt mfgimage is:

An mfgimage is the set of binaries that get written to a device at manufacturing time. Unlike a Mynewt target, which corresponds to a single executable image, an mfgimage represents the entire flash contents. Typically, an mfgimage consists of:

* 1 boot loader.
* 1 or 2 Mynewt images.
* Extra configuration (e.g., a pre-populated sys/config region).

In addition, each mfgimage contains a manufacturing meta region (MMR). The MMR consists of read-only data that resides in flash for the lifetime of the device. There is currently support for two MMR TLV types:

* Hash of mfgimage
* Flash map

The manufacturing hash indicates which manufacturing image a device was built with. A management system may need this information to determine which images a device can be upgraded to, for example. A Mynewt device exposes its manufacturing hash via the `id/mfghash` config setting.

Since the MMR is not intended to be modified or erased, it is placed in an unmodifiable part of flash: the end of the boot loader area.

So far, it probably sounds like an mfgimage comprises only a single binary file. However, an mfgimage is actually a *set* of binaries - one binary per flash component. If a device has internal flash as well as an external SPI flash, for example, then its mfgimage would consist of two binary files.

An mfgimage is defined in an `mfg.yml` file. Below is an example `mfg.yml` file:

    # Sample `mfg.yml` file:
    mfg.bootloader: 'targets/boot-nrf52dk'
    mfg.images:
        - 'targets/slinky-nrf52dk'
        - 'targets/bleprph-nrf52dk'
    mfg.raw:
        - offset: 0x00080000
          file: mfgdata.raw
        - offset: 0x00090000
          file: file-system.raw

### SHORTCOMINGS

The current mfgimage implementation makes a big assumption: all flash components get programmed at the same time, more or less. A single mfg definition is used to create the full set of binaries.
It is not possible to change only one binary in the set without creating an entirely new mfgimage.

This assumption does not align with reality. Often, the various flash components are programmed by different parties without coordination. For example, the internal flash is typically programmed by the MCU vendor before the chips are shipped, while the external flash might be programmed on the factory floor.

What if, during production, a defect in the external flash contents is discovered? Simply sending a patched binary to the factory violates one of our requirements: we need to be able to determine exactly how each device in the field was manufactured. Currently, the hash in a device's MMR tells us which mfgimage it was built with, but the MMR is in the boot loader area in internal flash, the part that the MCU vendor programs!

With the current system, a new mfgimage needs to be created and built, the new internal binary sent to the MCU vendor, and the new external binary sent to the factory. And this all needs to be done seamlessly - the factory needs to switch to the new external binary only for MCUs containing the new internal binary. Suffice it to say, this is not a practical solution.

What we really need is a way to update the external flash contents without a corresponding change to the internal binary. The MCU vendor could continue flashing chips with the original binary they were provided, and the factory could immediately switch over to their new binary without needing to coordinate. We need this to be possible while sticking to one of our original requirements: it must be possible to determine exactly how each device was programmed during manufacturing.

### PROPOSAL

# SUMMARY

The short version is:

* An mfgimage spans one, and only one, flash component.
* A device can be built with multiple mfgimages.
* Each mfgimage has its own MMR.
* A device's manufacturing identity (mfgid) is a *list* of all mfgimages the device was built with.
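
To make the last bullet concrete, here is a rough Python sketch of the idea, not of any actual newt or firmware code. It models an mfgid as a tuple of per-mfgimage SHA-256 hashes for a device with an "internal" and an "external" mfgimage. The helper names (`mfgimage_hash`, `mfgid`) and the placeholder byte strings are made up for illustration, and exactly which bytes the real hash covers is not modeled here.

```python
import hashlib


def mfgimage_hash(binary: bytes) -> str:
    """Return the SHA-256 hex digest identifying one mfgimage binary."""
    return hashlib.sha256(binary).hexdigest()


def mfgid(*binaries: bytes) -> tuple:
    """A device's mfgid: one hash per mfgimage it was built with."""
    return tuple(mfgimage_hash(b) for b in binaries)


# Placeholder contents standing in for real mfgimage binaries (hypothetical).
internal_v1 = b"internal flash contents"
external_v1 = b"external flash contents (defective)"
external_v2 = b"external flash contents (patched)"

before_fix = mfgid(internal_v1, external_v1)
after_fix = mfgid(internal_v1, external_v2)

# The internal entry is unchanged, so the MCU vendor keeps its binary...
assert before_fix[0] == after_fix[0]
# ...while the external entry differs, so the two batches can be told apart.
assert before_fix[1] != after_fix[1]
assert before_fix != after_fix
```

The point of the sketch is only that replacing one mfgimage changes one entry in the list and leaves the others alone.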

In the earlier example of a device with internal and external flash, there would be two mfgimages: "internal" and "external". The internal binary would be sent to the MCU vendor; the external binary would be sent to the factory. When a defect in the external binary is discovered, a new external mfgimage would be produced (with a new hash) and sent to the factory. Devices manufactured after this point would use the new external binary, and continue to use the original internal binary. Since a device's mfgid is the full list of mfgimages it was built with, devices built after the fix would be distinguishable from those affected by the defect.

# MFG HASH

Each mfgimage has its own MMR containing a hash.

The MMR at the end of the boot loader area (now called the "boot MMR") must be present. The boot MMR indicates the flash locations of other MMRs via the `mmr` TLV type.

At startup, the firmware reads the boot MMR (as before). Next, it reads any additional MMRs indicated by `mmr` TLVs. An `mmr` TLV contains two fields:

* flash device ID
* flash offset

The referenced MMR can be found at the specified location.

After all MMRs have been read, the firmware populates the `id/mfghash` setting with a colon-separated list of hashes. By reading and parsing this setting, a client can derive the full list of mfgimages that the device was built with.

There are a few annoying implications:

1. A flash sector for each extra MMR must be reserved in the BSP's flash map.
2. The boot mfgimage has to know the location of the other MMRs. If an MMR is moved to a different flash offset, a new boot mfgimage with the updated offset must be built.

# SAMPLE `MFG.YML` FILES

Below are sample `mfg.yml` files for hypothetical "internal" and "external" mfgimages.

    # internal/mfg.yml
    mfg.targets:
        # Include the boot loader only.
        - name: 'targets/boot-nrf52dk'
          area: FLASH_AREA_BOOTLOADER
          offset: 0
    mfg.meta:
        # The MMR is placed at the end of the boot loader area.
        area: FLASH_AREA_BOOTLOADER
        offset: end

        # Include a TLV containing the sha256 of the mfg image.
        hash: true

        # Point to MMRs of other mfgimages so that the firmware knows
        # where to find them.
        mmrs:
            - area: FLASH_AREA_EXT_MMR
              offset: 0

    # external/mfg.yml
    mfg.targets:
        # Include two images.
        - name: 'targets/slinky-nrf52dk'
          area: FLASH_AREA_IMAGE_0
          offset: 0
        - name: 'targets/bleprph-nrf52dk'
          area: FLASH_AREA_IMAGE_1
          offset: 0
    mfg.meta:
        # The MMR is placed at the start of the "external MMR" flash
        # area.
        area: FLASH_AREA_EXT_MMR
        offset: 0

        # Include a TLV containing the sha256 of the mfg image.
        hash: true

        # Include a flash map TLV.
        flash_map: true

All comments welcome.

Thanks,
Chris