Hi Chris, I read it all and indeed it was thrilling :)
I think this is a good idea and the right way to go. I just have a feeling that the internal mfgimage should be able to verify the external one somehow, to make sure the second factory did a good job. But maybe this is not needed, as the bootloader will do signature validation of the images inside the external mfgimage (if I recall correctly). Anyway, just a thought to consider.

Best
Łukasz

On Tue, 11 Dec 2018 at 00:14, Christopher Collins <ch...@runtime.io> wrote:
> Hello all,
>
> I've been reviewing Mynewt's manufacturing image (mfgimage) support. I
> think there are some shortcomings that we need to address. I'll
> attempt to explain the problems that I see and propose some fixes.
> This is going to be a long email, so brace yourselves for a thrilling
> read.
>
> ### CURRENT IMPLEMENTATION
>
> First, allow me to summarize what a Mynewt mfgimage is:
>
> An mfgimage is the set of binaries that get written to a device at
> manufacturing time. Unlike a Mynewt target, which corresponds to a
> single executable image, an mfgimage represents the entire flash
> contents. Typically, an mfgimage consists of:
>
> * 1 boot loader.
> * 1 or 2 Mynewt images.
> * Extra configuration (e.g., a pre-populated sys/config region).
>
> In addition, each mfgimage contains a manufacturing meta region (MMR).
> The MMR consists of read-only data that resides in flash for the
> lifetime of the device. There is currently support for two MMR TLV
> types:
>
> * Hash of mfgimage
> * Flash map
>
> The manufacturing hash indicates which manufacturing image a device
> was built with. A management system may need this information to
> determine which images a device can be upgraded to, for example. A
> Mynewt device exposes its manufacturing hash via the `id/mfghash`
> config setting.
>
> Since the MMR is not intended to be modified or erased, it is placed in
> an unmodifiable part of flash: the end of the boot loader area.
>
> So far, it probably sounds like an mfgimage comprises only a single
> binary file.
> However, an mfgimage is actually a *set* of binaries -
> one binary per flash component. If a device has internal flash as well
> as an external SPI flash, for example, then its mfgimage would consist
> of two binary files.
>
> An mfgimage is defined in an `mfg.yml` file. Below is an example
> `mfg.yml` file:
>
>     # Sample `mfg.yml` file:
>     mfg.bootloader: 'targets/boot-nrf52dk'
>
>     mfg.images:
>         - 'targets/slinky-nrf52dk'
>         - 'targets/bleprph-nrf52dk'
>
>     mfg.raw:
>         - offset: 0x00080000
>           file: mfgdata.raw
>
>         - offset: 0x00090000
>           file: file-system.raw
>
> ### SHORTCOMINGS
>
> The current mfgimage implementation makes a big assumption: all flash
> components get programmed at the same time, more or less. A single mfg
> definition is used to create the full set of binaries. It is not
> possible to change only one binary in the set without creating an
> entirely new mfgimage.
>
> This assumption does not align with reality. Often, the various flash
> components are programmed by different parties without coordination.
> For example, the internal flash is typically programmed by the MCU
> vendor before the chips are shipped, while the external flash might be
> programmed on the factory floor. What if, during production, a defect
> in the external flash contents is discovered? Simply sending a patched
> binary to the factory violates one of our requirements: we need to be
> able to determine exactly how each device in the field was
> manufactured. Currently, the hash in a device's MMR tells us which
> mfgimage it was built with, but the MMR is in the boot loader area in
> internal flash, the part that the MCU vendor programs! With the
> current system, a new mfgimage needs to be created and built, the new
> internal binary sent to the MCU vendor, and the new external binary
> sent to the factory. And this all needs to be done seamlessly - the
> factory needs to switch to the new external binary only for MCUs
> containing the new internal binary.
> Suffice it to say, this is not a practical solution.
>
> What we really need is a way to update the external flash contents
> without a corresponding change to the internal binary. The MCU vendor
> could continue flashing chips with the original binary they were
> provided, and the factory could immediately switch over to their new
> binary without needing to coordinate. We need this to be possible
> while sticking to one of our original requirements: it must be possible
> to determine exactly how each device was programmed during
> manufacturing.
>
> ### PROPOSAL
>
> # SUMMARY
>
> The short version is:
> * An mfgimage spans one, and only one, flash component.
> * A device can be built with multiple mfgimages.
> * Each mfgimage has its own MMR.
> * A device's manufacturing identity (mfgid) is a *list* of all
>   mfgimages the device was built with.
>
> In the example above, there would be two mfgimages: "internal" and
> "external". The internal binary would be sent to the MCU vendor; the
> external binary would be sent to the factory. When a defect in the
> external binary is discovered, a new external mfgimage would be
> produced (with a new hash) and sent to the factory. Devices
> manufactured after this point would use the new external binary, and
> continue to use the original internal binary. Since a device's mfgid
> is the full list of mfgimages it was built with, devices built after
> the fix would be distinguishable from those affected by the defect.
>
> # MFG HASH
>
> Each mfgimage has its own MMR containing a hash.
>
> The MMR at the end of the boot loader area (now called the "boot MMR")
> must be present. The boot MMR indicates the flash locations of other
> MMRs via the `mmr` TLV type.
>
> At startup, the firmware reads the boot MMR (as before). Next, it reads
> any additional MMRs indicated by `mmr` TLVs. An `mmr` TLV contains two
> fields:
>
> * flash device ID
> * flash offset
>
> The referenced MMR can be found at the specified location.
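
A minimal Python sketch of the startup sequence described just above, to make the lookup order concrete. It is only a model: the `Mmr` class, the `collect_mfg_hashes` function and the `(device ID, offset)` dictionary standing in for flash are all hypothetical names, and the actual firmware would read TLVs from flash rather than from Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class Mmr:
    """Hypothetical in-memory stand-in for one manufacturing meta region."""
    hash_hex: str
    # Each `mmr` TLV is modeled as a (flash device ID, flash offset) pair.
    mmr_refs: list = field(default_factory=list)


def collect_mfg_hashes(boot_mmr, mmr_at):
    """Return the boot MMR's hash, then the hash of each MMR it points to.

    `mmr_at` maps (flash device ID, flash offset) to the Mmr stored there;
    it stands in for reading the referenced flash location.
    """
    hashes = [boot_mmr.hash_hex]
    for dev_id, offset in boot_mmr.mmr_refs:
        # A missing entry raises KeyError: the boot MMR points at a
        # location where no MMR was found.
        hashes.append(mmr_at[(dev_id, offset)].hash_hex)
    return hashes
```

In this model, only the boot MMR carries `mmr` references, which matches the description above; whether referenced MMRs may themselves point to further MMRs is not stated, so the sketch does not follow them.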
>
> After all MMRs have been read, the firmware populates the `id/mfghash`
> setting with a colon-separated list of hashes. By reading and parsing
> this setting, a client can derive the full list of mfgimages that the
> device was built with.
>
> There are a few annoying implications:
>
> 1. A flash sector for each extra MMR must be reserved in the BSP's
>    flash map.
> 2. The boot mfgimage has to know the location of the other MMRs.
>    If an MMR is moved to a different flash offset, a new boot
>    mfgimage with the updated offset must be built.
>
> # SAMPLE `MFG.YML` FILES
>
> Below are sample `mfg.yml` files for hypothetical "internal" and
> "external" mfgimages.
>
>     # internal/mfg.yml
>     mfg.targets:
>         # Include the boot loader only.
>         -
>             name: 'targets/boot-nrf52dk'
>             area: FLASH_AREA_BOOTLOADER
>             offset: 0
>
>     mfg.meta:
>         # The MMR is placed at the end of the boot loader area.
>         area: FLASH_AREA_BOOTLOADER
>         offset: end
>
>         # Include a TLV containing the sha256 of the mfg image.
>         hash: true
>
>         # Point to MMRs of other mfgimages so that the firmware knows
>         # where to find them.
>         mmrs:
>             -
>                 area: FLASH_AREA_EXT_MMR
>                 offset: 0
>
>
>     # external/mfg.yml
>     mfg.targets:
>         # Include two images.
>         -
>             name: 'targets/slinky-nrf52dk'
>             area: FLASH_AREA_IMAGE_0
>             offset: 0
>         -
>             name: 'targets/bleprph-nrf52dk'
>             area: FLASH_AREA_IMAGE_1
>             offset: 0
>
>     mfg.meta:
>         # The MMR is placed at the start of the "external MMR" flash
>         # area.
>         area: FLASH_AREA_EXT_MMR
>         offset: 0
>
>         # Include a TLV containing the sha256 of the mfg image.
>         hash: true
>
>         # Include a flash map TLV.
>         flash_map: true
>
> All comments welcome.
>
> Thanks,
> Chris
>
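
On the client side, parsing the colon-separated `id/mfghash` value from the MFG HASH section could look roughly like the Python sketch below. It assumes each entry is a hex-encoded sha256 digest (64 hex characters); the hex encoding and the function name `parse_mfghash` are assumptions, not something the proposal specifies.

```python
def parse_mfghash(setting):
    """Split a colon-separated `id/mfghash` value into a list of hashes.

    Assumption: each entry is a hex-encoded sha256 digest (64 hex chars).
    """
    hashes = [part.strip().lower() for part in setting.split(":") if part.strip()]
    for h in hashes:
        if len(h) != 64 or any(c not in "0123456789abcdef" for c in h):
            raise ValueError("not a hex sha256 digest: %r" % h)
    return hashes
```

Two devices built with the same internal mfgimage but different external mfgimages would then share the first list entry and differ in the second, assuming the firmware always lists the boot MMR hash first.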