Ross Younger wrote:
I've included the current version of my proposed design, although following
the conversation with Simon this week it has occurred to me that it might
well be reasonable to fillet out the core idea and make it available as a
simple proto-filesystem. (Essentially it would be a block device with very
large blocks. The high-level operations provided would be reading and
writing up to a blockful, and erasing a whole block; this doesn't seem to
fit well with the eCos block device model, so I suspect it would be better
off as its own interface.) Please do speak up if that would be a useful
development, and if so I'll happily rework this proposal into two parts as
time allows.
I think there would be merit in doing something like that. I agree a block
device interface wouldn't be appropriate.
============================================================================
NOR-in-NAND design (3/7/09)
1. Scenario and assumptions:
We want RedBoot to be able to use NAND flash as if it was NOR flash,
for FIS and fconfig.
We don't care about supporting non-RB apps, so can rely on RB's behaviour.
Not sure that is wise. And I'm not sure what specific behaviour you would
wish to rely on. I wouldn't make the code itself dependent on RedBoot in
any case. That's what got us into a mess with the v1 flash drivers.
Ultimately the RedBoot FIS approach is not something we'd like to continue
with - it was designed for the common case of the late '90s/early '00s of
single NOR flash parts (and in some aspects, isn't so great even for then,
such as not being able to support bootblocks properly). That code has
limited lifespan, and many flaws. It's better to do something that can
last beyond that.
I can certainly see this layer being used for a non-RedBoot boot loader.
Lots of people may use RedBoot during development, but this sort of layer
is useful for any boot loader. Far fewer people want to use RedBoot for
their final product (outside of development) - they just want to load
their apps and go. I'm not suggesting writing a boot loader to do this
now, but I do believe it would be a mistake to think that people won't
want to.
Analysis:
* fconfig-in-FIS always calls FIS code.
* fconfig not in FIS always erases before programming.
* FIS always erases before it programs, only deals with block-aligned
regions, and never rewrites within such a region. [+]
Design goals:
* Clean and simple, in keeping with the eCos philosophy.
* Robust: copes with blocks going bad and indulges in some sort of
wear-levelling.
* No use of malloc. (As we're targeting only RB, we can use the workspace.)
The workspace is a very crude way to allocate memory, but if you can
constrain yourself to it, so much the better.
The number of blocks available - i.e. (1 + maximum valid logical block
number) - will be computed as:
* Number of physical blocks in the chip or partition,
* minus the number of factory bad blocks,
* minus one (to allow for robust block rewrites),
* minus an allowance for bad blocks to develop during the life
of the device.
Of course a (NOR) flash driver has to specify a constant device size,
whereas the bad blocks are board specific, so really you have to have an
allowance for bad blocks full stop.
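For concreteness, the computation might be sketched like this in C. The
function name and the fixed lifetime-allowance policy are invented for the
example; per the point above, in practice the allowance would have to cover
factory bad blocks as well if the layer has to report a constant size.

    #include <stdint.h>

    /* Illustrative only: derive the number of logical blocks exposed by
     * the NOR-in-NAND layer from a partition's geometry. */
    static uint32_t
    nor_in_nand_logical_blocks(uint32_t physical_blocks,
                               uint32_t factory_bad_blocks)
    {
        uint32_t lifetime_allowance, reserved;

        /* Headroom for blocks that go bad in service; 2% (minimum 2) is
         * an assumed policy for the sake of the example. */
        lifetime_allowance = physical_blocks / 50;
        if (lifetime_allowance < 2)
            lifetime_allowance = 2;

        reserved = factory_bad_blocks    /* already marked bad        */
                 + 1                     /* spare for robust rewrites */
                 + lifetime_allowance;   /* expected to fail later    */

        return (physical_blocks > reserved) ? (physical_blocks - reserved)
                                            : 0;
    }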
The NOR-in-NAND layer will be making multiple-page reads and writes;
it may be worthwhile to put code to do this into the NAND layer, as
opposed to forcing all users to reinvent the wheel.
Sounds like a better approach.
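As a rough illustration, such a helper in the NAND layer could look
something like the following. The cyg_nandlib_* names and signatures are
placeholders, not the actual eCos NAND library API.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical single-page primitive assumed to exist in the NAND
     * layer; name and signature are placeholders. */
    extern int cyg_nandlib_read_page(void *dev, uint32_t page, void *buf,
                                     size_t page_size);

    /* Sketch of a multi-page read helper living in the NAND layer itself,
     * so that each client doesn't have to reinvent the loop. */
    static int
    cyg_nandlib_read_pages(void *dev, uint32_t first_page, uint32_t npages,
                           uint8_t *buf, size_t page_size)
    {
        uint32_t i;
        for (i = 0; i < npages; i++) {
            int rc = cyg_nandlib_read_page(dev, first_page + i,
                                           buf + (size_t)i * page_size,
                                           page_size);
            if (rc < 0)
                return rc;   /* propagate ECC/IO errors to the caller */
        }
        return 0;
    }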
NAND blocks are used as a dumb datastore, with their physical addresses
bearing no relation to their logical addresses. In-use blocks are tagged
in the OOB area with the logical block number they refer to (see below).
Physical storage blocks are used sequentially from the beginning of
the device, then cycling back to the start to reuse erased blocks. This
is managed by maintaining a next-write "pointer". At runtime, this is
the next vacant block after the last written block; at boot time, it
is initialised by scanning the filesystem to find the block with the
highest serial number, which is taken to be the latest-written block.
That sounds potentially painful for non-trivial numbers of blocks, unless
you are going to use a large logical block size. More below.
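For reference, a minimal sketch of that boot-time initialisation, assuming
the tag layout from 2.1.1 below and a hypothetical nin_read_tag() helper
that fetches a block's tag from its first page's OOB area:

    #include <stdint.h>

    /* Illustrative OOB tag, matching the layout described in 2.1.1. */
    struct nin_tag {
        uint16_t magic;     /* 0xEF15               */
        uint16_t logical;   /* logical block number */
        uint32_t serial;    /* master serial number */
    };

    /* Hypothetical: returns 0 on success, <0 if the physical block is
     * bad, erased or not one of ours. */
    extern int nin_read_tag(void *dev, uint32_t pblock, struct nin_tag *tag);

    /* Boot-time scan: find the block carrying the highest serial number
     * (taken to be the latest-written block); the next-write pointer
     * starts just after it. Skipping bad or in-use blocks when advancing
     * the pointer is omitted here for brevity. */
    static uint32_t
    nin_find_next_write(void *dev, uint32_t nblocks)
    {
        uint32_t pblock, latest = 0, best_serial = 0;
        int seen = 0;

        for (pblock = 0; pblock < nblocks; pblock++) {
            struct nin_tag tag;
            if (nin_read_tag(dev, pblock, &tag) < 0 || tag.magic != 0xEF15)
                continue;
            if (!seen || tag.serial > best_serial) {
                best_serial = tag.serial;
                latest = pblock;
                seen = 1;
            }
        }
        return seen ? (latest + 1) % nblocks : 0;
    }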
This scheme is much better than a simple block mapping due to its
robustness: it intrinsically provides reasonable wear-levelling,
Not much if most/all blocks are used. I suspect with the normal usage
pattern of flash under RedBoot, you wouldn't get much wear-levelling.
You'd need to start occasionally reallocating used blocks too to
wear-level properly, which admittedly probably wouldn't be /that/
difficult. That said, I think we should indeed probably just put up with
theoretically inadequate wear-levelling for now.
Conversely, note that MLC NAND may start to become more common, but it is
(I think) rated for ~10K cycles which may increase the need for more
thorough wear-levelling.
2.1.1 OOB tag format
The tag is written into the OOB area of the first page of each block
(as "application" OOB data, from the NAND library's point of view).
The tag is a packed byte array, in processor-local endian, with the
following contents:
* Magic number - 2 bytes, 0xEF15. (This is a compile-time constant
and demonstrates that the block is one of ours. It's a corruption
of "eCos FIS".)
* Logical block number - 2 bytes.
* Master serial number - 4 bytes (see below).
Given you say processor-local endian, I assume you mean there are two
16-bit words and a 32-bit word here.
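Assuming that reading - two 16-bit fields and one 32-bit field, in local
endianness - a packed C representation might look like this; the names are
illustrative only:

    #include <stdint.h>
    #include <string.h>

    #define NIN_TAG_MAGIC 0xEF15u

    /* 8-byte tag written to the OOB area of a block's first page, stored
     * in processor-local endianness as described above. */
    struct nin_oob_tag {
        uint16_t magic;     /* NIN_TAG_MAGIC: marks the block as ours */
        uint16_t logical;   /* logical block number this block holds  */
        uint32_t serial;    /* master serial number                   */
    } __attribute__((packed));

    /* Sketch: serialise the tag into the caller's OOB buffer. */
    static void
    nin_tag_pack(const struct nin_oob_tag *tag, uint8_t oob[8])
    {
        memcpy(oob, tag, sizeof(*tag));   /* local endian, per the design */
    }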
Given you are assuming partial-page writes, I think you can do something
more intelligent here to handle the seeking through NAND space that your
proposal entails for every read/write:
- For a start, the serial number seems potentially overkill unless I'm
missing something. All you need to know is whether a discovered logical
block number is the most recent version of it. The serial only needs to
reflect that block. When you write a new revision of a block, you mark the
previous one dead by overwriting it with a partial write (without
erasure). Thus you only have one valid version of a block at one time.
Duplicates are dealt with solely at initial device scan time (stomping on
the old one at that point). This way you only need 2 bits to represent the
serial, theoretically (as the difference between serials can only be 1, so
you can always tell which is older).
- That frees up space which we can use for potential optimisations. In
particular, the common use-case we are envisaging is wholly sequential
reads of fairly large images. So we could use 2 bytes to point to the next
block in the logical block chain. This is very useful if most use is
sequential. If that block turns out not to be the correct block number, we
lose very little and just revert to scanning the medium. Most times it
should be correct. (We could make this behaviour a CDL option anyway).
This does mean knowing which block will be used next at the time you are
writing the current block, but that doesn't seem to be much of an issue -
it's primarily just bringing forward a determination you'd have to make
anyway. None of this seems a particularly hard-to-implement optimisation
(both ideas are sketched below).
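A rough sketch of both ideas - a 2-bit serial, a 'dead' marker byte for the
partial-write obsoleting, and a next-block hint - follows. All names and
the exact field layout are invented for illustration; the real layout would
be whatever the freed-up OOB bytes allow.

    #include <stdint.h>

    /* Illustrative alternative tag using the freed-up bytes. */
    struct nin_oob_tag_v2 {
        uint16_t magic;      /* 0xEF15                                   */
        uint16_t logical;    /* logical block number                     */
        uint8_t  serial2;    /* 2-bit serial, values 0..3                */
        uint8_t  dead;       /* 0xFF = live; partially overwritten to 0
                              * (no erase) to mark the block obsolete    */
        uint16_t next_hint;  /* physical block expected to hold the next
                              * logical block in sequence                */
    };

    /* With a 2-bit serial the difference between two live copies of the
     * same logical block can only ever be 1, so modulo-4 arithmetic is
     * enough to decide which copy is newer. */
    static int
    nin_serial_is_newer(uint8_t a, uint8_t b)
    {
        return ((a - b) & 0x3u) == 1u;   /* true if a was written after b */
    }

    /* Hypothetical tag reader, as before. */
    extern int nin_read_tag_v2(void *dev, uint32_t pblock,
                               struct nin_oob_tag_v2 *tag);

    /* Sequential read path: try the hint first; if the hinted block does
     * not carry the wanted logical number, fall back to scanning. */
    static int
    nin_hint_matches(void *dev, uint16_t hint, uint16_t wanted_logical)
    {
        struct nin_oob_tag_v2 tag;
        if (nin_read_tag_v2(dev, hint, &tag) < 0)
            return 0;
        return tag.magic == 0xEF15 && tag.dead == 0xFF
               && tag.logical == wanted_logical;
    }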
I should note though that multiple writes are not supported on newer MLC
NAND flash. This could be an issue as this class of NAND may become more
common. Perhaps in that case an obsoleted block can just be erased
immediately.
Also, if scanning the medium for each block isn't as slow as I fear it may
be, then the above may be unnecessary (although there's still a good
argument for freeing up the OOB bits for later use, if they can be freed).
To be safe, we should impose an upper
limit on the number of physical NAND blocks that this system will use,
and hence cap the number of logical NOR blocks the system will support. I
suggest 1024, which ought to be enough for anybody; it's more than many
(most?) NOR chips.
The number of blocks should be configurable anyway, so I don't think we
need go beyond that surely? Setting a default of 1024 for such an option
should be adequate.
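Something like the following, say; the option name is invented for
illustration, and in a real package it would be a CDL option rather than a
bare #define:

    /* Illustrative compile-time cap with a 1024-block default. */
    #ifndef CYGNUM_NOR_IN_NAND_MAX_BLOCKS
    # define CYGNUM_NOR_IN_NAND_MAX_BLOCKS 1024
    #endif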
2.4 Runtime considerations
If a runtime performance boost was required, the system could on startup
scan the NAND partition and build up a cached mapping in-RAM of the
physical addresses of each (non-zeroed) logical block. However, we don't
expect this code will be used on gigantic NAND arrays [partitions],
Hmm. I'm not as certain about that. People like having lots of space to
play in, with e.g. multiple app versions or linux kernel images or root
fs's or initrd's to load etc. A linear scan will work ok on a brand new
board for a while - starting the scan for the next block from the current
block of course - but in due course, performance would deteriorate, mostly
irreversibly.
and it should only see light duty via RedBoot.
For a production system sure, but less so on a developer's board, with
apps frequently getting written/rewritten. In particular the FIS directory
updates will get interspersed frequently as a result which will cause
increasing fragmentation. Put it like this - if you're considering
something where wear-levelling is a concern (and for the mooted proto-fs,
that's certainly valid), then you'd definitely need to consider the time
spent scanning on every block read, as the block mappings will drift
further away from 1:1 logical to physical, and stop being linear.
Therefore it won't take
long to linear-scan for blocks during operations, so this optimisation
may not be worth its complexity.
And 4Kbytes RAM (for, say, 1024 blocks).
(Bart suggested that a scan at startup could also take care of the cases
above where more than one physical block is tagged with the same logical
block number, hence reducing the time and complexity of the block access
code. There's a startup time vs access time vs memory trade-off here,
though we haven't fully analysed it. Nick agreed that we ought to think
this through very carefully.)
I think that being forced as a matter of course to scan the whole medium
on _every_ block read is a bad thing and should be avoided.
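For what it's worth, a sketch of the startup-scan cache being discussed,
assuming the 2.1.1 tag format and a 1024-block cap; at 4 bytes per entry
that's the 4Kbytes mentioned above. nin_read_tag() is again hypothetical.

    #include <stdint.h>

    #define NIN_MAX_BLOCKS 1024u        /* illustrative cap, cf. CDL option */
    #define NIN_NO_BLOCK   0xFFFFFFFFu  /* logical block has no copy yet    */

    struct nin_tag {
        uint16_t magic;     /* 0xEF15               */
        uint16_t logical;   /* logical block number */
        uint32_t serial;    /* master serial number */
    };

    extern int nin_read_tag(void *dev, uint32_t pblock, struct nin_tag *tag);

    /* One scan at startup fills this table (~4Kbytes at 1024 entries);
     * thereafter a block read is a single lookup rather than a scan of
     * the medium. */
    static uint32_t nin_l2p[NIN_MAX_BLOCKS];

    static void
    nin_build_map(void *dev, uint32_t physical_blocks)
    {
        uint32_t i;
        for (i = 0; i < NIN_MAX_BLOCKS; i++)
            nin_l2p[i] = NIN_NO_BLOCK;

        for (i = 0; i < physical_blocks; i++) {
            struct nin_tag tag;
            if (nin_read_tag(dev, i, &tag) < 0 || tag.magic != 0xEF15)
                continue;
            if (tag.logical < NIN_MAX_BLOCKS)
                nin_l2p[tag.logical] = i;
            /* Duplicate logical numbers (stale revisions) would be
             * resolved here too, e.g. by comparing serials and stomping
             * on the loser, as Bart suggested. */
        }
    }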
Something you may want to think about is that there are problems that
RedBoot's FIS and config code have with multiple flash devices in a
system. This may happen, whether with multiple NANDs or a mixture of flash
types - possibly increasingly so these days. Work by myself and IIRC to
some extent Bart at eCosCentric has ameliorated problems somewhat, but not
fixed them. RedBoot's FIS code orients itself around a single flash
device, with a single fixed block size for the entirety of that device.
You may stumble across problems here as NAND may not be the only flash
device on many boards, so it's something to bear in mind.
Jifl
--
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine