Re: [OpenOCD-devel] STM32F2/4/7 Flash programming

Andreas Bolsch Wed, 28 Feb 2018 08:11:46 -0800

These figures are quite surprising. I've made a lot of benchmarks with a pile of discovery boards, mainly F4 and F7, some L4. Since my focus was on external SPI flash, I did not record the results for the internal flash, but as far as I recall, both programming *AND* reading (via write_bank, read_bank) with the algorithm approach gave approx. 120-160 kBytes/s.

To be sure I just did the following tests (openOCD, current head, integrated ST-Link v2-1, 4 MHz SWD clock):

nucleo-f767zi, 2 MByte random data: prog: 140 kBytes/s, read: 150 kBytes/s

disco-f412g, 1 MByte random data: prog: 134 kBytes/s, read: 158 kBytes/s

Then STM32CubeProgrammer (defaults, Linux host, integrated ST-Link v2-1):

disco-f412g, 1 MByte random data: prog. 133 kBytes/s, read: 150 kBytes/s

And finally openOCD with algorithm disabled, anything else as before:

disco-f412g, 1 MByte random data: prog. 1 kByte/s (yes, no kidding, ONE!)

All tests above with SWD, not JTAG. Some weeks ago I did tests with ST-Link reflashed to JLink and JLink v8, JLink v9, but the results were rather disappointing. Some quite the same speed, some slightly to moderately slower when compared to ST-Link. And some tests with ST-Link v2 clones, they gave roughly the same speed as the integrated ST-Link.

The surprising fact is that I got the very same limit (almost precisely 150 kBytes/s) for external SPI flash programming and reading (both with the QSPI interface and bitbanging SPI). This apparently indicates that the "real" programming time has almost no impact on the observed speed. What matters is the transport via USB, the ST-Link adapter and SWD clock.

The datasheet for f767 says typ. 16 us per programming operation, so 8 us per byte for parallelism 16 or 125 kBytes/s. I. e. openOCD already operates at the hardware imposed limit, and the programming time is almost completely absorbed by the data transfer. Quite excellent, I'd say.
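The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are made up here, and 16 us is the typical value quoted above, not a guaranteed one.

```python
def flash_limit_bytes_per_s(t_prog_us: float, parallelism_bits: int) -> float:
    """Upper bound on programming throughput from flash timing alone.

    Hypothetical helper: t_prog_us is the time per programming operation,
    parallelism_bits the number of bits written per operation.
    """
    bytes_per_op = parallelism_bits / 8      # x16 parallelism -> 2 bytes per operation
    us_per_byte = t_prog_us / bytes_per_op   # 16 us / 2 bytes = 8 us per byte
    return 1e6 / us_per_byte                 # bytes per second


limit = flash_limit_bytes_per_s(t_prog_us=16, parallelism_bits=16)
print(f"{limit / 1000:.0f} kBytes/s")  # prints "125 kBytes/s"
```

The same function gives the flash-side bound for other parallelism settings, which says nothing about what the adapter and USB transport can actually deliver.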

That the direct register approach is quite slow isn't surprising. That's like playing ping-pong over USB for every single bit. The main benefit of the algorithm approach is that data transport and programming ("real" programming with CPU stall) run simultaneously. Of course, this can only work smoothly if the programming adapter does support this "streaming" approach, so it won't work reasonably well with a low-level adapter.
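A very crude model of the two approaches, in Python, may make the difference concrete. Everything fed into it is an assumption for illustration: the round-trip latency and the number of round trips per operation are guesses rather than measurements, the function names are hypothetical, and real adapters batch and queue in ways this ignores.

```python
def direct_rate(bytes_per_op: int, round_trips_per_op: int, round_trip_s: float) -> float:
    """Direct register access: each operation waits for all of its USB round trips."""
    return bytes_per_op / (round_trips_per_op * round_trip_s)


def streaming_rate(transport_rate: float, flash_rate: float) -> float:
    """Algorithm approach: transfer and programming overlap, so the slower stage dominates."""
    return min(transport_rate, flash_rate)


# Assumed values, for illustration only (not measured).
ROUND_TRIP_S = 1e-3       # assumed USB round-trip latency per debug access
ROUND_TRIPS_PER_OP = 2    # assumed: one data write plus one status poll
BYTES_PER_OP = 2          # x16 parallelism, 2 bytes per programming operation

print(direct_rate(BYTES_PER_OP, ROUND_TRIPS_PER_OP, ROUND_TRIP_S))   # bytes/s
print(streaming_rate(transport_rate=150e3, flash_rate=125e3))        # bytes/s
```

The point of the sketch is only the shape of the two formulas: in the direct case latency is paid once per operation, while in the streaming case it is hidden and throughput is set by whichever of transport or flash is slower.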

Regarding the parallelism, I'd suggest leaving it at its current default, i. e. 16. Anything else would be a pitfall for the unaware user. The assumption that most users will use a 2.4V to 3.3V supply is still valid, I guess. If it were configurable, 32 wouldn't give substantially higher speed (well, at least if a "good" programming adapter is used) anyway.

BTW: "parallelism" apparently means ***maximum*** parallelism, cf. rm0081, 1.5.2:

"Parallelism is the maximum number of bits that may be programmed to 0 in one step during a program or erase operation. The maximum program/erase parallelism is limited by the supply voltage and by whether the external VPP supply is used or not...."

Hence "limited by" actually means "limited ***above*** by", and the table indicates the maximum allowed value, not the exact value to use.


On 2018-02-27 21:50, Christopher Head wrote:

As for performance, I have two data points so far.

First, using a ByteBlaster clone, I was able to achieve about 6
kilobytes per second using the algorithm and about 10 using optimized
direct programming (the original direct code got about 3).

Second, using an Olimex ARM-USB-TINY-H (FTDI-based), I had to reduce
the JTAG clock *massively* in order to get the algorithm approach to
even work at all (otherwise it would see a mix of timeout waiting for
algorithm and debug regions unpowered), but optimized direct
programming at the default 2 MHz JTAG clock got me 30 kilobytes per
second, much more than the algorithm approach at the reduced clock
speed.

Both of the above tests were made at 16× parallelism. Repeating the
Olimex test with the optimized direct code at 32× parallelism yielded
84 kilobytes per second.



Re: [OpenOCD-devel] STM32F2/4/7 Flash programming
