These figures are quite surprising. I've run a lot of benchmarks on a pile of discovery boards, mainly F4 and F7, some L4. Since my focus was on external SPI flash, I did not record the results for the internal flash, but as far as I recall, both programming *AND* reading (via write_bank, read_bank) with the algorithm approach gave approx. 120-160 kBytes/s.

To be sure, I just ran the following tests (OpenOCD, current head, integrated ST-Link v2-1, 4 MHz SWD clock):
nucleo-f767zi, 2 MByte random data: prog: 140 kBytes/s, read: 150 kBytes/s
disco-f412g, 1 MByte random data: prog: 134 kBytes/s, read: 158 kBytes/s

Then STM32CubeProgrammer (defaults, Linux host, integrated ST-Link v2-1):
disco-f412g, 1 MByte random data: prog: 133 kBytes/s, read: 150 kBytes/s

And finally OpenOCD with the algorithm disabled, everything else as before:
disco-f412g, 1 MByte random data: prog: 1 kByte/s (yes, no kidding, ONE!)

All tests above were run with SWD, not JTAG. Some weeks ago I did tests with an ST-Link reflashed to JLink, and with JLink v8 and JLink v9, but the results were rather disappointing: some gave about the same speed, some were slightly to moderately slower than the ST-Link. I also ran some tests with ST-Link v2 clones, which gave roughly the same speed as the integrated ST-Link.

The surprising fact is that I got the very same limit (almost precisely 150 kBytes/s) for external SPI flash programming and reading (both with the QSPI interface and with bit-banged SPI). This apparently indicates that the "real" programming time has almost no impact on the observed speed. What matters is the transport via USB, the ST-Link adapter and the SWD clock.

The datasheet for the F767 says typ. 16 us per programming operation, so 8 us per byte for parallelism 16 (2 bytes per operation), or 125 kBytes/s. I.e. OpenOCD already operates at the hardware-imposed limit, and the programming time is almost completely absorbed by the data transfer. Quite excellent, I'd say.
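The arithmetic above can be sketched in a few lines of Python. The function name and parameters are made up for illustration; the only inputs are the typ. 16 us per operation and the parallelism of 16 bits mentioned above.

```python
def flash_limit_kbytes_per_s(t_prog_us: float, parallelism_bits: int) -> float:
    """Upper bound on programming throughput from the flash timing alone.

    t_prog_us:        typical time per programming operation, in microseconds
    parallelism_bits: number of bits written per programming operation
    """
    bytes_per_op = parallelism_bits / 8
    # bytes per microsecond equals MBytes/s; multiply by 1000 for kBytes/s
    return bytes_per_op * 1000 / t_prog_us


# 16 us per operation, 16 bits (2 bytes) per operation
print(flash_limit_kbytes_per_s(16, 16))  # 125.0
```

This ignores any transport overhead, so it is only the limit set by the flash itself, based on the typical (not maximum) programming time.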

That the direct register approach is quite slow isn't surprising. That's like playing ping-pong over USB for every single bit. The main benefit of the algorithm approach is that data transport and programming ("real" programming with CPU stall) run simultaneously. Of course, this can only work smoothly if the programming adapter supports this "streaming" approach, so it won't work reasonably well with a low-level adapter.
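To illustrate why overlapping transport and programming helps, here is a toy two-stage pipeline model in Python. It is only a sketch under simplifying assumptions (fixed per-chunk transfer and programming times, one chunk in flight per stage); it is not how OpenOCD or the ST-Link actually schedule things, and the function names are hypothetical.

```python
def serial_time(n_chunks: int, t_transfer: int, t_program: int) -> int:
    """Total time if each chunk is transferred, then programmed, one after another."""
    return n_chunks * (t_transfer + t_program)


def overlapped_time(n_chunks: int, t_transfer: int, t_program: int) -> int:
    """Total time if chunk i+1 is transferred while chunk i is being programmed.

    The first transfer and the last programming step cannot be hidden;
    in between, the slower of the two stages sets the pace.
    """
    if n_chunks == 0:
        return 0
    return t_transfer + t_program + (n_chunks - 1) * max(t_transfer, t_program)


# Made-up time units, purely illustrative: 4 chunks, transfer 5, program 3
print(serial_time(4, 5, 3))      # 32
print(overlapped_time(4, 5, 3))  # 23
```

In this model, once the transfer time per chunk exceeds the programming time, the total is dominated by the transfer, which would be consistent with seeing the same limit for internal and external flash.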

Regarding the parallelism, I'd suggest leaving the default as it currently is, i.e. 16. Anything else would be a pitfall for the unaware user. The assumption that most users will use a 2.4V to 3.3V supply is still valid, I guess. Even if it were configurable, 32 wouldn't give substantially higher speed anyway (well, at least if a "good" programming adapter is used).

BTW: "parallelism" apparently means ***maximum*** parallelism, cf. rm0081, 1.5.2:

"Parallelism is the maximum number of bits that may be programmed to 0 in one step during a program or erase operation. The maximum program/erase parallelism is limited by the supply voltage and by whether the external VPP supply is used or not. ..."

Hence "limited by" actually means "limited ***above*** by", and the table indicates the maximum allowed value, not the exact value to use.

On 2018-02-27 21:50, Christopher Head wrote:
As for performance, I have two data points so far.

First, using a ByteBlaster clone, I was able to achieve about 6
kilobytes per second using the algorithm and about 10 using optimized
direct programming (the original direct code got about 3).

Second, using an Olimex ARM-USB-TINY-H (FTDI-based), I had to reduce
the JTAG clock *massively* in order to get the algorithm approach to
even work at all (otherwise it would see a mix of timeout waiting for
algorithm and debug regions unpowered), but optimized direct
programming at the default 2 MHz JTAG clock got me 30 kilobytes per
second, much more than the algorithm approach at the reduced clock
speed.

Both of the above tests were made at 16× parallelism. Repeating the
Olimex test with the optimized direct code at 32× parallelism yielded
84 kilobytes per second.

_______________________________________________
OpenOCD-devel mailing list
OpenOCD-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openocd-devel
