These figures are quite surprising. I've made a lot of benchmarks with a
pile of discovery boards, mainly F4 and F7, some L4. Since my focus was
on external spi flash, I did not record the results for the internal
flash, but as far as I recall, both programming *AND* read (via
write_bank, read_bank) with the algorithm approach gave approx. 120-160
kBytes/s.
To be sure I just did the following tests (openOCD, current head,
integrated ST-Link v2-1, 4 MHz SWD clock):
  nucleo-f767zi, 2 MByte random data: prog: 140 kBytes/s, read: 150 kBytes/s
  disco-f412g, 1 MByte random data: prog: 134 kBytes/s, read: 158 kBytes/s
Then STM32CubeProgrammer (defaults, Linux host, integrated ST-Link
v2-1):
  disco-f412g, 1 MByte random data: prog: 133 kBytes/s, read: 150 kBytes/s
And finally openOCD with algorithm disabled, anything else as before:
  disco-f412g, 1 MByte random data: prog: 1 kByte/s (yes, no kidding, ONE!)
All tests above used SWD, not JTAG. Some weeks ago I did tests with an
ST-Link reflashed to JLink, and with JLink v8 and JLink v9, but the
results were rather disappointing: some gave about the same speed, some
were slightly to moderately slower than ST-Link. Some tests with ST-Link
v2 clones gave roughly the same speed as the integrated ST-Link.
The surprising fact is that I got the very same limit (almost precisely
150 kBytes/s) for external spi flash programming and reading (both with
the QSPI interface and bitbanging SPI). This apparently indicates that
the "real" programming time has almost no impact on the observed speed.
What matters is the transport via USB, the ST-Link adapter and SWD
clock.
The datasheet for the f767 says typ. 16 us per programming operation.
With parallelism 16, each operation writes 2 bytes, so that is 8 us per
byte, or 125 kBytes/s. I.e. openOCD already operates at the
hardware-imposed limit, and the programming time is almost completely
absorbed by the data transfer. Quite excellent, I'd say.
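The arithmetic above can be written out as a small Python sketch. The function name is made up for illustration, kBytes is taken as 1000 bytes, and the sketch assumes the typ. 16 us applies to one programming operation regardless of its width, which the datasheet figure quoted here only supports for the f767.

```python
def flash_limit_kbytes_per_s(t_prog_us: float, parallelism_bits: int) -> float:
    """Upper bound on programming throughput set by the flash itself.

    t_prog_us        -- typical time for one programming operation, in us
    parallelism_bits -- number of bits written per operation (8, 16, 32, ...)
    """
    bytes_per_op = parallelism_bits / 8
    # Bytes per microsecond is MBytes/s; multiply by 1000 to get kBytes/s.
    return bytes_per_op / t_prog_us * 1000


if __name__ == "__main__":
    # Assumption: the same 16 us typ. figure is used for every width.
    for bits in (8, 16, 32):
        rate = flash_limit_kbytes_per_s(16, bits)
        print(f"parallelism {bits}: {rate:.1f} kBytes/s")
```

For parallelism 16 this reproduces the 125 kBytes/s figure; the value for 32 is only as good as the assumption that the operation time stays the same.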
That the direct register approach is quite slow isn't surprising. That's
like playing ping-pong over USB for every single bit. The main benefit
of the algorithm approach is that data transport and programming
("real" programming with CPU stall) run simultaneously. Of course, this
can only work smoothly if the programming adapter does support this
"streaming" approach, so it won't work reasonably well with a low-level
adapter.
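To make the ping-pong argument concrete, here is a rough back-of-the-envelope model in Python. It is not how openOCD is implemented; the function names, the number of round trips per operation and the 1 ms round-trip latency are all hypothetical values chosen only to illustrate the difference between waiting on every operation and overlapping transfer with programming.

```python
def direct_rate_kbytes_per_s(bytes_per_op, round_trips, latency_us, t_prog_us):
    """Direct register access: each operation waits for its host round
    trips and then for the flash, strictly one after the other."""
    time_per_op_us = round_trips * latency_us + t_prog_us
    # Bytes per microsecond is MBytes/s; multiply by 1000 to get kBytes/s.
    return bytes_per_op / time_per_op_us * 1000


def streamed_rate_kbytes_per_s(transport_kbytes_per_s, flash_kbytes_per_s):
    """Algorithm approach: transfer and programming overlap, so the slower
    of the two stages sets the overall pace."""
    return min(transport_kbytes_per_s, flash_kbytes_per_s)


if __name__ == "__main__":
    # Hypothetical: 2 bytes per operation, 2 round trips of 1 ms each,
    # 16 us programming time.
    direct = direct_rate_kbytes_per_s(2, 2, 1000, 16)
    # Hypothetical: 150 kBytes/s transport, 125 kBytes/s flash limit.
    streamed = streamed_rate_kbytes_per_s(150, 125)
    print(f"direct:   {direct:.2f} kBytes/s")
    print(f"streamed: {streamed:.2f} kBytes/s")
```

With these made-up inputs the direct path is dominated entirely by round-trip latency, which is the same order of magnitude as the 1 kByte/s measured above, while the streamed path is bounded by whichever stage is slower.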
Regarding parallelism, I'd suggest leaving the default as it currently
is, i.e. 16.
Anything else would be a pitfall for the unaware user. The assumption
that most users will use a 2.4V to 3.3V supply is still valid, I guess.
Even if it were configurable, 32 wouldn't give substantially higher
speed (well, at least if a "good" programming adapter is used) anyway.
BTW: "parallelism" apparently means ***maximum*** parallelism, cf.
rm0081, 1.5.2:
"Parallelism is the maximum number of bits that may be programmed to 0
in one step during a program or erase operation. The maximum
program/erase parallelism is limited by the supply voltage and by
whether the external VPP supply is used or not. ..."
Hence "limited by" actually means "limited ***above*** by", and the
table indicates the maximum allowed value, not the exact value to use.
On 2018-02-27 21:50, Christopher Head wrote:
As for performance, I have two data points so far.
First, using a ByteBlaster clone, I was able to achieve about 6
kilobytes per second using the algorithm and about 10 using optimized
direct programming (the original direct code got about 3).
Second, using an Olimex ARM-USB-TINY-H (FTDI-based), I had to reduce
the JTAG clock *massively* in order to get the algorithm approach to
even work at all (otherwise it would see a mix of timeout waiting for
algorithm and debug regions unpowered), but optimized direct
programming at the default 2 MHz JTAG clock got me 30 kilobytes per
second, much more than the algorithm approach at the reduced clock
speed.
Both of the above tests were made at 16× parallelism. Repeating the
Olimex test with the optimized direct code at 32× parallelism yielded
84 kilobytes per second.
_______________________________________________
OpenOCD-devel mailing list
OpenOCD-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openocd-devel