On Thu, 1 Mar 2018 00:12:12 +0100 Tomas Vanek <tom_...@users.sourceforge.net> wrote:
> We should also focus to a question why algo flashing is broken on
> FTDI. Some non STM devices (e.g. Kinetis) work with very similar algo
> just perfectly on FTDI or any other adapter.

Sure. If someone could fix algorithm-based flashing, I would love to use it! I’m not convinced it will make things any faster for my specific set of hardware, but as long as it’s not broken and not slower, I don’t really care, and I understand it does make things faster for some people.

> WAITs are very strange. It looks like the stalled access to flash
> blocks also JTAG access to RAM.
> And SWD access doesn't suffer this silicon bug... who knows... maybe
> some NOPs in algo busy wait loop would fix it.
> BTW The programming algo should avoid bus stalling, shouldn't it?

I was wondering about this. I have two weird data points I can add to this discussion.

The first data point is this: remember early on in the thread where I said I wasn’t able to successfully modify the algorithm to move the CR write and SR read out of the loop? When I tried, it ran for a while and then gave either "debug regions unpowered" or, more commonly, "timeout waiting for algorithm". Those are the same messages I got using the FTDI adapter with the original, unmodified algorithm, except that with the modified algorithm they appeared when using the ByteBlaster as well, which had formerly been very robust. This seemed suspicious, and I’m reasonably certain I got the modifications to the algorithm correct (I’ve done plenty of Thumb assembly in the past), but I didn’t pay too much attention as the direct approach was so fast.

The second data point is this: when using the algorithm-based approach, I attached an oscilloscope to TDO coming out of the F7. I was very surprised to see it *tristate* from time to time (at least, I’m pretty sure it tristated: it had a very slow rise time and settled to a voltage somewhat below VDD).
I didn’t manage to correlate the time of the tristate to any particular higher level activity, but it definitely happened quite frequently during a programming operation and looked very weird. I’m pretty sure it didn’t happen during direct programming, only algorithm-driven programming. I found this suspicious, but again, didn’t look into it too much as the direct approach was very fast.

The reference manual seems a little unclear on whether the algorithm as written should stall the CPU or not. It says, “Any attempt to read the Flash memory while it is being written or erased, causes the bus to stall. Read operations are processed correctly once the program operation has completed. This means that code or data fetches cannot be performed while a write/erase operation is ongoing.”

The obvious way to interpret that (call it interpretation 1) is that the STRH places the halfword into the CPU write buffer, the DSB pushes it out as an AXI write cycle to the Flash interface, and the Flash interface immediately completes the bus cycle and internally buffers the data while starting the burn. The AXI is then free to proceed to SR polling, while the *next* AXI cycle, if any, that accesses the Flash interface while it is still busy will be stalled.

However, an alternative way to interpret it (interpretation 2) is that the AXI write cycle that delivers the halfword is itself stalled by the Flash interface, but the CPU can continue execution because the data is in the CPU write buffer, so the CPU can proceed before the bus cycle completes. In this case the DSB would stall the CPU. BSY seems rather pointless in that case, but I have learned not to assume anything when reading silicon documentation (and it could be useful to avoid stalls if used without DSB, I suppose).

My guess would be the first interpretation is correct, though, which means the algorithm as written should indeed never stall the CPU, since code execution is from DTCM, which is unrelated to Flash.
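To make the sequence in question concrete, here is a rough C sketch of the store, the barrier, and the first read of SR. It is only an illustration, not the actual OpenOCD loader code. The `flash_ops` hooks and the `fake_flash` host-side model are hypothetical names made up for this example, and the BSY bit position (bit 16 of FLASH_SR) is my reading of the F7 reference manual and should be checked against the part in use. On real hardware the hooks would map to an STRH to a Flash address (with the Flash unlocked and PG set beforehand), a DSB, and a load from FLASH_SR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: BSY is bit 16 of FLASH_SR on the STM32F7; verify for your part. */
#define FLASH_SR_BSY (1u << 16)

/* Hypothetical hardware-access hooks, so the same logic can run on a host. */
struct flash_ops {
    void (*store_halfword)(void *ctx, uint32_t addr, uint16_t value); /* STRH */
    void (*data_sync_barrier)(void *ctx);                             /* DSB  */
    uint32_t (*read_sr)(void *ctx);                                   /* load FLASH_SR */
    void *ctx;
};

/* Store one halfword, issue the barrier, and report whether BSY is set on
 * the very first status read afterwards. */
bool bsy_set_after_store(const struct flash_ops *ops, uint32_t addr, uint16_t value)
{
    assert(ops != NULL);
    ops->store_halfword(ops->ctx, addr, value);
    ops->data_sync_barrier(ops->ctx);
    return (ops->read_sr(ops->ctx) & FLASH_SR_BSY) != 0;
}

/* BSY set: the write cycle completed at once and the burn is still running
 * (interpretation 1). BSY clear: the barrier waited out the whole burn
 * (interpretation 2). */
int interpretation_from_bsy(bool bsy_set)
{
    return bsy_set ? 1 : 2;
}

/* Hypothetical host-side model of the two behaviours, for trying the logic
 * without a board. It is not a model of the real Flash interface. */
struct fake_flash {
    bool stall_write_until_done; /* true models interpretation 2 */
    bool busy;
};

static void fake_store(void *ctx, uint32_t addr, uint16_t value)
{
    struct fake_flash *f = ctx;
    (void)addr;
    (void)value;
    f->busy = true; /* programming starts */
}

static void fake_dsb(void *ctx)
{
    struct fake_flash *f = ctx;
    if (f->stall_write_until_done)
        f->busy = false; /* the barrier only returns once the burn is done */
}

static uint32_t fake_read_sr(void *ctx)
{
    const struct fake_flash *f = ctx;
    return f->busy ? FLASH_SR_BSY : 0u;
}

struct flash_ops make_fake_ops(struct fake_flash *f)
{
    struct flash_ops ops = { fake_store, fake_dsb, fake_read_sr, f };
    return ops;
}
```

The indirection through `flash_ops` is only there so the decision logic can be exercised on a PC; on the target the three hooks would collapse into three instructions.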
It should be possible to test which interpretation is correct by performing a STRH followed by a DSB and then checking whether BSY is set immediately afterwards; if yes, then interpretation 1 is correct, while if no, then interpretation 2 is correct. Assuming interpretation 1 is correct, though, I don’t see anything wrong with the algorithm code.

> What OpenOCD version do you use? It looks like your version misses
> Matthias' WAIT handling
> if you get such errors like algo timeout.

It was head of master from somewhere in the last week or two. I can look up the exact commit ID tomorrow if you want.

-- 
Christopher Head
_______________________________________________ OpenOCD-devel mailing list OpenOCD-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openocd-devel