Hi Havard, On Fri, Jan 15, 2021 at 11:29 AM Havard Skinnemoen <hskinnem...@google.com> wrote: > > Hi Bin, > > On Thu, Jan 14, 2021 at 6:08 PM Bin Meng <bmeng...@gmail.com> wrote: > > > > Hi Francisco, > > > > On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias > > <frasse.igles...@gmail.com> wrote: > > > > > > Hi Bin, > > > > > > On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote: > > > > From: Bin Meng <bin.m...@windriver.com> > > > > > > > > The m25p80 model uses s->needed_bytes to indicate how many follow-up > > > > bytes are expected to be received after it receives a command. For > > > > example, depending on the address mode, either 3-byte address or > > > > 4-byte address is needed. > > > > > > > > For fast read family commands, some dummy cycles are required after > > > > sending the address bytes, and the dummy cycles need to be counted > > > > in s->needed_bytes. This is where the mess began. > > > > > > > > As the variable name (needed_bytes) indicates, the unit is in byte. > > > > It is not in bit, or cycle. However for some reason the model has > > > > been using the number of dummy cycles for s->needed_bytes. The right > > > > approach is to convert the number of dummy cycles to bytes based on > > > > the SPI protocol, for example, 6 dummy cycles for the Fast Read Quad > > > > I/O (EBh) should be converted to 3 bytes per the formula (6 * 4 / 8). > > > > > > While not being the original implementor I must assume that above > > > solution was > > > considered but not chosen by the developers due to it is inaccuracy (it > > > wouldn't be possible to model exacly 6 dummy cycles, only a multiple of 8, > > > meaning that if the controller is wrongly programmed to generate 7 the > > > error > > > wouldn't be caught and the controller will still be considered > > > "correct"). Now > > > that we have this detail in the implementation I'm in favor of keeping > > > it, this > > > also because the detail is already in use for catching exactly above > > > error. > > > > > > > I found no clue from the commit message that my proposed solution here > > was ever considered, otherwise all SPI controller models supporting > > software generation should have been found out seriously broken long > > time ago! > > > > The issue you pointed out that we require the total number of dummy > > bits should be multiple of 8 is true, that's why I added the > > unimplemented log message in this series (patch 2/3/4) to warn users > > if this expectation is not met. However this will not cause any issue > > when running U-Boot or Linux, because both spi-nor drivers expect the > > same assumption as we do here. > > > > See U-Boot spi_nor_read_data() and Linux spi_nor_spimem_read_data(), > > there is a logic to calculate the dummy bytes needed for fast read > > command: > > > > /* convert the dummy cycles to the number of bytes */ > > op.dummy.nbytes = (nor->read_dummy * op.dummy.buswidth) / 8; > > > > Note the default dummy cycles configuration for all flashes I have > > looked into as of today, meets the multiple of 8 assumption. On some > > flashes the dummy cycle number is configurable, and if it's been > > configured to be an odd value, it would not work on U-Boot/Linux in > > the first place. > > > > > > > > > > Things get complicated when interacting with different SPI or QSPI > > > > flash controllers. There are major two cases: > > > > > > > > - Dummy bytes prepared by drivers, and wrote to the controller fifo. > > > > For such case, driver will calculate the correct number of dummy > > > > bytes and write them into the tx fifo. Fixing the m25p80 model will > > > > fix flashes working with such controllers. > > > > > > Above can be fixed while still keeping the detailed dummy cycle > > > implementation > > > inside m25p80. Perhaps one of the following could be looked into: > > > configurating > > > the amount, letting the spi ctrl fetch the amount from m25p80 or by > > > inheriting > > > some functionality handling this in the SPI controller. Or a mixture of > > > above. > > > > Please send patches to explain this in detail how this is going to > > work. I am open to all possible solutions. > > > > > > > > > - Dummy bytes not prepared by drivers. Drivers just tell the hardware > > > > the dummy cycle configuration via some registers, and hardware will > > > > automatically generate dummy cycles for us. Fixing the m25p80 model > > > > is not enough, and we will need to fix the SPI/QSPI models for such > > > > controllers. > > > > > > > > This series fixes the mess in the m25p80 from the flash side first, > > > > > > Considering the problems solved by the solution in tree I find m25p80 > > > pretty > > > clean, at least I don't see any clearly better way for accurately > > > modeling the > > > dummy clock cycles. Counting bits instead of bytes would for example still > > > force the controllers to mark which bits to count (when transmitting one > > > dummy > > > byte from a txfifo on four lines (Quad command) it generates 2 dummy clock > > > cycles since it takes two cycles to transfer 8 bits). > > > > > > > SPI is a bit based protocol, not bytes. If you insist on bit modeling > > with the dummy cycles then you should also suggest we change all > > cycles (including command/addr/dummy/data phases) to be modeled with > > bits. That way we can accurately emulate everything, for example one > > potential problem like transferring 9 bit in the data phase. > > I agree with this. There's really nothing special about dummy cycles. > Making them special makes it super painful to implement SPI controller > emulation because you have to anticipate when ssi_transfer changes > semantics from byte-at-a-time to bit-at-a-time. I doubt all the SPI > controllers in the tree gets it right all the time. >
Yep, it's not just painful for SPI controllers, and for the case 1 SPI controller it's impossible to snoop the data to distinguish when the dummy cycles begin. > > However modeling everything with bit is super inefficient. My view is > > that we should avoid trying to support uncommon use cases (like not > > multiple of 8 for dummy bits) in QEMU. > > Perhaps ssi_transfer could take an additional bits parameter? That > should make it possible to transfer any number of bits up to 32, while > keeping the common case simple on both sides. And it would work for > any SPI transfer, not just dummy cycles. This sounds like a good tradeoff from the emulator perspective. But I am not sure we should do this to solve the dummy cycle mess given all the default dummy cycle configurations so far match the multiple of 8 assumption. Regards, Bin