On 4/28/21 3:12 PM, Bin Meng wrote: > Hi Cédric, > > On Tue, Apr 27, 2021 at 10:32 PM Cédric Le Goater <c...@kaod.org> wrote: >> >> Hello, >> >> On 4/27/21 10:54 AM, Francisco Iglesias wrote: >>> On [2021 Apr 27] Tue 15:56:10, Alistair Francis wrote: >>>> On Fri, Apr 23, 2021 at 4:46 PM Bin Meng <bmeng...@gmail.com> wrote: >>>>> >>>>> On Mon, Feb 8, 2021 at 10:41 PM Bin Meng <bmeng...@gmail.com> wrote: >>>>>> >>>>>> On Thu, Jan 21, 2021 at 10:18 PM Francisco Iglesias >>>>>> <frasse.igles...@gmail.com> wrote: >>>>>>> >>>>>>> Hi Bin, >>>>>>> >>>>>>> On [2021 Jan 21] Thu 16:59:51, Bin Meng wrote: >>>>>>>> Hi Francisco, >>>>>>>> >>>>>>>> On Thu, Jan 21, 2021 at 4:50 PM Francisco Iglesias >>>>>>>> <frasse.igles...@gmail.com> wrote: >>>>>>>>> >>>>>>>>> Dear Bin, >>>>>>>>> >>>>>>>>> On [2021 Jan 20] Wed 22:20:25, Bin Meng wrote: >>>>>>>>>> Hi Francisco, >>>>>>>>>> >>>>>>>>>> On Tue, Jan 19, 2021 at 9:01 PM Francisco Iglesias >>>>>>>>>> <frasse.igles...@gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Bin, >>>>>>>>>>> >>>>>>>>>>> On [2021 Jan 18] Mon 20:32:19, Bin Meng wrote: >>>>>>>>>>>> Hi Francisco, >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Jan 18, 2021 at 6:06 PM Francisco Iglesias >>>>>>>>>>>> <frasse.igles...@gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Hi Bin, >>>>>>>>>>>>> >>>>>>>>>>>>> On [2021 Jan 15] Fri 22:38:18, Bin Meng wrote: >>>>>>>>>>>>>> Hi Francisco, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jan 15, 2021 at 8:26 PM Francisco Iglesias >>>>>>>>>>>>>> <frasse.igles...@gmail.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Bin, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On [2021 Jan 15] Fri 10:07:52, Bin Meng wrote: >>>>>>>>>>>>>>>> Hi Francisco, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias >>>>>>>>>>>>>>>> <frasse.igles...@gmail.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Bin, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote: >>>>>>>>>>>>>>>>>> From: Bin Meng <bin.m...@windriver.com> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The m25p80 model uses s->needed_bytes to indicate how many >>>>>>>>>>>>>>>>>> follow-up >>>>>>>>>>>>>>>>>> bytes are expected to be received after it receives a >>>>>>>>>>>>>>>>>> command. For >>>>>>>>>>>>>>>>>> example, depending on the address mode, either 3-byte >>>>>>>>>>>>>>>>>> address or >>>>>>>>>>>>>>>>>> 4-byte address is needed. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> For fast read family commands, some dummy cycles are >>>>>>>>>>>>>>>>>> required after >>>>>>>>>>>>>>>>>> sending the address bytes, and the dummy cycles need to be >>>>>>>>>>>>>>>>>> counted >>>>>>>>>>>>>>>>>> in s->needed_bytes. This is where the mess began. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> As the variable name (needed_bytes) indicates, the unit is >>>>>>>>>>>>>>>>>> in byte. >>>>>>>>>>>>>>>>>> It is not in bit, or cycle. However for some reason the >>>>>>>>>>>>>>>>>> model has >>>>>>>>>>>>>>>>>> been using the number of dummy cycles for s->needed_bytes. >>>>>>>>>>>>>>>>>> The right >>>>>>>>>>>>>>>>>> approach is to convert the number of dummy cycles to bytes >>>>>>>>>>>>>>>>>> based on >>>>>>>>>>>>>>>>>> the SPI protocol, for example, 6 dummy cycles for the Fast >>>>>>>>>>>>>>>>>> Read Quad >>>>>>>>>>>>>>>>>> I/O (EBh) should be converted to 3 bytes per the formula (6 >>>>>>>>>>>>>>>>>> * 4 / 8). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> While not being the original implementor I must assume that >>>>>>>>>>>>>>>>> above solution was >>>>>>>>>>>>>>>>> considered but not chosen by the developers due to it is >>>>>>>>>>>>>>>>> inaccuracy (it >>>>>>>>>>>>>>>>> wouldn't be possible to model exacly 6 dummy cycles, only a >>>>>>>>>>>>>>>>> multiple of 8, >>>>>>>>>>>>>>>>> meaning that if the controller is wrongly programmed to >>>>>>>>>>>>>>>>> generate 7 the error >>>>>>>>>>>>>>>>> wouldn't be caught and the controller will still be >>>>>>>>>>>>>>>>> considered "correct"). Now >>>>>>>>>>>>>>>>> that we have this detail in the implementation I'm in favor >>>>>>>>>>>>>>>>> of keeping it, this >>>>>>>>>>>>>>>>> also because the detail is already in use for catching >>>>>>>>>>>>>>>>> exactly above error. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I found no clue from the commit message that my proposed >>>>>>>>>>>>>>>> solution here >>>>>>>>>>>>>>>> was ever considered, otherwise all SPI controller models >>>>>>>>>>>>>>>> supporting >>>>>>>>>>>>>>>> software generation should have been found out seriously >>>>>>>>>>>>>>>> broken long >>>>>>>>>>>>>>>> time ago! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The controllers you are referring to might lack support for >>>>>>>>>>>>>>> commands requiring >>>>>>>>>>>>>>> dummy clock cycles but I really hope they work with the other >>>>>>>>>>>>>>> commands? If so I >>>>>>>>>>>>>> >>>>>>>>>>>>>> I am not sure why you view dummy clock cycles as something >>>>>>>>>>>>>> special >>>>>>>>>>>>>> that needs some special support from the SPI controller. For the >>>>>>>>>>>>>> case >>>>>>>>>>>>>> 1 controller, it's nothing special from the controller >>>>>>>>>>>>>> perspective, >>>>>>>>>>>>>> just like sending out a command, or address bytes, or data. The >>>>>>>>>>>>>> controller just shifts data bit by bit from its tx fifo and >>>>>>>>>>>>>> that's it. >>>>>>>>>>>>>> In the Xilinx GQSPI controller case, the dummy cycles can either >>>>>>>>>>>>>> be >>>>>>>>>>>>>> sent via a regular data (the case 1 controller) in the tx fifo, >>>>>>>>>>>>>> or >>>>>>>>>>>>>> automatically generated (case 2 controller) by the hardware. >>>>>>>>>>>>> >>>>>>>>>>>>> Ok, I'll try to explain my view point a little differently. For >>>>>>>>>>>>> that we also >>>>>>>>>>>>> need to keep in mind that QEMU models HW, and any binary that >>>>>>>>>>>>> runs on a HW >>>>>>>>>>>>> board supported in QEMU should ideally run on that board inside >>>>>>>>>>>>> QEMU aswell >>>>>>>>>>>>> (this can be a bare metal application equaly well as a modified >>>>>>>>>>>>> u-boot/Linux >>>>>>>>>>>>> using SPI commands with a non multiple of 8 number of dummy clock >>>>>>>>>>>>> cycles). >>>>>>>>>>>>> >>>>>>>>>>>>> Once functionality has been introduced into QEMU it is not easy >>>>>>>>>>>>> to know which >>>>>>>>>>>>> intentional or untentional features provided by the functionality >>>>>>>>>>>>> are being >>>>>>>>>>>>> used by users. One of the (perhaps not well known) features I'm >>>>>>>>>>>>> aware of that >>>>>>>>>>>>> is in use and is provided by the accurate dummy clock cycle >>>>>>>>>>>>> modeling inside >>>>>>>>>>>>> m25p80 is the be ability to test drivers accurately regarding the >>>>>>>>>>>>> dummy clock >>>>>>>>>>>>> cycles (even when using commands with a non-multiple of 8 number >>>>>>>>>>>>> of dummy clock >>>>>>>>>>>>> cycles), but there might be others aswell. So by removing this >>>>>>>>>>>>> functionality >>>>>>>>>>>>> above use case will brake, this since those test will not be >>>>>>>>>>>>> reliable. >>>>>>>>>>>>> Furthermore, since users tend to be creative it is not possible >>>>>>>>>>>>> to know if >>>>>>>>>>>>> there are other use cases that will be affected. This means that >>>>>>>>>>>>> in case [1] >>>>>>>>>>>>> needs to be followed the safe path is to add functionality >>>>>>>>>>>>> instead of removing. >>>>>>>>>>>>> Luckily it also easier in this case, see below. >>>>>>>>>>>> >>>>>>>>>>>> I understand there might be users other than U-Boot/Linux that use >>>>>>>>>>>> an >>>>>>>>>>>> odd number of dummy bits (not multiple of 8). If your concern was >>>>>>>>>>>> about model behavior changes, sure I can update >>>>>>>>>>>> qemu/docs/system/deprecated.rst to mention that some flashes in the >>>>>>>>>>>> m25p80 model now implement dummy cycles as bytes. >>>>>>>>>>> >>>>>>>>>>> Yes, something like that. My concern is that since this >>>>>>>>>>> functionality has been >>>>>>>>>>> in tree for while, users have found known or unknown features that >>>>>>>>>>> got >>>>>>>>>>> introduced by it. By removing the functionality (and the >>>>>>>>>>> known/uknown features) >>>>>>>>>>> we are riscing to brake our user's use cases (currently I'm aware >>>>>>>>>>> of one >>>>>>>>>>> feature/use case but it is not unlikely that there are more). [1] >>>>>>>>>>> states that >>>>>>>>>>> "In general features are intended to be supported indefinitely once >>>>>>>>>>> introduced >>>>>>>>>>> into QEMU", to me that makes very much sense because the opposite >>>>>>>>>>> would mean >>>>>>>>>>> that we were not reliable. So in case [1] needs to be honored it >>>>>>>>>>> looks to be >>>>>>>>>>> safer to add functionality instead of removing (and riscing the >>>>>>>>>>> removal of use >>>>>>>>>>> cases/features). Luckily I still believe in this case that it will >>>>>>>>>>> be easier to >>>>>>>>>>> go forward (even if I also agree on what you are saying below about >>>>>>>>>>> what I >>>>>>>>>>> proposed). >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Even if the implementation is buggy and we need to keep the buggy >>>>>>>>>> implementation forever? I think that's why >>>>>>>>>> qemu/docs/system/deprecated.rst was created for deprecating such >>>>>>>>>> feature. >>>>>>>>> >>>>>>>>> With the RFC I posted all commands in m25p80 are working for both the >>>>>>>>> case 1 >>>>>>>>> controller (using a txfifo) and the case 2 controller (no txfifo, as >>>>>>>>> GQSPI). >>>>>>>>> Because of this, I, with all respect, will have to disagree that this >>>>>>>>> is buggy. >>>>>>>> >>>>>>>> Well, the existing m25p80 implementation that uses dummy cycle >>>>>>>> accuracy for those flashes prevents all SPI controllers that use tx >>>>>>>> fifo to work with those flashes. Hence it is buggy. >>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> don't think it is fair to call them 'seriously broken' (and >>>>>>>>>>>>>>> else we should >>>>>>>>>>>>>>> probably let the maintainers know about it). Most likely the >>>>>>>>>>>>>>> lack of support >>>>>>>>>>>>>> >>>>>>>>>>>>>> I called it "seriously broken" because current implementation >>>>>>>>>>>>>> only >>>>>>>>>>>>>> considered one type of SPI controllers while completely ignoring >>>>>>>>>>>>>> the >>>>>>>>>>>>>> other type. >>>>>>>>>>>>> >>>>>>>>>>>>> If we change view and see this from the perspective of m25p80, it >>>>>>>>>>>>> models the >>>>>>>>>>>>> commands a certain way and provides an API that the SPI >>>>>>>>>>>>> controllers need to >>>>>>>>>>>>> implement for interacting with it. It is true that there are SPI >>>>>>>>>>>>> controllers >>>>>>>>>>>>> referred to above that do not support the portion of that API >>>>>>>>>>>>> that corresponds >>>>>>>>>>>>> to commands with dummy clock cycles, but I don't think it is true >>>>>>>>>>>>> that this is >>>>>>>>>>>>> broken since there is also one SPI controller that has a working >>>>>>>>>>>>> implementation >>>>>>>>>>>>> of m25p80's full API also when transfering through a tx fifo (use >>>>>>>>>>>>> case 1). But >>>>>>>>>>>>> as mentioned above, by doing a minor extension and improvement to >>>>>>>>>>>>> m25p80's API >>>>>>>>>>>>> and allow for toggling the accuracy from dummy clock cycles to >>>>>>>>>>>>> dummy bytes [1] >>>>>>>>>>>>> will still be honored as in the same time making it possible to >>>>>>>>>>>>> have full >>>>>>>>>>>>> support for the API in the SPI controllers that currently do not >>>>>>>>>>>>> (please reread >>>>>>>>>>>>> the proposal in my previous reply that attempts to do this). I >>>>>>>>>>>>> myself see this >>>>>>>>>>>>> as win/win situation, also because no controller should need >>>>>>>>>>>>> modifications. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I am afraid your proposal does not work. Your proposed new device >>>>>>>>>>>> property 'model_dummy_bytes' to select to convert the accurate >>>>>>>>>>>> dummy >>>>>>>>>>>> clock cycle count to dummy bytes inside m25p80, is hard to justify >>>>>>>>>>>> as >>>>>>>>>>>> a property to the flash itself, as the behavior is tightly coupled >>>>>>>>>>>> to >>>>>>>>>>>> how the SPI controller works. >>>>>>>>>>> >>>>>>>>>>> I agree on above. I decided though that instead of posting sample >>>>>>>>>>> code in here >>>>>>>>>>> I'll post an RFC with hopefully an improved proposal. I'll cc you. >>>>>>>>>>> About below, >>>>>>>>>>> Xilinx ZynqMP GQSPI should not need any modication in a first step. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Wait, (see below) >>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Please take a look at the Xilinx GQSPI controller, which supports >>>>>>>>>>>> both >>>>>>>>>>>> use cases, that the dummy cycles can be transferred via tx fifo, or >>>>>>>>>>>> generated by the controller automatically. Please read the example >>>>>>>>>>>> given in: >>>>>>>>>>>> >>>>>>>>>>>> table 24‐22, an example of Generic FIFO Contents for Quad I/O >>>>>>>>>>>> Read >>>>>>>>>>>> Command (EBh) >>>>>>>>>>>> >>>>>>>>>>>> in >>>>>>>>>>>> https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf >>>>>>>>>>>> >>>>>>>>>>>> If you choose to set the m25p80 device property >>>>>>>>>>>> 'model_dummy_bytes' to >>>>>>>>>>>> true when working with the Xilinx GQSPI controller, you are bound >>>>>>>>>>>> to >>>>>>>>>>>> only allow guest software to use tx fifo to transfer the dummy >>>>>>>>>>>> cycles, >>>>>>>>>>>> and this is wrong. >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> You missed this part. I looked at your RFC, and as I mentioned above >>>>>>>>>> your proposal cannot support the complicated controller like Xilinx >>>>>>>>>> GQSPI. Please read the example of table 24-22. With your RFC, you >>>>>>>>>> mandate guest software's GQSPI driver to only use hardware dummy >>>>>>>>>> cycle >>>>>>>>>> generation, which is wrong. >>>>>>>>>> >>>>>>>>> >>>>>>>>> First, thank you very much for looking into the RFC series, very much >>>>>>>>> appreciated. Secondly, about above, the GQSPI model in QEMU transfers >>>>>>>>> from 2 >>>>>>>>> locations in the file, in 1 location the transfer referred to above >>>>>>>>> is done, in >>>>>>>>> another location the transfer through the txfifo is done. The >>>>>>>>> location where >>>>>>>>> transfer referred to above is done will not need any modifications >>>>>>>>> (and will >>>>>>>>> thus work equally well as it does currently). >>>>>>>> >>>>>>>> Please explain this a little bit. How does your RFC series handle >>>>>>>> cases as described in table 24-22, where the 6 dummy cycles are split >>>>>>>> into 2 transfers, with one transfer using tx fifo, and the other one >>>>>>>> using hardware dummy cycle generation? >>>>>>> >>>>>>> Sorry, I missunderstod. You are right, that won't work. >>>>>> >>>>>> +Edgar E. Iglesias >>>>>> >>>>>> So it looks by far the only way to implement dummy cycles correctly to >>>>>> work with all SPI controller models is what I proposed here in this >>>>>> patch series. >>>>>> >>>>>> Maintainers are quite silent, so I would like to hear your thoughts. >>>>>> >>>>>> @Alistair Francis @Philippe Mathieu-Daudé @Peter Maydell would you >>>>>> please share your thoughts since you are the one who reviewed the >>>>>> existing dummy implementation (based on commits history) >>>> >>>> I agree with Edgar, in that Francisco and Bin know this better than me >>>> and that modelling things in cycles is a pain. >>> >>> Hi Alistair, >>> >>>> >>>> As Bin points out it seems like currently we should be modelling bytes >>>> (from the variable name) so it makes sense to keep it in bytes. I >>>> would be in favour of this series in that case. Do we know what use >>>> cases this will break? I know it's hard to answer but I don't think >>>> there are too many SSI users in QEMU so it might not be too hard to >>>> test most of the possible use cases. >>> >>> The use case I'm aware of is regression testing of drivers. Ex: if a >>> driver is using 10 dummy clock cycles with the commands and a patch >>> accidentaly changes the driver to use 11 dummy clock cycles QEMU currently >>> finds the problem, that won't be possible with this series. It's difficult >>> to say but it is not impossible there are other use cases also. >> >> >> It was breaking the Aspeed machines : >> >> >> https://lore.kernel.org/qemu-devel/78a12882-1303-dd6d-6619-96c5e2cbf...@kaod.org/ > > Yes, as I mentioned in the series the modification was based on a pure > guess from existing QEMU codes as I could not find a datasheet of the > Aspeed SPI controller on the internet. Do you know if this is publicly > available?
It is not but much of the register bitfields are described in the code. I should be able to help you in making this work. Thanks, C. >> QEMU 6.1 should have acceptance tests that will help in detecting >> regressions in this area. >> > > Regards, > Bin >