Hi, Roger, Jan!
On 13.01.22 10:58, Roger Pau Monné wrote: > On Wed, Jan 12, 2022 at 04:52:51PM +0100, Jan Beulich wrote: >> On 12.01.2022 16:42, Roger Pau Monné wrote: >>> On Wed, Jan 12, 2022 at 03:57:36PM +0100, Jan Beulich wrote: >>>> On 25.11.2021 12:02, Oleksandr Andrushchenko wrote: >>>>> @@ -379,7 +396,6 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, >>>>> unsigned int size) >>>>> >>>>> data = merge_result(data, tmp_data, size - data_offset, >>>>> data_offset); >>>>> } >>>>> - spin_unlock(&pdev->vpci->lock); >>>>> >>>>> return data & (0xffffffff >> (32 - 8 * size)); >>>>> } >>>> Here and ... >>>> >>>>> @@ -475,13 +498,12 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, >>>>> unsigned int size, >>>>> break; >>>>> ASSERT(data_offset < size); >>>>> } >>>>> + spin_unlock(&pdev->vpci_lock); >>>>> >>>>> if ( data_offset < size ) >>>>> /* Tailing gap, write the remaining. */ >>>>> vpci_write_hw(sbdf, reg + data_offset, size - data_offset, >>>>> data >> (data_offset * 8)); >>>>> - >>>>> - spin_unlock(&pdev->vpci->lock); >>>>> } >>>> ... even more so here I'm not sure of the correctness of the moving >>>> you do: While pdev->vpci indeed doesn't get accessed, I wonder >>>> whether there wasn't an intention to avoid racing calls to >>>> vpci_{read,write}_hw() this way. In any event I think such movement >>>> would need justification in the description. >>> I agree about the need for justification in the commit message, or >>> even better this being split into a pre-patch, as it's not related to >>> the lock switching done here. >>> >>> I do think this is fine however, as racing calls to >>> vpci_{read,write}_hw() shouldn't be a problem. Those are just wrappers >>> around pci_conf_{read,write} functions, and the required locking (in >>> case of using the IO ports) is already taken care in >>> pci_conf_{read,write}. >> IOW you consider it acceptable for a guest (really: Dom0) read racing >> a write to read back only part of what was written (so far)? I would >> think individual multi-byte reads and writes should appear atomic to >> the guest. > We split 64bit writes into two 32bit ones without taking the lock for > the whole duration of the access, so it's already possible to see a > partially updated state as a result of a 64bit write. > > I'm going over the PCI(e) spec but I don't seem to find anything about > whether the ECAM is allowed to split memory transactions into multiple > Configuration Requests, and whether those could then interleave with > requests from a different CPU. So, with the above is it still fine for you to have the change as is or you want this optimization to go into a dedicated patch before this one? > > Thanks, Roger. Thank you, Oleksandr