On Jul 27 11:59, Andrzej Jakowski wrote:
> On 7/27/20 2:06 AM, Klaus Jensen wrote:
> > On Jul 23 09:03, Andrzej Jakowski wrote:
> >> So far it was not possible to have CMB and PMR emulated on the same
> >> device, because BAR2 was used exclusively either of PMR or CMB. This
> >> patch places CMB at BAR4 offset so it not conflicts with MSI-X vectors.
> >>
> >> Signed-off-by: Andrzej Jakowski <andrzej.jakow...@linux.intel.com>
> >> ---
> >>  hw/block/nvme.c      | 120 +++++++++++++++++++++++++++++--------------
> >>  hw/block/nvme.h      |   1 +
> >>  include/block/nvme.h |   4 +-
> >>  3 files changed, 85 insertions(+), 40 deletions(-)
> >>
> >> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> >> index 43866b744f..d55a71a346 100644
> >> --- a/hw/block/nvme.c
> >> +++ b/hw/block/nvme.c
> >> @@ -22,12 +22,13 @@
> >>   *  [pmrdev=<mem_backend_file_id>,] \
> >>   *  max_ioqpairs=<N[optional]>
> >>   *
> >> - * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
> >> - * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
> >> + * Note cmb_size_mb denotes size of CMB in MB. CMB when configured is assumed
> >> + * to be resident in BAR4 at offset that is 2MiB aligned. When CMB is emulated
> >> + * on Linux guest it is recommended to make cmb_size_mb multiple of 2. Both
> >> + * size and alignment restrictions are imposed by Linux guest.
> >>   *
> >> - * cmb_size_mb= and pmrdev= options are mutually exclusive due to limitation
> >> - * in available BAR's. cmb_size_mb= will take precedence over pmrdev= when
> >> - * both provided.
> >> + * pmrdev is assumed to be resident in BAR2/BAR3. When configured it consumes
> >> + * whole BAR2/BAR3 exclusively.
> >>   * Enabling pmr emulation can be achieved by pointing to memory-backend-file.
> >>   * For example:
> >>   * -object memory-backend-file,id=<mem_id>,share=on,mem-path=<file_path>, \
> >> @@ -57,8 +58,8 @@
> >>  #define NVME_MAX_IOQPAIRS 0xffff
> >>  #define NVME_DB_SIZE  4
> >>  #define NVME_SPEC_VER 0x00010300
> >> -#define NVME_CMB_BIR 2
> >>  #define NVME_PMR_BIR 2
> >> +#define NVME_MSIX_BIR 4
> >
> > I think that either we keep the CMB constant (but updated with '4' of
> > course) or we just get rid of both NVME_{CMB,MSIX}_BIR and use a literal
> > '4' in nvme_bar4_init. It is very clear that it is only BAR 4 we use.
> >
> >>  #define NVME_TEMPERATURE 0x143
> >>  #define NVME_TEMPERATURE_WARNING 0x157
> >>  #define NVME_TEMPERATURE_CRITICAL 0x175
> >> @@ -111,16 +112,18 @@ static uint16_t nvme_sqid(NvmeRequest *req)
> >>
> >>  static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
> >>  {
> >> -    hwaddr low = n->ctrl_mem.addr;
> >> -    hwaddr hi  = n->ctrl_mem.addr + int128_get64(n->ctrl_mem.size);
> >> +    hwaddr low = memory_region_to_absolute_addr(&n->ctrl_mem, 0);
> >> +    hwaddr hi  = low + int128_get64(n->ctrl_mem.size);
> >
> > Are we really really sure we want to use a global helper like this? What
> > are the chances/risk that we ever introduce another overlay? I'd say
> > zero. We are not even using a *real* overlay, it's just an io memory
> > region (ctrl_mem) on top of a pure container (bar4). Can't we live with
> > an internal helper doing `n->bar4.addr + n->ctrl_mem.addr` and be done
> > with it? It also removes a data structure walk on each invocation of
> > nvme_addr_is_cmb (which is done for **all** addresses in PRP lists and
> > SGLs).
>
> Thx!
> My understanding of memory_region_to_absolute_addr() ([1]) is that it walks
> the memory hierarchy up to the root while accumulating the absolute address.
> It is very similar to the n->bar4.addr + n->ctrl_mem.addr approach, with the
> following differences:
> * n->bar4.addr + n->ctrl_mem.addr assumes a single-level hierarchy. Updates
>   would be needed if another memory level were added.
> * memory_region_to_absolute_addr() works for a hierarchy of any depth, at the
>   cost of dereferencing data structures on each call.
>
> I don't have data on the likelihood of adding a new memory level, nor on how
> much more memory_region_to_absolute_addr() costs compared to
> n->bar4.addr + n->ctrl_mem.addr.
>
> Please let me know which approach is preferred.
>
Since you are directly asking for my preference: it is "n->bar4.addr +
n->ctrl_mem.addr". I don't like the walk, even though it is super short. I
know that the raw addition assumes a single-level hierarchy, but I am fine
with that. I would still like it to be in an inline helper, though.