Hello Alexey, thank you for the feedback!

On Mon, 2020-06-22 at 20:02 +1000, Alexey Kardashevskiy wrote:
> 
> On 19/06/2020 15:06, Leonardo Bras wrote:
> > From LoPAR level 2.8, "ibm,ddw-extensions" index 3 can make the number of
> > outputs from "ibm,query-pe-dma-windows" go from 5 to 6.
> > 
> > This change of output size is meant to expand the address size of
> > largest_available_block PE TCE from 32-bit to 64-bit, which ends up
> > shifting page_size and migration_capable.
> > 
> > This ends up requiring the update of
> > ddw_query_response->largest_available_block from u32 to u64, and manually
> > assigning the values from the buffer into this struct, according to
> > output size.
> > 
> > Signed-off-by: Leonardo Bras <leobra...@gmail.com>
> > ---
> >  arch/powerpc/platforms/pseries/iommu.c | 57 +++++++++++++++++++++-----
> >  1 file changed, 46 insertions(+), 11 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/iommu.c
> > b/arch/powerpc/platforms/pseries/iommu.c
> > index 6d47b4a3ce39..e5a617738c8b 100644
> > --- a/arch/powerpc/platforms/pseries/iommu.c
> > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > @@ -334,7 +334,7 @@ struct direct_window {
> >  /* Dynamic DMA Window support */
> >  struct ddw_query_response {
> >  	u32 windows_available;
> > -	u32 largest_available_block;
> > +	u64 largest_available_block;
> >  	u32 page_size;
> >  	u32 migration_capable;
> >  };
> > @@ -869,14 +869,32 @@ static int find_existing_ddw_windows(void)
> >  }
> >  machine_arch_initcall(pseries, find_existing_ddw_windows);
> >  
> > +/*
> > + * From LoPAR level 2.8, "ibm,ddw-extensions" index 3 can rule how many
> > output
> > + * parameters ibm,query-pe-dma-windows will have, ranging from 5 to 6.
> > + */
> > +
> > +static int query_ddw_out_sz(struct device_node *par_dn)
> 
> Can easily be folded into query_ddw().

Sure, but it will get inlined by the compiler, and I think it reads
better this way.

I mean, I understand you have a reason to think it's better to fold it
into query_ddw(), and I would like to understand that reason so I can
improve my code in the future.

> > +{
> > +	int ret;
> > +	u32 ddw_ext[3];
> > +
> > +	ret = of_property_read_u32_array(par_dn, "ibm,ddw-extensions",
> > +					 &ddw_ext[0], 3);
> > +	if (ret || ddw_ext[0] < 2 || ddw_ext[2] != 1)
> 
> Oh that PAPR thing again :-/
> 
> ===
> The “ibm,ddw-extensions” property value is a list of integers the first
> integer indicates the number of extensions implemented and subsequent
> integers, one per extension, provide a value associated with that
> extension.
> ===
> 
> So ddw_ext[0] is length.
> Listindex==2 is for "reset" says PAPR and
> Listindex==3 is for this new 64bit "largest_available_block".
> 
> So I'd expect ddw_ext[2] to have the "reset" token and ddw_ext[3] to
> have "1" for this new feature but indexes are smaller. I am confused.
> Either way these "2" and "3" needs to be defined in macros, "0" probably
> too.

Remember that these indexes are not C-like, zero-based indexes: in PAPR
the list starts at 1, so the size is Listindex==1.
Basically, in a C-like array it is:
a[0] == size,
a[1] == reset_token,
a[2] == new 64-bit "largest_available_block"

> Please post 'lsprop "ibm,ddw-extensions"' here. Thanks,

Sure:
[root@host pci@800000029004005]# lsprop "ibm,ddw-extensions"
ibm,ddw-extensions
                 00000002 00000056 00000000

> 
> > +		return 5;
> > +	return 6;
> > +}
> > +
> >  static int query_ddw(struct pci_dev *dev, const u32 *ddw_avail,
> > -		     struct ddw_query_response *query)
> > +			 struct ddw_query_response *query,
> > +			 struct device_node *par_dn)
> >  {
> >  	struct device_node *dn;
> >  	struct pci_dn *pdn;
> > -	u32 cfg_addr;
> > +	u32 cfg_addr, query_out[5];
> >  	u64 buid;
> > -	int ret;
> > +	int ret, out_sz;
> >  
> >  	/*
> >  	 * Get the config address and phb buid of the PE window.
> > @@ -888,12 +906,29 @@ static int query_ddw(struct pci_dev *dev, const u32
> > *ddw_avail,
> >  	pdn = PCI_DN(dn);
> >  	buid = pdn->phb->buid;
> >  	cfg_addr = ((pdn->busno << 16) | (pdn->devfn << 8));
> > +	out_sz = query_ddw_out_sz(par_dn);
> > +
> > +	ret = rtas_call(ddw_avail[0], 3, out_sz, query_out,
> > +			cfg_addr, BUID_HI(buid), BUID_LO(buid));
> > +	dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x returned
> > %d\n",
> > +		 ddw_avail[0], cfg_addr, BUID_HI(buid), BUID_LO(buid), ret);
> > +
> > +	switch (out_sz) {
> > +	case 5:
> > +		query->windows_available = query_out[0];
> > +		query->largest_available_block = query_out[1];
> > +		query->page_size = query_out[2];
> > +		query->migration_capable = query_out[3];
> > +		break;
> > +	case 6:
> > +		query->windows_available = query_out[0];
> > +		query->largest_available_block = ((u64)query_out[1] << 32) |
> > +						 query_out[2];
> > +		query->page_size = query_out[3];
> > +		query->migration_capable = query_out[4];
> > +		break;
> > +	}
> >  
> > -	ret = rtas_call(ddw_avail[0], 3, 5, (u32 *)query,
> > -			cfg_addr, BUID_HI(buid), BUID_LO(buid));
> > -	dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x"
> > -		" returned %d\n", ddw_avail[0], cfg_addr, BUID_HI(buid),
> > -		BUID_LO(buid), ret);
> >  	return ret;
> >  }
> >  
> > @@ -1040,7 +1075,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct
> > device_node *pdn)
> >  	 * of page sizes: supported and supported for migrate-dma.
> >  	 */
> >  	dn = pci_device_to_OF_node(dev);
> > -	ret = query_ddw(dev, ddw_avail, &query);
> > +	ret = query_ddw(dev, ddw_avail, &query, pdn);
> >  	if (ret != 0)
> >  		goto out_failed;
> >  
> > @@ -1068,7 +1103,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct
> > device_node *pdn)
> >  	/* check largest block * page size > max memory hotplug addr */
> >  	max_addr = ddw_memory_hotplug_max();
> >  	if (query.largest_available_block < (max_addr >> page_shift)) {
> > -		dev_dbg(&dev->dev, "can't map partition max 0x%llx with %u "
> > +		dev_dbg(&dev->dev, "can't map partition max 0x%llx with %llu "
> >  			"%llu-sized pages\n", max_addr,
> >  			query.largest_available_block,
> >  			1ULL << page_shift);
> >  		goto out_failed;
> > 

Best regards,
Leonardo
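
P.S. To make the index mapping discussed above concrete, here is a
minimal userspace sketch, not the patch itself. It assumes the layout
described in this thread (C index 0 is the size, 1 is the reset token,
2 is the flag for the 64-bit largest_available_block). The DDW_EXT_*
names and both helper functions are hypothetical, chosen only for
illustration, and the structure uses stdint types instead of the
kernel's u32/u64.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical names for the zero-based C indexes of the
 * "ibm,ddw-extensions" cells. PAPR numbers the list from 1, so
 * PAPR Listindex N lives at C index N - 1.
 */
enum {
	DDW_EXT_SIZE = 0,           /* Listindex 1: number of extensions */
	DDW_EXT_RESET_WIN = 1,      /* Listindex 2: reset token */
	DDW_EXT_QUERY_OUT_SIZE = 2, /* Listindex 3: 1 means 64-bit block */
};

/* Userspace stand-in for the structure changed by the patch. */
struct ddw_query_response {
	uint32_t windows_available;
	uint64_t largest_available_block;
	uint32_t page_size;
	uint32_t migration_capable;
};

/*
 * Number of return values (status included) to ask for:
 * 6 when the extension is present and set to 1, otherwise 5.
 */
static int ddw_query_out_sz(const uint32_t *ext, size_t len)
{
	if (!ext || len <= DDW_EXT_QUERY_OUT_SIZE)
		return 5;
	if (ext[DDW_EXT_SIZE] < DDW_EXT_QUERY_OUT_SIZE)
		return 5;
	if (ext[DDW_EXT_QUERY_OUT_SIZE] != 1)
		return 5;
	return 6;
}

/*
 * Copy the raw output cells (status excluded) into the structure.
 * Returns 0 on success, -1 for an unexpected out_sz.
 */
static int ddw_decode_query(int out_sz, const uint32_t *out,
			    struct ddw_query_response *q)
{
	switch (out_sz) {
	case 5:
		q->windows_available = out[0];
		q->largest_available_block = out[1];
		q->page_size = out[2];
		q->migration_capable = out[3];
		return 0;
	case 6:
		q->windows_available = out[0];
		/* The 64-bit value arrives as two cells, high word first. */
		q->largest_available_block = ((uint64_t)out[1] << 32) | out[2];
		q->page_size = out[3];
		q->migration_capable = out[4];
		return 0;
	default:
		return -1;
	}
}
```

With the lsprop output posted above (00000002 00000056 00000000), the
third cell is 0, so this sketch would ask for 5 return values.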