> -----Original Message----- > From: Thomas Monjalon <tho...@monjalon.net> > Sent: Monday, July 15, 2019 8:34 PM > To: Jerin Jacob Kollanukkaran <jer...@marvell.com> > Cc: Burakov, Anatoly <anatoly.bura...@intel.com>; David Marchand > <david.march...@redhat.com>; dev@dpdk.org; John McNamara > <john.mcnam...@intel.com>; Marko Kovacevic > <marko.kovace...@intel.com>; Igor Russkikh > <igor.russk...@aquantia.com>; Pavel Belous <pavel.bel...@aquantia.com>; > Ajit Khaparde <ajit.khapa...@broadcom.com>; Somnath Kotur > <somnath.ko...@broadcom.com>; Wenzhuo Lu <wenzhuo...@intel.com>; > John Daley <johnd...@cisco.com>; Hyong Youb Kim <hyon...@cisco.com>; > Qi Zhang <qi.z.zh...@intel.com>; Xiao Wang <xiao.w.w...@intel.com>; > Beilei Xing <beilei.x...@intel.com>; Jingjing Wu <jingjing...@intel.com>; > Qiming Yang <qiming.y...@intel.com>; Konstantin Ananyev > <konstantin.anan...@intel.com>; Matan Azrad <ma...@mellanox.com>; > Shahaf Shuler <shah...@mellanox.com>; Yongseok Koh > <ys...@mellanox.com>; Viacheslav Ovsiienko > <viachesl...@mellanox.com>; Alejandro Lucero > <alejandro.luc...@netronome.com>; Nithin Kumar Dabilpuram > <ndabilpu...@marvell.com>; Kiran Kumar Kokkilagadda > <kirankum...@marvell.com>; Rasesh Mody <rm...@marvell.com>; Shahed > Shaikh <shsha...@marvell.com>; Bruce Richardson > <bruce.richard...@intel.com>; alia...@mellanox.com; > acon...@redhat.com > Subject: Re: [PATCH 2/2] eal: fix IOVA mode selection as VA for pci drivers > > 15/07/2019 16:26, Jerin Jacob Kollanukkaran: > > > > > + > > > > > +IOVA Mode is selected by considering what the current usable > > > > > +Devices on the system requires and/or supports. > > > > > + > > > > > +Below is the 2-step heuristic for this choice. > > > > > + > > > > > +For the first step, EAL asks each bus its requirement in terms > > > > > +of IOVA mode and decides on a preferred IOVA mode. > > > > > + > > > > > +- if all buses report RTE_IOVA_PA, then the preferred IOVA mode > > > > > +is RTE_IOVA_PA, > > > > > +- if all buses report RTE_IOVA_VA, then the preferred IOVA mode > > > > > +is RTE_IOVA_VA, > > > > > +- if all buses report RTE_IOVA_DC, no bus expressed a > > > > > +preferrence, then the > > > > > + preferred mode is RTE_IOVA_DC, > > > > > +- if the buses disagree (at least one wants RTE_IOVA_PA and at > > > > > +least one wants > > > > > + RTE_IOVA_VA), then the preferred IOVA mode is RTE_IOVA_DC > > > > > +(see below with the > > > > > + check on Physical Addresses availability), > > > > > + > > > > > +The second step is checking if the preferred mode complies with > > > > > +the Physical Addresses availability since those are only > > > > > +available to root user in recent kernels. > > > > > + > > > > > +- if the preferred mode is RTE_IOVA_PA but there is no access > > > > > +to Physical > > > > > + Addresses, then EAL init will fail early, since later probing > > > > > +of the devices > > > > > + would fail anyway, > > > > > +- if the preferred mode is RTE_IOVA_DC then based on the > > > > > +Physical Addresses > > > > > + availability, the preferred mode is adjusted to RTE_IOVA_PA > > > > > +or > > > RTE_IOVA_VA. > > > > > + In the case when the buses had disagreed on the IOVA Mode at > > > > > +the first step, > > > > > + part of the buses won't work because of this decision. > > > > > > > > Is there any specific reason why we always prefer PA if physical > > > > addresses are available? Since we're already assuming that all > > > > devices support PA and VA anyway, what's the harm in enabling VA by > default? > > > > > > If PA is available, it means we are running as root. > > > We can assume that using root is a choice, probably related to a > > > preference for PA. > > > > # Even if we are running as root, Why to choose PA in case of DC? > > ie. Following logic is not need > > if (iova_mode == RTE_IOVA_DC) { > > iova_mode = phys_addrs ? RTE_IOVA_PA : RTE_IOVA_VA; > > RTE_LOG(DEBUG, EAL, > > "Buses did not request a specific IOVA > > mode, using '%s' > based on physical addresses availability.\n", > > phys_addrs ? "PA" : "VA"); > > } > > Why running as root if using VA anyway? > We can assume the user knows what he is doing, so it is a user choice. > We want to allow the user choosing, right?
The user can override iova=pa/va as eal argument if user needs to run a specific mode. Running as root for various other reason(just be lazy) etc. it is not or it should not be connected to set the mode as PA. > > > # When DPDK running on guest, Anyway it can not access the real PA, It will > be IPA. > > What is IPA? Isn't it a beer? There may a beer with that name. In this context, it is "Intermediate physical address" > > > So I don't understand logic behind choose PA when DC. > > To me, it make sense to choose PA when DC. > > You probably mean "choose VA". Yup. > > > # To align with RTE_PCI_DRV_NEED_MAPPING flag and reflect it "need" > > rather than support, I think, flag can be changed to > > RTE_PCI_DRV_NEED_IOVA_AS_VA > > I think the most important is to have a good documentation of this flag (it > was not done properly when Cavium introduced it initially). > If you want to rename the flag, you can do it in a separate patch. > If renaming, I really would like to get an answer to an old question: > Why IO adress is called IOVA? The name "IOVA_AS_VA" looks strange. IOVA = IO virtual address Since IOVA can be PA or VA, the name IOVA_AS_VA as chosen > For reference, one description of addressing: > https://lists.linuxfoundation.org/pipermail/iommu/2018-May/027686.html > > About the naming, do you remember how I insisted to have a correct naming > of all related stuff in DPDK? It was hard to get it accepted, the discussion > was > not nice and I stopped insisting to get all details fine because I just got > bored. > It was a really bad experience. I agree. To me that bad experience was due to mostly not having enough technical comments On the proposal. Though I am not the author/owner of it. > You can ask why I remind this now? Because we must take care of all details, > make sure our messages are well understood, and be cooperative. No disagreement. If we see the history the meaning got changed/updated in this commit By adding intel drivers to it. I would nt say it is big ideal, It just C code, It can be changed based on the need. I think, what really import is, maintain the the feature and commitment towards fixing any issue. commit f37dfab21c988d2d0ecb3c82be4ba9738c7e51c7 Author: Jianfeng Tan <jianfeng....@intel.com> Date: Wed Oct 11 10:33:48 2017 +0000 drivers/net: enable IOVA mode for Intel PMDs If we want to enable IOVA mode, introduced by commit 93878cf0255e ("eal: introduce helper API for IOVA mode"), we need PMDs (for PCI devices) to expose this flag. Signed-off-by: Jianfeng Tan <jianfeng....@intel.com> Acked-by: Anatoly Burakov <anatoly.bura...@intel.com> Reviewed-by: Santosh Shukla <santosh.shu...@caviumnetworks.com> > > > Other than above points, > > Reviewed this patch and tested on octeontx2, It looks good to me. > >