[PATCH v3 06/11] block: Introduce PCI P2P flags for request and request queue

2018-03-12 Thread Logan Gunthorpe
I_P2P flag set. Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- block/blk-core.c | 3 +++ include/linux/blk_types.h | 18 +- include/linux/blkdev.h| 3 +++ 3 files changed, 23 insertions(+), 1 deletion(-) diff --git a/block/blk-core.c b/block/blk-c

[PATCH v3 00/11] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-03-12 Thread Logan Gunthorpe
all IO transfers. And if a port is using P2P memory, adding new namespaces that are not supported by that memory will fail. Logan Gunthorpe (11): PCI/P2PDMA: Support peer-to-peer memory PCI/P2PDMA: Add sysfs group to display p2pmem stats PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the

[PATCH v3 10/11] nvme-pci: Add a quirk for a pseudo CMB

2018-03-12 Thread Logan Gunthorpe
Introduce a quirk to use CMB-like memory on older devices that have an exposed BAR but do not advertise support for using CMBLOC and CMBSIZE. We'd like to use some of these older cards to test P2P memory. Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/host/n

[PATCH v3 08/11] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-12 Thread Logan Gunthorpe
Register the CMB buffer as p2pmem and use the appropriate allocation functions to create and destroy the IO SQ. If the CMB supports WDS and RDS, publish it for use as P2P memory by other devices. Signed-off-by: Logan Gunthorpe --- drivers/nvme/host/pci.c | 75

[PATCH v3 04/11] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-03-12 Thread Logan Gunthorpe
-off-by: Logan Gunthorpe --- drivers/pci/Kconfig| 9 + drivers/pci/p2pdma.c | 44 drivers/pci/pci.c | 6 ++ include/linux/pci-p2pdma.h | 5 + 4 files changed, 64 insertions(+) diff --git a/drivers/pci/Kconfig b

[PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-12 Thread Logan Gunthorpe
k and feedback from Christoph Hellwig. Signed-off-by: Christoph Hellwig Signed-off-by: Logan Gunthorpe --- drivers/pci/Kconfig| 16 ++ drivers/pci/Makefile | 1 + drivers/pci/p2pdma.c | 679 + include/linux/memremap.h

Re: [PATCH v3 05/11] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-03-12 Thread Logan Gunthorpe
On 3/12/2018 1:41 PM, Jonathan Corbet wrote: This all seems good, but...could we consider moving this documentation to driver-api/PCI as it's converted to RST? That would keep it together with similar materials and bring a bit more coherence to Documentation/ as a whole. Yup, I'll change this

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 12/03/18 09:28 PM, Sinan Kaya wrote: On 3/12/2018 3:35 PM, Logan Gunthorpe wrote: Regarding the switch business, it is amazing how much trouble you went to in order to limit this functionality to very specific hardware. I thought that we had reached an agreement that code would not impose any

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 12/03/18 09:28 PM, Sinan Kaya wrote: Maybe, dev parameter should also be struct pci_dev so that you can get rid of all to_pci_dev() calls in this code including find_parent_pci_dev() function. No, this was mentioned in v2. find_parent_pci_dev is necessary because the calling drivers aren'

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 11:49 AM, Sinan Kaya wrote: And there's also the ACS problem which means if you want to use P2P on the root ports you'll have to disable ACS on the entire system. (Or preferably, the IOMMU groups need to get more sophisticated to allow for dynamic changes). Do you think you can

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 01:10 PM, Sinan Kaya wrote: > I was thinking of this for the pci_p2pdma_add_client() case for the > parent pointer. > > +struct pci_p2pdma_client { > + struct list_head list; > + struct pci_dev *client; > + struct pci_dev *provider; > +}; Yeah, that structure only exists

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 01:53 PM, Sinan Kaya wrote: > I agree disabling globally would be bad. Somebody can always say I have > ten switches on my system. I want to do peer-to-peer on one switch only. Now, > this change weakened security for the other switches that I had no intention > with doing P2P. > > I

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 03:22 PM, Sinan Kaya wrote: > It sounds like you have very tight hardware expectations for this to work > at this moment. You also don't want to generalize this code for others and > address the shortcomings. No, that's the way the community has pushed this work. Our original work wa

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 04:29 PM, Sinan Kaya wrote: > If hardware doesn't support it, blacklisting should have been the right > path and I still think that you should remove all switch business from the > code. > I did not hear enough justification for having a switch requirement > for P2P. I disagree. >

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 05:08 PM, Bjorn Helgaas wrote: > On Tue, Mar 13, 2018 at 10:31:55PM +, Stephen Bates wrote: > If it *is* necessary because Root Ports and devices below them behave > differently than in conventional PCI, I think you should include a > reference to the relevant section of the spec

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Logan Gunthorpe
On 13/03/18 05:19 PM, Sinan Kaya wrote: > It is still a switch it can move packets but, maybe it can move data at > 100kbps speed. As Stephen pointed out, it's a requirement of the PCIe spec that a switch supports P2P. If you want to sell a switch that does P2P with bad performance then that's

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-14 Thread Logan Gunthorpe
On 13/03/18 08:56 PM, Bjorn Helgaas wrote: > I assume you want to exclude Root Ports because of multi-function > devices and the "route to self" error. I was hoping for a reference > to that so I could learn more about it. I haven't been able to find where in the spec it forbids route to self.

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-14 Thread Logan Gunthorpe
On 14/03/18 06:16 AM, David Laight wrote: > That surprises me (unless I missed something last time I read the spec). > While P2P writes are relatively easy to handle, reads and any other TLP that > require acks are a completely different proposition. > There are no additional fields that can be s

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-14 Thread Logan Gunthorpe
On 14/03/18 12:51 PM, Bjorn Helgaas wrote: > You are focused on PCIe systems, and in those systems, most topologies > do have an upstream switch, which means two upstream bridges. I'm > trying to remove that assumption because I don't think there's a > requirement for it in the spec. Enforcing

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-14 Thread Logan Gunthorpe
On 14/03/18 01:28 PM, Dan Williams wrote: > P2P over PCI/PCI-X is quite common in devices like raid controllers. > It would be useful if those configurations were not left behind so > that Linux could feasibly deploy offload code to a controller in the > PCI domain. Thanks for the note. Neat. In

Re: [PATCH v3 11/11] nvmet: Optionally use PCI P2P memory

2018-03-21 Thread Logan Gunthorpe
On 21/03/18 03:27 AM, Christoph Hellwig wrote: >> + const char *page, size_t count) >> +{ >> +struct nvmet_port *port = to_nvmet_port(item); >> +struct device *dev; >> +struct pci_dev *p2p_dev = NULL; >> +bool use_p2pmem; >> + >> +switch (page[0])

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-23 Thread Logan Gunthorpe
On 23/03/18 03:50 PM, Bjorn Helgaas wrote: > Popping way up the stack, my original point was that I'm trying to > remove restrictions on what devices can participate in peer-to-peer > DMA. I think it's fairly clear that in conventional PCI, any devices > in the same PCI hierarchy, i.e., below th

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-26 Thread Logan Gunthorpe
On 24/03/18 09:28 AM, Stephen Bates wrote: > 1. There is no requirement for a single function to support internal DMAs but > in the case of NVMe we do have a protocol specific way for a NVMe function to > indicate it supports via the CMB BAR. Other protocols may also have such > methods but I

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-26 Thread Logan Gunthorpe
On 26/03/18 08:01 AM, Bjorn Helgaas wrote: > On Mon, Mar 26, 2018 at 12:11:38PM +0100, Jonathan Cameron wrote: >> On Tue, 13 Mar 2018 10:43:55 -0600 >> Logan Gunthorpe wrote: >>> It turns out that root ports that support P2P are far less common than >>> anyon

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-26 Thread Logan Gunthorpe
On 26/03/18 10:41 AM, Jason Gunthorpe wrote: > On Mon, Mar 26, 2018 at 12:11:38PM +0100, Jonathan Cameron wrote: >> On Tue, 13 Mar 2018 10:43:55 -0600 >> Logan Gunthorpe wrote: >> >>> On 12/03/18 09:28 PM, Sinan Kaya wrote: >>>> On 3/12/2018 3:35 PM,

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-26 Thread Logan Gunthorpe
On 26/03/18 01:35 PM, Jason Gunthorpe wrote: > I think this is another case of the HW can do it but the SW support is > missing. IOMMU configuration and maybe firmware too, for instance. Nope, not sure how you can make this leap. We've been specifically told that peer-to-peer PCIe DMA is not sup

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-27 Thread Logan Gunthorpe
On 27/03/18 02:47 AM, Jonathan Cameron wrote: > I'll see if I can get our PCI SIG people to follow this through and see if > it is just an omission or as Bjorn suggested, there is some reason we > aren't thinking of that makes it hard. That would be great! Thanks! Logan ___

[PATCH v4 05/14] docs-rst: Add a new directory for PCI documentation

2018-04-23 Thread Logan Gunthorpe
Add a new directory in the driver API guide for PCI specific documentation. This is in preparation for adding a new PCI P2P DMA driver writers guide which will go in this directory. Signed-off-by: Logan Gunthorpe Cc: Jonathan Corbet Cc: Mauro Carvalho Chehab Cc: Greg Kroah-Hartman Cc: Vinod

[PATCH v4 11/14] nvme-pci: Add a quirk for a pseudo CMB

2018-04-23 Thread Logan Gunthorpe
Introduce a quirk to use CMB-like memory on older devices that have an exposed BAR but do not advertise support for using CMBLOC and CMBSIZE. We'd like to use some of these older cards to test P2P memory. Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg --- drivers/nvme/host/n

[PATCH v4 14/14] nvmet: Optionally use PCI P2P memory

2018-04-23 Thread Logan Gunthorpe
he initial code] Signed-off-by: Christoph Hellwig Signed-off-by: Logan Gunthorpe --- drivers/nvme/target/configfs.c | 67 ++ drivers/nvme/target/core.c | 127 - drivers/nvme/target/io-cmd.c | 3 + drivers/nvme/target/nvmet.

[PATCH v4 01/14] PCI/P2PDMA: Support peer-to-peer memory

2018-04-23 Thread Logan Gunthorpe
ience. The PCI-SIG may be exploring adding a new capability bit to advertise whether this is possible for future hardware. This commit includes significant rework and feedback from Christoph Hellwig. Signed-off-by: Christoph Hellwig Signed-off-by: Logan Gunthorpe --- drivers/pci/Kconfig

[PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-04-23 Thread Logan Gunthorpe
transactions. Signed-off-by: Logan Gunthorpe --- drivers/pci/Kconfig| 9 + drivers/pci/p2pdma.c | 45 ++--- drivers/pci/pci.c | 6 ++ include/linux/pci-p2pdma.h | 5 + 4 files changed, 50 insertions(+), 15 deletions

[PATCH v4 07/14] block: Introduce PCI P2P flags for request and request queue

2018-04-23 Thread Logan Gunthorpe
I_P2P flag set. Signed-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg Reviewed-by: Christoph Hellwig --- block/blk-core.c | 3 +++ include/linux/blk_types.h | 18 +- include/linux/blkdev.h| 3 +++ 3 files changed, 23 insertions(+), 1 deletion(-) diff --git a/

[PATCH v4 06/14] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-04-23 Thread Logan Gunthorpe
converted to restructured text at this time. Signed-off-by: Logan Gunthorpe Cc: Jonathan Corbet --- Documentation/PCI/index.rst | 14 +++ Documentation/driver-api/pci/index.rst | 1 + Documentation/driver-api/pci/p2pdma.rst | 166 Documentation

[PATCH v4 03/14] PCI/P2PDMA: Add PCI p2pmem dma mappings to adjust the bus offset

2018-04-23 Thread Logan Gunthorpe
The DMA address used when mapping PCI P2P memory must be the PCI bus address. Thus, introduce pci_p2pmem_[un]map_sg() to map the correct addresses when using P2P memory. For this, we assume that an SGL passed to these functions contain all P2P memory or no P2P memory. Signed-off-by: Logan
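The address adjustment described in this patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation: struct toy_sg, toy_p2pmem_map_sg() and the bus_offset parameter are hypothetical names invented for illustration, and the offset is assumed to be the difference between the CPU physical address of the BAR and its PCI bus address. The model also only handles the all-P2P case and rejects anything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatterlist entry (not the kernel's struct). */
struct toy_sg {
	uint64_t phys_addr;	/* CPU physical address of the segment */
	uint32_t len;		/* segment length in bytes */
	bool is_p2p;		/* true if the segment lives in P2P memory */
	uint64_t dma_addr;	/* filled in by the mapping step */
};

/*
 * Map an all-P2P list by converting each CPU physical address into a
 * PCI bus address. bus_offset is assumed to be (CPU physical address of
 * the BAR) - (PCI bus address of the BAR).
 *
 * Returns the number of mapped entries, or 0 if any entry is not P2P
 * memory; nothing is modified in that case.
 */
int toy_p2pmem_map_sg(struct toy_sg *sg, int nents, uint64_t bus_offset)
{
	int i;

	assert(sg != NULL || nents == 0);

	/* First pass: refuse lists that contain non-P2P segments. */
	for (i = 0; i < nents; i++)
		if (!sg[i].is_p2p)
			return 0;

	/* Second pass: the DMA address is the bus address, not the CPU one. */
	for (i = 0; i < nents; i++)
		sg[i].dma_addr = sg[i].phys_addr - bus_offset;

	return nents;
}
```

The two-pass structure is a design choice of the sketch: checking the whole list before touching any entry means a rejected list is left unmodified.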

[PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-04-23 Thread Logan Gunthorpe
SDs (Intel, Seagate, Samsung) and p2pdma devices (Eideticom, Microsemi, Chelsio and Everspin) using switches from both Microsemi and Broadcomm. Logan Gunthorpe (14): PCI/P2PDMA: Support peer-to-peer memory PCI/P2PDMA: Add sysfs group to display p2pmem stats PCI/P2PDMA: Add PCI p2pmem dma mappi

[PATCH v4 09/14] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-04-23 Thread Logan Gunthorpe
, devm_memremap_pages() allocates regular memory without side effects that's accessible without the iomem accessors. Signed-off-by: Logan Gunthorpe --- drivers/nvme/host/pci.c | 75 +++-- 1 file changed, 41 insertions(+), 34 deletions(-) diff --git a/dr

[PATCH v4 02/14] PCI/P2PDMA: Add sysfs group to display p2pmem stats

2018-04-23 Thread Logan Gunthorpe
Add a sysfs group to display statistics about P2P memory that is registered in each PCI device. Attributes in the group display the total amount of P2P memory, the amount available and whether it is published or not. Signed-off-by: Logan Gunthorpe --- Documentation/ABI/testing/sysfs-bus-pci

[PATCH v4 10/14] nvme-pci: Add support for P2P memory in requests

2018-04-23 Thread Logan Gunthorpe
-off-by: Logan Gunthorpe Reviewed-by: Sagi Grimberg Reviewed-by: Christoph Hellwig --- drivers/nvme/host/core.c | 4 drivers/nvme/host/nvme.h | 1 + drivers/nvme/host/pci.c | 19 +++ 3 files changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/nvme/host/core.c b

[PATCH v4 13/14] nvmet-rdma: Use new SGL alloc/free helper for requests

2018-04-23 Thread Logan Gunthorpe
called once. Signed-off-by: Logan Gunthorpe Cc: Christoph Hellwig Cc: Sagi Grimberg --- drivers/nvme/target/rdma.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c index 52e0c5d579a7..f7a3459d618f

[PATCH v4 12/14] nvmet: Introduce helper functions to allocate and free request SGLs

2018-04-23 Thread Logan Gunthorpe
drivers. The presently unused 'sq' argument in the alloc function will be necessary to decide whether to use peer-to-peer memory and obtain the correct provider to allocate the memory. Signed-off-by: Logan Gunthorpe Cc: Christoph Hellwig Cc: Sagi Grimberg --- drivers/nvme/target/co

[PATCH v4 08/14] IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()

2018-04-23 Thread Logan Gunthorpe
is P2P the entire SGL should be P2P. Signed-off-by: Logan Gunthorpe Reviewed-by: Christoph Hellwig --- drivers/infiniband/core/rw.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c index c8963e91f92a

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-02 Thread Logan Gunthorpe
Hi Christian, On 5/2/2018 5:51 AM, Christian König wrote: it would be rather nice to have if you could separate out the functions to detect if peer2peer is possible between two devices. This would essentially be pci_p2pdma_distance() in the existing patchset. It returns the sum of the distanc

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-03 Thread Logan Gunthorpe
On 03/05/18 03:05 AM, Christian König wrote: > Ok, I'm still missing the big picture here. First question is what is > the P2PDMA provider? Well there's some pretty good documentation in the patchset for this, but in short, a provider is a device that provides some kind of P2P resource (ie. BAR

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-03 Thread Logan Gunthorpe
On 03/05/18 11:29 AM, Christian König wrote: > Ok, that is the point where I'm stuck. Why do we need that in one > function call in the PCIe subsystem? > > The problem at least with GPUs is that we seriously don't have that > information here, cause the PCI subsystem might not be aware of all

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-04 Thread Logan Gunthorpe
On 04/05/18 08:27 AM, Christian König wrote: > Are you sure that this is more convenient? At least on first glance it > feels overly complicated. > > I mean what's the difference between the two approaches? > >     sum = pci_p2pdma_distance(target, [A, B, C, target]); > > and > >     sum =

Re: [PATCH v4 01/14] PCI/P2PDMA: Support peer-to-peer memory

2018-05-07 Thread Logan Gunthorpe
Thanks for the review. I'll apply all of these changes for the next version of the set. >> +/* >> + * If a device is behind a switch, we try to find the upstream bridge >> + * port of the switch. This requires two calls to pci_upstream_bridge(): >> + * one for the upstream port on the switch, o
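The two-call lookup mentioned in the quoted comment can be modelled in user space roughly as follows. This is a hedged sketch, not the code from the patch: struct toy_dev, toy_upstream_bridge() and toy_upstream_bridge_port() are hypothetical names, and the model assumes each device simply records the bridge directly above it. A device with fewer than two bridges above it yields NULL in this model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device in a hierarchy. */
struct toy_dev {
	const char *name;
	struct toy_dev *parent;	/* bridge directly above, NULL at the top */
};

/* Toy analogue of a single upstream-bridge lookup. */
struct toy_dev *toy_upstream_bridge(struct toy_dev *dev)
{
	return dev ? dev->parent : NULL;
}

/*
 * Walk two levels up from the device. Returns NULL when the device is
 * NULL or has fewer than two bridges above it.
 */
struct toy_dev *toy_upstream_bridge_port(struct toy_dev *dev)
{
	struct toy_dev *up1 = toy_upstream_bridge(dev);

	return toy_upstream_bridge(up1);
}
```

Two endpoints below the same switch would resolve to the same toy_dev pointer here, which is the kind of comparison such a lookup makes possible.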

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-07 Thread Logan Gunthorpe
> How do you envision merging this? There's a big chunk in drivers/pci, but > really no opportunity for conflicts there, and there's significant stuff in > block and nvme that I don't really want to merge. > > If Alex is OK with the ACS situation, I can ack the PCI parts and you could > merge it

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 01:17 AM, Christian König wrote: > AMD APUs mandatory need the ACS flag set for the GPU integrated in the > CPU when IOMMU is enabled or otherwise you will break SVM. Well, given that the current set only disables ACS bits on bridges (previous versions were only on switches) this sh

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 10:50 AM, Christian König wrote: > E.g. transactions are initially sent to the root complex for > translation, that's for sure. But at least for AMD GPUs the root complex > answers with the translated address which is then cached in the device. > > So further transactions for the s

Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 10:57 AM, Alex Williamson wrote: > AIUI from previously questioning this, the change is hidden behind a > build-time config option and only custom kernels or distros optimized > for this sort of support would enable that build option. I'm more than > a little dubious though that we'r

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 01:34 PM, Alex Williamson wrote: > They are not so unrelated, see the ACS Direct Translated P2P > capability, which in fact must be implemented by switch downstream > ports implementing ACS and works specifically with ATS. This appears to > be the way the PCI SIG would intend for P2P

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 02:13 PM, Alex Williamson wrote: > Well, I'm a bit confused, this patch series is specifically disabling > ACS on switches, but per the spec downstream switch ports implementing > ACS MUST implement direct translated P2P. So it seems the only > potential gap here is the endpoint, whi

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 02:43 PM, Alex Williamson wrote: > Yes, GPUs seem to be leading the pack in implementing ATS. So now the > dumb question, why not simply turn off the IOMMU and thus ACS? The > argument of using the IOMMU for security is rather diminished if we're > specifically enabling devices to p

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 04:03 PM, Alex Williamson wrote: > If IOMMU grouping implies device assignment (because nobody else uses > it to the same extent as device assignment) then the build-time option > falls to pieces, we need a single kernel that can do both. I think we > need to get more clever about al

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 05:00 PM, Dan Williams wrote: >> I'd advise caution with a user supplied BDF approach, we have no >> guaranteed persistence for a device's PCI address. Adding a device >> might renumber the buses, replacing a device with one that consumes >> more/less bus numbers can renumber the bus

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-08 Thread Logan Gunthorpe
On 08/05/18 05:11 PM, Alex Williamson wrote: > On to the implementation details... I already mentioned the BDF issue > in my other reply. If we had a way to persistently identify a device, > would we specify the downstream points at which we want to disable ACS > or the endpoints that we want to

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-09 Thread Logan Gunthorpe
On 09/05/18 07:40 AM, Christian König wrote: > The key takeaway is that when any device has ATS enabled you can't > disable ACS without breaking it (even if you unplug and replug it). I don't follow how you came to this conclusion... The ACS bits we'd be turning off are the ones that force TLP

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-10 Thread Logan Gunthorpe
On 10/05/18 08:16 AM, Stephen Bates wrote: > Hi Christian > >> Why would a switch not identify that as a peer address? We use the PASID >>together with ATS to identify the address space which a transaction >>should use. > > I think you are conflating two types of TLPs here. If the de

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-10 Thread Logan Gunthorpe
On 10/05/18 11:11 AM, Stephen Bates wrote: >> Not to me. In the p2pdma code we specifically program DMA engines with >> the PCI bus address. > > Ah yes of course. Brain fart on my part. We are not programming the P2PDMA > initiator with an IOVA but with the PCI bus address... > >> So regardl

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-10 Thread Logan Gunthorpe
On 10/05/18 12:41 PM, Stephen Bates wrote: > Hi Jerome > >>Note on GPU we would not rely on ATS for peer to peer. Some parts >>of the GPU (DMA engines) do not necessarily support ATS. Yet those >>are the parts likely to be used in peer to peer. > > OK this is good to know. I agree

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-11 Thread Logan Gunthorpe
On 5/11/2018 2:52 AM, Christian König wrote: This only works when the IOVA and the PCI bus addresses never overlap. I'm not sure how the IOVA allocation works but I don't think we guarantee that on Linux. I find this hard to believe. There's always the possibility that some part of the system

Re: [PATCH v4 04/14] PCI/P2PDMA: Clear ACS P2P flags for all devices behind switches

2018-05-11 Thread Logan Gunthorpe
On 5/11/2018 4:24 PM, Stephen Bates wrote: All  Alex (or anyone else) can you point to where IOVA addresses are generated? A case of RTFM perhaps (though a pointer to the code would still be appreciated). https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt Some exceptions to IOVA --

Re: [PATCH v4 06/14] PCI/P2PDMA: Add P2P DMA driver writer's documentation

2018-05-22 Thread Logan Gunthorpe
Thanks for the review Randy! I'll make the changes for the next time we post the series. On 22/05/18 03:24 PM, Randy Dunlap wrote: >> +The first task an orchestrator driver must do is compile a list of >> +all client drivers that will be involved in a given transaction. For >> +example, the NVMe T

Re: [PATCH v3 3/6] mm: support THP migration to device private memory

2020-12-02 Thread Logan Gunthorpe
On 2020-12-02 3:14 a.m., Christoph Hellwig wrote: >> MEMORY_DEVICE_PCI_P2PDMA: >> Struct pages are created in pci_p2pdma_add_resource() and represent device >> memory accessible by PCIe BAR address space. Memory is allocated with >> pci_alloc_p2pmem() based on a byte length but the gen_pool_alloc

Re: [PATCH 1/3] cdev: Finish the cdev api with queued mode support

2021-01-20 Thread Logan Gunthorpe
On 2021-01-20 12:38 p.m., Dan Williams wrote: > ...common reference count handling scenarios were addressed, but the > shutdown-synchronization problem was only mentioned as something driver > developers need to be aware in the following note: > > NOTE: This guarantees that associated sysf

Re: [PATCH 2/3] iopmem : Add a block device driver for PCIe attached IO memory.

2016-10-28 Thread Logan Gunthorpe
Hi Christoph, Thanks so much for the detailed review of the code! Even though by the sounds of things we will be moving to device dax and most of this is moot. Still, it's great to get some feedback and learn a few things. I've given some responses below. On 28/10/16 12:45 AM, Christoph Hellwig

Re: Enabling peer to peer device transactions for PCIe devices

2016-11-23 Thread Logan Gunthorpe
Hey, On 22/11/16 11:59 AM, Serguei Sagalovitch wrote: > - How well we will be able to handle case when we need to "move"/"evict" >memory/data to the new location so CPU pointer should point to the > new physical location/address > (and may be not in PCI device memory at all)? IMO any mem

Re: Enabling peer to peer device transactions for PCIe devices

2016-11-23 Thread Logan Gunthorpe
On 23/11/16 01:33 PM, Jason Gunthorpe wrote: > On Wed, Nov 23, 2016 at 02:58:38PM -0500, Serguei Sagalovitch wrote: > >>We do not want to have "highly" dynamic translation due to >>performance cost. We need to support "overcommit" but would >>like to minimize impact. To support RDM

Re: Enabling peer to peer device transactions for PCIe devices

2016-11-23 Thread Logan Gunthorpe
On 23/11/16 02:55 PM, Jason Gunthorpe wrote: >>> Only ODP hardware allows changing the DMA address on the fly, and it >>> works at the page table level. We do not need special handling for >>> RDMA. >> >> I am aware of ODP but, noted by others, it doesn't provide a general >> solution to the poin

Re: Enabling peer to peer device transactions for PCIe devices

2016-11-24 Thread Logan Gunthorpe
Hey, On 24/11/16 02:45 AM, Christian König wrote: > E.g. it can happen that PCI device A exports its BAR using ZONE_DEVICE. > Now PCI device B (a SATA device) can directly read/write to it because > it is on the same bus segment, but PCI device C (a network card for > example) can't because it is

Re: Enabling peer to peer device transactions for PCIe devices

2016-11-24 Thread Logan Gunthorpe
On 24/11/16 09:42 AM, Jason Gunthorpe wrote: > There are three cases to worry about: > - Coherent long lived page table mirroring (RDMA ODP MR) > - Non-coherent long lived page table mirroring (RDMA MR) > - Short lived DMA mapping (everything else) > > Like you say below we have to handle sho

Re: Enabling peer to peer device transactions for PCIe devices

2016-11-25 Thread Logan Gunthorpe
On 25/11/16 06:06 AM, Christian König wrote: > Well Serguei sent me a couple of documents about QPI when we started to > discuss this internally as well and that's exactly one of the cases I > had in mind when writing this. > > If I understood it correctly for such systems P2P is technical possi

Re: Enabling peer to peer device transactions for PCIe devices

2016-11-28 Thread Logan Gunthorpe
On 28/11/16 09:57 AM, Jason Gunthorpe wrote: >> On PeerDirect, we have some kind of a middle-ground solution for pinning >> GPU memory. We create a non-ODP MR pointing to VRAM but rely on >> user-space and the GPU not to migrate it. If they do, the MR gets >> destroyed immediately. > > That soun

Re: Enabling peer to peer device transactions for PCIe devices

2016-11-28 Thread Logan Gunthorpe
On 28/11/16 12:35 PM, Serguei Sagalovitch wrote: > As soon as PeerDirect mapping is called, the GPU must not "move" > such memory. This is by PeerDirect design. It is similar to how it works > with system memory and RDMA MR: when "get_user_pages" is called then the > memory is pinned. We haven

Re: Enabling peer to peer device transactions for PCIe devices

2016-11-30 Thread Logan Gunthorpe
On 30/11/16 09:23 AM, Jason Gunthorpe wrote: >> Two cases I can think of are RDMA access to an NVMe device's controller >> memory buffer, > > I'm not sure on the use model there.. The NVMe fabrics stuff could probably make use of this. It's an in-kernel system to allow remote access to an NVMe

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Logan Gunthorpe
On 05/12/16 11:08 AM, Dan Williams wrote: I've already recommended that iopmem not be a block device and instead be a device-dax instance. I also don't think it should claim the PCI ID, rather the driver that wants to map one of its bars this way can register the memory region with the device-dax

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Logan Gunthorpe
On 05/12/16 12:14 PM, Jason Gunthorpe wrote: But CMB sounds much more like the GPU case where there is a specialized allocator handing out the BAR to consumers, so I'm not sure a general purpose chardev makes a lot of sense? I don't think it will ever need to be as complicated as the GPU case

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-05 Thread Logan Gunthorpe
On 05/12/16 12:46 PM, Jason Gunthorpe wrote: NVMe might have to deal with pci-e hot-unplug, which is a similar problem-class to the GPU case.. Sure, but if the NVMe device gets hot-unplugged it means that all the CMB mappings are useless and need to be torn down. This probably means killing

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-06 Thread Logan Gunthorpe
Hey, On 06/12/16 09:38 AM, Jason Gunthorpe wrote: >>> I'm not opposed to mapping /dev/nvmeX. However, the lookup is trivial >>> to accomplish in sysfs through /sys/dev/char to find the sysfs path of the >>> device-dax instance under the nvme device, or if you already have the nvme >>> sysfs path

Re: Enabling peer to peer device transactions for PCIe devices

2016-12-06 Thread Logan Gunthorpe
Hey, > Okay, so clearly this needs a kernel side NVMe specific allocator > and locking so users don't step on each other.. Yup, ideally. That's why device dax isn't ideal for this application: it doesn't provide any way to prevent users from stepping on each other. > Or as Christoph says some ki

Re: Enabling peer to peer device transactions for PCIe devices

2017-01-06 Thread Logan Gunthorpe
On 06/01/17 11:26 AM, Jason Gunthorpe wrote: > Make a generic API for all of this and you'd have my vote.. > > IMHO, you must support basic pinning semantics - that is necessary to > support generic short lived DMA (eg filesystem, etc). That hardware > can clearly do that if it can support ODP.

Re: Enabling peer to peer device transactions for PCIe devices

2017-01-12 Thread Logan Gunthorpe
On 11/01/17 09:54 PM, Stephen Bates wrote: > The iopmem patchset addressed all the use cases above and while it is not > an in kernel API it could have been modified to be one reasonably easily. > As Logan states the driver can then choose to pass the VMAs to user-space > in a manner that makes s

[PATCH] device-dax: don't set kobj parent during cdev init

2017-02-10 Thread Logan Gunthorpe
mistake. [1] https://lkml.org/lkml/2017/2/10/370 Signed-off-by: Logan Gunthorpe --- drivers/dax/dax.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c index ed758b7..24e53b7 100644 --- a/drivers/dax/dax.c +++ b/drivers/dax/dax.c @@ -699

Re: [PATCH] device-dax: don't set kobj parent during cdev init

2017-02-10 Thread Logan Gunthorpe
Hey, Also on the subject of very minor fixes: I noticed drivers/dax is not in the maintainers file. I just assumed the nvdimm list should have been included with those from get_maintainers. Thanks, Logan On 10/02/17 12:19 PM, Logan Gunthorpe wrote: > I copied this code and per feedback f

Re: [PATCH] device-dax: don't set kobj parent during cdev init

2017-02-11 Thread Logan Gunthorpe
On 11/02/17 01:56 AM, Dan Williams wrote: When the device is unregistered it invalidates all existing mappings, but the driver may continue to service vm fault requests until the final put of the cdev. Until that time the fault handler needs to be able to check dax_dev->alive. Since the final cde

Re: [PATCH] device-dax: don't set kobj parent during cdev init

2017-02-11 Thread Logan Gunthorpe
On 11/02/17 11:27 AM, Dan Williams wrote: > Why, when the lifetime of the cdev is already correct? Well, it's only correct if you use the kobj parent trick which Greg is arguing against. As someone reviewing/copying code that trick is unclear, undocumented and it looks rather odd messing with int

Re: [PATCH] device-dax: don't set kobj parent during cdev init

2017-02-11 Thread Logan Gunthorpe
On 11/02/17 11:58 AM, Dan Williams wrote: > Also when using an embedded cdev how would you recommend avoiding this > problem? I don't know. Hopefully, Greg has a good idea. But it sounds like a general problem that a lot of cdev's actually suffer from without realizing. Perhaps we need a more gen

Re: [PATCH] device-dax: don't set kobj parent during cdev init

2017-02-13 Thread Logan Gunthorpe
Hey, I like the interface. It's just Greg that needs to comment on whether using the kobj.parent for this purpose is actually sane. That was his argument from the beginning. Logan On 13/02/17 01:47 PM, Dan Williams wrote: > On Sat, Feb 11, 2017 at 9:42 PM, Logan Gunthorpe wrote: >>

[PATCH 06/14] platform/chrome: utilize new device_add_cdev helper function

2017-02-20 Thread Logan Gunthorpe
Signed-off-by: Logan Gunthorpe --- drivers/platform/chrome/cros_ec_dev.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/platform/chrome/cros_ec_dev.c b/drivers/platform/chrome/cros_ec_dev.c index 47268ec..658fb99 100644 --- a/drivers/platform/chrome

[PATCH 04/14] gpiolib: utilize new device_add_cdev helper function

2017-02-20 Thread Logan Gunthorpe
Signed-off-by: Logan Gunthorpe --- drivers/gpio/gpiolib.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index a07ae9e..04dbc4a 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -1036,9 +1036,8 @@ static int

[PATCH 13/14] scsi: utilize new device_add_cdev helper function

2017-02-20 Thread Logan Gunthorpe
Note: the chardev instance in osd_uld.c originally did not set the kobject parent. Thus, I'm reasonably confident that because of this, this code would have suffered from a minor use after free bug if the cdev was open when the backing device was released. Signed-off-by: Logan Gunt
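The lifetime issue described in this note can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_obj, model_init and model_put are hypothetical names, not kernel APIs, and the model only mimics the idea that a child object with its parent set holds a reference on that parent until the child's own final put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: a hypothetical stand-in for a refcounted object. */
struct model_obj {
    int refs;                 /* outstanding references */
    bool released;            /* set once the last reference is dropped */
    struct model_obj *parent; /* optional parent pinned by this object */
};

/* Start with one reference; if a parent is given, pin it as well. */
static void model_init(struct model_obj *o, struct model_obj *parent)
{
    o->refs = 1;
    o->released = false;
    o->parent = parent;
    if (parent)
        parent->refs++;
}

/* Drop one reference; on the last one, release and unpin the parent. */
static void model_put(struct model_obj *o)
{
    assert(o->refs > 0);
    if (--o->refs > 0)
        return;
    o->released = true;
    if (o->parent)
        model_put(o->parent);
}
```

In this model, if the "cdev" is initialised with the "device" as its parent, dropping the device's own reference while the cdev is still held (for instance by an open file) does not release the device. With no parent set, the device is released while the cdev is still live, which corresponds to the use-after-free window the note describes.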

[PATCH 09/14] media: utilize new device_add_cdev helper function

2017-02-20 Thread Logan Gunthorpe
Signed-off-by: Logan Gunthorpe --- drivers/media/cec/cec-core.c | 3 +-- drivers/media/media-devnode.c | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/media/cec/cec-core.c b/drivers/media/cec/cec-core.c index aca3ab8..a475aa5 100644 --- a/drivers/media/cec/cec

[PATCH 01/14] chardev: add helper function to register char devs with a struct device

2017-02-20 Thread Logan Gunthorpe
set members in the underlying kobject. In [1], Dan notes he took inspiration for the form of the API from device_add_disk. [1] https://lkml.org/lkml/2017/2/13/700 [2] https://lkml.org/lkml/2017/2/10/370 Signed-off-by: Logan Gunthorpe --- fs/char_dev.c| 24 include/linux/cde

[PATCH 02/14] device-dax: utilize new device_add_cdev helper function

2017-02-20 Thread Logan Gunthorpe
Signed-off-by: Logan Gunthorpe --- drivers/dax/dax.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c index ed758b7..0d24822 100644 --- a/drivers/dax/dax.c +++ b/drivers/dax/dax.c @@ -701,12 +701,12 @@ struct dax_dev

[PATCH 10/14] mtd: utilize new device_add_cdev helper function

2017-02-20 Thread Logan Gunthorpe
Note: neither of the cdev instances in the mtd tree originally set the kobject parent. Thus, I'm reasonably confident that both these instances would have suffered from a minor use after free bug if the cdevs were open when the backing device was released. Signed-off-by: Logan Gunt

[PATCH 14/14] switchtec: utilize new device_add_cdev helper function

2017-02-20 Thread Logan Gunthorpe
Signed-off-by: Logan Gunthorpe --- drivers/pci/switch/switchtec.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c index 82bfd18..95aabd0 100644 --- a/drivers/pci/switch/switchtec.c +++ b/drivers/pci/switch

[PATCH 11/14] rapidio: utilize new device_add_cdev helper function

2017-02-20 Thread Logan Gunthorpe
Note: the chardev instance in rio_mport_cdev originally did not set the kobject parent. Thus, I'm reasonably confident that because of this, this code would have suffered from a minor use after free bug if the cdev was open when the backing device was released. Signed-off-by: Logan Gunt

[PATCH 00/14] Cleanup chardev instances with helper function

2017-02-20 Thread Logan Gunthorpe
y new driver which, as yet, has not been accepted upstream. @Dan the first patch in this series will need your sign-off. Thanks, Logan [1] https://lkml.org/lkml/2017/2/10/370 [2] https://lkml.org/lkml/2017/2/10/607 [3] https://lkml.org/lkml/2017/2/13/700 Logan Gunthorpe (14): chardev: add h

[PATCH 12/14] rtc: utilize new device_add_cdev helper function

2017-02-20 Thread Logan Gunthorpe
Signed-off-by: Logan Gunthorpe --- drivers/rtc/rtc-dev.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/rtc/rtc-dev.c b/drivers/rtc/rtc-dev.c index a6d9434..e4012bb 100644 --- a/drivers/rtc/rtc-dev.c +++ b/drivers/rtc/rtc-dev.c @@ -477,12 +477,11 @@ void
