Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-13 Thread Dan Williams
On Thu, Aug 13, 2015 at 9:34 AM, Boaz Harrosh  wrote:
> On 08/13/2015 06:21 PM, Dan Williams wrote:
>> On Wed, Aug 12, 2015 at 11:26 PM, Boaz Harrosh  wrote:
> <>
>>
>> Hmm, that's not the same block layer I've been working with for the
>> past several years:
>>
>> $ mount /dev/pmem0 /mnt
>> $ echo namespace0.0 > ../drivers/nd_pmem/unbind # succeeds
>>
>> Unbind always proceeds unconditionally.  See the recent kernel summit
>> topic discussion around devm vs unbind [1].  While kmap_atomic_pfn_t()
>> does not implement revoke semantics it at least forces re-validation
>> and time bounded references.  For the unplug case we'll need to go
>> shootdown those DAX mappings in userspace so that they return SIGBUS
>> on access, or something along those lines.
>>
>
> Then fix unbind to refuse. What is the point of unbind when it trashes
> the hot path so badly and makes the code so fat.

What? The DAX hot path avoids the kernel entirely.

> Who uses it and what for?

The device driver core.  We simply can't hold off remove indefinitely.
If the administrator wants the device disabled we need to tear down
and revoke active mappings, or at very least guarantee time bounded
removal.

> First I ever heard of it and I do use Linux a little bit.
>
>> [1]: http://www.spinics.net/lists/kernel/msg2032864.html
>>
> Hm...
>
> OK I hate it. I would just make sure to override and refuse unbinding with an
> elevated ref count. Is not a good reason for me to trash the hotpath.

Again, the current usages are not in hot paths.  If it becomes part of
a hot path *and* shows up in a profile we can look to implement
something with less overhead.  Until then we should plan to honor the
lifetime as defined by ->probe() and ->remove().

In fact I proposed the same as you, but then changed my mind based on
Tejun's response [1].  So please reconsider this idea to solve the
problem by blocking ->remove().  PMEM is new and special, but not
*that* special as to justify breaking basic guarantees.

[1]: https://lkml.org/lkml/2015/7/15/731
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 06:21 PM, Dan Williams wrote:
> On Wed, Aug 12, 2015 at 11:26 PM, Boaz Harrosh  wrote:
<>
> 
> Hmm, that's not the same block layer I've been working with for the
> past several years:
> 
> $ mount /dev/pmem0 /mnt
> $ echo namespace0.0 > ../drivers/nd_pmem/unbind # succeeds
> 
> Unbind always proceeds unconditionally.  See the recent kernel summit
> topic discussion around devm vs unbind [1].  While kmap_atomic_pfn_t()
> does not implement revoke semantics it at least forces re-validation
> and time bounded references.  For the unplug case we'll need to go
> shootdown those DAX mappings in userspace so that they return SIGBUS
> on access, or something along those lines.
> 

Then fix unbind to refuse. What is the point of unbind when it trashes
the hot path so badly and makes the code so fat. Who uses it and what for?

First I ever heard of it and I do use Linux a little bit.

> [1]: http://www.spinics.net/lists/kernel/msg2032864.html
> 
Hm...

OK I hate it. I would just make sure to override and refuse unbinding with an
elevated ref count. Is not a good reason for me to trash the hotpath.

>> And for god sake. I have a bdev I call bdev_direct_access(sector), the bdev 
>> calculated the
>> exact address for me (base + sector). Now I get back this __pfn_t and I need 
>> to call
>> kmap_atomic_pfn_t() which does a loop to search for my range and again 
>> base+offset ?
>>
>> This all model is broken, sorry?
> 
> I think you are confused about the lifetime of the userspace DAX
> mapping vs the kernel's mapping and the frequency of calls to
> kmap_atomic_pfn_t().  I'm sure you can make this loop look bad with a
> micro-benchmark, but the whole point of DAX is to get the kernel out
> of the I/O path, so I'm not sure this overhead shows up in any real
> way in practice.

Sigh! It does. very much. 4k random write for you. Will drop in half
if I do this. We've been testing with memory for a long time every
rcu lock counts. A single atomic will drop things by %20

Thanks
Boaz

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-13 Thread Dan Williams
On Wed, Aug 12, 2015 at 11:26 PM, Boaz Harrosh  wrote:
> B. Here this all set is a joke. The all "pmem disable vs still-in-use" 
> argument is mute
> here below you have inserted a live, used for ever, pfn into a process vm 
> without holding
> a map.

Careful, don't confuse "unbind" with "unplug".  "Unbind" invalidates
the driver's mapping (ioremap) while "unplug" would invalidate the
pfn.  DAX is indeed broken with respect to unplug and we'll need to go
solve that separately.  I expect "unplug" support will be needed for
hot provisioning pmem to/from virtual machines.

> The all "pmem disable vs still-in-use" is a joke. The FS loaded has a 
> reference on the bdev
> and the filehadle has a reference on the FS. So what is exactly this "pmem 
> disable" you are
> talking about?

Hmm, that's not the same block layer I've been working with for the
past several years:

$ mount /dev/pmem0 /mnt
$ echo namespace0.0 > ../drivers/nd_pmem/unbind # succeeds

Unbind always proceeds unconditionally.  See the recent kernel summit
topic discussion around devm vs unbind [1].  While kmap_atomic_pfn_t()
does not implement revoke semantics it at least forces re-validation
and time bounded references.  For the unplug case we'll need to go
shootdown those DAX mappings in userspace so that they return SIGBUS
on access, or something along those lines.

[1]: http://www.spinics.net/lists/kernel/msg2032864.html

> And for god sake. I have a bdev I call bdev_direct_access(sector), the bdev 
> calculated the
> exact address for me (base + sector). Now I get back this __pfn_t and I need 
> to call
> kmap_atomic_pfn_t() which does a loop to search for my range and again 
> base+offset ?
>
> This all model is broken, sorry?

I think you are confused about the lifetime of the userspace DAX
mapping vs the kernel's mapping and the frequency of calls to
kmap_atomic_pfn_t().  I'm sure you can make this loop look bad with a
micro-benchmark, but the whole point of DAX is to get the kernel out
of the I/O path, so I'm not sure this overhead shows up in any real
way in practice.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 06:01 AM, Dan Williams wrote:
> The primary source for non-page-backed page-frames to enter the system
> is via the pmem driver's ->direct_access() method.  The pfns returned by
> the top-level bdev_direct_access() may be passed to any other subsystem
> in the kernel and those sub-systems either need to assume that the pfn
> is page backed (CONFIG_DEV_PFN=n) or be prepared to handle non-page
> backed case (CONFIG_DEV_PFN=y).  Currently the pfns returned by
> ->direct_access() are only ever used by vm_insert_mixed() which does not
> care if the pfn is mapped.  As we go to add more usages of these pfns
> add the type-safety of __pfn_t.
> 
> This also simplifies the calling convention of ->direct_access() by not
> returning the virtual address in the same call.  This annotates cases
> where the kernel is directly accessing pmem outside the driver, and
> makes the valid lifetime of the reference explicit.  This property may
> be useful in the future for invalidating mappings to pmem, but for now
> it provides some protection against the "pmem disable vs still-in-use"
> race.
> 
> Note that axon_ram_direct_access and dcssblk_direct_access were
> previously making potentially incorrect assumptions about the addresses
> they passed to virt_to_phys().
> 
> [hch: various minor updates]
> Signed-off-by: Christoph Hellwig 
> Signed-off-by: Dan Williams 
> ---
>  arch/powerpc/platforms/Kconfig |1 +
>  arch/powerpc/sysdev/axonram.c  |   24 
>  drivers/block/brd.c|5 +--
>  drivers/nvdimm/Kconfig |1 +
>  drivers/nvdimm/pmem.c  |   24 +++-
>  drivers/s390/block/Kconfig |1 +
>  drivers/s390/block/dcssblk.c   |   23 ++--
>  fs/Kconfig |1 +
>  fs/block_dev.c |4 +-
>  fs/dax.c   |   79 
> +---
>  include/linux/blkdev.h |7 ++--
>  include/linux/mm.h |   12 ++
>  12 files changed, 129 insertions(+), 53 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
> index b7f9c408bf24..6b1c2f2e5fb4 100644
> --- a/arch/powerpc/platforms/Kconfig
> +++ b/arch/powerpc/platforms/Kconfig
> @@ -307,6 +307,7 @@ config CPM2
>  config AXON_RAM
>   tristate "Axon DDR2 memory device driver"
>   depends on PPC_IBM_CELL_BLADE && BLOCK
> + select KMAP_PFN
>   default m
>   help
> It registers one block device per Axon's DDR2 memory bank found
> diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c
> index e8657d3bc588..7c5a1563c0fd 100644
> --- a/arch/powerpc/sysdev/axonram.c
> +++ b/arch/powerpc/sysdev/axonram.c
> @@ -43,6 +43,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -141,14 +142,12 @@ axon_ram_make_request(struct request_queue *queue, 
> struct bio *bio)
>   */
>  static long
>  axon_ram_direct_access(struct block_device *device, sector_t sector,
> -void **kaddr, unsigned long *pfn)
> +__pfn_t *pfn)
>  {
>   struct axon_ram_bank *bank = device->bd_disk->private_data;
>   loff_t offset = (loff_t)sector << AXON_RAM_SECTOR_SHIFT;
>  
> - *kaddr = (void *)(bank->ph_addr + offset);
> - *pfn = virt_to_phys(*kaddr) >> PAGE_SHIFT;
> -
> + *pfn = phys_to_pfn_t(bank->ph_addr + offset, PFN_DEV);
>   return bank->size - offset;
>  }
>  
> @@ -165,9 +164,13 @@ static int axon_ram_probe(struct platform_device *device)
>  {
>   static int axon_ram_bank_id = -1;
>   struct axon_ram_bank *bank;
> - struct resource resource;
> + struct resource *resource;
>   int rc = 0;
>  
> + resource = devm_kzalloc(>dev, sizeof(*resource), GFP_KERNEL);
> + if (!resource)
> + return -ENOMEM;
> +
>   axon_ram_bank_id++;
>  
>   dev_info(>dev, "Found memory controller on %s\n",
> @@ -184,13 +187,13 @@ static int axon_ram_probe(struct platform_device 
> *device)
>  
>   bank->device = device;
>  
> - if (of_address_to_resource(device->dev.of_node, 0, ) != 0) {
> + if (of_address_to_resource(device->dev.of_node, 0, resource) != 0) {
>   dev_err(>dev, "Cannot access device tree\n");
>   rc = -EFAULT;
>   goto failed;
>   }
>  
> - bank->size = resource_size();
> + bank->size = resource_size(resource);
>  
>   if (bank->size == 0) {
>   dev_err(>dev, "No DDR2 memory found for %s%d\n",
> @@ -202,7 +205,7 @@ static int axon_ram_probe(struct platform_device *device)
>   dev_info(>dev, "Register DDR2 memory device %s%d with %luMB\n",
>   AXON_RAM_DEVICE_NAME, axon_ram_bank_id, bank->size >> 
> 20);
>  
> - bank->ph_addr = resource.start;
> + bank->ph_addr = resource->start;
>   bank->io_addr = (unsigned long) ioremap_prot(
>   bank->ph_addr, bank->size, _PAGE_NO_CACHE);
>   if (bank->io_addr == 

Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 06:01 AM, Dan Williams wrote:
 The primary source for non-page-backed page-frames to enter the system
 is via the pmem driver's -direct_access() method.  The pfns returned by
 the top-level bdev_direct_access() may be passed to any other subsystem
 in the kernel and those sub-systems either need to assume that the pfn
 is page backed (CONFIG_DEV_PFN=n) or be prepared to handle non-page
 backed case (CONFIG_DEV_PFN=y).  Currently the pfns returned by
 -direct_access() are only ever used by vm_insert_mixed() which does not
 care if the pfn is mapped.  As we go to add more usages of these pfns
 add the type-safety of __pfn_t.
 
 This also simplifies the calling convention of -direct_access() by not
 returning the virtual address in the same call.  This annotates cases
 where the kernel is directly accessing pmem outside the driver, and
 makes the valid lifetime of the reference explicit.  This property may
 be useful in the future for invalidating mappings to pmem, but for now
 it provides some protection against the pmem disable vs still-in-use
 race.
 
 Note that axon_ram_direct_access and dcssblk_direct_access were
 previously making potentially incorrect assumptions about the addresses
 they passed to virt_to_phys().
 
 [hch: various minor updates]
 Signed-off-by: Christoph Hellwig h...@lst.de
 Signed-off-by: Dan Williams dan.j.willi...@intel.com
 ---
  arch/powerpc/platforms/Kconfig |1 +
  arch/powerpc/sysdev/axonram.c  |   24 
  drivers/block/brd.c|5 +--
  drivers/nvdimm/Kconfig |1 +
  drivers/nvdimm/pmem.c  |   24 +++-
  drivers/s390/block/Kconfig |1 +
  drivers/s390/block/dcssblk.c   |   23 ++--
  fs/Kconfig |1 +
  fs/block_dev.c |4 +-
  fs/dax.c   |   79 
 +---
  include/linux/blkdev.h |7 ++--
  include/linux/mm.h |   12 ++
  12 files changed, 129 insertions(+), 53 deletions(-)
 
 diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
 index b7f9c408bf24..6b1c2f2e5fb4 100644
 --- a/arch/powerpc/platforms/Kconfig
 +++ b/arch/powerpc/platforms/Kconfig
 @@ -307,6 +307,7 @@ config CPM2
  config AXON_RAM
   tristate Axon DDR2 memory device driver
   depends on PPC_IBM_CELL_BLADE  BLOCK
 + select KMAP_PFN
   default m
   help
 It registers one block device per Axon's DDR2 memory bank found
 diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c
 index e8657d3bc588..7c5a1563c0fd 100644
 --- a/arch/powerpc/sysdev/axonram.c
 +++ b/arch/powerpc/sysdev/axonram.c
 @@ -43,6 +43,7 @@
  #include linux/types.h
  #include linux/of_device.h
  #include linux/of_platform.h
 +#include linux/kmap_pfn.h
  
  #include asm/page.h
  #include asm/prom.h
 @@ -141,14 +142,12 @@ axon_ram_make_request(struct request_queue *queue, 
 struct bio *bio)
   */
  static long
  axon_ram_direct_access(struct block_device *device, sector_t sector,
 -void **kaddr, unsigned long *pfn)
 +__pfn_t *pfn)
  {
   struct axon_ram_bank *bank = device-bd_disk-private_data;
   loff_t offset = (loff_t)sector  AXON_RAM_SECTOR_SHIFT;
  
 - *kaddr = (void *)(bank-ph_addr + offset);
 - *pfn = virt_to_phys(*kaddr)  PAGE_SHIFT;
 -
 + *pfn = phys_to_pfn_t(bank-ph_addr + offset, PFN_DEV);
   return bank-size - offset;
  }
  
 @@ -165,9 +164,13 @@ static int axon_ram_probe(struct platform_device *device)
  {
   static int axon_ram_bank_id = -1;
   struct axon_ram_bank *bank;
 - struct resource resource;
 + struct resource *resource;
   int rc = 0;
  
 + resource = devm_kzalloc(device-dev, sizeof(*resource), GFP_KERNEL);
 + if (!resource)
 + return -ENOMEM;
 +
   axon_ram_bank_id++;
  
   dev_info(device-dev, Found memory controller on %s\n,
 @@ -184,13 +187,13 @@ static int axon_ram_probe(struct platform_device 
 *device)
  
   bank-device = device;
  
 - if (of_address_to_resource(device-dev.of_node, 0, resource) != 0) {
 + if (of_address_to_resource(device-dev.of_node, 0, resource) != 0) {
   dev_err(device-dev, Cannot access device tree\n);
   rc = -EFAULT;
   goto failed;
   }
  
 - bank-size = resource_size(resource);
 + bank-size = resource_size(resource);
  
   if (bank-size == 0) {
   dev_err(device-dev, No DDR2 memory found for %s%d\n,
 @@ -202,7 +205,7 @@ static int axon_ram_probe(struct platform_device *device)
   dev_info(device-dev, Register DDR2 memory device %s%d with %luMB\n,
   AXON_RAM_DEVICE_NAME, axon_ram_bank_id, bank-size  
 20);
  
 - bank-ph_addr = resource.start;
 + bank-ph_addr = resource-start;
   bank-io_addr = (unsigned long) ioremap_prot(
   bank-ph_addr, bank-size, _PAGE_NO_CACHE);
   if (bank-io_addr == 

Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-13 Thread Dan Williams
On Wed, Aug 12, 2015 at 11:26 PM, Boaz Harrosh b...@plexistor.com wrote:
 B. Here this all set is a joke. The all pmem disable vs still-in-use 
 argument is mute
 here below you have inserted a live, used for ever, pfn into a process vm 
 without holding
 a map.

Careful, don't confuse unbind with unplug.  Unbind invalidates
the driver's mapping (ioremap) while unplug would invalidate the
pfn.  DAX is indeed broken with respect to unplug and we'll need to go
solve that separately.  I expect unplug support will be needed for
hot provisioning pmem to/from virtual machines.

 The all pmem disable vs still-in-use is a joke. The FS loaded has a 
 reference on the bdev
 and the filehadle has a reference on the FS. So what is exactly this pmem 
 disable you are
 talking about?

Hmm, that's not the same block layer I've been working with for the
past several years:

$ mount /dev/pmem0 /mnt
$ echo namespace0.0  ../drivers/nd_pmem/unbind # succeeds

Unbind always proceeds unconditionally.  See the recent kernel summit
topic discussion around devm vs unbind [1].  While kmap_atomic_pfn_t()
does not implement revoke semantics it at least forces re-validation
and time bounded references.  For the unplug case we'll need to go
shootdown those DAX mappings in userspace so that they return SIGBUS
on access, or something along those lines.

[1]: http://www.spinics.net/lists/kernel/msg2032864.html

 And for god sake. I have a bdev I call bdev_direct_access(sector), the bdev 
 calculated the
 exact address for me (base + sector). Now I get back this __pfn_t and I need 
 to call
 kmap_atomic_pfn_t() which does a loop to search for my range and again 
 base+offset ?

 This all model is broken, sorry?

I think you are confused about the lifetime of the userspace DAX
mapping vs the kernel's mapping and the frequency of calls to
kmap_atomic_pfn_t().  I'm sure you can make this loop look bad with a
micro-benchmark, but the whole point of DAX is to get the kernel out
of the I/O path, so I'm not sure this overhead shows up in any real
way in practice.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 06:21 PM, Dan Williams wrote:
 On Wed, Aug 12, 2015 at 11:26 PM, Boaz Harrosh b...@plexistor.com wrote:

 
 Hmm, that's not the same block layer I've been working with for the
 past several years:
 
 $ mount /dev/pmem0 /mnt
 $ echo namespace0.0  ../drivers/nd_pmem/unbind # succeeds
 
 Unbind always proceeds unconditionally.  See the recent kernel summit
 topic discussion around devm vs unbind [1].  While kmap_atomic_pfn_t()
 does not implement revoke semantics it at least forces re-validation
 and time bounded references.  For the unplug case we'll need to go
 shootdown those DAX mappings in userspace so that they return SIGBUS
 on access, or something along those lines.
 

Then fix unbind to refuse. What is the point of unbind when it trashes
the hot path so badly and makes the code so fat. Who uses it and what for?

First I ever heard of it and I do use Linux a little bit.

 [1]: http://www.spinics.net/lists/kernel/msg2032864.html
 
Hm...

OK I hate it. I would just make sure to override and refuse unbinding with an
elevated ref count. Is not a good reason for me to trash the hotpath.

 And for god sake. I have a bdev I call bdev_direct_access(sector), the bdev 
 calculated the
 exact address for me (base + sector). Now I get back this __pfn_t and I need 
 to call
 kmap_atomic_pfn_t() which does a loop to search for my range and again 
 base+offset ?

 This all model is broken, sorry?
 
 I think you are confused about the lifetime of the userspace DAX
 mapping vs the kernel's mapping and the frequency of calls to
 kmap_atomic_pfn_t().  I'm sure you can make this loop look bad with a
 micro-benchmark, but the whole point of DAX is to get the kernel out
 of the I/O path, so I'm not sure this overhead shows up in any real
 way in practice.

Sigh! It does. very much. 4k random write for you. Will drop in half
if I do this. We've been testing with memory for a long time every
rcu lock counts. A single atomic will drop things by %20

Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-13 Thread Dan Williams
On Thu, Aug 13, 2015 at 9:34 AM, Boaz Harrosh b...@plexistor.com wrote:
 On 08/13/2015 06:21 PM, Dan Williams wrote:
 On Wed, Aug 12, 2015 at 11:26 PM, Boaz Harrosh b...@plexistor.com wrote:
 

 Hmm, that's not the same block layer I've been working with for the
 past several years:

 $ mount /dev/pmem0 /mnt
 $ echo namespace0.0  ../drivers/nd_pmem/unbind # succeeds

 Unbind always proceeds unconditionally.  See the recent kernel summit
 topic discussion around devm vs unbind [1].  While kmap_atomic_pfn_t()
 does not implement revoke semantics it at least forces re-validation
 and time bounded references.  For the unplug case we'll need to go
 shootdown those DAX mappings in userspace so that they return SIGBUS
 on access, or something along those lines.


 Then fix unbind to refuse. What is the point of unbind when it trashes
 the hot path so badly and makes the code so fat.

What? The DAX hot path avoids the kernel entirely.

 Who uses it and what for?

The device driver core.  We simply can't hold off remove indefinitely.
If the administrator wants the device disabled we need to tear down
and revoke active mappings, or at very least guarantee time bounded
removal.

 First I ever heard of it and I do use Linux a little bit.

 [1]: http://www.spinics.net/lists/kernel/msg2032864.html

 Hm...

 OK I hate it. I would just make sure to override and refuse unbinding with an
 elevated ref count. Is not a good reason for me to trash the hotpath.

Again, the current usages are not in hot paths.  If it becomes part of
a hot path *and* shows up in a profile we can look to implement
something with less overhead.  Until then we should plan to honor the
lifetime as defined by -probe() and -remove().

In fact I proposed the same as you, but then changed my mind based on
Tejun's response [1].  So please reconsider this idea to solve the
problem by blocking -remove().  PMEM is new and special, but not
*that* special as to justify breaking basic guarantees.

[1]: https://lkml.org/lkml/2015/7/15/731
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/