Tried on kernel 4.18.20 and this issue is not seen.


[root@localhost ~]# ./test-ibv_reg 10.8.8.133 /dev/dax0.0 3
Creating RDMA event channel.
Creating RDMA communication identifier.
RDMA bind address to 10.8.8.133
RDMA start listen
Register memory region.
Unregister memory region.
Pool unmapped.
Pool handler closed.
Pool closed.
De-allocated PD.
Destroyed RDMA communication identifier.
Destroyed RDMA event channel.

[root@localhost ~]# ndctl create-namespace -fe namespace0.0 -a 4k
{
  "dev":"namespace0.0",
  "mode":"devdax",
  "map":"dev",
  "size":"7.87 GiB (8.45 GB)",
  "uuid":"743ec485-6c77-4323-90ca-5ad864a00e72",
  "daxregion":{
    "id":0,
    "size":"7.87 GiB (8.45 GB)",
    "align":4096,
    "devices":[
      {
        "chardev":"dax0.0",
        "size":"7.87 GiB (8.45 GB)"
      }
    ]
  },
  "numa_node":0
}
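
To confirm from a script that the reconfiguration completed with the requested alignment, the JSON printed by ndctl can be parsed. This is only a minimal sketch, assuming the output above has been captured as text; summarize_namespace is a hypothetical helper name and is not part of ndctl.

```python
import json

# Output copied from the `ndctl create-namespace` run above.
NDCTL_OUTPUT = """
{
  "dev":"namespace0.0",
  "mode":"devdax",
  "map":"dev",
  "size":"7.87 GiB (8.45 GB)",
  "uuid":"743ec485-6c77-4323-90ca-5ad864a00e72",
  "daxregion":{
    "id":0,
    "size":"7.87 GiB (8.45 GB)",
    "align":4096,
    "devices":[
      {
        "chardev":"dax0.0",
        "size":"7.87 GiB (8.45 GB)"
      }
    ]
  },
  "numa_node":0
}
"""


def summarize_namespace(text):
    """Hypothetical helper: pull out the fields relevant to this report."""
    ns = json.loads(text)
    region = ns["daxregion"]
    return {
        "dev": ns["dev"],
        "mode": ns["mode"],
        "align": region["align"],  # alignment in bytes
        "chardevs": [d["chardev"] for d in region["devices"]],
    }


summary = summarize_namespace(NDCTL_OUTPUT)
print(summary)
```

Here the "-a 4k" request shows up as an "align" value of 4096 bytes on the dax region.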



[root@localhost ~]# uname -a
Linux localhost.localdomain 4.18.20 #1 SMP Mon Jun 17 06:43:19 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux





Thanks,

Jacky



-----Original Message-----
From: Jacky Wu
Sent: Monday, June 17, 2019 4:58 PM
To: Yue Li <yue...@memverge.com>; Dan Williams <dan.j.willi...@intel.com>
Cc: Scargall, Steve <steve.scarg...@intel.com>; linux-nvdimm@lists.01.org
Subject: RE: ndctl hangs after memory deregistration



Hi Dan,



I wrote a small program to simulate our use case and tested three cases: no
register/unregister, register only (no unregister), and both register and
unregister. The ndctl command hung in the latter two cases. I'm attaching
the source code for your reference.
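
For repeating the check after each of the three cases, a hang can be detected with a timeout instead of waiting by hand. The following is a minimal sketch and is not the attached test program; finishes_within and the timeout values are hypothetical choices for illustration.

```python
import subprocess
import sys


def finishes_within(cmd, timeout_s):
    """Return cmd's exit code if it finishes within timeout_s seconds, else None."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked in an uninterruptible kernel
        # wait may not exit after the kill, so never wait on it unbounded.
        proc.kill()
        try:
            proc.wait(timeout=1)
        except subprocess.TimeoutExpired:
            pass
        return None


# Self-check with a harmless stand-in command (the Python interpreter itself).
quick = finishes_within([sys.executable, "-c", "pass"], timeout_s=10)
print("stand-in command exit code:", quick)
```

On the test machine, the call would look something like finishes_within(["ndctl", "create-namespace", "-fe", "namespace0.0", "-a", "4k"], 60), with None indicating that ndctl did not return in time. Popen is used directly rather than subprocess.run because run waits for the child again after killing it, which could block if the child cannot be killed.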



I will try using the latest kernel next.



Thanks,

Jacky



-----Original Message-----
From: Yue Li <yue...@memverge.com>
Sent: Friday, June 14, 2019 7:10 AM
To: Dan Williams <dan.j.willi...@intel.com>
Cc: Scargall, Steve <steve.scarg...@intel.com>; Jacky Wu <jacky...@memverge.com>; linux-nvdimm@lists.01.org
Subject: Re: ndctl hangs after memory deregistration



Thanks Dan for the reply!



On 6/14/19, 3:06 AM, "Dan Williams" <dan.j.willi...@intel.com> wrote:



    On Wed, Jun 12, 2019 at 9:08 PM Yue Li <yue...@memverge.com> wrote:

    > hi Dan and Steve,



    Hi,



    I just happened to see this by luck, please use my Intel address, and
    copy the libnvdimm mailing list on issues like this
    (linux-nvdimm@lists.01.org).



OK.



    > We recently ran into a strange issue where ndctl command hangs on dev dax
    > after our software uses it.



    The last thing that device-dax teardown does is wait for any pinned
    pages to be released before allowing the exit to proceed.



OK.



    > Inside our application, we basically will first RDMA register the whole
    > device, then deregister, and exit.



    Is this just using simple ibverbs to unregister, or something specific
    to this driver?



    There was a fix upstream that addressed cases where device teardown
    proceeded when it shouldn't, but the sequence you describe is the
    opposite: the page pins should be torn down before the device
    reconfiguration.



    > However, if we remove the registration and deregistration code, ndctl
    > works correctly without hanging. The problem occurs on both DRAM-emulated
    > dax and real PMEM-backed dax.
    >
    > Here is our system information:
    >
    > CentOS 7.6
    > Vanilla kernel 3.10.0-957.el7.x86_64



    Are you familiar with rebuilding the kernel? I'd ask you to try to
    reproduce with the latest development kernel that includes these
    fixes:

    4422ee8476f0 mm/devm_memremap_pages: fix final page put race
    771f0714d0dc PCI/P2PDMA: track pgmap references per resource, not globally
    af37085de906 lib/genalloc: introduce chunk owners
    e0047ff8aa77 PCI/P2PDMA: fix the gen_pool_add_virt() failure path
    0315d47d6ae9 mm/devm_memremap_pages: introduce devm_memunmap_pages
    216475c7eaa8 drivers/base/devres: introduce devm_release_action()



    ...but it sounds like you may be hitting a different issue.



Thanks for the suggestion. We will download the upstream kernel and try
again, and will post the results soon.



Best,



Yue








_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
