Re: [PATCH] Install Notify() handler before getting NFIT table

2023-10-20 Thread Ira Weiny
Rafael J. Wysocki wrote:
> On Thu, Oct 19, 2023 at 2:57 PM chenxiang  wrote:
> >
> > From: Xiang Chen 
> >
> > If there is no NFIT at startup, acpi_nfit_add() returns 0 immediately
> > and does not install the Notify() handler. If an nvdimm device is
> > hotplugged later, it will not be identified because there is no Notify()
> > handler.
> 
> Yes, this is a change in behavior that shouldn't have been made.
> 
> > So install the handler before getting the NFIT table in function
> > acpi_nfit_add() to avoid the above issue.
> 
> And the fix is correct if I'm not mistaken.
> 
> I can still queue it up for 6.6 if that's fine with everyone.  Dan?

That is fine with me.  Vishal, Dave Jiang, and I are wrangling the nvdimm
tree these days.  I've prepared 6.7 already so I'll ignore this.

Ira

> 
> > Fixes: dcca12ab62a2 ("ACPI: NFIT: Install Notify() handler directly")
> > Signed-off-by: Xiang Chen 
> > ---
> >  drivers/acpi/nfit/core.c | 22 +++---
> >  1 file changed, 11 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
> > index 3826f49..9923855 100644
> > --- a/drivers/acpi/nfit/core.c
> > +++ b/drivers/acpi/nfit/core.c
> > @@ -3339,6 +3339,16 @@ static int acpi_nfit_add(struct acpi_device *adev)
> >  	acpi_size sz;
> >  	int rc = 0;
> >
> > +	rc = acpi_dev_install_notify_handler(adev, ACPI_DEVICE_NOTIFY,
> > +					     acpi_nfit_notify, adev);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = devm_add_action_or_reset(dev, acpi_nfit_remove_notify_handler,
> > +					adev);
> > +	if (rc)
> > +		return rc;
> > +
> >  	status = acpi_get_table(ACPI_SIG_NFIT, 0, &tbl);
> >  	if (ACPI_FAILURE(status)) {
> >  		/* The NVDIMM root device allows OS to trigger enumeration of
> > @@ -3386,17 +3396,7 @@ static int acpi_nfit_add(struct acpi_device *adev)
> >  	if (rc)
> >  		return rc;
> >
> > -	rc = devm_add_action_or_reset(dev, acpi_nfit_shutdown, acpi_desc);
> > -	if (rc)
> > -		return rc;
> > -
> > -	rc = acpi_dev_install_notify_handler(adev, ACPI_DEVICE_NOTIFY,
> > -					     acpi_nfit_notify, adev);
> > -	if (rc)
> > -		return rc;
> > -
> > -	return devm_add_action_or_reset(dev, acpi_nfit_remove_notify_handler,
> > -					adev);
> > +	return devm_add_action_or_reset(dev, acpi_nfit_shutdown, acpi_desc);
> >  }
> >
> >  static void acpi_nfit_update_notify(struct device *dev, acpi_handle handle)
> > --
> 
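
The ordering problem described in the commit message can be sketched in plain userspace C. This is only an illustration, not the kernel code: struct mock_dev, mock_install_handler(), the two probe variants, and mock_hotplug() are all hypothetical names, and the "table" is reduced to a boolean. It shows why returning early on a missing table before the handler is installed makes a later hot-plug notification go unseen.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; NOT the kernel's struct acpi_device. */
struct mock_dev {
	bool has_table;          /* is a table present at probe time? */
	bool handler_installed;  /* has a notify handler been registered? */
	int notifications_seen;  /* hot-plug events that reached the handler */
};

/* Hypothetical helper: registering the handler always succeeds here. */
static int mock_install_handler(struct mock_dev *d)
{
	d->handler_installed = true;
	return 0;
}

/* Ordering before the fix: the early return skips handler installation. */
static int probe_table_first(struct mock_dev *d)
{
	if (!d->has_table)
		return 0;
	return mock_install_handler(d);
}

/* Ordering after the fix: the handler is installed unconditionally first. */
static int probe_handler_first(struct mock_dev *d)
{
	int rc = mock_install_handler(d);

	if (rc)
		return rc;
	if (!d->has_table)
		return 0;
	/* table parsing would happen here */
	return 0;
}

/* A hot-plug event is delivered only if a handler is installed. */
static bool mock_hotplug(struct mock_dev *d)
{
	if (!d->handler_installed)
		return false;
	d->has_table = true;
	d->notifications_seen++;
	return true;
}
```

With no table at probe time, probe_table_first() leaves the mock device without a handler, so mock_hotplug() reports the event as lost; probe_handler_first() lets the same event through.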





Re: [PATCH v15] mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind

2023-10-20 Thread Darrick J. Wong
On Fri, Oct 20, 2023 at 03:26:32PM +0530, Chandan Babu R wrote:
> On Thu, Sep 28, 2023 at 06:32:27 PM +0800, Shiyang Ruan wrote:
> > 
> > Changes since v14:
> >  1. added/fixed code comments per Dan's comments
> > 
> >
> > > Now, if we suddenly remove a PMEM device (by calling unbind) which
> > contains FSDAX while programs are still accessing data in this device,
> > e.g.:
> > ```
> >  $FSSTRESS_PROG -d $SCRATCH_MNT -n 9 -p 4 &
> >  # $FSX_PROG -N 100 -o 8192 -l 50 $SCRATCH_MNT/t001 &
> >  echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
> > ```
> > it could come into an unacceptable state:
> > >   1. device has gone but mount point still exists, and umount will fail
> > >      with "target is busy"
> >   2. programs will hang and cannot be killed
> >   3. may crash with NULL pointer dereference
> >
> > > To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let the filesystem
> > > know that we are going to remove the whole device, and make sure all
> > > related processes are notified so that they can exit gracefully.
> >
> > This patch is inspired by Dan's "mm, dax, pmem: Introduce
> > dev_pagemap_failure()"[1].  With the help of dax_holder and
> > ->notify_failure() mechanism, the pmem driver is able to ask filesystem
> > on it to unmap all files in use, and notify processes who are using
> > those files.
> >
> > Call trace:
> > trigger unbind
> >  -> unbind_store()
> >   -> ... (skip)
> > >    -> devres_release_all()
> > >     -> kill_dax()
> > >      -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
> > >       -> xfs_dax_notify_failure()
> > >       `-> freeze_super()             // freeze (kernel call)
> > >       `-> do xfs rmap
> > >       ` -> mf_dax_kill_procs()
> > >       `  -> collect_procs_fsdax()    // all associated processes
> > >       `  -> unmap_and_kill()
> > >       ` -> invalidate_inode_pages2_range() // drop file's cache
> > >       `-> thaw_super()               // thaw (both kernel & user call)
> >
> > Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove
> > event.  Use the exclusive freeze/thaw[2] to lock the filesystem to prevent
> > new dax mapping from being created.  Do not shutdown filesystem directly
> > if configuration is not supported, or if failure range includes metadata
> > > area.  Make sure all files and processes (not only the current process)
> > are handled correctly.  Also drop the cache of associated files before
> > pmem is removed.
> >
> > [1]: 
> > https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.st...@dwillia2-desk3.amr.corp.intel.com/
> > [2]: 
> > https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/
> >
> > Signed-off-by: Shiyang Ruan 
> > Reviewed-by: Darrick J. Wong 
> > Acked-by: Dan Williams 
> 
> Hi Andrew,
> 
> Shiyang had indicated that this patch has been added to
> akpm/mm-hotfixes-unstable branch. However, I don't see the patch listed in
> that branch.
> 
> I am about to start collecting XFS patches for v6.7 cycle. Please let me know
> if you have any objections with me taking this patch via the XFS tree.

V15 was dropped from his tree on 28 Sept., you might as well pull it
into your own tree for 6.7.  It's been testing fine on my trees for the
past 3 weeks.

https://lore.kernel.org/mm-commits/20230928172815.ee6afc43...@smtp.kernel.org/

--D

> 
> -- 
> Chandan
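
For readers following the call trace quoted above, the freeze / notify / invalidate / thaw sequence can be sketched as a small userspace C model. This is not the XFS implementation: struct mock_fs, mock_notify_failure(), the step log, and the MOCK_MF_MEM_PRE_REMOVE value are all hypothetical and exist only to show the order of operations that the commit message describes for a pre-remove event.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag value, for illustration only. */
#define MOCK_MF_MEM_PRE_REMOVE 0x1

/* Hypothetical filesystem state; NOT a kernel structure. */
struct mock_fs {
	bool frozen;         /* true while new mappings are blocked */
	int mapped_files;    /* files with live DAX mappings */
	int procs_notified;  /* processes told about the removal */
	char log[128];       /* ordered record of the steps taken */
};

/* Append one step name to the log, truncating if the buffer is full. */
static void step(struct mock_fs *fs, const char *name)
{
	strncat(fs->log, name, sizeof(fs->log) - strlen(fs->log) - 1);
	strncat(fs->log, ";", sizeof(fs->log) - strlen(fs->log) - 1);
}

/* Model of the notify path: freeze, notify, drop cache, thaw. */
static int mock_notify_failure(struct mock_fs *fs, int flags)
{
	bool pre_remove = (flags & MOCK_MF_MEM_PRE_REMOVE) != 0;

	if (pre_remove) {
		/* block creation of new mappings while we walk the old ones */
		fs->frozen = true;
		step(fs, "freeze");
	}

	/* one process per mapped file is assumed, purely for the model */
	fs->procs_notified += fs->mapped_files;
	step(fs, "kill_procs");

	if (pre_remove) {
		/* drop cached mappings before the device goes away */
		fs->mapped_files = 0;
		step(fs, "invalidate");

		fs->frozen = false;
		step(fs, "thaw");
	}
	return 0;
}
```

Calling mock_notify_failure() with MOCK_MF_MEM_PRE_REMOVE on a mock filesystem that has mapped files leaves it thawed, with no remaining mappings, and with the steps logged in the same order as the trace.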



Re: [PATCH v15] mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind

2023-10-20 Thread Chandan Babu R
On Thu, Sep 28, 2023 at 06:32:27 PM +0800, Shiyang Ruan wrote:
> 
> Changes since v14:
>  1. added/fixed code comments per Dan's comments
> 
>
> Now, if we suddenly remove a PMEM device (by calling unbind) which
> contains FSDAX while programs are still accessing data in this device,
> e.g.:
> ```
>  $FSSTRESS_PROG -d $SCRATCH_MNT -n 9 -p 4 &
>  # $FSX_PROG -N 100 -o 8192 -l 50 $SCRATCH_MNT/t001 &
>  echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
> ```
> it could come into an unacceptable state:
>   1. device has gone but mount point still exists, and umount will fail
>      with "target is busy"
>   2. programs will hang and cannot be killed
>   3. may crash with NULL pointer dereference
>
> To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let the filesystem
> know that we are going to remove the whole device, and make sure all
> related processes are notified so that they can exit gracefully.
>
> This patch is inspired by Dan's "mm, dax, pmem: Introduce
> dev_pagemap_failure()"[1].  With the help of dax_holder and
> ->notify_failure() mechanism, the pmem driver is able to ask filesystem
> on it to unmap all files in use, and notify processes who are using
> those files.
>
> Call trace:
> trigger unbind
>  -> unbind_store()
>   -> ... (skip)
>    -> devres_release_all()
>     -> kill_dax()
>      -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
>       -> xfs_dax_notify_failure()
>       `-> freeze_super()             // freeze (kernel call)
>       `-> do xfs rmap
>       ` -> mf_dax_kill_procs()
>       `  -> collect_procs_fsdax()    // all associated processes
>       `  -> unmap_and_kill()
>       ` -> invalidate_inode_pages2_range() // drop file's cache
>       `-> thaw_super()               // thaw (both kernel & user call)
>
> Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove
> event.  Use the exclusive freeze/thaw[2] to lock the filesystem to prevent
> new dax mapping from being created.  Do not shutdown filesystem directly
> if configuration is not supported, or if failure range includes metadata
> area.  Make sure all files and processes (not only the current process)
> are handled correctly.  Also drop the cache of associated files before
> pmem is removed.
>
> [1]: 
> https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.st...@dwillia2-desk3.amr.corp.intel.com/
> [2]: 
> https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/
>
> Signed-off-by: Shiyang Ruan 
> Reviewed-by: Darrick J. Wong 
> Acked-by: Dan Williams 

Hi Andrew,

Shiyang had indicated that this patch has been added to
akpm/mm-hotfixes-unstable branch. However, I don't see the patch listed in
that branch.

I am about to start collecting XFS patches for v6.7 cycle. Please let me know
if you have any objections with me taking this patch via the XFS tree.

-- 
Chandan