Re: [PATCH v2] mm: alloc_pages_bulk: remove assumption of populating only NULL elements

2025-03-07 Thread Dave Chinner
On Tue, Mar 04, 2025 at 08:09:35PM +0800, Yunsheng Lin wrote: > On 2025/3/4 16:18, Dave Chinner wrote: > > ... > > > > >> > >> 1. > >> https://lore.kernel.org/all/bd8c2f5c-464d-44ab-b607-390a87ea4...@huawei.com/ > >> 2. > >>

Re: [PATCH v2] mm: alloc_pages_bulk: remove assumption of populating only NULL elements

2025-03-04 Thread Dave Chinner
defragmenting code there? > > 1. > https://lore.kernel.org/all/bd8c2f5c-464d-44ab-b607-390a87ea4...@huawei.com/ > 2. > https://lore.kernel.org/all/20250212092552.1779679-1-linyunsh...@huawei.com/ > CC: Jesper Dangaard Brouer > CC: Luiz Capitulino > CC: Mel Gorman > CC:

Re: [RFC] mm: alloc_pages_bulk: remove assumption of populating only NULL elements

2025-02-18 Thread Dave Chinner
On Tue, Feb 18, 2025 at 05:21:27PM +0800, Yunsheng Lin wrote: > On 2025/2/18 5:31, Dave Chinner wrote: > > ... > > > . > > > >> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > >> index 15bb790359f8..9e1ce0ab9c35 100644 > >>

Re: [RFC] mm: alloc_pages_bulk: remove assumption of populating only NULL elements

2025-02-17 Thread Dave Chinner
to add to their allocator wrapper. IOWs, you just demonstrated why the existing API is more desirable than a highly constrained, slightly faster API that requires callers to get every detail right. i.e. it's hard to get it wrong with the existing API, yet it's so easy to make mistakes with the proposed API that the patch proposing the change has serious bugs in it. -Dave. -- Dave Chinner da...@fromorbit.com
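For readers who have not seen the API being argued over, here is a minimal userspace sketch of the "populate only NULL elements" contract named in the subject line. It is not the kernel's alloc_pages_bulk(): bulk_fill() is a hypothetical name and malloc() stands in for the page allocator. It only illustrates why callers find the contract hard to misuse: a partially filled array can simply be passed back in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a bulk allocator: fill only the NULL slots of
 * the array and leave populated slots untouched. Returns how many slots
 * are populated on return, so a caller can retry until it equals nr.
 */
static size_t bulk_fill(void **slots, size_t nr, size_t obj_size)
{
	size_t populated = 0;

	assert(slots != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (slots[i] == NULL)
			slots[i] = malloc(obj_size); /* may fail, slot stays NULL */
		if (slots[i] != NULL)
			populated++;
	}
	return populated;
}
```

A caller that got back fewer than nr populated slots can call bulk_fill() again with the same array; the slots that were already filled are never overwritten or leaked.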

Re: [PATCH v7 00/27] DCD: Add support for Dynamic Capacity Devices (DCD)

2024-11-08 Thread Dave Jiang
On 11/7/24 1:58 PM, Ira Weiny wrote: > A git tree of this series can be found here: > > https://github.com/weiny2/linux-kernel/tree/dcd-v4-2024-11-07 > > This is a quick spin with minor clean ups Dave was going to apply as > well as a couple of clean ups I had slated f

Re: [PATCH v5 00/27] DCD: Add support for Dynamic Capacity Devices (DCD)

2024-10-31 Thread Dave Jiang
t one build bot reported > issue all looks good to me. Nice work Ira, Navneet etc. > > Maybe optimistic to hit 6.13, but I'd love it if it did. > If not, Dave, how about shaving a few off the front so at least > there is less to remember for v6 onwards :) I'd like to take it for 6.13. Just seeing if Dan has any last minute complaints :) We should be able to take 1-6 at least. > > Jonathan

Re: [PATCH v3 11/25] cxl/mem: Expose DCD partition capabilities in sysfs

2024-08-23 Thread Dave Jiang
On 8/22/24 7:28 PM, Ira Weiny wrote: > Dave Jiang wrote: >> >> >> On 8/16/24 7:44 AM, ira.we...@intel.com wrote: >>> From: Navneet Singh >>> >>> To properly configure CXL regions on Dynamic Capacity Devices (DCD), >>> user space wil

Re: [PATCH v3 24/25] tools/testing/cxl: Make event logs dynamic

2024-08-20 Thread Dave Jiang
e concurrency required when user space is allowed to > control DCD extents > > Signed-off-by: Ira Weiny Reviewed-by: Dave Jiang > > --- > Changes: > [iweiny: rebase] > --- > tools/testing/cxl/test/mem.c | 278 > ++- > 1

Re: [PATCH v3 23/25] cxl/mem: Trace Dynamic capacity Event Record

2024-08-20 Thread Dave Jiang
ed-off-by: Navneet Singh > Signed-off-by: Ira Weiny Reviewed-by: Dave Jiang small nit below > > --- > Changes: > [Alison: Update commit message] > --- > drivers/cxl/core/mbox.c | 4 +++ > drivers/cxl/core/trace.h | 65 >

Re: [PATCH v3 22/25] cxl/region: Read existing extents on region creation

2024-08-19 Thread Dave Jiang
On 8/16/24 7:44 AM, ira.we...@intel.com wrote: > From: Navneet Singh > > Dynamic capacity device extents may be left in an accepted state on a > device due to an unexpected host crash. In this case it is expected > that the creation of a new region on top of a DC partition can read > those ex

Re: [PATCH v3 21/25] dax/region: Create resources on sparse DAX regions

2024-08-19 Thread Dave Jiang
On 8/16/24 7:44 AM, ira.we...@intel.com wrote: > From: Navneet Singh > > DAX regions which map dynamic capacity partitions require that memory be > allowed to come and go. Recall sparse regions were created for this > purpose. Now that extents can be realized within DAX regions the DAX > reg

Re: [PATCH v3 20/25] dax/bus: Factor out dev dax resize logic

2024-08-19 Thread Dave Jiang
e tree > which is less complicated for finding space. > > In preparation for this change, factor out the dev_dax_resize logic. > For static regions use dax_region->res as the parent to find space for > the dax ranges. Future patches will use the same algorithm with > individual exte

Re: [PATCH v3 19/25] cxl/region/extent: Expose region extent information in sysfs

2024-08-19 Thread Dave Jiang
On 8/16/24 7:44 AM, ira.we...@intel.com wrote: > From: Navneet Singh > > Extent information can be helpful to the user to coordinate memory usage > with the external orchestrator and FM. > > Expose the details of region extents by creating the following > sysfs entries. > > /sys/bus/

Re: [PATCH v3 18/25] cxl/extent: Process DCD events and realize region extents

2024-08-19 Thread Dave Jiang
Process DCD events and create region devices. > > Signed-off-by: Navneet Singh > Co-developed-by: Ira Weiny > Signed-off-by: Ira Weiny A few nits below, but in general Reviewed-by: Dave Jiang > > --- > Changes: > [iweiny: combine this with the extent surface patches to be

Re: [PATCH v3 17/25] cxl/core: Return endpoint decoder information from region search

2024-08-19 Thread Dave Jiang
addition, well behaved > extents should be contained within an endpoint decoder. > > Return the endpoint decoder found to be used in subsequent DCD code. > > Signed-off-by: Ira Weiny Reviewed-by: Dave Jiang > --- > drivers/cxl/core/core.h | 6 -- > drivers/cxl/c

Re: [PATCH v3 16/25] cxl/mem: Configure dynamic capacity interrupts

2024-08-16 Thread Dave Jiang
by the FW if FW first > has been selected by the BIOS. > > Signed-off-by: Navneet Singh > Co-developed-by: Ira Weiny > Signed-off-by: Ira Weiny Reviewed-by: Dave Jiang > > --- > Changes: > [iweiny: update commit message] > [iweiny: rebase to upstream irq code]

Re: [PATCH v3 14/25] cxl/events: Split event msgnum configuration from irq setup

2024-08-16 Thread Dave Jiang
onfiguration. > > Split cxl_event_config_msgnums() from irq setup in preparation for > separate DCD interrupts configuration. > > Signed-off-by: Ira Weiny Reviewed-by: Dave Jiang > --- > drivers/cxl/pci.c | 24 > 1 file changed, 12 insertions(+), 12 de

Re: [PATCH v3 13/25] cxl/region: Add sparse DAX region support

2024-08-16 Thread Dave Jiang
ax devices. There is no > known use case for range mapping on sparse regions. Avoid the > complication by preventing range mapping of dax devices on sparse > regions. > > Interleaving is deferred for now. Add checks. > > Signed-off-by: Navneet Singh > Co-d

Re: [PATCH v3 12/25] cxl/region: Refactor common create region code

2024-08-16 Thread Dave Jiang
ion_store() and create_ram_region_store() to use > a single common function to be used in subsequent DC code. > > Suggested-by: Fan Ni > Signed-off-by: Ira Weiny Reviewed-by: Dave Jiang > --- > drivers/cxl/core/region.c | 28 +++- > 1 file changed, 11 inse

Re: [PATCH v3 11/25] cxl/mem: Expose DCD partition capabilities in sysfs

2024-08-16 Thread Dave Jiang
On 8/16/24 7:44 AM, ira.we...@intel.com wrote: > From: Navneet Singh > > To properly configure CXL regions on Dynamic Capacity Devices (DCD), > user space will need to know the details of the DC partitions available. > > Expose dynamic capacity capabilities through sysfs. > > Signed-off-by:

Re: [PATCH v3 10/25] cxl/port: Add endpoint decoder DC mode support to sysfs

2024-08-16 Thread Dave Jiang
> Signed-off-by: Navneet Singh > Co-developed-by: Ira Weiny > Signed-off-by: Ira Weiny Reviewed-by: Dave Jiang > > --- > Changes: > [Fan: change mode range logic] > [Fan: use !resource_size()] > [djiang: use the static mode name string array in mode_store()] >

Re: [PATCH v3 09/25] cxl/hdm: Add dynamic capacity size support to endpoint decoders

2024-08-16 Thread Dave Jiang
On 8/16/24 7:44 AM, ira.we...@intel.com wrote: > From: Navneet Singh > > To support Dynamic Capacity Devices (DCD) endpoint decoders will need to > map DC partitions (regions). In addition to assigning the size of the > DC partition, the decoder must assign any skip value from the previous >

Re: [PATCH v3 08/25] cxl/region: Add dynamic capacity decoder and region modes

2024-08-16 Thread Dave Jiang
-by: Navneet Singh > Co-developed-by: Ira Weiny > Signed-off-by: Ira Weiny Reviewed-by: Dave Jiang > > --- > Changes: > [iweiny: keep tags on simple patch] > [Fan: s/partitions/partition/] > [djiang: New wording for the commit message] > [iweiny: reword commit mess

Re: [PATCH v3 07/25] cxl/core: Separate region mode from decoder mode

2024-08-16 Thread Dave Jiang
just the code to process these modes > independently. > > There is no equal to decoder mode dead in region modes. Avoid > constructing regions with decoders which have been flagged as dead. > > Suggested-by: Jonathan Cameron > Signed-off-by: Navneet Singh > Co-developed-by: Ir

Re: [PATCH v3 06/25] cxl/mem: Read dynamic capacity configuration from the device

2024-08-16 Thread Dave Jiang
On 8/16/24 7:44 AM, ira.we...@intel.com wrote: > From: Navneet Singh > > Devices which optionally support Dynamic Capacity (DC) are configured > via mailbox commands. CXL 3.1 requires the host to issue the Get DC > Configuration command in order to properly configure DCDs. Without the > Get

Re: [PATCH v3 03/25] dax: Document dax dev range tuple

2024-08-16 Thread Dave Jiang
Signed-off-by: Ira Weiny Reviewed-by: Dave Jiang > > --- > Changes: > [iweiny: move to start of series] > --- > drivers/dax/dax-private.h | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.

Re: parent transid verify failed / ERROR: could not setup extent tree

2021-03-22 Thread Dave T
On Mon, Mar 22, 2021 at 3:49 PM Chris Murphy wrote: > > On Mon, Mar 22, 2021 at 12:32 AM Dave T wrote: > > > > On Sun, Mar 21, 2021 at 2:03 PM Chris Murphy > > wrote: > > > > > > On Sat, Mar 20, 2021 at 11:54 PM Dave T wrote: > > > >

Re: parent transid verify failed / ERROR: could not setup extent tree

2021-03-21 Thread Dave T
On Sun, Mar 21, 2021 at 2:03 PM Chris Murphy wrote: > > On Sat, Mar 20, 2021 at 11:54 PM Dave T wrote: > > > > # btrfs check -r 2853787942912 /dev/mapper/xyz > > Opening filesystem to check... > > parent transid verify failed on 2853787942912 wanted 29436 found 2

RE: parent transid verify failed / ERROR: could not setup extent tree

2021-03-20 Thread Dave T
On Sat, Mar 20, 2021 at 10:04 PM Chris Murphy wrote: > > On Sat, Mar 20, 2021 at 5:15 AM Dave T wrote: > > > > I hope to get some expert advice before I proceed. I don't want to > > make things worse. Here's my situation now: > > > > This p

parent transid verify failed / ERROR: could not setup extent tree

2021-03-20 Thread Dave T
I hope to get some expert advice before I proceed. I don't want to make things worse. Here's my situation now: This problem is with an external USB drive and it is encrypted. cryptsetup open succeeds. But mount fails. mount /backup mount: /backup: wrong fs type, bad option, bad superblo

Re: parent transid verify failed / ERROR: could not setup extent tree

2021-03-20 Thread Dave T
On Sat, Mar 20, 2021 at 2:33 AM Dave T wrote: > > I hope to get some expert advice before I proceed. I don't want to > make things worse. Here's my situation now: > > This problem is with an external USB drive and it is encrypted. > cryptsetup open succeeds. But

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-03-02 Thread Dave Chinner
ck and > > pmem subsystems so that they can take notifications and reverse-map them > > through the storage stack until they reach an fs superblock. > > I'm chuckling because this "reverse map all the way up the block > layer" is the opposite of what Dave said at t

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-03-02 Thread Dave Chinner
On Mon, Mar 01, 2021 at 07:33:28PM -0800, Dan Williams wrote: > On Mon, Mar 1, 2021 at 6:42 PM Dave Chinner wrote: > [..] > > We do not need a DAX specific mechanism to tell us "DAX device > > gone", we need a generic block device interface that tells us "

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-03-02 Thread Dave Chinner
On Mon, Mar 01, 2021 at 04:32:36PM -0800, Dan Williams wrote: > On Mon, Mar 1, 2021 at 2:47 PM Dave Chinner wrote: > > Now we have the filesystem people providing a mechanism for the pmem > > devices to tell the filesystems about physical device failures so > > they can

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-03-01 Thread Dave Chinner
On Mon, Mar 01, 2021 at 12:55:53PM -0800, Dan Williams wrote: > On Sun, Feb 28, 2021 at 2:39 PM Dave Chinner wrote: > > > > On Sat, Feb 27, 2021 at 03:40:24PM -0800, Dan Williams wrote: > > > On Sat, Feb 27, 2021 at 2:36 PM Dave Chinner wrote: > > > > On F

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-02-28 Thread Dave Chinner
On Sat, Feb 27, 2021 at 03:40:24PM -0800, Dan Williams wrote: > On Sat, Feb 27, 2021 at 2:36 PM Dave Chinner wrote: > > On Fri, Feb 26, 2021 at 02:41:34PM -0800, Dan Williams wrote: > > > On Fri, Feb 26, 2021 at 1:28 PM Dave Chinner wrote: > > > > On Fri, Feb 26,

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-02-27 Thread Dave Chinner
On Fri, Feb 26, 2021 at 02:41:34PM -0800, Dan Williams wrote: > On Fri, Feb 26, 2021 at 1:28 PM Dave Chinner wrote: > > On Fri, Feb 26, 2021 at 12:59:53PM -0800, Dan Williams wrote: > > > On Fri, Feb 26, 2021 at 12:51 PM Dave Chinner wrote: > > > > > My imm

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-02-26 Thread Dave Chinner
On Fri, Feb 26, 2021 at 12:59:53PM -0800, Dan Williams wrote: > On Fri, Feb 26, 2021 at 12:51 PM Dave Chinner wrote: > > > > On Fri, Feb 26, 2021 at 11:24:53AM -0800, Dan Williams wrote: > > > On Fri, Feb 26, 2021 at 11:05 AM Darrick J. Wong > > > wrote: >

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-02-26 Thread Dave Chinner
hen when userspace tries to access the mapped DAX pages we get a new page fault. In processing the fault, the filesystem will try to get direct access to the pmem from the block device. This will get an ENODEV error from the block device because the backing store (pmem) has been unplugged and is no longer there... AFAICT, as long as pmem removal invalidates all the active ptes that point at the pmem being removed, the filesystem doesn't need to care about device removal at all, DAX or no DAX... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: page->index limitation on 32bit system?

2021-02-21 Thread Dave Chinner
fix it in mainline that I know of. > As I said, some vendors have tried to fix it in their NAS products, > but I don't know where to find that patch any more. It's not supportable from a disaster recovery perspective. I recently saw a 14TB filesystem with billions of hardlinks in it require 240GB of RAM to run xfs_repair. We just can't support large filesystems on 32 bit systems, and it has nothing to do with simple stuff like page cache index sizes... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: page->index limitation on 32bit system?

2021-02-21 Thread Dave Chinner
fset for such systems to 16TB so sparse files can't be larger than what the kernel supports. See xfs_sb_validate_fsb_count() call and the file offset checks against MAX_LFS_FILESIZE in xfs_fs_fill_super()... FWIW, XFS has been doing this for roughly 20 years now - >16TB on 32 bit machines w
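The 16TB figure can be reproduced with a line of arithmetic. The sketch below assumes a 32 bit page cache index and 4KiB pages (a page shift of 12); neither number is stated in the snippet above, and max_indexable_bytes() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bytes addressable by a page index of index_bits bits when each
 * page is (1 << page_shift) bytes. Assumed inputs for the 32 bit case:
 * index_bits = 32, page_shift = 12 (4KiB pages).
 */
static uint64_t max_indexable_bytes(unsigned index_bits, unsigned page_shift)
{
	assert(index_bits > 0 && index_bits + page_shift < 64);
	return (UINT64_C(1) << index_bits) << page_shift;
}
```

With those assumed inputs the result is 2^32 pages * 2^12 bytes = 2^44 bytes, i.e. 16TiB, which matches the limit quoted in the message.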

Re: Unexpected reflink/subvol snapshot behaviour

2021-02-01 Thread Dave Chinner
On Mon, Feb 01, 2021 at 06:14:21PM -0800, Darrick J. Wong wrote: > On Fri, Jan 22, 2021 at 09:20:51AM +1100, Dave Chinner wrote: > > Hi btrfs-gurus, > > > > I'm running a simple reflink/snapshot/COW scalability test at the > > moment. It is just a loop that d

Re: Unexpected reflink/subvol snapshot behaviour

2021-02-01 Thread Dave Chinner
On Fri, Jan 29, 2021 at 06:25:50PM -0500, Zygo Blaxell wrote: > On Mon, Jan 25, 2021 at 09:36:55AM +1100, Dave Chinner wrote: > > On Sat, Jan 23, 2021 at 04:42:33PM +0800, Qu Wenruo wrote: > > > > > > > > > On 2021/1/22 6:20 AM, Dave Chinner wrote: > >

Re: Unexpected reflink/subvol snapshot behaviour

2021-01-24 Thread Dave Chinner
On Sat, Jan 23, 2021 at 04:42:33PM +0800, Qu Wenruo wrote: > > > On 2021/1/22 6:20 AM, Dave Chinner wrote: > > Hi btrfs-gurus, > > > > I'm running a simple reflink/snapshot/COW scalability test at the > > moment. It is just a loop that does "fio overwri

Re: Unexpected reflink/subvol snapshot behaviour

2021-01-24 Thread Dave Chinner
On Sat, Jan 23, 2021 at 07:19:03PM -0500, Zygo Blaxell wrote: > On Fri, Jan 22, 2021 at 09:20:51AM +1100, Dave Chinner wrote: > > Hi btrfs-gurus, > > > > I'm running a simple reflink/snapshot/COW scalability test at the > > moment. It is just a loop that d

Unexpected reflink/subvol snapshot behaviour

2021-01-21 Thread Dave Chinner
workload, I suspect the issues I note above are btrfs issues, not expected behaviour. I'm not sure what the expected scalability of btrfs file clones and snapshots are though, so I'm interested to hear if these results are expected or not. Cheers, Dave. -- Dave Chinner da...@fromorbit.com JOBS=4 IODEPTH=4 IOCOUNT=$((1 / $JOBS)) FILESIZE=4g cat >$fio_config <

Re: [PATCH 1/2] iomap: Separate out generic_write_sync() from iomap_dio_complete()

2020-12-15 Thread Dave Chinner
AP_DIO_NEED_SYNC)) > - ret = generic_write_sync(iocb, ret); > + ret = generic_write_sync(dio->iocb, ret); > > kfree(dio); > > return ret; > } > -EXPORT_SYMBOL_GPL(iomap_dio_complete); > + NACK. If you don't want iomap_dio_comple

Re: [RFC PATCH v2 0/5] fs: interface for directly reading/writing compressed data

2019-10-20 Thread Dave Chinner
em. It is based on my previous series which > added a Btrfs-specific ioctl [1], but it is now an extension to > preadv2()/pwritev2() as suggested by Dave Chinner [2]. I've included a > man page patch describing the API in detail. Test cases and examples > programs are available [3].

Re: [RFC PATCH 2/3] fs: add RWF_ENCODED for writing compressed data

2019-09-25 Thread Dave Chinner
On Wed, Sep 25, 2019 at 08:07:12AM -0400, Colin Walters wrote: > > > On Wed, Sep 25, 2019, at 3:11 AM, Dave Chinner wrote: > > > > We're talking about user data read/write access here, not some > > special security capability. Access to the data has already bee

Re: [RFC PATCH 2/3] fs: add RWF_ENCODED for writing compressed data

2019-09-25 Thread Dave Chinner
checked, so why should the format that the data is supplied to the kernel in suddenly require new privilege checks? i.e. writing encoded data to a file requires exactly the same access permissions as writing cleartext data to the file. The only extra information here is whether the _filesystem_ supports encoded data, and that doesn't change regardless of what the open file gets passed to. Hence the capability is either there or it isn't, it doesn't transform no matter what privilege boundary the file is passed across. Similarly, we have permission to access the data or we don't through the struct file, it doesn't transform either. Hence I don't see why CAP_SYS_ADMIN or any special permissions are needed for an application with access permissions to file data to use these RWF_ENCODED IO interfaces. I am inclined to think the permission check here is wrong and should be dropped, and then all these issues go away. Yes, the app that is going to use this needs root perms because it accesses all data in the fs (it's a backup app!), but that doesn't mean you can only use RWF_ENCODED if you have root perms. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 15/15] xfs: Use the new iomap infrastructure for CoW

2019-09-06 Thread Dave Chinner
> > This now at least survives xfstests -g quick on a 4k xfs file system > for. Here is my current tree: > > http://git.infradead.org/users/hch/xfs.git/shortlog/refs/heads/xfs-cow-iomap That looks somewhat reasonable. The XFS mapping function is turning into spaghetti and getting really hard to follow again, though. Perhaps we should consider splitting the shared/COW path out of it... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/2] btrfs: add ioctl for directly writing compressed data

2019-09-06 Thread Dave Chinner
On Fri, Sep 06, 2019 at 11:19:49AM -0700, Omar Sandoval wrote: > On Thu, Sep 05, 2019 at 12:10:12PM +1000, Dave Chinner wrote: > > On Wed, Sep 04, 2019 at 12:13:26PM -0700, Omar Sandoval wrote: > > > From: Omar Sandoval > > > > > > This adds an API for wri

Re: [PATCH 2/2] btrfs: add ioctl for directly writing compressed data

2019-09-06 Thread Dave Chinner
On Thu, Sep 05, 2019 at 02:16:37PM +0200, Johannes Thumshirn wrote: > On 05/09/2019 04:10, Dave Chinner wrote: > > On Wed, Sep 04, 2019 at 12:13:26PM -0700, Omar Sandoval wrote: > >> From: Omar Sandoval > >> > >> This adds an API for writing compressed data

Re: [PATCH 2/2] btrfs: add ioctl for directly writing compressed data

2019-09-04 Thread Dave Chinner
hat skips the compression/decompression code and sets a few extra flags in the iocb that is passed down to the direct IO code. We don't need a whole new IO path just to skip a data transformation step in the direct IO path. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 10/13] iomap: use a function pointer for dio submits

2019-08-08 Thread Dave Chinner
now if there's hardware encryption below or software encryption on top becomes problematic... So really, from a filesystem and iomap perspective, What Eric says is the right - it's the only order that makes sense... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 10/13] iomap: use a function pointer for dio submits

2019-08-05 Thread Dave Chinner
On Mon, Aug 05, 2019 at 04:08:43PM +, Goldwyn Rodrigues wrote: > On Mon, 2019-08-05 at 09:43 +1000, Dave Chinner wrote: > > On Fri, Aug 02, 2019 at 05:00:45PM -0500, Goldwyn Rodrigues wrote: > > > From: Goldwyn Rodrigues > > > > > > This helps filesyste

Re: [PATCH 05/13] btrfs: Add CoW in iomap based writes

2019-08-04 Thread Dave Chinner
iomap->type = IOMAP_DELALLOC; > + } > + > iomap->addr = IOMAP_NULL_ADDR; > iomap->type = IOMAP_DELALLOC; The iomap->type is overwritten here and so IOMAP_COW will never be seen by the iomap infrastructure... Cheers, Dave. -- Dave Chinner da...@fromorbit.com
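The review comment above describes a plain "last store wins" bug. A minimal userspace sketch of that shape follows; the demo_* names are hypothetical stand-ins and not the kernel's struct iomap or the patch's actual code, and demo_fill_fixed() is only one possible way to restructure it, not necessarily what the patch author did.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the iomap mapping types; not the real API. */
enum demo_type { DEMO_DELALLOC, DEMO_COW };

struct demo_iomap {
	enum demo_type type;
};

/* Buggy shape: the unconditional store clobbers the conditional one. */
static void demo_fill_buggy(struct demo_iomap *map, bool needs_cow)
{
	assert(map != NULL);
	if (needs_cow)
		map->type = DEMO_COW;
	map->type = DEMO_DELALLOC; /* DEMO_COW is never visible to callers */
}

/* One possible restructuring: set the default first, then override it. */
static void demo_fill_fixed(struct demo_iomap *map, bool needs_cow)
{
	assert(map != NULL);
	map->type = DEMO_DELALLOC;
	if (needs_cow)
		map->type = DEMO_COW;
}
```

With the buggy ordering a caller asking for COW still sees DEMO_DELALLOC, which is the behaviour the review points out.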

Re: [PATCH 04/13] btrfs: Add a simple buffered iomap write

2019-08-04 Thread Dave Chinner
en; > + if (iocb->ki_pos > i_size_read(inode)) > + i_size_write(inode, iocb->ki_pos); > + return written; Looks like it fails to handle O_[D]SYNC writes. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 01/13] iomap: Use a IOMAP_COW/srcmap for a read-modify-write I/O

2019-08-04 Thread Dave Chinner
((name)[IOMAP_SOURCE_MAP]) And now we only have to pass a single iomap parameter to each function as "struct iomap **iomap". This makes the code somewhat simpler, and we only ever need to use IOMAP_S(iomap) when IOMAP_B(iomap)->type == IOMAP_COW. The other advantage of this is that if we even need new functionality that requires 2 (or more) iomaps, we don't have to change APIs again. Cheers, Dave. -- Dave Chinner da...@fromorbit.com
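The array-of-maps idea can be sketched in userspace as below. Only IOMAP_SOURCE_MAP, IOMAP_B(), IOMAP_S() and IOMAP_COW appear in the message; the start of the macro definitions is cut off in this archive, so every mock_* and MOCK_* name here (including the base-map index) is an assumption made for illustration, not the proposed kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical indices and types; only the SOURCE_MAP idea is from the mail. */
enum { MOCK_BASE_MAP, MOCK_SOURCE_MAP, MOCK_NR_MAPS };
enum mock_type { MOCK_MAPPED, MOCK_COW };

struct mock_iomap {
	enum mock_type type;
	long long addr;
};

/* Accessors in the spirit of IOMAP_B()/IOMAP_S(): index into one array. */
#define MOCK_B(maps) ((maps)[MOCK_BASE_MAP])
#define MOCK_S(maps) ((maps)[MOCK_SOURCE_MAP])

/* Pick the address to read from: the source map is consulted only for COW. */
static long long mock_read_addr(struct mock_iomap **maps)
{
	assert(maps != NULL && MOCK_B(maps) != NULL);
	if (MOCK_B(maps)->type == MOCK_COW) {
		assert(MOCK_S(maps) != NULL);
		return MOCK_S(maps)->addr;
	}
	return MOCK_B(maps)->addr;
}
```

Because every function takes the same single array parameter, adding a third map later would mean a new index and accessor rather than a new argument on every call path.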

Re: [PATCH 02/13] iomap: Read page from srcmap for IOMAP_COW

2019-08-04 Thread Dave Chinner
Darrick on CONFIG_IOMAP_DEBUG here - we need to start locking down invalid behaviour and invalid combinations with asserts that tell developers they've broken something. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 10/13] iomap: use a function pointer for dio submits

2019-08-04 Thread Dave Chinner
louts is completely the wrong approach to be taking. We need to do these things in a generic manner so that all filesystems (and block devices!) that use the iomap infrastructure can take advantage of them, not just one of them. Quite frankly, I don't care if it takes more time and work up

Re: [RFC][PATCH] link.2: AT_ATOMIC_DATA and AT_ATOMIC_METADATA

2019-06-02 Thread Dave Chinner
On Sat, Jun 01, 2019 at 11:01:42AM +0300, Amir Goldstein wrote: > On Sat, Jun 1, 2019 at 2:28 AM Dave Chinner wrote: > > > > On Sat, Jun 01, 2019 at 08:45:49AM +1000, Dave Chinner wrote: > > > Given that we can already use AIO to provide this sort of ordering, > >

Re: [RFC][PATCH] link.2: AT_ATOMIC_DATA and AT_ATOMIC_METADATA

2019-05-31 Thread Dave Chinner
On Sat, Jun 01, 2019 at 08:45:49AM +1000, Dave Chinner wrote: > Given that we can already use AIO to provide this sort of ordering, > and AIO is vastly faster than synchronous IO, I don't see any point > in adding complex barrier interfaces that can be /easily implemented > in

Re: [RFC][PATCH] link.2: AT_ATOMIC_DATA and AT_ATOMIC_METADATA

2019-05-31 Thread Dave Chinner
ee any point in adding complex barrier interfaces that can be /easily implemented in userspace/ using existing AIO primitives. You should start thinking about expanding libaio with stuff like "link_after_fdatasync()" and suddenly the whole problem of filesystem data vs metadata ordering goes away because the application directly controls all ordering without blocking and doesn't need to care what the filesystem under it does. Cheers, Dave. -- Dave Chinner da...@fromorbit.com
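The ordering that a "link_after_fdatasync()" helper would enforce can be shown with plain POSIX calls. The sketch below is deliberately the blocking form, so it does not demonstrate the non-blocking AIO behaviour argued for above; link_after_fdatasync_blocking() is a hypothetical name, and no such function exists in libaio as far as this listing shows.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Blocking sketch: make the file's data durable first, and only then create
 * the new name. Returns 0 on success, -1 on error with errno set by the
 * failing call.
 */
static int link_after_fdatasync_blocking(int fd, const char *oldpath,
					 const char *newpath)
{
	assert(oldpath != NULL && newpath != NULL);
	if (fdatasync(fd) != 0)
		return -1; /* data not known to be stable: do not link */
	return link(oldpath, newpath);
}
```

An asynchronous version would submit the fdatasync, and issue the link from its completion handler, so the application keeps the same ordering without waiting in the caller.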

Re: [PATCH 04/18] dax: Introduce IOMAP_DAX_COW to CoW edges during writes

2019-05-30 Thread Dave Chinner
On Thu, May 30, 2019 at 01:16:05PM +0200, Jan Kara wrote: > On Thu 30-05-19 08:14:45, Dave Chinner wrote: > > On Wed, May 29, 2019 at 03:46:29PM +0200, Jan Kara wrote: > > > On Wed 29-05-19 14:46:58, Dave Chinner wrote: > > > > iomap_apply() > >

Re: [PATCH 04/18] dax: Introduce IOMAP_DAX_COW to CoW edges during writes

2019-05-29 Thread Dave Chinner
On Wed, May 29, 2019 at 03:46:29PM +0200, Jan Kara wrote: > On Wed 29-05-19 14:46:58, Dave Chinner wrote: > > iomap_apply() > > > > ->iomap_begin() > > map old data extent that we copy from > > > > alloca

Re: [PATCH 04/18] dax: Introduce IOMAP_DAX_COW to CoW edges during writes

2019-05-28 Thread Dave Chinner
On Tue, May 28, 2019 at 09:07:19PM -0700, Darrick J. Wong wrote: > On Wed, May 29, 2019 at 12:02:40PM +0800, Shiyang Ruan wrote: > > On 5/29/19 10:47 AM, Dave Chinner wrote: > > > On Wed, May 29, 2019 at 10:01:58AM +0800, Shiyang Ruan wrote: > > > > On 5

Re: [PATCH 04/18] dax: Introduce IOMAP_DAX_COW to CoW edges during writes

2019-05-28 Thread Dave Chinner
> > The ->iomap_begin() fills @saddr if the extent is COW, and 0 if not. > > > > > Then > > > > > handle this @saddr in each ->actor(). No more modifications in other > > > > > functions. > > > > > > > > Yes, I st

Re: [PATCH 04/18] dax: Introduce IOMAP_DAX_COW to CoW edges during writes

2019-05-22 Thread Dave Chinner
uggestion during the V1 patchset > > review that instead of adding more iomap flags/types and overloading > > fields, we simply pass two struct iomaps into ->iomap_begin: > > Actually, Dave is the one who suggested to perform it this way. > https://patchwork.kernel.org/comment

ref mismatch / root not found in extent tree / backpointer mismatch / owner ref check failed

2019-05-03 Thread Dave T
The filesystem has become very, very slow. smartctl doesn't show any problems with the HDD. My usual btrfs maintenance (balance, scrub, defrag) did not show any problems -- but did not resolve the slowness. So I ran a btrfs check -- the result is pasted below. What causes this and is there any solu

Re: [PATCH v2 1/7] fsstress: allow fsync on directories too

2019-04-03 Thread Dave Chinner
On Wed, Apr 03, 2019 at 05:35:20PM +, Filipe Manana wrote: > On Wed, Apr 3, 2019 at 3:18 AM Dave Chinner wrote: > > > > On Mon, Apr 01, 2019 at 01:50:18PM +0100, fdman...@kernel.org wrote: > > > From: Filipe Manana > > > > > > Currently the fs

Re: [PATCH 04/15] dax: Introduce IOMAP_F_COW for copy-on-write

2019-04-02 Thread Dave Chinner
On Tue, Apr 02, 2019 at 08:56:31PM -0500, Goldwyn Rodrigues wrote: > On 10:06 02/04, Dave Chinner wrote: > > On Mon, Apr 01, 2019 at 04:41:02PM -0500, Goldwyn Rodrigues wrote: > > > After Darrick's suggestion, we can even do away with cow_pos, so > > > only the rea

Re: [PATCH v2 1/7] fsstress: allow fsync on directories too

2019-04-02 Thread Dave Chinner
thname(&f); > - if (!get_fname(FT_REGFILE, r, &f, NULL, NULL, &v)) { > + if (!get_fname(FT_REGFILE | FT_DIRm, r, &f, NULL, NULL, &v)) { > if (v) > printf("%d/%d: fsync - no filename\n", procid, opno); > free_pathn

Re: [PATCH 04/15] dax: Introduce IOMAP_F_COW for copy-on-write

2019-04-01 Thread Dave Chinner
On Mon, Apr 01, 2019 at 04:41:02PM -0500, Goldwyn Rodrigues wrote: > On 15:38 01/04, Dave Chinner wrote: > > On Tue, Mar 26, 2019 at 02:02:50PM -0500, Goldwyn Rodrigues wrote: > > > From: Goldwyn Rodrigues > > > > > > The IOMAP_F_COW is a flag to notify dax

Re: [PATCH 04/15] dax: Introduce IOMAP_F_COW for copy-on-write

2019-03-31 Thread Dave Chinner
#define IOMAP_F_DIRTY 0x02 /* uncommitted metadata */ > #define IOMAP_F_BUFFER_HEAD 0x04 /* file system requires buffer heads */ > +#define IOMAP_F_COW 0x08 /* cow before write */ "Copy on write before write"? :) Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 5/7] fsstress: allow afsync on directories too

2019-03-28 Thread Dave Chinner
procid, opno, e); > - free_pathname(&f); > - close(fd); > - return; > + goto out; > } > > e = event.res2; > if (v) > printf("%d/%d: afsync %s %d\n", procid, opno, f.path, e); > +out: > free_pathname(&f); > - close(fd); > + if (dir) > + closedir(dir); > + else > + close(fd); Same here for close. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/7] fsstress: add operation for setting xattrs on files and directories

2019-03-28 Thread Dave Chinner
ded attribute frequency. There are some tests that actually use "-f setxattr=n" (and who knows how many custom test scripts using fsstress built from fstests), so I don't think we should be renaming existing operations to something else and then reusing the name for a new type of operation like this. I certainly agree with the idea of adding extended attributes to fsstress, just not this way... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 1/7] fsstress: rename setxattr operation to chproj

2019-03-28 Thread Dave Chinner
d it to testing other bits of the FS_IOC_FS[GS]ETXATTR interface, so it's appropriately named. If you are going to change it, then "fssetxattr" is probably the right thing to change it to... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3/3] xfs: don't allow most setxattr to immutable files

2019-03-28 Thread Dave Chinner
_IMMUTABLE)) > + return -EPERM; > + > /* diflags2 only valid for v3 inodes. */ > di_flags2 = xfs_flags2diflags2(ip, fa->fsx_xflags); > if (di_flags2 && ip->i_d.di_version < 3) Looks fine - catches both FS_IOC_SETFLAGS and FS_IOC_FSSETXATTR for XFS. Do the other filesystems that implement FS_IOC_FSSETXATTR have the same bug? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/3] xfs: reset page mappings after setting immutable

2019-03-28 Thread Dave Chinner
+ goto out_unlock; > + > + *join_flags = XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL; > + return 0; > + > +out_unlock: > + xfs_iunlock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL); > + return error; > + > +} Doesn't wait for direct IO to drain. Wouldn't it be better to do this? lock() xfs_flush_unmap_range(ip, 0, XFS_SIZE(ip)); unlock() Otherwise looks ok. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] generic: add test for fsync after shrinking truncate and rename

2019-03-18 Thread Dave Chinner
(Sorry, missed this email and only just noticed it...) On Fri, Mar 08, 2019 at 09:11:19AM -0600, Vijay Chidambaram wrote: > On Thu, Mar 7, 2019 at 10:35 PM Dave Chinner wrote: > > > > On Thu, Mar 07, 2019 at 05:19:51PM -0600, Jayashree Mohan wrote: > > > Hi Amir, >

Re: [PATCH] generic: add test for fsync after shrinking truncate and rename

2019-03-07 Thread Dave Chinner
ps://patchwork.kernel.org/patch/8293181/). I really wouldn't try to infer anything from the bugs in btrfs fsync behaviour or the test cases that expose them. "Behave like other filesystems" is not a substitute for having solid fundamental algorithms... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] generic: add test for fsync after shrinking truncate and rename

2019-03-07 Thread Dave Chinner
On Thu, Mar 07, 2019 at 09:52:03AM +0200, Amir Goldstein wrote: > On Wed, Mar 6, 2019 at 11:48 PM Dave Chinner wrote: > > > > On Wed, Mar 06, 2019 at 09:51:23AM +0200, Amir Goldstein wrote: > > > On Wed, Mar 6, 2019 at 12:33 AM Dave Chinner wrote: > > > >

Re: [PATCH] generic: add test for fsync after shrinking truncate and rename

2019-03-06 Thread Dave Chinner
On Wed, Mar 06, 2019 at 09:51:23AM +0200, Amir Goldstein wrote: > On Wed, Mar 6, 2019 at 12:33 AM Dave Chinner wrote: > > > So the reason this is working is because 2nd fsync needs to > > > persist ctime of B and not because it needs to persist the > > > truncat

Re: [PATCH] generic: add test for fsync after shrinking truncate and rename

2019-03-05 Thread Dave Chinner
On Tue, Mar 05, 2019 at 07:39:28AM +0200, Amir Goldstein wrote: > On Tue, Mar 5, 2019 at 2:50 AM Dave Chinner wrote: > > > > On Mon, Mar 04, 2019 at 05:04:23PM +0200, Amir Goldstein wrote: > > > On Mon, Mar 4, 2019 at 4:44 PM wrote: > > > > > > > >

Re: [PATCH] generic: add test for fsync after shrinking truncate and rename

2019-03-04 Thread Dave Chinner
On Tue, Mar 05, 2019 at 11:50:20AM +1100, Dave Chinner wrote: > On Mon, Mar 04, 2019 at 05:04:23PM +0200, Amir Goldstein wrote: > > On Mon, Mar 4, 2019 at 4:44 PM wrote: > > > > > > From: Filipe Manana > > > > > > Test that if we truncate a file to re

Re: [PATCH] generic: add test for fsync after shrinking truncate and rename

2019-03-04 Thread Dave Chinner
llows the behaviour to be implementation specific. In this case, file systems with strictly ordered metadata will end up making the rename visible because the rename occurred before the truncate that the fsync() is persisting... Cheers, Dave. -- Dave Chinner da...@fromorbit.com
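
The ordering argument above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct log_op` and `ordered_fsync()` are hypothetical names invented for the illustration, not XFS or VFS code, and the model assumes a single, strictly ordered metadata log in which persisting one operation also persists every operation logged before it.

```c
#include <stddef.h>

/* Hypothetical record in a strictly ordered metadata log. */
struct log_op {
    int inode;      /* inode the operation modifies */
    int persisted;  /* non-zero once the operation is on stable storage */
};

/*
 * Model of fsync() on a strictly ordered log: locate the last logged
 * operation that touches `inode`, then persist every operation up to
 * and including it. Earlier operations on other inodes are persisted
 * as a side effect. Returns the number of operations newly persisted.
 */
size_t ordered_fsync(struct log_op *log, size_t n, int inode)
{
    size_t last = n;    /* n means "no operation on this inode" */
    size_t flushed = 0;

    for (size_t i = 0; i < n; i++)
        if (log[i].inode == inode)
            last = i;

    if (last == n)
        return 0;

    for (size_t i = 0; i <= last; i++) {
        if (!log[i].persisted) {
            log[i].persisted = 1;
            flushed++;
        }
    }
    return flushed;
}
```

In this model, if a rename is logged before a truncate of another file, an fsync of the truncated file persists the rename as well, while operations logged after the truncate stay unpersisted.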

Re: [LSF/MM TOPIC] More async operations for file systems - async discard?

2019-02-21 Thread Dave Chinner
ent if you write > or discard at the smaller granularity. Filesystems discard extents these days, not individual blocks. If you free a 1MB file, then you are likely to get a 1MB discard. Or if you use fstrim, then it's free space extent sizes (on XFS can be hundreds of GBs) and small fre
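
The point about extent-sized discards can be sketched in userspace C. This is an illustration only: `struct extent` and `coalesce_free_blocks()` are hypothetical names, not taken from any filesystem, and the sketch assumes the freed block numbers arrive sorted and without duplicates.

```c
#include <stddef.h>

/* Hypothetical extent descriptor: a run of contiguous free blocks. */
struct extent {
    unsigned long start;  /* first block of the run */
    unsigned long len;    /* number of blocks in the run */
};

/*
 * Merge a sorted array of free block numbers into contiguous extents,
 * so that one discard can be issued per extent instead of per block.
 * The caller provides room for up to n entries in `out`.
 * Returns the number of extents written.
 */
size_t coalesce_free_blocks(const unsigned long *blocks, size_t n,
                            struct extent *out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (count > 0 &&
            out[count - 1].start + out[count - 1].len == blocks[i]) {
            out[count - 1].len++;          /* block extends the current run */
        } else {
            out[count].start = blocks[i];  /* block starts a new run */
            out[count].len = 1;
            count++;
        }
    }
    return count;
}
```

With this grouping, freeing a file whose blocks are contiguous yields a single extent, which is what makes a single discard of the whole range possible.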

Re: [RFC PATCH 0/6] Allow setting file birth time with utimensat()

2019-02-18 Thread Dave Chinner
; overridable. > > Please, no. We need to have consistent behaviour between at least > Linux local filesystems. Not "Chris thinks this is a good idea, > while Dave and Ted think it's a bad idea, so btrfs supports it and > XFS and ext4 disallow it". And, quite frankly, t

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-18 Thread Dave Chinner
On Mon, Feb 18, 2019 at 06:15:34PM -0800, Jane Chu wrote: > On 2/15/2019 9:39 PM, Dave Chinner wrote: > > >On Sat, Feb 16, 2019 at 04:31:33PM +1100, Dave Chinner wrote: > >>On Fri, Feb 15, 2019 at 10:57:12AM +0100, Johannes Thumshirn wrote: > >>>(This is a jo

Re: [RFC PATCH 0/6] Allow setting file birth time with utimensat()

2019-02-18 Thread Dave Chinner
nd expose it through statx() (as authored time, not birth time), but store it in a system xattr rather than an internal filesystem metadata field that was never intended to be user modifiable. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [LSF/MM TOPIC] More async operations for file systems - async discard?

2019-02-17 Thread Dave Chinner
On Sun, Feb 17, 2019 at 06:42:59PM -0500, Ric Wheeler wrote: > On 2/17/19 4:09 PM, Dave Chinner wrote: > >On Sun, Feb 17, 2019 at 03:36:10PM -0500, Ric Wheeler wrote: > >>One proposal for btrfs was that we should look at getting discard > >>out of the synchronous pa

Re: [LSF/MM TOPIC] More async operations for file systems - async discard?

2019-02-17 Thread Dave Chinner
st of the various discard > commands - how painful is it for modern SSD's? AIUI, it still depends on the SSD implementation, unfortunately. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-16 Thread Dave Chinner
On Sat, Feb 16, 2019 at 09:05:31AM -0800, Dan Williams wrote: > On Fri, Feb 15, 2019 at 9:40 PM Dave Chinner wrote: > > > > On Sat, Feb 16, 2019 at 04:31:33PM +1100, Dave Chinner wrote: > > > On Fri, Feb 15, 2019 at 10:57:12AM +0100, Johannes Thumshirn wrote: > >

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-15 Thread Dave Chinner
On Sat, Feb 16, 2019 at 04:31:33PM +1100, Dave Chinner wrote: > On Fri, Feb 15, 2019 at 10:57:12AM +0100, Johannes Thumshirn wrote: > > (This is a joint proposal with Hannes Reinecke) > > > > Servers with NV-DIMM are slowly emerging in data centers but one key feature > &g

Re: [LSF/MM TOPIC] Software RAID Support for NV-DIMM

2019-02-15 Thread Dave Chinner
all the metadata goes to the software raided pmem block devices that aren't DAX capable. Problem already solved, yes? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC PATCH 0/6] Allow setting file birth time with utimensat()

2019-02-14 Thread Dave Chinner
On Thu, Feb 14, 2019 at 03:14:29PM -0800, Omar Sandoval wrote: > On Fri, Feb 15, 2019 at 09:06:26AM +1100, Dave Chinner wrote: > > On Thu, Feb 14, 2019 at 02:00:07AM -0800, Omar Sandoval wrote: > > > From: Omar Sandoval > > > > > > Hi, > > > >

Re: [RFC PATCH 0/6] Allow setting file birth time with utimensat()

2019-02-14 Thread Dave Chinner
e create time doesn't really help, because once you've broken into a system, this makes it really easy to cover tracks (e.g. we can't find files that were created and unlinked during the break-in window anymore) and lay false trails. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

a new kind of "No space left on device" error

2018-10-28 Thread Dave
This is one I have not seen before. When running a simple, well-tested and well-used script that makes backups using btrfs send | receive, I got these two errors: At subvol snapshot ERROR: rename o131621-1091-0 -> usr/lib/node_modules/node-gyp/gyp/pylib/gyp/MSVSVersion.py failed: No space left on

btrfs-qgroup-rescan using 100% CPU

2018-10-27 Thread Dave
I'm using btrfs and snapper on a system with an SSD. On this system when I run `snapper -c root ls` (where `root` is the snapper config for /), the process takes a very long time and top shows the following process using 100% of the CPU: kworker/u8:6+btrfs-qgroup-rescan I have multiple comput
