Re: [PATCH 02/26] Add libbtrfsutil

2018-01-29 Thread Omar Sandoval
On Mon, Jan 29, 2018 at 10:16:40AM +0800, Qu Wenruo wrote:
> 
> 
> On 2018-01-27 02:40, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > Currently, users wishing to manage Btrfs filesystems programmatically
> > have to shell out to btrfs-progs and parse the output. This isn't ideal.
> > The goal of libbtrfsutil is to provide a library version of as many of
> > the operations of btrfs-progs as possible and to migrate btrfs-progs to
> > use it.
> > 
> > Rather than simply refactoring the existing btrfs-progs code, the code
> > has to be written from scratch for a couple of reasons:
> > 
> > * A lot of the btrfs-progs code was not designed with a nice library API
> >   in mind in terms of reusability, naming, and error reporting.
> > * libbtrfsutil is licensed under the LGPL, whereas btrfs-progs is under
> >   the GPL, which makes it dubious to directly copy or move the code.
> > 
> > Eventually, most of the low-level btrfs-progs code should either live in
> > libbtrfsutil or the shared kernel/userspace filesystem code, and
> > btrfs-progs will just be the CLI wrapper.
> > 
> > This first commit just includes the build system changes, license,
> > README, and error reporting helper.
> > 
> > Signed-off-by: Omar Sandoval 
> > ---
> >  .gitignore  |   2 +
> >  Makefile|  47 +--
> >  libbtrfsutil/COPYING| 674 
> > 
> >  libbtrfsutil/COPYING.LESSER | 165 +++
> >  libbtrfsutil/README.md  |  32 +++
> >  libbtrfsutil/btrfsutil.h|  72 +
> >  libbtrfsutil/errors.c   |  55 
> >  7 files changed, 1031 insertions(+), 16 deletions(-)
> >  create mode 100644 libbtrfsutil/COPYING
> >  create mode 100644 libbtrfsutil/COPYING.LESSER
> >  create mode 100644 libbtrfsutil/README.md
> >  create mode 100644 libbtrfsutil/btrfsutil.h
> >  create mode 100644 libbtrfsutil/errors.c
> > 

[snip]

> > diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
> > new file mode 100644
> > index ..fe1091ca
> > --- /dev/null
> > +++ b/libbtrfsutil/btrfsutil.h
> > @@ -0,0 +1,72 @@
> > +/*
> > + * Copyright (C) 2018 Facebook
> > + *
> > + * This file is part of libbtrfsutil.
> > + *
> > + * libbtrfsutil is free software: you can redistribute it and/or modify
> > > + * it under the terms of the GNU Lesser General Public License as published by
> > + * the Free Software Foundation, either version 3 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * libbtrfsutil is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public License
> > + * along with libbtrfsutil.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#ifndef BTRFS_UTIL_H
> > +#define BTRFS_UTIL_H
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +/**
> > + * enum btrfs_util_error - libbtrfsutil error codes.
> > + *
> > > + * All functions in libbtrfsutil that can return an error return this type and
> > + * set errno.
> > + */
> 
> I totally understand that libbtrfsutil needs extra error numbers, but I
> haven't seen a similar practice elsewhere. Would you mind giving an existing
> example of such >0 error usage in other projects?
> (Just curious)

Sure, pthreads returns 0 on success, positive errnos on failure. libcurl
also uses positive error codes 
(https://curl.haxx.se/libcurl/c/libcurl-errors.html).
Those are just the first two I checked.
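
The convention being discussed (zero for success, a positive library-specific code for failure, with errno still set for callers who want the underlying cause) can be sketched roughly as follows. The names demo_error, demo_strerror(), and demo_copy_ids() are made up for illustration and are not part of libbtrfsutil:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error codes: 0 means success, every failure is positive. */
enum demo_error {
	DEMO_OK,
	DEMO_ERROR_NO_MEMORY,
	DEMO_ERROR_INVALID_ARGUMENT,
};

/* Map an error code to a human-readable string. */
const char *demo_strerror(enum demo_error err)
{
	switch (err) {
	case DEMO_OK:
		return "Success";
	case DEMO_ERROR_NO_MEMORY:
		return "Cannot allocate memory";
	case DEMO_ERROR_INVALID_ARGUMENT:
		return "Invalid argument";
	}
	return "Unknown error";
}

/*
 * Copy an array of IDs. On failure, return a positive error code and also
 * set errno, so callers can check whichever is more convenient.
 */
enum demo_error demo_copy_ids(const uint64_t *ids, size_t n, uint64_t **ret)
{
	uint64_t *copy;

	assert(ret != NULL); /* a NULL output pointer is a caller bug */

	if (!ids || n == 0) {
		errno = EINVAL;
		return DEMO_ERROR_INVALID_ARGUMENT;
	}

	copy = malloc(n * sizeof(*copy));
	if (!copy) {
		errno = ENOMEM;
		return DEMO_ERROR_NO_MEMORY;
	}
	memcpy(copy, ids, n * sizeof(*copy));
	*ret = copy;
	return DEMO_OK;
}
```

A caller can then write `if (err) fprintf(stderr, "%s\n", demo_strerror(err));` without caring whether the code has an errno equivalent.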

> Normally people would expect error values < 0 to indicate errors, just
> like glibc system call wrappers, which always return -1 to indicate errors.
> 
> > +enum btrfs_util_error {
> > +   BTRFS_UTIL_OK,
> > +   BTRFS_UTIL_ERROR_STOP_ITERATION,
> > +   BTRFS_UTIL_ERROR_NO_MEMORY,
> 
> > Not sure if this duplicates the -ENOMEM errno.
> > 
> > From my understanding, these extra numbers should be used to indicate
> > extra errors not defined in the generic errno.h.
> > 
> > For NOT_BTRFS and NOT_SUBVOLUME they make sense, but for NO_MEMORY, I'm
> > really not sure.

So sometimes we return an errno, sometimes an enum btrfs_util_error? And
then we have to make sure to

Re: [PATCH 04/26] libbtrfsutil: add btrfs_util_is_subvolume() and btrfs_util_subvolume_id()

2018-02-02 Thread Omar Sandoval
On Thu, Feb 01, 2018 at 05:28:28PM +0100, David Sterba wrote:
> On Tue, Jan 30, 2018 at 08:54:08AM +0200, Nikolay Borisov wrote:
> > 
> > 
> > On 29.01.2018 23:43, Omar Sandoval wrote:
> > > On Mon, Jan 29, 2018 at 12:24:26PM +0200, Nikolay Borisov wrote:
> > >> On 26.01.2018 20:40, Omar Sandoval wrote:
> > > [snip]
> > >>> +/*
> > > >>> + * This intentionally duplicates btrfs_util_f_is_subvolume() instead of opening
> > > >>> + * a file descriptor and calling it, because fstat() and fstatfs() don't accept
> > > >>> + * file descriptors opened with O_PATH on old kernels (before v3.6 and before
> > > >>> + * v3.12, respectively), but stat() and statfs() can be called on a path that
> > >>> + * the user doesn't have read or write permissions to.
> > >>> + */
> > >>> +__attribute__((visibility("default")))
> > >>
> > >> Why do we need to explicitly set the attribute visibility to default,
> > >> isn't it implicitly default already?
> > > 
> > > Ah, I forgot to add -fvisibility=hidden to the build rule when I ported
> > > this to the btrfs-progs Makefile, that's why that's there. I'll add
> > > -fvisibility=hidden to the Makefile.
> > 
> > Right, it could be a good idea to hide the visibility attribute behind
> > a descriptive macro, e.g. (PUBLIC|LIBRARY)_FUNC or some such.
> 
> Macro would be better (but is not needed for the initial version).
> 
> Alternatively the library .sym file can externally track the exported
> symbols and also track versioning.

I'll add a macro for this, thanks.
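
For reference, a minimal sketch of what such a macro could look like when the library is built with -fvisibility=hidden. DEMO_PUBLIC and both function names are placeholders, not the names used in the series:

```c
#include <assert.h>

/* Hypothetical macro name; the thread only suggests something like this. */
#define DEMO_PUBLIC __attribute__((visibility("default")))

/*
 * Not static (it could be shared between the library's source files), but
 * with -fvisibility=hidden it is not exported from the shared object.
 */
int demo_internal_scale(int value, int factor)
{
	return value * factor;
}

/* Explicitly exported: this stays visible even with -fvisibility=hidden. */
DEMO_PUBLIC int demo_percent_of(int part, int whole)
{
	assert(whole != 0); /* dividing by zero is a caller bug */
	return demo_internal_scale(part, 100) / whole;
}
```

Building this with `-fPIC -fvisibility=hidden -shared` and running `nm -D` on the result should show that, of the two functions, only demo_percent_of() is exported.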
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 21/26] btrfs-progs: use libbtrfsutil for subvol show

2018-02-02 Thread Omar Sandoval
On Sat, Feb 03, 2018 at 12:18:02AM +0100, Hans van Kranenburg wrote:
> On 01/26/2018 07:41 PM, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > Now implemented with btrfs_util_subvolume_path(),
> > btrfs_util_subvolume_info(), and subvolume iterators.
> > 
> > Signed-off-by: Omar Sandoval 
> > ---
> >  cmds-subvolume.c | 150 
> > ---
> >  utils.c  | 118 ---
> >  utils.h  |   5 --
> >  3 files changed, 99 insertions(+), 174 deletions(-)
> > 
> > diff --git a/cmds-subvolume.c b/cmds-subvolume.c
> > index c5e03011..b969fc88 100644
> > --- a/cmds-subvolume.c
> > +++ b/cmds-subvolume.c
> >
> >
> > [...]
> > } else
> > strcpy(tstr, "-");
> > printf("\tCreation time: \t\t%s\n", tstr);
> >  
> > -   printf("\tSubvolume ID: \t\t%llu\n", get_ri.root_id);
> > -   printf("\tGeneration: \t\t%llu\n", get_ri.gen);
> > -   printf("\tGen at creation: \t%llu\n", get_ri.ogen);
> > -   printf("\tParent ID: \t\t%llu\n", get_ri.ref_tree);
> > -   printf("\tTop level ID: \t\t%llu\n", get_ri.top_id);
> > +   printf("\tSubvolume ID: \t\t%" PRIu64 "\n", subvol.id);
> > +   printf("\tGeneration: \t\t%" PRIu64 "\n", subvol.generation);
> > +   printf("\tGen at creation: \t%" PRIu64 "\n", subvol.otransid);
> > +   printf("\tParent ID: \t\t%" PRIu64 "\n", subvol.parent_id);
> > +   printf("\tTop level ID: \t\t%" PRIu64 "\n", subvol.parent_id);
> >  
> > -   if (get_ri.flags & BTRFS_ROOT_SUBVOL_RDONLY)
> > +   if (subvol.flags & BTRFS_ROOT_SUBVOL_RDONLY)
> > printf("\tFlags: \t\t\treadonly\n");
> > else
> > printf("\tFlags: \t\t\t-\n");
> >  
> > /* print the snapshots of the given subvol if any*/
> > printf("\tSnapshot(s):\n");
> > -   filter_set = btrfs_list_alloc_filter_set();
> > -   btrfs_list_setup_filter(&filter_set, BTRFS_LIST_FILTER_BY_PARENT,
> > -   (u64)(unsigned long)get_ri.uuid);
> > -   btrfs_list_setup_print_column(BTRFS_LIST_PATH);
> >  
> > -   fd = open_file_or_dir(fullpath, &dirstream1);
> > -   if (fd < 0) {
> > -   fprintf(stderr, "ERROR: can't access '%s'\n", fullpath);
> > -   goto out;
> > +   err = btrfs_util_f_create_subvolume_iterator(fd,
> > +BTRFS_FS_TREE_OBJECTID,
> > +0, &iter);
> > +
> > [...]
> When you have enough subvolumes in a filesystem, let's say 10 (yes,
> that sometimes happens), the current btrfs sub list is quite unusable,
> which is kind of expected. But, currently, sub show is also unusable
> because it still starts loading a list of all subvolumes to be able to
> print the 'snapshots, if any'.
> 
> So I guess that this new sub show will at least print the info first and
> then use the iterator and start crawling through the list, which can be
> interrupted? At least you get the relevant info first then. :-)

Right, and since we don't load everything into memory all at once, both
show and list will be able to output subvolumes one by one.
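
A rough sketch of that pattern, with an in-memory array standing in for the kernel search results. The demo_* names are illustrative only; the real iterator API is in the patches:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum demo_error {
	DEMO_OK,
	DEMO_ERROR_STOP_ITERATION,
};

/* Hypothetical iterator state: a cursor over an array of subvolume IDs. */
struct demo_iterator {
	const uint64_t *ids;
	size_t count;
	size_t pos;
};

void demo_iterator_init(struct demo_iterator *iter, const uint64_t *ids,
			size_t count)
{
	iter->ids = ids;
	iter->count = count;
	iter->pos = 0;
}

/* Produce one ID per call; signal the end with STOP_ITERATION. */
enum demo_error demo_iterator_next(struct demo_iterator *iter,
				   uint64_t *id_ret)
{
	assert(iter != NULL && id_ret != NULL);

	if (iter->pos >= iter->count)
		return DEMO_ERROR_STOP_ITERATION;
	*id_ret = iter->ids[iter->pos++];
	return DEMO_OK;
}

/* Print each ID as soon as it is produced; returns how many were printed. */
size_t demo_print_ids(struct demo_iterator *iter, FILE *out)
{
	uint64_t id;
	size_t printed = 0;

	while (demo_iterator_next(iter, &id) == DEMO_OK) {
		fprintf(out, "%" PRIu64 "\n", id);
		printed++;
	}
	return printed;
}
```

Because nothing is collected up front, interrupting the loop partway through still leaves the already-printed entries on the terminal.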


[PATCH v2 00/27] btrfs-progs: introduce libbtrfsutil, "btrfs-progs as a library"

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Hi,

This is v2 of "btrfs-progs as a library".

Most of the changes since v1 are small:

- Rebased onto v4.15
- Split up btrfs_util_subvolume_path() which was accidentally squashed together
  with the commit adding btrfs_util_create_subvolume()
- Renamed btrfs_util_f_* functions to btrfs_util_*_fd for clarity
- Added -fvisibility=hidden and a macro for
  __attribute__((visibility("default")))
- Changed to use semantic versioning
- Fixed missing install of btrfsutil.h
- Documented functions which require root or are non-atomic
- Added a missing license to setup.py
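
To illustrate the semantic versioning item: the Makefile reads BTRFS_UTIL_VERSION_MAJOR/MINOR/PATCH from btrfsutil.h. A sketch of how such macros are typically used follows; the DEMO_* names and the values are placeholders, not the library's actual version:

```c
#include <assert.h>
#include <stdio.h>

/* Placeholder values, not the library's real version. */
#define DEMO_VERSION_MAJOR 1
#define DEMO_VERSION_MINOR 2
#define DEMO_VERSION_PATCH 3

/* Format "MAJOR.MINOR.PATCH" into buf; returns snprintf()'s result. */
int demo_version_string(char *buf, size_t size)
{
	assert(buf != NULL);
	return snprintf(buf, size, "%d.%d.%d", DEMO_VERSION_MAJOR,
			DEMO_VERSION_MINOR, DEMO_VERSION_PATCH);
}

/*
 * Semantic versioning compatibility check: the major version must match
 * exactly, and the library's minor version must be at least the one the
 * caller was written against.
 */
int demo_version_satisfies(int major, int minor)
{
	return major == DEMO_VERSION_MAJOR && minor <= DEMO_VERSION_MINOR;
}
```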

The bigger change is in the last two patches. Dave requested that I get
rid of the runtime dependency of libbtrfsutil from libbtrfs. The easiest
way to do this was to remove the btrfs_list_subvols_print()
implementation from libbtrfs and put it in cmds-subvolume.c (details in
patch 26). I'm open to alternatives.

Please share feedback, thanks!

Cover letter from v1:

One of the feature requests I get most often is a library to do the
sorts of operations that we do with btrfs-progs. We can shell out to
btrfs-progs, but the output format isn't always easily parseable, and
shelling out isn't always ideal. There's libbtrfs, but it's very
incomplete, doesn't have a well thought out API, and is licensed under
the GPL, making it hard to use for many projects.

libbtrfsutil is a new library written from scratch to address these
issues. The implementation is completely independent of the existing
btrfs-progs code, including kerncompat.h, and has a clean API and naming
scheme. It is licensed under the LGPL. It also includes Python bindings
by default. I will maintain the library code.

Patch 1 is a preparation cleanup which can go in independently. Patch 2
adds the build system stuff for the library, and patch 3 does the same
for the Python bindings. Patches 4-14 implement the library helpers,
currently subvolume helpers and the sync ioctls. Patches 15-26 replace
the btrfs-progs and libbtrfs code to use libbtrfsutil instead. I took
care to preserve backwards-compatibility. `btrfs subvol list` in
particular had some buggy behaviors for -o and -a that I emulated in the
new code, see the comments in the code.

These patches are also available on my GitHub:
https://github.com/osandov/btrfs-progs/tree/libbtrfsutil. That branch
will rebase as I update this series.

Omar Sandoval (27):
  btrfs-progs: get rid of undocumented qgroup inheritance options
  Add libbtrfsutil
  libbtrfsutil: add Python bindings
  libbtrfsutil: add btrfs_util_is_subvolume() and
btrfs_util_subvolume_id()
  libbtrfsutil: add qgroup inheritance helpers
  libbtrfsutil: add btrfs_util_create_subvolume()
  libbtrfsutil: add btrfs_util_subvolume_path()
  libbtrfsutil: add btrfs_util_subvolume_info()
  libbtrfsutil: add btrfs_util_[gs]et_read_only()
  libbtrfsutil: add btrfs_util_[gs]et_default_subvolume()
  libbtrfsutil: add subvolume iterator helpers
  libbtrfsutil: add btrfs_util_create_snapshot()
  libbtrfsutil: add btrfs_util_delete_subvolume()
  libbtrfsutil: add btrfs_util_deleted_subvolumes()
  libbtrfsutil: add filesystem sync helpers
  btrfs-progs: use libbtrfsutil for read-only property
  btrfs-progs: use libbtrfsutil for sync ioctls
  btrfs-progs: use libbtrfsutil for set-default
  btrfs-progs: use libbtrfsutil for get-default
  btrfs-progs: use libbtrfsutil for subvol create and snapshot
  btrfs-progs: use libbtrfsutil for subvol delete
  btrfs-progs: use libbtrfsutil for subvol show
  btrfs-progs: use libbtrfsutil for subvol sync
  btrfs-progs: replace test_issubvolume() with btrfs_util_is_subvolume()
  btrfs-progs: add recursive snapshot/delete using libbtrfsutil
  btrfs-progs: use libbtrfsutil for subvolume list
  btrfs-progs: deprecate libbtrfs helpers

 .gitignore   |2 +
 Documentation/btrfs-subvolume.asciidoc   |   14 +-
 INSTALL  |4 +
 Makefile |  109 +-
 Makefile.inc.in  |4 +-
 btrfs-list.c |  810 +--
 btrfs-list.h |  104 +-
 cmds-filesystem.c|   19 +-
 cmds-inspect.c   |   10 +-
 cmds-qgroup.c|   21 +-
 cmds-receive.c   |   13 +-
 cmds-subvolume.c | 1877 ++
 configure.ac |   15 +
 libbtrfsutil/COPYING |  674 +
 libbtrfsutil/COPYING.LESSER  |  165 +++
 libbtrfsutil/README.md   |   38 +
 libbtrfsutil/btrfsutil.h |  648 +
 libbtrfsutil/btrfsutil_internal.h|   40 +
 libbtrfsutil/errors.c|   55 +
 libbtrfsutil/filesystem.c|  103 ++
 l

[PATCH v2 01/27] btrfs-progs: get rid of undocumented qgroup inheritance options

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

The -c option to subvol create and the -c and -x options to subvol
snapshot have never been documented (and -x doesn't actually work
because it's not in the getopt option string). The functionality is
dubious and the kernel interface is being removed, so get rid of these.

Signed-off-by: Omar Sandoval 
---
 cmds-subvolume.c | 25 ++---
 qgroup.c | 42 --
 qgroup.h |  2 --
 3 files changed, 2 insertions(+), 67 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 8a473f7a..edcb4f11 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -129,18 +129,11 @@ static int cmd_subvol_create(int argc, char **argv)
DIR *dirstream = NULL;
 
while (1) {
-   int c = getopt(argc, argv, "c:i:");
+   int c = getopt(argc, argv, "i:");
if (c < 0)
break;
 
switch (c) {
-   case 'c':
-   res = qgroup_inherit_add_copy(&inherit, optarg, 0);
-   if (res) {
-   retval = res;
-   goto out;
-   }
-   break;
case 'i':
res = qgroup_inherit_add_group(&inherit, optarg);
if (res) {
@@ -662,18 +655,11 @@ static int cmd_subvol_snapshot(int argc, char **argv)
 
memset(&args, 0, sizeof(args));
while (1) {
-   int c = getopt(argc, argv, "c:i:r");
+   int c = getopt(argc, argv, "i:r");
if (c < 0)
break;
 
switch (c) {
-   case 'c':
-   res = qgroup_inherit_add_copy(&inherit, optarg, 0);
-   if (res) {
-   retval = res;
-   goto out;
-   }
-   break;
case 'i':
res = qgroup_inherit_add_group(&inherit, optarg);
if (res) {
@@ -684,13 +670,6 @@ static int cmd_subvol_snapshot(int argc, char **argv)
case 'r':
readonly = 1;
break;
-   case 'x':
-   res = qgroup_inherit_add_copy(&inherit, optarg, 1);
-   if (res) {
-   retval = res;
-   goto out;
-   }
-   break;
default:
usage(cmd_subvol_snapshot_usage);
}
diff --git a/qgroup.c b/qgroup.c
index b5b893f4..b107a683 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -1324,45 +1324,3 @@ int qgroup_inherit_add_group(struct btrfs_qgroup_inherit **inherit, char *arg)
 
return 0;
 }
-
-int qgroup_inherit_add_copy(struct btrfs_qgroup_inherit **inherit, char *arg,
-   int type)
-{
-   int ret;
-   u64 qgroup_src;
-   u64 qgroup_dst;
-   char *p;
-   int pos = 0;
-
-   p = strchr(arg, ':');
-   if (!p) {
-bad:
-   error("invalid copy specification, missing separator :");
-   return -EINVAL;
-   }
-   *p = 0;
-   qgroup_src = parse_qgroupid(arg);
-   qgroup_dst = parse_qgroupid(p + 1);
-   *p = ':';
-
-   if (!qgroup_src || !qgroup_dst)
-   goto bad;
-
-   if (*inherit)
-   pos = (*inherit)->num_qgroups +
- (*inherit)->num_ref_copies * 2 * type;
-
-   ret = qgroup_inherit_realloc(inherit, 2, pos);
-   if (ret)
-   return ret;
-
-   (*inherit)->qgroups[pos++] = qgroup_src;
-   (*inherit)->qgroups[pos++] = qgroup_dst;
-
-   if (!type)
-   ++(*inherit)->num_ref_copies;
-   else
-   ++(*inherit)->num_excl_copies;
-
-   return 0;
-}
diff --git a/qgroup.h b/qgroup.h
index 875fbdf3..bb6610d7 100644
--- a/qgroup.h
+++ b/qgroup.h
@@ -92,7 +92,5 @@ int btrfs_qgroup_setup_comparer(struct btrfs_qgroup_comparer_set **comp_set,
int is_descending);
 int qgroup_inherit_size(struct btrfs_qgroup_inherit *p);
 int qgroup_inherit_add_group(struct btrfs_qgroup_inherit **inherit, char *arg);
-int qgroup_inherit_add_copy(struct btrfs_qgroup_inherit **inherit, char *arg,
-   int type);
 
 #endif
-- 
2.16.1



[PATCH v2 03/27] libbtrfsutil: add Python bindings

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

The C libbtrfsutil library isn't very useful for scripting, so we also
want bindings for Python. Writing unit tests in Python is also much
easier than doing so in C. Only Python 3 is supported; if someone really
wants Python 2 support, they can write their own bindings. This commit
is just the scaffolding.

Signed-off-by: Omar Sandoval 
---
 INSTALL   |   4 +
 Makefile  |  36 ++
 Makefile.inc.in   |   2 +
 configure.ac  |  15 +++
 libbtrfsutil/README.md|   5 +-
 libbtrfsutil/python/.gitignore|   7 ++
 libbtrfsutil/python/btrfsutilpy.h |  57 ++
 libbtrfsutil/python/error.c   | 202 ++
 libbtrfsutil/python/module.c  | 166 
 libbtrfsutil/python/setup.py  | 108 ++
 libbtrfsutil/python/tests/__init__.py |   0
 11 files changed, 601 insertions(+), 1 deletion(-)
 create mode 100644 libbtrfsutil/python/.gitignore
 create mode 100644 libbtrfsutil/python/btrfsutilpy.h
 create mode 100644 libbtrfsutil/python/error.c
 create mode 100644 libbtrfsutil/python/module.c
 create mode 100755 libbtrfsutil/python/setup.py
 create mode 100644 libbtrfsutil/python/tests/__init__.py

diff --git a/INSTALL b/INSTALL
index 819b92ea..24d6e24f 100644
--- a/INSTALL
+++ b/INSTALL
@@ -41,6 +41,10 @@ To build from the released tarballs:
 $ make
 $ make install
 
+To install the libbtrfsutil Python bindings:
+
+$ make install_python
+
 You may disable building some parts like documentation, btrfs-convert or
 backtrace support. See ./configure --help for more.
 
diff --git a/Makefile b/Makefile
index 7fb70d06..95ee9678 100644
--- a/Makefile
+++ b/Makefile
@@ -154,8 +154,10 @@ endif
 
 ifeq ($(BUILD_VERBOSE),1)
   Q =
+  SETUP_PY_Q =
 else
   Q = @
+  SETUP_PY_Q = -q
 endif
 
 ifeq ("$(origin D)", "command line")
@@ -302,6 +304,9 @@ endif
$($(subst -,_,btrfs-$(@:%/$(notdir $@)=%)-cflags))
 
 all: $(progs) $(libs) $(lib_links) $(BUILDDIRS)
+ifeq ($(PYTHON_BINDINGS),1)
+all: libbtrfsutil_python
+endif
 $(SUBDIRS): $(BUILDDIRS)
 $(BUILDDIRS):
@echo "Making all in $(patsubst build-%,%,$@)"
@@ -345,6 +350,16 @@ test-inst: all
 
 test: test-fsck test-mkfs test-convert test-misc test-fuzz test-cli
 
+ifeq ($(PYTHON_BINDINGS),1)
+test-libbtrfsutil: libbtrfsutil_python
+   $(Q)cd libbtrfsutil/python; \
+   LD_LIBRARY_PATH=../.. $(PYTHON) -m unittest discover -v tests
+
+.PHONY: test-libbtrfsutil
+
+test: test-libbtrfsutil
+endif
+
 #
 # NOTE: For static compiles, you need to have all the required libs
 #  static equivalent available
@@ -395,6 +410,15 @@ libbtrfsutil.so.$(libbtrfsutil_major) libbtrfsutil.so: libbtrfsutil.so.$(libbtrf
@echo "[LN] $@"
$(Q)$(LN_S) -f $< $@
 
+ifeq ($(PYTHON_BINDINGS),1)
+libbtrfsutil_python: libbtrfsutil.so libbtrfsutil/btrfsutil.h
+   @echo "[PY] libbtrfsutil"
+   $(Q)cd libbtrfsutil/python; \
+   CFLAGS= LDFLAGS= $(PYTHON) setup.py $(SETUP_PY_Q) build_ext -i build
+
+.PHONY: libbtrfsutil_python
+endif
+
 # keep intermediate files from the below implicit rules around
 .PRECIOUS: $(addsuffix .o,$(progs))
 
@@ -578,6 +602,10 @@ clean: $(CLEANDIRS)
  $(libs) $(lib_links) \
  $(progs_static) $(progs_extra) \
  libbtrfsutil/*.o libbtrfsutil/*.o.d
+ifeq ($(PYTHON_BINDINGS),1)
+   $(Q)cd libbtrfsutil/python; \
+   $(PYTHON) setup.py $(SETUP_PY_Q) clean -a
+endif
 
 clean-doc:
@echo "Cleaning Documentation"
@@ -613,6 +641,14 @@ ifneq ($(udevdir),)
$(INSTALL) -m644 $(udev_rules) $(DESTDIR)$(udevruledir)
 endif
 
+ifeq ($(PYTHON_BINDINGS),1)
+install_python: libbtrfsutil_python
+   $(Q)cd libbtrfsutil/python; \
+   $(PYTHON) setup.py install --skip-build $(if $(DESTDIR),--root $(DESTDIR)) --prefix $(prefix)
+
+.PHONY: install_python
+endif
+
 install-static: $(progs_static) $(INSTALLDIRS)
$(INSTALL) -m755 -d $(DESTDIR)$(bindir)
$(INSTALL) $(progs_static) $(DESTDIR)$(bindir)
diff --git a/Makefile.inc.in b/Makefile.inc.in
index b53bef80..159d38ed 100644
--- a/Makefile.inc.in
+++ b/Makefile.inc.in
@@ -14,6 +14,8 @@ DISABLE_BTRFSCONVERT = @DISABLE_BTRFSCONVERT@
 BTRFSCONVERT_EXT2 = @BTRFSCONVERT_EXT2@
 BTRFSCONVERT_REISERFS = @BTRFSCONVERT_REISERFS@
 BTRFSRESTORE_ZSTD = @BTRFSRESTORE_ZSTD@
+PYTHON_BINDINGS = @PYTHON_BINDINGS@
+PYTHON = @PYTHON@
 
 SUBST_CFLAGS = @CFLAGS@
 SUBST_LDFLAGS = @LDFLAGS@
diff --git a/configure.ac b/configure.ac
index 3afcdb47..50d36e4f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -195,6 +195,19 @@ fi
 AS_IF([test "x$enable_zstd" = xyes], [BTRFSRESTORE_ZSTD=1], [BTRFSRESTORE_ZSTD=0])
 AC_SUBST(BTRFSRESTORE_ZSTD)
 
+AC_ARG_ENABLE([python],
+   

[PATCH v2 07/27] libbtrfsutil: add btrfs_util_subvolume_path()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

We can just walk up root backrefs with BTRFS_IOC_TREE_SEARCH and inode
paths with BTRFS_IOC_INO_LOOKUP.
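
The backref walk can be sketched without any ioctls by using an in-memory table in place of the tree search results. Everything named demo_* here is hypothetical, and the sketch skips the BTRFS_IOC_INO_LOOKUP step that resolves the directory path inside each parent subvolume:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_TOP_LEVEL_ID 5 /* plays the role of BTRFS_FS_TREE_OBJECTID */

/* One "root backref": subvolume @id is called @name inside @parent_id. */
struct demo_backref {
	uint64_t id;
	uint64_t parent_id;
	const char *name;
};

static const struct demo_backref *demo_find(const struct demo_backref *refs,
					    size_t nrefs, uint64_t id)
{
	for (size_t i = 0; i < nrefs; i++) {
		if (refs[i].id == id)
			return &refs[i];
	}
	return NULL;
}

/*
 * Walk from @id up to the top level, prepending one name per hop.
 * Returns a malloc()ed path, or NULL with errno set.
 */
char *demo_subvolume_path(const struct demo_backref *refs, size_t nrefs,
			  uint64_t id)
{
	char buf[4096];
	size_t pos = sizeof(buf) - 1;
	size_t hops = 0;
	char *ret;

	assert(refs != NULL || nrefs == 0);
	buf[pos] = '\0';

	while (id != DEMO_TOP_LEVEL_ID) {
		const struct demo_backref *ref = demo_find(refs, nrefs, id);
		size_t len;

		/* Unknown ID, or more hops than entries (a cycle). */
		if (!ref || hops++ >= nrefs) {
			errno = ENOENT;
			return NULL;
		}
		len = strlen(ref->name);
		if (len + 1 > pos) {
			errno = ENAMETOOLONG;
			return NULL;
		}
		/* Prepend "name", plus a '/' if a component already follows. */
		if (buf[pos] != '\0')
			buf[--pos] = '/';
		pos -= len;
		memcpy(&buf[pos], ref->name, len);
		id = ref->parent_id;
	}

	ret = malloc(sizeof(buf) - pos);
	if (!ret) {
		errno = ENOMEM;
		return NULL;
	}
	memcpy(ret, &buf[pos], sizeof(buf) - pos); /* includes the NUL */
	return ret;
}
```

Filling the buffer from the end avoids having to reverse the components afterwards, since the walk discovers them leaf-first.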

Signed-off-by: Omar Sandoval 
---
 libbtrfsutil/btrfsutil.h|  21 +
 libbtrfsutil/python/btrfsutilpy.h   |   2 +-
 libbtrfsutil/python/module.c|   8 ++
 libbtrfsutil/python/subvolume.c |  30 +++
 libbtrfsutil/python/tests/test_subvolume.py |  31 +++
 libbtrfsutil/subvolume.c| 125 
 6 files changed, 216 insertions(+), 1 deletion(-)

diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 5d67f111..f96c9c4e 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -103,6 +103,27 @@ enum btrfs_util_error btrfs_util_subvolume_id(const char *path,
  */
 enum btrfs_util_error btrfs_util_subvolume_id_fd(int fd, uint64_t *id_ret);
 
+/**
+ * btrfs_util_subvolume_path() - Get the path of the subvolume with a given ID
+ * relative to the filesystem root.
+ * @path: Path on a Btrfs filesystem.
+ * @id: ID of subvolume to get the path of. If zero is given, the subvolume
+ * ID of @path is used.
+ * @path_ret: Returned path.
+ *
+ * This requires appropriate privilege (CAP_SYS_ADMIN).
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_subvolume_path(const char *path, uint64_t id,
+   char **path_ret);
+
+/**
+ * btrfs_util_subvolume_path_fd() - See btrfs_util_subvolume_path().
+ */
+enum btrfs_util_error btrfs_util_subvolume_path_fd(int fd, uint64_t id,
+  char **path_ret);
+
 struct btrfs_util_qgroup_inherit;
 
 /**
diff --git a/libbtrfsutil/python/btrfsutilpy.h b/libbtrfsutil/python/btrfsutilpy.h
index 87d47ae0..ffd62ba7 100644
--- a/libbtrfsutil/python/btrfsutilpy.h
+++ b/libbtrfsutil/python/btrfsutilpy.h
@@ -60,7 +60,7 @@ void SetFromBtrfsUtilErrorWithPaths(enum btrfs_util_error err,
 
 PyObject *is_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *subvolume_id(PyObject *self, PyObject *args, PyObject *kwds);
-PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
+PyObject *subvolume_path(PyObject *self, PyObject *args, PyObject *kwds);
 
 void add_module_constants(PyObject *m);
 
diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
index 69bba704..444516b1 100644
--- a/libbtrfsutil/python/module.c
+++ b/libbtrfsutil/python/module.c
@@ -144,6 +144,14 @@ static PyMethodDef btrfsutil_methods[] = {
 "Get the ID of the subvolume containing a file.\n\n"
 "Arguments:\n"
 "path -- string, bytes, path-like object, or open file descriptor"},
+   {"subvolume_path", (PyCFunction)subvolume_path,
+METH_VARARGS | METH_KEYWORDS,
+"subvolume_path(path, id=0) -> str\n\n"
+"Get the path of a subvolume relative to the filesystem root.\n\n"
+"Arguments:\n"
+"path -- string, bytes, path-like object, or open file descriptor\n"
+"id -- if not zero, instead of returning the subvolume path of the\n"
+"given path, return the path of the subvolume with this ID"},
{"create_subvolume", (PyCFunction)create_subvolume,
 METH_VARARGS | METH_KEYWORDS,
 "create_subvolume(path, async=False)\n\n"
diff --git a/libbtrfsutil/python/subvolume.c b/libbtrfsutil/python/subvolume.c
index 6f2080ee..6382d290 100644
--- a/libbtrfsutil/python/subvolume.c
+++ b/libbtrfsutil/python/subvolume.c
@@ -72,6 +72,36 @@ PyObject *subvolume_id(PyObject *self, PyObject *args, PyObject *kwds)
return PyLong_FromUnsignedLongLong(id);
 }
 
+PyObject *subvolume_path(PyObject *self, PyObject *args, PyObject *kwds)
+{
+   static char *keywords[] = {"path", "id", NULL};
+   struct path_arg path = {.allow_fd = true};
+   enum btrfs_util_error err;
+   uint64_t id = 0;
+   char *subvol_path;
+   PyObject *ret;
+
+   if (!PyArg_ParseTupleAndKeywords(args, kwds, "O&|K:subvolume_path",
+keywords, &path_converter, &path, &id))
+   return NULL;
+
+   if (path.path)
+   err = btrfs_util_subvolume_path(path.path, id, &subvol_path);
+   else
+   err = btrfs_util_subvolume_path_fd(path.fd, id, &subvol_path);
+   if (err) {
+   SetFromBtrfsUtilErrorWithPath(err, &path);
+   path_cleanup(&path);
+   return NULL;
+   }
+
+   path_cleanup(&path);
+
+   ret = PyUnicode_DecodeFSDefault(subvol_path);
+   free(subvol_path);
+   return ret;
+}
+
 PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds)
 {
static char *ke

[PATCH v2 02/27] Add libbtrfsutil

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Currently, users wishing to manage Btrfs filesystems programmatically
have to shell out to btrfs-progs and parse the output. This isn't ideal.
The goal of libbtrfsutil is to provide a library version of as many of
the operations of btrfs-progs as possible and to migrate btrfs-progs to
use it.

Rather than simply refactoring the existing btrfs-progs code, the code
has to be written from scratch for a couple of reasons:

* A lot of the btrfs-progs code was not designed with a nice library API
  in mind in terms of reusability, naming, and error reporting.
* libbtrfsutil is licensed under the LGPL, whereas btrfs-progs is under
  the GPL, which makes it dubious to directly copy or move the code.

Eventually, most of the low-level btrfs-progs code should either live in
libbtrfsutil or the shared kernel/userspace filesystem code, and
btrfs-progs will just be the CLI wrapper.

This first commit just includes the build system changes, license,
README, and error reporting helper.

Signed-off-by: Omar Sandoval 
---
 .gitignore|   2 +
 Makefile  |  72 ++--
 Makefile.inc.in   |   2 +-
 libbtrfsutil/COPYING  | 674 ++
 libbtrfsutil/COPYING.LESSER   | 165 ++
 libbtrfsutil/README.md|  35 ++
 libbtrfsutil/btrfsutil.h  |  76 +
 libbtrfsutil/btrfsutil_internal.h |  40 +++
 libbtrfsutil/errors.c |  55 
 9 files changed, 1100 insertions(+), 21 deletions(-)
 create mode 100644 libbtrfsutil/COPYING
 create mode 100644 libbtrfsutil/COPYING.LESSER
 create mode 100644 libbtrfsutil/README.md
 create mode 100644 libbtrfsutil/btrfsutil.h
 create mode 100644 libbtrfsutil/btrfsutil_internal.h
 create mode 100644 libbtrfsutil/errors.c

diff --git a/.gitignore b/.gitignore
index 8e607f6e..272d53e4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -43,6 +43,8 @@ libbtrfs.so.0.1
 library-test
 library-test-static
 /fssum
+/libbtrfsutil.so*
+/libbtrfsutil.a
 
 /tests/*-tests-results.txt
 /tests/test-console.txt
diff --git a/Makefile b/Makefile
index 00e21379..7fb70d06 100644
--- a/Makefile
+++ b/Makefile
@@ -73,10 +73,20 @@ CFLAGS = $(SUBST_CFLAGS) \
 -fPIC \
 -I$(TOPDIR) \
 -I$(TOPDIR)/kernel-lib \
+-I$(TOPDIR)/libbtrfsutil \
 $(EXTRAWARN_CFLAGS) \
 $(DEBUG_CFLAGS_INTERNAL) \
 $(EXTRA_CFLAGS)
 
+LIBBTRFSUTIL_CFLAGS = $(SUBST_CFLAGS) \
+ $(CSTD) \
+ -D_GNU_SOURCE \
+ -fPIC \
+ -fvisibility=hidden \
+ -I$(TOPDIR)/libbtrfsutil \
+ $(EXTRAWARN_CFLAGS) \
+ $(EXTRA_CFLAGS)
+
 LDFLAGS = $(SUBST_LDFLAGS) \
  -rdynamic -L$(TOPDIR) \
  $(DEBUG_LDFLAGS_INTERNAL) \
@@ -121,12 +131,17 @@ libbtrfs_headers = send-stream.h send-utils.h send.h kernel-lib/rbtree.h btrfs-l
   kernel-lib/crc32c.h kernel-lib/list.h kerncompat.h \
   kernel-lib/radix-tree.h kernel-lib/sizes.h kernel-lib/raid56.h \
   extent-cache.h extent_io.h ioctl.h ctree.h btrfsck.h version.h
+libbtrfsutil_major := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_MAJOR ([0-9])+$$/\1/p' libbtrfsutil/btrfsutil.h)
+libbtrfsutil_minor := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_MINOR ([0-9])+$$/\1/p' libbtrfsutil/btrfsutil.h)
+libbtrfsutil_patch := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_PATCH ([0-9])+$$/\1/p' libbtrfsutil/btrfsutil.h)
+libbtrfsutil_version := $(libbtrfsutil_major).$(libbtrfsutil_minor).$(libbtrfsutil_patch)
+libbtrfsutil_objects = libbtrfsutil/errors.o
 convert_objects = convert/main.o convert/common.o convert/source-fs.o \
  convert/source-ext2.o convert/source-reiserfs.o
 mkfs_objects = mkfs/main.o mkfs/common.o mkfs/rootdir.o
 image_objects = image/main.o image/sanitize.o
 all_objects = $(objects) $(cmds_objects) $(libbtrfs_objects) $(convert_objects) \
- $(mkfs_objects) $(image_objects)
+ $(mkfs_objects) $(image_objects) $(libbtrfsutil_objects)
 
 udev_rules = 64-btrfs-dm.rules
 
@@ -246,11 +261,10 @@ static_convert_objects = $(patsubst %.o, %.static.o, $(convert_objects))
 static_mkfs_objects = $(patsubst %.o, %.static.o, $(mkfs_objects))
 static_image_objects = $(patsubst %.o, %.static.o, $(image_objects))
 
-libs_shared = libbtrfs.so.0.1
-libs_static = libbtrfs.a
+libs_shared = libbtrfs.so.0.1 libbtrfsutil.so.$(libbtrfsutil_version)
+libs_static = libbtrfs.a libbtrfsutil.a
 libs = $(libs_shared) $(libs_static)
-lib_links = libbtrfs.so.0 libbtrfs.so
-headers = $(libbtrfs_headers)
+lib_links = libbtrfs.so.0 libbtrfs.so libbtrfsutil.so.$(libbtrfsutil_major) libbtrfsutil.so
 
 # make C=1 to enable sparse
 ifdef C
@@ -287,7 +301,7 @@ endif
$(Q)$(CC) $(STATIC_CFLAGS) -c $< -o $@ $($(subst 
-,_,$(@:%.static.o=%)-cflags)) \
$($(subst 

[PATCH v2 21/27] btrfs-progs: use libbtrfsutil for subvol delete

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Most of the interesting part of this command is the commit mode, so this
only saves a little bit of code.

Signed-off-by: Omar Sandoval 
---
 cmds-subvolume.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index ea436ba0..68768914 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -225,7 +225,6 @@ static int cmd_subvol_delete(int argc, char **argv)
int res, ret = 0;
int cnt;
int fd = -1;
-   struct btrfs_ioctl_vol_args args;
char*dname, *vname, *cpath;
char*dupdname = NULL;
char*dupvname = NULL;
@@ -237,6 +236,7 @@ static int cmd_subvol_delete(int argc, char **argv)
char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
struct seen_fsid *seen_fsid_hash[SEEN_FSID_HASH_SIZE] = { NULL, };
enum { COMMIT_AFTER = 1, COMMIT_EACH = 2 };
+   enum btrfs_util_error err;
 
while (1) {
int c;
@@ -280,14 +280,9 @@ static int cmd_subvol_delete(int argc, char **argv)
 again:
path = argv[cnt];
 
-   res = test_issubvolume(path);
-   if (res < 0) {
-   error("cannot access subvolume %s: %s", path, strerror(-res));
-   ret = 1;
-   goto out;
-   }
-   if (!res) {
-   error("not a subvolume: %s", path);
+   err = btrfs_util_is_subvolume(path);
+   if (err) {
+   error_btrfs_util(err);
ret = 1;
goto out;
}
@@ -313,11 +308,10 @@ again:
printf("Delete subvolume (%s): '%s/%s'\n",
commit_mode == COMMIT_EACH || (commit_mode == COMMIT_AFTER && cnt + 1 == argc)
? "commit" : "no-commit", dname, vname);
-   memset(&args, 0, sizeof(args));
-   strncpy_null(args.name, vname);
-   res = ioctl(fd, BTRFS_IOC_SNAP_DESTROY, &args);
-   if(res < 0 ){
-   error("cannot delete '%s/%s': %m", dname, vname);
+
+   err = btrfs_util_delete_subvolume_fd(fd, vname, 0);
+   if (err) {
+   error_btrfs_util(err);
ret = 1;
goto out;
}
-- 
2.16.1
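
The conversion above relies on one calling convention: every helper
returns an error enum, zero means success, and the caller turns any
non-zero value into a message. A minimal sketch of that pattern follows.
All demo_* names and the message strings are hypothetical stand-ins, not
libbtrfsutil's real enum or btrfs-progs' error_btrfs_util().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes; the real enum btrfs_util_error has more members. */
enum demo_error {
	DEMO_OK,
	DEMO_ERROR_NOT_BTRFS,
	DEMO_ERROR_NOT_SUBVOLUME,
	DEMO_ERROR_COUNT, /* number of codes, not an error itself */
};

static const char * const demo_error_strings[] = {
	[DEMO_OK] = "Success",
	[DEMO_ERROR_NOT_BTRFS] = "Not a Btrfs filesystem",
	[DEMO_ERROR_NOT_SUBVOLUME] = "Not a Btrfs subvolume",
};

/* Map a code to a message; out-of-range values get a fallback string. */
const char *demo_strerror(enum demo_error err)
{
	if ((unsigned int)err >= DEMO_ERROR_COUNT)
		return "Unknown error";
	return demo_error_strings[err];
}

/*
 * Caller-side pattern from the diff: zero is success, anything else is
 * reported and turned into exit status 1.
 */
int demo_check(enum demo_error err, const char **msg_ret)
{
	assert(msg_ret != NULL);
	if (err) {
		*msg_ret = demo_strerror(err);
		return 1;
	}
	*msg_ret = NULL;
	return 0;
}
```

A caller would pass the helper's return value straight to demo_check()
and print *msg_ret when the result is non-zero.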

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 19/27] btrfs-progs: use libbtrfsutil for get-default

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

The only thing of note here is the "top level" column. This used to mean
something else, but for a long time it has been equal to the parent ID.
I preserved this for backwards compatibility.

Signed-off-by: Omar Sandoval 
---
 cmds-subvolume.c | 54 ++
 1 file changed, 26 insertions(+), 28 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 700e822c..42cc30ce 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -14,6 +14,7 @@
  * Boston, MA 021110-1307, USA.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -785,31 +786,25 @@ static const char * const cmd_subvol_get_default_usage[] = {
 static int cmd_subvol_get_default(int argc, char **argv)
 {
int fd = -1;
-   int ret;
-   char *subvol;
-   struct btrfs_list_filter_set *filter_set;
-   u64 default_id;
+   int ret = 1;
+   uint64_t default_id;
DIR *dirstream = NULL;
+   enum btrfs_util_error err;
+   struct btrfs_util_subvolume_info subvol;
+   char *path;
 
clean_args_no_options(argc, argv, cmd_subvol_get_default_usage);
 
if (check_argc_exact(argc - optind, 1))
usage(cmd_subvol_get_default_usage);
 
-   subvol = argv[1];
-   fd = btrfs_open_dir(subvol, &dirstream, 1);
+   fd = btrfs_open_dir(argv[1], &dirstream, 1);
if (fd < 0)
return 1;
 
-   ret = btrfs_list_get_default_subvolume(fd, &default_id);
-   if (ret) {
-   error("failed to look up default subvolume: %m");
-   goto out;
-   }
-
-   ret = 1;
-   if (default_id == 0) {
-   error("'default' dir item not found");
+   err = btrfs_util_get_default_subvolume_fd(fd, &default_id);
+   if (err) {
+   error_btrfs_util(err);
goto out;
}
 
@@ -820,24 +815,27 @@ static int cmd_subvol_get_default(int argc, char **argv)
goto out;
}
 
-   filter_set = btrfs_list_alloc_filter_set();
-   btrfs_list_setup_filter(&filter_set, BTRFS_LIST_FILTER_ROOTID,
-   default_id);
+   err = btrfs_util_subvolume_info_fd(fd, default_id, &subvol);
+   if (err) {
+   error_btrfs_util(err);
+   goto out;
+   }
 
-   /* by default we shall print the following columns*/
-   btrfs_list_setup_print_column(BTRFS_LIST_OBJECTID);
-   btrfs_list_setup_print_column(BTRFS_LIST_GENERATION);
-   btrfs_list_setup_print_column(BTRFS_LIST_TOP_LEVEL);
-   btrfs_list_setup_print_column(BTRFS_LIST_PATH);
+   err = btrfs_util_subvolume_path_fd(fd, default_id, &path);
+   if (err) {
+   error_btrfs_util(err);
+   goto out;
+   }
 
-   ret = btrfs_list_subvols_print(fd, filter_set, NULL,
-   BTRFS_LIST_LAYOUT_DEFAULT, 1, NULL);
+   printf("ID %" PRIu64 " gen %" PRIu64 " top level %" PRIu64 " path %s\n",
+  subvol.id, subvol.generation, subvol.parent_id, path);
 
-   if (filter_set)
-   free(filter_set);
+   free(path);
+
+   ret = 0;
 out:
close_file_or_dir(fd, dirstream);
-   return !!ret;
+   return ret;
 }
 
 static const char * const cmd_subvol_set_default_usage[] = {
-- 
2.16.1
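
The new printf() in this patch prints uint64_t fields with the PRIu64
macro from <inttypes.h> instead of casting to unsigned long long. A
small sketch of the same output line, written into a buffer so it can be
checked. struct demo_subvol_info and demo_format_default() are
hypothetical; only the format string is taken from the diff.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the three fields the patch reads. */
struct demo_subvol_info {
	uint64_t id;
	uint64_t generation;
	uint64_t parent_id;
};

/*
 * Format the "get-default" line. Returns what snprintf() returns: the
 * length the full line needs, or a negative value on error.
 */
int demo_format_default(char *buf, size_t size,
			const struct demo_subvol_info *info, const char *path)
{
	assert(buf != NULL && info != NULL && path != NULL);
	return snprintf(buf, size,
			"ID %" PRIu64 " gen %" PRIu64 " top level %" PRIu64 " path %s\n",
			info->id, info->generation, info->parent_id, path);
}
```

Note that "top level" is filled from parent_id, matching the commit
message's remark that the two have long been equal.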



[PATCH v2 25/27] btrfs-progs: add recursive snapshot/delete using libbtrfsutil

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

And update the documentation.

Signed-off-by: Omar Sandoval 
---
 Documentation/btrfs-subvolume.asciidoc | 14 --
 cmds-subvolume.c   | 25 +
 2 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/Documentation/btrfs-subvolume.asciidoc b/Documentation/btrfs-subvolume.asciidoc
index a8c4af4b..7a2ec8d2 100644
--- a/Documentation/btrfs-subvolume.asciidoc
+++ b/Documentation/btrfs-subvolume.asciidoc
@@ -81,6 +81,12 @@ wait for transaction commit at the end of the operation
 +
 -C|--commit-each
 wait for transaction commit after deleting each subvolume
++
+-R|--recursive
+delete subvolumes beneath each subvolume recursively
++
+-v|--verbose
+output more verbosely
 
 *find-new*  ::
 List the recently modified files in a subvolume, after  ID.
@@ -157,7 +163,7 @@ The id can be obtained from *btrfs subvolume list*, *btrfs subvolume show* or
 *show* ::
 Show information of a given subvolume in the .
 
-*snapshot* [-r]  |[/]::
+*snapshot* [-r|-R]  |[/]::
 Create a snapshot of the subvolume  with the
 name  in the  directory.
 +
@@ -167,7 +173,11 @@ If  is not a subvolume, btrfs returns an error.
 `Options`
 +
 -r
-Make the new snapshot read only.
+make the new snapshot read-only
++
+-R
+recursively snapshot subvolumes beneath the source; this option cannot be
+combined with -r
 
 *sync*  [subvolid...]::
 Wait until given subvolume(s) are completely removed from the filesystem after
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 43875c96..101ba4ca 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -187,6 +187,7 @@ static const char * const cmd_subvol_delete_usage[] = {
"",
"-c|--commit-after  wait for transaction commit at the end of the operation",
"-C|--commit-each   wait for transaction commit after deleting each subvolume",
+   "-R|--recursive delete subvolumes beneath each subvolume recursively",
"-v|--verbose   verbose output of operations",
NULL
 };
@@ -203,6 +204,7 @@ static int cmd_subvol_delete(int argc, char **argv)
DIR *dirstream = NULL;
int verbose = 0;
int commit_mode = 0;
+   int flags = 0;
u8 fsid[BTRFS_FSID_SIZE];
char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
struct seen_fsid *seen_fsid_hash[SEEN_FSID_HASH_SIZE] = { NULL, };
@@ -214,11 +216,12 @@ static int cmd_subvol_delete(int argc, char **argv)
static const struct option long_options[] = {
{"commit-after", no_argument, NULL, 'c'},
{"commit-each", no_argument, NULL, 'C'},
+   {"recursive", no_argument, NULL, 'R'},
{"verbose", no_argument, NULL, 'v'},
{NULL, 0, NULL, 0}
};
 
-   c = getopt_long(argc, argv, "cCv", long_options, NULL);
+   c = getopt_long(argc, argv, "cCRv", long_options, NULL);
if (c < 0)
break;
 
@@ -229,6 +232,9 @@ static int cmd_subvol_delete(int argc, char **argv)
case 'C':
commit_mode = COMMIT_EACH;
break;
+   case 'R':
+   flags |= BTRFS_UTIL_DELETE_SUBVOLUME_RECURSIVE;
+   break;
case 'v':
verbose++;
break;
@@ -280,7 +286,7 @@ again:
commit_mode == COMMIT_EACH || (commit_mode == COMMIT_AFTER && cnt + 1 == argc)
? "commit" : "no-commit", dname, vname);
 
-   err = btrfs_util_delete_subvolume_fd(fd, vname, 0);
+   err = btrfs_util_delete_subvolume_fd(fd, vname, flags);
if (err) {
error_btrfs_util(err);
ret = 1;
@@ -569,13 +575,15 @@ out:
 }
 
 static const char * const cmd_subvol_snapshot_usage[] = {
-   "btrfs subvolume snapshot [-r] [-i ]  |[/]",
+   "btrfs subvolume snapshot [-r|-R] [-i ]  |[/]",
"Create a snapshot of the subvolume",
"Create a writable/readonly snapshot of the subvolume  with",
"the name  in the  directory.  If only  is given,",
"the subvolume will be named the basename of .",
"",
"-r create a readonly snapshot",
+   "-R recursively snapshot subvolumes beneath the source; this",
+   "   option cannot be combined with -r",
"-i   add the newly created snapshot to a qgroup. This",
"   option can be given multiple times.",
NULL
@@ -589,7 +597,7 @@
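
The delete hunks above add -R by extending the getopt_long() option
string and OR-ing a flag into an int that is later handed to the delete
helper. A self-contained sketch of that parsing step follows.
DEMO_DELETE_RECURSIVE, struct demo_delete_opts and demo_parse_delete()
are hypothetical names, and resetting optind to 0 is assumed to restart
scanning, which holds on glibc and musl.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

#define DEMO_DELETE_RECURSIVE (1 << 0) /* hypothetical flag value */

struct demo_delete_opts {
	int flags;
	int verbose;
	int first_arg; /* index of the first non-option argument */
};

/* Returns 0 on success, -1 on an unrecognized option. */
int demo_parse_delete(int argc, char **argv, struct demo_delete_opts *opts)
{
	static const struct option long_options[] = {
		{"recursive", no_argument, NULL, 'R'},
		{"verbose", no_argument, NULL, 'v'},
		{NULL, 0, NULL, 0}
	};
	int c;

	assert(opts != NULL);
	opts->flags = 0;
	opts->verbose = 0;
	opts->first_arg = 0;
	optind = 0; /* restart scanning (glibc/musl behavior, assumed here) */
	opterr = 0; /* keep getopt from printing its own diagnostics */

	while ((c = getopt_long(argc, argv, "Rv", long_options, NULL)) != -1) {
		switch (c) {
		case 'R':
			opts->flags |= DEMO_DELETE_RECURSIVE;
			break;
		case 'v':
			opts->verbose++;
			break;
		default:
			return -1;
		}
	}
	opts->first_arg = optind;
	return 0;
}
```

Accumulating into a flags word keeps the later call site unchanged when
more options are added: it always passes the same variable.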

[PATCH v2 23/27] btrfs-progs: use libbtrfsutil for subvol sync

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

btrfs_util_deleted_subvolumes_fd() replaces enumerate_dead_subvols() and
btrfs_util_subvolume_info_fd() replaces is_subvolume_cleaned().
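
The waiting logic this patch keeps is a polling loop: look up each
pending ID, zero out the ones that have disappeared, and repeat until
none are left. A sketch under stated assumptions: the lookup is passed
in as a callback so the example runs without a filesystem, and every
demo_* name, including the fake lookup, is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes standing in for enum btrfs_util_error values. */
enum demo_status { DEMO_FOUND, DEMO_NOT_FOUND, DEMO_FAILED };

/* Hypothetical callback playing the role of the subvolume info lookup. */
typedef enum demo_status (*demo_lookup_fn)(uint64_t id, void *ctx);

/*
 * One pass over the ID list. Entries whose lookup reports "not found"
 * are zeroed. Returns 1 if nothing is pending afterwards, 0 if some
 * subvolumes remain, -1 if a lookup failed.
 */
int demo_scan_once(size_t count, uint64_t *ids, demo_lookup_fn lookup,
		   void *ctx)
{
	int clean = 1;
	size_t i;

	assert(ids != NULL && lookup != NULL);
	for (i = 0; i < count; i++) {
		if (!ids[i])
			continue; /* already gone */
		switch (lookup(ids[i], ctx)) {
		case DEMO_NOT_FOUND:
			ids[i] = 0;
			break;
		case DEMO_FOUND:
			clean = 0;
			break;
		default:
			return -1;
		}
	}
	return clean;
}

/* Repeat passes until everything is cleaned or max_passes is used up. */
int demo_wait(size_t count, uint64_t *ids, demo_lookup_fn lookup, void *ctx,
	      int max_passes)
{
	int pass;

	for (pass = 0; pass < max_passes; pass++) {
		int ret = demo_scan_once(count, ids, lookup, ctx);

		if (ret != 0)
			return ret;
		/* the real command sleeps for sleep_interval seconds here */
	}
	return 0;
}

/* Fake lookup for demonstration: IDs below the threshold in *ctx are gone. */
enum demo_status demo_fake_lookup(uint64_t id, void *ctx)
{
	const uint64_t *threshold = ctx;

	return id < *threshold ? DEMO_NOT_FOUND : DEMO_FOUND;
}
```

Unlike the real loop, demo_wait() is bounded by max_passes so that the
sketch always terminates.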

Signed-off-by: Omar Sandoval 
---
 cmds-subvolume.c | 217 ++-
 1 file changed, 21 insertions(+), 196 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 49c9c8cf..9bab9312 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -42,38 +42,11 @@
 #include "utils.h"
 #include "help.h"
 
-static int is_subvolume_cleaned(int fd, u64 subvolid)
+static int wait_for_subvolume_cleaning(int fd, size_t count, uint64_t *ids,
+  int sleep_interval)
 {
-   int ret;
-   struct btrfs_ioctl_search_args args;
-   struct btrfs_ioctl_search_key *sk = &args.key;
-
-   sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
-   sk->min_objectid = subvolid;
-   sk->max_objectid = subvolid;
-   sk->min_type = BTRFS_ROOT_ITEM_KEY;
-   sk->max_type = BTRFS_ROOT_ITEM_KEY;
-   sk->min_offset = 0;
-   sk->max_offset = (u64)-1;
-   sk->min_transid = 0;
-   sk->max_transid = (u64)-1;
-   sk->nr_items = 1;
-
-   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
-   if (ret < 0)
-   return -errno;
-
-   if (sk->nr_items == 0)
-   return 1;
-
-   return 0;
-}
-
-static int wait_for_subvolume_cleaning(int fd, int count, u64 *ids,
-   int sleep_interval)
-{
-   int ret;
-   int i;
+   size_t i;
+   enum btrfs_util_error err;
 
while (1) {
int clean = 1;
@@ -81,16 +54,14 @@ static int wait_for_subvolume_cleaning(int fd, int count, u64 *ids,
for (i = 0; i < count; i++) {
if (!ids[i])
continue;
-   ret = is_subvolume_cleaned(fd, ids[i]);
-   if (ret < 0) {
-   error(
-   "cannot read status of dead subvolume %llu: %s",
-   (unsigned long long)ids[i], strerror(-ret));
-   return ret;
-   }
-   if (ret) {
-   printf("Subvolume id %llu is gone\n", ids[i]);
+   err = btrfs_util_subvolume_info_fd(fd, ids[i], NULL);
+   if (err == BTRFS_UTIL_ERROR_SUBVOLUME_NOT_FOUND) {
+   printf("Subvolume id %" PRIu64 " is gone\n",
+  ids[i]);
ids[i] = 0;
+   } else if (err) {
+   error_btrfs_util(err);
+   return -errno;
} else {
clean = 0;
}
@@ -1028,160 +999,15 @@ static const char * const cmd_subvol_sync_usage[] = {
NULL
 };
 
-#if 0
-/*
- * If we're looking for any dead subvolume, take a shortcut and look
- * for any ORPHAN_ITEMs in the tree root
- */
-static int fs_has_dead_subvolumes(int fd)
-{
-   int ret;
-   struct btrfs_ioctl_search_args args;
-   struct btrfs_ioctl_search_key *sk = &args.key;
-   struct btrfs_ioctl_search_header sh;
-   u64 min_subvolid = 0;
-
-again:
-   sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
-   sk->min_objectid = BTRFS_ORPHAN_OBJECTID;
-   sk->max_objectid = BTRFS_ORPHAN_OBJECTID;
-   sk->min_type = BTRFS_ORPHAN_ITEM_KEY;
-   sk->max_type = BTRFS_ORPHAN_ITEM_KEY;
-   sk->min_offset = min_subvolid;
-   sk->max_offset = (u64)-1;
-   sk->min_transid = 0;
-   sk->max_transid = (u64)-1;
-   sk->nr_items = 1;
-
-   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
-   if (ret < 0)
-   return -errno;
-
-   if (!sk->nr_items)
-   return 0;
-
-   memcpy(&sh, args.buf, sizeof(sh));
-   min_subvolid = sh.offset;
-
-   /*
-* Verify that the root item is really there and we haven't hit
-* a stale orphan
-*/
-   sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
-   sk->min_objectid = min_subvolid;
-   sk->max_objectid = min_subvolid;
-   sk->min_type = BTRFS_ROOT_ITEM_KEY;
-   sk->max_type = BTRFS_ROOT_ITEM_KEY;
-   sk->min_offset = 0;
-   sk->max_offset = (u64)-1;
-   sk->min_transid = 0;
-   sk->max_transid = (u64)-1;
-   sk->nr_items = 1;
-
-   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
-   if (ret < 0)
-   return -errno;
-
-   /*
-* Stale orphan, try the next one
-*/
-   if (!sk->nr_items) {
-   min_subvolid++;
-   goto again;
-  

[PATCH v2 27/27] btrfs-progs: deprecate libbtrfs helpers

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

The old libbtrfs defines some helpers which do the same thing as some
libbtrfsutil helpers. Update btrfs-progs to use the libbtrfsutil APIs
and mark the libbtrfs versions as deprecated so that they can eventually
be removed.
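
The mechanism is the GCC/Clang deprecated attribute plus a diagnostic
pragma for code that still has to reference the old symbols. A minimal
sketch, assuming a GCC-compatible compiler; the demo_* functions are
hypothetical and only illustrate the attribute and pragma placement.

```c
#include <assert.h>

/* Hypothetical replacement API. */
int demo_new_lookup(int key);

/* Old entry point kept for compatibility; new callers get a warning. */
int demo_old_lookup(int key)
	__attribute__((deprecated("use demo_new_lookup()")));

int demo_new_lookup(int key)
{
	assert(key >= 0);
	return key * 2;
}

/*
 * Code in the same file still calls the deprecated function, so the
 * warning is silenced for this region only.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"

int demo_old_lookup(int key)
{
	return demo_new_lookup(key);
}

int demo_legacy_caller(int key)
{
	return demo_old_lookup(key) + 1;
}

#pragma GCC diagnostic pop
```

The patch applies the pragma to a whole file; the push/pop pair here is
a narrower variant that restores the warning afterwards.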

Signed-off-by: Omar Sandoval 
---
 btrfs-list.c |  6 ++
 btrfs-list.h | 15 ++-
 cmds-inspect.c   | 10 ++
 cmds-receive.c   | 13 +
 cmds-subvolume.c | 10 +++---
 send-utils.c | 25 +
 utils.h  |  1 -
 7 files changed, 55 insertions(+), 25 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index a2fdb3f9..56aa2455 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -34,6 +34,12 @@
 #include "btrfs-list.h"
 #include "rbtree-utils.h"
 
+/*
+ * The deprecated functions in this file depend on each other, so silence the
+ * warnings.
+ */
+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
+
 /* we store all the roots we find in an rbtree so that we can
  * search for them later.
  */
diff --git a/btrfs-list.h b/btrfs-list.h
index 54e1888f..774bcece 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -82,10 +82,15 @@ struct root_info {
 };
 
 int btrfs_list_find_updated_files(int fd, u64 root_id, u64 oldest_gen);
-int btrfs_list_get_default_subvolume(int fd, u64 *default_id);
-char *btrfs_list_path_for_root(int fd, u64 root);
-int btrfs_list_get_path_rootid(int fd, u64 *treeid);
-int btrfs_get_subvol(int fd, struct root_info *the_ri);
-int btrfs_get_toplevel_subvol(int fd, struct root_info *the_ri);
+int btrfs_list_get_default_subvolume(int fd, u64 *default_id)
+   __attribute__((deprecated("use btrfs_util_get_default_subvolume_fd()")));
+char *btrfs_list_path_for_root(int fd, u64 root)
+   __attribute__((deprecated("use btrfs_util_subvolume_path_fd()")));
+int btrfs_list_get_path_rootid(int fd, u64 *treeid)
+   __attribute__((deprecated("use btrfs_util_subvolume_id_fd()")));
+int btrfs_get_subvol(int fd, struct root_info *the_ri)
+   __attribute__((deprecated("use btrfs_util_subvolume_info_fd() and btrfs_util_subvolume_path_fd()")));
+int btrfs_get_toplevel_subvol(int fd, struct root_info *the_ri)
+   __attribute__((deprecated("use btrfs_util_subvolume_info_fd() and btrfs_util_subvolume_path_fd()")));
 
 #endif
diff --git a/cmds-inspect.c b/cmds-inspect.c
index afd7fe48..77585a23 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -23,6 +23,8 @@
 #include 
 #include 
 
+#include 
+
 #include "kerncompat.h"
 #include "ioctl.h"
 #include "utils.h"
@@ -30,7 +32,6 @@
 #include "send-utils.h"
 #include "disk-io.h"
 #include "commands.h"
-#include "btrfs-list.h"
 #include "help.h"
 
 static const char * const inspect_cmd_group_usage[] = {
@@ -147,6 +148,7 @@ static int cmd_inspect_logical_resolve(int argc, char **argv)
char full_path[PATH_MAX];
char *path_ptr;
DIR *dirstream = NULL;
+   enum btrfs_util_error err;
 
while (1) {
int c = getopt(argc, argv, "Pvs:");
@@ -219,9 +221,9 @@ static int cmd_inspect_logical_resolve(int argc, char **argv)
DIR *dirs = NULL;
 
if (getpath) {
-   name = btrfs_list_path_for_root(fd, root);
-   if (IS_ERR(name)) {
-   ret = PTR_ERR(name);
+   err = btrfs_util_subvolume_path_fd(fd, root, &name);
+   if (err) {
+   ret = -errno;
goto out;
}
if (!name) {
diff --git a/cmds-receive.c b/cmds-receive.c
index 68123a31..cd880ee7 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -39,12 +39,13 @@
 #include 
 #include 
 
+#include 
+
 #include "ctree.h"
 #include "ioctl.h"
 #include "commands.h"
 #include "utils.h"
 #include "list.h"
-#include "btrfs-list.h"
 
 #include "send.h"
 #include "send-stream.h"
@@ -1086,12 +1087,13 @@ static struct btrfs_send_ops send_ops = {
 static int do_receive(struct btrfs_receive *rctx, const char *tomnt,
  char *realmnt, int r_fd, u64 max_errors)
 {
-   u64 subvol_id;
+   uint64_t subvol_id;
int ret;
char *dest_dir_full_path;
char root_subvol_path[PATH_MAX];
int end = 0;
int iterations = 0;
+   enum btrfs_util_error err;
 
dest_dir_full_path = realpath(tomnt, NULL);
if (!dest_dir_full_path) {
@@ -1136,9 +1138,12 @@ static int do_receive(struct btrfs_receive *rctx, const char *tomnt,
 * subvolume we're sitting in so that we can adjust the paths of any
 * subvols we want to receive in.
 */
-   ret = btrfs_list_get_path_rootid(rctx-&

[PATCH v2 26/27] btrfs-progs: use libbtrfsutil for subvolume list

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

This is the most involved conversion because it replaces the libbtrfs
implementation entirely. I took care to preserve the existing behavior
in all cases, warts and all.

This also moves the list printing code from libbtrfs to
cmds-subvolume.c. This is an ABI break, but it avoids adding a
libbtrfsutil dependency to libbtrfs. A search online didn't turn up any
projects which use this interface.

Signed-off-by: Omar Sandoval 
---
 btrfs-list.c | 806 +-
 btrfs-list.h |  89 --
 cmds-subvolume.c | 948 +++
 3 files changed, 956 insertions(+), 887 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index e01c5899..a2fdb3f9 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -34,9 +34,6 @@
 #include "btrfs-list.h"
 #include "rbtree-utils.h"
 
-#define BTRFS_LIST_NFILTERS_INCREASE   (2 * BTRFS_LIST_FILTER_MAX)
-#define BTRFS_LIST_NCOMPS_INCREASE (2 * BTRFS_LIST_COMP_MAX)
-
 /* we store all the roots we find in an rbtree so that we can
  * search for them later.
  */
@@ -44,285 +41,14 @@ struct root_lookup {
struct rb_root root;
 };
 
-static struct {
-   char*name;
-   char*column_name;
-   int need_print;
-} btrfs_list_columns[] = {
-   {
-   .name   = "ID",
-   .column_name= "ID",
-   .need_print = 0,
-   },
-   {
-   .name   = "gen",
-   .column_name= "Gen",
-   .need_print = 0,
-   },
-   {
-   .name   = "cgen",
-   .column_name= "CGen",
-   .need_print = 0,
-   },
-   {
-   .name   = "parent",
-   .column_name= "Parent",
-   .need_print = 0,
-   },
-   {
-   .name   = "top level",
-   .column_name= "Top Level",
-   .need_print = 0,
-   },
-   {
-   .name   = "otime",
-   .column_name= "OTime",
-   .need_print = 0,
-   },
-   {
-   .name   = "parent_uuid",
-   .column_name= "Parent UUID",
-   .need_print = 0,
-   },
-   {
-   .name   = "received_uuid",
-   .column_name= "Received UUID",
-   .need_print = 0,
-   },
-   {
-   .name   = "uuid",
-   .column_name= "UUID",
-   .need_print = 0,
-   },
-   {
-   .name   = "path",
-   .column_name= "Path",
-   .need_print = 0,
-   },
-   {
-   .name   = NULL,
-   .column_name= NULL,
-   .need_print = 0,
-   },
-};
-
-static btrfs_list_filter_func all_filter_funcs[];
-static btrfs_list_comp_func all_comp_funcs[];
-
-void btrfs_list_setup_print_column(enum btrfs_list_column_enum column)
+static int comp_rootid(struct root_info *entry1, struct root_info *entry2)
 {
-   int i;
-
-   ASSERT(0 <= column && column <= BTRFS_LIST_ALL);
-
-   if (column < BTRFS_LIST_ALL) {
-   btrfs_list_columns[column].need_print = 1;
-   return;
-   }
-
-   for (i = 0; i < BTRFS_LIST_ALL; i++)
-   btrfs_list_columns[i].need_print = 1;
-}
-
-static int comp_entry_with_rootid(struct root_info *entry1,
- struct root_info *entry2,
- int is_descending)
-{
-   int ret;
-
if (entry1->root_id > entry2->root_id)
-   ret = 1;
+   return 1;
else if (entry1->root_id < entry2->root_id)
-   ret = -1;
-   else
-   ret = 0;
-
-   return is_descending ? -ret : ret;
-}
-
-static int comp_entry_with_gen(struct root_info *entry1,
-  struct root_info *entry2,
-  int is_descending)
-{
-   int ret;
-
-   if (entry1->gen > entry2->gen)
-   ret = 1;
-   else if (entry1->gen < entry2->gen)
-   ret = -1;
-   else
-   ret = 0;
-
-   return is_descending ? -ret : ret;
-}
-
-static int comp_entry_with_ogen(struct root_info *entry1,
-   struct root_info *entry2,
-   int is_descending)
-{
-   int ret;
-
-   if (entry1->ogen > entry2->ogen)
-   ret = 1;
-   else if (entry1->ogen < entry2->ogen)
-   ret = -1;
+   

[PATCH v2 24/27] btrfs-progs: replace test_issubvolume() with btrfs_util_is_subvolume()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

This gets the remaining occurrences that weren't covered by previous
conversions.

Signed-off-by: Omar Sandoval 
---
 cmds-qgroup.c| 11 ---
 cmds-subvolume.c | 10 +++---
 utils.c  | 34 +-
 utils.h  |  1 -
 4 files changed, 12 insertions(+), 44 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index f9a52fa8..f733565e 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -425,6 +425,7 @@ static int cmd_qgroup_limit(int argc, char **argv)
int compressed = 0;
int exclusive = 0;
DIR *dirstream = NULL;
+   enum btrfs_util_error err;
 
while (1) {
int c = getopt(argc, argv, "ce");
@@ -465,13 +466,9 @@ static int cmd_qgroup_limit(int argc, char **argv)
if (argc - optind == 2) {
args.qgroupid = 0;
path = argv[optind + 1];
-   ret = test_issubvolume(path);
-   if (ret < 0) {
-   error("cannot access '%s': %s", path, strerror(-ret));
-   return 1;
-   }
-   if (!ret) {
-   error("'%s' is not a subvolume", path);
+   err = btrfs_util_is_subvolume(path);
+   if (err) {
+   error_btrfs_util(err);
return 1;
}
/*
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 9bab9312..43875c96 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -756,13 +756,9 @@ static int cmd_subvol_find_new(int argc, char **argv)
subvol = argv[optind];
last_gen = arg_strtou64(argv[optind + 1]);
 
-   ret = test_issubvolume(subvol);
-   if (ret < 0) {
-   error("cannot access subvolume %s: %s", subvol, strerror(-ret));
-   return 1;
-   }
-   if (!ret) {
-   error("not a subvolume: %s", subvol);
+   err = btrfs_util_is_subvolume(subvol);
+   if (err) {
+   error_btrfs_util(err);
return 1;
}
 
diff --git a/utils.c b/utils.c
index 6e6f295f..9be91437 100644
--- a/utils.c
+++ b/utils.c
@@ -40,6 +40,8 @@
 #include 
 #include 
 
+#include 
+
 #include "kerncompat.h"
 #include "radix-tree.h"
 #include "ctree.h"
@@ -1453,6 +1455,7 @@ u64 parse_qgroupid(const char *p)
char *s = strchr(p, '/');
const char *ptr_src_end = p + strlen(p);
char *ptr_parse_end = NULL;
+   enum btrfs_util_error err;
u64 level;
u64 id;
int fd;
@@ -1480,8 +1483,8 @@ u64 parse_qgroupid(const char *p)
 
 path:
/* Path format like subv at 'my_subvol' is the fallback case */
-   ret = test_issubvolume(p);
-   if (ret < 0 || !ret)
+   err = btrfs_util_is_subvolume(p);
+   if (err)
goto err;
fd = open(p, O_RDONLY);
if (fd < 0)
@@ -2451,33 +2454,6 @@ int test_issubvolname(const char *name)
strcmp(name, ".") && strcmp(name, "..");
 }
 
-/*
- * Test if path is a subvolume
- * Returns:
- *   0 - path exists but it is not a subvolume
- *   1 - path exists and it is  a subvolume
- * < 0 - error
- */
-int test_issubvolume(const char *path)
-{
-   struct stat st;
-   struct statfs stfs;
-   int res;
-
-   res = stat(path, &st);
-   if (res < 0)
-   return -errno;
-
-   if (st.st_ino != BTRFS_FIRST_FREE_OBJECTID || !S_ISDIR(st.st_mode))
-   return 0;
-
-   res = statfs(path, &stfs);
-   if (res < 0)
-   return -errno;
-
-   return (int)stfs.f_type == BTRFS_SUPER_MAGIC;
-}
-
 const char *subvol_strip_mountpoint(const char *mnt, const char *full_path)
 {
int len = strlen(mnt);
diff --git a/utils.h b/utils.h
index eb460e9b..403de481 100644
--- a/utils.h
+++ b/utils.h
@@ -149,7 +149,6 @@ u64 disk_size(const char *path);
 u64 get_partition_size(const char *dev);
 
 int test_issubvolname(const char *name);
-int test_issubvolume(const char *path);
 int test_isdir(const char *path);
 
 const char *subvol_strip_mountpoint(const char *mnt, const char *full_path);
-- 
2.16.1



[PATCH v2 22/27] btrfs-progs: use libbtrfsutil for subvol show

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Now implemented with btrfs_util_subvolume_path(),
btrfs_util_subvolume_info(), and subvolume iterators.
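
The by-UUID lookup is a linear search over an iterator: fetch the next
subvolume, stop with "not found" when the iterator is exhausted, and
stop with success on the first UUID match. A sketch of that loop over an
in-memory array follows. The iterator and every demo_* name are
hypothetical; the real iterator walks the filesystem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_UUID_SIZE 16

struct demo_subvol {
	uint64_t id;
	uint8_t uuid[DEMO_UUID_SIZE];
};

enum demo_iter_status { DEMO_ITER_OK, DEMO_ITER_STOP };

/* Hypothetical in-memory iterator over a fixed array of subvolumes. */
struct demo_iter {
	const struct demo_subvol *items;
	size_t count;
	size_t pos;
};

enum demo_iter_status demo_iter_next(struct demo_iter *it,
				     struct demo_subvol *ret)
{
	assert(it != NULL && ret != NULL);
	if (it->pos >= it->count)
		return DEMO_ITER_STOP;
	*ret = it->items[it->pos++];
	return DEMO_ITER_OK;
}

/*
 * Returns 0 and fills *found on a match, -1 if the iterator runs out
 * before any UUID matches.
 */
int demo_find_by_uuid(struct demo_iter *it, const uint8_t *uuid,
		      struct demo_subvol *found)
{
	for (;;) {
		if (demo_iter_next(it, found) == DEMO_ITER_STOP)
			return -1;
		if (memcmp(found->uuid, uuid, DEMO_UUID_SIZE) == 0)
			return 0;
	}
}
```

In the patch the iterator also hands back an allocated path on each
step, which is why the loop there frees subvol_path before continuing.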

Signed-off-by: Omar Sandoval 
---
 cmds-subvolume.c | 149 ---
 utils.c  | 118 ---
 utils.h  |   5 --
 3 files changed, 98 insertions(+), 174 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 68768914..49c9c8cf 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -824,19 +824,20 @@ static const char * const cmd_subvol_show_usage[] = {
 
 static int cmd_subvol_show(int argc, char **argv)
 {
-   struct root_info get_ri;
-   struct btrfs_list_filter_set *filter_set = NULL;
char tstr[256];
char uuidparse[BTRFS_UUID_UNPARSED_SIZE];
char *fullpath = NULL;
-   char raw_prefix[] = "\t\t\t\t";
int fd = -1;
int ret = 1;
DIR *dirstream1 = NULL;
int by_rootid = 0;
int by_uuid = 0;
-   u64 rootid_arg;
+   u64 rootid_arg = 0;
u8 uuid_arg[BTRFS_UUID_SIZE];
+   struct btrfs_util_subvolume_iterator *iter;
+   struct btrfs_util_subvolume_info subvol;
+   char *subvol_path = NULL;
+   enum btrfs_util_error err;
 
while (1) {
int c;
@@ -873,96 +874,142 @@ static int cmd_subvol_show(int argc, char **argv)
usage(cmd_subvol_show_usage);
}
 
-   memset(&get_ri, 0, sizeof(get_ri));
fullpath = realpath(argv[optind], NULL);
if (!fullpath) {
error("cannot find real path for '%s': %m", argv[optind]);
goto out;
}
 
-   if (by_rootid) {
-   ret = get_subvol_info_by_rootid(fullpath, &get_ri, rootid_arg);
-   } else if (by_uuid) {
-   ret = get_subvol_info_by_uuid(fullpath, &get_ri, uuid_arg);
-   } else {
-   ret = get_subvol_info(fullpath, &get_ri);
+   fd = open_file_or_dir(fullpath, &dirstream1);
+   if (fd < 0) {
+   error("can't access '%s'", fullpath);
+   goto out;
}
 
-   if (ret) {
-   if (ret < 0) {
-   error("Failed to get subvol info %s: %s",
-   fullpath, strerror(-ret));
-   } else {
-   error("Failed to get subvol info %s: %d",
-   fullpath, ret);
+   if (by_uuid) {
+   err = btrfs_util_create_subvolume_iterator_fd(fd,
+ BTRFS_FS_TREE_OBJECTID,
+ 0, &iter);
+   if (err) {
+   error_btrfs_util(err);
+   goto out;
+   }
+
+   for (;;) {
+   err = btrfs_util_subvolume_iterator_next_info(iter,
+ &subvol_path,
+ &subvol);
+   if (err == BTRFS_UTIL_ERROR_STOP_ITERATION) {
+   uuid_unparse(uuid_arg, uuidparse);
+   error("can't find uuid '%s' on '%s'", uuidparse,
+ fullpath);
+   btrfs_util_destroy_subvolume_iterator(iter);
+   goto out;
+   } else if (err) {
+   error_btrfs_util(err);
+   btrfs_util_destroy_subvolume_iterator(iter);
+   goto out;
+   }
+
+   if (uuid_compare(subvol.uuid, uuid_arg) == 0)
+   break;
+
+   free(subvol_path);
+   }
+   btrfs_util_destroy_subvolume_iterator(iter);
+   } else {
+   /*
+* If !by_rootid, rootid_arg = 0, which means find the
+* subvolume ID of the fd and use that.
+*/
+   err = btrfs_util_subvolume_info_fd(fd, rootid_arg, &subvol);
+   if (err) {
+   error_btrfs_util(err);
+   goto out;
+   }
+
+   err = btrfs_util_subvolume_path_fd(fd, subvol.id, &subvol_path);
+   if (err) {
+   error_btrfs_util(err);
+   goto out;
}
-   return ret;
+
}
 
/* print the info */
-   printf("%s\n", get_ri.full_path);
-   printf("\tName: \t\t\t%s\n", get_ri.name);
+   printf("%s\n", subvol.id == BTRFS_FS_TREE_OBJECTID ? "/" : subvol

[PATCH v2 18/27] btrfs-progs: use libbtrfsutil for set-default

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Signed-off-by: Omar Sandoval 
---
 cmds-subvolume.c | 43 ---
 1 file changed, 8 insertions(+), 35 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 6006a278..700e822c 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -851,11 +851,9 @@ static const char * const cmd_subvol_set_default_usage[] = {
 
 static int cmd_subvol_set_default(int argc, char **argv)
 {
-   int ret=0, fd;
-   u64 objectid;
-   char*path;
-   char*subvolid;
-   DIR *dirstream = NULL;
+   u64 objectid;
+   char *path;
+   enum btrfs_util_error err;
 
clean_args_no_options(argc, argv, cmd_subvol_set_default_usage);
 
@@ -865,42 +863,17 @@ static int cmd_subvol_set_default(int argc, char **argv)
 
if (argc - optind == 1) {
/* path to the subvolume is specified */
+   objectid = 0;
path = argv[optind];
-
-   ret = test_issubvolume(path);
-   if (ret < 0) {
-   error("stat error: %s", strerror(-ret));
-   return 1;
-   } else if (!ret) {
-   error("'%s' is not a subvolume", path);
-   return 1;
-   }
-
-   fd = btrfs_open_dir(path, &dirstream, 1);
-   if (fd < 0)
-   return 1;
-
-   ret = lookup_path_rootid(fd, &objectid);
-   if (ret) {
-   error("unable to get subvol id: %s", strerror(-ret));
-   close_file_or_dir(fd, dirstream);
-   return 1;
-   }
} else {
/* subvol id and path to the filesystem are specified */
-   subvolid = argv[optind];
+   objectid = arg_strtou64(argv[optind]);
path = argv[optind + 1];
-   objectid = arg_strtou64(subvolid);
-
-   fd = btrfs_open_dir(path, &dirstream, 1);
-   if (fd < 0)
-   return 1;
}
 
-   ret = ioctl(fd, BTRFS_IOC_DEFAULT_SUBVOL, &objectid);
-   close_file_or_dir(fd, dirstream);
-   if (ret < 0) {
-   error("unable to set a new default subvolume: %m");
+   err = btrfs_util_set_default_subvolume(path, objectid);
+   if (err) {
+   error_btrfs_util(err);
return 1;
}
return 0;
-- 
2.16.1
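
After this patch the command only has to turn its positional arguments
into a (path, objectid) pair: with one argument the ID is 0, so the
subvolume containing the path is used, and with two the first is parsed
as the ID. A sketch of that step follows. demo_resolve_set_default() is
a hypothetical name, and the decimal-only parsing is a simplification
that is not meant to match arg_strtou64().

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * One argument:  <path>            -> ID 0 (use the subvolume at path)
 * Two arguments: <subvolid> <path> -> ID parsed from the first argument
 * Returns 0 on success, -1 on a bad argument count or unparsable ID.
 */
int demo_resolve_set_default(int nargs, char **args, const char **path_ret,
			     uint64_t *id_ret)
{
	unsigned long long id;
	char *end;

	assert(args != NULL && path_ret != NULL && id_ret != NULL);

	if (nargs == 1) {
		*id_ret = 0;
		*path_ret = args[0];
		return 0;
	}
	if (nargs != 2)
		return -1;

	/* Reject empty strings, signs and leading whitespace up front. */
	if (!isdigit((unsigned char)args[0][0]))
		return -1;

	errno = 0;
	id = strtoull(args[0], &end, 10);
	if (errno || *end != '\0')
		return -1;

	*id_ret = id;
	*path_ret = args[1];
	return 0;
}
```

The resulting pair would then go to a single set-default call, which is
what lets the patch drop the open/lookup/ioctl sequence.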



[PATCH v2 04/27] libbtrfsutil: add btrfs_util_is_subvolume() and btrfs_util_subvolume_id()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

These are the most trivial helpers in the library and will be used to
implement several of the more involved functions.

Signed-off-by: Omar Sandoval 
---
 Makefile|   2 +-
 libbtrfsutil/btrfsutil.h|  33 
 libbtrfsutil/python/btrfsutilpy.h   |   3 +
 libbtrfsutil/python/module.c|  12 +++
 libbtrfsutil/python/setup.py|   1 +
 libbtrfsutil/python/subvolume.c |  73 
 libbtrfsutil/python/tests/__init__.py   |  66 +++
 libbtrfsutil/python/tests/test_subvolume.py |  57 +
 libbtrfsutil/subvolume.c| 127 
 9 files changed, 373 insertions(+), 1 deletion(-)
 create mode 100644 libbtrfsutil/python/subvolume.c
 create mode 100644 libbtrfsutil/python/tests/test_subvolume.py
 create mode 100644 libbtrfsutil/subvolume.c

diff --git a/Makefile b/Makefile
index 95ee9678..58b3eca4 100644
--- a/Makefile
+++ b/Makefile
@@ -135,7 +135,7 @@ libbtrfsutil_major := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_MAJOR ([0-
 libbtrfsutil_minor := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_MINOR ([0-9])+$$/\1/p' libbtrfsutil/btrfsutil.h)
 libbtrfsutil_patch := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_PATCH ([0-9])+$$/\1/p' libbtrfsutil/btrfsutil.h)
 libbtrfsutil_version := $(libbtrfsutil_major).$(libbtrfsutil_minor).$(libbtrfsutil_patch)
-libbtrfsutil_objects = libbtrfsutil/errors.o
+libbtrfsutil_objects = libbtrfsutil/errors.o libbtrfsutil/subvolume.o
 convert_objects = convert/main.o convert/common.o convert/source-fs.o \
  convert/source-ext2.o convert/source-reiserfs.o
 mkfs_objects = mkfs/main.o mkfs/common.o mkfs/rootdir.o
diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 867418f2..ca36f695 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -20,6 +20,8 @@
 #ifndef BTRFS_UTIL_H
 #define BTRFS_UTIL_H
 
+#include 
+
 #define BTRFS_UTIL_VERSION_MAJOR 1
 #define BTRFS_UTIL_VERSION_MINOR 0
 #define BTRFS_UTIL_VERSION_PATCH 0
@@ -69,6 +71,37 @@ enum btrfs_util_error {
  */
 const char *btrfs_util_strerror(enum btrfs_util_error err);
 
+/**
+ * btrfs_util_is_subvolume() - Return whether a given path is a Btrfs subvolume.
+ * @path: Path to check.
+ *
+ * Return: %BTRFS_UTIL_OK if @path is a Btrfs subvolume,
+ * %BTRFS_UTIL_ERROR_NOT_BTRFS if @path is not on a Btrfs filesystem,
+ * %BTRFS_UTIL_ERROR_NOT_SUBVOLUME if @path is not a subvolume, non-zero error
+ * code on any other failure.
+ */
+enum btrfs_util_error btrfs_util_is_subvolume(const char *path);
+
+/**
+ * btrfs_util_is_subvolume_fd() - See btrfs_util_is_subvolume().
+ */
+enum btrfs_util_error btrfs_util_is_subvolume_fd(int fd);
+
+/**
+ * btrfs_util_subvolume_id() - Get the ID of the subvolume containing a path.
+ * @path: Path on a Btrfs filesystem.
+ * @id_ret: Returned subvolume ID.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_subvolume_id(const char *path,
+ uint64_t *id_ret);
+
+/**
+ * btrfs_util_subvolume_id_fd() - See btrfs_util_subvolume_id().
+ */
+enum btrfs_util_error btrfs_util_subvolume_id_fd(int fd, uint64_t *id_ret);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/libbtrfsutil/python/btrfsutilpy.h b/libbtrfsutil/python/btrfsutilpy.h
index 6d82f7e1..9a04fda7 100644
--- a/libbtrfsutil/python/btrfsutilpy.h
+++ b/libbtrfsutil/python/btrfsutilpy.h
@@ -52,6 +52,9 @@ void SetFromBtrfsUtilErrorWithPaths(enum btrfs_util_error err,
struct path_arg *path1,
struct path_arg *path2);
 
+PyObject *is_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
+PyObject *subvolume_id(PyObject *self, PyObject *args, PyObject *kwds);
+
 void add_module_constants(PyObject *m);
 
 #endif /* BTRFSUTILPY_H */
diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
index d7398808..d492cbc7 100644
--- a/libbtrfsutil/python/module.c
+++ b/libbtrfsutil/python/module.c
@@ -132,6 +132,18 @@ void path_cleanup(struct path_arg *path)
 }
 
 static PyMethodDef btrfsutil_methods[] = {
+   {"is_subvolume", (PyCFunction)is_subvolume,
+METH_VARARGS | METH_KEYWORDS,
+"is_subvolume(path) -> bool\n\n"
+"Get whether a file is a subvolume.\n\n"
+"Arguments:\n"
+"path -- string, bytes, path-like object, or open file descriptor"},
+   {"subvolume_id", (PyCFunction)subvolume_id,
+METH_VARARGS | METH_KEYWORDS,
+"subvolume_id(path) -> int\n\n"
+"Get the ID of the subvolume containing a file.\n\n"
+"Arguments:\n"
+"path -- string, bytes, path-like object, or open file descriptor"},
{}

[PATCH v2 20/27] btrfs-progs: use libbtrfsutil for subvol create and snapshot

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

These become trivial calls to the libbtrfsutil helpers, and we can get
rid of the qgroup inherit code in progs.
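
A note on why the qgroup_inherit_add_group() wrapper in this patch passes a pointer to the inherit pointer. The following is only a minimal sketch, assuming the specifier is a heap-allocated struct with a flexible array that grows via realloc(). The names sketch_inherit, sketch_inherit_create(), and sketch_inherit_add_group() are hypothetical stand-ins for the opaque libbtrfsutil type and its helpers; the real layout is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque struct btrfs_util_qgroup_inherit. */
struct sketch_inherit {
	size_t nr_groups;
	uint64_t groups[]; /* flexible array member, grown by realloc() */
};

/* Allocate an empty specifier. Returns 0 on success, -1 on failure. */
static int sketch_inherit_create(struct sketch_inherit **ret)
{
	struct sketch_inherit *inherit = calloc(1, sizeof(*inherit));

	if (!inherit)
		return -1;
	*ret = inherit;
	return 0;
}

/*
 * Append a qgroup ID. realloc() may move the allocation, which is why the
 * caller hands over the address of its pointer rather than the pointer.
 */
static int sketch_inherit_add_group(struct sketch_inherit **inherit,
				    uint64_t qgroupid)
{
	struct sketch_inherit *tmp;
	size_t n;

	assert(inherit && *inherit);
	if (qgroupid == 0)
		return -1; /* mirrors the "qgroupid must not be 0" check */

	n = (*inherit)->nr_groups;
	tmp = realloc(*inherit, sizeof(*tmp) + (n + 1) * sizeof(tmp->groups[0]));
	if (!tmp)
		return -1; /* the old allocation is still valid here */

	tmp->groups[n] = qgroupid;
	tmp->nr_groups = n + 1;
	*inherit = tmp;
	return 0;
}
```

A caller would create the specifier once, add one group per -i option, and free() it when done. On failure the old pointer is left untouched, so cleanup stays the same on every path.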

Signed-off-by: Omar Sandoval 
---
 cmds-subvolume.c | 225 ++-
 qgroup.c |  64 
 qgroup.h |   2 -
 3 files changed, 58 insertions(+), 233 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 42cc30ce..ea436ba0 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -103,6 +103,35 @@ static int wait_for_subvolume_cleaning(int fd, int count, u64 *ids,
return 0;
 }
 
+static int qgroup_inherit_add_group(struct btrfs_util_qgroup_inherit **inherit,
+   const char *arg)
+{
+   enum btrfs_util_error err;
+   u64 qgroupid;
+
+   if (!*inherit) {
+   err = btrfs_util_create_qgroup_inherit(0, inherit);
+   if (err) {
+   error_btrfs_util(err);
+   return -1;
+   }
+   }
+
+   qgroupid = parse_qgroupid(arg);
+   if (qgroupid == 0) {
+   error("invalid qgroup specification, qgroupid must not be 0");
+   return -1;
+   }
+
+   err = btrfs_util_qgroup_inherit_add_group(inherit, qgroupid);
+   if (err) {
+   error_btrfs_util(err);
+   return -1;
+   }
+
+   return 0;
+}
+
 static const char * const subvolume_cmd_group_usage[] = {
"btrfs subvolume  ",
NULL
@@ -121,15 +150,9 @@ static const char * const cmd_subvol_create_usage[] = {
 
 static int cmd_subvol_create(int argc, char **argv)
 {
-   int retval, res, len;
-   int fddst = -1;
-   char*dupname = NULL;
-   char*dupdir = NULL;
-   char*newname;
-   char*dstdir;
-   char*dst;
-   struct btrfs_qgroup_inherit *inherit = NULL;
-   DIR *dirstream = NULL;
+   struct btrfs_util_qgroup_inherit *inherit = NULL;
+   enum btrfs_util_error err;
+   int retval = 1;
 
while (1) {
int c = getopt(argc, argv, "i:");
@@ -138,11 +161,8 @@ static int cmd_subvol_create(int argc, char **argv)
 
switch (c) {
case 'i':
-   res = qgroup_inherit_add_group(&inherit, optarg);
-   if (res) {
-   retval = res;
+   if (qgroup_inherit_add_group(&inherit, optarg) == -1)
goto out;
-   }
break;
default:
usage(cmd_subvol_create_usage);
@@ -152,70 +172,18 @@ static int cmd_subvol_create(int argc, char **argv)
if (check_argc_exact(argc - optind, 1))
usage(cmd_subvol_create_usage);
 
-   dst = argv[optind];
-
-   retval = 1; /* failure */
-   res = test_isdir(dst);
-   if (res < 0 && res != -ENOENT) {
-   error("cannot access %s: %s", dst, strerror(-res));
-   goto out;
-   }
-   if (res >= 0) {
-   error("target path already exists: %s", dst);
-   goto out;
-   }
-
-   dupname = strdup(dst);
-   newname = basename(dupname);
-   dupdir = strdup(dst);
-   dstdir = dirname(dupdir);
-
-   if (!test_issubvolname(newname)) {
-   error("invalid subvolume name: %s", newname);
-   goto out;
-   }
-
-   len = strlen(newname);
-   if (len == 0 || len >= BTRFS_VOL_NAME_MAX) {
-   error("subvolume name too long: %s", newname);
-   goto out;
-   }
-
-   fddst = btrfs_open_dir(dstdir, &dirstream, 1);
-   if (fddst < 0)
-   goto out;
-
-   printf("Create subvolume '%s/%s'\n", dstdir, newname);
-   if (inherit) {
-   struct btrfs_ioctl_vol_args_v2  args;
-
-   memset(&args, 0, sizeof(args));
-   strncpy_null(args.name, newname);
-   args.flags |= BTRFS_SUBVOL_QGROUP_INHERIT;
-   args.size = qgroup_inherit_size(inherit);
-   args.qgroup_inherit = inherit;
-
-   res = ioctl(fddst, BTRFS_IOC_SUBVOL_CREATE_V2, &args);
-   } else {
-   struct btrfs_ioctl_vol_args args;
-
-   memset(&args, 0, sizeof(args));
-   strncpy_null(args.name, newname);
-
-   res = ioctl(fddst, BTRFS_IOC_SUBVOL_CREATE, &args);
-   }
+   printf("Create subvolume '%s'\n", argv[optind]);
 
-   if (res < 0) {
-   error("cannot create subvolume: %m");
+   err = btrfs_util_create_subvolume(argv[optind], 0, NULL, inherit);
+   if (err) {
+   error_btrfs_util(err);
goto out;
 

[PATCH v2 14/27] libbtrfsutil: add btrfs_util_deleted_subvolumes()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Signed-off-by: Omar Sandoval 
---
 libbtrfsutil/btrfsutil.h| 21 +++
 libbtrfsutil/python/btrfsutilpy.h   |  3 +
 libbtrfsutil/python/module.c| 30 ++
 libbtrfsutil/python/qgroup.c| 17 +-
 libbtrfsutil/python/subvolume.c | 30 ++
 libbtrfsutil/python/tests/test_subvolume.py |  8 +++
 libbtrfsutil/subvolume.c| 89 +
 7 files changed, 183 insertions(+), 15 deletions(-)

diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 00c86174..677ab3c1 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -534,6 +534,27 @@ enum btrfs_util_error btrfs_util_subvolume_iterator_next_info(struct btrfs_util_
  char **path_ret,
  struct btrfs_util_subvolume_info *subvol);
 
+/**
+ * btrfs_util_deleted_subvolumes() - Get a list of subvolumes which have been
+ * deleted but not yet cleaned up.
+ * @path: Path on a Btrfs filesystem.
+ * @ids: Returned array of subvolume IDs.
+ * @n: Returned number of IDs in the @ids array.
+ *
+ * This requires appropriate privilege (CAP_SYS_ADMIN).
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_deleted_subvolumes(const char *path,
+   uint64_t **ids,
+   size_t *n);
+
+/**
+ * btrfs_util_deleted_subvolumes_fd() - See btrfs_util_deleted_subvolumes().
+ */
+enum btrfs_util_error btrfs_util_deleted_subvolumes_fd(int fd, uint64_t **ids,
+  size_t *n);
+
 /**
  * btrfs_util_create_qgroup_inherit() - Create a qgroup inheritance specifier
  * for btrfs_util_create_subvolume() or btrfs_util_create_snapshot().
diff --git a/libbtrfsutil/python/btrfsutilpy.h b/libbtrfsutil/python/btrfsutilpy.h
index b3ec047f..be5122e2 100644
--- a/libbtrfsutil/python/btrfsutilpy.h
+++ b/libbtrfsutil/python/btrfsutilpy.h
@@ -54,6 +54,8 @@ struct path_arg {
 int path_converter(PyObject *o, void *p);
 void path_cleanup(struct path_arg *path);
 
+PyObject *list_from_uint64_array(const uint64_t *arr, size_t n);
+
 void SetFromBtrfsUtilError(enum btrfs_util_error err);
 void SetFromBtrfsUtilErrorWithPath(enum btrfs_util_error err,
   struct path_arg *path);
@@ -72,6 +74,7 @@ PyObject *set_default_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *create_snapshot(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *delete_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
+PyObject *deleted_subvolumes(PyObject *self, PyObject *args, PyObject *kwds);
 
 void add_module_constants(PyObject *m);
 
diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
index e995a1be..eaa062ac 100644
--- a/libbtrfsutil/python/module.c
+++ b/libbtrfsutil/python/module.c
@@ -125,6 +125,29 @@ err:
return 0;
 }
 
+PyObject *list_from_uint64_array(const uint64_t *arr, size_t n)
+{
+PyObject *ret;
+size_t i;
+
+ret = PyList_New(n);
+if (!ret)
+   return NULL;
+
+for (i = 0; i < n; i++) {
+   PyObject *tmp;
+
+   tmp = PyLong_FromUnsignedLongLong(arr[i]);
+   if (!tmp) {
+   Py_DECREF(ret);
+   return NULL;
+   }
+   PyList_SET_ITEM(ret, i, tmp);
+}
+
+return ret;
+}
+
 void path_cleanup(struct path_arg *path)
 {
Py_CLEAR(path->object);
@@ -214,6 +237,13 @@ static PyMethodDef btrfsutil_methods[] = {
 "path -- string, bytes, or path-like object\n"
 "recursive -- if the given subvolume has child subvolumes, delete\n"
 "them instead of failing"},
+   {"deleted_subvolumes", (PyCFunction)deleted_subvolumes,
+METH_VARARGS | METH_KEYWORDS,
+"deleted_subvolumes(path)\n\n"
+"Get the list of subvolume IDs which have been deleted but not yet\n"
+"cleaned up\n\n"
+"Arguments:\n"
+"path -- string, bytes, path-like object, or open file descriptor"},
{},
 };
 
diff --git a/libbtrfsutil/python/qgroup.c b/libbtrfsutil/python/qgroup.c
index 69716d92..44ac5ebc 100644
--- a/libbtrfsutil/python/qgroup.c
+++ b/libbtrfsutil/python/qgroup.c
@@ -55,25 +55,12 @@ static PyObject *QgroupInherit_getattro(QgroupInherit *self, PyObject *nameobj)
 }
 
 if (strcmp(name, "groups") == 0) {
-   PyObject *ret, *tmp;
const uint64_t *arr;
-   size_t n, i;
+   size_t n;
 
btrfs_util_qgroup_inherit_get_groups(self->inherit, &arr, &n);
- 

[PATCH v2 11/27] libbtrfsutil: add subvolume iterator helpers

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

This is how we can implement stuff like `btrfs subvol list`. Rather than
producing the entire list upfront, the iterator approach uses less
memory in the common case where the whole list is not stored (O(max
subvolume path length)). It supports both pre-order traversal (useful
for, e.g., recursive snapshot) and post-order traversal (useful for
recursive delete).
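
To make the two orders concrete, here is a rough sketch over an in-memory tree. It is not the iterator implementation from this patch: struct node, emit(), walk(), SKETCH_POST_ORDER, and the sample tree are hypothetical, and the real iterator walks the filesystem rather than a static array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a subvolume tree. */
struct node {
	const char *path;
	const struct node *children;
	size_t nr_children;
};

/* Plays the role of BTRFS_UTIL_SUBVOLUME_ITERATOR_POST_ORDER. */
#define SKETCH_POST_ORDER (1 << 0)

/* Append "path\n" to the NUL-terminated buffer out of size outsz. */
static void emit(const char *path, char *out, size_t outsz)
{
	size_t used = strlen(out);

	assert(used < outsz);
	snprintf(out + used, outsz - used, "%s\n", path);
}

/* Visit everything beneath (but not including) top, in the chosen order. */
static void walk(const struct node *top, int flags, char *out, size_t outsz)
{
	size_t i;

	for (i = 0; i < top->nr_children; i++) {
		const struct node *child = &top->children[i];

		if (!(flags & SKETCH_POST_ORDER))
			emit(child->path, out, outsz); /* parent first */
		walk(child, flags, out, outsz);
		if (flags & SKETCH_POST_ORDER)
			emit(child->path, out, outsz); /* children first */
	}
}

/* Sample tree: top contains foo (which contains foo/bar) and baz. */
static const struct node foo_children[] = {
	{ "foo/bar", NULL, 0 },
};
static const struct node top_children[] = {
	{ "foo", foo_children, 1 },
	{ "baz", NULL, 0 },
};
static const struct node sketch_top = { "", top_children, 2 };
```

With the default order, foo is seen before foo/bar, so a recursive snapshot can create the parent before descending. With the post-order flag, foo/bar is seen before foo, so a recursive delete never tries to remove a subvolume that still has children.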

Signed-off-by: Omar Sandoval 
---
 libbtrfsutil/btrfsutil.h|  97 +
 libbtrfsutil/python/btrfsutilpy.h   |   1 +
 libbtrfsutil/python/module.c|   8 +
 libbtrfsutil/python/subvolume.c | 211 ++
 libbtrfsutil/python/tests/test_subvolume.py |  56 +
 libbtrfsutil/subvolume.c| 317 
 6 files changed, 690 insertions(+)

diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 54777f1d..7d3a87f6 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -345,6 +345,103 @@ enum btrfs_util_error btrfs_util_create_subvolume_fd(int parent_fd,
 uint64_t *async_transid,
 struct btrfs_util_qgroup_inherit *qgroup_inherit);
 
+struct btrfs_util_subvolume_iterator;
+
+/**
+ * BTRFS_UTIL_SUBVOLUME_ITERATOR_POST_ORDER - Iterate post-order. The default
+ * behavior is pre-order, e.g., foo will be yielded before foo/bar. If this flag
+ * is specified, foo/bar will be yielded before foo.
+ */
+#define BTRFS_UTIL_SUBVOLUME_ITERATOR_POST_ORDER (1 << 0)
+#define BTRFS_UTIL_SUBVOLUME_ITERATOR_MASK ((1 << 1) - 1)
+
+/**
+ * btrfs_util_create_subvolume_iterator() - Create an iterator over subvolumes
+ * in a Btrfs filesystem.
+ * @path: Path in a Btrfs filesystem. This may be any path in the filesystem; it
+ * does not have to refer to a subvolume unless @top is zero.
+ * @top: List subvolumes beneath (but not including) the subvolume with this ID.
+ * If zero is given, the subvolume ID of @path is used. To list all subvolumes,
+ * pass %BTRFS_FS_TREE_OBJECTID (i.e., 5). The returned paths are relative to
+ * the subvolume with this ID.
+ * @flags: Bitmask of BTRFS_UTIL_SUBVOLUME_ITERATOR_* flags.
+ * @ret: Returned iterator.
+ *
+ * The returned iterator must be freed with
+ * btrfs_util_destroy_subvolume_iterator().
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_create_subvolume_iterator(const char *path,
+  uint64_t top,
+  int flags,
+  struct btrfs_util_subvolume_iterator **ret);
+
+/**
+ * btrfs_util_create_subvolume_iterator_fd() - See
+ * btrfs_util_create_subvolume_iterator().
+ */
+enum btrfs_util_error btrfs_util_create_subvolume_iterator_fd(int fd,
+ uint64_t top,
+ int flags,
+ struct btrfs_util_subvolume_iterator **ret);
+
+/**
+ * btrfs_util_destroy_subvolume_iterator() - Destroy a subvolume iterator
+ * previously created by btrfs_util_create_subvolume_iterator().
+ * @iter: Iterator to destroy.
+ */
+void btrfs_util_destroy_subvolume_iterator(struct btrfs_util_subvolume_iterator *iter);
+
+/**
+ * btrfs_util_subvolume_iterator_fd() - Get the file descriptor associated with
+ * a subvolume iterator.
+ * @iter: Iterator to get.
+ *
+ * This can be used to get the file descriptor opened by
+ * btrfs_util_create_subvolume_iterator() in order to use it for other
+ * functions.
+ *
+ * Return: File descriptor.
+ */
+int btrfs_util_subvolume_iterator_fd(const struct btrfs_util_subvolume_iterator *iter);
+
+/**
+ * btrfs_util_subvolume_iterator_next() - Get the next subvolume from a
+ * subvolume iterator.
+ * @iter: Subvolume iterator.
+ * @path_ret: Returned subvolume path, relative to the subvolume ID used to
+ * create the iterator. May be %NULL.
+ * Must be freed with free().
+ * @id_ret: Returned subvolume ID. May be %NULL.
+ *
+ * This requires appropriate privilege (CAP_SYS_ADMIN).
+ *
+ * Return: %BTRFS_UTIL_OK on success, %BTRFS_UTIL_ERROR_STOP_ITERATION if there
+ * are no more subvolumes, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_subvolume_iterator_next(struct btrfs_util_subvolume_iterator *iter,
+char **path_ret,
+uint64_t *id_ret);
+
+/**
+ * btrfs_util_subvolume_iterator_next_info() - Get information about the next
+ * subvolume for a subvolume iterator.
+ * @iter: Subvolume iterator.
+ * @path_ret: See btrfs_util_subvolume_iterator_next().
+ * @subvol: Returned subvolume information.
+ *
+ * This convenie

[PATCH v2 15/27] libbtrfsutil: add filesystem sync helpers

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Namely, sync, start_sync, and wait_sync.
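
For readers who want to see what "start a sync, then wait for it" looks like at the ioctl level, here is a sketch that assumes the kernel UAPI header <linux/btrfs.h> is installed. BTRFS_IOC_START_SYNC and BTRFS_IOC_WAIT_SYNC are real kernel ioctls, but sketch_start_and_wait_sync() is a hypothetical helper. It is not the filesystem.c code from this patch, which is truncated here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h>

/*
 * Start a sync on the filesystem containing path, then block until that
 * transaction has been committed. Returns 0 on success or -errno on failure.
 */
static int sketch_start_and_wait_sync(const char *path)
{
	__u64 transid = 0;
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -errno;

	/* START_SYNC fills in the transaction ID; WAIT_SYNC consumes it. */
	if (ioctl(fd, BTRFS_IOC_START_SYNC, &transid) == -1 ||
	    ioctl(fd, BTRFS_IOC_WAIT_SYNC, &transid) == -1)
		ret = -errno; /* saved before close() can clobber errno */

	close(fd);
	return ret;
}
```

On a path that is not on Btrfs the first ioctl is expected to fail, so a caller should treat any negative return as "not synced". Splitting start and wait is what lets a caller do other work between the two calls.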

Signed-off-by: Omar Sandoval 
---
 Makefile |   4 +-
 libbtrfsutil/btrfsutil.h |  44 
 libbtrfsutil/filesystem.c| 103 +++
 libbtrfsutil/python/btrfsutilpy.h|   3 +
 libbtrfsutil/python/filesystem.c |  94 
 libbtrfsutil/python/module.c |  21 ++
 libbtrfsutil/python/setup.py |   1 +
 libbtrfsutil/python/tests/test_filesystem.py |  73 +++
 8 files changed, 341 insertions(+), 2 deletions(-)
 create mode 100644 libbtrfsutil/filesystem.c
 create mode 100644 libbtrfsutil/python/filesystem.c
 create mode 100644 libbtrfsutil/python/tests/test_filesystem.py

diff --git a/Makefile b/Makefile
index fa32c50d..37bd6e5d 100644
--- a/Makefile
+++ b/Makefile
@@ -135,8 +135,8 @@ libbtrfsutil_major := $(shell sed -rn 's/^\#define 
BTRFS_UTIL_VERSION_MAJOR ([0-
 libbtrfsutil_minor := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_MINOR ([0-9])+$$/\1/p' libbtrfsutil/btrfsutil.h)
 libbtrfsutil_patch := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_PATCH ([0-9])+$$/\1/p' libbtrfsutil/btrfsutil.h)
 libbtrfsutil_version := $(libbtrfsutil_major).$(libbtrfsutil_minor).$(libbtrfsutil_patch)
-libbtrfsutil_objects = libbtrfsutil/errors.o libbtrfsutil/qgroup.o \
-  libbtrfsutil/subvolume.o
+libbtrfsutil_objects = libbtrfsutil/errors.o libbtrfsutil/filesystem.o \
+  libbtrfsutil/qgroup.o libbtrfsutil/subvolume.o
 convert_objects = convert/main.o convert/common.o convert/source-fs.o \
  convert/source-ext2.o convert/source-reiserfs.o
 mkfs_objects = mkfs/main.o mkfs/common.o mkfs/rootdir.o
diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 677ab3c1..34a2fc4a 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -74,6 +74,50 @@ enum btrfs_util_error {
  */
 const char *btrfs_util_strerror(enum btrfs_util_error err);
 
+/**
+ * btrfs_util_sync() - Force a sync on a specific Btrfs filesystem.
+ * @path: Path on a Btrfs filesystem.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_sync(const char *path);
+
+/**
+ * btrfs_util_sync_fd() - See btrfs_util_sync().
+ */
+enum btrfs_util_error btrfs_util_sync_fd(int fd);
+
+/**
+ * btrfs_util_start_sync() - Start a sync on a specific Btrfs filesystem but
+ * don't wait for it.
+ * @path: Path on a Btrfs filesystem.
+ * @transid: Returned transaction ID which can be waited on with
+ * btrfs_util_wait_sync(). This can be %NULL.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_start_sync(const char *path,
+   uint64_t *transid);
+
+/**
+ * btrfs_util_start_sync_fd() - See btrfs_util_start_sync().
+ */
+enum btrfs_util_error btrfs_util_start_sync_fd(int fd, uint64_t *transid);
+
+/**
+ * btrfs_util_wait_sync() - Wait for a transaction with a given ID to sync.
+ * @path: Path on a Btrfs filesystem.
+ * @transid: Transaction ID to wait for, or zero for the current transaction.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_wait_sync(const char *path, uint64_t transid);
+
+/**
+ * btrfs_util_wait_sync_fd() - See btrfs_util_wait_sync().
+ */
+enum btrfs_util_error btrfs_util_wait_sync_fd(int fd, uint64_t transid);
+
 /**
  * btrfs_util_is_subvolume() - Return whether a given path is a Btrfs subvolume.
  * @path: Path to check.
diff --git a/libbtrfsutil/filesystem.c b/libbtrfsutil/filesystem.c
new file mode 100644
index ..dfd171bf
--- /dev/null
+++ b/libbtrfsutil/filesystem.c
@@ -0,0 +1,103 @@
+/*
+ * Copyright (C) 2018 Facebook
+ *
+ * This file is part of libbtrfsutil.
+ *
+ * libbtrfsutil is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * libbtrfsutil is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with libbtrfsutil.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "btrfsutil_internal.h"
+
+PUBLIC enum btrfs_util_error btrfs_util_sync(const char *path)
+{
+   enum btrfs_util_error err;
+   int fd;
+
+   fd = open(path, O_RDONLY);
+   if (fd == -1)
+   return BTRFS_UTIL_ERROR_O

[PATCH v2 12/27] libbtrfsutil: add btrfs_util_create_snapshot()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Thanks to subvolume iterators, we can also implement recursive snapshot
fairly easily.
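
The header hunk in this patch documents that the recursive flag cannot be combined with the read-only flag, because the recursive pass has to modify the new snapshot. Below is a small sketch of how such a check could look. The three macro values are copied from that hunk, while sketch_check_snapshot_flags() and its -EINVAL return convention are assumptions for illustration, not the library's actual error path.

```c
#include <assert.h>
#include <errno.h>

/* Values copied from the btrfsutil.h hunk in this patch. */
#define BTRFS_UTIL_CREATE_SNAPSHOT_RECURSIVE (1 << 0)
#define BTRFS_UTIL_CREATE_SNAPSHOT_READ_ONLY (1 << 1)
#define BTRFS_UTIL_CREATE_SNAPSHOT_MASK ((1 << 2) - 1)

/* The mask is expected to cover exactly the flags defined so far. */
_Static_assert((BTRFS_UTIL_CREATE_SNAPSHOT_RECURSIVE |
		BTRFS_UTIL_CREATE_SNAPSHOT_READ_ONLY) ==
	       BTRFS_UTIL_CREATE_SNAPSHOT_MASK,
	       "mask must cover every defined flag");

/* Hypothetical helper: 0 if the flags are acceptable, -EINVAL otherwise. */
static int sketch_check_snapshot_flags(int flags)
{
	/* Reject bits outside the mask so new flags can be added later. */
	if (flags & ~BTRFS_UTIL_CREATE_SNAPSHOT_MASK)
		return -EINVAL;

	/* A read-only snapshot cannot have child snapshots created in it. */
	if ((flags & BTRFS_UTIL_CREATE_SNAPSHOT_RECURSIVE) &&
	    (flags & BTRFS_UTIL_CREATE_SNAPSHOT_READ_ONLY))
		return -EINVAL;

	return 0;
}
```

Checking against the mask up front means an old library rejects flags it does not know about instead of silently ignoring them.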

Signed-off-by: Omar Sandoval 
---
 libbtrfsutil/btrfsutil.h|  59 ++
 libbtrfsutil/python/btrfsutilpy.h   |   1 +
 libbtrfsutil/python/module.c|  11 ++
 libbtrfsutil/python/subvolume.c |  49 
 libbtrfsutil/python/tests/test_qgroup.py|  10 ++
 libbtrfsutil/python/tests/test_subvolume.py |  55 +
 libbtrfsutil/subvolume.c| 167 
 7 files changed, 352 insertions(+)

diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 7d3a87f6..04fe 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -345,6 +345,65 @@ enum btrfs_util_error btrfs_util_create_subvolume_fd(int parent_fd,
 uint64_t *async_transid,
 struct btrfs_util_qgroup_inherit *qgroup_inherit);
 
+/**
+ * BTRFS_UTIL_CREATE_SNAPSHOT_RECURSIVE - Also snapshot subvolumes beneath the
+ * source subvolume onto the same location on the new snapshot.
+ *
+ * Note that this is currently implemented in userspace non-atomically. Because
+ * it modifies the newly-created snapshot, it cannot be combined with
+ * %BTRFS_UTIL_CREATE_SNAPSHOT_READ_ONLY. It requires appropriate privilege
+ * (CAP_SYS_ADMIN).
+ */
+#define BTRFS_UTIL_CREATE_SNAPSHOT_RECURSIVE (1 << 0)
+/**
+ * BTRFS_UTIL_CREATE_SNAPSHOT_READ_ONLY - Create a read-only snapshot.
+ */
+#define BTRFS_UTIL_CREATE_SNAPSHOT_READ_ONLY (1 << 1)
+#define BTRFS_UTIL_CREATE_SNAPSHOT_MASK ((1 << 2) - 1)
+
+/**
+ * btrfs_util_create_snapshot() - Create a new snapshot from a source subvolume
+ * path.
+ * @source: Path of the existing subvolume to snapshot.
+ * @path: Where to create the snapshot.
+ * @flags: Bitmask of BTRFS_UTIL_CREATE_SNAPSHOT_* flags.
+ * @async_transid: See btrfs_util_create_subvolume(). If
+ * %BTRFS_UTIL_CREATE_SNAPSHOT_RECURSIVE was in @flags, then this will contain
+ * the largest transaction ID of all created subvolumes.
+ * @qgroup_inherit: See btrfs_util_create_subvolume().
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_create_snapshot(const char *source,
+const char *path, int flags,
+uint64_t *async_transid,
+struct btrfs_util_qgroup_inherit *qgroup_inherit);
+
+/**
+ * btrfs_util_create_snapshot_fd() - See btrfs_util_create_snapshot().
+ */
+enum btrfs_util_error btrfs_util_create_snapshot_fd(int fd, const char *path,
+   int flags,
+   uint64_t *async_transid,
+   struct btrfs_util_qgroup_inherit *qgroup_inherit);
+
+/**
+ * btrfs_util_create_snapshot_fd2() - Create a new snapshot from a source
+ * subvolume file descriptor and a target parent file descriptor and name.
+ * @fd: File descriptor of the existing subvolume to snapshot.
+ * @parent_fd: File descriptor of the parent directory where the snapshot should
+ * be created.
+ * @name: Name of the snapshot to create.
+ * @flags: See btrfs_util_create_snapshot().
+ * @async_transid: See btrfs_util_create_snapshot().
+ * @qgroup_inherit: See btrfs_util_create_snapshot().
+ */
+enum btrfs_util_error btrfs_util_create_snapshot_fd2(int fd, int parent_fd,
+const char *name,
+int flags,
+uint64_t *async_transid,
+struct btrfs_util_qgroup_inherit *qgroup_inherit);
+
 struct btrfs_util_subvolume_iterator;
 
 /**
diff --git a/libbtrfsutil/python/btrfsutilpy.h b/libbtrfsutil/python/btrfsutilpy.h
index a9c15219..d552e416 100644
--- a/libbtrfsutil/python/btrfsutilpy.h
+++ b/libbtrfsutil/python/btrfsutilpy.h
@@ -70,6 +70,7 @@ PyObject *set_subvolume_read_only(PyObject *self, PyObject *args, PyObject *kwds
 PyObject *get_default_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *set_default_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
+PyObject *create_snapshot(PyObject *self, PyObject *args, PyObject *kwds);
 
 void add_module_constants(PyObject *m);
 
diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
index daf0747f..d8f797cb 100644
--- a/libbtrfsutil/python/module.c
+++ b/libbtrfsutil/python/module.c
@@ -195,6 +195,17 @@ static PyMethodDef btrfsutil_methods[] = {
 "path -- string, bytes, or path-like object\n"
 "async -- create the

[PATCH v2 10/27] libbtrfsutil: add btrfs_util_[gs]et_default_subvolume()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

set_default_subvolume() is a trivial ioctl(), but there's no ioctl() for
get_default_subvolume(), so we need to search the root tree.

Signed-off-by: Omar Sandoval 
---
 libbtrfsutil/btrfsutil.h|  41 ++
 libbtrfsutil/python/btrfsutilpy.h   |   2 +
 libbtrfsutil/python/module.c|  14 
 libbtrfsutil/python/subvolume.c |  50 
 libbtrfsutil/python/tests/test_subvolume.py |  14 
 libbtrfsutil/subvolume.c| 113 
 6 files changed, 234 insertions(+)

diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 8bd2b847..54777f1d 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -256,6 +256,8 @@ enum btrfs_util_error btrfs_util_get_subvolume_read_only_fd(int fd, bool *ret);
  * @path: Subvolume path.
  * @read_only: New value of read-only flag.
  *
+ * This requires appropriate privilege (CAP_SYS_ADMIN).
+ *
  * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
  */
 enum btrfs_util_error btrfs_util_set_subvolume_read_only(const char *path,
@@ -268,6 +270,45 @@ enum btrfs_util_error btrfs_util_set_subvolume_read_only(const char *path,
 enum btrfs_util_error btrfs_util_set_subvolume_read_only_fd(int fd,
bool read_only);
 
+/**
+ * btrfs_util_get_default_subvolume() - Get the default subvolume for a
+ * filesystem.
+ * @path: Path on a Btrfs filesystem.
+ * @id_ret: Returned subvolume ID.
+ *
+ * This requires appropriate privilege (CAP_SYS_ADMIN).
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_get_default_subvolume(const char *path,
+  uint64_t *id_ret);
+
+/**
+ * btrfs_util_get_default_subvolume_fd() - See
+ * btrfs_util_get_default_subvolume().
+ */
+enum btrfs_util_error btrfs_util_get_default_subvolume_fd(int fd,
+ uint64_t *id_ret);
+
+/**
+ * btrfs_util_set_default_subvolume() - Set the default subvolume for a
+ * filesystem.
+ * @path: Path in a Btrfs filesystem. This may be any path in the filesystem; it
+ * does not have to refer to a subvolume unless @id is zero.
+ * @id: ID of subvolume to set as the default. If zero is given, the subvolume
+ * ID of @path is used.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_set_default_subvolume(const char *path,
+  uint64_t id);
+
+/**
+ * btrfs_util_set_default_subvolume_fd() - See
+ * btrfs_util_set_default_subvolume().
+ */
+enum btrfs_util_error btrfs_util_set_default_subvolume_fd(int fd, uint64_t id);
+
 struct btrfs_util_qgroup_inherit;
 
 /**
diff --git a/libbtrfsutil/python/btrfsutilpy.h b/libbtrfsutil/python/btrfsutilpy.h
index 21253e51..41314d4a 100644
--- a/libbtrfsutil/python/btrfsutilpy.h
+++ b/libbtrfsutil/python/btrfsutilpy.h
@@ -66,6 +66,8 @@ PyObject *subvolume_path(PyObject *self, PyObject *args, 
PyObject *kwds);
 PyObject *subvolume_info(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *get_subvolume_read_only(PyObject *self, PyObject *args, PyObject 
*kwds);
 PyObject *set_subvolume_read_only(PyObject *self, PyObject *args, PyObject 
*kwds);
+PyObject *get_default_subvolume(PyObject *self, PyObject *args, PyObject 
*kwds);
+PyObject *set_default_subvolume(PyObject *self, PyObject *args, PyObject 
*kwds);
 PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 
 void add_module_constants(PyObject *m);
diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
index 3395fb14..0ac4d63a 100644
--- a/libbtrfsutil/python/module.c
+++ b/libbtrfsutil/python/module.c
@@ -173,6 +173,20 @@ static PyMethodDef btrfsutil_methods[] = {
 "Arguments:\n"
 "path -- string, bytes, path-like object, or open file descriptor\n"
 "read_only -- bool flag value"},
+   {"get_default_subvolume", (PyCFunction)get_default_subvolume,
+METH_VARARGS | METH_KEYWORDS,
+"get_default_subvolume(path) -> int\n\n"
+"Get the ID of the default subvolume of a filesystem.\n\n"
+"Arguments:\n"
+"path -- string, bytes, path-like object, or open file descriptor"},
+   {"set_default_subvolume", (PyCFunction)set_default_subvolume,
+METH_VARARGS | METH_KEYWORDS,
+"set_default_subvolume(path, id=0)\n\n"
+"Set the default subvolume of a filesystem.\n\n"
+"Arguments:\n"
+"path -- string, bytes, path-like object, or open file descriptor\n"
+"id -- if not zero, set the default subvolume to the subvolume with\n"
+"this ID instea

[PATCH v2 08/27] libbtrfsutil: add btrfs_util_subvolume_info()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

This gets the information in `btrfs subvolume show` from the root
item.

Signed-off-by: Omar Sandoval 
---
 libbtrfsutil/btrfsutil.h| 111 +
 libbtrfsutil/python/btrfsutilpy.h   |   4 +
 libbtrfsutil/python/module.c|  14 +++
 libbtrfsutil/python/subvolume.c | 113 +
 libbtrfsutil/python/tests/test_subvolume.py |  50 ++
 libbtrfsutil/subvolume.c| 148 
 6 files changed, 440 insertions(+)

diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index f96c9c4e..0d83dea9 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -20,8 +20,10 @@
 #ifndef BTRFS_UTIL_H
 #define BTRFS_UTIL_H
 
+#include 
 #include 
 #include 
+#include 
 
 #define BTRFS_UTIL_VERSION_MAJOR 1
 #define BTRFS_UTIL_VERSION_MINOR 0
@@ -124,6 +126,115 @@ enum btrfs_util_error btrfs_util_subvolume_path(const char *path, uint64_t id,
 enum btrfs_util_error btrfs_util_subvolume_path_fd(int fd, uint64_t id,
   char **path_ret);
 
+/**
+ * struct btrfs_util_subvolume_info - Information about a Btrfs subvolume.
+ */
+struct btrfs_util_subvolume_info {
+   /** @id: ID of this subvolume, unique across the filesystem. */
+   uint64_t id;
+
+   /**
+* @parent_id: ID of the subvolume which contains this subvolume, or
+* zero for the root subvolume (BTRFS_FS_TREE_OBJECTID) or orphaned
+* subvolumes (i.e., subvolumes which have been deleted but not yet
+* cleaned up).
+*/
+   uint64_t parent_id;
+
+   /**
+* @dir_id: Inode number of the directory containing this subvolume in
+* the parent subvolume, or zero for the root subvolume
+* (BTRFS_FS_TREE_OBJECTID) or orphaned subvolumes.
+*/
+   uint64_t dir_id;
+
+   /** @flags: On-disk root item flags. */
+   uint64_t flags;
+
+   /** @uuid: UUID of this subvolume. */
+   uint8_t uuid[16];
+
+   /**
+* @parent_uuid: UUID of the subvolume this subvolume is a snapshot of,
+* or all zeroes if this subvolume is not a snapshot.
+*/
+   uint8_t parent_uuid[16];
+
+   /**
+* @received_uuid: UUID of the subvolume this subvolume was received
+* from, or all zeroes if this subvolume was not received. Note that
+* this field, @stransid, @rtransid, @stime, and @rtime are set manually
+* by userspace after a subvolume is received.
+*/
+   uint8_t received_uuid[16];
+
+   /** @generation: Transaction ID of the subvolume root. */
+   uint64_t generation;
+
+   /**
+* @ctransid: Transaction ID when an inode in this subvolume was last
+* changed.
+*/
+   uint64_t ctransid;
+
+   /** @otransid: Transaction ID when this subvolume was created. */
+   uint64_t otransid;
+
+   /**
+* @stransid: Transaction ID of the sent subvolume this subvolume was
+* received from, or zero if this subvolume was not received. See the
+* note on @received_uuid.
+*/
+   uint64_t stransid;
+
+   /**
+* @rtransid: Transaction ID when this subvolume was received, or zero
+* if this subvolume was not received. See the note on @received_uuid.
+*/
+   uint64_t rtransid;
+
+   /** @ctime: Time when an inode in this subvolume was last changed. */
+   struct timespec ctime;
+
+   /** @otime: Time when this subvolume was created. */
+   struct timespec otime;
+
+   /**
+* @stime: Not well-defined, usually zero unless it was set otherwise.
+* See the note on @received_uuid.
+*/
+   struct timespec stime;
+
+   /**
+* @rtime: Time when this subvolume was received, or zero if this
+* subvolume was not received. See the note on @received_uuid.
+*/
+   struct timespec rtime;
+};
+
+/**
+ * btrfs_util_subvolume_info() - Get information about a subvolume.
+ * @path: Path in a Btrfs filesystem. This may be any path in the filesystem; it
+ * does not have to refer to a subvolume unless @id is zero.
+ * @id: ID of subvolume to get information about. If zero is given, the
+ * subvolume ID of @path is used.
+ * @subvol: Returned subvolume information. This can be %NULL if you just want
+ * to check whether the subvolume exists; %BTRFS_UTIL_ERROR_SUBVOLUME_NOT_FOUND
+ * will be returned if it does not.
+ *
+ * This requires appropriate privilege (CAP_SYS_ADMIN).
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_subvolume_info(const char *path, uint64_t id,
+   struct btrfs_util_subvolume_info *subvol);
+
+/**
+ * btrfs_util_subvolume_info_fd() - See btrfs_util_subvolume_info().
+ */
+enum btrfs_util_error

[PATCH v2 13/27] libbtrfsutil: add btrfs_util_delete_subvolume()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

We also support recursive deletion using a subvolume iterator.

Signed-off-by: Omar Sandoval 
---
 libbtrfsutil/btrfsutil.h|  33 +
 libbtrfsutil/python/btrfsutilpy.h   |   1 +
 libbtrfsutil/python/module.c|   8 +++
 libbtrfsutil/python/subvolume.c |  27 
 libbtrfsutil/python/tests/test_subvolume.py |  48 +
 libbtrfsutil/subvolume.c| 102 
 6 files changed, 219 insertions(+)

diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 04fe..00c86174 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -404,6 +404,39 @@ enum btrfs_util_error btrfs_util_create_snapshot_fd2(int fd, int parent_fd,
 uint64_t *async_transid,
 struct btrfs_util_qgroup_inherit *qgroup_inherit);
 
+/**
+ * BTRFS_UTIL_DELETE_SUBVOLUME_RECURSIVE - Delete subvolumes beneath the given
+ * subvolume before attempting to delete the given subvolume.
+ *
+ * If this flag is not used, deleting a subvolume with child subvolumes is an
+ * error. Note that this is currently implemented in userspace non-atomically.
+ * It requires appropriate privilege (CAP_SYS_ADMIN).
+ */
+#define BTRFS_UTIL_DELETE_SUBVOLUME_RECURSIVE (1 << 0)
+#define BTRFS_UTIL_DELETE_SUBVOLUME_MASK ((1 << 1) - 1)
+
+/**
+ * btrfs_util_delete_subvolume() - Delete a subvolume or snapshot.
+ * @path: Path of the subvolume to delete.
+ * @flags: Bitmask of BTRFS_UTIL_DELETE_SUBVOLUME_* flags.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_delete_subvolume(const char *path, int flags);
+
+/**
+ * btrfs_util_delete_subvolume_fd() - Delete a subvolume or snapshot given its
+ * parent and name.
+ * @parent_fd: File descriptor of the subvolume's parent directory.
+ * @name: Name of the subvolume.
+ * @flags: See btrfs_util_delete_subvolume().
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_delete_subvolume_fd(int parent_fd,
+const char *name,
+int flags);
+
 struct btrfs_util_subvolume_iterator;
 
 /**
diff --git a/libbtrfsutil/python/btrfsutilpy.h b/libbtrfsutil/python/btrfsutilpy.h
index d552e416..b3ec047f 100644
--- a/libbtrfsutil/python/btrfsutilpy.h
+++ b/libbtrfsutil/python/btrfsutilpy.h
@@ -71,6 +71,7 @@ PyObject *get_default_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *set_default_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *create_snapshot(PyObject *self, PyObject *args, PyObject *kwds);
+PyObject *delete_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 
 void add_module_constants(PyObject *m);
 
diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
index d8f797cb..e995a1be 100644
--- a/libbtrfsutil/python/module.c
+++ b/libbtrfsutil/python/module.c
@@ -206,6 +206,14 @@ static PyMethodDef btrfsutil_methods[] = {
 "read_only -- create a read-only snapshot\n"
 "async -- create the subvolume without waiting for it to commit to\n"
 "disk and return the transaction ID"},
+   {"delete_subvolume", (PyCFunction)delete_subvolume,
+METH_VARARGS | METH_KEYWORDS,
+"delete_subvolume(path, recursive=False)\n\n"
+"Delete a subvolume or snapshot.\n\n"
+"Arguments:\n"
+"path -- string, bytes, or path-like object\n"
+"recursive -- if the given subvolume has child subvolumes, delete\n"
+"them instead of failing"},
{},
 };
 
diff --git a/libbtrfsutil/python/subvolume.c b/libbtrfsutil/python/subvolume.c
index a158ade7..eb3f6e27 100644
--- a/libbtrfsutil/python/subvolume.c
+++ b/libbtrfsutil/python/subvolume.c
@@ -398,6 +398,33 @@ PyObject *create_snapshot(PyObject *self, PyObject *args, PyObject *kwds)
Py_RETURN_NONE;
 }
 
+PyObject *delete_subvolume(PyObject *self, PyObject *args, PyObject *kwds)
+{
+   static char *keywords[] = {"path", "recursive", NULL};
+   struct path_arg path = {.allow_fd = false};
+   enum btrfs_util_error err;
+   int recursive = 0;
+   int flags = 0;
+
+   if (!PyArg_ParseTupleAndKeywords(args, kwds, "O&|p:delete_subvolume",
+keywords, &path_converter, &path,
+&recursive))
+   return NULL;
+
+   if (recursive)
+   flags |= BTRFS_UTIL_DELETE_SUBVOLUME_RECURSIVE;
+
+   er
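
The flag and mask definitions in the header hunk above invite a quick illustration of how a caller or the library might screen the @flags argument. This is only a sketch: the two macros are copied from the patch, while demo_delete_flags_valid() and demo_delete_is_recursive() are hypothetical names that do not appear in libbtrfsutil, and the truncated patch does not show how the library actually validates flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the btrfsutil.h hunk in this patch. */
#define BTRFS_UTIL_DELETE_SUBVOLUME_RECURSIVE (1 << 0)
#define BTRFS_UTIL_DELETE_SUBVOLUME_MASK ((1 << 1) - 1)

/* Hypothetical helper: true if no bits outside the known mask are set. */
bool demo_delete_flags_valid(int flags)
{
	return (flags & ~BTRFS_UTIL_DELETE_SUBVOLUME_MASK) == 0;
}

/* Hypothetical helper: true if child subvolumes should be deleted first. */
bool demo_delete_is_recursive(int flags)
{
	return (flags & BTRFS_UTIL_DELETE_SUBVOLUME_RECURSIVE) != 0;
}
```

Because the mask is derived from the highest flag bit, adding a second flag later would only require bumping the shift in the mask definition.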

[PATCH v2 05/27] libbtrfsutil: add qgroup inheritance helpers

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

We want to hide struct btrfs_qgroup_inherit from the user because that
comes from the Btrfs UAPI headers. Instead, wrap it in a struct
btrfs_util_qgroup_inherit and provide helpers to manipulate it. This
will be used for subvolume and snapshot creation.

Signed-off-by: Omar Sandoval 
---
 Makefile |   3 +-
 libbtrfsutil/btrfsutil.h |  45 +
 libbtrfsutil/python/btrfsutilpy.h|   6 ++
 libbtrfsutil/python/module.c |   8 ++
 libbtrfsutil/python/qgroup.c | 154 +++
 libbtrfsutil/python/setup.py |   1 +
 libbtrfsutil/python/tests/test_qgroup.py |  36 
 libbtrfsutil/qgroup.c|  83 +
 8 files changed, 335 insertions(+), 1 deletion(-)
 create mode 100644 libbtrfsutil/python/qgroup.c
 create mode 100644 libbtrfsutil/python/tests/test_qgroup.py
 create mode 100644 libbtrfsutil/qgroup.c

diff --git a/Makefile b/Makefile
index 58b3eca4..fa32c50d 100644
--- a/Makefile
+++ b/Makefile
@@ -135,7 +135,8 @@ libbtrfsutil_major := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_MAJOR ([0-
 libbtrfsutil_minor := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_MINOR ([0-9])+$$/\1/p' libbtrfsutil/btrfsutil.h)
 libbtrfsutil_patch := $(shell sed -rn 's/^\#define BTRFS_UTIL_VERSION_PATCH ([0-9])+$$/\1/p' libbtrfsutil/btrfsutil.h)
 libbtrfsutil_version := $(libbtrfsutil_major).$(libbtrfsutil_minor).$(libbtrfsutil_patch)
-libbtrfsutil_objects = libbtrfsutil/errors.o libbtrfsutil/subvolume.o
+libbtrfsutil_objects = libbtrfsutil/errors.o libbtrfsutil/qgroup.o \
+  libbtrfsutil/subvolume.o
 convert_objects = convert/main.o convert/common.o convert/source-fs.o \
  convert/source-ext2.o convert/source-reiserfs.o
 mkfs_objects = mkfs/main.o mkfs/common.o mkfs/rootdir.o
diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index ca36f695..76bf0b60 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -20,6 +20,7 @@
 #ifndef BTRFS_UTIL_H
 #define BTRFS_UTIL_H
 
+#include 
 #include 
 
 #define BTRFS_UTIL_VERSION_MAJOR 1
@@ -102,6 +103,50 @@ enum btrfs_util_error btrfs_util_subvolume_id(const char *path,
  */
 enum btrfs_util_error btrfs_util_subvolume_id_fd(int fd, uint64_t *id_ret);
 
+struct btrfs_util_qgroup_inherit;
+
+/**
+ * btrfs_util_create_qgroup_inherit() - Create a qgroup inheritance specifier
+ * for btrfs_util_create_subvolume() or btrfs_util_create_snapshot().
+ * @flags: Must be zero.
+ * @ret: Returned qgroup inheritance specifier.
+ *
+ * The returned structure must be freed with
+ * btrfs_util_destroy_qgroup_inherit().
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_create_qgroup_inherit(int flags,
+  struct btrfs_util_qgroup_inherit **ret);
+
+/**
+ * btrfs_util_destroy_qgroup_inherit() - Destroy a qgroup inheritance specifier
+ * previously created with btrfs_util_create_qgroup_inherit().
+ * @inherit: Specifier to destroy.
+ */
+void btrfs_util_destroy_qgroup_inherit(struct btrfs_util_qgroup_inherit *inherit);
+
+/**
+ * btrfs_util_qgroup_inherit_add_group() - Add inheritance from a qgroup to a
+ * qgroup inheritance specifier.
+ * @inherit: Specifier to modify. May be reallocated.
+ * @qgroupid: ID of qgroup to inherit from.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_qgroup_inherit_add_group(struct btrfs_util_qgroup_inherit **inherit,
+ uint64_t qgroupid);
+
+/**
+ * btrfs_util_qgroup_inherit_get_groups() - Get the qgroups a qgroup inheritance
+ * specifier contains.
+ * @inherit: Qgroup inheritance specifier.
+ * @groups: Returned array of qgroup IDs.
+ * @n: Returned number of entries in the @groups array.
+ */
+void btrfs_util_qgroup_inherit_get_groups(const struct btrfs_util_qgroup_inherit *inherit,
+ const uint64_t **groups, size_t *n);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/libbtrfsutil/python/btrfsutilpy.h b/libbtrfsutil/python/btrfsutilpy.h
index 9a04fda7..a36f2671 100644
--- a/libbtrfsutil/python/btrfsutilpy.h
+++ b/libbtrfsutil/python/btrfsutilpy.h
@@ -29,7 +29,13 @@
 
 #include 
 
+typedef struct {
+   PyObject_HEAD
+   struct btrfs_util_qgroup_inherit *inherit;
+} QgroupInherit;
+
 extern PyTypeObject BtrfsUtilError_type;
+extern PyTypeObject QgroupInherit_type;
 
 /*
  * Helpers for path arguments based on posixmodule.c in CPython.
diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
index d492cbc7..de7d17a1 100644
--- a/libbtrfsutil/python/module.c
+++ b/libbtrfsutil/python/module.c
@@ -164,6 +164,10 @@ PyInit_btrfsutil(void)
if (PyType_Ready(&BtrfsUtilError_type) <
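
The "May be reallocated" note on btrfs_util_qgroup_inherit_add_group() explains why that helper takes a pointer to a pointer. A minimal sketch of the pattern follows, assuming an opaque structure that ends in a flexible array of qgroup IDs. struct demo_qgroup_inherit and the demo_* functions are hypothetical stand-ins; the real structure wraps struct btrfs_qgroup_inherit and its layout is not shown in this excerpt.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct btrfs_util_qgroup_inherit. */
struct demo_qgroup_inherit {
	size_t num_groups;
	uint64_t groups[];	/* flexible array member, grows on each add */
};

/* Allocate an empty specifier; returns 0 on success, -1 on failure. */
int demo_create_inherit(struct demo_qgroup_inherit **ret)
{
	*ret = calloc(1, sizeof(**ret));
	return *ret ? 0 : -1;
}

/*
 * Append a qgroup ID. realloc() may move the object, so the caller's
 * pointer is updated through @inherit. On failure the old object is
 * left untouched and still owned by the caller.
 */
int demo_inherit_add_group(struct demo_qgroup_inherit **inherit,
			   uint64_t qgroupid)
{
	struct demo_qgroup_inherit *tmp = *inherit;
	size_t n = tmp->num_groups + 1;

	tmp = realloc(tmp, sizeof(*tmp) + n * sizeof(tmp->groups[0]));
	if (!tmp)
		return -1;
	tmp->groups[n - 1] = qgroupid;
	tmp->num_groups = n;
	*inherit = tmp;
	return 0;
}

/* Expose the stored IDs without copying them. */
void demo_inherit_get_groups(const struct demo_qgroup_inherit *inherit,
			     const uint64_t **groups, size_t *n)
{
	*groups = inherit->groups;
	*n = inherit->num_groups;
}

void demo_destroy_inherit(struct demo_qgroup_inherit *inherit)
{
	free(inherit);
}
```

Keeping the structure opaque in the public header is what lets the library change the layout later without breaking callers.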

[PATCH v2 06/27] libbtrfsutil: add btrfs_util_create_subvolume()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Doing the ioctl() directly isn't too bad, but passing in a full path is
more convenient than opening the parent and passing the path component.

Signed-off-by: Omar Sandoval 
---
 libbtrfsutil/btrfsutil.h|  34 +
 libbtrfsutil/python/btrfsutilpy.h   |   1 +
 libbtrfsutil/python/module.c|   8 ++
 libbtrfsutil/python/subvolume.c |  29 +++
 libbtrfsutil/python/tests/test_qgroup.py|  11 +++
 libbtrfsutil/python/tests/test_subvolume.py |  49 +++-
 libbtrfsutil/subvolume.c| 114 
 7 files changed, 245 insertions(+), 1 deletion(-)

diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 76bf0b60..5d67f111 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -105,6 +105,40 @@ enum btrfs_util_error btrfs_util_subvolume_id_fd(int fd, uint64_t *id_ret);
 
 struct btrfs_util_qgroup_inherit;
 
+/**
+ * btrfs_util_create_subvolume() - Create a new subvolume.
+ * @path: Where to create the subvolume.
+ * @flags: Must be zero.
+ * @async_transid: If not NULL, create the subvolume asynchronously (i.e.,
+ * without waiting for it to commit to disk) and return the transaction ID
+ * that it was created in. This transaction ID can be waited on with
+ * btrfs_util_wait_sync().
+ * @qgroup_inherit: Qgroups to inherit from, or NULL.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_create_subvolume(const char *path, int flags,
+ uint64_t *async_transid,
+ struct btrfs_util_qgroup_inherit *qgroup_inherit);
+
+/**
+ * btrfs_util_create_subvolume_fd() - Create a new subvolume given its parent
+ * and name.
+ * @parent_fd: File descriptor of the parent directory where the subvolume
+ * should be created.
+ * @name: Name of the subvolume to create.
+ * @flags: See btrfs_util_create_subvolume().
+ * @async_transid: See btrfs_util_create_subvolume().
+ * @qgroup_inherit: See btrfs_util_create_subvolume().
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_create_subvolume_fd(int parent_fd,
+const char *name,
+int flags,
+uint64_t *async_transid,
+struct btrfs_util_qgroup_inherit *qgroup_inherit);
+
 /**
  * btrfs_util_create_qgroup_inherit() - Create a qgroup inheritance specifier
  * for btrfs_util_create_subvolume() or btrfs_util_create_snapshot().
diff --git a/libbtrfsutil/python/btrfsutilpy.h b/libbtrfsutil/python/btrfsutilpy.h
index a36f2671..87d47ae0 100644
--- a/libbtrfsutil/python/btrfsutilpy.h
+++ b/libbtrfsutil/python/btrfsutilpy.h
@@ -60,6 +60,7 @@ void SetFromBtrfsUtilErrorWithPaths(enum btrfs_util_error err,
 
 PyObject *is_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *subvolume_id(PyObject *self, PyObject *args, PyObject *kwds);
+PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 
 void add_module_constants(PyObject *m);
 
diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
index de7d17a1..69bba704 100644
--- a/libbtrfsutil/python/module.c
+++ b/libbtrfsutil/python/module.c
@@ -144,6 +144,14 @@ static PyMethodDef btrfsutil_methods[] = {
 "Get the ID of the subvolume containing a file.\n\n"
 "Arguments:\n"
 "path -- string, bytes, path-like object, or open file descriptor"},
+   {"create_subvolume", (PyCFunction)create_subvolume,
+METH_VARARGS | METH_KEYWORDS,
+"create_subvolume(path, async=False)\n\n"
+"Create a new subvolume.\n\n"
+"Arguments:\n"
+"path -- string, bytes, or path-like object\n"
+"async -- create the subvolume without waiting for it to commit to\n"
+"disk and return the transaction ID"},
{},
 };
 
diff --git a/libbtrfsutil/python/subvolume.c b/libbtrfsutil/python/subvolume.c
index 4aab06a5..6f2080ee 100644
--- a/libbtrfsutil/python/subvolume.c
+++ b/libbtrfsutil/python/subvolume.c
@@ -71,3 +71,32 @@ PyObject *subvolume_id(PyObject *self, PyObject *args, PyObject *kwds)
path_cleanup(&path);
return PyLong_FromUnsignedLongLong(id);
 }
+
+PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds)
+{
+   static char *keywords[] = {"path", "async", "qgroup_inherit", NULL};
+   struct path_arg path = {.allow_fd = false};
+   enum btrfs_util_error err;
+   int async = 0;
+   QgroupInherit *inherit = NULL;
+   uint64_t transid;
+
+   if (!P
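
The commit message contrasts passing a full path with opening the parent and passing the last path component. The gap between the two entry points can be illustrated with a small path-splitting sketch. It assumes POSIX dirname() and basename() from <libgen.h>, which may modify their argument, hence the two scratch copies. demo_split_path() is a hypothetical name, and the library's real implementation is in the part of the patch that is cut off here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for strdup() under strict -std= modes */
#endif

#include <assert.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split @path into a newly allocated parent
 * directory and final component. Returns 0 on success, -1 if an
 * allocation fails. The caller frees both results.
 */
int demo_split_path(const char *path, char **parent_ret, char **name_ret)
{
	char *tmp1 = strdup(path);	/* scratch copy for dirname() */
	char *tmp2 = strdup(path);	/* scratch copy for basename() */
	char *parent = NULL, *name = NULL;

	if (tmp1 && tmp2) {
		parent = strdup(dirname(tmp1));
		name = strdup(basename(tmp2));
	}
	free(tmp1);
	free(tmp2);

	if (!parent || !name) {
		free(parent);
		free(name);
		return -1;
	}
	*parent_ret = parent;
	*name_ret = name;
	return 0;
}
```

A path-based wrapper could then open the parent directory and hand the descriptor plus the name to the _fd variant.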

[PATCH v2 16/27] btrfs-progs: use libbtrfsutil for read-only property

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Signed-off-by: Omar Sandoval 
---
 messages.h | 13 
 props.c| 69 +++---
 2 files changed, 38 insertions(+), 44 deletions(-)

diff --git a/messages.h b/messages.h
index 4999c7b9..004d5167 100644
--- a/messages.h
+++ b/messages.h
@@ -54,6 +54,19 @@
DO_ABORT_ON_ERROR;  \
} while (0)
 
+#define error_btrfs_util(err)  \
+   do {\
+   const char *errno_str = strerror(errno);\
+   const char *lib_str = btrfs_util_strerror(err); \
+   PRINT_TRACE_ON_ERROR;   \
+   PRINT_VERBOSE_ERROR;\
+   if (lib_str && strcmp(errno_str, lib_str) != 0) \
+   __btrfs_error("%s: %s", lib_str, errno_str);\
+   else\
+   __btrfs_error("%s", errno_str); \
+   DO_ABORT_ON_ERROR;  \
+   } while (0)
+
 #define warning(fmt, ...)  \
do {\
PRINT_TRACE_ON_ERROR;   \
diff --git a/props.c b/props.c
index cddbd927..e4edba06 100644
--- a/props.c
+++ b/props.c
@@ -21,6 +21,8 @@
 #include 
 #include 
 
+#include 
+
 #include "ctree.h"
 #include "commands.h"
 #include "utils.h"
@@ -41,56 +43,35 @@ static int prop_read_only(enum prop_object_type type,
  const char *name,
  const char *value)
 {
-   int ret = 0;
-   int fd = -1;
-   u64 flags = 0;
-
-   fd = open(object, O_RDONLY);
-   if (fd < 0) {
-   ret = -errno;
-   error("failed to open %s: %s", object, strerror(-ret));
-   goto out;
-   }
+   enum btrfs_util_error err;
+   bool read_only;
 
-   ret = ioctl(fd, BTRFS_IOC_SUBVOL_GETFLAGS, &flags);
-   if (ret < 0) {
-   ret = -errno;
-   error("failed to get flags for %s: %s", object,
-   strerror(-ret));
-   goto out;
-   }
-
-   if (!value) {
-   if (flags & BTRFS_SUBVOL_RDONLY)
-   fprintf(stdout, "ro=true\n");
-   else
-   fprintf(stdout, "ro=false\n");
-   ret = 0;
-   goto out;
-   }
+   if (value) {
+   if (!strcmp(value, "true")) {
+   read_only = true;
+   } else if (!strcmp(value, "false")) {
+   read_only = false;
+   } else {
+   error("invalid value for property: %s", value);
+   return -EINVAL;
+   }
 
-   if (!strcmp(value, "true")) {
-   flags |= BTRFS_SUBVOL_RDONLY;
-   } else if (!strcmp(value, "false")) {
-   flags = flags & ~BTRFS_SUBVOL_RDONLY;
+   err = btrfs_util_set_subvolume_read_only(object, read_only);
+   if (err) {
+   error_btrfs_util(err);
+   return -errno;
+   }
} else {
-   ret = -EINVAL;
-   error("invalid value for property: %s", value);
-   goto out;
-   }
+   err = btrfs_util_get_subvolume_read_only(object, &read_only);
+   if (err) {
+   error_btrfs_util(err);
+   return -errno;
+   }
 
-   ret = ioctl(fd, BTRFS_IOC_SUBVOL_SETFLAGS, &flags);
-   if (ret < 0) {
-   ret = -errno;
-   error("failed to set flags for %s: %s", object,
-   strerror(-ret));
-   goto out;
+   printf("ro=%s\n", read_only ? "true" : "false");
}
 
-out:
-   if (fd != -1)
-   close(fd);
-   return ret;
+   return 0;
 }
 
 static int prop_label(enum prop_object_type type,
-- 
2.16.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
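
The error_btrfs_util() macro in this patch prints the library's message in front of the errno text only when the two strings differ. The same branch can be written as a plain function, which makes it easier to see in isolation. demo_format_error() is a hypothetical helper for illustration and is not part of btrfs-progs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the macro's branch: prefix the errno
 * text with the library text unless the library text is missing or
 * identical. Returns what snprintf() returns.
 */
int demo_format_error(char *buf, size_t size, const char *lib_str,
		      const char *errno_str)
{
	if (lib_str && strcmp(errno_str, lib_str) != 0)
		return snprintf(buf, size, "%s: %s", lib_str, errno_str);
	return snprintf(buf, size, "%s", errno_str);
}
```

The comparison avoids output such as "Permission denied: Permission denied" when the library has nothing more specific to say than errno does.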


[PATCH v2 17/27] btrfs-progs: use libbtrfsutil for sync ioctls

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

Signed-off-by: Omar Sandoval 
---
 cmds-filesystem.c | 19 ++-
 cmds-qgroup.c | 10 +++---
 cmds-subvolume.c  | 25 +
 3 files changed, 30 insertions(+), 24 deletions(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 467aff11..225df421 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -28,6 +28,8 @@
 #include 
 #include 
 
+#include 
+
 #include "kerncompat.h"
 #include "ctree.h"
 #include "utils.h"
@@ -813,25 +815,16 @@ static const char * const cmd_filesystem_sync_usage[] = {
 
 static int cmd_filesystem_sync(int argc, char **argv)
 {
-   int fd, res;
-   char*path;
-   DIR *dirstream = NULL;
+   enum btrfs_util_error err;
 
clean_args_no_options(argc, argv, cmd_filesystem_sync_usage);
 
if (check_argc_exact(argc - optind, 1))
usage(cmd_filesystem_sync_usage);
 
-   path = argv[optind];
-
-   fd = btrfs_open_dir(path, &dirstream, 1);
-   if (fd < 0)
-   return 1;
-
-   res = ioctl(fd, BTRFS_IOC_SYNC);
-   close_file_or_dir(fd, dirstream);
-   if( res < 0 ){
-   error("sync ioctl failed on '%s': %m", path);
+   err = btrfs_util_sync(argv[optind]);
+   if (err) {
+   error_btrfs_util(err);
return 1;
}
 
diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index 4f99e419..f9a52fa8 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 
+#include 
+
 #include "ctree.h"
 #include "ioctl.h"
 
@@ -299,6 +301,7 @@ static int cmd_qgroup_show(int argc, char **argv)
int filter_flag = 0;
unsigned unit_mode;
int sync = 0;
+   enum btrfs_util_error err;
 
struct btrfs_qgroup_comparer_set *comparer_set;
struct btrfs_qgroup_filter_set *filter_set;
@@ -372,9 +375,10 @@ static int cmd_qgroup_show(int argc, char **argv)
}
 
if (sync) {
-   ret = ioctl(fd, BTRFS_IOC_SYNC);
-   if (ret < 0)
-   warning("sync ioctl failed on '%s': %m", path);
+   err = btrfs_util_sync_fd(fd);
+   if (err)
+   warning("sync ioctl failed on '%s': %s", path,
+   strerror(errno));
}
 
if (filter_flag) {
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index edcb4f11..6006a278 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -28,6 +28,8 @@
 #include 
 #include 
 
+#include 
+
 #include "kerncompat.h"
 #include "ioctl.h"
 #include "qgroup.h"
@@ -219,12 +221,18 @@ out:
 
 static int wait_for_commit(int fd)
 {
-   int ret;
+   enum btrfs_util_error err;
+   uint64_t transid;
 
-   ret = ioctl(fd, BTRFS_IOC_START_SYNC, NULL);
-   if (ret < 0)
-   return ret;
-   return ioctl(fd, BTRFS_IOC_WAIT_SYNC, NULL);
+   err = btrfs_util_start_sync_fd(fd, &transid);
+   if (err)
+   return -1;
+
+   err = btrfs_util_wait_sync_fd(fd, transid);
+   if (err)
+   return -1;
+
+   return 0;
 }
 
 static const char * const cmd_subvol_delete_usage[] = {
@@ -911,6 +919,7 @@ static int cmd_subvol_find_new(int argc, char **argv)
char *subvol;
u64 last_gen;
DIR *dirstream = NULL;
+   enum btrfs_util_error err;
 
clean_args_no_options(argc, argv, cmd_subvol_find_new_usage);
 
@@ -934,9 +943,9 @@ static int cmd_subvol_find_new(int argc, char **argv)
if (fd < 0)
return 1;
 
-   ret = ioctl(fd, BTRFS_IOC_SYNC);
-   if (ret < 0) {
-   error("sync ioctl failed on '%s': %m", subvol);
+   err = btrfs_util_sync_fd(fd);
+   if (err) {
+   error_btrfs_util(err);
close_file_or_dir(fd, dirstream);
return 1;
}
-- 
2.16.1



[PATCH v2 09/27] libbtrfsutil: add btrfs_util_[gs]et_read_only()

2018-02-15 Thread Omar Sandoval
From: Omar Sandoval 

In the future, btrfs_util_[gs]et_subvolume_flags() might be useful, but
since read-only is the only subvolume flag we've defined in all this time,
this will do for now.

Signed-off-by: Omar Sandoval 
---
 libbtrfsutil/btrfsutil.h| 33 +++
 libbtrfsutil/python/btrfsutilpy.h   |  2 +
 libbtrfsutil/python/module.c| 13 ++
 libbtrfsutil/python/subvolume.c | 55 
 libbtrfsutil/python/tests/test_subvolume.py | 17 
 libbtrfsutil/subvolume.c| 66 +
 6 files changed, 186 insertions(+)

diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
index 0d83dea9..8bd2b847 100644
--- a/libbtrfsutil/btrfsutil.h
+++ b/libbtrfsutil/btrfsutil.h
@@ -235,6 +235,39 @@ enum btrfs_util_error btrfs_util_subvolume_info(const char *path, uint64_t id,
 enum btrfs_util_error btrfs_util_subvolume_info_fd(int fd, uint64_t id,
    struct btrfs_util_subvolume_info *subvol);
 
+/**
+ * btrfs_util_get_subvolume_read_only() - Get whether a subvolume is read-only.
+ * @path: Subvolume path.
+ * @ret: Returned read-only flag.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_get_subvolume_read_only(const char *path,
+bool *ret);
+
+/**
+ * btrfs_util_get_subvolume_read_only_fd() - See
+ * btrfs_util_get_subvolume_read_only().
+ */
+enum btrfs_util_error btrfs_util_get_subvolume_read_only_fd(int fd, bool *ret);
+
+/**
+ * btrfs_util_set_subvolume_read_only() - Set whether a subvolume is read-only.
+ * @path: Subvolume path.
+ * @read_only: New value of read-only flag.
+ *
+ * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
+ */
+enum btrfs_util_error btrfs_util_set_subvolume_read_only(const char *path,
+bool read_only);
+
+/**
+ * btrfs_util_set_subvolume_read_only_fd() - See
+ * btrfs_util_set_subvolume_read_only().
+ */
+enum btrfs_util_error btrfs_util_set_subvolume_read_only_fd(int fd,
+   bool read_only);
+
 struct btrfs_util_qgroup_inherit;
 
 /**
diff --git a/libbtrfsutil/python/btrfsutilpy.h b/libbtrfsutil/python/btrfsutilpy.h
index e601cb8b..21253e51 100644
--- a/libbtrfsutil/python/btrfsutilpy.h
+++ b/libbtrfsutil/python/btrfsutilpy.h
@@ -64,6 +64,8 @@ PyObject *is_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *subvolume_id(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *subvolume_path(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *subvolume_info(PyObject *self, PyObject *args, PyObject *kwds);
+PyObject *get_subvolume_read_only(PyObject *self, PyObject *args, PyObject *kwds);
+PyObject *set_subvolume_read_only(PyObject *self, PyObject *args, PyObject *kwds);
 PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
 
 void add_module_constants(PyObject *m);
diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
index b1469fc9..3395fb14 100644
--- a/libbtrfsutil/python/module.c
+++ b/libbtrfsutil/python/module.c
@@ -160,6 +160,19 @@ static PyMethodDef btrfsutil_methods[] = {
 "path -- string, bytes, path-like object, or open file descriptor\n"
 "id -- if not zero, instead of returning information about the\n"
 "given path, return information about the subvolume with this ID"},
+   {"get_subvolume_read_only", (PyCFunction)get_subvolume_read_only,
+METH_VARARGS | METH_KEYWORDS,
+"get_subvolume_read_only(path) -> bool\n\n"
+"Get whether a subvolume is read-only.\n\n"
+"Arguments:\n"
+"path -- string, bytes, path-like object, or open file descriptor"},
+   {"set_subvolume_read_only", (PyCFunction)set_subvolume_read_only,
+METH_VARARGS | METH_KEYWORDS,
+"set_subvolume_read_only(path, read_only=True)\n\n"
+"Set whether a subvolume is read-only.\n\n"
+"Arguments:\n"
+"path -- string, bytes, path-like object, or open file descriptor\n"
+"read_only -- bool flag value"},
{"create_subvolume", (PyCFunction)create_subvolume,
 METH_VARARGS | METH_KEYWORDS,
 "create_subvolume(path, async=False)\n\n"
diff --git a/libbtrfsutil/python/subvolume.c b/libbtrfsutil/python/subvolume.c
index 31b6ca2e..76487865 100644
--- a/libbtrfsutil/python/subvolume.c
+++ b/libbtrfsutil/python/subvolume.c
@@ -215,6 +215,61 @@ PyStructSequence_Desc SubvolumeInfo_desc = {
 
 PyTypeObject SubvolumeInfo_type;
 
+PyObject *get_subvolume_read_only(PyObject *self, PyObject *args, PyObject *kwds)
+{
+   
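
Since read-only is a single bit in the subvolume flags word, the get/set helpers declared above presumably reduce to a test and a masked update around the GETFLAGS and SETFLAGS ioctls. The bit manipulation alone is sketched below. DEMO_SUBVOL_RDONLY is a local stand-in for BTRFS_SUBVOL_RDONLY from the Btrfs UAPI headers, and both functions are hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for BTRFS_SUBVOL_RDONLY (assumed to be bit 1). */
#define DEMO_SUBVOL_RDONLY (UINT64_C(1) << 1)

/* Hypothetical helper: report whether the read-only bit is set. */
bool demo_flags_read_only(uint64_t flags)
{
	return (flags & DEMO_SUBVOL_RDONLY) != 0;
}

/*
 * Hypothetical helper: return @flags with the read-only bit set or
 * cleared, leaving every other bit as it was.
 */
uint64_t demo_flags_set_read_only(uint64_t flags, bool read_only)
{
	if (read_only)
		return flags | DEMO_SUBVOL_RDONLY;
	return flags & ~DEMO_SUBVOL_RDONLY;
}
```

Reading the current flags first and changing only the one bit keeps any other flag intact, which is the same read-modify-write shape as the props.c code that patch 16/27 removes.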

Re: [PATCH v2 03/27] libbtrfsutil: add Python bindings

2018-02-21 Thread Omar Sandoval
On Wed, Feb 21, 2018 at 02:47:54PM +0100, David Sterba wrote:
> On Thu, Feb 15, 2018 at 11:04:48AM -0800, Omar Sandoval wrote:
> > +setup(
> > +name='btrfsutil',
> > +version=get_version(),
> > +description='Library for managing Btrfs filesystems',
> > +url='https://github.com/kdave/btrfs-progs',
> > +license='GPLv3',
> 
> LGPLv3?

Yes, that was a typo. I'll send an incremental patch.


Re: [PATCH v2 04/27] libbtrfsutil: add btrfs_util_is_subvolume() and btrfs_util_subvolume_id()

2018-02-21 Thread Omar Sandoval
On Wed, Feb 21, 2018 at 02:02:02PM +0100, David Sterba wrote:
> On Wed, Feb 21, 2018 at 12:43:07PM +0100, David Sterba wrote:
> > On Thu, Feb 15, 2018 at 11:04:49AM -0800, Omar Sandoval wrote:
> > > --- /dev/null
> > > +++ b/libbtrfsutil/subvolume.c
> > > @@ -0,0 +1,127 @@
> > > +/*
> > > + * Copyright (C) 2018 Facebook
> > > + *
> > > + * This file is part of libbtrfsutil.
> > > + *
> > > + * libbtrfsutil is free software: you can redistribute it and/or modify
> > > + * it under the terms of the GNU Lesser General Public License as published by
> > > + * the Free Software Foundation, either version 3 of the License, or
> > > + * (at your option) any later version.
> > > + *
> > > + * libbtrfsutil is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > + * GNU Lesser General Public License for more details.
> > > + *
> > > + * You should have received a copy of the GNU Lesser General Public License
> > > + * along with libbtrfsutil.  If not, see <http://www.gnu.org/licenses/>.
> > > + */
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > 
> > How is this supposed to work? This pulls the system-wide includes, the
> > travis-ci environment has an old kernel and btrfs_tree.h does not exist
> > there. The missing defines are provided by libbtrfs' btrfs/ctree.h.
> > 
> > We can add a configure-time check to detect the availability of the
> > headers, but I'm not sure if this is right. As libbtrfsutil/subvolume.c
> > is built internally it should use the includes from the btrfs-progs git
> > itself, no?

Dang it, yes, since the lib started out as its own repo I used the
system headers, but when I ported it to btrfs-progs I forgot to sort
this out.

> Oh yeah, the fun has begun.  This will need to be sorted first. ctree.h
> unconditionally pulls kerncompat.h, so it cannot be used instead of the
> linux/btrfs*.h headers, there are warnings about redefined endianness
> conversion macros.
> 
> We can split the headers by type, so ctree.h does not contain
> everything, or copy defines and types to libbtrfsutil header so it is
> completely independent of the git or system headers. Or something else.
> 
> I can now make it compile without including btrfs_tree.h and copying
> parts of ctree.h until it compiles. The type u8 also has to be __u8.

I'll figure something out and send a patch, thanks.


Re: [PATCH v2 27/27] btrfs-progs: deprecate libbtrfs helpers

2018-02-21 Thread Omar Sandoval
On Wed, Feb 21, 2018 at 04:04:28PM +0100, David Sterba wrote:
> On Thu, Feb 15, 2018 at 11:05:12AM -0800, Omar Sandoval wrote:
> > diff --git a/btrfs-list.c b/btrfs-list.c
> > index a2fdb3f9..56aa2455 100644
> > --- a/btrfs-list.c
> > +++ b/btrfs-list.c
> > @@ -34,6 +34,12 @@
> >  #include "btrfs-list.h"
> >  #include "rbtree-utils.h"
> >  
> > +/*
> > + * The deprecated functions in this file depend on each other, so silence the
> > + * warnings.
> > + */
> > +#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
> 
> How does this behave under clang?

Clang supports it.

> > +
> >  /* we store all the roots we find in an rbtree so that we can
> >   * search for them later.
> >   */
> > diff --git a/btrfs-list.h b/btrfs-list.h
> > index 54e1888f..774bcece 100644
> > --- a/btrfs-list.h
> > +++ b/btrfs-list.h
> > @@ -82,10 +82,15 @@ struct root_info {
> >  };
> >  
> >  int btrfs_list_find_updated_files(int fd, u64 root_id, u64 oldest_gen);
> > -int btrfs_list_get_default_subvolume(int fd, u64 *default_id);
> > -char *btrfs_list_path_for_root(int fd, u64 root);
> > -int btrfs_list_get_path_rootid(int fd, u64 *treeid);
> > -int btrfs_get_subvol(int fd, struct root_info *the_ri);
> > -int btrfs_get_toplevel_subvol(int fd, struct root_info *the_ri);
> > +int btrfs_list_get_default_subvolume(int fd, u64 *default_id)
> > +   __attribute__((deprecated("use btrfs_util_get_default_subvolume_fd()")));
> 
> IIRC the parametrized deprecated("...") is not supported on older
> compilers, see 1b1fd2c190ddb896a010a4c704ec1c2d46922aaf . We might avoid
> using raw __attribute__ anyway, so a wrapper __deprecated("with string")
> could make it ifdef-ed properly. With some configure-time detection.

This should probably be a compile-time check because clients using the
libbtrfs headers might be compiled with a different setup than what was
used to build btrfs-progs, but yes, I'll try to do this. Failing that,
I'll just remove the string. Thanks!
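
For illustration, a compile-time wrapper along the lines discussed here might look like the sketch below. It assumes that GCC accepts the message argument from 4.5 onward and that Clang advertises it through __has_extension(attribute_deprecated_with_message). DEMO_DEPRECATED, demo_old_id() and demo_new_id() are hypothetical names, and this is not the patch that was eventually sent.

```c
#include <assert.h>

/*
 * Pick the richest form of the attribute that the compiler building
 * THIS translation unit supports, so clients of the header are not
 * tied to whatever compiler built the library.
 */
#if defined(__clang__)
# if __has_extension(attribute_deprecated_with_message)
#  define DEMO_DEPRECATED(msg) __attribute__((deprecated(msg)))
# else
#  define DEMO_DEPRECATED(msg) __attribute__((deprecated))
# endif
#elif defined(__GNUC__) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5))
# define DEMO_DEPRECATED(msg) __attribute__((deprecated(msg)))
#elif defined(__GNUC__)
# define DEMO_DEPRECATED(msg) __attribute__((deprecated))
#else
# define DEMO_DEPRECATED(msg)
#endif

/* Replacement API. */
int demo_new_id(void)
{
	return 42;
}

/* Old entry point, kept for compatibility and flagged for callers. */
DEMO_DEPRECATED("use demo_new_id()")
int demo_old_id(void);

/* Same idea as the pragma in btrfs-list.c: silence uses in this file. */
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"

int demo_old_id(void)
{
	return demo_new_id();
}
```

Because the check runs in the preprocessor of whoever includes the header, it needs no configure-time detection and degrades to a no-op on compilers that know neither form.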


Re: [PATCH v2 00/27] btrfs-progs: introduce libbtrfsutil, "btrfs-progs as a library"

2018-02-21 Thread Omar Sandoval
On Wed, Feb 21, 2018 at 04:13:38PM +0100, David Sterba wrote:
> On Tue, Feb 20, 2018 at 07:50:48PM +0100, David Sterba wrote:
> > I have more comments or maybe questions about the future development
> > workflow, but at this point the patchset is in a good shape for
> > incremental merge.
> 
> After removing the first patch adding subvolume.c (with
> linux/btrfs_tree.h) and what depends on it, I'm left with:
> 
> Omar Sandoval (4):
>   Add libbtrfsutil
>   libbtrfsutil: add Python bindings
>   libbtrfsutil: add qgroup inheritance helpers
>   libbtrfsutil: add filesystem sync helpers
> 
> with some context updates. That builds and passes the CI tests.

Great. Does the CI system run the Python tests yet?


Re: [PATCH v2 10/27] libbtrfsutil: add btrfs_util_[gs]et_default_subvolume()

2018-02-23 Thread Omar Sandoval
On Thu, Feb 22, 2018 at 10:55:48AM +0900, Misono, Tomohiro wrote:
> On 2018/02/16 4:04, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > set_default_subvolume() is a trivial ioctl(), but there's no ioctl() for
> > get_default_subvolume(), so we need to search the root tree.
> > 
> > Signed-off-by: Omar Sandoval 
> > ---
> >  libbtrfsutil/btrfsutil.h|  41 ++
> >  libbtrfsutil/python/btrfsutilpy.h   |   2 +
> >  libbtrfsutil/python/module.c|  14 
> >  libbtrfsutil/python/subvolume.c |  50 
> >  libbtrfsutil/python/tests/test_subvolume.py |  14 
> >  libbtrfsutil/subvolume.c| 113 
> >  6 files changed, 234 insertions(+)
> > 
> > diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
> > index 8bd2b847..54777f1d 100644
> > --- a/libbtrfsutil/btrfsutil.h
> > +++ b/libbtrfsutil/btrfsutil.h
> > @@ -256,6 +256,8 @@ enum btrfs_util_error 
> > btrfs_util_get_subvolume_read_only_fd(int fd, bool *ret);
> >   * @path: Subvolume path.
> >   * @read_only: New value of read-only flag.
> >   *
> > + * This requires appropriate privilege (CAP_SYS_ADMIN).
> > + *
> >   * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
> >   */
> >  enum btrfs_util_error btrfs_util_set_subvolume_read_only(const char *path,
> > @@ -268,6 +270,45 @@ enum btrfs_util_error 
> > btrfs_util_set_subvolume_read_only(const char *path,
> >  enum btrfs_util_error btrfs_util_set_subvolume_read_only_fd(int fd,
> > bool read_only);
> >  
> > +/**
> > + * btrfs_util_get_default_subvolume() - Get the default subvolume for a
> > + * filesystem.
> > + * @path: Path on a Btrfs filesystem.
> > + * @id_ret: Returned subvolume ID.
> > + *
> > + * This requires appropriate privilege (CAP_SYS_ADMIN).
> > + *
> > + * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
> > + */
> > +enum btrfs_util_error btrfs_util_get_default_subvolume(const char *path,
> > +  uint64_t *id_ret);
> > +
> > +/**
> > + * btrfs_util_get_default_subvolume_fd() - See
> > + * btrfs_util_get_default_subvolume().
> > + */
> > +enum btrfs_util_error btrfs_util_get_default_subvolume_fd(int fd,
> > + uint64_t *id_ret);
> > +
> > +/**
> > + * btrfs_util_set_default_subvolume() - Set the default subvolume for a
> > + * filesystem.
> > + * @path: Path in a Btrfs filesystem. This may be any path in the 
> > filesystem; it
> > + * does not have to refer to a subvolume unless @id is zero.
> > + * @id: ID of subvolume to set as the default. If zero is given, the 
> > subvolume
> > + * ID of @path is used.
> 
> The line "This requires appropriate privilege (CAP_SYS_ADMIN)." is missing 
> here.

Good catch, thanks, fixed.


Re: [PATCH v2 23/27] btrfs-progs: use libbtrfsutil for subvol sync

2018-02-23 Thread Omar Sandoval
On Thu, Feb 22, 2018 at 11:03:05AM +0900, Misono, Tomohiro wrote:
> 
> 
> On 2018/02/16 4:05, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > btrfs_util_f_deleted_subvolumes() replaces enumerate_dead_subvols() and
> > btrfs_util_f_subvolume_info() replaces is_subvolume_cleaned().
> > 
> > Signed-off-by: Omar Sandoval 
> > ---
> >  cmds-subvolume.c | 217 
> > ++-
> >  1 file changed, 21 insertions(+), 196 deletions(-)
> > 
> > diff --git a/cmds-subvolume.c b/cmds-subvolume.c
> > index 49c9c8cf..9bab9312 100644
> > --- a/cmds-subvolume.c
> > +++ b/cmds-subvolume.c
> > @@ -42,38 +42,11 @@
> >  #include "utils.h"
> >  #include "help.h"
> >  
> > -static int is_subvolume_cleaned(int fd, u64 subvolid)
> > +static int wait_for_subvolume_cleaning(int fd, size_t count, uint64_t *ids,
> > +  int sleep_interval)
> >  {
> > -   int ret;
> > -   struct btrfs_ioctl_search_args args;
> > -   struct btrfs_ioctl_search_key *sk = &args.key;
> > -
> > -   sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
> > -   sk->min_objectid = subvolid;
> > -   sk->max_objectid = subvolid;
> > -   sk->min_type = BTRFS_ROOT_ITEM_KEY;
> > -   sk->max_type = BTRFS_ROOT_ITEM_KEY;
> > -   sk->min_offset = 0;
> > -   sk->max_offset = (u64)-1;
> > -   sk->min_transid = 0;
> > -   sk->max_transid = (u64)-1;
> > -   sk->nr_items = 1;
> > -
> > -   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
> > -   if (ret < 0)
> > -   return -errno;
> > -
> > -   if (sk->nr_items == 0)
> > -   return 1;
> > -
> > -   return 0;
> > -}
> > -
> > -static int wait_for_subvolume_cleaning(int fd, int count, u64 *ids,
> > -   int sleep_interval)
> > -{
> > -   int ret;
> > -   int i;
> > +   size_t i;
> > +   enum btrfs_util_error err;
> >  
> > while (1) {
> > int clean = 1;
> > @@ -81,16 +54,14 @@ static int wait_for_subvolume_cleaning(int fd, int 
> > count, u64 *ids,
> > for (i = 0; i < count; i++) {
> > if (!ids[i])
> > continue;
> > -   ret = is_subvolume_cleaned(fd, ids[i]);
> > -   if (ret < 0) {
> > -   error(
> > -   "cannot read status of dead subvolume %llu: %s",
> > -   (unsigned long long)ids[i], 
> > strerror(-ret));
> > -   return ret;
> > -   }
> > -   if (ret) {
> > -   printf("Subvolume id %llu is gone\n", ids[i]);
> > +   err = btrfs_util_subvolume_info_fd(fd, ids[i], NULL);
> > +   if (err == BTRFS_UTIL_ERROR_SUBVOLUME_NOT_FOUND) {
> > +   printf("Subvolume id %" PRIu64 " is gone\n",
> > +  ids[i]);
> > ids[i] = 0;
> > +   } else if (err) {
> > +   error_btrfs_util(err);
> > +   return -errno;
> > } else {
> > clean = 0;
> > }
> > @@ -1028,160 +999,15 @@ static const char * const cmd_subvol_sync_usage[] = 
> > {
> > NULL
> >  };
> >  
> > -#if 0
> > -/*
> > - * If we're looking for any dead subvolume, take a shortcut and look
> > - * for any ORPHAN_ITEMs in the tree root
> > - */
> > -static int fs_has_dead_subvolumes(int fd)
> > -{
> > -   int ret;
> > -   struct btrfs_ioctl_search_args args;
> > -   struct btrfs_ioctl_search_key *sk = &args.key;
> > -   struct btrfs_ioctl_search_header sh;
> > -   u64 min_subvolid = 0;
> > -
> > -again:
> > -   sk->tree_id = BTRFS_ROOT_TREE_OBJECTID;
> > -   sk->min_objectid = BTRFS_ORPHAN_OBJECTID;
> > -   sk->max_objectid = BTRFS_ORPHAN_OBJECTID;
> > -   sk->min_type = BTRFS_ORPHAN_ITEM_KEY;
> > -   sk->max_type = BTRFS_ORPHAN_ITEM_KEY;
> > -   sk->min_offset = min_subvolid;
> > -   sk->max_offset = (u64)-1;
> > -   sk->min_transid = 0;
> > -   sk->max_transid = (u64)-1;
> > -   sk->nr_items = 1;
> > -
> > -   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
> &

Re: [PATCH v2 16/27] btrfs-progs: use libbtrfsutil for read-only property

2018-02-23 Thread Omar Sandoval
On Thu, Feb 22, 2018 at 01:23:45PM +0900, Misono, Tomohiro wrote:
> 
> 
> On 2018/02/16 4:05, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > Signed-off-by: Omar Sandoval 
> > ---
> >  messages.h | 13 
> >  props.c| 69 
> > +++---
> >  2 files changed, 38 insertions(+), 44 deletions(-)
> > 
> > diff --git a/messages.h b/messages.h
> > index 4999c7b9..004d5167 100644
> > --- a/messages.h
> > +++ b/messages.h
> > @@ -54,6 +54,19 @@
> > DO_ABORT_ON_ERROR;  \
> > } while (0)
> >  
> > +#define error_btrfs_util(err)  
> > \
> > +   do {\
> > +   const char *errno_str = strerror(errno);\
> > +   const char *lib_str = btrfs_util_strerror(err)  \
> 
> "make D=trace" fails because ";" is missing here.

That's embarrassing, thanks!
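For anyone following along, here is a minimal sketch of the multi-statement macro pattern involved. It is not the btrfs-progs macro: demo_strerror() and demo_format_error() are hypothetical stand-ins, and the message is written to a buffer rather than printed. Each statement inside the do { } while (0) body needs its own ';', while the macro itself ends without one so that callers can use it like a function call, including in an unbraced if/else.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for btrfs_util_strerror(); not the library function. */
static const char *demo_strerror(int err)
{
	return err == 1 ? "Could not open" : "Unknown error";
}

/*
 * Hypothetical macro for illustration only. Every statement in the body is
 * terminated by ';'. The trailing "while (0)" has no ';' so that the caller
 * supplies it, which keeps "if (x) macro(...); else ..." well-formed.
 */
#define demo_format_error(buf, size, err)				\
	do {								\
		const char *errno_str = strerror(errno);		\
		const char *lib_str = demo_strerror(err);		\
		snprintf((buf), (size), "%s: %s", lib_str, errno_str);	\
	} while (0)
```

Dropping the ';' after the second declaration would only show up as a build error in configurations that actually expand the macro, which is presumably why it slipped through until "make D=trace".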


Re: [PATCH v2 07/27] libbtrfsutil: add btrfs_util_subvolume_path()

2018-02-23 Thread Omar Sandoval
On Fri, Feb 23, 2018 at 03:27:52PM +0900, Misono, Tomohiro wrote:
> On 2018/02/16 4:04, Omar Sandoval wrote:
> > From: Omar Sandoval 
> 
> > +PUBLIC enum btrfs_util_error btrfs_util_subvolume_path_fd(int fd, uint64_t 
> > id,
> > + char **path_ret)
> > +{
> > +   char *path, *p;
> > +   size_t capacity = 4096;
> > +
> > +   path = malloc(capacity);
> > +   if (!path)
> > +   return BTRFS_UTIL_ERROR_NO_MEMORY;
> > +   p = path + capacity - 1;
> > +   p[0] = '\0';
> > +
> > +   if (id == 0) {
> > +   enum btrfs_util_error err;
> > +
> > +   err = btrfs_util_is_subvolume_fd(fd);
> > +   if (err)
> > +   return err;
> 
> 'path' should be freed here and below.
> 
> > +
> > +   err = btrfs_util_subvolume_id_fd(fd, &id);
> > +   if (err)
> > +   return err;
> > +   }

Indeed, although I'll just change it to allocate path after this
instead.
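Roughly, the reordering looks like the sketch below. This is an illustration under assumptions, not the v3 code: demo_check_id() and demo_path_for_id() are hypothetical names, and the ID check merely stands in for the subvolume lookups that can fail. The point is only that every early return happens before the buffer exists.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical check standing in for the subvolume lookups; rejects ID 0. */
static int demo_check_id(uint64_t id)
{
	return id == 0 ? -EINVAL : 0;
}

/* Builds "subvol-<id>" in a heap buffer that the caller must free(). */
static int demo_path_for_id(uint64_t id, char **path_ret)
{
	size_t capacity = 4096;
	char *path;
	int ret;

	/* Run every check that can fail before allocating anything. */
	ret = demo_check_id(id);
	if (ret)
		return ret;	/* no buffer exists yet, so nothing leaks */

	path = malloc(capacity);
	if (!path)
		return -ENOMEM;

	snprintf(path, capacity, "subvol-%" PRIu64, id);
	*path_ret = path;
	return 0;
}
```

Compared with adding free(path) to each error branch, this keeps the early returns as they are and leaves a single place where ownership of the buffer begins.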


Re: [PATCH v2 11/27] libbtrfsutil: add subvolume iterator helpers

2018-02-23 Thread Omar Sandoval
On Fri, Feb 23, 2018 at 04:40:45PM +0900, Misono, Tomohiro wrote:
> On 2018/02/16 4:04, Omar Sandoval wrote:
> > From: Omar Sandoval 
> 
> > +PUBLIC enum btrfs_util_error btrfs_util_create_subvolume_iterator(const 
> > char *path,
> > + uint64_t top,
> > + int flags,
> > + struct 
> > btrfs_util_subvolume_iterator **ret)
> > +{
> > +   enum btrfs_util_error err;
> > +   int fd;
> > +
> > +   fd = open(path, O_RDONLY);
> > +   if (fd == -1)
> > +   return BTRFS_UTIL_ERROR_OPEN_FAILED;
> > +
> > +   err = btrfs_util_create_subvolume_iterator_fd(fd, top, flags, ret);
> > +   if (err == BTRFS_UTIL_OK)
> > +   (*ret)->flags |= BTRFS_UTIL_SUBVOLUME_ITERATOR_CLOSE_FD;
> 
> If btrfs_util_create_subvolume_iterator_fd() returns error, 'fd' remains open.
> So, fd should be closed here.

Good catch, fixed.
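A sketch of the shape of that fix, with hypothetical names throughout (struct demo_iter and the demo_* functions are not the libbtrfsutil API, and the "fail" argument exists only to model an error from the fd-based constructor): the path-based wrapper owns the fd until the iterator has been created, so it has to close it on the error path and should keep errno intact while doing so.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

struct demo_iter {
	int fd;
	int close_fd;	/* nonzero if destroying the iterator should close fd */
};

/* Hypothetical fd-based constructor; fails on request to model an error. */
static int demo_create_iter_fd(int fd, int fail, struct demo_iter **ret)
{
	struct demo_iter *iter;

	if (fail)
		return -EINVAL;
	iter = malloc(sizeof(*iter));
	if (!iter)
		return -ENOMEM;
	iter->fd = fd;
	iter->close_fd = 0;
	*ret = iter;
	return 0;
}

/* Path-based wrapper: it opened the fd, so it must not leak it on failure. */
static int demo_create_iter(const char *path, int fail, struct demo_iter **ret)
{
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return -errno;

	err = demo_create_iter_fd(fd, fail, ret);
	if (err) {
		int saved_errno = errno;

		close(fd);		/* no iterator took ownership of fd */
		errno = saved_errno;	/* close() may have clobbered errno */
		return err;
	}
	(*ret)->close_fd = 1;		/* the iterator now owns fd */
	return 0;
}

static void demo_destroy_iter(struct demo_iter *iter)
{
	if (iter->close_fd)
		close(iter->fd);
	free(iter);
}
```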


Re: [PATCH v2 06/27] libbtrfsutil: add btrfs_util_create_subvolume()

2018-02-23 Thread Omar Sandoval
On Fri, Feb 23, 2018 at 05:24:04PM +0900, Misono, Tomohiro wrote:
> On 2018/02/16 4:04, Omar Sandoval wrote:
> > From: Omar Sandoval 
> 
> > +static enum btrfs_util_error openat_parent_and_name(int dirfd, const char 
> > *path,
> > +   char *name, size_t name_len,
> > +   int *fd)
> > +{
> > +   char *tmp_path, *slash, *dirname, *basename;
> > +   size_t len;
> > +
> > +   /* Ignore trailing slashes. */
> > +   len = strlen(path);
> > +   while (len > 1 && path[len - 1] == '/')
> > +   len--;
> > +
> > +   tmp_path = malloc(len + 1);
> > +   if (!tmp_path)
> > +   return BTRFS_UTIL_ERROR_NO_MEMORY;
> > +   memcpy(tmp_path, path, len);
> > +   tmp_path[len] = '\0';
> > +
> > +   slash = memrchr(tmp_path, '/', len);
> > +   if (slash == tmp_path) {
> > +   dirname = "/";
> > +   basename = tmp_path + 1;
> > +   } else if (slash) {
> > +   *slash = '\0';
> > +   dirname = tmp_path;
> > +   basename = slash + 1;
> > +   } else {
> > +   dirname = ".";
> > +   basename = tmp_path;
> > +   }
> > +
> > +   len = strlen(basename);
> > +   if (len >= name_len) {
> > +   errno = ENAMETOOLONG;
> 
> tmp_path should be also freed here.

Another good catch, thanks.
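One way to avoid this class of leak is a single exit label that frees the temporary copy, sketched below. This is a simplified, hypothetical helper (demo_last_component() is not the patch's openat_parent_and_name(); it only extracts the last path component and uses strrchr() rather than memrchr()), shown under the assumption that returning a negative errno value is acceptable for the illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy the last component of @path into @name.
 * All exits after the allocation go through "out", so tmp_path is
 * always freed, including on the ENAMETOOLONG path.
 */
static int demo_last_component(const char *path, char *name, size_t name_len)
{
	char *tmp_path, *slash;
	const char *base;
	size_t len;
	int ret = 0;

	/* Ignore trailing slashes. */
	len = strlen(path);
	while (len > 1 && path[len - 1] == '/')
		len--;

	tmp_path = malloc(len + 1);
	if (!tmp_path)
		return -ENOMEM;
	memcpy(tmp_path, path, len);
	tmp_path[len] = '\0';

	slash = strrchr(tmp_path, '/');
	base = slash ? slash + 1 : tmp_path;

	len = strlen(base);
	if (len >= name_len) {
		ret = -ENAMETOOLONG;
		goto out;
	}
	memcpy(name, base, len + 1);	/* includes the terminating NUL */
out:
	free(tmp_path);
	return ret;
}
```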


Re: [PATCH v2 26/27] btrfs-progs: use libbtrfsutil for subvolume list

2018-02-23 Thread Omar Sandoval
On Fri, Feb 23, 2018 at 11:26:35AM +0900, Misono, Tomohiro wrote:
> 
> 
> On 2018/02/16 4:05, Omar Sandoval wrote:
> > From: Omar Sandoval 
> 
> > +static struct subvol_list *btrfs_list_deleted_subvols(int fd,
> > + struct 
> > btrfs_list_filter_set *filter_set)
> > +{
> > +   struct subvol_list *subvols = NULL;
> > +   uint64_t *ids = NULL;
> > +   size_t i, n;
> > +   enum btrfs_util_error err;
> > +   int ret = -1;
> > +
> > +   err = btrfs_util_deleted_subvolumes_fd(fd, &ids, &n);
> > +   if (err) {
> > +   error_btrfs_util(err);
> > +   return NULL;
> > +   }
> > +
> > +   subvols = malloc(sizeof(*subvols) + n * sizeof(subvols->subvols[0]));
> > +   if (!subvols) {
> > +   error("out of memory");
> > +   goto out;
> > +   }
> > +
> > +   subvols->num = 0;
> > +   for (i = 0; i < n; i++) {
> > +   struct listed_subvol *subvol = &subvols->subvols[subvols->num];
> > +
> > +   err = btrfs_util_subvolume_info_fd(fd, ids[i], &subvol->info);
> > +   if (err) {
> 
> I think there is a small chance that subvolume would be removed from tree 
> between 
> btrfs_util_deleted_subvolumes_fd() and btrfs_util_subvolume_info_fd().
> So, error of BTRFS_UTIL_ERROR_SUBVOLUME_NOT_FOUND should be ignored.

Thanks. Since this patch isn't in the devel branch yet, I'll fold the fix
in.
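The folded-in fix would look something like the sketch below. The names are hypothetical (enum demo_error, demo_subvolume_info() and demo_collect() are not the btrfs-progs code), and the lookup is faked: even IDs model subvolumes that were cleaned up between the two calls, and ID 99 models an unrelated failure. A "not found" result skips the entry; any other error still aborts.

```c
#include <stddef.h>
#include <stdint.h>

enum demo_error {
	DEMO_OK,
	DEMO_ERROR_NOT_FOUND,
	DEMO_ERROR_SEARCH_FAILED,
};

/*
 * Hypothetical lookup: even IDs model subvolumes cleaned up in the
 * meantime, ID 99 models an unrelated failure.
 */
static enum demo_error demo_subvolume_info(uint64_t id, uint64_t *info)
{
	if (id == 99)
		return DEMO_ERROR_SEARCH_FAILED;
	if (id % 2 == 0)
		return DEMO_ERROR_NOT_FOUND;
	*info = id * 10;
	return DEMO_OK;
}

/* Fill @infos for every ID that still exists; *num counts the entries kept. */
static enum demo_error demo_collect(const uint64_t *ids, size_t n,
				    uint64_t *infos, size_t *num)
{
	size_t i;

	*num = 0;
	for (i = 0; i < n; i++) {
		enum demo_error err;

		err = demo_subvolume_info(ids[i], &infos[*num]);
		if (err == DEMO_ERROR_NOT_FOUND)
			continue;	/* cleaned up since the ID list was read */
		if (err)
			return err;	/* anything else is a real failure */
		(*num)++;
	}
	return DEMO_OK;
}
```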


Re: [PATCH v2 14/27] libbtrfsutil: add btrfs_util_deleted_subvolumes()

2018-02-23 Thread Omar Sandoval
On Fri, Feb 23, 2018 at 11:12:56AM +0900, Misono, Tomohiro wrote:
> 
> On 2018/02/16 4:04, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > Signed-off-by: Omar Sandoval 
> > ---
> >  libbtrfsutil/btrfsutil.h| 21 +++
> >  libbtrfsutil/python/btrfsutilpy.h   |  3 +
> >  libbtrfsutil/python/module.c| 30 ++
> >  libbtrfsutil/python/qgroup.c| 17 +-
> >  libbtrfsutil/python/subvolume.c | 30 ++
> >  libbtrfsutil/python/tests/test_subvolume.py |  8 +++
> >  libbtrfsutil/subvolume.c| 89 
> > +
> >  7 files changed, 183 insertions(+), 15 deletions(-)
> > 
> > diff --git a/libbtrfsutil/btrfsutil.h b/libbtrfsutil/btrfsutil.h
> > index 00c86174..677ab3c1 100644
> > --- a/libbtrfsutil/btrfsutil.h
> > +++ b/libbtrfsutil/btrfsutil.h
> > @@ -534,6 +534,27 @@ enum btrfs_util_error 
> > btrfs_util_subvolume_iterator_next_info(struct btrfs_util_
> >   char **path_ret,
> >   struct 
> > btrfs_util_subvolume_info *subvol);
> >  
> > +/**
> > > + * btrfs_util_deleted_subvolumes() - Get a list of subvolumes which have been
> > + * deleted but not yet cleaned up.
> > + * @path: Path on a Btrfs filesystem.
> > + * @ids: Returned array of subvolume IDs.
> > + * @n: Returned number of IDs in the @ids array.
> > + *
> > + * This requires appropriate privilege (CAP_SYS_ADMIN).
> > + *
> > + * Return: %BTRFS_UTIL_OK on success, non-zero error code on failure.
> > + */
> > +enum btrfs_util_error btrfs_util_deleted_subvolumes(const char *path,
> > +   uint64_t **ids,
> > +   size_t *n);
> > +
> > +/**
> > + * btrfs_util_deleted_subvolumes_fd() - See 
> > btrfs_util_deleted_subvolumes().
> > + */
> > +enum btrfs_util_error btrfs_util_deleted_subvolumes_fd(int fd, uint64_t 
> > **ids,
> > +  size_t *n);
> > +
> >  /**
> >   * btrfs_util_create_qgroup_inherit() - Create a qgroup inheritance 
> > specifier
> >   * for btrfs_util_create_subvolume() or btrfs_util_create_snapshot().
> > diff --git a/libbtrfsutil/python/btrfsutilpy.h 
> > b/libbtrfsutil/python/btrfsutilpy.h
> > index b3ec047f..be5122e2 100644
> > --- a/libbtrfsutil/python/btrfsutilpy.h
> > +++ b/libbtrfsutil/python/btrfsutilpy.h
> > @@ -54,6 +54,8 @@ struct path_arg {
> >  int path_converter(PyObject *o, void *p);
> >  void path_cleanup(struct path_arg *path);
> >  
> > +PyObject *list_from_uint64_array(const uint64_t *arr, size_t n);
> > +
> >  void SetFromBtrfsUtilError(enum btrfs_util_error err);
> >  void SetFromBtrfsUtilErrorWithPath(enum btrfs_util_error err,
> >struct path_arg *path);
> > @@ -72,6 +74,7 @@ PyObject *set_default_subvolume(PyObject *self, PyObject 
> > *args, PyObject *kwds);
> >  PyObject *create_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
> >  PyObject *create_snapshot(PyObject *self, PyObject *args, PyObject *kwds);
> >  PyObject *delete_subvolume(PyObject *self, PyObject *args, PyObject *kwds);
> > +PyObject *deleted_subvolumes(PyObject *self, PyObject *args, PyObject 
> > *kwds);
> >  
> >  void add_module_constants(PyObject *m);
> >  
> > diff --git a/libbtrfsutil/python/module.c b/libbtrfsutil/python/module.c
> > index e995a1be..eaa062ac 100644
> > --- a/libbtrfsutil/python/module.c
> > +++ b/libbtrfsutil/python/module.c
> > @@ -125,6 +125,29 @@ err:
> > return 0;
> >  }
> >  
> > +PyObject *list_from_uint64_array(const uint64_t *arr, size_t n)
> > +{
> > +PyObject *ret;
> > +size_t i;
> > +
> > +ret = PyList_New(n);
> > +if (!ret)
> > +   return NULL;
> > +
> > +for (i = 0; i < n; i++) {
> > +   PyObject *tmp;
> > +
> > +   tmp = PyLong_FromUnsignedLongLong(arr[i]);
> > +   if (!tmp) {
> > +   Py_DECREF(ret);
> > +   return NULL;
> > +   }
> > +   PyList_SET_ITEM(ret, i, tmp);
> > +}
> > +
> > +return ret;
> > +}
> > +
> >  void path_cleanup(struct path_arg *path)
> >  {
> > Py_CLEAR(path->object);
> > @@ -214,6 +237,13 @@ static PyMethodDef btrfsutil_metho

Re: [PATCH v2 00/27] btrfs-progs: introduce libbtrfsutil, "btrfs-progs as a library"

2018-02-26 Thread Omar Sandoval
On Fri, Feb 23, 2018 at 09:28:42PM +0100, David Sterba wrote:
> On Wed, Feb 21, 2018 at 10:50:32AM -0800, Omar Sandoval wrote:
> > On Wed, Feb 21, 2018 at 04:13:38PM +0100, David Sterba wrote:
> > > On Tue, Feb 20, 2018 at 07:50:48PM +0100, David Sterba wrote:
> > > > I have more comments or maybe questions about the future development
> > > > workflow, but at this point the patchset is in a good shape for
> > > > incremental merge.
> > > 
> > > After removing the first patch adding subvolume.c (with
> > > linux/btrfs_tree.h) and what depends on it, I'm left with:
> > > 
> > > Omar Sandoval (4):
> > >   Add libbtrfsutil
> > >   libbtrfsutil: add Python bindings
> > >   libbtrfsutil: add qgroup inheritance helpers
> > >   libbtrfsutil: add filesystem sync helpers
> > > 
> > > with some context updates. That builds and passes the CI tests.
> > 
> > Great. Does the CI system run the Python tests yet?
> 
> Tested here https://travis-ci.org/kdave/btrfs-progs/jobs/345410536 ,
> does not pass.
> 
> 
> test_start_sync (test_filesystem.TestSubvolume) ... mkfs.btrfs: invalid 
> option -- 'q'
> usage: mkfs.btrfs [options] dev [ dev ... ]
> 
> 
> Looks like it tries to use the system mkfs.btrfs that is old.

Hm... according to the documentation for the existing tests, the person
running the tests is expected to set PATH to contain the local binaries,
otherwise it'll use the system ones. Does the CI system not do that?


Re: [PATCH v2 00/27] btrfs-progs: introduce libbtrfsutil, "btrfs-progs as a library"

2018-02-27 Thread Omar Sandoval
On Tue, Feb 27, 2018 at 04:04:28PM +0100, David Sterba wrote:
> On Mon, Feb 26, 2018 at 03:36:41PM -0800, Omar Sandoval wrote:
> > On Fri, Feb 23, 2018 at 09:28:42PM +0100, David Sterba wrote:
> > > On Wed, Feb 21, 2018 at 10:50:32AM -0800, Omar Sandoval wrote:
> > > > On Wed, Feb 21, 2018 at 04:13:38PM +0100, David Sterba wrote:
> > > > > On Tue, Feb 20, 2018 at 07:50:48PM +0100, David Sterba wrote:
> > > > > > I have more comments or maybe questions about the future development
> > > > > > workflow, but at this point the patchset is in a good shape for
> > > > > > incremental merge.
> > > > > 
> > > > > After removing the first patch adding subvolume.c (with
> > > > > linux/btrfs_tree.h) and what depends on it, I'm left with:
> > > > > 
> > > > > Omar Sandoval (4):
> > > > >   Add libbtrfsutil
> > > > >   libbtrfsutil: add Python bindings
> > > > >   libbtrfsutil: add qgroup inheritance helpers
> > > > >   libbtrfsutil: add filesystem sync helpers
> > > > > 
> > > > > with some context updates. That builds and passes the CI tests.
> > > > 
> > > > Great. Does the CI system run the Python tests yet?
> > > 
> > > Tested here https://travis-ci.org/kdave/btrfs-progs/jobs/345410536 ,
> > > does not pass.
> > > 
> > > 
> > > test_start_sync (test_filesystem.TestSubvolume) ... mkfs.btrfs: invalid 
> > > option -- 'q'
> > > usage: mkfs.btrfs [options] dev [ dev ... ]
> > > 
> > > 
> > > Looks like it tries to use the system mkfs.btrfs that is old.
> > 
> > Hm... according to the documentation for the existing tests, the person
> > running the tests is expected to set PATH to contain the local binaries,
> 
> No, where is this written? The closest hit is in the 'Exported
> testsuite' but otherwise all paths must be "$TOP/mkfs.btrfs". The
> testsuite will detect where it is running and will set the TOP variable
> accordingly, but this is transparent to the tests.
> 
> The python tests should probably build on the same exec path magic.

Oops, I misunderstood from skimming the exported testsuite too quickly.
I'll update the Python tests to do something similar.


[PATCH v3 3/5] generic: add test for dedupe on an active swapfile

2018-05-22 Thread Omar Sandoval
From: Omar Sandoval 

Similar to the reflink test generic/356, but makes sure we can't dedupe
an active swapfile.

Signed-off-by: Omar Sandoval 
---
 tests/generic/492 | 76 +++
 tests/generic/492.out |  7 
 tests/generic/group   |  1 +
 3 files changed, 84 insertions(+)
 create mode 100755 tests/generic/492
 create mode 100644 tests/generic/492.out

diff --git a/tests/generic/492 b/tests/generic/492
new file mode 100755
index ..54a2553d
--- /dev/null
+++ b/tests/generic/492
@@ -0,0 +1,76 @@
+#! /bin/bash
+# FS QA Test 492
+#
+# Check that we can't dedupe a swapfile.
+#
+#---
+# Copyright (c) 2018 Facebook.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+. ./common/rc
+. ./common/filter
+. ./common/reflink
+
+rm -f $seqres.full
+
+_supported_fs generic
+_supported_os Linux
+_require_scratch_swapfile
+_require_scratch_dedupe
+
+echo "Format and mount"
+_scratch_mkfs > $seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+testdir="$SCRATCH_MNT/test-$seq"
+mkdir "$testdir"
+
+blocks=160
+blksz=65536
+
+echo "Initialize file"
+_format_swapfile "$testdir/file1" $((blocks * blksz))
+swapon "$testdir/file1"
+
+touch "$testdir/file2"
+$CHATTR_PROG +C "$testdir/file2" >/dev/null 2>&1
+
+echo "Try to dedupe"
+cp "$testdir/file1" "$testdir/file2"
+_dedupe_range "$testdir/file1" 0 "$testdir/file2" 0 $((blocks * blksz))
+_dedupe_range "$testdir/file2" 0 "$testdir/file1" 0 $((blocks * blksz))
+
+echo "Tear it down"
+swapoff "$testdir/file1"
+
+status=0
+exit
diff --git a/tests/generic/492.out b/tests/generic/492.out
new file mode 100644
index ..e1f3cc69
--- /dev/null
+++ b/tests/generic/492.out
@@ -0,0 +1,7 @@
+QA output created by 492
+Format and mount
+Initialize file
+Try to dedupe
+XFS_IOC_FILE_EXTENT_SAME: Text file busy
+XFS_IOC_FILE_EXTENT_SAME: Text file busy
+Tear it down
diff --git a/tests/generic/group b/tests/generic/group
index 111e42e7..dc5adf04 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -494,3 +494,4 @@
 489 auto quick attr
 490 auto quick rw
 491 auto quick freeze mount
+492 auto quick swap
-- 
2.17.0



[PATCH v3 5/5] generic: test invalid swap file activation

2018-05-22 Thread Omar Sandoval
From: Omar Sandoval 

Swap files cannot have holes, and they must be at least two pages.
swapon(8) and mkswap(8) have stricter restrictions, so add versions of
those commands without any restrictions.

Signed-off-by: Omar Sandoval 
---
 .gitignore|  2 ++
 src/Makefile  |  2 +-
 src/mkswap.c  | 83 +++
 src/swapon.c  | 24 +
 tests/generic/494 | 77 +++
 tests/generic/494.out |  5 +++
 tests/generic/group   |  1 +
 7 files changed, 193 insertions(+), 1 deletion(-)
 create mode 100644 src/mkswap.c
 create mode 100644 src/swapon.c
 create mode 100755 tests/generic/494
 create mode 100644 tests/generic/494.out

diff --git a/.gitignore b/.gitignore
index 53029e24..efc73a7c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -92,6 +92,7 @@
 /src/lstat64
 /src/makeextents
 /src/metaperf
+/src/mkswap
 /src/mmapcat
 /src/multi_open_unlink
 /src/nametest
@@ -111,6 +112,7 @@
 /src/seek_sanity_test
 /src/stale_handle
 /src/stat_test
+/src/swapon
 /src/t_access_root
 /src/t_dir_offset
 /src/t_dir_offset2
diff --git a/src/Makefile b/src/Makefile
index c42d3bb1..01fe99ef 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -26,7 +26,7 @@ LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize 
preallo_rw_pattern_reader \
renameat2 t_getcwd e4compact test-nextquota punch-alternating \
attr-list-by-handle-cursor-test listxattr dio-interleaved t_dir_type \
dio-invalidate-cache stat_test t_encrypted_d_revalidate \
-   attr_replace_test
+   attr_replace_test swapon mkswap
 
 SUBDIRS = log-writes perf
 
diff --git a/src/mkswap.c b/src/mkswap.c
new file mode 100644
index ..d0bce2bd
--- /dev/null
+++ b/src/mkswap.c
@@ -0,0 +1,83 @@
+/* mkswap(8) without any sanity checks */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+struct swap_header {
+   char            bootbits[1024];
+   uint32_t        version;
+   uint32_t        last_page;
+   uint32_t        nr_badpages;
+   unsigned char   sws_uuid[16];
+   unsigned char   sws_volume[16];
+   uint32_t        padding[117];
+   uint32_t        badpages[1];
+};
+
+int main(int argc, char **argv)
+{
+   struct swap_header *hdr;
+   FILE *file;
+   struct stat st;
+   long page_size;
+   int ret;
+
+   if (argc != 2) {
+   fprintf(stderr, "usage: %s PATH\n", argv[0]);
+   return EXIT_FAILURE;
+   }
+
+   page_size = sysconf(_SC_PAGESIZE);
+   if (page_size == -1) {
+   perror("sysconf");
+   return EXIT_FAILURE;
+   }
+
+   hdr = calloc(1, page_size);
+   if (!hdr) {
+   perror("calloc");
+   return EXIT_FAILURE;
+   }
+
+   file = fopen(argv[1], "r+");
+   if (!file) {
+   perror("fopen");
+   free(hdr);
+   return EXIT_FAILURE;
+   }
+
+   ret = fstat(fileno(file), &st);
+   if (ret) {
+   perror("fstat");
+   free(hdr);
+   fclose(file);
+   return EXIT_FAILURE;
+   }
+
+   hdr->version = 1;
+   hdr->last_page = st.st_size / page_size - 1;
+   memset(&hdr->sws_uuid, 0x99, sizeof(hdr->sws_uuid));
+   memcpy((char *)hdr + page_size - 10, "SWAPSPACE2", 10);
+
+   if (fwrite(hdr, page_size, 1, file) != 1) {
+   perror("fwrite");
+   free(hdr);
+   fclose(file);
+   return EXIT_FAILURE;
+   }
+
+   if (fclose(file) == EOF) {
+   perror("fclose");
+   free(hdr);
+   return EXIT_FAILURE;
+   }
+
+   free(hdr);
+
+   return EXIT_SUCCESS;
+}
diff --git a/src/swapon.c b/src/swapon.c
new file mode 100644
index ..0cb7108a
--- /dev/null
+++ b/src/swapon.c
@@ -0,0 +1,24 @@
+/* swapon(8) without any sanity checks; simply calls swapon(2) directly. */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/swap.h>
+#include <unistd.h>
+
+int main(int argc, char **argv)
+{
+   int ret;
+
+   if (argc != 2) {
+   fprintf(stderr, "usage: %s PATH\n", argv[0]);
+   return EXIT_FAILURE;
+   }
+
+   ret = swapon(argv[1], 0);
+   if (ret) {
+   perror("swapon");
+   return EXIT_FAILURE;
+   }
+
+   return EXIT_SUCCESS;
+}
diff --git a/tests/generic/494 b/tests/generic/494
new file mode 100755
index ..28468033
--- /dev/null
+++ b/tests/generic/494
@@ -0,0 +1,77 @@
+#! /bin/bash
+# FS QA Test 494
+#
+# Test invalid swap files.
+#
+#---
+# Copyright (c) 2018 Facebook.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Pub

[PATCH v3 1/5] xfstests: create swap group

2018-05-22 Thread Omar Sandoval
From: Omar Sandoval 

I'm going to add a bunch of tests for swap files, so create a group for
them and add the existing tests.

Signed-off-by: Omar Sandoval 
---
 tests/generic/group | 4 ++--
 tests/xfs/group | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/generic/group b/tests/generic/group
index 4fc3e457..111e42e7 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -358,8 +358,8 @@
 353 auto quick clone
 354 auto
 355 auto quick
-356 auto quick clone
-357 auto quick clone
+356 auto quick clone swap
+357 auto quick clone swap
 358 auto quick clone
 359 auto quick clone
 360 auto quick metadata
diff --git a/tests/xfs/group b/tests/xfs/group
index 51326d95..e5fd1c6d 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -416,7 +416,7 @@
 416 dangerous_fuzzers dangerous_scrub dangerous_repair
 417 dangerous_fuzzers dangerous_scrub dangerous_online_repair
 418 dangerous_fuzzers dangerous_scrub dangerous_repair
-419 auto quick
+419 auto quick swap
 420 auto quick clone dedupe
 421 auto quick clone dedupe
 422 dangerous_scrub dangerous_online_repair
-- 
2.17.0



Re: [PATCH 0/2] Btrfs: fix partly checksummed file races

2018-05-23 Thread Omar Sandoval
On Wed, May 23, 2018 at 12:17:14PM +0200, David Sterba wrote:
> On Tue, May 22, 2018 at 03:02:11PM -0700, Omar Sandoval wrote:
> > Based on kdave/for-next. Note that there's a Fixes: tag in there
> > referencing a commit in the for-next branch, so that would have to be
> > updated if the commit gets rebased. These patches are also available at
> > https://github.com/osandov/linux/tree/btrfs-nodatasum-race.
> 
> If the original patch is not in any released or frozen branch, then the
> fix should be folded to the original patch. The for-next branch is for
> preview, testing and catching bugs that slip past the review. And gets
> reassembled frequently so referencing a patch from there does not make
> sense.
> 
> Sending the fixups as patches is ok, replies to the original thread
> might get lost in the noise.

Ok, let's fold it in. I pushed Timofey's series with the fix folded in
here: https://github.com/osandov/linux/tree/btrfs-ioctl-fixes, based on
misc-next with Timofey's patches removed. The diff vs his original
patches is the same as my patch:

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 65d709002775..75c66ac77fd7 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3113,12 +3113,6 @@ static int btrfs_extent_same(struct inode *src, u64 
loff, u64 olen,
if (olen == 0)
return 0;
 
-   /* don't make the dst file partly checksummed */
-   if ((BTRFS_I(src)->flags & BTRFS_INODE_NODATASUM) !=
-   (BTRFS_I(dst)->flags & BTRFS_INODE_NODATASUM)) {
-   return -EINVAL;
-   }
-
tail_len = olen % BTRFS_MAX_DEDUPE_LEN;
chunk_count = div_u64(olen, BTRFS_MAX_DEDUPE_LEN);
if (chunk_count == 0)
@@ -3151,6 +3145,13 @@ static int btrfs_extent_same(struct inode *src, u64 
loff, u64 olen,
else
btrfs_double_inode_lock(src, dst);
 
+   /* don't make the dst file partly checksummed */
+   if ((BTRFS_I(src)->flags & BTRFS_INODE_NODATASUM) !=
+   (BTRFS_I(dst)->flags & BTRFS_INODE_NODATASUM)) {
+   ret = -EINVAL;
+   goto out;
+   }
+
for (i = 0; i < chunk_count; i++) {
ret = btrfs_extent_same_range(src, loff, BTRFS_MAX_DEDUPE_LEN,
  dst, dst_loff, &cmp);

The clone fix and device remove fix are in that branch, too. Let me know
if you'd prefer it as patches.
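For reference, the pattern the diff moves to is "take the locks, then check, and unwind through the common exit". A userspace sketch of that shape follows. It is not kernel code: struct demo_inode, the integer "locked" field standing in for the inode lock, DEMO_NODATASUM and demo_extent_same() are all hypothetical. It only illustrates why the early "return -EINVAL" had to become "ret = -EINVAL; goto out" once the check sits below the lock.

```c
#include <errno.h>

#define DEMO_NODATASUM 0x1u

struct demo_inode {
	int locked;		/* stand-in for the inode lock */
	unsigned int flags;
};

static void demo_lock(struct demo_inode *inode)
{
	inode->locked = 1;
}

static void demo_unlock(struct demo_inode *inode)
{
	inode->locked = 0;
}

static int demo_extent_same(struct demo_inode *src, struct demo_inode *dst)
{
	int ret = 0;

	demo_lock(src);
	demo_lock(dst);

	/*
	 * The flags can only be compared reliably while both locks are held.
	 * Checking before locking leaves a window in which another task can
	 * change them.
	 */
	if ((src->flags & DEMO_NODATASUM) != (dst->flags & DEMO_NODATASUM)) {
		ret = -EINVAL;
		goto out;	/* a plain return here would leak both locks */
	}

	/* The dedupe work itself would happen here. */
out:
	demo_unlock(dst);
	demo_unlock(src);
	return ret;
}
```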


[RFC PATCH v4 0/6] Btrfs: implement swap file support

2018-05-24 Thread Omar Sandoval
From: Omar Sandoval 

Hi,

This patch series implements swap file support for Btrfs. If you don't
remember versions 1-3, that's because they were almost 4 years ago [1]
:)

This attempt takes a very different approach from my original versions
back then. As a refresher, the original idea was to go through
->read_iter() and ->write_iter() in O_DIRECT mode. It turns out that
this is really hard to get right, because it effectively makes the
GFP_NOFS flag meaningless, since now swapping out can go through the
filesystem and grab locks and such. We could try to make the read/write
path lockless for a swap file, but this would be too easy to get wrong.

So, instead, this approach was inspired by the iomap swap file support
[2], where we directly tell the swap code where it can find the swap
extents while doing our own sanity checks. This has a bunch of
restrictions, documented in patches 4 and 6. I have a bunch of xfstests
for these cases at https://github.com/osandov/xfstests/tree/btrfs-swap.

This series can also be found at 
https://github.com/osandov/linux/tree/btrfs-swap.

All comments welcome. Thanks!

1: https://www.spinics.net/lists/linux-btrfs/msg40129.html
2: https://patchwork.kernel.org/patch/10390417/

Omar Sandoval (6):
  mm: split SWP_FILE into SWP_ACTIVATED and SWP_FS
  vfs: update swap_{,de}activate documentation
  Btrfs: push EXCL_OP set into btrfs_rm_device()
  Btrfs: prevent ioctls from interfering with a swap file
  Btrfs: rename get_chunk_map() and make it non-static
  Btrfs: support swap files

 Documentation/filesystems/Locking |  17 +--
 Documentation/filesystems/vfs.txt |  12 +-
 fs/btrfs/ctree.h  |   6 +
 fs/btrfs/disk-io.c|   3 +
 fs/btrfs/inode.c  | 220 ++
 fs/btrfs/ioctl.c  |  64 ++---
 fs/btrfs/volumes.c|  34 +++--
 fs/btrfs/volumes.h|   2 +
 include/linux/swap.h  |  13 +-
 mm/page_io.c  |   6 +-
 mm/swapfile.c |  14 +-
 11 files changed, 335 insertions(+), 56 deletions(-)

-- 
2.17.0



[RFC PATCH v4 1/6] mm: split SWP_FILE into SWP_ACTIVATED and SWP_FS

2018-05-24 Thread Omar Sandoval
From: Omar Sandoval 

The SWP_FILE flag serves two purposes: to make swap_{read,write}page()
go through the filesystem, and to make swapoff() call
->swap_deactivate(). For Btrfs, we want the latter but not the former,
so split this flag into two. This makes us always call
->swap_deactivate() if ->swap_activate() succeeded, not just if it
didn't add any swap extents itself.

This also resolves the issue of the very misleading name of SWP_FILE,
which is only used for swap files over NFS.
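
As a rough illustration of the split (a userspace sketch, not part of the patch): the two flag values mirror the diff below, while struct toy_swap_info and the toy_* helpers are hypothetical names made up here. The return-value convention (negative for error, zero for "go through the filesystem", positive for "extents already added") is taken from the documentation update in patch 2/6.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as in the patch; everything named toy_* is illustrative only. */
enum {
	SWP_ACTIVATED = 1 << 7,	/* ->swap_activate() succeeded */
	SWP_FS        = 1 << 8,	/* swap I/O goes through the filesystem */
};

struct toy_swap_info {
	unsigned long flags;
};

/*
 * Record the result of ->swap_activate(): negative is an error, zero means
 * the filesystem wants swap I/O proxied through it, positive means it added
 * that many swap extents itself.
 */
static int toy_record_activate(struct toy_swap_info *sis, int activate_ret)
{
	assert(sis);
	if (activate_ret < 0)
		return activate_ret;
	sis->flags |= SWP_ACTIVATED;
	if (activate_ret == 0)
		sis->flags |= SWP_FS;
	return activate_ret;
}

/* Read/write path: only SWP_FS decides whether I/O goes through the fs. */
static bool toy_io_goes_through_fs(const struct toy_swap_info *sis)
{
	return (sis->flags & SWP_FS) != 0;
}

/* swapoff path: only SWP_ACTIVATED decides whether to deactivate. */
static bool toy_needs_deactivate(const struct toy_swap_info *sis)
{
	return (sis->flags & SWP_ACTIVATED) != 0;
}
```

With this model, a filesystem that returns a positive extent count (the Btrfs case) ends up with SWP_ACTIVATED set and SWP_FS clear, so it gets the deactivate callback without having its I/O routed through the filesystem.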

Signed-off-by: Omar Sandoval 
---
 include/linux/swap.h | 13 +++--
 mm/page_io.c |  6 +++---
 mm/swapfile.c| 13 -
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 2417d288e016..29dfd436435c 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -167,13 +167,14 @@ enum {
SWP_SOLIDSTATE  = (1 << 4), /* blkdev seeks are cheap */
SWP_CONTINUED   = (1 << 5), /* swap_map has count continuation */
SWP_BLKDEV  = (1 << 6), /* its a block device */
-   SWP_FILE= (1 << 7), /* set after swap_activate success */
-   SWP_AREA_DISCARD = (1 << 8),/* single-time swap area discards */
-   SWP_PAGE_DISCARD = (1 << 9),/* freed swap page-cluster discards */
-   SWP_STABLE_WRITES = (1 << 10),  /* no overwrite PG_writeback pages */
-   SWP_SYNCHRONOUS_IO = (1 << 11), /* synchronous IO is efficient */
+   SWP_ACTIVATED   = (1 << 7), /* set after swap_activate success */
+   SWP_FS  = (1 << 8), /* swap file goes through fs */
+   SWP_AREA_DISCARD = (1 << 9),/* single-time swap area discards */
+   SWP_PAGE_DISCARD = (1 << 10),   /* freed swap page-cluster discards */
+   SWP_STABLE_WRITES = (1 << 11),  /* no overwrite PG_writeback pages */
+   SWP_SYNCHRONOUS_IO = (1 << 12), /* synchronous IO is efficient */
/* add others here before... */
-   SWP_SCANNING= (1 << 12),/* refcount in scan_swap_map */
+   SWP_SCANNING= (1 << 13),/* refcount in scan_swap_map */
 };
 
 #define SWAP_CLUSTER_MAX 32UL
diff --git a/mm/page_io.c b/mm/page_io.c
index b41cf9644585..f2d06c1d0cc1 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -283,7 +283,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
struct swap_info_struct *sis = page_swap_info(page);
 
VM_BUG_ON_PAGE(!PageSwapCache(page), page);
-   if (sis->flags & SWP_FILE) {
+   if (sis->flags & SWP_FS) {
struct kiocb kiocb;
struct file *swap_file = sis->swap_file;
struct address_space *mapping = swap_file->f_mapping;
@@ -364,7 +364,7 @@ int swap_readpage(struct page *page, bool synchronous)
goto out;
}
 
-   if (sis->flags & SWP_FILE) {
+   if (sis->flags & SWP_FS) {
struct file *swap_file = sis->swap_file;
struct address_space *mapping = swap_file->f_mapping;
 
@@ -422,7 +422,7 @@ int swap_set_page_dirty(struct page *page)
 {
struct swap_info_struct *sis = page_swap_info(page);
 
-   if (sis->flags & SWP_FILE) {
+   if (sis->flags & SWP_FS) {
struct address_space *mapping = sis->swap_file->f_mapping;
 
VM_BUG_ON_PAGE(!PageSwapCache(page), page);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index cc2cf04d9018..886c9d89b144 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -973,7 +973,7 @@ int get_swap_pages(int n_goal, bool cluster, swp_entry_t swp_entries[])
goto nextsi;
}
if (cluster) {
-   if (!(si->flags & SWP_FILE))
+   if (!(si->flags & SWP_FS))
n_ret = swap_alloc_cluster(si, swp_entries);
} else
n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
@@ -2327,12 +2327,13 @@ static void destroy_swap_extents(struct swap_info_struct *sis)
kfree(se);
}
 
-   if (sis->flags & SWP_FILE) {
+   if (sis->flags & SWP_ACTIVATED) {
struct file *swap_file = sis->swap_file;
struct address_space *mapping = swap_file->f_mapping;
 
-   sis->flags &= ~SWP_FILE;
-   mapping->a_ops->swap_deactivate(swap_file);
+   sis->flags &= ~SWP_ACTIVATED;
+   if (mapping->a_ops->swap_deactivate)
+   mapping->a_ops->swap_deactivate(swap_file);
}
 }
 
@@ -2428,8 +2429,10 @@ static int setup_swap_extents(struct swap_info_struct *sis, sector_t *span)
 
if (mapping->a_ops->swap_ac

[RFC PATCH v4 2/6] vfs: update swap_{,de}activate documentation

2018-05-24 Thread Omar Sandoval
From: Omar Sandoval 

The documentation for these functions is wrong in several ways:

- swap_activate() is called with the inode locked
- swap_activate() takes a swap_info_struct * and a sector_t *
- swap_activate() can also return a positive number of extents it added
  itself
- swap_deactivate() does not return anything

Signed-off-by: Omar Sandoval 
---
 Documentation/filesystems/Locking | 17 +++--
 Documentation/filesystems/vfs.txt | 12 
 2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 75d2d57e2c44..7f009e98fa3c 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -211,8 +211,9 @@ prototypes:
int (*launder_page)(struct page *);
int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long);
int (*error_remove_page)(struct address_space *, struct page *);
-   int (*swap_activate)(struct file *);
-   int (*swap_deactivate)(struct file *);
+   int (*swap_activate)(struct swap_info_struct *, struct file *,
+sector_t *);
+   void (*swap_deactivate)(struct file *);
 
 locking rules:
All except set_page_dirty and freepage may block
@@ -236,8 +237,8 @@ putback_page:   yes
 launder_page:  yes
 is_partially_uptodate: yes
 error_remove_page: yes
-swap_activate: no
-swap_deactivate:   no
+swap_activate: yes
+swap_deactivate:   no
 
->write_begin(), ->write_end() and ->readpage() may be called from
 the request handler (/dev/loop).
@@ -334,14 +335,10 @@ cleaned, or an error value if not. Note that in order to prevent the page
 getting mapped back in and redirtied, it needs to be kept locked
 across the entire operation.
 
-   ->swap_activate will be called with a non-zero argument on
-files backing (non block device backed) swapfiles. A return value
-of zero indicates success, in which case this file can be used for
-backing swapspace. The swapspace operations will be proxied to the
-address space operations.
+   ->swap_activate is called from sys_swapon() with the inode locked.
 
->swap_deactivate() will be called in the sys_swapoff()
-path after ->swap_activate() returned success.
+path after ->swap_activate() returned success. The inode is not locked.
 
 --- file_lock_operations --
 prototypes:
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 5fd325df59e2..0149109d94d1 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -650,8 +650,9 @@ struct address_space_operations {
unsigned long);
void (*is_dirty_writeback) (struct page *, bool *, bool *);
int (*error_remove_page) (struct mapping *mapping, struct page *page);
-   int (*swap_activate)(struct file *);
-   int (*swap_deactivate)(struct file *);
+   int (*swap_activate)(struct swap_info_struct *, struct file *,
+sector_t *);
+   void (*swap_deactivate)(struct file *);
 };
 
   writepage: called by the VM to write a dirty page to backing store.
@@ -828,8 +829,11 @@ struct address_space_operations {
 
   swap_activate: Called when swapon is used on a file to allocate
space if necessary and pin the block lookup information in
-   memory. A return value of zero indicates success,
-   in which case this file can be used to back swapspace.
+   memory. If this returns zero, the swap system will call the address
+   space operations ->readpage() and ->direct_IO(). Alternatively, this
+   may call add_swap_extent() and return the number of extents added, in
+   which case the swap system will use the provided blocks directly
+   instead of going through the filesystem.
 
   swap_deactivate: Called during swapoff on files where swap_activate
was successful.
-- 
2.17.0



[RFC PATCH v4 4/6] Btrfs: prevent ioctls from interfering with a swap file

2018-05-24 Thread Omar Sandoval
From: Omar Sandoval 

When a swap file is active, we must make sure that the extents of the
file are not moved and that they don't become shared. That means that
the following are not safe:

- chattr +c (enable compression)
- reflink
- dedupe
- snapshot
- defrag
- balance
- device remove/replace/resize

Don't allow those to happen on an active swap file. Balance and device
remove/replace/resize in particular are disallowed entirely; in the
future, we can relax this so that relocation skips/errors out only on
chunks containing an active swap file.

Note that we don't have to worry about chattr -C (disable nocow), which
we ignore for non-empty files, because an active swapfile must be
non-empty and can't be truncated. We also don't have to worry about
autodefrag because it's only done on COW files. Truncate and fallocate
are already taken care of by the generic code. Device add doesn't do
relocation so it's not an issue, either.
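
The guard pattern used for snapshots and resize in the diff below can be modeled in userspace roughly as follows. This is only a sketch: struct toy_root and the toy_* functions are hypothetical names, C11 atomics stand in for the kernel's atomic_t, and the decrement on deactivation is an assumption, since that side is not shown in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Stand-in for the per-root counter added to struct btrfs_root. */
struct toy_root {
	atomic_int nr_swapfiles;	/* number of active swap files */
};

/* Swap activation bumps the counter before the extents are walked. */
static void toy_swap_activate(struct toy_root *root)
{
	atomic_fetch_add(&root->nr_swapfiles, 1);
}

/* Assumed counterpart: deactivation drops the counter again. */
static void toy_swap_deactivate(struct toy_root *root)
{
	int old = atomic_fetch_sub(&root->nr_swapfiles, 1);

	assert(old > 0);	/* must pair with an earlier activate */
}

/* An operation that must not run while any swap file is active. */
static int toy_create_snapshot(struct toy_root *root)
{
	if (atomic_load(&root->nr_swapfiles))
		return -ETXTBSY;
	return 0;	/* the real snapshot work would happen here */
}
```

The counter only says that some swap file is active in the root, not which one, which is why operations such as snapshot are refused outright in this version.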

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/ctree.h   |  6 ++
 fs/btrfs/disk-io.c |  3 +++
 fs/btrfs/ioctl.c   | 51 ++
 fs/btrfs/volumes.c |  8 
 4 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 6deb39d8415d..280d7f5e2fe4 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1117,6 +1117,9 @@ struct btrfs_fs_info {
u32 sectorsize;
u32 stripesize;
 
+   /* Number of active swapfiles */
+   atomic_t nr_swapfiles;
+
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
spinlock_t ref_verify_lock;
struct rb_root block_tree;
@@ -1282,6 +1285,9 @@ struct btrfs_root {
spinlock_t qgroup_meta_rsv_lock;
u64 qgroup_meta_rsv_pertrans;
u64 qgroup_meta_rsv_prealloc;
+
+   /* Number of active swapfiles */
+   atomic_t nr_swapfiles;
 };
 
 struct btrfs_file_private {
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 22171bbe86e3..b42fd6b41b20 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1215,6 +1215,7 @@ static void __setup_root(struct btrfs_root *root, struct btrfs_fs_info *fs_info,
atomic_set(&root->log_batch, 0);
refcount_set(&root->refs, 1);
atomic_set(&root->will_be_snapshotted, 0);
+   atomic_set(&root->nr_swapfiles, 0);
root->log_transid = 0;
root->log_transid_committed = -1;
root->last_log_commit = 0;
@@ -2825,6 +2826,8 @@ int open_ctree(struct super_block *sb,
fs_info->sectorsize = 4096;
fs_info->stripesize = 4096;
 
+   atomic_set(&fs_info->nr_swapfiles, 0);
+
ret = btrfs_alloc_stripe_hash_table(fs_info);
if (ret) {
err = ret;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 82be4a94334b..5f068aabe612 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -295,6 +295,11 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
} else if (fsflags & FS_COMPR_FL) {
const char *comp;
 
+   if (IS_SWAPFILE(inode)) {
+   ret = -ETXTBSY;
+   goto out_unlock;
+   }
+
binode->flags |= BTRFS_INODE_COMPRESS;
binode->flags &= ~BTRFS_INODE_NOCOMPRESS;
 
@@ -767,6 +772,12 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
return -EINVAL;
 
+   if (atomic_read(&root->nr_swapfiles)) {
+   btrfs_info(fs_info,
+  "cannot create snapshot with active swapfile");
+   return -ETXTBSY;
+   }
+
pending_snapshot = kzalloc(sizeof(*pending_snapshot), GFP_KERNEL);
if (!pending_snapshot)
return -ENOMEM;
@@ -1504,9 +1515,13 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
}
 
inode_lock(inode);
-   if (do_compress)
-   BTRFS_I(inode)->defrag_compress = compress_type;
-   ret = cluster_pages_for_defrag(inode, pages, i, cluster);
+   if (IS_SWAPFILE(inode)) {
+   ret = -ETXTBSY;
+   } else {
+   if (do_compress)
+   BTRFS_I(inode)->defrag_compress = compress_type;
+   ret = cluster_pages_for_defrag(inode, pages, i, cluster);
+   }
if (ret < 0) {
inode_unlock(inode);
goto out_ra;
@@ -1602,6 +1617,12 @@ static noinline int btrfs_ioctl_resize(struct file *file,
return BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
}
 
+   if (atomic_read(&fs_info->nr_swapfiles)) {
+   btrfs_info(fs_info, "cannot resize with active swapfile");
+   ret = -

[RFC PATCH v4 6/6] Btrfs: support swap files

2018-05-24 Thread Omar Sandoval
From: Omar Sandoval 

Implement the swap file a_ops on Btrfs. Activation needs to make sure
that the file can be used as a swap file, which currently means it must
be fully allocated as nocow with no compression on one device. It also
sets up the swap extents directly with add_swap_extent(), so export it.
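
The page rounding that btrfs_add_swap_extent() performs in the diff below can be worked through in userspace. This is a sketch under assumptions: a 4 KiB page size is hard-coded, and the toy_* names are hypothetical. It also leaves out the special handling of the first page (the swap header) and the lowest/highest page tracking that the real function does.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE (UINT64_C(1) << TOY_PAGE_SHIFT)

/* Round x up to a multiple of a (a must be a power of two). */
static uint64_t toy_align_up(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Round x down to a multiple of a (a must be a power of two). */
static uint64_t toy_align_down(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/*
 * Count the whole pages contained in the byte range
 * [block_start, block_start + block_len). Partial pages at either end are
 * dropped. On a non-zero result, *first_ppage is the first whole page.
 */
static uint64_t toy_usable_pages(uint64_t block_start, uint64_t block_len,
				 uint64_t *first_ppage)
{
	uint64_t first = toy_align_up(block_start, TOY_PAGE_SIZE) >> TOY_PAGE_SHIFT;
	uint64_t next = toy_align_down(block_start + block_len,
				       TOY_PAGE_SIZE) >> TOY_PAGE_SHIFT;

	if (first >= next)
		return 0;
	*first_ppage = first;
	return next - first;
}
```

For example, a range starting at byte 1000 with length 8192 contains only one whole page (page 1), because both the head and the tail of the range are partial pages.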

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/inode.c | 220 +++
 mm/swapfile.c|   1 +
 2 files changed, 221 insertions(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9228cb866115..6cca8529e307 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -10526,6 +10526,224 @@ void btrfs_set_range_writeback(void *private_data, u64 start, u64 end)
}
 }
 
+struct btrfs_swap_info {
+   u64 start;
+   u64 block_start;
+   u64 block_len;
+   u64 lowest_ppage;
+   u64 highest_ppage;
+   unsigned long nr_pages;
+   int nr_extents;
+};
+
+static int btrfs_add_swap_extent(struct swap_info_struct *sis,
+struct btrfs_swap_info *bsi)
+{
+   unsigned long nr_pages;
+   u64 first_ppage, first_ppage_reported, next_ppage;
+   int ret;
+
+   first_ppage = ALIGN(bsi->block_start, PAGE_SIZE) >> PAGE_SHIFT;
+   next_ppage = ALIGN_DOWN(bsi->block_start + bsi->block_len,
+   PAGE_SIZE) >> PAGE_SHIFT;
+
+   if (first_ppage >= next_ppage)
+   return 0;
+   nr_pages = next_ppage - first_ppage;
+
+   first_ppage_reported = first_ppage;
+   if (bsi->start == 0)
+   first_ppage_reported++;
+   if (bsi->lowest_ppage > first_ppage_reported)
+   bsi->lowest_ppage = first_ppage_reported;
+   if (bsi->highest_ppage < (next_ppage - 1))
+   bsi->highest_ppage = next_ppage - 1;
+
+   ret = add_swap_extent(sis, bsi->nr_pages, nr_pages, first_ppage);
+   if (ret < 0)
+   return ret;
+   bsi->nr_extents += ret;
+   bsi->nr_pages += nr_pages;
+   return 0;
+}
+
+static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file,
+  sector_t *span)
+{
+   struct inode *inode = file_inode(file);
+   struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
+   struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
+   struct extent_state *cached_state = NULL;
+   struct extent_map *em = NULL;
+   struct btrfs_device *device = NULL;
+   struct btrfs_swap_info bsi = {
+   .lowest_ppage = (sector_t)-1ULL,
+   };
+   int ret = 0;
+   u64 isize = inode->i_size;
+   u64 start;
+
+   /*
+* The inode is locked, so these flags won't change after we check them.
+*/
+   if (BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS) {
+   btrfs_err(fs_info, "swapfile is compressed");
+   return -EINVAL;
+   }
+   if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW)) {
+   btrfs_err(fs_info, "swapfile is copy-on-write");
+   return -EINVAL;
+   }
+
+   /*
+* Balance or device remove/replace/resize can move stuff around from
+* under us. The EXCL_OP flag makes sure they aren't running/won't run
+* concurrently while we are mapping the swap extents, and the fs_info
+* nr_swapfiles counter prevents them from running while the swap file
+* is active and moving the extents. Note that this also prevents a
+* concurrent device add which isn't actually necessary, but it's not
+* really worth the trouble to allow it.
+*/
+   if (test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags))
+   return -EBUSY;
+   atomic_inc(&fs_info->nr_swapfiles);
+   /*
+* Snapshots can create extents which require COW even if NODATACOW is
+* set. We use this counter to prevent snapshots. We must increment it
+* before walking the extents because we don't want a concurrent
+* snapshot to run after we've already checked the extents.
+*/
+   atomic_inc(&BTRFS_I(inode)->root->nr_swapfiles);
+
+   lock_extent_bits(io_tree, 0, isize - 1, &cached_state);
+   start = 0;
+   while (start < isize) {
+   u64 end, logical_block_start, physical_block_start;
+   u64 len = isize - start;
+
+   em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, start, len, 0);
+   if (IS_ERR(em)) {
+   ret = PTR_ERR(em);
+   goto out;
+   }
+   end = extent_map_end(em);
+
+   if (em->block_start == EXTENT_MAP_HOLE) {
+   btrfs_err(fs_info, "swapfile has holes");
+   ret = -EINVAL;
+

[RFC PATCH v4 5/6] Btrfs: rename get_chunk_map() and make it non-static

2018-05-24 Thread Omar Sandoval
From: Omar Sandoval 

The Btrfs swap code is going to need it, so give it a btrfs_ prefix and
make it non-static.

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/volumes.c | 22 +++---
 fs/btrfs/volumes.h |  2 ++
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 33b3d329ebb9..6e1a89c6b362 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2789,8 +2789,8 @@ static int btrfs_del_sys_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
return ret;
 }
 
-static struct extent_map *get_chunk_map(struct btrfs_fs_info *fs_info,
-   u64 logical, u64 length)
+struct extent_map *btrfs_get_chunk_map(struct btrfs_fs_info *fs_info,
+  u64 logical, u64 length)
 {
struct extent_map_tree *em_tree;
struct extent_map *em;
@@ -2827,7 +2827,7 @@ int btrfs_remove_chunk(struct btrfs_trans_handle *trans,
int i, ret = 0;
struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
 
-   em = get_chunk_map(fs_info, chunk_offset, 1);
+   em = btrfs_get_chunk_map(fs_info, chunk_offset, 1);
if (IS_ERR(em)) {
/*
 * This is a logic error, but we don't want to just rely on the
@@ -4962,7 +4962,7 @@ int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans,
int i = 0;
int ret = 0;
 
-   em = get_chunk_map(fs_info, chunk_offset, chunk_size);
+   em = btrfs_get_chunk_map(fs_info, chunk_offset, chunk_size);
if (IS_ERR(em))
return PTR_ERR(em);
 
@@ -5105,7 +5105,7 @@ int btrfs_chunk_readonly(struct btrfs_fs_info *fs_info, u64 chunk_offset)
int miss_ndevs = 0;
int i;
 
-   em = get_chunk_map(fs_info, chunk_offset, 1);
+   em = btrfs_get_chunk_map(fs_info, chunk_offset, 1);
if (IS_ERR(em))
return 1;
 
@@ -5165,7 +5165,7 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
struct map_lookup *map;
int ret;
 
-   em = get_chunk_map(fs_info, logical, len);
+   em = btrfs_get_chunk_map(fs_info, logical, len);
if (IS_ERR(em))
/*
 * We could return errors for these cases, but that could get
@@ -5211,7 +5211,7 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info,
struct map_lookup *map;
unsigned long len = fs_info->sectorsize;
 
-   em = get_chunk_map(fs_info, logical, len);
+   em = btrfs_get_chunk_map(fs_info, logical, len);
 
if (!WARN_ON(IS_ERR(em))) {
map = em->map_lookup;
@@ -5228,7 +5228,7 @@ int btrfs_is_parity_mirror(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
struct map_lookup *map;
int ret = 0;
 
-   em = get_chunk_map(fs_info, logical, len);
+   em = btrfs_get_chunk_map(fs_info, logical, len);
 
if(!WARN_ON(IS_ERR(em))) {
map = em->map_lookup;
@@ -5387,7 +5387,7 @@ static int __btrfs_map_block_for_discard(struct btrfs_fs_info *fs_info,
/* discard always return a bbio */
ASSERT(bbio_ret);
 
-   em = get_chunk_map(fs_info, logical, length);
+   em = btrfs_get_chunk_map(fs_info, logical, length);
if (IS_ERR(em))
return PTR_ERR(em);
 
@@ -5713,7 +5713,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
return __btrfs_map_block_for_discard(fs_info, logical,
 *length, bbio_ret);
 
-   em = get_chunk_map(fs_info, logical, *length);
+   em = btrfs_get_chunk_map(fs_info, logical, *length);
if (IS_ERR(em))
return PTR_ERR(em);
 
@@ -6012,7 +6012,7 @@ int btrfs_rmap_block(struct btrfs_fs_info *fs_info, u64 chunk_start,
u64 rmap_len;
int i, j, nr = 0;
 
-   em = get_chunk_map(fs_info, chunk_start, 1);
+   em = btrfs_get_chunk_map(fs_info, chunk_start, 1);
if (IS_ERR(em))
return -EIO;
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 5139ec8daf4c..d3dedfd1324b 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -467,6 +467,8 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info,
 int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans,
struct btrfs_fs_info *fs_info,
u64 chunk_offset, u64 chunk_size);
+struct extent_map *btrfs_get_chunk_map(struct btrfs_fs_info *fs_info,
+  u64 logical, u64 length);
 int btrfs_remove_chunk(struct btrfs_trans_handle *trans,
   struct btrfs_fs_info *fs_info, u64 chunk_offset);
 
-- 
2.17.0



[RFC PATCH v4 3/6] Btrfs: push EXCL_OP set into btrfs_rm_device()

2018-05-24 Thread Omar Sandoval
From: Omar Sandoval 

btrfs_ioctl_rm_dev() and btrfs_ioctl_rm_dev_v2() both manipulate this
bit. Let's move it into the common btrfs_rm_device(), which also makes
the following change to deal with swap files easier.
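
The refactoring can be pictured with a small userspace model: the exclusive-operation bit is taken and released inside the common helper, so both ioctl entry points become thin callers. This is a sketch only. The toy_* names and TOY_ERROR_EXCL_RUN_IN_PROGRESS are hypothetical stand-ins, and a C11 atomic_flag plays the role of the BTRFS_FS_EXCL_OP bit.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS; the value is arbitrary. */
#define TOY_ERROR_EXCL_RUN_IN_PROGRESS 1

struct toy_fs_info {
	atomic_flag excl_op;	/* models the BTRFS_FS_EXCL_OP bit */
};

/*
 * Common helper: take the exclusive-op flag, do the work, release the flag
 * on every exit path. work_result simulates the outcome of the real removal.
 */
static int toy_rm_device(struct toy_fs_info *fs_info, int work_result)
{
	int ret;

	assert(fs_info);
	if (atomic_flag_test_and_set(&fs_info->excl_op))
		return TOY_ERROR_EXCL_RUN_IN_PROGRESS;

	ret = work_result;	/* the real device removal would happen here */

	atomic_flag_clear(&fs_info->excl_op);
	return ret;
}

/* Both ioctl variants now just call the helper; no flag handling here. */
static int toy_ioctl_rm_dev(struct toy_fs_info *fs_info, int work_result)
{
	return toy_rm_device(fs_info, work_result);
}

static int toy_ioctl_rm_dev_v2(struct toy_fs_info *fs_info, int work_result)
{
	return toy_rm_device(fs_info, work_result);
}
```

Keeping the set and clear in one function means a later change, such as refusing removal while a swap file is active, only has to touch the helper.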

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/ioctl.c   | 13 -
 fs/btrfs/volumes.c |  4 
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 9af3be96099f..82be4a94334b 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3091,18 +3091,12 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
goto out;
}
 
-   if (test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags)) {
-   ret = BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
-   goto out;
-   }
-
if (vol_args->flags & BTRFS_DEVICE_SPEC_BY_ID) {
ret = btrfs_rm_device(fs_info, NULL, vol_args->devid);
} else {
vol_args->name[BTRFS_SUBVOL_NAME_MAX] = '\0';
ret = btrfs_rm_device(fs_info, vol_args->name, 0);
}
-   clear_bit(BTRFS_FS_EXCL_OP, &fs_info->flags);
 
if (!ret) {
if (vol_args->flags & BTRFS_DEVICE_SPEC_BY_ID)
@@ -3133,11 +3127,6 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
if (ret)
return ret;
 
-   if (test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags)) {
-   ret = BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
-   goto out_drop_write;
-   }
-
vol_args = memdup_user(arg, sizeof(*vol_args));
if (IS_ERR(vol_args)) {
ret = PTR_ERR(vol_args);
@@ -3151,8 +3140,6 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
btrfs_info(fs_info, "disk deleted %s", vol_args->name);
kfree(vol_args);
 out:
-   clear_bit(BTRFS_FS_EXCL_OP, &fs_info->flags);
-out_drop_write:
mnt_drop_write_file(file);
 
return ret;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b6757b53c297..9cfac177214f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1951,6 +1951,9 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
u64 num_devices;
int ret = 0;
 
+   if (test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags))
+   return BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
+
mutex_lock(&uuid_mutex);
 
num_devices = fs_devices->num_devices;
@@ -2069,6 +2072,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 
 out:
mutex_unlock(&uuid_mutex);
+   clear_bit(BTRFS_FS_EXCL_OP, &fs_info->flags);
return ret;
 
 error_undo:
-- 
2.17.0



Re: [RFC PATCH v4 4/6] Btrfs: prevent ioctls from interfering with a swap file

2018-05-25 Thread Omar Sandoval
On Fri, May 25, 2018 at 04:50:55PM +0200, David Sterba wrote:
> On Thu, May 24, 2018 at 02:41:28PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > When a swap file is active, we must make sure that the extents of the
> > file are not moved and that they don't become shared. That means that
> > the following are not safe:
> > 
> > - chattr +c (enable compression)
> > - reflink
> > - dedupe
> > - snapshot
> > - defrag
> > - balance
> > - device remove/replace/resize
> > 
> > Don't allow those to happen on an active swap file. Balance and device
> > remove/replace/resize in particular are disallowed entirely; in the
> > future, we can relax this so that relocation skips/errors out only on
> > chunks containing an active swap file.
> 
> Hm, disabling the entire balance is too intrusive. It's clear that the
> swapfile causes a lot of trouble when it goes against the dynamic
> capabilities of btrfs (relocation and the functionality that builds on
> it).
> 
> Skipping the swapfile extents should be implemented at minimum.

Sure thing, this should definitely be possible. For balance, we can skip
them; for resize or delete, it of course has to fail if it encounters
swap extents. I'll take a stab at it.

> We can
> add some heuristics that will group the swap extents to a small number
> of chunks so the impact of unmovable chunks is limited.
> 
> I haven't looked at the implementation, but it might be possible to
> internally find a different location for the swap extent once it's not
> used for the actual paged data.
> 
> In an ideal case, the swap extents could span entire chunks (1G) and not
> mix with regular data/metadata.
> 
> > Note that we don't have to worry about chattr -C (disable nocow), which
> > we ignore for non-empty files, because an active swapfile must be
> > non-empty and can't be truncated. We also don't have to worry about
> > autodefrag because it's only done on COW files. Truncate and fallocate
> > are already taken care of by the generic code. Device add doesn't do
> > relocation so it's not an issue, either.
> 
> Ok, fine, the remaining easy cases are covered.
> 
> I don't know if you mentioned that elsewhere (as design questions are
> in this patch), the allocation profile is single, or is it also possible
> to have striped or duplicated swap extents?

That's briefly mentioned in the last patch: only single data is
supported, although I think I can easily relax that to also allow RAID0.
Anything else is much harder to support, but we need to start somewhere.


Re: [RFC PATCH v4 5/6] Btrfs: rename get_chunk_map() and make it non-static

2018-05-25 Thread Omar Sandoval
On Fri, May 25, 2018 at 12:21:41PM +0300, Nikolay Borisov wrote:
> 
> 
> On 25.05.2018 00:41, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > The Btrfs swap code is going to need it, so give it a btrfs_ prefix and
> > make it non-static.
> > 
> > Signed-off-by: Omar Sandoval 
> 
> Reviewed-by: Nikolay Borisov 
> 
> nit: How about introducing proper kernel doc for this function now that
> it becomes public, just as good practice, so that eventually we will have
> proper kernel doc for all public interfaces? You could also mention that
> it needs a paired free_extent_map().

Will do, thanks.


Re: [RFC PATCH v4 6/6] Btrfs: support swap files

2018-05-25 Thread Omar Sandoval
On Fri, May 25, 2018 at 01:07:00PM +0300, Nikolay Borisov wrote:
> 
> 
> On 25.05.2018 00:41, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > Implement the swap file a_ops on Btrfs. Activation needs to make sure
> > that the file can be used as a swap file, which currently means it must
> > be fully allocated as nocow with no compression on one device. It also
> > sets up the swap extents directly with add_swap_extent(), so export it.
> > 
> > Signed-off-by: Omar Sandoval 
> 
> What testing (apart from the xfstest patches you sent) has this code
> seen?

Light testing with my swapme script [1] and btrfsck to make sure I
didn't swap to the wrong place. I was meaning to put this through
something more intensive like a kernel build, thanks for the reminder.
As opposed to the previous approach, the swapin/swapout paths are
simple, core code, so the edge cases I'm worried about are really in
activate/deactivate and other ioctls breaking things.

1: https://github.com/osandov/osandov-linux/blob/master/scripts/swapme.c

> Have you run it with lockdep enabled? (I'm asking because when I
> picked up your v3 there were quite a few deadlock warnings.) Also
> see some inline questions below.

Yup, I've been running it with lockdep, no warnings. The nice part of
this approach is that there's no new locking involved, just whatever the
swap code does itself.

> > ---
> >  fs/btrfs/inode.c | 220 +++
> >  mm/swapfile.c|   1 +
> >  2 files changed, 221 insertions(+)
> > 
> > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > index 9228cb866115..6cca8529e307 100644
> > --- a/fs/btrfs/inode.c
> > +++ b/fs/btrfs/inode.c
> > @@ -10526,6 +10526,224 @@ void btrfs_set_range_writeback(void *private_data, u64 start, u64 end)
> > }
> >  }
> >  
> 
> > +static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file,
> > +  sector_t *span)
> > +{
> > +   struct inode *inode = file_inode(file);
> > +   struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
> > +   struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
> > +   struct extent_state *cached_state = NULL;
> > +   struct extent_map *em = NULL;
> > +   struct btrfs_device *device = NULL;
> > +   struct btrfs_swap_info bsi = {
> > +   .lowest_ppage = (sector_t)-1ULL,
> > +   };
> > +   int ret = 0;
> > +   u64 isize = inode->i_size;
> > +   u64 start;
> > +
> > +   /*
> > +* The inode is locked, so these flags won't change after we check them.
> > +*/
> > +   if (BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS) {
> > +   btrfs_err(fs_info, "swapfile is compressed");
> > +   return -EINVAL;
> > +   }
> > +   if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW)) {
> > +   btrfs_err(fs_info, "swapfile is copy-on-write");
> > +   return -EINVAL;
> > +   }
> > +
> > +   /*
> > +* Balance or device remove/replace/resize can move stuff around from
> > +* under us. The EXCL_OP flag makes sure they aren't running/won't run
> > +* concurrently while we are mapping the swap extents, and the fs_info
> > +* nr_swapfiles counter prevents them from running while the swap file
> > +* is active and moving the extents. Note that this also prevents a
> > +* concurrent device add which isn't actually necessary, but it's not
> > +* really worth the trouble to allow it.
> > +*/
> > +   if (test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags))
> > +   return -EBUSY;
> > +   atomic_inc(&fs_info->nr_swapfiles);
> > +   /*
> > +* Snapshots can create extents which require COW even if NODATACOW is
> > +* set. We use this counter to prevent snapshots. We must increment it
> > +* before walking the extents because we don't want a concurrent
> > +* snapshot to run after we've already checked the extents.
> > +*/
> > +   atomic_inc(&BTRFS_I(inode)->root->nr_swapfiles);
> > +
> > +   lock_extent_bits(io_tree, 0, isize - 1, &cached_state);
> > +   start = 0;
> > +   while (start < isize) {
> > +   u64 end, logical_block_start, physical_block_start;
> > +   u64 len = isize - start;
> > +
> > +   em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, start, len, 0);
> > +   if (IS_ERR(em)) {
> > +   ret =
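
The comments in the quoted patch describe a two-stage gate: an exclusive-operation bit held while the swap extents are being mapped, and a counter that stays raised for as long as the swap file is active. The following is only a userspace sketch of that idea using C11 atomics. Every toy_* name is hypothetical, and both the point where the bit is dropped and the way a balance-like operation consults the counter are assumptions, since the quoted hunk is cut off before it shows either.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the two fs_info fields the patch touches. */
struct toy_fs_info {
	atomic_uint flags;       /* bit 0 plays the role of BTRFS_FS_EXCL_OP */
	atomic_int nr_swapfiles; /* number of active swap files */
};

#define TOY_EXCL_OP_BIT 0x1u

static void toy_fs_init(struct toy_fs_info *fs)
{
	atomic_init(&fs->flags, 0u);
	atomic_init(&fs->nr_swapfiles, 0);
}

/*
 * Claim the exclusive-operation bit, in the spirit of test_and_set_bit():
 * atomic_fetch_or() returns the previous value, so a set bit in the result
 * means another exclusive operation already owns it.
 */
static int toy_swap_activate_begin(struct toy_fs_info *fs)
{
	unsigned int old = atomic_fetch_or(&fs->flags, TOY_EXCL_OP_BIT);

	if (old & TOY_EXCL_OP_BIT)
		return -EBUSY;
	atomic_fetch_add(&fs->nr_swapfiles, 1);
	return 0;
}

/* Assumed: the bit is dropped after mapping, while the counter stays raised. */
static void toy_swap_activate_end(struct toy_fs_info *fs)
{
	assert(atomic_load(&fs->flags) & TOY_EXCL_OP_BIT);
	atomic_fetch_and(&fs->flags, ~TOY_EXCL_OP_BIT);
}

/* Assumed: a balance-like operation backs off while any swap file is active. */
static int toy_balance_begin(struct toy_fs_info *fs)
{
	unsigned int old = atomic_fetch_or(&fs->flags, TOY_EXCL_OP_BIT);

	if (old & TOY_EXCL_OP_BIT)
		return -EBUSY;
	if (atomic_load(&fs->nr_swapfiles) > 0) {
		atomic_fetch_and(&fs->flags, ~TOY_EXCL_OP_BIT);
		return -EBUSY;
	}
	return 0;
}
```

In this model the balance-like operation is refused both during the mapping phase (bit held) and afterwards (counter raised), which is the property the patch comment is after.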

Re: [PATCH 1/2] btrfs: kill btrfs_write_inode

2018-05-29 Thread Omar Sandoval
On Mon, May 28, 2018 at 06:57:59PM +0200, David Sterba wrote:
> On Tue, May 22, 2018 at 01:47:22PM -0400, Josef Bacik wrote:
> > From: Josef Bacik 
> > 
> > We don't actually need this.  It used to be in place for O_SYNC writes,
> > but we've used the normal fsync() path for that for years now.  The
> > other case we hit this is through sync(), which will commit the
> > transaction anyway.  All this does is make us commit the transaction a
> > bunch for no reason, and it could deadlock with delayed iput's.
> 
> In what way does it deadlock with delayed iput?

Here's an example stack trace:

[  +0.005066]  __schedule+0x38e/0x8c0
[  +0.007144]  schedule+0x36/0x80
[  +0.006447]  bit_wait+0x11/0x60
[  +0.006446]  __wait_on_bit+0xbe/0x110
[  +0.007487]  ? bit_wait_io+0x60/0x60
[  +0.007319]  __inode_wait_for_writeback+0x96/0xc0
[  +0.009568]  ? autoremove_wake_function+0x40/0x40
[  +0.009565]  inode_wait_for_writeback+0x21/0x30
[  +0.009224]  evict+0xb0/0x190
[  +0.006099]  iput+0x1a8/0x210
[  +0.006103]  btrfs_run_delayed_iputs+0x73/0xc0
[  +0.009047]  btrfs_commit_transaction+0x799/0x8c0
[  +0.009567]  btrfs_write_inode+0x81/0xb0
[  +0.008008]  __writeback_single_inode+0x267/0x320
[  +0.009569]  writeback_sb_inodes+0x25b/0x4e0
[  +0.008702]  wb_writeback+0x102/0x2d0
[  +0.007487]  wb_workfn+0xa4/0x310
[  +0.006794]  ? wb_workfn+0xa4/0x310
[  +0.007143]  process_one_work+0x150/0x410
[  +0.008179]  worker_thread+0x6d/0x520
[  +0.007490]  kthread+0x12c/0x160
[  +0.006620]  ? put_pwq_unlocked+0x80/0x80
[  +0.008185]  ? kthread_park+0xa0/0xa0
[  +0.007484]  ? do_syscall_64+0x53/0x150
[  +0.007837]  ret_from_fork+0x29/0x40

So writeback calls btrfs_write_inode(), which calls
btrfs_commit_transaction(), which calls btrfs_run_delayed_iputs(), which
calls iput() on the inode currently in btrfs_write_inode(), which calls
evict(), which waits for writeback on that same inode.
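
To make the call chain above concrete, here is a rough single-threaded model of it. It is a sketch under heavy simplification, not kernel code: all toy_* types and functions are made up for illustration, and instead of actually hanging, the model records when evict would have to wait for writeback that its own caller is responsible for finishing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct toy_inode {
	bool under_writeback; /* writeback of this inode is in progress */
	int refcount;
};

struct toy_ctx {
	struct toy_inode *writing; /* inode this context is writing back */
	struct toy_inode *delayed; /* one pending delayed iput, or NULL */
	bool would_deadlock;
};

/* Eviction has to wait for writeback on the inode to finish. */
static void toy_evict(struct toy_ctx *ctx, struct toy_inode *inode)
{
	/* Waiting for writeback that we ourselves are performing never ends. */
	if (inode->under_writeback && ctx->writing == inode)
		ctx->would_deadlock = true;
}

/* Dropping the last reference evicts the inode. */
static void toy_iput(struct toy_ctx *ctx, struct toy_inode *inode)
{
	assert(inode->refcount > 0);
	if (--inode->refcount == 0)
		toy_evict(ctx, inode);
}

/* The commit runs pending delayed iputs in the caller's context. */
static void toy_commit_transaction(struct toy_ctx *ctx)
{
	if (ctx->delayed != NULL) {
		struct toy_inode *inode = ctx->delayed;

		ctx->delayed = NULL;
		toy_iput(ctx, inode);
	}
}

/*
 * commit_in_write_inode selects whether the write_inode step commits the
 * transaction (the old behavior) or does nothing (the function removed).
 */
static bool toy_writeback_single_inode(struct toy_ctx *ctx,
				       struct toy_inode *inode,
				       bool commit_in_write_inode)
{
	ctx->writing = inode;
	inode->under_writeback = true;
	if (commit_in_write_inode)
		toy_commit_transaction(ctx);
	inode->under_writeback = false;
	ctx->writing = NULL;
	return ctx->would_deadlock;
}
```

With the commit in place and a delayed iput pending for the same inode, the model flags the self-wait. Without the commit, writeback finishes and the delayed iput is left for some other context to run.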
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: clean up error handling in btrfs_truncate()

2018-05-29 Thread Omar Sandoval
On Tue, May 22, 2018 at 09:59:50AM -0700, Omar Sandoval wrote:
> From: Omar Sandoval 
> 
> btrfs_truncate() uses two variables for error handling, ret and err (if
> this sounds familiar, it's because btrfs_truncate_inode_items() did
> something similar). This is error prone, as was made evident by "Btrfs:
> fix error handling in btrfs_truncate()". We only have err because we
> don't want to mask an error if we call btrfs_update_inode() and
> btrfs_end_transaction(), so let's make that its own scoped return
> variable and use ret everywhere else.
> 
> Reviewed-by: Nikolay Borisov 
> Signed-off-by: Omar Sandoval 
> ---
> This is the same as my v1 "Btrfs: fix error handling in
> btrfs_truncate()", but rebased on top of kdave/for-next + v2 of "Btrfs:
> fix error handling in btrfs_truncate()".
> 
>  fs/btrfs/inode.c | 33 ++---
>  1 file changed, 14 insertions(+), 19 deletions(-)

Dave, what are your thoughts on this one? Are you thinking 4.18 or 4.19?
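
For readers skimming the thread, the pattern the commit message describes (one ret for the whole function, plus a narrowly scoped variable for cleanup calls whose errors must not mask an earlier failure) looks roughly like this. This is a generic sketch, not the actual btrfs_truncate() code; struct toy_ops and its fields are hypothetical stubs standing in for the real calls.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stubs: the results the real calls would have returned. */
struct toy_ops {
	int work_result;   /* main operation */
	int update_result; /* inode update done during cleanup */
	int end_result;    /* ending the transaction */
};

static int toy_truncate(const struct toy_ops *ops)
{
	int ret;

	assert(ops);

	/* Main operation: its error, if any, is the one we want to report. */
	ret = ops->work_result;

	/* Cleanup always runs, even if the main operation failed. */
	{
		int ret2; /* scoped so it cannot leak into the rest of the function */

		ret2 = ops->update_result;
		if (ret2 && !ret)
			ret = ret2;

		ret2 = ops->end_result;
		if (ret2 && !ret)
			ret = ret2;
	}

	return ret;
}
```

The first error wins: a cleanup failure is reported only when nothing failed before it.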


Re: [PATCH 1/3] fs: add initial bh_result->b_private value to __blockdev_direct_IO()

2018-06-29 Thread Omar Sandoval
On Mon, Jun 25, 2018 at 07:16:38PM +0200, David Sterba wrote:
> On Mon, May 14, 2018 at 06:35:48PM +0200, David Sterba wrote:
> > On Fri, May 11, 2018 at 01:30:01PM -0700, Omar Sandoval wrote:
> > > On Fri, May 11, 2018 at 09:05:38PM +0100, Al Viro wrote:
> > > > On Thu, May 10, 2018 at 11:30:10PM -0700, Omar Sandoval wrote:
> > > > >  do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
> > > > > struct block_device *bdev, struct iov_iter *iter,
> > > > > get_block_t get_block, dio_iodone_t end_io,
> > > > > -   dio_submit_t submit_io, int flags)
> > > > > +   dio_submit_t submit_io, int flags, void *private)
> > > > 
> > > > Oh, dear...  That's what, 9 arguments?  I agree that the hack in
> > > > question is obscene, but so is this ;-/
> > > 
> > > So looking at these one by one, obviously needed:
> > > 
> > > - iocb
> > > - inode
> > > - iter
> > > 
> > > bdev is almost always inode->i_sb->s_bdev, except for Btrfs :(
> > > 
> > > These could _maybe_ go in struct kiocb:
> > > 
> > > - flags could maybe be folded into ki_flags
> > > - private could maybe go in iocb->private, but I haven't yet read
> > >   through to figure out if we're already using iocb->private for direct
> > >   I/O
> > 
> > I think kiocb::private can be used for the purpose. There's only one
> > user, ext4, that also passes some DIO data around, so it would be in
> > line with the interface AFAICS.
> 
> Omar, do you have an update to the patchset? Thanks.

Al, what do you think of changing all users of map_bh->b_private to use
iocb->private? We'd have to pass the iocb to get_block() and
submit_io(), but we could get rid of dio->private.
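
As a rough illustration of the idea (not the real direct I/O interface), passing the per-request context through a private slot on the control block keeps the generic entry point's argument list short while still letting the filesystem callback reach its own data. All names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of an I/O control block with a private slot. */
struct toy_iocb {
	long pos;
	void *private; /* owned by whoever issued the I/O */
};

/* The callback receives the iocb instead of a separate private argument. */
typedef int (*toy_get_block_t)(struct toy_iocb *iocb, long block);

/* Filesystem-specific state that used to need its own parameter. */
struct toy_fs_dio_data {
	long reserved; /* blocks reserved up front */
};

static int toy_fs_get_block(struct toy_iocb *iocb, long block)
{
	struct toy_fs_dio_data *data = iocb->private;

	assert(data != NULL);
	(void)block;
	data->reserved -= 1; /* consume one reserved block per mapping */
	return 0;
}

/* The generic layer never looks inside iocb->private. */
static int toy_direct_io(struct toy_iocb *iocb, long nr_blocks,
			 toy_get_block_t get_block)
{
	for (long i = 0; i < nr_blocks; i++) {
		int ret = get_block(iocb, iocb->pos + i);

		if (ret)
			return ret;
	}
	return 0;
}
```

The cost, as noted above, is that every callback signature has to change to take the iocb.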


Re: [PATCH v2] btrfs: use customized batch size for total_bytes_pinned

2018-07-12 Thread Omar Sandoval
On Wed, Jul 11, 2018 at 11:59:36PM +0800, Ethan Lien wrote:
> In commit b150a4f10d878 ("Btrfs: use a percpu to keep track of possibly
> pinned bytes") we use total_bytes_pinned to track how many bytes we are
> going to free in this transaction. When we are close to ENOSPC, we check it
> to know whether committing the current transaction would let the
> allocation succeed. For every data/metadata extent we are going to free,
> we add to total_bytes_pinned in btrfs_free_extent() and
> btrfs_free_tree_block(), and release it in unpin_extent_range() when we
> finish the transaction. So this is a variable we update frequently but
> read rarely, which is exactly the case percpu_counter suits. But that
> commit updates total_bytes_pinned with the default batch size of 32,
> making every update essentially a spin lock protected update. Since every
> spin lock/unlock operation involves syncing a globally used variable and
> some kind of barrier on an SMP system, this is more expensive than using
> total_bytes_pinned as a simple atomic64_t. So fix this by using a
> customized batch size. Since we only read total_bytes_pinned when we are
> close to ENOSPC and fail to allocate a new chunk, we can use a really
> large batch size and have nearly no penalty in most cases.
> 
> 
> [Test]
> We test the patch on a 4-core x86 machine:
> 1. fallocate a 16GiB test file.
> 2. take a snapshot (so all following writes will be COW writes).
> 3. run a 180 sec, 4 jobs, 4K random write fio on the test file.
> 
> We also add a temporary lockdep class on percpu_counter's spin lock used
> by total_bytes_pinned to track lock_stat.
> 
> 
> [Results]
> lock_stat version 0.4, class name total_bytes_pinned_percpu:
>
>                       unpatched       patched
>   con-bounces                82             1
>   contentions                82             1
>   waittime-min             0.21          0.62
>   waittime-max             0.61          0.62
>   waittime-total          29.46          0.62
>   waittime-avg             0.36          0.62
>   acq-bounces            298340         13601
>   acquisitions           635973         31542
>   holdtime-min             0.09          0.14
>   holdtime-max            11.01          9.61
>   holdtime-total      173476.25      11016.90
>   holdtime-avg             0.27          0.35
> 
> 
> [Analysis]
> Since the spin lock only protects a single in-memory variable, the
> contentions (number of lock acquisitions that had to wait) in both the
> unpatched and patched versions are low. But when we look at acquisitions
> and acq-bounces, we get much lower counts in the patched version. Here the
> most important metric is acq-bounces. It counts how many times the lock
> gets transferred between different cpus, so the patch can really reduce
> cacheline bouncing of the spin lock (and of the global counter of
> percpu_counter) on an SMP system.
> 
> Fixes: b150a4f10d878 ("Btrfs: use a percpu to keep track of possibly
> pinned bytes")
> 
> Signed-off-by: Ethan Lien 
> ---
> 
> V2:
>   Rewrite commit comments.
>   Add lock_stat test.
>   Pull dirty_metadata_bytes out to a separate patch.
> 
>  fs/btrfs/ctree.h   |  1 +
>  fs/btrfs/extent-tree.c | 46 --
>  2 files changed, 32 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 118346aceea9..df682a521635 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -422,6 +422,7 @@ struct btrfs_space_info {
>* time the transaction commits.
>*/
>   struct percpu_counter total_bytes_pinned;
> + s32 total_bytes_pinned_batch;

Can this just be a constant instead of adding it to space_info?
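
For anyone following along, here is a small userspace model of why the batch size matters so much in this case. It mimics the shape of a batched per-CPU counter: each add goes to a per-CPU delta, and the delta is folded into the shared count (the step that takes the spin lock in the kernel) only once its magnitude reaches the batch. This is a sketch, not the kernel's percpu_counter; the toy_* names are made up, and TOY_PINNED_BATCH is a hypothetical value picked purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 4

/* Hypothetical constant; the value is an assumption for illustration only. */
#define TOY_PINNED_BATCH (1 << 20)

struct toy_percpu_counter {
	int64_t count;                /* shared count, lock-protected in the kernel */
	int32_t local[TOY_NR_CPUS];   /* per-CPU deltas, updated without the lock */
	unsigned long global_updates; /* how often the shared count was touched */
};

/* Add to the per-CPU delta; fold into the shared count at the batch limit. */
static void toy_counter_add_batch(struct toy_percpu_counter *c, int cpu,
				  int64_t amount, int32_t batch)
{
	int64_t sum;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	sum = c->local[cpu] + amount;
	if (sum >= batch || sum <= -batch) {
		c->count += sum; /* this is the lock-protected path */
		c->local[cpu] = 0;
		c->global_updates++;
	} else {
		c->local[cpu] = (int32_t)sum;
	}
}

/* Precise read: shared count plus every per-CPU delta. */
static int64_t toy_counter_sum(const struct toy_percpu_counter *c)
{
	int64_t sum = c->count;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}

/* Spread nr_adds equal updates round-robin over the CPUs. */
static void toy_run(struct toy_percpu_counter *c, int nr_adds,
		    int64_t amount, int32_t batch)
{
	for (int i = 0; i < nr_adds; i++)
		toy_counter_add_batch(c, i % TOY_NR_CPUS, amount, batch);
}
```

Because the amounts here are byte counts of at least 4K, every add exceeds a batch of 32 and goes straight to the shared count. With a large batch, most adds stay per-CPU, and the precise sum is still available for the rare ENOSPC read.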


[PATCH RESEND 0/2] btrfs-progs: build improvements

2018-07-12 Thread Omar Sandoval
From: Omar Sandoval 

Hi, Dave,

This is a resend of a couple of patches I sent back in April, rebased on
the current devel branch. Patch 1 cleans up some stale build targets,
and patch 2 makes the btrfs-progs build more flexible, so that it's
possible to pick and choose what gets built. Please consider these for
the next progs release.

Thanks!

Omar Sandoval (2):
  btrfs-progs: remove stale dir-test and quick-test
  btrfs-progs: make all programs and libraries optional

 Makefile| 139 -
 Makefile.inc.in |  16 +-
 configure.ac| 138 +++--
 dir-test.c  | 518 
 quick-test.c| 226 -
 5 files changed, 227 insertions(+), 810 deletions(-)
 delete mode 100644 dir-test.c
 delete mode 100644 quick-test.c

-- 
2.18.0



[PATCH RESEND 1/2] btrfs-progs: remove stale dir-test and quick-test

2018-07-12 Thread Omar Sandoval
From: Omar Sandoval 

These don't build anymore and don't appear to be used for anything.

Signed-off-by: Omar Sandoval 
---
 Makefile |  10 +-
 dir-test.c   | 518 ---
 quick-test.c | 226 --
 3 files changed, 1 insertion(+), 753 deletions(-)
 delete mode 100644 dir-test.c
 delete mode 100644 quick-test.c

diff --git a/Makefile b/Makefile
index 544410e6..62102baf 100644
--- a/Makefile
+++ b/Makefile
@@ -495,14 +495,6 @@ btrfs-convert.static: $(static_convert_objects) 
$(static_objects) $(static_libbt
@echo "[LD] $@"
$(Q)$(CC) -o $@ $^ $(STATIC_LDFLAGS) $(btrfs_convert_libs) 
$(STATIC_LIBS)
 
-dir-test: dir-test.o $(objects) $(libs)
-   @echo "[LD] $@"
-   $(Q)$(CC) -o $@ $^ $(LDFLAGS) $(LIBS)
-
-quick-test: quick-test.o $(objects) $(libs)
-   @echo "[LD] $@"
-   $(Q)$(CC) -o $@ $^ $(LDFLAGS) $(LIBS)
-
 ioctl-test.o: ioctl-test.c ioctl.h kerncompat.h ctree.h
@echo "[CC]   $@"
$(Q)$(CC) $(CFLAGS) -c $< -o $@
@@ -603,7 +595,7 @@ clean: $(CLEANDIRS)
image/*.o image/*.o.d \
convert/*.o convert/*.o.d \
mkfs/*.o mkfs/*.o.d check/*.o check/*.o.d \
- dir-test ioctl-test quick-test library-test library-test-static \
+ ioctl-test library-test library-test-static \
   mktables btrfs.static mkfs.btrfs.static fssum \
  $(check_defs) \
  $(libs) $(lib_links) \
diff --git a/dir-test.c b/dir-test.c
deleted file mode 100644
index cfb77f2a..
--- a/dir-test.c
+++ /dev/null
@@ -1,518 +0,0 @@
-/*
- * Copyright (C) 2007 Oracle.  All rights reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License v2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this program; if not, write to the
- * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
- * Boston, MA 021110-1307, USA.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include "kerncompat.h"
-#include "radix-tree.h"
-#include "ctree.h"
-#include "disk-io.h"
-#include "print-tree.h"
-#include "hash.h"
-#include "transaction.h"
-
-int keep_running = 1;
-struct btrfs_super_block super;
-static u64 dir_oid = 0;
-static u64 file_oid = 33778;
-
-static int find_num(struct radix_tree_root *root, unsigned long *num_ret,
-int exists)
-{
-   unsigned long num = rand();
-   unsigned long res[2];
-   int ret;
-
-again:
-   ret = radix_tree_gang_lookup(root, (void **)res, num, 2);
-   if (exists) {
-   if (ret == 0)
-   return -1;
-   num = res[0];
-   } else if (ret != 0 && num == res[0]) {
-   num++;
-   if (ret > 1 && num == res[1]) {
-   num++;
-   goto again;
-   }
-   }
-   *num_ret = num;
-   return 0;
-}
-
-static void initial_inode_init(struct btrfs_root *root,
-  struct btrfs_inode_item *inode_item)
-{
-   memset(inode_item, 0, sizeof(*inode_item));
-   btrfs_set_inode_generation(inode_item, root->fs_info->generation);
-   btrfs_set_inode_mode(inode_item, S_IFREG | 0700);
-}
-
-static int ins_one(struct btrfs_trans_handle *trans, struct btrfs_root *root,
-  struct radix_tree_root *radix)
-{
-   int ret;
-   char buf[128];
-   unsigned long oid;
-   u64 objectid;
-   struct btrfs_path path;
-   struct btrfs_key inode_map;
-   struct btrfs_inode_item inode_item;
-
-   find_num(radix, &oid, 0);
-   sprintf(buf, "str-%lu", oid);
-
-   ret = btrfs_find_free_objectid(trans, root, dir_oid + 1, &objectid);
-   if (ret)
-   goto error;
-
-   inode_map.objectid = objectid;
-   inode_map.flags = 0;
-   inode_map.type = BTRFS_INODE_ITEM_KEY;
-   inode_map.offset = 0;
-
-   initial_inode_init(root, &inode_item);
-   ret = btrfs_insert_inode(trans, root, objectid, &inode_item);
-   if (ret)
-   goto error;
-   ret = btrfs_insert_dir_item(trans, root, buf, strlen(buf), dir_oid,
-   &inode_map, BTRFS_FT_UNKNOWN);
-   if (ret)
-   goto error;
-
-   radix_tree_preload(GFP_KERNEL);
-   ret = radix_tree_insert(radix, oi

[PATCH RESEND 2/2] btrfs-progs: make all programs and libraries optional

2018-07-12 Thread Omar Sandoval
From: Omar Sandoval 

We have a build system internally which only needs to build the
libraries out of a repository, not any binaries. I looked at how this
works with other projects, and the best example was util-linux, which
makes it possible to enable or disable everything individually. This is
nice and really flexible, so let's do the same. This way, if you only
want to build and install libbtrfsutil, you can simply do

  ./configure --disable-documentation --disable-all-programs --enable-libbtrfsutil
  make
  make install

Signed-off-by: Omar Sandoval 
---
 Makefile| 129 +++-
 Makefile.inc.in |  16 +-
 configure.ac| 138 +---
 3 files changed, 226 insertions(+), 57 deletions(-)

diff --git a/Makefile b/Makefile
index 62102baf..fe71b694 100644
--- a/Makefile
+++ b/Makefile
@@ -206,22 +206,40 @@ endif
 
 MAKEOPTS = --no-print-directory Q=$(Q)
 
-# build all by default
-progs = $(progs_install) btrfsck btrfs-corrupt-block
-
-# install only selected
-progs_install = btrfs mkfs.btrfs btrfs-map-logical btrfs-image \
-   btrfs-find-root btrfstune \
-   btrfs-select-super
-
-# other tools, not built by default
-progs_extra = btrfs-fragments
-
-progs_static = $(foreach p,$(progs),$(p).static)
-
-ifneq ($(DISABLE_BTRFSCONVERT),1)
+ifeq ($(BUILD_BTRFS),1)
+progs_install += btrfs
+progs += btrfsck
+endif
+ifeq ($(BUILD_CONVERT),1)
 progs_install += btrfs-convert
 endif
+ifeq ($(BUILD_CORRUPT_BLOCK),1)
+progs += btrfs-corrupt-block
+endif
+ifeq ($(BUILD_FIND_ROOT),1)
+progs_install += btrfs-find-root
+endif
+ifeq ($(BUILD_FRAGMENTS),1)
+progs += btrfs-fragments
+endif
+ifeq ($(BUILD_IMAGE),1)
+progs_install += btrfs-image
+endif
+ifeq ($(BUILD_MAP_LOGICAL),1)
+progs_install += btrfs-map-logical
+endif
+ifeq ($(BUILD_MKFS),1)
+progs_install += mkfs.btrfs
+endif
+ifeq ($(BUILD_SELECT_SUPER),1)
+progs_install += btrfs-select-super
+endif
+ifeq ($(BUILD_TUNE),1)
+progs_install += btrfstune
+endif
+
+progs += $(progs_install)
+progs_static = $(foreach p,$(progs),$(p).static)
 
 # external libs required by various binaries; for btrfs-foo,
 # specify btrfs_foo_libs = ; see $($(subst...)) rules below
@@ -233,7 +251,7 @@ cmds_restore_cflags = 
-DBTRFSRESTORE_ZSTD=$(BTRFSRESTORE_ZSTD)
 CHECKER_FLAGS += $(btrfs_convert_cflags)
 
 # collect values of the variables above
-standalone_deps = $(foreach dep,$(patsubst %,%_objects,$(subst -,_,$(filter btrfs-%, $(progs) $(progs_extra)))),$($(dep)))
+standalone_deps = $(foreach dep,$(patsubst %,%_objects,$(subst -,_,$(filter btrfs-%, $(progs)))),$($(dep)))
 
 SUBDIRS =
 BUILDDIRS = $(patsubst %,build-%,$(SUBDIRS))
@@ -262,10 +280,21 @@ static_convert_objects = $(patsubst %.o, %.static.o, 
$(convert_objects))
 static_mkfs_objects = $(patsubst %.o, %.static.o, $(mkfs_objects))
 static_image_objects = $(patsubst %.o, %.static.o, $(image_objects))
 
-libs_shared = libbtrfs.so.0.1 libbtrfsutil.so.$(libbtrfsutil_version)
-libs_static = libbtrfs.a libbtrfsutil.a
+ifeq ($(BUILD_LIBBTRFS),1)
+ifeq ($(BUILD_SHARED),1)
+libs_shared += libbtrfs.so.0.1
+lib_links += libbtrfs.so.0 libbtrfs.so
+endif
+libs_static += libbtrfs.a
+endif
+ifeq ($(BUILD_LIBBTRFSUTIL),1)
+ifeq ($(BUILD_SHARED),1)
+libs_shared += libbtrfsutil.so.$(libbtrfsutil_version)
+lib_links += libbtrfsutil.so.$(libbtrfsutil_major) libbtrfsutil.so
+endif
+libs_static += libbtrfsutil.a
+endif
 libs = $(libs_shared) $(libs_static)
-lib_links = libbtrfs.so.0 libbtrfs.so libbtrfsutil.so.$(libbtrfsutil_major) libbtrfsutil.so
 
 # make C=1 to enable sparse
 ifdef C
@@ -303,7 +332,7 @@ endif
$($(subst -,_,btrfs-$(@:%/$(notdir $@)=%)-cflags))
 
 all: $(progs) $(libs) $(lib_links) $(BUILDDIRS)
-ifeq ($(PYTHON_BINDINGS),1)
+ifeq ($(BUILD_PYTHON),1)
 all: libbtrfsutil_python
 endif
 $(SUBDIRS): $(BUILDDIRS)
@@ -353,7 +382,7 @@ testsuite: btrfs-corrupt-block fssum
@echo "Export tests as a package"
$(Q)cd tests && ./export-testsuite.sh
 
-ifeq ($(PYTHON_BINDINGS),1)
+ifeq ($(BUILD_PYTHON),1)
 test-libbtrfsutil: libbtrfsutil_python mkfs.btrfs
$(Q)cd libbtrfsutil/python; \
LD_LIBRARY_PATH=../.. $(PYTHON) -m unittest discover -v tests
@@ -413,7 +442,7 @@ libbtrfsutil.so.$(libbtrfsutil_major) libbtrfsutil.so: 
libbtrfsutil.so.$(libbtrf
@echo "[LN] $@"
$(Q)$(LN_S) -f $< $@
 
-ifeq ($(PYTHON_BINDINGS),1)
+ifeq ($(BUILD_PYTHON),1)
 libbtrfsutil_python: libbtrfsutil.so.$(libbtrfsutil_major) libbtrfsutil.so libbtrfsutil/btrfsutil.h
@echo "[PY] libbtrfsutil"
$(Q)cd libbtrfsutil/python; \
@@ -439,14 +468,14 @@ btrfs-%.static: btrfs-%.static.o $(static_objects) 
$(patsubst %.o,%.static.o,$(s
$(static_libbtrfs_objects) $(STATIC_LDFLAGS) \
$($(subst -,_,$(subst .static,,$@)-libs)) $(STATIC_LIBS)
 
-btrfs-%: btrfs-%.o $(objects) $(standalone_deps) $(libs_static)
+b

Re: [PATCH RESEND 2/2] btrfs-progs: make all programs and libraries optional

2018-07-16 Thread Omar Sandoval
On Mon, Jul 16, 2018 at 04:56:57PM +0200, David Sterba wrote:
> On Thu, Jul 12, 2018 at 04:11:19PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval 
> > 
> > We have a build system internally which only needs to build the
> > libraries out of a repository, not any binaries. I looked at how this
> > works with other projects, and the best example was util-linux, which
> > makes it possible to enable or disable everything individually. This is
> > nice and really flexible, so let's do the same. This way, if you only
> > want to build and install libbtrfsutil, you can simply do
> > 
> >   ./configure --disable-documentation --disable-all-programs --enable-libbtrfsutil
> >   make
> >   make install
> 
> I think this is an overkill and abusing the --enable-XXX options.  You
> want to avoid building the tools by default, so adding an option for
> that is fine. Selectively building only certain tools can utilize that
> option too and just follow with 'make btrfs-image' etc.

Yeah, it's easy to build stuff selectively, but `make install` will
still try to build everything, that's the part I'm more concerned with.

> The number of --enable-* will stay minimal and we don't even have to
> discuss how to find a good naming scheme (that works for util-linux but
> looks a bit confusing for btrfs-progs).

Ok, I can collapse these into just --disable-programs/--enable-programs,
and --disable-libraries/--enable-libraries? That would be enough for me.


Re: [PATCH 1/2] btrfs: kill btrfs_write_inode

2018-07-20 Thread Omar Sandoval
On Fri, Jul 20, 2018 at 01:48:01PM +0200, David Sterba wrote:
> On Thu, May 31, 2018 at 11:49:28AM +0200, David Sterba wrote:
> > On Tue, May 29, 2018 at 12:17:42PM -0700, Omar Sandoval wrote:
> > > On Mon, May 28, 2018 at 06:57:59PM +0200, David Sterba wrote:
> > > > On Tue, May 22, 2018 at 01:47:22PM -0400, Josef Bacik wrote:
> > > > > From: Josef Bacik 
> > > > > 
> > > > > We don't actually need this.  It used to be in place for O_SYNC
> > > > > writes, but we've used the normal fsync() path for that for years
> > > > > now.  The
> > > > > other case we hit this is through sync(), which will commit the
> > > > > transaction anyway.  All this does is make us commit the transaction a
> > > > > bunch for no reason, and it could deadlock with delayed iput's.
> > > > 
> > > > In what way does it deadlock with delayed iput?
> > > 
> > > Here's an example stack trace:
> > 
> > Aha, so that's an actual bugfix. The changelog should have been stated
> > the other way around: there's a deadlock scenario and can be fixed by
> > removing the whole function because there's another mechanism to achieve
> > the same O_SYNC behaviour. Please update and resend.
> 
> Ping, update and resend if you want this patch to go to 4.19. Thanks.

Josef just left for vacation, but I can resend this with a better
description.


[PATCH] Btrfs: fix btrfs_write_inode() vs delayed iput deadlock

2018-07-20 Thread Omar Sandoval
From: Josef Bacik 

We recently ran into the following deadlock involving
btrfs_write_inode():

[  +0.005066]  __schedule+0x38e/0x8c0
[  +0.007144]  schedule+0x36/0x80
[  +0.006447]  bit_wait+0x11/0x60
[  +0.006446]  __wait_on_bit+0xbe/0x110
[  +0.007487]  ? bit_wait_io+0x60/0x60
[  +0.007319]  __inode_wait_for_writeback+0x96/0xc0
[  +0.009568]  ? autoremove_wake_function+0x40/0x40
[  +0.009565]  inode_wait_for_writeback+0x21/0x30
[  +0.009224]  evict+0xb0/0x190
[  +0.006099]  iput+0x1a8/0x210
[  +0.006103]  btrfs_run_delayed_iputs+0x73/0xc0
[  +0.009047]  btrfs_commit_transaction+0x799/0x8c0
[  +0.009567]  btrfs_write_inode+0x81/0xb0
[  +0.008008]  __writeback_single_inode+0x267/0x320
[  +0.009569]  writeback_sb_inodes+0x25b/0x4e0
[  +0.008702]  wb_writeback+0x102/0x2d0
[  +0.007487]  wb_workfn+0xa4/0x310
[  +0.006794]  ? wb_workfn+0xa4/0x310
[  +0.007143]  process_one_work+0x150/0x410
[  +0.008179]  worker_thread+0x6d/0x520
[  +0.007490]  kthread+0x12c/0x160
[  +0.006620]  ? put_pwq_unlocked+0x80/0x80
[  +0.008185]  ? kthread_park+0xa0/0xa0
[  +0.007484]  ? do_syscall_64+0x53/0x150
[  +0.007837]  ret_from_fork+0x29/0x40

Writeback calls btrfs_write_inode(), which calls
btrfs_commit_transaction(), which calls btrfs_run_delayed_iputs(). If
iput() is called on that same inode, evict() will wait for writeback
forever.

btrfs_write_inode() was originally added way back in 4730a4bc5bf3
("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode()
hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so
btrfs_write_inode() is actually unnecessary (and leads to a bunch of
unnecessary commits). Get rid of it, which also gets rid of the
deadlock.

Signed-off-by: Josef Bacik 
[Omar: new commit message]
Signed-off-by: Omar Sandoval 
---
 fs/btrfs/inode.c | 26 --
 fs/btrfs/super.c |  1 -
 2 files changed, 27 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index eba61bcb9bb3..071d949f69ec 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6027,32 +6027,6 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
return ret;
 }
 
-int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc)
-{
-   struct btrfs_root *root = BTRFS_I(inode)->root;
-   struct btrfs_trans_handle *trans;
-   int ret = 0;
-   bool nolock = false;
-
-   if (test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags))
-   return 0;
-
-   if (btrfs_fs_closing(root->fs_info) &&
-   btrfs_is_free_space_inode(BTRFS_I(inode)))
-   nolock = true;
-
-   if (wbc->sync_mode == WB_SYNC_ALL) {
-   if (nolock)
-   trans = btrfs_join_transaction_nolock(root);
-   else
-   trans = btrfs_join_transaction(root);
-   if (IS_ERR(trans))
-   return PTR_ERR(trans);
-   ret = btrfs_commit_transaction(trans);
-   }
-   return ret;
-}
-
 /*
  * This is somewhat expensive, updating the tree every time the
  * inode changes.  But, it is most likely to find the inode in cache.
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 81107ad49f3a..bddfc28b27c0 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2331,7 +2331,6 @@ static const struct super_operations btrfs_super_ops = {
.sync_fs= btrfs_sync_fs,
.show_options   = btrfs_show_options,
.show_devname   = btrfs_show_devname,
-   .write_inode= btrfs_write_inode,
.alloc_inode= btrfs_alloc_inode,
.destroy_inode  = btrfs_destroy_inode,
.statfs = btrfs_statfs,
-- 
2.18.0



[PATCH v2 0/2] btrfs-progs: build improvements

2018-07-26 Thread Omar Sandoval
From: Omar Sandoval 

Hi, Dave,

Here's v2 of "btrfs-progs: make all programs and libraries optional",
this time much less overkill. Now, it's just --disable-programs,
--disable-shared, and --disable-static. Based on your devel branch.
Please consider these for the next progs release.

Thanks!

Omar Sandoval (2):
  btrfs-progs: add --disable-programs
  btrfs-progs: add --disable-shared and --disable-static

 Makefile| 56 +
 Makefile.inc.in |  3 +++
 configure.ac| 28 +++--
 3 files changed, 67 insertions(+), 20 deletions(-)

-- 
2.18.0



[PATCH v2 2/2] btrfs-progs: add --disable-shared and --disable-static

2018-07-26 Thread Omar Sandoval
From: Omar Sandoval 

The build system mentioned in the previous commit builds libraries in
both PIC and non-PIC mode. Shared libraries don't work in non-PIC mode,
so it expects a --disable-shared configure option, which most open
source libraries using autoconf have. Let's add it, too.

Signed-off-by: Omar Sandoval 
---
 Makefile| 17 ++---
 Makefile.inc.in |  2 ++
 configure.ac| 18 +-
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 24aa8234..f4ab14ea 100644
--- a/Makefile
+++ b/Makefile
@@ -275,6 +275,13 @@ libs_shared = libbtrfs.so.0.1 
libbtrfsutil.so.$(libbtrfsutil_version)
 libs_static = libbtrfs.a libbtrfsutil.a
 libs = $(libs_shared) $(libs_static)
 lib_links = libbtrfs.so.0 libbtrfs.so libbtrfsutil.so.$(libbtrfsutil_major) libbtrfsutil.so
+libs_build =
+ifeq ($(BUILD_SHARED_LIBRARIES),1)
+libs_build += $(libs_shared) $(lib_links)
+endif
+ifeq ($(BUILD_STATIC_LIBRARIES),1)
+libs_build += $(libs_static)
+endif
 
 # make C=1 to enable sparse
 ifdef C
@@ -311,7 +318,7 @@ endif
$(Q)$(CC) $(STATIC_CFLAGS) -c $< -o $@ $($(subst 
-,_,$(@:%.static.o=%)-cflags)) \
$($(subst -,_,btrfs-$(@:%/$(notdir $@)=%)-cflags))
 
-all: $(progs_build) $(libs) $(lib_links) $(BUILDDIRS)
+all: $(progs_build) $(libs_build) $(BUILDDIRS)
 ifeq ($(PYTHON_BINDINGS),1)
 all: libbtrfsutil_python
 endif
@@ -635,7 +642,7 @@ $(CLEANDIRS):
@echo "Cleaning $(patsubst clean-%,%,$@)"
$(Q)$(MAKE) $(MAKEOPTS) -C $(patsubst clean-%,%,$@) clean
 
-install: $(libs) $(progs_install) $(INSTALLDIRS)
+install: $(libs_build) $(progs_install) $(INSTALLDIRS)
 ifeq ($(BUILD_PROGRAMS),1)
$(INSTALL) -m755 -d $(DESTDIR)$(bindir)
$(INSTALL) $(progs_install) $(DESTDIR)$(bindir)
@@ -647,12 +654,16 @@ ifneq ($(udevdir),)
$(INSTALL) -m644 $(udev_rules) $(DESTDIR)$(udevruledir)
 endif
 endif
+ifneq ($(libs_build),)
$(INSTALL) -m755 -d $(DESTDIR)$(libdir)
-   $(INSTALL) $(libs) $(DESTDIR)$(libdir)
+   $(INSTALL) $(libs_build) $(DESTDIR)$(libdir)
+ifeq ($(BUILD_SHARED_LIBRARIES),1)
cp -d $(lib_links) $(DESTDIR)$(libdir)
+endif
$(INSTALL) -m755 -d $(DESTDIR)$(incdir)/btrfs
$(INSTALL) -m644 $(libbtrfs_headers) $(DESTDIR)$(incdir)/btrfs
$(INSTALL) -m644 libbtrfsutil/btrfsutil.h $(DESTDIR)$(incdir)
+endif
 
 ifeq ($(PYTHON_BINDINGS),1)
 install_python: libbtrfsutil_python
diff --git a/Makefile.inc.in b/Makefile.inc.in
index 5c8d1297..a86c528e 100644
--- a/Makefile.inc.in
+++ b/Makefile.inc.in
@@ -13,6 +13,8 @@ INSTALL = @INSTALL@
 DISABLE_DOCUMENTATION = @DISABLE_DOCUMENTATION@
 DISABLE_BTRFSCONVERT = @DISABLE_BTRFSCONVERT@
 BUILD_PROGRAMS = @BUILD_PROGRAMS@
+BUILD_SHARED_LIBRARIES = @BUILD_SHARED_LIBRARIES@
+BUILD_STATIC_LIBRARIES = @BUILD_STATIC_LIBRARIES@
 BTRFSCONVERT_EXT2 = @BTRFSCONVERT_EXT2@
 BTRFSCONVERT_REISERFS = @BTRFSCONVERT_REISERFS@
 BTRFSRESTORE_ZSTD = @BTRFSRESTORE_ZSTD@
diff --git a/configure.ac b/configure.ac
index 230f37fa..df02f206 100644
--- a/configure.ac
+++ b/configure.ac
@@ -125,6 +125,20 @@ AC_ARG_ENABLE([programs],
 AS_IF([test "x$enable_programs" = xyes], [BUILD_PROGRAMS=1], [BUILD_PROGRAMS=0])
 AC_SUBST([BUILD_PROGRAMS])
 
+AC_ARG_ENABLE([shared],
+ AS_HELP_STRING([--disable-shared], [do not build shared libraries]),
+ [], [enable_shared=yes]
+)
+AS_IF([test "x$enable_shared" = xyes], [BUILD_SHARED_LIBRARIES=1], [BUILD_SHARED_LIBRARIES=0])
+AC_SUBST([BUILD_SHARED_LIBRARIES])
+
+AC_ARG_ENABLE([static],
+ AS_HELP_STRING([--disable-static], [do not build static libraries]),
+ [], [enable_static=yes]
+)
+AS_IF([test "x$enable_static" = xyes], [BUILD_STATIC_LIBRARIES=1], [BUILD_STATIC_LIBRARIES=0])
+AC_SUBST([BUILD_STATIC_LIBRARIES])
+
 AC_ARG_ENABLE([convert],
  AS_HELP_STRING([--disable-convert], [do not build btrfs-convert]),
   [], [enable_convert=$enable_programs]
@@ -222,7 +236,7 @@ AC_SUBST(BTRFSRESTORE_ZSTD)
 
 AC_ARG_ENABLE([python],
AS_HELP_STRING([--disable-python], [do not build libbtrfsutil Python bindings]),
-   [], [enable_python=yes]
+   [], [enable_python=$enable_shared]
 )
 
 if test "x$enable_python" = xyes; then
@@ -285,6 +299,8 @@ AC_MSG_RESULT([
ldflags:${LDFLAGS}
 
programs:   ${enable_programs}
+   shared libraries:   ${enable_shared}
+   static libraries:   ${enable_static}
documentation:  ${enable_documentation}
doc generator:  ${ASCIIDOC_TOOL}
backtrace support:  ${enable_backtrace}
-- 
2.18.0



[PATCH v2 1/2] btrfs-progs: add --disable-programs

2018-07-26 Thread Omar Sandoval
From: Omar Sandoval 

We have a build system internally which only needs to build and install
the libraries out of a repository, not any binaries. There's no easy way
to do this in btrfs-progs currently. Add --disable-programs to
./configure to support this.

Signed-off-by: Omar Sandoval 
---
 Makefile| 43 ++-
 Makefile.inc.in |  1 +
 configure.ac| 10 +-
 3 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/Makefile b/Makefile
index d53f6c1d..24aa8234 100644
--- a/Makefile
+++ b/Makefile
@@ -207,23 +207,31 @@ endif
 
 MAKEOPTS = --no-print-directory Q=$(Q)
 
-# build all by default
-progs = $(progs_install) btrfsck btrfs-corrupt-block
 
-# install only selected
+# Programs to install.
 progs_install = btrfs mkfs.btrfs btrfs-map-logical btrfs-image \
-   btrfs-find-root btrfstune \
-   btrfs-select-super
+   btrfs-find-root btrfstune btrfs-select-super
 
-# other tools, not built by default
-progs_extra = btrfs-fragments
+# Programs to build.
+progs_build = $(progs_install) btrfsck btrfs-corrupt-block
 
-progs_static = $(foreach p,$(progs),$(p).static)
+# All programs. Use := instead of = so that this is expanded before we reassign
+# progs_build below.
+progs := $(progs_build) btrfs-convert btrfs-fragments
 
 ifneq ($(DISABLE_BTRFSCONVERT),1)
 progs_install += btrfs-convert
 endif
 
+# Static programs to build. Use := instead of = because `make static` should
+# still build everything even if --disable-programs was passed to ./configure.
+progs_static := $(foreach p,$(progs_build),$(p).static)
+
+ifneq ($(BUILD_PROGRAMS),1)
+progs_install =
+progs_build =
+endif
+
 # external libs required by various binaries; for btrfs-foo,
 # specify btrfs_foo_libs = ; see $($(subst...)) rules below
 btrfs_convert_cflags = -DBTRFSCONVERT_EXT2=$(BTRFSCONVERT_EXT2)
@@ -234,7 +242,7 @@ cmds_restore_cflags = -DBTRFSRESTORE_ZSTD=$(BTRFSRESTORE_ZSTD)
 CHECKER_FLAGS += $(btrfs_convert_cflags)
 
 # collect values of the variables above
-standalone_deps = $(foreach dep,$(patsubst %,%_objects,$(subst -,_,$(filter btrfs-%, $(progs) $(progs_extra)))),$($(dep)))
+standalone_deps = $(foreach dep,$(patsubst %,%_objects,$(subst -,_,$(filter btrfs-%, $(progs)))),$($(dep)))
 
 SUBDIRS =
 BUILDDIRS = $(patsubst %,build-%,$(SUBDIRS))
@@ -303,7 +311,7 @@ endif
$(Q)$(CC) $(STATIC_CFLAGS) -c $< -o $@ $($(subst -,_,$(@:%.static.o=%)-cflags)) \
$($(subst -,_,btrfs-$(@:%/$(notdir $@)=%)-cflags))
 
-all: $(progs) $(libs) $(lib_links) $(BUILDDIRS)
+all: $(progs_build) $(libs) $(lib_links) $(BUILDDIRS)
 ifeq ($(PYTHON_BINDINGS),1)
 all: libbtrfsutil_python
 endif
@@ -570,9 +578,8 @@ test-build-pre:
 test-build-real:
$(MAKE) $(MAKEOPTS) library-test
-$(MAKE) $(MAKEOPTS) library-test.static
-   $(MAKE) $(MAKEOPTS) -j 8 all
+   $(MAKE) $(MAKEOPTS) -j 8 $(progs) $(libs) $(lib_links) $(BUILDDIRS)
-$(MAKE) $(MAKEOPTS) -j 8 static
-   $(MAKE) $(MAKEOPTS) -j 8 $(progs_extra)
 
 manpages:
$(Q)$(MAKE) $(MAKEOPTS) -C Documentation
@@ -604,7 +611,7 @@ clean: $(CLEANDIRS)
   mktables btrfs.static mkfs.btrfs.static fssum \
  $(check_defs) \
  $(libs) $(lib_links) \
- $(progs_static) $(progs_extra) \
+ $(progs_static) \
  libbtrfsutil/*.o libbtrfsutil/*.o.d
 ifeq ($(PYTHON_BINDINGS),1)
$(Q)cd libbtrfsutil/python; \
@@ -629,21 +636,23 @@ $(CLEANDIRS):
$(Q)$(MAKE) $(MAKEOPTS) -C $(patsubst clean-%,%,$@) clean
 
 install: $(libs) $(progs_install) $(INSTALLDIRS)
+ifeq ($(BUILD_PROGRAMS),1)
$(INSTALL) -m755 -d $(DESTDIR)$(bindir)
$(INSTALL) $(progs_install) $(DESTDIR)$(bindir)
$(INSTALL) fsck.btrfs $(DESTDIR)$(bindir)
# btrfsck is a link to btrfs in the src tree, make it so for installed file as well
$(LN_S) -f btrfs $(DESTDIR)$(bindir)/btrfsck
+ifneq ($(udevdir),)
+   $(INSTALL) -m755 -d $(DESTDIR)$(udevruledir)
+   $(INSTALL) -m644 $(udev_rules) $(DESTDIR)$(udevruledir)
+endif
+endif
$(INSTALL) -m755 -d $(DESTDIR)$(libdir)
$(INSTALL) $(libs) $(DESTDIR)$(libdir)
cp -d $(lib_links) $(DESTDIR)$(libdir)
$(INSTALL) -m755 -d $(DESTDIR)$(incdir)/btrfs
$(INSTALL) -m644 $(libbtrfs_headers) $(DESTDIR)$(incdir)/btrfs
$(INSTALL) -m644 libbtrfsutil/btrfsutil.h $(DESTDIR)$(incdir)
-ifneq ($(udevdir),)
-   $(INSTALL) -m755 -d $(DESTDIR)$(udevruledir)
-   $(INSTALL) -m644 $(udev_rules) $(DESTDIR)$(udevruledir)
-endif
 
 ifeq ($(PYTHON_BINDINGS),1)
 install_python: libbtrfsutil_python
diff --git a/Makefile.inc.in b/Makefile.inc.in
index fb324614..5c8d1297 100644
--- a/Makefile.inc.in
+++ b/Makefile.inc.in
@@ -12,6 +12,7 @@ RMDIR = @RMDIR@
 INSTALL = @INSTALL@
 DISABLE_DOCUMENTATION = @DISABLE_DOCUMENTATION@
 DISABLE_BTRFSCONVERT = @DISABLE_BTRFSCONVERT@
+BUILD_PROGRAMS = @BUILD_PROGRAMS@
 BTRFSCONVERT_EXT2 = @BTRFSCONV

[PATCH] Btrfs: clean up scrub is_dev_replace parameter

2018-08-14 Thread Omar Sandoval
From: Omar Sandoval 

struct scrub_ctx has an ->is_dev_replace member, so there's no point in
passing around is_dev_replace where sctx is available.

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/scrub.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 6702896cdb8f..276c67c3da2e 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3342,8 +3342,7 @@ static noinline_for_stack int scrub_raid56_parity(struct scrub_ctx *sctx,
 static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
   struct map_lookup *map,
   struct btrfs_device *scrub_dev,
-  int num, u64 base, u64 length,
-  int is_dev_replace)
+  int num, u64 base, u64 length)
 {
struct btrfs_path *path, *ppath;
struct btrfs_fs_info *fs_info = sctx->fs_info;
@@ -3619,7 +3618,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
extent_physical = extent_logical - logical + physical;
extent_dev = scrub_dev;
extent_mirror_num = mirror_num;
-   if (is_dev_replace)
+   if (sctx->is_dev_replace)
scrub_remap_extent(fs_info, extent_logical,
   extent_len, &extent_physical,
   &extent_dev,
@@ -3717,8 +3716,7 @@ static noinline_for_stack int scrub_chunk(struct scrub_ctx *sctx,
  struct btrfs_device *scrub_dev,
  u64 chunk_offset, u64 length,
  u64 dev_offset,
- struct btrfs_block_group_cache *cache,
- int is_dev_replace)
+ struct btrfs_block_group_cache *cache)
 {
struct btrfs_fs_info *fs_info = sctx->fs_info;
struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
@@ -3755,8 +3753,7 @@ static noinline_for_stack int scrub_chunk(struct scrub_ctx *sctx,
if (map->stripes[i].dev->bdev == scrub_dev->bdev &&
map->stripes[i].physical == dev_offset) {
ret = scrub_stripe(sctx, map, scrub_dev, i,
-  chunk_offset, length,
-  is_dev_replace);
+  chunk_offset, length);
if (ret)
goto out;
}
@@ -3769,8 +3766,7 @@ static noinline_for_stack int scrub_chunk(struct scrub_ctx *sctx,
 
 static noinline_for_stack
 int scrub_enumerate_chunks(struct scrub_ctx *sctx,
-  struct btrfs_device *scrub_dev, u64 start, u64 end,
-  int is_dev_replace)
+  struct btrfs_device *scrub_dev, u64 start, u64 end)
 {
struct btrfs_dev_extent *dev_extent = NULL;
struct btrfs_path *path;
@@ -3864,7 +3860,7 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
 */
scrub_pause_on(fs_info);
ret = btrfs_inc_block_group_ro(fs_info, cache);
-   if (!ret && is_dev_replace) {
+   if (!ret && sctx->is_dev_replace) {
/*
 * If we are doing a device replace wait for any tasks
 * that started dellaloc right before we set the block
@@ -3929,7 +3925,7 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
dev_replace->item_needs_writeback = 1;
btrfs_dev_replace_write_unlock(&fs_info->dev_replace);
ret = scrub_chunk(sctx, scrub_dev, chunk_offset, length,
- found_key.offset, cache, is_dev_replace);
+ found_key.offset, cache);
 
/*
 * flush, submit all pending read and write bios, afterwards
@@ -3997,7 +3993,7 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
btrfs_put_block_group(cache);
if (ret)
break;
-   if (is_dev_replace &&
+   if (sctx->is_dev_replace &&
atomic64_read(&dev_replace->num_write_errors) > 0) {
ret = -EIO;
break;
@@ -4231,8 +4227,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
}
 
if (!ret)
-   ret = scrub_enumerate_chunks(sctx, dev, start, end,
- 

Re: [PATCH 2/2] Btrfs: sync log after logging new name

2018-08-14 Thread Omar Sandoval
On Mon, Jun 18, 2018 at 01:06:16PM +0200, David Sterba wrote:
> On Fri, Jun 15, 2018 at 05:19:07PM +0100, Filipe Manana wrote:
> > On Fri, Jun 15, 2018 at 4:54 PM, David Sterba  wrote:
> > > On Mon, Jun 11, 2018 at 07:24:28PM +0100, fdman...@kernel.org wrote:
> > >> From: Filipe Manana 
> > >> Fixes: 12fcfd22fe5b ("Btrfs: tree logging unlink/rename fixes")
> > >> Reported-by: Vijay Chidambaram 
> > >> Signed-off-by: Filipe Manana 
> > >
> > > There are some warnings and possible lock up caused by this patch, the
> > > 1/2 alone is ok but 1/2 + 2/2 leads to the following warnings. I checked
> > > twice, the patch base was the pull request ie. without any other 4.18
> > > stuff.
> > 
> > Are you sure it's this patch?
> > On top of for-4.18 it didn't cause any problems here, plus the trace
> > below has nothing to do with renames, hard links or fsync at all -
> > everything seems stuck on waiting for IO from dev replace.
> 
> It was a false alert, sorry. Strange that the warnings appeared only in
> the VM running both patches and not otherwise.
> 
> Though the test did not directly use rename, the possible error scenario
> I had in mind was some leftover from locking, error handling or state
> that blocked umount of 011.

Dave, are you sending this in for 4.19? I don't see it in your first
pull request.


Re: [RFC PATCH v4 4/6] Btrfs: prevent ioctls from interfering with a swap file

2018-08-21 Thread Omar Sandoval
On Fri, May 25, 2018 at 06:10:43PM +0200, David Sterba wrote:
> On Fri, May 25, 2018 at 09:00:58AM -0700, Omar Sandoval wrote:
> > On Fri, May 25, 2018 at 04:50:55PM +0200, David Sterba wrote:
> > > On Thu, May 24, 2018 at 02:41:28PM -0700, Omar Sandoval wrote:
> > > > From: Omar Sandoval 
> > > > 
> > > > When a swap file is active, we must make sure that the extents of the
> > > > file are not moved and that they don't become shared. That means that
> > > > the following are not safe:
> > > > 
> > > > - chattr +c (enable compression)
> > > > - reflink
> > > > - dedupe
> > > > - snapshot
> > > > - defrag
> > > > - balance
> > > > - device remove/replace/resize
> > > > 
> > > > Don't allow those to happen on an active swap file. Balance and device
> > > > remove/replace/resize in particular are disallowed entirely; in the
> > > > future, we can relax this so that relocation skips/errors out only on
> > > > chunks containing an active swap file.
> > > 
> > > Hm, disabling the entire balance is too intrusive. It's clear that the
> > > swapfile causes a lot of trouble when it goes against the dynamic
> > > capabilities of btrfs (relocation and the functionality that builds on
> > > it).
> > > 
> > > Skipping the swapfile extents should be implemented at minimum.
> > 
> > Sure thing, this should definitely be possible. For balance, we can skip
> > them; for resize or delete, it of course has to fail if it encounters
> > swap extents. I'll take a stab at it.
> 
> We can detect if there's an active swap file on the filesystem before
> shrink, delete or replace is started so the user is not surprised if it
> fails in the end, or not start the operations at all and give some hints
> what to do.

I looked at this some more, and it's not pretty. Basically, we need to:

- Add a counter of active swap extents to struct btrfs_block_group_cache
- At activate time, map the file extent to a block group and increment
  the counter
- At relocation time, check the counter, and either error out or skip as
  appropriate
- At deactivate time, decrement all of the block group counters. Easier
  said than done because the file could in theory have new extents
  allocated beyond EOF

The last point is the tricky one. The straightforward way to implement
deactivate would be to walk all of the extents of the file in the same
way we did for activate, but the extents may have changed. So, we have
to remember which extents were activated (or get that information from
the generic swap code somehow). This seems fragile.

Does anyone see a better approach? Is it worth the trouble?
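
To make the bookkeeping above concrete, here is a minimal userspace sketch of the counter scheme. Everything in it is hypothetical: `struct block_group` merely stands in for struct btrfs_block_group_cache, and `struct swap_file`, `swap_pin_extent()`, `relocate_block_group()` and `swap_unpin_all()` are illustrative names, not kernel APIs. It assumes activate records which block groups it pinned, so that deactivate does not have to re-walk the file's (possibly changed) extents.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct btrfs_block_group_cache. */
struct block_group {
	unsigned int swap_extents;	/* active swap extents in this group */
};

/* Hypothetical record of what activate pinned, kept for deactivate. */
#define MAX_PINNED 16
struct swap_file {
	struct block_group *pinned[MAX_PINNED];
	size_t nr_pinned;
};

/* Activate: pin the block group backing one swap extent. */
static int swap_pin_extent(struct swap_file *sf, struct block_group *bg)
{
	if (sf->nr_pinned == MAX_PINNED)
		return -ENOSPC;
	bg->swap_extents++;
	sf->pinned[sf->nr_pinned++] = bg;
	return 0;
}

/*
 * Relocation: balance may skip a pinned group (can_skip != 0), while
 * shrink/delete/replace must fail. Returns 0 when relocation may proceed,
 * 1 when the group was skipped, and -EBUSY on failure.
 */
static int relocate_block_group(const struct block_group *bg, int can_skip)
{
	if (bg->swap_extents == 0)
		return 0;
	return can_skip ? 1 : -EBUSY;
}

/* Deactivate: undo exactly what activate recorded. */
static void swap_unpin_all(struct swap_file *sf)
{
	while (sf->nr_pinned > 0) {
		struct block_group *bg = sf->pinned[--sf->nr_pinned];

		assert(bg->swap_extents > 0);
		bg->swap_extents--;
	}
}
```

The fixed-size `pinned[]` array is only there to keep the sketch short; the fragile part discussed above is precisely where such a record would live and who owns it.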


[RFC PATCH v2 0/6] Btrfs: stop abusing current->journal_info for direct I/O

2018-08-27 Thread Omar Sandoval
From: Omar Sandoval 

Hi,

This is a different approach from v1 [1] of this series to stop abusing
current->journal_info in Btrfs. This approach unifies everything to use
iocb->private instead of map_bh->b_private. Patches 1 and 5 pass the
iocb to a couple of callbacks which need it. Patches 2 and 3 migrate
the users of b_private to iocb->private, and patch 4 gets rid of the
b_private handling in the direct I/O code. Patch 6 cleans up Btrfs.

I'm not convinced that this is cleaner than my first approach, but it at
least avoids growing the argument list to do_blockdev_direct_IO(), which
was Al's complaint about v1.

Thanks!

1: https://www.spinics.net/lists/linux-btrfs/msg77859.html

Omar Sandoval (6):
  fs: pass iocb to direct I/O get_block()
  ext4: use iocb->private instead of bh->b_private
  ocfs2: use iocb->private instead of bh->b_private
  fs: stop propagating bh->b_private for direct I/O
  fs: pass iocb to direct I/O submit_io()
  Btrfs: stop abusing current->journal_info in btrfs_direct_IO()

 fs/affs/file.c  |  9 ++-
 fs/btrfs/inode.c| 36 +++--
 fs/direct-io.c  | 23 +++-
 fs/ext2/inode.c |  9 ++-
 fs/ext4/ext4.h  |  2 --
 fs/ext4/inode.c | 40 
 fs/f2fs/data.c  |  5 ++--
 fs/fat/inode.c  |  9 ++-
 fs/gfs2/aops.c  |  5 ++--
 fs/hfs/inode.c  |  9 ++-
 fs/hfsplus/inode.c  |  9 ++-
 fs/jfs/inode.c  |  9 ++-
 fs/nilfs2/inode.c   |  9 ++-
 fs/ocfs2/aops.c | 39 ++-
 fs/ocfs2/aops.h | 64 ++---
 fs/reiserfs/inode.c |  4 +--
 fs/udf/inode.c  |  9 ++-
 include/linux/fs.h  | 17 ++--
 18 files changed, 187 insertions(+), 120 deletions(-)

-- 
2.18.0
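
For readers skimming the series, the overall shape can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel code: `struct request` stands in for struct kiocb, `struct dio_state` loosely mirrors struct btrfs_dio_data, and every function name below is made up. The point it shows is that the generic layer forwards the request to both callbacks, so the filesystem reaches its per-request state through `->private` instead of through task-global state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kiocb: one per I/O request. */
struct request {
	void *private;	/* owned by the filesystem for the request's lifetime */
};

/* Hypothetical per-request state, in the spirit of struct btrfs_dio_data. */
struct dio_state {
	unsigned long reserve;		/* bytes still reserved */
	unsigned long submitted;	/* bytes handed to submit */
};

/* Callback types modelled on dio_get_block_t and dio_submit_t. */
typedef int (get_block_fn)(struct request *req, unsigned long block,
			   unsigned long len);
typedef void (submit_fn)(struct request *req, unsigned long len);

static int fs_get_block(struct request *req, unsigned long block,
			unsigned long len)
{
	struct dio_state *st = req->private;

	(void)block;
	if (st->reserve < len)
		return -1;	/* out of reservation */
	st->reserve -= len;
	return 0;
}

static void fs_submit(struct request *req, unsigned long len)
{
	struct dio_state *st = req->private;

	st->submitted += len;
}

/* Generic layer: forwards the request, never interprets ->private. */
static int generic_direct_io(struct request *req, get_block_fn *get_block,
			     submit_fn *submit, unsigned long nr_blocks,
			     unsigned long block_len)
{
	for (unsigned long i = 0; i < nr_blocks; i++) {
		int ret = get_block(req, i, block_len);

		if (ret)
			return ret;
		submit(req, block_len);
	}
	return 0;
}

/* Filesystem entry point: state lives on the stack, reachable via req. */
static unsigned long fs_direct_io(unsigned long nr_blocks,
				  unsigned long block_len,
				  unsigned long reserve)
{
	struct dio_state st = { .reserve = reserve, .submitted = 0 };
	struct request req = { .private = &st };

	generic_direct_io(&req, fs_get_block, fs_submit, nr_blocks, block_len);
	return st.submitted;
}
```

Because the state hangs off the request rather than the current task, nothing has to be saved, cleared and restored around code that checks for a running transaction.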



[RFC PATCH v2 5/6] fs: pass iocb to direct I/O submit_io()

2018-08-27 Thread Omar Sandoval
From: Omar Sandoval 

Btrfs abuses current->journal_info in btrfs_direct_IO() in order to pass
around some state to get_block() and submit_io(). However, iocb->private
is free for Btrfs to use; we just need it passed to submit_io(). Btrfs
is the only user of submit_io(), so this doesn't affect any other
filesystems.

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/inode.c   | 4 ++--
 fs/direct-io.c | 3 ++-
 include/linux/fs.h | 4 ++--
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b61ea6dd9956..6efa6a6e3e20 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8427,8 +8427,8 @@ static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
return 0;
 }
 
-static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
-   loff_t file_offset)
+static void btrfs_submit_direct(struct kiocb *iocb, struct bio *dio_bio,
+   struct inode *inode, loff_t file_offset)
 {
struct btrfs_dio_private *dip = NULL;
struct bio *bio = NULL;
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 80e488afe6c6..aa367e70456d 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -473,7 +473,8 @@ static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
dio->bio_disk = bio->bi_disk;
 
if (sdio->submit_io) {
-   sdio->submit_io(bio, dio->inode, sdio->logical_offset_in_bio);
+   sdio->submit_io(dio->iocb, bio, dio->inode,
+   sdio->logical_offset_in_bio);
dio->bio_cookie = BLK_QC_T_NONE;
} else
dio->bio_cookie = submit_bio(bio);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f1a235f0fa21..daf1df811f67 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3003,8 +3003,8 @@ extern int generic_file_open(struct inode * inode, struct file * filp);
 extern int nonseekable_open(struct inode * inode, struct file * filp);
 
 #ifdef CONFIG_BLOCK
-typedef void (dio_submit_t)(struct bio *bio, struct inode *inode,
-   loff_t file_offset);
+typedef void (dio_submit_t)(struct kiocb *iocb, struct bio *bio,
+   struct inode *inode, loff_t file_offset);
 
 enum {
/* need locking between buffered and direct access */
-- 
2.18.0



[RFC PATCH v2 3/6] ocfs2: use iocb->private instead of bh->b_private

2018-08-27 Thread Omar Sandoval
From: Omar Sandoval 

As part of simplifying all of the private data passed around for direct
I/O, bh->b_private will no longer be passed to dio_iodone_t. Instead,
filesystems should use iocb->private. ocfs2 already uses iocb->private
for storing a couple of flag bits, but we can use it as a tagged pointer
and hide all of the messiness in helpers.

Cc: Mark Fasheh 
Cc: Joel Becker 
Signed-off-by: Omar Sandoval 
---
 fs/ocfs2/aops.c | 19 ---
 fs/ocfs2/aops.h | 64 +
 2 files changed, 54 insertions(+), 29 deletions(-)

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 93ca23c56b07..fc4a18b6ad3c 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2104,12 +2104,13 @@ struct ocfs2_dio_write_ctxt {
 };
 
 static struct ocfs2_dio_write_ctxt *
-ocfs2_dio_alloc_write_ctx(struct buffer_head *bh, int *alloc)
+ocfs2_dio_alloc_write_ctx(struct kiocb *iocb, int *alloc)
 {
-   struct ocfs2_dio_write_ctxt *dwc = NULL;
+   struct ocfs2_dio_write_ctxt *dwc;
 
-   if (bh->b_private)
-   return bh->b_private;
+   dwc = ocfs2_iocb_private(iocb);
+   if (dwc)
+   return dwc;
 
dwc = kmalloc(sizeof(struct ocfs2_dio_write_ctxt), GFP_NOFS);
if (dwc == NULL)
@@ -2118,7 +2119,7 @@ ocfs2_dio_alloc_write_ctx(struct buffer_head *bh, int *alloc)
dwc->dw_zero_count = 0;
dwc->dw_orphaned = 0;
dwc->dw_writer_pid = task_pid_nr(current);
-   bh->b_private = dwc;
+   ocfs2_iocb_set_private(iocb, dwc);
*alloc = 1;
 
return dwc;
@@ -2184,7 +2185,7 @@ static int ocfs2_dio_wr_get_block(struct kiocb *iocb, struct inode *inode,
bh_result->b_state = 0;
}
 
-   dwc = ocfs2_dio_alloc_write_ctx(bh_result, &first_get_block);
+   dwc = ocfs2_dio_alloc_write_ctx(iocb, &first_get_block);
if (unlikely(dwc == NULL)) {
ret = -ENOMEM;
mlog_errno(ret);
@@ -2408,6 +2409,7 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
ssize_t bytes,
void *private)
 {
+   struct ocfs2_dio_write_ctxt *dwc;
struct inode *inode = file_inode(iocb->ki_filp);
int level;
int ret = 0;
@@ -2415,8 +2417,9 @@ static int ocfs2_dio_end_io(struct kiocb *iocb,
/* this io's submitter should not have unlocked this before we could */
BUG_ON(!ocfs2_iocb_is_rw_locked(iocb));
 
-   if (bytes > 0 && private)
-   ret = ocfs2_dio_end_io_write(inode, private, offset, bytes);
+   dwc = ocfs2_iocb_private(iocb);
+   if (bytes > 0 && dwc)
+   ret = ocfs2_dio_end_io_write(inode, dwc, offset, bytes);
 
ocfs2_iocb_clear_rw_locked(iocb);
 
diff --git a/fs/ocfs2/aops.h b/fs/ocfs2/aops.h
index 3494a62ed749..2c3219e0c010 100644
--- a/fs/ocfs2/aops.h
+++ b/fs/ocfs2/aops.h
@@ -63,32 +63,54 @@ int ocfs2_size_fits_inline_data(struct buffer_head *di_bh, u64 new_size);
 
 int ocfs2_get_block(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create);
-/* all ocfs2_dio_end_io()'s fault */
-#define ocfs2_iocb_is_rw_locked(iocb) \
-   test_bit(0, (unsigned long *)&iocb->private)
+
+/*
+ * Direct I/O uses iocb->private as a tagged pointer. The bottom two bits
+ * defined below are used for communication between ocfs2_dio_end_io() and
+ * ocfs2_file_write/read_iter().
+ */
+#define OCFS2_IOCB_RW_LOCK 1
+#define OCFS2_IOCB_RW_LOCK_LEVEL 2
+
+static inline void *ocfs2_iocb_private(struct kiocb *iocb)
+{
+   return (void *)((unsigned long)iocb->private & ~3);
+}
+
+static inline void ocfs2_iocb_set_private(struct kiocb *iocb, void *private)
+{
+   iocb->private = (void *)(((unsigned long)iocb->private & 3) |
+((unsigned long)private & ~3));
+}
+
+static inline bool ocfs2_iocb_is_rw_locked(struct kiocb *iocb)
+{
+   return (unsigned long)iocb->private & OCFS2_IOCB_RW_LOCK;
+}
+
 static inline void ocfs2_iocb_set_rw_locked(struct kiocb *iocb, int level)
 {
-   set_bit(0, (unsigned long *)&iocb->private);
+   unsigned long private = (unsigned long)iocb->private;
+
+   private |= OCFS2_IOCB_RW_LOCK;
if (level)
-   set_bit(1, (unsigned long *)&iocb->private);
+   private |= OCFS2_IOCB_RW_LOCK_LEVEL;
else
-   clear_bit(1, (unsigned long *)&iocb->private);
+   private &= ~OCFS2_IOCB_RW_LOCK_LEVEL;
+   iocb->private = (void *)private;
 }
 
-/*
- * Using a named enum representing lock types in terms of #N bit stored in
- * iocb->private, which is going to be used for communication between
- * ocfs2_dio_end_io() and ocfs2_file_write/read_iter().
- */
-enum ocfs2_iocb_lock_bits {
-   OCFS2_IOCB_RW_LOCK = 0,
-   OCFS2_IOCB_RW_LOCK_L
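
The tagged-pointer idea in this patch can be tried out in userspace with a small sketch. The names below (`struct request`, `req_private()`, `req_set_rw_locked()` and so on) are hypothetical analogues of the ocfs2 helpers, not the helpers themselves, and the sketch uses `uintptr_t` where the kernel code uses `unsigned long`. It assumes the stored pointer is at least 4-byte aligned, which leaves the bottom two bits free for flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits kept in the low two bits of the pointer. */
#define TAG_RW_LOCK		((uintptr_t)1)
#define TAG_RW_LOCK_LEVEL	((uintptr_t)2)
#define TAG_MASK		((uintptr_t)3)

/* Hypothetical stand-in for struct kiocb. */
struct request {
	void *private;
};

/* Return the pointer part, with the flag bits masked off. */
static inline void *req_private(const struct request *req)
{
	return (void *)((uintptr_t)req->private & ~TAG_MASK);
}

/* Store a pointer while preserving whatever flag bits are already set. */
static inline void req_set_private(struct request *req, void *p)
{
	assert(((uintptr_t)p & TAG_MASK) == 0);	/* needs 4-byte alignment */
	req->private = (void *)(((uintptr_t)req->private & TAG_MASK) |
				(uintptr_t)p);
}

static inline bool req_is_rw_locked(const struct request *req)
{
	return ((uintptr_t)req->private & TAG_RW_LOCK) != 0;
}

/* Set the lock flag and record the lock level without touching the pointer. */
static inline void req_set_rw_locked(struct request *req, int level)
{
	uintptr_t v = (uintptr_t)req->private;

	v |= TAG_RW_LOCK;
	if (level)
		v |= TAG_RW_LOCK_LEVEL;
	else
		v &= ~TAG_RW_LOCK_LEVEL;
	req->private = (void *)v;
}

static inline void req_clear_rw_locked(struct request *req)
{
	req->private = (void *)((uintptr_t)req->private & ~TAG_RW_LOCK);
}

static inline int req_rw_lock_level(const struct request *req)
{
	return ((uintptr_t)req->private & TAG_RW_LOCK_LEVEL) ? 1 : 0;
}
```

Unlike set_bit()/clear_bit(), these plain read-modify-write updates are not atomic; the sketch makes no claim about whether that matters for the real callers.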

[RFC PATCH v2 2/6] ext4: use iocb->private instead of bh->b_private

2018-08-27 Thread Omar Sandoval
From: Omar Sandoval 

As part of simplifying all of the private data passed around for direct
I/O, bh->b_private will no longer be passed to dio_iodone_t. iocb is
still available there, however, so convert ext4 to use it. Note that
ext4_file_write_iter() also uses iocb->private, but
ext4_direct_IO_write() resets it to NULL after reading it.

Also note that the comment above ext4_should_dioread_nolock() is no
longer accurate. It seems that it should be possible to remove the data
journaling restriction now?

Cc: "Theodore Ts'o" 
Cc: Andreas Dilger 
Signed-off-by: Omar Sandoval 
---
 fs/ext4/inode.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 18ad91b1c8f6..841d79919cef 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -884,18 +884,16 @@ static int ext4_dio_get_block_unwritten_async(struct kiocb *iocb,
/*
 * When doing DIO using unwritten extents, we need io_end to convert
 * unwritten extents to written on IO completion. We allocate io_end
-* once we spot unwritten extent and store it in b_private. Generic
-* DIO code keeps b_private set and furthermore passes the value to
-* our completion callback in 'private' argument.
+* once we spot unwritten extent and store it in iocb->private.
 */
if (!ret && buffer_unwritten(bh_result)) {
-   if (!bh_result->b_private) {
+   if (!iocb->private) {
ext4_io_end_t *io_end;
 
io_end = ext4_init_io_end(inode, GFP_KERNEL);
if (!io_end)
return -ENOMEM;
-   bh_result->b_private = io_end;
+   iocb->private = io_end;
ext4_set_io_unwritten_flag(inode, io_end);
}
set_buffer_defer_completion(bh_result);
@@ -3617,7 +3615,7 @@ const struct iomap_ops ext4_iomap_ops = {
 static int ext4_end_io_dio(struct kiocb *iocb, loff_t offset,
ssize_t size, void *private)
 {
-ext4_io_end_t *io_end = private;
+ext4_io_end_t *io_end = iocb->private;
 
/* if not async direct IO just return */
if (!io_end)
-- 
2.18.0



[RFC PATCH v2 4/6] fs: stop propagating bh->b_private for direct I/O

2018-08-27 Thread Omar Sandoval
From: Omar Sandoval 

Currently, the direct I/O code saves the value of bh->b_private set
by the filesystem and passes it to the end_io callback. However, struct
kiocb already has a ->private member which can be used for this purpose,
with the added benefit of being available before get_block is called,
too. The only users of the bh->b_private functionality have been
converted to use iocb->private, so stop passing it around.

Signed-off-by: Omar Sandoval 
---
 fs/direct-io.c | 7 +--
 fs/ext4/inode.c| 3 +--
 fs/ocfs2/aops.c| 5 +
 include/linux/fs.h | 3 +--
 4 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/fs/direct-io.c b/fs/direct-io.c
index f631aa98849b..80e488afe6c6 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -122,8 +122,6 @@ struct dio {
loff_t i_size;  /* i_size when submitted */
dio_iodone_t *end_io;   /* IO completion function */
 
-   void *private;  /* copy from map_bh.b_private */
-
/* BIO completion state */
spinlock_t bio_lock;/* protects BIO fields below */
int page_errors;/* errno from get_user_pages() */
@@ -288,7 +286,7 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, unsigned int flags)
 
if (dio->end_io) {
// XXX: ki_pos??
-   err = dio->end_io(dio->iocb, offset, ret, dio->private);
+   err = dio->end_io(dio->iocb, offset, ret);
if (err)
ret = err;
}
@@ -716,9 +714,6 @@ static int get_more_blocks(struct dio *dio, struct dio_submit *sdio,
ret = (*sdio->get_block)(dio->iocb, dio->inode, fs_startblk,
 map_bh, create);
 
-   /* Store for completion */
-   dio->private = map_bh->b_private;
-
if (ret == 0 && buffer_defer_completion(map_bh))
ret = dio_set_defer_completion(dio);
}
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 841d79919cef..0f42793765bf 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3612,8 +3612,7 @@ const struct iomap_ops ext4_iomap_ops = {
.iomap_end  = ext4_iomap_end,
 };
 
-static int ext4_end_io_dio(struct kiocb *iocb, loff_t offset,
-   ssize_t size, void *private)
+static int ext4_end_io_dio(struct kiocb *iocb, loff_t offset, ssize_t size)
 {
 ext4_io_end_t *io_end = iocb->private;
 
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index fc4a18b6ad3c..c1232df20be5 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2404,10 +2404,7 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
  * particularly interested in the aio/dio case.  We use the rw_lock DLM lock
  * to protect io on one node from truncation on another.
  */
-static int ocfs2_dio_end_io(struct kiocb *iocb,
-   loff_t offset,
-   ssize_t bytes,
-   void *private)
+static int ocfs2_dio_end_io(struct kiocb *iocb, loff_t offset, ssize_t bytes)
 {
struct ocfs2_dio_write_ctxt *dwc;
struct inode *inode = file_inode(iocb->ki_filp);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 85db69835023..f1a235f0fa21 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -83,8 +83,7 @@ typedef int (get_block_t)(struct inode *inode, sector_t iblock,
 typedef int (dio_get_block_t)(struct kiocb *iocb, struct inode *inode,
  sector_t iblock, struct buffer_head *bh_result,
  int create);
-typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
-   ssize_t bytes, void *private);
+typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset, ssize_t bytes);
 
 #define MAY_EXEC   0x0001
 #define MAY_WRITE  0x0002
-- 
2.18.0



[RFC PATCH v2 6/6] Btrfs: stop abusing current->journal_info in btrfs_direct_IO()

2018-08-27 Thread Omar Sandoval
From: Omar Sandoval 

Now that we can pass around the struct btrfs_dio_data through the
different callbacks generically, we don't need to shove it in
current->journal_info.

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/inode.c | 29 ++---
 1 file changed, 6 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 6efa6a6e3e20..38a41e9d6e93 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7654,7 +7654,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
WARN_ON(dio_data->reserve < len);
dio_data->reserve -= len;
dio_data->unsubmitted_oe_range_end = start + len;
-   current->journal_info = dio_data;
 out:
return ret;
 }
@@ -7666,7 +7665,7 @@ static int btrfs_get_blocks_direct(struct kiocb *iocb, struct inode *inode,
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
struct extent_map *em;
struct extent_state *cached_state = NULL;
-   struct btrfs_dio_data *dio_data = NULL;
+   struct btrfs_dio_data *dio_data = iocb->private;
u64 start = iblock << inode->i_blkbits;
u64 lockstart, lockend;
u64 len = bh_result->b_size;
@@ -7681,25 +7680,13 @@ static int btrfs_get_blocks_direct(struct kiocb *iocb, struct inode *inode,
lockstart = start;
lockend = start + len - 1;
 
-   if (current->journal_info) {
-   /*
-* Need to pull our outstanding extents and set journal_info to NULL so
-* that anything that needs to check if there's a transaction doesn't get
-* confused.
-*/
-   dio_data = current->journal_info;
-   current->journal_info = NULL;
-   }
-
/*
 * If this errors out it's because we couldn't invalidate pagecache for
 * this range and we need to fallback to buffered.
 */
if (lock_extent_direct(inode, lockstart, lockend, &cached_state,
-  create)) {
-   ret = -ENOTBLK;
-   goto err;
-   }
+  create))
+   return -ENOTBLK;
 
em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, start, len, 0);
if (IS_ERR(em)) {
@@ -7767,9 +7754,6 @@ static int btrfs_get_blocks_direct(struct kiocb *iocb, struct inode *inode,
 unlock_err:
clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend,
 unlock_bits, 1, 0, &cached_state);
-err:
-   if (dio_data)
-   current->journal_info = dio_data;
return ret;
 }
 
@@ -8470,7 +8454,7 @@ static void btrfs_submit_direct(struct kiocb *iocb, struct bio *dio_bio,
 * time by btrfs_direct_IO().
 */
if (write) {
-   struct btrfs_dio_data *dio_data = current->journal_info;
+   struct btrfs_dio_data *dio_data = iocb->private;
 
dio_data->unsubmitted_oe_range_end = dip->logical_offset +
dip->bytes;
@@ -8612,13 +8596,13 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
/*
 * We need to know how many extents we reserved so that we can
 * do the accounting properly if we go over the number we
-* originally calculated.  Abuse current->journal_info for this.
+* originally calculated.
 */
dio_data.reserve = round_up(count,
fs_info->sectorsize);
dio_data.unsubmitted_oe_range_start = (u64)offset;
dio_data.unsubmitted_oe_range_end = (u64)offset;
-   current->journal_info = &dio_data;
+   iocb->private = &dio_data;
down_read(&BTRFS_I(inode)->dio_sem);
} else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK,
 &BTRFS_I(inode)->runtime_flags)) {
@@ -8633,7 +8617,6 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
   btrfs_submit_direct, flags);
if (iov_iter_rw(iter) == WRITE) {
up_read(&BTRFS_I(inode)->dio_sem);
-   current->journal_info = NULL;
if (ret < 0 && ret != -EIOCBQUEUED) {
if (dio_data.reserve)
btrfs_delalloc_release_space(inode, data_reserved,
-- 
2.18.0



[RFC PATCH v2 1/6] fs: pass iocb to direct I/O get_block()

2018-08-27 Thread Omar Sandoval
From: Omar Sandoval 

Split out dio_get_block_t which is the same as get_block_t except that
it takes the iocb as well, and update fs/direct-io.c and all callers to
use it. This is preparation for replacing the use of bh->b_private in
the direct I/O code with iocb->private.

Signed-off-by: Omar Sandoval 
---
 fs/affs/file.c  |  9 -
 fs/btrfs/inode.c|  3 ++-
 fs/direct-io.c  | 13 ++---
 fs/ext2/inode.c |  9 -
 fs/ext4/ext4.h  |  2 --
 fs/ext4/inode.c | 27 ++-
 fs/f2fs/data.c  |  5 +++--
 fs/fat/inode.c  |  9 -
 fs/gfs2/aops.c  |  5 +++--
 fs/hfs/inode.c  |  9 -
 fs/hfsplus/inode.c  |  9 -
 fs/jfs/inode.c  |  9 -
 fs/nilfs2/inode.c   |  9 -
 fs/ocfs2/aops.c | 15 +--
 fs/reiserfs/inode.c |  4 ++--
 fs/udf/inode.c  |  9 -
 include/linux/fs.h  | 10 ++
 17 files changed, 113 insertions(+), 43 deletions(-)

diff --git a/fs/affs/file.c b/fs/affs/file.c
index a85817f54483..66a1a5601d65 100644
--- a/fs/affs/file.c
+++ b/fs/affs/file.c
@@ -389,6 +389,13 @@ static void affs_write_failed(struct address_space *mapping, loff_t to)
}
 }
 
+static int affs_get_block_dio(struct kiocb *iocb, struct inode *inode,
+ sector_t block, struct buffer_head *bh_result,
+ int create)
+{
+   return affs_get_block(inode, block, bh_result, create);
+}
+
 static ssize_t
 affs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 {
@@ -406,7 +413,7 @@ affs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
return 0;
}
 
-   ret = blockdev_direct_IO(iocb, inode, iter, affs_get_block);
+   ret = blockdev_direct_IO(iocb, inode, iter, affs_get_block_dio);
if (ret < 0 && iov_iter_rw(iter) == WRITE)
affs_write_failed(mapping, offset + count);
return ret;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index eba61bcb9bb3..b61ea6dd9956 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7659,7 +7659,8 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
return ret;
 }
 
-static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
+static int btrfs_get_blocks_direct(struct kiocb *iocb, struct inode *inode,
+  sector_t iblock,
   struct buffer_head *bh_result, int create)
 {
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 093fb54cd316..f631aa98849b 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -82,7 +82,7 @@ struct dio_submit {
int reap_counter;   /* rate limit reaping */
sector_t final_block_in_request;/* doesn't change */
int boundary;   /* prev block is at a boundary */
-   get_block_t *get_block; /* block mapping function */
+   dio_get_block_t *get_block; /* block mapping function */
dio_submit_t *submit_io;/* IO submition function */
 
loff_t logical_offset_in_bio;   /* current first logical block in bio */
@@ -713,8 +713,8 @@ static int get_more_blocks(struct dio *dio, struct dio_submit *sdio,
create = 0;
}
 
-   ret = (*sdio->get_block)(dio->inode, fs_startblk,
-   map_bh, create);
+   ret = (*sdio->get_block)(dio->iocb, dio->inode, fs_startblk,
+map_bh, create);
 
/* Store for completion */
dio->private = map_bh->b_private;
@@ -1170,7 +1170,7 @@ static inline int drop_refcount(struct dio *dio)
 static inline ssize_t
 do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
  struct block_device *bdev, struct iov_iter *iter,
- get_block_t get_block, dio_iodone_t end_io,
+ dio_get_block_t get_block, dio_iodone_t end_io,
  dio_submit_t submit_io, int flags)
 {
unsigned i_blkbits = READ_ONCE(inode->i_blkbits);
@@ -1398,9 +1398,8 @@ do_blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
 
 ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
 struct block_device *bdev, struct iov_iter *iter,
-get_block_t get_block,
-dio_iodone_t end_io, dio_submit_t submit_io,
-int flags)
+dio_get_block_t get_block, dio_iodone_t end_io,
+dio_submit_t submit_io, int flags)
 {
/*
 * The block device state is needed in the end to finally
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 71635909df3b..f390e6392238 100644
--- a/fs/ext2/inode.c
+++ b/fs/

[PATCH v5 1/6] mm: split SWP_FILE into SWP_ACTIVATED and SWP_FS

2018-08-31 Thread Omar Sandoval
From: Omar Sandoval 

The SWP_FILE flag serves two purposes: to make swap_{read,write}page()
go through the filesystem, and to make swapoff() call
->swap_deactivate(). For Btrfs, we want the latter but not the former,
so split this flag into two. This makes us always call
->swap_deactivate() if ->swap_activate() succeeded, not just if it
didn't add any swap extents itself.
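
The two roles of the old flag can be illustrated with a small user-space sketch. The bit values are copied from the patched enum; the two helper functions are hypothetical names, and the "NFS-like" flag combination is an assumption, since the hunk that sets the flags is cut off in this archive.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values copied from the patched enum in include/linux/swap.h. */
enum {
	SWP_ACTIVATED = (1 << 7),	/* set after swap_activate success */
	SWP_FS        = (1 << 8),	/* swap file goes through fs */
};

/* Mirrors the checks in __swap_writepage() and swap_readpage(). */
static bool swap_io_goes_through_fs(unsigned long flags)
{
	return (flags & SWP_FS) != 0;
}

/* Mirrors the check in destroy_swap_extents(). */
static bool swapoff_calls_deactivate(unsigned long flags)
{
	return (flags & SWP_ACTIVATED) != 0;
}
```

With SWP_ACTIVATED | SWP_FS both helpers return true (the old SWP_FILE behaviour); with SWP_ACTIVATED alone, which is what Btrfs wants, only swapoff_calls_deactivate() does.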

This also resolves the issue of the very misleading name of SWP_FILE,
which is only used for swap files over NFS.

Reviewed-by: Nikolay Borisov 
Signed-off-by: Omar Sandoval 
---
 include/linux/swap.h | 13 +++--
 mm/page_io.c |  6 +++---
 mm/swapfile.c| 13 -
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 8e2c11e692ba..0fda0aa743f0 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -167,13 +167,14 @@ enum {
SWP_SOLIDSTATE  = (1 << 4), /* blkdev seeks are cheap */
SWP_CONTINUED   = (1 << 5), /* swap_map has count continuation */
SWP_BLKDEV  = (1 << 6), /* its a block device */
-   SWP_FILE= (1 << 7), /* set after swap_activate success */
-   SWP_AREA_DISCARD = (1 << 8),/* single-time swap area discards */
-   SWP_PAGE_DISCARD = (1 << 9),/* freed swap page-cluster discards */
-   SWP_STABLE_WRITES = (1 << 10),  /* no overwrite PG_writeback pages */
-   SWP_SYNCHRONOUS_IO = (1 << 11), /* synchronous IO is efficient */
+   SWP_ACTIVATED   = (1 << 7), /* set after swap_activate success */
+   SWP_FS  = (1 << 8), /* swap file goes through fs */
+   SWP_AREA_DISCARD = (1 << 9),/* single-time swap area discards */
+   SWP_PAGE_DISCARD = (1 << 10),   /* freed swap page-cluster discards */
+   SWP_STABLE_WRITES = (1 << 11),  /* no overwrite PG_writeback pages */
+   SWP_SYNCHRONOUS_IO = (1 << 12), /* synchronous IO is efficient */
/* add others here before... */
-   SWP_SCANNING= (1 << 12),/* refcount in scan_swap_map */
+   SWP_SCANNING= (1 << 13),/* refcount in scan_swap_map */
 };
 
 #define SWAP_CLUSTER_MAX 32UL
diff --git a/mm/page_io.c b/mm/page_io.c
index aafd19ec1db4..e8653c368069 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -283,7 +283,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
struct swap_info_struct *sis = page_swap_info(page);
 
VM_BUG_ON_PAGE(!PageSwapCache(page), page);
-   if (sis->flags & SWP_FILE) {
+   if (sis->flags & SWP_FS) {
struct kiocb kiocb;
struct file *swap_file = sis->swap_file;
struct address_space *mapping = swap_file->f_mapping;
@@ -365,7 +365,7 @@ int swap_readpage(struct page *page, bool synchronous)
goto out;
}
 
-   if (sis->flags & SWP_FILE) {
+   if (sis->flags & SWP_FS) {
struct file *swap_file = sis->swap_file;
struct address_space *mapping = swap_file->f_mapping;
 
@@ -423,7 +423,7 @@ int swap_set_page_dirty(struct page *page)
 {
struct swap_info_struct *sis = page_swap_info(page);
 
-   if (sis->flags & SWP_FILE) {
+   if (sis->flags & SWP_FS) {
struct address_space *mapping = sis->swap_file->f_mapping;
 
VM_BUG_ON_PAGE(!PageSwapCache(page), page);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index d954b71c4f9c..d3f95833d12e 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -989,7 +989,7 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_size)
goto nextsi;
}
if (size == SWAPFILE_CLUSTER) {
-   if (!(si->flags & SWP_FILE))
+   if (!(si->flags & SWP_FS))
n_ret = swap_alloc_cluster(si, swp_entries);
} else
n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
@@ -2310,12 +2310,13 @@ static void destroy_swap_extents(struct swap_info_struct *sis)
kfree(se);
}
 
-   if (sis->flags & SWP_FILE) {
+   if (sis->flags & SWP_ACTIVATED) {
struct file *swap_file = sis->swap_file;
struct address_space *mapping = swap_file->f_mapping;
 
-   sis->flags &= ~SWP_FILE;
-   mapping->a_ops->swap_deactivate(swap_file);
+   sis->flags &= ~SWP_ACTIVATED;
+   if (mapping->a_ops->swap_deactivate)
+   mapping->a_ops->swap_deactivate(swap_file);
}
 }
 
@@ -2411,8 +2412,10 @@ static int setup_swap_extents(struct swap_info_struct 
*sis, sector_t 

[PATCH v5 5/6] Btrfs: rename get_chunk_map() and make it non-static

2018-08-31 Thread Omar Sandoval
From: Omar Sandoval 

The Btrfs swap code is going to need it, so give it a btrfs_ prefix and
make it non-static.
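
Callers of the renamed function check the result with IS_ERR() and PTR_ERR(), as the hunks below show. For readers unfamiliar with that convention, here is a user-space sketch of it. The three helpers imitate the kernel's err.h; toy_get_chunk_map() and struct toy_map are hypothetical, and the -EINVAL failure code is chosen for the toy, not taken from this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* User-space imitations of the kernel helpers from include/linux/err.h. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct toy_map { unsigned long long start, len; };

/* Hypothetical lookup: a single mapping covering [0, 1 MiB). */
static struct toy_map *toy_get_chunk_map(unsigned long long logical,
					  unsigned long long length)
{
	static struct toy_map only = { 0, 1ULL << 20 };

	if (length == 0 || logical >= only.start + only.len)
		return ERR_PTR(-EINVAL);
	return &only;
}
```

A caller then writes `em = toy_get_chunk_map(...); if (IS_ERR(em)) return PTR_ERR(em);`, the same shape as the converted call sites.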

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/volumes.c | 22 +++---
 fs/btrfs/volumes.h |  9 +
 2 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c03ef5322689..0aa8aff6774b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2712,8 +2712,8 @@ static int btrfs_del_sys_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
return ret;
 }
 
-static struct extent_map *get_chunk_map(struct btrfs_fs_info *fs_info,
-   u64 logical, u64 length)
+struct extent_map *btrfs_get_chunk_map(struct btrfs_fs_info *fs_info,
+  u64 logical, u64 length)
 {
struct extent_map_tree *em_tree;
struct extent_map *em;
@@ -2750,7 +2750,7 @@ int btrfs_remove_chunk(struct btrfs_trans_handle *trans, u64 chunk_offset)
int i, ret = 0;
struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
 
-   em = get_chunk_map(fs_info, chunk_offset, 1);
+   em = btrfs_get_chunk_map(fs_info, chunk_offset, 1);
if (IS_ERR(em)) {
/*
 * This is a logic error, but we don't want to just rely on the
@@ -4884,7 +4884,7 @@ int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans,
int i = 0;
int ret = 0;
 
-   em = get_chunk_map(fs_info, chunk_offset, chunk_size);
+   em = btrfs_get_chunk_map(fs_info, chunk_offset, chunk_size);
if (IS_ERR(em))
return PTR_ERR(em);
 
@@ -5026,7 +5026,7 @@ int btrfs_chunk_readonly(struct btrfs_fs_info *fs_info, u64 chunk_offset)
int miss_ndevs = 0;
int i;
 
-   em = get_chunk_map(fs_info, chunk_offset, 1);
+   em = btrfs_get_chunk_map(fs_info, chunk_offset, 1);
if (IS_ERR(em))
return 1;
 
@@ -5086,7 +5086,7 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
struct map_lookup *map;
int ret;
 
-   em = get_chunk_map(fs_info, logical, len);
+   em = btrfs_get_chunk_map(fs_info, logical, len);
if (IS_ERR(em))
/*
 * We could return errors for these cases, but that could get
@@ -5132,7 +5132,7 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info,
struct map_lookup *map;
unsigned long len = fs_info->sectorsize;
 
-   em = get_chunk_map(fs_info, logical, len);
+   em = btrfs_get_chunk_map(fs_info, logical, len);
 
if (!WARN_ON(IS_ERR(em))) {
map = em->map_lookup;
@@ -5149,7 +5149,7 @@ int btrfs_is_parity_mirror(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
struct map_lookup *map;
int ret = 0;
 
-   em = get_chunk_map(fs_info, logical, len);
+   em = btrfs_get_chunk_map(fs_info, logical, len);
 
if(!WARN_ON(IS_ERR(em))) {
map = em->map_lookup;
@@ -5308,7 +5308,7 @@ static int __btrfs_map_block_for_discard(struct btrfs_fs_info *fs_info,
/* discard always return a bbio */
ASSERT(bbio_ret);
 
-   em = get_chunk_map(fs_info, logical, length);
+   em = btrfs_get_chunk_map(fs_info, logical, length);
if (IS_ERR(em))
return PTR_ERR(em);
 
@@ -5634,7 +5634,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
return __btrfs_map_block_for_discard(fs_info, logical,
 *length, bbio_ret);
 
-   em = get_chunk_map(fs_info, logical, *length);
+   em = btrfs_get_chunk_map(fs_info, logical, *length);
if (IS_ERR(em))
return PTR_ERR(em);
 
@@ -5933,7 +5933,7 @@ int btrfs_rmap_block(struct btrfs_fs_info *fs_info, u64 chunk_start,
u64 rmap_len;
int i, j, nr = 0;
 
-   em = get_chunk_map(fs_info, chunk_start, 1);
+   em = btrfs_get_chunk_map(fs_info, chunk_start, 1);
if (IS_ERR(em))
return -EIO;
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 23e9285d88de..d68c8a05a774 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -465,6 +465,15 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info,
 int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans,
 u64 chunk_offset, u64 chunk_size);
 int btrfs_remove_chunk(struct btrfs_trans_handle *trans, u64 chunk_offset);
+/**
+ * btrfs_get_chunk_map() - Find the mapping containing the given logical extent.
+ * @logical: Logical block offset in bytes.
+ * @length: Length of extent in bytes.
+ *
+ * Return: Chunk mapping or ERR_PTR.
+ */
+struct extent_map *btrfs_get_chunk_map(struct btrfs_fs_info *fs_info,
+  u64 logical, u64 length);
 
 static inline void btrfs_dev_stat_inc(struct btrfs_device *dev,

[PATCH v5 6/6] Btrfs: support swap files

2018-08-31 Thread Omar Sandoval
From: Omar Sandoval 

Implement the swap file a_ops on Btrfs. Activation needs to make sure
that the file can be used as a swap file, which currently means it must
be fully allocated as nocow with no compression on one device. It also
sets up the swap extents directly with add_swap_extent(), so export it.
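
The page rounding that btrfs_add_swap_extent() performs below may be easier to follow with numbers. This is a user-space sketch under stated assumptions: a 4 KiB page size, a macro named ALIGN_UP standing in for the kernel's round-up ALIGN(), and a hypothetical helper whole_pages(). Only pages that lie entirely inside the physical extent are counted.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. ALIGN_UP plays the role of the kernel's ALIGN(). */
#define PAGE_SHIFT 12
#define PAGE_SIZE (1ULL << PAGE_SHIFT)
#define ALIGN_UP(x, a)   (((x) + (a) - 1) & ~((uint64_t)(a) - 1))
#define ALIGN_DOWN(x, a) ((x) & ~((uint64_t)(a) - 1))

/*
 * Count the whole pages inside [block_start, block_start + block_len)
 * and report the first one, following the arithmetic in
 * btrfs_add_swap_extent().
 */
static uint64_t whole_pages(uint64_t block_start, uint64_t block_len,
			    uint64_t *first_ppage)
{
	uint64_t first = ALIGN_UP(block_start, PAGE_SIZE) >> PAGE_SHIFT;
	uint64_t next = ALIGN_DOWN(block_start + block_len,
				   PAGE_SIZE) >> PAGE_SHIFT;

	*first_ppage = first;
	/* An extent smaller than a page may contain no whole page at all. */
	return first >= next ? 0 : next - first;
}
```

For example, an extent starting at byte 6144 with length 12288 spans bytes 6144 to 18432; rounding inward gives pages 2 and 3, so two pages are usable and the partial pages at either end are skipped.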

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/inode.c | 232 +++
 1 file changed, 232 insertions(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9357a19d2bff..c0409e632768 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include "ctree.h"
 #include "disk-io.h"
@@ -10437,6 +10438,235 @@ void btrfs_set_range_writeback(struct extent_io_tree *tree, u64 start, u64 end)
}
 }
 
+struct btrfs_swap_info {
+   u64 start;
+   u64 block_start;
+   u64 block_len;
+   u64 lowest_ppage;
+   u64 highest_ppage;
+   unsigned long nr_pages;
+   int nr_extents;
+};
+
+static int btrfs_add_swap_extent(struct swap_info_struct *sis,
+struct btrfs_swap_info *bsi)
+{
+   unsigned long nr_pages;
+   u64 first_ppage, first_ppage_reported, next_ppage;
+   int ret;
+
+   first_ppage = ALIGN(bsi->block_start, PAGE_SIZE) >> PAGE_SHIFT;
+   next_ppage = ALIGN_DOWN(bsi->block_start + bsi->block_len,
+   PAGE_SIZE) >> PAGE_SHIFT;
+
+   if (first_ppage >= next_ppage)
+   return 0;
+   nr_pages = next_ppage - first_ppage;
+
+   first_ppage_reported = first_ppage;
+   if (bsi->start == 0)
+   first_ppage_reported++;
+   if (bsi->lowest_ppage > first_ppage_reported)
+   bsi->lowest_ppage = first_ppage_reported;
+   if (bsi->highest_ppage < (next_ppage - 1))
+   bsi->highest_ppage = next_ppage - 1;
+
+   ret = add_swap_extent(sis, bsi->nr_pages, nr_pages, first_ppage);
+   if (ret < 0)
+   return ret;
+   bsi->nr_extents += ret;
+   bsi->nr_pages += nr_pages;
+   return 0;
+}
+
+static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file,
+  sector_t *span)
+{
+   struct inode *inode = file_inode(file);
+   struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
+   struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
+   struct extent_state *cached_state = NULL;
+   struct extent_map *em = NULL;
+   struct btrfs_device *device = NULL;
+   struct btrfs_swap_info bsi = {
+   .lowest_ppage = (sector_t)-1ULL,
+   };
+   int ret = 0;
+   u64 isize = inode->i_size;
+   u64 start;
+
+   /*
+* If the swap file was just created, make sure delalloc is done. If the
+* file changes again after this, the user is doing something stupid and
+* we don't really care.
+*/
+   ret = btrfs_wait_ordered_range(inode, 0, (u64)-1);
+   if (ret)
+   return ret;
+
+   /*
+* The inode is locked, so these flags won't change after we check them.
+*/
+   if (BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS) {
+   btrfs_err(fs_info, "swapfile must not be compressed");
+   return -EINVAL;
+   }
+   if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW)) {
+   btrfs_err(fs_info, "swapfile must not be copy-on-write");
+   return -EINVAL;
+   }
+
+   /*
+* Balance or device remove/replace/resize can move stuff around from
+* under us. The EXCL_OP flag makes sure they aren't running/won't run
+* concurrently while we are mapping the swap extents, and the fs_info
+* nr_swapfiles counter prevents them from running while the swap file
+* is active and moving the extents. Note that this also prevents a
+* concurrent device add which isn't actually necessary, but it's not
+* really worth the trouble to allow it.
+*/
+   if (test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags))
+   return -EBUSY;
+   atomic_inc(&fs_info->nr_swapfiles);
+   /*
+* Snapshots can create extents which require COW even if NODATACOW is
+* set. We use this counter to prevent snapshots. We must increment it
+* before walking the extents because we don't want a concurrent
+* snapshot to run after we've already checked the extents.
+*/
+   atomic_inc(&BTRFS_I(inode)->root->nr_swapfiles);
+
+   lock_extent_bits(io_tree, 0, isize - 1, &cached_state);
+   start = 0;
+   while (start < isize) {
+   u64 end, logical_block_start, physical_block_start

[PATCH v5 2/6] mm: export add_swap_extent()

2018-08-31 Thread Omar Sandoval
From: Omar Sandoval 

Btrfs will need this for swap file support.

Signed-off-by: Omar Sandoval 
---
 mm/swapfile.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index d3f95833d12e..51cb30de17bc 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2365,6 +2365,7 @@ add_swap_extent(struct swap_info_struct *sis, unsigned long start_page,
list_add_tail(&new_se->list, &sis->first_swap_extent.list);
return 1;
 }
+EXPORT_SYMBOL_GPL(add_swap_extent);
 
 /*
  * A `swap extent' is a simple thing which maps a contiguous range of pages
-- 
2.18.0



[PATCH v5 0/6] Btrfs: implement swap file support

2018-08-31 Thread Omar Sandoval
From: Omar Sandoval 

Hi,

This series implements swap file support for Btrfs.

Changes since v4 [1]:

- Added a kernel doc for btrfs_get_chunk_map()
- Got rid of "Btrfs: push EXCL_OP set into btrfs_rm_device()"
- Made activate error messages more clear and consistent
- Changed clear vs unlock order in activate error case
- Added "mm: export add_swap_extent()" as a separate patch
- Added a btrfs_wait_ordered_range() at the beginning of
  btrfs_swap_activate() to catch newly created files
- Added some Reviewed-bys from Nikolay

I took a stab at adding support for balance when a swap file is active,
but it's a major pain: we need to mark block groups which contain swap
file extents, check the block group counter in relocate/scrub, then
unmark the block groups when the swap file is deactivated, which gets
really messy because the file can grow while it is an active swap file.
If this is a deal breaker, I can work something out, but I don't think
it's worth the trouble.

This was tested with the swap tests in xfstests plus my new tests here
[2]. Additionally, I used my swapme test program [3] and ran a few
memory-intensive workloads (e.g., a highly parallel kernel build),
verifying that swap was being used. All of this was done with lockdep
enabled.

This series is based on v4.19-rc1. Please take a look.

Thanks!

1: https://www.spinics.net/lists/linux-btrfs/msg78731.html
2: https://github.com/osandov/xfstests/tree/btrfs-swap
3: https://github.com/osandov/osandov-linux/blob/master/scripts/swapme.c

Omar Sandoval (6):
  mm: split SWP_FILE into SWP_ACTIVATED and SWP_FS
  mm: export add_swap_extent()
  vfs: update swap_{,de}activate documentation
  Btrfs: prevent ioctls from interfering with a swap file
  Btrfs: rename get_chunk_map() and make it non-static
  Btrfs: support swap files

 Documentation/filesystems/Locking |  17 +--
 Documentation/filesystems/vfs.txt |  12 +-
 fs/btrfs/ctree.h  |   6 +
 fs/btrfs/disk-io.c|   3 +
 fs/btrfs/inode.c  | 232 ++
 fs/btrfs/ioctl.c  |  51 ++-
 fs/btrfs/volumes.c|  28 ++--
 fs/btrfs/volumes.h|   9 ++
 include/linux/swap.h  |  13 +-
 mm/page_io.c  |   6 +-
 mm/swapfile.c |  14 +-
 11 files changed, 348 insertions(+), 43 deletions(-)

-- 
2.18.0



[PATCH v5 4/6] Btrfs: prevent ioctls from interfering with a swap file

2018-08-31 Thread Omar Sandoval
From: Omar Sandoval 

When a swap file is active, we must make sure that the extents of the
file are not moved and that they don't become shared. That means that
the following are not safe:

- chattr +c (enable compression)
- reflink
- dedupe
- snapshot
- defrag
- balance
- device remove/replace/resize

Don't allow those to happen on an active swap file. Balance and device
remove/replace/resize in particular are disallowed entirely; in the
future, we can relax this so that relocation skips/errors out only on
chunks containing an active swap file.

Note that we don't have to worry about chattr -C (disable nocow), which
we ignore for non-empty files, because an active swapfile must be
non-empty and can't be truncated. We also don't have to worry about
autodefrag because it's only done on COW files. Truncate and fallocate
are already taken care of by the generic code. Device add doesn't do
relocation so it's not an issue, either.
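
The two counters this patch adds can be modelled in user space with C11 atomics. This is a sketch, not the kernel implementation: all toy_* names are hypothetical, and the decrement in toy_swap_deactivate() is an assumption, since the deactivate path is not visible in this archive. The point is the scoping: the per-root counter blocks snapshots of one subvolume only, while the fs_info counter is filesystem-wide.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-ins; only the counters matter here. */
struct toy_fs_info { atomic_int nr_swapfiles; };
struct toy_root {
	struct toy_fs_info *fs_info;
	atomic_int nr_swapfiles;
};

/* Activation bumps both counters, as btrfs_swap_activate() does. */
static void toy_swap_activate(struct toy_root *root)
{
	atomic_fetch_add(&root->fs_info->nr_swapfiles, 1);
	atomic_fetch_add(&root->nr_swapfiles, 1);
}

/* Assumed inverse of activation. */
static void toy_swap_deactivate(struct toy_root *root)
{
	atomic_fetch_sub(&root->nr_swapfiles, 1);
	atomic_fetch_sub(&root->fs_info->nr_swapfiles, 1);
}

/* Per-subvolume check, as in create_snapshot(). */
static int toy_create_snapshot(struct toy_root *root)
{
	if (atomic_load(&root->nr_swapfiles))
		return -ETXTBSY;
	return 0;
}

/* Filesystem-wide check, as used before balance or device operations. */
static bool toy_fs_has_active_swapfile(struct toy_fs_info *fs_info)
{
	return atomic_load(&fs_info->nr_swapfiles) != 0;
}
```

In this model, activating a swap file in one subvolume makes snapshots of that subvolume fail with -ETXTBSY while snapshots of other subvolumes still succeed.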

Signed-off-by: Omar Sandoval 
---
 fs/btrfs/ctree.h   |  6 ++
 fs/btrfs/disk-io.c |  3 +++
 fs/btrfs/ioctl.c   | 51 ++
 fs/btrfs/volumes.c |  6 ++
 4 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 53af9f5253f4..1c767a6394ae 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1121,6 +1121,9 @@ struct btrfs_fs_info {
u32 sectorsize;
u32 stripesize;
 
+   /* Number of active swapfiles */
+   atomic_t nr_swapfiles;
+
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
spinlock_t ref_verify_lock;
struct rb_root block_tree;
@@ -1285,6 +1288,9 @@ struct btrfs_root {
spinlock_t qgroup_meta_rsv_lock;
u64 qgroup_meta_rsv_pertrans;
u64 qgroup_meta_rsv_prealloc;
+
+   /* Number of active swapfiles */
+   atomic_t nr_swapfiles;
 };
 
 struct btrfs_file_private {
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 5124c15705ce..50ee5cd3efae 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1187,6 +1187,7 @@ static void __setup_root(struct btrfs_root *root, struct btrfs_fs_info *fs_info,
atomic_set(&root->log_batch, 0);
refcount_set(&root->refs, 1);
atomic_set(&root->will_be_snapshotted, 0);
+   atomic_set(&root->nr_swapfiles, 0);
root->log_transid = 0;
root->log_transid_committed = -1;
root->last_log_commit = 0;
@@ -2781,6 +2782,8 @@ int open_ctree(struct super_block *sb,
fs_info->sectorsize = 4096;
fs_info->stripesize = 4096;
 
+   atomic_set(&fs_info->nr_swapfiles, 0);
+
ret = btrfs_alloc_stripe_hash_table(fs_info);
if (ret) {
err = ret;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 63600dc2ac4c..cc230dcd32a4 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -290,6 +290,11 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
} else if (fsflags & FS_COMPR_FL) {
const char *comp;
 
+   if (IS_SWAPFILE(inode)) {
+   ret = -ETXTBSY;
+   goto out_unlock;
+   }
+
binode->flags |= BTRFS_INODE_COMPRESS;
binode->flags &= ~BTRFS_INODE_NOCOMPRESS;
 
@@ -751,6 +756,12 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
return -EINVAL;
 
+   if (atomic_read(&root->nr_swapfiles)) {
+   btrfs_info(fs_info,
+  "cannot snapshot subvolume with active swapfile");
+   return -ETXTBSY;
+   }
+
pending_snapshot = kzalloc(sizeof(*pending_snapshot), GFP_KERNEL);
if (!pending_snapshot)
return -ENOMEM;
@@ -1487,9 +1498,13 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
}
 
inode_lock(inode);
-   if (do_compress)
-   BTRFS_I(inode)->defrag_compress = compress_type;
-   ret = cluster_pages_for_defrag(inode, pages, i, cluster);
+   if (IS_SWAPFILE(inode)) {
+   ret = -ETXTBSY;
+   } else {
+   if (do_compress)
+   BTRFS_I(inode)->defrag_compress = compress_type;
+   ret = cluster_pages_for_defrag(inode, pages, i, cluster);
+   }
if (ret < 0) {
inode_unlock(inode);
goto out_ra;
@@ -1585,6 +1600,12 @@ static noinline int btrfs_ioctl_resize(struct file *file,
return BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
}
 
+   if (atomic_read(&fs_info->nr_swapfiles)) {
+   btrfs_info(fs_info, "cannot resize with active swapfile");

[PATCH v5 3/6] vfs: update swap_{,de}activate documentation

2018-08-31 Thread Omar Sandoval
From: Omar Sandoval 

The documentation for these functions is wrong in several ways:

- swap_activate() is called with the inode locked
- swap_activate() takes a swap_info_struct * and a sector_t *
- swap_activate() can also return a positive number of extents it added
  itself
- swap_deactivate() does not return anything
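
The three-way return convention of ->swap_activate() described in the updated vfs.txt text can be written down as a small classifier. This is only an illustrative sketch; the enum and function names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>

/* Hypothetical labels for the three outcomes of ->swap_activate(). */
enum swap_io_path {
	SWAP_ACTIVATE_FAILED,	/* negative errno: swapon fails */
	SWAP_VIA_FS,		/* zero: I/O goes through ->readpage()/->direct_IO() */
	SWAP_VIA_EXTENTS,	/* positive: that many extents were added directly */
};

static enum swap_io_path classify_swap_activate(int ret)
{
	if (ret < 0)
		return SWAP_ACTIVATE_FAILED;
	if (ret == 0)
		return SWAP_VIA_FS;
	return SWAP_VIA_EXTENTS;
}
```

Btrfs, later in this series, falls into the third case: it calls add_swap_extent() itself and returns the number of extents it added.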

Reviewed-by: Nikolay Borisov 
Signed-off-by: Omar Sandoval 
---
 Documentation/filesystems/Locking | 17 +++--
 Documentation/filesystems/vfs.txt | 12 
 2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index efea228ccd8a..b970c8c2ee22 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -210,8 +210,9 @@ prototypes:
int (*launder_page)(struct page *);
int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long);
int (*error_remove_page)(struct address_space *, struct page *);
-   int (*swap_activate)(struct file *);
-   int (*swap_deactivate)(struct file *);
+   int (*swap_activate)(struct swap_info_struct *, struct file *,
+sector_t *);
+   void (*swap_deactivate)(struct file *);
 
 locking rules:
All except set_page_dirty and freepage may block
@@ -235,8 +236,8 @@ putback_page:   yes
 launder_page:  yes
 is_partially_uptodate: yes
 error_remove_page: yes
-swap_activate: no
-swap_deactivate:   no
+swap_activate: yes
+swap_deactivate:   no
 
->write_begin(), ->write_end() and ->readpage() may be called from
 the request handler (/dev/loop).
@@ -333,14 +334,10 @@ cleaned, or an error value if not. Note that in order to prevent the page
 getting mapped back in and redirtied, it needs to be kept locked
 across the entire operation.
 
-   ->swap_activate will be called with a non-zero argument on
-files backing (non block device backed) swapfiles. A return value
-of zero indicates success, in which case this file can be used for
-backing swapspace. The swapspace operations will be proxied to the
-address space operations.
+   ->swap_activate is called from sys_swapon() with the inode locked.
 
->swap_deactivate() will be called in the sys_swapoff()
-path after ->swap_activate() returned success.
+path after ->swap_activate() returned success. The inode is not locked.
 
 --- file_lock_operations --
 prototypes:
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 4b2084d0f1fb..40d6d6d4b76b 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -652,8 +652,9 @@ struct address_space_operations {
unsigned long);
void (*is_dirty_writeback) (struct page *, bool *, bool *);
int (*error_remove_page) (struct mapping *mapping, struct page *page);
-   int (*swap_activate)(struct file *);
-   int (*swap_deactivate)(struct file *);
+   int (*swap_activate)(struct swap_info_struct *, struct file *,
+sector_t *);
+   void (*swap_deactivate)(struct file *);
 };
 
   writepage: called by the VM to write a dirty page to backing store.
@@ -830,8 +831,11 @@ struct address_space_operations {
 
   swap_activate: Called when swapon is used on a file to allocate
space if necessary and pin the block lookup information in
-   memory. A return value of zero indicates success,
-   in which case this file can be used to back swapspace.
+   memory. If this returns zero, the swap system will call the address
+   space operations ->readpage() and ->direct_IO(). Alternatively, this
+   may call add_swap_extent() and return the number of extents added, in
+   which case the swap system will use the provided blocks directly
+   instead of going through the filesystem.
 
   swap_deactivate: Called during swapoff on files where swap_activate
was successful.
-- 
2.18.0



Re: [PATCH 02/35] btrfs: add cleanup_ref_head_accounting helper

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:41:52PM -0400, Josef Bacik wrote:
> From: Josef Bacik 
> 
> We were missing some quota cleanups in check_ref_cleanup, so break the
> ref head accounting cleanup into a helper and call that from both
> check_ref_cleanup and cleanup_ref_head.  This will hopefully ensure that
> we don't screw up accounting in the future for other things that we add.

Reviewed-by: Omar Sandoval 

> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/extent-tree.c | 67 
> +-
>  1 file changed, 39 insertions(+), 28 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 6799950fa057..4c9fd35bca07 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -2461,6 +2461,41 @@ static int cleanup_extent_op(struct btrfs_trans_handle *trans,
>   return ret ? ret : 1;
>  }
>  
> +static void cleanup_ref_head_accounting(struct btrfs_trans_handle *trans,
> + struct btrfs_delayed_ref_head *head)
> +{
> + struct btrfs_fs_info *fs_info = trans->fs_info;
> + struct btrfs_delayed_ref_root *delayed_refs =
> + &trans->transaction->delayed_refs;
> +
> + if (head->total_ref_mod < 0) {
> + struct btrfs_space_info *space_info;
> + u64 flags;
> +
> + if (head->is_data)
> + flags = BTRFS_BLOCK_GROUP_DATA;
> + else if (head->is_system)
> + flags = BTRFS_BLOCK_GROUP_SYSTEM;
> + else
> + flags = BTRFS_BLOCK_GROUP_METADATA;
> + space_info = __find_space_info(fs_info, flags);
> + ASSERT(space_info);
> + percpu_counter_add_batch(&space_info->total_bytes_pinned,
> +-head->num_bytes,
> +BTRFS_TOTAL_BYTES_PINNED_BATCH);

While you're here, could you fix this botched whitespace?


Re: [PATCH 03/35] btrfs: use cleanup_extent_op in check_ref_cleanup

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:41:53PM -0400, Josef Bacik wrote:
> From: Josef Bacik 
> 
> Unify the extent_op handling as well, just add a flag so we don't
> actually run the extent op from check_ref_cleanup and instead return a
> value so that we can skip cleaning up the ref head.
> 
> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/extent-tree.c | 17 +
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 4c9fd35bca07..87c42a2c45b1 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -2443,18 +2443,23 @@ static void unselect_delayed_ref_head(struct btrfs_delayed_ref_root *delayed_ref
>  }
>  
>  static int cleanup_extent_op(struct btrfs_trans_handle *trans,
> -  struct btrfs_delayed_ref_head *head)
> +  struct btrfs_delayed_ref_head *head,
> +  bool run_extent_op)
>  {
>   struct btrfs_delayed_extent_op *extent_op = head->extent_op;
>   int ret;
>  
>   if (!extent_op)
>   return 0;
> +
>   head->extent_op = NULL;
>   if (head->must_insert_reserved) {
>   btrfs_free_delayed_extent_op(extent_op);
>   return 0;
> + } else if (!run_extent_op) {
> + return 1;
>   }
> +
>   spin_unlock(&head->lock);
>   ret = run_delayed_extent_op(trans, head, extent_op);
>   btrfs_free_delayed_extent_op(extent_op);

So if cleanup_extent_op() returns 1, then the head was unlocked, unless
run_extent_op was false. That's pretty confusing. Can we make it always
unlock in the !must_insert_reserved case?

> @@ -2506,7 +2511,7 @@ static int cleanup_ref_head(struct btrfs_trans_handle *trans,
>  
>   delayed_refs = &trans->transaction->delayed_refs;
>  
> - ret = cleanup_extent_op(trans, head);
> + ret = cleanup_extent_op(trans, head, true);
>   if (ret < 0) {
>   unselect_delayed_ref_head(delayed_refs, head);
>   btrfs_debug(fs_info, "run_delayed_extent_op returned %d", ret);
> @@ -6977,12 +6982,8 @@ static noinline int check_ref_cleanup(struct btrfs_trans_handle *trans,
>   if (!RB_EMPTY_ROOT(&head->ref_tree))
>   goto out;
>  
> - if (head->extent_op) {
> - if (!head->must_insert_reserved)
> - goto out;
> - btrfs_free_delayed_extent_op(head->extent_op);
> - head->extent_op = NULL;
> - }
> + if (cleanup_extent_op(trans, head, false))
> + goto out;
>  
>   /*
>* waiting for the lock here would deadlock.  If someone else has it
> -- 
> 2.14.3
> 


Re: [PATCH 06/35] btrfs: check if free bgs for commit

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:41:56PM -0400, Josef Bacik wrote:
> may_commit_transaction will skip committing the transaction if we don't
> have enough pinned space or if we're trying to find space for a SYSTEM
> chunk.  However if we have pending free block groups in this transaction
> we still want to commit as we may be able to allocate a chunk to make
> our reservation.  So instead of just returning ENOSPC, check if we have
> free block groups pending, and if so commit the transaction to allow us
> to use that free space.

This makes sense.

> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/extent-tree.c | 18 --
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 6e7f350754d2..80615a579b18 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4804,6 +4804,7 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
>   struct btrfs_trans_handle *trans;
>   u64 bytes;
>   u64 reclaim_bytes = 0;
> + bool do_commit = true;

I find this naming a little mind bending when I read

do_commit = false;
goto commit;

Since the end result is that we always join the transaction if we make
it past the (!bytes) check anyways, can we do the pending bgs check
first? I find the following easier to follow, fwiw.

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index de6f75f5547b..dd7aeb5fb6bf 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4779,18 +4779,25 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 	if (!bytes)
 		return 0;
 
-	/* See if there is enough pinned space to make this reservation */
-	if (__percpu_counter_compare(&space_info->total_bytes_pinned,
-				   bytes,
-				   BTRFS_TOTAL_BYTES_PINNED_BATCH) >= 0)
-		goto commit;
+	trans = btrfs_join_transaction(fs_info->extent_root);
+	if (IS_ERR(trans))
+		return -ENOSPC;
+
+	/*
+	 * See if we have a pending bg or there is enough pinned space to make
+	 * this reservation.
+	 */
+	if (test_bit(BTRFS_TRANS_HAVE_FREE_BGS, &trans->transaction->flags) ||
+	    __percpu_counter_compare(&space_info->total_bytes_pinned, bytes,
+				     BTRFS_TOTAL_BYTES_PINNED_BATCH) >= 0)
+		return btrfs_commit_transaction(trans);
 
 	/*
 	 * See if there is some space in the delayed insertion reservation for
 	 * this reservation.
 	 */
 	if (space_info != delayed_rsv->space_info)
-		return -ENOSPC;
+		goto enospc;
 
 	spin_lock(&delayed_rsv->lock);
 	if (delayed_rsv->size > bytes)
@@ -4801,16 +4808,14 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 
 	if (__percpu_counter_compare(&space_info->total_bytes_pinned,
 				   bytes,
-				   BTRFS_TOTAL_BYTES_PINNED_BATCH) < 0) {
-		return -ENOSPC;
-	}
-
-commit:
-	trans = btrfs_join_transaction(fs_info->extent_root);
-	if (IS_ERR(trans))
-		return -ENOSPC;
+				   BTRFS_TOTAL_BYTES_PINNED_BATCH) < 0)
+		goto enospc;
 
 	return btrfs_commit_transaction(trans);
+
+enospc:
+	btrfs_end_transaction(trans);
+	return -ENOSPC;
 }
 
 /*



Re: [PATCH 22/35] btrfs: make sure we create all new bgs

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:42:12PM -0400, Josef Bacik wrote:
> We can actually allocate new chunks while we're creating our bg's, so
> instead of doing list_for_each_safe, just do while (!list_empty()) so we
> make sure to catch any new bg's that get added to the list.

Reviewed-by: Omar Sandoval 

Since Nikolay pointed it out, might as well mention in the commit
message that this can happen because we modify the chunk and extent
trees.
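
For anyone following along, here is a rough userspace sketch of why the
drain loop matters. It is not the btrfs code: the list helpers are minimal
stand-ins for the kernel's list_head API, and struct model_bg, process_bg(),
walk_safe(), drain() and run_scenario() are hypothetical names made up for
illustration. The "safe" walk caches the next pointer before the body runs,
so an entry queued while the last entry is being processed is never visited.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-in for a pending block group. */
struct model_bg {
	struct list_head bg_list;
	struct model_bg *spawns;	/* queued on the list while we are processed */
};

#define bg_entry(p) \
	((struct model_bg *)((char *)(p) - offsetof(struct model_bg, bg_list)))

/* Process one entry; like a chunk allocation, this may queue another entry. */
static void process_bg(struct model_bg *bg, struct list_head *new_bgs)
{
	if (bg->spawns)
		list_add_tail(&bg->spawns->bg_list, new_bgs);
	list_del_init(&bg->bg_list);
}

/* "Safe" walk: the next pointer is cached before the body runs. */
static int walk_safe(struct list_head *new_bgs)
{
	struct list_head *pos, *tmp;
	int processed = 0;

	for (pos = new_bgs->next, tmp = pos->next; pos != new_bgs;
	     pos = tmp, tmp = pos->next) {
		process_bg(bg_entry(pos), new_bgs);
		processed++;
	}
	return processed;
}

/* Drain loop: re-read the head every time, so late additions are seen. */
static int drain(struct list_head *new_bgs)
{
	int processed = 0;

	while (!list_empty(new_bgs)) {
		process_bg(bg_entry(new_bgs->next), new_bgs);
		processed++;
	}
	return processed;
}

/* One entry on the list whose processing queues a second entry. */
static int run_scenario(int use_drain)
{
	struct list_head new_bgs;
	struct model_bg second = { .spawns = NULL };
	struct model_bg first = { .spawns = &second };

	list_init(&new_bgs);
	list_add_tail(&first.bg_list, &new_bgs);
	return use_drain ? drain(&new_bgs) : walk_safe(&new_bgs);
}
```

Calling run_scenario(0) exercises the cached-next walk and run_scenario(1)
the drain loop; the return value is the number of entries each one visited.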

> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/extent-tree.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index ca98c39308f6..fc30ff96f0d6 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -10331,7 +10331,7 @@ int btrfs_read_block_groups(struct btrfs_fs_info *info)
>  void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans)
>  {
>   struct btrfs_fs_info *fs_info = trans->fs_info;
> - struct btrfs_block_group_cache *block_group, *tmp;
> + struct btrfs_block_group_cache *block_group;
>   struct btrfs_root *extent_root = fs_info->extent_root;
>   struct btrfs_block_group_item item;
>   struct btrfs_key key;
> @@ -10339,7 +10339,10 @@ void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans)
>   bool can_flush_pending_bgs = trans->can_flush_pending_bgs;
>  
>   trans->can_flush_pending_bgs = false;
> - list_for_each_entry_safe(block_group, tmp, &trans->new_bgs, bg_list) {
> + while (!list_empty(&trans->new_bgs)) {
> + block_group = list_first_entry(&trans->new_bgs,
> +struct btrfs_block_group_cache,
> +bg_list);
>   if (ret)
>   goto next;
>  
> -- 
> 2.14.3
> 


Re: [PATCH 08/35] btrfs: release metadata before running delayed refs

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:41:58PM -0400, Josef Bacik wrote:
> We want to release the unused reservation we have since it refills the
> delayed refs reserve, which will make everything go smoother when
> running the delayed refs if we're short on our reservation.

Reviewed-by: Omar Sandoval 

> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/transaction.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 99741254e27e..ebb0c0405598 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1915,6 +1915,9 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>   return ret;
>   }
>  
> + btrfs_trans_release_metadata(trans);
> + trans->block_rsv = NULL;
> +
>   /* make a pass through all the delayed refs we have so far
>* any runnings procs may add more while we are here
>*/
> @@ -1924,9 +1927,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>   return ret;
>   }
>  
> - btrfs_trans_release_metadata(trans);
> - trans->block_rsv = NULL;
> -
>   cur_trans = trans->transaction;
>  
>   /*
> -- 
> 2.14.3
> 


Re: [PATCH 09/35] btrfs: protect space cache inode alloc with nofs

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:41:59PM -0400, Josef Bacik wrote:
> If we're allocating a new space cache inode it's likely going to be
> under a transaction handle, so we need to use memalloc_nofs_save() in
> order to avoid deadlocks, and more importantly lockdep messages that
> make xfstests fail.

Could use a comment where we call memalloc_nofs_save(). Otherwise,

Reviewed-by: Omar Sandoval 
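
To make the request concrete, below is a userspace model of the scoped
save/restore pattern with the kind of comment I have in mind at the call
site. Everything prefixed model_ is hypothetical and exists only for this
sketch; in the kernel the state lives in the task flags rather than a global,
and the comment wording is just a suggestion.

```c
#include <assert.h>

/* Userspace model only: the kernel tracks this bit per task. */
#define MODEL_PF_MEMALLOC_NOFS 0x1u

static unsigned int model_task_flags;

/* Enter a NOFS scope and return the previous state of the bit. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave the scope by putting back whatever the matching save returned. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* Would an allocation made right now be allowed to recurse into the fs? */
static int model_alloc_may_enter_fs(void)
{
	return !(model_task_flags & MODEL_PF_MEMALLOC_NOFS);
}

/* Stand-in for the inode lookup; reports what the lookup would observe. */
static int model_lookup_inode(void)
{
	unsigned int nofs_flag;
	int may_enter_fs;

	/*
	 * We are likely called with a transaction handle held, so
	 * allocations made during the lookup must not recurse into the
	 * filesystem or we risk a deadlock.
	 */
	nofs_flag = model_nofs_save();
	may_enter_fs = model_alloc_may_enter_fs();
	model_nofs_restore(nofs_flag);

	return may_enter_fs;
}
```

Returning the old state from the save call is what makes the scopes nest: an
inner restore puts back the outer scope's bit instead of clearing it.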

> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/free-space-cache.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index c3888c113d81..db93a5f035a0 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -10,6 +10,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "ctree.h"
>  #include "free-space-cache.h"
>  #include "transaction.h"
> @@ -47,6 +48,7 @@ static struct inode *__lookup_free_space_inode(struct btrfs_root *root,
>   struct btrfs_free_space_header *header;
>   struct extent_buffer *leaf;
>   struct inode *inode = NULL;
> + unsigned nofs_flag;
>   int ret;
>  
>   key.objectid = BTRFS_FREE_SPACE_OBJECTID;
> @@ -68,7 +70,9 @@ static struct inode *__lookup_free_space_inode(struct btrfs_root *root,
>   btrfs_disk_key_to_cpu(&location, &disk_key);
>   btrfs_release_path(path);
>  
> + nofs_flag = memalloc_nofs_save();
>   inode = btrfs_iget(fs_info->sb, &location, root, NULL);
> + memalloc_nofs_restore(nofs_flag);
>   if (IS_ERR(inode))
>   return inode;
>  
> -- 
> 2.14.3
> 


Re: [PATCH 23/35] btrfs: assert on non-empty delayed iputs

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:42:13PM -0400, Josef Bacik wrote:
> I ran into an issue where there was some reference being held on an
> inode that I couldn't track.  This assert wasn't triggered, but it at
> least rules out that we're doing something stupid.

Reviewed-by: Omar Sandoval 

> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/disk-io.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 0e42401756b8..11ea2ea7439e 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3979,6 +3979,7 @@ void close_ctree(struct btrfs_fs_info *fs_info)
>   kthread_stop(fs_info->transaction_kthread);
>   kthread_stop(fs_info->cleaner_kthread);
>  
> + ASSERT(list_empty(&fs_info->delayed_iputs));
>   set_bit(BTRFS_FS_CLOSING_DONE, &fs_info->flags);
>  
>   btrfs_free_qgroup_config(fs_info);
> -- 
> 2.14.3
> 


Re: [PATCH 21/35] btrfs: only run delayed refs if we're committing

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:42:11PM -0400, Josef Bacik wrote:
> I noticed in a giant dbench run that we spent a lot of time on lock
> contention while running transaction commit.  This is because dbench
> results in a lot of fsync()'s that do a btrfs_commit_transaction(), and
> they all run the delayed refs first thing, so they all contend with
> each other.  This leads to seconds of 0 throughput.  Change this to only
> run the delayed refs if we're the ones committing the transaction.  This
> makes the latency go away and we get no more lock contention.

This means that we're going to spend more time running delayed refs
while in TRANS_STATE_COMMIT_START, so couldn't we end up blocking new
transactions more than before?

> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/transaction.c | 24 +---
>  1 file changed, 9 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index ebb0c0405598..2bb19e2ded5e 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1918,15 +1918,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>   btrfs_trans_release_metadata(trans);
>   trans->block_rsv = NULL;
>  
> - /* make a pass through all the delayed refs we have so far
> -  * any runnings procs may add more while we are here
> -  */
> - ret = btrfs_run_delayed_refs(trans, 0);
> - if (ret) {
> - btrfs_end_transaction(trans);
> - return ret;
> - }
> -
>   cur_trans = trans->transaction;
>  
>   /*
> @@ -1939,12 +1930,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>   if (!list_empty(&trans->new_bgs))
>   btrfs_create_pending_block_groups(trans);
>  
> - ret = btrfs_run_delayed_refs(trans, 0);
> - if (ret) {
> - btrfs_end_transaction(trans);
> - return ret;
> - }
> -
>   if (!test_bit(BTRFS_TRANS_DIRTY_BG_RUN, &cur_trans->flags)) {
>   int run_it = 0;
>  
> @@ -2015,6 +2000,15 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>   spin_unlock(&fs_info->trans_lock);
>   }
>  
> + /*
> +  * We are now the only one in the commit area, we can run delayed refs
> +  * without hitting a bunch of lock contention from a lot of people
> +  * trying to commit the transaction at once.
> +  */
> + ret = btrfs_run_delayed_refs(trans, 0);
> + if (ret)
> + goto cleanup_transaction;
> +
>   extwriter_counter_dec(cur_trans, trans->type);
>  
>   ret = btrfs_start_delalloc_flush(fs_info);
> -- 
> 2.14.3
> 


Re: [PATCH 29/35] btrfs: just delete pending bgs if we are aborted

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:42:19PM -0400, Josef Bacik wrote:
> We still need to do all of the accounting cleanup for pending block
> groups if we abort.  So set the ret to trans->aborted so if we aborted
> the cleanup happens and everybody is happy.

Reviewed-by: Omar Sandoval 

Reusing the loop is fine IMO, but a comment would be appreciated.
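
Something along these lines is what I mean. This is a simplified userspace
model, not the real function: struct model_bg, model_insert_item() and
model_create_pending() are hypothetical names, and the per-entry cleanup is
reduced to setting a flag. It only shows the shape of the loop and where the
comment would go.

```c
#include <assert.h>

/* Hypothetical stand-in for a pending block group. */
struct model_bg {
	int inserted;	/* item was written to the (model) extent tree */
	int released;	/* accounting for this entry was cleaned up */
};

/* Stand-in for inserting the block group item; always succeeds here. */
static int model_insert_item(struct model_bg *bg)
{
	bg->inserted = 1;
	return 0;
}

static void model_create_pending(struct model_bg *bgs, int nr, int aborted)
{
	/*
	 * Start from the abort status: if the transaction has already
	 * aborted, skip the inserts but still fall through to the
	 * per-entry cleanup so nothing is left on the list.
	 */
	int ret = aborted;
	int i;

	for (i = 0; i < nr; i++) {
		if (ret)
			goto next;

		ret = model_insert_item(&bgs[i]);
next:
		/* Cleanup runs for every entry, whether or not we inserted. */
		bgs[i].released = 1;
	}
}
```

With aborted == 0 every entry is inserted and released; with a nonzero abort
status every entry is released but none is inserted.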

> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/extent-tree.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 90f267f4dd0f..132a1157982c 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -10333,7 +10333,7 @@ void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans)
>   struct btrfs_root *extent_root = fs_info->extent_root;
>   struct btrfs_block_group_item item;
>   struct btrfs_key key;
> - int ret = 0;
> + int ret = trans->aborted;
>   bool can_flush_pending_bgs = trans->can_flush_pending_bgs;
>  
>   trans->can_flush_pending_bgs = false;
> -- 
> 2.14.3
> 


Re: [PATCH 30/35] btrfs: cleanup pending bgs on transaction abort

2018-08-31 Thread Omar Sandoval
On Thu, Aug 30, 2018 at 01:42:20PM -0400, Josef Bacik wrote:
> We may abort the transaction during a commit and not have a chance to
> run the pending bgs stuff, which will leave block groups on our list and
> cause us accounting issues and leaked memory.  Fix this by running the
> pending bgs when we clean up a transaction.

Reviewed-by: Omar Sandoval 

Again, I think it's fine to reuse the same function as long as there's a
comment here.

> Signed-off-by: Josef Bacik 
> ---
>  fs/btrfs/transaction.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 89d14f135837..0f39a0d302d3 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -2273,6 +2273,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>   btrfs_scrub_continue(fs_info);
>  cleanup_transaction:
>   btrfs_trans_release_metadata(trans);
> + btrfs_create_pending_block_groups(trans);
>   btrfs_trans_release_chunk_metadata(trans);
>   trans->block_rsv = NULL;
>   btrfs_warn(fs_info, "Skipping commit of aborted transaction.");
> -- 
> 2.14.3
> 


Re: [PATCH 21/35] btrfs: only run delayed refs if we're committing

2018-09-04 Thread Omar Sandoval
On Tue, Sep 04, 2018 at 01:54:13PM -0400, Josef Bacik wrote:
> On Fri, Aug 31, 2018 at 05:28:09PM -0700, Omar Sandoval wrote:
> > On Thu, Aug 30, 2018 at 01:42:11PM -0400, Josef Bacik wrote:
> > > I noticed in a giant dbench run that we spent a lot of time on lock
> > > contention while running transaction commit.  This is because dbench
> > > results in a lot of fsync()'s that do a btrfs_commit_transaction(), and
> > > they all run the delayed refs first thing, so they all contend with
> > > each other.  This leads to seconds of 0 throughput.  Change this to only
> > > run the delayed refs if we're the ones committing the transaction.  This
> > > makes the latency go away and we get no more lock contention.
> > 
> > This means that we're going to spend more time running delayed refs
> > while in TRANS_STATE_COMMIT_START, so couldn't we end up blocking new
> > transactions more than before?
> > 
> 
> You'd think that, but the lock contention is enough that it makes it
> unfuckingpossible for anything to run for several seconds while everybody
> competes for either the delayed refs lock or the extent root lock.
> 
> With the delayed refs rsv we actually end up running the delayed refs often
> enough because of the extra ENOSPC pressure that we don't really end up with
> long chunks of time running delayed refs while blocking out START transactions.
> 
> If at some point down the line this turns out to be an actual issue we can
> revisit the best way to do this.  Off the top of my head we could do something like
> wrap it in a "run all the delayed refs" mutex so that all the committers just
> wait on whoever wins, and we move it back outside of the start logic in order to
> make it better all the way around.  But I don't think that's something we need
> to do at this point.  Thanks,

Ok, that's good enough for me.

Reviewed-by: Omar Sandoval 
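
For the archives, a rough userspace sketch of the "run all the delayed refs"
mutex idea mentioned above, using pthreads. This is only my reading of the
suggestion, not a proposed patch: struct model_trans and the model_ functions
are hypothetical, and the delayed refs are reduced to a counter. Every
committer takes the mutex, the first one in runs the whole pass, and the rest
wait and then find nothing left to do.

```c
#include <assert.h>
#include <pthread.h>

#define MODEL_MAX_COMMITTERS 16

/* Hypothetical model of a transaction with queued delayed refs. */
struct model_trans {
	pthread_mutex_t run_refs_lock;
	int pending_refs;	/* work still queued */
	int passes_with_work;	/* callers that actually ran refs */
};

/* Every committer calls this; only the one that wins does the work. */
static void model_run_all_delayed_refs(struct model_trans *t)
{
	pthread_mutex_lock(&t->run_refs_lock);
	if (t->pending_refs > 0) {
		t->pending_refs = 0;	/* the winner runs the whole pass */
		t->passes_with_work++;
	}
	pthread_mutex_unlock(&t->run_refs_lock);
}

static void *model_committer(void *arg)
{
	model_run_all_delayed_refs(arg);
	return NULL;
}

/* Race nr committers and report how many of them ended up doing work. */
static int model_race_committers(int nr, int pending)
{
	struct model_trans t = { .pending_refs = pending };
	pthread_t threads[MODEL_MAX_COMMITTERS];
	int started = 0;
	int i;

	if (nr > MODEL_MAX_COMMITTERS)
		nr = MODEL_MAX_COMMITTERS;

	pthread_mutex_init(&t.run_refs_lock, NULL);
	for (i = 0; i < nr; i++) {
		if (pthread_create(&threads[started], NULL, model_committer, &t))
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	pthread_mutex_destroy(&t.run_refs_lock);

	return t.passes_with_work;
}
```

The losers still serialize on the mutex, but they hold it only long enough to
see that the work is done, which is the point of the suggestion. Depending on
the toolchain this may need to be built with -pthread.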

